Inconsistent Behavior with Similar Insertion Sort Functions in C++

144 Views Asked by At

I'm experiencing an issue with two similar insertion sort functions in C++. Both functions are intended to sort a vector of string-character pairs based on the character value in ascending order. The first function insertionSort1 produces the expected sorted output, but the second function insertionSort2, which only differs by the placement of the decrement operation j--, gives an unexpected and incorrect result.

Below is the code snippet that demonstrates both sorting functions and their respective outputs. The only difference between the two functions is that in insertionSort2, j-- is combined with the assignment statement, while in insertionSort1, it is separated onto its own line.

#include <bits/stdc++.h>
using namespace std;

void insertionSort1(vector<pair<string,char>>&v) {
  for (int i=1; i<v.size(); i++) {
    auto curr=v[i];
    int j=i-1;
    while(j>=0 && curr.second<v[j].second) {v[j+1]=v[j]; v[j--];}
    v[j+1] = curr;
  }
}

void insertionSort2(vector<pair<string,char>>&v) {
  for (int i=1; i<v.size(); i++) {
    auto curr=v[i];
    int j=i-1;
    while(j>=0 && curr.second<v[j].second) {v[j+1]=v[j--];}
    v[j+1] = curr;
  }
}

int main() {
  vector<pair<string, char>> v1 {
    {"Alice"  , 'B'},
    {"Bob"    , 'A'},
    {"Charlie", 'B'},
    {"David"  , 'A'}
  };
  // Print the original vector
  for(auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
  // Sort the vector
  insertionSort1(v1);
  // Print the sorted vector
  for(auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {David, A} {Alice, B} {Charlie, B}
  
  cout << "///////////////////////////////////////////\n";
  
  vector<pair<string, char>> v2 {
    {"Alice"  , 'B'},
    {"Bob"    , 'A'},
    {"Charlie", 'B'},
    {"David"  , 'A'}
  };
  // Print the original vector
  for(auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
  // Sort the vector
  insertionSort2(v2);
  // Print the sorted vector
  for(auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {Bob, A} {David, A} {David, A} // Unexpected output occurs here!
  return 0;
}

The expected sorted output for both vectors should be: {Bob, A} {David, A} {Alice, B} {Charlie, B}

However, the actual output I'm getting after using insertionSort2 is incorrect: {Bob, A} {Bob, A} {David, A} {David, A}

Why does insertionSort2 produce an incorrect result when it seemingly only has a minor syntactic difference from insertionSort1, which works as expected?


@tbxfreeware If possible, Taher, please post a program that demonstrates that there is no problem with integer data.

#include <bits/stdc++.h>
using namespace std;

void insertionSort1(vector<int>&v) {
  for (int i=1; i<v.size(); i++) {
    auto curr=v[i];
    int j=i-1;
    while(j>=0 && curr<v[j]) {v[j+1]=v[j]; v[j--];}
    v[j+1] = curr;
  }
}

void insertionSort2(vector<int>&v) {
  for (int i=1; i<v.size(); i++) {
    auto curr=v[i];
    int j=i-1;
    while(j>=0 && curr<v[j]) {v[j+1]=v[j--];}
    v[j+1] = curr;
  }
}

int main() {
  vector<int> v1 {7, 4, 1, 2, 3, 6, 9, 8, 5};
  for(auto i : v1) cout << i << " "; cout << endl; // 7 4 1 2 3 6 9 8 5
  insertionSort1(v1);
  for(auto i : v1) cout << i << " "; cout << endl; // 1 2 3 4 5 6 7 8 9
  
  cout << "//////////////////\n";
  
  vector<int> v2 {7, 4, 1, 2, 3, 6, 9, 8, 5};
  for(auto i : v2) cout << i << " "; cout << endl; // 7 4 1 2 3 6 9 8 5
  insertionSort2(v2);
  for(auto i : v2) cout << i << " "; cout << endl; // 1 2 3 4 5 6 7 8 9
  return 0;
}```
1

There are 1 best solutions below

0
tbxfreeware On

Under C++17 and later, the right side of an assignment expression is guaranteed to be evaluated completely, including all side-effects, prior to any evaluation of the left side. See: Refining Expression Evaluation Order for Idiomatic C++ by Gabriel Dos Reis, Herb Sutter, and Jonathan Caves.

Consider then, the first assignment (i.e., move right), when:

  • i == 1
  • j == 0
  • j + 1 == 1

You want v[0] to be copied into v[1], using the expression v[j + 1] = v[j--];

What happens, however, is that j is decremented before j + 1 is evaluated.

  • original value of j == 0
  • decremented value of j == -1
  • decremented value of j + 1 == 0

So instead of copying v[0] into v[1], it is copied back to v[0], and v[1] remains unaltered.

At this point, j has the value -1, which causes the "j-loop" to end. Afterwards, variable curr, which came from v[1] (and which is still stored there because of the failure to overwrite v[1] above), is stored at v[j + 1], which is v[0].

So, curr is stored in two places: v[0] and v[1]. The data originally stored at v[0] was overwritten, and is lost. You can see this in the output below, in the line marked is2, after j-loop: i: 1, j: -1:

is2, after  j-loop: i: 1, j: -1, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]

If the decrement is moved to the left side of the assignment, as in insertionSort3, below, then the sort works correctly.

void insertionSort3(vector<pair<string, char>>& v) {
    for (int i = 1; i < v.size(); i++) {
        auto curr = v[i];
        int j = i - 1;
        while (j >= 0 && curr.second < v[j].second) { v[j-- + 1] = v[j]; }
        v[j + 1] = curr;
    }
}

Expanded Output

This data was generated using the program from the next section. It verifies the overwrite that was predicted in the foregoing analysis.

{Alice, B} {Bob, A} {Charlie, B} {David, A}

is1, before j-loop: i: 1, j: 0, v: [{"Alice",'B'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is1, inside j-loop: i: 1, j: -1, v: [{"Alice",'B'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is1, after  j-loop: i: 1, j: -1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]

is1, before j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is1, after  j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]

is1, before j-loop: i: 3, j: 2, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is1, inside j-loop: i: 3, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"Charlie",'B'}]
is1, inside j-loop: i: 3, j: 0, v: [{"Bob",'A'},{"Alice",'B'},{"Alice",'B'},{"Charlie",'B'}]
is1, after  j-loop: i: 3, j: 0, v: [{"Bob",'A'},{"David",'A'},{"Alice",'B'},{"Charlie",'B'}]
{Bob, A} {David, A} {Alice, B} {Charlie, B}
///////////////////////////////////////////
{Alice, B} {Bob, A} {Charlie, B} {David, A}

is2, before j-loop: i: 1, j: 0, v: [{"Alice",'B'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is2, inside j-loop: i: 1, j: -1, v: [{"Alice",'B'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is2, after  j-loop: i: 1, j: -1, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]

is2, before j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is2, after  j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]

is2, before j-loop: i: 3, j: 2, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is2, inside j-loop: i: 3, j: 1, v: [{"Bob",'A'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is2, after  j-loop: i: 3, j: 1, v: [{"Bob",'A'},{"Bob",'A'},{"David",'A'},{"David",'A'}]
{Bob, A} {Bob, A} {David, A} {David, A}
///////////////////////////////////////////
{Alice, B} {Bob, A} {Charlie, B} {David, A}

is3, before j-loop: i: 1, j: 0, v: [{"Alice",'B'},{"Bob",'A'},{"Charlie",'B'},{"David",'A'}]
is3, inside j-loop: i: 1, j: -1, v: [{"Alice",'B'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is3, after  j-loop: i: 1, j: -1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]

is3, before j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is3, after  j-loop: i: 2, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]

is3, before j-loop: i: 3, j: 2, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"David",'A'}]
is3, inside j-loop: i: 3, j: 1, v: [{"Bob",'A'},{"Alice",'B'},{"Charlie",'B'},{"Charlie",'B'}]
is3, inside j-loop: i: 3, j: 0, v: [{"Bob",'A'},{"Alice",'B'},{"Alice",'B'},{"Charlie",'B'}]
is3, after  j-loop: i: 3, j: 0, v: [{"Bob",'A'},{"David",'A'},{"Alice",'B'},{"Charlie",'B'}]
{Bob, A} {David, A} {Alice, B} {Charlie, B}

Output operators / debug routine

This program has been adapted from the OP. It adds output operators and a debug function that simplify the task of watching the sorting steps.

// main.cpp

//#include <bits/stdc++.h>
#include <iomanip>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

using std::cout;
using std::endl;
using std::pair;
using std::string;
using std::vector;

std::ostream& operator<< (std::ostream& ost, std::pair<std::string, char> const& p)
{
    ost << '{' << std::quoted(p.first) << ",'" << p.second << "'}";
    return ost;
}
std::ostream& operator<< (std::ostream& ost, vector<pair<string, char>> const& v)
{
    ost.put('[');
    if (!v.empty())
    {
        auto n_commas{ v.size() - 1u };
        for (std::size_t i{}; i < n_commas; ++i)
            ost << v[i] << ',';
        ost << v.back();
    }
    ost.put(']');
    return ost;
}
void debug(
    std::string const& heading, 
    int const i, 
    int const j, 
    std::vector<std::pair<std::string, char>>& v)
{
    std::cout << heading << ": " 
        << "i: " << i
        << ", j: " << j
        << ", v: "<< v 
        << '\n';
}

void insertionSort1(vector<pair<string, char>>& v) {
    for (int i = 1; i < v.size(); i++) {
        auto curr = v[i];
        int j = i - 1;
        debug("\nis1, before j-loop", i, j, v);
        while (j >= 0 && curr.second < v[j].second) { 
            v[j + 1] = v[j]; v[j--];
            debug("is1, inside j-loop", i, j, v);
        }
        v[j + 1] = curr;
        debug("is1, after  j-loop", i, j, v);
    }
}

void insertionSort2(vector<pair<string, char>>& v) {
    for (int i = 1; i < v.size(); i++) {
        auto curr = v[i];
        int j = i - 1;
        debug("\nis2, before j-loop", i, j, v);
        while (j >= 0 && curr.second < v[j].second) {
            v[j + 1] = v[j--];
            debug("is2, inside j-loop", i, j, v);
        }
        v[j + 1] = curr;
        debug("is2, after  j-loop", i, j, v);
    }
}

void insertionSort3(vector<pair<string, char>>& v) {
    for (int i = 1; i < v.size(); i++) {
        auto curr = v[i];
        int j = i - 1;
        debug("\nis3, before j-loop", i, j, v);
        while (j >= 0 && curr.second < v[j].second) { 
            v[j-- + 1] = v[j]; 
            debug("is3, inside j-loop", i, j, v);
        }
        v[j + 1] = curr;
        debug("is3, after  j-loop", i, j, v);
    }
}

int main() {
    vector<pair<string, char>> v1{
      {"Alice"  , 'B'},
      {"Bob"    , 'A'},
      {"Charlie", 'B'},
      {"David"  , 'A'}
    };
    // Print the original vector
    for (auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
    // Sort the vector
    insertionSort1(v1);
    // Print the sorted vector
    for (auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {David, A} {Alice, B} {Charlie, B}

    cout << "///////////////////////////////////////////\n";

    vector<pair<string, char>> v2{
      {"Alice"  , 'B'},
      {"Bob"    , 'A'},
      {"Charlie", 'B'},
      {"David"  , 'A'}
    };
    // Print the original vector
    for (auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
    // Sort the vector
    insertionSort2(v2);
    // Print the sorted vector
    for (auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {Bob, A} {David, A} {David, A} // Unexpected output occurs here!

    cout << "///////////////////////////////////////////\n";

    vector<pair<string, char>> v3{
      {"Alice"  , 'B'},
      {"Bob"    , 'A'},
      {"Charlie", 'B'},
      {"David"  , 'A'}
    };
    // Print the original vector
    for (auto i : v3) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
    // Sort the vector
    insertionSort3(v3);
    // Print the sorted vector
    for (auto i : v3) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {Bob, A} {David, A} {David, A} // Unexpected output occurs here!

    return 0;
}
// end file: main.cpp

What about integers?

@Taher Anaya, the OP, reports that the problem only occurs when sorting pairs, and specifically, that it does not occur when sorting integers. I have no explanation for this. The analysis above indicates it should affect any objects that are sorted using insertionSort2.

The mystery to me is not that insertionSort2 fails to sort pairs, but rather that it succeeds in sorting anything.

If possible, Taher, please post a program that demonstrates that there is no problem with integer data.

[Edit:] Thank you, Taher, for posting a program. Unfortunately, I am not able to coroborate your results. When I ran your program under MSVC (/std:c++latest), I got output that shows overwritten data, similar to what was seen (and expected) with pairs.

7 4 1 2 3 6 9 8 5
1 2 3 4 5 6 7 8 9
//////////////////
7 4 1 2 3 6 9 8 5
1 4 1 2 3 5 8 8 5    <--------- scads of bad data!

This is the same result that @molbdnilo recorded from his run on coliru.stacked-crooked.com.