Sorting a binary file with structs

60 Views Asked by At

I'm working on a C++ program that deals with Pokémon data stored in a binary file. The program reads Pokémon records from a binary file, sorts them based on their power, and then writes the sorted Pokémons back to another binary file. However, I'm encountering issues with the sorting function.

struct Pokemon
{
    char Name[50];

    int Type;

    unsigned Power;
};

The sortPokemons function is receiving an input stream which is associated with a binary file containing pokemons (which are NOT sorted) and an output stream which is associated with the file in which we should store the sorted pokemons (by their Power).

void sortPokemons(std::istream& is, std::ostream& os)
{
    unsigned count = 1;

    while (true)
    {
        Pokemon p = readPokemonFromBinary(is);

        if (is.eof())
        {
            break;
        }

        Pokemon min;
        min.Power = INT32_MAX;

        while (true)
        {
            Pokemon c = readPokemonFromBinary(is);

            if (is.eof())
            {
                break;
            }

            if (c.Power < min.Power)
            {
                min = c;
            }
        }

        printPokemon(min);

        is.clear(); //the eof bit is set for the inner while to end
        is.seekg(count * sizeof(Pokemon), std::ios::beg);
        count++;
        writePokenomToBinary(min, os);
    }
}

Note that we know that the input file would contain only sequential pokemon records in it.

The following algorithm won't work and I think I know why: I am trying to simulate the selection sort algorithm, but for it in order to work we need to swap the elements if needed, which I am not sure how to do using only files. Is there a more elegant approach to achieve my goal (a new file containing the pokemons sorted by their Power in ascending order)? Is my idea valid?

1

There are 1 best solutions below

0
Joseph Larson On

Hmm. Random access on an input stream. Is that even possible? It's unexpected.

But even with random access on your file type, your algorithm is flawed. The first time through the loop, you will read the entire input file and find the lowest power one, then write it out. Great. You're good so far.

The second time through your outer loop, you'll skip the first record of the input. But unless your input is already sorted, the one you skipped might be the next-lowest in the file.

I'd do this entirely differently. Let's assume that random access is possible on the input stream. I'd read the file once into a std::vector. You said you don't want to read it all that way due to memory issues. (It must be a really big file for that to be an issue.) But you could store an simple structure:

struct PowerIndex {
     size_t index;
     long power;
}

Just store the index (count) 0..n plus the power of the record you read.

Then sort THAT. You now can run through the vector, seek to the proper file location, read the record, write it, and you're done.

And it will be correct.

Of course, I still wonder whether random access is available on a generic std::stream.