External Merge Sort with limited space

81 Views Asked by At

so my task is to make an external merge sort with a text file. I'm also supposed to only have a maximum of 3 Strings at one time.

This is what my merge-Method looks like:

public static void mergeSortExternal(File c) throws IOException {

        int n = lineCount(c);
        int mid = n / 2;

        Scanner s = new Scanner(c);
        s.useDelimiter("[.,:;()?!\"\\s]+");

        File a = new File(dir + c.getName().replaceFirst("[.][^.]+$", "") + "-1.txt");
        File b = new File(dir + c.getName().replaceFirst("[.][^.]+$", "") + "-2.txt");
        FileWriter file_1 = new FileWriter(a);
        BufferedWriter bufferedWriter_1 = new BufferedWriter(file_1);
        FileWriter file_2 = new FileWriter(b);
        BufferedWriter bufferedWriter_2 = new BufferedWriter(file_2);

        for (int i = 0; i < mid; i++){
            bufferedWriter_1.write(s.next() + "\n");
            bufferedWriter_1.flush();
        }
        bufferedWriter_1.close();

        for (int i = 0; i < n - mid; i++){
            bufferedWriter_2.write(s.next() + "\n");
            bufferedWriter_2.flush();
        }
        bufferedWriter_2.close();

        if(lineCount(a) > 2) mergeSortExternal(a);
        if(lineCount(b) > 2) mergeSortExternal(b);

        //merge(a, b, c);

    }

For the actual merging-part I tried modifying a pseudo code for my purpose, but I didn't really get anywhere.

I don't want any code from anyone, just need an idea to get me thinking because I am really running out of ideas.

1

There are 1 best solutions below

0
rcgldr On

It would be easier to implement a bottom up merge sort, using 2 strings (2 way merge) and 4 working files, or 3 strings (3 way merge) and 6 working files.

For 2 way merge, the first pass repeatedly reads 2 strings, compares the 2 strings, writes them in order to alternating temp files (write to temp[0], write to temp[1], write to temp[0], ...). It sets run size to 2.

The next passes are 2 way merge only passes. read 1 string from temp[0], 1 string from temp[1], compare two strings, write the smaller string to temp[2], and if the smaller string was from temp[i], read the next string from temp[i], until "run size" strings have been read from either of the temp files. When run size strings have been read from either file, then the remaining string in memory is written and the rest of the "run size" strings in the other file are read from that remaining temp file and written to complete a merge step. If The next merge step would do the same, writing to temp[3]. This is repeated until all strings have been written. Then temp[0] is swapped with temp[2], temp[1] swapped with temp[3], were temp[...] is the "file". Run size is then doubled. This process is repeated until run size >= file size (total number of strings), or when (run size * 2 ) >= file size, write all strings to the destination file.

With 3 strings, a 3 way merge with 6 files could be used.