Manipulating input and output streams from one file to another


I have a new file every few seconds that looks like this:

23
45
21
1
9
23
42
22
40
11
33
32
18
11
12
32
22
7
37
30

Each line of the text file contains one number between 1 and 40. These files are generated several times a minute.

I am trying to sort the values in ascending order with StreamReader and StreamWriter. My logic must be flawed, as nothing shows up in the file I intended to write to. I passed true as the append parameter, but the sorted file is still never populated.

The goal was to use a for loop that iterates over the int values 1-40, compare each of those to every number read from the file, and, on a match, copy that number into the sorted file so it ends up in sorted order.

I have been looking at it for a while and it looks like it should work, but it does not. Would this be easier with the File helper methods than with the StreamReader/StreamWriter approach I have used?

public static void ProcessDirectory()
{
    int variable1;
    StreamReader readToSort = new StreamReader(@"C:write.txt");
    StreamWriter writeSorted = new StreamWriter(@"C:Sorted_File.txt", true);

    for (int i = 1; i > 41; i++)
    {
        variable1 = (readToSort.Read());

        while (!readToSort.EndOfStream)
        {
            if (variable1 == i)
            {
                writeSorted.Write(i.ToString() + "\n");
            }
        }

        MessageBox.Show("processing #" + variable1);
    }

    readToSort.Close();
    writeSorted.Close();
}
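For reference, here is a minimal corrected sketch of the counting approach described above (the class name, method name, and file paths are illustrative, not taken from the original code). The original loop body never runs because `i > 41` is false on the first iteration, and `Read()` returns a single character code rather than a whole line; reading line by line and tallying occurrences avoids both problems.

```csharp
using System;
using System.IO;

public static class FileSorter
{
    // Counting-sort sketch: since every value is known to be in 1..40,
    // one pass tallies occurrences and a second pass writes them in order.
    public static void SortFile(string inputPath, string outputPath)
    {
        int[] counts = new int[41]; // indices 1..40 are used

        foreach (string line in File.ReadLines(inputPath))
        {
            if (int.TryParse(line, out int value) && value >= 1 && value <= 40)
                counts[value]++;
        }

        using (var writer = new StreamWriter(outputPath, append: false))
        {
            for (int i = 1; i <= 40; i++)
                for (int n = 0; n < counts[i]; n++)
                    writer.WriteLine(i);
        }
    }
}
```

Because the value range is fixed, this makes only one pass over the input instead of forty, and it never re-reads an already-exhausted stream.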


To make sure I correctly understand the problem you are trying to solve, I made a list of requirements based on your question and the comments below it.

  • Your input consists of text files which are several gigabytes large, and therefore cannot be fully loaded into memory
  • These text files consist of numeric values only, with each value being on its own line
  • These numeric values need to be written to another output file, in sorted order

It's not entirely clear to me what your input consists of, so you might need to correct me here. Do you need to combine multiple (smaller) input files, sort the combined contents, and output that to a single (larger) file?

Example:

  • Input: file1_unsorted.txt (6GB), file2_unsorted.txt (6GB)
  • Output: file1_and_file2_sorted.txt (12GB)

If so, is each individual file small enough to be loaded into memory (just not all of them combined)?

Example (assuming 1GB RAM):

  • Input: file1_unsorted.txt (600MB), file2_unsorted.txt (600MB), ..., file10_unsorted.txt (600MB)
  • Output: file1_through_file10_sorted.txt (6GB)

Or, can each individual input file be large enough that it does not fit in memory, and do these files each need to be sorted to a corresponding output file?

Example:

  • Input: file_unsorted.txt (6GB)
  • Output: file_sorted.txt (6GB)

Assuming that both your (unsorted) input and (sorted) output files are too large to fit into memory, you need a way to sort the contents of these files in chunks. The keyword you are looking for is External Sort.
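To make the idea concrete, here is a minimal external merge sort sketch (the class name, helper names, and chunkSize default are my assumptions, not from any linked article): split the input into chunks that fit in memory, sort each chunk to a temporary file, then stream-merge the sorted chunks into the output.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExternalSorter
{
    // External merge sort sketch: only chunkSize values are ever held
    // in memory at once, regardless of the total input size.
    public static void Sort(string inputPath, string outputPath, int chunkSize = 1_000_000)
    {
        var chunkFiles = new List<string>();
        var buffer = new List<int>(chunkSize);

        // Phase 1: read the input in chunks, sort each, spill to temp files.
        foreach (string line in File.ReadLines(inputPath))
        {
            buffer.Add(int.Parse(line));
            if (buffer.Count >= chunkSize)
                FlushChunk(buffer, chunkFiles);
        }
        if (buffer.Count > 0)
            FlushChunk(buffer, chunkFiles);

        // Phase 2: merge the sorted chunks while streaming.
        MergeChunks(chunkFiles, outputPath);
        foreach (string f in chunkFiles)
            File.Delete(f);
    }

    private static void FlushChunk(List<int> buffer, List<string> chunkFiles)
    {
        buffer.Sort();
        string path = Path.GetTempFileName();
        using (var writer = new StreamWriter(path))
            foreach (int value in buffer)
                writer.WriteLine(value);
        chunkFiles.Add(path);
        buffer.Clear();
    }

    private static void MergeChunks(List<string> chunkFiles, string outputPath)
    {
        var readers = new List<StreamReader>();
        try
        {
            var heads = new List<int?>(); // current front value of each chunk
            foreach (string f in chunkFiles)
            {
                var reader = new StreamReader(f);
                readers.Add(reader);
                heads.Add(ReadNext(reader));
            }

            using (var writer = new StreamWriter(outputPath))
            {
                while (true)
                {
                    // Linear scan for the smallest head value; fine for a
                    // modest number of chunks (use a heap for many).
                    int best = -1;
                    for (int i = 0; i < heads.Count; i++)
                        if (heads[i].HasValue && (best < 0 || heads[i].Value < heads[best].Value))
                            best = i;
                    if (best < 0)
                        break;
                    writer.WriteLine(heads[best].Value);
                    heads[best] = ReadNext(readers[best]);
                }
            }
        }
        finally
        {
            foreach (var reader in readers)
                reader.Dispose();
        }
    }

    private static int? ReadNext(StreamReader reader)
    {
        string line = reader.ReadLine();
        return line == null ? (int?)null : int.Parse(line);
    }
}
```

Note that in your specific case (values restricted to 1-40) a simple counting pass would be far cheaper than a general external sort; the sketch above is only needed when the values are unbounded.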

Here is a good example of that on CodeProject (with source code and explanation): Sorting Huge Text Files

A somewhat similar StackOverflow question you might want to look into: Reading large text files with streams in C#

If you need any help with your actual implementation, please provide additional information on what your input and (desired) output looks like. The files themselves are obviously too big to upload - a screenshot of the directory with your input and output files would also work. Then I (and others) can see how large each file is and to what degree (if at all) they need to be aggregated.