What's the most efficient way of reading larger files in Python?


I had to write something that opens a file "signals.txt" containing 1000 non-empty words (one per line), takes the 10th letter of every 40th word, and writes those letters to a file "results" (the letters spell out another word). I wanted to do this with the smallest memory usage, so I wrote it in 3 different ways and used the tracemalloc module to monitor memory usage.

The point is, I don't really know how to read the output of tracemalloc.take_snapshot. Based on tracemalloc, I think Way 1 is the most efficient, but I don't know if I'm using it correctly.

So can anyone tell me which way is the most efficient? Or are these 3 ways effectively doing the same thing? Is there a better way to do it?

And yes, I know "Way 2" will probably be fine when the txt file is only 1000 lines long, but let's assume the file is much larger than that.

My code

import tracemalloc
tracemalloc.start()


#------------------------Way 1----------------------

def Gen():
    with open('signals.txt','rt') as file:
        # reads the whole file into a list of lines, then yields the 10th
        # letter of every 40th word
        text = file.read().splitlines()
        for i in range(39,len(text),40):
            yield text[i][9]


with open('results.txt','w') as results:
    for i in Gen():
        results.write(i)


#------------------------Way 2----------------------

with open('signals.txt','rt') as file:
    with open('results.txt','w') as results:
        # also reads the whole file into memory, then writes the result in a single call
        text = file.read().splitlines()
        results.write(''.join(text[i][9] for i in range(39,len(text),40)))


#------------------------Way 3----------------------
#here I tried to do this without building a list with the file contents

with open('signals.txt','rt') as file:
    with open('results.txt','w') as results:
        iWord=39
        for index,word in enumerate(file):
            if index == iWord:
                results.write(word[9])
                iWord+=40


snapshot = tracemalloc.take_snapshot()

for stat in snapshot.statistics('lineno'):
    print(stat)

tracemalloc.stop()
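
For reference, a possibly simpler way to read tracemalloc's numbers than per-line snapshot statistics is tracemalloc.get_traced_memory(), which returns the current and peak traced allocation sizes in bytes. A minimal sketch of measuring one version that way (shown here for Way 3 only; the same wrapper could go around the other two, and it assumes signals.txt exists):

import tracemalloc

def way3():
    # the line-by-line version from above
    with open('signals.txt', 'rt') as file, open('results.txt', 'w') as results:
        for index, word in enumerate(file):
            if index % 40 == 39:
                results.write(word[9])

tracemalloc.start()
way3()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f'way3: current {current / 1024:.1f} KiB, peak {peak / 1024:.1f} KiB')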

1 Answer

Alexander C:

file.read() reads the whole file into memory at once. You are better off using file.readline(), which reads the next line from the file.

Iterating over a text file object also yields one line at a time, so the third method should be the most memory-efficient.
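
A minimal sketch of that line-by-line idea, using itertools.islice to pull out every 40th line while the rest of the file is still consumed one line at a time (assuming, like the original, that every selected line has at least 10 characters):

from itertools import islice

with open('signals.txt', 'rt') as file, open('results.txt', 'w') as results:
    # islice yields the lines at indexes 39, 79, 119, ... without
    # keeping the whole file in memory
    for word in islice(file, 39, None, 40):
        results.write(word[9])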

See the docs for details.