Skipping first line of a file when using linecache


I've created a lazy dataloader in PyTorch using linecache. It reads from a TSV file that I also use to build the vocabulary with PyTorch's build_vocab, so the file needs a header line with a title for each column.

For the dataset I'm using __getitem__:

def __getitem__(self, index):
    "Generates one sample of data"
    line = linecache.getline(self._filepath, index + 1)

However, since I fetch lines by number with linecache instead of iterating over the file, there's no obvious way to skip the first line (the header) of the TSV file. I tried "if index == 0: pass", but that made __getitem__ return None, which caused a different error.

My current solution is just to have two tsvs, one with a header and one without.
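
One possible way to keep a single file is to shift the line number instead of special-casing index 0. linecache.getline counts lines from 1, so if line 1 is the header, data row 0 sits on line 2. The sketch below assumes exactly one header line; the class name LazyTsvDataset, the HEADER_LINES constant, and the tab-splitting return value are hypothetical and only for illustration. In a real project the class would presumably subclass torch.utils.data.Dataset, which is left out here so the example runs without PyTorch. Note also that linecache caches a file's lines after the first lookup, so this may not save as much memory as expected.

```python
import linecache


class LazyTsvDataset:
    """Hypothetical dataset reading rows from a TSV file with one header line."""

    HEADER_LINES = 1  # assumption: the file starts with exactly one header row

    def __init__(self, filepath):
        self._filepath = filepath
        # Count the data rows once, excluding the header.
        with open(filepath, encoding="utf-8") as f:
            total_lines = sum(1 for _ in f)
        self._num_rows = max(0, total_lines - self.HEADER_LINES)

    def __len__(self):
        return self._num_rows

    def __getitem__(self, index):
        """Return the fields of data row `index` (0-based), skipping the header."""
        if not 0 <= index < self._num_rows:
            raise IndexError(index)
        # linecache line numbers start at 1 and line 1 is the header,
        # so data row 0 is read from line 2.
        line = linecache.getline(self._filepath, index + 1 + self.HEADER_LINES)
        return line.rstrip("\n").split("\t")
```

Because __len__ also excludes the header, a sampler drawing indices from range(len(dataset)) should never request the header line or run past the end of the file.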
