How to delete lines in a file up to a certain character in Python 3


I have a very big file that I need to parse. I don't need any of the lines up to '&'. I just need the information after the '&' in the file. How do I delete the lines before the '&'? This is what I have so far:

import re

original_file = 'file.rpt'
file_copy = 'file_copy.rpt'

with open(original_file, 'r') as rf:
    with open(file_copy, 'r+') as wf:
        for line in rf:
            #if statement to write after the '&' has been encountered?
            wf.write(line)

Input file:

sample text1
sample text2
sample text3
sample text4
&sample text5
sample text6

expected output file:
&sample text5
sample text6

In the rpt file, it has 6 lines, lines 1-4 are information that isn't needed. I want to delete lines 1-4, so I can focus on lines 5 and 6.

There are 2 best solutions below

CodeSamurai777 On

A better and safer way would be to create a new file with smaller contents so that you can check the contents before deleting the old file. So my suggestion would look like this:


original_file = 'file.rpt'
file_copy = 'file_copy.rpt'
omit = True
with open(original_file, 'r') as rf:
    with open(file_copy, 'w') as wf:
        for line in rf:
            if "&" in line:
                omit = False
            if omit:
                continue
            else:
                wf.write(line)

This code omits every line before the first line that contains &. That line and everything after it are written to the new file.

You can also split the line that contains the & symbol and keep only the part after it:

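original_file = 'file.rpt'
file_copy = 'file_copy.rpt'
omit = True
with open(original_file, 'r') as rf:
    # 'w' creates the copy if needed and truncates any old contents
    with open(file_copy, 'w') as wf:
        for line in rf:
            if omit and "&" in line:
                # split only at the first & so extra &s cannot break unpacking
                before, after = line.split("&", 1)
                wf.write(after)
                omit = False
                continue
            if omit:
                continue
            else:
                wf.write(line)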

The code above also writes everything after the &, but on the line that contains the & it drops the & itself and anything before it.
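
The in-line split can also be written with str.partition, which cuts only at the first & and does not raise when a line holds several of them. This is a minimal sketch; the helper name text_after_marker is hypothetical and not part of the answer above:

```python
def text_after_marker(line, marker="&"):
    """Return the part of line after the first marker, or None if absent."""
    # partition always returns a 3-tuple: (before, separator, after).
    # The separator element is an empty string when the marker is not found.
    _before, sep, after = line.partition(marker)
    return after if sep else None


print(repr(text_after_marker("junk &sample text5\n")))  # 'sample text5\n'
print(repr(text_after_marker("a & b & c\n")))           # ' b & c\n'
print(repr(text_after_marker("sample text1\n")))        # None
```

Returning None for lines without the marker lets the caller tell "no & yet" apart from "& was the last character on the line", which yields an empty string.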

EDIT

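Also check that you are opening the second file in the correct mode. You should probably use 'w', which creates the file if needed and truncates it first. 'r+' does not truncate: it requires the file to already exist and writes over the existing contents from the start, leaving any old data beyond that point in place, and I am not sure this is what you want.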
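
To see the difference between the two modes, here is a small sketch that writes to a scratch file with each of them. The scratch path and the placeholder contents are assumptions made up for the demonstration:

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "file_copy.rpt")

# Start with ten characters of "old" data.
with open(path, "w") as f:
    f.write("AAAAAAAAAA")

# 'r+' keeps the existing contents and writes over them from position 0.
with open(path, "r+") as f:
    f.write("BBB")
with open(path) as f:
    after_r_plus = f.read()

# 'w' truncates the file to zero length before writing.
with open(path, "w") as f:
    f.write("BBB")
with open(path) as f:
    after_w = f.read()

print(after_r_plus)  # BBBAAAAAAA
print(after_w)       # BBB
```

With 'r+' the tail of the old data survives, so a trimmed copy written that way could end with stale lines from an earlier run. 'r+' also raises FileNotFoundError when the file does not exist yet.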

Mad Physicist On

You don't really need to modify your file if you just want to work with some portion of it. Using your original code, you can load the portion that you want:

def load_data(filename):
    with open(filename, 'r') as f:
        for line in f:
            if '&' in line:  # or if line.startswith('&'):
                break
        else:
            return []
        return [line] + list(f)

The function load_data returns the first line that contains an & together with all the lines after it, or an empty list if no line contains one. You can then write the data to another file, or just process it as you choose.

You can even make it into a lazy generator that will only return lines as you need them:

def trim_data(filename):
    with open(filename, 'r') as f:
        for line in f:
            if '&' in line:  # or if line.startswith('&'):
                yield line
                break
        else:
            return
        yield from f

Copying the file this way, if that's what you want to do, is even easier:

with open(file_copy, 'w') as f:
    for line in trim_data(original_file):
        f.write(line)
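
The same skip-then-copy logic can also be expressed with itertools.dropwhile from the standard library, which is a different technique from the generator above. This is a minimal sketch: the helper name trim_with_dropwhile is hypothetical, and the sample file is created in a temporary directory only so the example runs on its own:

```python
import itertools
import os
import tempfile


def trim_with_dropwhile(lines, marker="&"):
    """Lazily skip lines until one contains marker, then yield the rest."""
    # dropwhile discards items while the predicate is true, then passes
    # every remaining item through unchanged, including the marker line.
    return itertools.dropwhile(lambda line: marker not in line, lines)


# Hypothetical sample data matching the input shown in the question.
workdir = tempfile.mkdtemp()
original_file = os.path.join(workdir, "file.rpt")
file_copy = os.path.join(workdir, "file_copy.rpt")
with open(original_file, "w") as f:
    f.write("sample text1\nsample text2\nsample text3\nsample text4\n"
            "&sample text5\nsample text6\n")

# Stream from one file to the other without loading everything in memory.
with open(original_file) as rf, open(file_copy, "w") as wf:
    wf.writelines(trim_with_dropwhile(rf))

with open(file_copy) as f:
    result = f.read()
print(result)
```

Because the file object is consumed lazily, this should stay memory-friendly for a very big file, and it writes nothing if no line contains the marker.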