How can I find out the number of tabs at the beginning of each line in a text file?


I have a text file, where each line may start with a number of tabs, including no tabs. For example, the first line starts with no tab, the second line with 1 tab, and the third line with 2 tabs:

Chapter 1 
    1
        1.1
Chapter 2
    1
        1.1

Is it possible to get the number of tabs at the beginning of each line, by using Python?


Answer by mozway:

Using re.findall, map and len:

import re

out = list(map(len, re.findall('^\t*', text, flags=re.M)))

Or from a file:

with open('file.txt') as f:
    out = list(map(len, re.findall('^\t*', f.read(), flags=re.M)))

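Note that the MULTILINE flag (re.M) is required so that ^ matches at the start of every line rather than only at the start of the string. The zero-or-more quantifier (*) ensures that every line start produces a match, even if that match is an empty string for a line with no leading tabs.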
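To illustrate the note above, here is a minimal sketch comparing the same pattern with and without re.M. The sample strings are made up for the demonstration. It also shows an edge case worth keeping in mind: with re.M, ^ also matches right after a final newline, so text that ends with a newline (as most files do) should yield one extra trailing 0.

```python
import re

# Made-up sample: three lines with 0, 1 and 2 leading tabs.
sample = "Chapter 1\n\t1\n\t\t1.1"

# Without re.M, ^ only matches at the very start of the string.
without_flag = [len(m) for m in re.findall(r'^\t*', sample)]

# With re.M, ^ matches at the start of every line.
with_flag = [len(m) for m in re.findall(r'^\t*', sample, flags=re.M)]

# A trailing newline creates one more (empty) line start for ^ to match.
with_trailing_newline = [len(m) for m in re.findall(r'^\t*', sample + "\n", flags=re.M)]

print(without_flag)           # [0]
print(with_flag)              # [0, 1, 2]
print(with_trailing_newline)  # [0, 1, 2, 0]
```

If the extra 0 is unwanted, one option is to strip the final newline before matching, or to iterate over the lines instead.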


Regex-less variant with itertools.takewhile:

from itertools import takewhile

out = []
with open('myfile.txt') as f:
    for line in f:
        out.append(sum(1 for _ in takewhile(lambda x: x == '\t', line)))

Output: [0, 1, 2, 0, 1, 2]

Input used for the first approach:

text = '''Chapter 1 
\t1
\t\t1.1
Chapter 2
\t1
\t\t1.1'''

Answer by furas:

You could use a for loop to process each line separately, use a regex to match the tabs at the beginning of the line (i.e. ^\t*), and then use len() to get the length of the matched string.

import re

text = '''Chapter 1 
\t1
\t\t1.1
Chapter 2
\t1
\t\t1.1'''

for line in text.splitlines(): 
    found = re.search('^\t*', line)[0]
    count = len(found)
    print(count)

Result

0
1
2
0
1
2

Alternatively, you can use a nested for loop to go through each line character by character, incrementing a counter for every tab. As soon as you hit a character that is not a tab, use break to exit the inner loop.

text = '''Chapter 1 
\t1
\t\t1.1
Chapter 2
\t1
\t\t1.1'''

for line in text.splitlines(): 
    count = 0
    for char in line:
        if char == '\t':
            count += 1
        else:
            break
    print(count)
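
The character-by-character loop above can also be condensed with str.lstrip, which is not used in either answer: stripping only leading tabs and comparing lengths gives the same count. This is just a sketch; the helper name count_leading_tabs is hypothetical.

```python
def count_leading_tabs(line: str) -> int:
    """Return the number of tab characters at the start of `line`."""
    # lstrip('\t') removes only leading tabs, so the length difference
    # is exactly the number of tabs that were stripped.
    return len(line) - len(line.lstrip('\t'))


# Same sample input as in the answers above.
text = 'Chapter 1 \n\t1\n\t\t1.1\nChapter 2\n\t1\n\t\t1.1'

counts = [count_leading_tabs(line) for line in text.splitlines()]
print(counts)  # [0, 1, 2, 0, 1, 2]
```

When reading from a file, the same helper can be applied to each line in turn, since a trailing newline does not affect the leading-tab count.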