How to find the shortest substring of a string before a certain text in Python 3


I am trying to extract the shortest substring of a string before a certain text in Python 3. For instance, I have the following string.

\\n...\\n...\\n...TEXT

I want to extract the shortest substring that contains exactly two \\n before 'TEXT'. The example text may have a random number of \\n and random letters between them.

I have already tried this in Python 3.4, but the result is the original text. It seems that the search anchors at the first '\n' it finds and treats the remaining '\n' occurrences as ordinary text.

import re

text = '\\n abcd \\n efg \\n hij TEXT'
pattern1 = re.compile(r'\\n.*?\\n.*?TEXT', re.IGNORECASE)
obj = re.search(pattern1, text)
obj.group(0)

When I run this code, the result is \\n abcd \\n efg \\n hij TEXT, which is exactly the same as the input.

I would like the result to be

\\n efg \\n hij TEXT

Can anyone help me with this?


There are 2 best solutions below

zxzak On BEST ANSWER

Using regex with negative lookahead:

import re

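text = '\\n abcd \\n efg \\n hij TEXT'
# Match the first literal backslash-n that is NOT followed by a later
# backslash-n plus a further backslash; for this input that is the
# second-to-last one. The trailing .* then captures the rest of the string.
pattern = re.compile(r'(\\n(?!.*\\n.*\\).*)')
res = re.search(pattern, text)
res.group(0)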
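
As a hedged alternative sketch (not part of the answer above), the pattern can also refuse to cross a third marker, so the match holds exactly two markers and stops at the first TEXT after them. It assumes the marker is a literal backslash followed by n, as in the sample string, and that matching is case-sensitive. The names SEGMENT, PATTERN and two_markers_before_text are hypothetical.

```python
import re

# Assumption: the marker is a literal backslash followed by "n" (two characters),
# as in the sample string, not a real newline character.
# SEGMENT lazily matches characters, refusing any position where a marker starts.
SEGMENT = r'(?:(?!\\n).)*?'

# marker, segment, marker, segment, then the target text
PATTERN = re.compile(r'\\n' + SEGMENT + r'\\n' + SEGMENT + r'TEXT', re.DOTALL)


def two_markers_before_text(text):
    """Return the substring holding exactly two markers and ending at TEXT, or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(two_markers_before_text('\\n abcd \\n efg \\n hij TEXT'))
```

Because neither segment can contain a marker, a search that starts at an earlier marker fails before it reaches TEXT, and the regex engine moves on to the next starting position.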

Using Python string methods:

text = '\\n abcd \\n efg \\n hij TEXT'
# Find the last \\n, then the last \\n before it, and slice from there to the end.
text[text[:text.rfind("\\n")].rfind("\\n"):]
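
The same rfind idea can be wrapped in a small function that also stops at TEXT and handles missing markers. This is only a sketch under the assumption that the marker is a literal backslash followed by n. The function name tail_with_two_markers and its parameters are hypothetical.

```python
def tail_with_two_markers(text, target='TEXT', marker='\\n'):
    """Slice from the second-to-last marker before target through target.

    Returns None if target is missing or fewer than two markers precede it.
    """
    end = text.find(target)  # first occurrence of the target
    if end == -1:
        return None
    last = text.rfind(marker, 0, end)  # last marker before the target
    if last == -1:
        return None
    second_last = text.rfind(marker, 0, last)  # the marker before that one
    if second_last == -1:
        return None
    return text[second_last:end + len(target)]


print(tail_with_two_markers('\\n abcd \\n efg \\n hij TEXT - the rest of string'))
```

Checking each rfind result for -1 avoids silently using -1 as a slice index when the input has fewer than two markers.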
Michał Kozłowski On

I am not sure I understand the problem correctly, but a simple split may be useful:

text = '\\\n abcd \\\n efg \\\n hij TEXT - the rest of string'
text = text.split('TEXT')[0]
list_part = text.split('\\\n ')
print(list_part)
minimal_set = text
for parts in list_part:
    # Skip empty pieces and keep the shortest non-empty one seen so far.
    if len(parts) != 0 and len(parts) < len(minimal_set):
        minimal_set = parts
print(minimal_set)
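
The loop above can probably be condensed with min and a key function. This sketch assumes the marker is a literal backslash followed by n and a space, as in the question's sample, which differs slightly from the string literal used above. The variable names are hypothetical. It returns the shortest piece between markers, which is not the two-marker substring the question asks for.

```python
text = '\\n abcd \\n efg \\n hij TEXT - the rest of string'

# Keep only what precedes the first TEXT.
before = text.split('TEXT')[0]

# Split on the marker (backslash, "n", space) and drop empty pieces.
pieces = [piece for piece in before.split('\\n ') if piece]

# min() returns the first of the shortest pieces; fall back to the
# whole prefix if there are no non-empty pieces at all.
shortest = min(pieces, key=len) if pieces else before
print(shortest)
```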