How to merge some elements in a Python list based on some features


This is a list; each element consists of two strings with a tab ("\t") between them. We can call the string on the left the "label" and the part on the right the "text".

continued   In the film, "Girl Interrupted," Winona Ryder plays an 18-year-old
continued   who enters a mental institution for
continued   what
anecdote    is diagnosed as borderline personality disorder
anecdote    The year is 1967
anecdote    the country is in turmoil over Vietnam and civil rights
continued   While
continued   lying on her bed one night
continued   and
continued   watching TV
continued   ,
anecdote    she sees a news report about a demonstration
continued   The narrator says something
continued   that might apply to today's turmoil
continued   :
continued   "We live in a time of doubt
continued   .
continued   The institutions
continued   we once trusted no longer
anecdote    seem reliable."
continued   As 2014 ends    Modd-NU
statistics  the stock market is at record highs
assumption  our traditional institutions and self-confidence are in decline
continued   A Pew Research Center study confirms one trend
testimony   that has been obvious over several years
assumption  The "typical" American family is no longer typical
statistics  Just 46 percent of American children now live in homes with their married, heterosexual parents
statistics  Five percent have no parents at home
continued   They most likely are living with grandparents
continued   ,
testimony   says the study
assumption  These startling figures about the decline of the American family contrast with the year 1960
continued   when    Modd-NU
statistics  73 percent of American children lived in traditional families
assumption  A major contributor to this trend has been the assault on marriage and other institutions by the Baby Boom generation

My problem is that I want to merge each element marked "continued" with the first element after it that is not marked "continued". For example, the first three elements are labelled "continued", so I want to merge them with the fourth element and use the fourth element's label, "anecdote", for the result. I'm a beginner, so I'm not familiar with iterative operations. Thanks a lot!


Answer by Tim Roberts:

It's not a good plan to try to "read ahead" in the iterator. Instead, just do one line at a time, and gather up "continued" lines until you get one that's not "continued".

result = []  # finished "label\ttext" lines
build = []   # text of the current run of "continued" lines
with open('x.txt') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, text = line.split('\t', 1)
        if label == 'continued':
            build.append(text)
        else:
            if build:
                # Flush the pending run as a single "continued" line.
                result.append("continued\t" + " ".join(build))
                build = []
            result.append(f'{label}\t{text}')
# Flush a run that reaches the end of the file.
if build:
    result.append("continued\t" + " ".join(build))
for item in result:
    print(item)

Output:

continued   In the film, "Girl Interrupted," Winona Ryder plays an 18-year-old who enters a mental institution for what
anecdote    is diagnosed as borderline personality disorder
anecdote    The year is 1967
anecdote    the country is in turmoil over Vietnam and civil rights
continued   While lying on her bed one night and watching TV ,
anecdote    she sees a news report about a demonstration
continued   The narrator says something that might apply to today's turmoil : "We live in a time of doubt . The institutions we once trusted no longer
anecdote    seem reliable."
continued   As 2014 ends    Modd-NU
statistics  the stock market is at record highs
assumption  our traditional institutions and self-confidence are in decline
continued   A Pew Research Center study confirms one trend
testimony   that has been obvious over several years
assumption  The "typical" American family is no longer typical
statistics  Just 46 percent of American children now live in homes with their married, heterosexual parents
statistics  Five percent have no parents at home
continued   They most likely are living with grandparents ,
testimony   says the study
assumption  These startling figures about the decline of the American family contrast with the year 1960
continued   when    Modd-NU
statistics  73 percent of American children lived in traditional families
assumption  A major contributor to this trend has been the assault on marriage and other institutions by the Baby Boom generation
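
The loop above emits each merged run as its own "continued" line. If the goal is instead to fold each run into the next labelled element and keep that element's label, as the question describes, a minimal sketch could look like the following. The function name merge_continued and the short sample list are hypothetical, and the sketch assumes every element contains a tab; a trailing run with no labelled element after it is kept as "continued".

```python
def merge_continued(items):
    """Fold each run of 'continued' items into the next labelled item."""
    result = []
    pending = []  # texts collected since the last labelled item
    for item in items:
        label, text = item.split("\t", 1)
        pending.append(text)
        if label != "continued":
            # A labelled item closes the run: join everything under its label.
            result.append(label + "\t" + " ".join(pending))
            pending = []
    if pending:
        # Trailing 'continued' items have no label to adopt; keep them as-is.
        result.append("continued\t" + " ".join(pending))
    return result


# Hypothetical sample in the same "label<TAB>text" shape as the question.
sample = [
    "continued\tIn the film",
    "continued\twho enters a mental institution for",
    "continued\twhat",
    "anecdote\tis diagnosed as borderline personality disorder",
    "anecdote\tThe year is 1967",
]

for line in merge_continued(sample):
    print(line)
```

Because every text is appended to the pending list before the label is checked, a labelled element with no "continued" lines before it passes through unchanged.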