Python list comprehension with tuples (nested list)


I am working on the following list comprehension, which is supposed to convert a list of (document, category) tuples into ((list of paragraphs), category) tuples, and then into (((list of sentences), (list of sentences), ...), category) tuples.

Each (document, category) tuple is split into a ((paragraph-list), category) tuple, and the nesting then goes 4 levels deep (document -> paragraphs -> sentences -> words).

Spyder says it has a syntax error. Any help please?

Ultimately the idea is to break documents into paragraphs, into sentences, into words, in the following hierarchy:
Doc-List

(Doc1, cat), (Doc2, cat), (Doc3, cat)
((doc1sent1, doc1sent2, doc1sent3), cat)
((sent1word1, sent1word2, sent1word3), (sent2word1, sent2word2), cat) ...

self._PSW =       
[[list(self.ConvertOneDoc(paragraph, "Sents")     
for paragraph in [list((self.ConvertOneDoc(document, "Para"), category))    
for document, category in self._CatDocs]]    

Answered by jferard

A bit late, but here is an answer.

You are missing one closing parenthesis for the first `list(...)` call, and you have one opening square bracket too many. Try:

self._PSW = [
    list(self.ConvertOneDoc(paragraph, "Sents"))
    for paragraph in [
        list((self.ConvertOneDoc(document, "Para"), category))
        for document, category in self._CatDocs
    ]
]
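
If you want to confirm that the bracket count was the only syntax problem, one option is to ask Python to compile both versions as expressions without running them. This is only a sketch: the helper name `is_valid_expression` is made up for illustration, and the two strings are the expressions from the question and from the fix above.

```python
# Expression from the question: one ")" missing, one "[" too many.
broken = '''[[list(self.ConvertOneDoc(paragraph, "Sents")
for paragraph in [list((self.ConvertOneDoc(document, "Para"), category))
for document, category in self._CatDocs]]'''

# Expression with the brackets balanced.
fixed = '''[
    list(self.ConvertOneDoc(paragraph, "Sents"))
    for paragraph in [
        list((self.ConvertOneDoc(document, "Para"), category))
        for document, category in self._CatDocs
    ]
]'''


def is_valid_expression(source):
    """Hypothetical helper: True if `source` parses as one Python expression."""
    try:
        # "eval" mode only parses and compiles; nothing is executed,
        # so names such as `self` do not need to exist.
        compile(source, "<snippet>", "eval")
        return True
    except SyntaxError:
        return False


print(is_valid_expression(broken))
print(is_valid_expression(fixed))
```

Note that this only checks syntax. It says nothing about whether the comprehension builds the structure you want.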

Now, your code is a bit obscure, but your intention seems clear. As I understand it, you want a nested structure per document of the form ([[words]], category), where words is a sentence, [words] is a paragraph and [[words]] is a document. Here is an attempt at that:

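# One (nested_lists, category) tuple per document.
# "Para" and "Sents" are the mode names used in the question; "Word" is a guess.
self._PSW = [
    ([
        [
            self.ConvertOneDoc(sentence, "Word")  # sentence -> list of words
            for sentence in self.ConvertOneDoc(paragraph, "Sents")
        ]
        for paragraph in self.ConvertOneDoc(document, "Para")
    ], category)
    for document, category in self._CatDocs
]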
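
To see the shape this produces, here is a small self-contained sketch. Since `ConvertOneDoc` was not shown in the question, `convert_one_doc` below is a hypothetical stand-in that splits on blank lines, sentence-ending punctuation, and whitespace. The mode names and the sample data are assumptions too.

```python
import re


def convert_one_doc(text, mode):
    """Hypothetical stand-in for ConvertOneDoc: split `text` according to `mode`."""
    if mode == "Para":
        parts = text.split("\n\n")                 # blank line separates paragraphs
    elif mode == "Sents":
        parts = re.split(r"(?<=[.!?])\s+", text)   # split after ., ! or ?
    elif mode == "Word":
        parts = text.split()                       # whitespace separates words
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [p.strip() for p in parts if p.strip()]


# Made-up sample data: a list of (document, category) tuples.
cat_docs = [
    ("One two. Three four.\n\nFive six.", "news"),
    ("Seven eight.", "sport"),
]

# Same nesting as above: document -> paragraphs -> sentences -> words.
psw = [
    ([
        [
            convert_one_doc(sentence, "Word")
            for sentence in convert_one_doc(paragraph, "Sents")
        ]
        for paragraph in convert_one_doc(document, "Para")
    ], category)
    for document, category in cat_docs
]

print(psw[0])
# ([[['One', 'two.'], ['Three', 'four.']], [['Five', 'six.']]], 'news')
```

Each element of `psw` is a tuple whose first item is a list of paragraphs, each paragraph a list of sentences, each sentence a list of words. The category stays attached to the document as a whole rather than being mixed into the paragraph list.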