'NoneType' object is not iterable for collocation function

I am new to NLTK and am trying to return the collocation output from a function. The collocations are printed, but the value I get back is None, which then raises a TypeError. Below are my code, input, and output.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    stop_words = set(stopwords.words('english'))
    pattern = r'\w+'
    tokenizedwords = nltk.regexp_tokenize(textcontent, pattern)
    for i in range(len(tokenizedwords)):
        tokenizedwords[i] = tokenizedwords[i].lower()
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    tokenizednonstopwordsbigrams = [ (w1, w2) for w1, w2 in tokenizedwordsbigrams if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    collocationwords = tokenizedwords.collocations()
    return mostfrequentwordafter, collocationwords


if __name__ == '__main__':
    textcontent = input()

    word = input()


    mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
    print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
    print(sorted(collocationwords))

input: Thirty-five sports disciplines and four cultural activities will be offered during seven days of competitions. He skated with charisma, changing from one gear to another, from one direction to another, faster than a sports car. Armchair sports fans settling down to watch the Olympic Games could be for the high jump if they do not pay their TV licence fee. Such invitationals will attract more viewership for sports fans by sparking interest among sports fans. She barely noticed a flashy sports car almost run them over, until Eddie lunged forward and grabbed her body away. And he flatters the mother and she kind of gets prissy and he talks her into going for a ride in the sports car.

sports

output:
sports car; sports fans.

[('fans', 3), ('car', 3), ('disciplines', 1)]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-191-40624b3de987> in <module>
     43     mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
     44     print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
---> 45     print(sorted(collocationwords))

TypeError: 'NoneType' object is not iterable

Can you please help me resolve this issue?

There are 5 best solutions below

Seaver Olson On

The key function transforms each item of the collection before the items are compared; key= really means "as I run through this list, I will compare on this value". So when you use key=lambda element: (element[1], element[0]), you are asking it to compare on two values: the count first, then the word. Instead, try something like this. Note that this may not be exactly correct, as it is 7 am and I just woke up; I will edit it later if it does not work for you.

mylist = [0, 1]
# A tuple cannot be indexed with a list, so build the key from each index in mylist.
print(sorted(mostfrequentwordafter, key=lambda element: tuple(element[i] for i in mylist), reverse=True))
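For reference, here is a minimal sketch of how a tuple-valued key behaves, using the three pairs from the question's output as sample data. The variable names are illustrative only, and nothing here depends on NLTK.

```python
# Sample (word, count) pairs copied from the output shown in the question.
pairs = [('car', 3), ('disciplines', 1), ('fans', 3)]

# key is called once per item; the returned tuples are compared element by
# element, so ties on the count are broken by the word.
by_count_then_word = sorted(
    pairs,
    key=lambda element: (element[1], element[0]),
    reverse=True,
)

print(by_count_then_word)
# prints [('fans', 3), ('car', 3), ('disciplines', 1)]
```

This matches the second line of the question's output, which suggests the sort key itself is behaving as intended.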
Dharmendra Singh On

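collocations() prints the collocations and returns None, which is what causes the error in your code. I faced this issue recently and was able to resolve it by using collocation_list(), which returns the collocations instead of printing them. Try this approach.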

collocationwords = tokenizedwords.collocation_list()
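The failure itself can be reproduced without NLTK. Below is a minimal sketch using two hypothetical helpers (show_phrases and list_phrases are made up for illustration and are not NLTK functions): a function that only prints hands None back to its caller, and sorted(None) raises the TypeError from the question.

```python
def show_phrases(phrases):
    """Hypothetical stand-in for a method that only prints its result."""
    print("; ".join(phrases))
    # No return statement, so the caller receives None.


def list_phrases(phrases):
    """Hypothetical stand-in for a method that returns its result."""
    return list(phrases)


sample = ["sports fans", "sports car"]

printed = show_phrases(sample)  # prints: sports fans; sports car
try:
    sorted(printed)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

returned = sorted(list_phrases(sample))

print(error_message)  # prints 'NoneType' object is not iterable
print(returned)       # prints ['sports car', 'sports fans']
```

The same reasoning applies to any call whose result is printed rather than returned: capture the returned list first, then sort it.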
harshad_ On

Use the code below; it should work.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    # Tokenize on runs of word characters and lowercase every token.
    tokenizedword = nltk.regexp_tokenize(textcontent, pattern=r'\w+', gaps=False)
    tokenizedwords = [x.lower() for x in tokenizedword]
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    # Keep only the bigrams in which neither word is a stop word.
    stop_words = set(stopwords.words('english'))
    tokenizednonstopwordsbigrams = [(w1, w2) for w1, w2 in tokenizedwordsbigrams if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    # collocation_list() returns the collocations; collocations() only prints them.
    collocationwords = tokenizedwords.collocation_list()

    return mostfrequentwordafter, collocationwords
Suchitra U On

collocation_list() alone did not help. I tried the following, which joins each pair into a single string, and it worked for me.

collocationwords1 = tokenizedwords.collocation_list()

# Join each (word1, word2) pair into a single "word1 word2" string.
collocationwords = [item[0] + " " + item[1] for item in collocationwords1]
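Whether the join is needed seems to depend on the installed NLTK version: as far as I know, some releases return already-joined strings from collocation_list() while newer ones return tuples, so treat that as an assumption to check against your own installation. A minimal sketch of a helper that accepts either shape (to_phrases is a hypothetical name, not part of NLTK):

```python
def to_phrases(collocations):
    """Return each collocation as a single space-separated string.

    Hypothetical helper: accepts items that are already strings as well as
    (word1, word2) tuples, so it works with either return shape.
    """
    phrases = []
    for item in collocations:
        if isinstance(item, str):
            phrases.append(item)            # already "word1 word2"
        else:
            phrases.append(" ".join(item))  # ("word1", "word2") -> "word1 word2"
    return phrases


from_tuples = to_phrases([("sports", "fans"), ("sports", "car")])
from_strings = to_phrases(["sports fans", "sports car"])

print(sorted(from_tuples))   # prints ['sports car', 'sports fans']
print(sorted(from_strings))  # prints ['sports car', 'sports fans']
```

Checking for str first matters here: indexing a string with item[0] and item[1] would silently pick out single characters instead of words.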
Rohan Kottargi On

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    tokenizedword = nltk.regexp_tokenize(textcontent, pattern=r'\w+', gaps=False)
    tokenizedwords = [x.lower() for x in tokenizedword]
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    stop_words = set(stopwords.words('english'))
    tokenizednonstopwordsbigrams = [(w1, w2) for w1, w2 in tokenizedwordsbigrams if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    collocationwords1 = tokenizedwords.collocation_list()

    # Join each (word1, word2) pair into a single string.
    collocationwords = list()
    for item in collocationwords1:
        newitem = item[0] + " " + item[1]
        collocationwords.append(newitem)

    return mostfrequentwordafter, collocationwords

This code worked for me.