I am unable to count the occurrences of words in a pandas Series row-wise


I have a pandas DataFrame, spam, with a column spam['v2'] in which each row contains a sentence. I would like to create a new Series holding the word counts for each row, where each value is a dictionary with words as keys and their counts as values.

For example, if my original series looks like this:

[image: example of the original series]

I would like to create a new Series in which each row holds a dictionary like this:

[image: example of the desired dictionaries]

For those who want to see the full working file (OneDrive link): https://1drv.ms/f/s!AsQPI-pwVwq5v03-11e7R3Rme-2l?e=9LMtgd

I tried the following, which works, but it relies on plain Python:

import pandas as pd

spam = pd.read_csv('spam.csv')

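def freq(text):
    # Split the sentence on whitespace
    words = text.split()
    # Count every word's occurrences (quadratic in the number of words)
    wfreq = [words.count(w) for w in words]
    # Repeated words collapse to a single key when the dict is built
    return dict(zip(words, wfreq))

# .apply already returns a Series, so no extra pd.Series(...) wrapper is needed
count = spam['v2'].apply(freq)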

I'm not sure how to approach this efficiently with pandas Series methods rather than plain Python. Could someone please guide me on how to achieve this using pandas?

Thank you!


There is 1 solution below

Answer by notarealgreal:
import pandas as pd

spam = pd.read_csv('spam.csv')

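def word_count(text):
    # Split the sentence on whitespace
    words = text.split()
    # Map each word to the number of times it appears (single pass)
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts

# Store one dictionary of word counts per row in a new column
spam['word_count'] = spam['v2'].apply(word_count)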
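
A possible variation on the above, as a sketch rather than a definitive answer: the splitting can be done with the vectorized .str.split(), and collections.Counter can replace the manual loop. Building one dictionary per row still runs Python code for each row, so if a dictionary is not strictly required, a long-format count via explode() and groupby() keeps more of the work inside pandas. The small DataFrame below is a hypothetical stand-in for spam.csv (only the 'v2' column name comes from the question), and the fillna("") step assumes missing rows should yield empty counts.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for spam.csv; only the 'v2' column name is from the question.
spam = pd.DataFrame({"v2": ["free entry free prize", "ok see you later", None]})

# Vectorized whitespace split; missing rows are treated as empty sentences (assumption).
tokens = spam["v2"].fillna("").str.split()

# One dictionary of word counts per row.
spam["word_count"] = tokens.apply(lambda words: dict(Counter(words)))

# Long-format alternative: one entry per (row index, word) pair.
# explode() turns empty lists into NaN, which dropna() removes.
long_counts = tokens.explode().dropna().groupby(level=0).value_counts()

print(spam["word_count"].tolist())
print(long_counts)
```

The dictionary column matches the output shape asked for in the question, while long_counts is usually easier to filter or aggregate afterwards.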