R sentiment analysis applied to a whole column


I have a dataframe of tweets. A given tweet can contain multiple sentences. When I use sentimentr's sentiment function, it returns a score for each sentence, like so:

sentiment(as.character(tweets$text[1]))$sentiment
>>> [1] 0.2474874 0.0000000

But if I want a single score for the whole tweet, I can approximate one by taking the mean of the sentence scores:

mean(sentiment(as.character(tweets$text[1]))$sentiment)
>>>[1] 0.1237437

So I figured I could apply the same logic to the entire dataframe:

tweets$sentiment <- mean(sentiment(as.character((tweets$text)))$sentiment)

But this returns the same value for every tweet. And if I drop the mean(), I get NULL, as there are too many sentences/scores to unpack.

How can I get a single value assigned to every row of my dataframe?

There are 2 best solutions below

Answer by Ronak Shah (best answer):

We can use sapply to apply the sentiment function to each text individually, so that mean() is taken per tweet rather than over every sentence in the column.

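library(sentimentr)

# Convert to character in case the column was read in as a factor
tweets$text <- as.character(tweets$text)
# Score each tweet on its own and average its sentence scores;
# USE.NAMES = FALSE stops sapply from naming each result with the tweet text
tweets$sentiment_score <- sapply(tweets$text, function(x)
                             mean(sentiment(x)$sentiment), USE.NAMES = FALSE)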
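
To see why scoring the whole column at once gave every row the same number, here is a minimal base-R sketch. It does not use sentimentr: `toy_score` is a hypothetical stand-in for `sentiment(x)$sentiment` that simply returns the character count of each sentence, and the `tweets` data frame is made up for illustration.

```r
# Hypothetical stand-in for sentiment(x)$sentiment: one score per sentence
# (here just the number of characters in each sentence).
toy_score <- function(x) {
  sentences <- unlist(strsplit(x, "(?<=[.?!])\\s+", perl = TRUE))
  nchar(sentences)
}

# Made-up example data
tweets <- data.frame(
  text = c("Good. Bad day.", "Fine!", "One. Two. Three."),
  stringsAsFactors = FALSE
)

# Passing the whole column pools the sentences of all tweets, so mean()
# returns a single number, which R recycles into every row.
tweets$pooled <- mean(toy_score(tweets$text))

# Scoring one tweet at a time gives one mean per row. vapply() also checks
# that each result is a single numeric value.
tweets$per_row <- vapply(tweets$text, function(x) mean(toy_score(x)),
                         numeric(1), USE.NAMES = FALSE)

print(tweets)
```

The `pooled` column holds one repeated value, while `per_row` differs by tweet. `vapply` is used here instead of `sapply` only because it fails loudly if a tweet ever yields something other than a single number; either works for this task.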

Answer by jazzurro:

If you prefer a sentimentr/tidy way, you can do the following. get_sentences() splits each tweet into sentences, and sentiment_by() then aggregates the sentence scores. Here I used id as the grouping variable to get an average sentiment score for each tweet.

library(sentimentr)  # get_sentences(), sentiment_by()
library(magrittr)    # %$% exposition pipe
library(dplyr)       # tibble(), mutate()

mytweets <- tibble(id = 1:3,
                   mytext = c("do you like it?  But I hate really bad dogs",
                              "I think the sentimentr package is great. But I need to learn how to use it",
                              "Do you like data science? I do!"))

mutate(mytweets,
      sentence_split = get_sentences(mytext)) %$%
sentiment_by(sentence_split, list(id))

   id word_count        sd ave_sentiment
1:  1         10 1.4974654    -0.8088680
2:  2         16 0.2906334     0.3944911
3:  3          7 0.1581139     0.1220192
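
If there is no id column, a shorter variant may be enough. The sketch below assumes that sentiment_by() with the default `by = NULL` returns one row per input element in the original order, and that `ave_sentiment` uses sentimentr's default down-weighted-zero average rather than a plain mean; passing `averaging.function = average_mean` is shown as the option for a plain mean. The `tweets` data frame and the column names `sentiment` and `sentiment_mean` are made up for illustration.

```r
library(sentimentr)

# Made-up example data
tweets <- data.frame(
  text = c("I love this. It is great!", "This is terrible."),
  stringsAsFactors = FALSE
)

# With no grouping variable, scores are aggregated per input element,
# so there is one row per tweet.
by_tweet <- sentiment_by(get_sentences(tweets$text))
tweets$sentiment <- by_tweet$ave_sentiment

# Same aggregation, but with a plain arithmetic mean of sentence scores
by_tweet_mean <- sentiment_by(get_sentences(tweets$text),
                              averaging.function = average_mean)
tweets$sentiment_mean <- by_tweet_mean$ave_sentiment

print(tweets)
```

Because the default averaging function is not a plain mean, the `sentiment` column can differ slightly from what `mean(sentiment(x)$sentiment)` gives for the same tweet.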