NLP - Worse results when adding stemming or lemmatization for Sentiment Analysis


I'm trying to build a full pipeline of results for sentiment analysis on a smaller subset of the IMDB reviews (only 2k positive, 2k negative), so I want to show results at each stage:

i.e. without any pre-processing, then with basic cleaning (removing special characters and stopwords, lowercasing), then testing both stemming and lemmatization (separately) on top of the basic cleaning.
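For reference, the preprocessing stages look roughly like this (a minimal sketch using NLTK; the exact regex, tokenization, and stopword list are my assumptions, not necessarily identical to what I ran):

```python
import re

from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

STOPWORDS = set(stopwords.words("english"))  # requires nltk.download("stopwords")
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()             # requires nltk.download("wordnet")

def basic_clean(text):
    """Lowercase, strip non-letter characters, drop stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]

def clean_only(text):
    return " ".join(basic_clean(text))

def clean_and_stem(text):
    return " ".join(stemmer.stem(tok) for tok in basic_clean(text))

def clean_and_lemmatize(text):
    # Note: without POS tags, WordNetLemmatizer treats every token as a noun,
    # so verb forms like "hated" or "going" are left mostly untouched.
    return " ".join(lemmatizer.lemmatize(tok) for tok in basic_clean(text))
```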

After basic cleaning, accuracy jumps from 50% (which makes sense as a baseline for binary classification) to the low-to-mid 80s. Then after adding stemming or lemmatization, it either doesn't change or, for random forest, drops recall below 80%.

Why is this the case? Are my results normal? If so, how do you justify using either technique?

Also note that all of the models and feature extractors use sklearn's default parameters, so I haven't gotten to the model optimization part yet. Should I tune each of these three cases and then check whether stemming and lemmatization still perform worse?

Feature extraction: Bag of Words and TF-IDF

Models: SVM, Logistic Regression, Multinomial Naive Bayes and Random Forest
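Putting it together, the evaluation loop is essentially the following (a sketch with everything left at sklearn defaults; the 75/25 split, LinearSVC as the SVM, and the variable names are assumptions on my part):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# texts: list of 4,000 cleaned review strings; labels: "Positive"/"Negative"
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42
)

vectorizers = {"BOW": CountVectorizer(), "TF-IDF": TfidfVectorizer()}
models = {
    "SVM": LinearSVC(),
    "LR": LogisticRegression(),
    "MNB": MultinomialNB(),
    "RFC": RandomForestClassifier(),
}

for vec_name, vectorizer in vectorizers.items():
    # Fit the vocabulary on the training split only, then reuse it for test.
    Xtr = vectorizer.fit_transform(X_train)
    Xte = vectorizer.transform(X_test)
    for model_name, model in models.items():
        model.fit(Xtr, y_train)
        print(f"{model_name} {vec_name}")
        print(classification_report(y_test, model.predict(Xte)))
```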

Results:

Basic Cleaning (removing special characters and stopwords, lowercasing)

SVM BOW
              precision    recall  f1-score   support

    Positive       0.85      0.85      0.85       530
    Negative       0.83      0.83      0.83       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


SVM TF-IDF
              precision    recall  f1-score   support

    Positive       0.85      0.88      0.86       530
    Negative       0.86      0.83      0.84       470

    accuracy                           0.85      1000
   macro avg       0.86      0.85      0.85      1000
weighted avg       0.86      0.85      0.85      1000


LR BOW
              precision    recall  f1-score   support

    Positive       0.87      0.85      0.86       530
    Negative       0.83      0.85      0.84       470

    accuracy                           0.85      1000
   macro avg       0.85      0.85      0.85      1000
weighted avg       0.85      0.85      0.85      1000


LR TF-IDF
              precision    recall  f1-score   support

    Positive       0.89      0.82      0.85       530
    Negative       0.81      0.88      0.84       470

    accuracy                           0.85      1000
   macro avg       0.85      0.85      0.85      1000
weighted avg       0.85      0.85      0.85      1000


MNB BOW
              precision    recall  f1-score   support

    Positive       0.83      0.85      0.84       530
    Negative       0.82      0.81      0.82       470

    accuracy                           0.83      1000
   macro avg       0.83      0.83      0.83      1000
weighted avg       0.83      0.83      0.83      1000


MNB TF-IDF
              precision    recall  f1-score   support

    Positive       0.86      0.84      0.85       530
    Negative       0.82      0.85      0.83       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


RFC BOW
              precision    recall  f1-score   support

    Positive       0.85      0.80      0.82       530
    Negative       0.79      0.84      0.81       470

    accuracy                           0.82      1000
   macro avg       0.82      0.82      0.82      1000
weighted avg       0.82      0.82      0.82      1000


RFC TF-IDF
              precision    recall  f1-score   support

    Positive       0.84      0.81      0.83       530
    Negative       0.80      0.83      0.81       470

    accuracy                           0.82      1000
   macro avg       0.82      0.82      0.82      1000
weighted avg       0.82      0.82      0.82      1000

Basic Cleaning + Stemming

SVM BOW
              precision    recall  f1-score   support

    Positive       0.85      0.82      0.83       530
    Negative       0.80      0.83      0.82       470

    accuracy                           0.82      1000
   macro avg       0.82      0.82      0.82      1000
weighted avg       0.82      0.82      0.82      1000


SVM TF-IDF
              precision    recall  f1-score   support

    Positive       0.85      0.85      0.85       530
    Negative       0.83      0.83      0.83       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


LR BOW
              precision    recall  f1-score   support

    Positive       0.85      0.83      0.84       530
    Negative       0.81      0.84      0.83       470

    accuracy                           0.83      1000
   macro avg       0.83      0.83      0.83      1000
weighted avg       0.83      0.83      0.83      1000


LR TF-IDF
              precision    recall  f1-score   support

    Positive       0.89      0.81      0.85       530
    Negative       0.80      0.88      0.84       470

    accuracy                           0.84      1000
   macro avg       0.84      0.85      0.84      1000
weighted avg       0.85      0.84      0.84      1000


MNB BOW
              precision    recall  f1-score   support

    Positive       0.83      0.84      0.84       530
    Negative       0.82      0.81      0.82       470

    accuracy                           0.83      1000
   macro avg       0.83      0.83      0.83      1000
weighted avg       0.83      0.83      0.83      1000


MNB TF-IDF
              precision    recall  f1-score   support

    Positive       0.87      0.83      0.85       530
    Negative       0.82      0.86      0.84       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


RFC BOW
              precision    recall  f1-score   support

    Positive       0.84      0.77      0.80       530
    Negative       0.76      0.83      0.79       470

    accuracy                           0.80      1000
   macro avg       0.80      0.80      0.80      1000
weighted avg       0.80      0.80      0.80      1000


RFC TF-IDF
              precision    recall  f1-score   support

    Positive       0.83      0.79      0.81       530
    Negative       0.78      0.81      0.80       470

    accuracy                           0.80      1000
   macro avg       0.80      0.80      0.80      1000
weighted avg       0.80      0.80      0.80      1000

Basic Cleaning + Lemmatization

SVM BOW
              precision    recall  f1-score   support

    Positive       0.84      0.83      0.83       530
    Negative       0.81      0.82      0.82       470

    accuracy                           0.83      1000
   macro avg       0.83      0.83      0.83      1000
weighted avg       0.83      0.83      0.83      1000


SVM TF-IDF
              precision    recall  f1-score   support

    Positive       0.85      0.86      0.86       530
    Negative       0.84      0.83      0.84       470

    accuracy                           0.85      1000
   macro avg       0.85      0.85      0.85      1000
weighted avg       0.85      0.85      0.85      1000


LR BOW
              precision    recall  f1-score   support

    Positive       0.86      0.84      0.85       530
    Negative       0.82      0.84      0.83       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


LR TF-IDF
              precision    recall  f1-score   support

    Positive       0.88      0.81      0.84       530
    Negative       0.80      0.87      0.84       470

    accuracy                           0.84      1000
   macro avg       0.84      0.84      0.84      1000
weighted avg       0.84      0.84      0.84      1000


MNB BOW
              precision    recall  f1-score   support

    Positive       0.82      0.85      0.83       530
    Negative       0.82      0.80      0.81       470

    accuracy                           0.82      1000
   macro avg       0.82      0.82      0.82      1000
weighted avg       0.82      0.82      0.82      1000


MNB TF-IDF
              precision    recall  f1-score   support

    Positive       0.85      0.83      0.84       530
    Negative       0.81      0.84      0.82       470

    accuracy                           0.83      1000
   macro avg       0.83      0.83      0.83      1000
weighted avg       0.83      0.83      0.83      1000


RFC BOW
              precision    recall  f1-score   support

    Positive       0.84      0.78      0.81       530
    Negative       0.77      0.83      0.80       470

    accuracy                           0.80      1000
   macro avg       0.80      0.81      0.80      1000
weighted avg       0.81      0.80      0.80      1000


RFC TF-IDF
              precision    recall  f1-score   support

    Positive       0.84      0.81      0.82       530
    Negative       0.80      0.82      0.81       470

    accuracy                           0.82      1000
   macro avg       0.82      0.82      0.82      1000
weighted avg       0.82      0.82      0.82      1000

1 Answer

Darren Cook:

I would assume the scores you are getting are about as good as they will get using bag-of-words or TF-IDF approaches.

For instance, the sentiment doesn't change between "I hated every minute of this movie, the plot was going nowhere" and "I hate every minute of this movie, the plot is go nowhere", and the latter is roughly what the former looks like after stemming.
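To make that concrete, here's a quick check with NLTK's Porter stemmer (my addition, not part of the original answer): stemming the fluent sentence produces almost exactly the degraded one, so the model sees them the same either way. The only surviving difference, "wa" vs. "is", carries no sentiment.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

past = "I hated every minute of this movie, the plot was going nowhere"
present = "I hate every minute of this movie, the plot is go nowhere"

def stem_sentence(sentence):
    # Lowercase, drop the comma, and stem each whitespace token.
    tokens = sentence.lower().replace(",", "").split()
    return " ".join(stemmer.stem(tok) for tok in tokens)

print(stem_sentence(past))     # i hate everi minut of thi movi the plot wa go nowher
print(stem_sentence(present))  # i hate everi minut of thi movi the plot is go nowher
```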