R quanteda: textplot_network for each document, and influence of the number of features per dictionary key


Many thanks for the quanteda package; it is so powerful to use.

I have three questions:

  • Below is an example (in French, sorry). Can we plot the DFM (or FCM), categorized by dictionary, separately for each document? The plot always seems to cover all documents together.

  • Is there a way to link the features (only the main ones) to their dictionary keys on the graph?

  • More theoretically: is it possible (and useful) to account for how the number of features in a dictionary key influences the number of times those features occur in a document? In other words, does it matter that one key has only a few features while another has many?

Thank you.

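library(quanteda)
library(quanteda.textplots)

s <- c("Je suis certain de trouve philosophie et de tristesse",
       "  je trouve la vie belle et       tristesse",
       "je suis blanc et je suis peureux",
       "Je suis belle et pleine de tristesse")

toks <- tokens(s)
dfmat <- dfm(toks)  # renamed from `dfm` so the object does not mask the function

# Note: dictionary matching is case-insensitive by default,
# so "je" (peu) and "Je" (tout) match the same tokens.
dict1 <- dictionary(list(emotion = c("philosophie", "tristesse", "belle", "blanc", "trouve"),
                         peu = c("je"),
                         tout = c("Je", "suis", "et", "de")))

# Count dictionary keys per document; unmatched tokens go to "_unmatched"
dict_dtm2 <- dfm_lookup(dfmat, dict1, nomatch = "_unmatched")
tail(dict_dtm2)

# Keep only the features that appear in the dictionary
dict_sel <- dfm_select(dfmat, pattern = dict1)
tail(dict_sel)

# Co-occurrence of dictionary keys, computed over all documents together
fcm_dtm2 <- fcm(dict_dtm2)
size <- log(colSums(fcm_dtm2))

# Direct call instead of %>%, which is not available unless a pipe package is loaded
textplot_network(fcm_dtm2,
                 min_freq = 1,
                 omit_isolated = TRUE,
                 vertex_size = size / max(size) * 5,
                 edge_alpha = 0.5)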
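
For the first question, one possible approach (a minimal sketch, not a confirmed quanteda recipe) is to build one FCM per document and plot each separately. The sketch below assumes that keeping only the dictionary-matched tokens and counting co-occurrence within each single document is what is wanted; the names `toks_dict`, `fcm_i`, and `plots` are introduced here for illustration only.

```r
library(quanteda)
library(quanteda.textplots)

s <- c("Je suis certain de trouve philosophie et de tristesse",
       "  je trouve la vie belle et       tristesse",
       "je suis blanc et je suis peureux",
       "Je suis belle et pleine de tristesse")

dict1 <- dictionary(list(emotion = c("philosophie", "tristesse", "belle", "blanc", "trouve"),
                         peu = c("je"),
                         tout = c("Je", "suis", "et", "de")))

toks <- tokens(s)

# Assumption: only tokens that match some dictionary value are of interest
toks_dict <- tokens_select(toks, pattern = dict1)

# Build one network plot per document
plots <- lapply(seq_along(toks_dict), function(i) {
  # Co-occurrence counted within this single document only
  fcm_i <- fcm(toks_dict[i], context = "document")
  textplot_network(fcm_i, min_freq = 1, omit_isolated = TRUE, edge_alpha = 0.5) +
    ggplot2::ggtitle(docnames(toks_dict)[i])
})

# textplot_network() returns a ggplot object, so print each one explicitly
for (p in plots) print(p)
```

If the network should show dictionary keys rather than individual features, replacing `tokens_select()` with `tokens_lookup(toks, dict1)` may work in the same loop, although with only three keys each graph would be very small.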