I have the list below and want to get the corresponding POS tag for each token. A sample of the desired output is given below.
processed_lst = [['The', 'wild', 'is', 'dangerous'], ['The', 'rockstar', 'is', 'wild']]
I want to use the spaCy library and get output like this:
final_lst = [[(The, DET), (wild, NOUN), (is, AUX), (dangerous, ADJ)], [(The, DET), (rockstar, NOUN), (is, AUX), (wild, ADJ) ]]
You can do this with the .pos_ attribute of a token after you turn it into a spaCy document. The code below is pulled from this post on Part of Speech Tagging.
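For the pre-tokenized lists in the question, a minimal sketch could look like the following. It is not the code from the linked post: the helper names pos_tag_pretokenized and demo are made up for illustration, and it assumes the en_core_web_sm model is installed. Building the Doc directly from the existing words keeps spaCy from re-tokenizing them, and the exact tags you get depend on the model.

```python
import spacy
from spacy.tokens import Doc


def pos_tag_pretokenized(sentences, nlp):
    """Return a list of (token, POS) pairs for each pre-tokenized sentence.

    Hypothetical helper for illustration; `nlp` is any loaded spaCy pipeline.
    """
    tagged = []
    for words in sentences:
        # Build the Doc from the existing tokens so spaCy keeps them as given.
        doc = Doc(nlp.vocab, words=words)
        # Run every pipeline component (tagger and so on) on the Doc.
        for _name, component in nlp.pipeline:
            doc = component(doc)
        # token.pos_ is the coarse-grained part-of-speech label as a string.
        tagged.append([(token.text, token.pos_) for token in doc])
    return tagged


def demo():
    # Assumption: the model was installed beforehand with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    processed_lst = [['The', 'wild', 'is', 'dangerous'],
                     ['The', 'rockstar', 'is', 'wild']]
    final_lst = pos_tag_pretokenized(processed_lst, nlp)
    print(final_lst)
```

Calling demo() prints a nested list shaped like final_lst in the question. A statistical tagger may not label every token the way the sample output does (for example, 'wild' used as a noun), so treat the sample as the target format rather than a guaranteed result.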