sprintf("%s%s") returning 'character(0)' instead of string when combining two lists


I'm trying to combine two lists to extract the subject associated with each verb in a list of sentences. (In this case, the subject is the noun directly preceding the verb.) I am having trouble with the annotation of the sentences. I can get it to work for a single sentence, but when I try to print out all of the annotations, it only prints character(0) instead. What am I doing wrong? Here is a minimal working example to illustrate:

library(NLP)
library(openNLP)
library(stringr)


allsentences<-paste(c(
  "Adam pitched the ball",
  "Adam adopts the cat",
  "Adam talks to the monkey",
  "Adam pets the dog?",
  "Adam tosses the bat",
  "Adam opens the book",
  "Adam moves the desk",
  "Adam threw the bag?",
  "Adam looks at the rat",
  "Alex threw the ball",
  "Alex loves the cat",
  "Alex pets the monkey",
  "Alex adopted the dog",
  "Alex throws the bat",
  "Alex reads the book",
  "Alex buys the desk",
  "Alex moves the bag",
  "Alex avoids the rat"
),collapse=". ")

allsentences

#identify verb in each statement
s <- as.String(allsentences)  # want this to cover all sentences, not just the first one
#s <- as.String(allsentences[1])  # first sentence only
sent_token_annotator <- Maxent_Sent_Token_Annotator()
word_token_annotator <- Maxent_Word_Token_Annotator()
a2 <- annotate(s, list(sent_token_annotator, word_token_annotator))
pos_tag_annotator <- Maxent_POS_Tag_Annotator()
pos_tag_annotator
a3 <- annotate(s, pos_tag_annotator, a2)
a3

## Determine the distribution of POS tags for word tokens.
a3w <- subset(a3, type == "word")
tags <- sapply(a3w$features, `[[`, "POS")
## Extract token/POS pairs (all of them): easy.
sprintf("%s/%s", s[a3w], tags)

## Extract pairs of word tokens and POS tags for second sentence:
#THIS WORKS 
#a3ws2 <- annotations_in_spans(subset(a3, type == "word"),
#                              subset(a3, type == "sentence")[2])[[1]]
#THIS IS THE PART THAT ISN'T WORKING
a3ws2 <- annotations_in_spans(subset(a3, type == "word"), 
   subset(a3, type == "sentence"))
#this is printing character(0) but I want a printout of all of the annotated sentences
sprintf("%s/%s", s[a3ws2], sapply(a3ws2$features, `[[`, "POS"))
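
A possible explanation, sketched in base R only: the `[[1]]` in the working version suggests that `annotations_in_spans()` returns a plain list with one element per sentence span. If so, `a3ws2$features` on the whole list is `NULL`, and `sprintf()` returns `character(0)` whenever one of its arguments has length zero. The objects below (`per_sentence`, `tagged`) are hypothetical stand-ins built from ordinary lists, not real annotation objects.

```r
# Hypothetical stand-in for the per-sentence result: a plain, unnamed list
# with one element per sentence (mock data, not real Annotation objects).
per_sentence <- list(
  list(word = c("Adam", "pitched"), POS = c("NNP", "VBD")),
  list(word = c("Alex", "threw"),   POS = c("NNP", "VBD"))
)

# `$` on the outer list looks for an element named "POS"; there is none.
is.null(per_sentence$POS)                # TRUE

# sprintf() yields a zero-length result if any argument has length zero.
sprintf("%s/%s", "Adam", character(0))   # character(0)

# Iterating over the list formats each sentence on its own.
tagged <- lapply(per_sentence, function(a) sprintf("%s/%s", a$word, a$POS))
tagged
```

Under the same assumption, the equivalent change to the code above would be to loop over the sentences, for example `lapply(a3ws2, function(a) sprintf("%s/%s", s[a], sapply(a$features, `[[`, "POS")))`, which should give one character vector of token/POS pairs per sentence.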