I have the following output from data that I have downloaded from the Wall Street Journal.
> Search(MySymList, " Net Income")
Fiscal year is July-June. All values AUD Millions. 2018 2017 2016 2015 2014 5-year trend
82 Consolidated Net Income 949 814 376 850 769
86 Net Income 934 792 335 817 737
88 Net Income Growth 18.04% 135.99% -58.93% 10.83% -
103 Net Income After Extraordinaries 934 792 335 817 909
107 Net Income Available to Common 934 792 335 817 565
I want to capture the Net Income row, but its position (line number) is not consistent across the data, so I tried the qdap library, and its Search function in particular. It does a wonderful job of finding most information, but I am stumped on how to remove the other lines.
I thought an exclude argument might help, but it doesn't work:
Search(MySymList, " Net Income", exclude = "Common")
Error in agrep(term, x, ignore.case = TRUE, max.distance = max.distance, :
unused argument (exclude = "Common")
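Judging from the error message, Search seems to pass extra arguments straight through to agrep, which has no exclude parameter. As a point of comparison only, here is a minimal base R sketch of "match one pattern, exclude another". It runs on a toy data frame built from the rows shown above; the column name Item and the object name fin are hypothetical, and this does not use qdap.

```r
# Toy rows copied from the output above; "Item" and "fin" are made-up names.
fin <- data.frame(
  Item = c("Consolidated Net Income", "Net Income", "Net Income Growth",
           "Net Income After Extraordinaries", "Net Income Available to Common"),
  `2018` = c("949", "934", "18.04%", "934", "934"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep rows that mention "Net Income" but do not mention "Common".
keep <- grepl("Net Income", fin$Item, fixed = TRUE) &
  !grepl("Common", fin$Item, fixed = TRUE)

fin[keep, ]
```

This still leaves several rows that merely contain "Net Income", so an exclusion list would have to grow with every unwanted label.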
I can get the Net Income by other means but I would prefer to do it with just one function, that being Search or anything that the library qdap might offer.
Any guidance would be most welcome.
EDIT!!
The cut-down code follows, as it is easier to run than to provide data for. The symbol differs from the original, so the line numbers will have changed.
library(httr)
library(XML)
library(data.table)
library(qdap)
library(Hmisc)
getwsj.quotes <- function(Symbol)
{
  MyUrl <- sprintf("https://quotes.wsj.com/AU/XASX/%s/financials/annual/income-statement", Symbol)
  Symbol.Data <- GET(MyUrl)
  x <- content(Symbol.Data, as = 'text')
  wsj.tables <- sub('cr_dataTable cr_sub_capital', '\\1', x)
  SymData <- readHTMLTable(wsj.tables)
  return(SymData)
}
TickerList <- c("AMC")
SymbolDataList <- lapply(TickerList, FUN = getwsj.quotes)
MySymList <- SymbolDataList[[1]][[2]]  # second table returned for the first ticker
Search(MySymList, " Net Income")
Regards Stephen
I have made a breakthrough, though it might not be the most efficient code. Giving a short name to the first column helped a lot. The function which provides exact matching for searching. Alas, I cannot answer my own question about the qdap library's Search function. The output is:
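One way the exact-match approach described above might look, as a sketch on made-up rows rather than the live WSJ table. The short column name Item and the object name fin are assumptions, and the leading spaces in the labels are a guess based on the " Net Income" search term used earlier.

```r
# Toy table: the first column carries the long header text, as in the output above.
fin <- data.frame(
  `Fiscal year is July-June. All values AUD Millions.` =
    c(" Consolidated Net Income", " Net Income", " Net Income Growth",
      " Net Income After Extraordinaries", " Net Income Available to Common"),
  `2018` = c("949", "934", "18.04%", "934", "934"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Give the first column a short name so it is easy to refer to.
names(fin)[1] <- "Item"

# which() returns the row index where the trimmed label equals "Net Income" exactly,
# so "Consolidated Net Income", "Net Income Growth", etc. are not matched.
row_idx <- which(trimws(fin$Item) == "Net Income")

fin[row_idx, ]
```

Because the comparison is on the whole label, the row can sit at any line number without affecting the result.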
Thanks to everyone who considered this problem. Regards Stephen