Subtraction of two data-points always results in 0, regardless of data

I'm working with some financial data from Yahoo.

My code is

library(forecast)
library(tseries)
library(astsa)
library(fGarch)
library(quantmod)

getSymbols("^gspc",src='yahoo')

adj <- GSPC[, "GSPC.Adjusted", drop = FALSE]
n <- nrow(adj)
simpleReturn <- (adj[2:n,1] - adj[1:(n-1),1]) / adj[1:(n-1),1]

simpleReturn is a table full of zeros. I checked, and even adj[2,1] - adj[1,1] results in zero, when it should be 1.74.

I expected the code to return the simple rate of return for every data point after the first one.

AndS.

I feel like there is something weird with the class/data type such that simple subtraction does not work as expected. I converted to a tibble and calculated everything just fine:

library(tidyverse)

as_tibble(adj) |>
  mutate(date = date(adj),
         simple = (lead(GSPC.Adjusted)-GSPC.Adjusted)/GSPC.Adjusted)
#> # A tibble: 4,235 x 3
#>    GSPC.Adjusted date          simple
#>            <dbl> <date>         <dbl>
#>  1         1417. 2007-01-03  0.00123 
#>  2         1418. 2007-01-04 -0.00608 
#>  3         1410. 2007-01-05  0.00222 
#>  4         1413. 2007-01-08 -0.000517
#>  5         1412. 2007-01-09  0.00194 
#>  6         1415. 2007-01-10  0.00634 
#>  7         1424. 2007-01-11  0.00485 
#>  8         1431. 2007-01-12  0.000818
#>  9         1432. 2007-01-16 -0.000894
#> 10         1431. 2007-01-17 -0.00297 
#> # i 4,225 more rows

Joshua Ulrich

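getSymbols() returns an xts object, which is a time series. All operations on time series data should be aligned by timestamp, and xts does this for you. That is why your subtraction returns zeros: adj[2:n,1] and adj[1:(n-1),1] are matched on the dates they share, so each price is subtracted from itself. Converting to a class that isn't a time series creates the potential for many issues.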
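
The index alignment can be reproduced without downloading anything. The following is a minimal sketch using made-up prices (the `px` object and its values are assumptions for illustration, not S&P 500 data); it compares xts arithmetic with plain positional arithmetic on the underlying values.

```r
library(xts)

# Hypothetical prices for illustration only; not real market data.
dates <- as.Date("2007-01-03") + 0:3
px <- xts(c(100, 102, 101, 105), order.by = dates)

# Arithmetic on two xts objects matches rows by timestamp first.
# Only the dates present in both operands survive, and on each of
# those dates the price is subtracted from itself.
aligned <- px[2:4] - px[1:3]
print(aligned)

# Stripping the index with coredata() gives ordinary positional
# subtraction: element i+1 minus element i.
values <- as.numeric(coredata(px))
positional <- values[2:4] - values[1:3]
print(positional)

# diff() differences consecutive rows while keeping the timestamps,
# so there is no need to leave the xts class.
print(diff(px))
```

In this sketch `aligned` keeps only the two shared dates and is all zeros, while `positional` and `diff(px)` contain the actual price changes.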

You can use the ROC() function to calculate the rate-of-change.

simpleReturn <- ROC(adj, type = "discrete")
head(simpleReturn)
##            GSPC.Adjusted
## 2007-01-03            NA
## 2007-01-04   0.001228286
## 2007-01-05  -0.006084581
## 2007-01-08   0.002220318
## 2007-01-09  -0.000516676
## 2007-01-10   0.001940352

Notice how the result for 2007-01-04 is 0.0012 but it's -0.00608 in AndS's answer. That's because their result pairs each date with the next day's price, which is data from the future that you could not have known on that date. This is one example of a serious issue that can happen when you don't use a time series class for time series data.
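
The one-row shift between the two results can be seen on a small example. This is a sketch with made-up prices (`px`, `backward`, and `forward` are hypothetical names, not part of either answer); it uses `lag.xts()` from xts, where a positive `k` moves values to later timestamps and a negative `k` moves them to earlier ones.

```r
library(xts)

# Hypothetical prices for illustration only; not real market data.
dates <- as.Date("2007-01-03") + 0:3
px <- xts(c(100, 102, 101, 105), order.by = dates)

# Backward-looking return: today's price relative to the previous one.
# The first row is NA because there is no earlier observation.
backward <- px / lag.xts(px, k = 1) - 1

# Forward-looking return: the next price relative to today's price,
# which is what a lead()-based calculation produces. The value stamped
# on a given date depends on a price that is not yet known on that date.
forward <- lag.xts(px, k = -1) / px - 1

# Side by side: the same numbers, shifted by one row.
print(merge(backward, forward))
```

The two columns hold identical returns, but `forward` reports each one a row earlier than `backward` does, which is the look-ahead problem described above.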

I strongly recommend you use a time series class for time series data. You can find lots of choices in the CRAN Time Series Task View.


EDIT: also, notice that 4 of the 5 packages you loaded use time series classes: zoo (tseries, forecast), xts (quantmod), timeSeries (fGarch).