I need help understanding the output of this code. Why am I getting NaN instead of a float value? Please suggest the necessary amendments:
import matplotlib.pyplot as plt
from scipy import stats
import pandas as pd
import fix_yahoo_finance as fyf
from pandas_datareader import data as pdr
import numpy as np
fyf.pdr_override()
p=pdr.get_data_yahoo('IBM',start ='2009-01-01',end ='2013-01-01')
p.to_csv('YF_IBM_2009_2013.csv')
print(p.head())
ret = (p.Close[1:]-p.Close[:-1])/p.Close[1:]
print ('ticker=','IBM','W-test, and P-value')
print (stats.shapiro(ret))
And the output is:
ret = (p.Close[1:]-p.Close[:-1])/p.Close[1:]
print ('ticker=','IBM','W-test, and P-value')
print (stats.shapiro(ret))
ticker= IBM W-test, and P-value
(nan, 1.0)
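There is a small issue with your code. When you subtract two pandas Series directly, the index comes along: pandas matches the rows by index label (here, the date), not by position. p.Close[1:] and p.Close[:-1] share every date except the first and the last, so each price is subtracted from itself, giving 0, and the two unmatched dates come out as NaN.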
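To see the alignment on something small, here is a minimal sketch using a made-up four-day price series (the numbers and the name `close` are illustrative assumptions, not IBM data). It uses `.iloc` to make the positional slicing explicit.

```python
import pandas as pd

# Hypothetical closing prices on four consecutive dates (made-up numbers).
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2009-01-01", periods=4, freq="D"),
)

# Subtraction matches rows by date label, not by position:
# shared dates give price - same price = 0, unmatched dates give NaN.
diff = close.iloc[1:] - close.iloc[:-1]
print(diff)
```

The result has four rows rather than three: the first and last dates appear in only one of the two slices, so they come out as NaN, and every other row is 0.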
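Having the index along with the values is the reason you're getting NaN values. To select only the values from a pandas Series, without the index, use its .values attribute, which returns a plain NumPy array,
so the ret = line now is
ret = (p.Close[1:].values - p.Close[:-1].values) / p.Close[1:].values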
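As a quick check, the position-based calculation can be sketched on made-up prices (the data and variable names below are assumptions for illustration only). It uses `to_numpy()`, which does the same job as `.values`. The sketch also computes the more conventional simple return, which divides by the earlier price rather than the later one, and compares it with pandas' built-in `pct_change()`.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (made-up numbers, not IBM data).
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2009-01-01", periods=4, freq="D"),
)

# Strip the index so the arithmetic is done element by element, by position.
new = close.iloc[1:].to_numpy()
old = close.iloc[:-1].to_numpy()

# Same formula as in the question: change divided by the later price.
ret = (new - old) / new

# Conventional simple return: change divided by the earlier price.
simple_ret = (new - old) / old

# pandas' built-in equivalent of simple_ret (the first row is NaN, so drop it).
pct = close.pct_change().dropna().to_numpy()

print(ret)
print(simple_ret)
```

Either array contains no NaN and can be passed to stats.shapiro. Which denominator to use depends on the definition of return you want; the question's formula is kept here as `ret`.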
This should do what you're looking for. Comment if anything else is needed.