I am following a tutorial that performs an ARIMA time series analysis on differenced data.
The Python code is as follows:
def difference(dataset):
    diff = list()
    for i in range(1, len(dataset)):
        value = dataset[i] - dataset[i - 1]
        diff.append(value)
    return Series(diff)
series = pd.read_csv('dataset.csv')
X = series.values # The error in building the list can be seen here
X = X.astype('float32')
stationary = difference(X)
stationary.index = series.index[1:]
...
stationary.plot()
pyplot.show()
When the script reaches the plotting stage, I get the error:
TypeError: no numeric data to plot
Tracing back, I find that the parsed data ends up as a collection of single-element arrays. Saving the collection stationary as a *.csv file gives me a list like:
[11.]
[0.]
[16.]
[45.]
[27.]
[-141.]
[46.]
Can somebody tell me what is going wrong here?
PS. I have excluded the library imports.
Edit 1
A section of the dataset is reproduced below:
Year,Obs
1994,21
1995,62
1996,56
1997,29
1998,38
1999,201

To difference, just use Series.diff or DataFrame.diff(). Also, Year should be the index.
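A minimal sketch of that suggestion, using only the sample rows shown in the question. The inline csv_text string and the io.StringIO wrapper are assumptions made to keep the example self-contained; with the real file you would pass 'dataset.csv' to read_csv instead. The original error most likely comes from calling .values on a DataFrame, which returns a 2-D array, so each difference is itself an array and the resulting Series has object dtype, which plot() rejects.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice, pass the path to dataset.csv.
csv_text = """Year,Obs
1994,21
1995,62
1996,56
1997,29
1998,38
1999,201
"""

# index_col="Year" keeps the year out of the values, leaving one numeric column.
df = pd.read_csv(io.StringIO(csv_text), index_col="Year")

# Selecting the column gives a 1-D Series; diff() subtracts the previous row.
# The first row has no predecessor, so it becomes NaN and is dropped here.
stationary = df["Obs"].diff().dropna()

print(stationary)
```

Because stationary is now a numeric Series indexed by year, stationary.plot() followed by pyplot.show() should work without the hand-written difference() function or the manual index assignment.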