Difficulties training an LSTM for Straightforward Time Series Predictions


I'm attempting to train an LSTM in Keras for what appears to be a straightforward time series problem.

I need guidance on whether I'm overlooking something.

In words, the data set is a count of actions taken on each day over a history of 3 years. For my use case the data is indexed by days prior to the last day of the series, so my data set looks like this:

index    value
-1094    5
-1093    3
-1092    4
  ...
   -2    0
   -1    1
    0    0

The indexing isn't a big deal; I mention it only for context.
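For reference, an index like this can be derived from the length of the series alone. This is only a minimal sketch: `values` is a hypothetical stand-in filled with made-up numbers, not my actual data.

```python
import numpy as np

# Hypothetical stand-in for the real daily counts (1095 days, about 3 years)
values = np.random.default_rng(0).integers(0, 6, size=1095)

# Days prior to the last day of the series: -1094, ..., -1, 0
index = np.arange(-(len(values) - 1), 1)
```

The last observation always gets index 0, and the first gets minus the number of earlier days.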

Graphically, the data looks like this:

[image: plot of the data]

I want to train the LSTM on this data and then predict values for the subsequent 365 days.

My results are terrible.

I have experimented with batch size and sequence_length with no great improvement.

I'd appreciate any guidance.

Code:

[image]

The data appears to follow a yearly pattern, so I am choosing a sequence length of 365 to capture what I perceive as annual seasonality.
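To show what a sequence length of 365 means for roughly three years of daily data, here is a small sketch of the same sliding-window idea on a placeholder series. The zeros are made up; only the length (1095 days, matching the index range above) matters.

```python
import numpy as np

n_days = 1095            # about 3 years of daily values
sequence_length = 365

# Placeholder series; only its length matters for this illustration
series = np.zeros((n_days, 1))

# Each sample is 365 consecutive days; the target is the day that follows
X = np.array([series[i - sequence_length:i] for i in range(sequence_length, n_days)])
y = np.array([series[i] for i in range(sequence_length, n_days)])

print(X.shape, y.shape)
```

With these numbers there are 730 training windows, and consecutive windows overlap in all but one day.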

There is nothing special in the code (except that I currently have stateful=True). I'll provide it for reference:

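# Imports (not shown in the original post; tensorflow.keras is assumed)
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# train_df is the DataFrame holding the daily counts in 'total_daily_reg'
# Scale the counts to the range [0, 1]
scaler = MinMaxScaler()
train_df['scaled_regs'] = scaler.fit_transform(train_df[['total_daily_reg']])
total_daily_reg_scaled = train_df['scaled_regs'].values.reshape(-1, 1)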

# Define a function to create sequences for training
def create_sequences(data, seq_length):
    X, y = [], []
    for i in range(seq_length, len(data)):
        X.append(data[i - seq_length:i])
        y.append(data[i])
    return np.array(X), np.array(y)

# Define the sequence length (365 days)
sequence_length = 365

# Create sequences for training
X_train, y_train = create_sequences(total_daily_reg_scaled, sequence_length)

# Two stacked stateful LSTM layers followed by a single-output dense layer
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True, batch_input_shape=(1, sequence_length, 1)))
model.add(LSTM(16, return_sequences=False, stateful=True))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mean_squared_error')

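# early_stopping was not defined in the post; this definition is an assumption
from tensorflow.keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=3)

# Train the model, resetting the LSTM state before each pass over the data
for _ in range(20):
    model.reset_states()
    model.fit(X_train, y_train, epochs=1, batch_size=1, validation_split=0.1, callbacks=[early_stopping], verbose=2)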

# Seed the forecast with the last 365 observed (scaled) values
X_predict = total_daily_reg_scaled[-sequence_length:].reshape(1, sequence_length, 1)
predicted_scaled = []

for _ in range(365):
    # Predict the next day's registration (batch size matches batch_input_shape)
    prediction = model.predict(X_predict, batch_size=1)
    predicted_scaled.append(prediction[0, 0])

    # Shift the input sequence by one day along the time axis
    X_predict = np.roll(X_predict, shift=-1, axis=1)
    X_predict[0, -1, 0] = prediction[0, 0]  # Replace the last element with the prediction

# Inverse transform the predictions to get the actual values
predicted_scaled = np.array(predicted_scaled).reshape(-1, 1)
predicted = scaler.inverse_transform(predicted_scaled)
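To make the mechanics of the prediction loop explicit without needing Keras, here is a minimal sketch of the same roll-and-replace forecast. `predict_next` is a hypothetical stand-in for `model.predict` (it just returns the window mean), and `history` is made-up data, so the numbers mean nothing; only the bookkeeping is illustrated.

```python
import numpy as np

def predict_next(window):
    """Hypothetical stand-in for model.predict: returns the window mean, shape (1, 1)."""
    return window.mean(axis=1)

sequence_length = 365
history = np.linspace(0.0, 1.0, sequence_length)   # made-up scaled values
window = history.reshape(1, sequence_length, 1)

forecast = []
for _ in range(365):
    next_value = predict_next(window)[0, 0]
    forecast.append(next_value)
    # Drop the oldest day and append the prediction as the newest day
    window = np.roll(window, shift=-1, axis=1)
    window[0, -1, 0] = next_value

forecast = np.array(forecast).reshape(-1, 1)
```

After 365 steps the input window consists entirely of predicted values, so any error in an early prediction is fed back into every later one.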