R Keras: Prediction with 1D convolutional network is wrong size?


I am using keras in R to fit neural networks to multivariate time series data, generate predictions on test data (a subset of the original data), and estimate RMSE by comparing the predictions to the real values. This works fine for DNNs, GRUs, and LSTMs. I am now trying to build a CNN-GRU (or CNN-LSTM) with a 1D convolutional layer, after reading some posts (e.g. https://www.kaggle.com/code/davidchilders/time-series-prediction-in-r-keras). After some tinkering I can get this to train just fine. However, predict() gives an unexpected result: the output vector of predictions is only a fraction of the length it should be. For example, if I withhold 1000 time steps for test predictions, the output of predict() on the CNN-GRU has a length of around 300. This happens whenever I use layer_conv_1d(), so it is not about combining it with other layer types. What is going on?
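
Before getting into the full example, here is the smallest standalone check I could put together that shows the same shrinkage; the filter count (8) and feature count (4), and the names m and x, are arbitrary:

library(keras)

#Minimal repro: just a conv + pooling stack, no generators involved
m = keras_model_sequential() %>%
  layer_conv_1d(filters = 8, kernel_size = 2, activation = "relu",
                input_shape = list(NULL, 4)) %>%
  layer_max_pooling_1d(pool_size = 3)

#One "batch" of 833 time steps with 4 features
x = array(rnorm(1 * 833 * 4), dim = c(1, 833, 4))
dim(predict(m, x))  #1 277 8, even though the input has 833 time steps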

Here is some example code which reproduces this:

library(tidyverse)
library(keras)
library(reticulate)

# Function to generate sample data
generate_data = function(n_samples) {
  set.seed(123)
  
  time = seq(1, n_samples)
  covariate1 = rnorm(n_samples, mean = 0, sd = 1)
  covariate2 = rnorm(n_samples, mean = 5, sd = 2)
  covariate3 = rnorm(n_samples, mean = -3, sd = 3)
  target = sin(seq(1, n_samples) * 0.1) + rnorm(n_samples, mean = 0, sd = 0.2)
  
  data = tibble(Time = time, Covariate1 = covariate1, Covariate2 = covariate2, Covariate3 = covariate3, Target = target)
  return(data)
}

# Generate sample data
nsamp = 5000
sample_data = as.matrix(generate_data(nsamp))

I'm using standard generator functions (for example, from the Kaggle post linked above). I've included them here for completeness; apologies for the length. Skip down for the model definition, etc.

generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size, step,
                      predseries) {

  if (is.null(max_index)) max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {

    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      #Wrap back to the start once the end of the range is reached
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size-1, max_index))
      i <<- i + length(rows)
    }

    #One sample per row: a window of lookback/step steps over all columns
    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))

    #One target per sample: the predseries column, delay steps ahead
    targets <- array(0, dim = c(length(rows)))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]] - 1,
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay, predseries]
    }
    list(samples, targets)
  }
}
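
As a sanity check on the generator itself (gen_check and batch are just throwaway names), one batch comes out with the shapes I expect:

#Draw a single batch and inspect its shape
gen_check = generator(sample_data, lookback = 10, delay = 1,
                      min_index = 1, max_index = 1000,
                      batch_size = 20, step = 1, predseries = 1)
batch = gen_check()
dim(batch[[1]])     #20 10 5: (batch, lookback/step, features)
length(batch[[2]])  #20: one target per sample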

#Parameters for generator functions:
#How long of a series to use at a time
lookback = 10
#Use every time point
step = 1
#Number of time steps into the future to predict
delay = 1
#Number of samples per batch
batch_size = 20
predser = 1 #Index of label

#Set variables for training, validation, and testing data sets
#Range of training, validation, and test sets:
min_train = 1
max_train = floor(nsamp*2/3)
min_val = max_train+1
max_val = min_val + floor(0.5*(nsamp-max_train))
min_test = max_val+1
max_test = NULL

#Training, validation, and test steps
train_steps = floor( (max_train - min_train - lookback) / batch_size )
val_steps = floor( (max_val - min_val - lookback) / batch_size )
test_steps = floor( (nrow(sample_data) - max_val - lookback) / batch_size )
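
With nsamp = 5000 these evaluate to:

c(train_steps, val_steps, test_steps)  #166 41 41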

#Training set
train_gen = generator(
  sample_data,
  lookback = lookback,
  delay = delay,
  min_index = min_train,
  max_index = max_train,
  #shuffle = TRUE,
  step = step,
  batch_size = batch_size,
  predseries = predser
)

#Validation set
val_gen = generator(
  sample_data,
  lookback = lookback,
  delay = delay,
  min_index = min_val,
  max_index = max_val,
  step = step,
  batch_size = batch_size,
  predseries = predser
)

#Test set looks at the remaining data
test_gen = generator(
  sample_data,
  lookback = lookback,
  delay = delay,
  min_index = min_test,
  max_index = NULL,
  step = step,
  batch_size = batch_size,
  predseries = predser
)

Here is a simple version of the model with a 1D convolutional layer and a dense layer. I assume this must be where I'm missing something, possibly another layer or transformation of some kind?

build_and_compile_model = function() {
  model = keras_model_sequential() %>%
    layer_conv_1d(
      filters = 64,
      kernel_size = 2,
      activation = "relu",
      input_shape = list(NULL, dim(sample_data)[[-1]])
    ) %>%
    layer_max_pooling_1d(pool_size = 3) %>%
    layer_dense(64, activation = 'relu') %>%
    layer_dense(units = 1)

  model %>% compile(
    loss = 'mean_absolute_error',
    optimizer = optimizer_adam()
  )

  model
}

#Build the model
model1  = build_and_compile_model()
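
For what it's worth, summary(model1) shows the time dimension surviving every layer (output shapes paraphrased from my run):

summary(model1)
#conv1d         (None, None, 64)
#max_pooling1d  (None, None, 64)
#dense          (None, None, 64)
#dense_1        (None, None, 1)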

#Fit the model to training data
model1 %>% fit(
  train_gen,
  steps_per_epoch = train_steps,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps
)

Here is the final prediction step. In this example the test set contains 833 time points, but the vector returned by predict(), test_pred, has only 277 elements.

#Generate predictions from test data
test_tmp = sample_data[min_test:nsamp, ]
#Reshape to a single batch of shape (1, time steps, features)
test_data = array(test_tmp,
                  dim = c(1, dim(test_tmp)[1], dim(test_tmp)[2]))
test_pred = model1 %>% predict(test_data)
length(test_pred)  #277, but the test set has 833 rows

Thanks so much everyone!
