For deep learning models such as CNNs and LSTMs, using them on time series data requires applying a rolling (sliding) window to divide the dataset into segments, which are then fed into the model. My question is: does this also apply to transformers, or can I feed the series in directly without a sliding window?
I want to know whether a transformer requires a sliding window over the dataset or not.
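To make concrete what I mean by a sliding window, here is a minimal sketch in plain Python. The function name `sliding_windows`, the `window` and `stride` parameters, and the toy series are my own illustration, not taken from any particular library; real pipelines would probably do this on arrays or tensors.

```python
def sliding_windows(series, window, stride=1):
    """Split a 1-D sequence into overlapping segments of length `window`.

    Hypothetical helper for illustration only.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # The last valid start index is len(series) - window;
    # if the series is shorter than the window, no segment is produced.
    return [
        list(series[start:start + window])
        for start in range(0, len(series) - window + 1, stride)
    ]


series = [10, 11, 12, 13, 14, 15]  # toy time series
segments = sliding_windows(series, window=3, stride=1)
print(segments)
# [[10, 11, 12], [11, 12, 13], [12, 13, 14], [13, 14, 15]]
```

Each inner list is one segment that would be fed to the CNN or LSTM as a single input sample; what I am asking is whether the same preprocessing step is needed before a transformer.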