I'm new to LSTMs. I have a destination-prediction task for online car-hailing. Each order is one sample in the dataset, and these are the features:
feature_1: GPS trajectory, e.g.
T = {t1, t2, ..., tn}, ti = (longitude_i, latitude_i). The trajectory length differs from order to order. The trajectories were vectorized and normalized in the preprocessing stage, and I then pad them to a common length.
feature_2: Anonymous feature D. 1D non-sequential data. Each order has only a few of these values (e.g. D1, D2, D3, which can be treated as 3 dimensions). They reflect characteristics of the order.
feature_3~x: Other non-sequential features, such as pick-up location and time, and the number of POIs (points of interest) near the pick-up point. Each order has exactly one value for each of these.
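A minimal sketch of the padding step for feature_1, assuming right-padding with zeros and plain Python lists. The names `pad_trajectories` and `pad_value` are hypothetical. It also returns the original lengths, since a model needs them to ignore the padded steps.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), assumed already normalized


def pad_trajectories(trajs: List[List[Point]], pad_value: float = 0.0):
    """Right-pad every trajectory to the longest length in the batch.

    Returns the padded batch and the original lengths.
    """
    lengths = [len(t) for t in trajs]
    max_len = max(lengths)
    padded = [
        list(t) + [(pad_value, pad_value)] * (max_len - len(t))
        for t in trajs
    ]
    return padded, lengths
```

Note that a zero pad value can collide with a real normalized coordinate, which is one reason to keep the lengths instead of relying on the pad value alone.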
The task is to feed in the first 50%, 70%, or 90% of the trajectory and predict the destination. I want to input the trajectory data (feature_1) into a BiLSTM, and then explore how feature_2 affects the experimental results. Features 3~x are the other auxiliary features.
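Taking the first 50%, 70%, or 90% of a trajectory could be sketched as follows. This assumes the cut is made by point count rather than by distance or time, and that it is applied before padding. `trajectory_prefix` is a hypothetical helper name.

```python
def trajectory_prefix(traj, percent):
    """Return the first `percent` percent of the points in `traj`.

    `percent` is an integer such as 50, 70 or 90. Integer arithmetic
    avoids floating-point rounding surprises, and at least one point
    is always kept.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    keep = max(1, len(traj) * percent // 100)
    return traj[:keep]
```

For a trajectory of 10 points, this keeps 5, 7, and 9 points respectively.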
How should I design the BiLSTM model to achieve the goal?
One idea I have seen is to concatenate the BiLSTM output directly with the other features, pass the result through several dense layers, and finally feed it into a softmax classifier. However, that doesn't seem to emphasize the role of feature_2...
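To make the concatenation idea concrete, here is a rough sketch, assuming PyTorch and assuming the destination is encoded as one of `num_classes` discrete classes. The class name `TrajectoryFusionNet`, all layer sizes, and the `use_d` flag are hypothetical. The flag simply drops feature_2 so that two otherwise identical models can be compared; it is one possible way to measure the influence of feature_2, not an established recipe.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class TrajectoryFusionNet(nn.Module):
    """BiLSTM over the trajectory, concatenated with the static features."""

    def __init__(self, d_dim=3, aux_dim=4, hidden=64, num_classes=100, use_d=True):
        super().__init__()
        self.use_d = use_d  # set to False to train without feature_2
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        static_dim = aux_dim + (d_dim if use_d else 0)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden + static_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),  # logits; softmax is applied by the loss
        )

    def forward(self, traj, lengths, feat_d, feat_aux):
        # traj: (batch, max_len, 2) padded; lengths: true length of each order
        packed = pack_padded_sequence(traj, lengths.cpu(), batch_first=True,
                                      enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)
        # h_n: (2, batch, hidden) = final forward and backward hidden states
        seq_repr = torch.cat([h_n[0], h_n[1]], dim=1)
        static = [feat_d, feat_aux] if self.use_d else [feat_aux]
        return self.head(torch.cat([seq_repr] + static, dim=1))
```

Packing the padded batch with the true lengths keeps the padded steps out of the final hidden states. Training the model once with `use_d=True` and once with `use_d=False` on the same split would give a simple with/without comparison for feature_2.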