I created a model with this structure:
# Model
from tensorflow.keras.layers import Input, LSTM, Dense, TimeDistributed
from tensorflow.keras.models import Model

# Define an input sequence and process it with a stack of three encoder LSTMs.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder1 = LSTM(latent_dim, return_sequences=True, return_state=True, recurrent_dropout=0.0)
encoder_outputs1, state_h1, state_c1 = encoder1(encoder_inputs)
encoder_states1 = [state_h1, state_c1]
encoder2 = LSTM(latent_dim, return_sequences=True, return_state=True, recurrent_dropout=0.0)
encoder_outputs2, state_h2, state_c2 = encoder2(encoder_outputs1)
encoder_states2 = [state_h2, state_c2]
encoder3 = LSTM(latent_dim, return_state=True, recurrent_dropout=0.0)
encoder_outputs, state_h3, state_c3 = encoder3(encoder_outputs2)
encoder_states = [state_h3, state_c3]

# Set up the decoder, using `encoder_states` as its initial state.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
# The decoder returns full output sequences and its internal states.
# The states are not used in the training model, but we will
# need them for inference.
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.0)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = TimeDistributed(Dense(num_decoder_tokens, activation='softmax'))
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that turns
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`.
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
So in the end it looks like this:
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, None, 142)] 0 []
lstm (LSTM) [(None, None, 256), 408576 ['input_1[0][0]']
(None, 256),
(None, 256)]
lstm_1 (LSTM) [(None, None, 256), 525312 ['lstm[0][0]']
(None, 256),
(None, 256)]
input_2 (InputLayer) [(None, None, 51)] 0 []
lstm_2 (LSTM) [(None, 256), 525312 ['lstm_1[0][0]']
(None, 256),
(None, 256)]
lstm_3 (LSTM) [(None, None, 256), 315392 ['input_2[0][0]',
(None, 256), 'lstm_2[0][1]',
(None, 256)] 'lstm_2[0][2]']
time_distributed (TimeDist (None, None, 51) 13107 ['lstm_3[0][0]']
ributed)
==================================================================================================
Total params: 1787699 (6.82 MB)
Trainable params: 1787699 (6.82 MB)
Non-trainable params: 0 (0.00 Byte)
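For completeness, training is the usual teacher-forcing setup (a rough sketch; the optimizer, loss, and hyperparameters here are illustrative, not my exact values):

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64, epochs=100, validation_split=0.2)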
Then after training I save it to an .h5 file. Next, in a new notebook, I loaded it and wanted to recreate the sampling models.
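The save/load step itself is just the standard Keras round trip (a minimal sketch; the filename is illustrative):

model.save('s2s_model.h5')
# ... then, in the new notebook:
from tensorflow.keras.models import load_model
model = load_model('s2s_model.h5')

From the loaded model I then rebuild the inference models like this: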
# Define sampling models
encoder_inputs = model.input[0]
encoder_outputs, state_h, state_c = model.layers[4].output  # lstm_2, the last encoder LSTM
encoder_states = [state_h, state_c]
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_inputs = model.input[1]
decoder_lstm = model.layers[5]   # lstm_3
decoder_dense = model.layers[6]  # time_distributed
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
So I get this for the encoder:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, 142)] 0
lstm (LSTM) [(None, None, 256), 408576
(None, 256),
(None, 256)]
lstm_1 (LSTM) [(None, None, 256), 525312
(None, 256),
(None, 256)]
lstm_2 (LSTM) [(None, 256), 525312
(None, 256),
(None, 256)]
=================================================================
Total params: 1459200 (5.57 MB)
Trainable params: 1459200 (5.57 MB)
Non-trainable params: 0 (0.00 Byte)
______________________________________________
and this for the decoder:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, None, 51)] 0 []
input_51 (InputLayer) [(None, 256)] 0 []
input_52 (InputLayer) [(None, 256)] 0 []
lstm_3 (LSTM) [(None, None, 256), 315392 ['input_2[0][0]',
(None, 256), 'input_51[0][0]',
(None, 256)] 'input_52[0][0]']
time_distributed (TimeDist (None, None, 51) 13107 ['lstm_3[8][0]']
ributed)
==================================================================================================
Total params: 328499 (1.25 MB)
Trainable params: 328499 (1.25 MB)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________
and when I want to predict something, I get this error from the encoder:
states_value = encoder_model.predict(item)
ValueError: Exception encountered when calling layer 'model_28' (type Functional).
Layer "lstm" expects 3 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'model_28/Cast:0' shape=(None, 190, 142) dtype=float32>]
Call arguments received by layer 'model_28' (type Functional):
• inputs=tf.Tensor(shape=(None, 190, 142), dtype=int32)
• training=False
• mask=None
What am I doing wrong when recreating the models?
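For reference, the decoding loop I'm calling this from follows the standard Keras character-level seq2seq recipe (a sketch; target_token_index, reverse_target_char_index, and max_decoder_seq_length come from my data preprocessing and are not shown above):

import numpy as np

def decode_sequence(input_seq):
    # Encode the input sequence to get the final encoder states.
    states_value = encoder_model.predict(input_seq)
    # Seed the decoder with the start-of-sequence token.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index['\t']] = 1.0
    decoded_sentence = ''
    while True:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)
        # Greedily pick the most likely next token.
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_target_char_index[sampled_token_index]
        if sampled_char == '\n' or len(decoded_sentence) > max_decoder_seq_length:
            break
        decoded_sentence += sampled_char
        # Feed the sampled token and the updated states back in.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0
        states_value = [h, c]
    return decoded_sentence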