Recreating a seq2seq model in Keras

I created a model with this structure:

# Model
from tensorflow.keras.layers import Input, LSTM, Dense, TimeDistributed
from tensorflow.keras.models import Model

# Define an input sequence and process it.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder1 = LSTM(latent_dim, return_sequences=True, return_state=True, recurrent_dropout=0.0)
encoder_outputs1, state_h1, state_c1 = encoder1(encoder_inputs)
encoder_states1 = [state_h1, state_c1]

encoder2 = LSTM(latent_dim, return_sequences=True, return_state=True, recurrent_dropout=0.0)
encoder_outputs2, state_h2, state_c2 = encoder2(encoder_outputs1)
encoder_states2 = [state_h2, state_c2]

encoder3 = LSTM(latent_dim, return_state=True, recurrent_dropout=0.0)
encoder_outputs, state_h3, state_c3 = encoder3(encoder_outputs2)
encoder_states = [state_h3, state_c3]

# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the
# return states in the training model, but we will use them in inference.
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.0)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)

decoder_dense = TimeDistributed(Dense(num_decoder_tokens, activation='softmax'))
decoder_outputs = decoder_dense(decoder_outputs)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
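
Training follows the usual Keras seq2seq recipe. A rough sketch of the training step (the data array names and hyperparameters here are just placeholders; the arrays are one-hot encoded with shape (num_samples, max_seq_length, num_tokens)):

# Compile and train on one-hot encoded teacher-forcing data.
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64,
          epochs=50,
          validation_split=0.2)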

So in the end the model summary looks like this:

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_1 (InputLayer)        [(None, None, 142)]          0         []                            
                                                                                                  
 lstm (LSTM)                 [(None, None, 256),          408576    ['input_1[0][0]']             
                              (None, 256),                                                        
                              (None, 256)]                                                        
                                                                                                  
 lstm_1 (LSTM)               [(None, None, 256),          525312    ['lstm[0][0]']                
                              (None, 256),                                                        
                              (None, 256)]                                                        
                                                                                                  
 input_2 (InputLayer)        [(None, None, 51)]           0         []                            
                                                                                                  
 lstm_2 (LSTM)               [(None, 256),                525312    ['lstm_1[0][0]']              
                              (None, 256),                                                        
                              (None, 256)]                                                        
                                                                                                  
 lstm_3 (LSTM)               [(None, None, 256),          315392    ['input_2[0][0]',             
                              (None, 256),                           'lstm_2[0][1]',              
                              (None, 256)]                           'lstm_2[0][2]']              
                                                                                                  
 time_distributed (TimeDist  (None, None, 51)             13107     ['lstm_3[0][0]']              
 ributed)                                                                                         
                                                                                                  
==================================================================================================
Total params: 1787699 (6.82 MB)
Trainable params: 1787699 (6.82 MB)
Non-trainable params: 0 (0.00 Byte)

Then, after training, I saved it to an .h5 file and loaded it in a new notebook.
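
The save and load themselves are roughly this (the file name is just a placeholder):

# Save after training
model.save('s2s_model.h5')

# In the new notebook
from tensorflow.keras.models import load_model
model = load_model('s2s_model.h5')

With the model loaded, I wanted to recreate the sampling (inference) models: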

# Define sampling models
encoder_inputs = model.input[0]
encoder_outputs, state_h, state_c = model.layers[4].output   # lstm_2, the last encoder LSTM
encoder_states = [state_h, state_c]
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_inputs = model.input[1]
decoder_lstm = model.layers[5]
decoder_dense = model.layers[6]

decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)

decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
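
These two models are meant to drive the usual greedy sampling loop from the Keras seq2seq example. Roughly what I use (the token-index dictionaries, the '\t'/'\n' start/end markers, num_decoder_tokens and max_decoder_seq_length all come from my preprocessing, so the names here are placeholders):

import numpy as np

def decode_sequence(input_seq):
    # Encode the input sequence into the initial [h, c] state vectors.
    states_value = encoder_model.predict(input_seq)

    # Start the target sequence with the start-of-sequence token.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index['\t']] = 1.0

    decoded_sentence = ''
    while True:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Greedily pick the most likely next token.
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_target_char_index[sampled_token_index]
        decoded_sentence += sampled_char

        # Stop on the end-of-sequence token or when the output gets too long.
        if sampled_char == '\n' or len(decoded_sentence) > max_decoder_seq_length:
            break

        # Feed the sampled token back in and carry the updated states forward.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0
        states_value = [h, c]

    return decoded_sentence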

So I get this for the encoder:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, 142)]       0         
                                                                 
 lstm (LSTM)                 [(None, None, 256),       408576    
                              (None, 256),                       
                              (None, 256)]                       
                                                                 
 lstm_1 (LSTM)               [(None, None, 256),       525312    
                              (None, 256),                       
                              (None, 256)]                       
                                                                 
 lstm_2 (LSTM)               [(None, 256),             525312    
                              (None, 256),                       
                              (None, 256)]                       
                                                                 
=================================================================
Total params: 1459200 (5.57 MB)
Trainable params: 1459200 (5.57 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

and this for the decoder:

__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_2 (InputLayer)        [(None, None, 51)]           0         []                            
                                                                                                  
 input_51 (InputLayer)       [(None, 256)]                0         []                            
                                                                                                  
 input_52 (InputLayer)       [(None, 256)]                0         []                            
                                                                                                  
 lstm_3 (LSTM)               [(None, None, 256),          315392    ['input_2[0][0]',             
                              (None, 256),                           'input_51[0][0]',            
                              (None, 256)]                           'input_52[0][0]']            
                                                                                                  
 time_distributed (TimeDist  (None, None, 51)             13107     ['lstm_3[8][0]']              
 ributed)                                                                                         
                                                                                                  
==================================================================================================
Total params: 328499 (1.25 MB)
Trainable params: 328499 (1.25 MB)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________

And when I want to predict something, the encoder gives this error:

states_value = encoder_model.predict(item)

ValueError: Exception encountered when calling layer 'model_28' (type Functional).
    
    Layer "lstm" expects 3 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'model_28/Cast:0' shape=(None, 190, 142) dtype=float32>]
    
    Call arguments received by layer 'model_28' (type Functional):
      • inputs=tf.Tensor(shape=(None, 190, 142), dtype=int32)
      • training=False
      • mask=None

What am I doing wrong in the model recreation?
