Sentiment Analysis: tokenized data cannot fit in Keras model, Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)

14 Views Asked by user23305107 At 27 January 2024 at 02:01

I have these two DataFrames; below is the output when I print their info():

<class 'pandas.core.frame.DataFrame'>
Index: 3432 entries, 11433 to 559
Data columns (total 3 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   text            3432 non-null   object
 1   input_ids       3432 non-null   object
 2   attention_mask  3432 non-null   object
dtypes: object(3)
memory usage: 107.2+ KB
None

<class 'pandas.core.frame.DataFrame'>
Index: 3432 entries, 11433 to 559
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   labels  3432 non-null   int64
dtypes: int64(1)
memory usage: 53.6 KB
None

Then I split them with train_test_split:

X_train, X_test, y_train, y_test = train_test_split(X_resampled_df, y_resampled_df, test_size=0.2)

I only want to feed input_ids into the model:

X_train = X_train['input_ids']
X_test = X_test['input_ids']

This is my trial model:

from tensorflow.keras import Sequential, layers
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(layers.Embedding(2000, 20))  # embedding layer: input_dim=2000, 20-dimensional vectors
model.add(layers.LSTM(15, dropout=0.5))  # LSTM layer with 15 units
model.add(layers.Dense(6, activation='softmax'))  # 6-way softmax output
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
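A side note on the compile step, separate from the reported error: 'categorical_crossentropy' expects one-hot targets, while the labels column shown earlier is a single int64 column. The sketch below shows one way to one-hot encode integer labels with NumPy. It assumes the labels are class ids in the range 0 to 5 (the post does not say so), and `y_int` holds made-up values for illustration only.

```python
import numpy as np

num_classes = 6  # matches Dense(6, activation='softmax') in the post

# Made-up integer labels standing in for y_train['labels'].to_numpy();
# the assumption is that real labels are class ids in 0..num_classes-1.
y_int = np.array([0, 3, 5, 1], dtype=np.int64)

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array yields one one-hot row per sample.
y_one_hot = np.eye(num_classes, dtype=np.float32)[y_int]

print(y_one_hot.shape)  # (4, 6)
```

An alternative that avoids the conversion is to keep the integer labels and compile with loss='sparse_categorical_crossentropy', which accepts class ids directly.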

I fit it with model.fit():

model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

The error occurs, referring to X_train and X_test:

Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Here are the details when I print X_train.to_numpy():

[array([  101,  1045,  2514,  2004,  2065,  1996,  4177,  1997,  3032,
         2079,  2025, 17120,  1996,  2111,  1997,  2037,  3032,  2138,
         2005,  1996,  2293,  1997,  2643,  1045,  3246,  2053,  2028,
         2245,  2012,  2035,  1045,  2001,  1999,  2151,  2126, 16408,
         2030,  2066,  2577,  1059,   102,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0], dtype=int32)
 array([  101,  1045,  2572,  2074,  2785,  1997,  2187,  3110, 16021,
        29150,  1998, 15491,  1999,  2026,  2219,  3096,   102,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0], dtype=int32)
 array([  101,  1045,  2031,  2069,  2579,  2093,  9372,  7171,  2061,
         2521,  1998,  2428,  1045,  2031,  2042,  3110,  2026,  2126,
         2007,  1037,  2200,  4326,  4950,  1037,  2422, 22828,  1998,
         1996,  2146,  6404,  2245,  6194,  1997,  4030,  5855,   102,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0], dtype=int32)               ...
 array([  101,  1045,  2514,  2061,  8239, 22614,  2035,  1996,  3513,
         1998,  2049,  2061,  3483,  2098,  2066,  2065,  2465,  4627,
...
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0], dtype=int32)

and the dtype of X_train.to_numpy() is object.
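The printout above is a 1-D object array whose elements are themselves int32 arrays, which is the shape of data that the "Unsupported object type numpy.ndarray" message complains about: a tensor needs a rectangular numeric array. A minimal sketch for inspecting such data follows. Here `X_obj` is a hypothetical stand-in for X_train.to_numpy(), built from a few shortened, illustrative rows rather than the real data.

```python
import numpy as np

# Illustrative rows of different lengths (shortened; not the real data).
rows = [
    np.array([101, 1045, 2514, 102, 0, 0], dtype=np.int32),
    np.array([101, 1045, 2572, 102], dtype=np.int32),
    np.array([101, 1045, 2031, 2069, 102], dtype=np.int32),
]

# Build a 1-D object array holding one array per element,
# mimicking what a Series of arrays turns into.
X_obj = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    X_obj[i] = row

print(X_obj.dtype)  # object
print(X_obj.shape)  # (3,)

# Distinct row lengths: more than one value means the rows are ragged
# and cannot be stacked into a 2-D array without padding or truncation.
lengths = sorted({len(row) for row in X_obj})
print(lengths)  # [4, 5, 6]
```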

Can anyone please help with this? I believe it is a format problem, but I could not find a solution after spending half a day on it. Thanks!

I am hoping for a solution. I tried np.stack, but it cannot be used because the sentences are not all the same length. I tried changing the object dtype with astype(), but Python does not allow it. I also tried wrapping it in a NumPy array, and that did not work either.
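Since np.stack fails on rows of unequal length, one possible approach is to pad (or truncate) every row to a common length first and only then build a 2-D int32 array. The sketch below is one way to do that with plain NumPy. `pad_to_matrix` is a hypothetical helper written for this illustration, not a library function, and the toy rows are shortened stand-ins for the real input_ids.

```python
import numpy as np

def pad_to_matrix(sequences, max_len=None, pad_value=0):
    """Right-pad (or truncate) 1-D integer sequences into a 2-D int32 array."""
    sequences = [np.asarray(seq, dtype=np.int32) for seq in sequences]
    if max_len is None:
        # Default to the longest sequence so nothing is truncated.
        max_len = max(len(seq) for seq in sequences)
    out = np.full((len(sequences), max_len), pad_value, dtype=np.int32)
    for i, seq in enumerate(sequences):
        trimmed = seq[:max_len]
        out[i, :len(trimmed)] = trimmed
    return out

# Toy rows of different lengths (illustrative values only).
toy_rows = [
    np.array([101, 1045, 2514, 102, 0, 0], dtype=np.int32),
    np.array([101, 1045, 2572, 102], dtype=np.int32),
]

X_dense = pad_to_matrix(toy_rows)
print(X_dense.shape, X_dense.dtype)  # (2, 6) int32

# The Embedding layer's input_dim must exceed the largest token id.
min_input_dim = int(X_dense.max()) + 1
print(min_input_dim)  # 2573
```

With the data from the post, this would presumably be called as pad_to_matrix(X_train.to_numpy(), max_len=N) and again for X_test with the same N, so both arrays share one width. Keras's own pad_sequences utility does a similar job, though it pads on the left by default. Note also that the printed ids go as high as 29150, which is larger than the 2000 passed to Embedding, so that argument would likely need to be raised as well.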
