My goal is to build an LSTM autoencoder with tensorflow.keras. I want to use the Kaggle GPUs/TPUs, but even though my account is verified and I select an accelerator, training runs on the CPU and is super slow.
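As a first sanity check (a minimal snippet, nothing Kaggle-specific), this is how I list the accelerator devices TensorFlow can actually see:

```python
import tensorflow as tf

# List the logical devices TensorFlow can see. On a Kaggle TPU session
# this should include TPU entries (after TPU init); on a GPU session,
# GPU entries; otherwise only CPU.
tpus = tf.config.list_logical_devices('TPU')
gpus = tf.config.list_logical_devices('GPU')
cpus = tf.config.list_logical_devices('CPU')
print("TPUs:", tpus)
print("GPUs:", gpus)
print("CPUs:", cpus)
```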
Following the Kaggle documentation on Keras, I put together the implementation below to train the model, but it has been unsuccessful.
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed
from sklearn.model_selection import train_test_split

data = np.load("/kaggle/working/dataStack.npy")

# detect and init the TPU
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.tpu.experimental.initialize_tpu_system(tpu)
# instantiate a distribution strategy
tpu_strategy = tf.distribute.TPUStrategy(tpu)

with tpu_strategy.scope():
    # use 'maxlen' (defined earlier in my notebook) as the number of time steps
    n_steps = maxlen
    n_features = 3  # position, torque, thrust
    # Define the model
    model = Sequential()
    # Input layer
    model.add(Input(shape=(n_steps, n_features)))
    # LSTM layer
    model.add(LSTM(128, activation='relu'))
    # Encoder layer
    model.add(Dense(64, activation='relu', kernel_initializer='he_uniform'))
    # Repeater layer to prepare the encoder output for decoding
    model.add(RepeatVector(n_steps))
    # Decoder layer
    model.add(LSTM(128, activation='relu', return_sequences=True))
    # Output layer
    model.add(TimeDistributed(Dense(n_features)))
    model.compile(optimizer='adam', loss='mse', steps_per_execution=32)

# autoencoder: the input is also the target
X = data
y = data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

BATCH_SIZE = 16 * tpu_strategy.num_replicas_in_sync
model.fit(X_train, y_train, epochs=10, batch_size=BATCH_SIZE, validation_data=(X_test, y_test))
When the model is fitting, it runs on the CPU. This is the output:
INFO:tensorflow:Deallocate tpu buffers before initializing tpu system.
INFO:tensorflow:Initializing the TPU system: local
2024-03-26 08:32:59.510852: E external/local_xla/xla/stream_executor/stream_executor_internal.h:177] SetPriority unimplemented for this stream.
....(continues)
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1711441984.528757 13 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
Epoch 1/10
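To confirm where the ops actually execute, I also tried enabling device-placement logging (a minimal sketch: it just multiplies two small matrices, and TensorFlow then logs which device each op was assigned to):

```python
import tensorflow as tf

# Log each op's assigned device; this shows whether work
# lands on CPU:0 or on a TPU/GPU device.
tf.debugging.set_log_device_placement(True)

a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0], [4.0]])
c = tf.matmul(a, b)  # [[1*3 + 2*4]] = [[11.0]]
print(c)
```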
I would appreciate any help, since I am new to Kaggle and TensorFlow.