I'm trying to increase the accuracy of my model. The model leverages BERT's contextual understanding of language to perform binary sentiment classification on IMDb movie reviews. By fine-tuning specific layers of the pre-trained BERT model and integrating it into a Keras neural network, the goal is to classify the sentiment polarity (positive/negative) of each review with high accuracy.
Below is the code:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model
from transformers import BertTokenizer, TFBertModel

# Load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
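# (X_train and X_test each hold 25,000 reviews as lists of word indices, all below
#  5000 after the num_words cap; y_train and y_test are the matching 0/1 labels.)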
# Truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# Convert integer sequences back to text using the IMDb word index
word_index = imdb.get_word_index()
index_to_word = {index + 3: word for word, index in word_index.items()}  # load_data offsets indices by 3
X_train_texts = [' '.join(index_to_word.get(i, '') for i in x if i > 2) for x in X_train]
X_test_texts = [' '.join(index_to_word.get(i, '') for i in x if i > 2) for x in X_test]
# Tokenize the input sequences
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Truncate sequences to fit within BERT's maximum sequence length
max_sequence_length = 512
X_train = [tokenizer.encode(text, add_special_tokens=True, max_length=max_sequence_length, truncation=True) for text in X_train_texts]
X_test = [tokenizer.encode(text, add_special_tokens=True, max_length=max_sequence_length, truncation=True) for text in X_test_texts]
# Pad the tokenized sequences
X_train = sequence.pad_sequences(X_train, maxlen=max_sequence_length, padding='post', truncating='post')
X_test = sequence.pad_sequences(X_test, maxlen=max_sequence_length, padding='post', truncating='post')
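# --- Optional aside (an assumption on my part, not used by the rest of this script):
# the tokenizer can also return an attention mask in a single call, which tells BERT
# which positions are real tokens and which are padding.
train_encodings = tokenizer(X_train_texts, padding='max_length', truncation=True,
                            max_length=max_sequence_length, return_tensors='np')
train_input_ids = train_encodings['input_ids']            # shape: (num_reviews, 512)
train_attention_mask = train_encodings['attention_mask']  # 1 = real token, 0 = padding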
# Load pre-trained BERT model
bert_model = TFBertModel.from_pretrained('bert-base-uncased')
# Define input layer
input_layer = Input(shape=(max_sequence_length,), dtype='int32')
# BERT layer: [0] selects the last hidden state, shape (batch_size, max_sequence_length, 768)
bert_output = bert_model(input_layer)[0]
# Flatten layer
flatten_layer = Flatten()(bert_output)
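# (Note: flattening the (512, 768) hidden-state tensor yields 512 * 768 = 393,216
#  features feeding the single-unit output layer below.)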
# Output layer
output_layer = Dense(1, activation='sigmoid')(flatten_layer)
# Create model
model = Model(inputs=input_layer, outputs=output_layer)
# Make only the pooler layers of BERT trainable and freeze everything else
for layer in bert_model.layers:
    if layer.name.startswith('pooler'):
        layer.trainable = True
    else:
        layer.trainable = False
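# (Assumption: for TFBertModel, bert_model.layers usually exposes a single top-level
#  layer named 'bert', so the 'pooler' check above may never match and the whole
#  encoder stays frozen; the pooler sub-layer itself sits inside it, e.g. bert_model.bert.pooler.)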
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
# Train the model
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("BERT-based Model Accuracy: %.2f%%" % (scores[1] * 100))
Using the above code, I am getting this accuracy with 3 epochs:
Accuracy Screenshot with 3 Epochs
When I tried using 10 epochs, I didn't see any great difference, and the run took almost 6 to 8 hours on a Kaggle notebook using a T4 x2 GPU. It stopped before finishing, either from insufficient memory or possibly due to inactivity. I'm attaching the image as well so you can get an idea.
Screenshot of the failed attempt using 10 epochs.
Kindly share some suggestions and solutions to improve the accuracy of this model.