Error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (896x28 and 784x512)
Background: This is an image classification problem with training data of shape (60000, 28, 28). The labels have been encoded to shape (60000, 10). I have created the following Dataset class:
import torch
from torch.utils.data import DataLoader, TensorDataset, Dataset

class image_Data_Class(Dataset):
    def __init__(self, x_train, y_train):
        super().__init__()
        self.x = torch.from_numpy(x_train)
        self.y = torch.from_numpy(y_train)
        self.y = self.y.type(torch.LongTensor)
        self.len = self.x.shape[0]

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return self.len

## Create a Data Loader for use in the training of the model
BATCH_SIZE = 32 ## as a start for modelling purposes
img_data = image_Data_Class(x_train, y_train)
train_loader = DataLoader(img_data, batch_size = BATCH_SIZE, shuffle = True)
When I run:
print(f"x_train is : {img_data.x.shape}, y_train is : {img_data.y.shape}")
The output is: x_train is : torch.Size([60000, 28, 28]), y_train is : torch.Size([60000, 10])
Issue: When I use the loader in the following training loop:
NUM_EPOCHS = 3 # to start with
losses = []
for epoch in range(NUM_EPOCHS):
    for x, y in train_loader:
        ## Initialise gradients
        optimiser.zero_grad()
        # forward pass
        y_pred = model(x)
        # compute the loss
        loss = loss_func(y_pred, y)
        ## backward pass
        loss.backward()
        ## update weights
        optimiser.step()
    print(loss)
    losses.append(float(loss.data.detach().numpy()))
I get the following error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (896x28 and 784x512)
The arithmetic suggests that the batch size (32) is being multiplied by the first image dimension (28) to give 896. Is this a coincidence?
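To make that arithmetic concrete, here is a minimal sketch. It assumes a hypothetical zero-filled tensor standing in for one batch from train_loader, and an nn.Linear(784, 512) layer, which is what the 784x512 in the error message suggests for lin1. As far as I understand, nn.Linear only multiplies along the last dimension and treats every leading dimension as rows:

```python
import torch
from torch import nn

# Hypothetical stand-in for one batch from train_loader: 32 images of 28x28.
batch = torch.zeros(32, 28, 28)

# nn.Linear multiplies along the last dimension only, so the leading
# dimensions (batch and image rows) are folded together into matrix rows.
rows = batch.shape[0] * batch.shape[1]   # 32 * 28
cols = batch.shape[2]                    # 28
print(f"mat1 is effectively {rows}x{cols}")   # mat1 is effectively 896x28

# Assumed layer sizes, taken from the 784x512 in the error message.
lin1 = nn.Linear(784, 512)
try:
    lin1(batch)
    error_message = ""
except RuntimeError as err:
    error_message = str(err)
print(error_message)
```

If this reading is right, the 896 is not a coincidence: the layer sees 896 rows of 28 features each, while it expects 784 features per row.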
I have tried:
Using view() and squeeze() in the model, but this results in either an index error or a very large input dimension.
The model class is below for reference.
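For reference, a small sketch of how different view() calls reshape a batch. The zero-filled tensor is again a hypothetical stand-in for one batch from train_loader, and the exact calls I tried may have differed. It suggests the very large input dimension could come from flattening the batch dimension together with the image:

```python
import torch

# Hypothetical stand-in for one batch from train_loader: 32 images of 28x28.
batch = torch.zeros(32, 28, 28)

# Flattening everything, batch dimension included, gives one long vector.
flat_all = batch.view(-1)
print(flat_all.shape)        # torch.Size([25088])

# Keeping dimension 0 and flattening the rest gives one row per image.
flat_per_image = batch.view(batch.size(0), -1)
print(flat_per_image.shape)  # torch.Size([32, 784])

# torch.flatten with start_dim=1 is another way to write the same reshape.
same = torch.flatten(batch, start_dim=1)
print(same.shape)            # torch.Size([32, 784])
```

Only the last two forms produce the 784 features per sample that a layer with 784 inputs would expect.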
from torch import nn, optim
import torch.nn.functional as F

class nn_Multiclass_Model(nn.Module):
    #lin1, lin2,
    def __init__(self, NUM_FEATURES, NUM_CLASSES, NUM_HIDDEN_FEATURES):
        super().__init__()
        self.lin1 = nn.Linear(NUM_FEATURES, NUM_HIDDEN_FEATURES)
        self.lin2 = nn.Linear(NUM_HIDDEN_FEATURES, NUM_CLASSES)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.lin1(x)
        x = F.relu(x)
        x = self.lin2(x)
        x = self.softmax(x)
        return x

I'm sure it's something elementary, but I haven't been able to find it after looking for some time. Thanks.