I am using facenet to extract facial features from a set of frames in a video. Below is the code:
import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

resultlist = []

def ImgFeatures(image_path):
    mtcnn = MTCNN(keep_all=True).eval()
    model = InceptionResnetV1(pretrained='vggface2').eval()
    image = Image.open(image_path)
    cropped = mtcnn(image)            # tensor of all faces detected in the frame
    result = model(cropped).detach()  # one 512-d embedding per detected face
    return result

# called once per frame path (the loop over frame paths is omitted here)
resultlist.append(ImgFeatures(image_path))
torch.stack(resultlist)
I stack the output using torch.stack(resultlist). This raised the following error, showing that a few entries have shape [2, 512] instead of [1, 512]:
Traceback (most recent call last):
  File "/home/local/ASUAD/pgouripe/Workspace/MEnMaE/MEGNN/facenet_check.py", line 37, in <module>
    result = torch.stack(resultlist)
RuntimeError: stack expects each tensor to be equal size, but got [1, 512] at entry 0 and [2, 512] at entry 54
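For reference, a minimal diagnostic along these lines (hypothetical, not part of my original script) would locate the offending entries, which presumably correspond to frames where MTCNN found more than one face:

# Hypothetical check (not in my original script): report every entry whose
# first dimension is not 1, i.e. frames that produced multiple embeddings.
for i, t in enumerate(resultlist):
    if t.shape[0] != 1:
        print(f"entry {i}: shape {tuple(t.shape)}")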
I also noticed the following warning when running facenet:
I0000 00:00:1711242913.736970 1889714 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1711242913.754006 1892985 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 NVIDIA 535.161.07), renderer: NVIDIA GeForce GTX 1080 Ti/PCIe/SSE2
I am unable to figure out the cause of the issue; can someone help with more clarity?
I have searched for this specific issue and did not find any references.