output of pretrained facenet model is [2, 512] instead of [1, 512]


I am using facenet to extract facial features from a set of frames in a video. Below is the code:

import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

resultlist = []

def ImgFeatures(image_path):
    # keep_all=True makes MTCNN return every face it detects in the image
    mtcnn = MTCNN(keep_all=True).eval()
    model = InceptionResnetV1(pretrained='vggface2').eval()
    image = Image.open(image_path)
    cropped = mtcnn(image)            # [num_faces, 3, 160, 160]
    result = model(cropped).detach()  # [num_faces, 512]
    return result

resultlist.append(ImgFeatures(image_path))
torch.stack(resultlist)

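For context, the function above is called once per frame extracted from the video, roughly like this (a sketch; frames_dir and the filename pattern are placeholders from my setup):

from pathlib import Path

frames_dir = Path("frames")  # hypothetical directory of extracted frames
resultlist = [ImgFeatures(p) for p in sorted(frames_dir.glob("*.jpg"))]
result = torch.stack(resultlist)  # this is the call that fails
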
I stack the outputs using torch.stack(resultlist). This gave the following error, showing that a few entries have dimension [2, 512] instead of [1, 512]:

Traceback (most recent call last):
  File "/home/local/ASUAD/pgouripe/Workspace/MEnMaE/MEGNN/facenet_check.py", line 37, in <module>
    result = torch.stack(resultlist)
RuntimeError: stack expects each tensor to be equal size, but got [1, 512] at entry 0 and [2, 512] at entry 54
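
To narrow it down, here is how I checked which entries have an unexpected first dimension (a minimal sketch over resultlist from the code above):

# Flag entries whose embedding tensor has more than one row
for i, emb in enumerate(resultlist):
    if emb.shape[0] != 1:
        print(f"entry {i}: shape {tuple(emb.shape)}")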

I also noticed the following log output when running facenet:

I0000 00:00:1711242913.736970 1889714 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1711242913.754006 1892985 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 NVIDIA 535.161.07), renderer: NVIDIA GeForce GTX 1080 Ti/PCIe/SSE2

I am unable to figure out the cause of the issue. Can someone help clarify this?

I have searched for this specific issue and did not find any references.
