I don't have a background in computer science or mathematics. I'm studying with Python and Jupyter notebooks.
Versions: installed Python version: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)]; Jupyter 6.5.2
I am working on a dataset of 1046 images, ranging from 0.015 MB to 4.147 MB.
I'm trying to extract features (descriptors, ...) using SIFT from OpenCV (version 4.7.0).
I don't understand what's going on: since yesterday, I have lost around 10 GB of disk space.
I've never seen that.
Does anyone have an idea of what could explain this, please?
Sincerely yours, David
This code doesn't work well yet; I'm still working on it. Maybe something in it is linked to the loss of available disk space. I don't know...
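One way to check whether a given step is what consumes disk space would be to measure the free space before and after running it. The following is only a minimal sketch: the helper name `free_gb` is hypothetical, and it assumes the drive of interest is the one holding the current working directory.

```python
import shutil

def free_gb(path="."):
    """Return the free space, in GB (10**9 bytes), on the drive that holds `path`."""
    # shutil.disk_usage returns a named tuple (total, used, free) in bytes
    return shutil.disk_usage(path).free / 1e9

before = free_gb()
# ... run the step under suspicion here (hypothetical placeholder) ...
after = free_gb()
print(f"Free space changed by {after - before:+.2f} GB")
```

A clearly negative value after a run would point at that step; a value near zero would suggest looking elsewhere.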
import cv2
import os
import time
import numpy as np
import pandas as pd
def test_descripteur_v4(df = dataFull, var1 = 'image_path', var2 = 'clean_product_name',
                        var3 = 'clean_product_category_tree_L_base', feat_num = 8000):
    sift_keypoints = []   # one descriptor array per image
    list_products = []    # [name, category] for each image that was kept
    sift = cv2.SIFT_create(feat_num)
    # CLAHE (contrast-limited adaptive histogram equalization), created once and reused
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    # Note: setting this after cv2 and numpy are imported may have no effect
    os.environ["OMP_NUM_THREADS"] = '5'
    start = time.time()
    for image, name, cat in zip(df[var1], df[var2], df[var3]):
        gray_image = cv2.imread(image, cv2.IMREAD_GRAYSCALE)  # load directly as grayscale
        if gray_image is None:  # cv2.imread returns None when the file cannot be read
            continue
        cl1 = clahe.apply(gray_image)
        # adaptive Gaussian threshold
        cl1_gauss = cv2.adaptiveThreshold(cl1, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 35, 2)
        # contour detection
        contours, hierarchy = cv2.findContours(cl1_gauss, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        # drawContours draws in place, so cl1_gauss itself is modified here
        cv2.drawContours(cl1_gauss, contours, 0, (0,255,0), 3)
        kp, des = sift.detectAndCompute(cl1_gauss, None)
        if des is None:  # no keypoint was found in this image
            continue
        sift_keypoints.append(des)
        list_products.append([name, cat])
    # per-image descriptors (object array) and all descriptors stacked together
    sift_keypoints_by_img = np.asarray(sift_keypoints, dtype=object)
    sift_keypoints_all = np.concatenate(sift_keypoints, axis=0)
    print("Total number of descriptors: ", sift_keypoints_all.shape)
    # shape of each descriptor array: (number of descriptors, vector length)
    descriptors = [d.shape for d in sift_keypoints]
    descriptor_df = pd.DataFrame(descriptors, columns = ['descripteurs', 'vector_length'])
    names_df = pd.DataFrame(list_products, columns = ['product_name', 'cat'])
    descriptorNames = pd.concat([descriptor_df, names_df], axis=1).sort_values(by='descripteurs', ascending = False)
    end = time.time() - start
    print(f'SIFT descriptor processing time: {end:15.2f} seconds')
    return sift_keypoints_by_img, sift_keypoints_all, descriptorNames
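To get a feeling for how much data this function can hold in memory, here is a rough back-of-the-envelope estimate. It assumes that every one of the 1046 images yields about `feat_num = 8000` descriptors (real images may give fewer), and it relies on OpenCV returning SIFT descriptors as 128-value `float32` vectors. The variable names are only illustrative.

```python
import numpy as np

n_images = 1046          # size of the dataset
feat_num = 8000          # requested number of features per image
descriptor_dim = 128     # length of one SIFT descriptor vector
bytes_per_value = np.dtype(np.float32).itemsize  # 4 bytes per float32 value

# Approximate upper bound if every image reaches feat_num descriptors
max_bytes = n_images * feat_num * descriptor_dim * bytes_per_value
print(f"Approximate upper bound: {max_bytes / 1e9:.2f} GB")
```

Under these assumptions the descriptors alone could take around 4.28 GB, and `np.concatenate` builds a second copy of the same data. If the machine runs short of RAM, the operating system may swap to disk, which is one possible (unconfirmed) link to the lost disk space.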