I am trying to convert generated images into a dataset.
(All I have are PNG images in n folders; there are no labels or metadata.)
This is what I aspire to do:
I am using torchaudio to convert audio files to Mel spectrograms and save the images in PNG format. Status: done
Now I have n folders (classes) of images, so I am wondering whether I can convert the newly generated images into data and target arrays, as in standard datasets, so that I can use sklearn.model_selection.train_test_split to do the train/test split. Status: not done
E.g., fetching the MNIST dataset:
import sklearn.datasets

ds_mnist = sklearn.datasets.fetch_openml(
    data_id=554,
    as_frame=False
)
Split data and target into X and y:
dataset_X = ds_mnist.data.astype('float32')
dataset_y = ds_mnist.target.astype('int64')
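For the folder case, a rough sketch of what I have in mind is below. It assumes each sub-folder name can serve as the class label, that Pillow and NumPy are installed, and that resizing every image to one fixed grayscale size is acceptable. The names load_image_folders and load_and_split, and the default 64x64 size, are hypothetical choices for illustration, not part of any library.

```python
from pathlib import Path

import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split


def load_image_folders(root, size=(64, 64)):
    """Read root/<class_name>/*.png into (X, y, class_names).

    Assumption: every sub-folder of `root` is one class, and its
    position in the sorted folder list becomes the integer label.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    data, target = [], []
    for label, name in enumerate(class_names):
        for png in sorted((root / name).glob("*.png")):
            with Image.open(png) as img:
                # Grayscale + fixed size so every row has the same length.
                gray = img.convert("L").resize(size)
                data.append(np.asarray(gray, dtype="float32").ravel())
            target.append(label)
    X = np.stack(data)                     # shape: (n_samples, width * height)
    y = np.asarray(target, dtype="int64")  # shape: (n_samples,)
    return X, y, class_names


def load_and_split(root, test_size=0.25, seed=0):
    """Load the folders, then do a stratified train/test split."""
    X, y, class_names = load_image_folders(root)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    return X_train, X_test, y_train, y_test, class_names
```

Calling load_and_split on the root folder that holds the n class folders would then give arrays shaped like dataset_X and dataset_y above. Flattening each image to one row mirrors the MNIST layout; a model that expects 2-D input would need the rows reshaped back to the chosen size.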