How do I make a multimodal dataset of image and general tabular data for mobile malware?


I want to build a mobile malware classifier that uses both malware image samples and general tabular data on different malware categories.

I read some research papers on multimodal learning, but they already had a preprocessed multimodal dataset to work with, which I don't.

I have 3000 Android malware image samples and the CIC-AndMal-2020 dataset. I need to create a custom multimodal dataset from them, but I'm unsure how to do it.

Should I create a CSV file with a column that links each row to its image file in a separate folder, and then merge this CSV file with the CSV files of CIC-AndMal-2020?
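To make the idea concrete, here is a minimal sketch of what such a linking CSV could look like in pandas. It assumes that each image and each tabular row share an identifier; the `sample_id` column, the `<sample_id>.png` file naming, and the `n_permissions` and `label` columns are all hypothetical and not taken from CIC-AndMal-2020. If the images and the tabular rows do not describe the same apps, a per-sample join like this is not possible.

```python
import tempfile
from pathlib import Path

import pandas as pd


def build_manifest(image_dir, tabular, key="sample_id"):
    """Pair each image file with the tabular row that shares its key.

    Assumption (hypothetical): every image is named <key>.png and
    `tabular` has a column named `key` holding the same identifier.
    """
    paths = sorted(Path(image_dir).glob("*.png"))
    images = pd.DataFrame(
        {
            key: [p.stem for p in paths],
            "image_path": [str(p) for p in paths],
        }
    )
    # The inner join keeps only samples that exist in both modalities.
    return images.merge(tabular, on=key, how="inner", validate="one_to_one")


# Tiny stand-in data: three empty image files and three tabular rows.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a1", "b2", "c3"]:
        (Path(tmp) / f"{name}.png").touch()

    tabular = pd.DataFrame(
        {
            "sample_id": ["a1", "b2", "zz"],  # "zz" has no matching image
            "n_permissions": [12, 30, 7],     # made-up feature column
            "label": ["Adware", "Ransomware", "Adware"],
        }
    )
    manifest = build_manifest(tmp, tabular)

print(manifest[["sample_id", "n_permissions", "label"]])
```

The resulting frame could then be saved with `manifest.to_csv(...)` and used as the single index for a data loader that reads the image from `image_path` and the tabular features from the remaining columns.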

Any suggestions would be appreciated.

I tried multimodal feature fusion, which I think just multiplied features that had nothing in common, and I tried to use the train and test data from only one of the two modalities; the model was overfitting. So I think I need to create a custom dataset, but how?
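For reference, this is a rough sketch of the kind of fusion I think I should be aiming for instead: concatenating per-sample feature vectors rather than multiplying them, and splitting once on sample indices so that both modalities use the same train and test samples. The feature matrices are random placeholders, and the sizes are arbitrary assumptions; it only works if row i in both matrices describes the same sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10

# Placeholders: in practice these would be image embeddings and
# preprocessed tabular columns, aligned row by row on the same samples.
image_features = rng.normal(size=(n_samples, 8))
tabular_features = rng.normal(size=(n_samples, 5))

# Fusion by concatenation: each sample gets one combined feature vector.
fused = np.concatenate([image_features, tabular_features], axis=1)

# Split once on sample indices so both modalities share the same split.
indices = rng.permutation(n_samples)
n_test = 2
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train, X_test = fused[train_idx], fused[test_idx]
print(X_train.shape, X_test.shape)
```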

