I have a folder of 800 .csv files, each with 3 columns and 2048 rows. I'm trying to use ML.NET to read the first column of each .csv file, transpose it into a row, and append each row to a DataFrame. In the end I expect a DataFrame of shape (800 x 2048), one row per file.
I can do everything except transpose the columns and append rows to a dataframe. Can't find anything of use in the documentation either, unless I missed something vital.
    var dataFrame1 = new DataFrame();
    for (int i = 0; i < files.Length; i++)
    {
        // Load one file and keep only the rows in [minrange, maxrange).
        IDataView data = mlContext.Data.LoadFromTextFile<CsvData>(files[i], separatorChar: ',', hasHeader: true);
        data = mlContext.Data.SkipRows(data, minrange);
        data = mlContext.Data.TakeRows(data, maxrange - minrange);

        // Keep the two columns of interest and min-max normalize "PE".
        var pipeline = mlContext.Transforms.SelectColumns(new string[] { "PE", "FPos" })
            .Append(mlContext.Transforms.NormalizeMinMax("PE", fixZero: false));
        var preview = pipeline.Fit(data).Transform(data);

        // Pull each column out as a sequence of floats and wrap it in a DataFrame column.
        var peColumn = preview.GetColumn<float>("PE");
        var fPosColumn = preview.GetColumn<float>("FPos");
        var testDfPE = new SingleDataFrameColumn("PE", peColumn);
        var testDfFPOS = new SingleDataFrameColumn("FPos", fPosColumn);
        ...
    }
I tried extracting each column into testDfPE and testDfFPOS and then appending them to dataFrame1, but that appends them column-wise, not row-wise. I can't figure out an efficient way to do this, especially given the large number of .csv files. For additional context, I'm eventually going to feed this DataFrame to a pretrained model in .onnx format.
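To make the target shape concrete, here is a minimal sketch of what I think the row-wise result would have to look like. Since a DataFrame stores data by column, it allocates one column per sample position and writes each file's values into row i. It assumes the Microsoft.Data.Analysis package; the helper name `RowWiseFrame.FromSeries` and the column names `PE_0`, `PE_1`, ... are hypothetical, made up only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Data.Analysis;

public static class RowWiseFrame
{
    // Hypothetical helper: each inner array holds one file's "PE" values.
    // Returns a DataFrame with one row per file and one column per sample position.
    public static DataFrame FromSeries(IReadOnlyList<float[]> series)
    {
        if (series.Count == 0)
            return new DataFrame();

        int width = series[0].Length;

        // One preallocated column per sample position, each with one slot per file.
        var columns = new SingleDataFrameColumn[width];
        for (int j = 0; j < width; j++)
            columns[j] = new SingleDataFrameColumn($"PE_{j}", series.Count);

        for (int i = 0; i < series.Count; i++)
        {
            if (series[i].Length != width)
                throw new ArgumentException($"Series {i} has {series[i].Length} values, expected {width}.");

            // Writing series i across all columns at row index i is the "transpose".
            for (int j = 0; j < width; j++)
                columns[j][i] = series[i][j];
        }

        return new DataFrame(columns);
    }
}
```

Inside the loop above, this would presumably mean collecting `peColumn.ToArray()` (with `using System.Linq;`) into a `List<float[]>` and calling `RowWiseFrame.FromSeries(list)` once after the loop, rather than appending to `dataFrame1` on every iteration. The resulting width would be the number of rows kept per file, so `maxrange - minrange` at most rather than necessarily 2048.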