I am reading in thousands of files. Each file has 5 rows of information, then a header row, then the data. Each file contains a different number of columns with different column names and I need to record the column names in order to know how to process it. Unfortunately when I read in the file with this:
temp.data <- read.csv(temp[i],header=TRUE,sep=",", stringsAsFactors = FALSE, skip= 5)
I get the error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed
The reason this happens is that the number of delimiters in the header row is one less than the number of delimiters in the data rows. I know this because if you count the fields - you can see the header has 10 and the data has 11. Example below.
count.fields(temp[i],sep=",",skip = 5)
[1] 10 11 11 11 11 11
If I knew all the files were formatted the same, then I would just input my own header columns and read the data from line 6, but because I need to know what each column is named so that I know how to use this I need to record the column headers coming in. How can I do this? An example file is below.
--Meta Data--
Start Time(UTC) LoggerName Pck Vin
7/22/2017 18:57 9160-06A0
--Raw Data--
Timestamp(UTC) Air Conditioning Power[Watts] C2 Input Voltage[V] HV Battery Current[A] HV Battery SOC[%] HV Battery Voltage[V] Is Driving[bool] OAT[DegC] Odometer[kilometers] Vehicle Speed[km/h]
7/22/2017 18:57 0 0 0 0 0 0 0 0 0
7/22/2017 18:57 0 0 0 0 0 0 0 0 0
7/22/2017 18:58 1700 13.21100044 0.600036621 94 394.7399902 1 37 62842.47656 33.07193375
7/22/2017 18:59 3300 13.04199982 -28 94 392.7399902 1 37 62843.01172 16.33480072
7/22/2017 19:00 3000 13.07600021 -10.09997559 93 393.9799805 1 36.5 62843.73047 0
In excel the column headers and the columns of data line up, so I dont know what is happening in this case. I am guessing that it is reading the date/time column as two columns.
If that is the case, how can I add another col header titled 'date' in at the front of the other column headers?