Working between Excel files and R: trouble reading data into R

112 Views Asked by At

I am reading in thousands of files. Each file has 5 rows of information, then a header row, then the data. Each file contains a different number of columns with different column names and I need to record the column names in order to know how to process it. Unfortunately when I read in the file with this:

temp.data <- read.csv(temp[i],header=TRUE,sep=",", stringsAsFactors = FALSE, skip= 5)

I get the error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed

The reason this happens is that the number of delimiters in the header row is one less than the number of delimiters in the data rows. I know this because if you count the fields - you can see the header has 10 and the data has 11. Example below.

count.fields(temp[i],sep=",",skip = 5)
[1] 10 11 11 11 11 11

If I knew all the files were formatted the same, then I would just input my own header columns and read the data from line 6, but because I need to know what each column is named so that I know how to use this I need to record the column headers coming in. How can I do this? An example file is below.

--Meta Data--                                   
Start Time(UTC) LoggerName  Pck Vin                     
7/22/2017 18:57 9160-06A0                               

--Raw Data--                                    
Timestamp(UTC)  Air Conditioning Power[Watts]   C2 Input Voltage[V] HV Battery Current[A]   HV Battery SOC[%]   HV Battery Voltage[V]   Is Driving[bool]    OAT[DegC]   Odometer[kilometers]    Vehicle Speed[km/h]
7/22/2017 18:57 0   0   0   0   0   0   0   0   0
7/22/2017 18:57 0   0   0   0   0   0   0   0   0
7/22/2017 18:58 1700    13.21100044 0.600036621 94  394.7399902 1   37  62842.47656 33.07193375
7/22/2017 18:59 3300    13.04199982 -28 94  392.7399902 1   37  62843.01172 16.33480072
7/22/2017 19:00 3000    13.07600021 -10.09997559    93  393.9799805 1   36.5    62843.73047 0

In excel the column headers and the columns of data line up, so I dont know what is happening in this case. I am guessing that it is reading the date/time column as two columns.

If that is the case, how can I add another col header titled 'date' in at the front of the other column headers?

0

There are 0 best solutions below