The read.table family (read.table, read.csv, read.delim et al) has the argument check.names with the following explanation:
logical. If
TRUEthen the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names. If necessary they are adjusted (bymake.names) so that they are, and also to ensure that there are no duplicates.
Say I have loaded a data frame containing syntactically invalid column names. Is there any other consequence apart from having to access a specific column by name using the ` character?
Check out
help(make.names)to understand what it is doing and why.The big ones that will trip you up are blank column names (
df$``gives an error) and repeated column names (df$valwill return the firstvalcolumn result only).Outside of that, if you pass this data.frame to a function that is expecting a data.frame with valid names, you will likely get errors, and perhaps silent ones that are hard to detect.