I am trying to create new variable using mutate from column. I am using strsplit to perform some string operation to create new variable contents. But it is not working as desired.
myDF = data.frame(filename = c("T9719178_Mazda_20230415",
"T9719179_Mazda_20230415",
"T9719180_Mazda_20230415",
"T9719001_Tesla_20230415",
"T9719002_Tesla_20230415",
"T9719003_Tesla_20230415"))
myDF%>%
mutate(LotCode = as.factor(strsplit(filename,"_",fixed=TRUE)[[1]][1]))%>%
mutate(ModelName = as.factor(strsplit(filename,"_",fixed=TRUE)[[1]][2]))
Current output
filename LotCode ModelName
T9719178_Mazda_20230415 T9719178 Mazda
T9719179_Mazda_20230415 T9719178 Mazda
T9719180_Mazda_20230415 T9719178 Mazda
T9719001_Tesla_20230415 T9719178 Mazda
T9719002_Tesla_20230415 T9719178 Mazda
T9719003_Tesla_20230415 T9719178 Mazda
Expected output
filename LotCode ModelName
T9719178_Mazda_20230415 T9719178 Mazda
T9719179_Mazda_20230415 T9719179 Mazda
T9719180_Mazda_20230415 T9719180 Mazda
T9719001_Tesla_20230415 T9719001 Tesla
T9719002_Tesla_20230415 T9719002 Tesla
T9719003_Tesla_20230415 T9719003 Tesla
I suppose above should be straight forward but not sure what is wrong. Any suggestions?
The issue is that for each row in your dataset you are extracting the same (i.e., the first) element of the list returned by
strsplit().To rectify your code, you can use
map()(orapply()) to extract the first and second elements from each element of the list:Created on 2023-04-15 with reprex v2.0.2
Alternatively, you could use
sep_wider_delim()instead which has the advantage of returning a vector instead of a list:Created on 2023-04-15 with reprex v2.0.2