I have a dataset called "data" that looks like this:
I am trying to create a new variable called "Group", which codes elements in the "FileName" variable as follows:
- anything with element HC will be labelled as "HC PBMC"
- anything with elements SF and PBMC will be labelled as "AS PBMC"
- anything with elements SF and SFMC will be labelled as "AS SFMC"
In order to do this, I wrote this code:
data$Group <- ifelse(grepl("HC", data$FileName), "HC",
              ifelse(grepl("SF & PBMC", data$FileName), "AS PBMC",
                     "AS SFMC"))
However, anything with elements SF and PBMC did not code as "AS PBMC" correctly. Instead it just skipped that condition and labelled it as "AS SFMC". Please see below:
Any help would be most welcome!


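First note that "&" does not mean a logical "and" inside a regex. It is matched as a literal ampersand, so the pattern "SF & PBMC" only matches file names that contain exactly that text, spaces included. That is why the second condition never fires and everything that is not HC falls through to "AS SFMC". You could certainly achieve what you want with a more sophisticated regex, but wouldn't it be more transparent to first extract the components you use for naming the groups and then assign the cases in a second step?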
Created on 2023-10-12 with reprex v2.0.2
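The two-step idea described above could look roughly like the following. This is only a sketch and not the original reprex: the file names are made up for illustration because the real contents of data$FileName are not shown, and the helper vectors is_hc, is_sf, is_pbmc and is_sfmc are names introduced here. The HC label follows the written description ("HC PBMC") rather than the posted code ("HC").

```r
# Hypothetical file names; the real values in data$FileName were not shown.
data <- data.frame(
  FileName = c("HC_01_PBMC.fcs", "SF_02_PBMC.fcs", "SF_02_SFMC.fcs"),
  stringsAsFactors = FALSE
)

# "&" inside a pattern is a literal character, so this looks for the
# exact text "SF & PBMC" and matches none of the file names above.
grepl("SF & PBMC", data$FileName)

# Step 1: test for each component separately (fixed = TRUE means
# plain substring matching, no regex interpretation).
is_hc   <- grepl("HC",   data$FileName, fixed = TRUE)
is_sf   <- grepl("SF",   data$FileName, fixed = TRUE)
is_pbmc <- grepl("PBMC", data$FileName, fixed = TRUE)
is_sfmc <- grepl("SFMC", data$FileName, fixed = TRUE)

# Step 2: combine the logical vectors with R's vectorised "&"
# and assign the group labels. Unmatched rows become NA instead of
# silently falling into the last group.
data$Group <- ifelse(is_hc, "HC PBMC",
              ifelse(is_sf & is_pbmc, "AS PBMC",
              ifelse(is_sf & is_sfmc, "AS SFMC", NA_character_)))

data
```

Returning NA for rows that match none of the rules is a design choice made for this sketch: it makes unexpected file names visible instead of labelling them "AS SFMC" by default. Note that "SFMC" itself contains "SF", so is_sf is also TRUE for every SFMC file; if "SF" is meant to be a separate token in your file names, the patterns would need to be stricter.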