I have a dataframe as shown below and the output from this code is shown below.
- Essentially, I want to read values from columns which can be of two formats as shown in col1 and col2.
- If the format is as in col1 then we split text and read third element and if the format is as in col2 we read the cell value as it is.
- We then want to pass condition if the values in columns (col1 and col3 after splitting col1) to create col4 and col5.
- Finally, we want the result in col6.
- the variable "selection" is a list of numbers
Question: Since in the original dataframe, there can be multiple columns of value format type as in col1 and col2, how the code below can be modified to get the desired output?
#STEP1
df <- data.frame(
col1 = c("abc_1_102", "abc_1_103", "xyz_1_104"),
col2 = c("107", "108", "106")
)
#STEP2
split_text <- strsplit(df$col1, "_")
third_elements <- sapply(split_text, function(x) if(length(x) >= 3) x[3] else NA)
#STEP3
df$col3<-third_elements
#STEP4
selection<-c(107,102,108)
df$col4<-ifelse(df$col2 %in% selection,"SELECT","NOTSELECT")
df$col5<-ifelse(df$col3 %in% selection,"SELECT","NOTSELECT")
#STEP5
df$col6<-paste(df$col4,df$col5,sep = ",")
Output from above code:
col1 col2 col3 col4 col5 col6
1 abc_1_102 107 102 SELECT SELECT SELECT,SELECT
2 abc_1_103 108 103 SELECT NOTSELECT SELECT,NOTSELECT
3 xyz_1_104 106 104 NOTSELECT NOTSELECT NOTSELECT,NOTSELECT
Desired output
col1 col2 col6
1 abc_1_102 107 SELECT,SELECT
2 abc_1_103 108 NOTSELECT,SELECT
3 xyz_1_104 106 NOTSELECT,NOTSELECT
You can do this all in one go with by pasting two
ifelsestatements together. Theifelseforcol2is straightforward. Theifelseforcol3usesgreplto search for any of the numbers inselectby creating a search string usingpaste(..., collapse = "|")(pasting the "or" operator between them). The outerpaste(..., sep = ",")puts it all together:Or to more safely add a word boundary to the second
ifelse(thanks to @r2evans!)Both give the same output in this example: