I have the following dataset:
library(tibble)
ID<-c("A","B","C","D","E")
Fruits1<-c("orange","apple","pineapple","apple","pineapple")
Fruits2<-c("apple","orange","apple","pineapple","orange")
data<-tibble(ID,Fruits1,Fruits2)
data
# A tibble: 5 × 3
  ID    Fruits1   Fruits2
  <chr> <chr>     <chr>
1 A     orange    apple
2 B     apple     orange
3 C     pineapple apple
4 D     apple     pineapple
5 E     pineapple orange
I want to create a new column called FruitsDiff, which combines columns Fruits1 and Fruits2 like so:
# A tibble: 5 × 4
  ID    Fruits1   Fruits2   FruitsDiff
  <chr> <chr>     <chr>     <chr>
1 A     orange    apple     apple-orange
2 B     apple     orange    apple-orange
3 C     pineapple apple     apple-pineapple
4 D     apple     pineapple apple-pineapple
5 E     pineapple orange    orange-pineapple
My requirement for this new column is that the fruits be ordered alphabetically, regardless of which fruit comes first in the data frame (so for example, in row 1, even though orange comes before apple, the value in FruitsDiff is apple-orange). I usually use paste() to combine columns, but on its own it does not reorder the values, so it is not going to help me in this situation.
Any suggestions? Also, the separator between the two variables can be anything, I just used a dash for the sake of the example.
We get the elementwise min/max between the 'Fruit' columns with pmin/pmax (where the min/max is based on alphabetical order) and paste (str_c) the output from those functions to create the new column 'FruitsDiff'.
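The code that accompanied this description is not shown, so here is a minimal sketch of how it might look. It assumes mutate() from dplyr (not named above) together with str_c() from stringr, and it rebuilds the question's data so that it runs on its own. Note that pmin/pmax compare strings according to the current locale's collation order, which for lowercase words like these matches plain alphabetical order.

```r
library(dplyr)    # mutate(), %>%, and the re-exported tibble()
library(stringr)  # str_c()

# Rebuild the example data from the question
data <- tibble(
  ID      = c("A", "B", "C", "D", "E"),
  Fruits1 = c("orange", "apple", "pineapple", "apple", "pineapple"),
  Fruits2 = c("apple", "orange", "apple", "pineapple", "orange")
)

# pmin() picks the alphabetically first fruit in each row, pmax() the last;
# str_c() joins the two with the chosen separator
data <- data %>%
  mutate(FruitsDiff = str_c(pmin(Fruits1, Fruits2),
                            pmax(Fruits1, Fruits2),
                            sep = "-"))

data$FruitsDiff
```

The separator can be changed through the sep argument of str_c().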
Or in base R, use the same pmin/pmax to paste.
Or with apply and MARGIN = 1, loop over the rows, sort the elements and paste with collapse as argument.
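The two base R variants could be sketched as follows. This is an illustration under assumptions rather than the original code: it uses a plain data.frame instead of a tibble, with() for brevity, and a second column name FruitsDiff2 that is purely hypothetical, added only so both results can be compared side by side.

```r
# Rebuild the example data as a plain data.frame
data <- data.frame(
  ID      = c("A", "B", "C", "D", "E"),
  Fruits1 = c("orange", "apple", "pineapple", "apple", "pineapple"),
  Fruits2 = c("apple", "orange", "apple", "pineapple", "orange"),
  stringsAsFactors = FALSE
)

# Variant 1: elementwise min/max of the two columns, joined with paste()
data$FruitsDiff <- with(data, paste(pmin(Fruits1, Fruits2),
                                    pmax(Fruits1, Fruits2),
                                    sep = "-"))

# Variant 2: loop over the rows (MARGIN = 1), sort each row's values,
# and collapse them into a single string
data$FruitsDiff2 <- apply(data[c("Fruits1", "Fruits2")], 1,
                          function(x) paste(sort(x), collapse = "-"))

data
```

The apply variant generalizes to more than two fruit columns without changes, while pmin/pmax is vectorized and likely faster when there are exactly two columns.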