I have the following table in R. It has two A columns, three B columns and one C column. For each row, I need to calculate the maximum difference between any columns of the same name and return the name of the column group with the largest difference as output.
For row 1
- The max difference between A is 2
- The max difference between B is 4
- I need the output as B
For row 2
- The max difference between A is 3
- The max difference between B is 2
- I need the output as A
| A | A | B | B | B | C |
|---|---|---|---|---|---|
| 2 | 4 | 5 | 2 | 1 | 0 |
| -3 | 0 | 2 | 3 | 4 | 2 |
First of all, it's a bit dangerous (and not allowed in some cases) to have non-unique column names, so the first thing I did was to make the names unique using `base::make.unique()`. From there, I used `tidyr::pivot_longer()` so that the grouping information contained in the column names could be accessed more easily. Here I use a regex inside `names_pattern` to discard the differentiating parts of the column names so they will be the same again. Then we use `dplyr::group_by()` followed by `dplyr::summarize()` to get the largest difference in each `id` and `grp`, which correspond to your rows and similarly named columns in the original data. Finally we use `dplyr::slice_max()` to return only the largest difference per `id`, that is, per original row.

Created on 2022-02-14 by the reprex package (v2.0.1)
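The code that went with this explanation is not preserved here, so the following is only a minimal sketch of the steps described above, not necessarily the original code. The object name `d`, the helper columns `id`, `grp` and `max_diff`, the `sep = "_"` argument and the exact regex are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data; check.names = FALSE keeps the duplicated names.
d <- data.frame(
  A = c(2, -3), A = c(4, 0),
  B = c(5, 2), B = c(2, 3), B = c(1, 4),
  C = c(0, 2),
  check.names = FALSE
)

# Make the names unique: "A", "A_1", "B", "B_1", "B_2", "C".
names(d) <- make.unique(names(d), sep = "_")

result <- d %>%
  # Remember which original row each value came from.
  mutate(id = row_number()) %>%
  # Long format; the regex keeps only the leading letters, so "A_1" becomes "A".
  pivot_longer(-id, names_to = "grp", names_pattern = "^([A-Z]+).*$") %>%
  group_by(id, grp) %>%
  # Largest possible difference within each row and column group.
  summarise(max_diff = max(value) - min(value), .groups = "drop_last") %>%
  # Still grouped by id here, so this keeps one winning group per row.
  slice_max(order_by = max_diff, n = 1, with_ties = FALSE) %>%
  ungroup()

result
```

For the example data this should give `B` for row 1 and `A` for row 2. With `with_ties = FALSE`, a row where two groups share the largest difference returns only one of them; set `with_ties = TRUE` if you want to see all tied groups.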