I have table in SAS Enterprise Guide like below:
ID | COL_A | COL_B | COL_C
-----|-------|-------|------
111 | 10 | 20 | 30
222 | 15 | 80 | 10
333 | 11 | 10 | 20
444 | 20 | 5 | 20
Requirements:
I need to create new columns: TOP_1, TOP_2, TOP_3 where will be names of columns from the highest value from COL_A, COL_B, COL_C columns to the lowest per ID
If for example 2 or more columns have the same highest value take the first under the alphabet.
In TOP_1 - name of column with the hihest value per ID In TOP_2 - name of column with the second highest value per ID In TOP_3 - name of column with the third highest value per ID
Desire output:
ID | COL_A | COL_B | COL_C | TOP_1 | TOP_2 | TOP_3
-----|-------|-------|--------|--------|---------|---------
111 | 10 | 20 | 30 | COL_C | COL_B | COL_A
222 | 15 | 80 | 10 | COL_B | COL_A | COL_C
333 | 11 | 10 | 20 | COL_C | COL_A | COL_B
444 | 20 | 5 | 20 | COL_A | COL_C | COL_B
Because:
- for ID = 111 the highest value is in COL_C, co name "COL_C" going to column "TOP_1", second highest value is in COL_B, so name "COL_B" going to column "TOP_2" and so on...
- for ID = 444 two columns have the highest value, so we have to use alphabet criteria and in column "TOP_1" is name "COL_A" and name "COL_B is in column "TOP_2"
How can I do that in SAS Enterprise Gude or in PROC SQL ?
First let's convert your listing into an actual dataset.
If you use PROC TRANSPOSE to covert your COL_: into observations.
You can then sort by descending values (and ascending variable name):
And use another PROC TRANSPOSE to make your new variables:
If the data is really large (or you have a lot more than 3 columns to check) you might want to eliminate COL_A COL_B and COL_C from the BY group and instead just merge the resulting TOP_: variable back onto the original dataset.