I have a dataframe with 8 columns, lets call it df.
Duration Location a b c d e f
3M Tuscaloosa 1 2 3 4 5 6
LCM Tuscaloosa 7 8 7 2 3 5
CW Tuscaloosa 9 1 0 0 3 1
PW Tuscaloosa 1 9 9 9 1 0
3M Miami 4 7 0 1 3 8
LCM Miami 8 2 7 8 8 1
CW Miami 1 3 5 7 9 1
PW Miami 8 8 4 4 2 0
3M New York 1 4 6 7 8 8
LCM New York 3 3 4 4 5 5
CW New York 1 1 1 2 2 2
PW New York 9 9 9 9 9 9
3M San Diego 9 3 5 6 7 1
LCM San Diego 1 8 9 0 0 1
CW San Diego 7 8 2 1 1 3
PW San Diego 0 9 4 6 1 4
I would like to create new columns based on the value in 'Duration' and the column name and values of columns a through f. My expected output is
Duration Location a b c d e f 3Ma LCMa CWa PWa 3Mb LCMb CWb PWb 3Mc ....
3M Tuscaloosa 1 2 3 4 5 6 1 null null null 2 null null null 3
LCM Tuscaloosa 7 8 7 2 3 5 null 7 null null null 8 null null null
CW Tuscaloosa 9 1 0 0 3 1 null null 9 null null null 1 null null
PW Tuscaloosa 1 9 9 9 1 0 null null null 1 null null null 9 null
3M Miami 4 7 0 1 3 8 4 null null null 7 null null null 0
LCM Miami 8 2 7 8 8 1 null 8 null null null 2 null null null
CW Miami 1 3 5 7 9 1 null null 1 null null null 3 null null
PW Miami 8 8 4 4 2 0 null null null 8 null null null 8 null
3M New York 1 4 6 7 8 8 1 null null null 4 null null null 6
LCM New York 3 3 4 4 5 5 null 3 null null null 3 null null null
CW New York 1 1 1 2 2 2 null null 1 null null null 1 null null
PW New York 9 9 9 9 9 9 null null null 9 null null null 9 null
3M San Diego 9 3 5 6 7 1 9 null null null 3 null null null 5
LCM San Diego 1 8 9 0 0 1 null 1 null null null 8 null null null
CW San Diego 7 8 2 1 1 3 null null 7 null null null 8 null null
PW San Diego 0 9 4 6 1 4 null null null 0 null null null 9 null
I have experimented with the .unstack() method but I'm not entirely sure how to tailor it to my situation. I have be unsuccessful with the normalize method as well. Any help is appreciated.
Try this:
Output:
Note: Columns are in alphabetical order.
If we want to keep order of columns in the same order as Duration columns, let's convert Duration to Categorical Dtype with order:
Output:
There are a couple of pandas tricks in this solution.
df.set_index(df['columnname'])vsdf.set_index('columnname').The first, creates a copy of the series and adds that to the index where the second moved the dataframe column series into the index.