Change columns to rows per student ID

119 Views Asked by At

I have data in excel sheet that I am reading into a dataframe:

ID Grade Course Q1 Number Q1 Letter Q2 Number Q2 Letter
1 9 English 73 B 69 C
1 9 Math 70 B 52 C
1 9 Science 69 C 80 A

desired output:

ID Grade Course Semester Number Grade Letter Grade
1 9 English Q1 73 B
1 9 English Q2 69 C
1 9 Math Q1 70 B
1 9 Math Q2 52 C
1 9 Science Q1 69 C
1 9 Science Q2 80 A

I'm trying to do df.melt, but it's not working. Any help is appreciated.

1

There are 1 best solutions below

8
Scott Boston On

Update:

df = pd.read_excel('Downloads/grades_mock+data.xlsx')

dfm = df.set_index(['ID', 'GRADE', 'COURSE'])\
        .rename(columns=lambda x: ' '.join(x.split(' ', 1)[::-1]))\
        .reset_index()

#Eliminating duplicates.
dfm = dfm.groupby(['ID', 'GRADE', 'COURSE', 'DISCIPLINE COURSE'], as_index=False).first()

df_out = pd.wide_to_long(dfm,
                         ['GRADE NUMERIC', 'GRADE LETTER'], 
                         ['ID', 'GRADE', 'COURSE', 'DISCIPLINE COURSE'],
                         'Semester', sep=' ', suffix='.*')\
           .reset_index()

print(df_out)

Try this, using pd.wide_to_long, with some column renaming to make it easier:

df = pd.read_clipboard()


dfm = df.set_index(['ID', 'Grade', 'Course'])\
        .rename(columns=lambda x: ' '.join(x.split(' ')[::-1]))\
        .reset_index()

df_out = pd.wide_to_long(dfm,
                         ['Number', 'Letter'], 
                         ['ID', 'Grade', 'Course'],
                         'Semester', sep=' ', suffix='.*')\
           .reset_index()

print(df_out)

Output:

   ID  Grade   Course Semester  Number Letter
0   1      9  English       Q1      73      B
1   1      9  English       Q2      69      C
2   1      9     Math       Q1      70      B
3   1      9     Math       Q2      52      C
4   1      9  Science       Q1      69      C
5   1      9  Science       Q2      80      A