How to turn dataframe rows into multiindex levels?

59 Views Asked by At

I have a csv file that looks something like this:

ID ; name; location; level; DATE19970901; DATE19970902; ...;DATE20201031;survey;person
001;  foo;     east;   500;        123.1;        342.5; ...;       234.5;     A;  John
002;  bar;     west;    50;         67.8;         98.3; ...;        76.6;     A;  Jenn
003;  baz;    north;  5000;        535.7;         99.9; ...;       432.6;     B;  John

which I need to turn into a dataframe like this:

ID         001    002   003
name       foo    bar   baz
location   east   west  north
level      500    50    5000
survey     A      A     B
person     John   Jenn  John
date
1997-09-01 123.1  67.8  535,7 
1991-09-02 342.5  98.3  99.9
...
2020-10-31 234.5  76.6  432.6

Now the easiest way seems to me to read it in, .transpose() it and then tell it to turn the data rows 0,1,2,8443,8444 into multiindex rows, but I'm missing a function for it. .MultiIndex.from_frame does only seem to take a complete df to turn into a multiindex. I could probably split my df into a multiindex df and a data df and merge them, but that seems complicated and error prone to me.

What's an easy way to do it, is to read in the csv, transpose the df, export it to csv and read that in again, but that seems rather hacky (and slow, though that is not really an issue in my case).

1

There are 1 best solutions below

4
mozway On BEST ANSWER

You can use a transposition, with identification of the non-DATE columns, and conversion to_datetime:

# read dataset as semicolon-separated data, ignoring spaces
df = pd.read_csv('inpt_data.csv', sep=r'\s*;\s*', engine='python')

# identify non-DATE columns
cols = list(df.columns.difference(list(df.filter(like='DATE')), sort=False))
# or if you already know the columns:
# cols = ['ID', 'name', 'location', 'level', 'survey', 'person']

# reshape, convert to dates
out = df.set_index(cols).T.rename_axis('date')
out.index = pd.to_datetime(out.index, format='DATE%Y%m%d')

Output:

ID              1     2      3
name          foo   bar    baz
location     east  west  north
level        500   50     5000
survey          A     A      B
person       John  Jenn   John
date                          
1997-09-01  123.1  67.8  535.7
1997-09-02  342.5  98.3   99.9
2020-10-31  234.5  76.6  432.6