I have two multi-indexed pandas dataframes that look like this:
>>> import pandas as pd
>>> df1 = pd.DataFrame({
... ('y1', '0'): [1, 2, 3],
... ('y2', '0'): [4, 5, 6],
... ('y11', '0'): [7, 8, 9],
... })
>>> df2 = pd.DataFrame({
... ('y1', '1'): [1.5, 2.5, 3.5],
... ('y2', '1'): [4.5, 5.5, 6.5],
... ('y11', '1'): [7.5, 8.5, 9.5],
... })
I want to concatenate them so that the result looks like:
>>> df = pd.DataFrame({
... ('y1', '0'): [1, 2, 3],
... ('y1', '1'): [1.5, 2.5, 3.5],
... ('y2', '0'): [4, 5, 6],
... ('y2', '1'): [4.5, 5.5, 6.5],
... ('y11', '0'): [7, 8, 9],
... ('y11', '1'): [7.5, 8.5, 9.5],
... })
i.e., the order of the first level of the multi-index (y1, y2, y11) is preserved, while the second level is sensibly interleaved.
What is a solution to concatenate the two multi-indexed dataframes such that the ordering of the first level of the multi-index is preserved?
If I use:
>>> df = pd.concat((df1, df2), axis="columns").sort_index(axis="columns")
it almost works, but the first level ends up sorted lexicographically, as y1, y11, y2:
>>> print(df)
  y1      y11      y2
   0    1   0    1  0    1
0  1  1.5   7  7.5  4  4.5
1  2  2.5   8  8.5  5  5.5
2  3  3.5   9  9.5  6  6.5
I can do this using a complicated regex, but I think that there should be a better solution than this.
One easy option is to concat, sort_index, and then restore the desired first-level order using df1.
If you can't rely on the original order and want to force a natural sort instead, use natsort.
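A minimal sketch of both options, reusing df1 and df2 from the question. The first variant takes the first-level order from df1.columns. The second is only an illustration of a natural sort: natural_key is a hypothetical helper built on the standard re module, used here as a stand-in for natsort so the example has no extra dependency, and it assumes every column label is a string.

```python
import re

import pandas as pd

df1 = pd.DataFrame({
    ('y1', '0'): [1, 2, 3],
    ('y2', '0'): [4, 5, 6],
    ('y11', '0'): [7, 8, 9],
})
df2 = pd.DataFrame({
    ('y1', '1'): [1.5, 2.5, 3.5],
    ('y2', '1'): [4.5, 5.5, 6.5],
    ('y11', '1'): [7.5, 8.5, 9.5],
})

# Option 1: sort the columns, then restore the first-level order from df1.
# Selecting by first-level labels keeps the sorted second level within
# each group.
first_level_order = df1.columns.get_level_values(0).unique().tolist()
by_df1_order = (
    pd.concat([df1, df2], axis="columns")
    .sort_index(axis="columns")
    [first_level_order]
)


# Option 2: natural sort without relying on df1's order.
# natural_key is a hypothetical helper (not part of pandas or natsort);
# it assumes string labels.
def natural_key(label):
    # "y11" -> ["y", 11, ""], so digit runs compare as integers.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", label)]


combined = pd.concat([df1, df2], axis="columns")
natural_cols = sorted(
    combined.columns,
    key=lambda col: [natural_key(level) for level in col],
)
by_natural_order = combined.reindex(
    columns=pd.MultiIndex.from_tuples(natural_cols)
)

print(by_df1_order)
print(by_natural_order)
```

For this data both variants should give the column order y1, y2, y11, with 0 and 1 interleaved under each. The first is shorter but depends on df1 already having the order you want; the second depends only on the labels themselves.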