How to find the elementwise harmonic mean across two Pandas dataframes

Question

2.1k Views Asked by panda At 17 May 2025 at 02:32

Similarly to this post (efficient function to find harmonic mean across different pandas dataframes), I have two Pandas dataframes that are identical in shape, and I want to find the harmonic mean of each pair of elements, one from each dataframe at the same location. The solution given in that post was to use a Panel, but that is now deprecated.

If I do this:

import pandas as pd
import numpy as np
from scipy.stats.mstats import hmean

df1 = pd.DataFrame(dict(x=np.random.randint(5, 10, 5), y=np.random.randint(1, 6, 5)))
df2 = pd.DataFrame(dict(x=np.random.randint(5, 10, 5), y=np.random.randint(1, 6, 5)))
dfs_dictionary = {'DF1':df1,'DF2':df2}
df=pd.concat(dfs_dictionary)
print(df)

       x  y
DF1 0  9  4
    1  6  4
    2  7  2
    3  5  2
    4  5  2
DF2 0  9  2
    1  7  1
    2  7  1
    3  9  5
    4  8  3

x = df.groupby(level = 1).apply(hmean, axis = None).reset_index()
print(x)
   index         0
0      0  4.114286
1      1  2.564885
2      2  2.240000
3      3  3.956044
4      4  3.453237

I only get one column of values. Why? I was expecting two columns as per the original df, one for the hmean of the x values and one for the hmean of the y values. How can I achieve what I want to do?

There are 2 best solutions below

Renato Aranha On 01 December 2020 at 05:26

Just try removing the axis = None parameter.

**Quang Hoang** · Accepted Answer

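The reason is that you pass axis=None to hmean, which flattens the data. Remember that when you do groupby().apply(), the function receives the whole group as a DataFrame; with groupby(level=1), that is the pair of rows sharing the same inner index, one from DF1 and one from DF2. Just remove axis=None: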

x = df.groupby(level = 1).apply(hmean).reset_index()

And you get (the values differ from those in the question because the data is random):

   index                                        0
0      0                 [6.461538461538462, 3.0]
1      1  [5.833333333333333, 2.4000000000000004]
2      2                               [8.0, 3.0]
3      3  [6.857142857142858, 2.4000000000000004]
4      4   [6.461538461538462, 2.857142857142857]
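
To make the flattening concrete, here is a minimal sketch on a single group, using the values of the inner-index-0 rows printed in the question. It imports hmean from scipy.stats rather than scipy.stats.mstats as the question does, and the names group, per_column and flattened are only illustrative.

```python
import numpy as np
from scipy.stats import hmean

# One group as groupby(level=1) sees it: the row with inner index 0
# from each frame (values taken from the question's printed data).
group = np.array([[9, 4],    # DF1, row 0: x, y
                  [9, 2]])   # DF2, row 0: x, y

# axis=0 (the default) reduces down the rows: one value per column.
per_column = hmean(group, axis=0)

# axis=None flattens first: a single value over all four numbers.
flattened = hmean(group, axis=None)

print(per_column)
print(flattened)
```

The flattened value is the 4.114286 shown for index 0 in the question's output, which is why only one column came back.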

Or you can use agg:

x = df.groupby(level = 1).agg({'x':hmean,'y':hmean})

and get:

          x         y
0  6.461538  3.000000
1  5.833333  2.400000
2  8.000000  3.000000
3  6.857143  2.400000
4  6.461538  2.857143

If you have more columns than just x and y:

x = df.groupby(level=1).agg({c:hmean for c in df.columns})
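
As an alternative that is not part of the answers above, the elementwise harmonic mean can also be sketched with plain DataFrame arithmetic, skipping concat and groupby entirely. This assumes both frames share the same index and column labels and contain only strictly positive values (a zero would cause a division by zero). The data below is copied from the frame printed in the question; hm_direct, hm_general and frames are illustrative names.

```python
import numpy as np
import pandas as pd

# Values copied from the concatenated frame printed in the question.
df1 = pd.DataFrame({"x": [9, 6, 7, 5, 5], "y": [4, 4, 2, 2, 2]})
df2 = pd.DataFrame({"x": [9, 7, 7, 9, 8], "y": [2, 1, 1, 5, 3]})

# Harmonic mean of two positive numbers a and b is 2ab / (a + b).
# pandas aligns on index and column labels, so this is elementwise.
hm_direct = 2 * df1 * df2 / (df1 + df2)

# General form for any number of frames: n / sum(1 / df_i).
frames = [df1, df2]
hm_general = len(frames) / sum(1 / f for f in frames)

print(hm_direct)
```

The result keeps the x and y columns of the inputs, and the general form extends to more than two frames by adding them to the list.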
