Dataframe merge and create multiple columns with multiple matches

70 Views Asked by Thenelly At 22 September 2023 at 17:27

I have two pandas dataframes. DF1 has columns A B and C. DF2 has columns B and C. I want to create DF3 with columns A B C D and E.

I want column A B and C to be the same and in the same order as column A B and C in DF1. Then I want column D to be the values in DF2 column B when DF2 B and C both match DF1 B and C. Then I want DF3 column E to be the values in DF2 column B when DF2 column B matches DF1 B but the column C amounts do not match.

To summarize:

Columns:

A - same as DF1 A

B - same as DF1 B

C - same as DF1 C

D - the values of DF2 B when both B and C match DF1.

E - the values of DF2B when only B matches DF1 and c does not.

Thanks in advance!

Here is an example:

DF1

DF2

DF3

Here is the same example in text:

DF1:
A	B	C
AAA	123	456
BBB	789	888
CCC	999	111
DDD	555	444

DF2:
B	C
123	456
789	888
789	900
789	950
999	111
555	000
555	444

DF3:
A	B	C	D	E
AAA	123	456	456	nan
BBB	789	888	888	900, 950
CCC	999	111	111	nan
DDD	555	444	444	000

I have tried to do a merge like the following: df1.merge(df2['C'].astype(str).groupby(df2['B']).agg(', '.join), on='B', how='left;).fillna('0') which has worked when I want to match on one column (for example B), and pull in the information in C. But I am not familiar with how to pull in multiple columns when one needs to match twice, and one needs to match only once, discluding any that match in both columns.

Original Q&A

There are 1 best solutions below

mozway On 22 September 2023 at 17:55

You can merge on "B", then assign "D"/"E" depending on the equality of the two "C" columns. Finally, pivot_table to reshape your DataFrame:

out = (df1
  .merge(df2, on='B', how='left', suffixes=(None, '_'))
  .assign(col=lambda d: np.where(d['C'].eq(d['C_']), 'D', 'E'))
  .pivot_table(index=['A', 'B', 'C'], columns='col', values='C_',
               aggfunc=lambda x: ','.join(x.dropna().astype(str)))
  .reset_index().rename_axis(columns=None)
)

Output:

     A    B    C    D        E
0  AAA  123  456  456      NaN
1  BBB  789  888  888  900,950
2  CCC  999  111  111      NaN
3  DDD  555  444  444      000

Dataframe merge and create multiple columns with multiple matches

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in MERGE

Related Questions in MERGING-DATA

Trending Questions

Popular # Hahtags

Popular Questions