How do I match a column entry from one df to a different df; and if they're the same, append another column's entry from the first df to the 2nd df?

81 Views Asked by cross12tamu At 07 November 2018 at 18:29

CONTEXT:

I have two dataframes that have the following set up:

df1 looks like this...and goes on for about 3500 rows:

| id1 | id2   | 
|:----|------:|
| a   | name1 |  
| b   | name2 | 
| c   | name3 | 
| d   | name4 | 
| e   | name5 |   
| f   | name6 |

df2 looks like this...and goes on for about 4000 rows and about 8 columns

| id1 | ranktrial1   | ranktrial2   | ...
|:----|-------------:|-------------:| ...
| a   | rank1        |rank1         | ...
| b   | rank2        |rank2         | ...
| c   | rank3        |rank3         | ...
| d   | rank4        |rank4         | ...
| e   | rank5        |rank5         | ...  
| f   | rank6        |rank6         | ...

NOTE1: some of the id1s, do not have id2s. Meaning they'll be NaNs when they're mapped; and I'll just drop them whenever I get to that step. I don't know if this is relevant, but I just wanted to add it in case it was.

QUESTION:

I need to append/join/place (don't know correct jargon here) the corresponding id2 names to the second dataframe, iff the id1 entry == id1 entry of df2. How do I do this?

The desired dataframe would look like this:

| id1 | id2   | ranktrial1   | tranktrail2  | ...
|:----|------:|-------------:|-------------:| ...
| a   | name1 | rank1        | rank1        | ...  
| b   | name2 | rank2        | rank2        | ...
| c   | name3 | rank3        | rank3        | ...
| d   | name4 | rank4        | rank4        | ...
| e   | name5 | rank5        | rank5        | ...
| f   | name6 | rank6        | rank6        | ...

I feel as if this is probably really simple and I'm being a bit of a doofus, as I am a novice Pythoner. However, I have not been able to use similar question's responses to achieve my goal. It is quite likely my fault though :p

Thanks in advance for your help!

edits changed 4000 entries --> 4000 rows. LIkewise for 3500 entries

Original Q&A

There are 1 best solutions below

piRSquared On 07 November 2018 at 18:32 BEST ANSWER

Given you are dropping the missing bits afterwards, this is an inner join and can be accomplished with merge. By default, merge uses all commonly named columns. In this case, the only commonly named column is id1. Also, how='inner' si also the default.

df1.merge(df2)

  id1    id2 ranktrial1 tranktrail2
0   a  name1      rank1       rank1
1   b  name2      rank2       rank2
2   c  name3      rank3       rank3
3   d  name4      rank4       rank4
4   e  name5      rank5       rank5
5   f  name6      rank6       rank6

You could be more explicit with

df1.merge(df2, how='inner', on='id1')

How do I match a column entry from one df to a different df; and if they're the same, append another column's entry from the first df to the 2nd df?

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in MERGING-DATA

Trending Questions

Popular # Hahtags

Popular Questions