I have a dataframe like this basically:
- table1: Every market has 2 groups and it is a 2 years data. I want to make new data table based on several conditions.
| date | market | group | cost | income | clicks |
|---|---|---|---|---|---|
| A | R | .... | .... | ..... | |
| A | N | ||||
| B | R | ||||
| B | N | ||||
| C | R | ||||
| C | N |
I created this table by market and its groups:
table2 = table1.groupby(["market", "group"])[["cost"]].sum().reset_index()
| market | group | cost |
|---|---|---|
| A | R | ..... |
| A | N | |
| B | R | |
| B | N | |
| C | R | |
| C | N |
I want to create new table by iterating every row and create a new column based on: If cost < 1000 then I want to go with market with its own group in table1, then sum income and divide by clicks. If cost > 1000 then I want it to be 0.
| market | group | cost | new_column |
|---|---|---|---|
| A | R | ..... | |
| A | N | ||
| B | R | ||
| B | N | ||
| C | R | ||
| C | N |
I tried this:
for index, row in table2.iterrows():
if row['cost'] < 1000:
table2['new_column'] = table1.loc[(table1.market == row['market'] & (table1.group == row['group'])]['income'].sum() / table1.loc[(table1.market == row['market'] & (table1.group == row['group'])]['clicks'].sum()
else:
table1['new_column'] = 0
I just want to iterate every row and go to the first table with values of the second table. How can I do this?
It's faster to sum up all 3 columns
cost,incomeandclicksand then set a condition fornew_column:You can drop the unneeded columns after.