I have a table.
+------+---------------+
| id | co_id |
+------+---------------+
| 123 | abc |
| 123 | bcd |
| 234 |abc |
| 123 |def |
| 678 |urt |
| 678 |urt |
| 124 |xyz |
| 600 |urt |
| 700 |def |
+------+---------------+
I use Pandas to groupby id and co_id, because they are many to many or 1 to many or many to 1 relation(at least one match between id and co_id). Goal is to group together all the same id and same co_id, output would be:
+------+---------------+------+
| id | co_id |cat |
+------+---------------+------+
|123 |abc |group1|
|123 |bcd |group1|
|234 |abc |group1|
|123 |def |group1|
|700 |def |group1|
|678 |urt |group2|
|678 |urt |group2|
|600 |urt |group2|
|124 |xyz |group3|
+------+---------------+------+
I was going to group by id and co_id separately and join them together, but failed. My initial code as below, how can I achieve ideal output? Please advise.
import pandas as pd
df = pd.read_csv(r'diretory\sample.csv')
df_id = df.groupby('id').agg({'id': lambda x: list(x)}).reset_index()
df_coid = df.groupby('co_id').agg({'co_id': lambda x: list(x)}).reset_index()