I have data frame like this
office salary
| Column A | Column B |
| -------- | -------- |
| Montreal | a, b |
| Cell 3 | b,c,d |
| Montreal | b,c,d |
| Montreal | c,d,e |
| Toronto | n,c |
Output:
office salary
0 Montreal a,b,c,d,e
1 Toronto b,c,d,n,c
How i can do this in pandas?
I tried:
office = ['Montreal', 'Toronto', 'Montreal', 'Montreal','Toronto']
avg_salary = ["a,b","b,c,d","b,c,d","c,d,e","n,c"]
hr_lst = list (zip(office, avg_salary))
#hr_lst
df = pd.DataFrame(hr_lst, columns = ['office', 'salary'])
# Display Original DataFrames
print("Created DataFrame:\n",df,"\n")
`your text`
# Finding unique values
res = df.groupby('office')['salary'].apply(lambda x: list(np.unique(x)))
# Display Result
print("Unique Values:\n",res)
Output:
office salary
| Column A | Column B |
| -------- | -------- |
| Montreal | a, b |
| Cell 3 | b,c,d |
| Montreal | b,c,d |
| Montreal | c,d,e |
| Toronto | n,c |
Unique Values:
office
Montreal [a,b, b,c,d, c,d,e]
Toronto [b,c,d, n,c]
Name: salary, dtype: object
Use lambda function with list comprehension with
split, for remove duplicates usedict.fromkeystrick if ordering is important, last usejoin:If ordering in ouput is not important use
sets: