Is there a way to apply a running (row-by-row) function like this to a DataFrame?
I have an imaginary DataFrame, "Classroom", shown below (the values are made up to illustrate the issue):
| Student. | Subject. | Mark. | New Sub. |
|---|---|---|---|
| Mike | English | pass | English |
| Mike | French | pass | None |
| Mike | History | pass | None |
| Mike | Bio | Fail | None |
| ... | ...... | ...... | None |
**I want to fill in the "New Sub." column so that, for every student, each subject's mark is checked in order: if the mark is "pass" and the subject is not already in the running "New Sub." list, it is appended; if the mark is "Fail", the subject is not added. The result should look like this:**
| Student. | Subject. | Mark. | New Sub. |
|---|---|---|---|
| Mike | English | pass | English |
| Mike | French | pass | English,French |
| Mike | History | pass | English,French,History |
| Mike | Bio | Fail | English,French,History |
| ... | ...... | ...... | None |
I tried implementing this with `np.where`:
`Classroom["New Sub."] = np.where(Classroom["Mark."] == "pass", Classroom["New Sub."].shift(1) + "," + Classroom["Subject."], Classroom["New Sub."])`
The issue is that the "New Sub." column isn't updated while `np.where` runs, so what I get is the following:
| Student. | Subject. | Mark. | New Sub. |
|---|---|---|---|
| Mike | English | pass | English |
| Mike | French | pass | English,French |
| Mike | History | pass | None,History |
| Mike | Bio | Fail | None |
| ... | ...... | ...... | None |
It is as if each row reads the old values of "New Sub." rather than the values produced by the modifications to the rows above it.
Is there a way to apply a running function like this to a DataFrame?
The issue with your approach is that `np.where` is vectorized: `Classroom["New Sub."].shift(1)` and the string concatenation are evaluated once, on the column as it stood before the assignment. Each row therefore sees only the original value of the row above it, never the updated one, so the list cannot build up from row to row.
Instead, you can compute the running list per student with a combination of `groupby`, `cumsum`, and `apply` in pandas.
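As a minimal sketch of the per-student running list, the example below uses `groupby` with a plain Python loop inside each group instead of `cumsum`, because the "only add if not already present" rule is easier to express that way. The sample data (including the second student, Anna) and the helper name `running_passed` are assumptions made for illustration; the column names follow the question.

```python
import pandas as pd

# Sample data; "Anna" is a made-up second student to show the per-student reset.
classroom = pd.DataFrame({
    "Student.": ["Mike", "Mike", "Mike", "Mike", "Anna", "Anna"],
    "Subject.": ["English", "French", "History", "Bio", "Math", "Art"],
    "Mark.": ["pass", "pass", "pass", "Fail", "Fail", "pass"],
})


def running_passed(group):
    """Return, for each row of one student's group, the passed subjects so far."""
    passed = []   # subjects passed up to and including the current row
    out = []
    for subject, mark in zip(group["Subject."], group["Mark."]):
        if mark == "pass" and subject not in passed:
            passed.append(subject)
        # None until the student has passed at least one subject
        out.append(",".join(passed) if passed else None)
    # Keep the group's index so the values align with the original rows
    return pd.Series(out, index=group.index, dtype=object)


# Build the running list separately for each student, then align by index.
parts = [running_passed(group)
         for _, group in classroom.groupby("Student.", sort=False)]
classroom["New Sub."] = pd.concat(parts)

print(classroom)
```

For Mike's rows this gives the values in the desired table above. The list starts over for each student because the helper is called once per group.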