I have a dataset that lists employees and timestamps on which they complete certain actions. It is divided into three columns: Employee, Date, Hour.
I want to count the number of employees that is active each hour. In excel I would do this by adding a fourth column EmpFactor in which I perform a COUNTIFS operation:
=1/COUNTIFS(Name range;Name;Date range;Date;Hour range;Hour)
I can subsequently calculate the number of active employees by performing a SUMIF on the EmpFactor Column.
I tried the following code to compose the EmpFactor column using pandas:
for name,date,time in zip(df['Employee'],df['Date'],df['Time']):
df['EmpFactor'] = 1/(df[(df.Employee == name) and (df.Day == dag)
and (df.Time == tijd)].count())
This however doesn't work. I have searched extensively through numerous topics on SO, but haven't found a fitting answer yet.
Starting with this dataframe:
Output:
You can group by
DateandTimeand count the employees:Output: