For each ID in one table (Table 1), how can I count the number of rows in another table (Table 2) which fall between a start date and end date (two columns in the first table) in R?
Table 1:
| contact_number | start_date | end_date |
|---|---|---|
| 123 | 01/01/2023 | 01/01/2024 |
| 1234 | 01/01/2022 | 01/01/2023 |
Table 2:
| contact_number | visit_type | date |
|---|---|---|
| 123 | 1 | 02/01/2023 |
| 123 | 2 | 05/01/2023 |
| 1234 | 2 | 05/01/2022 |
Output wanted:
| contact_number | start_date | end_date | visits |
|---|---|---|---|
| 123 | 01/01/2023 | 01/01/2024 | 2 |
| 1234 | 01/01/2022 | 01/01/2023 | 1 |
I am wanting to replicate a Power BI DAX query:
visits_per_membership =
CALCULATE(
COUNTROWS('Table 2'),
FILTER(
'Table 2',
'Table 2'[contact_number] = Table 1[contact_number] && 'Table 2'[EntryDate].[Date] >= Table 1[start_date].[Date] && 'Table 2'[EntryDate].[Date] < Table 1[renewal_date].[Date])
)
Base R solution, mapping over rows of
table1and comparing totable2by id and dates:Or using
dplyr::left_join():