I am trying to count the occurrences of each value of a categorical variable by month and year. Below is what I get versus what I want.
library(dplyr)
library(tidyverse)
library(lubridate)
data <- tibble(Date = as.Date(c("2021-10-01", "2021-10-01", "2021-10-01", "2022-10-01")),
b = c("honey", "honey", "free", "love"))
data
# A tibble: 4 × 2
# Date b
# <date> <chr>
#1 2021-10-01 honey
#2 2021-10-01 honey
#3 2021-10-01 free
#4 2022-10-01 love
The code:
data %>% group_by(year=year(Date), month=month(Date)) %>% count()
Produces this:
# A tibble: 2 × 3
# Groups: year, month [2]
# year month n
# <dbl> <dbl> <int>
# 1 2021 10 3
# 2 2022 10 1
But this is the output I want:
# year month honey free love
# <dbl> <dbl> <int> <int> <int>
# 1 2021 10 2 1 0
# 2 2022 10 0 0 1
You can use tidyr::pivot_wider() with values_fn (to count the occurrences of each value using length) and values_fill (to replace NA values with 0).
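The original code for this step did not survive, so what follows is only a minimal sketch of how such a call might look. It assumes that b is used for both names_from and values_from, and that Date is left as the identifying column; the name wide is just an illustrative variable.

```r
library(dplyr)
library(tidyr)

data <- tibble(
  Date = as.Date(c("2021-10-01", "2021-10-01", "2021-10-01", "2022-10-01")),
  b = c("honey", "honey", "free", "love")
)

# Spread the values of b into columns. Each cell collects the matching
# entries of b, and values_fn = length turns that collection into a count.
# Combinations that never occur would be NA, so values_fill sets them to 0.
wide <- data %>%
  pivot_wider(
    names_from = b,
    values_from = b,
    values_fn = length,
    values_fill = 0
  )

wide
```

On the sample data this should give one row per Date, with honey = 2, free = 1, love = 0 for 2021-10-01 and 0, 0, 1 for 2022-10-01.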
If you want to add month in there, just use a simple mutate after pivoting.
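A possible version of that step is sketched below. The floor_date() call before pivoting is my own addition, not necessarily part of the original answer: it guards against dates that fall on different days of the same month, which would otherwise end up in separate rows. The name result is illustrative.

```r
library(dplyr)
library(tidyr)
library(lubridate)

data <- tibble(
  Date = as.Date(c("2021-10-01", "2021-10-01", "2021-10-01", "2022-10-01")),
  b = c("honey", "honey", "free", "love")
)

result <- data %>%
  # Assumption: collapse every date to the first of its month so that
  # Date identifies exactly one year-month combination.
  mutate(Date = floor_date(Date, unit = "month")) %>%
  pivot_wider(
    names_from = b,
    values_from = b,
    values_fn = length,
    values_fill = 0
  ) %>%
  # After pivoting, derive year and month from Date and drop Date itself.
  mutate(year = year(Date), month = month(Date), .before = Date) %>%
  select(-Date)

result
```

With the sample data, result should have the columns year, month, honey, free, love, with the counts 2, 1, 0 for October 2021 and 0, 0, 1 for October 2022, matching the output requested in the question.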