Compute conditional aggregates for every row of a data frame and append them as a new column


I have tried to write some code that computes a conditional aggregate for every row of a data frame and appends the result as a new column value. The example is calculating season average statistics for every team in an NBA game as of that point in the NBA season.

I can get this code to run, and it does what I want:

library(dplyr)

for (i in seq_len(nrow(game_data))) {
  # Count this team's wins in the same season before the current game date
  wins <- game_data %>%
    filter(season == game_data$season[i],
           game_date < game_data$game_date[i],
           team_id == game_data$team_id[i]) %>%
    group_by(team_id) %>%
    summarize(x = sum(wl == "W"))
  # With no earlier games, wins has no rows and w_tot stays NA for this row
  if (nrow(wins) != 0) {
    game_data[i, "w_tot"] <- wins$x
  }
}

Here game_data is a data frame. This is the code I thought I should use instead, but I can't seem to get it to run:

wins_up_to_function <- function(curr_season, curr_date, curr_teamId)
{
  game_data %>%
    filter(season == curr_season) %>%
    filter(game_date < curr_date) %>%
    filter(team_id == curr_teamId) %>%
    group_by(team_id) %>%
    summarize(x = sum(wl == "W"))
}

mapply(X = game_data, MARGIN = 1, FUN = wins_up_to_function, curr_season = df[,'season'], curr_date = df[,'game_date'], curr_teamId = df[,"team_id"])

Can you kind folks help me understand where I'm going wrong?

Answer by TeeseCaprice:

I figured it out. This works:

wins_test <- game_data %>%
  group_by(season, team_id) %>%
  arrange(game_date) %>%
  mutate(w_tot = cumsum(wl == "W")) %>%
  ungroup()

game_data_test <- game_data %>%
  left_join(
    wins_test %>% select(season, team_id, game_date, w_tot),
    by = c("season", "team_id", "game_date")
  )
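
For anyone comparing the two approaches, here is a rough sketch on toy data. The mapply() call in the question most likely fails for a few reasons: mapply() has no X or MARGIN arguments (those belong to apply()), so they get passed on to the function as unused arguments; df is not the data frame (in base R, df is a function from the stats package); and the helper returns a tibble rather than a single number. Note also that the loop in the question counts wins strictly before the current game (game_date < ...), while cumsum(wl == "W") includes the current game, so the sketch subtracts the current game's result to match the loop. The data values and the helper name wins_up_to are made up for illustration, and the sketch assumes each team plays at most one game per date.

```r
library(dplyr)

# Toy data: hypothetical values, not taken from the real data set
game_data <- data.frame(
  season    = c(2023, 2023, 2023, 2023, 2023, 2023),
  team_id   = c(1, 1, 1, 2, 2, 2),
  game_date = as.Date(c("2023-10-25", "2023-10-27", "2023-10-29",
                        "2023-10-25", "2023-10-28", "2023-10-30")),
  wl        = c("W", "L", "W", "L", "W", "W"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one number, the wins by this team
# in this season strictly before curr_date
wins_up_to <- function(curr_season, curr_date, curr_team_id) {
  sum(game_data$season == curr_season &
        game_data$team_id == curr_team_id &
        game_data$game_date < curr_date &
        game_data$wl == "W")
}

# mapply() takes the function first, then the vectors to walk in parallel
game_data$w_mapply <- mapply(
  wins_up_to,
  game_data$season,
  game_data$game_date,
  game_data$team_id
)

# Grouped running total; subtracting the current game's result
# turns "wins up to and including this game" into "wins before this game"
result <- game_data %>%
  arrange(season, team_id, game_date) %>%
  group_by(season, team_id) %>%
  mutate(w_cumsum = cumsum(wl == "W") - (wl == "W")) %>%
  ungroup()

# Both columns should agree row by row
all(result$w_mapply == result$w_cumsum)
```

The mapply() version rescans the whole data frame for every row, so it does the same amount of work as the original loop. The grouped cumsum() version makes one pass per team and season, which is usually the better fit for a full season of games. One difference from the loop: both versions here give 0 rather than NA for a team's first game of the season.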