Dplyr rolling update of a dataframe based on conditions

Question

Asked by user15791858 on 18 August 2021 at 15:59

Say I have a dataframe

  stim1  stim2  choice  outcome  feedback
1     2      1       0        0         1
2     3      2       1        1         1
3     2      3       1        0         1
4     2      3       0        1         1

My objective is to compute, at each row and for both stim1 and stim2, the cumulative mean outcome over the previous trials in which that stimulus was chosen.

choice=0 -> stim1 was chosen. 
choice=1 -> stim2 was chosen. 


As an algorithm:
a) For stim=2, find all previous trials where (stim1=2 & choice=0) | (stim2=2 & choice=1)
b) Calculate the mean outcome over all such trials

For example, at trial 4 the observed outcomes for stim1 (i.e. stimulus 2) are:
    In trial 1 it was chosen (as stim1, choice=0) and outcome=0
    In trial 2 it was chosen (as stim2, choice=1) and outcome=1
    In trial 3 it was not chosen (it was stim1, but choice=1), so it is not included in the count
    So the observed mean outcome is 1/2
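
As a quick check of that arithmetic, here is a minimal base R sketch using the four example rows. The `chosen` vector is a helper introduced here for illustration; it is not a column in the data.

```r
# Example data from the table above
dat <- data.frame(
  stim1   = c(2, 3, 2, 2),
  stim2   = c(1, 2, 3, 3),
  choice  = c(0, 1, 1, 0),
  outcome = c(0, 1, 0, 1)
)

# Stimulus actually chosen on each trial (choice 0 -> stim1, choice 1 -> stim2)
chosen <- ifelse(dat$choice == 0, dat$stim1, dat$stim2)

# Trials before trial 4 on which stimulus 2 was the chosen one
prior_hits <- seq_len(nrow(dat)) < 4 & chosen == 2

# Mean outcome over those trials (trials 1 and 2: outcomes 0 and 1)
trial4_stim1 <- mean(dat$outcome[prior_hits])
trial4_stim1
#> [1] 0.5
```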

Desired outcome

  stim1  stim2  choice  outcome  feedback  Observed_Stim1  Observed_Stim2
1     2      1       0        0         1             NaN             NaN
2     3      2       1        1         1             NaN               0
3     2      3       1        0         1             1/2             NaN
4     2      3       0        1         1             1/2               0

The inefficient loop version of what I am trying to do is

data$trial <- seq_len(nrow(data))
data$relative_stim1 <- NaN
data$relative_stim2 <- NaN
for (i in 2:nrow(data)) {
  # Earlier trials on which feedback was given
  prior <- data$feedback == 1 & data$trial < data$trial[i]
  # Earlier trials on which the current stim1 / stim2 was the chosen stimulus
  hit1 <- prior & ((data$stim1 == data$stim1[i] & data$choice == 0) |
                   (data$stim2 == data$stim1[i] & data$choice == 1))
  hit2 <- prior & ((data$stim1 == data$stim2[i] & data$choice == 0) |
                   (data$stim2 == data$stim2[i] & data$choice == 1))
  # mean() of an empty selection is NaN, matching the "no history yet" case
  data$relative_stim1[i] <- mean(data$outcome[which(hit1)])
  data$relative_stim2[i] <- mean(data$outcome[which(hit2)])
}


**Brenton M. Wiernik** · Answer 1 · 2021-08-18T19:34:46.673000

The dplyr package includes several functions for cumulative operations like this. In your case, you will want to combine those with group_by() to group by stimulus.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

dat <- tibble::tribble(
  ~stim1, ~stim2, ~choice, ~outcome, ~feedback,
  2,     1,      0,       0,           1,
  3,     2,      1,       1,           1,
  2,     3,      1,       0,           1,
  2,     3,      0,       1,           1
)

dat |> 
  group_by(stim1) |> 
  mutate(
    count_stim1 = row_number(), 
    observed_stim1 = cumsum(outcome) / row_number()
  ) |> 
  group_by(stim2) |> 
  mutate(
    count_stim2 = row_number(), 
    observed_stim2 = cumsum(outcome) / row_number()
  ) |> 
  ungroup()
#> # A tibble: 4 x 9
#>   stim1 stim2 choice outcome feedback count_stim1 observed_stim1 count_stim2
#>   <dbl> <dbl>  <dbl>   <dbl>    <dbl>       <int>          <dbl>       <int>
#> 1     2     1      0       0        1           1          0               1
#> 2     3     2      1       1        1           1          1               1
#> 3     2     3      1       0        1           2          0               1
#> 4     2     3      0       1        1           3          0.333           2
#> # ... with 1 more variable: observed_stim2 <dbl>

Created on 2021-08-18 by the reprex package (v2.0.0)
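
Note that the grouped cumulative means above are computed within each column and include the current row, so row 4 gives 0.333 for stim1 rather than the 1/2 in the question's desired output. Here is a minimal sketch of one way to get closer to that output, assuming `choice` and `feedback` are always 0 or 1. It stacks one "event" row per trial (the chosen stimulus and its outcome) with two "query" rows per trial (one per displayed stimulus), then takes cumulative sums within each stimulus. The names `events`, `queries`, `slot`, and `running` are made up for this sketch, and `tidyr` is an extra dependency.

```r
library(dplyr)

dat <- tibble::tribble(
  ~stim1, ~stim2, ~choice, ~outcome, ~feedback,
  2, 1, 0, 0, 1,
  3, 2, 1, 1, 1,
  2, 3, 1, 0, 1,
  2, 3, 0, 1, 1
) |>
  mutate(trial = row_number())

# One event per trial with feedback: the chosen stimulus and its outcome
events <- dat |>
  filter(feedback == 1) |>
  transmute(trial, stim = if_else(choice == 0, stim1, stim2), outcome, n = 1)

# Two queries per trial: "what has been observed so far for stim1 / stim2?"
queries <- bind_rows(
  transmute(dat, trial, stim = stim1, slot = "observed_stim1"),
  transmute(dat, trial, stim = stim2, slot = "observed_stim2")
) |>
  mutate(outcome = 0, n = 0)

running <- bind_rows(queries, events) |>
  # Within a trial, queries (n = 0) sort before the event (n = 1),
  # so a trial's own outcome never counts toward its own row
  arrange(stim, trial, n) |>
  group_by(stim) |>
  mutate(observed = cumsum(outcome) / cumsum(n)) |>  # 0 / 0 gives NaN
  ungroup() |>
  filter(!is.na(slot)) |>                            # keep query rows only
  arrange(trial, slot) |>
  select(trial, slot, observed) |>
  tidyr::pivot_wider(names_from = slot, values_from = observed)

result <- left_join(dat, running, by = "trial")
result
```

For the four example rows, `observed_stim1` should come out as NaN, NaN, 0.5, 0.5 and `observed_stim2` as NaN, 0, NaN, 0. Because everything is a sort plus grouped `cumsum()`, this avoids rescanning all earlier rows for every trial the way the loop does.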
