R: If-else/mutate(?) statement where a new column value is added to a value from another column and row based on a condition

Using R, I get lost pretty quickly in conditional/if-else type work. Usually I'm able to problem-solve with Stack Overflow threads, but I haven't figured out how to search for this specific problem. Let's say I have the following DF:

Column A   Column B    Column C
Panda      Variant 1   10.0
Monkey     Variant 2    5.0
Monkey     Variant 1    7.0
Panda      Variant 3    8.0

I want to make a new column, Column D. If Column B == any variant other than Variant 2, then I want the value of Column D == Column C. If Column B == "Variant 2", then I want the value of Column D == (Column C + Column C when Column A is the same but Column B is Variant 1).

So, for the above table the outcome would be:

Column A   Column B    Column C   Column D
Panda      Variant 1   10.0       10.0
Monkey     Variant 2    5.0       12.0
Monkey     Variant 1    7.0        7.0
Panda      Variant 3    8.0        8.0

I've tried a few different if/else statements to try and get the ball rolling, but none have even come close. Any solution would be greatly appreciated!

There are 2 best solutions below

LMc On

Here is an option:

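library(dplyr)

# Requires R >= 4.2 for the `_` placeholder of the native pipe
df |>
  group_by(`Column A`) |>
  # keep Variant 1 and Variant 2 rows, only in groups that contain a Variant 2
  filter(`Column B` %in% c("Variant 2", "Variant 1"), any(`Column B` == "Variant 2")) |>
  # one row per group: the summed value, keyed as the Variant 2 row
  summarize(`Column C` = sum(`Column C`), `Column B` = "Variant 2") |>
  # write the sums back into the matching Variant 2 rows of df
  rows_update(df, y = _, by = c("Column A", "Column B"))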

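Basically, within each Column A group that contains a "Variant 2" row, you pull out the "Variant 1" and "Variant 2" rows, sum their Column C values, and then update only the "Variant 2" rows (the others remain the same). Note that this writes the sum into Column C itself rather than into a new Column D.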
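The code above updates Column C in place. If a separate Column D is wanted, as in the question, one possible variation is to copy Column C into Column D first and update that instead. This is only a sketch: the sample data is rebuilt from the question's table (column names with spaces are an assumption), and `df_d`, `sums`, and `result` are illustrative names, not part of the original answer.

```r
library(dplyr)

# Sample data from the question (names with spaces are an assumption)
df <- tibble(
  `Column A` = c("Panda", "Monkey", "Monkey", "Panda"),
  `Column B` = c("Variant 1", "Variant 2", "Variant 1", "Variant 3"),
  `Column C` = c(10, 5, 7, 8)
)

# Start with Column D as a copy of Column C; only Variant 2 rows change below
df_d <- df |> mutate(`Column D` = `Column C`)

# Per-group sums, only for groups that contain a Variant 2 row
sums <- df |>
  group_by(`Column A`) |>
  filter(`Column B` %in% c("Variant 1", "Variant 2"),
         any(`Column B` == "Variant 2")) |>
  summarize(`Column D` = sum(`Column C`), `Column B` = "Variant 2")

# Overwrite Column D in the matching Variant 2 rows, leave the rest untouched
result <- rows_update(df_d, sums, by = c("Column A", "Column B"))
```

Splitting the pipeline into named steps also avoids the `_` placeholder, so this version should not depend on R 4.2 specifically, only on the native pipe.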

Ben On

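You can also try group_modify from dplyr (loaded with tidyverse). Group your data.frame by ColumnA, then compute ColumnD conditionally inside each group, so that only ColumnC values from the same ColumnA are considered. Note that the column names and variant labels in this example are written without spaces.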

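library(tidyverse)

df |>
  group_by(ColumnA) |>
  # .x is the subset of rows belonging to one ColumnA group
  group_modify(~ {
    .x |>
      mutate(ColumnD = ifelse(
        ColumnB != "Variant2",
        ColumnC,                                  # other variants: copy ColumnC
        ColumnC + ColumnC[ColumnB == "Variant1"]  # Variant2: add the group's Variant1 value
      ))
  })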

Output

  ColumnA ColumnB  ColumnC ColumnD
  <chr>   <chr>      <dbl>   <dbl>
1 Monkey  Variant2       5      12
2 Monkey  Variant1       7       7
3 Panda   Variant1      10      10
4 Panda   Variant3       8       8
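Since mutate already works per group after group_by, the same idea can probably be written without group_modify. The following is a minimal sketch, not the answer's original code: the data frame is rebuilt here with the space-free names used above (an assumption about how df was created), and `result` is an illustrative name. Wrapping the Variant1 lookup in sum() is a deliberate change, so the code does not rely on each group having exactly one Variant1 row.

```r
library(dplyr)

# Sample data with the space-free names used in this answer (assumed construction)
df <- tibble(
  ColumnA = c("Panda", "Monkey", "Monkey", "Panda"),
  ColumnB = c("Variant1", "Variant2", "Variant1", "Variant3"),
  ColumnC = c(10, 5, 7, 8)
)

result <- df |>
  group_by(ColumnA) |>
  mutate(
    # sum() over an empty selection is 0, so a group without a
    # Variant1 row simply keeps ColumnC for its Variant2 row
    ColumnD = if_else(
      ColumnB == "Variant2",
      ColumnC + sum(ColumnC[ColumnB == "Variant1"]),
      ColumnC
    )
  ) |>
  ungroup()
```

Unlike group_modify, a grouped mutate keeps the rows in their original order, so the result lines up with the table in the question.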