Calculate mean and add in new row in R but to reflect in all the entries of a particular column

516 Views Asked by At

I have the dataset like below,and I read it as a csv file and load the dataframe as df

Name   Value1 Value1    
A       2       5       
A       1       5       
B       3       4       
B       1       4       
C       0       3       
C       5       3       
C       1       3       

If I do the following command in R,

out<-ddply(df, .(Name), summarize,  Value1=mean(Value1),Value2=mean(Value2))

I am getting an output like this,

Name   Value1_mean   Value2_mean    
A       1.5             5       
B       2               4       
C       2               3       

But need to find the mean for Value2 and Value1 and store the result in a separate column say value1_mean and value2_mean like this for every entry,

Name   Value1 Value1   value1_mean  value2_mean
A       2       5       1.5           5
A       1       5       1.5           5
B       3       4       2             4
B       1       4       2             4
C       0       3       2             3
C       5       3       2             3
C       1       3       2             3

How can I get this above output?

1

There are 1 best solutions below

1
On

We can do this efficiently with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'Name', specify the columns to take the mean with .SDcols, loop through the subset of data.table (.SD), get the mean and assign (:=) it to new columns.

library(data.table)
setDT(df)[, paste0(names(df)[2:3], "_mean") := lapply(.SD, mean), by = Name, .SDcols = 2:3]
df
#    Name Value1 Value2 Value1_mean Value2_mean
#1:    A      2      5         1.5           5
#2:    A      1      5         1.5           5
#3:    B      3      4         2.0           4
#4:    B      1      4         2.0           4
#5:    C      0      3         2.0           3
#6:    C      5      3         2.0           3
#7:    C      1      3         2.0           3

Or with dplyr, we use mutate_each

library(dplyr)
df %>%
   group_by(Name) %>%
   mutate_each(funs(Mean = mean)) 
#    Name Value1 Value2 Value1_Mean Value2_Mean
#  <chr>  <int>  <int>       <dbl>       <dbl>
#1     A      2      5         1.5           5
#2     A      1      5         1.5           5
#3     B      3      4         2.0           4
#4     B      1      4         2.0           4
#5     C      0      3         2.0           3
#6     C      5      3         2.0           3
#7     C      1      3         2.0           3

data

df <- structure(list(Name = c("A", "A", "B", "B", "C", "C", "C"), Value1 = c(2L, 
1L, 3L, 1L, 0L, 5L, 1L), Value2 = c(5L, 5L, 4L, 4L, 3L, 3L, 3L
)), .Names = c("Name", "Value1", "Value2"), class = "data.frame", 
row.names = c(NA, -7L))