I have a dataset like the one below. I read it from a CSV file and loaded it into a data frame named df.
Name Value1 Value2
A 2 5
A 1 5
B 3 4
B 1 4
C 0 3
C 5 3
C 1 3
If I do the following command in R,
out<-ddply(df, .(Name), summarize, Value1=mean(Value1),Value2=mean(Value2))
I am getting an output like this,
Name Value1_mean Value2_mean
A 1.5 5
B 2 4
C 2 3
But I need to find the mean of Value1 and Value2 within each Name and store the results in separate columns, say value1_mean and value2_mean, repeated for every row, like this:
Name Value1 Value2 value1_mean value2_mean
A 2 5 1.5 5
A 1 5 1.5 5
B 3 4 2 4
B 1 4 2 4
C 0 3 2 3
C 5 3 2 3
C 1 3 2 3
How can I get this above output?
We can do this efficiently with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'Name', specify the columns to take the mean with .SDcols, loop through the subset of data.table (.SD), get the mean and assign (:=) it to new columns. Or with dplyr, we use mutate_each.
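The data.table steps described above can be sketched as follows. This is a minimal example, not necessarily the exact code of the original answer: the data frame is rebuilt by hand from the question's table (assuming the second value column is Value2), and the names cols and newcols are introduced only for illustration.

```r
library(data.table)

# Sample data rebuilt from the question (assumption: second column is Value2)
df <- data.frame(
  Name   = c("A", "A", "B", "B", "C", "C", "C"),
  Value1 = c(2, 1, 3, 1, 0, 5, 1),
  Value2 = c(5, 5, 4, 4, 3, 3, 3)
)

cols    <- c("Value1", "Value2")      # columns to average
newcols <- paste0(cols, "_mean")      # names of the columns to create

setDT(df)                             # convert to data.table by reference

# For each Name, take the mean of every column in .SDcols and
# assign the results to the new columns; rows keep their original order
df[, (newcols) := lapply(.SD, mean), by = Name, .SDcols = cols]

print(df)
```

Because := adds the columns by reference, no reassignment is needed, and each group's mean is recycled across all of that group's rows.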
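For the dplyr route, note that mutate_each has since been deprecated. A rough equivalent, assuming dplyr 1.0 or later, uses mutate() with across(); this is a sketch of the same idea rather than the answer's original code, and the sample data is again rebuilt from the question.

```r
library(dplyr)

# Sample data rebuilt from the question (assumption: second column is Value2)
df <- data.frame(
  Name   = c("A", "A", "B", "B", "C", "C", "C"),
  Value1 = c(2, 1, 3, 1, 0, 5, 1),
  Value2 = c(5, 5, 4, 4, 3, 3, 3)
)

out <- df %>%
  group_by(Name) %>%
  # mutate() keeps every row; across() applies mean to both columns and
  # .names controls the names of the new columns
  mutate(across(c(Value1, Value2), mean, .names = "{.col}_mean")) %>%
  ungroup()

print(out)
```

The key difference from the ddply call in the question is mutate() versus summarize: summarize collapses each group to one row, while mutate keeps all rows and repeats the group mean on each.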