For example I have some data
| x | y | Owner | H |
|---|---|---|---|
| 1 | 2 | A | 1 |
| 2 | 5 | A | NA |
| 2 | 2 | A | 5 |
| 2 | 5 | A | 4 |
| 2 | 5 | B | 4 |
| 2 | 5 | C | 3 |
| 3 | 3 | C | 6 |
| 3 | 3 | C | NA |
| 3 | 3 | D | NA |
I want to replace the Nans with the most common/mode when you group by (x,y,Owner) to get
| x | y | Owner | H |
|---|---|---|---|
| 1 | 2 | A | 1 |
| 2 | 5 | A | 4 |
| 2 | 2 | A | 5 |
| 2 | 5 | A | 4 |
| 2 | 5 | B | 4 |
| 2 | 5 | C | 3 |
| 3 | 3 | C | 6 |
| 3 | 3 | C | 6 |
| 3 | 3 | D | NA |
I tried this but I think its replacing everything when I only want to replace the Nas new_data <- my_data %>% group_by('Owner', 'x', 'y') %>% mutate(H=mean(H, na.rm=TRUE))
I've seen this: but it doesn't workhttps://stackoverflow.com/questions/45741879/can-i-replace-nans-with-the-mode-of-a-column-in-a-grouped-data-frame
new_data <- my_data %>% group_by('Owner', 'x', 'y') %>% mutate(H=mean(H, na.rm=TRUE)
But this replaces all the values of H with the mean. Also sometimes there will be no rows which match ie the group will only have 1 row and that has an Na (or maybe that its more than one but they just have NAs), is there a way to just keep the NA?
(I want it to find all the rows which match theOwner, x and y of the row with the NA and take the mode of their H values)
Use the
Modefunction defined here and currently included in thegtoolspackage