Merge wide mids with long df, convert back to mids

Question

Asked by jrcalabrese on 23 January 2024 at 18:02

I am trying to merge a wide-format mids object (as a result of multiple imputation with the mice package) with a long-format dataframe, which contains a time variable. Both dataframes contain the same IDs (id). However, I encounter an issue with rownames when trying to merge.

set.seed(123)
library(tidyverse)
library(mice)

wide <- data.frame(
  id = c(1, 2, 3, 4, 5),
  x = c(1.5, 3, NA, 4.2, 5.8),
  y = c(9.3, NA, 31.7, 41.1, 52.6),
  z = c(101, 198, 305, NA, 499)
)

long <- data.frame(
  id = rep(1:5, each = 2),
  time = rep(1:2, times = 5),
  a = c(10, 15, 20, 25, 50, 30, 35, 40, 30, 45),
  b = c(100, 150, 200, 250, 200, 300, 350, 400, 100, 450)
)

wide_mids <- mice(data = wide, 
                  m = 5, 
                  printFlag = FALSE)
#> Warning: Number of logged events: 2

completed_wide <- mice::complete(data = wide_mids,
                                 action = "long",
                                 include = TRUE)

merged <- merge(completed_wide, long, by = "id")
merged_mids <- as.mids(merged)
#> Warning: non-unique values when setting 'row.names': '1', '2', '3', '4', '5'
#> Error in `.rowNamesDF<-`(x, value = value): duplicate 'row.names' are not allowed

Trying different kinds of merging like full_join or left_join from dplyr still results in the same error message. Any help is appreciated.

**Mark** · Answer 1 · 2024-01-25T03:52:35.160000

I believe the issue is with the .id argument of as.mids():

.id
An optional column number or column name in long, indicating the subject identification. If not specified, then the function searches for a variable named ".id". If this variable is found, the values in the column will define the row names in the data element of the resulting mids object.

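So as.mids() is using the .id column, which after the merge has 12 rows of 1s, 12 rows of 2s, etc. (6 data sets, i.e. the original plus 5 imputations, times 2 time points per id). Those repeated values cannot be used as row names, hence the error.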
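To see the duplication directly, here is a minimal sketch that rebuilds the objects from the question and tabulates .id after the merge. It uses only base R and mice; the object names are the ones from the question, and `dup_in_original` is just an illustrative name.

```r
library(mice)

set.seed(123)

# Same toy data as in the question
wide <- data.frame(
  id = c(1, 2, 3, 4, 5),
  x = c(1.5, 3, NA, 4.2, 5.8),
  y = c(9.3, NA, 31.7, 41.1, 52.6),
  z = c(101, 198, 305, NA, 499)
)
long <- data.frame(
  id = rep(1:5, each = 2),
  time = rep(1:2, times = 5),
  a = c(10, 15, 20, 25, 50, 30, 35, 40, 30, 45),
  b = c(100, 150, 200, 250, 200, 300, 350, 400, 100, 450)
)

wide_mids <- mice(wide, m = 5, printFlag = FALSE)
completed_wide <- complete(wide_mids, action = "long", include = TRUE)
merged <- merge(completed_wide, long, by = "id")

# Each .id now appears once per (imputation, time) pair:
# 6 data sets x 2 time points = 12 rows per .id, 60 rows in total.
table(merged$.id)
nrow(merged)

# Even inside the original data (.imp == 0), .id is repeated,
# so it cannot serve as a set of unique row names.
dup_in_original <- anyDuplicated(merged$.id[merged$.imp == 0]) > 0
dup_in_original
```

If `dup_in_original` is TRUE, any column passed as .id has to be replaced by one that is unique within each imputation.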

One way of getting around this problem is to make a new id column and then use that:

# Reusing L Tyrone's code
long |>
  left_join(completed_wide,
            by = "id",
            relationship = "many-to-many") |>
  mutate(newid = row_number()) |>
  as.mids(.id = "newid")

Output:

Class: mids
Number of multiple imputations:  5 
Imputation methods:
   id  time     a     b   .id     x     y     z 
   ""    ""    ""    ""    "" "pmm" "pmm" "pmm" 
PredictorMatrix:
     id time a b .id x y z
id    0    1 1 1   1 1 1 1
time  1    0 1 1   1 1 1 1
a     1    1 0 1   1 1 1 1
b     1    1 1 0   1 1 1 1
.id   1    1 1 1   0 1 1 1
x     1    1 1 1   1 0 1 1
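One thing that may be worth checking after the conversion: the left_join() above returns rows grouped by the rows of long, so the imputations are interleaved rather than stacked. My assumption (not verified against the mice source) is that as.mids() is safest with input laid out like the output of complete(..., action = "long", include = TRUE), i.e. one contiguous block per .imp, each in the same row order. The sketch below sorts the merged data that way before converting and then does a round-trip check; `merged_long`, `back` and `expected` are illustrative names.

```r
library(dplyr)
library(mice)

set.seed(123)

# Same toy data as in the question
wide <- data.frame(
  id = c(1, 2, 3, 4, 5),
  x = c(1.5, 3, NA, 4.2, 5.8),
  y = c(9.3, NA, 31.7, 41.1, 52.6),
  z = c(101, 198, 305, NA, 499)
)
long <- data.frame(
  id = rep(1:5, each = 2),
  time = rep(1:2, times = 5),
  a = c(10, 15, 20, 25, 50, 30, 35, 40, 30, 45),
  b = c(100, 150, 200, 250, 200, 300, 350, 400, 100, 450)
)

wide_mids <- mice(wide, m = 5, printFlag = FALSE)
completed_wide <- complete(wide_mids, action = "long", include = TRUE)

merged_long <- long |>
  left_join(completed_wide, by = "id", relationship = "many-to-many") |>
  arrange(.imp, id, time) |>    # stack imputations: block 0, then 1, ..., 5
  mutate(newid = row_number())  # unique row id across all 60 rows

merged_mids <- as.mids(merged_long, .id = "newid")

# Round-trip check: the first completed data set should reproduce
# the rows of the merged data that belong to imputation 1.
back <- complete(merged_mids, action = 1)
expected <- merged_long[merged_long$.imp == 1, ]
all.equal(back$x, expected$x)
all.equal(back$y, expected$y)
all.equal(back$z, expected$z)
```

If any of the comparisons does not return TRUE, the imputed values did not end up in the rows they belong to, and the row ordering passed to as.mids() is the first thing to look at.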
