I'm using a dataset that has been imported from a Stata dataset.
I have 17 potential predictor variables that I want to check for multicollinearity. They are ordinal, so I've been trying to use the `polychoric()` function from the psych package. I get the following message when I run:
cor_matrix <- polychoric(df)
> You have more than 8 categories for your items, polychoric is probably not needed
Most of my variables have responses 1:5 or 1:7, but two of the 17 have 10 possible responses [1,2,3,4,5,6,7,8,98,99]. Levels 98 ("Refuse") and 99 ("Don't know") are not very important to my analysis, so I've been trying to combine them with level 1 ("DNA", i.e. "does not apply"). Here is one of the two variables:
Q. How often did the outside match the inside?
| Value | Label |
| -------- | ---------------- |
| 1 | DNA |
| 2 | Never |
| 3 | Rarely |
| 4 | Less than 1/2 |
| 5 | About 1/2 |
| 6 | More than 1/2 |
| 7 | Lots |
| 8 | Always |
| 98 | Refuse |
| 99 | Don't Know |
library(haven)    # labelled()
library(forcats)  # fct_collapse()
match <- labelled(
  c(7, 6, 4, 6, 3, 3, 2, 1, 3, 5, 99, 1, 3, 2, 2, 4, 5, 7, 8, 5, 98, 4, 6, 7, 4, 8, 4, 3, 4, 6, 7),
  c("DNA" = 1, "Never" = 2, "Rarely" = 3, "Less than half" = 4, "About half" = 5,
    "More than half" = 6, "Lots" = 7, "Always" = 8, "Refuse" = 98, "Don't know" = 99)
)
newmatch <- fct_collapse(match,
"Not known" = c("1", "98", "99"),
"Never" = "2",
"Rarely" = "3",
"Less than half" = "4",
"About half" = "5",
"More than half" = "6",
"Lots" = "7",
"Always" = "8"
)
The error message I get is:
.f must be a factor or character vector, not a <haven_labelled/vctrs_vctr/double> object.
The other variables in the data set are labelled vectors, so I want these two to still be labelled vectors once I've collapsed the levels.
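
For reference, here is a minimal sketch of the result I'm after, which might be reachable without `fct_collapse()` at all: recode the underlying numeric codes and rebuild the labelled vector. It assumes that dropping to plain numeric codes and re-applying a new label set with `haven::labelled()` is acceptable. The name `codes` and the "Not known" label are only illustrative choices.

```r
library(haven)  # labelled(), is.labelled()

match <- labelled(
  c(7, 6, 4, 6, 3, 3, 2, 1, 3, 5, 99, 1, 3, 2, 2, 4, 5, 7, 8, 5, 98, 4, 6, 7, 4, 8, 4, 3, 4, 6, 7),
  c("DNA" = 1, "Never" = 2, "Rarely" = 3, "Less than half" = 4, "About half" = 5,
    "More than half" = 6, "Lots" = 7, "Always" = 8, "Refuse" = 98, "Don't know" = 99)
)

# Strip the labelled class and attributes to get the bare numeric codes
codes <- as.numeric(unclass(match))

# Fold "Refuse" (98) and "Don't know" (99) into code 1
codes[codes %in% c(98, 99)] <- 1

# Rebuild a labelled vector with the reduced label set (8 categories)
newmatch <- labelled(
  codes,
  c("Not known" = 1, "Never" = 2, "Rarely" = 3, "Less than half" = 4,
    "About half" = 5, "More than half" = 6, "Lots" = 7, "Always" = 8)
)
```

If that is a sensible route, `newmatch` stays a labelled vector like the other 15 variables and has only 8 distinct values. I haven't confirmed whether that alone is enough to stop the `polychoric()` message.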