I have an example table on which I would like to run a kknn classification. The variable V4 is the response, and I want the classifier to tell whether a new data point is classified as 0 or 1. (The actual data has 12 columns, with the 12th column as the response, but I have simplified the example.)
library(kknn)
data <- data.frame(
V1=c(1.2, 2.5, 3.1, 4.8, 5.2),
V2=c(0.7, 1.8, 2.3, 3.9, 4.1),
V3=c(2.3, 3.7, 1.8, 4.2, 5.5),
V4= c(0, 1, 0, 1, 0)
)
Now I want to build a kknn classification via LOOCV using a for loop. Let's assume k = 3.
for (i in 1:nrow(data)) {
train_data <- data[-i, 1:3]
train_data_response <- data.frame(data[-i, 4])
colnames(train_data_response) <- "Response"
test_set <- data[i, 3]
model <- kknn(formula=train_data_response ~ ., data.frame(train_data),
data.frame(test_set), k=3, scale=TRUE)
}
Now I get this error that says:
Error in model.frame.default(formula, data = train) :
invalid type (list) for variable 'train_data_response'
Is there any way to solve this error? I thought kknn accepts matrices or data frames. My training and testing data are indeed data frames, so what gives?
Also, am I doing the LOOCV correctly?
We want to leave one row out of train_data to check whether our results are driven by one specific row, and we won't touch test_set. Both are created even before doing the kknn without LOOCV, so we don't need the raw data anymore.
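On the error itself: model.frame looks up every name in the formula as a variable, and train_data_response is a whole data frame (a list), hence "invalid type (list)". Keeping the response as a column of the training data and naming that column in the formula avoids this. A minimal sketch of the plain kknn call without LOOCV, assuming made-up data in the shape of the question's example (the values, the 20/10 split and the name dat are illustrative only):

```r
library(kknn)

# Hypothetical data: three predictors and a 0/1 response, values are made up
set.seed(42)
resp <- rep(c(0, 1), length.out = 30)
dat <- data.frame(
  V1 = rnorm(30, mean = resp),
  V2 = rnorm(30, mean = resp),
  V3 = rnorm(30),
  V4 = resp
)

# Split once, up front; test_set is not part of any leave-one-out step
train_data <- dat[1:20, ]
test_set   <- dat[21:30, ]

# The response is a column of `train`, referred to by name in the formula;
# `test` carries the same predictor columns as `train`
fit <- kknn(as.factor(V4) ~ ., train = train_data, test = test_set,
            k = 3, scale = TRUE)
fit$fitted.values
```

Note that the test data needs all predictor columns (V1 to V3 here), not a single column as in data[i, 3].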
Say we want the result as a matrix loo with nrow(loo) == nrow(test_set) and ncol(loo) == nrow(train_data). We initialize it and then fill it, leaving one row out in each kknn call.
Note that it is better to wrap the response in as.factor in the formula, which adds safety if it is numeric-looking, as in the OP. The fit$fitted.values will thus also come back as a factor, which we want as.character in the matrix, though, to prevent coercing the factors to integers.
Now we can do many things with the loo result, e.g. look at which left-out observation might influence the model prediction, which is the ninth row of train_data in this case.
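A minimal sketch of the initialize-and-fill loop described above, again on made-up data (dat, the split, and the names full and changed are assumptions for illustration, so the flagged rows will differ from the ninth row mentioned above). Comparing each column against the predictions of the full model is only one possible way to spot an influential row:

```r
library(kknn)

# Hypothetical data, values are made up
set.seed(1)
resp <- rep(c(0, 1), length.out = 30)
dat <- data.frame(
  V1 = rnorm(30, mean = resp),
  V2 = rnorm(30, mean = resp),
  V3 = rnorm(30),
  V4 = resp
)
train_data <- dat[1:20, ]
test_set   <- dat[21:30, ]

# One row per test observation, one column per left-out training row
loo <- matrix(NA_character_, nrow = nrow(test_set), ncol = nrow(train_data))

for (i in seq_len(nrow(train_data))) {
  fit <- kknn(as.factor(V4) ~ ., train = train_data[-i, ], test = test_set,
              k = 3, scale = TRUE)
  # as.character keeps the class labels instead of the factor's integer codes
  loo[, i] <- as.character(fit$fitted.values)
}

# Reference predictions from the model fitted on all training rows
full <- as.character(
  kknn(as.factor(V4) ~ ., train = train_data, test = test_set,
       k = 3, scale = TRUE)$fitted.values
)

# Number of test predictions that change when training row i is left out
changed <- colSums(loo != full)
which(changed > 0)
```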
Or calculate the ratio where all classifications were predicted correctly.
Data:
Extended a little to have more observations.
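For the ratio mentioned above, one possible reading is the share of leave-one-out fits in which every test observation was classified correctly. A small sketch under that assumption, using a hand-written stand-in for loo and a hypothetical vector truth of true test labels (neither comes from the data above):

```r
# Hypothetical stand-ins: 4 test observations (rows), 5 leave-one-out fits (columns)
truth <- c("0", "1", "1", "0")
loo <- cbind(
  c("0", "1", "1", "0"),
  c("0", "1", "1", "0"),
  c("0", "0", "1", "0"),
  c("0", "1", "1", "0"),
  c("1", "1", "1", "0")
)

# truth is recycled down each column, so every fit is compared row by row
all_correct <- apply(loo == truth, 2, all)

mean(all_correct)       # 0.6: three of the five fits got every test row right
which(!all_correct)     # 3 5: fits with at least one misclassification
colMeans(loo == truth)  # accuracy of each individual fit
```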