I have a large dataset with more than 300,000 variables and about 5,000 rows. I applied singular value decomposition with RSpectra and kept 300 singular values, reducing the data to 300 variables. Running SVM with hyperparameter tuning on these 300 variables has become incredibly slow: it took more than 17 hours on a machine with 24 GB of RAM. The same algorithm ran much faster on a document-feature matrix (dfm) with 60,000 variables and 5,000 rows.
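Roughly, the SVD step looks like this (a simplified sketch, not my exact code: dfm_mat and include_label stand in for my actual sparse document-feature matrix and response vector):

library(RSpectra)

k <- 300
svd_fit <- svds(dfm_mat, k = k)             # truncated SVD; returns d, u, v
train_svd <- svd_fit$u %*% diag(svd_fit$d)  # project the rows onto the first k components (5000 x 300)
train_svd_df <- data.frame(Include = include_label, train_svd)

The tuning step below is the part that takes more than 17 hours: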
library(doMC)
library(e1071)

registerDoMC(cores = 5)  # note: e1071::tune() does not use a foreach backend, so this registration has no effect here

start_time <- Sys.time()
set.seed(123)  # for reproducibility

svm_tuned_upsample <- tune(
  svm,
  train.x = train_svd_df[, -1],
  train.y = as.factor(train_svd_df$Include),
  kernel = "radial",
  type = "C-classification",
  parallel = TRUE,  # not an argument of tune() or svm(); silently ignored
  ranges = list(
    cost  = c(0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 6, 7, 8, 10, 15),
    gamma = c(0.0009, 0.001, 0.002, 0.003, 0.0035, 0.004, 0.0045, 0.005)
  ),
  # cross-validation settings belong in tunecontrol, not validation.x
  tunecontrol = tune.control(sampling = "cross", cross = 10)
)

Sys.time() - start_time