I recently started using shinyTree for one of my applications and I'm having trouble finding an efficient way to turn my directory into a list. My assumption is that the easiest way is to use something like Rcpp to take advantage of C++'s speed, but I'm not married to that idea. If that is the route to take, however, my skill set in that arena is virtually zero, so I'm hoping someone might be able to provide a couple of snippets of code to get me started in the right direction.
Here is the code I'm currently using to achieve what I'm trying to do:
    library(stringr)  # provides str_split()

    create_directory_tree = function(root) {
      tree = list()
      file_lookup = data.frame(id=character(0), file_path=character(0), stringsAsFactors=FALSE)
      files = list.files(root, all.files=FALSE, recursive=TRUE, include.dirs=TRUE)

      # Add one file or folder (path relative to root) to the nested list
      walk_directory = function(tree, path) {
        fp = file.path(root, path)
        is_dir = file.info(fp)$isdir
        if (is.null(is_dir) || is.na(is_dir)) {
          print(fp)
          return(NULL)
        }
        # Strip quotes so the path can be pasted into the expression below
        path = gsub("'|\"", "", path)
        folders = str_split(path, "/")[[1]]
        if (length(folders) == 0 || any(is.na(folders))) {
          print(paste("Failed:", fp))
          return(NULL)
        }
        # Build an assignment such as tree$'a'$'b' = ... and evaluate it
        if (is_dir) {
          txt = paste("tree", paste("$'", folders, "'", sep="", collapse=""), " = numeric(0)", sep="")
        } else {
          txt = paste("tree", paste("$'", folders, "'", sep="", collapse=""), " = structure('', sticon='file')", sep="")
        }
        eval(parse(text = txt))
        return(tree)
      }

      for (i in seq_along(files)) {
        tmp = data.frame(id=paste0("j1_", i), file_path=file.path(root, files[i]), stringsAsFactors=FALSE)
        file_lookup = rbind(file_lookup, tmp)
        tree = walk_directory(tree, files[i])
        save(tree, file_lookup, file="www/dir_tree.Rdata")
      }
    }
This is taking an absurdly long time and I'm hoping there is something better. Thanks in advance.
The issue is that you are growing the data.frame with rbind inside the for loop. Chances are the directory at root has lots and lots of content, so the slowdown comes from constantly copying and recreating the data.frame. You already know the number of files (length(files)), so preallocate the data.frame with that many rows before the loop and fill in row i on each iteration.

Also, you are saving the progress of the objects on every pass through the for loop, which is an I/O bottleneck. I would move the save(tree, file_lookup, file="www/dir_tree.Rdata") call outside the loop.
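As a rough sketch of those two changes, the lookup table could be built along these lines. The helper names build_file_lookup and build_file_lookup_vec, the temporary demo directory, and the output path are made up for illustration and are not part of the original code. Since paste0 and file.path are both vectorized, the second helper skips the loop entirely.

```r
# Hypothetical helper: preallocate every row once, then fill them in place.
build_file_lookup = function(root, files) {
  n = length(files)
  file_lookup = data.frame(
    id = character(n),
    file_path = character(n),
    stringsAsFactors = FALSE
  )
  for (i in seq_along(files)) {
    file_lookup$id[i] = paste0("j1_", i)
    file_lookup$file_path[i] = file.path(root, files[i])
  }
  file_lookup
}

# Hypothetical alternative: no loop at all, using vectorized paste0/file.path.
build_file_lookup_vec = function(root, files) {
  data.frame(
    id = paste0("j1_", seq_along(files)),
    file_path = file.path(root, files),
    stringsAsFactors = FALSE
  )
}

# Demo on a small throwaway directory (an assumption, not the real app data).
root = tempfile("tree_demo_")
dir.create(file.path(root, "sub"), recursive = TRUE)
file.create(file.path(root, "a.txt"), file.path(root, "sub", "b.txt"))

files = list.files(root, recursive = TRUE, include.dirs = TRUE)
file_lookup = build_file_lookup(root, files)
file_lookup_vec = build_file_lookup_vec(root, files)

# Save once, after all the work is done, rather than on every iteration.
out_file = file.path(tempdir(), "dir_tree.Rdata")
save(file_lookup, file = out_file)
```

In the original function the same idea would apply: finish building both tree and file_lookup first, then call save a single time after the loop.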
Lastly, there are several posts on the Rcpp Gallery that would make ideal tutorials.