Matching unique filenames to a dataframe column and then moving these files


I have a large number of files that have unique names and I need these files to be moved to a new file location based on whether they match the name in a dataframe.

For example, the file names would look like:

/users/documents/data/181009_153350-de-7-135_1.csv
/users/documents/data/123439_152450-de-5-134_1.csv 
/users/documents/data/181249_134329-de-7-131_1.csv 

The dataframe contains a column that has a portion of the file name (i.e. "181009_153350"). If "181009_153350" from the dataframe matches part of a filename, that file should be moved to a new location. The numbers vary widely and must match an entry in the dataframe for the file to be moved.

The dataframe would look like: [image: dataframe]

I was able to figure out how to move files when their names share a pattern, but I couldn't figure out how to do this when there is no fixed pattern and the names must instead be matched against the dataframe.

I currently have this code to move the file based on the pattern:

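File_move_funct <- function(Curr, New){
  library(filesstrings)  # provides move_files()
  current.folder <- Curr
  new.folder <- New
  # Collect every file under current.folder whose name contains "-de-"
  list.of.files <- list.files(current.folder, pattern = "-de-", full.names = TRUE, recursive = TRUE)
  # Move the files without overwriting anything already in new.folder
  move_files(list.of.files, new.folder, overwrite = FALSE)
}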

I began by saving the portion of the filename that I wanted as its own dataframe, so that I could match the two dataframes. I kept the full pathname of each file so that I could move the file later.

files: /users/documents/data/181009_153350-de-7-135_1.csv <- (just an example, my filenames are longer)

list <- as.data.frame(substr(list.of.files, 91,103))
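Fixed character positions such as 91 and 103 depend on the length of the directory path, so they break as soon as the files live somewhere else. A minimal sketch of a path-independent alternative, assuming every filename carries the wanted portion before the literal "-de-" (true of the three example names in this question, but an assumption about the real, longer names):

```r
# Example paths taken from the question
list.of.files <- c(
  "/users/documents/data/181009_153350-de-7-135_1.csv",
  "/users/documents/data/123439_152450-de-5-134_1.csv",
  "/users/documents/data/181249_134329-de-7-131_1.csv"
)

# basename() drops the directory part of each path;
# sub() then removes everything from "-de-" to the end of the name
file.keys <- sub("-de-.*$", "", basename(list.of.files))
file.keys
# [1] "181009_153350" "123439_152450" "181249_134329"
```

Because list.of.files keeps the full paths and file.keys has the same order, the two vectors can be used side by side later when deciding which files to move.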

Then, I tried to match the values of one dataframe to the other, but couldn't get it to work.

compare2 <- function(df1, df2){
  n <- nrow(df1); p <- ncol(df1)
  result <- matrix(NA, nrow = n, ncol = p)
  for(j in seq_len(p)){
    for(i in seq_len(n)){
      # Compares cell [i,j] of df1 only with the same cell [i,j] of df2
      result[i,j] <- df1[i,j] == df2[i,j]
    }
  }
  print(result)
}
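The function above compares row i of one dataframe only with row i of the other, so it reports a match only when both are in the same order and have the same number of rows. For a membership question ("does this value appear anywhere in that column?"), base R's %in% operator may be a better fit. A small sketch, where the dataframe df and its column name id are hypothetical stand-ins for the real data:

```r
# Hypothetical stand-in for the dataframe; the column name "id" is an assumption
df <- data.frame(id = c("181009_153350", "181249_134329"),
                 stringsAsFactors = FALSE)

# Portions of the three example filenames from the question
file.keys <- c("181009_153350", "123439_152450", "181249_134329")

# TRUE wherever a key appears anywhere in df$id, regardless of row order
keep <- file.keys %in% df$id
keep
# [1]  TRUE FALSE  TRUE
```

The result is a logical vector with one entry per file, which can be used directly to subset a vector of file paths.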

Overall, I just want to move a file if a specific portion of its filename matches the data within the column of a dataframe.

I don't think I am doing this efficiently or correctly. I have been trying for a while and don't seem to be progressing.
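For reference, the goal described above could be sketched roughly as follows. This is not a tested solution for the real data: the function name move_matching_files and the ids argument are made up for illustration, the key is assumed to be everything before "-de-" in the filename, and base R's file.rename is used in place of filesstrings::move_files so the sketch runs without extra packages.

```r
# Hypothetical helper: move files whose leading name portion appears in ids
move_matching_files <- function(curr, new, ids) {
  files <- list.files(curr, full.names = TRUE, recursive = TRUE)
  # Assumption: the key is the part of the filename before "-de-"
  keys <- sub("-de-.*$", "", basename(files))
  to.move <- files[keys %in% ids]
  dir.create(new, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(new, basename(to.move))
  # Skip destinations that already exist, similar in spirit to overwrite = FALSE
  fresh <- !file.exists(dest)
  ok <- file.rename(to.move[fresh], dest[fresh])
  invisible(dest[fresh][ok])
}

# Small demonstration in temporary folders with empty files
src <- tempfile("data_src"); dst <- tempfile("data_dst")
dir.create(src)
file.create(file.path(src, c("181009_153350-de-7-135_1.csv",
                             "123439_152450-de-5-134_1.csv")))
moved <- move_matching_files(src, dst, ids = "181009_153350")
basename(moved)
```

With a real dataframe, ids would be the relevant column, for example df$id if the column were named id. Note that file.rename can fail when the source and destination are on different file systems, in which case file.copy followed by file.remove, or move_files from filesstrings, would be alternatives.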
