I'm using the following credentials to authenticate against Blob storage in R:
library(AzureStor)
account_endpoint <- "https://mycorporation.blob.core.windows.net"
account_key <- "mykey"
container_name <- "mycorporation"
bl_endp_key <- storage_endpoint(account_endpoint, key = account_key)
cont <- storage_container(bl_endp_key, container_name)
w_con <- textConnection("foo", "w")
I need to read many large CSV files located in mycorporation/my_folder with sparklyr, without downloading them first and reading them one by one.
What is the best way to do it?
If you only need to access a small number of files, the Blob storage path (WASBS) is a simple and direct way to read them from Blob storage. To access a large number of files or more complex data sets, use a mount point.
Choose either the Blob storage path or a mount point depending on your requirements.
Note: R cannot perform the actual mounting. The workaround is to mount the container using another language such as Python, and then read the files with the sparklyr library, as shown below.
Mount using Python:
R notebook with the sparklyr library:
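As a rough sketch of the mount-point route in R: the code below assumes the container has already been mounted from a Python cell, that the notebook runs on Databricks (hence `method = "databricks"`), and that the mount point is named `/mnt/mycorporation`. The mount name, the table name `my_folder_csv`, and both helper functions are hypothetical and only serve as illustration.

```r
# Build the DBFS glob that matches every CSV in a folder under a mount point.
# The default mount point used further down is an assumed, illustrative name.
mount_csv_path <- function(mount_point, folder) {
  paste0("dbfs:", sub("/+$", "", mount_point), "/", folder, "/*.csv")
}

# Read all CSV files in the folder into a single Spark DataFrame.
# Nothing is downloaded to the driver; Spark reads the files in parallel.
read_mounted_csvs <- function(folder, mount_point = "/mnt/mycorporation") {
  sc <- sparklyr::spark_connect(method = "databricks")  # assumes a Databricks notebook
  sparklyr::spark_read_csv(
    sc,
    name = "my_folder_csv",                  # hypothetical Spark table name
    path = mount_csv_path(mount_point, folder),
    header = TRUE,
    infer_schema = TRUE,
    memory = FALSE                           # do not cache huge files eagerly
  )
}
```

Calling `read_mounted_csvs("my_folder")` would then return a lazy Spark table that can be queried with dplyr verbs. For very large files, passing an explicit `columns` specification instead of `infer_schema = TRUE` may avoid an extra pass over the data.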
Configure the Blob storage access (the WASBS alternative to mounting):
Read the CSV files using R:
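A minimal sketch of the WASBS route in R, under a few assumptions: the notebook runs on Databricks, the storage account is named `mycorporation` (taken from the endpoint in the question), and the account key can be registered on the running session through `spark_session_config()`. The helper names and the table name are hypothetical, and whether the session-level setting is sufficient depends on the cluster setup.

```r
# WASBS URL layout: wasbs://<container>@<account>.blob.core.windows.net/<folder>
wasbs_path <- function(container, account, folder) {
  sprintf("wasbs://%s@%s.blob.core.windows.net/%s", container, account, folder)
}

# Register the storage account key with Spark, then read every CSV in the folder.
read_wasbs_csvs <- function(container, account, folder, account_key) {
  sc <- sparklyr::spark_connect(method = "databricks")  # assumes a Databricks notebook
  # Assumption: a session-level key is enough for the cluster to reach the account.
  sparklyr::spark_session_config(
    sc,
    config = sprintf("fs.azure.account.key.%s.blob.core.windows.net", account),
    value = account_key
  )
  sparklyr::spark_read_csv(
    sc,
    name = "my_folder_csv",                  # hypothetical Spark table name
    path = wasbs_path(container, account, folder),
    header = TRUE,
    infer_schema = TRUE,
    memory = FALSE                           # do not cache huge files eagerly
  )
}
```

With the values from the question, the call would look like `read_wasbs_csvs("mycorporation", "mycorporation", "my_folder", account_key)`. Pointing `path` at the folder lets Spark pick up all files in it as one DataFrame rather than reading them one after another.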