I'm using following query in R, the Query Run time is < 50 sec to fetch millions of records however while converting the output into dataframe using dbGetQuery it takes almost an hour. Please suggest how can I optimize the dataframe creation process in R.
Please help me with a faster approach to fetch query output in R dataframe.
# Define your start_date as needed
custom_start_date <- "YYYY-MM-DD" # Replace with your desired start date
# Construct the SQL query with generic variable names
custom_sql_query <- sprintf("
WITH FirstInstance AS (
SELECT var1,
MIN(var2)
FROM delta.var3.var4
WHERE date(var2) >= date('%s')
GROUP BY var1
)
SELECT var1,
var5,
var6,
var7,
var8,
var9,
var10,
var11
FROM delta.var3.var4
INNER JOIN FirstInstance
ON var1 = var1
AND var2 = var2", custom_start_date)
# Fetch the data
df <- dbGetQuery(con, sql_query)
