How to use parallelization with raster::extract in R using furrr

Asked by oatmilkyway on 09 December 2022 at 05:13

I am unsure if this is a bug or a how-to. I posted this question here and was told to ask StackOverflow!

library(tidyverse)
library(tigris)
library(elevatr)
library(raster)
library(sf)
library(furrr)

multco <- tigris::tracts(state = "OR", 
    county = "Multnomah") %>% 
  st_transform(2913) %>% 
  st_point_on_surface()

ex_elev <- elevatr::get_elev_raster(
    locations = st_bbox(multco) %>% st_as_sfc(), 
    z = 5)

# This works
ev <- raster::extract(ex_elev, multco, 
    fun = mean, na.rm = T, buffer = 100)

## This fails
ev2 <- multco %>% 
  furrr::future_map_dbl(.f = function(point){
    raster::extract(ex_elev, point, fun = mean, na.rm = T, buffer = 100)}, 
             .options = furrr_options(seed = TRUE,
                                      packages = c("raster", "sf")))

The parallel version fails with the following error: Error in round(y) : non-numeric argument to mathematical function

The serial version works, however.

I'm not sure if it's a {raster} issue or a {future} issue or a {furrr} issue. If anyone has luck using furrr-based parallelization and mapping with {raster} functions, please let me know!

Edit 1: Changed code to fully reproducible example.


Answered by Elia on 12 December 2022 at 14:14

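As far as I know, parallel extraction is rarely needed. The overhead of passing the data to the workers is often more expensive than running the extraction sequentially. However, purrr::map and its parallel counterparts iterate over the elements of a list. An sf object is a data frame, so mapping over it visits its columns rather than its rows, which is probably why raster::extract receives a non-numeric argument. You therefore have to convert your sf object to a list with one element per point. See my example, with a small timing benchmark of different approaches: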

library(tidyverse)
library(tigris)
library(elevatr)
library(raster)
library(sf)
library(furrr)


# Sequential approaches (multco and ex_elev are created as in the question)
system.time(ev <- raster::extract(ex_elev, multco,
                                  fun = mean, na.rm = TRUE, buffer = 100)) # 51.84 s

system.time(ev <- terra::extract(ex_elev, multco,
                                 fun = mean, na.rm = TRUE, buffer = 100)) # 57.2 s

system.time(ev <- exactextractr::exact_extract(ex_elev, st_buffer(multco, 100),
                                               "mean")) # 0.43 s



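# In parallel: split the sf object into a list of one-row sf objects
xy.list <- split(multco, seq_len(nrow(multco)))

# Start the background R sessions that act as workers
plan(multisession)

system.time(ev2 <- xy.list %>%
  furrr::future_map_dbl(.f = function(point) {
    raster::extract(ex_elev, point, fun = mean, na.rm = TRUE, buffer = 100)
  },
  .options = furrr_options(seed = TRUE,
                           packages = c("raster", "sf")))
) # 208 s

# Return to sequential processing
plan(sequential)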
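
To see why the split() step matters, here is a minimal sketch on a toy data frame. The `pts` object and its columns are made up for illustration, and base lapply() stands in for purrr::map, which treats a data frame the same way (as a list of columns):

```r
# Toy data frame standing in for the sf object (hypothetical data):
# 4 rows (points) and 3 columns
pts <- data.frame(id = c("a", "b", "c", "d"),
                  x  = c(1, 2, 3, 4),
                  y  = c(10, 20, 30, 40))

# Mapping over a data frame visits its columns, not its rows
by_column <- lapply(pts, class)
names(by_column)  # "id" "x" "y"

# split() by row index yields a list with one single-row data frame per point
by_row <- split(pts, seq_len(nrow(pts)))
length(by_row)    # 4

# Mapping over that list now runs the function once per row
row_sums <- vapply(by_row, function(row) row$x + row$y, numeric(1))
```

With the original sf object, the first element the function receives is a whole attribute column rather than a point, so the extraction call cannot work until the object is split by row.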

The comment after each approach gives the elapsed time (in seconds) on my machine (64 GB RAM and 48 logical cores). As you can see, with your toy data the exact_extract approach is by far the fastest.
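
If you still want to go parallel, one way to reduce the per-task overhead might be to send each worker a batch of rows instead of a single point. This is only a sketch under assumptions: the toy `pts` data frame and `n_workers` value are hypothetical, lapply() stands in for furrr::future_map, and I have not benchmarked it on the data above.

```r
# Toy stand-in for the point data (hypothetical); any data frame,
# including an sf object, can be subset by row in the same way
pts <- data.frame(id = 1:10, x = runif(10), y = runif(10))

n_workers <- 3  # assumed number of workers

# parallel::splitIndices() partitions 1..n into n_workers contiguous groups
idx <- parallel::splitIndices(nrow(pts), n_workers)
chunks <- lapply(idx, function(i) pts[i, , drop = FALSE])

# Each chunk is one task, so a vectorised function runs once per chunk;
# lapply() is a placeholder for the parallel map here
res <- lapply(chunks, function(chunk) chunk$x + chunk$y)

# The groups are contiguous and ordered, so unlist() restores row order
out <- unlist(res)
```

With real data, the function applied to each chunk would be the extraction call, which already accepts many points at once.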
