Error when subsetting sfc objects with explicit binary predicate

105 Views Asked by At

I don't understand why the last two examples in the following code chunk fail

library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE

p1 <- st_sfc(st_point(c(0, 0)))
p2 <- st_sfc(st_point(c(1, 1)))

p1[p2]
#> Geometry set for 0 features 
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS:           NA
p1[p2, op = st_intersects]
#> Error in `[.default`(p1, p2, op = st_intersects): incorrect number of dimensions
p1[p2, , op = st_intersects]
#> Error in `[.default`(p1, p2, , op = st_intersects): incorrect number of dimensions

when the default operation specified by [.sfc is exactly st_intersects.

sf:::`[.sfc`
#> function (x, i, j, ..., op = st_intersects) 
#> {
#>     if (!missing(i) && (inherits(i, "sf") || inherits(i, "sfc") || 
#>         inherits(i, "sfg"))) 
#>         i = lengths(op(x, i, ...)) != 0
#>     st_sfc(NextMethod(), crs = st_crs(x), precision = st_precision(x), 
#>         dim = if (length(x)) 
#>             class(x[[1]])[1]
#>         else "XY")
#> }
#> <bytecode: 0x00000255ad682df8>
#> <environment: namespace:sf>

Created on 2024-01-21 with reprex v2.0.2

Could you please help me? How should I specify the op parameter in the correct way when subsetting sfc object?

EDIT (2024-01-23)

Just for future reference, I want to point out that the behaviour described in this question has been adjusted/fixed/modified in the sf package. See https://github.com/r-spatial/sf/issues/2320 for more details.

1

There are 1 best solutions below

8
Allan Cameron On BEST ANSWER

The problem here occurs because `[.sfc` uses NextMethod() internally. The way this works is that the arguments you explicitly passed to `[.sfc` are passed on to the default subsetting method for lists. This is the primitive "[" function, implemented in the C code function do_subset_dflt.

Unusually for an R function, this ignores the names of any arguments (with the single exception of drop), and goes by order only:

my_list <- list(a = 1, b = 2)

my_list[the_name_of_this_argument_is_ignored = 2]
#> $b
#> [1] 2

my_list[the_name_of_this_argument_is_ignored = 2, drop = FALSE]
#> $b
#> [1] 2

This even leads to the confusing result

my_mat <- matrix(1:4, 2, 2)

my_mat[i = 1, j = 2]
#> [1] 3

my_mat[j = 1, i = 2]
#> [1] 3

If the number of (non-drop) arguments exceeds the number of dimensions of our object, we get exactly your error:

my_list[2, op = sf::st_intersects]
#> Error in my_list[2, op = sf::st_intersects] : 
#>   incorrect number of dimensions

It seems confusing that a default argument works without a problem, yet making exactly the same function call but explicitly naming the default argument throws an error. This is a result of how NextMethod works - it only passes on the symbols you specifically passed to the initial call, ignoring any default parameters.

The whole situation can be shown quite easily by creating a new class with its own subsetting method:

test <- function(x) structure(matrix(x, 2, 2), class = "test")

`[.test` <- function(x, i, j, ..., nag = FALSE) {
  if(nag) cat("This won't work now\n")
  NextMethod()
}

x <- test(5)

x[1, 1]
#> [1] 5

x[1, 1, nag = TRUE]
#> This won't work now
#> Error in `[.default`(x, 1, 1, nag = TRUE): incorrect number of dimensions

x[1, 1, nag = FALSE]
#> Error in `[.default`(x, 1, 1, nag = FALSE): incorrect number of dimensions

There isn't any direct work-around for this because [ is a primitive function. The use of NextMethod means that the named argument op is not, in fact, available for use by the end-user. Since the function is not exported and is undocumented, this is not really a bug, but it seems like a strange design decision to have op passed as an argument rather than hard-coding `st_intersects in the body of the function.

In the meantime, the best thing is to avoid using the op argument, and just using the predicate function directly:

library(sf)
#> Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE

p1 <- st_sfc(st_point(c(0, 0)))
p2 <- st_sfc(st_point(c(1, 1)))

p1[lengths(st_intersects(p1, p2)) != 0]
#> Geometry set for 0 features 
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS:           NA

Created on 2024-01-21 with reprex v2.0.2