How to get the co-ordinates into function that's being called by apply() in R?

89 Views Asked by At

I'm trying to write my own image filtering functions and one easier style is averaging the cell neighbour values. I've got a matrix with pixel values, but instead of using a huge for loop I'm trying to use apply, but can't seem to get the current co-ordinates of a cell being "worked on".

I'm struggling to get any multi-variable function to work with apply(), but ideally I think I'd pass the whole matrix to the function?

picture = matrix(data=0,nrow=px,ncol=px)

# I then have a function that draws a few squares and adds some random noise
# Now I'd like to go through the matrix, find the cells neighbour values, avg.
 
test <- function(x,cur_matrix){
    up    = cur_matrix[i,j+1]
    left  = cur_matrix[i-1,j]
    right = cur_matrix[i+1,j]
    down  = cur_matrix[i,j-1]

    avg = ( up + down + left + right ) / 4
    x = avg
    return(x)
}

picture_2 = apply(picture, 1:2, test, cur_matrix = picture)

How could I pass both the matrix "picture" and the current i,j co-ords into this function?

4

There are 4 best solutions below

8
ThomasIsCoding On

For pixel averaging, you usually may need to consider its valid neighbors + itself, instead of the neighbors only (as revealed in your post).


Update: Vectorized

As pointed out by @Onyambu, yes, the approach given in ftic0 (see my earliest solution) is super inefficient, both time-wisely and space-wisely. So, in what follows, we avoid spanning the distance space as did previously and vectorize the operations like below

ftic1 <- function(pic) {
  p <- list(
    orig = pic,
    up = rbind(NA, pic[-nrow(pic),]),
    down = rbind(pic[-1,], NA),
    left = cbind(NA, pic[, -ncol(pic)]),
    right = cbind(pic[, -1], NA)
  )
  rowMeans(array(do.call(cbind, p), c(dim(pic), 5)), TRUE, 2)
}

and let's see the performance improvement

set.seed(0)
pic <- matrix(runif(512^2), 512)
microbenchmark(
  fonyambu = localAverage(pic),
  ftic1 = ftic1(pic),
  unit = "relative",
  times = 100L,
  check = "equal"
)

which shows

Unit: relative
     expr      min       lq     mean   median       uq     max neval
 fonyambu 1.746054 1.753484 1.796011 1.768681 1.911893 2.17329   100
    ftic1 1.000000 1.000000 1.000000 1.000000 1.000000 1.00000   100

and bench::mark

bm <- bench::mark(
  fonyambu = localAverage(pic),
  ftic1 = ftic1(pic),
  iterations = 100L,
  check = TRUE
)

shows

> bm
# A tibble: 2 × 13
  expression      min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr> <bch:tm> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 fonyambu     28.3ms 30.2ms      31.8    59.1MB    273.     12   103      377ms
2 ftic1        14.8ms 17.2ms      55.3      38MB     73.3    43    57      778ms
# ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>

Previous Solution: Low-code version but inefficient

You can try sapply like this

# dummy picture pixel matrix
set.seed(0)
pic <- matrix(runif(30), 5)

ftic0 <- function(pic) {
  # complex coordinates, say, (1,2) is denoted by 1+2i
  idx <- row(pic) + 1i * col(pic)
  # average by neighboring pixels
  pic_out <- `dim<-`(sapply(c(idx), \(p) {
    u <- idx[abs(p - idx) <= 1] # neighbors + itself
    mean(diag(pic[Re(u), Im(u)]))
  }), dim(pic))
}

pic_out <- ftic0(pic)

and we will see that

> pic
          [,1]      [,2]       [,3]      [,4]      [,5]       [,6]
[1,] 0.8966972 0.2016819 0.06178627 0.7698414 0.7774452 0.26722067
[2,] 0.2655087 0.8983897 0.20597457 0.4976992 0.9347052 0.38611409
[3,] 0.3721239 0.9446753 0.17655675 0.7176185 0.2121425 0.01339033
[4,] 0.5728534 0.6607978 0.68702285 0.9919061 0.6516738 0.38238796
[5,] 0.9082078 0.6291140 0.38410372 0.3800352 0.1255551 0.86969085
> pic_out
          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,] 0.4546293 0.5146388 0.3098210 0.5266930 0.6873031 0.4769267
[2,] 0.6081799 0.5032460 0.3680813 0.6251678 0.5616213 0.4003576
[3,] 0.5387903 0.6105087 0.5463696 0.5191846 0.5059061 0.2485087
[4,] 0.6284957 0.6988927 0.5800774 0.6856513 0.4727331 0.4792857
[5,] 0.7033917 0.6455558 0.5200689 0.4704000 0.5067387 0.4592113

In the approach above, you should be aware of treatment for edge and corner pixels:

  • Each corner pixel has 2 neighbors for averaging
  • Each edge pixel has 3 neighbors for averaging
1
Onyambu On

The method presented above is quite inneficient. One way to speed up is not to loop over each cell but rather get the data of the surrounding cells, and find the mean:

localAverage <- function(im){
    dm <- dim(im)
    size <- prod(dm)
    x <- seq_len(size)
    m <- dm[1]
    
    xm <- x %% m 
    left <- (x - m) * NA^(x <= m) # left index
    up <- (x - 1) * NA^(xm == 1) # up index
    down <- (x + 1) * NA^(!xm) 
    right <- (x + m) * NA^(x > size - m)
    b <- matrix(im[c(left,up,x, down, right)], 5, byrow = TRUE)
    array(colMeans(b, na.rm =  TRUE), dim = dm)
}

set.seed(0)
pic <- matrix(runif(30), 5)
localAverage(pic)

         [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,] 0.4546293 0.5146388 0.3098210 0.5266930 0.6873031 0.4769267
[2,] 0.6081799 0.5032460 0.3680813 0.6251678 0.5616213 0.4003576
[3,] 0.5387903 0.6105087 0.5463696 0.5191846 0.5059061 0.2485087
[4,] 0.6284957 0.6988927 0.5800774 0.6856513 0.4727331 0.4792857
[5,] 0.7033917 0.6455558 0.5200689 0.4704000 0.5067387 0.4592113

We could compare this to the above code for an image of 28*28 and 256 * 256 pixels:

thomas <- function(pic){
    idx <- row(pic) + 1i * col(pic)
    
    # average by neighboring pixels
    `dim<-`(sapply(c(idx), \(p) {
        u <- idx[abs(p - idx) <= 1] # neighbors + itself
        mean(diag(pic[Re(u), Im(u)]))
    }), dim(pic))
 }

pic28 <- matrix(runif(28^2), 28)

microbenchmark::microbenchmark(localAverage(pic28),thomas(pic28), check = 'equal')
Unit: microseconds
                expr     min       lq      mean   median       uq     max neval
 localAverage(pic28)    59.6    69.95    94.492    92.95   112.20   181.7   100
       thomas(pic28) 12389.4 12750.10 15582.934 13153.90 20927.75 30250.8   100
 
 pic256 <- matrix(runif(256^2), 256)
bench::mark(localAverage(pic256),thomas(pic256))
# A tibble: 2 × 13
  expression       min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result   memory    
  <bch:expr>    <bch:> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>   <list>    
1 localAverage… 59.3µs 64.6µs   14596.    181.6KB     36.5  5204    13      357ms <dbl[…]> <Rprofmem>
2 thomas(pic25… 12.6ms 13.2ms      73.8    19.1MB     25.7    23     8      312ms <dbl[…]> <Rprofmem>
# ℹ 2 more variables: time <list>, gc <list>

Take note of the memory usage also

0
ThomasIsCoding On

A follow-up appending to the answer by @ThomasIsCoding: We can use for loops to iterate through each pixels and find out their corresponding neighbors.

This base R implementation with for loops is slow, but can significantly save the memory utilization. The performance can be boosted dramatically if you rewrite the for loops in terms of Rcpp.

Code

You can takes this version as sort of coding practice

ftic2 <- function(pic) {
  nr <- nrow(pic)
  nc <- ncol(pic)
  out <- matrix(NA, nr, nc)
  for (i in 1:nr) {
    ii <- i + c(-1, 1)
    ii <- ii[ii >= 1 & ii <= nr]
    for (j in 1:nc) {
      jj <- j + c(-1, 1)
      jj <- jj[jj >= 1 & jj <= nc]
      out[i, j] <- mean(c(pic[i, j], pic[ii, j], pic[i, jj]))
    }
  }
  out
}

Benchmarking

set.seed(0)
pic <- matrix(runif(512^2), 512)
bm <- bench::mark(
  fonyambu = localAverage(pic),
  ftic1 = ftic1(pic),
  ftic2 = ftic2(pic),
  iterations = 100L,
  check = TRUE
)

shows (particularly, see the mem_alloc column to check the memory status)

> bm
# A tibble: 3 × 13
  expression     min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr> <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 fonyambu   25.41ms 36.25ms    23.5     59.13MB    27.5    100   117      4.26s
2 ftic1      15.17ms 22.33ms    42.9     38.02MB    27.0    100    63      2.33s
3 ftic2        1.42s   1.51s     0.614    3.09MB     6.23   100  1015      2.72m
0
jblood94 On

Brute-force vectorization with base R:

fblur <- function(x) {
  y <- x
  n <- nrow(x)
  m <- ncol(x)
  i <- c(1, n)
  j <- c(1, m)
  y[-1,] <- y[-1,] + x[-n,]
  y[-n,] <- y[-n,] + x[-1,]
  y[,-1] <- y[,-1] + x[,-m]
  y[,-m] <- y[,-m] + x[,-1]
  y[i, j] <- y[i, j]/3
  y[i, -j] <- y[i, -j]/4
  y[-i, j] <- y[-i, j]/4
  y[-i, -j] <- y[-i, -j]/5
  y
}

Benchmarking

Functions:

localAverage <- function(im){
  dm <- dim(im)
  size <- prod(dm)
  x <- seq_len(size)
  m <- dm[1]
  
  xm <- x %% m 
  left <- (x - m) * NA^(x <= m) # left index
  up <- (x - 1) * NA^(xm == 1) # up index
  down <- (x + 1) * NA^(!xm) 
  right <- (x + m) * NA^(x > size - m)
  b <- matrix(im[c(left,up,x, down, right)], 5, byrow = TRUE)
  array(colMeans(b, na.rm =  TRUE), dim = dm)
}

ftic1 <- function(pic) {
  p <- list(
    orig = pic,
    up = rbind(NA, pic[-nrow(pic),]),
    down = rbind(pic[-1,], NA),
    left = cbind(NA, pic[, -ncol(pic)]),
    right = cbind(pic[, -1], NA)
  )
  rowMeans(array(do.call(cbind, p), c(dim(pic), 5)), TRUE, 2)
}

Data:

m <- matrix(runif(512^2), 512)

Timings:

bench::mark(
  localAverage = localAverage(m),
  ftic1 = ftic1(m),
  fblur = fblur(m)
)
#> # A tibble: 3 × 6
#>   expression        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 localAverage   47.7ms     55ms      13.3    59.2MB     28.4
#> 2 ftic1          31.4ms     35ms      24.5      38MB     33.3
#> 3 fblur          10.2ms   17.4ms      59.1    20.3MB     25.6

fblur is the fastest and most memory efficient.