How to break a random walk at destination with R

160 Views Asked by At

I have to generate a random walk for a sequence of positions in 2D. The person doing the random walk starts at (0,0). At every move, she goes left, right, up or down. The random walk stops if she comes back to (0,0), or until she made a 1000 steps. *I am using the R language

I have done this so far, but I am having trouble figuring out how to stop the random walk when she reaches (0,0) again. I only get two vectors back. Any help would be very appreciated. Thank you!

step.max<-1000
destination<-rbind(c(0,0))

Random.walk <- function(n=step.max){
  steps <- matrix(c(0,0,-1,1,0,-1,1,0),nrow = 4)
  walk <- steps[sample(1:5,n,replace = TRUE)] 
  walk.1 <-rbind(walk)
  ifelse(destination,break,apply(walk.1,2,cumsum))
  }
Random.walk(n)
2

There are 2 best solutions below

6
dcarlson On BEST ANSWER

If you want to stop when you reach the origin, you will have to use a loop. Alternatively you can generate a random walk of 1000 steps very quickly and identify when the origin was reached:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
steps
#      [,1] [,2]
# [1,]    1    0  # Right
# [2,]   -1    0  # Left
# [3,]    0    1  # Up
# [4,]    0   -1  # Down

Now generate a random walk with 1000 steps:

set.seed(42)  # To create a reproducible example
rand <- sample.int(4, 1000, replace=TRUE)
moves <- steps[rand, ]         # 1000 random moves
position <- rbind(c(0, 0), apply(moves, 2, cumsum))  # Position after each move
home <- which(position[, 1] == 0 & position[, 2] == 0)  # When is position c(0, 0)?
home
# [1] 1   # We never revisit the origin in this random walk.

Now illustrate the results:

plot(position, type="l")
points(0, 0, cex=2, col="red")
points(position[1001, 1], position[1001, 2], cex=2, col="blue")

Random Walk

If you want break out of the walk when the position returns to c(0, 0), you will need a loop. Something like this:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
position <- rbind(c(0, 0))
move <- 0
for (move in 2:1001) {  
    step <- steps[sample.int(4, 1), ]
    position <- rbind(position,  position[move-1, ] + step)
    if (position[move, 1] == 0 & position[move, 2] == 0) break
}
cat("move=", move, "position=", position[move, ], "\n")

The position variable will contain all of the locations visited during the walk.

If you want to know how often the walk returns to the origin before 1000 steps, we need to create a function for the walk and then use replicate to generate multiple walks, for example 10,000:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
walking <- function(n) {
    rand <- sample.int(4, n, replace=TRUE)
    moves <- steps[rand, ]
    position <- rbind(c(0, 0), apply(moves, 2, cumsum))
    home <- which(position[, 1] == 0 & position[, 2] == 0)
    if (is.na(home[2])) 1000 else home[2]
}

results <- replicate(10000, walking(1000))
sum(results < 1000)/10000
# [1] 0.683

So about 68% of the time the walk ends in less than 1000 steps. You can increase the replications. If you run this several times, the percentage varies from 67 - 68%. You can narrow the range by increasing the number of replications, but the code will take longer to run.

3
jblood94 On

Here is a fully vectorized function that will return a matrix of positions in a random walk with k steps. The matrix will have k + 1 rows (it includes the original position at the origin) unless the walk returns to the origin.

walk2d <- function(k) {
  k1 <- k + 1L
  m <- matrix(0L, k1, 2)
  m[2:k1 + sample(c(0L, k1), k, 1)] <- sample(c(-1L, 1L), k, 1)
  m[,1] <- cumsum(m[,1])
  m[,2] <- cumsum(m[,2])
  i <- which.min(rowSums(abs(m[-1,])))
  if (i == 1L) m else m[1:(i + 1L),]
}

A walk that returns to the origin on the 14th step:

set.seed(123)
walk2d(20L)
#>       [,1] [,2]
#>  [1,]    0    0
#>  [2,]   -1    0
#>  [3,]   -1   -1
#>  [4,]   -2   -1
#>  [5,]   -1   -1
#>  [6,]   -2   -1
#>  [7,]   -1   -1
#>  [8,]   -1    0
#>  [9,]   -1    1
#> [10,]   -2    1
#> [11,]   -2    0
#> [12,]   -1    0
#> [13,]   -1    1
#> [14,]    0    1
#> [15,]    0    0

A walk that does not return to the origin within k steps

set.seed(94)
walk2d(20L)
#>       [,1] [,2]
#>  [1,]    0    0
#>  [2,]    1    0
#>  [3,]    2    0
#>  [4,]    2    1
#>  [5,]    3    1
#>  [6,]    3    2
#>  [7,]    4    2
#>  [8,]    4    1
#>  [9,]    3    1
#> [10,]    2    1
#> [11,]    3    1
#> [12,]    4    1
#> [13,]    4    0
#> [14,]    4    1
#> [15,]    3    1
#> [16,]    4    1
#> [17,]    5    1
#> [18,]    6    1
#> [19,]    5    1
#> [20,]    4    1
#> [21,]    3    1

To estimate the probability of returning to the origin in less than 1000 steps, here is a vectorized function that will perform n simulations and return the proportion that returned to the origin within k steps. Note that walks can return to the origin only on even steps, so the probability of returning to the origin in less than 1000 steps is the same as the probability of returning within 998 steps.

library(Rfast)
library(parallel)

pReturn <- function(n, k) {
  kn <- k*n
  a <- array(0, c(k, n, 2))
  a[1:kn + sample(c(0, kn), kn, 1)] <- sample(c(-1, 1), kn, 1)
  a[,,1] <- colCumSums(a[,,1])
  a[,,2] <- colCumSums(a[,,2])
  sum(colSums(rowSums(abs(a), dims = 2) == 0) != 0)/n
}

pReturn(1e3, 998)
#> [1] 0.664

Setting n too large will result in memory problems, so call pReturn multiple times to get a better estimate.

cl <- makeCluster(detectCores() - 1L, type = "PSOCK")
clusterExport(cl, "pReturn")
invisible(clusterEvalQ(cl, library(Rfast)))
mean(unlist(parLapply(cl, 1:1e3, function(i) pReturn(1e3, 998)))) # 1M replications
#> [1] 0.678182

This compares well against the exact probability (building off http://oeis.org/A054474 in Mathematica, with m = k/2):

In[1]:= m = 499; gf[x_] = 2 - Pi/(2*EllipticK[4*Sqrt[x]]);
N[Total[(List @@ Normal[Series[gf[x], {x, 0, m - 1}]] /. x -> 1)[[2 ;;
       m + 1]]*Table[4^k, {k, -1, -m, -1}]]]

Out[2]= 0.678159