Generating a normally distributed random variable that has range [1, 3] in R

Question


208 Views Asked by MinChul Park At 02 November 2023 at 12:08

I want to generate a normally distributed random variable that has range [1, 3].

Specifically, I tried the following R code:

x1 <- runif(100, 1, 2)
x2 <- rnorm(100, 0, 0.3)

V <- 1 + x1 + x2

Then, V follows a normal distribution (conditional on x1) and is roughly concentrated on [1, 3].

However, I want V to have range exactly [1, 3]. That is, all elements should lie in [1, 3], not just roughly in [1, 3]:

min(V)
[1] 1
max(V)
[1] 3

I have no idea how to do this. Is there a technique for this task?


There are 3 best solutions below

uke On 02 November 2023 at 12:48

Here is another approach, but you would have to give up the requirement that 100% of the values lie in [1, 3] and be satisfied with, say, at least 99.99% of the values lying in [1, 3].

This is different from rescaling a sample from a standard normal distribution so that it always fits into [1, 3] exactly, because the rescaled sample has a varying width (standard deviation), depending on whether the random draw happened to contain an "outlier" or not. The whole sample is rescaled so that the outlier lands inside [1, 3].
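The varying width can be seen in a small simulation. This is only a sketch: `rescale_to_range` is a hypothetical helper written for this illustration, and the seed, sample size and number of repetitions are arbitrary choices.

```r
# Hypothetical helper: linearly map a sample so that min -> lower, max -> upper
rescale_to_range <- function(x, lower = 1, upper = 3) {
  lower + (upper - lower) * (x - min(x)) / diff(range(x))
}

set.seed(42)
# Standard deviation of 1000 rescaled samples, each of size 100
sds <- replicate(1000, sd(rescale_to_range(rnorm(100))))
range(sds)  # the spread is not fixed, it changes from sample to sample
```

Each rescaled sample hits 1 and 3 exactly, but its standard deviation depends on how extreme the smallest and largest draws happened to be.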

My approach is to set the width (standard deviation) so that, for an infinitely large sample, a chosen share of the values, e.g. 99.99%, lies within [1, 3].

You have to reformulate the range as the percentage of the area under the normal density that you want to cover. An area of 100% always corresponds to a range from -infinity to +infinity, so you have to settle for less than 100% of the area between 1 and 3. Let's say you want 99% of the area to be between 1 and 3.

You then use the sd argument of rnorm() to supply a standard deviation for which 99% of the area lies between 1 and 3.

How do we calculate that specific standard deviation? We can use qnorm() to get the value that bounds a given area. The area runs from -infinity up to that value and equals p. When we put p = 0.005, we say: give me the x value to the left of which lies 0.5% of the distribution.

As the normal distribution is symmetric, we determine p by halving what is left over after taking 99% from 100%, i.e. 1%. We split the 1% in half and say: 0.5% should lie below 1 and 0.5% should lie above 3.

All that is left is to supply the mean of our distribution to qnorm(). It should be the center of your given range. In your case, this is 2.

So we know the desired outcome of qnorm(p = 0.005, mean = 2, sd = ???): It should be 1. We have to set sd in a way that the result is 1.

I did this by trial and error, approaching 1, and got to this point:

qnorm(0.005, mean = 2, sd = 0.388223)
#> 1.000004

So, in turn:

rnorm(mean = 2, sd = 0.388223, n = 100)

should give you random values of which ~99% fall within [1, 3].
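The trial and error can probably be skipped: since qnorm(p, mean, sd) equals mean + sd * qnorm(p), the condition qnorm(p, mean = 2, sd) == 1 can be solved for sd directly. A minimal sketch, where the variable names `coverage`, `p` and `sd_99` are my own:

```r
# Solve 1 = 2 + sd * qnorm(p) for sd
coverage <- 0.99
p <- (1 - coverage) / 2          # 0.005 in each tail
sd_99 <- (1 - 2) / qnorm(p)      # qnorm(p) is negative, so sd_99 is positive
sd_99                            # about 0.3882, in line with the value found above
qnorm(p, mean = 2, sd = sd_99)   # 1, up to floating-point error
```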

You could go more extreme by saying 99.99% should be inside [1, 3], approximating your goal of 100%.

p = (100% - 99.99%)/2 = 0.01% / 2 = 0.005% = 0.00005
Same steps as above: optimize sd so that

qnorm(0.00005, mean = 2, sd = ???) == 1

Then plug that sd into your rnorm() call.
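The "optimize sd" step can also be automated with a root finder instead of manual search. The following is a sketch using base R's uniroot(); the function name `target` and the search interval c(0.01, 1) are assumptions chosen for this example.

```r
# Difference between the 0.005% quantile and the desired lower limit 1
target <- function(s) qnorm(0.00005, mean = 2, sd = s) - 1

# target() is positive for tiny sd and negative for sd = 1, so a root lies between
sol <- uniroot(target, interval = c(0.01, 1), tol = 1e-10)
sol$root  # about 0.25703
```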

For the 99.99% example, sd would be ~ 0.25703. This is likely a bit extreme, because the sampled values then mostly span something like [1.4, 2.8], but if you want to be safe, you can choose it. This would be suitable for large samples.

rnorm(mean = 2, sd = 0.25703, n = 10000) |> hist()
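It may be worth checking how much protection this sd really gives. The sketch below computes the share of the distribution inside [1, 3] with pnorm() and, assuming independent draws, the chance that all n values of a sample stay inside. The names `sd_9999`, `inside` and `n` are my own.

```r
sd_9999 <- 0.25703
n <- 10000

# Share of the distribution between 1 and 3
inside <- pnorm(3, mean = 2, sd = sd_9999) - pnorm(1, mean = 2, sd = sd_9999)
inside    # close to 0.9999

# Probability that every one of n independent draws lies in [1, 3]
inside^n  # roughly 0.37 for n = 10000
```

So even at 99.99% coverage, a sample of 10,000 values will more often than not contain at least one value outside [1, 3].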

Stéphane Laurent On 02 November 2023 at 14:59

If you want a distribution with a bell-shaped density function centered at 2 and with endpoints at 1 and 3, you can construct it from a Beta(a,a) distribution:

a <- 4
simulations <- 2*rbeta(100, a, a) + 1

By increasing the value of a, the distribution becomes more concentrated around 2.

Here is the theoretical density:

a <- 4
x <- seq(1, 3, length.out = 200)
y <- dbeta((x-1)/2, a, a)/2
plot(x, y, type = "l")
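To see how a controls the spread: the variance of Beta(a, a) is 1/(4(2a + 1)), so 2 * Beta(a, a) + 1 has standard deviation 1/sqrt(2a + 1). A quick check by simulation, with an arbitrary seed and sample size:

```r
set.seed(123)
a <- 4
simulations <- 2 * rbeta(1e5, a, a) + 1

range(simulations)    # stays inside [1, 3]
sd(simulations)       # empirical standard deviation
1 / sqrt(2 * a + 1)   # theoretical value, 1/3 for a = 4
```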

Allan Cameron On 02 November 2023 at 12:45 (accepted answer)

Since the support of any normal distribution is the whole real number line, the only way to get what you are looking for is to draw a sample and then normalize it into your specified range. As r2evans points out, there are theoretical problems with any such approach. However, a simple implementation of it would be

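rnorm_limits <- function(n, min = 1, max = 3) {
  x <- rnorm(n)
  # Scale the sample so that its total spread equals max - min
  x <- (max - min) * x/diff(range(x))
  # Shift it so that the smallest value lands on min
  return(x - min(x) + min)
}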

Testing, we have:

set.seed(1)

hist(rnorm_limits(100))

And of course the range will be exactly that specified:

range(rnorm_limits(100))
#> [1] 1 3
