Drawing from uniform distributions on the condition that the sum equals 1

I want to generate 3 random probabilities each from a uniform distribution on the condition that their sum = 1, while specifying the min and max for each probability.

For example, consider the following:

p1 = U(0.25, 0.75); p2 = U(0.25, 0.75); p3 = U(0.00, 0.25)

subject to: p1+p2+p3 = 1.

I am able to achieve this with this code

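n_draws <- 1e5
set.seed(1)
# p1 is drawn from its full range
p <- data.frame(p1 = runif(n_draws, 0.25, 0.75))
# p2 is drawn from the interval that keeps both p2 and p3 within their limits
p$p2 <- runif(n_draws,
              min = ifelse(p$p1 >= 0.5, 0.25, 0.75 - p$p1),
              max = 1 - p$p1)
# p3 is whatever remains
p$p3 <- 1 - (p$p1 + p$p2)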

A summary shows that the values are constrained within the specified limits

summary(p)
       p1               p2               p3          
 Min.   :0.2500   Min.   :0.2500   Min.   :0.000001  
 1st Qu.:0.3740   1st Qu.:0.2941   1st Qu.:0.030647  
 Median :0.5000   Median :0.3807   Median :0.079315  
 Mean   :0.4998   Mean   :0.4066   Mean   :0.093547  
 3rd Qu.:0.6254   3rd Qu.:0.5006   3rd Qu.:0.148063  
 Max.   :0.7500   Max.   :0.7489   Max.   :0.249999  

and their sums are all equal to 1

all(rowSums(p) == 1)
> TRUE
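
Because `==` compares floating-point numbers exactly, a tolerance-based check is usually a more robust way to confirm the row sums. The sketch below repeats the sampling code above so that it runs on its own; the name `max_dev` and the tolerance `1e-12` are illustrative choices, not anything required by R.

```r
set.seed(1)
n_draws <- 1e5
p <- data.frame(p1 = runif(n_draws, 0.25, 0.75))
p$p2 <- runif(n_draws, ifelse(p$p1 >= 0.5, 0.25, 0.75 - p$p1), 1 - p$p1)
p$p3 <- 1 - (p$p1 + p$p2)

# Largest absolute deviation of any row sum from 1
max_dev <- max(abs(rowSums(p) - 1))

# The tolerance 1e-12 is an arbitrary illustrative choice
max_dev < 1e-12
```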

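But p2 and p3 are not uniform: their values pile up toward the low end of their ranges (in the summary, the mean sits above the median for both).

[Figure: distributions of p]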

Is there a more effective way to achieve a uniform distribution for each probability while ensuring that their sum = 1?

(This is the closest I can get to answering my question, but it does not entirely help with what I am trying to achieve: https://stats.stackexchange.com/questions/289258/how-to-simulate-a-uniform-distribution-of-a-triangular-area)

Answer by Ottie

With these ranges, the sum = 1 constraint will introduce a distortion: you can't have all 3 variables uniformly distributed. An intuitive way to see this is that, if the variables were drawn independently from their uniform distributions, all combinations of p1, p2 and p3 would be equally likely, but we know this is not the case because some combinations are made impossible by your constraint. If you generate any type of distribution and then reject parts of it based on some criterion, you almost certainly won't have the same type of distribution left.
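
A quick way to check this for the ranges in the question is to compare means. The mean of U(a, b) is (a + b) / 2, and if three variables always sum to 1, their means must also sum to 1. The sketch below does that arithmetic; the names `ranges` and `marginal_means` are only illustrative.

```r
# Ranges from the question, one row per probability
ranges <- rbind(p1 = c(min = 0.25, max = 0.75),
                p2 = c(min = 0.25, max = 0.75),
                p3 = c(min = 0.00, max = 0.25))

# The mean of U(a, b) is (a + b) / 2
marginal_means <- rowMeans(ranges)

# The means total 1.125, but a sum fixed at 1 needs them to total 1
sum(marginal_means)
```

Since 0.5 + 0.5 + 0.125 = 1.125, at least one of the three marginals has to deviate from its stated uniform distribution, whatever sampling scheme is used.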

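Since your p1 and p3 can never sum to more than 1, you could draw those two and then calculate p2 as the remainder, knowing that p2 will never be negative. This ensures that p1 and p3 are drawn from uniform distributions, but not p2: its extreme values will be underrepresented, because they are compatible with only a few of the possible values of p1 and p3. Note that p2 can still fall below your minimum of 0.25 (whenever p1 + p3 > 0.75), so if that bound matters you would have to discard those draws, at which point p1 and p3 are no longer exactly uniform either.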

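# n_draws as defined in the question (1e5)
p1s <- runif(n_draws, 0.25, 0.75)   # uniform by construction
p3s <- runif(n_draws, 0, 0.25)      # uniform by construction
p2s <- 1 - p1s - p3s                # remainder, so each row sums to 1
hist(p2s)                           # not uniform: extreme values are rarer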
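
If the lower bound of 0.25 on p2 has to hold, one option is to reject the draws that violate it. The following is a minimal sketch under that assumption, not the only way to do it; `keep` and `accepted` are illustrative names. The accepted points are spread uniformly over the feasible region, but the price is that p1 and p3 are no longer exactly uniform on their own, because large p1 values paired with large p3 values get removed.

```r
set.seed(1)
n_draws <- 1e5

# Draw p1 and p3 from their own ranges, then let p2 absorb the remainder
p1 <- runif(n_draws, 0.25, 0.75)
p3 <- runif(n_draws, 0.00, 0.25)
p2 <- 1 - p1 - p3

# p2 <= 0.75 always holds here; p2 >= 0.25 fails whenever p1 + p3 > 0.75
keep <- p2 >= 0.25
accepted <- data.frame(p1 = p1[keep], p2 = p2[keep], p3 = p3[keep])

# Share of draws kept (about three quarters in expectation)
mean(keep)

# Observed minimum and maximum of each column after rejection
sapply(accepted, range)
```

To end up with a fixed number of rows, you would draw more than you need and keep the first `n` accepted ones.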
