Is there an easy Python package solution for the following problem?
I have three variables; let's say they follow a trivariate normal distribution, so the means and the covariance matrix are known. Is there a simple way to simulate Z values when X and Y are given?
I have already coded a bivariate simulation using a Gibbs sampler. It can be extended to sample from a multivariate distribution, but that is a bit confusing and annoying. Is there really no easier way to sample from conditional distributions?
Thank you.
Here is what I tried; I believe it works. But I would rather use a package directly, without doing my own calculations:

    import numpy as np
    from scipy import stats  # explicit submodule import so multivariate_normal resolves


    class GibbsMultivariateSampler:
        def __init__(self, data: np.ndarray, means: np.ndarray, covariance: np.ndarray):
            self.means = means
            self.covariance = covariance
            self.data = data  # observed values used for conditioning

        def conditioned_mean(self, x: np.ndarray):
            # The variable to simulate is the one at index len(x).
            _h = len(x)
            conditioned_means = []
            for i, element in enumerate(x, start=1):
                # x[0] is paired with index _h - 1, x[1] with _h - 2, and so on.
                _l = _h - i
                # One single-variable regression correction per observed value;
                # covariances among the observed variables themselves are not used.
                conditioned_mean = (
                    self.covariance[_h, _l] / self.covariance[_l, _l]
                ) * (element - self.means[_l])
                conditioned_means.append(conditioned_mean)
            return self.means[_h] + sum(conditioned_means)

        def conditioned_covariance(self, x: np.ndarray):
            _h = len(x)
            conditioned_covariances = []
            for i, _ in enumerate(x, start=1):
                _l = _h - i
                # Variance reduction attributed to each observed variable separately.
                conditioned_covariance = -1 * (
                    self.covariance[_h, _l] ** 2 / self.covariance[_l, _l]
                )
                conditioned_covariances.append(conditioned_covariance)
            return self.covariance[_h, _h] + sum(conditioned_covariances)

        def simulate(self, size):
            conditioned_distribution = stats.multivariate_normal(
                mean=self.conditioned_mean(self.data),
                cov=self.conditioned_covariance(self.data),
            )
            return conditioned_distribution.rvs(size=size)


    mean4 = np.array([2, 3, 4, 5])
    cov_matrix4 = np.array([[1, 0.5, 0.3, 0.2],
                            [0.5, 1, 0.4, 0.1],
                            [0.3, 0.4, 1, 0.15],
                            [0.2, 0.1, 0.15, 1]])

    # simulate Z given X, Y
    sampler = GibbsMultivariateSampler(data=np.array([10, 20]), means=mean4, covariance=cov_matrix4)
    simulation = sampler.simulate(1000)

    # simulate W given X, Y, Z
    sampler = GibbsMultivariateSampler(data=np.array([10, 20, 5]), means=mean4, covariance=cov_matrix4)
    simulation = sampler.simulate(1000)

I'm operating on the assumption that you want to know how to generate Gaussian random variables with the correct mean and variance/covariance structure, and that the focus on conditional increments came from your use of Gibbs sampling. You can avoid that by using Cholesky decomposition, following the mathematical formulation described in this answer to a similar question.
A quick summary: let M be the vector of means, V the variance/covariance matrix, and L a lower triangular matrix such that LLᵀ = V. The matrix L can be obtained by Cholesky factorization. Given a vector Z of iid standard Gaussians, X = LZ + M yields a vector of Gaussians with the desired mean and variance/covariance structure. This is the matrix equivalent of the familiar univariate transformation X = σZ + μ, with L playing the role of σ, the square root of the variance.
The following is a Python/NumPy implementation of that algorithm. Note that even though your subject line said "trivariate", I've used the 4x4 covariance matrix and corresponding mean vector from your code.
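A minimal sketch of the X = LZ + M transformation, assuming NumPy's `np.linalg.cholesky` and `np.random.default_rng`. The function name `sample_mvn`, the seed, and the sample count are illustrative choices, not part of the original post; the mean vector and covariance matrix are the ones from the question.

```python
import numpy as np

# Mean vector and covariance matrix taken from the question's code.
M = np.array([2.0, 3.0, 4.0, 5.0])
V = np.array([[1.0, 0.5, 0.3, 0.2],
              [0.5, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.15],
              [0.2, 0.1, 0.15, 1.0]])


def sample_mvn(mean, cov, size, rng):
    """Draw `size` samples X = L Z + M, where L is lower triangular and L L^T = cov."""
    L = np.linalg.cholesky(cov)                  # lower triangular Cholesky factor
    Z = rng.standard_normal((size, len(mean)))   # iid standard normals, one row per sample
    return Z @ L.T + mean                        # row-wise form of L @ z + M


rng = np.random.default_rng(12345)  # arbitrary seed, for reproducibility only
samples = sample_mvn(M, V, size=100_000, rng=rng)

# The empirical mean and covariance should be close to M and V.
print(samples.mean(axis=0))
print(np.cov(samples, rowvar=False))
```

Storing one sample per row means the column-vector formula L @ z + M becomes `Z @ L.T + mean`, which lets a single matrix product generate all samples at once.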
If you actually do want to generate X_k given X_1, ..., X_{k-1}, this approach can be inverted: algebraically determine the corresponding standard normals Z_1, ..., Z_{k-1}, generate a value for Z_k, multiply the extended Z vector by the kth row of L, and add M_k.
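The inversion described above can be sketched as follows. This is one possible reading of those steps, not code from the original answer: the function name `sample_next_given_previous` is hypothetical, `scipy.linalg.solve_triangular` is used for the forward substitution, and at least one known value is assumed. The observed values 10 and 20 are the ones from the question.

```python
import numpy as np
from scipy.linalg import solve_triangular

# Mean vector and covariance matrix taken from the question's code.
M = np.array([2.0, 3.0, 4.0, 5.0])
V = np.array([[1.0, 0.5, 0.3, 0.2],
              [0.5, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.15],
              [0.2, 0.1, 0.15, 1.0]])


def sample_next_given_previous(x_known, mean, cov, size, rng):
    """Sample the next coordinate given the first len(x_known) coordinates."""
    x_known = np.asarray(x_known, dtype=float)
    j = len(x_known)                 # number of known values; target has 0-based index j
    L = np.linalg.cholesky(cov)
    # Recover the standard normals behind the known values by solving
    # L[:j, :j] @ z_known = x_known - mean[:j] (forward substitution).
    z_known = solve_triangular(L[:j, :j], x_known - mean[:j], lower=True)
    z_new = rng.standard_normal(size)  # fresh standard normals for the new coordinate
    # Row j of L times the extended Z vector, plus the mean of the target.
    return mean[j] + L[j, :j] @ z_known + L[j, j] * z_new


rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only
z_draws = sample_next_given_previous([10.0, 20.0], M, V, size=200_000, rng=rng)

# Closed-form conditional mean and variance, for comparison with the draws.
A = V[:2, :2]
cond_mean = M[2] + V[2, :2] @ np.linalg.solve(A, np.array([10.0, 20.0]) - M[:2])
cond_var = V[2, 2] - V[2, :2] @ np.linalg.solve(A, V[:2, 2])
print(z_draws.mean(), cond_mean)
print(z_draws.var(), cond_var)
```

Only the leading rows of L are needed, so the same function draws Z given X, Y or W given X, Y, Z depending on how many known values are passed in.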