Truncated Maximum Likelihood estimator

Question

Truncated Maximum Likelihood estimator

146 Views Asked by Sanaullah Khan At 03 October 2023 at 07:09

I am trying to estimate the parameters of Truncated New Generalized Poisson Lindly distribution.

Here in this distribution, we use two parameters which are alpha1, beta1, When I run Optimum function then I should get the values of estimators alpha1, beta1.

I get those values in R, the value of alpha1 I get is correct, But the problem is that I am getting the incorrect value of beta1.

I have tried to run this in R, but the estimate of beta1 is incorrect.

## Data Generation
## PMF of New Generalized Poisson Lindley distribution

marginal=function(alpha1, beta1,x1){
  ab=(alpha1^2*(1+alpha1+beta1+beta1*x1))/((alpha1+beta1)*(alpha1+1)^(x1+2))
  return(ab)
}

alpha1=0.2;beta1=3;Tr=3

t1=c();
for(i in 1:10000){
  a=0:1000
  pr1=marginal(alpha1,beta1,a)
  y1=sample(a,1,replace=TRUE,prob=pr1);
  t1[i]=y1;
}
t1
##_________________###_____________________##___________
length(t1)
summary(t1)
hist(t1)
# Mean and Variance of Untruncated NGPLD

#Mean of Untruncated NGPLD 
U1=function(alpha1,beta1){
  U1=(alpha1+2*beta1)/(alpha1*(beta1+alpha1))
  return(sum(U1))
}
U1(alpha1,beta1);mean(t1)

# Variance of Untruncated NGPLD
var1=function(alpha1,beta1){
  var1=(2*beta1^2*(1+alpha1)+alpha1^2*(1+alpha1)+alpha1*beta1*(4+3*alpha1))/(alpha1^2*(alpha1+beta1)^2)
  return(sum(var1))
}
var1(alpha1,beta1);var(t1)

#Truncated data
Tx=t1[t1>Tr];Tx
hist(Tx)
length(Tx)
#Truncated NEw genralized poisson lindley Disrtribution ,where
#Truncated NGPLD=(PMF/1-Cf),where (1-Cf) is a Survival function

#survival Function
x=Tr
cf=((alpha1+beta1)*(1+alpha1)^(x+2)-(alpha1+alpha1^2+2*alpha1*beta1+alpha1*beta1*x+beta1))/((alpha1+beta1)*(1+alpha1)^(x+2))
1-cf

#Truncated NEw genralized poisson lindley Disrtribution 
UT <-function(x1,alpha1,beta1,Tr){
  x=Tr
  cf=((alpha1+beta1)*(1+alpha1)^(x+2)-(alpha1+alpha1^2+2*alpha1*beta1+alpha1*beta1*x+beta1))/((alpha1+beta1)*(1+alpha1)^(x+2))
  numerator =alpha1^2 * (1 + alpha1 + beta1 + beta1 * x1)
  denominator = (alpha1 + beta1) * (alpha1 + 1)^(x1 + 2)
  result=numerator/denominator
  result1=result/(1-cf)
  return(result1)
}
UT(4:100,0.2,3,3)
#sum of probilities of Truncated NGPLD is one
sum(UT(4:100,0.2,3,3))
plot(Tx,UT(Tx,alpha1,beta1,Tr))

#Mean and Varince of Truncated NGPLD

#mean of Truncated NGPLD
f1=function(alpha1,beta1,Tr){
  x1=Tr+1:100
  f1=(x1*alpha1^2*(1+alpha1+beta1+beta1*x1))/((alpha1+beta1)*(alpha1+1)^(x1+2))
  return(sum(f1))
}
X=f1(alpha1,beta1,Tr)/(1-cf)
X;mean(Tx)

#varince of Truncated NGPLD 

f3=function(alpha1,beta1,Tr){
  x1=Tr+1:100
  f3=(x1*(x1-1)*alpha1^2*(1+alpha1+beta1+beta1*x1))/((alpha1+beta1)*(alpha1+1)^(x1+2))
  return(sum(f3))
}

vx=f3(alpha1,beta1,Tr)/(1-cf)+f1(alpha1,beta1,Tr)/(1-cf)-(f1(alpha1,beta1,Tr)/(1-cf))^2
vx;var(Tx)



#likelihood function of Truncated NGPLD
lik=function(par,data){
  x1<-data
  alpha1=par[1];beta1=par[2];
  ll=-sum(log(UT(Tx, alpha1, beta1,Tr)))
  return(ll)}

# MLE Of Truncated NGPLD
pp=optim(par=c(alph1=1,beta1=2),fn=lik,data=Tx,
         control = list(parscale=c(alph1=1,beta1=1)))
#estimated parameter 
ap1=pp$par[1];bt1=pp$par[2];ap1;bt1

Original Q&A

There are 1 best solutions below

**Sandipan Dey** · Answer 1 · 2023-10-03T23:20:08.907000

Just changed your negative log-likelihood function to use x1 instead of Tx and it worked for me

nllik <- function(par, data, Tr) { # optionally pass additional constant param which is not an optimization variable
  x1 <- data
  alpha1 <- par[1]
  beta1 <- par[2]
  ll <- -sum(log(UT(x1, alpha1, beta1, Tr)))  # change Tx to x1
  return(ll)
}

# set.seed(1) before data generation to reproduce
pp <- optim(par = c(alph1=1, beta1=2), fn = nllik, data = Tx, Tr = Tr,
                   control = list(parscale = c(alph1=1, beta1=1)))
#estimated parameters
ap1 = pp$par[1]; bt1 = pp$par[2]
ap1; bt1
# alph1 
# 0.20076 
# beta1 
# 2.845234

The next plot shows the negative log-likelihood function surface as a function of alpha and beta, the function optim() tries to minimize the function, starting from the initial point.

library(plot3D)
m <- 500
n <- 25
x <- seq(0.1, 0.3, length=m) 
#x <- c(seq(-1, -0.01, length=25), seq(0.01, 1, length=25))
y <- seq(2, 4, length=n)
z <- matrix(rep(0, m*n), nrow=m)
for (i in 1:length(x)) {
  for (j in 1:length(y)) {
    z[i,j] <- llik(c(x[i], y[j]), Tx, Tr)
  }
}
persp3D(x, y, z, #col=colvec, #col=jet.col(256), theta = 30, phi = 30, expand = 0.5,
        ticktype='detailed', shade=0.05, contour=TRUE, lighting=TRUE, #curtain=TRUE,
        xlab="alpha", ylab="beta",zlab="nll", main='NLL')

Here is the distribution with MLE fitted parameters, the density plotted on data histogram:

Tx = sort(Tx)
hist(Tx, probability = TRUE, main = 'data & MLE fit', col = 'gray')
lines(Tx,UT(Tx,ap1,bt1,Tr), col='red')
legend("topright", legend=c("data", "MLE"), lty = c(NA, 1), border = c("gray", NA), fill = c("gray",NA), col = c(NA, 'red'), x.intersp=c(-1.5,0.5))

EDIT

Take a closer look at the contour of the NLL function, which we are aiming to minimize:

filled.contour(x, y, z, xlab="alpha", ylab="beta",zlab="nll", main='NLL contour')

Note that the nllik function is very highly sensitive with the change in value of alpha, but less sensitive with the change in the value of beta beyond a point.

beta <- seq(0.1, 10,length.out=100)
alpha <- c(0.19, 0.2, 0.21)

df <- NULL
for (a in alpha) {
  for (b in beta) {
    df <- rbind(df, data.frame(alpha=a, beta=b, nll=llik(c(a,b), Tx, Tr=3)))
  }
}
head(df)
library(ggplot2)
df$alpha <- as.factor(round(df$alpha,3))
ggplot(df, aes(beta, nll, group=alpha, color=alpha)) + geom_line(lwd=1.25) + 
  scale_color_manual(values=c('red', 'green', 'blue')) +
  theme_bw()

Note that the function keeps on decreasing (though the rate of decrease is small) till beta=10 when alpha=0.205, that's why optim() finds high beta values for this particular value of alpha (is it as per expectation? if not, you may want to double-check the function UT() for the computation of the pmf for the TNGPL distribution from the inverse CDF)

This is a good resource to learn more about

A New Generalized Poisson-Lindley Distribution

It's good to include a reference for a distribution which is not common.

Truncated Maximum Likelihood estimator

There are 1 best solutions below

EDIT

Related Questions in R

Related Questions in STATISTICS

Related Questions in POISSON

Related Questions in MLE

Related Questions in PROBABILITY-DISTRIBUTION

Trending Questions

Popular # Hahtags

Popular Questions