How to code this log-likelihood function in R


An ordered logit model is given:

    P(y_i <= k) = exp(alpha_k - x_i * beta_1) / (1 + exp(alpha_k - x_i * beta_1))

Consider the log-likelihood function of this model: the sum over all observations of

    log( P(y_i <= k) - P(y_i <= k - 1) )

evaluated at the category k that observation i actually falls in.

Also given is that k = {0,1,2,3,4}, where for k = 0 and k = 4 we have alpha_0 = -Inf and alpha_4 = +Inf respectively. My question is: how can I code a function that optimizes this log-likelihood, i.e. finds the MLE of beta = beta_1 and of alpha_k for k = 1, 2, 3?

Here is my try:

J <- length(Poss)

log_likelihood <- function(params, Poss = Poss){
  beta <- params[1]
  alpha <- c(-Inf, params[2:4], +Inf)
  
  for (k in 2:4) {
    for (j in J) {
      delta[k] <- alpha[k] - Poss[j] * beta
      ll <- delta[k] - log(1 + exp(delta[k]) - delta[k] + log(1 + exp(delta[k])))
      return(ll)
    }
  }
  return(sum(ll))
}

guess_optim <- c(0,0,0,0) #for beta1 and alpha1,2,3

optim(guess_optim, log_likelihood, Poss = Poss)

However, this does not work. How can I fix it?

1 Answer

Answered by PBulls

I think the very first issue is that you return(ll) inside the inner loop, so the function returns after computing only the first value for the very first record -- it never reaches return(sum(ll)). The second issue is that each observation should only contribute the likelihood term for the category it is actually in, and you don't make that distinction. Finally, optim minimizes by default, so you should return the negative of the value you want to maximize.
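To make that concrete, here is a minimal sketch of your own loop with those three fixes applied. The y vector (each observation's observed category, coded 1 through 4) is my assumption, since it doesn't appear in your code; the tested implementation further below is what I'd actually use:

## Sketch only: your loop structure with the three fixes above
## (assumes y holds the observed category 1..4 of each observation)
log_likelihood <- function(params, Poss, y) {
  beta  <- params[1]
  alpha <- c(-Inf, params[2:4], Inf)   ## alpha_0 .. alpha_4
  ll <- 0
  for (j in seq_along(Poss)) {
    delta_hi <- alpha[y[j] + 1] - Poss[j] * beta   ## threshold above the observed category
    delta_lo <- alpha[y[j]]     - Poss[j] * beta   ## threshold below it
    ## accumulate log P(Y_j = y_j) instead of returning inside the loop
    ll <- ll + log(plogis(delta_hi) - plogis(delta_lo))
  }
  -ll   ## return the negative, because optim() minimizes
}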

To test a full implementation, let's first obtain some example data (the UCLA ologit example dataset, loaded from the URL below):

## Example data
dat <- foreign::read.dta("https://stats.idre.ucla.edu/stat/data/ologit.dta")

Y <- as.integer(cut(dat$gpa, 4))  ## Bin gpa into 4 ordered categories (equal-width intervals)
X <- dat$public                   ## Binary (dummy) predictor; a continuous one works too

## This is the result we'll be reproducing
MASS::polr(as.factor(Y) ~ X)

#> Coefficients:
#>        X 
#> 1.178314 
#> 
#> Intercepts:
#>         1|2         2|3         3|4 
#> -2.43862931 -0.06816316  2.05272249 
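If you want those reference values as R objects rather than printed output (handy for comparing against optim later), keep the fitted model around; fit_polr is just a name I'm picking here:

fit_polr <- MASS::polr(as.factor(Y) ~ X)
coef(fit_polr)    ## slope for X
fit_polr$zeta     ## the three intercepts (thresholds)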

Now, here's an implementation that will get you roughly the same parameter estimates:

## Logit link inversion
expit <- function(x) 1/(1+exp(-x))

## -log-likelihood function
ll <- function(params, X, Y) {
  alpha <- params[-1]
  beta <- params[1]
  
  ll <- unlist(mapply(\(x, y) {
    eta <- alpha - x*beta
    mu <- expit(eta)
    ## Computes expit(delta_k) - expit(delta_{k-1}) on the probability scale, clipped away from exactly 0 and 1
    probs <- pmin(pmax(diff(c(0, mu, 1)), .Machine$double.eps), 1)
    ## Only use the probability of the observed category!
    log(probs[y])
  }, x=X, y=Y))
  ## Return the negative so that minimizing it maximizes the log-likelihood
  -sum(ll)
}

guess <- c(1, -2, 0, 2)
optim(guess, ll, X=X, Y=Y)
#> $par
#> [1]  1.17991554 -2.43853589 -0.06775819  2.05298068
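Not something you asked about, but since ll() returns a negative log-likelihood you can also ask optim for the Hessian and invert it to get rough standard errors; a sketch:

fit <- optim(guess, ll, X = X, Y = Y, hessian = TRUE)
se  <- sqrt(diag(solve(fit$hessian)))   ## inverse Hessian of the negative log-likelihood
cbind(estimate = fit$par, se = se)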

Some notes:

  • I calculated the likelihood on the probability scale and then took the log, rather than directly using the formula you were given (simple algebra will show they are the same). Rather than setting the log-odds to (negative) infinity, I set the corresponding probabilities to 0/1.
  • You need to provide decent starting values for this to work. In particular, the intercepts should already be distinct and in the correct order (always increasing or decreasing, depending on how you want to order your categories); starting with all of them equal will cause problems.
  • This relies on Y being numeric and sequential starting from 1, and it only handles a single numeric X predictor; a proper fitting routine should allow much more flexibility (e.g. converting the levels of Y to this ordered sequence internally, as in the sketch below).
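On that last point, here is a minimal sketch of the kind of preprocessing a wrapper could do before calling ll(); the grade variable is made up, and the conversion simply uses the fact that as.integer() on a factor returns its level codes 1..K:

## Hypothetical ordered outcome on an arbitrary scale
grade <- factor(c("low", "mid", "high", "mid"), levels = c("low", "mid", "high"))
y_int <- as.integer(grade)   ## level codes 1, 2, 3, following the order of levels()
K     <- nlevels(grade)      ## number of categories, so K - 1 free intercepts to estimate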