Missing data / uncertainty in a GAMM random-effect variable

I am facing an issue with a GAMM that I am attempting to fit.

I am modeling fish catch in response to several variables (see the model below), using mgcv's gamm() function.

mod <- gamm(log_catch ~ gear_type + s(effort, k = 65) + s(doy, bs = 'cc', k = 20) +
              s(year, k = 22) + s(habitat, k = 35) + s(X, Y, bs = 'ts', k = 60),
            random = list(boat_id = ~1, municipality = ~1))

The problem is primarily with my random effect, boat_id, which carries considerable uncertainty. Its purpose is to account for boat-to-boat variation in catch, since individual boats appear repeatedly throughout the dataset.

I've attempted to generate unique boat IDs in three ways: from the boat's name alone, from name and length combined, and from name, length, and municipality combined. The issue is that many boats in the dataset could share the same name (I am unsure how often this occurs), and the reported boat length is generally unreliable. For example, a fisher might report 11 m one day and 11.2 m the next time they are asked about the same boat, which creates two different boat IDs for what is actually one boat.
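To make the ID construction concrete, here is a minimal sketch of the name + length + municipality approach, with the reported length rounded to the nearest metre so that 11 m and 11.2 m map to the same boat. The data frame `trips` and its columns `boat_name`, `boat_length_m`, and `municipality` are hypothetical stand-ins, not the real dataset, and rounding is only one possible way to absorb reporting noise (it still splits reports that straddle a bin edge, such as 11.4 m and 11.6 m).

```r
# Hypothetical example data; the column names are assumptions for illustration.
trips <- data.frame(
  boat_name     = c("Maria ", "maria", "MARIA", "Santa Ana"),
  boat_length_m = c(11, 11.2, 14.6, 8.4),
  municipality  = c("A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Normalise names: strip surrounding whitespace and ignore case.
clean_name <- tolower(trimws(trips$boat_name))

# Bin the reported length to the nearest metre to absorb small reporting noise.
length_bin <- round(trips$boat_length_m)

# Composite ID from name, binned length, and municipality.
trips$boat_id <- factor(paste(clean_name, length_bin, trips$municipality, sep = "_"))
```

With this construction the first two rows share one ID, while the third row (same name, different length and municipality) gets its own. Whether that third row really is a different boat is exactly the uncertainty described above.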

My question is: how can I account for the uncertainty introduced by whichever method I use to generate this grouping variable? Should I treat it as missing data and try to impute the boat IDs somehow?

I am primarily using mgcv's gamm() and brms's brm() to fit the models.

Any help or resources to answer this question are appreciated.
