I have a continuous variable that ranges from 0 to 1 (percentage data, including 0s), and I want to determine the best distribution to model it. I'm working in RStudio, data in question here. Note that about 27% of observations are 0, and I do plan on exploring zero inflation as I go.
I checked the histogram and ECDF (see below) to get an idea of what I'm dealing with. fitdistrplus suggested 'beta', while gamlss suggested a Pareto Type 2, which I'm not very familiar with.
I've determined the parameters of a beta distribution and fit it, and used the KS test to check a few other distributions, but I'm stuck on that Pareto Type 2. The problem: all my attempts at estimating location and scale fail. As far as I can tell, that's because of the zeroes in the dataset. It works if I add a tiny amount to the entire dataset (e.g. 0.0001), but honestly I'm not sure that is a good solution, and it would make comparing it to anything else a living hell. I tried EnvStats, VGAM, and CaDENCE, and all give me errors. So, I humbly come here in the hopes that someone can suggest another option for estimating the Pareto Type 2 parameters for that dataset.
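
To make the zero problem concrete, here is a minimal sketch of a direct maximum-likelihood fit in base R. It assumes the Lomax parameterisation of the Pareto Type 2 with the location fixed at 0, whose density (shape/scale) * (1 + x/scale)^-(shape + 1) is defined at x = 0, so no constant needs to be added. The function name `lomax_nll`, the simulated data, and the starting values are illustrative assumptions, not the actual dataset or any package's API; other packages may parameterise the distribution differently.

```r
# Negative log-likelihood of a Pareto Type 2 (Lomax) with location fixed at 0.
# Parameters are optimised on the log scale so shape and scale stay positive.
# (lomax_nll is a hypothetical helper written for this sketch.)
lomax_nll <- function(par, x) {
  shape <- exp(par[1])
  scale <- exp(par[2])
  -sum(log(shape) - log(scale) - (shape + 1) * log1p(x / scale))
}

# Simulated stand-in for the real data (assumed values, for illustration only).
set.seed(1)
n <- 2000
true_shape <- 3
true_scale <- 0.5
x <- true_scale * (runif(n)^(-1 / true_shape) - 1)  # inverse-CDF Lomax draws
x <- round(x, 2)                                    # rounding creates exact zeros

# Zeros contribute log1p(0) = 0, so the likelihood stays finite.
fit <- optim(c(0, 0), lomax_nll, x = x)
est <- exp(fit$par)
names(est) <- c("shape", "scale")
print(est)
```

If something like this runs cleanly on the real data, `-fit$value` is the maximised log-likelihood on the untransformed scale, which may be easier to compare with other candidate fits than a version computed on shifted data.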


You can consider the following approach: