Fitting data with a three-parameter Weibull distribution function

169 Views Asked by At

In my data, the first column is the x values, and the second column is the y values. I want to fit the data with a three-parameter Weibull function to describe the distribution.

I tried to find solutions by many searches, but a similar post on Stack Overflow seems to be only for one-column data. I also tried with the libraries scipy.stats.exponweib and scipy.stats.weibull_min, but failed due to the same problem.

So I tried to define the function and to fit the data with scipy.optimize.curve_fit. The following is the code I used. The data.txt file can be downloaded by this link.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Load data
poresize, psd, psd_std = np.loadtxt("data.txt", unpack=True)

# Define the Weibull distribution function with the requested form
def weibull_func(x, a, b, c):
    return a * b * (x - c) ** (b - 1) * np.exp(-a * (x - c) ** b)

# Perform curve fitting
popt, pcov = curve_fit(weibull_func, poresize, psd, p0=[1, 1, 1])

# Plot the original data and the fitted curve
plt.scatter(poresize, psd, label='Data')
x_range = np.linspace(min(poresize), max(poresize), 100)
plt.plot(x_range, weibull_func(x_range, *popt), 'r-', label='Fitted curve (Weibull)')
plt.xlabel('Particle Size')
plt.ylabel('PSD')
plt.title('Fitting Weibull Distribution')
plt.legend()
plt.grid(True)
plt.show()

# Display the optimized parameters
a_opt, b_opt, c_opt = popt
print("Optimized Parameters:")
print("a:", a_opt)
print("b:", b_opt)
print("c:", c_opt)

The function form is adapted from the Weibull distribution from Matlab, but x is replaced with x-c. The problem is that I tried many combinations of the initial guess values for p0, but none of them could give me a satisfactory result.

Could you please tell me whether there is something fundamentally wrong with the code? Or how can I get the proper initial guess fast?

1

There are 1 best solutions below

6
jared On BEST ANSWER

Two pieces of advice when fitting a function:

  1. Understand the function you're trying to fit. In this case, you should realize that it is undefined for x < c when b is not an integer. (This is because you have (x-c)^(b-1). If x < c then x - c < 0. If b is not an integer then you will be taking a fractional power of a negative number, which produces a complex number.)
  2. Plot your data and play around with a model of your function to get a good guess. In this case, I plotted your data and saw that the spike is between x = 2 and x = 5 and it spikes up to y = 1. Knowing that, I made this Desmos model and played with the variables until I got something decent.

Here is the fixed function that can handle x < c:

def weibull_func(x, a, b, c):
    res = np.zeros_like(x)
    cond = x > c
    xcond = x[cond]
    res[cond] = a*b*(xcond - c)**(b - 1)*np.exp(-a*(xcond - c)**b)
    return res

I provided the curve_fit call with the guess I got using Desmos:

popt, pcov = curve_fit(weibull_func, poresize, psd, p0=[1.4, 1.93, 2.4])

The tuned parameters are:

a = 1.1172264863332277
b = 2.2231297860177226
c = 2.301372954142873

Resulting plot: