I am new in optimization and would like to write code that will determine the values of a1, a2, b1, b2 for a predefined function f_model based on the available inputs X1_data X2_data and output Y_list.
The code will be used on multiple datasets with available I/O and must meet 3 constraints:
- Levenberg-Marquardt algorithm must be used i.e.,
curve_fit(..., method='lm',...)(based on scipy documentationboundscan not be used withmethod='lm') 1 < b1 < a1 < a2 < b2y=((a2+(a1-a2)/(1+((x1/x2/b2)**b1)))*x2)/1000
Here is my code:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
def f_model(x_data, b1, a1, a2, b2):
x1, x2 = x_data
if 1 < b1 < a1 < a2 < b2:
return 1e-3 * (a2 + (a1 - a2) / (1 + ((x1 / x2 / b2) ** b1))) * x2
else:
return np.inf
X1_data = ...
X2_data = ...
Y_list = ...
X_axis = [i + 1 for i in range(X1_data.shape[0])]
for Y_data in Y_list:
Y_data = np.array(Y_data)
# if np.min(Y_data) > 1:
# b1_guess = 1.01
# a1_guess = np.min(Y_data)
# a2_guess = np.max(Y_data) * 10
# b2_guess = np.mean(X1_data / X2_data)
# guess = [b1_guess, a1_guess, a2_guess, b2_guess]
# else:
# guess = [1.01, 100.0, 700.0, 10000.0]
guess = [3.5, 100.0, 7300.00, 30000.00]
# bounds = ([1.0, guess[0], guess[1], guess[2]], [np.inf, np.inf, np.inf, np.inf])
popt, pcov = curve_fit(f_model, (X1_data, X2_data), Y_data, p0=guess, method='lm', maxfev=5000)
print(f"Guess {guess}")
print(f"Optimal {popt}")
optimal_curve = f_model((X1_data, X2_data), *popt)
plt.scatter(X_axis, Y_data, label='Data')
plt.plot(X_axis, optimal_curve, label='Fitted Curve')
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.legend()
plt.show()
The problem is that I couldn't figure out how to properly set the values for guess based on I/O. By changing the values of guess, I got several popt, from which it is difficult to determine the optimal one.
I have dataset1 to validate the code (the values of a1, a2, b1, b2 are known). Whenever I try different combinations of guess, the curve fits the data well but popt does not provide b1=3.98; a1=98.4; a2=7337.03; b2=29404.80.
If anyone could help me edit the code so that for dataset1, popt will be b1 = 3.98; a1 = 98.4; a2 = 7337.03; b2 = 29404.80. I will be happy!
ps. As soon as, code will be validated, I would like to use it for dataset2 as well.
dataset1 Y_list updated (13/12/2023)
X1_data = np.array([2894648.773, 3157919.698, 3474257.841, 3800193.356, 4156493.087, 4590298.639, 5042658.177])
X2_data = np.array([67.28579923, 69.52170573, 71.68019425, 73.68463411, 75.5179478, 77.1563269, 78.52232434])
Y_list = [[411.1179, 441.5084, 471.0992, 497.1712, 520.8332, 542.983, 560.8622]]
b1=3.98; a1=98.4; a2=7337.03; b2=29404.80
dataset2
X1_data = np.array([240228, 259164, 267785, 300530, 339812])
X2_data = np.array([32.188719, 32.710433, 33.243463, 33.80985, 34.353048])
Y_list = [[23.79969728, 25.34667811, 26.88872834, 30.86784396, 34.04270604],
[2.70371836, 3.33359208, 3.49283256, 3.9231444, 4.55001784],
[0.65156594, 0.66139181, 0.50957883, 0.46295474, 0.51407824]]