I am trying to generate approximately 5000 distinct curves that pass through a predetermined set of points, representing layer interfaces within the earth. It is essential that these curves do not exhibit significant oscillations or abrupt changes between nearby points, as the internal structure of the Earth does not undergo such drastic changes. All the curves must pass through the given set of points.
At present, my workflow involves randomly generating points between the provided data points and attempting to fit curves to them, resulting in various curves. However, I am interested in exploring more efficient approaches to achieve this. I am also seeking recommendations for interpolation methods that can produce smooth behavior between points and avoid oscillatory tendencies.
Appreciate any guidance!
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
x = [7.81, 21.65, 186.29, 246.62, 402.74, 446.24, 572.99, 585.11, 613.57, 762.97]
y = [26.27, 26.41, 26.46, 27.36, 34.32, 34.50, 37.94, 39.05, 39.23, 39.88]
# y can only range between 25 and 40
fig = plt.figure(figsize=(20,5))
xnew = np.linspace(min(x), max(x), num=50)
oned = interpolate.CubicSpline(x, y)
yfit = oned(xnew)
plt.scatter(x, y, color='red')
plt.plot(xnew, yfit)
plt.xlim(0, 800)
plt.ylim(20,45)
plt.gca().invert_yaxis()
plt.show()
Example of one such curve:

What you are trying to achieve reminded me of a textbook example of noise-free Gaussian Process (GP) regression. GPs are an advanced machine learning topic that could be very useful for your use case. Here is a terrific interactive tool for understanding GPs.
As explained there
Let's see how you can use GP regression to accomplish what you want using the scikit-learn library. The code below is made of three functions:
- train_gaussian_process: your observations x and y are used to train the GP.
- generate_samples_from_gp: from the GP we've just trained, we sample 5 functions (in your case it will be 5000). The function values are stored in a 2D numpy array, where the i-th column represents the i-th function.
- draw_samples: draw the observations and the sampled functions.
A critical component of a GP is the covariance function. You asked for smooth functions and you want to avoid oscillatory behavior, so the RBF kernel or the Matern one could be the right choice. The code below uses the Matern kernel. Play around with their parameters to get what you want.
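A minimal sketch of the three functions described above, assuming scikit-learn's GaussianProcessRegressor with a Matern kernel whose parameters are held fixed. The length scale of 100, nu=2.5, the tiny alpha, and the evaluation grid are illustrative assumptions rather than tuned values, so treat them as a starting point:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def train_gaussian_process(x, y, length_scale=100.0, nu=2.5):
    """Fit a practically noise-free GP to the observations."""
    # Assumed hyperparameters: a larger length_scale gives smoother curves.
    kernel = Matern(length_scale=length_scale, nu=nu)
    gp = GaussianProcessRegressor(
        kernel=kernel,
        alpha=1e-10,       # tiny jitter only, so the curves interpolate the data
        normalize_y=True,  # centre and scale y before fitting
        optimizer=None,    # keep the kernel parameters exactly as given
    )
    gp.fit(np.asarray(x).reshape(-1, 1), np.asarray(y))
    return gp


def generate_samples_from_gp(gp, x_eval, n_samples=5, seed=0):
    """Return an array of shape (len(x_eval), n_samples); column i is curve i."""
    return gp.sample_y(np.asarray(x_eval).reshape(-1, 1),
                       n_samples=n_samples, random_state=seed)


def draw_samples(x, y, x_eval, samples):
    """Plot the observations and every sampled curve."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    plt.figure(figsize=(20, 5))
    plt.plot(x_eval, samples, linewidth=0.8)  # one line per column
    plt.scatter(x, y, color="red", zorder=3)
    plt.gca().invert_yaxis()
    plt.show()


x = np.array([7.81, 21.65, 186.29, 246.62, 402.74, 446.24, 572.99, 585.11, 613.57, 762.97])
y = np.array([26.27, 26.41, 26.46, 27.36, 34.32, 34.50, 37.94, 39.05, 39.23, 39.88])

gp = train_gaussian_process(x, y)
# Add the observed x values to the grid so each curve is also evaluated there.
x_eval = np.union1d(np.linspace(0, 800, 201), x)
samples = generate_samples_from_gp(gp, x_eval, n_samples=5)
```

Calling draw_samples(x, y, x_eval, samples) plots the result. Setting n_samples=5000 gives the full set of curves, and lowering length_scale makes them wigglier between observations.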
As you can see from the image below, all the functions will pass through your observations.
You can also get the mean function by typing:
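One way to do this with scikit-learn is the predict method of a fitted GaussianProcessRegressor. The sketch below refits the GP so that it runs on its own; the kernel settings and the grid are assumptions for illustration:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

x = np.array([7.81, 21.65, 186.29, 246.62, 402.74, 446.24, 572.99, 585.11, 613.57, 762.97])
y = np.array([26.27, 26.41, 26.46, 27.36, 34.32, 34.50, 37.94, 39.05, 39.23, 39.88])

# Assumed kernel settings, held fixed (optimizer=None).
gp = GaussianProcessRegressor(kernel=Matern(length_scale=100.0, nu=2.5),
                              alpha=1e-10, normalize_y=True, optimizer=None)
gp.fit(x.reshape(-1, 1), y)

x_eval = np.linspace(0, 800, 201).reshape(-1, 1)
# Posterior mean and pointwise standard deviation on the grid.
y_mean, y_std = gp.predict(x_eval, return_std=True)
# The mean evaluated at the observed x values reproduces the observations.
y_at_obs = gp.predict(x.reshape(-1, 1))
```

The standard deviation shrinks to almost zero at the observations and grows in the gaps between them, which is where the sampled curves differ from one another.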
Here is the resulting plot with 5000 functions!
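Since y can only range between 25 and 40, some sampled curves may have to be discarded. A simple option, which is an assumption on my side and not the only way to handle the constraint, is to draw the 5000 curves and keep those that stay inside the range:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

x = np.array([7.81, 21.65, 186.29, 246.62, 402.74, 446.24, 572.99, 585.11, 613.57, 762.97])
y = np.array([26.27, 26.41, 26.46, 27.36, 34.32, 34.50, 37.94, 39.05, 39.23, 39.88])

# Assumed kernel settings, held fixed (optimizer=None).
gp = GaussianProcessRegressor(kernel=Matern(length_scale=100.0, nu=2.5),
                              alpha=1e-10, normalize_y=True, optimizer=None)
gp.fit(x.reshape(-1, 1), y)

# Evaluate only between the first and last observation.
x_eval = np.linspace(x.min(), x.max(), 150)
samples = gp.sample_y(x_eval.reshape(-1, 1), n_samples=5000, random_state=0)

# Keep the columns (curves) whose values all lie in [25, 40].
inside = ((samples >= 25.0) & (samples <= 40.0)).all(axis=0)
kept = samples[:, inside]
print(f"kept {kept.shape[1]} of {samples.shape[1]} curves")
```

If too few curves survive, draw more samples or use a larger length_scale, which keeps the curves closer to the mean between observations.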
The code above is general, so it is ready to be used for an n-dimensional use case. For instance, below are 5 samples for a 3D case (I have supposed that the data-generating process is x + ln(y), but you can choose whatever you want; I needed this equation to get the z values for the training step).
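A sketch of that 3D case, assuming training points on a regular 5 x 5 grid over [1, 10] x [1, 10] and a fixed Matern kernel. The grid sizes and the length scale are illustrative choices:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Training inputs (x, y) on a coarse grid; z comes from the assumed process x + ln(y).
gx, gy = np.meshgrid(np.linspace(1, 10, 5), np.linspace(1, 10, 5))
X_train = np.column_stack([gx.ravel(), gy.ravel()])
z_train = X_train[:, 0] + np.log(X_train[:, 1])

# Assumed kernel settings, held fixed (optimizer=None).
gp = GaussianProcessRegressor(kernel=Matern(length_scale=3.0, nu=2.5),
                              alpha=1e-10, normalize_y=True, optimizer=None)
gp.fit(X_train, z_train)

# Finer grid on which the sampled surfaces are evaluated.
ex, ey = np.meshgrid(np.linspace(1, 10, 15), np.linspace(1, 10, 15))
X_eval = np.column_stack([ex.ravel(), ey.ravel()])

# Each column is one sampled surface, flattened in the same order as X_eval.
surfaces = gp.sample_y(X_eval, n_samples=5, random_state=0)
first_surface = surfaces[:, 0].reshape(ex.shape)  # back to a 15 x 15 grid for plotting
```

Each reshaped surface can be passed to a 3D plotting routine such as matplotlib's plot_surface together with ex and ey.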