Fitting two populations to measurements using GEKKO how to optimize the first data point

57 Views Asked by At

I have this code to try to fit the sum of two populations to a measurements data-series.

dS/dt = (a - b) * S
dR/dt = (a - b - c)* R
X(t) = S(t) + R(t)

I want also to optimize the first point which instead now is fixed, I had thought to use a new variable called f which stand in between 0 and 1 and that represents the initial ratio between population 1 and population 2. Something like this:

S0 = x_meas[0] * f
R0 = x_meas[0] * (1-f)

Here is the code:

from gekko import GEKKO
m = GEKKO(remote=False)

# data
m.time = [0,0.1,0.2,0.3,0.4,0.5,0.8,0.85,0.9,0.95,1]
x_meas = [1.1,1.2,1.56,1.77,1.89,2.1,2.3,2.5,2.7,2.2,2.1]

# parameters, single value over time horizon
p = m.Array(m.FV,3,lb=0.5,ub=10); a,b,c = p
# turn STATUS On for adjustment by solver
for fv in p:
   fv.STATUS=1

f0 = m.FV(value=0.01, lb=0, ub=1)
f0.STATUS=1
# variables
S,R,X = m.Array(m.Var,3,lb=0)
# change initial conditions
S0,R0 = [0.35,0.65]; X0 = S0+R0
S.value=S0; R.value=R0; X.value=X0

# measured values
Xm = m.Param(x_meas)

# equations
m.Equations([S.dt()==(a-b)*S,
             R.dt()==(a-b-c)*R,
             X==S+R])

# minimize difference to measurements
m.Minimize((X-Xm)**2)

m.options.IMODE = 5   # dynamic estimation
m.options.NODES = 3   # collocation nodes
m.solve(disp=False)   # display solver output

print(f'a: {a.value[0]}, b: {b.value[0]}, c: {c.value[0]}')

import matplotlib.pyplot as plt
plt.figure(figsize=(7,3))
plt.plot(m.time,S.value,'b:',linewidth=3,label='S')
plt.plot(m.time,R.value,'r--',linewidth=2,label='R')
plt.plot(m.time,X.value,'k--',label='X')
plt.plot(m.time,Xm.value,'kx',label='Xm')
plt.legend(); plt.grid(); plt.xlabel('Time')
plt.tight_layout()
plt.savefig('fit.png',dpi=300)
plt.show()
1

There are 1 best solutions below

0
John Hedengren On BEST ANSWER

By default, gekko is set up as an initial value problem where the values at t=0 are fixed and the equations are not solved at that initial point. One way to still fit into this framework with calculated initial values is to set fixed_initial=False and the differential equation initial condition is calculated. However, this doesn't fix the issue that the equations for the first node in the time horizon are deactivated.

To overcome this, try setting another very small time increment in the time horizon with a very small time-step, such as 1e-5. Create the fraction f as an FV type and set the STATUS=1 to calculate that single value with the optimizer. I included bounds on f between 0.2 and 0.8, just to make the problem more interesting and they can be changed.

# initial condition variables
f = m.FV(value=0.5,lb=0.2,ub=0.8)
f.STATUS = 1

Activate the equations for the relation between R, S and f only on that small-time step.

# activate only for initial condition
init = np.zeros(len(m.time)); init[1]=1
init = m.Param(init)
m.Equations([init*(S-x_meas[0]*f)==0,
             init*(R-x_meas[0]*(1-f))==0])

The remainder of the problem stays the same.

Fit to data

from gekko import GEKKO
import numpy as np
m = GEKKO(remote=True)

# data
m.time = [0,1e-5,0.1,0.2,0.3,0.4,0.5,0.8,0.85,0.9,0.95,1]
x_meas = [1.1,1.1,1.2,1.56,1.77,1.89,2.1,2.3,2.5,2.7,2.2,2.1]

# parameters, single value over time horizon
p = m.Array(m.FV,3,lb=0.5,ub=10); a,b,c = p
# turn STATUS On for adjustment by solver
for fv in p:
   fv.STATUS=1

# variables
S,R,X = m.Array(m.Var,3,value=0.5,lb=0,fixed_initial=False)

# initial condition variables
f = m.FV(value=0.5,lb=0.2,ub=0.8)
f.STATUS = 1

# activate only for initial condition
init = np.zeros(len(m.time)); init[1]=1
init = m.Param(init)
m.Equations([init*(S-x_meas[0]*f)==0,
             init*(R-x_meas[0]*(1-f))==0])

# change initial condition for X
X.value=x_meas[0]

# measured values
Xm = m.Param(x_meas)

# equations
m.Equations([S.dt()==(a-b)*S,
             R.dt()==(a-b-c)*R,
             X==S+R])

# minimize difference to measurements
m.Minimize((X-Xm)**2)

m.options.IMODE = 5   # dynamic estimation
m.options.NODES = 2   # collocation nodes
m.options.SOLVER = 1  # APOPT solver
m.solve(disp=True)    # display solver output

print(f'a: {a.value[0]}, b: {b.value[0]}, c: {c.value[0]}')
print(f'S0: {S.value[0]}, R0: {R.value[0]}, f:{f.value[0]}')

import matplotlib.pyplot as plt
plt.figure(figsize=(7,3))
plt.plot(m.time,S.value,'b:',linewidth=3,label='S')
plt.plot(m.time,R.value,'r--',linewidth=2,label='R')
plt.plot(m.time,X.value,'k--',label='X')
plt.plot(m.time,Xm.value,'kx',label='Xm')
plt.legend(); plt.grid(); plt.xlabel('Time')
plt.tight_layout()
plt.savefig('fit.png',dpi=300)
plt.show()

It solves with a successful solution:

 Iter    Objective  Convergence
   20  1.01871E+00  1.15106E-04
   21  1.01863E+00  1.02840E-04
   22  1.01863E+00  3.60679E-09
   23  1.01863E+00  5.74752E-10
   24  1.01863E+00  5.74752E-10
 Successful solution
 
 ---------------------------------------------------
 Solver         :  APOPT (v1.0)
 Solution time  :   6.410000000323635E-002 sec
 Objective      :    1.01863407387195     
 Successful solution
 ---------------------------------------------------
 
a: 1.6769964544, b: 0.50197865393, c: 0.5
S0: 0.21999741496, R0: 0.87999405984, f:0.2