Discrete path tracking with python gekko

I have some discrete data points representing a path, and I want to minimize the distance between the trajectory of an object and these path points, along with some other constraints. I'm trying out gekko as a tool to solve this problem, and to that end I made a simple test problem with data points from a parabola and a linear constraint on the path. My attempt to solve it is:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
import time
#path data points
x_ref = np.linspace(0, 4, num=21)
y_ref = - np.square(x_ref) + 16
#constraint for visualization purposes
x_bound = np.linspace(0, 4, num=10)
y_bound = 1.5*x_bound + 4
def distfunc(x,y,xref,yref,p):
    '''
    Shortest distance from (x,y) to (xref, yref)
    '''
    dtemp = []
    for i in range(len(xref)):
        d = (x-xref[i])**2+(y-yref[i])**2
        dtemp.append(d)
    min_id = dtemp.index(min(dtemp))
    if min_id == 0:
        next_id = min_id+1
    elif min_id == len(xref)-1:
        next_id = min_id-1
    else:
        d2 = (x-xref[min_id-1])**2+(y-yref[min_id-1])**2
        d1 = (x-xref[min_id+1])**2+(y-yref[min_id+1])**2
        d_next = [d2, d1]
        next_id = min_id + 2*d_next.index(min(d_next)) - 1
    # unit vector along the segment from the nearest point to its neighbor
    n1 = xref[next_id] - xref[min_id]
    n2 = yref[next_id] - yref[min_id]
    nnorm = p.sqrt(n1**2+n2**2)
    n1 = n1 / nnorm
    n2 = n2 / nnorm
    # component of (x,y) perpendicular to that segment
    difx = x-xref[min_id]
    dify = y-yref[min_id]
    dot = difx*n1 + dify*n2
    deltax = difx - dot*n1
    deltay = dify - dot*n2
    return deltax**2+deltay**2
v_ref = 3
now = time.time()
p = GEKKO(remote=False)
p.time = np.linspace(0,10,21)
x = p.Var(value=0)
y = p.Var(value=16)
vx = p.Var(value=1)
vy = p.Var(value=0)
ax = p.Var(value=0)
ay = p.Var(value=0)
p.options.IMODE = 6
p.options.SOLVER = 3
p.options.WEB = 0
x_refg = p.Param(value=x_ref)
y_refg = p.Param(value=y_ref)
v_ref = p.Const(value=v_ref)
p.Obj(distfunc(x,y,x_refg,y_refg,p))
p.Obj( (p.sqrt(vx**2+vy**2) - v_ref)**2 + ax**2 + ay**2)
p.Equation(x.dt()==vx)
p.Equation(y.dt()==vy)
p.Equation(vx.dt()==ax)
p.Equation(vy.dt()==ay)
p.Equation(y>=1.5*x+4)
p.solve(disp=False, debug=True)
print(f'run time: {time.time()-now}')
plt.plot(x_ref, y_ref)
plt.plot(x_bound, y_bound)
plt.plot(x.value,y.value)
plt.show()
This is the result that I get (plot omitted). As you can see, it's not exactly the solution that one should expect. For reference, a solution closer to what you may expect comes from using the cost function below:
p.Obj((x-x_refg)**2 + (y-y_refg)**2 + ax**2 + ay**2)
However, since what I actually want is the shortest distance to the path described by these points, I expect distfunc to be closer to the right formulation, because the shortest distance is most likely to some interpolated point. So my question is twofold:
Is this the correct gekko expression/formulation for the objective function?
My other goal is solution speed, so is there a more efficient way of expressing this problem for gekko?

You can't define an objective function that changes based on conditions unless you insert logical conditions that are continuously differentiable such as with the if2 or if3 function. Gekko evaluates the symbolic model once and then passes that off to an executable for solution. It only calls the Python model build once because it is compiling the model to efficient byte-code for execution. You can see the model that you created with p.open_folder(). The model file ends in the apm extension: gk_model0.apm.
Model
Constants
  i0 = 3
End Constants
Parameters
  p1
  p2
  p3
  p4
End Parameters
Variables
  v1 = 0
  v2 = 16
  v3 = 1
  v4 = 0
  v5 = 0
  v6 = 0
End Variables
Equations
  v3=$v1
  v4=$v2
  v5=$v3
  v6=$v4
  v2>=(((1.5)*(v1))+4)
  minimize (((((v1-0.0)-((((((v1-0.0))*((0.2/sqrt(0.04159999999999994))))+(((v2-16.0))&
    *((-0.03999999999999915/sqrt(0.04159999999999994))))))*&
    ((0.2/sqrt(0.04159999999999994))))))^(2))+((((v2-16.0)&
    -((((((v1-0.0))*((0.2/sqrt(0.04159999999999994))))+(((v2-16.0))&
    *((-0.03999999999999915/sqrt(0.04159999999999994))))))&
    *((-0.03999999999999915/sqrt(0.04159999999999994))))))^(2)))
  minimize (((((sqrt((((v3)^(2))+((v4)^(2))))-i0))^(2))+((v5)^(2)))+((v6)^(2)))
End Equations
End Model
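Note that the nearest-point search in distfunc was evaluated once, at the initial values, and the resulting constants are baked into the minimize expression above. For reference, here is a minimal sketch (not from the original answer) of the m.if3 conditional mentioned earlier; it switches between two expressions with a binary variable, so it needs the APOPT mixed-integer solver:
from gekko import GEKKO
m = GEKKO(remote=False)
x = m.Var(value=0.5)
# y = x**2 when x < 0, y = x + 1 when x >= 0
y = m.if3(x, x**2, x + 1)
m.Minimize(y)
m.options.SOLVER = 1  # APOPT handles the binary switching variable
m.solve(disp=False)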
One strategy is to split your problem into multiple optimization problems that are all minimal-time problems: navigate to the first way-point, then re-initialize the problem to navigate to the second way-point, and so on (see the sketch below). If you want to preserve momentum and anticipate the turning then you'll need to use more advanced methods such as shown in the Pigeon / Eagle tracking problem (see source files) or similar to a trajectory optimization with UAVs or HALE UAVs (see references below).
Martin, R.A., Gates, N., Ning, A., Hedengren, J.D., Dynamic Optimization of High-Altitude Solar Aircraft Trajectories Under Station-Keeping Constraints, Journal of Guidance, Control, and Dynamics, 2018, doi: 10.2514/1.G003737.
Gates, N.S., Moore, K.R., Ning, A., Hedengren, J.D., Combined Trajectory, Propulsion and Battery Mass Optimization for Solar-Regenerative High-Altitude Long Endurance Unmanned Aircraft, AIAA Science and Technology Forum (SciTech), 2019.
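A minimal sketch of that way-point splitting strategy, assuming the data, dynamics, and constraint from the question (the horizon length, way-point spacing, and objective weights are illustrative guesses):
import numpy as np
from gekko import GEKKO
# every 5th path point as a way-point; x_ref, y_ref as in the question
x_ref = np.linspace(0, 4, num=21)
y_ref = -np.square(x_ref) + 16
wp = list(zip(x_ref[::5], y_ref[::5]))
state = [0, 16, 1, 0]  # initial x, y, vx, vy
for (xw, yw) in wp:
    p = GEKKO(remote=False)
    p.time = np.linspace(0, 2, 11)
    x, y, vx, vy = [p.Var(value=v) for v in state]
    ax, ay = p.Var(value=0), p.Var(value=0)
    p.Equations([x.dt()==vx, y.dt()==vy, vx.dt()==ax, vy.dt()==ay])
    p.Equation(y >= 1.5*x + 4)
    # drive toward the current way-point with a smoothness penalty
    p.Minimize((x-xw)**2 + (y-yw)**2 + ax**2 + ay**2)
    p.options.IMODE = 6
    p.solve(disp=False)
    # re-initialize the next segment from the end of this one
    state = [x.value[-1], y.value[-1], vx.value[-1], vy.value[-1]]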

Related

Getting optimal control with economic cost function to converge

I have been using gekko to optimize a bioreactor, with example 12 (https://apmonitor.com/wiki/index.php/Main/GekkoPythonOptimization) as a basis.
My model is slightly more complicated, with 6 states, 7 parameters and 2 manipulated variables. When I run it for small final times (t ~ 20), the simulation is able to converge, albeit requiring a fine time resolution (dt < 0.1). However, when I try to extend the time (e.g., t = 30), it fails quite consistently with the following error:
EXIT: Converged to a point of local infeasibility. Problem may be infeasible
I have tried the following:
Employing different solvers with m.options.SOLVER = 1,2,3
Increasing m.options.MAX_ITER to 10000
Decreasing m.options.NODES to 1 (a lower-order discretization seems to help with convergence)
Supplying a reasonable initial guess to the MVs by specifying a value in the declaration:
D = m.MV(value=0.1,lb=0.0,ub=0.1). From some of the various posts, it seems this should help.
I am not too sure how to go about solving this. For a simplified model (3 states, 5 parameters and 2 MVs), gekko is able to optimize it quite well (though it fails somewhat when I try to go to large t) even though the rate constants of the simplified model are a subset of the full model.
My code is as follows:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
n_steps = 201
m.time = np.linspace(0,20,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.STATUS = 1
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0.0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
#Define objective function
J = m.Var(value=0) # objective (profit)
Jf = m.FV() # final objective
Jf.STATUS = 1
m.Connection(Jf,J,pos2="end")
m.Equation(J.dt() == D*(Vs + Vd))
m.Obj(-Jf)
m.options.IMODE = 6 # optimal control
m.options.NODES = 1 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
#Solve
m.solve()
For clarity, the model equations are:
dT/dt = mu*T - k1*(Vs+Vd)*T + D*(Tin-T)
dId/dt = k1*Vd*T - (k1*Vs - mu)*Id - D*Id
dIs/dt = k1*Vs*T - (k1*Vd + k2)*Is - D*Is
dIc/dt = k1*(Vs*Id + Vd*Is) - k2*Ic - D*Ic
dVs/dt = k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs
dVd/dt = k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd
I would be grateful for any assistance, e.g., how to implement the scaling of the parameters per https://apmonitor.com/do/index.php/Main/ModelInitialization. Thank you!
Try increasing the value of the final time until the solver can no longer find a solution, such as with tf=28 (successful). A plot of the solution reveals that Tin is adjusted to be zero at about the time where the solution almost fails to converge. I added a couple of additional objective forms that didn't help the convergence (see Objective Method #1 and #2). The values of J, Vs, and Vd are high but not unmanageable by the solver. One way to think about scaling is by changing units, such as changing from kg/day to kg/s as the basis. Gekko automatically scales variables by the initial condition.
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
tf = 28
n_steps = tf*10+1
m.time = np.linspace(0,tf,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.STATUS = 1
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
# Original Objective
if True:
    J = m.Var(value=0) # objective (profit)
    Jf = m.FV() # final objective
    Jf.STATUS = 1
    m.Connection(Jf,J,pos2="end")
    m.Equation(J.dt() == D*(Vs + Vd))
    m.Obj(-Jf)
# Objective Method 1
if False:
    p = np.zeros_like(m.time); p[-1] = 1
    final = m.Param(p)
    J = m.Var(value=0) # objective (profit)
    m.Equation(J.dt() == D*(Vs + Vd))
    m.Maximize(J*final)
# Objective Method 2
if False:
    m.Maximize(D*(Vs + Vd))
m.options.IMODE = 6 # optimal control
m.options.NODES = 2 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
#Solve
m.solve()
plt.figure(figsize=(10,8))
plt.subplot(3,1,1)
plt.plot(m.time,Tin.value,'r.-',label='Tin')
plt.legend(); plt.grid()
plt.subplot(3,1,2)
plt.semilogy(m.time,T.value,label='T')
plt.semilogy(m.time,Id.value,label='Id')
plt.semilogy(m.time,Is.value,label='Is')
plt.semilogy(m.time,Ic.value,label='Ic')
plt.legend(); plt.grid()
plt.subplot(3,1,3)
plt.semilogy(m.time,Vs.value,label='Vs')
plt.semilogy(m.time,Vd.value,label='Vd')
plt.semilogy(m.time,J.value,label='Objective')
plt.legend(); plt.grid()
plt.show()
Is there any type of constraint in the problem that would favor a decrease at the end? This may be the cause of the infeasibility at tf=30. Another way to get a feasible solution is to solve with m.options.TIME_SHIFT=20 and re-solve the problem with the initial conditions from the prior solution equal to the value at time step 20.
#Solve
m.solve()
m.options.TIME_SHIFT=20
m.solve()
This way, the solution steps forward in time to optimize in parts. This strategy was used to optimize a High Altitude Long Endurance (HALE) UAV and is called Receding Horizon Control.
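A hedged sketch of that receding-horizon pattern, using the model m from above (the number of cycles is illustrative):
m.solve()                  # first horizon
m.options.TIME_SHIFT = 20  # advance 20 time steps per subsequent solve
for cycle in range(3):
    m.solve()              # warm-started from the shifted prior solution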
Martin, R.A., Gates, N., Ning, A., Hedengren, J.D., Dynamic Optimization of High-Altitude Solar Aircraft Trajectories Under Station-Keeping Constraints, Journal of Guidance, Control, and Dynamics, 2018, doi: 10.2514/1.G003737.

GEKKO RTO vs MPC MODES

This is a question derived from this one. After posting my question I found a solution (more like a patch to force the optimizer to optimize). There is something that baffles me. John Hedengren correctly points out that b=1.0 in the ODE leads to an infeasible solution with IMODE=6. However, in my patchy workaround with IMODE=3, I do get a solution.
I'm trying to understand what's happening by reading GEKKO's documentation for IMODE=3 and IMODE=6, but it is not clear to me:
IMODE=3
RTO Real-Time Optimization (RTO) is a steady-state mode that allows
decision variables (FV or MV types with STATUS=1) or additional
variables in excess of the number of equations. An objective function
guides the selection of the additional variables to select the optimal
feasible solution. RTO is the default mode for Gekko if
m.options.IMODE is not specified.
IMODE=6
MPC Model Predictive Control (MPC) is implemented with IMODE=6 as a
simultaneous solution or with IMODE=9 as a sequential shooting method.
Why does b=1.0 work in one mode but not in the other?
This is my patchy workaround with IMODE=3 and b=1.0:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
m = GEKKO(remote=False)
m.time = np.linspace(0,23,24)
#initialize variables
T_e = [50.,50.,50.,50.,45.,45.,45.,60.,60.,63.,\
64.,45.,45.,50.,52.,53.,53.,54.,54.,53.,52.,51.,50.,45.]
temp_low = [55.,55.,55.,55.,55.,55.,55.,68.,68.,68.,68.,55.,55.,68.,\
68.,68.,68.,55.,55.,55.,55.,55.,55.,55.]
temp_upper = [75.,75.,75.,75.,75.,75.,75.,70.,70.,70.,70.,75.,75.,\
70.,70.,70.,70.,75.,75.,75.,75.,75.,75.,75.]
TOU = [0.05,0.05,0.05,0.05,0.05,0.05,0.05,200.,200.,\
200.,200.,200.,200.,200.,200.,200.,200.,200.,\
200.,200.,200.,0.05,0.05,0.05]
b = m.Param(value=1.)
k = m.Param(value=0.05)
u = [m.MV(0.,lb=0.,ub=1.) for i in range(24)]
# Controlled Variable
T = [m.SV(60.,lb=temp_low[i],ub=temp_upper[i]) for i in range(24)]
for i in range(24):
    u[i].STATUS = 1
for i in range(23):
    m.Equation( T[i+1]-T[i]-k*(T_e[i]-T[i])-b*u[i]==0.0 )
m.Obj(np.dot(TOU,u))
m.options.IMODE = 3
m.solve(debug=True)
myu = [u[i].value[0] for i in range(24)]
print(myu)
myt = [T[i].value[0] for i in range(24)]
plt.plot(myt)
plt.plot(temp_low)
plt.plot(temp_upper)
plt.show()
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(myu,color='b')
ax2.plot(TOU,color='k')
plt.show()
The difference between the infeasible IMODE=6 and the feasible IMODE=3 is that the IMODE=3 case allows the temperature initial condition to be adjusted by the optimizer. The optimizer recognizes that the initial condition can be changed and so it modifies it to 75 to both stay feasible and also minimize the future energy consumption.
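A quick, hedged way to test this explanation: pin the initial temperature in the IMODE=3 formulation from the question and the problem should become infeasible for the same reason as IMODE=6.
m.Equation(T[0] == 60.0)  # fix the initial condition; the optimizer can no longer raise it to 75
In the IMODE=6 formulation below, the initial condition is fixed by default, so T is started at the feasible value of 75: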
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
m.time = np.linspace(0,23,24)
#initialize variables
T_external = [50.,50.,50.,50.,45.,45.,45.,60.,60.,63.,\
64.,45.,45.,50.,52.,53.,53.,54.,54.,\
53.,52.,51.,50.,45.]
temp_low = [55.,55.,55.,55.,55.,55.,55.,68.,68.,68.,68.,\
55.,55.,68.,68.,68.,68.,55.,55.,55.,55.,55.,55.,55.]
temp_upper = [75.,75.,75.,75.,75.,75.,75.,70.,70.,70.,70.,75.,\
75.,70.,70.,70.,70.,75.,75.,75.,75.,75.,75.,75.]
TOU_v = [0.05,0.05,0.05,0.05,0.05,0.05,0.05,200.,200.,200.,200.,\
200.,200.,200.,200.,200.,200.,200.,200.,200.,200.,0.05,\
0.05,0.05]
b = m.Param(value=1.)
k = m.Param(value=0.05)
T_e = m.Param(value=T_external)
TL = m.Param(value=temp_low)
TH = m.Param(value=temp_upper)
TOU = m.Param(value=TOU_v)
u = m.MV(lb=0, ub=1)
u.STATUS = 1 # allow optimizer to change
# Controlled Variable
T = m.SV(value=75)
m.Equations([T>=TL,T<=TH])
m.Equation(T.dt() == k*(T_e-T) + b*u)
m.Minimize(TOU*u)
m.options.IMODE = 6
m.solve(disp=True,debug=True)
import matplotlib.pyplot as plt
plt.subplot(2,1,1)
plt.plot(m.time,temp_low,'k--')
plt.plot(m.time,temp_upper,'k--')
plt.plot(m.time,T.value,'r-')
plt.ylabel('Temperature')
plt.subplot(2,1,2)
plt.step(m.time,u.value,'b:')
plt.ylabel('Heater')
plt.xlabel('Time (hr)')
plt.show()
If you went for another day (48 hours), you'd probably see that the problem would eventually be infeasible because the smaller heater b=1 wouldn't be able to meet the temperature lower constraint.
One of the advantages of using IMODE=6 is that you can write the differential equation instead of doing the discretization yourself. With IMODE=3, you used Euler's method for the differential equation. The default discretization for IMODE>=4 is NODES=2, equivalent to your Euler finite-difference method. Setting NODES=3-6 increases the accuracy with orthogonal collocation on finite elements.
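As a rough illustration of that accuracy difference (a sketch, not from the original answer), compare discretizations on dy/dt = -y, whose exact solution is exp(-t):
import numpy as np
from gekko import GEKKO
for nodes in [2, 4]:
    m = GEKKO(remote=False)
    m.time = np.linspace(0, 5, 11)
    y = m.Var(value=1.0)
    m.Equation(y.dt() == -y)
    m.options.IMODE = 4   # dynamic simulation
    m.options.NODES = nodes
    m.solve(disp=False)
    err = max(abs(np.array(y.value) - np.exp(-m.time)))
    print('NODES=%d: max error %.2e' % (nodes, err))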

How to constrain model variables in GEKKO

I'd like to constrain the value of the variable u < 1 in my model. I added ub=1 to the variable definition u = m.Var(name='u', value=0, lb=-2, ub=1), but it resulted in "No solution found" (EXIT: Converged to a point of local infeasibility. Problem may be infeasible.). I guess I have to reformulate the problem to avoid this, but I have not been able to find examples of how this should be done. How do I write a proper model to avoid infeasible solutions when constraining variable values?
I have tried to reformulate the problem by adding an equation like m.Equation(u < 1), with no success.
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as pyplt
m = GEKKO(remote=False)
t = np.linspace(0, 1000, 101) # time
d = np.ones(t.shape)
d[0:10] = 0
y_delay=0
# Add data to model
m.time = t
K = m.Const(0.01, name='K')
r = m.Const(name='r', value=0) # Reference
d = m.Param(name='d', value=d) # Disturbance
y = m.Var(name='y', value=0, lb=-2, ub=2) # State variable
u = m.Var(name='u', value=0, lb=-2, ub=1) # Output
e = m.Var(name='e', value=0)
Tc = m.FV(name='Tc', value=1200, lb=60, ub=1200) # time constant
# Update variable status
Tc.STATUS = 1 # Optimizer can adjust value
Kp = m.Intermediate(1 / K * 1 / Tc, name='Kp')
Ti = m.Intermediate(4 * Tc, name='Ti')
# Model equations
m.Equations([y.dt() == K * (u-d),
             e == r-y,
             u.dt() == Kp*e.dt()+Kp/Ti*e])
# Model constraints
m.Equation(y < 0.5)
m.Equation(y > -0.5)
# Model objective
m.Obj(-Tc)
# options
m.options.IMODE = 6 # Problem type: 6 = Dynamic optimization
# solve
m.solve(disp=True, debug=True)
print('Tc: %6.2f [s]' % (Tc.value[-1], ))
fig1, (ax1, ax2, ax3) = pyplt.subplots(3, sharex='all')
ax1.plot(t, y.value)
ax1.set_ylabel("y", fontsize=8), ax1.grid(True, which='both')
ax2.plot(t, e.value)
ax2.set_ylabel("e", fontsize=8), ax2.grid(True, which='both')
ax3.plot(t, u.value)
ax3.plot(t, d.value)
ax3.set_ylabel("u and d", fontsize=8), ax3.grid(True, which='both')
pyplt.show()
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.
An error occured.
The error code is 2
If I change the upper bound of u to 2, the optimization problem is solved as expected.
Hard constraints on variables can lead to an infeasible solution, as you observed. I recommend that you use soft constraints by specifying the variable y as a Controlled Variable and set an upper and lower set point range with SPHI and SPLO.
y = m.CV(name='y', value=0) # Controlled variable
y.STATUS = 1
y.TR_INIT = 0
y.SPHI = 0.5
y.SPLO = -0.5
I also removed the lb and ub from y and u to not give them hard bounds that can lead to the infeasibility. You also have an objective to maximize the value of Tc with m.Obj(-Tc). It goes to the maximum limit: 1200 when the solver is able to adjust the value. As you can see from the plot, the value of y exceeds the setpoint range. It may not be possible for the controller to keep it within that range. A soft constraint (objective based) approach to constraints penalizes deviations but does not lead to an infeasible solution. If you need to increase the penalty on violations of the SPHI or SPLO, the parameters WSPHI and WSPLO can be adjusted.
It appears that you have a first order dynamic model and you are trying to optimize PID parameters. If you need to model saturation of the controller output (actuator) then the if3, max3, min3 or corresponding if2, max2, min2 functions may be useful. There is more information on CV objectives and tuning in the Dynamic Optimization course.
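For example, here is a minimal sketch of actuator saturation with max3/min3, assuming the model m and the variables u, y, K, and d from the question (these functions introduce binary switching variables, so they need the APOPT solver, m.options.SOLVER=1):
u_sat = m.min3(m.max3(u, 0), 1)  # clip the raw PID output to [0, 1]
# then use u_sat in place of u in the process equation:
# m.Equation(y.dt() == K * (u_sat - d))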
Here is a feasible solution to your problem:
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as pyplt
m = GEKKO() # remote=False
t = np.linspace(0, 1000, 101) # time
d = np.ones(t.shape)
d[0:10] = 0
y_delay=0
# Add data to model
m.time = t
K = m.Const(0.01, name='K')
r = m.Const(name='r', value=0) # Reference
d = m.Param(name='d', value=d) # Disturbance
e = m.Var(name='e', value=0)
u = m.Var(name='u', value=0) # Output
Tc = m.FV(name='Tc', value=1200, lb=60, ub=1200) # time constant
y = m.CV(name='y', value=0) # Controlled variable
y.STATUS = 1
y.TR_INIT = 0
y.SPHI = 0.5
y.SPLO = -0.5
# Update variable status
Tc.STATUS = 1 # Optimizer can adjust value
Kp = m.Intermediate((1 / K) * (1 / Tc), name='Kp')
Ti = m.Intermediate(4 * Tc, name='Ti')
# Model equations
m.Equations([y.dt() == K * (u-d),
             e == r-y,
             u.dt() == Kp*e.dt()+(Kp/Ti)*e])
# Model constraints
#m.Equation(y < 0.5)
#m.Equation(y > -0.5)
# Model objective
m.Obj(-Tc)
# options
m.options.IMODE = 6 # Problem type: 6 = Dynamic optimization
m.options.SOLVER = 3
m.options.MAX_ITER = 1000
# solve
m.solve(disp=True, debug=True)
print('Tc: %6.2f [s]' % (Tc.value[-1], ))
fig1, (ax1, ax2, ax3) = pyplt.subplots(3, sharex='all')
ax1.plot(t, y.value)
ax1.plot([min(t),max(t)],[0.5,0.5],'k--')
ax1.plot([min(t),max(t)],[-0.5,-0.5],'k--')
ax1.set_ylabel("y", fontsize=8), ax1.grid(True, which='both')
ax2.plot(t, e.value)
ax2.set_ylabel("e", fontsize=8), ax2.grid(True, which='both')
ax3.plot(t, u.value)
ax3.plot(t, d.value)
ax3.set_ylabel("u and d", fontsize=8), ax3.grid(True, which='both')
pyplt.show()
Thanks for an extensive and useful answer to my question. I really appreciate it.
As you correctly observed, I am trying to optimize tuning parameters for my simple control problem. I have executed your code with soft constraints, and it sure solves the feasibility issue. I also added the WSPHI/WSPLO parameters and set their values high to have a solution within the constraints. Still, I would like to have a model where the control output ("u") is bounded [0,1]. Based on your answer, I probably must add "if" or "max/min" statements in the model to avoid having an infeasible set of equations when "u" hits the bound, something like: if u<0, u.dt() = 0 else u.dt() = Kp*e ... Could it alternatively be possible to add a variable (a kind of slack variable) to ensure feasibility of the equation set? I will also investigate the material in the dynamic optimization course links to get a better understanding of dynamic modelling. Thanks again for guiding me in the right direction on this issue.
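For reference, a minimal sketch of that slack-variable idea (hypothetical, not from the original thread): relax the upper bound on u and penalize violations so the equation set stays feasible.
s = m.Var(value=0, lb=0)  # slack on the upper bound of u
m.Equation(u <= 1 + s)    # u may exceed 1 only by the slack amount
m.Minimize(1e4 * s)       # a heavy penalty drives the slack toward zero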

Why do the trace values have periods of (unwanted) stability?

I have a fairly simple test data set I am trying to fit with pymc3.
The result generated by traceplot looks something like this.
Essentially the traces of all parameters look like there is a standard 'caterpillar' for 100 iterations, followed by a flat line for 750 iterations, followed by the caterpillar again.
The initial 100 iterations happen after 25,000 ADVI iterations, and 10,000 tune iterations. If I change these amounts, I randomly will/won't have these periods of unwanted stability.
I'm wondering if anyone has any advice about how I can stop this from happening - and what is causing it?
Thanks.
The full code is below. In brief, I am generating a set of 'phases' (-pi -> pi) with a corresponding set of values y = a(j)*sin(phase) + b(j)*cos(phase). a and b are drawn at random for each subject j, but are related to each other.
I then essentially try to fit this same model.
Edit: Here is a similar example, running for 25,000 iterations. Something goes wrong around iteration 20,000.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
%matplotlib inline
np.random.seed(0)
n_draw = 2000
n_tune = 10000
n_init = 25000
init_string = 'advi'
target_accept = 0.95
##
# Generate some test data
# Just generates:
# x a vector of phases
# y a vector corresponding to some sinusoidal function of x
# subject_idx a vector corresponding to which subject x is
#9 Subjects
N_j = 9
#Each with 276 measurements
N_i = 276
sigma_y = 1.0
mean = [0.1, 0.1]
cov = [[0.01, 0], [0, 0.01]] # diagonal covariance
x_sub = np.zeros((N_j,N_i))
y_sub = np.zeros((N_j,N_i))
y_true_sub = np.zeros((N_j,N_i))
ab_sub = np.zeros((N_j,2))
tuning_sub = np.zeros((N_j,1))
sub_ix_sub = np.zeros((N_j,N_i))
for j in range(0,N_j):
    aj,bj = np.random.multivariate_normal(mean, cov)
    #aj = np.abs(aj)
    #bj = np.abs(bj)
    xj = np.random.uniform(-1,1,size = (N_i,1))*np.pi
    xj = np.sort(xj) #for convenience
    yj_true = aj*np.sin(xj) + bj*np.cos(xj)
    yj = yj_true + np.random.normal(scale=sigma_y, size=(N_i,1))
    x_sub[j,:] = xj.ravel()
    y_sub[j,:] = yj.ravel()
    y_true_sub[j,:] = yj_true.ravel()
    ab_sub[j,:] = [aj,bj]
    tuning_sub[j,:] = np.sqrt(((aj**2)+(bj**2)))
    sub_ix_sub[j,:] = [j]*N_i
x = np.ravel(x_sub)
y = np.ravel(y_sub)
subject_idx = np.ravel(sub_ix_sub)
subject_idx = np.asarray(subject_idx, dtype=int)
##
# Fit model
hb1_model = pm.Model()
with hb1_model:
    # Hyperpriors
    hb1_mu_a = pm.Normal('hb1_mu_a', mu=0., sd=100)
    hb1_sigma_a = pm.HalfCauchy('hb1_sigma_a', 4)
    hb1_mu_b = pm.Normal('hb1_mu_b', mu=0., sd=100)
    hb1_sigma_b = pm.HalfCauchy('hb1_sigma_b', 4)
    # We fit a mixture of a sine and cosine with these two coefficients
    # allowed to be different for each subject
    hb1_aj = pm.Normal('hb1_aj', mu=hb1_mu_a, sd=hb1_sigma_a, shape = N_j)
    hb1_bj = pm.Normal('hb1_bj', mu=hb1_mu_b, sd=hb1_sigma_b, shape = N_j)
    # Model error
    hb1_eps = pm.HalfCauchy('hb1_eps', 5)
    hb1_linear = hb1_aj[subject_idx]*pm.math.sin(x) + hb1_bj[subject_idx]*pm.math.cos(x)
    hb1_linear_like = pm.Normal('y', mu = hb1_linear, sd=hb1_eps, observed=y)
with hb1_model:
    hb1_trace = pm.sample(draws=n_draw, tune = n_tune,
                          init = init_string, n_init = n_init,
                          target_accept = target_accept)
pm.traceplot(hb1_trace)
To partially answer my own question: After playing with this for a while, it looks like the problem might be due to the hyperprior standard deviation going to 0. I am not sure why the algorithm should get stuck there though (testing a small standard deviation can't be that uncommon...).
In any case, two solutions that seem to alleviate the problem (although they don't remove it entirely) are:
1) Add an offset to the definitions of the standard deviation. e.g.:
offset = 1e-2
hb1_sigma_a = offset + pm.HalfCauchy('hb1_sigma_a', 4)
2) Instead of using a HalfCauchy or HalfNormal for the SD prior, use a logNormal distribution set so that 0 is unlikely.
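A sketch of option 2 (the parameters are illustrative), replacing the HalfCauchy lines inside the model context:
hb1_sigma_a = pm.Lognormal('hb1_sigma_a', mu=0.0, sd=1.0)
hb1_sigma_b = pm.Lognormal('hb1_sigma_b', mu=0.0, sd=1.0)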
I'd look at the divergences, as explained in notes and literature on Hamiltonian Monte Carlo; see, e.g., here and here.
with hb1_model:
    np.savetxt('diverging.csv', hb1_trace['diverging'])
As a dirty solution, you can try to increase target_accept, perhaps.
Good luck!

Dirichlet process in PyMC 3

I would like to implement the Dirichlet process example referenced in Implementing Dirichlet processes for Bayesian semi-parametric models (source: here) in PyMC 3.
In the example, the stick-breaking probabilities are computed using the pymc.deterministic decorator:
v = pymc.Beta('v', alpha=1, beta=alpha, size=N_dp)

@pymc.deterministic
def p(v=v):
    """ Calculate Dirichlet probabilities """
    # Probabilities from betas
    value = [u*np.prod(1-v[:i]) for i,u in enumerate(v)]
    # Enforce sum to unity constraint
    value[-1] = 1-sum(value[:-1])
    return value

z = pymc.Categorical('z', p, size=len(set(counties)))
How would you implement this in PyMC 3, which uses Theano for the gradient computation?
edit:
I tried the following solution using the theano.scan method:
import theano
import theano.tensor as t
import pymc3 as pm
from pymc3 import Uniform, Beta, Normal, Categorical

with pm.Model() as mod:
    conc = Uniform('concentration', lower=0.5, upper=10)
    v = Beta('v', alpha=1, beta=conc, shape=n_dp)
    p, updates = theano.scan(fn=lambda stick, idx: stick * t.prod(1 - v[:idx]),
                             outputs_info=None,
                             sequences=[v, t.arange(n_dp)])
    # note: set_subtensor is not in-place, so the result must be reassigned
    p = t.set_subtensor(p[-1], 1 - t.sum(p[:-1]))
    category = Categorical('category', p, shape=n_algs)
    sd = Uniform('precs', lower=0, upper=20, shape=n_dp)
    means = Normal('means', mu=0, sd=100, shape=n_dp)
    points = Normal('obs',
                    means[category],
                    sd=sd[category],
                    observed=data)
    step1 = pm.Slice([conc, v, sd, means])
    step3 = pm.ElemwiseCategoricalStep(var=category, values=range(n_dp))
    trace = pm.sample(2000, step=[step1, step3], progressbar=True)
Sadly, this is really slow and does not recover the original parameters of the synthetic data.
Is there a better solution, and is this even correct?
I'm not sure I have a good answer, but perhaps this could be sped up by instead using a theano blackbox op, which allows you to write a distribution (or deterministic) in Python code. E.g.: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/disaster_model_arbitrary_deterministic.py
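For reference, a minimal sketch of that blackbox-op idea using theano's as_op decorator (names and types are illustrative); such ops provide no gradient, so they pair with non-gradient samplers like Slice or Metropolis:
import numpy as np
import theano.tensor as tt
from theano.compile.ops import as_op

@as_op(itypes=[tt.dvector], otypes=[tt.dvector])
def stick_breaking(v):
    # convert beta draws v into mixture weights via stick-breaking
    w = v * np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    w[-1] = 1.0 - w[:-1].sum()  # enforce the sum-to-unity constraint
    return w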
