Modelling an input that can be used a finite number of times along a time horizon and is also binary - gekko

Happy holidays everyone! I finally have some time off to work on my project, and of course I'm stuck as usual lol.
I'm looking for guidance/examples that would let me be able to model the following:
1. I have an input (let's call it a 'jump') that is binary (0 or 1), and I want it to be usable only once (or possibly n times, where n < number of time steps) over the entire time horizon. The effect this input has on the system is that it instantaneously increases the velocity of the system by some predetermined amount.
2. The second effect this has on the system is that the set of dynamics that progress the system forward in time changes. (In this case the 'jump' will change the dynamics from a driving system to a flying system.) In the future there will also be a 'double_jump' that does not change the dynamics but does still provide an instantaneous change in velocity. Currently I'm trying to get the first part down, then I'm going to attempt to implement this. I just want to keep my bigger vision clear to anyone reading this.
3. Another part that is for the future of the model: I'd like the system to interact with a ball object, say by using if2/if3: if the system's position is within some radius of another object's position, an impulse will be imparted on the ball object, dependent on things like the velocities of the ball and the system. To do this properly I imagine I need a way to define a time step that happens at the interaction point, which I believe means I'll need some sort of variable time vector. Any examples for these would be much appreciated.
Okay so 2 and 3 are just here to be here, not really the main points of this question. I think I'll be able to figure them out once I can wrap my head around implementing this weird 'jump' input.
My current plan is to have an MV called 'u_jump' that is a non-integer. Then have a Var called 'jump_hist' that is essentially the 'integral' of 'u_jump', and I give jump_hist an upper bound of 1. What I do right now is just pretend this u_jump is an acceleration on the system by adding to the velocity.dt() equation. This works in theory but doesn't really represent the system I'm trying to control perfectly.
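The scaling used for 'jump_hist' can be checked with a few lines of arithmetic. The sketch below is plain NumPy, not gekko: it assumes a backward-Euler step on the normalized 0 to 1 time grid and a hypothetical schedule in which the jump is used once, at index 7. With the factor (ntd-1), each step where u_jump is 1 raises jump_hist by exactly 1, so an upper bound of 1 on jump_hist limits the input to a single use.

```python
import numpy as np

ntd = 21                        # number of time points, as in the question
t = np.linspace(0, 1, ntd)      # normalized time horizon
dt = t[1] - t[0]                # equals 1/(ntd-1)

u_jump = np.zeros(ntd)
u_jump[7] = 1.0                 # hypothetical schedule: one jump at index 7

# Backward-Euler integration of d(jump_hist)/dt = u_jump*(ntd-1)
jump_hist = np.zeros(ntd)
for k in range(1, ntd):
    jump_hist[k] = jump_hist[k - 1] + dt * u_jump[k] * (ntd - 1)

print(jump_hist[6], jump_hist[7], jump_hist[-1])
```

The same arithmetic shows why a term like u_jump * jump_magnitude in a velocity derivative needs a 1/dt-style factor if it is meant to act as an instantaneous velocity increment rather than as an acceleration spread over one step.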
What would be the best example for me to learn some lessons from for implementing this? And another question, is there a way to make the IPOPT solver work for integer solutions by giving the integers a tolerance? Somewhat like the minlp solver option 'minlp_integer_tol 0.05', that way I can still get the speed of IPOPT but the ability to incorporate integer style variables/equations like if3() etc... If not, are there ways I can approach the integer solution with a non-integer solution such that when I implement the control on a real system, the difference between the non-integer solution and the integer solution is within some acceptable tolerance to consider it a disturbance that a feedback controller could mitigate?
Kind of a mouthful I know, my questions always are haha. Hopefully this is helpful for others in the future! Here's my code currently. Let me know if the code gives you issues or anything I could clear up in the question.
Oh and one final note, this is currently setup as a 2D flying system. I've removed the driving dynamics (the c splines commented out) for simplicity of implementing this 'jump' input.
Happy Holidays again everyone!
import numpy as np
import matplotlib.pyplot as plt
import math
import gekko
from gekko import GEKKO
import csv
from mpl_toolkits.mplot3d import Axes3D
class Optimizer():
def __init__(self):
#################GROUND DRIVING OPTIMIZER SETTTINGS##############
self.d = GEKKO(remote=False) # Driving on ground optimizer
ntd = 21
self.d.time = np.linspace(0, 1, ntd) # Time vector normalized 0-1
# options
self.d.options.NODES = 3
self.d.options.SOLVER = 3
self.d.options.IMODE = 6# MPC mode
self.d.options.MAX_ITER = 800
self.d.options.MV_TYPE = 0
self.d.options.DIAGLEVEL = 0
# self.d.options.OTOL = 1
# final time for driving optimizer = self.d.FV(value=1.0,lb=0.1,ub=10.0, name='tf')
# allow gekko to change the tf value = 1
# time variable
self.t = self.d.Var(value=0)
self.d.Equation(self.t.dt()/ == 1)
# Acceleration variable
self.a = self.d.MV(fixed_initial=False, lb = 0, ub = 1, name='a')
self.a.STATUS = 1
# Jumping integer varaibles and equations
self.u_jump = self.d.MV(fixed_initial=False, lb=0, ub=1, integer=True)
self.u_jump.STATUS = 1
self.jump_hist = self.d.Var(value=0, name='jump_hist', lb=0, ub=1)
self.d.Equation(self.jump_hist.dt() == self.u_jump*(ntd-1))
# self.d.Equation(1.0 >= self.jump_hist)
# pitch input throttle (rotation of system)
self.u_p = self.d.MV(fixed_initial=False, lb = -1, ub=1)
self.u_p.STATUS = 1
# Final variable that allows you to set an objective function considering only final state
self.p_d = np.zeros(ntd)
self.p_d[-1] = 1.0 = self.d.Param(value = self.p_d, name='final')
# Model constants and parameters
self.Dp = self.d.Const(value = 2.7982, name='D_pitch')
self.Tp = self.d.Const(value = 12.146, name='T_pitch')
self.pi = self.d.Const(value = 3.14159, name='pi')
self.g = self.d.Const(value = 0, name='Fg')
self.jump_magnitude = self.d.Param(value = 3000, name = 'jump_mag')
def optimize2D(self, si, sf, vi, vf, ri, omegai): #these are 1x2 vectors s or v [x, z]
# variables and intial conditions
# Position in 2d = self.d.Var(value=si[0], lb=-4096, ub=4096, name='x') #x position
# = self.d.Var(value=si[1], lb=-5120, ub=5120, name='y') #y position = self.d.Var(value = si[1])
# Pitch rotation and angular velocity
self.pitch = self.d.Var(value = ri, name='pitch', lb=-1*self.pi, ub=self.pi)
self.pitch_dot = self.d.Var(fixed_initial=False, name='pitch_dot')
# Velocity in 2D
self.v_mag = self.d.Var(value=(vi), name='v_mag')
self.vx = self.d.Var(value=np.cos(ri), name='vx') #x velocity
# self.vy = self.d.Var(value=(np.sin(ri) * vi), name='vy') #y velocity
self.vz = self.d.Var(value = (np.sin(ri) * vi), name='vz')
## Non-linear state dependent dynamics descired as csplines.
#curvature vs vel as a cubic spline for driving state
cur = np.array([0.0069, 0.00398, 0.00235, 0.001375, 0.0011, 0.00088])
v_cur = np.array([0,500,1000,1500,1750,2300])
v_cur_fine = np.linspace(0,2300,100)
cur_fine = np.interp(v_cur_fine, v_cur, cur)
self.curvature = self.d.Var(name='curvature')
self.d.cspline(self.v_mag, self.curvature, v_cur_fine, cur_fine)
# throttle vs vel as cubic spline for driving state
ba=991.666 #Boost acceleration magnitude
kv = np.array([0, 1410, 2300]) #velocity input
ka = np.array([1600+ba, 0+ba, 0+ba]) #acceleration ouput
kv_fine = np.linspace(0, 2300, 100) # Higher resolution
ka_fine = np.interp(kv_fine, kv, ka) # Piecewise linear high resolution of ka
self.throttle_acceleration = self.d.Var(fixed_initial=False, name='throttle_accel')
self.d.cspline(self.v_mag, self.throttle_acceleration, kv_fine, ka_fine)
# Differental equations
# Velocity diff eqs
self.d.Equation(self.vx.dt()/ == (self.a*ba * self.d.cos(self.pitch)*self.jump_hist) + (self.a * self.throttle_acceleration * (1-self.jump_hist)) + (self.u_jump * self.jump_magnitude * self.d.cos(self.pitch + np.pi/2)))
self.d.Equation(self.vz.dt()/ == (self.a*ba * self.d.sin(self.pitch)*self.jump_hist) - (self.g * (1-self.jump_hist)) + (self.u_jump * self.jump_magnitude * self.d.sin(self.pitch + np.pi/2)))
self.d.Equation(self.v_mag == self.d.sqrt((self.vx*self.vx) + (self.vz*self.vz)))
self.d.Equation(2300 >= self.v_mag)
# Position diff eqs
self.d.Equation( == self.vx)
# self.d.Equation( == self.vy)
self.d.Equation( == self.vz)
# Orientation diff eqs
self.d.Equation(self.pitch_dot.dt()/ == ((self.Tp * self.u_p) + (self.Dp * self.pitch_dot * (1 - self.d.abs2(self.u_p)))) * self.jump_hist)
self.d.Equation(self.pitch.dt()/ == self.pitch_dot)
# Objective functions
# Final Position Objectives
self.d.Minimize(*1e2*(([1])**2)) # z final position objective
self.d.Minimize(*1e2*(([0])**2)) # x final position objective
# Final Velocity Objectives
# self.d.Obj(*1e3*(self.vz-vf[1])**2)
# self.d.Obj(*1e3*(self.vx-vf[0])**2)
# Minimum Time Objective
# self.d.solve('') # Solve with local apmonitor server
self.ts = np.multiply(self.d.time,[0])
return self.a, self.u_p, self.ts
def getTrajectoryData(self):
return [self.ts,,, self.vx, self.vz, self.pitch, self.pitch_dot]
def getInputData(self):
return [self.ts, self.a]
# Main Code
opt = Optimizer()
s_ti = [0,0]
v_ti = 0
s_tf = [1000,500]
v_tf = [00.00, 00.0]
r_ti = 0 # inital orientation of the car
omega_ti = 0.0 # initial angular velocity of car
acceleration, turning, t_star = opt.optimize2D(s_ti, s_tf, v_ti, v_tf, r_ti, omega_ti)
# Printing stuff
# print('u', acceleration.value)
# print('tf',
# print('tf',[0])
# print('u jump', opt.jump)
# for i in opt.u_jump: print(i.value)
print('u_jump', opt.u_jump.value)
print('jump his', opt.jump_hist.value)
print('v_mag', opt.v_mag.value)
print('a', opt.a.value)
# Plotting stuff
ts = opt.d.time *[0]
t_max =[0]
x_max = np.max(
vx_max = np.max(opt.vx.value)
z_max = np.max(
vz_max = np.max(opt.vz.value)
# plot results
fig = plt.figure(2)
ax = fig.add_subplot(111, projection='3d')
# plt.subplot(2, 1, 1)
Axes3D.plot(ax,, ts,, c='r', marker ='o')
plt.ylim(0, t_max)
plt.xlim(0, x_max)
plt.xlabel('Position x')
ax.set_zlabel('position z')
n=5 #num plots
fig = plt.figure(3)
ax = fig.add_subplot(111, projection='3d')
# plt.subplot(2, 1, 1)
Axes3D.plot(ax, opt.vx.value, ts, opt.vz.value, c='r', marker ='o')
plt.ylim(0, t_max)
plt.xlim(-1*vx_max, vx_max)
# plt.zlim(0, 2000)
plt.xlabel('Velocity x')
plt.plot(ts, opt.a, 'r-')
plt.plot(ts, np.multiply(opt.pitch, 1/math.pi), 'r-')
plt.ylabel('pitch orientation')
plt.subplot(n, 1, 3)
plt.plot(ts, opt.v_mag, 'b-')
plt.subplot(n, 1, 4)
plt.plot(ts, opt.u_p, 'b-')
plt.subplot(n, 1, 5)
plt.plot(ts, opt.u_jump, 'b-')
plt.plot(ts, opt.jump_hist, 'r-')
plt.ylabel('jump(b), jump hist(r)')

One thing to try is to solve with IPOPT for initialization and then with APOPT to get the integer solution. Another is to use an MPCC (mathematical program with complementarity constraints) for a switching condition that does not rely on a binary variable. I've found the MPCC form to be much less reliable than a binary switching condition because the solver often gets stuck at the saddle point. However, integer solutions often take much longer to solve.
Here is the solution with IPOPT:
EXIT: Optimal Solution Found.
The solution was found.
The final value of the objective function is 506284.8987787149
Solver : IPOPT (v3.12)
Solution time : 7.4613000000000005 sec
Objective : 506284.8987787149
Successful solution
The integer solution is obtained with APOPT.
--------- APM Model Size ------------
Variable time shift OFF
Number of state variables: 1286
Number of total equations: - 1180
Number of slack variables: - 40
Degrees of freedom : 66
Dynamic Control with APOPT Solver
Iter: 1 I: 0 Tm: 2.72 NLPi: 92 Dpth: 0 Lvs: 3 Obj: 5.07E+05 Gap: NaN
Iter: 2 I: -1 Tm: 0.53 NLPi: 17 Dpth: 1 Lvs: 2 Obj: 5.07E+05 Gap: NaN
Iter: 3 I: -9 Tm: 47.59 NLPi: 801 Dpth: 1 Lvs: 1 Obj: 5.07E+05 Gap: NaN
Iter: 4 I: 0 Tm: 2.26 NLPi: 35 Dpth: 1 Lvs: 3 Obj: 5.08E+05 Gap: NaN
--Integer Solution: 2.54E+07 Lowest Leaf: 5.08E+05 Gap: 1.92E+00
Iter: 5 I: 0 Tm: 3.56 NLPi: 32 Dpth: 2 Lvs: 2 Obj: 2.54E+07 Gap: 1.92E+00
Iter: 6 I: -9 Tm: 54.65 NLPi: 801 Dpth: 2 Lvs: 1 Obj: 5.08E+05 Gap: 1.92E+00
Iter: 7 I: -1 Tm: 2.18 NLPi: 83 Dpth: 2 Lvs: 0 Obj: 5.08E+05 Gap: 1.92E+00
No additional trial points, returning the best integer solution
Successful solution
Solver : APOPT (v1.0)
Solution time : 113.5842 sec
Objective : 2.5419931399165962E+7
Successful solution
APOPT chooses not to jump to minimize the final objective. You may need to add a hard constraint that the vsum() of u_jump is 1. There are additional tutorials on MPCC and integer / binary forms of switching conditions in the Optimization course.
Thanks for sharing your application and keep us updated!


Getting optimal control with economic cost function to converge

I have been using gekko to optimize a bioreactor using example 12 as a basis.
My model is slightly more complicated, with 6 states, 7 parameters and 2 manipulated variables. When I run it for small values of time (t ~ 20), the simulation is able to converge, albeit requiring a fine time resolution (dt < 0.1). However, when I try to extend the time (e.g., t = 30), it fails quite consistently with the following error:
EXIT: Converged to a point of local infeasibility. Problem may be infeasible
I have tried the following:
Employing different solvers with m.options.SOLVER = 1,2,3
Increasing m.options.MAX_ITER to 10000
Decreasing m.options.NODES to 1 (a lower-order discretization seems to help with convergence)
Supplying a reasonable initial guess to the MVs by specifying a value in the declaration:
D = m.MV(value=0.1,lb=0.0,ub=0.1). From some of the various posts, it seems this should help.
I am not too sure how to go about solving this. For a simplified model (3 states, 5 parameters and 2 MVs), gekko is able to optimize it quite well (though it fails somewhat when I try to go to large t) even though the rate constants of the simplified model are a subset of the full model.
My code is as follows:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
n_steps = 201
m.time = np.linspace(0,20,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0.0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
#Define objective function
J = m.Var(value=0) # objective (profit)
Jf = m.FV() # final objective
m.Equation(J.dt() == D*(Vs + Vd))
m.options.IMODE = 6 # optimal control
m.options.NODES = 1 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
For clarity, the model equations are:
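Written out from the m.Equation lines in the code above (the original figure is not reproduced here, so this is a transcription of the code rather than the author's own typesetting):

```latex
\begin{aligned}
\dot{T}   &= \mu T - k_1 (V_s + V_d)\, T + D\,(T_{in} - T) \\
\dot{I}_d &= k_1 V_d T - (k_1 V_s - \mu)\, I_d - D\, I_d \\
\dot{I}_s &= k_1 V_s T - (k_1 V_d + k_2)\, I_s - D\, I_s \\
\dot{I}_c &= k_1 (V_s I_d + V_d I_s) - k_2 I_c - D\, I_c \\
\dot{V}_s &= k_3 I_s - \bigl[k_1 (T + I_d + I_s + I_c) + k_4 + D\bigr] V_s \\
\dot{V}_d &= k_{33} I_c + f k_3 I_s - \bigl[k_1 (T + I_d + I_s + I_c) + k_4 + D\bigr] V_d \\
\dot{J}   &= D\,(V_s + V_d)
\end{aligned}
```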
I would be grateful for any assistance, e.g., on how to implement scaling of the parameters. Thank you!
Try increasing the value of the final time until the solver can no longer find a solution; tf=28, for example, is still successful. A plot of the solution reveals that Tin is adjusted to be zero at about the time where the solution almost fails to converge. I added a couple of additional objective forms that didn't help the convergence (see Objective Method #1 and #2). The values of J, Vs, Vd are high but not unmanageable by the solver. One way to think about scaling is by changing units, such as changing from kg/day to kg/s as the basis. Gekko automatically scales variables by the initial condition.
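As a rough illustration of the unit-change idea, the sketch below (plain NumPy, no gekko) rescales all six states and Tin by a hypothetical factor s = 1e6, i.e. counting in millions. Only the bilinear rate constant k1 has to change (to k1*s) for the scaled equations to describe the same dynamics. The test state and the values of D and Tin are arbitrary numbers chosen for the check.

```python
import numpy as np

mu, k1, k2, k3, k33, k4, f = [0.027, 2.12e-9, 7.13e-3, 168, 168, 0.035, 1e-3]

def rhs(x, D, Tin, k1):
    """Right-hand side of the six state equations from the question."""
    T, Id, Is, Ic, Vs, Vd = x
    tot = T + Id + Is + Ic
    return np.array([
        mu*T - k1*(Vs + Vd)*T + D*(Tin - T),
        k1*Vd*T - (k1*Vs - mu)*Id - D*Id,
        k1*Vs*T - (k1*Vd + k2)*Is - D*Is,
        k1*(Vs*Id + Vd*Is) - k2*Ic - D*Ic,
        k3*Is - (k1*tot + k4 + D)*Vs,
        k33*Ic + f*k3*Is - (k1*tot + k4 + D)*Vd,
    ])

s = 1e6                                            # hypothetical scale factor
x = np.array([5e6, 1e4, 2e4, 3e3, 1.25e5, 5e2])    # arbitrary test state
D, Tin = 0.05, 1e7                                 # arbitrary input values

dx_original = rhs(x, D, Tin, k1)
dx_scaled = rhs(x / s, D, Tin / s, k1 * s)         # same model in scaled units

# Scaled derivatives, converted back, match the original derivatives
print(np.allclose(dx_scaled * s, dx_original))
```

In scaled units the states are of order 1 and k1 becomes 2.12e-3, which may be easier on the solver. Whether it actually helps convergence at tf=30 would have to be tested.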
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
tf = 28
n_steps = tf*10+1
m.time = np.linspace(0,tf,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
# Original Objective
if True:
J = m.Var(value=0) # objective (profit)
Jf = m.FV() # final objective
m.Equation(J.dt() == D*(Vs + Vd))
# Objective Method 1
if False:
p=np.zeros_like(m.time); p[-1]=1
final = m.Param(p)
J = m.Var(value=0) # objective (profit)
m.Equation(J.dt() == D*(Vs + Vd))
# Objective Method 2
if False:
m.Maximize(D*(Vs + Vd))
m.options.IMODE = 6 # optimal control
m.options.NODES = 2 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
plt.legend(); plt.grid()
plt.legend(); plt.grid()
plt.legend(); plt.grid()
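Objective Method 1 in the listing above builds a parameter that is zero everywhere except at the last time point. The listing does not show how it is used, so the sketch below only illustrates the presumable intent with NumPy: multiplying a trajectory by such a mask leaves only its terminal value in the summed objective. The trajectory J here is a made-up stand-in, not a solution of the model.

```python
import numpy as np

t = np.linspace(0, 28, 281)      # same grid as tf=28, n_steps=tf*10+1
J = 0.5 * t**2                   # stand-in for the integrated profit J(t)

p = np.zeros_like(t)
p[-1] = 1                        # 1 only at the final time point

# Summing J*p over the horizon keeps only the terminal value J(tf)
print(np.sum(J * p), J[-1])
```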
Is there any type of constraint in the problem that would favor a decrease at the end? This may be the cause of the infeasibility at tf=30. Another way to get a feasible solution is to solve with m.options.TIME_SHIFT=20 and re-solve the problem with the initial conditions from the prior solution equal to the values at time step 20.
This way, the solution steps forward in time to optimize in parts. This strategy was used to optimize a High Altitude Long Endurance (HALE) UAV and is called Receding Horizon Control.
Martin, R.A., Gates, N., Ning, A., Hedengren, J.D., Dynamic Optimization of High-Altitude Solar Aircraft Trajectories Under Station-Keeping Constraints, Journal of Guidance, Control, and Dynamics, 2018, doi: 10.2514/1.G003737.

Discrete path tracking with python gekko

I have some discrete data points representing a path and I want to minimize the distance between a trajectory of an object to these path points along with some other constraints. I'm trying out gekko as a tool to solve this problem and for that I made a simple problem by making data points from a parabola and a constraint to the path. My attempt to solve it is
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
import time
#path data points
x_ref = np.linspace(0, 4, num=21)
y_ref = - np.square(x_ref) + 16
#constraint for visualization purposes
x_bound = np.linspace(0, 4, num=10)
y_bound = 1.5*x_bound + 4
def distfunc(x,y,xref,yref,p):
    """
    Shortest distance from (x,y) to (xref, yref)
    """
    dtemp = []
    for i in range(len(xref)):
        d = (x-xref[i])**2+(y-yref[i])**2
        dtemp.append(d)
    min_id = dtemp.index(min(dtemp))
    if min_id == 0:
        next_id = min_id+1
    elif min_id == len(xref)-1:  # last valid index
        next_id = min_id-1
    else:
        d2 = (x-xref[min_id-1])**2+(y-yref[min_id-1])**2
        d1 = (x-xref[min_id+1])**2+(y-yref[min_id+1])**2
        d_next = [d2, d1]
        next_id = min_id + 2*d_next.index(min(d_next)) - 1
    # unit vector along the segment between the two closest path points
    n1 = xref[next_id] - xref[min_id]
    n2 = yref[next_id] - yref[min_id]
    nnorm = p.sqrt(n1**2+n2**2)
    n1 = n1 / nnorm
    n2 = n2 / nnorm
    # remove the component along the segment to get the perpendicular offset
    difx = x-xref[min_id]
    dify = y-yref[min_id]
    dot = difx*n1 + dify*n2
    deltax = difx - dot*n1
    deltay = dify - dot*n2
    return deltax**2+deltay**2
v_ref = 3
now = time.time()
p = GEKKO(remote=False)
p.time = np.linspace(0,10,21)
x = p.Var(value=0)
y = p.Var(value=16)
vx = p.Var(value=1)
vy = p.Var(value=0)
ax = p.Var(value=0)
ay = p.Var(value=0)
p.options.IMODE = 6
p.options.SOLVER = 3
p.options.WEB = 0
x_refg = p.Param(value=x_ref)
y_refg = p.Param(value=y_ref)
v_ref = p.Const(value=v_ref)
p.Obj( (p.sqrt(vx**2+vy**2) - v_ref)**2 + ax**2 + ay**2)
p.solve(disp=False, debug=True)
print(f'run time: {time.time()-now}')
plt.plot(x_ref, y_ref)
plt.plot(x_bound, y_bound)
This is the result that I get. As you can see, it's not exactly the solution that one would expect. For reference, here is the kind of solution you might expect, which I get using the cost function below
p.Obj((x-x_refg)**2 + (y-y_refg)**2 + ax**2 + ay**2)
However since what I actually wanted is the shortest distance to a path described by these points I expect the distfunc to be closer to what I want since the shortest distance is most likely to some interpolated point. So my question is twofold:
Is this the correct gekko expression/formulation for the objective function?
My other goal is solution speed so is there a more efficient way of expressing this problem for gekko?
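For plain numeric inputs (outside gekko), the "shortest distance to an interpolated point" idea can be sketched with NumPy as below. The function name dist_to_polyline is made up for this illustration. It projects the point onto every segment of the path, clamps the projection to the segment, and takes the smallest distance. It relies on np.clip and np.min, so it works on numbers only and is not something that can be passed to p.Obj as written.

```python
import numpy as np

def dist_to_polyline(px, py, xref, yref):
    """Shortest distance from (px, py) to the polyline through (xref, yref)."""
    point = np.array([px, py], dtype=float)
    a = np.column_stack([xref[:-1], yref[:-1]])   # segment start points
    b = np.column_stack([xref[1:], yref[1:]])     # segment end points
    ab = b - a
    ap = point - a
    # Projection parameter along each segment, clamped to stay on the segment
    s = np.clip(np.sum(ap * ab, axis=1) / np.sum(ab * ab, axis=1), 0.0, 1.0)
    closest = a + s[:, None] * ab
    return np.min(np.linalg.norm(closest - point, axis=1))

# Path data points from the question
x_ref = np.linspace(0, 4, num=21)
y_ref = -np.square(x_ref) + 16

print(dist_to_polyline(0.0, 16.0, x_ref, y_ref))   # query at the first path point
print(dist_to_polyline(1.0, 16.0, x_ref, y_ref))   # query off the path
```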
You can't define an objective function that changes based on conditions unless you insert logical conditions that are continuously differentiable such as with the if2 or if3 function. Gekko evaluates the symbolic model once and then passes that off to an executable for solution. It only calls the Python model build once because it is compiling the model to efficient byte-code for execution. You can see the model that you created with p.open_folder(). The model file ends in the apm extension: gk_model0.apm.
i0 = 3
End Constants
End Parameters
v1 = 0
v2 = 16
v3 = 1
v4 = 0
v5 = 0
v6 = 0
End Variables
minimize (((((v1-0.0)-((((((v1-0.0))*((0.2/sqrt(0.04159999999999994))))+(((v2-16.0))&
minimize (((((sqrt((((v3)^(2))+((v4)^(2))))-i0))^(2))+((v5)^(2)))+((v6)^(2)))
End Equations
End Model
One strategy is to split your problem into multiple optimization problems that are all minimal time problems where you navigate to the first way-point and then re-initialize the problem to navigate to the second way-point, and so on. If you want to preserve momentum and anticipate the turning then you'll need to use more advanced methods such as shown in the Pigeon / Eagle tracking problem (see source files) or similar to a trajectory optimization with UAVs or HALE UAVs (see references below).
Martin, R.A., Gates, N., Ning, A., Hedengren, J.D., Dynamic Optimization of High-Altitude Solar Aircraft Trajectories Under Station-Keeping Constraints, Journal of Guidance, Control, and Dynamics, 2018, doi: 10.2514/1.G003737.
Gates, N.S., Moore, K.R., Ning, A., Hedengren, J.D., Combined Trajectory, Propulsion and Battery Mass Optimization for Solar-Regenerative High-Altitude Long Endurance Unmanned Aircraft, AIAA Science and Technology Forum (SciTech), 2019.

How to constrain model variables in GEKKO

I would like to constrain the variable value u < 1 in my model. I added ub=1 to the variable definition u = m.Var(name='u', value=0, lb=-2, ub=1), but it resulted in "No solution found" (EXIT: Converged to a point of local infeasibility. Problem may be infeasible.). I guess I have to reformulate the problem to avoid this, but I have not been able to find examples of how this should be done. How do I write a proper model to avoid infeasible solutions when constraining variable values?
I have tried to reformulate the problem by adding an equation like m.Equation(u < 1), with no success.
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as pyplt
m = GEKKO(remote=False)
t = np.linspace(0, 1000, 101) # time
d = np.ones(t.shape)
d[0:10] = 0
# Add data to model
m.time = t
K = m.Const(0.01, name='K')
r = m.Const(name='r', value=0) # Reference
d = m.Param(name='d', value=d) # Disturbance
y = m.Var(name='y', value=0, lb=-2, ub=2) # State variable
u = m.Var(name='u', value=0, lb=-2, ub=1) # Output
e = m.Var(name='e', value=0)
Tc = m.FV(name='Tc', value=1200, lb=60, ub=1200) # time constant
# Update variable status
Tc.STATUS = 1 # Optimizer can adjust value
Kp = m.Intermediate(1 / K * 1 / Tc, name='Kp')
Ti = m.Intermediate(4 * Tc, name='Ti')
# Model equations
m.Equations([y.dt() == K * (u-d),
e == r-y,
u.dt() == Kp*e.dt()+Kp/Ti*e])
# Model constraints
m.Equation(y < 0.5)
m.Equation(y > -0.5)
# Model objective
m.Obj(-Tc)  # maximize Tc
# options
m.options.IMODE = 6 # Problem type: 6 = Dynamic optimization
# solve
m.solve(disp=True, debug=True)
print('Tc: %6.2f [s]' % (Tc.value[-1], ))
fig1, (ax1, ax2, ax3) = pyplt.subplots(3, sharex='all')
ax1.plot(t, y.value)
ax1.set_ylabel("y", fontsize=8), ax1.grid(True, which='both')
ax2.plot(t, e.value)
ax2.set_ylabel("e", fontsize=8), ax2.grid(True, which='both')
ax3.plot(t, u.value)
ax3.plot(t, d.value)
ax3.set_ylabel("u and d", fontsize=8), ax3.grid(True, which='both')
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.
An error occured.
The error code is 2
If I change the upper bound of u to 2, the optimization problem is solved as expected.
Hard constraints on variables can lead to an infeasible solution, as you observed. I recommend that you use soft constraints by specifying the variable y as a Controlled Variable and set an upper and lower set point range with SPHI and SPLO.
y = m.CV(name='y', value=0) # Controlled variable
y.STATUS = 1
y.TR_INIT = 0
y.SPHI = 0.5
y.SPLO = -0.5
I also removed the lb and ub from y and u to not give them hard bounds that can lead to the infeasibility. You also have an objective to maximize the value of Tc with m.Obj(-Tc). It goes to the maximum limit: 1200 when the solver is able to adjust the value. As you can see from the plot, the value of y exceeds the setpoint range. It may not be possible for the controller to keep it within that range. A soft constraint (objective based) approach to constraints penalizes deviations but does not lead to an infeasible solution. If you need to increase the penalty on violations of the SPHI or SPLO, the parameters WSPHI and WSPLO can be adjusted.
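To make the soft-constraint idea concrete, here is a small NumPy sketch of a dead-band penalty of the kind an l1-norm CV objective uses: zero cost while y is between SPLO and SPHI, and a linear cost outside. This is an illustration of the concept, not gekko's internal code, and the function name and weight values are chosen only for the example.

```python
import numpy as np

def deadband_penalty(y, splo, sphi, wsplo=20.0, wsphi=20.0):
    """Linear penalty on violations of the range [splo, sphi]; zero inside it."""
    above = np.maximum(0.0, y - sphi)   # amount above the upper set point
    below = np.maximum(0.0, splo - y)   # amount below the lower set point
    return wsphi * above + wsplo * below

y = np.array([-0.8, -0.5, 0.0, 0.4, 0.7])
print(deadband_penalty(y, splo=-0.5, sphi=0.5))
```

Because a violation only adds cost, the optimizer can always return a solution. Raising the weights makes violations more expensive, which is the role WSPHI and WSPLO play.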
It appears that you have a first order dynamic model and you are trying to optimize PID parameters. If you need to model saturation of the controller output (actuator) then the if3, max3, min3 or corresponding if2, max2, min2 functions may be useful. There is more information on CV objectives and tuning in the Dynamic Optimization course.
Here is a feasible solution to your problem:
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as pyplt
m = GEKKO() # remote=False
t = np.linspace(0, 1000, 101) # time
d = np.ones(t.shape)
d[0:10] = 0
# Add data to model
m.time = t
K = m.Const(0.01, name='K')
r = m.Const(name='r', value=0) # Reference
d = m.Param(name='d', value=d) # Disturbance
e = m.Var(name='e', value=0)
u = m.Var(name='u', value=0) # Output
Tc = m.FV(name='Tc', value=1200, lb=60, ub=1200) # time constant
y = m.CV(name='y', value=0) # Controlled variable
y.STATUS = 1
y.TR_INIT = 0
y.SPHI = 0.5
y.SPLO = -0.5
# Update variable status
Tc.STATUS = 1 # Optimizer can adjust value
Kp = m.Intermediate((1 / K) * (1 / Tc), name='Kp')
Ti = m.Intermediate(4 * Tc, name='Ti')
# Model equations
m.Equations([y.dt() == K * (u-d),
e == r-y,
u.dt() == Kp*e.dt()+(Kp/Ti)*e])
# Model constraints
#m.Equation(y < 0.5)
#m.Equation(y > -0.5)
# Model objective
m.Obj(-Tc)  # maximize Tc
# options
m.options.IMODE = 6 # Problem type: 6 = Dynamic optimization
m.options.SOLVER = 3
m.options.MAX_ITER = 1000
# solve
m.solve(disp=True, debug=True)
print('Tc: %6.2f [s]' % (Tc.value[-1], ))
fig1, (ax1, ax2, ax3) = pyplt.subplots(3, sharex='all')
ax1.plot(t, y.value)
ax1.set_ylabel("y", fontsize=8), ax1.grid(True, which='both')
ax2.plot(t, e.value)
ax2.set_ylabel("e", fontsize=8), ax2.grid(True, which='both')
ax3.plot(t, u.value)
ax3.plot(t, d.value)
ax3.set_ylabel("u and d", fontsize=8), ax3.grid(True, which='both')
Thanks for an extensive and useful answer to my question. I really appreciate it.
As you correctly observed, I am trying to optimize tuning parameters for my simple control problem. I have executed your code with soft constraints, and it does solve the feasibility issue. I also added the WSPHI/WSPLO parameters and set their values high to get a solution within the constraints. Still, I would like to have a model where the control output ("u") is bounded to [0,1]. Based on your answer, I probably have to add "if" or "max/min" statements to the model to avoid having an infeasible set of equations when "u" hits the bound. Something like "if u<0, u.dt() = 0 else u.dt() = Kp*e ...". Alternatively, would it be possible to add a variable (a type of slack variable) to ensure feasibility of the equation set? I will also investigate the material in the dynamic optimization course links to get a better understanding of dynamic modelling. Thanks again for guiding me in the right direction on this issue.
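The saturation idea can be explored outside gekko with a short forward simulation. The sketch below is an assumption-laden illustration, not the model from the answer: it uses explicit Euler steps, an arbitrary Tc of 120 s, output clamping with np.clip, and conditional integration (the integral is frozen while the output is saturated) as a simple anti-windup rule.

```python
import numpy as np

K, Tc = 0.01, 120.0                  # Tc: arbitrary value within the 60..1200 bounds
Kp, Ti = 1.0 / (K * Tc), 4.0 * Tc    # same tuning rules as in the question
t = np.linspace(0, 1000, 1001)
dt = t[1] - t[0]
d = np.where(t < 100, 0.0, 1.0)      # disturbance step at t = 100, as in the question
r = 0.0                              # reference

y = np.zeros_like(t)
u = np.zeros_like(t)
integral = 0.0
for k in range(len(t) - 1):
    e = r - y[k]
    u_unsat = Kp * (e + integral / Ti)     # PI law before saturation
    u[k] = np.clip(u_unsat, 0.0, 1.0)      # actuator limited to [0, 1]
    if u[k] == u_unsat:                    # integrate only when not saturated
        integral += e * dt
    y[k + 1] = y[k] + dt * K * (u[k] - d[k])
u[-1] = u[-2]

print('final y:', y[-1], 'final u:', u[-1])
```

With d = 1 and u limited to 1, dy/dt = K*(u - d) can never be positive, so y cannot return to the reference once it has dropped. That is consistent with the hard bound ub=1 making the original equation set infeasible.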

Finding center point given distance matrix

I have a matrix (really a loaded image) in which every element is a L2 distance from some unknown center point.
Here is a trivial example
A = [1.4142 1.0000 1.4142 2.2361]
[1.0000 0.0000 1.0000 2.0000]
[1.4142 1.0000 1.4142 2.2361]
In this case, the center is obviously at coordinate (1,1) (index A[1,1] in a 0-indexed matrix or 2D array).
However, in the case where my centers are not constrained to be integer indices, it's no longer as obvious. For example, given this matrix B, where is my center coordinate?
B = [3.0292 1.9612 2.8932 5.8252]
[1.2292 0.1612 1.0932 4.0252]
[1.4292 0.3612 1.2932 4.2252]
How would you find that the answer in this case is at row 1.034 and column 1.4?
I am aware of the trilateration solution (having provided MATLAB code to visualize that in 3D previously), but is there a more efficient way (e.g. one without a matrix inversion)?
This question is sort of language agnostic, as I am looking more for algorithmic help. If you could stick to MATLAB, Python, or C++ though in a solution, that would be great ;-).
While I have no experience with similar tasks, I read up on the topic and also tried a few things.
When you are unfamiliar with this topic it seems hard to grasp, and the resources I found are a bit chaotic.
Still unclear in regards to theory for me:
is the problem as stated above a convex-optimization problem (local-minimum = global-minimum; would mean access to powerful solvers!)
there are much more resources about more generic problems (Sensor Network
Localization), which are non-convex and where extremely complex methods have been developed
is your trilateration-approach able to exploit > 3 points (trilateration vs. multilateration; at least this code does not seem like it can which means: bad performance with noise!)
Here some example code with two approaches:
A: Convex-optimization: SOCP-Relaxation
Not impressive performance, but should be powerful as approximation for big-data
Guaranteed global-optimum for this relaxation!
Implemented with cvxpy
B: Nonlinear-programming optimization
Implemented using scipy.optimize
Pretty much perfect in my synthetic experiments; even good results in noisy case; despite the fact we are using numerical-differentiation (automatic-diff hard to use here)
Some additional remark:
Your example B surely has some (pretty bad) noise or some other problem in my opinion, as my approaches are completely off; while especially approach B shines for my synthetic-data (at least that's my impression)
import numpy as np
import cvxpy as cvx
from scipy.spatial.distance import cdist
from scipy.optimize import minimize, least_squares
""" Create noise-free (not anymore!) fake-problem """
np.random.seed(1)  # reproduces the "real x" printed in the output below
NOISE_DISTS = 0.1  # assumed noise level; the original value is not shown
real_x = np.random.random(size=2) * 3
M, N = 5, 10
pos = np.array([(i,j) for i in range(M) for j in range(N)]) # ugly -> tile/repeat/stack
real_x_stacked = np.vstack([real_x for i in range(pos.shape[0])])
Y = cdist(pos, real_x[np.newaxis])
Y += np.random.normal(size=Y.shape)*NOISE_DISTS # Let's add some noise!
print('real x: ', real_x)
print('dist mat: ', np.round(Y,3).T)
""" Helper """
def cost(x, Y, pos):
    res = np.linalg.norm(pos - x, ord=2, axis=1) - Y.ravel()
    return np.linalg.norm(res, 2)
print('cost with real_x (check vs. noisy): ', cost(real_x, Y, pos))
def solve_socp_relax(pos, Y):
    x = cvx.Variable(2)
    y = cvx.Variable(pos.shape[0])
    fake_stack = [x for i in range(pos.shape[0])] # hacky
    objective = cvx.sum_entries(cvx.norm(y - Y))
    x_stacked = cvx.reshape(cvx.vstack(*fake_stack), pos.shape[0], 2) # hacky
    constraints = [cvx.norm(pos - x_stacked, 2, axis=1) <= y]
    problem = cvx.Problem(cvx.Minimize(objective), constraints)
    problem.solve(solver=cvx.ECOS, verbose=False)
    return x.value.T
""" SOLVER NLP """
def solve_nlp(pos, Y):
    sol = minimize(cost, np.zeros(pos.shape[1]), args=(Y, pos), method='BFGS')
    # print(sol)
    return sol.x
""" TEST """
socp_relax_sol = solve_socp_relax(pos, Y)
print('SOCP RELAX SOL: ', socp_relax_sol)
nlp_sol = solve_nlp(pos, Y)
print('NLP SOL: ', nlp_sol)
real x: [ 1.25106601 2.16097348]
dist mat: [[ 2.444 1.599 1.348 1.276 2.399 3.026 4.07 4.973 6.118 6.746
2.143 1.149 0.412 0.766 1.839 2.762 3.851 4.904 5.734 6.958
2.377 1.432 0.856 1.056 1.973 2.843 3.885 4.95 5.818 6.84
2.711 2.015 1.689 1.939 2.426 3.358 4.385 5.22 6.076 6.97
3.422 3.153 2.759 2.81 3.326 4.162 4.734 5.627 6.484 7.336]]
cost with real_x (check vs. noisy): 0.665125233772
SOCP RELAX SOL: [[ 1.95749275 2.00607253]]
NLP SOL: [ 1.23560791 2.16756168]
Edit: A further speedup can be achieved (especially at large scale) by using nonlinear least squares instead of the more general NLP approach! My results are still the same (as would be expected if the problem were convex). Timings for NLP vs. NLS can look like 9 vs. 0.5 seconds!
This is my recommended method!
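from scipy.optimize import least_squares

def solve_nls(pos, Y):
    def res(x, Y, pos):
        # one residual per grid point: predicted distance minus measured distance
        return np.linalg.norm(pos - x, ord=2, axis=1) - Y.ravel()
    sol = least_squares(res, np.zeros(pos.shape[1]), args=(Y, pos), method='lm')
    # print(sol)
    return sol.x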
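For anyone who wants to try the NLS idea in isolation, here is a minimal self-contained sketch. The ground-truth point, the noise level of 0.05 and the starting guess (the centre of the grid instead of the zeros used above) are my own assumptions for illustration, not values from the experiments in this answer.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Grid of known positions, same layout as the M x N grid above
M, N = 5, 10
pos = np.array([(i, j) for i in range(M) for j in range(N)], dtype=float)

# Hypothetical ground truth and noisy distance measurements (assumed values)
real_x = np.array([1.25, 2.16])
Y_noisy = np.linalg.norm(pos - real_x, axis=1) + rng.normal(scale=0.05, size=len(pos))

def residuals(x, Y, pos):
    # predicted distance minus measured distance, one entry per grid point
    return np.linalg.norm(pos - x, axis=1) - Y

# Start from the centre of the grid (an assumption; the answer starts from zeros)
x0 = pos.mean(axis=0)
sol = least_squares(residuals, x0, args=(Y_noisy, pos), method="lm")
print("estimate:", sol.x)
```

Passing the residual vector rather than a scalar cost is what lets the solver exploit the least-squares structure.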
The second approach (NLP) in particular will also run for much bigger instances (cvxpy's overhead hurts; that is not a downside of the SOCP solver itself, which should scale much, much better!).
Here is some output for M, N = 500, 1000 with some more noise:
real x: [ 12.51066014 21.6097348 ]
dist mat: [[ 24.706 23.573 23.693 ..., 1090.29 1091.216
cost with real_x (check vs. noisy): 353.354267797
NLP SOL: [ 12.51082419 21.60911561]
used: 5.9552763315495625 # SECONDS
So in my experiments it works, but I won't give any global-convergence or reconstruction guarantees (some theory is still missing).
At first I thought about using the global optimum of the relaxed SOCP problem as the initial point for the NLP solver, but I did not find any example where this is needed!
Some just-for-fun visuals using:
M, N = 20, 30
import matplotlib.pyplot as plt
plt.imshow(Y.reshape(M, N), cmap='viridis', interpolation='none')
plt.scatter(nlp_sol[1], nlp_sol[0], color='red', s=20)
plt.xlim((0, N))
plt.ylim((0, M))
And a super noisy case (nice performance!):
M, N = 50, 100
real x: [ 12.51066014 21.6097348 ]
dist mat: [[ 22.329 18.745 27.588 ..., 94.967 80.034 91.206]]
cost with real_x (check vs. noisy): 354.527196716
NLP SOL: [ 12.44158986 21.50164637]
used: 0.01050068340320306
If I understand correctly, you have a matrix A, where A[i,j] holds the distance from (i,j) to some unknown point (y,x). You could find (y,x) like this:
Square each element of A, to make a matrix B say.
We then want to find (y,x) so
(y-i)*(y-i) + (x-j)*(x-j) = B[i,j]
Subtracting each equation from the 0,0 equation and rearranging:
2*i*y + 2*j*x = B[0,0] + i*i + j*j - B[i,j]
This can be solved by linear least squares. Note that since there are 2 unknowns, the matrix inversion (better, factorisation) involved will be on a 2x2 matrix and so is not time consuming. You could indeed, given just the dimensions of A, work out the required matrix and its inverse analytically.
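The linearised system above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: the function name locate_point and the test point are made up, and the distances are noise-free.

```python
import numpy as np

def locate_point(A):
    """Estimate (y, x) from A[i, j] = distance from (i, j) to (y, x)."""
    B = A ** 2
    M, N = A.shape
    i, j = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    i, j = i.ravel(), j.ravel()
    # One row per cell: 2*i*y + 2*j*x = B[0,0] + i*i + j*j - B[i,j]
    lhs = np.column_stack([2 * i, 2 * j]).astype(float)
    rhs = B[0, 0] + i**2 + j**2 - B.ravel()
    (y, x), *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return y, x

# Noise-free check with a hypothetical point
true_y, true_x = 1.25, 2.16
ii, jj = np.meshgrid(np.arange(5), np.arange(10), indexing="ij")
A = np.sqrt((ii - true_y) ** 2 + (jj - true_x) ** 2)
y_hat, x_hat = locate_point(A)
print(y_hat, x_hat)
```

With noisy distances the squaring amplifies errors at far-away cells, so the result is probably best treated as a cheap estimate or a starting point.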

How Could One Implement the K-Means++ Algorithm?

I am having trouble fully understanding the K-Means++ algorithm. I am interested in exactly how the first k centroids are picked, namely the initialization, as the rest is like in the original K-Means algorithm.
Is the probability function used based on distance or Gaussian?
At the same time, it seems the point farthest from the other centroids is picked as a new centroid.
I would appreciate a step-by-step explanation and an example. The one on Wikipedia is not clear enough. Very well commented source code would also help. If you are using 6 arrays, then please tell us which one is for what.
Interesting question. Thank you for bringing this paper to my attention - K-Means++: The Advantages of Careful Seeding
In simple terms, cluster centers are initially chosen at random from the set of input observation vectors, where the probability of choosing vector x is high if x is not near any previously chosen centers.
Here is a one-dimensional example. Our observations are [0, 1, 2, 3, 4]. Let the first center, c1, be 0. The probability that the next cluster center, c2, is x is proportional to ||c1-x||^2. So, P(c2 = 1) = 1a, P(c2 = 2) = 4a, P(c2 = 3) = 9a, P(c2 = 4) = 16a, where a = 1/(1+4+9+16).
Suppose c2=4. Then, P(c3 = 1) = 1a, P(c3 = 2) = 4a, P(c3 = 3) = 1a, where a = 1/(1+4+1).
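The numbers in the one-dimensional example can be checked with a short sketch like the following. The helper name next_center_probs is hypothetical, and it only handles 1-D observations.

```python
import numpy as np

def next_center_probs(points, centers):
    """Probability of each point becoming the next center (1-D data)."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # squared distance from every point to its nearest existing center
    d2 = ((points[:, None] - centers[None, :]) ** 2).min(axis=1)
    return d2 / d2.sum()

obs = [0, 1, 2, 3, 4]
p2 = next_center_probs(obs, [0])     # weights 0, 1, 4, 9, 16 over 30
p3 = next_center_probs(obs, [0, 4])  # weights 0, 1, 4, 1, 0 over 6
print(p2)
print(p3)
```

Points that are already centers get probability 0, so they cannot be picked twice.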
I've coded the initialization procedure in Python; I don't know if this helps you.
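import numpy as np
# (originally written with scipy.array / scipy.inner / scipy.rand; the numpy equivalents are used here)

def initialize(X, K):
    C = [X[0]]  # the first center is simply the first observation
    for k in range(1, K):
        # squared distance from each point to its nearest chosen center
        D2 = np.array([min([np.inner(c-x,c-x) for c in C]) for x in X])
        probs = D2/D2.sum()
        cumprobs = probs.cumsum()
        r = np.random.rand()
        # find the first interval of cumprobs that contains r
        for j,p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C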
EDIT with clarification: The output of cumsum gives us boundaries to partition the interval [0,1]. These partitions have length equal to the probability of the corresponding point being chosen as a center. So then, since r is uniformly chosen between [0,1], it will fall into exactly one of these intervals (because of break). The for loop checks to see which partition r is in.
probs = [0.1, 0.2, 0.3, 0.4]
cumprobs = [0.1, 0.3, 0.6, 1.0]
if r < cumprobs[0]:
    # this event has probability 0.1
    i = 0
elif r < cumprobs[1]:
    # this event has probability 0.2
    i = 1
elif r < cumprobs[2]:
    # this event has probability 0.3
    i = 2
elif r < cumprobs[3]:
    # this event has probability 0.4
    i = 3
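The same interval lookup can also be done without a loop. A small sketch, assuming NumPy is available; the helper name pick is made up for illustration.

```python
import numpy as np

probs = np.array([0.1, 0.2, 0.3, 0.4])
cumprobs = probs.cumsum()

def pick(r):
    # index of the first interval with r < cumprobs[i],
    # i.e. the same result as the if/elif chain
    return int(np.searchsorted(cumprobs, r, side="right"))

for r in (0.05, 0.25, 0.5, 0.95):
    print(r, "->", pick(r))
```

Because of floating-point rounding the last cumulative value can end up slightly below 1.0, so in real code it may be worth clamping the result to len(probs) - 1.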
One-liner.
Say we need to select 2 cluster centers. Instead of selecting them all randomly (as we do in simple k-means), we select the first one randomly, then find the points that are farthest from the first center (these points most probably do not belong to the first cluster, as they are far from its center) and place the second cluster center near those far points.
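To make the "far points" idea concrete, here is a hedged sketch of the seeding step using NumPy's random Generator. Note that in k-means++ the next center is sampled with probability proportional to the squared distance, so far points are likely to be chosen but the single farthest point is not guaranteed. The function name kmeanspp_init and the toy data are my own.

```python
import numpy as np

def kmeanspp_init(X, K, rng):
    """Pick K initial centers from the rows of X by D^2 sampling."""
    X = np.asarray(X, dtype=float)
    # first center: uniformly at random
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, K):
        # squared distance from every point to its nearest chosen center
        diffs = X[:, None, :] - np.asarray(centers)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        # sample the next center with probability proportional to d2
        idx = rng.choice(len(X), p=d2 / d2.sum())
        centers.append(X[idx])
    return np.array(centers)

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
C = kmeanspp_init(X, 2, rng)
print(C)
```

The sketch assumes the points are not all identical; otherwise every squared distance is zero and the probabilities cannot be normalised.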
I have prepared a full source implementation of k-means++ based on the book "Collective Intelligence" by Toby Segaran and the k-means++ initialization provided here.
Indeed, there are two distance functions here. For the initial centroids a standard one based on numpy.inner is used, and then for fixing the centroids the Pearson one is used. Maybe the Pearson one can also be used for the initial centroids. They say it is better.
from __future__ import division
def readfile(filename):
    lines=[line for line in open(filename)]
    rownames=[]
    data=[]
    for line in lines:
        p=line.strip().split(' ') #single space as separator
        #print p
        # First column in each row is the rowname
        rownames.append(p[0])
        # The data for this row is the remainder of the row
        data.append([float(x) for x in p[1:]])
        #print [float(x) for x in p[1:]]
    return rownames,data
from math import sqrt
def pearson(v1,v2):
    # Simple sums
    sum1=sum(v1)
    sum2=sum(v2)
    # Sums of the squares
    sum1Sq=sum([pow(v,2) for v in v1])
    sum2Sq=sum([pow(v,2) for v in v2])
    # Sum of the products
    pSum=sum([v1[i]*v2[i] for i in range(len(v1))])
    # Calculate r (Pearson score)
    num=pSum-(sum1*sum2/len(v1))
    den=sqrt((sum1Sq-pow(sum1,2)/len(v1))*(sum2Sq-pow(sum2,2)/len(v1)))
    if den==0: return 0
    return 1.0-num/den
import numpy
from numpy.random import *
def initialize(X, K):
    C = [X[0]]
    for _ in range(1, K):
        #D2 = numpy.array([min([numpy.inner(c-x,c-x) for c in C]) for x in X])
        D2 = numpy.array([min([numpy.inner(numpy.array(c)-numpy.array(x),numpy.array(c)-numpy.array(x)) for c in C]) for x in X])
        probs = D2/D2.sum()
        cumprobs = probs.cumsum()
        #print "cumprobs=",cumprobs
        r = rand()
        #print "r=",r
        for j,p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C
# [...] (the kcluster definition is truncated in the source; only the fragments below survive)
        if ... > 0:
            for rowid in bestmatches[i]:
                for m in range(len(rows[rowid])):
            for j in range(len(avgs)):
    return bestmatches
kclust = kcluster(data,k=4)
print "Result:"
for c in kclust:
    out = ""
    for r in c:
        out+=rows[r] +' '
    print "["+out[:-1]+"]"
print 'done'
p1 1 5 6
p2 9 4 3
p3 2 3 1
p4 4 5 6
p5 7 8 9
p6 4 5 4
p7 2 5 6
p8 3 4 5
p9 6 7 8
