Gekko - APMonitor has a hard time converging for a MINLP problem with a linear-fractional objective function - gekko

The Gekko - APMonitor Optimization Suite is unable to solve my optimization problem. I am trying to maximize a^T x / b^T x subject to d <= c^T x <= e, where the decision vector x = [x_1, x_2, ..., x_n] contains non-negative integers, the vectors a, b, c have positive real entries, and the constants d and e are positive lower and upper bounds. The problem is feasible because I get a feasible solution when the objective is replaced by 0. I was wondering whether APMonitor is capable of solving problems with a linear-fractional objective.
Does anyone have experience handling this kind of issue? Are there any solver options I could try turning on to resolve it?
The solver options I was using are below:
from gekko import GEKKO
model = GEKKO()
model.options.SOLVER=1
model.solver_options = ['minlp_maximum_iterations 100', \
'minlp_max_iter_with_int_sol 10', \
'minlp_as_nlp 0', \
'nlp_maximum_iterations 50', \
'minlp_branch_method 1', \
'minlp_print_level 8', \
'minlp_integer_tol 0.05', \
'minlp_gap_tol 0.001']
model.solve(disp=True)
The output is shown below; the solver status is inconsistent with APPSTATUS and APPINFO, which may be an APMonitor reporting issue.
apm 67.162.115.84_gk_model0
----------------------------------------------------------------
APMonitor, Version 1.0.1
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 7
Constants : 0
Variables : 5626
Intermediates: 0
Connections : 4914
Equations : 4913
Residuals : 4913
Number of state variables: 5626
Number of total equations: - 4919
Number of slack variables: - 2
---------------------------------------
Degrees of freedom : 705
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: -9 Tm: 75.50 NLPi: 251 Dpth: 0 Lvs: 0 Obj: 0.00E+00 Gap: NaN
Warning: no more possible trial points and no integer solution
Maximum iterations
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 75.5581999999995 sec
Objective : NaN
Unsuccessful with error code 0
---------------------------------------------------
Creating file: infeasibilities.txt
Use command apm_get(server,app,'infeasibilities.txt') to retrieve file
#error: Solution Not Found
Not successful
Gekko Solvetime: 1.0 s
#################################################
APPINFO = 0 - a successful solution
APPSTATUS = 1 - solver converges to a successful solution
Solver status - Not successful, exception thrown
decision variable = [0, 0, ..., 0].

To maximize the objective, the solver drives the denominator b^T x toward zero so that the objective function goes to +infinity. Try setting a lower bound on b, such as 0.001, to prevent the unbounded solution. Starting with non-zero values (value=1 below) can also help the solver find a solution.
b = model.Array(model.Var,n,value=1,lb=0.001)
Another suggestion is to add a lower bound constraint on b^T x in case x also goes to zero.
model.Equation(model.sum([b[i]*x[i] for i in range(n)])>=0.01)
If the APOPT solver does not converge on the modified problem, try using an NLP solver such as the interior-point solver IPOPT to initialize the solution. Gekko retains the solution values from one solve and uses them as the initial guess for the next solve.
model.options.SOLVER=3
model.solve()
model.options.SOLVER=1
model.solver_options = ['minlp_maximum_iterations 100', \
'minlp_max_iter_with_int_sol 10', \
'minlp_as_nlp 0', \
'nlp_maximum_iterations 50', \
'minlp_branch_method 1', \
'minlp_print_level 8', \
'minlp_integer_tol 0.05', \
'minlp_gap_tol 0.001']
model.solve()
Please post a complete and minimal example if more specific suggestions are needed.
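In case it helps as a starting point, here is a minimal sketch of the full problem that combines the suggestions above (a lower bound on the denominator and an IPOPT warm start before APOPT). The data values a, b, c, d, e and the size n below are made-up assumptions for illustration, not the original problem:
from gekko import GEKKO
import numpy as np

# made-up problem data for illustration only (not the original vectors)
n = 5
a = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
b = np.array([2.0, 7.0, 1.0, 8.0, 2.0])
c = np.array([1.0, 1.0, 2.0, 1.0, 3.0])
d, e = 5.0, 20.0

m = GEKKO()
x = m.Array(m.Var, n, value=1, lb=0, ub=10, integer=True)

num = m.Intermediate(m.sum([a[i]*x[i] for i in range(n)]))  # a^T x
den = m.Intermediate(m.sum([b[i]*x[i] for i in range(n)]))  # b^T x
ctx = m.Intermediate(m.sum([c[i]*x[i] for i in range(n)]))  # c^T x

m.Equation(den >= 0.01)      # keep the denominator away from zero
m.Equation(ctx >= d)
m.Equation(ctx <= e)
m.Maximize(num/den)

m.options.SOLVER = 3         # IPOPT first to initialize
m.solve(disp=False)
m.options.SOLVER = 1         # then APOPT for the integer solution
m.solve(disp=True)
print([xi.value[0] for xi in x])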

Related

GEKKO: Array size as a model variable

I'm quite new to Gekko. Is it possible to vary the size of a model array as part of an optimization? I am running a simple problem where various numbers of torsional springs engage at different angles, and I would like to allow the model to change the number of engagement angles. Each spring has several component variables, which I am also attempting to define as arrays of variables. However, the size definition of the array theta_engage, below, has not accepted int(n_engage.value). I get the following error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'GK_Value'
Relevant code:
n_engage = m.Var(2, lb=1, ub=10, integer=True)
theta_engage = m.Array(m.Var, (int(n_engage.value)))
theta_engage[0].value = 0.0
theta_engage[0].lower = 0.0
theta_engage[0].upper = 85.0
theta_engage[1].value = 15.0
theta_engage[1].lower = 0.0
theta_engage[1].upper = 85.0
If I try to define the size of theta_engage only by n_engage.value, I get this error:
TypeError: expected sequence object with len >= 0 or a single integer
I suppose I could define the array at the maximum size I am willing to accept and allow the number of springs to have a lower bound of 0, but I would have to enforce a minimum number of total springs somehow in the constraints. If Gekko is capable of varying the size of the arrays this way, that seems to me the more elegant solution.
Any help is much appreciated.
The problem structure can't be changed iteration-to-iteration. However, it is easy to define a binary variable b that either activates or deactivates those parts of the model that should be included or excluded.
from gekko import GEKKO
import numpy as np
m = GEKKO()
# number of springs
n = 10
# number of engaged springs (1-10)
nb = m.Var(2, lb=1, ub=n, integer=True)
# engaged springs (binary, 0-1)
b = m.Array(m.Var,n,lb=0,ub=1,integer=True)
# angle of engaged springs
θ = m.Array(m.Param,n,lb=0,ub=85)
# initialize values
t0 = [0,15,20,25,30,15,30,25,10,50]
for i,ti in enumerate(t0):
    θ[i].value = ti
# contributing spring forces
F = [m.Intermediate(b[i]*m.cos((np.pi/180.0)*θ[i]))
     for i in range(10)]
# force constraint
m.Equation(m.sum(F)>=3)
# engaged springs
m.Equation(nb==m.sum(b))
# minimize engaged springs
m.Minimize(nb)
# optimize with APOPT solver
m.options.SOLVER=1
m.solve()
# print solution
print(b)
This gives a solution in 0.079 sec: springs 1, 3, 9, and 10 should be engaged. It selects the minimum number of springs (4) needed to achieve the required force, which is equivalent to 3 springs at a 0 degree angle.
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 7.959999999729916E-002 sec
Objective : 4.00000000000000
Successful solution
---------------------------------------------------
[[1.0] [0.0] [1.0] [0.0] [0.0] [0.0] [0.0] [0.0] [1.0] [1.0]]
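As a quick sanity check on that count (plain Python, using the angles from t0 for the selected springs 1, 3, 9, and 10):
import numpy as np
engaged = [0, 20, 10, 50]   # angles (deg) of springs 1, 3, 9, 10 from t0
force = sum(np.cos(np.radians(t)) for t in engaged)
print(force)                # about 3.57, which satisfies the >= 3 constraint
With the angles in t0, the most any three springs could deliver is cos(0°) + cos(10°) + cos(15°) ≈ 2.95, which is below the required force of 3, so at least four springs are needed.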

Unexpected results from sum using gekko variable

I am optimizing a simple problem where I sum intermediate variables for a constraint: the sum needs to stay below a certain budget.
When I print the sum, using either sum or np.sum, I get the following result:
(((((((((((((((((((((((((((((i429+i430)+i431)+i432)+i433)+i434)+i435)+i436)+i437)+i438)+i439)+i440)+i441)+i442)+i443)+i444)+i445)+i446)+i447)+i448)+i449)+i450)+i451)+i452)+i453)+i454)+i455)+i456)+i457)+i458)
Here is the command to create the variables and the sum.
x = m.Array(m.Var, (len(bounds)),integer=True)
sums = [m.Intermediate(objective_inverse2(x,y)) for x,y in zip(x,reg_feats)]
My understanding is that an intermediate variable is dynamically calculated from the values of x, which are the decision variables.
Here is the summing function for the max budget constraint.
m.Equation(np.sum(sums) < max_budget)
Solving the problem returns an error saying there is no feasible solution, even though trivial solutions exist. Furthermore, removing this constraint returns a solution that naturally does not violate the max budget constraint.
What am I misunderstanding about intermediate variables and how to sum them?
It is difficult to diagnose the issue without a complete, minimal example. Here is an attempt to recreate the problem:
from gekko import GEKKO
import numpy as np
m = GEKKO()
nb = 5
x = m.Array(m.Var,nb,value=1,lb=0,ub=1,integer=True)
y = m.Array(m.Var,nb,lb=0)
i = [] # intermediate list
for xi,yi in zip(x,y):
    i.append(m.Intermediate(xi*yi))
m.Maximize(m.sum(i))
m.Equation(m.sum(i)<=100)
m.options.SOLVER = 1
m.solve()
print(x)
print(y)
Instead of creating a list of Intermediates, the summation can also be applied to the result of a list comprehension. This way, only one Intermediate value is created.
from gekko import GEKKO
import numpy as np
m = GEKKO()
nb = 5
x = m.Array(m.Var,nb,value=1,lb=0,ub=1,integer=True)
y = m.Array(m.Var,nb,lb=0)
sums = m.Intermediate(m.sum([xi*yi for xi,yi in zip(x,y)]))
m.Maximize(sums)
m.Equation(sums<=100)
m.options.SOLVER = 1
m.solve()
print(sums.value)
print(x)
print(y)
In both cases, the optimal solution is:
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 1.560000001336448E-002 sec
Objective : -100.000000000000
Successful solution
---------------------------------------------------
[100.0]
[[1.0] [1.0] [1.0] [1.0] [1.0]]
[[20.0] [20.0] [20.0] [20.0] [20.0]]
Try using the Gekko m.sum() function to improve solution efficiency, especially for large problems.
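For example, here is a hedged sketch of the difference (the array size and bounds below are arbitrary assumptions): the Python built-in sum expands into one long nested symbolic expression like the one printed in the question, while m.sum creates a single summation object.
from gekko import GEKKO
m = GEKKO()
n = 1000
x = m.Array(m.Var, n, value=0, lb=0, ub=1)
# sum(x) would build one long nested expression: ((x[0]+x[1])+x[2])+...
total = m.sum(list(x))      # single summation object, scales better
m.Equation(total <= 100)
m.Maximize(total)
m.solve(disp=False)
print(m.options.OBJFCNVAL)  # -100 (maximization is reported in minimized form)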

Problog - probabilistic graph example

I am going through the following ProbLog probabilistic graph example.
I tried to compute the probability of a path from 1 to 5. Here is my manual computation:
0.6*0.4+0.1*0.3*0.8 = 0.264
However, Problog returns P(path(1,5)) = 0.25824
Am I computing it correctly?
No, you can't just add up the probabilities of the different paths. To see that, assume that both paths from 1 to 5 had a probability of 0.7 each. You would get a probability of 1.4, which is clearly wrong (it would mean it is impossible that there is no path).
The way to calculate the probability of either of two events A and B is to compute the probability that neither is true and then take the complement of that event.
P(1->2->5) = 0.24
P(1->3->4->5) = 0.024
P(either) = 1 - (1 - 0.24) * (1 - 0.024)
= 1 - 0.74176
= 0.25824
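A quick plain-Python check of that arithmetic (the formula works here because the two paths share no edges, so they are independent):
p_a = 0.6 * 0.4         # P(1->2->5)
p_b = 0.1 * 0.3 * 0.8   # P(1->3->4->5)
p_either = 1 - (1 - p_a) * (1 - p_b)
print(round(p_either, 5))   # 0.25824, matching ProbLog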
Sorry for probably bad terminology, my statistics knowledge is a bit rusty.

Q learning - epsilon greedy update

I am trying to understand the epsilon-greedy method in DQN. I am learning from the code available at https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js
The following is the update rule for epsilon, which changes with age:
this.epsilon = Math.min(1.0, Math.max(this.epsilon_min, 1.0-(this.age - this.learning_steps_burnin)/(this.learning_steps_total - this.learning_steps_burnin)));
Does this mean the epsilon value starts at the minimum (chosen by the user), then increases with age up to the burn-in steps, eventually reaching 1? Or does epsilon start around 1 and then decay to epsilon_min?
Either way, learning almost stops after this process. So, do we need to choose learning_steps_burnin and learning_steps_total carefully? Any thoughts on what values should be chosen?
Since epsilon denotes the amount of randomness in your policy (action is greedy with probability 1-epsilon and random with probability epsilon), you want to start with a fairly randomized policy and later slowly move towards a deterministic policy. Therefore, you usually start with a large epsilon (like 0.9, or 1.0 in your code) and decay it to a small value (like 0.1). Most common and simple approaches are linear decay and exponential decay. Usually, you have an idea of how many learning steps you will perform (what in your code is called learning_steps_total) and tune the decay factor (your learning_steps_burnin) such that in this interval epsilon goes from 0.9 to 0.1.
Your code is an example of linear decay.
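So epsilon starts at 1.0, stays there until this.age passes learning_steps_burnin, and then decays linearly to epsilon_min, which is reached at learning_steps_total. To make that concrete, here is a hedged Python sketch of the linear schedule the convnetjs line implements (variable names mirror the question; the specific numbers are illustrative assumptions only):
epsilon_min = 0.05
learning_steps_burnin = 3000
learning_steps_total = 100000

def epsilon(age):
    # 1.0 during burn-in, then linear decay to epsilon_min at learning_steps_total
    frac = (age - learning_steps_burnin) / (learning_steps_total - learning_steps_burnin)
    return min(1.0, max(epsilon_min, 1.0 - frac))

for age in [0, 3000, 50000, 100000, 200000]:
    print(age, round(epsilon(age), 3))
# epsilon stays at 1.0 through the burn-in, is about 0.5 midway through
# the decay, and sits at epsilon_min from learning_steps_total onward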
An example of exponential decay is:
epsilon = 0.9
decay = 0.9999
min_epsilon = 0.1
for i in range(n):   # n = total number of learning steps
    epsilon = max(min_epsilon, epsilon*decay)
Personally, I recommend an epsilon decay such that after about 50-75% of the training you reach the minimum value of epsilon (I suggest something between 0.05 and 0.0025), after which only the improvement of the policy itself remains.
I created a small script to set the various parameters; it reports after what fraction of the training the decay stops (i.e., reaches the indicated minimum value):
import matplotlib.pyplot as plt
import numpy as np

eps_start = 1.0
eps_min = 0.05
eps_decay = 0.9994
epochs = 10000
stop = epochs          # default in case the minimum is never reached
df = np.zeros(epochs)
for i in range(epochs):
    if i == 0:
        df[i] = eps_start
    else:
        df[i] = df[i-1] * eps_decay
        if df[i] <= eps_min:
            print(i)
            stop = i
            break
print("With these parameters you will stop epsilon decay after {}% of training".format(stop/epochs*100))
plt.plot(df)
plt.show()

Effect of --oaa 2 and --loss_function=logistic in Vowpal Wabbit

What parameters should I use in VW for a binary classification task? For example, let's use rcv1_small.dat. I thought it would be better to use the logistic loss function (or hinge) and that it makes no sense to use --oaa 2. However, the empirical results (with progressive validation 0/1 loss reported in all 4 experiments) show that the best combination is --oaa 2 without logistic loss (i.e., with the default squared loss):
cd vowpal_wabbit/test/train-sets
cat rcv1_small.dat | vw --binary
# average loss = 0.0861
cat rcv1_small.dat | vw --binary --loss_function=logistic
# average loss = 0.0909
cat rcv1_small.dat | sed 's/^-1/2/' | vw --oaa 2
# average loss = 0.0857
cat rcv1_small.dat | sed 's/^-1/2/' | vw --oaa 2 --loss_function=logistic
# average loss = 0.0934
My primary question is: why does --oaa 2 not give exactly the same results as --binary (in the above setting)?
My secondary questions are: why does optimizing the logistic loss not improve the 0/1 loss (compared to optimizing the default squared loss)? Is this specific to this particular dataset?
I have experienced something similar while using --csoaa. The details can be found here. My guess is that in the case of a multiclass problem with N classes (no matter that you specified 2 as the number of classes), vw effectively works with N copies of the features. The same example gets a different ft_offset value when it is predicted/learned for each possible class, and this offset is used in the hashing algorithm. So all classes get an "independent" set of features from the same dataset row. Of course, the feature values are the same, but vw doesn't keep values, only feature weights. And the weights are different for each possible class. Since the amount of RAM used for storing these weights is fixed with -b (-b 18 by default), the more classes you have, the higher the chance of a hash collision. You can try to increase the -b value and check whether the difference between the --oaa 2 and --binary results decreases. But I might be wrong, as I didn't go too deep into the vw code.
As for the loss function: you can't compare the average loss values of the squared (default) and logistic loss functions directly. You should take the raw prediction values from the result obtained with squared loss and compute the loss of these predictions in terms of logistic loss. The function is log(1 + exp(-label * prediction)), where label is the a priori known answer. Such functions (float getLoss(float prediction, float label)) for all loss functions implemented in vw can be found in loss_functions.cc. Or you can first scale the raw prediction value to [0..1] with 1.f / (1.f + exp(-prediction)) and then calculate the log loss as described on kaggle.com:
double val = 1.f / (1.f + exp(- prediction)); // y = f(x) -> [0, 1]
if (val < 1e-15) val = 1e-15;
if (val > (1.0 - 1e-15)) val = 1.0 - 1e-15;
float xx = (label < 0)?0:1; // label {-1,1} -> {0,1}
double loss = xx*log(val) + (1.0 - xx) * log(1.0 - val);
loss *= -1;
You can also scale raw predictions to [0..1] with '/vowpal_wabbit/utl/logistic' script or --link=logistic parameter. Both use 1/(1+exp(-i)).
