I have been using gekko to optimize a bioreactor using example 12 (https://apmonitor.com/wiki/index.php/Main/GekkoPythonOptimization) as a basis.
My model is slightly more complicated, with 6 states, 7 parameters, and 2 manipulated variables. When I run it for small final times (t ~ 20), the simulation is able to converge, albeit requiring a fine time resolution (dt < 0.1). However, when I try to extend the time (e.g., t = 30), it fails quite consistently with the following error:
EXIT: Converged to a point of local infeasibility. Problem may be infeasible
I have tried the following:
Employing different solvers with m.options.SOLVER = 1,2,3
Increasing m.options.MAX_ITER to 10000
Decreasing m.options.NODES to 1 (a lower-order discretization seems to help with convergence)
Supplying a reasonable initial guess to the MVs by specifying a value in the declaration:
D = m.MV(value=0.1,lb=0.0,ub=0.1). From various related posts, it seems this should help.
I am not sure how to go about solving this. For a simplified model (3 states, 5 parameters, and 2 MVs), GEKKO is able to optimize quite well (though it struggles somewhat when I go to large t), even though the rate constants of the simplified model are a subset of the full model.
My code is as follows:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
n_steps = 201
m.time = np.linspace(0,20,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.STATUS = 1
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0.0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
#Define objective function
J = m.Var(value=0) # objective (profit)
Jf = m.FV() # final objective
Jf.STATUS = 1
m.Connection(Jf,J,pos2="end")
m.Equation(J.dt() == D*(Vs + Vd))
m.Obj(-Jf)
m.options.IMODE = 6 # optimal control
m.options.NODES = 1 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
#Solve
m.solve()
For clarity, the model equations (as implemented in the code above) are:
dT/dt  = mu*T - k1*(Vs+Vd)*T + D*(Tin - T)
dId/dt = k1*Vd*T - (k1*Vs - mu)*Id - D*Id
dIs/dt = k1*Vs*T - (k1*Vd + k2)*Is - D*Is
dIc/dt = k1*(Vs*Id + Vd*Is) - k2*Ic - D*Ic
dVs/dt = k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs
dVd/dt = k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd
I would be grateful for any assistance e.g., how to implement the scaling of the parameters per https://apmonitor.com/do/index.php/Main/ModelInitialization. Thank you!
Try increasing the final time gradually until the solver can no longer find a solution; tf=28 is still successful. A plot of the solution reveals that Tin is adjusted to zero at about the time where the solution almost fails to converge. I added a couple of additional objective forms that didn't help the convergence (see Objective Method #1 and #2). The values of J, Vs, and Vd are high but not unmanageable by the solver. One way to think about scaling is to change units, such as switching from kg/day to kg/s as the basis. Gekko automatically scales variables by their initial condition.
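As a minimal sketch of that unit-change idea (the 1e6 scale factor and the *_scaled names are illustrative, not part of the original model): because k1 is the only constant that multiplies two of the large states together, expressing all cell and virus quantities in units of 1e6 is equivalent to using k1*1e6 with correspondingly smaller initial conditions and Tin bounds, while mu, k2, k3, k33, k4, f, and D are unchanged.
# Illustrative unit change only (not a tested re-run of the model)
scale = 1e6
mu, k1, k2, k3, k33, k4, f = [0.027, 2.12e-9, 7.13e-3, 168, 168, 0.035, 1e-3]
k1_scaled = k1 * scale                                      # 2.12e-3 instead of 2.12e-9
x0_scaled = [x / scale for x in [5e6, 0, 0, 0, 1.25e5, 0]]  # [5.0, 0, 0, 0, 0.125, 0]
Tin_ub_scaled = 1e7 / scale                                 # 10.0 instead of 1e7
# The equations keep the same form with k1 -> k1_scaled; the objective J is
# divided by 1e6 as well, which does not change the optimal D profile.
The full script with the additional objective forms is below.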
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
#Parameters and IC
full_params = [0.027,2.12e-9,7.13e-3,168,168,0.035,1e-3]
full_x0 = [5e6,0.0,0.0,0.0,1.25e5,0.0]
mu,k1,k2,k3,k33,k4, f= full_params
#Initialize model
m = GEKKO()
#Time discretization
tf = 28
n_steps = tf*10+1
m.time = np.linspace(0,tf,n_steps)
#Define MVs
D = m.MV(value=0.1,lb=0.0,ub=0.1)
D.STATUS = 1
D.DCOST = 0.0
Tin = m.MV(value=1e7,lb=0,ub=1e7)
Tin.STATUS = 1
Tin.DCOST = 0.0
#Define States
T = m.Var(value=full_x0[0])
Id = m.Var(value=full_x0[1])
Is = m.Var(value=full_x0[2])
Ic = m.Var(value=full_x0[3])
Vs = m.Var(value=full_x0[4])
Vd = m.Var(value=full_x0[5])
#Define equations
m.Equation(T.dt() == mu*T -k1*(Vs+Vd)*T + D*(Tin-T))
m.Equation(Id.dt() == k1*Vd*T -(k1*Vs -mu)*Id -D*Id)
m.Equation(Is.dt() == k1*Vs*T -(k1*Vd + k2)*Is -D*Is)
m.Equation(Ic.dt() == k1*(Vs*Id + Vd*Is) -k2*Ic -D*Ic)
m.Equation(Vs.dt() == k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vs)
m.Equation(Vd.dt() == k33*Ic + f*k3*Is - (k1*(T+Id+Is+Ic) + k4 + D)*Vd)
# Original Objective
if True:
    J = m.Var(value=0) # objective (profit)
    Jf = m.FV()        # final objective
    Jf.STATUS = 1
    m.Connection(Jf,J,pos2="end")
    m.Equation(J.dt() == D*(Vs + Vd))
    m.Obj(-Jf)
# Objective Method 1
if False:
    p = np.zeros_like(m.time); p[-1] = 1
    final = m.Param(p)
    J = m.Var(value=0) # objective (profit)
    m.Equation(J.dt() == D*(Vs + Vd))
    m.Maximize(J*final)
# Objective Method 2
if False:
    m.Maximize(D*(Vs + Vd))
m.options.IMODE = 6 # optimal control
m.options.NODES = 2 # collocation nodes
m.options.SOLVER = 3
m.options.MAX_ITER = 10000
#Solve
m.solve()
plt.figure(figsize=(10,8))
plt.subplot(3,1,1)
plt.plot(m.time,Tin.value,'r.-',label='Tin')
plt.legend(); plt.grid()
plt.subplot(3,1,2)
plt.semilogy(m.time,T.value,label='T')
plt.semilogy(m.time,Id.value,label='Id')
plt.semilogy(m.time,Is.value,label='Is')
plt.semilogy(m.time,Ic.value,label='Ic')
plt.legend(); plt.grid()
plt.subplot(3,1,3)
plt.semilogy(m.time,Vs.value,label='Vs')
plt.semilogy(m.time,Vd.value,label='Vd')
plt.semilogy(m.time,J.value,label='Objective')
plt.legend(); plt.grid()
plt.show()
Is there any type of constraint in the problem that would favor a decrease at the end? This may be the cause of the infeasibility at tf=30. Another way to get a feasible solution is to solve once, set m.options.TIME_SHIFT=20, and re-solve the problem so that the initial conditions of the second solve are the values from time step 20 of the prior solution.
#Solve
m.solve()
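# TIME_SHIFT=20 moves the initial conditions forward by 20 time points, so the
# second solve continues from the prior solution (the receding-horizon idea below)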
m.options.TIME_SHIFT=20
m.solve()
This way, the solution steps forward in time and the problem is optimized in parts. This strategy was used to optimize a High Altitude Long Endurance (HALE) UAV and is called Receding Horizon Control.
Martin, R.A., Gates, N., Ning, A., Hedengren, J.D., Dynamic Optimization of High-Altitude Solar Aircraft Trajectories Under Station-Keeping Constraints, Journal of Guidance, Control, and Dynamics, 2018, doi: 10.2514/1.G003737.
I have written some code and am trying to use Numba to accelerate it. The main goal of the code is to group some values based on a condition; iter_ is used to iterate until the condition is satisfied. I prepared a small case below to reproduce the sample code:
import numpy as np
import numba as nb
rng = np.random.default_rng(85)
# --------------------------------------- small data volume ---------------------------------------
# values_ = {'R0': np.array([0.01090976, 0.01069902, 0.00724112, 0.0068463 , 0.01135723, 0.00990762,
# 0.01090976, 0.01069902, 0.00724112, 0.0068463 , 0.01135723]),
# 'R1': np.array([0.01836379, 0.01900166, 0.01864162, 0.0182823 , 0.01840322, 0.01653088,
# 0.01900166, 0.01864162, 0.0182823 , 0.01840322, 0.01653088]),
# 'R2': np.array([0.02430913, 0.02239156, 0.02225379, 0.02093393, 0.02408692, 0.02110411,
# 0.02239156, 0.02225379, 0.02093393, 0.02408692, 0.02110411])}
#
# params = {'R0': [3, 0.9490579204466154, 1825, 7.070272000000002e-05],
# 'R1': [0, 0.9729203826820172, 167 , 7.070272000000002e-05],
# 'R2': [1, 0.6031363088057902, 1316, 8.007296000000003e-05]}
#
# Sno, dec_, upd_ = 2, 100, 200
# -------------------------------------------------------------------------------------------------
# ----------------------------- UPDATED (medium and large data volumes) ---------------------------
# values_ = np.load("values_med.npy", allow_pickle=True)[()]
# params = np.load("params_med.npy", allow_pickle=True)[()]
values_ = np.load("values_large.npy", allow_pickle=True)[()]
params = np.load("params_large.npy", allow_pickle=True)[()]
Sno, dec_, upd_ = 2000, 1000, 200
# -------------------------------------------------------------------------------------------------
# values_ = [*values_.values()]
# params = [*params.values()]
# @nb.jit(forceobj=True)
# def test(values_, params, Sno, dec_, upd_):
final_dict = {}
for i, j in enumerate(values_.keys()):
    Rand_vals = []
    goal_sum = params[j][1] * params[j][3]
    tel = goal_sum / dec_ * 10
    if params[j][0] != 0:
        for k in range(Sno):
            final_sum = 0.0
            iter_ = 0
            t = 1
            while not np.allclose(goal_sum, final_sum, atol=tel):
                iter_ += 1
                vals_group = rng.choice(values_[j], size=params[j][0], replace=False)
                # final_sum = 0.0016 * np.sum(vals_group)  # -----> For small data volume
                final_sum = np.sum(vals_group ** 3)        # -----> UPDATED For med or large data volume
                if iter_ == upd_:
                    t += 1
                    tel = t * tel
            values_[j] = np.delete(values_[j], np.where(np.in1d(values_[j], vals_group)))
            Rand_vals.append(vals_group)
    else:
        Rand_vals = [np.array([])] * Sno
    final_dict["R" + str(i)] = Rand_vals
# return final_dict
# test(values_, params, Sno, dec_, upd_)
At first, @nb.jit was applied to this code (with forceobj=True to avoid warnings and …), which has an adverse effect on performance. nopython mode was checked too, with @nb.njit, which gives the following error because dictionary-typed inputs are not supported (as mentioned in 1, 2):
cannot determine Numba type of <class 'dict'>
I don't know if (or how) this could be handled with Dict from numba.typed (by converting the created Python dictionaries to Numba Dicts), or whether converting the dictionaries to lists of arrays has any advantage. I think parallelization may be possible if some code lines, e.g. Rand_vals.append(vals_group) or the else section or …, were taken out of the function or modified, while still getting the same results as before, but I don't have any idea how to do so.
I would be grateful for any help in utilizing Numba on this code. Numba parallelization would be the most desired solution (probably the best applicable method in terms of performance), if it is possible.
Data:
medium data volume: values_med, params_med
large data volume: values_large, params_large
This code can be converted to Numba but it is not straightforward.
First of all, the dictionary and list types must be defined, since Numba njit functions cannot directly operate on reflected lists (aka pure-Python lists). This is a bit tedious to do in Numba and the resulting code is a bit verbose:
String = nb.types.unicode_type
ValueArray = nb.float64[::1]
ValueDict = nb.types.DictType(String, ValueArray)
ParamDictValue = nb.types.Tuple([nb.int_, nb.float64, nb.int_, nb.float64])
ParamDict = nb.types.DictType(String, ParamDictValue)
FinalDictValue = nb.types.ListType(ValueArray)
FinalDict = nb.types.DictType(String, FinalDictValue)
Then you need to convert the input dictionaries:
nbValues = nb.typed.typeddict.Dict.empty(String, ValueArray)
for key,value in values_.items():
    nbValues[key] = value.copy()

nbParams = nb.typed.typeddict.Dict.empty(String, ParamDictValue)
for key,value in params.items():
    nbParams[key] = (nb.int_(value[0]), nb.float64(value[1]), nb.int_(value[2]), nb.float64(value[3]))
Then, you need to write the core function. np.allclose and np.isin are not implemented in Numba, so they have to be reimplemented manually. But the main point is that Numba does not support the rng NumPy object, and I think it will not support it any time soon. Note that Numba has a random-number implementation that tries to mimic the behavior of NumPy, but the management of the seed is a bit different. Note also that the results should be the same as with the np.random.xxx NumPy functions if the seed is set to the same value (NumPy and Numba have different seed variables that are not synchronized).
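As a small, self-contained sketch of that seeding point (the seed value 85 just mirrors the question's rng): NumPy's global generator and Numba's internal generator are seeded independently, and Numba's one must be seeded from inside a jitted function.
import numba as nb
import numpy as np

@nb.njit
def seed_numba(seed):
    # Inside an njit function this seeds Numba's own generator,
    # not NumPy's global one.
    np.random.seed(seed)

np.random.seed(85)   # seeds the NumPy generator used by interpreted code
seed_numba(85)       # seeds the generator used inside njit functions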
@nb.njit(FinalDict(ValueDict, ParamDict, nb.int_, nb.int_, nb.int_))
def nbTest(values_, params, Sno, dec_, upd_):
    final_dict = nb.typed.Dict.empty(String, FinalDictValue)

    for i, j in enumerate(values_.keys()):
        Rand_vals = nb.typed.List.empty_list(ValueArray)
        goal_sum = params[j][1] * params[j][3]
        tel = goal_sum / dec_ * 10

        if params[j][0] != 0:
            for k in range(Sno):
                final_sum = 0.0
                iter_ = 0
                t = 1
                vals_group = np.empty(0, dtype=nb.float64)

                while np.abs(goal_sum - final_sum) > (1e-05 * np.abs(final_sum) + tel):
                    iter_ += 1
                    vals_group = np.random.choice(values_[j], size=params[j][0], replace=False)
                    # final_sum = 0.0016 * np.sum(vals_group)  # (for small data volume)
                    final_sum = np.sum(vals_group ** 3)        # (for med or large data volume)
                    if iter_ == upd_:
                        t += 1
                        tel = t * tel

                # Perform an in-place deletion (manual replacement for np.isin/np.delete)
                vals, gr = values_[j], vals_group
                cur = 0
                for l in range(vals.size):
                    found = False
                    for m in range(gr.size):
                        found |= vals[l] == gr[m]
                    if not found:
                        # Keep the value (delete it otherwise)
                        vals[cur] = vals[l]
                        cur += 1
                values_[j] = vals[:cur]

                Rand_vals.append(vals_group)
        else:
            for k in range(Sno):
                Rand_vals.append(np.empty(0, dtype=nb.float64))

        final_dict["R" + str(i)] = Rand_vals

    return final_dict
Note that the replacement implementation of np.isin is quite naive but it works pretty well in practice on your input example.
The function can be called in the following way:
nbFinalDict = nbTest(nbValues, nbParams, Sno, dec_, upd_)
Finally, the dictionary should be converted back to basic Python objects:
finalDict = dict()
for key,value in nbFinalDict.items():
    finalDict[key] = list(value)
This implementation is fast for small inputs but not for large ones, since np.random.choice takes almost all the time (>96%). The thing is that this function is clearly not optimal when the number of requested items is small (which is your case). Indeed, it surprisingly runs in time linear in the size of the input array and not in the number of requested items.
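A quick standalone check of that behaviour (array sizes and timings are illustrative and machine-dependent): drawing the same 12-item sample without replacement from a 100x larger array with the legacy np.random.choice is far slower, because it permutes the whole input array.
import timeit
import numpy as np

small = np.random.rand(2_000)
large = np.random.rand(200_000)

# Same sample size, very different array sizes: the cost follows the array size.
print(timeit.timeit(lambda: np.random.choice(small, 12, replace=False), number=100))
print(timeit.timeit(lambda: np.random.choice(large, 12, replace=False), number=100))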
Further Optimizations
The algorithm can be completely rewritten to extract only the 12 random items and discard them from the main current array in a much more efficient way. The idea is to swap n items (the small target sample) at the end of the array with other items at random locations, then check the sum, repeat this process until a condition is fulfilled, and finally extract the view of the last n items before resizing the view so as to discard them. All of this can be done in O(n) time rather than O(m) time, where m is the size of the main current array and n << m (e.g. 12 vs 20_000). It can also be computed without any expensive allocation. Here is the resulting code:
@nb.njit(nb.void(ValueArray, nb.int_, nb.int_))
def swap(arr, i, j):
    arr[i], arr[j] = arr[j], arr[i]

@nb.njit(FinalDict(ValueDict, ParamDict, nb.int_, nb.int_, nb.int_))
def nbTest(values_, params, Sno, dec_, upd_):
    final_dict = nb.typed.Dict.empty(String, FinalDictValue)

    for i, j in enumerate(values_.keys()):
        Rand_vals = nb.typed.List.empty_list(ValueArray)
        goal_sum = params[j][1] * params[j][3]
        tel = goal_sum / dec_ * 10
        values = values_[j]
        n = params[j][0]

        if n != 0:
            for k in range(Sno):
                final_sum = 0.0
                iter_ = 0
                t = 1
                m = values.size
                assert n <= m
                group = values[-n:]

                while np.abs(goal_sum - final_sum) > (1e-05 * np.abs(final_sum) + tel):
                    iter_ += 1

                    # Swap the group view with other random items
                    for pos in range(m - n, m):
                        swap(values, pos, np.random.randint(0, m))

                    # For small data volume:
                    # final_sum = 0.0016 * np.sum(group)

                    # For med/large data volume:
                    final_sum = 0.0
                    for v in group:
                        final_sum += v ** 3

                    if iter_ == upd_:
                        t += 1
                        tel *= t

                assert iter_ > 0
                values = values[:m-n]
                Rand_vals.append(group)
        else:
            for k in range(Sno):
                Rand_vals.append(np.empty(0, dtype=nb.float64))

        final_dict["R" + str(i)] = Rand_vals

    return final_dict
In addition to being faster, this implementation has the benefit of being simpler. Results look quite similar to those of the previous implementation, although the randomness makes checking the results tricky (especially since this function does not use the same method to choose the random sample). Note that, as opposed to the previous one, this implementation does not remove the items of values that are in group (this is probably not wanted, though).
Benchmark
Here are the results of the last implementation on my machine (compilation and conversion timings excluded):
Provided small input (embedded in the question):
- Initial code: 42.71 ms
- Numba code: 0.11 ms
Medium input:
- Initial code: 3481 ms
- Numba code: 11 ms
Large input:
- Initial code: 6728 ms
- Numba code: 20 ms
Note that the conversion takes about the same time as the computation itself.
This last implementation is roughly 316~388 times faster than the initial code on these inputs.
Notes
Note that the compilation takes a few seconds due to the dict and list types.
Note that while it may be possible to parallelize the implementation, only the outermost loop can be parallelized. The thing is that there are only a few items to compute and the time is already quite small (not the best case for multi-threading). Additionally, in the first implementation, the many temporary arrays created by rng.choice would certainly cause the parallel loop not to scale well anyway. Moreover, the typed list/dict cannot be written from multiple threads safely, so one would need to use NumPy arrays throughout the function to do that (or add extra conversions, which are already expensive). Numba parallelism also tends to significantly increase the compilation time, which is already substantial. Finally, the result would be less deterministic, since each Numba thread has its own random-number-generator seed and the items computed by each thread cannot be predicted with prange (it depends on the parallel runtime chosen on the target platform). Note that in NumPy there is one global seed used by the usual random functions (the deprecated way), while RNG objects have their own seed (the new preferred way).
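To make that last point concrete, here is a minimal sketch of the two NumPy seeding styles side by side (values are arbitrary):
import numpy as np

# Legacy interface: one module-level global state shared by the np.random.* functions
np.random.seed(85)
a = np.random.choice(np.arange(100.0), size=5, replace=False)

# New interface: each Generator object carries its own independent state
rng = np.random.default_rng(85)
b = rng.choice(np.arange(100.0), size=5, replace=False)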
I have a fairly simple test data set I am trying to fit with pymc3.
The result generated by traceplot looks something like this.
Essentially, the trace of every parameter looks like there is a standard 'caterpillar' for 100 iterations, followed by a flat line for 750 iterations, followed by the caterpillar again.
The initial 100 iterations happen after 25,000 ADVI iterations and 10,000 tune iterations. If I change these amounts, I randomly will/won't get these periods of unwanted stability.
I'm wondering if anyone has any advice about how I can stop this from happening - and what is causing it?
Thanks.
The full code is below. In brief, I am generating a set of 'phases' (-pi -> pi) with a corresponding set of values y = a(j)*sin(phase) + b(j)*cos(phase). a and b are drawn for each subject j at random, but are related to each other.
I then essentially try to fit this same model.
Edit: Here is a similar example, running for 25,000 iterations. Something goes wrong around iteration 20,000.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
%matplotlib inline
np.random.seed(0)
n_draw = 2000
n_tune = 10000
n_init = 25000
init_string = 'advi'
target_accept = 0.95
##
# Generate some test data
# Just generates:
# x a vector of phases
# y a vector corresponding to some sinusoidal function of x
# subject_idx a vector corresponding to which subject x is
#9 Subjects
N_j = 9
#Each with 276 measurements
N_i = 276
sigma_y = 1.0
mean = [0.1, 0.1]
cov = [[0.01, 0], [0, 0.01]] # diagonal covariance
x_sub = np.zeros((N_j,N_i))
y_sub = np.zeros((N_j,N_i))
y_true_sub = np.zeros((N_j,N_i))
ab_sub = np.zeros((N_j,2))
tuning_sub = np.zeros((N_j,1))
sub_ix_sub = np.zeros((N_j,N_i))
for j in range(0,N_j):
    aj,bj = np.random.multivariate_normal(mean, cov)
    #aj = np.abs(aj)
    #bj = np.abs(bj)
    xj = np.random.uniform(-1,1,size = (N_i,1))*np.pi
    xj = np.sort(xj) # for convenience
    yj_true = aj*np.sin(xj) + bj*np.cos(xj)
    yj = yj_true + np.random.normal(scale=sigma_y, size=(N_i,1))
    x_sub[j,:] = xj.ravel()
    y_sub[j,:] = yj.ravel()
    y_true_sub[j,:] = yj_true.ravel()
    ab_sub[j,:] = [aj,bj]
    tuning_sub[j,:] = np.sqrt(((aj**2)+(bj**2)))
    sub_ix_sub[j,:] = [j]*N_i
x = np.ravel(x_sub)
y = np.ravel(y_sub)
subject_idx = np.ravel(sub_ix_sub)
subject_idx = np.asarray(subject_idx, dtype=int)
##
# Fit model
hb1_model = pm.Model()
with hb1_model:
    # Hyperpriors
    hb1_mu_a = pm.Normal('hb1_mu_a', mu=0., sd=100)
    hb1_sigma_a = pm.HalfCauchy('hb1_sigma_a', 4)
    hb1_mu_b = pm.Normal('hb1_mu_b', mu=0., sd=100)
    hb1_sigma_b = pm.HalfCauchy('hb1_sigma_b', 4)
    # We fit a mixture of a sine and cosine with these two coefficients
    # allowed to be different for each subject
    hb1_aj = pm.Normal('hb1_aj', mu=hb1_mu_a, sd=hb1_sigma_a, shape = N_j)
    hb1_bj = pm.Normal('hb1_bj', mu=hb1_mu_b, sd=hb1_sigma_b, shape = N_j)
    # Model error
    hb1_eps = pm.HalfCauchy('hb1_eps', 5)
    hb1_linear = hb1_aj[subject_idx]*pm.math.sin(x) + hb1_bj[subject_idx]*pm.math.cos(x)
    hb1_linear_like = pm.Normal('y', mu = hb1_linear, sd=hb1_eps, observed=y)
with hb1_model:
    hb1_trace = pm.sample(draws=n_draw, tune = n_tune,
                          init = init_string, n_init = n_init,
                          target_accept = target_accept)
pm.traceplot(hb1_trace)
To partially answer my own question: After playing with this for a while, it looks like the problem might be due to the hyperprior standard deviation going to 0. I am not sure why the algorithm should get stuck there though (testing a small standard deviation can't be that uncommon...).
In any case, two solutions that seem to alleviate the problem (although they don't remove it entirely) are:
1) Add an offset to the definitions of the standard deviation. e.g.:
offset = 1e-2
hb1_sigma_a = offset + pm.HalfCauchy('hb1_sigma_a', 4)
2) Instead of using a HalfCauchy or HalfNormal for the SD prior, use a Lognormal distribution set up so that 0 is unlikely.
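A minimal sketch of option 2, i.e. replacing the HalfCauchy lines inside the model block (the mu/sd values here are illustrative, not tuned):
# Strictly positive and with little mass near zero, unlike HalfCauchy
hb1_sigma_a = pm.Lognormal('hb1_sigma_a', mu=np.log(0.1), sd=1.0)
hb1_sigma_b = pm.Lognormal('hb1_sigma_b', mu=np.log(0.1), sd=1.0)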
I'd look at the divergences, as explained in notes and literature on Hamiltonian Monte Carlo; see, e.g., here and here.
with model:
    np.savetxt('diverging.csv', hb1_trace['diverging'])
As a dirty solution, you can perhaps try increasing target_accept.
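For example, reusing the names from the question's script (0.99 is just an illustrative value):
with hb1_model:
    hb1_trace = pm.sample(draws=n_draw, tune=n_tune, init=init_string,
                          n_init=n_init, target_accept=0.99)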
Good luck!
I'm attempting to implement Neville's algorithm in MATLAB, with four given points in this case. However, I'm a little stuck at the moment.
This is my script so far:
% Neville's Method
% Function parameters
x = [7,14,21,28];
fx = [58,50,54,53];
t = 10;
n = length(x);
Q = zeros(n,n);
for i = 1:n
    Q(i,1) = fx(i);
end
for j = 2:n
    for i = j:n
        Q(i,j) = ((t-x(i-j)) * Q(i,j-1)/(x(i)-x(i-j))) + ((x(i)-t) * Q(i-1,j-1)/(x(i)-x(i-j)));
    end
end
disp(Q);
As for the problem I'm having, I'm getting this output consistently:
Subscript indices must either be real positive integers or logicals.
I've been trying to tweak the loop iterations, but to no avail. I know the problem is the main assignment line in the inner loop: some of the operations result in array indices that are equal to zero initially.
That's where I am, any help would be appreciated!
In your loop, the first time through, i-j is 0 because you set i = j, and in MATLAB indices start at 1. A simple fix to get running code would be to change
for i = j:n
to
for i = j+1:n
This solves
Subscript indices must either be real positive integers or logicals.
However, this may not be ideal and you may need to rethink your logic. The output I get is
>> neville
Q =
58.0000 0 0 0
50.0000 0 0 0
54.0000 50.8571 0 0
53.0000 54.2857 51.3469 0
I would like to know why the sampler is incredibly slow when sampling step by step.
For example, if I run:
mcmc = MCMC(model)
mcmc.sample(1000)
the sampling is fast. However, if I run:
mcmc = MCMC(model)
for i in arange(1000):
    mcmc.sample(1)
the sampling is slower (and the more it samples, the slower it is).
If you are wondering why I am asking this... well, I need step-by-step sampling because I want to perform some operations on the values of the variables after each step of the sampler.
Is there a way to speed it up?
Thank you in advance!
------------------ EDIT -------------------------------------------------------------
Here I present the specific problem in more details:
I have two models in competition, and they are part of a bigger model that has a categorical variable functioning as a 'switch' between the two.
In this toy example, I have the observed vector Y, which could be explained by either a Poisson or a Geometric distribution. The categorical variable switch_model selects the Geometric model when it equals 0 and the Poisson model when it equals 1.
After each sample, if switch_model selects the Geometric model, I want the variables of the Poisson model NOT to be updated, because they are not influencing the likelihood and are therefore just drifting away. The opposite holds if switch_model selects the Poisson model.
Basically, what I do at each step is 'change' the value of the non-selected model's parameter by manually bringing it one step back.
I hope that my explanation and the commented code will be clear enough. Let me know if you need further details.
import numpy as np
import pymc as pm
import pandas as pd
import matplotlib.pyplot as plt
# OBSERVED VALUES
Y = np.array([0, 1, 2, 3, 8])
# PRIOR ON THE MODELS
pi = (0.5, 0.5)
switch_model = pm.Categorical("switch_model", p = pi)
# switch_model = 0 for Geometric, switch_model = 1 for Poisson
p = pm.Uniform('p', lower = 0, upper = 1) # Prior of the parameter of the geometric distribution
mu = pm.Uniform('mu', lower = 0, upper = 10) # Prior of the parameter of the Poisson distribution
# LIKELIHOOD
@pm.observed
def Ylike(value = Y, mu = mu, p = p, M = switch_model):
    if M == 0:
        out = pm.geometric_like(value+1, p)
    elif M == 1:
        out = pm.poisson_like(value, mu)
    return out

model = pm.Model([Ylike, p, mu, switch_model])
mcmc = pm.MCMC(model)

n_samples = 5000
traces = {}
for var in mcmc.stochastics:
    traces[str(var)] = np.zeros(n_samples)

bar = pm.progressbar.progress_bar(n_samples)
bar.update(0)

mcmc.sample(1, progress_bar=False)
for var in mcmc.stochastics:
    traces[str(var)][0] = mcmc.trace(var)[-1]

for i in np.arange(1,n_samples):
    mcmc.sample(1, progress_bar=False)
    bar.update(i)
    for var in mcmc.stochastics:
        traces[str(var)][i] = mcmc.trace(var)[-1]
    if mcmc.trace('switch_model')[-1] == 0:   # Geometric wins
        traces['mu'][i] = traces['mu'][i-1]   # One step back for the sampler of the Poisson parameter
        mu.value = traces['mu'][i-1]
    elif mcmc.trace('switch_model')[-1] == 1: # Poisson wins
        traces['p'][i] = traces['p'][i-1]     # One step back for the sampler of the Geometric parameter
        p.value = traces['p'][i-1]
print '\n\n'
traces=pd.DataFrame(traces)
traces['mu'][traces['switch_model'] == 0] = np.nan
traces['p'][traces['switch_model'] == 1] = np.nan
print traces.describe()
traces.plot()
plt.show()
The reason this is so slow is that Python's for loops are pretty slow, especially when compared to Fortran loops (which is essentially what PyMC is written in). If you could show more detailed code, it might be easier to see what you are trying to do and to suggest faster alternative algorithms.
Actually, I found a 'crazy' solution, and I suspect I know why it works. I would still like to get an expert opinion on my trick.
Basically, if I modify the for loop in the following way, adding a reset of the MCMC object every 1000 iterations, the sampling speeds up again:
for i in np.arange(1,n_samples):
    mcmc.sample(1, progress_bar=False)
    bar.update(i)
    for var in mcmc.stochastics:
        traces[str(var)][i] = mcmc.trace(var)[-1]
    if mcmc.trace('switch_model')[-1] == 0:   # Geometric wins
        traces['mu'][i] = traces['mu'][i-1]   # One step back for the sampler of the Poisson parameter
        mu.value = traces['mu'][i-1]
    elif mcmc.trace('switch_model')[-1] == 1: # Poisson wins
        traces['p'][i] = traces['p'][i-1]     # One step back for the sampler of the Geometric parameter
        p.value = traces['p'][i-1]
    if i%1000 == 0:
        mcmc = pm.MCMC(model)
In practice, this trick erases the traces and the database of the sampler every 1000 steps. It looks like the sampler does not like having a long database, although I do not really understand why. (Of course, 1000 steps is arbitrary: too few and the reset adds too much overhead, too many and the traces and database grow too large.)
I find this hack a bit crazy and definitely not elegant... do any of the experts or developers have a comment on it? Thank you!