How to model a mixture of 3 Normals in PyMC?

There is a question on CrossValidated on how to use PyMC to fit two Normal distributions to data. Cam.Davidson.Pilon's answer was to use a Bernoulli distribution to assign data to one of the two Normals:
size = 10
p = Uniform("p", 0, 1)  # the fraction that comes from mean1 vs mean2
ber = Bernoulli("ber", p=p, size=size)  # produces 1 with proportion p
precision = Gamma('precision', alpha=0.1, beta=0.1)
mean1 = Normal("mean1", 0, 0.001)
mean2 = Normal("mean2", 0, 0.001)

@deterministic
def mean(ber=ber, mean1=mean1, mean2=mean2):
    return ber*mean1 + (1-ber)*mean2
Now my question is: how to do it with three Normals?
Basically, the issue is that you can't use a Bernoulli distribution and 1-Bernoulli anymore. But how to do it then?
edit: With the CDP's suggestion, I wrote the following code:
import numpy as np
import pymc as mc

n = 3
ndata = 500

dd = mc.Dirichlet('dd', theta=(1,)*n)
category = mc.Categorical('category', p=dd, size=ndata)

precs = mc.Gamma('precs', alpha=0.1, beta=0.1, size=n)
means = mc.Normal('means', 0, 0.001, size=n)

@mc.deterministic
def mean(category=category, means=means):
    return means[category]

@mc.deterministic
def prec(category=category, precs=precs):
    return precs[category]

v = np.random.randint(0, n, ndata)
data = (v==0)*(50 + np.random.randn(ndata)) \
     + (v==1)*(-50 + np.random.randn(ndata)) \
     + (v==2)*np.random.randn(ndata)

obs = mc.Normal('obs', mean, prec, value=data, observed=True)

model = mc.Model({'dd': dd,
                  'category': category,
                  'precs': precs,
                  'means': means,
                  'obs': obs})
The traces with the following sampling procedure look good as well. Solved!
mcmc = mc.MCMC(model)
mcmc.sample(50000, 0)
mcmc.trace('means').gettrace()[-1,:]

There is a mc.Categorical object that does just this.
p = [0.2, 0.3, .5]
t = mc.Categorical('test', p)
t.random()
#array(2, dtype=int32)
It returns an int between 0 and len(p)-1. To model the 3 Normals, make p a mc.Dirichlet object (it accepts a k-length array as the hyperparameters; setting all values in the array equal gives equal prior probabilities). The rest of the model is nearly identical.
This is a generalization of the model I suggested above.
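For concreteness, a minimal sketch of the Dirichlet/Categorical wiring (it mirrors the asker's edit above, so treat it as illustrative):
import pymc as mc
n = 3
dd = mc.Dirichlet('dd', theta=(1,)*n)  # equal hyperparameters -> equal prior weights
category = mc.Categorical('category', p=dd, size=500)  # draws ints in 0..n-1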
Update:
Okay, so instead of defining separate means, we can collapse them all into one array:
means = Normal("means", 0, 0.001, size=3)
...

@mc.deterministic
def mean(categorical=categorical, means=means):
    return means[categorical]


Issues with m.if2, m.abs2 in implementing a contact switch

This is a continuation of a prior question with a slightly different emphasis.
In summary, the prior solution helped with the implementation of the model inputs. The model below works and provides a solution for the full-contact condition. The framework and basic mechanics are in place for the variable contact constraint. However, no solution is available when the contact switch variable is applied to the geometric constraints, whether as a Param or an FV. (Note: there are no issues when it is applied to the dynamic equations.)
One thing I've noted is that the m.if2 output is not correct in the [0] position. Below is the output of the switch-related variables:
adiff= [0.1, 0.10502512563, 0.11005025126, 0.11507537688, 0.12010050251,...
bdiff= [1.0, 0.99497487437, 0.98994974874, 0.98492462312, 0.97989949749,...
swtch= [0.1, 0.10449736118, 0.10894421858, 0.11334057221, 0.11768642206,...
c= [0.0, 1.000000005, 1.000000005, 1.000000005, 1.000000005, 1.000000005,...
Based on the logic swtch = adiff*bdiff and m.if2(swtch-thres,0,1), c[0] should be ~1.0. I've played with these parameters and haven't found a way to affect that first cell. I can't say for sure that this initial position is causing issues, but this seems like an erroneous output regardless.
Second, given that m.if2() outputs approximately 0 and 1, I've attempted to soften the geometric constraint as m.abs2({constraint}) <= {tol}. Even when a generous tolerance is applied and c is excluded, this fails to produce a solution (whereas the hard constraint does).
Any suggestions for correcting either issue are appreciated.
Lastly, in the prior post, the use of m.integral() for setting the value of c was suggested. I'm unclear if that entails using if2 as well. If you can expand on implementing a switch that enables at t=a and switches off at t=b using an integral, that would be appreciated.
Full code:
###Import Libraries
import math
import matplotlib
matplotlib.use("TkAgg")
import matplotlib.animation as animation
import numpy as np
from gekko import GEKKO
###Defining a model
m = GEKKO(remote=True)
v = 1 #set walking speed (m/s)
L1 = .5 #set thigh length (m)
L2 = .5 #set shank length (m)
M = 75 #set mass (kg)
#################################
###Define secondary parameters
D = L1 + L2 #leg length parameter
pi = math.pi #define pi
g = 9.81 #define gravity
###Define initial and final conditions and limits
xmin = -D; xmax = D
xdotmin = .5*v; xdotmax = 1.5*v
ymin = 0*D; ymax = 5*D
q1min = -pi/2; q1max = pi/2
q2min = -pi/2; q2max = -.01
tfmin = .25; tfmax = 10
#amin = 0; amax = .45 #limits for FVs (future capability)
#bmin = .55; bmax = 1
###Defining the time parameter (0, 1)
N = 200
t = np.linspace(0,1,N)
m.time = t
###Final time Fixed Variable
TF = m.FV(1,lb=tfmin,ub=tfmax); TF.STATUS = 1
end_loc = len(m.time)-1
###Defining initial and final condition vectors
init = np.zeros(len(m.time))
final = np.zeros(len(m.time))
init[1] = 1
final[-1] = 1
init = m.Param(value=init)
final = m.Param(value=final)
###Parameters
M = m.Param(value=M) #cart mass
L1 = m.Param(value=L1) #link 1 length
L2 = m.Param(value=L2) #link 2 length
g = m.Const(value=g) #gravity
###Control Input Manipulated Variable
u = m.MV(0, lb=-70, ub=70); u.STATUS = 1
###Ground Contact Fixed Variables
#as fixed variables (future state)
#a = m.FV(0,lb=amin,ub=amax); a.STATUS = 1 #equates to the unscaled time when contact first occurs
#b = m.FV(1,lb=bmin,ub=bmax); b.STATUS = 1 #equates to the unscaled time when contact last occurs
#as fixed parameter
a = m.Param(value=-.1) #a<0 to drive m.time-a positive
b = m.Param(value=1)
###State Variables
x, y, xdot, ydot, q1, q2 = m.Array(m.Var, 6)
#Define BCs
m.free_initial(x)
m.free_final(x)
m.free_initial(xdot)
m.free_final(xdot)
m.free_initial(y)
m.free_initial(ydot)
#Define Limits
y.LOWER = ymin; y.UPPER = ymax
x.LOWER = xmin; x.UPPER = xmax
xdot.LOWER = xdotmin; xdot.UPPER = xdotmax
q1.LOWER = q1min; q1.UPPER = q1max
q2.LOWER = q2min; q2.UPPER = q2max
###Intermediates
xdot_int = m.Intermediate(final*m.integral(xdot)) #for average velocity constraint
adiff = m.Param(m.time-a.VALUE) #positive if m.time>a
bdiff = m.Param(b.VALUE-m.time) #positive if m.time<b
swtch = m.Intermediate(adiff*bdiff) #positive if m.time > a AND m.time < b
thres = .001
c = m.if2(swtch-thres,0,1) #c=0 if swtch <0, c=1 if swtch >0
###Defining the State Space Model
m.Equation(xdot.dt()/TF == -c*u*(L1*m.sin(q1)
                                 + L2*m.sin(q1+q2))
                           /(M*L1*L2*m.sin(q2)))
m.Equation(ydot.dt()/TF == c*u*(L1*m.cos(q1)
                                + L2*m.cos(q1+q2))
                          /(M*L1*L2*m.sin(q2)) - g)
m.Equation(x.dt()/TF == xdot)
m.Equation(y.dt()/TF == ydot)
m.periodic(y) #initial and final y position must be equal
m.periodic(ydot) #initial and final y velocity must be equal
m.periodic(xdot) #initial and final x velocity must be equal
m.Equation(m.abs2(xdot_int*final - v*final) <= .02) #soft constraint for average velocity ~= v
###Geometric constraints
#with no contact switch, this works
m.Equation(x + L1*m.sin(q1) + L2*m.sin(q1+q2) == 0) #x geometric constraint when in contact
m.Equation(y - L1*m.cos(q1) - L2*m.cos(q1+q2) == 0) #y geometric constraint when in contact
#soft constraint for contact switch. Produces no solution, with or without c, abs2 or abs3:
#m.Equation(c*m.abs2(x + L1*m.sin(q1) + L2*m.sin(q1+q2)) <= .01) #x geometric constraint when in contact
#m.Equation(c*m.abs2(y - L1*m.cos(q1) - L2*m.cos(q1+q2)) <= .01) #y geometric constraint when in contact
###Objectives
#Maximize stride length
m.Maximize(100*final*x)
m.Minimize(100*init*x)
#Minimize torque
m.Obj(0.01*u**2)
###Solve
m.options.IMODE = 6
m.options.SOLVER = 3
m.solve()
###Scale time vector
m.time = np.multiply(TF, m.time)
###Display Outputs
print("adiff=", adiff.VALUE)
print("bdiff=", bdiff.VALUE)
print("swtch=", swtch.VALUE)
print("c=", c.VALUE)
########################################
####Plotting the results
import matplotlib.pyplot as plt
plt.close('all')
fig1 = plt.figure()
fig2 = plt.figure()
fig3 = plt.figure()
fig4 = plt.figure()
ax1 = fig1.add_subplot()
ax2 = fig2.add_subplot(221)
ax3 = fig2.add_subplot(222)
ax4 = fig2.add_subplot(223)
ax5 = fig2.add_subplot(224)
ax6 = fig3.add_subplot()
ax7 = fig4.add_subplot(121)
ax8 = fig4.add_subplot(122)
ax1.plot(m.time,u.value,'m',lw=2)
ax1.legend([r'$u$'],loc=1)
ax1.set_title('Control Input')
ax1.set_ylabel('Torque (N-m)')
ax1.set_xlabel('Time (s)')
ax1.set_xlim(m.time[0],m.time[-1])
ax1.grid(True)
ax2.plot(m.time,x.value,'r',lw=2)
ax2.set_ylabel('X Position (m)')
ax2.set_xlabel('Time (s)')
ax2.legend([r'$x$'],loc='upper left')
ax2.set_xlim(m.time[0],m.time[-1])
ax2.grid(True)
ax2.set_title('Mass X-Position')
ax3.plot(m.time,xdot.value,'g',lw=2)
ax3.set_ylabel('X Velocity (m/s)')
ax3.set_xlabel('Time (s)')
ax3.legend([r'$xdot$'],loc='upper left')
ax3.set_xlim(m.time[0],m.time[-1])
ax3.grid(True)
ax3.set_title('Mass X-Velocity')
ax4.plot(m.time,y.value,'r',lw=2)
ax4.set_ylabel('Y Position (m)')
ax4.set_xlabel('Time (s)')
ax4.legend([r'$y$'],loc='upper left')
ax4.set_xlim(m.time[0],m.time[-1])
ax4.grid(True)
ax4.set_title('Mass Y-Position')
ax5.plot(m.time,ydot.value,'g',lw=2)
ax5.set_ylabel('Y Velocity (m/s)')
ax5.set_xlabel('Time (s)')
ax5.legend([r'$ydot$'],loc='upper left')
ax5.set_xlim(m.time[0],m.time[-1])
ax5.grid(True)
ax5.set_title('Mass Y-Velocity')
ax6.plot(x.value, y.value,'g',lw=2)
ax6.set_ylabel('Y-Position (m)')
ax6.set_xlabel('X-Position (m)')
ax6.legend([r'$mass coordinate$'],loc='upper left')
ax6.set_xlim(x.value[0],x.value[-1])
ax6.set_ylim(0,1.1)
ax6.grid(True)
ax6.set_title('Mass Position')
ax7.plot(m.time,q1.value,'r',lw=2)
ax7.set_ylabel('q1 Position (rad)')
ax7.set_xlabel('Time (s)')
ax7.legend([r'$q1$'],loc='upper left')
ax7.set_xlim(m.time[0],m.time[-1])
ax7.grid(True)
ax7.set_title('Hip Joint Angle')
ax8.plot(m.time,q2.value,'r',lw=2)
ax8.set_ylabel('q2 Position (rad)')
ax8.set_xlabel('Time (s)')
ax8.legend([r'$q2$'],loc='upper left')
ax8.set_xlim(m.time[0],m.time[-1])
ax8.grid(True)
ax8.set_title('Knee Joint Angle')
plt.show()
The m.if2() function is a Mathematical Program with Complementarity Constraints (MPCC). It does not use a binary variable like the m.if3() function, and therefore can be solved with any NLP solver, such as IPOPT. The disadvantage is that it has a saddle point at the switching condition and often gets stuck at a local solution. One way to overcome this issue is to use m.if3() with the IPOPT solver for initialization and then switch to the APOPT solver to generate an exact MINLP solution.
m.options.SOLVER=3 # IPOPT
m.solve()
m.options.SOLVER=1 # APOPT
m.options.TIME_SHIFT = 0 # don't update initial conditions
m.solve()
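For reference, a minimal standalone sketch (with made-up numbers) of the if3 switching behavior:
from gekko import GEKKO
m = GEKKO(remote=False)
x = m.Param(value=0.7)
c = m.if3(x - 0.5, 0, 1)  # c=0 when (x-0.5)<0, c=1 otherwise
m.options.SOLVER = 1      # APOPT, since if3 introduces a binary variable
m.solve(disp=False)
print(c.value)            # expect [1.0] because x > 0.5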
Additional information on MPCCs and binary conditional statements is in the Design Optimization course section on Logical Conditions.
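As a side note on the window-switch question: while a and b remain fixed Params (as in the posted code), a switch that is 1 on [a, b] and 0 outside can simply be precomputed on the time grid, avoiding if2 entirely. A minimal sketch under that assumption (window bounds are hypothetical):
import numpy as np
from gekko import GEKKO
m = GEKKO(remote=False)
t = np.linspace(0, 1, 201)
m.time = t
a, b = 0.25, 0.75  # hypothetical fixed window bounds
# 1 inside [a, b], 0 outside; not valid once a and b become decision variables (FVs)
c = m.Param(value=((t >= a) & (t <= b)).astype(float))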

How to set up GEKKO for parameter estimation from multiple independent sets of data?

I am learning how to use GEKKO for kinetic parameter estimation based on laboratory batch reactor data, which essentially consists of the concentration profiles of three species A, C, and P. For the purposes of my question, I am using a model that I previously featured in a question related to parameter estimation from a single data set.
My ultimate goal is to be able to use multiple experimental runs for parameter estimation, leveraging data that may be collected at different temperatures, species concentrations, etc. Due to the independent nature of individual batch reactor experiments, each data set features samples collected at different time points. These different time points (and in the future, different temperatures, for instance) are difficult for me to implement into a GEKKO model, as I previously used the experimental data collection time points as the m.time parameter for the GEKKO model. (See the end of the post for the code.) I have solved problems like this in the past with gPROMS and Athena Visual Studio.
To illustrate my problem, I generated an artificial data set of 'experimental' data from my original model by introducing noise to the species concentration profiles, and shifting the experimental time points slightly. I then combined all data sets of the same experimental species into new arrays featuring multiple columns. My thought process here was that GEKKO would carry out the parameter estimation by using the experimental data of each corresponding column of the arrays, so that times_comb[:,0] would be related to A_comb[:,0] while times_comb[:,1] would be related to A_comb[:,1].
When I attempt to run the GEKKO model, the system does obtain a solution for the parameter estimation, but it is unclear to me whether the solution is reasonable, as I notice that the GEKKO Variables A, B, C, and P are 34-element vectors, double the number of elements in each of the experimental data sets. I presume GEKKO is somehow combining both columns of the time and Parameter vectors during model setup, which leads to those 34-element variables? I am also concerned that, during this combination of the columns of each input parameter, the relationship between a certain time point and the collected species information is lost.
How could I improve the use of multiple data sets that GEKKO can simultaneously use for parameter estimation, considering that the time points of each data set may be different? I looked at the GEKKO documentation examples as well as the APMonitor website, but I could not find examples featuring multiple data sets that I could use for guidance, as I am fairly new to the GEKKO package.
Thank you for your time reading my question and for any help/ideas you may have.
Code below:
import numpy as np
import matplotlib.pyplot as plt
from gekko import GEKKO
#Experimental data
times = np.array([0.0, 0.071875, 0.143750, 0.215625, 0.287500, 0.359375, 0.431250,
0.503125, 0.575000, 0.646875, 0.718750, 0.790625, 0.862500,
0.934375, 1.006250, 1.078125, 1.150000])
A_obs = np.array([1.0, 0.552208, 0.300598, 0.196879, 0.101175, 0.065684, 0.045096,
0.028880, 0.018433, 0.011509, 0.006215, 0.004278, 0.002698,
0.001944, 0.001116, 0.000732, 0.000426])
C_obs = np.array([0.0, 0.187768, 0.262406, 0.350412, 0.325110, 0.367181, 0.348264,
0.325085, 0.355673, 0.361805, 0.363117, 0.327266, 0.330211,
0.385798, 0.358132, 0.380497, 0.383051])
P_obs = np.array([0.0, 0.117684, 0.175074, 0.236679, 0.234442, 0.270303, 0.272637,
0.274075, 0.278981, 0.297151, 0.297797, 0.298722, 0.326645,
0.303198, 0.277822, 0.284194, 0.301471])
#Generate second set of 'experimental data'
times_new = times + np.random.uniform(0.0,0.01)
P_obs_noisy = P_obs+np.random.normal(0,0.05,P_obs.shape)
A_obs_noisy = A_obs+np.random.normal(0,0.05,A_obs.shape)
C_obs_noisy = C_obs+np.random.normal(0,0.05,C_obs.shape)
#Combine two data sets into multi-column arrays
times_comb = np.array([times, times_new]).T
P_comb = np.array([P_obs, P_obs_noisy]).T
A_comb = np.array([A_obs, A_obs_noisy]).T
C_comb = np.array([C_obs, C_obs_noisy]).T
m = GEKKO(remote=False)
t = m.time = times_comb #using two column time array
Am = m.Param(value=A_comb) #Using the two column data as observed parameter
Cm = m.Param(value=C_comb)
Pm = m.Param(value=P_comb)
A = m.Var(1, lb = 0)
B = m.Var(0, lb = 0)
C = m.Var(0, lb = 0)
P = m.Var(0, lb = 0)
k = m.Array(m.FV,6,value=1,lb=0)
for ki in k:
    ki.STATUS = 1
k1,k2,k3,k4,k5,k6 = k
r1 = m.Var(0, lb = 0)
r2 = m.Var(0, lb = 0)
r3 = m.Var(0, lb = 0)
r4 = m.Var(0, lb = 0)
r5 = m.Var(0, lb = 0)
r6 = m.Var(0, lb = 0)
m.Equation(r1 == k1 * A)
m.Equation(r2 == k2 * A * B)
m.Equation(r3 == k3 * C * B)
m.Equation(r4 == k4 * A)
m.Equation(r5 == k5 * A)
m.Equation(r6 == k6 * A * B)
#mass balance diff eqs, function calls rxn function
m.Equation(A.dt() == - r1 - r2 - r4 - r5 - r6)
m.Equation(B.dt() == r1 - r2 - r3 - r6)
m.Equation(C.dt() == r2 - r3 + r4)
m.Equation(P.dt() == r3 + r5 + r6)
m.Minimize((A-Am)**2)
m.Minimize((P-Pm)**2)
m.Minimize((C-Cm)**2)
m.options.IMODE = 5
m.options.SOLVER = 3 #IPOPT optimizer
m.options.NODES = 6
m.solve()
k_opt = []
for ki in k:
    k_opt.append(ki.value[0])
print(k_opt)
plt.plot(t,A)
plt.plot(t,C)
plt.plot(t,P)
plt.plot(t,B)
plt.plot(times,A_obs,'bo')
plt.plot(times,C_obs,'gx')
plt.plot(times,P_obs,'rs')
plt.plot(times_new, A_obs_noisy,'b*')
plt.plot(times_new, C_obs_noisy,'g*')
plt.plot(times_new, P_obs_noisy,'r*')
plt.show()
To have multiple data sets with different times and data points, you can join the data sets as a pandas dataframe. Here is a simple example:
# data set 1
t_data1 = [0.0, 0.1, 0.2, 0.4, 0.8, 1.00]
x_data1 = [2.0, 1.6, 1.2, 0.7, 0.3, 0.15]
# data set 2
t_data2 = [0.0, 0.15, 0.25, 0.45, 0.85, 0.95]
x_data2 = [3.6, 2.25, 1.75, 1.00, 0.35, 0.20]
The merged data has NaN where the data is missing:
x1 x2
Time
0.00 2.0 3.60
0.10 1.6 NaN
0.15 NaN 2.25
0.20 1.2 NaN
0.25 NaN 1.75
Mark where the data is missing with an indicator: 1 = measured, 0 = not measured.
# indicate which points are measured
z1 = (data['x1']==data['x1']).astype(int) # 1 if measured, 0 if NaN (NaN != NaN)
z2 = (data['x2']==data['x2']).astype(int) # 1 if measured, 0 if NaN
The final step is to set up Gekko variables, equations, and objective to accommodate the data sets.
xm = m.Array(m.Param,2)
zm = m.Array(m.Param,2)
for i in range(2):
    m.Equation(x[i].dt()== -k * x[i]) # differential equations
    m.Minimize(zm[i]*(x[i]-xm[i])**2) # objectives
You can also calculate the initial condition with m.free_initial(x[i]). This gives an optimal solution for one parameter value (k) over the 2 data sets. This approach can be expanded to multiple variables or multiple data sets with different times.
from gekko import GEKKO
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# data set 1
t_data1 = [0.0, 0.1, 0.2, 0.4, 0.8, 1.00]
x_data1 = [2.0, 1.6, 1.2, 0.7, 0.3, 0.15]
# data set 2
t_data2 = [0.0, 0.15, 0.25, 0.45, 0.85, 0.95]
x_data2 = [3.6, 2.25, 1.75, 1.00, 0.35, 0.20]
# combine with dataframe join
data1 = pd.DataFrame({'Time':t_data1,'x1':x_data1})
data2 = pd.DataFrame({'Time':t_data2,'x2':x_data2})
data1.set_index('Time', inplace=True)
data2.set_index('Time', inplace=True)
data = data1.join(data2,how='outer')
print(data.head())
# indicate which points are measured
z1 = (data['x1']==data['x1']).astype(int) # 1 if measured, 0 if NaN (NaN != NaN)
z2 = (data['x2']==data['x2']).astype(int) # 1 if measured, 0 if NaN
# replace NaN with any number (0)
data.fillna(0,inplace=True)
m = GEKKO(remote=False)
# measurements
xm = m.Array(m.Param,2)
xm[0].value = data['x1'].values
xm[1].value = data['x2'].values
# index for objective (0=not measured, 1=measured)
zm = m.Array(m.Param,2)
zm[0].value=z1
zm[1].value=z2
m.time = data.index
x = m.Array(m.Var,2) # fit to measurement
x[0].value=x_data1[0]; x[1].value=x_data2[0]
k = m.FV(); k.STATUS = 1 # adjustable parameter
for i in range(2):
    m.free_initial(x[i]) # calculate initial condition
    m.Equation(x[i].dt()== -k * x[i]) # differential equations
    m.Minimize(zm[i]*(x[i]-xm[i])**2) # objectives
m.options.IMODE = 5 # dynamic estimation
m.options.NODES = 2 # collocation nodes
m.solve(disp=True) # solve
k = k.value[0]
print('k = '+str(k))
# plot solution
plt.plot(m.time,x[0].value,'b.--',label='Predicted 1')
plt.plot(m.time,x[1].value,'r.--',label='Predicted 2')
plt.plot(t_data1,x_data1,'bx',label='Measured 1')
plt.plot(t_data2,x_data2,'rx',label='Measured 2')
plt.legend(); plt.xlabel('Time'); plt.ylabel('Value')
plt.show()
For reference, here is my updated code (not fully cleaned up to minimize the number of variables) incorporating the selected answer to my question. The model does a regression of 3 measured species in two separate 'datasets.'
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from gekko import GEKKO
#Experimental data
times = np.array([0.0, 0.071875, 0.143750, 0.215625, 0.287500, 0.359375, 0.431250,
0.503125, 0.575000, 0.646875, 0.718750, 0.790625, 0.862500,
0.934375, 1.006250, 1.078125, 1.150000])
A_obs = np.array([1.0, 0.552208, 0.300598, 0.196879, 0.101175, 0.065684, 0.045096,
0.028880, 0.018433, 0.011509, 0.006215, 0.004278, 0.002698,
0.001944, 0.001116, 0.000732, 0.000426])
C_obs = np.array([0.0, 0.187768, 0.262406, 0.350412, 0.325110, 0.367181, 0.348264,
0.325085, 0.355673, 0.361805, 0.363117, 0.327266, 0.330211,
0.385798, 0.358132, 0.380497, 0.383051])
P_obs = np.array([0.0, 0.117684, 0.175074, 0.236679, 0.234442, 0.270303, 0.272637,
0.274075, 0.278981, 0.297151, 0.297797, 0.298722, 0.326645,
0.303198, 0.277822, 0.284194, 0.301471])
#Generate second set of 'experimental data'
times_new = times + np.random.uniform(0.0,0.01)
P_obs_noisy = (P_obs+ np.random.normal(0,0.05,P_obs.shape))
A_obs_noisy = (A_obs+np.random.normal(0,0.05,A_obs.shape))
C_obs_noisy = (C_obs+np.random.normal(0,0.05,C_obs.shape))
#Combine two data sets into multi-column arrays using pandas DataFrames
#Set dataframe index to be combined time discretization of both data sets
exp1 = pd.DataFrame({'Time':times,'A':A_obs,'C':C_obs,'P':P_obs})
exp2 = pd.DataFrame({'Time':times_new,'A':A_obs_noisy,'C':C_obs_noisy,'P':P_obs_noisy})
exp1.set_index('Time',inplace=True)
exp2.set_index('Time',inplace=True)
exps = exp1.join(exp2, how ='outer',lsuffix = '_1',rsuffix = '_2')
#print(exps.head())
#Combine both data sets into a single data frame
meas_data = pd.DataFrame().reindex_like(exps)
#define measurement locations for each data set, with NaN written for time points
#not common in both data sets
for cols in exps:
    meas_data[cols] = (exps[cols]==exps[cols]).astype(int)
exps.fillna(0,inplace = True) #replace NaN with 0
m = GEKKO(remote=False)
t = m.time = exps.index #set GEKKO time domain to use experimental time points
#Generate two-column GEKKO arrays to store observed values of each species, A, C and P
Am = m.Array(m.Param,2)
Cm = m.Array(m.Param,2)
Pm = m.Array(m.Param,2)
Am[0].value = exps['A_1'].values
Am[1].value = exps['A_2'].values
Cm[0].value = exps['C_1'].values
Cm[1].value = exps['C_2'].values
Pm[0].value = exps['P_1'].values
Pm[1].value = exps['P_2'].values
#Define GEKKO variables that determine if a time point contains data to be used in the regression
#If time point contains species data, meas_ variable = 1, else = 0
meas_A = m.Array(m.Param,2)
meas_C = m.Array(m.Param,2)
meas_P = m.Array(m.Param,2)
meas_A[0].value = meas_data['A_1'].values
meas_A[1].value = meas_data['A_2'].values
meas_C[0].value = meas_data['C_1'].values
meas_C[1].value = meas_data['C_2'].values
meas_P[0].value = meas_data['P_1'].values
meas_P[1].value = meas_data['P_2'].values
#Define Variables for differential equations A, B, C, P, with initial conditions set by experimental observation at first time point
A = m.Array(m.Var,2, lb = 0)
B = m.Array(m.Var,2, lb = 0)
C = m.Array(m.Var,2, lb = 0)
P = m.Array(m.Var,2, lb = 0)
A[0].value = exps['A_1'][0] ; A[1].value = exps['A_2'][0]
B[0].value = 0 ; B[1].value = 0
C[0].value = exps['C_1'][0] ; C[1].value = exps['C_2'][0]
P[0].value = exps['P_1'][0] ; P[1].value = exps['P_2'][0]
#Define kinetic coefficients, k1-k6 as regression FV's
k = m.Array(m.FV,6,value=1,lb=0,ub = 20)
for ki in k:
    ki.STATUS = 1
k1,k2,k3,k4,k5,k6 = k
#If doing parameter estimation, enable free initial conditions; otherwise exclude them to reduce DOFs (e.g., for simulation)
if k1.STATUS == 1:
    for i in range(2):
        m.free_initial(A[i])
        m.free_initial(B[i])
        m.free_initial(C[i])
        m.free_initial(P[i])
#Define reaction rate variables
r1 = m.Array(m.Var,2, value = 1, lb = 0)
r2 = m.Array(m.Var,2, value = 1, lb = 0)
r3 = m.Array(m.Var,2, value = 1, lb = 0)
r4 = m.Array(m.Var,2, value = 1, lb = 0)
r5 = m.Array(m.Var,2, value = 1, lb = 0)
r6 = m.Array(m.Var,2, value = 1, lb = 0)
#Model Equations
for i in range(2):
    #Rate equations
    m.Equation(r1[i] == k1 * A[i])
    m.Equation(r2[i] == k2 * A[i] * B[i])
    m.Equation(r3[i] == k3 * C[i] * B[i])
    m.Equation(r4[i] == k4 * A[i])
    m.Equation(r5[i] == k5 * A[i])
    m.Equation(r6[i] == k6 * A[i] * B[i])
    #Differential species balances
    m.Equation(A[i].dt() == - r1[i] - r2[i] - r4[i] - r5[i] - r6[i])
    m.Equation(B[i].dt() == r1[i] - r2[i] - r3[i] - r6[i])
    m.Equation(C[i].dt() == r2[i] - r3[i] + r4[i])
    m.Equation(P[i].dt() == r3[i] + r5[i] + r6[i])
    #Minimization objective functions
    m.Obj(meas_A[i]*(A[i]-Am[i])**2)
    m.Obj(meas_P[i]*(P[i]-Pm[i])**2)
    m.Obj(meas_C[i]*(C[i]-Cm[i])**2)
#Solver options
m.options.IMODE = 5
m.options.SOLVER = 3 #IPOPT optimizer
m.options.NODES = 6
m.solve()
k_opt = []
for ki in k:
    k_opt.append(ki.value[0])
print(k_opt)
plt.plot(t,A[0],'b-')
plt.plot(t,A[1],'b--')
plt.plot(t,C[0],'g-')
plt.plot(t,C[1],'g--')
plt.plot(t,P[0],'r-')
plt.plot(t,P[1],'r--')
plt.plot(times,A_obs,'bo')
plt.plot(times,C_obs,'gx')
plt.plot(times,P_obs,'rs')
plt.plot(times_new, A_obs_noisy,'b*')
plt.plot(times_new, C_obs_noisy,'g*')
plt.plot(times_new, P_obs_noisy,'r*')
plt.show()

Tensorflow/Keras: volatile validation loss

I've been training a U-Net for single-class small lesion segmentation, and have been getting consistently volatile validation loss. I have about 20k images split 70/30 between training and validation sets, so I don't think the issue is too little data. I've tried shuffling and resplitting the sets a few times with no change in volatility, so I don't think the validation set is unrepresentative. I have tried lowering the learning rate with no effect on volatility. And I have tried a few loss functions (dice coefficient, focal tversky, weighted binary cross-entropy). I'm using a decent amount of augmentation so as to avoid overfitting. I've also run through all my data (512x512 float64s with corresponding 512x512 int64 masks, both stored as numpy arrays) to double-check that the value range, dtypes, etc. aren't screwy, and I even removed any ROIs in the masks under 35 pixels in area, which I thought might be artifacts messing with the loss.
I'm using keras ImageDataGenerator.flow_from_directory. I was initially using zca_whitening and brightness_range augmentation, but I think this causes issues with flow_from_directory and the link between mask and image being lost, so I skipped it.
I've tried validation generators with and without shuffle=True. Batch size is 8.
Here's some of my code, happy to include more if it would help:
# loss
from keras.losses import binary_crossentropy
import keras.backend as K
import tensorflow as tf
epsilon = 1e-5
smooth = 1
def dsc(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return score

def dice_loss(y_true, y_pred):
    loss = 1 - dsc(y_true, y_pred)
    return loss

def bce_dice_loss(y_true, y_pred):
    loss = binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)
    return loss

def confusion(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.clip(y_pred, 0, 1)
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.clip(y_true, 0, 1)
    y_neg = 1 - y_pos
    tp = K.sum(y_pos * y_pred_pos)
    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)
    prec = (tp + smooth)/(tp+fp+smooth)
    recall = (tp+smooth)/(tp+fn+smooth)
    return prec, recall

def tp(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pos = K.round(K.clip(y_true, 0, 1))
    tp = (K.sum(y_pos * y_pred_pos) + smooth)/ (K.sum(y_pos) + smooth)
    return tp

def tn(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos
    tn = (K.sum(y_neg * y_pred_neg) + smooth) / (K.sum(y_neg) + smooth)
    return tn

def tversky(y_true, y_pred):
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1-y_pred_pos))
    false_pos = K.sum((1-y_true_pos)*y_pred_pos)
    alpha = 0.7
    return (true_pos + smooth)/(true_pos + alpha*false_neg + (1-alpha)*false_pos + smooth)

def tversky_loss(y_true, y_pred):
    return 1 - tversky(y_true, y_pred)

def focal_tversky(y_true, y_pred):
    pt_1 = tversky(y_true, y_pred)
    gamma = 0.75
    return K.pow((1-pt_1), gamma)
model = BlockModel((len(os.listdir(os.path.join(imageroot,'train_ct','train'))), 512, 512, 1),filt_num=16,numBlocks=4)
#model.compile(optimizer=Adam(learning_rate=0.001), loss=weighted_cross_entropy)
#model.compile(optimizer=Adam(learning_rate=0.001), loss=dice_coef_loss)
model.compile(optimizer=Adam(learning_rate=0.001), loss=focal_tversky)
train_mask = os.path.join(imageroot,'train_masks')
val_mask = os.path.join(imageroot,'val_masks')
model.load_weights(model_weights_path) #I'm initializing with some pre-trained weights from a similar model
data_gen_args_mask = dict(
    rotation_range=10,
    shear_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=[0.8,1.2],
    horizontal_flip=True,
    #vertical_flip=True,
    fill_mode='nearest',
    data_format='channels_last'
)
data_gen_args = dict(**data_gen_args_mask)
image_datagen_train = ImageDataGenerator(**data_gen_args)
mask_datagen_train = ImageDataGenerator(**data_gen_args)#_mask)
image_datagen_val = ImageDataGenerator()
mask_datagen_val = ImageDataGenerator()
seed = 1
BS = 8
steps = int(np.floor((len(os.listdir(os.path.join(train_ct,'train'))))/BS))
print(steps)
val_steps = int(np.floor((len(os.listdir(os.path.join(val_ct,'val'))))/BS))
print(val_steps)
train_image_generator = image_datagen_train.flow_from_directory(
    train_ct,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
train_mask_generator = mask_datagen_train.flow_from_directory(
    train_mask,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
val_image_generator = image_datagen_val.flow_from_directory(
    val_ct,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
val_mask_generator = mask_datagen_val.flow_from_directory(
    val_mask,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
train_generator = zip(train_image_generator, train_mask_generator)
val_generator = zip(val_image_generator, val_mask_generator)
# make callback for checkpointing
plot_losses = PlotLossesCallback(skip_first=0,plot_extrema=False)
%matplotlib inline
filepath = os.path.join(versionPath, model_version + "_saved-model-{epoch:02d}-{val_loss:.2f}.hdf5")
if reduce:
    cb_check = [ModelCheckpoint(filepath, monitor='val_loss',
                                verbose=1, save_best_only=False,
                                save_weights_only=True, mode='auto', period=1),
                reduce_lr,
                plot_losses]
else:
    cb_check = [ModelCheckpoint(filepath, monitor='val_loss',
                                verbose=1, save_best_only=False,
                                save_weights_only=True, mode='auto', period=1),
                plot_losses]
# train model
history = model.fit_generator(train_generator, epochs=numEp,
                              steps_per_epoch=steps,
                              validation_data=val_generator,
                              validation_steps=val_steps,
                              verbose=1,
                              callbacks=cb_check,
                              use_multiprocessing=False)
And here's how my loss looks (loss plot omitted; the training loss is stable while the validation loss is volatile):
Another potentially relevant thing: I tweaked the flow_from_directory code a bit (added npy to the whitelist). But training loss looks fine, so I'm assuming the issue isn't there.
Two suggestions:
Switch to the classic validation data format (i.e. a numpy array) instead of using a generator; this ensures you always use exactly the same validation data every time (a sketch follows below). If you then see a different validation curve, there is something "random" in the validation generator giving you different data at different epochs.
Use a fixed set of samples (100 or 1000 should be enough without any data augmentation) for both training and validation. If everything goes well, you should see your network quickly overfit to this dataset, and your training and validation curves should look very similar. If not, debug your network.
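For the first suggestion, here is a minimal sketch of building a fixed numpy validation set; the directory layout and matching .npy filenames are assumptions based on the post, so adjust to your setup:
import os
import numpy as np
# build the validation arrays once (assumed paths; masks assumed to share filenames)
val_img_dir = os.path.join(imageroot, 'val_ct', 'val')
val_msk_dir = os.path.join(imageroot, 'val_masks', 'val')
files = sorted(os.listdir(val_img_dir))
X_val = np.stack([np.load(os.path.join(val_img_dir, f)) for f in files])[..., None]
y_val = np.stack([np.load(os.path.join(val_msk_dir, f)) for f in files])[..., None]
# pass the arrays instead of a generator: identical validation data every epoch
history = model.fit_generator(train_generator, epochs=numEp,
                              steps_per_epoch=steps,
                              validation_data=(X_val, y_val),
                              verbose=1, callbacks=cb_check)
If the validation curve is still volatile with a frozen validation set, the noise is coming from the training side rather than from the validation data pipeline.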

Why does pymc.MAP not always return the same value

I am running pymc2 to fit a straight line through my data. The code is shown below (modified from examples I found online). When I call the MAP function multiple times, I get different answers, even though I start with the exact same model. I thought the optimization method, fmin_powell, starts at the supplied value for each parameter. As far as I know, fmin_powell has no random component, so it should always end at the same optimum, yet it doesn't. Why do I keep getting different results?
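As a quick sanity check that fmin_powell itself is deterministic, here is a minimal standalone sketch (plain scipy, independent of pymc):
from scipy.optimize import fmin_powell
def f(v):
    return (v[0] - 1.0)**2 + (v[1] + 2.0)**2
# repeated calls from the same start return the same optimum
print(fmin_powell(f, [0.0, 0.0], disp=False))
print(fmin_powell(f, [0.0, 0.0], disp=False))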
import numpy as np
import pymc
# observed data
n = 21
a = 6
b = 2
sigma = 2
x = np.linspace(0, 1, n)
np.random.seed(1)
y_obs = a * x + b + np.random.normal(0, sigma, n)
def model():
    # define priors
    a = pymc.Normal('a', mu=0, tau=1 / 10 ** 2, value=5)
    b = pymc.Normal('b', mu=0, tau=1 / 10 ** 2, value=1)
    tau = pymc.Gamma('tau', alpha=0.1, beta=0.1, value=1)

    # define likelihood
    @pymc.deterministic
    def mu(a=a, b=b, x=x):
        return a * x + b

    y = pymc.Normal('y', mu=mu, tau=tau, value=y_obs, observed=True)
    return locals()
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)

Why does this error pop up, what are your thoughts on my neural network/genetic algorithm?

Preamble:
This is a combination of my first and second programs in Python (besides hello-world-level tutorials). Any questions I've had have led me to this site, so it seemed fitting that I post it here. I come from a TI-Basic background, so if you have no idea why I did it this way when you should do it that way, that is likely why.
My first program was a genetic learning algorithm. Its testing setup was/is to guess your input string. There is currently a problem with it, but it only slightly affects the efficiency of the program.[1]
My second is a simple feed forward neural network (I am currently only working on the xor problem). Some of the code for customizing the variables (the number of inputs, the number of outputs, the number of hidden layers, the number of neurons in those hidden layers) is there but is currently not my focus.
What I am trying to do now is train my network with my genetic algorithm. All seems to be fine, but I keep getting an inexplicable error.
Traceback (most recent call last):
File "python", line 174, in <module>
File "python", line 68, in fitness_function
File "python", line 146, in weight_dot_value_plus_bias
TypeError: 'int' object is not subscriptable
Now the weird thing is, the code this error refers to is a direct transfer of code from the original neural network.
I am using repl.it as my compiler, could that be the problem?
import random
from random import choice
from random import randint
#Global variables
length_of_phrase = 15
generation_number = 0
max_number_of_generations = 250
population = 150
perckill = 40
percparents = 35
percrandom = 1
percmutate = 1
individual_by_gene_matrix = [0]
one = 1
zero = 0
number_of_layers = 3
number_of_neurons = [2,3,1]
nnv = [0]*number_of_layers
nnw = [0]*number_of_layers
nnb = [0]*number_of_layers
val1 = randint(0,1)
val2 = randint(0,1)
living = int(((100 - perckill)*population)//100)
dead = population - living
random_strings = int((( percrandom)*population)//100)
reproduced_strings = int(living + random_strings)
parents = int(((100 - percparents)*population)//100)
"""
print(living)
print(dead)
print(population)
print(random_strings)
print(reproduced_strings)
"""
def random_matrix_generator(): #generates a matrix with width = number of genes in the target and height = population
    from random import randint
    individual_by_gene_matrix = [[randint(-200, 200)/100 for x in range(length_of_phrase)] for x in range(population)]
    #horizontal is traits, vertical is individual
    #each gene represents a letter
    #each individual represents a word
    return(individual_by_gene_matrix)
"""
def convert_matrix_into_list_of_stings():
listofstrings = [ () for var in range( population)]
for var in range( population ):
list = individual_by_gene_matrix[var] #creates a list for each individual with their traits
lista = [ (chr(n )) for n in list] #the traits become letters
listofstrings[var] = ''.join(lista) #creates a list of all the individuals with letters joined
return(listofstrings)
"""
def fitness_function():
    for individual in range(population):
        # unpack the 15 genes (9 weights, then 6 biases) as the NN_setup arguments
        number_of_layers, number_of_neurons, nnv, nnw, nnb = NN_setup(val1, val2, *individual_by_gene_matrix[individual])
        for var in range(1, number_of_layers):
            nnv = weight_dot_value_plus_bias(var)
            nnv = sigmoid(var)
        fitness[individual] = 1 - abs((val1 ^ val2) - (nnv[2][0]))
    #for n in range(population):
    #    print('{} : {} : {}'.format(n, listofstrings[n], fitness[n]))
    return(fitness)
def matrix_reorder():
    temp_individual_by_gene_matrix = [[0 for var in range(length_of_phrase)] for var in range(population)]
    temp_fitness = [(0) for var in range(population)]
    for var in range(population):
        var_a = fitness.index(max(fitness))
        temp_fitness[var] = fitness.pop(var_a)
        temp_individual_by_gene_matrix[var] = individual_by_gene_matrix.pop(var_a)
    return(temp_individual_by_gene_matrix, temp_fitness)
def kill():
    for individual in range(living, population):
        individual_by_gene_matrix[individual] = [0]*length_of_phrase
    return(individual_by_gene_matrix)
def reproduce():
    for individual in range(living, reproduced_strings):
        for gene in range(length_of_phrase):
            individual_by_gene_matrix[individual][gene] = randint(-200, 200)/100
    for individual in range(reproduced_strings, population):
        mom = randint(0, parents)
        dad = randint(0, parents)
        for gene in range(length_of_phrase):
            individual_by_gene_matrix[individual][gene] = random.choice([individual_by_gene_matrix[mom][gene], individual_by_gene_matrix[dad][gene]])
    return(individual_by_gene_matrix)
def mutate():
    for individual in range(population):
        for gene in range(length_of_phrase):
            if randint(0, 100) <= percmutate:
                individual_by_gene_matrix[individual][gene] = random.gauss(individual_by_gene_matrix[individual][gene], 0.5)
    return(individual_by_gene_matrix)
def NN_setup(val1, val2, w100, w101, w110, w111, w120, w121, w200, w201, w202, b00, b01, b10, b11, b12, b20):
    number_of_layers = 3
    number_of_neurons = [2, 3, 1]
    nnv = [0]*number_of_layers
    nnw = [0]*number_of_layers
    nnb = [0]*number_of_layers
    for layer in range(number_of_layers):
        nnv[layer] = [0]*number_of_neurons[layer]
        nnw[layer] = [0]*number_of_neurons[layer]
        nnb[layer] = [0]*number_of_neurons[layer]
        if layer != 0:
            for neuron in range(number_of_neurons[layer]):
                nnw[layer][neuron] = [0]*number_of_neurons[layer - 1]
    nnv = [[val1, val2], [0.0, 0.0, 0.0], [0.0]]
    nnw = [['inputs have no weight'], [[w100, w101], [w110, w111], [w120, w121]], [[w200, w201, w202]]]
    nnb = [[b00, b01], [b10, b11, b12], [b20]]
    return(number_of_layers, number_of_neurons, nnv, nnw, nnb)
def weight_dot_value_plus_bias(layer): #the traceback points at the marked line below
    for neuron in range(number_of_neurons[layer]):
        for weight in range(number_of_neurons[layer - 1]):
            nnv[layer][neuron] += (nnv[layer-1][weight])*(nnw[layer][neuron][weight]) # <--- error here
        nnv[layer][neuron] += nnb[layer][neuron]
    return(nnv)
def sigmoid(layer):
    for neuron in range(number_of_neurons[layer]):
        nnv[layer][neuron] = (1/(1+3**(-nnv[layer][neuron])))
    return(nnv)
individual_by_gene_matrix = random_matrix_generator()
while (generation_number <= max_number_of_generations):
    val1 = randint(0, 1)
    val2 = randint(0, 1)
    fitness = [(0) for var in range(population)]
    #populations_phenotypes_by_individual = convert_matrix_into_list_of_stings()
    fitness = fitness_function()
    individual_by_gene_matrix, fitness = matrix_reorder()
    individual_by_gene_matrix = kill()
    individual_by_gene_matrix = reproduce()
    individual_by_gene_matrix = mutate()
    individual_by_gene_matrix, fitness = matrix_reorder()
    #populations_phenotypes_by_individual = convert_matrix_into_list_of_stings()
    print('{} {} {} {}'.format(generation_number, (10000*fitness[0])//100, val1, val2))
    generation_number += 1
print('')
print('')
print(individual_by_gene_matrix[0])
That was way too many indents!!!
How the hell do I just insert a block of code????!!!!
I'll give you the source code to the individual programs once I learn how to insert a block of code.
[1] You're going to have to wait till I give you the source code to just the genetic algorithm.
Any tips or suggestions? Maybe how you would write the code for what I'm trying to do?
