Time series fitting: use of 'trend' in VAR models via statsmodels.tsa.vector_ar.var_model.VAR - statsmodels

I'm using the statsmodels.tsa.vector_ar.var_model.VAR class to fit some bivariate time series. That is, time series of vectors with 2 components: (x_1, y_1)^T, ..., (x_N, y_N)^T.
However, I do not understand how to use the parameter trend in the VAR.fit() function. There are 4 possible values for that parameter ({“c”, “ct”, “ctt”, “n”}) and the explanation given in the doc is:
“c” - add constant
“ct” - constant and trend
“ctt” - constant, linear and quadratic trend
“n” - no constant, no trend
Note that these are prepended to the columns of the dataset.
What I understand is that the trend parameter makes it possible to perform some pre-processing steps on the time series before fitting the VAR model. In particular, I expect that, when trend is:
n: the VAR model y_t = A_1 y_{t-1} + ... + A_p y_{t-p} is fitted on the series as it is;
c: the mean of the series is removed before the fit. So, defining mu_x = np.mean(x) and mu_y = np.mean(y) as the mean values of the univariate series, the VAR model is fitted on the new series: (x_i - mu_x, y_i - mu_y), i = 1, ..., N;
ct: the VAR model is fitted on the series where the linear trend has been removed: (x_i - a - b*i, y_i - c - d*i), i = 1, ..., N;
ctt: the VAR model is fitted on the series where the quadratic trend has been removed: (x_i - m - n*i - p*i^2, y_i - q - r*i - s*i^2), i = 1, ..., N.
However, this is not what my attempts show. Below is my code with a few examples.
1) I define a bivariate time series ts0 and, from that, 2 other transformed time series:
ts1: computed from ts0 by subtracting the mean;
ts2: computed from ts1 by removing the linear trend.
### original bivariate time series
ts0 = np.array([[-2.27390781, 4.89021106],
[ 0.56894665, 1.57356924],
[-1.54000883, -1.97090661],
[ 0.60917182, 0.3684891 ],
[-2.518067 , 0.42002855],
[-0.4788302 , -0.63284219],
[ 1.8208968 , 2.27831329],
[-1.65226058, -2.6647208 ],
[ 0.72437619, 1.09676352],
[-3.190304 , -0.48445386],
[ 0.41290842, 1.01441648]])
N = np.shape(ts0)[0] # number of time steps
### pre-processing of the time series by hand:
# remove the mean values
ts1 = ts0 - np.mean(ts0,axis=0) # translated time series
# remove the linear trend
time = np.arange(N)
b,d = (np.cov(time,ts1[:,0],ddof=1)[0,1]/np.var(time,ddof=1),
np.cov(time,ts1[:,1],ddof=1)[0,1]/np.var(time,ddof=1)) # ang. coef
a,c = (np.mean(ts1[:,0]) - b*np.mean(time),
np.mean(ts1[:,1]) - d*np.mean(time)) # intercept
x_lin,y_lin = ([a+b*i for i in time], [c+d*i for i in time]) # linear predictions
ts2 = ts1 - np.transpose([x_lin, y_lin]) # translated and rotated time series
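The hand-written slope and intercept above can be cross-checked with np.polyfit. This is only a minimal sketch, assuming a degree-1 least-squares fit per column is what is intended; the helper name detrend_linear and the toy data are illustrative and not part of the original code.

```python
import numpy as np

def detrend_linear(ts):
    """Remove a per-column least-squares line from ts (shape: N x k)."""
    time = np.arange(ts.shape[0])
    # polyfit accepts a 2-D y and fits each column independently;
    # coefficients come back highest degree first: slopes, then intercepts
    slopes, intercepts = np.polyfit(time, ts, deg=1)
    fitted = np.outer(time, slopes) + intercepts
    return ts - fitted, slopes, intercepts

# hypothetical toy series: two straight lines plus a small wiggle
t = np.arange(11)
toy = np.column_stack([0.5 * t - 1.0 + 0.1 * np.sin(t),
                       -0.2 * t + 3.0 + 0.1 * np.cos(t)])
residual, slopes, intercepts = detrend_linear(toy)
```

The returned slopes should agree with the cov/var expression used for b and d, and the residual should have zero mean and no remaining linear dependence on time.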
In the following, I'll forecast the vector N+1 for the three time series, by setting different values for the parameter trend.
2) I expect the forecast for ts2 to return the same result regardless of whether trend is n, c, or ct. This is not true:
### fit the series ts2
lag = 1
var_2 = VAR(pd.DataFrame(data=ts2))
var_2n = var_2.fit(lag, trend='n') # 3 fits
var_2c = var_2.fit(lag, trend='c')
var_2ct = var_2.fit(lag, trend='ct')
pred_2n = var_2n.forecast(y=ts2[-lag:,:], steps=1)[0] # 3 predictions
pred_2c = var_2c.forecast(y=ts2[-lag:,:], steps=1)[0]
pred_2ct = var_2ct.forecast(y=ts2[-lag:,:], steps=1)[0]
### plot
fig, axs = plt.subplots(1,2,figsize=(12,3))
# coord x
ax = axs[0]
ax.plot(time, ts2[:,0])
ax.hlines(xmin=time[0], xmax=time[-1], y=np.mean(ts2[:,0]), color='g', label='linear fit')
ax.scatter(N, pred_2n[0], marker='x', c='salmon', label='VAR(ts2, trend=n)')
ax.scatter(N, pred_2c[0], marker='D', c='orange', label='VAR(ts2, trend=c)', alpha=0.5)
ax.scatter(N, pred_2ct[0], marker='.', c='k', label='VAR(ts2, trend=ct)', alpha=0.5)
ax.grid(True)
ax.set_xlabel('time')
ax.set_ylabel('ts2: coord x')
ax.legend(loc='lower left')
# coord y
ax = axs[1]
ax.plot(time, ts2[:,1])
ax.hlines(xmin=time[0], xmax=time[-1], y=np.mean(ts2[:,1]), color='g', label='linear fit')
ax.scatter(N, pred_2n[1], marker='x', c='salmon')
ax.scatter(N, pred_2c[1], marker='D', c='orange')
ax.scatter(N, pred_2ct[1], marker='.', c='k')
ax.grid(True)
ax.set_xlabel('time')
ax.set_ylabel('ts2: coord y')
fig.tight_layout()
3) I expect to get the same prediction when:
I fit the ts2 series with trend=n and I transform back the points on the ts1 series;
I fit the ts1 series with trend=c.
This is not true:
### compute predictions
pred_2n_backto_1 = np.array([a,c]) + np.array([b,d])*N + pred_2n # prediction VAR(ts2, trend=n) back to ts1
var_1 = VAR(pd.DataFrame(data=ts1)) # prediction VAR(ts1, trend=c)
var_1c = var_1.fit(lag, trend='c')
pred_1c = var_1c.forecast(y=ts1[-lag:,:], steps=1)[0]
### plot
fig, axs = plt.subplots(1,2,figsize=(12,3))
# coord x
ax = axs[0]
ax.plot(time, ts1[:,0])
ax.plot(time, x_lin, c='g', label='linear fit')
ax.scatter(N, pred_2n_backto_1[0], marker='x', c='salmon', label='VAR(ts2, trend=n) back to ts1')
ax.scatter(N, pred_1c[0], marker='o', c='b', alpha=0.5, label='VAR(ts1, trend=c)')
ax.grid(True)
ax.set_xlabel('time')
ax.set_ylabel('ts1: coord x')
ax.legend(loc='lower left')
# coord y
ax = axs[1]
ax.plot(time, ts1[:,1])
ax.plot(time, y_lin, c='g', label='linear fit')
ax.scatter(N, pred_2n_backto_1[1], marker='x', c='salmon', label='VAR(ts2, trend=n) back to ts1')
ax.scatter(N, pred_1c[1], marker='o', c='b', alpha=0.5, label='VAR(ts1, trend=c)')
ax.grid(True)
ax.set_xlabel('time')
ax.set_ylabel('ts1: coord y')
fig.tight_layout()
4) More general investigation:
I compute predictions for the 3 series (ts0, ts1, ts2), with trend={n, c, ct};
I transform back those values through the pre-processing steps, in order to have all the predictions on the ts0 series;
I compare the 9 predictions.
I find that only some predictions are equal, but I don't understand the logic behind this. In particular:
VAR(ts2, trend='ct'), VAR(ts1, trend='ct') and VAR(ts0, trend='ct') return the same predictions;
VAR(ts1, trend='c') and VAR(ts0, trend='c') also return the same predictions.
# pre-processing parameters
intercepts = np.array([a,c])
ang_coefs = np.array([b,d])
mean_ts0 = np.mean(ts0, axis=0)
# predictions for ts2 back to ts0
pred_2n_backto_0 = (intercepts + ang_coefs*N + pred_2n) + mean_ts0
pred_2c_backto_0 = (intercepts + ang_coefs*N + pred_2c) + mean_ts0
pred_2ct_backto_0 = (intercepts + ang_coefs*N + pred_2ct) + mean_ts0
# predictions for ts1 back to ts0
var_1 = VAR(pd.DataFrame(data=ts1))
var_1n = var_1.fit(lag, trend='n')
var_1c = var_1.fit(lag, trend='c')
var_1ct = var_1.fit(lag, trend='ct')
pred_1n = var_1n.forecast(y=ts1[-lag:,:], steps=1)[0]
pred_1c = var_1c.forecast(y=ts1[-lag:,:], steps=1)[0]
pred_1ct = var_1ct.forecast(y=ts1[-lag:,:], steps=1)[0]
pred_1n_backto_0 = pred_1n + mean_ts0
pred_1c_backto_0 = pred_1c + mean_ts0
pred_1ct_backto_0 = pred_1ct + mean_ts0
# predictions for ts0
var_0 = VAR(pd.DataFrame(data=ts0))
var_0n = var_0.fit(lag, trend='n')
var_0c = var_0.fit(lag, trend='c')
var_0ct = var_0.fit(lag, trend='ct')
pred_0n = var_0n.forecast(y=ts0[-lag:,:], steps=1)[0]
pred_0c = var_0c.forecast(y=ts0[-lag:,:], steps=1)[0]
pred_0ct = var_0ct.forecast(y=ts0[-lag:,:], steps=1)[0]
# compare predictions
list_pred_x = [pred_2n_backto_0[0], pred_2c_backto_0[0], pred_2ct_backto_0[0],
pred_1n_backto_0[0], pred_1c_backto_0[0], pred_1ct_backto_0[0],
pred_0n[0], pred_0c[0], pred_0ct[0]]
list_pred_y = [pred_2n_backto_0[1], pred_2c_backto_0[1], pred_2ct_backto_0[1],
pred_1n_backto_0[1], pred_1c_backto_0[1], pred_1ct_backto_0[1],
pred_0n[1], pred_0c[1], pred_0ct[1]]
list_labels = ['pred_2n_backto_0', 'pred_2c_backto_0', 'pred_2ct_backto_0',
'pred_1n_backto_0', 'pred_1c_backto_0', 'pred_1ct_backto_0',
'pred_0n', 'pred_0c', 'pred_0ct']
mat_x = np.zeros((9,9))
mat_y = np.zeros((9,9)) # separate arrays, so x and y differences are not written to the same matrix
for i in range(9):
    x_i = list_pred_x[i]
    y_i = list_pred_y[i]
    for j in range(9):
        x_j = list_pred_x[j]
        y_j = list_pred_y[j]
        mat_x[i,j] = np.log(np.fabs(x_i - x_j))
        mat_y[i,j] = np.log(np.fabs(y_i - y_j))
fig = plt.figure(figsize=(16,5))
plt.subplot(1,2,1)
plt.imshow(mat_x, cmap='plasma')
plt.colorbar()
plt.title('predictions for coord x:' + '\n' + r'$val_{ij} = log(|pred_i - pred_j|)$')
plt.xticks(np.arange(9), list_labels, rotation=90)
plt.yticks(np.arange(9), list_labels)
plt.subplot(1,2,2)
plt.imshow(mat_y, cmap='plasma')
plt.colorbar()
plt.title('predictions for coord y:' + '\n' + r'$val_{ij} = log(|pred_i - pred_j|)$')
plt.xticks(np.arange(9), list_labels, rotation=90)
plt.yticks(np.arange(9), list_labels)
plt.show()
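One way to probe the equalities listed above without statsmodels is to fit a VAR(1) by ordinary least squares with the deterministic terms included as extra regressor columns. This is a sketch under that assumption (it follows the doc's "prepended to the columns" wording, but it is not statsmodels' actual code and may differ from it in details); fit_forecast and the random data are hypothetical.

```python
import numpy as np

def fit_forecast(ts, trend="c"):
    """One-step VAR(1) forecast by OLS, with trend terms as regressors.

    trend: "n" (none), "c" (constant) or "ct" (constant + linear time).
    """
    n = ts.shape[0]
    y, lagged = ts[1:], ts[:-1]
    steps = np.arange(1, n)                  # time index of each target row
    cols = []
    if trend in ("c", "ct"):
        cols.append(np.ones(n - 1))
    if trend == "ct":
        cols.append(steps.astype(float))
    X = np.column_stack(cols + [lagged])     # deterministic columns first
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # regressor row for the next, unseen time step n
    x_new = {"n": [], "c": [1.0], "ct": [1.0, float(n)]}[trend]
    x_new = np.concatenate([x_new, ts[-1]])
    return x_new @ coef

rng = np.random.default_rng(0)
ts0 = rng.normal(size=(11, 2)) + np.array([3.0, -2.0])   # hypothetical data
mean0 = ts0.mean(axis=0)
ts1 = ts0 - mean0                                        # demeaned copy

same_c = np.allclose(fit_forecast(ts1, "c") + mean0, fit_forecast(ts0, "c"))
same_n = np.allclose(fit_forecast(ts1, "n") + mean0, fit_forecast(ts0, "n"))
print(same_c, same_n)
```

If the first comparison holds and the second does not, that would be consistent with the pattern reported above: a constant estimated inside the regression absorbs any shift of the series, whereas a fit without a constant does not.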
Could someone explain the logic of the trend parameter in statsmodels.tsa.vector_ar.var_model.VAR?
Thank you in advance :)

Related

Issues with m.if2, m.abs2 in implementing a contact switch

This is a continuation of a prior question with a slightly different emphasis.
In summary, the prior solution helped with the implementation of the model inputs. The below model is working and provides a solution for the full contact condition. The framework and basic mechanics are in place for the variable contact constraint. However, no solution is available for when the contact switch variable is applied to the geometric constraints, whether as a Param or FV. (Note: no issues when applied to the dynamic equations).
One thing I've noted is that the m.if2 output is not correct in the [0] position. Below is the output of the switch-related variables:
adiff= [0.1, 0.10502512563, 0.11005025126, 0.11507537688, 0.12010050251,...
bdiff= [1.0, 0.99497487437, 0.98994974874, 0.98492462312, 0.97989949749,...
swtch= [0.1, 0.10449736118, 0.10894421858, 0.11334057221, 0.11768642206,...
c= [0.0, 1.000000005, 1.000000005, 1.000000005, 1.000000005, 1.000000005,...
Based on the logic swtch = adiff*bdiff and m.if2(swtch-thres,0,1), c[0] should be ~1.0. I've played with these parameters and haven't found a way to affect that first cell. I can't say for sure that this initial position is causing issues, but this seems like an erroneous output regardless.
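The expected switch values can be reproduced outside GEKKO with plain NumPy. A small sketch, using the values of a, b, thres and the time grid from the full code; c_expected is a hypothetical name and np.where only mimics the intended if2 logic, not GEKKO's internal formulation.

```python
import numpy as np

# values taken from the full code: a = -0.1, b = 1, 200 points on [0, 1]
a, b, thres = -0.1, 1.0, 0.001
t = np.linspace(0, 1, 200)

adiff = t - a                 # positive when t > a
bdiff = b - t                 # positive when t < b
swtch = adiff * bdiff         # positive only inside the window (a, b)

# intended meaning of m.if2(swtch - thres, 0, 1): 0 if negative, else 1
c_expected = np.where(swtch - thres < 0, 0.0, 1.0)

print(c_expected[0], c_expected[1], c_expected[-1])
```

Comparing c_expected with the printed c makes it easy to see which grid points deviate from the intended logic.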
Second, given that m.if2() outputs approximately 0 and 1, I've attempted to soften the geometric constraint as: m.abs2({constraint}) <= {tol}. Even when a generous tolerance is applied and c is excluded, this fails to produce a solution (whereas the hard constraint does).
Any suggestions for correcting either issue are appreciated.
Lastly, in the prior post, the use of m.integral() for setting the value of c was suggested. I'm unclear if that entails using if2 as well. If you can expand on implementing a switch that enables at t=a and switches off at t=b using an integral, that would be appreciated.
Full code:
###Import Libraries
import math
import matplotlib
matplotlib.use("TkAgg")
import matplotlib.animation as animation
import numpy as np
from gekko import GEKKO
###Defining a model
m = GEKKO(remote=True)
v = 1 #set walking speed (m/s)
L1 = .5 #set thigh length (m)
L2 = .5 #set shank length (m)
M = 75 #set mass (kg)
#################################
###Define secondary parameters
D = L1 + L2 #leg length parameter
pi = math.pi #define pi
g = 9.81 #define gravity
###Define initial and final conditions and limits
xmin = -D; xmax = D
xdotmin = .5*v; xdotmax = 1.5*v
ymin = 0*D; ymax = 5*D
q1min = -pi/2; q1max = pi/2
q2min = -pi/2; q2max = -.01
tfmin = .25; tfmax = 10
#amin = 0; amax = .45 #limits for FVs (future capability)
#bmin = .55; bmax = 1
###Defining the time parameter (0, 1)
N = 200
t = np.linspace(0,1,N)
m.time = t
###Final time Fixed Variable
TF = m.FV(1,lb=tfmin,ub=tfmax); TF.STATUS = 1
end_loc = len(m.time)-1
###Defining initial and final condition vectors
init = np.zeros(len(m.time))
final = np.zeros(len(m.time))
init[1] = 1
final[-1] = 1
init = m.Param(value=init)
final = m.Param(value=final)
###Parameters
M = m.Param(value=M) #cart mass
L1 = m.Param(value=L1) #link 1 length
L2 = m.Param(value=L2) #link 2 length
g = m.Const(value=g) #gravity
###Control Input Manipulated Variable
u = m.MV(0, lb=-70, ub=70); u.STATUS = 1
###Ground Contact Fixed Variables
#as fixed variables (future state)
#a = m.FV(0,lb=amin,ub=amax); a.STATUS = 1 #equates to the unscaled time when contact first occurs
#b = m.FV(1,lb=bmin,ub=bmax); b.STATUS = 1 #equates to the unscaled time when contact last occurs
#as fixed parameter
a = m.Param(value=-.1) #a<0 to drive m.time-a positive
b = m.Param(value=1)
###State Variables
x, y, xdot, ydot, q1, q2 = m.Array(m.Var, 6)
#Define BCs
m.free_initial(x)
m.free_final(x)
m.free_initial(xdot)
m.free_final(xdot)
m.free_initial(y)
m.free_initial(ydot)
#Define Limits
y.LOWER = ymin; y.UPPER = ymax
x.LOWER = xmin; x.UPPER = xmax
xdot.LOWER = xdotmin; xdot.UPPER = xdotmax
q1.LOWER = q1min; q1.UPPER = q1max
q2.LOWER = q2min; q2.UPPER = q2max
###Intermediates
xdot_int = m.Intermediate(final*m.integral(xdot)) #for average velocity constraint
adiff = m.Param(m.time-a.VALUE) #positive if m.time>a
bdiff = m.Param(b.VALUE-m.time) #positive if m.time<b
swtch = m.Intermediate(adiff*bdiff) #positive if m.time > a AND m.time < b
thres = .001
c = m.if2(swtch-thres,0,1) #c=0 if swtch <0, c=1 if swtch >0
###Defining the State Space Model
m.Equation(xdot.dt()/TF == -c*u*(L1*m.sin(q1)
+L2*m.sin(q1+q2))
/(M*L1*L2*m.sin(q2)))
m.Equation(ydot.dt()/TF == c*u*(L1*m.cos(q1)
+L2*m.cos(q1+q2))
/(M*L1*L2*m.sin(q2))-g)
m.Equation(x.dt()/TF == xdot)
m.Equation(y.dt()/TF == ydot)
m.periodic(y) #initial and final y position must be equal
m.periodic(ydot) #initial and final y velocity must be equal
m.periodic(xdot) #initial and final x velocity must be equal
m.Equation(m.abs2(xdot_int*final - v*final) <= .02) #soft constraint for average velocity ~= v
###Geometric constraints
#with no contact switch, this works
m.Equation(x + L1*m.sin(q1) + L2*m.sin(q1+q2) == 0) #x geometric constraint when in contact
m.Equation(y - L1*m.cos(q1) - L2*m.cos(q1+q2) == 0) #y geometric constraint when in contact
#soft constraint for contact switch. Produces no solution, with or without c, abs2 or abs3:
#m.Equation(c*m.abs2(x + L1*m.sin(q1) + L2*m.sin(q1+q2)) <= .01) #x geometric constraint when in contact
#m.Equation(c*m.abs2(y - L1*m.cos(q1) - L2*m.cos(q1+q2)) <= .01) #y geometric constraint when in contact
###Objectives
#Maximize stride length
m.Maximize(100*final*x)
m.Minimize(100*init*x)
#Minimize torque
m.Obj(0.01*u**2)
###Solve
m.options.IMODE = 6
m.options.SOLVER = 3
m.solve()
###Scale time vector
m.time = np.multiply(TF, m.time)
###Display Outputs
print("adiff=", adiff.VALUE)
print("bdiff=", bdiff.VALUE)
print("swtch=", swtch.VALUE)
print("c=", c.VALUE)
########################################
####Plotting the results
import matplotlib.pyplot as plt
plt.close('all')
fig1 = plt.figure()
fig2 = plt.figure()
fig3 = plt.figure()
fig4 = plt.figure()
ax1 = fig1.add_subplot()
ax2 = fig2.add_subplot(221)
ax3 = fig2.add_subplot(222)
ax4 = fig2.add_subplot(223)
ax5 = fig2.add_subplot(224)
ax6 = fig3.add_subplot()
ax7 = fig4.add_subplot(121)
ax8 = fig4.add_subplot(122)
ax1.plot(m.time,u.value,'m',lw=2)
ax1.legend([r'$u$'],loc=1)
ax1.set_title('Control Input')
ax1.set_ylabel('Torque (N-m)')
ax1.set_xlabel('Time (s)')
ax1.set_xlim(m.time[0],m.time[-1])
ax1.grid(True)
ax2.plot(m.time,x.value,'r',lw=2)
ax2.set_ylabel('X Position (m)')
ax2.set_xlabel('Time (s)')
ax2.legend([r'$x$'],loc='upper left')
ax2.set_xlim(m.time[0],m.time[-1])
ax2.grid(True)
ax2.set_title('Mass X-Position')
ax3.plot(m.time,xdot.value,'g',lw=2)
ax3.set_ylabel('X Velocity (m/s)')
ax3.set_xlabel('Time (s)')
ax3.legend([r'$xdot$'],loc='upper left')
ax3.set_xlim(m.time[0],m.time[-1])
ax3.grid(True)
ax3.set_title('Mass X-Velocity')
ax4.plot(m.time,y.value,'r',lw=2)
ax4.set_ylabel('Y Position (m)')
ax4.set_xlabel('Time (s)')
ax4.legend([r'$y$'],loc='upper left')
ax4.set_xlim(m.time[0],m.time[-1])
ax4.grid(True)
ax4.set_title('Mass Y-Position')
ax5.plot(m.time,ydot.value,'g',lw=2)
ax5.set_ylabel('Y Velocity (m/s)')
ax5.set_xlabel('Time (s)')
ax5.legend([r'$ydot$'],loc='upper left')
ax5.set_xlim(m.time[0],m.time[-1])
ax5.grid(True)
ax5.set_title('Mass Y-Velocity')
ax6.plot(x.value, y.value,'g',lw=2)
ax6.set_ylabel('Y-Position (m)')
ax6.set_xlabel('X-Position (m)')
ax6.legend([r'$mass coordinate$'],loc='upper left')
ax6.set_xlim(x.value[0],x.value[-1])
ax6.set_ylim(0,1.1)
ax6.grid(True)
ax6.set_title('Mass Position')
ax7.plot(m.time,q1.value,'r',lw=2)
ax7.set_ylabel('q1 Position (rad)')
ax7.set_xlabel('Time (s)')
ax7.legend([r'$q1$'],loc='upper left')
ax7.set_xlim(m.time[0],m.time[-1])
ax7.grid(True)
ax7.set_title('Hip Joint Angle')
ax8.plot(m.time,q2.value,'r',lw=2)
ax8.set_ylabel('q2 Position (rad)')
ax8.set_xlabel('Time (s)')
ax8.legend([r'$q2$'],loc='upper left')
ax8.set_xlim(m.time[0],m.time[-1])
ax8.grid(True)
ax8.set_title('Knee Joint Angle')
plt.show()
The m.if2() function is a Mathematical Program with Complementarity Constraints (MPCC). It does not use a binary variable like the m.if3() function and therefore can be solved with any NLP solver, such as IPOPT. The disadvantage is that it has a saddle point at the switching condition and often gets stuck at the local solution. One way to overcome this issue is to use m.if3() with the IPOPT solver for initialization and then switch to the APOPT solver to generate an exact MINLP solution.
m.options.SOLVER=3 # IPOPT
m.solve()
m.options.SOLVER=1 # APOPT
m.options.TIME_SHIFT = 0 # don't update initial conditions
m.solve()
Additional information on MPCCs and binary conditional statements is in the Design Optimization course section on Logical Conditions.

Constrain the initial and final values of a GEKKO Var to a data-based curve

I am trying to solve a low thrust optimal control problem for Earth orbits, i.e. going from one orbit to another. The formulation of the problem includes six states (r_1, r_2, r_3, v_1, v_2, v_3) and 3 controls (u_1, u_2, u_3) with a simplified point model of gravity. When I specify the full initial state and half of the final state, the solver converges and yields a good solution. When I try the full final state, the problem is over constrained.
My thought on how to remedy this is to allow the trajectory to depart the initial orbit at any point along the orbital curve and join the final orbit at any point along the final orbital curve, giving it more degrees of freedom. Is there a way to constrain the initial and final values of all 6 states to a cspline curve? This is what I have tried so far:
m = GEKKO(remote=True)
m.time = np.linspace(0, tof, t_steps)
theta_i = m.FV(lb = 0, ub = 360)
theta_f = m.FV(lb = 0, ub = 360)
rx_i = m.Param()
ry_i = m.Param()
rz_i = m.Param()
vx_i = m.Param()
vy_i = m.Param()
vz_i = m.Param()
rx_f = m.Param()
ry_f = m.Param()
rz_f = m.Param()
vx_f = m.Param()
vy_f = m.Param()
vz_f = m.Param()
m.cspline(theta_i, rx_i, initial.theta, initial.r[:, 0])
m.cspline(theta_i, ry_i, initial.theta, initial.r[:, 1])
m.cspline(theta_i, rz_i, initial.theta, initial.r[:, 2])
m.cspline(theta_i, vx_i, initial.theta, initial.v[:, 0])
m.cspline(theta_i, vy_i, initial.theta, initial.v[:, 1])
m.cspline(theta_i, vz_i, initial.theta, initial.v[:, 2])
m.cspline(theta_f, rx_f, final.theta, final.r[:, 0])
m.cspline(theta_f, ry_f, final.theta, final.r[:, 1])
m.cspline(theta_f, rz_f, final.theta, final.r[:, 2])
m.cspline(theta_f, vx_f, final.theta, final.v[:, 0])
m.cspline(theta_f, vy_f, final.theta, final.v[:, 1])
m.cspline(theta_f, vz_f, final.theta, final.v[:, 2])
r1 = m.Var(rx_i)
r2 = m.Var(ry_i)
r3 = m.Var(rz_i)
r1dot = m.Var(vx_i)
r2dot = m.Var(vy_i)
r3dot = m.Var(vz_i)
u1 = m.Var(lb = -max_u, ub = max_u)
u2 = m.Var(lb = -max_u, ub = max_u)
u3 = m.Var(lb = -max_u, ub = max_u)
m.Equation(r1.dt() == r1dot)
m.Equation(r2.dt() == r2dot)
m.Equation(r3.dt() == r3dot)
r = m.Intermediate(m.sqrt(r1**2 + r2**2 + r3**2))
v = m.Intermediate(m.sqrt(r1dot**2 + r2dot**2 + r3dot**2))
m.Equation(-mu*r1/r**3 == r1dot.dt() + u1)
m.Equation(-mu*r2/r**3 == r2dot.dt() + u2)
m.Equation(-mu*r3/r**3 == r3dot.dt() + u3)
m.fix_final(r1, rx_f)
m.fix_final(r2, ry_f)
m.fix_final(r3, rz_f)
m.fix_final(r1dot, vx_f)
m.fix_final(r2dot, vy_f)
m.fix_final(r3dot, vz_f)
m.Minimize(m.integral(u1**2 + u2**2 + u3**2))
m.options.IMODE = 6
m.options.SOLVER = 3
#m.options.ATOL = 1e-3
m.options.MAX_ITER = 300
m.solve(disp=True) # solve
With cspline, I am trying to allow GEKKO to pick a fixed value of the true anomaly (the parameter that determines how far along the orbit you are) from data that I have generated about the states at sampled true anomalies that would have associated position and velocity states. Any help would be much appreciated!
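The lookup that cspline is meant to provide can be prototyped outside GEKKO first, to check that the sampled tables return sensible states for a given true anomaly. This is a sketch using np.interp (piecewise-linear, a stand-in for a cubic spline, not the GEKKO object); the circular orbit, r0, v0 and state_at are hypothetical and only replace the initial/final data structures, which are not shown here.

```python
import numpy as np

# hypothetical circular orbit in the x-y plane (radius r0, speed v0)
r0, v0 = 7000.0, 7.5
theta_deg = np.linspace(0.0, 360.0, 361)
th = np.radians(theta_deg)
r_table = np.column_stack([r0 * np.cos(th), r0 * np.sin(th), np.zeros_like(th)])
v_table = np.column_stack([-v0 * np.sin(th), v0 * np.cos(th), np.zeros_like(th)])

def state_at(theta, table):
    """Interpolate each column of table at true anomaly theta (degrees)."""
    return np.array([np.interp(theta, theta_deg, table[:, k]) for k in range(3)])

r_dep = state_at(90.0, r_table)    # position at the chosen departure anomaly
v_dep = state_at(90.0, v_table)    # matching velocity
```

Each anomaly value maps to one consistent position and velocity pair, which is the property the departure and arrival constraints rely on.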
SOLUTION:
I implemented the end constraints as "soft constraints", i.e. quantities to be minimized, instead of hard constraints which would use fix_final. Specific implementation is as follows.
final = np.zeros(len(m.time))
final[-1] = 1
final = m.Param(value=final)
m.Obj(final*(r1-final_o.r[0, 0])**2)
m.Obj(final*(r2-final_o.r[0, 1])**2)
m.Obj(final*(r3-final_o.r[0, 2])**2)
m.Obj(final*(r1dot-final_o.v[0, 0])**2)
m.Obj(final*(r2dot-final_o.v[0, 1])**2)
m.Obj(final*(r3dot-final_o.v[0, 2])**2)
final makes the solver only consider the last item (the end boundary) when multiplied.
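The masking effect can be seen with plain NumPy. A minimal sketch, assuming the objective is accumulated over all time points; the trajectory r1 and the target value are made up for illustration.

```python
import numpy as np

n = 5
final = np.zeros(n)
final[-1] = 1.0                     # 1 only at the end boundary

# hypothetical trajectory of one state and its target end value
r1 = np.array([7.0, 6.5, 6.2, 6.05, 6.01])
target = 6.0

penalty_terms = final * (r1 - target) ** 2   # one term per time point
total_penalty = penalty_terms.sum()
```

Only the last entry of penalty_terms is nonzero, so deviations earlier in the horizon cost nothing and the optimizer is free to choose the path that reaches the target at the end.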
It is generally much harder for an optimizer to exactly reach a fixed endpoint, especially when it depends on a complex sequence of moves. This often leads to infeasible solutions. An alternative is to create a soft constraint (objective minimization) to penalize deviations from the final trajectory. Here is an example that is similar:
import matplotlib.animation as animation
import numpy as np
from gekko import GEKKO
#Defining a model
m = GEKKO()
#################################
#Weight of item
m2 = 1
#################################
#Defining the time, we will go beyond the 6.2s
#to check if the objective was achieved
m.time = np.linspace(0,8,100)
end_loc = int(100.0*6.2/8.0)
#Parameters
m1a = m.Param(value=10)
m2a = m.Param(value=m2)
final = np.zeros(len(m.time))
for i in range(len(m.time)):
    if m.time[i] < 6.2:
        final[i] = 0
    else:
        final[i] = 1
final = m.Param(value=final)
#MV
ua = m.Var(value=0)
#State Variables
theta_a = m.Var(value=0)
qa = m.Var(value=0)
ya = m.Var(value=-1)
va = m.Var(value=0)
#Intermediates
epsilon = m.Intermediate(m2a/(m1a+m2a))
#Defining the State Space Model
m.Equation(ya.dt() == va)
m.Equation(va.dt() == -epsilon*theta_a + ua)
m.Equation(theta_a.dt() == qa)
m.Equation(qa.dt() == theta_a -ua)
#Define the Objectives
#Make all the state variables be zero at time >= 6.2
m.Obj(final*ya**2)
m.Obj(final*va**2)
m.Obj(final*theta_a**2)
m.Obj(final*qa**2)
m.fix(ya,pos=end_loc,val=0.0)
m.fix(va,pos=end_loc,val=0.0)
m.fix(theta_a,pos=end_loc,val=0.0)
m.fix(qa,pos=end_loc,val=0.0)
#Try to minimize change of MV over all horizon
m.Obj(0.001*ua**2)
m.options.IMODE = 6 #MPC
m.solve() #(disp=False)
This example uses a combination of soft and hard constraints to help it find the optimal solution.

Fit Amplitude (Frequency response) of a capacitor with lmfit

I am trying to fit measured data with lmfit.
My goal is to get the parameters of the capacitor with an equivalent circuit diagram.
So, I want to create a model with parameters (C, R1, L1,...) and fit it to the measured data.
I know that the resonance frequency is at the global minimum of the amplitude, and the amplitude at that minimum must be R1. C is also known.
So I could fix the parameter C and R1. With the resonance frequency I could calculate L1 too.
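For a plain series RLC branch the relation between resonance frequency and inductance is f0 = 1/(2*pi*sqrt(L1*C)). A short sketch of solving it for L1, assuming the parallel elements Rp and Cp have negligible influence at resonance; the function name and the value of f0 are hypothetical and not read from the measured data.

```python
import math

def inductance_from_resonance(f0_hz, c_farad):
    """Series-resonance relation f0 = 1 / (2*pi*sqrt(L*C)), solved for L."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * c_farad)

C = 220e-9          # known capacitance from the post
f0 = 2.0e6          # hypothetical resonance frequency in Hz, not from the data
L1 = inductance_from_resonance(f0, C)
```

The result could serve as the initial value (or a fixed value) for the L1 parameter instead of an arbitrary guess.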
I created the model, but the fit doesn't work right.
Maybe someone could help me with this.
Thanks in advance.
from lmfit import minimize, Parameters
from lmfit import report_fit
params = Parameters()
params.add('C', value = 220e-9, vary = False)
params.add('L1', value = 0.00001, min = 0, max = 0.1)
params.add('R1', value = globalmin, vary = False)
params.add('Rp', value = 10000, min = 0, max = 10e20)
params.add('Cp', value = 0.1, min = 0, max = 0.1)
def get_elements(params, freq, data):
    C = params['C'].value
    L1 = params['L1'].value
    R1 = params['R1'].value
    Rp = params['Rp'].value
    Cp = params['Cp'].value
    XC = 1/(1j*2*np.pi*freq*C)
    XL = 1j*2*np.pi*freq*L1
    XP = 1/(1j*2*np.pi*freq*Cp)
    Z1 = R1 + XC*Rp/(XC+Rp) + XL
    real = np.real(Z1*XP/(Z1+XP))
    imag = np.imag(Z1*XP/(Z1+XP))
    model = np.sqrt(real**2 + imag**2)
    #model = np.sqrt(R1**2 + ((2*np.pi*freq*L1 - 1/(2*np.pi*freq*C))**2))
    #model = (np.arctan((2*np.pi*freq*L1 - 1/(2*np.pi*freq*C))/R1)) * 360/((2*np.pi))
    return data - model
out = minimize(get_elements, params , args=(freq, data))
report_fit(out)
#make reconstruction for plotting
C = out.params['C'].value
L1 = out.params['L1'].value
R1 = out.params['R1'].value
Rp = out.params['Rp'].value
Cp = out.params['Cp'].value
XC = 1/(1j*2*np.pi*freq*C)
XL = 1j*2*np.pi*freq*L1
XP = 1/(1j*2*np.pi*freq*Cp)
Z1 = R1 + XC*Rp/(XC+Rp) + XL
real = np.real(Z1*XP/(Z1+XP))
imag = np.imag(Z1*XP/(Z1+XP))
reconst = np.sqrt(real**2 + imag**2)
reconst_phase = np.arctan(imag/real)* 360/(2*np.pi)
'''
PLOTTING
'''
#plot of filtered signal vs measured data (AMPLITUDE)
fig = plt.figure(figsize=(40,15))
file_title = 'Measured Data'
plt.subplot(311)
plt.xscale('log')
plt.yscale('log')
plt.xlim([min(freq), max(freq)])
plt.ylabel('Amplitude')
plt.xlabel('Frequency in Hz')
plt.grid(True, which="both")
plt.plot(freq, z12_fac, 'g', alpha = 0.7, label = 'data')
#Plot Impedance of model in magenta
plt.plot(freq, reconst, 'm', label='Reconstruction (Model)')
plt.legend()
#(PHASE)
plt.subplot(312)
plt.xscale('log')
plt.xlim([min(freq), max(freq)])
plt.ylabel('Phase in °')
plt.xlabel('Frequency in Hz')
plt.grid(True, which="both")
plt.plot(freq, z12_deg, 'g', alpha = 0.7, label = 'data')
#Plot Phase of model in magenta
plt.plot(freq, reconst_phase, 'm', label='Reconstruction (Model)')
plt.legend()
plt.savefig(file_title)
plt.close(fig)
measured data
equivalent circuit diagram (model)
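Since the same impedance expression appears both in the residual and in the reconstruction, it could live in one function that returns the complex impedance. A sketch of that idea, following the circuit as written in get_elements; the function name impedance and the numeric values are hypothetical.

```python
import numpy as np

def impedance(freq, C, L1, R1, Rp, Cp):
    """Complex impedance of the circuit as written in get_elements."""
    w = 2 * np.pi * freq
    XC = 1 / (1j * w * C)
    XL = 1j * w * L1
    XP = 1 / (1j * w * Cp)
    Z1 = R1 + XC * Rp / (XC + Rp) + XL      # C parallel to Rp, in series with R1 and L1
    return Z1 * XP / (Z1 + XP)              # whole branch in parallel with Cp

# hypothetical values, only to exercise the function
freq = np.logspace(3, 8, 50)
Z = impedance(freq, C=220e-9, L1=2e-8, R1=0.06, Rp=1e4, Cp=1e-12)
magnitude = np.abs(Z)                        # same as sqrt(real**2 + imag**2)
phase_deg = np.angle(Z, deg=True)            # four-quadrant phase in degrees
```

Note that np.angle uses the signs of both parts, so it can differ from arctan(imag/real) wherever the real part is negative.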
Edit 1:
Fit-Report:
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 28
# data points = 4001
# variables = 3
chi-square = 1197180.70
reduced chi-square = 299.444897
Akaike info crit = 22816.4225
Bayesian info crit = 22835.3054
## Warning: uncertainties could not be estimated:
L1: at initial value
Rp: at boundary
Cp: at initial value
Cp: at boundary
[[Variables]]
C: 2.2e-07 (fixed)
L1: 1.0000e-05 (init = 1e-05)
R1: 0.06375191 (fixed)
Rp: 0.00000000 (init = 10000)
Cp: 0.10000000 (init = 0.1)
Edit 2:
Data can be found here:
https://1drv.ms/u/s!AsLKp-1R8HlZhcdlJER5T7qjmvfmnw?e=r8G2nN
Edit 3:
I have now simplified my model to a simple series RLC circuit. With another set of data this works pretty well (see the plot with that other data set).
def get_elements(params, freq, data):
    C = params['C'].value
    L1 = params['L1'].value
    R1 = params['R1'].value
    #Rp = params['Rp'].value
    #Cp = params['Cp'].value
    #k = params['k'].value
    #freq = np.log10(freq)
    XC = 1/(1j*2*np.pi*freq*C)
    XL = 1j*2*np.pi*freq*L1
    # XP = 1/(1j*2*np.pi*freq*Cp)
    # Z1 = R1*k + XC*Rp/(XC+Rp) + XL
    # real = np.real(Z1*XP/(Z1+XP))
    # imag = np.imag(Z1*XP/(Z1+XP))
    Z1 = R1 + XC + XL
    real = np.real(Z1)
    imag = np.imag(Z1)
    model = np.sqrt(real**2 + imag**2)
    return np.sqrt(np.real(data)**2+np.imag(data)**2) - model
out = minimize(get_elements, params , args=(freq, data))
Report:
Chi-Square is really high...
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 4001
# variables = 2
chi-square = 5.0375e+08
reduced chi-square = 125968.118
Akaike info crit = 46988.8798
Bayesian info crit = 47001.4684
[[Variables]]
C: 3.3e-09 (fixed)
L1: 5.2066e-09 +/- 1.3906e-08 (267.09%) (init = 1e-05)
R1: 0.40753691 +/- 24.5685882 (6028.56%) (init = 0.05)
[[Correlations]] (unreported correlations are < 0.100)
C(L1, R1) = -0.174
With my original set of data I get this:
plot original data (complex)
Which is not bad, but also not good. That's why I want to make my model more detailed, so I can fit also in higher frequency regions...
Report of this one:
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 4001
# variables = 2
chi-square = 109156.170
reduced chi-square = 27.2958664
Akaike info crit = 13232.2473
Bayesian info crit = 13244.8359
[[Variables]]
C: 2.2e-07 (fixed)
L1: 2.3344e-08 +/- 1.9987e-10 (0.86%) (init = 1e-05)
R1: 0.17444702 +/- 0.29660571 (170.03%) (init = 0.05)
Please note: I also have changed the input data of the model. Now I give the model complex values and then I calculate the Amplitude. Find this also here: https://1drv.ms/u/s!AsLKp-1R8HlZhcdlJER5T7qjmvfmnw?e=qnrZk1
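With complex input data, an alternative to comparing amplitudes would be to fit real and imaginary parts together, so that the phase also constrains the parameters. A sketch of such a residual, assuming the minimizer needs a real-valued residual array (least-squares routines generally do); complex_residual and the synthetic data are hypothetical.

```python
import numpy as np

def complex_residual(model, data):
    """Stack real and imaginary misfit into one real-valued vector."""
    diff = np.asarray(data, dtype=complex) - np.asarray(model, dtype=complex)
    return np.concatenate([diff.real, diff.imag])

# hypothetical series-RLC data and a slightly wrong model
freq = np.logspace(4, 7, 20)
w = 2 * np.pi * freq
data = 0.2 + 1j * (w * 2.3e-8 - 1 / (w * 220e-9))
model = 0.1 + 1j * (w * 2.0e-8 - 1 / (w * 220e-9))
res = complex_residual(model, data)
```

The residual has twice as many entries as there are frequency points, one block for the real misfit and one for the imaginary misfit.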

How to use gekko to control two variables while manipulating two variables for a cstr?

Attached below is my PYTHON code:
I have a CSTR and I'm trying to control the height of the tank and the temperature while manipulating the inlet flow and the cooling temperature. The problem is that the CVs are not tracking their respective setpoints. When I tried the problem with only 1 CV and 1 MV, it worked really well.
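A quick check that may help when debugging setpoint tracking is whether each height setpoint is reachable at steady state within the bounds of the inlet flow. A sketch based on the level equation in the code below, dh/dt = (q - c1*sqrt(h))/Ac; steady_state_flow is a hypothetical helper, and the bounds are the ones set on m.q.

```python
import math

c1 = 10.0                      # outflow coefficient from the code
q_lb, q_ub = 0.0, 100000.0     # bounds set on the MV m.q

def steady_state_flow(h_sp):
    """Inlet flow that holds the level at h_sp, from dh/dt = (q - c1*sqrt(h))/Ac = 0."""
    return c1 * math.sqrt(h_sp)

for h_sp in (30.0, 60.0, 90.0):            # height setpoints used in the simulation
    q_ss = steady_state_flow(h_sp)
    print(h_sp, round(q_ss, 2), q_lb <= q_ss <= q_ub)
```

This only covers the level loop; the temperature loop is coupled to it through q and V, so a reachable height setpoint does not by itself guarantee that both CVs can be held at once.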
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from gekko import GEKKO
# Steady State Initial Condition
u1_ss = 280.0
u2_ss=100.0
# Feed Temperature (K)
Tf = 350
# Feed Concentration (mol/m^3)
Caf = 1
# Steady State Initial Conditions for the States
Ca_ss = 1
T_ss = 304
h_ss=94.77413303
V_ss=8577.41330293
x0 = np.empty(4)
x0[0] = Ca_ss
x0[1] = T_ss
x0[2]= h_ss
x0[3]= V_ss
#%% GEKKO nonlinear MPC
m = GEKKO(remote=False)
m.time = [0,0.02,0.04,0.06,0.08,0.1,0.12,0.15,0.2]
c1=10.0
Ac=100.0
# Density of A-B Mixture (kg/m^3)
rho = 1000
# Heat capacity of A-B Mixture (J/kg-K)
Cp = 0.239
# Heat of reaction for A->B (J/mol)
mdelH = 5e4
# E - Activation energy in the Arrhenius Equation (J/mol)
# R - Universal Gas Constant = 8.31451 J/mol-K
EoverR = 8750
# Pre-exponential factor (1/sec)
k0 = 7.2e10
# U - Overall Heat Transfer Coefficient (W/m^2-K)
# A - Area - this value is specific for the U calculation (m^2)
UA = 5e4
# initial conditions
Tc0 = 280
T0 = 304
Ca0 = 1.0
h0=94.77413303
q0=100.0
V0=8577.41330293
tau = m.Const(value=0.5)
Kp = m.Const(value=1)
m.Tc = m.MV(value=Tc0,lb=250,ub=350)
m.T = m.CV(value=T_ss)
m.h= m.CV(value=h0)
m.rA = m.Var(value=0)
m.Ca = m.CV(value=Ca_ss,lb=0,ub=1)
m.V= m.CV(value=V_ss,lb=0,ub=100000)
m.q=m.MV(value=q0,lb=0,ub=100000)
m.Equation(m.rA == k0*m.exp(-EoverR/m.T)*m.Ca)
m.Equation(m.T.dt() == m.q/m.V*(Tf - m.T) \
+ mdelH/(rho*Cp)*m.rA \
+ UA/m.V/rho/Cp*(m.Tc-m.T))
m.Equation(m.Ca.dt() == m.q/m.V*(Caf - m.Ca) - m.rA)
m.Equation(m.h.dt()==(m.q-c1*m.h**0.5)/Ac)
m.Equation(m.V.dt() == m.q- c1*m.h**0.5)
#MV tuning
m.Tc.STATUS = 1
m.Tc.FSTATUS = 0
m.Tc.DMAX = 100
m.Tc.DMAXHI = 20
m.Tc.DMAXLO = -100
m.q.STATUS = 1
m.q.FSTATUS = 0
m.q.DMAX = 10
#CV tuning
m.T.STATUS = 1
m.T.FSTATUS = 1
m.T.TR_INIT = 1
m.T.TAU = 1.0
DT = 0.5 # deadband
m.h.STATUS = 1
m.h.FSTATUS = 1
m.h.TR_INIT = 1
m.h.TAU = 1.0
m.Ca.STATUS = 1
m.Ca.FSTATUS = 0 # no measurement
m.Ca.TR_INIT = 0
m.V.STATUS = 1
m.V.FSTATUS = 0 # no measurement
m.V.TR_INIT = 0
m.options.CV_TYPE = 1
m.options.IMODE = 6
m.options.SOLVER = 3
#%% define CSTR model
def cstr(x,t,u1,u2,Tf,Caf,Ac):
    # Inputs (3):
    # Temperature of cooling jacket (K)
    Tc = u1
    q=u2
    # Tf = Feed Temperature (K)
    # Caf = Feed Concentration (mol/m^3)
    # States (2):
    # Concentration of A in CSTR (mol/m^3)
    Ca = x[0]
    # Temperature in CSTR (K)
    T = x[1]
    # the height of the tank (m)
    h=x[2]
    V=x[3]
    # Parameters:
    # Density of A-B Mixture (kg/m^3)
    rho = 1000
    # Heat capacity of A-B Mixture (J/kg-K)
    Cp = 0.239
    # Heat of reaction for A->B (J/mol)
    mdelH = 5e4
    # E - Activation energy in the Arrhenius Equation (J/mol)
    # R - Universal Gas Constant = 8.31451 J/mol-K
    EoverR = 8750
    # Pre-exponential factor (1/sec)
    k0 = 7.2e10
    # U - Overall Heat Transfer Coefficient (W/m^2-K)
    # A - Area - this value is specific for the U calculation (m^2)
    UA = 5e4
    # reaction rate
    rA = k0*np.exp(-EoverR/T)*Ca
    # Calculate concentration derivative
    dCadt = q/V*(Caf - Ca) - rA
    # Calculate temperature derivative
    dTdt = q/V*(Tf - T) \
        + mdelH/(rho*Cp)*rA \
        + UA/V/rho/Cp*(Tc-T)
    # Calculate height derivative
    dhdt=(q-c1*h**0.5)/Ac
    if x[2]>=300 and dhdt>0:
        dhdt = 0
    dVdt= q-c1*h**0.5
    # Return xdot:
    xdot = np.zeros(4)
    xdot[0] = dCadt
    xdot[1] = dTdt
    xdot[2]= dhdt
    xdot[3]= dVdt
    return xdot
# Time Interval (min)
t = np.linspace(0,8,401)
# Store results for plotting
Ca = np.ones(len(t)) * Ca_ss
V=np.ones(len(t))*V_ss
T = np.ones(len(t)) * T_ss
Tsp = np.ones(len(t)) * T_ss
hsp=np.ones(len(t))*h_ss
h=np.ones(len(t))*h_ss
u1 = np.ones(len(t)) * u1_ss
u2 = np.ones(len(t)) * u2_ss
# Set point steps
Tsp[0:100] = 330.0
Tsp[100:200] = 350.0
Tsp[230:260] = 370.0
Tsp[260:290] = 390.0
hsp[0:100] = 30.0
hsp[100:200] =60.0
hsp[200:250]=90.0
# Create plot
plt.figure(figsize=(10,7))
plt.ion()
plt.show()
# Simulate CSTR
for i in range(len(t)-1):
    # simulate one time period (0.02 min each loop)
    ts = [t[i],t[i+1]]
    y = odeint(cstr,x0,ts,args=(u1[i+1],u2[i+1],Tf,Caf,Ac))
    # retrieve measurements
    Ca[i+1] = y[-1][0]
    T[i+1] = y[-1][1]
    h[i+1] = y[-1][2]
    V[i+1] = y[-1][3]
    # insert measurement
    m.T.MEAS = T[i+1]
    m.h.MEAS = h[i+1]
    # solve MPC
    m.solve(disp=True)
    m.T.SPHI = Tsp[i+1] + DT
    m.T.SPLO = Tsp[i+1] - DT
    m.h.SPHI = hsp[i+1] + DT
    m.h.SPLO = hsp[i+1] - DT
    # retrieve new Tc and q values
    u1[i+1] = m.Tc.NEWVAL
    u2[i+1] = m.q.NEWVAL
    # update initial conditions
    x0[0] = Ca[i+1]
    x0[1] = T[i+1]
    x0[2] = h[i+1]
    x0[3] = V[i+1]
    #%% Plot the results
    plt.clf()
    plt.subplot(6,1,1)
    plt.plot(t[0:i],u1[0:i],'b--',linewidth=3)
    plt.ylabel('Cooling T (K)')
    plt.legend(['Jacket Temperature'],loc='best')
    plt.subplot(6,1,2)
    plt.plot(t[0:i],u2[0:i],'b--',linewidth=3)
    plt.ylabel('inlet flow')
    plt.subplot(6,1,3)
    plt.plot(t[0:i],Ca[0:i],'b.-',linewidth=3,label=r'$C_A$')
    plt.plot([0,t[i-1]],[0.2,0.2],'r--',linewidth=2,label='limit')
    plt.ylabel(r'$C_A$ (mol/L)')
    plt.legend(loc='best')
    plt.subplot(6,1,4)
    plt.plot(t[0:i],V[0:i],'g--',linewidth=3)
    plt.xlabel('time')
    plt.ylabel('Volume of Tank')
    plt.subplot(6,1,5)
    plt.plot(t[0:i],Tsp[0:i],'k-',linewidth=3,label=r'$T_{sp}$')
    plt.plot(t[0:i],T[0:i],'b.-',linewidth=3,label=r'$T_{meas}$')
    plt.plot([0,t[i-1]],[400,400],'r--',linewidth=2,label='limit')
    plt.ylabel('T (K)')
    plt.xlabel('Time (min)')
    plt.legend(loc='best')
    plt.subplot(6,1,6)
    plt.plot(t[0:i],hsp[0:i],'g--',linewidth=3,label=r'$h_{sp}$')
    plt.plot(t[0:i],h[0:i],'k.-',linewidth=3,label=r'$h_{meas}$')
    plt.xlabel('time')
    plt.ylabel('tank level')
    plt.legend(loc='best')
    plt.draw()
    plt.pause(0.01)
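As a quick sanity check on the time grid and the temperature set-point schedule used in the script above, here is a minimal NumPy sketch. T_ss is defined in a part of the script that is not shown here, so the value 300.0 below is only a placeholder assumption, and the names dt, untouched, sphi and splo are introduced just for this illustration.

```python
import numpy as np

# Same time grid as in the script: 401 points over 8 minutes.
t = np.linspace(0, 8, 401)
dt = t[1] - t[0]  # length of one simulation/controller step (min)

# Placeholder steady-state value (assumption, see the note above).
T_ss = 300.0
Tsp = np.ones(len(t)) * T_ss
Tsp[0:100] = 330.0
Tsp[100:200] = 350.0
Tsp[230:260] = 370.0
Tsp[260:290] = 390.0

# Indices that no slice assignment touches keep the steady-state value.
untouched = np.flatnonzero(Tsp == T_ss)

# Deadband bounds, built the same way as SPHI / SPLO in the loop.
DT = 0.5
sphi, splo = Tsp + DT, Tsp - DT

print(round(dt, 4))
print(untouched[0], untouched[-1], len(untouched))
```

With this grid each loop iteration covers 0.02 min, and the set point falls back to T_ss for indices 200 to 229 and again from index 290 onward, which may or may not be what was intended.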

tensorflow adapt for local rgb image classification

I would like to adapt the following code (batchnorm_five_layers, from GitHub) so that it reads two classes (cats and dogs) from local image paths, using 780x780 RGB images. Here is the uncommented code from the link:
# encoding: UTF-8
import tensorflow as tf
import tensorflowvisu
import math
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
tf.set_random_seed(0)
# Download images and labels into mnist.test (10K images+labels) and mnist.train (60K images+labels)
mnist = read_data_sets("data", one_hot=True, reshape=False, validation_size=0)
# input X: 28x28 grayscale images, the first dimension (None) will index the images in the mini-batch
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
# correct answers will go here
Y_ = tf.placeholder(tf.float32, [None, 10])
# variable learning rate
lr = tf.placeholder(tf.float32)
# train/test selector for batch normalisation
tst = tf.placeholder(tf.bool)
# training iteration
iter = tf.placeholder(tf.int32)
# five layers and their number of neurons (the last layer has 10 softmax neurons)
L = 200
M = 100
N = 60
P = 30
Q = 10
# Weights initialised with small random values between -0.2 and +0.2
# When using RELUs, make sure biases are initialised with small *positive* values for example 0.1 = tf.ones([K])/10
W1 = tf.Variable(tf.truncated_normal([784, L], stddev=0.1)) # 784 = 28 * 28
B1 = tf.Variable(tf.ones([L])/10)
W2 = tf.Variable(tf.truncated_normal([L, M], stddev=0.1))
B2 = tf.Variable(tf.ones([M])/10)
W3 = tf.Variable(tf.truncated_normal([M, N], stddev=0.1))
B3 = tf.Variable(tf.ones([N])/10)
W4 = tf.Variable(tf.truncated_normal([N, P], stddev=0.1))
B4 = tf.Variable(tf.ones([P])/10)
W5 = tf.Variable(tf.truncated_normal([P, Q], stddev=0.1))
B5 = tf.Variable(tf.ones([Q])/10)
def batchnorm(Ylogits, is_test, iteration, offset, convolutional=False):
    exp_moving_avg = tf.train.ExponentialMovingAverage(0.999, iteration) # adding the iteration prevents from averaging across non-existing iterations
    bnepsilon = 1e-5
    if convolutional:
        mean, variance = tf.nn.moments(Ylogits, [0, 1, 2])
    else:
        mean, variance = tf.nn.moments(Ylogits, [0])
    update_moving_everages = exp_moving_avg.apply([mean, variance])
    m = tf.cond(is_test, lambda: exp_moving_avg.average(mean), lambda: mean)
    v = tf.cond(is_test, lambda: exp_moving_avg.average(variance), lambda: variance)
    Ybn = tf.nn.batch_normalization(Ylogits, m, v, offset, None, bnepsilon)
    return Ybn, update_moving_everages
def no_batchnorm(Ylogits, is_test, iteration, offset, convolutional=False):
    return Ylogits, tf.no_op()
# The model
XX = tf.reshape(X, [-1, 784])
# batch norm scaling is not useful with relus
# batch norm offsets are used instead of biases
Y1l = tf.matmul(XX, W1)
Y1bn, update_ema1 = batchnorm(Y1l, tst, iter, B1)
Y1 = tf.nn.relu(Y1bn)
Y2l = tf.matmul(Y1, W2)
Y2bn, update_ema2 = batchnorm(Y2l, tst, iter, B2)
Y2 = tf.nn.relu(Y2bn)
Y3l = tf.matmul(Y2, W3)
Y3bn, update_ema3 = batchnorm(Y3l, tst, iter, B3)
Y3 = tf.nn.relu(Y3bn)
Y4l = tf.matmul(Y3, W4)
Y4bn, update_ema4 = batchnorm(Y4l, tst, iter, B4)
Y4 = tf.nn.relu(Y4bn)
Ylogits = tf.matmul(Y4, W5) + B5
Y = tf.nn.softmax(Ylogits)
update_ema = tf.group(update_ema1, update_ema2, update_ema3, update_ema4)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_)
cross_entropy = tf.reduce_mean(cross_entropy)*100
# accuracy of the trained model, between 0 (worst) and 1 (best)
correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# matplotlib visualisation
allweights = tf.concat([tf.reshape(W1, [-1]), tf.reshape(W2, [-1]), tf.reshape(W3, [-1])], 0)
allbiases = tf.concat([tf.reshape(B1, [-1]), tf.reshape(B2, [-1]), tf.reshape(B3, [-1])], 0)
# to use for sigmoid
#allactivations = tf.concat([tf.reshape(Y1, [-1]), tf.reshape(Y2, [-1]), tf.reshape(Y3, [-1]), tf.reshape(Y4, [-1])], 0)
# to use for RELU
allactivations = tf.concat([tf.reduce_max(Y1, [0]), tf.reduce_max(Y2, [0]), tf.reduce_max(Y3, [0]), tf.reduce_max(Y4, [0])], 0)
alllogits = tf.concat([tf.reshape(Y1l, [-1]), tf.reshape(Y2l, [-1]), tf.reshape(Y3l, [-1]), tf.reshape(Y4l, [-1])], 0)
I = tensorflowvisu.tf_format_mnist_images(X, Y, Y_)
It = tensorflowvisu.tf_format_mnist_images(X, Y, Y_, 1000, lines=25)
datavis = tensorflowvisu.MnistDataVis(title4="Logits", title5="Max activations across batch", histogram4colornum=2, histogram5colornum=2)
# training step, the learning rate is a placeholder
train_step = tf.train.AdamOptimizer(lr).minimize(cross_entropy)
# init
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# You can call this function in a loop to train the model, 100 images at a time
def training_step(i, update_test_data, update_train_data):
    # training on batches of 100 images with 100 labels
    batch_X, batch_Y = mnist.train.next_batch(100)
    max_learning_rate = 0.03
    min_learning_rate = 0.0001
    decay_speed = 1000.0
    learning_rate = min_learning_rate + (max_learning_rate - min_learning_rate) * math.exp(-i/decay_speed)
    # compute training values for visualisation
    if update_train_data:
        a, c, im, al, ac = sess.run([accuracy, cross_entropy, I, alllogits, allactivations], {X: batch_X, Y_: batch_Y, tst: False})
        print(str(i) + ": accuracy:" + str(a) + " loss: " + str(c) + " (lr:" + str(learning_rate) + ")")
        datavis.append_training_curves_data(i, a, c)
        datavis.update_image1(im)
        datavis.append_data_histograms(i, al, ac)
    # compute test values for visualisation
    if update_test_data:
        a, c, im = sess.run([accuracy, cross_entropy, It], {X: mnist.test.images, Y_: mnist.test.labels, tst: True})
        print(str(i) + ": ********* epoch " + str(i*100//mnist.train.images.shape[0]+1) + " ********* test accuracy:" + str(a) + " test loss: " + str(c))
        datavis.append_test_curves_data(i, a, c)
        datavis.update_image2(im)
    # the backpropagation training step
    sess.run(train_step, {X: batch_X, Y_: batch_Y, lr: learning_rate, tst: False})
    sess.run(update_ema, {X: batch_X, Y_: batch_Y, tst: False, iter: i})
datavis.animate(training_step, iterations=10000+1, train_data_update_freq=20, test_data_update_freq=100, more_tests_at_start=True)
print("max test accuracy: " + str(datavis.get_max_test_accuracy()))
To answer your question in the comments: this is probably what you want to change your code into:
# input X: images, the first dimension (None) will index the images in the mini-batch
X = tf.placeholder(tf.float32, [None, 780, 780, 3])
# correct answers will go here
Y_ = tf.placeholder(tf.float32, [None, 2])
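Changing the placeholders is probably not the whole story: the script in the question also hard-codes 784 (= 28 * 28) in W1 and in tf.reshape, and Q = 10 for the output layer, so those would presumably need to follow the new shapes as well. A rough sketch of the arithmetic, using NumPy as a stand-in for the TensorFlow reshape (flat_size and w1_params are names made up for this illustration):

```python
import numpy as np

# Shapes taken from the placeholders above.
height, width, channels = 780, 780, 3
num_classes = 2  # would replace Q = 10 in the original script
L = 200          # size of the first hidden layer in the original script

flat_size = height * width * channels  # would replace 784 = 28 * 28
w1_params = flat_size * L              # number of entries in W1 of shape [flat_size, L]

# NumPy stand-in for tf.reshape(X, [-1, flat_size]) on a dummy batch of 2 images.
batch = np.zeros((2, height, width, channels), dtype=np.float32)
flat = batch.reshape(-1, flat_size)

print(flat_size, w1_params, flat.shape)
```

With 1,825,200 inputs per image, the first fully connected layer alone holds about 365 million weights, so it may be worth resizing the images to something much smaller before feeding them to this network.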
And an image can be read like this:
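# scipy.misc.imread was removed in SciPy 1.2; Pillow is one replacement
import numpy as np
from PIL import Image
image = np.asarray(Image.open('input.png').convert('RGB'))  # shape (height, width, 3)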
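Once single images can be read, they still have to be stacked into a batch and paired with one-hot labels before they can be fed to X and Y_. The following is only a sketch under assumptions: one_hot, make_batch and fake_loader are hypothetical helper names, the file names are made up, and scaling pixel values to [0, 1] is a choice made here rather than something the original script requires. The loader is passed in as a function so the sketch runs without any image files.

```python
import numpy as np

def one_hot(class_ids, num_classes):
    """Turn integer class ids (e.g. 0 = cat, 1 = dog) into one-hot rows."""
    labels = np.zeros((len(class_ids), num_classes), dtype=np.float32)
    labels[np.arange(len(class_ids)), class_ids] = 1.0
    return labels

def make_batch(paths, class_ids, load_image, num_classes=2):
    """Stack images into [N, H, W, C] and pair them with one-hot labels.

    load_image is any callable that returns an H x W x 3 array for a path.
    """
    images = np.stack([load_image(p) for p in paths]).astype(np.float32) / 255.0
    return images, one_hot(class_ids, num_classes)

# Dummy loader so the sketch runs without files; swap in a real image reader.
def fake_loader(path):
    return np.full((780, 780, 3), 255, dtype=np.uint8)

batch_X, batch_Y = make_batch(["cat1.png", "dog1.png"], [0, 1], fake_loader)
print(batch_X.shape, batch_Y.tolist())
```

The resulting batch_X and batch_Y have the shapes [N, 780, 780, 3] and [N, 2], matching the placeholders, and could take the place of the arrays returned by mnist.train.next_batch in the training step.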
Now it might be best to follow a Tensorflow tutorial. This one is really good: kadenze.com/courses/creative-applications-of-deep-learning-with-tensorflow-iv/info
Good luck!

Resources