DecisionTreeClassifier cost complexity pruning ccp_alpha - sklearn-pandas

I have this code, which models an imbalanced class with a decision tree, but somehow the ccp_alpha selected at the end is not the value I expect. It should be around 0.005, but the code picks about 0.020.
I am not sure why the result is "ccp_alpha=0.02044841897041862" instead of 0.005, as suggested by the graph of
"Recall vs alpha for training and testing sets".
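import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import recall_score
from sklearn.tree import DecisionTreeClassifier
class_weight_t={0: 0.07, 1: 0.89}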
clf = DecisionTreeClassifier(random_state=1, class_weight=class_weight_t)
path = clf.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities
pd.DataFrame(path)
clfs = []
for ccp_alpha in ccp_alphas:
    clf = DecisionTreeClassifier(
        random_state=1, ccp_alpha=ccp_alpha, class_weight=class_weight_t
    )
    clf.fit(X_train, y_train)
    clfs.append(clf)
    #print(str(clf)+","+str(ccp_alpha)+","+str(clfs[-1].tree_.node_count))
print(
    "Number of nodes in the last tree is: {} with ccp_alpha: {}".format(
        clfs[-1].tree_.node_count, ccp_alphas[-1]
    )
)
Number of nodes in the last tree is: 1 with ccp_alpha: 0.29696815935983295
recall_train = []
for clf in clfs:
    pred_train = clf.predict(X_train)
    values_train = recall_score(y_train, pred_train)
    recall_train.append(values_train)
recall_test = []
for clf in clfs:
    pred_test = clf.predict(X_test)
    values_test = recall_score(y_test, pred_test)
    recall_test.append(values_test)
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("Recall")
ax.set_title("Recall vs alpha for training and testing sets")
ax.plot(
    ccp_alphas, recall_train, marker="o", label="train", drawstyle="steps-post",
)
ax.plot(ccp_alphas, recall_test, marker="o", label="test", drawstyle="steps-post")
#ax.plot(
# ccp_alphas, train_scores, marker="o", label="train", drawstyle="steps-post",
#)
#ax.plot(ccp_alphas, test_scores, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
https://i.stack.imgur.com/0imAq.png
index_best_model = np.argmax(recall_test)
best_model = clfs[index_best_model]
print(best_model)
DecisionTreeClassifier(ccp_alpha=0.02044841897041862,class_weight={0: 0.07, 1: 0.89}, random_state=1)
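One thing worth checking, offered as a hedged observation rather than a confirmed diagnosis: np.argmax returns the index of the first occurrence of the largest value, and the ccp_alphas returned by cost_complexity_pruning_path are in increasing order. The selection above therefore lands on 0.0204 only if the test recall there is strictly higher than at every smaller alpha. The sketch below uses made-up recall values (hypothetical numbers, not the data from the question) to show how to list the top-ranked alphas next to their scores so they can be compared with the plot.

```python
import numpy as np

# Hypothetical values for illustration only, NOT the question's data.
ccp_alphas = np.array([0.000, 0.002, 0.005, 0.012, 0.020, 0.297])
recall_test = np.array([0.71, 0.74, 0.80, 0.78, 0.86, 0.86])

# np.argmax returns the index of the FIRST occurrence of the maximum,
# i.e. the smallest alpha among tied best scores.
index_best = int(np.argmax(recall_test))
best_alpha = float(ccp_alphas[index_best])

# Rank all alphas by test recall (stable sort keeps ties in alpha order)
# and print the top few to compare against the plotted curve.
ranking = np.argsort(-recall_test, kind="stable")
for i in ranking[:3]:
    print(f"alpha={ccp_alphas[i]:.3f}  recall_test={recall_test[i]:.2f}")
```

Because recall only counts missed positives, a heavily pruned tree with a large weight on class 1 can reach a high recall by predicting class 1 for most samples. Whether that happens here depends on the data, which the question does not show.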

Related

Comparing Biweekly HFMD Cases with and without using the Squared Error Objective & L1-Norm Objective

I wish to model the biweekly HFMD cases in Malaysia.
Then, I want to show that the model using the Squared Error Objective and the L1-Norm Objective can model the biweekly HFMD cases better than a model without these objectives.
My question is: is it possible to model the biweekly HFMD cases without using the Squared Error Objective and the L1-Norm Objective?
I have attached the code below:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
m1 = GEKKO(remote=False)
m2 = GEKKO(remote=False)
m = m1
# Known parameters
nb = 26 # Number of biweeks in a year
ny = 3 # Number of years
biweeks = np.zeros((nb,ny*nb+1))
biweeks[0][0] = 1
for i in range(nb):
    for j in range(ny):
        biweeks[i][j*nb+i+1] = 1
# Write csv data file
tm = np.linspace(0,78,79)
# case data
# Malaysia weekly HFMD data from the year 2013 - 2015
cases = np.array([506,506,700,890,1158,1605,1694,1311,1490,1310,1368,\
1009,1097,934,866,670,408,481,637,749,700,648,710,\
740,627,507,516,548,636,750,1066,1339,1565,\
1464,1575,1759,1631,1601,1227,794,774,623,411,\
750,1017,976,1258,1290,1546,1662,1720,1553,1787,1291,1712,2227,2132,\
2550,2140,1645,1743,1296,1153,871,621,570,388,\
347,391,446,442,390,399,421,398,452,470,437,411])
data = np.vstack((tm,cases))
data = data.T
# np.savetxt('measles_biweek_2.csv',data,delimiter=',',header='time,cases')
np.savetxt('hfmd_biweek_2.csv',data,delimiter=',',header='time,cases')
# Load data from csv
# m.time, cases_meas = np.loadtxt('measles_biweek_2.csv', \
m.time, cases_hfmd = np.loadtxt('hfmd_biweek_2.csv', \
delimiter=',',skiprows=1,unpack=True)
# m.Vr = m.Param(value = 0)
# Variables
# m.N = m.FV(value = 3.2e6)
# m.mu = m.FV(value = 7.8e-4)
# m.N = m.FV(value = 3.11861e7)
# m.mu = m.FV(value = 6.42712e-4)
m.N = m.FV(value = 3.16141e7) # Malaysia average total population (2015 - 2017)
m.mu = m.FV(value = 6.237171519e-4) # Malaysia scaled birth rate (births/biweek/total population)
m.rep_frac = m.FV(value = 0.45) # Percentage of underreporting
# Beta values (unknown parameters in the model)
m.beta = [m.FV(value=1, lb=0.1, ub=5) for i in range(nb)]
# Predicted values
m.S = m.SV(value = 0.162492875*m.N.value, lb=0,ub=m.N) # Susceptibles (Kids from 0 - 9 YO: 5137066 people) - Average of 94.88% from total reported cases
m.I = m.SV(value = 7.907863896e-5*m.N.value, lb=0,ub=m.N) #
# m.V = m.Var(value = 2e5)
# measured values
m.cases = m.CV(value = cases_hfmd, lb=0)
# turn on feedback status for CASES
m.cases.FSTATUS = 1
# weight on prior model predictions
m.cases.WMODEL = 0
# meas_gap = deadband that represents level of
# accuracy / measurement noise
db = 100
m.cases.MEAS_GAP = db
for i in range(nb):
    m.beta[i].STATUS=1
#m.gamma = m.FV(value=0.07)
m.gamma = m.FV(value=0.07)
m.gamma.STATUS = 1
m.gamma.LOWER = 0.05
m.gamma.UPPER = 0.5
m.biweek=[None]*nb
for i in range(nb):
    m.biweek[i] = m.Param(value=biweeks[i])
# Intermediate
m.Rs = m.Intermediate(m.S*m.I/m.N)
# Equations
sum_biweek = sum([m.biweek[i]*m.beta[i]*m.Rs for i in range(nb)])
# m.Equation(m.S.dt()== -sum_biweek + m.mu*m.N - m.Vr)
m.Equation(m.S.dt()== -sum_biweek + m.mu*m.N)
m.Equation(m.I.dt()== sum_biweek - m.gamma*m.I)
m.Equation(m.cases == m.rep_frac*sum_biweek)
# m.Equation(m.V.dt()==-m.Vr)
# options
m.options.SOLVER = 1
m.options.NODES=3
# imode = 5, dynamic estimation
m.options.IMODE = 5
# ev_type = 1 (L1-norm) or 2 (squared error)
m.options.EV_TYPE = 2
# solve model and print solver output
m.solve()
[print('beta['+str(i+1)+'] = '+str(m.beta[i][0])) \
for i in range(nb)]
print('gamma = '+str(m.gamma.value[0]))
# export data
# stack time and avg as column vectors
my_data = np.vstack((m.time,np.asarray(m.beta),m.gamma))
# transpose data
my_data = my_data.T
# save text file with comma delimiter
beta_str = ''
for i in range(nb):
    beta_str = beta_str + ',beta[' + str(i+1) + ']'
header_name = 'time,gamma' + beta_str
##np.savetxt('solution_data.csv',my_data,delimiter=',',\
## header = header_name, comments='')
np.savetxt('solution_data_EVTYPE_'+str(m.options.EV_TYPE)+\
'_gamma'+str(m.gamma.STATUS)+'.csv',\
my_data,delimiter=',',header = header_name)
plt.figure(num=1, figsize=(16,8))
plt.suptitle('Estimation')
plt.subplot(2,2,1)
plt.plot(m.time,m.cases, label='Cases (model)')
plt.plot(m.time,cases_hfmd, label='Cases (measured)')
if m.options.EV_TYPE==2:
    plt.plot(m.time,cases_hfmd+db/2, 'k-.',\
             lw=0.5, label=r'$Cases_{db-hi}$')
    plt.plot(m.time,cases_hfmd-db/2, 'k-.',\
             lw=0.5, label=r'$Cases_{db-lo}$')
    plt.fill_between(m.time,cases_hfmd-db/2,\
                     cases_hfmd+db/2,color='gold',alpha=.5)
plt.legend(loc='best')
plt.ylabel('Cases')
plt.subplot(2,2,2)
plt.plot(m.time,m.S,'r--')
plt.ylabel('S')
plt.subplot(2,2,3)
[plt.plot(m.time,m.beta[i], label='_nolegend_')\
for i in range(nb)]
plt.plot(m.time,m.gamma,'c--', label=r'$\gamma$')
plt.legend(loc='best')
plt.ylabel(r'$\beta, \gamma$')
plt.xlabel('Time')
plt.subplot(2,2,4)
plt.plot(m.time,m.I,'g--')
plt.xlabel('Time')
plt.ylabel('I')
plt.subplots_adjust(hspace=0.2,wspace=0.4)
name = 'cases_EVTYPE_'+ str(m.options.EV_TYPE) +\
'_gamma' + str(m.gamma.STATUS) + '.png'
plt.savefig(name)
plt.show()
To define a custom objective, use the m.Minimize() or m.Maximize() functions instead of the squared error or L1-norm objectives that are built into the m.CV() objects. In that case, declare the variables with m.Var() instead of m.CV(), for example:
from gekko import GEKKO
import numpy as np
m = GEKKO()
x = m.Array(m.Var,4,value=1,lb=1,ub=5)
x1,x2,x3,x4 = x
# change initial values
x2.value = 5; x3.value = 5
m.Equation(x1*x2*x3*x4>=25)
m.Equation(x1**2+x2**2+x3**2+x4**2==40)
m.Minimize(x1*x4*(x1+x2+x3)+x3)
m.solve()
print('x: ', x)
print('Objective: ',m.options.OBJFCNVAL)
Here is a similar problem with disease prediction (Measles) that uses m.CV().
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as plt
# Import Data
# write csv data file
t_s = np.linspace(0,78,79)
# case data
cases_s = np.array([180,180,271,423,465,523,649,624,556,420,\
423,488,441,268,260,163,83,60,41,48,65,82,\
145,122,194,237,318,450,671,1387,1617,2058,\
3099,3340,2965,1873,1641,1122,884,591,427,282,\
174,127,84,97,68,88,79,58,85,75,121,174,209,458,\
742,929,1027,1411,1885,2110,1764,2001,2154,1843,\
1427,970,726,416,218,160,160,188,224,298,436,482,468])
# Initialize gekko model
m = GEKKO()
# Number of collocation nodes
nodes = 4
# Number of phases (years in this case)
n = 3
#Biweek periods per year
bi = 26
# Time horizon (for all 3 phases)
m.time = np.linspace(0,1,bi+1)
# Parameters that will repeat each year
N = m.Param(3.2e6)
mu = m.Param(7.8e-4)
rep_frac = m.Param(0.45)
Vr = m.Param(0)
beta = m.MV(2,lb = 0.1)
beta.STATUS = 1
gamma = m.FV(value=0.07)
gamma.STATUS = 1
gamma.LOWER = 0.05
gamma.UPPER = 0.5
# Data used to control objective function
casesobj1 = m.Param(cases_s[0:(bi+1)])
casesobj2 = m.Param(cases_s[bi:(2*bi+1)])
casesobj3 = m.Param(cases_s[2*bi:(3*bi+1)])
# Variables that vary between years, one version for each year
cases = [m.CV(value = cases_s[(i*bi):(i+1)*(bi+1)-i],lb=0) for i in range(n)]
for i in cases:
    i.FSTATUS = 1
    i.WMODEL = 0
    i.MEAS_GAP = 100
S = [m.Var(0.06*N,lb = 0,ub = N) for i in range(n)]
I = [m.Var(0.001*N, lb = 0,ub = N) for i in range(n)]
V = [m.Var(2e5) for i in range(n)]
# Equations (created for each year)
for i in range(n):
    R = m.Intermediate(beta*S[i]*I[i]/N)
    m.Equation(S[i].dt() == -R + mu*N - Vr)
    m.Equation(I[i].dt() == R - gamma*I[i])
    m.Equation(cases[i] == rep_frac*R)
    m.Equation(V[i].dt() == -Vr)
# Connect years together at endpoints
for i in range(n-1):
    m.Connection(cases[i+1],cases[i],1,bi,1,nodes)#,1,nodes)
    m.Connection(cases[i+1],'CALCULATED',pos1=1,node1=1)
    m.Connection(S[i+1],S[i],1,bi,1,nodes)
    m.Connection(S[i+1],'CALCULATED',pos1=1,node1=1)
    m.Connection(I[i+1],I[i],1,bi,1,nodes)
    m.Connection(I[i+1],'CALCULATED',pos1=1, node1=1)
# Solver options
m.options.IMODE = 5
m.options.NODES = nodes
m.options.EV_TYPE = 1
m.options.SOLVER = 1
# Solve
m.Obj(2*(casesobj1-cases[0])**2+(casesobj3-cases[2])**2)
m.solve()
# Calculate the start time of each phase
ts = np.linspace(1,n,n)
# Plot
plt.figure()
plt.subplot(4,1,1)
tm = np.empty(len(m.time))
for i in range(n):
    tm = m.time + ts[i]
    plt.plot(tm,cases[i].value,label='Cases Year %s'%(i+1))
    plt.plot(tm,cases_s[(i*bi):(i+1)*(bi+1)-i],'.')
plt.legend()
plt.ylabel('Cases')
plt.subplot(4,1,2)
for i in range(n):
    tm = m.time + ts[i]
    plt.plot(tm,beta.value,label='Beta Year %s'%(i+1))
plt.legend()
plt.ylabel('Contact Rate')
plt.subplot(4,1,3)
for i in range(n):
    tm = m.time + ts[i]
    plt.plot(tm,I[i].value,label='I Year %s'%(i+1))
plt.legend()
plt.ylabel('Infectives')
plt.subplot(4,1,4)
for i in range(n):
    tm = m.time + ts[i]
    plt.plot(tm,S[i].value,label='S Year %s'%(i+1))
plt.legend()
plt.ylabel('Susceptibles')
plt.xlabel('Time (yr)')
plt.show()

Using GEKKO for Moving Horizon Estimation online 2

This is a follow-up question after applying the comments from: Using GEKKO for Moving Horizon Estimation online
I have studied the iterative estimation example on the Dynamic Optimization course website and revised my code as follows:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
import matplotlib; matplotlib.use('TkAgg')
class Observer():
    def __init__(self, window_size, r_init, alpha_init):
        self.m = GEKKO(remote=False)
        self.dt = 0.05
        self.m.time = [i*self.dt for i in range(window_size)]
        #Parameters
        self.m.u = self.m.MV()
        #Variables
        self.m.r = self.m.CV(lb=0) # value=r_init) #ub=20 can be over 20
        self.m.alpha = self.m.CV() # value=alpha_init) #ub lb for angle?
        #Equations
        self.m.Equation(self.m.r.dt()== -self.m.cos(self.m.alpha))
        self.m.Equation(self.m.alpha.dt()== self.m.sin(self.m.alpha)/self.m.r - self.m.u) # differential equation
        #Options
        self.m.options.MV_STEP_HOR = 2
        self.m.options.IMODE = 5 # dynamic estimation
        self.m.options.EV_TYPE = 2 #Default 1: absolute error form 2: squared error form
        self.m.options.DIAGLEVEL = 0 #diagnostic level
        self.m.options.NODES = 5 #nodes # collocation nodes default:2
        self.m.options.SOLVER = 3 #solver_num
        # STATUS = 0, optimizer doesn't adjust value
        # STATUS = 1, optimizer can adjust
        self.m.u.STATUS = 0
        self.m.r.STATUS = 1
        self.m.alpha.STATUS = 1
        # FSTATUS = 0, no measurement
        # FSTATUS = 1, measurement used to update model
        self.m.u.FSTATUS = 1 #default
        self.m.r.FSTATUS = 1
        self.m.alpha.FSTATUS = 1
        self.m.r.TR_INIT = 0
        self.m.alpha.TR_INIT = 0
        self.count = 0
    def MHE(self, observed_state, u_data):
        self.count += 1
        self.m.u.MEAS = u_data
        self.m.r.MEAS = observed_state[0]
        self.m.alpha.MEAS = observed_state[1]
        self.m.solve(disp=False)
        return self.m.r.MODEL, self.m.alpha.MODEL
if __name__=="__main__":
    FILE_PATH00 = '/home/shane16/Project/model_guard/uav_paper/adversarial/SA_PPO/src/DATA/4end_estimation_results_r.csv'
    FILE_PATH01 = '/home/shane16/Project/model_guard/uav_paper/adversarial/SA_PPO/src/DATA/4end_estimation_results_alpha.csv'
    FILE_PATH02 = '/home/shane16/Project/model_guard/uav_paper/adversarial/SA_PPO/src/DATA/4end_action_buffer_eps0.0_sig0.0.csv'
    cycles = 55
    x = np.arange(cycles) # 1...300
    matrix00 = np.loadtxt(FILE_PATH00, delimiter=',')
    matrix01 = np.loadtxt(FILE_PATH01, delimiter=',')
    matrix02 = np.loadtxt(FILE_PATH02, delimiter=',')
    vanilla_action_sigma_0 = matrix02
    vanilla_estimation_matrix_r = np.zeros(cycles)
    vanilla_estimation_matrix_alpha = np.zeros(cycles)
    # sigma = 0.0
    # vanilla model true/observed states
    r_vanilla_sigma_0_true = matrix00[0, 3:] # from step 1
    r_vanilla_sigma_0_observed = matrix00[1, 3:] # from step1
    alpha_vanilla_sigma_0_true = matrix01[0, 3:]
    alpha_vanilla_sigma_0_observed = matrix01[1, 3:]
    # initialize estimator
    sigma = 0.0 #1.0
    solver_num = 3
    nodes = 5
    # for window_size in [5, 10, 20, 30, 40, 50]:
    window_size = 5
    observer = Observer(window_size, r_vanilla_sigma_0_observed[0], alpha_vanilla_sigma_0_observed[0])
    for i in range(cycles):
        if i % 100 == 0:
            print('cylcle: {}'.format(i))
        vanilla_observed_states = np.hstack((r_vanilla_sigma_0_observed[i], alpha_vanilla_sigma_0_observed[i])) # from current observed state
        r_hat, alpha_hat = observer.MHE(vanilla_observed_states, vanilla_action_sigma_0[i]) # and current action -> estimate current state
        vanilla_estimation_matrix_r[i] = r_hat
        vanilla_estimation_matrix_alpha[i] = alpha_hat
    #plot vanilla
    plt.figure()
    plt.subplot(3,1,1)
    plt.title('Vanilla model_sig{}'.format(sigma))
    plt.plot(x, vanilla_action_sigma_0[:cycles],'b:',label='action (w)')
    plt.legend()
    plt.subplot(3,1,2)
    plt.ylabel('r')
    plt.plot(x, r_vanilla_sigma_0_true[:cycles], 'k-', label='true_r')
    plt.plot(x, r_vanilla_sigma_0_observed[:cycles], 'gx', label='observed_r')
    plt.plot(x, vanilla_estimation_matrix_r, 'r--', label='time window: 10')
    # plt.legend()
    plt.subplot(3,1,3)
    plt.xlabel('time steps')
    plt.ylabel('alpha')
    plt.plot(x, alpha_vanilla_sigma_0_true[:cycles], 'k-', label='true_alpha')
    plt.plot(x, alpha_vanilla_sigma_0_observed[:cycles], 'gx', label='observed_alpha')
    plt.plot(x, vanilla_estimation_matrix_alpha, 'r--', label='time window: {}'.format(window_size))
    plt.legend()
    plt.savefig('plot/revision/4estimated_STATES_vanilla_sig{}_window{}_cycles{}_solver{}_nodes{}.png'.format(sigma, window_size,cycles, solver_num, nodes))
    plt.show()
csv files: https://drive.google.com/drive/folders/1jW_6zBCdbJHB7yU3HmCIhamEyOT1LJqD?usp=sharing
The code works when r and alpha are initialized with the values specified at lines 15 and 16 (m.r, m.alpha).
However, if I try with no initial value (the same condition as in the example), a solution is not found.
terminal output:
cylcle: 0
Traceback (most recent call last):
  File "4observer_mhe.py", line 86, in <module>
    r_hat, alpha_hat = observer.MHE(vanilla_observed_states, vanilla_action_sigma_0[i]) # and current action -> estimate current state
  File "4observer_mhe.py", line 49, in MHE
    self.m.solve(disp=False)
  File "/home/shane16/Project/model_guard/LipSDP/lipenv/lib/python3.7/site-packages/gekko/gekko.py", line 2140, in solve
    raise Exception(apm_error)
Exception: #error: Solution Not Found
What could be the solution to this problem?
I have tried the strategies below, but they did not solve the problem.
Reduce the number of decision variables by using m.FV() or m.MV() with m.options.MV_STEP_HOR=2+ to reduce the degrees of freedom for the solver for the unknown parameters.
Try other solvers with m.options.SOLVER=1 or m.options.SOLVER=2.
I expect to see estimation results that follow the true state well.
But I guess I'm doing something wrong.
Could anyone help me please?
Thank you.
Solvers sometimes need good initial guess values or constraints (lower and upper bounds) on the degrees of freedom (MV or FV) to find the optimal solution.
One of the equations may be the source of the problem:
self.m.alpha.dt() == self.m.sin(self.m.alpha)/self.m.r - self.m.u
The initial value of r is zero (default) because no initial value is provided when it is declared as self.m.r = self.m.CV(lb=0). A comment suggests that it was formerly initialized with value r_init. The zero value creates a divide-by-zero for that equation. Try rearranging the equation into an equivalent form that avoids the potential for divide-by-zero either with the initial guess or when the solver is iterating.
self.m.r*self.m.alpha.dt() == self.m.sin(self.m.alpha) - self.m.r*self.m.u
There may be other things that are also causing the model to not converge. When the solution does not converge then the infeasibilities.txt file can be a source to troubleshoot the specific equations that are having trouble. Here are instructions to retrieve the infeasibilities.txt file: How to retrieve the 'infeasibilities.txt' from the gekko
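The effect of the rearrangement can be checked numerically without a solver. The sketch below is plain Python, not GEKKO: the two helper functions are hypothetical names introduced for illustration, and the numbers are arbitrary. It evaluates the residual of each form of the equation and shows that they agree up to the factor r when r is nonzero, while only the rearranged form can be evaluated at r = 0.

```python
import math

def residual_divided(alpha_dot, alpha, r, u):
    """Residual of alpha' == sin(alpha)/r - u (original form)."""
    return alpha_dot - (math.sin(alpha) / r - u)

def residual_multiplied(alpha_dot, alpha, r, u):
    """Residual of r*alpha' == sin(alpha) - r*u (rearranged form)."""
    return r * alpha_dot - (math.sin(alpha) - r * u)

# Arbitrary illustrative values, not taken from the question's data.
alpha_dot, alpha, u = 0.3, 0.5, 0.1

# For r != 0 the two residuals differ only by the factor r,
# so they vanish at exactly the same points.
r = 2.0
same_up_to_r = math.isclose(
    residual_multiplied(alpha_dot, alpha, r, u),
    r * residual_divided(alpha_dot, alpha, r, u),
)

# At r == 0 the original form cannot be evaluated; the rearranged one can.
try:
    residual_divided(alpha_dot, alpha, 0.0, u)
    divided_failed = False
except ZeroDivisionError:
    divided_failed = True
value_at_zero = residual_multiplied(alpha_dot, alpha, 0.0, u)
```

Being able to evaluate the equation at r = 0 removes the undefined starting point, but it does not by itself guarantee convergence. A nonzero initial guess for r is still a reasonable thing to try.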

Emotion detection using facial landmarks

I plan on using a scikit-learn SVM for class prediction.
I have been trying this:
Get images from a webcam
Detect Facial Landmarks
Train a machine learning algorithm (we will use a linear SVM)
Predict emotions
I have a problem in this line: clf.fit(npar_train, training_labels)
I also have a problem in site-packages\sklearn\svm_base.py and in site-packages\sklearn\utils\validation.py
How can I fix this error?
Thank you in advance.
Python script:
emotions = ['neutral', 'sad', 'happy', 'anger']
data={}
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
clf = SVC(kernel='linear', probability=True, tol=1e-3)
def get_files(emotion):
    files = glob.glob('img\\datasets\\%s\\*' %emotion)
    random.shuffle(files)
    training = files[:int(len(files)*0.8)]
    prediction = files[-int(len(files)*0.2)]
    return training, prediction
def get_landmarks(image):
    detections = detector(image, 1)
    for k, d in enumerate(detections): # For all detected face instances individually
        shape = predictor(image, d) # Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(1, 68): # Store X and Y coordinates in two lists
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))
        xmean = np.mean(xlist)
        ymean = np.mean(ylist)
        xcentral = [(x - xmean) for x in xlist]
        ycentral = [(y - ymean) for y in ylist]
        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist):
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            meannp = np.asarray((ymean, xmean))
            coornp = np.asarray((z, w))
            dist = np.linalg.norm(coornp - meannp)
            landmarks_vectorised.append(dist)
            landmarks_vectorised.append((math.atan2(y, x) * 360) / (2 * math.pi))
        data['landmarks_vectorised'] = landmarks_vectorised
    if len(detections) < 1:
        data['landmarks_vestorised'] = "error"
def make_sets():
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emotions:
        print("Working on %s emotion" %emotion)
        training, prediction = get_files(emotion)
        for item in training:
            image = cv2.imread(item)
            try:
                image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            except:
                print()
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            clahe_image = clahe.apply(image)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                training_data.append(data['landmarks_vectorised']) # append image array to training data list
                training_labels.append(emotions.index(emotion))
        for item in prediction:
            image = cv2.imread(item)
            try:
                image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            except:
                print()
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            clahe_image = clahe.apply(image)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                prediction_data.append(data['landmarks_vectorised'])
                prediction_labels.append(emotions.index(emotion))
    return training_data, training_labels, prediction_data, prediction_labels
accur_lin = []
for i in range(0,10):
    print("Making sets %s" % i) # Make sets by random sampling 80/20%
    training_data, training_labels, prediction_data, prediction_labels = make_sets()
    npar_train = np.array(training_data)
    npar_trainlabs = np.array(training_labels)
    print("training SVM linear %s" % i) # train SVM
    clf.fit(npar_train, training_labels)
    print("getting accuracies %s" % i)
    npar_pred = np.array(prediction_data)
    pred_lin = clf.score(npar_pred, prediction_labels)
print("Mean value lin svm: %s" % np.mean(accur_lin))
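The question does not include the traceback, so the cause of the clf.fit error cannot be confirmed from the post. One detail in get_files is worth ruling out, though: the prediction split is written as an index rather than a slice. The sketch below uses hypothetical file names to show the difference; it assumes nothing about the actual dataset.

```python
# Hypothetical file names standing in for the glob.glob() result.
files = ["img_%02d.png" % i for i in range(10)]

n_pred = int(len(files) * 0.2)   # 2 files reserved for prediction

as_written = files[-n_pred]      # index: a single string
as_slice = files[-n_pred:]       # slice: a list holding the last 2 paths

# Looping over a string yields its characters, not file paths.
items_seen = [item for item in as_written]

print(type(as_written).__name__, as_written)
print(type(as_slice).__name__, as_slice)
print(items_seen[:3])
```

With the index form, the loop over prediction in make_sets would iterate over the characters of one path. Separately, get_landmarks stores the "error" marker under the key 'landmarks_vestorised' while make_sets reads 'landmarks_vectorised', so the no-face check looks like it can never trigger. Both points are worth checking before looking deeper into clf.fit.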

how to get reproducible result in Tensorflow

I built a 5-layer neural network using TensorFlow.
I have a problem getting reproducible (or stable) results.
I found similar questions regarding the reproducibility of TensorFlow and the corresponding answers, such as How to get stable results with TensorFlow, setting random seed
But the problem is not solved yet.
I also set the random seed as follows:
tf.set_random_seed(1)
Furthermore, I added seed options to every random function, such as
b1 = tf.Variable(tf.random_normal([nHidden1], seed=1234))
I confirmed that the first epoch gives identical results, but from the second epoch onward the results gradually diverge.
How can I get reproducible results?
Am I missing something?
Here is the code block I use.
def xavier_init(n_inputs, n_outputs, uniform=True):
    if uniform:
        init_range = tf.sqrt(6.0 / (n_inputs + n_outputs))
        return tf.random_uniform_initializer(-init_range, init_range, seed=1234)
    else:
        stddev = tf.sqrt(3.0 / (n_inputs + n_outputs))
        return tf.truncated_normal_initializer(stddev=stddev, seed=1234)
import numpy as np
import tensorflow as tf
import dataSetup
from scipy.stats.stats import pearsonr
tf.set_random_seed(1)
x_train, y_train, x_test, y_test = dataSetup.input_data()
# Parameters
learningRate = 0.01
trainingEpochs = 1000000
batchSize = 64
displayStep = 100
thresholdReduce = 1e-6
thresholdNow = 0.6
#dropoutRate = tf.constant(0.7)
# Network Parameter
nHidden1 = 128 # number of 1st layer nodes
nHidden2 = 64 # number of 2nd layer nodes
nInput = 24 #
nOutput = 1 # Predicted score: 1 output for regression
# save parameter
modelPath = 'model/model_layer5_%d_%d_mini%d_lr%.3f_noDrop_rollBack.ckpt' %(nHidden1, nHidden2, batchSize, learningRate)
# tf Graph input
X = tf.placeholder("float", [None, nInput])
Y = tf.placeholder("float", [None, nOutput])
# Weight
W1 = tf.get_variable("W1", shape=[nInput, nHidden1], initializer=xavier_init(nInput, nHidden1))
W2 = tf.get_variable("W2", shape=[nHidden1, nHidden2], initializer=xavier_init(nHidden1, nHidden2))
W3 = tf.get_variable("W3", shape=[nHidden2, nHidden2], initializer=xavier_init(nHidden2, nHidden2))
W4 = tf.get_variable("W4", shape=[nHidden2, nHidden2], initializer=xavier_init(nHidden2, nHidden2))
WFinal = tf.get_variable("WFinal", shape=[nHidden2, nOutput], initializer=xavier_init(nHidden2, nOutput))
# biases
b1 = tf.Variable(tf.random_normal([nHidden1], seed=1234))
b2 = tf.Variable(tf.random_normal([nHidden2], seed=1234))
b3 = tf.Variable(tf.random_normal([nHidden2], seed=1234))
b4 = tf.Variable(tf.random_normal([nHidden2], seed=1234))
bFinal = tf.Variable(tf.random_normal([nOutput], seed=1234))
# Layers for dropout
L1 = tf.nn.relu(tf.add(tf.matmul(X, W1), b1))
L2 = tf.nn.relu(tf.add(tf.matmul(L1, W2), b2))
L3 = tf.nn.relu(tf.add(tf.matmul(L2, W3), b3))
L4 = tf.nn.relu(tf.add(tf.matmul(L3, W4), b4))
hypothesis = tf.add(tf.matmul(L4, WFinal), bFinal)
print "Layer setting DONE..."
# define loss and optimizer
cost = tf.reduce_mean(tf.square(hypothesis - Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learningRate).minimize(cost)
# Initialize the variable
init = tf.initialize_all_variables()
# save op to save and restore all the variables
saver = tf.train.Saver()
with tf.Session() as sess:
    # initialize
    sess.run(init)
    print "Initialize DONE..."
    # Training
    costPrevious = 100000000000000.0
    best = float("INF")
    totalBatch = int(len(x_train)/batchSize)
    print "Total Batch: %d" %totalBatch
    for epoch in range(trainingEpochs):
        #print "EPOCH: %04d" %epoch
        avgCost = 0.
        for i in range(totalBatch):
            np.random.seed(i+epoch)
            randidx = np.random.randint(len(x_train), size=batchSize)
            batch_xs = x_train[randidx,:]
            batch_ys = y_train[randidx,:]
            # Fit training using batch data
            sess.run(optimizer, feed_dict={X:batch_xs, Y:batch_ys})
            # compute average loss
            avgCost += sess.run(cost, feed_dict={X:batch_xs, Y:batch_ys})/totalBatch
        # compare the current cost and the previous
        # if current cost > the previous
        # just continue and make the learning rate half
        #print "Cost: %1.8f --> %1.8f at epoch %05d" %(costPrevious, avgCost, epoch+1)
        if avgCost > costPrevious + .5:
            #sess.run(init)
            load_path = saver.restore(sess, modelPath)
            print "Cost increases at the epoch %05d" %(epoch+1)
            print "Cost: %1.8f --> %1.8f" %(costPrevious, avgCost)
            continue
        costNow = avgCost
        reduceCost = abs(costPrevious - costNow)
        costPrevious = costNow
        #Display logs per epoch step
        if costNow < best:
            best = costNow
            bestMatch = sess.run(hypothesis, feed_dict={X:x_test})
            # model save
            save_path = saver.save(sess, modelPath)
        if epoch % displayStep == 0:
            print "step {}".format(epoch)
            pearson = np.corrcoef(bestMatch.flatten(), y_test.flatten())
            print 'train loss = {}, current loss = {}, test corrcoef={}'.format(best, costNow, pearson[0][1])
        if reduceCost < thresholdReduce or costNow < thresholdNow:
            print "Epoch: %04d, Cost: %.9f, Prev: %.9f, Reduce: %.9f" %(epoch+1, costNow, costPrevious, reduceCost)
            break
    print "Optimization Finished"
It seems that your results are perhaps not reproducible because you are using Saver to write/restore from checkpoint each time? (i.e. the second time that you run the code, the variable values aren't initialized using your random seed -- they are restored from your previous checkpoint)
Please trim down your code example to just the code necessary to reproduce irreproducibility.
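The checkpoint effect described above can be illustrated without TensorFlow. The sketch below is a simplified stand-in, not the question's training loop: run() is a hypothetical function, the "weights" are a single float, and the restore happens at a fixed step instead of when the cost rises. It shows that a fixed seed gives identical results only as long as no state is carried over from a previous run through a file.

```python
import json
import os
import random
import tempfile

def run(ckpt_path, steps=5, seed=1):
    rng = random.Random(seed)          # fixed seed, like tf.set_random_seed
    w = rng.random()                   # seeded "initialisation"
    for step in range(steps):
        w += rng.uniform(-0.1, 0.1)    # stand-in for a training update
        if step == 2 and os.path.exists(ckpt_path):
            # Stand-in for saver.restore(): the state now comes from
            # whatever an earlier run left on disk.
            with open(ckpt_path) as f:
                w = json.load(f)["w"]
    with open(ckpt_path, "w") as f:    # stand-in for saver.save()
        json.dump({"w": w}, f)
    return w

with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "model.json")
    first = run(ckpt)     # no checkpoint exists yet
    second = run(ckpt)    # picks up the first run's checkpoint
    os.remove(ckpt)
    third = run(ckpt)     # clean start again

print(first == second, first == third)
```

Under these assumptions, deleting the old checkpoint (or writing to a fresh path) before each run is one way to test whether the restore step explains the divergence.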

How to model a mixture of 3 Normals in PyMC?

There is a question on CrossValidated on how to use PyMC to fit two Normal distributions to data. The answer of Cam.Davidson.Pilon was to use a Bernoulli distribution to assign data to one of the two Normals:
size = 10
p = Uniform( "p", 0 , 1) #this is the fraction that come from mean1 vs mean2
ber = Bernoulli( "ber", p = p, size = size) # produces 1 with proportion p.
precision = Gamma('precision', alpha=0.1, beta=0.1)
mean1 = Normal( "mean1", 0, 0.001 )
mean2 = Normal( "mean2", 0, 0.001 )
@deterministic
def mean( ber = ber, mean1 = mean1, mean2 = mean2):
    return ber*mean1 + (1-ber)*mean2
Now my question is: how to do it with three Normals?
Basically, the issue is that you can't use a Bernoulli distribution and 1-Bernoulli anymore. But how to do it then?
edit: With the CDP's suggestion, I wrote the following code:
import numpy as np
import pymc as mc
n = 3
ndata = 500
dd = mc.Dirichlet('dd', theta=(1,)*n)
category = mc.Categorical('category', p=dd, size=ndata)
precs = mc.Gamma('precs', alpha=0.1, beta=0.1, size=n)
means = mc.Normal('means', 0, 0.001, size=n)
@mc.deterministic
def mean(category=category, means=means):
    return means[category]
@mc.deterministic
def prec(category=category, precs=precs):
    return precs[category]
v = np.random.randint( 0, n, ndata)
data = (v==0)*(50+ np.random.randn(ndata)) \
+ (v==1)*(-50 + np.random.randn(ndata)) \
+ (v==2)*np.random.randn(ndata)
obs = mc.Normal('obs', mean, prec, value=data, observed = True)
model = mc.Model({'dd': dd,
'category': category,
'precs': precs,
'means': means,
'obs': obs})
The traces with the following sampling procedure look good as well. Solved!
mcmc = mc.MCMC( model )
mcmc.sample( 50000,0 )
mcmc.trace('means').gettrace()[-1,:]
There is a mc.Categorical object that does just this.
p = [0.2, 0.3, .5]
t = mc.Categorical('test', p )
t.random()
#array(2, dtype=int32)
It returns an int between 0 and len(p)-1. To model the 3 Normals, you make p a mc.Dirichlet object (it accepts a k length array as the hyperparameters; setting the values in the array to be the same is setting the prior probabilities to be equal). The rest of the model is nearly identical.
This is a generalization of the model I suggested above.
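To make the roles of the Dirichlet and Categorical variables concrete, here is a sketch of the generative process the model describes, written with NumPy rather than PyMC. The component means and standard deviations are illustrative choices. Note that the sketch uses standard deviations, whereas the PyMC Normal in the code above is parameterised by precision.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ndata = 3, 500

# Mixture weights: one draw from a symmetric Dirichlet, sums to 1.
p = rng.dirichlet(np.ones(n))

# One component label per data point, an int between 0 and n-1.
category = rng.choice(n, size=ndata, p=p)

# Illustrative component parameters (assumed values).
means = np.array([50.0, -50.0, 0.0])
sds = np.ones(n)

# Indexing by category gives each point its own mean and spread,
# which is what the deterministic mean/prec functions do in the model.
data = rng.normal(loc=means[category], scale=sds[category])

print(p.round(3), np.bincount(category, minlength=n))
```

Inference runs this process in reverse: given data, the sampler updates p, the labels, and the component parameters together.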
Update:
Okay, so instead of having different means, we can collapse them all into 1:
means = Normal( "means", 0, 0.001, size=3 )
...
@mc.deterministic
def mean(categorical=categorical, means = means):
    return means[categorical]
