PyMC: parameterizing stochastic variables

I'm fairly new to Python and PyMC and wanted to try out a problem with PyMC for learning purposes. I'm modeling simple Mendelian inheritance from grandparents down to a son, but I don't understand how to reapply the same stochastic model multiple times. Any help is appreciated.
@py.stochastic
def childOf(value=1, d=0, m=0):
    pdra=d/2
    pmra=m/2
    # now return likelihood
    if (value==0):
        return -np.log((1-pdra)*(1-pmra))
    elif (value==1):
        return -np.log((1-pdra)*(pmra)+(pdra)*(1-pmra))
    else:
        return -np.log((pdra*pmra))
p = [0.25,0.5,0.25]
gdd = py.Categorical("gdd", p, size=1)
gdm = py.Categorical("gdm", p, size=1)
gmd = py.Categorical("gmd", p, size=1)
gmm = py.Categorical("gmm", p, size=1)
gm=childOf('gm',d=gmm,m=gmd)
gd=childOf('gd',d=gdm,m=gdd)
gs=childOf('gs',d=gm,m=gd)
The error is a long traceback that ends with TypeError: 'numpy.ndarray' object is not callable, raised at the first childOf call.

You are not using your Stochastic object correctly. childOf is itself a PyMC object, not a constructor of PyMC objects, which is how the last three lines try to use it. A better approach is to write a log-probability function and use it as the logp attribute of each object. For example:
import pymc as pm
import numpy as np
def childOf_logp(value=1, d=0, m=0):
    # divide by 2.0 so the result is not truncated to an integer under Python 2
    pdra = d / 2.0
    pmra = m / 2.0
    # PyMC expects the log-probability itself, so there is no leading minus sign
    if value == 0:
        return np.log((1 - pdra) * (1 - pmra))
    elif value == 1:
        return np.log((1 - pdra) * pmra + pdra * (1 - pmra))
    else:
        return np.log(pdra * pmra)
@pm.stochastic
def childOf_pm(value=1, d=gmm, m=gmd):
    logp = childOf_logp
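As a sanity check on the inheritance model itself, here is a minimal sketch in plain Python (no PyMC involved). It assumes d and m are the parents' allele counts (0, 1 or 2), as in the question; the helper names child_genotype_probs and child_logp are made up for illustration. It shows that the three genotype probabilities sum to one for every parent combination, and that a log-probability function should return log(p), which is zero or negative.

```python
import math

def child_genotype_probs(d, m):
    """Return [P(0 copies), P(1 copy), P(2 copies)] for a child,
    given the allele counts d and m (each 0, 1 or 2) of the two parents."""
    p_d = d / 2.0  # chance that parent d passes the allele on
    p_m = m / 2.0  # chance that parent m passes the allele on
    return [
        (1 - p_d) * (1 - p_m),
        (1 - p_d) * p_m + p_d * (1 - p_m),
        p_d * p_m,
    ]

def child_logp(value, d, m):
    """Log-probability of the child's genotype; -inf for impossible outcomes."""
    p = child_genotype_probs(d, m)[value]
    return math.log(p) if p > 0 else float("-inf")

# The probabilities form a proper distribution for every parent combination.
for d in (0, 1, 2):
    for m in (0, 1, 2):
        assert abs(sum(child_genotype_probs(d, m)) - 1.0) < 1e-12
```

Returning -inf for impossible genotypes (for example two copies when neither parent carries the allele) avoids taking log(0).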


How to find all local minimums of a function efficiently

This question is related to global optimization, but it is simpler. The task is to find all local minima of a function. This is sometimes useful; in physics, for example, we might want to find metastable states besides the true ground state in phase space. I have a naive implementation, tested on the scalar function x*sin(x)+x*cos(2*x), that runs gradient descent from randomly chosen points in the interval. But clearly this is not efficient. The code and output are attached if you are interested.
#!/usr/bin/env python
from scipy import *
from numpy import *
from pylab import *
from numpy import random
"""
search all of the local minimums using random search when the functional form of the target function is known.
"""
def function(x):
    return x*sin(x)+x*cos(2*x)
#   return x**4-3*x**3+2
def derivative(x):
    return sin(x)+x*cos(x)+cos(2*x)-2*x*sin(2*x)
#   return 4.*x**3-9.*x**2
def ploting(xr,yr,mls):
    plot(xr,yr)
    grid()
    for xm in mls:
        axvline(x=xm,c='r')
    savefig("plotf.png")
    show()
def findlocmin(x,Nit,step_def=0.1,err=0.0001,gamma=0.01):
    """
    we use the gradient descent method to find a local minimum using x as the starting point
    """
    for i in range(Nit):
        slope=derivative(x)
        step=min(step_def,abs(slope)*gamma)
        x=x-step*slope/abs(slope)
#       print(step,x)
        if(abs(slope)<err):
            print("Found local minimum using "+str(i)+' iterations')
            break
        if i==Nit-1:
            raise Exception("local min is not found using Nit=",str(Nit),'iterations')
    return x
if __name__=="__main__":
    xleft=-9;xright=9
    xs=linspace(xleft,xright,100)
    ys=array([function(x) for x in xs ])
    minls=[]
    Nrand=100;it=0
    Nit=10000
    while it<Nrand:
        xint=random.uniform(xleft,xright)
        xlocm=findlocmin(xint,Nit)
        print(xlocm)
        minls.append(xlocm)
        it+=1
#   print(minls)
    ploting(xs,ys,minls)
Is there a better solution to this?
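One possible alternative for a one-dimensional function, sketched here under assumptions rather than offered as a definitive method: evaluate the function on a fine grid, treat every grid point that is lower than both neighbours as a bracket around a local minimum, and refine each bracket with a golden-section search. This swaps the random gradient-descent restarts for a deterministic scan. The names golden_section and find_local_minima and the grid size are illustrative choices, and the approach assumes the grid is fine enough that each bracket contains exactly one minimum.

```python
import math

def f(x):
    # Test function from the question
    return x * math.sin(x) + x * math.cos(2 * x)

def golden_section(func, a, b, tol=1e-8):
    """Shrink [a, b] around the minimum of a function that is unimodal on it."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            # the minimum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # the minimum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

def find_local_minima(func, left, right, n_grid=2000):
    """Bracket minima on a uniform grid, then refine each bracket."""
    step = (right - left) / (n_grid - 1)
    xs = [left + i * step for i in range(n_grid)]
    ys = [func(x) for x in xs]
    minima = []
    for i in range(1, n_grid - 1):
        # a grid point below both neighbours brackets a local minimum
        if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]:
            minima.append(golden_section(func, xs[i - 1], xs[i + 1]))
    return minima

minima = find_local_minima(f, -9.0, 9.0)
```

Each minimum is found once, so no deduplication is needed, and the cost is one grid pass plus a few dozen function evaluations per minimum. Minima that sit exactly on the interval boundary are not reported by this sketch.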

Cannot convert object of type 'list' to a numeric value

I am building a Pyomo model in which I want to use random numbers for my two-dimensional parameters. I wrote a small Python script that generates random numbers in exactly the shape I want for the two-dimensional parameter. In my objective function I get TypeError: Cannot convert object of type 'list' (value = [[....]]) to a numeric value. Below are my objective function and the random-number script.
model.obj = Objective(expr=sum(model.C[v,l] * model.T[v,l] for v in model.V for l in model.L) + \
                      sum(model.OC[o,l] * model.D[o,l] for o in model.O for l in model.L), sense=minimize)
import random
C = [[] for i in range(7)]
for i in range(7):
    for j in range(5):
        C[i]+= [random.randint(100,500)]
model.C = Param(model.V, model.L, initialize=C)
Please let me know if someone can help fixing this.
You should initialize your parameter using a function instead of a nested list:
def init_c(m, i, j):
    return random.randint(100,500)
model.C = Param(model.V, model.L, initialize=init_c)
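Another option, if you want to generate the numbers up front, is to store them in a dict keyed by index tuples, which Pyomo's Param also accepts for initialize. The sketch below is plain Python and only builds the dict; the index sets V and L are hypothetical stand-ins for model.V and model.L (7 and 5 members, matching the loops in the question), and the Pyomo call is shown as a comment.

```python
import random

# Hypothetical index sets standing in for model.V and model.L
V = range(1, 8)  # 7 members
L = range(1, 6)  # 5 members

random.seed(0)  # only so that repeated runs give the same numbers

# One random value per (v, l) index pair, instead of a nested list
C = {(v, l): random.randint(100, 500) for v in V for l in L}

# In the model this dict would be passed as the initializer, for example:
# model.C = Param(model.V, model.L, initialize=C)
```

The nested list fails because it assigns one whole list to every index, whereas the dict gives each (v, l) pair its own scalar.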

How can I print the intermediate variables in the loss function in TensorFlow and Keras?

I'm writing a custom objective to train a Keras (with TensorFlow backend) model but I need to debug some intermediate computation. For simplicity, let's say I have:
def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return K.square(diff)
I could not find an easy way to access, for example, the intermediate variable diff or its shape during training. In this simple example, I know that I could return diff to print its values, but my actual loss is more complex and I can't return intermediate values without getting compilation errors.
Is there an easy way to debug intermediate variables in Keras?
This is not something that is solved in Keras as far as I know, so you have to resort to backend-specific functionality. Both Theano and TensorFlow have Print nodes that are identity nodes (i.e., they return the input node) and have the side-effect of printing the input (or some tensor of the input).
Example for Theano:
diff = y_pred - y_true
diff = theano.printing.Print('shape of diff', attrs=['shape'])(diff)
return K.square(diff)
Example for TensorFlow:
diff = y_pred - y_true
diff = tf.Print(diff, [tf.shape(diff)])
return K.square(diff)
Note that this only works for intermediate values. Keras expects tensors that are passed to other layers to have specific attributes such as _keras_shape. Values processed by the backend, i.e. through Print, usually do not have that attribute. To solve this, you can wrap debug statements in a Lambda layer for example.
In TensorFlow 2, you can now add IDE breakpoints in the TensorFlow Keras models/layers/losses, including when using the fit, evaluate, and predict methods. However, you must add model.run_eagerly = True after calling model.compile() for the values of the tensor to be available in the debugger at the breakpoint. For example,
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return tf.keras.backend.square(diff)  # Breakpoint in IDE here. =====
class SimpleModel(Model):
    def __init__(self):
        super().__init__()
        self.dense0 = Dense(2)
        self.dense1 = Dense(1)
    def call(self, inputs):
        z = self.dense0(inputs)
        z = self.dense1(z)
        return z
x = tf.convert_to_tensor([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
y = tf.convert_to_tensor([0, 1], dtype=tf.float32)
model0 = SimpleModel()
model0.run_eagerly = True
model0.compile(optimizer=Adam(), loss=custom_loss)
y0 = model0.fit(x, y, epochs=1)  # Values of diff *not* shown at breakpoint. =====
model1 = SimpleModel()
model1.compile(optimizer=Adam(), loss=custom_loss)
model1.run_eagerly = True
y1 = model1.fit(x, y, epochs=1)  # Values of diff shown at breakpoint. =====
This also works for debugging the outputs of intermediate network layers (for example, adding the breakpoint in the call of the SimpleModel).
Note: this was tested in TensorFlow 2.0.0-rc0.
In TensorFlow 2.0, you can use tf.print and print anything inside the definition of your loss function. You can also do something like tf.print("my_intermediate_tensor =", my_intermediate_tensor), i.e. with a message, similar to Python's print. However, you may need to decorate your loss function with @tf.function to actually see the results of the tf.print.

Supervised Machine Learning, producing a trained estimator

I have an assignment in which I am supposed to use scikit, numpy and pylab to do the following:
"All of the following should use data from the training_data.csv file
provided. training_data gives you a labeled set of integer pairs,
representing the scores of two sports teams, with the labels giving the
sport.
Write the following functions:
plot_scores() should draw a scatterplot of the data.
predict(dataset) should produce a trained Estimator to guess the sport
that resulted in a given score (from a dataset we've withheld, which will
be inputs as a 1000 x 2 np array). You can use any algorithm from scikit.
An optional additional function called "preprocess" will process dataset
before it is passed to predict.
"
This is what I have done so far:
import numpy as np
import scipy as sp
import pylab as pl
from random import shuffle
def plot_scores():
    k=open('training_data.csv')
    lst=[]
    for triple in k:
        temp=triple.split(',')
        lst.append([int(temp[0]), int(temp[1]), int(temp[2][:1])])
    array=np.array(lst)
    pl.scatter(array[:,0], array[:,1])
    pl.show()
def preprocess(dataset):
    k=open('training_data.csv')
    lst=[]
    for triple in k:
        temp=triple.split(',')
        lst.append([int(temp[0]), int(temp[1]), int(temp[2][:1])])
    shuffle(lst)
    return lst
In preprocess, I shuffled the data because I am supposed to use some of it to train on and some of it to test on, but the original data was not at all random. My question is, how am I supposed to "produce a trained estimator" in predict(dataset)? Is this supposed to be a function that returns another function? And which algorithm would be ideal to classify based on a dataset that looks like this:
The task likely wants you to train a standard scikit classifier model and return it, i.e. something like
from sklearn.svm import SVC
def predict(dataset):
    X = ...  # features, extract from dataset
    y = ...  # labels, extract from dataset
    clf = SVC()  # create classifier
    clf.fit(X, y)  # train
    return clf
Though judging from the name of the function (predict) you should check if it really wants you to return a trained classifier or return predictions for the given dataset argument, as that would be more typical.
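The X = ... and y = ... placeholders above could be filled in roughly as follows. This is only a sketch: it assumes each training row holds two integer scores followed by an integer label, which is how the question's own parsing code reads training_data.csv. The helper name split_features_labels and the two sample rows are made up for illustration.

```python
import csv
import io

def split_features_labels(rows):
    """Split rows of (score1, score2, label) into features X and labels y."""
    X = [[int(row[0]), int(row[1])] for row in rows]
    y = [int(row[2]) for row in rows]
    return X, y

# Made-up rows standing in for the contents of training_data.csv
sample = io.StringIO("21,14,0\n3,1,1\n")
rows = list(csv.reader(sample))

X, y = split_features_labels(rows)
```

Plain lists of this shape can be passed directly to a scikit-learn estimator's fit(X, y).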
As a classifier you can use basically any one you like. Your plot suggests that the dataset is linearly separable (there are no colors for the classes, but I assume the blobs are the two classes). On linearly separable data hardly anything will fail. Try SVMs, logistic regression, random forests, naive Bayes, ... For extra fun you can try to plot the decision boundaries, see here (which also contains an overview of the available classifiers).
I would recommend that you take a look at this structure:
from random import shuffle
import matplotlib.pyplot as plt
# import a classifier you need
def get_data():
    # open your file and parse data to prepare X as a set of input vectors and Y as a set of targets
    return X, Y
def split_data(X, Y):
    size = len(X)
    indices = list(range(size))  # shuffle needs a mutable list
    shuffle(indices)
    train_indices = indices[:size // 2]  # integer division for slice indices
    test_indices = indices[size // 2:]
    X_train = [X[i] for i in train_indices]
    Y_train = [Y[i] for i in train_indices]
    X_test = [X[i] for i in test_indices]
    Y_test = [Y[i] for i in test_indices]
    return X_train, Y_train, X_test, Y_test
def plot_scatter(Y1, Y2):
    plt.figure()
    plt.scatter(Y1, Y2, c='b', marker='o')
    plt.show()
# get data
X, Y = get_data()
# split data
X_train, Y_train, X_test, Y_test = split_data(X, Y)
# create a classifier as an object
classifier = YourImportedClassifier()
# train the classifier, after that the classifier is the trained estimator you need
classifier.fit(X_train, Y_train)  # scikit-learn estimators train with .fit
# make a prediction
Y_prediction = classifier.predict(X_test)
# plot the scatter
plot_scatter(Y_prediction, Y_test)
I think what you are looking for is the clf.fit() function, rather than writing a function that produces another function.

PyMC3: How can I code my custom distribution with observed data better for Theano?

I am attempting to implement a fairly simple model in pymc3. The gist is that I have some data that is generated from a sequence of random choices. The choices can be thought of as a multinomial, and the process selects choices as a function of previous choices.
The overall probability of the categories is modeled with a Dirichlet prior.
The likelihood function must be customized for the data at hand. The data are lists of 0's and 1's that are output from the process. I have successfully made the model in pymc2, which you can find at this blog post. Here is a python function that generates test data for this problem:
import numpy as np
ps = [0.2,0.35,0.25,0.15,0.0498,1/5000.0]  # 5000.0 avoids integer division under Python 2
def make(ps):
    out = []
    while len(out) < 5:
        n_spots = 5-len(out)
        sp = sum(ps[:n_spots+1])
        P = [x/sp for x in ps[:n_spots+1]]
        l = np.argwhere(np.random.multinomial(1,P)==1).ravel()[0]
        #if len(out) == 4:
        #    l = np.argwhere(np.random.multinomial(1,ps[:2])==1).ravel()[0]
        out.extend([1]*l)
        if (out and out[-1] == 1 and len(out) < 5) or l == 0:
            out.append(0)
        #print n_spots, l, len(out)
    assert len(out) == 5
    return out
As I'm learning/moving to pymc3, I'm trying to input my data as observed into a custom likelihood function, and I'm running into several issues along the way. It's probably because this is my first experience with Theano, but I'm hoping that someone can give some advice.
Here is my code (using the make function above):
import numpy as np
import pymc3 as pm
from scipy import optimize
import theano.tensor as T
from theano.compile.ops import as_op
from collections import Counter
# This function gets the attributes of the data that are relevant for calculating the likelihood
def scan(value):
    groups = []
    prev = False
    s = 0
    for i in xrange(5):
        if value[i] == 0:
            if prev:
                groups.append((s,5-(i-s)))
                prev = False
                s = 0
            else:
                groups.append((0,5-i))
        else:
            prev = True
            s += 1
    if prev:
        groups.append((s,4-(i-s)))
    return groups
# The likelihood calculation for a single data point
def like1(v,p):
    l = 1
    groups = scan(v)
    for n, s in groups:
        l *= p[n]/p[:s+1].sum()
    return T.log(l)
# my custom likelihood class
class CustomDist(pm.distributions.Discrete):
    def __init__(self, ps, data, *args, **kwargs):
        super(CustomDist, self).__init__(*args, **kwargs)
        self.ps = ps
        self.data = data
    def logp(self,v):
        all_l = 0
        for v, k in self.data.items():
            l = like1(v,self.ps)
            all_l += l*k
        return all_l
# model creation
model = pm.Model()
with model:
    probs = pm.Dirichlet('probs',a=np.array([0.5]*6),shape=6,testval=np.array([1/6.0]*6))
    output = CustomDist("rolls",ps=probs,data=data,observed=True)
I am able to find the MAP in about a minute or so (my machine is Windows 7, i7-4790 @ 3.6GHz). The MAP matches well with the input probability vector, which at least means the model is linked properly.
When I try to do traces, though, my memory usage skyrockets (up to several GB) and I haven't been patient enough for the model to finish compiling. I've waited more than 10 minutes for NUTS or HMC to compile before any tracing starts. The Metropolis stepping works just fine, though (and is much faster than with pymc2).
Am I just being too hopeful for Theano to be able to handle for-loops of non-theano data well? Is there a better way to write this code so that Theano plays well with it, or am I limited because my data is a custom python type and can't be analyzed with array/matrix operations?
Thanks in advance for your advice and feedback. Please let me know what might need clarification!
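One direction that may help, offered as a sketch rather than a tested PyMC3 solution: the Python loops only depend on the data, not on the probabilities, so the (n, s) index pairs can be computed once with NumPy before the model is built. Since p[:s+1].sum() equals cumsum(p)[s], the log-likelihood then reduces to a few array operations. The example below checks this in plain NumPy against the loop version. The helper names and the small data Counter are made up for illustration, and swapping the NumPy calls for their theano.tensor counterparts inside logp is an assumption that is not tested here.

```python
import numpy as np
from collections import Counter

def scan_groups(value):
    """Same bookkeeping as scan() in the question: (n, s) pairs for one record."""
    groups, prev, s = [], False, 0
    for i in range(5):
        if value[i] == 0:
            if prev:
                groups.append((s, 5 - (i - s)))
                prev, s = False, 0
            else:
                groups.append((0, 5 - i))
        else:
            prev = True
            s += 1
    if prev:
        groups.append((s, 4 - (i - s)))
    return groups

def build_index_arrays(data):
    """Flatten every (n, s) pair once, weighted by how often its record occurs."""
    n_idx, s_idx, weights = [], [], []
    for record, count in data.items():
        for n, s in scan_groups(record):
            n_idx.append(n)
            s_idx.append(s)
            weights.append(count)
    return np.array(n_idx), np.array(s_idx), np.array(weights, dtype=float)

def loglike_vectorized(p, n_idx, s_idx, weights):
    # p[:s+1].sum() == cumsum(p)[s], so the loops collapse into indexing
    csum = np.cumsum(p)
    return np.sum(weights * (np.log(p[n_idx]) - np.log(csum[s_idx])))

def loglike_loop(p, data):
    """Reference implementation mirroring like1() and logp() from the question."""
    total = 0.0
    for record, count in data.items():
        like = 1.0
        for n, s in scan_groups(record):
            like *= p[n] / p[:s + 1].sum()
        total += count * np.log(like)
    return total

# Made-up observations: record -> number of occurrences
data = Counter({(1, 0, 1, 1, 0): 3, (0, 0, 1, 0, 0): 2, (1, 1, 1, 1, 1): 1})
p = np.array([0.2, 0.35, 0.25, 0.15, 0.0498, 0.0002])
n_idx, s_idx, weights = build_index_arrays(data)
```

With the index arrays fixed ahead of time, the symbolic graph would contain one cumulative sum, two indexing operations and one reduction, instead of one node per data point and group, which is likely what makes compilation so slow.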
