How to build a model requiring an external package in PyMC3? - pymc

I'm not sure if this is a PyMC3 question or a Theano question. I've used PyMC2 for a long time to fit a cosmology to supernova data. This requires some messy integrals (see e.g. http://arxiv.org/abs/astro-ph/9905116 ).
So I use a Python package called Cosmolopy to do the integration and for some other convenience functions. This worked fine with PyMC2, but with PyMC3's reliance on Theano, I can't figure out whether there is even a way to use Cosmolopy.
Here is some example code showing my current understanding of how to build a model in PyMC3:
import numpy as np
import pymc3 as pm
import cosmolopy as cp

# generate some redshifts
nSNe = 100
z = np.random.uniform(low=0.0, high=1.0, size=nSNe)

# set cosmology and simulate some distance moduli and errors
cosmo = cp.fidcosmo
muSN = cp.magnitudes.distance_modulus(z, **cosmo) + np.random.normal(loc=0, scale=0.15, size=nSNe)
muSN_err = np.random.uniform(low=0.1, high=0.3, size=nSNe)

# pymc model
with pm.Model() as model:
    # omega matter is the free parameter in this simple example
    omega_matter = pm.Uniform('omega_matter', lower=0.0, upper=1.0)

    # the cosmology as a function of omega_matter
    cosmo['omega_M_0'] = omega_matter
    cosmo['omega_lambda_0'] = 1.0 - omega_matter
    mu_fit = cp.magnitudes.distance_modulus(z, **cosmo)

    # what should be fit by the MCMC
    snr = pm.Normal('snr', mu=mu_fit, sd=muSN_err, observed=muSN)
This code crashes because Cosmolopy expects a float for omega_matter but receives a theano.TensorVariable instead.
So the question is two-fold:
Am I just missing something syntactically in PyMC3 that would allow me to do this (possibly because I am still somehow stuck in PyMC2 model-building)?
If not, do I need to find a way to do the integrals in Theano?

I don't know PyMC3 well, but I know Theano well. Theano uses a symbolic compiler, and a TensorVariable is such a symbolic variable. You need to compile and execute a function to get a value out of it. I don't know where to do this in PyMC3. A quick thing to try, which will work if the variable depends only on constants and shared variables, is this call:
the_tensor_variable.eval()
This compiles a function under the assumption that it takes no variable inputs; if it compiles, it runs it and returns the value.
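For example, a minimal sketch (the variable names are mine) where the expression depends only on a constant:
import theano.tensor as tt

x = tt.constant(2.0)   # a constant, so the graph needs no inputs
y = x ** 2 + 1.0       # y is a symbolic TensorVariable
print(y.eval())        # compiles and runs the graph, printing 5.0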

I think one possible solution would be to write a custom Theano Op, following the instructions at http://deeplearning.net/software/theano/extending/
I would write a pure Python Op without support for gradient computation, in which case you would only have to implement the make_node() and perform() methods.
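For concreteness, here is a minimal, untested sketch of such an Op wrapping the Cosmolopy call from the question. The class name and the choice to pass the redshift array z at construction time are my own, and with no grad() implementation you would also need a gradient-free step method such as Metropolis.
import theano
import theano.tensor as tt
import cosmolopy as cp

class DistanceModulusOp(theano.Op):
    # pure-Python Op: distance modulus as a function of omega_matter
    def __init__(self, z):
        self.z = z  # fixed array of redshifts, captured at construction

    def make_node(self, omega_matter):
        omega_matter = tt.as_tensor_variable(omega_matter)
        # scalar input, double-precision vector output
        return theano.Apply(self, [omega_matter], [tt.dvector()])

    def perform(self, node, inputs, output_storage):
        (omega_matter,) = inputs
        cosmo = dict(cp.fidcosmo)
        cosmo['omega_M_0'] = float(omega_matter)
        cosmo['omega_lambda_0'] = 1.0 - float(omega_matter)
        # plain NumPy/Cosmolopy code; Theano never looks inside perform()
        output_storage[0][0] = cp.magnitudes.distance_modulus(self.z, **cosmo)
Inside the model you would then write something like mu_fit = DistanceModulusOp(z)(omega_matter) instead of calling Cosmolopy directly.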

Related

SARIMAX model in PyMC3

I would like to write down the following SARIMAX(2,0,0)(2,0,0,12) model in PyMC3 to perform Bayesian estimation of its coefficients, but I cannot figure out how to start with the seasonal part.
Has anyone tried something like this?
import pymc3 as pm
import arviz as az

with pm.Model() as ar2:
    theta = pm.Normal("theta", 0.0, 1.0, shape=2)
    sigma = pm.HalfNormal("sigma", 3)
    likelihood = pm.AR("y", theta, sigma=sigma, observed=data)
    trace = pm.sample(
        1000,
        tune=2000,
        random_seed=13,
    )
    idata = az.from_pymc3(trace)
It would be best (e.g. for performance) to get an answer that uses PyMC3 exclusively, but in case one does not exist yet, there is an alternative that uses the SARIMAX model in Statsmodels in combination with PyMC3.
There are too many details to repeat a full answer here, but basically you wrap the log-likelihood and gradient methods associated with a Statsmodels SARIMAX model. Here is a link to an example Jupyter notebook that shows how to do this:
https://www.statsmodels.org/stable/examples/notebooks/generated/statespace_sarimax_pymc3.html
I'm not sure if you'll still need it, but to expand on cfulton's answer, here is how to fix the error in the statsmodels example (https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_pymc3.html, cell 8):
import pymc3 as pm
import theano.tensor as tt

with pm.Model():
    # Priors
    arL1 = pm.Uniform('ar.L1', -0.99, 0.99)
    maL1 = pm.Uniform('ma.L1', -0.99, 0.99)
    sigma2 = pm.InverseGamma('sigma2', 2, 4)

    # convert variables to a tensor vector
    # this is wrong:
    # theta = tt.as_tensor_variable([arL1, maL1, sigma2])
    # this is correct:
    theta = tt.as_tensor_variable([arL1, maL1, sigma2], 'v')

    # use a DensityDist (use a lambda function to "call" the Op);
    # loglike is the Op wrapping the Statsmodels log-likelihood,
    # defined earlier in the linked notebook
    # this is wrong:
    # pm.DensityDist('likelihood', lambda v: loglike(v), observed={'v': theta})
    # this is correct:
    pm.DensityDist('likelihood', lambda v: loglike(v), observed=theta)

    # Draw samples
    trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, cores=4)
I'm no pymc3/theano expert, but I think the error means that Theano has failed to associate the tensor's name with its values. If you define the name along with the values right at the beginning, it works.
I know it's not a direct answer to your question; nevertheless, I hope it helps.

Variable.assign(value) on multi-GPU with TensorFlow 2

I have a model that works perfectly on a single GPU as follows:
alpha = tf.Variable(alpha,
                    name='ws_alpha',
                    trainable=False,
                    dtype=tf.float32,
                    aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA,
                    )

...

class CustomModel(tf.keras.Model):
    @tf.function
    def train_step(self, inputs):
        ...
        alpha.assign_add(increment)
        ...

model.fit(dataset, epochs=10)
However, when I run on multiple GPUs, the assignment is not done: it works for two training steps and then the value stays the same for the rest of the epoch.
alpha is for a weighted sum of two layers, e.g. out = a*Layer1 + (1-a)*Layer2. It is not a trainable parameter, but something akin to a step_count variable.
Has anyone had experience with assigning individual values in a multi-GPU setting in TensorFlow 2?
Would it be better to assign the variable as:
with tf.device("CPU:0"):
    alpha = tf.Variable()
?
Simple fix, as per the TensorFlow GitHub issues:
alpha = tf.Variable(alpha,
                    name='ws_alpha',
                    trainable=False,
                    dtype=tf.float32,
                    aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA,
                    synchronization=tf.VariableSynchronization.ON_READ,
                    )
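For context, here is a minimal sketch (my own, not from the issue) of creating such a variable under a MirroredStrategy scope. With ON_READ, each replica keeps a local copy, and synchronization happens when the variable is read outside the replica context:
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # non-trainable scalar, updated manually inside train_step
    alpha = tf.Variable(
        0.0,
        name='ws_alpha',
        trainable=False,
        dtype=tf.float32,
        aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA,
        synchronization=tf.VariableSynchronization.ON_READ,
    )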

How can I print the intermediate variables in the loss function in TensorFlow and Keras?

I'm writing a custom objective to train a Keras (with TensorFlow backend) model but I need to debug some intermediate computation. For simplicity, let's say I have:
def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return K.square(diff)
I could not find an easy way to access, for example, the intermediate variable diff or its shape during training. In this simple example, I know that I could return diff to print its values, but my actual loss is more complex and I can't return intermediate values without getting compilation errors.
Is there an easy way to debug intermediate variables in Keras?
This is not something that is solved in Keras as far as I know, so you have to resort to backend-specific functionality. Both Theano and TensorFlow have Print nodes that are identity nodes (i.e., they return the input node) and have the side-effect of printing the input (or some tensor of the input).
Example for Theano:
diff = y_pred - y_true
diff = theano.printing.Print('shape of diff', attrs=['shape'])(diff)
return K.square(diff)
Example for TensorFlow:
diff = y_pred - y_true
diff = tf.Print(diff, [tf.shape(diff)])
return K.square(diff)
Note that this only works for intermediate values. Keras expects tensors that are passed to other layers to have specific attributes such as _keras_shape. Values processed by the backend, i.e. through Print, usually do not have that attribute. To solve this, you can wrap debug statements in a Lambda layer for example.
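For example, a hypothetical wrapper (the names are mine) that prints a tensor's shape while keeping a valid Keras tensor for downstream layers:
from keras.layers import Lambda
import tensorflow as tf

# identity layer whose side effect is printing the tensor's shape;
# the Lambda wrapper restores the Keras attributes downstream layers expect
debug_shape = Lambda(lambda t: tf.Print(t, [tf.shape(t)], message='shape: '))
x = debug_shape(x)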
In TensorFlow 2, you can now add IDE breakpoints in the TensorFlow Keras models/layers/losses, including when using the fit, evaluate, and predict methods. However, you must add model.run_eagerly = True after calling model.compile() for the values of the tensor to be available in the debugger at the breakpoint. For example,
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return tf.keras.backend.square(diff)  # Breakpoint in IDE here. =====

class SimpleModel(Model):
    def __init__(self):
        super().__init__()
        self.dense0 = Dense(2)
        self.dense1 = Dense(1)

    def call(self, inputs):
        z = self.dense0(inputs)
        z = self.dense1(z)
        return z

x = tf.convert_to_tensor([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
y = tf.convert_to_tensor([0, 1], dtype=tf.float32)

model0 = SimpleModel()
model0.run_eagerly = True
model0.compile(optimizer=Adam(), loss=custom_loss)
y0 = model0.fit(x, y, epochs=1)  # Values of diff *not* shown at breakpoint. =====

model1 = SimpleModel()
model1.compile(optimizer=Adam(), loss=custom_loss)
model1.run_eagerly = True
y1 = model1.fit(x, y, epochs=1)  # Values of diff shown at breakpoint. =====
This also works for debugging the outputs of intermediate network layers (for example, adding the breakpoint in the call of the SimpleModel).
Note: this was tested in TensorFlow 2.0.0-rc0.
In TensorFlow 2.0, you can use tf.print and print anything inside the definition of your loss function. You can also do something like tf.print("my_intermediate_tensor =", my_intermediate_tensor), i.e. with a message, similar to Python's print. However, you may need to decorate your loss function with @tf.function to actually see the results of the tf.print.
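A minimal sketch of the question's loss with a tf.print added (assuming TF 2.x):
import tensorflow as tf

def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    # printed every time the loss is evaluated
    tf.print("diff =", diff, "shape =", tf.shape(diff))
    return tf.square(diff)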

Using if conditions inside a TensorFlow graph

In the TensorFlow CIFAR-10 tutorial, in cifar10_inputs.py (line 174), it is said that you should randomize the order of the random_contrast and random_brightness operations for better data augmentation.
To do so, the first thing I think of is drawing a random variable p_order from the uniform distribution between 0 and 1, and doing:
if p_order > 0.5:
    distorted_image = tf.image.random_contrast(image)
    distorted_image = tf.image.random_brightness(distorted_image)
else:
    distorted_image = tf.image.random_brightness(image)
    distorted_image = tf.image.random_contrast(distorted_image)
However, there are two possible options for getting p_order:
1) Using NumPy, which dissatisfies me, as I wanted pure TF and TF discourages its users from mixing NumPy and TensorFlow.
2) Using TF; however, p_order can then only be evaluated in a tf.Session(), so I do not really know if I should do:
with tf.Session() as sess2:
    p_order_tensor = tf.random_uniform([1, ], 0., 1.)
    p_order = float(p_order_tensor.eval())
All these operations are inside the body of a function and are run from another script that has a different session/graph. I could pass the graph from the other script as an argument to this function, but I am confused.
Even the fact that TensorFlow functions like this one, or inference for example, seem to define the graph globally without explicitly returning it as an output is a bit hard for me to understand.
You can use tf.cond(pred, fn1, fn2, name=None) (see the doc).
This function allows you to use the boolean value of pred inside the TensorFlow graph (there is no need to call .eval() or sess.run(), hence no need for a Session).
Here is an example of how to use it:
def fn1():
    # lower/upper and max_delta are required arguments;
    # the values below are the ones used in the CIFAR-10 tutorial
    distorted_image = tf.image.random_contrast(image, lower=0.2, upper=1.8)
    distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
    return distorted_image

def fn2():
    distorted_image = tf.image.random_brightness(image, max_delta=63)
    distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)
    return distorted_image

# Uniform variable in [0,1)
p_order = tf.random_uniform(shape=[], minval=0., maxval=1., dtype=tf.float32)
pred = tf.less(p_order, 0.5)
distorted_image = tf.cond(pred, fn1, fn2)

Real/imaginary part of sympy complex matrix

Here is my problem: I'm using SymPy and a complex matrix P (all elements of P are complex valued).
I want to extract the real/imaginary parts of the first row.
So I use the following sequence:
import sympy as sp

a, b, c, d = sp.symbols('a b c d')  # symbols assumed; not defined in the original post
P = sp.Matrix([[a + sp.I*b, c - sp.I*d], [c - sp.I*d, a + sp.I*b]])
Row = P.row(0)
Row.as_mutable()
Re_row = sp.re(Row)
Im_row = sp.im(Row)
But the code returns the following error:
"AttributeError: ImmutableMatrix has no attribute as_coefficient."
The error occurs during the operations sp.re(Row) and sp.im(Row).
SymPy tells me that Row is an immutable matrix even though I specified that I want a mutable one, so I'm at a dead end without a solution.
Could someone please help me?
Thank you very much!
Most SymPy functions won't work if you just pass a Matrix to them directly. You need to use the methods of the Matrix or, if there is no such method (as is the case here), use applyfunc:
In [34]: Row.applyfunc(re)
Out[34]: [re(a) - im(b) re(c) + im(d)]
In [35]: Row.applyfunc(im)
Out[35]: [re(b) + im(a) -re(d) + im(c)]
(I've defined a, b, c, and d as ordinary symbols here; if you make them real, the answer will come out much simpler.)
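For example, a small self-contained version (symbol names from the question; making the symbols real is my addition to show the simplification):
import sympy as sp

a, b, c, d = sp.symbols('a b c d', real=True)
P = sp.Matrix([[a + sp.I*b, c - sp.I*d], [c - sp.I*d, a + sp.I*b]])
Row = P.row(0)

print(Row.applyfunc(sp.re))  # Matrix([[a, c]])
print(Row.applyfunc(sp.im))  # Matrix([[b, -d]])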
