Why does pymc.MAP not always return the same value - pymc

I am running pymc2 to fit a straight line through my data. The code is shown below (modified from examples I found online). When I call the MAP function multiple times, I get different answers, even though I start with the exact same model. I thought the optimization method, fmin_powell, starts at the supplied value for each parameter. As far as I know, fmin_powell has no random component, so it should always end at the same optimum, yet it doesn't. Why do I keep getting different results?
import numpy as np
import pymc
# observed data
n = 21
a = 6
b = 2
sigma = 2
x = np.linspace(0, 1, n)
np.random.seed(1)
y_obs = a * x + b + np.random.normal(0, sigma, n)
def model():
    # define priors
    a = pymc.Normal('a', mu=0, tau=1 / 10 ** 2, value=5)
    b = pymc.Normal('b', mu=0, tau=1 / 10 ** 2, value=1)
    tau = pymc.Gamma('tau', alpha=0.1, beta=0.1, value=1)
    # define likelihood
    @pymc.deterministic
    def mu(a=a, b=b, x=x):
        return a * x + b
    y = pymc.Normal('y', mu=mu, tau=tau, value=y_obs, observed=True)
    return locals()
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)
ml = model() # dictionary of all locals
mcmc = pymc.Model(ml) # MCMC object
mapmcmc = pymc.MAP(mcmc)
mapmcmc.fit(method='fmin_powell')
print(mcmc.a.value, mcmc.b.value, mcmc.tau.value)
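One way to test the assumption that fmin_powell itself is deterministic is to call SciPy's optimizer directly, outside pymc, twice from the same start. The sketch below does that on a plain least-squares objective built from the same synthetic data. The function name neg_log_like is made up for illustration, and the sketch says nothing about how pymc.MAP sets up or orders the parameters before handing them to the optimizer.

```python
import numpy as np
from scipy.optimize import fmin_powell

def neg_log_like(params, x, y):
    """Gaussian negative log-likelihood up to a constant (hypothetical helper)."""
    a, b = params
    resid = y - (a * x + b)
    return 0.5 * np.sum(resid ** 2)

# Same synthetic data as in the question
rng = np.random.RandomState(1)
x = np.linspace(0, 1, 21)
y_obs = 6 * x + 2 + rng.normal(0, 2, 21)

# Two independent runs from the same starting point
start = np.array([5.0, 1.0])
fit_1 = fmin_powell(neg_log_like, start, args=(x, y_obs), disp=False)
fit_2 = fmin_powell(neg_log_like, start, args=(x, y_obs), disp=False)

print(fit_1, fit_2, np.array_equal(fit_1, fit_2))
```

If both runs agree, that would suggest the variation comes from how the problem reaches the optimizer (start values, parameter order, and so on) rather than from the optimizer itself.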

Related

Runge-Kutta curve fitting extremely slow

I am currently trying to fit a function that is computed with an RK4 method applied to a non-linear Volterra integral equation of the second kind. The problem is that the code is extremely slow: one evaluation of the function passed to curve_fit (fitt) takes about 30-40 minutes to generate the data. curve_fit calls fitt many times before the parameters are determined, so the whole run takes more than 6 hours. Is there any way to optimize this code? Thanks in advance!
from scipy.special import gamma
from ml_internal import LTInversion
from scipy.optimize import curve_fit , fsolve
from scipy.misc import derivative
from sklearn.metrics import r2_score
from math import comb , factorial
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
# Gets the data
df = pd.read_excel('D:\\CoMat\\Fractional_fit\\optimized\\data_optimized.xlsx')
skipTime = 1
skipIndex = df[df['Time']== skipTime].index.values[0]
xls = pd.read_excel('D:\\CoMat\\Fractional_fit\\optimized\\data_optimized.xlsx',skiprows=np.arange(1,skipIndex+1,1))
timeDF = xls['Time']
tempDF = xls['Temp']
taDF = xls['Ta']
timeDF = timeDF - timeDF[0]
tempDF = tempDF + 273.15
t0 = tempDF[0]
ta = sum(taDF)/len(taDF)
ta = ta + 273.15
###########################################
#Spliting into intervals
h = 0.05
a = 0
b = timeDF[len(timeDF)-1]
N = int(np.round((b-a)/h))
#Each xi
def xidx(index):
    return a + h*index
#Functions in the image are written here.
def gx(t,lamda,alpha):
    return t0 * ml(lamda*(t**alpha),alpha)
gx = np.vectorize(gx)
def kernel(t,s,rad,lamda,alpha,beta):
    if t == s:
        return 0
    return (t-s)**(alpha-1) * ml_(lamda*((t-s)**alpha),alpha,alpha,1) * (beta*(rad**4) - beta*(ta**4) - lamda*ta)
kernel = np.vectorize(kernel)
############################
# The problem is here!!!!!!
def fx(x,n,lamda,alpha,beta):
    ans = gx(x,lamda,alpha)
    for j in range(n):
        ans += (h/6)*(kernel(x,xidx(j),f0[j],lamda,alpha,beta) + 2*kernel(x,xidx(j+1/2),f1[j],lamda,alpha,beta) + 2*kernel(x,xidx(j+1/2),f2[j],lamda,alpha,beta) + kernel(x,xidx(j+1),f3[j],lamda,alpha,beta))
    return ans
#########################
f0 = np.zeros(N+1)
f0[0] = t0
f1 = np.zeros(N+1)
f2 = np.zeros(N+1)
f3 = np.zeros(N+1)
F = np.zeros((3,N+1))
def fitt(xvalue,lamda,alpha,beta):
    global f0,f1,f2,f3,F
    n = int(np.round(xvalue/h))
    f1[n] = fx(xidx(n) + 1/2,n,lamda,alpha,beta) + (h/2)*kernel(xidx(n + 1/2),xidx(n),f0[n],lamda,alpha,beta)
    f2[n] = fx(xidx(n + 1/2),n,lamda,alpha,beta)
    f3[n] = fx(xidx(n+1),n,lamda,alpha,beta) + h*kernel(xidx(n+1),xidx(n+1/2),f2[n],lamda,alpha,beta)
    if n+1 <= N:
        f0[n+1] = fx(xidx(n+1),n,lamda,alpha,beta) + (h/6)*(kernel(xidx(n+1),xidx(n),f0[n],lamda,alpha,beta) + 2*kernel(xidx(n+1),xidx(n+1/2),f1[n],lamda,alpha,beta) + 2*kernel(xidx(n+1),xidx(n+1/2),f2[n],lamda,alpha,beta) + kernel(xidx(n+1),xidx(n+1),f3[n],lamda,alpha,beta))
    if(xvalue == timeDF[len(timeDF) - 1]):
        print(f0[n],n)
        returnValue = f0[n]
        f0 = np.zeros(N+1)
        f0[0] = t0
        f1 = np.zeros(N+1)
        f2 = np.zeros(N+1)
        f3 = np.zeros(N+1)
        return returnValue
    print(f0[n],n)
    return f0[n]
fitt = np.vectorize(fitt)
#Fitting, plotting and giving (Adj) R-squared
popt , pcov = curve_fit(fitt,timeDF,tempDF,p0=(-0.1317,0.95,-1e-11),bounds=((-np.inf,0,-np.inf),(0,1,0)))
print(popt)
y_fit = np.array(fitt(timeDF,popt[0],popt[1],popt[2]))
plt.scatter(timeDF,tempDF,color='ORANGE',marker='.',s=0.5)
plt.fill_between(timeDF,tempDF-0.5,tempDF+0.5,color='ORANGE', alpha=0.2)
plt.plot(timeDF,y_fit,color='RED',linewidth=1)
plt.legend(["Experimental data", "Caputo fit"], loc ="upper right")
plt.xlabel("Time (min)")
plt.ylabel("Temperature (Kelvin)")
plt.show()
plt.close()
r2 = r2_score(tempDF,y_fit)
print(r2)
adjr2 = 1 - (1 - r2)*((len(xls)-1)/(len(xls)-3-1))
print(adjr2)
I already tried computing the values f0, f1, f2, f3 all at once, but the part consuming the most time is Fn(x), and I haven't figured out how to compute it for all points at once. If that is possible, I think the program will run much faster. PS: ml and ml_ are functions from https://github.com/khinsen/mittag-leffler.
These are the necessary functions (shown in the image). Fn is the only one I haven't figured out yet.
There are two typing errors in the cited image. The combination of x_n and 1/2 is always meant to be the midpoint x_{n+1/2} = x_n + h/2. The second error is a duplication of x_{n+1/2} in the formula for f^{(4)}_n in its third term. The first error is probably producing errors that are large enough to make convergence complicated and any limit wrong for the intended problem.
In the Simpson/RK4 step, the 4 fx computations can be reduced to 2.
The F_n implement the left side of the integral equation
F(x) = g(x) + \int_0^x K(x, s, f(s)) \, ds
where the integral is approximated with the sample sequences f0,...,f3. Due to the structure of the problem and the algorithm, F_n(x_n) = f^0_n = f^4_{n-1}.
Note that K(x,s,f) should be set to zero for s >= x. In the exact version of the equation these values "above the diagonal" are not used.
If an increase in accuracy is needed, for instance to avoid divergence where there is none in the exact solution, you can decrease the step size by a factor of 10 and then sub-sample the f^0_n sequence to produce the numerical guess for the given data. Factors other than 10 are of course also possible.
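The refine-then-sub-sample idea can be tried on a toy problem. The sketch below is not the RK4 scheme from the question. It swaps in a simple left-rectangle rule and a test equation with a known solution, f(x) = 1 + \int_0^x f(s) ds with f(x) = e^x. The names solve_volterra_rect, g and K are hypothetical. Only nodes with s < x enter the sum, which matches the remark that the kernel is not used above the diagonal.

```python
import numpy as np

def solve_volterra_rect(g, K, b, h):
    """Left-rectangle scheme for f(x) = g(x) + int_0^x K(x, s, f(s)) ds on [0, b]."""
    n_steps = int(round(b / h))
    x = h * np.arange(n_steps + 1)
    f = np.empty(n_steps + 1)
    f[0] = g(x[0])
    for n in range(1, n_steps + 1):
        # Only nodes s = x_j with j < n contribute (nothing above the diagonal)
        f[n] = g(x[n]) + h * np.sum(K(x[n], x[:n], f[:n]))
    return x, f

def g(x):
    return 1.0

def K(x, s, f):
    return f  # toy kernel, so the exact solution is exp(x)

factor = 10
x_coarse, f_coarse = solve_volterra_rect(g, K, b=1.0, h=0.1)
x_fine, f_fine = solve_volterra_rect(g, K, b=1.0, h=0.1 / factor)

# Keep every 10th fine-grid value so the result lines up with the coarse grid
f_sub = f_fine[::factor]

err_coarse = abs(f_coarse[-1] - np.e)
err_sub = abs(f_sub[-1] - np.e)
print(err_coarse, err_sub)
```

The sub-sampled sequence has the same length as the coarse one, so it can be compared with the measured data at the original time points while carrying the smaller error of the fine grid.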

What is the formula being used in the in-sample prediction of statsmodels?

I would like to know what formula is being used in statsmodels ARIMA predict/forecast. For a simple AR(1) model I thought that it would be y_t = a1 * y_t-1. However, I am not able to recreate the results produced by forecast or predict.
Here's what I am trying to do:
from statsmodels.tsa.arima.model import ARIMA
import numpy as np
def ar_series(n):
    # generate the series y_t = a1 y_t-1 + eps
    np.random.seed(1)
    y0 = np.random.rand()
    y = [y0]
    a1 = 0.7 # the AR coefficient
    for i in range(1, n):
        y.append(a1 * y[i - 1] + 0.3 * np.random.rand())
    return np.array(y)
series = ar_series(10)
model = ARIMA(series, order=(1, 0, 0))
fit = model.fit()
#print(fit.summary())
# const = 0.3441; ar.L1 = 0.6518
print(fit.predict())
y_pred = [0.3441]
for i in range(1, 10):
    y_pred.append(0.6518 * series[i-1])
y_pred = np.array(y_pred)
print(y_pred)
The two series don't match, and I have no idea how the in-sample predictions are being calculated.
Found the answer here. I think what I was trying to do is valid only if the process mean is zero.
https://faculty.washington.edu/ezivot/econ584/notes/forecast.pdf
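For a process with non-zero mean, the one-step AR(1) forecast is usually written around the mean: y_hat_t = mu + phi * (y_{t-1} - mu). The sketch below applies that formula with the fitted numbers quoted above. It assumes that the reported const is the process mean mu rather than an intercept, which is my reading and worth checking against the statsmodels documentation. The helper name ar1_one_step and the demo values are hypothetical.

```python
import numpy as np

def ar1_one_step(series, mu, phi):
    """One-step in-sample predictions for a mean-adjusted AR(1) (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    pred = np.empty_like(series)
    pred[0] = mu                                  # no history yet: predict the mean
    pred[1:] = mu + phi * (series[:-1] - mu)      # shrink the last deviation towards mu
    return pred

mu, phi = 0.3441, 0.6518                 # const and ar.L1 from the question
demo = np.array([0.40, 0.35, 0.50, 0.30])  # made-up values, for illustration only
print(ar1_one_step(demo, mu, phi))
```

With mu = 0 this reduces to phi * y_{t-1}, the formula tried in the question, which is why that attempt only works for a zero-mean process.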

GARCH model in pymc3: how to loop over random variables?

I'm attempting to implement a GARCH model in pymc3, along the lines of this example. For this I attempted to implement a GARCH(1, 1) distribution as follows
import pymc3 as pm
from pymc3.distributions import Continuous, Normal
class GARCH(Continuous):
    def __init__(self, alpha_0=None, alpha_1=None, beta_1=None, sigma_0=None, *args, **kwargs):
        super(GARCH, self).__init__(*args, **kwargs)
        self.alpha_0 = alpha_0
        self.alpha_1 = alpha_1
        self.beta_1 = beta_1
        self.sigma_0 = sigma_0
        self.mean = 0
    def logp(self, values):
        sigma = self.sigma_0
        alpha_0 = self.alpha_0
        alpha_1 = self.alpha_1
        beta_1 = self.beta_1
        x_prev = values[0]
        _logp = Normal.dist(0., sd=sigma).logp(x_prev)
        for x in values[1:]:
            sigma = pm.sqrt(alpha_0 + alpha_1 * (x_prev/sigma)**2
                            + beta_1 * sigma**2)
            _logp = _logp + pm.Normal(0., sd=sigma).logp(x)
            x_prev = x
        return _logp
To clarify, this is the log-likelihood of the GARCH(1,1) model. The volatility process is a time series where the volatility at time t depends on the residual at time t-1. But to determine the residual at time t-1, we require the volatility at time t-1.
Anyways, that's not really important for my question. What matters is that the likelihood cannot be computed by vectorizing the for-loop (which is how it is done in the link at the top of the post). So you need an explicit loop which at each step first updates the volatility, and then determines likelihood of the observed return.
But the code above doesn't work. If I try to build a model like
import numpy as np
returns = np.genfromtxt("SP500.csv")[-200:]
garchmodel = pm.Model()
with garchmodel:
    alpha_0 = pm.Exponential('alpha_0', 30., testval=.02)
    alpha_1 = pm.Uniform('alpha_1', lower=0, upper=1, testval=.9)
    upper = pm.Deterministic('upper', 1-alpha_1)
    beta_1 = pm.Uniform('beta_1', lower=0, upper=upper, testval=.05)
    sigma_0 = pm.Exponential('sigma_0', 30., testval=.02)
    garch = GARCH('garch', alpha_0=alpha_0, alpha_1=alpha_1,
                  beta_1=beta_1, sigma_0=sigma_0, observed=returns)
The "SP500.csv" file can be found on e.g. github
This code generates the error:
ValueError: length not known
I'm pretty certain that this is because the for loops are conflicting with theano. How do I deal with this?
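For reference, the recursion described above (update the volatility first, then score the observed return) can be written in plain NumPy, where an explicit loop over concrete numbers is unproblematic. This is only a sketch that mirrors the recursion in the logp method of the question, including its use of x_prev/sigma. It does not address the Theano issue, but it could serve as a check for any symbolic version. The names garch11_logp and normal_logpdf are hypothetical.

```python
import numpy as np

def normal_logpdf(x, sd):
    """Log density of Normal(0, sd) at x."""
    return -0.5 * np.log(2.0 * np.pi * sd ** 2) - 0.5 * (x / sd) ** 2

def garch11_logp(values, alpha_0, alpha_1, beta_1, sigma_0):
    """Sequential log-likelihood, following the recursion from the question."""
    values = np.asarray(values, dtype=float)
    sigma = float(sigma_0)
    x_prev = values[0]
    logp = normal_logpdf(x_prev, sigma)
    for x in values[1:]:
        # Step 1: update the volatility from the previous return and volatility
        sigma = np.sqrt(alpha_0 + alpha_1 * (x_prev / sigma) ** 2
                        + beta_1 * sigma ** 2)
        # Step 2: score the current return under the updated volatility
        logp += normal_logpdf(x, sigma)
        x_prev = x
    return logp

# Made-up returns, for illustration only
demo_returns = np.array([0.01, -0.02, 0.015, 0.0, -0.005])
print(garch11_logp(demo_returns, alpha_0=0.02, alpha_1=0.1, beta_1=0.8, sigma_0=0.02))
```

With alpha_1 = beta_1 = 0 the volatility stays constant, so the result should equal an ordinary i.i.d. Normal log-likelihood, which gives a simple sanity check.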

Fitting a capped Poisson process with a variable rate

I'm trying to estimate the rate of a Poisson process where the rate varies over time using the maximum a posteriori estimate. Here's a simplified example with a rate varying linearly (λ = ax+b) :
import numpy as np
import pymc
# Observation
a_actual = 1.3
b_actual = 2.0
t = np.arange(10)
obs = np.random.poisson(a_actual * t + b_actual)
# Model
a = pymc.Uniform(name='a', value=1., lower=0, upper=10)
b = pymc.Uniform(name='b', value=1., lower=0, upper=10)
@pymc.deterministic
def linear(a=a, b=b):
    return a * t + b
r = pymc.Poisson(mu=linear, name='r', value=obs, observed=True)
model = pymc.Model([a, b, r])
map = pymc.MAP(model)
map.fit()
map.revert_to_max()
print("a :", a._value)
print("b :", b._value)
This is working fine. But my actual Poisson process is capped by a deterministic value. As I can't associate my observed values to a Deterministic function, I'm adding a Normal Stochastic function with a small variance for my observations :
import numpy as np
import pymc
# Observation
a_actual = 1.3
b_actual = 2.0
t = np.arange(10)
obs = np.random.poisson(a_actual * t + b_actual).clip(0, 10)
# Model
a = pymc.Uniform(name='a', value=1., lower=0, upper=10)
b = pymc.Uniform(name='b', value=1., lower=0, upper=10)
@pymc.deterministic
def linear(a=a, b=b):
    return a * t + b
r = pymc.Poisson(mu=linear, name='r')
@pymc.deterministic
def clip(r=r):
    return r.clip(0, 10)
rc = pymc.Normal(mu=r, tau=0.001, name='rc', value=obs, observed=True)
model = pymc.Model([a, b, r, rc])
map = pymc.MAP(model)
map.fit()
map.revert_to_max()
print("a :", a._value)
print("b :", b._value)
This code is producing the following error :
Traceback (most recent call last):
  File "pymc-bug-2.py", line 59, in <module>
    map.revert_to_max()
  File "pymc/NormalApproximation.py", line 486, in revert_to_max
    self._set_stochastics([self.mu[s] for s in self.stochastics])
  File "pymc/NormalApproximation.py", line 58, in __getitem__
    tot_len += self.owner.stochastic_len[p]
KeyError: 0
Any idea what I am doing wrong?
By "capped", do you mean that it is a truncated Poisson? It appears that's what you are saying. If it were a left truncation (which is more common), you could use the TruncatedPoisson distribution, but since you are doing a right truncation, you cannot (we should have made this more general!). What you are trying will not work: the Poisson object has no clip() method. What you can do is use a factor potential. It would look like this:
@pymc.potential
def clip(r=r):
    if np.any(r>10):
        return -np.inf
    return 0
This will constrain the values of r to be at most 10. Refer to the pymc docs for information on the Potential class.
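To see what such a potential does numerically, here is a sketch outside pymc. The potential is just an extra term added to the log-probability, so any r above the cap drives the total to -inf and that state is never accepted. The names cap_potential and log_joint are hypothetical, and scipy.stats.poisson stands in for the pymc Poisson node.

```python
import numpy as np
from scipy.stats import poisson

def cap_potential(r, cap=10):
    """Log-probability term: 0 if every count is at most cap, -inf otherwise."""
    return -np.inf if np.any(np.asarray(r) > cap) else 0.0

def log_joint(r, rate, cap=10):
    """Poisson log-likelihood plus the capping potential."""
    return poisson.logpmf(r, rate).sum() + cap_potential(r, cap)

rate = np.array([2.0, 3.3, 4.6])   # made-up rates, for illustration only
print(log_joint([2, 3, 10], rate))  # allowed: all counts <= 10
print(log_joint([2, 3, 11], rate))  # forbidden: one count exceeds the cap
```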

How to model a mixture of 3 Normals in PyMC?

There is a question on CrossValidated on how to use PyMC to fit two Normal distributions to data. The answer of Cam.Davidson.Pilon was to use a Bernoulli distribution to assign data to one of the two Normals:
size = 10
p = Uniform( "p", 0 , 1) #this is the fraction that come from mean1 vs mean2
ber = Bernoulli( "ber", p = p, size = size) # produces 1 with proportion p.
precision = Gamma('precision', alpha=0.1, beta=0.1)
mean1 = Normal( "mean1", 0, 0.001 )
mean2 = Normal( "mean2", 0, 0.001 )
@deterministic
def mean( ber = ber, mean1 = mean1, mean2 = mean2):
    return ber*mean1 + (1-ber)*mean2
Now my question is: how to do it with three Normals?
Basically, the issue is that you can't use a Bernoulli distribution and 1-Bernoulli anymore. But how to do it then?
edit: With the CDP's suggestion, I wrote the following code:
import numpy as np
import pymc as mc
n = 3
ndata = 500
dd = mc.Dirichlet('dd', theta=(1,)*n)
category = mc.Categorical('category', p=dd, size=ndata)
precs = mc.Gamma('precs', alpha=0.1, beta=0.1, size=n)
means = mc.Normal('means', 0, 0.001, size=n)
@mc.deterministic
def mean(category=category, means=means):
    return means[category]
@mc.deterministic
def prec(category=category, precs=precs):
    return precs[category]
v = np.random.randint( 0, n, ndata)
data = (v==0)*(50+ np.random.randn(ndata)) \
    + (v==1)*(-50 + np.random.randn(ndata)) \
    + (v==2)*np.random.randn(ndata)
obs = mc.Normal('obs', mean, prec, value=data, observed = True)
model = mc.Model({'dd': dd,
                  'category': category,
                  'precs': precs,
                  'means': means,
                  'obs': obs})
The traces with the following sampling procedure look good as well. Solved!
mcmc = mc.MCMC( model )
mcmc.sample( 50000,0 )
mcmc.trace('means').gettrace()[-1,:]
There is a mc.Categorical object that does just this.
p = [0.2, 0.3, .5]
t = mc.Categorical('test', p )
t.random()
#array(2, dtype=int32)
It returns an int between 0 and len(p)-1. To model the 3 Normals, you make p a mc.Dirichlet object (it accepts a k length array as the hyperparameters; setting the values in the array to be the same is setting the prior probabilities to be equal). The rest of the model is nearly identical.
This is a generalization of the model I suggested above.
Update:
Okay, so instead of having different means, we can collapse them all into 1:
means = Normal( "means", 0, 0.001, size=3 )
...
@mc.deterministic
def mean(categorical=categorical, means = means):
    return means[categorical]
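The indexing trick means[categorical] can be checked in plain NumPy by simulating the generative process forward. The sketch below draws mixture weights from a flat Dirichlet, assigns each data point a category, and looks up a per-point mean and precision by fancy indexing. The component means are taken from the synthetic data in the question, the unit precisions are an assumption, and none of this replaces the PyMC model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ndata = 3, 500

# Flat Dirichlet prior on the mixture weights (all hyperparameters equal to 1)
weights = rng.dirichlet(np.ones(n))

# One category label in {0, ..., n-1} per data point
category = rng.choice(n, size=ndata, p=weights)

means = np.array([50.0, -50.0, 0.0])  # component means as in the question's data
precs = np.ones(n)                    # unit precision per component (assumed)

# Fancy indexing picks the parameters of each point's component
mean_per_point = means[category]
sd_per_point = 1.0 / np.sqrt(precs[category])

data = rng.normal(mean_per_point, sd_per_point)
print(weights, np.bincount(category, minlength=n))
```

The same lookup works for any number of components, which is why the Categorical version generalizes the two-component Bernoulli model.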
