How to find all local minima of a function efficiently

This question is related to global optimization, but it is simpler. The task is to find all local minima of a function. This is sometimes useful; for example, in physics we might want to find the metastable states in phase space besides the true ground state. I have a naive implementation, tested on the scalar function x*sin(x) + x*cos(2*x), which searches from random starting points in an interval. Clearly this is not efficient. The code and output are attached if you are interested.
#!/usr/bin/env python
from scipy import *
from numpy import *
from pylab import *
from numpy import random
"""
Search all of the local minima using random search, when the functional form of the target function is known.
"""

def function(x):
    return x*sin(x) + x*cos(2*x)
    # return x**4 - 3*x**3 + 2

def derivative(x):
    return sin(x) + x*cos(x) + cos(2*x) - 2*x*sin(2*x)
    # return 4.*x**3 - 9.*x**2

def ploting(xr, yr, mls):
    plot(xr, yr)
    grid()
    for xm in mls:
        axvline(x=xm, c='r')   # mark each local minimum that was found
    savefig("plotf.png")
    show()

def findlocmin(x, Nit, step_def=0.1, err=0.0001, gamma=0.01):
    """
    Use gradient descent to find a local minimum, with x as the starting point.
    """
    for i in range(Nit):
        slope = derivative(x)
        step = min(step_def, abs(slope)*gamma)
        x = x - step*slope/abs(slope)   # move one step downhill
        # print step, x
        if abs(slope) < err:
            print "Found local minimum using "+str(i)+' iterations'
            break
        if i == Nit-1:
            raise Exception("local min is not found using Nit=", str(Nit), 'iterations')
    return x

if __name__ == "__main__":
    xleft = -9; xright = 9
    xs = linspace(xleft, xright, 100)
    ys = array([function(x) for x in xs])
    minls = []
    Nrand = 100; it = 0
    Nit = 10000
    while it < Nrand:
        xint = random.uniform(xleft, xright)   # random starting point in the interval
        xlocm = findlocmin(xint, Nit)
        print xlocm
        minls.append(xlocm)
        it += 1
    # print minls
    ploting(xs, ys, minls)
I'd like to know whether a better solution to this exists.
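For comparison, one common alternative is to replace the random restarts with a deterministic grid of starting points, run a standard minimizer from each, and collapse results that land on the same minimum. A minimal sketch of that idea using scipy.optimize.minimize (the grid density and rounding precision are arbitrary choices, not from the code above):
import numpy as np
from scipy.optimize import minimize

def f(x):
    return x*np.sin(x) + x*np.cos(2*x)

starts = np.linspace(-9, 9, 50)                  # deterministic grid of starting points
found = set()
for x0 in starts:
    res = minimize(lambda v: f(v[0]), x0=[x0])   # v is a length-1 array
    if res.success:
        found.add(round(float(res.x[0]), 4))     # collapse hits on the same minimum
print(sorted(found))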

Related

How can I print the intermediate variables in the loss function in TensorFlow and Keras?

I'm writing a custom objective to train a Keras (with TensorFlow backend) model but I need to debug some intermediate computation. For simplicity, let's say I have:
def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return K.square(diff)
I could not find an easy way to access, for example, the intermediate variable diff or its shape during training. In this simple example, I know that I could return diff to print its values, but my actual loss is more complex and I can't return intermediate values without getting compilation errors.
Is there an easy way to debug intermediate variables in Keras?
This is not something that is solved in Keras as far as I know, so you have to resort to backend-specific functionality. Both Theano and TensorFlow have Print nodes that are identity nodes (i.e., they return the input node) and have the side-effect of printing the input (or some tensor of the input).
Example for Theano:
diff = y_pred - y_true
diff = theano.printing.Print('shape of diff', attrs=['shape'])(diff)
return K.square(diff)
Example for TensorFlow:
diff = y_pred - y_true
diff = tf.Print(diff, [tf.shape(diff)])
return K.square(diff)
Note that this only works for intermediate values. Keras expects tensors that are passed to other layers to have specific attributes such as _keras_shape. Values processed by the backend, i.e. through Print, usually do not have that attribute. To solve this, you can wrap debug statements in a Lambda layer for example.
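For instance, here is a rough sketch of that Lambda trick against the TF 1.x / standalone Keras API (the layer sizes and print message are only illustrative):
import tensorflow as tf
from keras.layers import Input, Dense, Lambda
from keras.models import Model

inp = Input(shape=(4,))
h = Dense(8)(inp)
# Identity layer with a printing side effect; the Lambda wrapper keeps the
# Keras metadata (e.g. _keras_shape) that a bare tf.Print output would lack.
h = Lambda(lambda t: tf.Print(t, [tf.shape(t)], message='shape of h: '))(h)
out = Dense(1)(h)
model = Model(inp, out)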
In TensorFlow 2, you can now add IDE breakpoints in the TensorFlow Keras models/layers/losses, including when using the fit, evaluate, and predict methods. However, you must add model.run_eagerly = True after calling model.compile() for the values of the tensor to be available in the debugger at the breakpoint. For example,
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def custom_loss(y_pred, y_true):
    diff = y_pred - y_true
    return tf.keras.backend.square(diff)  # Breakpoint in IDE here. =====

class SimpleModel(Model):
    def __init__(self):
        super().__init__()
        self.dense0 = Dense(2)
        self.dense1 = Dense(1)

    def call(self, inputs):
        z = self.dense0(inputs)
        z = self.dense1(z)
        return z

x = tf.convert_to_tensor([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
y = tf.convert_to_tensor([0, 1], dtype=tf.float32)

model0 = SimpleModel()
model0.run_eagerly = True
model0.compile(optimizer=Adam(), loss=custom_loss)
y0 = model0.fit(x, y, epochs=1)  # Values of diff *not* shown at breakpoint. =====

model1 = SimpleModel()
model1.compile(optimizer=Adam(), loss=custom_loss)
model1.run_eagerly = True
y1 = model1.fit(x, y, epochs=1)  # Values of diff shown at breakpoint. =====
This also works for debugging the outputs of intermediate network layers (for example, adding the breakpoint in the call of the SimpleModel).
Note: this was tested in TensorFlow 2.0.0-rc0.
In TensorFlow 2.0, you can use tf.print to print anything inside the definition of your loss function. You can also do something like tf.print("my_intermediate_tensor =", my_intermediate_tensor), i.e. with a message, similar to Python's print. However, you may need to decorate your loss function with @tf.function to actually see the results of the tf.print.
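As a rough, self-contained sketch (the message text and the mean-squared-error reduction are only illustrative):
import tensorflow as tf

@tf.function
def custom_loss(y_true, y_pred):
    diff = y_pred - y_true
    # Printed on every call of the (graph-compiled) loss.
    tf.print("my_intermediate_tensor =", diff)
    return tf.reduce_mean(tf.square(diff))

# Example call: prints diff and returns the mean squared error.
loss = custom_loss(tf.constant([0., 1.]), tf.constant([0.5, 0.5]))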

Using if conditions inside a TensorFlow graph

In the TensorFlow CIFAR-10 tutorial, in cifar10_inputs.py at line 174, it is said that you should randomize the order of the random_contrast and random_brightness operations for better data augmentation.
To do so, the first thing I think of is drawing a random variable p_order from the uniform distribution between 0 and 1, and doing:
if p_order > 0.5:
    distorted_image = tf.image.random_contrast(image)
    distorted_image = tf.image.random_brightness(distorted_image)
else:
    distorted_image = tf.image.random_brightness(image)
    distorted_image = tf.image.random_contrast(distorted_image)
However, there are two possible options for getting p_order:
1) Using numpy, which does not satisfy me, as I wanted pure TF and TF discourages its users from mixing numpy and tensorflow.
2) Using TF; however, p_order can then only be evaluated in a tf.Session().
I do not really know if I should do:
with tf.Session() as sess2:
    p_order_tensor = tf.random_uniform([1,], 0., 1.)
    p_order = float(p_order_tensor.eval())
All those operations are inside the body of a function and are run from another script which has a different session/graph. Or I could pass the graph from the other script as an argument to this function, but I am confused.
Even the fact that TensorFlow functions like this one, or inference for example, seem to define the graph in a global fashion without explicitly returning it as an output is a bit hard for me to understand.
You can use tf.cond(pred, fn1, fn2, name=None) (see doc).
This function allows you to use the boolean value of pred inside the TensorFlow graph (no need to call .eval() or sess.run(), hence no need for a Session).
Here is an example of how to use it:
def fn1():
    distorted_image = tf.image.random_contrast(image)
    distorted_image = tf.image.random_brightness(distorted_image)
    return distorted_image

def fn2():
    distorted_image = tf.image.random_brightness(image)
    distorted_image = tf.image.random_contrast(distorted_image)
    return distorted_image

# Uniform variable in [0, 1)
p_order = tf.random_uniform(shape=[], minval=0., maxval=1., dtype=tf.float32)
pred = tf.less(p_order, 0.5)
distorted_image = tf.cond(pred, fn1, fn2)

Supervised Machine Learning, producing a trained estimator

I have an assignment in which I am supposed to use scikit, numpy and pylab to do the following:
"All of the following should use data from the training_data.csv file
provided. training_data gives you a labeled set of integer pairs,
representing the scores of two sports teams, with the labels giving the
sport.
Write the following functions:
plot_scores() should draw a scatterplot of the data.
predict(dataset) should produce a trained Estimator to guess the sport
that resulted in a given score (from a dataset we've withheld, which will
be input as a 1000 x 2 np array). You can use any algorithm from scikit.
An optional additional function called "preprocess" will process dataset
before it is passed to predict.
"
This is what I have done so far:
import numpy as np
import scipy as sp
import pylab as pl
from random import shuffle

def plot_scores():
    k = open('training_data.csv')
    lst = []
    for triple in k:
        temp = triple.split(',')
        lst.append([int(temp[0]), int(temp[1]), int(temp[2][:1])])
    array = np.array(lst)
    pl.scatter(array[:, 0], array[:, 1])
    pl.show()

def preprocess(dataset):
    k = open('training_data.csv')
    lst = []
    for triple in k:
        temp = triple.split(',')
        lst.append([int(temp[0]), int(temp[1]), int(temp[2][:1])])
    shuffle(lst)
    return lst
In preprocess, I shuffled the data because I am supposed to use some of it to train on and some of it to test on, but the original data was not at all random. My question is, how am I supposed to "produce a trained estimator" in predict(dataset)? Is this supposed to be a function that returns another function? And which algorithm would be ideal to classify based on a dataset that looks like this:
The task likely wants you to train a standard scikit classifier model and return it, i.e. something like
from sklearn.svm import SVC

def predict(dataset):
    X = ...  # features, extract from dataset
    y = ...  # labels, extract from dataset
    clf = SVC()    # create classifier
    clf.fit(X, y)  # train
    return clf
Though judging from the name of the function (predict), you should check whether it really wants you to return a trained classifier or the predictions for the given dataset argument, as the latter would be more typical.
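If it does turn out to mean the latter, a rough sketch (assuming the CSV layout described in the assignment: two scores and a label per row) would be to train inside predict and return predictions for the withheld array:
import numpy as np
from sklearn.svm import SVC

def predict(dataset):
    # dataset: the withheld 1000 x 2 array of scores
    train = np.loadtxt('training_data.csv', delimiter=',')  # assumed columns: score1, score2, label
    X, y = train[:, :2], train[:, 2]
    clf = SVC()
    clf.fit(X, y)
    return clf.predict(dataset)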
As a classifier you can basically use any one that you like. Your plot looks like your dataset is linearly separable (there are no colors for the classes, but I assume that the blobs are the two classes). On linearly separable data hardly anything will fail. Try SVMs, logistic regression, random forests, naive Bayes, ... For extra fun you can try to plot the decision boundaries, see here (which also contains an overview of the available classifiers).
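Here is a rough sketch of such a decision-boundary plot on made-up score data (the data, the labeling rule, and the grid are purely illustrative):
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

# Fabricated (score1, score2) pairs with a made-up labeling rule, for illustration only.
X = np.random.randint(0, 100, size=(200, 2))
y = (X[:, 0] > X[:, 1]).astype(int)
clf = SVC().fit(X, y)

# Evaluate the classifier on a grid and shade the predicted regions behind the points.
xx, yy = np.meshgrid(np.arange(0, 100), np.arange(0, 100))
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()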
I would recommend you to take a look at this structure:
from random import shuffle
import matplotlib.pyplot as plt
# import a classifier you need

def get_data():
    # open your file and parse data to prepare X as a set of input vectors and Y as a set of targets
    return X, Y

def split_data(X, Y):
    size = len(X)
    indices = list(range(size))
    shuffle(indices)
    train_indices = indices[:size//2]
    test_indices = indices[size//2:]
    X_train = [X[i] for i in train_indices]
    Y_train = [Y[i] for i in train_indices]
    X_test = [X[i] for i in test_indices]
    Y_test = [Y[i] for i in test_indices]
    return X_train, Y_train, X_test, Y_test

def plot_scatter(Y1, Y2):
    plt.figure()
    plt.scatter(Y1, Y2, c='b', marker='o')
    plt.show()

# get data
X, Y = get_data()
# split data
X_train, Y_train, X_test, Y_test = split_data(X, Y)
# create a classifier as an object
classifier = YourImportedClassifier()
# train the classifier, after that the classifier is the trained estimator you need
classifier.train(X_train, Y_train)  # or .fit(X_train, Y_train) or another train routine
# make a prediction
Y_prediction = classifier.predict(X_test)
# plot the scatter
plot_scatter(Y_prediction, Y_test)
I think what you are looking for is the clf.fit() function, instead of creating a function that produces another function.

PyMC3: How can I code my custom distribution with observed data better for Theano?

I am attempting to implement a fairly simple model in pymc3. The gist is that I have some data that is generated from a sequence of random choices. The choices can be thought of as a multinomial, and the process selects choices as a function of previous choices.
The overall probability of the categories is modeled with a Dirichlet prior.
The likelihood function must be customized for the data at hand. The data are lists of 0's and 1's that are output from the process. I have successfully made the model in pymc2, which you can find at this blog post. Here is a python function that generates test data for this problem:
ps = [0.2, 0.35, 0.25, 0.15, 0.0498, 1/5000]

def make(ps):
    out = []
    while len(out) < 5:
        n_spots = 5 - len(out)
        sp = sum(ps[:n_spots+1])
        P = [x/sp for x in ps[:n_spots+1]]
        l = np.argwhere(np.random.multinomial(1, P) == 1).ravel()[0]
        #if len(out) == 4:
        #    l = np.argwhere(np.random.multinomial(1, ps[:2]) == 1).ravel()[0]
        out.extend([1]*l)
        if (out and out[-1] == 1 and len(out) < 5) or l == 0:
            out.append(0)
        #print n_spots, l, len(out)
    assert len(out) == 5
    return out
As I'm learning/moving to pymc3, I'm trying to input my data as observed into a custom likelihood function, and I'm running into several issues along the way. It's probably because this is my first experience with Theano, but I'm hoping that someone can give some advice.
Here is my code (using the make function above):
import numpy as np
import pymc3 as pm
from scipy import optimize
import theano.tensor as T
from theano.compile.ops import as_op
from collections import Counter

# This function gets the attributes of the data that are relevant for calculating the likelihood
def scan(value):
    groups = []
    prev = False
    s = 0
    for i in xrange(5):
        if value[i] == 0:
            if prev:
                groups.append((s, 5-(i-s)))
                prev = False
                s = 0
            else:
                groups.append((0, 5-i))
        else:
            prev = True
            s += 1
    if prev:
        groups.append((s, 4-(i-s)))
    return groups

# The likelihood calculation for a single data point
def like1(v, p):
    l = 1
    groups = scan(v)
    for n, s in groups:
        l *= p[n]/p[:s+1].sum()
    return T.log(l)

# my custom likelihood class
class CustomDist(pm.distributions.Discrete):
    def __init__(self, ps, data, *args, **kwargs):
        super(CustomDist, self).__init__(*args, **kwargs)
        self.ps = ps
        self.data = data

    def logp(self, v):
        all_l = 0
        for v, k in self.data.items():
            l = like1(v, self.ps)
            all_l += l*k
        return all_l

# model creation
model = pm.Model()
with model:
    probs = pm.Dirichlet('probs', a=np.array([0.5]*6), shape=6, testval=np.array([1/6.0]*6))
    output = CustomDist("rolls", ps=probs, data=data, observed=True)
I am able to find the MAP in about a minute or so (my machine is a Windows 7 box with an i7-4790 @ 3.6 GHz). The MAP matches the input probability vector well, which at least means the model is linked properly.
When I try to run traces, though, my memory usage skyrockets (up to several gigabytes) and I haven't actually been patient enough for the model to finish compiling. I've waited 10+ minutes for NUTS or HMC to compile before even tracing. The Metropolis stepping works just fine, though (and is much faster than with pymc2).
Am I just being too hopeful that Theano can handle for-loops over non-Theano data well? Is there a better way to write this code so that Theano plays well with it, or am I limited because my data is a custom Python type and can't be analyzed with array/matrix operations?
Thanks in advance for your advice and feedback. Please let me know what might need clarification!

pythonic way to do something N times without an index variable? [duplicate]

This question already has answers here:
Is it possible to implement a Python for range loop without an iterator variable?
(15 answers)
Closed 7 months ago.
I have some code like:
for i in range(N):
do_something()
I want to do something N times. The code inside the loop doesn't depend on the value of i.
Is it possible to do this simple task without creating a useless index variable, or in an otherwise more elegant way? How?
A slightly faster approach than looping on xrange(N) is:
import itertools
for _ in itertools.repeat(None, N):
    do_something()
Use the _ variable, like so:
# A long way to do integer exponentiation
num = 2
power = 3
product = 1
for _ in range(power):
    product *= num
print(product)
I just use for _ in range(n); it's straight to the point. It will generate the entire list for huge numbers in Python 2, but if you're using Python 3 that's not a problem.
Since functions are first-class citizens, you can write a small wrapper (based on Alex's answer):
import itertools

def repeat(f, N):
    for _ in itertools.repeat(None, N):
        f()
Then you can pass any function as an argument.
The _ is the same thing as x. However, it's a Python idiom used to indicate an identifier that you don't intend to use. In Python these identifiers don't take memory or allocate space like variables do in other languages; it's easy to forget that. They're just names that point to objects, in this case an integer on each iteration.
I found the various answers really elegant (especially Alex Martelli's) but I wanted to quantify performance first hand, so I cooked up the following script:
from itertools import repeat

N = 10000000

def payload(a):
    pass

def standard(N):
    for x in range(N):
        payload(None)

def underscore(N):
    for _ in range(N):
        payload(None)

def loopiter(N):
    for _ in repeat(None, N):
        payload(None)

def loopiter2(N):
    for _ in map(payload, repeat(None, N)):
        pass

if __name__ == '__main__':
    import timeit
    print("standard: ", timeit.timeit("standard({})".format(N),
          setup="from __main__ import standard", number=1))
    print("underscore: ", timeit.timeit("underscore({})".format(N),
          setup="from __main__ import underscore", number=1))
    print("loopiter: ", timeit.timeit("loopiter({})".format(N),
          setup="from __main__ import loopiter", number=1))
    print("loopiter2: ", timeit.timeit("loopiter2({})".format(N),
          setup="from __main__ import loopiter2", number=1))
I also came up with an alternative solution that builds on Martelli's and uses map() to call the payload function. OK, I cheated a bit in that I took the liberty of making the payload accept a parameter that gets discarded: I don't know if there is a way around this. Nevertheless, here are the results:
standard: 0.8398549720004667
underscore: 0.8413165839992871
loopiter: 0.7110594899968419
loopiter2: 0.5891903560004721
So using map() yields an improvement of approximately 30% over the standard for loop and an extra 19% over Martelli's version.
Assume that you've defined do_something as a function, and you'd like to perform it N times.
Maybe you can try the following:
todos = [do_something] * N
for doit in todos:
    doit()
What about a simple while loop?
while times > 0:
    do_something()
    times -= 1
You already have the variable; why not use it?
