Suppose I have the following function:
cimport cython
cimport numpy as np
from cython.parallel import prange

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef bint test(np.int_t[:] values):
    cdef Py_ssize_t n_values = len(values)
    cdef int i
    for i in prange(n_values, nogil=True):
        if i == 0:
            return 0
    print 'test'
I run it like so:
In [1]: import algos
In [2]: import numpy as np
In [3]: algos.test(np.array([1,2,3,1,4,5]))
test
Out[3]: False
Why is the function printing when it should have just exited without printing? Is there a way to have the function exit when it reaches the return?
Thank you.
The documentation is clear that this is a bit of a minefield, since there's no guarantee which iteration finishes first. I think the added complication they don't make clear is that if the number of iterations is small enough then (provided one thread is done) you can also end up continuing on past the prange, which is what you see.
What seems to work for me is to use the else clause of a loop, which only gets executed if it hasn't finished early:
for i in prange(n_values, nogil=True):
    # stuff ...
else:
    with gil:
        print "test"
A quick look at the C code suggests that this is putting appropriate checks in place and it should be reliable.
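Putting the two together, a complete version of the original function might look like this (a sketch based on the snippets above, keeping the original signature and types):

cimport cython
cimport numpy as np
from cython.parallel import prange

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef bint test(np.int_t[:] values):
    cdef Py_ssize_t n_values = len(values)
    cdef int i
    for i in prange(n_values, nogil=True):
        if i == 0:
            return 0
    else:
        # only reached if no iteration hit the early return
        with gil:
            print 'test'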
This question is related to global optimization, and it is simpler. The task is to find all the local minima of a function. This is useful sometimes; for example, in physics we might want to find metastable states besides the true ground state in phase space. I have a naive implementation which has been tested on the scalar function x*sin(x) + x*cos(2*x) by randomly searching points in the interval. But clearly this is not efficient. The code and output are attached if you are interested.
#!/usr/bin/env python
from scipy import *
from numpy import *
from pylab import *
from numpy import random

"""
Search all of the local minima using random search, when the functional
form of the target function is known.
"""

def function(x):
    return x*sin(x)+x*cos(2*x)
    # return x**4-3*x**3+2

def derivative(x):
    return sin(x)+x*cos(x)+cos(2*x)-2*x*sin(2*x)
    # return 4.*x**3-9.*x**2

def ploting(xr, yr, mls):
    plot(xr, yr)
    grid()
    for xm in mls:
        axvline(x=xm, c='r')
    savefig("plotf.png")
    show()

def findlocmin(x, Nit, step_def=0.1, err=0.0001, gamma=0.01):
    """
    Use gradient descent to find a local minimum, with x as the starting point.
    """
    for i in range(Nit):
        slope = derivative(x)
        step = min(step_def, abs(slope)*gamma)
        x = x - step*slope/abs(slope)
        # print step, x
        if abs(slope) < err:
            print "Found local minimum using "+str(i)+' iterations'
            break
        if i == Nit-1:
            raise Exception("local min is not found using Nit="+str(Nit)+' iterations')
    return x

if __name__ == "__main__":
    xleft = -9; xright = 9
    xs = linspace(xleft, xright, 100)
    ys = array([function(x) for x in xs])
    minls = []
    Nrand = 100; it = 0
    Nit = 10000
    while it < Nrand:
        xint = random.uniform(xleft, xright)
        xlocm = findlocmin(xint, Nit)
        print xlocm
        minls.append(xlocm)
        it += 1
    # print minls
    ploting(xs, ys, minls)
I'd like to know if there exists a better solution to this.
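For what it's worth, a minimal multi-start sketch (my own addition, assuming scipy.optimize.minimize is available) that lets the library do the descent and merges duplicate minima by rounding:

import numpy as np
from scipy.optimize import minimize

def find_minima(f, xleft, xright, n_starts=100, decimals=4):
    minima = set()
    for x0 in np.random.uniform(xleft, xright, n_starts):
        res = minimize(f, x0)
        # keep converged minima inside the interval; round to merge duplicates
        if res.success and xleft <= res.x[0] <= xright:
            minima.add(round(res.x[0], decimals))
    return sorted(minima)

f = lambda x: x[0]*np.sin(x[0]) + x[0]*np.cos(2*x[0])
print find_minima(f, -9, 9)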
Is there a particular reason to favor stepping into multiple blocks vs. short cutting? For instance, take the following two functions in which multiple conditions are evaluated. The first example is stepping into each block, while the second example short cuts. The examples are in Python, but the question is not restricted to Python. It is overly trivialized as well.
def some_function():
    if some_condition:
        if some_other_condition:
            do_something()
vs.
def some_function():
    if not some_condition:
        return
    if not some_other_condition:
        return
    do_something()
Favoring the second makes code easier to read. It's not that evident in your example, but consider:
def some_function():
    if not some_condition:
        return 1
    if not some_other_condition:
        return 2
    do_something()
    return 0
vs.
def some_function():
    if some_condition:
        if some_other_condition:
            do_something()
            return 0
        else:
            return 2
    else:
        return 1
Even if the function has no return value for the "failed" conditions, writing the functions with inverted ifs makes placing breakpoints and debugging easier.
In your original example, where would you place the breakpoint if you wanted to know whether your code is not running because some_condition or some_other_condition failed?
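To make the debugging point concrete, here is a small illustration (using the return codes from the version above): each guard has its own line, so a breakpoint or the return value tells you exactly which condition failed.

result = some_function()
# a breakpoint on "return 1" or "return 2" fires exactly when that guard trips
if result == 1:
    print "some_condition failed"
elif result == 2:
    print "some_other_condition failed"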
This (enormously simplified example) works fine (Python 2.6.6, Debian Squeeze):
from multiprocessing import Pool
import numpy as np

src = None

def process(row):
    return np.sum(src[row])

def main():
    global src
    src = np.ones((100,100))
    pool = Pool(processes=16)
    rows = pool.map(process, range(100))
    print rows

if __name__ == "__main__":
    main()
However, after years of being taught "global state bad!!!", all my instincts are telling me I really would rather be writing something closer to:
from multiprocessing import Pool
import numpy as np

def main():
    src = np.ones((100,100))
    def process(row):
        return np.sum(src[row])
    pool = Pool(processes=16)
    rows = pool.map(process, range(100))
    print rows

if __name__ == "__main__":
    main()
but of course that doesn't work (it hangs, unable to pickle something).
The example here is trivial, but by the time you add multiple "process" functions, and each of those is dependent on multiple additional inputs... well it all becomes a bit reminiscent of something written in BASIC 30 years ago. Trying to use classes to at least aggregate the state with the appropriate functions seems an obvious solution, but doesn't seem to be that easy in practice.
Is there some recommended pattern or style for using multiprocessing.pool which will avoid the proliferation of global state to support each function I want to parallel-map over?
How do experienced "multiprocessing pros" deal with this ?
Update: Note that I'm actually interested in processing much bigger arrays, so variations on the above which pickle src each call/iteration aren't nearly as good as ones which fork it into the pool's worker processes.
You could always pass a callable object like this; then the object can contain the shared state:
from multiprocessing import Pool
import numpy as np

class RowProcessor(object):
    def __init__(self, src):
        self.__src = src

    def __call__(self, row):
        return np.sum(self.__src[row])

def main():
    src = np.ones((100,100))
    p = RowProcessor(src)
    pool = Pool(processes=16)
    rows = pool.map(p, range(100))
    print rows

if __name__ == "__main__":
    main()
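On the update's point about not re-pickling src on every call: another common pattern (my addition, not from the original answer) is Pool's initializer hook, which hands the state to each worker exactly once at pool start-up:

from multiprocessing import Pool
import numpy as np

_src = None  # per-worker module global, set once by the initializer

def init_worker(src):
    global _src
    _src = src

def process(row):
    return np.sum(_src[row])

def main():
    src = np.ones((100,100))
    # initargs travel to each worker once, when the pool starts
    pool = Pool(processes=16, initializer=init_worker, initargs=(src,))
    rows = pool.map(process, range(100))
    print rows

if __name__ == "__main__":
    main()

This still uses a module-level name, but it is confined to worker setup rather than leaking into the parent's namespace.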
I have some code like:
for i in range(N):
    do_something()
I want to do something N times. The code inside the loop doesn't depend on the value of i.
Is it possible to do this simple task without creating a useless index variable, or in an otherwise more elegant way? How?
A slightly faster approach than looping on xrange(N) is:
import itertools

for _ in itertools.repeat(None, N):
    do_something()
Use the _ variable, like so:
# A long way to do integer exponentiation
num = 2
power = 3
product = 1
for _ in range(power):
    product *= num
print(product)
I just use for _ in range(n); it's straight to the point. It's going to generate the entire list for huge numbers in Python 2, but that's not a problem if you're using Python 3.
Since functions are first-class citizens, you can write a small wrapper (building on Alex's answer):
import itertools

def repeat(f, N):
    for _ in itertools.repeat(None, N):
        f()
Then you can pass any function as an argument.
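For example, to run the do_something from the question a hundred times:

repeat(do_something, 100)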
The _ is the same thing as x. However, it's a Python idiom used to indicate an identifier that you don't intend to use. In Python these identifiers don't take memory or allocate space the way variables do in some other languages; it's easy to forget that. They're just names that point to objects, in this case an integer on each iteration.
I found the various answers really elegant (especially Alex Martelli's) but I wanted to quantify performance first hand, so I cooked up the following script:
from itertools import repeat

N = 10000000

def payload(a):
    pass

def standard(N):
    for x in range(N):
        payload(None)

def underscore(N):
    for _ in range(N):
        payload(None)

def loopiter(N):
    for _ in repeat(None, N):
        payload(None)

def loopiter2(N):
    for _ in map(payload, repeat(None, N)):
        pass

if __name__ == '__main__':
    import timeit
    print("standard: ", timeit.timeit("standard({})".format(N),
          setup="from __main__ import standard", number=1))
    print("underscore: ", timeit.timeit("underscore({})".format(N),
          setup="from __main__ import underscore", number=1))
    print("loopiter: ", timeit.timeit("loopiter({})".format(N),
          setup="from __main__ import loopiter", number=1))
    print("loopiter2: ", timeit.timeit("loopiter2({})".format(N),
          setup="from __main__ import loopiter2", number=1))
I also came up with an alternative solution that builds on Martelli's and uses map() to call the payload function. OK, I cheated a bit in that I took the liberty of making the payload accept a parameter that gets discarded: I don't know if there is a way around this. Nevertheless, here are the results:
standard: 0.8398549720004667
underscore: 0.8413165839992871
loopiter: 0.7110594899968419
loopiter2: 0.5891903560004721
So using map yields an improvement of approximately 30% over the standard for loop, and an extra 19% over Martelli's.
Assume that you've defined do_something as a function, and you'd like to perform it N times.
Maybe you can try the following:
todos = [do_something] * N
for doit in todos:
    doit()
What about a simple while loop?
while times > 0:
    do_something()
    times -= 1
You already have the variable; why not use it?
I've got a loop that wants to execute to exhaustion or until some user-specified limit is reached. I've got a construct that looks bad, yet I can't seem to find a more elegant way to express it; is there one?
def ello_bruce(limit=None):
    for i in xrange(10**5):
        if predicate(i):
            if not limit is None:
                limit -= 1
                if limit <= 0:
                    break

def predicate(i):
    # lengthy computation
    return True
Holy nesting! There has to be a better way. For purposes of a working example, xrange is used where I normally have an iterator of finite but unknown length (and predicate sometimes returns False).
Maybe something like this would be a little better:
from itertools import ifilter, islice

def ello_bruce(limit=None):
    for i in islice(ifilter(predicate, xrange(10**5)), limit):
        # do whatever you want with i here
        pass
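Note that islice treats a stop of None as "no limit", so the limit=None default works unchanged; for example:

from itertools import islice
print list(islice(xrange(5), None))  # prints [0, 1, 2, 3, 4]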
I'd take a good look at the itertools library. Using that, I think you'd have something like...
from itertools import imap, ifilter, count, islice

# From the itertools examples
def tabulate(function, start=0):
    return imap(function, count(start))

def take(n, iterable):
    return list(islice(iterable, n))

# Then something like:
def ello_bruce(limit=None):
    take(limit, ifilter(None, tabulate(predicate)))
I'd start with
if limit is None: return
since nothing can ever happen to limit when it starts as None (assuming there are no desirable side effects in the iteration and in the computation of predicate; if there are, then in this case you can just do for i in xrange(10**5): predicate(i)).
If limit is not None, then you just want to perform max(limit, 1) computations of predicate that are true, so an itertools.islice of an itertools.ifilter would do:
import itertools as it

def ello_bruce(limit=None):
    if limit is None:
        for i in xrange(10**5):
            predicate(i)
    else:
        for _ in it.islice(
                it.ifilter(predicate, xrange(10**5)),
                max(limit, 1)):
            pass
You should remove the nested ifs:

if predicate(i) and limit is not None:
    ...
What you want to do seems perfectly suited for a while loop:
def ello_bruce(limit=None):
    i = 0
    max_iter = 10**5  # renamed so we don't shadow the max() builtin
    # if you consider 0 to be an invalid value for limit you can also do
    # if limit:
    if limit is None:
        limit = max_iter
    while max_iter and limit:
        if predicate(i):
            limit -= 1
        max_iter -= 1
        i += 1

The loop stops as soon as either max_iter or limit reaches zero.
Um. As far as I understand it, predicate just computes in segments, and you totally ignore its return value, right?
This is another take:
import itertools

def ello_bruce(limit=None):
    if limit is None:
        limiter = itertools.repeat(None)
    else:
        limiter = xrange(limit)
    # since predicate is a Python function,
    # itertools looping won't be faster, so use a plain for.
    # remember to replace the xrange(100000) with your own iterator
    for i, dummy in itertools.izip(xrange(100000), limiter):
        predicate(i)
Also, remove the unneeded return True from the end of predicate.