a more pythonic way to express conditionally bounded loop? - coding-style

I've got a loop that wants to execute to exhaustion or until some user specified limit is reached. I've got a construct that looks bad yet I can't seem to find a more elegant way to express it; is there one?
def ello_bruce(limit=None):
    for i in xrange(10**5):
        if predicate(i):
            if not limit is None:
                limit -= 1
                if limit <= 0:
                    break
def predicate(i):
    # lengthy computation
    return True
Holy nesting! There has to be a better way. For purposes of a working example, xrange is used where I normally have an iterator of finite but unknown length (and predicate sometimes returns False).

Maybe something like this would be a little better:
from itertools import ifilter, islice
def ello_bruce(limit=None):
    for i in islice(ifilter(predicate, xrange(10**5)), limit):
        # do whatever you want with i here
        pass
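For readers on Python 3, where `ifilter` and `xrange` no longer exist, the same idea might look like the sketch below. It relies on `islice(iterable, None)` imposing no limit. The `predicate` body, the `iterable` parameter and the returned list are stand-ins added for illustration only.

```python
from itertools import islice

def predicate(i):
    # Hypothetical stand-in for the lengthy computation in the question.
    return i % 3 == 0

def ello_bruce(iterable, limit=None):
    # filter() is lazy in Python 3; islice(..., None) means "no limit".
    matches = filter(predicate, iterable)
    return [i for i in islice(matches, limit)]

print(ello_bruce(range(20), limit=3))  # first three matches only
print(len(ello_bruce(range(20))))      # no limit: runs to exhaustion
```

Note that `islice` rejects negative limits, so a caller-supplied value may need validating first.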

I'd take a good look at the itertools library. Using that, I think you'd have something like...
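from itertools import count, ifilter, imap, islice
# From the itertools examples
def tabulate(function, start=0):
    return imap(function, count(start))
def take(n, iterable):
    return list(islice(iterable, n))
# Then something like:
def ello_bruce(limit=None):
    # take() expects the count first; ifilter(None, ...) keeps only the
    # truthy predicate results. limit must not be None, as count() never ends.
    return take(limit, ifilter(None, tabulate(predicate)))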

I'd start with
if limit is None: return
since nothing can ever happen to limit when it starts as None (if there are no desirable side effects in the iteration and in the computation of predicate -- if there are, then, in this case you can just do for i in xrange(10**5): predicate(i)).
If limit is not None, then you just want to perform max(limit, 1) computations of predicate that are true, so an itertools.islice of an itertools.ifilter would do:
import itertools as it
def ello_bruce(limit=None):
    if limit is None:
        for i in xrange(10**5): predicate(i)
    else:
        for _ in it.islice(
                it.ifilter(predicate, xrange(10**5)),
                max(limit, 1)): pass

You should remove the nested ifs:
if predicate(i) and not limit is None:
...

What you want to do seems perfectly suited for a while loop:
def ello_bruce(limit=None):
    max = 10**5
    # if you consider 0 to be an invalid value for limit you can also do
    # if limit:
    if limit is None:
        limit = max
    i = 0  # the value passed to predicate
    while max and limit:
        if predicate(i):
            limit -= 1
        max -= 1
        i += 1
The loop stops if either max or limit reaches zero.

Um. As far as I understand it, predicate just computes in segments, and you totally ignore its return value, right?
This is another take:
import itertools
def ello_bruce(limit=None):
    if limit is None:
        limiter= itertools.repeat(None)
    else:
        limiter= xrange(limit)
    # since predicate is a Python function
    # itertools looping won't be faster, so use plain for.
    # remember to replace the xrange(100000) with your own iterator
    for dummy in itertools.izip(xrange(100000), limiter):
        pass
Also, remove the unneeded return True from the end of predicate.

Related

Ruby elegant alternative to ++ in nested loops?

Before anything, I have read all the answers of Why doesn't Ruby support i++ or i--? and understood why. Please note that this is not just another discussion topic about whether to have it or not.
What I'm really after is a more elegant solution for the situation that made me wonder and research about ++/-- in Ruby. I've looked up loops, each, each_with_index and things alike but I couldn't find a better solution for this specific situation.
Less talk, more code:
# Does the first request to Zendesk API, fetching *first page* of results
all_tickets = zd_client.tickets.incremental_export(1384974614)
# Initialises counter variable (please don't kill me for this, still learning! :D )
counter = 1
# Loops result pages
loop do
  # Loops each ticket on the paged result
  all_tickets.all do |ticket, page_number|
    # For debug purposes only, I want to see an incremental by each ticket
    p "#{counter} P#{page_number} #{ticket.id} - #{ticket.created_at} | #{ticket.subject}"
    counter += 1
  end
  # Fetches next page, if any
  all_tickets.next unless all_tickets.last_page?
  # Breaks outer loop if last_page?
  break if all_tickets.last_page?
end
For now, I need counter for debug purposes only - it's not a big deal at all - but my curiosity typed this question itself: is there a better (more beautiful, more elegant) solution for this? Having a whole line just for counter += 1 seems pretty dull. Just as an example, having "#{counter++}" when printing the string would be much simpler (for readability sake, at least).
I can't simply use .each's index because it's a nested loop, and it would reset at each page (outer loop).
Any thoughts?
BTW: This question has nothing to do with Zendesk API whatsoever. I've just used it to better illustrate my situation.
To me, counter += 1 is a fine way to express incrementing the counter.
You can start your counter at 0 and then get the effect you wanted by writing:
p "#{counter += 1} ..."
But I generally wouldn't recommend this because people do not expect side effects like changing a variable to happen inside string interpolation.
If you are looking for something more elegant, you should make an Enumerator that returns integers one at a time, each time you call next on the enumerator.
nums = Enumerator.new do |y|
  c = 0
  y << (c += 1) while true
end
nums.next # => 1
nums.next # => 2
nums.next # => 3
Instead of using Enumerator.new in the code above, you could just write:
nums = 1.upto(Float::INFINITY)
As mentioned by B Seven, each_with_index will work, and you can keep the page_number as long as all_tickets is a container of tuples, as it must be to be working right now. Destructure the tuple with parentheses in the block parameters:
all_tickets.each_with_index do |(ticket, page_number), i|
  #stuff
end
Where i is the index. If each element of all_tickets holds more than ticket and page_number, keep adding names inside the parentheses; just remember that the index is the extra one and stays at the end.
Could be I oversimplified your example but you could calculate a counter from your inner and outer range like this.
all_tickets = *(1..10)
inner_limit = all_tickets.size
outer_limit = 5000
1.upto(outer_limit) do |outer_counter|
  all_tickets.each_with_index do |ticket, inner_counter|
    p [(outer_counter*inner_limit)+inner_counter, outer_counter, inner_counter, ticket]
  end
  # some conditional to break out, in your case the last_page? method
  break if outer_counter > 3
end
all_tickets.each.with_index(1) do |ticket, i|
I'm not sure where page_number is coming from...
See Ruby Docs.

Python, fastest way to iterate over regular expressions but stop on first match

I have a function that returns True if a string matches at least one
regular expression in a list and False otherwise. The function is called
often enough that performance is an issue.
When running it through cProfile, the function is spending about 65% of
its time doing matches and 35% of its time iterating over the list.
I would think there would be a way to use map() or something but I can't
think of a way to have it stop iterating after it finds a match.
Is there a way to make the function faster while still having it return
upon finding the first match?
def matches_pattern(str, patterns):
    for pattern in patterns:
        if pattern.match(str):
            return True
    return False
The first thing that comes to mind is pushing the loop to the C side by using a generator expression:
def matches_pattern(s, patterns):
    return any(p.match(s) for p in patterns)
Probably you don't even need a separate function for that.
Another thing you should try out is to build a single, composite regex using the | alternation operator, so that the engine has a chance to optimize it for you. You can also create the regex dynamically from a list of string patterns, if this is necessary:
def matches_pattern(s, patterns):
    return re.match('|'.join('(?:%s)' % p for p in patterns), s)
Of course you need to have your regexes in string form for that to work. Just profile both of these and check which one is faster :)
You might also want to have a look at a general tip for debugging regular expressions in Python. This can also help to find opportunities to optimize.
UPDATE: I was curious and wrote a little benchmark:
import timeit
setup = """
import re
patterns = [".*abc", "123.*", "ab.*", "foo.*bar", "11010.*", "1[^o]*"]*10
strings = ["asdabc", "123awd2", "abasdae23", "fooasdabar", "111", "11010100101", "xxxx", "eeeeee", "dddddddddddddd", "ffffff"]*10
compiled_patterns = list(map(re.compile, patterns))
def matches_pattern(str, patterns):
    for pattern in patterns:
        if pattern.match(str):
            return True
    return False
def test0():
    for s in strings:
        matches_pattern(s, compiled_patterns)
def test1():
    for s in strings:
        any(p.match(s) for p in compiled_patterns)
def test2():
    for s in strings:
        re.match('|'.join('(?:%s)' % p for p in patterns), s)
def test3():
    r = re.compile('|'.join('(?:%s)' % p for p in patterns))
    for s in strings:
        r.match(s)
"""
import sys
print(timeit.timeit("test0()", setup=setup, number=1000))
print(timeit.timeit("test1()", setup=setup, number=1000))
print(timeit.timeit("test2()", setup=setup, number=1000))
print(timeit.timeit("test3()", setup=setup, number=1000))
The output on my machine:
1.4120500087738037
1.662621021270752
4.729579925537109
0.1489570140838623
So any doesn't seem to be faster than your original approach. Building up a regex dynamically also isn't really fast. But if you can manage to build up a regex upfront and use it several times, this might result in better performance. You can also adapt this benchmark to test some other options :)
The way to do this fastest is to combine all the regexes into one with "|" between them, then make one regex match call. Also, you'll want to compile it once to be sure you're avoiding repeated regex compilation.
For example:
def matches_pattern(s, pats):
    pat = "|".join("(%s)" % p for p in pats)
    return bool(re.match(pat, s))
This is for pats as strings, not compiled patterns. If you really only have compiled regexes, then:
def matches_pattern(s, pats):
    pat = "|".join("(%s)" % p.pattern for p in pats)
    return bool(re.match(pat, s))
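Both functions above rebuild the combined pattern string on every call, so they do not quite deliver the "compile it once" advice. A minimal sketch of one way to do that, assuming the patterns are available as strings up front. `make_matcher` is a hypothetical helper name, not part of the `re` module.

```python
import re

def make_matcher(pats):
    # Build and compile the combined alternation a single time.
    combined = re.compile("|".join("(?:%s)" % p for p in pats))

    def matches_pattern(s):
        # match() anchors at the start of s, like the original loop.
        return combined.match(s) is not None

    return matches_pattern

matches = make_matcher([r".*abc", r"123.*", r"foo.*bar"])
print(matches("asdabc"), matches("xxxx"))
```

One caveat with any combined pattern: numbered backreferences or inline flags inside the individual patterns may behave differently once they are joined, so this is only safe for patterns that do not depend on them.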
Adding to the excellent answers above, make sure you compare the output of re.match with None:
>>> timeit('None is None')
0.03676295280456543
>>> timeit('bool(None)')
0.1125330924987793
>>> timeit('re.match("a","abc") is None', 'import re')
1.0200879573822021
>>> timeit('bool(re.match("a","abc"))', 'import re')
1.134294033050537
It's not exactly what the OP asked, but this worked well for me as an alternative to long iterative matching.
Here is some example data and code:
import random
import time
mylonglist = [ ''.join([ random.choice("ABCDE") for i in range(50)]) for j in range(3000) ]
# check uniqueness
print "uniqueness:"
print len(mylonglist) == len(set(mylonglist))
# subsample 1000
subsamp = [ mylonglist[x] for x in random.sample(xrange(3000),1000) ]
# join long string for matching
string = " ".join(subsamp)
# test function 1
def by_string_match(string, mylonglist):
    counter = 0
    t1 = time.time()
    for i in mylonglist:
        if i in string:
            counter += 1
    t2 = time.time()
    print "It took {} seconds to find {} items".format(t2-t1,counter)
# test function 2
def by_iterative_match(subsamp, mylonglist):
    counter = 0
    t1 = time.time()
    for i in mylonglist:
        if any([ i in s for s in subsamp ]):
            counter += 1
    t2 = time.time()
    print "It took {} seconds to find {} items".format(t2-t1,counter)
# test 1:
print "string match:"
by_string_match(string, mylonglist)
# test 2:
print "iterative match:"
by_iterative_match(subsamp, mylonglist)

pythonic way to do something N times without an index variable? [duplicate]

I have some code like:
for i in range(N):
    do_something()
I want to do something N times. The code inside the loop doesn't depend on the value of i.
Is it possible to do this simple task without creating a useless index variable, or in an otherwise more elegant way? How?
A slightly faster approach than looping on xrange(N) is:
import itertools
for _ in itertools.repeat(None, N):
    do_something()
Use the _ variable, like so:
# A long way to do integer exponentiation
num = 2
power = 3
product = 1
for _ in range(power):
    product *= num
print(product)
I just use for _ in range(n), it's straight to the point. It's going to generate the entire list for huge numbers in Python 2, but if you're using Python 3 it's not a problem.
Since functions are first-class citizens, you can write a small wrapper (from Alex's answer):
import itertools
def repeat(f, N):
    for _ in itertools.repeat(None, N): f()
Then you can pass the function as an argument.
The _ is the same thing as x. However, it's a Python idiom used to indicate an identifier that you don't intend to use. In Python these identifiers don't take memory or allocate space like variables do in other languages. It's easy to forget that. They're just names that point to objects, in this case an integer on each iteration.
I found the various answers really elegant (especially Alex Martelli's) but I wanted to quantify performance first hand, so I cooked up the following script:
from itertools import repeat
N = 10000000
def payload(a):
    pass
def standard(N):
    for x in range(N):
        payload(None)
def underscore(N):
    for _ in range(N):
        payload(None)
def loopiter(N):
    for _ in repeat(None, N):
        payload(None)
def loopiter2(N):
    for _ in map(payload, repeat(None, N)):
        pass
if __name__ == '__main__':
    import timeit
    print("standard: ",timeit.timeit("standard({})".format(N),
        setup="from __main__ import standard", number=1))
    print("underscore: ",timeit.timeit("underscore({})".format(N),
        setup="from __main__ import underscore", number=1))
    print("loopiter: ",timeit.timeit("loopiter({})".format(N),
        setup="from __main__ import loopiter", number=1))
    print("loopiter2: ",timeit.timeit("loopiter2({})".format(N),
        setup="from __main__ import loopiter2", number=1))
I also came up with an alternative solution that builds on Martelli's one and uses map() to call the payload function. OK, I cheated a bit in that I took the freedom of making the payload accept a parameter that gets discarded: I don't know if there is a way around this. Nevertheless, here are the results:
standard: 0.8398549720004667
underscore: 0.8413165839992871
loopiter: 0.7110594899968419
loopiter2: 0.5891903560004721
so using map yields an improvement of approximately 30% over the standard for loop and an extra 19% over Martelli's.
Assume that you've defined do_something as a function, and you'd like to perform it N times.
Maybe you can try the following:
todos = [do_something] * N
for doit in todos:
    doit()
What about a simple while loop?
while times > 0:
    do_something()
    times -= 1
You already have the variable; why not use it?

What is the pythonic way to detect the last element in a 'for' loop?

How can I treat the last element of the input specially, when iterating with a for loop? In particular, if there is code that should only occur "between" elements (and not "after" the last one), how can I structure the code?
Currently, I write code like so:
for i, data in enumerate(data_list):
    code_that_is_done_for_every_element
    if i != len(data_list) - 1:
        code_that_is_done_between_elements
How can I simplify or improve this?
Most of the time it is easier (and cheaper) to make the first iteration the special case instead of the last one:
first = True
for data in data_list:
    if first:
        first = False
    else:
        between_items()
    item()
This will work for any iterable, even for those that have no len():
file = open('/path/to/file')
for line in file:
    process_line(line)
    # No way of telling if this is the last line!
Apart from that, I don't think there is a generally superior solution as it depends on what you are trying to do. For example, if you are building a string from a list, it's naturally better to use str.join() than using a for loop “with special case”.
Using the same principle but more compact:
for i, line in enumerate(data_list):
    if i > 0:
        between_items()
    item()
Looks familiar, doesn't it? :)
For @ofko, and others who really need to find out if the current value of an iterable without len() is the last one, you will need to look ahead:
def lookahead(iterable):
    """Pass through all values from the given iterable, augmented by the
    information if there are more values to come after the current one
    (True), or if it is the last value (False).
    """
    # Get an iterator and pull the first value.
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        # Empty input: nothing to yield.
        return
    # Run the iterator to exhaustion (starting from the second value).
    for val in it:
        # Report the *previous* value (more to come).
        yield last, True
        last = val
    # Report the last value.
    yield last, False
Then you can use it like this:
>>> for i, has_more in lookahead(range(3)):
...     print(i, has_more)
0 True
1 True
2 False
Although that question is pretty old, I came here via google and I found a quite simple way: List slicing. Let's say you want to put an '&' between all list entries.
s = ""
l = [1, 2, 3]
for i in l[:-1]:
    s = s + str(i) + ' & '
s = s + str(l[-1])
This returns '1 & 2 & 3'.
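The same result can be had without the loop via `str.join`, which places the separator only between items. A short sketch using the list from the answer above:

```python
l = [1, 2, 3]
# join() needs strings, so convert each item first.
s = " & ".join(str(i) for i in l)
print(s)  # 1 & 2 & 3
```

Unlike the slicing version, this also copes with an empty list, where `l[-1]` would raise an `IndexError`.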
If the items are unique:
for x in list:
    #code
    if x == list[-1]:
        #code
Other options:
pos = -1
for x in list:
    pos += 1
    #code
    if pos == len(list) - 1:
        #code
for x in list:
    #code
#code - e.g. print x
if len(list) > 0:
    for x in list[:-1]:
        #process everything except the last element
    for x in list[-1:]:
        #process only last element
The 'code between' is an example of the Head-Tail pattern.
You have an item, which is followed by a sequence of ( between, item ) pairs. You can also view this as a sequence of (item, between) pairs followed by an item. It's generally simpler to take the first element as special and all the others as the "standard" case.
Further, to avoid repeating code, you have to provide a function or other object to contain the code you don't want to repeat. Embedding an if statement in a loop which is always false except one time is kind of silly.
def item_processing( item ):
    # *the common processing*
head_tail_iter = iter( someSequence )
head = next(head_tail_iter)
item_processing( head )
for item in head_tail_iter:
    # *the between processing*
    item_processing( item )
This is more reliable because it's slightly easier to prove correct. It doesn't create an extra data structure (i.e., a copy of a list) and doesn't require a lot of wasted execution of an if condition which is always false except once.
If you're simply looking to modify the last element in data_list then you can simply use the notation:
L[-1]
However, it looks like you're doing more than that. There is nothing really wrong with your way. I even took a quick glance at some Django code for their template tags and they do basically what you're doing.
You can determine the last element with this code:
for i,element in enumerate(list):
    if (i==len(list)-1):
        print("last element is " + str(element))
This is similar to Ants Aasma's approach but without using the itertools module. It's also a lagging iterator which looks-ahead a single element in the iterator stream:
def last_iter(it):
    # Ensure it's an iterator and get the first field
    it = iter(it)
    prev = next(it)
    for item in it:
        # Lag by one item so I know I'm not at the end
        yield 0, prev
        prev = item
    # Last item
    yield 1, prev
def test(data):
    result = list(last_iter(data))
    if not result:
        return
    if len(result) > 1:
        assert set(x[0] for x in result[:-1]) == set([0]), result
    assert result[-1][0] == 1
test([])
test([1])
test([1, 2])
test(range(5))
test(xrange(4))
for is_last, item in last_iter("Hi!"):
    print is_last, item
We can achieve that using for-else
cities = [
    'Jakarta',
    'Surabaya',
    'Semarang'
]
for city in cities[:-1]:
    print(city)
else:
    print(' '.join(cities[-1].upper()))
output:
Jakarta
Surabaya
S E M A R A N G
The idea is that we only run the for-else loop up to index n-1; then, after the for is exhausted, we access the last element directly using [-1].
You can use a sliding window over the input data to get a peek at the next value and use a sentinel to detect the last value. This works on any iterable, so you don't need to know the length beforehand. The pairwise implementation is from itertools recipes.
from itertools import tee, izip, chain
def pairwise(seq):
    a,b = tee(seq)
    next(b, None)
    return izip(a,b)
def annotated_last(seq):
    """Returns an iterable of pairs of input item and a boolean that show if
    the current item is the last item in the sequence."""
    MISSING = object()
    for current_item, next_item in pairwise(chain(seq, [MISSING])):
        yield current_item, next_item is MISSING
for item, is_last_item in annotated_last(data_list):
    if is_last_item:
        # current item is the last item
Is there no possibility to iterate over all-but the last element, and treat the last one outside of the loop? After all, a loop is created to do something similar to all elements you loop over; if one element needs something special, it shouldn't be in the loop.
(see also this question: does-the-last-element-in-a-loop-deserve-a-separate-treatment)
EDIT: since the question is more about the "in between", either the first element is the special one in that it has no predecessor, or the last element is special in that it has no successor.
I like the approach of @ethan-t, but while True is dangerous from my point of view.
data_list = [1, 2, 3, 2, 1] # sample data
L = list(data_list) # destroy L instead of data_list
while L:
    e = L.pop(0)
    if L:
        print(f'process element {e}')
    else:
        print(f'process last element {e}')
del L
Here, data_list is chosen so that its last element is equal in value to the first one. L can be replaced with data_list, but then data_list ends up empty after the loop. while True can also be used if you check that the list is not empty before processing, or if the check is not needed (ouch!).
data_list = [1, 2, 3, 2, 1]
if data_list:
    while True:
        e = data_list.pop(0)
        if data_list:
            print(f'process element {e}')
        else:
            print(f'process last element {e}')
            break
else:
    print('list is empty')
The good part is that it is fast. The bad part is that it is destructive (data_list becomes empty).
Most intuitive solution:
data_list = [1, 2, 3, 2, 1] # sample data
for i, e in enumerate(data_list):
    if i != len(data_list) - 1:
        print(f'process element {e}')
    else:
        print(f'process last element {e}')
Oh yes, you have already proposed it!
There is nothing wrong with your way, unless you have 100 000 loops and want to save 100 000 "if" statements. In that case, you can go that way :
iterable = [1,2,3] # Your data
iterator = iter(iterable) # get the data iterator
try : # wrap all in a try / except
    while 1 :
        item = iterator.next()
        print item # put the "for loop" code here
except StopIteration, e : # make the process on the last element here
    print item
Outputs :
1
2
3
3
But really, in your case I feel like it's overkill.
In any case, you will probably be luckier with slicing :
for item in iterable[:-1] :
    print item
print "last :", iterable[-1]
#outputs
1
2
last : 3
or just :
for item in iterable :
    print item
print "last :", iterable[-1]
#outputs
1
2
3
last : 3
Eventually, a KISS way to do your stuff, and one that would work with any iterable, including the ones without __len__ :
item = ''
for item in iterable :
    print item
print item
Outputs:
1
2
3
3
I feel like I would do it that way; it seems simple to me.
Use slicing and is to check for the last element:
for data in data_list:
    <code_that_is_done_for_every_element>
    if not data is data_list[-1]:
        <code_that_is_done_between_elements>
Caveat emptor: This only works if all elements in the list are actually different (have different locations in memory). Under the hood, Python may detect equal elements and reuse the same objects for them. For instance, for strings of the same value and common integers.
Google brought me to this old question and I think I could add a different approach to this problem.
Most of the answers here would deal with a proper treatment of a for loop control as it was asked, but if the data_list is destructible, I would suggest that you pop the items from the list until you end up with an empty list:
while True:
    element = element_list.pop(0)
    do_this_for_all_elements()
    if not element_list:
        do_this_only_for_last_element()
        break
    do_this_for_all_elements_but_last()
You could even use while len(element_list) if you don't need to do anything with the last element. I find this solution more elegant than dealing with next().
For me the most simple and pythonic way to handle a special case at the end of a list is:
for data in data_list[:-1]:
    handle_element(data)
handle_special_element(data_list[-1])
Of course this can also be used to treat the first element in a special way .
Better late than never. Your original code used enumerate(), but you only used the i index to check if it's the last item in a list. Here's a simpler alternative (if you don't need enumerate()) using negative indexing:
for data in data_list:
    code_that_is_done_for_every_element
    if data != data_list[-1]:
        code_that_is_done_between_elements
if data != data_list[-1] checks if the current item in the iteration is NOT the last item in the list.
Hope this helps, even nearly 11 years later.
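One thing to watch with the value comparison: if an earlier element happens to equal the last one, it is treated as the last element too. A small sketch of that failure mode, with an index-based variant for contrast. Both helper names are made up for illustration.

```python
def between_by_value(data_list):
    out = []
    for data in data_list:
        out.append(str(data))
        if data != data_list[-1]:    # compares values, not positions
            out.append(",")
    return "".join(out)

def between_by_index(data_list):
    out = []
    for i, data in enumerate(data_list):
        out.append(str(data))
        if i != len(data_list) - 1:  # compares positions
            out.append(",")
    return "".join(out)

print(between_by_value([1, 2, 1]))  # separator after the first 1 is lost
print(between_by_index([1, 2, 1]))
```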
if you are going through the list, for me this worked too:
for j in range(0, len(Array)):
    if len(Array) - j > 1:
        notLast()
Instead of counting up, you can also count down:
nrToProcess = len(list)
for s in list:
    s.doStuff()
    nrToProcess -= 1
    if nrToProcess==0: # this is the last one
        s.doSpecialStuff()
I will provide a more elegant and robust way, as follows, using unpacking:
def mark_last(iterable):
    try:
        *init, last = iterable
    except ValueError: # if iterable is empty
        return
    for e in init:
        yield e, True
    yield last, False
Test:
for a, b in mark_last([1, 2, 3]):
    print(a, b)
The result is:
1 True
2 True
3 False
If you are looping over a list, using the enumerate function is one of the best things to try.
for index, element in enumerate(ListObj):
    # print(index, ListObj[index], len(ListObj) )
    if (index != len(ListObj)-1 ):
        # Do things to the element which is not the last one
    else:
        # Do things to the element which is the last one
Delay the special handling of the last item until after the loop.
>>> for i in (1, 2, 3):
...     pass
...
>>> i
3
There can be multiple ways. slicing will be fastest. Adding one more which uses .index() method:
>>> l1 = [1,5,2,3,5,1,7,43]
>>> [i for i in l1 if l1.index(i)+1==len(l1)]
[43]
If you are happy to be destructive with the list, then there's the following.
We are going to reverse the list in order to speed up the process from O(n^2) to O(n), because pop(0) moves the list each iteration - cf. Nicholas Pipitone's comment below
data_list.reverse()
while data_list:
    value = data_list.pop()
    code_that_is_done_for_every_element(value)
    if data_list:
        code_that_is_done_between_elements(value)
    else:
        code_that_is_done_for_last_element(value)
This works well with empty lists, and lists of non-unique items.
Since it's often the case that lists are transitory, this works pretty well ... at the cost of destructing the list.
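If reversing the list feels awkward, `collections.deque` offers O(1) pops from the left, so the original order can be kept. This is a sketch under the assumption that copying the data into a deque is acceptable. The `process` function and its three callbacks are hypothetical names.

```python
from collections import deque

def process(data_list, every, between, last):
    queue = deque(data_list)       # copy; data_list itself is untouched
    while queue:
        value = queue.popleft()    # O(1), unlike list.pop(0)
        every(value)
        if queue:
            between(value)
        else:
            last(value)

log = []
process([1, 2, 3],
        every=lambda v: log.append(("every", v)),
        between=lambda v: log.append(("between", v)),
        last=lambda v: log.append(("last", v)))
print(log)
```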
Assuming input as an iterator, here's a way using tee and izip from itertools:
from itertools import tee, izip
items, between = tee(input_iterator, 2) # Input must be an iterator.
first = items.next()
do_to_every_item(first) # All "do to every" operations done to first item go here.
for i, b in izip(items, between):
    do_between_items(b) # All "between" operations go here.
    do_to_every_item(i) # All "do to every" operations go here.
Demo:
>>> def do_every(x): print "E", x
...
>>> def do_between(x): print "B", x
...
>>> test_input = iter(range(5))
>>>
>>> from itertools import tee, izip
>>>
>>> items, between = tee(test_input, 2)
>>> first = items.next()
>>> do_every(first)
E 0
>>> for i,b in izip(items, between):
...     do_between(b)
...     do_every(i)
...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4
>>>
The most simple solution coming to my mind is:
for item in data_list:
    try:
        print(new)
    except NameError: pass
    new = item
print('The last item: ' + str(new))
So we always look ahead one item by delaying the processing by one iteration. To skip doing something during the first iteration I simply catch the error.
Of course you need to think a bit, in order for the NameError to be raised when you want it.
Also keep this construct in mind:
try:
    new
except NameError: pass
else:
    # continue here if no error was raised
This relies on the name new not having been defined previously. If you are paranoid you can ensure that new doesn't exist using:
try:
    del new
except NameError:
    pass
Alternatively you can of course also use an if statement (if notfirst: print(new) else: notfirst = True). But as far as I know the overhead is bigger.
Using `timeit` yields:
...: try: new = 'test'
...: except NameError: pass
...:
100000000 loops, best of 3: 16.2 ns per loop
so I expect the overhead to be negligible.
Count the items once and keep up with the number of items remaining:
remaining = len(data_list)
for data in data_list:
    code_that_is_done_for_every_element
    remaining -= 1
    if remaining:
        code_that_is_done_between_elements
This way you only evaluate the length of the list once. Many of the solutions on this page seem to assume the length is unavailable in advance, but that is not part of your question. If you have the length, use it.
One simple solution that comes to mind would be:
for i in MyList:
    # Check if 'i' is the last element in the list
    if i == MyList[-1]:
        # Do something different for the last
    else:
        # Do something for all other elements
A second equally simple solution could be achieved by using a counter:
# Count the no. of elements in the list
ListLength = len(MyList)
# Initialize a counter
count = 0
for i in MyList:
    # increment counter
    count += 1
    # Check if 'i' is the last element in the list
    # by using the counter
    if count == ListLength:
        # Do something different for the last
    else:
        # Do something for all other elements
Just check if data is not the same as the last data in data_list (data_list[-1]).
for data in data_list:
    code_that_is_done_for_every_element
    if data != data_list[-1]:
        code_that_is_done_between_elements
So, this is definitely not the "shorter" version, and one might debate whether "shortest" and "Pythonic" are actually compatible.
But if one needs this pattern often, just put the logic into a 10-liner generator and get any metadata related to an element's position directly in the for statement. Another advantage here is that it will work with an arbitrary iterable, not only sequences.
_sentinel = object()
def iter_check_last(iterable):
    iterable = iter(iterable)
    current_element = next(iterable, _sentinel)
    while current_element is not _sentinel:
        next_element = next(iterable, _sentinel)
        yield (next_element is _sentinel, current_element)
        current_element = next_element
In [107]: for is_last, el in iter_check_last(range(3)):
     ...:     print(is_last, el)
     ...:
     ...:
False 0
False 1
True 2
This is an old question, and there's already lots of great responses, but I felt like this was pretty Pythonic:
def rev_enumerate(lst):
    """
    Similar to enumerate(), but counts DOWN to the last element being the
    zeroth, rather than counting UP from the first element being the zeroth.
    Since the length has to be determined up-front, this is not suitable for
    open-ended iterators.

    Parameters
    ----------
    lst : Iterable
        An iterable with a length (list, tuple, dict, set).

    Yields
    ------
    tuple
        A tuple with the reverse cardinal number of the element, followed by
        the element of the iterable.
    """
    length = len(lst) - 1
    for i, element in enumerate(lst):
        yield length - i, element
Used like this:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    if not num_remaining:
        print(f'This is the last item in the list: {item}')
Or perhaps you'd like to do the opposite:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    if num_remaining:
        print(f'This is NOT the last item in the list: {item}')
Or, just to know how many remain as you go...
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    print(f'After {item}, there are {num_remaining} items.')
I think the versatility and familiarity with the existing enumerate makes it most Pythonic.
Caveat, unlike enumerate(), rev_enumerate() requires that the input implement __len__, but this includes lists, tuples, dicts and sets just fine.

Pythonic ways to use 'else' in a for loop [duplicate]

This question already has answers here:
Why does python use 'else' after for and while loops?
I have hardly ever seen a Python program that uses else in a for loop.
I recently used it to perform an action based on the value of the loop variable on exit, since the variable is still in scope.
What is the Pythonic way to use else in a for loop? Are there any notable use cases?
And yes, I dislike using the break statement; I'd rather make the loop condition more complex. Would I get any benefit out of else if I don't like to use break anyway?
Worth noting that the for loop has had an else clause since the language's inception, in the very first version.
What could be more pythonic than PyPy?
Look at what I discovered starting at line 284 in ctypes_configure/configure.py:
for i in range(0, info['size'] - csize + 1, info['align']):
    if layout[i:i+csize] == [None] * csize:
        layout_addfield(layout, i, ctype, '_alignment')
        break
else:
    raise AssertionError("unenforceable alignment %d" % (
        info['align'],))
And here, from line 425 in pypy/annotation/annrpython.py:
if cell.is_constant():
    return Constant(cell.const)
else:
    for v in known_variables:
        if self.bindings[v] is cell:
            return v
    else:
        raise CannotSimplify
In pypy/annotation/binaryop.py, starting at line 751:
def is_((pbc1, pbc2)):
    thistype = pairtype(SomePBC, SomePBC)
    s = super(thistype, pair(pbc1, pbc2)).is_()
    if not s.is_constant():
        if not pbc1.can_be_None or not pbc2.can_be_None:
            for desc in pbc1.descriptions:
                if desc in pbc2.descriptions:
                    break
            else:
                s.const = False    # no common desc in the two sets
    return s
A non-one-liner in pypy/annotation/classdef.py, starting at line 176:
def add_source_for_attribute(self, attr, source):
    """Adds information about a constant source for an attribute.
    """
    for cdef in self.getmro():
        if attr in cdef.attrs:
            # the Attribute() exists already for this class (or a parent)
            attrdef = cdef.attrs[attr]
            s_prev_value = attrdef.s_value
            attrdef.add_constant_source(self, source)
            # we should reflow from all the reader's position,
            # but as an optimization we try to see if the attribute
            # has really been generalized
            if attrdef.s_value != s_prev_value:
                attrdef.mutated(cdef) # reflow from all read positions
            return
    else:
        # remember the source in self.attr_sources
        sources = self.attr_sources.setdefault(attr, [])
        sources.append(source)
        # register the source in any Attribute found in subclasses,
        # to restore invariant (III)
        # NB. add_constant_source() may discover new subdefs but the
        #     right thing will happen to them because self.attr_sources
        #     was already updated
        if not source.instance_level:
            for subdef in self.getallsubdefs():
                if attr in subdef.attrs:
                    attrdef = subdef.attrs[attr]
                    s_prev_value = attrdef.s_value
                    attrdef.add_constant_source(self, source)
                    if attrdef.s_value != s_prev_value:
                        attrdef.mutated(subdef) # reflow from all read positions
Later in the same file, starting at line 307, an example with an illuminating comment:
def generalize_attr(self, attr, s_value=None):
    # if the attribute exists in a superclass, generalize there,
    # as imposed by invariant (I)
    for clsdef in self.getmro():
        if attr in clsdef.attrs:
            clsdef._generalize_attr(attr, s_value)
            break
    else:
        self._generalize_attr(attr, s_value)
A for loop has no explicit condition that you could use to stop it early, so break is the way to abort it. The else clause then neatly handles the case where the loop ran to the end without finding what you wanted.
for fruit in basket:
    if fruit.kind in ['Orange', 'Apple']:
        fruit.eat()
        break
else:
    print 'The basket contains no desirable fruit'
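For readers on Python 3, the same search pattern might be wrapped in a function like the sketch below. The name `first_desirable`, the `wanted` parameter and the plain-string basket are assumptions made for illustration, not part of the answer above:

```python
def first_desirable(basket, wanted=("Orange", "Apple")):
    """Return the first wanted kind in basket, or None if there is none."""
    for kind in basket:
        if kind in wanted:
            result = kind
            break
    else:
        # Reached only when the loop finished without hitting break.
        result = None
    return result

print(first_desirable(["Pear", "Apple", "Orange"]))
print(first_desirable(["Pear", "Plum"]))
```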
Basically, it simplifies any loop that uses a boolean flag like this:
found = False # <-- initialize boolean
for divisor in range(2, n):
    if n % divisor == 0:
        found = True # <-- update boolean
        break # optional, but continuing would be a waste of time
if found: # <-- check boolean
    print n, "is composite"
else:
    print n, "is prime"
and allows you to skip the management of the flag:
for divisor in range(2, n):
    if n % divisor == 0:
        print n, "is composite"
        break
else:
    print n, "is prime"
Note that there is already a natural place for code to execute when you do find a divisor - right before the break. The only new feature here is a place for code to execute when you tried all divisors and did not find any.
This helps only in conjunction with break. You still need booleans if you can't break (e.g. because you are looking for the last match, or have to track several conditions in parallel).
Oh, and BTW, this works for while loops just as well.
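A small Python 3 sketch of the while variant, under the assumption that we want to give up after a fixed number of steps; the function name and the `max_steps` parameter are invented for this example:

```python
def first_power_of_two_at_least(target, max_steps=64):
    """Return the smallest power of two >= target, or None if not found
    within max_steps doublings."""
    value = 1
    steps = 0
    while steps < max_steps:
        if value >= target:
            break
        value *= 2
        steps += 1
    else:
        # The loop condition became false without a break.
        return None
    return value

print(first_power_of_two_at_least(100))
print(first_power_of_two_at_least(100, max_steps=3))
```

Here the else branch plays the same role as with for: it marks the "ran out of attempts" path without any flag variable.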
any/all
Nowadays, if the only purpose of the loop is a yes-or-no answer, you might be able to write it much shorter with the any()/all() functions with a generator or generator expression that yields booleans:
if any(n % divisor == 0
       for divisor in range(2, n)):
    print n, "is composite"
else:
    print n, "is prime"
Note the elegance! The code is 1:1 what you want to say!
[This does the same amount of work as a loop with a break, because the any() function is short-circuiting, only running the generator expression until it yields True. It can even be faster than a loop, since simpler Python code tends to have less overhead, but that depends on the case.]
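The short-circuiting can be observed directly. In this Python 3 sketch the helper `divides` and the `checked` list exist only to record which divisors were actually tried:

```python
checked = []

def divides(n, divisor):
    """Record the attempt, then report whether divisor divides n."""
    checked.append(divisor)
    return n % divisor == 0

n = 91  # 7 * 13
is_composite = any(divides(n, d) for d in range(2, n))

# any() stops at the first True, so nothing past 7 is tried.
print(is_composite, checked)
```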
This is less workable if you have other side effects - for example if you want to find the divisor. any() itself only returns True or False, so it cannot hand back the divisor. You can still do it with next() and a default of 0, (ab)using the fact that non-0 values are true in Python:
divisor = next((d for d in range(2, n) if n % d == 0), 0)
if divisor:
    print n, "is divisible by", divisor
else:
    print n, "is prime"
but as you see this is getting shaky - wouldn't work if 0 was a possible divisor value...
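One way around the "0 might be a valid value" problem is to ask for the first match with next() and test the result against None instead of relying on truthiness. A hedged Python 3 sketch; `first_match` is a hypothetical helper, and it assumes None is never itself a valid match:

```python
def first_match(candidates, predicate):
    """Return the first candidate satisfying predicate, else None."""
    return next((c for c in candidates if predicate(c)), None)

divisor = first_match(range(2, 91), lambda d: 91 % d == 0)
no_divisor = first_match(range(2, 13), lambda d: 13 % d == 0)
zero_hit = first_match([0, 5], lambda v: v % 5 == 0)

# Compare with None so that a falsy match such as 0 still counts.
for label, value in [("91", divisor), ("13", no_divisor), ("zero", zero_hit)]:
    print(label, "found" if value is not None else "not found", value)
```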
Without using break, else blocks have no benefit for for and while statements. The following two examples are equivalent:
for x in range(10):
    pass
else:
    print "else"

for x in range(10):
    pass
print "else"
The only reason for using else with for or while is to do something after the loop if it terminated normally, meaning without an explicit break.
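A short Python 3 sketch of that rule; the `run_loop` function and its `stop_at` parameter are made up for the demonstration:

```python
def run_loop(stop_at=None):
    """Loop over range(3) and log what happened, including the else branch."""
    events = []
    for x in range(3):
        if x == stop_at:
            events.append("break")
            break
        events.append(x)
    else:
        # Skipped whenever the loop was left through break.
        events.append("else")
    return events

print(run_loop())           # loop runs to exhaustion
print(run_loop(stop_at=1))  # loop is cut short by break
```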
After a lot of thinking, I can finally come up with a case where this might be useful:
def commit_changes(directory):
    for file in directory:
        if file_is_modified(file):
            break
    else:
        # No changes
        return False
    # Something has been changed
    send_directory_to_server()
    return True
Perhaps the best answer comes from the official Python tutorial:
break and continue Statements, and else Clauses on Loops:
Loop statements may have an else clause; it is executed when the loop terminates through exhaustion of the list (with for) or when the condition becomes false (with while), but not when the loop is terminated by a break statement.
I was introduced to a wonderful idiom in which you can use a for/break/else scheme with an iterator to save both time and LOC. The example at hand was searching for the candidate for an incompletely qualified path. If you care to see the original context, please see the original question.
def match(path, actual):
    path = path.strip('/').split('/')
    actual = iter(actual.strip('/').split('/'))
    for pathitem in path:
        for item in actual:
            if pathitem == item:
                break
        else:
            return False
    return True
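The key detail is that actual is a single shared iterator, so each inner loop resumes where the previous one stopped and the path items must appear in order. A Python 3 sketch restating the function with a few sample paths, which are invented for illustration:

```python
def match(path, actual):
    """Check that the parts of path occur, in order, among the parts of actual."""
    path = path.strip('/').split('/')
    # One shared iterator: parts consumed by an earlier search stay consumed.
    actual = iter(actual.strip('/').split('/'))
    for pathitem in path:
        for item in actual:
            if pathitem == item:
                break
        else:
            # actual was exhausted without finding pathitem.
            return False
    return True

print(match('a/c', '/a/b/c'))  # parts found in order
print(match('c/a', '/a/b/c'))  # 'a' only occurs before 'c'
print(match('a/x', '/a/b/c'))  # 'x' is missing
```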
What makes the use of for/else so great here is the elegance of avoiding juggling a confusing boolean around. Without else, but hoping to achieve the same amount of short-circuiting, it might be written like so:
def match(path, actual):
    path = path.strip('/').split('/')
    actual = iter(actual.strip('/').split('/'))
    failed = True
    for pathitem in path:
        failed = True
        for item in actual:
            if pathitem == item:
                failed = False
                break
        if failed:
            break
    return not failed
I think the use of else makes it more elegant and more obvious.
A use case of the else clause of loops is breaking out of nested loops:
while True:
    for item in iterable:
        if condition:
            break
        suite
    else:
        continue
    break
It avoids repeating conditions:
while not condition:
    for item in iterable:
        if condition:
            break
        suite
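A runnable Python 3 sketch of the else/continue/break combination, using two nested for loops rather than the while/for skeleton above. The function `visit_until` and its sample data are assumptions for illustration:

```python
def visit_until(rows, target):
    """Visit values row by row and stop everything once target is seen."""
    visited = []
    for row in rows:
        for value in row:
            visited.append(value)
            if value == target:
                break      # leaves the inner loop only
        else:
            continue       # inner loop finished without break: next row
        break              # inner loop was broken: leave the outer loop too
    return visited

grid = [[1, 2], [3, 4], [5, 6]]
print(visit_until(grid, 3))
print(visit_until(grid, 9))
```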
Here you go:
a = ('y','a','y')
for x in a:
    print x,
else:
    print '!'
It's for the caboose.
edit:
# What happens if we add the ! to a list?
def side_effect(your_list):
    your_list.extend('!')
    for x in your_list:
        print x,
claimant = ['A',' ','g','u','r','u']
side_effect(claimant)
print claimant[-1]
# oh no, claimant now ends with a '!'
edit:
a = (("this","is"),("a","contrived","example"),("of","the","caboose","idiom"))
for b in a:
    for c in b:
        print c,
        if "is" == c:
            break
    else:
        print

Resources