What is the most optimized method to loop through this code? - performance

stocks is a dict():
stocks[0]: [u'portfolio1', u'Active']
stocks[1]: [u'portfolio2', u'Active']
stocks[2]: [u'portfolio3', u'Inactive']
I am trying to check the status of the portfolio which is stocks[0][1], stocks[1][1] and stocks[2][1] and create a list of elements containing only the active portfolio.
And, I am using a counter to do the iteration which seems to be a very slow process. What is the most efficient method to loop through this code?
a = 0
test = {}
while a <= 500:
    try:
        if stocks[a][1] == 'Active':
            test[a] = stocks[a][0]
            print test[a]
            a +=1
        else:
            pass
            a +=1
    except KeyError:
        break
test = list(test.values())
test = str(','.join(test)).split(',')

One thing you could try, instead of using a counter, is to iterate through the dictionary values themselves, returning only those portfolios with a status of Active. When you need to check all of the items in a particular data structure, it is usually easiest to iterate over the structure itself instead of using a counter (i.e. saying for item in iterable instead of for x in range(len(iterable)): iterable[x]):
In [1]: stocks = {
...: 0: [u'portfolio1', u'Active'],
...: 1: [u'portfolio2', u'Active'],
...: 2: [u'portfolio3', u'Inactive']
...: }
In [2]: actives = [x[0] for x in stocks.itervalues() if x[1] == 'Active']
In [3]: actives
Out[3]: [u'portfolio1', u'portfolio2']
actives in this case is generated using a list comprehension that iterates through the values of the stocks dictionary and returns the name (x[0]) of only those entries where x[1] (the status, in your case) is equal to Active.
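On Python 3, dict.itervalues() no longer exists. A minimal sketch of the same comprehension, assuming the same stocks layout as in the question, uses values() and unpacks each entry into name and status (both names are illustrative, not from the original code):

```python
# Same data layout as in the question: key -> [name, status]
stocks = {
    0: ['portfolio1', 'Active'],
    1: ['portfolio2', 'Active'],
    2: ['portfolio3', 'Inactive'],
}

# Unpack each value instead of indexing with x[0] / x[1]
actives = [name for name, status in stocks.values() if status == 'Active']
print(actives)
```

Unpacking into named variables keeps the filter readable without changing what the comprehension does.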

Related

I am not getting the desired output with my python code

def CheckNumber(MyList, number):
    counter=0
    while number!=0:
        for i,element in enumerate(MyList):
            if number%10==element:
                del MyList[i]
            else:
                continue
        number = number/10
    if len(MyList)==0:
        return 1
    else:
        return 2

print("Program to print all the possible combinations of a number")
MyNumber = int(input("Enter the number: "))
MyList = []
while MyNumber!=0:
    MyList.append(MyNumber%10)
    MyNumber=int(MyNumber/10)
MyLimit = 10**(len(MyList)-1)
for i in range(MyLimit, MyLimit*10):
    answer = CheckNumber(MyList, i)
    if answer == 1:
        print(i)
    else:
        continue
I am a beginner at programming and I was trying to write a code to print all the possible combinations of a number. If user enters a 3 digit number the program will check all the three digit numbers to find possible combinations but instead it gives all the numbers as output. For example if user enters 12 then the output should be 12 21 but instead it shows every number from 10 to 99.
As far as I know everything is working fine but the results are not as I expect.
This is a pass-by-reference vs pass-by-value problem. What that means is that when you pass a list to a function in Python you are not passing a copy of the values in that list; you are passing a reference to the list object itself. So when you modify MyList inside your CheckNumber function you are actually modifying the global MyList. Integers are immutable, so reassigning number inside the function only rebinds the local name, which is why modifying number does not change i in the for loop. Quick example:
def foo(my_list):
    my_list.append('world')
    print(my_list)

a = []
foo(a)    # this will print out ['world']
print(a)  # this will also print out ['world']: the caller's list was mutated

b = ['hello']
foo(b.copy())  # this will print out ['hello', 'world']
print(b)       # here we have not passed b directly into foo, but a copy,
               # so this will just print out ['hello'] as b has not been modified
To summarize: variables are stored at a specific location in memory. When you pass by reference you pass along that location in memory, so your variable can be mutated. When you pass by value, your function creates a new variable and stores a copy of the data, so you do not mutate your outer variable. In other languages you can specify which way to pass a variable, but afaik you cannot in Python: the function always receives a reference to the same object, so mutable objects such as lists must be copied explicitly.
With that out of the way this is a very easy fix. You don't want to modify your original MyList so just make a copy of it and pass that into the function. You also forgot to cast number/10 to int in the CheckNumber function. The working code should look like this:
def CheckNumber(MyList, number):
    counter=0
    while number!=0:
        for i,element in enumerate(MyList):
            if number%10==element:
                del MyList[i]
                # Remove only one matching digit per digit of number,
                # otherwise repeated digits are deleted more than once
                break
            else:
                continue
        number = int(number/10)
    if len(MyList)==0:
        return 1
    else:
        return 2

print("Program to print all the possible combinations of a number")
MyNumber = int(input("Enter the number: "))
MyList = []
while MyNumber!=0:
    MyList.append(MyNumber%10)
    MyNumber=int(MyNumber/10)
MyLimit = 10**(len(MyList)-1)
for i in range(MyLimit, MyLimit*10):
    # Pass a copy so the original MyList is never modified
    answer = CheckNumber(MyList.copy(), i)
    if answer == 1:
        print(i)
    else:
        continue
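As a side note, the mutation problem can be avoided altogether by never deleting from a list. The sketch below is a different technique from the answer's code, not a restatement of it: it compares the sorted digits of two numbers. The function names are hypothetical, and it assumes a positive integer input.

```python
def same_digits(a, b):
    """True if a and b are written with exactly the same digits."""
    return sorted(str(a)) == sorted(str(b))


def digit_rearrangements(number):
    """All numbers with as many digits as `number` that reuse its digits."""
    low = 10 ** (len(str(number)) - 1)
    return [n for n in range(low, low * 10) if same_digits(n, number)]


print(digit_rearrangements(12))
```

Because nothing is mutated, there is no shared state between iterations to copy or reset.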
More info on pass-by-reference:
What's the difference between passing by reference vs. passing by value?
https://blog.penjee.com/passing-by-value-vs-by-reference-java-graphical/
https://courses.washington.edu/css342/zander/css332/passby.html

Python: Printing vertically

The final code will print the distance between states. I'm trying to print the menu with the names of the states numbered and vertically. I really struggle to find my mistakes.
This code doesn't raise any error, it just prints nothing, empty.
state_data = """
LA 34.0522°N 118.2437°W
Florida 27.6648°N 81.5158°W
NY 40.7128°N 74.0060°W"""
states = []
import re
state_data1 = re.sub("[°N#°E]", "", state_data)
def process_states(string):
    states_temp = string.split()
    states = [(states_temp[x], float(states_temp[x + 1]), float(states_temp[x + 2])) for x in
              range(0, len(states_temp), 3)]
    return states
def menu():
    for state_data in range(state_data1):
        print(f'{state_data + 1} {name[number]}')
My first guess is that your code prints nothing, without raising any error, because you never actually call process_airports() or menu().
You have to call them like this at the end of your script:
something = process_airports(airport_data1)
menu()
This will now raise some errors though. So let's address them.
The menu() function will raise an error because neither name nor number are defined and because you are trying to apply the range function over a string (airport_data1) instead of an integer.
First fixing the range error: you mixed two ideas in your for-loop: iterating over the elements in your list airport_data1 and iterating over the indexes of the elements in the list.
You have to choose one (we'll see later that you can do both at once), in this example, I choose to iterate over the indexes of the list.
Then, since neither name nor number exists anywhere they will raise an error. You always need to declare variables somewhere, however, in this case they are not needed at all so let's just remove them:
def menu(data):
    for i in range(len(data)):
        print(f'{i + 1} {data[i]}')
processed_airports = process_airports(airport_data1)
menu(processed_airports)
Considering data is the output of process_airports()
Now for some general advice and improvements.
First, global variables.
Notice how you can access airport_data1 within the menu() function just fine. While this works, it is not recommended; it's usually better to explicitly pass variables as arguments.
Notice how in the function I proposed above, every single variable is declared in the function itself, there is no information coming from a higher scope. Again, this is not mandatory but makes the code way easier to work with and understand.
airport_data = """
Alexandroupoli 40.855869°N 25.956264°E
Athens 37.936389°N 23.947222°E
Chania 35.531667°N 24.149722°E
Chios 38.343056°N 26.140556°E
Corfu 39.601944°N 19.911667°E"""
airports = []
import re
airport_data1 = re.sub("[°N#°E]", "", airport_data)
def process_airports(string):
    airports_temp = string.split()
    airports = [(airports_temp[x], float(airports_temp[x + 1]), float(airports_temp[x + 2])) for x in
                range(0, len(airports_temp), 3)]
    return airports
def menu(data):
    for i in range(len(data)):
        print(f'{i + 1} {data[i]}')
# I'm adding the call to the functions for clarity
data = process_airports(airport_data1)
menu(data)
The printed menu now looks like that:
1 ('Alexandroupoli', 40.855869, 25.956264)
2 ('Athens', 37.936389, 23.947222)
3 ('Chania', 35.531667, 24.149722)
4 ('Chios', 38.343056, 26.140556)
5 ('Corfu', 39.601944, 19.911667)
Second, and this is mostly FYI: you can access both the index of an iterable and the element itself by looping over enumerate(), meaning the following function will print exactly the same thing as the one with range(len(data)). This is handy if you need to work with both the element itself and its index.
def menu(data):
    for the_index, the_element in enumerate(data):
        print(f'{the_index + 1} {the_element}')
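If the menu should show only the numbered names, one per line, a possible variation is to let enumerate() start counting at 1 and unpack each tuple. This is a sketch under the assumption that each entry is a (name, latitude, longitude) tuple as produced above; menu_lines and sample are hypothetical names used for illustration.

```python
def menu_lines(data):
    """Build one numbered line per (name, latitude, longitude) tuple."""
    lines = []
    # start=1 makes the numbering begin at 1, so no "+ 1" is needed
    for number, (name, _lat, _lon) in enumerate(data, start=1):
        lines.append(f'{number} {name}')
    return lines


# Hypothetical sample in the shape returned by process_airports()
sample = [('Athens', 37.936389, 23.947222), ('Chania', 35.531667, 24.149722)]
print('\n'.join(menu_lines(sample)))
```

Returning the lines instead of printing inside the loop keeps the formatting easy to check separately from the output.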

Passing & returning a list/array as a parameter/ return type to a UDF in Redshift

I have a bunch of metrics that consume the entire list of float values of a column (think of a series of order values on which I am doing some outlier analysis, hence needing the entire array of values).
Can I pass the entire list as a parameter ? It would be too much data munging, if I were to do this in python entirely. Thoughts ?
# Redshift UDF - the red part is invalid signature & needs a fill
create function Median_absolute_deviation(y <Pass a list, but how? >,threshold float)
--INPUTS:
--a list of order values, -- a threshold
RETURNS <return a list, but how? >
STABLE
AS $$
import numpy as np
m = np.median(y)
abs_dev = np.abs(y - m)
left_mad = np.median(abs_dev[y<=m])
right_mad = np.median(abs_dev[y>=m])
y_mad = np.zeros(len(y))
y_mad[y < m] = left_mad
y_mad[y > m] = right_mad
modified_z_score = 0.6745 * abs_dev / y_mad
modified_z_score[y == m] = 0
return modified_z_score > threshold
$$LANGUAGE plpythonu
I can pass the m = np.median(y) from another function (using select statement on the DB) - but again calculating abs_dev & left_mad & right_mad needs the entire series.
Can I use anyelement data type here ? AWS Reference : http://docs.aws.amazon.com/redshift/latest/dg/udf-data-types.html
This is what I tried . Also, I would like to return the value of that column if flag was "0" - but I guess I can do it on 2nd pass ?
create or replace function Median_absolute_deviation(y anyelement ,thresh int)
--INPUTS:
--a list of order values, -- a threshold
-- I tried both float & anyelement return type, but same error
RETURNS float
--OUTPUT:
-- returns the value of order amount if not outlier, else returns 0
STABLE
AS $$
import numpy as np
m = np.median(y)
abs_dev = np.abs(y - m)
left_mad = np.median(abs_dev[y<=m])
right_mad = np.median(abs_dev[y>=m])
y_mad = np.zeros(len(y))
y_mad[y < m] = left_mad
y_mad[y > m] = right_mad
modified_z_score = 0.6745 * abs_dev / y_mad
modified_z_score[y == m] = 0
flag= 1 if (modified_z_score > thresh ) else 0
return flag
$$LANGUAGE plpythonu
select Median_absolute_deviation(price,3) from my_table where price >0 limit 5;
An error occurred when executing the SQL command:
select Median_absolute_deviation(price,3) from my_table where price >0 limit 5
ERROR: IndexError: invalid index to scalar variable.. Please look at svl_udf_log for more information
Detail:
-----------------------------------------------
error: IndexError: invalid index to scalar variable.. Please look at svl_udf_log for more information
code: 10000
context: UDF
query: 47544645
location: udf_client.cpp:298
process: query6_41 [pid=24744]
-----------------------------------------------
Execution time: 0.73s
1 statement failed.
My end goal is populating tableau views using these computations made via UDF's(the end goal) - so I need something that can interact with tableau and do computations on the fly using a function. Suggestions ?
Redshift only supports scalar UDFs for the time being, which means that you basically CANNOT pass a list as a parameter.
That being said, you can be creative and pass it as a string of numbers separated with a special character and then reconvert it to a list in your udf eg.:
list = [1, 2, 3.5] can be passed as
string_list = "1|2|3.5"
For this to work you need to pre-decide the precision of your numbers and the maximum size of your list, so as to define a varchar of the appropriate length.
It is not the best practice, but it will work.
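Inside the UDF body, the packed string can be turned back into a list with ordinary Python. The following is a minimal sketch, assuming the values arrive as a '|'-separated string; the function names are hypothetical, and the standard-library statistics module stands in for numpy so the snippet runs anywhere.

```python
from statistics import median


def parse_pipe_list(packed, sep="|"):
    """Turn a delimited string such as "1|2|3.5" back into a list of floats."""
    return [float(token) for token in packed.split(sep) if token]


def median_abs_deviation(packed):
    """Median absolute deviation of the values packed into one string."""
    values = parse_pipe_list(packed)
    m = median(values)
    return median(abs(v - m) for v in values)


print(parse_pipe_list("1|2|3.5"))
print(median_abs_deviation("1|2|3.5"))
```

The UDF still receives and returns scalars (a string in, a number out), which is what keeps it within the scalar-only restriction described above.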

PyMC3: How can I code my custom distribution with observed data better for Theano?

I am attempting to implement a fairly simple model in pymc3. The gist is that I have some data that is generated from a sequence of random choices. The choices can be thought of as a multinomial, and the process selects choices as a function of previous choices.
The overall probability of the categories is modeled with a Dirichlet prior.
The likelihood function must be customized for the data at hand. The data are lists of 0's and 1's that are output from the process. I have successfully made the model in pymc2, which you can find at this blog post. Here is a python function that generates test data for this problem:
# 1/5000.0 rather than 1/5000: under Python 2 integer division would give 0
ps = [0.2,0.35,0.25,0.15,0.0498,1/5000.0]
def make(ps):
    out = []
    while len(out) < 5:
        n_spots = 5-len(out)
        sp = sum(ps[:n_spots+1])
        P = [x/sp for x in ps[:n_spots+1]]
        l = np.argwhere(np.random.multinomial(1,P)==1).ravel()[0]
        #if len(out) == 4:
        #    l = np.argwhere(np.random.multinomial(1,ps[:2])==1).ravel()[0]
        out.extend([1]*l)
        if (out and out[-1] == 1 and len(out) < 5) or l == 0:
            out.append(0)
        #print n_spots, l, len(out)
    assert len(out) == 5
    return out
As I'm learning/moving to pymc3, I'm trying to input my data as observed into a custom likelihood function, and I'm running into several issues along the way. It's probably because this is my first experience with Theano, but I'm hoping that someone can give some advice.
Here is my code (using the make function above):
import numpy as np
import pymc3 as pm
from scipy import optimize
import theano.tensor as T
from theano.compile.ops import as_op
from collections import Counter

# This function gets the attributes of the data that are relevant for calculating the likelihood
def scan(value):
    groups = []
    prev = False
    s = 0
    for i in xrange(5):
        if value[i] == 0:
            if prev:
                groups.append((s,5-(i-s)))
                prev = False
                s = 0
            else:
                groups.append((0,5-i))
        else:
            prev = True
            s += 1
    if prev:
        groups.append((s,4-(i-s)))
    return groups

# The likelihood calculation for a single data point
def like1(v,p):
    l = 1
    groups = scan(v)
    for n, s in groups:
        l *= p[n]/p[:s+1].sum()
    return T.log(l)

# my custom likelihood class
class CustomDist(pm.distributions.Discrete):
    def __init__(self, ps, data, *args, **kwargs):
        super(CustomDist, self).__init__(*args, **kwargs)
        self.ps = ps
        self.data = data
    def logp(self,v):
        all_l = 0
        for v, k in self.data.items():
            l = like1(v,self.ps)
            all_l += l*k
        return all_l

# model creation
model = pm.Model()
with model:
    probs = pm.Dirichlet('probs',a=np.array([0.5]*6),shape=6,testval=np.array([1/6.0]*6))
    output = CustomDist("rolls",ps=probs,data=data,observed=True)
I am able to find the MAP in about a minute or so (my machine is Windows 7, i7-4790 @ 3.6GHz). The MAP matches well with the input probability vector, which at least means the model is linked properly.
When I try to do traces, though, my memory usage skyrockets (up to several gig) and I haven't actually been patient enough for the model to finish compiling. I've waited 10 minutes + for the NUTS or HMC to compile before even tracing. The metropolis stepping works just fine, though (and is much faster than with pymc2).
Am I just being too hopeful for Theano to be able to handle for-loops of non-theano data well? Is there a better way to write this code so that Theano plays well with it, or am I limited because my data is a custom python type and can't be analyzed with array/matrix operations?
Thanks in advance for your advice and feedback. Please let me know what might need clarification!

What is the pythonic way to detect the last element in a 'for' loop?

How can I treat the last element of the input specially, when iterating with a for loop? In particular, if there is code that should only occur "between" elements (and not "after" the last one), how can I structure the code?
Currently, I write code like so:
for i, data in enumerate(data_list):
    code_that_is_done_for_every_element
    if i != len(data_list) - 1:
        code_that_is_done_between_elements
How can I simplify or improve this?
Most of the time it is easier (and cheaper) to make the first iteration the special case instead of the last one:
first = True
for data in data_list:
    if first:
        first = False
    else:
        between_items()
    item()
This will work for any iterable, even for those that have no len():
file = open('/path/to/file')
for line in file:
    process_line(line)
    # No way of telling if this is the last line!
Apart from that, I don't think there is a generally superior solution as it depends on what you are trying to do. For example, if you are building a string from a list, it's naturally better to use str.join() than using a for loop “with special case”.
Using the same principle but more compact:
for i, line in enumerate(data_list):
    if i > 0:
        between_items()
    item()
Looks familiar, doesn't it? :)
For @ofko, and others who really need to find out whether the current value of an iterable without len() is the last one, you will need to look ahead:
def lookahead(iterable):
    """Pass through all values from the given iterable, augmented by the
    information if there are more values to come after the current one
    (True), or if it is the last value (False).
    """
    # Get an iterator and pull the first value.
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        # Empty iterable: nothing to yield.
        return
    # Run the iterator to exhaustion (starting from the second value).
    for val in it:
        # Report the *previous* value (more to come).
        yield last, True
        last = val
    # Report the last value.
    yield last, False
Then you can use it like this:
>>> for i, has_more in lookahead(range(3)):
...     print(i, has_more)
0 True
1 True
2 False
Although that question is pretty old, I came here via google and I found a quite simple way: List slicing. Let's say you want to put an '&' between all list entries.
s = ""
l = [1, 2, 3]
for i in l[:-1]:
    s = s + str(i) + ' & '
s = s + str(l[-1])
This returns '1 & 2 & 3'.
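For the specific case of building a string with a separator, str.join() handles the "between" logic by itself and also copes with an empty list. A minimal sketch using the same sample values:

```python
l = [1, 2, 3]

# join() only inserts the separator between items, never after the last one
s = ' & '.join(str(i) for i in l)
print(s)

# An empty input simply yields an empty string instead of raising IndexError
empty = ' & '.join(str(i) for i in [])
```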
If the items are unique:
for x in list:
    #code
    if x == list[-1]:
        #code
Other options:
pos = -1
for x in list:
    pos += 1
    #code
    if pos == len(list) - 1:
        #code

for x in list:
    #code
#code - e.g. print x

if len(list) > 0:
    for x in list[:-1]:
        #process everything except the last element
    for x in list[-1:]:
        #process only last element
The 'code between' is an example of the Head-Tail pattern.
You have an item, which is followed by a sequence of ( between, item ) pairs. You can also view this as a sequence of (item, between) pairs followed by an item. It's generally simpler to take the first element as special and all the others as the "standard" case.
Further, to avoid repeating code, you have to provide a function or other object to contain the code you don't want to repeat. Embedding an if statement in a loop which is always false except one time is kind of silly.
def item_processing( item ):
    # *the common processing*

head_tail_iter = iter( someSequence )
head = next(head_tail_iter)
item_processing( head )
for item in head_tail_iter:
    # *the between processing*
    item_processing( item )
This is more reliable because it's slightly easier to prove correct. It doesn't create an extra data structure (i.e., a copy of a list) and doesn't require a lot of wasted execution of an if condition which is always false except once.
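A runnable instance of the head-tail idea could look like the sketch below. The function name interleave and the choice to collect results in a list are illustrative assumptions; the point is only that the first item is handled before the loop and every later item is preceded by the "between" step.

```python
def interleave(items, between):
    """Return the items with `between` placed between consecutive items."""
    out = []
    it = iter(items)
    try:
        head = next(it)          # the head: handled once, outside the loop
    except StopIteration:
        return out               # empty input, nothing to process
    out.append(head)
    for item in it:              # the tail: (between, item) pairs
        out.append(between)
        out.append(item)
    return out


print(interleave("abc", "-"))
```

No index arithmetic or per-iteration flag is needed, and the function works on any iterable, including generators.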
If you're simply looking to modify the last element in data_list then you can simply use the notation:
data_list[-1]
However, it looks like you're doing more than that. There is nothing really wrong with your way. I even took a quick glance at some Django code for their template tags and they do basically what you're doing.
You can determine the last element with this code:
for i, element in enumerate(data_list):
    if i == len(data_list) - 1:
        # str() so that non-string elements can be concatenated too
        print("last element is " + str(element))
This is similar to Ants Aasma's approach but without using the itertools module. It's also a lagging iterator which looks-ahead a single element in the iterator stream:
def last_iter(it):
    # Ensure it's an iterator and get the first field
    it = iter(it)
    try:
        prev = next(it)
    except StopIteration:
        # Empty input: yield nothing
        return
    for item in it:
        # Lag by one item so I know I'm not at the end
        yield 0, prev
        prev = item
    # Last item
    yield 1, prev

def test(data):
    result = list(last_iter(data))
    if not result:
        return
    if len(result) > 1:
        assert set(x[0] for x in result[:-1]) == set([0]), result
    assert result[-1][0] == 1

test([])
test([1])
test([1, 2])
test(range(5))
test(range(4))

for is_last, item in last_iter("Hi!"):
    print(is_last, item)
We can achieve that using for-else
cities = [
'Jakarta',
'Surabaya',
'Semarang'
]
for city in cities[:-1]:
    print(city)
else:
    print(' '.join(cities[-1].upper()))
output:
Jakarta
Surabaya
S E M A R A N G
The idea is that the for-else loop only runs up to index n-1; once the for loop is exhausted, the else branch accesses the last element directly using [-1].
You can use a sliding window over the input data to get a peek at the next value and use a sentinel to detect the last value. This works on any iterable, so you don't need to know the length beforehand. The pairwise implementation is from itertools recipes.
from itertools import tee, chain

def pairwise(seq):
    a, b = tee(seq)
    next(b, None)
    return zip(a, b)  # on Python 2, use itertools.izip instead

def annotated_last(seq):
    """Returns an iterable of pairs of input item and a boolean that show if
    the current item is the last item in the sequence."""
    MISSING = object()
    for current_item, next_item in pairwise(chain(seq, [MISSING])):
        yield current_item, next_item is MISSING

for item, is_last_item in annotated_last(data_list):
    if is_last_item:
        # current item is the last item
        ...
Is there no possibility to iterate over all-but the last element, and treat the last one outside of the loop? After all, a loop is created to do something similar to all elements you loop over; if one element needs something special, it shouldn't be in the loop.
(see also this question: does-the-last-element-in-a-loop-deserve-a-separate-treatment)
EDIT: since the question is more about the "in between", either the first element is the special one in that it has no predecessor, or the last element is special in that it has no successor.
I like the approach of @ethan-t, but while True is dangerous from my point of view.
data_list = [1, 2, 3, 2, 1] # sample data
L = list(data_list) # destroy L instead of data_list
while L:
    e = L.pop(0)
    if L:
        print(f'process element {e}')
    else:
        print(f'process last element {e}')
del L
Here, data_list is chosen so that the last element is equal in value to the first one. L can be exchanged with data_list, but in that case data_list ends up empty after the loop. while True can also be used if you check that the list is not empty before processing, or if the check is not needed (ouch!).
data_list = [1, 2, 3, 2, 1]
if data_list:
    while True:
        e = data_list.pop(0)
        if data_list:
            print(f'process element {e}')
        else:
            print(f'process last element {e}')
            break
else:
    print('list is empty')
The good part is that it is fast. The bad part is that it is destructive (data_list becomes empty).
Most intuitive solution:
data_list = [1, 2, 3, 2, 1] # sample data
for i, e in enumerate(data_list):
    if i != len(data_list) - 1:
        print(f'process element {e}')
    else:
        print(f'process last element {e}')
Oh yes, you have already proposed it!
There is nothing wrong with your way, unless you have 100 000 iterations and want to save 100 000 "if" statements. In that case, you can go this way:
iterable = [1, 2, 3]  # your data
iterator = iter(iterable)  # get the data iterator
try:  # wrap it all in a try / except
    while True:
        item = next(iterator)
        print(item)  # put the "for loop" code here
except StopIteration:  # process the last element here
    print(item)
Outputs :
1
2
3
3
But really, in your case I feel like it's overkill.
In any case, you will probably be luckier with slicing:
for item in iterable[:-1]:
    print(item)
print("last :", iterable[-1])
# outputs
1
2
last : 3
or just:
for item in iterable:
    print(item)
print("last :", iterable[-1])
# outputs
1
2
3
last : 3
Finally, a KISS way to do your stuff, which works with any iterable, including the ones without __len__:
item = ''
for item in iterable:
    print(item)
print(item)
Outputs:
1
2
3
3
I feel like I would do it that way; it seems simple to me.
Use negative indexing and is to check for the last element:
for data in data_list:
    <code_that_is_done_for_every_element>
    if data is not data_list[-1]:
        <code_that_is_done_between_elements>
Caveat emptor: This only works if all elements in the list are actually different (have different locations in memory). Under the hood, Python may detect equal elements and reuse the same objects for them. For instance, for strings of the same value and common integers.
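The caveat can be demonstrated without relying on interpreter-specific caching by placing the very same object at both ends of a list. A small sketch (the names shared and flagged are illustrative):

```python
shared = object()
data_list = [shared, object(), shared]  # first and last entries are the same object

# Indexes that the identity test would treat as "the last element"
flagged = [i for i, data in enumerate(data_list) if data is data_list[-1]]
print(flagged)
```

Index 0 is flagged as well as index 2, so the "between" code would be skipped after the first element too.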
Google brought me to this old question and I think I could add a different approach to this problem.
Most of the answers here would deal with a proper treatment of a for loop control as it was asked, but if the data_list is destructible, I would suggest that you pop the items from the list until you end up with an empty list:
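while True:
    element = element_list.pop(0)
    do_this_for_all_elements()
    if not element_list:  # the list is now empty, so this was the last element
        do_this_only_for_last_element()
        break
    do_this_for_all_elements_but_last()
You could even use while len(element_list) if you don't need to do anything with the last element. I find this solution more elegant than dealing with next(). Note that pop(0) raises IndexError on an empty list, so check that element_list is not empty before entering the loop.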
For me the most simple and pythonic way to handle a special case at the end of a list is:
for data in data_list[:-1]:
    handle_element(data)
handle_special_element(data_list[-1])
Of course this can also be used to treat the first element in a special way.
Better late than never. Your original code used enumerate(), but you only used the i index to check if it's the last item in a list. Here's a simpler alternative (if you don't need enumerate()) using negative indexing:
for data in data_list:
    code_that_is_done_for_every_element
    if data != data_list[-1]:
        code_that_is_done_between_elements
if data != data_list[-1] checks if the current item in the iteration is NOT the last item in the list. Note that this compares values, so it misfires when an earlier item is equal to the last one.
Hope this helps, even nearly 11 years later.
If you are going through the list, this worked for me too:
for j in range(0, len(Array)):
    if len(Array) - j > 1:
        notLast()
Instead of counting up, you can also count down:
nrToProcess = len(list)
for s in list:
    s.doStuff()
    nrToProcess -= 1
    if nrToProcess==0: # this is the last one
        s.doSpecialStuff()
Here is a more elegant and robust way, using unpacking:
def mark_last(iterable):
    try:
        *init, last = iterable
    except ValueError:  # if iterable is empty
        return
    for e in init:
        yield e, True
    yield last, False
Test:
for a, b in mark_last([1, 2, 3]):
    print(a, b)
The result is:
1 True
2 True
3 False
If you are looping over a list, using the enumerate function is one of the best options.
for index, element in enumerate(ListObj):
    # print(index, ListObj[index], len(ListObj) )
    if (index != len(ListObj)-1 ):
        # Do things to the element which is not the last one
        pass
    else:
        # Do things to the element which is the last one
        pass
Delay the special handling of the last item until after the loop.
>>> for i in (1, 2, 3):
...     pass
...
>>> i
3
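There are multiple ways; slicing will be fastest. Here is one more, which uses the .index() method. Note that index() returns the position of the first match, so this only works when the last value does not also appear earlier in the list: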
>>> l1 = [1,5,2,3,5,1,7,43]
>>> [i for i in l1 if l1.index(i)+1==len(l1)]
[43]
If you are happy to be destructive with the list, then there's the following.
We are going to reverse the list in order to speed up the process from O(n^2) to O(n), because pop(0) moves the list each iteration - cf. Nicholas Pipitone's comment below
data_list.reverse()
while data_list:
    value = data_list.pop()
    code_that_is_done_for_every_element(value)
    if data_list:
        code_that_is_done_between_elements(value)
    else:
        code_that_is_done_for_last_element(value)
This works well with empty lists, and lists of non-unique items.
Since it's often the case that lists are transitory, this works pretty well ... at the cost of destructing the list.
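If the original list should survive, a variation on this idea is to consume a collections.deque copy instead, since popleft() is O(1) and no reversal is needed. This swaps in a different container from the one used above. It is only a sketch: the function name process and the recorded event tuples are illustrative stand-ins for the real per-element code.

```python
from collections import deque


def process(data_list):
    """Consume a deque copy of data_list and record what would be executed."""
    queue = deque(data_list)      # copy: data_list itself is left untouched
    events = []
    while queue:
        value = queue.popleft()   # O(1), unlike list.pop(0)
        events.append(("item", value))
        if queue:
            events.append(("between", value))
        else:
            events.append(("last", value))
    return events


print(process([1, 2]))
```

Like the reversed-list version, this handles empty input and repeated values, at the cost of one extra copy of the data.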
Assuming input as an iterator, here's a way using tee from itertools together with zip (on Python 2, use itertools.izip and items.next() instead):
from itertools import tee

items, between = tee(input_iterator, 2)  # Input must be an iterator.
first = next(items)
do_to_every_item(first)  # All "do to every" operations done to first item go here.
for i, b in zip(items, between):
    do_between_items(b)  # All "between" operations go here.
    do_to_every_item(i)  # All "do to every" operations go here.
Demo:
>>> def do_every(x): print("E", x)
...
>>> def do_between(x): print("B", x)
...
>>> test_input = iter(range(5))
>>>
>>> from itertools import tee
>>>
>>> items, between = tee(test_input, 2)
>>> first = next(items)
>>> do_every(first)
E 0
>>> for i, b in zip(items, between):
...     do_between(b)
...     do_every(i)
...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4
>>>
The most simple solution coming to my mind is:
for item in data_list:
    try:
        print(new)
    except NameError: pass
    new = item
print('The last item: ' + str(new))
So we always look ahead one item by delaying the processing by one iteration. To skip doing something during the first iteration I simply catch the error.
Of course you need to think a bit, in order for the NameError to be raised when you want it.
Also keep this construct in mind:
try:
    new
except NameError: pass
else:
    # continue here if no error was raised
This relies on the name new not being previously defined. If you are paranoid you can ensure that new doesn't exist using:
try:
    del new
except NameError:
    pass
Alternatively you can of course also use an if statement (if notfirst: print(new) else: notfirst = True). But as far as I know the overhead is bigger.
Using `timeit` yields:
...: try: new = 'test'
...: except NameError: pass
...:
100000000 loops, best of 3: 16.2 ns per loop
so I expect the overhead to be negligible.
Count the items once and keep up with the number of items remaining:
remaining = len(data_list)
for data in data_list:
    code_that_is_done_for_every_element
    remaining -= 1
    if remaining:
        code_that_is_done_between_elements
This way you only evaluate the length of the list once. Many of the solutions on this page seem to assume the length is unavailable in advance, but that is not part of your question. If you have the length, use it.
One simple solution that comes to mind would be:
for i in MyList:
    # Check if 'i' is the last element in the list
    if i == MyList[-1]:
        # Do something different for the last
    else:
        # Do something for all other elements
A second equally simple solution could be achieved by using a counter:
# Count the no. of elements in the list
ListLength = len(MyList)
# Initialize a counter
count = 0
for i in MyList:
    # increment counter
    count += 1
    # Check if 'i' is the last element in the list
    # by using the counter
    if count == ListLength:
        # Do something different for the last
    else:
        # Do something for all other elements
Just check if data is not the same as the last data in data_list (data_list[-1]).
for data in data_list:
    code_that_is_done_for_every_element
    if data != data_list[- 1]:
        code_that_is_done_between_elements
So, this is definitely not the "shorter" version, and one might debate whether "shortest" and "Pythonic" are actually compatible.
But if one needs this pattern often, just put the logic into a 10-liner generator and get any metadata related to an element's position directly in the for call. Another advantage here is that it will work with an arbitrary iterable, not only Sequences.
_sentinel = object()

def iter_check_last(iterable):
    iterable = iter(iterable)
    current_element = next(iterable, _sentinel)
    while current_element is not _sentinel:
        next_element = next(iterable, _sentinel)
        yield (next_element is _sentinel, current_element)
        current_element = next_element

In [107]: for is_last, el in iter_check_last(range(3)):
     ...:     print(is_last, el)
     ...:
     ...:
False 0
False 1
True 2
This is an old question, and there's already lots of great responses, but I felt like this was pretty Pythonic:
def rev_enumerate(lst):
    """
    Similar to enumerate(), but counts DOWN to the last element being the
    zeroth, rather than counting UP from the first element being the zeroth.
    Since the length has to be determined up-front, this is not suitable for
    open-ended iterators.

    Parameters
    ----------
    lst : Iterable
        An iterable with a length (list, tuple, dict, set).

    Yields
    ------
    tuple
        A tuple with the reverse cardinal number of the element, followed by
        the element of the iterable.
    """
    length = len(lst) - 1
    for i, element in enumerate(lst):
        yield length - i, element
Used like this:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    if not num_remaining:
        print(f'This is the last item in the list: {item}')
Or perhaps you'd like to do the opposite:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    if num_remaining:
        print(f'This is NOT the last item in the list: {item}')
Or, just to know how many remain as you go...
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
    print(f'After {item}, there are {num_remaining} items.')
I think the versatility and familiarity with the existing enumerate makes it most Pythonic.
Caveat, unlike enumerate(), rev_enumerate() requires that the input implement __len__, but this includes lists, tuples, dicts and sets just fine.
