Sorting multidimensional table in Lua - sorting

The "how to sort a table in Lua" question isn't new, but the answers I found can't help me out, maybe you can.
I got this Table:
table = {} -- some kind of database
table[1] = {table.containing.table.with.even.more.tables.inside}
table[9] = {table.containing.table.with.even.more.tables.inside}
table[13] = {table.containing.table.with.even.more.tables.inside}
table[15] = {table.containing.table.with.even.more.tables.inside}
table[45] = {table.containing.table.with.even.more.tables.inside}
table[3254] = {table.containing.table.with.even.more.tables.inside}
Now I want to iterate through "table", check for an specified boolean it contains and if so, run a method with parameters from some subtabels.
for key, value in pairs(table) do
print(key)
end
Is something like:
9 13 1 3254 45 15
As far as I know, that's because Lua iterates through hashvalues(right?).
My Idea was:
sorted_table = {} -- shall point to table, with sorted keys
for i = 0, #table do -- from 0 to the last key of table (some write #table is the last key, some write it's the number of contained keys, I don't know. If you do, please tell me.)
if table[i] then -- for runs every number from i to #table, if i is a key, bingo.
table.insert(sorted_table,(table[i])) -- first key found -> sorted_table[1], second -> sorted_table[2]....
end
end
for k,v in pairs(sorted_table) do
print(key)
end
I got no error, it just jumps over the function and nothing happens. When I print #table, it prints 0. #table is in another file but it's not local but used at other location in the functionfile, so,.. this is weird.
EDIT
Mh strange. I threw some debugs, #table is nil, but in a pairs(table) just under the code, everything works fine.
**SOLUTION-EDIT**
local sorted_table = {}
for k, v in pairs(original_table) do
table.insert(sorted_table, k)
end
table.sort(sorted_table)
for k, v in ipairs(sorted_table) do
print(original_table[v])
end

A little try of explanation:
#table does not necessarily return the length of a table. A table element gets a default key if added in a table without a special key. These keys start from 1 and go up to n. If there is a gap between two keys, when you give your own key, #table will return the key right before that gap.
Example:
t = {'one', 'two', 'three'} -- would be a table like 1 - one, 2 - two, 3 - three
print(#t) -- the last key -> 3, here it works
t2 = {'one', 'two', [4] = 'four'} -- would be a table like 1 - one, 2 - two, 4 - four
print(#t2) -- the last key without a gap -> 2, does not work
Same with pairs(table) and ipairs(table). ipairs iterates from key 1 to n without a gap, pairs iterates over all keys. That's why the solution works.
You can set an own metamethod for the table (__len) to use # for the right length.
And remember that your table keys start at 1 by default.
Hope it helped a bit to unterstand the problem here.

SOLUTION
local sorted_table = {}
for k, v in pairs(original_table) do
table.insert(sorted_table, k)
end
table.sort(sorted_table)
for k, v in ipairs(sorted_table) do
print(original_table[v])
end

Related

Lua - Create a nested table using for loop

I'm a very new to lua so am happy to read material if it will help with tables.
I've decoded a json object and would like to build a table properly using its data, rather than writing 64 lines of the below:
a = {}
a[decode.var1[1].aId] = {decode.var2[1].bId, decode.var3[1].cId}
a[decode.var1[2].aId] = {decode.var2[2].bId, decode.var3[2].cId}
a[decode.var1[3].aId] = {decode.var2[3].bId, decode.var3[3].cId}
...etc
Because the numbers are consecutive 1-64, i presume i should be able to build it using a for loop.
Unfortunately despite going through table building ideas I cannot seem to find a way to do it, or find anything on creating nested tables using a loop.
Any help or direction would be appreciated.
Lua for-loops are, at least in my opinion, pretty easy to understand:
for i = 1, 10 do
print(i)
end
This loop inclusively prints the positive integers 1 through 10.
Lua for-loops also take an optional third argument--which defaults to 1--that indicates the step of the loop:
for i = 1, 10, 2 do
print(i)
end
This loop prints the numbers 1 through 10 but skips every other number, that is, it has a step of 2; therefore, it will print 1 3 5 7 9.
In the case of your example, if I understand it correctly, it seems that you know the minimum and maximum bounds of your for loops, which are 1 and 64, respectively. You could write a loop to decode the values and put them in a table like so:
local a = {}
for i = 1, 64 do
a[decodevar.var1[i].aId] = {decode.var2[i].bId, decode.var3[i].cId}
end
What you can do is generating a new table with all the contents from the decoded JSON with a for loop.
For example,
function jsonParse(jsonObj)
local tbl = {}
for i = 1, 64 do
a[decodevar.var1[i].aId] = {decode.var2[i].bId, decode.var3[i].cId}
end
return tbl
end
To deal with nested cases, you can recursively call that method as follows
function jsonParse(jsonObj)
local tbl = {}
for i = 1, 64 do
a[decodevar.var1[i].aId] = {decode.var2[i].bId, decode.var3[i].cId}
if type(decode.var2[i].bId) == "table" then
a[decodevar.var1[i].aid[0] = jsonParse(decode.var2[i].bId)
end
end
end
By the way, I can't understand why are you trying to create a table using a table that have done the job you want already. I assume they are just random and you may have to edit the code with the structure of the decodevar variable you have

Sorting a Lua table by key

I have gone through many questions and Google results but couldn't find the solution.
I am trying to sort a table using table.sort function in Lua but I can't figure out how to use it.
I have a table that has keys as random numeric values. I want to sort them in ascending order. I have gone through the Lua wiki page also but table.sort only works with the table values.
t = { [223]="asd", [23]="fgh", [543]="hjk", [7]="qwe" }
I want it like:
t = { [7]="qwe", [23]="fgh", [223]="asd", [543]="hjk" }
You cannot set the order in which the elements are retrieved from the hash (which is what your table is) using pairs. You need to get the keys from that table, sort the keys as its own table, and then use those sorted keys to retrieve the values from your original table:
local t = { [223]="asd", [23]="fgh", [543]="hjk", [7]="qwe" }
local tkeys = {}
-- populate the table that holds the keys
for k in pairs(t) do table.insert(tkeys, k) end
-- sort the keys
table.sort(tkeys)
-- use the keys to retrieve the values in the sorted order
for _, k in ipairs(tkeys) do print(k, t[k]) end
This will print
7 qwe
23 fgh
223 asd
543 hjk
Another option would be to provide your own iterator instead of pairs to iterate the table in the order you need, but the sorting of the keys may be simple enough for your needs.
What was said by #lhf is true, your lua table holds its contents in whatever order the implementation finds feasible. However, if you want to print (or iterate over it) in a sorted manner, it is possible (so you can compare it element by element). To achieve this, you can do it in the following way
for key, value in orderedPairs(mytable) do
print(string.format("%s:%s", key, value))
end
Unfortunately, orderedPairs is not provided as a part of lua, you can copy the implementation from here though.
The Lua sort docs provide a good solution
local function pairsByKeys (t, f)
local a = {}
for n in pairs(t) do table.insert(a, n) end
table.sort(a, f)
local i = 0 -- iterator variable
local iter = function () -- iterator function
i = i + 1
if a[i] == nil then return nil
else return a[i], t[a[i]]
end
end
return iter
end
Then you traverse the sorted structure
local t = { b=1, a=2, z=55, c=0, qa=53, x=8, d=7 }
for key,value in pairsByKeys(t) do
print(" " .. tostring(key) .. "=" .. tostring(value))
end
There is no notion of order in Lua tables: they are just sets of key-value pairs.
The two tables below have exactly the same contents because they contain exactly the same pairs:
t = { [223] = "asd" ,[23] = "fgh",[543]="hjk",[7]="qwe"}
t = {[7]="qwe",[23] = "fgh",[223] = "asd" ,[543]="hjk"}

Random iteration to fill a table in Lua

I'm attempting to fill a table of 26 values randomly. That is, I have a table called rndmalpha, and I want to randomly insert the values throughout the table. This is the code I have:
rndmalpha = {}
for i= 1, 26 do
rndmalpha[i] = 0
end
valueadded = 0
while valueadded = 0 do
a = math.random(1,26)
if rndmalpha[a] == 0 then
rndmalpha[a] = "a"
valueadded = 1
end
end
while valueadded = 0 do
a = math.random(1,26)
if rndmalpha[a] == 0 then
rndmalpha[a] = "b"
valueadded = 1
end
end
...
The code repeats itself until "z", so this is just a general idea. The problem I'm running into, however, is as the table gets filled, the random hits less. This has potential to freeze up the program, especially in the final letters because there are only 2-3 numbers that have 0 as a value. So, what happens if the while loop goes through a million calls before it finally hits that last number? Is there an efficient way to say, "Hey, disregard positions 6, 13, 17, 24, and 25, and focus on filling the others."? For that matter, is there a much more efficient way to do what I'm doing overall?
The algorithm you are using seems pretty non-efficient, it seems to me that all you need is to initialize a table with all alphabet:
math.randomseed(os.time())
local t = {"a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"}
and Then shuffle the elements:
for i = 1, #t*2 do
local a = math.random(#t)
local b = math.random(#t)
t[a],t[b] = t[b],t[a]
end
Swapping the elements for #t*2 times gives randomness pretty well. If you need more randomness, increase the number of shuffling, and use a better random number generator. The random() function provided by the C library is usually not that good.
Instead of randoming for each letter, go through the table once and get something random per position. The method you're using could take forever because you might never hit it.
Never repeat yourself. Never repeat yourself! If you're copy and pasting too often, it's a sure sign something has gone wrong. Use a second table to contain all the possible letters you can choose, and then randomly pick from that.
letters = {"a","b","c","d","e"}
numberOfLetters = 5
rndmalpha = {}
for i in 1,26 do
rndmalpha[i] = letters[math.random(1,numberOfLetters)]
end

What is the pythonic way to detect the last element in a 'for' loop?

How can I treat the last element of the input specially, when iterating with a for loop? In particular, if there is code that should only occur "between" elements (and not "after" the last one), how can I structure the code?
Currently, I write code like so:
for i, data in enumerate(data_list):
code_that_is_done_for_every_element
if i != len(data_list) - 1:
code_that_is_done_between_elements
How can I simplify or improve this?
Most of the times it is easier (and cheaper) to make the first iteration the special case instead of the last one:
first = True
for data in data_list:
if first:
first = False
else:
between_items()
item()
This will work for any iterable, even for those that have no len():
file = open('/path/to/file')
for line in file:
process_line(line)
# No way of telling if this is the last line!
Apart from that, I don't think there is a generally superior solution as it depends on what you are trying to do. For example, if you are building a string from a list, it's naturally better to use str.join() than using a for loop “with special case”.
Using the same principle but more compact:
for i, line in enumerate(data_list):
if i > 0:
between_items()
item()
Looks familiar, doesn't it? :)
For #ofko, and others who really need to find out if the current value of an iterable without len() is the last one, you will need to look ahead:
def lookahead(iterable):
"""Pass through all values from the given iterable, augmented by the
information if there are more values to come after the current one
(True), or if it is the last value (False).
"""
# Get an iterator and pull the first value.
it = iter(iterable)
last = next(it)
# Run the iterator to exhaustion (starting from the second value).
for val in it:
# Report the *previous* value (more to come).
yield last, True
last = val
# Report the last value.
yield last, False
Then you can use it like this:
>>> for i, has_more in lookahead(range(3)):
... print(i, has_more)
0 True
1 True
2 False
Although that question is pretty old, I came here via google and I found a quite simple way: List slicing. Let's say you want to put an '&' between all list entries.
s = ""
l = [1, 2, 3]
for i in l[:-1]:
s = s + str(i) + ' & '
s = s + str(l[-1])
This returns '1 & 2 & 3'.
if the items are unique:
for x in list:
#code
if x == list[-1]:
#code
other options:
pos = -1
for x in list:
pos += 1
#code
if pos == len(list) - 1:
#code
for x in list:
#code
#code - e.g. print x
if len(list) > 0:
for x in list[:-1]:
#process everything except the last element
for x in list[-1:]:
#process only last element
The 'code between' is an example of the Head-Tail pattern.
You have an item, which is followed by a sequence of ( between, item ) pairs. You can also view this as a sequence of (item, between) pairs followed by an item. It's generally simpler to take the first element as special and all the others as the "standard" case.
Further, to avoid repeating code, you have to provide a function or other object to contain the code you don't want to repeat. Embedding an if statement in a loop which is always false except one time is kind of silly.
def item_processing( item ):
# *the common processing*
head_tail_iter = iter( someSequence )
head = next(head_tail_iter)
item_processing( head )
for item in head_tail_iter:
# *the between processing*
item_processing( item )
This is more reliable because it's slightly easier to prove, It doesn't create an extra data structure (i.e., a copy of a list) and doesn't require a lot of wasted execution of an if condition which is always false except once.
If you're simply looking to modify the last element in data_list then you can simply use the notation:
L[-1]
However, it looks like you're doing more than that. There is nothing really wrong with your way. I even took a quick glance at some Django code for their template tags and they do basically what you're doing.
you can determine the last element with this code :
for i,element in enumerate(list):
if (i==len(list)-1):
print("last element is" + element)
This is similar to Ants Aasma's approach but without using the itertools module. It's also a lagging iterator which looks-ahead a single element in the iterator stream:
def last_iter(it):
# Ensure it's an iterator and get the first field
it = iter(it)
prev = next(it)
for item in it:
# Lag by one item so I know I'm not at the end
yield 0, prev
prev = item
# Last item
yield 1, prev
def test(data):
result = list(last_iter(data))
if not result:
return
if len(result) > 1:
assert set(x[0] for x in result[:-1]) == set([0]), result
assert result[-1][0] == 1
test([])
test([1])
test([1, 2])
test(range(5))
test(xrange(4))
for is_last, item in last_iter("Hi!"):
print is_last, item
We can achieve that using for-else
cities = [
'Jakarta',
'Surabaya',
'Semarang'
]
for city in cities[:-1]:
print(city)
else:
print(' '.join(cities[-1].upper()))
output:
Jakarta
Surabaya
S E M A R A N G
The idea is we only using for-else loops until n-1 index, then after the for is exhausted, we access directly the last index using [-1].
You can use a sliding window over the input data to get a peek at the next value and use a sentinel to detect the last value. This works on any iterable, so you don't need to know the length beforehand. The pairwise implementation is from itertools recipes.
from itertools import tee, izip, chain
def pairwise(seq):
a,b = tee(seq)
next(b, None)
return izip(a,b)
def annotated_last(seq):
"""Returns an iterable of pairs of input item and a boolean that show if
the current item is the last item in the sequence."""
MISSING = object()
for current_item, next_item in pairwise(chain(seq, [MISSING])):
yield current_item, next_item is MISSING:
for item, is_last_item in annotated_last(data_list):
if is_last_item:
# current item is the last item
Is there no possibility to iterate over all-but the last element, and treat the last one outside of the loop? After all, a loop is created to do something similar to all elements you loop over; if one element needs something special, it shouldn't be in the loop.
(see also this question: does-the-last-element-in-a-loop-deserve-a-separate-treatment)
EDIT: since the question is more about the "in between", either the first element is the special one in that it has no predecessor, or the last element is special in that it has no successor.
I like the approach of #ethan-t, but while True is dangerous from my point of view.
data_list = [1, 2, 3, 2, 1] # sample data
L = list(data_list) # destroy L instead of data_list
while L:
e = L.pop(0)
if L:
print(f'process element {e}')
else:
print(f'process last element {e}')
del L
Here, data_list is so that last element is equal by value to the first one of the list. L can be exchanged with data_list but in this case it results empty after the loop. while True is also possible to use if you check that list is not empty before the processing or the check is not needed (ouch!).
data_list = [1, 2, 3, 2, 1]
if data_list:
while True:
e = data_list.pop(0)
if data_list:
print(f'process element {e}')
else:
print(f'process last element {e}')
break
else:
print('list is empty')
The good part is that it is fast. The bad - it is destructible (data_list becomes empty).
Most intuitive solution:
data_list = [1, 2, 3, 2, 1] # sample data
for i, e in enumerate(data_list):
if i != len(data_list) - 1:
print(f'process element {e}')
else:
print(f'process last element {e}')
Oh yes, you have already proposed it!
There is nothing wrong with your way, unless you will have 100 000 loops and wants save 100 000 "if" statements. In that case, you can go that way :
iterable = [1,2,3] # Your date
iterator = iter(iterable) # get the data iterator
try : # wrap all in a try / except
while 1 :
item = iterator.next()
print item # put the "for loop" code here
except StopIteration, e : # make the process on the last element here
print item
Outputs :
1
2
3
3
But really, in your case I feel like it's overkill.
In any case, you will probably be luckier with slicing :
for item in iterable[:-1] :
print item
print "last :", iterable[-1]
#outputs
1
2
last : 3
or just :
for item in iterable :
print item
print iterable[-1]
#outputs
1
2
3
last : 3
Eventually, a KISS way to do you stuff, and that would work with any iterable, including the ones without __len__ :
item = ''
for item in iterable :
print item
print item
Ouputs:
1
2
3
3
If feel like I would do it that way, seems simple to me.
Use slicing and is to check for the last element:
for data in data_list:
<code_that_is_done_for_every_element>
if not data is data_list[-1]:
<code_that_is_done_between_elements>
Caveat emptor: This only works if all elements in the list are actually different (have different locations in memory). Under the hood, Python may detect equal elements and reuse the same objects for them. For instance, for strings of the same value and common integers.
Google brought me to this old question and I think I could add a different approach to this problem.
Most of the answers here would deal with a proper treatment of a for loop control as it was asked, but if the data_list is destructible, I would suggest that you pop the items from the list until you end up with an empty list:
while True:
element = element_list.pop(0)
do_this_for_all_elements()
if not element:
do_this_only_for_last_element()
break
do_this_for_all_elements_but_last()
you could even use while len(element_list) if you don't need to do anything with the last element. I find this solution more elegant then dealing with next().
For me the most simple and pythonic way to handle a special case at the end of a list is:
for data in data_list[:-1]:
handle_element(data)
handle_special_element(data_list[-1])
Of course this can also be used to treat the first element in a special way .
Better late than never. Your original code used enumerate(), but you only used the i index to check if it's the last item in a list. Here's an simpler alternative (if you don't need enumerate()) using negative indexing:
for data in data_list:
code_that_is_done_for_every_element
if data != data_list[-1]:
code_that_is_done_between_elements
if data != data_list[-1] checks if the current item in the iteration is NOT the last item in the list.
Hope this helps, even nearly 11 years later.
if you are going through the list, for me this worked too:
for j in range(0, len(Array)):
if len(Array) - j > 1:
notLast()
Instead of counting up, you can also count down:
nrToProcess = len(list)
for s in list:
s.doStuff()
nrToProcess -= 1
if nrToProcess==0: # this is the last one
s.doSpecialStuff()
I will provide with a more elegant and robust way as follows, using unpacking:
def mark_last(iterable):
try:
*init, last = iterable
except ValueError: # if iterable is empty
return
for e in init:
yield e, True
yield last, False
Test:
for a, b in mark_last([1, 2, 3]):
print(a, b)
The result is:
1 True
2 True
3 False
If you are looping the List,
Using enumerate function is one of the best try.
for index, element in enumerate(ListObj):
# print(index, ListObj[index], len(ListObj) )
if (index != len(ListObj)-1 ):
# Do things to the element which is not the last one
else:
# Do things to the element which is the last one
Delay the special handling of the last item until after the loop.
>>> for i in (1, 2, 3):
... pass
...
>>> i
3
There can be multiple ways. slicing will be fastest. Adding one more which uses .index() method:
>>> l1 = [1,5,2,3,5,1,7,43]
>>> [i for i in l1 if l1.index(i)+1==len(l1)]
[43]
If you are happy to be destructive with the list, then there's the following.
We are going to reverse the list in order to speed up the process from O(n^2) to O(n), because pop(0) moves the list each iteration - cf. Nicholas Pipitone's comment below
data_list.reverse()
while data_list:
value = data_list.pop()
code_that_is_done_for_every_element(value)
if data_list:
code_that_is_done_between_elements(value)
else:
code_that_is_done_for_last_element(value)
This works well with empty lists, and lists of non-unique items.
Since it's often the case that lists are transitory, this works pretty well ... at the cost of destructing the list.
Assuming input as an iterator, here's a way using tee and izip from itertools:
from itertools import tee, izip
items, between = tee(input_iterator, 2) # Input must be an iterator.
first = items.next()
do_to_every_item(first) # All "do to every" operations done to first item go here.
for i, b in izip(items, between):
do_between_items(b) # All "between" operations go here.
do_to_every_item(i) # All "do to every" operations go here.
Demo:
>>> def do_every(x): print "E", x
...
>>> def do_between(x): print "B", x
...
>>> test_input = iter(range(5))
>>>
>>> from itertools import tee, izip
>>>
>>> items, between = tee(test_input, 2)
>>> first = items.next()
>>> do_every(first)
E 0
>>> for i,b in izip(items, between):
... do_between(b)
... do_every(i)
...
B 0
E 1
B 1
E 2
B 2
E 3
B 3
E 4
>>>
The most simple solution coming to my mind is:
for item in data_list:
try:
print(new)
except NameError: pass
new = item
print('The last item: ' + str(new))
So we always look ahead one item by delaying the the processing one iteration. To skip doing something during the first iteration I simply catch the error.
Of course you need to think a bit, in order for the NameError to be raised when you want it.
Also keep the `counstruct
try:
new
except NameError: pass
else:
# continue here if no error was raised
This relies that the name new wasn't previously defined. If you are paranoid you can ensure that new doesn't exist using:
try:
del new
except NameError:
pass
Alternatively you can of course also use an if statement (if notfirst: print(new) else: notfirst = True). But as far as I know the overhead is bigger.
Using `timeit` yields:
...: try: new = 'test'
...: except NameError: pass
...:
100000000 loops, best of 3: 16.2 ns per loop
so I expect the overhead to be unelectable.
Count the items once and keep up with the number of items remaining:
remaining = len(data_list)
for data in data_list:
code_that_is_done_for_every_element
remaining -= 1
if remaining:
code_that_is_done_between_elements
This way you only evaluate the length of the list once. Many of the solutions on this page seem to assume the length is unavailable in advance, but that is not part of your question. If you have the length, use it.
One simple solution that comes to mind would be:
for i in MyList:
# Check if 'i' is the last element in the list
if i == MyList[-1]:
# Do something different for the last
else:
# Do something for all other elements
A second equally simple solution could be achieved by using a counter:
# Count the no. of elements in the list
ListLength = len(MyList)
# Initialize a counter
count = 0
for i in MyList:
# increment counter
count += 1
# Check if 'i' is the last element in the list
# by using the counter
if count == ListLength:
# Do something different for the last
else:
# Do something for all other elements
Just check if data is not the same as the last data in data_list (data_list[-1]).
for data in data_list:
code_that_is_done_for_every_element
if data != data_list[- 1]:
code_that_is_done_between_elements
So, this is definitely not the "shorter" version - and one might digress if "shortest" and "Pythonic" are actually compatible.
But if one needs this pattern often, just put the logic in to a
10-liner generator - and get any meta-data related to an element's
position directly on the for call. Another advantage here is that it will
work wit an arbitrary iterable, not only Sequences.
_sentinel = object()
def iter_check_last(iterable):
iterable = iter(iterable)
current_element = next(iterable, _sentinel)
while current_element is not _sentinel:
next_element = next(iterable, _sentinel)
yield (next_element is _sentinel, current_element)
current_element = next_element
In [107]: for is_last, el in iter_check_last(range(3)):
...: print(is_last, el)
...:
...:
False 0
False 1
True 2
This is an old question, and there's already lots of great responses, but I felt like this was pretty Pythonic:
def rev_enumerate(lst):
"""
Similar to enumerate(), but counts DOWN to the last element being the
zeroth, rather than counting UP from the first element being the zeroth.
Since the length has to be determined up-front, this is not suitable for
open-ended iterators.
Parameters
----------
lst : Iterable
An iterable with a length (list, tuple, dict, set).
Yields
------
tuple
A tuple with the reverse cardinal number of the element, followed by
the element of the iterable.
"""
length = len(lst) - 1
for i, element in enumerate(lst):
yield length - i, element
Used like this:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
if not num_remaining:
print(f'This is the last item in the list: {item}')
Or perhaps you'd like to do the opposite:
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
if num_remaining:
print(f'This is NOT the last item in the list: {item}')
Or, just to know how many remain as you go...
for num_remaining, item in rev_enumerate(['a', 'b', 'c']):
print(f'After {item}, there are {num_remaining} items.')
I think the versatility and familiarity with the existing enumerate makes it most Pythonic.
Caveat, unlike enumerate(), rev_enumerate() requires that the input implement __len__, but this includes lists, tuples, dicts and sets just fine.

Can't sort table with associative indexes

Why I can't use table.sort to sort tables with associative indexes?
In general, Lua tables are pure associative arrays. There is no "natural" order other than the as a side effect of the particular hash table implementation used in the Lua core. This makes sense because values of any Lua data type (other than nil) can be used as both keys and values; but only strings and numbers have any kind of sensible ordering, and then only between values of like type.
For example, what should the sorted order of this table be:
unsortable = {
answer=42,
true="Beauty",
[function() return 17 end] = function() return 42 end,
[math.pi] = "pi",
[ {} ] = {},
12, 11, 10, 9, 8
}
It has one string key, one boolean key, one function key, one non-integral key, one table key, and five integer keys. Should the function sort ahead of the string? How do you compare the string to a number? Where should the table sort? And what about userdata and thread values which don't happen to appear in this table?
By convention, values indexed by sequential integers beginning with 1 are commonly used as lists. Several functions and common idioms follow this convention, and table.sort is one example. Functions that operate over lists usually ignore any values stored at keys that are not part of the list. Again, table.sort is an example: it sorts only those elements that are stored at keys that are part of the list.
Another example is the # operator. For the above table, #unsortable is 5 because unsortable[5] ~= nil and unsortable[6] == nil. Notice that the value stored at the numeric index math.pi is not counted even though pi is between 3 and 4 because it is not an integer. Furthermore, none of the other non-integer keys are counted either. This means that a simple for loop can iterate over the entire list:
for i in 1,#unsortable do
print(i,unsortable[i])
end
Although that is often written as
for i,v in ipairs(unsortable) do
print(i,v)
end
In short, Lua tables are unordered collections of values, each indexed by a key; but there is a special convention for sequential integer keys beginning at 1.
Edit: For the special case of non-integral keys with a suitable partial ordering, there is a work-around involving a separate index table. The described content of tables keyed by string values is a suitable example for this trick.
First, collect the keys in a new table, in the form of a list. That is, make a table indexed by consecutive integers beginning at 1 with keys as values and sort that. Then, use that index to iterate over the original table in the desired order.
For example, here is foreachinorder(), which uses this technique to iterate over all values of a table, calling a function for each key/value pair, in an order determined by a comparison function.
function foreachinorder(t, f, cmp)
-- first extract a list of the keys from t
local keys = {}
for k,_ in pairs(t) do
keys[#keys+1] = k
end
-- sort the keys according to the function cmp. If cmp
-- is omitted, table.sort() defaults to the < operator
table.sort(keys,cmp)
-- finally, loop over the keys in sorted order, and operate
-- on elements of t
for _,k in ipairs(keys) do
f(k,t[k])
end
end
It constructs an index, sorts it with table.sort(), then loops over each element in the sorted index and calls the function f for each one. The function f is passed the key and value. The sort order is determined by an optional comparison function which is passed to table.sort. It is called with two elements to compare (the keys to the table t in this case) and must return true if the first is less than the second. If omitted, table.sort uses the built-in < operator.
For example, given the following table:
t1 = {
a = 1,
b = 2,
c = 3,
}
then foreachinorder(t1,print) prints:
a 1
b 2
c 3
and foreachinorder(t1,print,function(a,b) return a>b end) prints:
c 3
b 2
a 1
You can only sort tables with consecutive integer keys starting at 1, i.e., lists. If you have another table of key-value pairs, you can make a list of pairs and sort that:
function sortpairs(t, lt)
local u = { }
for k, v in pairs(t) do table.insert(u, { key = k, value = v }) end
table.sort(u, lt)
return u
end
Of course this is useful only if you provide a custom ordering (lt) which expects as arguments key/value pairs.
This issue is discussed at greater length in a related question about sorting Lua tables.
Because they don't have any order in the first place. It's like trying to sort a garbage bag full of bananas.

Resources