A function is given with a method to get the next integer from a stream of integers. The numbers are fetched sequentially from the stream. How will we go about producing a summary of integers encountered till now?
Given a list of numbers, the summary will consist of the ranges of numbers. Example: The list till now = [1,5,4,2,7] then summary = [[1-2],[4-5],7]
Consecutive numbers are grouped into a range.
My Thoughts:
Approach 1:
Maintain the sorted numbers. So when we fetch a new number from a stream, we can use binary search to find the location of the number in the list and insert the element so that the resulting list is sorted. But since this is a list, I think inserting the element will be an O(N) operation.
Approach 2:
Use a balanced binary search tree such as a red-black tree or an AVL tree. Each insertion will be O(log N), and an in-order traversal will yield the sorted sequence, from which one can compute the ranges in O(N).
Approach 2 looks like a better approach if I am not making any mistakes. I am unsure if there is a better way to solve this issue.
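As a point of reference, Approach 1 could be sketched as follows. This is only an illustration under the assumption that duplicates should be ignored; the names `insert_sorted` and `summarize` are made up for this sketch.

```python
from bisect import bisect_left

def insert_sorted(nums, x):
    # Binary search finds the position in O(log N),
    # but list.insert still shifts elements, so this is O(N) overall.
    i = bisect_left(nums, x)
    if i == len(nums) or nums[i] != x:  # skip duplicates (assumption)
        nums.insert(i, x)

def summarize(nums):
    # One pass over the sorted numbers, extending the last range
    # whenever the next number is exactly one larger.
    ranges = []
    for x in nums:
        if ranges and ranges[-1][1] == x - 1:
            ranges[-1][1] = x
        else:
            ranges.append([x, x])
    return ranges

nums = []
for x in [1, 5, 4, 2, 7]:
    insert_sorted(nums, x)
summary = summarize(nums)
```

Here a single number is reported as a one-element range such as `[7, 7]` rather than a bare `7`.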
I'd not keep the original numbers, but aggregate them to ranges on the fly. This has the potential to reduce the number of elements by quite some factor (depending on the ordering and distribution of the incoming values). The task itself seems to imply that you expect contiguous ranges of integers to appear quite frequently in the input.
Then a newly incoming number can fall into one of a few cases:
It is already contained in some range: then simply ignore the number (this is only relevant if duplicate inputs can happen).
It is adjacent to none of the ranges so far: create a new single-element range.
It is adjacent to exactly one range: extend that range by 1, downward or upward.
It is adjacent to two ranges (i.e. fills the gap): merge the two ranges.
For the data structure holding the ranges, you want a good performance for the following operations:
Find the place (position) for a given number.
Insert a new element (range) at a given place.
Merge two (neighbor) elements. This can be broken down into:
Remove an element at a given place.
Modify an element at a given place.
Depending on the expected number and sparsity of ranges, a sorted list of ranges might do. Otherwise, some kind of search tree might turn out helpful.
Anyway, start with the most readable approach, measure performance for typical cases, and decide whether some optimization is necessary.
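A minimal sketch of the "sorted list of ranges" variant, assuming the ranges are kept in two parallel sorted lists. The function name `add_number` and the `starts`/`ends` representation are choices made for this example, not part of the answer above.

```python
from bisect import bisect_right

def add_number(starts, ends, x):
    # starts[k]..ends[k] are disjoint, non-adjacent ranges in ascending order.
    i = bisect_right(starts, x)  # index of the first range that starts after x
    if i > 0 and ends[i - 1] >= x:
        return  # already contained in the range to the left
    touches_left = i > 0 and ends[i - 1] == x - 1
    touches_right = i < len(starts) and starts[i] == x + 1
    if touches_left and touches_right:
        # x fills the gap: merge the two neighboring ranges
        ends[i - 1] = ends[i]
        del starts[i]
        del ends[i]
    elif touches_left:
        ends[i - 1] = x  # extend upward
    elif touches_right:
        starts[i] = x  # extend downward
    else:
        starts.insert(i, x)  # new single-element range
        ends.insert(i, x)

starts, ends = [], []
for x in [1, 5, 4, 2, 7]:
    add_number(starts, ends, x)
summary = list(zip(starts, ends))
```

Lookup is O(log k) for k ranges, while inserting or deleting a range is O(k) because of the list shifts, which is the trade-off against a search tree mentioned above.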
I suggest maintaining a hashmap that maps each integer seen so far to the interval it belongs to.
Make sure that two numbers that are part of the same interval will point to the same interval object, not to copies; so that if you update an interval to extend it, all numbers can see it.
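To illustrate the point about sharing one interval object, here is a tiny sketch. It assumes intervals are stored as mutable two-element lists, so an in-place update is visible through every key.

```python
# Both keys refer to the same list object, not to copies.
interval = [3, 4]
intervals = {3: interval, 4: interval}

# Extending the interval through one key...
intervals[4][1] = 5
intervals[5] = intervals[4]

# ...is visible through every other key that shares the object.
shared = intervals[3]
```

If the values were tuples or copies, each key would have to be updated separately after every extension.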
All operations are O(1), except the operation "merge two intervals" that happens if the stream produces integer x when we have two intervals [a, x - 1] and [x + 1, b]. The cost of the merge operation is proportional to the length of the shorter of these two intervals.
As a result, for a stream of n integers, the algorithm's complexity is O(n) in the best-case (where at most a few big merges happen) and O(n log n) in the worst-case (when we keep merging lots of intervals).
In python:
def add_element(intervals, x):
    if x in intervals: # do not do anything
        pass
    elif x + 1 in intervals and x - 1 in intervals: # merge two intervals
        i = intervals[x - 1]
        j = intervals[x + 1]
        if i[1]-i[0] > j[1]-j[0]: # j is shorter: update i, and make everything in j point to i
            i[1] = j[1]
            for y in range(j[0] - 1, j[1]+1):
                intervals[y] = i
        else: # i is shorter: update j, and make everything in i point to j
            j[0] = i[0]
            for y in range(i[0], i[1] + 2):
                intervals[y] = j
    elif x + 1 in intervals: # extend one interval to the left
        i = intervals[x + 1]
        i[0] = x
        intervals[x] = i
    elif x - 1 in intervals: # extend one interval to the right
        i = intervals[x - 1]
        i[1] = x
        intervals[x] = i
    else: # add a singleton
        intervals[x] = [x,x]
    return intervals

from random import shuffle

def main():
    stream = list(range(10)) * 2
    shuffle(stream)
    print(stream)
    intervals = {}
    for x in stream:
        intervals = add_element(intervals, x)
        print(x)
        print(set(map(tuple, intervals.values()))) # this line terribly inefficient because I'm lazy

if __name__=='__main__':
    main()
Output:
[1, 5, 8, 3, 9, 6, 7, 9, 3, 0, 6, 5, 8, 1, 4, 7, 2, 2, 0, 4]
1
{(1, 1)}
5
{(1, 1), (5, 5)}
8
{(8, 8), (1, 1), (5, 5)}
3
{(8, 8), (1, 1), (5, 5), (3, 3)}
9
{(8, 9), (1, 1), (5, 5), (3, 3)}
6
{(3, 3), (1, 1), (8, 9), (5, 6)}
7
{(5, 9), (1, 1), (3, 3)}
9
{(5, 9), (1, 1), (3, 3)}
3
{(5, 9), (1, 1), (3, 3)}
0
{(0, 1), (5, 9), (3, 3)}
6
{(0, 1), (5, 9), (3, 3)}
5
{(0, 1), (5, 9), (3, 3)}
8
{(0, 1), (5, 9), (3, 3)}
1
{(0, 1), (5, 9), (3, 3)}
4
{(0, 1), (3, 9)}
7
{(0, 1), (3, 9)}
2
{(0, 9)}
2
{(0, 9)}
0
{(0, 9)}
4
{(0, 9)}
You could use a Disjoint Set Forest implementation for this. If well-implemented, it gives a near linear time complexity for inserting 𝑛 elements into it. The amortized running time of each insert operation is Θ(α(𝑛)) where α(𝑛) is the inverse Ackermann function. For all practical purposes we can not distinguish this from O(1).
The extraction of the ranges can have a time complexity of O(𝑘), where 𝑘 is the number of ranges, provided that the disjoint set maintains the set of root nodes. If the ranges need to be sorted, then this extraction will have a time complexity of O(𝑘log𝑘), as it will then just perform the sort-operation on it.
Here is an implementation in Python:
class Node:
    def __init__(self, value):
        self.low = value
        self.parent = self
        self.size = 1

    def find(self): # Union-Find: Path splitting
        node = self
        while node.parent is not node:
            # Targets are assigned left to right, so node.parent must be
            # assigned before node is rebound to its old parent.
            node.parent, node = node.parent.parent, node.parent
        return node

class Ranges:
    def __init__(self):
        self.nums = dict()
        self.roots = set()

    def union(self, a, b): # Union-Find: Size-based merge
        a = a.find()
        b = b.find()
        if a is not b:
            if a.size > b.size:
                a, b = b, a
            self.roots.remove(a) # Keep track of roots
            a.parent = b
            b.low = min(a.low, b.low)
            b.size = a.size + b.size

    def add(self, n):
        if n not in self.nums:
            self.nums[n] = node = Node(n)
            self.roots.add(node)
            if (n+1) in self.nums:
                self.union(node, self.nums[n+1])
            if (n-1) in self.nums:
                self.union(node, self.nums[n-1])

    def get(self):
        return sorted((node.low, node.low + node.size - 1) for node in self.roots)

# example run
ranges = Ranges()
for n in 4, 7, 1, 6, 2, 9, 5:
    ranges.add(n)
print(ranges.get()) # [(1, 2), (4, 7), (9, 9)]
Given A and B, which are two interval lists. A has no overlap inside A and B has no overlap inside B. In A, the intervals are sorted by their starting points. In B, the intervals are sorted by their starting points. How do you merge the two interval lists and output the result with no overlap?
One method is to concatenate the two lists, sort by the starting point, and apply merge intervals as discussed at https://www.geeksforgeeks.org/merging-intervals/. Is there a more efficient method?
Here is an example:
A: [1,5], [10,14], [16,18]
B: [2,6], [8,10], [11,20]
The output:
[1,6], [8, 20]
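For comparison, the baseline mentioned in the question (concatenate, sort by starting point, then merge) could look roughly like this. It is a sketch that treats intervals as closed, so intervals that merely touch are merged; the name `merge_by_sorting` is made up for the example.

```python
def merge_by_sorting(a, b):
    merged = []
    for lo, hi in sorted(a + b):  # O((m + n) log(m + n)) because of the sort
        if merged and lo <= merged[-1][1]:
            # Overlaps (or touches) the last output interval: extend it.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return merged

A = [[1, 5], [10, 14], [16, 18]]
B = [[2, 6], [8, 10], [11, 20]]
merged = merge_by_sorting(A, B)
```

The sort ignores the fact that both inputs are already ordered, which is where a more efficient method can save work.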
So you have two sorted lists of events: entering an interval and leaving an interval.
Merge these lists, keeping the current state as an integer 0, 1, or 2 (the number of active intervals):
Get the next (smallest) coordinate from both lists.
    If it is an entering event:
        Increment the state.
        If the state becomes 1, start a new output interval.
    If it is a closing event:
        Decrement the state.
        If the state becomes 0, close the current output interval.
Note that this algorithm is similar to the one used for finding intersections.
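The event-counting merge described above can be sketched in Python as follows. It assumes that intervals within each list are strictly separated (one interval ends before the next begins) and that intervals from different lists that share an endpoint should be joined; `union_sorted` is a name chosen for this illustration.

```python
from heapq import merge

ENTER, LEAVE = 0, 1  # ENTER sorts before LEAVE at the same coordinate

def events(intervals):
    # Turn a sorted, non-overlapping interval list into a sorted event stream.
    for lo, hi in intervals:
        yield (lo, ENTER)
        yield (hi, LEAVE)

def union_sorted(a, b):
    result = []
    active = 0    # number of currently open intervals: 0, 1, or 2
    start = None
    for coord, kind in merge(events(a), events(b)):  # linear-time merge
        if kind == ENTER:
            active += 1
            if active == 1:
                start = coord
        else:
            active -= 1
            if active == 0:
                result.append([start, coord])
    return result

A = [[1, 5], [10, 14], [16, 18]]
B = [[2, 6], [8, 10], [11, 20]]
united = union_sorted(A, B)
```

Because both event streams are already sorted, the whole pass is linear in the total number of intervals, with no sorting step.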
Here is a different approach, in the spirit of the answer to the question of overlaps.
<!--code lang=scala-->
def findUnite (l1: List[Interval], l2: List[Interval]): List[Interval] = (l1, l2) match {
  case (Nil, Nil) => Nil
  case (as, Nil) => as
  case (Nil, bs) => bs
  case (a :: as, b :: bs) => {
    if (a.lower > b.upper) b :: findUnite (l1, bs)
    else if (a.upper < b.lower) a :: findUnite (as, l2)
    else if (a.upper > b.upper) findUnite (a.union (b).get :: as, bs)
    else findUnite (as, a.union (b).get :: bs)
  }
}
If both lists are empty - return the empty list.
If only one is empty, return the other.
If the upper bound of one interval is below the lower bound of the other, no unification is possible, so emit the lower interval and proceed with the rest.
If they overlap, don't emit anything yet, but call the method recursively, with the union on the side of the interval that reaches further and without the consumed interval that reaches less far.
The union method looks similar to the one which does the overlap:
<!--code scala-->
case class Interval (lower: Int, upper: Int) {
  // from former question, to compare
  def overlap (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.max (lower, other.lower), Math.min (upper, other.upper)))
  }
  def union (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.min (lower, other.lower), Math.max (upper, other.upper)))
  }
}
The test for non overlap is the same. But min and max have changed places.
So for (2, 4) (3, 5) the overlap is (3, 4), the union is (2, 5).
      lower  upper
      ____________
          2      4
          3      5
      ____________
min       2      4
max       3      5
Table of min/max lower/upper.
<!--code lang='scala'-->
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
findUnite (e, f)
// res3: List[Interval] = List(Interval(0,4), Interval(6,12))
Now for the tricky or unclear case from above:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (5, 8), Interval (9, 11))
findUnite (e, f)
// res6: List[Interval] = List(Interval(0,4), Interval(5,12))
0-4 and 5-8 don't overlap, so they form two different results which don't get merged.
A simple solution could be to deflate all elements, put them into a set, sort it, and then iterate to transform adjacent elements into Intervals.
A similar approach could be chosen for your other question, just eliminating all distinct values to get the overlaps.
But - there is a problem with that approach.
Let's define a class Interval:
case class Interval (lower: Int, upper: Int) {
  def deflate () : List [Int] = {(lower to upper).toList}
}
and use it:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
deflating:
e.map (_.deflate)
// res26: List[List[Int]] = List(List(0, 1, 2, 3, 4), List(7, 8, 9, 10, 11, 12))
f.map (_.deflate)
// res27: List[List[Int]] = List(List(1, 2, 3), List(6, 7, 8), List(9, 10, 11))
The ::: combines two Lists, here two Lists of Lists, which is why we have to flatten the result, to make one big List:
(res26 ::: res27).flatten
// res28: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 1, 2, 3, 6, 7, 8, 9, 10, 11)
With distinct, we remove duplicates:
(res26 ::: res27).flatten.distinct
// res29: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 6)
And then we sort it:
(res26 ::: res27).flatten.distinct.sorted
// res30: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
All in one command chain:
val united = ((e.map (_.deflate) ::: f.map (_.deflate)).flatten.distinct).sorted
// united: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
// ^ (Gap)
Now we have to find the gaps like the one between 4 and 6 and return two distinct Lists.
We go recursively through the input list l. If the current element is 1 bigger than the last element collected so far, we add it to the sofar list. Otherwise we return the sofar list as a partial result, followed by the split of the rest, with a List of just the current element as the new sofar collection. In the beginning, sofar is empty, so we can start right away by putting the first element into that list and splitting the tail with it.
def split (l: List [Int], sofar: List[Int]): List[List[Int]] = l match {
  case Nil => List (sofar)
  case h :: t => if (sofar.isEmpty) split (t, List (h)) else
    if (h == sofar.head + 1) split (t, h :: sofar)
    else sofar :: split (t, List (h))
}
// Nil is the empty list, we hand in for initialization
split (united, Nil)
// List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6))
Converting the Lists into intervals would be a trivial task - take the first and last element, and voila!
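As a small illustration of that last step, here is a Python sketch (not the Scala used above) that turns the collected runs into interval bounds. It assumes each run contains consecutive integers in any order; `runs_to_intervals` is a hypothetical helper name.

```python
def runs_to_intervals(runs):
    # A run of consecutive integers is fully described by its extremes,
    # regardless of whether it was collected in ascending or descending order.
    return [(min(run), max(run)) for run in runs]

runs = [[4, 3, 2, 1, 0], [12, 11, 10, 9, 8, 7, 6]]
intervals = runs_to_intervals(runs)
```

Since `split` builds each run in reverse, taking the last and first element of each run would give the same bounds without scanning the run.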
But there is a problem with that approach. You may have noticed that I redefined your A and B (from the former question). In B, I changed the second element from 5-8 to 6-8, because otherwise it would merge with the 0-4 from A: 4 and 5 are direct neighbors, so why not combine them into one big interval?
But maybe it is supposed to work this way? For the above data:
split (united, Nil)
// List(List(6, 5, 4, 3, 2, 1), List(20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8))
For a given integer N, I want to generate an NxN matrix where the row sums are some permutation of the column sums.
For example:
3
0 2 3
4 0 1
1 3 0
row sums are 5, 5, 4
col sums are 5, 5, 4
both are permutations of each other.
How can I generate such a matrix for any given N?
PS:
I know that diagonal matrix, symmetric matrix would work here and the matrices like this
3
1 0 0
0 0 1
0 1 0
but I want to generate a matrix that looks somewhat random.
You could start with a matrix that fulfills the requirement but without the permutation aspect: so the sum for a particular row should equal the sum of the column with the same index. For example, the zero matrix would do.
Then randomly choose a set of columns. Iterate those columns, and choose the row to be the index of the previous column from that list (so the row will start out with the index of the last column in the list). This produces a cycle of elements such that if you increase the values of all of them with an equal constant, the sum-requirement is maintained. This constant can be 1 or any other integer (although 0 would not be very useful).
Repeat this as many times as you wish, until you feel it is scrambled enough. You could for instance decide to repeat this n² times.
Finally, you can shuffle the rows, to increase the randomness: the row sums now correspond with a permutation of the column sums.
Here is Python code:
import random

def increment(a):
    i = 1 # the increment that will be applied. Could also be random
    # choose a random list of distinct columns:
    perm = random.sample(range(len(a)), random.randint(1,len(a)-1))
    row = perm[-1]
    # cycle through them and increment the values to keep the balance
    for col in perm:
        a[row][col] += i
        row = col
    return a

### main ###
n = 7
# create square matrix with only zeroes
a = [[0 for i in range(n)] for j in range(n)]
# repeat the basic mutation that keeps the sum property intact:
for i in range(n*n): # as many times as you wish
    increment(a)
# shuffle the rows
random.shuffle(a)
A run produced this matrix:
[[6, 5, 7, 7, 5, 2, 1],
[6, 1, 7, 6, 2, 5, 1],
[6, 1, 0, 4, 3, 5, 4],
[6, 2, 5, 1, 6, 2, 4],
[1, 3, 4, 2, 8, 3, 6],
[1, 7, 0, 3, 3, 10, 1],
[1, 4, 2, 3, 1, 6, 1]]
I used this check just before the row shuffle to make sure the sum property was intact:
# test that indeed the sums are OK
def test(a):
    for i in range(len(a)):
        if sum(a[i]) != sum([a[j][i] for j in range(len(a))]):
            print('fail at ', i)
One method to get fairly random looking ones is as follows:
First create a random symmetric matrix. Such a matrix will have its row sums equal its column sums.
Note that if any two rows are swapped then its row sums are permuted but its column sums are left alone. Similarly if any two columns are swapped then its column sums are permuted but its row sums are left alone. Thus -- if you randomly swap random rows and swap random columns a large number of times, the row and column sums will be permutations of each other but the original symmetry will be hidden.
A Python proof of concept:
import random

def randSwapRows(matrix):
    i,j = random.sample(list(range(len(matrix))),2)
    matrix[i], matrix[j] = matrix[j], matrix[i]

def randSwapColumns(matrix):
    i,j = random.sample(list(range(len(matrix))),2)
    for row in matrix:
        row[i],row[j] = row[j],row[i]

def randSpecialMatrix(n):
    matrix = [[0]*n for i in range(n)]
    for i in range(n):
        for j in range(i,n):
            matrix[i][j] = random.randint(0,n-1)
            matrix[j][i] = matrix[i][j]
    #now swap a lot of random rows and columns:
    for i in range(n**2):
        randSwapRows(matrix)
        randSwapColumns(matrix)
    return matrix

#test:
matrix = randSpecialMatrix(5)
for row in matrix: print(row)
print('-'*15)
print('row sums: ' + ', '.join(str(sum(row)) for row in matrix))
print('col sums: ' + ', '.join(str(sum(column)) for column in zip(*matrix)))
Typical output:
[3, 2, 2, 0, 3]
[3, 1, 0, 2, 3]
[4, 1, 3, 3, 4]
[2, 0, 3, 3, 4]
[0, 0, 2, 1, 1]
---------------
row sums: 10, 9, 15, 12, 4
col sums: 12, 4, 10, 9, 15
Note that even though this is random looking it isn't really random in the sense of uniformly chosen from the set of all 5x5 matrices with entries in 0-4 which satisfy the desired property. Without a hit and miss approach of randomly generating matrices until you get such a matrix, I don't see any way to get uniform distribution.
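The hit-and-miss approach mentioned above could be sketched like this. The entry range `0..max_value`, the `max_tries` cutoff, and both function names are assumptions made for the illustration. Every candidate is drawn uniformly and accepted only if it has the property, so an accepted matrix is uniform over the valid ones, but the number of tries needed may grow quickly with n.

```python
import random

def sums_are_permutations(m):
    # Row sums and column sums must agree as multisets.
    return sorted(map(sum, m)) == sorted(map(sum, zip(*m)))

def rejection_sample(n, max_value, rng, max_tries=100_000):
    # Draw uniformly random n x n matrices until one has the property.
    for _ in range(max_tries):
        m = [[rng.randint(0, max_value) for _ in range(n)] for _ in range(n)]
        if sums_are_permutations(m):
            return m
    return None  # gave up: no valid matrix found within max_tries

sample = rejection_sample(3, 4, random.Random())
```

The helper `sums_are_permutations` can also be used on its own to check the output of the swap-based generator.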
For a given N, I need to create an N*N matrix with the values 1, 2, 3, ..., N that has no repetitions in its rows, columns, or its minor and major diagonals.
For N = 4 one of matrices is the following:
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
Problem overview
The mathematical structure you describe is a diagonal Latin square. Constructing one is more of a mathematical problem than an algorithmic or programming one.
To understand what it is and how to construct one, you should read the following articles:
Latin squares definition
Magic squares definition
Diagonal Latin square construction <-- p.2 is answer to your question with proof and with other interesting properties
Short answer
One possible way to construct a diagonal Latin square:
Let N be the order of the required matrix L.
If there exist numbers A and B in the range [0; N-1] which satisfy these properties:
A is relatively prime to N
B is relatively prime to N
(A + B) is relatively prime to N
(A - B) is relatively prime to N
Then you can create the required matrix with the following rule:
L[i][j] = (A * i + B * j) mod N
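A minimal sketch of that rule in Python. It searches for a suitable pair (A, B) by trial and adds 1 to every entry so the values run over 1..N as in the question; that shift and the function name are additions for this example. No such pair exists when N is even or divisible by 3, in which case the sketch returns `None`.

```python
from math import gcd

def linear_diagonal_latin_square(n):
    for a in range(1, n):
        for b in range(1, n):
            # a, b, a + b and a - b must all be coprime to n;
            # gcd(0, n) == n, so a == b is rejected automatically for n > 1.
            if all(gcd(v % n, n) == 1 for v in (a, b, a + b, a - b)):
                # Shift by 1 so the values are 1..n instead of 0..n-1.
                return [[(a * i + b * j) % n + 1 for j in range(n)]
                        for i in range(n)]
    return None  # no suitable (a, b) for this n

square = linear_diagonal_latin_square(5)
```

When this returns `None`, a diagonal Latin square may still exist (the N = 4 example in the question is one), so another construction or a search is needed for those orders.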
It would be nice to do this mathematically, but I'll propose the simplest algorithm that I can think of - brute force.
At a high level
we can represent a matrix as an array of arrays
for a given N, construct S, a set of arrays which contains every permutation of [1..N]. There will be N! of these.
using a recursive & iterative selection process (e.g. a search tree), search through all orders of these arrays until one of the 'uniqueness' rules is broken
For example, in your N = 4 problem, I'd construct
S = [
  [1,2,3,4], [1,2,4,3],
  [1,3,2,4], [1,3,4,2],
  [1,4,2,3], [1,4,3,2],
  [2,1,3,4], [2,1,4,3],
  [2,3,1,4], [2,3,4,1],
  [2,4,1,3], [2,4,3,1],
  [3,1,2,4], [3,1,4,2],
  // etc
]
R = new int[4][4]
Then the algorithm is something like
1. If R is 'full', you're done.
2. Evaluate whether the next row from S fits into R:
   if yes, insert it into R, reset the iterator on S, and go to 1.
   if no, increment the iterator on S.
3. If there are more rows to check in S, go to 2.
4. Else you've iterated across S and none of the rows fit, so remove the most recent row added to R and go to 1. In other words, explore another branch.
To improve the efficiency of this algorithm, implement a better data structure. Rather than a flat array of all combinations, use a prefix tree / Trie of some sort to both reduce the storage size of the 'options' and reduce the search area within each iteration.
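The brute-force search described above could be sketched recursively like this. It uses a flat list of permutations rather than the suggested trie, and the names `fits` and `search` are made up for the illustration.

```python
from itertools import permutations

def fits(rows, cand):
    # Would cand be a valid next row below the rows chosen so far?
    r, n = len(rows), len(cand)
    for k, prev in enumerate(rows):
        if any(p == c for p, c in zip(prev, cand)):
            return False  # repeats a value in some column
        if prev[k] == cand[r]:
            return False  # repeats a value on the major diagonal
        if prev[n - 1 - k] == cand[n - 1 - r]:
            return False  # repeats a value on the minor diagonal
    return True

def search(n):
    all_rows = list(permutations(range(1, n + 1)))  # the set S, N! rows

    def extend(rows):
        if len(rows) == n:
            return rows  # R is full
        for cand in all_rows:
            if fits(rows, cand):
                found = extend(rows + [cand])
                if found is not None:
                    return found
        return None  # nothing fits: backtrack

    return extend([])

result = search(4)
```

Each row is a permutation by construction, so only columns and the two diagonals need checking. The search returns `None` when no square of that order exists.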
Here's a method which is fast for N <= 9 : (python)
import random

def generate(n):
    a = [[0] * n for _ in range(n)]

    def rec(i, j):
        if i == n - 1 and j == n:
            return True
        if j == n:
            return rec(i + 1, 0)
        candidate = set(range(1, n + 1))
        for k in range(i):
            candidate.discard(a[k][j])
        for k in range(j):
            candidate.discard(a[i][k])
        if i == j:
            for k in range(i):
                candidate.discard(a[k][k])
        if i + j == n - 1:
            for k in range(i):
                candidate.discard(a[k][n - 1 - k])
        candidate_list = list(candidate)
        random.shuffle(candidate_list)
        for e in candidate_list:
            a[i][j] = e
            if rec(i, j + 1):
                return True
            a[i][j] = 0
        return False

    rec(0, 0)
    return a

for row in generate(9):
    print(row)
Output:
[8, 5, 4, 7, 1, 6, 2, 9, 3]
[2, 7, 5, 8, 4, 1, 3, 6, 9]
[9, 1, 2, 3, 6, 4, 8, 7, 5]
[3, 9, 7, 6, 2, 5, 1, 4, 8]
[5, 8, 3, 1, 9, 7, 6, 2, 4]
[4, 6, 9, 2, 8, 3, 5, 1, 7]
[6, 3, 1, 5, 7, 9, 4, 8, 2]
[1, 4, 8, 9, 3, 2, 7, 5, 6]
[7, 2, 6, 4, 5, 8, 9, 3, 1]
I encountered and solved this problem as part of a larger algorithm, but my solution seems inelegant and I would appreciate any insights.
I have a list of pairs which can be viewed as points on a Cartesian plane. I need to generate three lists: the sorted x values, the sorted y values, and a list which maps an index in the sorted x values with the index in the sorted y values corresponding to the y value with which it was originally paired.
A concrete example might help explain. Given the following list of points:
((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
The sorted list of x values would be (3, 4, 5, 7, 9, 15), and the sorted list of y values would be (0, 4, 7, 7, 11, 12).
Assuming a zero based indexing scheme, the list that maps the x list index to the index of its paired y list index would be (2, 3, 0, 4, 5, 1).
For example the value 7 appears as index 3 in the x list. The value in the mapping list at index 3 is 4, and the value at index 4 in the y list is 11, corresponding to the original pairing (7, 11).
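The relationship between the three lists can be checked with a few lines of Python, using the values from the example above (the variable name `recovered` is only for this illustration):

```python
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
xs = (3, 4, 5, 7, 9, 15)
ys = (0, 4, 7, 7, 11, 12)
index_list = (2, 3, 0, 4, 5, 1)

# Following the mapping from each sorted x to its y must give back
# exactly the original pairs (in x order rather than input order).
recovered = [(x, ys[k]) for x, k in zip(xs, index_list)]
assert sorted(recovered) == sorted(points)
```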
What is the simplest way of generating this mapping list?
Here's a simple O(n log n) method:
1. Sort the pairs by their x value: ((3, 7), (4, 7), (5, 0), (7, 11), (9, 12), (15, 4))
2. Produce a list of pairs in which the first component is the y value from the same position in the previous list and the second increases from 0: ((7, 0), (7, 1), (0, 2), (11, 3), (12, 4), (4, 5))
3. Sort this list by its first component (y value): ((0, 2), (4, 5), (7, 0), (7, 1), (11, 3), (12, 4))
4. Iterate through this list. For the ith such pair (y, k), set yFor[k] = i. yFor[] is your list (well, array) mapping indices in the sorted x list to indices in the sorted y list.
5. Create the sorted x list simply by dropping the 2nd component of each pair in the list produced in step 1.
6. Create the sorted y list by doing the same with the list produced in step 3.
I propose the following.
Generate the unsorted x and y lists.
xs = [3, 15, 7, 5, 4, 9 ]
ys = [7, 4, 11, 0, 7, 12]
Transform each element into a tuple - the first of the pair being the coordinate, the second being the original index.
xs = [(3, 0), (15, 1), ( 7, 2), (5, 3), (4, 4), ( 9, 5)]
ys = [(7, 0), ( 4, 1), (11, 2), (0, 3), (7, 4), (12, 5)]
Sort both lists.
xs = [(3, 0), (4, 4), (5, 3), (7, 2), ( 9, 5), (15, 1)]
ys = [(0, 3), (4, 1), (7, 0), (7, 4), (11, 2), (12, 5)]
Create an array, y_positions. The nth element of the array contains the current index of the y element that was originally at index n.
Create an empty index_list.
For each element of xs, get the original_index, the second element of the tuple.
Use y_positions to retrieve the current index of the y element with the given original_index. Add the current index to index_list.
Finally, remove the index values from xs and ys.
Here's a sample Python implementation.
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

#generate unsorted lists
xs, ys = zip(*points)

#pair each element with its index
#(in Python 3, zip returns an iterator, so materialize it as a list)
xs = list(zip(xs, range(len(xs))))
ys = list(zip(ys, range(len(ys))))

#sort
xs.sort()
ys.sort()

#generate the y positions list.
y_positions = [None] * len(ys)
for i in range(len(ys)):
    original_index = ys[i][1]
    y_positions[original_index] = i

#generate `index_list`
index_list = []
for x, original_index in xs:
    index_list.append(y_positions[original_index])

#remove tuples from x and y lists
xs = list(zip(*xs))[0]
ys = list(zip(*ys))[0]

print("xs:", xs)
print("ys:", ys)
print("index list:", index_list)
Output:
xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index list: [2, 3, 0, 4, 5, 1]
Generation of y_positions and index_list is O(n) time, so the complexity of the algorithm as a whole is dominated by the sorting step.
Thank you for the answers. For what it's worth, the solution I had was pretty similar to those outlined, but as j_random_hacker pointed out, there's no need for a map. It just struck me that this little problem seems more complicated than it appears at first glance and I was wondering if I was missing something obvious. I've rehashed my solution into Python for comparison.
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
N = len(points)
# Separate the points into their x and y components, tag the values with
# their index into the points list.
# Sort both resulting (value, tag) lists and then unzip them into lists of
# sorted x and y values and the tag information.
xs, s = zip(*sorted(zip([x for (x, y) in points], range(N))))
ys, r = zip(*sorted(zip([y for (x, y) in points], range(N))))
# Generate the mapping list.
t = N * [0]
for i in range(N):
    t[r[i]] = i
index_list = [t[j] for j in s]
print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)
Output:
xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index_list: [2, 3, 0, 4, 5, 1]
I've just understood what j_random_hacker meant by removing a level of indirection by sorting the points in x initially. That allows things to be tidied up nicely. Thanks.
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
N = len(points)
ordered_by_x = sorted(points)
ordered_by_y = sorted(zip([y for (x, y) in ordered_by_x], range(N)))
index_list = N * [0]
for i, (y, k) in enumerate(ordered_by_y):
    index_list[k] = i
xs = [x for (x, y) in ordered_by_x]
ys = [y for (y, k) in ordered_by_y]
print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)