find all combinations with non-overlapped regions - algorithm

Within a super-region S, there are k small subregions. The number k can be up to 200. There may be overlap between subregions. I have millions of regions S.
For each super-region, my goal is to find out all combinations in which there are 2 or more non-overlapped subregions.
Here is an example:
Super region: 1-100
Subregions: 1-8, 2-13, 9-18, 15-30, 20-35
Goal:
Combination1: 1-8, 9-18
Combination2: 1-8, 20-35
Combination3: 1-8, 9-18, 20-35
Combination4: 1-8, 15-30
...

The number of subsets can be exponential (up to 2^k), so there is nothing wrong with traversing all possible independent subsets recursively. I used a linear search to find the next possible interval, but it would be worth using binary search instead.
def nonovl(l, idx, right, ll):
    if idx == len(l):
        if ll:
            print(ll)
        return
    # find next non-overlapping interval without using l[idx]
    next = idx + 1
    while next < len(l) and right >= l[next][0]:
        next += 1
    nonovl(l, next, right, ll)
    # find next non-overlapping interval after using l[idx]
    next = idx + 1
    right = l[idx][1]
    while next < len(l) and right >= l[next][0]:
        next += 1
    nonovl(l, next, right, ll + str(l[idx]))

l = [(1,8),(2,13),(9,18),(15,30),(20,35)]
l.sort()
nonovl(l, 0, -1, "")
(20, 35)
(15, 30)
(9, 18)
(9, 18)(20, 35)
(2, 13)
(2, 13)(20, 35)
(2, 13)(15, 30)
(1, 8)
(1, 8)(20, 35)
(1, 8)(15, 30)
(1, 8)(9, 18)
(1, 8)(9, 18)(20, 35)
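The binary-search idea mentioned above could look roughly like the sketch below. It assumes closed intervals given as (start, end) tuples with start <= end, and it yields only combinations of two or more intervals, as the question asks. The name `non_overlapping_combinations` and the `min_size` parameter are made up for this illustration.

```python
from bisect import bisect_right

def non_overlapping_combinations(intervals, min_size=2):
    """Yield every combination of >= min_size pairwise non-overlapping intervals."""
    ivs = sorted(intervals)
    starts = [s for s, _ in ivs]

    def extend(first_allowed, chosen):
        if len(chosen) >= min_size:
            yield tuple(chosen)
        for i in range(first_allowed, len(ivs)):
            # Binary search: first interval that starts strictly after ivs[i] ends.
            nxt = bisect_right(starts, ivs[i][1])
            chosen.append(ivs[i])
            yield from extend(nxt, chosen)
            chosen.pop()

    yield from extend(0, [])

for combo in non_overlapping_combinations([(1, 8), (2, 13), (9, 18), (15, 30), (20, 35)]):
    print(combo)
```

Because the intervals are sorted by start, everything from index `nxt` onward starts after the chosen interval ends, so no overlap check is needed inside the loop. The output size is still exponential in the worst case; only the search for the next candidate gets cheaper.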

Related

Proving that a particular matrix exists

I found this problem in a programming forum Ohjelmointiputka:
https://www.ohjelmointiputka.net/postit/tehtava.php?tunnus=ahdruu and
https://www.ohjelmointiputka.net/postit/tehtava.php?tunnus=ahdruu2
Somebody said that there is a solution found by a computer, but I was unable to find a proof.
Prove that there is a matrix with 117 elements, each a digit, such that one can read the squares of the numbers 1, 2, ..., 100 from it.
Here "read" means that you fix a starting position and a direction (8 possibilities) and then move in that direction, concatenating the digits. For example, if you can find the digits 1,0,0,0,0,4 consecutively, you have found the integer 100004, which contains the squares of 1, 2, 10, 100 and 20, since you can read off 1, 4, 100, 10000, and 400 (reversed) from that sequence.
But there are so many numbers to be found (100 square numbers, to be precise, or 81 with a total of 312 digits if you remove those that are contained in another square number) and so few entries in the matrix that the squares have to be packed very densely, which makes finding such a matrix difficult, at least for me.
I found that if there is such an m x n matrix, we may assume without loss of generality that m <= n. Therefore, the matrix must be of size 1x117, 3x39 or 9x13. But what kind of algorithm will find the matrix?
I have managed to write a program that checks whether a number can be placed on the board. But how can I implement the search algorithm?
# -*- coding: utf-8 -*-
# Step (dx, dy) for each of the 8 directions.
DX = [-1, -1, -1, 0, 0, 1, 1, 1]
DY = [-1, 0, 1, -1, 1, -1, 0, 1]

# Returns -1 if the number cannot be placed. Otherwise returns how many cells
# it shares with digits already on the grid; a bigger value is better.
def can_put_on_grid(grid, number, start_x, start_y, direction):
    width, height = len(grid[0]), len(grid)
    # Check that the new number lies inside the grid.
    if not (0 <= start_x < width and 0 <= start_y < height):
        return -1
    end_x, end_y = end_coordinates(number, start_x, start_y, direction)
    if not (0 <= end_x < width and 0 <= end_y < height):
        return -1
    # Check that the new number does not conflict with any digit already placed.
    x = 0
    for i in range(len(number)):
        # The grid is a list of rows, so it is indexed as grid[y][x].
        cell = grid[start_y + DY[direction] * i][start_x + DX[direction] * i]
        if cell not in ("X", number[i]):
            return -1
        if cell == number[i]:
            x += 1
    return x

def end_coordinates(number, start_x, start_y, direction):
    # Use the same direction tables as can_put_on_grid so that the two agree.
    l = len(number)
    return (start_x + DX[direction] * (l - 1), start_y + DY[direction] * (l - 1))

if __name__ == "__main__":
    A = [['X' for x in range(13)] for y in range(9)]
    numbers = [str(i*i) for i in range(1, 101)]
    directions = [0, 1, 2, 3, 4, 5, 6, 7]
    for i in directions:
        C = can_put_on_grid(A, "10000", 3, 5, i)
        if C > -1:
            print("One can put the number to the grid!")
            exit(0)
I also think that brute-force search or best-first search is too slow. I think there might be a solution using simulated annealing, a genetic algorithm or a bin packing algorithm. I also wondered whether Markov chains could somehow be applied to find the grid. Unfortunately, those seem too hard for me to implement with my current skills.
There is a program for that at https://github.com/minkkilaukku/square-packing/blob/master/sqPackMB.py . Just set M=9 and N=13 on lines 20 and 21.
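Whatever search is used, a candidate grid can be checked against the reading rule from the question. The sketch below is only an illustration: the names `readable_strings` and `missing_squares` are hypothetical, and the grid is assumed to be a list of equal-length strings of digits.

```python
def readable_strings(grid):
    """Return every digit string readable in a straight line in any of 8 directions."""
    rows, cols = len(grid), len(grid[0])
    steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    found = set()
    for r in range(rows):
        for c in range(cols):
            found.add(grid[r][c])  # a single cell is readable on its own
            for dr, dc in steps:
                s = grid[r][c]
                rr, cc = r + dr, c + dc
                while 0 <= rr < rows and 0 <= cc < cols:
                    s += grid[rr][cc]
                    found.add(s)
                    rr += dr
                    cc += dc
    return found

def missing_squares(grid, limit=100):
    """List the n in 1..limit whose square cannot be read from the grid."""
    found = readable_strings(grid)
    return [n for n in range(1, limit + 1) if str(n * n) not in found]

# The 1x6 example from the question: 100004 contains the squares of 1, 2, 10, 20 and 100.
present = set(range(1, 101)) - set(missing_squares(["100004"]))
print(sorted(present))
```

A grid solves the problem when `missing_squares` returns an empty list. For a 9x13 grid the full enumeration is small, so this check is cheap enough to run after every candidate.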

KNN - Triangular Inequality Optimization

I don't fully understand how the triangular inequality is used to optimise distance calculations in KNN classification.
I wrote a Python script following the steps below:
1. Calculate the distance between each training pixel and every other training pixel.
2. For each test sample:
   a. Calculate the distance from the first training sample as dn. This is the current minimum distance.
   b. Calculate the distance from the second training sample (p) as dp.
   c. If dp < dn, assign dn = dp.
   d. For each remaining training sample (c):
      - If the distance between sample c and sample p, measured as dcp, satisfies
        dp - dn < dcp < dp + dn
        then calculate the distance from the test sample to sample c as dp, and if dp < dn, assign dn = dp.
      - Else, skip this training sample.
   e. Stop when there are no more training samples.
   f. The class to which n belongs is the estimate.
Python Script:
def get_distance(p1 = (0, 0), p2 = (0, 0)):
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])

def algorithm(train_set, new_point):
    d_n = get_distance(new_point, train_set[0])
    d_p = get_distance(new_point, train_set[1])
    min_index = 0
    if d_p < d_n:
        d_n = d_p
        min_index = 1
    for c in range(2, len(train_set)):
        dcp = get_distance(train_set[min_index], train_set[c])
        if d_p - d_n < dcp < d_p + d_n:
            d_p = get_distance(new_point, train_set[c])
            if d_p < d_n:
                d_n = d_p
                min_index = c
    print(train_set[min_index], d_n)

train_set = [
    (0, 1, 'A'),
    (1, 1, 'A'),
    (2, 5, 'B'),
    (1, 8, 'A'),
    (5, 3, 'C'),
    (4, 2, 'C'),
    (3, 2, 'A'),
    (1, 7, 'B'),
    (4, 8, 'B'),
    (4, 0, 'A'),
]

for new_point in train_set:
    # Checking the distances from the points within the training set itself: min distance = 0, used for validation
    result_point = min(train_set, key = lambda x : get_distance(x, new_point))
    print(result_point, get_distance(result_point, new_point))
    algorithm(train_set, new_point)
    print('----------')
But it doesn't give the expected result for one of the points.
Is my understanding of the optimization wrong?
Thank you in advance for any help.
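For comparison, here is one way the triangle inequality can be used so that skipping a sample is always safe. It is a sketch, not necessarily the method the steps above were taken from. If p is the current nearest sample with d(q, p) = d_best, then d(q, c) >= |d(p, c) - d_best| for any sample c, so c can be skipped whenever that lower bound is already at least d_best. The names `manhattan` and `nearest_with_pruning` are made up for this example, and the pairwise table is rebuilt on every call only to keep the sketch short; in practice it would be precomputed once.

```python
def manhattan(p1, p2):
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])

def nearest_with_pruning(train_set, query, dist=manhattan):
    """Return (nearest training sample, its distance), skipping provably worse samples."""
    # Pairwise training distances (normally computed once, outside this function).
    pairwise = [[dist(a, b) for b in train_set] for a in train_set]
    best = 0
    d_best = dist(query, train_set[0])
    for c in range(1, len(train_set)):
        # Triangle inequality: d(query, c) >= |d(best, c) - d(query, best)|.
        lower_bound = abs(pairwise[best][c] - d_best)
        if lower_bound >= d_best:
            continue  # c cannot be strictly closer than the current best
        d_c = dist(query, train_set[c])
        if d_c < d_best:
            best, d_best = c, d_c
    return train_set[best], d_best

train_set = [(0, 1, 'A'), (1, 1, 'A'), (2, 5, 'B'), (1, 8, 'A'), (5, 3, 'C'),
             (4, 2, 'C'), (3, 2, 'A'), (1, 7, 'B'), (4, 8, 'B'), (4, 0, 'A')]
print(nearest_with_pruning(train_set, (2, 2)))
```

The reference distance in the bound is always the distance from the query to the same sample whose row of the pairwise table is used. In the script above, d_p is overwritten with the distance to sample c while dcp is still measured from train_set[min_index], so the two sides of the inequality can refer to different samples.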

Skyline of Buildings

I'm trying to understand the skyline problem. Given n rectangular buildings, we need to compute the skyline. I have trouble understanding the output for this problem.
Input: { (1,11,5), (2,6,7), (3,13,9), (12,7,16), (14,3,25), (19,18,22), (23,13,29), (24,4,28) }
Output Skylines: (1, 11), (3, 13), (9, 0), (12, 7), (16, 3), (19, 18), (22, 3), (23, 13), (29, 0)
The output is a list of pairs (x-coordinate, height). Why is the third pair (9, 0)? Looking at the skyline graph, x = 9 has a height of 13, not 0. Why does it show 0? In other words, if we take only the first building (input (1,11,5)), the output is (1, 11), (5, 0). Can you explain why it is (5, 0) instead of (5, 11)?
Think of the rooftop intervals as closed on the left and open on the right.
Your output does not signify "at x the height is y", but rather "at x the height changes to y".
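The "height changes to y" reading can be made concrete with a small lookup. In this sketch, `height_at` is a hypothetical helper that treats each key point (x_i, h_i) as valid on the half-open interval [x_i, x_{i+1}); the skyline used is the one produced by the first three buildings of the question.

```python
from bisect import bisect_right

def height_at(skyline, x):
    """Height of the skyline at x, given key points sorted by x-coordinate."""
    xs = [px for px, _ in skyline]
    i = bisect_right(xs, x) - 1  # last key point at or to the left of x
    return skyline[i][1] if i >= 0 else 0

skyline = [(1, 11), (3, 13), (9, 0)]
print(height_at(skyline, 8))  # still under the roof of height 13
print(height_at(skyline, 9))  # the roof interval is open on the right, so the height is 0
```

So (9, 0) does not contradict the picture: just to the left of x = 9 the height is 13, and from x = 9 onward it is 0.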
Using the sweep line algorithm, here is my Python solution (each building is given as [left, right, height]):
class Solution:
    # @param {integer[][]} buildings
    # @return {integer[][]}
    def getSkyline(self, buildings):
        if len(buildings)==0: return []
        if len(buildings)==1: return [[buildings[0][0], buildings[0][2]], [buildings[0][1], 0]]
        points=[]
        for building in buildings:
            points+=[[building[0],building[2]]]
            points+=[[building[1],-building[2]]]
        points=sorted(points, key=lambda x: x[0])
        moving, active, res, current=0, [0], [],-1
        while moving<len(points):
            i=moving
            while i<=len(points):
                if i<len(points) and points[i][0]==points[moving][0]:
                    if points[i][1]>0:
                        active+=[points[i][1]]
                        if points[i][1]>current:
                            current=points[i][1]
                            if len(res)>0 and res[-1][0]==points[i][0]:
                                res[-1][1]=current
                            else:
                                res+=[[points[moving][0], current]]
                    else:
                        active.remove(-points[i][1])
                    i+=1
                else:
                    break
            if max(active)<current:
                current=max(active)
                res+=[[points[moving][0], current]]
            moving=i
        return res
static long largestRectangle(int[] h) {
    int k=1;
    int n=h.length;
    long max=0;
    while(k<=n){
        long area=0;
        for(int i=0;i<n-k+1;i++){
            long min=Long.MAX_VALUE;
            for(int j=i;j<i+k;j++){
                //System.out.print(h[j]+" ");
                min=Math.min(h[j],min);
            }
            // System.out.println();
            area=k*min;
            //System.out.println(area);
            max=Math.max(area,max);
        }
        //System.out.println(k);
        k++;
    }
    return max;
}

Find the maximum possible area

Given n non-negative integers a1, a2, ..., an, where each represents a
point at coordinate (i, ai). n vertical lines are drawn such that the
two endpoints of line i is at (i, ai) and (i, 0). Find two lines,
which together with x-axis forms a container, such that the container
contains the most water.
Note: You may not slant the container.
One solution is to pair each line with every other line and compute the area for each pair. This takes O(n^2), which is not time efficient.
Another solution could be to use DP to find the maximum area for every index, so that at index n we have the maximum area.
I think it's O(n).
Are there better solutions?
#include <algorithm>
#include <vector>
using namespace std;

int maxArea(vector<int> &height) {
    int ret = 0;
    int left = 0, right = height.size() - 1;
    while (left < right) {
        ret = max(ret, (right - left) * min(height[left], height[right]));
        if (height[left] <= height[right])
            left++;
        else
            right--;
    }
    return ret;
}
Many people here are mistaking this problem for the maximal rectangle problem, which it is not.
Solution
1. Delete all the elements aj such that ai >= aj <= ak and i < j < k. This can be done in linear time.
   a. Find the maximum value am.
   b. Let as = a1.
   c. For j = 2 through m-1, if as >= aj, delete aj, else as = aj.
   d. Let as = an.
   e. For j = n-1 through m+1, if as >= aj, delete aj, else as = aj.
2. Notice that the resulting values look like a pyramid, that is, all the elements to the left of the maximum are strictly increasing and those to the right are strictly decreasing.
3. Let i=1, j=n. m is the location of the max.
4. While i<=m and j>=m:
   a. Find the area between ai and aj and keep track of the max.
   b. If ai < aj, i+=1, else j-=1.
Complexity is linear (O(n)).
Here is an implementation with Java:
Basic idea is to use two pointers from front and back, and calculate the area along the way.
public int maxArea(int[] height) {
    int i = 0, j = height.length-1;
    int max = Integer.MIN_VALUE;
    while(i < j){
        int area = (j-i) * Math.min(height[i], height[j]);
        max = Math.max(max, area);
        if(height[i] < height[j]){
            i++;
        }else{
            j--;
        }
    }
    return max;
}
Here is a clean Python3 solution. The runtime for this solution is O(n). It is important to remember that the area formed between two lines is determined by the height of the shorter line and the distance between the lines.
def maxArea(height):
    """
    :type height: List[int]
    :rtype: int
    """
    left = 0
    right = len(height) - 1
    max_area = 0
    while left < right:
        temp_area = (right - left) * min(height[left], height[right])
        if temp_area > max_area:
            max_area = temp_area
        # Move the pointer at the shorter line inward on every iteration.
        if height[right] > height[left]:
            left = left + 1
        else:
            right = right - 1
    return max_area
This problem can be solved in linear time.
1. Construct a list of possible left walls (position+height pairs), in order from highest to lowest. This is done by taking the leftmost possible wall and adding it to the list, then going through all possible walls, from left to right, and taking every wall that is larger than the last wall added to the list. For example, for the array
   2 5 4 7 3 6 2 1 3
   your possible left walls would be (pairs are (pos, val)):
   (3, 7) (1, 5) (0, 2)
2. Construct a list of possible right walls in the same way, but going from right to left. For the above array the possible right walls would be:
   (3, 7) (5, 6) (8, 3)
3. Start your water level as high as possible, that is, at the minimum of the heights of the walls at the front of the two lists. Calculate the total volume of water using those walls (it might be negative or zero, but that is ok), then drop the water level by popping an element off one of the lists such that the water level drops the least. Calculate the possible water volume at each of these heights and take the max.
Running this algorithm on these lists would look like this:
L: (3, 7) (1, 5) (0, 2) # if we pop this one then our water level drops to 5
R: (3, 7) (5, 6) (8, 3) # so we pop this one since it will only drop to 6
Height = 7
Volume = (3 - 3) * 7 = 0
Max = 0
L: (3, 7) (1, 5) (0, 2) # we pop this one now so our water level drops to 5
R: (5, 6) (8, 3) # instead of 3, like if we popped this one
Height = 6
Volume = (5 - 3) * 6 = 12
Max = 12
L: (1, 5) (0, 2)
R: (5, 6) (8, 3)
Height = 5
Volume = (5 - 1) * 5 = 20
Max = 20
L: (1, 5) (0, 2)
R: (8, 3)
Height = 3
Volume = (8 - 1) * 3 = 21
Max = 21
L: (0, 2)
R: (8, 3)
Height = 2
Volume = (8 - 0) * 2 = 16
Max = 21
Steps 1, 2, and 3 all run in linear time, so the complete solution also takes linear time.
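The three steps can be sketched in Python as follows. This is one reading of the description, not the answerer's own code; the function name `max_area_by_walls` is made up, and the candidate lists are stored lowest wall first so that the current wall is always the last element.

```python
def max_area_by_walls(heights):
    """Largest container area using candidate left and right walls."""
    if len(heights) < 2:
        return 0
    # Step 1: candidate left walls, each strictly taller than everything before it.
    left = []
    for pos, h in enumerate(heights):
        if not left or h > left[-1][1]:
            left.append((pos, h))
    # Step 2: candidate right walls, found by the same scan from the right end.
    right = []
    for pos in range(len(heights) - 1, -1, -1):
        if not right or heights[pos] > right[-1][1]:
            right.append((pos, heights[pos]))
    # Step 3: lower the water level one wall at a time, tracking the best volume.
    best = 0
    while True:
        (lpos, lh), (rpos, rh) = left[-1], right[-1]
        best = max(best, (rpos - lpos) * min(lh, rh))
        next_left = left[-2][1] if len(left) > 1 else -1
        next_right = right[-2][1] if len(right) > 1 else -1
        if next_left < 0 and next_right < 0:
            return best
        # Pop the list whose next wall is taller, so the level drops the least.
        if next_left >= next_right:
            left.pop()
        else:
            right.pop()

print(max_area_by_walls([2, 5, 4, 7, 3, 6, 2, 1, 3]))  # 21, as in the trace above
```

Each wall is appended once and popped at most once, which is where the linear bound comes from.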
The best answer is by Black_Rider, but they did not provide an explanation.
I found a very clear explanation on this blog. In short, it goes as follows:
Given an array height of length n:
1. Start with the widest container you can, i.e. from the left side at 0 to the right side at n-1.
2. If a better container exists, it will be narrower, so both of its sides must be higher than the lower of the currently chosen sides.
3. So, change left to (left+1) if height[left] < height[right]; otherwise change right to (right-1).
4. Calculate the new area; if it is better than the best so far, replace it.
5. If left < right, start over from 2.
My implementation in C++:
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

// Defined before maxArea so that it is declared at its point of use.
inline int area(const vector<int>& height, const pair<int, int>& p) {
    return (p.second - p.first) * min(height[p.first], height[p.second]);
}

int maxArea(vector<int>& height) {
    auto current = make_pair(0, static_cast<int>(height.size()) - 1);
    auto bestArea = area(height, current);
    while (current.first < current.second) {
        current = height[current.first] < height[current.second]
            ? make_pair(current.first + 1, current.second)
            : make_pair(current.first, current.second - 1);
        auto nextArea = area(height, current);
        bestArea = max(bestArea, nextArea);
    }
    return bestArea;
}
This problem is a simpler version of the Maximal Rectangle Problem. The given situation can be viewed as a binary matrix. Consider the rows of the matrix as the X-axis and the columns as the Y-axis. For every element a[i] in the array, set
Matrix[i][0] = Matrix[i][1] = ..... = Matrix[i][a[i]-1] = 1
E.g., for a[] = { 5, 3, 7, 1}, the binary matrix is:
1111100
1110000
1111111
1000000

Algorithm to sort pairs of numbers

I am stuck with a problem and I need some help from the bright minds of SO.
I have N pairs of unsigned integers. I need to sort them. The resulting vector of pairs should be sorted in nondecreasing order by the first number in each pair and in nonincreasing order by the second number in each pair. The first and second elements of each pair can be swapped with each other. Sometimes there is no solution, in which case I need to throw an exception.
Example:
in pairs:
1 5
7 1
3 8
5 6
out pairs:
1 7 <-- swapped
1 5
6 5 <-- swapped
8 3 <-- swapped
^^ Without swapping pairs it is impossible to build the solution. So we swap pairs (7, 1), (3, 8) and (5, 6) and build the result.
or
in pairs:
1 5
6 9
out:
not possible
One more example that shows how 'sorting pairs' first isn't the solution.
in pairs:
1 4
2 5
out pairs:
1 4
5 2
Thanks
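To make the target condition precise, a candidate output can be checked with a few lines of Python. This is only a sketch of the stated requirement; `is_valid_order` is a name invented for this example, and it does not check that the output pairs are the input pairs up to swapping.

```python
def is_valid_order(pairs):
    """True if first elements never decrease and second elements never increase."""
    return all(a[0] <= b[0] and a[1] >= b[1] for a, b in zip(pairs, pairs[1:]))

print(is_valid_order([(1, 7), (1, 5), (6, 5), (8, 3)]))  # first example's output
print(is_valid_order([(1, 4), (2, 5)]))                  # third example without the swap
```

Checking adjacent pairs is enough because both conditions are transitive.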
O( n log n ) solution
Let S(n) be the set of all valid sort orderings, where n means that the pairs with indices in [0, n] are included.
S(n) = []
for each order in S(n-1)
    for each combination of n-th pair
        if pair can be inserted in order, add the order after insertion to S(n)
        else don't include the order in S(n)
A pair can be inserted into an order in at most two ways (normal pair and reversed pair).
Maximum orderings = O(2^n)
I'm not very sure about this amortized count of orderings, but hear me out.
For an order and a pair, there are four possible outcomes for the sorted orders after insertion
(two orders, one (normal), one (reversed), zero).
No. of orderings (amortized) = (1/4)*2 + (1/4)*1 + (1/4)*1 + (1/4)*0 = 1
Amortized orderings = O(1)
Similarly, the time complexity will be O(n^2); again, I'm not sure.
The following program finds the orderings using a variant of insertion sort.
debug = False
(LEFT, RIGHT, ERROR) = range(3)

def position(first, second):
    """ Returns the position of first pair when compared to second """
    x,y = first
    a,b = second
    if x <= a and b <= y:
        return LEFT
    if x >= a and b >= y:
        return RIGHT
    else:
        return ERROR

def insert(pair, order):
    """ A pair can be inserted in normal order or reversed order
    For each order of insertion we will get one solution or none"""
    solutions = []
    paircombinations = [pair]
    if pair[0] != pair[1]: # reverse and normal order are distinct
        paircombinations.append(pair[::-1])
    for _pair in paircombinations:
        insertat = 0
        if debug: print("Inserting", _pair, end=" ")
        for i,p in enumerate(order):
            pos = position(_pair, p)
            if pos == LEFT:
                break
            elif pos == RIGHT:
                insertat += 1
            else:
                if debug: print("into", order, "is not possible")
                insertat = None
                break
        if insertat is not None:
            if debug: print("at", insertat, "in", order)
            solutions.append(order[0:insertat] + [_pair] + order[insertat:])
    return solutions

def swapsort(pairs):
    """
    Finds all the solutions of pairs such that the ending vector
    of pairs is sorted non decreasingly by the first number in
    each pair and non increasingly by the second in each pair.
    """
    solutions = [ pairs[0:1] ] # Solution first pair
    for pair in pairs[1:]:
        # Pair that needs to be inserted into solutions
        newsolutions = []
        for solution in solutions:
            sols = insert(pair, solution) # solutions after inserting pair
            if sols:
                newsolutions.extend(sols)
        if newsolutions:
            solutions = newsolutions
        else:
            return None
    return solutions

if __name__ == "__main__":
    groups = [ [(1,5), (7,1), (3,8), (5,6)],
               [(1,5), (2,3), (3,3), (3,4), (2,4)],
               [(3,5), (6,6), (7,4)],
               [(1,4), (2,5)] ]
    for pairs in groups:
        print("Solutions for", pairs, ":")
        solutions = swapsort(pairs)
        if solutions:
            for sol in solutions:
                print(sol)
        else:
            print("not possible")
Output:
Solutions for [(1, 5), (7, 1), (3, 8), (5, 6)] :
[(1, 7), (1, 5), (6, 5), (8, 3)]
Solutions for [(1, 5), (2, 3), (3, 3), (3, 4), (2, 4)] :
[(1, 5), (2, 4), (2, 3), (3, 3), (4, 3)]
[(1, 5), (2, 3), (3, 3), (4, 3), (4, 2)]
[(1, 5), (2, 4), (3, 4), (3, 3), (3, 2)]
[(1, 5), (3, 4), (3, 3), (3, 2), (4, 2)]
Solutions for [(3, 5), (6, 6), (7, 4)] :
not possible
Solutions for [(1, 4), (2, 5)] :
[(1, 4), (5, 2)]
This is a fun problem. I came up with Tom's solution independently, here's my Python code:
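class UnableToAddPair(Exception):
    pass

def order(pairs):
    pairs = [sorted(x) for x in pairs]
    # No ordering exists if both values of one pair exceed both values of
    # another, i.e. if the largest lesser value is above the smallest greater value.
    if pairs and max(p[0] for p in pairs) > min(p[1] for p in pairs):
        raise UnableToAddPair
    # Sort by lesser value ascending, then by greater value descending.
    pairs.sort(key=lambda p: (p[0], -p[1]))
    top, bottom = [], []
    for p in pairs:
        if len(top) == 0 or p[1] <= top[-1][1]:
            top += [p]
        elif len(bottom) == 0 or p[1] <= bottom[-1][1]:
            bottom += [p]
        else:
            raise UnableToAddPair
    bottom = [[x[1],x[0]] for x in bottom]
    bottom.reverse()
    print(top + bottom)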
One important point not mentioned in Tom's solution is that in the sorting stage, if the lesser values of any two pairs are the same, you have to sort by decreasing value of the greater element.
It took me a long time to figure out why a failure must indicate that there's no solution; my original code had backtracking.
Below is a simple recursive depth-first search algorithm in Python:
import sys

def try_sort(seq, minx, maxy, partial):
    if len(seq) == 0: return partial
    for i, (x, y) in enumerate(seq):
        if x >= minx and y <= maxy:
            ret = try_sort(seq[:i] + seq[i+1:], x, y, partial + [(x, y)])
            if ret is not None: return ret
        if y >= minx and x <= maxy:
            ret = try_sort(seq[:i] + seq[i+1:], y, x, partial + [(y, x)])
            if ret is not None: return ret
    return None

def do_sort(seq):
    ret = try_sort(seq, -sys.maxsize - 1, sys.maxsize, [])
    print(ret if ret is not None else "not possible")

do_sort([(1,5), (7,1), (3,8), (5,6)])
do_sort([(1,5), (2,9)])
do_sort([(3,5), (6,6), (7,4)])
It maintains a sorted subsequence (partial) and tries to append every remaining pair to it both in the original and in the reversed order, without violating the conditions of the sort.
If desired, the algorithm can be easily changed to find all valid sort orders.
Edit: I suspect that the algorithm can be substantially improved by maintaining two partially-sorted sequences (a prefix and a suffix). I think that this would allow the next element to be chosen deterministically instead of trying all possible elements. Unfortunately, I don't have time right now to think this through.
Update: this answer is no longer valid since question was changed
Split vector of pairs into buckets by first number. Do descending sort on each bucket. Merge buckets in ascending order of first numbers and keep track of second number of last pair. If it's greater than current one there is no solution. Otherwise you will get solution after merge is done.
If you have stable sorting algorithm you can do descending sort by second number and then ascending sort by first number. After that check if second numbers are still in descending order.
The swapping in your case is just a sort of a 2-element array, so you can do the following:
tuple[] = (4,6),(1,5),(7,1),(8,6), ...
for each tuple -> sort internal list
    => (4,6),(1,5),(1,7),(6,8)
sort tuples by 1st asc
    => (1,5),(1,7),(4,6),(6,8)
sort tuples by 2nd desc
    => (1,7),(1,5),(4,6),(6,8)
The first thing I notice is that there is no solution if both values in one tuple are larger than both values in any other tuple.
The next thing I notice is that tuples with a small difference become sorted towards the middle, and tuples with large differences become sorted towards the ends.
With these two pieces of information you should be able to figure out a reasonable solution.
Phase 1: Sort each tuple moving the smaller value first.
Phase 2: Sort the list of tuples; first in descending order of the difference between the two values of each tuple, then sort each grouping of equal difference in ascending order of the first member of each tuple. (Eg. (1,6),(2,7),(3,8),(4,4),(5,5).)
Phase 3: Check for exceptions. 1: Look for a pair of tuples where both elements of one tuple are larger than both elements of the other tuple. (Eg. (4,4),(5,5).) 2: If there are four or more tuples, then look within each group of tuples with the same difference for three or more variations (Eg. (1,6),(2,7),(3,8).)
Phase 4: Rearrange tuples. Starting at the back end (tuples with smallest difference), the second variation within each grouping of tuples with equal difference must have their elements swapped and the tuples appended to the back of the list. (Eg. (1,6),(2,7),(5,5) => (2,7),(5,5),(6,1).)
I think this should cover it.
This is a very interesting question. Here is my solution to it in VB.NET.
Module Module1
    Sub Main()
        Dim input = {Tuple.Create(1, 5),
                     Tuple.Create(2, 3),
                     Tuple.Create(3, 3),
                     Tuple.Create(3, 4),
                     Tuple.Create(2, 4)}.ToList
        Console.WriteLine(Solve(input))
        Console.ReadLine()
    End Sub

    Private Function Solve(ByVal input As List(Of Tuple(Of Integer, Integer))) As String
        Dim splitItems As New List(Of Tuple(Of Integer, Integer))
        Dim removedSplits As New List(Of Tuple(Of Integer, Integer))
        Dim output As New List(Of Tuple(Of Integer, Integer))
        Dim otherPair = Function(indexToFind As Integer, startPos As Integer) splitItems.FindIndex(startPos, Function(x) x.Item2 = indexToFind)
        Dim otherPairBackwards = Function(indexToFind As Integer, endPos As Integer) splitItems.FindLastIndex(endPos, Function(x) x.Item2 = indexToFind)
        'split the input while preserving their indices in the Item2 property
        For i = 0 To input.Count - 1
            splitItems.Add(Tuple.Create(input(i).Item1, i))
            splitItems.Add(Tuple.Create(input(i).Item2, i))
        Next
        'then sort the split input ascending order
        splitItems.Sort(Function(x, y) x.Item1.CompareTo(y.Item1))
        'find the distinct values in the input (which is pre-sorted)
        Dim distincts = splitItems.Select(Function(x) x.Item1).Distinct
        Dim dIndex = 0
        Dim lastX = -1, lastY = -1
        'go through the distinct values one by one
        Do While dIndex < distincts.Count
            Dim d = distincts(dIndex)
            'temporary list to store the output for the current distinct number
            Dim temOutput As New List(Of Tuple(Of Integer, Integer))
            'go through each of the split items and look for the current distinct number
            Dim curIndex = 0, endIndex = splitItems.Count - 1
            Do While curIndex <= endIndex
                If splitItems(curIndex).Item1 = d Then
                    'find the pair of the item
                    Dim pairIndex = otherPair(splitItems(curIndex).Item2, curIndex + 1)
                    If pairIndex = -1 Then pairIndex = otherPairBackwards(splitItems(curIndex).Item2, curIndex - 1)
                    'create a pair and add it to the temporary output list
                    temOutput.Add(Tuple.Create(splitItems(curIndex).Item1, splitItems(pairIndex).Item1))
                    'push the items onto the temporary storage and remove it from the split list
                    removedSplits.Add(splitItems(curIndex))
                    removedSplits.Add(splitItems(pairIndex))
                    If curIndex > pairIndex Then
                        splitItems.RemoveAt(curIndex)
                        splitItems.RemoveAt(pairIndex)
                    Else
                        splitItems.RemoveAt(pairIndex)
                        splitItems.RemoveAt(curIndex)
                    End If
                    endIndex -= 2
                Else
                    'increment the index or exit the iteration as appropriate
                    If splitItems(curIndex).Item1 <= d Then curIndex += 1 Else Exit Do
                End If
            Loop
            'sort temporary output by the second item and add to the main output
            output.AddRange(From r In temOutput Order By r.Item2 Descending)
            'ensure that the entire list is properly ordered
            'start at the first item that was added from the temporary output
            For i = output.Count - temOutput.Count To output.Count - 1
                Dim r = output(i)
                If lastX = -1 Then
                    lastX = r.Item1
                ElseIf lastX > r.Item1 Then
                    '!+ It appears this section of the if statement is unnecessary
                    'sorting on the first column is out of order so remove the temporary list
                    'and send the items in the temporary list back to the split items list
                    output.RemoveRange(output.Count - temOutput.Count, temOutput.Count)
                    splitItems.AddRange(removedSplits)
                    splitItems.Sort(Function(x, y) x.Item1.CompareTo(y.Item1))
                    dIndex += 1
                    Exit For
                End If
                If lastY = -1 Then
                    lastY = r.Item2
                ElseIf lastY < r.Item2 Then
                    'sorting on the second column is out of order so remove the temporary list
                    'and send the items in the temporary list back to the split items list
                    output.RemoveRange(output.Count - temOutput.Count, temOutput.Count)
                    splitItems.AddRange(removedSplits)
                    splitItems.Sort(Function(x, y) x.Item1.CompareTo(y.Item1))
                    dIndex += 1
                    Exit For
                End If
            Next
            removedSplits.Clear()
        Loop
        If splitItems.Count = 0 Then
            Dim result As New Text.StringBuilder()
            For Each r In output
                result.AppendLine(r.Item1 & " " & r.Item2)
            Next
            Return result.ToString
        Else
            Return "Not Possible"
        End If
    End Function
    <DebuggerStepThrough()> _
    Public Class Tuple(Of T1, T2)
        Implements IEqualityComparer(Of Tuple(Of T1, T2))
        Public Property Item1() As T1
            Get
                Return _first
            End Get
            Private Set(ByVal value As T1)
                _first = value
            End Set
        End Property
        Private _first As T1
        Public Property Item2() As T2
            Get
                Return _second
            End Get
            Private Set(ByVal value As T2)
                _second = value
            End Set
        End Property
        Private _second As T2
        Public Sub New(ByVal item1 As T1, ByVal item2 As T2)
            _first = item1
            _second = item2
        End Sub
        Public Overloads Function Equals(ByVal x As Tuple(Of T1, T2), ByVal y As Tuple(Of T1, T2)) As Boolean Implements IEqualityComparer(Of Tuple(Of T1, T2)).Equals
            Return EqualityComparer(Of T1).[Default].Equals(x.Item1, y.Item1) AndAlso EqualityComparer(Of T2).[Default].Equals(x.Item2, y.Item2)
        End Function
        Public Overrides Function Equals(ByVal obj As Object) As Boolean
            Return TypeOf obj Is Tuple(Of T1, T2) AndAlso Equals(Me, DirectCast(obj, Tuple(Of T1, T2)))
        End Function
        Public Overloads Function GetHashCode(ByVal obj As Tuple(Of T1, T2)) As Integer Implements IEqualityComparer(Of Tuple(Of T1, T2)).GetHashCode
            Return EqualityComparer(Of T1).[Default].GetHashCode(Item1) Xor EqualityComparer(Of T2).[Default].GetHashCode(Item2)
        End Function
    End Class
    Public MustInherit Class Tuple
        <DebuggerStepThrough()> _
        Public Shared Function Create(Of T1, T2)(ByVal first As T1, ByVal second As T2) As Tuple(Of T1, T2)
            Return New Tuple(Of T1, T2)(first, second)
        End Function
    End Class
End Module
The input
1 5
2 3
3 3
3 4
2 4
Produces the output
1 5
2 4
2 3
3 4
3 3
And
3 5
6 6
7 4
Outputs
Not Possible
Comments
I found this problem quite challenging. It took me some 15 minutes to come up with a solution and an hour or so to write and debug it. The code is littered with comments so that anyone can follow it.

Resources