alternative rank function RBTree (red black tree) - algorithm

I have an order-statistic augmented red-black tree.
It works for the most part, but I need to implement a fast function (O(lg n)) that returns the position of a node in sorted order, like the OS-Rank function from my textbook, with one twist: if two nodes have the same score, the return value should be the same for both. Here is the OS-Rank function (in pseudocode, for a given node x, where root is the root of the tree).
OS-Rank(x)
    r = x.left.size + 1
    y = x
    while y != root
        if y == y.p.right
            r += y.p.left.size + 1
        y = y.p
    return r
But what I need is something where, if node A has key 1 and node B has key 1, the function returns 1 for both, and so on. I tried it myself with something like this:
rank(x)
    start with value r = 1
    check that x.right is not Nil
        case x.right has the same key as x
            add x.right.#nodeswithkeyhigher(x.key) to r
        other cases: add x.right.size to r
    y = x
    while y != root
        if y.parent.left == y
            case y.parent.right.key > x.key
                add y.parent.right to r
            other cases
                add y.parent.right.#nodeswithkeyhigher(x.key) to r
        y = y.parent
    return r
Guess what: a test case failed. I'd like to know whether this is a correct way of doing things, or whether I made some mistake I am not seeing (otherwise the mistake is in the Node.#nodeswithkeyhigher(key) function).

edit: final paragraph for answer, thanks to Sticky.
tl;dr: skip to last paragraphs
This is the same issue I'm having trouble with (yes, DS as well). So far all runs except 5 are correct. I've tested several things, one being very simple: just exchange left and right in OSRank. In some cases it gave a correct answer, but in the harder cases it was quite a bit off. I also added a check: if y.Score == y.Parent.Score I only add the size of y.Parent's right subtree; if not, I add that size + 1.
public int OSRank(Node x)
{
    int r = x.Right.Size + 1;
    Node y = x;
    while (y != root)
    {
        if (y == y.Parent.Left)
        {
            if (y.Score == y.Parent.Score)
                r = r + y.Parent.Right.Size;
            else
                r = r + y.Parent.Right.Size + 1;
        }
        y = y.Parent;
    }
    return r;
}
Let's first test this method on the tree on page 340 (figure 14.1). We'll search for the rank of 38 (which should return 4 because 39, 47 and 41 are higher):
r = 1 + 1 = 2 //Right side + 1
r = 2 //nothing happens because we're a right child
r = r + 1 + 1 = 4 //we're a left child, the key of our parent is larger and parent.Right.size = 1
r = 4 //nothing happens because we're a right child
So in this case the result is correct. But what if we add another node with key 38 to our tree. That reshapes our tree a bit, the right part of node 26 now looks like:
(I'm not allowed to add images yet so look here:http://i47.tinypic.com/358ynhh.png)
If we would use the same algorithm we'd get the following result (picking the red one):
r = 0 + 1 = 1 //no right side
r = 1 //we're a right child
r = 1 //we're a right child
r = 1 + 3 + 1 = 5 //The 3 comes from the size of node 41.
r = 5 //we're a right child
Though we expect rank 4 here. While I was typing this out I noticed that we check y.Score == y.Parent.Score, but I completely forgot that y changes. So in line 4 of the trace the clause "y.Score == y.Parent.Score" was false because we compared node 30 with 38. So if we change that line to:
if (x.Score == y.Parent.Score)
The algorithm outputs rank 4, which is correct. This means we eliminated another issue. But there are more, which I didn't figure out either:
The case in which Y.Parent.Right contains duplicate keys. Technically if we have 3 nodes with the same key, they should count as 1.
The case in which Y.Parent.Right contains keys that are equal to x.Key (the node you want the rank of). That would put us a few ranks back, incorrectly.
I suppose you could keep another integer that holds the number of nodes with a higher score. Upon insertion you could climb the tree and adjust the values if the subtree of that node doesn't contain a node with the same score. But how to do this (and efficiently) is unknown to me right now.
edit: First find the final successor of x with the same score as x. Then calculate the rank the normal way. The code above works.
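A different way to get the same tied rank, sketched here in Python rather than taken from the answer above: walk down from the root and count the nodes whose score is strictly higher. This only needs the subtree sizes. The names Node, insert, build and tied_rank are made up for this illustration, and the tree is a plain unbalanced BST to keep it short; the counting loop relies only on the search-tree ordering and correct size fields, so it should carry over to a red-black tree that maintains them.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper)."""
    def __init__(self, score):
        self.score = score
        self.left = None
        self.right = None
        self.size = 1


def size(node):
    return node.size if node else 0


def insert(root, score):
    """Plain unbalanced insert that keeps the size fields up to date."""
    if root is None:
        return Node(score)
    if score < root.score:
        root.left = insert(root.left, score)
    else:
        root.right = insert(root.right, score)
    root.size += 1
    return root


def build(scores):
    root = None
    for s in scores:
        root = insert(root, s)
    return root


def tied_rank(root, score):
    """Rank counted from the highest score; equal scores share a rank."""
    higher = 0
    node = root
    while node is not None:
        if node.score > score:
            # this node and its whole right subtree are strictly higher
            higher += 1 + size(node.right)
            node = node.left
        else:
            # node and its left subtree are <= score, so none of them count
            node = node.right
    return higher + 1
```

With the scores 26, 17, 41, 30, 47, 38, 38, 39, both nodes with score 38 get rank 4, because only 39, 41 and 47 are higher.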

Related

Find() operation for disjoint sets using "Path Halving"

According to Disjoint-set_data_structure, in the Union section, I have a problem understanding the implementation of the path halving approach.
function Find(x)
    while x.parent ≠ x
        x.parent := x.parent.parent
        x := x.parent
    return x
My first iteration looks like this :
After the first iteration, x.parent and x are pointing at the same node (which should not happen). I need help with the correct flow and iteration of this function.
I am confused by the 3rd and 4th lines of that function and also by "Path halving makes every other node on the path point to its grandparent".
Any help will be appreciated, thanks!
The algorithm works as follows: you start from a node x, make x point to its grandparent, then move on to the grandparent itself. You continue until you find the root, which you recognize because the parent of the root is the root itself.
Look at the picture: you are actually halving the set by transforming it into a binary tree (it's not a proper binary tree, but it can be represented as such).
Let's say we have a set like this:
8->7->6->5->4->3->2->1->0
where the arrow means the parent (e.g. 8->7 = the parent of 8 is 7)
Say we call Find(8)
first iteration:
x = 8
8.parent = 7
8.parent = 8.parent.parent = 7.parent = 6
x = 8.parent = 7.parent = 6
second iteration:
x = 6
6.parent = 5
6.parent = 6.parent.parent = 5.parent = 4
x = 6.parent = 5.parent = 4
and so on...
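The trace above can be checked with a short Python sketch. It assumes the set is stored as a list called parent, where parent[i] is the parent of node i and the root is its own parent; that representation is an assumption made for this example, not part of the pseudocode in the question.

```python
def find(parent, x):
    """Find the root of x, applying path halving along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # x now points to its grandparent
        x = parent[x]                  # continue from that grandparent
    return x


# The chain 8->7->6->5->4->3->2->1->0, where 0 is the root (its own parent).
parent = [0, 0, 1, 2, 3, 4, 5, 6, 7]
root = find(parent, 8)
# Nodes 8, 6, 4 and 2 were re-pointed; 7, 5, 3 and 1 keep their old parent.
```

After the call, parent is [0, 0, 0, 2, 2, 4, 4, 6, 6]: every other node on the path (8, 6, 4, 2) now points to its former grandparent, which is what "path halving" refers to.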

Possible Staircases using Dynamic Programming

For the Staircase problem at http://acm.timus.ru/problem.aspx?num=1017&locale=en
Can we solve it in linear time O(k), where k is the maximum number of steps possible? I feel like I'm missing some logic in the approach below.
Any suggestions?
Below is the code that I have implemented:
def answer(n):
    steps = determine_steps(n)
    x = ((n - 1) - n // steps) * ((n - 2) - n // steps + 1)  # Minimum of two staircases
    for i in range(3, steps):
        x = x * ((n - i) // i)  # Stairs from 3 can go from minimum height 0 to max (n-i)/i
    return x
def determine_steps(n):
    """Determine the number of steps possible"""
    steps = 1
    while (steps * steps + steps) <= 2 * n:
        steps = steps + 1
    return steps - 1
print(answer(212))
Suppose you have a function that takes two parameters: left, the number of bricks left, and curr, the current height of the step you are on. At any step you have two options. The first option is to increase the height of the current step by adding one more brick, i.e. rec(left-1, curr+1). The second option is to create a new step whose height must be greater than curr, i.e. rec(left-curr-1, curr+1) (you created a step of height curr+1). left can never be negative, so if left<0, return 0. When left is 0 we have created a valid staircase, so if left==0, return 1.
The check if dp[left][curr] != -1 is just for memoization.
Now, rec(212-1, 1) means a step of height 1 is created and it is the current step. For the final answer, 1 is subtracted because any valid staircase must contain at least 2 steps, so the single-step staircase has to be removed from the count.
dp = [[-1] * 501 for i in range(501)]
def rec(left, curr):
    if left < 0:
        return 0
    if left == 0:
        return 1
    if dp[left][curr] != -1:
        return dp[left][curr]
    dp[left][curr] = rec(left - 1, curr + 1) + rec(left - curr - 1, curr + 1)
    return dp[left][curr]
print(rec(212 - 1, 1) - 1)
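For comparison, here is a bottom-up sketch of the same count. It is a different formulation from the memoized recursion above: it treats a staircase as a way of writing n as a sum of distinct positive step heights and fills a one-dimensional table, in the style of a 0/1 knapsack. The function name count_staircases is made up for this example.

```python
def count_staircases(n):
    """Count staircases of n bricks with at least two steps of distinct heights."""
    # ways[s] = number of ways to write s as a sum of distinct heights seen so far
    ways = [1] + [0] * n
    for height in range(1, n + 1):
        # go downwards so that each height is used at most once
        for total in range(n, height - 1, -1):
            ways[total] += ways[total - height]
    # ways[n] also counts the single step of height n, which is not a staircase
    return ways[n] - 1
```

This takes O(n^2) time and O(n) memory, and it avoids deep recursion. For small inputs it gives count_staircases(3) == 1 (1+2) and count_staircases(5) == 2 (1+4 and 2+3).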
Feel free to comment back, if you are not able to understand the code.

How to go through the elements of a matrix, layer by layer

It is difficult to explain what I want. Let's say I have a matrix of 0s and 1s:
000000
000000
001100
000000
000000
I want to start from a certain group of ones (this is given in the beginning), and then I want to go outwards.
000000        000000
011110   OR   001100
010010        010010
011110        001100
000000        000000
The difference is not important, as long as I will go through everything, outwards.
The reason I want to do this is, this matrix of 1 and 0 corresponds to a matrix of some 2D function, and I want to examine the points in that function going outwards. I want to
If I understand the question correctly, what you want is to find a group of 1s inside a matrix and invert the group of 1s and all of its surroundings. This is actually an image-processing problem, so my explanation will be in those terms. Side note: the term 'polygon' is used here for the group of 1s in the matrix. Some assumptions: the polygon is always filled, and the polygon doesn't contain any points that lie directly on the outer bounds of the matrix (e.g. the point (0 , 2) is never part of the polygon). The solution can then be found this way:
Step 1: search for an arbitrary 1 that is part of the outer bound of the polygon represented by the 1s in the matrix. By starting from the upper left corner it's guaranteed that the returned coordinates belong to a 1 that is either on the left side of the polygon, on the upper side, or at a corner.
point searchArb1(int[][] matrix)
    list search
    search.add(point(0 , 0))
    while NOT search.isEmpty()
        point pt = search.remove(0)
        if matrix[pt.x][pt.y] == 1
            return pt
        //the point wasn't the searched one
        //continue search in 3 directions: down, right, and diagonally down/right
        point tmp = pt.down()
        if tmp.y < matrix.height
            search.add(tmp)
        tmp = pt.right()
        if tmp.x < matrix.width
            search.add(tmp)
        tmp = pt.diagonal_r_d()
        if tmp.x < matrix.width AND tmp.y < matrix.height
            search.add(tmp)
    return null
Step 2: now that we have an arbitrary point on the outer bound of the polygon, we can simply proceed by tracing the outer bound of the polygon. Due to the above-mentioned assumptions, we only have to search for 1s in 3 directions (diagonals are always represented by 3 points forming a corner). This method will trace the polygon bound clockwise.
int UP = 0
int RIGHT = 1
int DOWN = 2
int LEFT = 3
list searchOuterBound(int[][] matrix , point arbp)
    list result
    point pt = arbp
    point ptprev
    //at each point one direction can't be available (determined using the previously found 1)
    int dir_unav = LEFT
    do
        result.add(pt)
        //generate all possible candidates for the next point in the polygon bounds
        map candidates
        for int i in [UP , LEFT]
            if i == dir_unav
                continue
            point try
            switch i
                case UP:
                    try = pt.up()
                    break
                case DOWN:
                    try = pt.down()
                    break
                case RIGHT:
                    try = pt.right()
                    break
                case LEFT:
                    try = pt.left()
                    break
            candidates.store(i , try)
        ptprev = pt
        for int i in [0 , 2]
            //the directions can be interpreted as a cycle of length 4
            //always start the search for the next 1 at the clockwise next direction
            //relative to the direction we come from
            //e.g.: dir_unav = LEFT -> start with UP
            int dir = (dir_unav + i + 1) % 4
            point try = candidates.get(dir)
            //test the candidate, not the current point
            if matrix[try.x][try.y] == 1
                //found the first match
                pt = try
                //direction we come from is the exact opposite of dir
                dir_unav = (dir + 2) % 4
                break
        //no matching candidate was found
        if pt == ptprev
            return result
    while pt != arbp
    //algorithm has reached the starting point again
    return result
Step 3: Now we've got a representation of the polygon. Next step: inverting the points around the polygon as well. Because the polygon itself will be filled with 0s later on, we can simply fill the surroundings of every point in the polygon with 1s. Since there are two options for generating this part of the matrix state, I'll split this into two solutions:
Step 3.1: Fill points that are diagonal neighbours of points of the polygon with 1s as well
void fillNeighbours_Diagonal_Included(int[][] matrix , list polygon)
    for point p in polygon
        for int x in [-1 , 1]
            for int y in [-1 , 1]
                matrix[p.x + x][p.y + y] = 1
Step 3.2: Don't fill points that are diagonal neighbours of points of the polygon
void fillNeighbours_Diagonal_Excluded(int[][] matrix , list polygon)
    for point p in polygon
        matrix[p.x - 1][p.y] = 1
        matrix[p.x + 1][p.y] = 1
        matrix[p.x][p.y - 1] = 1
        matrix[p.x][p.y + 1] = 1
Step 4: Finally, last step: Invert all 1s in the polygon into 0s. Note: I'm too lazy to optimize this any further, so this part is implemented as brute-force.
void invertPolygon(int[][] matrix , list polybounds)
    //go through each line of the matrix
    for int i in [0 , matrix.height]
        sortedlist cut_x
        //search for all intersections of the line with the polygon
        for point p in polybounds
            if p.y == i
                cut_x.add(p.x)
        //remove ranges of points to only keep lines
        //(only interior entries have both neighbours, so start at 1)
        int at = 1
        while at < cut_x.size() - 1
            if cut_x.get(at - 1) + 1 == cut_x.get(at)
                    AND cut_x.get(at) == cut_x.get(at + 1) - 1
                cut_x.remove(at)
            else
                ++at
        //set all points in the line that are part of the polygon to 0
        for int j in [0 , cut_x.size()[ step = 2
            for int x in [cut_x.get(j) , cut_x.get(j + 1)]
                matrix[x][i] = 0
I hope you understand the basic idea behind this. Sorry for the long answer.
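If the goal is only to visit the cells in rings around the starting group, a multi-source breadth-first search is a simpler alternative to the boundary tracing above. The Python sketch below is that alternative, not a translation of the pseudocode; the function name layers_outward and the diagonal flag are made up for this example. The flag switches between the two pictures in the question (with or without diagonal neighbours).

```python
def layers_outward(matrix, diagonal=False):
    """Return a list of layers; layer 0 is the initial group of 1s,
    layer k holds the cells first reached after k steps outwards."""
    rows, cols = len(matrix), len(matrix[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    frontier = [(r, c) for r in range(rows) for c in range(cols)
                if matrix[r][c] == 1]
    seen = set(frontier)
    layers = []
    while frontier:
        layers.append(frontier)
        next_frontier = []
        for r, c in frontier:
            for dr, dc in steps:
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and (nr, nc) not in seen:
                    seen.add((nr, nc))
                    next_frontier.append((nr, nc))
        frontier = next_frontier
    return layers


grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
plain = layers_outward(grid)
with_diagonals = layers_outward(grid, diagonal=True)
```

Each cell is visited exactly once, so the points of the underlying 2D function can be examined layer by layer by iterating over the returned list.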

Algorithm that sorts elements within range of values into clusters?

Essentially I want to take a list of things like...
a 345
b 762
c 983
d 425
e 45
...
and given a maximum distance, create clusters for each element containing other elements within that range. For example, if I specified the maximum distance above to be 300 the clusters would be...
a 345
    d 425
    e 45
b 762
    c 983
c 983
    b 762
d 425
    a 345
e 45
    a 345
Constraints wise, I'm reading entries in a file, which is common with the work I'm doing. As such, I generally focus my algorithms on doing work as it reads entries, rather than reading everything in the file, storing it in some convenient structure, and then doing work on it. Anyway, storing the entries from the file and then performing a sort based on those values, then just going through the sorted list and doing appropriate output is something I'm trying to avoid.
I've done some layman brainstorming, but before I spend lots of time doing a thorough analysis I feel like I've seen this somewhere or that there is an algorithm very similar to this. I'm not asking for someone to come up with an algorithm unless you feel so inclined, just wondering if there are any existing ones that solve this problem or one very similar to it.
Thanks.
Another possible flood fill algorithm using a modified binary search to reduce the number of required comparisons:
Pseudo-code:
def clustersWithinRange(values, distance): //Should run in O(n log n)
    sortedValues = values.sorted()
    valueCount = values.len()
    clusters = Array()
    cachedLower = 0
    cachedUpper = 0
    cachedValue = null
    for i in range(0, valueCount):
        value = sortedValues.get(i)
        if (cachedValue == value): //duplicate value, no need to calculate twice
            clusters.add(clusters.get(i - 1).copy()) //simply clone last cluster or nop if no duplicate clusters wanted
        else:
            //both bounds only move to the right as i grows, so reuse the cached ones
            lower = binaryBoundSearch(sortedValues, value, distance, cachedLower, i, LowerBoundMatchFlag)
            upper = binaryBoundSearch(sortedValues, value, distance, max(cachedUpper, i), valueCount - 1, UpperBoundMatchFlag)
            clusters.add(sortedValues[lower:upper+1])//add sublist within (and including) lower...upper to clusters
            cachedLower = lower
            cachedUpper = upper
            cachedValue = value
    return clusters
def withinDistance(valueA, valueB, distance):
    return abs(valueA - valueB) <= distance
def binaryBoundSearch(values, value, distance, low, high, searchFlag):
    // Possible values of searchFlag:
    // "LowerBoundMatchFlag" to get the leftmost found match.
    // "UpperBoundMatchFlag" to get the rightmost found match.
    if (high < low):
        return -1 // not found
    mid = low + ((high - low) / 2)
    matchPosition = -1
    aValue = values[mid]
    if (withinDistance(aValue, value, distance)):
        matchPosition = mid
        displacement = (searchFlag == LowerBoundMatchFlag) ? -1 : 1
        canMove = (searchFlag == LowerBoundMatchFlag) ? mid > low : mid < high
        if (canMove && withinDistance(values.get(mid + displacement), value, distance)):
            newLow = (searchFlag == LowerBoundMatchFlag) ? low : mid + displacement
            newHigh = (searchFlag == LowerBoundMatchFlag) ? mid + displacement : high
            matchPosition = binaryBoundSearch(values, value, distance, newLow, newHigh, searchFlag)
    else:
        flag = compare(aValue, value)
        if (flag < 0): // current position too far left
            matchPosition = binaryBoundSearch(values, value, distance, mid + 1, high, searchFlag)
        else if (flag > 0): // current position too far right
            matchPosition = binaryBoundSearch(values, value, distance, low, mid - 1, searchFlag)
    return matchPosition
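The same sort-then-binary-search idea can be sketched in runnable Python with the standard bisect module in place of the hand-written bound search. This is an illustration, not the pseudocode above made literal: the function name clusters_within_range is made up, entries are assumed to be (name, value) pairs with unique names, and like the pseudocode it reads and sorts everything first, which the question was hoping to avoid.

```python
from bisect import bisect_left, bisect_right


def clusters_within_range(entries, distance):
    """Map each name to the names whose value lies within `distance` of it."""
    ordered = sorted(entries, key=lambda entry: entry[1])
    values = [value for _, value in ordered]
    clusters = {}
    for name, value in ordered:
        # first index with values[i] >= value - distance
        lo = bisect_left(values, value - distance)
        # one past the last index with values[i] <= value + distance
        hi = bisect_right(values, value + distance)
        clusters[name] = [other for other, _ in ordered[lo:hi] if other != name]
    return clusters


entries = [("a", 345), ("b", 762), ("c", 983), ("d", 425), ("e", 45)]
clusters = clusters_within_range(entries, 300)
```

For the example from the question this gives a: e and d, b: c, c: b, d: a, e: a. Sorting costs O(n log n) and each lookup is O(log n), plus the size of the cluster that is copied out.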

Minimum Window for the given numbers in an array

Saw this question recently:
Given 2 arrays, the 2nd array containing some of the elements of the 1st array, return the minimum window in the 1st array which contains all the elements of the 2nd array.
Eg :
Given A={1,3,5,2,3,1} and B={1,3,2}
Output : 3 , 5 (where 3 and 5 are indices in the array A)
Even though the range 0 to 3 also contains the elements of B, the range 3 to 5 is returned since its length is less than that of the other range ( ( 5 - 3 ) < ( 3 - 0 ) )
I had devised a solution, but I am not sure whether it works correctly, and it is also not efficient.
Please give an efficient solution for the problem. Thanks in advance.
A simple solution of iterating through the list.
Have a left and right pointer, initially both at zero
Move the right pointer forwards until [L..R] contains all the elements (or quit if right reaches the end).
Move the left pointer forwards until [L..R] doesn't contain all the elements. See if [L-1..R] is shorter than the current best.
This is obviously linear time. You'll simply need to keep track of how many of each element of B is in the subarray for checking whether the subarray is a potential solution.
Pseudocode of this algorithm.
size = A.length;
bestL = size + 1;
bestRange = null;
needed = B.length; //number of distinct values that must be present
found = 0; left = 0;
counts = {}; //counts is a map of (number, count)
for(b in B) counts.put(b, 0);
//Increase right bound
for(right = 0; right < size; right++) {
    if(!counts.contains(A[right])) continue;
    amt = counts.get(A[right]);
    counts.set(A[right], amt+1);
    if(amt == 0) found++;
    if(found == needed) {
        while(found == needed) {
            //Increase left bound
            if(counts.contains(A[left])) {
                amt = counts.get(A[left]);
                counts.set(A[left], amt-1);
                if(amt == 1) found--;
            }
            left++;
        }
        if(right - left + 2 >= bestL) continue;
        bestL = right - left + 2;
        bestRange = [left-1, right] //inclusive
    }
}
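A runnable Python sketch of the same two-pointer idea follows. The function name min_window is made up for this example, and duplicates in B are treated as a single required value, which is an assumption the question does not settle.

```python
def min_window(a, b):
    """Return (left, right), inclusive indices of the shortest window of `a`
    that contains every value of `b`, or None if there is no such window."""
    needed = set(b)
    counts = dict.fromkeys(needed, 0)  # occurrences inside the current window
    found = 0                          # how many needed values are present
    best = None
    left = 0
    for right, value in enumerate(a):
        if value not in counts:
            continue
        counts[value] += 1
        if counts[value] == 1:
            found += 1
        # shrink from the left while the window still covers all of b
        while found == len(needed):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            dropped = a[left]
            if dropped in counts:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    found -= 1
            left += 1
    return best
```

For A = [1, 3, 5, 2, 3, 1] and B = [1, 3, 2], min_window returns (3, 5). Each index is visited at most once by each pointer, so the running time is linear in the length of A.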

Resources