How to write pseudocode for general case? - performance

I'm going to start tutoring so I decided to work on some old problems from my algorithms class. The problem is as follows:
You are selling newspapers and every day you start your route at one intersection and end your route north-east of where you started. The city streets form a grid, as depicted below, and you start at (0, 0) and end at (n, m).
A move north takes you from (x, y) to (x, y + 1). A move east takes you from (x, y) to (x + 1, y). At each intersection (x, y) you stop to sell newspapers and make a revenue of r(x, y). Let OPT(n, m) denote the total revenue of an optimal walk from (0, 0) to (n, m).
My pseudocode using bottom-up dynamic programming for this problem is as follows:
Bottom-Up-Alg(n, m, s[][])   // n and m are the destination coordinates; s holds the revenue at each intersection
    opt = 0                  // holds the revenue collected so far
    opt += s[0][0]           // value at (0, 0)
    i = 0
    j = 0
    while (i <= n and j <= m)
        if (s[i+1][j] > s[i][j+1])
            opt += s[i+1][j] // move east
            i++
        else
            opt += s[i][j+1] // move north
            j++
    return opt
Strictly speaking the running time of this algorithm would be O(n+m). But if n and m are proportional then the running time can be said to be O(n) or O(m).
The problem is I found that my algorithm is greedy and it won't work for every situation. I'm having trouble writing pseudocode that would work in general.

You can number every node, starting from the upper-right node, with the maximum revenue you can get if you start from that node, and record which previously numbered node gives it that maximum. This takes O(nm).
You do this by sweeping a diagonal from upper right to lower left.
When this numbering reaches the lower left, you have your answer.
Just trace back.
22 19-17-15--9
   |
27 26 17 16 14
   |
35-32 22 22 20
ADDED: If you're wondering how to sweep a diagonal, it's easier to visualize than to code.
But here's some C:
for (j = m-1; j >= -(n-1); j--){
    for (ii = n-1; ii >= 0; ii--){
        int jj = j + (n-1) - ii;
        if (jj >= 0 && jj < m){
            int rii = 0, rjj = 0;            /* revenues of the two neighbours */
            if (ii+1 < n)
                rii = r[ii+1][jj];
            if (jj+1 < m)
                rjj = r[ii][jj+1];
            r[ii][jj] = s[ii][jj] + (rii > rjj ? rii : rjj);   /* own revenue + best neighbour */
        }
    }
}
Basically, ii and jj are the indices of the cell you're working on, and if either its rightward or upward neighbor is outside the rectangle you take its revenue as zero.
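To make the "just trace back" step concrete, here is a rough Python sketch (my own illustration, not part of the original answer; which index is "east" and which is "north" depends on how you laid out the table). Given the filled table r, where r[i][j] is the best revenue obtainable starting from (i, j), you repeatedly step to whichever in-bounds neighbour carries the larger value:

def trace_back(r):
    # r: table filled by the sweep above; r[i][j] = best revenue starting at (i, j)
    n, m = len(r), len(r[0])
    i, j = 0, 0
    path = [(0, 0)]
    while i < n - 1 or j < m - 1:
        east = r[i + 1][j] if i + 1 < n else float('-inf')
        north = r[i][j + 1] if j + 1 < m else float('-inf')
        if east >= north:
            i += 1        # the eastward neighbour carries the larger remaining revenue
        else:
            j += 1        # otherwise go north
        path.append((i, j))
    return path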

This is your TA. I couldn't help but notice that this question was posted before the due date for your homework. Seeing as it's past that date now, the answer you were looking for is the following
BOTTOM-UP-NEWSPAPER(n, m, r)
    opt = array(n, m)
    for i = 0 to n
        for j = 0 to m
            if i = 0 and j = 0              // starting position
                opt[i][j] = r(i, j)
            else if i = 0 and j > 0         // on the west side of the grid, came from the south
                opt[i][j] = r(i, j) + opt[i][j-1]
            else if j = 0 and i > 0         // on the south side of the grid, came from the west
                opt[i][j] = r(i, j) + opt[i-1][j]
            else                            // anywhere else
                opt[i][j] = r(i, j) + max(opt[i-1][j], opt[i][j-1])
    opt[n][m] holds the maximum revenue
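For reference, here is a direct Python transcription of that pseudocode (the toy revenue grid at the bottom is my own, just to show the table being filled):

def bottom_up_newspaper(n, m, r):
    # r[i][j] is the revenue at intersection (i, j); moves go east (i+1) or north (j+1)
    opt = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                opt[i][j] = r[i][j]                                  # starting position
            elif i == 0:
                opt[i][j] = r[i][j] + opt[i][j - 1]                  # west edge: came from the south
            elif j == 0:
                opt[i][j] = r[i][j] + opt[i - 1][j]                  # south edge: came from the west
            else:
                opt[i][j] = r[i][j] + max(opt[i - 1][j], opt[i][j - 1])
    return opt[n][m]

revenue = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
print(bottom_up_newspaper(2, 2, revenue))   # 29 = 1 + 4 + 7 + 8 + 9 for this made-up grid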

Your algorithm works because it's like Dijkstra's algorithm, but for finding the longest path in a directed acyclic graph where each node has two outgoing edges. The algorithm is finding the critical path in a greedy way.
The running time should be O(mn). It's like the trace-back procedure in edit distance.

Count nodes within k distance of marked nodes in grid

I am attempting to solve a coding challenge, but my solution is not very performant. I'm looking for advice or suggestions on how I can improve my algorithm.
The puzzle is as follows:
You are given a grid of cells that represents an orchard; each cell can be either an empty spot (0) or a fruit tree (1). A farmer wishes to know how many empty spots there are within the orchard that are within k distance of all fruit trees.
Distance is counted using taxicab geometry, for example:
k = 1
[1, 0]
[0, 0]
the answer is 2, since only the bottom-right spot is more than k distance from the tree.
My solution goes something like this:
1. Loop over the grid and store all tree positions.
2. BFS from the first tree position and store all empty spots until we reach a neighbour that is beyond k distance.
3. BFS from the next tree position and store the intersection of empty spots.
4. Repeat step 3 until we have iterated over all tree positions.
5. Return the number of empty spots remaining after all intersections.
I have found that for large grids with large values of k, my algorithm becomes very slow, as I end up checking every spot in the grid multiple times. After doing some research, I found solutions for similar problems that suggest taking the two most extreme target nodes and then only comparing distances to them:
https://www.codingninjas.com/codestudio/problem-details/count-nodes-within-k-distance_992849
https://www.geeksforgeeks.org/count-nodes-within-k-distance-from-all-nodes-in-a-set/
However this does not work for my challenge given certain inputs like below:
k = 4
[0, 0, 0, 1]
[0, 1, 0, 0]
[0, 0, 0, 0]
[1, 0, 0, 0]
[0, 0, 0, 0]
Using the extreme-nodes approach, the bottom-right empty spot is counted even though it is 5 distance away from the middle tree.
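(Indeed, with 0-indexed (row, column) coordinates the middle tree is at (1, 1) and the bottom-right spot is at (4, 3), so their taxicab distance is |4 - 1| + |3 - 1| = 5 > k = 4.)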
Could anyone point me towards a more efficient approach? I am still very new to these types of problems so I am finding it hard to see the next step I should take.
There is a simple, linear time solution to this problem because of the grid and distance structure. Given a fruit tree with coordinates (a, b), consider the 4 diagonal lines bounding the box of distance k around it. The diagonals going down and to the right have a constant value of x + y, while the diagonals going down and to the left have a constant value of x - y.
A point (x, y) is inside the box (and therefore, within distance k of (a, b)) if and only if:
a + b - k <= x + y <= a + b + k, and
a - b - k <= x - y <= a - b + k
So we can iterate over our fruit trees (a, b) to find four numbers:
first_max = max(a + b - k); first_min = min(a + b + k);
second_max = max(a - b - k); second_min = min(a - b + k);
where min and max are taken over all fruit trees. Then, iterate over empty cells (or do some math and subtract fruit tree counts, if your grid is enormous), counting how many empty spots (x,y) satisfy
first_max <= x + y <= first_min, and
second_max <= x - y <= second_min.
This Python code (written in a procedural style) illustrates this idea. Each diagonal of each bounding box cuts off exactly half of the plane, so this is equivalent to intersection of parallel half planes:
import math

# grid (a list of lists of 0/1) and k are assumed to be given
fruit_trees = [(a, b) for a in range(len(grid))
                      for b in range(len(grid[0]))
                      if grid[a][b] == 1]

northwest_half_plane = -math.inf
southeast_half_plane = math.inf
southwest_half_plane = -math.inf
northeast_half_plane = math.inf

for a, b in fruit_trees:
    northwest_half_plane = max(northwest_half_plane, a - b - k)
    southeast_half_plane = min(southeast_half_plane, a - b + k)
    southwest_half_plane = max(southwest_half_plane, a + b - k)
    northeast_half_plane = min(northeast_half_plane, a + b + k)

count = 0
for x in range(len(grid)):
    for y in range(len(grid[0])):
        if grid[x][y] == 0:
            if (northwest_half_plane <= x - y <= southeast_half_plane
                    and southwest_half_plane <= x + y <= northeast_half_plane):
                count += 1

print(count)
Some notes on the code: Technically the array coordinates are a quarter-turn rotated from the Cartesian coordinates of the picture, but that is immaterial here. The code is left deliberately bereft of certain 'optimizations' which may seem obvious, for two reasons: 1. The best optimization depends on the input format of fruit trees and the grid, and 2. The solution, while being simple in concept and simple to read, is not simple to get right while writing, and it's important that the code be 'obviously correct'. Things like 'exit early and return 0 if a lower bound exceeds an upper bound' can be added later if the performance is necessary.
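As a quick sanity check of the approach (the wrapper function below and its name are my own, not from the answer), the same logic, packaged as a function and run on the 5x4 grid from the question, correctly excludes the bottom-right spot:

def count_spots_within_k_of_all_trees(grid, k):
    # same half-plane logic as above, wrapped up so it can be tested
    inf = float('inf')
    nw, se, sw, ne = -inf, inf, -inf, inf
    for a, row in enumerate(grid):
        for b, value in enumerate(row):
            if value == 1:
                nw, se = max(nw, a - b - k), min(se, a - b + k)
                sw, ne = max(sw, a + b - k), min(ne, a + b + k)
    return sum(1 for x, row in enumerate(grid) for y, value in enumerate(row)
               if value == 0 and nw <= x - y <= se and sw <= x + y <= ne)

example = [[0, 0, 0, 1],
           [0, 1, 0, 0],
           [0, 0, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 0]]
print(count_spots_within_k_of_all_trees(example, 4))   # 9 for this example; spot (4, 3) is excluded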
As answered by @kcsquared, here is an implementation in Java:
public int solutionGrid(int K, int[][] A) {
    int m = A.length;
    int n = A[0].length;
    int k = K;
    // to store the house (tree) coordinates
    Set<String> houses = new HashSet<>();
    // find the houses and store the coordinates
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (A[i][j] == 1) {
                houses.add(i + "&" + j);
            }
        }
    }
    int northwest_half_plane = Integer.MIN_VALUE;
    int southeast_half_plane = Integer.MAX_VALUE;
    int southwest_half_plane = Integer.MIN_VALUE;
    int northeast_half_plane = Integer.MAX_VALUE;
    for (String ele : houses) {
        String[] arr = ele.split("&");
        int a = Integer.valueOf(arr[0]);
        int b = Integer.valueOf(arr[1]);
        northwest_half_plane = Math.max(northwest_half_plane, a - b - k);
        southeast_half_plane = Math.min(southeast_half_plane, a - b + k);
        southwest_half_plane = Math.max(southwest_half_plane, a + b - k);
        northeast_half_plane = Math.min(northeast_half_plane, a + b + k);
    }
    int count = 0;
    for (int x = 0; x < m; x++) {
        for (int y = 0; y < n; y++) {
            if (A[x][y] == 0) {
                if (northwest_half_plane <= x - y && x - y <= southeast_half_plane
                        && southwest_half_plane <= x + y && x + y <= northeast_half_plane) {
                    count += 1;
                }
            }
        }
    }
    return count;
}
This wouldn't be easy to implement, but it could be sublinear for many cases, and at most linear. Consider representing the perimeter of each tree's region as four corners (they mark a square rotated 45 degrees). For each tree, compute its perimeter's intersection with the current intersection. The difficulty comes with managing the corners of the intersection, which could include more than one point because of the diagonal alignments. Finally, run inside the final intersection to count how many empty spots are within it.
Since you are using taxicab distance, BFS is unnecessary. You can compute the distance between an empty spot and a tree directly.
This algorithm is based on a suggestion by https://stackoverflow.com/users/3080723/stef
// select tree near top left corner
SET flag false
LOOP r over rows
    LOOP c over columns
        IF tree at c, r
            SET t to tree at c, r
            SET flag true
            BREAK
    IF flag
        BREAK
LOOP s over empty spots
    Calculate distance between s and t
    IF distance <= k
        ADD s to spotlist
LOOP s over spotlist
    LOOP t over trees, starting at bottom right corner
        Calculate distance between s and t
        IF distance > k
            REMOVE s from spotlist
            BREAK
RETURN spotlist
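Here is a rough Python sketch of that pseudocode (my own transcription, under the same assumptions: a list-of-lists grid of 0s and 1s, taxicab distance, and a given k):

def spots_within_k_of_all_trees(grid, k):
    trees = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v == 1]
    empties = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v == 0]
    if not trees:
        return empties
    # seed the candidate list from the tree nearest the top-left corner (row-major order)
    fr, fc = trees[0]
    spotlist = [(r, c) for r, c in empties if abs(r - fr) + abs(c - fc) <= k]
    # prune candidates against every other tree, scanning from the bottom-right corner
    return [(r, c) for r, c in spotlist
            if all(abs(r - tr) + abs(c - tc) <= k for tr, tc in reversed(trees))]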

Counting inversions in an array of 2D pair

Problem Description:
Let there be an array of 2D pairs ((x1, y1), ..., (xn, yn)). With a fixed constant y', a pair (i, j) is called half-inverted if i < j, xi > xj, and yi >= y' > yj. Devise an algorithm that counts the number of half-inverted pairs. You will get full marks if your algorithm is correct and of complexity no more than O(n log n).
My idea is to treat this with a method similar to counting inversions in a normal array, but my problem is: how do we maintain the order during the merge-and-count step?
A simple modification of the familiar merge-sort inversion-counting algorithm can be used to solve this problem, so make sure you fully understand that algorithm as a prerequisite.
If we examine the merge step of this algorithm, we have 2 sorted halves and 2 pointers, each pointing to an element of one half. Let our left pointer be i and our right pointer be j. Using the traditional definition of an inversion, if our i pointer points to a value larger than the value pointed to by j, then, because the halves are sorted and all the elements on the left come before those on the right in the real array, we know that all the elements from i to the end of the left half form an inversion with the value at j, so we increase our count by mid - i, where mid is the end of the left half.
Switching back to your problem, we are dealing with pairs (x, y). If we keep our x values sorted then, using the approach described above, we can count the number of inversions considering only x values. Looking at your definition of half-inversions, we would surely over-count if we only checked xi > xj. We are missing the additional constraint yi >= y' > yj, which must be filtered out of our counting.
So, looking back at the traditional algorithm, when our i pointer points to a value greater than the value at j, we also need to make sure that the y value at j is less than y'. If this is not true, then none of the x's from i to mid match our definition of a half-inversion, so we cannot count them. Now let's assume j's y value is smaller than y'; if we simply counted all the pairs from i to mid, we would still over-count the pairs which have yi < y'.
One way to fix this is to keep track of the y values in the left half from i to mid which are >= y' and add that number to our count. We can keep track of how many y >= y' we have seen in the merge step up to any i, and subtract that from the total number of y's which are >= y' in the left half. To keep track of that total, we can return it from our recursive function (total = left + right) and only use the number which came from the left half when merging. We also need to modify our base case, which is straightforward.
def count_half_inversions(l, y):
    return count_rec(l, 0, len(l), l.copy(), y)[0]

def count_rec(l, begin, end, copy, y):
    if end - begin <= 1:
        # we have only 1 pair
        return (0, 1 if l[begin][1] >= y else 0)
    mid = begin + ((end - begin) // 2)
    left = count_rec(copy, begin, mid, l, y)
    right = count_rec(copy, mid, end, l, y)
    between = merge_count(l, begin, mid, end, copy, left[1], y)
    # return (inversion count, number of pairs whose y-value is >= y)
    return (left[0] + right[0] + between, left[1] + right[1])

def merge_count(l, begin, mid, end, copy, left_y_count, y):
    result = 0
    i, j = begin, mid
    k = begin
    while i < mid and j < end:
        if copy[i][0] > copy[j][0]:
            if y > copy[j][1]:
                result += left_y_count
            smaller = copy[j]
            j += 1
        else:
            if copy[i][1] >= y:
                left_y_count -= 1
            smaller = copy[i]
            i += 1
        l[k] = smaller
        k += 1
    while i < mid:
        l[k] = copy[i]
        i += 1
        k += 1
    while j < end:
        l[k] = copy[j]
        j += 1
        k += 1
    return result

test_case = [(1,1), (6,4), (6,3), (1,2), (1,2), (3,3), (6,2), (0,1)]
fixed_y = 2
print(count_half_inversions(test_case, fixed_y))
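For small inputs you can sanity-check this against a direct O(n^2) count of the definition (the brute-force helper and the extra example below are my own additions):

def count_half_inversions_brute_force(pairs, y):
    # literal O(n^2) restatement of the definition: i < j, x_i > x_j, y_i >= y > y_j
    n = len(pairs)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if pairs[i][0] > pairs[j][0] and pairs[i][1] >= y > pairs[j][1])

# note: count_half_inversions reorders its input list, so pass it a copy
more_pairs = [(5, 3), (2, 1), (4, 2), (1, 4), (3, 0)]
print(count_half_inversions(list(more_pairs), 2),
      count_half_inversions_brute_force(more_pairs, 2))   # the two counts should agree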

dynamic programming to find the minimum weight cover of the points

You are given n points p1, p2, ..., pn on the real line. The location of pi is given by its coordinate xi. You are also given m intervals I1, I2, ..., Im where Ij = [aj, bj] (aj is the left endpoint and bj is the right endpoint). Each interval Ij has a non-negative weight wj. An interval Ij is said to cover pi if xi ∈ [aj, bj]. A subset S ⊆ {I1, I2, ..., Im} of intervals is a cover for the given points if for each pi, 1 ≤ i ≤ n, there is some interval in S that covers pi. In the figure below, the intervals shown in bold form a cover of the points.
The goal is to find a minimum-weight cover of the points. Note that a minimum-weight cover may differ from a cover with the minimum number of intervals.
PS: I first tried a greedy approach: find the smallest valid interval, remove the points it covers from P[] and repeat until P[] is empty, but this algorithm can easily be proven wrong. Then I tried picking the interval with the minimum ratio of weight to the number of points it covers, but I do not know how to turn that into DP with memoization.
So what we can do here is first sort the p[] array values and also sort the intervals according to a[], i.e. the left endpoint.
Now each state consists of two indices: the index of the current interval and the index of the current point.
At each state we have two cases:
The first case is simple: we do not select the current interval.
In the second case, we select the current interval if the current point lies inside it. Note that if we take the current interval, we can advance the point index in the new state past all the points that lie inside the current interval, since the points are sorted.
Pseudo code for Recursive Dynamic Programming solution using memoisation:
#define INF 1e9
int n = 100;
int m = 100;
int p[100], a[100], b[100], w[100], memo[100][100]; // initialize memo to -1

int solve(int intervalIndex, int pointIndex){
    if(intervalIndex == m){
        if(pointIndex == n) return 0;
        else return INF;
    }
    if(pointIndex == n) return 0;
    if(memo[intervalIndex][pointIndex] != -1) return memo[intervalIndex][pointIndex];
    // we can make 2 choices: either select this interval or do not select it
    // if we select this interval, then we also take the points that it covers
    // case #1 : do not take the current interval
    int ans = solve(intervalIndex + 1, pointIndex);
    if(a[intervalIndex] <= p[pointIndex] && b[intervalIndex] >= p[pointIndex]){ // this point is inside the current interval
        // case #2 : take the current interval, and all the points it contains
        int index = pointIndex;
        for(int i = pointIndex + 1; i < n; i++){
            if(a[intervalIndex] <= p[i] && b[intervalIndex] >= p[i]){
                index = i;
            }else{
                break;
            }
        }
        ans = min(ans, solve(intervalIndex + 1, index + 1) + w[intervalIndex]);
    }
    return memo[intervalIndex][pointIndex] = ans;
}
The above can be modified for real numbers easily.
The complexity of the above code is O(m*n*n), but it can be reduced to O(m*n) if we precompute, for each interval, the index of the farthest point it covers, so that we do not need the inner for loop to search for it and can just use the precomputed value (a rough sketch of this follows below).
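As an illustration of that optimization, here is a memoized Python sketch (the function name, the (a, b, w) input format, and the bisect-based precomputation are my own assumptions, not part of the answer above): last_covered[j] is precomputed with binary search, so taking an interval advances the point index in O(1).

from bisect import bisect_right
from functools import lru_cache

def min_weight_cover(points, intervals):
    # points: sorted point coordinates; intervals: list of (a, b, w) sorted by a
    INF = float('inf')
    n, m = len(points), len(intervals)
    # last_covered[j]: index just past the last point with coordinate <= b_j
    last_covered = [bisect_right(points, b) for a, b, w in intervals]

    @lru_cache(maxsize=None)
    def solve(ii, pi):
        if pi == n:                      # every point is covered
            return 0
        if ii == m:                      # points remain but no intervals are left
            return INF
        a, b, w = intervals[ii]
        best = solve(ii + 1, pi)         # case 1: skip this interval
        if a <= points[pi] <= b:         # case 2: take it and skip every point it covers
            best = min(best, w + solve(ii + 1, last_covered[ii]))
        return best

    return solve(0, 0)

print(min_weight_cover([1, 3, 5], [(0, 2, 4), (2, 6, 3), (4, 7, 1)]))   # 7 with these made-up inputs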

How to compute optimal paths for traveling salesman bitonic tour?

UPDATED
After more reading, the solution can be given with the following recurrence relation:
(a) when i = 1 and j = 2: l(i, j) = dist(pi, pj)
(b) when i < j - 1: l(i, j) = l(i, j-1) + dist(pj-1, pj)
(c) when i = j - 1 and j > 2: l(i, j) = min over 1 <= k < i of ( l(k, i) + dist(pk, pj) )
This is now starting to make sense, except for part (c). How would I go about determining the minimum value of k? I suppose it means you iterate through all possible k values and just store the minimum result of l(k, i) + dist(pk, pj)?
Yes, definitely a problem I was studying at school. We are studying bitonic tours for the traveling salesman problem.
Anyway, say I have 5 vertices {0,1,2,3,4}. I know my first step is to sort these in order of increasing x-coordinates. From there, I am a bit confused on how this would be done with dynamic programming.
I am reading that I should scan the list of sorted nodes, and maintain optimal paths for both parts (initial path and the return path). I am confused as to how I will calculate these optimal paths. For instance, how will I know if I should include a given node in the initial path or the return path, since it cannot be in both (except for the endpoints). Thinking back to Fibonacci in dynamic programming, you basically start with your base case and work your way forward. I guess what I am asking is how would I get started with the bitonic traveling salesman problem?
For something like the Fibonacci numbers, a dynamic programming approached is quite clear. However, I don't know if I am just being dense or what but I am quite confused trying to wrap my head around this problem.
Thanks for looking!
NOTE: I am not looking for complete solutions, but at least some good tips to get me started. For example, if this were the Fibonacci problem, one could illustrate how the first few numbers are calculated. Please let me know how I can improve the question as well.
Clarification on your algorithm.
The l(i,j) recursive function should compute the minimum distance of a bitonic tour i -> 1 -> j visiting all nodes that are smaller than i. So, the solution to the initial problem will be l(n,n)!
Important notes:
we can assume that the nodes are ordered by their x coordinate and labeled accordingly (p1.x < p2.x < p3.x < ... < pn.x). If they weren't ordered, we could sort them in O(n log n) time.
l(i,j) = l(j,i). The reason is that on the left-hand side we have an i -> ... -> 1 -> ... -> j tour which is optimal. Traversing this route backward gives us the same distance, and won't break the bitonic property.
Now the easy cases (note the changes!):
(a) When i = 1 and j = 2: l(i, j) = dist(pi, pj) = dist(p1, p2)
Here we have the following tour: 1 -> 1 -> ... -> 2. Trivially, this is equivalent to the length of the path 1 -> ... -> 2. Since the points are ordered by their x coordinate, there is no point between 1 and 2, so the straight line connecting them is the optimal one. (Choosing any number of other points to visit before 2 would result in a longer path!)
(b) When i < j - 1: l(i, j) = l(i, j-1) + dist(pj-1, pj)
In this case, j-1 must be on the part of the path 1 -> ... -> j, because the part i -> ... -> 1 can not contain nodes with an index bigger than i. Because all nodes in the path 1 -> ... -> j are in increasing order of index, there can be none between j-1 and j. So, this is equivalent to the tour: i -> ... -> 1 -> .... -> j-1 -> j, which is equivalent to l(i,j-1) + dist(pj-1,pj)!
And finally the interesting part comes:
(c) When i = j - 1 or i = j: l(i, j) = min over 1 <= k < i of ( l(k, i) + dist(pk, pj) )
Here we know that we have to get from i to 1, but there is no clue on the backward sweep! The key idea here is that we must think of the node just before j on our backward route. It may be any of the nodes from 1 to j-1! Let us assume that this node is k.
Now we have a tour: i -> ... -> 1 -> .... -> k -> j, right? The cost of this tour is l(i,k) + dist(pk,pj).
Hope you got it.
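To make the recurrence concrete, here is a small memoized Python sketch of l(i, j) (my own illustration, assuming 1-indexed points already sorted by x and Euclidean distances; l(n, n) closes the tour at the rightmost point):

from functools import lru_cache
from math import dist   # Euclidean distance between two points (Python 3.8+)

def bitonic_tour_length(points):
    # points: list of (x, y) tuples, assumed sorted by x coordinate
    n = len(points)
    if n < 2:
        return 0.0

    @lru_cache(maxsize=None)
    def l(i, j):
        # shortest "open" bitonic tour i -> ... -> 1 -> ... -> j over all points up to max(i, j)
        if i == 1 and j == 2:                                # case (a)
            return dist(points[0], points[1])
        if i < j - 1:                                        # case (b): j-1 comes right before j
            return l(i, j - 1) + dist(points[j - 2], points[j - 1])
        return min(l(k, i) + dist(points[k - 1], points[j - 1])   # case (c): try every predecessor k
                   for k in range(1, i))

    return l(n, n)

print(bitonic_tour_length([(0, 0), (1, 2), (2, 0), (3, 2)]))   # optimal bitonic tour through four sample points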
Implementation.
You will need a 2-dimensional array, say BT[1..n][1..n]. Let i be the row index and j be the column index. How should we fill in this table?
In the first row we know BT[1][1] = 0 and BT[1][2] = dist(1,2), so only the (i, j) indexes that fall into category (b) are left in that row.
In the remaining rows, we fill the elements from the diagonal to the end of the row.
Here is a sample C++ code (not tested):
void ComputeBitonicTSPCost( const std::vector< std::vector<int> >& dist, int* opt ) {
int n = dist.size();
std::vector< std::vector< int > > BT;
BT.resize(n);
for ( int i = 0; i < n; ++i )
BT.at(i).resize(n);
BT.at(0).at(0) = 0; // p1 to p1 bitonic distance is 0
BT.at(0).at(1) = dist.at(0).at(1); // p1 to p2 bitonic distance is d(2,1)
// fill the first row
for ( int j = 2; j < n; ++j )
BT.at(0).at(j) = BT.at(0).at(j-1) + dist.at(j-1).at(j);
// fill the remaining rows
int temp, min;
for ( int i = 1; i < n; ++i ) {
for ( int j = i; j < n; ++j ) {
BT.at(i).at(j) = -1;
min = std::numeric_limits<int>::max();
if ( i == j || i == j -1 ) {
for( int k = 0; k < i; ++k ) {
temp = BT.at(k).at(i) + dist.at(k).at(j);
min = ( temp < min ) ? temp : min;
}
BT.at(i).at(j) = min;
} else {
BT.at(i).at(j) = BT.at(i).at(j-1) + dist.at(j-1).at(j);
}
}
}
*opt = BT.at(n-1).at(n-1);
}
Okay, the key notions in a dynamic programming solution are:
you pre-compute smaller problems
you have a rule to let you combine smaller problems to find solutions for bigger problems
you have a known property of the problems that lets you prove the solution is really optimal under some measure of optimality. (In this case, shortest.)
The essential property of a bitonic tour is that a vertical line in the coordinate system crosses a side of the closed polygon at most twice. So, what is a bitonic tour of exactly two points? Clearly, any two points form a (degenerate) bitonic tour. Three points have two bitonic tours ("clockwise" and "counterclockwise").
Now, how can you pre-compute the various smaller bitonic tours and combine them until you have all points included and still have a bitonic tour?
Okay, you're on the right track with your update. But now, in a dynamic programming solution, what you do is work bottom-up: pre-compute and memoize (not "memorize") the optimal subproblems.
