How many moves to reach a destination? Efficient flood filling - algorithm

I want to compute the distance of cells from a destination cell, measured as the number of four-way movements needed to reach it. So the four cells immediately adjacent to the destination have a distance of 1, those in the four cardinal directions of each of them have a distance of 2, and so on. There is a maximum distance that might be around 16 or 20, and there are cells that are occupied by barriers; the distance can flow around them but not through them.
I want to store the output into a 2D array, and I want to be able to compute this 'distance map' for any destination on a bigger maze map very quickly.
I am successfully doing it with a variation on a flood fill where I place the incremental distance of the adjacent unfilled cells in a priority queue (using C++ STL).
I am happy with the functionality and now want to focus on optimizing the code, as it is very performance sensitive.
What cunning and fast approaches might there be?

I think you have done everything right. If you coded it correctly, the flood fill takes O(n) time and O(n) memory, where n is the number of cells, and it can be proven that it is impossible to do better in the general case (a plain FIFO queue is enough because every move costs 1; a priority queue adds a log factor). Once the fill is complete, you can return the distance for any cell in O(1), and it is easy to see that this cannot be improved either.
So if you want to optimize performance, you can only focus on local code optimization. It will not affect the asymptotic complexity, but it can significantly improve your real execution time. It is hard to give you any advice on code optimization without actually seeing the source, though.
So if you really want to see optimized code see the following (Pure C):
#include <stdlib.h>
int* BFS()
{
int N, M; // Assume we have NxM grid.
int X, Y; // Start position. X, Y are unit based.
int i, j;
int movex[4] = {0, 0, 1, -1}; // Move on x dimension.
int movey[4] = {1, -1, 0, 0}; // Move on y dimension.
// TO DO: Read N, M, X, Y
// To reduce redundant functions calls and memory reallocation
// allocate all needed memory once and use a simple arrays.
int* map = (int*)malloc((N + 2) * (M + 2) * sizeof(int)); // malloc takes a size in bytes
int leadDim = M + 2;
// Our map. We use one dimension array. map[x][y] = map[leadDim * x + y];
// If (x,y) is occupied then map[leadDim*x + y] = -1;
// If (x,y) is not visited map[leadDim*x + y] = -2;
int* queue = (int*)malloc(N * M * sizeof(int)); // each cell is enqueued at most once
int first = 0, last =1;
// Fill the borders to simplify the code and reduce conditions
for (i = 0; i < N+2; ++i)
{
map[i * leadDim + 0] = -1;
map[i * leadDim + M + 1] = -1;
}
for (j = 0; j < M+2; ++j)
{
map[j] = -1;
map[(N + 1) * leadDim + j] = -1;
}
// TO DO: Read the map.
queue[first] = X * leadDim + Y;
map[X * leadDim + Y] = 0;
// Very simple optimized process loop.
while (first < last)
{
int current = queue[first];
int step = map[current];
for (i = 0; i < 4; ++i)
{
int temp = current + movex[i] * leadDim + movey[i];
if (map[temp] == -2) // only one condition in internal loop.
{
map[temp] = step + 1;
queue[last++] = temp;
}
}
++first;
}
free(queue);
return map;
}
The code may seem tricky. And of course, it doesn't look like OOP (I actually think that OOP fans will hate it), but if you want something really fast, that's what you need.
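For readers who want to try the idea quickly, here is a minimal Python sketch of the same padded-border, single-queue flood fill, with the maximum-distance cutoff from the question added. The grid format (a list of strings with '#' for barriers), the function name distance_map and the max_dist parameter are assumptions made for illustration, not part of the original code.

```python
from collections import deque

def distance_map(grid, dest, max_dist):
    """Breadth-first distance map on a 4-connected grid.

    grid: list of equal-length strings, '#' marks a barrier (assumed format).
    dest: (row, col) of the destination, assumed to be a free cell.
    Returns a 2D list: distance, -1 for barriers, -2 for cells not reached.
    """
    rows, cols = len(grid), len(grid[0])
    width = cols + 2                       # row length including the border
    cells = [-1] * ((rows + 2) * width)    # -1 = barrier, border included
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != '#':
                cells[(r + 1) * width + (c + 1)] = -2   # -2 = not visited
    start = (dest[0] + 1) * width + (dest[1] + 1)
    cells[start] = 0
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        step = cells[cur]
        if step == max_dist:               # do not expand past the cutoff
            continue
        for delta in (1, -1, width, -width):
            nxt = cur + delta
            if cells[nxt] == -2:           # single test thanks to the border
                cells[nxt] = step + 1
                queue.append(nxt)
    return [cells[(r + 1) * width + 1:(r + 1) * width + 1 + cols]
            for r in range(rows)]
```

Because every move costs 1, cells leave the queue in non-decreasing distance order, so no priority queue is needed and the cutoff can simply stop expansion at max_dist.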

This is a common task for BFS. The complexity is O(cellsCount).
My C++ implementation:
vector<vector<int> > GetDistance(int x, int y, vector<vector<int> > cells)
{
const int INF = 0x7FFFFF;
vector<vector<int> > distance(cells.size());
for(int i = 0; i < distance.size(); i++)
distance[i].assign(cells[i].size(), INF);
queue<pair<int, int> > q;
q.push(make_pair(x, y));
distance[x][y] = 0;
while(!q.empty())
{
pair<int, int> curPoint = q.front();
q.pop();
int curDistance = distance[curPoint.first][curPoint.second];
for(int i = -1; i <= 1; i++)
for(int j = -1; j <= 1; j++)
{
if( (i + j) % 2 == 0 ) continue;
pair<int, int> nextPoint(curPoint.first + i, curPoint.second + j);
if(nextPoint.first >= 0 && nextPoint.first < cells.size()
&& nextPoint.second >= 0 && nextPoint.second < cells[nextPoint.first].size()
&& cells[nextPoint.first][nextPoint.second] != BARRIER
&& distance[nextPoint.first][nextPoint.second] > curDistance + 1)
{
distance[nextPoint.first][nextPoint.second] = curDistance + 1;
q.push(nextPoint);
}
}
}
return distance;
}

Start with a recursive implementation: (untested code)
int visit( int xy, int dist) {
int ret =1;
if (array[xy] <= dist) return 0;
array[xy] = dist;
if (dist == maxdist) return ret;
ret += visit ( RIGHT(xy) , dist+1);
...
same for left, up, down
...
return ret;
}
You'll need to handle the initialisation and the edge cases. You also have to decide whether you want a two-dimensional array or a one-dimensional array.
A next step could be to use a todo list and remove the recursion, and a third step could be to add some bitmasking.

8-bit computers in the 1970s did this with an optimization that has the same algorithmic complexity, but in the typical case is much faster on actual hardware.
Starting from the initial square, scan to the left and right until "walls" are found. Now you have a "span" that is one square tall and N squares wide. Mark the span as "filled," in this case each square with the distance to the initial square.
For each square above and below the current span, if it's not a "wall" or already filled, pick it as the new origin of a span.
Repeat until no new spans are found.
Since horizontal rows tend to be stored contiguously in memory, this algorithm tends to thrash the cache far less than one that has no bias for horizontal searches.
Also, since in the most common cases far fewer items are pushed and popped from a stack (spans instead of individual blocks) there is less time spent maintaining the stack.


How to print values in memoization method - Dynamic programming

I know that a problem that can be solved using DP can be solved by either the tabulation (bottom-up) approach or the memoization (top-down) approach. Personally, I find memoization the easier and even more efficient approach: analysis is required only to get the recursive formula, and once it is obtained, a brute-force recursive method can easily be converted to store each sub-problem's result and reuse it. The only problem I am facing with this approach is that I am not able to construct the actual result from the table that I filled on demand.
For example, in the Matrix Product Parenthesization problem (deciding in which order to perform the multiplications on matrices so that the cost of multiplication is minimal), I am able to calculate the minimum cost but not able to generate the order.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Here I am able to get the min-cost of 4500 but unable to get the order, which is (AB)C.
I used this. Suppose F[i, j] represents least number of multiplication needed to multiply Ai.....Aj and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
$$F[i,j] = \begin{cases} 0 & \text{if } i = j \\ \min_{i \le k < j} \left( F[i,k] + F[k+1,j] + p_{i-1}\, p_k\, p_j \right) & \text{if } i < j \end{cases}$$
Below is the implementation that i have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 5 // lookup is indexed 1..n-1, so MAX must be at least n (5 for the test array below)
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know that printing the solution is much easier using the tabulation approach, but I want to do it with the memoization technique.
Thanks.
Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There are two possibilities:
1. When you compute the table, you can store the best index found in another memoization array.
2. You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
if (i == j) {
putchar('A' + i - 1);
return;
}
putchar('(');
print_mult_chain(i, lookup_splits[i][j]);
print_mult_chain(lookup_splits[i][j] + 1, j);
putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
int get_lookup_splits(int p[], int i, int j) {
int best = INT_MAX;
int k_best;
for (int k = i; k < j; k++) {
int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < best) {
best = count;
k_best = k;
}
}
return k_best;
}
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].
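As a cross-check of the first possibility (storing the split point alongside the memoized cost), here is a small Python sketch. The names matrix_chain, cost, render and split are made up for this illustration; it uses the same 1-based convention as the question, where matrix i has dimensions p[i-1] x p[i].

```python
from functools import lru_cache

def matrix_chain(p):
    """Return (minimum multiplication count, parenthesization string)."""
    n = len(p) - 1          # number of matrices in the chain
    split = {}              # split[(i, j)] = best k for the sub-chain i..j

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:
            return 0
        best = None
        for k in range(i, j):
            c = cost(i, k) + cost(k + 1, j) + p[i - 1] * p[k] * p[j]
            if best is None or c < best:
                best = c
                split[(i, j)] = k   # remember where the best split was
        return best

    def render(i, j):
        if i == j:
            return chr(ord('A') + i - 1)   # first matrix is A, second is B, ...
        k = split[(i, j)]
        return "(" + render(i, k) + render(k + 1, j) + ")"

    total = cost(1, n)
    return total, render(1, n)
```

Using None as the "not computed yet" marker (here hidden inside lru_cache) also avoids the ambiguity of treating 0 as "empty" in the lookup table.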

How can I develop the exact recurrence for this?

N buildings are built in a row, numbered 1 to N from left to right.
Spiderman is on building number 1 and wants to reach building number N.
He can jump from building number i to building number j iff i < j and j-i is a power of 2 (1,2,4, so on).
Such a move costs him energy |Height[j]-Height[i]|, where Height[i] is the height of the ith building.
Find the minimum energy using which he can reach building N?
Input:
First line contains N, number of buildings.
Next line contains N space-separated integers, denoting the array Height.
Output:
Print a single integer, the answer to the above problem.
So, I thought of something like this:
int calc(int arr[], int beg, int end, )
{
//int ans = INT_MIN;
if (beg == end)
return 0;
else if (beg > end)
return 0;
else
{
for (int i = beg+1; i <= end; i++ ) // Iterate over all possible combinations
{
int foo = arr[i] - arr[beg]; // Check if power of two or not
int k = log2(foo);
int z = pow(2,k);
if (z == foo) // Calculate the minimum value over multiple values
{
int temp = calc(arr,i,end);
if (temp < ans)
temp = ans;
}
}
}
}
The above is a question that I am trying to solve and here is the link: https://www.codechef.com/TCFS15P/problems/SPIDY2
However, the above recurrence is not exactly correct. Do I have to pass in the value of answer too in this?
We can reach nth building from any of (n-2^0),(n-2^1),(n-2^2)... buildings. So we need to process the buildings starting from 1. For each building i we calculate cost for getting there from any of earlier building j where i-j is power of 2 and take the minimum cost.
int calc(int arr[],int dp[],int n) {
// n is the target building
for(int i=1; i<=n; i++) dp[i]=INT_MAX; // initialize to infinity (dp is int, so INT_MAX rather than LLONG_MAX)
dp[1]=0; // no cost for starting building
for(int i=2; i<=n; i++) {
for(int j=1; i-j>=1; j*=2) {
dp[i]=min(dp[i], dp[i-j]+abs(arr[i]-arr[i-j]));
}
}
return dp[n];
}
Time complexity is O(n*log(n)).
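The same bottom-up recurrence can be sketched in Python. This is only an illustration: min_energy is a hypothetical name, the list is 0-based (so building 1 is index 0), and the input is assumed to be non-empty.

```python
def min_energy(height):
    """Minimum energy to go from the first to the last building,
    jumping forward only by powers of two."""
    n = len(height)
    inf = float("inf")
    dp = [inf] * n
    dp[0] = 0                      # no cost for the starting building
    for i in range(1, n):
        step = 1
        while step <= i:           # try every power-of-two jump that lands on i
            cand = dp[i - step] + abs(height[i] - height[i - step])
            if cand < dp[i]:
                dp[i] = cand
            step *= 2
    return dp[n - 1]
```

For example, with heights [1, 100, 1] the direct jump of length 2 costs nothing, so the answer is 0.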
First, you are doing the check for a power of 2 on the wrong quantity. The jumps have to be between buildings that are separated in index by a power of 2, not that differ in height (which is what you are checking).
Second, the recursion should be formulated in terms of the cost of the first jump and the cost of the remaining jumps (obtained by a recursive call). You are looking for the minimum cost over all legal first jumps. A first jump is legal if it lands on a building at an index no greater than N that is also a power of 2 in index away from the current start.
Something like this should work:
int calc(int arr[], int beg, int end)
{
if (beg == end)
return 0;
else if (beg > end)
throw std::invalid_argument("beg > end"); // needs <stdexcept>
int minEnergy = INT_MAX; // needs <climits>
for (int i = 1; // start with a step of 1
beg + i <= end; // test if we'd go too far
i <<= 1) // increase step to next power of 2
{
int energy = abs(arr[beg + i] - arr[beg]) // energy of first jump
+ calc(arr, beg + i, end); // remaining jumps
if (energy < minEnergy) {
minEnergy = energy;
}
}
return minEnergy;
}
The efficiency of this search can be greatly improved by passing the minimum energy obtained so far. Then if abs(arr[beg + i] - arr[beg]) is not less than that quantity, there's no need to do the recursive call, because whatever is found will never be smaller. (In fact, you can cut off the recursion if abs(arr[beg + i] - arr[beg]) + abs(arr[end] - arr[beg + i]) is not smaller than the best solution so far, because Spiderman will have to at least spend abs(arr[end] - arr[beg + i]) after getting to building beg + i.) Adding this improvement is left as an exercise. :)
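A different way to tame the repeated work, swapped in here instead of the pruning described above, is to memoize the recursion on the start index, since the answer for a given start never changes. The following Python sketch assumes a non-empty, 0-based list; min_energy_top_down and calc are illustrative names.

```python
from functools import lru_cache

def min_energy_top_down(height):
    """First-jump-plus-remaining-jumps recursion, memoized on the start index."""
    end = len(height) - 1

    @lru_cache(maxsize=None)
    def calc(beg):
        if beg == end:
            return 0
        best = None
        step = 1                               # start with a step of 1
        while beg + step <= end:               # stop before jumping too far
            energy = abs(height[beg + step] - height[beg]) + calc(beg + step)
            if best is None or energy < best:
                best = energy
            step *= 2                          # next power of 2
        return best

    return calc(0)
```

Each start index is solved once, so the work is O(n log n) like the bottom-up version. For very long rows of buildings the recursion depth can exceed Python's default limit, in which case the iterative form is the safer choice.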

Water collected between towers

I recently came across an interview question asked by Amazon and I am not able to find an optimized algorithm to solve this question:
You are given an input array in which each element represents the height of a tower in a line of towers. The width of every tower is 1. It starts raining. How much water is collected between the towers?
Example
Input: [1,5,3,7,2] , Output: 2 units
Explanation: 2 units of water collected between towers of height 5 and 7
   *
   *
 *w*
 *w*
 ***
 ****
*****
Another Example
Input: [5,3,7,2,6,4,5,9,1,2] , Output: 14 units
Explanation= 2 units of water collected between towers of height 5 and 7 +
4 units of water collected between towers of height 7 and 6 +
1 units of water collected between towers of height 6 and 5 +
2 units of water collected between towers of height 6 and 9 +
4 units of water collected between towers of height 7 and 9 +
1 units of water collected between towers of height 9 and 2.
At first I thought this could be solved by Stock-Span Problem (http://www.geeksforgeeks.org/the-stock-span-problem/) but I was wrong so it would be great if anyone can think of a time-optimized algorithm for this question.
Once the water's done falling, each position will fill to a level equal to the smaller of the highest tower to the left and the highest tower to the right.
Find, by a rightward scan, the highest tower to the left of each position. Then find, by a leftward scan, the highest tower to the right of each position. Then take the minimum at each position and add them all up.
Something like this ought to work:
int tow[N]; // nonnegative tower heights
int hl[N] = {0}, hr[N] = {0}; // highest-left and highest-right
for (int i = 0; i < n; i++) hl[i] = max(tow[i], (i!=0)?hl[i-1]:0);
for (int i = n-1; i >= 0; i--) hr[i] = max(tow[i],i<(n-1) ? hr[i+1]:0);
int ans = 0;
for (int i = 0; i < n; i++) ans += min(hl[i], hr[i]) - tow[i];
Here's an efficient solution in Haskell
rainfall :: [Int] -> Int
rainfall xs = sum (zipWith (-) mins xs)
where mins = zipWith min maxl maxr
maxl = scanl1 max xs
maxr = scanr1 max xs
It uses the same two-pass scan algorithm mentioned in the other answers.
Refer to this website for the code; it is really plain and simple:
http://learningarsenal.info/index.php/2015/08/21/amount-of-rain-water-collected-between-towers/
Input: [5,3,7,2,6,4,5,9,1,2] , Output: 14 units
Explanation
Each tower can hold water up to the level of the smaller of the highest tower to its left and the highest tower to its right.
Thus we need to calculate the highest tower to the left of each and every tower, and likewise for the right side.
Here we need two extra arrays: one holding the height of the highest tower to the left of each tower, say int leftMax[], and likewise one for the right side, say int rightMax[].
STEP-1
We make a left-to-right pass over the given array (i.e. int tower[]) while maintaining a temporary maximum (say int tempMax). On each iteration the height of the current tower is compared to tempMax. If the current tower is lower than tempMax, then tempMax is recorded as the highest tower to its left; otherwise the height of the current tower is recorded as the highest tower to the left and tempMax is updated to the current tower's height.
STEP-2
We follow the same procedure as in STEP-1 to calculate the highest tower to the right, but this time making the pass through the array from the right side.
STEP-3
The amount of water which each tower can hold is-
(minimum height between highest right tower and highest left tower) – (height of tower)
You can do this by scanning the array twice.
The first time, you scan from the start of the array to the end and store, for each position, the height of the tallest tower encountered so far.
You then repeat the process in reverse. You start from the end and work towards the start of the array, keeping track of the tallest tower you have seen so far and comparing its height to the value stored for that position in the first pass.
Take the lesser of these two values (the shorter of the two tallest towers surrounding the current tower), subtract the height of the current tower, and add that amount to the total amount of water.
int maxValue = 0;
int total = 0;
int[] lookAhead = new int[n];
for(i=0;i<n;i++)
{
if(input[i] > maxValue) maxValue = input[i];
lookAhead[i] = maxValue;
}
}
maxValue = 0;
for(i=n-1;i>=0;i--)
{
// If the input is greater than or equal to the max, all water escapes.
if(input[i] >= maxValue)
{
maxValue = input[i];
}
else
{
if(maxValue > lookAhead[i])
{
// Make sure we don't run off the other side.
if(lookAhead[i] > input[i])
{
total += lookAhead[i] - input[i];
}
}
else
{
total += maxValue - input[i];
}
}
}
Readable Python Solution:
def water_collected(heights):
    water_collected = 0
    left_height = []
    right_height = []
    temp_max = heights[0]
    for height in heights:
        if (height > temp_max):
            temp_max = height
        left_height.append(temp_max)
    temp_max = heights[-1]
    for height in reversed(heights):
        if (height > temp_max):
            temp_max = height
        right_height.insert(0, temp_max)
    for i, height in enumerate(heights):
        water_collected += min(left_height[i], right_height[i]) - height
    return water_collected
O(n) solution in Java, single pass
Another implementation in Java, finding the water collected in a single pass through the list. I scanned the other answers but didn't see any that were obviously using my solution.
Find the first "peak" by looping through the list until the tower height stops increasing. All water before this will not be collected (drain off to the left).
For all subsequent towers:
If the height of the subsequent tower decreases or stays the same, add water to a "potential collection" bucket, equal to the difference between the tower height and the previous max tower height.
If the height of the subsequent tower increases, we collect water from the previous bucket (subtract from the "potential collection" bucket and add to the collected bucket) and also add water to the potential bucket equal to the difference between the tower height and the previous max tower height.
If we find a new max tower, then all the "potential water" is moved into the collected bucket and this becomes the new max tower height.
In the example above, with input: [5,3,7,2,6,4,5,9,1,2], the solution works as follows:
5: Finds 5 as the first peak
3: Adds 2 to the potential bucket (5-3) collected = 0, potential = 2
7: New max, moves all potential water to the collected bucket collected = 2, potential = 0
2: Adds 5 to the potential bucket (7-2) collected = 2, potential = 5
6: Moves 4 to the collected bucket and adds 1 to the potential bucket (6-2, 7-6) collected = 6, potential = 2
4: Adds 3 to the potential bucket (7-4) collected = 6, potential = 5
5: Moves 1 to the collected bucket and adds 2 to the potential bucket (5-4, 7-5) collected = 7, potential = 6
9: New max, moves all potential water to the collected bucket collected = 13, potential = 0
1: Adds 8 to the potential bucket (9-1) collected = 13, potential = 8
2: Moves 1 to the collected bucket and adds 7 to the potential bucket (2-1, 9-2) collected = 14, potential = 14
After running through the list once, collected water has been measured.
public static int answer(int[] list) {
int maxHeight = 0;
int previousHeight = 0;
int previousHeightIndex = 0;
int coll = 0;
int temp = 0;
// find the first peak (all water before will not be collected)
while(list[previousHeightIndex] > maxHeight) {
maxHeight = list[previousHeightIndex];
previousHeightIndex++;
if(previousHeightIndex==list.length) // in case of stairs (no water collected)
return coll;
else
previousHeight = list[previousHeightIndex];
}
for(int i = previousHeightIndex; i<list.length; i++) {
if(list[i] >= maxHeight) { // collect all temp water
coll += temp;
temp = 0;
maxHeight = list[i]; // new max height
}
else {
temp += maxHeight - list[i];
if(list[i] > previousHeight) { // we went up... collect some water
int collWater = (i-previousHeightIndex)*(list[i]-previousHeight);
coll += collWater;
temp -= collWater;
}
}
// previousHeight only changes if consecutive towers are not same height
if(list[i] != previousHeight) {
previousHeight = list[i];
previousHeightIndex = i;
}
}
return coll;
}
None of the 17 answers already posted are really time-optimal.
For a single processor, a 2 sweep (left->right, followed by a right->left summation) is optimal, as many people have pointed out, but using many processors, it is possible to complete this task in O(log n) time. There are many ways to do this, so I'll explain one that is fairly close to the sequential algorithm.
Max-cached tree O(log n)
1: Create a binary tree of all towers such that each node contains the height of the highest tower in any of its children. Since the two leaves of any node can be computed independently, this can be done in O(log n) time with n cpu's. (Each value is handled by its own cpu, and they build the tree by repeatedly merging two existing values. All parallel branches can be executed in parallel. Thus, it's O(log2(n)) for a 2-way merge function (max, in this case)).
2a: Then, for each node in the tree, starting at the root, let the right leaf have the value max(left, self, right). This will create the left-to-right monotonic sweep in O(log n) time, using n cpu's.
2b: To compute the right-to-left sweep, we do the same procedure as before. Starting with root of the max-cached tree, let the left leaf have the value max(left, self, right). These left-to-right (2a) and right-to-left (2b) sweeps can be done in parallel if you'd like to. They both use the max-cached tree as input, and generate one new tree each (or sets their own fields in original tree, if you prefer that).
3: Then, for each tower, the amount of water on it is min(ltr, rtl) - towerHeight, where ltr is the value for that tower in the left-to-right monotonic sweep we did before, i.e. the maximum height of any tower to the left of us (including ourselves [1]), and rtl is the same for the right-to-left sweep.
4: Simply sum this up using a tree in O(log n) time using n cpu's, and we're done.
[1] If the current tower is taller than all towers to the left of us, or taller than all towers to the right of us, min(ltr, rtl) - towerHeight is zero.
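To make the data flow of steps 2a to 4 concrete, here is a sequential Python sketch. It does not run in parallel; it only shows that the whole computation consists of two running-maximum scans, an element-wise min, and a sum, all built from associative operations, which is what makes the logarithmic-depth tree evaluation described above possible. The function name water_by_scans is an assumption for illustration.

```python
from itertools import accumulate

def water_by_scans(towers):
    """Two max-scans, an element-wise min, and a sum reduction."""
    ltr = list(accumulate(towers, max))                    # left-to-right sweep
    rtl = list(accumulate(reversed(towers), max))[::-1]    # right-to-left sweep
    # Water above each tower is min(ltr, rtl) - towerHeight, then summed.
    return sum(min(l, r) - h for l, r, h in zip(ltr, rtl, towers))
```

Both scans include the current tower itself, matching the footnote: a tower that is the tallest seen from either side contributes zero.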
Here's two other ways to do it.
Here is a solution in Groovy in two passes.
assert waterCollected([1, 5, 3, 7, 2]) == 2
assert waterCollected([5, 3, 7, 2, 6, 4, 5, 9, 1, 2]) == 14
assert waterCollected([5, 5, 5, 5]) == 0
assert waterCollected([5, 6, 7, 8]) == 0
assert waterCollected([8, 7, 7, 6]) == 0
assert waterCollected([6, 7, 10, 7, 6]) == 0
def waterCollected(towers) {
int size = towers.size()
if (size < 3) return 0
int left = towers[0]
int right = towers[towers.size() - 1]
def highestToTheLeft = []
def highestToTheRight = [null] * size
for (int i = 1; i < size; i++) {
// Track highest tower to the left
if (towers[i] < left) {
highestToTheLeft[i] = left
} else {
left = towers[i]
}
// Track highest tower to the right
if (towers[size - 1 - i] < right) {
highestToTheRight[size - 1 - i] = right
} else {
right = towers[size - 1 - i]
}
}
int water = 0
for (int i = 0; i < size; i++) {
if (highestToTheLeft[i] && highestToTheRight[i]) {
int minHighest = highestToTheLeft[i] < highestToTheRight[i] ? highestToTheLeft[i] : highestToTheRight[i]
water += minHighest - towers[i]
}
}
return water
}
Here is the same snippet in an online compiler:
https://groovy-playground.appspot.com/#?load=3b1d964bfd66dc623c89
You can traverse first from left to right, and calculate the water accumulated for the cases where there is a smaller building on the left and a larger one on the right. You would have to subtract the area of the buildings that are in between these two buildings and are smaller than the left one.
Similar would be the case for right to left.
Here is the code for left to right. I have uploaded this problem on leetcode online judge using this approach.
I find this approach much more intuitive than the standard solution which is present everywhere (calculating the largest building on the right and the left for each i ).
int sum=0, finalAns=0;
int idx=0;
while(idx < n && a[idx]==0) // check the bound before reading a[idx]
idx++;
for(int i=idx+1;i<n;i++){
while(i < n && a[i] < a[idx]){ // check the bound before reading a[i]
sum += a[i];
i++;
}
if(i==n)
break;
int jdx=i;
int area = a[idx] * (jdx-idx-1);
area -= sum;
finalAns += area;
idx=jdx;
sum=0;
}
The time complexity of this approach is O(n), as you traverse the array twice linearly.
Space complexity would be O(1).
The first and the last bars in the list cannot trap water. For the remaining towers, they can trap water when there are max heights to the left and to the right.
water accumulation is:
max( min(max_left, max_right) - current_height, 0 )
Iterating from the left, if we know that there is a max_right that is greater, min(max_left, max_right) will become just max_left. Therefore water accumulation is simplified as:
max(max_left - current_height, 0) Same pattern when considering from the right side.
From the info above, we can write an O(N) time and O(1) space algorithm as follows (in Python):
def trap_water(A):
    water = 0
    left, right = 1, len(A)-1
    max_left, max_right = A[0], A[len(A)-1]
    while left <= right:
        if A[left] <= A[right]:
            max_left = max(A[left], max_left)
            water += max(max_left - A[left], 0)
            left += 1
        else:
            max_right = max(A[right], max_right)
            water += max(max_right - A[right], 0)
            right -= 1
    return water
/**
 * @param {number[]} height
 * @return {number}
*/
var trap = function(height) {
let maxLeftArray = [], maxRightArray = [];
let maxLeft = 0, maxRight = 0;
const ln = height.length;
let trappedWater = 0;
for(let i = 0;i < height.length; i ++) {
maxLeftArray[i] = Math.max(height[i], maxLeft);
maxLeft = maxLeftArray[i];
maxRightArray[ln - i - 1] = Math.max(height[ln - i - 1], maxRight);
maxRight = maxRightArray[ln - i - 1];
}
for(let i = 0;i < height.length; i ++) {
trappedWater += Math.min(maxLeftArray[i], maxRightArray[i]) - height[i];
}
return trappedWater;
};
var arr = [5,3,7,2,6,4,5,9,1,2];
console.log(trap(arr));
You could read the detailed explanation in my blogpost: trapping-rain-water
Here is one more solution, written in Scala:
def find(a: Array[Int]): Int = {
var count, left, right = 0
while (left < a.length - 1) {
right = a.length - 1
for (j <- a.length - 1 until left by -1) {
if (a(j) > a(right)) right = j
}
if (right - left > 1) {
for (k <- left + 1 until right) count += math.min(a(left), a(right)) - a(k)
left = right
} else left += 1
}
count
}
An alternative algorithm in the style of Euclid, which I consider more elegant than all this scanning is:
Set the two tallest towers as the left and right tower. The amount of
water contained between these towers is obvious.
Take the next tallest tower and add it. It must be either between the
end towers, or not. If it is between the end towers it displaces an
amount of water equal to the tower's volume (thanks to Archimedes for
this hint). If it is outside the end towers it becomes a new end tower
and the amount of additional water contained is obvious.
Repeat for the next tallest tower until all towers are added.
I've posted code to achieve this (in a modern Euclidean idiom) here: http://www.rosettacode.org/wiki/Water_collected_between_towers#F.23
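For those who prefer not to follow the link, the steps above can be sketched in Python roughly as follows. This is not the linked F# code, only one reading of the description; water_tallest_first and the tie-breaking by original position are assumptions.

```python
def water_tallest_first(heights):
    """Add towers from tallest to shortest, tracking the two end towers."""
    if len(heights) < 3:
        return 0
    # Indices ordered from tallest to shortest tower.
    order = sorted(range(len(heights)), key=lambda i: heights[i], reverse=True)
    left = min(order[0], order[1])
    right = max(order[0], order[1])
    # Water between the two tallest towers, as if the gap were empty:
    # the level is the height of the shorter of the two.
    water = heights[order[1]] * (right - left - 1)
    for idx in order[2:]:
        h = heights[idx]
        if left < idx < right:
            water -= h                          # displaces its own volume
        elif idx < left:
            water += h * (left - idx - 1)       # new basin on the left
            left = idx
        else:
            water += h * (idx - right - 1)      # new basin on the right
            right = idx
    return water
```

A new end tower is never taller than the existing end towers, so the basin it opens fills exactly to its own height; the towers inside that basin are subtracted later when they are added in turn. The sort makes this O(n log n) rather than O(n).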
I have a solution that only requires a single traversal from left to right.
def standing_water(heights):
    if len(heights) < 3:
        return 0
    i = 0  # index used to iterate from left to right
    w = 0  # accumulator for the total amount of water
    while i < len(heights) - 1:
        target = i + 1
        for j in range(i + 1, len(heights)):
            if heights[j] >= heights[i]:
                target = j
                break
            if heights[j] > heights[target]:
                target = j
        if target == i:
            return w
        surface = min(heights[i], heights[target])
        i += 1
        while i < target:
            w += surface - heights[i]
            i += 1
    return w
An intuitive solution for this problem is one in which you bound the problem and fill water based on the height of the left and right bounds.
My solution:
Begin at the left, setting both bounds to be the 0th index.
Check and see if there is some kind of a trajectory (If you were to
walk on top of these towers, would you ever go down and then back up
again?) If that is the case, then you have found a right bound.
Now back track and fill the water accordingly (I simply added the
water to the array values themselves as it makes the code a little
cleaner, but this is obviously not required).
The punch line: If the left bounding tower height is greater than the
right bounding tower height, then you need to increment the right
bound, because you might run into a higher tower and need to fill some more water.
However, if the right tower is higher than the left tower then no
more water can be added in your current sub-problem. Thus, you move
your left bound to the right bound and continue.
Here is an implementation in C#:
int[] towers = {1,5,3,7,2};
int currentMinimum = towers[0];
bool rightBoundFound = false;
int i = 0;
int leftBoundIndex = 0;
int rightBoundIndex = 0;
int waterAdded = 0;
while(i < towers.Length - 1)
{
currentMinimum = towers[i];
if(towers[i] < currentMinimum)
{
currentMinimum = towers[i];
}
if(towers[i + 1] > towers[i])
{
rightBoundFound = true;
rightBoundIndex = i + 1;
}
if (rightBoundFound)
{
for(int j = leftBoundIndex + 1; j < rightBoundIndex; j++)
{
int difference = 0;
if(towers[leftBoundIndex] < towers[rightBoundIndex])
{
difference = towers[leftBoundIndex] - towers[j];
}
else if(towers[leftBoundIndex] > towers[rightBoundIndex])
{
difference = towers[rightBoundIndex] - towers[j];
}
else
{
difference = towers[rightBoundIndex] - towers[j];
}
towers[j] += difference;
waterAdded += difference;
}
if (towers[leftBoundIndex] > towers[rightBoundIndex])
{
i = leftBoundIndex - 1;
}
else if (towers[rightBoundIndex] > towers[leftBoundIndex])
{
leftBoundIndex = rightBoundIndex;
i = rightBoundIndex - 1;
}
else
{
leftBoundIndex = rightBoundIndex;
i = rightBoundIndex - 1;
}
rightBoundFound = false;
}
i++;
}
I have no doubt that there are more optimal solutions. I am currently working on a single-pass optimization. There is also a very neat stack implementation of this problem, and it uses a similar idea of bounding.
Here is my solution. It passes this level, is pretty fast, and is easy to understand.
The idea is very simple: first, you figure out the maximum of the heights (it could be multiple maximum), then you chop the landscape into 3 parts, from the beginning to the left most maximum heights, between the left most max to the right most max, and from the right most max to the end.
In the middle part, it's easy to collect the rains, one for loop does that. Then for the first part, you keep on updating the current max height that is less than the max height of the landscape. one loop does that. Then for the third part, you reverse what you have done to the first part
def answer(heights):
    sumL = 0
    sumM = 0
    sumR = 0
    L = len(heights)
    MV = max(heights)
    FI = heights.index(MV)
    LI = L - heights[::-1].index(MV) - 1
    if LI-FI>1:
        for i in range(FI+1,LI):
            sumM = sumM + MV-heights[i]
    if FI>0:
        TM = heights[0]
        for i in range(1,FI):
            if heights[i]<= TM:
                sumL = sumL + TM-heights[i]
            else:
                TM = heights[i]
    if LI<(L-1):
        TM = heights[-1]
        for i in range(L-1,LI,-1):
            if heights[i]<= TM:
                sumR = sumR + TM-heights[i]
            else:
                TM = heights[i]
    return(sumL+sumM+sumR)
Here is a solution in Java that traverses the list of numbers once, so the worst-case time is O(n). (At least that's how I understand it.)
For a given reference number, keep looking for a number that is greater than or equal to the reference number. Keep a count of the numbers traversed in doing so and store all those numbers in a list.
The idea is this. If there are 5 numbers between 6 and 9, and all five numbers are 0's, a total of 30 units of water can be stored between 6 and 9. For a real situation where the numbers in between aren't 0's, we just deduct the sum of the numbers in between from the amount that could be stored if those numbers were 0 (in this case, we deduct from 30). That gives the amount of water stored between these two towers. We then add this amount to a variable called totalWaterContained, start again from the next tower after 9, and keep doing the same until the last element.
Adding up all these amounts in totalWaterContained gives the final answer.
Java solution (tested on a few inputs; might not be 100% correct):
private static int solveLineTowerProblem(int[] inputArray) {
int totalWaterContained = 0;
int index;
int currentIndex = 0;
int countInBetween = 0;
List<Integer> integerList = new ArrayList<Integer>();
if (inputArray.length < 3) {
return totalWaterContained;
} else {
for (index = 1; index < inputArray.length - 1;) {
countInBetween = 0;
integerList.clear();
int tempIndex = index;
boolean flag = false;
while (inputArray[currentIndex] > inputArray[tempIndex] && tempIndex < inputArray.length - 1) {
integerList.add(inputArray[tempIndex]);
tempIndex++;
countInBetween++;
flag = true;
}
if (flag) {
integerList.add(inputArray[index + countInBetween]);
integerList.add(inputArray[index - 1]);
int differnceBetweenHighest = min(integerList.get(integerList.size() - 2),
integerList.get(integerList.size() - 1));
int totalCapacity = differnceBetweenHighest * countInBetween;
totalWaterContained += totalCapacity - sum(integerList);
}
index += countInBetween + 1;
currentIndex = index - 1;
}
}
return totalWaterContained;
}
Here is my take on the problem.
I use a loop to see if the previous tower is taller than the current one.
If it is, I run another loop to check whether any of the towers after the current one is taller than or equal to the previous tower.
If that's the case, I just add up the differences in height between the previous tower and all the towers in between.
If not, and my loop reaches the last element, I simply reverse the array so that the previous tower becomes my last tower and call my method recursively on it.
That way I'm certain to find a tower at least as tall as my new previous tower, and I will find the correct amount of water collected.
import java.util.Arrays;
public class towers {
public static int waterLevel(int[] i) {
int totalLevel = 0;
for (int j = 1; j < i.length - 1; j++) {
if (i[j - 1] > i[j]) {
for (int k = j; k < i.length; k++) {
if (i[k] >= i[j - 1]) {
for (int l = j; l < k; l++) {
totalLevel += (i[j - 1] - i[l]);
}
j = k;
break;
}
if (k == i.length - 1) {
int[] copy = Arrays.copyOfRange(i, j - 1, k + 1);
int[] revcopy = reverse(copy);
totalLevel += waterLevel(revcopy);
}
}
}
}
return totalLevel;
}
public static int[] reverse(int[] i) {
for (int j = 0; j < i.length / 2; j++) {
int temp = i[j];
i[j] = i[i.length - j - 1];
i[i.length - j - 1] = temp;
}
return i;
}
public static void main(String[] args) {
System.out.println(waterLevel(new int[] {1, 6, 3, 2, 2, 6}));
}
}
I tested all the Java solutions provided, but none of them passes even half of the test cases I came up with, so here is one more Java O(n) solution, with all possible cases covered. The algorithm is really simple:
1) Traverse the input from the beginning, searching for a tower that is equal to or higher than the current tower, while summing the possible amount of water above the lower towers into a temporary variable.
2) Once such a tower is found, add the temporary variable to the main result and shorten the input list.
3) If no such tower is found, reverse the remaining input and calculate again.
public int calculate(List<Integer> input) {
int result = doCalculation(input);
Collections.reverse(input);
result += doCalculation(input);
return result;
}
private static int doCalculation(List<Integer> input) {
List<Integer> copy = new ArrayList<>(input);
int result = 0;
for (ListIterator<Integer> iterator = input.listIterator(); iterator.hasNext(); ) {
final int firstHill = iterator.next();
int tempResult = 0;
int lowerHillsSize = 0;
while (iterator.hasNext()) {
final int nextHill = iterator.next();
if (nextHill >= firstHill) {
iterator.previous();
result += tempResult;
copy = copy.subList(lowerHillsSize + 1, copy.size());
break;
} else {
tempResult += firstHill - nextHill;
lowerHillsSize++;
}
}
}
input.clear();
input.addAll(copy);
return result;
}
For the test cases, please, take a look at this test class.
Feel free to create a pull request if you find uncovered test cases.
This is a funny problem; I just got that question in an interview. I racked my brain over it and found a solution that needs only one pass (though clearly not a contiguous one). In fact, you do not even loop over the entire data, since the boundary towers are skipped.
The idea is this: start from the side with the lower tower, which becomes the reference. Add the water above each tower directly, and if you reach a tower that is higher than the reference, call the function recursively on the remaining range (with the sides re-evaluated). It is not trivial to explain in words; the code speaks for itself.
#include <cstdlib>   // std::abs, malloc, free, atoi
#include <iostream>
using namespace std;
int compute_water(int * array, int index_min, int index_max)
{
int water = 0;
int dir;
int start,end;
int steps = std::abs(index_max-index_min)-1;
int i,count;
if(steps>=1)
{
if(array[index_min]<array[index_max])
{
dir=1;
start = index_min;
end = index_max;
}
else
{
dir = -1;
start = index_max;
end = index_min;
}
for(i=start+dir,count=0;count<steps;i+=dir,count++)
{
if(array[i]<=array[start])water += array[start] - array[i];
else
{
if(i<end)water += compute_water(array, i, end);
else water += compute_water(array, end, i);
break;
}
}
}
return water;
}
int main(int argc,char ** argv)
{
int size = 0;
int * towers;
if(argc==1)
{
cout<< "Usage: "<<argv[0]<< " <list of tower heights separated by spaces>" <<endl;
}
else
{
size = argc - 1;
towers = (int*)malloc(size*sizeof(int));
for(int i = 0; i<size;i++)towers[i] = atoi(argv[i+1]);
cout<< "water collected: "<< compute_water(towers, 0, size-1)<<endl;
free(towers);
}
}
I wrote this relying on some of the ideas above in this thread:
def get_collected_rain(towers):
    length = len(towers)
    acummulated_water=[0]*length
    left_max=[0]*length
    right_max=[0]*length
    for n in range(0,length):
        #first left item
        if n!=0:
            left_max[n]=max(towers[:n])
        #first right item
        if n!=length-1:
            right_max[n]=max(towers[n+1:length])
        acummulated_water[n]=max(min(left_max[n], right_max[n]) - towers[n], 0)
    return sum(acummulated_water)
Well ...
> print(get_collected_rain([9,8,7,8,9,5,6]))
> 5
Here's my attempt in jQuery. It only scans to the right.
Working fiddle (with helpful logging)
var a = [1, 5, 3, 7, 2];
var water = 0;
$.each(a, function (key, i) {
if (i > a[key + 1]) { //if next tower to right is smaller
for (j = 1; j <= a.length - key; j++) { //number of remaining towers to the right
if (a[key+1 + j] >= i) { //if any tower to the right is bigger
for (k = 1; k < 1+j; k++) {
//add to water: the difference of the first tower and each tower between the first tower and its bigger tower
water += a[key] - a[key+k];
}
}
}
}
});
console.log("Water: "+water);
Here's my go at it in Python. Pretty sure it works but haven't tested it.
Two passes through the list (but deleting the list as it finds 'water'):
def answer(heights):
    def accWater(lst,sumwater=0):
        x,takewater = 1,[]
        while x < len(lst):
            a,b = lst[x-1],lst[x]
            if takewater:
                if b < takewater[0]:
                    takewater.append(b)
                    x += 1
                else:
                    sumwater += sum(takewater[0]- z for z in takewater)
                    del lst[:x]
                    x = 1
                    takewater = []
            else:
                if b < a:
                    takewater.extend([a,b])
                    x += 1
                else:
                    x += 1
        return [lst,sumwater]
    heights, swater = accWater(heights)
    x, allwater = accWater(heights[::-1],sumwater=swater)
    return allwater
private static int soln1(int[] a)
{
int ret=0;
int l=a.length;
int st,en=0;
int h,i,j,k=0;
int sm;
for(h=0;h<l;h++)
{
for(i=1;i<l;i++)
{
if(a[i]<a[i-1])
{
st=i;
for(j=i;j<l-1;j++)
{
if(a[j]<=a[i] && a[j+1]>a[i])
{
en=j;
h=en;
break;
}
}
if(st<=en)
{
sm=a[st-1];
if(sm>a[en+1])
sm=a[en+1];
for(k=st;k<=en;k++)
{
ret+=sm-a[k];
a[k]=sm;
}
}
}
}
}
return ret;
}
/*** Theta(n) time complexity ***/
static int trappingRainWater(int ar[],int n)
{
int res=0;
int lmaxArray[]=new int[n];
int rmaxArray[]=new int[n];
lmaxArray[0]=ar[0];
for(int j=1;j<n;j++)
{
lmaxArray[j]=Math.max(lmaxArray[j-1], ar[j]);
}
rmaxArray[n-1]=ar[n-1];
for(int j=n-2;j>=0;j--)
{
rmaxArray[j]=Math.max(rmaxArray[j+1], ar[j]);
}
for(int i=1;i<n-1;i++)
{
res=res+(Math.min(lmaxArray[i], rmaxArray[i])-ar[i]);
}
return res;
}
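The Java method above stores the running maximum from the left and from the right in two extra arrays. The same quantity can be computed with two indices moving inward and only constant extra space. The following is a minimal Python sketch of that variant, not code from this thread; the name `trapped_water_two_pointers` is hypothetical.

```python
def trapped_water_two_pointers(heights):
    """Return trapped water using two inward-moving indices and O(1) extra space."""
    left, right = 0, len(heights) - 1
    left_max = right_max = 0
    water = 0
    while left < right:
        if heights[left] < heights[right]:
            # The right side has a wall at least as tall as anything seen
            # on the left, so left_max alone decides the water level here.
            left_max = max(left_max, heights[left])
            water += left_max - heights[left]
            left += 1
        else:
            # Symmetric case: right_max decides the water level.
            right_max = max(right_max, heights[right])
            water += right_max - heights[right]
            right -= 1
    return water


print(trapped_water_two_pointers([9, 8, 7, 8, 9, 5, 6]))  # prints 5
```

The side with the lower tower is always the one that advances, because the water level there cannot depend on anything further inside the array.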

Backtracking optimization

Recently I was trying to solve the famous Little Bishops algorithmic problem. On one website I read that I should divide the chessboard into black and white parts to optimize the execution. After that I should use backtracking to count the number of possible ways to put bishops on black squares and white squares separately.
In the following code I try to put 6 bishops ONLY ON WHITE squares of an 8 by 8 chessboard. I do it only to verify that technique is really working.
//inside main function
int k = 6; //number of bishops
int n = 8; //length of one side of chessboard
Integer[] positions = new Integer[k];
long result = backtrack(positions, 0, n);
//find how many times we double counting each possible combination of bishops
int factor = 1;
for(int i = k; i>0; i--) {
factor = factor * i;
}
System.out.println("The result is " + result/factor);
//implementation of other functions
public long backtrack(Integer[] prevPositions, int k, int n) {
if(k == 6) {
return 1;
}
long sum = 0;
Integer[] candidates = new Integer[n*n];
int length = getCandidates(prevPositions, k, candidates, n);
for(int i=0 ; i<length ; i++) {
prevPositions[k] = candidates[i];
sum += backtrack(prevPositions,k+1,n);
}
return sum;
}
public Integer getCandidates(Integer[] prevPositions, int k, Integer[] candidates, int n) {
int length = 0;
//only white squares are considered as candidates, hence i+=2
for (int i = 0; i < n*n; i+=2) {
boolean isGood = true;
int iRow = i / n;
int iCol = i % n;
for (int j = 0; j < k; j++) {
int prev = prevPositions[j];
if (i == prev) {
isGood = false;
break;
} else {
int prevRow = prev / n;
int prevCol = prev % n;
if (Math.abs(iRow - prevRow) == Math.abs(iCol - prevCol)) {
isGood = false;
break;
}
}
}
if(isGood) {
candidates[length] = new Integer(i);
length++;
}
}
return length;
}
Even though I can see why dividing the chessboard into white and black squares optimizes the problem, it still takes around 11 seconds to count the number of possible ways to put all bishops ONLY ON WHITE SQUARES. Can you help me, please? What am I doing wrong?
Here are a few ways to improve your search.
(1) Instead of generate-and-test, you could consider finite domain search, where every bishop has a "domain" of possible places. Whenever you place a bishop, you prune the domains of the remaining bishops. If a bishop's domain becomes empty, you must backtrack.
(2) As a refinement, if you have n bishops to place and m < n places left, you must backtrack.
(3) Use dynamic programming/memoization, where you store solutions for 1 bishop, 2 bishops, ..., and compute the set of n + 1 bishop solutions from the set of n bishop solutions.
(4) Exploit symmetry to reduce your search space. In this case there is (at least) black/white symmetry and rotational/reflective symmetry.
(5) Try to find a better representation. For example, bit patterns.
(6) If you use a different representation, look into using a "trail" (cf. Prolog) to track the operations you need to undo on backtracking.
Cheers!
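As one possible illustration of suggestions (2) and (5), here is a minimal Python sketch, not the answerer's code. It places bishops in increasing square order, so each set of bishops is counted once and no division by k! is needed, and it tracks occupied diagonals in two sets instead of re-testing every earlier bishop. The names `count_on_color` and `count_bishops` are hypothetical. One assumption worth stating: a square's colour is taken as the parity of row + col, because on a board with even n, stepping a flat index by 2 visits alternate columns rather than squares of one colour.

```python
def count_on_color(n, k, color):
    """Count ways to place k non-attacking bishops on squares with (row + col) % 2 == color."""
    squares = [(r, c) for r in range(n) for c in range(n) if (r + c) % 2 == color]
    used_diag = set()  # occupied diagonals, identified by row - col
    used_anti = set()  # occupied anti-diagonals, identified by row + col

    def place(start, remaining):
        if remaining == 0:
            return 1
        if len(squares) - start < remaining:
            return 0  # suggestion (2): fewer squares left than bishops to place
        total = 0
        for idx in range(start, len(squares)):
            r, c = squares[idx]
            d, a = r - c, r + c
            if d in used_diag or a in used_anti:
                continue
            used_diag.add(d)
            used_anti.add(a)
            total += place(idx + 1, remaining - 1)  # only later squares: no double counting
            used_diag.remove(d)
            used_anti.remove(a)
        return total

    return place(0, k)


def count_bishops(n, k):
    """Combine the two colours: bishops on different colours never attack each other."""
    return sum(count_on_color(n, i, 0) * count_on_color(n, k - i, 1) for i in range(k + 1))


print(count_bishops(4, 4))  # prints 260
```

Because the two colours are independent, the total for k bishops is a sum of products over how many bishops go on each colour.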

A Cache Efficient Matrix Transpose Program?

So the obvious way to transpose a matrix is to use:
for( int i = 0; i < n; i++ )
for( int j = 0; j < n; j++ )
destination[j+i*n] = source[i+j*n];
but I want something that will take advantage of locality and cache blocking. I was looking it up and can't find code that would do this, but I'm told it should be a very simple modification to the original. Any ideas?
Edit: I have a 2000x2000 matrix, and I want to know how can I change the code using two for loops, basically splitting the matrix into blocks that I transpose individually, say 2x2 blocks, or 40x40 blocks, and see which block size is most efficient.
Edit2: The matrices are stored in column-major order; that is, the matrix
a1 a2
a3 a4
is stored as a1 a3 a2 a4.
You're probably going to want four loops - two to iterate over the blocks, and then another two to perform the transpose-copy of a single block. Assuming for simplicity a block size that divides the size of the matrix, something like this I think, although I'd want to draw some pictures on the backs of envelopes to be sure:
for (int i = 0; i < n; i += blocksize) {
for (int j = 0; j < n; j += blocksize) {
// transpose the block beginning at [i,j]
for (int k = i; k < i + blocksize; ++k) {
for (int l = j; l < j + blocksize; ++l) {
dst[k + l*n] = src[l + k*n];
}
}
}
}
An important further insight is that there's actually a cache-oblivious algorithm for this (see http://en.wikipedia.org/wiki/Cache-oblivious_algorithm, which uses this exact problem as an example). The informal definition of "cache-oblivious" is that you don't need to experiment tweaking any parameters (in this case the blocksize) in order to hit good/optimal cache performance. The solution in this case is to transpose by recursively dividing the matrix in half, and transposing the halves into their correct position in the destination.
Whatever the cache size actually is, this recursion takes advantage of it. I expect there's a bit of extra management overhead compared with your strategy, which is to use performance experiments to, in effect, jump straight to the point in the recursion at which the cache really kicks in, and go no further. On the other hand, your performance experiments might give you an answer that works on your machine but not on your customers' machines.
I had the exact same problem yesterday.
I ended up with this solution:
void transpose(double *dst, const double *src, size_t n, size_t p) noexcept {
size_t block = 32;
for (size_t i = 0; i < n; i += block) {
for(size_t j = 0; j < p; ++j) {
for(size_t b = 0; b < block && i + b < n; ++b) {
dst[j*n + i + b] = src[(i + b)*p + j];
}
}
}
}
This is 4 times faster than the obvious solution on my machine.
This solution handles a rectangular matrix whose dimensions are not a multiple of the block size.
If dst and src are the same square matrix, an in-place function should really be used instead:
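#include <cstddef>
#include <utility>  // std::swap
void transpose(double *m, size_t n) noexcept {
    size_t block = 0, size = 8;
    for (block = 0; block + size - 1 < n; block += size) {
        // transpose the diagonal block in place
        for (size_t i = block; i < block + size; ++i)
            for (size_t j = i + 1; j < block + size; ++j)
                std::swap(m[i*n + j], m[j*n + i]);
        // swap the strip below the block with the strip to its right
        for (size_t i = block + size; i < n; ++i)
            for (size_t j = block; j < block + size; ++j)
                std::swap(m[i*n + j], m[j*n + i]);
    }
    // leftover rows and columns when n is not a multiple of size
    for (size_t i = block; i < n; ++i)
        for (size_t j = i + 1; j < n; ++j)
            std::swap(m[i*n + j], m[j*n + i]);
}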
I used C++11, but this could easily be translated into other languages.
Instead of transposing the matrix in memory, why not collapse the transposition operation into the next operation you're going to do on the matrix?
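To illustrate what folding the transpose into the next operation could look like, here is a minimal Python sketch that computes the product of a transposed matrix with a vector without ever building the transpose. It is an assumption-laden example, not code from this thread: the function name `matvec_transposed` is hypothetical, and the matrix is assumed to be a flat list in row-major order (the question itself uses column-major storage, for which the index arithmetic would be mirrored).

```python
def matvec_transposed(a, rows, cols, x):
    """Compute y = A^T x for a rows-by-cols matrix A stored row-major in the flat list a."""
    y = [0.0] * cols
    for i in range(rows):
        base = i * cols  # start of row i in the flat storage
        xi = x[i]
        # Row i of A is read contiguously and contributes xi * A[i][j] to y[j],
        # which is exactly column i of A^T scaled by x[i].
        for j in range(cols):
            y[j] += a[base + j] * xi
    return y


# A = [[1, 2, 3],
#      [4, 5, 6]]
print(matvec_transposed([1, 2, 3, 4, 5, 6], 2, 3, [1, 2]))  # prints [9.0, 12.0, 15.0]
```

The source matrix is read in a single linear pass and the output vector is small enough to stay in cache, so the strided access pattern of an explicit transpose never occurs.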
Steve Jessop mentioned a cache-oblivious matrix transpose algorithm.
For the record, I want to share a possible implementation of a cache-oblivious matrix transpose.
public class Matrix {
protected double data[];
protected int rows, columns;
public Matrix(int rows, int columns) {
this.rows = rows;
this.columns = columns;
this.data = new double[rows * columns];
}
public Matrix transpose() {
Matrix C = new Matrix(columns, rows);
cachetranspose(0, rows, 0, columns, C);
return C;
}
public void cachetranspose(int rb, int re, int cb, int ce, Matrix T) {
int r = re - rb, c = ce - cb;
if (r <= 16 && c <= 16) {
for (int i = rb; i < re; i++) {
for (int j = cb; j < ce; j++) {
T.data[j * rows + i] = data[i * columns + j];
}
}
} else if (r >= c) {
cachetranspose(rb, rb + (r / 2), cb, ce, T);
cachetranspose(rb + (r / 2), re, cb, ce, T);
} else {
cachetranspose(rb, re, cb, cb + (c / 2), T);
cachetranspose(rb, re, cb + (c / 2), ce, T);
}
}
}
More details on cache oblivious algorithms can be found here.
Matrix multiplication comes to mind, but the cache issue there is much more pronounced, because each element is read N times.
With matrix transpose, you are reading in a single linear pass and there's no way to optimize that. But you can simultaneously process several rows so that you write several columns and so fill complete cache lines. You will only need three loops.
Or do it the other way around and read in columns while writing linearly.
With a large matrix, possibly a large sparse matrix, it might be an idea to decompose it into smaller cache friendly chunks (Say, 4x4 sub matrices). You can also flag sub matrices as identity which will help you in creating optimized code paths.

Resources