Simple algorithm for a simple balancing problem - algorithm

I have three buckets. They do not contain the same amount of water and they can contain up to 50 liters each.
Now I want to add more water to the buckets. This amount may vary from time to time, and might also be more than 50 x 3 liters. My goal is to fill the buckets with the new water so they have about an equal amount in each of the buckets - as close to equal as possible, but it's not a criterion. And also without exceeding the upper limit of 50.
Is there a simple and easy-to-read algorithm that would balance (as much as possible) the amount of water in the buckets?
I always know how much water there already is in each bucket.
I always know how much new water I get.
Water already in buckets cannot be touched
Equal water level is not a criterion, but further from the limits is desirable

Yes, there is a simple algorithm as follows:
Sort the buckets by the amount of water. Let's call them a, b, c, sorted in non-decreasing order.
The total amount of water that you need to balance them is (c - b) + (c - a) = 2*c - b - a. Let's call the needed amount t.
If the available water is less than t, it is not possible to balance the buckets.
Otherwise, add c - b to b and c - a to a.
Update based on the new constraints in the edit:
If you have enough water to bring the amount of water in the lesser filled buckets to the level of the more full bucket, the previous algorithm works just fine.
But in case there isn't enough water available to make all three equal (note that this can be calculated up front as described above), first fill the bucket with the smallest amount of water up until it is equal to the middle one. Then divide the remaining amount of available water and distribute it equally between the two buckets that are equal but have less water than the other.
The intuition is this: When you add to the smallest bucket up until you reach the middle one, you are decreasing the absolute difference between the three by 2 for each added liter. That's because the smallest is approaching the middle and the largest one.
Example:
a, b, c = 5, 3, 1
available_water = 4
difference = (5 - 3) + (5 - 1) + (3 - 1) = 8
add 2 to the smallest:
a, b, c = 5, 3, 3
available_water = 2
difference = (5 - 3) + (5 - 3) + (3 - 3) = 4
Note that we reduced the difference by 2 times the amount of used water
add 1 to each of the smaller buckets:
a, b, c = 5, 4, 4
available_water = 0
difference = (5 - 4) + (5 - 4) = 2
Now if we didn't follow this algorithm and just used the water arbitrarily:
add 2 to the middle bucket:
a, b, c = 5, 5, 1
available_water = 2
difference = (5 - 5) + (5 - 1) + (5 - 1) = 8
add 2 to the smallest one:
a, b, c = 5, 5, 3
available_water = 0
difference = (5 - 5) + (5 - 3) + (5 - 3) = 4
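The fill-the-lowest-first idea above can be sketched in Python as follows. This is a minimal sketch, not the answer's own code: the function name `distribute`, the `capacity` parameter (taken from the 50-liter limit in the question) and the returned "leftover" value are assumptions made for illustration. It assumes no bucket starts above capacity.

```python
def distribute(levels, new_water, capacity=50.0):
    """Pour new_water into the buckets, always raising the lowest ones first.

    Returns (new_levels, leftover); leftover is water that did not fit.
    Existing water is never removed from a bucket.
    """
    # Bucket indices ordered from emptiest to fullest.
    order = sorted(range(len(levels)), key=lambda i: levels[i])
    result = [float(x) for x in levels]
    remaining = float(new_water)

    for k in range(1, len(order) + 1):
        # The k lowest buckets are all at the same level at this point.
        current = result[order[0]]
        # Raise them to the next bucket's level, or to the rim for the last step.
        target = result[order[k]] if k < len(order) else capacity
        needed = k * (target - current)
        pour = min(needed, remaining)
        for i in order[:k]:
            result[i] += pour / k
        remaining -= pour
        if remaining <= 0:
            break
    return result, remaining


if __name__ == "__main__":
    print(distribute([5, 3, 1], 4))
```

With the example above, `distribute([5, 3, 1], 4)` ends at levels 5, 4, 4 with nothing left over. Results are floats because the remaining water is split evenly.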

Related

Advanced Algorithms Problems ("Nice Triangle"): Prime number Pyramid where every number depends on numbers above it

I'm currently studying for an advanced algorithms and datastructures exam, and I simply can't seem to solve one of the practice-problems which is the following:
1.14) "Nice Triangle"
A "nice" triangle is defined in the following way:
There are three different numbers which the triangle consists of, namely the first three prime numbers (2, 3 and 5).
Every number depends on the two numbers below it in the following way.
Numbers are the same, resulting number is also the same. (2, 2 => 2)
Numbers are different, resulting number is the remaining number. (2, 3 => 5)
Given an integer N with length L, corresponding to the base of the triangle, determine the last element at the top
For example:
Given N = 25555 (and thus L = 5), the triangle looks like this:
2
3 5
2 5 5
3 5 5 5
2 5 5 5 5
=> 2 is the result of this example
What does the fact that every number is prime have to do with the problem?
By using a naive approach (simply calculating every single row), one obtains a time-complexity of O(L^2).
However, the professor said, it's possible with O(L), but I simply can't find any pattern!!!
I'm not sure why this problem would be used in an advanced algorithms course, but yes, you can do this in O(l) = O(log n) time.
There are a couple ways you can do it, but they both rely on recognizing that:
For the problem statement, it doesn't matter what digits you use. Let's use 0, 1, and 2 instead of 2, 3, and 5. Then
If a and b are the input numbers and c is the output, then c = -(a+b) mod 3
You can build the whole triangle using c = a+b mod 3 instead, and then just negate every second row.
Now the two ways you can do this in O(log n) time are:
For each digit d in the input, calculate the number of times (call it k) that it gets added into the final sum, add up all the kd mod 3, and then negate the result if you started with an even number of digits. That takes constant time per digit. Alternatively:
Recognize that you can do arithmetic on n-sized values in constant time. Make a value that is a bit vector of all the digits in n. Each digit needs 2 bits for its value; the implementation below gives each digit 3 bits so that the sum of two neighbours cannot overflow into the next slot. Then by using bitwise operations you can calculate each row from the previous one in constant time, for O(log n) time altogether.
Here's an implementation of the 2nd way in python:
def niceTriangle(n):
    # a vector of 3-bit integers mod 3
    rowvec = 0
    # a vector of 1 for each number in the row
    onevec = 0
    # number of rows remaining
    rows = 0
    # mapping for digits 0-9
    digitmap = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
    # first convert n into the first row
    while n > 0:
        digit = digitmap[n % 10]
        n = n//10
        rows += 1
        onevec = (onevec << 3) + 1
        rowvec = (rowvec << 3) + digit
    if rows%2 == 0:
        # we have an even number of rows -- negate everything
        rowvec = ((rowvec&onevec)<<1) | ((rowvec>>1)&onevec)
    while rows > 1:
        # add each number to its neighbor
        rowvec += (rowvec >> 3)
        # isolate the entries >= 3, by adding 1 to each number and
        # getting the 2^2 bit
        gt3 = ((rowvec + onevec) >> 2) & onevec
        # subtract 3 from all the greater entries
        rowvec -= gt3*3
        rows -= 1
    return [2,3,5][rowvec%4]
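The first of the two approaches (weighting each digit by how often it is added into the top) can be sketched like this. It is only an illustration under stated assumptions: the weight of the digit at position i is taken to be the binomial coefficient C(L-1, i), reduced mod 3 with Lucas' theorem, which costs O(log L) per digit rather than strictly constant time. The names `comb_mod3`, `nice_triangle_top` and `naive_top` are made up here, and the input is assumed to contain only the digits 2, 3 and 5.

```python
PRIMES = [2, 3, 5]

# C(a, b) mod 3 for 0 <= a, b <= 2, used digit by digit in Lucas' theorem.
SMALL_BINOMIALS = [[1, 0, 0], [1, 1, 0], [1, 2, 1]]


def comb_mod3(n, k):
    """C(n, k) mod 3 as a product over the base-3 digits of n and k."""
    result = 1
    while n > 0 or k > 0:
        result = (result * SMALL_BINOMIALS[n % 3][k % 3]) % 3
        n //= 3
        k //= 3
    return result


def nice_triangle_top(n):
    """Top of the triangle whose base is the decimal digits of n."""
    digits = [PRIMES.index(int(ch)) for ch in str(n)]  # map 2, 3, 5 -> 0, 1, 2
    length = len(digits)
    total = sum(comb_mod3(length - 1, i) * d for i, d in enumerate(digits)) % 3
    if length % 2 == 0:
        # An even number of digits means an odd number of negations.
        total = (-total) % 3
    return PRIMES[total]


def naive_top(n):
    """O(L^2) reference: build every row with c = -(a + b) mod 3."""
    row = [PRIMES.index(int(ch)) for ch in str(n)]
    while len(row) > 1:
        row = [(-(a + b)) % 3 for a, b in zip(row, row[1:])]
    return PRIMES[row[0]]


if __name__ == "__main__":
    print(nice_triangle_top(25555))
```

For the example base 25555 this returns 2, matching the triangle drawn in the question.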

Modified Tower of Hanoi

We all know that the minimum number of moves required to solve the classical Towers of Hanoi problem is 2^n - 1. Now, let us assume that some of the discs have the same size. What would be the minimum number of moves to solve the problem in that case?
Example, let us assume that there are three discs. In the classical problem, the minimum number of moves required would be 7. Now, let us assume that the size of disc 2 and disc 3 is same. In that case, the minimum number of moves required would be:
Move disc 1 from a to b.
Move disc 2 from a to c.
Move disc 3 from a to c.
Move disc 1 from b to c.
which is 4 moves. Now, given the total number of discs n and the sets of discs which have same size, find the minimum number of moves to solve the problem. This is a challenge by a friend, so pointers towards solution are welcome. Thanks.
Let's consider a tower of size n. The top disk has to be moved 2^(n-1) times, the second disk 2^(n-2) times, and so on, until the bottom disk has to be moved just once, for a total of 2^n - 1 moves. Moving each disk takes exactly one turn.
   1     moved 8 times
  111    moved 4 times
 11111   moved 2 times
1111111  moved 1 time => 8 + 4 + 2 + 1 == 15
Now if x disks have the same size, those have to be in consecutive layers, and you would always move them towards the same target stack, so you could just as well collapse those to just one disk, requiring x turns to be moved. You could consider those multi-disks to be x times as 'heavy', or 'thick', if you like.
   1
  111                      1     moved 8 times
  111      collapse       222    moved 4 times, taking 2 turns each
 11111   ----------->    11111   moved 2 times
1111111                 3333333  moved 1 time, taking 3 turns
1111111                          => 8 + 4*2 + 2 + 1*3 == 21
1111111
Now just sum those up and you have your answer.
Here's some Python code, using the above example: Assuming you already have a list of the 'collapsed' disks, with disks[i] being the weight of the collapsed disk in the ith layer, you can just do this:
disks = [1, 2, 1, 3] # weight of collapsed disks, top to bottom
print(sum(d * 2**i for i, d in enumerate(reversed(disks))))
If instead you have a list of the sizes of the disks, like on the left side, you could use this algorithm:
disks = [1, 3, 3, 5, 7, 7, 7] # size of disks, top to bottom
last, t, s = disks[-1], 1, 0
for d in reversed(disks):
    # a smaller disk starts a new layer, which moves twice as often
    if d < last: t, last = t*2, d
    s = s + t
print(s)
Output, in both cases, is 21, the required number of turns.
It completely depends on the distribution of the discs that are the same size. If you have n=7 discs and they are all the same size then the answer is 7 (or n). And, of course, the standard problem is answered by 2^n - 1.
As tobias_k suggested, you can group same size discs. So now look at the problem as moving groups of discs. To move a certain number of groups, you have to know the size of each group
examples
1
n=7 //disc sizes (1,2,3,3,4,5,5)
g=5 //group sizes (1,1,2,1,2)
//group index (1,2,3,4,5)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 1*2^4 + 1*2^3 + 2*2^2 + 1*2^1 + 2*2^0
= 16 + 8 + 8 + 2 + 2
= 36
2
n=7 //disc sizes (1,1,1,1,1,1,1)
g=1 //group sizes (7)
//group index (1)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 7*2^0
= 7
3
n=7 //disc sizes (1,2,3,4,5,6,7)
g=7 //group sizes (1,1,1,1,1,1,1)
//group index (1,2,3,4,5,6,7)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 1*2^0
= 64 + 32 + 16 + 8 + 4 + 2 + 1
= 127
Interesting note about the last example, and the standard Hanoi problem: the sum of 2^(i-1) for i = 1..n equals 2^n - 1.
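The group formula used in the three examples can be written as a short Python function. This is a sketch; the name `min_moves` is made up here, and the disc sizes are assumed to be listed top to bottom in non-decreasing order so that equal sizes are adjacent.

```python
from itertools import groupby


def min_moves(sizes):
    """Minimum moves for discs given top to bottom; equal sizes form one group."""
    # Size of each run of equal discs, e.g. (1,2,3,3,4,5,5) -> [1, 1, 2, 1, 2].
    groups = [len(list(run)) for _, run in groupby(sizes)]
    count = len(groups)
    # Group number `index` (1 = top) is moved 2^(count - index) times,
    # and each of those moves costs one turn per disc in the group.
    return sum(size * 2 ** (count - index)
               for index, size in enumerate(groups, start=1))


if __name__ == "__main__":
    print(min_moves([1, 2, 3, 3, 4, 5, 5]))
    print(min_moves([1, 1, 1, 1, 1, 1, 1]))
    print(min_moves([1, 2, 3, 4, 5, 6, 7]))
```

These three calls reproduce the worked examples: 36, 7 and 127.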
I wrote a Github gist in C for this problem. I am attaching a link to it, may be useful to somebody, I hope.
Modified tower of Hanoi problem with one or more disks of the same size
There are n types of disks. For each type, all disks are identical. In array arr, I am taking the number of disks of each type. A, B and C are pegs or towers.
Method swap(int, int), partition(int, int) and qSort(int, int) are part of my implementation of the quicksort algorithm.
Method toh(char, char, char, int, int) is the Tower of Hanoi solution.
How it is working: Imagine we compress all the disks of the same size into one disk. Now we have a problem which has a general solution to the Tower of Hanoi. Now each time a disk moves, we add the total movement which is equal to the total number of that type of disk.
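The recursion described here can be illustrated in Python. This is not the code from the gist, only a sketch of the same idea: `count_moves` and its arguments are hypothetical names, and `counts[i]` is assumed to hold the number of identical disks of the i-th type, smallest type first.

```python
def count_moves(counts, src="A", dst="C", aux="B"):
    """Count single-disk moves when each type of disk is treated as one block."""

    def solve(k, src, dst, aux):
        # Move the k smallest types from src to dst using aux.
        if k == 0:
            return 0
        moves = solve(k - 1, src, aux, dst)   # clear the smaller types away
        moves += counts[k - 1]                # move every disk of type k
        moves += solve(k - 1, aux, dst, src)  # put the smaller types back on top
        return moves

    return solve(len(counts), src, dst, aux)


if __name__ == "__main__":
    print(count_moves([1, 2, 1, 3]))
```

The recursion visits every block move, so it takes exponential time in the number of types; it is meant to mirror the explanation rather than to be fast.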

How many permutations of a given array result in BST's of height 2?

A BST is generated (by successive insertion of nodes) from each permutation of keys from the set {1,2,3,4,5,6,7}. How many permutations determine trees of height two?
I have been stuck on this simple question for quite some time. Any hints, anyone?
By the way the answer is 80.
Consider how the tree would be height 2?
-It needs to have 4 as root, 2 as the left child, 6 right child, etc.
How come 4 is the root?
-It needs to be the first inserted. So one number is fixed now; the other 6 can still move around in the permutation.
And?
-After the first insert there are still 6 places left, 3 for the left and 3 for the right subtrees. That's 6 choose 3 = 20 choices.
Now what?
-For the left and right subtrees, their roots need to be inserted first, then the children's order does not affect the tree - 2, 1, 3 and 2, 3, 1 gives the same tree. That's 2 for each subtree, and 2 * 2 = 4 for the left and right subtrees.
So?
In conclusion: C(6, 3) * 2 * 2 = 20 * 2 * 2 = 80.
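The count can be checked by brute force, since there are only 7! = 5040 permutations. The following is a small sketch; `bst_height` and `count_height` are names chosen for this illustration, and height is measured in edges, so the perfectly balanced 7-node tree has height 2.

```python
from itertools import permutations


def bst_height(keys):
    """Height (in edges) of the BST built by inserting keys in the given order."""
    left, right = {}, {}  # child links, keyed by parent value
    root = keys[0]
    height = 0
    for key in keys[1:]:
        node, depth = root, 0
        while True:
            children = left if key < node else right
            depth += 1
            if node not in children:
                children[node] = key  # free slot found: attach the new key here
                break
            node = children[node]
        height = max(height, depth)
    return height


def count_height(n, h):
    """Number of permutations of 1..n whose BST has height exactly h."""
    return sum(1 for p in permutations(range(1, n + 1)) if bst_height(p) == h)


if __name__ == "__main__":
    print(count_height(7, 2))
```

Running it prints 80, in agreement with C(6, 3) * 2 * 2.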
Note that there is only one possible shape for this tree - it has to be perfectly balanced. It therefore has to be this tree:
      4
    /   \
   2     6
  / \   / \
 1   3 5   7
This requires 4 to be inserted first. After that, the insertions need to build up the subtrees holding 1, 2, 3 and 5, 6, 7 in the proper order. This means that we will need to insert 2 before 1 and 3 and need to insert 6 before 5 and 7. It doesn't matter what relative order we insert 1 and 3 in, as long as they're after the 2, and similarly it doesn't matter what relative order we put 5 and 7 in as long as they're after 6. You can therefore think of what we need to insert as 2 X X and 6 Y Y, where the X's are the children of 2 and the Y's are the children of 6. We can then find all possible ways to get back the above tree by finding all interleaves of the sequences 2 X X and 6 Y Y, then multiplying by four (the number of ways of assigning X and Y the values 1, 3, 5, and 7).
So how many ways are there to interleave? Well, you can think of this as the number of ways to permute the sequence L L L R R R, since each permutation of L L L R R R tells us how to choose from either the Left sequence or the Right sequence. There are 6! / (3! 3!) = 20 ways to do this. Since each of those twenty interleaves gives four possible insertion sequences, there end up being a total of 20 × 4 = 80 possible ways to do this.
Hope this helps!
I've created a table for the number of permutations possible with 1 - 12 elements, with heights up to 12, and included the per-root break down for anybody trying to check that their manual process (described in other answers) is matching with the actual values.
http://www.asmatteringofit.com/blog/2014/6/14/permutations-of-a-binary-search-tree-of-height-x
Here is a C++ code aiding the accepted answer, here I haven't shown the obvious ncr(i,j) function, hope someone will find it useful.
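int solve(int n, int h) {
    // Number of insertion orders of n keys that give a BST of height exactly h
    // (an empty or single-node tree counts as height 0).
    if (n <= 1)
        return (h == 0);
    int ans = 0;
    for (int i = 0; i < n; i++) {  // i keys end up in the left subtree
        int res = 0;
        // one subtree has height h - 1, the other is strictly shorter
        for (int j = 0; j < h - 1; j++) {
            res = res + solve(i, j) * solve(n - i - 1, h - 1);
            res = res + solve(n - i - 1, j) * solve(i, h - 1);
        }
        // both subtrees have height h - 1
        res = res + solve(i, h - 1) * solve(n - i - 1, h - 1);
        // ncr(n - 1, i) counts the ways to interleave the two subtrees' insertions
        ans = ans + ncr(n - 1, i) * res;
    }
    return ans;
}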
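For readers who want to run the recurrence without writing `ncr`, here is a Python rendering of the same function. It is a sketch, not the poster's code: it uses `math.comb` (Python 3.8+) in place of the omitted `ncr`, adds memoisation with `functools.lru_cache`, and keeps the convention that an empty or single-node tree has height 0. The name `count` is chosen for this illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count(n, h):
    """Insertion orders of n distinct keys that produce a BST of height exactly h."""
    if n <= 1:
        return 1 if h == 0 else 0
    total = 0
    for i in range(n):  # i keys are smaller than the root
        left_n, right_n = i, n - 1 - i
        ways = 0
        # One subtree reaches height h - 1 while the other is strictly shorter.
        for j in range(h - 1):
            ways += count(left_n, j) * count(right_n, h - 1)
            ways += count(right_n, j) * count(left_n, h - 1)
        # Both subtrees reach height h - 1.
        ways += count(left_n, h - 1) * count(right_n, h - 1)
        # Choose which of the remaining n - 1 insertions go to the left subtree.
        total += comb(n - 1, i) * ways
    return total


if __name__ == "__main__":
    print(count(7, 2))
```

`count(7, 2)` evaluates to 80, because only the split with three keys on each side contributes: C(6, 3) * 2 * 2.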
The tree must have 4 as the root, with 2 and 6 as its left and right children. The insertion must start with 4, but once the root is inserted there are many possible insertion orders. There are 2 choices for the second insertion: 2 or 6. If we choose 2 for the second insertion, there are three cases for where 6 goes. If 6 is the third insertion (4, 2, 6, -, -, -, -), there are 4! = 24 choices for the rest of the insertions. If 6 is the fourth insertion (4, 2, -, 6, -, -, -), there are 2 choices for the third insertion (1 or 3) and 3! choices for the rest, so 2 * 3! = 12. If 6 is the fifth insertion (4, 2, -, -, 6, -, -), there are 2 orders for the third and fourth insertions ((1, 3) or (3, 1)) and 2 orders for the last two insertions ((5, 7) or (7, 5)), so there are 4 choices. In total, if 2 is the second insertion we have 24 + 12 + 4 = 40 choices for the rest of the insertions. Similarly, there are 40 choices if the second insertion is 6, so the total number of different insertion orders is 80.

Project Euler - 68

I have already read What is an "external node" of a "magic" 3-gon ring? and I have solved problems up until 90 but this n-gon thing totally baffles me as I don't understand the question at all.
So I take this ring and I understand that the external circles are 4, 5, 6 as they are outside the inner circle. Now he says there are eight solutions. And the eight solutions are without much explanation listed below. Let me take
9 4,2,3; 5,3,1; 6,1,2
9 4,3,2; 6,2,1; 5,1,3
So how do we arrive at the 2 solutions? I understand 4, 3, 2, is in straight line and 6,2,1 is in straight line and 5, 1, 3 are in a straight line and they are in clockwise so the second solution makes sense.
Questions
Why does the first solution 4,2,3; 5,3,1; 6,1,2 go anticlockwise? Should it not be 423 612 and then 531?
How do we arrive at 8 solutions. Is it just randomly picking three numbers? What exactly does it mean to solve a "N-gon"?
The first doesn't go anti-clockwise. It's what you get from the configuration
      4
       \
        2
       / \
      1---3---5
     /
    6
when you go clockwise, starting with the smallest number in the outer ring.
How do we arrive at 8 solutions. Is it just randomly picking three numbers? What exactly does it mean to solve a "N-gon"?
For an N-gon, you have an inner N-gon, and for each side of the N-gon one spike, like
        X
        |
X---X---X
    |   |
    X---X---X
    |
    X
so that the spike together with the side of the inner N-gon connects a group of three places. A "solution" of the N-gon is a configuration where you placed the numbers from 1 to 2*N so that each of the N groups sums to the same value.
The places at the end of the spikes appear in only one group each, the places on the vertices of the inner N-gon in two. So the sum of the sums of all groups is
    2*N
     ∑  k  +  ∑{ numbers on vertices }
    k=1
The sum of the numbers on the vertices of the inner N-gon is at least 1 + 2 + ... + N = N*(N+1)/2 and at most (N+1) + (N+2) + ... + 2*N = N² + N*(N+1)/2 = N*(3*N+1)/2.
Hence the sum of the sums of all groups is between
N*(2*N+1) + N*(N+1)/2 = N*(5*N+3)/2
and
N*(2*N+1) + N*(3*N+1)/2 = N*(7*N+3)/2
inclusive, and the sum per group must be between
(5*N+3)/2
and
(7*N+3)/2
again inclusive.
For the triangle - N = 3 - the bounds are (5*3+3)/2 = 9 and (7*3+3)/2 = 12. For a square - N = 4 - the bounds are (5*4+3)/2 = 11.5 and (7*4+3)/2 = 15.5 - since the sum must be an integer, the possible sums are 12, 13, 14, 15.
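The bounds can be turned into a one-line helper that lists the candidate group sums for any N. This is only a sketch of the arithmetic above; the function name `group_sum_range` is invented for the example.

```python
def group_sum_range(n):
    """Candidate per-group sums for a magic n-gon ring filled with 1..2n."""
    low = (5 * n + 3 + 1) // 2   # ceiling of (5n + 3) / 2
    high = (7 * n + 3) // 2      # floor of (7n + 3) / 2
    return range(low, high + 1)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, list(group_sum_range(n)))
```

For N = 3 this lists 9 to 12 and for N = 4 it lists 12 to 15, as computed above. The range only says which sums are not ruled out by the bounds, not that each one has a solution.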
Going back to the triangle, if the sum of each group is 9, the sum of the sums is 27, and the sum of the numbers on the vertices must be 27 - (1+2+3+4+5+6) = 27 - 21 = 6 = 1+2+3, so the numbers on the vertices are 1, 2 and 3.
For the sum to be 9, the value at the end of the spike for the side connecting 1 and 2 must be 6, for the side connecting 1 and 3, the spike value must be 5, and 4 for the side connecting 2 and 3.
If you start with the smallest value on the spikes - 4 - you know you have to place 2 and 3 on the vertices of the side that spike protrudes from. There are two ways to arrange the two numbers there, leading to the two solutions for sum 9.
If the sum of each group is 10, the sum of the sums is 30, and the sum of the numbers on the vertices must be 9. To represent 9 as the sum of three distinct numbers from 1 to 6, you have the possibilities
1 + 2 + 6
1 + 3 + 5
2 + 3 + 4
For the first triple (1, 2, 6), you have one side connecting 1 and 2, so you'd need a 7 on the end of the spike to make 10 - no solution.
For the third triple (2, 3, 4), the minimal sum of two of the numbers is 5, but 5+6 = 11 > 10, so there's no place for the 6 - no solution.
For the second triple (1, 3, 5), the sums of the sides are
1 + 3 = 4 -- 6 on the spike
1 + 5 = 6 -- 4 on the spike
3 + 5 = 8 -- 2 on the spike
and you have two ways to arrange 3 and 5, so that the group is either 2-3-5 or 2-5-3, the rest follows again.
The solutions for the sums 11 and 12 can be obtained similarly, or by replacing k with 7-k in the solutions for the sums 9 resp. 10.
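The eight solutions of the 3-gon can also be listed by brute force. The sketch below is an illustration, not part of the original answer: `magic_ring_solutions` is a made-up name, each solution is read clockwise starting from the smallest spike value (as in the problem statement), and every group is written as (spike, vertex, next vertex).

```python
from itertools import permutations


def magic_ring_solutions(n=3):
    """Yield (group_sum, groups) for every magic n-gon ring using 1..2n once each."""
    for perm in permutations(range(1, 2 * n + 1)):
        outer, inner = perm[:n], perm[n:]
        # Keep one rotation per ring: the one starting at the smallest spike.
        if outer[0] != min(outer):
            continue
        groups = [(outer[i], inner[i], inner[(i + 1) % n]) for i in range(n)]
        totals = {sum(group) for group in groups}
        if len(totals) == 1:
            yield totals.pop(), groups


if __name__ == "__main__":
    for total, groups in sorted(magic_ring_solutions(3)):
        print(total, "; ".join(",".join(map(str, g)) for g in groups))
```

For n = 3 this yields two solutions for each of the sums 9, 10, 11 and 12, including 4,2,3; 5,3,1; 6,1,2 and 4,3,2; 6,2,1; 5,1,3. Plain enumeration is fine for n = 3 but grows as (2n)!, so it is not a substitute for the reasoning above on larger rings.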
To solve the problem, you must now find out
what it means to obtain a 16-digit string or a 17-digit string
which sum for the groups gives rise to the largest value when the numbers are concatenated in the prescribed way.
(And use pencil and paper for the fastest solution.)

minimum steps required to make array of integers contiguous

Given a sorted array of distinct integers, what is the minimum number of steps required to make the integers contiguous? The condition is that in one step only one element can be changed, and it can be either increased or decreased by 1. For example, if we have 2,4,5,6 then '2' can be made '3', making the elements contiguous (3,4,5,6). Hence the minimum number of steps here is 1. Similarly for the array 2,4,5,8:
Step 1: '2' can be made '3'
Step 2: '8' can be made '7'
Step 3: '7' can be made '6'
Thus the sequence now is 3,4,5,6 and the number of steps is 3.
I tried the following, but I am not sure if it's correct:
//n is the number of elements in array a
int count=a[n-1]-a[0]-1;
for(i=1;i<=n-2;i++)
{
    count--;
}
printf("%d\n",count);
Thanks.
The intuitive guess is that the "center" of the optimal sequence will be the arithmetic average, but this is not the case. Let's find the correct solution with some vector math:
Part 1: Assuming the first number is to be left alone (we'll deal with this assumption later), calculate the differences between the target and the input, so 1 2 3 4 5 6 - 1 12 3 14 5 16 would yield 0 -10 0 -10 0 -10.
sidenote: Notice that a "contiguous" array by your implied definition would be an increasing arithmetic sequence with difference 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 to be contiguous, or 5 3 1 to be contiguous, or 1 2 3 2 3 to be contiguous. You also did not specify if negative numbers should be treated any differently.)
theorem: The contiguous numbers must lie between the minimum and maximum number. [proof left to reader]
Part 2: Now returning to our example, assume we took the 30 steps (sum(abs(0 -10 0 -10 0 -10))=30) required to turn 1 12 3 14 5 16 into 1 2 3 4 5 6. This is one correct answer. But (0 -10 0 -10 0 -10) + c is also an answer which yields an arithmetic sequence of difference 1, for any constant c. In order to minimize the number of "steps", we must pick an appropriate c. Each time we increase or decrease c by 1, each of the N=6 entries of the difference vector (N being the length of the vector) changes by 1. So for example if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c=2), then the differences would have been 2 -8 2 -8 2 -8, and sum(abs(2 -8 2 -8 2 -8))=30.
Now this is very clear if you could picture it visually, but it's sort of hard to type out in text. First we took our difference vector. Imagine you drew it like so:
 4|
 3|     *
 2|  *  |
 1|  |  |  *
 0+--+--+--+--+--*
-1|           |
-2|           *
We are free to "shift" this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We wish to find the shift which minimizes the number of | you see (the area between the curve and the x-axis). This is NOT the average (that would be minimizing the standard deviation or RMS error, not the absolute error). To find the minimizing c, let's think of this as a function and consider its derivative. If the differences are all far away from the x-axis (we're trying to make 101 112 103 114 105 116), it makes sense to just not add this extra stuff, so we shift the function down towards the x-axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of the *s passes the x axis. Each time we decrease c, we improve the solution by 5-1=4 (we save 5 steps of work, but have to do 1 extra step of work for the * below the x-axis). Eventually when HALF the *s are past the x-axis, we can NO LONGER IMPROVE THE SOLUTION (derivative: 3-3=0). (In fact soon we begin to make the solution worse, and can never make it better again. Not only have we found the minimum of this function, but we can see it is a global minimum.)
Thus the solution is as follows: Pretend the first number is in place. Calculate the vector of differences. Minimize the sum of the absolute values of this vector; do this by finding the median OF THE DIFFERENCES and subtracting that off from the differences to obtain an improved differences-vector. The sum of the absolute values of the "improved" vector is your answer. This is O(N). The solutions of equal optimality will (as per the above) always be "adjacent". A unique solution exists only if there are an odd number of numbers; otherwise if there are an even number of numbers, AND the median-of-differences is not an integer, the equally-optimal solutions will have difference-vectors with corrective factors of any number between the two medians.
So I guess this wouldn't be complete without a final example.
input: 2 3 4 10 14 14 15 100
difference vector: 2 3 4 5 6 7 8 9-2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
note that the medians of the difference-vector are not in the middle anymore, we need to perform an O(N) median-finding algorithm to extract them...
medians of difference-vector are -5 and -7
let us take -5 to be our correction factor (any number between the medians, such as -6 or -7, would also be a valid choice)
thus our new goal is 2 3 4 5 6 7 8 9+5=7 8 9 10 11 12 13 14, and the new differences are 5 5 5 0 -3 -2 -2 -86*
this means we will need to do 5+5+5+0+3+2+2+86=108 steps
*(we obtain this by repeating step 2 with our new target, or by adding 5 to each number of the previous difference vector and then summing the absolute values)
Alternatively, we could have also taken -6 or -7 to be our correction factor. Let's say we took -7...
then the new goal would have been 2 3 4 5 6 7 8 9+7=9 10 11 12 13 14 15 16, and the new differences would have been 7 7 7 2 -1 0 0 -84
this would have meant we'd need to do 7+7+7+2+1+0+0+84=108 steps, the same as above
If you simulate this yourself, you can see the number of steps becomes >108 as we take offsets further away from the range [-7,-5].
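That simulation is easy to run. The snippet below is a sketch for this example only; the helper name `cost` and the choice to describe each target by its first value (`start`) are assumptions made for the illustration. A start of 7 corresponds to the correction factor -5 above, and a start of 9 to the correction factor -7.

```python
def cost(array, start):
    """Steps needed to turn array into start, start+1, start+2, ..."""
    return sum(abs(start + i - x) for i, x in enumerate(array))


example = [2, 3, 4, 10, 14, 14, 15, 100]

if __name__ == "__main__":
    for start in range(5, 12):
        print(start, cost(example, start))
```

The cost is 108 for starts 7, 8 and 9 and grows on either side (110 at start 6, 112 at start 10).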
Pseudocode:
def minSteps(array A of size N):
    A' = [0,1,...,N-1]
    diffs = A'-A
    medianOfDiffs = leftMedian(diffs)
    return sum(abs(diffs-medianOfDiffs))
Python:
leftMedian = lambda x: sorted(x)[len(x)//2]
def minSteps(array):
    target = range(len(array))
    diffs = [t-a for t,a in zip(target,array)]
    medianOfDiffs = leftMedian(diffs)
    return sum(abs(d-medianOfDiffs) for d in diffs)
edit:
It turns out that for arrays of distinct integers, this is equivalent to a simpler solution: picking one of the (up to 2) medians, assuming it doesn't move, and moving other numbers accordingly. This simpler method often gives incorrect answers if you have any duplicates, but the OP didn't ask that, so that would be a simpler and more elegant solution. Additionally we can use the proof I've given in this solution to justify the "assume the median doesn't move" solution as follows: the corrective factor will always be in the center of the array (i.e. the median of the differences will be from the median of the numbers). Thus any restriction which also guarantees this can be used to create variations of this brainteaser.
Get one of the medians of all the numbers. As the numbers are already sorted, this shouldn't be a big deal. Assume that median does not move. Then compute the total cost of moving all the numbers accordingly. This should give the answer.
community edit:
def minSteps(a):
    """INPUT: list of sorted unique integers"""
    n = len(a)
    oneMedian = a[n // 2]
    aTarget = [oneMedian + (i - n // 2) for i in range(n)]
    # aTarget looks roughly like [m-n/2?, ..., m-1, m, m+1, ..., m+n/2]
    return sum(abs(aTarget[i] - a[i]) for i in range(n))
This is probably not an ideal solution, but a first idea.
Given a sorted sequence [x_1, x_2, ..., x_n]:
Write a function that returns the differences of an element to the previous and to the next element, i.e. (x_n - x_(n-1), x_(n+1) - x_n).
If the difference to the previous element is > 1, you would have to increase all previous elements by x_n - x_(n-1) - 1. That is, the number of necessary steps would increase by the number of previous elements × (x_n - x_(n-1) - 1). Let's call this number a.
If the difference to the next element is > 1, you would have to decrease all subsequent elements by x_(n+1) - x_n - 1. That is, the number of necessary steps would increase by the number of subsequent elements × (x_(n+1) - x_n - 1). Let's call this number b.
If a < b, then increase all previous elements until they are contiguous to the current element. If a > b, then decrease all subsequent elements until they are contiguous to the current element. If a = b, it doesn't matter which of these two actions is chosen.
Add up the number of steps taken in the previous step (by increasing the total number of necessary steps by either a or b), and repeat until all elements are contiguous.
First of all, imagine that we pick an arbitrary target of contiguous increasing values and then calculate the cost (number of steps required) for modifying the array to match.
Original:     3   5   7   8  10  16
Target:       4   5   6   7   8   9
Difference:  +1   0  -1  -1  -2  -7   -> Cost = 12
Sign:         +   0   -   -   -   -
Because the input array is already ordered and distinct, it is strictly increasing. Because of this, it can be shown that the differences will always be non-increasing.
If we change the target by increasing it by 1, the cost will change. Each position in which the difference is currently positive or zero will incur an increase in cost by 1. Each position in which the difference is currently negative will yield a decrease in cost by 1:
Original:         3   5   7   8  10  16
New target:       5   6   7   8   9  10
New Difference:  +2  +1   0   0  -1  -6   -> Cost = 10 (decrease by 2)
Conversely, if we decrease the target by 1, each position in which the difference is currently positive will yield a decrease in cost by 1, while each position in which the difference is zero or negative will incur an increase in cost by 1:
Original:         3   5   7   8  10  16
New target:       3   4   5   6   7   8
New Difference:   0  -1  -2  -2  -3  -8   -> Cost = 16 (increase by 4)
In order to find the optimal values for the target array, we must find a target such that any change (increment or decrement) will not decrease the cost. Note that an increment of the target can only decrease the cost when there are more positions with negative difference than there are with zero or positive difference. A decrement can only decrease the cost when there are more positions with a positive difference than with a zero or negative difference.
Here are some example distributions of difference signs. Remember that the differences array is non-increasing, so positives always have to be first and negatives last:
      C C
  + + + - - -   optimal
  + + 0 - - -   optimal
  0 0 0 - - -   optimal
  + 0 - - - -   can increment (negatives exceed positives & zeroes)
  + + + 0 0 0   optimal
  + + + + - -   can decrement (positives exceed negatives & zeroes)
  + + 0 0 - -   optimal
  + 0 0 0 0 0   optimal
      C C
Observe that if one of the central elements (marked C) is zero, the target must be optimal. In such a circumstance, at best any increment or decrement will not change the cost, but it may increase it. This result is important, because it gives us a trivial solution. We pick a target such that a[n/2] remains unchanged. There may be other possible targets that yield the same cost, but there are definitely none that are better. Here's the original code modified to calculate this cost:
//n is the number of elements in array a
int targetValue;
int cost = 0;
int middle = n / 2;
int startValue = a[middle] - middle;
for (i = 0; i < n; i++)
{
    targetValue = startValue + i;
    cost += abs(targetValue - a[i]);
}
printf("%d\n",cost);
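The claim that pinning a[n/2] is optimal can be spot-checked against an exhaustive search. The Python sketch below is an illustration only: the function names are invented, and the brute-force search range is an assumption that holds for sorted input, where the best first target value always lies between a[0] - n and a[-1].

```python
def cost_with_pinned_middle(a):
    """Cost when the target is chosen so that a[n // 2] does not move."""
    n = len(a)
    start = a[n // 2] - n // 2
    return sum(abs(start + i - x) for i, x in enumerate(a))


def brute_force_cost(a):
    """Try every plausible first target value and keep the cheapest."""
    starts = range(a[0] - len(a), a[-1] + 1)
    return min(sum(abs(s + i - x) for i, x in enumerate(a)) for s in starts)


if __name__ == "__main__":
    for sample in ([2, 4, 5, 6], [2, 4, 5, 8], [3, 5, 7, 8, 10, 16]):
        print(sample, cost_with_pinned_middle(sample), brute_force_cost(sample))
```

For the three samples both functions agree (1, 3 and 10 steps respectively).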
You can not do it by iterating once on the array, that's for sure.
You need first to check the difference between each two numbers, for example:
2,7,8,9 can be 2,3,4,5 with 12 steps or 6,7,8,9 with 4 steps.
Create a new array with the differences like so: for 2,7,8,9 it will be 5,1,1. Now you can decide whether to increase or decrease the first number.
Let's assume that the contiguous array looks something like this -
c c+1 c+2 c+3 .. and so on
Now let's take an example -
5 7 8 10
The contiguous array in this case will be -
c c+1 c+2 c+3
In order to get the minimum steps, the sum of the modulus of the difference of the integers(before and after) w.r.t the ith index should be the minimum. In which case,
(c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2 should be minimum
Let f(c) = (c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2
= 4c^2 - 48c + 146
Applying differential calculus to get the minima,
f'(c) = 8c - 48 = 0
=> c = 6
So our contiguous array is 6 7 8 9 and the minimum cost here is 2.
To sum it up, just generate f(c), get the first differential and find out c.
This should take O(n).
Brute force approach O(N*M)
If one draws a line through each point in the array a then y0 is a value where each line starts at index 0. Then the answer is the minimum among the numbers of steps required to get from a to every line that starts at y0, in Python:
y0s = set((y - i) for i, y in enumerate(a))
nsteps = min(sum(abs(y-(y0+i)) for i, y in enumerate(a))
             for y0 in range(min(y0s), max(y0s)+1))
Input
2,4,5,6
2,4,5,8
Output
1
3
