Distance algorithm - minimum coins required to clear all the level - algorithm

Thor is playing a game where there are N levels and M types of available weapons. The levels are numbered from 0 to N-1 and the weapons are numbered from 0 to M-1. He can clear these levels in any order. In each level, some subset of these M weapons is required to clear this level. If in a particular level, he needs to buy x new weapons, he will pay x^2 coins for it. Also note that he can carry all the weapons he has currently to the next level. Initially, he has no weapons. Can you find out the minimum coins required such that he can clear all the levels?
Input Format
The first line of input contains 2 space separated integers:
N = the number of levels in the game
M = the number of types of weapons
N lines follow. The ith of these lines contains a binary string of length M. If the jth character of this string is 1, it means we need a weapon of type j to clear the ith level.
Constraints
1 <= N <= 20
1 <= M <= 20
Output Format
Print a single integer which is the answer to the problem.
Sample TestCase 1
Input
1 4
0101
Output
4
Explanation
There is only one level in this game. We need 2 types of weapons - 1 and 3. Since, initially, Thor has no weapons he will have to buy these, which will cost him 2^2 = 4 coins.
Sample TestCase 2
Input
3 3
111
001
010
Output
3
Explanation
There are 3 levels in this game. The 0th level (111) requires all 3 types of weapons. The 1st level (001) requires only weapon of type 2. The 2nd level requires only weapon of type 1. If we clear the levels in the given order (0-1-2), total cost = 3^2 + 0^2 + 0^2 = 9 coins. If we clear the levels in the order 1-2-0, it will cost = 1^2 + 1^2 + 1^2 = 3 coins, which is the optimal way.

The beauty of Gassa's answer is partly in the fact that if a different state can be reached by oring one of the levels' bitstring masks with the current state, we are guaranteed that achieving the current state did not include visiting this level (since otherwise those bits would already be set). This means checking a transition from one state to another by adding a different bitmask, guarantees we are looking at an ordering that did not yet include that mask. So a formulation like Gassa's could work: let f(st) represent the cost of acheiving state st, then:
f(st) = min(
some known cost of f(st),
f(prev_st) + (popcount(prev_st | level) - popcount(prev_st))^2
)
for all level and prev_st that or to st

Related

Maximum path cost in matrix

Can anyone tell the algorithm for finding the maximum path cost in a NxM matrix starting from top left corner and ending with bottom right corner with left ,right , down movement is allowed in a matrix and contains negative cost. A cell can be visited any number of times and after visiting a cell its cost is replaced with 0
Constraints
1 <= nxm <= 4x10^6
INPUT
4 5
1 2 3 -1 -2
-5 -8 -1 2 -150
1 2 3 -250 100
1 1 1 1 20
OUTPUT
37
Explanation is given in the image
Explanation of Output
Since you have also negative costs then use bellman-ford. What you do is that you change sign of all the costs(convert negative signs to positive and positive to negative) then find the shortest path and this path will be the longest because you have changed the signs.
If the sign is never becoms negative then use dijkstra shrtest-path but before that make all values negative and this will return you the longest path with it's cost.
You matrix is a direct graph. In your image you are trying to find a path(max or min) from index (0,0) to (n-1,n-1).
You need these things to represent it as a graph.
You need a linkedlist and in each node you have a first_Node, second_Node,Cost to move from first node to second.
An array of linkedlist. In each array index you save a linkedlist.If for example there is a path from 0 to 5 and 0 to 1(it's an undirected graph) then your graph will look like this.
If you want a direct-graph then simply add in adj[0] = 5 and do not add in adj[5] = 0 , this means that there is path from 0 to 5 but not from 5 to zero.
Here linkedlist represents only nodes which are connected not there cost. You have to add extra variable there which keep cost for each two nodes and it will look like this.
Now instead of first linkedlist put this linkedlist in your array and you have a graph now to run shortest or longest path algorithm.
If you want an intellgent algorithm then you can use A* with heuristic, i guess manhattan will be best.
If cost of your edges is not negative then use Dijkstra.
If cost is negative then use bellman-ford algorithm.
You can always find the longest path by converting the minus sign to plus and plus to minus and then run shortest path algorithm. Path founded will be longest.
I answered this question and as you said in comments to look at point two. If that's a task then main idea of this assignment is ensure the Monotonocity.
h stands for heuristic cost.
A stands for accumulated cost.
Which says that each node the h(A) =< h(A) + A(A,B). Means if you want to move from A to B then cost should not be decreasing(can you do something with your values such that this property will hold) but increasing and once you satisfy this condition then everyone node which A* chooses , that node will be part of your path from source to Goal because this is the path with shortest/longest value.
pathMax You can enforece monotonicity. If there is path from A to B such that f(S...AB) < f(S ..B) then set cost of the f(S...AB) = Max(f(S...AB) , f(S...A)) where S means source.
Since moving up is not allowed, paths always look like a set of horizontal intervals that share at least 1 position (for the down move). Answers can be characterized as, say
struct Answer {
int layer[N][2]; // layer[i][0] and [i][1] represent interval start&end
// with 0 <= layer[i][0] <= layer[i][1] < M
// layer[0][0] = 0, layer[N][1] = M-1
// and non-empty intersection of layers i and i+1
};
An alternative encoding is to note only layer widths and offsets to each other; but you would still have to make sure that the last layer includes the exit cell.
Assuming that you have a maxLayer routine that finds the highest-scoring interval in each layer (const O(M) per layer), and that all such such layers overlap, this would yield an O(N+M) optimal answer. However, it may be necessary to expand intervals to ensure that overlap occurs; and there may be multiple highest-scoring intervals in a given layer. At this point I would model the problem as a directed graph:
each layer has one node per score-maximizing horizontal continuous interval.
nodes from one layer are connected to nodes in the next layer according to the cost of expanding both intervals to achieve at least 1 overlap. If they already overlap, the cost is 0. Edge costs will always be zero or negative (otherwise, either source or target intervals could have scored higher by growing bigger). Add the (expanded) source-node interval value to the connection cost to get an "edge weight".
You can then run Dijkstra on this graph (negate edge weights so that the "longest path" is returned) to find the optimal path. Even better, since all paths pass once and only once through each layer, you only need to keep track of the best route to each node, and only need to build nodes and edges for the layer you are working on.
Implementation details ahead
to calculate maxLayer in O(M), use Kadane's Algorithm, modified to return all maximal intervals instead of only the first. Where the linked algorithm discards an interval and starts anew, you would instead keep a copy of that contender to use later.
given the sample input, the maximal intervals would look like this:
[0]
1 2 3 -1 -2 [1 2 3]
-5 -8 -1 2 -150 => [2]
1 2 3 -250 100 [1 2 3] [100]
1 1 1 1 20 [1 1 1 1 20]
[0]
given those intervals, they would yield the following graph:
(0)
| =>0
(+6)
\ -1=>5
\
(+2)
=>7/ \ -150=>-143
/ \
(+7) (+100)
=>12 \ / =>-43
\ /
(+24)
| =>37
(0)
when two edges incide on a single node (row 1 1 1 1 20), carry forward only the highest incoming value.
For each element in a row, find the maximum cost that can be obtained if we move horizontally across the row, given that we go through that element.
Eg. For the row
1 2 3 -1 -2
The maximum cost for each element obtained if we move horizontally given that we pass through that element will be
6 6 6 5 3
Explanation:
for element 3: we can move backwards horizontally touching 1 and 2. we will not move horizontally forward as the values -1 and -2, reduces the cost value.
So the maximum cost for 3 = 1 + 2 + 3 = 6
The maximum cost matrix for each of elements in a row if we move horizontally, for the input you have given in the description will be
6 6 6 5 3
-5 -7 1 2 -148
6 6 6 -144 100
24 24 24 24 24
Since we can move vertically from one row to the below row, update the maximum cost for each element as follows:
cost[i][j] = cost[i][j] + cost[i-1][j]
So the final cost matrix will be :
6 6 6 5 3
1 -1 7 7 -145
7 5 13 -137 -45
31 29 37 -113 -21
Maximum value in the last row of the above matrix will be give you the required output i.e 37

Modified Tower of Hanoi

We all know that the minimum number of moves required to solve the classical towers of hanoi problem is 2n-1. Now, let us assume that some of the discs have same size. What would be the minimum number of moves to solve the problem in that case.
Example, let us assume that there are three discs. In the classical problem, the minimum number of moves required would be 7. Now, let us assume that the size of disc 2 and disc 3 is same. In that case, the minimum number of moves required would be:
Move disc 1 from a to b.
Move disc 2 from a to c.
Move disc 3 from a to c.
Move disc 1 from b to c.
which is 4 moves. Now, given the total number of discs n and the sets of discs which have same size, find the minimum number of moves to solve the problem. This is a challenge by a friend, so pointers towards solution are welcome. Thanks.
Let's consider a tower of size n. The top disk has to be moved 2n-1 times, the second disk 2n-2 times, and so on, until the bottom disk has to be moved just once, for a total of 2n-1 moves. Moving each disk takes exactly one turn.
1 moved 8 times
111 moved 4 times
11111 moved 2 times
1111111 moved 1 time => 8 + 4 + 2 + 1 == 15
Now if x disks have the same size, those have to be in consecutive layers, and you would always move them towards the same target stack, so you could just as well collapse those to just one disk, requiring x turns to be moved. You could consider those multi-disks to be x times as 'heavy', or 'thick', if you like.
1
111 1 moved 8 times
111 collapse 222 moved 4 times, taking 2 turns each
11111 -----------> 11111 moved 2 times
1111111 3333333 moved 1 time, taking 3 turns
1111111 => 8 + 4*2 + 2 + 1*3 == 21
1111111
Now just sum those up and you have your answer.
Here's some Python code, using the above example: Assuming you already have a list of the 'collapsed' disks, with disks[i] being the weight of the collapsed disk in the ith layer, you can just do this:
disks = [1, 2, 1, 3] # weight of collapsed disks, top to bottom
print sum(d * 2**i for i, d in enumerate(reversed(disks)))
If instead you have a list of the sizes of the disks, like on the left side, you could use this algorithm:
disks = [1, 3, 3, 5, 7, 7, 7] # size of disks, top to bottom
last, t, s = disks[-1], 1, 0
for d in reversed(disks):
if d < last: t, last = t*2, d
s = s + t
print s
Output, in both cases, is 21, the required number of turns.
It completely depends on the distribution of the discs that are the same size. If you have n=7 discs and they are all the same size then the answer is 7 (or n). And, of course the standard problem is answered by 2n-1.
As tobias_k suggested, you can group same size discs. So now look at the problem as moving groups of discs. To move a certain number of groups, you have to know the size of each group
examples
1
n=7 //disc sizes (1,2,3,3,4,5,5)
g=5 //group sizes (1,1,2,1,2)
//group index (1,2,3,4,5)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 1*2^4 + 1*2^3 + 2*2^2 + 1*2^1 + 2*2^0
= 16 + 8 + 8 + 2 + 2
= 36
2
n=7 //disc sizes (1,1,1,1,1,1,1)
g=1 //group sizes (7)
//group index (1)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 7*2^0
= 7
3
n=7 //disc sizes (1,2,3,4,5,6,7)
g=7 //group sizes (1,1,1,1,1,1,1)
//group index (1,2,3,4,5,6,7)
number of moves = sum( g-size * 2^( g-count - g-index ) )
in this case
moves = 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 1*2^0
= 64 + 32 + 16 + 8 + 4 + 2 + 1
= 127
Interesting note about the last example, and the standard hanoi problem: sum(2n-1) = 2n - 1
I wrote a Github gist in C for this problem. I am attaching a link to it, may be useful to somebody, I hope.
Modified tower of Hanoi problem with one or more disks of the same size
There are n types of disks. For each type, all disks are identical. In array arr, I am taking the number of disks of each type. A, B and C are pegs or towers.
Method swap(int, int), partition(int, int) and qSort(int, int) are part of my implementation of the quicksort algorithm.
Method toh(char, char, char, int, int) is the Tower of Hanoi solution.
How it is working: Imagine we compress all the disks of the same size into one disk. Now we have a problem which has a general solution to the Tower of Hanoi. Now each time a disk moves, we add the total movement which is equal to the total number of that type of disk.

Determining the Longest Continguous Subsequence

There are N nodes (1 <= N <= 100,000) various positions along a
long one-dimensional length. The ith node is at position x_i (an
integer in the range 0...1,000,000,000) and has a node type b_i(an integer in
the range 1..8). Nodes can not be in the same position
You want to get a range on this one-dimension in which all of the types of nodes are fairly represented. Therefore, you want to ensure that, for whatever types of nodes that are present in the range, there is an equal number of each node type (for example, a range with 27 each of types 1 and 3 is ok, a range with 27 of types 1, 3, and 4 is
ok, but 9 of type 1 and 10 of type 3 is not ok). You also want
at least K (K >= 2) types (out of the 8 total) to be represented in the
rand. Find the maximum size of this range that satisfies the constraints. The size of a photo is the difference between the maximum and minimum positions of the nodes in the photo.
If there are no ranges satisfying the constraints, output -1 instead.
INPUT:
* Line 1: N and K separated by a space
* Lines 2..N+1: Each line contains a description of a node as two
integers separated by a space; x(i) and its node type.
INPUT:
9 2
1 1
5 1
6 1
9 1
100 1
2 2
7 2
3 3
8 3
INPUT DETAILS:
Node types: 1 2 3 - 1 1 2 3 1 - ... - 1
Locations: 1 2 3 4 5 6 7 8 9 10 ... 99 100
OUTPUT:
* Line 1: A single integer indicating the maximum size of a fair
range. If no such range exists, output -1.
OUTPUT:
6
OUTPUT DETAILS:
The range from x = 2 to x = 8 has 2 each of types 1, 2, and 3. The range
from x = 9 to x = 100 has 2 of type 1, but this is invalid because K = 2
and so you need at least 2 distinct types of nodes.
Could You Please help in suggesting some algorithm to solve this. I have thought about using some sort of priority queue or stack data structure, but am really unsure how to proceed.
Thanks, Todd
It's not too difficult to invent almost linear-time algorithm because recently similar problem was discussed on CodeChef: "ABC-Strings".
Sort nodes by their positions.
Prepare all possible subsets of node types (for example, we could expect types 1,2,4,5,7 to be present in resulting interval and all other types not present there). For K=2 there may be only 256-8-1=247 subsets. For each subset perform remaining steps:
Initialize 8 type counters to [0,0,0,0,0,0,0,0].
For each node perform remaining steps:
Increment counter for current node type.
Take L counters for types included to current subset, subtract first of them from other L-1 counters, which produces L-1 values. Take remaining 8-L counters and combine them together with those L-1 values into a tuple of 7 values.
Use this tuple as a key for hash map. If hash map contains no value for this key, add a new entry with this key and value equal to the position of current node. Otherwise subtract value in the hash map from the position of current node and (possibly) update the best result.

Calculate unique integer identifier for elements in an hierarchical structure based on its position

We have a hierarchical structure like this :
- 1 ( identifier = 100 )
- 1 (101)
- 2 (102)
- 3 (103)
- 1 (1031)
- 2 (1032)
- 3 (1033)
- 4 (104)
- 1 (1041)
- 2 (1042)
- 3 (1043)
-901 (1001)
- 2 (200)
- 1 (201)
- 2 (202)
- 10 (1000)
- 1 (1001)
Required characteristics :
the identifiers of each node should be unique.
the identifiers should increment accordingly to level of the element
the identifiers should be of Integer type.
the counter of each element resets at each new level/parent element
As you can see in the example of elements 1.901 and 10.1 , the current implementation doesn't work.
We've tried next solutions :
multiplying each level by number.
multiplying only the first level by a number and add each child
It becomes much more easier in case the identifier is a String, in this case we can use next way : "level1.level2.level3...", so for 1 -> 1 it will be "1.1" and so on. But this is the most unwanted step.
So, could you please suggest any algorithm that can be used here to generate required identifiers ?
Update Fixed the example. P.S. I knew that it's wrong.
You're too hung up on decimal numbers.
Choose a value X which represents the maximum number of child nodes within a node, or any number greater than that that you find convenient to work with.
Drop your adherence to decimal numbers, and represent all identifiers as integers represented in base X.
Encode an identifier as an integer in base X where the first digit represents the top-level of the node tree, the second digit represents the second level, and so forth.
So, if you were lucky and a reasonable value for X turned out to be 16 you could use hexadecimal representations of integers. If 36 is a good value, use any alphanumeric character as a digit.
EDIT
As Rafael has pointed out this approach breaks down if it is not possible to define an upper limit on the number of children a node can have. In my experience this is unlikely to be a serious problem in practice.
If the value of X is large, say 863 then I'd suggest the obvious implementation would be to set X = 1000 and use groups of 3 decimal digits to represent each digit in base 1000. This way the identifier 12.245.1 would be represented as 12245001.
And now we're into territory already covered by alestanis' answer
First of all, your example is wrong (100 maps to identifier 10000), but this doesn't mean it works. Here"s why:
- 1 (100)
-1 (101)
...
-100 (200)
- 2 (200)
What you need to have a working example is indeed to multiply each level by a number N or to say the same, reserve X digits for each level. In your case, you chose N = 100, so the extra constraint you need to make your example work is that each level cannot have more than N-1 children.
If you enforce this constraint, element 1.100 would be illegal, thus eliminating duplicate identifiers.
Using N = 100 (or X = 2) could yield:
- 1 (01)
- 1 (0101)
- 1 (010101)
...
- 3 (010103)
...
- 99 (0199)
- 100 is ILLEGAL
- 2 (02)
- 42 (42)
- 42 (4242)
The "problem" then would be to choose X wisely so that you minimize the number of needed digits without constraining your user. For instance, if you know you will never add more than 450 children, choose X=3.
This problem has a simple but memory-wise costly solution:
Each node position can be easily represented as an unique finite sequence of integers i(1), i(2), ..., i(n):
node 1 = [1]
node 1.1 = [1, 1]
node 7.2.42 = [7, 2, 42]
Thus the question can be represented as how to map each finite sequence of integers to a unique integer. That can be done using prime numbers
p(1) = 2
p(2) = 3
p(3) = 5
...
Just multiply the i(n):th powers of primes p(n). The uniqueness of the resulting integer is guaranteed by the uniqueness of prime number factorization. See Wikipedia:
http://en.wikipedia.org/wiki/Integer_factorization
Examples:
node 1 : 2^1 = 2
node 1.1 : 2^1 * 3^1 = 6
node 7.2.42 : 2^7 * 3^2 * 5^42 = (huge number!)
Another possible solution
Begin with the string-representation of the node (e.g. "7.2.42"). Use octal numbers for node numbering. Use '8' (the first unused digit for octals) to separate levels instead of '.'. Use the resulting string as a decimal integer.
Examples:
node 1 : 1
node 1.1 : 181
node 1.2 : 182
node 1.3 : 183
node 7.7.7 : 78787
node 8 : 10
node 8.1 : 1081
node 8.2 : 1082
node 8.9 : 10811
node 8.9.1: 1081181
node 8.9.2: 1081182

minimum steps required to make array of integers contiguous

given a sorted array of distinct integers, what is the minimum number of steps required to make the integers contiguous? Here the condition is that: in a step , only one element can be changed and can be either increased or decreased by 1 . For example, if we have 2,4,5,6 then '2' can be made '3' thus making the elements contiguous(3,4,5,6) .Hence the minimum steps here is 1 . Similarly for the array: 2,4,5,8:
Step 1: '2' can be made '3'
Step 2: '8' can be made '7'
Step 3: '7' can be made '6'
Thus the sequence now is 3,4,5,6 and the number of steps is 3.
I tried as follows but am not sure if its correct?
//n is the number of elements in array a
int count=a[n-1]-a[0]-1;
for(i=1;i<=n-2;i++)
{
count--;
}
printf("%d\n",count);
Thanks.
The intuitive guess is that the "center" of the optimal sequence will be the arithmetic average, but this is not the case. Let's find the correct solution with some vector math:
Part 1: Assuming the first number is to be left alone (we'll deal with this assumption later), calculate the differences, so 1 12 3 14 5 16-1 2 3 4 5 6 would yield 0 -10 0 -10 0 -10.
sidenote: Notice that a "contiguous" array by your implied definition would be an increasing arithmetic sequence with difference 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 to be contiguous, or 5 3 1 to be contiguous, or 1 2 3 2 3 to be contiguous. You also did not specify if negative numbers should be treated any differently.)
theorem: The contiguous numbers must lie between the minimum and maximum number. [proof left to reader]
Part 2: Now returning to our example, assuming we took the 30 steps (sum(abs(0 -10 0 -10 0 -10))=30) required to turn 1 12 3 14 5 16 into 1 2 3 4 5 6. This is one correct answer. But 0 -10 0 -10 0 -10+c is also an answer which yields an arithmetic sequence of difference 1, for any constant c. In order to minimize the number of "steps", we must pick an appropriate c. In this case, each time we increase or decrease c, we increase the number of steps by N=6 (the length of the vector). So for example if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c=2), then the differences would have been 2 -8 2 -8 2 -8, and sum(abs(2 -8 2 -8 2 -8))=30.
Now this is very clear if you could picture it visually, but it's sort of hard to type out in text. First we took our difference vector. Imagine you drew it like so:
4|
3| *
2| * |
1| | | *
0+--+--+--+--+--*
-1| |
-2| *
We are free to "shift" this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We wish to find the shift which minimizes the number of | you see (the area between the curve and the x-axis). This is NOT the average (that would be minimizing the standard deviation or RMS error, not the absolute error). To find the minimizing c, let's think of this as a function and consider its derivative. If the differences are all far away from the x-axis (we're trying to make 101 112 103 114 105 116), it makes sense to just not add this extra stuff, so we shift the function down towards the x-axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of the *s passes the x axis. Each time we decrease c, we improve the solution by 5-1=4 (we save 5 steps of work, but have to do 1 extra step of work for the * below the x-axis). Eventually when HALF the *s are past the x-axis, we can NO LONGER IMPROVE THE SOLUTION (derivative: 3-3=0). (In fact soon we begin to make the solution worse, and can never make it better again. Not only have we found the minimum of this function, but we can see it is a global minimum.)
Thus the solution is as follows: Pretend the first number is in place. Calculate the vector of differences. Minimize the sum of the absolute value of this vector; do this by finding the median OF THE DIFFERENCES and subtracting that off from the differences to obtain an improved differences-vector. The sum of the absolute value of the "improved" vector is your answer. This is O(N) The solutions of equal optimality will (as per the above) always be "adjacent". A unique solution exists only if there are an odd number of numbers; otherwise if there are an even number of numbers, AND the median-of-differences is not an integer, the equally-optimal solutions will have difference-vectors with corrective factors of any number between the two medians.
So I guess this wouldn't be complete without a final example.
input: 2 3 4 10 14 14 15 100
difference vector: 2 3 4 5 6 7 8 9-2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
note that the medians of the difference-vector are not in the middle anymore, we need to perform an O(N) median-finding algorithm to extract them...
medians of difference-vector are -5 and -7
let us take -5 to be our correction factor (any number between the medians, such as -6 or -7, would also be a valid choice)
thus our new goal is 2 3 4 5 6 7 8 9+5=7 8 9 10 11 12 13 14, and the new differences are 5 5 5 0 -3 -2 -2 -86*
this means we will need to do 5+5+5+0+3+2+2+86=108 steps
*(we obtain this by repeating step 2 with our new target, or by adding 5 to each number of the previous difference... but since you only care about the sum, we'd just add 8*5 (vector length times correct factor) to the previously calculated sum)
Alternatively, we could have also taken -6 or -7 to be our correction factor. Let's say we took -7...
then the new goal would have been 2 3 4 5 6 7 8 9+7=9 10 11 12 13 14 15 16, and the new differences would have been 7 7 7 2 1 0 0 -84
this would have meant we'd need to do 7+7+7+2+1+0+0+84=108 steps, the same as above
If you simulate this yourself, can see the number of steps becomes >108 as we take offsets further away from the range [-5,-7].
Pseudocode:
def minSteps(array A of size N):
A' = [0,1,...,N-1]
diffs = A'-A
medianOfDiffs = leftMedian(diffs)
return sum(abs(diffs-medianOfDiffs))
Python:
leftMedian = lambda x:sorted(x)[len(x)//2]
def minSteps(array):
target = range(len(array))
diffs = [t-a for t,a in zip(target,array)]
medianOfDiffs = leftMedian(diffs)
return sum(abs(d-medianOfDiffs) for d in diffs)
edit:
It turns out that for arrays of distinct integers, this is equivalent to a simpler solution: picking one of the (up to 2) medians, assuming it doesn't move, and moving other numbers accordingly. This simpler method often gives incorrect answers if you have any duplicates, but the OP didn't ask that, so that would be a simpler and more elegant solution. Additionally we can use the proof I've given in this solution to justify the "assume the median doesn't move" solution as follows: the corrective factor will always be in the center of the array (i.e. the median of the differences will be from the median of the numbers). Thus any restriction which also guarantees this can be used to create variations of this brainteaser.
Get one of the medians of all the numbers. As the numbers are already sorted, this shouldn't be a big deal. Assume that median does not move. Then compute the total cost of moving all the numbers accordingly. This should give the answer.
community edit:
def minSteps(a):
"""INPUT: list of sorted unique integers"""
oneMedian = a[floor(n/2)]
aTarget = [oneMedian + (i-floor(n/2)) for i in range(len(a))]
# aTargets looks roughly like [m-n/2?, ..., m-1, m, m+1, ..., m+n/2]
return sum(abs(aTarget[i]-a[i]) for i in range(len(a)))
This is probably not an ideal solution, but a first idea.
Given a sorted sequence [x1, x2, …, xn]:
Write a function that returns the differences of an element to the previous and to the next element, i.e. (xn – xn–1, xn+1 – xn).
If the difference to the previous element is > 1, you would have to increase all previous elements by xn – xn–1 – 1. That is, the number of necessary steps would increase by the number of previous elements × (xn – xn–1 – 1). Let's call this number a.
If the difference to the next element is >1, you would have to decrease all subsequent elements by xn+1 – xn – 1. That is, the number of necessary steps would increase by the number of subsequent elements × (xn+1 – xn – 1). Let's call this number b.
If a < b, then increase all previous elements until they are contiguous to the current element. If a > b, then decrease all subsequent elements until they are contiguous to the current element. If a = b, it doesn't matter which of these two actions is chosen.
Add up the number of steps taken in the previous step (by increasing the total number of necessary steps by either a or b), and repeat until all elements are contiguous.
First of all, imagine that we pick an arbitrary target of contiguous increasing values and then calculate the cost (number of steps required) for modifying the array the array to match.
Original: 3 5 7 8 10 16
Target: 4 5 6 7 8 9
Difference: +1 0 -1 -1 -2 -7 -> Cost = 12
Sign: + 0 - - - -
Because the input array is already ordered and distinct, it is strictly increasing. Because of this, it can be shown that the differences will always be non-increasing.
If we change the target by increasing it by 1, the cost will change. Each position in which the difference is currently positive or zero will incur an increase in cost by 1. Each position in which the difference is currently negative will yield a decrease in cost by 1:
Original: 3 5 7 8 10 16
New target: 5 6 7 8 9 10
New Difference: +2 +1 0 0 -1 -6 -> Cost = 10 (decrease by 2)
Conversely, if we decrease the target by 1, each position in which the difference is currently positive will yield a decrease in cost by 1, while each position in which the difference is zero or negative will incur an increase in cost by 1:
Original: 3 5 7 8 10 16
New target: 3 4 5 6 7 8
New Difference: 0 -1 -2 -2 -3 -8 -> Cost = 16 (increase by 4)
In order to find the optimal values for the target array, we must find a target such that any change (increment or decrement) will not decrease the cost. Note that an increment of the target can only decrease the cost when there are more positions with negative difference than there are with zero or positive difference. A decrement can only decrease the cost when there are more positions with a positive difference than with a zero or negative difference.
Here are some example distributions of difference signs. Remember that the differences array is non-increasing, so positives always have to be first and negatives last:
C C
+ + + - - - optimal
+ + 0 - - - optimal
0 0 0 - - - optimal
+ 0 - - - - can increment (negatives exceed positives & zeroes)
+ + + 0 0 0 optimal
+ + + + - - can decrement (positives exceed negatives & zeroes)
+ + 0 0 - - optimal
+ 0 0 0 0 0 optimal
C C
Observe that if one of the central elements (marked C) is zero, the target must be optimal. In such a circumstance, at best any increment or decrement will not change the cost, but it may increase it. This result is important, because it gives us a trivial solution. We pick a target such that a[n/2] remains unchanged. There may be other possible targets that yield the same cost, but there are definitely none that are better. Here's the original code modified to calculate this cost:
//n is the number of elements in array a
int targetValue;
int cost = 0;
int middle = n / 2;
int startValue = a[middle] - middle;
for (i = 0; i < n; i++)
{
targetValue = startValue + i;
cost += abs(targetValue - a[i]);
}
printf("%d\n",cost);
You can not do it by iterating once on the array, that's for sure.
You need first to check the difference between each two numbers, for example:
2,7,8,9 can be 2,3,4,5 with 18 steps or 6,7,8,9 with 4 steps.
Create a new array with the difference like so: for 2,7,8,9 it wiil be 4,1,1. Now you can decide whether to increase or decrease the first number.
Lets assume that the contiguous array looks something like this -
c c+1 c+2 c+3 .. and so on
Now lets take an example -
5 7 8 10
The contiguous array in this case will be -
c c+1 c+2 c+3
In order to get the minimum steps, the sum of the modulus of the difference of the integers(before and after) w.r.t the ith index should be the minimum. In which case,
(c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2 should be minimum
Let f(c) = (c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2
= 4c^2 - 48c + 146
Applying differential calculus to get the minima,
f'(c) = 8c - 48 = 0
=> c = 6
So our contiguous array is 6 7 8 9 and the minimum cost here is 2.
To sum it up, just generate f(c), get the first differential and find out c.
This should take O(n).
Brute force approach O(N*M)
If one draws a line through each point in the array a then y0 is a value where each line starts at index 0. Then the answer is the minimum among number of steps reqired to get from a to every line that starts at y0, in Python:
y0s = set((y - i) for i, y in enumerate(a))
nsteps = min(sum(abs(y-(y0+i)) for i, y in enumerate(a))
for y0 in xrange(min(y0s), max(y0s)+1)))
Input
2,4,5,6
2,4,5,8
Output
1
3

Resources