Minimum Path to Travel (N-1) 1D Points - algorithm

In order to travel to all (N-1) points, one would usually treat this as an MST / Travelling Salesman problem, but I found an equation that computes the answer in O(1) for 1D points, and I need help understanding it.
min( min(abs(b[0]-a) + b[n-2]-b[0], abs(b[n-2]-a) + b[n-2]-b[0]), min( abs(b[1]-a) + b[n-1]-b[1], abs(b[n-1]-a) + b[n-1] - b[1]))
Where b[] is the array of the given points and a is the starting location.
The source of the problem and the equation are from Codeforces:
http://codeforces.com/contest/709/status/B
I would appreciate any help explaining this mathematical maneuver.

Well, looking at the problem definition, it does not mention that the input points are sorted in any order.
The way you gave is possible only if you first sort the input points; then yes, finding the minimum path is O(1) by evaluating this equation. The total runtime complexity will still be O(n log n) because of the sort.
Explanation of the equation:
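Think of all the points as lying in a row (after sorting). You are given a starting point and need to visit n-1 points, that is, all points except one. A path along the line passes over every point between its two extremes, so skipping an interior point saves nothing; the minimum path is therefore one of these two:
All points except for the first one
All points except for the last one
The equation you gave calculates exactly that: for each of the two cases it walks from a to one end of the remaining range, sweeps across to the other end, and keeps the cheaper option.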
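The formula can be sketched in a few lines of Python. This is a minimal sketch, not the contest's reference solution: the function name and the helper `sweep` are made up for illustration, and the `n <= 1` guard is an assumption added so the indexing stays valid.

```python
def min_path_visiting_all_but_one(a, points):
    """Shortest walk from a that visits at least n-1 of the given 1D points."""
    b = sorted(points)          # the O(n log n) step
    n = len(b)
    if n <= 1:
        return 0                # assumption: nothing has to be visited

    def sweep(lo, hi):
        # Walk from a to the nearer end of [lo, hi], then across to the other end.
        return min(abs(lo - a), abs(hi - a)) + (hi - lo)

    # Either skip the last point (range b[0]..b[n-2])
    # or skip the first point (range b[1]..b[n-1]).
    return min(sweep(b[0], b[n - 2]), sweep(b[1], b[n - 1]))
```

For a = 10 and points 1, 7, 12 this returns 7 (skip the point at 1, walk to 12 and then back to 7).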

Related

Minimize the max distance, 1D array

Problem:
Given a group of numbers of length n (sorted), each number is the location of a house in a 1D line "city".
Given a number k<=n, you need to place k "supermarkets" on the 1D city.
For every element in A, the min distance is defined as the minimum distance between A and a supermarket: |a-c|.
The cost of a city is defined as the max of all min distances.
You need to find what the minimum (optimal) cost would be for a given A of length n, and k<=n.
I can't find a solution for this problem. The solution should use dynamic programming. I'm thinking about how to write the recursive formula, and I think I have already come up with the base cases:
if k = n then obviously the result will be 0 since you can place a supermarket on each house
if k = 1, I think the solution should be: (A[n] - A[1])/2.
But I can't come up with the actual formula (and the whole dynamic program). Also, I can't seem to find a "title" for this problem; I didn't find any other example of this exact problem online.
To minimize the maximum distance from k supermarkets, you divide the houses into consecutive groups so that you minimize the maximum distance between the starting and ending houses in each group. Then you just put a supermarket in the middle of each group.
Solving the problem this way makes it much easier for dynamic programming, since it removes the continuous variable of supermarket position.
I came up with this recursive function for the problem:
if there are at least as many stands as houses, the answer is 0
if there is only one stand, we place it in the middle between the two edge houses
Otherwise:
For every split index between i and j, we take the maximum of the costs of the two parts, and then the minimum over all splits.
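One way to turn the grouping idea into a dynamic program is sketched below. It is an illustration under assumptions, not the answerer's exact recurrence: the names `min_city_cost` and `best` are hypothetical, the houses are assumed sorted, and k is assumed to be at least 1. A group spanning houses i..j costs half the distance between its end houses, and the last group is tried at every possible start.

```python
from functools import lru_cache

def min_city_cost(houses, k):
    """Minimal possible 'max of min distances' for sorted houses and k supermarkets."""
    n = len(houses)
    if n == 0 or k >= n:
        return 0.0                      # one supermarket per house

    @lru_cache(maxsize=None)
    def best(j, m):
        # Optimal cost for houses[0..j] using m supermarkets.
        if m > j:                       # at least as many supermarkets as houses
            return 0.0
        if m == 1:                      # single supermarket in the middle
            return (houses[j] - houses[0]) / 2
        # The last group is houses[i..j]; the rest is solved recursively.
        return min(
            max(best(i - 1, m - 1), (houses[j] - houses[i]) / 2)
            for i in range(1, j + 1)
        )

    return best(n - 1, k)
```

With houses at 1, 2, 3, 10 and k = 2, the groups {1, 2, 3} and {10} give a cost of 1.0. The recursion has O(nk) states and O(n) work per state, and for large n an iterative table would avoid deep recursion.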

Frechet distance in O(n)

I have seen in a number of articles that the complexity of the Fréchet distance algorithm is O(n^2).
The paths are represented as two arrays, Q and P, each of size n.
What if I start from Q[0], P[0], check all the possible next steps, and choose the minimal one:
STP_i,j = min(|Q[i] - P[j+1]|, |Q[i+1] - P[j+1]|,|Q[i+1] - P[j]|)
and then change i and j accordingly?
That way I can get the answer in O(n).
Am I wrong?
Consider the following example:
Take the dots marked in black as the beginning of the lines. In the first step, your algorithm would advance one point on both lines. However, the Fréchet distance in this case would be the distance between the first red point and the third blue point, and since your algorithm has already moved away from the first point, it will give you a larger value.
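For comparison, the quadratic method that the question refers to can be sketched as the standard dynamic program for the discrete Fréchet distance, which considers every way of advancing along the two paths instead of committing to the locally best step. This sketch assumes the discrete variant (distances measured between vertices only) and points given as coordinate tuples; the function name is illustrative.

```python
from math import dist  # available since Python 3.8

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two non-empty point sequences, O(len(P) * len(Q))."""
    n, m = len(P), len(Q)
    # c[i][j]: best achievable "longest leash" when the walkers reach P[i] and Q[j].
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                c[i][j] = d
            elif i == 0:
                c[i][j] = max(c[0][j - 1], d)
            elif j == 0:
                c[i][j] = max(c[i - 1][0], d)
            else:
                # Arrive from any of the three predecessor states, keep the best.
                c[i][j] = max(min(c[i - 1][j], c[i - 1][j - 1], c[i][j - 1]), d)
    return c[n - 1][m - 1]
```

Because each cell keeps the best value over all three predecessors, an early step that looks cheap cannot lock the walk out of a better pairing later, which is exactly what goes wrong in the greedy version.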

Picking the "spread" from the points on a line

I'm facing an algorithmic problem described as follows: given a line from 0 to N (really big N), a list of X points on said line, and a number Z (0<=Z<=X), pick Z points from X to maximize the distance between the two closest points. The brute-force solution in O(n^2) doesn't seem that difficult, but I'm looking for something more sophisticated that can be done in O(n log n) time. Any clues, solutions, or advice are much appreciated.
Edit: Answering the question in the first post: it is the minimal distance (between the two closest points) that has to be maximized.
One easy approach is O(XlogN).
First, sort the points.
Next observe that if you already know the minimum distance (call it d) between the points, it's O(X) to see if there's a way of picking Z points all of which are at least distance d apart: take the left-most element, then the next that's at least distance d away, then the next that's at least distance d away from that, and so on. If by the time you've got to the end of the array you have at least Z points, then you have a solution, and if you don't, there is no solution.
Now, you can use a binary search on [0, N] to find the largest d with a solution.
The sort is O(XlogX), the binary search takes O(logN) trials, and each is O(X). Overall, that's O(XlogX + XlogN), but since N >= X that simplifies to O(XlogN).
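The approach above can be sketched as follows. The sketch assumes integer coordinates and 2 <= Z <= X; the names `max_min_spread` and `feasible` are illustrative, and the search range is narrowed to the span of the points, which is never larger than N.

```python
def max_min_spread(points, z):
    """Largest d such that z of the points can be picked pairwise at least d apart."""
    pts = sorted(points)

    def feasible(d):
        # Greedy check: take the left-most point, then each next point
        # that is at least d away from the last one taken.
        count, last = 1, pts[0]
        for p in pts[1:]:
            if p - last >= d:
                count += 1
                last = p
        return count >= z

    lo, hi = 0, pts[-1] - pts[0]        # d = 0 is always feasible when z <= X
    while lo < hi:
        mid = (lo + hi + 1) // 2        # round up so the loop always makes progress
        if feasible(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

For points 1, 2, 4, 8, 9 and Z = 3 the result is 3 (for example by picking 1, 4 and 8).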

Google Interview : Find the maximum sum of a polygon [closed]

Given a polygon with N vertexes and N edges. There is an int number (could be negative) on every vertex and an operation from the set (*,+) on every edge. Every time, we remove an edge E from the polygon and merge the two vertexes linked by the edge (V1,V2) into a new vertex with value: V1 op(E) V2. The last case would be two vertexes with two edges; the result is the bigger one.
Return the maximum result value that can be obtained from a given polygon.
For the last case we might not need to merge, as the other number could be negative, so in that case we would just return the larger number.
How I am approaching the problem:
p[i,j] denotes the maximum value we can obtain by merging nodes from labelled i to j.
p[i,i] = v[i] -- base case
p[i,j] = max over k from i to j-1 of p[i,k] op p[k+1,j], where op is the operator on the edge between nodes k and k+1.
and then p[0,n] will be my answer.
Second point: I will have to start from each of the vertices and do the same as above, since the polygon is cyclic with n vertices and n edges.
The time complexity for this is n^3 * n, i.e. n^4.
Can I do better than this?
As you have identified (tagged) correctly, this is indeed very similar to the matrix chain multiplication problem (in what order do I multiply matrices in order to do it quickly).
This can be solved in polynomial time using dynamic programming.
I'm going to instead solve a similar, more classic (and identical) problem: given a formula with numbers, additions and multiplications, which way of parenthesizing it gives the maximal value? For example,
6+1 * 2 becomes (6+1)*2 which is more than 6+(1*2).
Let us denote our input as real numbers a1,...,an and operators o(1),...,o(n-1), each either * or +. Our approach works as follows: we consider the subproblem F(i,j), which represents the maximal value of the formula (after parenthesizing) for ai,...,aj. We create a table of such subproblems and observe that F(1,n) is exactly the result we are looking for.
Define
F(i,j)
- If i>j return 0 //no sub-formula of negative length
- If i=j return ai // the maximal formula for one number is the number
- If i<j return the maximal value for all m between i (including) and j (not included) of:
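F(i,m) (o(m)) F(m+1,j) //check all places for possible parenthesis insertion
This goes through all possible options. Proof of correctness is done by induction on the size n=j-i and is pretty trivial when the numbers are non-negative. With negative numbers, the product of two minimal sub-results can exceed the product of the maximal ones, so the minimal value of each sub-formula has to be tracked alongside the maximal one.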
Let's go through the runtime analysis:
If we do not save the values for smaller subproblems, this runs pretty slowly; however, with a table we can make this algorithm run relatively fast, in O(n^3).
We create an n*n table T in which the cell at index i,j contains F(i,j). Filling F(i,i), and F(i,j) for j smaller than i, is O(1) per cell since we can calculate these values directly. Then we go diagonal by diagonal and fill F(i,i+1), then F(i,i+2), and so on (which we can do quickly since we already know all the previous values in the recursive formula). We repeat this n times for n diagonals (all the diagonals in the table, really). Filling each cell takes O(n), and since each diagonal has O(n) cells we fill each diagonal in O(n^2), meaning we fill the whole table in O(n^3). After filling the table we know F(1,n), which is the solution to your problem.
Now back to your problem
If you translate the polygon into n different formulas (one for starting at each vertex) and run the algorithm for formula values on it, you get exactly the value you want.
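A table-filling sketch of this idea is shown below. It deliberately departs from the recurrence as written in one respect: since the question allows negative vertex values, it keeps both the minimal and the maximal value of every sub-formula, because a product of two negatives can be the largest result. The names `max_formula` and `max_polygon`, and the convention that `edge_ops[i]` joins vertex i and vertex i+1 (cyclically), are assumptions made for illustration.

```python
import operator

OPS = {'+': operator.add, '*': operator.mul}

def max_formula(values, ops):
    """Maximal value of values[0] ops[0] values[1] ... over all parenthesizations."""
    n = len(values)
    hi = [[None] * n for _ in range(n)]   # hi[i][j]: max value of values[i..j]
    lo = [[None] * n for _ in range(n)]   # lo[i][j]: min value of values[i..j]
    for i in range(n):
        hi[i][i] = lo[i][i] = values[i]
    for length in range(2, n + 1):        # fill the table diagonal by diagonal
        for i in range(n - length + 1):
            j = i + length - 1
            candidates = []
            for m in range(i, j):         # the last operator applied is ops[m]
                f = OPS[ops[m]]
                for x in (lo[i][m], hi[i][m]):
                    for y in (lo[m + 1][j], hi[m + 1][j]):
                        candidates.append(f(x, y))
            hi[i][j], lo[i][j] = max(candidates), min(candidates)
    return hi[0][n - 1]

def max_polygon(vertex_values, edge_ops):
    """Cut the polygon open at every edge and evaluate the resulting formula."""
    n = len(vertex_values)
    best = None
    for s in range(n):
        vals = vertex_values[s:] + vertex_values[:s]
        ops = (edge_ops[s:] + edge_ops[:s])[:-1]   # drop the edge that was cut
        result = max_formula(vals, ops)
        best = result if best is None else max(best, result)
    return best
```

For the formula 6+1*2 this gives 14, and for -2*1+2*-3 it gives 18 via (-2*(1+2))*-3, a case where tracking only maxima would fall short. Running the O(n^3) table once per cut edge costs O(n^4) overall.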
I think you can reduce the need for a brute force search. For example: if there is a chain of
x + y + z
You can replace it with a single vertex whose value is the sum; you can't do better than that. You need to do the multiplying after the addition when you're dealing with +ve integers. So if it's all positive, then simply reduce all + chains and then multiply.
So that leaves the cases where there are -ve numbers. Seems to me that the strategy for a single -ve number is pretty obvious, for two -ve numbers there are a few cases (remembering that - x - is positive) and for more than 2 -ve numbers it seems to get tricky :-)

Partition a set into k groups with minimum number of moves

You have a set of n objects for which integer positions are given. A group of objects is a set of objects at the same position (not necessarily all the objects at that position: there might be multiple groups at a single position). The objects can be moved to the left or right, and the goal is to move these objects so as to form k groups, and to do so with the minimum distance moved.
For example:
With initial positions at [4,4,7], and k = 3: the minimum cost is 0.
[4,4,7] and k = 2: minimum cost is 0
[1,2,5,7] and k = 2: minimum cost is 1 + 2 = 3
I've been trying to use a greedy approach (by calculating which move would be shortest) but that wouldn't work because every move involves two elements which could be moved either way. I haven't been able to formulate a dynamic programming approach as yet but I'm working on it.
This problem is a one-dimensional instance of the k-medians problem, which can be stated as follows. Given a set of points x_1...x_n, partition these points into k sets S_1...S_k and choose k locations y_1...y_k in a way that minimizes the sum over all x_i of |x_i - y_f(i)|, where y_f(i) is the location corresponding to the set to which x_i is assigned.
Due to the fact that the median is the population minimizer for absolute distance (i.e. L_1 norm), it follows that each location y_j will be the median of the elements x in the corresponding set S_j (hence the name k-medians). Since you are looking at integer values, there is the technicality that if S_j contains an even number of elements, the median might not be an integer, but in such cases choosing either the next integer above or below the median will give the same sum of absolute distances.
The standard heuristic for solving k-medians (and the related and more common k-means problem) is iterative, but this is not guaranteed to produce an optimal or even good solution. Solving the k-medians problem for general metric spaces is NP-hard, and finding efficient approximations for k-medians is an open research problem. Googling "k-medians approximation", for example, will lead to a bunch of papers giving approximation schemes.
http://www.cis.upenn.edu/~sudipto/mypapers/kmedian_jcss.pdf
http://graphics.stanford.edu/courses/cs468-06-winter/Papers/arr-clustering.pdf
In one dimension things become easier, and you can use a dynamic programming approach. A DP solution to the related one-dimensional k-means problem is described in this paper, and the source code in R is available here. See the paper for details, but the idea is essentially the same as what @SajalJain proposed, and can easily be adapted to solve the k-medians problem rather than k-means. For j<=k and m<=n let D(j,m) denote the cost of an optimal j-medians solution to x_1...x_m, where the x_i are assumed to be in sorted order. We have the recurrence
D(j,m) = min (D(j-1,q) + Cost(x_{q+1},...,x_m))
where q ranges from j-1 to m-1 and Cost is equal to the sum of absolute distances from the median. With a naive O(n) implementation of Cost, this would yield an O(n^3k) DP solution to the whole problem. However, this can be improved to O(n^2k) due to the fact that the Cost can be updated in constant time rather than computed from scratch every time, using the fact that, for a sorted sequence:
Cost(x_1,...,x_h) = Cost(x_2,...,x_h) + median(x_1...x_h)-x_1 if h is odd
Cost(x_1,...,x_h) = Cost(x_2,...,x_h) + median(x_2...x_h)-x_1 if h is even
See the writeup for more details. Except for the fact that the update of the Cost function is different, the implementation will be the same for k-medians as for k-means.
http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Wang+Song.pdf
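The recurrence and the constant-time Cost update can be sketched as follows. This is an illustration, not the code from the linked paper: the function name is hypothetical, the table is indexed by the number of points in the prefix, and k <= n is assumed (otherwise the function returns infinity). Both update formulas reduce to the same index, because for a segment of length h starting at position q in the sorted list the relevant median is the element at q + h // 2.

```python
def k_medians_cost(positions, k):
    """Minimum total movement to gather sorted points into exactly k groups, O(n^2 k)."""
    xs = sorted(positions)
    n = len(xs)
    INF = float('inf')
    # D[j][m]: optimal cost of splitting the first m points into j groups.
    D = [[INF] * (n + 1) for _ in range(k + 1)]
    D[0][0] = 0
    for m in range(1, n + 1):
        cost = 0                          # Cost of the segment xs[q..m-1], grown leftward
        for q in range(m - 1, -1, -1):
            h = m - q                     # segment length after adding xs[q]
            if h > 1:
                # Constant-time update: add (median - new left-most point).
                cost += xs[q + h // 2] - xs[q]
            for j in range(1, k + 1):
                candidate = D[j - 1][q] + cost
                if candidate < D[j][m]:
                    D[j][m] = candidate
    return D[k][n]
```

On the examples from the question, [4,4,7] gives 0 for both k = 3 and k = 2, and [1,2,5,7] with k = 2 gives 3.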
As I understand it, the problem is:
We have n points on a line.
We want to place k positions on the line. I call them destinations.
Move each of the n points to one of the k destinations so that the sum of distances is minimal. I call this sum the total cost.
Destinations can overlap.
An obvious fact is that for each point we should look at the nearest destination on its left and the nearest destination on its right and choose the closer of the two.
Another important fact is that all destinations should lie on points, because we can move a destination left or right along the line until it reaches a point without increasing the total distance.
Given these facts, consider the following DP solution:
DP[i][j] is the minimum total cost needed for the first i points when we can use only j destinations and must put a destination on the i-th point.
To calculate DP[i][j], fix the destination before the i-th point (we have i choices), and for each choice (say the k-th point) calculate the distance needed for the points between the k-th point and the i-th point. Add this to DP[k][j - 1] and take the minimum over all k.
The calculation of the initial states (e.g. j = 1) and the final answer is left as an exercise!
Task 0 - sort the position of the objects in non-decreasing order
Let us define 'center' as the position of the object where it is shifted to.
Now we have two observations:
For N positions the 'center' is a median of these N positions, since the median minimizes the sum of absolute distances. Example: let 1,3,6,10 be the positions. The two middle positions are 3 and 6, and either one gives the minimum cost of 12, so we can take 6 as the center. This gives us the position with minimum cost of moving when all elements need to be grouped into 1 group.
Let N positions be grouped into K groups "optimally". When N+1 th object is added, then it will disturb only the K th group, i.e, first K-1 groups will remain unchanged.
From these observations, we build a dynamic programming approach.
Let Cost[i][k] and Center[i][k] be two 2D arrays.
Cost[i][k] = minimum cost when first 'i' objects are partitioned into 'k' groups
Center[i][k] stores the center of the 'i-th' object when Cost[i][k] is computed.
Let {L} be the elements from i-L,i-L+1,..i-1 which have the same center.
(Center[i-L][k] = Center[i-L+1][k] = ... = Center[i-1][k]) These are the only objects that need to be considered in the computation for i-th element (from observation 2)
Now
Cost[i][k] will be
min(Cost[i-1][k-1] , Cost[i-L-1][k-1] + computecost(i-L, i-L+1, ... ,i))
Update Center[i-L ... i][k]
computecost() can be found trivially by finding the center (from observation 1)
Time Complexity:
Sorting O(NlogN)
Total Cost Computation Matrix = Total elements * Computecost = O(NK * N)
Total = O(NlogN + N*NK) = O(N*NK)
Let's look at k=1.
For k=1 and n odd, all points should move to the center point. For k=1 and n even, all points should move to either of the center points or any spot between them. By 'center' I mean in terms of number of points to either side, i.e. the median.
You can see this because if you select a target spot, x, with more points to its right than to its left, then a new target 1 to the right of x would result in a cost reduction (unless there is exactly one more point to the right than the left and the target spot is a point, in which case n is even and the target is on/between the two center points).
If your points are already sorted, this is an O(1) operation. If not, I believe it's O(n) (via an order statistic algorithm).
Once you've found the spot that all points are moving to, it's O(n) to find the cost.
Thus regardless of whether the points are sorted or not, this is O(n).
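The k=1 case is small enough to sketch directly. Note that `statistics.median_low` sorts its input, so this version is O(n log n) rather than the O(n) achievable with a selection algorithm; the function name is illustrative and a non-empty list is assumed.

```python
from statistics import median_low

def one_group_cost(positions):
    """Total movement needed to gather all points at a median position."""
    target = median_low(positions)      # for even n, the lower of the two center points
    return sum(abs(p - target) for p in positions)
```

For positions 1, 2, 5, 7 the target is 2 and the cost is 9; moving everything to 5 instead costs the same, as the argument above predicts for even n.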

Resources