length arrangement with probability and cost - algorithm

Consider a set of lengths, each associated with a probability, e.g.
X={a1=(100,1/4),a2=(500,1/4),a3=(200,1/2)}
Obviously, the probabilities sum to 1.
Arrange the lengths together on a line one after the other from a starting point.
For example: {a2,a1,a3} in that order from start to finish.
Define the cost of an element a_i as the total length from the starting point to the end of that element, multiplied by its probability.
So from the previous arrangement:
cost(a2) = (500)*(1/4)
cost(a1) = (500+100)*(1/4)
cost(a3) = (500+100+200)*(1/2)
Define the total cost as the sum of all the costs, e.g. cost(X) = cost(a2) + cost(a1) + cost(a3). Give an algorithm that finds an arrangement that minimizes cost(X).
My thoughts:
This looks like a greedy algorithm, since the last element in the arrangement always contributes the full total length multiplied by its probability, but I can't think of a heuristic that accomplishes this. It goes without saying that sorting by probability or length alone will not work.
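For what it's worth, this has the same shape as the classic problem of minimizing a weighted sum of completion times, where the standard greedy is to sort by the ratio length/probability in increasing order. A minimal sketch in Python (names are mine):

    def min_cost_arrangement(items):
        # items: list of (length, probability) pairs.
        # Exchange argument: placing a before b is at least as good
        # exactly when len_a*prob_b <= len_b*prob_a, i.e. when
        # len/prob is non-decreasing along the arrangement.
        order = sorted(items, key=lambda lp: lp[0] / lp[1])
        total, prefix = 0.0, 0.0
        for length, prob in order:
            prefix += length        # distance from start to end of this element
            total += prefix * prob  # this element's cost
        return order, total

    X = [(100, 0.25), (500, 0.25), (200, 0.5)]
    print(min_cost_arrangement(X))  # cost 375.0, vs. 675.0 for {a2,a1,a3}

Ties in the ratio (such as a1 and a3 in the example, both with ratio 400) can be broken arbitrarily; both orders give cost 375.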

Related

Given numbers a1,a2,...,an whose sum is positive. Find the minimal number s.t. the sum of numbers less than or equal to it is positive, in linear time

Problem: Given n different numbers a1,a2,...,an whose sum is positive, show how one can find the minimal number such that the sum of the numbers less than or equal to it is positive, in time complexity O(n).
Note: the numbers aren't necessarily whole and they aren't necessarily sorted as given.
Some explanation of the problem: if the array were sorted, [x,x,x,y,x,...,x,x,x], and y is the first number such that summing all the numbers up to and including it gives a positive sum (while summing fewer of them gives a non-positive sum), then y is returned. (The x here is just a placeholder for a number; all numbers in the array are different.)
Attempt:
Define the parameters low, high = 0, n, which will serve as boundaries for summing the elements between them and also as boundaries for choosing the pivot.
Choose a pivot randomly and partition the array (for example, with Lomuto's partition); denote the pivot's index by p'. The partitioning costs O(n). Sum the numbers from low to p' and call this sum s.
If s < 0, set low = p' and repeat the process: choose a random pivot (whose index is again denoted p'), partition between low and high, and sum the numbers between these two boundaries, accumulating s := s + the new partial sum.
Otherwise, set high = p' and repeat the process described in the 'If' case above.
The process will end when low = high.
Besides a few logical gaps in my attempt, its overall complexity is O(n) only on average, not in the worst case.
Do you have any ideas how to solve the problem in worst-case O(n) time? I thought maybe some adaptation of the 'median of medians' algorithm would work, but I have no idea how.
Thanks in advance for any help!
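For concreteness, here is a sketch of the attempt above in Python (names are mine); with random pivots it runs in expected O(n), and substituting a median-of-medians pivot selection would give the worst-case O(n) the question asks for:

    import random

    def min_positive_prefix(a):
        # Minimal value y in a such that the sum of all elements <= y
        # is positive; assumes such a y exists (the total sum is positive).
        lo, hi = 0, len(a) - 1
        acc = 0  # sum of elements already known to be <= the answer
        while lo < hi:
            p = lomuto(a, lo, hi, random.randint(lo, hi))
            s = acc + sum(a[lo:p + 1])  # sum of all elements <= a[p]
            if s > 0:
                hi = p          # the answer is a[p] or something smaller
            else:
                acc = s         # keep the left part's sum
                lo = p + 1      # the answer is strictly larger than a[p]
        return a[lo]

    def lomuto(a, lo, hi, pivot_index):
        # Lomuto partition of a[lo..hi]; returns the pivot's final index.
        a[pivot_index], a[hi] = a[hi], a[pivot_index]
        i = lo
        for j in range(lo, hi):
            if a[j] <= a[hi]:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        return i

    print(min_positive_prefix([3, -1, -4, 6, -2]))  # 6: sum of elements <= 6 is 2 > 0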

Why are we comparing the numbers of the array with the median rather than the mean in the CSES Stick Lengths question (mentioned below)?

There are n sticks with some lengths. Your task is to modify the sticks so that each stick has the same length.
You can either lengthen or shorten each stick. Both operations cost x, where x is the absolute difference between the new and the original length.
What is the minimum total cost?
Input:
The first input line contains an integer n: the number of sticks.
Then there are n integers: p1,p2,…,pn: the lengths of the sticks.
Output:
Print one integer: the minimum total cost.
Example:
Input:
10
576256620 793841203 607061968 362964043 698782696 775664590 69510254 711292185 317067848 711901928
Output:
1758621869
What I thought was that if we take the absolute value of each number minus the mean (or mean+1) of the numbers and add them up, we should get the answer. But it turns out we need to take the median.
Can anyone explain why we get the answer through the median?
As a simple example, consider the sorted list [a,b,c].
Their median = b. Total cost = (b-a) + (b-b) + (c-b) = c-a
Their mean = (a+b+c)/3. Let it be x. Total cost = (x-a) + abs(b-x) + (c-x)
If you compute (x-a)+(c-x), it'll evaluate to c-a. Thus the total cost becomes (c-a) + abs(b-x), which will always be >= (c-a).
Thus the median is the better parameter; in general, the median minimizes the sum of absolute deviations.
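To make this concrete, a short sketch in Python; for an even count any value between the two middle elements yields the same cost, so taking the upper middle element works:

    def min_total_cost(lengths):
        # Move every stick to the median and sum the absolute differences.
        lengths = sorted(lengths)
        median = lengths[len(lengths) // 2]
        return sum(abs(x - median) for x in lengths)

    sticks = [576256620, 793841203, 607061968, 362964043, 698782696,
              775664590, 69510254, 711292185, 317067848, 711901928]
    print(min_total_cost(sticks))  # 1758621869, matching the expected output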

Least cost increasing subsequence

Say we have an array A that contains N integers. We want to minimize the cost of some increasing subsequence (not necessarily strictly increasing) starting at position 1 and ending at position N. The total cost of a subsequence is the total cost of transitioning between consecutive elements in it. When building the subsequence, the cost of transitioning from position j to position i, where i >= j, is given by the matrix COST[i][j]. It is guaranteed that some increasing subsequence from position 1 to position N exists. Values in the array may be very large.
For example:
N = 5
A = [0,3,2,3,3]
Cost =
[[0,INF,INF,INF,INF],
[3,0,INF,INF,INF],
[3,INF,0,INF,INF],
[5,2,2,0,INF],
[6,0,3,1,0]]
The least-cost increasing subsequence is (A[1], A[2], A[5]), or (0,3,3).
The cost is COST[2][1] + COST[5][2] = 3 + 0 = 3.
So far I have been able to modify the traditional O(n^2) DP solution by initializing dp[i] to infinity and dp[1] to 0, then looping over all previous positions to extend the subsequence, maintaining the minimum cost as I iterate.
Now I want to improve this solution to O(n log n). I know the regular LIS problem can be solved with arrays and binary search, but I have been unable to adapt such an approach to this problem.
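For reference, a sketch of the quadratic DP described above (0-indexed; not the O(n log n) being asked for):

    INF = float('inf')

    def least_cost_path(A, COST):
        # dp[i] = minimum cost of an increasing subsequence that starts
        # at position 0 and ends at position i.
        n = len(A)
        dp = [INF] * n
        dp[0] = 0
        for i in range(1, n):
            for j in range(i):
                if dp[j] < INF and A[j] <= A[i]:  # non-strictly increasing
                    dp[i] = min(dp[i], dp[j] + COST[i][j])
        return dp[n - 1]

    A = [0, 3, 2, 3, 3]
    COST = [[0, INF, INF, INF, INF],
            [3, 0, INF, INF, INF],
            [3, INF, 0, INF, INF],
            [5, 2, 2, 0, INF],
            [6, 0, 3, 1, 0]]
    print(least_cost_path(A, COST))  # 3, via positions 1 -> 2 -> 5 (1-indexed)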

Generate list of real values, which sum up to fixed value and satisfy some constraints

I need to generate n random real values P[0], P[1], ..., P[n-1] which satisfy the following constraints:
Pmin[0] <= P[0] <= Pmax[0]
Pmin[1] <= P[1] <= Pmax[1]
...
Pmin[n-1] <= P[n-1] <= Pmax[n-1]
P[0] + P[1] + ... + P[n-1] = S
Any idea how to do this efficiently?
In general, it is not possible to solve this problem by choosing each element uniformly at random from its own range.
Example 1: Say that Pmin[i] = 0 and Pmax[i] = 1. Say that n = 10 and S = 100. Then there is no solution, since the greatest possible sum is 10.
Example 2: Say that Pmin[i] = 0 and Pmax[i] = 1. Say that n = 10 and S = 10. Then there is exactly one solution: choose P[i] = 1.
It is possible to write an algorithm such that the resulting sequence is chosen uniformly at random from the set of possible solutions; this is quite different from saying that the P[i] are uniformly distributed between Pmin[i] and Pmax[i].
The basic idea is to, at each stage, further restrict your range, as follows:
The beginning of the range ought to be the larger of the following two quantities: Pmin[i], or S - Smax[i] - P, where Smax[i] is the sum Pmax[i+1] + ... + Pmax[n-1] and P is the sum P[0] + ... + P[i-1] of the values already chosen. This guarantees that you're picking a number large enough for the remaining elements to eventually reach S.
The end of the range ought to be the smaller of the following two quantities: Pmax[i], or S - Smin[i] - P, where Smin[i] is the sum Pmin[i+1] + ... + Pmin[n-1] and P is as before. This guarantees that you're picking a number small enough to eventually work.
If you are able to obey those rules when picking each P[i], there's a solution, and you will find one at random. Otherwise, there is not a solution.
Note that to actually make this select solutions at random, it's probably best to shuffle the indices, perform this algorithm, and then rearrange the sequence so that it's in the proper order. You can shuffle in O(n), do this algorithm (recommend dynamic programming here, since you can build solutions bottom-up) and then spit out the sequence by "unshuffling" the resulting sequence.
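A minimal sketch of this narrowing idea in Python (names are mine; as noted, without the shuffling step the result is feasible but not uniformly distributed over all solutions):

    import random

    def sample_constrained(pmin, pmax, S):
        # Suffix sums of the minima/maxima still to be chosen.
        n = len(pmin)
        smin, smax = [0.0] * (n + 1), [0.0] * (n + 1)
        for i in range(n - 1, -1, -1):
            smin[i] = smin[i + 1] + pmin[i]
            smax[i] = smax[i + 1] + pmax[i]
        if not (smin[0] <= S <= smax[0]):
            return None  # no feasible solution at all
        p, chosen = [0.0] * n, 0.0
        for i in range(n):
            # Narrow [Pmin[i], Pmax[i]] so the rest can still reach S.
            lo = max(pmin[i], S - chosen - smax[i + 1])
            hi = min(pmax[i], S - chosen - smin[i + 1])
            p[i] = random.uniform(lo, hi)
            chosen += p[i]
        return p

    print(sample_constrained([0]*10, [1]*10, 10))  # forced all-ones solution of Example 2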
For every i, assign P[i] := Pmin[i]
Compute the sum
If sum > S, then stop (it's impossible)
For every i:
    If P[i] + S - sum <= Pmax[i]:
        P[i] = P[i] + S - sum
        Stop (it's done :-)
    sum = sum + Pmax[i] - P[i]
    P[i] = Pmax[i]
    Go for next i
Stop (it's impossible)
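The same procedure in Python, for concreteness (a direct transcription, assuming 0-indexed lists):

    def fill_to_sum(pmin, pmax, S):
        p = list(pmin)              # start every element at its minimum
        total = sum(p)
        if total > S:
            return None             # impossible: even the minima overshoot
        for i in range(len(p)):
            if p[i] + S - total <= pmax[i]:
                p[i] += S - total   # absorb the whole remaining deficit here
                return p            # done
            total += pmax[i] - p[i]
            p[i] = pmax[i]          # raise to the max and go for next i
        return None                 # impossible: even the maxima fall short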
Oops, sorry, you said random... that's not so trivial. Let me think about it...
Run the previous algorithm to have a starting point. Now compute the total margin above and below. The margin above is the sum of individual margins Pmax[i]-P[i] for every i. The margin below is the sum of individual margins P[i]-Pmin[i] for every i.
Traverse all the elements but one in a random order, visiting each one of them exactly once. For every one of them:
Update the margin above and the margin below subtracting from them the contribution of the current element.
Establish a min and max for the current value taking into account that:
They must be in the interval [Pmin[i], Pmax[i]] AND
These min and max are near enough to P[i] that changing other elements later can compensate for changing P[i] to this min or max (that's what the margins above and below indicate)
Change P[i] to a random value in the calculated interval [min, max] and update the sum and the margins (I'm not 100% sure of how the margins should be updated here...)
Then adjust the remaining element to fit the sum S.
Regarding the traversal in random order, see the Knuth shuffle.

Dynamic programming to find minimum number of coins

I'm trying to understand part of a question I have as my HW, but it really looks like Chinese to me...
Let's say we have coins x_1, x_2, x_3, ... x_n. x_1 = 1 always.
We want to give a certain amount of money in a minimum number of coins.
Then we use dynamic programming.
And now I don't understand this: c(i,j) = min { c(i-1,j), 1+c(i,j-x_i) },
where c(i,j) is the minimal number of coins needed to make the amount j using only the coins x_1,...,x_i.
c(i,j-x_i) is the minimal number of coins to get the value j-x_i using only the coins x_1,...,x_i (this is the induction hypothesis; that's what the recursive formula ensures us).
Thus, 1+c(i,j-x_i) is the minimal number of coins to get j, given that we decided to use at least one coin of value x_i.
From this, c(i,j) = min { c(i-1,j), 1+c(i,j-x_i) } is actually choosing "what is best" exhaustively:
Taking the current coin x_i, and checking the rest of the smaller problem recursively (the 1+c(i,j-x_i) term)
Deciding not to take it, and again checking the smaller problem recursively (the c(i-1,j) term)
Taking the minimum of those ensures us (because it is done exhaustively, over all possibilities) that c(i,j) is minimal.
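A compact bottom-up version of the same recurrence, with the i dimension rolled into a single row; this is a standard space optimization, not necessarily the form the HW expects:

    def min_coins(coins, amount):
        # dp[j] = c(i, j) for the coins processed so far; dp[0] = 0.
        INF = float('inf')
        dp = [0] + [INF] * amount
        for x in coins:                            # the "i" dimension
            for j in range(x, amount + 1):
                dp[j] = min(dp[j], 1 + dp[j - x])  # skip coin x, or take it
        return dp[amount]                          # finite, since x_1 = 1

    print(min_coins([1, 5, 10, 25], 63))  # 6 coins: 25 + 25 + 10 + 1 + 1 + 1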
