This grade 11 problem has been bothering me since 2010 and I still can't figure out/find a solution even after university.
Problem Description
There is a very unusual street in your neighbourhood. This street
forms a perfect circle, and the circumference of the circle is
1,000,000. There are H (1 ≤ H ≤ 1000) houses on the street. The
address of each house is the clockwise arc-length from the
northern-most point of the circle. The address of the house at the
northern-most point of the circle is 0. You also have special firehoses
which follow the curve of the street. However, you wish to keep the
length of the longest hose you require to a minimum. Your task is to
place k (1 ≤ k ≤ 1000) fire hydrants on this street so that the maximum
length of hose required to connect a house to a fire hydrant is as
small as possible.
Input Specification
The first line of input will be an integer H, the number of houses. The
next H lines each contain one integer, which is the address of that
particular house, and each house address is at least 0 and less than
1,000,000. On the H + 2nd line is the number k, which is the number of
fire hydrants that can be placed around the circle. Note that a fire
hydrant can be placed at the same position as a house. You may assume
that no two houses are at the same address. Note: at least 40% of the
marks for this question have H ≤ 10.
Output Specification
On one line, output the length of hose required
so that every house can connect to its nearest fire hydrant with that
length of hose.
Sample Input
4
0
67000
68000
77000
2
Output for Sample Input
5000
Link to original question
I can't even come up with a brute-force algorithm, since the optimal placement might not be an integer. For example, if the houses are located at 1 and 2, then the hydrant should be placed at 1.5 and the distance would be 0.5.
Here is a quick outline of an answer.
First, write a function that figures out how many hydrants you need to cover all of the houses with a given maximum arc length per hydrant. (The maximum hose will be half that length.) It starts at a house, covers all of the houses it can, jumps to the next uncovered house, and so on, counting hydrants; it tries every possible starting house and keeps the smallest count. This is an O(n^2) function.
Second, create a sorted list of the pairwise arc distances between houses. (You have to consider going both ways around the circle for a single hydrant; you only need to worry about the shorter way once you have 2+ hydrants.) The length covered by some hydrant in the optimal solution will be one of those distances. This takes O(n^2 log(n)).
Now do a binary search to find the shortest length that can cover all of the houses. This will require O(log(n)) calls to the O(n^2) function that you wrote in the first step.
The end result is an O(n^2 log(n)) algorithm.
And here is working code for all but the parsing logic.
#! /usr/bin/env python

def _find_hoses_needed(circle_length, hose_span, houses):
    # We assume that houses is sorted.
    answers = []  # We can always get away with one hydrant per house.
    for start in range(len(houses)):
        needed = 1
        last_begin = start
        current_house = start + 1 if start + 1 < len(houses) else 0
        while current_house != start:
            pos_begin = houses[last_begin]
            pos_end = houses[current_house]
            # Clockwise arc length from pos_begin to pos_end (wrapping past address 0).
            length = pos_end - pos_begin if pos_begin <= pos_end else circle_length + pos_end - pos_begin
            if hose_span < length:
                # We need a new hose.
                needed = needed + 1
                last_begin = current_house
            current_house = current_house + 1
            if len(houses) <= current_house:
                # We looped around the circle.
                current_house = 0
        answers.append(needed)
    return min(answers)

def find_min_hose_coverage(circle_length, hydrant_count, houses):
    houses = sorted(houses)
    # First we find all of the possible answers.
    is_length = set()
    for i in range(len(houses)):
        for j in range(i, len(houses)):
            is_length.add(houses[j] - houses[i])
            is_length.add(houses[i] - houses[j] + circle_length)
    possible_answers = sorted(is_length)
    # Now we do a binary search.
    lower = 0
    upper = len(possible_answers) - 1
    while lower < upper:
        mid = (lower + upper) // 2  # Integer division; we deliberately drop the fraction.
        if hydrant_count < _find_hoses_needed(circle_length, possible_answers[mid], houses):
            # We need a strictly longer coverage to make it.
            lower = mid + 1
        else:
            # Longer is not needed.
            upper = mid
    return possible_answers[lower]

print(find_min_hose_coverage(1000000, 2, [0, 67000, 68000, 77000]) / 2.0)
I am practising DP and I came across this question. http://www.spoj.com/problems/MPILOT/en/
Charlie acquired airline transport company and to stay in business he needs to lower the expenses by any means possible. There are N pilots working for his company (N is even) and N/2 plane crews needs to be made. A plane crew consists of two pilots - a captain and his assistant. A captain must be older than his assistant. Each pilot has a contract granting him two possible salaries - one as a captain and the other as an assistant. A captain's salary is larger than assistant's for the same pilot. However, it is possible that an assistant has larger salary than his captain. Write a program that will compute the minimal amount of money Charlie needs to give for the pilots' salaries if he decides to spend some time to make the optimal (i.e. the cheapest) arrangement of pilots in crews.
Input
The first line of input contains integer N, 2 ≤ N ≤ 10,000, N is even, the number of pilots working for Charlie's company. The next N lines of input contain the pilots' salaries. The lines are sorted by pilot's age, the salaries of the youngest pilot given first. Each of those N lines contains two integers separated by a space character, X and Y, 1 ≤ Y < X ≤ 100,000, the salary as a captain (X) and the salary as an assistant (Y).
Output
The first and only line of output should contain the minimal amount of money Charlie needs to give for the pilots' salaries.
After some research, I found out this can be solved by DP, but how exactly do I solve it? I have spent hours reading links but didn't find one that is easily understandable. Please help me.
There's actually a nice way to visualize it. Starting from the bottom left, the start of our ascending list, we can envision choosing a Y (assistant's salary) as a movement to the right and an X (captain's salary) as a movement up, with the condition that the southwest-northeast diagonal is not crossed (see Catalan Number in Wikipedia).
From this we can see that each node in the triangle has at most two predecessors, from the west or from the south, and so the bottom-up general case ought to be:
dp[i][j] = min(x[i+j-1] + dp[i-1][j],   // pilot i+j as captain
               y[i+j-1] + dp[i][j-1])   // pilot i+j as assistant
Example:
x = [4,5,6,7]
y = [3,2,1,2]
[9+7]
[3+5] [min(8+1,5+6)]
[.] [3] [3+2]
I'll leave coding as an exercise.
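To make the recurrence concrete, here is a minimal Python sketch of the triangle DP (my own illustration, not part of the original answer), assuming x holds captain salaries and y assistant salaries, pilots sorted youngest first:

def min_crew_cost(x, y):
    # dp[i][j]: cheapest way to label i captains and j assistants among the
    # first i+j pilots; only states with j >= i (below the diagonal) are valid.
    n = len(x)
    INF = float('inf')
    dp = [[INF] * (n // 2 + 1) for _ in range(n // 2 + 1)]
    dp[0][0] = 0
    for i in range(n // 2 + 1):            # captains chosen so far
        for j in range(i, n // 2 + 1):     # assistants chosen so far (j >= i)
            if i == 0 and j == 0:
                continue
            best = INF
            if i > 0:
                best = min(best, dp[i - 1][j] + x[i + j - 1])   # pilot i+j becomes a captain
            if j > i:
                best = min(best, dp[i][j - 1] + y[i + j - 1])   # pilot i+j becomes an assistant
            dp[i][j] = best
    return dp[n // 2][n // 2]

print(min_crew_cost([4, 5, 6, 7], [3, 2, 1, 2]))   # 16, matching the example above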
As usual, we need to find a compact set of subproblems. Remembering the details of how the pilots are matched won't do -- we need something like the following characterization.
Consider a labeling that makes each pilot a captain or an assistant without regard to matching. There exists a valid matching if and only if the following two conditions hold:
There are N/2 captains and N/2 assistants.
For every age cutoff, there are at least as many assistants who are younger than the cutoff as there are captains who are younger than the cutoff.
The "only if" direction is easy: Condition 1 is obvious, and Condition 2 holds because the inequality holds for each 2-pilot crew, and we can sum the inequalities.
For the "if" direction, we actually have to construct crews. We proceed by induction. If there are no pilots, then the empty match is valid. Otherwise, since the youngest pilot is an assistant (by Condition 2) and there exists at least one captain (by Condition 1), there exists (by Sperner's lemma if you want to get fancy) a pair of pilots such that (a) no pilots are intermediate in age (b) the younger of the pair is an assistant (c) the older of the pair is a captain. Match the pair and remove them from the pool. Observe that both Conditions still hold, so match the rest by inductive hypothesis.
This observation leads to an O(N^2)-time dynamic program. We repeatedly read the salaries of the next oldest pilot and then compute, given that K pilots have been considered so far, for all C from 0 to [K/2], the minimum cost of paying K - C assistants and C captains among these pilots. At the end, return the cost of paying N/2 assistants and N/2 captains. Untested Python:
def cost(pilots):
    # pilots: list of (assistant_salary, captain_salary) tuples, youngest pilot first.
    cost = [0]
    for i, (assistant_salary, captain_salary) in enumerate(pilots):
        cost.append(float('inf'))  # two-way sentinel
        cost = [min(cost[c] + assistant_salary,
                    cost[c - 1] + captain_salary)
                for c in range((i + 1) // 2 + 1)]
    return cost[-1]  # i.e., N/2 captains
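As a quick sanity check (my addition, assuming the pilots are passed as (assistant salary, captain salary) tuples, youngest first), the four pilots from the example in the previous answer give the same 16:

print(cost([(3, 4), (2, 5), (1, 6), (2, 7)]))   # 16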
Let's formulate some conditions for a recursion: if we've reached the end, return the sum; if we already have N/2 assistants, add a captain next; if we so far have the same number of assistants as captains, add an assistant next (we can't have a captain younger than his assistant); otherwise, return the minimum cost of either adding a captain or an assistant.
JavaScript code:
var x = [4,5,6,7];
var y = [3,2,1,2];
var n = x.length;

function f(i, ys, s){
    if (i == n){
        return s;
    }
    if (ys == n / 2){
        return f(i + 1, ys, s + x[i]);
    } else if (i % 2 == 0 && ys == i / 2){
        return f(i + 1, ys + 1, s + y[i]);
    } else {
        return Math.min(f(i + 1, ys + 1, s + y[i]), f(i + 1, ys, s + x[i]));
    }
}
Output:
console.log(f(0,0,0)) // 16
For a recursive approach in C++, you can follow this code for better understanding:
#include <algorithm>
using std::min;

// a[] holds assistant salaries and c[] captain salaries (both youngest pilot first),
// n is the number of pilots left, x is (#assistants - #captains) assigned so far;
// the initial call would be min_salary(a, N, 0, c).
int min_salary(int a[], int n, int x, int c[])
{
    if (n == 0)
        return 0;
    if (x == 0)        // no unmatched assistant yet: this pilot must be an assistant
        return a[0] + min_salary(a + 1, n - 1, 1, c + 1);
    if (x == n)        // all remaining pilots have to be captains
        return c[0] + min_salary(a + 1, n - 1, x - 1, c + 1);
    return min(a[0] + min_salary(a + 1, n - 1, x + 1, c + 1),
               c[0] + min_salary(a + 1, n - 1, x - 1, c + 1));
}
I want to implement an iterative algorithm which calculates a weighted average. The specific weight law does not matter, but it should be close to 1 for the newest values and close to 0 for the oldest.
The algorithm should be iterative, i.e. it should not remember all previous values. It should know only the newest value and some aggregate information about the past, like previous values of the average, sums, counts, etc.
Is it possible?
For example, the following algorithm could be used:
void iterate(double value) {
    sum *= 0.99;
    sum += value;
    count++;
    avg = sum / count;
}
It will give exponentially decreasing weights, which may not be good. Is it possible to have step-decreasing weights or something similar?
EDIT 1
The requirements for the weighting law are as follows:
1) The weight decreases into the past.
2) The law has some mean or characteristic duration, so that values older than this duration matter much less than newer ones.
3) I should be able to set this duration.
EDIT 2
I need the following. Suppose v_i are values, where v_1 is the first. Also suppose w_i are weights, but w_0 is the weight of THE LAST (newest) value.
So, after first value came I have first average
a_1 = v_1 * w_0
After the second value v_2 came, I should have average
a_2 = v_1 * w_1 + v_2 * w_0
With next value I should have
a_3 = v_1 * w_2 + v_2 * w_1 + v_3 * w_0
Note, that weight profile is moving with me, while I am moving along value sequence.
I.e. each value does not keep its own weight for all time. My goal is to have this weight become lower as the value recedes into the past.
First a bit of background. If we were keeping a normal average, it would go like this:
average(a) = 11
average(a,b) = (average(a)+b)/2
average(a,b,c) = (average(a,b)*2 + c)/3
average(a,b,c,d) = (average(a,b,c)*3 + d)/4
As you can see here, this is an "online" algorithm and we only need to keep track of two pieces of data: 1) how many numbers are in the average so far, and 2) the average itself. Then we can undivide the average by that count, add in the new number, and divide by the new count.
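A tiny sketch of that online bookkeeping (my own illustration):

def make_running_average():
    count, avg = 0, 0.0
    def add(x):
        nonlocal count, avg
        avg = (avg * count + x) / (count + 1)   # undivide, add the new number, re-divide
        count += 1
        return avg
    return add

add = make_running_average()
print([add(x) for x in [11, 5, 2]])   # [11.0, 8.0, 6.0]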
Weighted averages are a bit different. It depends on what kind of weighted average. For example if you defined:
weightedAverage(a,wa, b,wb, c,wc, ..., z,wz) = a*wa + b*wb + c*wc + ... + z*wz
or
weightedAverage(elements, weights) = elements·weights
...then you don't need to do anything besides add the new element*weight! If however you defined the weighted average akin to an expected-value from probability:
weightedAverage(elements,weights) = elements·weights / sum(weights)
...then you'd need to keep track of the total weights. Instead of undividing by the total number of elements, you undivide by the total weight, add in the new element*weight, then divide by the new total weight.
Alternatively you don't need to undivide, as demonstrated below: you can merely keep track of the temporary dot product and weight total in a closure or an object, and divide it as you yield (this can help a lot with avoiding numerical inaccuracy from compounded rounding errors).
In python this would be:
def makeAverager():
    dotProduct = 0
    totalWeight = 0
    def averager(newValue, weight):
        nonlocal dotProduct, totalWeight
        dotProduct += newValue*weight
        totalWeight += weight
        return dotProduct/totalWeight
    return averager
Demo:
>>> averager = makeAverager()
>>> [averager(value,w) for value,w in [(100,0.2), (50,0.5), (100,0.1)]]
[100.0, 64.28571428571429, 68.75]
>>> averager(10,1.1)
34.73684210526316
>>> averager(10,1.1)
25.666666666666668
>>> averager(30,2.0)
27.4
> But my task is to have average recalculated each time new value arrives having old values reweighted. –OP
Your task is almost always impossible, even with exceptionally simple weighting schemes.
You are asking to, with O(1) memory, yield averages with a changing weighting scheme. For example, {values·weights1, (values+[newValue2])·weights2, (values+[newValue2,newValue3])·weights3, ...} as new values are being passed in, for some nearly arbitrarily changing weights sequence. This is impossible due to injectivity. Once you merge the numbers in together, you lose a massive amount of information. For example, even if you had the weight vector, you could not recover the original value vector, or vice versa. There are only two cases I can think of where you could get away with this:
Constant weights such as [2,2,2,...2]: this is equivalent to an on-line averaging algorithm, which you don't want because the old values are not being "reweighted".
The relative weights of the previous values do not change. For example you could do weights of [8,4,2,1], and add in a new element with arbitrary weight like ...+[1], but you must scale all the previous weights by the same multiplicative factor, like [16,8,4,2]+[1]. Thus at each step, you are adding a new arbitrary weight, and a new arbitrary rescaling of the past, so you have 2 degrees of freedom (only 1 if you need to keep your dot-product normalized). The weight-vectors you'd get would look like:
[w0]
[w0*(s1), w1]
[w0*(s1*s2), w1*(s2), w2]
[w0*(s1*s2*s3), w1*(s2*s3), w2*(s3), w3]
...
Thus any weighting scheme you can make look like that will work (unless you need to keep the thing normalized by the sum of weights, in which case you must then divide the new average by the new sum, which you can calculate by keeping only O(1) memory). Merely multiply the previous average by the new s (which will implicitly distribute over the dot-product into the weights), and tack on the new +w*newValue.
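A minimal sketch of that "rescale the past" scheme (my own code, assuming the un-normalized dot-product form; to keep it normalized you would also track the rescaled sum of weights):

def make_rescaling_averager():
    total = 0.0
    def update(new_value, s, w):
        nonlocal total
        total = total * s + w * new_value   # s rescales every past weight, w is the new weight
        return total
    return update

upd = make_rescaling_averager()
print(upd(10, 1.0, 1.0), upd(20, 0.5, 1.0), upd(30, 0.5, 1.0))   # 10.0 25.0 42.5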
I think you are looking for something like this:
void iterate(double value) {
    count++;
    weight = max(0.0, 1.0 - count / 1000.0);
    avg = (avg * total_weight + weight * value) / (total_weight + weight);
    total_weight += weight;
}
Here I'm assuming you want the weights to sum to 1. As long as you can generate a relative weight without it changing in the future, you can end up with a solution which mimics this behavior.
That is, suppose you defined your weights as a sequence {s_0, s_1, s_2, ..., s_n, ...} and defined the input as sequence {i_0, i_1, i_2, ..., i_n}.
Consider the form: (s_0*i_0 + s_1*i_1 + s_2*i_2 + ... + s_n*i_n) / (s_0 + s_1 + s_2 + ... + s_n). Note that it is trivially possible to compute this incrementally with a couple of aggregation counters:
int counter = 0;
double numerator = 0;
double denominator = 0;

void addValue(double val)
{
    double weight = calculateWeightFromCounter(counter);
    counter++;   /* advance so the next value gets the next weight */
    numerator += weight * val;
    denominator += weight;
}

double getAverage()
{
    if (denominator == 0.0) return 0.0;
    return numerator / denominator;
}
Of course, calculateWeightFromCounter() in this case shouldn't generate weights that sum to one -- the trick here is that we average by dividing by the sum of the weights so that in the end, the weights virtually seem to sum to one.
The real trick is how you do calculateWeightFromCounter(). You could simply return the counter itself, for example, however note that the last weighted number would not be near the sum of the counters necessarily, so you may not end up with the exact properties you want. (It's hard to say since, as mentioned, you've left a fairly open problem.)
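For illustration (my own example, not the answerer's), a growing calculateWeightFromCounter such as weight = counter + 1 makes older values count relatively less and less:

def make_weighted_averager(weight_fn):
    numerator, denominator, counter = 0.0, 0.0, 0
    def add(value):
        nonlocal numerator, denominator, counter
        w = weight_fn(counter)
        counter += 1
        numerator += w * value
        denominator += w
        return numerator / denominator
    return add

avg = make_weighted_averager(lambda c: c + 1)   # weights 1, 2, 3, ...
print([round(avg(v), 3) for v in [10, 20, 30]])   # [10.0, 16.667, 23.333]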
This is too long to post in a comment, but it may be useful to know.
Suppose you have:
w_0*v_n + ... + w_n*v_0 (we'll call this w[0..n]*v[n..0] for short)
Then the next step is:
w_0*v_{n+1} + ... + w_{n+1}*v_0 (and this is w[0..n+1]*v[n+1..0] for short)
This means we need a way to calculate w[1..n+1]*v[n..0] from w[0..n]*v[n..0].
It's certainly possible that v[n..0] is 0, ..., 0, z, 0, ..., 0 where z is at some location x.
If we don't have any 'extra' storage, then f(z*w(x))=z*w(x + 1) where w(x) is the weight for location x.
Rearranging the equation, w(x + 1) = f(z*w(x))/z. Well, w(x + 1) better be constant for a constant x, so f(z*w(x))/z better be constant. Hence, f must let z propagate -- that is, f(z*w(x)) = z*f(w(x)).
But here again we have an issue. Note that if z (which could be any number) can propagate through f, then w(x) certainly can. So f(z*w(x)) = w(x)*f(z). Thus f(w(x)) = w(x)/f(z).
But for a constant x, w(x) is constant, and thus f(w(x)) better be constant, too. w(x) is constant, so f(z) better be constant so that w(x)/f(z) is constant. Thus f(w(x)) = w(x)/c where c is a constant.
So, f(x)=c*x where c is a constant when x is a weight value.
So w(x+1) = c*w(x).
That is, each weight is a multiple of the previous. Thus, the weights take the form w(x)=m*b^x.
Note that this assumes the only information f has is the last aggregated value. Note that at some point you will be reduced to this case unless you're willing to store a non-constant amount of data representing your input. You cannot represent an infinite length vector of real numbers with a real number, but you can approximate them somehow in a constant, finite amount of storage. But this would merely be an approximation.
Although I haven't rigorously proven it, it is my conclusion that what you want is impossible to do with a high degree of precision, but you may be able to use log(n) space (which may as well be O(1) for many practical applications) to generate a quality approximation. You may be able to use even less.
I tried to code something practical (in Java). As has been said, your goal is not exactly achievable. You can only compute the average from some number of last remembered values. If you don't need to be exact, you can approximate the older values. I tried to do it by remembering the last 5 values exactly and keeping older values only SUMmed in groups of 5, remembering the last 5 SUMs. Then the complexity is O(2n) storage for remembering the last n + n*n values. This is a very rough approximation.
You can modify the "lastValues" and "lastAggregatedSums" array sizes as you want. See this ASCII-art picture trying to display a graph of the last values, showing that the first columns (older data) are remembered as aggregated values (not individually), and only the latest 5 values are remembered individually.
values:
#####
##### ##### #
##### ##### ##### # #
##### ##### ##### ##### ## ##
##### ##### ##### ##### ##### #####
time: --->
Challenge 1: My example doesn't take weights into account, but I think it shouldn't be a problem for you to add weights to the "lastAggregatedSums" appropriately - the only problem is that if you want lower weights for older values, it would be harder, because the array is rotating, so it is not straightforward to know which weight belongs to which array member. Maybe you can modify the algorithm to always "shift" values in the array instead of rotating? Then adding weights shouldn't be a problem.
Challenge 2: The arrays are initialized with 0 values, and those values count toward the average from the beginning, even before we have received enough values. If you are running the algorithm for a long time, you probably don't mind that it is learning for some time at the beginning. If you do, you can post a modification ;-)
public class AverageCounter {
    private float[] lastValues = new float[5];
    private float[] lastAggregatedSums = new float[5];
    private int valIdx = 0;
    private int aggValIdx = 0;
    private float avg;

    public void add(float value) {
        lastValues[valIdx++] = value;
        if(valIdx == lastValues.length) {
            // count average of last values and save into the aggregated array.
            float sum = 0;
            for(float v: lastValues) {sum += v;}
            lastAggregatedSums[aggValIdx++] = sum;
            if(aggValIdx >= lastAggregatedSums.length) {
                // rotate aggregated values index
                aggValIdx = 0;
            }
            valIdx = 0;
        }
        float sum = 0;
        for(float v: lastValues) {sum += v;}
        for(float v: lastAggregatedSums) {sum += v;}
        avg = sum / (lastValues.length + lastAggregatedSums.length * lastValues.length);
    }

    public float getAvg() {
        return avg;
    }
}
You can combine (as a weighted sum) exponential means with different effective window sizes (N) in order to get the desired weights.
Use more exponential means to define your weight profile in more detail.
(More exponential means also means storing and calculating more values, so that is the trade-off.)
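A rough sketch of that idea (mine, with made-up decay factors): keep a few exponential moving averages with different smoothing factors and blend them, so the effective weight profile is the corresponding mix of exponentials.

class BlendedEMA:
    def __init__(self, alphas, mix):
        self.alphas = alphas          # per-EMA smoothing factors (larger = shorter memory)
        self.mix = mix                # contribution of each EMA to the blended output
        self.emas = [None] * len(alphas)
    def update(self, value):
        for i, a in enumerate(self.alphas):
            prev = self.emas[i]
            self.emas[i] = value if prev is None else (1 - a) * prev + a * value
        return sum(m * e for m, e in zip(self.mix, self.emas))

ema = BlendedEMA(alphas=[0.5, 0.05], mix=[0.3, 0.7])
print([round(ema.update(v), 3) for v in [1, 2, 3, 4, 5]])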
A memoryless solution is to calculate the new average from a weighted combination of the previous average and the new value:
average = (1 - P) * average + P * value
where P is an empirical constant, 0 <= P <= 1
expanding gives:
average = sum i (weight[i] * value[i])
where value[0] is the newest value, and
weight[i] = P * (1 - P) ^ i
When P is low, historical values are given higher weighting.
The closer P gets to 1, the more quickly it converges to newer values.
When P = 1, it's a regular assignment and ignores previous values.
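A quick numerical check of that expansion (my own, with arbitrary numbers): the recursive update and the explicit weight[i] = P * (1 - P)^i sum agree.

P = 0.25
values = [10.0, 20.0, 5.0, 8.0]                    # oldest first
avg = 0.0
for v in values:
    avg = (1 - P) * avg + P * v                    # the recursive form
newest_first = values[::-1]
expanded = sum(P * (1 - P) ** i * v for i, v in enumerate(newest_first))
print(avg, expanded)                               # both 6.8046875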
If you want to maximise the contribution of value[N], maximize
weight[N] = P * (1 - P) ^ N
where 0 <= P <= 1
I discovered weight[N] is maximized when
P = 1 / (N + 1)
(this also follows from setting the derivative of P * (1 - P)^N, namely (1 - P)^(N-1) * (1 - (N + 1)*P), to zero).
I came across this question
ADZEN is a very popular advertising firm in your city. In every road
you can see their advertising billboards. Recently they are facing a
serious challenge , MG Road the most used and beautiful road in your
city has been almost filled by the billboards and this is having a
negative effect on
the natural view.
On people's demand ADZEN has decided to remove some of the billboards
in such a way that there are no more than K billboards standing together
in any part of the road.
You may assume the MG Road to be a straight line with N billboards. Initially there is no gap between any two adjacent
billboards.
ADZEN's primary income comes from these billboards, so the billboard removal process has to be done in such a way that the billboards remaining at the end give the maximum possible profit among all possible final configurations. The total profit of a configuration is the sum of the profit values of all billboards present in that configuration.
Given N,K and the profit value of each of the N billboards, output the maximum profit that can be obtained from the remaining
billboards under the conditions given.
Input description
The 1st line contains two space-separated integers N and K. Then follow N lines describing the profit value of each billboard, i.e. the ith line contains the profit value of the ith billboard.
Sample Input
6 2
1
2
3
1
6
10
Sample Output
21
Explanation
In given input there are 6 billboards and after the process no more than 2 should be together. So remove 1st and 4th
billboards giving a configuration _ 2 3 _ 6 10 having a profit of 21.
No other configuration has a profit more than 21. So the answer is 21.
Constraints
1 <= N <= 100,000 (10^5)
1 <= K <= N
0 <= profit value of any billboard <= 2,000,000,000 (2*10^9)
I think that we have to select the minimum-profit billboard among the first K+1 billboards and then repeat the same until the last, but this was not giving the correct answer for all cases.
I tried to the best of my knowledge, but was unable to find a solution.
If anyone has an idea, please kindly share your thoughts.
It's a typical DP problem. Let's say that P(n,k) is the maximum profit of having k billboards up to position n on the road. Then you have the following formula:
P(n,k) = max(P(n-1,k), P(n-1,k-1) + C(n))
P(i,0) = 0 for i = 0..n
where C(n) is the profit from putting the nth billboard on the road. Using that formula to calculate P(n, k) bottom up you'll get the solution in O(nk) time.
I'll leave up to you to figure out why that formula holds.
edit
Dang, I misread the question.
It still is a DP problem, just the formula is different. Let's say that P(v,i) means the maximum profit at point v where the last cluster of billboards has size i.
Then P(v,i) can be described using following formulas:
P(v,i) = P(v-1,i-1) + C(v) if i > 0
P(v,0) = max(P(v-1,i) for i = 0..min(k, v))
P(0,0) = 0
You need to find max(P(n,i) for i = 0..k).
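Here is a small Python sketch of that recurrence (my own code, keeping only two rows of the table; c is the 0-based list of profits and i is the size of the last contiguous block of kept billboards):

def max_profit(c, k):
    NEG = float('-inf')
    P = [0] + [NEG] * k                       # position 0: only the empty state exists
    for v in range(1, len(c) + 1):
        new = [NEG] * (k + 1)
        new[0] = max(P)                       # billboard v removed: best over any last-block size
        for i in range(1, min(k, v) + 1):
            if P[i - 1] > NEG:
                new[i] = P[i - 1] + c[v - 1]  # keep billboard v, block grows to size i
        P = new
    return max(P)

print(max_profit([1, 2, 3, 1, 6, 10], 2))     # 21, matching the sample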
This problem is one of the challenges posted in www.interviewstreet.com ...
I'm happy to say I got this down recently, but not quite satisfied and wanted to see if there's a better method out there.
soulcheck's DP solution above is straightforward, but won't be able to solve this completely due to the fact that K can be as big as N, meaning the DP complexity will be O(NK) for both runtime and space.
Another solution is to do branch-and-bound, keeping track of the best sum so far and pruning the recursion: if at some level currSumSoFar + SUM(a[currIndex..n)) <= bestSumSoFar, then exit the function immediately, since there is no point processing further when the upper bound won't beat the best sum so far.
The branch-and-bound above got accepted by the tester for all but 2 test-cases.
Fortunately, I noticed that the 2 test-cases are using small K (in my case, K < 300), so the DP technique of O(NK) suffices.
soulcheck's (second) DP solution is correct in principle. There are two improvements you can make using these observations:
1) It is unnecessary to allocate the entire DP table. You only ever look at two rows at a time.
2) For each row (the v in P(v, i)), you are only interested in the i's which most increase the max value, which is one more than each i that held the max value in the previous row. Also include i = 1, otherwise you never consider blanks.
I coded it in C++ using DP in O(n log k).
The idea is to maintain a multiset with the next k values for a given position. This multiset will typically hold k values in mid-processing. Each time you move, you remove one element and push a new one. The art is in how to maintain this list so that it holds profit[i] + answer[i+2]. More details in the comment block below:
/*
* Observation 1: ith state depends on next k states i+2....i+2+k
* We maximize across this states added on them "accumulative" sum
*
* Let's say we have the list of numbers for state i+1, that is, the list of {profit + state solution}. How do we get the list for the ith state?
*
* Say we have following data k = 3
*
* Indices: 0 1 2 3 4
* Profits: 1 3 2 4 2
* Solution: ? ? 5 3 1
*
* Answer for [1] = max(3+3, 5+1, 9+0) = 9
*
* Indices: 0 1 2 3 4
* Profits: 1 3 2 4 2
* Solution: ? 9 5 3 1
*
* Let's find answer for [0], using set of [1].
*
* First, last entry should be removed. then we have (3+3, 5+1)
*
* Now we should add 1+5, but entries should be incremented with 1
* (1+5, 4+3, 6+1) -> then find max.
*
* Could we do it another way, instead of reprocessing the list? Yes, we simply add 1 to all elements
*
* answer is same as: 1 + max(1-1+5, 3+3, 5+1)
*
*/
// Assumed shorthand (typical competitive-programming macros, not shown in the post):
//   typedef long long ll;
//   #define lpd(i, a, b) for (int i = (a); i >= (b); --i)   // loop i downwards from a to b
//   #define sz(x) ((int)(x).size())
//   profit[], mem[], added[] are ll arrays; n, k are the input sizes.
ll dp()
{
    multiset<ll, greater<ll> > set;
    mem[n-1] = profit[n-1];
    ll sumSoFar = 0;
    lpd(i, n-2, 0)
    {
        if(sz(set) == k)
            set.erase(set.find(added[i+k]));
        if(i+2 < n)
        {
            added[i] = mem[i+2] - sumSoFar;
            set.insert(added[i]);
            sumSoFar += profit[i];
        }
        if(n-i <= k)
            mem[i] = profit[i] + mem[i+1];
        else
            mem[i] = max(mem[i+1], *set.begin() + sumSoFar);
    }
    return mem[0];
}
This looks like a linear programming problem. This problem would be linear, but for the requirement that no more than K adjacent billboards may remain.
See wikipedia for a general treatment: http://en.wikipedia.org/wiki/Linear_programming
Visit your university library to find a good textbook on the subject.
There are many, many libraries to assist with linear programming, so I suggest you do not attempt to code an algorithm from scratch. Here is a list relevant to Python: http://wiki.python.org/moin/NumericAndScientific/Libraries
Let P[i] (where i=1..n) be the maximum profit for billboards 1..i IF WE REMOVE billboard i. It is trivial to calculate the answer knowing all P[i]. The baseline algorithm for calculating P[i] is as follows:
for i = 1..N
{
    P[i] = -infinity;
    for j = max(1, i-k-1) .. i-1
    {
        P[i] = max( P[i], P[j] + C[j+1] + ... + C[i-1] );
    }
}
Now the idea that allows us to speed things up. Let's say we have two different valid configurations of billboards 1 through i only, let's call these configurations X1 and X2. If billboard i is removed in configuration X1 and profit(X1) >= profit(X2) then we should always prefer configuration X1 for billboards 1..i (by profit() I meant the profit from billboards 1..i only, regardless of configuration for i+1..n). This is as important as it is obvious.
We introduce a doubly-linked list of tuples {idx,d}: {{idx1,d1}, {idx2,d2}, ..., {idxN,dN}}.
p->idx is index of the last billboard removed. p->idx is increasing as we go through the list: p->idx < p->next->idx
p->d is the sum of elements (C[p->idx]+C[p->idx+1]+..+C[p->next->idx-1]) if p is not the last element in the list. Otherwise it is the sum of elements up to the current position minus one: (C[p->idx]+C[p->idx+1]+..+C[i-1]).
Here is the algorithm:
P[1] = 0;
list.AddToEnd( {idx=0, d=C[0]} );
// sum of elements starting from the index at the top of the list
sum = C[0]; // C[list->begin()->idx]+C[list->begin()->idx+1]+...+C[i-1]
for i = 2..N
{
    if( i - list->begin()->idx > k + 1 ) // the head of the list is "too far"
    {
        sum = sum - list->begin()->d
        list.RemoveNodeFromBeginning()
    }
    // At this point the list should contain at least the element
    // added on the previous iteration. Calculating P[i].
    P[i] = P[list.begin()->idx] + sum
    // Updating list.end()->d and removing "unnecessary nodes"
    // based on the criterion described above
    list.end()->d = list.end()->d + C[i]
    while(
        (list is not empty) AND
        (P[i] >= P[list.end()->idx] + list.end()->d - C[list.end()->idx]) )
    {
        if( list.size() > 1 )
        {
            list.end()->prev->d += list.end()->d
        }
        list.RemoveNodeFromEnd();
    }
    list.AddToEnd( {idx=i, d=C[i]} );
    sum = sum + C[i]
}
//shivi..coding is adictive!!
#include<stdio.h>

long long int arr[100001];
long long int sum[100001];
long long int including[100001], excluding[100001];

long long int maxim(long long int a, long long int b)
{ if(a > b) return a; return b; }

int main()
{
    int N, K;
    scanf("%d%d", &N, &K);
    for(int i = 0; i < N; ++i) scanf("%lld", &arr[i]);
    sum[0] = arr[0];
    including[0] = sum[0];
    excluding[0] = sum[0];
    for(int i = 1; i < K; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        including[i] = sum[i];
        excluding[i] = sum[i];
    }
    long long int maxi = 0, temp = 0;
    for(int i = K; i < N; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        for(int j = 1; j <= K; ++j)
        {
            temp = sum[i] - sum[i-j];
            if(i-j-1 >= 0)
                temp += including[i-j-1];
            if(temp > maxi) maxi = temp;
        }
        including[i] = maxi;
        excluding[i] = including[i-1];
    }
    printf("%lld", maxim(including[N-1], excluding[N-1]));
}
//here is the code...passing all but 1 test case :) comment improvements...simple DP
This is an interview question that a friend of mine got and I'm unable to come up with how to solve it.
Question:
You are given an array of n buttons that are either red or blue. There are k containers present. The value of a container is given by the product of the red buttons and blue buttons present in it. The problem is to put the buttons into the containers such that the sum of the values of all the containers is minimal. Additionally, every container must contain at least one button, and the buttons must be put in the order they are given.
For example, the very first button can only go to the first container, the second one can go to either the first or the second but not the third (otherwise the second container won't have any buttons).
k will be less than or equal to n.
I think there must be a dynamic programming solution for this.
How do you solve this ?
So far, I've only got the trivial cases where
if (n==k), the answer would be zero because you could just put one in each container making the value of each container zero, therefore the sum would be zero.
if (k==1), you just dump all of them and calculate the product.
if only one color is present, the answer would be zero.
Edit:
I'll give an example.
n = 4 and k = 2
Input: R B R R
The first container gets the first two (R and B) making its value 1 (1R X 1B)
The second container gets the remaining (R and R) making its value 0 (2R x 0B)
The answer is 1 + 0 = 1
if k=3,
the first container would have only the first button (R)
the second container would have only the second one (B)
the third one would have the last two buttons (R and R)
Each of the containers would have value 0 and hence sum and answer would be 0.
Hope this clears up the doubts.
Possible DP solution:
Let dp[i, j] = the minimum value achievable if we put the first i buttons into j containers.
dp[i, j] = min{dp[p, j - 1] + numRed[p+1, i]*numBlues[p+1, i]}, p = 1 to i - 1
Answer will be in dp[n, k].
int blue = 0, red = 0;
for (int i = 1; i <= n; ++i)
{
    if (buttons[i] == 1)
        ++red;
    else
        ++blue;
    dp[i][1] = red * blue;
}
for (int i = 2; i <= n; ++i)
    for (int j = 2; j <= k; ++j)
    {
        dp[i][j] = inf;
        for (int p = 1; p <= i; ++p)
            dp[i][j] = min(dp[p][j - 1] + getProd(p + 1, i), dp[i][j]);
    }
return dp[n][k];
Complexity will be O(n^3*k), but it's possible to reduce it to O(n^2*k) by making getProd run in O(1) with the help of certain precomputations (hint: use dp[i][1]). I'll post it tomorrow if no one figures out that this is actually wrong before then.
It might also be possible to reduce to O(n*k), but that will probably require a different approach...
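One way to get an O(1) getProd (not necessarily the dp[i][1] trick the answer hints at) is a prefix count of one colour; a sketch of my own, with buttons given as 'r'/'b' characters and 1-based inclusive ranges:

def make_get_prod(buttons):
    red_prefix = [0]
    for b in buttons:
        red_prefix.append(red_prefix[-1] + (b == 'r'))
    def get_prod(lo, hi):
        reds = red_prefix[hi] - red_prefix[lo - 1]
        blues = (hi - lo + 1) - reds
        return reds * blues                 # value of a container holding buttons lo..hi
    return get_prod

get_prod = make_get_prod(['r', 'b', 'r', 'r'])
print(get_prod(1, 2), get_prod(3, 4))       # 1 0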
If I understand the question correctly, as long as every container has at least one button in it, you can choose any container to put the remaining buttons in. Given that, put one button in every container, making sure that there is at least one container with a red button and at least one with a blue button. Then with the remaining buttons, put all the red buttons in a container with a red button and put all the blue buttons in a container with blue buttons in it. This will make it so every container has at least one button and every container has only one color of buttons. Then every container's score is 0. Thus the sum is 0 and you have minimized the combined score.
Warning: Proven to be non-optimal
How about a greedy algorithm to get people talking? I'm not going to try to prove it's optimal at this point, but it's a way of approaching the problem.
In this solution, we use G to denote the number of contiguous regions of one colour in the sequence of buttons. Say we had (I'm using x for red and o for blue since R and B look too similar):
x x x o x o o o x x o
This would give G = 6. Let's split this into groups (red/blue) where, to start with, each group gets an entire region of a consistent colour:
3/0 0/1 1/0 0/3 2/0 0/1 //total value: 0
When G <= k, you have a minimum of zero since each grouping can go into its own container. Now assume G > k. Our greedy algorithm will be: while there are more groups than containers, collapse the two adjacent groups whose merge results in the least container-value delta (valueOf(merged(a, b)) - valueOf(a) - valueOf(b)). Say k = 5 with our example above. Our choices are:
Collapse 1,2: delta = (3 - 0 - 0) = 3
2,3: delta = 1
3,4: delta = 3
4,5: delta = 6
5,6: delta = 2
So we collapse 2 and 3:
3/0 1/1 0/3 2/0 0/1 //total value: 1
And k = 4:
Collapse 1,2: delta = (4 - 0 - 1) = 3
2,3: delta = (4 - 1 - 0) = 3
3,4: delta = (6 - 0 - 0) = 6
4,5: delta = 2
3/0 1/1 0/3 2/1 //total value: 3
k = 3
4/1 0/3 2/1 //total value: 6
k = 2
4/1 2/4 //total value: 12
k = 1
6/5 //total value: 30
It seems optimal for this case, but I was just intending to get people talking about a solution. Note that the starting assignment of buttons to containers was a shortcut: you could instead start with each button in the sequence in its own bucket and then reduce, but you would always arrive at the point where each container has the maximum number of buttons of one colour.
Counterexample: Thanks to Jules Olléon for providing a counter-example that I was too lazy to think of:
o o o x x o x o o x x x
If k = 2, the optimal mapping is
2/4 4/2 //total value: 16
Let's see how the greedy algorithm approaches it:
0/3 2/0 0/1 1/0 0/2 3/0 //total value: 0
0/3 2/0 1/1 0/2 3/0 //total value: 1
0/3 3/1 0/2 3/0 //total value: 3
0/3 3/1 3/2 //total value: 9
3/4 3/2 //total value: 18
I'll leave this answer up since it's accomplished its only purpose of getting people talking about a solution. I wonder if the greedy heuristic could be used in an informed search algorithm such as A* to improve the runtime of an exhaustive search, but that would not achieve polynomial runtime.
I always ask for clarifications of the problem statement in an interview. Imagine that you never put blue and red buttons together. Then the sum is 0, just like n==k. So, for all cases where k > 1, the minimum is 0.
Here is what I understand so far: The algorithm is to process a sequence of values {R,B}.
It may choose to put the value in the current container or the next, if there is a next.
I first would ask a couple of questions to clarify the things I don't know yet:
Are k and n known to the algorithm in advance? I assume so.
Do we know the full sequence of buttons in advance?
If we don't know the sequence in advance, should the average value be minimized? Or the maximum (the worst case)?
Idea for a proof for the algorithm by Mark Peters
Edit: Idea for a proof (sorry, couldn't fit it in a comment)
Let L(i) be the length of the ith group. Let d(i) be the diff you get by collapsing container i and i+1 => d(i) = L(i)*L(i+1).
We can define a distribution by the sequence of containers collapsed. As index we use the maximum index of the original containers contained in the collapsed container containing the containers with the smaller indexes.
A given sequence of collapses I = [i(1), .. i(m)] results in a value which has a lower bound equal to the sum of d(i(m)) for all m from 1 to n-k.
We need to prove that there can't be a sequence other than the one created by the algorithm with a smaller diff. So let the sequence above be the one resulting from the algorithm. Let J = [j(1), .. j(m)].
Here it gets skimpy:
I think it should be possible to prove that the lower limit of J is larger than the actual value of I, because in each step we choose by construction the collapse operation from I, so it must be smaller than the matching collapse from the alternate sequence.
I think we might assume that the sequences are disjoint, but I'm not completely sure about it.
Here is a brute force algorithm written in Python which seems to work.
from itertools import combinations

def get_best_order(n, k):
    slices = combinations(range(1, len(n)), k-1)
    container_slices = ([0] + list(s) + [len(n)] for s in slices)
    min_value = -1
    best = None

    def get_value(slices, n):
        value = 0
        for i in range(1, len(slices)):
            start, end = slices[i-1], slices[i]
            num_red = len([b for b in n[start:end] if b == 'r'])
            value += num_red * (end - start - num_red)
        return value

    for slices in container_slices:
        value = get_value(slices, n)
        if value < min_value or min_value == -1:
            min_value = value
            best = slices
    return [n[best[i-1]:best[i]] for i in range(1, len(best))]

n = ['b', 'r', 'b', 'r', 'r', 'r', 'b', 'b', 'r']
k = 4
print(get_best_order(n, k))
# [['b', 'r', 'b'], ['r', 'r', 'r'], ['b', 'b'], ['r']]
Basically the algorithm works like this:
Generate a list of every possible arrangement (items stay in order, so this is just a number of items per container)
Calculate the value for that arrangement as described by the OP
If that value is less than the current best value, save it
Return the arrangement that has the lowest value