I have a website where users submit questions (zero, one, or multiple per day), vote on them, and answer one question per day (more details here). A user can see a question only once, whether by submitting, voting on, or answering it.
I have a pool of questions that players have already seen, and I need to remove 30 questions from the pool each month. I need to pick the questions to remove in such a way that I maximize the number of available questions left in the pool for the player with the fewest available questions.
Example with a pool of 5 questions (where 3 need to be removed):
player A has seen questions 1, 3 and 5
player B has seen questions 1 and 4
player C has seen questions 2 and 4
I thought about removing the questions that the top player has seen, but the rankings would change after the removal. Following the above example, player A has only 2 questions left to play (2 and 4). However, if I remove 1, 3 and 5, the situation would be:
player A can play questions 2 and 4
player B can play question 2
player C cannot play anything, because 1, 3 and 5 are removed and he has already seen 2 and 4.
The score for this solution is zero, i.e. the player with the fewest available questions has zero questions left to play.
In this case it would be better to remove 1, 3 and 4, giving:
player A can play question 2
player B can play questions 2 and 5
player C can play question 5
The score for this solution is one, because the two players with the fewest available questions each have one question left to play.
If the data size were small, I would be able to brute-force the solution. However, I have hundreds of players and questions, so I'm looking for an algorithm to solve this.
Let's suppose that you have a general efficient algorithm for this, and concentrate on the questions left rather than the questions removed.
You could use such an algorithm to solve the problem: can you choose at most T questions such that every user has at least one question to answer? I think this is set cover (http://en.wikipedia.org/wiki/Set_cover), and I think solving your problem in general allows you to solve set cover, so I think it is NP-complete.
There is at least a linear programming relaxation. Associate each question with a variable Qi in the range 0 <= Qi <= 1. Choosing questions such that each user has at least X questions available amounts to the constraint sum_j (Uij * Qj) >= X, which is linear in Qj and X, so you can maximize the objective X over the linear variables X and Qj. Unfortunately, the result need not give you integer Qj - consider, for example, the case when every possible pair of questions is associated with some user and you want each user to be able to answer at least 1 question using at most half of the questions. The optimum solution is Qi = 1/2 for all i.
(But given a linear programming relaxation you could use it as the bound in http://en.wikipedia.org/wiki/Branch_and_bound).
Alternatively you could just write down the problem and throw it at an integer linear programming package, if you have one handy.
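For concreteness, here is what the relaxation could look like in PuLP, a Python LP package (a minimal sketch; the seen matrix, keep_count and all names are my own assumptions, not anything from the question):

import pulp

def lp_relaxation_bound(seen, keep_count):
    n_users, n_questions = len(seen), len(seen[0])
    prob = pulp.LpProblem("questions", pulp.LpMaximize)
    # Relaxed variables: 0 <= Q_j <= 1 instead of Q_j in {0, 1}
    q = [pulp.LpVariable("q%d" % j, 0, 1) for j in range(n_questions)]
    x = pulp.LpVariable("x")  # the objective X
    prob += x
    for i in range(n_users):
        # user i must have at least X unseen questions among those kept
        prob += pulp.lpSum(q[j] for j in range(n_questions) if not seen[i][j]) >= x
    prob += pulp.lpSum(q) == keep_count  # keep exactly this many questions
    prob.solve()
    return pulp.value(x)  # an upper bound usable in branch and bound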
For completeness of the thread, here is a simple greedy approximation approach.
Place the solved questions in the previously discussed matrix form:
    A B C
Q0      X
Q1  X X
Q2      X
Q3    X
Q4  X   X
    2 2 3
Sort by the number of questions solved:
    C A B
Q0  X
Q1    X X
Q2  X
Q3      X
Q4  X X
    3 2 2
Strike out a question with the most Xs among the players with the most problems solved (this is guaranteed to decrease our measure, if anything is):
    C A B
Q0  =====
Q1    X X
Q2  X
Q3      X
Q4  X X
    2 2 2
Sort again:
    C A B
Q0  =====
Q1    X X
Q2  X
Q3      X
Q4  X X
    2 2 2
Strike again:
    C A B
Q0  =====
Q1  =====
Q2  X
Q3      X
Q4  X X
    2 1 1
Sort again:
    C A B
Q0  =====
Q1  =====
Q2  X
Q3      X
Q4  X X
    2 1 1
Strike again:
    C A B
Q0  =====
Q1  =====
Q2  X
Q3      X
Q4  =====
    1 0 1
It's O(n^2 log n) without optimizations, so it is plenty fast for some hundreds of questions. It's also easy to implement.
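For reference, a rough Python rendering of this greedy (my own sketch; seen[p][q] is True if player p has seen question q):

def greedy_strike(seen, strikes):
    players = range(len(seen))
    questions = set(range(len(seen[0])))
    removed = []
    for _ in range(strikes):
        # number of remaining seen questions per player
        solved = {p: sum(seen[p][q] for q in questions) for p in players}
        top = max(solved.values())
        top_players = [p for p in players if solved[p] == top]
        # strike the question with the most Xs in the top players' columns
        victim = max(questions,
                     key=lambda q: sum(seen[p][q] for p in top_players))
        questions.remove(victim)
        removed.append(victim)
    return removed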
It's not optimal, as can be seen from this counterexample with 2 strikes:
Q0 X
Q1 X
Q2 XXX
Q3 XXX
Q4 XXXX
Q5
   222222
Here the greedy approach is going to remove Q5 and then Q2 (or Q3), instead of Q2 and Q3, which would be optimal for our measure.
I propose a bunch of optimizations based on the idea that you really want to maximize the number of unseen questions for the player with the minimum number of questions, and that you do not care whether there is 1 player with that minimum or 10000 players with the same number.
Step 1: Find the player with the minimum number of unseen questions (in your example, that would be player A). Call this player p.
Step 2: Find all players within 30 of the number of questions unseen by player p. Call this set P. These are the only players who need to be considered, since removing 30 unseen questions from any other player would still leave them with more unseen questions than player p, and thus player p would still be worse off.
Step 3: Find the intersection of the sets of problems seen by players in P. You may remove every problem in this set, hopefully dropping you down from 30 to some smaller number of problems to remove, which we will call r (r <= 30).
Step 4: Find the union of the sets of problems seen by players in P; call this set U. If the size of U is <= r, you are done: remove all problems in U, and then remove the remaining problems arbitrarily from your set of all problems. Player p will lose r - size(U) questions and remain with the fewest unseen problems, but this is the best you can do.
You are now left with your original problem, but likely with vastly smaller sets.
Your problem set is U, your player set is P, and you must remove r problems.
The brute-force approach takes time (size(U) choose r) * size(P): choose each set of r problems from U and evaluate it against all players in P. If those numbers are reasonable, you can just brute-force it. A sketch of the whole reduction is below.
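Putting steps 1-4 and the brute force together in Python (a sketch under my own naming: seen_sets maps each player to the set of question ids they have seen, all_questions is the set of all ids):

from itertools import combinations

def reduce_and_solve(seen_sets, all_questions, to_remove=30):
    unseen = {p: all_questions - s for p, s in seen_sets.items()}
    worst = min(len(u) for u in unseen.values())                       # step 1
    P = [p for p, u in unseen.items() if len(u) <= worst + to_remove]  # step 2
    common = set.intersection(*(seen_sets[p] for p in P))              # step 3
    removed = set(sorted(common)[:to_remove])
    r = to_remove - len(removed)
    U = set.union(*(seen_sets[p] for p in P)) - removed                # step 4
    if len(U) <= r:
        # remove U, plus (r - len(U)) arbitrary other questions
        return removed | U
    best, best_score = None, -1
    for combo in combinations(U, r):  # brute force over U choose r
        chosen = removed | set(combo)
        score = min(len(unseen[p] - chosen) for p in P)
        if score > best_score:
            best, best_score = chosen, score
    return best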
Since your problem does appear to be NP-complete, the best you can probably hope for is an approximation. The easiest way is to set some maximum number of tries, then randomly choose and evaluate sets of problems to remove. For that, a function to pick r random elements of U becomes necessary. This can be done in time O(r) (in fact, I answered how to do this earlier today!):
Select N random elements from a List<T> in C#
You can also fold in any of the heuristics suggested by other users by weighting each problem's chance to be selected; I believe the selected answer at the link above shows how to do that.
Linear programming models.
Variant 1.
Sum(Uij * Qj) - Sum(Dij * Xj) + 0 = 0 (for each i)
0 + Sum(Dij * Xj) - Score >= 0 (for each i)
Sum(Qj) = (Number of questions - 30)
Maximize(Score)
Uij is 1 if user i has not seen question j, otherwise it is 0.
Dij is an element of the identity matrix (Dij = 1 if i = j, otherwise 0).
Xj is an auxiliary variable (one for each user).
Variant 2.
Sum(Uij * Qj) >= Score (for each i)
Sum(Qj) = (Number of questions - 30)
No objective function, just check feasibility
In this case the LP problem is simpler, but Score has to be determined by binary and linear search. Set the current range to [0 .. the least number of unseen questions for a user], set Score to the middle of the range, and apply the integer LP algorithm (with a small time limit). If no solution is found, set the range to [begin .. Score]; otherwise set it to [Score .. end] and continue the binary search.
(Optionally) use binary search to determine an upper bound for the exact solution's Score. Then, starting from the best Score found by binary search, apply the integer LP algorithm with Score increased by 1, 2, ... (limiting computation time as necessary). At the end, you get either an exact solution or some good approximation.
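The binary-search driver itself is only a few lines in any language; in Python, where feasible(mid) stands for one time-limited integer-LP run with the Score constant fixed (a hypothetical hook of mine, not a real GLPK call):

def best_score(max_unseen):
    lo, hi = 0, max_unseen   # [0 .. the least number of unseen questions]
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if feasible(mid):    # integer LP run with a small time limit
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best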
Here is sample code in C for GNU GLPK (for variant 1):
#include <stdio.h>
#include <stdlib.h>
#include <glpk.h>

int main(void)
{
    int ind[3000];
    double val[3000];
    int row;
    int col;
    glp_prob *lp;

    // Parameters
    int users = 120;
    int questions = 10000;
    int questions2 = questions - 30;
    int time = 30; // sec.

    // Create GLPK problem
    lp = glp_create_prob();
    glp_set_prob_name(lp, "questions");
    glp_set_obj_dir(lp, GLP_MAX);

    // Configure rows
    glp_add_rows(lp, users*2 + 1);
    for (row = 1; row <= users; ++row)
    {
        glp_set_row_bnds(lp, row, GLP_FX, 0.0, 0.0);
        glp_set_row_bnds(lp, row + users, GLP_LO, 0.0, 0.0);
    }
    glp_set_row_bnds(lp, users*2 + 1, GLP_FX, questions2, questions2);

    // Configure columns
    glp_add_cols(lp, questions + users + 1);
    for (col = 1; col <= questions; ++col)
    {
        glp_set_obj_coef(lp, col, 0.0);
        glp_set_col_kind(lp, col, GLP_BV);
    }
    for (col = 1; col <= users; ++col)
    {
        glp_set_obj_coef(lp, questions + col, 0.0);
        glp_set_col_kind(lp, questions + col, GLP_IV);
        glp_set_col_bnds(lp, questions + col, GLP_FR, 0.0, 0.0);
    }
    glp_set_obj_coef(lp, questions + users + 1, 1.0);
    glp_set_col_kind(lp, questions + users + 1, GLP_IV);
    glp_set_col_bnds(lp, questions + users + 1, GLP_FR, 0.0, 0.0);

    // Configure matrix (question columns)
    for (col = 1; col <= questions; ++col)
    {
        for (row = 1; row <= users*2; ++row)
        {
            ind[row] = row;
            val[row] = ((row <= users) && (rand() % 2))? 1.0: 0.0;
        }
        ind[users*2 + 1] = users*2 + 1;
        val[users*2 + 1] = 1.0;
        glp_set_mat_col(lp, col, users*2 + 1, ind, val);
    }

    // Configure matrix (user columns)
    for (col = 1; col <= users; ++col)
    {
        for (row = 1; row <= users*2; ++row)
        {
            ind[row] = row;
            val[row] = (row == col)? -1.0: ((row == col + users)? 1.0: 0.0);
        }
        ind[users*2 + 1] = users*2 + 1;
        val[users*2 + 1] = 0.0;
        glp_set_mat_col(lp, questions + col, users*2 + 1, ind, val);
    }

    // Configure matrix (score column)
    for (row = 1; row <= users*2; ++row)
    {
        ind[row] = row;
        val[row] = (row > users)? -1.0: 0.0;
    }
    ind[users*2 + 1] = users*2 + 1;
    val[users*2 + 1] = 0.0;
    glp_set_mat_col(lp, questions + users + 1, users*2 + 1, ind, val);

    // Solve integer GLPK problem
    glp_iocp param;
    glp_init_iocp(&param);
    param.presolve = GLP_ON;
    param.tm_lim = time * 1000;
    glp_intopt(lp, &param);
    printf("Score = %g\n", glp_mip_obj_val(lp));
    glp_delete_prob(lp);
    return 0;
}
The time limit is not working reliably in my tests; it looks like some bug in GLPK...
Sample code for variant 2 (only LP algorithm, no automatic search for Score):
#include <stdio.h>
#include <stdlib.h>
#include <glpk.h>

int main(void)
{
    int ind[3000];
    double val[3000];
    int row;
    int col;
    glp_prob *lp;

    // Parameters
    int users = 120;
    int questions = 10000;
    int questions2 = questions - 30;
    double score = 4869.0 + 7;

    // Create GLPK problem
    lp = glp_create_prob();
    glp_set_prob_name(lp, "questions");
    glp_set_obj_dir(lp, GLP_MAX);

    // Configure rows
    glp_add_rows(lp, users + 1);
    for (row = 1; row <= users; ++row)
    {
        glp_set_row_bnds(lp, row, GLP_LO, score, score);
    }
    glp_set_row_bnds(lp, users + 1, GLP_FX, questions2, questions2);

    // Configure columns
    glp_add_cols(lp, questions);
    for (col = 1; col <= questions; ++col)
    {
        glp_set_obj_coef(lp, col, 0.0);
        glp_set_col_kind(lp, col, GLP_BV);
    }

    // Configure matrix (question columns)
    for (col = 1; col <= questions; ++col)
    {
        for (row = 1; row <= users; ++row)
        {
            ind[row] = row;
            val[row] = (rand() % 2)? 1.0: 0.0;
        }
        ind[users + 1] = users + 1;
        val[users + 1] = 1.0;
        glp_set_mat_col(lp, col, users + 1, ind, val);
    }

    // Solve integer GLPK problem
    glp_iocp param;
    glp_init_iocp(&param);
    param.presolve = GLP_ON;
    glp_intopt(lp, &param);
    glp_delete_prob(lp);
    return 0;
}
It appears that variant 2 allows finding a pretty good approximation quite fast, and the approximation is better than for variant 1.
Let's say you want to delete Y questions from the pool. The simple algorithm would be to sort questions by the number of views they have had, then remove the Y most-viewed questions. For your example: 1: 2, 2: 1, 3: 1, 4: 2, 5: 1. Clearly, you are better off removing questions 1 and 4. But this algorithm doesn't achieve the goal by itself; it is, however, a good starting point. To improve it, you need to make sure that every user will end up with at least X questions after the "cleaning".
In addition to the above array (which we can call "score"), you need a second one with questions and users, where a cell holds 1 if the user has seen the question and 0 if he hasn't. Then, for every user, you need to find the X questions with the lowest score that he hasn't seen yet (the lower their score the better, since the fewer people who saw a question, the more "valuable" it is for the system overall). You combine the X questions found for every user into a third array; call it "safe", since we won't delete anything from it.
As the last step you just delete the Y most-viewed questions (the ones with the highest score) which aren't in the "safe" array.
What this algorithm also achieves is that if deleting, say, 30 questions would leave some users with fewer than X questions to view, it won't remove all 30, which is, I guess, good for the system. A sketch is below.
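In Python, the whole heuristic could be sketched like this (X, Y and the data shapes are assumptions of mine):

def questions_to_delete(seen, views, X, Y):
    # seen[u]  = set of question ids user u has seen
    # views[q] = total number of users who have seen question q
    safe = set()
    for u, seen_u in seen.items():
        unseen = [q for q in views if q not in seen_u]
        # the X least-viewed unseen questions of each user are off-limits
        safe.update(sorted(unseen, key=lambda q: views[q])[:X])
    by_views = sorted(views, key=lambda q: views[q], reverse=True)
    candidates = [q for q in by_views if q not in safe]
    return candidates[:Y]   # may return fewer than Y, by design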
Edit: A good optimization would be to track not every user, but to use some activity benchmark to filter out people who have seen only a few questions. If there are too many people who have each seen only one rare question, then nothing can be deleted; filtering out these kinds of users or improving the "safe" array functionality can solve that.
Feel free to ask questions if I didn't describe the idea in enough depth.
Have you considered viewing this in terms of a dynamic programming solution?
I think you might be able to do it by maximizing the number of available questions left open to all players, such that no single player is left with zero open questions.
The following link provides a good overview of how to construct dynamic programming solutions to this sort of problem.
Presenting this in terms of questions still playable. I'll number the questions from 0 to 4 instead of 1 to 5, as this is more convenient in programming.
          01234
          -----
player A   x x    - player A has just 2 playable questions
player B   xx x   - player B has 3 playable questions
player C  x x x   - player C has 3 playable questions
I'll first describe what might appear to be a very naive algorithm, but at the end I'll show how it can be improved significantly.
For each of the 5 questions, you'll need to decide whether to keep it or discard it. This will require a recursive function with a depth of 5.
vector<bool> keep_or_discard(5); // an array to store the five decisions

void decide_one_question(int question_id) {
    // first, pretend we keep the question
    keep_or_discard[question_id] = true;
    decide_one_question(question_id + 1); // recursively consider the next question

    // then, pretend we discard this question
    keep_or_discard[question_id] = false;
    decide_one_question(question_id + 1); // recursively consider the next question
}
decide_one_question(0); // this call starts the whole recursive search
This first attempt will fall into an infinite recursive descent and run past the end of the array. The obvious first thing we need to do is return immediately when question_id == 5 (i.e. when all questions 0 to 4 have been decided). We add this code to the beginning of decide_one_question:
void decide_one_question(int question_id) {
    if(question_id == 5) {
        // no more decisions needed.
        return;
    }
    // ....
Next, we know how many questions we are allowed to keep. Call this allowed_to_keep. This is 5-3 in this case, meaning we are to keep exactly two questions. You might set this as a global variable somewhere.
int allowed_to_keep; // set this to 2
Now, we must add further checks to the beginning of decide_one_question, and add another parameter:
void decide_one_question(int question_id, int questions_kept_so_far) {
    if(question_id == 5) {
        // no more decisions needed.
        return;
    }
    if(questions_kept_so_far > allowed_to_keep) {
        // not allowed to keep this many, just return immediately
        return;
    }
    int questions_left_to_consider = 5 - question_id; // how many not yet considered
    if(questions_kept_so_far + questions_left_to_consider < allowed_to_keep) {
        // even if we keep all the rest, we'll fall short
        // may as well return. (This is an optional extra)
        return;
    }

    keep_or_discard[question_id] = true;
    decide_one_question(question_id + 1, questions_kept_so_far + 1);
    keep_or_discard[question_id] = false;
    decide_one_question(question_id + 1, questions_kept_so_far);
}
decide_one_question(0,0);
(Notice the general pattern here: we allow the recursive function call to go one level 'too deep'. I find it easier to check for 'invalid' states at the start of the function than to attempt to avoid making invalid function calls in the first place.)
So far, this looks quite naive. This is checking every single combination. Bear with me!
We need to start keeping track of the score, in order to remember the best (and in preparation for a later optimization). The first thing would be to write a function calculate_score, and to have a global called best_score_so_far. Our goal is to maximize it, so it should be initialized to -1 at the start of the algorithm.
int best_score_so_far; // initialize to -1 at the start

void decide_one_question(int question_id, int questions_kept_so_far) {
    if(question_id == 5) {
        int score = calculate_score();
        if(score > best_score_so_far) {
            // Great!
            best_score_so_far = score;
            store_this_good_set_of_answers();
        }
        return;
    }
    // ...
Next, it would be better to keep track of how the score is changing as we recurse through the levels. Let's start off by being optimistic: let's pretend we can keep every question, calculate the resulting score, and call it upper_bound_on_the_score. A copy of this will be passed into the function every time it calls itself recursively, and it will be updated locally every time a decision is made to discard a question.
void decide_one_question(int question_id
                       , int questions_kept_so_far
                       , int upper_bound_on_the_score) {
    ... the checks we've already detailed above

    keep_or_discard[question_id] = true;
    decide_one_question(question_id + 1
                      , questions_kept_so_far + 1
                      , upper_bound_on_the_score
                      );
    keep_or_discard[question_id] = false;
    decide_one_question(question_id + 1
                      , questions_kept_so_far
                      , calculate_the_new_upper_bound()
                      );
See, near the end of that last code snippet, that a new (smaller) upper bound has been calculated, based on the decision to discard question question_id.
At each level in the recursion, this upper bound will be getting smaller. Each recursive call either keeps the question (making no change to this optimistic bound), or decides to discard one question (leading to a smaller bound in this part of the recursive search).
The optimization
Now that we know an upper bound, we can have the following check at the very start of the function, regardless of how many questions have been decided at this point:
void decide_one_question(int question_id
                       , int questions_kept_so_far
                       , int upper_bound_on_the_score) {
    if(upper_bound_on_the_score < best_score_so_far) {
        // the upper bound is already too low,
        // therefore, this is a dead end.
        return;
    }
    if(question_id == 5) // .. continue with the rest of the function.
This check ensures that once a 'reasonable' solution has been found, the algorithm will quickly abandon all the 'dead end' searches. It will then (hopefully) quickly find better and better solutions, and it can then be even more aggressive in pruning dead branches. I have found that this approach works quite nicely for me in practice.
If it doesn't work, there are many avenues for further optimization. I won't try to list them all, and you could certainly try entirely different approaches. But I have found this to work on the rare occasions when I have to do some sort of search like this.
Here's an integer program. Let constant unseen(i, j) be 1 if player i has not seen question j and 0 otherwise. Let variable kept(j) be 1 if question j is to be kept and 0 otherwise. Let variable score be the objective.
maximize score                                      # score is your objective
subject to
    for all i, score <= sum_j (unseen(i, j) * kept(j))
                                                    # score is at most the number of
                                                    # questions available to player i
    sum_j (1 - kept(j)) = 30                        # remove exactly 30 questions
    for all j, kept(j) in {0, 1}                    # each question is kept
                                                    # or not kept (binary)

(score has no preset bound; the optimal solution chooses score to be the
minimum over all players of the number of questions available to that player)
If there are too many options to brute-force and there are likely many solutions that are near-optimal (which sounds to be the case), consider Monte Carlo methods.
You have a clearly defined fitness function, so just make some random assignments and score the result. Rinse and repeat until you run out of time or some other stopping criterion is met.
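A minimal sketch of that loop in Python (my names; score is the fitness function, questions a list of ids, r the number to remove):

import random

def monte_carlo(questions, r, score, tries=100000):
    best, best_score = None, float("-inf")
    for _ in range(tries):
        candidate = random.sample(questions, r)   # one random assignment
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best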
The question first seems easy, but after thinking deeper you realize the hardness.
The simplest option would be to remove the questions that have been seen by the maximum number of users. But this does not take the number of remaining questions for each user into consideration: after the removal, too few questions may be left for some users.
A more complex solution would be to compute the number of remaining questions for each user after deleting a question. You need to compute it for every question and every user, which may be time-consuming if you have many users and questions. Then you sum up the number of questions left over all users, and select for deletion the question with the highest sum.
I think it would be wise to limit the number of remaining questions counted for a user to a reasonable value. You can think "OK, this user has enough questions to view if he has more than X questions." You need this because after deleting a question, only 15 questions may be left for an active user while 500 may be left for a rarely-visiting user. It's not fair to sum 15 and 500; you can instead define a threshold value of, say, 100.
To make it easier to compute, you can also consider only the users who have viewed more than X questions.
Related
I just got the following interview question:
Given a list of float numbers, insert “+”, “-”, “*” or “/” between each consecutive pair of numbers to find the maximum value you can get. For simplicity, assume that all operators are of equal precedence order and evaluation happens from left to right.
Example:
(1, 12, 3) -> 1 + 12 * 3 = 39
If we build a recursive solution, we find that it is O(4^N). I tried to find overlapping subproblems (to increase the efficiency of this algorithm) but wasn't able to find any.
How can we detect when there are overlapping subproblems and when there aren't? I spent a lot of time trying to "force" subsolutions to appear, and eventually the interviewer told me that there weren't any.
My current solution looks as follows:
def maximumNumber(array, current_value=None):
    if current_value is None:
        current_value = array[0]
        array = array[1:]
    if len(array) == 0:
        return current_value
    return max(
        maximumNumber(array[1:], current_value * array[0]),
        maximumNumber(array[1:], current_value - array[0]),
        maximumNumber(array[1:], current_value / array[0]),
        maximumNumber(array[1:], current_value + array[0])
    )
Looking for "overlapping subproblems" sounds like you're trying to do bottom up dynamic programming. Don't bother with that in an interview. Write the obvious recursive solution. Then memoize. That's the top down approach. It is a lot easier to get working.
You may get challenged on that. Here was my response the last time that I was asked about that.
There are two approaches to dynamic programming, top down and bottom up. The bottom up approach usually uses less memory but is harder to write. Therefore I do the top down recursive/memoize and only go for the bottom up approach if I need the last ounce of performance.
It is a perfectly true answer, and I got hired.
Now you may notice that tutorials about dynamic programming spend more time on bottom up. They often even skip the top down approach. They do that because bottom up is harder. You have to think differently. It does provide more efficient algorithms because you can throw away parts of that data structure that you know you won't use again.
Coming up with a working solution in an interview is hard enough already. Don't make it harder on yourself than you need to.
EDIT: Here is the DP solution that the interviewer thought didn't exist.
def find_best(floats):
    current_answers = {floats[0]: ()}
    floats = floats[1:]
    for f in floats:
        next_answers = {}
        for v, path in current_answers.iteritems():
            next_answers[v + f] = (path, '+')
            next_answers[v * f] = (path, '*')
            next_answers[v - f] = (path, '-')
            if 0 != f:
                next_answers[v / f] = (path, '/')
        current_answers = next_answers
    best_val = max(current_answers.keys())
    return (best_val, current_answers[best_val])
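For instance, on the example from the interview question (my quick check, Python 2 to match the iteritems above):

print find_best([1.0, 12.0, 3.0])
# -> (39.0, (((), '+'), '*')), i.e. 1 + 12 = 13, then 13 * 3 = 39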
Generally, the overlapping-subproblem approach applies where a problem can be broken down into smaller subproblems whose solutions, when combined, solve the big problem. When those subproblems also exhibit optimal substructure, DP is a good way to solve it.
Here, the decision about what you do with a new number you encounter has little to do with the numbers you have already processed, other than accounting for signs, of course.
So I would say this has overlapping subproblems but is not a dynamic programming problem; you could use divide and conquer or even more straightforward recursive methods.
Initially, let's forget about negative floats.
Process each new float according to the following rules:
If the new float is less than 1, insert a / before it.
If the new float is more than 1, insert a * before it.
If it is 1, insert a +.
If you see a zero, just don't divide or multiply.
This would solve it for all positive floats.
Now let's handle the case of negative numbers thrown into the mix.
Scan the input once to figure out how many negative numbers you have.
Isolate all the negative numbers in a list, convert all the numbers whose absolute value is less than 1 to their multiplicative inverse, then sort them by magnitude. If you have an even number of elements, we are all good. If you have an odd number of elements, store the head of this list in a special var, say k, associate a processed flag with it, and set the flag to False.
Proceed as before with some updated rules:
If you see a negative number less than 0 but more than -1, insert a / before it.
If you see a negative number less than -1, insert a * before it.
If you see the special var and the processed flag is False, insert a - before it and set processed to True.
There is one more optimization you can perform, which is removing pairs of negative ones as candidates for blanket subtraction from our initial negative-numbers list, but this is just an edge case and I'm pretty sure your interviewer won't care.
Now the sum is only a function of the number you are adding and not the sum you are adding to :)
Compute the max/min results for each operation from the previous step (not sure about overall correctness).
Time complexity O(n), space complexity O(n).
const max_value = (nums) => {
    const ops = [(a, b) => a + b, (a, b) => a - b, (a, b) => a * b, (a, b) => a / b]
    // dp[i][j] = [max, min] over expressions ending at position i whose last operator is ops[j]
    const dp = Array.from({length: nums.length}, _ => [])
    dp[0] = Array.from({length: ops.length}, _ => [nums[0], nums[0]])
    for (let i = 1; i < nums.length; i++) {
        for (let j = 0; j < ops.length; j++) {
            // If the current number is zero, division is not allowed:
            // mark this state as unreachable instead of removing the operator
            if (j === 3 && nums[i] === 0) {
                dp[i].push([-Infinity, Infinity])
                continue
            }
            let mx = -Infinity
            let mn = Infinity
            for (let k = 0; k < ops.length; k++) {
                if (dp[i-1][k][0] === -Infinity) continue // skip unreachable states
                const opMax = ops[j](dp[i-1][k][0], nums[i])
                const opMin = ops[j](dp[i-1][k][1], nums[i])
                mx = Math.max(opMax, opMin, mx)
                mn = Math.min(opMax, opMin, mn)
            }
            dp[i].push([mx, mn])
        }
    }
    return Math.max(...dp[nums.length-1].map(v => v[0]))
}
// Tests
console.log(max_value([1, 12, 3]))
console.log(max_value([1, 0, 3]))
console.log(max_value([17,-34,2,-1,3,-4,5,6,7,1,2,3,-5,-7]))
console.log(max_value([59, 60, -0.000001]))
console.log(max_value([0, 1, -0.0001, -1.00000001]))
2048 used to be quite popular just a little while ago. Everybody played it, and a lot of people posted nice screenshots of their accomplishments (myself among them). Then at some point I began to wonder if it is possible to tell how long someone played to get to a given score. I benchmarked, and it turns out that (at least in the Android application I have) no more than one move can be made in one second. Thus, if you play long enough (and fast enough), the number of moves you've made is quite a good approximation to the number of seconds you've played. Now the question is: is it possible, given a screenshot of a 2048 game, to compute how many moves were made?
Here is an example screenshot (actually my best effort on the game so far):
From the screenshot you can see the field layout at the current moment and the number of points that the player has earned. So: is this information enough to compute how many moves I've made, and if so, what is the algorithm to do that?
NOTE: I would like to remind you that points are only scored when two tiles "combine", and the number of points scored is the value of the new tile (i.e. the sum of the values of the tiles being combined).
The short answer is: yes, it is possible to compute the number of moves using only this information. I will explain the algorithm, posting my answer in steps; each step is an observation targeted at helping you solve the problem. I encourage the reader to try to solve the problem alone after each tip.
Observation number one: after each move exactly one tile appears. This tile is either a 4 or a 2. Thus what we need to do is to count the number of tiles that appeared. At least in the version of the game I played, the game always starts with 2 tiles with 2 on them placed at random.
We don't care about the actual layout of the field. We only care about the numbers that are present on it. This will become more obvious when I explain my algorithm.
Seeing the values in the cells on the field, we can compute what the score would be if a 2 had appeared after each move. Call that value twos_score.
The number of fours that have appeared is equal to the difference between twos_score and actual_score, divided by 4. This is true because forming a 4 from two 2s scores 4 points, while a 4 that appears straight away scores 0. Call the number of fours fours.
We can compute the number of 2s needed to form all the numbers on the field. After that we need to subtract 2 * fours from this value, as a single 4 replaces the need for two 2s. Call this twos.
Using these observations we are able to solve the problem. Now I will explain in more detail how to perform the separate steps.
How to compute the score if only 2s appeared?
I will prove by induction that to form the number 2^n, the player scores 2^n * (n - 1) points.
The statement is obvious for 2 (= 2^1), as it appears directly, and therefore no points are scored for it.
Let's assume that for a fixed k, forming the number 2^k scores 2^k * (k - 1) points.
For k + 1: 2^(k+1) can only be formed by combining two tiles of value 2^k. Thus the player will score 2^k * (k - 1) + 2^k * (k - 1) + 2^(k+1) (the score for the two tiles being combined, plus the score for the new tile).
This equals 2^(k+1) * (k - 1) + 2^(k+1) = 2^(k+1) * (k - 1 + 1) = 2^(k+1) * k. This completes the induction.
Therefore, to compute the score if only 2s appeared, we need to iterate over all numbers on the board and accumulate the score we get for them using the formula above.
How to compute the number of twos needed to form the numbers on the field?
It is much easier to notice that the number of 2s needed to form 2^n is 2^(n-1). A strict proof can again be done using induction, but I will leave this to the reader.
The code
I will provide code for solving the problem in C++. However, I do not use anything too language-specific (apart from vector, which is simply a dynamically expanding array), so it should be very easy to port to many other languages.
/**
 * @param a - a vector containing the values currently in the field.
 *            A value of zero means "empty cell".
 * @param score - the score the player currently has.
 * @return a pair where the first number is the number of twos that appeared
 *         and the second number is the number of fours that appeared.
 */
pair<int,int> solve(const vector<vector<int> >& a, int score) {
    vector<int> counts(20, 0);
    for (int i = 0; i < (int)a.size(); ++i) {
        for (int j = 0; j < (int)a[0].size(); ++j) {
            if (a[i][j] == 0) {
                continue;
            }
            int num;
            for (int l = 1; l < 20; ++l) {
                if (a[i][j] == 1 << l) {
                    num = l;
                    break;
                }
            }
            counts[num]++;
        }
    }
    // What the score would be if only twos appeared every time
    int twos_score = 0;
    for (int i = 1; i < 20; ++i) {
        twos_score += counts[i] * (1 << i) * (i - 1);
    }
    // For each 4 that appears instead of a two the overall score decreases by 4
    int fours = (twos_score - score) / 4;
    // How many twos are needed for all the numbers on the field (ignoring score)
    int twos = 0;
    for (int i = 1; i < 20; ++i) {
        twos += counts[i] * (1 << (i - 1));
    }
    // Each four replaces two 2-s
    twos -= fours * 2;
    return make_pair(twos, fours);
}
Now, to answer how many moves we've made, we add the two values of the pair returned by this function and subtract two, because two tiles with a 2 appear straight away.
I've been browsing the internet all day for an existing solution for creating an equation out of a list of numbers and operators for a specified target number.
I came across a lot of 24 Game solvers, Countdown solvers and the like, but they are all built around the concept of allowing parentheses in the answers.
For example, for a target 42, using the number 1 2 3 4 5 6, a solution could be:
6 * 5 = 30
4 * 3 = 12
30 + 12 = 42
Note how the algorithm remembers the outcome of a sub-equation and later re-uses it to form the solution (in this case 30 and 12), essentially using parentheses to form the solution (6 * 5) + (4 * 3) = 42.
Whereas I'd like a solution WITHOUT the use of parentheses, which is solved from left to right, for example 6 - 1 + 5 * 4 + 2 = 42. If I'd write it out, it would be:
6 - 1 = 5
5 + 5 = 10
10 * 4 = 40
40 + 2 = 42
I have a list of about 55 numbers (random numbers ranging from 2 to 12), 9 operators (2 of each basic operator + 1 random operator) and a target value (a random number between 0 and 1000). I need an algorithm to check whether or not my target value is solvable (and optionally, if it isn't, how close we can get to the actual value). Each number and operator can only be used once, which means a maximum of 10 numbers can be used to reach the target value.
I found a brute-force algorithm which can be easily adjusted to do what I want (How to design an algorithm to calculate countdown style maths number puzzle), and that works, but I was hoping to find something which generates more sophisticated "solutions", like on this page: http://incoherency.co.uk/countdown/
I wrote the solver you mentioned at the end of your post, and I apologise in advance that the code isn't very readable.
At its heart the code for any solver to this sort of problem is simply a depth-first search, which you imply you already have working.
Note that if you go with your "solution WITHOUT the use of parentheses, which is solved from left to right" then there are input sets which are not solvable. For example, 11,11,11,11,11,11 with a target of 144. The solution is ((11/11)+11)*((11/11)+11). My solver makes this easier for humans to understand by breaking the parentheses up into different lines, but it is still effectively using parentheses rather than evaluating from left to right.
The way to "use parentheses" is to apply an operation to your inputs and put the result back in the input bag, rather than to apply an operation to one of the inputs and an accumulator. For example, if your input bag is 1,2,3,4,5,6 and you decide to multiply 3 and 4, the bag becomes 1,2,12,5,6. In this way, when you recurse, that step can use the result of the previous step. Preparing this for output is just a case of storing the history of operations along with each number in the bag.
I imagine what you mean by more "sophisticated" solutions is just the simplicity heuristic used in my JavaScript solver. The solver works by doing a depth-first search of the entire search space and then choosing the solution that is "best", rather than just the one that uses the fewest steps.
A solution is considered "better" than a previous solution (i.e. replaces it as the "answer" solution) if it is closer to the target (note that any state in the solver is a candidate solution, it's just that most are further away from the target than the previous best candidate solution), or if it is equally distant from the target and has a lower heuristic score.
The heuristic score is the sum of the "intermediate values" (i.e. the values on the right-hand-side of the "=" signs), with trailing 0's removed. For example, if the intermediate values are 1, 4, 10, 150, the heuristic score is 1+4+1+15: the 10 and the 150 only count for 1 and 15 because they end in zeroes. This is done because humans find it easier to deal with numbers that are divisible by 10, and so the solution appears "simpler".
The other part that could be considered "sophisticated" is the way that some lines are joined together. This simply joins the result of "5 + 3 = 8" and "8 + 2 = 10" into "5 + 3 + 2 = 10". The code to do this is absolutely horrible, but in case you're interested it's all in the Javascript at https://github.com/jes/cntdn/blob/master/js/cntdn.js - the gist is that after finding the solution, which is stored in array form (with information about how each number was made), a bunch of post-processing happens. Roughly:
convert the "solution list" generated from the DFS to a (rudimentary, nested-array-based) expression tree - this is to cope with the multi-argument case (i.e. "5 + 3 + 2" is not 2 addition operations, it's just one addition that has 3 arguments)
convert the expression tree to an array of steps, including sorting the arguments so that they're presented more consistently
convert the array of steps into a string representation for display to the user, including an explanation of how distant from the target number the result is, if it's not equal
Apologies for the length of that. Hopefully some of it is of use.
James
EDIT: If you're interested in Countdown solvers in general, you may want to take a look at my letters solver as it is far more elegant than the numbers one. It's the top two functions at https://github.com/jes/cntdn/blob/master/js/cntdn.js - to use call solve_letters() with a string of letters and a function to get called for every matching word. This solver works by traversing a trie representing the dictionary (generated by https://github.com/jes/cntdn/blob/master/js/mk-js-dict), and calling the callback at every end node.
I use recursion in Java to do the array combination. The main idea is to use DFS to enumerate the array combinations and operation combinations.
I use a boolean array to store the visited positions, which avoids the same element being used again. The temp StringBuilder stores the current equation; if the corresponding result is equal to the target, I put the equation into the result list. Do not forget to return temp and the visited array to their original state when you select the next array element.
This algorithm produces some duplicate answers, so it needs to be optimized later.
public static void main(String[] args) {
    List<StringBuilder> res = new ArrayList<StringBuilder>();
    int[] arr = {1, 2, 3, 4, 5, 6};
    int target = 42;
    for (int i = 0; i < arr.length; i++) {
        boolean[] visited = new boolean[arr.length];
        visited[i] = true;
        StringBuilder sb = new StringBuilder();
        sb.append(arr[i]);
        findMatch(res, sb, arr, visited, arr[i], "+-*/", target);
    }
    for (StringBuilder sb : res) {
        System.out.println(sb.toString());
    }
}

public static void findMatch(List<StringBuilder> res, StringBuilder temp, int[] nums,
        boolean[] visited, int current, String operations, int target) {
    if (current == target) {
        res.add(new StringBuilder(temp));
    }
    for (int i = 0; i < nums.length; i++) {
        if (visited[i]) continue;
        for (char c : operations.toCharArray()) {
            visited[i] = true;
            temp.append(c).append(nums[i]);
            if (c == '+') {
                findMatch(res, temp, nums, visited, current + nums[i], operations, target);
            } else if (c == '-') {
                findMatch(res, temp, nums, visited, current - nums[i], operations, target);
            } else if (c == '*') {
                findMatch(res, temp, nums, visited, current * nums[i], operations, target);
            } else if (c == '/') {
                findMatch(res, temp, nums, visited, current / nums[i], operations, target);
            }
            temp.delete(temp.length() - 2, temp.length());
            visited[i] = false;
        }
    }
}
I often teach large introductory programming classes (400 - 600 students) and when exam time comes around, we often have to split the class up into different rooms in order to make sure everyone has a seat for the exam.
To keep things logistically simple, I usually break the class apart by last name. For example, I might send students with last names A - H to one room, last name I - L to a second room, M - S to a third room, and T - Z to a fourth room.
The challenge in doing this is that the rooms often have wildly different capacities and it can be hard to find a way to segment the class in a way that causes everyone to fit. For example, suppose that the distribution of last names is (for simplicity) the following:
Last name starts with A: 25
Last name starts with B: 150
Last name starts with C: 200
Last name starts with D: 50
Suppose that I have rooms with capacities 350, 50, and 50. A greedy algorithm for finding a room assignment might be to sort the rooms into descending order of capacity, then try to fill in the rooms in that order. This, unfortunately, doesn't always work. For example, in this case, the right option is to put last name A in one room of size 50, last names B - C into the room of size 350, and last name D into another room of size 50. The greedy algorithm would put last names A and B into the 350-person room, then fail to find seats for everyone else.
It's easy to solve this problem by just trying all possible permutations of the room orderings and then running the greedy algorithm on each ordering. This will either find an assignment that works or report that none exists. However, I'm wondering if there is a more efficient way to do this, given that the number of rooms might be between 10 and 20 and checking all permutations might not be feasible.
To summarize, the formal problem statement is the following:
You are given a frequency histogram of the last names of the students in a class, along with a list of rooms and their capacities. Your goal is to divvy up the students by the first letter of their last name so that each room is assigned a contiguous block of letters and does not exceed its capacity.
Is there an efficient algorithm for this, or at least one that is efficient for reasonable room sizes?
EDIT: Many people have asked about the contiguous condition. The rules are
Each room should be assigned at most a block of contiguous letters, and
No letter should be assigned to two or more rooms.
For example, you could not put A - E, H - N, and P - Z into the same room. You could also not put A - C in one room and B - D in another.
Thanks!
It can be solved with a DP over an [m, 2^n] state space, where m is the number of letters (26 for English) and n is the number of rooms. With m == 26 and n == 20 it will take about 100 MB of space and ~1 sec of time.
Below is a solution I have just implemented in C# (it will compile in C++ and Java too; just several minor changes will be needed):
int[] GetAssignments(int[] studentsPerLetter, int[] rooms)
{
    int numberOfRooms = rooms.Length;
    int numberOfLetters = studentsPerLetter.Length;
    int roomSets = 1 << numberOfRooms; // 2 ^ (number of rooms)
    int[,] map = new int[numberOfLetters + 1, roomSets];
    for (int i = 0; i <= numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            map[i, j] = -2;
    map[0, 0] = -1; // starting condition
    for (int i = 0; i < numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            if (map[i, j] > -2)
            {
                for (int k = 0; k < numberOfRooms; k++)
                    if ((j & (1 << k)) == 0)
                    {
                        // this room is empty yet.
                        int roomCapacity = rooms[k];
                        int t = i;
                        for (; t < numberOfLetters && roomCapacity >= studentsPerLetter[t]; t++)
                            roomCapacity -= studentsPerLetter[t];
                        // marking next state as good, also specifying index of just occupied room
                        // - it will help to construct solution backwards.
                        map[t, j | (1 << k)] = k;
                    }
            }
    // Constructing solution.
    int[] res = new int[numberOfLetters];
    int lastIndex = numberOfLetters - 1;
    for (int j = 0; j < roomSets; j++)
    {
        int roomMask = j;
        while (map[lastIndex + 1, roomMask] > -1)
        {
            int lastRoom = map[lastIndex + 1, roomMask];
            int roomCapacity = rooms[lastRoom];
            for (; lastIndex >= 0 && roomCapacity >= studentsPerLetter[lastIndex]; lastIndex--)
            {
                res[lastIndex] = lastRoom;
                roomCapacity -= studentsPerLetter[lastIndex];
            }
            roomMask ^= 1 << lastRoom; // Remove last room from set.
            j = roomSets; // Over outer loop.
        }
    }
    return lastIndex > -1 ? null : res;
}
Example from the OP's question:
int[] studentsPerLetter = { 25, 150, 200, 50 };
int[] rooms = { 350, 50, 50 };
int[] ans = GetAssignments(studentsPerLetter, rooms);
The answer will be:
2
0
0
1
which indicates the index of the room for each letter of the students' last names. If an assignment is not possible, my solution returns null.
[Edit]
After thousands of auto-generated tests, my friend found a bug in the code which constructs the solution backwards. It does not affect the main algorithm, so fixing this bug is left as an exercise to the reader.
The test case that reveals the bug is students = [13,75,21,49,3,12,27,7] and rooms = [6,82,89,6,56]. My solution returns no answer, but actually there is one. Note that the first part of the solution works properly; only the answer-construction part fails.
This problem is NP-complete, and thus there is no known polynomial-time (aka efficient) solution for it (as long as nobody proves P = NP). You can reduce an instance of the knapsack or bin-packing problem to your problem to prove it is NP-complete.
To solve it you can use the 0-1 knapsack problem. Here is how: first pick the biggest classroom and try to allocate as many groups of students as you can (using 0-1 knapsack), i.e. up to the size of the room. You are guaranteed not to split a group of students, as this is 0-1 knapsack. Once done, take the next biggest classroom and continue.
(You can use any known heuristic to solve the 0-1 knapsack problem.)
Here is the reduction: you need to reduce a general instance of 0-1 knapsack to a specific instance of your problem.
Take a general instance of 0-1 knapsack: a sack of capacity W, and groups x_1, x_2, ..., x_n with corresponding weights w_1, w_2, ..., w_n.
This general instance is reduced to your problem as follows: you have one classroom with seating capacity W, and each x_i (i in 1..n) is a group of students whose last names begin with letter i and whose size is w_i.
Now you can prove that if there is a solution to the 0-1 knapsack instance, your problem has a solution, and conversely: if there is no solution for the 0-1 knapsack instance, then your problem has no solution, and vice versa.
Please remember the important thing about a reduction: it maps a general instance of a known NP-complete problem to a specific instance of your problem.
Hope this helps :)
Here is an approach that should work reasonably well, given common assumptions about the distribution of last names by initial: fill the rooms from smallest capacity to largest, as compactly as possible within the constraints, with no backtracking.
It seems reasonable (to me at least) for the largest room to be listed last, as being for "everyone else" not already listed.
Is there any reason to make life so complicated? Why can't you assign registration numbers to each student and then use the numbers to allocate them whatever way you want? :) You do not need to write code, the students are happy, everyone is happy.
I came across this question:
ADZEN is a very popular advertising firm in your city; on every road you can see their advertising billboards. Recently they are facing a serious challenge: MG Road, the most used and most beautiful road in your city, has been almost filled with billboards, and this is having a negative effect on the natural view.
On people's demand, ADZEN has decided to remove some of the billboards in such a way that there are no more than K billboards standing together in any part of the road.
You may assume MG Road to be a straight line with N billboards. Initially there is no gap between any two adjacent billboards.
ADZEN's primary income comes from these billboards, so the billboard-removing process has to be done in such a way that the billboards remaining at the end give the maximum possible profit among all possible final configurations. The total profit of a configuration is the sum of the profit values of all billboards present in that configuration.
Given N, K and the profit value of each of the N billboards, output the maximum profit that can be obtained from the remaining billboards under the conditions given.
Input description:
The 1st line contains two space-separated integers, N and K. Then follow N lines describing the profit value of each billboard, i.e. the ith line contains the profit value of the ith billboard.
Sample Input
6 2
1
2
3
1
6
10
Sample Output
21
Explanation:
In the given input there are 6 billboards, and after the process no more than 2 should stand together. So remove the 1st and 4th billboards, giving the configuration _ 2 3 _ 6 10 with a profit of 21. No other configuration has a profit of more than 21, so the answer is 21.
Constraints
1 <= N <= 100,000 (10^5)
1 <= K <= N
0 <= profit value of any billboard <= 2,000,000,000 (2*10^9)
I thought we could select the minimum-profit billboard among the first k+1 billboards and then repeat the same until the last one, but this was not giving the correct answer for all cases.
I tried to the best of my knowledge, but was unable to find a solution.
If anyone has an idea, please kindly share your thoughts.
It's a typical DP problem. Let's say that P(n,k) is the maximum profit of having k billboards up to position n on the road. Then you have the following formula:
P(n,k) = max(P(n-1,k), P(n-1,k-1) + C(n))
P(i,0) = 0 for i = 0..n
where C(n) is the profit from putting the nth billboard on the road. Using that formula to calculate P(n, k) bottom-up, you'll get the solution in O(nk) time.
I'll leave it up to you to figure out why that formula holds.
edit:
Dang, I misread the question.
It still is a DP problem, just with a different formula. Let's say that P(v,i) is the maximum profit at point v where the last cluster of billboards has size i.
Then P(v,i) can be described using the following formulas:
P(v,i) = P(v-1,i-1) + C(v) if i > 0
P(v,0) = max(P(v-1,i) for i = 0..min(k, v))
P(0,0) = 0
You need to find max(P(n,i) for i = 0..k). A sketch follows.
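A quick Python sketch of that recurrence (my code; c is the list of profit values, and only two rows of the table are kept):

def max_profit(c, k):
    NEG = float("-inf")
    prev = [0] + [NEG] * k   # P(0, i): before any billboard, run length is 0
    for profit in c:
        cur = [NEG] * (k + 1)
        cur[0] = max(x for x in prev if x > NEG)   # remove this billboard
        for i in range(1, k + 1):
            if prev[i-1] > NEG:
                cur[i] = prev[i-1] + profit        # keep it, run grows to i
        prev = cur
    return max(x for x in prev if x > NEG)

print(max_profit([1, 2, 3, 1, 6, 10], 2))  # 21, matching the sample above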
This problem is one of the challenges posted in www.interviewstreet.com ...
I'm happy to say I got this down recently, though I'm not quite satisfied and wanted to see if there's a better method out there.
soulcheck's DP solution above is straightforward, but won't solve this completely, because K can be as big as N, meaning the DP complexity will be O(NK) in both runtime and space.
Another solution is branch-and-bound: keep track of the best sum so far, and prune the recursion when, at some level, currSumSoFar + SUM(a[currIndex..n)) <= bestSumSoFar; then exit the function immediately, since there is no point processing further when the upper bound can't beat the best sum so far.
The branch-and-bound above got accepted by the tester for all but 2 test cases.
Fortunately, I noticed that those 2 test cases use a small K (in my case, K < 300), so the O(NK) DP technique suffices there.
soulcheck's (second) DP solution is correct in principle. There are two improvements you can make, using these observations:
1) It is unnecessary to allocate the entire DP table: you only ever look at two rows at a time.
2) For each row (the v in P(v, i)), you are only interested in the i's which most increase the max value, which is one more than each i that held the max value in the previous row. Also include i = 0, otherwise you never consider blanks.
I coded it in C++ using DP in O(n log k).
The idea is to maintain a multiset with the next k values for a given position. This multiset will typically hold k values in mid-processing; each time you move, you remove one element and push a new one. The art is in how to maintain this list so that it holds profit[i] + answer[i+2]. (ll, sz and lpd below are contest-style shorthands - presumably a long long typedef, a size macro and a downward-counting loop macro.) More details on the set:
/*
 * Observation 1: the ith state depends on the next k states i+2 .. i+2+k.
 * We maximize across these states with an "accumulative" sum added to them.
 *
 * Say we have the list of numbers for state i+1, that is, a list of
 * {profit + state solution}. How do we get the states of the ith solution?
 *
 * Say we have the following data, k = 3:
 *
 * Indices:  0 1 2 3 4
 * Profits:  1 3 2 4 2
 * Solution: ? ? 5 3 1
 *
 * Answer for [1] = max(3+3, 5+1, 9+0) = 9
 *
 * Indices:  0 1 2 3 4
 * Profits:  1 3 2 4 2
 * Solution: ? 9 5 3 1
 *
 * Let's find the answer for [0], using the set of [1].
 *
 * First, the last entry should be removed. Then we have (3+3, 5+1).
 *
 * Now we should add 1+5, but the entries should be incremented by 1:
 * (1+5, 4+3, 6+1) -> then find the max.
 *
 * Could we do it another way, instead of processing the list? Yes, we
 * simply add 1 to all elements:
 *
 * the answer is the same as: 1 + max(1-1+5, 3+3, 5+1)
 */
ll dp()
{
    multiset<ll, greater<ll> > set;
    mem[n-1] = profit[n-1];
    ll sumSoFar = 0;
    lpd(i, n-2, 0)
    {
        if (sz(set) == k)
            set.erase(set.find(added[i+k]));
        if (i+2 < n)
        {
            added[i] = mem[i+2] - sumSoFar;
            set.insert(added[i]);
            sumSoFar += profit[i];
        }
        if (n-i <= k)
            mem[i] = profit[i] + mem[i+1];
        else
            mem[i] = max(mem[i+1], *set.begin() + sumSoFar);
    }
    return mem[0];
}
This looks like a linear programming problem. This problem would be linear, but for the requirement that no more than K adjacent billboards may remain.
See wikipedia for a general treatment: http://en.wikipedia.org/wiki/Linear_programming
Visit your university library to find a good textbook on the subject.
There are many, many libraries to assist with linear programming, so I suggest you do not attempt to code an algorithm from scratch. Here is a list relevant to Python: http://wiki.python.org/moin/NumericAndScientific/Libraries
Let P[i] (where i=1..n) be the maximum profit for billboards 1..i IF WE REMOVE billboard i. It is trivial to calculate the answer knowing all P[i]. The baseline algorithm for calculating P[i] is as follows:
for i = 1..N
{
    P[i] = -infinity;
    for j = max(1, i-k-1)..i-1
    {
        P[i] = max( P[i], P[j] + C[j+1] + .. + C[i-1] );
    }
}
Now the idea that allows us to speed things up. Let's say we have two different valid configurations of billboards 1 through i only; call these configurations X1 and X2. If billboard i is removed in configuration X1 and profit(X1) >= profit(X2), then we should always prefer configuration X1 for billboards 1..i (by profit() I mean the profit from billboards 1..i only, regardless of the configuration for i+1..n). This is as important as it is obvious.
We introduce a doubly-linked list of tuples {idx,d}: {{idx1,d1}, {idx2,d2}, ..., {idxN,dN}}.
p->idx is index of the last billboard removed. p->idx is increasing as we go through the list: p->idx < p->next->idx
p->d is the sum of elements (C[p->idx]+C[p->idx+1]+..+C[p->next->idx-1]) if p is not the last element in the list. Otherwise it is the sum of elements up to the current position minus one: (C[p->idx]+C[p->idx+1]+..+C[i-1]).
Here is the algorithm:
P[1] = 0;
list.AddToEnd( {idx=0, d=C[0]} );
// sum of elements starting from the index at top of the list
sum = C[0]; // C[list->begin()->idx] + C[list->begin()->idx+1] + ... + C[i-1]
for i = 2..N
{
    if( i - list->begin()->idx > k + 1 ) // the head of the list is "too far"
    {
        sum = sum - list->begin()->d
        list.RemoveNodeFromBeginning()
    }
    // At this point the list should contain at least the element
    // added on the previous iteration. Calculating P[i].
    P[i] = P[list.begin()->idx] + sum
    // Updating list.end()->d and removing "unnecessary nodes"
    // based on the criterion described above
    list.end()->d = list.end()->d + C[i]
    while(
        (list is not empty) AND
        (P[i] >= P[list.end()->idx] + list.end()->d - C[list.end()->idx]) )
    {
        if( list.size() > 1 )
        {
            list.end()->prev->d += list.end()->d
        }
        list.RemoveNodeFromEnd();
    }
    list.AddToEnd( {idx=i, d=C[i]} );
    sum = sum + C[i]
}
// shivi.. coding is addictive!!
#include <stdio.h>

long long int arr[100001];
long long int sum[100001];
long long int including[100001], excluding[100001];

long long int maxim(long long int a, long long int b)
{ if (a > b) return a; return b; }

int main()
{
    int N, K;
    scanf("%d%d", &N, &K);
    for (int i = 0; i < N; ++i) scanf("%lld", &arr[i]);
    sum[0] = arr[0];
    including[0] = sum[0];
    excluding[0] = sum[0];
    for (int i = 1; i < K; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        including[i] = sum[i];
        excluding[i] = sum[i];
    }
    long long int maxi = 0, temp = 0;
    for (int i = K; i < N; ++i)
    {
        sum[i] += sum[i-1] + arr[i];
        for (int j = 1; j <= K; ++j)
        {
            temp = sum[i] - sum[i-j];
            if (i-j-1 >= 0)
                temp += including[i-j-1];
            if (temp > maxi) maxi = temp;
        }
        including[i] = maxi;
        excluding[i] = including[i-1];
    }
    printf("%lld", maxim(including[N-1], excluding[N-1]));
}
// here is the code... passing all but 1 test case :) comment improvements... simple DP