Partition an array in order - algorithm

This is an interview question that a friend of mine got and I'm unable to come up with how to solve it.
Question:
You are given a array of n buttons that are either red or blue. There are k containers present. The value of a container is given by the product of red buttons and blue buttons present in it. The problem is to put the buttons into the containers such that the sum of all values of the containers is minimal. Additionally, all containers must contain the buttons and they must be put in order they are given.
For example, the very first button can only go to the first container, the second one can go to either the first or the second but not the third (otherwise the second container won't have any buttons).
k will be less than or equal to n.
I think there must be a dynamic programming solution for this.
How do you solve this ?
So far, I've only got the trivial cases where
if (n==k), the answer would be zero because you could just put one in each container making the value of each container zero, therefore the sum would be zero.
if (k==1), you just dump all of them and calculate the product.
if only one color is present, the answer would be zero.
Edit:
I'll give an example.
n = 4 and k = 2
Input: R B R R
The first container gets the first two (R and B) making its value 1 (1R X 1B)
The second container gets the remaining (R and R) making its value 0 (2R x 0B)
The answer is 1 + 0 = 1
if k=3,
the first container would have only the first button (R)
the second container would have only the second one (B)
the third one would have the last two buttons (R and R)
Each of the containers would have value 0 and hence sum and answer would be 0.
Hope this clears up the doubts.

Possible DP solution:
Let dp[i, j] = minimum number possible if we put the first i numbers into j containers.
dp[i, j] = min{dp[p, j - 1] + numRed[p+1, i]*numBlues[p+1, i]}, p = 1 to i - 1
Answer will be in dp[n, k].
int blue = 0, red = 0;
for (int i = 1; i <= n; ++i)
{
if (buttons[i] == 1)
++red;
else
++blue;
dp[i][1] = red * blue;
}
for (int i = 2; i <= n; ++i)
for (int j = 2; j <= k; ++j)
{
dp[i][j] = inf;
for (int p = 1; p <= i; ++p)
dp[i][j] = min(dp[p][j - 1] + getProd(p + 1, i), dp[i][j]);
}
return dp[n][k];
Complexity will be O(n^3*k), but it's possible to reduce to O(n^2*k) by making getProd run in O(1) with the help of certain precomputations (hint: use dp[i][1]). I'll post it tomorrow if no one figures out this is actually wrong until then.
It might also be possible to reduce to O(n*k), but that will probably require a different approach...

If I understand the question correctly, as long as every container has at least one button in it, you can choose any container to put the remaining buttons in. Given that, put one button in every container, making sure that there is at least one container with a red button and at least one with a blue button. Then with the remaining buttons, put all the red buttons in a container with a red button and put all the blue buttons in a container with blue buttons in it. This will make it so every container has at least one button and every container has only one color of buttons. Then every container's score is 0. Thus the sum is 0 and you have minimized the combined score.

Warning: Proven to be non-optimal
How about a greedy algorithm to get people talking? I'm not going to try to prove it's optimal at this point, but it's a way of approaching the problem.
In this solution, we use the G to denote the number of contiguous regions of one colour in the sequence of buttons. Say we had (I'm using x for red and o for blue since R and B look too similar):
x x x o x o o o x x o
This would give G = 6. Let's split this into groups (red/blue) where, to start with, each group gets an entire region of a consistent colour:
3/0 0/1 1/0 0/3 2/0 0/1 //total value: 0
When G <= k, you have a minimum of zero since each grouping can go into its own container. Now assume G > k. Our greedy algorithm will be, while there are more groups than containers, collapse two adjacent groups into one that result in the least container value delta (valueOf(merged(a, b)) - valueOf(a) - valueOf(b)). Say k = 5 with our example above. Our choices are:
Collapse 1,2: delta = (3 - 0 - 0) = 3
2,3: delta = 1
3,4: delta = 3
4,5: delta = 6
5,6: delta = 2
So we collapse 2 and 3:
3/0 1/1 0/3 2/0 0/1 //total value: 1
And k = 4:
Collapse 1,2: delta = (4 - 0 - 1) = 3
2,3: delta = (4 - 1 - 0) = 3
3,4: delta = (6 - 0 - 0) = 6
4,5: delta = 2
3/0 1/1 0/3 2/1 //total value: 3
k = 3
4/1 0/3 2/1 //total value: 6
k = 2
4/1 2/4 //total value: 12
k = 1
6/5 //total value: 30
It seems optimal for this case, but I was just intending to get people talking about a solution. Note that the starting assignments of buttons to containers was a shortcut: you could instead start with each button in the sequence in its own bucket and then reduce, but you would always arrive to the point where each container has the maximum number of buttons of one colour.
Counterexample: Thanks to Jules Olléon for providing a counter-example that I was too lazy to think of:
o o o x x o x o o x x x
If k = 2, the optimal mapping is
2/4 4/2 //total value: 16
Let's see how the greedy algorithm approaches it:
0/3 2/0 0/1 1/0 0/2 3/0 //total value: 0
0/3 2/0 1/1 0/2 3/0 //total value: 1
0/3 3/1 0/2 3/0 //total value: 3
0/3 3/1 3/2 //total value: 9
3/4 3/2 //total value: 18
I'll leave this answer up since it's accomplished its only purpose of getting people talking about a solution. I wonder if the greedy heuristic could be used in an informed search algorithm such as A* to improve the runtime of an exhaustive search, but that would not achieve polynomial runtime.

I always ask for clarifications of the problem statement in an interview. Imagine that you never put blue an red buttons together. Then the sum is 0, just like n==k. So, for all cases where k > 1, then the minimum is 0.

Here is what I understand so far: The algorithm is to process a sequence of values {R,B}.
It may choose to put the value in the current container or the next, if there is a next.
I first would ask a couple of questions to clarify the things I don't know yet:
Is k and n known to the algorithm in advance? I assume so.
Do we know the full sequence of buttons in advance?
If we don't know the sequence in advance, should the average value minimized? Or the maximum (the worst case)?
Idea for a proof for the algortihm by Mark Peters
Edit: Idea for a proof (sorry, couldn't fit it in a comment)
Let L(i) be the length of the ith group. Let d(i) be the diff you get by collapsing container i and i+1 => d(i) = L(i)*L(i+1).
We can define a distribution by the sequence of containers collapsed. As index we use the maximum index of the original containers contained in the collapsed container containing the containers with the smaller indexes.
A given sequence of collapses I = [i(1), .. i(m)] results in a value which has a lower bound equal to the sum of d(i(m)) for all m from 1 to n-k.
We need to proof that there can't be a sequence other then the one created by the algorithm with a smaller diff. So let the sequence above be the one resulting from the algorithm. Let J = [j(1), .. j(m)].
Here it gets skimpy:
I think it should be possible to proof that the lower limit of J is larger then the actual value of I because in each step we choose by construction the collapse operation from I so it must be smaller then the matching collapse from the alternate sequence
I think we might assume that the sequences are disjunct, but I'm not completely sure about it.

Here is a brute force algorithm written in Python which seems to work.
from itertools import combinations
def get_best_order(n, k):
slices = combinations(range(1, len(n)), k-1)
container_slices = ([0] + list(s) + [len(n)] for s in slices)
min_value = -1
best = None
def get_value(slices, n):
value = 0
for i in range(1, len(slices)):
start, end = slices[i-1], slices[i]
num_red = len([b for b in n[start:end] if b == 'r'])
value += num_red * (end - start - num_red)
return value
for slices in container_slices:
value = get_value(slices, n)
if value < min_value or min_value == -1:
min_value = value
best = slices
return [n[best[i-1]:best[i]] for i in range(1, len(best))]
n = ['b', 'r', 'b', 'r', 'r', 'r', 'b', 'b', 'r']
k = 4
print(get_best_order(n, k))
# [['b', 'r', 'b'], ['r', 'r', 'r'], ['b', 'b'], ['r']]
Basically the algorithm works like this:
Generate a list of every possible arrangement (items stay in order, so this is just a number of items per container)
Calculate the value for that arrangement as described by the OP
If that value is less than the current best value, save it
Return the arrangement that has the lowest value

Related

Optimal choice algorithm

i have an appointment for university which is due today and i start getting nervous. We recently discussed dynamic programming for algorithm optimization and now we shall implement an algorithm ourself which uses dynamic programming.
Task
So we have a simple game for which we shall write an algorithm to find the best possible strategy to get the best possible score (assuming both players play optimized).
We have a row of numbers like 4 7 2 3 (note that according to the task description it is not asured that it always is an equal count of numbers). Now each player turnwise takes a number from the back or the front. When the last number is picked the numbers are summed up for each player and the resulting scores for each player are substracted from each other. The result is then the score for player 1. So an optimal order for the above numbers would be
P1: 3 -> p2: 4 -> p1: 7 -> p2: 2
So p1 would have 3, 7 and p2 would have 4, 2 which results in a final score of (3 + 7) - (4 + 2) = 4 for player 1.
In the first task we should simply implement "an easy recursive way of solving this" where i just used a minimax algorithm which seemed to be fine for the automated test. In the second task however i am stuck since we shall now work with dynamic programming techniques. The only hint i found was that in the task itself a matrix is mentioned.
What i know so far
We had an example of a word converting problem where such a matrix was used it was called Edit distance of two words which means how many changes (Insertions, Deletions, Substitutions) of letters does it take to change one word into another. There the two words where ordered as a table or matrix and for each combination of the word the distance would be calculated.
Example:
W H A T
| D | I
v v
W A N T
editing distance would be 2. And you had a table where each editing distance for each substring was displayed like this:
"" W H A T
1 2 3 4
W 1 0 1 2 3
A 2 1 1 2 3
N 3 2 2 2 3
T 4 3 3 3 2
So for example from WHA to WAN would take 2 edits: insert N and delete H, from WH to WAN would also take 2 edits: substitude H->A and insert N and so on. These values where calculated with an "OPT" function which i think stands for optimization.
I also leanred about bottom-up and top-down recursive schemes but im not quite sure how to attach that to my problem.
What i thought about
As a reminder i use the numbers 4 7 2 3.
i learned from the above that i should try to create a table where each possible result is displayed (like minimax just that it will be saved before). I then created a simple table where i tried to include the possible draws which can be made like this (which i think is my OPT function):
4 7 2 3
------------------
a. 4 | 0 -3 2 1
|
b. 7 | 3 0 5 4
|
c. 2 | -2 -5 0 -1
|
d. 3 | -1 -4 1 0
the left column marks player 1 draws, the upper row marks player 2 draws and each number then stands for numberP1 - numberP2. From this table i can at least read the above mentioned optimal strategy of 3 -> 4 -> 7 -> 2 (-1 + 5) so im sure that the table should contain all possible results, but im not quite sure now how to draw the results from it. I had the idea to start iterating over the rows and pick the one with the highest number in it and mark that as the pick from p1 (but that would be greedy anyways). p2 would then search this row for the lowest number and pick that specific entry which would then be the turn.
Example:
p1 picks row a. 7 | 3 0 5 4 since 5 is the highest value in the table. P2 now picks the 3 from that row because it is the lowest (the 0 is an invalid draw since it is the same number and you cant pick that twice) so the first turn would be 7 -> 4 but then i noticed that this draw is not possible since the 7 is not accessible from the start. So for each turn you have only 4 possibilities: the outer numbers of the table and the ones which are directly after/before them since these would be accessable after drawing. So for the first turn i only have rows a. or d. and from that p1 could pick:
4 which leaves p2 with 7 or 3. Or p1 takes 3 which leaves p2 with 4 or 2
But i dont really know how to draw a conclusion out of that and im really stuck.
So i would really like to know if im on the right way with that or if im overthinking this pretty much. Is this the right way to solve this?
The first thing you should try to write down, when starting a dynamic programming algorithm, is a recurrence relation.
Let's first simplify a very little the problem. We will consider that the number of cards is even, and that we want to design an optimal strategy for the first player to play. Once we have managed to solve this version of the problem, the others (odd number of cards, optimize strategy for second player) follows trivially.
So, first, a recurrence relation. Let X(i, j) be the best possible score that player 1 can expect (when player 2 plays optimally as well), when the cards remaining are from the i^th to the j^th ones. Then, the best score that player 1 can expect when playing the game will be represented by X(1, n).
We have:
X(i, j) = max(Arr[i] + X(i+1, j), X(i, j-1) + Arr[j]) if j-i % 2 == 1, meaning that the best score that player one can expect is the best between taking the card on the left, and taking the card on the right.
In the other case, the other player is playing, so he'll try to minimize:
X(i, j) = min(Arr[i] + X(i+1, j), X(i, j-1) + Arr[j]) if j-i % 2 == 0.
The terminal case is trivial: X(i, i) = Arr[i], meaning that when there is only one card, we just pick it, and that's all.
Now the algorithm without dynamic programming, here we only write the recurrence relation as a recursive algorithm:
function get_value(Arr, i, j) {
if i == j {
return Arr[i]
} else if j - i % 2 == 0 {
return max(
Arr[i] + get_value(i+1, j),
get_value(i, j-1) + Arr[j]
)
} else {
return min(
Arr[i] + get_value(i+1, j),
get_value(i, j-1) + Arr[j]
)
}
}
The problem with this function is that for some given i, j, there will be many redundant calculations of X(i, j). The essence of dynamic programming is to store intermediate results in order to prevent redundant calculations.
Algo with dynamic programming (X is initialized with + inf everywhere.
function get_value(Arr, X, i, j) {
if X[i][j] != +inf {
return X[i][j]
} else if i == j {
result = Arr[i]
} else if j - i % 2 == 0 {
result = max(
Arr[i] + get_value(i+1, j),
get_value(i, j-1) + Arr[j]
)
} else {
result = min(
Arr[i] + get_value(i+1, j),
get_value(i, j-1) + Arr[j]
)
}
X[i][j] = result
return result
}
As you can see the only difference with the algorithm above is that we now use a 2D array X to store intermediate results. The consequence on time complexity is huge, since the first algorithm runs in O(2^n), while the second runs in O(n²).
Dynamic programming problems can generally be solved in 2 ways, top down and bottom up.
Bottom up requires building a data structure from the simplest to the most complex case. This is harder to write, but offers the option of throwing away parts of the data that you know you won't need again. Top down requires writing a recursive function, and then memoizing. So bottom up can be more efficient, top down is usually easier to write.
I will show both. The naive approach can be:
def best_game(numbers):
if 0 == len(numbers):
return 0
else:
score_l = numbers[0] - best_game(numbers[1:])
score_r = numbers[-1] - best_game(numbers[0:-1])
return max(score_l, score_r)
But we're passing a lot of redundant data. So let's reorganize it slightly.
def best_game(numbers):
def _best_game(i, j):
if j <= i:
return 0
else:
score_l = numbers[i] - _best_game(i+1, j)
score_r = numbers[j-1] - _best_game(i, j-1)
return max(score_l, score_r)
return _best_game(0, len(numbers))
And now we can add a caching layer to memoize it:
def best_game(numbers):
seen = {}
def _best_game(i, j):
if j <= i:
return 0
elif (i, j) not in seen:
score_l = numbers[i] - _best_game(i+1, j)
score_r = numbers[j-1] - _best_game(i, j-1)
seen[(i, j)] = max(score_l, score_r)
return seen[(i, j)]
return _best_game(0, len(numbers))
This approach will be memory and time O(n^2).
Now bottom up.
def best_game(numbers):
# We start with scores for each 0 length game
# before, after, and between every pair of numbers.
# There are len(numbers)+1 of these, and all scores
# are 0.
scores = [0] * (len(numbers) + 1)
for i in range(len(numbers)):
# We will compute scores for all games of length i+1.
new_scores = []
for j in range(len(numbers) - i):
score_l = numbers[j] - scores[j+1]
score_r = numbers[j+i] - scores[j]
new_scores.append(max(score_l, score_r))
# And now we replace scores by new_scores.
scores = new_scores
return scores[0]
This is again O(n^2) time but only O(n) space. Because after I compute the games of length 1 I can throw away the games of length 0. Of length 2, I can throw away the games of length 1. And so on.

FireHose (S3) from CCC

This grade 11 problem has been bothering me since 2010 and I still can't figure out/find a solution even after university.
Problem Description
There is a very unusual street in your neighbourhood. This street
forms a perfect circle, and the circumference of the circle is
1,000,000. There are H (1 ≤ H ≤ 1000) houses on the street. The
address of each house is the clockwise arc-length from the
northern-most point of the circle. The address of the house at the
northern-most point of the circle is 0. You also have special firehoses
which follow the curve of the street. However, you wish to keep the
length of the longest hose you require to a minimum. Your task is to
place k (1 ≤ k ≤ 1000) fire hydrants on this street so that the maximum
length of hose required to connect a house to a fire hydrant is as
small as possible.
Input Specification
The first line of input will be an integer H, the number of houses. The
next H lines each contain one integer, which is the address of that
particular house, and each house address is at least 0 and less than
1,000,000. On the H + 2nd line is the number k, which is the number of
fire hydrants that can be placed around the circle. Note that a fire
hydrant can be placed at the same position as a house. You may assume
that no two houses are at the same address. Note: at least 40% of the
marks for this question have H ≤ 10.
Output Specification
On one line, output the length of hose required
so that every house can connect to its nearest fire hydrant with that
length of hose.
Sample Input
4
0
67000
68000
77000
2
Output for Sample Input
5000
Link to original question
I can't even come up with a brutal force algorithm since the placement might be float number. For example if the houses are located in 1 and 2, then the hydro should be placed at 1.5 and the distance would be 0.5
Here is quick outline of an answer.
First write a function that can figures out whether you can cover all of the houses with a given maximum length per hydrant. (The maximum hose will be half that length.) It just starts at a house, covers all of the houses it can, jumps to the next, and ditto, and sees whether you stretch. If you fail it tries starting at the next house instead until it has gone around the circle. This will be a O(n^2) function.
Second create a sorted list of the pairwise distances between houses. (You have to consider it going both ways around for a single hydrant, you can only worry about the shorter way if you have 2+ hydrants.) The length covered by a hydrant will be one of those. This takes O(n^2 log(n)).
Now do a binary search to find the shortest length that can cover all of the houses. This will require O(log(n)) calls to the O(n^2) function that you wrote in the first step.
The end result is a O(n^2 log(n)) algorithm.
And here is working code for all but the parsing logic.
#! /usr/bin/env python
def _find_hoses_needed (circle_length, hose_span, houses):
# We assume that houses is sorted.
answers = [] # We can always get away with one hydrant per house.
for start in range(len(houses)):
needed = 1
last_begin = start
current_house = start + 1 if start + 1 < len(houses) else 0
while current_house != start:
pos_begin = houses[last_begin]
pos_end = houses[current_house]
length = pos_end - pos_begin if pos_begin <= pos_end else circle_length + pos_begin - pos_end
if hose_span < length:
# We need a new hose.
needed = needed + 1
last_begin = current_house
current_house = current_house + 1
if len(houses) <= current_house:
# We looped around the circle.
current_house = 0
answers.append(needed)
return min(answers)
def find_min_hose_coverage (circle_length, hydrant_count, houses):
houses = sorted(houses)
# First we find all of the possible answers.
is_length = set()
for i in range(len(houses)):
for j in range(i, len(houses)):
is_length.add(houses[j] - houses[i])
is_length.add(houses[i] - houses[j] + circle_length)
possible_answers = sorted(is_length)
# Now we do a binary search.
lower = 0
upper = len(possible_answers) - 1
while lower < upper:
mid = (lower + upper) / 2 # Note, we lose the fraction here.
if hydrant_count < _find_hoses_needed(circle_length, possible_answers[mid], houses):
# We need a strictly longer coverage to make it.
lower = mid + 1
else:
# Longer is not needed
upper = mid
return possible_answers[lower]
print(find_min_hose_coverage(1000000, 2, [0, 67000, 68000, 77000])/2.0)

Towers of Hanoi - Bellman equation solution

I have to implement an algorithm that solves the Towers of Hanoi game for k pods and d rings in a limited number of moves (let's say 4 pods, 10 rings, 50 moves for example) using Bellman dynamic programming equation (if the problem is solvable of course).
Now, I understand the logic behind the equation:
where V^T is the objective function at time T, a^0 is the action at time 0, x^0 is the starting configuration, H_0 is cumulative gain f(x^0, a^0)=x^1.
The cardinality of the state space is $k^d$ and I get that a good representation for a state is a number in base k: d digits that can go from 0 to k-1. Each digit represents a ring and the digit can go from 0 to k-1, that are the labels of the k rings.
I want to minimize the number of moves for going from the initial configuration (10 rings on the first pod) to the end one (10 rings on the last pod).
What I don't get is: how do I write my objective function?
The first you need to do is choose a reward function H_t(s,a) which will define you goal. Once this function is chosen, the (optimal) value function is defined and all you have to do is compute it.
The idea of dynamic programming for the Bellman equation is that you should compute V_t(s) bottom-up: you start with t=T, then t=T-1 and so on until t=0.
The initial case is simply given by:
V_T(s) = 0, ∀s
You can compute V_{T-1}(x) ∀x from V_T:
V_{T-1}(x) = max_a [ H_{T-1}(x,a) ]
Then you can compute V_{T-2}(x) ∀s from V_{T-1}:
V_{T-2}(x) = max_a [ H_{T-2}(x,a) + V_{T-1}(f(x,a)) ]
And you keep on computing V_{t-1}(x) ∀s from V_{t}:
V_{t-1}(x) = max_a [ H_{t-1}(x,a) + V_{t}(f(x,a)) ]
until you reach V_0.
Which gives the algorithm:
forall x:
V[T](x) ← 0
for t from T-1 to 0:
forall x:
V[t](x) ← max_a { H[t](x,a) + V[t-1](f(x,a)) }
What actually was requested was this:
def k_hanoi(npods,nrings):
if nrings == 1 and npods > 1: #one remaining ring: just one move
return 1
if npods == 3:
return 2**nrings - 1 #optimal solution with 3 pods take 2^d -1 moves
if npods > 3 and nrings > 0:
sol = []
for pivot in xrange(1, nrings): #loop on all possible pivots
sol.append(2*k_hanoi(npods, pivot)+k_hanoi(npods-1, nrings-pivot))
return min(sol) #minimization on pivot
k = 4
d = 10
print k_hanoi(k, d)
I think it is the Frame algorithm, with optimization on the pivot chosen to divide the disks in two subgroups. I also think someone demonstrated this is optimal for 4 pegs (in 2014 or something like that? Not sure btw) and conjectured to be optimal for more than 4 pegs. The limitation on the number of moves can be implemented easily.
The value function in this case was the number of steps needed to go from the initial configuration to the ending one and it needed be minimized. Thank you all for the contribution.

How to efficiently detect a tie early in m,n,k-game (generalized tic-tac-toe)?

I'm implementing an m,n,k-game, a generalized version of tic-tac-toe, where m is the number of rows, n is the number of columns and k is the number of pieces that a player needs to put in a row to win. I have implemented a check for a win, but I haven't figured out a satisfactory way to check before the board is full of pieces, if no player can win the game. In other words, there might be empty slots on the board, but they cannot be filled in such a way that one player would win.
My question is, how to check this efficiently? The following algorithm is the best that I can think of. It checks for two conditions:
A. Go over all board positions in all 4 directions (top to bottom, right to left, and both diagonal directions). If say k = 5, and 4 (= k-1) consecutive empty slots are found, stop checking and report "no tie". This doesn't take into account for example the following situation:
OX----XO (Example 1)
where a) there are 4 empty consecutive slots (-) somewhere between two X's, b) next it is O's turn, c) there are less than four other empty positions on the board and no player can win by putting pieces to those, and d) it is not possible to win in any other direction than horizontally in the shown slots either. Now we know that it is a tie because O will eventually block the last winning possibility, but erroneously it is not reported yet because there are four consecutive empty slots. That would be ok (but not great). Checking this condition gives a good speed-up at the beginning when the checking algorithm usually finds such a case early, but it gets slower as more pieces are put on the board.
B. If this k-1-consecutive-empty-slots-condition isn't met, the algorithm would check the slots again consecutively in all 4 directions. Suppose we are currently checking from left to right. If at some point an X is encountered and it was preceded by an O or - (empty slot) or a board border, then start counting the number of consecutive X's and empty slots, counting in this first encountered X. If one can count to 5, then one knows it is possible for X to win, and "no tie" is reported. If an O preceded by an X is encountered before 5 consecutive X's, then X cannot win in those 5 slots from left to right starting from where we started counting. For example:
X-XXO (Example 2)
12345
Here we started checking at position 1, counted to 4, and encountered an O. In this case, one would continue from the encountered O in the same way, trying to find 5 consecutive O's or empty slots this time. In another case when counting X's or empty slots, an O preceded by one or more empty slots is encountered, before counting to 5. For example:
X-X-O (Example 3)
12345
In this case we would again continue from the O at position 5, but add to the new counter (of consecutive O's or empty slots) the number of consecutive empty slots that preceded O, here 1, so that we wouldn't miss for example this possible winning position:
X-X-O---X (Example 4)
In this way, in the worst case, one would have to go through all positions 4 times (4 directions, and of course diagonals whose length is less than k can be skipped), giving running time O(mn).
The best way I could think of was doing these two described checks, A and B, in one pass. If the checking algorithm gets through all positions in all directions without reporting "no tie", it reports a tie.
Knowing that you can check a win just by checking in the vicinity of the last piece that was added with running time O(k), I was wondering if there were quicker ways to do an early check for a tie. Doesn't have to be asymptotically quicker. I'm currently keeping the pieces in a two-dimensional array. Is there maybe a data structure that would allow an efficient check? One approach: what is the highest threshold of moves that one can wait the players to make before running any checks for a tie at all?
There are many related questions at Stack Overflow, for example this, but all discussions I could find either only pointed out the obvious tie condition, where the number of moves made is equal to the size of the board (or they checked if the board is full), or handled only the special case where the board is square: m = n. For example this answer claims to do the check for a tie in constant time, but only works when m = n = k. I'm interested in reporting the tie as early as possible and for general m,n and k. Also if the algorithm works for more than two players, that would be neat.
I would reduce the problem of determining a tie to the easier sub-problem:
Can player X still win?
If the answer is 'no' for all players, it is a tie.
To find out whether Player X can win:
fill all blank spaces with virtual 'X'-pieces
are there k 'X'-pieces in a row anywhere?
if there are not --> Player X cannot win. return false.
if there are, find the row of k stones with the least amount of virtual pieces. Count the number of virtual pieces in it.
count the number of moves player X has left, alternating with all other players, until the board is completely full.
if the number of moves is less than the amount of virtual pieces required to win, player X cannot win. return false.
otherwise, player X can still win. return true.
(This algorithm will report a possible win for player X even in cases where the only winning moves for X would have another player win first, but that is ok, since that would not be a tie either)
If, as you said, you can check a win just by checking in the vicinity of the last piece that was added with running time O(k), then I think you can run the above algorithm in O(k * Number_of_empty_spots): Add all virtual X-Piece, note any winning combinations in the vicinity of the added pieces.
The number of empty slots can be large, but as long as there is at least one empty row of size k and player X has still k moves left until the board is filled, you can be sure that player X can still win, so you do not need to run the full check.
This should work with any number of players.
Actually the constant time solution you referenced only works when k = m = n as well. If k is smaller then I don't see any way to adapt the solution to get constant time, basically because there are multiple locations on each row/column/diagonal where a winning consecutive k 0's or 1's may occur.
However, maintaining auxiliary information for each row/column/diagonal can give a speed up. For each row/column/diagonal, you can store the start and end locations for consecutive occurrences of 1's and blanks as possible winning positions for player 1, and similarly store start and end locations of consecutive occurrences of 0's and blanks as possible winning positions for player 0. Note that for a given row/column/diagonal, intervals for player 0 and 1 may overlap if they contain blanks. For each row/column/diagonal, store the intervals for player 1 in sorted order in a self-balancing binary tree (Note you can do this because the intervals are disjoint). Similarly store the intervals for player 0 sorted in a tree. When a player makes a move, find the row/column/diagonals that contain the move location and update the intervals containing the move in the appropriate row column and diagonal trees for the player that did not make the move. For the player that did not make a move, this will split an interval (if it exists) into smaller intervals that you can replace the old interval with and then rebalance the tree. If an interval ever gets to length less than k you can delete it. If a tree ever becomes empty then it is impossible for that player to win in that row/column/diagonal. You can maintain a counter of how many rows/columns/diagonals are impossible to win for each player, and if the counter ever reaches the total number of rows/columns/diagonals for both players then you know you have a tie. The total running time for this is O(log(n/k) + log(m/k)) to check for a tie per move, with O(mn/k) extra space.
You can similarly maintain trees that store consecutive intervals of 1's (without spaces) and update the trees in O(log n + log m) time when a move is made, basically searching for the positions before and after the move in your tree and updating the interval(s) found and merging two intervals if two intervals (before and after) are found. Then you report a win if an interval is ever created/updated and obtains length greater than or equal to k. Similarly for player 0. Total time to check for a win is O(log n + log m) which may be better than O(k) depending on how large k is. Extra space is O(mn).
Let's look at one row (or column or diagonal, it doesn't matter) and count the number of winning lines of length k ("k-line") it's possible to make, at each place in the row, for player X. This solution will keep track of that number over the course of the game, checking fulfillment of the winning condition on each move as well as detecting a tie.
1 2 3... k k k k... 3 2 1
There is one k-line including an X in the leftmost slot, two with the second slot from the left, and so on. If an opposing player, O or otherwise, plays in this row, we can reduce the k-line possibility counts for player X in O(k) time at the time of the move. (The logic for this step should be straightforward after doing an example, needing no other data structure, but any method involving checking each of the k rows of k from will do. Going left to right, only k operations on the counts is needed.) An enemy piece should set the possibility count to -1.
Then, a detectably tied game is one where no cell has a non-zero k-line possibility count for any player. It's easy to check this by keeping track of the index of the first non-zero cell. Maintaining the structure amounts to O(k*players) work on each move. The number of empty slots is less than those filled, for positions that might be tied, so the other answers are good for checking a position in isolation. However, at least for reasonably small numbers of players, this problem is intimately linked with checking the winning condition in the first place, which at minimum you must do, O(k), on every move. Depending on your game engine there may be a better structure that is rich enough to find good moves as well as detect ties. But the possibility counting structure has the nice property that you can check for a win whilst updating it.
If space isn't an issue, I had this idea:
For each player maintain a structure sized (2mn + (1 - k)(m + n) + 2(m - k + 1)(n - k + 1) + 2(sum 1 to (m - k))) where each value represents if one of another player's moves are in one distinct k-sized interval. For example for a 8-8-4 game, one element in the structure could represent row 1, cell 0 to 3; another row 1, cell 1 to 4; etc.
In addition, one variable per player will represent how many elements in their structure are still unset. Only one move is required to set an element, showing that that k-interval can no longer be used to win.
An update of between O(k) and O(4k) time per player seems needed per move. A tie is detected when the number of players exceeds the number of different elements unset.
Using bitsets, the number of bytes needed for each player's structure would be the structure size divided by 8. Notice that when k=m=n, the structure size is 4*k and update time O(4). Less than half a megabyte per player would be needed for a 1000,1000,5 game.
Below is a JavaScript example.
var m = 1000, n = 1000, k = 5, numberOfPlayers = 2
, numberOfHorizontalKIs = m * Math.max(n - k + 1,0)
, numberOfverticalKIs = n * Math.max(m - k + 1,0)
, horizontalVerticalKIArraySize = Math.ceil((numberOfHorizontalKIs + numberOfverticalKIs)/31)
, horizontalAndVerticalKIs = Array(horizontalVerticalKIArraySize)
, numberOfUnsetKIs = horizontalAndVerticalKIs
, upToM = Math.max(0,m - k) // southwest diagonals up to position m
, upToMSum = upToM * (upToM + 1) / 2
, numberOfSouthwestKIs = 2 * upToMSum //sum is multiplied by 2 to account for bottom-right-corner diagonals
+ Math.max(0,n - m + 1) * (m - k + 1)
, diagonalKIArraySize = Math.ceil(2 * numberOfSouthwestKIs/31)
, diagonalKIs = Array(diagonalKIArraySize)
, numberOfUnsetKIs = 2 * numberOfSouthwestKIs + numberOfHorizontalKIs + numberOfverticalKIs
function checkTie(move){
var row = move[0], column = move[1]
//horizontal and vertical
for (var rotate=0; rotate<2; rotate++){
var offset = Math.max(k - n + column, 0)
column -= offset
var index = rotate * numberOfHorizontalKIs + (n - k + 1) * row + column
, count = 0
while (column >= 0 && count < k - offset){
var KIArrayIndex = Math.floor(index / 31)
, bitToSet = 1 << index % 31
if (!(horizontalAndVerticalKIs[KIArrayIndex] & bitToSet)){
horizontalAndVerticalKIs[KIArrayIndex] |= bitToSet
numberOfUnsetKIs--
}
index--
column--
count++
}
//rotate board to log vertical KIs
var mTmp = m
m = n
n = mTmp
row = move[1]
column = move[0]
count = 0
}
//rotate board back
mTmp = m
m = n
n = mTmp
// diagonals
for (var rotate=0; rotate<2; rotate++){
var diagonalTopColumn = column + row
if (diagonalTopColumn < k - 1 || diagonalTopColumn >= n + m - k){
continue
} else {
var offset = Math.max(k - m + row, 0)
row -= offset
column += offset
var dBeforeM = Math.min (diagonalTopColumn - k + 1,m - k)
, dAfterM = n + m - k - diagonalTopColumn
, index = dBeforeM * (dBeforeM + 1) / 2
+ (m - k + 1) * Math.max (Math.min(diagonalTopColumn,n) - m + 1,0)
+ (diagonalTopColumn < n ? 0 : upToMSum - dAfterM * (dAfterM + 1) / 2)
+ (diagonalTopColumn < n ? row : n - 1 - column)
+ rotate * numberOfSouthwestKIs
, count = 0
while (row >= 0 && column < n && count < k - offset){
var KIArrayIndex = Math.floor(index / 31)
, bitToSet = 1 << index % 31
if (!(diagonalKIs[KIArrayIndex] & bitToSet)){
diagonalKIs[KIArrayIndex] |= bitToSet
numberOfUnsetKIs--
}
index--
row--
column++
count++
}
}
//mirror board
column = n - 1 - column
}
if (numberOfUnsetKIs < 1){
return "This player cannot win."
} else {
return "No tie."
}
}

Removal of billboards from given ones

I came across this question
ADZEN is a very popular advertising firm in your city. In every road
you can see their advertising billboards. Recently they are facing a
serious challenge , MG Road the most used and beautiful road in your
city has been almost filled by the billboards and this is having a
negative effect on
the natural view.
On people's demand ADZEN has decided to remove some of the billboards
in such a way that there are no more than K billboards standing together
in any part of the road.
You may assume the MG Road to be a straight line with N billboards.Initially there is no gap between any two adjecent
billboards.
ADZEN's primary income comes from these billboards so the billboard removing process has to be done in such a way that the
billboards
remaining at end should give maximum possible profit among all possible final configurations.Total profit of a configuration is the
sum of the profit values of all billboards present in that
configuration.
Given N,K and the profit value of each of the N billboards, output the maximum profit that can be obtained from the remaining
billboards under the conditions given.
Input description
1st line contain two space seperated integers N and K. Then follow N lines describing the profit value of each billboard i.e ith
line contains the profit value of ith billboard.
Sample Input
6 2
1
2
3
1
6
10
Sample Output
21
Explanation
In given input there are 6 billboards and after the process no more than 2 should be together. So remove 1st and 4th
billboards giving a configuration _ 2 3 _ 6 10 having a profit of 21.
No other configuration has a profit more than 21.So the answer is 21.
Constraints
1 <= N <= 1,00,000(10^5)
1 <= K <= N
0 <= profit value of any billboard <= 2,000,000,000(2*10^9)
I think that we have to select minimum cost board in first k+1 boards and then repeat the same untill last,but this was not giving correct answer
for all cases.
i tried upto my knowledge,but unable to find solution.
if any one got idea please kindly share your thougths.
It's a typical DP problem. Lets say that P(n,k) is the maximum profit of having k billboards up to the position n on the road. Then you have following formula:
P(n,k) = max(P(n-1,k), P(n-1,k-1) + C(n))
P(i,0) = 0 for i = 0..n
Where c(n) is the profit from putting the nth billboard on the road. Using that formula to calculate P(n, k) bottom up you'll get the solution in O(nk) time.
I'll leave up to you to figure out why that formula holds.
edit
Dang, I misread the question.
It still is a DP problem, just the formula is different. Let's say that P(v,i) means the maximum profit at point v where last cluster of billboards has size i.
Then P(v,i) can be described using following formulas:
P(v,i) = P(v-1,i-1) + C(v) if i > 0
P(v,0) = max(P(v-1,i) for i = 0..min(k, v))
P(0,0) = 0
You need to find max(P(n,i) for i = 0..k)).
This problem is one of the challenges posted in www.interviewstreet.com ...
I'm happy to say I got this down recently, but not quite satisfied and wanted to see if there's a better method out there.
soulcheck's DP solution above is straightforward, but won't be able to solve this completely due to the fact that K can be as big as N, meaning the DP complexity will be O(NK) for both runtime and space.
Another solution is to do branch-and-bound, keeping track the best sum so far, and prune the recursion if at some level, that is, if currSumSoFar + SUM(a[currIndex..n)) <= bestSumSoFar ... then exit the function immediately, no point of processing further when the upper-bound won't beat best sum so far.
The branch-and-bound above got accepted by the tester for all but 2 test-cases.
Fortunately, I noticed that the 2 test-cases are using small K (in my case, K < 300), so the DP technique of O(NK) suffices.
soulcheck's (second) DP solution is correct in principle. There are two improvements you can make using these observations:
1) It is unnecessary to allocate the entire DP table. You only ever look at two rows at a time.
2) For each row (the v in P(v, i)), you are only interested in the i's which most increase the max value, which is one more than each i that held the max value in the previous row. Also, i = 1, otherwise you never consider blanks.
I coded it in c++ using DP in O(nlogk).
Idea is to maintain a multiset with next k values for a given position. This multiset will typically have k values in mid processing. Each time you move an element and push new one. Art is how to maintain this list to have the profit[i] + answer[i+2]. More details on set:
/*
* Observation 1: ith state depends on next k states i+2....i+2+k
* We maximize across this states added on them "accumulative" sum
*
* Let Say we have list of numbers of state i+1, that is list of {profit + state solution}, How to get states if ith solution
*
* Say we have following data k = 3
*
* Indices: 0 1 2 3 4
* Profits: 1 3 2 4 2
* Solution: ? ? 5 3 1
*
* Answer for [1] = max(3+3, 5+1, 9+0) = 9
*
* Indices: 0 1 2 3 4
* Profits: 1 3 2 4 2
* Solution: ? 9 5 3 1
*
* Let's find answer for [0], using set of [1].
*
* First, last entry should be removed. then we have (3+3, 5+1)
*
* Now we should add 1+5, but entries should be incremented with 1
* (1+5, 4+3, 6+1) -> then find max.
*
* Could we do it in other way but instead of processing list. Yes, we simply add 1 to all elements
*
* answer is same as: 1 + max(1-1+5, 3+3, 5+1)
*
*/
ll dp()
{
multiset<ll, greater<ll> > set;
mem[n-1] = profit[n-1];
ll sumSoFar = 0;
lpd(i, n-2, 0)
{
if(sz(set) == k)
set.erase(set.find(added[i+k]));
if(i+2 < n)
{
added[i] = mem[i+2] - sumSoFar;
set.insert(added[i]);
sumSoFar += profit[i];
}
if(n-i <= k)
mem[i] = profit[i] + mem[i+1];
else
mem[i] = max(mem[i+1], *set.begin()+sumSoFar);
}
return mem[0];
}
This looks like a linear programming problem. This problem would be linear, but for the requirement that no more than K adjacent billboards may remain.
See wikipedia for a general treatment: http://en.wikipedia.org/wiki/Linear_programming
Visit your university library to find a good textbook on the subject.
There are many, many libraries to assist with linear programming, so I suggest you do not attempt to code an algorithm from scratch. Here is a list relevant to Python: http://wiki.python.org/moin/NumericAndScientific/Libraries
Let P[i] (where i=1..n) be the maximum profit for billboards 1..i IF WE REMOVE billboard i. It is trivial to calculate the answer knowing all P[i]. The baseline algorithm for calculating P[i] is as follows:
for i=1,N
{
P[i]=-infinity;
for j = max(1,i-k-1)..i-1
{
P[i] = max( P[i], P[j] + C[j+1]+..+C[i-1] );
}
}
Now the idea that allows us to speed things up. Let's say we have two different valid configurations of billboards 1 through i only, let's call these configurations X1 and X2. If billboard i is removed in configuration X1 and profit(X1) >= profit(X2) then we should always prefer configuration X1 for billboards 1..i (by profit() I meant the profit from billboards 1..i only, regardless of configuration for i+1..n). This is as important as it is obvious.
We introduce a doubly-linked list of tuples {idx,d}: {{idx1,d1}, {idx2,d2}, ..., {idxN,dN}}.
p->idx is index of the last billboard removed. p->idx is increasing as we go through the list: p->idx < p->next->idx
p->d is the sum of elements (C[p->idx]+C[p->idx+1]+..+C[p->next->idx-1]) if p is not the last element in the list. Otherwise it is the sum of elements up to the current position minus one: (C[p->idx]+C[p->idx+1]+..+C[i-1]).
Here is the algorithm:
P[1] = 0;
list.AddToEnd( {idx=0, d=C[0]} );
// sum of elements starting from the index at top of the list
sum = C[0]; // C[list->begin()->idx]+C[list->begin()->idx+1]+...+C[i-1]
for i=2..N
{
if( i - list->begin()->idx > k + 1 ) // the head of the list is "too far"
{
sum = sum - list->begin()->d
list.RemoveNodeFromBeginning()
}
// At this point the list should containt at least the element
// added on the previous iteration. Calculating P[i].
P[i] = P[list.begin()->idx] + sum
// Updating list.end()->d and removing "unnecessary nodes"
// based on the criterion described above
list.end()->d = list.end()->d + C[i]
while(
(list is not empty) AND
(P[i] >= P[list.end()->idx] + list.end()->d - C[list.end()->idx]) )
{
if( list.size() > 1 )
{
list.end()->prev->d += list.end()->d
}
list.RemoveNodeFromEnd();
}
list.AddToEnd( {idx=i, d=C[i]} );
sum = sum + C[i]
}
//shivi..coding is adictive!!
#include<stdio.h>
long long int arr[100001];
long long int sum[100001];
long long int including[100001],excluding[100001];
long long int maxim(long long int a,long long int b)
{if(a>b) return a;return b;}
int main()
{
int N,K;
scanf("%d%d",&N,&K);
for(int i=0;i<N;++i)scanf("%lld",&arr[i]);
sum[0]=arr[0];
including[0]=sum[0];
excluding[0]=sum[0];
for(int i=1;i<K;++i)
{
sum[i]+=sum[i-1]+arr[i];
including[i]=sum[i];
excluding[i]=sum[i];
}
long long int maxi=0,temp=0;
for(int i=K;i<N;++i)
{
sum[i]+=sum[i-1]+arr[i];
for(int j=1;j<=K;++j)
{
temp=sum[i]-sum[i-j];
if(i-j-1>=0)
temp+=including[i-j-1];
if(temp>maxi)maxi=temp;
}
including[i]=maxi;
excluding[i]=including[i-1];
}
printf("%lld",maxim(including[N-1],excluding[N-1]));
}
//here is the code...passing all but 1 test case :) comment improvements...simple DP

Resources