Maximize sum of two numbers plus distance between them - algorithm

We are given a square matrix of numbers, e.g.
1 9 2
3 8 3
2 1 1
The distance between adjacent numbers is 2. We want to find two numbers, in the same row or in the same column, such that their sum plus the distance between them is maximal. In the example above, those numbers are 9 and 8, and the maximum result is 9+8+1*2 = 19. We only want the maximal result; we don't need to know which specific numbers produce it.
That looks like a DP problem to me, but I can't think of any elegant solution.

One can solve the 1D problem (that is, given a list of numbers, find the pair which maximizes sum+distance) using dynamic programming.
bi = 0
best = -10**9  # anything large and negative
for i in range(1, n):  # a is a 0-indexed list with n elements
    best = max(best, a[i] + a[bi] + (i - bi)*2)
    if a[i] - i*2 > a[bi] - bi*2:
        bi = i
After this code finishes, best will store the maximum sum + distance of any pair of numbers in the list. It works because at any given loop iteration of i, bi stores the index of the value at index less than i that maximizes its value minus twice its index. One can observe that the number at this index is the best number (to the left of i) to pair the number at i with.
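The observation behind bi can be written out as a one-line derivation. This is only a sketch; the index j for the left-hand element is notation introduced here, not taken from the code above.

```latex
\max_{j<i}\bigl(a_i + a_j + 2(i-j)\bigr)
  \;=\; (a_i + 2i) \;+\; \max_{j<i}\,(a_j - 2j)
```

The first term does not depend on j, so the best partner for index i is whichever earlier index maximizes a_j - 2j, which is the quantity the loop compares when it updates bi.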
Once you have this, the 2D problem is straightforward: go through each row and column and apply the 1D algorithm, and return the maximum pair found. Overall for an n by n matrix, this runs in O(n^2) time, which is clearly asymptotically optimal since every element in the matrix needs to be read at least once.
Here is working Python3 code:
def max_sum_dist_1D(a):
    bi = 0
    best = -10**9
    for i in range(1, len(a)):
        best = max(best, a[i] + a[bi] + (i - bi)*2)
        if a[i] - i*2 > a[bi] - bi*2:
            bi = i
    return best

def max_sum_dist_2D(M):
    best_row = max(max_sum_dist_1D(row) for row in M)
    best_col = max(max_sum_dist_1D(col) for col in zip(*M))
    return max(best_row, best_col)

M = [[1, 9, 2], [3, 8, 3], [2, 1, 1]]
print(max_sum_dist_2D(M))

Related

Coin change(Dynamic programming)

I have a question about the coin change problem where we have to print not only the number of ways to change $n with the given coin denominations, e.g. {1,5,10,25}, but also the ways themselves.
For example, if the target = $50 and the coins are {1,5,10,25}, then the ways to use the coins to get the target are
2 × $25
1 × $25 + 2 × $10 + 1 × $5
etc.
What is the best time complexity we could get to solve this problem?
I tried to modify the dynamic programming solution for the coin change problem, where we only need the number of ways but not the actual ways.
I am having trouble figuring out the time complexity.
I do use memoization so that I don't have to solve the same problem again for a given coin and sum value, but we still need to iterate through all the solutions and print them. So the time complexity is definitely more than O(ns), where n is the number of coins and s is the target.
Is it exponential? Any help will be much appreciated.
Printing Combinations
def coin_change_solutions(coins, S):
    # create an (S + 1) x (N + 1) table for memoization
    N = len(coins)
    sols = [[[] for n in range(N + 1)] for s in range(S + 1)]
    for n in range(0, N + 1):
        sols[0][n].append([])
    # fill table using bottom-up dynamic programming
    for s in range(1, S+1):
        for n in range(1, N+1):
            without_last = sols[s][n - 1]
            if (coins[n - 1] <= s):
                with_last = [list(sol) + [coins[n-1]] for sol in sols[s - coins[n - 1]][n]]
            else:
                with_last = []
            sols[s][n] = without_last + with_last
    return sols[S][N]

print(coin_change_solutions([1,2], 4))
# => [[1, 1, 1, 1], [1, 1, 2], [2, 2]]
without: we don't need to use the last coin to make the sum. All the coin solutions are found directly by recursing to sols[s][n - 1]. We take all those coin combinations and copy them to without_last_sols.
with: we do need to use the last coin. So that coin must be in our solution. The remaining coins are found recursively via sols[s - coins[n - 1]][n]. Reading this entry will give us many possible choices for what the remaining coins should be. For each possible choice, sol, we append the last coin, coins[n - 1]:
# For example, suppose target is s = 4
# We're finding solutions that use the last coin.
# Suppose the last coin has a value of 2:
#
# find possible combinations that add up to 4 - 2 = 2:
# ===> [[1,1], [2]]
# then for each combination, add the last coin
# (so that the combination adds up to 4)
# ===> [[1,1,2], [2,2]]
The final list of combinations is found by taking the combinations for the first case and the second case and concatenating the two lists.
without_last_sols = [[1,1,1,1]]
with_last_sols = [[1,1,2], [2,2]]
without_last_sols + with_last_sols = [[1,1,1,1], [1,1,2], [2,2]]
Time Complexity
In the worst case we have a coin set with all coins from 1 to n: coins = [1,2,3,4,...,n]. For n ≥ s, the number of possible coin sum combinations, num solutions, is equal to the number of integer partitions of s, p(s).
It can be shown that the number of integer partitions p(s) grows faster than any polynomial in s: asymptotically p(s) ~ exp(π·sqrt(2s/3)) / (4s·sqrt(3)), which is still bounded above by 2^s.
Hence num solutions = p(s) = O(2^s). Any solution needs at least p(s) steps just to print out all these possible solutions. Hence the problem is super-polynomial in nature.
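To get a feel for how quickly the number of solutions grows in this worst case, here is a small sketch that only counts the combinations for coins = [1, 2, ..., s] instead of listing them. The function name count_partitions is made up for this illustration.

```python
def count_partitions(s):
    """Count the ways to write s as a sum of coins 1..s, ignoring order."""
    ways = [1] + [0] * s              # ways[t] = number of combinations summing to t
    for coin in range(1, s + 1):      # introduce one denomination at a time
        for t in range(coin, s + 1):
            ways[t] += ways[t - coin]
    return ways[s]

for s in (10, 20, 50, 100):
    print(s, count_partitions(s), 2 ** s)
```

The counts rise far faster than any fixed power of s, yet stay well below 2^s, so any program that prints every combination has to do at least that much work.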
We have two loops: one loop for s and the other loop for n.
For each s and n, we compute sols[s][n]:
without: We look at all O(2^s) combinations in sols[s][n - 1] and add them to the new list. A combination holds at most s coins (when only coins of value 1 are used), so handling them takes at most O(s×2^s) time.
with: We look at the O(2^s) combinations in sols[s - coins[n - 1]][n]. For each combination list sol, we create a copy of that list in O(s) time and then append the last coin. Overall this case takes O(s×2^s).
Hence the time complexity is O(s×n)×O(s2^s + s2^s) = O(s^2×n×2^s).
Space Complexity
The space complexity is O(s^2×n×2^s) because we have an s×n table with each entry storing O(2^s) possible combinations (e.g. [[1, 1, 1, 1], [1, 1, 2], [2, 2]]), with each combination (e.g. [1,1,1,1]) taking O(s) space.
What I tend to do is solve the problem recursively and then build a memoized solution from there.
Starting with the recursive one, the approach is simple: either pick a coin and subtract it from the target, or don't pick a coin.
When you pick a coin you add it to a vector or list; when you don't pick one you pop the one you added before. The code looks something like:
#include <iostream>
#include <vector>
using namespace std;

void print(vector<int>& coinsUsed)
{
    for (auto c : coinsUsed)
    {
        cout << c << ",";
    }
    cout << endl;
}

int helper(vector<int>& coins, int target, int index, vector<int>& coinsUsed)
{
    if (index >= (int)coins.size() || target < 0) return 0;
    if (target == 0)
    {
        print(coinsUsed);
        return 1;
    }
    // Use coins[index] (and stay on the same index so it can be reused).
    coinsUsed.push_back(coins[index]);
    int with = helper(coins, target - coins[index], index, coinsUsed);
    coinsUsed.pop_back();
    // Skip coins[index] and move on to the next denomination.
    int without = helper(coins, target, index + 1, coinsUsed);
    return with + without;
}

int coinChange(vector<int>& coins, int target)
{
    vector<int> coinsUsed;
    return helper(coins, target, 0, coinsUsed);
}
You can call it like:
vector<int> coins = {1,5,10,25};
cout << "Total Ways:" << coinChange(coins, 10);
This gives you the total number of ways and also prints the coins used to reach the target, which are tracked in coinsUsed. You can now memoize this as you please by storing the passed-in values in a cache.
The time complexity of the recursive solution is exponential.
link to the running program: http://coliru.stacked-crooked.com/a/5ef0ed76b7a496fe
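As a rough illustration of the memoization step, here is a Python sketch of the same with/without recursion that caches on (remaining, index). The names count_ways and helper are chosen for this example. It only counts the ways, on the assumption that the count is all you need from the cache: printing every combination still requires visiting each one.

```python
from functools import lru_cache

def count_ways(coins, target):
    """Count combinations of coins (reuse allowed) that sum to target."""
    coins = tuple(coins)

    @lru_cache(maxsize=None)
    def helper(remaining, index):
        if remaining == 0:
            return 1                      # found one valid combination
        if remaining < 0 or index >= len(coins):
            return 0                      # overshot, or no denominations left
        with_coin = helper(remaining - coins[index], index)
        without_coin = helper(remaining, index + 1)
        return with_coin + without_coin

    return helper(target, 0)

print(count_ways([1, 5, 10, 25], 10))
```

With the cache there are at most (target + 1) × (number of coins + 1) distinct non-negative states, so the counting version runs in polynomial time.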
Let d_i be a denomination, the value of a coin in cents. In your example d_i = {1, 5, 10, 25}.
Let k be the number of denominations (coins), here k = 4.
We will use a 2D array numberOfCoins[1..k][0..n] to determine the minimum number of coins required to make change. The optimal solution is given by:
numberOfCoins[i][j] = min(numberOfCoins[i − 1][j], numberOfCoins[i][j − d_i] + 1)
The equation above represents the fact that to build an optimal solution we either do not use d_i, so we need to use a smaller coin (this is why i is decremented below):
numberOfCoins[i][j] = numberOfCoins[i − 1][j] // eq1
or we use d_i, so we add +1 to the number of coins needed and we decrement by d_i (the value of the coin we just used):
numberOfCoins[i][j] = numberOfCoins[i][j − d_i] + 1 // eq2
The time complexity is O(kn) but in cases where k is small, as is the case in your example, we have O(4n) = O(n).
We will use another 2D array, coinUsed, having the same dimensions as numberOfCoins, to mark which coins were used. Each entry coinUsed[i][j] either tells us that we did not use the coin, by setting a "^" in that position (this corresponds to eq1), or marks that the coin was used, by setting a "<" in that position (corresponding to eq2).
Both arrays can be built as the algorithm is working. We will only have a constant number of extra instructions in the inner loop, therefore the time complexity of building both arrays is still O(kn).
To print the solution we need to iterate, in the worst case, over k + n+1 elements, for example when the optimal solution uses only 1 cent denominations. But note that printing is done after building, so the overall time complexity is O(kn) + O(k + n+1). As before, if k is small the complexity is O(kn) + O(k + n+1) = O(kn) + O(n+1) = O(kn) + O(n) = O((k+1)n) = O(n).
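The two tables and the traceback can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name min_coin_change, the extra row 0 meaning "no denominations available", and the use of None for "no solution" are all assumptions added here.

```python
def min_coin_change(denoms, n):
    """Return a list of coins of minimum length summing to n, or None."""
    k = len(denoms)
    INF = float("inf")
    # Row 0 means "no denominations available" (an extra row added for this sketch).
    number_of_coins = [[INF] * (n + 1) for _ in range(k + 1)]
    coin_used = [["^"] * (n + 1) for _ in range(k + 1)]
    for i in range(k + 1):
        number_of_coins[i][0] = 0              # zero coins make amount 0
    for i in range(1, k + 1):
        d = denoms[i - 1]
        for j in range(1, n + 1):
            number_of_coins[i][j] = number_of_coins[i - 1][j]       # eq1: skip d_i
            if j >= d and number_of_coins[i][j - d] + 1 < number_of_coins[i][j]:
                number_of_coins[i][j] = number_of_coins[i][j - d] + 1   # eq2: use d_i
                coin_used[i][j] = "<"
    if number_of_coins[k][n] == INF:
        return None
    # Trace back: "<" means take the coin and move left, "^" means move up a row.
    coins, i, j = [], k, n
    while j > 0:
        if coin_used[i][j] == "<":
            coins.append(denoms[i - 1])
            j -= denoms[i - 1]
        else:
            i -= 1
    return coins

print(min_coin_change([1, 5, 10, 25], 30))
```

The traceback visits at most k + n cells, which matches the printing cost discussed above.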

Evenly pruning an ordered set

Given an ordered set of n elements (a1, a2, ..., an), what algorithm can I use to pick M of these elements (M < n) such that the sampling from the original set is as spread out as possible?
For example, if the starting set is the numbers from 1 to 9 (i.e. n=9), and I want to evenly sample it so I end up with only 5 of those numbers (i.e. M=5), I'd select 1, 3, 5, 7, 9. To get 3 of the original 9, I'd go with 1, 5, 9, and so on. But what would the pseudo-code for picking the elements look like for any n and M?
The mathematical formulation for this problem would be as follows: given M < n, find the set q(1), q(2), ..., q(M) such that 1 <= q(k) < q(k+1) <= n for any k in [1, M-1], and the sum of q(k+1)-q(k) for k in [1, M-1] is the maximum possible.
Following Nico's suggestion of maximizing the minimum difference (the form of the objective function is fairly flexible), there's a simple O(n^2 M)-time dynamic program that goes something like this.
def prune(a, M):
    n = len(a)
    # The [i][j] entry is the max min gap when choosing j + 1 elements
    # from a[0], ..., a[i], with a[0] and a[i] both chosen.
    table = [[float('-inf')] * M for i in range(n)]
    table[0][0] = float('inf')  # only a[0] chosen, so there are no gaps yet
    for i in range(1, n):
        for j in range(1, M):
            table[i][j] = max(min(table[k][j - 1], a[i] - a[k])
                              for k in range(i))
    # Trace back the sequence of argmaxes to recover the chosen indexes.
    return max(table[i][M - 1] for i in range(n))
This can be improved to O(n M) time using total monotonicity. The idea is that, as i increases, the proper value for k can only increase too, and as soon as the objective decreases when we increase k, we can move on to the next i.
Both of the above algorithms handle fairly general objectives. If the max-min objective is OK, then there's an O(n log n)-time algorithm that uses binary search to guess the minimum gap and then checks whether it's feasible. I'll wait for you to update the question.
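Separately from the gap-based objectives above, the question's own examples (1, 3, 5, 7, 9 and 1, 5, 9) can be reproduced by spacing the indexes evenly and rounding. The sketch below assumes that "as spread out as possible" means evenly spaced positions rather than evenly spaced values; the name evenly_spaced is made up for this example.

```python
def evenly_spaced(a, M):
    """Pick M elements of a at (approximately) evenly spaced positions."""
    n = len(a)
    if M <= 0:
        return []
    if M == 1:
        return [a[0]]
    step = M - 1
    # Index k*(n-1)/(M-1), rounded to nearest using integer arithmetic.
    return [a[(k * (n - 1) + step // 2) // step] for k in range(M)]

print(evenly_spaced(list(range(1, 10)), 5))
print(evenly_spaced(list(range(1, 10)), 3))
```

The first and last elements are always included when M >= 2, and for M <= n the selected indexes are distinct because consecutive targets are at least one position apart.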

List Subsequents Method

So, I have to find the maximum sum of a contiguous subset. I followed this algorithm in Python.
def SubSeq(list):
    final_list = None
    suming = 0
    for start in range(len(list)):
        for end in range(start+1, len(list)+1):
            subseq = list[start:end]
            summation = sum(subseq)
            if summation > suming:
                suming = summation
                final_list = subseq
    return final_list

print(SubSeq([5, 15, -30, 10, -5, 40, 10]))
I wonder if this is a correct dynamic programming approach, though the running time is O(n^2). Also, is there a way to make it O(n)?
This is not Dynamic Programming; it is a brute-force solution. The solution seems to be correct, but as you observed, it is inefficient.
An O(n) solution can be achieved by applying Dynamic Programming. Denote D(i) as the maximum sum of a contiguous subarray that ends at i and must include i.
D(-1) = 0
D(i) = max{ arr[i], D(i-1) + arr[i] }
The idea is that you have two choices: take the previously "best" array that ends at i-1 and add element i to it, or create a new one that starts with i.
At the end, by applying the above in a DP solution, you get an array where each element indicates the maximum sum of a contiguous subarray ending at that index. All you have to do is choose the maximal value in this array to get the maximal sum, and go back over the array to get the actual subsequence.
Example:
array = 5,15,-30,10,-5,40,10
Applying the dynamic programming:
D(-1) = 0
D(0) = 5
D(1) = 20
D(2) = -10
D(3) = 10 // because max{-10+10, 10} = 10
D(4) = 5
D(5) = 45
D(6) = 55
And now you have the array:
D = [5,20,-10,10,5,45,55]
The maximal subsequence's value is 55, and is given by [10,-5,40,10] (following the above array, and going back on it)
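The recurrence can be sketched in Python without storing the whole D array, by keeping only the current D(i) and the start of the run that produced it. This is a minimal sketch that assumes a non-empty input; the name max_subarray is chosen for this example.

```python
def max_subarray(arr):
    """Return (best_sum, best_slice) for a non-empty list arr."""
    cur_sum = best_sum = arr[0]        # cur_sum plays the role of D(i)
    cur_start = best_start = best_end = 0
    for i in range(1, len(arr)):
        if cur_sum + arr[i] < arr[i]:  # starting fresh at i beats extending
            cur_sum = arr[i]
            cur_start = i
        else:
            cur_sum += arr[i]
        if cur_sum > best_sum:         # remember the best D(i) seen so far
            best_sum = cur_sum
            best_start, best_end = cur_start, i
    return best_sum, arr[best_start:best_end + 1]

print(max_subarray([5, 15, -30, 10, -5, 40, 10]))
# -> (55, [10, -5, 40, 10])
```

Tracking the start index during the forward pass replaces the step of going back over the array afterwards.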
You are basically calculating the sum again and again. You can avoid that by storing the sums in an array.
You can do it in O(n).
Let S[0,1..n-1] be the sequence
Let T[0,1,...n-1] be the array in which T[i] is the maximum continuous sum possible starting from ith element.
Now to fill T[i], start from reverse. T[n-1]=max(S[n-1],0)
Now T[i]=max(T[i+1]+S[i] , S[i] ,0)
Now go through the 'T' array to find the maximum sum.
Let T[m] be the max value.
To calculate the exact sequence start from S[m] and add all the elements till the sum equals T[m]

Minimum sum that cant be obtained from a set

Given a set S of positive integers whose elements need not be distinct, I need to find the minimal non-negative sum that can't be obtained from any subset of the given set.
Example: if S = {1, 1, 3, 7}, we can get 0 as (S' = {}), 1 as (S' = {1}), 2 as (S' = {1, 1}), 3 as (S' = {3}), 4 as (S' = {1, 3}), 5 as (S' = {1, 1, 3}), but we can't get 6.
Now we are given one array A, consisting of N positive integers. There are M queries; each consists of two integers Li and Ri that describe the i'th query: we need to find this sum that can't be obtained from the array elements {A[Li], A[Li+1], ..., A[Ri-1], A[Ri]}.
I know how to find it by a brute-force approach in O(2^n), but given 1 ≤ N, M ≤ 100,000, this can't be done.
So is there any efficient approach to do it?
Concept
Suppose we had an array of bool representing which numbers have been found so far (by way of summing).
For each number n we encounter in the ordered (increasing values) subset of S, we do the following:
For each existing True value at position i in numbers, we set numbers[i + n] to True
We set numbers[n] to True
With this sort of a sieve, we would mark all the found numbers as True, and iterating through the array when the algorithm finishes would find us the minimum unobtainable sum.
Refinement
Obviously, we can't have a solution like this because the array would have to be infinite in order to work for all sets of numbers.
The concept could be improved by making a few observations. With an input of 1, 1, 3, the array becomes (in sequence):
1
1 2
1 2 3 4 5
(numbers represent true values)
An important observation can be made:
(3) For each next number, if the previous numbers had already been found it will be added to all those numbers. This implies that if there were no gaps before a number, there will be no gaps after that number has been processed.
For the next input of 7 we can assert that:
(4) Since the input set is ordered, no remaining number will be less than 7
(5) If no remaining number is less than 7, then 6 cannot be obtained
We can come to a conclusion that:
(6) the first gap represents the minimum unobtainable number.
Algorithm
Because of (3) and (6), we don't actually need the numbers array, we only need a single value, max to represent the maximum number found so far.
This way, if the next number n is greater than max + 1, then a gap would have been made, and max + 1 is the minimum unobtainable number.
Otherwise, max becomes max + n. If we've run through the entire S, the result is max + 1.
Actual code (C#, easily converted to C):
static long Calculate(int[] S)
{
    long max = 0; // long, because the running sum can exceed the int range
    for (int i = 0; i < S.Length; i++)
    {
        if (S[i] <= max + 1)
            max = max + S[i];
        else
            return max + 1;
    }
    return max + 1;
}
Should run pretty fast, since it's obviously linear time (O(n)). Since the input to the function should be sorted, with quicksort this would become O(nlogn). I've managed to get results M = N = 100000 on 8 cores in just under 5 minutes.
With numbers upper limit of 10^9, a radix sort could be used to approximate O(n) time for the sorting, however this would still be way over 2 seconds because of the sheer amount of sorts required.
But we can use the low probability of a 1 appearing in random input to eliminate subsets before sorting. At the start, check whether 1 exists in S; if not, then every query's result is 1, because it cannot be obtained.
Statistically, if we draw uniformly from 10^9 numbers 10^5 times, we have about a 99.99% chance of not getting a single 1.
Before each sort, check if that subset contains 1; if not, then its result is 1.
With this modification, the code runs in 2 milliseconds on my machine. Here's that code on http://pastebin.com/rF6VddTx
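For readers who prefer Python, the single-value algorithm and the "does the range contain a 1" shortcut might look roughly like this. It is a sketch under stated assumptions, not the pastebin code: the function names are made up, and queries are assumed to be 1-based inclusive (L, R) pairs as in the question.

```python
def min_unobtainable(values):
    """Smallest positive sum that no subset of values can produce."""
    reach = 0                      # every sum in 0..reach is obtainable so far
    for v in sorted(values):
        if v > reach + 1:          # gap: reach + 1 can never be formed
            break
        reach += v
    return reach + 1

def answer_queries(a, queries):
    """Answer each 1-based inclusive (L, R) query on list a."""
    results = []
    for left, right in queries:
        window = a[left - 1:right]
        if 1 not in window:        # without a 1, the sum 1 is unobtainable
            results.append(1)
        else:
            results.append(min_unobtainable(window))
    return results

print(answer_queries([1, 1, 3, 7], [(1, 4), (3, 4), (1, 2)]))
```

Each query still costs a scan of its range, plus a sort whenever a 1 is present, so the shortcut helps on random data rather than in the worst case.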
This is a variation of the subset-sum problem, which is NP-Complete, but there is a pseudo-polynomial Dynamic Programming solution you can adopt here, based on the recursive formula:
f(S,i) = f(S-arr[i],i-1) OR f(S,i-1)
f(-n,i) = false
f(_,-n) = false
f(0,i) = true
The recursive formula is basically an exhaustive search, each sum can be achieved if you can get it with element i OR without element i.
The dynamic programming is achieved by building a SUM+1 x n+1 table (where SUM is the sum of all elements, and n is the number of elements), and building it bottom-up.
Something like:
table <- SUM+1 x n+1 table
//init:
for each j from 0 to n:
    table[0][j] = true
for each i from 1 to SUM:
    table[i][0] = false
//fill the table (arr is 1-indexed):
for each i from 1 to SUM:
    for each j from 1 to n:
        if i < arr[j]:
            table[i][j] = table[i][j-1]
        else:
            table[i][j] = table[i-arr[j]][j-1] OR table[i][j-1]
Once you have the table, you need the smallest i such that for all j: table[i][j] = false
Complexity of solution is O(n*SUM), where SUM is the sum of all elements, but note that the algorithm can actually be trimmed after the required number was found, without the need to go on for the next rows, which are un-needed for the solution.
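A compact Python sketch of this idea is shown below. It swaps the full 2D table for a single boolean row indexed by sum, which is a common space-saving variant and not the exact table described above; the name min_unobtainable_dp is made up for this example.

```python
def min_unobtainable_dp(arr):
    """Smallest positive sum not reachable by any subset of arr (values > 0)."""
    total = sum(arr)
    # reachable[s] is True when some subset sums to s; the extra slot at
    # total + 1 guarantees that at least one False entry exists.
    reachable = [False] * (total + 2)
    reachable[0] = True                       # the empty subset
    for x in arr:
        # Go downwards so each element is used at most once.
        for s in range(total, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable.index(False)

print(min_unobtainable_dp([1, 1, 3, 7]))
```

The running time is still O(n*SUM), so this only suits inputs whose total sum is small.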

Find subset with elements that are furthest apart from eachother

I have an interview question that I can't seem to figure out. Given an array of size N, find the subset of size k such that the elements in the subset are the furthest apart from each other. In other words, maximize the minimum pairwise distance between the elements.
Example:
Array = [1,2,6,10]
k = 3
answer = [1,6,10]
The bruteforce way requires finding all subsets of size k which is exponential in runtime.
One idea I had was to take values evenly spaced from the array. What I mean by this is
Take the 1st and last element
find the difference between them (in this case 10-1) and divide that by k ((10-1)/3=3)
move 2 pointers inward from both ends, picking out elements that are +/- 3 from your previous pick. So in this case, you start from 1 and 10 and find the closest elements to 4 and 7. That would be 6.
This is based on the intuition that the elements should be as evenly spread as possible. I have no idea how to prove it works/doesn't work. If anyone knows how or has a better algorithm please do share. Thanks!
This can be solved in polynomial time using DP.
The first step is, as you mentioned, to sort the list A. Let X[i,j] be the solution for selecting j elements from the first i elements of A, with the i-th element being the last one selected.
Now, X[i+1, j+1] = max( min( X[k,j], A[i+1]-A[k] ) ) over k<=i.
I will leave the initialization step and the step of remembering the subset for you to work on.
In your example (1,2,6,10) it works the following way:
     1   2   6  10
1    -   -   -   -
2    -   1   5   9
3    -   -   1   4
4    -   -   -   1
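One possible way to fill in the initialization and subset-recovery steps is sketched below in Python. The array names best and prev, the use of infinity for a single selected element, and the function name are all choices made for this illustration.

```python
def max_min_distance_dp(values, k):
    """Return (best_min_gap, subset) for choosing k of values, 1 <= k <= len(values)."""
    a = sorted(values)
    n = len(a)
    INF, NEG = float("inf"), float("-inf")
    # best[i][j]: largest min gap when picking j elements from a[0..i], a[i] picked last.
    best = [[NEG] * (k + 1) for _ in range(n)]
    prev = [[None] * (k + 1) for _ in range(n)]
    for i in range(n):
        best[i][1] = INF                     # a single element has no gaps
    for j in range(2, k + 1):
        for i in range(j - 1, n):
            for p in range(j - 2, i):        # p = index of the previously picked element
                cand = min(best[p][j - 1], a[i] - a[p])
                if cand > best[i][j]:
                    best[i][j] = cand
                    prev[i][j] = p
    end = max(range(k - 1, n), key=lambda i: best[i][k])
    # Follow the prev pointers back to recover the chosen elements.
    subset, i, j = [], end, k
    while i is not None:
        subset.append(a[i])
        i, j = prev[i][j], j - 1
    return best[end][k], subset[::-1]

print(max_min_distance_dp([1, 2, 6, 10], 3))
# -> (4, [1, 6, 10])
```

The three nested loops give O(N^2 k) time, which is polynomial as claimed.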
The basic idea is right, I think. You should start by sorting the array, then take the first and the last elements, then determine the rest.
I cannot think of a polynomial algorithm to solve this, so I would suggest one of the two options.
One is to use a search algorithm, branch-and-bound style, since you have a nice heuristic at hand: the upper bound for any solution is the minimum size of the gap between the elements picked so far, so the first guess (evenly spaced cells, as you suggested) can give you a good baseline, which will help prune most of the branches right away. This will work fine for smaller values of k, although the worst case performance is O(N^k).
The other option is to start with the same baseline, calculate the minimum pairwise distance for it and then try to improve it. Say you have a subset with minimum distance of 10, now try to get one with 11. This can be easily done by a greedy algorithm -- pick the first item in the sorted sequence such that the distance between it and the previous item is bigger-or-equal to the distance you want. If you succeed, try increasing further, if you fail -- there is no such subset.
The latter solution can be faster when the array is large and k is relatively large as well, but the elements in the array are relatively small. If they are bound by some value M, this algorithm will take O(N*M) time, or, with a small improvement, O(N*log(M)), where N is the size of the array.
As Evgeny Kluev suggests in his answer, there is also a good upper bound on the maximum pairwise distance, which can be used in either one of these algorithms. So the complexity of the latter is actually O(N*log(M/k)).
You can do this in O(n*(log n) + n*log(M)), where M is max(A) - min(A).
The idea is to use binary search to find the maximum separation possible.
First, sort the array. Then, we just need a helper function that takes in a distance d, and greedily builds the longest subarray possible with consecutive elements separated by at least d. We can do this in O(n) time.
If the generated array has length at least k, then the maximum separation possible is >=d. Otherwise, it's strictly less than d. This means we can use binary search to find the maximum value. With some cleverness, you can shrink the 'low' and 'high' bounds of the binary search, but it's already so fast that sorting would become the bottleneck.
Python code:
from typing import List

def maximize_distance(nums: List[int], k: int) -> List[int]:
    """Given an array of numbers and size k, uses binary search
    to find a subset of size k with maximum min-pairwise-distance"""
    assert len(nums) >= k
    if k == 1:
        return [nums[0]]
    nums.sort()

    def longest_separated_array(desired_distance: int) -> List[int]:
        """Given a distance, returns a subarray of nums
        of length k with pairwise differences at least that distance (if
        one exists)."""
        answer = [nums[0]]
        for x in nums[1:]:
            if x - answer[-1] >= desired_distance:
                answer.append(x)
                if len(answer) == k:
                    break
        return answer

    low, high = 0, (nums[-1] - nums[0])
    while low < high:
        mid = (low + high + 1) // 2
        if len(longest_separated_array(mid)) == k:
            low = mid
        else:
            high = mid - 1
    return longest_separated_array(low)
I suppose your set is ordered. If not, my answer will be changed slightly.
Let's suppose you have an array X = (X1, X2, ..., Xn)
Energy(Xi) = min(|X(i-1) - Xi|, |X(i+1) - Xi|), 1 < i < n
while n > k do
    X.Exclude(the Xi with minimum Energy(Xi), 1 < i < n)
    n <- n - 1
end while
$length = length($array);
sort($array); //sorts the list in ascending order
$differences = ($array << 1) - $array; //gets the difference between each value and the next largest value
sort($differences); //sorts the list in ascending order
$max = ($array[$length-1]-$array[0])/$M; //this is the theoretical max of how large the result can be
$result = array();
$count = 0;
for ($i = 0; $i < $length-1; $i++){
    $count += $differences[$i];
    if ($length-$i == $M - 1 || $count >= $max){ //if there are either no more coins that can be taken or we have gone above or equal to the theoretical max, add a point
        $result.push_back($count);
        $count = 0;
        $M--;
    }
}
return min($result)
For the non-code people: sort the list, find the differences between each 2 sequential elements, sort that list (in ascending order), then loop through it summing up sequential values until you either pass the theoretical max or there aren't enough elements remaining; then add that value to a new array and continue until you hit the end of the array. Then return the minimum of the newly created array.
This is just a quick draft though. At a quick glance any operation here can be done in linear time (radix sort for the sorts).
For example, with 1, 4, 7, 100, and 200 and M=3, we get:
$differences = 3, 3, 93, 100
$max = (200-1)/3 ~ 66
then we loop:
$count = 3, 3+3=6, 6+93=99 > 66 so we push 99
$count = 100 > 66 so we push 100
min(99,100) = 99
It is a simple exercise to convert this to the set solution that I leave to the reader (P.S. after all the times reading that in a book, I've always wanted to say it :P)
