List Subsequents Method - algorithm

I have to find the contiguous sublist with the maximum sum. I implemented the following algorithm in Python:
def SubSeq(list):
    final_list = None
    suming = 0
    for start in range(len(list)):
        for end in range(start+1, len(list)+1):
            subseq = list[start:end]
            summation = sum(subseq)
            if summation > suming:
                suming = summation
                final_list = subseq
    return final_list
print(SubSeq([5, 15, -30, 10, -5, 40, 10]))
Is this a correct dynamic programming approach, even though the running time is O(n^2)? Also, is there a way to make it O(n)?

This is not dynamic programming; it is a brute-force solution. The solution seems to be correct, but as you observed, it is inefficient. (It is actually O(n^3) rather than O(n^2), since each sum() call over a slice is itself O(n).)
An O(n) solution can be achieved by applying dynamic programming. Denote by D(i) the maximum sum of a contiguous subarray that ends at index i and must include element i:
D(-1) = 0
D(i) = max{ arr[i], D(i-1) + arr[i] }
The idea is that you have two choices: take the previous "best" subarray that ends at i-1 and append element i to it, or start a new subarray at i.
Filling in D bottom-up gives an array in which each element is the maximum sum of a contiguous subarray ending at that index. The maximal value in this array is the maximal sum; to recover the actual subarray, walk back from that index.
Example:
array = 5,15,-30,10,-5,40,10
Applying the dynamic programming:
D(-1) = 0
D(0) = 5
D(1) = 20
D(2) = -10
D(3) = 10 // because max{-10+10, 10} = 10
D(4) = 5
D(5) = 45
D(6) = 55
And now you have the array:
D = [5,20,-10,10,5,45,55]
The maximal subsequence's value is 55, and is given by [10,-5,40,10] (following the above array, and going back on it)
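The recurrence can be sketched in Python as follows. This is a minimal illustration rather than the answerer's own code: the function name `max_subarray` and the index bookkeeping are assumptions added here, and the input is assumed to be non-empty. Only the previous D value is kept, so no array is needed.

```python
def max_subarray(arr):
    """Return (best_sum, best_slice) for a non-empty list.

    cur_sum plays the role of D(i): the best sum of a contiguous
    run that ends at index i and includes arr[i].
    """
    best_sum = cur_sum = arr[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(arr)):
        if cur_sum < 0:
            # D(i-1) + arr[i] < arr[i], so start a new run at i
            cur_sum = arr[i]
            cur_start = i
        else:
            # extend the run that ended at i-1
            cur_sum += arr[i]
        if cur_sum > best_sum:
            best_sum = cur_sum
            best_start, best_end = cur_start, i
    return best_sum, arr[best_start:best_end + 1]


print(max_subarray([5, 15, -30, 10, -5, 40, 10]))
```

Tracking the start index alongside the running sum avoids the separate walk back over D.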

You are basically calculating the same sums again and again. You can avoid that by storing partial results in an array.
You can do it in O(n).
Let S[0..n-1] be the sequence.
Let T[0..n-1] be the array in which T[i] is the maximum contiguous sum possible starting from the i-th element.
To fill T, work backwards from the end: T[n-1] = max(S[n-1], 0).
Then T[i] = max(T[i+1] + S[i], S[i], 0) for i = n-2 down to 0.
Now go through the T array to find the maximum sum.
Let T[m] be the max value.
To recover the exact sequence, start from S[m] and add elements until the running sum equals T[m].
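A short Python sketch of this backward fill, under the assumption that S is a non-empty list. The function name `max_sum_from_start` is made up for illustration. Because T is clamped at 0, an input with no positive element yields the empty subsequence with sum 0.

```python
def max_sum_from_start(S):
    """Return (best_sum, best_slice) using T[i] = max(T[i+1] + S[i], S[i], 0)."""
    n = len(S)
    T = [0] * n
    T[n - 1] = max(S[n - 1], 0)
    for i in range(n - 2, -1, -1):
        T[i] = max(T[i + 1] + S[i], S[i], 0)

    # m is the first index where T is maximal
    m = max(range(n), key=lambda i: T[i])
    if T[m] == 0:
        return 0, []  # no positive sum exists

    # walk forward from S[m] until the running sum reaches T[m]
    total = 0
    for end in range(m, n):
        total += S[end]
        if total == T[m]:
            return T[m], S[m:end + 1]


print(max_sum_from_start([5, 15, -30, 10, -5, 40, 10]))
```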

Related

Maximum sum in array with special conditions

Assume we have an array with n elements (n % 3 = 0). In each step, a number is taken from the array: either the leftmost or the rightmost one. If you choose the left one, this element is added to the sum and the two rightmost numbers are removed, and vice versa.
Example: A = [100,4,2,150,1,1], sum = 0.
1. Take the leftmost element. A = [4,2,150], sum = 0+100 = 100
2. Take the rightmost element. A = [], sum = 100+150 = 250
So the result for A should be 250 and the sequence would be Left, Right.
How can I calculate the maximum sum I can get in an array? And how can I determine the sequence in which I have to extract the elements?
I guess this problem can best be solved with dynamic programming and the concrete sequence can then be determined by backtracking.
The underlying problem can be solved via dynamic programming as follows. The state space can be defined by letting
M(i,j) := maximum value attainable by choosing from the subarray of
A starting at index i and ending at index j
for any i, j in {1, ..., N}, where `N` is the number of elements
in the input.
The recurrence relation is as follows.
M(i,j) = max { M(i+1, j-2) + A[i], M(i+2, j-1) + A[j] }
Here, the first value corresponds to the choice of adding the beginning of the array, while the second value corresponds to the choice of adding the end of the array. The base cases are the empty subarrays (j < i), which have value 0; since every reachable subarray has a length divisible by 3, these are the only base cases needed.
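A memoized sketch of this recurrence in Python, using 0-based indices and treating the empty subarray (j < i) as the base case. The function name `max_sum_with_moves` and the "Left"/"Right" labels are assumptions chosen for illustration. The move sequence is recovered by replaying the recurrence from the full array.

```python
from functools import lru_cache


def max_sum_with_moves(A):
    """Return (best_sum, moves) for a list whose length is a multiple of 3."""
    n = len(A)
    assert n % 3 == 0

    @lru_cache(maxsize=None)
    def M(i, j):
        if j < i:  # empty subarray
            return 0
        take_left = M(i + 1, j - 2) + A[i]   # drop the two rightmost
        take_right = M(i + 2, j - 1) + A[j]  # drop the two leftmost
        return max(take_left, take_right)

    # replay the decisions to recover the sequence of moves
    moves = []
    i, j = 0, n - 1
    while i <= j:
        if M(i + 1, j - 2) + A[i] >= M(i + 2, j - 1) + A[j]:
            moves.append("Left")
            i, j = i + 1, j - 2
        else:
            moves.append("Right")
            i, j = i + 2, j - 1
    return M(0, n - 1), moves


print(max_sum_with_moves([100, 4, 2, 150, 1, 1]))
```

For long inputs the recursion depth grows with n/3, so an iterative table would be the safer choice there.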

Maximize sum of two numbers plus distance between them

We are given square matrix of numbers, e.g.
1 9 2
3 8 3
2 1 1
The distance between adjacent numbers is 2. We want to find two numbers, in the same row or in the same column, such that their sum plus the distance between them is maximal. For example, in the matrix above, these numbers are 9 and 8, and the maximal result is 9+8+1*2 = 19. We only want the maximal result; we don't need to know which specific numbers produce it.
That looks like a DP problem to me, but I can't think of any elegant solution.
One can solve the 1D problem (that is, given a list of numbers, find the pair which maximizes sum+distance) using dynamic programming.
bi = 0
best = -10**9 # anything large and negative
for i in range(1, n): # a is 0-indexed and has n elements
    best = max(best, a[i] + a[bi] + (i - bi)*2)
    if a[i] - i*2 > a[bi] - bi*2:
        bi = i
After this code finishes, best will store the maximum sum + distance of any pair of numbers in the list. It works because at any given loop iteration of i, bi stores the index of the value at index less than i that maximizes its value minus twice its index. One can observe that the number at this index is the best number (to the left of i) to pair the number at i with.
Once you have this, the 2D problem is straightforward: go through each row and column and apply the 1D algorithm, and return the maximum pair found. Overall for an n by n matrix, this runs in O(n^2) time, which is clearly asymptotically optimal since every element in the matrix needs to be read at least once.
Here is working Python3 code:
def max_sum_dist_1D(a):
    bi = 0
    best = -10**9
    for i in range(1, len(a)):
        best = max(best, a[i] + a[bi] + (i - bi)*2)
        if a[i] - i*2 > a[bi] - bi*2:
            bi = i
    return best
def max_sum_dist_2D(M):
    best_row = max(max_sum_dist_1D(row) for row in M)
    best_col = max(max_sum_dist_1D(col) for col in zip(*M))
    return max(best_row, best_col)
M = [[1, 9, 2], [3, 8, 3], [2, 1, 1]]
print(max_sum_dist_2D(M))

Minimum sum that cant be obtained from a set

Given a multiset S of positive integers (its elements need not be distinct), I need to find the minimal non-negative sum that can't be obtained from any subset of the given set.
Example: if S = {1, 1, 3, 7}, we can get 0 as (S' = {}), 1 as (S' = {1}), 2 as (S' = {1, 1}), 3 as (S' = {3}), 4 as (S' = {1, 3}), 5 as (S' = {1, 1, 3}), but we can't get 6.
Now we are given an array A consisting of N positive integers. There are M queries; the i-th query consists of two integers Li and Ri, and we need to find this sum for the array elements {A[Li], A[Li+1], ..., A[Ri-1], A[Ri]}.
I know how to do it with a brute-force approach in O(2^n), but given 1 ≤ N, M ≤ 100,000 this is infeasible.
So is there an efficient approach?
Concept
Suppose we had an array of bool, numbers, representing which numbers have been found so far (by way of summing).
For each number n we encounter in the ordered (increasing values) subset of S, we do the following:
For each existing True value at position i in numbers, we set numbers[i + n] to True
We set numbers[n] to True
With this sort of a sieve, we would mark all the found numbers as True, and iterating through the array when the algorithm finishes would find us the minimum unobtainable sum.
Refinement
Obviously, we can't have a solution like this because the array would have to be infinite in order to work for all sets of numbers.
The concept could be improved by making a few observations. With an input of 1, 1, 3, the array becomes (in sequence):
1
1 2
1 2 3 4 5
(numbers represent true values)
An important observation can be made:
(3) For each next number, if the previous numbers had already been found it will be added to all those numbers. This implies that if there were no gaps before a number, there will be no gaps after that number has been processed.
For the next input of 7 we can assert that:
(4) Since the input set is ordered, no remaining number will be less than 7
(5) If there is no number less than 7, then 6 cannot be obtained
We can come to a conclusion that:
(6) the first gap represents the minimum unobtainable number.
Algorithm
Because of (3) and (6), we don't actually need the numbers array, we only need a single value, max to represent the maximum number found so far.
This way, if the next number n is greater than max + 1, then a gap would have been made, and max + 1 is the minimum unobtainable number.
Otherwise, max becomes max + n. If we've run through the entire S, the result is max + 1.
Actual code (C#, easily converted to C):
static int Calculate(int[] S)
{
    int max = 0;
    for (int i = 0; i < S.Length; i++)
    {
        if (S[i] <= max + 1)
            max = max + S[i];
        else
            return max + 1;
    }
    return max + 1;
}
This should run pretty fast, since it is linear time (O(n)). Since the input to the function must be sorted, with quicksort the total becomes O(n log n). I've managed to get results for M = N = 100000 on 8 cores in just under 5 minutes.
With an upper limit of 10^9 on the numbers, a radix sort could be used to approximate O(n) time for the sorting; however, this would still be way over 2 seconds because of the sheer number of sorts required.
But we can use the statistical improbability of a 1 being drawn at random to eliminate subsets before sorting. At the start, check whether 1 exists in S; if not, then every query's result is 1, because 1 cannot be obtained.
Statistically, if we draw at random from 10^9 numbers 10^5 times, we have a roughly 99.99% chance of not getting a single 1.
Before each sort, check whether that subset contains 1; if not, its result is 1.
With this modification, the code runs in 2 milliseconds on my machine. Here's that code: http://pastebin.com/rF6VddTx
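For a single query, the whole procedure (the "contains 1" shortcut, the sort, and the linear scan) might look like this in Python. This is a sketch, not the pastebin code: the function name `min_unobtainable_sum` is an assumption, and the input is assumed to hold positive integers only.

```python
def min_unobtainable_sum(values):
    """Smallest non-negative sum that no subset of `values` can produce."""
    # shortcut: without a 1, the sum 1 can never be formed
    if 1 not in values:
        return 1

    reach = 0  # every sum in 0..reach is obtainable so far
    for v in sorted(values):
        if v > reach + 1:
            break  # reach + 1 is a gap that larger values cannot fill
        reach += v
    return reach + 1


print(min_unobtainable_sum([1, 1, 3, 7]))
```

Answering a range query then means calling this on the slice A[Li..Ri], so the per-query sort stays the dominant cost whenever the shortcut does not apply.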
This is a variation of the subset-sum problem, which is NP-Complete, but there is a pseudo-polynomial Dynamic Programming solution you can adopt here, based on the recursive formula:
f(S,i) = f(S-arr[i],i-1) OR f(S,i-1)
f(-n,i) = false
f(_,-n) = false
f(0,i) = true
The recursive formula is basically an exhaustive search: a sum can be achieved if you can get it with element i OR without element i.
The dynamic programming is achieved by building a (SUM+1) x (n+1) table (where SUM is the sum of all elements, and n is the number of elements), and filling it bottom-up.
Something like:
table <- (SUM+1) x (n+1) table  // table[i][j]: can sum i be made from the first j elements?
//init:
for each j from 0 to n:
    table[0][j] = true  // the empty subset gives sum 0
for each i from 1 to SUM:
    table[i][0] = false  // a positive sum needs at least one element
//fill the table (arr is 1-indexed):
for each i from 1 to SUM:
    for each j from 1 to n:
        if i < arr[j]:
            table[i][j] = table[i][j-1]
        else:
            table[i][j] = table[i-arr[j]][j-1] OR table[i][j-1]
Once you have the table, you need the smallest i such that for all j: table[i][j] = false (equivalently, table[i][n] = false). If there is no such i, the answer is SUM+1.
The complexity of the solution is O(n*SUM), where SUM is the sum of all elements. Note that the algorithm can be stopped as soon as the required number is found, without filling in the remaining rows, which are not needed for the solution.
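A compact Python sketch of this pseudo-polynomial approach. It departs from the two-dimensional table above in one respect, which is an assumption made here for brevity: a single boolean row is updated in place, iterating sums downward so that each element is used at most once. The function name `min_unobtainable_sum_dp` is hypothetical.

```python
def min_unobtainable_sum_dp(arr):
    """Smallest sum not reachable by any subset of positive integers `arr`."""
    total = sum(arr)
    # reachable[s] is True when some subset sums to s;
    # the extra slot total + 1 is never reachable, so a False always exists
    reachable = [False] * (total + 2)
    reachable[0] = True  # the empty subset

    for x in arr:
        # go downward so x is not counted twice in the same pass
        for s in range(total, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True

    return reachable.index(False)


print(min_unobtainable_sum_dp([1, 1, 3, 7]))
```

With values up to 10^9 the table becomes far too large, so this only illustrates the recurrence for small inputs.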

how to find a sequential sub array which has the biggest sum within O(n) [duplicate]

Given an array of numbers, both positive and negative, the task is to find a contiguous subarray with the biggest sum in O(n) time. For example, given the array [1,-2,3,10,-4,7,2,-5], the subarray [3, 10, -4, 7, 2] has the biggest sum, which is 18.
So how can this subarray be found in O(n)?
Thx
This is called the maximum subarray problem (see the Wikipedia article on it). Kadane's algorithm solves it in O(n) time.
Here's a solution in Python. The idea is to track the maximum consecutive sum. When the running sum becomes negative, you empty the list; while it is not negative, you keep those elements.
l = [1,-2,3,10,-4,7,2,-5]
def find_max(l):
    s = 0 # Current sum
    lsum = [] # Current subarray
    res = (0, []) # Max value and subarray
    for v in l:
        s += v
        lsum.append(v)
        if s > res[0]:
            res = (s, lsum[:])
        elif s < 0:
            s = 0
            lsum = []
    return res
print(find_max(l))
Result:
(18, [3, 10, -4, 7, 2])
The idea is to look at the cumulative series (treat the values as increments/decrements of something) and then find the low and the subsequent high of this series.
In pseudo code:
sum = 0
low = 0 // lowest cumulative sum so far (0 is the empty prefix)
substart = 0 // index right after that low
highestSumSinceLow = Integer.MinValue
For i = 0 to Array.Length-1
    sum += Array[i] // keep track of cumulative value since start
    If sum < low Then
        low = sum // keep track of lowest sum since start so far
        substart = i + 1 // and set substart to next value
    sumsincelow = sum - low // calculate sum from that low to here
    If sumsincelow > highestSumSinceLow Then
        highestSumSinceLow = sumsincelow // keep track of highest sumsincelow
        beststart = substart // remember the start that belongs to this end
        subend = i // and set subend to this value
Next i
After going through the entire array, beststart and subend point to the indices of the sub array with the highest sum (which is highestSumSinceLow). Starting low at 0 rather than at the largest integer lets a sub array begin at index 0, and recording beststart together with subend keeps the pair consistent if a new low appears after the best end.
This is probably the simplest and most efficient solution. It is O(n) and doesn't use temporary arrays. It just goes through the array once from start to finish and keeps track of the lowest cumulative sum since start and the highest sum since that low.
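The same idea as a runnable Python sketch. It assumes the low starts at 0 (the empty prefix) and that the start index is recorded together with the end index; the function and variable names are illustrative. If no element is positive, it reports a sum of 0 with an empty range (start greater than end).

```python
def max_subarray_by_prefix_low(arr):
    """Return (best_sum, start, end) using the lowest prefix sum seen so far."""
    total = 0            # cumulative sum since the start
    low = 0              # lowest cumulative sum so far
    start_after_low = 0  # index right after the position of that low
    best = float("-inf")
    best_start = best_end = 0

    for i, value in enumerate(arr):
        total += value
        if total < low:
            low = total
            start_after_low = i + 1
        since_low = total - low
        if since_low > best:
            best = since_low
            best_start = start_after_low
            best_end = i
    return best, best_start, best_end


data = [1, -2, 3, 10, -4, 7, 2, -5]
best, start, end = max_subarray_by_prefix_low(data)
print(best, data[start:end + 1])
```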

Find subset with elements that are furthest apart from eachother

I have an interview question that I can't seem to figure out. Given an array of size N, find the subset of size k such that the elements in the subset are the furthest apart from each other. In other words, maximize the minimum pairwise distance between the elements.
Example:
Array = [1,2,6,10]
k = 3
answer = [1,6,10]
The brute-force way requires finding all subsets of size k, which is exponential in runtime.
One idea I had was to take values evenly spaced from the array. What I mean by this is:
1. Take the 1st and last element.
2. Find the difference between them (in this case 10-1) and divide that by k ((10-1)/3=3).
3. Move 2 pointers inward from both ends, picking out elements that are +/- 3 from your previous pick. So in this case, you start from 1 and 10 and find the closest elements to 4 and 7. That would be 6.
This is based on the intuition that the elements should be as evenly spread as possible. I have no idea how to prove it works/doesn't work. If anyone knows how or has a better algorithm please do share. Thanks!
This can be solved in polynomial time using DP.
The first step is to sort the list A. Let X[i,j] be the solution (the largest achievable minimum pairwise distance) for selecting j elements from the first i elements of A, with A[i] as the last selected element.
Now, X[i+1, j+1] = max( min( X[k,j], A[i+1]-A[k] ) ) over k<=i.
I will leave the initialization step and the reconstruction of the subset for you to work on.
In your example (1,2,6,10) it works the following way (rows are j, columns are A[i]):
j \ A[i]   1   2   6   10
1          -   -   -   -
2          -   1   5   9
3          -   -   1   4
4          -   -   -   1
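One way to fill in this table in Python. This is a sketch under stated assumptions: X[j][i] is read as "the best minimum gap when choosing j elements with A[i] as the last one", rows with j = 1 are initialised to infinity because a single element has no pairwise distance, and 2 <= k <= len(A). The function name `max_min_distance_dp` is hypothetical, and only the optimal value is returned, not the subset.

```python
def max_min_distance_dp(A, k):
    """Largest achievable minimum pairwise distance among k chosen elements."""
    A = sorted(A)
    n = len(A)
    assert 2 <= k <= n
    INF = float("inf")

    # X[j][i]: best min gap using j elements, the last of which is A[i]
    X = [[-INF] * n for _ in range(k + 1)]
    for i in range(n):
        X[1][i] = INF

    for j in range(2, k + 1):
        for i in range(j - 1, n):
            # p is the index of the previously selected element
            X[j][i] = max(min(X[j - 1][p], A[i] - A[p])
                          for p in range(j - 2, i))
    return max(X[k])


print(max_min_distance_dp([1, 2, 6, 10], 3))
```

This runs in O(n^2 * k) time, which is polynomial but slower than the binary-search approach for large inputs.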
The basic idea is right, I think. You should start by sorting the array, then take the first and the last elements, then determine the rest.
I cannot think of a polynomial algorithm to solve this, so I would suggest one of the two options.
One is to use a search algorithm, branch-and-bound style, since you have a nice heuristic at hand: the upper bound for any solution is the minimum size of the gap between the elements picked so far, so the first guess (evenly spaced cells, as you suggested) can give you a good baseline, which will help prune most of the branches right away. This will work fine for smaller values of k, although the worst case performance is O(N^k).
The other option is to start with the same baseline, calculate the minimum pairwise distance for it and then try to improve it. Say you have a subset with a minimum distance of 10; now try to get one with 11. This can be easily done by a greedy algorithm: pick the first item in the sorted sequence such that the distance between it and the previous item is greater than or equal to the distance you want. If you succeed, try increasing further; if you fail, there is no such subset.
The latter solution can be faster when the array is large and k is relatively large as well, but the elements in the array are relatively small. If they are bounded by some value M, this algorithm will take O(N*M) time, or, with a small improvement (binary search on the distance), O(N*log(M)), where N is the size of the array.
As Evgeny Kluev suggests in his answer, there is also a good upper bound on the maximum pairwise distance, which can be used in either one of these algorithms. So the complexity of the latter is actually O(N*log(M/k)).
You can do this in O(n*(log n) + n*log(M)), where M is max(A) - min(A).
The idea is to use binary search to find the maximum separation possible.
First, sort the array. Then, we just need a helper function that takes in a distance d, and greedily builds the longest subarray possible with consecutive elements separated by at least d. We can do this in O(n) time.
If the generated array has length at least k, then the maximum separation possible is >=d. Otherwise, it's strictly less than d. This means we can use binary search to find the maximum value. With some cleverness, you can shrink the 'low' and 'high' bounds of the binary search, but it's already so fast that sorting would become the bottleneck.
Python code:
from typing import List

def maximize_distance(nums: List[int], k: int) -> List[int]:
    """Given an array of numbers and size k, uses binary search
    to find a subset of size k with maximum min-pairwise-distance"""
    assert len(nums) >= k
    if k == 1:
        return [nums[0]]
    nums.sort()
    def longest_separated_array(desired_distance: int) -> List[int]:
        """Given a distance, returns a subarray of nums
        of length k with pairwise differences at least that distance (if
        one exists)."""
        answer = [nums[0]]
        for x in nums[1:]:
            if x - answer[-1] >= desired_distance:
                answer.append(x)
                if len(answer) == k:
                    break
        return answer
    # binary search for the largest distance that still yields k elements
    low, high = 0, (nums[-1] - nums[0])
    while low < high:
        mid = (low + high + 1) // 2
        if len(longest_separated_array(mid)) == k:
            low = mid
        else:
            high = mid - 1
    return longest_separated_array(low)
I suppose your set is ordered. If not, my answer will be changed slightly.
Let's suppose you have an array X = (X1, X2, ..., Xn)
Energy(Xi) = min(|X(i-1) - Xi|, |X(i+1) - Xi|), 1 < i < n
while n > k do
    X.Exclude(the Xi with minimal Energy(Xi), 1 < i < n)
    n <- n - 1
end while
$length = length($array);
sort($array); //sorts the list in ascending order
$differences = ($array << 1) - $array; //gets the difference between each value and the next largest value
sort($differences); //sorts the list in ascending order
$max = ($array[$length-1]-$array[0])/$M; //this is the theoretical max of how large the result can be
$result = array();
$count = 0;
for ($i = 0; $i < $length-1; $i++){
    $count += $differences[$i];
    if ($length-$i == $M - 1 || $count >= $max){ //if there are either no more coins that can be taken or we have gone above or equal to the theoretical max, add a point
        $result.push_back($count);
        $count = 0;
        $M--;
    }
}
return min($result);
For the non-code people: sort the list, find the differences between each 2 sequential elements, sort that list (in ascending order), then loop through it summing up sequential values until you either reach the theoretical max or there aren't enough elements remaining; then add that value to a new array and continue until you hit the end of the array. Then return the minimum of the newly created array.
This is just a quick draft though. At a quick glance any operation here can be done in linear time (radix sort for the sorts).
For example, with 1, 4, 7, 100, and 200 and M=3, we get:
$differences = 3, 3, 93, 100
$max = (200-1)/3 ~ 66
then we loop:
$count = 3, 3+3=6, 6+93=99 > 66 so we push 99
$count = 100 > 66 so we push 100
min(99,100) = 99
It is a simple exercise to convert this to the set solution that I leave to the reader (P.S. after all the times reading that in a book, I've always wanted to say it :P)

Resources