How to find minimum pairing cost? (Any Language) - algorithm

I came across an algorithm question recently and I still haven't been able to come up with a way to solve it. Can anyone help with pseudocode or logic?
Here is the question:
There are n elements in the array, where n is an odd number. We must exclude exactly 1 element from the array and pair up the remaining elements so that the total pairing cost is minimal.
Rules:
1 <= n <= 1001
n % 2 = 1
Example:
Given array is [4, 2, 1, 7, 8]
When we pair items with the closest ones [[1,2], [7,8]] and "4" is excluded.
So the minimum cost is |1 - 2| + |7 - 8| = 2;
What I tried:
Sort array first: [1,2,4,7,8]
Remove the middle element: 4
Pair items with the next ones: [[1, 2], [7, 8]]
This works for the example, but what if the given array is [1, 7, 8, 16, 17]?
Sort array first: [1, 7, 8, 16, 17]
Remove the middle element: 8
Pair items with the next ones: [[1, 7], [16, 17]] (wrong answer)
"1" must be excluded and the pairs must be [[7, 8], [16, 17]]

Once the array is sorted, you can pair all elements from left to right (leaving out the last element) and keep track of the total sum. Then, working from the right, replace one pairing at a time with the pairing shifted one position to the right. Each replacement moves the excluded element two positions to the left. Keep the running total up to date at every step and remember the smallest total seen.
In pseudo-code (all zero-based indexing):
let S be the sum of all pairing costs of
    elements 2i and 2i+1 for i from 0 to (n-3)/2
    (that is all pairings when you exclude the very last element)
let best = S
let j = (n-1)/2
for i from (n-3)/2 to 0 (included):
    let L be the pairing cost of elements 2i and 2i+1
    let R be the pairing cost of elements 2i+1 and 2i+2
    replace S with S - L + R (the total when element 2i is excluded)
    if S < best
        replace best with S
        replace j with i
2j is the index of the element to exclude, and best is the minimum cost
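A minimal Python sketch of this sweep, assuming the cost of a pair is the absolute difference of its two values (as in the question's example). The function name `min_pairing_cost` is made up for illustration. It keeps the running total and the best total in separate variables, so the running total stays correct even when a step makes it larger.

```python
def min_pairing_cost(values):
    """Return (minimum cost, excluded value) for an odd-length list."""
    a = sorted(values)
    n = len(a)
    if n % 2 == 0:
        raise ValueError("expected an odd number of elements")
    # Cost when the last element is excluded: pairs (0,1), (2,3), ...
    running = sum(a[i + 1] - a[i] for i in range(0, n - 1, 2))
    best, excluded = running, n - 1
    # Move the excluded slot two positions to the left at each step.
    for i in range(n - 3, -1, -2):
        # Pair (i, i+1) disappears and pair (i+1, i+2) appears.
        running += (a[i + 2] - a[i + 1]) - (a[i + 1] - a[i])
        if running < best:
            best, excluded = running, i
    return best, a[excluded]


print(min_pairing_cost([4, 2, 1, 7, 8]))
print(min_pairing_cost([1, 7, 8, 16, 17]))
```

For the two arrays from the question this reports a cost of 2 in both cases, excluding 4 and 1 respectively.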

Sorting the array first is a good start. Once you've done that, you have a choice of removing any value from index 1..N. A brute-force approach would be to calculate the pairing cost of omitting index 1, then recalculate omitting only index 2, and so on until you reach index N.
You'd be calculating many of the pairs over and over. To avoid that, consider that all the pairs to the left of your omitted index are paired odd-even (from the perspective of starting at element 1) and to the right of the omitted index will be even-odd. If you precalculate the sums of the left pairings and the sums of the right pairings into two arrays, you could determine the minimum cost at each position as the minimum sum of both values at each position of those two arrays.
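The two precalculated arrays could look like the following Python sketch. The names `left`, `right` and `min_pairing_cost_prefix_suffix` are illustrative, and the code uses 0-based indices. It assumes that only even 0-based positions need to be tried: omitting an odd position forces one pair to straddle the gap, and that pair costs at least as much as the pair you get by omitting a neighbouring even position instead.

```python
def min_pairing_cost_prefix_suffix(values):
    a = sorted(values)
    n = len(a)
    half = n // 2
    # left[p]: cost of pairing the first 2*p elements as (0,1), (2,3), ...
    left = [0] * (half + 1)
    for p in range(half):
        left[p + 1] = left[p] + a[2 * p + 1] - a[2 * p]
    # right[p]: cost of pairing everything after index 2*p
    # as (2p+1, 2p+2), (2p+3, 2p+4), ...
    right = [0] * (half + 1)
    for p in range(half - 1, -1, -1):
        right[p] = right[p + 1] + a[2 * p + 2] - a[2 * p + 1]
    # Omitting index 2*p costs left[p] + right[p].
    return min(left[p] + right[p] for p in range(half + 1))


print(min_pairing_cost_prefix_suffix([1, 7, 8, 16, 17]))
```

After the sort, both arrays are filled in a single pass each, so the whole search is linear.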

Related

Dynamic Programming Problem for "disorder" in array

Given a sequence a=(a1,a2,...,an) of n positive integers. We call the Disorder D(ak) of the prefix ak=(a1,a2,...,ak) the difference between ak's max and ak's min. We call Total Disorder the sum of D(ak) over all prefixes from k=2 to k=n. We are looking for a DP algorithm with a recursive solution for b*, which is a permutation of a that achieves the minimum Total Disorder.
Examples:
a=(6, 2, 3, 1, 3, 3) then b*=(3, 3, 3, 2, 1, 6)[with D(b*) = 0 + 0 + 1 + 2 + 5 = 8]
a=(1, 3, 3, 3, 6, 6) then b*=(3, 3, 3, 6, 6, 1)[with D(b*) = 0 + 0 + 3 + 3 + 5 = 11]
The only thing I was able to prove was that the last number of b* will be either the max of a or the min of a.
Please help.
First sort the input array, and then consider building the result permutation backwards from the end towards the start.
For every element you will either remove the first or last element of the sorted array. Also, for every position k, the disorder of the subarray ending at that position is known -- it's just the difference between the two ends of the remaining element array.
To find the optimal selection, then, you can use DP[k,n] = the minimum disorder so far if we've chosen n elements from the front of the sorted array (with the remainder chosen from the back).
DP[k,n] is easily calculated from DP[k+1,n] and DP[k+1,n-1], and the minimum DP[0,?] is the minimum total disorder.
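One way to sketch this idea in Python is to index the table by the remaining window of the sorted array instead of by DP[k,n]. This is the same recurrence in a different layout, not necessarily the exact table the answer has in mind. The name `min_total_disorder` is made up for illustration.

```python
def min_total_disorder(a):
    s = sorted(a)
    n = len(s)
    if n == 0:
        return 0
    # best[l][r]: minimum total disorder of a prefix that consists of
    # exactly the sorted elements s[l..r]. Its last element is s[l] or s[r].
    best = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for l in range(n - length + 1):
            r = l + length - 1
            # The disorder of this prefix is s[r] - s[l]; the shorter
            # prefix is either s[l+1..r] or s[l..r-1].
            best[l][r] = (s[r] - s[l]) + min(best[l + 1][r], best[l][r - 1])
    return best[0][n - 1]


print(min_total_disorder([6, 2, 3, 1, 3, 3]))
print(min_total_disorder([1, 3, 3, 3, 6, 6]))
```

On the two examples from the question this gives 8 and 11, matching the totals listed there. The table has O(n^2) entries, each filled in constant time.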

Maximum Sum for Subarray with fixed cutoff

I have a list of integers, and I need to find a way to get the maximum sum of a subset of them, adding elements to the total until the sum is equal to (or greater than) a fixed cutoff. I know this seems similar to the knapsack, but I was unsure whether it was equivalent.
Sorting the array and adding the maximum element until sum <= cutoff does not work. Observe the following list:
list = [6, 5, 4, 4, 4, 3, 2, 2, 1]
cutoff = 15
For this list, doing it the naive way results in a sum of 15, which is very sub-optimal. As far as I can see, the maximum you could arrive at using this list is 20, by adding 4 + 4 + 4 + 2 + 6. If this is just a different version of knapsack, I can just implement a knapsack solution, as I probably have small enough lists to get away with this, but I'd prefer to do something more efficient.
First of all in any sum, you won't have produced a worse result by adding the largest element last. So there is no harm in assuming that the elements are sorted from smallest to largest as a first step.
And now you use a dynamic programming approach similar to the usual subset sum.
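def best_cutoff_sum(cutoff, elements):
    # Process elements smallest first so the largest can be added last.
    elements = sorted(elements)
    # Maps each reachable sum to a nested path [last_element, previous_path].
    sums = {0: None}
    for e in elements:
        next_sums = {}
        for v, path in sums.items():
            next_sums[v] = path
            # Only sums still below the cutoff may take another element.
            if v < cutoff:
                next_sums[v + e] = [e, path]
        sums = next_sums
    best = max(sums.keys())
    return (best, sums[best])

print(best_cutoff_sum(15, [6, 5, 4, 4, 4, 3, 2, 2, 1]))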
With a little work you can turn the path from the nested array it currently is to whatever format you want.
If your list of non-negative elements has n elements, your cutoff is c and your maximum value is v, then this algorithm will take time O(n * (c + v)).

Find the maximum weight that can be collected from a store under given limit

I faced this problem in placement exam of SAP labs:
It's your birthday, so you are given a bag with a fixed space 'S'. You can go to a store and pick as many items as you like, as long as they fit inside your bag. The store has 'n' items and each item occupies a space s[i]. You have to find the maximum space in the bag that you can fill.
For example, say the limit of your bag is S = 15 and the store has 10 items of sizes [1, 7, 3, 5, 4, 10, 6, 15, 20, 8]. You can fill 15 space in various ways, such as [1, 7, 3, 4], [7, 3, 5], [15], [5, 10] and many more. So you return 15.
Note: There is a quirk in the sizes of the items. All of the items but at most 15 follow this rule: for all i, j with i ≠ j, either size[i] >= 2*size[j] + 1 or size[j] >= 2*size[i] + 1.
Constraints:
1<= n <= 60.
1<= size[i] <= 10^17.
1<= S <= 10^18.
Example: S = 9, n = 5, sizes = [1, 7, 4, 4, 10].
Output: 8. You can't fill exactly 9 space in any way. You can fill 8 space either by using [1, 7] or [4, 4].
Let's call x the set of elements that follow that rule. Note that this set has some nice properties:
Given x in sorted ascending order, sum(x[i..j]) < x[j + 1]
To solve maximum sum <= k, just iterate in sorted descending order and subtract x[i] from k whenever possible. The original k minus the remaining k is the solution. Assuming the elements are already sorted, this is O(|x|).
One way to obtain this set is to iterate items sorted by size in ascending order and add to set if :
set has no elements or
current element >= 2 * size[lastElementAdded] + 1
Now we are left with at most 15 items that do not follow this rule, so we can't use the efficient method from before. For each of these items, we can either put it in the bag or leave it out. This leads to at most 2^15 possible sums. For each of those sums, we can run our method on the elements that follow the rule.
Overall complexity: 2^15 * (n - 15). For n = 60, this should be solved in less than a second.
As an exercise: by using accumulated sums and binary search, it can be brought down to 2^15 * log2(n - 15).
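A rough Python sketch of the whole procedure, under the assumption that the ascending greedy split described above leaves only a small leftover group (the problem promises at most 15 exceptions, but the sketch does not check this). The names `max_fill`, `chain` and `rest` are illustrative. The result is exact for any split in which `chain` satisfies the doubling rule, because every subset of `rest` is tried.

```python
from itertools import combinations


def max_fill(capacity, sizes):
    chain, rest = [], []
    for s in sorted(sizes):
        # Keep s in the chain if it is at least 2 * (last kept size) + 1.
        if not chain or s >= 2 * chain[-1] + 1:
            chain.append(s)
        else:
            rest.append(s)
    best = 0
    # Try every subset of the leftover items (2^len(rest) subsets).
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            base = sum(combo)
            if base > capacity:
                continue
            remaining = capacity - base
            # Greedy from largest to smallest is optimal on the chain.
            for s in reversed(chain):
                if s <= remaining:
                    remaining -= s
            best = max(best, capacity - remaining)
    return best


print(max_fill(15, [1, 7, 3, 5, 4, 10, 6, 15, 20, 8]))
```

Python integers are arbitrary precision, so sizes up to 10^17 and capacities up to 10^18 need no special handling.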

add elements of array thats sum equals the largest element

What is a way to find elements of an array whose sum equals the largest element in the array?
For example, for the array [4, 6, 23, 10, 1, 3], I first sort it, resulting in [1, 3, 4, 6, 10, 23], and then pop the last element, max = 23. I'm left with [1, 3, 4, 6, 10] and need a way to find the elements that add up to 23, which are 3 + 4 + 6 + 10 = 23. The elements don't have to be adjacent; they can be at any positions in the array, but they must add up to max.
I can find the permutations of the sorted array from 2 elements to n-1 elements, sum them, and compare them to max, but that seems inefficient. Please help.
This is exactly the subset sum problem, which is NP-Complete, but if your numbers are relatively small integers, there is an efficient pseudo-polynomial solution using Dynamic Programming:
D(i,0) = TRUE
D(0,x) = FALSE x>0
D(i,x) = D(i-1,x) OR D(i-1,x-arr[i])
If there is a solution, you need to step back in the matrix created by the DP solution, and "record" each choice you have made along the way, to get the elements used for the summation. This thread deals with how to find the actual elements in a very similar problem (known as knapsack problem), which is solved similarly: How to find which elements are in the bag, using Knapsack Algorithm [and not only the bag's value]?
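The recurrence and the step-back phase can be sketched in Python as follows, assuming non-negative integers. The names `find_subset` and `reachable` are illustrative, and `reachable[i][x]` plays the role of D(i,x).

```python
def find_subset(arr, target):
    """Return a list of elements of arr summing to target, or None."""
    n = len(arr)
    # reachable[i][x] is True when some subset of arr[:i] sums to x.
    reachable = [[False] * (target + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        reachable[i][0] = True
    for i in range(1, n + 1):
        v = arr[i - 1]
        for x in range(1, target + 1):
            reachable[i][x] = reachable[i - 1][x] or (
                x >= v and reachable[i - 1][x - v]
            )
    if not reachable[n][target]:
        return None
    # Step back through the table: if x is not reachable without
    # element i, then element i must be part of the sum.
    chosen, x = [], target
    for i in range(n, 0, -1):
        if not reachable[i - 1][x]:
            chosen.append(arr[i - 1])
            x -= arr[i - 1]
    return chosen[::-1]


print(find_subset([1, 3, 4, 6, 10], 23))
```

For the array in the question this prints [3, 4, 6, 10]. Time and space are O(n * target), which is why the approach only suits relatively small targets.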

find kth smallest number in O(logn) time

Here is the problem: given an unsorted array a[n], I need to find the kth smallest number in the range [i, j], where 1<=i<=j<=n and k<=j-i+1.
Typically I would use quick-find to do the job, but it is not fast enough if there are many queries with different ranges [i, j]. I can't figure out an algorithm that answers a query in O(logn) time (preprocessing is allowed).
Any idea is appreciated.
PS
Let me make the problem easier to understand. Any kinds of preprocessing is allowed, but the query needs to be done in O(logn) time. And there will be many (more than 1) queries, like find the 1st in range [3,7], or 3rd in range [10,17], or 11th in range [33, 52].
By range [i, j] I mean in the original array, not sorted or something.
For example, a[5] = {3,1,7,5,9}, query 1st in range [3,4] is 5, 2nd in range [1,3] is 5, 3rd in range [0,2] is 7.
If pre-processing is allowed and not counted towards the time complexity, just use that to construct sub-lists so that you can efficiently find the element you're looking for. As with most optimisations, this trades space for time.
Your pre-processing step is to take your original list of n numbers and create a number of new sublists.
Each of these sublists is a portion of the original, starting with the nth element, extending for m elements and then sorted. So your original list of:
{3, 1, 7, 5, 9}
gives you:
list[0][0] = {3}
list[0][1] = {1, 3}
list[0][2] = {1, 3, 7}
list[0][3] = {1, 3, 5, 7}
list[0][4] = {1, 3, 5, 7, 9}
list[1][0] = {1}
list[1][1] = {1, 7}
list[1][2] = {1, 5, 7}
list[1][3] = {1, 5, 7, 9}
list[2][0] = {7}
list[2][1] = {5, 7}
list[2][2] = {5, 7, 9}
list[3][0] = {5}
list[3][1] = {5,9}
list[4][0] = {9}
This isn't a cheap operation (in time or space) so you may want to maintain a "dirty" flag on the list so you only perform it the first time after you do a modifying operation (insert, delete, change).
In fact, you can use lazy evaluation for even more efficiency. Basically set all sublists to an empty list when you start and whenever you perform a modifying operation. Then, whenever you attempt to access a sublist and it's empty, calculate that sublist (and that one only) before trying to get the kth value out of it.
That ensures sublists are evaluated only when needed and cached to prevent unnecessary recalculation. For example, if you never ask for a value from the 3-through-6 sublist, it's never calculated.
The pseudo-code for creating all the sublists is basically (for loops inclusive at both ends):
for n = 0 to a.lastindex:
    create array list[n]
    for m = 0 to a.lastindex - n:
        create array list[n][m]
        for i = 0 to m:
            list[n][m][i] = a[n+i]
        sort list[n][m]
The code for lazy evaluation is a little more complex (but only a little), so I won't provide pseudo-code for that.
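For what it's worth, the lazy variant could be sketched in Python roughly like this. The class name `LazySublists` and its methods are hypothetical, and a dictionary keyed by (start, end) stands in for the "empty sublist" markers described above.

```python
class LazySublists:
    def __init__(self, values):
        self._values = list(values)
        # Sorted sublists, computed on first use, keyed by (start, end).
        self._cache = {}

    def update(self, index, value):
        # Any modifying operation invalidates every cached sublist.
        self._values[index] = value
        self._cache.clear()

    def kth_smallest(self, k, i, j):
        # kth smallest (1-based k) in index range i through j inclusive.
        key = (i, j)
        if key not in self._cache:
            self._cache[key] = sorted(self._values[i:j + 1])
        return self._cache[key][k - 1]


q = LazySublists([3, 1, 7, 5, 9])
print(q.kth_smallest(1, 3, 4))
print(q.kth_smallest(2, 1, 3))
print(q.kth_smallest(3, 0, 2))
```

The first query on a given range pays for one sort, and repeated queries on the same range are dictionary lookups until the next update.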
Then, in order to find the kth smallest number in the range i through j (where i and j are the original indexes), you simply look up list[i][j-i][k-1], a very fast O(1) operation:
1st in range [3,4] (values 5,9),   list[3][4-3=1][1-1=0] = 5
2nd in range [1,3] (values 1,7,5), list[1][3-1=2][2-1=1] = 5
3rd in range [0,2] (values 3,1,7), list[0][2-0=2][3-1=2] = 7
(The first index is the start of the range, the second is the end minus the start, and the third is k minus 1.)
Here's some Python code which shows this in action:
orig = [3,1,7,5,9]
print(orig)
print("=====")
# Note: the name "list" shadows the Python built-in; it is kept to match the text.
list = []
for n in range(len(orig)):
    list.append([])
    for m in range(len(orig) - n):
        list[-1].append([])
        for i in range(m+1):
            list[-1][-1].append(orig[n+i])
        list[-1][-1] = sorted(list[-1][-1])
        print("(%d,%d)=%s" % (n, m, list[-1][-1]))
print("=====")
# Gives xth smallest in index range y through z inclusive.
x = 1; y = 3; z = 4; print("(%d,%d,%d)=%d" % (x, y, z, list[y][z-y][x-1]))
x = 2; y = 1; z = 3; print("(%d,%d,%d)=%d" % (x, y, z, list[y][z-y][x-1]))
x = 3; y = 0; z = 2; print("(%d,%d,%d)=%d" % (x, y, z, list[y][z-y][x-1]))
print("=====")
As expected, the output is:
[3, 1, 7, 5, 9]
=====
(0,0)=[3]
(0,1)=[1, 3]
(0,2)=[1, 3, 7]
(0,3)=[1, 3, 5, 7]
(0,4)=[1, 3, 5, 7, 9]
(1,0)=[1]
(1,1)=[1, 7]
(1,2)=[1, 5, 7]
(1,3)=[1, 5, 7, 9]
(2,0)=[7]
(2,1)=[5, 7]
(2,2)=[5, 7, 9]
(3,0)=[5]
(3,1)=[5, 9]
(4,0)=[9]
=====
(1,3,4)=5
(2,1,3)=5
(3,0,2)=7
=====
The current solution is O( (logn)^2 ). I am pretty sure it can be modified to run in O(logn). The main advantage of this algorithm over paxdiablo's algorithm is space efficiency. This algorithm needs O(nlogn) space, not O(n^2) space.
First, the complexity of finding kth smallest element from two sorted arrays of length m and n is O(logm + logn). Complexity of finding kth smallest element from arrays of lengths a,b,c,d.. is O(loga+logb+.....).
Now, sort the whole array and store it. Sort the first half and the second half of the array and store them, and so on. You will have 1 sorted array of length n, 2 sorted arrays of length n/2, 4 sorted arrays of length n/4 and so on. Total memory required = 1*n+2*n/2+4*n/4+8*n/8...= nlogn.
Once you have i and j, figure out the list of subarrays which, when concatenated, give you the range [i,j]. There will be O(logn) such arrays. Finding the kth smallest number among them would take O( (logn)^2 ) time.
Example for the last paragraph:
Assume the array is of size 8 (indexed from 0 to 7). You have the following sorted lists:
A:0-7, B:0-3, C:4-7, D:0-1, E:2-3, F:4-5, G:6-7.
Now construct a tree with pointers to these arrays such that every node contains its immediate constituents. A will be root, B and C are its children and so on.
Now implement a recursive function that returns a list of arrays.
def getArrays(node, i, j):
    # Returns a flat list of the nodes whose arrays together cover [i, j].
    if i == node.min and j == node.max:
        return [node]
    if i <= node.left.max:
        if j <= node.left.max:
            return getArrays(node.left, i, j)  # (i,j) is located within left node
        else:
            # (i,j) is spread over left and right node
            return getArrays(node.left, i, node.left.max) + getArrays(node.right, node.right.min, j)
    else:
        return getArrays(node.right, i, j)  # (i,j) is located within right node
Preprocess: Make an nxn array where the [k][r] element is the kth smallest element of the first r elements (1-indexed for convenience).
Then, given some particular range [i,j] and value for k, do the following:
Find the element at the [k][j] slot of the matrix; call this x.
go down the i-1 column of your matrix and find how many values in it are smaller than or equal to x (treat column 0 as having 0 smaller entries). By construction, this column will be sorted (all columns will be sorted), so it can be found in log time. Call this value s
Find the element in the [k+s][j] slot of the matrix. This is your answer.
E.g., given 3 1 7 5 9
3 1 1 1 1
X 3 3 3 3
X X 7 5 5
X X X 7 7
X X X X 9
Now, if we're asked for the 2nd smallest in [2,4] range (again, 1-indexing), I first find the 2nd smallest in [1,4] range which is 3. I then look at column 1 and see that there is 1 element less than or equal to 3. Finally, I find the 3rd smallest in [1,4] range at [3][4] slot which is 5, as desired.
This takes n^2 space, and log(n) lookup time.
This one does not require pre-processing but is somewhat slower than O(logN). It's significantly faster than a naive iterate-and-count, and could support dynamic modification of the sequence.
It goes like this. Suppose the length n satisfies n=2^x for some x. Construct a segment tree whose root node represents [0,n-1]. For each node, if it represents a segment [a,b] with b>a, let it have two child nodes representing [a,(a+b)/2] and [(a+b)/2+1,b]. (That is, do a recursive divide-by-two.)
Then, on each node, maintain a separate binary search tree for the numbers within that segment. Each modification of the sequence therefore takes O(logN) [on the segment tree] * O(logN) [on the BST]. Queries can be done like this: let Q(a,b,x) be the rank of x within segment [a,b]. If Q(a,b,x) can be computed efficiently, a binary search on x can compute the desired answer (with an extra O(logE) factor, where E is the size of the range of values searched).
Q(a,b,x) can be computed as follows: find the smallest number of segments that make up [a,b], which can be done in O(logN) on the segment tree. For each segment, query the binary search tree for that segment for the number of elements less than x. Add all these numbers to get Q(a,b,x).
This should be O(logN*logE*logN). Not exactly what you asked for, though.
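A static Python sketch of this rank-plus-binary-search idea. It swaps in a technique that differs from the description in a few ways, all assumptions of the sketch: each node stores a plain sorted list instead of a binary search tree (so it does not support modification), the rank counts elements less than or equal to x, and the binary search runs over the sorted values of the array rather than over a numeric value range. The class name `MergeSortTree` is illustrative.

```python
from bisect import bisect_right


class MergeSortTree:
    def __init__(self, values):
        self.n = len(values)
        self.sorted_all = sorted(values)
        self.tree = {}  # node id -> sorted list of that node's segment
        if self.n:
            self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.tree[node] = [values[lo]]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.tree[node] = sorted(self.tree[2 * node] + self.tree[2 * node + 1])

    def count_le(self, a, b, x, node=1, lo=0, hi=None):
        # Number of elements <= x among positions a..b (0-based, inclusive).
        if hi is None:
            hi = self.n - 1
        if b < lo or hi < a:
            return 0
        if a <= lo and hi <= b:
            return bisect_right(self.tree[node], x)
        mid = (lo + hi) // 2
        return (self.count_le(a, b, x, 2 * node, lo, mid)
                + self.count_le(a, b, x, 2 * node + 1, mid + 1, hi))

    def kth_smallest(self, a, b, k):
        # Smallest array value whose rank within a..b is at least k.
        lo, hi = 0, self.n - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self.count_le(a, b, self.sorted_all[mid]) >= k:
                hi = mid
            else:
                lo = mid + 1
        return self.sorted_all[lo]


t = MergeSortTree([3, 1, 7, 5, 9])
print(t.kth_smallest(3, 4, 1), t.kth_smallest(1, 3, 2), t.kth_smallest(0, 2, 3))
```

With 1 <= k <= b-a+1, each query does O(log n) rank computations, and each rank computation runs a binary search in O(log n) nodes.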
In O(log n) time it's not possible to read all of the elements of the array. Since it's not sorted, and there's no other provided information, this is impossible.
There's no way you can do better than O(n) in both worst and average case. You have to look at every single element.
