Algorithm for obtaining a sum with minimum number of terms - algorithm

The problem statement is as follows:
Given N, find x1, x2, ..., xp such that N = x1 + x2 + ... + xp, where p (the number of terms in the sum) is minimal and every number from 1 to N-1 can be obtained as the sum of some subset of (x1, x2, ..., xp). Numbers in the set may be repeated.
For example if N=7.
7 = 1+2+4
And 6= (2,4) , 5= (4,1), 4 = (4),3=(1,2) and so on.
Example 2:
8 = 1+2+4+1
Example 3:(invalid)
8 = 1+2+5
But we can't get 4 from any subset of (1,2,5), so (1,2,5) is not a valid combination.
My approach is: if N-1 can be written as a sum of p terms, then N needs either p or p+1 terms. But that approach requires checking all possible combinations that sum to N-1 and have p terms. Does anyone have a better solution than this?
Solution:
Step1:
Assume that our answer is a set with K entries. We can obtain at most 2^K different sums from these numbers, because each entry either appears in the sum or does not. One of those sums is the empty sum 0, and for the number N we need every sum from 1 to N. Therefore 2^K - 1 >= N, so K >= log2(N+1), and the smallest possible K is ceil(log2(N+1)).
Step2:
After step 1 we know that our answer must include K entries, but what are these entries? Assume that they are (a1, a2, a3, ..., ak). Then a number P can be written as
P = a1*b1 + a2*b2 + a3*b3 + ... + ak*bk, where every b[i] is 0 or 1. We can read P as the value of the binary number with bits b1, b2, b3, ..., bk (b1 being the least significant), so we can take a[i] = 2^(i-1).

You should take the numbers 1, 2, 4, ..., 2^k, where k is the largest exponent with 1 + 2 + ... + 2^k <= N, together with N - (1 + 2 + ... + 2^k). (Include the last one only if it is not 0.)
Proof
First of all, with only k numbers we can get at most 2^k - 1 different nonzero sums. So if N >= 2^k, we need at least k + 1 numbers. Hence, if our group of numbers is correct, it has the minimum size (or is one of the minimal groups).
It's easy to see that we can get any number from 0 to 2^(k+1) - 1 using the powers of two. What if we need more? We take the last number, which is less than 2^(k+1), and make up the difference using the powers of two.
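The construction described above can be sketched in a few lines of Python. This is only an illustration: the function names `min_terms` and `subset_sums` are made up for this sketch, and the second function is a brute-force check that is only practical for small N.

```python
def min_terms(n):
    """Take 1, 2, 4, ... while the running total stays <= n, then add the remainder."""
    terms, total, power = [], 0, 1
    while total + power <= n:
        terms.append(power)
        total += power
        power *= 2
    if total < n:
        # The remainder is smaller than the next power of two.
        terms.append(n - total)
    return terms


def subset_sums(terms):
    """Return the set of all sums obtainable from subsets of terms."""
    sums = {0}
    for t in terms:
        sums |= {s + t for s in sums}
    return sums


print(min_terms(7))  # [1, 2, 4]
print(min_terms(8))  # [1, 2, 4, 1]
```

For every N the number of terms produced equals ceil(log2(N+1)), which matches the lower bound from the counting argument.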

I haven't run out the numbers on this, but you should be very very interested in the fact that you have listed the first three powers of two.
If I were looking for a better solution, that's where I'd start.

Related

Number of sets of queries for the game guess the number?

I have asked this problem on math.stackexchange, but I didn't get any answer.
The problem is to solve a two-player number-guessing game. Let the first person be A and the second person be B. A chooses a number x between 1 and an upper limit n. B gives queries to A in the form (L,R), and A answers Yes/No depending on whether the number lies in the interval from L to R, both inclusive.
B will need to ask a set of queries to uniquely determine the number x. So the problem is to find the number of distinct sets of queries such that B will be able to uniquely determine x, irrespective of the value of x. A returns the answers to the queries as a whole series -- B gets the yes/no responses in a batch, only after making all of the queries.
For example, let's say n is 2. The sets of possible queries would be,
{(1,1)},
{(2,2)},
{(1,1),(2,2)},
{(1,1),(1,2)},
{(2,2),(1,2)},
{(1,1),(2,2),(1,2)}
Question: How can I determine which of these query sets will uniquely identify any integer that A might choose?
All I could think of was that we basically need to isolate each of the possible numbers from 1 to n in some way; otherwise it's not possible to uniquely determine the number. But I have no idea what to do with this information.
Set #1 to detect a number:
1 set with all single pairs: {{1,1},{2,2},...,{N,N}}
Set #2 to detect a number:
N sets with single pairs except one pair: {{1,1},{2,2},..,{x-1,x-1},{x+1,x+1},..,{N,N}}
Pairs that can't help to identify number:
N-1 pairs with length 2: {1,2},{2,3},...,{N-1,N}
N-2 pairs with length 3: {1,3},{2,4},...,{N-2,N}
...
2 pairs with length N-1: {1,N-1},{2,N}
1 pair with length N: {1,N}
Total number of useless pairs is:
K = (N-1) + (N-2) + ... 2 + 1 = N*(N-1)/2
Total number of useless sets is:
Z = C(K,0) + C(K, 1) + ... + C(K, K) = 2^K
Number of sets of queries
To find the answer we need to combine all correct sets with all other types of sets.
ANSWER = (Number of set #1 + Number of sets #2) * Z = (1 + N) * (2^K)
UPD: Answer is wrong, see comment below
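For small n, a closed form like the one above can be sanity-checked by brute force. The sketch below is an assumption-laden illustration, not a solution to the counting problem: the name `count_identifying_sets` is made up, and the enumeration is exponential in the number of intervals, so it is only usable for tiny n. A query set identifies x exactly when every x in 1..n produces a different vector of yes/no answers.

```python
from itertools import combinations


def count_identifying_sets(n):
    """Count query sets whose batch of answers differs for every x in 1..n."""
    intervals = [(l, r) for l in range(1, n + 1) for r in range(l, n + 1)]
    count = 0
    for size in range(len(intervals) + 1):
        for queries in combinations(intervals, size):
            # The answer vector A would return for each possible x.
            signatures = {
                tuple(l <= x <= r for (l, r) in queries)
                for x in range(1, n + 1)
            }
            if len(signatures) == n:
                count += 1
    return count


print(count_identifying_sets(2))  # 6
print(count_identifying_sets(3))  # 48
```

For n = 2 this agrees with the six sets listed in the question. For n = 3 it reports 48, whereas (1 + N) * 2^K gives 4 * 8 = 32, which is consistent with the note that the formula is wrong.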

Calculating limits in dynamic programming

I found this question on topcoder:
Your friend Lucas gave you a sequence S of positive integers.
For a while, you two played a simple game with S: Lucas would pick a number, and you had to select some elements of S such that the sum of all numbers you selected is the number chosen by Lucas. For example, if S={2,1,2,7} and Lucas chose the number 11, you would answer that 2+2+7 = 11.
Lucas now wants to trick you by choosing a number X such that there will be no valid answer. For example, if S={2,1,2,7}, it is not possible to select elements of S that sum up to 6.
You are given the int[] S. Find the smallest positive integer X that cannot be obtained as the sum of some (possibly all) elements of S.
Constraints: - S will contain between 1 and 20 elements, inclusive. - Each element of S will be between 1 and 100,000, inclusive.
But in the editorial solution it has been written:
How about finding the smallest impossible sum? Well, we can try the following naive algorithm: First try with x = 1, if this is not a valid sum (found using the methods in the previous section), then we can return x, else we increment x and try again, and again until we find the smallest number that is not a valid sum.
Let's find an upper bound for the number of iterations, the number of values of x we will need to try before we find a result. First of all, the maximum sum possible in this problem is 100000 * 20 (All numbers are the maximum 100000), this means that 100000 * 20 + 1 will not be an impossible value. We can be certain to need at most 2000001 steps.
How good is this upper bound? If we had 100000 in each of the 20 numbers, 1 wouldn't be a possible sum. So we actually need one iteration in that case. If we want 1 to be a possible sum, we should have 1 in the initial elements. Then we need a 2 (Else we would only need 2 iterations), then a 4 (3 can be found by adding 1+2), then 8 (Numbers from 5 to 7 can be found by adding some of the first 3 powers of two), then 16, 32, .... It turns out that with the powers of 2, we can easily make inputs that require many iterations. With the first 17 powers of two, we can cover up to the first 262143 integer numbers. That should be a good estimation for the largest number. (We cannot use 2^18 in the input, smaller than 100000).
Up to 262143 times, we need to query if a number x is in the set of possible sums. We can just use a boolean array here. It appears that even O(log(n)) data structures should be fast enough, however.
I did understand the first paragraph. But after that they explain something about "How good is this upper bound?...", and I couldn't understand that paragraph. How did they deduce that we need to query up to 262143 times whether a number x is in the set of possible sums?
I am a newbie at dynamic programming and so it would be great if somebody could explain this to me.
Thank you.
The idea is as follows:
If the input sequence contains the first k powers of two, 2^0, 2^1, ..., 2^(k-1), then the sum can be any integer between 0 and 2^k - 1. Taking 2^17 as the greatest power of two in the sequence, the greatest sum that you can build from these 18 numbers is 2^18 - 1 = 262,143. (Strictly speaking, 2^17 = 131,072 exceeds the 100,000 limit on elements, so the largest power of two the constraints allow is 2^16 = 65,536.) If a power of two were missing, there would be a smaller sum that was not possible to achieve.
However, the statement is missing that there may be 2 more numbers in the sequence (at most 20). From these two numbers, you can repeat the same process. Hence, the maximum number to check is actually (2^18) - 1 + (2^2) - 1.
You may wonder why we use powers of two and not any other powers. The reason is the binary selection that we perform on the numbers in the input sequence. Either we add a number to the sum or we don't. So, if we represent this selection for number ni as a selection variable si (either 0 or 1), then the possible sum is:
s = s0 * n0 + s1 * n1 + s2 * n2 + ...
Now, if we choose the ni to be powers of two ni = 2^i, then:
s = s0 * 2^0 + s1 * 2^1 + s2 * 2^2 + ...
= sum si * 2^i
This is equivalent to the binary representations of numbers (see Positional Notation). By definition, different choices for the selection variables will produce different sums. Hence, the number of possible sums is maximal by choosing powers of two in the input sequence.
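As a rough illustration of the search discussed in the editorial, the sketch below records every reachable sum as a bit in one Python integer and then looks for the lowest bit that is not set. The function name `smallest_impossible_sum` is made up here, and the bit-mask representation is a substitute for the boolean array the editorial mentions, not the editorial's own code.

```python
def smallest_impossible_sum(s):
    """Return the smallest positive integer that is not a subset sum of s."""
    reachable = 1  # bit i is set when sum i is reachable; bit 0 is the empty sum
    for value in s:
        # Every reachable sum stays reachable, and so does that sum plus value.
        reachable |= reachable << value
    # Adding 1 flips the lowest zero bit to one; masking with ~reachable isolates it.
    lowest_zero = (reachable + 1) & ~reachable
    return lowest_zero.bit_length() - 1


print(smallest_impossible_sum([2, 1, 2, 7]))  # 6
print(smallest_impossible_sum([1, 2, 4]))     # 8
```

With the powers of two 2^0 through 2^16 as input, the result is 2^17 = 131072, which shows how an input built from powers of two pushes the answer up.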

Generate a random integer from 0 to N-1 which is not in the list

You are given N and an int K[].
The task at hand is to generate a uniformly distributed random number between 0 and N-1 that does not appear in K.
N is an integer >= 0.
K.length is < N-1, and 0 <= K[i] <= N-1. Also assume K is sorted and each element of K is unique.
You are given a function uniformRand(int M) which generates a uniform random number in the range 0 to M-1. Assume this function's complexity is O(1).
Example:
N = 7
K = {0, 1, 5}
the function should return any random number { 2, 3, 4, 6 } with equal
probability.
I could get an O(N) solution for this: first generate a random number between 0 and N - K.length - 1 (that is, uniformRand(N - K.length)), then map that random number to a number not in K. The second step takes the complexity to O(N). Can it be done better, maybe in O(log N)?
You can use the fact that all the numbers in K[] are between 0 and N-1 and they are distinct.
For your example case, you generate a random number from 0 to 3. Say you get a random number r. Now you conduct binary search on the array K[].
Initialize i = K.length/2.
Find K[i] - i. This will give you the number of values missing from the array in the range 0 to K[i].
For example K[2] = 5. So 3 elements are missing from K[0] to K[2] (2,3,4)
Hence you can decide whether you have to conduct the remaining search in the first part of array K or the next part. This is because you know r.
This search will give you a complexity of O(log(K.length)).
EDIT: For example,
N = 7
K = {0, 1, 4} // modified the array to clarify the algorithm steps.
the function should return any random number { 2, 3, 5, 6 } with equal probability.
Random number generated between 0 and N-K.length = random{0-3}. Say we get 3. Hence we require the 4th missing number in array K.
Conduct binary search on array K[].
Initial i = K.length/2 = 1.
Now we see K[1] - 1 = 0. Hence no number is missing up to i = 1, so we search the latter part of the array.
Now i = 2. K[2] - 2 = 4 - 2 = 2. Hence there are 2 missing numbers up to index i = 2. But we need the 4th missing element. So we again have to search in the latter part of the array.
Now we reach an empty array. What should we do now? If we reach an empty array between say K[j] & K[j+1] then it simply means that all elements between K[j] and K[j+1] are missing from the array K.
Hence all elements above K[2] are missing from the array, namely 5 and 6. We need the 4th element out of which we have already discarded 2 elements. Hence we will choose the second element which is 6.
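The binary search walked through above can be sketched as follows. The names `kth_missing` and `random_not_in` are made up for this sketch, r is treated as 0-based (so r = 3 means the 4th missing number), and `random.randrange` stands in for the question's uniformRand.

```python
import random


def kth_missing(K, r):
    """Return the r-th (0-based) smallest non-negative integer not in sorted K."""
    lo, hi = 0, len(K)
    # Find the first index whose element has more than r missing values below it.
    while lo < hi:
        mid = (lo + hi) // 2
        if K[mid] - mid > r:
            hi = mid
        else:
            lo = mid + 1
    # Exactly lo elements of K are smaller than the answer.
    return r + lo


def random_not_in(K, N, uniform_rand=random.randrange):
    """Pick uniformly from 0..N-1 excluding the sorted, distinct values in K."""
    r = uniform_rand(N - len(K))
    return kth_missing(K, r)


print(kth_missing([0, 1, 4], 3))  # 6
```

Each call does one random draw and one binary search over K, so it runs in O(log |K|) time without any preprocessing.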
Binary search.
The basic algorithm:
(not quite the same as the other answer - the number is only generated at the end)
Start in the middle of K.
By looking at the current value and its index, we can determine the number of pickable numbers (numbers not in K) to the left.
Similarly, by including N, we can determine the number of pickable numbers to the right.
Now randomly go either left or right, weighted based on the count of pickable numbers on each side.
Repeat in the chosen subarray until the subarray is empty.
Then generate a random number in the range consisting of the numbers before and after the subarray in the array.
The running time would be O(log |K|), and, since |K| < N-1, O(log N).
The exact mathematics for number counts and weights can be derived from the example below.
Extension with K containing a bigger range:
Now let's say (for enrichment purposes) K can also contain values N or larger.
Then, instead of starting with the entire K, we start with a subarray up to position min(N, |K|), and start in the middle of that.
It's easy to see that the N-th position in K (if one exists) will be >= N, so this chosen range includes any possible number we can generate.
From here, we need to do a binary search for N (which would give us a point where all values to the left are < N, even if N could not be found) (the above algorithm doesn't deal with K containing values greater than N).
Then we just run the algorithm as above with the subarray ending at the last value < N.
The running time would be O(log N), or, more specifically, O(log min(N, |K|)).
Example:
N = 10
K = {0, 1, 4, 5, 8}
So we start in the middle - 4.
Given that we're at index 2, we know there are 2 elements to the left, and the value is 4, so there are 4 - 2 = 2 pickable values to the left.
Similarly, there are 10 - (4+1) - 2 = 3 pickable values to the right.
So now we go left with probability 2/(2+3) and right with probability 3/(2+3).
Let's say we went right, and our next middle value is 5.
We are at the first position in this subarray, and the previous value is 4, so we have 5 - (4+1) = 0 pickable values to the left.
And there are 10 - (5+1) - 1 = 3 pickable values to the right.
We can't go left (0 probability). If we go right, our next middle value would be 8.
There would be 2 pickable values to the left, and 1 to the right.
If we go left, we'd have an empty subarray.
So then we'd generate a number between 5 and 8, which would be 6 or 7 with equal probability.
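A minimal sketch of this weighted descent, assuming every value in K is below N and that at least one number is pickable. The name `pick_weighted` and the `rand_below` parameter (a stand-in for uniformRand) are hypothetical. The midpoint chosen in each subarray may differ from the one in the example, which does not affect the distribution.

```python
import random


def pick_weighted(K, N, rand_below=random.randrange):
    """Descend through sorted K, choosing a side in proportion to its pickable count."""
    lo, hi = 0, len(K)          # current subarray is K[lo:hi]
    low_val, high_val = -1, N   # exclusive bounds on the values still in play
    while lo < hi:
        mid = (lo + hi) // 2
        # Pickable values strictly between low_val and K[mid].
        left = K[mid] - low_val - 1 - (mid - lo)
        # Pickable values strictly between K[mid] and high_val.
        right = high_val - K[mid] - 1 - (hi - mid - 1)
        if rand_below(left + right) < left:
            hi, high_val = mid, K[mid]
        else:
            lo, low_val = mid + 1, K[mid]
    # No element of K lies between the bounds any more.
    return low_val + 1 + rand_below(high_val - low_val - 1)


print(pick_weighted([0, 1, 4, 5, 8], 10))
```

A side with zero pickable values is never chosen, so the final range always contains at least one candidate.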
This can be solved by basically solving this:
Find the rth smallest number not in the given array, K, subject to
conditions in the question.
For that consider the implicit array D, defined by
D[i] = K[i] - i for 0 <= i < L, where L is length of K
We also set D[-1] = 0 and D[L] = N
We also define K[-1] = -1, so that D[-1] = K[-1] - (-1) = 0 is consistent with the definition of D.
Note, we don't actually need to construct D. Also note that D is sorted (and all elements non-negative), as the numbers in K[] are unique and increasing.
Now we make the following claim:
CLAIM: To find the rth smallest number not in K[], let r' be the largest number in D which is < r, and let j be the position of the rightmost occurrence of r' in D. Such an r' exists, because D[-1] = 0. Once we find such an r' (and j), the number we are looking for is r-r' + K[j].
Proof: Basically the definition of r' and j tells us that there are exactly r' numbers missing from 0 to K[j], and at least r numbers missing from 0 to K[j+1]. Thus all the numbers from K[j]+1 to K[j+1]-1 are missing (and these missing numbers are at least r-r' in number), and the number we seek is among them, given by K[j] + r-r'.
Algorithm:
In order to find (r',j) all we need to do is a (modified) binary search for r in D, where we keep moving to the left even if we find r in the array.
This is an O(log |K|) algorithm.
If you are running this many times, it probably pays to speed up your generation operation: O(log N) time just isn't acceptable.
Make an empty array G. Starting at zero, count upwards while progressing through the values of K. If a value isn't in K add it to G. If it is in K don't add it and progress your K pointer. (This relies on K being sorted.)
Now you have an array G which has only acceptable numbers.
Use your random number generator to choose a value from G.
This requires O(N) preparatory work and each generation happens in O(1) time. After N look-ups the amortized time of all operations is O(1).
A Python mock-up:
import random

class PRNG:
    def __init__(self, K, N):
        # G collects every value in 0..N-1 that is not in the sorted list K.
        self.G = []
        kptr = 0
        for i in range(N):
            if kptr < len(K) and K[kptr] == i:
                kptr += 1
            else:
                self.G.append(i)

    def getRand(self):
        # random.randint includes both endpoints.
        rn = random.randint(0, len(self.G) - 1)
        return self.G[rn]

prng = PRNG([0, 1, 5], 7)
for i in range(20):
    print(prng.getRand())

Arrangement of sequence of 'n' numbers

In how many ways can you arrange a sequence of 'n' (1 to n) numbers such that no number occurs at the index represent by its value?
For example:
1 can not be at first position
2 can not be at second position
.
.
n can not be at nth position
Please give a general solution. Also solve it for n=6.
It's not homework.
You want fixed point free permutations, also known as derangements. The formula for their number is slightly more complicated than for the number of permutations that may have fixed points.
Let P(n) be the number of such arrangements for n numbers.
For 123456....n
Cases are of the form
2*****
3*****
4*****
5*****
.
.
n*****
Now 1 can be in any of the remaining (n-1) positions.
If 1 is put at the position of the number that replaced it...
21****
3*1***
4**1**
.
.
n****1
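then 1 and the number that replaced it have simply swapped places, and the remaining n-2 numbers must be arranged among themselves.
Then total cases = (n-1) * P(n-2)
Otherwise,
1 is also restricted from one particular position (the positions shown in the cases above), so each of the remaining n-1 numbers, including 1, has exactly one forbidden position.
Then total cases = (n-1) * P(n-1)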
So
P(n) = (P(n-1) + P(n-2)) * (n-1)
with P(1) = 0
and P(2) = 1
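The recurrence can be evaluated directly. The sketch below is a straightforward iterative version; the function name `derangements` is chosen here for illustration.

```python
def derangements(n):
    """Evaluate P(n) = (n-1) * (P(n-1) + P(n-2)) with P(1) = 0 and P(2) = 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    prev, curr = 0, 1  # P(1), P(2)
    if n == 1:
        return prev
    for m in range(3, n + 1):
        prev, curr = curr, (m - 1) * (curr + prev)
    return curr


print(derangements(6))  # 265
```

The intermediate values are P(3) = 2, P(4) = 9, P(5) = 44, which leads to P(6) = 5 * (44 + 9) = 265.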
The number of derangements (fixed point free permutations) of n things is round(n!/e) where e is the base of natural logarithms. Here round means nearest integer function. This is described in the Wikipedia article, but in a manner that could stand clarification.
For n = 6 one easily calculates there are round(264.87...) = 265 derangements.
In effect you've asked a frequently covered question from MathSE.

Need an algorithm for this problem

There are two integer sequences A[] and B[] of length N,both unsorted.
Requirement: by swapping elements between A[] and B[] (any element of A[] may be exchanged with any element of B[], not only the one at the same index), make the difference between the sum of all elements in A[] and the sum of all elements in B[] as small as possible.
PS: actually, it is an interview question I encountered.
Many thanks
This is going to be NP-hard! I believe you can do a reduction from Subset Sum to this.
As per BlueRaja/polygene's comments, I will try to provide a full reduction from Subset Sum.
Here is a reduction:
Subset Sum problem: Given integers x1, x2, ..., xn, is there some non-empty subset which sums to zero?
Our problem: Given two integer arrays of size k, find the minimum possible difference of the sum of the two arrays, assuming we can shuffle around the integers in the arrays, treating both arrays as one array.
Say we had a polynomial time algo for our problem.
Say now you are given integers T = {x1,x2, ...,xn} (multiset)
Let Si = x1 + x2 + ...+ xn + xi.
Let Ti = {x1, x2, ..., xi-1, xi+1, ..., xn } ( = T - xi)
Define
Ai = Array formed using Ti
Bi = [Si, 0, ..., 0] (i.e one element is Si and rest are zeroes).
Let mi = the min difference found by our problem for arrays Ai and Bi
(we run our problem n times).
Claim: Some non-empty subset of T sums to zero if and only if, there is some i, for which mi = 0.
Proof: (wlog) say x1 + x2 + .. + xk = 0
Then
A = [xk+1, ..., xn, 0, ...0]
B = [x2, x3, ..., xk, S1, 0, ..0]
gives the minimum difference m1 to be |x2 + .. + xk + (x1 + ... + xn) + x1 - (xk+1 + .. + xn)| = |2(x1+ x2 + .. xk)| = 0.
The "if" part can be proved similarly.
In fact, this also follows (more easily) from Partition: just create a new array with all zeroes.
Hopefully I haven't made any mistakes.
Take any instance of the NP-complete partition problem:
Partition a multiset A of positive integers into two multisets B and C with the same sum
like {a1,a2,...,an}. Add n zeroes {0,0,0...,0,a1,...,an} and ask if the set can be partitioned into two multisets A and B with the same sum and same number of elements. I claim these two conditions are equivalent:
If A and B are a solution to the problem, then you can strike out the zeroes and get a solution of the partition problem.
If there is a solution to the partition problem, for example ai1 + ai2 + ... aik = aj1 + ... +ajl where {ai1, ai2, aik, aj1, ..., ajl} = {a1, ... , an} then obviously k+l = n. Add l zeroes to the left side and k zeroes to the right side and you'll get 0 + ... + 0 + ai1 + ai2 + ... aik = 0 + ... + 0 + aj1 + ... +ajl, which is a solution of your problem.
So, this is a reduction (so the problem is NP-hard) and the problem is NP, so it is NP-complete.
"sequences A[] and B[] of length N" -> does this mean both A and B are each of length N?
(For the purpose of clarity I am using 1-based arrays below).
If so, how about this:
Assume A[1..N] and B[1..N]
1. Concatenate A and B into a new array C of length 2N: C[1..N] <- A[1..N]; C[N+1 .. 2N] <- B[1..N]
2. Sort C in ascending order.
3. Take the first pair of numbers from C; send the first element (C[1]) to A[1] and the second element (C[2]) to B[1]
4. Take the second pair of numbers from C; this time send the second element (C[4]) to A[2] and the first element (C[3]) to B[2] (the order of elements in the pair sent to A and B is the opposite of step 3)
5. ... repeat steps 3 and 4 until C is exhausted
The observation here is that, in a sorted array, an adjacent pair of numbers will have the smallest difference (compared to a pair of numbers from non-adjacent positions). Step 3 ensures that A[1] and B[1] consists of a pair of numbers with the least possible difference. Step 4 ensures that (a) A[2] and B[2] consist of a pair of numbers with the least possible difference (from the available numbers) and also (b) that the difference is opposite in sign from step 3. By continuing like this, we are ensuring that A[i] and B[i] contain numbers with the least possible difference. Also, by flipping the order in which we send elements to A and B, we are ensuring that the difference changes sign for each successive i.
Try being greedy about it. Given such limited information, I'm not sure what else one could put out there.
I'm not sure that this will ensure the minimum possible distance, but the first thing that comes to my mind is something like this:
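int diff = 0;  // running value of sum(a) - sum(b) over the positions seen so far
for (int i = 0; i < len; i++) {
    int x = a[i] - b[i];
    // Swap when subtracting x leaves the running difference closer to zero.
    if (abs(diff - x) < abs(diff + x)) {
        swap(a, b, i);
        diff -= x;
    } else {
        diff += x;
    }
}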
This assumes that you have a swap function which takes the two arrays and exchanges the items at position i :)
By computing and adding the difference between the two values at position i, you get the incremental difference between the sums of the elements of the two arrays.
At each step you check whether it's better to add (a[i]-b[i]) or (b[i]-a[i]). If b[i]-a[i] is the better choice, you swap the elements at position i in the arrays.
Maybe this will not be the best way, but it should be a start :)
The problem is NP-Complete.
We can reduce the partition problem to the decision version of this problem, i.e. given two arrays of ints of the same size, determine whether items can be swapped so that the sums are equal.
The input to the partition problem: a set S of integers, of size N
In order to transform this input into an input to our problem, we define A to be an array of all items in S, and B an array of the same size, with B[i]=0 for all i. This transformation is linear in the input size.
It is clear that our algorithm applied on A and B returns true if and only if there is a partition of S into 2 subsets such that the sums are equal.
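Since the problem is NP-hard, an exact answer for small inputs can still be obtained by exhaustive search. The sketch below is one such brute force, not an efficient algorithm: the name `min_sum_difference` is made up, and it tries every way of choosing which N of the 2N pooled elements end up in A, which is exponential in N.

```python
from itertools import combinations


def min_sum_difference(a, b):
    """Smallest |sum(A) - sum(B)| reachable by exchanging elements between a and b."""
    pool = a + b
    total = sum(pool)
    # If A receives elements summing to s, then B sums to total - s.
    return min(abs(total - 2 * sum(chosen))
               for chosen in combinations(pool, len(a)))


print(min_sum_difference([1, 2, 3], [4, 5, 6]))  # 1
print(min_sum_difference([1, 1], [3, 3]))        # 0
```

This can serve as a reference when testing greedy heuristics like the ones suggested above on small arrays.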

Resources