Determine if two numbers in a set sum to x in n lg n [duplicate] - algorithm

This question already has answers here:
Determine whether or not there exist two elements in Set S whose sum is exactly x - correct solution?
(8 answers)
Closed 7 years ago.
I do not want a solution if my own answer is incorrect, because I really want to solve this on my own. What I do want is either a "yes, this is correct" or a "no, this is not correct", followed by tips or suggestions that may lead to an answer without spoiling it all.
The question:
Describe a theta(nlgn)-time algorithm that, given a set S of n integers and another integer x, determines whether or not there exist two elements in S whose sum is exactly x. (Intro to Algorithms, 2.3-7)
The attempt:
First, the problem doesn't state whether or not the set is sorted. I assumed it was not, and sorted it using merge sort, as it is theta(nlgn) in the worst case.
Then I said okay, the only way this will still be theta(nlgn) is if I recursively split the problem in two. My approach was to start at index i=0 of our array, work out what value k I would need so that S[i] + k = x, and then use binary search to split the problem in half until I either found k or not. If I did not find a k such that S[i] + k = x, I would continue the same process for the next index, and so on up to n-1. This results in theta(nlgn).
The total time complexity from both sorting the array and searching for k adds up to theta(nlgn) + theta(nlgn) = theta(2nlgn) = theta(nlgn) once constants are removed.

Yes, your solution is correct. Don't forget to handle the case where the found element k is the element at the same index i.
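For illustration, a minimal Python sketch of that plan (the function and variable names here are illustrative, not from the original post): sort once, then binary-search for each element's complement, making sure a hit is not the element itself.

from bisect import bisect_left

def has_pair_with_sum(values, x):
    # Return True if two distinct elements of values sum to exactly x.
    s = sorted(values)                  # theta(n lg n) sort
    for i, v in enumerate(s):
        k = x - v                       # the complement we need
        j = bisect_left(s, k)           # theta(lg n) binary search
        if j < len(s) and s[j] == k:
            if j != i or (j + 1 < len(s) and s[j + 1] == k):
                return True             # k found at a different position
    return False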

Related

Sum of a number with any k elements in Array

Design an algorithm that, given a set S of n integers and another integer x, determines whether or not there exist k (n > k > 2) elements in S whose sum is exactly x. Please give the running time of your algorithm.
I have been preparing for an interview, and I have come across this algorithm. I have solved problems where k is specified in the problem, like 2 or 3, but I cannot find an answer that works for any k. I have tried solving it using dynamic programming but didn't get results. Can anyone help me with this?
You can make an int array cnt of size x+1, then go through the set and mark the reachable sums. All elements of cnt are set to -1 initially, except element zero, which is set to zero.
Repeat the same process for each element si of S: for every position i of cnt that is non-negative, look at cnt[i+si] (if it is within the bounds of the array) and set cnt[i+si] = max(cnt[i]+1, cnt[i+si]). Scan the positions from high to low so that si is not counted more than once in the same pass.
Once you have gone through all elements of S, check cnt[x]. If it is two or more, then there exists a combination of two or more elements in S adding up to x.
This algorithm is pseudo-polynomial, with running time O(x*|S|).
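A rough Python sketch of that table-filling idea, assuming the elements are positive integers (the function name is illustrative):

def max_count_summing_to(S, x):
    # cnt[i] = largest number of elements of S that can sum to exactly i, or -1 if unreachable
    cnt = [-1] * (x + 1)
    cnt[0] = 0
    for si in S:
        # scan positions downwards so each element is used at most once
        for i in range(x - si, -1, -1):
            if cnt[i] >= 0:
                cnt[i + si] = max(cnt[i + si], cnt[i] + 1)
    return cnt[x]       # a value of 2 or more means two or more elements reach x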

list all the permutations to sum up some numbers to a dest number [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Algorithm to find which number in a list sum up to a certain number
question:
there is a list: [1,2,3,4,6,8,10,12]; I want to use these numbers to sum up to a new number, 16.
rules:
1) you don't need to use all of the numbers; 6 + 10 is OK.
2) a number can be used multiple times; 12 + 2 + 1 + 1 is OK.
3) order matters; 12 + 6 and 6 + 12 are two different combinations.
I have seen algorithms that sum up a list of numbers over all combinations, but this is not the same.
I don't know much about algorithms, so if this fits a certain algorithm, please let me know, or some Python code / pseudo-code would be much appreciated.
First, note that even deciding whether there is any subset that sums to the desired number is NP-Complete - it is known as the subset sum problem - so there is no known polynomial-time solution for it.
Now, regarding the specific issue, here are some options:
First, there is of course the obvious "generate all subsets and check the sum" approach. Note that if your elements are all non-negative, you can use branch and bound and prune a large portion of the possibilities before you actually develop them: if you have built a partial subset X with sum(X) == s and you are looking for a number n < s, you can be sure that no set containing X will be a solution. Something along the lines of:
def findSubsets(items, sol, n):
    if not items and n == 0:            # found a feasible subset!
        print(sol)
        return
    elif n < 0:                         # bound: non-feasible, overshot the target
        return
    elif not items:                     # ran out of items while still short of n
        return
    e = items.pop(0)                    # take the first element
    sol.append(e)
    findSubsets(items, sol, n - e)      # branch 1: include e
    sol.pop()
    findSubsets(items, sol, n)          # branch 2: exclude e
    items.insert(0, e)                  # cleanup: return the list to its original state
Invoke it with findSubsets(items, [], n), where items is your list of numbers, n is the desired number, and [] is an empty partial solution.
Note that it can easily be parallelized if needed; there is no real synchronization required between two explored subsets.
Another alternative, if the list contains only integers, is using dynamic programming to solve the subset sum problem. Once you have the table, you can re-create all the chosen elements by walking back through it. This similar question discusses how to get the item list back from the knapsack DP solution; the principles of the two problems are pretty much the same.
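To illustrate that DP-plus-walk-back idea, here is a minimal Python sketch, assuming non-negative integers and each number used at most once (names are illustrative):

def subset_summing_to(items, target):
    # dp[i][s] is True if some subset of the first i items sums to s
    n = len(items)
    dp = [[False] * (target + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = True
    for i in range(1, n + 1):
        for s in range(1, target + 1):
            dp[i][s] = dp[i - 1][s] or (s >= items[i - 1] and dp[i - 1][s - items[i - 1]])
    if not dp[n][target]:
        return None
    chosen, s = [], target
    for i in range(n, 0, -1):           # walk back through the table
        if not dp[i - 1][s]:            # items[i-1] had to be taken
            chosen.append(items[i - 1])
            s -= items[i - 1]
    return chosen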

Find the smallest missing number in an array [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Find the Smallest Integer Not in a List
I got asked this question in an interview
Given an unsorted array, find the smallest number that is missing. Assume that all numbers are positive.
input = {4,2,1,3,678,3432}
output = 5
Sorting it was my first approach. My second approach was to use a boolean flag array, but that takes up a lot of space.
Is there any other better approach than this ?
Suppose the length of the given array is N.
You can go with the boolean-flag approach, but you don't need to take numbers that are larger than N into account, since they're obviously too big to affect the answer. You could also consider a bitmap to save some space.
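A small Python sketch of that bounded flag-array idea (my own illustration; I cap the flags at N+1 because the answer can never exceed N+1):

def smallest_missing_positive(arr):
    # O(N) time and O(N) space: flag which of 1..N+1 are present, then scan
    n = len(arr)
    seen = [False] * (n + 2)
    for v in arr:
        if 1 <= v <= n + 1:     # values larger than N+1 cannot affect the answer
            seen[v] = True
    for i in range(1, n + 2):
        if not seen[i]:
            return i

For the example input {4, 2, 1, 3, 678, 3432} this returns 5.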
An alternative solution to unkulunkulu's is the following:
1. k := 1
2. Make an array of 2^k booleans, all initially set to false.
3. For every value V in the array: if V < 2^k, set the V'th boolean to true.
4. Scan through the booleans; if there are any falses, the lowest one indicates the lowest missing value.
5. If there were no falses, increment k and go to step 2.
This will run in O(n log s) time and O(s) space, where n is the size of the input array and s is the smallest value not present. It can be made more efficient by not rechecking the lowest values on successive iterations, but that doesn't change the constraints.
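A rough Python rendering of those steps (my own sketch; since the question deals with positive numbers, I treat position 0 as already present):

def smallest_missing_by_doubling(arr):
    k = 1
    while True:
        limit = 2 ** k
        seen = [False] * limit          # step 2
        seen[0] = True                  # 0 is not a candidate answer here
        for v in arr:                   # step 3
            if 0 <= v < limit:
                seen[v] = True
        for i, flag in enumerate(seen): # step 4
            if not flag:
                return i
        k += 1                          # step 5: double the array and retry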

Tricky programming problem that I'm having trouble getting my head around

First off, let me say that this is not homework (I am an A-Level student, and this is nothing close to what we problem-solve - this is way harder), but more of a problem I'm trying to suss out to improve my programming logic.
I thought of a scenario where there is an array of random integers, let's say 10 integers for example. The user will input a number he wants to reach, and the algorithm will try to work out which numbers are needed to make that sum. For example, if I wanted to make the sum 44 from this array of integers:
myIntegers = array(1, 5, 9, 3, 7, 12, 36, 22, 19, 63);
The output would be:
36 + 3 + 5 = 44
Or something along those lines. I hope I make myself clear. As an added bonus, I would like the algorithm to pick as few numbers as possible to make the required sum, or to report an error if the sum cannot be made with the numbers supplied.
I thought about using recursion and iterating through the array, adding numbers over and over until the sum is met or exceeded. But what I can't get my head around is what to do if the algorithm goes past the sum and needs to be selective about which numbers to pick from the array.
I'm not looking for complete code, or a complete algorithm, I just want your opinions on how I should proceed with this and perhaps share a few tips or something. I'll probably start work on this tonight. :P
As I said, not homework. Just me wanting to do something a bit more advanced.
Thanks for any help you're able to offer. :)
You are looking at the Knapsack Problem
The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most useful items.
Edit: Your special case is the Subset Sum Problem
Will subset sum do? ;]
This is the classic Knapsack problem that you would see in college level algorithms course (or at least I saw it then). Best to work this out on paper and the solution in code should be relatively easy to work out.
EDIT: One thing to consider is dynamic programming.
Your Problem is related to the subset sum problem.
You have to try all possible combinations in the worst case.
No shortcuts here, I'm afraid. In addition to what other people have said about which specific problem this is, here's some practical advice to offer you a starting point:
I would sort the array and, given the input sum m, find the first number in the array less than m; call it n (this is your first candidate for the sum), and start from the highest possible complement (m - n), working your way down.
If you don't find a precise match, pick the highest available number, call it o (that is now your 2nd number), and look for the 3rd one starting from (m - n - o), working your way down again.
If you still don't find a precise match, start again with the next candidate for n (the element just below the original n in the sorted order) and do the same. You can keep doing this until you find a precise match for two numbers. If no two-number match for the sum is found, start the process again, but expand it to include a 3rd number, and so on.
That could be done recursively. At least this approach ensures that when you find a match, it will be the one with the fewest possible numbers forming the total input sum.
Potentially, though, in the worst case you end up going through the whole lot.
Edit: As Venr correctly points out, my first approach was incorrect. Edited approach to reflect this.
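A much-simplified Python sketch of the "try two numbers first, then three, and so on" idea (illustrative only, and not the exact search order described above); because subset sizes are tried in increasing order, the first match automatically uses the fewest numbers:

from itertools import combinations

def fewest_numbers_summing_to(numbers, target):
    # Return the smallest combination of numbers (no reuse) that sums to target, or None.
    for size in range(1, len(numbers) + 1):     # try 1 number, then 2, then 3, ...
        for combo in combinations(sorted(numbers, reverse=True), size):
            if sum(combo) == target:
                return combo
    return None

print(fewest_numbers_summing_to([1, 5, 9, 3, 7, 12, 36, 22, 19, 63], 44))   # (36, 7, 1)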
There is a very efficient randomized algorithm for this problem. I know you already accepted an answer, but I'm happy to share anyway, I just hope people will still check this question :).
Let Used = list of numbers that you sum.
Let Unused = list of numbers that you DON'T sum.
Let tmpsum = 0.
Let S = desired sum you want to reach.
for ( each number x you read )
    toss a coin:
    if it's heads and tmpsum < S
        add x to Used and add x to tmpsum
    else
        add x to Unused
while ( tmpsum != S )
    if tmpsum < S
        MOVE one random number from Unused to Used, adding it to tmpsum
    else
        MOVE one random number from Used to Unused, subtracting it from tmpsum
print the Used list, containing the numbers you need to add to get S
This will be much faster than the dynamic programming solution, especially for random inputs. The only problems are that you cannot reliably detect when there is no solution (you could let the algorithm run for a few seconds and if it doesn't finish, assume there is no solution) and that you cannot be sure you will get the solution with minimum number of elements chosen. Again, you could add some logic to make the algorithm keep going and trying to find a solution with less elements until certain stop conditions are met, but this will make it slower. However, if you are only interested in a solution that works and you have a LOT of numbers and the desired sum can be VERY big, this is probably better than the DP algorithm.
Another advantage of this approach is that it will also work for negative and rational numbers with no modifications, which is not true for the DP solution, because the DP solution involves using partial sums as array indexes, and indexes can only be natural numbers. You can of course use hashtables for example, but that will make the DP solution even slower.
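A hedged Python sketch of that randomized procedure (my own rendering; the iteration cap is added because, as noted above, the loop cannot tell on its own that no solution exists):

import random

def random_subset_summing_to(numbers, target, max_steps=100000):
    # Coin-toss an initial split, then randomly repair it until the Used side sums to target.
    used, unused, tmpsum = [], [], 0
    for x in numbers:
        if random.random() < 0.5 and tmpsum < target:   # "toss a coin"
            used.append(x)
            tmpsum += x
        else:
            unused.append(x)
    for _ in range(max_steps):
        if tmpsum == target:
            return used
        if tmpsum < target and unused:
            x = unused.pop(random.randrange(len(unused)))
            used.append(x)
            tmpsum += x
        elif used:
            x = used.pop(random.randrange(len(used)))
            unused.append(x)
            tmpsum -= x
    return None     # gave up; there may simply be no solution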
I don't know exactly what this task is called, but it seems to be a kind of knapsack problem: http://en.wikipedia.org/wiki/Knapsack_problem.
Heh, I'll play the "incomplete specification" card (nobody said that numbers couldn't appear more than once!) and reduce this to the "making change" problem: sort your numbers in decreasing order, find the first one less than your desired sum, then subtract it from that sum (division and remainders could speed this up). Repeat until the sum is 0 or no number less than the sum is found.
For completeness, you would need to keep track of the number of addends in each sum, and of course generate the additional sequences by keeping track of the first number you use, skipping it, and repeating the process with the remaining numbers. This would solve the (7 + 2 + 1) versus (6 + 4) problem.
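A small Python sketch of that greedy "making change" idea with repeats allowed (illustrative; like any greedy change-maker it can miss solutions that an exhaustive search would find):

def greedy_change(numbers, target):
    # Repeatedly take as many copies as fit of the largest number not exceeding the remaining sum.
    picked, remaining = [], target
    for n in sorted(numbers, reverse=True):
        count, remaining = divmod(remaining, n)     # division and remainder, as suggested
        picked.extend([n] * count)
        if remaining == 0:
            break
    return picked if remaining == 0 else None

print(greedy_change([1, 5, 9, 3, 7, 12, 36, 22, 19, 63], 44))   # [36, 7, 1]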
Repeating the answer of others: it is a Subset Sum problem.
It can be solved efficiently by the dynamic programming technique.
The following has not been mentioned yet: the problem is pseudo-polynomial (or NP-Complete in the weak sense).
The existence of an algorithm (based on dynamic programming) that is polynomial in S (the sum) and n (the number of elements) proves this claim.
Regards.
Ok, I wrote a C++ program to solve the above problem. The algorithm is simple :-)
First of all, arrange whatever array you have in descending order (I have hard-coded the array in descending form, but you may apply any of the sorting algorithms).
Next, I took three stacks: n, pos and sum. The first one stores the number for which a possible sum combination is to be found, the second holds the index of the array from which to start the search, and the third stores the elements whose addition will give you the number you entered.
The function looks for the largest number in the array that is smaller than or equal to the number being searched for. If it is equal, it simply pushes that number onto the sum stack. If not, it pushes the encountered array element onto the sum stack (temporarily), computes the difference between the number to search for and the number encountered, and then recurses on that difference.
Let me show an example:-
to find 44 in {63,36,22,19,12,9,7,5,3,1}
first 36 will be pushed in sum(largest number less than 44)
44-36=8 will be pushed in n(next number to search for)
7 will be pushed in sum
8-7=1 will be pushed in n
1 will be pushed in sum
thus 44=36+7+1 :-)
#include <iostream>
using namespace std;

int found = 0;

// n: stack of targets still to be matched, pos: stack of array indices to resume from,
// sum: stack of elements chosen so far, arr: the (descending) input array
void func(int n[], int pos[], int sum[], int arr[], int &topN, int &topP, int &topS)
{
    int i = pos[topP], temp;
    while (i <= 9)
    {
        if (arr[i] <= n[topN])          // largest element not exceeding the current target
        {
            pos[topP] = i;
            topS++;
            sum[topS] = arr[i];
            temp = n[topN] - arr[i];
            if (temp == 0)              // exact match: we are done
            {
                found = 1;
                break;
            }
            topN++;                     // push the remaining difference as the new target
            n[topN] = temp;
            temp = pos[topP] + 1;
            topP++;
            pos[topP] = temp;
            break;
        }
        i++;
    }
    if (i == 10)                        // exhausted the array: backtrack
    {
        topP = topP - 1;
        topN = topN - 1;
        topS = topS - 1;
        if (topP != -1)
        {
            pos[topP] += 1;             // resume the search one position further on
            func(n, pos, sum, arr, topN, topP, topS);
        }
    }
    else if (found != 1)
        func(n, pos, sum, arr, topN, topP, topS);
}

int main()
{
    int x, n[100], pos[100], sum[100],
        arr[10] = {63, 36, 22, 19, 12, 9, 7, 5, 3, 1},
        topN = -1, topP = -1, topS = -1;
    cout << "Enter a number: ";
    cin >> x;
    topN = topN + 1;
    n[topN] = x;
    topP = topP + 1;
    pos[topP] = 0;
    func(n, pos, sum, arr, topN, topP, topS);
    if (found == 0)
        cout << "Not found any combination";
    else
    {
        cout << "\n" << sum[0];
        for (int i = 1; i <= topS; i++)
            cout << " + " << sum[i];
    }
    return 0;
}
You can copy the code and paste it in your IDE, works fine :-)

Finding a single number in a list [duplicate]

This question already has answers here:
How to find the only number in an array that doesn't occur twice [duplicate]
(5 answers)
Closed 7 years ago.
What would be the best algorithm for finding a number that occurs only once in a list in which all the other numbers occur exactly twice?
So, in the list of integers (let's take it as an array) each integer repeats exactly twice, except one. What is the best algorithm to find that one?
The fastest (O(n)) and most memory efficient (O(1)) way is with the XOR operation.
In C:
int arr[] = {3, 2, 5, 2, 1, 5, 3};
int num = 0, i;
for (i=0; i < 7; i++)
num ^= arr[i];
printf("%i\n", num);
This prints "1", which is the only one that occurs once.
This works because the first time you hit a number it marks the num variable with itself, and the second time it unmarks num with itself (more or less). The only one that remains unmarked is your non-duplicate.
By the way, you can expand on this idea to very quickly find two unique numbers among a list of duplicates.
Let's call the unique numbers a and b. First take the XOR of everything, as Kyle suggested. What we get is a^b. We know a^b != 0, since a != b. Choose any 1 bit of a^b, and use that as a mask -- in more detail: choose x as a power of 2 so that x & (a^b) is nonzero.
Now split the list into two sublists -- one sublist contains all numbers y with y&x == 0, and the rest go in the other sublist. By the way we chose x, we know that a and b are in different buckets. We also know that each pair of duplicates is still in the same bucket. So we can now apply ye olde "XOR-em-all" trick to each bucket independently, and discover what a and b are completely.
Bam.
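A Python sketch of that two-unique-numbers extension (illustrative code, following the bucket-splitting idea just described):

def two_unique_numbers(nums):
    # Every value appears exactly twice except two values a and b; return them.
    xor_all = 0
    for v in nums:
        xor_all ^= v                # paired values cancel, leaving a ^ b
    mask = xor_all & -xor_all       # pick one set bit of a ^ b
    a = b = 0
    for v in nums:                  # split by that bit and XOR each bucket separately
        if v & mask:
            a ^= v
        else:
            b ^= v
    return a, b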
O(N) time, O(N) memory
HT = a hash table (a set is enough):

def find_single(items):
    HT = set()
    for item in items:              # go over the list in order
        if item in HT:
            HT.remove(item)         # second time we see it: remove it
        else:
            HT.add(item)            # first time we see it: add it
    return HT                       # at the end, HT holds the item you are looking for
Note (credit @Jared Updike): this approach will find all items that occur an odd number of times.
Comment: I don't see how people can vote up solutions that give you N log N performance. In which universe is that "better"?
I am even more shocked you marked an N log N solution as the accepted answer...
I do agree, however, that if memory is required to be constant, then N log N would be (so far) the best solution.
Kyle's solution would obviously not catch situations where the data set does not follow the rules. If all numbers were in pairs, the algorithm would give a result of zero, exactly the same value as if zero were the only value with a single occurrence.
If there were multiple single-occurrence values or triples, the result would be erroneous as well.
Testing the data set might well end up with a more costly algorithm, either in memory or time.
Csmba's solution does flag some erroneous data (none, or more than one, single-occurrence value), but not other cases (quadruples). Regarding his solution, depending on the implementation of HT, either memory and/or time is more than O(n).
If we cannot be sure about the correctness of the input set, sorting and counting, or using a hashtable that counts occurrences with the integer itself as the hash key, would both be feasible.
I would say that using a sorting algorithm and then going through the sorted list to find the number is a good way to do it.
And now the problem is finding "the best" sorting algorithm. There are a lot of sorting algorithms, each of them with its strong and weak points, so this is quite a complicated question. The Wikipedia entry seems like a nice source of info on that.
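For illustration, a brief Python sketch of the sort-then-scan idea: after sorting, duplicates sit next to each other, so the single number is the first one that doesn't match its neighbour.

def find_single_by_sorting(nums):
    # O(n log n): sort, then walk the pairs; the first mismatch is the single number
    s = sorted(nums)
    for i in range(0, len(s) - 1, 2):
        if s[i] != s[i + 1]:
            return s[i]
    return s[-1]        # otherwise the single number is the last element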
Implementation in Ruby:
a = [1,2,3,4,123,1,2,.........]
t = a.length - 1
for i in 0..t
  s = a.index(a[i]) + 1
  b = a[s..t]
  w = b.include?(a[i])
  if w == false
    puts a[i]
  end
end
You need to specify what you mean by "best" - to some, speed is all that matters and would qualify an answer as "best" - for others, they might forgive a few hundred milliseconds if the solution was more readable.
"Best" is subjective unless you are more specific.
That said:
Iterate through the numbers; for each number, search the list for it, and when you reach the number whose search returns only one result, you are done.
It seems like the best you could do is to iterate through the list and, for every item, either add it to the list of "seen" items or remove it from the "seen" list if it is already there; at the end, your list of "seen" items will contain the singular element. This is O(n) with regard to time and n with regard to space (in the worst case; it will be much better if the list is sorted).
The fact that they're integers doesn't really factor in, since there's nothing special you can do with adding them up... is there?
Question
I don't understand why the selected answer is "best" by any standard. O(N*lgN) > O(N), and it changes the list (or else creates a copy of it, which is still more expensive in space and time). Am I missing something?
Depends on how large/small/diverse the numbers are though. A radix sort might be applicable which would reduce the sorting time of the O(N log N) solution by a large degree.
The sorting method and the XOR method have the same time complexity. The XOR method is only O(n) if you assume that the bitwise XOR of two bit strings is a constant-time operation. This is equivalent to saying that the size of the integers in the array is bounded by a constant. In that case you can use radix sort to sort the array in O(n).
If the numbers are not bounded, then bitwise XOR takes O(k) time, where k is the length of the bit string, and the XOR method takes O(nk). Now, again, radix sort will sort the array in O(nk) time.
You could simply put the elements of the list into a hash until you find a collision. In Ruby, this is a one-liner.
def find_dupe(array)
  h = {}
  array.detect { |e| h[e] || (h[e] = true; false) }
end
So, find_dupe([1,2,3,4,5,1]) would return 1.
This is actually a common "trick" interview question, though. It is normally about a list of consecutive integers with one duplicate. In that case the interviewer is often looking for you to use the Gaussian sum of the first n integers, i.e. n*(n+1)/2 subtracted from the actual sum. The textbook answer is something like this.
def find_dupe_for_consecutive_integers(array)
  n = array.size - 1   # subtract one from array.size because of the dupe
  array.sum - n*(n+1)/2
end

Resources