What numbers add up to k? - algorithm

I am not sure how to do this. Given a list of numbers and a number k, return all pairs of numbers from the list that add up to k, making only one pass through the list.
For example, given [10, 15, 3, 7] and k = 17, the program should return 10 + 7.
How do you find and return every pair while only going through the list once?

Use a set to keep track of what you've seen. Runtime: O(N), space: O(N).

def twoAddToK(nums, k):
    seen = set()
    N = len(nums)
    for i in range(N):
        if k - nums[i] in seen:   # the needed complement was seen earlier
            return True           # (return (k - nums[i], nums[i]) instead to get the pair)
        seen.add(nums[i])
    return False

As an alternative to Shawn's code, which uses a set, there is also the option of sorting the list in O(N log N) time (and possibly no extra space, if you are allowed to overwrite the original input), and then applying an O(N) algorithm to solve the problem on the sorted list.
While asymptotic complexity slightly favors hash sets, since O(N) beats O(N log N), I am ready to bet that sorting plus a single-pass lookup is considerably faster in practice.
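For illustration, here is a minimal sketch of that idea, assuming the O(N) pass over the sorted list is done with two pointers (the function name is mine):

def twoAddToKSorted(nums, k):
    nums = sorted(nums)          # O(N log N); sort in place if overwriting is allowed
    lo, hi = 0, len(nums) - 1
    while lo < hi:               # single O(N) pass
        s = nums[lo] + nums[hi]
        if s == k:
            return True          # nums[lo] + nums[hi] == k
        if s < k:
            lo += 1              # sum too small: advance the low pointer
        else:
            hi -= 1              # sum too big: retreat the high pointer
    return False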

Related

Time Complexity of Subset-Sum Enumeration

Normally, when dealing with combinations, the Big-O complexity seems to be O(n choose k). In this algorithm, I am generating all the combinations within the array that match the target sum:
def combos(candidates, start, target):
    if target == 0:
        return [[]]        # one way to hit the target exactly: the empty combination
    res = []
    for i in range(start, len(candidates)):
        # include candidates[i], then extend with combinations drawn from i+1 onward
        for c in combos(candidates, i + 1, target - candidates[i]):
            res.append([candidates[i]] + c)
    return res

print(combos([2, 1, 10, 5, 6, 4], 10))
# [[1, 5, 4], [10], [6, 4]]
I am having a hard time determining Big-O here, is this a O(n choose t) algorithm? If not, what is it and why?
If the point is to give the worst-case complexity in terms of the set size, n, then it is Θ(2^n). Given any set, if the target sum is large enough, you'll end up enumerating all the possible subsets of the set. This is Θ(2^n), as can be seen in two ways:
Each item can be chosen or not.
It is your Θ(n choose k), just summed over all k; the sum of (n choose k) over all k is 2^n.
A more refined bound would take into account both n and the target sum t. In this case, following the reasoning of the 2nd point above: if all elements (and the target sum) are positive integers, the complexity is the sum of Θ(n choose k) for k ranging only up to t, since a subset of more than t positive integers must sum to more than t.
Your algorithm is at least O(2^n), and I believe it is O(n * 2^n). Here is an explanation.
In your algorithm you have to generate all possible combinations of a set (except the empty set), so it is
at least O(2^n). Now, for every combination you have to sum it up. Some combinations have length 1 and one has length n, but the majority have length around n/2, so I believe your complexity is close to O(n * 2^n).

Efficient algorithm to determine if two sets of numbers are disjoint

Practicing for software developer interviews and got stuck on an algorithm question.
Given two sets of unsorted integers with array of length m and other of
length n and where m < n find an efficient algorithm to determine if
the sets are disjoint. I've found solutions in O(nm) time, but haven't
found any that are more efficient than this, such as in O(n log m) time.
Using a data structure that has O(1) lookup/insertion, you can easily insert all elements of the first set.
Then, for each element in the second set, check whether it is present: if any is, the sets are not disjoint; otherwise they are disjoint.
Pseudocode
function isDisjoint(list1, list2)
    map = new HashMap()
    foreach (x in list1)
        map.put(x, true)
    foreach (y in list2)
        if (map.hasKey(y))
            return false
    return true
This will give you an O(n + m) solution
Fairly obvious approach - sort the array of length m - O(m log m).
For every element in the array of length n, use binary search to check if it exists in the array of length m - O(log m) per element = O(n log m). Since m<n, this adds up to O(n log m).
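A short Python sketch of this approach, using the standard bisect module (the function name is mine):

from bisect import bisect_left

def isDisjoint(small, large):
    small = sorted(small)                  # sort the length-m array: O(m log m)
    for y in large:                        # n binary searches: O(n log m)
        i = bisect_left(small, y)
        if i < len(small) and small[i] == y:
            return False                   # common element found
    return True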
Here's a link to a post that I think answers your question.
3) Sort smaller: O((m + n) log m)
Say, m < n, sort A
Binary search for each element of B into A
Disadvantage: Modifies the input
Looks like Cheruvian beat me to it, but you can use a hash table to get O(n + m) in the average case:
*Insert all elements of the first set (size m) into the table, taking (probably) constant time for each, assuming there aren't many collisions. This step is O(m).
*For each element of the second set (size n), check whether it is in the table. If it is, return false; otherwise, move on to the next. This takes O(n).
*If none are in the table, return true.
As I said before, this works because a hash table gives constant lookup time in the average case. In the rare event that many unique elements of the first set share a hash, it will take slightly longer. However, most people don't need to care about hypothetical worst cases; for example, quicksort is used more than merge sort because it gives better average performance, despite its O(n^2) upper bound.
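For reference, the same hash-table idea in Python; note that the built-in set.isdisjoint does essentially this probing for you:

def isDisjointHash(a, b):
    seen = set(a)                            # insert the m elements: O(m) expected
    return all(y not in seen for y in b)     # probe the n elements: O(n) expected

# Equivalent using the standard library:
# set(a).isdisjoint(b)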

Find pairs with given difference

Given n, k, and a list of n integers, how would you find the pairs of integers whose difference is k?
There is an O(n log n) solution, but I cannot figure it out.
You can do it like this:
Sort the array
For each item data[i], determine its two possible partners, data[i] + k and data[i] - k.
Run a binary search on the sorted array for these two targets; if found, add both data[i] and data[targetPos] to the output.
Sorting is done in O(n log n). Each of the n search steps takes 2 * log n time to look for the targets, for an overall time of O(n log n).
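A sketch of this answer in Python with the standard bisect module; for simplicity it searches only data[i] + k, which, as the next answer points out, is sufficient (it also assumes k > 0, and duplicate values will yield duplicate pairs):

from bisect import bisect_left

def pairsWithDiff(data, k):
    data = sorted(data)                      # O(n log n)
    out = []
    for x in data:
        i = bisect_left(data, x + k)         # O(log n) search for the partner
        if i < len(data) and data[i] == x + k:
            out.append((x, x + k))
    return out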
There is a linear solution for this problem! Just ask yourself one question: if a is in the array, what other number should also be in the array? Of course, a + k or a - k (a special case: k = 0 requires an alternative solution). So, what now?
Create a hash set (for example unordered_set in C++11) with all values from the array. That is O(1) average per element, so O(n) in total.
Iterate through the array and check for each element x whether x + k or x - k is present. Each check in the set is O(1), and each element is checked once, so this pass is linear, O(n).
If you find an x with a partner (x + k or x - k), that is what you are looking for.
So it's linear, O(n). If you really want O(n log n), use a tree-based set instead, with O(log n) lookups, and you have an O(n log n) algorithm.
Addendum: there is no need to check both x + k and x - k; checking x + k alone is sufficient, because if a and b are a valid pair, then:
if a < b then
    a + k == b
else
    b + k == a
Improvement: if you know the value range, you can guarantee linear worst-case complexity by using a boolean table (set_tab[i] == true when i is in the array).
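A minimal sketch of this linear version in Python, assuming k > 0 and reporting each pair of distinct values once:

def pairsWithDiffLinear(data, k):
    values = set(data)             # hash set of all values: O(n) expected
    # per the addendum above, checking x + k alone suffices; each lookup is O(1) expected
    return [(x, x + k) for x in values if x + k in values]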
A solution similar to the one above:
1. Sort the array.
2. Set variables i = 0, j = 1.
3. Check the difference between array[i] and array[j].
4. If the difference is too small, increase j.
5. If the difference is too big, increase i.
6. If the difference is the one you're looking for, add the pair to the results and increase j.
7. Repeat steps 3-6 until the end of the array.
Sorting is O(n log n); the scan is, if I'm correct, O(n) (at most 2n comparisons), so the whole algorithm is O(n log n).
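The same scan as a Python sketch, assuming k > 0 (the function name is mine):

def pairsWithDiffTwoPointer(data, k):
    data = sorted(data)              # O(n log n)
    res = []
    i, j = 0, 1
    while j < len(data):             # at most 2n pointer advances in total
        d = data[j] - data[i]
        if d == k:
            res.append((data[i], data[j]))
            j += 1
        elif d < k:
            j += 1                   # difference too small: advance j
        else:
            i += 1                   # difference too big: advance i
            if i == j:
                j += 1
    return res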

Finding the minimum unique number in an array

The minimum unique number in an array is defined as
min{v|v occurs only once in the array}
For example, the minimum unique number of {1, 4, 1, 2, 3} is 2.
Is there any way better than sorting?
I believe this is an O(N) solution in both time and space:
HashSet seenOnce;     // sufficiently large that access is O(1)
HashSet seenMultiple; // sufficiently large that access is O(1)
for each item in input            // O(N)
    if item in seenMultiple
        next
    if item in seenOnce
        remove item from seenOnce
        add item to seenMultiple
    else
        add item to seenOnce
smallest = SENTINEL
for each item in seenOnce         // worst case, O(N)
    if item < smallest
        smallest = item
If you have a limited range of integral values, you can replace the HashSets with BitArrays indexed by the value.
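For instance, a sketch of that variant, assuming the values are integers in a known range [0, max_value] (plain Python lists stand in for the bit arrays; the function name is mine):

def minUnique(nums, max_value):
    seen_once = [False] * (max_value + 1)      # indexed by value
    seen_multiple = [False] * (max_value + 1)
    for x in nums:
        if seen_once[x]:                       # second sighting
            seen_once[x] = False
            seen_multiple[x] = True
        elif not seen_multiple[x]:             # first sighting
            seen_once[x] = True
    for v in range(max_value + 1):             # smallest value first; O(range), not O(N)
        if seen_once[v]:
            return v
    return None                                # no unique value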
You don't need to do a full sort. Run the bubble sort inner loop only until a distinct minimum value settles at one end. In the best case this has time complexity O(k * n), where k is the number of non-distinct minimum values encountered; the worst case is O(n * n). So this can be efficient when the expected k << n.
I think this would be the minimum possible time complexity, unless you can adapt an O(n log n) sorting algorithm to the task.
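A rough Python sketch of that idea, with a plain minimum pass standing in for the bubble sort inner loop (the function name is mine):

def minUniquePartial(nums):
    vals = list(nums)
    while vals:
        m = min(vals)                          # one O(n) pass, like a bubble inner loop
        if vals.count(m) == 1:
            return m                           # the minimum is distinct: done
        vals = [x for x in vals if x != m]     # drop the duplicated minimum, retry
    return None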
Python version using a dictionary.
Time complexity O(n) and space complexity O(n):

from collections import defaultdict

d = defaultdict(int)
for _ in range(int(input())):   # first line of input: how many numbers follow
    ele = int(input())
    d[ele] += 1                 # count occurrences of each value

# smallest value occurring exactly once, or -1 if there is none
m = None
for i in d:
    if d[i] == 1 and (m is None or i < m):
        m = i
print(m if m is not None else -1)
Please tell me if there is a better approach.

Can this algorithm be optimised?

I have a list of objects and I want to iterate through them in a specific sequence for a particular number, as returned by the following function. It removes one number at a time, picking the index as the hash modulo the current list size, and collects the removed numbers into a sequence.
def genSeq(hash, n):
    l = list(range(n))        # list() is needed in Python 3, where range is lazy
    seq = []
    while l:
        ind = hash % len(l)   # index determined by the hash and the current size
        seq.append(l[ind])
        del l[ind]            # O(len(l)) deletion: this is the quadratic part
    return seq
Eg: genSeq(53,5) will return [3, 1, 4, 2, 0]
I am presenting the algorithm in Python for easy understanding; I am supposed to code it in C++. In this form the complexity is O(n^2) for both vector and list (we pay either for the removal or for the access). Can this be made any better?
A skip list would give you O(log n) access and removal. So your total traversal time would be O(n log n).
I'd like to think there is a linear solution, but nothing jumps out at me.
The sequence
[hash % (i + 1) for i in range(len(l))]
can be seen as a number in factoradic.
There is a bijection between permutations and factoradic numbers.
The mapping from factoradic numbers to permutations is described in the top answer there, under the section "Permuting a list using an index sequence". Unfortunately the algorithm provided is quadratic, but a commenter points to a data structure that would make the algorithm O(n log n).
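The comment doesn't name the data structure, but one hedged way to get O(n log n) is a Fenwick (binary indexed) tree holding a 1 for every value still present, so the ind-th remaining value can be found and removed in O(log n) each time:

def genSeqFast(hash, n):
    size = 1
    while size < n:
        size *= 2
    tree = [0] * (size + 1)            # Fenwick tree over positions 1..n

    def add(p, delta):                 # point update, 1-indexed
        while p <= size:
            tree[p] += delta
            p += p & -p

    def kth(k):                        # smallest p with prefix_sum(p) >= k
        pos, step = 0, size
        while step:
            nxt = pos + step
            if nxt <= size and tree[nxt] < k:
                pos = nxt
                k -= tree[nxt]
            step //= 2
        return pos + 1

    for v in range(1, n + 1):
        add(v, 1)                      # every value 0..n-1 starts present

    seq = []
    for remaining in range(n, 0, -1):
        ind = hash % remaining         # same index rule as genSeq
        p = kth(ind + 1)               # locate the (ind+1)-th remaining value
        seq.append(p - 1)
        add(p, -1)                     # remove it
    return seq

# genSeqFast(53, 5) == genSeq(53, 5) == [3, 1, 4, 2, 0]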
You should not delete from your vector; instead, swap the chosen element with the last one and decrease a fill pointer. Think of the Fisher-Yates shuffle.
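A sketch of that swap-with-end idea in Python. Be aware that the swaps reorder the surviving elements, so this produces a different (though still hash-determined) permutation than genSeq:

def genSeqSwap(hash, n):
    l = list(range(n))
    seq = []
    size = n
    while size:
        ind = hash % size
        seq.append(l[ind])
        size -= 1
        l[ind] = l[size]       # overwrite the hole with the last live element
    return seq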
