Finding pair with sum between 1 and 2 - algorithm

Given n positive real numbers, the task is to provide a YES or NO answer to the following question:
"Does there exist a pair of numbers x and y such that 1 <= x+y <= 2?"
The obvious solution is to sort all the numbers, which takes O(n log n); after that, the existence of a pair can be checked in O(n) time.
However, the question is expected to be solved in constant space and linear time. Any insights?

Only numbers between 0 and 2 are useful for participating in a winning pair. The others can be ignored.
Each such number x makes it possible to create a pair with one additional number from the list between 1-x and 2-x.
Compute and maintain the acceptable bounds as you progress through the list. There cannot be more than two intervals of acceptable next values at any given time, because all the intervals of acceptable next values are included between -1 and 2 and have width 1. Therefore the acceptable next values to complete a pair can be represented in constant space.
Answer YES as soon as a number appears from the list that is in one of the at most two intervals of acceptable next values. Answer NO if you get to the end of the list without encountering the situation.
Example: 0.1, 0.5, 2.0 …
When starting, the set of values that can appear and complete a pair is the empty set.
After reading 0.1, the set of values that would complete a pair if they appeared now is [0.9, 1.9].
0.5 does not belong to the set of values that can complete a pair. However, after reading it, values in [0.5, 1.5] can complete a pair. Since we already had the set [0.9, 1.9], the new set of values that can complete a pair is [0.5, 1.9].
2.0 does not belong to the set of values that can complete a pair. However, we can now read any value in [-1, 0] to complete a pair. The new set of values that can be read to complete a pair henceforth is [-1, 0] ∪ [0.5, 1.9].
and so on…
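A minimal Python sketch of the interval-tracking idea above. The function name `has_pair_streaming` is made up for illustration, and the sketch ignores floating-point rounding at the interval endpoints. The list `intervals` never holds more than two entries, which is what keeps the space constant.

```python
def has_pair_streaming(nums):
    # Disjoint closed intervals of values that would complete a pair
    # with some number already read (at most two intervals at any time).
    intervals = []
    for x in nums:
        if any(lo <= x <= hi for lo, hi in intervals):
            return True
        if 0 < x < 2:  # only these values can take part in a valid pair
            lo, hi = 1 - x, 2 - x
            merged = []
            for a, b in intervals:
                if b < lo or a > hi:
                    merged.append((a, b))  # no overlap, keep as is
                else:
                    lo, hi = min(lo, a), max(hi, b)  # absorb overlapping interval
            merged.append((lo, hi))
            intervals = merged
    return False
```

For the example list 0.1, 0.5, 2.0 the function returns False; appending 1.0 makes it return True, because 1.0 falls inside [0.5, 1.9].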

I like Pascal Cuoq's algorithm for this problem, which I think is a nice and elegant solution. I wanted to post a different approach here that gives a slightly different perspective on the solution.
First, here's the algorithm:
Make one pass over the input and keep track of the following: the smallest number between 1 and 2, the smallest number less than 1, the largest number less than 1/2, and the number of numbers between 1/2 and 1 (inclusive).
If the sum of the smallest number between 1 and 2 and the smallest number less than 1 is at most 2, output YES.
Otherwise, if there are at least two numbers between 1/2 and 1, output YES.
Otherwise, if there are no numbers between 1/2 and 1, output NO.
Otherwise, if the sum of the largest number less than 1/2 and the unique number in the array between 1/2 and 1 is at least 1, output YES.
Otherwise, output NO.
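The steps above can be sketched in Python as follows. The function and variable names are invented for illustration, the comparisons are inclusive to match the condition 1 <= x+y <= 2, and floating-point rounding is ignored.

```python
import math

def has_pair_case_analysis(nums):
    min_big = math.inf     # smallest value in (1, 2)
    min_small = math.inf   # smallest value in (0, 1]
    max_low = -math.inf    # largest value in (0, 1/2)
    mid_count = 0          # how many values lie in [1/2, 1]
    mid_value = None       # a value from [1/2, 1], if any
    for x in nums:
        if 1 < x < 2:
            min_big = min(min_big, x)
        elif 0 < x <= 1:
            min_small = min(min_small, x)
            if x >= 0.5:
                mid_count += 1
                mid_value = x
            else:
                max_low = max(max_low, x)
    if min_big + min_small <= 2:   # one value from (1, 2), one from (0, 1]
        return True
    if mid_count >= 2:             # two values from [1/2, 1]
        return True
    if mid_count == 0:
        return False
    return max_low + mid_value >= 1
```

Using infinities as sentinels means a missing value makes the corresponding test fail without any special casing.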
Here's a proof of why this works. As Pascal noted, we only care about numbers in the range [0, 2); anything outside this range can't be part of something that sums up to between 1 and 2. We can do a case analysis to think about what the possible numbers in the sum could be.
First, it's possible that one of the numbers is in the range (1, 2). We can't have two numbers in this range, so the other number must be in the range [0, 1]. In that case, we can take the smallest number in the range [0, 1] and see what happens when we add it to the smallest number in the range (1, 2): if their sum is between 1 and 2, we're done; otherwise, no number in the range (1, 2) can be part of a valid pair.
Otherwise, the summation must purely consist of numbers from the range [0, 1]. Notice that at least one of the numbers must be in the range [1/2, 1], since otherwise their sum can't be at least 1. Also note that the sum of any two numbers in this range will never exceed 2. In this case, if there are two numbers in the range [1/2, 1], their sum satisfies the condition and we're done. If there are 0 numbers in the range [1/2, 1], there is no solution. Otherwise, we can try adding the largest number in the range [0, 1/2) to the one number in the range [1/2, 1] and see if the sum is at least 1. If the answer is yes, we've got the pair; if not, the answer is no.
I definitely like Pascal's algorithm more than this one, but I thought I'd post this to show how a case analysis can be used here.
Hope this helps!

Related

Summing a given series of numbers in order to reset the summation as many times as possible algorithm

I'm looking for an efficient algorithm (not necessarily a code) for solving the following question:
Given n positive and negative numbers that sum up to zero, we would like to find a starting index that will cause the cumulative sum to hit zero as many times as possible.
It doesn't have to be done in a specific manner, but what matters here is efficiency: we want the algorithm/idea to be able to do this in less than quadratic time complexity.
An example:
Given the numbers: 2, -1, 3, 1, -3, -2:
If we start summing up with 2 (first index), the sum will be zero only once (at the end of the summation), but starting with -1 will yield zero twice during the summation.
The given numbers may have more than one "best index", but we would like to find at least one of these indexes.
I've tried doing it with binary search, but didn't make much progress- so any hints/help will be appreciated.
You can compute prefix sums. In terms of prefix sums, zeros are positions that have the same value of a prefix sum as the start position. So the problem is reduced to finding the most frequent element in the array of prefix sums. It can be solved efficiently using sorting or hash tables.
Here is an example:
Input: {2, -1, 3, 1, -3, -2}
Prefix sums: {0, 2, 1, 4, 5, 2, 0}
The most frequent element is 2 (the final 0 is the same cyclic position as the initial 0, so it is counted only once). The first occurrence of 2 is at position 1 (counting from 0). Thus, starting from the second element yields an optimal answer.
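A short Python sketch of the prefix-sum reduction, using a hash table (`collections.Counter`). The function name `best_start` is hypothetical, and the sketch assumes the input sums to zero, as the question states. The last prefix sum is left out because it describes the same cyclic position as the first one.

```python
from collections import Counter
from itertools import accumulate

def best_start(nums):
    # prefix[i] is the sum of nums[:i]; the full-array sum (zero) is omitted
    # because it is the same cyclic position as prefix[0].
    prefix = [0] + list(accumulate(nums[:-1]))
    counts = Counter(prefix)
    best_value = max(counts, key=counts.get)
    # Return a 0-based start index and how often the running sum hits zero.
    return prefix.index(best_value), counts[best_value]
```

For the numbers 2, -1, 3, 1, -3, -2 this returns `(1, 2)`: start at index 1 (the value -1) and the running sum reaches zero twice.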

Is this possible? Last few digits of sum equal to another number

I have a n-digit number and a list of numbers, from which any number can be used any number of times.
Taking numbers from the list, how do I know whether it is possible to generate a sum such that the last n digits of the sum are the n-digit number?
Note: The sum has some initial value; it's not zero.
EDIT - If a solution exists, I need to find the minimum count of numbers that must be added so that the last n digits of the sum are the given number. That can be easily solved with DP (minimum coin change problem).
For example, if n=4,
Given number = 1212
Initial value = 5234
List = [1023, 101, 1]
A solution exists: 21212 = 5234 + 1023*15 + 101*6 + 1*27
It's easy to find a counterexample (see comments).
Now, for the solution here's a dynamic programming approach:
All arithmetic is modulo 10^n. For each value in the range 0 to 10^n - 1 you need a flag saying whether it has been found, and you need a queue for the elements still to be processed.
1. Push the initial value to the to-be-processed queue.
2. Get an element from the to-be-processed queue. If the queue is empty, you've finished: there is no solution.
3. Try to add each number from the list separately to this element. If the result was already found, there is nothing to do. If the result is the target, you've finished: there's a solution. If not, mark it as found and push it to the queue.
4. Go to 2.
An actual solution can be reconstructed if you store how you reached a number. You just have to walk back from sum till you hit the initial value.
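As a rough Python sketch of this breadth-first search: the function name `min_additions` and its parameters are made up here, and a dictionary of distances stands in for the "found" flags. Because the queue is processed in breadth-first order, the returned count is the minimum number of list entries added.

```python
from collections import deque

def min_additions(target, initial, numbers, n_digits):
    mod = 10 ** n_digits
    start, goal = initial % mod, target % mod
    dist = {start: 0}          # residue -> fewest additions needed to reach it
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return dist[cur]
        for k in numbers:
            nxt = (cur + k) % mod
            if nxt not in dist:   # not found before: mark it and enqueue it
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None                   # queue exhausted: no solution
```

Storing the predecessor of each residue next to its distance would allow the actual list of additions to be walked back from the target.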
If the greatest common factor of the numbers in the list is a unit modulo 10^n (that is, not divisible by 2 or 5) you can solve the problem for any choice of the other given values: use the extended Euclid's algorithm to find a linear combination of the list that sums to the gcf, find the multiplicative inverse of the gcf modulo 10^n and multiply by the difference between the given and the initial values.
If the gcf of the numbers in the list is divisible by 2 or 5 (that is, is not a unit) and the difference between the given and the initial value is also divisible by 2 or 5, divide the numbers in the list and the difference by the largest powers of 2 and 5 that divide them all. If the gcf you end up with is a unit there is a solution and you can find it with the procedure above. Otherwise there is no solution.
For example, take the given number 16, the initial value 5 for the sum, and the list of numbers [3].
The gcf of the numbers in the list is 3, which is a unit. Its inverse modulo 100 is 67 (3×67 = 201).
Multiplying by the difference between the given number and the initial value, 16-5 = 11, we get the factor 67×11 = 737 for 3. Since we're working modulo 100, that's the same as 37.
Checking the result: 5 + 37×3 = 116, which ends in 16. Yep, that works.
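The arithmetic of this example can be reproduced in a few lines of Python. This sketch covers only the single-number case where that number is a unit modulo 10^n; the helper name is invented, and it relies on the three-argument `pow` with exponent -1, available from Python 3.8.

```python
def multiplier_for_unit(step, initial, target, n_digits):
    mod = 10 ** n_digits
    # Modular inverse; raises ValueError if step is not a unit modulo 10^n.
    inverse = pow(step, -1, mod)
    return (inverse * (target - initial)) % mod

times = multiplier_for_unit(3, 5, 16, 2)
print(times, (5 + times * 3) % 100)
```

With the values from the example, `times` is 37 and the last two digits of 5 + 37×3 are 16.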

Algorithm to find min & max value of equation from an array

I have an array of 8 int, 4 positive and 4 negative.
X [10,-2,30,-4,5,-20,8,-9]
Now, let
Evaluated = a-b+c-d+e-f+g-h
where a,b..h are unique values taken from X.
I need to ensure that
Case 1. Evaluated is as close to zero as possible.
Case 2. List the 5 greatest possible values of Evaluated.
I can find the maximum value by sorting the array and assigning the maximum values to a, c, e and g, and the minimum values to b, d, f and h. But how do I find the next 4 values?
There are 8! ways of assigning the values, right?
What would be the best way of determining this solution?
Ensure that negative values get positive sign and positive values get negative sign. You will be getting smallest possible value. You don't even need to sort.
One simple way of doing this:
Loop over each element of X
    if X[i] > 0 Then X[i] = -1 * X[i]
End Loop
Add all elements of X (yes, don't think about subtracting, just add)
The resulting sum is the smallest possible value.
Just choose a, c, e, g as the four smallest and the rest as the largest values.
Function Small in Excel might help you.
If I understand the question correctly, you have an array of eight numbers. You want to choose four of the numbers to add, and four to subtract in order to get the smallest possible result. I would proceed as follows:
Sort the array from smallest to largest. This article describes two ways to do this. In your example, the sorted array would have [-20, -9, -4, -2, 5, 8, 10, 30].
Add the first four values in the array.
Subtract the last four values.
This will give you the smallest result possible by adding four values and subtracting the remaining values in the array of eight.
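A small Python sketch of the sort-then-split approach, under the assumption that exactly half of the values are added and half subtracted. The function names are hypothetical. Swapping the two halves gives the largest result.

```python
def smallest_evaluated(values):
    ordered = sorted(values)
    half = len(ordered) // 2
    # Add the smaller half, subtract the larger half.
    return sum(ordered[:half]) - sum(ordered[half:])

def largest_evaluated(values):
    # Mirror image: add the larger half, subtract the smaller half.
    return -smallest_evaluated(values)

X = [10, -2, 30, -4, 5, -20, 8, -9]
print(smallest_evaluated(X), largest_evaluated(X))
```

For the array in the question the two extremes are -88 and 88.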

Given an array of integers, find the LARGEST number using the digits of the array such that it is divisible by 3

E.g.: Array: 4,3,0,1,5 {Assume all digits are >=0. Also each element in array correspond to a digit. i.e. each element on the array is between 0 and 9. }
In the above array, the largest number is: 5430 {using digits 5, 4, 3 and 0 from the array}
My Approach:
For divisibility by 3, we need the sum of digits to be divisible by 3.
So,
Step-1: Remove all the zeroes from the array.
Step-2: These zeroes will come at the end. {Since they don't affect the sum and we have to find the largest number}
Step-3: Find the subset of the elements of array (excluding zeroes) such that the number of digits is MAXIMUM and also that the sum of digits is MAXIMUM and the sum is divisible by 3.
STEP-4: The required digit consists of the digits in the above found set in decreasing order.
So, the main step is STEP-3 i.e. How to find the subset such that it contains MAXIMUM possible number of elements such that their sum is MAX and is divisible by 3 .
I was thinking that maybe Step-3 could be done by a GREEDY CHOICE: take all the elements and keep removing the smallest element in the set until the sum is divisible by 3.
But I am not convinced that this GREEDY choice will work.
Please tell if my approach is correct.
If it is, then please suggest as to how to do Step-3 ?
Also, please suggest any other possible/efficient algorithm.
Observation: If you can get a number that is divisible by 3, you need to remove at most 2 numbers, to maintain optimal solution.
A simple O(n^2) solution will be to check all possibilities to remove 1 number, and if none is valid, check all pairs (There are O(n^2) of those).
EDIT:
O(n) solution: Create 3 buckets: bucket0, bucket1 and bucket2. Each holds the numbers with that value modulo 3. Ignore bucket0 in the algorithm below.
Let the sum of the array be sum.
If sum % 3 == 0: we are done.
else if sum % 3 == 1:
    if there is a number in bucket1: choose the minimal one (to remove)
    else: take the 2 minimals from bucket2
else if sum % 3 == 2:
    if there is a number in bucket2: choose the minimal one (to remove)
    else: take the 2 minimals from bucket1
Note: You don't actually need the buckets. To achieve O(1) space you need only the 2 minimal values from bucket1 and from bucket2, since those are the only numbers actually used from these buckets.
Example:
arr = { 3, 4, 0, 1, 5 }
bucket0 = {3,0} ; bucket1 = {4,1} ; bucket2 = { 5 }
sum = 13 ; sum % 3 = 1
bucket1 is not empty - choose the minimal from it (1), and remove it from the array.
result array = { 3, 4, 0, 5 }
proceed to STEP 4 "as planned"
Greedy choice definitely doesn't work: consider the set {5, 2, 1}. You'd remove the 1 first, but you should remove the 2.
I think you should work out the sum of the array modulo 3, which is either 0 (you're finished), or 1, or 2. Then you're looking to remove the minimal subset whose sum modulo 3 is 1 or 2.
I think that's fairly straightforward, so no real need for dynamic programming. Do it by removing one number with that modulus if possible, otherwise do it by removing two numbers with the other modulus. Once you know how many to remove, choose the smallest possible. You'll never need to remove three numbers.
You don't need to treat 0 specially, although if you're going to do that then you can further reduce the set under consideration in step 3 if you temporarily remove all 0, 3, 6, 9 from it.
Putting it all together, I would probably:
1. Sort the digits, descending.
2. Calculate the modulus. If 0, we're finished.
3. Try to remove a digit with that modulus, starting from the end. If successful, we're finished.
4. Remove two digits with negative-that-modulus, starting from the end. This always succeeds, so we're finished.
We might be left with an empty array (e.g. if the input is 1, 1), in which case the problem was impossible. Otherwise, the array contains the digits of our result.
Time complexity is O(n) provided that you do a counting sort in step 1. Which you certainly can since the values are digits.
What do you think about this:
First sort the array elements by value, then sum up all the numbers.
- If the sum's remainder after division by 3 is equal to 0, just return the sorted array.
- Otherwise:
    - If the sum of the remainders after division by 3 of all the numbers is smaller than the remainder of their sum, there is no solution.
    - Otherwise:
        - If it's equal to 1, try to return the smallest number with remainder equal to 1, or if there is none, try the two smallest with remainder equal to 2; if there are no such two (I suppose it can happen), there's no solution.
        - If it's equal to 2, try to return the smallest number with remainder equal to 2, or if there is none, try the two smallest with remainder equal to 1; if there are no such two, there's no solution.
First sort the array elements by remainder of division by 3, ascending;
then sort each subset of equal remainder by value, descending.
First, this problem reduces to maximizing the number of elements selected such that their sum is divisible by 3.
Trivial: Select all numbers divisible by 3 (0,3,6,9).
Let a be the elements that leave 1 as remainder, and b be the elements that leave 2 as remainder. If (|a|-|b|)%3 is 0, then select all elements from both a and b. If (|a|-|b|)%3 is 1, select all elements from b, and the |a|-1 highest numbers from a. If the remainder is 2, then select all numbers from a, and the |b|-1 highest numbers from b.
Once you have all the numbers, sort them in reverse order and concatenate. That is your answer.
Ultimately, if n is the number of elements, this algorithm returns a number that is at least n-1 digits long (except for corner cases, see below).
NOTE: Take care of corner cases (i.e. what if |a|=0 or |b|=0 etc). (-1)%3 = 2 and (-2)%3 = 1.
If m is the size of the alphabet, and n is the number of elements, this algorithm is O(m+n).
Sorting the data is unnecessary, since there are only ten different values.
Just count the number of zeroes, ones, twos etc. in O (n) if n digits are given.
Calculate the sum of all digits, check whether the remainder modulo 3 is 0, 1 or 2.
If the remainder is 1: Remove the first of the following which is possible (one of these is guaranteed to be possible): 1, 4, 7, 2+2, 2+5, 5+5, 2+8, 5+8, 8+8.
If the remainder is 2: Remove the first of the following which is possible (one of these is guaranteed to be possible): 2, 5, 8, 1+1, 1+4, 4+4, 1+7, 4+7, 7+7.
If there are no digits left then the problem cannot be solved. Otherwise, the solution is created by concatenating 9's, 8's, 7's, and so on as many as are remaining.
(Sorting n digits would take O (n log n). Unless of course you sort by counting how often each digit occurs and generating the sorted result according to these numbers).
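A sketch of the counting approach in Python, assuming every element is a single digit 0 to 9 as the question states. The names are illustrative only. It removes one smallest digit with the same residue as the sum, or failing that the two smallest digits with the other non-zero residue, then emits the remaining digits from 9 down to 0.

```python
def largest_multiple_of_3(digits):
    counts = [0] * 10
    for d in digits:
        counts[d] += 1
    remainder = sum(digits) % 3

    def remove_smallest(residue, how_many):
        # Digits with this residue mod 3, ascending, repeated by multiplicity.
        candidates = [d for d in range(10) if d % 3 == residue
                      for _ in range(counts[d])]
        if len(candidates) < how_many:
            return False
        for d in candidates[:how_many]:
            counts[d] -= 1
        return True

    if remainder and not remove_smallest(remainder, 1):
        remove_smallest(3 - remainder, 2)
    result = "".join(str(d) * counts[d] for d in range(9, -1, -1))
    return int(result) if result else None   # None: no digits left
```

For the array 4, 3, 0, 1, 5 this yields 5430, and for the set {5, 2, 1} it removes the 2 and yields 51.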
Amit's answer has a tiny thing missing.
If bucket1 is not empty but it contains only humongous values, let's say 79 and 97, and bucket2 is not empty either and its 2 minimals are, say, 2 and 5, then, when the modulus of the sum of all digits is 1, we should choose to remove 2 and 5 from bucket2 instead of the minimal in bucket1 to get the largest concatenated number.
Test case: 8 2 3 5 78 79
If we follow Amit's and Steve's suggested method, the largest number would be 878532, whereas the largest number divisible by 3 that is possible with this array is 879783.
The solution would be to compare the appropriate bucket's smallest minimal with the concatenation of both minimals of the other bucket and eliminate the smaller one.

Greatest GCD between some numbers

We've got some nonnegative numbers. We want to find the pair with maximum gcd. actually this maximum is more important than the pair!
For example if we have:
2 4 5 15
gcd(2,4)=2
gcd(2,5)=1
gcd(2,15)=1
gcd(4,5)=1
gcd(4,15)=1
gcd(5,15)=5
The answer is 5.
You can use the Euclidean Algorithm to find the GCD of two numbers.
int gcd(int a, int b)
{
    /* Euclid: replace (a, b) with (b, a mod b) until the remainder is 0 */
    while (b != 0)
    {
        int m = a % b;
        a = b;
        b = m;
    }
    return a;
}
If you want an alternative to the obvious algorithm, then assuming your numbers are in a bounded range, and you have plenty of memory, you can beat O(N^2) time, N being the number of values:
Create an array of a small integer type, indexes 1 to the max input. O(1)
For each value, increment the count at every index which is a factor of that value (make sure you don't wrap around). O(N).
Starting at the end of the array, scan back until you find a value >= 2. O(1)
That tells you the max gcd, but doesn't tell you which pair produced it. For your example input, the computed array looks like this:
index: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
count: 4 2 1 1 2 0 0 0 0  0  0  0  0  0  1
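The counting array above can be reproduced with a short Python sketch. It assumes positive integers, finds factors by trial division up to the square root (one possible choice, not necessarily what the answer had in mind), and uses invented function names.

```python
def divisor_counts(values):
    # counts[d] = how many input values are divisible by d (index 0 unused)
    counts = [0] * (max(values) + 1)
    for v in values:
        d = 1
        while d * d <= v:
            if v % d == 0:
                counts[d] += 1
                if d != v // d:        # avoid counting a square root twice
                    counts[v // d] += 1
            d += 1
    return counts

def max_pair_gcd_by_divisors(values):
    counts = divisor_counts(values)
    # Scan back from the end until some divisor is shared by two values.
    for g in range(len(counts) - 1, 0, -1):
        if counts[g] >= 2:
            return g
    return None
```

For the input 2 4 5 15 the scan stops at index 5, the first index from the end whose count is at least 2.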
I don't know whether this is actually any faster for the inputs you have to handle. The constant factors involved are large: the bound on your values and the time to factorise a value within that bound.
You don't have to factorise each value - you could use memoisation and/or a pregenerated list of primes. Which gives me the idea that if you are memoising the factorisation, you don't need the array:
Create an empty set of int, and a best-so-far value 1.
For each input integer:
- if it's less than or equal to best-so-far, continue.
- check whether it's in the set. If so, best-so-far = max(best-so-far, this-value), continue. If not:
    - add it to the set
    - repeat for all of its factors (larger than best-so-far).
Add/lookup in a set could be O(log N), although it depends what data structure you use. Each value has O(f(k)) factors, where k is the max value and I can't remember what the function f is...
The reason that you're finished with a value as soon as you encounter it in the set is that you've found a number which is a common factor of two input values. If you keep factorising, you'll only find smaller such numbers, which are not interesting.
I'm not quite sure what the best way is to repeat for the larger factors. I think in practice you might have to strike a balance: you don't want to do them quite in decreasing order because it's awkward to generate ordered factors, but you also don't want to actually find all the factors.
Even in the realms of O(N^2), you might be able to beat the use of the Euclidean algorithm:
Fully factorise each number, storing it as a sequence of exponents of primes (so for example 2 is {1}, 4 is {2}, 5 is {0, 0, 1}, 15 is {0, 1, 1}). Then you can calculate gcd(a,b) by taking the min value at each index and multiplying them back out. No idea whether this is faster than Euclid on average, but it might be. Obviously it uses a load more memory.
The optimisations I can think of are:
1) start with the two biggest numbers since they are likely to have most prime factors and thus likely to have the most shared prime factors (and thus the highest GCD).
2) When calculating the GCDs of other pairs you can stop your Euclidean algorithm loop if you get below your current greatest GCD.
Off the top of my head I can't think of a way that you can work out the greatest GCD of a pair without trying to work out each pair individually (and optimise a bit as above).
Disclaimer: I've never looked at this problem before and the above is off the top of my head. There may be better ways and I may be wrong. I'm happy to discuss my thoughts in more length if anybody wants. :)
There is no O(n log n) solution to this problem in general. In fact, the worst case is O(n^2) in the number of items in the list. Consider the following set of numbers:
2^20 3^13 5^9 7^2*11^4 7^4*11^3
Only the GCD of the last two is greater than 1, but the only way to know that from looking at the GCDs is to try out every pair and notice that one of them is greater than 1.
So you're stuck with the boring brute-force try-every-pair approach, perhaps with a couple of clever optimizations to avoid doing needless work when you've already found a large GCD (while making sure that you don't miss anything).
With some constraints, e.g. the numbers in the array are within a given range, say 1-1e7, it is doable in O(N log N) / O(MAX * log MAX), where MAX is the maximum possible value in A.
This is inspired by the sieve algorithm; I came across it in a Hackerrank challenge, where it is done for two arrays. Check their editorial.
- find min(A) and max(A): O(N)
- create a binary mask to mark which elements of A appear in the given range, for O(1) lookup; O(N) to build; O(MAX_RANGE) storage.
- for every number a in the range 1..max(A):
    for aa = a; aa <= max(A); aa += a:
        if aa is in A, increment a counter for a; if the counter is >= 2 (i.e. you have two numbers divisible by a), compare a to the current max_gcd;
- store the top two candidates for each GCD candidate.
- could also ignore elements which are less than the current max_gcd;
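A compact Python sketch of this sieve-style idea, assuming positive integers bounded by a manageable maximum. The function name is hypothetical, and a `Counter` replaces the binary mask so that duplicate values are counted. Candidates are tried from the largest downwards, so the first one with two multiples in the input is the answer.

```python
from collections import Counter

def max_pair_gcd_sieve(values):
    present = Counter(values)      # value -> number of occurrences
    limit = max(values)
    for g in range(limit, 0, -1):
        # Count the input values that are multiples of g.
        multiples = sum(present[m] for m in range(g, limit + 1, g))
        if multiples >= 2:
            return g
    return None                    # fewer than two values were given
```

The inner sums visit about MAX/g entries for each g, which adds up to roughly MAX * log MAX steps in the worst case.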
Previous answer:
Still O(N^2) -- sort the array; should eliminate some of the unnecessary comparisons;
max_gcd = 1
# assuming you want pairs of distinct elements.
sort(a) # assume in place
for ii = n - 1 : -1 : 0 do
    if a[ii] <= max_gcd
        break
    for jj = ii - 1 : -1 : 0 do
        if a[jj] <= max_gcd
            break
        current_gcd = GCD(a[ii], a[jj])
        if current_gcd > max_gcd:
            max_gcd = current_gcd
This should save some unnecessary computation.
There is a solution that would take O(n):
Let our numbers be a_i. First, calculate m=a_0*a_1*a_2*.... For each number a_i, calculate gcd(m/a_i, a_i). The number you are looking for is the maximum of these values.
I haven't proved that this is always true, but in your example, it works:
m=2*4*5*15=600,
max(gcd(m/2,2), gcd(m/4,4), gcd(m/5,5), gcd(m/15,15))=max(2, 2, 5, 5)=5
NOTE: This is not correct. If the number a_i has a factor p_j repeated twice, and if two other numbers also contain this factor, p_j, then you get the incorrect result p_j^2 instead of p_j. For example, for the set 3, 5, 15, 25, you get 25 as the answer instead of 5.
However, you can still use this to quickly filter out numbers. For example, in the above case, once you determine the 25, you can first do the exhaustive search for a_3=25 with gcd(a_3, a_i) to find the real maximum, 5, then filter out the gcd(m/a_i, a_i), i!=3, which are less than or equal to 5 (in the example above, this filters out 3 and 5, leaving only 15 to check).
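To make the counterexample concrete, here is a small Python check. The helper names are made up, and `math.prod` needs Python 3.8 or later. It computes gcd(m/a_i, a_i) for every element and compares the largest of these with the true maximum pairwise gcd.

```python
from itertools import combinations
from math import gcd, prod

def product_bounds(values):
    # g_i = gcd(m / a_i, a_i): an upper bound on gcd(a_i, a_j) for any j != i
    m = prod(values)
    return [gcd(m // a, a) for a in values]

def true_max_gcd(values):
    return max(gcd(a, b) for a, b in combinations(values, 2))

print(product_bounds([2, 4, 5, 15]), true_max_gcd([2, 4, 5, 15]))
print(product_bounds([3, 5, 15, 25]), true_max_gcd([3, 5, 15, 25]))
```

For 2, 4, 5, 15 the bounds are 2, 2, 5, 5 and the true maximum is 5; for 3, 5, 15, 25 the bounds are 3, 5, 15, 25 while the true maximum is only 5, so the values are bounds rather than answers.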
Added for clarification and justification:
To see why this should work, note that gcd(a_i, a_j) divides gcd(m/a_i, a_i) for all j!=i.
Let's call gcd(m/a_i, a_i) as g_i, and max(gcd(a_i, a_j),j=1..n, j!=i) as r_i. What I say above is g_i=x_i*r_i, and x_i is an integer. It is obvious that r_i <= g_i, so in n gcd operations, we get an upper bound for r_i for all i.
The above claim is not very obvious. Let's examine it a bit deeper to see why it is true: the gcd of a_i and a_j is the product of all prime factors that appear in both a_i and a_j (by definition). Now, multiply a_j with another number, b. The gcd of a_i and b*a_j is either equal to gcd(a_i, a_j), or is a multiple of it, because b*a_j contains all prime factors of a_j, and some more prime factors contributed by b, which may also be included in the factorization of a_i. In fact, gcd(a_i, b*a_j)=gcd(a_i/gcd(a_i, a_j), b)*gcd(a_i, a_j), I think. But I can't see a way to make use of this. :)
Anyhow, in our construction, m/a_i is simply a shortcut to calculate the product of all a_j, where j=1..n, j!=i. As a result, gcd(m/a_i, a_i) contains all gcd(a_i, a_j) as a factor. So, obviously, the maximum of these individual gcd results will divide g_i.
Now, the largest g_i is of particular interest to us: it is either the maximum gcd itself (if x_i is 1), or a good candidate for being one. To do that, we do another n-1 gcd operations, and calculate r_i explicitly. Then, we drop all g_j less than or equal to r_i as candidates. If we don't have any other candidate left, we are done. If not, we pick up the next largest g_k, and calculate r_k. If r_k <= r_i, we drop g_k, and repeat with another g_k'. If r_k > r_i, we filter out remaining g_j <= r_k, and repeat.
I think it is possible to construct a number set that will make this algorithm run in O(n^2) (if we fail to filter out anything), but on random number sets, I think it will quickly get rid of large chunks of candidates.
pseudocode
function getGcdMax(array[])
    arrayUB=upperbound(array)
    if (arrayUB<1)
        error
    pointerA=0
    pointerB=1
    gcdMax=0
    do
        gcdMax=MAX(gcdMax,gcd(array[pointerA],array[pointerB]))
        pointerB++
        if (pointerB>arrayUB)
            pointerA++
            pointerB=pointerA+1
    until (pointerB>arrayUB)
    return gcdMax
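The pseudocode above tries every pair once. A runnable Python equivalent might look like the following; the function name is invented, `math.gcd` does the pairwise work, and `itertools.combinations` replaces the two pointers.

```python
from itertools import combinations
from math import gcd

def max_pair_gcd_bruteforce(values):
    if len(values) < 2:
        raise ValueError("need at least two numbers")
    # Every unordered pair is examined exactly once: O(n^2) gcd calls.
    return max(gcd(a, b) for a, b in combinations(values, 2))

print(max_pair_gcd_bruteforce([2, 4, 5, 15]))
```

For the question's example 2 4 5 15 this prints 5.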
