Related
I found this problem in a hiring contest(which is over now). Here it is:
You are given two natural numbers N and X. You are required to create an array of N natural numbers such that the bitwise XOR of these numbers is equal to X. The sum of all the natural numbers that are available in the array is as minimum as possible.
If there exist multiple arrays, print the smallest one
Array A< Array B if
A[i] < B[i] for any index i, and A[i]=B[i] for all indices less than i
Sample Input: N=3, X=2
Sample output : 1 1 2
Explanation: We have to print 3 natural numbers having the minimum sum Thus the N-spaced numbers are [1 1 2]
My approach:
If N is odd, I put N-1 ones in the array (so that their xor is zero) and then put X
If N is even, I put N-1 ones again and then put X-1(if X is odd) and X+1(if X is even)
But this algorithm failed for most of the test cases. For example, when N=4 and X=6 my output is
1 1 1 7 but it should be 1 1 2 4
Anyone knows how to make the array sum minimum?
In order to have the minimum sum, you need to make sure that when your target is X, you are not cancelling the bits of X and recreating them again. Because this will increase the sum. For this, you have create the bits of X one by one (ideally) from the end of the array. So, as in your example of N=4 and X=6 we have: (I use ^ to show xor)
X= 7 = 110 (binary) = 2 + 4. Note that 2^4 = 6 as well because these numbers don't share any common bits. So, the output is 1 1 2 4.
So, we start by creating the most significant bits of X from the end of the output array. Then, we also have to handle the corner cases for different values of N. I'm going with a number of different examples to make the idea clear:
``
A) X=14, N=5:
X=1110=8+4+2. So, the array is 1 1 2 4 8.
B) X=14, N=6:
X=8+4+2. The array should be 1 1 1 1 2 12.
C) X=15, N=6:
X=8+4+2+1. The array should be 1 1 1 2 4 8.
D) X=15, N=5:
The array should be 1 1 1 2 12.
E) X=14, N=2:
The array should be 2 12. Because 12 = 4^8
``
So, we go as follows. We compute the number of powers of 2 in X. Let this number be k.
Case 1 - If k <= n (example E): we start by picking the smallest powers from left to right and merge the remaining on the last position in the array.
Case 2 - If k > n (example A, B, C, D): we compute h = n - k. If h is odd we put h = n-k+1. Now, we start by putting h 1's in the beginning of the array. Then, the number of places left is less than k. So, we can follow the idea of Case 1 for the remaining positions. Note that in case 2, instead of having odd number of added 1's we put and even number of 1's and then do some merging at the end. This guarantees that the array is the smallest it can be.
We have to consider that we have to minimize the sum of the array for solution and that is the key point.
First calculate set bits in N suppose if count of setbits are less than or equal to X then divide N in X integers based on set bits like
N = 15, X = 2
setbits in 15 are 4 solution is 1 14
if X = 3 solution is 1 2 12
this minimizes array sum too.
other case if setbits are greater than X
calculate difference = setbits(N) - X
If difference is even then add ones as needed and apply above algorithm all ones will cancel out.
If difference is odd then add ones but now you have take care of that 1 extra one in the answer array.
Check for the corner cases too.
The problems is to find the count of numbers between A and B (inclusive) that have sum of digits equal to S.
Also print the smallest such number between A and B (inclusive).
Input:
Single line consisting of A,B,S.
Output:
Two lines.
In first line the number of integers between A and B having sum of digits equal to S.
In second line the smallest such number between A and B.
Constraints:
1 <= A <= B < 10^15
1 <= S <= 135
Source: Hacker Earth
My solution works for only 30 pc of their inputs. What could be the best possible solution to this?
The algorithm I am using now computes the sum of the smallest digit and then upon every change of the tens digit computes the sum again.
Below is the solution in Python:
def sum(n):
if (n<10):return n
return n%10 + sum(n/10)
stri = raw_input()
min = 99999
stri = stri.split(" ")
a= long (stri[0])
b= long (stri[1])
s= long (stri[2])
count= 0
su = sum(a)
while a<=b :
if (a % 10 == 0 ):
su = sum(a)
print a
if ( s == su):
count+=1
if (a<= min):
min=a
a+=1
su+=1
print count
print min
There are two separate problems here: finding the smallest number between those numbers that has the right digit sum and finding the number of values in the range with that digit sum. I'll talk about those problems separately.
Counting values between A and B with digit sum S.
The general approach for solving this problem will be the following:
Compute the number of values less than or equal to A - 1 with digit sum S.
Compute the number of values less than or equal to B with digit sum S.
Subtract the first number from the second.
To do this, we should be able to use a dynamic programming approach. We're going to try to answer queries of the following form:
How many D-digit numbers are there, whose first digit is k, whose digits that sum up to S?
We'll create a table N[D, k, S] to hold these values. We know that D is going to be at most 16 and that S is going to be at most 136, so this table will have only 10 × 16 × 136 = 21,760 entries, which isn't too bad. To fill it in, we can use the following base cases:
N[1, S, S] = 1 for 0 ≤ S ≤ 9, since there's only one one-digit number that sums up to any value less than ten.
N[1, k, S] = 0 for 0 ≤ S ≤ 9 if k ≠ S, since no one-digit number whose first digit isn't a particular sum sums up to some value.
N[1, k, S] = 0 for 10 ≤ S ≤ 135, since no one-digit number sums up to exactly S for any k greater than a single digit.
N[1, k, S] = 0 for any S < 0.
Then, we can use the following logic to fill in the other table entries:
N[D + 1, k, S] = sum(i from 0 to 9) N[D, i, S - k].
This says that the number of (D+1)-digit numbers whose first digit is k that sum up to S is given by the number of D-digit numbers that sum up to S - k. The number of D-digit numbers that sum up to S - k is given by the number of D-digit numbers that sum up to S - k whose first digit is 0, 1, 2, ..., 9, so we have to sum up over them.
Filling in this DP table takes time only O(1), and in fact you could conceivably precompute it and hardcode it into the program if you were really concerned about time.
So how can we use this table? Well, suppose we want to know how many numbers that sum up to S are less than or equal to some number X. To do this, we can process the digits of X one at a time. Let's write X one digit at a time as d1 ... dn. We can start off by looking at N[n, d1, S]. This gives us the number of n-digit numbers whose first digit is d1 that sum up to S. This may overestimate the number of values less than or equal to X that sum up to S. For example, if our number is 21,111 and we want the number of values that sum up to exactly 12, then looking up this table value will give us false positives for numbers like 29,100 that start with a 2 and are five digits long, but which are still greater than X. To handle this, we can move to the next digit of the number X. Since the first digit was a 2, the rest of the digits in the number must sum up to 10. Moreover, since the next digit of X (21,111) is a 1, we can now subtract from our total the number of 4-digit numbers starting with 2, 3, 4, 5, ..., 9 that add up to 10. We can then repeat this process one digit at a time.
More generally, our algorithm will be as follows. Let X be our number and S the target sum. Write X = d1d2...dn and compute the following:
# Begin by starting with all numbers whose first digit is less than d[1].
# Those count as well.
result = 0
for i from 0 to d[1]:
result += N[n, i, S]
# Now, exclude everything whose first digit is d[1] that is too large.
S -= d[1]
for i = 2 to n:
for j = d[i] to 8:
result -= N[n, d[i], S]
S -= d[i]
The value of result will then be the number of values less than or equal to X that sum up to exactly S. This algorithm will only run for at most 16 iterations, so it should be very quick. Moreover, using this algorithm and the earlier subtraction trick, we can use it to compute how many values between A and B sum up to exactly S.
Finding the smallest value in [A, B] with digit sum S.
We can use a similar trick with our DP table to find the smallest number greater than A number that sums up to exactly S. I'll leave the details as an exercise, but as a hint, work one digit at a time, trying to find the smallest number for which the DP table returns a nonzero value.
Hope this helps!
I want to find if sum of first k digits of few numbers in given range is equal to sum of last k digits. Here the range is very large and k is less than 20.
One way we can do this is by brute force method. Can someone suggest some other efficient algo. for same?
If it is a range, the first digits will not change often and the last digits will change in a simple way. S is the sum of the first 20 digits. While the secund digit doesn't change, the sum will be increased by one when you go to the next digit. So if all yours digits, except the last one, are fixed, and if the sum with the last digit equal to i is Si, you the only good last digit is n= S - Si + i. You then have to check if n is between 0 and 9, and if the resulting number is in the interval. This decrease by ten the number of lookups.
You can check for the next secund lower digits.
If the first n is lower than 0, you need to decrease the secund digit by -n. Call n2 this secund digit. If n2 > = 0, the good numbers will end by (n2,0), (n2 -1,1), ..., (0, n2). This decrease the complexity by 100.
If n is bigger than 10, you increase the second digit by n-9. Call n2 the second digit. If n2<=9, the good numbers are (n2,9),(n2-1,8),...,(0,something).
This also decrease the complexity by 100.
You can do the same for the third digit, and then for the fourth, up to the 20. This will result in just 1 sum, and a complexity in O(number of solutions), so it is minimal. For coding, be careful that your firsts numbers can change. Do one computation per group of 20 first numbers.
one theoretical improvement to the brute force method:
1) sum up the frist k digits, store in sumFirst
2) sum up the last k digits, but stop if sum exceeds sumFirst.
Point 2 could save summing up some of the last few digits.
But you have to measure if the additional logic, costs more then simply adding all k digits.
Optimization N-k
One way to improve the algorithm is if when the number having N digits has the following property:N < 2k.
For instance if N = 5 and k = 3, 5 < 2x3, digits being
abcde
you only have to count ab against de (ie no need to check k (3) digits, since the 3rd is shared by k-last and k-first digits).In other words, the number of digits to be counted both sides is only
min(k, N-k), having N >= k
If you are going to use that multiple times for the same array, you can sum all element with previous elements which is O(n) where the size of array is n i.e
for(int i = 1; i < n; i++)
arr[i] = arr[i] + arr[i-1];
This will convert your array from probability density function to cumulative distribution function (for discrete numbers). Therefore your query is going to be O(1) i.e.
if(arr[k-1] == (arr[n-1]-arr[n-k])) //arr[k-1] is sum of first k element
return true;
return false;
another improvement over the brute force:
i = 0, T = 0
while |T| < 9 * (k - i)
T = T + last[i] - first[i]
i = i + 1
return T == 0
My input are three numbers - a number s and the beginning b and end e of a range with 0 <= s,b,e <= 10^1000. The task is to find the minimal Levenstein distance between s and all numbers in range [b, e]. It is not necessary to find the number minimizing the distance, the minimal distance is sufficient.
Obviously I have to read the numbers as string, because standard C++ type will not handle such large numbers. Calculating the Levenstein distance for every number in the possibly huge range is not feasible.
Any ideas?
[EDIT 10/8/2013: Some cases considered in the DP algorithm actually don't need to be considered after all, though considering them does not lead to incorrectness :)]
In the following I describe an algorithm that takes O(N^2) time, where N is the largest number of digits in any of b, e, or s. Since all these numbers are limited to 1000 digits, this means at most a few million basic operations, which will take milliseconds on any modern CPU.
Suppose s has n digits. In the following, "between" means "inclusive"; I will say "strictly between" if I mean "excluding its endpoints". Indices are 1-based. x[i] means the ith digit of x, so e.g. x[1] is its first digit.
Splitting up the problem
The first thing to do is to break up the problem into a series of subproblems in which each b and e have the same number of digits. Suppose e has k >= 0 more digits than s: break up the problem into k+1 subproblems. E.g. if b = 5 and e = 14032, create the following subproblems:
b = 5, e = 9
b = 10, e = 99
b = 100, e = 999
b = 1000, e = 9999
b = 10000, e = 14032
We can solve each of these subproblems, and take the minimum solution.
The easy cases: the middle
The easy cases are the ones in the middle. Whenever e has k >= 1 more digits than b, there will be k-1 subproblems (e.g. 3 above) in which b is a power of 10 and e is the next power of 10, minus 1. Suppose b is 10^m. Notice that choosing any digit between 1 and 9, followed by any m digits between 0 and 9, produces a number x that is in the range b <= x <= e. Furthermore there are no numbers in this range that cannot be produced this way. The minimum Levenshtein distance between s (or in fact any given length-n digit string that doesn't start with a 0) and any number x in the range 10^m <= x <= 10^(m+1)-1 is necessarily abs(m+1-n), since if m+1 >= n it's possible to simply choose the first n digits of x to be the same as those in s, and delete the remainder, and if m+1 < n then choose the first m+1 to be the same as those in s and insert the remainder.
In fact we can deal with all these subproblems in a single constant-time operation: if the smallest "easy" subproblem has b = 10^m and the largest "easy" subproblem has b = 10^u, then the minimum Levenshtein distance between s and any number in any of these ranges is m-n if n < m, n-u if n > u, and 0 otherwise.
The hard cases: the end(s)
The hard cases are when b and e are not restricted to have the form b = 10^m and e = 10^(m+1)-1 respectively. Any master problem can generate at most two subproblems like this: either two "ends" (resulting from a master problem in which b and e have different numbers of digits, such as the example at the top) or a single subproblem (i.e. the master problem itself, which didn't need to be subdivided at all because b and e already have the same number of digits). Note that due to the previous splitting of the problem, we can assume that the subproblem's b and e have the same number of digits, which we will call m.
Super-Levenshtein!
What we will do is design a variation of the Levenshtein DP matrix that calculates the minimum Levenshtein distance between a given digit string (s) and any number x in the range b <= x <= e. Despite this added "power", the algorithm will still run in O(n^2) time :)
First, observe that if b and e have the same number of digits and b != e, then it must be the case that they consist of some number q >= 0 of identical digits at the left, followed by a digit that is larger in e than in b. Now consider the following procedure for generating a random digit string x:
Set x to the first q digits of b.
Append a randomly-chosen digit d between b[i] and e[i] to x.
If d == b[i], we "hug" the lower bound:
For i from q+1 to m:
If b[i] == 9 then append b[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that e[i] will be larger then b[i], and there is no digit larger than 9!]
Otherwise, flip a coin:
Heads: Append b[i].
Tails: Append a randomly-chosen digit d > b[i], then goto 6.
Stop.
Else if d == e[i], we "hug" the upper bound:
For i from q+1 to m:
If e[i] == 0 then append e[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that b[i] will be smaller then e[i], and there is no digit smaller than 0!]
Otherwise, flip a coin:
Heads: Append e[i].
Tails: Append a randomly-chosen digit d < e[i], then goto 6.
Stop.
Otherwise (if d is strictly between b[i] and e[i]), drop through to step 6.
Keep appending randomly-chosen digits to x until it has m digits.
The basic idea is that after including all the digits that you must include, you can either "hug" the lower bound's digits for as long as you want, or "hug" the upper bound's digits for as long as you want, and as soon as you decide to stop "hugging", you can thereafter choose any digits you want. For suitable random choices, this procedure will generate all and only the numbers x such that b <= x <= e.
In the "usual" Levenshtein distance computation between two strings s and x, of lengths n and m respectively, we have a rectangular grid from (0, 0) to (n, m), and at each grid point (i, j) we record the Levenshtein distance between the prefix s[1..i] and the prefix x[1..j]. The score at (i, j) is calculated from the scores at (i-1, j), (i, j-1) and (i-1, j-1) using bottom-up dynamic programming. To adapt this to treat x as one of a set of possible strings (specifically, a digit string corresponding to a number between b and e) instead of a particular given string, what we need to do is record not one but two scores for each grid point: one for the case where we assume that the digit at position j was chosen to hug the lower bound, and one where we assume it was chosen to hug the upper bound. The 3rd possibility (step 5 above) doesn't actually require space in the DP matrix because we can work out the minimal Levenshtein distance for the entire rest of the input string immediately, very similar to the way we work it out for the "easy" subproblems in the first section.
Super-Levenshtein DP recursion
Call the overall minimal score at grid point (i, j) v(i, j). Let diff(a, b) = 1 if characters a and b are different, and 0 otherwise. Let inrange(a, b..c) be 1 if the character a is in the range b..c, and 0 otherwise. The calculations are:
# The best Lev distance overall between s[1..i] and x[1..j]
v(i, j) = min(hb(i, j), he(i, j))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the lower bound
hb(i, j) = min(hb(i-1, j)+1, hb(i, j-1)+1, hb(i-1, j-1)+diff(s[i], b[j]))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the upper bound
he(i, j) = min(he(i-1, j)+1, he(i, j-1)+1, he(i-1, j-1)+diff(s[i], e[j]))
At the point in time when v(i, j) is being calculated, we will also calculate the Levenshtein distance resulting from choosing to "stop hugging", i.e. by choosing a digit that is strictly in between b[j] and e[j] (if j == q) or (if j != q) is either above b[j] or below e[j], and thereafter freely choosing digits to make the suffix of x match the suffix of s as closely as possible:
# The best Lev distance possible between the ENTIRE STRINGS s and x, given that
# we choose to stop hugging at the jth digit of x, and have optimally aligned
# the first i digits of s to these j digits
sh(i, j) = if j >= q then shc(i, j)+abs(n-i-m+j)
else infinity
shc(i, j) = if j == q then
min(hb(i, j-1)+1, hb(i-1, j-1)+inrange(s[i], (b[j]+1)..(e[j]-1)))
else
min(hb(i, j-1)+1, hb(i-1, j-1)+inrange(s[i], (b[j]+1)..9),
he(i, j-1)+1, he(i-1, j-1)+inrange(s[i], (0..(e[j]-1)))
The formula for shc(i, j) doesn't need to consider "downward" moves, since such moves don't involve any digit choice for x.
The overall minimal Levenshtein distance is the minimum of v(n, m) and sh(i, j), for all 0 <= i <= n and 0 <= j <= m.
Complexity
Take N to be the largest number of digits in any of s, b or e. The original problem can be split in linear time into at most 1 set of easy problems that collectively takes O(1) time to solve and 2 hard subproblems that each take O(N^2) time to solve using the super-Levenshtein algorithm, so overall the problem can be solved in O(N^2) time, i.e. time proportional to the square of the number of digits.
A first idea to speed up the computation (works if |e-b| is not too large):
Question: how much can the Levestein distance change when we compare s with n and then with n+1?
Answer: not too much!
Let's see the dynamic-programming tables for s = 12007 and two consecutive n
n = 12296
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 3
and
n = 12297
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 2
As you can see, only the last column changes, since n and n+1 have the same digits, except for the last one.
If you have the dynamic-programming table for the edit-distance of s = 12001 and n = 12296, you already have the table for n = 12297, you just need to update the last column!
Obviously if n = 12299 then n+1 = 12300 and you need to update the last 3 columns of the previous table.. but this happens just once every 100 iteration.
In general, you have to
update the last column on every iterations (so, length(s) cells)
update the second-to-last too, once every 10 iterations
update the third-to-last, too, once every 100 iterations
so let L = length(s) and D = e-b. First you compute the edit-distance between s and b. Then you can find the minimum Levenstein distance over [b,e] looping over every integer in the interval. There are D of them, so the execution time is about:
Now since
we have an algorithm wich is
I need to loop through all n-bit integers which has at most k bits ON (bits 1), where 0 < n <= 32 and 0 <= k <= n. For example, if n = 4 and k = 2 then these numbers are (in binary digits): 0000, 0001, 0010, 0100, 1000, 0011, 0101, 0110, 1001, 1010, 1100. The order in which these numbers are looped through is not important, but each is visited only once.
Currently I am using this straightforward algorithm:
for x = 0 to (2^n - 1)
count number of bits 1 in x
if count <= k
do something with x
end if
end for
I think this algorithm is inefficient because it has to loop through too many numbers. For example, if n = 32 and k = 2 then it has to loop through 2^32 numbers to find only 529 numbers (which have <= 2 bits 1).
My question is: is there any more efficient algorithm to do this?
You are going to need to make your own bitwise counting algorithm for incrementing the loop counter. Basically, for calculating the next number in the sequence, if there are fewer than k '1' bits, increment normally, if there are k '1' bits, pretend the '0' bits after the least significant '1' don't exist and increment normally.
Another way of saying it is that with a standard counter you add 1 to the least significant bit and carry. In your case, when there are k number of '1's you will add in the 1 to the lowest '1' bit.
For instance if k is 2 and you have 1010 ignore the last 0 and increment the 101 so you get 110 and then add in the 0 for 1100.
Here is Pseudocode for incrementing the number:
Count 1 bits in current number
If number of 1's is < k
number = number + 1
Else
shift_number = number of 0 bits after least significant 1 bit
number = number >> shift_number
number = number + 1
number = number << shift_number
Take the answer to Bit hack to generate all integers with a given number of 1s and loop over [1,k]. That will generate each integer with up to k bits once.
If you have to set 2 bits in 4, the lowest bit set must be at most the third (counting from 0...3) and the highest at least the second.
So you could loop with 2 loops
for lowest in 0 to (n-k)
for highest in lowest + 1 to 3
(0000).setBit (lowest).setBit (highest)
Since you don't want to write 16 loops for 16 bits, you might transform this idea into a recursive one.
Combinatorics.
If you have integers with n bit length and r bits set then there are nCr such numbers. Simply use a combination generator and iterate over the combinations as appropriate.
you can use a while loop as shown below. This loop will only loop for no of bits on. incase your no of bits is fixed you can use a break
countbits = 0
while num > 0
num = num & (num-1)
countbits = countbits + 1
end while
eg:
if 64(1000000) it will loop only once,
if 72(1001000) then 2 times