Stuck on complexity analysis of a tricky program - algorithm

I'm really stuck on the complexity analysis of this problem.
Given the digits 0–9, we need to find all the numbers of maximum length k whose digits are in non-decreasing order.
For example, if k = 3, the numbers can be 0, 00, 000, 01, 02, 03, 04, ..., 1, 11, 111, 12, ...
So the question is basically this: if digits may be repeated,
how many numbers of at most k digits are there whose digits, read from left to right, are in non-decreasing order?

Numbers with at most k digits that are weakly increasing are in 1-1 correspondence with binary strings of length k+10 with exactly ten 1's. The number of consecutive 0s immediately before the (i+1)-th 1 in the binary string is the number of times the digit i occurs in the original number. For example, if k=7, then 001119 maps to 00100011111111010 (2 zeros, 3 ones, 0 twos, 0 threes, ..., 0 eights, 1 nine, and 1 leftover 0 to bring the number of digits up to 7).
These binary strings are easy to count: there's choose(k+10, 10)-1 of them (missing one because the empty number is disallowed). This can be computed in O(1) arithmetic operations (actually 10 additions, 18 multiplications and one division).
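As a sanity check on the formula above, here is a minimal Python sketch that compares choose(k+10, 10)-1 with a brute-force enumeration for small k. The function names are made up for this illustration, and leading zeros are treated as significant (so 0, 00 and 000 are distinct), as in the question.

```python
from itertools import combinations_with_replacement
from math import comb

def count_by_enumeration(k: int) -> int:
    """Count digit strings of length 1..k whose digits are non-decreasing."""
    total = 0
    for length in range(1, k + 1):
        # Every multiset of `length` digits yields exactly one
        # non-decreasing string, so counting multisets is enough.
        total += sum(1 for _ in combinations_with_replacement("0123456789", length))
    return total

def count_by_formula(k: int) -> int:
    """Closed form from the answer: choose(k+10, 10) - 1."""
    return comb(k + 10, 10) - 1

for k in range(1, 6):
    assert count_by_enumeration(k) == count_by_formula(k)

print(count_by_formula(3))  # 285
```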

I don't have enough reputation either, so I cannot comment on Paul's or Globe's answer.
Globe's answer, choose(k+9,9), is not quite right, because it only counts the numbers that have exactly k digits. The original problem allows numbers with fewer digits too.
Paul's answer, choose(k+10,10), counts these shorter numbers too, but it also counts the number with zero digits. Say k=7; then the following binary string describes a number with no digits: 11111111110000000. We have to exclude this one.
So the solution is: choose(k+10,10)-1

I don't have enough reputation to comment on Paul's answer, so I'm adding another answer. The formula isn't choose(k+10, 10) as specified by Paul, it's choose(k+9, 9).
For instance, if we have k=2, choose(2+10, 10) gives us 66, when there are only 55 numbers with exactly 2 digits that satisfy the property.
We pick stars and separators, where the separators divide our digits into buckets from 0 to 9, and the stars tell us how many digits to pick from each bucket. (E.g. **|**||*||||||* corresponds to 001139.)
The reasoning behind it being k+9 and not k+10 is as follows:
we need only 9 separators to split 10 digits into buckets, so we arrange k stars and 9 separators, which gives k+9 positions in total.
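To make the stars-and-separators picture concrete, here is a small Python sketch that encodes a non-decreasing digit string in that notation. The helper name `to_stars_and_bars` is hypothetical, chosen only for this example.

```python
def to_stars_and_bars(number: str) -> str:
    """Encode a digit string as stars (one per digit) split by 9 separators."""
    # One bucket per digit 0..9; each bucket holds one star per occurrence.
    buckets = ["*" * number.count(str(d)) for d in range(10)]
    return "|".join(buckets)

print(to_stars_and_bars("001139"))  # **|**||*||||||*
```

Every arrangement of k stars and 9 separators corresponds to one number with exactly k digits, which is where choose(k+9, 9) comes from.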

Related

Number of positive integers in [1,1e18] that cannot be divided by any integers in [2,10]

I am having difficulty trying to solve the following problem:
For Q queries (Q <= 1e6), where each query is a positive integer N (N <= 1e18), find for each query the number of integers in [1,N] that are not divisible by any integer in [2,10].
I thought of using a sieve (similar to the sieve of Eratosthenes) to filter out numbers in [1,1e18] for each query. However, N can be very large, so that method is not feasible. The most useful observation I could make is that numbers ending in 0, 2, 4, 5, 6 or 8 are invalid, but that does not get me far.
I saw a solution for a similar problem with a smaller number of queries (Q <= 200), but it doesn't work for this problem (and I don't understand that solution).
Could someone please advise me on how to solve this problem?
The only numbers in [2,10] that matter are the primes: 2, 3, 5 and 7.
So a number that is not divisible by any integer in [2,10] is simply a number that is not divisible by any of {2,3,5,7}.
The count of such numbers equals the total count of numbers in [1,n] minus the count of those divisible by at least one of {2,3,5,7}.
Now the fun part: how many numbers in [1,n] are divisible by 2?
The answer is floor(n/2) (why? Because one out of every 2 consecutive numbers is divisible by 2).
Similarly, how many numbers are divisible by 5? The answer is floor(n/5).
...
So, do we have our answer yet? No, because we have double-counted the numbers divisible by both members of a pair such as {2,5} or {2,7}, so we need to subtract those.
But wait, now we have subtracted too much for the numbers divisible by a triple such as {2,5,7}, so we need to add those back.
...
Keep doing this until all combinations are taken care of.
There are 2^4 = 16 combinations in total, which is small enough to handle directly.
Take a look at the inclusion-exclusion principle for a good understanding.
Good luck!
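The inclusion-exclusion idea in this answer can be sketched in Python as follows. This is only an illustration under the assumptions stated in the answer; the name `count_not_divisible` is invented here.

```python
from itertools import combinations
from math import prod

PRIMES = (2, 3, 5, 7)

def count_not_divisible(n: int) -> int:
    """Count integers in [1, n] divisible by none of 2, 3, 5, 7."""
    total = 0
    for size in range(len(PRIMES) + 1):
        for subset in combinations(PRIMES, size):
            # Subsets of even size are added, subsets of odd size subtracted.
            # The empty subset has product 1 and contributes n itself.
            sign = -1 if size % 2 else 1
            total += sign * (n // prod(subset))
    return total

print(count_not_divisible(100))  # 22
```

Each query costs 16 integer divisions, so 1e6 queries with N up to 1e18 stay cheap.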
Here is an approach on how to handle this.
The place to start is to think about how you can split this into pieces. With such a problem, a good starting point is the least common multiple (LCM) of the divisors, in this case 2,520 (the smallest number divisible by all the numbers from 2 to 10).
The idea is that if x is not divisible by any number from 2-10, then x + 2,520 is also not divisible by any of them.
Hence, you can divide the problem into two pieces:
How many numbers between 1 and 2,520 are "relatively prime" to the numbers from 2-10?
How many times does 2,520 go into your target number? You need to take the remainder into account as well.
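A minimal Python sketch of this two-piece approach, assuming a prefix-count table over one period of 2,520 is precomputed once. The names `PERIOD`, `prefix` and `count_by_period` are illustrative, not from the answer.

```python
PERIOD = 2520  # lcm(2, 3, ..., 10)

# prefix[r] = how many x in [1, r] are divisible by none of 2..10
prefix = [0] * (PERIOD + 1)
for x in range(1, PERIOD + 1):
    ok = all(x % d != 0 for d in range(2, 11))
    prefix[x] = prefix[x - 1] + ok

def count_by_period(n: int) -> int:
    """Answer one query in O(1) after the table has been built."""
    full_blocks, remainder = divmod(n, PERIOD)
    return full_blocks * prefix[PERIOD] + prefix[remainder]

print(prefix[PERIOD])        # 576
print(count_by_period(100))  # 22
```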

Hashing Elements by Mid-Square

I'm making a hash function for a hash table of size 10 (indexes 0-9), hashing elements with the mid-square method.
I'm confused about whether I should use 1 middle digit, or 2 middle digits and then take the result mod 10.
If I take 2 middle digits and then take mod 10, the method fails when the squared number has 3 digits. Which two digits would I take then?
And if I take 1 middle digit, I have a problem picking the middle digit when the squared number has an even number of digits. I know that in this situation both of the middle two digits are influenced by all the digits of the original number.
Right now, I'm going with the {floor(n/2+1)}th digit. This works for a three-digit squared number, and when the squared number has an even number of digits, it takes the later of the middle two.
Is there a more efficient approach to doing this?
Thanks.
I was taught that the correct way to do mid-square hashing is to use 1 digit if the length of the resulting square is odd, and the middle 2 digits if it is even.
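A rough Python sketch of the rule in this answer, combined with the mod-10 step from the question. The function name and the `table_size` parameter are assumptions made for the example.

```python
def mid_square_hash(key: int, table_size: int = 10) -> int:
    """Mid-square hash: 1 middle digit for odd-length squares, 2 for even."""
    squared = str(key * key)
    mid = len(squared) // 2
    if len(squared) % 2 == 1:
        middle = squared[mid]               # odd length: single middle digit
    else:
        middle = squared[mid - 1:mid + 1]   # even length: two middle digits
    return int(middle) % table_size

print(mid_square_hash(12))  # 12^2 = 144  -> middle "4"  -> 4
print(mid_square_hash(35))  # 35^2 = 1225 -> middle "22" -> 2
```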

how to find gcd of all permutations of a large number of order 10^250?

Consider the decimal representation of a natural number N. Find the greatest common divisor (GCD) of all numbers that can be obtained by permuting the digits of the given number. Leading zeroes are allowed.
I don't want the code, just the logic on how to approach the problem
http://www.spoj.com/problems/GCD/
Here is the pseudo code that I was trying:
if sum of digits divide by 3 then k=3
if sum of digits divide by 9 then k=9
else k=1
if all digits divide by 2 then o=2
if all digits divide by 4 then o=4
if all digits divide by 8 then o=8
if all digits divide by 5 then o=5
if all digits divide by 7 then o=7
else o=1
if all digits are same , print itself
else print o*k
But I am getting Wrong Answer every time.
Let me think...
If for example your huge number contains the digits two and three, then one permutation ends in 23, another permutation is identical except that it ends in 32. The difference is 9. Therefore the gcd of all permutations is a factor of 9, which means it is 9, 3 or 1.
Can you go from there?
To give you a stronger hint: Every common divisor of x and y is also a divisor of x-y. That's why you don't need to find all the divisors of 250 digit numbers. You find the divisors of some differences of such numbers.
If you have two digits 5 and 7 in your number (because that is what you asked), there is one permutation (248 digits)57 and another permutation (same 248 digits)75. The difference is 18. The gcd of all numbers divides 18.
Now it's your turn. What can you conclude if you have two digit 2 and 9 by taking one permutation ending in 29 and one ending in 92? And if you have more than two different digits, what can you conclude?
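To experiment with these hints on short inputs, a brute-force check can help. The Python sketch below is not a solution for 250-digit inputs (the number of permutations grows factorially); it only lets you test conjectures on a handful of digits. The name `permutation_gcd` is made up for this illustration.

```python
from functools import reduce
from itertools import permutations
from math import gcd

def permutation_gcd(digits: str) -> int:
    """GCD of every number formed by permuting `digits` (leading zeros allowed)."""
    values = {int("".join(p)) for p in permutations(digits)}
    return reduce(gcd, values)

# Digits 5 and 7: swapping them changes the number by 18,
# so the result must divide 18.
print(permutation_gcd("57"))    # 3
print(permutation_gcd("1257"))  # 3
```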

Fastest algorithm to find factors of 2 and 5

I've read this question: Which is the fastest algorithm to find prime numbers?, but I'd like to do this only for the primes 2 and 5.
For example, the number 42000 is factorized as:
2^4 • 3^1 • 5^3 • 7^1
I'm only interested in finding the powers of 2 and 5: 4 and 3 in this example.
My naive approach is to successively divide by 2 while the remainder is 0, then successively divide by 5 while the remainder is 0.
The number of successful divisions (with zero remainder) are the powers of 2 and 5.
This involves performing (x + y + 2) divisions, where x is the power of 2 and y is the power of 5.
Is there a faster algorithm to find the powers of 2 and 5?
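For reference, the naive approach described in the question looks roughly like this in Python, using native integers rather than the base-10 strings mentioned later in the thread. The function name is an assumption made for the example.

```python
def naive_powers(n: int):
    """Return (x, y) such that 2^x and 5^y are the exact powers dividing n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    x = 0
    while n % 2 == 0:   # divide by 2 while the remainder is 0
        n //= 2
        x += 1
    y = 0
    while n % 5 == 0:   # then divide by 5 while the remainder is 0
        n //= 5
        y += 1
    return x, y

print(naive_powers(42000))  # (4, 3)
```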
Following the conversation, I do think your idea is the fastest way to go, with one exception:
Division (in most cases) is expensive. On the other hand, checking the last digit of the number is (usually?) faster, so I would check the last digit (0/5 and 0/2/4/6/8) before dividing.
I am basing this off this comment by the OP:
my library is written in PHP and the number is actually stored as a string in base 10. That's not the most efficient indeed, but this is what worked best within the technical limits of the language.
If you are committed to strings-in-php, then the following pseudo-code will speed things up compared to actual general-purpose repeated modulus and division:
while the string ends in 0, but is not 0
    chop a zero off the end,
    increment ctr2 and ctr5
switch repeatedly depending on the last digit:
    if it is a 5,
        divide it by 5
        increment ctr5
    if it is 2, 4, 6, 8,
        divide it by 2
        increment ctr2
    otherwise
        you have finished
This does not require any modulus operations, and you can implement divide-by-5 and divide-by-2 cheaper than a general-purpose long-number division.
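The pseudo-code above could be sketched in Python like this (the thread is about PHP, so treat it as an illustration only). `divide_small` and `powers_by_last_digit` are hypothetical helper names, and the division helper assumes the divisor divides the number exactly.

```python
def divide_small(digits: str, divisor: int) -> str:
    """Schoolbook division of a decimal string by a small integer (exact)."""
    out, carry = [], 0
    for ch in digits:
        carry = carry * 10 + int(ch)
        out.append(str(carry // divisor))
        carry %= divisor
    return "".join(out).lstrip("0") or "0"

def powers_by_last_digit(digits: str):
    """Return (ctr2, ctr5) for a positive integer given as a decimal string."""
    if digits.strip("0") == "":
        raise ValueError("input must be a positive integer")
    ctr2 = ctr5 = 0
    while digits.endswith("0"):      # each trailing zero is one 2 and one 5
        digits = digits[:-1]
        ctr2 += 1
        ctr5 += 1
    while True:
        last = digits[-1]
        if last == "5":
            digits = divide_small(digits, 5)
            ctr5 += 1
        elif last in "2468":
            digits = divide_small(digits, 2)
            ctr2 += 1
        else:
            return ctr2, ctr5

print(powers_by_last_digit("42000"))  # (4, 3)
```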
On the other hand, if you want performance, using string representations for unlimited-size integers is suicide. Use gmp (which has a php library) for your math, and convert to strings only when necessary.
edit:
you can gain extra efficiency (and simplify your operations) using the following pseudocode:
if the string is zero, terminate early
while the last non-zero character of the string is a '5',
    add the string to itself
    decrement ctr2
count the '0's at the end of the string into a ctr0
chop off ctr0 zeros from the string
ctr2 += ctr0
ctr5 += ctr0
while the last digit is 2, 4, 6, 8
    divide the string by 2
    increment ctr2
Chopping many 0s at once is better than looping. And mul2 beats div5 in terms of speed (it can be implemented by adding the number once).
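A Python sketch of this second variant, again only as an illustration. The helpers `double`, `halve` and `powers_by_doubling` are names invented here. Doubling turns every factor of 5 into a trailing zero, and the decrement of ctr2 cancels the factor of 2 that the doubling introduced.

```python
def double(digits: str) -> str:
    """Add a decimal string to itself."""
    out, carry = [], 0
    for ch in reversed(digits):
        carry, d = divmod(2 * int(ch) + carry, 10)
        out.append(str(d))
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

def halve(digits: str) -> str:
    """Divide an even decimal string by 2."""
    out, carry = [], 0
    for ch in digits:
        carry = carry * 10 + int(ch)
        out.append(str(carry // 2))
        carry %= 2
    return "".join(out).lstrip("0") or "0"

def powers_by_doubling(digits: str):
    digits = digits.lstrip("0")
    if not digits:
        raise ValueError("input must be a positive integer")
    ctr2 = ctr5 = 0
    while digits.rstrip("0")[-1] == "5":
        digits = double(digits)      # converts one factor 5 into a trailing 0
        ctr2 -= 1                    # undo the artificial factor of 2
    stripped = digits.rstrip("0")
    ctr0 = len(digits) - len(stripped)
    digits = stripped
    ctr2 += ctr0
    ctr5 += ctr0
    while digits[-1] in "2468":
        digits = halve(digits)
        ctr2 += 1
    return ctr2, ctr5

print(powers_by_doubling("42000"))  # (4, 3)
```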
If you have a billion digit number, you do not want to do divisions on it unless it's really necessary. If you don't have reason to believe that it is in the 1/2^1000 numbers divisible by 2^1000, then it makes sense to use much faster tests that only look at the last few digits. You can tell whether a number is divisible by 2 by looking at the last digit, whether it is divisible by 4 by looking at the last 2 digits, and by 2^n by looking at the last n digits. Similarly, you can tell whether a number is divisible by 5 by looking at the last digit, whether it is divisible by 25 by looking at the last 2 digits, and by 5^n by looking at the last n digits.
I suggest that you first count and remove the trailing 0s, then decide from the last digit whether you are testing for powers of 2 (last digit 2,4,6, or 8) or powers of 5 (last digit 5).
If you are testing for powers of 2, then take the last 2, 4, 8, 16, ... 2^i digits, and multiply this by 25, 625, ... 5^(2^i), counting the trailing 0s up to 2^i (but not beyond). If you get fewer than 2^i trailing 0s, then stop.
If you are testing for powers of 5, then take the last 2, 4, 8, 16, ... 2^i digits, and multiply this by 4, 16, ... 2^(2^i), counting the trailing 0s up to 2^i (but not beyond). If you get fewer than 2^i trailing 0s, then stop.
For example, suppose the number you are analyzing is 283,795,456. Multiply 56 by 25, you get 1400 which has 2 trailing 0s, continue. Multiply 5,456 by 625, you get 3,410,000, which has 4 trailing 0s, continue. Multiply 83,795,456 by 5^8=390,625, you get 32,732,600,000,000, which has 8 trailing 0s, continue. Multiply 283,795,456 by 5^16 to get 43,303,750,000,000,000,000 which has only 13 trailing 0s. That's less than 16, so stop, the power of 2 in the prime factorization is 2^13.
I hope that for larger multiplications you are implementing an n log n algorithm for multiplying n digit numbers, but even if you aren't, this technique should outperform anything involving division on typical large numbers.
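The power-of-2 half of this technique can be sketched in Python as below. This is a simplified illustration: it uses Python's built-in big integers for the multiplication, assumes trailing zeros have already been removed, and the name `power_of_two` is invented for the example.

```python
def power_of_two(digits: str) -> int:
    """Exponent of 2 in a decimal string whose last digit is 2, 4, 6 or 8."""
    if not digits or digits[-1] not in "2468":
        raise ValueError("expected a number ending in 2, 4, 6 or 8")
    width = 2
    while True:
        # Only the last `width` digits matter for divisibility by 2^width.
        suffix = int(digits[-width:])
        product = str(suffix * 5 ** width)
        zeros = len(product) - len(product.rstrip("0"))
        if zeros < width:
            return zeros
        width *= 2

print(power_of_two("283795456"))  # 13
```

Because the suffix is not divisible by 5, the product never has more than `width` trailing zeros, which matches the "up to 2^i (but not beyond)" rule in the answer.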
Let's look at the average-case time complexity of various algorithms, assuming that each n-digit number is equally likely.
Addition or subtraction of two n-digit numbers takes theta(n) steps.
Dividing an n-digit number by a small number like 5 takes theta(n) steps. Dividing by the base is O(1).
Dividing an n-digit number by another large number takes theta(n log n) steps using the FFT, or theta(n^2) by a naive algorithm. The same is true for multiplication.
The algorithm of repeatedly dividing a base 10 number by 2 has an average case time complexity of theta(n): It takes theta(n) time for the first division, and on average, you need to do only O(1) divisions.
Computing a large power of 2 with at least n digits takes theta(n log n) by repeated squaring, or theta(n^2) with simple multiplication. Performing Euclid's algorithm to compute the GCD takes an average of theta(n) steps. Although divisions take theta(n log n) time, most of the steps can be done as repeated subtractions and it takes only theta(n) time to do those. It takes O(n^2 log log n) to perform Euclid's algorithm this way. Other improvements might bring this down to theta(n^2).
Checking the last digit for divisibility by 2 or 5 before performing a more expensive calculation is good, but it only results in a constant factor improvement. Applying the original algorithm after this still takes theta(n) steps on average.
Checking the last d digits for divisibility by 2^d or 5^d takes O(d^2) time, O(d log d) with the FFT. It is very likely that we only need to do this when d is small. The fraction of n-digit numbers divisible by 2^d is 1/2^d. So, the average time spent on these checks is O(sum(d^2 / 2^d)) and that sum is bounded independent of n, so it takes theta(1) time on average. When you use the last digits to check for divisibility, you usually don't have to do any operations on close to n digits.
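The bound on that sum can be made explicit. As a worked check, assuming the O(d^2) cost per check stated above, differentiating the geometric series gives a closed form:

```latex
\sum_{d=1}^{\infty} d^{2} x^{d} = \frac{x(1+x)}{(1-x)^{3}} \quad (|x| < 1),
\qquad\text{so}\qquad
\sum_{d=1}^{\infty} \frac{d^{2}}{2^{d}}
  = \frac{\tfrac{1}{2}\cdot\tfrac{3}{2}}{\left(\tfrac{1}{2}\right)^{3}} = 6 .
```

So the expected cost of the checks is bounded by a constant that does not depend on n.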
It depends on whether you're starting with a native binary number or some bigint string.
Chopping off a very long chain of trailing zeros in a bigint string is a lot easier than trying to extract the powers of 2 and 5 separately, e.g. for 23456789 x 10^66:
23456789000000000000000000000000000000000000000000000000000000000000000000
This particular integer is 244 bits wide in total and needs a 177-bit-wide mantissa (178-bit precision minus 1 implicit bit) to be handled losslessly, so even newer data types such as uint128 won't suffice:
11010100011010101100101010010000110000101000100001000110100101
01011011111101001110100110111100001001010000110111110101101101
01001000011001110110010011010100001001101000010000110100000000
0000000000000000000000000000000000000000000000000000000000
The sequential approach is to spend 132 loop cycles in a bigint package to get them out:
129 63 2932098625
130 64 586419725
131 65 117283945
132 66 23456789
133 2^66 x
5^66 x
23456789
But once you realize there's a chain of 66 trailing zeros, the bigint package becomes fully optional, since the residual digits are less than 24.5 bits wide in total:
2^66
5^66
23456789
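A short Python sketch of this shortcut, assuming the number arrives as a decimal string. The function name is hypothetical. Each stripped zero contributes one factor of 2 and one factor of 5, and the small residual can then be handled with ordinary integer arithmetic.

```python
def split_trailing_zeros(digits: str):
    """Return (number of trailing zeros, remaining value as an int)."""
    core = digits.rstrip("0")
    if not core:
        raise ValueError("input must be a positive integer")
    return len(digits) - len(core), int(core)

zeros, residual = split_trailing_zeros("23456789" + "0" * 66)
print(zeros, residual)  # 66 23456789
```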
I think your algorithm will be the fastest. But I have a couple of suggestions.
One alternative is based on the greatest common divisor. Take the gcd of your input number with the smallest power of 2 greater than your input number; that will give you all the factors of 2. Divide by the gcd, then repeat the same operation with 5; that will give you all the factors of 5. Divide again by the gcd, and the remainder tells you if there are any other factors.
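The gcd idea can be sketched in Python as follows. The helper name `power_part` is an assumption, and building the bound by repeated multiplication is just one simple way to obtain a power of p that exceeds n.

```python
from math import gcd

def power_part(n: int, p: int) -> int:
    """Largest power of the prime p dividing n, via gcd with a power of p > n."""
    bound = p
    while bound <= n:
        bound *= p
    return gcd(n, bound)

n = 42000
twos = power_part(n, 2)          # 16  = 2^4
fives = power_part(n // twos, 5)  # 125 = 5^3
print(twos, fives, n // twos // fives)  # 16 125 21
```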
Another alternative is based on binary search. Split the binary representation of your input number in half; if the right half is 0, move left, otherwise move right. Once you have the factors of 2, divide, then apply the same algorithm on the remainder using powers of 5.
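One possible reading of the binary-search idea for the factors of 2, sketched in Python on native integers. The function name is made up; it repeatedly halves the bit window and keeps whichever half contains the lowest set bit.

```python
def trailing_zero_bits(n: int) -> int:
    """Number of factors of 2 in n, found by halving the bit window."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    count = 0
    width = n.bit_length()
    while width > 1:
        half = width // 2
        low = n & ((1 << half) - 1)   # right half of the current window
        if low == 0:
            n >>= half                # right half is all zeros: move left
            count += half
            width -= half
        else:
            n = low                   # lowest set bit is in the right half
            width = half
    return count

print(trailing_zero_bits(42000))  # 4
```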
I'll leave it to you to implement and time these algorithms. But my gut feeling is that repeated division will be hard to beat.
I just read your comment that your input number is stored in base 10. In that case, divide repeatedly by 10 as long as the remainder is 0; that gives factors of both 2 and 5. Then apply your algorithm on the reduced number.

Find the largest number smaller than n having the same digits as n

Given a number n, find the largest number smaller than n that has the same digits as n. E.g. for 231 the output will be 213.
You need to find the last digit that has a smaller digit to its right, and swap that digit with the largest of the smaller digits to its right. And then sort all of the digits to the right of the swapped number in descending order.
For example, given 74125, 4 is the last digit that has smaller digits to the right, and 2 is the largest of the smaller digits so the answer is found by swapping 4 and 2 to get 72145, and then sort all of the digits to the right of the 2 to get 72541.
Additional note: If there are multiple copies of the largest-of-the-smaller-digits-to-the-right digit, then swap with the leftmost copy of that digit. So for example, 74122267 becomes 72142267 before sorting, and 72764221 after sorting.
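The steps described above can be sketched in Python as follows. The function name is invented for the example; it returns None when no smaller arrangement exists, and it does not reject results that start with a zero.

```python
def largest_smaller(number: str):
    """Largest rearrangement of the digits of `number` that is smaller than it."""
    d = list(number)
    # Scan from the right for the last digit with a smaller digit to its right.
    for i in range(len(d) - 2, -1, -1):
        smaller = [j for j in range(i + 1, len(d)) if d[j] < d[i]]
        if smaller:
            # Largest of the smaller digits; max() keeps the leftmost copy.
            j = max(smaller, key=lambda idx: d[idx])
            d[i], d[j] = d[j], d[i]
            # Make the tail as large as possible.
            d[i + 1:] = sorted(d[i + 1:], reverse=True)
            return "".join(d)
    return None  # digits already in non-decreasing order

print(largest_smaller("231"))       # 213
print(largest_smaller("74125"))     # 72541
print(largest_smaller("74122267"))  # 72764221
```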
