Discrete logarithm of number of form (2ⁿ - 1) - algorithm

I need to find the smallest number k for which (2^n - 1)^k % M equals a given X.
The catch here is that n can be a very large number, with possibly 10,000 digits, and hence will be stored as a string. I know that this is a hard problem in general, but does the special form of the number imply any property that makes this easier in this case? M is not necessarily prime, but is within reasonable bounds of 10^8.

First, you can't store the value in a string, because 2^10000 digits is far more than the total number of particles in the universe (10^80 ≈ 2^265.75). You don't even have enough memory if you store it as bits (which is how bigint libraries store their numbers; no good library stores values as characters).
So what you can do is use modular exponentiation to get the residue. Basically, you use the property (a * b) % M = ((a % M) * (b % M)) % M to avoid calculating the real power value. Many languages already have built-in support for this; for example, Python's pow function takes an optional third argument, giving pow(base, exp[, mod]). The implementation is exactly like the normal pow: just replace power *= base with modpow = (modpow * base) % M. There are a lot of examples on SO:
Calculating (a^b)%MOD
Calculating pow(a,b) mod n
Calculate (a^b)%c where 0<=a,b,c<=10^18
You don't need to loop (2^n - 1)^k times. It's actually impossible: assuming you can loop 2^32 times a second, you'd need 2^32 seconds ≈ 136 years to loop 2^64 times. Imagine how many centuries it would take to count up to 2^10000. Luckily the result repeats after a cycle, so you just need to calculate the cycle length.
Those are the hints needed. You can also refer to "how to calculate a^(b^c) mod n?" and "finding a^b^c^... mod m", which are closer to your problem.
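A minimal Python sketch of these hints, under a few assumptions that are not in the question: k = 0 counts as a valid answer, X is compared modulo M, and a plain walk of at most M steps is acceptable (for M near 10^8 a faster discrete-log method would be needed in practice). The names pow2_mod and smallest_k are made up for illustration. The decimal string is consumed digit by digit, using the fact that appending a digit d turns n into 10*n + d.

```python
def pow2_mod(n_digits: str, m: int) -> int:
    """Return 2**n % m, where n is given as a decimal string."""
    result = 1 % m
    for ch in n_digits:
        # n -> 10*n + d, so 2**n -> (2**n)**10 * 2**d (all modulo m).
        result = pow(result, 10, m) * pow(2, int(ch), m) % m
    return result


def smallest_k(n_digits: str, x: int, m: int):
    """Smallest k >= 0 with (2**n - 1)**k % m == x % m, or None if none exists."""
    base = (pow2_mod(n_digits, m) - 1) % m
    target = x % m
    value = 1 % m  # base**0
    # The powers take at most m distinct values, so after m steps they have
    # started repeating and no new residue can appear.
    for k in range(m + 1):
        if value == target:
            return k
        value = value * base % m
    return None


print(smallest_k("3", 3, 10))  # powers of 7 mod 10 are 1, 7, 9, 3, ...
```

The walk is only a brute-force way of exploiting the cycle; it never touches numbers larger than M^2.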

Related

Is this number a power of two?

I have a number (in base 10) represented as a string with up to 10^6 digits. I want to check if this number is a power of two. One thing I can think of is binary search on exponents and using FFT and fast exponentiation algorithm, but it is quite long and complex to code. Let n denote the length of the input (i.e., the number of decimal digits in the input). What's the most efficient algorithm for solving this problem, as a function of n?
There are only three or four powers of 2 for any given size of a decimal number, and it is easy to guess what they are, since the size of the decimal number is a good approximation of its base 10 logarithm, and you can compute the base 2 logarithm by just multiplying by an appropriate constant (log2(10)). So a binary search would be inefficient and unnecessary.
Once you have a trial exponent, which will be on the order of three million, you can use the squaring exponentiation algorithm with about 22 bignum decimal multiplications. (And up to 21 doublings, but those are relatively easy.)
Depending on how often you do this check, you might want to invest in fast bignum code. But if it is infrequent, simple multiplication should be ok.
If you don't expect the numbers to be powers of 2, you could first do a quick computation mod 10^9 to see if the last 9 digits match. That will eliminate all but a tiny percentage of random numbers. Or, for an even faster but slightly weaker filter, using 64-bit arithmetic check that the last 20 digits are divisible by 2^20 and not by 10.
Here is an easy probabilistic solution.
Say your number is n, and we want to find k such that n = 2^k. Obviously, k = log2(n) = log10(n) * log2(10). We can estimate log10(n) ~ len(n) and find k' = len(n) * log2(10) with a small error (say, |k - k'| <= 5; I didn't check, but this should be enough). You'll probably need this part in any solution that comes to mind; it was mentioned in other answers as well.
Now let's check whether n = 2^k for some known k. Select a random prime number P between 2 and k^2 and compare n mod P with 2^k mod P. If the remainders are not equal, then k is definitely not a match. But what if they are equal? I claim that the false positive rate is bounded by 2 log(k)/k.
Why is that so? Because if n = 2^k (mod P), then P divides D = n - 2^k. The number D has length about k (because n and 2^k have similar magnitude due to the first part) and thus cannot have more than k distinct prime divisors. There are around k^2 / log(k^2) primes less than k^2, so the probability that you've picked a prime divisor of D at random is less than k / (k^2 / log(k^2)) = 2 log(k) / k.
In practice, primes up to 10^9 (or even up to log(n)) should suffice, but you have to do a bit deeper analysis to prove the probability.
This solution does not require any long arithmetic at all; all calculations can be done in 64-bit integers.
P.S. In order to select a random prime from 1 to T you may use the following logic: select a random number from 1 to T and increment it by one until it is prime. In this case the distribution on primes is not uniform and the former analysis is not completely correct, but it can be adapted to such kind of random as well.
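A rough Python sketch of this probabilistic check, assuming the input is a non-negative decimal string. All function names are invented for illustration, the number of rounds (5) and the floor of 1000 on the prime range are arbitrary choices, and the primality test is plain trial division, which is slow for the largest inputs. A true power of two is never rejected; a non-power can slip through only with small probability.

```python
import math
import random


def is_prime(p: int) -> bool:
    """Trial division; adequate for a sketch, slow for very large p."""
    if p < 2:
        return False
    if p % 2 == 0:
        return p == 2
    d = 3
    while d * d <= p:
        if p % d == 0:
            return False
        d += 2
    return True


def random_prime(limit: int) -> int:
    """Pick a random number in [2, limit] and step upward to the next prime."""
    p = random.randint(2, limit)
    while not is_prime(p):
        p += 1
    return p


def decimal_mod(digits: str, p: int) -> int:
    """Remainder of a decimal string modulo p, using only small integers."""
    r = 0
    for ch in digits:
        r = (r * 10 + int(ch)) % p
    return r


def probably_power_of_two(digits: str, rounds: int = 5) -> bool:
    digits = digits.lstrip("0")
    if not digits:
        return False
    length = len(digits)
    log2_10 = math.log2(10)
    # 10**(length-1) <= n < 10**length brackets k; widen by 1 for rounding.
    lo = max(0, math.floor((length - 1) * log2_10) - 1)
    hi = math.ceil(length * log2_10) + 1
    for k in range(lo, hi + 1):
        limit = max(k * k, 1000)
        primes = (random_prime(limit) for _ in range(rounds))
        if all(decimal_mod(digits, p) == pow(2, k, p) for p in primes):
            return True
    return False
```

Using several independent primes per candidate exponent multiplies the small false positive rates together.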
I am not sure how easy it is to apply, but I would do it the following way:
1) Write the number in binary. If the number is a power of two, it looks like:
1000000....
with exactly one 1 and the rest 0s. Checking this is easy. The question is how the number is stored. For example, it could have leading zeroes that make the search for the 1 harder:
...000010000....
If there are only a small number of leading zeroes, just search from left to right. If the number of zeroes is unknown, we will have to...
2) Binary search for the 1:
2a) Cut the bit string in the middle.
2b) If both halves or neither half is zero (hopefully you can check whether a number is zero in reasonable time), stop and return false (false = not a power of 2).
Otherwise continue with the non-zero half.
Stop and return true when the non-zero part equals 1.
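Estimation: if the number has n decimal digits, then it has about n * log2(10) ≈ 3.32 n binary digits.
The binary search takes O(log t) halving steps, where t ≈ 3.32 n is the number of bits. Each zero test scans the current part, and the parts halve at every step, so the scanning costs O(t) = O(n) in total. Therefore the algorithm should take O(n).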
Assumptions:
1) You can access the binary view of the number.
2) You can compare a number to zero in reasonable time.

For hashing, what happens to number of empty slots when n tends to infinity?

Under the universal hashing assumption, if I have a hash table of size m = cn, c > 0, and n tends to infinity, what does the number of empty slots tend to?
I'm a bit stuck on how to do this because m is a function of n. The answer I get (for different values of m too) always tends to infinity, and I'm not sure whether that is accurate.
Take a specific cell i. The probability that all keys missed it is
((m - 1) / m)^n = (1 - 1/m)^n = (1 - 1/m)^(m/c) ~ e^(-1/c).
(For the last approximation, see representations of e.)
The event that all keys missed some other specific cell j is not independent, but by linearity of expectation, that doesn't matter. The expected number of empty bins will be m multiplied by the previous expression.
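As a quick numeric sanity check (a sketch only; the function names are made up, and m is taken as round(c * n)), the exact expectation m * (1 - 1/m)^n can be compared with the estimate c * n * e^(-1/c):

```python
import math


def expected_empty_slots(n: int, c: float) -> float:
    """Exact expected number of empty slots for n keys in m = round(c * n) slots."""
    m = round(c * n)
    return m * (1 - 1 / m) ** n


def limiting_estimate(n: int, c: float) -> float:
    """The approximation m * e**(-1/c) with m = c * n."""
    return c * n * math.exp(-1 / c)


for n in (10, 1000, 10**6):
    print(n, expected_empty_slots(n, 2), limiting_estimate(n, 2))
```

Both columns grow linearly in n, so the expected count itself is unbounded, while the fraction of empty slots settles at e^(-1/c).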

How to count number of divisible terms without using modulus operator?

Given three numbers N, A and B, find how many integers in the range from 1 to N are divisible by A or B. I can't apply the modulus operator to every number from 1 to N, because N can be as large as 10^12 and the program would run out of its allotted time before producing an output. I have tried formulating an equation but couldn't come up with a solution.
Input constraints:
1<=N<=10^12
1<=A<=10^5
1<=B<=10^5
I just want to use some equation to evaluate this rather than a modulus operator, because the program needs to produce results within 1 sec. I have tried this:
counter=(((int)(N/A))+((int)(N/B)))-((int)(N/(A*B)));
but it fails for input N=200 A=20 B=8.
You are already on the right track; your formula indicates you are trying to apply the inclusion-exclusion principle.
(int) (N/A) and (int) (N/B) correspond to the counts of integers ≤ N that are divisible by A and by B, respectively.
However, (int) (N/(A*B)) does not give you the correct count of integers ≤ N that are divisible by both A and B.
In fact, you should replace (int) (N/(A*B)) by (int) (N/lcm(A,B)) in order to get the correct result. Here lcm(A, B) returns the least common multiple of A and B.
To implement the lcm(A, B) function, you can simply use the following formula:
lcm(A, B) = A * B / gcd(A, B);
where gcd(A, B) returns the greatest common divisor of A and B. It can be computed efficiently with the Euclidean algorithm, which does use the modulus operator, but only a few times (on the order of max {log(A), log(B)}), so there should not be any performance issue for you.
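A minimal sketch of the corrected formula in Python (the function name count_divisible is made up for illustration). Dividing by the gcd before multiplying keeps the intermediate value small, which matters in fixed-width languages even though Python integers do not overflow.

```python
from math import gcd


def count_divisible(n: int, a: int, b: int) -> int:
    """Count integers in 1..n divisible by a or b, by inclusion-exclusion."""
    lcm = a // gcd(a, b) * b
    return n // a + n // b - n // lcm


print(count_divisible(200, 20, 8))  # 10 + 25 - 5 = 30
```

For the failing input from the question, lcm(20, 8) = 40, whereas A*B = 160 undercounts the overlap.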

Determining whether a system of congruences has a solution

Having a system of linear congruences, I'd like to determine if it has a solution. Using simple algorithms that solve such systems is impossible, as the answer may grow exponentially.
One hypothesis I have is that if a system of congruences has no solution, then there are two of them that contradict each other. I have no idea if this holds, if it did that would lead to an easy O(n^2 log n) algo, as checking if a pair of congruences has a solution requires O(log n) time. Nevertheless for this problem I'd rather see something closer to O(n).
We may assume that no modulus exceeds 10^6; in particular, we can quickly factor them all to begin with. We may even further assume that the sum of all moduli doesn't exceed 10^6 (but still, their product can be huge).
As you suspect, there's a fairly simple way to determine whether the set of congruences has a solution without actually needing to build that solution. You need to:
1. Reduce each congruence to the form x = a (mod n) if necessary; from the comments, it sounds as though you already have this.
2. Factorize each modulus n as a product of prime powers: n = p1^e1 * p2^e2 * ... * pk^ek.
3. Replace each congruence x = a (mod n) with a collection of congruences x = a (mod pi^ei), one for each of the k prime powers you found in step 2.
And now, by the Chinese Remainder Theorem it's enough to check compatibility for each prime independently: given any two congruences x = a (mod p^e) and x = b (mod p^f), they're compatible if and only if a = b (mod p^(min(e, f))). Having determined compatibility, you can throw out the congruence with smaller modulus without losing any information.
With the right data structures, you can do all this in a single pass through your congruences: for each prime p encountered, you'll need to keep track of the biggest exponent e found so far, together with the corresponding right-hand side (reduced modulo p^e for convenience). The running time will likely be dominated by the modulus factorizations, though if no modulus exceeds 10^6, then you can make that step very fast, too, by prebuilding a mapping from each integer in the range 1 .. 10^6 to its smallest prime factor.
EDIT: And since this is supposed to be a programming site, here's some (Python 3) code to illustrate the above. (For Python 2, replace the range call with xrange for better efficiency.)
def prime_power_factorisation(n):
    """Brain-dead factorisation routine, for illustration purposes only."""
    # DO NOT USE FOR LARGE n!
    while n > 1:
        p, pe = next(d for d in range(2, n+1) if n % d == 0), 1
        while n % p == 0:
            n, pe = n // p, pe*p
        yield p, pe

def compatible(old_ppc, new_ppc):
    """Determine whether two prime power congruences (with the same
    prime) are compatible."""
    m, a = old_ppc
    n, b = new_ppc
    return (a - b) % min(m, n) == 0

def are_congruences_solvable(moduli, right_hand_sides):
    """Determine whether the given congruences have a common solution."""
    # prime_power_congruences is a dictionary mapping each prime encountered
    # so far to a pair (prime power modulus, right-hand side).
    prime_power_congruences = {}
    for m, a in zip(moduli, right_hand_sides):
        for p, pe in prime_power_factorisation(m):
            # new prime-power congruence: modulus, rhs
            new_ppc = pe, a % pe
            if p in prime_power_congruences:
                old_ppc = prime_power_congruences[p]
                if not compatible(new_ppc, old_ppc):
                    return False
                # Keep the one with bigger exponent.
                prime_power_congruences[p] = max(new_ppc, old_ppc)
            else:
                prime_power_congruences[p] = new_ppc
    # If we got this far, there are no incompatibilities, and
    # the congruences have a mutual solution.
    return True
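The smallest-prime-factor table mentioned earlier could stand in for the brain-dead factorisation when no modulus exceeds 10^6. The following is only a sketch: the names smallest_prime_factors and prime_power_factorisation_fast are invented here, and the generator yields the same (prime, prime power) pairs as the routine above.

```python
def smallest_prime_factors(limit):
    """Sieve: spf[i] is the smallest prime factor of i, for 2 <= i <= limit."""
    spf = list(range(limit + 1))
    i = 2
    while i * i <= limit:
        if spf[i] == i:  # i is prime
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:
                    spf[j] = i
        i += 1
    return spf


def prime_power_factorisation_fast(n, spf):
    """Yield (p, p**e) for each prime power exactly dividing n (n <= limit)."""
    while n > 1:
        p, pe = spf[n], 1
        while n % p == 0:
            n, pe = n // p, pe * p
        yield p, pe


spf = smallest_prime_factors(1000)
print(list(prime_power_factorisation_fast(360, spf)))  # [(2, 8), (3, 9), (5, 5)]
```

The table is built once; each factorisation then takes time proportional to the number of prime factors of the modulus.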
One final note: in the above, we made use of the fact that the moduli were small, so that computing prime power factorisations wasn't a big deal. But if you do need to do this for much larger moduli (hundreds or thousands of digits), it's still feasible. You can skip the factorisation step, and instead find a "coprime base" for the collection of moduli: that is, a collection of pairwise relatively prime positive integers such that each modulus appearing in your congruences can be expressed as a product (possibly with repetitions) of elements of that collection. Now proceed as above, but with reference to that coprime base instead of the set of primes and prime powers. See this article by Daniel Bernstein for an efficient way to compute a coprime base for a set of positive integers. You'd likely end up making two passes through your list: one to compute the coprime base, and a second to check the consistency.

Random number in range 0 to n

Given a function R which produces true random 32 bit numbers, I would like a function that returns random integers in the range 0 to n, where n is arbitrary (less than 2^32).
The function must produce all values 0 to n with equal probability.
I would like a function that executes in constant time with no if statements or loops, so something like the Java Random.nextInt(n) function is out.
I suspect that a simple modulus will not do the job unless n is a power of 2 -- am I right?
I have accepted Jason's answer, despite it requiring a loop of undetermined duration, since it appears to be the best method to use in practice and essentially answers my question. However I am still interested in any algorithms (even if less efficient) which would be deterministic in nature and be guaranteed to terminate, such as Mark Byers has pointed to.
Without discarding some of the values from the source, you can not do this. For example, a set of size 2^32 can not be partitioned into three equally sized sets. Therefore, it is impossible to do this without discarding some of the values and iterating until a non-discarded value is produced.
So, just use this (pseudocode):
rng is a random number generator that produces uniform integers from [0, max)
compute m = max modulo (n + 1)
do {
    draw a random number r from rng
} while (r >= max - m)
return r modulo (n + 1)
Effectively I am throwing out the top part of the distribution that causes problems. If rng is uniform on [0, max), then this algorithm will be uniform on [0, n]
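The pseudocode translates to Python roughly as follows. This is a sketch: random.getrandbits(32) merely stands in for the asker's R, and the source parameter is a hypothetical hook so that a different 32-bit generator can be plugged in.

```python
import random


def uniform_0_to_n(n, source=lambda: random.getrandbits(32)):
    """Return an integer uniform on [0, n], given a uniform 32-bit source."""
    max_value = 1 << 32
    m = max_value % (n + 1)  # size of the leftover top slice
    while True:
        r = source()
        if r < max_value - m:  # keep r only if it lies below the top slice
            return r % (n + 1)


print(uniform_0_to_n(5))
```

Fewer than half of the source values are ever rejected, so the expected number of draws is below two.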
What you're asking for is impossible. You can't partition 2**32 numbers into three sets of exactly equal size.
If you want to guarantee an absolutely perfect uniform distribution in 0 <= x < n, where n is not a power of 2, then you have to be prepared to call R a potentially unbounded number of times. In reality you will typically need only one or two calls, but the code has to be able, in theory, to call R any number of times; otherwise it can't be completely uniform.
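The partition argument can be made concrete by counting how many source values map to each residue under a plain modulus. A small sketch (the helper name residue_counts is made up):

```python
def residue_counts(source_size, n):
    """How many of the values 0..source_size-1 reduce to each residue mod (n + 1)."""
    q, rem = divmod(source_size, n + 1)
    # The first `rem` residues receive one extra source value.
    return [q + 1 if r < rem else q for r in range(n + 1)]


print(residue_counts(8, 2))      # a 3-bit source: [3, 3, 2]
print(residue_counts(2**32, 2))  # residue 0 gets one more value than 1 and 2
```

The counts are all equal only when n + 1 divides the size of the source, which for a 32-bit source means n + 1 is a power of two.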
I don't understand why modulus wouldn't do what you want? Since R is a function that produces true random 32 bit numbers, that means that each number has the same probability to be produced, right? So, if you use a modulus n:
randomNumber = R() % (n + 1) //EDITED: n+1 to return values from 0-n
then each number from 0 to n has the same probability!
You can generate two 32-bit numbers and put them together to form a 64-bit number. If you do not discard any values and need a result of no more than 32 bits, the worst-case bias is a factor of 0.99999999976716936, meaning that some numbers are less likely than others by that factor.
If you still want to remove this small bias, you will have a low ratio of "out of range" hits, and only rarely more than one discard.
Depending upon your problem/use of the random numbers, maybe you could pre-allocate your random numbers using a slow method and put them into a simple array.
Then getNextRnd() can just return the next in the array.
Quick, fixed time call, no branches, just wasting memory (which is usually pretty cheap) and process initialization time.

Resources