Integer factorization and cryptography - public-key encryption

I know that public-key cryptography uses prime numbers.
I also know that two large (e.g. 100-digit) prime numbers (P, Q) are used as the private key,
the product is the public key N = P * Q,
and prime numbers are used because the factorization of N to obtain P, Q is very difficult and takes much time.
I'm OK with that, but I'm puzzled:
why not just use any ordinary large non-prime numbers for P, Q?
The factorization of N would still be difficult,
because now there would be not only 2 factors possible, but even more.
Thanks....

I am not a crypto expert.
why not just use any ordinary large
non-prime numbers for P , Q
Because there would be more factors. Integer factorization is an attack against public/private key encryption, and it exploits exactly this relation.
With more common factors, one could more easily find the relation and the possible values. It boils down to algebra.
N = P * Q
if P and Q are both prime, then N has exactly 4 divisors: {N, P, Q, 1}
However!
if P and Q both share a factor of 2, then
N / 4 = (P / 2) * (Q / 2)
If N could have been anywhere in 0..2^4096, it is now effectively in 0..2^4094, and since 2 was a factor, another large number was also a factor.
This means that I could find P' = P/2 and Q' = Q/2 such that P' < P and Q' < Q.
I don't fully understand the concept myself, but I believe this shows where I'm going with this:
you end up having to search a smaller space when brute-forcing the key.

It is perfectly possible to use RSA with a modulus N that is composed of more than the two prime factors P and Q, but two things have to be noted:
You must know the exact value of all of these factors, or else you will be unable to derive the private key from the public key during key generation. The equation for two-prime RSA is 1 = D*E (mod lcm(P-1, Q-1)). If you know these prime factors, you can perform the calculation; if you don't, you can't, which, by the way, is why it is safe to make the public key (E, N) public - you can't derive the private key D from the information that is easily derived from the public key, unless you are able to factor N.
The security of RSA is effectively bounded by the magnitude of the second largest prime factor of the RSA modulus N. Finding small prime factors, say those less than 2^32, takes a fraction of a second on a modern computer: simply try to divide the modulus N by each such prime and check whether the residue is zero (meaning N is divisible by that prime) or not (meaning that prime is not a factor of N). If N is composed of only such small prime factors times a single large prime factor Q, it is trivial to find that Q as well: divide N by all the small factors to get N' and test N' for primality. If N' is prime, it is the last prime factor Q.
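To illustrate that small-factor stripping concretely, here is a minimal sketch (the bound, the helper name and the toy numbers are mine, for illustration only):

def strip_small_factors(n, bound=2**16):
    """Divide out every factor below `bound` by trial division; returns
    (list of small prime factors, remaining cofactor)."""
    small = []
    for d in range(2, bound):
        while n % d == 0:
            small.append(d)
            n //= d
    return small, n

# A toy "modulus" made of small primes times one larger prime:
n = 3 * 5 * 17 * 1000003
small, rest = strip_small_factors(n)
assert small == [3, 5, 17] and rest == 1000003   # 1000003 is prime

Once the small primes are divided out, a primality test on the remaining cofactor finishes the job, exactly as described above.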

I'm not really an expert in cryptology (so if I'm wrong, just tell me in a comment and I'll promptly delete this answer), but I think it's because if you just use random large numbers, you may easily get ones that are easy to factor (i.e. you don't have to go up to really large primes to find their prime factors). So only really big, guaranteed-prime numbers are used.

Related

Is this number a power of two?

I have a number (in base 10) represented as a string with up to 10^6 digits. I want to check whether this number is a power of two. One approach I can think of is binary search on the exponent, using FFT-based multiplication and fast exponentiation, but that is quite long and complex to code. Let n denote the length of the input (i.e., the number of decimal digits). What's the most efficient algorithm for solving this problem, as a function of n?
There are either three or four powers of 2 for any given number of decimal digits, and it is easy to guess what they are, since the digit count is a good approximation of the number's base-10 logarithm, and you can convert that to a base-2 logarithm by multiplying by an appropriate constant (log2(10)). So a binary search would be inefficient and unnecessary.
Once you have a trial exponent, which will be on the order of three million, you can use the squaring exponentiation algorithm with about 22 bignum decimal multiplications. (And up to 21 doublings, but those are relatively easy.)
Depending on how often you do this check, you might want to invest in fast bignum code. But if it is infrequent, simple multiplication should be ok.
If you don't expect the numbers to be powers of 2, you could first do a quick computation mod 10^9 to see if the last 9 digits match. That will eliminate all but a tiny percentage of random numbers. Or, for an even faster but slightly weaker filter, use 64-bit arithmetic to check that the last 20 digits are divisible by 2^20 and not by 10.
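A rough sketch of that filter (the toy input and the +-5 exponent window are my own choices):

from math import log2

n_str = str(1 << 100)                 # stand-in for the million-digit input
k_est = round(len(n_str) * log2(10))  # trial exponent from the digit count
last9 = int(n_str[-9:])

# Cheap mod-10^9 filter; only the survivors need the full bignum check.
plausible = [k for k in range(k_est - 5, k_est + 6)
             if pow(2, k, 10**9) == last9]
print(plausible)                      # -> [100]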
Here is an easy probabilistic solution.
Say your number is n, and we want to find k such that n = 2^k. Obviously, k = log2(n) = log10(n) * log2(10). We can estimate log10(n) ~ len(n) and find k' = len(n) * log2(10) with a small error (say, |k - k'| <= 5; I didn't check, but this should be enough). You'll probably need this part in any solution that comes to mind; it was mentioned in other answers as well.
Now let's check whether n = 2^k for some known k. Select a random prime number P from 2 to k^2 and compare n mod P with 2^k mod P. If the remainders are not equal, then k is definitely not a match. But what if they are equal? I claim that the false positive rate is bounded by 2 log(k) / k.
Why is that so? Because if n = 2^k (mod P), then P divides D = n - 2^k. The number D has length about k bits (because n and 2^k have similar magnitude due to the first part) and thus cannot have more than k distinct prime divisors. There are around k^2 / log(k^2) primes less than k^2, so the probability that you've picked a prime divisor of D at random is less than k / (k^2 / log(k^2)) = 2 log(k) / k.
In practice, primes up to 10^9 (or even up to log(n)) should suffice, but you would have to do a somewhat deeper analysis to prove the bound.
This solution does not require any long arithmetic at all; all calculations can be done in 64-bit integers.
P.S. To select a random prime from 1 to T, you may use the following logic: select a random number from 1 to T and increment it by one until it is prime. In this case the distribution over primes is not uniform and the earlier analysis is not completely correct, but it can be adapted to this kind of randomness as well.
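A compact sketch of this scheme (the exponent window, the prime range cap and the trial count are my arbitrary choices):

import random
from math import log2

def random_prime(lo, hi):
    """Pick a random number and step up to the next prime (see the P.S. above)."""
    x = random.randint(lo, hi)
    while any(x % d == 0 for d in range(2, int(x ** 0.5) + 1)):
        x += 1
    return x

def one_trial(n_str, k):
    p = random_prime(2, max(k * k, 100))
    n_mod = 0
    for c in n_str:                        # n mod p, computed digit by digit:
        n_mod = (n_mod * 10 + int(c)) % p  # no long arithmetic needed
    return n_mod == pow(2, k, p)

def looks_like_power_of_two(n_str, trials=20):
    k_est = round(len(n_str) * log2(10))
    for k in range(max(k_est - 5, 1), k_est + 6):
        if all(one_trial(n_str, k) for _ in range(trials)):
            return True                    # n = 2^k with high probability
    return False

print(looks_like_power_of_two(str(2 ** 1000)))      # True
print(looks_like_power_of_two(str(2 ** 1000 + 6)))  # almost surely False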
I am not sure if it's easy to apply, but I would do it in the following way:
1) Show the number in binary. If the number is a power of two, it looks like:
1000000....
with only one 1 and the rest 0s. Checking this would be easy. Now the question is how the number is stored; for example, it could have leading zeroes that make the search for the 1 harder:
...000010000....
If there are only a small number of leading zeroes, just search from left to right. If the number of zeroes is unknown, we will have to...
2) Binary search for the 1:
2a) Cut the number in the middle.
2b) If both halves are zero, or neither is (hopefully you can check whether a part is zero in reasonable time), stop and return false (false = not a power of 2);
otherwise continue with the non-zero part.
Stop when the non-zero part equals 1 and return true.
Estimation: if the number has n decimal digits, it has about n * log2(10) ~ 3.3n binary digits.
The binary search halves the part still to be checked at every step, and each zero check is linear in the length of that part, so the total work is about 3.3n + 3.3n/2 + 3.3n/4 + ... = O(n). Therefore the algorithm should take O(n).
Assumptions:
1) you can access the binary view of the number;
2) you can compare a part of the number to zero in reasonable time.
(A small sketch of step 2 follows below.)
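A small sketch of step 2, assuming the binary view is available as a Python string of '0'/'1' characters (the zero test here is a naive stand-in):

def is_power_of_two_bits(bits):
    """Binary-search for the single 1 bit in a binary string that may
    carry leading zeroes."""
    while len(bits) > 1:
        mid = len(bits) // 2
        left, right = bits[:mid], bits[mid:]
        left_zero = '1' not in left     # naive stand-in for a fast zero test
        right_zero = '1' not in right
        if left_zero == right_zero:     # both zero, or both non-zero
            return False
        bits = right if left_zero else left
    return bits == '1'

assert is_power_of_two_bits('000010000')
assert not is_power_of_two_bits('000110000')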

What is the efficiency of dividing N positive integers by a given power of 2?

For example:
Let's say N = 128.
We want to divide each positive integer up to and including N, by say, 8.
So we would perform integer division for:
1/8
2/8
3/8
...
127/8
128/8
In looking this up, I see that bit-shift operations are the way to go and that any good compiler will automatically compile it that way in the first place. But nonetheless, I can't seem to find any big-O bound for this type of algorithm.
To sum up: given a positive integer N, and a number Y which is a power of 2, what is the efficiency of an algorithm which divides each of the numbers 1,2,3,...,N by Y?
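For what it's worth, a sketch of the shift-based loop (the values come from the question; the variable names are mine):

N = 128
Y = 8                       # the divisor, a power of two
shift = Y.bit_length() - 1  # log2(Y), i.e. 3 for Y = 8

# Each shift is a constant-time word operation, so dividing all of
# 1..N costs O(N) operations overall.
quotients = [i >> shift for i in range(1, N + 1)]
assert quotients[7] == 1 and quotients[127] == 16   # 8/8 and 128/8

So, assuming each number fits in a machine word, the answer is O(N): one constant-time shift per number.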

Find prime factors such that difference is smallest as possible

Suppose n, a, b are positive integers, where n is not a prime number, such that n = ab with a ≥ b and (a − b) as small as possible. What would be the best algorithm to find the values of a and b if n is given?
I read about a solution where they try to represent n as the difference of two squares, by searching for a square S bigger than n such that S - n = (another square). Why would that be better than simply finding the prime factors of n and searching for the combination where a, b are factors of n and a - b is minimized?
Firstly....to answer why your approach
simply finding the prime factors of n and searching for the combination where a,b are factors of n and a - b is minimized
is not optimal:
Suppose your number is n = 2^7 * 3^4 * 5^2 * 7 * 11 * 13 (= 259459200), well within the range of int. From combinatorics, this number has exactly 8 * 5 * 3 * 2 * 2 * 2 = 960 factors. So, first you find all of these 960 factors, then find all pairs (a, b) such that a * b = n, which in this case can be done in (6C1 + 9C2 + 11C3 + 13C4 + 14C5 + 15C6 + 16C7 + 16C8) ways (if I'm not wrong; my combinatorics is a bit weak). This is of the order 1e5 if implemented optimally. Also, this approach is hard to implement.
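The divisor count itself is easy to sanity-check (a quick snippet of mine):

from math import prod

# d(n) = (a1 + 1)(a2 + 1)... for n = p1^a1 * p2^a2 * ...
exponents = [7, 4, 2, 1, 1, 1]   # n = 2^7 * 3^4 * 5^2 * 7 * 11 * 13
assert prod(e + 1 for e in exponents) == 960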
Now, why the difference of squares approach
represent S - n = Q, such that S and Q are perfect squares
is good:
This is because if you can represent S - n = Q, this implies n = S - Q
=> n = s^2 - q^2
=> n = (s + q)(s - q)
=> so a = s + q, b = s - q, and your required answer a - b = 2 * q
Now, even if you iterate over all squares, you will either find your answer or terminate when the difference of 2 consecutive squares is greater than n.
But I don't think this will be doable for all n (e.g. if n = 6, there is no solution for (S, Q)).
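For the cases where a representation does exist (e.g. odd n), a minimal sketch of the square search (my own illustration):

from math import isqrt

def fermat_pair(n):
    """Search for s with s^2 - n a perfect square q^2, giving
    n = (s + q)(s - q). Assumes a representation exists."""
    s = isqrt(n)
    if s * s < n:
        s += 1
    while True:
        q = isqrt(s * s - n)
        if q * q == s * s - n:
            return s + q, s - q     # (a, b) with a - b = 2q minimal
        s += 1

assert fermat_pair(8051) == (97, 83)   # 8051 = 83 * 97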
Another approach:
Iterate from floor(sqrt(n)) down to 1. The first number x such that x | n will be one of the numbers in the required pair (a, b). The other will obviously be y = n / x. So, your answer will be y - x.
This is an O(sqrt(n)) time algorithm.
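A sketch of this approach (mine, for illustration):

from math import isqrt

def closest_pair(n):
    """Iterate down from floor(sqrt(n)); the first divisor x found gives
    the pair (n // x, x) with the smallest possible difference."""
    for x in range(isqrt(n), 0, -1):
        if n % x == 0:
            return n // x, x

assert closest_pair(12) == (4, 3)
assert closest_pair(100) == (10, 10)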
A general method could be this:
Find the prime factorization of your number: n = Π p_i^a_i. Except for the worst cases where n is prime or semiprime, this will be substantially faster than the O(n^(1/2)) time of the iteration down from the square root, which won't divide the found factors out of the number.
Recall that the simplest prime factorization method, trial division, is done by repeatedly trying to divide the number by increasing odd numbers (or by primes) below the number's square root, dividing each factor -- thus prime by construction -- out of the number as it is found (n := n/f).
Then, lazily enumerate the factors of n in order from its prime factorization. Stop after producing half of them. Having thus found n's (not necessarily prime) factor that is closest to its square root, find the second factor by simple division.
In case this must be run many times, it will greatly pay off to precalculate the needed primes below the n's square root, to use in the factorizations.
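A sketch of this method (trial division only, so not meant for cryptographic sizes; it also builds the whole divisor set rather than enumerating lazily, which is fine at this scale):

from math import isqrt

def prime_factors(n):
    """Trial division, dividing each found factor out of n (n := n/f)."""
    factors, f = [], 2
    while f * f <= n:
        while n % f == 0:
            factors.append(f)
            n //= f
        f += 1 if f == 2 else 2     # after 2, try odd numbers only
    if n > 1:
        factors.append(n)           # what remains is prime
    return factors

def closest_divisor_pair(n):
    """Build the divisors from the prime factorization, then take the
    one closest to sqrt(n) from below."""
    divisors = {1}
    for p in prime_factors(n):
        divisors |= {d * p for d in divisors}
    b = max(d for d in divisors if d * d <= n)
    return n // b, b

a, b = closest_divisor_pair(259459200)   # the example from the other answer
assert a * b == 259459200 and a >= b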

Quick idea for a shuffling algorithm - would this work?

I was discussing with my brothers a quick-and-dirty algorithm for shuffling a deck of cards (i.e. an array in which every element is unique). Description of the algorithm:
Let the number of cards in the deck be n. Take a number x such that gcd(n, x) = 1. Now iteratively pick card number (x*i) mod n, for i = 1 up to i = n, and put it in a new pile of cards (without removing the card from the original deck, that is, making a copy of the card). The new pile of cards will be our result.
It seems clear to me that only performing this algorithm once will not give a result that is "random enough" (in the sense that it would fail statistical tests for determining randomness). But what if we perform the algorithm iteratively, possibly for a new value of x that also fulfills gcd(n,x)=1? If doing this a sufficient number of times would give us a "random enough" result, how many times could we expect to need to do this as a function of n?
Doing it multiple times would still be insufficient, due to the wonders of modular arithmetic. In fact, there are at most n permutations you could ever achieve this way, with that maximum reached precisely when n is prime or 1.
Suppose you were to do this twice, with x relatively prime to n the first time and y relatively prime to n the second time.
The first time, an element at position p is moved to p * x (mod n). The second time, it is moved to (p * x) * y (mod n). This is the same as moving it to p * (x * y) (mod n), because modular multiplication is associative. But if x * y = v (mod n), then it's the same as moving it to p * v (mod n) -- and as you know, there are no more than n equivalence classes.
Hence, at most n permutations could result for an n-length deck. (No, this isn't a rigorous proof!)
Edit:
I had claimed that using modular exponentiation instead would be superior. However, after additional consideration, many trivial configurations would still fall prey to modular arithmetic in the same "at most n permutations" way.
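The collapse of two passes into one is easy to see numerically (a quick check of mine; the x, y values are arbitrary numbers coprime to n):

from math import gcd

def mult_shuffle(deck, x):
    """One pass of the questioner's shuffle, with 0-based indices:
    card i of the result is deck[(x * i) % n]."""
    n = len(deck)
    assert gcd(n, x) == 1
    return [deck[(x * i) % n] for i in range(n)]

deck = list(range(10))
x, y = 3, 9                  # both coprime to 10
twice = mult_shuffle(mult_shuffle(deck, x), y)
once = mult_shuffle(deck, (x * y) % 10)
assert twice == once         # two passes collapse into a single pass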

Determining whether a system of congruences has a solution

Given a system of linear congruences, I'd like to determine whether it has a solution. Using simple algorithms that actually solve such systems is infeasible, as the solution may grow exponentially large.
One hypothesis I have is that if a system of congruences has no solution, then two of them must contradict each other. I have no idea whether this holds, but if it did, it would lead to an easy O(n^2 log n) algorithm, as checking whether a pair of congruences has a solution takes O(log n) time. Nevertheless, for this problem I'd rather see something closer to O(n).
We may assume that no modulus exceeds 10^6; in particular, we can quickly factor them all to begin with. We may even further assume that the sum of all moduli doesn't exceed 10^6 (but their product can still be huge).
As you suspect, there's a fairly simple way to determine whether the set of congruences has a solution without actually needing to build that solution. You need to:
Reduce each congruence to the form x = a (mod n) if necessary; from the comments, it sounds as though you already have this.
Factorize each modulus n as a product of prime powers: n = p1^e1 * p2^e2 * ... * pk^ek.
Replace each congruence x = a (mod n) with a collection of congruences x = a (mod pi^ei), one for each of the k prime powers you found in step 2.
And now, by the Chinese Remainder Theorem, it's enough to check compatibility for each prime independently: given any two congruences x = a (mod p^e) and x = b (mod p^f), they're compatible if and only if a = b (mod p^min(e, f)). Having determined compatibility, you can throw out the congruence with the smaller modulus without losing any information.
With the right data structures, you can do all this in a single pass through your congruences: for each prime p encountered, you'll need to keep track of the biggest exponent e found so far, together with the corresponding right-hand side (reduced modulo p^e for convenience). The running time will likely be dominated by the modulus factorizations, though if no modulus exceeds 10^6, then you can make that step very fast, too, by prebuilding a mapping from each integer in the range 1 .. 10^6 to its smallest prime factor.
EDIT: And since this is supposed to be a programming site, here's some (Python 3) code to illustrate the above. (For Python 2, replace the range call with xrange for better efficiency.)
def prime_power_factorisation(n):
    """Brain-dead factorisation routine, for illustration purposes only."""
    # DO NOT USE FOR LARGE n!
    while n > 1:
        p, pe = next(d for d in range(2, n+1) if n % d == 0), 1
        while n % p == 0:
            n, pe = n // p, pe*p
        yield p, pe

def compatible(old_ppc, new_ppc):
    """Determine whether two prime power congruences (with the same
    prime) are compatible."""
    m, a = old_ppc
    n, b = new_ppc
    return (a - b) % min(m, n) == 0

def are_congruences_solvable(moduli, right_hand_sides):
    """Determine whether the given congruences have a common solution."""
    # prime_power_congruences is a dictionary mapping each prime encountered
    # so far to a pair (prime power modulus, right-hand side).
    prime_power_congruences = {}
    for m, a in zip(moduli, right_hand_sides):
        for p, pe in prime_power_factorisation(m):
            # New prime-power congruence: (modulus, rhs).
            new_ppc = pe, a % pe
            if p in prime_power_congruences:
                old_ppc = prime_power_congruences[p]
                if not compatible(new_ppc, old_ppc):
                    return False
                # Keep the one with the bigger exponent.
                prime_power_congruences[p] = max(new_ppc, old_ppc)
            else:
                prime_power_congruences[p] = new_ppc
    # If we got this far, there are no incompatibilities, and
    # the congruences have a mutual solution.
    return True
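For instance (a toy check of my own, not from the original answer):

# x = 2 (mod 3), x = 5 (mod 9), x = 1 (mod 4) has the solution x = 5:
assert are_congruences_solvable([3, 9, 4], [2, 5, 1])
# x = 0 (mod 4) contradicts x = 1 (mod 2):
assert not are_congruences_solvable([4, 2], [0, 1])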
One final note: in the above, we made use of the fact that the moduli were small, so that computing prime power factorisations wasn't a big deal. But if you do need to do this for much larger moduli (hundreds or thousands of digits), it's still feasible. You can skip the factorisation step, and instead find a "coprime base" for the collection of moduli: that is, a collection of pairwise relatively prime positive integers such that each modulus appearing in your congruences can be expressed as a product (possibly with repetitions) of elements of that collection. Now proceed as above, but with reference to that coprime base instead of the set of primes and prime powers. See this article by Daniel Bernstein for an efficient way to compute a coprime base for a set of positive integers. You'd likely end up making two passes through your list: one to compute the coprime base, and a second to check the consistency.
