Quick idea for a shuffling algorithm - would this work?

I was discussing with my brothers a quick-and-dirty algorithm for shuffling a deck of cards (i.e. an array in which every element is unique). Description of the algorithm:
Let the number of cards in the deck be n. Take a number x so that gcd(n,x)=1. Now iteratively pick the card number (x*i) mod n for i=1 up to i=n and put it in a new pile of cards (without removing the card from the original deck, that is, make a copy of the card). The new pile of cards will be our result.
It seems clear to me that only performing this algorithm once will not give a result that is "random enough" (in the sense that it would fail statistical tests for determining randomness). But what if we perform the algorithm iteratively, possibly for a new value of x that also fulfills gcd(n,x)=1? If doing this a sufficient number of times would give us a "random enough" result, how many times could we expect to need to do this as a function of n?

Doing it multiple times would be insufficient due to the wonders of modular arithmetic. In fact, there are at most n permutations you could ever achieve this way: one for each residue x with gcd(n,x)=1, which comes to n - 1 of them when n is prime and fewer when n is composite.
Suppose you were to do this twice, with x relatively prime to n the first time and y relatively prime to n the second time.
The first time an element at position p is moved to p * x (mod n). Then the second time it is moved to (p * x) * y (mod n). This is the same as moving it to p * (x * y) (mod n), because multiplication modulo n is associative. But if x * y = v (mod n), then it's the same as moving it to p * v (mod n) -- and as you know, there aren't more than n equivalence classes for v.
Hence, there are at most n permutations that can result for a deck of n cards. (No, this isn't a rigorous proof!)
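A small Python sketch can check the argument above by brute force. It assumes positions are labelled 0 to n-1 and that the n-th pick (i = n) lands at position 0, so a shuffle is exactly "position p receives the card from position x*p mod n"; the function names are made up for this illustration.

```python
from math import gcd

def multiplicative_shuffle(deck, x):
    """Position p of the new pile receives the card at position (x * p) mod n."""
    n = len(deck)
    return [deck[(x * p) % n] for p in range(n)]

def reachable_permutations(n):
    """Every ordering reachable from range(n) by repeating such shuffles."""
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    start = tuple(range(n))
    seen = {start}
    frontier = [start]
    while frontier:
        deck = frontier.pop()
        for x in units:
            nxt = tuple(multiplicative_shuffle(deck, x))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

if __name__ == "__main__":
    for n in (7, 10, 12):
        print(n, len(reachable_permutations(n)))
```

Under that labelling, however many rounds are applied, the number of reachable orderings stays at the count of residues coprime to n, far below n!.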
Edit:
I had claimed if you used modular multiplicative exponentiation instead, it would be superior. However, after additional consideration many trivial configurations would still fall prey to modular arithmetic in the same "at most n permutations" way.

Related

Is this number a power of two?

I have a number (in base 10) represented as a string with up to 10^6 digits. I want to check if this number is a power of two. One thing I can think of is binary search on exponents and using FFT and fast exponentiation algorithm, but it is quite long and complex to code. Let n denote the length of the input (i.e., the number of decimal digits in the input). What's the most efficient algorithm for solving this problem, as a function of n?
There are either three or four powers of 2 for any given size of a decimal number (log2(10) is about 3.32), and it is easy to guess what they are, since the size of the decimal number is a good approximation of its base 10 logarithm, and you can compute the base 2 logarithm by just multiplying by an appropriate constant (log2(10)). So a binary search would be inefficient and unnecessary.
Once you have a trial exponent, which will be on the order of three million, you can use the squaring exponentiation algorithm with about 22 bignum decimal multiplications. (And up to 21 doublings, but those are relatively easy.)
Depending on how often you do this check, you might want to invest in fast bignum code. But if it is infrequent, simple multiplication should be ok.
If you don't expect the numbers to be powers of 2, you could first do a quick computation mod 10^9 to see if the last 9 digits match. That will eliminate all but a tiny percentage of random numbers. Or, for an even faster but slightly weaker filter, using 64-bit arithmetic check that the last 20 digits are divisible by 2^20 and not by 10.
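The guessing step and the mod 10^9 filter can be sketched in Python roughly as follows. The function name is invented here, and one exponent of slack is added on each side of the range to absorb floating-point rounding; a match only means the candidate survives the filter, so a full comparison against 2^k would still be needed afterwards.

```python
import math

def power_of_two_candidates(digits):
    """Exponents k for which 2**k could equal the decimal string `digits`,
    judged only by the digit count and the last nine digits."""
    d = len(digits)
    log2_10 = math.log2(10)
    # 2**k has d digits exactly when (d - 1) * log2(10) <= k < d * log2(10).
    lo = max(0, int((d - 1) * log2_10) - 1)
    hi = int(d * log2_10) + 1
    tail = int(digits[-9:])
    return [k for k in range(lo, hi + 1) if pow(2, k, 10**9) == tail]

if __name__ == "__main__":
    print(power_of_two_candidates(str(2**100)))
    print(power_of_two_candidates("12345"))
```

Only the digit count and the last nine digits are read, so this part costs almost nothing even for a million-digit input.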
Here is an easy probabilistic solution.
Say your number is n, and we want to find k: n = 2^k. Obviously, k = log2(n) = log10(n) * log2(10). We can estimate log10(n) ~ len(n) and find k' = len(n) * log2(10) with a small error (say, |k - k'| <= 5, I didn't check but this should be enough). You'll probably need this part in any solution that comes to mind; it was mentioned in other answers as well.
Now let's check that n = 2^k for some known k. Select a random prime number P from 2 to k^2 and compare n mod P with 2^k mod P. If the remainders are not equal, then k is definitely not a match. But what if they are equal? I claim that the false positive rate is bounded by 2 log(k)/k.
Why is that so? Because if n = 2^k (mod P) then P divides D = n-2^k. The number D has length about k bits (because n and 2^k have similar magnitude due to the first part) and thus cannot have more than k distinct prime divisors. There are around k^2 / log(k^2) primes less than k^2, so the probability that you've picked a prime divisor of D at random is less than k / (k^2 / log(k^2)) = 2 log(k) / k.
In practice, primes up to 10^9 (or even up to log(n)) should suffice, but you have to do a bit deeper analysis to prove the probability.
This solution does not require any long arithmetic at all; all calculations can be done in 64-bit integers.
P.S. In order to select a random prime from 1 to T you may use the following logic: select a random number from 1 to T and increment it by one until it is prime. In this case the distribution on primes is not uniform and the former analysis is not completely correct, but it can be adapted to such kind of random as well.
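A minimal Python sketch of this check, under a few assumptions that are not in the answer: the helper names, the `rounds` and `seed` parameters, and a plain trial-division primality test (good enough for illustration only). The prime is chosen with the "increment until prime" rule from the P.S., so it can land slightly above k^2.

```python
import random

def is_prime(m):
    """Trial division; fine for a sketch, slow for large m."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def random_prime(upper, rng):
    """Pick a random start in [2, upper] and step up to the next prime."""
    m = rng.randint(2, upper)
    while not is_prime(m):
        m += 1
    return m

def mod_of_decimal_string(digits, p):
    """n mod p for n given as a decimal string, one digit at a time."""
    r = 0
    for ch in digits:
        r = (r * 10 + int(ch)) % p
    return r

def matches_power_of_two(digits, k, rounds=5, seed=None):
    """False means n != 2**k for sure; True means no round found a mismatch."""
    rng = random.Random(seed)
    upper = max(k * k, 2)
    for _ in range(rounds):
        p = random_prime(upper, rng)
        if mod_of_decimal_string(digits, p) != pow(2, k, p):
            return False
    return True
```

Repeating the test with several independent primes, as the hypothetical `rounds` parameter does, multiplies the false positive bounds together.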
i am not sure if its easy to apply, but i would do it in the following way:
1) show the number in binary. now if the number is a power of two, it would look like:
1000000....
with only one 1 and the rest are 0. checking this number would be easy. now the question is how the number is stored. for example, it could have leading zeroes that make the search for the 1 harder:
...000010000....
if there are only small number of leading zeroes, just search from left to right. if the number of zeroes is unknown, we will have to...
2) binary search for the 1:
2a) cut in the middle.
2b) if both or neither of them are 0 (hopefully you can check if a number is zero in reasonable time), stop and return false. (false = not power of 2)
else continue with the non-zero part.
stop if the non-zero part = 1 and return true.
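estimation: if the number is n digits (decimal), then it's about 3.32 * n digits in binary (n * log2(10)), not 2^n.
the binary search takes O(log t) halving steps for t binary digits, but every zero check has to read the part it tests, so the total work is proportional to t. since t is about 3.32 * n, the algorithm should take O(n).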
assumptions:
1) you can access the binary view of the number.
2) you can compare a number to zero in a reasonable time.

Find prime factors such that difference is smallest as possible

Suppose n, a, b are positive integers where n is not a prime number, such that n=ab with a≥b and (a−b) is as small as possible. What would be the best algorithm to find the values of a and b if n is given?
I read a solution where they try to represent n as the difference between two squares via searching for a square S bigger than n such that S - n = (another square). Why would that be better than simply finding the prime factors of n and searching for the combination where a,b are factors of n and a - b is minimized?
Firstly....to answer why your approach
simply finding the prime factors of n and searching for the combination where a,b are factors of n and a - b is minimized
is not optimal:
Suppose your number is n = 2^7 * 3^4 * 5^2 * 7 * 11 * 13 (=259459200), well within range of int. From combinatorics, this number has exactly (8 * 5 * 3 * 2 * 2 * 2 = 960) factors. So, firstly you find all of these 960 factors, then go through the pairs (a,b) such that a * b = n. Each factor a fixes b = n / a, so that is 480 unordered pairs to compare, on top of the cost of factorizing n and generating the factors. Also, implementation of this approach is harder than the alternatives below.
Now, why the difference of squares approach
represent S - n = Q, such that S and Q are perfect squares
is good:
This is because if you can represent S - n = Q, this implies, n = S - Q
=> n = s^2 - q^2
=> n = (s+q)(s-q)
=> Your reqd ans = 2 * q
Now, even if you iterate for all squares, you will either find your answer or terminate when difference of 2 consecutive squares is greater than n
But I don't think this will be doable for all n (eg. if n=6, there is no solution for (S,Q).)
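As a hedged illustration of the difference-of-squares search (the function name is made up here), the loop below tries s = ceil(sqrt(n)), s + 1, ... and stops once the gap between consecutive squares, 2s - 1, exceeds n. Note that it can only find pairs where a and b have the same parity, which is why it returns None for n = 6 and finds (6, 2) rather than (4, 3) for n = 12.

```python
from math import isqrt

def fermat_pair(n):
    """Smallest-difference pair (a, b) with a * b == n and a, b of equal
    parity, found by writing n = s^2 - q^2; None if no such pair exists."""
    s = isqrt(n)
    if s * s < n:
        s += 1                      # s = ceil(sqrt(n))
    while 2 * s - 1 <= n:           # beyond this, s^2 - n cannot be a square
        q_squared = s * s - n
        q = isqrt(q_squared)
        if q * q == q_squared:
            return s + q, s - q     # n = (s + q) * (s - q), difference 2 * q
        s += 1
    return None

if __name__ == "__main__":
    for n in (15, 36, 6, 12):
        print(n, fermat_pair(n))
```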
Another approach:
Iterate from floor(sqrt(n)) down to 1. The first number (say, x) such that x|n will be one of the numbers in the required pair (a,b). The other will be, obvs, y = n/x. So, your answer will be y - x.
This is an O(sqrt(n)) time algorithm.
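In Python the downward scan could look like this (a minimal sketch; the function name is an assumption of this example):

```python
from math import isqrt

def closest_factor_pair(n):
    """Return (a, b) with a * b == n, a >= b and a - b as small as possible."""
    for x in range(isqrt(n), 0, -1):
        if n % x == 0:
            return n // x, x

if __name__ == "__main__":
    for n in (36, 12, 13):
        print(n, closest_factor_pair(n))
```

The loop always terminates because x = 1 divides every n, which is the case a prime n falls back to.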
A general method could be this:
Find the prime factorization of your number: n = Π p_i^a_i. Except for the worst cases where n is prime or semiprime, this will be substantially faster than the O(n^(1/2)) time of the iteration down from the square root, which won't divide the found factors out of the number.
Recall that the simplest, trial division, prime factorization is done by repeatedly trying to divide the number by increasing odd numbers (or by primes) below the number's square root, dividing out of the number each factor -- thus prime by construction -- as it is found (n := n/f).
Then, lazily enumerate the factors of n in order from its prime factorization. Stop after producing half of them. Having thus found n's (not necessarily prime) factor that is closest to its square root, find the second factor by simple division.
In case this must run repeatedly many times, it will greatly pay off to precalculate the needed primes below n's square root, to use in the factorizations.
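A rough Python sketch of this method follows. It makes one simplification that should be stated plainly: instead of lazily enumerating the factors in order and stopping halfway, it builds all of them from the prime factorization and takes the largest one not exceeding the square root. All function names are invented for the example.

```python
from math import isqrt

def prime_factorization(n):
    """Trial division, dividing each found factor out of n as we go."""
    factors = {}
    f = 2
    while f * f <= n:
        while n % f == 0:
            factors[f] = factors.get(f, 0) + 1
            n //= f
        f += 1 if f == 2 else 2     # 2, 3, 5, 7, 9, ...
    if n > 1:                       # whatever is left is a prime
        factors[n] = factors.get(n, 0) + 1
    return factors

def divisors(factors):
    """All divisors of the number described by {prime: exponent}."""
    result = [1]
    for p, a in factors.items():
        result = [d * p**k for d in result for k in range(a + 1)]
    return result

def closest_pair_via_factorization(n):
    root = isqrt(n)
    b = max(d for d in divisors(prime_factorization(n)) if d <= root)
    return n // b, b
```

For the 259459200 example mentioned earlier this produces 960 divisors after only a handful of trial divisions, since the remaining cofactor shrinks every time a prime is divided out.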

Determining whether a system of congruences has a solution

Having a system of linear congruences, I'd like to determine if it has a solution. Using simple algorithms that solve such systems is impossible, as the answer may grow exponentially.
One hypothesis I have is that if a system of congruences has no solution, then there are two of them that contradict each other. I have no idea if this holds, if it did that would lead to an easy O(n^2 log n) algo, as checking if a pair of congruences has a solution requires O(log n) time. Nevertheless for this problem I'd rather see something closer to O(n).
We may assume that no modulus exceeds 10^6; in particular, we can quickly factor them all to begin with. We may even further assume that the sum of all moduli doesn't exceed 10^6 (but still, their product can be huge).
As you suspect, there's a fairly simple way to determine whether the set of congruences has a solution without actually needing to build that solution. You need to:
Reduce each congruence to the form x = a (mod n) if necessary; from the comments, it sounds as though you already have this.
Factorize each modulus n as a product of prime powers: n = p1^e1 * p2^e2 * ... * pk^ek.
Replace each congruence x = a (mod n) with a collection of congruences x = a (mod pi^ei), one for each of the k prime powers you found in step 2.
And now, by the Chinese Remainder Theorem it's enough to check compatibility for each prime independently: given any two congruences x = a (mod p^e) and x = b (mod p^f), they're compatible if and only if a = b (mod p^(min(e, f))). Having determined compatibility, you can throw out the congruence with smaller modulus without losing any information.
With the right data structures, you can do all this in a single pass through your congruences: for each prime p encountered, you'll need to keep track of the biggest exponent e found so far, together with the corresponding right-hand side (reduced modulo p^e for convenience). The running time will likely be dominated by the modulus factorizations, though if no modulus exceeds 10^6, then you can make that step very fast, too, by prebuilding a mapping from each integer in the range 1 .. 10^6 to its smallest prime factor.
EDIT: And since this is supposed to be a programming site, here's some (Python 3) code to illustrate the above. (For Python 2, replace the range call with xrange for better efficiency.)
def prime_power_factorisation(n):
    """Brain-dead factorisation routine, for illustration purposes only."""
    # DO NOT USE FOR LARGE n!
    while n > 1:
        p, pe = next(d for d in range(2, n+1) if n % d == 0), 1
        while n % p == 0:
            n, pe = n // p, pe*p
        yield p, pe

def compatible(old_ppc, new_ppc):
    """Determine whether two prime power congruences (with the same
    prime) are compatible."""
    m, a = old_ppc
    n, b = new_ppc
    return (a - b) % min(m, n) == 0

def are_congruences_solvable(moduli, right_hand_sides):
    """Determine whether the given congruences have a common solution."""
    # prime_power_congruences is a dictionary mapping each prime encountered
    # so far to a pair (prime power modulus, right-hand side).
    prime_power_congruences = {}
    for m, a in zip(moduli, right_hand_sides):
        for p, pe in prime_power_factorisation(m):
            # new prime-power congruence: modulus, rhs
            new_ppc = pe, a % pe
            if p in prime_power_congruences:
                old_ppc = prime_power_congruences[p]
                if not compatible(new_ppc, old_ppc):
                    return False
                # Keep the one with bigger exponent.
                prime_power_congruences[p] = max(new_ppc, old_ppc)
            else:
                prime_power_congruences[p] = new_ppc
    # If we got this far, there are no incompatibilities, and
    # the congruences have a mutual solution.
    return True
One final note: in the above, we made use of the fact that the moduli were small, so that computing prime power factorisations wasn't a big deal. But if you do need to do this for much larger moduli (hundreds or thousands of digits), it's still feasible. You can skip the factorisation step, and instead find a "coprime base" for the collection of moduli: that is, a collection of pairwise relatively prime positive integers such that each modulus appearing in your congruences can be expressed as a product (possibly with repetitions) of elements of that collection. Now proceed as above, but with reference to that coprime base instead of the set of primes and prime powers. See this article by Daniel Bernstein for an efficient way to compute a coprime base for a set of positive integers. You'd likely end up making two passes through your list: one to compute the coprime base, and a second to check the consistency.
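For the small-modulus case, the "mapping from each integer to its smallest prime factor" mentioned earlier could be prebuilt with a sieve along these lines. This is only a sketch; `smallest_prime_factors` and `prime_powers` are names chosen for this example, with `prime_powers` meant as a possible faster stand-in for the brain-dead factorisation routine.

```python
def smallest_prime_factors(limit):
    """spf[m] is the smallest prime factor of m, for 2 <= m <= limit."""
    spf = list(range(limit + 1))
    for p in range(2, int(limit**0.5) + 1):
        if spf[p] == p:                       # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:               # not yet claimed by a smaller prime
                    spf[m] = p
    return spf

def prime_powers(n, spf):
    """Yield (p, p**e) for each prime power exactly dividing n."""
    while n > 1:
        p, pe = spf[n], 1
        while n % p == 0:
            n, pe = n // p, pe * p
        yield p, pe

if __name__ == "__main__":
    spf = smallest_prime_factors(10**6)
    print(list(prime_powers(360, spf)))
```

With the table built once, each factorisation takes only as many steps as the modulus has prime factors, counted with multiplicity.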

Finding even numbers in an array without using feedback

I saw this post: Finding even numbers in an array and I was thinking about how you could do it without feedback. Here's what I mean.
Given an array of length n containing at most e even numbers and a
function isEven that returns true if the input is even and false
otherwise, write a function that prints all the even numbers in the
array using the fewest number of calls to isEven.
The answer on the post was to use a binary search, which is neat since it doesn't mean the array has to be in order. The number of times you have to check if a number is even is e log n instead of n because you do a binary search (log n) to find one even number each time (e times).
But that idea means that you divide the array in half, test for evenness, then decide which half to keep based on the result.
My question is whether or not you can beat n calls on a fixed testing scheme where you check all the numbers you want for evenness without knowing the outcome, and then figure out where the even numbers are after you've done all the tests based on the results. So I guess it's no-feedback or blind or some term like that.
I was thinking about this for a while and couldn't come up with anything. The binary search idea doesn't work at all with this constraint, but maybe something else does? Even getting down to n/2 calls instead of n (yes, I know they are the same big-O) would be good.
The technical term for "no-feedback or blind" is "non-adaptive". O(e log n) calls still suffice, but the algorithm is rather more involved.
Instead of testing the evenness of products, we're going to test the evenness of sums. Let E ≠ F be distinct subsets of {1, …, n}. If we have one array x_1, …, x_n with even numbers at positions E and another array y_1, …, y_n with even numbers at positions F, how many subsets J of {1, …, n} satisfy
(∑_{i in J} x_i) mod 2 ≠ (∑_{i in J} y_i) mod 2?
The answer is 2^(n-1). Let i be an index such that x_i mod 2 ≠ y_i mod 2. Let S be a subset of {1, …, i - 1, i + 1, …, n}. Either J = S is a solution or J = S union {i} is a solution, but not both.
For every possible outcome E, we need to make calls that eliminate every other possible outcome F. Suppose we make 2e log n calls at random. For each pair E ≠ F, the probability that we still cannot distinguish E from F is (2^(n-1)/2^n)^(2e log n) = n^(-2e), because there are 2^n possible calls and only 2^(n-1) fail to distinguish. There are at most n^e + 1 choices of E and thus at most (n^e + 1)n^e/2 pairs. By a union bound, the probability that there exists some indistinguishable pair is at most n^(-2e)(n^e + 1)n^e/2 < 1 (assuming we're looking at an interesting case where e ≥ 1 and n ≥ 2), so there exists a sequence of 2e log n calls that does the job.
Note that, while I've used randomness to show that a good sequence of calls exists, the resulting algorithm is deterministic (and, of course, non-adaptive, because we chose that sequence without knowledge of the outcomes).
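To make the idea concrete, here is a small Python sketch. It fixes the random subsets up front (via a hypothetical `seed` parameter), calls the evenness test on each subset sum, and only then decodes by brute force over every candidate set of at most e positions. The brute-force decoder and all names are assumptions of this example, and it is only practical for tiny n.

```python
import itertools
import random

def is_even(value):
    return value % 2 == 0

def nonadaptive_find_evens(arr, e, num_tests, seed=0):
    """Return every set of at most e positions consistent with the outcomes
    of num_tests subset-sum parity tests chosen before any test is run."""
    n = len(arr)
    rng = random.Random(seed)
    # All subsets are fixed here, before looking at a single outcome.
    subsets = [[i for i in range(n) if rng.random() < 0.5]
               for _ in range(num_tests)]
    outcomes = [is_even(sum(arr[i] for i in J)) for J in subsets]

    consistent = []
    for size in range(e + 1):
        for candidate in itertools.combinations(range(n), size):
            evens = set(candidate)
            # A sum is even exactly when it contains an even number of odd terms.
            if all(((len(J) - len(evens.intersection(J))) % 2 == 0) == outcome
                   for J, outcome in zip(subsets, outcomes)):
                consistent.append(evens)
    return consistent
```

The true set of even positions is always among the returned candidates; with enough tests it is, with high probability, the only one.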
You can use the Chinese Remainder Theorem to do this. I'm going to change your notation a bit.
Suppose you have N numbers of which at most E are even. Choose a sequence of distinct prime powers q1,q2,...,qk such that their product is at least N^E, i.e.
qi = pi^ei
where pi is prime and ei > 0 is an integer and
q1 * q2 * ... * qk >= N^E
Now make a bunch of 0-1 matrices. Let Mi be the qi x N matrix where the entry in row r and column c has a 1 if c = r mod qi and a 0 otherwise. For example, if qi = 3^2, then row 2 has ones in columns 2, 11, 20, ... 2 + 9j and 0 elsewhere.
Now stack these matrices vertically to get a Q x N matrix M, where Q = q1 + q2 + ... + qk. The rows of M tell you which numbers to multiply together (the nonzero positions). This gives a total of Q products that you need to test for evenness. Call each row a "trial", and say that a "trial involves j" if the jth column of that row is nonzero. The theorem you need is the following:
THEOREM: The number in position j is even if and only if all trials involving j are even.
So you do a total of Q trials and then look at the results. If you choose the prime powers intelligently, then Q should be significantly smaller than N. There are asymptotic results that show you can always get Q on the order of
(2E log N)^2 / 2log(2E log N)
This theorem is actually a corollary of the Chinese Remainder Theorem. The only place that I've seen this used is in Combinatorial Group Testing. Apparently the problem originally arose during WWII, when the US Army wanted to screen draftees for syphilis.
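A hedged Python sketch of the construction, using 0-based positions and function names invented for the example. Each trial is one residue class modulo one of the chosen prime powers, the product over the class is tested for evenness, and decoding applies the theorem directly.

```python
def crt_trials(n, moduli):
    """One trial per (modulus q, residue r): the positions c with c % q == r."""
    return [[c for c in range(n) if c % q == r]
            for q in moduli for r in range(q)]

def find_evens(arr, moduli, is_even):
    """Positions of even entries, assuming the moduli are pairwise coprime
    prime powers whose product is at least len(arr) ** (number of evens)."""
    trials = crt_trials(len(arr), moduli)
    results = []
    for trial in trials:
        product = 1
        for c in trial:
            product *= arr[c]
        results.append(is_even(product))      # one call per trial
    # Position j is even iff every trial involving j came out even.
    return [j for j in range(len(arr))
            if all(res for trial, res in zip(trials, results) if j in trial)]

if __name__ == "__main__":
    data = [2 * i + 1 for i in range(30)]
    data[17] = 4
    print(find_evens(data, [2, 3, 5], lambda v: v % 2 == 0))
```

With N = 30, E = 1 and moduli 2, 3, 5 (product 30), this uses Q = 2 + 3 + 5 = 10 calls instead of 30.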
The problem you are facing is a form of group testing, type of a problem with the objective of reducing the cost of identifying certain elements of a set (up to d elements of a set of N elements).
As you've already stated, there are two basic principles via which the testing may be carried out:
Non-adaptive Group Testing, where all the tests to be performed are decided a priori.
Adaptive Group Testing, where we perform several tests, basing each test on the outcome of previous tests. Obviously, adaptive testing has a potential to reduce the cost, compared to non-adaptive testing.
Theoretical bounds for both principles have been studied, and are available in this Wiki article, or this paper.
For adaptive testing, the upper bound is O(d*log(N)) (as already described in this answer).
For non-adaptive testing, it can be shown that the upper bound is O(d*d/log(d)*log(N)), which is obviously larger than the upper bound for adaptive testing by a factor of d/log(d).
This upper bound for non-adaptive testing comes from an algorithm which uses disjunct matrices: matrices of dimension T x N ("number of tests" x "number of elements"), where each item can be either true (if an element was included in a test), or false (if it wasn't), with the property that the Boolean OR of any d columns does not cover any other column, i.e. every other column is true in at least one row (test) where all d chosen columns are false. This allows linear time of decoding (there are also "d-separable" matrices where fewer tests are needed, but the time complexity for their decoding is exponential and not computationally feasible).
Conclusion:
My question is whether or not you can beat n calls on a fixed testing scheme [...]
For such a scheme and a sufficiently large value of N, a disjunct matrix can be constructed which would have less than K * [d*d/log(d)*log(N)] rows. So, for large values of N, yes, you can beat it.
The underlying question (challenge) is kind of silly. If the binary search answer is acceptable (where it sums sub arrays and sends them to IsEven) then I can think of a way to do it with E or less calls to IsEven (assuming the numbers are integers of course).
JavaScript to demonstrate
// sort the array by only the first bit of the number
A.sort(function(x,y) { return (x & 1) - (y & 1); });
// all of the evens will be at the beginning
for(var i=0; i < E && i < A.length; i++) {
    if(IsEven(A[i]))
        Print(A[i]);
    else
        break;
}
Not exactly a solution, but just few thoughts.
It is easy to see that if a solution exists for array length n that takes less than n tests, then for any array length m > n there is always a solution with less than m tests. So, if you have a solution for n = 2 or 3 or 4, then the problem is solved.
You can split the array into pairs of numbers and for each pair: if the sum is odd, then exactly one of them is even, otherwise if one of the numbers is even, then both of them are even. This way for each pair it takes either one or two tests. Best case: n/2 tests, worst case: n tests; if even and odd numbers are chosen with equal probability, then 3n/4 tests on average.
My hunch is there is no solution with less than n tests. Not sure how to prove it.
UPDATE: The second solution can be extended in the following way.
Check if the sum of two numbers is even. If odd, then exactly one of them is even. Otherwise label the set as "homogeneous set of size 2". Take two "homogenous set"s of same size n. Pick one number from each set and check if their sum is even. If it is even, combine these two sets to a "homogeneous set of size 2n". Otherwise, it implies that one of those sets purely consists of even numbers and the other one purely odd numbers.
Best case:n/2 tests. Average case: 3*n/2. Worst case is still n. Worst case exists only when all the numbers are even or all the numbers are odd.
If we can add and multiply array elements, then we can compute every Boolean function (up to complementation) on the low-order bits. Simulate a circuit that encodes the positions of the even numbers as a number from 0 to nC0 + nC1 + ... + nCe - 1 represented in binary and use calls to isEven to read off the bits.
Number of calls used: within 1 of the information-theoretic optimum.
See also fully homomorphic encryption.

integer factorization and cryptography

i know that public key cryptography uses prime numbers,
also know that two large(e.g. 100 digit) prime numbers (P, Q) are used as the private key,
the product is a public key N = P * Q,
and using prime numbers is because the factorization of N to obtain P , Q is sooo difficult and takes much time,
i'm ok with that, but i'm puzzled
why not just use any ordinary large non-prime numbers for P , Q
and so the factorization of N will be still difficult
because now there are not only 2 factors possible, but even more.
thanks....
I am not a crypto expert.
why not just use any ordinary large
non-prime numbers for P , Q
Because there would be more factors. Integer factorization is an attack against public private key encryption. This attack exploits this very relation.
One could more easily find the relation and possible values with more common factors. It boils down to algebra.
N = P * Q
if P and Q are both Prime then N has 4 factors {N P Q 1}
However!
if P and Q both share a factor of 2
N / 4 = P / 2 * Q / 2
If N could have been 0..2^4096 it is now 0..2^4094 and since 2 was a factor another large number was also a factor.
This means that I could find a scalar multiple, P', Q' of P,Q S.T. P',Q' < P,Q
I don't fully understand the concept myself but I believe this shows where i'm going with this.
You have to search a smaller space until you brute force the key.
It is perfectly possible to use RSA with a modulus N that is composed of more than two prime factors P and Q, but two things have to be noted:
You must know the exact value of all of these factors, or else you will be unable to derive the private key from the public key upon key generation. The equation for two-prime RSA is 1 = D*E (mod LCM(P-1,Q-1)). If you do know these prime factors, you can perform the calculation. If you don't know the prime factors you can't perform this calculation, which BTW is why it is safe to make the public key E,N public - you can't derive the private key D if you only have the information that is easily derived from the public key, unless you are able to factor N.
The security of RSA is effectively bounded by the magnitude of the second largest prime factor of the RSA modulus N. Finding small prime factors that are less than 2^32 can be done in a fraction of a second on a modern computer, simply by trying to divide the modulus N by each such prime and check if the residue is zero (meaning N is divisible by that number) or not (meaning that number is not a factor of N). If N is composed of only such small prime factors times a single large prime factor Q, it would be trivial to find that Q as well, simply by dividing N by all small factors to get N' and test N' for primality. If N' is a prime, it is the last prime factor Q.
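Both points can be illustrated with toy numbers in Python. This is a sketch only: the function names are invented, the primes are assumed to be distinct, the key sizes are far too small to be secure, and `pow(e, -1, m)` needs Python 3.8 or later.

```python
from math import gcd

def private_exponent(primes, e):
    """D with D * E = 1 (mod LCM(p - 1 for each prime p)); needs every factor."""
    lam = 1
    for p in primes:
        lam = lam * (p - 1) // gcd(lam, p - 1)    # running LCM
    return pow(e, -1, lam)

def strip_small_factors(n, bound):
    """Divide out every factor up to bound; return (small factors, cofactor)."""
    small = []
    f = 2
    while f <= bound and n > 1:
        while n % f == 0:
            small.append(f)
            n //= f
        f += 1
    return small, n

if __name__ == "__main__":
    d = private_exponent([61, 53], 17)            # two-prime toy key, N = 3233
    message = 65
    assert pow(pow(message, 17, 3233), d, 3233) == message
    print(d, strip_small_factors(2 * 3 * 10007, 100))
```

The second function shows why small factors add no security: once they are divided out, whatever remains is either the single large prime or a smaller number to attack.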
I'm not really expert in cryptology (so if I'm wrong just tell me in a comment, and I'll promptly delete this answer), but I think it's because if you just use random large numbers you may get easily factorisable ones (i.e. you don't have to get up to really large prime numbers to get their prime factors). So just really big, guaranteed-prime numbers are used.
