Finding a coprime number of a given magnitude - algorithm

I have an arbitrary number x. I would like to compute a number coprime to x that's close(ish) to the square root of x. I don't need to find all such numbers, and factoring x is expensive. I just need one number.
Constant time, preferably.

You can compute the GCD with the Euclidean algorithm quite efficiently, so if you just try the numbers close to the square root you should find a candidate very quickly.
You are unlikely to hit a whole string of numbers that share factors with x, because once a candidate shares a prime factor p with x, the next candidate divisible by p is p numbers away.
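
Here is a minimal sketch of that search in Python (the function name is mine; math.isqrt and math.gcd are standard library):

    import math

    def coprime_near_sqrt(x):
        # Search outward from isqrt(x); by the argument above, only a
        # handful of GCD tests should be needed before gcd == 1.
        r = math.isqrt(x)
        for offset in range(r + 1):
            for candidate in (r + offset, r - offset):
                if candidate > 1 and math.gcd(candidate, x) == 1:
                    return candidate
        return 1  # degenerate cases such as x <= 2

Each GCD costs O(log^2 x) bit operations with schoolbook arithmetic, so in practice this behaves like constant time, though the worst case is not strictly bounded.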

Binary XGCD for polynomials

There exists a binary GCD algorithm for finding the greatest common divisor of two integers. In general, the GCD can be extended to the XGCD, which can help find a multiplicative inverse in a field.
I am working with binary numbers that represent a polynomial; for example, the bitstring 1101 represents x^3 + x^2 + 1. I need to compute the modular inverse of a random polynomial modulo x^p - 1 for some large known prime p. However, I need to do it in constant time (meaning that the runtime should not depend on the polynomial I am inverting).

I know how to make the binary GCD constant time, and I know how to implement the XGCD for polynomials in order to compute multiplicative inverses. What I don't know is whether there exists a binary GCD equivalent (with a corresponding XGCD) for binary polynomials.
Yes, there is. The "binary" GCD works in any ring that has a smallest prime. For the integers it is 2, hence the name binary. For polynomials, it is x. The algorithm follows the same idea: subtract one polynomial from the other to eliminate the free (constant) term of the one of higher degree, factor out the highest possible power of x, and keep going until the result of the subtraction becomes zero.
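
For illustration, here is a sketch of the plain (not constant-time) variant over GF(2)[x], with polynomials packed into Python integers as in the question (bit i is the coefficient of x^i, so subtraction is XOR and "divisible by x" means the low bit is 0); the function name is mine:

    def poly_gcd2(a: int, b: int) -> int:
        if a == 0:
            return b
        if b == 0:
            return a
        # Factor out the power of x dividing both polynomials.
        shift = ((a | b) & -(a | b)).bit_length() - 1
        a >>= shift
        b >>= shift
        # Make the constant term of a nonzero (x no longer divides the gcd).
        while a & 1 == 0:
            a >>= 1
        while b != 0:
            # Strip factors of x from b; they are coprime to a.
            while b & 1 == 0:
                b >>= 1
            # Both constant terms are now 1; keep the shorter polynomial
            # in a, and the XOR cancels the constant term of the longer.
            if a.bit_length() > b.bit_length():
                a, b = b, a
            b ^= a
        return a << shift

    print(bin(poly_gcd2(0b1101, 0b11)))  # 0b1: x^3 + x^2 + 1 and x + 1 are coprime
    print(bin(poly_gcd2(0b101, 0b11)))   # 0b11: gcd(x^2 + 1, x + 1) = x + 1 over GF(2)

Making this constant time is then the same exercise as for the integer version: run a fixed number of iterations and replace the branches with masked selects.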

What is the reason behind calculating GCD in Pollard rho integer factorisation?

This is the pseudocode for integer factorization taken from CLRS. But what is the point of the GCD computed in line 8, and why does k need to be doubled when i == k in line 13? Help please.
That pseudocode is not Pollard-rho factorization despite the label. It is one trial of the related Brent's factorization method. In Pollard-rho factorization, in the ith step you compute x_i and x_(2i), and check the GCD of x_(2i)-x_i with n. In Brent's factorization method, you compute GCD(x_(2^a)-x_(2^a+b),n) for b=1,2,...,2^a. (I use indices starting at 1 to agree with the pseudocode, but elsewhere the sequence is initialized with x_0.) In the code, k=2^a and i=2^a+b. When you detect that i has reached the next power of 2, you increase k to 2^(a+1).
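
In Python, one trial of that loop might look like the following sketch (names are mine; the CLRS pseudocode prints every divisor found, whereas this returns the first one):

    import random
    from math import gcd

    def brent_trial(n, max_steps=10000):
        x = random.randrange(0, n)       # x_1
        y, k = x, 2                      # y holds x_k for the last power of two k
        for i in range(2, max_steps):
            x = (x * x - 1) % n          # x_i
            d = gcd(abs(y - x), n)       # the GCD from line 8
            if d != 1 and d != n:
                return d                 # nontrivial factor of n
            if i == k:                   # line 13: i reached the next power of 2
                y, k = x, 2 * k
        return None                      # give up and retry with a new x_1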
GCDs can be computed very rapidly by Euclid's algorithm without knowing the factorizations of the numbers. Any time you find a nontrivial GCD with n, this helps you to factor n. In both Pollard-rho factorization and Brent's algorithm, one idea is that if you iterate a polynomial such as x^2-c, the differences between the values of the iterates mod n tend to be good candidates for numbers that share nontrivial factors with n. This is because (by the Chinese Remainder Theorem) iterating the polynomial mod n is the same as simultaneously iterating the polynomial mod each prime power in the prime factorization of n. If x_i=x_j mod p1^e1 but not mod p2^e2, then GCD(x_i-x_j,n) will have p1^e1 as a factor but not p2^e2, so it will be a nontrivial factor of n.
This is one trial because x_1 is initialized once. If you get unlucky, the value you choose for x_1 starts a preperiodic sequence that repeats at the same time mod each prime power in the prime factorization of n, even though n is not prime. For example, suppose n=1711=29*59, and x_1=4, x_2=15, x_3=224, x_4=556, x_5=1155, x_6=1155, ... This sequence does not help you to find a nontrivial factorization, since all of the GCDs of differences between distinct elements and 1711 are 1. If you start with x_1=5, then x_2=24, x_3=575, x_4=401, x_5=1677, x_6=1155, x_7=1155, ... In either factorization method, you would find that GCD(x_4-x_2,1711)=GCD(377,1711)=29, a nontrivial factor of 1711. Not only are some sequences unhelpful; others would work eventually, but it can be faster to give up and start over with another initial value. So normally you don't keep increasing i forever: there is a termination threshold at which you try a different initial value.
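
For comparison, here is a sketch of the classic Pollard-rho trial (Floyd cycle-finding, comparing x_i with x_(2i)) that reproduces the 1711 example; the function name and the termination threshold are my own:

    from math import gcd

    def rho_trial(n, seed, max_steps=10000):
        f = lambda x: (x * x - 1) % n   # the iterated polynomial from the example
        slow = fast = seed
        for _ in range(max_steps):
            slow = f(slow)              # x_i
            fast = f(f(fast))           # x_(2i)
            d = gcd(abs(slow - fast), n)
            if d == n:
                return None             # x_i == x_(2i) mod n: cycle closed mod every factor at once
            if d > 1:
                return d
        return None                     # hit the threshold; retry with a new seed

    print(rho_trial(1711, 5))   # 29: a nontrivial factor, as in the example above
    print(rho_trial(1711, 4))   # None: the unlucky seed from the example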

Given n integers, find m of them whose sum's absolute value is minimal

I have n integers; both positive and negative values are included. What is a good algorithm to find m integers from that list such that the absolute value of the sum of those m integers is the smallest possible?
The problem is NP-hard, since solving it efficiently would solve the subset-sum decision problem efficiently.
Given that, you're not going to find an efficient algorithm to solve it unless you believe that P=NP.
You can always come up with some heuristics to direct your search, but in the worst case you'll have to check every subset of m integers.
If "good" means "correct", then just try every possibility. This will take you about n choose m time. Very slow. Unfortunately, this is the best you can do in general, because for any set of integers you can always add one more that is the negative of a sum of m-1 other ones--and those others could all have the same sign, so you have no way to search.
If "good" means "fast and usually works okay", then there are various ways to proceed. E.g.:
Suppose you can solve the problem for m=2, and suppose further you can solve it for both the positive and the negative answer (and then take the smaller of the two). Now suppose you want to solve m=4. Solve for m=2, then throw those two numbers out and solve again...should be obvious what to do next! Now, what about m=6?
Now suppose you can solve the problem for m=3 and m=2. Think you can get a decent answer for m=5?
Finally, note that if you sort the numbers, you can solve for m=2 in one pass (see the sketch below). For m=3 you have an annoying quadratic search to do, but at least you can run it on only about a quarter of the list twice (the small halves of the positive and the negative numbers) and look for a number of opposite sign to cancel each candidate pair.
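
A sketch of that sorted m=2 pass in Python (the function name is mine; it assumes at least two numbers, and for m=2 the two-pointer sweep minimizes |sum| exactly):

    def closest_pair_sum(nums):
        s = sorted(nums)
        lo, hi = 0, len(s) - 1
        best = (s[lo], s[hi])
        while lo < hi:
            cur = s[lo] + s[hi]
            if abs(cur) < abs(best[0] + best[1]):
                best = (s[lo], s[hi])
            if cur < 0:
                lo += 1        # sum too negative: take a larger element
            elif cur > 0:
                hi -= 1        # sum too positive: take a smaller element
            else:
                break          # exact zero cannot be beaten
        return best

    print(closest_pair_sum([-7, 3, 12, -5, 8, -2]))   # (-7, 8), |sum| = 1

The greedy recipes above then follow: for m=4, run this, remove the chosen pair, and run it again. The result is fast but only a heuristic, per the NP-hardness argument.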

What is the most efficient algorithm to find the closest prime less than a given number n?

Problem
Given a number n, 2<=n<=2^63. n could be prime itself. Find the prime p that is closest to n.
Using the fact that every prime p > 2 is odd and of the form 6k+1 or 6k+5, one could write a loop from n-1 down to 2, checking whether each number is prime. So instead of checking all numbers, I only need to check the odd numbers of the two forms above. However, I wonder if there is a faster algorithm to solve this problem, i.e. some constraint that can restrict the range of numbers that needs to be checked? Any idea would be greatly appreciated.
In reality, the odds of finding a prime near n are high (their density is about 1 in ln n, by the prime number theorem), so brute-force checking while skipping "trivial" candidates (numbers divisible by small primes) is going to be your best approach given what we know about number theory to date.
[update] A mild optimization you might do is similar to the Sieve of Eratosthenes: pick some small smoothness bound, mark every number in a range around n that is divisible by one of those small primes as composite, and only run the primality test on the numbers relatively prime to your smooth base. You will need to keep the range and the smoothness bound small enough that the sieving does not eclipse the runtime of the comparatively expensive primality test.
The biggest optimization that you can do is to use a fast probabilistic primality check before doing a full test. For instance, see http://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test for a commonly used test that will quickly eliminate most numbers as "probably not prime". Only after you have good reason to believe that a number is prime should you attempt to properly prove primality. (For many purposes people are happy to just accept that if a number passes a fixed number of rounds of the Miller-Rabin test, it is so likely to be prime that you can accept that fact.)
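
A sketch of the whole approach in Python (names are mine; this uses random Miller-Rabin bases, so the answer is "probably prime", although for the full 2^63 range a fixed set of deterministic bases is known to suffice):

    import random

    def is_probable_prime(n, rounds=20):
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):   # quick trial division first
            if n % p == 0:
                return n == p
        d, r = n - 1, 0                  # write n - 1 = d * 2^r with d odd
        while d % 2 == 0:
            d //= 2
            r += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False             # a witnesses that n is composite
        return True

    def prev_prime(n):
        # Largest prime strictly below n, for n >= 3.
        if n <= 3:
            return 2
        c = n - 1
        if c % 2 == 0:
            c -= 1                       # start at the largest odd number below n
        while not is_probable_prime(c):
            c -= 2
        return c

    print(prev_prime(2**63))             # a prime just below 2^63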

Algorithm for generating a size k error-correcting code on n bits

I want to generate a code on n bits for k different inputs that I want to classify. The main requirement of this code is the error-correcting criterion: the minimum pairwise distance between any two encodings of different inputs should be maximized. I don't need it to be exact; approximate will do, and ease of use and speed of computational implementation are a priority too.
In general, n will be in the hundreds, k in the dozens.
Also, is there a reasonably tight bound on the minimum Hamming distance between k different n-bit binary encodings?
The problem of finding the exact best error-correcting code for given parameters is very hard; even approximately best codes are hard to find. On top of that, some codes don't have any decent decoding algorithms, while for others the decoding problem is quite tricky.
However, you're asking about a particular range of parameters where n ≫ k, where if I understand correctly you want a k-dimensional code of length n. (So that k bits are encoded in n bits.) In this range, first, a random code is likely to have very good minimum distance. The only problem is that decoding is anywhere from impractical to intractable, and actually calculating the minimum distance is not that easy either.
Second, if you want an explicit code for the case n ≫ k, then you can do reasonably well with a BCH code with q=2. As the Wikipedia page explains, there is a good decoding algorithm for BCH codes.
Concerning upper bounds for the minimum Hamming distance, in the range n ≫ k you should start with the Hamming bound, also known as the volume bound or the sphere-packing bound. The idea of the bound is simple and beautiful: if the minimum distance is t, then the code can correct errors up to distance floor((t-1)/2). If you can correct errors out to some radius, it means that the Hamming balls of that radius don't overlap. On the other hand, the total number of possible words is 2^n, so if you divide that by the number of points in one Hamming ball (which in the binary case is a sum of binomial coefficients), you get an upper bound on the number of codewords. It is possible to beat this bound, but for large minimum distance it's not easy. In this regime it's a very good bound.
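
Evaluating the bound is a few lines in Python (math.comb is standard library; the function name is mine):

    from math import comb

    def hamming_bound(n, t):
        # Max number of codewords of length n with minimum distance t:
        # 2^n divided by the volume of a Hamming ball of radius (t-1)//2.
        e = (t - 1) // 2
        ball = sum(comb(n, i) for i in range(e + 1))
        return 2**n // ball

    print(hamming_bound(255, 21))   # upper bound for length 255, minimum distance 21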
