Express a given number as a sum of four squares - algorithm

I am looking for an algorithm that expresses a given number as a sum of (up to) four squares.
Examples
       120 = 82 + 62 + 42 + 22
       6 = 02 + 12 + 12 + 22
       20 = 42 + 22 + 02+ 02
My approach
Take the square root and repeat this repeatedly for the remainder:
while (count != 4) {
root = (int) Math.sqrt(N)
N -= root * root
count++
}
But this fails when N is 23, even though there is a solution:
       32 + 32+ 22 + 12
Question
Is there any other algorithm to do that?
Is it always possible?

###Always possible?
Yes, the Lagrange's four square theorem states that:
every natural number can be represented as the sum of four integer squares.
It has been proved in several ways.
###Algorithm
There are some smarter algorithms, but I would suggest the following algorithm:
Factorise the number into prime factors. They don't have to be prime, but the smaller they are, the better: so primes are best. Then solve the task for each of these factors as below, and combine any resulting 4 squares with the previously found 4 squares with the Euler's four-square identity.
         (a2 + b2 + c2 + d2)
(A2 + B2 + C2 + D2) =
               (aA + bB + cC + dD)2 +
               (aB − bA + cD − dC)2 +
               (aC − bD − cA + dB)2 +
               (aD + bC − cB − dA)2
Given a number n (one of the factors mentioned above), get the greatest square that is not greater than n, and see if n minus this square can be written as the sum of three squares using the Legendre's three-square theorem: it is possible, if and only when this number is NOT of the following form:
        4a(8b+7)
If this square is not found suitable, try the next smaller one, ... until you find one. It guaranteed there will be one, and most are found within a few retries.
Try to find an actual second square term in the same way as in step 1, but now test its viability using Fermat's theorem on sums of two squares which in extension means that:
if all the prime factors of n congruent to 3 modulo 4 occur to an even exponent, then n is expressible as a sum of two squares. The converse also holds.
If this square is not found suitable, try the next smaller one, ... until you find one. It's guaranteed there will be one.
Now we have a remainder after subtracting two squares. Try subtracting a third square until that yields another square, which means we have a solution. This step can be improved by first factoring out the largest square divisor. Then when the two square terms are identified, each can then be multiplied again by the square root of that square divisor.
This is roughly the idea. For finding prime factors there are several solutions. Below I will just use the Sieve of Eratosthenes.
This is JavaScript code, so you can run it immediately -- it will produce a random number as input and display it as the sum of four squares:
function divisor(n, factor) {
var divisor = 1;
while (n % factor == 0) {
n = n / factor;
divisor = divisor * factor;
}
return divisor;
}
function getPrimesUntil(n) {
// Prime sieve algorithm
var range = Math.floor(Math.sqrt(n)) + 1;
var isPrime = Array(n).fill(1);
var primes = [2];
for (var m = 3; m < range; m += 2) {
if (isPrime[m]) {
primes.push(m);
for (var k = m * m; k <= n; k += m) {
isPrime[k] = 0;
}
}
}
for (var m = range + 1 - (range % 2); m <= n; m += 2) {
if (isPrime[m]) primes.push(m);
}
return {
primes: primes,
factorize: function (n) {
var p, count, primeFactors;
// Trial division algorithm
if (n < 2) return [];
primeFactors = [];
for (p of this.primes) {
count = 0;
while (n % p == 0) {
count++;
n /= p;
}
if (count) primeFactors.push({value: p, count: count});
}
if (n > 1) {
primeFactors.push({value: n, count: 1});
}
return primeFactors;
}
}
}
function squareTerms4(n) {
var n1, n2, n3, n4, sq, sq1, sq2, sq3, sq4, primes, factors, f, f3, factors3, ok,
res1, res2, res3, res4;
primes = getPrimesUntil(n);
factors = primes.factorize(n);
res1 = n > 0 ? 1 : 0;
res2 = res3 = res4 = 0;
for (f of factors) { // For each of the factors:
n1 = f.value;
// 1. Find a suitable first square
for (sq1 = Math.floor(Math.sqrt(n1)); sq1>0; sq1--) {
n2 = n1 - sq1*sq1;
// A number can be written as a sum of three squares
// <==> it is NOT of the form 4^a(8b+7)
if ( (n2 / divisor(n2, 4)) % 8 !== 7 ) break; // found a possibility
}
// 2. Find a suitable second square
for (sq2 = Math.floor(Math.sqrt(n2)); sq2>0; sq2--) {
n3 = n2 - sq2*sq2;
// A number can be written as a sum of two squares
// <==> all its prime factors of the form 4a+3 have an even exponent
factors3 = primes.factorize(n3);
ok = true;
for (f3 of factors3) {
ok = (f3.value % 4 != 3) || (f3.count % 2 == 0);
if (!ok) break;
}
if (ok) break;
}
// To save time: extract the largest square divisor from the previous factorisation:
sq = 1;
for (f3 of factors3) {
sq *= Math.pow(f3.value, (f3.count - f3.count % 2) / 2);
f3.count = f3.count % 2;
}
n3 /= sq*sq;
// 3. Find a suitable third square
sq4 = 0;
// b. Find square for the remaining value:
for (sq3 = Math.floor(Math.sqrt(n3)); sq3>0; sq3--) {
n4 = n3 - sq3*sq3;
// See if this yields a sum of two squares:
sq4 = Math.floor(Math.sqrt(n4));
if (n4 == sq4*sq4) break; // YES!
}
// Incorporate the square divisor back into the step-3 result:
sq3 *= sq;
sq4 *= sq;
// 4. Merge this quadruple of squares with any previous
// quadruple we had, using the Euler square identity:
while (f.count--) {
[res1, res2, res3, res4] = [
Math.abs(res1*sq1 + res2*sq2 + res3*sq3 + res4*sq4),
Math.abs(res1*sq2 - res2*sq1 + res3*sq4 - res4*sq3),
Math.abs(res1*sq3 - res2*sq4 - res3*sq1 + res4*sq2),
Math.abs(res1*sq4 + res2*sq3 - res3*sq2 - res4*sq1)
];
}
}
// Return the 4 squares in descending order (for convenience):
return [res1, res2, res3, res4].sort( (a,b) => b-a );
}
// Produce the result for some random input number
var n = Math.floor(Math.random() * 1000000);
var solution = squareTerms4(n);
// Perform the sum of squares to see it is correct:
var check = solution.reduce( (a,b) => a+b*b, 0 );
if (check !== n) throw "FAILURE: difference " + n + " - " + check;
// Print the result
console.log(n + ' = ' + solution.map( x => x+'²' ).join(' + '));
The article by by Michael Barr on the subject probably represents a more time-efficient method, but the text is more intended as a proof than an algorithm. However, if you need more time-efficiency you could consider that, together with a more efficient factorisation algorithm.

It's always possible -- it's a theorem in number theory called "Lagrange's four square theorem."
To solve it efficiently: the paper Randomized algorithms in number theory (Rabin, Shallit) gives a method that runs in expected O((log n)^2) time.
There is interesting discussion about the implementation here: https://math.stackexchange.com/questions/483101/rabin-and-shallit-algorithm
Found via Wikipedia:Langrange's four square theorem.

Here is solution , Simple 4 loops
max = square_root(N)
for(int i=0;i<=max;i++)
for(int j=0;j<=max;j++)
for(int k=0;k<=max;k++)
for(int l=0;l<=max;l++)
if(i*i+j*j+k*k+l*l==N){
found
break;
}
So you can test for any numbers. You can use break condition after two loops if sum exceeds then break it.

const fourSquares = (n) => {
const result = [];
for (let i = 0; i <= n; i++) {
for (let j = 0; j <= n; j++) {
for (let k = 0; k <= n; k++) {
for (let l = 0; l <= n; l++) {
if (i * i + j * j + k * k + l * l === n) {
result.push(i, j, k, l);
return result;
}
}
}
}
}
return result;
}
It's running too long
const fourSquares = (n) => {
const result = [];
for (let i = 0; i <= n; i++) {
for (let j = 0; j <= (n - i * i); j++) {
for (let k = 0; k <= (n - i * i - j * j); k++) {
for (let l = 0; l <= (n - i * i - j * j - k * k); l++) {
if (i * i + j * j + k * k + l * l === n) {
result.push(i, j, k, l);
return result;
}
}
}
}
}
return result;
}

const fourSquares = (n) => {
const result = [];
for (let i = 0; i * i <= n; i++) {
for (let j = 0; j * j <= n; j++) {
for (let k = 0; k * k <= n; k++) {
for (let l = 0; l * l <= n; l++) {
if (i * i + j * j + k * k + l * l === n) {
result.push(i, j, k, l);
return result;
}
}
}
}
}
return result;
}
const fourSquares = (n) => {
let a = Math.sqrt(n);
let b = Math.sqrt(n - a * a);
let c = Math.sqrt(n - a * a - b * b);
let d = Math.sqrt(n - a * a - b * b - c * c);
if (n === a * a + b * b + c * c + d * d) {
return [a, b, c, d];
}
}

Related

Finding sum of geometric sequence with modulo 10^9+7 with my program

The problem is given as:
Output the answer of (A^1+A^2+A^3+...+A^K) modulo 1,000,000,007, where 1≤ A, K ≤ 10^9, and A and K must be an integer.
I am trying to write a program to compute the above question. I have tried using the formula for geometric sequence, then applying the modulo on the answer. Since the results must be an integer as well, finding modulo inverse is not required.
Below is the code I have now, its in pascal
Var
a,k,i:longint;
power,sum: int64;
Begin
Readln(a,k);
power := 1;
For i := 1 to k do
power := ((power mod 1000000007) * a) mod 1000000007;
sum := a * (power-1) div (a-1);
Writeln(sum mod 1000000007);
End.
This task came from my school, they do not give away their test data to the students. Hence I do not know why or where my program is wrong. I only know that my program outputs the wrong answer for their test data.
If you want to do this without calculating a modular inverse, you can calculate it recursively using:
1+ A + A2 + A3 + ... + Ak
= 1 + (A + A2)(1 + A2 + (A2)2 + ... + (A2)k/2-1)
That's for even k. For odd k:
1+ A + A2 + A3 + ... + Ak
= (1 + A)(1 + A2 + (A2)2 + ... + (A2)(k-1)/2)
Since k is divided by 2 in each recursive call, the resulting algorithm has O(log k) complexity. In java:
static int modSumAtoAk(int A, int k, int mod)
{
return (modSum1ToAk(A, k, mod) + mod-1) % mod;
}
static int modSum1ToAk(int A, int k, int mod)
{
long sum;
if (k < 5) {
//k is small -- just iterate
sum = 0;
long x = 1;
for (int i=0; i<=k; ++i) {
sum = (sum+x) % mod;
x = (x*A) % mod;
}
return (int)sum;
}
//k is big
int A2 = (int)( ((long)A)*A % mod );
if ((k%2)==0) {
// k even
sum = modSum1ToAk(A2, (k/2)-1, mod);
sum = (sum + sum*A) % mod;
sum = ((sum * A) + 1) % mod;
} else {
// k odd
sum = modSum1ToAk(A2, (k-1)/2, mod);
sum = (sum + sum*A) % mod;
}
return (int)sum;
}
Note that I've been very careful to make sure that each product is done in 64 bits, and to reduce by the modulus after each one.
With a little math, the above can be converted to an iterative version that doesn't require any storage:
static int modSumAtoAk(int A, int k, int mod)
{
// first, we calculate the sum of all 1... A^k
// we'll refer to that as SUM1 in comments below
long fac=1;
long add=0;
//INVARIANT: SUM1 = add + fac*(sum 1...A^k)
//this will remain true as we change k
while (k > 0) {
//above INVARIANT is true here, too
long newmul, newadd;
if ((k%2)==0) {
//k is even. sum 1...A^k = 1+A*(sum 1...A^(k-1))
newmul = A;
newadd = 1;
k-=1;
} else {
//k is odd.
newmul = A+1L;
newadd = 0;
A = (int)(((long)A) * A % mod);
k = (k-1)/2;
}
//SUM1 = add + fac * (newadd + newmul*(sum 1...Ak))
// = add+fac*newadd + fac*newmul*(sum 1...Ak)
add = (add+fac*newadd) % mod;
fac = (fac*newmul) % mod;
//INVARIANT is restored
}
// k == 0
long sum1 = fac + add;
return (int)((sum1 + mod -1) % mod);
}

Count the number of ways a given sequence of digits can be split with each part at most K

Hi guys I'm practicing dynamic programming and came across the following problem:
Given a number K, 0 <= K <= 10^100, a sequence of digits N, what is the number of possible ways of dividing N so that each part is at most K?
Input:
K = 8
N = 123
Output: 1
Explanation:
123
1-23
12-3
1-2-3
Are all possibilities of spliting N and only the last one is valid...
What I have achieved so far:
Let Dp[i] = the number of valid ways of dividing N, using i first digits.
Given a state, i must use the previous answer to compute new answers, we have 2 possibilities:
Use dp[i-1] + number of valid ways that split the digit i
Use dp[i-1] + number of valid ways that not split the digit i
But I'm stuck there and I don't know what to do
Thanks
Using dynamic programming implies that you need to think about the problem in terms of subproblems.
Let's denote by N[i...] the suffix of N starting at index i (for instance, with N = 45678955, we have N[3...] = 78955)
Let's denote by dp[i] the number of possible ways of dividing N[i...] so that each part is at most K.
We will also use a small function, max_part_len(N, K, i) which will represent the maximum length of a 'part' starting at i. For instance, with N = 45678955, K = 37, i = 3, we have max_part_len(N, K, i) = 1 because 7 < 37 but 78 > 37.
Now we can write the recurrence (or induction) relation on dp[i].
dp[i] = sum_(j from 1 to max_part_len(N, K, i)) dp[i+j]
This relation means that the the number of possible ways of dividing N[i...] so that each part is at most K, is:
The sum of the the number of possible ways of dividing N[i+j...] so that each part is at most K, for each j such that N[i...j] <= k.
From there the algorithm is quite straight forward if you understood the basics of dynamic programming, I leave this part to you ;-)
I think we can also use divide and conquer. Let f(l, r) represent the number of ways to divide the range of digits indexed from l to r, so that each part is at most k. Then divide the string, 45678955 in two:
4567 8955
and the result would be
f(4567) * f(8955)
plus a division with a part that includes at least one from each side of the split, so each left extension paired with all right extensions. Say k was 1000. Then
f(456) * 1 * f(955) + // 78
f(456) * 1 * f(55) + // 789
f(45) * 1 * f(955) // 678
where each one of the calls to f performs a similar divide and conquer.
Here's JavaScript code comparing a recursive (top-down) implementation of m.raynal's algorithm with this divide and conquer:
function max_part_len(N, K, i){
let d = 0;
let a = 0;
while (a <= K && d <= N.length - i){
d = d + 1;
a = Number(N.substr(i, d));
}
return d - 1;
}
// m.raynal's algorithm
function f(N, K, i, memo={}){
let key = String([N, i])
if (memo.hasOwnProperty(key))
return memo[key];
if (i == N.length)
return 1
if (i == N.length - 1)
return (Number(N[i]) <= K) & 1
let s = 0;
for (let j=1; j<=max_part_len(N, K, i); j++)
s = s + f(N, K, i + j, memo);
return memo[key] = s;
}
// divide and conquer
function g(N, K, memo={}){
if (memo.hasOwnProperty(N))
return memo[N];
if (!N)
return memo[N] = 1;
if (N.length == 1)
return memo[N] = (Number(N) <= K) & 1;
let mid = Math.floor(N.length / 2);
let left = g(N.substr(0, mid), K);
let right = g(N.substr(mid), K);
let s = 0;
let i = mid - 1;
let j = mid;
let str = N.substring(i, j + 1);
while (i >= 0 && Number(str) <= K){
if (j == N.length){
if (i == 0){
break;
} else{
i = i - 1;
j = mid;
str = N.substring(i, j + 1);
continue
}
}
let l = g(N.substring(0, i), K, memo);
let r = g(N.substring(j + 1, N.length, memo), K);
s = s + l * r;
j = j + 1;
str = N.substring(i, j + 1);
if (Number(str) > K){
j = mid;
i = i - 1;
str = N.substring(i, j + 1);
}
}
return memo[N] = left * right + s;
}
let start = new Date;
for (let i=5; i<100000; i++){
let k = Math.ceil(Math.random() * i)
let ii = String(i);
let ff = f(ii, k, 0);
}
console.log(`Running f() 100,000 times took ${ (new Date - start)/1000 } sec`)
start = new Date;
for (let i=5; i<100000; i++){
let k = Math.ceil(Math.random() * i)
let ii = String(i);
let gg = g(ii, k);
}
console.log(`Running g() 100,000 times took ${ (new Date - start)/1000 } sec`)
start = new Date;
for (let i=5; i<100000; i++){
let k = Math.ceil(Math.random() * i)
let ii = String(i);
let ff = f(ii, k, 0);
let gg = g(ii, k);
if (ff != gg){
console.log("Mismatch found.", ii, k, ff, gg);
break;
}
}
console.log(`No discrepancies found between f() and g(). ${ (new Date - start)/1000 } sec`)

Count of co-prime pairs from two arrays in less than O(n^2) complexity

I came to this problem in a challenge.
There are two arrays A and B both of size of N and we need to return the count of pairs (A[i],B[j]) where gcd(A[i],B[j])==1 and A[i] != B[j].
I could only think of brute force approach which exceeded time limit for few test cases.
for(int i=0; i<n; i++) {
for(int j=0; j<n; j++) {
if(__gcd(a[i],b[j])==1) {
printf("%d %d\n", a[i], b[j]);
}
}
}
Can you advice time efficient algorithm to solve this.
Edit: Not able to share question link as this was from a hiring challenge. Adding the constraints and input/output format as I remember.
Input -
First line will contain N, the number of elements present in both arrays.
Second line will contain N space separated integers, elements of array A.
Third line will contain N space separated integers, elements of array B.
Output -
The count of pairs A[i],A[j] as per the conditions.
Constraints -
1 <= N <= 10^5
1 < A[i],B[j] <= 10^9 where i,j < N
The first step is to use Eratosthenes sieve to calculate the prime numbers up to sqrt(10^9). This sieve can then be used to quickly find all prime factors of any number less than 10^9 (see the getPrimeFactors(...) function in the code sample below).
Next, for each A[i] with prime factors p0, p1, ..., pk, we compute all possible sub-products X - p0, p1, p0p1, p2, p0p2, p1p2, p0p1p2, p3, p0p3, ..., p0p1p2...pk and count them in map cntp[X]. Effectively, the map cntp[X] tells us the number of elements A[i] divisible by X, where X is a product of prime numbers to the power of 0 or 1. So for example, for the number A[i] = 12, the prime factors are 2, 3. We will count cntp[2]++, cntp[3]++ and cntp[6]++.
Finally, for each B[j] with prime factors p0, p1, ..., pk, we again compute all possible sub-products X and use the Inclusion-exclusion principle to count all non-coprime pairs C_j (i.e. the number of A[i]s that share at least one prime factor with B[j]). The numbers C_j are then subtracted from the total number of pairs - N*N to get the final answer.
Note: the Inclusion-exclusion principle looks like this:
C_j = (cntp[p0] + cntp[p1] + ... + cntp[pk]) -
(cntp[p0p1] + cntp[p0p2] + ... + cntp[pk-1pk]) +
(cntp[p0p1p2] + cntp[p0p1p3] + ... + cntp[pk-2pk-1pk]) -
...
and accounts for the fact that in cntp[X] and cntp[Y] we could have counted the same number A[i] twice, given that it is divisible by both X and Y.
Here is a possible C++ implementation of the algorithm, which produces the same results as the naive O(n^2) algorithm by OP:
// get prime factors of a using pre-generated sieve
std::vector<int> getPrimeFactors(int a, const std::vector<int> & primes) {
std::vector<int> f;
for (auto p : primes) {
if (p > a) break;
if (a % p == 0) {
f.push_back(p);
do {
a /= p;
} while (a % p == 0);
}
}
if (a > 1) f.push_back(a);
return f;
}
// find coprime pairs A_i and B_j
// A_i and B_i <= 1e9
void solution(const std::vector<int> & A, const std::vector<int> & B) {
// generate prime sieve
std::vector<int> primes;
primes.push_back(2);
for (int i = 3; i*i <= 1e9; ++i) {
bool isPrime = true;
for (auto p : primes) {
if (i % p == 0) {
isPrime = false;
break;
}
}
if (isPrime) {
primes.push_back(i);
}
}
int N = A.size();
struct Entry {
int n = 0;
int64_t p = 0;
};
// cntp[X] - number of times the product X can be expressed
// with prime factors of A_i
std::map<int64_t, int64_t> cntp;
for (int i = 0; i < N; i++) {
auto f = getPrimeFactors(A[i], primes);
// count possible products using non-repeating prime factors of A_i
std::vector<Entry> x;
x.push_back({ 0, 1 });
for (auto p : f) {
int k = x.size();
for (int i = 0; i < k; ++i) {
int nn = x[i].n + 1;
int64_t pp = x[i].p*p;
++cntp[pp];
x.push_back({ nn, pp });
}
}
}
// use Inclusion–exclusion principle to count non-coprime pairs
// and subtract them from the total number of prairs N*N
int64_t cnt = N; cnt *= N;
for (int i = 0; i < N; i++) {
auto f = getPrimeFactors(B[i], primes);
std::vector<Entry> x;
x.push_back({ 0, 1 });
for (auto p : f) {
int k = x.size();
for (int i = 0; i < k; ++i) {
int nn = x[i].n + 1;
int64_t pp = x[i].p*p;
x.push_back({ nn, pp });
if (nn % 2 == 1) {
cnt -= cntp[pp];
} else {
cnt += cntp[pp];
}
}
}
}
printf("cnt = %d\n", (int) cnt);
}
Live example
I cannot estimate the complexity analytically, but here are some profiling result on my laptop for different N and uniformly random A[i] and B[j]:
For N = 1e2, takes ~0.02 sec
For N = 1e3, takes ~0.05 sec
For N = 1e4, takes ~0.38 sec
For N = 1e5, takes ~3.80 sec
For comparison, the O(n^2) approach takes:
For N = 1e2, takes ~0.00 sec
For N = 1e3, takes ~0.15 sec
For N = 1e4, takes ~15.1 sec
For N = 1e5, takes too long, didn't wait to finish
Python Implementation:
import math
from collections import defaultdict
def sieve(MAXN):
spf = [0 for i in range(MAXN)]
spf[1] = 1
for i in range(2, MAXN):
spf[i] = i
for i in range(4, MAXN, 2):
spf[i] = 2
for i in range(3, math.ceil(math.sqrt(MAXN))):
if (spf[i] == i):
for j in range(i * i, MAXN, i):
if (spf[j] == j):
spf[j] = i
return(spf)
def getFactorization(x,spf):
ret = list()
while (x != 1):
ret.append(spf[x])
x = x // spf[x]
return(list(set(ret)))
def coprime_pairs(N,A,B):
MAXN=max(max(A),max(B))+1
spf=sieve(MAXN)
cntp=defaultdict(int)
for i in range(N):
f=getFactorization(A[i],spf)
x=[[0,1]]
for p in f:
k=len(x)
for i in range(k):
nn=x[i][0]+1
pp=x[i][1]*p
cntp[pp]+=1
x.append([nn,pp])
cnt=0
for i in range(N):
f=getFactorization(B[i],spf)
x=[[0,1]]
for p in f:
k=len(x)
for i in range(k):
nn=x[i][0]+1
pp=x[i][1]*p
x.append([nn,pp])
if(nn%2==1):
cnt+=cntp[pp]
else:
cnt-=cntp[pp]
return(N*N-cnt)
import random
N=10001
A=[random.randint(1,N) for _ in range(N)]
B=[random.randint(1,N) for _ in range(N)]
print(coprime_pairs(N,A,B))

Discover long patterns

Given a sorted list of numbers, I would like to find the longest subsequence where the differences between successive elements are geometrically increasing. So if the list is
1, 2, 3, 4, 7, 15, 27, 30, 31, 81
then the subsequence is 1, 3, 7, 15, 31. Alternatively consider 1, 2, 5, 6, 11, 15, 23, 41, 47 which has subsequence 5, 11, 23, 47 with a = 3 and k = 2.
Can this be solved in O(n2) time? Where n is the length of the list.
I am interested both in the general case where the progression of differences is ak, ak2, ak3, etc., where both a and k are integers, and in the special case where a = 1, so the progression of difference is k, k2, k3, etc.
Update
I have made an improvement of the algorithm that it takes an average of O(M + N^2) and memory needs of O(M+N). Mainly is the same that the protocol described below, but to calculate the possible factors A,K for ech diference D, I preload a table. This table takes less than a second to be constructed for M=10^7.
I have made a C implementation that takes less than 10minutes to solve N=10^5 diferent random integer elements.
Here is the source code in C: To execute just do: gcc -O3 -o findgeo findgeo.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <memory.h>
#include <time.h>
struct Factor {
int a;
int k;
struct Factor *next;
};
struct Factor *factors = 0;
int factorsL=0;
void ConstructFactors(int R) {
int a,k,C;
int R2;
struct Factor *f;
float seconds;
clock_t end;
clock_t start = clock();
if (factors) free(factors);
factors = malloc (sizeof(struct Factor) *((R>>1) + 1));
R2 = R>>1 ;
for (a=0;a<=R2;a++) {
factors[a].a= a;
factors[a].k=1;
factors[a].next=NULL;
}
factorsL=R2+1;
R2 = floor(sqrt(R));
for (k=2; k<=R2; k++) {
a=1;
C=a*k*(k+1);
while (C<R) {
C >>= 1;
f=malloc(sizeof(struct Factor));
*f=factors[C];
factors[C].a=a;
factors[C].k=k;
factors[C].next=f;
a++;
C=a*k*(k+1);
}
}
end = clock();
seconds = (float)(end - start) / CLOCKS_PER_SEC;
printf("Construct Table: %f\n",seconds);
}
void DestructFactors() {
int i;
struct Factor *f;
for (i=0;i<factorsL;i++) {
while (factors[i].next) {
f=factors[i].next->next;
free(factors[i].next);
factors[i].next=f;
}
}
free(factors);
factors=NULL;
factorsL=0;
}
int ipow(int base, int exp)
{
int result = 1;
while (exp)
{
if (exp & 1)
result *= base;
exp >>= 1;
base *= base;
}
return result;
}
void findGeo(int **bestSolution, int *bestSolutionL,int *Arr, int L) {
int i,j,D;
int mustExistToBeBetter;
int R=Arr[L-1]-Arr[0];
int *possibleSolution;
int possibleSolutionL=0;
int exp;
int NextVal;
int idx;
int kMax,aMax;
float seconds;
clock_t end;
clock_t start = clock();
kMax = floor(sqrt(R));
aMax = floor(R/2);
ConstructFactors(R);
*bestSolutionL=2;
*bestSolution=malloc(0);
possibleSolution = malloc(sizeof(int)*(R+1));
struct Factor *f;
int *H=malloc(sizeof(int)*(R+1));
memset(H,0, sizeof(int)*(R+1));
for (i=0;i<L;i++) {
H[ Arr[i]-Arr[0] ]=1;
}
for (i=0; i<L-2;i++) {
for (j=i+2; j<L; j++) {
D=Arr[j]-Arr[i];
if (D & 1) continue;
f = factors + (D >>1);
while (f) {
idx=Arr[i] + f->a * f->k - Arr[0];
if ((f->k <= kMax)&& (f->a<aMax)&&(idx<=R)&&H[idx]) {
if (f->k ==1) {
mustExistToBeBetter = Arr[i] + f->a * (*bestSolutionL);
} else {
mustExistToBeBetter = Arr[i] + f->a * f->k * (ipow(f->k,*bestSolutionL) - 1)/(f->k-1);
}
if (mustExistToBeBetter< Arr[L-1]+1) {
idx= floor(mustExistToBeBetter - Arr[0]);
} else {
idx = R+1;
}
if ((idx<=R)&&H[idx]) {
possibleSolution[0]=Arr[i];
possibleSolution[1]=Arr[i] + f->a*f->k;
possibleSolution[2]=Arr[j];
possibleSolutionL=3;
exp = f->k * f->k * f->k;
NextVal = Arr[j] + f->a * exp;
idx=NextVal - Arr[0];
while ( (idx<=R) && H[idx]) {
possibleSolution[possibleSolutionL]=NextVal;
possibleSolutionL++;
exp = exp * f->k;
NextVal = NextVal + f->a * exp;
idx=NextVal - Arr[0];
}
if (possibleSolutionL > *bestSolutionL) {
free(*bestSolution);
*bestSolution = possibleSolution;
possibleSolution = malloc(sizeof(int)*(R+1));
*bestSolutionL=possibleSolutionL;
kMax= floor( pow (R, 1/ (*bestSolutionL) ));
aMax= floor(R / (*bestSolutionL));
}
}
}
f=f->next;
}
}
}
if (*bestSolutionL == 2) {
free(*bestSolution);
possibleSolutionL=0;
for (i=0; (i<2)&&(i<L); i++ ) {
possibleSolution[possibleSolutionL]=Arr[i];
possibleSolutionL++;
}
*bestSolution = possibleSolution;
*bestSolutionL=possibleSolutionL;
} else {
free(possibleSolution);
}
DestructFactors();
free(H);
end = clock();
seconds = (float)(end - start) / CLOCKS_PER_SEC;
printf("findGeo: %f\n",seconds);
}
int compareInt (const void * a, const void * b)
{
return *(int *)a - *(int *)b;
}
int main(void) {
int N=100000;
int R=10000000;
int *A = malloc(sizeof(int)*N);
int *Sol;
int SolL;
int i;
int *S=malloc(sizeof(int)*R);
for (i=0;i<R;i++) S[i]=i+1;
for (i=0;i<N;i++) {
int r = rand() % (R-i);
A[i]=S[r];
S[r]=S[R-i-1];
}
free(S);
qsort(A,N,sizeof(int),compareInt);
/*
int step = floor(R/N);
A[0]=1;
for (i=1;i<N;i++) {
A[i]=A[i-1]+step;
}
*/
findGeo(&Sol,&SolL,A,N);
printf("[");
for (i=0;i<SolL;i++) {
if (i>0) printf(",");
printf("%d",Sol[i]);
}
printf("]\n");
printf("Size: %d\n",SolL);
free(Sol);
free(A);
return EXIT_SUCCESS;
}
Demostration
I will try to demonstrate that the algorithm that I proposed is in average for an equally distributed random sequence. I’m not a mathematician and I am not used to do this kind of demonstrations, so please fill free to correct me any error that you can see.
There are 4 indented loops, the two firsts are the N^2 factor. The M is for the calculation of the possible factors table).
The third loop is executed only once in average for each pair. You can see this checking the size of the pre-calculated factors table. It’s size is M when N->inf. So the average steps for each pair is M/M=1.
So the proof happens to check that the forth loop. (The one that traverses the good made sequences is executed less that or equal O(N^2) for all the pairs.
To demonstrate that, I will consider two cases: one where M>>N and other where M ~= N. Where M is the maximum difference of the initial array: M= S(n)-S(1).
For the first case, (M>>N) the probability to find a coincidence is p=N/M. To start a sequence, it must coincide the second and the b+1 element where b is the length of the best sequence until now. So the loop will enter times. And the average length of this series (supposing an infinite series) is . So the total number of times that the loop will be executed is . And this is close to 0 when M>>N. The problem here is when M~=N.
Now lets consider this case where M~=N. Lets consider that b is the best sequence length until now. For the case A=k=1, then the sequence must start before N-b, so the number of sequences will be N-b, and the times that will go for the loop will be a maximum of (N-b)*b.
For A>1 and k=1 we can extrapolate to where d is M/N (the average distance between numbers). If we add for all A’s from 1 to dN/b then we see a top limit of:
For the cases where k>=2, we see that the sequence must start before , So the loop will enter an average of and adding for all As from 1 to dN/k^b, it gives a limit of
Here, the worst case is when b is minimum. Because we are considering minimum series, lets consider a very worst case of b= 2 so the number of passes for the 4th loop for a given k will be less than
.
And if we add all k’s from 2 to infinite will be:
So adding all the passes for k=1 and k>=2, we have a maximum of:
Note that d=M/N=1/p.
So we have two limits, One that goes to infinite when d=1/p=M/N goes to 1 and other that goes to infinite when d goes to infinite. So our limit is the minimum of both, and the worst case is when both equetions cross. So if we solve the equation:
we see that the maximum is when d=1.353
So it is demonstrated that the forth loops will be processed less than 1.55N^2 times in total.
Of course, this is for the average case. For the worst case I am not able to find a way to generate series whose forth loop are higher than O(N^2), and I strongly believe that they does not exist, but I am not a mathematician to prove it.
Old Answer
Here is a solution in average of O((n^2)*cube_root(M)) where M is the difference between the first and last element of the array. And memory requirements of O(M+N).
1.- Construct an array H of length M so that M[i - S[0]]=true if i exists in the initial array and false if it does not exist.
2.- For each pair in the array S[j], S[i] do:
2.1 Check if it can be the first and third elements of a possible solution. To do so, calculate all possible A,K pairs that meet the equation S(i) = S(j) + AK + AK^2. Check this SO question to see how to solve this problem. And check that exist the second element: S[i]+ A*K
2.2 Check also that exist the element one position further that the best solution that we have. For example, if the best solution that we have until now is 4 elements long then check that exist the element A[j] + AK + AK^2 + AK^3 + AK^4
2.3 If 2.1 and 2.2 are true, then iterate how long is this series and set as the bestSolution until now is is longer that the last.
Here is the code in javascript:
function getAKs(A) {
if (A / 2 != Math.floor(A / 2)) return [];
var solution = [];
var i;
var SR3 = Math.pow(A, 1 / 3);
for (i = 1; i <= SR3; i++) {
var B, C;
C = i;
B = A / (C * (C + 1));
if (B == Math.floor(B)) {
solution.push([B, C]);
}
B = i;
C = (-1 + Math.sqrt(1 + 4 * A / B)) / 2;
if (C == Math.floor(C)) {
solution.push([B, C]);
}
}
return solution;
}
function getBestGeometricSequence(S) {
var i, j, k;
var bestSolution = [];
var H = Array(S[S.length-1]-S[0]);
for (i = 0; i < S.length; i++) H[S[i] - S[0]] = true;
for (i = 0; i < S.length; i++) {
for (j = 0; j < i; j++) {
var PossibleAKs = getAKs(S[i] - S[j]);
for (k = 0; k < PossibleAKs.length; k++) {
var A = PossibleAKs[k][0];
var K = PossibleAKs[k][17];
var mustExistToBeBetter;
if (K==1) {
mustExistToBeBetter = S[j] + A * bestSolution.length;
} else {
mustExistToBeBetter = S[j] + A * K * (Math.pow(K,bestSolution.length) - 1)/(K-1);
}
if ((H[S[j] + A * K - S[0]]) && (H[mustExistToBeBetter - S[0]])) {
var possibleSolution=[S[j],S[j] + A * K,S[i]];
exp = K * K * K;
var NextVal = S[i] + A * exp;
while (H[NextVal - S[0]] === true) {
possibleSolution.push(NextVal);
exp = exp * K;
NextVal = NextVal + A * exp;
}
if (possibleSolution.length > bestSolution.length) {
bestSolution = possibleSolution;
}
}
}
}
}
return bestSolution;
}
//var A= [ 1, 2, 3,5,7, 15, 27, 30,31, 81];
var A=[];
for (i=1;i<=3000;i++) {
A.push(i);
}
var sol=getBestGeometricSequence(A);
$("#result").html(JSON.stringify(sol));
You can check the code here: http://jsfiddle.net/6yHyR/1/
I maintain the other solution because I believe that it is still better when M is very big compared to N.
Just to start with something, here is a simple solution in JavaScript:
var input = [0.7, 1, 2, 3, 4, 7, 15, 27, 30, 31, 81],
output = [], indexes, values, i, index, value, i_max_length,
i1, i2, i3, j1, j2, j3, difference12a, difference23a, difference12b, difference23b,
scale_factor, common_ratio_a, common_ratio_b, common_ratio_c,
error, EPSILON = 1e-9, common_ratio_is_integer,
resultDiv = $("#result");
for (i1 = 0; i1 < input.length - 2; ++i1) {
for (i2 = i1 + 1; i2 < input.length - 1; ++i2) {
scale_factor = difference12a = input[i2] - input[i1];
for (i3 = i2 + 1; i3 < input.length; ++i3) {
difference23a = input[i3] - input[i2];
common_ratio_1a = difference23a / difference12a;
common_ratio_2a = Math.round(common_ratio_1a);
error = Math.abs((common_ratio_2a - common_ratio_1a) / common_ratio_1a);
common_ratio_is_integer = error < EPSILON;
if (common_ratio_2a > 1 && common_ratio_is_integer) {
indexes = [i1, i2, i3];
j1 = i2;
j2 = i3
difference12b = difference23a;
for (j3 = j2 + 1; j3 < input.length; ++j3) {
difference23b = input[j3] - input[j2];
common_ratio_1b = difference23b / difference12b;
common_ratio_2b = Math.round(common_ratio_1b);
error = Math.abs((common_ratio_2b - common_ratio_1b) / common_ratio_1b);
common_ratio_is_integer = error < EPSILON;
if (common_ratio_is_integer && common_ratio_2a === common_ratio_2b) {
indexes.push(j3);
j1 = j2;
j2 = j3
difference12b = difference23b;
}
}
values = [];
for (i = 0; i < indexes.length; ++i) {
index = indexes[i];
value = input[index];
values.push(value);
}
output.push(values);
}
}
}
}
if (output !== []) {
i_max_length = 0;
for (i = 1; i < output.length; ++i) {
if (output[i_max_length].length < output[i].length)
i_max_length = i;
}
for (i = 0; i < output.length; ++i) {
if (output[i_max_length].length == output[i].length)
resultDiv.append("<p>[" + output[i] + "]</p>");
}
}
Output:
[1, 3, 7, 15, 31]
I find the first three items of every subsequence candidate, calculate the scale factor and the common ratio from them, and if the common ratio is integer, then I iterate over the remaining elements after the third one, and add those to the subsequence, which fit into the geometric progression defined by the first three items. As a last step, I select the sebsequence/s which has/have the largest length.
In fact it is exactly the same question as Longest equally-spaced subsequence, you just have to consider the logarithm of your data. If the sequence is a, ak, ak^2, ak^3, the logarithmique value is ln(a), ln(a) + ln(k), ln(a)+2ln(k), ln(a)+3ln(k), so it is equally spaced. The opposite is of course true. There is a lot of different code in the question above.
I don't think the special case a=1 can be resolved more efficiently than an adaptation from an algorithm above.
Here is my solution in Javascript. It should be close to O(n^2) except may be in some pathological cases.
function bsearch(Arr,Val, left,right) {
if (left == right) return left;
var m=Math.floor((left + right) /2);
if (Val <= Arr[m]) {
return bsearch(Arr,Val,left,m);
} else {
return bsearch(Arr,Val,m+1,right);
}
}
function findLongestGeometricSequence(S) {
var bestSolution=[];
var i,j,k;
var H={};
for (i=0;i<S.length;i++) H[S[i]]=true;
for (i=0;i<S.length;i++) {
for (j=0;j<i;j++) {
for (k=j+1;k<i;) {
var possibleSolution=[S[j],S[k],S[i]];
var K = (S[i] - S[k]) / (S[k] - S[j]);
var A = (S[k] - S[j]) * (S[k] - S[j]) / (S[i] - S[k]);
if ((Math.floor(K) == K) && (Math.floor(A)==A)) {
exp= K*K*K;
var NextVal= S[i] + A * exp;
while (H[NextVal] === true) {
possibleSolution.push(NextVal);
exp = exp * K;
NextVal= NextVal + A * exp;
}
if (possibleSolution.length > bestSolution.length)
bestSolution=possibleSolution;
K--;
} else {
K=Math.floor(K);
}
if (K>0) {
var NextPossibleMidValue= (S[i] + K*S[j]) / (K +1);
k++;
if (S[k]<NextPossibleMidValue) {
k=bsearch(S,NextPossibleMidValue, k+1, i);
}
} else {
k=i;
}
}
}
}
return bestSolution;
}
function Run() {
var MyS= [0.7, 1, 2, 3, 4, 5,6,7, 15, 27, 30,31, 81];
var sol = findLongestGeometricSequence(MyS);
alert(JSON.stringify(sol));
}
Small Explanation
If we take 3 numbers of the array S(j) < S(k) < S(i) then you can calculate a and k so that: S(k) = S(j) + a*k and S(i) = S(k) + a*k^2 (2 equations and 2 incognits). With that in mind, you can check if exist a number in the array that is S(next) = S(i) + a*k^3. If that is the case, then continue checknng for S(next2) = S(next) + a*k^4 and so on.
This would be a O(n^3) solution, but you can hava advantage that k must be integer in order to limit the S(k) points selected.
In case that a is known, then you can calculate a(k) and you need to check only one number in the third loop, so this case will be clearly a O(n^2).
I think this task is related with not so long ago posted Longest equally-spaced subsequence. I've just modified my algorithm in Python a little bit:
from math import sqrt
def add_precalc(precalc, end, (a, k), count, res, N):
if end + a * k ** res[1]["count"] > N: return
x = end + a * k ** count
if x > N or x < 0: return
if precalc[x] is None: return
if (a, k) not in precalc[x]:
precalc[x][(a, k)] = count
return
def factors(n):
res = []
for x in range(1, int(sqrt(n)) + 1):
if n % x == 0:
y = n / x
res.append((x, y))
res.append((y, x))
return res
def work(input):
precalc = [None] * (max(input) + 1)
for x in input: precalc[x] = {}
N = max(input)
res = ((0, 0), {"end":0, "count":0})
for i, x in enumerate(input):
for y in input[i::-1]:
for a, k in factors(x - y):
if (a, k) in precalc[x]: continue
add_precalc(precalc, x, (a, k), 2, res, N)
for step, count in precalc[x].iteritems():
count += 1
if count > res[1]["count"]: res = (step, {"end":x, "count":count})
add_precalc(precalc, x, step, count, res, N)
precalc[x] = None
d = [res[1]["end"]]
for x in range(res[1]["count"] - 1, 0, -1):
d.append(d[-1] - res[0][0] * res[0][1] ** x)
d.reverse()
return d
explanation
Traversing the array
For each previous element of the array calculate factors of the difference between current and taken previous element and then precalculate next possible element of the sequence and saving it to precalc array
So when arriving at element i there're already all possible sequences with element i in the precalc array, so we have to calculate next possible element and save it to precalc.
Currently there's one place in algorithm that could be slow - factorization of each previous number. I think it could be made faster with two optimizations:
more effective factorization algorithm
find a way not to see at each element of array, using the fact that array is sorted and there's already a precalculated sequences
Python:
def subseq(a):
seq = []
aset = set(a)
for i, x in enumerate(a):
# elements after x
for j, x2 in enumerate(a[i+1:]):
j += i + 1 # enumerate starts j at 0, we want a[j] = x2
bk = x2 - x # b*k (assuming k and k's exponent start at 1)
# given b*k, bruteforce values of k
for k in range(1, bk + 1):
items = [x, x2] # our subsequence so far
nextdist = bk * k # what x3 - x2 should look like
while items[-1] + nextdist in aset:
items.append(items[-1] + nextdist)
nextdist *= k
if len(items) > len(seq):
seq = items
return seq
Running time is O(dn^3), where d is the (average?) distance between two elements,
and n is of course len(a).

Sum of series: 1^1 + 2^2 + 3^3 + ... + n^n (mod m)

Can someone give me an idea of an efficient algorithm for large n (say 10^10) to find the sum of above series?
Mycode is getting klilled for n= 100000 and m=200000
#include<stdio.h>
int main() {
int n,m,i,j,sum,t;
scanf("%d%d",&n,&m);
sum=0;
for(i=1;i<=n;i++) {
t=1;
for(j=1;j<=i;j++)
t=((long long)t*i)%m;
sum=(sum+t)%m;
}
printf("%d\n",sum);
}
Two notes:
(a + b + c) % m
is equivalent to
(a % m + b % m + c % m) % m
and
(a * b * c) % m
is equivalent to
((a % m) * (b % m) * (c % m)) % m
As a result, you can calculate each term using a recursive function in O(log p):
int expmod(int n, int p, int m) {
if (p == 0) return 1;
int nm = n % m;
long long r = expmod(nm, p / 2, m);
r = (r * r) % m;
if (p % 2 == 0) return r;
return (r * nm) % m;
}
And sum elements using a for loop:
long long r = 0;
for (int i = 1; i <= n; ++i)
r = (r + expmod(i, i, m)) % m;
This algorithm is O(n log n).
I think you can use Euler's theorem to avoid some exponentation, as phi(200000)=80000. Chinese remainder theorem might also help as it reduces the modulo.
You may have a look at my answer to this post. The implementation there is slightly buggy, but the idea is there. The key strategy is to find x such that n^(x-1)<m and n^x>m and repeatedly reduce n^n%m to (n^x%m)^(n/x)*n^(n%x)%m. I am sure this strategy works.
I encountered similar question recently: my 'n' is 1435, 'm' is 10^10. Here is my solution (C#):
ulong n = 1435, s = 0, mod = 0;
mod = ulong.Parse(Math.Pow(10, 10).ToString());
for (ulong i = 1; i <= n;
{
ulong summand = i;
for (ulong j = 2; j <= i; j++)
{
summand *= i;
summand = summand % mod;
}
s += summand;
s = s % mod;
}
At the end 's' is equal to required number.
Are you getting killed here:
for(j=1;j<=i;j++)
t=((long long)t*i)%m;
Exponentials mod m could be implemented using the sum of squares method.
n = 10000;
m = 20000;
sqr = n;
bit = n;
sum = 0;
while(bit > 0)
{
if(bit % 2 == 1)
{
sum += sqr;
}
sqr = (sqr * sqr) % m;
bit >>= 2;
}
I can't add comment, but for the Chinese remainder theorem, see http://mathworld.wolfram.com/ChineseRemainderTheorem.html formulas (4)-(6).

Resources