Convert a number m to n using minimum number of given operations - algorithm

Question:
Given 2 integers N and M. Convert a number N to M using minimum number of given operations.
The operations are:
Square N (N = N^2)
Divide N by a prime integer P if N is divisible by P (N = N / P and N % P == 0)
Constraints:
N, M <= 10^9
Example:
N = 12, M = 18
The minimum operations are:
N /= 2 -> N = 6
N = N^2 -> N = 36
N /= 2 -> N = 18
My take:
I'm trying to use BFS to solve this problem. For each number, the edges to other numbers are the operations. But it gets Time Limit Exceeded. Is there a better way to solve this?
Here is my BFS code:
queue<pair<int,int> > q;
vector<long long> pr;
ll m,n;
bool prime[MAXN+1];
void solve()
{
while (!q.empty())
{
pii x=q.front();
q.pop();
if (x.first==m)
{
cout << x.second;
return;
}
if (x.first==1) continue;
for(ll k:pr)
{
if (k>x.first) break;
if (x.first%k==0) q.push({x.first/k,x.second+1});
}
q.push({x.first*x.first,x.second+1});
}
}

The algorithm uses the decomposition of N and M into prime factors, keeping track of the corresponding exponents.
If M has a prime factor that N does not have, there is no solution (the code returns -1).
If N has some prime factors that M doesn't have, then the first step is to divide N by these primes.
The corresponding number of operations is the sum of the corresponding exponents.
At this stage, we get two arrays A and B corresponding to the exponents of the common prime factors, for N and M.
It is worth noting that at this stage, the values of the primes involved is not relevant anymore, only the exponents matter.
Then one must determine the minimum number of squares (= multiplications by 2 of the exponents).
This is the smallest k such that 2^k A[i] >= B[i] for all indices i.
The number of multiplications is added to the number of operations only once, as all exponents are multiplied by 2 at the same time.
Last step is to determine, for each pair (a, b) = (A[i], B[i]), the number of subtractions needed to go from a to b, while implementing exactly k multiplications by 2. This is performed with the following rules:
- if (k == 0) f(a, b, k) = a-b
- Else:
- if ((a-1)*2^k >= b): f(a, b, k) = 1 + f(a-1, b, k)
- else: f(a, b, k) = f(2*a, b, k-1)
The complexity is dominated by the decomposition into prime factors: O(sqrt(N))
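As a rough illustration of the exponent stage described above, here is a minimal C++ sketch that works only on the exponent arrays A and B, assuming the factorization and the removal of extra primes have already been done. The names subtractions and exponent_cost are made up for this sketch, and it assumes every B[i] >= 1, which holds for common prime factors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of decrements needed to turn exponent a into b while doubling exactly k times.
// Follows the rules for f(a, b, k), written as a loop instead of a recursion.
// Assumes b >= 1 and a * 2^k >= b.
long long subtractions(long long a, long long b, int k) {
    long long steps = 0;
    while (k > 0) {
        // Decrement early, as long as the remaining doublings can still reach b.
        while (((a - 1) << k) >= b) { --a; ++steps; }
        a *= 2;   // one squaring of N doubles the exponent
        --k;
    }
    return steps + (a - b);   // k == 0: only divisions are left
}

// Squares plus divisions needed for the common primes,
// given the exponents A (for N) and B (for M).
long long exponent_cost(const std::vector<long long>& A, const std::vector<long long>& B) {
    int k = 0;   // smallest k with 2^k * A[i] >= B[i] for all i
    for (std::size_t i = 0; i < A.size(); ++i)
        while ((A[i] << k) < B[i]) ++k;
    long long total = k;   // squares are counted once, they act on all exponents
    for (std::size_t i = 0; i < A.size(); ++i)
        total += subtractions(A[i], B[i], k);
    return total;
}
```

For N = 12 = 2^2 * 3 and M = 18 = 2 * 3^2, exponent_cost({2, 1}, {1, 2}) returns 3, matching the example in the question.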
Code:
This code is rather long, but a large part of it consists of helper routines needed for debugging and analysis.
#include <iostream>
#include <vector>
#include <string>   // std::string (used by print)
#include <cstdlib>  // exit
#include <cmath>
#include <algorithm>
void print (const std::vector<int> &v, const std::string s = "") {
std::cout << s;
for (auto &x: v) {
std::cout << x << " ";
}
std::cout << std::endl;
}
void print_decomp (int n, const std::vector<int> &primes, const std::vector<int> &mult) {
std::cout << n << " = ";
int k = primes.size();
for (int i = 0; i < k; ++i) {
std::cout << primes[i];
if (mult[i] > 1) std::cout << "^" << mult[i];
std::cout << " ";
}
std::cout << "\n";
}
void prime_decomp (int nn, std::vector<int> &primes, std::vector<int> &mult) {
int n = nn;
if (n <= 1) return;
if (n % 2 == 0) {
primes.push_back(2);
int cpt = 1;
n/= 2;
while (n%2 == 0) {n /= 2; cpt++;}
mult.push_back (cpt);
}
int max_prime = sqrt(n);
int p = 3;
while (p <= max_prime) {
if (n % p == 0) {
primes.push_back(p);
int cpt = 1;
n/= p;
while (n%p == 0) {n /= p; cpt++;}
mult.push_back (cpt);
max_prime = sqrt(n);
}
p += 2;
}
if (n != 1) {
primes.push_back(n);
mult.push_back (1);
}
print_decomp (nn, primes, mult);
}
// Determine the number of subtractions to go from a to b, with exactly k multiplications by 2
int n_sub (int a, int b, int k, int power2) {
if (k == 0){
if (b > a) exit(1);
return a - b;
}
//if (a == 1) return n_sub (2*a, b, k-1, power2/2);
if ((a-1)*power2 >= b) {
return 1 + n_sub(a-1, b, k, power2);
} else {
return n_sub (2*a, b, k-1, power2/2);
}
return 0;
}
// A return of -1 means no possibility
int n_operations (int N, int M) {
int count = 0;
if (N == M) return 0;
if (N == 1) return -1;
std::vector<int> primes_N, primes_M, expon_N, expon_M;
// Prime decomposition
prime_decomp(N, primes_N, expon_N);
prime_decomp (M, primes_M, expon_M);
// Compare decomposition, check if a solution can exist, set up two exponent arrays
std::vector<int> A, B;
int index_A = 0, index_B = 0;
int nA = primes_N.size();
int nB = primes_M.size();
while (true) {
if ((index_A == nA) && (index_B == nB)) {
break;
}
if ((index_A < nA) && (index_B < nB)) {
if (primes_N[index_A] == primes_M[index_B]) {
A.push_back(expon_N[index_A]);
B.push_back(expon_M[index_B]);
index_A++; index_B++;
continue;
}
if (primes_N[index_A] < primes_M[index_B]) {
count += expon_N[index_A];
index_A++;
continue;
}
return -1; // M has a prime that N doesn't have: impossibility to go to M
}
if (index_B != nB) { // impossibility
return -1;
}
for (int i = index_A; i < nA; ++i) {
count += expon_N[i]; // suppression of primes in N not in M
}
break;
}
std::cout << "1st step, count = " << count << "\n";
print (A, "exponents of N: ");
print (B, "exponents of M: ");
// Determination of the number of multiplications by two of the exponents (= number of squares)
int n = A.size();
int n_mult2 = 0;
int power2 = 1;
for (int i = 0; i < n; ++i) {
while (power2*A[i] < B[i]) {
power2 *= 2;
n_mult2++;
}
}
count += n_mult2;
std::cout << "number of squares = " << n_mult2 << " -> " << power2 << "\n";
// For each pair of exponent, determine the number of subtractions,
// with a fixed number of multiplication by 2
for (int i = 0; i < n; ++i) {
count += n_sub (A[i], B[i], n_mult2, power2);
}
return count;
}
int main() {
int N, M;
std::cin >> N >> M;
auto ans = n_operations (N, M);
std::cout << ans << "\n";
return 0;
}

Related

How to find all possible ways in which any combination of n elements m times can be added to give a result

This is kind of an extension to this question:
Finding all possible combinations of numbers to reach a given sum
The difference is that in the above-linked question, each number(from the set of options) would be counted one time. But what if each number is allowed to be chosen multiple times? For example, if the given set of options is {1, 4, 9}, to get a total of 15, we can do any of the following:
a) 1*15
b) 4*3 + 1*2
c) 4*2 + 1*7
d) 4*1 + 1*11
e) 9*1 + 4*1 + 1*2
f) 9*1 + 1*6
Since you asked for all the possible combinations and not the best one, simple recursion can be used to obtain the results. The main idea is to:
1. Sort the array (non-decreasing).
2. Remove all the duplicates from the array.
3. Use recursion and backtracking to enumerate the combinations.
A C++ solution for your problem:
#include <bits/stdc++.h>
using namespace std;
void getNumbers(vector<int>& ar, int sum, vector<vector<int> >& res,vector<int>& r, int i) {
if (sum < 0)
return;
if (sum == 0)
{
res.push_back(r);
return;
}
while (i < ar.size() && sum - ar[i] >= 0)
{
r.push_back(ar[i]);
getNumbers(ar, sum - ar[i], res, r, i);
i++;
r.pop_back();
}
}
vector<vector<int> > getSum(vector<int>& ar, int sum)
{
sort(ar.begin(), ar.end());
ar.erase(unique(ar.begin(), ar.end()), ar.end());
vector<int> r;
vector<vector<int> > res;
getNumbers(ar, sum, res, r, 0);
return res;
}
int main()
{
vector<int> ar;
ar.push_back(1);
ar.push_back(4);
ar.push_back(9);
int n = ar.size();
int sum = 15;
vector<vector<int> > res = getSum(ar, sum);
if (res.size() == 0)
{
cout << "Empty\n";
return 0;
}
for (int i = 0; i < res.size(); i++)
{
if (res[i].size() > 0)
{
cout << " ( ";
for (int j = 0; j < res[i].size(); j++)
cout << res[i][j] << " ";
cout << ")";
}
}
}

Number of heaps using n distinct integers- Time complexity

I am solving the problem of finding the number of max heaps that can be formed using n distinct integers (say 1..n). I have solved it using the following
recurrence, with some help from this: https://www.quora.com/How-many-Binary-heaps-can-be-made-from-N-distinct-elements :
T(N) = C(N-1, L) * T(L) * T(R), where C(N-1, L) is the binomial coefficient "N-1 choose L", L is the number of nodes in the left subtree and R is the number of nodes in the right subtree. I have also implemented it in C++ using dynamic programming, but I am stuck on finding its time complexity. Can someone help me with this?
#include <iostream>
using namespace std;
#define MAXN 105 //maximum value of n here
int dp[MAXN]; //dp[i] = number of max heaps for i distinct integers
int nck[MAXN][MAXN]; //nck[i][j] = number of ways to choose j elements from i elements, no order
int log2[MAXN]; //log2[i] = floor of logarithm of base 2 of i
//to calculate nCk
int choose(int n, int k)
{
if (k > n)
return 0;
if (n <= 1)
return 1;
if (k == 0)
return 1;
if (nck[n][k] != -1)
return nck[n][k];
int answer = choose(n-1, k-1) + choose(n-1, k);
nck[n][k] = answer;
return answer;
}
//calculate l (size of the left subtree) for a given value of n
int getLeft(int n)
{
if (n == 1)
return 0;
int h = log2[n];
//max number of elements that can be present in the hth level of any heap
int numh = (1 << h); //(2 ^ h)
//number of elements that are actually present in last level(hth level)
//(2^h - 1)
int last = n - ((1 << h) - 1);
//if more than half-filled
if (last >= (numh / 2))
return (1 << h) - 1; // (2^h) - 1
else
return (1 << h) - 1 - ((numh / 2) - last);
}
//find maximum number of heaps for n
int numberOfHeaps(int n)
{
if (n <= 1)
return 1;
if (dp[n] != -1)
return dp[n];
int left = getLeft(n);
int ans = (choose(n-1, left) * numberOfHeaps(left)) * (numberOfHeaps(n-1-left));
dp[n] = ans;
return ans;
}
//function to initialize arrays
int solve(int n)
{
for (int i = 0; i <= n; i++)
dp[i] = -1;
for (int i = 0; i <= n; i++)
for (int j = 0; j <=n; j++)
nck[i][j] = -1;
int currLog2 = -1;
int currPower2 = 1;
//for each power of two find logarithm
for (int i = 1; i <= n; i++)
{
if (currPower2 == i)
{
currLog2++;
currPower2 *= 2;
}
log2[i] = currLog2;
}
return numberOfHeaps(n);
}
//driver function
int main()
{
int n=10;
cout << solve(n) << endl;
return 0;
}

Algorithm. How to find longest subsequence of integers in an array such that gcd of any two consecutive number in the sequence is greater than 1?

Given an array of integers, we have to find the length of the longest subsequence such that the gcd of any two consecutive elements in the subsequence is greater than 1.
For example, if array = [12, 8, 2, 3, 6, 9]
then one such subsequence can be = {12, 8, 2, 6, 9}
another one can be = {12, 3, 6, 9}
I tried to solve this problem by dynamic programming. Let maxCount be the array such that maxCount[i] holds the length of the longest such subsequence ending at index i.
maxCount[0]=1 ;
for(i=1; i<N; i++)
{
max = 1 ;
for(j=i-1; j>=0; j--)
{
if(gcd(arr[i], arr[j]) > 1)
{
temp = maxCount[j] + 1 ;
if(temp > max)
max = temp ;
}
}
maxCount[i]=max;
}
max = 0;
for(i=0; i<N; i++)
{
if(maxCount[i] > max)
max = maxCount[i] ;
}
cout<<max<<endl ;
But this approach times out, as its time complexity is O(N^2). Can we improve the time complexity?
The condition "gcd is greater than 1" means that the numbers have at least one common divisor greater than 1. So, let dp[i] be the length of the longest sequence ending on a number divisible by i.
int n;
cin >> n;
const int MAX_NUM = 100 * 1000;
static int dp[MAX_NUM];
for(int i = 0; i < n; ++i)
{
int x;
cin >> x;
int cur = 1;
vector<int> d;
for(int i = 2; i * i <= x; ++i)
{
if(x % i == 0)
{
cur = max(cur, dp[i] + 1);
cur = max(cur, dp[x / i] + 1);
d.push_back(i);
d.push_back(x / i);
}
}
if(x > 1)
{
cur = max(cur, dp[x] + 1);
d.push_back(x);
}
for(int j : d)
{
dp[j] = cur;
}
}
cout << *max_element(dp, dp + MAX_NUM) << endl;
This solution has O(N * sqrt(MAX_NUM)) complexity. In fact, it is enough to keep dp values only for prime numbers. To implement this you need to be able to get the prime factorization of a number x in less than O(sqrt(x)) time (this method, for example). That optimization should bring the complexity down to O(N * factorization + N log(N)). As a memory optimization, you can replace the dp array with a map or unordered_map.
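The prime-only variant could look roughly like the sketch below. It is one possible reading of the suggestion, not the answerer's code: it assumes all values lie in [1, limit] and uses a smallest-prime-factor sieve so that each number factorizes in O(log x) steps. The names build_spf and longest_gcd_subsequence are invented for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Smallest-prime-factor sieve: spf[x] is the least prime dividing x (for x >= 2).
std::vector<int> build_spf(int limit) {
    std::vector<int> spf(limit + 1, 0);
    for (int i = 2; i <= limit; ++i) {
        if (spf[i] != 0) continue;           // i is composite, already marked
        for (int j = i; j <= limit; j += i)
            if (spf[j] == 0) spf[j] = i;     // first prime to reach j is its smallest
    }
    return spf;
}

// Length of the longest subsequence in which consecutive elements have gcd > 1.
// Assumption for this sketch: every value is in [1, limit].
int longest_gcd_subsequence(const std::vector<int>& arr, int limit) {
    std::vector<int> spf = build_spf(limit);
    std::vector<int> dp(limit + 1, 0);       // dp[p]: best length ending on a multiple of prime p
    int best = 0;
    for (int x : arr) {
        std::vector<int> primes;             // distinct prime factors of x
        for (int y = x; y > 1; ) {
            int p = spf[y];
            primes.push_back(p);
            while (y % p == 0) y /= p;
        }
        int cur = 1;                         // x on its own
        for (int p : primes) cur = std::max(cur, dp[p] + 1);
        for (int p : primes) dp[p] = cur;
        best = std::max(best, cur);
    }
    return best;
}
```

On the array from the question, longest_gcd_subsequence({12, 8, 2, 3, 6, 9}, 12) returns 5, which corresponds to {12, 8, 2, 6, 9}. Unlike the code above it, this sketch counts a lone 1 as a subsequence of length 1.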
GCD takes log m time, where m is the maximum number in the array. Therefore, using a Segment Tree and binary search, one can reduce the time complexity to O(n log (m² * n)) (with O(n log m) preprocessing). This list details other data structures that can be used for RMQ-type queries and to reduce the complexity further.
Here is one possible implementation of this:
#include <bits/stdc++.h>
using namespace std;
struct SegTree {
using ftype = function<int(int, int)>;
vector<int> vec;
int l, og, dummy;
ftype f;
template<typename T> SegTree(const vector<T> &v, const T &x, const ftype &func) : og(v.size()), f(func), l(1), dummy(x) {
assert(og >= 1);
while (l < og) l *= 2;
vec = vector<int>(l*2);
for (int i = l; i < l+og; i++) vec[i] = v[i-l];
for (int i = l+og; i < 2*l; i++) vec[i] = dummy;
for (int i = l-1; i >= 1; i--) {
if (vec[2*i] == dummy && vec[2*i+1] == dummy) vec[i] = dummy;
else if (vec[2*i] == dummy) vec[i] = vec[2*i+1];
else if (vec[2*i+1] == dummy) vec[i] = vec[2*i];
else vec[i] = f(vec[2*i], vec[2*i+1]);
}
}
SegTree() {}
void valid(int x) {assert(x >= 0 && x < og);}
int get(int a, int b) {
valid(a); valid(b); assert(b >= a);
a += l; b += l;
int s = vec[a];
a++;
while (a <= b) {
if (a % 2 == 1) {
if (vec[a] != dummy) s = f(s, vec[a]);
a++;
}
if (b % 2 == 0) {
if (vec[b] != dummy) s = f(s, vec[b]);
b--;
}
a /= 2; b /= 2;
}
return s;
}
void add(int x, int c) {
valid(x);
x += l;
vec[x] += c;
for (x /= 2; x >= 1; x /= 2) {
if (vec[2*x] == dummy && vec[2*x+1] == dummy) vec[x] = dummy;
else if (vec[2*x] == dummy) vec[x] = vec[2*x+1];
else if (vec[2*x+1] == dummy) vec[x] = vec[2*x];
else vec[x] = f(vec[2*x], vec[2*x+1]);
}
}
void update(int x, int c) {add(x, c-vec[x+l]);}
};
// Constructor (where val is something that an element in the array is
// guaranteed to never reach):
// SegTree st(vec, val, func);
// finds longest subsequence where GCD is greater than 1
int longest(const vector<int> &vec) {
int l = vec.size();
SegTree st(vec, -1, [](int a, int b){return __gcd(a, b);});
// checks if a certain length is valid in O(n log (m² * n)) time
auto valid = [&](int n) -> bool {
for (int i = 0; i <= l-n; i++) {
if (st.get(i, i+n-1) != 1) {
return true;
}
}
return false;
};
int length = 0;
// do a "binary search" on the best possible length
for (int i = l; i >= 1; i /= 2) {
while (length+i <= l && valid(length+i)) {
length += i;
}
}
return length;
}

(ACM) How to use segment tree to count how many elements in [a,b] is smaller than a given constant?

I am quite new to segment trees and would like to get some more practice with them.
The problem is ACM-style and has the following conditions:
There are n numbers and m operations, n,m<=10,000. Each operation is one of the following:
1. Update an interval by subtracting a number x (x can be different each time)
2. Query an interval to find how many numbers in the interval are <= 0
Building the segment tree and updating it can clearly be done in O(n log n) and O(log n) respectively.
But I cannot figure out how to answer a query in O(log n). Can anyone give me some suggestions or hints?
Any suggestions would be helpful! Thanks!
TL;DR:
Given n numbers and 2 types of operations:
Add x to all elements in [a,b]; x can be different each time
Query the number of elements in [a,b] that are < C, where C is a given constant
How can operations 1 and 2 both be done in O(log n)?
Nice problem :)
I thought about it for a while but still couldn't solve this problem with a segment tree, so I tried the "bucket method" instead.
We can divide the initial n numbers into B buckets, sort the numbers in each bucket, and maintain a total pending "add" value for each bucket. Then for each operation:
"Add": update interval [a, b] with c
we only need to rebuild at most two buckets (the partially covered ones) and add c to about (b - a) / BUCKET_SIZE fully covered buckets
"Query": count elements <= c in interval [a, b]
we only need to scan at most two buckets value by value and go through about (b - a) / BUCKET_SIZE buckets quickly with binary search
A query should run in O(N/BUCKET_SIZE * log(BUCKET_SIZE, 2)) for the fully covered buckets, plus O(BUCKET_SIZE) for scanning the partial ones, which is smaller than the brute-force method (O(N)). Though it's bigger than O(log N), it may be sufficient in most cases.
Here is the test code:
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <cstring>
#include <cmath>
#include <algorithm>
#include <vector>
#include <set>
#include <map>
#include <ctime>
#include <cassert>
using namespace std;
struct Query {
//A a b c add c in [a, b] of arr
//Q a b c Query number of i in [a, b] which arr[i] <= c
char ty;
int a, b, c;
Query(char _ty, int _a, int _b, int _c):ty(_ty), a(_a), b(_b), c(_c){}
};
int n, m;
vector<int> arr;
vector<Query> queries;
vector<int> bruteforce() {
vector<int> ret;
vector<int> numbers = arr;
for (int i = 0; i < m; i++) {
Query q = queries[i];
if (q.ty == 'A') {
for (int i = q.a; i <= q.b; i++) {
numbers[i] += q.c;
}
ret.push_back(-1);
} else {
int tmp = 0;
for(int i = q.a; i <= q.b; i++) {
tmp += numbers[i] <= q.c;
}
ret.push_back(tmp);
}
}
return ret;
}
struct Bucket {
vector<int> numbers;
vector<int> numbers_sorted;
int add;
Bucket() {
add = 0;
numbers_sorted.clear();
numbers.clear();
}
int query(int pos) {
return numbers[pos] + add;
}
void add_pos(int pos, int val) {
numbers[pos] += val;
}
void build() {
numbers_sorted = numbers;
sort(numbers_sorted.begin(), numbers_sorted.end());
}
};
vector<int> bucket_count(int bucket_size) {
vector<int> ret;
vector<Bucket> buckets;
buckets.resize(int(n / bucket_size) + 5);
for (int i = 0; i < n; i++) {
buckets[i / bucket_size].numbers.push_back(arr[i]);
}
for (int i = 0; i <= n / bucket_size; i++) {
buckets[i].build();
}
for (int i = 0; i < m; i++) {
Query q = queries[i];
char ty = q.ty;
int a, b, c;
a = q.a, b = q.b, c = q.c;
if (ty == 'A') {
set<int> affect_buckets;
while (a < b && a % bucket_size != 0) buckets[a/ bucket_size].add_pos(a % bucket_size, c), affect_buckets.insert(a/bucket_size), a++;
while (a < b && b % bucket_size != 0) buckets[b/ bucket_size].add_pos(b % bucket_size, c), affect_buckets.insert(b/bucket_size), b--;
while (a < b) {
buckets[a/bucket_size].add += c;
a += bucket_size;
}
buckets[a/bucket_size].add_pos(a % bucket_size, c), affect_buckets.insert(a / bucket_size);
for (set<int>::iterator it = affect_buckets.begin(); it != affect_buckets.end(); it++) {
int id = *it;
buckets[id].build();
}
ret.push_back(-1);
} else {
int tmp = 0;
while (a < b && a % bucket_size != 0) tmp += (buckets[a/ bucket_size].query(a % bucket_size) <=c), a++;
while (a < b && b % bucket_size != 0) tmp += (buckets[b/ bucket_size].query(b % bucket_size) <=c), b--;
while (a < b) {
int pos = a / bucket_size;
tmp += upper_bound(buckets[pos].numbers_sorted.begin(), buckets[pos].numbers_sorted.end(), c - buckets[pos].add) - buckets[pos].numbers_sorted.begin();
a += bucket_size;
}
tmp += (buckets[a / bucket_size].query(a % bucket_size) <= c);
ret.push_back(tmp);
}
}
return ret;
}
void process(int cas) {
clock_t begin_t=clock();
vector<int> bf_ans = bruteforce();
clock_t bf_end_t =clock();
double bf_sec = ((1.0 * bf_end_t - begin_t)) / CLOCKS_PER_SEC;
//bucket_size is important
int bucket_size = 200;
vector<int> ans = bucket_count(bucket_size);
clock_t bucket_end_t =clock();
double bucket_sec = ((1.0 * bucket_end_t - bf_end_t)) / CLOCKS_PER_SEC;
bool correct = true;
for (int i = 0; i < ans.size(); i++) {
if (ans[i] != bf_ans[i]) {
cout << "query " << i + 1 << " bf = " << bf_ans[i] << " bucket = " << ans[i] << " bucket size = " << bucket_size << " " << n << " " << m << endl;
correct = false;
}
}
printf("Case #%d:%s bf_sec = %.9lf, bucket_sec = %.9lf\n", cas, correct ? "YES":"NO", bf_sec, bucket_sec);
}
void read() {
cin >> n >> m;
arr.clear();
for (int i = 0; i < n; i++) {
int val;
cin >> val;
arr.push_back(val);
}
queries.clear();
for (int i = 0; i < m; i++) {
char ty;
int a, b, c;
// a, b, c in [0, n - 1], a <= b
cin >> ty >> a >> b >> c;
queries.push_back(Query(ty, a, b, c));
}
}
void run(int cas) {
read();
process(cas);
}
int main() {
freopen("bucket.in", "r", stdin);
//freopen("bucket.out", "w", stdout);
int T;
scanf("%d", &T);
for (int cas = 1; cas <= T; cas++) {
run(cas);
}
return 0;
}
And here is the data generation code (Python 2):
#coding=utf8
import random
import math
def gen_buckets(f):
t = random.randint(10, 20)
print >> f, t
nlimit = 100000
mlimit = 10000
limit = 100000
for i in xrange(t):
n = random.randint(1, nlimit)
m = random.randint(1, mlimit)
print >> f, n, m
for i in xrange(n):
val = random.randint(1, limit)
print >> f, val ,
print >> f
for i in xrange(m):
ty = random.randint(1, 2)
a = random.randint(0, n - 1)
b = random.randint(a, n - 1)
#a = 0
#b = n - 1
c = random.randint(-limit, limit)
print >> f, 'A' if ty == 1 else 'Q', a, b, c
f = open("bucket.in", "w")
gen_buckets(f)
Try applying a Binary Indexed Tree (BIT) instead of a segment tree. Here's the link to the tutorial
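The answer does not say how the BIT would be used, so the following is only a guess at the simplest building block: a Fenwick tree over a difference array, giving range add and point query in O(log n). It covers operation 1 and reading how much has been added at one position; it does not by itself answer the counting query. The struct name RangeAddBIT and its methods are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Fenwick tree (BIT) over a difference array: range add and point query, both O(log n).
struct RangeAddBIT {
    int n;
    std::vector<long long> tree;             // 1-based internal indexing

    explicit RangeAddBIT(int size) : n(size), tree(size + 1, 0) {}

    // Add delta to the difference array at 0-based position pos.
    void add_at(int pos, long long delta) {
        for (int i = pos + 1; i <= n; i += i & -i) tree[i] += delta;
    }

    // Add x to every element in [a, b] (0-based, inclusive).
    void range_add(int a, int b, long long x) {
        add_at(a, x);
        if (b + 1 < n) add_at(b + 1, -x);    // cancel the effect after b
    }

    // Total amount added so far to the element at 0-based position pos.
    long long point_query(int pos) const {
        long long sum = 0;
        for (int i = pos + 1; i > 0; i -= i & -i) sum += tree[i];
        return sum;
    }
};
```

The current value of element i would then be its initial value plus point_query(i).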

Fast Iterative GCD

I need GCD(n, i) for every i from 1 up to n, where i increases by 1 in a loop. Is there an algorithm that calculates all of these GCDs faster than naively computing each one with the Euclidean algorithm?
PS: I've noticed that if n is prime, every number from 1 to n-1 gives 1, because a prime number is coprime to all of them. Any ideas for numbers other than primes?
C++ implementation, works in O(n * log log n) (assuming integer operations are O(1)):
#include <cstdio>
#include <cstring>
using namespace std;
void find_gcd(int n, int *gcd) {
// divisor[x] - any prime divisor of x
// or 0 if x == 1 or x is prime
int *divisor = new int[n + 1];
memset(divisor, 0, (n + 1) * sizeof(int));
// This is almost a copy-paste of the sieve of Eratosthenes, but instead of
// just marking a number as 'non-prime' we remember one of its prime divisors.
// O(n * log log n)
for (int x = 2; x * x <= n; ++x) {
if (divisor[x] == 0) {
for (int y = x * x; y <= n; y += x) {
divisor[y] = x;
}
}
}
for (int x = 1; x <= n; ++x) {
if (n % x == 0) gcd[x] = x;
else if (divisor[x] == 0) gcd[x] = 1; // x is prime, and does not divide n (previous line)
else {
int a = x / divisor[x], p = divisor[x]; // x == a * p
// gcd(a * p, n) = gcd(a, n) * gcd(p, n / gcd(a, n))
// gcd(p, n / gcd(a, n)) == 1 or p
gcd[x] = gcd[a];
if ((n / gcd[a]) % p == 0) gcd[x] *= p;
}
}
}
int main() {
int n;
scanf("%d", &n);
int *gcd = new int[n + 1];
find_gcd(n, gcd);
for (int x = 1; x <= n; ++x) {
printf("%d:\t%d\n", x, gcd[x]);
}
return 0;
}
SUMMARY
The possible answers for the gcd consist of the factors of n.
You can compute these efficiently as follows.
ALGORITHM
First factorise n into a product of prime factors, i.e. n=p1^n1*p2^n2*..*pk^nk.
Then you can loop over all factors of n and for each factor of n set the contents of the GCD array at that position to the factor.
If you make sure that the factors are done in a sensible order (e.g. sorted) you should find that the array entries that are written multiple times will end up being written with the highest value (which will be the gcd).
CODE
Here is some Python code to do this for the number 1400=2^3*5^2*7:
prime_factors=[2,5,7]
prime_counts=[3,2,1]
N=1
for prime,count in zip(prime_factors,prime_counts):
N *= prime**count
GCD = [0]*(N+1)
GCD[0] = N
def go(i,n):
"""Try all counts for prime[i]"""
if i==len(prime_factors):
for x in xrange(n,N+1,n):
GCD[x]=n
return
n2=n
for c in xrange(prime_counts[i]+1):
go(i+1,n2)
n2*=prime_factors[i]
go(0,1)
print N,GCD
Binary GCD algorithm:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm
is faster than Euclidean algorithm:
https://en.wikipedia.org/wiki/Euclidean_algorithm
I implemented "gcd()" in C for type "__uint128_t" (with gcc on Intel i7 Ubuntu), based on the iterative Rust version:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm#Iterative_version_in_Rust
Determining the number of trailing 0s is done efficiently with "__builtin_ctzll()". I benchmarked 1 million loops on the two biggest 128-bit Fibonacci numbers (they result in the maximal number of iterations) against gmplib "mpz_gcd()" and saw a 10% slowdown. Using the fact that the u/v values only decrease, I switched to the 64-bit special case "_gcd()" once they are "<= UINT64_MAX" and now see a speedup of 1.31 over gmplib. For details see:
https://www.raspberrypi.org/forums/viewtopic.php?f=33&t=311893&p=1873552#p1873552
inline int ctz(__uint128_t u)
{
unsigned long long h = u;
return (h!=0) ? __builtin_ctzll( h )
: 64 + __builtin_ctzll( u>>64 );
}
unsigned long long _gcd(unsigned long long u, unsigned long long v)
{
for(;;) {
if (u > v) { unsigned long long a=u; u=v; v=a; }
v -= u;
if (v == 0) return u;
v >>= __builtin_ctzll(v);
}
}
__uint128_t gcd(__uint128_t u, __uint128_t v)
{
if (u == 0) { return v; }
else if (v == 0) { return u; }
int i = ctz(u); u >>= i;
int j = ctz(v); v >>= j;
int k = (i < j) ? i : j;
for(;;) {
if (u > v) { __uint128_t a=u; u=v; v=a; }
if (v <= UINT64_MAX) return (__uint128_t)_gcd(u, v) << k; // widen before shifting: k can be 64 or more
v -= u;
if (v == 0) return u << k;
v >>= ctz(v);
}
}
