Approach for better solution - Sum of medians

Here is the question Spoj-WEIRDFN
Problem:
Let us define :
F[1] = 1
F[i] = (a*M[i] + b*i + c)%1000000007 for i > 1
where M[i] is the median of the array {F[1],F[2],..,F[i-1]}
Given a,b,c and n, calculate the sum F[1] + F[2] + .. + F[n].
Constraints:
0 <= a,b,c < 1000000007
1 <= n <= 200000
I came up with a solution, but it is not efficient enough:
#include <bits/stdc++.h>
using namespace std;
#define ll long long int
#define mod 1000000007
int main() {
// your code goes here
int t;
scanf("%d",&t);
while(t--)
{
ll a,b,c,sum=0;
int n;
scanf("%lld%lld%lld%d",&a,&b,&c,&n);
ll f[n+1];
f[1]=1;
f[0]=0;
for(int i=2;i<=n;i++)
{
ll temp;
sort(&f[1],&f[i]);
temp=f[i/2];
f[i]=((a*(temp)%mod)+((b*i)%mod)+(c%mod))%mod;
sum+=f[i];
}
printf("%lld\n",sum+f[1]);
}
return 0;
}
Can anybody give me a hint for a better algorithm or data structure for this task?

For each test case, you can maintain a balanced binary search tree that supports order-statistic queries, so you can find the median of the current elements in O(log n) time, and adding a new element to the tree also takes only O(log n) time.
Thus we have an O(T * n log n) algorithm, where T is the number of test cases and n is the number of elements, which should be fast enough to pass.
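The C++ standard library has no tree with order statistics, so here is a minimal sketch of the same O(n log n) bound using two heaps instead (a swapped-in technique, not the tree described above). It assumes, as the question's code does with f[i/2], that the median of an even-sized array is the smaller of the two middle elements. The name weirdSum is made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

typedef long long ll;
const ll MOD = 1000000007LL;

// Returns F[1] + ... + F[n] for the recurrence in the question.
// Assumption: M[i] is the LOWER median of F[1..i-1].
ll weirdSum(ll a, ll b, ll c, int n) {
    assert(n >= 1);
    // low  : max-heap holding the smaller half; its top is the lower median
    // high : min-heap holding the larger half
    std::priority_queue<ll> low;
    std::priority_queue<ll, std::vector<ll>, std::greater<ll> > high;
    ll f = 1, sum = 1;
    low.push(f);
    for (int i = 2; i <= n; ++i) {
        ll m = low.top();  // median of F[1..i-1]
        f = ((a % MOD) * m % MOD + (b % MOD) * i % MOD + c % MOD) % MOD;
        sum += f;
        if (f <= low.top()) low.push(f); else high.push(f);
        // Rebalance so that low holds either as many elements as high or one more.
        if (low.size() > high.size() + 1) {
            high.push(low.top());
            low.pop();
        } else if (high.size() > low.size()) {
            low.push(high.top());
            high.pop();
        }
    }
    return sum;
}
```

Each element is pushed and popped a constant number of times, so one test case costs O(n log n). For example, with a = b = c = 1 the sequence starts 1, 4, 5, 9.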

Fastest way to prime factorise a number up to 10^18

Given a number 1 <= n <= 10^18, how can I factorise it with the lowest time complexity?
There are many posts on the internet addressing how you can find prime factors, but none of them (at least from what I've seen) state their benefits, say, in a particular situation.
I use Pollard's rho algorithm in addition to Eratosthenes' sieve:
Using the sieve, find all prime numbers among the first 10^7 numbers, and then divide n by these primes as much as possible.
Now use Pollard's rho algorithm to try and find the rest of the primes until n is equal to 1.
My Implementation:
#include <iostream>
#include <vector>
#include <cstdio>
#include <ctime>
#include <cmath>
#include <cstdlib>
#include <algorithm>
#include <string>
using namespace std;
typedef unsigned long long ull;
typedef long double ld;
typedef pair <ull, int> pui;
#define x first
#define y second
#define mp make_pair
bool prime[10000005];
vector <ull> p;
void initprime(){
prime[2] = 1;
for(int i = 3 ; i < 10000005 ; i += 2){
prime[i] = 1;
}
for(int i = 3 ; i * i < 10000005 ; i += 2){
if(prime[i]){
for(int j = i * i ; j < 10000005 ; j += 2 * i){
prime[j] = 0;
}
}
}
for(int i = 0 ; i < 10000005 ; ++i){
if(prime[i]){
p.push_back((ull)i);
}
}
}
ull modularpow(ull base, ull exp, ull mod){
ull ret = 1;
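while(exp){
if(exp & 1){
// 128-bit intermediate (GCC/Clang extension): ret * base overflows 64 bits once mod exceeds 2^32
ret = (ull)((unsigned __int128)ret * base % mod);
}
exp >>= 1;
base = (ull)((unsigned __int128)base * base % mod);
}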
return ret;
}
ull gcd(ull x, ull y){
while(y){
ull temp = y;
y = x % y;
x = temp;
}
return x;
}
ull pollardrho(ull n){
srand(time(NULL));
if(n == 1)
return n;
ull x = (rand() % (n - 2)) + 2;
ull y = x;
ull c = (rand() % (n - 1)) + 1;
ull d = 1;
while(d == 1){
x = (modularpow(x, 2, n) + c + n) % n;
y = (modularpow(y, 2, n) + c + n) % n;
y = (modularpow(y, 2, n) + c + n) % n;
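// the operands are unsigned, so take the absolute difference by hand instead of calling abs()
d = gcd(x > y ? x - y : y - x, n);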
if(d == n){
return pollardrho(n);
}
}
return d;
}
int main ()
{
ios_base::sync_with_stdio(false);
cin.tie(0);
initprime();
ull n;
cin >> n;
ull c = n;
vector <pui> o;
for(vector <ull>::iterator i = p.begin() ; i != p.end() ; ++i){
ull t = *i;
if(!(n % t)){
o.push_back(mp(t, 0));
}
while(!(n % t)){
n /= t;
o[o.size() - 1].y++;
}
}
while(n > 1){
ull u = pollardrho(n);
o.push_back(mp(u, 0));
while(!(n % u)){
n /= u;
o[o.size() - 1].y++;
}
if(n < 10000005){
if(prime[n]){
o.push_back(mp(n, 1));
}
}
}
return 0;
}
Is there any faster way to factor such numbers? If possible, please explain why along with the source code.

Approach
Let's say you have a number n that can be as large as 10^18 and you want to prime factorise it. Since n can be as small as 1 or as big as 10^18, and can be prime as well as composite, this would be my approach:
1. Using Miller-Rabin primality testing, make sure that the number is composite.
2. Factorise n using primes up to 10^6, which can be calculated using the sieve of Eratosthenes.
3. Now the updated value of n has prime factors only above 10^6, and since n is at most 10^18, we conclude that the remaining number is 1, a prime, or a product of exactly two prime factors (not necessarily distinct).
4. Run Miller-Rabin again to ensure the number isn't prime.
5. Use the Pollard rho algorithm to get one prime factor.
6. You have the complete factorisation now.
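The Pollard rho step of the approach above could look roughly like the following. This is only a sketch with Floyd cycle detection, not the code the answer links to; the names gcd64, step and findFactor are made up here, and unsigned __int128 is assumed to be available (it is a GCC/Clang extension). The caller is expected to have already checked that n is composite.

```cpp
#include <cassert>

typedef unsigned long long ull;

// Euclid's algorithm on unsigned 64-bit values.
static ull gcd64(ull a, ull b) {
    while (b) { ull t = a % b; a = b; b = t; }
    return a;
}

// One step of the pseudo-random map x -> x*x + c (mod n).
// The 128-bit intermediate keeps x*x from overflowing.
static ull step(ull x, ull c, ull n) {
    return (ull)(((unsigned __int128)x * x + c) % n);
}

// Returns a non-trivial divisor of a composite n >= 4.
// Returns n itself if every tried constant c failed, which is what
// happens when n is actually prime.
ull findFactor(ull n) {
    assert(n >= 4);
    if (n % 2 == 0) return 2;
    for (ull c = 1; c <= 64; ++c) {
        ull x = 2, y = 2, d = 1;
        while (d == 1) {
            x = step(x, c, n);               // tortoise: one step
            y = step(step(y, c, n), c, n);   // hare: two steps
            d = gcd64(x > y ? x - y : y - x, n);
        }
        if (d != n) return d;                // d == n means this c cycled without a split
    }
    return n;
}
```

In general the returned divisor need not be prime; in the approach above it is, because the remaining n has at most two prime factors at that point.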
Let's look at the time complexity of the above approach:
Miller-Rabin takes O(log n) modular multiplications per base
The sieve of Eratosthenes takes O(m*log log m) for a sieve limit m (here m = 10^6)
The implementation of Pollard rho I shared takes O(n^0.25) expected time
Time Complexity
Step 2 takes the most time, on the order of 10^7 operations, and therefore dominates the complexity of the whole algorithm. This means you can find the factorisation within a second in almost any programming language.
Space Complexity
Space is used only in step 2, where the sieve is implemented, and is on the order of 10^6. Again, very practical for the purpose.
Implementation
Complete Code implemented in C++14. The code has a hidden bug. You can either reveal it in the next section, or skip towards the challenge ;)
Bug in the code
In line 105, iterate till i<=np. Otherwise, you may miss the cases where prime[np]=999983 is a prime factor
Challenge
Give me a value of n, if any, where the shared code results in wrong prime factorisation.
Bonus
How many such values of n exist ?
Hint
For such value of n, assertion in Line 119 may fail.
Solution
Let's call P = 999983. All numbers of the form n = p*q*r, where p, q, r are primes >= P and at least one of them is equal to P, will result in a wrong prime factorisation.
Bonus Solution
There are exactly four such numbers: {P0^3, P0^2*P1, P0^2*P2, P0*P1^2}, where P0 = P = 999983, P1 = next_prime(P0) = 1000003, P2 = next_prime(P1) = 1000033.

The fastest solution for 64-bit inputs on modern processors is a small amount of trial division (the amount will differ, but something under 100 is common) followed by Pollard's Rho. You will need a good deterministic primality test using Miller-Rabin or BPSW, and an infrastructure to handle multiple factors (e.g. if a composite is split into more composites). For 32-bit you can optimize each of these things even more.
You will want a fast mulmod, as it is the core of Pollard's Rho, Miller-Rabin, and the Lucas test. Ideally this is done as a tiny assembler snippet.
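As a rough illustration of the mulmod and the deterministic primality test mentioned above, here is a sketch that uses unsigned __int128 (a GCC/Clang extension, assumed available) rather than assembler, so it is not tuned for speed. The names mulmod, powmod and isPrime64 are made up for this example. The first twelve primes are used as Miller-Rabin bases, which is known to be sufficient for every 64-bit input.

```cpp
#include <cassert>

typedef unsigned long long ull;

// (a * b) mod m without overflow, via a 128-bit intermediate.
static ull mulmod(ull a, ull b, ull m) {
    return (ull)((unsigned __int128)a * b % m);
}

// base^exp mod m by square-and-multiply.
static ull powmod(ull base, ull exp, ull m) {
    ull result = 1 % m;
    base %= m;
    while (exp) {
        if (exp & 1) result = mulmod(result, base, m);
        base = mulmod(base, base, m);
        exp >>= 1;
    }
    return result;
}

// Deterministic Miller-Rabin for 64-bit n.
bool isPrime64(ull n) {
    if (n < 2) return false;
    static const ull bases[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
    // Trial division by the bases also handles every n <= 37.
    for (ull p : bases)
        if (n % p == 0) return n == p;
    // Write n - 1 = d * 2^s with d odd.
    ull d = n - 1;
    int s = 0;
    while ((d & 1) == 0) { d >>= 1; ++s; }
    for (ull a : bases) {
        ull x = powmod(a, d, n);
        if (x == 1 || x == n - 1) continue;
        bool witness = true;              // a proves n composite unless we hit n - 1
        for (int r = 1; r < s; ++r) {
            x = mulmod(x, x, n);
            if (x == n - 1) { witness = false; break; }
        }
        if (witness) return false;
    }
    return true;
}
```

The same mulmod can be reused inside Pollard's Rho, which is why its speed matters so much for the overall factoring time.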
Times should be under 1 millisecond to factor any 64-bit input. Significantly faster under 50 bits.
As shown by Ben Buhrow's spBrent implementation, algorithm P2'' from Brent's 1980 paper seems to be as fast as the other implementations I'm aware of. It uses Brent's improved cycle finding as well as the useful trick of delaying GCDs with the necessary added backtracking.
See this thread on Mersenneforum for some messy details and benchmarking of various solutions. I have a number of benchmarks of these and other implementations at different sizes, but haven't published anything (partly because there are so many ways to look at the data).
One of the really interesting things to come out of this was that SQUFOF, for many years believed to be the better solution for the high end of the 64-bit range, no longer is competitive. SQUFOF does have the advantage of only needing a fast perfect-square detector for best speed, which doesn't have to be in asm to be really fast.

Variant of Subset-Sum

Given 3 positive integers n, k, and sum, find exactly k number of distinct elements a_i, where
a_i \in S, 1 <= i <= k, and a_i \neq a_j for i \neq j
and, S is the set
S = {1, 2, 3, ..., n}
such that
\sum_{i=1}^{k}{a_i} = sum
I don't want to apply brute force (checking all possible combinations) to solve the problem due to exponential complexity. Can someone give me a hint towards another approach in solving this problem? Also, how can we exploit the fact the set S is sorted?
Is it possible to have complexity of O(k) in this problem?

An idea for exploiting the properties of the set 1..n:
The sum of k consecutive natural numbers starting from a is
sum = k*(2*a + (k-1))/2
To get such a run whose sum is close to the needed s, we can solve
a >= s/k - k/2 + 1/2
or
a <= s/k - k/2 + 1/2
then compare s with the resulting sum and make corrections.
For example, having s=173, n=40 and k=5, we can find
a <= 173/5 - 5/2 + 1/2 = 32.6
for starting number 32 we have sequence 32,33,34,35,36 with sum = 170, and for correction by 3 we can just change 36 with 39, or 34,35,36 with 35,36,37 and so on.
It seems that with this approach we get O(1) complexity (of course, there might be some subtleties that I missed).

It's possible to modify the pseudo-polynomial algorithm for subset sum.
Prepare a matrix P with dimension k X sum, and initialize all elements to 0. The meaning of P[p, q] == 1 is that there is a subset of p numbers summing to q, and P[p, q] == 0 means that such a subset has not yet been found.
Now iterate over i = 1, ..., n. In each iteration:
For any entry P[p, q] == 1 that was set in an earlier iteration, you now know that P[p + 1, q + i] should be 1 too. If (p + 1, q + i) is within the boundaries of the matrix, set P[p + 1, q + i] = 1. (Scanning p from high to low guarantees that i is not used twice in one subset.)
If i ≤ sum, set P[1, i] = 1 (there is a subset of size 1 that achieves i).
Finally, check if P[k, sum] == 1.
The complexity, assuming that all integer math operations take constant time, is Θ(n * k * sum), which is O(n^2 * sum) since k ≤ n.
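A minimal sketch of this table-filling idea, under a few assumptions of mine: the table gets an extra row 0 with P[0][0] = 1 (the empty subset) in place of the explicit "set P[1, i] = 1" step, and p is scanned from high to low so that each i is used at most once per subset. The name existsSubset is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True if some k distinct numbers from {1, ..., n} add up to sum.
bool existsSubset(int n, int k, int sum) {
    assert(n >= 1 && k >= 1 && sum >= 1);
    // P[p][q] == 1: some p distinct numbers seen so far add up to q.
    std::vector<std::vector<char> > P(k + 1, std::vector<char>(sum + 1, 0));
    P[0][0] = 1;  // the empty subset
    for (int i = 1; i <= n && i <= sum; ++i) {
        // Row p only reads row p - 1, which has not been touched yet in this
        // iteration because p goes downwards; so i is never added twice.
        for (int p = std::min(k, i); p >= 1; --p)
            for (int q = i; q <= sum; ++q)
                if (P[p - 1][q - i]) P[p][q] = 1;
    }
    return P[k][sum] != 0;
}
```

This only answers whether a subset exists; recovering the elements would need an extra back-pointer table, which is omitted here.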

There is an O(1) (so to speak) solution. What follows is a formal enough (I hope) development of the idea by @MBo.
It is sufficient to assume that S is the set of all integers and find a minimal solution. Solution K is smaller than K' iff max(K) < max(K'). If max(K) <= n, then K is also a solution to the original problem; otherwise, the original problem has no solution.
So we disregard n and find K, a minimal solution. Let g = max(K) = ceil(sum/k + (k - 1)/2) and s = g + (g-1) + (g-2) + ... + (g-k+1) and s' = (g-1) + (g-2) + ... + (g-k). That is, s' is s shifted down by 1. Note s' = s - k.
Obviously s >= sum and (because K is minimal) s' < sum.
If s == sum the solution is K = {g, g-1, ..., g-k+1} and we're done. Otherwise consider the set K+ = {g, g-1, ..., g-k}. We know that \sum(K+ \setminus {g}) < sum and \sum(K+ \setminus {g-k}) > sum, therefore there's a single element g_i of K+ such that \sum(K+ \setminus {g_i}) = sum. The solution is K+ \setminus {\sum(K+) - sum}.
The solution in the form of 4 integers a, b, c, d, where the actual set is understood to be [a..b] \cup [c..d], can be computed in O(1).
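The derivation above can be turned into a few arithmetic operations. The following is only a sketch: the struct Ranges and the function pickK are names I made up, the elements are taken to be positive (so a sum below k*(k+1)/2 is rejected), and all intermediate values are assumed to fit in a long long.

```cpp
#include <cassert>

struct Ranges {
    bool ok;         // false when no k distinct values from 1..n add up to sum
    long long a, b;  // first range  [a..b]
    long long c, d;  // second range [c..d]; empty when c > d
};

Ranges pickK(long long n, long long k, long long sum) {
    assert(n >= 1 && k >= 1 && sum >= 1);
    Ranges none = {false, 0, 0, 0, 0};
    // The smallest reachable sum is 1 + 2 + ... + k.
    if (k > n || sum < k * (k + 1) / 2) return none;
    // g = ceil(sum/k + (k-1)/2), computed in integers.
    long long g = (2 * sum + k * (k - 1) + 2 * k - 1) / (2 * k);
    if (g > n) return none;
    // s = g + (g-1) + ... + (g-k+1)
    long long s = k * g - k * (k - 1) / 2;
    if (s == sum) {
        Ranges exact = {true, g - k + 1, g, 1, 0};
        return exact;
    }
    // Otherwise drop one element from K+ = {g-k, ..., g}.
    long long removed = s + (g - k) - sum;
    Ranges split = {true, g - k, removed - 1, removed + 1, g};
    return split;
}
```

For sum = 173, n = 40, k = 5 this gives g = 37, s = 175 and removed = 34, i.e. the set [32..33] ∪ [35..37], whose sum is 173.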

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
unsigned long int arithmeticSum(unsigned long int a, unsigned long int k, unsigned long int n, unsigned long int *A);
void printSubset(unsigned long int k, unsigned long int *A);
int main(void)
{
unsigned long int n, k, sum;
// scan the respective values of sum, n, and k
scanf("%lu %lu %lu", &sum, &n, &k);
// find the starting element using the formula for the sum of an A.P. having 'k' terms
// starting at 'a', common difference 'd' ( = 1 in this problem), having 'sum' = sum
// sum = [k/2][2*a + (k-1)*d]
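long double start = (long double)sum/k - (long double)k/2 + (long double)1/2;
// a negative value must not be converted to an unsigned type, so clamp it to 0 (rejected by the check below)
unsigned long startElement = start < 1 ? 0 : (unsigned long)start;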
// exit if the arithmetic progression formed at the startElement is not within the required bounds
if(startElement < 1 || startElement + k - 1 > n)
{
printf("-1\n");
return 0;
}
// we now work on the k-element set [startElement, startElement + k - 1]
// create an array to store the k elements
unsigned long int *A = malloc(k * sizeof(unsigned long int));
// calculate the sum of k elements in the arithmetic progression [a, a + 1, a + 2, ..., a + (k - 1)]
unsigned long int currentSum = arithmeticSum(startElement, k, n, A);
// if the currentSum is equal to the required sum, then print the array A, and we are done
if(currentSum == sum)
{
printSubset(k, A);
}
// we enter into this block only if currentSum < sum
// i.e. we need to add 'something' to the currentSum in order to make it equal to sum
// i.e. we need to remove an element from the k-element set [startElement, startElement + k - 1]
// and replace it with an element of higher magnitude
// i.e. we need to replace an element in the set [startElement, startElement + k - 1] and replace
// it with an element in the range [startElement + k, n]
else
{
long int j;
bool done;
// calculate the amount which we need to add to the currentSum
unsigned long int difference = sum - currentSum;
// starting from A[k-1] upto A[0] do the following...
for(j = k - 1, done = false; j >= 0; j--)
{
// check if adding the "difference" to A[j] results in a number in the range [startElement + k, n]
// if it does then replace A[j] with that element, and we are done
if(A[j] + difference <= n && A[j] + difference > A[k-1])
{
A[j] += difference;
printSubset(k, A);
done = true;
break;
}
}
// if no such A[j] is found then, exit with fail
if(done == false)
{
printf("-1\n");
}
}
// release the array allocated with malloc above
free(A);
return 0;
}
unsigned long int arithmeticSum(unsigned long int a, unsigned long int k, unsigned long int n, unsigned long int *A)
{
unsigned long int currentSum;
long int j;
// calculate the sum of the arithmetic progression and store the each member in the array A
for(j = 0, currentSum = 0; j < k; j++)
{
A[j] = a + j;
currentSum += A[j];
}
return currentSum;
}
void printSubset(unsigned long int k, unsigned long int *A)
{
long int j;
for(j = 0; j < k; j++)
{
printf("%lu ", A[j]);
}
printf("\n");
}

How to print values in memoization method - Dynamic programming

I know that a problem that can be solved using DP can be solved by either the tabulation (bottom-up) approach or the memoization (top-down) approach. Personally, I find memoization an easy and even efficient approach (analysis is required just to get the recursive formula; once the recursive formula is obtained, a brute-force recursive method can easily be converted to store each sub-problem's result and reuse it). The only problem that I am facing with this approach is that I am not able to construct the actual result from the table which I filled on demand.
For example, in the Matrix Product Parenthesization problem (deciding in which order to perform the multiplications on matrices so that the cost of multiplication is minimal), I am able to calculate the minimum cost but not able to generate the order in the algorithm.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Here I am able to get the min-cost as 4500 but unable to get the order, which is (AB)C.
I used this. Suppose F[i, j] represents the least number of multiplications needed to multiply Ai.....Aj, and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
F[i,j] = 0 if i = j
F[i,j] = min(F[i,k] + F[k+1,j] + p[i-1] * p[k] * p[j]) over k ∈ [i,j) otherwise
Below is the implementation that I have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 5 // indices 0..4 are used: the driver calls MatrixChainOrder(arr, 1, 4)
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know that with the tabulation approach printing the solution is much easier, but I want to do it with the memoization technique.
Thanks.

Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There are two possibilities:
When you compute the table, you can store the best index found in another memoization array.
You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
if (i == j) {
putchar('A' + i - 1);
return;
}
putchar('(');
print_mult_chain(i, lookup_splits[i][j]);
print_mult_chain(lookup_splits[i][j] + 1, j);
putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
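Putting the first possibility together, a self-contained sketch in C++ might look like this. It is not the asker's code: the struct ChainSolver and its members are names chosen for illustration, it uses -1 rather than 0 as the "not computed yet" marker, and it returns the parenthesization as a string instead of printing it.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <vector>

struct ChainSolver {
    std::vector<int> p;                         // matrix i has size p[i-1] x p[i]
    std::vector<std::vector<long long> > cost;  // memo table, -1 = not computed yet
    std::vector<std::vector<int> > split;       // best split point k for each (i, j)

    explicit ChainSolver(const std::vector<int>& dims)
        : p(dims),
          cost(dims.size(), std::vector<long long>(dims.size(), -1)),
          split(dims.size(), std::vector<int>(dims.size(), -1)) {
        assert(dims.size() >= 2);
    }

    // Top-down (memoized) minimum multiplication count for A_i ... A_j.
    long long solve(int i, int j) {
        if (i == j) return 0;
        if (cost[i][j] != -1) return cost[i][j];
        long long best = LLONG_MAX;
        for (int k = i; k < j; ++k) {
            long long c = solve(i, k) + solve(k + 1, j)
                        + 1LL * p[i - 1] * p[k] * p[j];
            if (c < best) {
                best = c;
                split[i][j] = k;  // remember where the best split was
            }
        }
        return cost[i][j] = best;
    }

    // Rebuilds the parenthesization; solve(i, j) must have been called first.
    std::string order(int i, int j) const {
        if (i == j) return std::string(1, char('A' + i - 1));
        int k = split[i][j];
        return "(" + order(i, k) + order(k + 1, j) + ")";
    }
};
```

For the dimensions 10, 30, 5, 60 from the question, solve(1, 3) gives 4500 and order(1, 3) gives ((AB)C). Because the split is stored at the moment the minimum is updated, reconstruction needs no extra passes over the cost table.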
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
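int get_lookup_splits(int p[], int i, int j) {
int best = INT_MAX;
int k_best = i; // any valid split; overwritten by the first iteration
for (int k = i; k < j; k++) {
int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < best) {
best = count;
k_best = k;
}
}
return k_best; // return the best split point, not the loop variable
}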
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].

finding all divisors of all the numbers from 1 to 10^6 efficiently

I need to find all divisors of all numbers between 1 and n (including 1 and n), where n equals 10^6, and I want to store them in a vector.
vector< vector<int> > divisors(1000000);
void abc()
{
long int n=1,num;
while(n<1000000)
{
num=n;
int limit=sqrt(num);
for(long int i=1;i<=limit;i++) // include sqrt(num) itself
{
if(num%i==0)
{
divisors[n].push_back(i);
if(i!=num/i) // do not store the square root twice
divisors[n].push_back(num/i);
}
}
n++;
}
}
This takes too much time as well. Can I optimize it in any way?

const int N = 1000000;
vector<vector<int>> divisors(N+1);
for (int i = 1; i <= N; i++) { // start at 1 so that 1 is stored as a divisor too
for (int j = i; j <= N; j += i) {
divisors[j].push_back(i);
}
}
This runs in O(N*log(N)).
The intuition is that for the upper N/2 values of i the inner loop runs only once. Then, for the upper half of the remaining values, it runs once more, and so on.
Or the other way around: if you increase N from, let's say, 10^6 to 10^7, then you have as many operations as at 10^6 times 10 (that part is linear), and what is extra are the numbers from 10^6 to 10^7, which are not visited more than 10 times each at worst.
The number of operations is
sum (N/n for n from 1 to N)
This becomes N * sum(1/n for n from 1 to N), and this is N*log(N), which can be shown by integrating 1/x dx from 1 to N.
We can see that the algorithm is optimal, because there are as many operations as there are divisors. The size of the result, i.e. the total number of divisors, is the same as the complexity of the algorithm.
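To make the operation count concrete, here is a small self-contained version of the same sieve wrapped in a function. The name allDivisors is hypothetical, and this variant starts at i = 1 so that 1 is listed as a divisor of every number.

```cpp
#include <cassert>
#include <vector>

// divisors[x] lists all divisors of x in ascending order, for 1 <= x <= n.
std::vector<std::vector<int> > allDivisors(int n) {
    assert(n >= 1);
    std::vector<std::vector<int> > divisors(n + 1);
    for (int i = 1; i <= n; ++i)
        for (int j = i; j <= n; j += i)  // every multiple j of i has i as a divisor
            divisors[j].push_back(i);
    return divisors;
}
```

The inner statement runs exactly floor(n/1) + floor(n/2) + ... + floor(n/n) times, which is also the total number of stored divisors; for n = 10 that is 27, and for n = 100 it is 482.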

I think this might not be the best solution, but it is much better than the one presented, so here we go:
Go over all the numbers (i) from 1 to n, and for each number:
1. Add the number to the list of itself.
2. Set multiplier to 2.
3. Add i to the list of i * multiplier.
4. Increase multiplier.
5. Repeat steps 3 & 4 until i * multiplier is greater than n.

[Edit3] complete reedit
Your current approach is O(n^1.5) not O(n^2)
Originally I suggested looking at Why are my nested for loops taking so long to compute?
But Oliver Charlesworth suggested that I read About Vectors growth, and that should not be much of an issue here (the measurements confirmed it).
So there is no need to preallocate memory for the list (it would just waste memory and, due to cache barriers, even lower the overall performance, at least on my setup).
So how to optimize?
either lower the constant time so the runtime of your iteration is better (even with worse complexity)
or lower the complexity so much that the overhead does not eat the speedup
I would start with SoF (Sieve of Eratosthenes)
But instead of marking a number as divisible, I would add the currently iterated sieve value to that number's divisor list. This is O(n*log(n)) (a harmonic sum of n/i over all i), with low overhead (no divisions and fully parallelisable) if coded right.
start computing the SoF for all numbers i=2,3,4,5,...,n-1
for each number x you hit, do not update the SoF table (you do not need it). Instead add the iterated sieve value i to the divisor list of x. Something like:
C++ source:
const int n=1000000;
List<int> divs[n];
void divisors()
{
int i,x;
for (i=1;i<n;i++)
for (x=i;x<n;x+=i)
divs[x].add(i);
}
This took 1.739s and found 13969984 divisors in total, with at most 240 divisors per number (including 1 and x). As you can see, it does not use any divisions, and the divisors are sorted in ascending order.
List<int> is a dynamic list of integers template (something like your vector<>)
You can adapt this to your kind of iteration, so that you only check up to nn=sqrt(n) and add 2 divisors per iteration. That visits fewer (i, x) pairs but has a different constant time (overhead): it is a bit slower per pair because of the single division and the duplicate check, so you need to measure whether it speeds things up or not. On my setup it is slower this way (~2.383s).
const int n=1000000;
List<int> divs[n];
int i,j,x,y,nn=sqrt(n);
for (i=1;i<=nn;i++)
for (x=i;x<n;x+=i)
{
for (y=divs[x].num-1;y>=0;y--)
if (i==divs[x][y]) break;
if (y<0) divs[x].add(i);
j=x/i;
for (y=divs[x].num-1;y>=0;y--)
if (j==divs[x][y]) break;
if (y<0) divs[x].add(j);
}
The next thing is to use direct memory access (not sure you can do that with vector<>). My list is capable of such a thing; do not confuse it with hardware DMA, this is just avoidance of array range checking. This speeds up the constant overhead of the duplicate check, and the resulting time is [1.793s], which is a little bit slower than the raw SoF version. So if you have a bigger n, this would be the way.
[Notes]
If you want to do prime decomposition then iterate i only through primes (in that case you need the SoF table) ...
If you have problems with the SoF or primes, look at Prime numbers by Eratosthenes quicker sequential than concurrently? for some additional ideas on this

Another optimization is not to use -vector- nor -list- , but a large array of divisors, see http://oeis.org/A027750
First step: Sieve of number of divisors
Second step: Sieve of divisors with the total number of divisors
Note: A maximum of 20-fold time increase for 10-fold range. --> O(N*log(N))
Dev-C++ 5.11 , in C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int SieveNbOfDiv(int NumberOfDivisors[], int IndexCount[], int Limit) {
for (int i = 1; i*i <= Limit; i++) {
NumberOfDivisors[i*i] += 1;
for (int j = i*(i+1); j <= Limit; j += i )
NumberOfDivisors[j] += 2;
}
int Count = 0;
for (int i = 1; i <= Limit; i++) {
Count += NumberOfDivisors[i];
NumberOfDivisors[i] = Count;
IndexCount[i] = Count;
}
return Count;
}
void SieveDivisors(int IndexCount[], int NumberOfDivisors[], int Divisors[], int Limit) {
for (int i = 1; i <= Limit; i++) {
Divisors[IndexCount[i-1]++] = 1;
Divisors[IndexCount[i]-1] = i;
}
for (int i = 2; i*i <= Limit; i++) {
Divisors[IndexCount[i*i-1]++] = i;
for (int j = i*(i+1); j <= Limit; j += i ) {
Divisors[IndexCount[j-1]++] = i;
Divisors[NumberOfDivisors[j-1] + NumberOfDivisors[j] - IndexCount[j-1]] = j/i;
}
}
}
int main(int argc, char *argv[]) {
int N = 1000000;
if (argc > 1) N = atoi(argv[1]);
int ToPrint = 0;
if (argc > 2) ToPrint = atoi(argv[2]);
clock_t Start = clock();
printf("Using sieve of divisors from 1 to %d\n\n", N);
printf("Evaluating sieve of number of divisors ...\n");
int *NumberOfDivisors = (int*) calloc(N+1, sizeof(int));
int *IndexCount = (int*) calloc(N+1, sizeof(int));
int size = SieveNbOfDiv(NumberOfDivisors, IndexCount, N);
printf("Total number of divisors = %d\n", size);
printf("%0.3f second(s)\n\n", (double)(clock() - Start)/CLOCKS_PER_SEC); /* portable: do not assume 1000 ticks per second */
printf("Evaluating sieve of divisors ...\n");
int *Divisors = (int*) calloc(size+1, sizeof(int));
SieveDivisors(IndexCount, NumberOfDivisors, Divisors, N);
printf("%0.3f second(s)\n", (double)(clock() - Start)/CLOCKS_PER_SEC); /* portable: do not assume 1000 ticks per second */
if (ToPrint == 1)
for (int i = 1; i <= N; i++) {
printf("%d(%d) = ", i, NumberOfDivisors[i] - NumberOfDivisors[i-1]);
for (int j = NumberOfDivisors[i-1]; j < NumberOfDivisors[i]; j++)
printf("%d ", Divisors[j]);
printf("\n");
}
return 0;
}
With some results:
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 100000
Using sieve of divisors from 1 to 100000
Evaluating sieve of number of divisors ...
Total number of divisors = 1166750
0.000 second(s)
Evaluating sieve of divisors ...
0.020 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 1000000
Using sieve of divisors from 1 to 1000000
Evaluating sieve of number of divisors ...
Total number of divisors = 13970034
0.060 second(s)
Evaluating sieve of divisors ...
0.610 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 10000000
Using sieve of divisors from 1 to 10000000
Evaluating sieve of number of divisors ...
Total number of divisors = 162725364
0.995 second(s)
Evaluating sieve of divisors ...
11.900 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>

SPOJ DQUERY : TLE Even With BIT?

Here is the problem I want to solve. I am using the fact that prefix sum[i] - prefix sum[i-1] gives a frequency greater than zero to identify distinct elements, and then I eliminate that frequency. But even with a BIT, I am getting a TLE.
Given a sequence of n numbers a_1, a_2, ..., a_n and a number of d-queries.
A d-query is a pair (i, j) (1 ≤ i ≤ j ≤ n).
For each d-query (i, j), you have to return the number of distinct elements in the subsequence a_i, a_{i+1}, ..., a_j.
Input
Line 1: n (1 ≤ n ≤ 30000).
Line 2: n numbers a_1, a_2, ..., a_n (1 ≤ a_i ≤ 10^6).
Line 3: q (1 ≤ q ≤ 200000), the number of d-queries.
In the next q lines, each line contains 2 numbers i, j
representing a d-query (1 ≤ i ≤ j ≤ n).
Output
For each d-query (i, j), print the number of distinct elements in the
subsequence a_i, a_{i+1}, ..., a_j in a single line.
Example
Input
5
1 1 2 1 3
3
1 5
2 4
3 5
Output
3
2
3
the code is:
#include <iostream>
#include <algorithm>
#include <vector>
#include <stdlib.h>
#include <stdio.h>
typedef long long int ll;
using namespace std;
void update(ll n, ll val, vector<ll> &b);
ll read(ll n,vector<ll> &b);
ll readsingle(ll n,vector<ll> &b);
void map(vector<ll> &a,vector<ll> &b,ll n) /**** RElative Mapping ***/
{
ll temp;
a.clear();
b.clear();
for(ll i=0; i<n; i++)
{
cin>>temp;
a.push_back(temp);
b.push_back(temp);
}
sort(b.begin(),b.end());
for(ll i=0; i<n; i++)
*(a.begin()+i) = (lower_bound(b.begin(),b.end(),a[i])-b.begin())+1;
b.assign(n+1,0);
}
int main()
{
ll n;
cin>>n;
vector<ll> a,b;
map(a,b,n);
ll t;
cin>>t;
while(t--)
{
ll l ,u;
b.assign(n+1,0);
cin>>l>>u;
l--;/*** Reduce For Zero Based INdex ****/
u--;
for(ll i=l;i<=u;i++)
update(a[i],1,b);
ll cont=0;
for(ll i=l;i<=u;i++)
if(readsingle(a[i],b)>0)
{
cont++;
update(a[i],-readsingle(a[i],b),b); /***Eliminate The Frequency */
}
cout<<cont<<endl;
}
return 0;
}
ll readsingle(ll n,vector<ll> &b)
{
return read(n,b)-read(n-1,b);
}
ll read(ll n,vector<ll> &b)
{
ll sum=0;
for(; n; sum+=b[n],n-=n&-n);
return sum;
}
void update(ll n, ll val, vector<ll> &b)
{
for(; n<(ll)b.size(); b[n]+=val,n+=n&-n); // valid indices are 0..b.size()-1, so stop before b.size()
}

The algorithm you use is too slow. For each query, you iterate over the entire query range, which already gives n * q operations (obviously, that is way too many). Here is a better solution; it has O((n + q) * log n) time and O(n + q) space complexity, and it is an offline solution:
Sort all queries by their right end (there is no need to sort them explicitly; you can just add each query to the appropriate position, from 0 to n - 1).
Now iterate over all positions in the array from left to right and maintain a BIT. Each position i in the BIT holds either 1 (meaning that position i is the latest occurrence seen so far of its element) or 0 (initially, it is filled with zeros).
For each element a[i]: if it is the first occurrence of this element, just add one to position i in the BIT. Otherwise, add -1 to the position of the previous occurrence of this element and then add 1 to position i.
The answer to the query (left, right) is just the sum over all positions from left to right, taken when the scan has reached position right.
To maintain the last occurrence of each element, you can use a map.
It is possible to make it online using a persistent segment tree (the time complexity would be the same; the space complexity would become O(n * log n + q)), but it is not required here.
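The offline approach described above could be sketched as follows. The struct Fenwick and the function distinctInRanges are illustrative names, positions and queries are 1-based as in the problem statement, and input/output handling is left out.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Binary indexed tree over positions 1..n.
struct Fenwick {
    std::vector<int> t;
    explicit Fenwick(int n) : t(n + 1, 0) {}
    void add(int i, int v) { for (; i < (int)t.size(); i += i & -i) t[i] += v; }
    int sum(int i) const { int s = 0; for (; i > 0; i -= i & -i) s += t[i]; return s; }
};

// queries[q] = (left, right), 1-based and inclusive.
// Returns the number of distinct values in a[left..right] for every query.
std::vector<int> distinctInRanges(const std::vector<int>& a,
                                  const std::vector<std::pair<int, int> >& queries) {
    int n = (int)a.size();
    // Bucket the queries by right end instead of sorting them.
    std::vector<std::vector<int> > byRight(n + 1);
    for (int q = 0; q < (int)queries.size(); ++q) {
        int l = queries[q].first, r = queries[q].second;
        assert(1 <= l && l <= r && r <= n);
        byRight[r].push_back(q);
    }
    Fenwick bit(n);
    std::map<int, int> last;  // value -> position of its latest occurrence
    std::vector<int> answer(queries.size());
    for (int i = 1; i <= n; ++i) {
        std::map<int, int>::iterator it = last.find(a[i - 1]);
        if (it != last.end()) bit.add(it->second, -1);  // the older occurrence no longer counts
        bit.add(i, 1);
        last[a[i - 1]] = i;
        for (int q : byRight[i])
            answer[q] = bit.sum(i) - bit.sum(queries[q].first - 1);
    }
    return answer;
}
```

On the sample from the question (1 1 2 1 3 with queries 1 5, 2 4, 3 5) this returns 3, 2, 3. Since the values are at most 10^6, a plain array could replace the map; the map just keeps the sketch independent of the value range.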
