List of divisors of an integer n (Haskell) - performance

I currently have the following function to get the divisors of an integer:
-- All divisors of a number
divisors :: Integer -> [Integer]
divisors 1 = [1]
divisors n = firstHalf ++ secondHalf
  where firstHalf = filter (divides n) (candidates n)
        secondHalf = filter (\d -> n `div` d /= d) (map (n `div`) (reverse firstHalf))
        candidates n = takeWhile (\d -> d * d <= n) [1..n]
        -- assumed definition; `divides` is not shown in the question
        divides m d = m `mod` d == 0
I ended up adding the filter to secondHalf because a divisor was repeated when n is the square of a prime number. This seems like a very inefficient way to solve the problem.
So I have two questions: How do I measure whether this really is a bottleneck in my algorithm? And if it is, how do I go about finding a better way to avoid repetitions when n is the square of a prime?

To measure where the bottleneck is, put the three auxiliary definitions (firstHalf, secondHalf, candidates) at the top level, and run your code with the profiler on:
ghc -prof --make divisors.hs
./divisors 100 +RTS -p -RTS
Also, you know that the biggest candidate is sqrt n, so instead of doing that many multiplications d*d, just consider [1..floor (sqrt (fromIntegral n))].
For better algorithms, you should consult a maths book, since this is not really a Haskell question. One thing you can consider: if a divides b, then every divisor d of a divides b as well.
You'll want to use memoization or dynamic programming to avoid checking multiple times whether a given d divides b (for example, if 15 and 27 divide b, then you need to check only once that 3 divides b; the other times, you just see whether 3 is in your table of divisors of b).

You needn't test all the elements of reversed second half. You know that if the square root is present, it is the head element there:
secondHalf = let (r:ds) = [n `div` d | d <- reverse firstHalf]
             in [r | n `div` r /= r] ++ ds
This assumes n is positive.
A simpler way is to handle the square root of the number separately:
divs n =            -- for n > 1
  let
    r = floor $ sqrt $ fromIntegral n
    isSq = r*r == n
    -- when n is not a perfect square, r itself must still be tried as a divisor
    top = if isSq then r-1 else r
    (a,b) = unzip $ (1,n) : [(d, q) | d <- [2..top], let (q,m) = quotRem n d, m == 0]
  in
    if isSq
      then a ++ r : reverse b
      else a ++ reverse b
That way we get the second half for free, as a part of producing the first half.
But this could hardly be a bottleneck in your application because the algorithm itself is inefficient. It is usually much faster to generate the divisors from a number's prime factorization. Prime factorization by trial division can be much quicker because we divide out each factor as it is found, reducing the number being factorized and thus the number of candidate divisors that are tried (up to the reduced number's square root). For example, 12348 = 2*2*3*3*7*7*7 and no factor above 7 is tried in the process of factorization, whereas in divs 12348 the number 12348 is divided by all numbers from 2 to 110:
factorize n = go n (2:[3,5..])     -- or: (go n primes) where
  where                            --   primes = 2 :
    go n ds@(d:t)                  --     filter (null.tail.factorize) [3,5..]
      | d*d > n   = [n]
      | r == 0    = d : go q ds
      | otherwise = go n t
      where (q,r) = quotRem n d
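The answer above says divisors are best generated from the prime factorization but only shows the factorization step. As a minimal sketch of the remaining step (written in Python rather than Haskell; the names `factorize` and `divisors` are illustrative and this is not the answerer's code), each prime power p^0..p^k is multiplied into the divisors collected so far:

```python
from collections import Counter

def factorize(n):
    """Prime factors of n (n >= 2) in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            n //= d                    # divide the factor out, shrinking the bound
        else:
            d += 1 if d == 2 else 2    # 2, then odd candidates only
    factors.append(n)                  # what remains is prime
    return factors

def divisors(n):
    """Sorted divisors of a positive integer n, built from its factorization."""
    if n == 1:
        return [1]
    divs = [1]
    for p, k in Counter(factorize(n)).items():
        # combine every divisor found so far with p^0, p^1, ..., p^k
        divs = [d * p**e for d in divs for e in range(k + 1)]
    return sorted(divs)
```

Because every divisor is produced exactly once, no duplicate filtering is needed for squares such as 49.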

Related

polynomial (in n) time algorithm that decides whether N is a power

I am a computer science student; I am studying the Algorithms course independently.
During the course, I saw this question:
Given an n-bit integer N, find a polynomial (in n) time algorithm that decides whether N is a power (that is, there are integers a and k > 1 so that a^k = N).
I thought of a first option that is exponential in n:
For all k , 1<k<N , try to divide N by k until I get result 1.
For example, if N = 27, I will start with k = 2 , because 2 doesn't divide 27, I will go to next k =3.
I will divide 27 / 3 to get 9, and divide it again until I will get 1. This is not a good solution because it is exponential in n.
My second option is using modular arithmetic, using a^k ≡ 1 (mod k+1) if gcd(a, k+1) = 1 (Euler's theorem). I don't know if a and k are relatively prime.
I am trying to write an algorithm, but I am struggling to do it:
function power(N)
Input: Positive integer N
Output: yes/no
Pick positive integers a_1, a_2, . . . , a_k < N at random
if (a_i)^(N−1) ≡ 1 (mod N)
    for all i = 1, 2, . . . , k:
    return yes
else:
    return no
I'm not sure if the algorithm is correct. How can I write this correctly?
Ignoring the cases when N is 0 or 1, you want to know if N is representable as a^b for a>1, b>1.
If you knew b, you could find a in O(log(N)) arithmetic operations (using binary search). Each arithmetic operation including exponentiation runs in polynomial time in log(N), so that would be polynomial too.
It's possible to bound b: it can be at most log_2(N)+1, otherwise a will be less than 2.
So simply try each b from 2 to floor(log_2(N)+1). Each try is polynomial in n (n ~= log_2(N)), and there's O(n) trials, so the resulting time is polynomial in n.
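The procedure just described can be sketched as follows. This is only an illustration under the stated bound on b, using exact integer arithmetic; the helper names `iroot` and `perfect_power` are made up for this example:

```python
def iroot(n, b):
    """Largest integer a with a**b <= n, by binary search (n >= 1, b >= 1)."""
    lo, hi = 1, 1 << (n.bit_length() // b + 1)   # invariant: lo**b <= n < hi**b
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid**b <= n:
            lo = mid
        else:
            hi = mid
    return lo

def perfect_power(n):
    """Return (a, b) with a**b == n, a > 1, b > 1, or None if there is none."""
    # b can be at most about log2(n), since a >= 2
    for b in range(2, n.bit_length() + 1):
        a = iroot(n, b)
        if a > 1 and a**b == n:
            return a, b
    return None
```

Each call to `iroot` performs O(n) big-integer exponentiations for an n-bit input, and there are O(n) values of b, so the total work stays polynomial in n.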
This looks like a simple math question. Suppose that we are given N = 96889010407 which is much less than Number.MAX_SAFE_INTEGER.
The question tries to figure out if N is a power where a**k === N for a > 1 and k > 1. So we can also write it as
Math.log(a**k) === Math.log(N) yielding k*Math.log(a) === Math.log(N) yielding Math.log(a) === Math.log(N) / k where k is an Integer > 1.
Now remember the inverse logarithm. Math.log(y) = x yields y = Math.E**x.
This means we are looking for an Integer like a = Math.E**(Math.log(N) / k) for some k if exists. So start from k=2 and increment by 1.
k    a = Math.E**(Math.log(N) / k)
___  _____________________________
2    311269.99599543784   -> NO
3    4592.947769836504    -> NO
4    557.9157606623403    -> NO
5    157.49069663608586   -> NO
6    67.77129015915592    -> NO
7    37.1080205641031     -> NO
8    23.62024048697092    -> NO
9    16.622531664172815   -> NO
10   12.54952973764698    -> NO
11   9.971310247420734    -> NO
12   8.232332000056601    -> NO
13   6.999999999999999    -> YES a is 7 and 96889010407 = 7^13
So for how long do we have to iterate? As long as Math.E**(Math.log(N) / k) >= 2. In this case max 36 iterations since Math.E**(Math.log(96889010407) / 37) is 1.9811909632660634 and a must be an integer > 1.
This algorithm is probably the most efficient one for this job. Its time complexity is O(log2(N)) iterations as we iterate k (the power). Had we chosen a to iterate then the time complexity would be O(sqrt(N)). Because the floating-point result is not exact (6.999999999999999 rather than 7), each candidate a has to be rounded to the nearest integer and confirmed by checking a**k === N.
This is OK for Natural numbers but you can extend this to the Rationals as well.
Say, is 10.999671418529301 a perfect power?
All you have to do is to convert the decimal into a fraction the best way possible to get the rational form 4084101/371293 and apply both the numerator and the denominator to the mentioned algorithm above, to see if they both give the same power which in this case would be 5. 10.999671418529301 is 21^5/13^5.
Note: JS Math object is used in the example.
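The number N cannot exceed 2^n. Hence you can initialize i=2, j=n and compute i^j, decreasing j while i^j is above N, then increasing i while i^j is below N, and so on. If N is a power, it is found this way. Note, however, that once j has dropped to 2, i may have to climb as far as sqrt(N), so the number of steps is not polynomial in n in the worst case.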
E.g. with 7776 < 8192 = 2^13, you try 2^12 = 4096, then 3^12, 3^11, 3^10, 3^9, 3^8, then 4^8, 4^7, 4^6, 5^6, 5^5, 6^5 and you are done.

My Haskell Solution to Euler #3 is Inefficient

I am attempting to solve Euler problem 3 in Haskell, which involves finding the largest prime factor of a number. My code runs for a long time and seems to hang. What is causing my code to be so grossly inefficient?
primes = sieve (2:[3,5..])
  where sieve (x:xs) = x:[y | y <- (sieve xs), mod y x /= 0]
        sieve [] = []
primefactors n = filter (\x -> mod n x == 0) (primesUnder n)
  where primesUnder z = reverse (takeWhile (< z) primes)
solve3 = head (primefactors 600851475143)
Your main problem is you're checking for enormous primes -- all the way up to 600851475143. You can improve things a lot by observing two things:
Every time you find a prime, you can decrease the maximum prime you look at by dividing away that factor.
You only have to look for primes until you reach the square root of the target. If your primes are bigger than that, and you know there are no smaller factors, you're done.
Using these two improvements together, even without the nicety that you used of only checking primes for divisibility, makes the program run in a snap:
factor = go (2:[3,5..]) where
  go (p:ps) n
    | p*p > n        = [n]
    | n `mod` p == 0 = p : go (p:ps) (n `div` p)
    | otherwise      = go ps n
main = print . last . factor $ 600851475143
In ghci:
*Main> main
6857
(0.00 secs, 0 bytes)
You can see that we only had to inspect numbers up to 6857 -- eight orders of magnitude smaller than what you would have to do with your approach.
Independently, your sieve is dog slow. You could have a look at the wiki for ideas about how to find primes quickly.

Determining which integer is closest to the kth root of n without using floating point arithmetic?

Suppose that I want to compute ᵏ√n rounded to the nearest integer, where n and k are nonnegative integers. Using binary search, I can find an integer a such that
a^k ≤ n < (a+1)^k.
This means that either a or a+1 is the kth root of n rounded to the nearest integer. However, I'm not sure how to determine which one it is without doing some calculations that involve floating-point arithmetic.
Given the values of a, n, and k, is there a way to determine the kth root of n rounded to the nearest integer without doing any floating-point calculations?
Thanks!
2^k·a^k < 2^k·n < (2a+1)^k → (dividing by 2^k) a^k < n < (a+0.5)^k → (taking the kth root) a < ᵏ√n < a+0.5, so the kth root of n is closer to a than to a+1. Note that the edge case will not occur; the kth root of an integer cannot be an integer plus 0.5 (a+0.5), as the kth roots of integers which are not kth powers are irrational, and if n were a perfect kth power, then the kth root would be an integer.
The answers by Ramchandra Apte and Lazarus both contain what seems to be the essence of the correct answer, but both are also (at least to me) a bit hard to follow. Let me try to explain the trick they seem to be getting at, as I understand it, a bit more clearly:
The basic idea is that, to find out whether a or a+1 is closer to ᵏ√n, we need to test whether ᵏ√n < a+½.
To get rid of the ½, we can simply multiply both sides of this inequality by 2, giving 2·ᵏ√n < 2a+1, and by raising both sides to the k-th power (and assuming they're both positive) we get the equivalent inequality 2^k·n < (2a+1)^k. So, at least as long as 2^k·n = n << k does not overflow, we can simply compare it with (2a+1)^k to obtain the answer.
In fact, we could simply compute b = ⌊ ᵏ√(2^k·n) ⌋ to begin with. If b is even, then the closest integer to ᵏ√n is b / 2; if b is odd, it is (b + 1) / 2. Indeed, we can combine the two cases and say that the closest integer to ᵏ√n is ⌊ (b+1) / 2 ⌋, or, in pseudo-C:
int round_root( int k, int n ) {
    int b = floor_root( k, n << k );
    return (b + 1) / 2;
}
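For a runnable version of the pseudo-C above, here is a small Python sketch. The pseudo-C leaves `floor_root` undefined, so the binary-search implementation below is an assumption made for illustration; Python integers are unbounded, which sidesteps the overflow concern about n << k:

```python
def floor_root(k, n):
    """Largest integer b with b**k <= n (n >= 0, k >= 1), by binary search."""
    lo, hi = 0, 1
    while hi**k <= n:          # grow hi until hi**k > n
        hi *= 2
    while hi - lo > 1:         # invariant: lo**k <= n < hi**k
        mid = (lo + hi) // 2
        if mid**k <= n:
            lo = mid
        else:
            hi = mid
    return lo

def round_root(k, n):
    """Integer nearest to the kth root of n, using integer arithmetic only."""
    b = floor_root(k, n << k)  # floor of twice the kth root of n
    return (b + 1) // 2
```

For instance, `round_root(3, 16)` takes the floor cube root of 128, which is 5, and returns 3.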
Ps. An alternative approach could be to compute an approximation of (a+½)^k directly using the binomial theorem as
(a+½)^k = ∑_{i=0..k} (k choose i) a^(k−i) / 2^i ≈ a^k + k·a^(k−1) / 2 + ...
and compare it directly with n. However, at least naïvely, summing all the terms of the binomial expansion would still require keeping track of k extra bits of precision (or at least k−1; I believe the last term can be safely neglected), so it may not gain much over the method suggested above.
My guess is that you want to use this algorithm on an FPGA/CPLD, or a processor with limited resources, since your approach reminds me of CORDIC. Hence, I will give a solution with that in mind.
When you reach a^k ≤ n < (a+1)^k, it means that the floor of x = root(n,k) is a. In other words, x = a + f, where 0 ≤ f < 1. Multiplying the equation by 2, you will have 2x = 2a + 2f, so floor(2x) = 2a exactly when f < 0.5 (since then 2f < 1). Now, x is the kth root of n, thus 2x is the kth root of (2^k)*n. So, just shift n by k bits to the left, then calculate its kth root with your algorithm. If that lower bound is exactly 2 times the lower bound of the kth root of n, then the closest integer is a, otherwise it is a+1.
Assuming you have a function that gives you the lower bound of the kth root of n (rootk(n)), the final result, using binary operators and with C notations, would be:
closestint = a + ((rootk(n) << 1) != rootk(n << k));
Compute the cube of (a + 0.5)*10 (or 10a + 5 - no floating point arithmetic), then divide it by 1000.
Then check on which side the number is.
The idea of multiplying by 10 is to shift the decimal place one position to the right. Then we divide by 1000 because cubing multiplies by 10 three times.
For example:
Input: 16
a = 2
a+1 = 3
a^3 = 8
(a+1)^3 = 27
10a + 5 = 25
25^3 = 15625
floor(15625 / 1000) = 15
16 > 15, thus 3 is closer.
It would also work to, as Oli pointed out, compute the cube of (a + 0.5)*2 (or 2a + 1), then divide it by 8.
For example:
2a + 1 = 5
5^3 = 125
floor(125 / 8) = 15
16 > 15, thus 3 is closer.
You can use Newton's method to find a; it works perfectly well with integers, and is faster than binary search. Then compute a^k and (a+1)^k using the square-and-multiply powering algorithm. Here's some code, in Scheme because I happened to have that handy:
(define (iroot k n) ; largest integer x such that x ^ k <= n
  (let ((k-1 (- k 1)))
    (let loop ((u n) (s (+ n 1)))
      (if (<= s u) s
        (loop (quotient (+ (* k-1 u) (quotient n (expt u k-1))) k) u)))))
(define (ipow b e) ; b^e
  (if (= e 0) 1
    (let loop ((s b) (i e) (a 1)) ; a * s^i = b^e
      (let ((a (if (odd? i) (* a s) a)) (i (quotient i 2)))
        (if (zero? i) a (loop (* s s) i a))))))
To determine which of a and a+1 is closer to the root, you could use the powering algorithm to compute (a + 1/2)^k (it's an exact calculation that the square-and-multiply operation can perform), then compare the result to n and determine which side is closer.
Edit:
Sorry for misunderstanding the problem. Here is a possible solution to the original question:
Use Newton's first-order approximation:
(here = means approximately equal)
f(b+a) = f(b) + a*f'(b)
a -> 0
f(x) = x^k
f'(x) = k*x^(k-1)
hence using above equation
f(a+0.5) = a^k + 1/2*k*a^(k-1);
need to check n < f(a+0.5)
n < a^k + 1/2*k*a^(k-1)
rearranging (n-a^k)*2 < k*a^(k-1)
Note: you can use binomial theorem to get more precision.
Think. Ideally, you'd do one more step of binary search, to see which side of a+½ the root lies. That is, test the inequality
(a+0.5)^k < n
But the left hand side is difficult to compute precisely (floating point issues). So write down an equivalent inequality in which all the terms are integers:
(2a+1)^k < 2^k n
Done.

Very large number modulo prime number

I was asked the following question in an interview:
How to solve this: ((3000000!)/(30!)^100000)%(any prime no.)
I coded the C program for same using brute force, but I am sure that he was not expecting this. Any suggestions for the solutions?
3000000! = 1*2*3*4*5*..*8*...*16*...*24*...*32*...40*...*64*...*3000000
Can we count the number of 2s in the result? Yes, each power of 2 contributes one 2 to each of its multiples. So the total number of 2s in the factorization of n! is n/2 + n/4 + n/8 + n/16 + n/32 + ... where / is integer division and terms are summed up while they are greater than 0:
fnf n f =     -- number of `f` factors in `n!`
  sum . takeWhile (>0) . tail . iterate (`div` f) $ n
(writing the pseudocode in Haskell). When f*f <= n, there will be more than one entry to sum up. For bigger fs, there will be only one entry to sum, viz. n `div` f.
So the factorization of n! is found as
factfact n =  -- factorization of n! as [ (p,k) ... ] for n! = PROD p_i^k_i
  let
    (ps,qs) = span (\p-> p*p <= n) primes   -- (before, after)
  in
    [(f, fnf n f) | f <- ps] ++
    [(f, n `div` f) | f <- takeWhile (<= n) qs]
Now, factorization of 30! has 10 factors:
> factfact 30
[(2,26),(3,14),(5,7),(7,4),(11,2),(13,2),(17,1),(19,1),(23,1),(29,1)]
The 100000th power of it just has each of its factor coefficients multiplied by 100000. When we take the factorization of 3000000!, its first few terms out of 216816 total, are:
> factfact 3000000
[(2,2999990),(3,1499993),(5,749998),(7,499996),(11,299996),(13,249998),
(17,187497),(19,166665),(23,136361),(29,107142),(31,99998), ...
so after the division when we subtract the second from the first none are lacking nor cancelled out completely:
[(2,399990),(3,99993),(5,49998),(7,99996),(11,99996),(13,49998),
(17,87497),(19,66665),(23,36361),(29,7142),(31,99998), ...
So for any prime less than 3000000 the remainder is 0. What if it is bigger, p > 3000000? Then, modular exponentiation mod p and multiplication mod p for this factorization, that we found above, must be used. There are plenty of answers about those, on SO.
Of course in the production code (for a non-lazy programming language) we wouldn't build the intermediate factorization list, but instead just process each prime below 3000000, one by one (there's no need for that with a lazy language).
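Putting the pieces of this answer together for a prime p larger than 3000000, a rough Python sketch could look like the following. The function names and the generic (n, m, k, p) signature are invented for this illustration; it handles one prime at a time, as suggested above, and relies on Python's built-in three-argument `pow` for modular exponentiation:

```python
from math import isqrt

def primes_upto(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            # cross out the multiples of i starting at i*i
            sieve[i*i::i] = bytearray(len(range(i*i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def legendre(n, q):
    """Exponent of the prime q in n!, i.e. n//q + n//q^2 + ..."""
    e = 0
    while n:
        n //= q
        e += n
    return e

def factorial_quotient_mod(n, m, k, p):
    """(n! / (m!)**k) mod p, assuming the quotient is an integer."""
    result = 1
    for q in primes_upto(n):
        e = legendre(n, q) - k * legendre(m, q)
        assert e >= 0, "quotient is not an integer"
        result = result * pow(q, e, p) % p   # modular exponentiation
    return result
```

The interview expression would then correspond to `factorial_quotient_mod(3000000, 30, 100000, p)`; for any prime p up to 3000000 the factor pow(p, e, p) is 0, matching the observation above.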

Why do we check up to the square root of a number to determine if the number is prime?

To test whether a number is prime or not, why do we have to test whether it is divisible only up to the square root of that number?
If a number n is not a prime, it can be factored into two factors a and b:
n = a * b
Now a and b can't both be greater than the square root of n, since then the product a * b would be greater than sqrt(n) * sqrt(n) = n. So in any factorization of n, at least one of the factors must be less than or equal to the square root of n, and if we can't find any factors less than or equal to the square root, n must be a prime.
Let's say m = sqrt(n) then m × m = n. Now if n is not a prime then n can be written as n = a × b, so m × m = a × b. Notice that m is a real number whereas n, a and b are natural numbers.
Now there can be 3 cases:
a > m ⇒ b < m
a = m ⇒ b = m
a < m ⇒ b > m
In all 3 cases, min(a, b) ≤ m. Hence if we search till m, we are bound to find at least one factor of n, which is enough to show that n is not prime.
Because if a factor is greater than the square root of n, the other factor that would multiply with it to equal n is necessarily less than the square root of n.
Suppose n is not a prime number (greater than 1). So there are numbers a and b such that
n = ab (1 < a <= b < n)
By multiplying the relation a<=b by a and b we get:
a^2 <= ab
ab <= b^2
Therefore: (note that n=ab)
a^2 <= n <= b^2
Hence: (Note that a and b are positive)
a <= sqrt(n) <= b
So if a number (greater than 1) is not prime and we test divisibility up to square root of the number, we will find one of the factors.
It's all really just basic uses of Factorization and Square Roots.
It may appear to be abstract, but in reality it simply comes down to the fact that the smallest nontrivial factor of a non-prime number can be at most its square root, because:
sqrroot(n) * sqrroot(n) = n.
Given that, if any whole number above 1 and below or up to sqrroot(n) divides evenly into n, then n cannot be a prime number.
Pseudo-code example:
i = 2;
is_prime = true;
while loop (i <= sqrroot(n))
{
    if (n % i == 0)
    {
        is_prime = false;
        exit while;
    }
    ++i;
}
Let's suppose that the given integer N is not prime,
Then N can be factorized into two factors a and b , 2 <= a, b < N such that N = a*b.
Clearly, both of them can't be greater than sqrt(N) simultaneously.
Let us assume without loss of generality that a is smaller.
Now, if you could not find any divisor of N belonging in the range [2, sqrt(N)], what does that mean?
This means that N does not have any divisor in [2, a] as a <= sqrt(N).
Therefore, a = 1 and b = N and hence, by definition, N is prime.
...
Further reading if you are not satisfied:
Many different combinations of (a, b) may be possible. Let's say they are:
(a1, b1), (a2, b2), (a3, b3), ..... , (ak, bk). Without loss of generality, assume ai < bi, 1<= i <=k.
Now, to be able to show that N is not prime it is sufficient to show that none of ai can be factorized further. And we also know that ai <= sqrt(N) and thus you need to check till sqrt(N) which will cover all ai. And hence you will be able to conclude whether or not N is prime.
...
So to check whether a number N is Prime or not.
We only need to check if N is divisible by numbers <= SQROOT(N). This is because, if we factor N into any 2 factors, say X and Y, i.e. N = X*Y:
X and Y cannot both be less than SQROOT(N), because then X*Y < N
X and Y cannot both be greater than SQROOT(N), because then X*Y > N
Therefore one factor must be less than or equal to SQROOT(N) ( while the other factor is greater than or equal to SQROOT(N) ).
So to check if N is Prime we need only check those numbers <= SQROOT(N).
Let's say we have a number "a", which is not prime [not prime/composite number means - a number which can be divided evenly by numbers other than 1 or itself. For example, 6 can be divided evenly by 2, or by 3, as well as by 1 or 6].
6 = 1 × 6 or 6 = 2 × 3
So now if "a" is not prime then it can be divided by two other numbers and let's say those numbers are "b" and "c". Which means
a=b*c.
Now if both "b" and "c" are greater than the square root of "a", then the product of "b" and "c" will be greater than "a".
So at least one of "b" and "c" is always <= the square root of "a" for the equation "a=b*c" to hold.
Because of the above reason, when we test if a number is prime or not, we only check until square root of that number.
Given any number n, then one way to find its factors is to get its square root p:
sqrt(n) = p
Of course, if we multiply p by itself, then we get back n:
p*p = n
It can be re-written as:
a*b = n
Where p = a = b. If a increases, then b decreases to maintain a*b = n. Therefore, p is the upper limit.
Update: I am re-reading this answer again today and it became clearer to me. The value p is not necessarily an integer, because if it is, then n would not be a prime. So, p could be a real number (i.e., with a fractional part). And instead of going through the whole range up to n, now we only need to go through the range up to p. The factors above p are a mirror copy of those below it. Note that this reduction cannot be repeated: taking the square root of p again would skip possible factors between sqrt(p) and p (for n = 35, p is about 5.9 and the factor 5 lies above sqrt(p)).
Let n be non-prime. Therefore, it has at least two integer factors greater than 1. Let f be the smallest of n's such factors. Suppose f > sqrt n. Then n/f is an integer ≤ sqrt n, thus smaller than f. Therefore, f cannot be n's smallest factor. Reductio ad absurdum; n's smallest factor must be ≤ sqrt n.
Any composite number is a product of primes.
Let say n = p1 * p2, where p2 > p1 and they are primes.
If n % p1 === 0 then n is a composite number.
If n % p2 === 0 then guess what n % p1 === 0 as well!
So there is no way that if n % p2 === 0 but n % p1 !== 0 at the same time.
In other words if a composite number n can be divided evenly by
p2,p3...pi (its greater factors), it must be divisible by its lowest factor p1 too.
It turns out that the lowest factor p1 <= Math.sqrt(n) is always true.
Yes, as it was properly explained above, it's enough to iterate up to Math.floor of a number's square root to check its primality (because sqrt covers all possible cases of division; and Math.floor, because any integer above sqrt will already be beyond its range).
Here is a runnable JavaScript code snippet that represents a simple implementation of this approach – and its "runtime-friendliness" is good enough for handling pretty big numbers (I tried checking both prime and not prime numbers up to 10**12, i.e. 1 trillion, compared results with the online database of prime numbers and encountered no errors or lags even on my cheap phone):
function isPrime(num) {
  if (num % 2 === 0 || num < 3 || !Number.isSafeInteger(num)) {
    return num === 2;
  } else {
    const sqrt = Math.floor(Math.sqrt(num));
    for (let i = 3; i <= sqrt; i += 2) {
      if (num % i === 0) return false;
    }
    return true;
  }
}
<label for="inp">Enter a number and click "Check!":</label><br>
<input type="number" id="inp"></input>
<button onclick="alert(isPrime(+document.getElementById('inp').value) ? 'Prime' : 'Not prime')" type="button">Check!</button>
To test the primality of a number, n, one would expect a loop such as following in the first place :
bool isPrime = true;
for(int i = 2; i < n; i++){
    if(n%i == 0){
        isPrime = false;
        break;
    }
}
What the above loop does is this : for a given 1 < i < n, it checks if n/i is an integer (leaves remainder 0). If there exists an i for which n/i is an integer, then we can be sure that n is not a prime number, at which point the loop terminates. If for no i, n/i is an integer, then n is prime.
As with every algorithm, we ask : Can we do better ?
Let us see what is going on in the above loop.
The sequence of i goes : i = 2, 3, 4, ... , n-1
And the sequence of integer-checks goes : j = n/i, which is n/2, n/3, n/4, ... , n/(n-1)
If for some i = a, n/a is an integer, then n/a = k (integer)
or n = ak, clearly n > k > 1 (if k = 1, then a = n, but i never reaches n; and if k = n, then a = 1, but i starts from 2)
Also, n/k = a, and as stated above, a is a value of i so n > a > 1.
So, a and k are both integers between 1 and n (exclusive). Since, i reaches every integer in that range, at some iteration i = a, and at some other iteration i = k. If the primality test of n fails for min(a,k), it will also fail for max(a,k). So we need to check only one of these two cases, unless min(a,k) = max(a,k) (where two checks reduce to one) i.e., a = k , at which point a*a = n, which implies a = sqrt(n).
In other words, if the primality test of n were to fail for some i >= sqrt(n) (i.e., max(a,k)), then it would also fail for some i <= sqrt(n) (i.e., min(a,k)). So, it would suffice if we run the test for i = 2 to sqrt(n).
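The conclusion of this answer, the same loop with its upper bound lowered from n-1 to sqrt(n), might be sketched in Python as follows. It is a plain trial-division illustration, and `math.isqrt` is used here so that the bound is computed without floating point:

```python
from math import isqrt

def is_prime(n):
    """Trial division, testing candidate divisors only up to floor(sqrt(n))."""
    if n < 2:
        return False
    for i in range(2, isqrt(n) + 1):
        if n % i == 0:      # found min(a, k), so n = a*k is composite
            return False
    return True
```

For n = 49 the loop reaches i = 7 = sqrt(n), which is why the bound has to be inclusive.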

Resources