I've been reading the CLRS algorithm book and I decided to try out a problem for myself. I've been trying to use a new method to help understand the complexity of an arbitrary algorithm, shown below. Here is my method:
Line-by-line analysis:
Line 2. Sum of natural numbers: Θ(n)
Line 3. Sum of squares: Θ(n²)
Line 4. If j is divisible by i: Θ(?)
Line 5. Sum of natural numbers: Θ(n)
Algorithm:
1. sum = 0
2. for i = 1 to n do
3.     for j = 1 to i² do
4.         if (j mod i == 0) then
5.             for k = 1 to j do
6.                 sum++
As you can tell, I'm trying to find the number of times sum++ is executed, and it's Line 4 that is throwing me off. Could someone guide me a little bit?
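One way to sanity-check an analysis like this is to simply count the executions directly. The sketch below is my own literal transcription of the pseudocode (the function name is illustrative, not from CLRS):

```python
def count_sum_increments(n):
    # Literal transcription of the pseudocode above; returns how many
    # times line 6 (sum++) executes.
    count = 0
    for i in range(1, n + 1):               # line 2
        for j in range(1, i * i + 1):       # line 3
            if j % i == 0:                  # line 4
                for k in range(1, j + 1):   # line 5
                    count += 1              # line 6
    return count
```

Counting directly: for a fixed i, line 4 succeeds exactly for the i multiples of i in 1..i², and those passes contribute i + 2i + ... + i·i = i²(i+1)/2 increments, so the total is Σᵢ i²(i+1)/2, which is Θ(n⁴).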
I read about order statistics in the book The Design and Analysis of Computer Algorithms by Aho, Hopcroft, and Ullman (Addison-Wesley).
Algorithm 3.6. Finding the kth smallest element.
The procedure is as follows:
procedure SELECT(k, S):
1. if |S| < 50 then
   begin
2.     sort S;
3.     return the kth smallest element in S
   end
   else
   begin
4.     divide S into ⌊|S|/5⌋ sequences of 5 elements each
5.         with up to four leftover elements;
6.     sort each 5-element sequence;
7.     let M be the sequence of medians of the 5-element sets;
8.     m <- SELECT(⌈|M|/2⌉, M);
9.     let S1, S2, and S3 be the sequences of elements in S less
           than, equal to, and greater than m, respectively;
10.    if |S1| >= k then return SELECT(k, S1)
11.    else if |S1| + |S2| >= k then return m
12.    else return SELECT(k - |S1| - |S2|, S3)
   end
In line 1, why do we consider |S| < 50? Does the algorithm still work when |S| >= 50?
In line 4, why do we divide S into groups of 5? What if we divided into groups of 4 or 6 (|S|/4 or |S|/6)?
If anyone could clarify my doubts, that would be a great help. Thank you.
Long version: I recommend reading Introduction to Algorithms (Cormen), 3rd edition, Section 9.3, where selection in worst-case linear time is explained in full detail (you can find the math there).
Short version: the cutoff |S| < 50 is essentially arbitrary. It is chosen so that the base case (the n0) of the big-O analysis works out, in combination with the split size used in step 4. That split size can likewise be changed to another fixed constant, though the standard linear-time argument needs groups of at least 5; with groups of 5 the recurrence is T(n) <= T(n/5) + T(7n/10 + 6) + O(n) [from Introduction to Algorithms].
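For concreteness, here is a minimal Python sketch of the SELECT procedure above (my own translation, not from either book), keeping the arbitrary cutoff 50 as the base case:

```python
def select(k, s):
    # Median-of-medians selection: returns the k-th smallest element of s
    # (1-indexed). The cutoff 50 is arbitrary; any fixed constant works.
    if len(s) < 50:
        return sorted(s)[k - 1]
    # Split into groups of 5 (plus up to 4 leftovers) and take each median.
    groups = [s[i:i + 5] for i in range(0, len(s), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Recursively find the median of medians.
    m = select((len(medians) + 1) // 2, medians)
    # Partition around m, then recurse into the relevant part.
    s1 = [x for x in s if x < m]
    s2 = [x for x in s if x == m]
    s3 = [x for x in s if x > m]
    if k <= len(s1):
        return select(k, s1)
    elif k <= len(s1) + len(s2):
        return m
    else:
        return select(k - len(s1) - len(s2), s3)
```

Since the pivot m lands in s2, both recursive partitions s1 and s3 are strictly smaller than s, which is what makes the recursion (and the recurrence above) go through.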
I’ve been assigned to analyze this pseudocode.
I’ve tried to sketch the code out, and I came to the conclusion that it computes the sum of the elements in an array A and then decides whether that sum is a prime number or not. Am I correct?
Now I’m trying to figure out its worst-case time complexity.
So far I’ve concluded:
Line 1, 13: O(1)
Line 2-4: O(n)
Line 5-8: O(n)
Line 8-10: O(n-1)
Line 4, 7, 11, 12: —
Is it safe to say its worst-case time complexity is O(n)?
Input: an array A of length |A| = n of natural numbers a ∈ ℕ≥1
Output: a Boolean value
1. x:= 0;
2. For i:= 1 to n do
3. x:= x + A[i];
4. End for
5. If x<2 then
6. Return false;
7. End if
8. For i:= 2 to x -1 do
9. If x mod i=0 then
10. Return false;
11. End if
12. End for
13. Return true;
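If it helps to test your reading of the pseudocode, here is a direct Python translation (my own sketch, with an illustrative name): it sums A, then trial-divides the sum by every i from 2 to x-1, exactly as lines 8-12 do.

```python
def is_sum_prime(a):
    # Lines 1-4: sum the array.
    x = sum(a)
    # Lines 5-7: sums below 2 are not prime.
    if x < 2:
        return False
    # Lines 8-12: trial division by every i in 2..x-1.
    for i in range(2, x):
        if x % i == 0:
            return False
    # Line 13: no divisor found.
    return True
```

For a quick check, is_sum_prime([2, 3]) asks whether 5 is prime and returns True.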
Generating prime numbers from 1 to n in Python 3: how can I improve efficiency, and what is the complexity?
Input: A number, max (a large number)
Output: All the primes from 1 to max
Output is in the form of a list and will be [2, 3, 5, 7, 11, 13, ...]
The code attempts to perform this task in an efficient way (least time complexity).
from math import sqrt

max = (10**6)*3
print("\nThis code prints all primes till: ", max, "\n")

list_primes = [2]

def am_i_prime(num):
    """
    Input/Parameter the function takes: An integer number
    Output: returns True, if the number is prime and False if not
    """
    decision = True
    i = 0
    while list_primes[i] <= sqrt(num):  # Till sqrt(n) to save comparisons
        if num % list_primes[i] == 0:
            decision = False
            # break is inserted so that we get out of comparisons faster
            # E.g. for 1568, we should break from the loop as soon as we know that 1568%2==0
            break
        i += 1
    return decision

for i in range(3, max, 2):  # starts from 3 as our list contains 2 from the beginning
    if am_i_prime(i):
        list_primes.append(i)  # if a number is found to be prime, we append it to our list of primes

print(list_primes)
How can I make this faster? Where can I improve?
What is the time complexity of this code? Which steps are inefficient?
In what ways is the Sieve of Eratosthenes more efficient than this?
Working through the first few iterations:
We have a list_primes which contains prime numbers. It initially contains only 2.
We go to the next number, 3. Is 3 divisible by any of the numbers in list_primes? No! We append 3 to list_primes. Right now, list_primes=[2,3]
We go to the next number 4. Is 4 divisible by any of the numbers in list_primes? Yes (4 is divisible by 2). So, we don't do anything. Right now list_primes=[2,3]
We go to the next number, 5. Is 5 divisible by any of the numbers in list_primes? No! We append 5 to list_primes. Right now, list_primes=[2,3,5]
We go to the next number, 6. Is 6 divisible by any of the numbers in list_primes? Yes (6 is divisible by 2 and also divisible by 3). So, we don't do anything. Right now list_primes=[2,3,5]
And so on...
Interestingly, it takes a rather deep mathematical theorem to prove that your algorithm is correct at all. The theorem is: "For every n ≥ 2, there is a prime number between n and n²". I know it has been proven, and much tighter bounds have been proven since, but I must admit I wouldn't know how to prove it myself. And if this theorem were not correct, the loop in am_i_prime could run past the end of the list.
The number of primes ≤ k is O (k / log k) - this is again a very deep mathematical theorem. Again, beyond me to prove.
But anyway, there are about n / log n primes up to n, and for these primes the loop will iterate through all primes up to n^(1/2), and there are O (n^(1/2) / log n) of them.
So for the primes alone, the runtime is Θ(n^1.5 / log² n), which gives a lower bound. With some effort it should be possible to prove that for all numbers together, the runtime is asymptotically the same.
O (n^1.5 / log n) is obviously an upper bound, but experimentally the number of divisions to find all primes ≤ n seems to be ≤ 2 n^1.5 / log^2 n, where log is the natural logarithm.
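If you want to check these estimates empirically, a small counter like the following (my own instrumentation, mirroring the structure of am_i_prime) tallies the number of % tests performed while finding all primes below a limit:

```python
from math import sqrt

def count_divisions(limit):
    # Counts the trial divisions ("num % prime" tests) performed by the
    # am_i_prime approach for every odd candidate below `limit`. Useful
    # for comparing against the n^1.5 / log^2 n estimate discussed above.
    primes = [2]
    divisions = 0
    for num in range(3, limit, 2):
        i = 0
        is_p = True
        while primes[i] <= sqrt(num):
            divisions += 1
            if num % primes[i] == 0:
                is_p = False
                break
            i += 1
        if is_p:
            primes.append(num)
    return divisions
```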
The following rearrangement and optimization of your code will reach your maximum in nearly half the time of your original code. It combines your top-level loop and predicate function into a single function to eliminate overhead, and manages squares (square roots) more efficiently:

def get_primes(maximum):
    primes = []
    if maximum > 1:
        primes.append(2)
        squares = [4]
        for number in range(3, maximum, 2):
            i = 0
            while squares[i] <= number:
                if number % primes[i] == 0:
                    break
                i += 1
            else:  # no break
                primes.append(number)
                squares.append(number * number)
    return primes

maximum = 10 ** 6 * 3
print(get_primes(maximum))
However, a sieve-based algorithm will easily beat this, as it avoids division and multiplication altogether. Your code also has a bug: setting max = 1 produces the list [2] instead of the correct answer, an empty list. Always test both ends of your limits.
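For comparison, here is a basic Sieve of Eratosthenes sketch (my own code, returning the primes strictly below maximum, like the loop above); it replaces all the divisions with array writes:

```python
def sieve_primes(maximum):
    # Sieve of Eratosthenes: all primes strictly below `maximum`.
    if maximum <= 2:
        return []
    is_prime = [True] * maximum          # index i stands for the number i
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(maximum ** 0.5) + 1):
        if is_prime[p]:
            # Cross off multiples, starting at p*p (smaller multiples
            # were already crossed off by smaller primes).
            for multiple in range(p * p, maximum, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]
```

This does O(n log log n) work overall, which is where the asymptotic advantage over trial division comes from.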
O(N**2)

Approximately speaking, the first call to am_i_prime does 1 comparison, the second does 2, ..., so the total count is 1 + 2 + ... + N, which is N(N + 1)/2, which is of order N².
Interview question: how many Fibonacci numbers exist that are less than a given number k? Can you find a function in terms of k that gives the number of Fibonacci numbers less than k?
Example: k = 6
Answer: 6, as (0, 1, 1, 2, 3, 5)
Easy enough, write a loop or use the recursive definition of Fibonacci. However, that sounds too easy... is there a way to do this using the closed-form definition? (https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression)
Here is a closed-form Python solution which is O(1). It uses Binet's formula (from the Wikipedia article that you linked to):
>>> from math import sqrt,log
>>> def numFibs(n): return int(log(sqrt(5)*n)/log((1+sqrt(5))/2))
>>> numFibs(10)
6
Which tracks with 1,1,2,3,5,8
The point is that the second term in Binet's formula is negligible and it is easy enough to invert the result of neglecting it.
The above formula counts the number of Fibonacci numbers which are less than or equal to n. It jumps by 1 with each new Fibonacci number. So, for example, numFibs(12) = 6 and numFibs(13) = 7. Since 13 is the 7th Fibonacci number, if you want the number of Fibonacci numbers which are strictly smaller than n, you have to introduce a lag. Something like:
def smallerFibs(n):
    if n <= 1:
        return 0
    else:
        return min(numFibs(n-1), numFibs(n))
Now smallerFibs(13) is still 6 but then smallerFibs(14) = 7. This is of course still O(1).
I think it's fairly easy to see the growth of this number, at least. By the Binet / de Moivre formula,

F_n = (φ^n - ψ^n) / √5

Since |ψ| < 1 < φ,

F_n ∼ φ^n / √5.

From this it follows that the number of Fibonacci numbers smaller than x grows like log_φ(√5·x).
I need to calculate the sum of phi(k) for 1 <= k <= N where N = 1,000,000 and phi(k) is Euler's totient function. This is for a project Euler problem. I've already solved it using this previous StackOverflow question where it asks to calculate each value of phi(k) for 1 < k < N. However, I wonder if any further optimizations can be made since we only require the final sum of the phi(k) and not the individual value of each addend.
The Wikipedia page on Euler's totient function includes a formula due to Arnold Walfisz for the sum of φ(k) for k from 1 to n:
sum(1<=k<=n) φ(k) = (1 + sum(1<=k<=n) μ(k)*(floor(n/k))^2) / 2
(It's a lot easier to read on Wikipedia.)
The Möbius function μ(k) is 0 if k has any squared prime factor, and otherwise (−1)^f, where f is the number of distinct prime factors of k. (In other words, it is 1 if k's prime factorization has an even number of distinct primes; −1 if it has an odd number; and 0 if some prime appears more than once.) You should be able to use a modified sieve to rapidly compute μ(k).
That might turn out to be a bit faster.
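As a sketch of that suggestion (my own code, with illustrative names), the following computes μ(k) for all k up to n with a modified sieve and then applies the formula above:

```python
def totient_sum(n):
    # Sum of phi(k) for k = 1..n, via the Mobius-function formula above.
    mu = [1] * (n + 1)                 # Mobius function, built by sieving
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for multiple in range(p, n + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] *= -1     # one more distinct prime factor
            for multiple in range(p * p, n + 1, p * p):
                mu[multiple] = 0       # squared prime factor => mu = 0
    # sum(1<=k<=n) phi(k) = (1 + sum mu(k) * floor(n/k)^2) / 2
    return (1 + sum(mu[k] * (n // k) ** 2 for k in range(1, n + 1))) // 2
```

For example, totient_sum(10) gives 32, matching φ(1) + φ(2) + ... + φ(10) = 1 + 1 + 2 + 2 + 4 + 2 + 6 + 4 + 6 + 4.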