What is the Big-O complexity of this function? - big-o

import math

def dividing(n):
    while n != 1:
        n = math.floor(n / 2)
    return True
There's the code. My initial thought was that it just has a complexity of n, since n is just the input, no squares or anything, but when researching n/2 I found out that its Big-O complexity is log(m). So now I'm confused: is the Big-O complexity of this log(m)? If yes, why is that so?

Let's assume we count everything within the scope of the while loop as a basic operation.
while (condition) { basic-operation }
Then, to find an upper asymptotic bound for the number of basic operations of the dividing function, we need an upper bound on, given an input n to the function, how many times the while loop executes.
We may simply reverse the loop, and then it becomes quite apparent (ignore the flooring):
// The value of 'n' at each iteration until termination
// of the while loop, in reverse (n here == n_start)
1 + 2 + 4 + ... + n
= 2^0 + 2^1 + 2^2 + ... + 2^(log2(n))
= sum_{i=0}^{log2(n_start)} 2^i
The sum expression runs i, which in our context is the loop counter, from 0 to log2(n) in steps of 1, meaning the while loop runs (ignoring flooring) log2(n) + 1 times. In turn, O(log2(n)) = O(log n) provides an upper asymptotic bound on the time complexity of your function.
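To sanity-check that bound empirically, here is a small Python sketch (not part of the original answer; the helper dividing_count is mine) that counts the loop iterations and prints them next to floor(log2(n)):

import math

def dividing_count(n):
    # count how many times the while-loop body runs
    iterations = 0
    while n != 1:
        n = math.floor(n / 2)
        iterations += 1
    return iterations

for n in (2, 10, 100, 10**6):
    print(n, dividing_count(n), math.floor(math.log2(n)))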

In this kind of situation, where the input is a single number, we cannot assume that arithmetic operations take constant time. Formally, the "input size" of the algorithm should be measured in bits, and it takes more time to divide a number that takes more bits to represent.
Your code actually works with floating-point numbers, which means the actual magnitude of the number is not directly related to the number of bits required to represent it (and we will have to ignore that real floating-point numbers have a fixed size in bits, otherwise the concepts of "input size" and "time complexity" simply make no sense). It is simpler to analyse a similar algorithm which works on integers; let's instead say the algorithm takes an integer and does n = n // 2 or n = n >> 1 instead of n = math.floor(n / 2). An integer n takes O(log n) bits to represent.
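As a concrete sketch of that integer variant (my own rewrite of the asker's function, not their original code):

def dividing_int(n):
    # integer variant: n >> 1 equals n // 2 for non-negative integers
    while n != 1:
        n = n >> 1
    return True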
The actual amount of time depends on the algorithm used to perform the division; we can say that the time complexity is O(D(n) log n) where D(n) is the time complexity of the division algorithm. There are various division algorithms with different time complexities, which also depend on the time complexity of the multiplication algorithm used. On the other hand, since dividing by 2 is equivalent to a right-shift by 1 bit, if we write the algorithm to use a bit-shift (or if the division algorithm optimises it to a bit-shift in this special case), we will have D(n) = O(log n) because bit-shifting takes linear time in the number of bits. In that case, the original algorithm's time complexity would be O(log^2 n).

Related

Sum of linear time algorithms is linear time?

Pardon me if the question is "silly". I am new to algorithmic time complexity.
I understand that if I have n numbers and I want to sum them, it takes "n steps", which means the algorithm is O(n), or linear time, i.e. the number of steps taken increases linearly with the number of inputs, n.
If I write a new algorithm that does this summing 5 times, one after another, I understand that it is O(5n) = O(n) time, still linear (according to wikipedia).
Question
If I have say 10 different O(n) time algorithms (sum, linear time sort etc). And I run them one after another on the n inputs.
Does this mean that overall this runs in O(10n) = O(n), linear time?
Yep, O(kn) = O(n) for any constant k.
If you start growing your problem and decide that your 10 linear ops are actually k linear ops, where k is, say, the length of a user-input array, it would then be incorrect to drop that k from the big-O.
It's best to work it through from the definition of big-O, then learn the rule of thumb once you've "proved" it correct.
If you have 10 O(n) algorithms, that means that there are 10 constants C1 to C10, such that for each algorithm Ai, the time taken to execute it is less than Ci * n for sufficiently large n.
Hence[*] the time taken to run all 10 algorithms for sufficiently large n is less than:
C1 * n + C2 * n + ... + C10 * n
= (C1 + C2 + ... + C10) * n
So the total is also O(n), with constant C1 + ... + C10.
Rule of thumb learned: the sum of a constant number of O(f(n)) functions is O(f(n)).
[*] proof of this left to the reader. Hint: there are 10 different values of "sufficient" to consider.
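As a quick illustration of that rule of thumb (a sketch of my own, with a hypothetical total_operations helper): running a fixed number k of linear passes only scales the constant, not the growth rate.

def total_operations(n, k=10):
    # k independent linear passes over an input of size n:
    # each pass does n basic operations, so the total is k * n,
    # which is O(n) for any fixed k
    ops = 0
    for _ in range(k):
        for _ in range(n):
            ops += 1
    return ops

print(total_operations(1000))   # 10000 = 10 * n
print(total_operations(2000))   # 20000: doubling n doubles the total work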
Yes, O(10n) = O(n). Also, O(C*n) = O(n), where C is a constant; in this case C is 10. It might look like O(n^2) if C were equal to n, but that is not the case here: since C is a constant, it does not change with n.
Also, note that in a sum of complexities, the highest-order (most complex) term determines the overall complexity. In this case it is O(n) + O(n) + ... + O(n) ten times, and thus it is O(n).

Big O, what is the complexity of summing a series of n numbers?

I always thought the complexity of:
1 + 2 + 3 + ... + n is O(n), and summing two n by n matrices would be O(n^2).
But today I read in a textbook: "by the formula for the sum of the first n integers, this is n(n+1)/2", which is (1/2)n^2 + (1/2)n, and thus O(n^2).
What am I missing here?
The big O notation can be used to determine the growth rate of any function.
In this case, it seems the book is not talking about the time complexity of computing the value, but about the value itself. And n(n+1)/2 is O(n^2).
You are confusing complexity of runtime and the size (complexity) of the result.
The running time of summing, one after the other, the first n consecutive numbers is indeed O(n).1
But the complexity of the result, that is, the size of "sum from 1 to n" = n(n + 1) / 2, is O(n^2).
1 But for arbitrarily large numbers this is simplistic since adding large numbers takes longer than adding small numbers. For a precise runtime analysis, you indeed have to consider the size of the result. However, this isn’t usually relevant in programming, nor even in purely theoretical computer science. In both domains, summing numbers is usually considered an O(1) operation unless explicitly required otherwise by the domain (i.e. when implementing an operation for a bignum library).
n(n+1)/2 is the quick way to sum a consecutive sequence of N integers (starting from 1). I think you're confusing an algorithm with big-oh notation!
If you thought of it as a function, then the big-oh complexity of this function is O(1):
public int sum_of_first_n_integers(int n) {
    return (n * (n+1))/2;
}
The naive implementation would have big-oh complexity of O(n).
public int sum_of_first_n_integers(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
Even just looking at each cell of a single n-by-n matrix is O(n^2), since the matrix has n^2 cells.
There really isn't a complexity of a problem, but rather a complexity of an algorithm.
In your case, if you choose to iterate through all the numbers, the complexity is, indeed, O(n).
But that's not the most efficient algorithm. A more efficient one is to apply the formula - n*(n+1)/2, which is constant, and thus the complexity is O(1).
So my guess is that this is actually a reference to Cracking the Coding Interview, which has this paragraph on a StringBuffer implementation:
On each concatenation, a new copy of the string is created, and the
two strings are copied over, character by character. The first
iteration requires us to copy x characters. The second iteration
requires copying 2x characters. The third iteration requires 3x, and
so on. The total time therefore is O(x + 2x + ... + nx). This reduces
to O(xn²). (Why isn't it O(xnⁿ)? Because 1 + 2 + ... n equals n(n+1)/2
or, O(n²).)
For whatever reason I found this a little confusing on my first read-through, too. The important bit to see is that n is multiplying n, or in other words that n² is happening, and that dominates. This is why ultimately O(xn²) is just O(n²) -- the x is sort of a red herring.
You have a formula that doesn't depend on the number of numbers being added, so it's a constant-time algorithm, or O(1).
If you add each number one at a time, then it's indeed O(n). The formula is a shortcut; it's a different, more efficient algorithm. The shortcut works when the numbers being added are all 1..n. If you have a non-contiguous sequence of numbers, then the shortcut formula doesn't work and you'll have to go back to the one-by-one algorithm.
None of this applies to the matrix of numbers, though. To add two matrices, it's still O(n^2) because you're adding n^2 distinct pairs of numbers to get a matrix of n^2 results.
There's a difference between summing N arbitrary integers and summing N that are all in a row. For 1+2+3+4+...+N, you can take advantage of the fact that they can be divided into pairs with a common sum, e.g. 1+N = 2+(N-1) = 3+(N-2) = ... = N + 1. So that's N+1, N/2 times. (If there's an odd number, one of them will be unpaired, but with a little effort you can see that the same formula holds in that case.)
That is not O(N^2), though. It's just a formula whose value grows like N^2; evaluating it is actually O(1). O(N^2) would mean (roughly) that the number of steps to calculate it grows like N^2 for large N. In this case, the number of steps is the same regardless of N.
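To make the pairing argument concrete, here is a small sketch of my own (the helper sum_by_pairing is hypothetical) that builds the pairs explicitly for even N:

def sum_by_pairing(n):
    # pair 1 with n, 2 with n-1, ...: each pair sums to n + 1,
    # and for even n there are exactly n // 2 such pairs
    assert n % 2 == 0, "shown for even n; odd n needs the small extra argument"
    pairs = [(i, n + 1 - i) for i in range(1, n // 2 + 1)]
    return sum(a + b for a, b in pairs)

n = 100
print(sum_by_pairing(n), n * (n + 1) // 2, sum(range(1, n + 1)))  # all three print 5050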
Adding the first n numbers:
Consider the algorithm:
Series_Add(n)
    return n*(n+1)/2
This algorithm indeed runs in O(|n|^2), where |n| is the length (the bits) of n and not the magnitude, simply because multiplying two numbers, one of k bits and the other of l bits, runs in O(k*l) time with the schoolbook algorithm.
Careful
Considering this algorithm:
Series_Add_pseudo(n):
    sum=0
    for i= 1 to n:
        sum += i
    return sum
which is the naive approach, you might assume that this algorithm runs in linear time, or at least in polynomial time. This is not the case.
The input representation (length) of n is O(log n) bits (for any base except unary), and although the algorithm runs linearly in the magnitude of n, it runs exponentially (2^(log n)) in the length of the input.
This is actually the pseudo-polynomial algorithm case: it appears to be polynomial, but it is not.
You could even try it in Python (or any other programming language) with a medium-length number, say 200 bits.
Applying the first algorithm, the result comes in a split second; applying the second, you would have to wait a century...
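A rough way to try this in Python (my own sketch; the loop is only timed for a modest n, since for a 200-bit n it would never finish):

import time

def series_add(n):
    return n * (n + 1) // 2           # one multiplication, one division

def series_add_pseudo(n):
    total = 0                          # n additions: exponential in the bit length of n
    for i in range(1, n + 1):
        total += i
    return total

n_small = 10**7                        # ~24 bits: the loop is still feasible
t0 = time.time(); series_add_pseudo(n_small); print("loop:   ", time.time() - t0, "s")
t0 = time.time(); series_add(n_small);        print("formula:", time.time() - t0, "s")

n_big = 1 << 200                       # a 201-bit number: the formula is still instant,
print(series_add(n_big))               # but the loop would never terminate in practice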
1 + 2 + 3 + ... + n is always less than n + n + n + ... + n (n times), and you can rewrite n + n + ... + n as n*n.
f(n) = O(g(n)) if there exist a positive integer n0 and a positive
constant c such that f(n) ≤ c * g(n) for all n ≥ n0
Since Big-O represents an upper bound on the function, and here f(n) is the sum of the natural numbers up to n, taking g(n) = n^2 satisfies the definition.
Now, talking about time complexity: for small numbers the addition should be a constant amount of work, but the size of n could be humongous; you can't rule that out.
Adding integers can take a linear amount of time (in the number of digits) when n is really large. So you can say that addition is an O(n) operation and you're adding n items, so that alone would make it O(n^2). Of course, it will not always take n^2 time, but it's the worst case when n is really large (upper bound, remember?).
Now, let's say you directly try to achieve it using n(n+1)/2. Just one multiplication and one division, this should be a constant operation, no?
No.
using a natural size metric of number of digits, the time complexity of multiplying two n-digit numbers using long multiplication is Θ(n^2). When implemented in software, long multiplication algorithms must deal with overflow during additions, which can be expensive. Wikipedia
That again leaves us at O(n^2).
It's O(n^2), because n(n+1)/2 is equivalent to (n^2 + n) / 2, and in Big-O you ignore constant factors, so even though the n^2 term is divided by 2, you still have quadratic growth.
Think about O(n) and O(n/2): we similarly don't distinguish the two. O(n/2) is just O(n) scaled by a constant factor; the growth rate is still linear.
What that means is that as n increases, if you were to plot the value n(n+1)/2 on a graph, you would see an n^2-shaped curve appear.
You can see that already:
when n = 2 you get 3
when n = 3 you get 6
when n = 4 you get 10
when n = 5 you get 15
when n = 6 you get 21
And if you plot these values (the original plot is omitted here), you will see that the curve is similar to that of n^2: you get a smaller number at each y, but the shape is the same. Thus we say the magnitude is the same, because the value grows like n^2 as n grows bigger.
The sum of the series of the first n natural numbers can be found in two ways. The first way is by adding all the numbers in a loop; in this case the algorithm is linear and the code will be like this:
int sum = 0;
for (int i = 1; i <= n; i++) {
    sum += i;
}
return sum;
It is analogous to 1 + 2 + 3 + 4 + ... + n. In this case the complexity of the algorithm is measured as the number of times the addition operation is performed, which is O(n).
The second way of finding the sum of the first n natural numbers is the direct formula n*(n+1)/2. This formula uses multiplication instead of repeated addition, and multiplication does not have linear time complexity. There are various multiplication algorithms available, with time complexities ranging from about O(N^1.45) to O(N^2), where N is the number of digits. The actual time therefore depends on the algorithm (and the processor architecture), but for analysis purposes the time complexity of multiplication is often taken as O(N^2). Therefore, when one uses the second way to find the sum, the time complexity will be O(N^2) in the number of digits.
Here the multiplication operation is not the same as the addition operation. Anyone who knows some computer organisation can easily understand the internal working of the multiplication and addition operations: the multiplication circuit is more complex than the adder circuit and requires much more time to compute its result. So the time complexity of summing the series via the formula can't be treated as constant.

Determining Big O Notation

I need help understanding/doing Big O Notation. I understand the purpose of it, I just don't know how to "determine the complexity given a piece of code".
Determine the Big O notation for each of the following
a.
n=6;
cout<<n<<endl;
b.
n=16;
for (i=0; i<n; i++)
    cout<<i<<endl;
c.
i=6;
n=23;
while (i<n) {
    cout<<i-6<<endl;
    i++;
}
d.
int a[ ] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
n=10;
for (i=0; i<n; i++)
    a[i]=a[i]*2;
for (i=9; i>=0; i--)
    cout<<a[i]<<endl;
e.
sum=0;
n=6;
k=pow(2,n);
for (i=0;i<k;i++)
    sum=sum+k;
Big O indicates the order of the complexity of your algorithm.
Basic things:
This complexity is measured with respect to the input size
You choose a unit operation (usually assignment or comparison)
You count how many times this operation is performed
A constant term or constant factor is usually ignored, so if the number of operations is 3*n^3 + 12, it is simplified to n^3, also written O(n^3)
a.) Runs just once, no loop; the complexity here is trivially O(1)
b.) The loop body is executed n times: O(n)
c.) Here we choose to analyze n (because it's usually the incrementing variable in an algorithm). The number of calls is n - 6, so this is O(n).
d.) Let's suppose here that 10 (n) is the size of your array. We go from 0 to n - 1 in the first loop and then from n - 1 back to 0 in the second, so n + n operations: technically O(2n), which simplifies to O(n). Both are called linear time; what differs is the slope of the line, which Big O doesn't care about.
e.) The loop runs from 0 to pow(2, n), i.e. 2^n iterations, summarized as O(2^n)
Assuming you don't count the cout as part of your Big-O measurement.
a)
O(1): you can perform the integer assignment in constant time.
b)
O(n), because the loop takes n operations.
c)
O(n - c) = O(n): constants disappear in Big-O.
d)
O(2*n) = O(n): two linear-time loops run in sequence are still linear time.
e)
If n is treated as the input, then k = pow(2, n) = 2^n and the number of operations is O(2^n); however, if n is held constant, then k = 2^6 = 64 is also a constant and the loop runs a fixed number of times.
These examples are fairly simple. First what you have to do is to determine the main (simple) operation in the code and try to express the number of invocations of this operation as a function of input.
To be less abstract:
a.)
This code always runs in constant time. That time depends on the computer, I/O latency, etc., but it is essentially independent of the value of n.
b.)
This time a piece of code inside a loop is executed several times. If n is twice as big, what can you say about the number of iterations?
c.)
Again some code inside a loop, but this time the number of iterations is less than n. If n is sufficiently big, do you see the similarity to b.)?
d.)
This code is interesting. The operation inside the first loop is more sophisticated, but again it takes a more-or-less constant amount of time. So how many times is it executed in relation to n? Once again, compare with b.).
The second loop is there only to trick you. For small n it might actually take more time than the first one. However, O(n) notation always considers large values of n.
e.)
The last piece of code is actually pretty simple. The number of times the loop body runs is 2^n. Add 1 to n and you'll get twice as many operations.
To understand the full mathematical definition I recommend Wikipedia. For simple purposes, big-O is an upper bound on an algorithm: given a routine, how many times does it iterate before finishing on an input of length n? We call this upper bound O(n), or big O of n.
In code accessing a member of a simple array in c++ is O(1). It is one operation regardless of how large the array is.
A linear iteration through an array in a for loop is O(n)
Nested for loops are O(n^2), or O(n^k) if you have k levels of nesting
Recursion with divide and conquer (heaps, binary trees etc) is O(lg n) or O(n lg n) depending on the operation.
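A quick Python illustration of those rules of thumb (my own sketch, not tied to the examples in the question):

def constant(arr):               # O(1): a single array access, independent of len(arr)
    return arr[0]

def linear(arr):                 # O(n): one pass over the array
    total = 0
    for x in arr:
        total += x
    return total

def quadratic(arr):              # O(n^2): a loop nested inside a loop
    count = 0
    for _ in arr:
        for _ in arr:
            count += 1
    return count

def logarithmic(arr, target):    # O(log n): binary search halves the range each step
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1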
a
n=6;
cout<<n<<endl;
Constant time, O(1). This means as n increases from 1 to infinity, the amount of time needed to execute this statement does not increase. Each time you increment n, the amount of time needed does not increase.
b
n=16;
for (i=0; i<n; i++)
    cout<<i<<endl;
Linear Time, O(n). This means that as n increases from 1 to infinity, the amount of time needed to execute this statement increases linearly. Each time you increment n, the amount of additional time needed from the previous remains constant.
c
i=6;
n=23;
while (i<n) {
    cout<<i-6<<endl;
    i++;
}
Linear Time, O(n), same as example 2 above.
d
int a[ ] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
n=10;
for (i=0; i<n; i++)
    a[i]=a[i]*2;
for (i=9; i>=0; i--)
    cout<<a[i]<<endl;
Linear time, O(n). As n increases from 1 to infinity, the amount of time needed to execute these statements increase linearly. The linear line is twice as steep as example 3, however Big O Notation does not concern itself with how steep the line is, it's only concerned with how the time requirements grow. The two loops require linearly growing amount of time as n increases.
e
sum=0;
n=6;
k=pow(2,n);
for (i=0;i<k;i++)
    sum=sum+k;
Create a graph of how many times sum=sum+k is executed given the value n:
n    number of times in loop
1    2^1 = 2
2    2^2 = 4
3    2^3 = 8
4    2^4 = 16
5    2^5 = 32
6    2^6 = 64
As n goes from 1 to infinity, notice how the number of times we are in the loop exponentially increases. 2->4->8->16->32->64. What would happen if I plugged in n of 150? The number of times we would be in the loop becomes astronomical.
This is exponential time: O(2^n) denotes an algorithm whose running time doubles with each additional element in the input data set. Plug in a large n at your own peril; you will be waiting hours or years for the calculation to complete for a handful of input items.
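To see that blow-up numerically, here is a small sketch of mine that treats n as a parameter instead of fixing it at 6:

def count_loop_iterations(n):
    # example (e) with n as a parameter: the loop runs 2^n times
    k = 2 ** n
    iterations = 0
    for _ in range(k):
        iterations += 1
    return iterations

for n in range(1, 11):
    print(n, count_loop_iterations(n))   # 2, 4, 8, ..., 1024: doubles at every step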
Why do we care?
As computer scientists, we are interested in properly understanding BigO notation because we want to be able to say things like this with authority and conviction:
"Jim's algorithm for calculating the distance between planets takes exponential time. If we want to do 20 objects it takes too much time, his code is crap because I can make one in linear time."
And better yet, if they don't like what they hear, you can prove it with Math.

Determining complexity of an integer factorization algorithm

I'm starting to study computational complexity, BigOh notation and the likes, and I was tasked to do an integer factorization algorithm and determine its complexity. I've written the algorithm and it is working, but I'm having trouble calculating the complexity. The pseudo code is as follows:
DEF fact (INT n)
BEGIN
    INT i
    FOR (i -> 2 TO i <= n / i STEP 1)
    DO
        WHILE ((n MOD i) = 0)
        DO
            PRINT("%int X", i)
            n -> n / i
        DONE
    DONE
    IF (n > 1)
    THEN
        PRINT("%int", n)
END
What I attempted to do, I think, is extremely wrong:
f(x) = n-1 + n-1 + 1 + 1 = 2n
so
f(n) = O(n)
Which I think is wrong, because factorization algorithms are supposed to be computationally hard; they can't even be polynomial. So what do you suggest to help me? Maybe I'm just too tired at this time of the night and I'm screwing this all up :(
Thank you in advance.
This phenomenon is called pseudopolynomiality: a complexity that seems to be polynomial, but really isn't. If you ask whether a certain complexity (here, n) is polynomial or not, you must look at how the complexity relates to the size of the input. In most cases, such as sorting (which e.g. merge sort can solve in O(n lg n)), n describes the size of the input (the number of elements). In this case, however, n does not describe the size of the input; it is the input value. What, then, is the size of n? A natural choice would be the number of bits in n, which is approximately lg n. So let w = lg n be the size of n. Now we see that O(n) = O(2^(lg n)) = O(2^w) - in other words, exponential in the input size w.
(Note that O(n) = O(2^(lg n)) = O(2^w) is always true; the question is whether the input size is described by n or by w = lg n. Also, if n describes the number of elements in a list, one should strictly speaking count the bits of every single element in the list in order to get the total input size; however, one usually assumes that in lists, all numbers are bounded in size (to e.g. 32 bits)).
Use the fact that your algorithm can be described recursively. If f(x) is the number of operations taken to factor x, and n is the first factor that is found, then f(x) = (n-1) + f(x/n). The worst case for any factoring algorithm is a prime number, for which the complexity of your algorithm is O(n).
Factoring algorithms are 'hard' mainly because they are used on obscenely large numbers.
In big-O notation, n is the size of input, not the input itself (as in your case). The size of the input is lg(n) bits. So basically your algorithm is exponential.
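For reference, a rough Python rendering of the pseudocode in the question (my own translation; names and output format are approximate):

def fact(n):
    # print the prime factorization of n by trial division
    i = 2
    while i <= n // i:            # i.e. while i * i <= n
        while n % i == 0:
            print(i, "x", end=" ")
            n //= i
        i += 1
    if n > 1:
        print(n)
    else:
        print()

fact(360)   # 2 x 2 x 2 x 3 x 3 x 5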

Complexity of recursive factorial program

What's the complexity of a recursive program to find factorial of a number n? My hunch is that it might be O(n).
If you take multiplication as O(1), then yes, O(N) is correct. However, note that multiplying two numbers of arbitrary length x is not O(1) on finite hardware -- as x tends to infinity, the time needed for multiplication grows (e.g. if you use Karatsuba multiplication, it's O(x ** 1.585)).
You can theoretically do better for sufficiently huge numbers with Schönhage-Strassen, but I confess I have no real-world experience with that one. x, the "length" or "number of digits" (in whatever base, it doesn't matter for big-O anyway) of N, grows as O(log N), of course.
If you mean to limit your question to factorials of numbers short enough to be multiplied in O(1), then there's no way N can "tend to infinity" and therefore big-O notation is inappropriate.
Assuming you're talking about the most naive factorial algorithm ever:
factorial (n):
    if (n = 0) then return 1
    otherwise return n * factorial(n-1)
Yes, the algorithm is linear, running in O(n) time. This is the case because it executes once every time it decrements the value n, and it decrements the value n until it reaches 0, meaning the function is called recursively n times. This is assuming, of course, that both decrementation and multiplication are constant operations.
Of course, if you implement factorial some other way (for example, using addition recursively instead of multiplication), you can end up with a much more time-complex algorithm. I wouldn't advise using such an algorithm, though.
When you express the complexity of an algorithm, it is always as a function of the input size. It is only valid to assume that multiplication is an O(1) operation if the numbers that you are multiplying are of fixed size. For example, if you wanted to determine the complexity of an algorithm that computes matrix products, you might assume that the individual components of the matrices were of fixed size. Then it would be valid to assume that multiplication of two individual matrix components was O(1), and you would compute the complexity according to the number of entries in each matrix.
However, when you want to figure out the complexity of an algorithm to compute N! you have to assume that N can be arbitrarily large, so it is not valid to assume that multiplication is an O(1) operation.
If you want to multiply an n-bit number with an m-bit number the naive algorithm (the kind you do by hand) takes time O(mn), but there are faster algorithms.
If you want to analyze the complexity of the easy algorithm for computing N!
factorial(N)
    f=1
    for i = 2 to N
        f=f*i
    return f
then at the k-th step in the for loop, you are multiplying (k-1)! by k. The number of bits used to represent (k-1)! is O(k log k) and the number of bits used to represent k is O(log k). So the time required to multiply (k-1)! and k is O(k (log k)^2) (assuming you use the naive multiplication algorithm). Then the total amount of time taken by the algorithm is the sum of the time taken at each step:
sum k = 1 to N [k (log k)^2] <= (log N)^2 * (sum k = 1 to N [k]) =
O(N^2 (log N)^2)
You could improve this performance by using a faster multiplication algorithm, like Schönhage-Strassen which takes time O(n*log(n)*log(log(n))) for 2 n-bit numbers.
The other way to improve performance is to use a better algorithm to compute N!. The fastest one that I know of first computes the prime factorization of N! and then multiplies all the prime factors.
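As a quick empirical check of the "O(k log k) bits" estimate used above (a sketch of my own):

import math

for k in (10, 100, 1000, 10000):
    bits = math.factorial(k).bit_length()   # exact number of bits in k!
    estimate = k * math.log2(k)             # the O(k log k) estimate
    print(k, bits, round(estimate))         # same order of growth as k increases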
The time-complexity of recursive factorial would be:
factorial (n) {
    if (n = 0)
        return 1
    else
        return n * factorial(n-1)
}
So, the recurrence for the running time is:
T(n) = T(n-1) + 3    (3 because each recursive call does three constant-time
                      operations: the multiplication, the subtraction, and
                      checking the value of n)
     = T(n-2) + 6    (second recursive call)
     = T(n-3) + 9    (third recursive call)
     ...
     = T(n-k) + 3k
until k = n. Then,
     = T(n-n) + 3n
     = T(0) + 3n
     = 1 + 3n
In Big-O notation, T(n) is directly proportional to n; therefore, the time complexity of recursive factorial is O(n).
Each recursive call adds a stack frame, and no other extra space is used, so the space complexity is also O(n).
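A tiny sketch of my own that counts the recursive calls, confirming the linear behaviour:

def factorial(n, counter):
    # counter[0] tracks how many times the function is entered
    counter[0] += 1
    if n == 0:
        return 1
    return n * factorial(n - 1, counter)

for n in (5, 10, 20):
    counter = [0]
    factorial(n, counter)
    print(n, counter[0])   # n + 1 calls: one per value from n down to 0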
