What is the Big O of this while loop? - algorithm

Normally when I see a loop, I assume it is O(n). But this was given as an interview question and seems too easy.
let a = 1;
while (a < n) {
  a = a * 2;
}
Am I oversimplifying? It appears to simply compute the powers of 2.

Never assume that a loop is always O(n). Loops that iterate over every element of a sequential container (like an array) normally have a time complexity of O(n), but it ultimately depends on the loop's condition and how the loop iterates. In your case, a doubles in value until it becomes greater than or equal to n. If you double n a few times, this is what you see:
 n    # iterations
------------------
 1    0
 2    1
 4    2
 8    3
16    4
32    5
64    6
As you can see, the number of iterations is proportional to log(n), making the time complexity O(log n).

Looks like a doubles on every pass, i.e. it grows exponentially in the number of iterations, so the loop will likely complete in O(log n).
I haven't done all the math, but the iteration count is clearly not LINEAR with respect to n...
But if you put in a loop counter, that counter would approximate log-base-2(n).
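A quick way to check that empirically is to instrument the loop with such a counter. Here is a small sketch in TypeScript (the countIterations name is just for illustration):

// Count how many times the doubling loop runs for a given n.
function countIterations(n: number): number {
  let a = 1;
  let count = 0;
  while (a < n) {
    a = a * 2;
    count++;
  }
  return count;
}

// For n = 64 this prints 6, which matches Math.log2(64) = 6.
console.log(countIterations(64), Math.log2(64));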


Time complexity of an algorithm that runs 1+2+...+n times

To start off, I found this Stack Overflow question that states the time complexity is O(n^2), but it doesn't explain why; it asks for an example of such an algorithm instead. From my understanding, an algorithm that runs 1+2+3+...+n times should be less than O(n^2). For example, take this function:
function f(n: number): number {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < i + 1; j++) {
      sum += 1;
    }
  }
  return sum;
}
Here are some input and return values:
num   sum
  1     1
  2     3
  3     6
  4    10
  5    15
  6    21
  7    28
From this table you can see that this algorithm runs in less than O(n^2) but more than O(n). I also realize that an algorithm that runs 1+(1+2)+(1+2+3)+...+(1+2+3+...+n) times is truly O(n^2). For the algorithm stated in the problem, do we just say it runs in O(n^2) because it runs more than O(log n) times?
It's known that 1 + 2 + ... + n has the closed form n * (n + 1) / 2. Even if you didn't know that, consider that when i gets to n, the inner loop runs at most n times. So you have exactly n iterations of the outer loop (over i), each running at most n iterations of the inner loop (over j), which makes the O(n^2) more apparent.
I agree that the complexity would be exactly n^2 if the inner loop also ran from 0 to n, so you have your reasons to think that a loop of i from 0 to n containing a loop of j from 0 to i has to perform better, and that's true. But with big-O notation you're measuring the growth rate of the algorithm's complexity, not the exact number of operations.
p.s. O(log n) is usually achieved when you split the main problem into sub-problems.
I think you should interpret the table differently. O(N^2) complexity says that if you double the input N, the runtime should quadruple (take 4 times as long). In this case, the function f(n: number) returns a number mirroring its runtime; I use f(N) as shorthand for it.
So say N goes from 1 to 2, which means the input has doubled (2/1 = 2). The runtime then has gone from f(1) to f(2), which means it has increased f(2)/f(1) = 3/1 = 3 times. That is not 4 times, but the Big-O complexity measure is asymptotic, dealing with the situation where N approaches infinity. If we test another input doubling from the table, we have f(6)/f(3) = 21/6 = 3.5. It is already closer to 4.
Let us now stray outside the table and try more doublings with bigger N. For example we have f(200)/f(100) = 20100/5050 = 3.980 and f(5000)/f(2500) = 12502500/3126250 = 3.999. The trend is clear. As N approaches infinity, a doubled input tends toward a quadrupled runtime. And that is the hallmark of O(N^2).
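If you want to check that trend directly, here is a small sketch that re-declares the question's function as f and prints the ratio f(2N)/f(N) for a few doublings:

// f(n) counts 1 + 2 + ... + n by brute force, mirroring the question's code.
function f(n: number): number {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < i + 1; j++) {
      sum += 1;
    }
  }
  return sum;
}

// The ratio approaches 4 as N grows, the signature of O(N^2).
for (const N of [1, 3, 100, 2500]) {
  console.log(N, f(2 * N) / f(N));
}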

Worst case runtime in Big-Theta notation

What is the worst case run time in Big-Theta notation for the code below? The code calculates the average of homework scores from a list of homework scores after dropping the lowest score.
m := 1
for i := 2 to n
  if h_i < h_m then m := i
total := 0
for j := 1 to n
  if j != m then total := total + h_j
return total/(n − 1)
In the worst case, the lowest score sits in the last position, which means the first loop runs n-1 iterations. The upper bound and lower bound of the first loop are O(n) and Ω(n) respectively, so I believe it has a runtime of Θ(n).
The second loop is pretty much the same thing, except it runs n iterations.
I wonder, for the overall runtime of the whole program, do we take max(Θ(n), Θ(n)) = Θ(n) like we do with big-O notation, i.e. max(O(n), O(1)) = O(n)?
I ask this because I modified the above code to run with only ONE loop:
m := 1; total := h_1
for i := 2 to n
  if h_i < h_m then m := i
  total := total + h_i
total := total - h_m
return total/(n − 1)
This code also runs n-1 iterations => Θ(n). Now this seems weird to me, because the first version obviously has a longer runtime than the second since it has two loops. Which is why I ask whether it is correct to use max(Θ(f(n)), Θ(g(n))).
You're falling into a common mistake of thinking that big-O/θ notation tells you the run time. It doesn't, it tells you how the run time will (asymptotically) scale as a function of n. If algorithm 1 grows linearly in n, and algorithm 2 takes twice as long to run as algorithm 1, algorithm 2 still has linear growth as well. That's why we ignore any scaling constants for big-O/θ.
It's the same as with big-O notation, constant factors are dropped. So the whole runtime is Θ(2n) = Θ(n).
Also, having two loops doesn't mean a longer runtime, because the loops can be shorter or do less per iteration. Your second program does more per iteration so total runtime would be about the same.
You don't return before the iteration finishes, which means you always go through the full list.
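To make the comparison concrete, here is a sketch of both versions in TypeScript (the array parameter h and the function names are only illustrative); both do a constant amount of work per element, so both are Θ(n):

// Two-loop version: find the index of the minimum, then sum everything else.
function averageDropLowestTwoLoops(h: number[]): number {
  let m = 0;
  for (let i = 1; i < h.length; i++) {
    if (h[i] < h[m]) m = i;
  }
  let total = 0;
  for (let j = 0; j < h.length; j++) {
    if (j !== m) total += h[j];
  }
  return total / (h.length - 1);
}

// One-loop version: sum everything while tracking the minimum, then subtract it.
function averageDropLowestOneLoop(h: number[]): number {
  let m = 0;
  let total = h[0];
  for (let i = 1; i < h.length; i++) {
    if (h[i] < h[m]) m = i;
    total += h[i];
  }
  total -= h[m];
  return total / (h.length - 1);
}

The two-loop version does roughly 2n simple operations and the one-loop version roughly n, but that constant factor is exactly what Θ-notation drops.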

Big-O complexity of a piece of code

I have a question in algorithm design about complexity. In this question a piece of code is given and I should calculate this code's complexity.
The pseudo-code is:
for (i = 1; i <= n; i++) {
  j = i;
  do {
    k = j;
    j = j / 2;
  } while (k is even);
}
I tried this algorithm for some numbers and got varying results. For example, if n = 6, the inner loop executes as follows:
i = 1 -> executes 1 time
i = 2 -> executes 2 times
i = 3 -> executes 1 time
i = 4 -> executes 3 times
i = 5 -> executes 1 time
i = 6 -> executes 2 times
It doesn't follow a regular pattern. How should I calculate this?
The upper bound given by the other answers is actually too high. This algorithm has an O(n) runtime, which is a tighter upper bound than O(n log n).
Proof: Let's count how many total iterations the inner loop will perform.
The outer loop runs n times. The inner loop runs at least once for each of those.
For even i, the inner loop runs at least twice. This happens n/2 times.
For i divisible by 4, the inner loop runs at least three times. This happens n/4 times.
For i divisible by 8, the inner loop runs at least four times. This happens n/8 times.
...
So the total amount of times the inner loop runs is:
n + n/2 + n/4 + n/8 + n/16 + ... <= 2n
The total amount of inner loop iterations is between n and 2n, i.e. it's Θ(n).
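A sketch that counts the inner iterations directly confirms this (totalInnerIterations is just an illustrative name):

// Count every execution of the inner do-while body for a given n.
function totalInnerIterations(n: number): number {
  let steps = 0;
  for (let i = 1; i <= n; i++) {
    let j = i;
    let k: number;
    do {
      k = j;
      j = Math.floor(j / 2); // integer division, as in the pseudocode
      steps++;
    } while (k % 2 === 0);
  }
  return steps;
}

// For n = 1000 this prints 1994, which lies between n = 1000 and 2n = 2000.
console.log(totalInnerIterations(1000));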
You always assume you get the worst-case scenario at each level.
Now, you iterate over an array with N elements, so we start with O(N) already.
Now let's say your i always equals X and X is always even (remember, worst case every time). How many times do you need to divide X by 2 to get 1? (Reaching 1 is the only way the division stops for such numbers.)
In other words, we need to solve the equation
X / 2^k = 1, which gives X = 2^k and k = log_2(X).
This makes our algorithm take O(n log_2(X)) steps, which can easily be written as O(n log n).
For such a loop, we cannot count the inner and outer loops separately: the variables are tied together!
We thus have to count all steps.
In fact, for each iteration of outer loop (on i), we will have
1 + v_2(i) steps
where v_2 is the 2-adic valuation (see for example http://planetmath.org/padicvaluation), i.e. the exponent of 2 in the prime factorization of i.
So if we add steps for all i we get a total number of steps of :
n_steps = \sum_{i=1}^{n} (1 + v_2(i))
= n + v_2(n!) // since v_2(i) + v_2(j) = v_2(i*j)
= 2n - s_2(n) // from Legendre formula (see http://en.wikipedia.org/wiki/Legendre%27s_formula with `p = 2`)
We then see that the number of steps is exactly :
n_steps = 2n - s_2(n)
As s_2(n) is the sum of the digits of n in base 2, it is negligible compared to n (it is at most log_2(n), since each base-2 digit is 0 or 1 and there are at most about log_2(n) of them).
So the complexity of your algorithm is equivalent to n:
n_steps = O(n)
which is not the O(n log n) stated in many other solutions, but a smaller quantity!
Let's start with the worst case: if you keep dividing by 2 (integer division) you don't stop until you get to 1, which makes the number of steps depend on the bit width, something you find using the base-2 logarithm. So the inner part is log n.
The outer part is obviously n, so N log N total.
The do loop halves j until k becomes odd. k is initially a copy of j, which is a copy of i, so the do loop runs 1 + (the power of 2 dividing i) times:
i=1 is odd, so it makes 1 pass through do loop,
i=2 divides by 2 once, so 1+1,
i=4 divides twice by 2, so 1+2, etc.
That makes at most 1+log(i) do executions (logarithm with base 2).
The for loop iterates i from 1 through n, so the upper bound is n times (1+log n), which is O(n log n).

How is Summation(n) Theta(n^2) according to its formula, but Theta(n) if we just look at it as a single for loop?

Our prof and various materials say Summation(n) = n(n+1)/2 and hence is Theta(n^2). But intuitively, we just need one loop to find the sum of the first n terms! So it has to be Theta(n). I'm wondering what I am missing here?!
All of these answers are misunderstanding the problem just like the original question: The point is not to measure the runtime complexity of an algorithm for summing integers, it's talking about how to reason about the complexity of an algorithm which takes i steps during each pass for i in 1..n. Consider insertion sort: On each step i to insert one member of the original list the output list is i elements long, thus it takes i steps (on average) to perform the insert. What is the complexity of insertion sort? It's the sum of all of those steps, or the sum of i for i in 1..n. That sum is n(n+1)/2 which has an n^2 in it, thus insertion sort is O(n^2).
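As a concrete illustration of that reasoning, here is a minimal insertion sort sketch in TypeScript; on pass i the inner while loop shifts up to i elements, so the total work is 1 + 2 + ... + n, which is O(n^2):

// Minimal insertion sort: on pass i, the inner loop shifts up to i elements.
function insertionSort(arr: number[]): number[] {
  const a = arr.slice();
  for (let i = 1; i < a.length; i++) {
    const key = a[i];
    let j = i - 1;
    while (j >= 0 && a[j] > key) {
      a[j + 1] = a[j]; // shift larger elements one slot to the right
      j--;
    }
    a[j + 1] = key;
  }
  return a;
}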
The running time of this code is Θ(1) (assuming addition/subtraction and multiplication are constant-time operations):
result = n*(n + 1)/2 // This statement executes once
The running time of the following pseudocode, which is what you described, is indeed Θ(n):
result = 0
for i from 1 up to n:
  result = result + i // This statement executes exactly n times
Here is another way to compute it which has a running time of Θ(n²):
result = 0
for i from 1 up to n:
  for j from i up to n:
    result = result + 1 // This statement executes exactly n*(n + 1)/2 times
All three of those code blocks compute the natural numbers' sum from 1 to n.
This Θ(n²) loop is probably the type you are being asked to analyse. Whenever you have a loop of the form:
for i from 1 up to n:
  for j from i up to n:
    // Some statements that run in constant time
You have a running time complexity of Θ(n²), because those statements execute exactly summation(n) times.
I think the problem is that you're incorrectly assuming that the summation formula has time complexity theta(n^2).
The formula has an n^2 in it, but it doesn't require a number of computations or amount of time proportional to n^2.
Summing everything up to n in a loop would be theta(n), as you say, because you would have to iterate through the loop n times.
However, calculating the result of the equation n(n+1)/2 would just be theta(1) as it's a single calculation that is performed once regardless of how big n is.
Summation(n) being n(n+1)/2 refers to the sum of the numbers from 1 to n. That is a mathematical formula and can be evaluated without a loop, in O(1) time. If you iterate over an array to sum all its values, that is an O(n) algorithm.
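In TypeScript the contrast looks like this (a minimal sketch; both functions return the same value, only the amount of work differs):

// Θ(1): one arithmetic expression, independent of n.
function sumClosedForm(n: number): number {
  return (n * (n + 1)) / 2;
}

// Θ(n): one addition per value of i.
function sumLoop(n: number): number {
  let result = 0;
  for (let i = 1; i <= n; i++) {
    result += i;
  }
  return result;
}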

Determining Big O Notation

I need help understanding/doing Big O Notation. I understand the purpose of it, I just don't know how to "determine the complexity given a piece of code".
Determine the Big O notation for each of the following
a.
n=6;
cout<<n<<endl;
b.
n=16;
for (i=0; i<n; i++)
cout<<i<<endl;
c.
i=6;
n=23;
while (i<n) {
  cout<<i-6<<endl;
  i++;
}
d.
int a[ ] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
n=10;
for (i=0; i<n; i++)
a[i]=a[i]*2;
for (i=9; i>=0; i--)
cout<<a[i]<<endl;
e.
sum=0;
n=6;
k=pow(2,n);
for (i=0;i<k;i++)
sum=sum+k;
Big O indicates the order of the complexity of your algorithm.
Basic things:
The complexity is measured with respect to the input size.
You choose a unit operation (usually an assignment or a comparison).
You count how many times this operation is executed.
Constant terms and constant factors are usually ignored, so if the number of operations is 3*n^3 + 12, it is simplified to n^3, also written O(n^3).
a.) Will just run once, no loop, complexity is trivial here O(1)
b.) Call n times in the loop: O(n)
c.) Here we choose to analyze in terms of n (it is usually the variable that bounds the loop). The number of calls is n - 6, so this is O(n).
d.) Let's suppose here that 10 (n) is the size of your array (and 9 is its last index). We go from 0 up to n in the first loop and then from n-1 down to 0 in the second, so that's n + n operations, technically O(2*n), which simplifies to O(n). Both are called linear time; what differs is the slope of the line, which Big O doesn't care about.
e.) The loop goes from 0 to pow(2, n), i.e. it runs 2^n times, summarized as O(2^n).
Assuming you don't count the cout as part of your Big-O measurement.
a)
O(1) you can perform the integer assignment in constant time.
b)
O(n) because it takes n operations for the loop.
c)
O(n - c) = O(n) constants disappear in Big-O.
d)
O(2*n) = O(n); two linear-time passes still end up being linear time.
e)
If n is the input that grows, then k = pow(2, n) = 2^n and the number of operations is O(2^n); however, if n is held constant (here n = 6, so k = 2^6 = 64), the loop simply runs a fixed k times, i.e. it is linear in k.
These examples are fairly simple. First what you have to do is to determine the main (simple) operation in the code and try to express the number of invocations of this operation as a function of input.
To be less abstract:
a.)
This code always runs in constant time. That time depends on the computer, I/O latency, etc., but it is essentially independent of the value of n.
b.)
This time a piece of code inside a loop is executed several times. If n is two times bigger, what can you say about the number of iterations?
c.)
Again some code inside a loop. But this time the number of iterations is less than n. Still, if n is sufficiently big, do you see the similarity to b.)?
d.)
This code is interesting. The operation inside the first loop is more sophisticated, but again it takes a more-or-less constant amount of time. So how many times is it executed in relation to n? Once again, compare with b.)
The second loop is there only to trick you. For small n it might actually take more time than the first one. However O(n) notation always takes high n values into account.
e.)
The last piece of code is actually pretty simple. The number of simple operations inside the loop is equal to 2^n. Add 1 to n and you'll get twice as many operations.
To understand the full mathematical definition I recommend Wikipedia. For practical purposes, big-O is an upper bound on an algorithm: given a routine, how many times does it iterate before finishing for an input of length n? We call this upper bound O(n), or big O of n.
In code, accessing an element of a simple array in C++ is O(1). It is one operation regardless of how large the array is.
A linear iteration through an array in a for loop is O(n).
Nested for loops are O(n^2), or O(n^k) if you have k levels of nesting.
Recursion with divide and conquer (heaps, binary trees, etc.) is O(lg n) or O(n lg n) depending on the operation.
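For example, binary search halves the remaining range on every step, so it does at most about log2(n) comparisons (a sketch, assuming a sorted array of numbers):

// Binary search over a sorted array: the search range halves each iteration,
// so at most ~log2(n) comparisons are made.
function binarySearch(a: number[], target: number): number {
  let lo = 0;
  let hi = a.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (a[mid] === target) return mid;
    if (a[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // not found
}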
a
n=6;
cout<<n<<endl;
Constant time, O(1). This means as n increases from 1 to infinity, the amount of time needed to execute this statement does not increase. Each time you increment n, the amount of time needed does not increase.
b
n=16;
for (i=0; i<n; i++)
cout<<i<<endl;
Linear Time, O(n). This means that as n increases from 1 to infinity, the amount of time needed to execute this statement increases linearly. Each time you increment n, the amount of additional time needed from the previous remains constant.
c
i=6;
n=23;
while (i<n) {
  cout<<i-6<<endl;
  i++;
}
Linear Time, O(n), same as example 2 above.
d
int a[ ] = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
n=10;
for (i=0; i<n; i++)
a[i]=a[i]*2;
for (i=9; i>=0; i--)
cout<<a[i]<<endl;
Linear time, O(n). As n increases from 1 to infinity, the amount of time needed to execute these statements increases linearly. The line is twice as steep as in example 3; however, Big O notation does not concern itself with how steep the line is, only with how the time requirement grows. The two loops require a linearly growing amount of time as n increases.
e
sum=0;
n=6;
k=pow(2,n);
for (i=0;i<k;i++)
sum=sum+k;
Create a graph of how many times sum=sum+k is executed given the value n:
n number_of_times_in_loop
1 2^1 = 2
2 2^2 = 4
3 2^3 = 8
4 2^4 = 16
5 2^5 = 32
6 2^6 = 64
As n goes from 1 to infinity, notice how the number of times we are in the loop increases exponentially: 2->4->8->16->32->64. What would happen if I plugged in an n of 150? The number of times we would be in the loop becomes astronomical.
This is exponential time: O(2^n) denotes an algorithm whose work doubles with each additional element in the input data set. Plug in a large n at your own peril; you will be waiting hours or years for the calculation to complete for a handful of input items.
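A sketch that tabulates the loop count for example (e) makes the doubling obvious (the loopCount helper is just illustrative):

// Number of times "sum = sum + k" runs for a given n in example (e).
function loopCount(n: number): number {
  const k = Math.pow(2, n);
  let count = 0;
  for (let i = 0; i < k; i++) {
    count++;
  }
  return count;
}

// Each extra unit of n doubles the count: 2, 4, 8, 16, 32, 64.
for (let n = 1; n <= 6; n++) {
  console.log(n, loopCount(n));
}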
Why do we care?
As computer scientists, we are interested in properly understanding BigO notation because we want to be able to say things like this with authority and conviction:
"Jim's algorithm for calculating the distance between planets takes exponential time. If we want to do 20 objects it takes too much time, his code is crap because I can make one in linear time."
And better yet, if they don't like what they hear, you can prove it with Math.

Resources