How do I measure the O-notation of a for loop? - algorithm

I have a question about O-notation (big O).
In my code, I am using a for loop to iterate through an array of users.
The for loop has if-statements that make it break out of the loop if the right user is found.
My question is how I measure the O-notation.
Is the O-notation O(N), as I loop through all the users in the array?
Or is the O-notation O(1), as the loop breaks and never runs again?

O notation defines an "order of" relationship between an amount of work (however measured) and the number of items processed (usually 'n'). So "O(n)" means "in direct proportion to the number of items n". "O(1)" means simply "constant". If a loop processes every item once then the amount of work is intuitively in direct proportion to n, but let's say that your exit condition gets hit on average half way through, we might be tempted to say that this is O(n/2), but instead we still say that it is O(n) because the relationship to n is still direct/linear. Similarly if you were to assess the relationship to be O(7n^3 + 2n), you'd say the relationship was simply O(n^3) because n^3 is the term that dominates as n grows large.
The answer to your specific question is therefore O(n) because the number of iterations is in direct proportion to n. All that this says is that if N user records take M milliseconds to process, 2N should take about 2M milliseconds.
It is probably worth noting that O notation by itself only expresses an upper bound on growth. It is most commonly quoted for the worst case, but it can equally describe the average or best cost of an algorithm (and it is quite common for people to use it in the average-case sense). It is always a good idea to specify which case you mean to avoid ambiguity.
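To make the "dominant term" point concrete, here is a small sketch using the 7n^3 + 2n example from above. The function names `work` and `ratio_to_cubic` are made up for illustration; the idea is only that the ratio to n^3 settles toward the constant 7, which is why the lower-order term and the constant are dropped.

```python
def work(n):
    # Hypothetical cost function from the text: 7n^3 + 2n "units of work".
    return 7 * n**3 + 2 * n


def ratio_to_cubic(n):
    # How many times larger than n^3 the work is; equals 7 + 2/n^2.
    return work(n) / n**3


def doubling_factor(n):
    # Factor by which the work grows when n doubles; approaches 2^3 = 8.
    return work(2 * n) / work(n)
```

For growing n, `ratio_to_cubic(n)` gets closer and closer to 7 and `doubling_factor(n)` closer to 8, which is the behaviour O(n^3) summarizes.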

Big O notation answers the following two questions:
If there are N data elements, how many steps will the algorithm take?
How will the performance of the algorithm change if the number of data elements increases?
The best-case scenario in your case is that the user you are searching for is found at the first index. Time complexity in this case would be O(1) because the number of steps taken by the algorithm is constant and does not change if the number of elements in the array changes.
The worst-case scenario is that your loop will have to iterate over all the users. That makes the time complexity O(N) because the number of steps taken by the algorithm will be directly proportional to the number of elements in the array.
Big O notation is conventionally quoted for the worst-case scenario, so you can say that the time complexity in your case is O(N).
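As a rough illustration of the best and worst cases described above, the sketch below counts loop iterations for a search that breaks out early. The function name `find_user` and the sample names are assumptions for the example, not taken from the asker's code.

```python
def find_user(users, target):
    """Return (index, steps); index is -1 when the target is absent."""
    steps = 0
    for index, user in enumerate(users):
        steps += 1              # one loop iteration, i.e. one comparison
        if user == target:
            return index, steps  # early exit, like the break in the question
    return -1, steps


users = ["ann", "bob", "cat", "dan"]
best = find_user(users, "ann")     # found at the first index
worst = find_user(users, "dan")    # found at the last index
missing = find_user(users, "zed")  # not present at all
```

The best case takes one step regardless of the array length, while the last-position and not-found cases take as many steps as there are users.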

The best-case complexity of a for loop with an early exit is O(1) and the worst-case complexity is O(N). The same holds for linear search: the best case is O(1) (the target is at the first index) and the worst case is O(N). It also depends on the approach you follow to solve the problem. For example, with for(int i = n; i>1; i=i/2) the complexity is O(log(N)). The complexity of an if-else condition is O(1).
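The halving loop mentioned above can be checked by counting its iterations. This is a sketch that mirrors `for(int i = n; i>1; i=i/2)` with integer division; `halving_steps` is a name chosen for the example.

```python
def halving_steps(n):
    """Count iterations of: for (i = n; i > 1; i = i / 2)."""
    steps = 0
    i = n
    while i > 1:
        steps += 1
        i //= 2  # integer division, as in the C-style loop
    return steps
```

For n >= 1 the count equals floor(log2(n)), which in Python can be written as `n.bit_length() - 1`, so the loop is O(log N).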

Related

Time Complexity (Big O) - Can value of N decides whether the time complexity is O(1) or O(N) when we have 2 nested FOR loops?

Suppose that I have 2 nested for loops, and 1 array of size N as shown in my code below:
int result = 0;
for (int i = 0; i < N; i++)
{
    for (int j = i; j < N; j++)
    {
        result = array[i] + array[j]; // just some funny operation
    }
}
Here are 2 cases:
(1) If the constraint is that N >= 1,000,000 strictly, then we can definitely say that the time complexity is O(N^2). This is true for sure, as we all know.
(2) Now, if the constraint is that N < 25 strictly, then people could probably say that, because we know N is always very small, the time complexity is estimated to be O(1), since it takes very little time to run and complete these 2 for loops WITH MODERN COMPUTERS? Does that sound right?
Please tell me if the value of N plays a role in deciding the outcome of the time complexity O(N)? If yes, then how big does the value of N need to be in order to play that role (1,000? 5,000? 20,000? 500,000?) In other words, what is the general rule of thumb here?
INTERESTING THEORETICAL QUESTION: If 15 years from now computers are so fast that, even if N = 25,000,000, these 2 for loops can be completed in 1 second, can we then say that the time complexity would be O(1) even for N = 25,000,000? I suppose the answer would be YES at that time. Do you agree?
tl;dr No. The value of N has no effect on time complexity. O(1) versus O(N) is a statement about "all N", or about how the amount of computation increases when N increases.
Great question! It reminds me of when I was first trying to understand time complexity. I think many people have to go through a similar journey before it ever starts to make sense so I hope this discussion can help others.
First of all, your "funny operation" is actually funnier than you think since your entire nested for-loops can be replaced with:
result = array[N - 1] + array[N - 1]; // just some hilarious operation hahaha ha ha
Since result is overwritten each time, only the last iteration affects the outcome. We'll come back to this.
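A quick way to convince yourself of this is to run both versions side by side. The sketch below is a Python transcription of the nested loops and of the one-liner, assuming a non-empty array; the function names are invented for the comparison.

```python
def nested_result(array):
    # Direct transcription of the two nested loops from the question.
    n = len(array)
    result = 0
    for i in range(n):
        for j in range(i, n):
            result = array[i] + array[j]  # overwritten on every iteration
    return result


def one_liner(array):
    # Only the last iteration (i = j = N - 1) survives. Assumes N >= 1.
    return array[-1] + array[-1]
```

Both functions return the same value for any non-empty input, but the first does on the order of N^2 additions and the second does one.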
As far as what you're really asking here, the purpose of Big-O is to provide a meaningful way to compare algorithms in a way that is independent of any particular input size and independent of the computer's processing speed. In other words, O(1) versus O(N) has nothing to do with the size of N and nothing to do with how "modern" your computer is. That all affects the execution time of the algorithm on a particular machine with a particular input, but does not affect time complexity, i.e. O(1) versus O(N).
It is actually a statement about the algorithm itself, so a math discussion is unavoidable, as dxiv has so graciously alluded to in his comment. Disclaimer: I'm going to omit certain nuances in the math since the critical stuff is already a lot to explain and I'll defer to the mountains of complete explanations elsewhere on the web and textbooks.
Your code is a great example to understand what Big-O does tell us. The way you wrote it, its complexity is O(N^2). That means that no matter what machine or what era you run your code in, if you were to count the number of operations the computer has to do, for each N, and graph it as a function, say f(N), there exists some quadratic function, say g(N)=9999N^2+99999N+999 that is greater than f(N) for all N.
But wait, if we just need to find big enough coefficients in order for g(N) to be an upper bound, can't we just claim that the algorithm is O(N) and find some g(N)=aN+b with gigantic enough coefficients that it's an upper bound of f(N)? THE ANSWER TO THIS IS THE MOST IMPORTANT MATH OBSERVATION YOU NEED TO UNDERSTAND TO REALLY UNDERSTAND BIG-O NOTATION. Spoiler alert. The answer is no.
For visuals, try this graph on Desmos where you can adjust the coefficients: https://www.desmos.com/calculator/3ppk6shwem
No matter what coefficients you choose, a function of the form aN^2+bN+c will ALWAYS eventually outgrow a function of the form aN+b (both having positive a). You can push a line as high as you want like g(N)=99999N+99999, but even the function f(N)=0.01N^2+0.01N+0.01 crosses that line and grows past it after N=9999900. There is no linear function that is an upper bound to a quadratic. Similarly, there is no constant function that is an upper bound to a linear function or quadratic function. Yet, we can find a quadratic upper bound to this f(N) such as h(N)=0.01N^2+0.01N+0.02, so f(N) is in O(N^2). This observation is what allows us to just say O(1) and O(N^2) without having to distinguish between O(1), O(3), O(999), O(4N+3), O(23N+2), O(34N^2+4+e^N), etc. By using phrases like "there exists a function such that" we can brush all the constant coefficients under the rug.
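The crossover point quoted above can be checked numerically. The sketch below uses exact fractions to avoid floating-point rounding and a doubling-then-bisecting search. The function names are made up for the example, and the search relies on the assumption that f starts below g and, once above it, stays above (true for this pair).

```python
from fractions import Fraction


def f(n):
    # The "tiny" quadratic from the text: 0.01N^2 + 0.01N + 0.01
    return Fraction(1, 100) * (n * n + n + 1)


def g(n):
    # The "huge" line from the text: 99999N + 99999
    return 99999 * n + 99999


def first_n_where_f_exceeds_g():
    # Double hi until the quadratic is above the line.
    hi = 1
    while f(hi) <= g(hi):
        hi *= 2
    lo = hi // 2  # the loop above already showed f(lo) <= g(lo)
    # Bisect, keeping f(lo) <= g(lo) and f(hi) > g(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) > g(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

Calling `first_n_where_f_exceeds_g()` gives the first integer N at which the quadratic overtakes the line; making the line steeper only pushes that point further out, it never removes it.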
So having a quadratic upper bound, aka being in O(N^2), means that the function f(N) is no bigger than quadratic and in this case happens to be exactly quadratic. It sounds like this just comes down to comparing the degree of polynomials, why not just say that the algorithm is a degree-2 algorithm? Why do we need this super abstract "there exists an upper bound function such that bla bla bla..."? This is the generalization necessary for Big-O to account for non-polynomial functions, some common ones being logN, NlogN, and e^N.
For example if the number of operations required by your algorithm is given by f(N)=floor(50+50*sin(N)), we would say that it's O(1) because there is a constant function, e.g. g(N)=101 that is an upper bound to f(N). In this example, you have some bizarre algorithm with oscillating execution times, but you can convey to someone else how much it doesn't slow down for large inputs by simply saying that it's O(1). Neat. Plus we have a way to meaningfully say that this algorithm with trigonometric execution time is more efficient than one with linear complexity O(N). Neat. Notice how it doesn't matter how fast the computer is because we're not measuring in seconds, we're measuring in operations. So you can evaluate the algorithm by hand on paper and it's still O(1) even if it takes you all day.
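As a sanity check of the oscillating example, the sketch below evaluates f(N)=floor(50+50*sin(N)) over a range of N and confirms it never exceeds the constant bound. The name `oscillating_cost` is an assumption for the illustration.

```python
import math


def oscillating_cost(n):
    # Operation count from the text: floor(50 + 50*sin(N)), always in [0, 100].
    return math.floor(50 + 50 * math.sin(n))


def max_cost_up_to(limit):
    # Largest operation count seen for N = 0 .. limit - 1.
    return max(oscillating_cost(n) for n in range(limit))
```

However far you push `limit`, `max_cost_up_to(limit)` stays at or below 100, so the constant function g(N)=101 bounds it, which is all that O(1) claims.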
As for the example in your question, we know it's O(N^2) because there are aN^2+bN+c operations involved for some a, b, c. It can't be O(N), let alone O(1), because no matter what aN+b you pick, I can find a large enough input size N such that your algorithm requires more than aN+b operations. On any computer, in any time zone, with any chance of rain outside. Nothing physical affects O(1) versus O(N) versus O(N^2). What changes it to O(1) is changing the algorithm itself to the one-liner that I provided above, where you just add two numbers and spit out the result no matter what N is. Let's say for N=10 it takes 4 operations to do both array lookups, the addition, and the variable assignment. If you run it again on the same machine with N=10000000 it's still doing the same 4 operations. The number of operations required by the algorithm doesn't grow with N. That's why the algorithm is O(1).
It's why problems like finding a O(NlogN) algorithm to sort an array are math problems and not nano-technology problems. Big-O doesn't even assume you have a computer with electronics.
Hopefully this rant gives you a hint as to what you don't understand so you can do more effective studying for a complete understanding. There's no way to cover everything needed in one post here. It was some good soul-searching for me, so thanks.

Time complexity will two O(n^2) algorithms take the same amount of time?

I am having trouble fully understanding this question:
Two O(n^2) algorithms will always take the same amount of time for a given value of n. True or false? Explain.
I think the answer is false, because from my understanding, the asymptotic time complexity only says that the two algorithms run in O(n^2) time; one algorithm might still take longer, as it might have additional O(n) components. Like O(n^2) vs (O(n^2) + O(n)).
I am not sure if my logic is correct. Any help would be appreciated.
Yes, you are right. Big Oh notation describes an upper bound on time complexity. There might be some extra constant term c or a smaller term in n, like O(n), added to it, which won't be considered for time complexity.
Moreover,
for i = 0 to n
    for j = 0 to n
        // some constant time operation
    end
end
And
for i = 0 to n
    for j = i to n
        // some constant time operation
    end
end
Both of these are O(n^2) asymptotically but won't take same time.
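To see how far apart the two loops above are, one can simply count the constant-time operations. This is a sketch that assumes the pseudocode ranges are half-open (0 up to but not including n); the function names are hypothetical.

```python
def full_square_steps(n):
    # Inner loop always starts at 0: n * n operations.
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1
    return steps


def triangle_steps(n):
    # Inner loop starts at i: n + (n-1) + ... + 1 = n(n+1)/2 operations.
    steps = 0
    for i in range(n):
        for j in range(i, n):
            steps += 1
    return steps
```

The second version does roughly half the work of the first for every n, yet both counts are quadratic in n, so both are O(n^2).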
The concept of big Oh analysis is not to calculate the precise amount of time a program takes to execute, it's not about counting how many times a loop iterates. Rather it indicates the algorithm's growth rate with n.
The answer is correct but the explanation is lacking.
For one, the big O notation allows arbitrary constant factors, so both n^2 and 100*n^2 are in O(n^2), but clearly the second is always larger.
Another reason is that the notation only gives an upper bound, so even a runtime of n is in O(n^2), and one of the algorithms may in fact be linear.

How to calculate "n" for different notations: Big O, Omega, Little o, Little omega and Theta Notation

I am studying algorithms, but the calculations to find time complexity are not easy for me. It is hard to remember when to use log n, n log n, n^2, n^3, 2n, etc. My doubt is about how to consider these functions while computing the complexity: is there any specific way to calculate the complexity, like "a for loop always takes this much complexity" and so on?
Log(n): when the problem is halved at every step, use log(n).
I mean, in divide and conquer, when you divide the problem into 2 halves you are actually generating a recursion tree.
Its depth is Log(n). Why? Because it is a binary tree in nature, and for a binary tree we use Log(Base2)(n).
Try it yourself: suppose n=4 (elements), so log(base2)(4)=2; you can divide it into equal halves twice before reaching single elements.
nLog(n): remember that Log(n) was the number of divisions needed to reach single elements. After that you start merging sorted elements, which takes linear time at each level.
In other words, merging the elements at one level has complexity "n" and there are Log(n) levels, so the total complexity is n (merging) * Log(n) (levels), which is nLog(n).
n^2:
When you see a problem solved with two nested loops, each running over n items, the complexity is n^2.
E.g. matrices/2-D arrays are processed in 2 loops, one loop inside the outer loop.
n^3: for 3-D arrays, this is for 3 nested loops: a loop inside a loop inside a loop.
2n: thanks, you did not forget to write "2" with this "n", otherwise I would have forgotten to explain this.
The "2" here with "n" is a constant, so just ignore it. Why? Because if you travel to another city by air, you count only the hours taken by the flight, not the time spent reaching the airport. I mean it is minor, so we remove the constant.
And for "n" just remember the word "Linear", i.e. Big-O(n) is linear complexity. Sadly, I discovered there is no comparison-based algorithm that sorts elements in linear time, i.e. in just one loop (a single array traversal).
Things To Remember:
Linear Time: complexity Big-O(n).
Polynomial Time: complexity bounded by a power of n, e.g. n^2, n^3, n^4, n^5 (log(n) and nlog(n) grow more slowly than n^2, so they fall under this bound too).
Exponential Time: 2^n, n^n, i.e. the problem is solved in exponential time (these are bad, bad, bad: impractical for all but small n).
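The nLog(n) reasoning above (n work per level, Log(n) levels) can be observed on a small merge sort. This is a minimal sketch, not a tuned implementation; the one-element list `counter` used to accumulate the merge work is a device invented for the example.

```python
def merge_sort(items, counter):
    """Sort items; counter[0] accumulates the number of elements merged."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)    # divide: first half
    right = merge_sort(items[mid:], counter)   # divide: second half
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):    # merge: linear in len(items)
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    counter[0] += len(merged)
    return merged
```

When n is a power of two, each of the log2(n) levels merges n elements in total, so the counter ends at n * log2(n): 24 for n=8 and 64 for n=16.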
There are many links on how to roughly calculate Big O and its sibling's complexity, but there is no true formula.
However, there are guidelines to help you calculate complexity such as these presented below. I suggest reviewing as many different programs and data structures to help familiarize yourself with the pattern and just study, study, study until you get it! There is a pattern and you will see it the more you study it.
Source: http://www.dreamincode.net/forums/topic/125427-determining-big-o-notation/
Nested loops are multiplied together.
Sequential loops are added.
Only the largest term is kept, all others are dropped.
Constants are dropped.
Conditional checks are constant (i.e. 1).
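The guidelines above can be tried on a toy function that has one plain loop followed by two nested loops. This is only a sketch; `count_steps` is a hypothetical name and each "step" stands for one constant-time operation.

```python
def count_steps(n):
    steps = 0
    # Sequential loop: contributes n steps.
    for _ in range(n):
        steps += 1
    # Nested loops: the counts multiply, contributing n * n steps.
    for _ in range(n):
        for _ in range(n):
            steps += 1
    # Sequential parts are added: n + n^2 in total.
    return steps
```

The total is n + n^2; keeping only the largest term gives O(n^2).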

What is the relation/difference between worst case time complexity of an algorithm and its upper bound?

What is the relation/difference between worst case time complexity of an algorithm and its upper bound?
The term "upper bound" is not very clear, as it may refer to two possible things:
The upper bound of the algorithm - the bound where the algorithm can never run "slower" than it. This is basically the worst case performance of it, so if this is what you mean - the answer is pretty simple.
big-O notation, which provides an upper bound on the complexity of the algorithm under a specific analysis. The big-O notation is a set of functions, and can be applied to any analysis of an algorithm, including worst case, average case, and even best case.
Let's take Quick Sort as an example.
Quick Sort is said to have O(n^2) worst case performance, and O(nlogn) average case performance. How can one algorithm have two complexities? Simple: the function representing the analysis of the average case and the one representing the worst case are completely different functions, and we can apply big O notation to each of them; there is no restriction about it.
Moreover, we can even apply it to the best case. Consider a small optimization to quick-sort, where it first checks if the array is already sorted, and if it is, it stops immediately. This is effectively an O(n) operation, and there is some input that will provide this behavior, so we can now say that the algorithm's best case complexity is O(n).
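The gap between the cases can be seen by counting comparisons in a deliberately naive quicksort. This sketch assumes a first-element pivot and charges one comparison per element partitioned; it is not how a production quicksort is written, and `quicksort_comparisons` is a name chosen for the example.

```python
def quicksort_comparisons(items):
    """Return the number of pivot comparisons a naive quicksort would make."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Each element of rest is compared against the pivot once.
    return (len(rest)
            + quicksort_comparisons(smaller)
            + quicksort_comparisons(larger))
```

On already-sorted input every partition is maximally lopsided and the count is n(n-1)/2, while an input that splits evenly at each step needs far fewer comparisons. The same algorithm therefore has different cost functions for different cases, and each can be given its own big-O bound.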
The difference between worst case and big O (upper bound) is that the worst case is a case that actually happens to your code, while the upper bound is an overestimate, an assumption that we make in order to calculate the big O; it doesn't have to happen.
Example on insertion sort:
Worst Case:
The numbers are all arranged in reverse order, so you need to move every single number.
Pseudo-code
for j=2 to n
    do key = a[j]
    i=j-1
    while i>0 & a[i]>key
        do a[i+1] = a[i]
        i=i-1
    end while
    a[i+1]=key
end for
Upper Bound:
We assume that the inner loop runs n-1 times on every pass of the outer loop. In fact the count changes on every pass (it is at most j-1, so it can't be n-1 every time), but we assumed/overestimated it to calculate the big O.
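The difference between the actual worst case and the overestimate can be measured. Below is a sketch of the insertion sort above in Python (0-indexed, so the outer loop starts at 1), with a counter for inner-loop shifts; the function name and the counter are assumptions added for the illustration.

```python
def insertion_sort_shifts(values):
    """Return (sorted copy, number of inner-loop shifts performed)."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # shift the larger element one slot right
            shifts += 1
            i -= 1
        a[i + 1] = key
    return a, shifts
```

For a reversed input of 6 numbers the real worst case is 15 shifts (1+2+3+4+5), whereas the overestimate of n-1 shifts on each of the n-1 passes gives 25. Both are quadratic in n, which is why the looser count is good enough for the big O.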

Best and worst case time for Algorithm S when time complexity changes in accordance to n being even/odd

The following is a homework assignment, so I would rather get hints or bits of information that would help me figure this out, and not complete answers.
Consider S an algorithm solution to a problem that takes as input an array A of size n. After analysis, the following conclusion was obtained:
Algorithm S executes an O(n)-time computation for each even number in A.
Algorithm S executes an O(logn)-time computation for each odd number in A.
What are the best and worst case time for algorithm S?
From this I understand that the time complexity changes in accordance to n being even or odd. In other words, if n is even, S takes O(n) time and when n is odd, S takes O(logn).
Is it a simple matter of taking the best case and the worst case of both growth-rates, and choosing their boundaries? Meaning:
Best case of O(n) is O(1), and worst case is O(n).
Best case of O(logn) is O(logn) and worst case is O(logn).
Therefore the best case for Algorithm S is O(logn) and the worst case is O(n)?
Am I missing something? or am I wrong in assessing the different best/worst case of both cases of big-Oh?
1st attempt:
Ok, so I completely misunderstood the problem. Thanks to candu, I can now better understand what is required of me, and so try to calculate the best and worst case better.
It seems that Algorithm S changes its runtime according to EACH number in A. If the number is even, the runtime is O(n), and if the number is odd, we get O(logn).
The worst case will be composed of an array A of n even numbers, and for each the algorithm will run O(n). In other words, the worst case runtime for Algorithm S should be n*O(n).
The best case will be composed of an array A of n odd numbers, and for each the algorithm will run O(logn). The best case runtime for algorithm S should be n*O(logn).
Am I making any sense? is it true then that:
Best case of algorithm S is nO(logn) and worst case is nO(n)?
If that is true, can it be rewritten? for example, as O(log^n(n)) and O(n^n)? or is this an arithmetic mistake?
2nd attempt:
Following JuanLopes' response, it seems like I can rewrite nO(n) as O(n*n) or O(n^2), and nO(logn) as O(nlogn).
Does it make sense now that Algorithm S runs at O(nlogn) at the best case, and O(n^2) at the worst case?
There's a bit of confusion here: the algorithm runtime doesn't depend on n being even or odd, but on whether the numbers in A are even or odd.
With that in mind, what sort of input A would make Algorithm S run faster? Slower?
Also: it doesn't make sense to say that the best case of O(n) is O(1). Suppose I have an algorithm ("Algorithm Q") that is O(n); all I can say is that there exists a constant c such that, for any input of size n, Algorithm Q takes less than cn time. There is no guarantee that I can find specific inputs for which Algorithm Q is O(1).
To give a concrete example, this takes linear time no matter what input it is passed:
def length(A):
    count = 0  # avoid shadowing the built-in len
    for x in A:
        count += 1
    return count
A few thoughts.
First, there is no mention of asymptotically tight time. So an O(n) computation can actually be an O(logn) one. So just imagine the best running time this algorithm can have in this case. I know, this is a little picky. But this is homework, so I guess it's always welcome to mention all the possibilities.
Second, even if it's asymptotically tight, it doesn't necessarily mean it's tight for all elements. Consider insertion sort. For each new element to insert, we need to find the correct position in the previous already-sorted subarray. The time is proportional to the number of elements in the subarray, which has the upper bound O(n). But it doesn't mean each new element needs exactly n comparisons to insert. Actually, the shorter the subarray, the quicker the insertion.
Back to this question. "executes an O(logn)-time computation for each odd number in A." Let's assume all the numbers are odd. It could be that the first odd number takes O(log1), the second takes O(log2), ..., the nth takes O(logn). In total, it takes O(log(n!)). That doesn't contradict "O(logn) for each odd number".
As to the worst case, you may analyze it in much the same way.
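The log(n!) total can be checked numerically. The sketch below sums log2(k) for k = 1..n and compares it with log2(n!) and with the cruder n*log2(n) bound; `total_log_cost` is a hypothetical name and the per-element cost of exactly log2(k) is an assumption made purely for illustration.

```python
import math


def total_log_cost(n):
    # Assumed cost model: the k-th element costs log2(k) units of work.
    return sum(math.log2(k) for k in range(1, n + 1))


def log_factorial(n):
    # log2(n!) computed directly from the factorial.
    return math.log2(math.factorial(n))
```

The two functions agree up to floating-point rounding, and the total never exceeds n*log2(n), so a total of O(log(n!)) is still consistent with an O(nlogn) bound.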
