Big O based comparison of an algorithm's running time

It was during a Data Structures and Algorithms lesson that I encountered this doubt. I have looked up several resources, both online and offline, but still have doubts. I was asked to figure out the running time of an algorithm and I did that correctly: O(n^3). But what puzzled me was this question: how much slower does the algorithm get if n goes from, say, 90 to 900,000? My doubts were:
The algorithm has a running time of O(n^3), so obviously it will take more time for a larger input. But how do I compare an algorithm's performance for different inputs based on just the worst-case time complexity?
Can we just plug the values of n into the big-O expression and divide to get a ratio?
Please correct me wherever I am mistaken! Thanks!

Your thinking is essentially correct!
Since the algorithm is O(n^3), in the worst case its running time grows proportionally to n^3 as the input size n grows (equivalently, its speed falls inversely with n^3).
Your assumption is correct. You just substitute n = 900,000 and n = 90 into n^3 and divide; the ratio is the slowing factor.
Here, slowing factor = (900,000)^3 / (90)^3 = (10,000)^3 = 10^12.
Hence the program slows down by a factor of 10^12, quite a drastic change! Or, in other words, its efficiency falls to 10^(-12) of what it was.
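The arithmetic can be checked in a couple of lines; a quick sketch in Python using the numbers from this answer:

```python
# Slowdown factor for an O(n^3) algorithm when n grows
# from 90 to 900,000 (constant factors ignored).
n_small, n_large = 90, 900_000

slowdown = n_large**3 / n_small**3   # equals (n_large / n_small)**3
print(slowdown)  # 10**12
```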
EDIT based on suggested comments:
But how do I compare an algorithm's performance for different inputs
based on just the worst case time complexity ?
As hinted by G.Bach, one of the commenters on this post, there is a subtlety in the question itself: to think of a general solution you should be talking about Big-Theta notation instead of Big-O. Big-O is an upper bound, whereas Big-Theta is a tight bound. When people only worry about the worst that can happen, Big-O is sufficient; it says "it can't get much worse than this". The tighter the bound the better, of course, but a tight bound isn't always easy to compute.
So in a worst-case analysis your question is answered the way both of us have answered it. One narrow reason people use O instead of a tight bound is to drop disclaimers about worst or average cases. For a tighter analysis you would have to establish both an upper bound (Big-O) and a lower bound (Big-Omega) to frame the question. The question can then be answered for varied sizes of n; I leave that for you to figure out, but please feel free to comment if you have any doubts.
Also, your stated running time is not by itself a worst-case analysis, but it is related and the question can be reframed that way.
P.S. I have taken some ideas/statements from Big-oh vs big-theta, so thanks to the authors too!
I hope it helps. Feel free to comment if I skimmed over anything!

Your understanding is correct.
If the only performance measure you have is worst-case time complexity, that's all you can compare. If you had more information you could make a better estimate.
Yes, just substitute the values of n and divide 900,000^3 by 90^3 to get the ratio.

Related

Asymptotic notation omega

Big-O always gives the upper bound. So we can measure the way we write code so that it has lower time complexity and thus better performance. But why do we use the lower bound (omega)? I did not understand the use of omega in practice. Can anybody please advise me on this?
It's a precision feature. It is usually easier to prove that an algorithm will take, say, O(n) operations to complete than to prove that it will take at least that many, i.e., Ω(n) operations. (BTW, in this context an operation means an elementary computation such as a logical or arithmetic one.)
By providing a lower bound, you are also giving an estimate of the best case scenario, as the big-O notation only provides an upper bound.
From a practical viewpoint, this has the benefit of telling that no matter what, any algorithm will require so many (elementary) steps (or more).
Note also, that it is also useful to have estimates of the average, the worst and the best cases, because these will shed more light on the complexity of the algorithm.
There are problems whose inherent complexity is known to be at least of some order (meaning there is a mathematical theorem proving the fact). So, no matter the algorithm, these problems cannot be solved with less than a certain number of calculations. This is also useful because it lets you know whether a given algorithm is sub-optimal or matches the inherent complexity of the problem.
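As an illustrative sketch (the function and its comparison counter are made up for this example), counting comparisons in a linear search shows why upper and lower bounds describe different things: the best case needs only one comparison, while the worst case needs n.

```python
# Illustrative only: count comparisons made by a linear search.
# The best case (target first) illustrates a lower bound;
# the worst case (target last) illustrates the O(n) upper bound.
def linear_search(items, target):
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons

data = list(range(1000))
_, best = linear_search(data, 0)     # found immediately: 1 comparison
_, worst = linear_search(data, 999)  # found last: 1000 comparisons
print(best, worst)  # 1 1000
```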

When (not how or why) to calculate Big O of an algorithm

I was asked this question in an interview recently and was curious as to what others thought.
"When should you calculate Big O?"
Most sites/books talk about HOW to calculate Big O but not actually when you should do it. I'm an entry-level developer with minimal experience, so I'm not sure if I'm thinking along the right track. My thinking is that you would have a target Big O to work towards, develop the algorithm, then calculate the Big O, and then try to refactor the algorithm for efficiency.
My question then becomes is this what actually happens in industry or am I far off?
"When should you calculate Big O?"
When you care about the Time Complexity of the algorithm.
When do I care?
When you need to make your algorithm to be able to scale, meaning that it's expected to have big datasets as input to your algorithm (e.g. number of points and number of dimensions in a nearest neighbor algorithm).
Most notably, when you want to compare algorithms!
You are asked to do a task to which several algorithms can be applied. Which one do you choose? You compare their space complexity, time complexity, and development/maintenance complexity, and choose the one that best fits your needs.
Big O or asymptotic notations allow us to analyze an algorithm's running time by identifying its behavior as the input size for the algorithm increases.
So whenever you need to analyse your algorithm's behavior with respect to growth of the input, you will calculate this. Let me give you an example -
Suppose you need to query over 1 billion records, and you wrote a linear search algorithm. Is that okay? How would you know? You calculate Big-O: it's O(n) for linear search, so in the worst case it would execute 1 billion instructions per query. If your machine executes 10^7 instructions per second (let's assume), then it would take 100 seconds. So you see, you are getting a runtime analysis in terms of the growth of the input.
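That back-of-the-envelope estimate can be written out directly; the 10^7 instructions/second figure is the answer's assumption, not a real machine's spec:

```python
# Back-of-the-envelope estimate from the example above.
# The instruction rate is an assumed figure, not a measured one.
n = 1_000_000_000                 # records to search
instructions_per_second = 10**7   # assumed machine speed

worst_case_instructions = n       # linear search is O(n)
seconds = worst_case_instructions / instructions_per_second
print(seconds)  # 100.0
```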
When we are solving an algorithmic problem we want to test the algorithm irrespective of hardware where we are running the algorithm. So we have certain asymptotic notation using which we can define the time and space complexities of our algorithm.
Theta notation: a tight bound; it bounds the function from above and below.
Omega notation: bounds the function from below; a lower bound, the kind of guarantee a best-case-style statement uses.
Big-O notation: bounds the function from above; an upper bound, which is why it is the one usually quoted for worst-case complexity.
Note that worst/average/best case and O/Theta/Omega are independent ideas: any of the three bounds can be applied to any of the three cases.
Now I think the answer to why Big-O is calculated is that it gives a fair idea of how badly our algorithm can perform as the input size increases. And if we optimize our algorithm for the worst case, the average and best cases will take care of themselves.
I assume that you want to ask "when should I calculate time complexity?", just to avoid technicalities about Theta, Omega and Big-O.
The right attitude is to guess it almost always. Notable exceptions are pieces of code you will run only once and that you are sure will never receive larger input.
The emphasis is on guess, because it does not matter much whether the complexity is, say, constant or logarithmic. There is also little difference between O(n^2) and O(n^2 log n), or between O(n^3) and O(n^4). But there is a big difference between constant and linear.
The main goal of the guess is to answer the question: "What happens if I get a 10 times larger input?" If complexity is constant, nothing happens (in theory at least). If complexity is linear, you get a 10 times larger running time. If complexity is quadratic or bigger, you start to have problems.
The secondary goal of the guess is to answer the question: "What is the biggest input I can handle?" Again, quadratic gets you to around 10,000 at most; O(2^n) ends around 25.
This might sound scary and time consuming, but in practice, getting time complexity of the code is rather trivial, since most of the things are either constant, logarithmic or linear.
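The "10 times larger input" guesses above can be sketched numerically; `growth_factor` is a made-up helper for this illustration:

```python
import math

# How much longer an algorithm runs when the input grows 10x,
# for a few common complexity classes (constants ignored).
def growth_factor(steps, n=1_000):
    return steps(10 * n) / steps(n)

print(growth_factor(lambda n: 1))             # constant: 1.0
print(growth_factor(lambda n: math.log2(n)))  # logarithmic: ~1.33
print(growth_factor(lambda n: n))             # linear: 10.0
print(growth_factor(lambda n: n * n))         # quadratic: 100.0
```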
It represents the upper bound.
Big-O is the most useful because it represents worst-case behavior. So it guarantees that the program will terminate within a certain time; it may stop earlier, but never later.
It gives the worst-case time complexity, i.e., the maximum time required to execute the algorithm.

Relation between worst case and average case running time of an algorithm

Let's say A(n) is the average running time of an algorithm and W(n) is the worst. Is it correct to say that
A(n) = O(W(n))
is always true?
The Big O notation is kind of tricky, since it only defines an upper bound on the execution time of a given algorithm.
What this means is: if f(x) = O(g(x)), then for every other function h(x) such that g(x) < h(x) you'll also have f(x) = O(h(x)). The problem is, are those overestimated execution times useful? The clear answer is: not at all. What you usually want is the "smallest" upper bound you can get, but this is not strictly required by the definition, so you can play around with it.
You can get a stricter bound using the other notations, such as Big Theta, as you can read here.
So the answer to your question is yes, A(n) = O(W(n)), but that doesn't give any useful information about the algorithm.
If, as you mention, A(n) and W(n) are functions, then yes, you can make such a statement in the general case; it follows from the formal definition of big-O.
Note that in terms of big-O there's no point in stating it this way, since it makes the real complexity harder to understand. (In general, the three cases, worst, average, and best, exist precisely to show the complexity more clearly.)
Yes, it is not a mistake to say so.
People use asymptotic notation to convey the growth of running time in specific cases in terms of input size. Comparing the average-case complexity with the worst-case complexity doesn't provide much insight into the function's growth in either case.
Whilst it is not wrong, it fails to provide more information than what we already know.
I'm unsure of exactly what you're trying to ask, but bear in mind the below.
The typical algorithm used to show the difference between average and worst case running time complexities is Quick Sort with poorly chosen pivots.
On average with a random sample of unsorted data, the runtime complexity is n log(n). However, with an already sorted set of data where pivots are taken from either the front/end of the list, the runtime complexity is n^2.
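This can be demonstrated by counting comparisons in a toy quicksort that always picks the first element as pivot (an illustrative sketch, not a production sort):

```python
import random

# Toy quicksort taking the first element as pivot, counting
# comparisons. On already-sorted input every partition is
# maximally unbalanced, giving n*(n-1)/2 comparisons; on
# shuffled input it behaves like ~n log n.
def quicksort_count(items):
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)  # one comparison per element vs pivot
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return sort(left) + [pivot] + sort(right)

    return sort(list(items)), comparisons

n = 500
_, worst = quicksort_count(range(n))  # already sorted: worst case
random.seed(0)
_, average = quicksort_count(random.sample(range(n), n))
print(worst, average)  # 124750 vs a few thousand
```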

Using worst/avg/best case for asymptotic analysis [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I understand that worst/avg/best case are used to determine an algorithm's time complexity as a function, but how is that used in asymptotic analysis? I understand that the upper/tight/lower bounds (big O, big theta, big omega) are used to compare two functions and see what the limit (growth) of one is relative to the other as n increases, but I'm having trouble seeing the difference between worst/avg/best case big O and asymptotic analysis. What exactly do we get out of putting our worst/avg/best case big O into the asymptotic analysis and measuring bounds? Would we use asymptotic analysis to specifically compare two algorithms' worst/avg/best case big O? If so, do we use function f(n) for algorithm 1 and g(n) for algorithm 2, or do we do a separate asymptotic analysis for each algorithm, where for algorithm 1 (f(n)) we try to find some c*g(n) such that c*g(n) >= f(n) or c*g(n) <= f(n), and then do the same for algorithm 2? I'm not seeing the big picture here.
Since you want the big picture, let me try to give you the same.
Asymptotic analysis is used to study how the running time grows as the size of the input increases. This growth is studied in terms of the input size. Input size, usually denoted N or M, could mean anything from the number of numbers (as in sorting), to the number of nodes (as in graphs), or even the number of bits (as in multiplication of two numbers).
When dealing with asymptotic analysis, our goal is to find out which algorithm fares better in specific cases. Realize that an algorithm can run in quite varying times even for same-sized inputs. To appreciate this, imagine you are a sorting machine. You are given a set of numbers and you need to sort them. If I give you an already sorted list of numbers, you have no work to do; you are done already. If I give you a reverse-sorted list, imagine the number of operations you need to make the list sorted. Now that you see this, realize that we need a way of knowing what case the input will be. Will it be a best case? Will I get a worst-case input? To answer this, we need some knowledge of the distribution of the input. Will the inputs all be worst cases? Or average cases? Or mostly best cases?
Knowledge of the input distribution is fairly difficult to ascertain in most cases. We are then left with two options: either we assume the average case all the time and analyze our algorithm that way, or we obtain a guarantee on the running time irrespective of the input distribution. The former is referred to as average-case analysis, and doing such an analysis requires a formal definition of what makes an average case. Sometimes this is difficult to define and requires considerable mathematical insight. All the trouble is worth it when you know that some algorithm runs much faster in the average case than its worst-case running time suggests. Several randomized algorithms stand testimony to this; in such cases, an average-case analysis reveals their practical applicability.
The latter, worst-case analysis, is used more often since it provides a nice guarantee on the running time. In practice, coming up with the worst-case scenario is often fairly intuitive. Say you are the sorting machine: the worst case is something like a reverse-sorted array. But what's the average case?
Yup, you're wondering, right? Not so intuitive.
Best-case analysis is rarely used, as one does not always get best cases. Still, one can do such an analysis and find interesting behavior.
In conclusion, when we have a problem we want to solve, we come up with algorithms. Once we have an algorithm, we need to decide whether it is of any practical use in our situation. If so, we shortlist the algorithms that can be applied and compare them based on their time and space complexity. There could be more metrics for comparison (ease of implementation, for one), but these two are fundamental. Depending on the situation at hand, you would employ worst-case, average-case, or best-case analysis. For example, if you rarely have worst-case scenarios, it makes much more sense to carry out average-case analysis. However, if the performance of your code is of a critical nature and you need to produce the output within a strict time limit, it is much more prudent to look at worst-case analysis. Thus, the analysis you make depends on the situation at hand, and with time, the intuition for which analysis to apply becomes second nature.
Please ask if you have more questions.
To know more about big-oh and the other notations read my answers here and here.
The Wikipedia article on quicksort provides a good example of how asymptotic analysis is used on best/average/worst case: it's got a worst case of O(n^2), an average case of O(n log n), and a best case of O(n log n), and if you're comparing this to another algorithm (say, heapsort) you would compare apples to apples, e.g. you'd compare quicksort's worst-case big-theta to heapsort's worst-case big-theta, or quicksort's space big-oh to heapsort's space big-oh.
You can also compare big-theta to big-oh if you're interested in upper bounds, or big-theta to big-omega if you're interested in lower bounds.
Big-omega is usually only of theoretical interest - you're far more likely to see analysis in terms of big-oh or big-theta.
What exactly do we get out of imputing our worst/avg/best case big O into the asymptotic analysis and measuring bounds?
It gives you an idea when you are comparing different approaches to a problem; it helps you decide between them.
Would we use asymptotic analysis to specifically compare two algorithms of worst/avg/best case big O?
Generally the worst case gets the most focus, which is why Big-O is used more than Big Omega and Big Theta.
Yes, we use function f(n) for algorithm 1 and g(n) for algorithm 2, and these functions are the Big-O of their respective algorithms.

Estimation of program execution time from complexity

I want to know how I can estimate the time my program will take to execute on my machine (for example, a 2.5 GHz machine), given an estimate of its worst-case time complexity.
For example: if I have a program which is O(n^2) in the worst case, and n < 100,000, how can I know/estimate, before writing the actual program/procedure, the time it will take to execute in seconds?
Wouldn't it be good to know how a program actually performs? It would also save writing code which eventually turns out to be inefficient!
Help greatly appreciated.
Since big-O complexity ignores constant coefficients and lower-order terms, it is impossible to estimate the performance of an algorithm given only its big-O complexity.
In fact, for any specific N, you cannot predict which of two given algorithms will execute faster.
For example, O(N) is not always faster than O(N*N): an algorithm that takes 100,000,000*N steps is O(N), yet for many small values of N it is slower than an algorithm that takes N*N steps.
These coefficients and asymptotically smaller terms vary from platform to platform and even among algorithms of the same equivalence class (in terms of the big-O measure).
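A sketch with two hypothetical step-count functions makes the point concrete:

```python
# Two hypothetical step-count functions: f is O(N) with a huge
# constant, g is O(N*N) with constant 1.
def f(n):
    return 100_000_000 * n   # linear, enormous coefficient

def g(n):
    return n * n             # quadratic, coefficient 1

print(g(1_000) < f(1_000))   # True: quadratic is cheaper for small N
print(g(10**9) < f(10**9))   # False: the asymptotics win eventually
# the crossover sits at N = 100_000_000
```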
The problem you are trying to use big O notation for is not the one it is designed to solve.
Instead of dealing with complexity, you might want to have a look at Worst Case Execution Time (WCET). This area of research most likely corresponds to what you are looking for.
http://en.wikipedia.org/wiki/Worst-case_execution_time
Multiply N^2 by the time you spend in an iteration of the innermost loop, and you have a ballpark estimate.
