how to do big-O analysis when 2 algorithms are involved - big-o

I'm confused about how to do big-O analysis for the following problem -
find an element from an array of integers. ( an example problem)
my solution
sort the array using bubble sort ( n^2 )
binary search on the array for a given element (logn)
Now, is the big-O for this n^2 or n^2 + logn? Should we only consider the higher term?

Big-O for a problem is that of the best algorithm that exists for a problem. That for an algorithm made of two steps (like yours) is indeed the highest of the two, because e.g.
O(n^2) == O(n^2 + log n)
However, you can't say that O(n^2) is the correct O for your sample problem without proving that no better algorithm exists (which is of course not the case in the example;-).

Only the higher order term. The complexity is always the complexity of the highest term.

The way you did it, it would be O(n^2), since for large n, n^2 >>> logn

To put the analysis, well, more-practically (if you prefer, crudely) than Alex did, the added log n doesn't have an appreciable effect on the outcome. Consider analyzing this in a real-world system with one million inputs, each of which takes one millisecond to sort, and one millisecond to search (it's a highly-hypothetical example). Given O(n^2), the sort takes over thirty years. The search takes an additional 0.014 seconds. Which part do you care about improving? :)
Now, you'll see algorithms which clock in at O(n^2 x logn). The effect of multiplying n^2 by log n makes log n significant - in our example, it sees our thirty years and raises us four centuries.
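The arithmetic in that hypothetical example can be checked with a few lines of Python. This is only a sketch of the made-up scenario: it assumes one millisecond per unit of work and uses the natural logarithm, which is what reproduces the 0.014-second figure. The variable names are invented for illustration.

```python
import math

n = 1_000_000                          # one million inputs
seconds_per_step = 1e-3                # assumed cost: one millisecond per step
seconds_per_year = 365.25 * 24 * 3600

sort_seconds = n**2 * seconds_per_step                     # O(n^2) sort
search_seconds = math.log(n) * seconds_per_step            # O(log n) search (natural log)
combined_seconds = n**2 * math.log(n) * seconds_per_step   # O(n^2 * log n)

sort_years = sort_seconds / seconds_per_year
combined_years = combined_seconds / seconds_per_year

print(f"sort:        {sort_years:.1f} years")
print(f"search:      {search_seconds:.3f} seconds")
print(f"n^2 * log n: {combined_years:.0f} years")
```

Adding the search time to the sort time changes nothing you would notice, while multiplying by log n turns decades into centuries.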

Related

How to calculate "n" for different notations Big O, Omega, Little o, Little omega and Theta Notation

I am studying algorithms, but calculating time complexity does not come easily to me. It is hard to remember when to use log n, n log n, n^2, n^3, 2n, etc. My doubt is about how to arrive at these functions when computing the complexity. Is there any specific way to calculate the complexity, like "a for loop always takes this much complexity" and so on?
Log(n): when a divide-and-conquer recursion halves the problem at each step, the recursion tree has depth log(n).
I mean, in divide and conquer, when you divide the problem into 2 halves you are actually generating a recursion tree.
Its depth is Log(n). Why? Because it is a binary tree in nature, and a balanced binary tree over n elements has about Log(Base2)(n) levels.
Try it yourself: suppose n=4 (elements), so log(base2)(4)=2: you can halve it twice before reaching single elements.
nLog(n): remember, Log(n) was the number of halvings needed to get down to single elements. After that you start merging sorted elements, which takes linear time at each level.
In other words, merging costs "n" per level and there are Log(n) levels, so the total complexity is n (merging per level) x Log(n) (levels), which is nLog(n).
n^2:
when you see a problem solved with two nested loops that each run n times, the complexity is n^2.
e.g. matrices/2-D arrays are processed in 2 loops, one loop inside the outer loop.
n^3: oh, 3-D arrays; this is for 3 nested loops, a loop inside a loop inside a loop.
2n: thanks, you did not forget to write the "2" with this "n", otherwise I would have forgotten to explain this.
The "2" here with "n" is a constant, so just ignore it. Why? Because if you travel to another city by air, you count only the hours taken by the flight, not the time spent reaching the airport. I mean, this is minor, so we drop constants.
And for "n" just remember the word "Linear", i.e. Big-O(n) is linear complexity. Sadly, no comparison-based algorithm sorts elements in linear time, i.e. in just one loop (a single array traversal); the best such sorts take nLog(n).
Things To Remember:
Linear Time: Complexity Big-O(n).
Polynomial Time: Complexity Big-O(n^k) for some constant k, e.g. n^2, n^3, n^4, n^5. Note that log(n), n and nlog(n) are bounded by polynomials too.
Exponential Time or worse: 2^n, n^n, i.e. n appears in the exponent (these are bad bad bad, impractical for all but small n).
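To make the nLog(n) case concrete, here is a small merge sort sketch that counts how many elements are written during merging. The `counter` argument is a hypothetical helper added only for this illustration. For n a power of two, the count comes out to exactly n * log2(n): n elements merged at each of the log2(n) levels.

```python
import math

def merge_sort(items, counter):
    """Return a sorted copy of items; counter[0] accumulates elements written by merges."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])

    counter[0] += len(merged)  # each merge writes every element once: linear work
    return merged

for n in (4, 1024, 65536):
    counter = [0]
    merge_sort(list(range(n, 0, -1)), counter)
    print(n, counter[0], n * int(math.log2(n)))
```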
There are many links on how to roughly calculate Big O and its sibling's complexity, but there is no true formula.
However, there are guidelines to help you calculate complexity, such as those presented below. I suggest reviewing as many different programs and data structures as you can to familiarize yourself with the pattern, and just study, study, study until you get it! There is a pattern and you will see it the more you study it.
Source: http://www.dreamincode.net/forums/topic/125427-determining-big-o-notation/
Nested loops are multiplied together.
Sequential loops are added.
Only the largest term is kept, all others are dropped.
Constants are dropped.
Conditional checks are constant (i.e. 1).
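The guidelines above can be seen by simply counting steps. The following is a minimal sketch with made-up function names; each function returns how many times its innermost statement runs.

```python
def sequential_loops(n):
    steps = 0
    for _ in range(n):        # first loop: n steps
        steps += 1
    for _ in range(n):        # second loop runs after the first: n more steps
        steps += 1
    return steps              # n + n = 2n, which is O(n) once the constant is dropped

def nested_loops(n):
    steps = 0
    for _ in range(n):
        for _ in range(n):    # inner loop runs n times per outer iteration
            steps += 1
    return steps              # n * n, which is O(n^2)

def mixed(n):
    # Nested part followed by sequential part: n^2 + 2n, still O(n^2)
    return nested_loops(n) + sequential_loops(n)

for n in (10, 100, 1000):
    print(n, sequential_loops(n), nested_loops(n), mixed(n))
```

As n grows, the 2n contribution in `mixed` becomes a vanishing fraction of the total, which is why only the largest term is kept.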

Which is bigger: O(n*logn) or O(1)?

We are going over the master theorem in my algorithms class, and for one problem, I'm trying to compare nlogn vs 1 to figure out which case of the MT it falls under. But I'm having a hard time figuring out which is bigger.
Edit: This is for solving a recurrence problem. The equation is T(n) = 2T(n/4) + N*LogN. Just threw this in in case it helps.
Think about it this way:
O(N*LogN) will increase with N in such a way that for any X, no matter how large, you can find a value of N such that N*LogN is greater than X.
O(1) will stay the same, no matter what N is.
This means that O(1) is asymptotically better, i.e. for some (perhaps very high) value of N the O(N*LogN) will become slower.
If an algorithm is O(NlogN) that means that there exists a number A and a quantity of execution time B, such that for any input size N greater than A, the execution time will be less than B times NlogN.
If an algorithm is O(1), that would mean that there exists some fixed amount of time C in which the algorithm would be guaranteed to complete regardless of the input size.
In comparing two algorithms, one of which is O(NlgN) and one of which is O(1), one will generally discover that the O(1) algorithm is faster for values of N that are sufficiently large, but in many cases the O(NlgN) algorithm may be faster for small values of N.
Indeed, while something like an O(N^3) or O(N^4) algorithm would generally seem pretty bad, it's possible that even an O(N^4) algorithm may outperform an O(1) algorithm if N is usually a small number (e.g. 1-5 or so) and never gets very big (even an occasional value of 50 could seriously dog performance).
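For the recurrence in the question, T(n) = 2T(n/4) + N*LogN, the master theorem compares N*LogN against n^(log_4 2) = n^0.5 rather than against 1. Since N*LogN grows polynomially faster than n^0.5, the third case applies and T(n) should be Theta(N*LogN). The sketch below checks this numerically. It assumes T(1) = 1 and only evaluates powers of 4, and both choices are illustrative assumptions not taken from the question.

```python
import math

def T(n):
    """Evaluate T(n) = 2*T(n/4) + n*log2(n) for n a power of 4, assuming T(1) = 1."""
    if n <= 1:
        return 1.0
    return 2 * T(n // 4) + n * math.log2(n)

ratios = []
for k in range(1, 11):
    n = 4 ** k
    ratio = T(n) / (n * math.log2(n))
    ratios.append(ratio)
    print(f"n = 4^{k:<2}  T(n) / (n log2 n) = {ratio:.4f}")
```

The ratio stays between 1 and 2 for every n tried, which is what a Theta(N*LogN) solution looks like in practice.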

Trying to understand Big-oh notation

Hi, I would really appreciate some help with Big-O notation. I have an exam on it tomorrow, and while I can state the definition of "f(x) is O(g(x))", I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to try and figure it out. The first part seems easy (I think): do you just pick a value for n, compute them all on a calculator and put them in order? This seems too easy though, so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2k?
That guy should help you.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
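The order is the same as what you get by comparing them in the limit at infinity. Take lim(a/b) as n goes to infinity: if it is a finite non-zero constant, they are of the same order; if it is infinite, a grows faster; if it is 0, b grows faster.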
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of its five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't you just say that anything times infinity is infinity? The way out is that big-O compares ratios rather than the infinities themselves: 2n/n stays at 2 however large n gets, while n^2/(500n) = n/500 grows without bound.)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2+61n+1682) = O(n^2), because the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term already dominates the 1682 term. If you see addition inside a O(), only concern yourself with the n with the highest degree.
Two, O(log10 n) = O(log_b n) for any base b, because log10(n) = log_b(n)/log_b(10). Hence, O(log10 n) = O(log_b(n) * 1/log_b(10)). That 1/log_b(10) figure is a constant, and we've already seen that constants drop out of O() notation.
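A quick numerical check of the change-of-base point, as a sketch using only the standard `math` module: the ratio between logarithms in two different bases is the same constant no matter how large n gets.

```python
import math

expected = 1 / math.log2(10)   # the constant 1/log_b(10) with b = 2

sample_sizes = (10, 1_000, 1_000_000, 10**12)
ratios = [math.log10(n) / math.log2(n) for n in sample_sizes]

for n, ratio in zip(sample_sizes, ratios):
    print(n, ratio)
```

Every printed ratio is the same value up to floating-point rounding, so switching bases only rescales by a constant.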
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n
exp(n) grows more quickly than any power of n
n! grows more quickly than exp(kn)
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
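The small-n trap is easy to reproduce. This sketch tabulates n^2 and n! for the first few n and finds the first n at which n! overtakes n^2. The search bound of 100 is an arbitrary choice for illustration.

```python
import math

for n in range(1, 9):
    square, fact = n**2, math.factorial(n)
    if square > fact:
        bigger = "n^2"
    elif fact > square:
        bigger = "n!"
    else:
        bigger = "tie"
    print(n, square, fact, bigger)

# First n at which n! is strictly larger than n^2
crossover = next(n for n in range(1, 100) if math.factorial(n) > n**2)
print("n! overtakes n^2 at n =", crossover)
```

Picking a single small n and reading off the calculator would rank the two the wrong way round.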
It's important to get that right, because for running times with multiple terms, you can drop the lesser terms -- i.e., if f(n) is 3n^2+2n+5, you can drop the 5 (constant), drop the 2n (3n^2 grows faster), then drop the 3 (constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear, log(n) grows more slowly than linear, n^a > n^b if a>b, 2^n is faster than any n^a, and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the space in half until eventually you find your item (or run out of places to look). That takes about log2(n) steps, i.e. log2(2k) for your list. A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than just a linear search.
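Here is a small sketch of a binary search that counts its own loop iterations, so the worst case can be observed directly. The function name and the returned step count are illustrative additions, not part of any library.

```python
def binary_search_steps(sorted_items, target):
    """Return (index of target or -1, number of loop iterations used)."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1       # discard the lower half
        else:
            hi = mid - 1       # discard the upper half
    return -1, steps

for n in (16, 1000, 1024, 1_000_000):
    items = list(range(n))
    # Searching for a value larger than every element forces the longest path.
    _, worst = binary_search_steps(items, n)
    print(f"n = {n}: binary search used {worst} steps, linear search could need {n}")
```

For a list of n elements the longest search takes floor(log2(n)) + 1 iterations, which is why the worst case is described as O(log n).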
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on most simply because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n) because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity, which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicative factors are omitted.
Read Big O notation and Time complexity for a more in-depth overview.
See this and look up for solutions here is first one.

Average Time Complexity of a sorting algorithm

I have a treesort function which performs two distinct tasks, each with its own time complexity. I figured out the avg. case time complexity of the two tasks but how do I find the overall complexity of the algorithm.
For example the algorithm takes in a random list of "n" keys x:
Sort(x):
    Insert(x)      # time complexity of O(nLog(n))
    Traverse(x)    # time complexity of O(n)
Do I just add the two complexities together to give me O(n + nLog(n)) or do I take the dominant task (in this case Insert) and end up with an overall complexity of O(nLog(n))
In a simple case like this,
O(n) + O(n log(n)) = O(n + n log(n))
                   = O(n (log(n) + 1))
                   = O(n log(n))
or do I take the dominant task (in this case Insert) and end up with an overall complexity of O(nLog(n))
That's right. As n grows, first element in O(n + nLog(n)) sum will become less and less significant. Thus, for sufficiently large n, its contribution can be ignored.
You need to take the dominant one.
The whole idea of measuring complexity this way is based on the assumption that you want to know what happens with large ns.
So if you have a polynomial, you can discard all but the highest order element, if you have a logarithm, you can ignore the base and so on.
In everyday practice however, these differences may start to matter, so it's sometimes good to have a more precise picture of your algorithm's complexity, even down to the level where you assign different weights to different operations.
(Returning to your original questions, assuming you're using base 2 logarithms, at n=1048576, the difference between n+n*logn and n*logn is around 5%, which is probably not really worth worrying about.)
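The 5% figure can be reproduced directly. This is a sketch assuming base-2 logarithms, as in the text; the helper name is made up for illustration.

```python
import math

def relative_cost_of_linear_term(n):
    """How much n + n*log2(n) exceeds n*log2(n), as a fraction of n*log2(n)."""
    n_log_n = n * math.log2(n)
    return ((n + n_log_n) - n_log_n) / n_log_n

relative_difference = relative_cost_of_linear_term(1_048_576)   # n = 2**20
print(f"{relative_difference:.1%}")

# The extra linear term matters less and less as n grows.
for exponent in (10, 20, 40):
    print(exponent, relative_cost_of_linear_term(2 ** exponent))
```

The fraction is simply 1/log2(n), so it keeps shrinking, although slowly.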

Big O Notation: differences between O(n^2) and O(n.log(n))?

What is the difference between O(n^2) and O(n.log(n))?
n^2 grows in complexity more quickly.
Big O calculates an upper limit of running time relative to the size of a data set (n).
An O(n*log(n)) algorithm is not always faster than an O(n^2) algorithm, but for large inputs in the worst case it usually is. An O(n^2) algorithm takes ~4 times longer when you double the working set (worst case); for an O(n*log(n)) algorithm the increase is smaller, a little more than 2 times. The bigger your data set is, the more you usually gain from using an O(n*log(n)) algorithm.
EDIT: Thanks to 'harms', I'll correct a wrong statement in my first answer: I said that when considering the worst case O(n^2) would always be slower than O(n*log(n)). That's wrong, since both are only defined up to a constant factor!
Sample: Say we have the worst case and our data set has size 100.
O(n^2) --> 100*100 = 10000
O(n*log(n)) --> 100*2 = 200 (using log_10)
The problem is that both can be multiplied by a constant factor, say we multiply c to the latter one. The result will be:
O(n^2) --> 100*100 = 10000
O(n*log(n)) --> 100*2*c = 200*c (using log_10)
So for c > 50 we get O(n*log(n)) > O(n^2), for n=100.
I have to update my statement: when considering the worst case (and assuming both bounds are tight), an O(n*log(n)) algorithm will be quicker than an O(n^2) algorithm for sufficiently big data sets.
The reason is: The choice of c is arbitrary but constant. If you increase the data set large enough it will dominate the effect of every constant choice of c and when discussing two algorithms the cs for both are constant!
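The effect of the constant can be sketched numerically. The value c = 100 below is an arbitrary, hypothetical constant chosen for illustration. With it, the n*log(n) cost is larger at n = 100, but the quadratic cost overtakes it once n is a few hundred and stays ahead from then on.

```python
import math

def quadratic_cost(n):
    return n * n

def n_log_n_cost(n, c):
    return c * n * math.log10(n)   # log_10, as in the sample above

c = 100  # hypothetical constant factor

print(quadratic_cost(100), n_log_n_cost(100, c))   # the n*log(n) cost is larger here

# First n (starting at 2, since log(1) = 0) where the quadratic cost is larger
crossover = next(n for n in range(2, 10**6) if quadratic_cost(n) > n_log_n_cost(n, c))
print("n^2 becomes the larger cost at n =", crossover)
```

A bigger c only pushes the crossover further out; it cannot prevent it.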
You'll need to be a bit more specific about what you are asking, but in this case O(n log(n)) is faster
Algorithms that run in O(nlog(n)) time are generally faster than those that run in O(n^2).
Big-O defines an upper bound on performance: it describes how the length of time it takes to perform the task grows as the size of the data set (n) grows. You might be interested in the iTunes U algorithms course from MIT.
n log(n) grows significantly slower
"Big Oh" notation gives an estimated upper bound on the growth in the running time of an algorithm. If an algorithm is supposed to be O(n^2), in a naive way, it says that, up to a constant factor, for n=1 it takes a max. time of 1 unit, for n=2 it takes a max. time of 4 units, and so on. Similarly for O(n log(n)), it says the growth will be such that it obeys the upper bound of O(n log(n)).
(If I am more than naive here, please correct me in a comment).
I hope that helps.

Resources