Is it really any different from O(2^n)??
It's like saying n where n is in the set of i/2 where i is any real number. If i is a set of real numbers, then n would be so as well and so O(2^n) is the same as O(2^(n/2)) right?
2^(n/2) = √(2^n); also, lim 2^(n/2)/2^n = 0, so these two orders of complexity are quite different. In fact, they are much more different than n vs. n².
An example of O(2^n) cost is counting the ordered partitions of n+1 (e.g. n=3 -> (1,1,1,1), (2,1,1), (1,2,1), (1,1,2), (2,2), (3,1), (1,3), (4) -> 8).
An example of O(2^(n/2)) cost is counting the ordered partitions of n+1 that are symmetric (e.g. n=3 -> (1,1,1,1), (1,2,1), (2,2), (4) -> 4).
O(2^n) and O(2^(n/2)) are similar, but not identical. O(2^n) represents an algorithm whose time complexity is directly proportional to 2^n, while O(2^(n/2)) represents an algorithm whose time complexity is directly proportional to 2^(n/2). This means O(2^n) represents a problem that will double in size with each additional input, while O(2^(n/2)) represents a problem that will increase by a factor of 2^(1/2) with each additional input.
These complexities can be quite different in terms of the actual running time of the algorithm. For example, a problem of size 8 with O(2^n) is going to take 2^8 = 256 times more computation than the same problem with size 4, while O(2^(n/2)) will take 2^(8/2) = 16 times more computation. While both complexities are exponential, the actual running time can be very different.
In general, O(2^n) is considered to be much worse than O(2^(n/2)), as it increases much faster. So, it's important to understand the difference between these complexities, and not confuse them with each other.
Related
In computer science, the iterated logarithm of n, written log* n (usually read "log star"), is the number of times the logarithm function must be iteratively applied before the result is less than or equal to 1. The simplest formal definition is the result of this recursive function:
Is there any algorithm with time complexity O(lg * n) ?
If you implement union find algorithm with path compression and union by rank, both union and find will have complexity O(log*(n)).
It's rare but not unheard of to see log* n appear in the runtime analysis of algorithms. Here are a couple of cases that tend to cause log* n to appear.
Approach 1: Shrinking By a Log Factor
Many divide-and-conquer algorithms work by converting an input of size n into an input of size n / k. The number of phases of these algorithms is then O(log n), since you can only divide by a constant O(log n) times before you shrink your input down to a constant size. In that sense, when you see "the input is divided by a constant," you should think "so it can only be divided O(log n) times before we run out of things to divide."
In rarer cases, some algorithms work by shrinking the size of the input down by a logarithmic factor. For example, one data structure for the range semigroup query problem work by breaking a larger problem down into blocks of size log n, then recursively subdividing each block of size log n into blocks of size log log n, etc. This process eventually stops once the blocks hit some small constant size, which means that it stops after O(log* n) iterations. (This particular approach can then be improved to give a data structure in which the blocks have size log* n for an overall number of rounds of O(log** n), eventually converging to an optimal structure with runtime O(α(n)), where α(n) is the inverse Ackermann function.
Approach 2: Compressing Digits of Numbers
The above section talks about approaches that explicitly break a larger problem down into smaller pieces whose sizes are logarithmic in the size of the original problem. However, there's another way to take an input of size n and to reduce it to an input of size O(log n): replace the input with something roughly comparable in size to its number of digits. Since writing out the number n requires O(log n) digits to write out, this has the effect of shrinking the size of the number down by the amount needed to get an O(log* n) term to arise.
As a simple example of this, consider an algorithm to compute the digital root of a number. This is the number you get by repeatedly adding the digits of a number up until you're down to a single digit. For example, the digital root of 78979871 can be found by computing
7 + 8 + 9 + 7 + 9 + 8 + 7 + 1 = 56
5 + 6 = 11
1 + 1 = 2
2
and getting a digital root of two. Each time we sum the digits of the number, we replace the number n with a number that's at most 9 ⌈log10 n⌉, so the number of rounds is O(log* n). (That being said, the total runtime is O(log n), since we have to factor in the work associated with adding up the digits of the number, and adding the digits of the original number dominates the runtime.)
For a more elaborate example, there is a parallel algorithm for 3-coloring the nodes of a tree described in the paper "Parallel Symmetry-Breaking in Sparse Graphs" by Goldberg et al. The algorithm works by repeatedly replacing numbers with simpler numbers formed by summing up certain bits of the numbers, and the number of rounds needed, like the approach mentioned above, is O(log* n).
Hope this helps!
Consider this example of binary search tree.
n =10 ;and if base = 2 then
log n = log2(10) = 3.321928.
I am assuming it means 3.321 steps(accesses) will be required at max to search for an element. Also I assume that BST is balanced binary tree.
Now to access node with value 25. I have to go following nodes:
50
40
30
25
So I had to access 4 nodes. And 3.321 is nearly equal to 4.
Is this understanding is right or erroneous?
I'd call your understanding not quite correct.
The big-O notation does not say anything about an exact amount of steps done. A notation O(log n) means that something is approximately proportional to log n, but not neccesarily equal.
If you say that the number of steps to search for a value in a BST is O(log n), this means that it is approximately C*log n for some constant C not depending on n, but it says nothing about the value of C. So for n=10 this never says that the number of steps is 4 or whatever. It can be 1, or it can be 1000000, depending on what C is.
What does this notation say is that if you consider two examples with different and big enough sizes, say n1 and n2, then ratio of the number of steps in these two examples will be approximately log(n1)/log(n2).
So if for n=10 it took you, say, 4 steps, then for n=100 it should take approximately two times more, that is, 8 steps, because log(100)/log(10)=2, and for n=10000 it should take 16 steps.
And if for n=10 it took you 1000000 steps, then for n=100 it should take 2000000, and for n=10000 — 4000000.
This is all for "large enough" n — for small ns the number of steps can deviate from this proportionality. For most practical algorithms the "large enough" usually starts from 5-10, if not even 1, but from a strict point of view the big-O notation does not set any requirement on when the proportionality should start.
Also in fact O(log n) notation does not require that number of steps growths proportionally to log n, but requires that the number of steps growths no faster than proportionally to log n, that is the ratio of the numbers of steps should not be log(n1)/log(n2), but <=log(n1)/log(n2).
Note also another situation that can make the background for O-notation more clear. Consider not the number of steps, but the time spent for search in a BST. You clearly can not predict this time because it depends on the machine you are running on, on a particular implementation of the algorithm, after all on the units you use for time (seconds or nanoseconds, etc.). So the time can be 0.0001 or 100000 or whatever. However, all these effects (speed of your machine, etc) routhly changes all the measurement results by some constant factor. Therefore you can say that the time is O(log n), just in different cases the C constant will be different.
Your thinking is not correct totally. The steps/accesses which are considered are for comparisons. But, O(log n) is just a mere parameter to measure asymptotic complexity, and not the exact steps calculation. As exactly answered by Petr, you should go through the points mentioned in his answer.
Also, BST is binary search tree,also sometimes called ordered or sorted binary trees.
Exact running time/comparisons can't be derived from Asymptotic complexity measurement. For that, you'll have to return to the exact derivation of searching an element in a BST.
Assume that we have a “balanced” tree with n nodes. If the maximum number of comparisons to find an entry is (k+1), where k is the height, we have
2^(k+1) - 1 = n
from which we obtain
k = log2(n+1) – 1 = O(log2n).
As you can see, there are other constant factors which are removed while measuring asymptotic complexity in worst case analysis. So, the complexity of the comparisons gets reduced to O(log2n).
Next, demonstration of how element is searched in a BST based on how comparison is done :-
1. Selecting 50,if root element //compare and move below for left-child or right-child
2. Movement downwards from 50 to 40, the leftmost child
3. Movement downwards from 40 to 30, the leftmost child
4. Movement downwards from 30 to 25, found and hence no movement further.
// Had there been more elements to traverse deeper, it would have been counted the 5th step.
Hence, it searched the item 25 after 3 iterative down-traversal. So, there is 4 comparisons and 3 downward-traversals(because height is 3).
Usually you say something like this :-
Given a balanced binary search tree with n elements, you need O(log n) operations to search.
or
Search in a balanced binary search tree of n elements is in O(log n).
I like the second phrase more, because it emphasizes that O is a function returning a set of functions given x (short: O(x)). x: N → N is a function. The input of x is the size of the input of a function and the output of x can be interpreted as the number of operations you need.
An function g is in O(x) when g is lower than x multiplied by an arbitrary non-negative constant from some starting point n_0 for all following n.
In computer science, g is often set equal with an algorithm which is wrong. It might be the number of operations of an algorithm, given the input size. Note that this is something different.
More formally:
So, regarding your question: You have to define what n is (conceptually, not as a number). In your example, it is eventually the number of nodes or the number of nodes on the longest path to a leaf.
Usually, when you use Big-O notation, you are not interested for a "average" case (and especially not for some given case) but you want to say something about the worst case.
Do the higher order of growth functions take the longest time? So, that O(n2) takes less time than (2n)?
Are the highest order of growth functions take the longest to actually run when N is a large number?
The idea of Big O Notation is to express the worst case scenario of algorithm complexity. For instance, a function using a loop may be described as O(n) even if it contains several O(1) statements, since it may have to run the entire loop over n items.
n is the number of inputs to the algorithm you are measuring. Big-O in terms of n tells you how that algorithm will perform as the number of inputs gets increasingly large. This may mean that it will take more time to execute, or that something will take more space to store. If you notice your algorithm has a high Big-O (like O(2^n) or O(n!)), you should consider an alternate implementation that scales better (assuming n will ever get large -- if n is always small it doesn't matter). The key take-away here is that Big-O is useful for showing you which of two algorithms will scale better, or possibly just the input size for which one algorithm would become a serious bottleneck on performance.
Here is an example image comparing several polynomials which might give you an idea of their growth rates in terms of Big-O. The growth time is as n approaches infinity, which in graphical terms is how sharply the function curves upwards along the y-axis as x grows large.
In case it is not clear, the x-axis here is your n, and the y-axis the time taken. You can see from this how much more quickly, for instance, something O(n^2) would eat up time (or space, or whatever) than something O(n). If you graph more of them and zoom out, you will see the incredible difference in, say, O(2^n) and O(n^3).
Attempting a Concrete Example
Using your example of comparing two string arrays of size 20, let's say we do this (pseudocode since this is language agnostic):
for each needle in string_array_1:
for each haystack in string_array_2:
if needle == haystack:
print 'Found!'
break
This is O(n^2). In the worst case scenario, it has to run completely through the second loop (in case no match is found), which happens on every iteration of the first loop. For two arrays of size 20, this is 400 total iterations. If each array was incrased by just one string to size 21, the total number of iterations in the worst case grows to 441! Obviously, this could get out of hand quickly. What if we had arrays with 1000 or more members? Note that it's not really correct to think of n as being 20 here, because the arrays could be of different sizes. n is an abstraction to help you see how bad things could get under more and more load. Even if string_array_1 was size 20 and string_array_2 was size 10 (or 30, or 5!), this is still O(n^2).
O-time is only relevant when compared against itself, but 2^n will grow faster than n^2.
Compare as n grows:
N n^2 2^n
1 1 2
2 4 4
3 9 8
4 16 16
5 25 32
6 36 64
...
10 100 1024
20 400 1048576
Relevant Link: Wolfram Alpha
You can think in reverse: let an algorithm be such that T(n) = t; how many more elements can I process in time 2.t ?
O(n^2) -> 41% elements more ((n + 0.41 n)^2 = 2.n^2)
O(2^n) -> a single element more (2^(n+1) = 2.2^n)
Or in time 1000000.t ?
O(n^2) -> 1000 times more
O(2^n) -> 20 elements more
We are going over the master theorem in my algorithms class, and for one problem, I'm trying to compare nlogn vs 1 to figure out which case of the MT it falls under. But I'm having a hard timing figuring out which is bigger.
Edit: This is for solving a recurrence problem. The equation is T(n) = 2T(n/4) + N*LogN. Just threw this in incase it helps.
Think about it this way:
O(N*LogN) will increase with N in such a way that for any X, no matter how large, you can find a value of N such that N*LogN is greater than X.
O(1) will stay the same, no matter what N is.
This means that O(1) is asymptotically better, i.e. for some (perhaps very high) value of N the O(N*LogN) will become slower.
If an algorithm is O(NlogN) that means that there exists a number A and a quantity of execution time B, such that for any input size N greater than A, the execution time will be less than B times NlogN.
If an algorithm is O(1), that would mean that there exists some fixed amount of time C in which the algorithm would be guaranteed to complete regardless of the input size.
In comparing two algorithms, one of which is O(NlgN) and one of which is O(1), one will generally discover that the O(1) algorithm is faster for values of N that are sufficiently large, but in many cases the O(NlgN) algorithm may be faster for small values of N.
Indeed, while something like an O(N^3) or O(N^4) algorithm would generally seem pretty bad, it's possible that even an O(N^4) algorithm may outperform an O(1) algorithm if N is usually a small number (e.g. 1-5 or so) and never gets very big (even an occasional value of 50 could seriously dog performance).
Hi I would really appreciate some help with Big-O notation. I have an exam in it tomorrow and while I can define what f(x) is O(g(x)) is, I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to try and figure it out, the first part seems easy (I think) Do you just pick a value for n, compute them all on a claculator and put them in order? This seems to easy though so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the
correct order of the complexities
O(n2), O(log2 n), O(1), O(2n), O(n!),
O(n log2 n)?
What is the
worst-case computational-complexity of
the Binary Search algorithm on an
ordered list of length n = 2k?
That guy should help you.
From lowest to highest, what is the
correct order of the complexities
O(n2), O(log2 n), O(1), O(2n), O(n!),
O(n log2 n)?
The order is same as if you compare their limit at infinity. like lim(a/b), if it is 1, then they are same, inf. or 0 means one of them is faster.
What is the worst-case
computational-complexity of the Binary
Search algorithm on an ordered list of
length n = 2k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of it's five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't you just say that anything times infinity is infinity? That doesn't actually work for infinity times infinity. I think. It just doesn't. Can a mathematician back me up on this?)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2+61n+1682) = O(n^2), because the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term already dominates the 1682 term. If you see addition inside a O(), only concern yourself with the n with the highest degree.
Two, O(log10n) = O(log(any number)n), because for any base b, log10(x) = log_b(*x*)/log_b(10). Hence, O(log10n) = O(log_b(x) * 1/(log_b(10)). That 1/log_b(10) figure is a constant, which we've already shown drop out of O(n) notation.
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n
exp(n) grows more quickly than any power of n
n! grows more quickly than exp(kn)
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
It's important to get that right, because for running times with multiple terms, you can drop the lesser terms -- ie, if O(f(n)) is 3n^2+2n+5, you can drop the 5 (constant), drop the 2n (3n^2 grows faster), then drop the 3 (constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear, log(n) grows more slowly than linear, n^a > n^b if a>b, 2^n is faster than any n^a, and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the space in half until eventually you find your item (or run out of places to look). That is log2(2k). A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than just a linear search.
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on most simply because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n) because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicatives are omitted.
Read Big O notation and Time complexity for more a more in depth overview.
See this and look up for solutions here is first one.