What does O(log n) mean exactly?

I am learning about Big O notation running times and amortized times. I understand the notion of O(n) linear time, meaning that the running time grows in proportion to the size of the input; the same goes for, for example, quadratic time O(n^2), and even for algorithms, such as permutation generators, with O(n!) times, which grow by factorials.
For example, the following function is O(n) because the algorithm grows in proportion to its input n:
void f(int n) {
    int i;
    for (i = 0; i < n; ++i)
        printf("%d", i);
}
Similarly, if there was a nested loop, the time would be O(n^2).
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I do know (maybe not in great detail) what a logarithm is, in the sense that log10 100 = 2, but I cannot understand how to identify a function with a logarithmic time.

I cannot understand how to identify a function with a log time.
The most common attributes of a logarithmic running-time function are that:
the choice of the next element on which to perform some action is one of several possibilities, and
only one will need to be chosen.
or
the elements on which the action is performed are digits of n
This is why, for example, looking up people in a phone book is O(log n). You don't need to check every person in the phone book to find the right one; instead, you can simply divide and conquer by looking based on where their name falls alphabetically, and you only need to explore a small part of each section before you eventually find someone's phone number.
Of course, a bigger phone book will still take you longer, but the time won't grow in proportion to the increase in size.
We can expand the phone book example to compare other kinds of operations and their running time. We will assume our phone book has businesses (the "Yellow Pages") which have unique names and people (the "White Pages") which may not have unique names. A phone number is assigned to at most one person or business. We will also assume that it takes constant time to flip to a specific page.
Here are the running times of some operations we might perform on the phone book, from fastest to slowest:
O(1) (in the worst case): Given the page that a business's name is on and the business name, find the phone number.
O(1) (in the average case): Given the page that a person's name is on and their name, find the phone number.
O(log n): Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies. (This is a binary search for a person's name; see the code sketch just after this list.)
O(n): Find all people whose phone numbers contain the digit "5".
O(n): Given a phone number, find the person or business with that number.
O(n log n): There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.
For the below examples, we're now at the printer's office. Phone books are waiting to be mailed to each resident or business, and there's a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.
O(n log n): We want to personalize the phone book, so we're going to find each person or business's name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage.
O(n^2): A mistake occurred at the office, and every entry in each of the phone books has an extra "0" at the end of the phone number. Take some white-out and remove each zero.
O(n · n!): We're ready to load the phonebooks onto the shipping dock. Unfortunately, the robot that was supposed to load the books has gone haywire: it's putting the books onto the truck in a random order! Even worse, it loads all the books onto the truck, then checks to see if they're in the right order, and if not, it unloads them and starts over. (This is the dreaded bogo sort.)
O(n^n): You fix the robot so that it's loading things correctly. The next day, one of your co-workers plays a prank on you and wires the loading dock robot to the automated printing systems. Every time the robot goes to load an original book, the factory printer makes a duplicate run of all the phonebooks! Fortunately, the robot's bug-detection systems are sophisticated enough that the robot doesn't try printing even more copies when it encounters a duplicate book for loading, but it still has to load every original and duplicate book that's been printed.
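To make the O(log n) lookup concrete, here is a minimal C sketch of the binary search described in the list above (the array of names and the find_name helper are my own illustration, not part of the original answer). Each comparison discards half of the remaining entries, so the number of comparisons grows with log2(n), not with n:

#include <stdio.h>
#include <string.h>

/* Binary search over a sorted array of names: each comparison
 * discards half of the remaining entries, so at most about
 * log2(n) + 1 comparisons are needed. */
int find_name(const char *names[], int n, const char *target) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(target, names[mid]);
        if (cmp == 0)
            return mid;        /* found the entry */
        else if (cmp < 0)
            hi = mid - 1;      /* search the first half */
        else
            lo = mid + 1;      /* search the second half */
    }
    return -1;                 /* not in the book */
}

int main(void) {
    const char *book[] = {"Adams", "Baker", "Cole", "Davis",
                          "Evans", "Ford", "Gray", "Hill"};
    int idx = find_name(book, 8, "Ford");
    printf("Ford found at index %d\n", idx);  /* a handful of comparisons, even for a much larger book */
    return 0;
}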

O(log N) basically means time goes up linearly while n goes up exponentially. So if it takes 1 second to compute 10 elements, it will take 2 seconds to compute 100 elements, 3 seconds to compute 1000 elements, and so on.
It is O(log n) when we do divide-and-conquer types of algorithms, e.g., binary search. Another example is quicksort, where each time we divide the array into two parts, and each level takes O(n) time to partition around a pivot element. Hence the total is n * O(log n) = O(n log n).

Many good answers have already been posted to this question, but I believe we really are missing an important one - namely, the illustrated answer.
What does it mean to say that the height of a complete binary tree is O(log n)?
The following drawing depicts a binary tree. Notice how each level contains double the number of nodes compared to the level above (hence binary):
Binary search is an example with complexity O(log n). Let's say that the nodes in the bottom level of the tree in figure 1 represent items in some sorted collection. Binary search is a divide-and-conquer algorithm, and the drawing shows how we will need (at most) 4 comparisons to find the record we are searching for in this 16-item dataset.
Assume we had instead a dataset with 32 elements. Continuing the drawing above, you will find that we now need 5 comparisons to find what we are searching for, as the tree has grown only one level deeper when we doubled the amount of data. As a result, the complexity of the algorithm can be described as logarithmic.
Plotting log(n) on a plain piece of paper will result in a graph where the rise of the curve decelerates as n increases.
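To see the "one more level per doubling" rule numerically, here is a small C sketch of my own (not from the original answer) that counts how many times a dataset of size n can be halved; the counts match the comparison counts described above:

#include <stdio.h>

/* Number of times n can be halved before reaching 1: the height
 * of the tree in the answer, i.e. log2(n) for powers of two. */
int levels(int n) {
    int c = 0;
    while (n > 1) {
        n /= 2;
        c++;
    }
    return c;
}

int main(void) {
    int sizes[] = {16, 32, 64, 1024};
    for (int i = 0; i < 4; i++)
        printf("n = %4d -> %d comparisons\n", sizes[i], levels(sizes[i]));
    return 0;   /* prints 4, 5, 6, and 10 comparisons respectively */
}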

Overview
Others have given good diagram examples, such as the tree diagrams. I did not see any simple code examples. So in addition to my explanation, I'll provide some algorithms with simple print statements to illustrate the complexity of different algorithm categories.
First, you'll want to have a general idea of logarithms, which you can get from https://en.wikipedia.org/wiki/Logarithm . The natural sciences use e and the natural log. Engineering disciplines use log_10 (log base 10), and computer scientists use log_2 (log base 2) a lot, since computers are binary-based. Sometimes you'll see the natural log abbreviated as ln(); engineers normally leave the _10 off and just use log(), and log_2 is abbreviated as lg(). All of these logarithms grow in a similar fashion, which is why they share the same category, log(n).
When you look at the code examples below, I recommend looking at O(1), then O(n), then O(n^2). After you are good with those, then look at the others. I've included clean examples as well as variations to demonstrate how subtle changes can still result in the same categorization.
You can think of O(1), O(n), O(log n), etc. as classes or categories of growth. Some categories take more time than others, and these categories give us a way of ordering algorithm performance: some grow faster than others as the input n grows. The following table demonstrates said growth numerically; think of log(n) as the ceiling of log_2(n):

n       log(n)   n*log(n)   n^2        n^3
1       0        0          1          1
8       3        24         64         512
64      6        384        4096       262144
1024    10       10240      1048576    1073741824
Simple Code Examples Of Various Big O Categories:
O(1) - Constant Time Examples:
Algorithm 1:
Algorithm 1 prints hello once, and it doesn't depend on n, so it will always run in constant time; it is O(1).
print "hello";
Algorithm 2:
Algorithm 2 prints hello 3 times, but it does not depend on the input size. Even as n grows, this algorithm will always print hello only 3 times. Since 3 is a constant, this algorithm is also O(1).
print "hello";
print "hello";
print "hello";
O(log(n)) - Logarithmic Examples:
Algorithm 3 - This acts like "log_2"
Algorithm 3 demonstrates an algorithm that runs in log_2(n). Notice that the post-operation of the for loop multiplies the current value of i by 2, so i goes from 1 to 2 to 4 to 8 to 16 to 32 ...
for(int i = 1; i <= n; i = i * 2)
print "hello";
Algorithm 4 - This acts like "log_3"
Algorithm 4 demonstrates log_3. Notice i goes from 1 to 3 to 9 to 27...
for(int i = 1; i <= n; i = i * 3)
print "hello";
Algorithm 5 - This acts like "log_1.02"
Algorithm 5 is important because it helps show that as long as the multiplier is greater than 1 and i is repeatedly multiplied by it, you are looking at a logarithmic algorithm.
for(double i = 1; i < n; i = i * 1.02)
print "hello";
O(n) - Linear Time Examples:
Algorithm 6
This algorithm is simple: it prints hello n times.
for(int i = 0; i < n; i++)
print "hello";
Algorithm 7
This algorithm shows a variation where it prints hello n/2 times. n/2 = 1/2 * n; we ignore the 1/2 constant and see that this algorithm is O(n).
for(int i = 0; i < n; i = i + 2)
print "hello";
O(n*log(n)) - nlog(n) Examples:
Algorithm 8
Think of this as a combination of O(log(n)) and O(n). The nesting of the for loops helps us obtain the O(n*log(n)).
for(int i = 0; i < n; i++)
for(int j = 1; j < n; j = j * 2)
print "hello";
Algorithm 9
Algorithm 9 is like algorithm 8, but each of the loops has a variation, which still results in the final answer being O(n*log(n)).
for(int i = 0; i < n; i = i + 2)
for(int j = 1; j < n; j = j * 3)
print "hello";
O(n^2) - n squared Examples:
Algorithm 10
O(n^2) is obtained easily by nesting standard for loops.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
print "hello";
Algorithm 11
Like algorithm 10, but with some variations.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j = j + 2)
print "hello";
O(n^3) - n cubed Examples:
Algorithm 12
This is like algorithm 10, but with 3 loops instead of 2.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
for(int k = 0; k < n; k++)
print "hello";
Algorithm 13
Like algorithm 12, but with some variations that still yield O(n^3).
for(int i = 0; i < n; i++)
for(int j = 0; j < n + 5; j = j + 2)
for(int k = 0; k < n; k = k + 3)
print "hello";
Summary
The above gives several straightforward examples, plus variations, to demonstrate the subtle changes that can be introduced while leaving the analysis unchanged. Hopefully it gives you enough insight.

The explanation below uses the case of a fully balanced binary tree to help you understand how we get logarithmic time complexity.
A binary tree is a case where a problem of size n is divided into sub-problems of size n/2 until we reach a problem of size 1.
And that's how you get O(log n): it is the amount of work that needs to be done on the above tree to reach a solution.
A common algorithm with O(log n) time complexity is binary search, whose recurrence relation is T(n) = T(n/2) + O(1), i.e., at every subsequent level of the tree you divide the problem in half and do a constant amount of additional work.
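For completeness, unrolling that recurrence shows where the log comes from (assuming n is a power of 2 and c is the constant work per level):

T(n) = T(n/2) + c
     = T(n/4) + 2c
     = T(n/8) + 3c
     ...
     = T(1) + c * log2(n)
     = O(log n)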

If you had a function that takes:
1 millisecond to complete if you have 2 elements.
2 milliseconds to complete if you have 4 elements.
3 milliseconds to complete if you have 8 elements.
4 milliseconds to complete if you have 16 elements.
...
n milliseconds to complete if you have 2^n elements.
Then it takes time logarithmic in the number of elements: with 2^n elements, about n steps, i.e., log2 of the element count. Big O notation, loosely speaking, means that the relationship only needs to be true for large n, and that constant factors and smaller terms can be ignored.

The logarithm
OK, let's try to fully understand what a logarithm actually is.
Imagine we have a rope and we have tied it to a horse. If the rope is tied directly to the horse, the force the horse would need to pull away (say, from a man) is, directly, 1.
Now imagine the rope is looped round a pole. The horse, to get away, will now have to pull many times harder. The exact factor will depend on the roughness of the rope and the size of the pole, but let's assume it multiplies one's strength by 10 (when the rope makes a complete turn).
Now if the rope is looped once, the horse will need to pull 10 times harder. If the human decides to make it really difficult for the horse, he may loop the rope again round the pole, increasing its strength by an additional 10 times. A third loop will again increase the strength by a further 10 times.
We can see that for each loop, the factor increases by 10. The number of turns required to reach any number is called the logarithm of that number, i.e., we need 3 loops to multiply your strength by 1,000, and 6 loops to multiply your strength by 1,000,000.
3 is the logarithm of 1,000, and 6 is the logarithm of 1,000,000 (base 10).
So what does O(log n) actually mean?
In our example above, our 'growth rate' is O(log n). For every additional loop, the force our rope can handle is 10 times more:
Turns | Max Force
0 | 1
1 | 10
2 | 100
3 | 1000
4 | 10000
n | 10^n
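The "number of turns" idea translates directly into code: counting how many times you must divide a force by 10 to get back down to 1 computes floor(log10(force)). A minimal C sketch of my own, following the answer's base-10 example:

#include <stdio.h>

/* How many turns of the rope (multiplications by 10) are needed
 * to reach a given force? Counting divisions by 10 back down to 1
 * computes floor(log10(force)). */
int turns_needed(long force) {
    int turns = 0;
    while (force >= 10) {
        force /= 10;
        turns++;
    }
    return turns;
}

int main(void) {
    printf("%d\n", turns_needed(1000));     /* 3 */
    printf("%d\n", turns_needed(1000000));  /* 6 */
    return 0;
}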
Now, the example above used base 10, but fortunately the base of the log is insignificant when we talk about big O notation.
Now let's imagine you are trying to guess a number between 1-100.
Your Friend: Guess my number between 1-100!
Your Guess: 50
Your Friend: Lower!
Your Guess: 25
Your Friend: Lower!
Your Guess: 13
Your Friend: Higher!
Your Guess: 19
Your Friend: Higher!
Your Guess: 22
Your Friend: Lower!
Your Guess: 20
Your Friend: Higher!
Your Guess: 21
Your Friend: YOU GOT IT!
Now it took you 7 guesses to get this right. But what is the relationship here? What is the maximum number of items you can search through with each additional guess?
Guesses | Items
1 | 2
2 | 4
3 | 8
4 | 16
5 | 32
6 | 64
7 | 128
10 | 1024
Using the table, we can see that if we use a binary search to guess a number between 1 and 100, it will take us at most 7 attempts. If we had 128 numbers, we could also guess the number in at most 7 attempts, but 129 numbers will take us at most 8 attempts (in relation to logarithms: here we would need 7 guesses for a 128-value range and 10 guesses for a 1024-value range; 7 is the logarithm of 128, and 10 is the logarithm of 1024, base 2).
Notice the phrase 'at most'. Big O notation always refers to the worst case. If you're lucky, you could guess the number in one attempt, and so the best case is O(1), but that's another story.
We can see that after every guess our data set shrinks. A good rule of thumb to identify whether an algorithm has logarithmic time is
to check whether the data set shrinks by a certain factor after each iteration
What about O(n log n)?
You will eventually come across a linearithmic-time O(n log(n)) algorithm. The rule of thumb above applies again, but this time the logarithmic part has to run n times, e.g., reducing the size of a list n times, which occurs in algorithms like merge sort.
You can easily identify whether the algorithmic time is n log n: look for an outer loop which iterates through a list (O(n)), then look to see if there is an inner loop. If the inner loop is cutting/reducing the data set on each iteration, that loop is O(log n), and so the overall algorithm is O(n log n).
Disclaimer: The rope-logarithm example was taken from the excellent book Mathematician's Delight by W. W. Sawyer.

Logarithmic running time (O(log n)) essentially means that the running time grows in proportion to the logarithm of the input size. As an example, if 10 items take at most some amount of time x, 100 items take at most, say, 2x, and 10,000 items take at most 4x, then it looks like an O(log n) time complexity.

First, I recommend that you read the following book:
Algorithms (4th Edition)
Here are some functions and their expected complexities, with the numbers indicating statement execution frequencies, followed by the Big-O complexity chart, taken from bigocheatsheet.
Lastly, a very simple showcase of how it is calculated:
Anatomy of a program's statement execution frequencies.
Analyzing the running time of a program (example).

You can think of O(log N) intuitively by saying the time is proportional to the number of digits in N.
If an operation performs constant time work on each digit or bit of an input, the whole operation will take time proportional to the number of digits or bits in the input, not the magnitude of the input; thus, O(log N) rather than O(N).
If an operation makes a series of constant-time decisions, each of which halves (or reduces by a factor of 3, 4, 5, ...) the size of the input to be considered, the whole will take time proportional to log base 2 (base 3, base 4, base 5, ...) of the size N of the input, rather than being O(N).
And so on.
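As a concrete instance of the work-per-digit idea, counting the decimal digits of a number takes one step per digit, i.e., O(log N) steps. A small C sketch of my own:

#include <stdio.h>

/* Each loop iteration strips one decimal digit, so the loop
 * runs once per digit: about log10(n) + 1 times. */
int digit_count(unsigned long n) {
    int digits = 1;
    while (n >= 10) {
        n /= 10;
        digits++;
    }
    return digits;
}

int main(void) {
    printf("%d\n", digit_count(9));       /* 1 */
    printf("%d\n", digit_count(12345));   /* 5 */
    return 0;
}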

What's logb(n)?
It is the number of times you can cut a log of length n repeatedly into b equal parts before reaching a section of size 1.

The best way I've always had to mentally visualize an algorithm that runs in O(log n) is as follows:
If you increase the problem size by a multiplicative amount (e.g., multiply its size by 10), the work only increases by an additive amount.
Applying this to your binary tree question so you have a good application: if you double the number of nodes in a binary tree, the height only increases by 1 (an additive amount). If you double it again, it still only increases by 1. (Obviously I'm assuming it stays balanced and such.) That way, instead of doubling your work when the problem size is multiplied, you're only doing very slightly more work. That's why O(log n) algorithms are awesome.

Divide and conquer algorithms usually have a log n component in the running time. This comes from the repeated halving of the input.
In the case of binary search, every iteration throws away half of the input. It should be noted that in Big-O notation, log is log base 2.
Edit: As noted, the log base doesn't matter, but when deriving the Big-O performance of an algorithm, the log factor will come from halving, hence why I think of it as base 2.

O(log n) is a bit imprecise; what usually arises in practice is log2 n, i.e., the logarithm with base 2.
The height of a balanced binary tree is O(log2 n), since every node has two (note the "two" as in log2 n) child nodes. So, a tree with n nodes has a height of about log2 n.
Another example is binary search, which has a running time of O(log2 n) because at every step you divide the search space by 2.

But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I would rephrase this as 'the height of a complete binary tree is log n'. Figuring out the height of a complete binary tree would be O(log n) if you were traversing it down step by step.
I cannot understand how to identify a function with a logarithmic time.
The logarithm is essentially the inverse of exponentiation. So, if each 'step' of your function eliminates a constant fraction of the elements from the original item set, you have a logarithmic time algorithm.
For the tree example, you can easily see that stepping down a level of nodes cuts the number of candidate elements in half as you continue traversing. The popular example of looking through a name-sorted phone book is essentially equivalent to traversing down a binary search tree (the middle page is the root element, and at each step you can deduce whether to go left or right).

O(log n) refers to a function (or algorithm, or step in an algorithm) working in an amount of time proportional to the logarithm (usually base 2, but not always, and in any event the base is insignificant in big-O notation*) of the size of the input.
The logarithmic function is the inverse of the exponential function. Put another way, if your input grows exponentially (rather than linearly, as you would normally consider it), your function grows linearly.
O(log n) running times are very common in any sort of divide-and-conquer application, because you are (ideally) cutting the work in half every time. If in each of the division or conquer steps, you are doing constant time work (or work that is not constant-time, but with time growing more slowly than O(log n)), then your entire function is O(log n). It's fairly common to have each step require linear time on the input instead; this will amount to a total time complexity of O(n log n).
The running time complexity of binary search is an example of O(log n). This is because in binary search, you are always ignoring half of your input at each step by dividing the array in half and only focusing on one half. Each step is constant-time, because in binary search you only need to compare one element with your key in order to figure out what to do next, regardless of how big the array you are considering is at any point. So you do approximately log(n)/log(2) steps.
The running time complexity of merge sort is an example of O(n log n). This is because you are dividing the array in half with each step, resulting in a total of approximately log(n)/log(2) steps. However, in each step you need to perform merge operations on all elements (whether it's one merge operation on two sublists of n/2 elements, or two merge operations on four sublists of n/4 elements, is irrelevant because it adds to having to do this for n elements in each step). Thus, the total complexity is O(n log n).
*Remember that in big-O notation, by definition, constants don't matter. Also, by the change-of-base rule for logarithms, the only difference between logarithms of different bases is a constant factor.
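For reference, here is a compact C sketch of the merge sort described above: log2(n) levels of halving, with O(n) merge work spread across each level (a minimal illustration of my own, not tuned for production use):

#include <stdio.h>
#include <string.h>

/* Merge two sorted halves a[lo..mid) and a[mid..hi) using tmp. */
static void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof(int));
}

/* Each level of recursion halves the range (log2(n) levels), and
 * the merges at one level touch all n elements: O(n log n) total. */
static void merge_sort(int a[], int tmp[], int lo, int hi) {
    if (hi - lo < 2)
        return;                 /* 0 or 1 element: already sorted */
    int mid = lo + (hi - lo) / 2;
    merge_sort(a, tmp, lo, mid);
    merge_sort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7, 3};
    int tmp[6];
    merge_sort(a, tmp, 0, 6);
    for (int i = 0; i < 6; i++)
        printf("%d ", a[i]);    /* 1 2 3 5 7 9 */
    printf("\n");
    return 0;
}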

These two cases will take O(log n) time:
Case 1:
void f(int n) {
    int i;
    for (i = 1; i < n; i = i * 2)    /* i doubles each pass: about log2(n) iterations */
        printf("%d", i);
}
Case 2:
void f(int n) {
    int i;
    for (i = n; i >= 1; i = i / 2)   /* i halves each pass: about log2(n) iterations */
        printf("%d", i);
}

It simply means that the time needed for this task grows with log(n) (for example: 2 s for n = 10, 4 s for n = 100, ...). Read the Wikipedia articles on the binary search algorithm and Big O notation for more details.

Simply put: At each step of your algorithm you can cut the work in half. (Asymptotically equivalent to third, fourth, ...)

If you plot a logarithmic function on a graphical calculator or something similar, you'll see that it rises really slowly -- even more slowly than a linear function.
This is why algorithms with a logarithmic time complexity are highly sought after: even for really big n (let's say n = 10^8, for example), they perform more than acceptably.

I can add something interesting that I read in the book by Cormen et al. a long time ago. Imagine a problem where we have to find a solution in a problem space. This problem space should be finite.
Now, if you can prove that at every iteration of your algorithm you cut off a fraction of this space that is no less than some limit, this means that your algorithm is running in O(log N) time.
I should point out that we are talking here about a relative fraction limit, not an absolute one. Binary search is a classical example: at each step we throw away 1/2 of the problem space. But binary search is not the only such example. Suppose you proved somehow that at each step you throw away at least 1/128 of the problem space. That means your program is still running in O(log N) time, although significantly slower than binary search. This is a very good hint when analyzing recursive algorithms: it can often be proved that at each step the recursion rules out several of the remaining variants, and this leads to the cutoff of some fraction of the problem space.
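The 1/128 claim is easy to check empirically. The sketch below (my own, in C) removes only a 1/128 fraction of the remaining space at each step and counts the steps; the count still grows in proportion to log(n), just with a much larger constant than halving would give:

#include <stdio.h>

/* Remove at least 1/128 of the remaining space each step.
 * The space shrinks geometrically (factor 127/128), so the
 * number of steps is proportional to log(n), just with a much
 * larger constant than binary search's halving. */
int steps_to_one(long n) {
    int steps = 0;
    while (n > 1) {
        long cut = n / 128;
        if (cut == 0) cut = 1;  /* always make progress */
        n -= cut;
        steps++;
    }
    return steps;
}

int main(void) {
    printf("n = 10^3: %d steps\n", steps_to_one(1000L));
    printf("n = 10^6: %d steps\n", steps_to_one(1000000L));
    printf("n = 10^9: %d steps\n", steps_to_one(1000000000L));
    return 0;   /* the step counts grow roughly linearly in log(n) */
}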

Actually, if you have a list of n elements and create a binary tree from that list (as in a divide and conquer algorithm), you will keep dividing by 2 until you reach lists of size 1 (the leaves).
At the first step, you divide by 2: you then have 2 lists (2^1). You divide each by 2, so you have 4 lists (2^2); you divide again and have 8 lists (2^3), and so on, until your list size is 1.
That gives you the equation:
n / (2^steps) = 1  <=>  n = 2^steps  <=>  lg(n) = steps
(you take the lg of each side, lg being the log base 2)

I can give an example for a for loop; once the concept is grasped there, it may be simpler to understand in different contexts.
O(log(n)) means that in the loop the step grows exponentially. E.g.
for (i = 1; i <= n; i = i * 2) {;}
The complexity in O-notation of this program is O(log(n)). Let's try to loop through it by hand (n being somewhere between 512 and 1023):
step: 1 2 3 4 5 6 7 8 9 10
i: 1 2 4 8 16 32 64 128 256 512
Although n is somewhere between 512 and 1023, only 10 iterations take place. This is because the step in the loop grows exponentially and thus takes only 10 iterations to reach the termination.
The logarithm of x (to the base a) is the inverse function of a^x.
It is like saying that the logarithm is the inverse of the exponential.
Now try to see it that way: if the exponential grows very fast, then the logarithm grows (inversely) very slowly.
The difference between O(n) and O(log(n)) is huge, similar to the difference between O(n) and O(a^n) (a being a constant).
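A quick way to feel that difference is to count loop iterations for both growth rates side by side (a throwaway C sketch of my own):

#include <stdio.h>

int main(void) {
    long n = 1000000;           /* one million */
    long linear = 0, logarithmic = 0;

    for (long i = 1; i <= n; i++)       /* O(n) loop */
        linear++;
    for (long i = 1; i <= n; i *= 2)    /* O(log n) loop */
        logarithmic++;

    /* Prints 1000000 vs 20: the logarithmic loop barely notices n. */
    printf("O(n): %ld iterations, O(log n): %ld iterations\n",
           linear, logarithmic);
    return 0;
}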

Every time we write an algorithm or code, we try to analyze its asymptotic complexity.
It is different from its running time.
Asymptotic complexity describes the growth behavior of an algorithm's execution time, while the running time is the actual execution time; some people use these terms interchangeably.
The actual execution time is not a good measure for analysis, because it depends on various parameters, viz.:
1. Physical system
2. Programming language
3. Coding style
4. And much more ...
Instead, we take the input size as the parameter, because whatever the code is, the input is the same.
So the execution time is a function of the input size.
The following is an example of a linear-time algorithm:
Linear search
Given n input elements, to search for an element in the array you need at most n comparisons. In other words, no matter what programming language you use, what coding style you prefer, or what system you execute it on, in the worst case it requires n comparisons. The execution time is linearly proportional to the input size.
And it's not just search: whatever the work may be (increment, compare, or any operation), it's a function of the input size.
So when you say an algorithm is O(log n),
it means the execution time grows in proportion to the logarithm of the input size n.
As the input size increases, the work done (here, the execution time) increases (hence the proportionality):
n    Work
2    1 unit of work
4    2 units of work
8    3 units of work
See how, as the input size increases, the work done increases, and this relationship is independent of any machine.
If you try to find out the actual value of a unit of work,
it does depend on those parameters specified above; it will change according to the system, and so on.

The complete binary tree example is O(log n) because the search looks like this:
1 2 3 4 5 6 7 8 9 10 11 12
Searching for 4 takes 3 probes: 6, 3, then 4. And log2 12 ≈ 3.6, which is a good approximation of how many probes were needed.

log x to base b = y is the inverse of b^y = x
If you have an M-ary tree of depth d and size n, then:
traversing the whole tree ~ O(M^d) = O(n)
Walking a single path in the tree ~ O(d) = O(log n to base M)
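In code, walking a single path is just dividing the remaining subtree size by M until one node is left, so the path length is roughly log base M of n. A small C sketch of my own (the sizes 16 and 81 are arbitrary examples):

#include <stdio.h>

/* Depth of a path in an M-ary tree with n leaves: the number of
 * times n can be divided by M before reaching a single node,
 * i.e. roughly log base M of n. */
int path_length(long n, int m) {
    int depth = 0;
    while (n > 1) {
        n /= m;
        depth++;
    }
    return depth;
}

int main(void) {
    printf("%d\n", path_length(16, 2));    /* 4: binary tree */
    printf("%d\n", path_length(81, 3));    /* 4: ternary tree */
    return 0;
}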

In computer science it means that:
f(n) = O(g(n)) if there exist a suitable constant C and a threshold n0, independent of n,
such that
for all n > n0, 0 <= f(n) <= C*g(n) holds.
And it seems that this notation was mostly taken from mathematics.
In this article there is a quote:
D.E. Knuth, "BIG OMICRON AND BIG OMEGA AND BIG THETA", 1976:
On the basis of the issues discussed here, I propose that members of
SIGACT, and editors of computer science and mathematics journals,
adopt notations as defined above, unless a better alternative can be
found reasonably soon.
That was 1976, but the notation is still in use today.
In mathematical analysis it means that:
lim sup (f(n)/g(n)) is finite as n goes to +infinity
But even in mathematical analysis this symbol has sometimes been used in the meaning "C*g(n) > f(n) > 0".
As I know from university, the symbol was introduced by the German mathematician Landau (1877-1938).

O(log n) is one of the common time complexities used to measure the runtime performance of code.
I hope you have already heard of the binary search algorithm.
Let's assume you have to find an element in a sorted array of size N.
Basically, the remaining search space shrinks like this:
N
N/2
N/4
N/8 ... and so on, down to 1.
Since the search space halves at every step, there are about log2(N) steps, each doing a constant amount of work, and that gives O(log N).

If you are looking for an intuition-based answer, I would like to offer two interpretations.
Imagine a very high hill with a very broad base. To reach the top of the hill there are two ways: one is a dedicated pathway spiraling around the hill to the top; the other is small terrace-like carvings cut out to provide a staircase. If the first way takes linear time O(n), the second one is O(log n).
Imagine an algorithm which accepts an integer n as input and completes in time proportional to n: then it is O(n) or Theta(n). But if it runs in time proportional to the number of digits, or the number of bits in the binary representation of the number, then the algorithm runs in O(log n) or Theta(log n) time.

Divide-and-conquer algorithms that keep only one half of the problem at each step have O(log n) complexity. One example: calculate your own power function,
int power(int x, unsigned int y)
{
    int temp;
    if (y == 0)
        return 1;
    temp = power(x, y / 2);        /* recurse on half the exponent */
    if (y % 2 == 0)
        return temp * temp;        /* x^y = (x^(y/2))^2 for even y */
    else
        return x * temp * temp;    /* x^y = x * (x^(y/2))^2 for odd y */
}
from http://www.geeksforgeeks.org/write-a-c-program-to-calculate-powxn/
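To see why this runs in O(log y) time: each recursive call halves the exponent, so, for example, power(2, 10) recurses with y = 10, 5, 2, 1, 0, i.e., about log2(y) levels with at most two multiplications per level.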

Related

If only Log n is written what is its base 10 or 2 in algorithm [duplicate]

I am learning about Big O Notation running times and amortized times. I understand the notion of O(n) linear time, meaning that the size of the input affects the growth of the algorithm proportionally...and the same goes for, for example, quadratic time O(n2) etc..even algorithms, such as permutation generators, with O(n!) times, that grow by factorials.
For example, the following function is O(n) because the algorithm grows in proportion to its input n:
f(int n) {
int i;
for (i = 0; i < n; ++i)
printf("%d", i);
}
Similarly, if there was a nested loop, the time would be O(n2).
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I do know (maybe not in great detail) what Logarithm is, in the sense that: log10 100 = 2, but I cannot understand how to identify a function with a logarithmic time.
I cannot understand how to identify a function with a log time.
The most common attributes of logarithmic running-time function are that:
the choice of the next element on which to perform some action is one of several possibilities, and
only one will need to be chosen.
or
the elements on which the action is performed are digits of n
This is why, for example, looking up people in a phone book is O(log n). You don't need to check every person in the phone book to find the right one; instead, you can simply divide-and-conquer by looking based on where their name is alphabetically, and in every section you only need to explore a subset of each section before you eventually find someone's phone number.
Of course, a bigger phone book will still take you a longer time, but it won't grow as quickly as the proportional increase in the additional size.
We can expand the phone book example to compare other kinds of operations and their running time. We will assume our phone book has businesses (the "Yellow Pages") which have unique names and people (the "White Pages") which may not have unique names. A phone number is assigned to at most one person or business. We will also assume that it takes constant time to flip to a specific page.
Here are the running times of some operations we might perform on the phone book, from fastest to slowest:
O(1) (in the worst case): Given the page that a business's name is on and the business name, find the phone number.
O(1) (in the average case): Given the page that a person's name is on and their name, find the phone number.
O(log n): Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies. (This is a binary search for a person's name.)
O(n): Find all people whose phone numbers contain the digit "5".
O(n): Given a phone number, find the person or business with that number.
O(n log n): There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.
For the below examples, we're now at the printer's office. Phone books are waiting to be mailed to each resident or business, and there's a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.
O(n log n): We want to personalize the phone book, so we're going to find each person or business's name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage.
O(n2): A mistake occurred at the office, and every entry in each of the phone books has an extra "0" at the end of the phone number. Take some white-out and remove each zero.
O(n · n!): We're ready to load the phonebooks onto the shipping dock. Unfortunately, the robot that was supposed to load the books has gone haywire: it's putting the books onto the truck in a random order! Even worse, it loads all the books onto the truck, then checks to see if they're in the right order, and if not, it unloads them and starts over. (This is the dreaded bogo sort.)
O(nn): You fix the robot so that it's loading things correctly. The next day, one of your co-workers plays a prank on you and wires the loading dock robot to the automated printing systems. Every time the robot goes to load an original book, the factory printer makes a duplicate run of all the phonebooks! Fortunately, the robot's bug-detection systems are sophisticated enough that the robot doesn't try printing even more copies when it encounters a duplicate book for loading, but it still has to load every original and duplicate book that's been printed.
O(log N) basically means time goes up linearly while the n goes up exponentially. So if it takes 1 second to compute 10 elements, it will take 2 seconds to compute 100 elements, 3 seconds to compute 1000 elements, and so on.
​It is O(log n) when we do divide and conquer type of algorithms e.g binary search. Another example is quick sort where each time we divide the array into two parts and each time it takes O(N) time to find a pivot element. Hence it N O(log N)
Many good answers have already been posted to this question, but I believe we really are missing an important one - namely, the illustrated answer.
What does it mean to say that the height of a complete binary tree is O(log n)?
The following drawing depicts a binary tree. Notice how each level contains double the number of nodes compared to the level above (hence binary):
Binary search is an example with complexity O(log n). Let's say that the nodes in the bottom level of the tree in figure 1 represents items in some sorted collection. Binary search is a divide-and-conquer algorithm, and the drawing shows how we will need (at most) 4 comparisons to find the record we are searching for in this 16 item dataset.
Assume we had instead a dataset with 32 elements. Continue the drawing above to find that we will now need 5 comparisons to find what we are searching for, as the tree has only grown one level deeper when we multiplied the amount of data. As a result, the complexity of the algorithm can be described as a logarithmic order.
Plotting log(n) on a plain piece of paper, will result in a graph where the rise of the curve decelerates as n increases:
Overview
Others have given good diagram examples, such as the tree diagrams. I did not see any simple code examples. So in addition to my explanation, I'll provide some algorithms with simple print statements to illustrate the complexity of different algorithm categories.
First, you'll want to have a general idea of Logarithm, which you can get from https://en.wikipedia.org/wiki/Logarithm . Natural science use e and the natural log. Engineering disciples will use log_10 (log base 10) and computer scientists will use log_2 (log base 2) a lot, since computers are binary based. Sometimes you'll see abbreviations of natural log as ln(), engineers normally leave the _10 off and just use log() and log_2 is abbreviated as lg(). All of the types of logarithms grow in a similar fashion, that is why they share the same category of log(n).
When you look at the code examples below, I recommend looking at O(1), then O(n), then O(n^2). After you are good with those, then look at the others. I've included clean examples as well as variations to demonstrate how subtle changes can still result in the same categorization.
You can think of O(1), O(n), O(logn), etc as classes or categories of growth. Some categories will take more time to do than others. These categories help give us a way of ordering the algorithm performance. Some grown faster as the input n grows. The following table demonstrates said growth numerically. In the table below think of log(n) as the ceiling of log_2.
Simple Code Examples Of Various Big O Categories:
O(1) - Constant Time Examples:
Algorithm 1:
Algorithm 1 prints hello once and it doesn't depend on n, so it will always run in constant time, so it is O(1).
print "hello";
Algorithm 2:
Algorithm 2 prints hello 3 times, however it does not depend on an input size. Even as n grows, this algorithm will always only print hello 3 times. That being said 3, is a constant, so this algorithm is also O(1).
print "hello";
print "hello";
print "hello";
O(log(n)) - Logarithmic Examples:
Algorithm 3 - This acts like "log_2"
Algorithm 3 demonstrates an algorithm that runs in log_2(n). Notice the post operation of the for loop multiples the current value of i by 2, so i goes from 1 to 2 to 4 to 8 to 16 to 32 ...
for(int i = 1; i <= n; i = i * 2)
print "hello";
Algorithm 4 - This acts like "log_3"
Algorithm 4 demonstrates log_3. Notice i goes from 1 to 3 to 9 to 27...
for(int i = 1; i <= n; i = i * 3)
print "hello";
Algorithm 5 - This acts like "log_1.02"
Algorithm 5 is important, as it helps show that as long as the number is greater than 1 and the result is repeatedly multiplied against itself, that you are looking at a logarithmic algorithm.
for(double i = 1; i < n; i = i * 1.02)
print "hello";
O(n) - Linear Time Examples:
Algorithm 6
This algorithm is simple, which prints hello n times.
for(int i = 0; i < n; i++)
print "hello";
Algorithm 7
This algorithm shows a variation, where it will print hello n/2 times. n/2 = 1/2 * n. We ignore the 1/2 constant and see that this algorithm is O(n).
for(int i = 0; i < n; i = i + 2)
print "hello";
O(n*log(n)) - nlog(n) Examples:
Algorithm 8
Think of this as a combination of O(log(n)) and O(n). The nesting of the for loops help us obtain the O(n*log(n))
for(int i = 0; i < n; i++)
for(int j = 1; j < n; j = j * 2)
print "hello";
Algorithm 9
Algorithm 9 is like algorithm 8, but each of the loops has allowed variations, which still result in the final result being O(n*log(n))
for(int i = 0; i < n; i = i + 2)
for(int j = 1; j < n; j = j * 3)
print "hello";
O(n^2) - n squared Examples:
Algorithm 10
O(n^2) is obtained easily by nesting standard for loops.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
print "hello";
Algorithm 11
Like algorithm 10, but with some variations.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j = j + 2)
print "hello";
O(n^3) - n cubed Examples:
Algorithm 12
This is like algorithm 10, but with 3 loops instead of 2.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
for(int k = 0; k < n; k++)
print "hello";
Algorithm 13
Like algorithm 12, but with some variations that still yield O(n^3).
for(int i = 0; i < n; i++)
for(int j = 0; j < n + 5; j = j + 2)
for(int k = 0; k < n; k = k + 3)
print "hello";
Summary
The above give several straight forward examples, and variations to help demonstrate what subtle changes can be introduced that really don't change the analysis. Hopefully it gives you enough insight.
The explanation below is using the case of a fully balanced binary tree to help you understand how we get logarithmic time complexity.
Binary tree is a case where a problem of size n is divided into sub-problem of size n/2 until we reach a problem of size 1:
And that's how you get O(log n) which is the amount of work that needs to be done on the above tree to reach a solution.
A common algorithm with O(log n) time complexity is Binary Search whose recursive relation is T(n/2) + O(1) i.e. at every subsequent level of the tree you divide problem into half and do constant amount of additional work.
If you had a function that takes:
1 millisecond to complete if you have 2 elements.
2 milliseconds to complete if you have 4 elements.
3 milliseconds to complete if you have 8 elements.
4 milliseconds to complete if you have 16 elements.
...
n milliseconds to complete if you have 2^n elements.
Then it takes log2(n) time. The Big O notation, loosely speaking, means that the relationship only needs to be true for large n, and that constant factors and smaller terms can be ignored.
The logarithm
Ok let's try and fully understand what a logarithm actually is.
Imagine we have a rope and we have tied it to a horse. If the rope is directly tied to the horse, the force the horse would need to pull away (say, from a man) is directly 1.
Now imagine the rope is looped round a pole. The horse to get away will now have to pull many times harder. The amount of times will depend on the roughness of the rope and the size of the pole, but let's assume it will multiply one's strength by 10 (when the rope makes a complete turn).
Now if the rope is looped once, the horse will need to pull 10 times harder. If the human decides to make it really difficult for the horse, he may loop the rope again round a pole, increasing it's strength by an additional 10 times. A third loop will again increase the strength by a further 10 times.
We can see that for each loop, the value increases by 10. The number of turns required to get any number is called the logarithm of the number i.e. we need 3 posts to multiple your strength by 1000 times, 6 posts to multiply your strength by 1,000,000.
3 is the logarithm of 1,000, and 6 is the logarithm of 1,000,000 (base 10).
So what does O(log n) actually mean?
In our example above, our 'growth rate' is O(log n). For every additional loop, the force our rope can handle is 10 times more:
Turns | Max Force
0 | 1
1 | 10
2 | 100
3 | 1000
4 | 10000
n | 10^n
Now the example above did use base 10, but fortunately the base of the log is insignificant when we talk about big o notation.
Now let's imagine you are trying to guess a number between 1-100.
Your Friend: Guess my number between 1-100!
Your Guess: 50
Your Friend: Lower!
Your Guess: 25
Your Friend: Lower!
Your Guess: 13
Your Friend: Higher!
Your Guess: 19
Your Friend: Higher!
Your Friend: 22
Your Guess: Lower!
Your Guess: 20
Your Friend: Higher!
Your Guess: 21
Your Friend: YOU GOT IT!
Now it took you 7 guesses to get this right. But what is the relationship here? What is the most amount of items that you can guess from each additional guess?
Guesses | Items
1 | 2
2 | 4
3 | 8
4 | 16
5 | 32
6 | 64
7 | 128
10 | 1024
Using the graph, we can see that if we use a binary search to guess a number between 1-100 it will take us at most 7 attempts. If we had 128 numbers, we could also guess the number in 7 attemps but 129 numbers will takes us at most 8 attempts (in relations to logarithms, here we would need 7 guesses for a 128 value range, 10 guesses for a 1024 value range. 7 is the logarithm of 128, 10 is the logarithm of 1024 (base 2)).
Notice that I have bolded 'at most'. Big-O notation always refers to the worse case. If you're lucky, you could guess the number in one attempt and so the best case is O(1), but that's another story.
We can see that for every guess our data set is shrinking. A good rule of thumb to identify if an algorithm has a logarithmtic time is
to see if the data set shrinks by a certain order after each iteration
What about O(n log n)?
You will eventually come across a linearithmic time O(n log(n)) algorithm. The rule of thumb above applies again, but this time the logarithmic function has to run n times e.g. reducing the size of a list n times, which occurs in algorithms like a mergesort.
You can easily identify if the algorithmic time is n log n. Look for an outer loop which iterates through a list (O(n)). Then look to see if there is an inner loop. If the inner loop is cutting/reducing the data set on each iteration, that loop is (O(log n)), and so the overall algorithm is = O(n log n).
Disclaimer: The rope-logarithm example was grabbed from the excellent Mathematician's Delight book by W.Sawyer.
Logarithmic running time (O(log n)) essentially means that the running time grows in proportion to the logarithm of the input size - as an example, if 10 items takes at most some amount of time x, and 100 items takes at most, say, 2x, and 10,000 items takes at most 4x, then it's looking like an O(log n) time complexity.
First I recommend you to read following book;
Algorithms (4th Edition)
Here is some functions and their expected complexities. Numbers are indicating statement execution frequencies.
Following Big-O Complexity Chart also taken from bigocheatsheet
Lastly very simple showcase there is shows how it is calculated;
Anatomy of a program’s statement execution frequencies.
Analyzing the running time of a program (example).
You can think of O(log N) intuitively by saying the time is proportional to the number of digits in N.
If an operation performs constant time work on each digit or bit of an input, the whole operation will take time proportional to the number of digits or bits in the input, not the magnitude of the input; thus, O(log N) rather than O(N).
If an operation makes a series of constant time decisions each of which halves (reduces by a factor of 3, 4, 5..) the size of the input to be considered, the whole will take time proportional to log base 2 (base 3, base 4, base 5...) of the size N of the input, rather than being O(N).
And so on.
What's logb(n)?
It is the number of times you can cut a log of length n repeatedly into b equal parts before reaching a section of size 1.
The best way I've always had to mentally visualize an algorithm that runs in O(log n) is as follows:
If you increase the problem size by a multiplicative amount (i.e. multiply its size by 10), the work is only increased by an additive amount.
Applying this to your binary tree question so you have a good application: if you double the number of nodes in a binary tree, the height only increases by 1 (an additive amount). If you double it again, it still only increased by 1. (Obviously I'm assuming it stays balanced and such). That way, instead of doubling your work when the problem size is multiplied, you're only doing very slightly more work. That's why O(log n) algorithms are awesome.
Divide and conquer algorithms usually have a logn component to the running time. This comes from the repeated halving of the input.
In the case of binary search, every iteration you throw away half of the input. It should be noted that in Big-O notation, log is log base 2.
Edit: As noted, the log base doesn't matter, but when deriving the Big-O performance of an algorithm, the log factor will come from halving, hence why I think of it as base 2.
O(log n) is a bit misleading, more precisely it's O(log2 n), i.e. (logarithm with base 2).
The height of a balanced binary tree is O(log2 n), since every node has two (note the "two" as in log2 n) child nodes. So, a tree with n nodes has a height of log2 n.
Another example is binary search, which has a running time of O(log2 n) because at every step you divide the search space by 2.
But what exactly is O(log n)? For example, what does it mean to say that the height of a >complete binary tree is O(log n)?
I would rephrase this as 'height of a complete binary tree is log n'. Figuring the height of a complete binary tree would be O(log n), if you were traversing down step by step.
I cannot understand how to identify a function with a logarithmic
time.
Logarithm is essentially the inverse of exponentiation. So, if each 'step' of your function is eliminating a factor of elements from the original item set, that is a logarithmic time algorithm.
For the tree example, you can easily see that stepping down a level of nodes cuts down an exponential number of elements as you continue traversing. The popular example of looking through a name-sorted phone book is essentially equivalent to traversing down a binary search tree (middle page is the root element, and you can deduce at each step whether to go left or right).
O(log n) refers to a function (or algorithm, or step in an algorithm) working in an amount of time proportional to the logarithm (usually base 2 in most cases, but not always, and in any event this is insignificant by big-O notation*) of the size of the input.
The logarithmic function is the inverse of the exponential function. Put another way, if your input grows exponentially (rather than linearly, as you would normally consider it), your function grows linearly.
O(log n) running times are very common in any sort of divide-and-conquer application, because you are (ideally) cutting the work in half every time. If in each of the division or conquer steps, you are doing constant time work (or work that is not constant-time, but with time growing more slowly than O(log n)), then your entire function is O(log n). It's fairly common to have each step require linear time on the input instead; this will amount to a total time complexity of O(n log n).
The running time complexity of binary search is an example of O(log n). This is because in binary search, you are always ignoring half of your input in each later step by dividing the array in half and only focusing on one half with each step. Each step is constant-time, because in binary search you only need to compare one element with your key in order to figure out what to do next irregardless of how big the array you are considering is at any point. So you do approximately log(n)/log(2) steps.
The running time complexity of merge sort is an example of O(n log n). This is because you are dividing the array in half with each step, resulting in a total of approximately log(n)/log(2) steps. However, in each step you need to perform merge operations on all elements (whether it's one merge operation on two sublists of n/2 elements, or two merge operations on four sublists of n/4 elements, is irrelevant because it adds to having to do this for n elements in each step). Thus, the total complexity is O(n log n).
*Remember that big-O notation, by definition, constants don't matter. Also by the change of base rule for logarithms, the only difference between logarithms of different bases is a constant factor.
These 2 cases will take O(log n) time
case 1: f(int n) {
int i;
for (i = 1; i < n; i=i*2)
printf("%d", i);
}
case 2 : f(int n) {
int i;
for (i = n; i>=1 ; i=i/2)
printf("%d", i);
}
It simply means that the time needed for this task grows with log(n) (example : 2s for n = 10, 4s for n = 100, ...). Read the Wikipedia articles on Binary Search Algorithm and Big O Notation for more precisions.
Simply put: At each step of your algorithm you can cut the work in half. (Asymptotically equivalent to third, fourth, ...)
If you plot a logarithmic function on a graphical calculator or something similar, you'll see that it rises really slowly -- even more slowly than a linear function.
This is why algorithms with a logarithmic time complexity are highly sought after: even for really big n (let's say n = 10^8, for example), they perform more than acceptably.
I can add something interesting, that I read in book by Kormen and etc. a long time ago. Now, imagine a problem, where we have to find a solution in a problem space. This problem space should be finite.
Now, if you can prove, that at every iteration of your algorithm you cut off a fraction of this space, that is no less than some limit, this means that your algorithm is running in O(logN) time.
I should point out, that we are talking here about a relative fraction limit, not the absolute one. The binary search is a classical example. At each step we throw away 1/2 of the problem space. But binary search is not the only such example. Suppose, you proved somehow, that at each step you throw away at least 1/128 of problem space. That means, your program is still running at O(logN) time, although significantly slower than the binary search. This is a very good hint in analyzing of recursive algorithms. It often can be proved that at each step the recursion will not use several variants, and this leads to the cutoff of some fraction in problem space.
Actually, if you have a list of n elements, and create a binary tree from that list (like in the divide and conquer algorithm), you will keep dividing by 2 until you reach lists of size 1 (the leaves).
At the first step, you divide by 2. You then have 2 lists (2^1), you divide each by 2, so you have 4 lists (2^2), you divide again, you have 8 lists (2^3)and so on until your list size is 1
That gives you the equation :
n/(2^steps)=1 <=> n=2^steps <=> lg(n)=steps
(you take the lg of each side, lg being the log base 2)
I can give an example for a for loop and maybe once grasped the concept maybe it will be simpler to understand in different contexts.
That means that in the loop the step grows exponentially. E.g.
for (i=1; i<=n; i=i*2) {;}
The complexity in O-notation of this program is O(log(n)). Let's try to loop through it by hand (n being somewhere between 512 and 1023 (excluding 1024):
step: 1 2 3 4 5 6 7 8 9 10
i: 1 2 4 8 16 32 64 128 256 512
Although n is somewhere between 512 and 1023, only 10 iterations take place. This is because the step in the loop grows exponentially and thus takes only 10 iterations to reach the termination.
The logarithm of x (to the base of a) is the reverse function of a^x.
It is like saying that logarithm is the inverse of exponential.
Now try to see it that way, if exponential grows very fast then logarithm grows (inversely) very slow.
The difference between O(n) and O(log(n)) is huge, similar to the difference between O(n) and O(a^n) (a being a constant).
Every time we write an algorithm or code we try to analyze its asymptotic complexity.
It is different from its time complexity.
Asymptotic complexity is the behavior of execution time of an algorithm while the time complexity is the actual execution time. But some people use these terms interchangeably.
Because time complexity depends on various parameters viz.
1. Physical System
2. Programming Language
3. coding Style
4. And much more ......
The actual execution time is not a good measure for analysis.
Instead we take input size as the parameter because whatever the code is, the input is same.
So the execution time is a function of input size.
Following is an example of Linear Time Algorithm
Linear Search
Given n input elements, to search an element in the array you need at most 'n' comparisons. In other words, no matter what programming language you use, what coding style you prefer, on what system you execute it. In the worst case scenario it requires only n comparisons.The execution time is linearly proportional to the input size.
And its not just search, whatever may be the work (increment, compare or any operation) its a function of input size.
So when you say any algorithm is O(log n)
it means the execution time is log times the input size n.
As the input size increases the work done(here the execution time) increases.(Hence proportionality)
n Work
2 1 units of work
4 2 units of work
8 3 units of work
See as the input size increased the work done is increased and it is independent of any machine.
And if you try to find out the value of units of work
It's actually dependent onto those above specified parameters.It will change according to the systems and all.
The complete binary tree example is O(log n) because the search looks like this:
1 2 3 4 5 6 7 8 9 10 11 12
Searching for 4 takes 3 comparisons: 6, 3, then 4. And log2(12) ≈ 3.58, which is a good approximation of how many comparisons were needed.
log x to base b = y is the inverse of b^y = x
If you have an M-ary tree of depth d and size n, then:
Traversing the whole tree ~ O(M^d) = O(n)
Walking a single path in the tree ~ O(d) = O(log n to base M)
In information technology it means that:
f(n) = O(g(n)) if there exist a suitable constant C and an n0, independent of n,
such that
for all n > n0, "C*g(n) > f(n) > 0" is true.
And it seems that this notation was mostly taken over from mathematics.
In this article there is a quote:
D.E. Knuth, "BIG OMICRON AND BIG OMEGA AND BIG THETA", 1976:
On the basis of the issues discussed here, I propose that members of
SIGACT, and editors of computer science and mathematics journals,
adopt notations as defined above, unless a better alternative can be
found reasonably soon.
That was 1976; it is 2016 now, and we still use this notation.
In mathematical analysis it means that:
lim (f(n)/g(n)) = constant, as n goes to +infinity
But even in mathematical analysis this symbol was sometimes used in the meaning "C*g(n) > f(n) > 0".
As I know from university, the symbol was introduced by the German mathematician Edmund Landau (1877-1938).
O(log n) is one of the common time complexities used to measure the runtime performance of code.
I hope you have already heard of Binary search algorithm.
Let's assume you have to find an element in the array of size N.
Basically, at each step the search space shrinks by half:
N
N/2
N/4
N/8 ... etc.
Each step does only a constant amount of work, and the number of halvings needed to get from N down to 1 is log2 N, so the total work is O(log N). (Note that summing work proportional to the remaining size, N(1 + 1/2 + 1/4 + ...), would give O(N) instead; binary search is O(log N) precisely because each step's work is constant.)
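Here is what that halving looks like as code, a standard iterative binary search in C (my sketch; it assumes the array is sorted in ascending order):
int binary_search(const int *a, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* midpoint, written to avoid overflow */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            lo = mid + 1;               /* discard the lower half */
        else
            hi = mid - 1;               /* discard the upper half */
    }
    return -1;  /* at most about log2(n) iterations run before we get here */
}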
If you are looking for an intuition-based answer, I would like to offer two interpretations.
Imagine a very high hill with a very broad base. To reach the top of the hill there are two ways: one is a dedicated pathway spiraling around the hill up to the top, the other small terrace-like carvings cut out to form a staircase. If the first way takes linear time O(n), the second one is O(log n).
Imagine an algorithm which accepts an integer n as input and completes in time proportional to n; then it is O(n) or Θ(n). But if it runs in time proportional to the number of digits (or the number of bits in the binary representation) of the number, then the algorithm runs in O(log n) or Θ(log n) time.
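As a quick sketch of the second interpretation (my own example, not from the original answer): counting the decimal digits of n takes one loop iteration per digit, i.e. O(log n) steps.
int digit_count(unsigned int n) {
    int digits = 1;
    while (n >= 10) {   /* each division by 10 removes one digit */
        n /= 10;
        ++digits;
    }
    return digits;      /* e.g. digit_count(78979871) == 8 */
}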
Divide-and-conquer algorithms that discard half of the problem at each step have complexity O(log n). One example: calculating your own power function,
int power(int x, unsigned int y)
{
    int temp;
    if (y == 0)
        return 1;
    temp = power(x, y / 2);       /* halve the exponent: O(log y) recursive calls */
    if (y % 2 == 0)
        return temp * temp;       /* even y: x^y = (x^(y/2))^2     */
    else
        return x * temp * temp;   /* odd y:  x^y = x * (x^(y/2))^2 */
}
from http://www.geeksforgeeks.org/write-a-c-program-to-calculate-powxn/

Is there any algorithm with time complexity O(lg * n) (iterated logarithm function)?

In computer science, the iterated logarithm of n, written log* n (usually read "log star"), is the number of times the logarithm function must be iteratively applied before the result is less than or equal to 1. The simplest formal definition is the result of this recursive function: log* n = 0 if n <= 1, otherwise log* n = 1 + log*(log n).
Is there any algorithm with time complexity O(lg * n) ?
If you implement the union-find algorithm with path compression and union by rank, both union and find will have amortized complexity O(log*(n)).
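For reference, here is a compact C sketch of that structure (my own illustration; the fixed array size and names are assumptions for the example):
#define MAXN 1000
int parent[MAXN];
int rank_[MAXN];   /* upper bound on the height of each tree */

void make_sets(int n) {
    for (int i = 0; i < n; ++i) { parent[i] = i; rank_[i] = 0; }
}

int find(int x) {                 /* path compression: point x straight at the root */
    if (parent[x] != x)
        parent[x] = find(parent[x]);
    return parent[x];
}

void unite(int a, int b) {        /* union by rank: hang the shorter tree below */
    a = find(a); b = find(b);
    if (a == b) return;
    if (rank_[a] < rank_[b])      parent[a] = b;
    else if (rank_[a] > rank_[b]) parent[b] = a;
    else                          { parent[b] = a; ++rank_[a]; }
}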
It's rare but not unheard of to see log* n appear in the runtime analysis of algorithms. Here are a couple of cases that tend to cause log* n to appear.
Approach 1: Shrinking By a Log Factor
Many divide-and-conquer algorithms work by converting an input of size n into an input of size n / k. The number of phases of these algorithms is then O(log n), since you can only divide by a constant O(log n) times before you shrink your input down to a constant size. In that sense, when you see "the input is divided by a constant," you should think "so it can only be divided O(log n) times before we run out of things to divide."
In rarer cases, some algorithms work by shrinking the size of the input down by a logarithmic factor. For example, one data structure for the range semigroup query problem works by breaking a larger problem down into blocks of size log n, then recursively subdividing each block of size log n into blocks of size log log n, etc. This process eventually stops once the blocks hit some small constant size, which means that it stops after O(log* n) iterations. (This particular approach can then be improved to give a data structure in which the blocks have size log* n for an overall number of rounds of O(log** n), eventually converging to an optimal structure with runtime O(α(n)), where α(n) is the inverse Ackermann function.)
Approach 2: Compressing Digits of Numbers
The above section talks about approaches that explicitly break a larger problem down into smaller pieces whose sizes are logarithmic in the size of the original problem. However, there's another way to take an input of size n and reduce it to an input of size O(log n): replace the input with something roughly comparable in size to its number of digits. Since the number n requires O(log n) digits to write out, this has the effect of shrinking the size of the input by the amount needed for an O(log* n) term to arise.
As a simple example of this, consider an algorithm to compute the digital root of a number. This is the number you get by repeatedly adding the digits of a number up until you're down to a single digit. For example, the digital root of 78979871 can be found by computing
7 + 8 + 9 + 7 + 9 + 8 + 7 + 1 = 56
5 + 6 = 11
1 + 1 = 2
2
and getting a digital root of two. Each time we sum the digits of the number, we replace the number n with a number that's at most 9 ⌈log10 n⌉, so the number of rounds is O(log* n). (That being said, the total runtime is O(log n), since we have to factor in the work associated with adding up the digits of the number, and adding the digits of the original number dominates the runtime.)
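A sketch of that procedure in C (my code; the example value matches the one above):
unsigned int digit_sum(unsigned int n) {
    unsigned int s = 0;
    while (n > 0) { s += n % 10; n /= 10; }   /* O(log n) digits to add */
    return s;
}

unsigned int digital_root(unsigned int n) {
    while (n >= 10)          /* each pass shrinks n to at most 9 * ceil(log10 n) */
        n = digit_sum(n);
    return n;                /* digital_root(78979871) == 2, as computed above */
}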
For a more elaborate example, there is a parallel algorithm for 3-coloring the nodes of a tree described in the paper "Parallel Symmetry-Breaking in Sparse Graphs" by Goldberg et al. The algorithm works by repeatedly replacing numbers with simpler numbers formed by summing up certain bits of the numbers, and the number of rounds needed, like the approach mentioned above, is O(log* n).
Hope this helps!

What is meant by complexity of O(log n)?

Consider this example of binary search tree.
n = 10; and if base = 2, then
log n = log2(10) = 3.321928.
I am assuming it means that at most 3.321 steps (accesses) will be required to search for an element. I also assume that the BST is a balanced binary tree.
Now to access node with value 25. I have to go following nodes:
50
40
30
25
So I had to access 4 nodes, and 3.321 is nearly equal to 4.
Is this understanding right or erroneous?
I'd call your understanding not quite correct.
The big-O notation does not say anything about an exact amount of steps done. A notation O(log n) means that something is approximately proportional to log n, but not necessarily equal.
If you say that the number of steps to search for a value in a BST is O(log n), this means that it is approximately C*log n for some constant C not depending on n, but it says nothing about the value of C. So for n=10 this never says that the number of steps is 4 or whatever. It can be 1, or it can be 1000000, depending on what C is.
What this notation does say is that if you consider two examples with different and big enough sizes, say n1 and n2, then the ratio of the number of steps in these two examples will be approximately log(n1)/log(n2).
So if for n=10 it took you, say, 4 steps, then for n=100 it should take approximately two times more, that is, 8 steps, because log(100)/log(10)=2, and for n=10000 it should take 16 steps.
And if for n=10 it took you 1000000 steps, then for n=100 it should take 2000000, and for n=10000 — 4000000.
This is all for "large enough" n. For small n the number of steps can deviate from this proportionality. For most practical algorithms "large enough" usually starts from 5-10, if not from 1, but from a strict point of view the big-O notation does not set any requirement on when the proportionality should start.
Also, in fact, O(log n) notation does not require that the number of steps grows proportionally to log n; it requires only that the number of steps grows no faster than proportionally to log n. That is, the ratio of the numbers of steps should not be log(n1)/log(n2), but <= log(n1)/log(n2).
Note also another situation that can make the background for O-notation clearer. Consider not the number of steps, but the time spent searching in a BST. You clearly cannot predict this time, because it depends on the machine you run on, on the particular implementation of the algorithm, and even on the units you use for time (seconds or nanoseconds, etc.). So the time can be 0.0001 or 100000 or whatever. However, all these effects (speed of your machine, etc.) roughly change all the measurement results by some constant factor. Therefore you can say that the time is O(log n); just in different cases the constant C will be different.
Your thinking is not totally correct. The steps/accesses which are considered are comparisons. But O(log n) is merely a parameter measuring asymptotic complexity, not an exact step count. As Petr answered, you should go through the points mentioned in his answer.
Also, a BST is a binary search tree, sometimes also called an ordered or sorted binary tree.
The exact running time/number of comparisons can't be derived from an asymptotic complexity measurement. For that, you'll have to return to the exact derivation of searching for an element in a BST.
Assume that we have a "balanced" tree with n nodes. If the maximum number of comparisons to find an entry is (k+1), where k is the height, we have
2^(k+1) - 1 = n
from which we obtain
k = log2(n+1) - 1 = O(log n).
As you can see, the constant factors are removed when measuring asymptotic complexity in worst-case analysis, so the number of comparisons reduces to O(log n).
Next, a demonstration of how an element is searched for in a BST, based on how the comparisons are done:
1. Selecting 50,if root element //compare and move below for left-child or right-child
2. Movement downwards from 50 to 40, the leftmost child
3. Movement downwards from 40 to 30, the leftmost child
4. Movement downwards from 30 to 25, found and hence no movement further.
// Had there been more elements to traverse deeper, that would have counted as the 5th step.
Hence, the item 25 was found after 3 downward traversals. So there are 4 comparisons and 3 downward traversals (because the height is 3).
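That downward traversal is exactly this loop (a C sketch of BST search; the struct and names are mine): each iteration does one comparison and moves one level down, so the number of comparisons is bounded by the height plus one.
struct node { int value; struct node *left, *right; };

const struct node *bst_find(const struct node *t, int key) {
    while (t != NULL) {
        if (key == t->value)
            return t;                                  /* found */
        t = (key < t->value) ? t->left : t->right;     /* one level down */
    }
    return NULL;   /* not present */
}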
Usually you say something like this :-
Given a balanced binary search tree with n elements, you need O(log n) operations to search.
or
Search in a balanced binary search tree of n elements is in O(log n).
I like the second phrase more, because it emphasizes that O is a function returning a set of functions given x (short: O(x)). x: N → N is a function. The input of x is the size of the input of a function and the output of x can be interpreted as the number of operations you need.
A function g is in O(x) when g is bounded above by x multiplied by an arbitrary non-negative constant, from some starting point n_0 on, for all following n.
In computer science, g is often identified with an algorithm itself, which is wrong; rather, it might be the number of operations of an algorithm given the input size. Note that these are different things.
More formally: g is in O(x) if there exist a constant C > 0 and an n_0 such that g(n) <= C*x(n) for all n >= n_0.
So, regarding your question: You have to define what n is (conceptually, not as a number). In your example, it is eventually the number of nodes or the number of nodes on the longest path to a leaf.
Usually, when you use Big-O notation, you are not interested for a "average" case (and especially not for some given case) but you want to say something about the worst case.

Precise Input Size and Time Complexity

When talking about time complexity we usually use n as input, which is not a precise measure of the actual input size. I am having trouble showing that, when using specific size for input (s) an algorithm remains in the same complexity class.
For instance, take a simple Sequential Search algorithm. In its worst case it takes W(n) time. If we apply specific input size (in base 2), the order should be W(lg L), where L is the largest integer.
How do I show that Sequential Search, or any algorithm, remains the same complexity class, in this case linear time? I understand that there is some sort of substitution that needs to take place, but I am shaky on how to come to the conclusion.
EDIT
I think I may have found what I was looking for, but I'm not entirely sure.
If you define worst case time complexity as W(s), the maximum number of steps done by an algorithm for an input size of s, then by definition of input size, s = lg n, where n is the input. Then, n = 2^s, leading to the conclusion that the time complexity is W(2^s), an exponential complexity. Therefore, the algorithm's performance with binary encoding is exponential, not linear as it is in terms of magnitude.
When talking about time complexity we usually use n as input, which is not a precise measure of the actual input size. I am having trouble showing that, when using specific size for input (s) an algorithm remains in the same complexity class.
For instance, take a simple Sequential Search algorithm. In its worst case it takes W(n) time. If we apply specific input size (in base 2), the order should be W(lg L), where L is the largest integer.
L is a variable that represents the largest integer.
n is a variable that represents the size of the input.
L is not a specific value anymore than n is.
When you apply a specific value, you aren't talking about a complexity class anymore, you are talking about an instance of that class.
Let's say you are searching a list of 500 integers. In other words, n = 500
The worst-case complexity class of Sequential Search is O(n)
The complexity is n
The specific instance of worst-case complexity is 500
Edit:
Your values will be uniform in the number of bits required to encode each value. If the input is a list of 32-bit integers, then c = 32, the number of bits per integer. Complexity would be 32*n => O(n).
In terms of L, if L is the largest value, and lg L is the number of bits required to encode L, then lg L is the constant c. Your complexity in terms of bits is O(n) = c*n, where c = lg L is the constant specific input size.
What I know is that the maximum number of steps done by Sequential Search is, obviously, cn^2 + n lg L. cn^2 being the number of steps to increment loops and do branching.
That's not true at all. The maximum number of steps done by a sequential search is going to be c*n, where n is the number of items in the list and c is some constant. That's the worst case. There is no n^2 component or logarithmic component.
For example, a simple sequential search would be:
for (int i = 0; i < NumItems; ++i)
{
    if (Items[i] == query)
        return i;
}
return -1;
With that algorithm, if you search for each item, then half of the searches will require fewer than NumItems/2 iterations and half of the searches will require NumItems/2 or more iterations. If an item you search for isn't in the list, it will require NumItems iterations to determine that. The worst case running time is NumItems iterations. The average case is NumItems/2 iterations.
The actual number of operations performed is some constant, C, multiplied by the number of iterations. On average it's C*NumItems/2.
As Lucia Moura states: "Except for the unary encoding, all the other encodings for natural numbers have lengths that are polynomially related."
Here is the source. Take a look at page 19.

Big-O for Eight Year Olds? [duplicate]

I'm asking more about what this means to my code. I understand the concepts mathematically, I just have a hard time wrapping my head around what they mean conceptually. For example, if one were to perform an O(1) operation on a data structure, I understand that the number of operations it has to perform won't grow because there are more items. And an O(n) operation would mean that you would perform a set of operations on each element. Could somebody fill in the blanks here?
Like what exactly would an O(n^2) operation do?
And what the heck does it mean if an operation is O(n log(n))?
And does somebody have to smoke crack to write an O(x!)?
One way of thinking about it is this:
O(N^2) means for every element, you're doing something with every other element, such as comparing them. Bubble sort is an example of this.
O(N log N) means for every element, you're doing something that only needs to look at log N of the elements. This is usually because you know something about the elements that let you make an efficient choice. Most efficient sorts are an example of this, such as merge sort.
O(N!) means to do something for all possible permutations of the N elements. Traveling salesman is an example of this, where there are N! ways to visit the nodes, and the brute force solution is to look at the total cost of every possible permutation to find the optimal one.
The big thing that Big-O notation means to your code is how it will scale when you double the amount of "things" it operates on. Here's a concrete example:
Big-O       | computations for 10 things | computations for 100 things
------------+----------------------------+-----------------------------
O(1)        | 1                          | 1
O(log(n))   | 3                          | 7
O(n)        | 10                         | 100
O(n log(n)) | 30                         | 700
O(n^2)      | 100                        | 10000
So take quicksort which is O(n log(n)) vs bubble sort which is O(n^2). When sorting 10 things, quicksort is 3 times faster than bubble sort. But when sorting 100 things, it's 14 times faster! Clearly picking the fastest algorithm is important then. When you get to databases with million rows, it can mean the difference between your query executing in 0.2 seconds, versus taking hours.
Another thing to consider is that a bad algorithm is one thing that Moore's law cannot help. For example, if you've got some scientific calculation that's O(n^3) and it can compute 100 things a day, doubling the processor speed only gets you 125 things in a day. However, knock that calculation to O(n^2) and you're doing 1000 things a day.
clarification:
Actually, Big-O says nothing about comparative performance of different algorithms at the same specific size point, but rather about comparative performance of the same algorithm at different size points:
Big-O       | computations for 10 things | computations for 100 things | computations for 1000 things
------------+----------------------------+-----------------------------+------------------------------
O(1)        | 1                          | 1                           | 1
O(log(n))   | 1                          | 3                           | 7
O(n)        | 1                          | 10                          | 100
O(n log(n)) | 1                          | 33                          | 664
O(n^2)      | 1                          | 100                         | 10000
You might find it useful to visualize it (the plot is omitted here). Also, on a LogY/LogX scale the functions n^(1/2), n, and n^2 all look like straight lines, while on a LogY/X scale 2^n, e^n, and 10^n are straight lines and n! is linearithmic (looks like n log n).
This might be too mathematical, but here's my try. (I am a mathematician.)
If something is O(f(n)), then its running time on n elements will be equal to A f(n) + B (measured in, say, clock cycles or CPU operations). It's key to understand that you also have these constants A and B, which arise from the specific implementation. B represents essentially the "constant overhead" of your operation, for example some preprocessing that you do that doesn't depend on the size of the collection. A represents the speed of your actual item-processing algorithm.
The key, though, is that you use big O notation to figure out how well something will scale. So those constants won't really matter: if you're trying to figure out how to scale from 10 to 10000 items, who cares about the constant overhead B? Similarly, other concerns (see below) will certainly outweigh the weight of the multiplicative constant A.
So the real deal is f(n). If f grows not at all with n, e.g. f(n) = 1, then you'll scale fantastically---your running time will always just be A + B. If f grows linearly with n, i.e. f(n) = n, your running time will scale pretty much as best as can be expected---if your users are waiting 10 ns for 10 elements, they'll wait 10000 ns for 10000 elements (ignoring the additive constant). But if it grows faster, like n^2, then you're in trouble; things will start slowing down way too much when you get larger collections. f(n) = n log(n) is a good compromise, usually: your operation can't be so simple as to give linear scaling, but you've managed to cut things down such that it'll scale much better than f(n) = n^2.
Practically, here are some good examples:
O(1): retrieving an element from an array. We know exactly where it is in memory, so we just go get it. It doesn't matter if the collection has 10 items or 10000; it's still at index (say) 3, so we just jump to location 3 in memory.
O(n): retrieving an element from a linked list. Here, A = 0.5, because on average you'll have to go through 1/2 of the linked list before you find the element you're looking for.
O(n^2): various "dumb" sorting algorithms. Because generally their strategy involves, for each element (n), looking at all the other elements (so times another n, giving n^2), then positioning yourself in the right place.
O(n log(n)): various "smart" sorting algorithms. It turns out that you only need to look at, say, 10 elements in a 10^10-element collection to intelligently sort yourself relative to everyone else in the collection. Because everyone else is also going to look at 10 elements, and the emergent behavior is orchestrated just right so that this is enough to produce a sorted list.
O(n!): an algorithm that "tries everything," since there are (proportional to) n! possible combinations of n elements that might solve a given problem. So it just loops through all such combinations, tries them, then stops whenever it succeeds.
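A sketch of that "try everything" shape in C (my illustration): recursive swapping enumerates all n! orderings, and the commented-out check(a, n) is a hypothetical stand-in for "test whether this permutation solves the problem".
void permute(int *a, int i, int n) {
    if (i == n) {
        /* one complete permutation of a[0..n-1]: check(a, n); */
        return;
    }
    for (int j = i; j < n; ++j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;   /* choose element j for slot i */
        permute(a, i + 1, n);                  /* fill the remaining slots */
        t = a[i]; a[i] = a[j]; a[j] = t;       /* undo the choice (backtrack) */
    }
}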
don.neufeld's answer is very good, but I'd probably explain it in two parts: first, there's a rough hierarchy of O()'s that most algorithms fall into. Then, you can look at each of those to come up with sketches of what typical algorithms of that time complexity do.
For practical purposes, the only O()'s that ever seem to matter are:
O(1) "constant time" - the time required is independent of the size of the input. As a rough category, I would include algorithms such as hash lookups and Union-Find here, even though neither of those are actually O(1).
O(log(n)) "logarithmic" - it gets slower as you get larger inputs, but once your input gets fairly large, it won't change enough to worry about. If your runtime is ok with reasonably-sized data, you can swamp it with as much additional data as you want and it'll still be ok.
O(n) "linear" - the more input, the longer it takes, in an even tradeoff. Three times the input size will take roughly three times as long.
O(n log(n)) "better than quadratic" - increasing the input size hurts, but it's still manageable. The algorithm is probably decent, it's just that the underlying problem is more difficult (decisions are less localized with respect to the input data) than those problems that can be solved in linear time. If your input sizes are getting up there, don't assume that you could necessarily handle twice the size without changing your architecture around (eg by moving things to overnight batch computations, or not doing things per-frame). It's ok if the input size increases a little bit, though; just watch out for multiples.
O(n^2) "quadratic" - it's really only going to work up to a certain size of your input, so pay attention to how big it could get. Also, your algorithm may suck -- think hard to see if there's an O(n log(n)) algorithm that would give you what you need. Once you're here, feel very grateful for the amazing hardware we've been gifted with. Not long ago, what you are trying to do would have been impossible for all practical purposes.
O(n^3) "cubic" - not qualitatively all that different from O(n^2). The same comments apply, only more so. There's a decent chance that a more clever algorithm could shave this time down to something smaller, eg O(n^2 log(n)) or O(n^2.8...), but then again, there's a good chance that it won't be worth the trouble. (You're already limited in your practical input size, so the constant factors that may be required for the more clever algorithms will probably swamp their advantages for practical cases. Also, thinking is slow; letting the computer chew on it may save you time overall.)
O(2^n) "exponential" - the problem is either fundamentally computationally hard or you're being an idiot. These problems have a recognizable flavor to them. Your input sizes are capped at a fairly specific hard limit. You'll know quickly whether you fit into that limit.
And that's it. There are many other possibilities that fit between these (or are greater than O(2^n)), but they don't often happen in practice and they're not qualitatively much different from one of these. Cubic algorithms are already a bit of a stretch; I only included them because I've run into them often enough to be worth mentioning (eg matrix multiplication).
What's actually happening for these classes of algorithms? Well, I think you had a good start, although there are many examples that wouldn't fit these characterizations. But for the above, I'd say it usually goes something like:
O(1) - you're only looking at most at a fixed-size chunk of your input data, and possibly none of it. Example: the maximum of a sorted list.
Or your input size is bounded. Example: addition of two numbers. (Note that addition of N numbers is linear time.)
O(log n) - each element of your input tells you enough to ignore a large fraction of the rest of the input. Example: when you look at an array element in binary search, its value tells you that you can ignore "half" of your array without looking at any of it. Or similarly, the element you look at gives you enough of a summary of a fraction of the remaining input that you won't need to look at it.
There's nothing special about halves, though -- if you can only ignore 10% of your input at each step, it's still logarithmic.
O(n) - you do some fixed amount of work per input element. (But see below.)
O(n log(n)) - there are a few variants.
You can divide the input into two piles (in no more than linear time), solve the problem independently on each pile, and then combine the two piles to form the final solution. The independence of the two piles is key. Example: classic recursive mergesort.
Each linear-time pass over the data gets you halfway to your solution. Example: quicksort if you think in terms of the maximum distance of each element to its final sorted position at each partitioning step (and yes, I know that it's actually O(n^2) because of degenerate pivot choices. But practically speaking, it falls into my O(n log(n)) category.)
O(n^2) - you have to look at every pair of input elements.
Or you don't, but you think you do, and you're using the wrong algorithm.
O(n^3) - um... I don't have a snappy characterization of these. It's probably one of:
You're multiplying matrices
You're looking at every pair of inputs but the operation you do requires looking at all of the inputs again
the entire graph structure of your input is relevant
O(2^n) - you need to consider every possible subset of your inputs.
None of these are rigorous. Especially not linear time algorithms (O(n)): I could come up with a number of examples where you have to look at all of the inputs, then half of them, then half of those, etc. Or the other way around -- you fold together pairs of inputs, then recurse on the output. These don't fit the description above, since you're not looking at each input once, but it still comes out in linear time. Still, 99.2% of the time, linear time means looking at each input once.
A lot of these are easy to demonstrate with something non-programming, like shuffling cards.
Sorting a deck of cards by going through the whole deck to find the ace of spades, then going through the whole deck to find the 2 of spades, and so on would be worst case n^2, if the deck was already sorted backwards. You looked at all 52 cards 52 times.
In general the really bad algorithms aren't necessarily intentional, they're commonly a misuse of something else, like calling a method that is linear inside some other method that repeats over the same set linearly.
I try to explain by giving simple code examples in C# and JavaScript.
C#
For List<int> numbers = new List<int> {1,2,3,4,5,6,7,12,543,7};
O(1) looks like
return numbers.First();
O(n) looks like
int result = 0;
foreach (int num in numbers)
{
    result += num;
}
return result;
O(n log(n)) looks like
int result = 0;
foreach (int num in numbers)
{
    int index = numbers.Count - 1;
    while (index > 1)
    {
        // yeah, stupid, but couldn't come up with something more useful :-(
        result += numbers[index];
        index /= 2;
    }
}
return result;
O(n^2) looks like
int result = 0;
foreach (int outerNum in numbers)
{
    foreach (int innerNum in numbers)
    {
        result += outerNum * innerNum;
    }
}
return result;
O(n!) looks like, uhm, too tired to come up with anything simple.
But I hope you get the general point?
JavaScript
For const numbers = [ 1, 2, 3, 4, 5, 6, 7, 12, 543, 7 ];
O(1) looks like
numbers[0];
O(n) looks like
let result = 0;
for (const num of numbers) {
    result += num;
}
O(n log(n)) looks like
let result = 0;
for (const num of numbers) {
    let index = numbers.length - 1;
    while (index > 1) {
        // yeah, stupid, but couldn't come up with something more useful :-(
        result += numbers[index];
        index = Math.floor(index / 2);
    }
}
O(n^2) looks like
let result = 0;
for (const outerNum of numbers) {
    for (const innerNum of numbers) {
        result += outerNum * innerNum;
    }
}
Ok - there are some very good answers here but almost all of them seem to make the same mistake and it's one that is pervading common usage.
Informally, we write that f(n) = O( g(n) ) if, up to a scaling factor and for all n larger than some n0, g(n) is larger than f(n). That is, f(n) grows no quicker than, or is bounded from above by, g(n). This tells us nothing about how fast f(n) grows, save for the fact that it is guaranteed not to be any worse than g(n).
A concrete example: n = O( 2^n ). We all know that n grows much less quickly than 2^n, so that entitles us to say that it is bounded by above by the exponential function. There is a lot of room between n and 2^n, so it's not a very tight bound, but it's still a legitimate bound.
Why do we (computer scientists) use bounds rather than being exact? Because a) bounds are often easier to prove and b) it gives us a short-hand to express properties of algorithms. If I say that my new algorithm is O(n.log n) that means that in the worst case its run-time will be bounded from above by n.log n on n inputs, for large enough n (although see my comments below on when I might not mean worst-case).
If instead, we want to say that a function grows exactly as quickly as some other function, we use theta to make that point (I'll write T( f(n) ) to mean \Theta of f(n) in markdown). T( g(n) ) is short hand for being bounded from above and below by g(n), again, up to a scaling factor and asymptotically.
That is f(n) = T( g(n) ) <=> f(n) = O(g(n)) and g(n) = O(f(n)). In our example, we can see that n != T( 2^n ) because 2^n != O(n).
Why get concerned about this? Because in your question you write 'would someone have to smoke crack to write an O(x!)?' The answer is no - because basically everything you write will be bounded from above by the factorial function. The run time of quicksort is O(n!) - it's just not a tight bound.
There's also another dimension of subtlety here. Typically we are talking about the worst case input when we use O( g(n) ) notation, so that we are making a compound statement: in the worst case running time it will not be any worse than an algorithm that takes g(n) steps, again modulo scaling and for large enough n. But sometimes we want to talk about the running time of the average and even best cases.
Vanilla quicksort is, as ever, a good example. It's T(n^2) in the worst case (it will actually take at least n^2 steps, but not significantly more), but T(n.log n) in the average case, which is to say the expected number of steps is proportional to n.log n. In the best case it is also T(n.log n), but you could improve that, for example, by checking if the array was already sorted, in which case the best-case running time would be T(n).
How does this relate to your question about the practical realisations of these bounds? Well, unfortunately, O( ) notation hides constants which real-world implementations have to deal with. So although we can say that, for example, for a T(n^2) operation we have to visit every possible pair of elements, we don't know how many times we have to visit them (except that it's not a function of n). So we could have to visit every pair 10 times, or 10^10 times, and the T(n^2) statement makes no distinction. Lower order functions are also hidden - we could have to visit every pair of elements once, and every individual element 100 times, because n^2 + 100n = T(n^2). The idea behind O( ) notation is that for large enough n, this doesn't matter at all because n^2 gets so much larger than 100n that we don't even notice the impact of 100n on the running time. However, we often deal with 'sufficiently small' n such that constant factors and so on make a real, significant difference.
For example, quicksort (average cost T(n.log n)) and heapsort (average cost T(n.log n)) are both sorting algorithms with the same average cost - yet quicksort is typically much faster than heapsort. This is because heapsort does a few more comparisons per element than quicksort.
This is not to say that O( ) notation is useless, just imprecise. It's quite a blunt tool to wield for small n.
(As a final note to this treatise, remember that O( ) notation just describes the growth of any function - it doesn't necessarily have to be time, it could be memory, messages exchanged in a distributed system or number of CPUs required for a parallel algorithm.)
The way I describe it to my nontechnical friends is like this:
Consider multi-digit addition. Good old-fashioned, pencil-and-paper addition. The kind you learned when you were 7-8 years old. Given two three-or-four-digit numbers, you can find out what they add up to fairly easily.
If I gave you two 100-digit numbers, and asked you what they add up to, figuring it out would be pretty straightforward, even if you had to use pencil-and-paper. A bright kid could do such an addition in just a few minutes. This would only require about 100 operations.
Now, consider multi-digit multiplication. You probably learned that at around 8 or 9 years old. You (hopefully) did lots of repetitive drills to learn the mechanics behind it.
Now, imagine I gave you those same two 100-digit numbers and told you to multiply them together. This would be a much, much harder task, something that would take you hours to do - and that you'd be unlikely to do without mistakes. The reason for this is that (this version of) multiplication is O(n^2); each digit in the bottom number has to be multiplied by each digit in the top number, leaving a total of about n^2 operations. In the case of the 100-digit numbers, that's 10,000 multiplications.
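In code, that schoolbook method looks like this (a C sketch of my own, digits stored least significant first): the nested loops are the n * n = n^2 single-digit multiplications.
void long_multiply(const int *a, const int *b, int *out, int n) {
    /* a and b each hold n digits; out must hold 2n digits, all zeroed */
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            out[i + j] += a[i] * b[j];          /* one of the n^2 digit products */
            out[i + j + 1] += out[i + j] / 10;  /* propagate the carry */
            out[i + j] %= 10;
        }
}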
No, an O(n) algorithm does not mean it will perform an operation on each element. Big-O notation gives you a way to talk about the "speed" of your algorithm independent of your actual machine.
O(n) means that the time your algorithm takes grows linearly as your input increases. O(n^2) means that the time your algorithm takes grows as the square of your input. And so forth.
The way I think about it, is you have the task of cleaning up a problem caused by some evil villain V who picks N, and you have to estimate out how much longer it's going to take to finish your problem when he increases N.
O(1) -> increasing N really doesn't make any difference at all
O(log(N)) -> every time V doubles N, you have to spend an extra amount of time T to complete the task. V doubles N again, and you spend the same amount.
O(N) -> every time V doubles N, you spend twice as much time.
O(N^2) -> every time V doubles N, you spend 4x as much time. (it's not fair!!!)
O(N log(N)) -> every time V doubles N, you spend twice as much time plus a little more.
These are bounds of an algorithm; computer scientists want to describe how long it is going to take for large values of N. (which gets important when you are factoring numbers that are used in cryptography -- if the computers speed up by a factor of 10, how many more bits do you have to use to ensure it will still take them 100 years to break your encryption and not just 1 year?)
Some of the bounds can have weird expressions if it makes a difference to the people involved. I've seen stuff like O(N log(N) log(log(N))) somewhere in Knuth's Art of Computer Programming for some algorithms. (can't remember which one off the top of my head)
One thing that hasn't been touched on yet for some reason:
When you see algorithms with things like O(2^n) or O(n^3) or other nasty values it often means you're going to have to accept an imperfect answer to your problem in order to get acceptable performance.
Correct solutions that blow up like this are common when dealing with optimization problems. A nearly-correct answer delivered in a reasonable timeframe is better than a correct answer delivered long after the machine has decayed to dust.
Consider chess: I don't know exactly what the correct solution is considered to be but it's probably something like O(n^50) or even worse. It is theoretically impossible for any computer to actually calculate the correct answer--even if you use every particle in the universe as a computing element performing an operation in the minimum possible time for the life of the universe you still have a lot of zeros left. (Whether a quantum computer can solve it is another matter.)
The "Intuitition" behind Big-O
Imagine a "competition" between two functions over x, as x approaches infinity: f(x) and g(x).
Now, if from some point on (some x) one function always has a higher value than the other, then let's call this function "faster" than the other.
So, for example, if for every x > 100 you see that f(x) > g(x), then f(x) is "faster" than g(x).
In this case we would say g(x) = O(f(x)). f(x) poses a sort of "speed limit" for g(x), since eventually it passes it and leaves it behind for good.
This isn't exactly the definition of big-O notation, which also states that f(x) only has to be larger than C*g(x) for some constant C (which is just another way of saying that you can't help g(x) win the competition by multiplying it by a constant factor - f(x) will always win in the end). The formal definition also uses absolute values. But I hope I managed to make it intuitive.
And does somebody have to smoke crack to write an O(x!)?
No, just use Prolog. If you write a sorting algorithm in Prolog by just describing that each element should be bigger than the previous, and let backtracking do the sorting for you, that will be O(x!). Also known as "permutation sort".
I like don neufeld's answer, but I think I can add something about O(n log n).
An algorithm which uses a simple divide-and-conquer strategy is probably going to be O(log n). The simplest example of this is finding something in a sorted list. You don't start at the beginning and scan for it: you go to the middle, decide whether to go backwards or forwards, jump halfway to the last place you looked, and repeat until you find the item you're looking for.
If you look at the quicksort or mergesort algorithms, you will see that they both take the approach of dividing the list to be sorted in half, sorting each half (using the same algorithm, recursively), and then recombining the two halves. This sort of recursive divide and conquer strategy will be O(n log n).
If you think about it carefully, you'll see that quicksort does an O(n) partitioning pass on the whole n items, then an O(n) partitioning twice on n/2 items, then 4 times on n/4 items, etc., until you get to n partitions of 1 item (which is degenerate). The number of times you divide n in half to get to 1 is approximately log n, and each step is O(n), so recursive divide and conquer is O(n log n). Mergesort builds the other way, starting with n recombinations of 1 item and finishing with 1 recombination of n items, where the recombination of two sorted lists is O(n).
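Here is the recombination side as code, a compact recursive mergesort in C (my sketch): the two halves are sorted independently, then merged in O(n), over log n levels of recursion.
#include <stdlib.h>
#include <string.h>

void merge_sort(int *a, int n) {
    if (n < 2) return;
    int mid = n / 2;
    merge_sort(a, mid);              /* sort the left half */
    merge_sort(a + mid, n - mid);    /* sort the right half */

    int *tmp = malloc(n * sizeof *tmp);
    int i = 0, j = mid, k = 0;
    while (i < mid && j < n)         /* O(n) merge of the two sorted halves */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < n)   tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
    free(tmp);
}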
As for smoking crack to write an O(n!) algorithm, you are, unless you have no choice. The traveling salesman problem mentioned above is believed to be one such problem.
Think of it as stacking lego blocks (n) vertically and jumping over them.
O(1) means at each step, you do nothing. The height stays the same.
O(n) means at each step, you stack c1 blocks, where c1 is a constant.
O(n^2) means at each step, you stack c2 x n blocks, where c2 is a constant, and n is the number of stacked blocks.
O(nlogn) means at each step, you stack c3 x n x log n blocks, where c3 is a constant, and n is the number of stacked blocks.
Most Jon Bentley books (e.g. Programming Pearls) cover such stuff in a really pragmatic manner. This talk given by him includes one such analysis of a quicksort.
While not entirely relevant to the question, Knuth came up with an interesting idea: teaching Big-O notation in high school calculus classes, though I find this idea quite eccentric.
To understand O(n log n), remember that log n means log-base-2 of n. Then look at each part:
O(n) is, more or less, when you operate on each item in the set.
O(log n) is when the number of operations is the same as the exponent to which you raise 2, to get the number of items. A binary search, for instance, has to cut the set in half log n times.
O(n log n) is a combination – you're doing something along the lines of a binary search for each item in the set. Efficient sorts often operate by doing one loop per item, and in each loop doing a good search to find the right place to put the item or group in question. Hence n * log n.
Just to respond to the couple of comments on my above post:
Domenic - I'm on this site, and I care. Not for pedantry's sake, but because we - as programmers - typically care about precision. Using O( ) notation incorrectly in the style that some have done here renders it kind of meaningless; we may just as well say something takes n^2 units of time as O( n^2 ) under the conventions used here. Using the O( ) adds nothing. It's not just a small discrepancy between common usage and mathematical precision that I'm talking about, it's the difference between it being meaningful and it not.
I know many, many excellent programmers who use these terms precisely. Saying 'oh, we're programmers therefore we don't care' cheapens the whole enterprise.
onebyone - Well, not really although I take your point. It's not O(1) for arbitrarily large n, which is kind of the definition of O( ). It just goes to show that O( ) has limited applicability for bounded n, where we would rather actually talk about the number of steps taken rather than a bound on that number.
Tell your eight year old log(n) means the number of times you have to chop a length n log in two for it to get down to size n=1 :p
O(n log n) is usually sorting
O(n^2) is usually comparing all pairs of elements
Suppose you had a computer that could solve a problem of a certain size. Now imagine that we can double the performance a few times. How much bigger a problem can we solve with each doubling?
If we can solve a problem of double the size, that's O(n).
If we have some multiplier that isn't one, that's some sort of polynomial complexity. For example, if each doubling allows us to increase the problem size by about 40%, it's O(n^2), and about 30% would be O(n^3).
If we just add to the problem size, it's exponential or worse. For example, if each doubling means we can solve a problem 1 bigger, it's O(2^n). (This is why brute-forcing a cipher key becomes effectively impossible with reasonably sized keys: a 128-bit key requires about 16 quintillion times as much processing as a 64-bit.)
Remember the fable of the tortoise and the hare (turtle and rabbit)?
Over the long run, the tortoise wins, but over the short run the hare wins.
That's like O(logN) (tortoise) vs. O(N) (hare).
If two methods differ in their big-O, then there is a level of N at which one of them will win, but big-O says nothing about how big that N is.
To stay true to the question asked, I will answer it the way I would answer an 8-year-old kid.
Suppose an ice-cream seller prepares a number of ice creams (say N) of different shapes, arranged in an orderly fashion.
You want to eat the ice cream lying in the middle
Case 1: You can eat an ice cream only if you have eaten all the ice creams smaller than it.
You will have to eat half of all the ice creams prepared (the input); the answer directly depends on the size of the input.
The solution will be of order O(N).
Case 2: You can directly eat the ice cream in the middle.
The solution will be O(1).
Case 3: You can eat an ice cream only if you have eaten all the ice creams smaller than it, and each time you eat an ice cream you let another kid (a new kid every time) eat all his ice creams.
The total time taken would be N + N + N... (N/2 times).
The solution will be O(N^2).
log(n) means logarithmic growth. An example would be divide and conquer algorithms. If you have 1000 sorted numbers in an array ( ex. 3, 10, 34, 244, 1203 ... ) and want to search for a number in the list (find its position), you could start with checking the value of the number at index 500. If it is lower than what you seek, jump to 750. If it is higher than what you seek, jump to 250. Then you repeat the process until you find your value (and key). Every time we jump half the search space, we can cull away testing many other values since we know the number 3004 can't be above number 5000 (remember, it is a sorted list).
n log(n) then means n * log(n).
I'll try to actually write an explanation for a real eight-year-old boy, setting aside technical terms and mathematical notions.
Like what exactly would an O(n^2) operation do?
If you are at a party with n people including you, how many handshakes does it take so that everyone has shaken hands with everyone else, given that people will probably forget whom they have already greeted at some point?
Note: this approximates a simplex, yielding n(n-1) handshakes, which is close enough to n^2.
And what the heck does it mean if an operation is O(n log(n))?
Your favorite team has won, they are standing in line, and there are n players in the team. How many handshakes would it take to shake hands with every player, given that you will shake each player's hand multiple times: as many times as there are digits in the number of players n?
Note: this yields n * log n, to the base 10.
And does somebody have to smoke crack to write an O(x!)?
You are a rich kid and in your wardrobe there are a lot of clothes; there are x drawers, one for each type of clothing, and the drawers are next to each other. The first drawer has 1 item, and each drawer has as many clothes as the drawer to its left plus one more, so you have something like 1 hat, 2 wigs, ... (x-1) pants, then x shirts. Now in how many ways can you dress up using a single item from each drawer?
Note: this example represents how many leaves there are in a decision tree where the number of children equals the depth, which is 1 * 2 * 3 * ... * x.

Resources