I am learning about Big O Notation running times and amortized times. I understand the notion of O(n) linear time, meaning that the running time grows in proportion to the size of the input, and the same goes for, for example, quadratic time O(n^2), and even for algorithms such as permutation generators, with O(n!) times, that grow by factorials.
For example, the following function is O(n) because the algorithm grows in proportion to its input n:
void f(int n) {
    int i;
    for (i = 0; i < n; ++i)
        printf("%d", i);
}
Similarly, if there was a nested loop, the time would be O(n^2).
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I do know (maybe not in great detail) what a logarithm is, in the sense that log_10 100 = 2, but I cannot understand how to identify a function with a logarithmic time.
I cannot understand how to identify a function with a log time.
The most common attributes of a logarithmic running-time function are that:
the choice of the next element on which to perform some action is one of several possibilities, and only one will need to be chosen,
or
the elements on which the action is performed are digits of n.
This is why, for example, looking up people in a phone book is O(log n). You don't need to check every person in the phone book to find the right one; instead, you can simply divide-and-conquer by looking based on where their name is alphabetically, and in every section you only need to explore a subset of each section before you eventually find someone's phone number.
Of course, a bigger phone book will still take you a longer time, but it won't grow as quickly as the proportional increase in the additional size.
We can expand the phone book example to compare other kinds of operations and their running time. We will assume our phone book has businesses (the "Yellow Pages") which have unique names and people (the "White Pages") which may not have unique names. A phone number is assigned to at most one person or business. We will also assume that it takes constant time to flip to a specific page.
Here are the running times of some operations we might perform on the phone book, from fastest to slowest:
O(1) (in the worst case): Given the page that a business's name is on and the business name, find the phone number.
O(1) (in the average case): Given the page that a person's name is on and their name, find the phone number.
O(log n): Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies. (This is a binary search for a person's name.)
O(n): Find all people whose phone numbers contain the digit "5".
O(n): Given a phone number, find the person or business with that number.
O(n log n): There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.
For the below examples, we're now at the printer's office. Phone books are waiting to be mailed to each resident or business, and there's a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.
O(n log n): We want to personalize the phone book, so we're going to find each person or business's name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage.
O(n^2): A mistake occurred at the office, and every entry in each of the phone books has an extra "0" at the end of the phone number. Take some white-out and remove each zero.
O(n · n!): We're ready to load the phonebooks onto the shipping dock. Unfortunately, the robot that was supposed to load the books has gone haywire: it's putting the books onto the truck in a random order! Even worse, it loads all the books onto the truck, then checks to see if they're in the right order, and if not, it unloads them and starts over. (This is the dreaded bogo sort.)
O(n^n): You fix the robot so that it's loading things correctly. The next day, one of your co-workers plays a prank on you and wires the loading dock robot to the automated printing systems. Every time the robot goes to load an original book, the factory printer makes a duplicate run of all the phonebooks! Fortunately, the robot's bug-detection systems are sophisticated enough that the robot doesn't try printing even more copies when it encounters a duplicate book for loading, but it still has to load every original and duplicate book that's been printed.
O(log N) basically means time goes up linearly while the n goes up exponentially. So if it takes 1 second to compute 10 elements, it will take 2 seconds to compute 100 elements, 3 seconds to compute 1000 elements, and so on.
It is O(log n) when we use divide-and-conquer algorithms, e.g. binary search. Another example is quicksort, where each time we divide the array into two parts, and each time it takes O(N) time to partition the array around a pivot element. Hence, on average, it is O(N log N).
Many good answers have already been posted to this question, but I believe we really are missing an important one - namely, the illustrated answer.
What does it mean to say that the height of a complete binary tree is O(log n)?
The following drawing depicts a binary tree. Notice how each level contains double the number of nodes compared to the level above (hence binary):
Binary search is an example with complexity O(log n). Let's say that the nodes in the bottom level of the tree in figure 1 represent items in some sorted collection. Binary search is a divide-and-conquer algorithm, and the drawing shows how we will need (at most) 4 comparisons to find the record we are searching for in this 16-item dataset.
Assume we had instead a dataset with 32 elements. Continue the drawing above to find that we will now need 5 comparisons to find what we are searching for, as the tree has only grown one level deeper when we doubled the amount of data. As a result, the complexity of the algorithm can be described as logarithmic.
Plotting log(n) on a plain piece of paper will result in a graph where the rise of the curve decelerates as n increases:
Overview
Others have given good diagram examples, such as the tree diagrams. I did not see any simple code examples. So in addition to my explanation, I'll provide some algorithms with simple print statements to illustrate the complexity of different algorithm categories.
First, you'll want to have a general idea of logarithms, which you can get from https://en.wikipedia.org/wiki/Logarithm . The natural sciences use e and the natural log. Engineering disciplines use log_10 (log base 10), and computer scientists use log_2 (log base 2) a lot, since computers are binary based. Sometimes you'll see the natural log abbreviated as ln(); engineers normally leave the _10 off and just write log(); and log_2 is abbreviated as lg(). All of these logarithms grow in a similar fashion, differing only by a constant factor, which is why they share the same category of log(n).
When you look at the code examples below, I recommend looking at O(1), then O(n), then O(n^2). After you are good with those, then look at the others. I've included clean examples as well as variations to demonstrate how subtle changes can still result in the same categorization.
You can think of O(1), O(n), O(logn), etc. as classes or categories of growth. Some categories will take more time to do than others. These categories help give us a way of ordering algorithm performance. Some grow faster than others as the input n grows. The following table demonstrates said growth numerically. In the table below think of log(n) as the ceiling of log_2.
Simple Code Examples Of Various Big O Categories:
O(1) - Constant Time Examples:
Algorithm 1:
Algorithm 1 prints hello once and it doesn't depend on n, so it will always run in constant time, so it is O(1).
print "hello";
Algorithm 2:
Algorithm 2 prints hello 3 times; however, it does not depend on an input size. Even as n grows, this algorithm will always only print hello 3 times. That being said, 3 is a constant, so this algorithm is also O(1).
print "hello";
print "hello";
print "hello";
O(log(n)) - Logarithmic Examples:
Algorithm 3 - This acts like "log_2"
Algorithm 3 demonstrates an algorithm that runs in log_2(n). Notice the post operation of the for loop multiplies the current value of i by 2, so i goes from 1 to 2 to 4 to 8 to 16 to 32 ...
for(int i = 1; i <= n; i = i * 2)
    print "hello";
Algorithm 4 - This acts like "log_3"
Algorithm 4 demonstrates log_3. Notice i goes from 1 to 3 to 9 to 27...
for(int i = 1; i <= n; i = i * 3)
    print "hello";
Algorithm 5 - This acts like "log_1.02"
Algorithm 5 is important, as it helps show that as long as the multiplier is greater than 1 and the loop variable is repeatedly multiplied by it, you are looking at a logarithmic algorithm.
for(double i = 1; i < n; i = i * 1.02)
    print "hello";
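To make the claim in Algorithms 3 to 5 concrete, here is a minimal sketch that counts the iterations of such a multiplying loop instead of printing. The function name count_multiplying_loop is made up for illustration, and it assumes an integer factor of at least 2 and values of n small enough that i * factor does not overflow.

```c
/* Counts how many times the body of
 *     for (i = 1; i <= n; i = i * factor)
 * runs. Assumes factor >= 2 (otherwise the loop never ends)
 * and that n is small enough for i * factor not to overflow. */
int count_multiplying_loop(long n, long factor)
{
    int iterations = 0;
    for (long i = 1; i <= n; i = i * factor)
        iterations++;
    return iterations;
}
```

Multiplying n by the factor adds only one more iteration, which is the logarithmic behaviour described above.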
O(n) - Linear Time Examples:
Algorithm 6
This algorithm is simple; it prints hello n times.
for(int i = 0; i < n; i++)
    print "hello";
Algorithm 7
This algorithm shows a variation, where it will print hello n/2 times. n/2 = 1/2 * n. We ignore the 1/2 constant and see that this algorithm is O(n).
for(int i = 0; i < n; i = i + 2)
    print "hello";
O(n*log(n)) - nlog(n) Examples:
Algorithm 8
Think of this as a combination of O(log(n)) and O(n). The nesting of the for loops helps us obtain the O(n*log(n)).
for(int i = 0; i < n; i++)
    for(int j = 1; j < n; j = j * 2)
        print "hello";
Algorithm 9
Algorithm 9 is like algorithm 8, but each of the loops has allowed variations, which still result in the final result being O(n*log(n))
for(int i = 0; i < n; i = i + 2)
    for(int j = 1; j < n; j = j * 3)
        print "hello";
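As a rough check of the O(n*log(n)) claim, the sketch below counts how many times Algorithm 8 would print, rather than printing. The name count_nested_prints is hypothetical and only used for this illustration.

```c
/* Counts how many times "hello" would be printed by Algorithm 8:
 * an outer loop that runs n times and an inner loop whose counter
 * doubles until it reaches n. */
long count_nested_prints(long n)
{
    long prints = 0;
    for (long i = 0; i < n; i++)
        for (long j = 1; j < n; j = j * 2)
            prints++;
    return prints;
}
```

For n = 8 the inner loop runs for j = 1, 2, 4, so the total is 8 * 3 = 24, i.e. n times log_2(n).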
O(n^2) - n squared Examples:
Algorithm 10
O(n^2) is obtained easily by nesting standard for loops.
for(int i = 0; i < n; i++)
    for(int j = 0; j < n; j++)
        print "hello";
Algorithm 11
Like algorithm 10, but with some variations.
for(int i = 0; i < n; i++)
    for(int j = 0; j < n; j = j + 2)
        print "hello";
O(n^3) - n cubed Examples:
Algorithm 12
This is like algorithm 10, but with 3 loops instead of 2.
for(int i = 0; i < n; i++)
    for(int j = 0; j < n; j++)
        for(int k = 0; k < n; k++)
            print "hello";
Algorithm 13
Like algorithm 12, but with some variations that still yield O(n^3).
for(int i = 0; i < n; i++)
    for(int j = 0; j < n + 5; j = j + 2)
        for(int k = 0; k < n; k = k + 3)
            print "hello";
Summary
The above gives several straightforward examples, and variations to help demonstrate what subtle changes can be introduced that really don't change the analysis. Hopefully it gives you enough insight.
The explanation below is using the case of a fully balanced binary tree to help you understand how we get logarithmic time complexity.
A binary tree is a case where a problem of size n is divided into sub-problems of size n/2 until we reach a problem of size 1:
And that's how you get O(log n), which is the amount of work that needs to be done on the above tree to reach a solution.
A common algorithm with O(log n) time complexity is binary search, whose recurrence relation is T(n) = T(n/2) + O(1), i.e. at every subsequent level of the tree you divide the problem in half and do a constant amount of additional work.
If you had a function that takes:
1 millisecond to complete if you have 2 elements.
2 milliseconds to complete if you have 4 elements.
3 milliseconds to complete if you have 8 elements.
4 milliseconds to complete if you have 16 elements.
...
n milliseconds to complete if you have 2^n elements.
Then it takes log_2(n) time for n elements. The Big O notation, loosely speaking, means that the relationship only needs to be true for large n, and that constant factors and smaller terms can be ignored.
The logarithm
Ok let's try and fully understand what a logarithm actually is.
Imagine we have a rope and we have tied it to a horse. If the rope is directly tied to the horse, the force the horse would need to pull away (say, from a man) is directly 1.
Now imagine the rope is looped round a pole. The horse to get away will now have to pull many times harder. The amount of times will depend on the roughness of the rope and the size of the pole, but let's assume it will multiply one's strength by 10 (when the rope makes a complete turn).
Now if the rope is looped once, the horse will need to pull 10 times harder. If the human decides to make it really difficult for the horse, he may loop the rope again round a pole, increasing its strength by an additional 10 times. A third loop will again increase the strength by a further 10 times.
We can see that for each loop, the value increases by a factor of 10. The number of turns required to get any number is called the logarithm of the number, i.e. we need 3 posts to multiply your strength by 1,000 times, and 6 posts to multiply your strength by 1,000,000.
3 is the logarithm of 1,000, and 6 is the logarithm of 1,000,000 (base 10).
So what does O(log n) actually mean?
In our example above, our 'growth rate' is O(log n). For every additional loop, the force our rope can handle is 10 times more:
Turns | Max Force
0 | 1
1 | 10
2 | 100
3 | 1000
4 | 10000
n | 10^n
Now the example above did use base 10, but fortunately the base of the log is insignificant when we talk about Big-O notation.
Now let's imagine you are trying to guess a number between 1-100.
Your Friend: Guess my number between 1-100!
Your Guess: 50
Your Friend: Lower!
Your Guess: 25
Your Friend: Lower!
Your Guess: 13
Your Friend: Higher!
Your Guess: 19
Your Friend: Higher!
Your Guess: 22
Your Friend: Lower!
Your Guess: 20
Your Friend: Higher!
Your Guess: 21
Your Friend: YOU GOT IT!
Now it took you 7 guesses to get this right. But what is the relationship here? What is the largest number of items that you can guess from with each additional guess?
Guesses | Items
1 | 2
2 | 4
3 | 8
4 | 16
5 | 32
6 | 64
7 | 128
10 | 1024
Using the table, we can see that if we use a binary search to guess a number between 1-100 it will take us at most 7 attempts. If we had 128 numbers, we could also guess the number in 7 attempts, but 129 numbers will take us at most 8 attempts (in relation to logarithms, here we would need 7 guesses for a 128 value range, 10 guesses for a 1024 value range. 7 is the logarithm of 128, 10 is the logarithm of 1024 (base 2)).
Notice the phrase 'at most'. Big-O notation is most often used to describe the worst case. If you're lucky, you could guess the number in one attempt and so the best case is O(1), but that's another story.
We can see that for every guess our data set is shrinking. A good rule of thumb to identify if an algorithm has a logarithmic time is to see if the data set shrinks by a constant factor (for example, by half) after each iteration.
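The guessing game can be sketched in code. This is only an illustration: the function names are invented, the guesser always picks the middle of the remaining range (so its guesses may differ from the dialogue above), and the final correct guess is counted as a guess.

```c
/* Plays the higher/lower game for a secret in 1..max_value using binary
 * search and returns how many guesses were needed, counting the final
 * correct one. Returns -1 if the secret is outside 1..max_value. */
int guesses_needed(int secret, int max_value)
{
    int low = 1, high = max_value, guesses = 0;
    while (low <= high) {
        int guess = low + (high - low) / 2;
        guesses++;
        if (guess == secret)
            return guesses;
        if (guess < secret)
            low = guess + 1;   /* friend says "Higher!" */
        else
            high = guess - 1;  /* friend says "Lower!" */
    }
    return -1;
}

/* Worst case over every possible secret in 1..max_value. */
int worst_case_guesses(int max_value)
{
    int worst = 0;
    for (int secret = 1; secret <= max_value; secret++) {
        int g = guesses_needed(secret, max_value);
        if (g > worst)
            worst = g;
    }
    return worst;
}
```

Calling worst_case_guesses(100) checks every secret from 1 to 100 and reports the largest number of guesses any of them needed.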
What about O(n log n)?
You will eventually come across a linearithmic time O(n log(n)) algorithm. The rule of thumb above applies again, but this time the logarithmic function has to run n times, e.g. reducing the size of a list n times, which occurs in algorithms like mergesort.
You can often identify if the algorithmic time is n log n. Look for an outer loop which iterates through a list (O(n)). Then look to see if there is an inner loop. If the inner loop is cutting/reducing the data set on each iteration, that loop is O(log n), and so the overall algorithm is O(n log n).
Disclaimer: The rope-logarithm example was taken from the excellent book Mathematician's Delight by W. W. Sawyer.
Logarithmic running time (O(log n)) essentially means that the running time grows in proportion to the logarithm of the input size - as an example, if 10 items takes at most some amount of time x, and 100 items takes at most, say, 2x, and 10,000 items takes at most 4x, then it's looking like an O(log n) time complexity.
First, I recommend reading the following book:
Algorithms (4th Edition)
Here are some functions and their expected complexities. The numbers indicate statement execution frequencies.
The following Big-O Complexity Chart is also taken from bigocheatsheet:
Lastly, here is a very simple showcase of how it is calculated:
Anatomy of a program’s statement execution frequencies.
Analyzing the running time of a program (example).
You can think of O(log N) intuitively by saying the time is proportional to the number of digits in N.
If an operation performs constant time work on each digit or bit of an input, the whole operation will take time proportional to the number of digits or bits in the input, not the magnitude of the input; thus, O(log N) rather than O(N).
If an operation makes a series of constant time decisions each of which halves (reduces by a factor of 3, 4, 5..) the size of the input to be considered, the whole will take time proportional to log base 2 (base 3, base 4, base 5...) of the size N of the input, rather than being O(N).
And so on.
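The "number of digits" view can be sketched as follows. The helper name count_digits is hypothetical, and the sketch assumes a base of at least 2.

```c
/* Returns the number of digits needed to write n in the given base.
 * Assumes base >= 2. Each pass removes one digit, so the loop runs
 * roughly log_base(n) times. */
int count_digits(unsigned long n, unsigned int base)
{
    int digits = 1;
    while (n >= base) {
        n /= base;
        digits++;
    }
    return digits;
}
```

An operation doing constant work per digit of N therefore does about log(N) work in total, even though N itself may be huge.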
What's log_b(n)?
It is the number of times you can cut a log of length n repeatedly into b equal parts before reaching a section of size 1.
The best way I've always had to mentally visualize an algorithm that runs in O(log n) is as follows:
If you increase the problem size by a multiplicative amount (i.e. multiply its size by 10), the work is only increased by an additive amount.
Applying this to your binary tree question so you have a good application: if you double the number of nodes in a binary tree, the height only increases by 1 (an additive amount). If you double it again, it still only increased by 1. (Obviously I'm assuming it stays balanced and such). That way, instead of doubling your work when the problem size is multiplied, you're only doing very slightly more work. That's why O(log n) algorithms are awesome.
Divide and conquer algorithms usually have a logn component to the running time. This comes from the repeated halving of the input.
In the case of binary search, every iteration you throw away half of the input. It should be noted that in Big-O notation, log is log base 2.
Edit: As noted, the log base doesn't matter, but when deriving the Big-O performance of an algorithm, the log factor will come from halving, hence why I think of it as base 2.
O(log n) is a bit vague about the base; in these examples it is really O(log2 n), i.e. the logarithm with base 2, although changing the base only changes a constant factor.
The height of a balanced binary tree is O(log2 n), since every node has up to two (note the "two" as in log2 n) child nodes. So, a tree with n nodes has a height of about log2 n.
Another example is binary search, which has a running time of O(log2 n) because at every step you divide the search space by 2.
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I would rephrase this as 'height of a complete binary tree is log n'. Figuring the height of a complete binary tree would be O(log n), if you were traversing down step by step.
I cannot understand how to identify a function with a logarithmic time.
Logarithm is essentially the inverse of exponentiation. So, if each 'step' of your function is eliminating a factor of elements from the original item set, that is a logarithmic time algorithm.
For the tree example, you can easily see that stepping down a level of nodes cuts down an exponential number of elements as you continue traversing. The popular example of looking through a name-sorted phone book is essentially equivalent to traversing down a binary search tree (middle page is the root element, and you can deduce at each step whether to go left or right).
O(log n) refers to a function (or algorithm, or step in an algorithm) working in an amount of time proportional to the logarithm (usually base 2 in most cases, but not always, and in any event this is insignificant by big-O notation*) of the size of the input.
The logarithmic function is the inverse of the exponential function. Put another way, if your input grows exponentially (rather than linearly, as you would normally consider it), your function grows linearly.
O(log n) running times are very common in any sort of divide-and-conquer application, because you are (ideally) cutting the work in half every time. If in each of the division or conquer steps, you are doing constant time work (or work that is not constant-time, but with time growing more slowly than O(log n)), then your entire function is O(log n). It's fairly common to have each step require linear time on the input instead; this will amount to a total time complexity of O(n log n).
The running time complexity of binary search is an example of O(log n). This is because in binary search, you are always ignoring half of your input in each later step by dividing the array in half and only focusing on one half with each step. Each step is constant-time, because in binary search you only need to compare one element with your key in order to figure out what to do next, regardless of how big the array you are considering is at any point. So you do approximately log(n)/log(2) steps.
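Here is a minimal sketch of binary search on a sorted int array that also reports how many elements it compared against the key. The signature and the steps out-parameter are assumptions made for this illustration, not a standard library interface.

```c
#include <stddef.h>

/* Returns the index of key in sorted[0..n-1], or -1 if it is absent.
 * *steps receives the number of elements compared against key. */
long binary_search(const int *sorted, size_t n, int key, int *steps)
{
    size_t low = 0, high = n;   /* the search window is [low, high) */
    *steps = 0;
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        (*steps)++;
        if (sorted[mid] == key)
            return (long)mid;
        if (sorted[mid] < key)
            low = mid + 1;      /* discard the left half */
        else
            high = mid;         /* discard the right half */
    }
    return -1;
}
```

Each pass through the loop does a constant amount of work and at least halves the window, which is where the roughly log(n)/log(2) step count comes from.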
The running time complexity of merge sort is an example of O(n log n). This is because you are dividing the array in half with each step, resulting in a total of approximately log(n)/log(2) steps. However, in each step you need to perform merge operations on all elements (whether it's one merge operation on two sublists of n/2 elements, or two merge operations on four sublists of n/4 elements, is irrelevant because it adds to having to do this for n elements in each step). Thus, the total complexity is O(n log n).
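The merge sort argument can be checked with a small sketch that sorts an int array and counts how many elements are copied during merging. This is one straightforward top-down variant written for illustration; the function names and the idea of returning a copy count are assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

/* Sorts a[lo..hi) using tmp as scratch space and returns the number of
 * elements written while merging. */
static long merge_sort_range(int *a, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return 0;   /* zero or one element is already sorted */

    size_t mid = lo + (hi - lo) / 2;
    long copies = merge_sort_range(a, tmp, lo, mid)
                + merge_sort_range(a, tmp, mid, hi);

    /* Merge the two sorted halves into tmp, then copy the result back. */
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < hi)
        tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);

    return copies + (long)(hi - lo);
}

/* Sorts a[0..n) in place. Returns the merge copy count, or -1 if the
 * scratch buffer cannot be allocated. */
long merge_sort(int *a, size_t n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    long copies = merge_sort_range(a, tmp, 0, n);
    free(tmp);
    return copies;
}
```

For 8 elements there are 3 levels of merging and each level touches all 8 elements, so the count is 8 * 3 = 24, matching n times log_2(n).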
*Remember that in big-O notation, by definition, constant factors don't matter. Also, by the change of base rule for logarithms, the only difference between logarithms of different bases is a constant factor.
These 2 cases will take O(log n) time
case 1:
void f(int n) {
    int i;
    for (i = 1; i < n; i = i * 2)
        printf("%d", i);
}
case 2:
void f(int n) {
    int i;
    for (i = n; i >= 1; i = i / 2)
        printf("%d", i);
}
It simply means that the time needed for this task grows with log(n) (example: 2s for n = 10, 4s for n = 100, ...). Read the Wikipedia articles on Binary Search Algorithm and Big O Notation for more details.
Simply put: At each step of your algorithm you can cut the work in half. (Asymptotically equivalent to third, fourth, ...)
If you plot a logarithmic function on a graphical calculator or something similar, you'll see that it rises really slowly -- even more slowly than a linear function.
This is why algorithms with a logarithmic time complexity are highly sought after: even for really big n (let's say n = 10^8, for example), they perform more than acceptably.
I can add something interesting that I read in the book by Cormen et al. a long time ago. Now, imagine a problem where we have to find a solution in a problem space. This problem space should be finite.
Now, if you can prove that at every iteration of your algorithm you cut off a fraction of this space that is no less than some limit, this means that your algorithm is running in O(logN) time.
I should point out that we are talking here about a relative fraction limit, not an absolute one. Binary search is a classical example. At each step we throw away 1/2 of the problem space. But binary search is not the only such example. Suppose you proved somehow that at each step you throw away at least 1/128 of the problem space. That means your program is still running in O(logN) time, although significantly slower than binary search. This is a very good hint in analyzing recursive algorithms. It can often be proved that at each step the recursion will not use several variants, and this leads to the cutoff of some fraction of the problem space.
Actually, if you have a list of n elements, and create a binary tree from that list (like in the divide and conquer algorithm), you will keep dividing by 2 until you reach lists of size 1 (the leaves).
At the first step, you divide by 2. You then have 2 lists (2^1). You divide each by 2, so you have 4 lists (2^2). You divide again, and you have 8 lists (2^3), and so on until your list size is 1.
That gives you the equation :
n/(2^steps)=1 <=> n=2^steps <=> lg(n)=steps
(you take the lg of each side, lg being the log base 2)
I can give an example with a for loop; once the concept is grasped, maybe it will be simpler to understand in different contexts.
Logarithmic time here means that the step of the loop grows exponentially. E.g.
for (i=1; i<=n; i=i*2) {;}
The complexity in O-notation of this program is O(log(n)). Let's try to loop through it by hand (n being somewhere between 512 and 1023, i.e. excluding 1024):
step: 1 2 3 4 5 6 7 8 9 10
i: 1 2 4 8 16 32 64 128 256 512
Although n is somewhere between 512 and 1023, only 10 iterations take place. This is because the step in the loop grows exponentially and thus takes only 10 iterations to reach the termination.
The logarithm of x (to the base of a) is the reverse function of a^x.
It is like saying that logarithm is the inverse of exponential.
Now try to see it this way: if the exponential grows very fast, then the logarithm grows (inversely) very slowly.
The difference between O(n) and O(log(n)) is huge, similar to the difference between O(n) and O(a^n) (a being a constant).
Every time we write an algorithm or code we try to analyze its asymptotic complexity.
It is different from its time complexity.
Asymptotic complexity is the behavior of execution time of an algorithm while the time complexity is the actual execution time. But some people use these terms interchangeably.
Because time complexity depends on various parameters, viz.
1. Physical system
2. Programming language
3. Coding style
4. And much more...
The actual execution time is not a good measure for analysis.
Instead we take input size as the parameter because whatever the code is, the input is the same.
So the execution time is a function of input size.
Following is an example of Linear Time Algorithm
Linear Search
Given n input elements, to search for an element in the array you need at most n comparisons. In other words, no matter what programming language you use, what coding style you prefer, or what system you execute it on, in the worst-case scenario it requires only n comparisons. The execution time is linearly proportional to the input size.
And it's not just search: whatever the work may be (increment, compare or any other operation), it's a function of input size.
So when you say any algorithm is O(log n),
it means the execution time is proportional to the logarithm of the input size n.
As the input size increases, the work done (here the execution time) increases. (Hence proportionality)
n Work
2 1 units of work
4 2 units of work
8 3 units of work
See, as the input size increased, the work done increased, and it is independent of any machine.
And if you try to find out the value of a unit of work,
it actually depends on the parameters specified above. It will change according to the system and so on.
The complete binary example is O(ln n) because the search looks like this:
1 2 3 4 5 6 7 8 9 10 11 12
Searching for 4 yields 3 hits: 6, 3 then 4. And log2 12 ≈ 3.58, which is a good approximation to how many hits were needed.
log x to base b = y is the inverse of b^y = x
If you have an M-ary tree of depth d and size n, then:
traversing the whole tree ~ O(M^d) = O(n)
Walking a single path in the tree ~ O(d) = O(log n to base M)
In information technology it means that:
f(n)=O(g(n)) if there are suitable constants C and N0, independent of n,
such that
for all n>N0 "C*g(n) > f(n) > 0" is true.
And it seems that this notation was mostly taken from mathematics.
In this article there is a quote:
D.E. Knuth, "BIG OMICRON AND BIG OMEGA AND BIG THETA", 1976:
On the basis of the issues discussed here, I propose that members of
SIGACT, and editors of computer science and mathematics journals,
adopt notations as defined above, unless a better alternative can be
found reasonably soon.
It is 2016 now, and we still use it today.
In mathematical analysis it means that:
lim (f(n)/g(n))=Constant; where n goes to +infinity
But even in mathematical analysis this symbol was sometimes used with the meaning "C*g(n) > f(n) > 0".
As far as I know from university, the symbol was introduced by the German mathematician Landau (1877-1938).
O(log n) is one of the time complexity classes used to describe the runtime performance of code.
I hope you have already heard of the binary search algorithm.
Let's assume you have to find an element in an array of size N.
Basically, the size of the part still being searched shrinks like this:
N
N/2
N/4
N/8....etc
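At each level you do only a constant amount of work, and there are about log2(N) levels before the size reaches 1, so the total is O(log n). (Summing the sizes themselves, N(1 + 1/2 + 1/4 + ...), would give O(N), not O(log n).)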
If you are looking for a intuition based answer I would like to put up two interpretations for you.
Imagine a very high hill with a very broad base as well. To reach the top of the hill there are two ways: one is a dedicated pathway going spirally around the hill reaching at the top, the other: small terrace like carvings cut out to provide a staircase. Now if the first way is reaching in linear time O(n), the second one is O(log n).
Imagine an algorithm which accepts an integer n as input and completes in time proportional to n; then it is O(n) or theta(n). But if it runs in time proportional to the number of digits, or the number of bits in the binary representation of the number, then the algorithm runs in O(log n) or theta(log n) time.
Some algorithms in the divide-and-conquer paradigm have complexity O(log n). One example is calculating your own power function:
int power(int x, unsigned int y)
{
    int temp;
    if (y == 0)
        return 1;
    temp = power(x, y/2);
    if (y%2 == 0)
        return temp*temp;
    else
        return x*temp*temp;
}
from http://www.geeksforgeeks.org/write-a-c-program-to-calculate-powxn/
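To see why this power function is logarithmic in y, the sketch below follows the same halving of the exponent but returns the number of calls made instead of the result. The name power_call_count is made up for this illustration.

```c
/* Follows the same recursion as power(x, y): one call for the current
 * exponent plus the calls needed for y / 2, stopping at y == 0.
 * Returns the total number of calls. */
unsigned int power_call_count(unsigned int y)
{
    if (y == 0)
        return 1;
    return 1 + power_call_count(y / 2);
}
```

The exponent is halved on every call, so doubling y adds just one more call.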
The instructor said that the complexity of an algorithm is typically measured with respect to its input size.
So, when we say an algorithm is linear, then even if you give it an input size of 2^n (say 2^n being the number of nodes in a binary tree), the algorithm is still linear to the input size?
The above seems to be what the instructor means, but I’m having a hard time wrapping my head around it. If you give it a 2^n input, which is exponential to some parameter ‘n’, but then call this input “x”, then, sure, your algorithm is linear to x. But deep-down, isn’t it still exponential in ‘n’? What’s the point of saying it’s linear to x?
Whenever you see the term "linear," you should ask - linear in what? Usually, when we talk about an algorithm's runtime being "linear time," we mean "the algorithm's runtime is O(n), where n is the size of the input."
You're asking what happens if, say, n = 2^k and we're passing in an exponentially-sized input into the function. In that case, since the runtime is O(n) and n = 2^k, then the overall runtime would be O(2^k). There's no contradiction here between this statement and the fact that the algorithm runs in linear time, since "linear time" means "linear as a function of the size of the input."
Notice that I'm explicitly choosing to use a different variable k in the expression 2^k to call attention to the fact that there are indeed two different quantities here - the size of the input as a function of k (which is 2^k) and the variable n representing the size of the input to the function more generally. You sometimes see this combined, as in "if the runtime of the algorithm is O(n), then running the algorithm on an input of size 2^n takes time O(2^n)." That statement is true but a bit tricky to parse, since n is playing two different roles there.
If an algorithm has linear time complexity, then it is linear regardless of the size of the input, whether that size is fixed or is itself a quadratic or exponential function of some other parameter.
Obviously, running the algorithm on inputs of those different sizes will take different amounts of time, but the complexity is still O(n).
Perhaps this example will help: does running merge sort on an array of size 16 mean merge sort is O(1), because it took a constant number of operations to sort that array? The answer is no.
When we say an algorithm is O(n), we mean that its running time is linear in the input size n. Hence, if n is exponential in terms of another parameter k (for example n = 2^k), the algorithm is still linear with regard to the input size.
Another example is the time complexity of binary search on an input array of size n. We say that binary search on a sorted array of size n is O(log(n)). This means that, with regard to the input size, it takes asymptotically at most log(n) comparisons to find an item in an array of size n.
Let's say you are printing the first n numbers, and printing each number takes 3 operations:
n-> 10, number of operations -> 3 x 10 = 30
n-> 100, number of operations -> 3 x 100 = 300
n-> 1000, number of operations -> 3 x 1000 = 3000
n ->10000, we can also say, n = 100^2 (say k^2),
number of operations --> 3 x 10000 = 30,000
Even though n is a power of something (in this case 100^2), the number of operations depends solely on the size of the input (n, which is 10,000).
So we can say it is a linear time complexity algorithm.
In computer science, the iterated logarithm of n, written log* n (usually read "log star"), is the number of times the logarithm function must be iteratively applied before the result is less than or equal to 1. The simplest formal definition is the result of this recursive function:
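The standard recursive definition is log* n = 0 if n <= 1, and log* n = 1 + log*(log n) otherwise. Below is a minimal Python sketch of it, written iteratively. The base-2 logarithm and the function name log_star are assumptions made for the illustration.

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm (base 2 assumed): how many times log2 must be
    applied before the value drops to 1 or below."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


for n in (1, 2, 4, 16, 65536):
    print(n, log_star(n))
```

The function grows extremely slowly: going from 16 to 65536 only raises the result by one.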
Is there any algorithm with time complexity O(lg * n) ?
If you implement the union-find algorithm with path compression and union by rank, both union and find run in amortized O(log*(n)) time (the tighter known bound is O(α(n)), where α is the inverse Ackermann function).
It's rare but not unheard of to see log* n appear in the runtime analysis of algorithms. Here are a couple of cases that tend to cause log* n to appear.
Approach 1: Shrinking By a Log Factor
Many divide-and-conquer algorithms work by converting an input of size n into an input of size n / k. The number of phases of these algorithms is then O(log n), since you can only divide by a constant O(log n) times before you shrink your input down to a constant size. In that sense, when you see "the input is divided by a constant," you should think "so it can only be divided O(log n) times before we run out of things to divide."
In rarer cases, some algorithms work by shrinking the size of the input down by a logarithmic factor. For example, one data structure for the range semigroup query problem works by breaking a larger problem down into blocks of size log n, then recursively subdividing each block of size log n into blocks of size log log n, etc. This process eventually stops once the blocks hit some small constant size, which means that it stops after O(log* n) iterations. (This particular approach can then be improved to give a data structure in which the blocks have size log* n for an overall number of rounds of O(log** n), eventually converging to an optimal structure with runtime O(α(n)), where α(n) is the inverse Ackermann function.)
Approach 2: Compressing Digits of Numbers
The above section talks about approaches that explicitly break a larger problem down into smaller pieces whose sizes are logarithmic in the size of the original problem. However, there's another way to take an input of size n and reduce it to an input of size O(log n): replace the input with something roughly comparable in size to its number of digits. Since writing out the number n requires O(log n) digits, this has the effect of shrinking the size of the number down by the amount needed to get an O(log* n) term to arise.
As a simple example of this, consider an algorithm to compute the digital root of a number. This is the number you get by repeatedly adding the digits of a number up until you're down to a single digit. For example, the digital root of 78979871 can be found by computing
7 + 8 + 9 + 7 + 9 + 8 + 7 + 1 = 56
5 + 6 = 11
1 + 1 = 2
2
and getting a digital root of two. Each time we sum the digits of the number, we replace the number n with a number that's at most 9 ⌈log10 n⌉, so the number of rounds is O(log* n). (That being said, the total runtime is O(log n), since we have to factor in the work associated with adding up the digits of the number, and adding the digits of the original number dominates the runtime.)
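The digit-summing rounds described above can be sketched in a few lines of Python. The function name digital_root_rounds is hypothetical, and the sketch assumes a non-negative integer in base 10; it returns both the digital root and the number of summing rounds it took.

```python
def digital_root_rounds(n: int):
    """Return (digital root, number of digit-summing rounds) for n >= 0."""
    rounds = 0
    while n >= 10:                       # stop once a single digit is left
        n = sum(int(d) for d in str(n))  # one round: add up the decimal digits
        rounds += 1
    return n, rounds


root, rounds = digital_root_rounds(78979871)
print(root, rounds)  # 78979871 -> 56 -> 11 -> 2
```

Each round costs time proportional to the current number of digits, which is why the first round dominates the total work even though the number of rounds is tiny.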
For a more elaborate example, there is a parallel algorithm for 3-coloring the nodes of a tree described in the paper "Parallel Symmetry-Breaking in Sparse Graphs" by Goldberg et al. The algorithm works by repeatedly replacing numbers with simpler numbers formed by summing up certain bits of the numbers, and the number of rounds needed, like the approach mentioned above, is O(log* n).
Hope this helps!
Is there a data structure representing a large set S of (64-bit) integers, that starts out empty and supports the following two operations:
insert(s) inserts the number s into S;
minmod(m) returns the number s in S such that s mod m is minimal.
An example:
insert(11)
insert(15)
minmod(7) -> the answer is 15 (which mod 7 = 1)
insert(14)
minmod(7) -> the answer is 14 (which mod 7 = 0)
minmod(10) -> the answer is 11 (which mod 10 = 1)
I am interested in minimizing the maximal total time spent on a sequence of n such operations. It is obviously possible to just maintain a list of elements for S and iterate through them for every minmod operation; then insert is O(1) and minmod is O(|S|), which would take O(n^2) time for n operations (e.g., n/2 insert operations followed by n/2 minmod operations would take roughly n^2/4 operations).
So: is it possible to do better than O(n^2) for a sequence of n operations? Maybe O(n sqrt(n)) or O(n log(n))? If this is possible, then I would also be interested to know if there are data structures that additionally admit removing single elements from S, or removing all numbers within an interval.
Another idea based on balanced binary search tree, as in Keith's answer.
Suppose all elements inserted so far are stored in a balanced BST, and we need to compute minmod(m). Consider our set S as a union of subsets of numbers lying in the intervals [0, m-1], [m, 2m-1], [2m, 3m-1], etc. The answer will obviously be among the minimal numbers we have in each of those intervals. So we can successively search the tree to find the minimal number in each interval. This is easy to do: for example, to find the minimal number in [a,b], we move left if the current value is greater than a and right otherwise, keeping track of the minimal value in [a,b] we've met so far.
Now if we suppose that m is uniformly distributed in [1, 2^64], let's calculate the mathematical expectation of number of queries we'll need.
For all m in [2^63, 2^64-1] we'll need 2 queries. The probability of this is 1/2.
For all m in [2^62, 2^63-1] we'll need 4 queries. The probability of this is 1/4.
...
The mathematical expectation will be sum[ 1/(2^k) * 2^k ], for k in [1,64], which is 64 queries.
So, to sum up, the average minmod(m) query complexity will be O(64 * log n). In general, if m has no known upper bound, this will be O(log m * log n). The BST update is, as is well known, O(log n), so the overall complexity for n queries will be O(n * log m * log n).
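Here is a rough Python sketch of the interval idea, under several assumptions: all numbers are non-negative, a sorted Python list with bisect stands in for the balanced BST (so insert is O(n) here rather than O(log n)), and the class name MinModSet is made up for the illustration. It also adds one shortcut of my own that is not in the description above: after finding the smallest element of one interval, it jumps straight past that interval instead of probing empty ones.

```python
import bisect


class MinModSet:
    """Sorted-list stand-in for the balanced BST (hypothetical name)."""

    def __init__(self):
        self._items = []  # always kept sorted

    def insert(self, s: int) -> None:
        bisect.insort(self._items, s)  # O(n) in a list; O(log n) in a real BST

    def minmod(self, m: int):
        """Return the element with the smallest value mod m (None if empty)."""
        best = None
        k = 0  # index of the interval [k*m, (k+1)*m - 1] being examined
        while True:
            # Smallest stored element that is >= k*m.
            idx = bisect.bisect_left(self._items, k * m)
            if idx == len(self._items):
                break
            x = self._items[idx]  # minimum of the interval that contains x
            if best is None or x % m < best % m:
                best = x
            k = x // m + 1  # skip to the interval after the one containing x
        return best


s = MinModSet()
s.insert(11)
s.insert(15)
print(s.minmod(7))   # 15
s.insert(14)
print(s.minmod(7))   # 14
print(s.minmod(10))  # 11
```

Each loop iteration is one O(log n) lookup, so the cost of a query is the number of non-empty intervals times log n, which is small when m is large.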
Partial answer too big for a comment.
Suppose you implement S as a balanced binary search tree.
When you seek S.minmod(m), naively you walk the whole tree, and the cost is O(n) per query (O(n^2) over n operations).
However, at a given time during the walk, you have the best (lowest) result so far. You can use this to avoid checking whole sub-trees when:
bestSoFar < leftChild mod m
and
rightChild - leftChild < m - leftChild mod m
This will only help much if the common spacing between the numbers in the set is smaller than common values of m.
Update the next morning...
Grigor has better and more fully articulated my idea and shown how it works well for "large" m. He also shows how a "random" m is typically "large", so works well.
Grigor's algorithm is so efficient for large m that one needs to think about the risk for much smaller m.
So it is clear that you need to think about the distribution of m and optimise for different cases if need be.
For example, it might be worth simply keeping track of the minimal modulus for very small m.
But suppose m ~ 2^32? Then the search algorithm (certainly as given but also otherwise) needs to check 2^32 intervals, which may amount to searching the whole set anyway.
I am learning about Big O Notation running times and amortized times. I understand the notion of O(n) linear time, meaning that the size of the input affects the growth of the algorithm proportionally...and the same goes for, for example, quadratic time O(n^2), etc., even for algorithms, such as permutation generators, with O(n!) times, that grow by factorials.
For example, the following function is O(n) because the algorithm grows in proportion to its input n:
#include <stdio.h>
void f(int n) {
    int i;
    for (i = 0; i < n; ++i)
        printf("%d", i);
}
Similarly, if there was a nested loop, the time would be O(n^2).
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I do know (maybe not in great detail) what Logarithm is, in the sense that: log_10 100 = 2, but I cannot understand how to identify a function with a logarithmic time.
I cannot understand how to identify a function with a log time.
The most common attributes of logarithmic running-time function are that:
the choice of the next element on which to perform some action is one of several possibilities, and
only one will need to be chosen.
or
the elements on which the action is performed are digits of n
This is why, for example, looking up people in a phone book is O(log n). You don't need to check every person in the phone book to find the right one; instead, you can simply divide-and-conquer by looking based on where their name is alphabetically, and in every section you only need to explore a subset of each section before you eventually find someone's phone number.
Of course, a bigger phone book will still take you a longer time, but it won't grow as quickly as the proportional increase in the additional size.
We can expand the phone book example to compare other kinds of operations and their running time. We will assume our phone book has businesses (the "Yellow Pages") which have unique names and people (the "White Pages") which may not have unique names. A phone number is assigned to at most one person or business. We will also assume that it takes constant time to flip to a specific page.
Here are the running times of some operations we might perform on the phone book, from fastest to slowest:
O(1) (in the worst case): Given the page that a business's name is on and the business name, find the phone number.
O(1) (in the average case): Given the page that a person's name is on and their name, find the phone number.
O(log n): Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies. (This is a binary search for a person's name.)
O(n): Find all people whose phone numbers contain the digit "5".
O(n): Given a phone number, find the person or business with that number.
O(n log n): There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.
For the below examples, we're now at the printer's office. Phone books are waiting to be mailed to each resident or business, and there's a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.
O(n log n): We want to personalize the phone book, so we're going to find each person or business's name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage.
O(n^2): A mistake occurred at the office, and every entry in each of the phone books has an extra "0" at the end of the phone number. Take some white-out and remove each zero.
O(n · n!): We're ready to load the phonebooks onto the shipping dock. Unfortunately, the robot that was supposed to load the books has gone haywire: it's putting the books onto the truck in a random order! Even worse, it loads all the books onto the truck, then checks to see if they're in the right order, and if not, it unloads them and starts over. (This is the dreaded bogo sort.)
O(n^n): You fix the robot so that it's loading things correctly. The next day, one of your co-workers plays a prank on you and wires the loading dock robot to the automated printing systems. Every time the robot goes to load an original book, the factory printer makes a duplicate run of all the phonebooks! Fortunately, the robot's bug-detection systems are sophisticated enough that the robot doesn't try printing even more copies when it encounters a duplicate book for loading, but it still has to load every original and duplicate book that's been printed.
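To see the O(log n) lookup from the list above in action, here is a small Python sketch of a binary search over a sorted phone book. Everything in it is made up for the illustration: the function name find_number, the 1,000 generated names and the 555 numbers. It assumes unique names (as in the Yellow Pages case) and counts how many entries it has to look at.

```python
def find_number(book, name):
    """Binary search in a list of (name, number) pairs sorted by name.

    Returns (number or None, how many entries were inspected).
    """
    lo, hi = 0, len(book) - 1
    inspected = 0
    while lo <= hi:
        mid = (lo + hi) // 2          # open the book halfway through what is left
        inspected += 1
        entry_name, number = book[mid]
        if entry_name == name:
            return number, inspected
        if entry_name < name:
            lo = mid + 1              # the name is in the later half
        else:
            hi = mid - 1              # the name is in the earlier half
    return None, inspected


# A made-up book with 1,000 unique, alphabetically sorted names.
book = [(f"name{i:04d}", f"555-{i:04d}") for i in range(1000)]

print(find_number(book, "name0421"))
print(max(find_number(book, name)[1] for name, _ in book))
```

The second line printed is the worst case over every name in the book; for 1,000 entries it stays at 10 inspections, and doubling the book would add only one more.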
O(log N) basically means time goes up linearly while the n goes up exponentially. So if it takes 1 second to compute 10 elements, it will take 2 seconds to compute 100 elements, 3 seconds to compute 1000 elements, and so on.
It is O(log n) when we use divide-and-conquer algorithms, e.g. binary search. Another example is quicksort, where each time we divide the array into two parts, and each time it takes O(N) time to partition around a pivot element. Hence it is O(N log N) (on average).
Many good answers have already been posted to this question, but I believe we really are missing an important one - namely, the illustrated answer.
What does it mean to say that the height of a complete binary tree is O(log n)?
The following drawing depicts a binary tree. Notice how each level contains double the number of nodes compared to the level above (hence binary):
Binary search is an example with complexity O(log n). Let's say that the nodes in the bottom level of the tree in figure 1 represents items in some sorted collection. Binary search is a divide-and-conquer algorithm, and the drawing shows how we will need (at most) 4 comparisons to find the record we are searching for in this 16 item dataset.
Assume we instead had a dataset with 32 elements. Continue the drawing above to find that we will now need 5 comparisons to find what we are searching for, as the tree has only grown one level deeper when we doubled the amount of data. As a result, the complexity of the algorithm can be described as being of logarithmic order.
Plotting log(n) on a plain piece of paper, will result in a graph where the rise of the curve decelerates as n increases:
Overview
Others have given good diagram examples, such as the tree diagrams. I did not see any simple code examples. So in addition to my explanation, I'll provide some algorithms with simple print statements to illustrate the complexity of different algorithm categories.
First, you'll want to have a general idea of logarithms, which you can get from https://en.wikipedia.org/wiki/Logarithm . The natural sciences use e and the natural log. Engineering disciplines will use log_10 (log base 10), and computer scientists will use log_2 (log base 2) a lot, since computers are binary based. Natural log is usually abbreviated as ln(), engineers normally leave the _10 off and just write log(), and log_2 is abbreviated as lg(). All of the types of logarithms grow in a similar fashion, which is why they share the same category of log(n).
When you look at the code examples below, I recommend looking at O(1), then O(n), then O(n^2). After you are good with those, then look at the others. I've included clean examples as well as variations to demonstrate how subtle changes can still result in the same categorization.
You can think of O(1), O(n), O(logn), etc. as classes or categories of growth. Some categories will take more time than others. These categories give us a way of ordering algorithm performance. Some grow faster than others as the input n grows. The following table demonstrates said growth numerically. In the table below, think of log(n) as the ceiling of log_2.
Simple Code Examples Of Various Big O Categories:
O(1) - Constant Time Examples:
Algorithm 1:
Algorithm 1 prints hello once and it doesn't depend on n, so it will always run in constant time, so it is O(1).
print "hello";
Algorithm 2:
Algorithm 2 prints hello 3 times, however it does not depend on an input size. Even as n grows, this algorithm will always only print hello 3 times. That being said 3, is a constant, so this algorithm is also O(1).
print "hello";
print "hello";
print "hello";
O(log(n)) - Logarithmic Examples:
Algorithm 3 - This acts like "log_2"
Algorithm 3 demonstrates an algorithm that runs in log_2(n). Notice the post operation of the for loop multiplies the current value of i by 2, so i goes from 1 to 2 to 4 to 8 to 16 to 32 ...
for(int i = 1; i <= n; i = i * 2)
print "hello";
Algorithm 4 - This acts like "log_3"
Algorithm 4 demonstrates log_3. Notice i goes from 1 to 3 to 9 to 27...
for(int i = 1; i <= n; i = i * 3)
print "hello";
Algorithm 5 - This acts like "log_1.02"
Algorithm 5 is important, as it helps show that as long as the number is greater than 1 and the result is repeatedly multiplied against itself, that you are looking at a logarithmic algorithm.
for(double i = 1; i < n; i = i * 1.02)
print "hello";
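To check that all three loops really behave logarithmically, here is a small Python sketch that counts the iterations instead of printing hello. The helper name count_iterations is made up, and it uses the i <= n condition from Algorithms 3 and 4 for all three factors (Algorithm 5 uses i < n, which differs by at most one iteration).

```python
import math


def count_iterations(n: float, factor: float) -> int:
    """Count passes of: for (i = 1; i <= n; i = i * factor)."""
    count = 0
    i = 1.0
    while i <= n:
        count += 1
        i *= factor
    return count


for n, factor in ((1024, 2), (1000, 3), (1000, 1.02)):
    print(
        f"n={n} factor={factor}: {count_iterations(n, factor)} iterations, "
        f"log base {factor} of n is about {math.log(n, factor):.1f}"
    )
```

In each case the iteration count is within one of the logarithm of n in the base given by the factor, even for a factor as small as 1.02.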
O(n) - Linear Time Examples:
Algorithm 6
This algorithm is simple: it prints hello n times.
for(int i = 0; i < n; i++)
print "hello";
Algorithm 7
This algorithm shows a variation, where it will print hello n/2 times. n/2 = 1/2 * n. We ignore the 1/2 constant and see that this algorithm is O(n).
for(int i = 0; i < n; i = i + 2)
print "hello";
O(n*log(n)) - nlog(n) Examples:
Algorithm 8
Think of this as a combination of O(log(n)) and O(n). The nesting of the for loops helps us obtain O(n*log(n)).
for(int i = 0; i < n; i++)
for(int j = 1; j < n; j = j * 2)
print "hello";
Algorithm 9
Algorithm 9 is like algorithm 8, but each of the loops has allowed variations, which still result in the final result being O(n*log(n))
for(int i = 0; i < n; i = i + 2)
for(int j = 1; j < n; j = j * 3)
print "hello";
O(n^2) - n squared Examples:
Algorithm 10
O(n^2) is obtained easily by nesting standard for loops.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
print "hello";
Algorithm 11
Like algorithm 10, but with some variations.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j = j + 2)
print "hello";
O(n^3) - n cubed Examples:
Algorithm 12
This is like algorithm 10, but with 3 loops instead of 2.
for(int i = 0; i < n; i++)
for(int j = 0; j < n; j++)
for(int k = 0; k < n; k++)
print "hello";
Algorithm 13
Like algorithm 12, but with some variations that still yield O(n^3).
for(int i = 0; i < n; i++)
for(int j = 0; j < n + 5; j = j + 2)
for(int k = 0; k < n; k = k + 3)
print "hello";
Summary
The above gives several straightforward examples, and variations that help demonstrate which subtle changes can be introduced without really changing the analysis. Hopefully it gives you enough insight.
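If you want to convince yourself of the categories, you can count the prints instead of reading them. The following Python sketch mirrors Algorithm 8 and Algorithm 10; the function names count_alg8 and count_alg10 are made up, and a counter replaces each print "hello".

```python
def count_alg8(n: int) -> int:
    """Algorithm 8: outer loop runs n times, inner loop doubles j while j < n."""
    count = 0
    for _ in range(n):
        j = 1
        while j < n:
            count += 1  # stands in for: print "hello"
            j *= 2
    return count


def count_alg10(n: int) -> int:
    """Algorithm 10: two plain nested loops of n iterations each."""
    count = 0
    for _ in range(n):
        for _ in range(n):
            count += 1  # stands in for: print "hello"
    return count


for n in (16, 64, 256):
    print(n, count_alg8(n), count_alg10(n))
```

For n = 16 this gives 64 versus 256 prints, and the gap widens quickly as n grows, which is the practical difference between O(n*log(n)) and O(n^2).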
The explanation below is using the case of a fully balanced binary tree to help you understand how we get logarithmic time complexity.
A binary tree is a case where a problem of size n is divided into sub-problems of size n/2 until we reach a problem of size 1:
And that's how you get O(log n) which is the amount of work that needs to be done on the above tree to reach a solution.
A common algorithm with O(log n) time complexity is binary search, whose recurrence relation is T(n) = T(n/2) + O(1), i.e. at every subsequent level of the tree you divide the problem in half and do a constant amount of additional work.
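As a worked sketch, the binary search recurrence can be unrolled by hand. This assumes n is a power of two and writes the constant work per level as c (both are simplifications for the illustration):

```latex
\begin{aligned}
T(n) &= T(n/2) + c \\
     &= T(n/4) + 2c \\
     &= T(n/8) + 3c \\
     &\;\;\vdots \\
     &= T(n/2^k) + kc \\
     &= T(1) + c\log_2 n \qquad (\text{taking } k = \log_2 n) \\
     &= O(\log n).
\end{aligned}
```

The recursion bottoms out after log_2 n halvings, and each level contributes only the constant c, which is where the logarithm comes from.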
If you had a function that takes:
1 millisecond to complete if you have 2 elements.
2 milliseconds to complete if you have 4 elements.
3 milliseconds to complete if you have 8 elements.
4 milliseconds to complete if you have 16 elements.
...
k milliseconds to complete if you have 2^k elements.
Then it takes log2(n) time for n elements. The Big O notation, loosely speaking, means that the relationship only needs to be true for large n, and that constant factors and smaller terms can be ignored.
The logarithm
Ok let's try and fully understand what a logarithm actually is.
Imagine we have a rope and we have tied it to a horse. If the rope is tied directly to the horse, the force the horse needs to pull away (say, from a man) is just 1.
Now imagine the rope is looped round a pole. To get away, the horse will now have to pull many times harder. How many times will depend on the roughness of the rope and the size of the pole, but let's assume it multiplies one's strength by 10 (when the rope makes a complete turn).
Now if the rope is looped once, the horse will need to pull 10 times harder. If the human decides to make it really difficult for the horse, he may loop the rope round the pole again, increasing its strength by an additional 10 times. A third loop will again increase the strength by a further 10 times.
We can see that for each loop, the value is multiplied by 10. The number of turns required to get any number is called the logarithm of that number, i.e. we need 3 turns to multiply your strength by 1,000 and 6 turns to multiply your strength by 1,000,000.
3 is the logarithm of 1,000, and 6 is the logarithm of 1,000,000 (base 10).
So what does O(log n) actually mean?
In our example above, our 'growth rate' is O(log n). For every additional loop, the force our rope can handle is 10 times more:
Turns | Max Force
0 | 1
1 | 10
2 | 100
3 | 1000
4 | 10000
n | 10^n
Now the example above did use base 10, but fortunately the base of the log is insignificant when we talk about big o notation.
Now let's imagine you are trying to guess a number between 1-100.
Your Friend: Guess my number between 1-100!
Your Guess: 50
Your Friend: Lower!
Your Guess: 25
Your Friend: Lower!
Your Guess: 13
Your Friend: Higher!
Your Guess: 19
Your Friend: Higher!
Your Guess: 22
Your Friend: Lower!
Your Guess: 20
Your Friend: Higher!
Your Guess: 21
Your Friend: YOU GOT IT!
Now, it took you 7 guesses to get this right. But what is the relationship here? What is the greatest number of items you can handle with each additional guess?
Guesses | Items
1 | 2
2 | 4
3 | 8
4 | 16
5 | 32
6 | 64
7 | 128
10 | 1024
Using the table, we can see that if we use a binary search to guess a number between 1-100, it will take us at most 7 attempts. If we had 128 numbers, we could also guess the number in 7 attempts, but 129 numbers will take us at most 8 attempts (in relation to logarithms, here we would need 7 guesses for a 128 value range and 10 guesses for a 1024 value range; 7 is the logarithm of 128 and 10 is the logarithm of 1024 (base 2)).
Notice the emphasis on 'at most'. Big-O notation here describes the worst case. If you're lucky, you could guess the number in one attempt, so the best case is O(1), but that's another story.
We can see that for every guess our data set is shrinking. A good rule of thumb to identify if an algorithm has a logarithmic time is
to see if the data set shrinks by a certain order after each iteration
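The table above can be reproduced with a short Python sketch. It does not replay the dialogue; it simply counts how many halving steps are needed, in the worst case, to narrow a range down to a single number. The function name halvings_needed is hypothetical, and "worst case" is modelled by always keeping the larger half.

```python
def halvings_needed(n_items: int) -> int:
    """Worst-case number of halving steps to narrow n_items down to one."""
    steps = 0
    while n_items > 1:
        n_items = (n_items + 1) // 2  # keep the larger half (worst case)
        steps += 1
    return steps


for n in (100, 128, 129, 1024):
    print(n, halvings_needed(n))
```

A range of 100 or 128 numbers needs 7 steps, 129 needs 8, and 1024 needs 10, matching the base-2 logarithms in the table.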
What about O(n log n)?
You will eventually come across a linearithmic time O(n log(n)) algorithm. The rule of thumb above applies again, but this time the logarithmic function has to run n times, e.g. reducing the size of a list n times, which occurs in algorithms like mergesort.
You can easily identify whether the algorithmic time is n log n. Look for an outer loop which iterates through a list (O(n)). Then look to see if there is an inner loop. If the inner loop is cutting/reducing the data set on each iteration, that loop is O(log n), and so the overall algorithm is O(n log n).
Disclaimer: The rope-logarithm example was taken from the excellent book Mathematician's Delight by W. W. Sawyer.
Logarithmic running time (O(log n)) essentially means that the running time grows in proportion to the logarithm of the input size - as an example, if 10 items takes at most some amount of time x, and 100 items takes at most, say, 2x, and 10,000 items takes at most 4x, then it's looking like an O(log n) time complexity.
First, I recommend reading the following book:
Algorithms (4th Edition)
Here are some functions and their expected complexities. The numbers indicate statement execution frequencies.
The following Big-O complexity chart is also taken from bigocheatsheet.
Lastly, here is a very simple showcase of how it is calculated:
Anatomy of a program's statement execution frequencies.
Analyzing the running time of a program (example).
You can think of O(log N) intuitively by saying the time is proportional to the number of digits in N.
If an operation performs constant time work on each digit or bit of an input, the whole operation will take time proportional to the number of digits or bits in the input, not the magnitude of the input; thus, O(log N) rather than O(N).
If an operation makes a series of constant time decisions each of which halves (reduces by a factor of 3, 4, 5..) the size of the input to be considered, the whole will take time proportional to log base 2 (base 3, base 4, base 5...) of the size N of the input, rather than being O(N).
And so on.
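A quick Python sketch of the digit intuition: counting the digits of N by repeated division takes one step per digit, so the number of steps grows with log N rather than with N. The function name count_digits is made up, and the sketch assumes N is a positive integer.

```python
def count_digits(n: int, base: int = 10) -> int:
    """Number of digits of a positive integer n in the given base."""
    digits = 0
    while n > 0:
        n //= base   # drop the last digit: constant work per digit
        digits += 1
    return digits


for n in (9, 10, 999, 1000, 10 ** 6):
    print(n, count_digits(n), count_digits(n, base=2))
```

Multiplying N by 1,000 adds only three decimal digits (about ten binary digits), no matter how large N already is.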
What's log_b(n)?
It is the number of times you can cut a log of length n repeatedly into b equal parts before reaching a section of size 1.
The best way I've always had to mentally visualize an algorithm that runs in O(log n) is as follows:
If you increase the problem size by a multiplicative amount (i.e. multiply its size by 10), the work is only increased by an additive amount.
Applying this to your binary tree question so you have a good application: if you double the number of nodes in a binary tree, the height only increases by 1 (an additive amount). If you double it again, it still only increased by 1. (Obviously I'm assuming it stays balanced and such). That way, instead of doubling your work when the problem size is multiplied, you're only doing very slightly more work. That's why O(log n) algorithms are awesome.
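Here is a tiny Python sketch of that "multiply the size, add to the work" behaviour. It assumes a complete (perfectly balanced) binary tree and measures height in edges from the root to the deepest leaf; the function name balanced_height is made up.

```python
def balanced_height(n_nodes: int) -> int:
    """Height (in edges) of a complete binary tree with n_nodes >= 1 nodes.

    This is floor(log2(n_nodes)), computed with integer arithmetic.
    """
    return n_nodes.bit_length() - 1


n = 1000
for _ in range(5):
    print(n, balanced_height(n))
    n *= 2  # double the number of nodes each time
```

Every doubling of the node count raises the height by exactly one.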
Divide and conquer algorithms usually have a logn component to the running time. This comes from the repeated halving of the input.
In the case of binary search, every iteration you throw away half of the input. It should be noted that in Big-O notation, log is log base 2.
Edit: As noted, the log base doesn't matter, but when deriving the Big-O performance of an algorithm, the log factor will come from halving, hence why I think of it as base 2.
O(log n) usually means O(log_2 n) here, i.e. the logarithm with base 2 (the base only changes a constant factor, which is why it is normally omitted).
The height of a balanced binary tree is O(log_2 n), since every node has two (note the "two" as in log_2 n) child nodes. So, a tree with n nodes has a height of about log_2 n.
Another example is binary search, which has a running time of O(log_2 n) because at every step you divide the search space by 2.
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I would rephrase this as 'height of a complete binary tree is log n'. Figuring the height of a complete binary tree would be O(log n), if you were traversing down step by step.
I cannot understand how to identify a function with a logarithmic time.
Logarithm is essentially the inverse of exponentiation. So, if each 'step' of your function is eliminating a factor of elements from the original item set, that is a logarithmic time algorithm.
For the tree example, you can easily see that stepping down a level of nodes cuts down an exponential number of elements as you continue traversing. The popular example of looking through a name-sorted phone book is essentially equivalent to traversing down a binary search tree (middle page is the root element, and you can deduce at each step whether to go left or right).
O(log n) refers to a function (or algorithm, or step in an algorithm) working in an amount of time proportional to the logarithm (usually base 2 in most cases, but not always, and in any event this is insignificant by big-O notation*) of the size of the input.
The logarithmic function is the inverse of the exponential function. Put another way, if your input grows exponentially (rather than linearly, as you would normally consider it), your function grows linearly.
O(log n) running times are very common in any sort of divide-and-conquer application, because you are (ideally) cutting the work in half every time. If in each of the division or conquer steps, you are doing constant time work (or work that is not constant-time, but with time growing more slowly than O(log n)), then your entire function is O(log n). It's fairly common to have each step require linear time on the input instead; this will amount to a total time complexity of O(n log n).
The running time complexity of binary search is an example of O(log n). This is because in binary search, you are always ignoring half of your input in each later step by dividing the array in half and only focusing on one half with each step. Each step is constant-time, because in binary search you only need to compare one element with your key in order to figure out what to do next, regardless of how big the array you are considering is at any point. So you do approximately log(n)/log(2) steps.
The running time complexity of merge sort is an example of O(n log n). This is because you are dividing the array in half with each step, resulting in a total of approximately log(n)/log(2) steps. However, in each step you need to perform merge operations on all elements (whether it's one merge operation on two sublists of n/2 elements, or two merge operations on four sublists of n/4 elements, is irrelevant because it adds to having to do this for n elements in each step). Thus, the total complexity is O(n log n).
*Remember that in big-O notation, by definition, constants don't matter. Also, by the change of base rule for logarithms, the only difference between logarithms of different bases is a constant factor.
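To illustrate the merge sort argument, here is a plain top-down merge sort in Python with a counter added. The counter argument (a one-element list) is an addition for the illustration and is not part of the algorithm; it records how many elements each merge writes out.

```python
def merge_sort(items, counter):
    """Return a sorted copy of items; counter[0] accumulates merged elements."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)

    # Merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])

    counter[0] += len(merged)  # every element of both halves is handled once
    return merged


work = [0]
print(merge_sort([8, 3, 5, 1, 7, 2, 6, 4], work), work[0])
```

For 8 elements there are 3 levels of merging and each level handles all 8 elements, so the counter ends at 8 * 3 = 24, which is n log_2 n.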
These 2 cases will take O(log n) time
case 1:
void f(int n) {
    int i;
    for (i = 1; i < n; i = i * 2)   /* i doubles on each pass */
        printf("%d", i);
}
case 2:
void f(int n) {
    int i;
    for (i = n; i >= 1; i = i / 2)  /* i halves on each pass */
        printf("%d", i);
}
It simply means that the time needed for this task grows with log(n) (example: 2s for n = 10, 4s for n = 100, ...). Read the Wikipedia articles on Binary Search Algorithm and Big O Notation for more details.
Simply put: At each step of your algorithm you can cut the work in half. (Asymptotically equivalent to third, fourth, ...)
If you plot a logarithmic function on a graphical calculator or something similar, you'll see that it rises really slowly -- even more slowly than a linear function.
This is why algorithms with a logarithmic time complexity are highly sought after: even for really big n (let's say n = 10^8, for example), they perform more than acceptably.
I can add something interesting that I read in the book by Cormen et al. a long time ago. Now, imagine a problem where we have to find a solution in a problem space. This problem space should be finite.
Now, if you can prove, that at every iteration of your algorithm you cut off a fraction of this space, that is no less than some limit, this means that your algorithm is running in O(logN) time.
I should point out, that we are talking here about a relative fraction limit, not the absolute one. The binary search is a classical example. At each step we throw away 1/2 of the problem space. But binary search is not the only such example. Suppose, you proved somehow, that at each step you throw away at least 1/128 of problem space. That means, your program is still running at O(logN) time, although significantly slower than the binary search. This is a very good hint in analyzing of recursive algorithms. It often can be proved that at each step the recursion will not use several variants, and this leads to the cutoff of some fraction in problem space.
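A rough numerical check of this claim in Python: the sketch below counts how many steps it takes to shrink a problem space of size N down to 1 when each step removes a fixed fraction of what is left. The function name steps_to_shrink is made up, and the sizes are treated as real numbers for simplicity.

```python
def steps_to_shrink(n: float, fraction_removed: float) -> int:
    """Steps until a space of size n shrinks to 1 or less, when every step
    removes fraction_removed of the remaining space."""
    steps = 0
    size = float(n)
    while size > 1:
        size *= 1 - fraction_removed
        steps += 1
    return steps


for n in (10 ** 3, 10 ** 6, 10 ** 9):
    print(n, steps_to_shrink(n, 1 / 2), steps_to_shrink(n, 1 / 128))
```

Removing only 1/128 per step needs far more steps than halving, but in both columns the step count merely doubles when N is squared, which is the signature of logarithmic growth.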
Actually, if you have a list of n elements, and create a binary tree from that list (like in the divide and conquer algorithm), you will keep dividing by 2 until you reach lists of size 1 (the leaves).
At the first step, you divide by 2. You then have 2 lists (2^1). You divide each by 2, so you have 4 lists (2^2); you divide again and have 8 lists (2^3), and so on until your list size is 1.
That gives you the equation :
n/(2^steps)=1 <=> n=2^steps <=> lg(n)=steps
(you take the lg of each side, lg being the log base 2)
I can give an example with a for loop; once the concept is grasped there, it may be simpler to understand in other contexts.
O(log n) here means that the loop variable grows exponentially. E.g.
for (i=1; i<=n; i=i*2) {;}
The complexity in O-notation of this program is O(log(n)). Let's try to loop through it by hand (n being somewhere between 512 and 1023, i.e. less than 1024):
step: 1 2 3 4 5 6 7 8 9 10
i: 1 2 4 8 16 32 64 128 256 512
Although n is somewhere between 512 and 1023, only 10 iterations take place. This is because the loop variable grows exponentially, so it takes only 10 iterations to reach the termination condition.
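The hand trace can be reproduced with a small counting function. This is a sketch only; count_doublings is a hypothetical name, and it assumes n is small enough that i * 2 never overflows a long.

```c
/* Hypothetical helper: counts the iterations of
   for (i = 1; i <= n; i = i * 2).
   Assumes n is small enough that i * 2 does not overflow. */
int count_doublings(long n) {
    int count = 0;
    for (long i = 1; i <= n; i = i * 2)
        ++count;
    return count;
}
```

Every n from 512 to 1023 gives 10 iterations; the count only rises to 11 once n reaches 1024.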
The logarithm of x (to the base a) is the inverse function of a^x.
It is like saying that the logarithm is the inverse of the exponential.
Now try to see it that way: if the exponential grows very fast, then the logarithm grows (inversely) very slowly.
The difference between O(n) and O(log(n)) is huge, similar to the difference between O(n) and O(a^n) (a being a constant).
Every time we write an algorithm or code we try to analyze its asymptotic complexity.
It is different from its time complexity.
Asymptotic complexity is the behavior of execution time of an algorithm while the time complexity is the actual execution time. But some people use these terms interchangeably.
This is because the actual execution time depends on various parameters, viz.
1. Physical System
2. Programming Language
3. coding Style
4. And much more ......
The actual execution time is not a good measure for analysis.
Instead we take the input size as the parameter, because whatever the code is, the input is the same.
So the execution time is a function of input size.
Following is an example of Linear Time Algorithm
Linear Search
Given n input elements, to search for an element in the array you need at most n comparisons. In other words, no matter what programming language you use, what coding style you prefer, or what system you execute it on, the worst case requires n comparisons. The execution time is linearly proportional to the input size.
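As an illustration, a linear search that also reports how many comparisons it made might look like the sketch below. The comparisons out-parameter is an addition for demonstration purposes only.

```c
/* Returns the index of target in a[0..n-1], or -1 if it is absent.
   *comparisons receives the number of element comparisons made. */
int linear_search(const int *a, int n, int target, int *comparisons) {
    *comparisons = 0;
    for (int i = 0; i < n; ++i) {
        ++*comparisons;
        if (a[i] == target)
            return i;
    }
    return -1;
}
```

Searching a 5-element array for a value that is not present makes exactly 5 comparisons, the worst case for n = 5.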
And it's not just search: whatever the work may be (increment, compare, or any other operation), it is a function of the input size.
So when you say an algorithm is O(log n),
it means the execution time grows in proportion to the logarithm of the input size n.
As the input size increases, the work done (here the execution time) increases, but only logarithmically.
n Work
2 1 units of work
4 2 units of work
8 3 units of work
See, as the input size increases, the work done increases, and this is independent of any machine.
But if you try to find out the value of one unit of work,
it actually depends on the parameters listed above. It will change from system to system.
The complete binary example is O(ln n) because the search looks like this:
1 2 3 4 5 6 7 8 9 10 11 12
Searching for 4 yields 3 hits: 6, 3, then 4. And log2 12 ≈ 3.58, which is a good approximation of how many hits were needed.
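The sequence of hits can be reproduced with an ordinary binary search that counts the elements it inspects. This is a sketch; the probes counter and the choice of the lower middle element are assumptions made to match the 6, 3, 4 sequence.

```c
/* Binary search over a sorted array a[0..n-1].
   Returns the index of target or -1; *probes receives the number
   of elements inspected. The names here are illustrative. */
int binary_search(const int *a, int n, int target, int *probes) {
    int lo = 0, hi = n - 1;
    *probes = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* lower middle element */
        ++*probes;
        if (a[mid] == target)
            return mid;
        if (a[mid] < target)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

On the array 1..12, searching for 4 inspects 6, 3 and 4, i.e. 3 probes. No search in 12 elements needs more than 4 probes.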
log x to base b = y is the inverse of b^y = x
If you have an M-ary tree of depth d and size n, then:
traversing the whole tree ~ O(M^d) = O(n)
Walking a single path in the tree ~ O(d) = O(log n to base M)
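The relation between size and depth can be sketched in code. The function below is hypothetical and assumes a complete M-ary tree with the root at depth 0, so that a tree of depth d holds 1 + M + M^2 + ... + M^d nodes.

```c
/* Hypothetical helper: depth (root at depth 0) of the shallowest
   complete m-ary tree that can hold n nodes.
   Assumes m >= 2 and n >= 1. */
int min_depth(long n, int m) {
    long level_size = 1;   /* nodes on the current deepest level */
    long capacity = 1;     /* nodes in the whole tree so far */
    int depth = 0;
    while (capacity < n) {
        level_size *= m;
        capacity += level_size;
        ++depth;
    }
    return depth;
}
```

A binary tree with 15 nodes has depth 3, while a 10-ary tree needs only depth 6 to hold a million nodes, so a single root-to-leaf path stays short even when the whole tree is large.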
In information technology it means that:
f(n)=O(g(n)) if there are suitable constants C and N0, independent of n,
such that
for all n>N0 "C*g(n) > f(n) > 0" is true.
And it seems that this notation was mostly taken from mathematics.
In this article there is a quote:
D.E. Knuth, "BIG OMICRON AND BIG OMEGA AND BIG THETA", 1976:
On the basis of the issues discussed here, I propose that members of
SIGACT, and editors of computer science and mathematics journals,
adopt notations as defined above, unless a better alternative can be
found reasonably soon.
It is 2016 now, and we still use it today.
In mathematical analysis it means that:
lim (f(n)/g(n))=Constant; where n goes to +infinity
But even in mathematical analysis this symbol was sometimes used with the meaning "C*g(n) > f(n) > 0".
As far as I know from university, the symbol was introduced by the German mathematician Landau (1877-1938).
O(log n) is one of the common time complexities used to describe the runtime performance of code.
I hope you have already heard of the binary search algorithm.
Let's assume you have to find an element in a sorted array of size N.
Basically, the code execution goes like this:
N
N/2
N/4
N/8....etc
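Each level handles half as many elements as the previous one, so about log2 N levels are needed before a single element remains. Binary search does a constant amount of work per level, so the total is O(log n). (Summing the sizes themselves, N(1 + 1/2 + 1/4 + ...), would give about 2N, which is O(n), not O(log n).)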
If you are looking for an intuition-based answer, I would like to put up two interpretations for you.
Imagine a very high hill with a very broad base. There are two ways to reach the top: one is a dedicated pathway spiralling around the hill up to the top; the other is a staircase of small terrace-like steps cut into the slope. If the first way takes linear time, O(n), the second one takes O(log n).
Imagine an algorithm which accepts an integer n as input. If it completes in time proportional to n, then it is O(n) or theta(n); but if it runs in time proportional to the number of digits, or the number of bits in the binary representation of the number, then the algorithm runs in O(log n) or theta(log n) time.
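The second interpretation is easy to try out: counting the digits of n takes one loop iteration per digit. A minimal sketch follows, with count_digits as a made-up name.

```c
/* Number of digits of n when written in the given base (base >= 2).
   Zero is treated as having one digit. */
int count_digits(unsigned long n, unsigned base) {
    int digits = 1;
    while (n >= base) {
        n /= base;      /* drop the lowest digit */
        ++digits;
    }
    return digits;
}
```

One million has 7 decimal digits and 20 binary digits, so an algorithm that does constant work per digit performs only a handful of steps for an input of that size.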
Divide and conquer algorithms that recurse into only one half of the problem, doing constant work per step, have complexity O(log n). One example here: calculating your own power function,
int power(int x, unsigned int y)
{
int temp;
if( y == 0)
return 1;
temp = power(x, y/2);
if (y%2 == 0)
return temp*temp;
else
return x*temp*temp;
}
from http://www.geeksforgeeks.org/write-a-c-program-to-calculate-powxn/
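To check that the recursion depth really is logarithmic in y, the same function can be instrumented with a call counter. This variant is a sketch for illustration only: power_counted and the calls parameter are not part of the linked article, and long long is used merely to leave more room before overflow.

```c
/* Same recursion as power() above, with a counter added purely
   for illustration. *calls must be set to 0 by the caller. */
long long power_counted(long long x, unsigned int y, int *calls) {
    ++*calls;
    if (y == 0)
        return 1;
    long long temp = power_counted(x, y / 2, calls);
    if (y % 2 == 0)
        return temp * temp;
    return x * temp * temp;
}
```

Computing 2^10 visits y = 10, 5, 2, 1, 0, which is 5 calls, and 2^30 needs only 6, so the number of calls grows with the number of bits in y rather than with y itself.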