what does O(N) mean [duplicate] - performance

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
What is Big O notation? Do you use it?
Hi all,
fairly basic scalability notation question.
I recently recieved a comment on a post that my python ordered-list implimentation
"but beware that your 'ordered set' implementation is O(N) for insertions"
Which is great to know, but I'm not sure what this means.
I've seen notation such as n(o) o(N), N(o-1) or N(o*o)
what does the above notation refer to?

The comment was referring to the Big-O Notation.
Briefly:
O(1) means in constant time -
independent of the number of items.
O(N) means in proportion to the
number of items.
O(log N) means a time proportional to
log(N)
Basically any 'O' notation means an operation will take time up to a maximum of k*f(N)
where:
k is a constant multiplier
f() is a function that depends on N

O(n) is Big O Notation and refers to the complexity of a given algorithm. n refers to the size of the input, in your case it's the number of items in your list.
O(n) means that your algorithm will take on the order of n operations to insert an item. e.g. looping through the list once (or a constant number of times such as twice or only looping through half).
O(1) means it takes a constant time, that it is not dependent on how many items are in the list.
O(n^2) means that for every insert, it takes n*n operations. i.e. 1 operation for 1 item, 4 operations for 2 items, 9 operations for 3 items. As you can see, O(n^2) algorithms become inefficient for handling large number of items.
For lists O(n) is not bad for insertion, but not the quickest. Also note that O(n/2) is considered as being the same as O(n) because they both grow at the same rate with n.

It's called Big O Notation: http://en.wikipedia.org/wiki/Big_O_notation
So saying that insertion is O(n) means that you have to walk through the whole list (or half of it -- big O notation ignores constant factors) to perform the insertion.
This looks like a nice introduction: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/

Specifically O(n) means that if there's 2x as many items in the list, it'll takes No more than twice as long, if there's 50 times as many it'll take No more than 50 times as long. See the wikipedia article dreeves pointed out for more details
Edit (in bold above): It was pointed out that Big-O does represent the upper bound, so if there's twice as many elements in the list, insertion will take at most twice as long, and if there's 50 times as many elements, it would take at most 50 times as long.
If it was additionally Ω(n) (Big Omega of n) then it would take at least twice as long for a list that is twice as big. If your implementation is both O(n) and Ω(n), meaning that it'll take both at least and at most twice as long for a list twice as big, then it can be said to be Θ(n) (Big Theta of n), meaning that it'll take exactly twice as long if there are twice as many elements.
According to Wikipedia (and personal experience, being guilty of it myself) Big-O is often used where Big-Theta is what is meant. It would be technically correct to call your function O(n^n^n^n) because all Big-O says is that your function is no slower than that, but no one would actually say that other than to prove a point because it's not very useful and misleading information, despite it being technically accurate.

It refers to how complex your program is, i.e., how many operations it takes to actually solve a problem. O(n) means that each operation takes the same number of steps as the items in your list, which for insertion, is very slow. Likewise, if you have O(n^2) means that any operation takes "n" squared number of steps to accomplish, and so on... The "O" is for Order of Magnitude, and the the expression in the parentheses is always related to the number of items being manipulated in the procedure.

Short answer: It means that the processing time is in linear relation to the size of input. E.g if the size of input (length of list) triples, the processing time (roughly) triples. And if it increases thousandfold, the processing time also increases in the same magnitude.
Long answer: See the links provided by Ian P and dreeves

This may help:
http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions
O(n): Finding an item in an unsorted
list or a malformed tree (worst case);
adding two n-digit numbers
Good luck!

Wikipedia explains it far better than I can, however it means that if your list size is N, it takes at max N loops/iterations to insert an item. (In effect, you have to iterate over the whole list)
If you want a better understanding, there is a free book from Berkeley that goes more in-depth about the notation.

Related

Do problem constraints change the time complexity of algorithms?

Let's say that the algorithm involves iterating through a string character by character.
If I know for sure that the length of the string is less than, say, 15 characters, will the time complexity be O(1) or will it remain as O(n)?
There are two aspects to this question - the core of the question is, can problem constraints change the asymptotic complexity of an algorithm? The answer to that is yes. But then you give an example of a constraint (strings limited to 15 characters) where the answer is: the question doesn't make sense. A lot of the other answers here are misleading because they address only the second aspect but try to reach a conclusion about the first one.
Formally, the asymptotic complexity of an algorithm is measured by considering a set of inputs where the input sizes (i.e. what we call n) are unbounded. The reason n must be unbounded is because the definition of asymptotic complexity is a statement like "there is some n0 such that for all n ≥ n0, ...", so if the set doesn't contain any inputs of size n ≥ n0 then this statement is vacuous.
Since algorithms can have different running times depending on which inputs of each size we consider, we often distinguish between "average", "worst case" and "best case" time complexity. Take for example insertion sort:
In the average case, insertion sort has to compare the current element with half of the elements in the sorted portion of the array, so the algorithm does about n2/4 comparisons.
In the worst case, when the array is in descending order, insertion sort has to compare the current element with every element in the sorted portion (because it's less than all of them), so the algorithm does about n2/2 comparisons.
In the best case, when the array is in ascending order, insertion sort only has to compare the current element with the largest element in the sorted portion, so the algorithm does about n comparisons.
However, now suppose we add the constraint that the input array is always in ascending order except for its smallest element:
Now the average case does about 3n/2 comparisons,
The worst case does about 2n comparisons,
And the best case does about n comparisons.
Note that it's the same algorithm, insertion sort, but because we're considering a different set of inputs where the algorithm has different performance characteristics, we end up with a different time complexity for the average case because we're taking an average over a different set, and similarly we get a different time complexity for the worst case because we're choosing the worst inputs from a different set. Hence, yes, adding a problem constraint can change the time complexity even if the algorithm itself is not changed.
However, now let's consider your example of an algorithm which iterates over each character in a string, with the added constraint that the string's length is at most 15 characters. Here, it does not make sense to talk about the asymptotic complexity, because the input sizes n in your set are not unbounded. This particular set of inputs is not valid for doing such an analysis with.
In the mathematical sense, yes. Big-O notation describes the behavior of an algorithm in the limit, and if you have a fixed upper bound on the input size, that implies it has a maximum constant complexity.
That said, context is important. All computers have a realistic limit to the amount of input they can accept (a technical upper bound). Just because nothing in the world can store a yottabyte of data doesn't mean saying every algorithm is O(1) is useful! It's about applying the mathematics in a way that makes sense for the situation.
Here are two contexts for your example, one where it makes sense to call it O(1), and one where it does not.
"I decided I won't put strings of length more than 15 into my program, therefore it is O(1)". This is not a super useful interpretation of the runtime. The actual time is still strongly tied to the size of the string; a string of size 1 will run much faster than one of size 15 even if there is technically a constant bound. In other words, within the constraints of your problem there is still a strong correlation to n.
"My algorithm will process a list of n strings, each with maximum size 15". Here we have a different story; the runtime is dominated by having to run through the list! There's a point where n is so large that the time to process a single string doesn't change the correlation. Now it makes sense to consider the time to process a single string O(1), and therefore the time to process the whole list O(n)
That said, Big-O notation doesn't have to only use one variable! There are problems where upper bounds are intrinsic to the algorithm, but you wouldn't put a bound on the input arbitrarily. Instead, you can describe each dimension of your input as a different variable:
n = list length
s = maximum string length
=> O(n*s)
It depends.
If your algorithm's requirements would grow if larger inputs were provided, then the algorithmic complexity can (and should) be evaluated independently of the inputs. So iterating over all the elements of a list, array, string, etc., is O(n) in relation to the length of the input.
If your algorithm is tied to the limited input size, then that fact becomes part of your algorithmic complexity. For example, maybe your algorithm only iterates over the first 15 characters of the input string, regardless of how long it is. Or maybe your business case simply indicates that a larger input would be an indication of a bug in the calling code, so you opt to immediately exit with an error whenever the input size is larger than a fixed number. In those cases, the algorithm will have constant requirements as the input length tends toward very large numbers.
From Wikipedia
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity.
...
In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows.
In practice, almost all inputs have limits: you cannot input a number larger than what's representable by the numeric type, or a string that's larger than the available memory space. So it would be silly to say that any limits change an algorithm's asymptotic complexity. You could, in theory, use 15 as your asymptote (or "particular value"), and therefore use Big-O notation to define how an algorithm grows as the input approaches that size. There are some algorithms with such terrible complexity (or some execution environments with limited-enough resources) that this would be meaningful.
But if your argument (string length) does not tend toward a large enough value for some aspect of your algorithm's complexity to define the growth of its resource requirements, it's arguably not appropriate to use asymptotic notation at all.
NO!
The time complexity of an algorithm is independent of program constraints. Here is (a simple) way of thinking about it:
Say your algorithm iterates over the string and appends all consonants to a list.
Now, for iteration time complexity is O(n). This means that the time taken will increase roughly in proportion to the increase in the length of the string. (Time itself though would vary depending on the time taken by the if statement and Branch Prediction)
The fact that you know that the string is between 1 and 15 characters long will not change how the program runs, it merely tells you what to expect.
For example, knowing that your values are going to be less than 65000 you could store them in a 16-bit integer and not worry about Integer overflow.
Do problem constraints change the time complexity of algorithms?
No.
If I know for sure that the length of the string is less than, say, 15 characters ..."
We already know the length of the string is less than SIZE_MAX. Knowing an upper fixed bound for string length does not make the the time complexity O(1).
Time complexity remains O(n).
Big-O measures the complexity of algorithms, not of code. It means Big-O does not know the physical limitations of computers. A Big-O measure today will be the same in 1 million years when computers, and programmers alike, have evolved beyond recognition.
So restrictions imposed by today's computers are irrelevant for Big-O. Even though any loop is finite in code, that need not be the case in algorithmic terms. The loop may be finite or infinite. It is up to the programmer/Big-O analyst to decide. Only s/he knows which algorithm the code intends to implement. If the number of loop iterations is finite, the loop has a Big-O complexity of O(1) because there is no asymptotic growth with N. If, on the other hand, the number of loop iterations is infinite, the Big-O complexity is O(N) because there is an asymptotic growth with N.
The above is straight from the definition of Big-O complexity. There are no ifs or buts. The way the OP describes the loop makes it O(1).
A fundamental requirement of big-O notation is that parameters do not have an upper limit. Suppose performing an operation on N elements takes a time precisely equal to 3E24*N*N*N / (1E24+N*N*N) microseconds. For small values of N, the execution time would be proportional to N^3, but as N gets larger the N^3 term in the denominator would start to play an increasing role in the computation.
If N is 1, the time would be 3 microseconds.
If N is 1E3, the time would be about 3E33/1E24, i.e. 3.0E9.
If N is 1E6, the time would be about 3E42/1E24, i.e. 3.0E18
If N is 1E7, the time would be 3E45/1.001E24, i.e. ~2.997E21
If N is 1E8, the time would be about 3E48/2E24, i.e. 1.5E24
If N is 1E9, the time would be 3E51/1.001E27, i.e. ~2.997E24
If N is 1E10, the time would be about 3E54/1.000001E30, i.e. 2.999997E24
As N gets bigger, the time would continue to grow, but no matter how big N gets the time would always be less than 3.000E24 seconds. Thus, the time required for this algorithm would be O(1) because one could specify a constant k such that the time necessary to perform the computation with size N would be less than k.
For any practical value of N, the time required would be proportional to N^3, but from an O(N) standpoint the worst-case time requirement is constant. The fact that the time changes rapidly in response to small values of N is irrelevant to the "big picture" behaviour, which is what big-O notation measures.
It will be O(1) i.e. constant.
This is because for calculating time complexity or worst-case time complexity (to be precise), we think of the input as a huge chunk of data and the length of this data is assumed to be n.
Let us say, we do some maximum work C on each part of this input data, which we will consider as a constant.
In order to get the worst-case time complexity, we need to loop through each part of the input data i.e. we need to loop n times.
So, the time complexity will be:
n x C.
Since you fixed n to be less than 15 characters, n can also be assumed as a constant number.
Hence in this case:
n = constant and,
(maximum constant work done) = C = constant
So time complexity is n x C = constant x constant = constant i.e. O(1)
Edit
The reason why I have said n = constant and C = constant for this case, is because the time difference for doing calculations for smaller n will become so insignificant (compared to n being a very large number) for modern computers that we can assume it to be constant.
Otherwise, every function ever build will take some time, and we can't say things like:
lookup time is constant for hashmaps

What exctly is "cn" while calculating the complexity of Merge Sort?

I was reading CLRS today to understand the complexity of Merge sort in a better way. I came across a line that says "where the constant c represents the time required to solve problems of size 1 as well as the time per array element of the divide and combine steps." I know what the author means by problems of size 1 but what exactly is time per array element of the divide and combine steps and why is it "cn"?
The image of the text is given below for reference.
You're right that it's a bit confusing and not very precise of the authors. It's better to use recurrences like to this to count discrete events like comparisons. But then you have to lay down details of a particular implementation, and that gets fiddly. This proof is a kind of estimate, which is good enough since we're only looking for big Omega behavior. Constant factors don't make a difference.
To make sense of it, think of cn (which is c times n) as the amount of time it takes to do a merge step on lists with total length n. So c is a rough expression for the constant time it takes to handle one element: the time it takes to execute one iteration of whatever loop is doing the merging.
Rather than merging, he calls this "combining". He's also proposing there might be a per-element cost to splitting the lists for recursive sorting. In a normal array implementation, there is no such per-element cost. A linked list mergesort will have one, though: a loop that divides a big list into two halves. Then c represents one iteration of both loops.
The recursive term 2T(n/2) is an expression for the amount of time it takes to sort the two sub-lists.
You could make this expression a little more precise by saying
T(n) = 2T(n/2) + cn + k
where k is the constant time of code that runs outside the merge (and split if there is one) loop: function call overhead, sublist length math, etc. You might try solving the recurrence with this extra term as an exercise to prove to yourself that the big Omega result doesn't change.

why O(1) != O(log(n)) ? for n=[integer, long, ...]

for example, say n = Integer.MAX_VALUE or 2^123 then O(log(n)) = 32 and 123 so a small integer. isn't it O(1) ?
what is the difference ? I think, the reason is O(1) is constant but O(log(n)) not. Any other ideas ?
If n is bounded above, then complexity classes involving n make no sense. There is no such thing as "in the limit as 2^123 approaches infinity", except in the old joke that "a pentagon approximates a circle, for sufficiently large values of 5".
Generally, when analysing the complexity of code, we pretend that the input size isn't bounded above by the resource limits of the machine, even though it is. This does lead to some slightly odd things going on around log n, since if n has to fit into a fixed-size int type, then log n has quite a small bound, so the bound is more likely to be useful/relevant.
So sometimes, we're analysing a slightly idealised version of the algorithm, because the actual code written cannot accept arbitrarily large input.
For example, your average quicksort formally uses Theta(log n) stack in the worst case, obviously so with the fairly common implementation that call-recurses on the "small" side of the partition and loop-recurses on the "big" side. But on a 32 bit machine you can arrange to in fact use a fixed-size array of about 240 bytes to store the "todo list", which might be less than some other function you've written based on an algorithm that formally has O(1) stack use. The morals are that implementation != algorithm, complexity doesn't tell you anything about small numbers, and any specific number is "small".
If you want to account for bounds, you could say that, for example, your code to sort an array is O(1) running time, because the array has to be below the size that fits in your PC's address space, and hence the time to sort it is bounded. However, you will fail your CS assignment if you do, and you won't be providing anyone with any useful information :-)
Obviously if you know that the input will always have a fixed number of elements, the algorithm will always run in constant time. Big-O notation is used to denote worse-case running time, which describes the limit when the number of elements grows infinitely large.
The difference is that n isn't fixed. The idea behind Big-O notation is to get an idea of how the size of the input effects the running time (or memory usage). So if an algorithm always takes the same amount of time, whether n = 1 or n = Integer.MAX_VALUE, we say it is O(1). If the algorithm takes a unit of time longer each time the input size doubles, then we say it is O(logn).
Edit: to answer your specific question on the difference between O(1) and O(logn), I'll give you an example. Let's say we want an algorithm that will find the min element in an unsorted array. One approach is to go through each element and keep track of the current min. Another approach is to sort the array and then return the first element.
The first algorithm is O(n), and the second algorithm is O(nlogn). So let's say we start with an array of 16 elements. The first algorithm will run in time 16, the second algorithm will run in time 16*4. If we increase it to 17, then it becomes 17 and 17*4. We might naively say that the second algorithm takes 4 times as long as the first algorithm (if we treat the logn component as constant).
But let's look at what happens when our array contains 2^32 elements. Now the first algorithm takes 2^32 time to complete, where our second algorithm takes 32*2^32 time to complete. It takes 32 times as long. Yes, it's a small difference, but it is still a difference. If the first algorithm takes 1 minute, the second algorithm will take over half an hour!
I think you will get a better idea if it is called O(n^0).
It is a scaling function depending on the input variable N. It is a function, not number, you should never assume any number for the variable N.
It is just like that you say that a function f(x) is 3 because f(100) = 3, it is wrong. It is a function, not any particular number. A constant function f(x) = 1 is still a function, it will never equal to another function g(x) = N, i.e. g(x)=f(x)
Its the growth rate that you want to look at. O(1) implies no growth at all. While O(logn) does have growth. Even though the growth is small it is still growth.
You’re not thinking big enough. Any algorithm that runs on a computer will either run forever or terminate after some small number of steps — since the computer is only a finite state machine, you cannot write algorithms that run for an arbitrary amount of time and then terminate. By that argument, Big-O notation is only theoretical and has no purpose in a real-life computer program. Even O(2^n) hits an upper limit at O(2^INT_MAX), which is equivalent to O(1).
Realistically, though, Big-O can help you out if you know the constant factors. Even if an algorithm has an upper bound of O(log n), and n can have 32 bits, that could mean the difference between a request taking 1 second and 32 seconds.
Big-O shows how running time (or memory, etc) changes as the size of problem changes.
When size of the problem gets 10 times bigger, an O(n) solution takes 10 times as long, an O(log(n)) solution takes a bit longer, and an O(1) solution takes the same time: O(1) means 'changes as fast as constant 1', but constants don't change.
Familiarize yourself with the big-O notation in a bit more detail.
There is a reason why you leave "O(n)" in, and consider to drop "O(log n)". They both are "constants": the former is less than 32, and the latter is less than 232. But you nevertheless have a natural feeling that you can't call O(n) O(1).
However, if log(n) < 32, it means that O(n*logn) algorithm works thirty two times slower than its O(n) version. Big enough to write "log*n"s?

What is big-O notation? How do you come up with figures like O(n)? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Plain english explanation of Big O
I'd imagine this is probably something taught in classes, but as I a self-taught programmer, I've only seen it rarely.
I've gathered it is something to do with the time, and O(1) is the best, while stuff like O(n^n) is very bad, but could someone point me to a basic explanation of what it actually represents, and where these numbers come from?
Big O refers to the worst case run-time order. It is used to show how well an algorithm scales based on the size of the data set (n->number of items).
Since we are only concerned with the order, constant multipliers are ignored, and any terms which increase less quickly than the dominant term are also removed. Some examples:
A single operation or set of operations is O(1), since it takes some constant time (does not vary based on data set size).
A loop is O(n). Each element in the data set is looped over.
A nested loop is O(n^2). A nested nested loop is O(n^3), and onward.
Things like binary tree searching are log(n), which is more difficult to show, but at every level in the tree, the possible number of solutions is halved, so the number of levels is log(n) (provided the tree is balanced).
Something like finding the sum of a set of numbers that is closest to a given value is O(n!), since the sum of each subset needs to be calculated. This is very bad.
It's a way of expressing time complexity.
O(n) means for n elements in a list, it takes n computations to sort the list. Which isn't bad at all. Each increase in n increases time complexity linearly.
O(n^n) is bad, because the amount of computation required to perform a sort (or whatever you are doing) will exponentially increase as you increase n.
O(1) is the best, as it means 1 computation to perform a function, think of hash tables, looking up a value in a hash table has O(1) time complexity.
Big O notation as applied to an algorithm refers to how the run time of the algorithm depends on the amount of input data. For example, a sorting algorithm will take longer to sort a large data set than a small data set. If for the sorting algorithm example you graph the run time (vertical-axis) vs the number of values to sort (horizontal-axis), for numbers of values from zero to a large number, the nature of the line or curve that results will depend on the sorting algorithm used. Big O notation is a shorthand method for describing the line or curve.
In big O notation, the expression in the brackets is the function that is graphed. If a variable (say n) is included in the expression, this variable refers to the size of the input data set. You say O(1) is the best. This is true because the graph f(n) = 1 does not vary with n. An O(1) algorithm takes the same amount of time to complete regardless of the size of the input data set. By contrast, the run time of an algorithm of O(n^n) increases with the square of the size of the input data set.
That is the basic idea, for a detailed explanation, consult the wikipedia page titled 'Big O Notation'.

O(log N) == O(1) - Why not?

Whenever I consider algorithms/data structures I tend to replace the log(N) parts by constants. Oh, I know log(N) diverges - but does it matter in real world applications?
log(infinity) < 100 for all practical purposes.
I am really curious for real world examples where this doesn't hold.
To clarify:
I understand O(f(N))
I am curious about real world examples where the asymptotic behaviour matters more than the constants of the actual performance.
If log(N) can be replaced by a constant it still can be replaced by a constant in O( N log N).
This question is for the sake of (a) entertainment and (b) to gather arguments to use if I run (again) into a controversy about the performance of a design.
Big O notation tells you about how your algorithm changes with growing input. O(1) tells you it doesn't matter how much your input grows, the algorithm will always be just as fast. O(logn) says that the algorithm will be fast, but as your input grows it will take a little longer.
O(1) and O(logn) makes a big diference when you start to combine algorithms.
Take doing joins with indexes for example. If you could do a join in O(1) instead of O(logn) you would have huge performance gains. For example with O(1) you can join any amount of times and you still have O(1). But with O(logn) you need to multiply the operation count by logn each time.
For large inputs, if you had an algorithm that was O(n^2) already, you would much rather do an operation that was O(1) inside, and not O(logn) inside.
Also remember that Big-O of anything can have a constant overhead. Let's say that constant overhead is 1 million. With O(1) that constant overhead does not amplify the number of operations as much as O(logn) does.
Another point is that everyone thinks of O(logn) representing n elements of a tree data structure for example. But it could be anything including bytes in a file.
I think this is a pragmatic approach; O(logN) will never be more than 64. In practice, whenever terms get as 'small' as O(logN), you have to measure to see if the constant factors win out. See also
Uses of Ackermann function?
To quote myself from comments on another answer:
[Big-Oh] 'Analysis' only matters for factors
that are at least O(N). For any
smaller factor, big-oh analysis is
useless and you must measure.
and
"With O(logN) your input size does
matter." This is the whole point of
the question. Of course it matters...
in theory. The question the OP asks
is, does it matter in practice? I
contend that the answer is no, there
is not, and never will be, a data set
for which logN will grow so fast as to
always be beaten a constant-time
algorithm. Even for the largest
practical dataset imaginable in the
lifetimes of our grandchildren, a logN
algorithm has a fair chance of beating
a constant time algorithm - you must
always measure.
EDIT
A good talk:
http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
about halfway through, Rich discusses Clojure's hash tries, which are clearly O(logN), but the base of the logarithm is large and so the depth of the trie is at most 6 even if it contains 4 billion values. Here "6" is still an O(logN) value, but it is an incredibly small value, and so choosing to discard this awesome data structure because "I really need O(1)" is a foolish thing to do. This emphasizes how most of the other answers to this question are simply wrong from the perspective of the pragmatist who wants their algorithm to "run fast" and "scale well", regardless of what the "theory" says.
EDIT
See also
http://queue.acm.org/detail.cfm?id=1814327
which says
What good is an O(log2(n)) algorithm
if those operations cause page faults
and slow disk operations? For most
relevant datasets an O(n) or even an
O(n^2) algorithm, which avoids page
faults, will run circles around it.
(but go read the article for context).
This is a common mistake - remember Big O notation is NOT telling you about the absolute performance of an algorithm at a given value, it's simply telling you the behavior of an algorithm as you increase the size of the input.
When you take it in that context it becomes clear why an algorithm A ~ O(logN) and an algorithm B ~ O(1) algorithm are different:
if I run A on an input of size a, then on an input of size 1000000*a, I can expect the second input to take log(1,000,000) times as long as the first input
if I run B on an input of size a, then on an input of size 1000000*a, I can expect the second input to take about the same amount of time as the first input
EDIT: Thinking over your question some more, I do think there's some wisdom to be had in it. While I would never say it's correct to say O(lgN) == O(1), It IS possible that an O(lgN) algorithm might be used over an O(1) algorithm. This draws back to the point about absolute performance above: Just knowing one algorithm is O(1) and another algorithm is O(lgN) is NOT enough to declare you should use the O(1) over the O(lgN), it's certainly possible given your range of possible inputs an O(lgN) might serve you best.
You asked for a real-world example. I'll give you one. Computational biology. One strand of DNA encoded in ASCII is somewhere on the level of gigabytes in space. A typical database will obviously have many thousands of such strands.
Now, in the case of an indexing/searching algorithm, that log(n) multiple makes a large difference when coupled with constants. The reason why? This is one of the applications where the size of your input is astronomical. Additionally, the input size will always continue to grow.
Admittedly, these type of problems are rare. There are only so many applications this large. In those circumstances, though... it makes a world of difference.
Equality, the way you're describing it, is a common abuse of notation.
To clarify: we usually write f(x) = O(logN) to imply "f(x) is O(logN)".
At any rate, O(1) means a constant number of steps/time (as an upper bound) to perform an action regardless of how large the input set is. But for O(logN), number of steps/time still grows as a function of the input size (the logarithm of it), it just grows very slowly. For most real world applications you may be safe in assuming that this number of steps will not exceed 100, however I'd bet there are multiple examples of datasets large enough to mark your statement both dangerous and void (packet traces, environmental measurements, and many more).
For small enough N, O(N^N) can in practice be replaced with 1. Not O(1) (by definition), but for N=2 you can see it as one operation with 4 parts, or a constant-time operation.
What if all operations take 1hour? The difference between O(log N) and O(1) is then large, even with small N.
Or if you need to run the algorithm ten million times? Ok, that took 30minutes, so when I run it on a dataset a hundred times as large it should still take 30minutes because O(logN) is "the same" as O(1).... eh...what?
Your statement that "I understand O(f(N))" is clearly false.
Real world applications, oh... I don't know.... EVERY USE OF O()-notation EVER?
Binary search in sorted list of 10 million items for example. It's the very REASON we use hash tables when the data gets big enough. If you think O(logN) is the same as O(1), then why would you EVER use a hash instead of a binary tree?
As many have already said, for the real world, you need to look at the constant factors first, before even worrying about factors of O(log N).
Then, consider what you will expect N to be. If you have good reason to think that N<10, you can use a linear search instead of a binary one. That's O(N) instead of O(log N), which according to your lights would be significant -- but a linear search that moves found elements to the front may well outperform a more complicated balanced tree, depending on the application.
On the other hand, note that, even if log N is not likely to exceed 50, a performance factor of 10 is really huge -- if you're compute-bound, a factor like that can easily make or break your application. If that's not enough for you, you'll frequently see factors of (log N)^2 or (logN)^3 in algorithms, so even if you think you can ignore one factor of (log N), that doesn't mean you can ignore more of them.
Finally, note that the simplex algorithm for linear programming has a worst case performance of O(2^n). However, for practical problems, the worst case never comes up; in practice, the simplex algorithm is fast, relatively simple, and consequently very popular.
About 30 years ago, someone developed a polynomial-time algorithm for linear programming, but it was not initially practical because the result was too slow.
Nowadays, there are practical alternative algorithms for linear programming (with polynomial-time wost-case, for what that's worth), which can outperform the simplex method in practice. But, depending on the problem, the simplex method is still competitive.
The observation that O(log n) is oftentimes indistinguishable from O(1) is a good one.
As a familiar example, suppose we wanted to find a single element in a sorted array of one 1,000,000,000,000 elements:
with linear search, the search takes on average 500,000,000,000 steps
with binary search, the search takes on average 40 steps
Suppose we added a single element to the array we are searching, and now we must search for another element:
with linear search, the search takes on average 500,000,000,001 steps (indistinguishable change)
with binary search, the search takes on average 40 steps (indistinguishable change)
Suppose we doubled the number of elements in the array we are searching, and now we must search for another element:
with linear search, the search takes on average 1,000,000,000,000 steps (extraordinarily noticeable change)
with binary search, the search takes on average 41 steps (indistinguishable change)
As we can see from this example, for all intents and purposes, an O(log n) algorithm like binary search is oftentimes indistinguishable from an O(1) algorithm like omniscience.
The takeaway point is this: *we use O(log n) algorithms because they are often indistinguishable from constant time, and because they often perform phenomenally better than linear time algorithms.
Obviously, these examples assume reasonable constants. Obviously, these are generic observations and do not apply to all cases. Obviously, these points apply at the asymptotic end of the curve, not the n=3 end.
But this observation explains why, for example, we use such techniques as tuning a query to do an index seek rather than a table scan - because an index seek operates in nearly constant time no matter the size of the dataset, while a table scan is crushingly slow on sufficiently large datasets. Index seek is O(log n).
You might be interested in Soft-O, which ignores logarithmic cost. Check this paragraph in Wikipedia.
What do you mean by whether or not it "matters"?
If you're faced with the choice of an O(1) algorithm and a O(lg n) one, then you should not assume they're equal. You should choose the constant-time one. Why wouldn't you?
And if no constant-time algorithm exists, then the logarithmic-time one is usually the best you can get. Again, does it then matter? You just have to take the fastest you can find.
Can you give me a situation where you'd gain anything by defining the two as equal? At best, it'd make no difference, and at worst, you'd hide some real scalability characteristics. Because usually, a constant-time algorithm will be faster than a logarithmic one.
Even if, as you say, lg(n) < 100 for all practical purposes, that's still a factor 100 on top of your other overhead. If I call your function, N times, then it starts to matter whether your function runs logarithmic time or constant, because the total complexity is then O(n lg n) or O(n).
So rather than asking if "it matters" that you assume logarithmic complexity to be constant in "the real world", I'd ask if there's any point in doing that.
Often you can assume that logarithmic algorithms are fast enough, but what do you gain by considering them constant?
O(logN)*O(logN)*O(logN) is very different. O(1) * O(1) * O(1) is still constant.
Also a simple quicksort-style O(nlogn) is different than O(n O(1))=O(n). Try sorting 1000 and 1000000 elements. The latter isn't 1000 times slower, it's 2000 times, because log(n^2)=2log(n)
The title of the question is misleading (well chosen to drum up debate, mind you).
O(log N) == O(1) is obviously wrong (and the poster is aware of this). Big O notation, by definition, regards asymptotic analysis. When you see O(N), N is taken to approach infinity. If N is assigned a constant, it's not Big O.
Note, this isn't just a nitpicky detail that only theoretical computer scientists need to care about. All of the arithmetic used to determine the O function for an algorithm relies on it. When you publish the O function for your algorithm, you might be omitting a lot of information about it's performance.
Big O analysis is cool, because it lets you compare algorithms without getting bogged down in platform specific issues (word sizes, instructions per operation, memory speed versus disk speed). When N goes to infinity, those issues disappear. But when N is 10000, 1000, 100, those issues, along with all of the other constants that we left out of the O function, start to matter.
To answer the question of the poster: O(log N) != O(1), and you're right, algorithms with O(1) are sometimes not much better than algorithms with O(log N), depending on the size of the input, and all of those internal constants that got omitted during Big O analysis.
If you know you're going to be cranking up N, then use Big O analysis. If you're not, then you'll need some empirical tests.
In theory
Yes, in practical situations log(n) is bounded by a constant, we'll say 100. However, replacing log(n) by 100 in situations where it's correct is still throwing away information, making the upper bound on operations that you have calculated looser and less useful. Replacing an O(log(n)) by an O(1) in your analysis could result in your large n case performing 100 times worse than you expected based on your small n case. Your theoretical analysis could have been more accurate and could have predicted an issue before you'd built the system.
I would argue that the practical purpose of big-O analysis is to try and predict the execution time of your algorithm as early as possible. You can make your analysis easier by crossing out the log(n) terms, but then you've reduced the predictive power of the estimate.
In practice
If you read the original papers by Larry Page and Sergey Brin on the Google architecture, they talk about using hash tables for everything to ensure that e.g. the lookup of a cached web page only takes one hard-disk seek. If you used B-tree indices to lookup you might need four or five hard-disk seeks to do an uncached lookup [*]. Quadrupling your disk requirements on your cached web page storage is worth caring about from a business perspective, and predictable if you don't cast out all the O(log(n)) terms.
P.S. Sorry for using Google as an example, they're like Hitler in the computer science version of Godwin's law.
[*] Assuming 4KB reads from disk, 100bn web pages in the index, ~ 16 bytes per key in a B-tree node.
As others have pointed out, Big-O tells you about how the performance of your problem scales. Trust me - it matters. I have encountered several times algorithms that were just terrible and failed to meet the customers demands because they were too slow. Understanding the difference and finding an O(1) solution is a lot of times a huge improvement.
However, of course, that is not the whole story - for instance, you may notice that quicksort algorithms will always switch to insertion sort for small elements (Wikipedia says 8 - 20) because of the behaviour of both algorithms on small datasets.
So it's a matter of understanding what tradeoffs you will be doing which involves a thorough understanding of the problem, the architecture, & experience to understand which to use, and how to adjust the constants involved.
No one is saying that O(1) is always better than O(log N). However, I can guarantee you that an O(1) algorithm will also scale way better, so even if you make incorrect assumptions about how many users will be on the system, or the size of the data to process, it won't matter to the algorithm.
Yes, log(N) < 100 for most practical purposes, and No, you can not always replace it by constant.
For example, this may lead to serious errors in estimating performance of your program. If O(N) program processed array of 1000 elements in 1 ms, then you are sure it will process 106 elements in 1 second (or so). If, though, the program is O(N*logN), then it will take it ~2 secs to process 106 elements. This difference may be crucial - for example, you may think you've got enough server power because you get 3000 requests per hour and you think your server can handle up to 3600.
Another example. Imagine you have function f() working in O(logN), and on each iteration calling function g(), which works in O(logN) as well. Then, if you replace both logs by constants, you think that your program works in constant time. Reality will be cruel though - two logs may give you up to 100*100 multiplicator.
The rules of determining the Big-O notation are simpler when you don't decide that O(log n) = O(1).
As krzysio said, you may accumulate O(log n)s and then they would make a very noticeable difference. Imagine you do a binary search: O(log n) comparisons, and then imagine that each comparison's complexity O(log n). If you neglect both you get O(1) instead of O(log2n). Similarly you may somehow arrive at O(log10n) and then you'll notice a big difference for not too large "n"s.
Assume that in your entire application, one algorithm accounts for 90% of the time the user waits for the most common operation.
Suppose in real time the O(1) operation takes a second on your architecture, and the O(logN) operation is basically .5 seconds * log(N). Well, at this point I'd really like to draw you a graph with an arrow at the intersection of the curve and the line, saying, "It matters right here." You want to use the log(N) op for small datasets and the O(1) op for large datasets, in such a scenario.
Big-O notation and performance optimization is an academic exercise rather than delivering real value to the user for operations that are already cheap, but if it's an expensive operation on a critical path, then you bet it matters!
For any algorithm that can take inputs of different sizes N, the number of operations it takes is upper-bounded by some function f(N).
All big-O tells you is the shape of that function.
O(1) means there is some number A such that f(N) < A for large N.
O(N) means there is some A such that f(N) < AN for large N.
O(N^2) means there is some A such that f(N) < AN^2 for large N.
O(log(N)) means there is some A such that f(N) < AlogN for large N.
Big-O says nothing about how big A is (i.e. how fast the algorithm is), or where these functions cross each other. It only says that when you are comparing two algorithms, if their big-Os differ, then there is a value of N (which may be small or it may be very large) where one algorithm will start to outperform the other.
you are right, in many cases it does not matter for pracitcal purposes. but the key question is "how fast GROWS N". most algorithms we know of take the size of the input, so it grows linearily.
but some algorithms have the value of N derived in a complex way. if N is "the number of possible lottery combinations for a lottery with X distinct numbers" it suddenly matters if your algorithm is O(1) or O(logN)
Big-OH tells you that one algorithm is faster than another given some constant factor. If your input implies a sufficiently small constant factor, you could see great performance gains by going with a linear search rather than a log(n) search of some base.
O(log N) can be misleading. Take for example the operations on Red-Black trees.
The operations are O(logN) but rather complex, which means many low level operations.
Whenever N is the amount of objects that is stored in some kind of memory, you're correct. After all, a binary search through EVERY byte representable by a 64-bit pointer can be achieved in just 64 steps. Actually, it's possible to do a binary search of all Planck volumes in the observable universe in just 618 steps.
So in almost all cases, it's safe to approximate O(log N) with O(N) as long as N is (or could be) a physical quantity, and we know for certain that as long as N is (or could be) a physical quantity, then log N < 618
But that is assuming N is that. It may represent something else. Note that it's not always clear what it is. Just as an example, take matrix multiplication, and assume square matrices for simplicity. The time complexity for matrix multiplication is O(N^3) for a trivial algorithm. But what is N here? It is the side length. It is a reasonable way of measuring the input size, but it would also be quite reasonable to use the number of elements in the matrix, which is N^2. Let M=N^2, and now we can say that the time complexity for trivial matrix multiplication is O(M^(3/2)) where M is the number of elements in a matrix.
Unfortunately, I don't have any real world problem per se, which was what you asked. But at least I can make up something that makes some sort of sense:
Let f(S) be a function that returns the sum of the hashes of all the elements in the power set of S. Here is some pesudo:
f(S):
ret = 0
for s = powerset(S))
ret += hash(s)
Here, hash is simply the hash function, and powerset is a generator function. Each time it's called, it will generate the next (according to some order) subset of S. A generator is necessary, because we would not be able to store the lists for huge data otherwise. Btw, here is a python example of such a power set generator:
def powerset(seq):
"""
Returns all the subsets of this set. This is a generator.
"""
if len(seq) <= 1:
yield seq
yield []
else:
for item in powerset(seq[1:]):
yield [seq[0]]+item
yield item
https://www.technomancy.org/python/powerset-generator-python/
So what is the time complexity for f? As with the matrix multiplication, we can choose N to represent many things, but at least two makes a lot of sense. One is number of elements in S, in which case the time complexity is O(2^N), but another sensible way of measuring it is that N is the number of element in the power set of S. In this case the time complexity is O(N)
So what will log N be for sensible sizes of S? Well, list with a million elements are not unusual. If n is the size of S and N is the size of P(S), then N=2^n. So O(log N) = O(log 2^n) = O(n * log 2) = O(n)
In this case it would matter, because it's rare that O(n) == O(log n) in the real world.
I do not believe algorithms where you can freely choose between O(1) with a large constant and O(logN) really exists. If there is N elements to work with at the beginning, it is just plain impossible to make it O(1), the only thing that is possible is move your N to some other part of your code.
What I try to say is that in all real cases I know off you have some space/time tradeoff, or some pre-treatment such as compiling data to a more efficient form.
That is, you do not really go O(1), you just move the N part elsewhere. Either you exchange performance of some part of your code with some memory amount either you exchange performance of one part of your algorithm with another one. To stay sane you should always look at the larger picture.
My point is that if you have N items they can't disappear. In other words you can choose between inefficient O(n^2) algorithms or worse and O(n.logN) : it's a real choice. But you never really go O(1).
What I try to point out is that for every problem and initial data state there is a 'best' algorithm. You can do worse but never better. With some experience you can have a good guessing of what is this intrisic complexity. Then if your overall treatment match that complexity you know you have something. You won't be able to reduce that complexity, but only to move it around.
If problem is O(n) it won't become O(logN) or O(1), you'll merely add some pre-treatment such that the overall complexity is unchanged or worse, and potentially a later step will be improved. Say you want the smaller element of an array, you can search in O(N) or sort the array using any common O(NLogN) sort treatment then have the first using O(1).
Is it a good idea to do that casually ? Only if your problem asked also for second, third, etc. elements. Then your initial problem was truly O(NLogN), not O(N).
And it's not the same if you wait ten times or twenty times longer for your result because you simplified saying O(1) = O(LogN).
I'm waiting for a counter-example ;-) that is any real case where you have choice between O(1) and O(LogN) and where every O(LogN) step won't compare to the O(1). All you can do is take a worse algorithm instead of the natural one or move some heavy treatment to some other part of the larger pictures (pre-computing results, using storage space, etc.)
Let's say you use an image-processing algorithm that runs in O(log N), where N is the number of images. Now... stating that it runs in constant time would make one believe that no matter how many images there are, it would still complete its task it about the same amount of time. If running the algorithm on a single image would hypothetically take a whole day, and assuming that O(logN) will never be more than 100... imagine the surprise of that person that would try to run the algorithm on a very large image database - he would expect it to be done in a day or so... yet it'll take months for it to finish.

Resources