Analyze span - two parallel for - algorithm

If I have an algorithm with two parallel for and I want to analyze the span of the algorithm, what do I have to do?
For example
parallel for a=2 to n
parallel for b=1 to a-1
My guess is the span is theta(lg(n)*lg(n)) but I'm not sure. :) Someone who can help or give a hint?

I am assuming you want the time complexity of this algorithm. Since time complexity is NOT how much time the algorithm actually takes, but rather how much operations are needed for it [a quote supporting this claim follows], the time complexity of this algorithm is O(n^2), as it was if it was not parallel.
from the wiki page: Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, where an elementary operation takes a fixed amount of time to perform
Why don't we care for the fact the algorithm is parallel?
Usually, our cluster size is fixed, and does not depend on the input [n]. let the cluster size be k [meaning, we can perform k operations simultaneously and the algorithm is O(n^2) [for simplicity assume exactly n^2]
assume we have an input of size 100, then it will 'take' (100^2)/k time. if it was of size 1,000, it would take (1000^2)/k, and for n elements: (n^2)/k, as you can see, the k is a constant, and the fact that the program is parallel does not change the complexity. Being able to do k operations at once, is not better [and even worth, but that's for another thread] then a computer k time faster.

Related

How is pre-computation handled by complexity notation?

Suppose I have an algorithm that runs in O(n) for every input of size n, but only after a pre-computation step of O(n^2) for that given size n. Is the algorithm considered O(n) still, with O(n^2) amortized? Or does big O only consider one "run" of the algorithm at size n, and so the pre-computation step is included in the notation, making the true notation O(n+n^2) or O(n^2)?
It's not uncommon to see this accounted for by explicitly separating out the costs into two different pieces. For example, in the range minimum query problem, it's common to see people talk about things like an ⟨O(n2), O(1)⟩-time solution to the problem, where the O(n2) denotes the precomputation cost and the O(1) denotes the lookup cost. You also see this with string algorithms sometimes: a suffix tree provides an O(m)-preprocessing-time, O(n+z)-query-time solution to string searching, while Aho-Corasick string matching offers an O(n)-preprocessing-time, O(m+z)-query-time solution.
The reason for doing so is that the tradeoffs involved here really depend on the use case. It lets you quantitatively measure how many queries you're going to have to make before the preprocessing time starts to be worth it.
People usually care about the total time to get things done when they are talking about complexity etc.
Thus, if getting to the result R requires you to perform steps A and B, then complexity(R) = complexity(A) + complexity(B). This works out to be O(n^2) in your particular example.
You have already noted that for O analysis, the fastest growing term dominates the overall complexity (or in other words, in a pipeline, the slowest module defines the throughput).
However, complexity analysis of A and B will be typically performed in isolation if they are disjoint.
In summary, it's the amount of time taken to get the results that counts, but you can (and usually do) reason about the individual steps independent of one another.
There are cases when you cannot only specify the slowest part of the pipeline. A simple example is BFS, with the complexity O(V + E). Since E = O(V^2), it may be tempting to write the complexity of BFS as O(E) (since E > V). However, that would be incorrect, since there can be a graph with no edges! In those cases, you will still need to iterate over all the vertices.
The point of O(...) notation is not to measure how fast the algorithm is working, because in many specific cases O(n) can be significantly slower than, say O(n^3). (Imagine the algorithm which runs in 10^100 n steps vs. the one which runs in n^3 / 2 steps.) If I tell you that my algorithm runs in O(n^2) time, it tells you nothing about how long it will take for n = 1000.
The point of O(...) is to specify how the algorithm behaves when the input size grows. If I tell you that my algorithm runs in O(n^2) time, and it takes 1 second to run for n = 500, then you'll expect rather 4 seconds to for n = 1000, not 1.5 and not 40.
So, to answer your question -- no, the algorithm will not be O(n), it will be O(n^2), because if I double the input size the time will be multiplied by 4, not by 2.

is anything less than n is log n?

Consider there are two solutions for a problem.
executes in n/2 times ie. if n = 100 then it executes in 50 times
executes in sqrt of n times ie. if n = 100 then it executes in 10 times.
Are both the solutions can be called as O(log N) ?
if so, then there is huge difference between sqrt of N and N/2.
if we can't say O(log N) then can we say it is N ?
But the problem is the difference rate between these two. By the below image the algorithm should come in either of these thing, under which these solutions will come ?
Please help me on this.
Consider the three cases.
Executes n/2 times. That means each time we increase n by a factor of 100, the execution time increases by a factor of 100.
Executes sqrt(n) times. That means each time we increase n by a factor of 100, the execution time increases by a factor of 10.
Executes log(n) times. That means each time we increase n by a factor of 100, the execution time increases by a constant amount.
No, these three things aren't even close to the same. The first is much worse than the second. The third is much better than the second.
Neither of them is O(logn)
Here is an example of O(logn), Binary search algorithm
The best algorithm is the best algorithm for the data that you have. If you don't know what data you have, consider massively large amounts of data, say n= 1 billion. Would you choose O(31623), or O(5000000000)? Graph the comparison and find where your data size is.
If your dataset was n=4, then either algorithm is identical. If you get in to the details, it may actually take longer for the sqrt(n) algorithm due to the operations it conducts.
You can have O(1) which is the fastest. One such example is looking up in a hash map, but your memory size may suffer. So you should consider space constraints as well as time constraints.
You are also misunderstanding and overanalyzing complexity classification. O(n) algorithms are not algorithms that execute with exactly n operations. Any constant multiplier does not affect the Order of the classification. What is important is the grown of the number of operations when the problem grows. Consider two search algorithms.
A) Scan a sorted list sequentially from index 0 to (n-1) to find the number.
B) Scan a sorted list from from index 0 to (n-1), skipping by 2, and backtracking if necessary.
Clearly A takes at most n operations, and B takes n/2+1 operations. Yet they are both O(n). You can say algorithm B is faster, but I might run it on my machine which is twice as fast. So complexity is a general class, one isn't supposed to need to be overly finicky on the details of the operation.
If you were trying to develop a better algorithm, it would be much more useful to search for one with a better complexity class, than one with slightly fewer operations.

Is Big(O) machine dependent?

I am really confused with Big(O) notation. Is Big(O) machine dependent or machine independent ? (Machine in the sense the computer in which we run the Algorithm)
Will Sorting of 1000 numbers using quick sort in i3 processor and i7 processor be the same ? Why don't we consider the machine and it's processor speed when calculating the Time Complexity ? I am a neophyte in this stuff.
Big-O is a measure of scalability, not of speed. It shows you what effect on time and memory it has when you e.g. double the amount of data - does it double the execution, or quadruple it?
Whether you use i7 or i3, double is double. Whether a linear algorithm is fast or slow, double is double.
This also has another implication many people ignore. A complex algorithm such as O(n^3) can be faster than a simple algorithm such as O(n) for a given n that is below a certain limit. Example:
loop n times:
loop n times:
loop n times:
sleep 1 second
is O(n^3), as it has 3 nested loops.
loop n times:
sleep 10 seconds
is O(n), as it only has one loop. For n = 10 the first program executes for 1000 seconds, and the second one executes for only 100. So, O(n) is good! one would be tempted to say. But if you have n = 2, the first, complex program executes in only 8 seconds, while the second, simpler one executes for 20! Even for n = 3, the first executes in 27 seconds, the second one in 30. So while the n is low, a complex program might be able to outperform the simpler one. It's just that as n rises, the complex program gets slower much faster (if that makes sense) than a simple one. For n = 1000, the simple code has risen to only 10000 seconds, but the complex one is now 1000000000 seconds!
Also, this clearly shows you that complexity is not processor-dependent. A second is a second.
EDIT: Also, you might want to read this question, where Big-O is explained in a number of very high-quality answers.
Big(O) Notation is the method of calculating the complexity of an algorithm, and hence the relative time it will take to run. The same algorithm, for the same data, will run faster on a faster processor, but will still take the same number of operations. It's used as a way of evaluating the relative efficiency of different algorithms to achieve the same result.
Big O notation is not architecture-dependent in any way, it is a Mathematical construct.It is a very limited measure of algorithmic complexity, it only gives you a rough upper bound for how performance changes with data size.
Big(O) is alogorithm dependent. It's job is to help compare the relative costs of various algorithms, without the need to consider the machine dependencies.
Linear search though an array, on average will look at about 1/2 of the elements if it is found. for all practical purposes that is O(N/2) which is the same as O(1/2 * N). for compairson, you toss away the coefficient. hence it is O(N) for use.
A binary tree can hold N elements for searching as well. on agerage it will look though log base 2 (N) to find something, hence you will see it described as cost O(LN2(N)).
pop in small values for N, and there isn't a whole lot of difference between the algorithms. Pop in a large value of N, and it will be clear that the binary tree lookup is much faster.
Big(O) is not machine dependent. It is mathematical notation to denote complexity of an algorithm. Usually we use these notations in theory to compare algorithms performance.

What's my Big O?

My program of sorting values clocks at:
100000 8s
1000000 82s
10000000 811s
Is that O(n)?
Looks like it, but in reality, you really need to analyze the algorithm, because there may be different cases based on the data. Some algorithms do better or worse with pre-sorted data, for instance. What's your algorithm?
Yes, that looks like O(n) to me - going from the 1st to 2nd case and the 2nd to 3rd case, you've made the input 10 times bigger, and it's taken 10 times longer.
In particular, it looks like you can predict the rough time by using:
f(n) = n / 12500
or
f(n) = n * 0.00008
which gives a simplest explanation of O(n) for the data provided.
EDIT: However... As has been pointed out, there are various ways in which the data could be misleading you - I rather like Dennis Palmer's idea that the IO cost is dwarfing everything else. For example, suppose you have an algorithm whose absolute number of operations is:
f(n) = 1000000000000n + (n^2)
In that case the complexity is still O(n^2), but that won't become apparent from observations until n is very large.
I think it would be accurate to say that these observations are suggestive of an O(n) algorithm but that doesn't mean it definitely is.
Time behavior doesn't work that way. All you can really say there is that those three datasets are roughly O(n) from each other. That doesn't mean the algorithim is O(n).
The first problem is that I could easily draw a curve that goes exponential ( O(e**n) ) that nonetheless includes those three points.
The big problem though is that you didn't say anything about the data. There are many sorting algorithms that approach O(n) for sorted or nearly sorted inputs (eg: Mergesort). However, their average case (typically randomly-ordered data) and worst case (often reverse-sorted data) is invariably O(nlogn) or worse.
You cannot tell.
First, the time depends on the data, environment and algorithm. If you have an array of zeroes and try to sort it, the program running time shouldn't be much different for 1000 or 1000000 elements.
Second, the O notation tells about function behavior for big value of n, starting at n0. It could be that your algorithm scales well up to certain input size, and then its behavior changes - like the g(n) function.
it looks like O(n) to me.
Yes, that is O(n) because it scales with the number of items.
1000000 = 10 * 100000
and
82s = 10 * 8s (roughly)
You cannot depend on that to say it is O(n). Bubblesort, for instance, can complete in n steps, however it is an O(n^2) algorithm.
Yes, it appears to be O(n), but I don't think you can conclusively analyze the algorithm based on it's timed performance. You might be using the most inefficient sorting algorithm, but have O(n) timed results because the data read/write time is the majority of the total execution time.
Edit: Big-O is determined by how efficient an algorithm is and how it scales. It should predict the growth of execution time as the number of input items grows. The inverse is not necessarily true. Observing a given growth in execution time could mean a few different things. If it stays true to the example numbers you've given, then you can conclude that your program runs at O(n), but as others have said, that doesn't mean your sorting algorithm is O(n).

O(log N) == O(1) - Why not?

Whenever I consider algorithms/data structures I tend to replace the log(N) parts by constants. Oh, I know log(N) diverges - but does it matter in real world applications?
log(infinity) < 100 for all practical purposes.
I am really curious for real world examples where this doesn't hold.
To clarify:
I understand O(f(N))
I am curious about real world examples where the asymptotic behaviour matters more than the constants of the actual performance.
If log(N) can be replaced by a constant it still can be replaced by a constant in O( N log N).
This question is for the sake of (a) entertainment and (b) to gather arguments to use if I run (again) into a controversy about the performance of a design.
Big O notation tells you about how your algorithm changes with growing input. O(1) tells you it doesn't matter how much your input grows, the algorithm will always be just as fast. O(logn) says that the algorithm will be fast, but as your input grows it will take a little longer.
O(1) and O(logn) makes a big diference when you start to combine algorithms.
Take doing joins with indexes for example. If you could do a join in O(1) instead of O(logn) you would have huge performance gains. For example with O(1) you can join any amount of times and you still have O(1). But with O(logn) you need to multiply the operation count by logn each time.
For large inputs, if you had an algorithm that was O(n^2) already, you would much rather do an operation that was O(1) inside, and not O(logn) inside.
Also remember that Big-O of anything can have a constant overhead. Let's say that constant overhead is 1 million. With O(1) that constant overhead does not amplify the number of operations as much as O(logn) does.
Another point is that everyone thinks of O(logn) representing n elements of a tree data structure for example. But it could be anything including bytes in a file.
I think this is a pragmatic approach; O(logN) will never be more than 64. In practice, whenever terms get as 'small' as O(logN), you have to measure to see if the constant factors win out. See also
Uses of Ackermann function?
To quote myself from comments on another answer:
[Big-Oh] 'Analysis' only matters for factors
that are at least O(N). For any
smaller factor, big-oh analysis is
useless and you must measure.
and
"With O(logN) your input size does
matter." This is the whole point of
the question. Of course it matters...
in theory. The question the OP asks
is, does it matter in practice? I
contend that the answer is no, there
is not, and never will be, a data set
for which logN will grow so fast as to
always be beaten a constant-time
algorithm. Even for the largest
practical dataset imaginable in the
lifetimes of our grandchildren, a logN
algorithm has a fair chance of beating
a constant time algorithm - you must
always measure.
EDIT
A good talk:
http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
about halfway through, Rich discusses Clojure's hash tries, which are clearly O(logN), but the base of the logarithm is large and so the depth of the trie is at most 6 even if it contains 4 billion values. Here "6" is still an O(logN) value, but it is an incredibly small value, and so choosing to discard this awesome data structure because "I really need O(1)" is a foolish thing to do. This emphasizes how most of the other answers to this question are simply wrong from the perspective of the pragmatist who wants their algorithm to "run fast" and "scale well", regardless of what the "theory" says.
EDIT
See also
http://queue.acm.org/detail.cfm?id=1814327
which says
What good is an O(log2(n)) algorithm
if those operations cause page faults
and slow disk operations? For most
relevant datasets an O(n) or even an
O(n^2) algorithm, which avoids page
faults, will run circles around it.
(but go read the article for context).
This is a common mistake - remember Big O notation is NOT telling you about the absolute performance of an algorithm at a given value, it's simply telling you the behavior of an algorithm as you increase the size of the input.
When you take it in that context it becomes clear why an algorithm A ~ O(logN) and an algorithm B ~ O(1) algorithm are different:
if I run A on an input of size a, then on an input of size 1000000*a, I can expect the second input to take log(1,000,000) times as long as the first input
if I run B on an input of size a, then on an input of size 1000000*a, I can expect the second input to take about the same amount of time as the first input
EDIT: Thinking over your question some more, I do think there's some wisdom to be had in it. While I would never say it's correct to say O(lgN) == O(1), It IS possible that an O(lgN) algorithm might be used over an O(1) algorithm. This draws back to the point about absolute performance above: Just knowing one algorithm is O(1) and another algorithm is O(lgN) is NOT enough to declare you should use the O(1) over the O(lgN), it's certainly possible given your range of possible inputs an O(lgN) might serve you best.
You asked for a real-world example. I'll give you one. Computational biology. One strand of DNA encoded in ASCII is somewhere on the level of gigabytes in space. A typical database will obviously have many thousands of such strands.
Now, in the case of an indexing/searching algorithm, that log(n) multiple makes a large difference when coupled with constants. The reason why? This is one of the applications where the size of your input is astronomical. Additionally, the input size will always continue to grow.
Admittedly, these type of problems are rare. There are only so many applications this large. In those circumstances, though... it makes a world of difference.
Equality, the way you're describing it, is a common abuse of notation.
To clarify: we usually write f(x) = O(logN) to imply "f(x) is O(logN)".
At any rate, O(1) means a constant number of steps/time (as an upper bound) to perform an action regardless of how large the input set is. But for O(logN), number of steps/time still grows as a function of the input size (the logarithm of it), it just grows very slowly. For most real world applications you may be safe in assuming that this number of steps will not exceed 100, however I'd bet there are multiple examples of datasets large enough to mark your statement both dangerous and void (packet traces, environmental measurements, and many more).
For small enough N, O(N^N) can in practice be replaced with 1. Not O(1) (by definition), but for N=2 you can see it as one operation with 4 parts, or a constant-time operation.
What if all operations take 1hour? The difference between O(log N) and O(1) is then large, even with small N.
Or if you need to run the algorithm ten million times? Ok, that took 30minutes, so when I run it on a dataset a hundred times as large it should still take 30minutes because O(logN) is "the same" as O(1).... eh...what?
Your statement that "I understand O(f(N))" is clearly false.
Real world applications, oh... I don't know.... EVERY USE OF O()-notation EVER?
Binary search in sorted list of 10 million items for example. It's the very REASON we use hash tables when the data gets big enough. If you think O(logN) is the same as O(1), then why would you EVER use a hash instead of a binary tree?
As many have already said, for the real world, you need to look at the constant factors first, before even worrying about factors of O(log N).
Then, consider what you will expect N to be. If you have good reason to think that N<10, you can use a linear search instead of a binary one. That's O(N) instead of O(log N), which according to your lights would be significant -- but a linear search that moves found elements to the front may well outperform a more complicated balanced tree, depending on the application.
On the other hand, note that, even if log N is not likely to exceed 50, a performance factor of 10 is really huge -- if you're compute-bound, a factor like that can easily make or break your application. If that's not enough for you, you'll frequently see factors of (log N)^2 or (logN)^3 in algorithms, so even if you think you can ignore one factor of (log N), that doesn't mean you can ignore more of them.
Finally, note that the simplex algorithm for linear programming has a worst case performance of O(2^n). However, for practical problems, the worst case never comes up; in practice, the simplex algorithm is fast, relatively simple, and consequently very popular.
About 30 years ago, someone developed a polynomial-time algorithm for linear programming, but it was not initially practical because the result was too slow.
Nowadays, there are practical alternative algorithms for linear programming (with polynomial-time wost-case, for what that's worth), which can outperform the simplex method in practice. But, depending on the problem, the simplex method is still competitive.
The observation that O(log n) is oftentimes indistinguishable from O(1) is a good one.
As a familiar example, suppose we wanted to find a single element in a sorted array of one 1,000,000,000,000 elements:
with linear search, the search takes on average 500,000,000,000 steps
with binary search, the search takes on average 40 steps
Suppose we added a single element to the array we are searching, and now we must search for another element:
with linear search, the search takes on average 500,000,000,001 steps (indistinguishable change)
with binary search, the search takes on average 40 steps (indistinguishable change)
Suppose we doubled the number of elements in the array we are searching, and now we must search for another element:
with linear search, the search takes on average 1,000,000,000,000 steps (extraordinarily noticeable change)
with binary search, the search takes on average 41 steps (indistinguishable change)
As we can see from this example, for all intents and purposes, an O(log n) algorithm like binary search is oftentimes indistinguishable from an O(1) algorithm like omniscience.
The takeaway point is this: *we use O(log n) algorithms because they are often indistinguishable from constant time, and because they often perform phenomenally better than linear time algorithms.
Obviously, these examples assume reasonable constants. Obviously, these are generic observations and do not apply to all cases. Obviously, these points apply at the asymptotic end of the curve, not the n=3 end.
But this observation explains why, for example, we use such techniques as tuning a query to do an index seek rather than a table scan - because an index seek operates in nearly constant time no matter the size of the dataset, while a table scan is crushingly slow on sufficiently large datasets. Index seek is O(log n).
You might be interested in Soft-O, which ignores logarithmic cost. Check this paragraph in Wikipedia.
What do you mean by whether or not it "matters"?
If you're faced with the choice of an O(1) algorithm and a O(lg n) one, then you should not assume they're equal. You should choose the constant-time one. Why wouldn't you?
And if no constant-time algorithm exists, then the logarithmic-time one is usually the best you can get. Again, does it then matter? You just have to take the fastest you can find.
Can you give me a situation where you'd gain anything by defining the two as equal? At best, it'd make no difference, and at worst, you'd hide some real scalability characteristics. Because usually, a constant-time algorithm will be faster than a logarithmic one.
Even if, as you say, lg(n) < 100 for all practical purposes, that's still a factor 100 on top of your other overhead. If I call your function, N times, then it starts to matter whether your function runs logarithmic time or constant, because the total complexity is then O(n lg n) or O(n).
So rather than asking if "it matters" that you assume logarithmic complexity to be constant in "the real world", I'd ask if there's any point in doing that.
Often you can assume that logarithmic algorithms are fast enough, but what do you gain by considering them constant?
O(logN)*O(logN)*O(logN) is very different. O(1) * O(1) * O(1) is still constant.
Also a simple quicksort-style O(nlogn) is different than O(n O(1))=O(n). Try sorting 1000 and 1000000 elements. The latter isn't 1000 times slower, it's 2000 times, because log(n^2)=2log(n)
The title of the question is misleading (well chosen to drum up debate, mind you).
O(log N) == O(1) is obviously wrong (and the poster is aware of this). Big O notation, by definition, regards asymptotic analysis. When you see O(N), N is taken to approach infinity. If N is assigned a constant, it's not Big O.
Note, this isn't just a nitpicky detail that only theoretical computer scientists need to care about. All of the arithmetic used to determine the O function for an algorithm relies on it. When you publish the O function for your algorithm, you might be omitting a lot of information about it's performance.
Big O analysis is cool, because it lets you compare algorithms without getting bogged down in platform specific issues (word sizes, instructions per operation, memory speed versus disk speed). When N goes to infinity, those issues disappear. But when N is 10000, 1000, 100, those issues, along with all of the other constants that we left out of the O function, start to matter.
To answer the question of the poster: O(log N) != O(1), and you're right, algorithms with O(1) are sometimes not much better than algorithms with O(log N), depending on the size of the input, and all of those internal constants that got omitted during Big O analysis.
If you know you're going to be cranking up N, then use Big O analysis. If you're not, then you'll need some empirical tests.
In theory
Yes, in practical situations log(n) is bounded by a constant, we'll say 100. However, replacing log(n) by 100 in situations where it's correct is still throwing away information, making the upper bound on operations that you have calculated looser and less useful. Replacing an O(log(n)) by an O(1) in your analysis could result in your large n case performing 100 times worse than you expected based on your small n case. Your theoretical analysis could have been more accurate and could have predicted an issue before you'd built the system.
I would argue that the practical purpose of big-O analysis is to try and predict the execution time of your algorithm as early as possible. You can make your analysis easier by crossing out the log(n) terms, but then you've reduced the predictive power of the estimate.
In practice
If you read the original papers by Larry Page and Sergey Brin on the Google architecture, they talk about using hash tables for everything to ensure that e.g. the lookup of a cached web page only takes one hard-disk seek. If you used B-tree indices to lookup you might need four or five hard-disk seeks to do an uncached lookup [*]. Quadrupling your disk requirements on your cached web page storage is worth caring about from a business perspective, and predictable if you don't cast out all the O(log(n)) terms.
P.S. Sorry for using Google as an example, they're like Hitler in the computer science version of Godwin's law.
[*] Assuming 4KB reads from disk, 100bn web pages in the index, ~ 16 bytes per key in a B-tree node.
As others have pointed out, Big-O tells you about how the performance of your problem scales. Trust me - it matters. I have encountered several times algorithms that were just terrible and failed to meet the customers demands because they were too slow. Understanding the difference and finding an O(1) solution is a lot of times a huge improvement.
However, of course, that is not the whole story - for instance, you may notice that quicksort algorithms will always switch to insertion sort for small elements (Wikipedia says 8 - 20) because of the behaviour of both algorithms on small datasets.
So it's a matter of understanding what tradeoffs you will be doing which involves a thorough understanding of the problem, the architecture, & experience to understand which to use, and how to adjust the constants involved.
No one is saying that O(1) is always better than O(log N). However, I can guarantee you that an O(1) algorithm will also scale way better, so even if you make incorrect assumptions about how many users will be on the system, or the size of the data to process, it won't matter to the algorithm.
Yes, log(N) < 100 for most practical purposes, and No, you can not always replace it by constant.
For example, this may lead to serious errors in estimating performance of your program. If O(N) program processed array of 1000 elements in 1 ms, then you are sure it will process 106 elements in 1 second (or so). If, though, the program is O(N*logN), then it will take it ~2 secs to process 106 elements. This difference may be crucial - for example, you may think you've got enough server power because you get 3000 requests per hour and you think your server can handle up to 3600.
Another example. Imagine you have function f() working in O(logN), and on each iteration calling function g(), which works in O(logN) as well. Then, if you replace both logs by constants, you think that your program works in constant time. Reality will be cruel though - two logs may give you up to 100*100 multiplicator.
The rules of determining the Big-O notation are simpler when you don't decide that O(log n) = O(1).
As krzysio said, you may accumulate O(log n)s and then they would make a very noticeable difference. Imagine you do a binary search: O(log n) comparisons, and then imagine that each comparison's complexity O(log n). If you neglect both you get O(1) instead of O(log2n). Similarly you may somehow arrive at O(log10n) and then you'll notice a big difference for not too large "n"s.
Assume that in your entire application, one algorithm accounts for 90% of the time the user waits for the most common operation.
Suppose in real time the O(1) operation takes a second on your architecture, and the O(logN) operation is basically .5 seconds * log(N). Well, at this point I'd really like to draw you a graph with an arrow at the intersection of the curve and the line, saying, "It matters right here." You want to use the log(N) op for small datasets and the O(1) op for large datasets, in such a scenario.
Big-O notation and performance optimization is an academic exercise rather than delivering real value to the user for operations that are already cheap, but if it's an expensive operation on a critical path, then you bet it matters!
For any algorithm that can take inputs of different sizes N, the number of operations it takes is upper-bounded by some function f(N).
All big-O tells you is the shape of that function.
O(1) means there is some number A such that f(N) < A for large N.
O(N) means there is some A such that f(N) < AN for large N.
O(N^2) means there is some A such that f(N) < AN^2 for large N.
O(log(N)) means there is some A such that f(N) < AlogN for large N.
Big-O says nothing about how big A is (i.e. how fast the algorithm is), or where these functions cross each other. It only says that when you are comparing two algorithms, if their big-Os differ, then there is a value of N (which may be small or it may be very large) where one algorithm will start to outperform the other.
you are right, in many cases it does not matter for pracitcal purposes. but the key question is "how fast GROWS N". most algorithms we know of take the size of the input, so it grows linearily.
but some algorithms have the value of N derived in a complex way. if N is "the number of possible lottery combinations for a lottery with X distinct numbers" it suddenly matters if your algorithm is O(1) or O(logN)
Big-OH tells you that one algorithm is faster than another given some constant factor. If your input implies a sufficiently small constant factor, you could see great performance gains by going with a linear search rather than a log(n) search of some base.
O(log N) can be misleading. Take for example the operations on Red-Black trees.
The operations are O(logN) but rather complex, which means many low level operations.
Whenever N is the amount of objects that is stored in some kind of memory, you're correct. After all, a binary search through EVERY byte representable by a 64-bit pointer can be achieved in just 64 steps. Actually, it's possible to do a binary search of all Planck volumes in the observable universe in just 618 steps.
So in almost all cases, it's safe to approximate O(log N) with O(N) as long as N is (or could be) a physical quantity, and we know for certain that as long as N is (or could be) a physical quantity, then log N < 618
But that is assuming N is that. It may represent something else. Note that it's not always clear what it is. Just as an example, take matrix multiplication, and assume square matrices for simplicity. The time complexity for matrix multiplication is O(N^3) for a trivial algorithm. But what is N here? It is the side length. It is a reasonable way of measuring the input size, but it would also be quite reasonable to use the number of elements in the matrix, which is N^2. Let M=N^2, and now we can say that the time complexity for trivial matrix multiplication is O(M^(3/2)) where M is the number of elements in a matrix.
Unfortunately, I don't have any real world problem per se, which was what you asked. But at least I can make up something that makes some sort of sense:
Let f(S) be a function that returns the sum of the hashes of all the elements in the power set of S. Here is some pesudo:
f(S):
ret = 0
for s = powerset(S))
ret += hash(s)
Here, hash is simply the hash function, and powerset is a generator function. Each time it's called, it will generate the next (according to some order) subset of S. A generator is necessary, because we would not be able to store the lists for huge data otherwise. Btw, here is a python example of such a power set generator:
def powerset(seq):
"""
Returns all the subsets of this set. This is a generator.
"""
if len(seq) <= 1:
yield seq
yield []
else:
for item in powerset(seq[1:]):
yield [seq[0]]+item
yield item
https://www.technomancy.org/python/powerset-generator-python/
So what is the time complexity for f? As with the matrix multiplication, we can choose N to represent many things, but at least two makes a lot of sense. One is number of elements in S, in which case the time complexity is O(2^N), but another sensible way of measuring it is that N is the number of element in the power set of S. In this case the time complexity is O(N)
So what will log N be for sensible sizes of S? Well, list with a million elements are not unusual. If n is the size of S and N is the size of P(S), then N=2^n. So O(log N) = O(log 2^n) = O(n * log 2) = O(n)
In this case it would matter, because it's rare that O(n) == O(log n) in the real world.
I do not believe algorithms where you can freely choose between O(1) with a large constant and O(logN) really exists. If there is N elements to work with at the beginning, it is just plain impossible to make it O(1), the only thing that is possible is move your N to some other part of your code.
What I try to say is that in all real cases I know off you have some space/time tradeoff, or some pre-treatment such as compiling data to a more efficient form.
That is, you do not really go O(1), you just move the N part elsewhere. Either you exchange performance of some part of your code with some memory amount either you exchange performance of one part of your algorithm with another one. To stay sane you should always look at the larger picture.
My point is that if you have N items they can't disappear. In other words you can choose between inefficient O(n^2) algorithms or worse and O(n.logN) : it's a real choice. But you never really go O(1).
What I try to point out is that for every problem and initial data state there is a 'best' algorithm. You can do worse but never better. With some experience you can have a good guessing of what is this intrisic complexity. Then if your overall treatment match that complexity you know you have something. You won't be able to reduce that complexity, but only to move it around.
If problem is O(n) it won't become O(logN) or O(1), you'll merely add some pre-treatment such that the overall complexity is unchanged or worse, and potentially a later step will be improved. Say you want the smaller element of an array, you can search in O(N) or sort the array using any common O(NLogN) sort treatment then have the first using O(1).
Is it a good idea to do that casually ? Only if your problem asked also for second, third, etc. elements. Then your initial problem was truly O(NLogN), not O(N).
And it's not the same if you wait ten times or twenty times longer for your result because you simplified saying O(1) = O(LogN).
I'm waiting for a counter-example ;-) that is any real case where you have choice between O(1) and O(LogN) and where every O(LogN) step won't compare to the O(1). All you can do is take a worse algorithm instead of the natural one or move some heavy treatment to some other part of the larger pictures (pre-computing results, using storage space, etc.)
Let's say you use an image-processing algorithm that runs in O(log N), where N is the number of images. Now... stating that it runs in constant time would make one believe that no matter how many images there are, it would still complete its task it about the same amount of time. If running the algorithm on a single image would hypothetically take a whole day, and assuming that O(logN) will never be more than 100... imagine the surprise of that person that would try to run the algorithm on a very large image database - he would expect it to be done in a day or so... yet it'll take months for it to finish.

Resources