base of logarithms in time-complexity algorithms - algorithm

What is the base of logarithm on all time-complexity algorithms? Is it base 10 or base e?
When we say that the average sorting complexity is O(n log n). Is the base of log n 10 or e?

In Computer Science, it's often base 2. This is because many divide and conquer algorithms that exhibit this kind of complexity are dividing the problem in two at each step.
Binary search is a classic example. At each step, we divide the array into two and only recursively search in one of the halves, until you reach a base case of a subarray of one element (or zero elements). When dividing an array of length n in two, the total number of divisions before reaching a one element array is log2(n).
This is often simplified because the logarithms of different bases are effectively equivalent when discussing algorithm analysis.

In Big-O complexity analysis, it doesn't actually matter what the logarithm base is. (they are asymptotically the same, i.e. they differ by only a constant factor):
O(log2 N) = O(log10 N) = O(loge N)
Most of the time when mathematicians talk about logs, they implicitly mean to base e. Computer Scientists would tend to favour base 2, but it doesn't actually matter.

In terms of Big-O, the base doesn't matter because of the change of base formula implies that it's only a constant factor difference.
However sometimes it's useful to count approximately how many operations an algorithm performs. In this case, most of the times it's base 2 due to the nature of most divide and conquer algorithms.

There's some explanation in this site
Basically, logarithms from base 10 or base 2 or base e can be exchanged (transformed) to any other base with the addition of a constant. So, it doesn't matter the base for the log.
The key thing to note is that log2N grows slowly. Doubling N has a relatively small effect. Logarithmic curves flatten out nicely.
source

It varies depending on the problem and for the most part is irrelevant. However in many cases it is base 2, for example binary search is log(base 2)n. You can read about it in more detail here.

Related

Are exponential algorithms (with very small exponents) faster than the general logarithmic algorithm?

Say something runs at n^0.5 vs log n. It's true that this obviously isn't fast (the log n beats it). However, what about n^0.1 or n^0.01? Would it still be preferable to go with the logarithmic algorithm?
I guess, how small should the exponent be to switch to exponential?
The exponent does not matter. It is n that matters.
No matter how small the exponent of an exponential-time-complexity algorithm is, the logarithmic-time-complexity algorithm will beat it if n is large enough.
So, it all depends on your n. Substitute a specific n, calculate the actual run-time cost of your exponential-time-complexity algorithm vs your logarithmic-time-complexity algorithm, and see who the winner is.
Asymptotic complexity can be a bit misleading.
A function that's proportional to log n will be less than one that's proportional to (say) n0.01 . . . once n gets large enough.
But for smaller values of n, all bets are off, because the constant of proportionality can play a large role. For example, sorting algorithms that have O(n2) worst-case complexity are often better choices, when n is known to be small, than sorting algorithms that have O(n log n), because the latter are typically more complicated and therefore have more overhead. It's only when n grows larger that the latter start to win out.
In general, performance decisions should be based on profiling and testing, rather than on purely mathematical arguments about what should theoretically be faster.
In general, given two sublinear algorithms you should choose the one with the smallest constant multiplier. Since complexity theory won't help you with that, you will have to write the programs as efficiently as possible and benchmark them. This necessity might lead you to choose the algorithm which is easier to code efficiently, which might also be a reasonable criterion.
This is of course not the case with superlinear functions, where large n exaggerate costs. But even then, you might find an algorithm whose theoretical efficiency is superior but which requires a very large n to be superior to a simpler algorithm, perhaps so large that it will never be tried.
You're talking about big O which tends to refer to how an algorithm scales as a function of inputs, as opposed to how fast it is in the absolute time sense. On certain data sets an algorithm with a worse big O, may perform much better in absolute time.
Let's say you have two algorithms, one is exactly O(n^0.1) and the other is exactly log(n)
Though the O(n^0.1) is worse, it takes until n is about equal to 100,000,000,000 for it to be surpassed by log(n).
An algorithm could plausibly have a sqrt(N) running time, but what would one with an even lower exponent look like? It is an obvious candidate for replacing a logarithmic method if such can be found, and that's where big O analysis ceases to be useful - it depends on N and the other costs of each operation, plus implementation difficulty.
First of all, exponential complexity is in the form of , where n is the exponent. Complexities below are generally called sub-linear.
Yes, from some finite n₀ onwards, general logarithm is always better than any . However, as you observe, this n₀ turns out to be quite large for smaller values of k. Therefore, if you expect your value of n to be reasonably small, the payload of the logarithm may be worse than the power, but here we move from theory to practice.
To see why should the logarithm be better, let's first look at some plots:
This curve looks mostly the same for all values of k, but may cross the n axis for smaller exponents. If its value is above 0, the logarithm is smaller and thus better. The point at which it crosses the axis for the second time (not in this graph) is where the logarithm becomes the best option for all values of n after that.
Notice the minimum - if the minimum exists and is below 0, we can assume that the function will eventually cross the axis for the second time and become in favour of the logarithm. So let's find the minimum, using the derivative.
From the nature of the function, this is the minimum and it always exists. Therefore, the function is rising from this point.
For k = 0.1, this turns out to be 10,000,000,000, and this is not even where the function crosses the n axis for the second time, only the minimum is there. So in practise, using the exponent is better than the logarithm at least up to this point (unless you have some constants there, of course).

Is `log(n)` base 10?

Still getting a grip on logarithms being the opposite of exponentials. (Would it also be correct to describe them as the inversion of exponentials?)
There are lots of great SO entries already on Big-O notation including O(log n) and QuickSort n(log n) specifically. Found some useful graphs as well.
In looking at Divide and Conquer algorithms, I'm coming across n log n, which I think is n multiplied by the value of log n. I often try concrete examples like 100 log 100, to help visualize what's going on in an abstract equation.
Just read that log n assumes base 10. Does n log n translate into:
"the number n multiplied by the amount 10 needs to be raised to the power of in order to equal the number n"?
So 100 log 100 equals 200 because 10 needs to be raised to the power of two to equal 100?
Does the base change as an algorithm iterates through a set? Does the base even matter if we're talking in abstractions anyway?
Yes, the base does change depending on the way it iterates, but it doesn't matter. As you might remember, changing the base of logarithms means multiplying them by a constant. Since you mentioned that you have read about Big-O notation, then you probably already know that constants do not make a difference (O(n) is the same as O(2n) or O(1000n)).
EDIT: to clarify something you said - "I'm coming across n log n, which I think is n multiplied by the value of log n". Yes, you are right. And if you want to know why it involves log n, then think of what algorithms like divide and conquer do - they split the input (in two halves or four quarters or ten tenths, depending on the algorithm) during each iteration. The question is "How many times can that input be split up before the algorithm ends?" So you look at the input and try to find how many times you can divide it by 2, or by 4, or by 10, until the operation is meaningless? (unless the purpose of the algorithm is to divide 0 as many times as possible) Now you can give yourself concrete examples, starting with easy stuff like "How many times can 8 be divided by 2?" or ""How many times can 1000 be divided by 10?"
You don't need to worry about the base - if you're dealing with algorithmic complexity, it doesn't matter which base you're in, because the difference is just a constant factor.
Fundamentally, you just need to know that log n means that as n increases exponentially, the running time (or space used) increases linearly. For example, if n=10 takes 1 minute, then n=100 would take 2 miuntes, and n=1000 would take 3 minutes - roughly. (It's usually in terms of upper bounds, with smaller factors ignored... but that's the general gist of it.)
n log n is just that log n multiplied by n - so the time or space taken increases "a bit faster than linearly", basically.
The base does not matter at all. In fact people tend to drop part of the operations, e.g. if one has O(n^4 + 2*n), this is often reduced to O(n^4). Only the most relevant power needs to be considered when comparing algorithms.
For the case of comparing two closely related algorithms, say O(n^4 + 2*n) against O(n^4 + 3*n), one needs to include the linear dependency in order to conserve the relevant information.
Consider a divide and conquer approach based on bisection: your base is 2, so you may talk about ld(n). On the other hand you use the O-notation to compare different algorithms by means of the same base. This being said, the difference between ld, ln, and log10 is just a matter of a general offset.
Logarithms and Exponents are inverse operations.
if
x^n = y
then
Logx(y) = n
For example,
10^3 = 1000
Log10 (1000) = 3
Divide and conquer algorithms work by dividing the problem into parts that are then solved as independent problems. There can also be a combination step that combines the parts. Most divide and conquer algorithms are base 2 which means they cut the problem in half each time. For example, Binary Search, works like searching a phone book for a name. You flip to the middle and say.. Is the name I'm looking for in the first half or last half? (before or after what you flipped to), then repeat. Every time you do this you divide the problem's size by 2. Therefore, it's base 2, not base 10.
Order notation is primarily only concerned with the "order" of the runtime because that is what is most important when trying to determine if a problem will be tractable (solvable in a reasonable amount of time).
Examples of different orders would be:
O(1)
O(n)
O(n * log n)
O(n^2 * log n)
O(n^3.5)
O(n!)
etc.
The O here stands for "big O notation" which is basically provides an upper bound on the growth rate of the function. Because we only care about the growth of the function for large inputs.. we typically ignore lower order terms for example
n^3 + 2 n^2 + 100 n
would be
O(n^3)
because n^3 is the largest order term it will dominate the growth function for large values of N.
When you see O(n * log n) people are just abbreviating... if you understand the algorithm it is typically Log base 2 because most algorithms cut the problem in half. However, it could be log base 3 for example if the algorithm cut the problem into thirds for example.
Note:
In either case, if you were to graph the growth function, it would appear as a logarithmic curve. But, of course a O(Log3 n) would be faster than O(Log2 n).
The reason you do not see O(log10 n) or O(log3 n) etc.. is that it just isn't that common for an algorithm to work better this way. In our phone book example you could split the pages into 3 separate thirds and compare inbetween 1-2 and 2-3. But, then you just made 2 comparisons and ended up knowing which 1/3 the name was in. However, if you just split it in half each time you'd know which 1/4 it was in which is more efficient.
In the vast set of programming languages I know, the function log() is intended to be base e=2.718281....
In mathematical books sometimes it means base "ten" and sometimes base "e".
As another answers pointed out, for the big-O notation does not matter, because, for all base x, the complexities O(log_x (n)) is the same as O(ln(n)) (here log_x means "logarithm in base x" and ln() means "logarithm in base e").
Finally, it's common that, in the analysis of several algorithms, it's more convenient consider that log() is, indeed, "logarithm in base 2". (I've seen some texts taking this approach). This is obviously related to the binary representation of numbers in the computers.

When working with Mergesort, quicksort etc, is O(n log n) base 2, or base 10?

I'm a little confused about big O notations.
Let's say I'm working with an array of 16 elements and trying to do a QuickSort.
A QuickSort's Average case is O(n log n) and its worst case is O(n^2)
I assumed it would be log base 2 since we're working with computers (binary). But I've also heard you could use base 10 and get the same complexity.
I figured it would be 16*log2*16 == 16 * 4 = 64 comparisons at O(n log n).
At O(n^2) == 16^2 == 256 comparisons at its worst case.
Do I have the right idea? And as a general rule, should I be using base 10 or base 2?
As far as big-Oh notation is concerned, the base of the logarithms doesn't make any real difference, because of this important property, called Change of Base.
According to this property, changing the base of the logarithm, in terms of big-oh notation, only affects the complexity by a constant factor.
What you should use as a general rule... well, it depends.
Mergesort halves the array, so you should think of it as a base-2 logarithm.
The complexity of algorithms dealing with binary trees made out of n elements, should use base-2 logarithms, but if dealing with ternary trees made out of n elements it should be base-3 logarithms.
It's all a matter of counting correctly.
So the only general rule is the change of base property.

Trying to understand Big-oh notation

Hi I would really appreciate some help with Big-O notation. I have an exam in it tomorrow and while I can define what f(x) is O(g(x)) is, I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to try and figure it out, the first part seems easy (I think) Do you just pick a value for n, compute them all on a claculator and put them in order? This seems to easy though so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the
correct order of the complexities
O(n2), O(log2 n), O(1), O(2n), O(n!),
O(n log2 n)?
What is the
worst-case computational-complexity of
the Binary Search algorithm on an
ordered list of length n = 2k?
That guy should help you.
From lowest to highest, what is the
correct order of the complexities
O(n2), O(log2 n), O(1), O(2n), O(n!),
O(n log2 n)?
The order is same as if you compare their limit at infinity. like lim(a/b), if it is 1, then they are same, inf. or 0 means one of them is faster.
What is the worst-case
computational-complexity of the Binary
Search algorithm on an ordered list of
length n = 2k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of it's five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't you just say that anything times infinity is infinity? That doesn't actually work for infinity times infinity. I think. It just doesn't. Can a mathematician back me up on this?)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2+61n+1682) = O(n^2), because the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term already dominates the 1682 term. If you see addition inside a O(), only concern yourself with the n with the highest degree.
Two, O(log10n) = O(log(any number)n), because for any base b, log10(x) = log_b(*x*)/log_b(10). Hence, O(log10n) = O(log_b(x) * 1/(log_b(10)). That 1/log_b(10) figure is a constant, which we've already shown drop out of O(n) notation.
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n
exp(n) grows more quickly than any power of n
n! grows more quickly than exp(kn)
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
It's important to get that right, because for running times with multiple terms, you can drop the lesser terms -- ie, if O(f(n)) is 3n^2+2n+5, you can drop the 5 (constant), drop the 2n (3n^2 grows faster), then drop the 3 (constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear, log(n) grows more slowly than linear, n^a > n^b if a>b, 2^n is faster than any n^a, and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the space in half until eventually you find your item (or run out of places to look). That is log2(2k). A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than just a linear search.
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on most simply because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n) because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicatives are omitted.
Read Big O notation and Time complexity for more a more in depth overview.
See this and look up for solutions here is first one.

Meaning of average complexity when using Big-O notation

While answering to this question a debate began in comments about complexity of QuickSort. What I remember from my university time is that QuickSort is O(n^2) in worst case, O(n log(n)) in average case and O(n log(n)) (but with tighter bound) in best case.
What I need is a correct mathematical explanation of the meaning of average complexity to explain clearly what it is about to someone who believe the big-O notation can only be used for worst-case.
What I remember if that to define average complexity you should consider complexity of algorithm for all possible inputs, count how many degenerating and normal cases. If the number of degenerating cases divided by n tend towards 0 when n get big, then you can speak of average complexity of the overall function for normal cases.
Is this definition right or is definition of average complexity different ? And if it's correct can someone state it more rigorously than I ?
You're right.
Big O (big Theta etc.) is used to measure functions. When you write f=O(g) it doesn't matter what f and g mean. They could be average time complexity, worst time complexity, space complexities, denote distribution of primes etc.
Worst-case complexity is a function that takes size n, and tells you what is maximum number of steps of an algorithm given input of size n.
Average-case complexity is a function that takes size n, and tells you what is expected number of steps of an algorithm given input of size n.
As you see worst-case and average-case complexity are functions, so you can use big O to express their growth.
If you're looking for a formal definition, then:
Average complexity is the expected running time for a random input.
Let's refer Big O Notation in Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes f(x)=O(g(x)) as x --> infinity if ...
So what the premise of the definition states is that the function f should take a number as an input and yield a number as an output. What input number are we talking about? It's supposedly a number of elements in the sequence to be sorted. What output number could we be talking about? It could be a number of operations done to order the sequence. But stop. What is a function? Function in Wikipedia:
a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
Are we producing exacly one output with our prior defition? No, we don't. For a given size of a sequence we can get a wide variation of number of operations. So to ensure the definition is applicable to our case we need to reduce a set possible outcomes (number of operations) to a single value. It can be a maximum ("the worse case"), a minimum ("the best case") or an average.
The conclusion is that talking about best/worst/average case is mathematically correct and using big O notation without those in context of sorting complexity is somewhat sloppy.
On the other hand, we could be more precise and use big Theta notation instead of big O notation.
I think your definition is correct, but your conclusions are wrong.
It's not necessarily true that if the proportion of "bad" cases tends to 0, then the average complexity is equal to the complexity of the "normal" cases.
For example, suppose that 1/(n^2) cases are "bad" and the rest "normal", and that "bad" cases take exactly (n^4) operations, whereas "normal" cases take exactly n operations.
Then the average number of operations required is equal to:
(n^4/n^2) + n(n^2-1)/(n^2)
This function is O(n^2), but not O(n).
In practice, though, you might find that time is polynomial in all cases, and the proportion of "bad" cases shrinks exponentially. That's when you'd ignore the bad cases in calculating an average.
Average case analysis does the following:
Take all inputs of a fixed length (say n), sum up all the running times of all instances of this length, and build the average.
The problem is you will probably have to enumerate all inputs of length n in order to come up with an average complexity.

Resources