What does 'log' represent in asymptotic notation?

I understand the principles of asymptotic notation, and I get what it means when something is O(1) or O(n^2) for example. But what does O(log n) mean? Or O(n log n), for example?

Log is short for "logarithm": http://en.wikipedia.org/wiki/Logarithm
Logarithms tell us for example how many digits are needed to represent a number, or how many levels a balanced tree has when you add N elements to it.
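For a concrete illustration (a minimal Python sketch of my own, not from the answer): the number of levels in a complete balanced binary tree holding N elements and the number of binary digits of N both come out to roughly log2(N).

    import math

    # Levels in a complete balanced binary tree with n elements, and the number
    # of bits needed to write n in binary; both are about log2(n).
    for n in (1, 10, 1_000, 1_000_000):
        levels = math.floor(math.log2(n)) + 1
        bits = n.bit_length()
        print(n, levels, bits)   # e.g. 1_000_000 -> 20 levels, 20 bits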

Check: en.wikipedia.org/wiki/Big_O_notation
Remember that a logarithm grows more slowly than a polynomial such as n^2, let alone an exponential function. So, if you have an algorithm that is n^2 and another one that does the same job with a logarithmic complexity, the latter will generally be more efficient (in general terms, not always!).
To evaluate the complexity of a function (or algorithm) you should mainly consider its execution time and its space usage. You can evaluate a function or algorithm with other parameters, but, initially, those two will do.
EDIT:
http://en.wikibooks.org/wiki/Data_Structures/Asymptotic_Notation
Also, check out the sorting algorithms. They will give you great insight into complexity.

log is a mathematical function. It is the inverse of exponentiation: log (base 2) of 2^n is n. In practice, it grows more slowly than n^c for any positive c (including fractional c such as 1/2, which is the square root), so for large inputs an O(log n) algorithm beats an O(n^c) one. Check Wikipedia for more info.
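A quick numerical sketch of that claim (my own illustration, with c = 1/2): past a small crossover point, log2(n) falls far behind n^(1/2).

    import math

    # log n eventually grows more slowly than n^c for any c > 0;
    # here c = 1/2 (square root) as in the answer above.
    for n in (10, 10**3, 10**6, 10**9, 10**12):
        print(n, round(math.log2(n), 1), round(n ** 0.5, 1))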

Related

How much time (Big-O) will an algorithm take which can rule out one third of possible numbers from 1 to N in each step?

I am abstracting the problem out. (it has nothing to do with prime numbers)
How much time (in terms of Big-O) will it take to determine if n is the solution?
Suppose I was able to design an algorithm which rules out one third of the numbers from the possible answers {1,2,...,n} in the first step, and then successively rules out one third of the "remaining" numbers until all numbers are tested.
I have thought a lot about it but can't figure out whether it will be O(n log₃(n)) or O(log₃(n)).
It depends on the algorithm, and on the value of N. You should be able to figure out and program an algorithm that takes O (sqrt (N)) rather easily, and it's not difficult to go down to O (sqrt (N) / log N). Anything better requires some rather deep mathematics, but there are algorithms that are a lot faster for large N.
Now when you say O(N log N), please don't guess these things. O(N log N) is ridiculous: even the most naive algorithm, using nothing but the definition of a prime number, is O(N).
Theoretically, the best effort is O(log^3 N), but the corresponding algorithm is not something you could figure out easily. See http://en.wikipedia.org/wiki/AKS_primality_test
There are more practical probabilistic algorithms though.
BTW, about 'ruling out one third' etc.: it does not matter whether it is 'log base 3' or 'log base 10' and so on. O(log N) roughly means 'a logarithm in any base', because they can all be converted into one another by a constant multiplier only. So the complexity of such an algorithm would be log N times the complexity of the reduction step. But the problem is that a single step will hardly take constant time, and if it doesn't, that will not help in achieving O(log N).
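Two small sketches to illustrate this (the reduction step below is hypothetical, just to mimic 'rule out one third of the remaining numbers'): the first shows that logarithms in different bases differ only by a constant factor, the second counts the steps of the reduction process, which come out proportional to log n.

    import math

    # Logarithms in different bases differ only by a constant factor:
    # log3(n) / log2(n) is always log(2)/log(3) ~ 0.631, whatever n is.
    for n in (10, 10**3, 10**6):
        print(n, math.log(n, 3) / math.log(n, 2))

    # A hypothetical reduction step that discards a third of the *remaining*
    # candidates keeps 2/3 of them, so the step count is about log base 3/2 of n.
    n, steps = 10**6, 0
    while n > 1:
        n = (2 * n) // 3        # keep two thirds of the candidates
        steps += 1
    print(steps, round(math.log(10**6, 1.5)))   # 32 vs 34: same order, O(log n)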

Big-Oh Notation

I seem to be confused by a question.
Here's the question, followed by my assumptions:
Al and Bob are arguing about their algorithms. Al claims his O(n log n)-time method is always faster than Bob’s O(n^2)-time method. To settle the issue, they perform a set of experiments. To Al’s dismay, they find that if n<100, the O(n^2)-time algorithm runs faster, and only when n>= 100 is the O(n log n)-time one better. Explain how this is possible.
Based on what I understand, an algorithm written in an O(n^2)-time method is effective only for small amounts of input n. As the input increases, the efficiency decreases as the run time increases dramatically since the run time is proportional to the square of the input. The O(n^2)-time method is more efficient than the O(n log n)-time method only for very small amounts of input (in this case for inputs less than 100), but as the input grows larger (in this case 100 or larger), the O(n log n) becomes the much more efficient method.
Am I only stating what is obvious and presented in the question or does the answer seem to satisfy the question?
You noted in your answer that to be O(N^2), the run-time is proportional to the square of the size of the input. Follow up on that -- there is a constant of proportionality which is present but not described by big-O notation. For actual timings, the magnitudes of the constants matter.
Big-O also ignores lower order terms, since asymptotically they are dominated by the highest order term, but those lower order terms still contribute to the actual timings.
As a consequence of either (or both) of these issues, a function which has a higher growth rate can nevertheless have a smaller outcome for a limited range of inputs.
No, I think this is not enough.
I would expect an answer to explain how the big-oh definition allows a function f(x) > g(x) for some x, even if O(f(x)) < O(g(x)). There is a formalism that would answer it in two lines.
Another option is to answer it more intuitively, explaining how the constant term of the time function plays a fundamental role in small input sizes.
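To make that intuition concrete, here is a small sketch with made-up cost models (the constant 15 is chosen purely so that the crossover lands near n = 100, as in the exercise):

    import math

    # Hypothetical cost models: Bob's O(n^2) method with constant 1, Al's
    # O(n log n) method with a larger constant of 15.
    def bob(n): return 1.0 * n * n
    def al(n):  return 15.0 * n * math.log2(n)

    for n in (10, 50, 99, 100, 1000):
        print(n, "Bob (n^2) wins" if bob(n) < al(n) else "Al (n log n) wins")
    # Bob wins below n = 100, Al wins from n = 100 on: exactly the behaviour
    # described in the exercise, caused only by the hidden constants.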
You are correct if you consider the input explicitly, i.e. n <= 100 and so on. But asymptotic analysis (big O, omega, etc.) is done for sufficiently large inputs, as n tends to infinity. "n log n is more efficient than n^2" is a statement that is true when n is sufficiently large. We talk about big-oh without considering the input size, which means asymptotic analysis assumes by default that the input size is very large. We ignore specific values of n, considering them machine dependent. That is also why constants are ignored: as n grows sufficiently large, the effect of the constants becomes negligible.
Your answer is correct as far as it goes but it does not mention how or why this can happen. One explanation is that there is a fixed amount of (extra) overhead with the O(n log n)-time method such that the total benefit of the method doesn't add up to equal/exceed the overhead until 100 such benefits have aggregated.
By definition:
T(n) is O(f(n)) if and only if there exist two constants C and n0 such that:
T(n) <= C * f(n) whenever n > n0.
In your case this means that the constant in front of n^2 is smaller than the one in front of n log n, or that the asymptotic bound only takes over once n > 100.
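A quick numerical check of that definition on a made-up running time (T, C and n0 below are illustrative, not taken from the question):

    # T(n) = 3n^2 + 5n is O(n^2), witnessed by C = 4 and n0 = 5,
    # because 5n <= n^2 as soon as n >= 5.
    T = lambda n: 3 * n * n + 5 * n
    f = lambda n: n * n
    C, n0 = 4, 5
    assert all(T(n) <= C * f(n) for n in range(n0, 10_000))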

Big oh notation for heaps

I am trying to understand big oh notations. Any help would be appreciated.
Say there is a program that creates a max heap and then pushes and removes the item.
Say there is n items.
To create a heap, it takes O(n) if you have read the items into an array and then heapify it.
To push an item, it takes O(1) and to remove it, it takes O(1)
To heapify it after that, it takes log n for each remove and n log n for n items
So the big oh notation is O(n + n log n)
Or is it O(n log n) only, because we choose the biggest term?
The complexity of heapifying the new element into the heap is O(log N), not O(1) (unless you use a Fibonacci heap, which does not seem to be the case).
Also, one does not normally write O(N + N log N): since N log N grows faster than N, this is simply written as O(N log N).
EDIT: The big-oh notation only describes the asymptotic behavior of a function, that is, how fast it grows. As you get close to infinity, 2*f(x) and 11021392103*f(x) behave similarly, and that is why, when writing big-oh notation, we ignore any constants in front of the function.
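To see the two build strategies side by side, here is a rough counting sketch (a toy max-heap of my own, not anyone's production code). It uses an increasing input, which is the worst case for repeated pushes, so the gap between O(n) heapify and O(n log n) repeated insertion is clearly visible:

    comparisons = 0

    def sift_down(a, i, n):
        # Push a[i] down until the max-heap property holds; O(log n) per call.
        global comparisons
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n:
                comparisons += 1
                if a[left] > a[largest]:
                    largest = left
            if right < n:
                comparisons += 1
                if a[right] > a[largest]:
                    largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    def sift_up(a, i):
        # Bubble a newly pushed element up; O(log n) per call, not O(1).
        global comparisons
        while i > 0:
            parent = (i - 1) // 2
            comparisons += 1
            if a[i] <= a[parent]:
                return
            a[i], a[parent] = a[parent], a[i]
            i = parent

    n = 100_000
    data = list(range(n))                   # increasing: worst case for pushes

    comparisons, heap = 0, data[:]          # Floyd's build-heap: O(n) total
    for i in range(n // 2 - 1, -1, -1):
        sift_down(heap, i, n)
    print("heapify:", comparisons)          # stays proportional to n

    comparisons, heap = 0, []               # n pushes: O(log n) each
    for x in data:
        heap.append(x)
        sift_up(heap, len(heap) - 1)
    print("n pushes:", comparisons)         # grows like n log2 n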
Formally speaking, O(N + N log N) is equivalent to O(N log N).
That said, it's assumed that there are coefficients buried in each of these, e.g. O(aN + bN log(cN)). If you have very large N values, these coefficients become unimportant and the algorithm is bounded only by its largest term, which, in this case, is N log(N).
But it doesn't mean the coefficients are entirely unimportant. This is why in discussions of graph algorithms you'll often see authors say something like "the Floyd-Warshall algorithm runs in O(N^3) time, but has small coefficients".
If we could somehow write O(0.5N^3) in this case, we would. But it turns out that the coefficients vary depending on how you implement an algorithm and which computer you run it on. Thus, we settle for asymptotic comparisons, not necessarily because it is the best way, but because there isn't really a good alternative.
You'll also see things like "Worst-case: O(N^2), Average case: O(N)". This is an attempt to capture how the behavior of the algorithm varies with the input. Often times, presorted or random inputs can give you that average case, whereas an evil villain can construct inputs that produce the worst case.
Ultimately, what I am saying is this: O(N + N log N)=O(N log N). This is true, and it's the right answer for your homework. But we use this big-O notation to communicate and, in the fullness of time, you may find situations where you feel that O(N + N log N) is more expressive, perhaps if your algorithm is generally used for small N. In this case, do not worry so much about the formalism - just be clear about what it is you are trying to convey with it.

Meaning of average complexity when using Big-O notation

While answering to this question a debate began in comments about complexity of QuickSort. What I remember from my university time is that QuickSort is O(n^2) in worst case, O(n log(n)) in average case and O(n log(n)) (but with tighter bound) in best case.
What I need is a correct mathematical explanation of the meaning of average complexity to explain clearly what it is about to someone who believe the big-O notation can only be used for worst-case.
What I remember is that to define average complexity you should consider the complexity of the algorithm for all possible inputs, and count how many are degenerate cases and how many are normal ones. If the number of degenerate cases divided by n tends towards 0 when n gets big, then you can speak of the average complexity of the overall function as that of the normal cases.
Is this definition right or is definition of average complexity different ? And if it's correct can someone state it more rigorously than I ?
You're right.
Big O (big Theta etc.) is used to measure functions. When you write f=O(g) it doesn't matter what f and g mean. They could be average time complexity, worst time complexity, space complexities, denote distribution of primes etc.
Worst-case complexity is a function that takes a size n and tells you the maximum number of steps of an algorithm given an input of size n.
Average-case complexity is a function that takes a size n and tells you the expected number of steps of an algorithm given an input of size n.
As you see worst-case and average-case complexity are functions, so you can use big O to express their growth.
If you're looking for a formal definition, then:
Average complexity is the expected running time for a random input.
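As an illustration of 'expected running time for a random input', here is a rough empirical sketch (a naive first-element-pivot quicksort of my own that counts comparisons): the average over random permutations stays in n log2 n territory, while an already sorted input, the worst case for this pivot choice, costs n(n-1)/2.

    import math, random, statistics, sys
    sys.setrecursionlimit(10_000)   # the sorted worst case recurses n levels deep

    def qsort_comparisons(a):
        # Naive quicksort (first element as pivot); returns the number of
        # element-vs-pivot comparisons performed.
        if len(a) <= 1:
            return 0
        pivot, smaller, larger = a[0], [], []
        for x in a[1:]:                                   # one comparison each
            (smaller if x < pivot else larger).append(x)
        return len(a) - 1 + qsort_comparisons(smaller) + qsort_comparisons(larger)

    n, trials = 500, 200
    avg = statistics.mean(qsort_comparisons(random.sample(range(n), n))
                          for _ in range(trials))
    print("average over random inputs :", round(avg))
    print("n * log2(n)                :", round(n * math.log2(n)))
    print("already sorted (worst case):", qsort_comparisons(list(range(n))))
    print("n * (n - 1) / 2            :", n * (n - 1) // 2)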
Let's refer to Big O Notation on Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes f(x)=O(g(x)) as x --> infinity if ...
So what the premise of the definition states is that the function f should take a number as an input and yield a number as an output. What input number are we talking about? It is presumably the number of elements in the sequence to be sorted. What output number could we be talking about? It could be the number of operations done to order the sequence. But stop. What is a function? According to Wikipedia:
a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
Are we producing exactly one output with our prior definition? No, we are not. For a given size of sequence we can get a wide variation in the number of operations. So, to ensure the definition is applicable to our case, we need to reduce the set of possible outcomes (numbers of operations) to a single value. It can be the maximum ("the worst case"), the minimum ("the best case") or an average.
The conclusion is that talking about the best/worst/average case is mathematically correct, and that using big O notation without specifying one of them in the context of sorting complexity is somewhat sloppy.
On the other hand, we could be more precise and use big Theta notation instead of big O notation.
I think your definition is correct, but your conclusions are wrong.
It's not necessarily true that if the proportion of "bad" cases tends to 0, then the average complexity is equal to the complexity of the "normal" cases.
For example, suppose that a 1/(n^2) fraction of the cases are "bad" and the rest are "normal", and that "bad" cases take exactly n^4 operations, whereas "normal" cases take exactly n operations.
Then the average number of operations required is equal to:
n^4/n^2 + n(n^2 - 1)/n^2 = n^2 + n - 1/n
This function is O(n^2), but not O(n).
In practice, though, you might find that time is polynomial in all cases, and the proportion of "bad" cases shrinks exponentially. That's when you'd ignore the bad cases in calculating an average.
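A quick numeric check of that example, just plugging the stated costs into the average:

    # A 1/n^2 fraction of "bad" inputs costing n^4 each, the rest costing n each.
    for n in (10, 100, 1_000, 10_000):
        avg = (1 / n**2) * n**4 + ((n**2 - 1) / n**2) * n
        print(n, round(avg / n**2, 3), round(avg / n, 1))
    # avg / n^2 settles near 1, so the average is Theta(n^2);
    # avg / n keeps growing, so the average is not O(n).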
Average case analysis does the following:
Take all inputs of a fixed length (say n), sum up the running times of all instances of this length, and take the average.
The problem is you will probably have to enumerate all inputs of length n in order to come up with an average complexity.
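A brute-force sketch of exactly that procedure (insertion sort is my arbitrary example; it counts comparisons over every input, i.e. every permutation, of length n):

    from itertools import permutations
    from math import factorial

    def insertion_sort_comparisons(a):
        # Count the element comparisons made by a plain insertion sort.
        a, count = list(a), 0
        for i in range(1, len(a)):
            j = i
            while j > 0:
                count += 1
                if a[j - 1] <= a[j]:
                    break
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
        return count

    # Brute-force average case: enumerate every input of length n.
    for n in range(2, 9):
        total = sum(insertion_sort_comparisons(p) for p in permutations(range(n)))
        print(n, total / factorial(n))
    # The averages grow roughly like n^2 / 4, and the n! enumeration is already
    # painful at n = 8, which is exactly the problem noted above.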

how to do big-O analysis when 2 algorithms are involved

I'm confused about how to do big-O analysis for the following problem -
find an element from an array of integers. ( an example problem)
my solution
sort the array using bubble sort ( n^2 )
binary search on the array for a given element (logn)
Now, is the big-O for this n^2 or n^2 + log n? Should we only consider the higher term?
Big-O for a problem is that of the best algorithm that exists for it. The big-O for an algorithm made of two consecutive steps (like yours) is indeed the higher of the two, because e.g.
O(n^2) == O(n^2 + log n)
However, you can't say that O(n^2) is the correct O for your sample problem without proving that no better algorithm exists (which is of course not the case in the example;-).
Only the higher order term. The complexity is always the complexity of the highest term.
The way you did it, it would be O(n^2), since for large n, n^2 >>> logn
To put the analysis, well, more practically (crudely, if you prefer) than Alex did, the added log n doesn't have an appreciable effect on the outcome. Consider analyzing this in a real-world system with one million inputs, each of which takes one millisecond to sort, and one millisecond to search (it's a highly hypothetical example). Given O(n^2), the sort takes over thirty years. The search takes an additional 0.014 seconds. Which part do you care about improving? :)
Now, you'll see algorithms which clock in at O(n^2 * log n). The effect of multiplying n^2 by log n makes log n significant - in our example, it sees our thirty years and raises us four centuries.
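For the curious, the back-of-envelope figures above can be reproduced roughly like this (assuming, as the answer does, one millisecond per elementary step, n of one million, and the natural logarithm):

    import math

    n, ms = 10**6, 1e-3
    year = 365 * 24 * 3600
    print("n^2 steps      :", n**2 * ms / year, "years")               # ~31.7 years
    print("log n steps    :", math.log(n) * ms, "seconds")             # ~0.014 seconds
    print("n^2 log n steps:", n**2 * math.log(n) * ms / year, "years") # ~438 years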

Resources