Is there such a thing as "negative" big-O complexity? [duplicate] - algorithm

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Are there any O(1/n) algorithms?
This just popped in my head for no particular reason, and I suppose it's a strange question. Are there any known algorithms or problems which actually get easier or faster to solve with larger input? I'm guessing that if there are, it wouldn't be for things like mutations or sorting, it would be for decision problems. Perhaps there's some problem where having a ton of input makes it easy to decide something, but I can't imagine what.
If there is no such thing as negative complexity, is there a proof that there cannot be? Or is it just that no one has found it yet?

No that is not possible. Since Big-Oh is suppose to be an approximation of the number of operations an algorithm performs related to its domain size then it would not make sense to describe an algorithm as using a negative number of operations.
The formal definition section of the wikipedia article actually defines the Big-Oh notation in terms of using positive real numbers. So there actually is not even a proof because the whole concept of Big-Oh has no meaning on the negative real numbers per the formal definition.
Short answer: Its not possible because the definition says so.

update
Just to make it clear, I'm answering this part of the question: Are there any known algorithms or problems which actually get easier or faster to solve with larger input?
As noted in accepted answer here, there are no algorithms working faster with bigger input.
Are there any O(1/n) algorithms?
Even an algorithm like sleep(1/n) has to spend time reading its input, so its running time has a lower bound.
In particular, author referes relatively simple substring search algorithm:
http://en.wikipedia.org/wiki/Horspool
PS But using term 'negative complexity' for such algorithms doesn't seem to be reasonable to me.

To think in an algorithm that executes in negative time, is the same as thinking about time going backwards.
If the program starts executing at 10:30 AM and stops at 10:00 AM without passing through 11:00 AM, it has just executed with time = O(-1).
=]
Now, for the mathematical part:
If you can't come up with a sequence of actions that execute backwards in time (you never know...lol), the proof is quite simple:
positiveTime = O(-1) means:
positiveTime <= c * -1, for any C > 0 and n > n0 > 0
Consider the "C > 0" restriction.
We can't find a positive number that multiplied by -1 will result in another positive number.
By taking that in account, this is the result:
positiveTime <= negativeNumber, for any n > n0 > 0
Wich just proves that you can't have an algorithm with O(-1).

Not really. O(1) is the best you can hope for.
The closest I can think of is language translation, which uses large datasets of phrases in the target language to match up smaller snippets from the source language. The larger the dataset, the better (and to a certain extent faster) the translation. But that's still not even O(1).

Well, for many calculations like "given input A return f(A)" you can "cache" calculation results (store them in array or map), which will make calculation faster with larger number of values, IF some of those values repeat.
But I don't think it qualifies as "negative complexity". In this case fastest performance will probably count as O(1), worst case performance will be O(N), and average performance will be somewhere inbetween.
This is somewhat applicable for sorting algorithms - some of them have O(N) best-case scenario complexity and O(N^2) worst case complexity, depending on the state of data to be sorted.
I think that to have negative complexity, algorithm should return result before it has been asked to calculate result. I.e. it should be connected to a time machine and should be able to deal with corresponding "grandfather paradox".

As with the other question about the empty algorithm, this question is a matter of definition rather than a matter of what is possible or impossible. It is certainly possible to think of a cost model for which an algorithm takes O(1/n) time. (That is not negative of course, but rather decreasing with larger input.) The algorithm can do something like sleep(1/n) as one of the other answers suggested. It is true that the cost model breaks down as n is sent to infinity, but n never is sent to infinity; every cost model breaks down eventually anyway. Saying that sleep(1/n) takes O(1/n) time could be very reasonable for an input size ranging from 1 byte to 1 gigabyte. That's a very wide range for any time complexity formula to be applicable.
On the other hand, the simplest, most standard definition of time complexity uses unit time steps. It is impossible for a positive, integer-valued function to have decreasing asymptotics; the smallest it can be is O(1).

I don't know if this quite fits but it reminds me of bittorrent. The more people downloading a file, the faster it goes for all of them

Related

Comparison based sorting is WC min time nlogn, so what about best/average case

There is a theorem in Cormen which says...
(Th 8.1)
"For comparison based sorting techniques you cannot have an algorithm to sort a given list, which takes time less than nlogn time (comparisons) in the worst case"
I.e.
the worst case time complexity is Omega (nlogn) for Comparison based sorting technique...
Now what I was searching is that whether there exists a statement in case of the best case..or even for avg case
Which states something like:
You cannot have a sorting Algorithm which takes time less than some X to sort a given list of elements...in the best case
Basically do we have any lower bound for best case Algorithm. Or even as a matter of fact for average case. (I tried my best to find this, but couldn't find anywhere). Please also tell me whether the point I am raising is even worth it.
Great question! The challenge with defining “average case” complexity is that you have to ask “averaged over what?”
For example, if we assume that the elements of the array have an equal probability of being in any one of the n! possible permutations of n elements, then the Ω(n log n) bound on comparison sorting still holds, though I seem to remember that the proof of this is fairly complicated.
On the other hand, if we assume that there are trends in the data (say, you’re measuring temperatures over the course of a day, where you know they generally trend upward and then downward). Many real world data sets look like this, and there are algorithms like Timsort that can take advantage of those patterns to speed up performance. So perhaps “average” here would mean “averaged over all possible plots formed by a rising and then falling sequence with noise terms added in.” I haven’t encountered anyone working on analyzing algorithms in those cases, but I’m sure some work has been done there and there may even be some nice average case measures there that are less well known.

Big Theta of Factorial Multiplied by a Coefficient

For a function with a run time of (cn)! where c is a coefficient >= 0 and c != n, would the tight bound of the run be Θ(n!) or Θ((cn)!)? Right now, I believe it would be Θ((cn)!) since they would differ by a coefficent >= n since cn != n.
Thanks!
Edit: A more specific example to clarify what I'm asking:
Will (7n)!, (5n/16)! and n! all be Θ(n!)?
You can use Stirling's approximation to get that if c>1 then (cn)! is asymptotically larger than pow(c,n)*n!, which is not O(n!) since the quotient diverges. As a more elementary approach consider this example for c=2: (2n)!=(2n)(2n-1)...(n+1)n!>n!n! and (n!n!)/n!=n! diverges, so (2n)! is NOT O(n!).
Will (7n)!, (5n/16)! and n! all be Θ(n!)?
I think there are two answers to your question.
The shorter one is from the purely theoretical point of view. Of those 3 only the n! lies in the class of Θ(n!). The second lies in the O(n!) (note big-O instead of big-Theta) and (7n)! is slower than Θ(n!), it lies in Θ((7n)!)
There is also a longer but more practical answer. And to get to it we first need to understand what is the big deal with this whole big-O and big-Theta business in the first place?
The thing is that for many practical tasks there are many algorithms and not all of them are equally or even similarly efficient. So the practical question is: can we somehow capture this difference in performance in an easy to understand and compare way? And this is the problem that big-O/big-Theta are trying to solve. The idea behind this method is that if we look at some algorithm with some complicated real formula for the exact time, there is only 1 term that grows faster than all others and thus dominates the time as the problem gets bigger. So let's compress this big formula to that dominant term. Then we can compare those terms and if they are different, we can easily say which is the better algorithm (7*n^2 is clearly better than 2*n^3).
Another idea is that the term "operation" is usually not that well defined at the level people usually think about algorithms. Which "operation" actually maps to a single CPU instruction and which to a few depends on many factors such as particular hardware. Also the instructions themselves can take different time to execute. Moreover sometimes the algorithm's working time is dominated by memory access than CPU instructions and those components are not easily additive. The morale of this story is that if two algorithms are different only in a scalar coefficient, you can't really compare those algorithms just theoretically. You need to compare some implementations in some particular environment. This is why algorithms complexity measure typically boils down to something like O(n^k) where k is a constant.
There is one more consideration: practicality. If the algorithm is some polynomial, there is a huge practical difference between cases a=3 and a=4 in O(n^a). But if it is something like O(2^(n^a)), then there is not much difference what exactly the a as along as a>1. This is because 2^n grows fast enough to make it impractical for almost any realistic n irrespective of a. So in practical terms it is often good enough approximation to put all such algorithms into a single "exponential algorithms" bucket and say they are all impractical even despite the fact there is a huge difference between them. This is where some mathematically unconventional notations like 2^O(n) come from.
From this last practical perspective the difference between Θ(n!) and Θ((7n)!) is also very little: both are totally impractical because both lie beyond even the exponential bucket of 2^O(n) (see Stirling's formula that shows that n! grows a bit faster than (n/e)^n). So it makes sense to put all such algorithms in another bucket of "factorial complexity" and mark them as impractical as well.

How does one calculate the runtime of an algorithm?

I've read about algorithm run-time in some algorithm books, where it's expressed as, O(n). For eg., the given code would run in O(n) time for the best case & O(n3) for the worst case. What does it mean & how does one calculate it for their own code? Is it like linear time , and is it like each predefined library function has their own run-time which should be kept in mind before calling it? Thanks...
A Beginner's Guide to Big O Notation might be a good place to start:
http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
also take a look at Wikipedia
http://en.wikipedia.org/wiki/Big_O_notation
there are several related questions and good answers on stackoverflow
What is a plain English explanation of "Big O" notation?
and
Big-O for Eight Year Olds?
Should't this be in math?
If you are trying to sort with bubble sort array, that is already sorted, then
you can check, if this move along array checked anything. If not, all okey -- we done.
Than, for best case you will have O(n) compraisons(n-1, to be exact), for worst case(array is reversed) you will have O(n^2) compraisons(n(n-1)/2, to be exact).
More complicated example. Let's find maximum element of array.
Obvilously, you will always do n-1 compraisons, but how many assignments on average?
Complicated math answers: H(n) -1.
Usually, It is easy to Your Answerget best and worst scenarios, but average require a lot of math.
I would suggest you read Knuth, Volume 1. But who would not?
And, formal definition:
f(n)∈O(g(n)) means exist n∈N: for all m>n f(m)
In fact, you must read O-notation about on wiki.
The big-O notation is one kind of asymptotic notation. Asymptotic notation is an idea from mathematics, which describes the behavior of functions "in the limit" - as you approach infinity.
The problem with defining how long an algorithm takes to run is that you usually can't give an answer in milliseconds because it depends on the machine, and you can't give an answer in clock cycles or as an operation count because that would be too specific to particular data to be useful.
The simple way of looking at asymptotic notation is that it discards all the constant factors in a function. Basically, a n2 will always be bigger that b n if n is sufficiently large (assuming everything is positive). Changing the constant factors a and b doesn't change that - it changes the specific value of n where a n2 is bigger, but doesn't change that it happens. So we say that O(n2) is bigger than O(n), and forget about those constants that we probably can't know anyway.
That's useful because the problems with bigger n are usually the ones where things slow down enough that we really care. If n is small enough, the time taken is small, and the gains available from choosing different algorithms are small. When n gets big, choosing a different algorithm can make a huge difference. Also, the problems happen in the real world are often much bigger than the ones we can easily test against - and often, they keep growing over time (e.g. as databases accumulate more data).
It's a useful mathematical model that abstracts away enough awkward-to-handle detail that useful results can be found, but it's not a perfect system. We don't deal with infinite problems in the real world, and there are plenty of times when problems are small enough that those constants are relevant for real-world performance, and sometimes you just have to time things with a clock.
The MIT OCW Introduction to Algorithms course is very good for this kind of thing. The videos and other materials are available for free, and the course book (not free) is among the best books available for algorithms.

Interpreting results of algorithm experiment?

I have a quick sort algorithm and a counter that I increment every time a compare or swap is performed. Here are my results for random integer arrays of different sizes -
Array size --- number of operations
10000 --- 238393
20000 --- 511260
40000 --- 1120512
80000 --- 2370145
Edit:
I have removed the incorrect question I was asking in this post. What I am actually asking is -
What Im trying to find out is 'do these results stack up with the theoretical complexity of quicksort (O(N*log(N)))?
Now, basically what I need to know is how do I interpret those results
so I can determine the Big Oh complexity of QuickSort?
By definition, it is impossible to determine the asymptotic complexity of algorithms by considering their behavior for any (finite) set of inputs and extrapolating.
If you want to try anyway, what you should do is what you do in any science: look at the data, come up with a hypothesis (e.g., "these data are approximated by the curve ...") and then try to disprove it (by checking more numbers, for instance). If you can't disprove the hypothesis through further experiments aimed at disproving it, then it can stand. You'll never really know whether you've got it right using this method, but then again, that's true of all empirical science.
As others have pointed out, the preferred (this is an understatement; universally accepted and sole acceptable may be a better phrasing) method of determining the asymptotic bounds of an algorithm is, well, to analyze it mathematically, and produce a proof that it obeys the bound.
EDIT:
This is ignoring the intricacies involved in fitting curves to data, as well as the fact that designing an effectiv experiment is hard to do. I assume you know how to fit curves (it would be no different here than in any other data analysis... you just need to know what you're looking for and how to look) and that you have designed your experiment in such a way that (a) you can answer the questions you want to answer and (b) the answers you get will have some kind of validity. These are separate issues and require literally years of formal education and training in order to begin to properly use and understand.
Though you cannot get the asymptotic bound of your method by only experimenting, sometimes you can evaluate its behavior by drawing a graph of the complexities similar to your function, and looking at the behavior.
You can do it with drawing a graph of some functions y = f(n) such that f(10000) ~= g(10000) [where g is your function], and check the behavior difference.
In your example, we get the following graphs:
We can clearly see that:
The behavior of your results is sub quadric
The behavior is above linear.
It is very close to logarithmic behavior, but just a bit "higher".
From this, we can deduce that your algorithms is probably O(n^2) [not strict! remember, big O is not a strict bound], and also could be O(nlogn), if we deduce the difference from the O(nlogn) function is a noise.
Notes:
This method proves nothing about the algorithm, and particularly
doesn't give you any worst case [or even average case] bound.
This method is usually used to evaluate two algorithms, and not some pre defined functions, to check which is better for which inputs.
EDIT:
I drew all the graphs as y1(x) = f(x), y2(x) = g(x) , ... because I found it easier to explain this way, but usually when you compare two algorithms [as you often actually use this method], the function is y(x) = f(x) / g(x), and you check if y(x) is staying close to 1, growing, shrinking?

How do I find out out the fundamental operation when calculating run-time complexity?

I am trying to get the worst run-time complexity order on a couple of algorithms created. However I have run into a problem that I keep tending to select the wrong or wrong amount of fundamental operations for an algorithm.
To me it appears to be that the selection of the fundamental operation is more of an art than a science. After googling and reading my text boxes, I still have not found a good definition. So far I have defined it as "An operation that always occurs within an algorithms execution" such as a comparison or array manipulation.
But algorithms often have many comparisons that are always executed so which operation do you pick?
I agree to some degree it's an art, so you should always clarify when writing documentation, etc.. But usually it's a "visit" to the underlying data structure. So like you said, for an array it's a comparison or a swap, for a hash map it may be a manual examination of a key, for a graph it's a visit to a vertex or edge, etc.
Even practicing complexity theorists have disagreements about this sort of thing, so what follows may be a bit subjective: http://blog.computationalcomplexity.org/2009/05/shaving-logs-with-unit-cost.html
The purpose of big-O notation is to summarize the efficiency of an algorithm for the reader. In practical contexts, I am most concerned with how many clock cycles an algorithm takes, assuming that the big-O constant is neither extremely small or large (and ignoring the effects of the memory hierarchy); this is the "unit-cost" model alluded to in the linked post.
The reason to count comparisons for sorting algorithms is that the cost of a comparison depends on the type of the input data. You could say that a sorting algorithm takes O(c n log n) cycles where c is the expense of a comparison, but it's simpler in this case to count comparisons instead because the other work performed by the algorithm is O(n log n). There's a sorting algorithm that sorts the concatenation of n sorted arrays of length n in n^2 log n steps and n^2 comparisons; here, I would expect that the number of comparisons and the computational overhead be stated separately, because neither necessarily dominates the other.
This only works when You have actually implemented the algorithm, but You could just use a profiler to see which operation is the bottleneck. That's a practical point of view. In theory, some assume that everything that is not the fundamental operation runs in zero time.
The somewhat simple definition I have heard is:
The operation which is executed at least as many times as any other
operation in the algorithm.
For example, in a sorting algorithm, these tend to be comparisons rather than assignments as you almost always have to visit and 'check' an element before you re-order it, but the check may not result in a re-ordering. So there will always be at-least as many comparisons as assignments.

Resources