What is the Big O notation for Standard Deviation and Interquartile Range (IQR)? - big-o

I want to know the time complexity for calculating the Standard Deviation and Interquartile Range (IQR).
Please correct me if I am wrong: the Standard Deviation requires O(n),
while the IQR requires O(n log n).
I appreciate any help you can provide.

To calculate the standard deviation, you need to iterate through the dataset twice, as you say (once for the mean, once for the element-wise deviations). This gives you a time complexity of O(2n) = O(n) arithmetic operations.
For IQR, you need to find the quartiles Q1 and Q3. As you say, you can achieve this by sorting the data (O(n log n)) and then accessing them directly. But you can do this faster by using the Quickselect algorithm instead, and get the quartiles in O(2n) = O(n) average time.
Notice that I am talking about time complexity with respect to arithmetic operations here. The actual time complexity can be higher if your values are unbounded and you have to take into account the cost of multiplication and addition as well.
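Here is a minimal sketch of both computations in plain Python (the function names and the simple index-based quartile convention are just for illustration). The standard deviation uses two linear passes; the IQR below uses sorting for clarity, which is the O(n log n) route, with Quickselect (or numpy.partition) as the O(n)-average alternative described above:

```python
import math

def std_dev(values):
    """Two linear passes: one for the mean, one for the squared
    deviations, so O(n) arithmetic operations overall."""
    n = len(values)
    mean = sum(values) / n                                 # first pass
    variance = sum((x - mean) ** 2 for x in values) / n    # second pass
    return math.sqrt(variance)

def iqr(values):
    """O(n log n) version: sort once, then read off Q1 and Q3.
    Replacing the sort with Quickselect would bring the average
    cost down to O(n)."""
    data = sorted(values)
    n = len(data)
    q1 = data[n // 4]          # simple index-based quartiles; real
    q3 = data[(3 * n) // 4]    # libraries interpolate between points
    return q3 - q1

sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev(sample))  # 2.0
print(iqr(sample))      # 3 (with this simple index convention)
```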

Related

Best asymptotic notation

If an algorithm's worst-case running time is 6n^4 + 2 and its best-case running time is 67 + 6n^3, what is the most appropriate asymptotic notation?
I'm trying to learn about Big O notation.
Is it Θ(n^2)?
Essentially, Big-O time complexity analysis is defined for the best-case, worst-case, or average number of operations an algorithm performs. "Is it Θ(n^2)?" You should specify which case you are looking for. Or do you mean to ask whether it is Θ(n^2) for all cases? (Which is obviously not correct.)
Having said that, we know that the algorithm performs 6n^4 + 2 operations in the worst case, so it has Θ(n^4) worst-case complexity. I've used Theta here because I know exactly how many operations are going to be performed. In the best case, it performs 67 + 6n^3 operations, so it has Θ(n^3) best-case time complexity.
How about the average time complexity? Well, I can't know as long as I am not provided with the probability distribution of the inputs. It may be the case that the best-case-like scenario rarely occurs and the average time complexity is Θ(n^4), or vice versa. So we cannot directly infer the average time complexity from the worst/best-case time complexities unless we are provided with the input probability distribution, the algorithm itself, or the recurrence relation. (Well, if the best and worst time complexities are the same, then of course we can conclude that the average time complexity is equal to them.)
If the algorithm is provided, we can calculate the average time complexity by making some very basic assumptions about the input (such as an equally likely distribution). For example, in linear search the best case is O(1) and the worst case is O(n). Assuming an equally likely distribution, you can conclude that the average time complexity is O(n) using the expectation formula: the sum over all inputs i of (probability of input i) * (number of operations for input i). A minimal sketch of this calculation follows the inequality below.
Lastly, your average time complexity CANNOT be Θ(n^2), because your best and worst time complexities are worse than quadratic. It doesn't make sense to expect this algorithm to perform n^2 operations on average while it performs n^3 operations in the best case.
Time complexity for best case <= time complexity for average <= time complexity for worst case
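To make the linear-search expectation argument above concrete, here is a minimal sketch (plain Python; the function name is hypothetical) that computes the expected number of comparisons, assuming the target is equally likely to be at any of the n positions:

```python
from fractions import Fraction

def expected_linear_search_comparisons(n):
    """Expected comparisons of linear search, assuming the target is
    equally likely to be at any of the n positions (probability 1/n each).
    Finding the target at position i (1-based) costs i comparisons."""
    return sum(Fraction(1, n) * i for i in range(1, n + 1))

# E[comparisons] = (n + 1) / 2, which grows linearly, i.e. O(n).
print(expected_linear_search_comparisons(10))  # 11/2
```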

Can O(k * n) be considered as linear complexity (O(n))?

When talking about complexity in general, things like O(3n) tend to be simplified to O(n) and so on. This is merely theoretical, so how does complexity work in reality? Can O(3n) also be simplified to O(n)?
For example, say a task implies that the solution must be of O(n) complexity, and in our code we have two linear searches over an array, which is O(n) + O(n). So, in practice, would that solution be considered linear complexity, or not fast enough?
Note that this question is asking about real implementations, not theory. I'm already aware that O(n) + O(n) is simplified to O(n).
Bear in mind that O(f(n)) does not give you the amount of real-world time that something takes: only the rate of growth as n grows. O(n) only indicates that if n doubles, the runtime doubles as well, which lumps functions together that take one second per iteration or one millennium per iteration.
For this reason, O(n) + O(n) and O(2n) are both equivalent to O(n), which is the set of functions of linear complexity, and which should be sufficient for your purposes.
Though an algorithm that takes arbitrary-sized inputs will often want the best growth rate as represented by O(f(n)), an algorithm with a faster-growing bound (e.g. O(n²)) may still be faster in practice, especially when the data set size n is limited or fixed. However, learning to reason about O(f(n)) representations can help you compose algorithms with a predictable upper bound that is optimal for your use case.
Yes, as long as k is a constant, you can write O(kn) = O(n).
The intuition behind this is that the constant k doesn't increase with the size of the input and at some point will be incomparably small relative to n, so it doesn't have much influence on the overall complexity.
Yes - as long as the number k of array searches is not affected by the input size, even for inputs that are too big to be possible in practice, O(kn) = O(n). The main idea of the O notation is to emphasize how the computation time increases with the size of the input, and so constant factors that stay the same no matter how big the input is aren't of interest.
An example of an incorrect way to apply this is to say that you can perform selection sort in linear time because you can only fit about one billion numbers in memory, and so selection sort is merely one billion array searches. However, with an ideal computer with infinite memory, your algorithm would not be able to handle more than one billion numbers, and so it is not a correct sorting algorithm (algorithms must be able to handle arbitrarily large inputs unless you specify a limit as a part of the problem statement); it is merely a correct algorithm for sorting up to one billion numbers.
(As a matter of fact, once you put a limit on the input size, most algorithms will become constant-time because for all inputs within your limit, the algorithm will solve it using at most the amount of time that is required for the biggest / most difficult input.)
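As a concrete illustration of the O(n) + O(n) = O(n) point from the question, here is a minimal sketch (plain Python; the function name is hypothetical) that scans an array twice; for a fixed number of passes k, the work still grows linearly with n:

```python
def find_min_and_max(values):
    """Two full passes over the data: O(n) + O(n) = O(2n) = O(n).
    The number of passes (k = 2) does not depend on the input size."""
    smallest = min(values)   # first linear scan
    largest = max(values)    # second linear scan
    return smallest, largest

print(find_min_and_max([3, 1, 4, 1, 5, 9, 2, 6]))  # (1, 9)
```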

what is order of complexity in Big O notation?

Question
Hi, I am trying to understand what 'order of complexity' means in terms of Big O notation. I have read many articles and have yet to find anything explaining exactly what 'order of complexity' is, even in the otherwise useful descriptions of Big O on here.
What I already understand about big O
The part I already understand about Big O notation is that we are measuring the time and space complexity of an algorithm in terms of the growth of the input size n. I also understand that certain sorting methods have best, worst and average scenarios for Big O, such as O(n), O(n^2), etc., and that n is the input size (the number of elements to be sorted).
Any simple definitions or examples would be greatly appreciated thanks.
Big-O analysis is a form of runtime analysis that measures the efficiency of an algorithm in terms of the time it takes for the algorithm to run as a function of the input size. It's not a formal benchmark, just a simple way to classify algorithms by relative efficiency when dealing with very large input sizes.
Update:
The fastest possible running time for any runtime analysis is O(1), commonly referred to as constant running time. An algorithm with constant running time always takes the same amount of time to execute, regardless of the input size. This is the ideal running time for an algorithm, but it's rarely achievable.
The performance of most algorithms depends on n, the size of the input. The algorithms can be classified as follows, from best to worst performance:
O(log n) — An algorithm is said to be logarithmic if its running time increases logarithmically in proportion to the input size.
O(n) — A linear algorithm’s running time increases in direct proportion to the input size.
O(n log n) — A superlinear algorithm is midway between a linear algorithm and a polynomial algorithm.
O(n^c) — A polynomial algorithm grows quickly based on the size of the input.
O(c^n) — An exponential algorithm grows even faster than a polynomial algorithm.
O(n!) — A factorial algorithm grows the fastest and becomes quickly unusable for even small values of n.
The run times of different orders of algorithms separate rapidly as n gets larger. Consider the run time for each of these algorithm classes with n = 10:
log 10 = 1
10 = 10
10 log 10 = 10
10^2 = 100
2^10 = 1,024
10! = 3,628,800
Now double it to n = 20:
log 20 = 1.30
20 = 20
20 log 20 = 26.02
20^2 = 400
2^20 = 1,048,576
20! = 2.43×10^18
Finding an algorithm that works in superlinear time or better can make a huge difference in how well an application performs.
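For reference, here is a minimal sketch (plain Python; the function name is hypothetical, and the base-10 logarithm matches the figures above) that reproduces these growth numbers for any n:

```python
import math

def growth_figures(n):
    """Print the value of each common complexity class at a given n,
    using the base-10 logarithm as in the figures above."""
    print(f"log {n}  = {math.log10(n):.2f}")
    print(f"n       = {n}")
    print(f"n log n = {n * math.log10(n):.2f}")
    print(f"n^2     = {n ** 2}")
    print(f"2^n     = {2 ** n}")
    print(f"n!      = {math.factorial(n)}")

growth_figures(10)
growth_figures(20)
```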
Say f(n) is in O(g(n)) if and only if there exist constants C and n0 such that f(n) <= C*g(n) for all n greater than n0.
Now that's a rather mathematical approach. So I'll give some examples. The simplest case is O(1). This means "constant". So no matter how large the input (n) of a program, it will take the same time to finish. An example of a constant program is one that takes a list of integers, and returns the first one. No matter how long the list is, you can just take the first and return it right away.
The next is linear, O(n). This means that if the input size of your program doubles, so will your execution time. An example of a linear program is the sum of a list of integers: you'll have to look at each integer once, so if the input is a list of size n, you'll have to look at n integers.
An intuitive definition could define the order of your program as the relation between the input size and the execution time.
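To illustrate the two examples above, a minimal sketch in plain Python (the function names are hypothetical):

```python
def first_element(numbers):
    """O(1): the cost does not depend on how long the list is."""
    return numbers[0]

def total(numbers):
    """O(n): every element is visited exactly once, so doubling the
    list length doubles the work."""
    result = 0
    for x in numbers:
        result += x
    return result

data = [4, 8, 15, 16, 23, 42]
print(first_element(data))  # 4
print(total(data))          # 108
```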
Others have explained big O notation well here. I would like to point out that sometimes too much emphasis is given to big O notation.
Consider matrix multiplication: the naïve algorithm is O(n^3). Using the Strassen algorithm it can be done in O(n^2.807), and there are even algorithms that achieve O(n^2.3727).
One might be tempted to choose the algorithm with the lowest big O, but it turns out that for all practical purposes the naïve O(n^3) method wins out. This is because the constant on the dominating term is much larger for the other methods.
Therefore just looking at the dominating term in the complexity can be misleading. Sometimes one has to consider all terms.
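For reference, a minimal sketch of the naïve O(n^3) matrix multiplication mentioned above (plain Python, square matrices assumed; the function name is hypothetical):

```python
def matmul_naive(A, B):
    """Naïve O(n^3) multiplication of two n x n matrices:
    three nested loops, each running n times."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

print(matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```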
Big O is about finding an upper limit for the growth of some function. See the formal definition on Wikipedia http://en.wikipedia.org/wiki/Big_O_notation
So if you've got an algorithm that sorts an array of size n, requires only a constant amount of extra space, and takes (for example) 2n² + n steps to complete, then you would say its space complexity is O(n) or O(1) (depending on whether you count the size of the input array or not) and its time complexity is O(n²).
Knowing only those O numbers, you could roughly determine how much more space and time is needed to go from n to n + 100 or 2 n or whatever you are interested in. That is how well an algorithm "scales".
Update
Big O and complexity are really just two terms for the same thing. You can say "linear complexity" instead of O(n), quadratic complexity instead of O(n²), etc...
I see that you are commenting on several answers wanting to know the specific term of order as it relates to Big-O.
Suppose f(n) = O(n^2), we say that the order is n^2.
Be careful here, there are some subtleties. You stated "we are measuring the time and space complexity of an algorithm in terms of the growth of input size n," and that's how people often treat it, but it's not actually correct. Rather, with O(g(n)) we are determining that g(n), scaled suitably, is an upper bound for the time and space complexity of an algorithm for all input of size n bigger than some particular n'. Similarly, with Omega(h(n)) we are determining that h(n), scaled suitably, is a lower bound for the time and space complexity of an algorithm for all input of size n bigger than some particular n'. Finally, if both the lower and upper bound are the same complexity g(n), the complexity is Theta(g(n)). In other words, Theta represents the degree of complexity of the algorithm while big-O and big-Omega bound it above and below.
Constant Growth: O(1)
Linear Growth: O(n)
Quadratic Growth: O(n^2)
Cubic Growth: O(n^3)
Logarithmic Growth: O(log(n))
Linearithmic Growth: O(n*log(n))
Big O uses the mathematical definition of complexity.
"Order of" is the informal, industry way of saying the same thing.

Meaning of average complexity when using Big-O notation

While answering this question, a debate began in the comments about the complexity of QuickSort. What I remember from my university days is that QuickSort is O(n^2) in the worst case, O(n log(n)) in the average case, and O(n log(n)) (but with a tighter bound) in the best case.
What I need is a correct mathematical explanation of the meaning of average complexity, to explain clearly what it is about to someone who believes the big-O notation can only be used for the worst case.
What I remember is that to define average complexity you should consider the complexity of the algorithm over all possible inputs, and count how many are degenerate and how many are normal cases. If the number of degenerate cases divided by n tends towards 0 when n gets big, then you can speak of the average complexity of the overall function for the normal cases.
Is this definition right, or is the definition of average complexity different? And if it is correct, can someone state it more rigorously than I have?
You're right.
Big O (big Theta etc.) is used to measure functions. When you write f = O(g), it doesn't matter what f and g mean. They could be an average time complexity, a worst-case time complexity, a space complexity, the distribution of primes, etc.
Worst-case complexity is a function that takes a size n and tells you the maximum number of steps the algorithm performs on an input of size n.
Average-case complexity is a function that takes a size n and tells you the expected number of steps the algorithm performs on an input of size n.
As you can see, worst-case and average-case complexity are both functions, so you can use big O to express their growth.
If you're looking for a formal definition, then:
Average complexity is the expected running time for a random input.
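As an illustration of that definition (not part of the answer above), here is a minimal sketch in plain Python that counts the comparisons made by a simple first-element-pivot quicksort on a random input versus an already-sorted input, the latter being a worst case for this pivot choice:

```python
import random

def quicksort_comparisons(arr):
    """Count comparisons of a simple quicksort that always picks the
    first element as the pivot. We count one comparison per non-pivot
    element per partition (the standard accounting; the two list
    comprehensions below actually test each element twice)."""
    if len(arr) <= 1:
        return 0
    pivot = arr[0]
    rest = arr[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

n = 500
random_input = random.sample(range(n), n)   # a random permutation
sorted_input = list(range(n))               # worst case for a first-element pivot

print(quicksort_comparisons(random_input))  # roughly 2*n*ln(n), a few thousand
print(quicksort_comparisons(sorted_input))  # exactly n*(n-1)/2 = 124,750
```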
Let's refer Big O Notation in Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes f(x)=O(g(x)) as x --> infinity if ...
So what the premise of the definition states is that the function f should take a number as an input and yield a number as an output. What input number are we talking about? It's supposedly a number of elements in the sequence to be sorted. What output number could we be talking about? It could be a number of operations done to order the sequence. But stop. What is a function? Function in Wikipedia:
a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
Are we producing exactly one output with our prior definition? No, we are not. For a given size of sequence we can get a wide variation in the number of operations. So to ensure the definition is applicable to our case, we need to reduce the set of possible outcomes (numbers of operations) to a single value. It can be a maximum ("the worst case"), a minimum ("the best case") or an average.
The conclusion is that talking about best/worst/average case is mathematically correct and using big O notation without those in context of sorting complexity is somewhat sloppy.
On the other hand, we could be more precise and use big Theta notation instead of big O notation.
I think your definition is correct, but your conclusions are wrong.
It's not necessarily true that if the proportion of "bad" cases tends to 0, then the average complexity is equal to the complexity of the "normal" cases.
For example, suppose that 1/(n^2) cases are "bad" and the rest "normal", and that "bad" cases take exactly (n^4) operations, whereas "normal" cases take exactly n operations.
Then the average number of operations required is equal to:
(n^4 / n^2) + (n * (n^2 - 1) / n^2) = n^2 + n - 1/n
This function is O(n^2), but not O(n).
In practice, though, you might find that time is polynomial in all cases, and the proportion of "bad" cases shrinks exponentially. That's when you'd ignore the bad cases in calculating an average.
Average case analysis does the following:
Take all inputs of a fixed length (say n), sum up the running times over all instances of this length, and take the average.
The problem is that you will probably have to enumerate all inputs of length n in order to come up with an average complexity.
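As a minimal sketch of that enumeration (plain Python; the choice of insertion sort and the function name are just for illustration), this averages the exact comparison counts over all n! permutations of a small n:

```python
from itertools import permutations

def insertion_sort_comparisons(arr):
    """Count the element comparisons insertion sort makes on one input."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return comparisons

n = 5
counts = [insertion_sort_comparisons(p) for p in permutations(range(n))]
print(max(counts))                 # worst case: n*(n-1)/2 = 10
print(sum(counts) / len(counts))   # average over all n! = 120 inputs
```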

Big O Notation: differences between O(n^2) and O(n.log(n))?

What is the difference between O(n^2) and O(n.log(n))?
n^2 grows more quickly.
Big O calculates an upper limit of running time relative to the size of a data set (n).
An O(n*log(n)) algorithm is not always faster than an O(n^2) algorithm, but when considering the worst case it probably is. An O(n^2) algorithm takes ~4 times longer when you double the working set (worst case); for an O(n*log(n)) algorithm it is less than that. The bigger your data set is, the more of an advantage the O(n*log(n)) algorithm usually has.
EDIT: Thanks to 'harms', I'll correct a wrong statement in my first answer: I said that when considering the worst case O(n^2) would always be slower than O(n*log(n)); that's wrong, since both are only specified up to a constant factor!
Sample: Say we have the worst case and our data set has size 100.
O(n^2) --> 100*100 = 10000
O(n*log(n)) --> 100*2 = 200 (using log_10)
The problem is that both can be multiplied by a constant factor, say we multiply c to the latter one. The result will be:
O(n^2) --> 100*100 = 10000
O(n*log(n)) --> 100*2*c = 200*c (using log_10)
So for c > 50, the n*log(n) algorithm takes longer than the n^2 algorithm at n = 100.
I have to update my statement: for every problem, when considering the worst case, an O(n*log(n)) algorithm will be quicker than an O(n^2) algorithm for sufficiently large data sets.
The reason is: the choice of c is arbitrary but constant. If you make the data set large enough, it will dominate the effect of every constant choice of c, and when comparing two algorithms the constants for both stay fixed!
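A minimal sketch of that crossover (plain Python; the constant c = 50, the base-10 logarithm, and the function name follow the example above and are only illustrative):

```python
import math

def crossover_n(c, start=2):
    """Smallest n at which n^2 exceeds c * n * log10(n),
    i.e. where the quadratic algorithm becomes the slower one."""
    n = start
    while n ** 2 <= c * n * math.log10(n):
        n += 1
    return n

# With c = 50, c*n*log10(n) is still at least n^2 at n = 100,
# but the quadratic term wins as soon as n is slightly bigger:
print(crossover_n(50))  # 101
```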
You'll need to be a bit more specific about what you are asking, but in this case O(n log(n)) is faster.
Algorithms that run in O(nlog(n)) time are generally faster than those that run in O(n^2).
Big-O defines an upper bound on performance: how the length of time it takes to perform the task grows as the size of the data set (n) grows. You might be interested in the iTunes U algorithms course from MIT.
n log(n) grows significantly slower
"Big Oh" notation gives an estimated upper bound on the growth in the running time of an algorithm. If an algorithm is supposed to be O(n^2), in a naive way, it says that for n=1, it takes a max. time 1 units, for n=2 it takes max. time 4 units and so on. Similarly for O(n log(n)), it says the grown will be such that it obeys the upper bound of O(n log(n)).
(If I am more than naive here, please correct me in a comment).
I hope that helps.
