I have 3 algorithms (A1, A2 and A3) and their estimated time complexities are O(n log n), O(K n) and O(Q n) respectively, where K and Q are parameters specific to those algorithms. Then I have a fourth algorithm that runs those 3 algorithms consecutively (each one needs the results of the previous).
I'm confused about how I should estimate the total complexity of the suite of algorithms. As far as I can tell, O(n log n) grows faster than O(K n) and O(Q n), so the most important part in terms of time consumption will be A1, and that will probably be the dominant behavior for large enough n. But that doesn't reflect the fact that even after A1 is done, A2 and A3 will still take a lot of time.
So I was wondering: how should I account for that? Is it enough to just say the complexity is O(n log n)?
The total time complexity is:
O(n log n) + O(K n) + O(Q n)
If we assume that K and Q are parameters that grow slower than, or similarly to, log n, then the total time complexity is:
O(n log n)
since we are using big-O notation. Otherwise the total time complexity remains the sum above (or whichever of its terms dominate).
The idea is to keep the term that will dominate the other term(s) as n grows.
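For example, if both K and Q happen to be O(log n) (an illustrative assumption, not something stated in the question), the sum collapses as follows:

O(n log n) + O(K n) + O(Q n) = O(n log n) + O(n log n) + O(n log n) = O(n log n)

If instead, say, K = Θ(n), then the O(K n) term is O(n^2) and dominates, making the total O(n^2).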
Related
I have to construct an algorithm whose upper bound is O(n² log n). Can anyone provide examples of what an O(n² log n) algorithm would look like? I can't seem to wrap my mind around it.
My mental image of it would be two nested for loops and within the second loop a log n operation is performed. Is this correct?
There are many ways to get a runtime of O(n² log n) in an algorithm. Here's a sampler.
Sorting a list of n² items efficiently. For example, if you take n items, form all n² pairs of those items, and then sort them using something like heapsort, the runtime will be O(n² log n²) = O(n² log n). This follows from properties of logarithms: log n² = 2 log n = O(log n). More generally, running an O(n log n)-time algorithm on an input of size n² will give you an O(n² log n) runtime.
Running Dijkstra's algorithm on a dense graph using a binary heap. The runtime of Dijkstra's algorithm on a graph with n nodes and m edges, using a binary heap, is O(m log n). A dense graph is one where m = Θ(n²), so Dijkstra's algorithm would take time O(n² log n) in this case. This is also the time bound for running some other graph algorithms on dense graphs, such as Prim's algorithm when using a binary heap.
Certain divide-and-conquer algorithms. A divide-and-conquer algorithm whose recurrence is T(n) = 2T(n / √2) + O(n²) has a runtime of O(n² log n). This comes up, for example, as a subroutine in the Karger-Stein minimum cut algorithm.
Performing n² searches on a binary tree of n items. The cost of each search is O(log n), so this would work out to O(n² log n) total work. More generally, doing any O(log n)-time operation a total of O(n²) times will give you this bound.
Naive construction of a suffix array. A suffix array is an array of all the suffixes of a string in sorted order. Naively sorting the suffixes requires O(n log n) comparisons, but since comparing two suffixes can take time O(n), the total cost is O(n² log n); a sketch of this approach follows the list.
Constructing a 2D range tree. The range tree data structure allows for fast querying of all points in k-D space within an axis-aligned box. In two dimensions, the construction time is O(n² log n), though this can be improved to O(n log n) using some more clever techniques.
This is, of course, not a comprehensive list, but it gives a sampler of where O(n² log n) runtimes pop up in practice.
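Following up on the suffix-array item above, here is a minimal sketch of the naive construction (naive_suffix_array is just an illustrative name). Python's sort performs O(n log n) comparisons, and each comparison of two suffixes can cost O(n) character comparisons, for O(n² log n) total:

def naive_suffix_array(s):
    # Sort the suffix start positions by comparing the suffixes themselves;
    # each comparison may scan up to O(n) characters.
    return sorted(range(len(s)), key=lambda i: s[i:])

print(naive_suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]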
Technically, any algorithm that is asymptotically no slower than n² log n is O(n² log n). Examples include the "do nothing" algorithm, Θ(1); binary search, Θ(log n); linear search, Θ(n); and bubble sort, Θ(n²).
The algorithm you describe would be O(n² log n) too, while also being Ω(n² log n) and thus Θ(n² log n):
from bisect import bisect_left
n, arr = 1000, list(range(1000))  # example input: a sorted array of size n
for i in range(n):
    for j in range(n):
        bisect_left(arr, j)  # binary search in an array of size n: O(log n)
One approach to constructing an O(n² log n) algorithm is to start with an O(n³) algorithm and optimize it so that one of the loops runs in log n steps instead of n.
That could be non-trivial, though. Searching Google turns up the question "Why is the Big-O of this algorithm N^2*log N?", where the problem is:
Fill array a from a[0] to a[n-1]: generate random numbers until you get one that is not already in the previous indexes.
Even though there are faster algorithms to solve this problem, the one presented is O(n² log n).
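A minimal sketch of that approach, assuming the random numbers are drawn from the range [0, n) (the range is not specified in the quote, and fill_unique is just an illustrative name); the expected number of retries combined with the linear duplicate checks works out to O(n² log n) expected time:

import random

def fill_unique(n):
    a = []
    for i in range(n):
        while True:
            x = random.randrange(n)  # draw a candidate value
            if x not in a:           # O(i) linear scan of the previous entries
                a.append(x)
                break
    return a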
I have an algorithm that first does something in O(n*log(n)) time and then does something else in O(n^2) time. Am I correct that the total complexity would be
O(n*log(n) + n^2)
= O(n*(log(n) + n))
= O(n^2)
since the log(n) term is dominated by the n term?
The statement is correct, as O(n log n) is a subset of O(n^2); however, a formal proof would consist of choosing suitable constants.
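For instance (one concrete choice of constants, as a sketch): for all n >= 1 we have log(n) <= n, so

n*log(n) + n^2 <= n*n + n^2 = 2*n^2

which satisfies the definition of big-O with c = 2 and n0 = 1.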
If both parts are called with equal probability, then you are right. But if their probabilities are not equal, you have to do an amortized analysis, in which you spread the cost of the rare expensive calls (n²) over the many fast calls (n log(n)).
For quicksort, for example (which generally takes n log(n), but rarely takes n²), you can prove that the average running time is n log(n) by means of such an amortized analysis.
One of the rules of complexity analysis is that you drop the terms with slower growth, along with constant factors.
n log n vs n^2 (divide both by n)
log n vs n
log n is smaller than n, so you can drop it from the complexity expression.
So if the complexity is O(n log n + n^2), then when n is really big the value of n log n is not significant compared to n^2; this is why you drop it and rewrite the complexity as O(n^2).
Will performing an O(log N) algorithm N times give O(N log(N))? Or is it O(N)?
e.g. Inserting N elements into a self-balancing tree.
int i = 0;
while (i < N) {
    insert(itemsToInsert[i]);  // each insert into the self-balancing tree is O(log N)
    i++;
}
It's definitely O(N log(N)). It COULD also be O(N), if you could show that the sequence of calls, as a total, grows slowly enough (because while SOME calls are O(log N), enough others are fast enough, say O(1), to bring the total down).
Remember: O(f) means the algorithm is no SLOWER than f, but it can be faster (even if just in certain cases).
N times O(log(N)) leads to O(N log(N)).
Big-O notation describes the asymptotic behavior of the algorithm. Here the cost of each additional step is O(log N); for an O(N) algorithm, the cost of each additional step would have to be O(1), so that asymptotically the total cost is bounded by a straight line.
Therefore O(N) is too low a bound; O(N log N) is about right.
Yes and no.
Calculus really helps here. The first iteration has cost log(1), the second iteration log(2), etc., up to the Nth iteration, which costs log(N). Rather than thinking of the problem as a multiplication, think of it as a sum you can bound with an integral...
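Concretely (using the natural logarithm; the base only changes constant factors):

log(1) + log(2) + ... + log(N) ≈ ∫ log(x) dx over [1, N] = N log(N) - N + 1 = Θ(N log N)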
This happens to come out as O(N log(N)), but that is kind of a coincidence.
I am developing an algorithm that takes O(log^3 n) time. (NOTE: take O as Big Theta, though Big O would be fine too.)
I am unsure whether O(log^3 n), or even O(log^2 n), is considered more, less, or equally complex compared to O(n log n).
If I were to follow the rules straight away, I'd say O(n log n) is the more complex one, but still, I don't have any clue as to why or how.
I've done some research but I haven't been able to find an answer to this question.
Thank you very much.
Write m = log n (base 2), so that (log n)^3 = m^3 while n log n = 2^m · m. Since 2^m grows faster than m^2, the product 2^m · m eventually exceeds m^3. Thus n log n is "bigger" than (log n)^3. This can easily be generalized to (log n)^k via induction.
If you graph the two functions together you can see that n log(n) grows faster than log^3 n.
To prove this, you need to show that n log n > log^3 n for all values of n greater than some constant c. Find such a c and you have your proof.
In fact, n log(n) grows faster than log^x n for any positive x.
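A quick numeric illustration of the gap (a throwaway script; base-2 logarithms are an arbitrary choice, since the base does not affect the asymptotic comparison):

import math

for n in (10, 10**3, 10**6, 10**9):
    n_log_n = n * math.log2(n)
    log_cubed = math.log2(n) ** 3
    print(f"n = {n:>10}:  n log n = {n_log_n:.3e}   (log n)^3 = {log_cubed:.3e}")

Even at moderate n, the n log n column dwarfs the (log n)^3 column.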
Binary search has an average-case performance of O(log n), and Quick Sort is O(n log n). Is O(n log n) the same as O(n) + O(log n)?
Imagine a database with every person in the world. That's 6.7 billion entries. O(log n) is a lookup on an indexed column (e.g. primary key). O(n log n) is returning the entire population in sorted order on an unindexed column.
O(log n) was finished before you finished reading the first word of that sentence.
O(n log n) is still calculating...
Another way to imagine it:
log n is proportional to the number of digits in n.
n log n is n times greater.
Try writing the number 1000 once versus writing it one thousand times. The first takes O(log n) time, the second takes O(n log n) time.
Now try that again with 6700000000. Writing it once is still trivial. Now try writing it 6.7 billion times. Even if you could write it once per second you'd be dead before you finished.
You could also visualize the difference by plotting both functions.
No, O(n log n) = O(n) * O(log n)
In mathematics, when you have an expression (e.g. e = mc^2) with no operator between the terms, you multiply them.
Normally the way to visualize O(n log n) is "do something which takes log n computations n times."
If you had an algorithm which first iterated over a list and then did a binary search of that list (which would be N + log N), you could express that simply as O(n), because the n dwarfs the log n for large values of n.
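A small sketch of the two patterns (the function names are just illustrative), assuming the input list is already sorted:

from bisect import bisect_left

def search_each(sorted_items, queries):
    # One O(log n) binary search per query: n searches -> O(n log n) total.
    return [bisect_left(sorted_items, q) for q in queries]

def scan_then_search(sorted_items, target):
    # One O(n) pass followed by a single O(log n) search -> O(n + log n) = O(n).
    total = sum(sorted_items)
    return total, bisect_left(sorted_items, target)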
A (log n) plot increases, but is concave downward, which means:
- It increases when n gets larger
- Its rate of increase decreases when n gets larger
A (n log n) plot increases, and is (slightly) concave upward, which means:
- It increases when n gets larger
- Its rate of increase (slightly) increases when n gets larger
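If you want to see those shapes, here is a minimal plotting sketch (assuming matplotlib and numpy are available; base-2 logarithms are an arbitrary choice):

import numpy as np
import matplotlib.pyplot as plt

n = np.linspace(2, 1000, 500)
plt.plot(n, np.log2(n), label="log n")        # concave downward
plt.plot(n, n * np.log2(n), label="n log n")  # (slightly) concave upward
plt.xlabel("n")
plt.legend()
plt.show()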
Depends on whether you tend to visualize n as having a concrete value.
If you tend to visualize n as having a concrete value, and the units of f(n) are time or instructions, then O(log n) is n times faster than O(n log n) for a given task of size n. For memory or space units, O(log n) is n times smaller for a given task of size n. In this case, you are focusing on the codomain of f(n) for some known n. You are visualizing answers to questions about how long something will take or how much memory an operation will consume.
If you tend to visualize n as a parameter having any value, then O(log n) is n times more scalable. O(log n) can complete n times as many tasks of size n. In this case, you are focused on the domain of f(n). You are visualizing answers to questions about how big n can get, or how many instances of f(n) you can run in parallel.
Neither perspective is better than the other. The former can be used to compare approaches to solving a specific problem. The latter can be used to compare the practical limitations of the given approaches.