Are there any famous algorithms with this complexity?
I was thinking of something like a skip list where a node's level is not determined by the number of tails in a sequence of coin tosses, but is instead a number drawn uniformly at random from the range (1, log(n)). Such a data structure would have a find(x) operation with complexity O(n / log(n)) (I think, at least). I was curious whether there was anything else.
It's common to see algorithms whose runtime is of the form O(n^k / log n) or O(log n / log log n) when using the Method of Four Russians to speed up an existing algorithm. The classic Four Russians speedup reduces the cost of a matrix/vector product over Boolean matrices from O(n^2) to O(n^2 / log n). The standard dynamic programming algorithm for sequence alignment of two length-n strings runs in time O(n^2), which can be decreased to O(n^2 / log n) using a similar trick.
Similarly, the prefix parity problem - in which you need to maintain a sequence of Boolean values while supporting "flip a bit" and "report the parity of a prefix of the sequence" operations - can be solved in time O(log n / log log n) per operation by using a Four Russians speedup. (Notice that if you express this runtime as a function of k = log n, it is O(k / log k).)
Related
I am having trouble understanding the difference between log(k) and log(n) in complexity analysis.
I have an array of size n. I have another number k < n that is an input of the algorithm (so it's not a constant known ahead of time). What are some examples of algorithms that would have log(n) vs those that would have log(k) in their complexity? I can only think of algorithms that have log(n) in their complexity.
For example, mergesort has log(n) complexity in its runtime analysis (O(n log n)).
If your algorithm takes a list of size n and a number of magnitude k < n, the input size is on the order of n + log(k) (assuming k may be on the same asymptotic order as n). Why? k is a number represented in a place-value system (e.g., binary or decimal), and a number of magnitude k requires on the order of log k digits to represent.
Therefore, if your algorithm takes an input k and uses it in a way that requires all of its digits to be used or checked (e.g., when checking equality), then the complexity of the whole algorithm is at least on the order of log k. If you do more complicated things with the number, the complexity could be even higher. For instance, if you have something like for i = 1 to k do ..., the complexity of your algorithm is at least on the order of k - maybe higher, since you're comparing against a (log k)-bit number k times (although i will use fewer bits than k for many values of i, depending on the base).
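As a rough sketch of that contrast (the helper names digits_of and loop_to are just made up for illustration): touching every digit of k is on the order of log k work, while a loop up to k does k iterations no matter how compactly k is written down.

def digits_of(k, base=2):
    # Reading or comparing every digit of k is Theta(log k) work.
    count = 0
    while k > 0:
        k //= base
        count += 1
    return count

def loop_to(k):
    # "for i = 1 to k" performs k iterations, which is exponentially
    # more than the log k digits needed just to write k down.
    steps = 0
    for i in range(1, k + 1):
        steps += 1
    return steps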
There's no "one-size-fits-all" explanation as to where an O(log k) term might come up.
You sometimes see this runtime arise in searching and sorting algorithms where you only need to rearrange some small part of the sequence. For example, the C++ standard library's std::partial_sort function, which rearranges the sequence so that the first k elements are in sorted order and the remainder are in arbitrary order, runs in time O(n log k). One way this could be implemented is to maintain a max-heap of the k smallest elements seen so far and do n insertions/deletions on it, which is n operations that each take time O(log k). Similarly, there's an O(n log k)-time algorithm for finding the k largest elements in a data stream, which works by maintaining a min-heap of at most k elements.
(Neither of these approaches is optimal, though - you can do a partial sort in time O(n + k log k) using a linear-time selection algorithm, and you can similarly find the top k elements of a data stream in O(n).)
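For illustration, here's a minimal sketch of the heap-based stream approach described above (the function name top_k is made up; Python's heapq.nlargest does essentially the same thing):

import heapq

def top_k(stream, k):
    # Keep a min-heap of the k largest values seen so far; n pushes or
    # replacements at O(log k) each gives O(n log k) overall.
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k, so swap it in.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

For example, top_k([5, 1, 9, 3, 7], 2) returns [9, 7].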
You also sometimes see this runtime in algorithms that involve a divide-and-conquer strategy where the size of the problem in question depends on some parameter of the input size. For example, the Kirkpatrick-Seidel algorithm for computing a convex hull does linear work per level in a recurrence, and the number of layers is O(log k), where k is the number of elements in the resulting convex hull. The total work is then O(n log k), making it an output-sensitive algorithm.
In some cases, an O(log k) term can arise because you are processing a collection of elements one digit at a time. For example, radix sort has a runtime of O(n log k) when used to sort n values that range from 0 to k, inclusive, with the log k term arising from the fact that there are O(log k) digits in the number k.
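A minimal LSD radix sort sketch along those lines (radix_sort is a made-up name, and the values are assumed to be non-negative integers; the number of passes is the number of base-10 digits of the maximum value, i.e. O(log k)):

def radix_sort(nums, base=10):
    # Each pass is a stable bucket pass on one digit; there are
    # O(log_base(k)) passes, so the total work is O(n log k).
    if not nums:
        return nums
    k = max(nums)
    exp = 1
    while exp <= k:
        buckets = [[] for _ in range(base)]
        for x in nums:
            buckets[(x // exp) % base].append(x)
        nums = [x for bucket in buckets for x in bucket]
        exp *= base
    return nums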
In graph algorithms, where the number of edges (m) is related to but can be independent of the number of nodes (n), you often see runtimes like O(m log n), as is the case if you implement Dijkstra's algorithm with a binary heap.
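For concreteness, here's a rough sketch of Dijkstra's algorithm with a binary heap (Python's heapq, using lazy deletion of stale entries; the graph format is an assumption for the example). Each of the O(m) edge relaxations may push one entry onto the heap, and each heap operation costs O(log n)-ish, giving the O(m log n) bound:

import heapq

def dijkstra(graph, source):
    # graph: dict mapping node -> list of (neighbor, weight) pairs.
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                      # stale entry, skip it
        for v, w in graph[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist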
Can someone tell me which of the two algorithms, TriMergeSort and MergeSort, is better?
The time complexity of MergeSort is n log n with log base 2.
The time complexity of TriMergeSort is n log n with log base 3.
Since TriMergeSort uses base 3 and MergeSort uses base 2, I am assuming that TriMergeSort is faster than MergeSort.
Please correct me if I am wrong.
While you are right that the number of levels in the recursive structure is log_2 n in the case of regular mergesort and log_3 n in the case of three-way mergesort, it's important to remember that the work done per level increases as the number of ways you split increases. Specifically, in your merge step, you need to switch from a normal 2-way merge to a special 3-way merge. At each step in the merge, you need to determine which of the lists has the smallest unused element. In a two-way merge, you just compare the front elements of the two lists against one another. In a three-way merge, more comparisons are required because you have to find the lowest of three elements.
Generalizing this to a k-way mergesort, the number of layers will be log_k n, but the work for each merge will be higher. It's possible to do a k-way merge of n total elements in time O(n log k) by using binary heaps, so more work is required per level as k increases.
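As a sketch, here's what a heap-based k-way merge might look like (k_way_merge is a made-up name; Python's heapq.merge implements essentially the same idea). The heap never holds more than k entries, so each of the n output steps costs O(log k):

import heapq

def k_way_merge(lists):
    # Seed the heap with the first element of each non-empty list.
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    result = []
    while heap:
        value, i, j = heapq.heappop(heap)   # smallest front element
        result.append(value)
        if j + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return result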
Interestingly, if we talk about the amount of work required overall, we need to do O(n log k) work at each of the log_k n levels. This gives us a total runtime of O(n log k · log_k n). Using the change-of-base formula for logarithms, which says that log_k n = log_2 n / log_2 k, we see that the runtime will be
O(n log k · log_k n)
= O(n log k · (log n / log k))
= O(n log n)
In other words, there isn't an asymptotic difference between the algorithms when you choose different values of k. The drop in levels due to a higher splitting factor is offset by an increased amount of work per level.
To figure out which algorithm is best, the best option would be to run them all and see what happens. Due to caching effects and locality of reference, I suspect that the answer might at some level depend on the particular architecture you're using.
As far as Big-O complexity, it doesn't matter.
Regular merge sort is n * log_2(n) which is equivalent to n * (log(n) / log(2)). The log(2) is constant, so merge sort is simply n * log(n)
Tri-merge sort is n * log_3(n) which, using the same logic for regular merge sort, is simply n * log(n)
Given that both reduce to O(n * log(n)), it's not really possible to say which is better.
An alternate way to demonstrate why you can't just assume tri-merge to be better:
Assume a 3-way merge is better than a 2-way merge.
In general, assume an (N+1)-way merge is better than an N-way merge.
If this were true, it would be best to use an N-way merge where N is the number of elements you're sorting. However, the merge step requires choosing the least element from N sources, which (with straightforward comparisons) requires O(N) time per element.
This means that the N-way merge sort runs in O(N^2) time, effectively making it selection sort.
I have an algorithm that runs in O(m) time. This algorithm takes in a set of data containing m entries.
The data is generated randomly by specifying a strictly positive integer input n. The number of entries generated is O(n log n).
Edit
Alone, the time complexity of generating the data is independent of n (i.e., O(1)), which means that given the integer n, the entries are generated instantly and randomly. The number of resulting entries is random, but it is O(n log n). E.g., for n = 10, the number of entries generated is some constant times 10 log 10.
The data is generated beforehand. The resulting m entries are then fed into the algorithm as input.
Question
Can I then assume that the algorithm runs in O(n log n) time?
There are some ambiguities in your question that were either deliberately placed to help you internalize the relationship between input size and runtime complexity, or simply caused by miscommunication.
So as best as I can interpret this scenario:
Your algorithm's complexity, O(m), is linear with respect to m.
Since we assume that generating the data is independent of the input (i.e., O(1)), your time complexity depends only on the n that you specify to generate the entries.
So yes, you can say that the algorithm runs in O(n log n) time, since it doesn't do anything with the input of size m.
In response to your updated question:
It's still hard to follow because some key words refer to different things. But in general I think this is what you are getting at:
You have a data set as input, that is size O(n log n), given some specific n.
This data set is used as input only, it's either pre-generated, or generated using some blackbox that runs in O(1) time regardless of what n is given to the blackbox. (We aren't interested in the blackbox for this question)
This data set is then fed to the algorithm that we are actually interested in analyzing.
The algorithm has time-complexity O(m), for an input of size m.
Since your input has size O(n log n) with respect to n, then by extension your O(m) linear-time algorithm has time complexity O(n log n), with respect to n.
To see the difference: Suppose your algorithm wasn't linear but rather quadratic O(m^2), then it would have time-complexity O(n^2 log^2 n) with respect to n.
Binary search has average-case performance of O(log n) and quicksort is O(n log n). Is O(n log n) the same as O(n) + O(log n)?
Imagine a database with every person in the world. That's 6.7 billion entries. O(log n) is a lookup on an indexed column (e.g. primary key). O(n log n) is returning the entire population in sorted order on an unindexed column.
O(log n) was finished before you finished reading the first word of that sentence.
O(n log n) is still calculating...
Another way to imagine it:
log n is proportional to the number of digits in n.
n log n is n times greater.
Try writing the number 1000 once versus writing it one thousand times. The first takes O(log n) time, the second takes O(n log n) time.
Now try that again with 6700000000. Writing it once is still trivial. Now try writing it 6.7 billion times. Even if you could write it once per second you'd be dead before you finished.
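A back-of-the-envelope way to see those numbers (using the decimal digit count as a stand-in for log n):

n = 6_700_000_000
digits = len(str(n))   # ~log10(n): 10 digit-writes to write n down once
print(digits)          # writing n once is O(log n)
print(n * digits)      # writing n, n times is O(n log n): 67 billion digit-writes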
You could also visualize it in a plot of the two functions.
No, O(n log n) = O(n) * O(log n)
In mathematics, when two terms in an expression have no operator between them (e.g., E = mc^2), you multiply.
Normally the way to visualize O(n log n) is "do something which takes log n computations n times."
If you had an algorithm which first iterated over a list and then did a binary search of that list (which would be N + log N), you can express that simply as O(n), because the n dwarfs the log n for large values of n.
A (log n) plot increases, but is concave downward, which means:
It increases as n gets larger
Its rate of increase decreases as n gets larger
A (n log n) plot increases, and is (slightly) concave upward, which means:
It increases as n gets larger
Its rate of increase (slightly) increases as n gets larger
Depends on whether you tend to visualize n as having a concrete value.
If you tend to visualize n as having a concrete value, and the units of f(n) are time or instructions, then O(log n) is n times faster than O(n log n) for a given task of size n. For memory or space units, O(log n) is n times smaller for a given task of size n. In this case, you are focusing on the codomain of f(n) for some known n. You are visualizing answers to questions about how long something will take or how much memory an operation will consume.
If you tend to visualize n as a parameter having any value, then O(log n) is n times more scalable. O(log n) can complete n times as many tasks of size n. In this case, you are focused on the domain of f(n). You are visualizing answers to questions about how big n can get, or how many instances of f(n) you can run in parallel.
Neither perspective is better than the other. The former can be used to compare approaches to solving a specific problem. The latter can be used to compare the practical limitations of the given approaches.
I know there are quite a bunch of questions about big O notation, I have already checked:
Plain english explanation of Big O
Big O, how do you calculate/approximate it?
Big O Notation Homework--Code Fragment Algorithm Analysis?
to name a few.
I know by "intuition" how to calculate it for n, n^2, n! and so, however I am completely lost on how to calculate it for algorithms that are log n , n log n, n log log n and so.
What I mean is, I know that Quick Sort is n log n (on average).. but, why? Same thing for merge/comb, etc.
Could anybody explain me in a not too math-y way how do you calculate this?
The main reason is that Im about to have a big interview and I'm pretty sure they'll ask for this kind of stuff. I have researched for a few days now, and everybody seem to have either an explanation of why bubble sort is n^2 or the unreadable explanation (for me) on Wikipedia
The logarithm is the inverse operation of exponentiation. An example of exponentiation is when you double the number of items at each step. Thus, a logarithmic algorithm often halves the number of items at each step. For example, binary search falls into this category.
Many algorithms require a logarithmic number of big steps, but each big step requires O(n) units of work. Mergesort falls into this category.
Usually you can identify these kinds of problems by visualizing them as a balanced binary tree. For example, here's merge sort:
6 2 0 4 1 3 7 5
2 6 0 4 1 3 5 7
0 2 4 6 1 3 5 7
0 1 2 3 4 5 6 7
At the top is the input, as leaves of the tree. The algorithm creates a new node by sorting the two nodes above it. We know the height of a balanced binary tree is O(log n) so there are O(log n) big steps. However, creating each new row takes O(n) work. O(log n) big steps of O(n) work each means that mergesort is O(n log n) overall.
Generally, O(log n) algorithms look like the function below. They get to discard half of the data at each step.
def function(data, n):
    if n <= constant:
        return do_simple_case(data, n)
    if some_condition():
        return function(data[:n // 2], n // 2)        # recurse on the first half of the data
    else:
        return function(data[n // 2:], n - n // 2)    # recurse on the second half of the data
While O(n log n) algorithms look like the function below. They also split the data in half, but they need to consider both halves.
def function(data, n):
    if n <= constant:
        return do_simple_case(data, n)
    part1 = function(data[:n // 2], n // 2)           # recurse on the first half of the data
    part2 = function(data[n // 2:], n - n // 2)       # recurse on the second half of the data
    return combine(part1, part2)
Where do_simple_case() takes O(1) time and combine() takes no more than O(n) time.
The algorithms don't need to split the data exactly in half. They could split it into one-third and two-thirds, and that would be fine. For average-case performance, splitting it in half on average is sufficient (as in QuickSort). As long as the recursion is done on pieces of size (n/something) and (n - n/something), it's okay. If it instead breaks the problem into pieces of size (k) and (n-k) for some constant k, then the height of the tree will be O(n) and not O(log n).
You can usually claim log n for algorithms that halve the search space each time they run. A good example of this is any binary algorithm (e.g., binary search). You pick either left or right, which then cuts the space you're searching in half. The pattern of repeatedly halving is log n.
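A minimal binary search sketch to make the halving concrete (binary_search is just an illustrative name):

def binary_search(sorted_items, target):
    # Each iteration discards half of the remaining range,
    # so the loop runs O(log n) times.
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1   # not found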
For some algorithms, getting a tight bound for the running time through intuition is close to impossible (I don't think I'll ever be able to intuit a O(n log log n) running time, for instance, and I doubt anyone will ever expect you to). If you can get your hands on the CLRS Introduction to Algorithms text, you'll find a pretty thorough treatment of asymptotic notation which is appropriately rigorous without being completely opaque.
If the algorithm is recursive, one simple way to derive a bound is to write out a recurrence and then set out to solve it, either iteratively or using the Master Theorem or some other way. For instance, if you're not looking to be super rigorous about it, the easiest way to get QuickSort's running time is through the Master Theorem -- QuickSort entails partitioning the array into two relatively equal subarrays (it should be fairly intuitive to see that this is O(n)), and then calling QuickSort recursively on those two subarrays. Then if we let T(n) denote the running time, we have T(n) = 2T(n/2) + O(n), which by the Master Method is O(n log n).
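As a rough sanity check (an informal unrolling rather than a full Master Theorem proof), if each partition step costs about c*n for some constant c, the recurrence expands level by level, with about c*n total work per level and about log_2(n) levels:

T(n) = c*n + 2T(n/2)
     = c*n + 2(c*(n/2) + 2T(n/4)) = 2*c*n + 4T(n/4)
     = ...
     = c*n*log_2(n) + n*T(1)
     = O(n log n)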
Check out the "phone book" example given here: What is a plain English explanation of "Big O" notation?
Remember that Big-O is all about scale: how much more operation will this algorithm require as the data set grows?
O(log n) generally means you can cut the dataset in half with each iteration (e.g. binary search)
O(n log n) means you're performing an O(log n) operation for each item in your dataset
I can't think of a common O(n log log n) algorithm off the top of my head (it doesn't simplify down to O(n log n), though, since log log n is not a constant).
I'll attempt an intuitive analysis of why mergesort is n log n, and if you can give me an example of an n log log n algorithm, I can work through it as well.
Mergesort is a sorting example that works by splitting a list of elements repeatedly until only single-element lists exist and then merging these lists together. The primary operation in each of these merges is comparison, and each merge requires at most n comparisons, where n is the combined length of the two lists. From this you can derive the recurrence and easily solve it, but we'll avoid that method.
Instead, consider how mergesort is going to behave: we're going to take a list and split it, then take those halves and split them again, until we have n partitions of length 1. I hope it's easy to see that this recursion will go only log(n) deep before we have split the list up into our n partitions.
Now, each of these n partitions will need to be merged; then, once those are merged, the next level will need to be merged, and so on until we have a list of length n again. Refer to Wikipedia's graphic for a simple example of this process: http://en.wikipedia.org/wiki/File:Merge_sort_algorithm_diagram.svg
Now consider the amount of time this process will take: we're going to have log(n) levels, and at each level we will have to merge all of the lists. As it turns out, each level will take n time to merge, because we'll be merging a total of n elements each time. Then you can fairly easily see that it will take n log(n) time to sort an array with mergesort, if you take the comparison operation to be the most important operation.
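If it helps, here's a minimal recursive mergesort sketch matching that description (merge_sort is just an illustrative name; the splitting is cheap, and the merge at each level does the O(n) comparison work):

def merge_sort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves with at most len(items) comparisons.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged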
If anything is unclear or I skipped somewhere please let me know and I can try to be more verbose.
Edit Second Explanation:
Let me think if I can explain this better.
The problem is broken into a bunch of smaller lists and then the smaller lists are sorted and merged until you return to the original list which is now sorted.
When you break up the problem, you have several different levels of sizes. First you'll have two lists of size n/2, n/2; at the next level you'll have four lists of size n/4, n/4, n/4, n/4; at the next level you'll have n/8, n/8, n/8, n/8, n/8, n/8, n/8, n/8; and this continues until n/2^k is equal to 1 (each subdivision is the length divided by a power of 2; not all lengths will divide evenly, so it won't be quite this pretty). This is repeated division by two and can continue at most log_2(n) times, because 2^(log_2(n)) = n, so any more division by 2 would yield a list of size zero.
Now the important thing to note is that at every level we have n elements so for each level the merge will take n time, because merge is a linear operation. If there are log(n) levels of the recursion then we will perform this linear operation log(n) times, therefore our running time will be n log(n).
Sorry if that isn't helpful either.
When applying a divide-and-conquer algorithm where you partition the problem into sub-problems until they are so simple that they are trivial, if the partitioning goes well, the size of each sub-problem is n/2 or thereabouts. This is often the origin of the log(n) that crops up in big-O complexity: O(log(n)) is the number of recursive calls needed when the partitioning goes well.