I know the true-or-false statement below is false, since Θ(log n) is the amortized cost. But can we somehow determine the average cost from this question, and how? If not, what do we need in order to determine the average cost?
Statement
True or false:
A data structure supports an operation f such that a sequence of n operations f takes
Θ(n log n) time in the worst case. Hence, the average-case cost of an f operation is Θ(log n).
The intended answer to this question is probably "true". If the total cost of n operations is Θ(n log n), then you can divide by n to get the average cost, Θ(log n).
The question is a little confusing, though, because it says a "sequence of n operations f". The data structure could support other operations that change how much time operation f takes, in which case timing a sequence of f operations alone doesn't necessarily tell you the average cost of f when other operations are allowed... but the questioner was probably just not considering that.
The amortized cost of operation f is different from the average, because part of the amortized cost may be paid by other operations. The time taken by operation f can therefore provide a lower bound on its amortized cost, but it tells you nothing about any upper bound.
Related
A data structure supports an operation foo such that a sequence of n operations foo takes Θ(n log n) time to perform in the worst case.
a) What is the amortized time of a foo operation?
b) How large can the actual time of a single foo operation be?
a) First I assume foo is O(log n) in the worst case.
So the amortized cost comes from how often foo takes its worst case. Since we know nothing further, the amortized cost is between O(1) and O(log n).
b) O(log n)
Is this correct? What is the proper way to argue here?
a) If n operations take Θ(n log n), then by definition the amortized time for a foo operation is Θ(log n). The amortized time is averaged over all the operations, so you don't count the worst case against just the operation that caused it; it is amortized against all the others, too.
b) foo could occasionally cost O(n), as long as it's not more than O(log n) times. foo could even occasionally cost O(n log n), as long as that doesn't happen more than a constant (i.e., O(1)) number of times.
When you do amortized analysis, you don't multiply the worst case by the number of operations, but rather by the number of times that worst case actually happens.
For example, take the strategy of pushing elements into a vector one at a time, but growing the memory by doubling the allocated size each time the new element does not fit in the current allocation. Each doubling instance costs O(n) because you have to copy/move all the current elements. But the amortized time is actually linear, because you copy 1 element once, 2 elements once, 4 elements once, etc: overall you've done log(n) doublings but the sum of the cost of each of these is just 1+2+4+8+...+n = 2*n-1 = O(n). So the amortized time of this push implementation is O(1), even though the worst case is O(n).
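The doubling strategy above can be checked with a small counter. This is a hypothetical sketch (`count_copies` is a made-up helper for illustration) that tallies how many element copies the doublings cause:

```python
# Count element copies made by a doubling dynamic array:
# capacity doubles (and all elements are copied) whenever a push
# finds the array full.
def count_copies(n):
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:      # full: double and copy all current elements
            copies += size
            capacity *= 2
        size += 1
    return copies

# For n a power of two, the total is 1 + 2 + 4 + ... + n/2 = n - 1 copies,
# so the amortized cost per push is O(1).
print(count_copies(1 << 20))  # 1048575, i.e. n - 1 for n = 2^20
```

The total number of copies stays below 2n for any n, matching the 1+2+4+...+n = 2*n-1 sum above.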
We know that some algorithms' asymptotic time complexities are functions of n, such as
O(log* n), O(log n), O(log log n), O(n^c) with 0 < c < 1, ....
What is the smallest asymptotic time complexity an algorithm can have, as a function of n?
Update 1: we are looking for an asymptotic time complexity that is a function of n. O(1) is the smallest, but it does not involve n.
Update 2: O(1) is the smallest time complexity we can reach, but what are the next-smallest well-known functions of n? So far, as I have researched:
O(alpha(n)) (inverse Ackermann): amortized time per operation on a disjoint set
O(log* n) (iterated logarithm): the find algorithm of Hopcroft and Ullman on a disjoint set
Apart from the trivial O(1), the answer is: there isn't one.
If something isn't O(1) (that is, with n -> infinity, computing time goes to infinity), whatever bounding function of n you find, there's always a smaller one: just take a logarithm of the bounding function. You can do this infinitely, hence there's no smallest non-constant bounding function.
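As a rough numeric illustration of how quickly "take a logarithm of the bounding function" shrinks things, here is a small sketch that repeatedly applies log2 to an astronomically large n:

```python
import math

# For n = 10**80 (roughly the number of atoms in the observable universe),
# nested logarithms collapse to small values in just a few steps,
# illustrating that below any unbounded function there is a slower one.
x = 10.0 ** 80
for depth in range(1, 5):
    x = math.log2(x)
    print(f"log2 applied {depth} time(s): {x:.4f}")
```

Each application produces a strictly slower-growing bound, so there is no smallest non-constant one.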
However in practice you should probably stop worrying when you reach the inverse Ackermann function :)
It is not necessary that the complexity of a given algorithm be expressible via well-known functions. Also note that big-oh is not the complexity of a given algorithm. It is an upper bound of the complexity.
You can construct functions growing as slowly as you want, for instance n^(1/k) for any k.
O(1) is as low as you can go in terms of complexity, and strictly speaking 1 is a valid function; it simply is constant.
EDIT: a really slow-growing function that I can think of is the iterated logarithm, as in the complexity of a disjoint-set forest implemented with both path compression and union by rank.
There is always a "smaller algorithm" than whatever is suggested.
O(log log log log(n)) < O(log log log(n)) < O(log log (n)) < O(log(n)).
You can put in as many logs as you want, but I don't know whether there are real-life examples of these.
So my answer is: you can get closer and closer to O(1), but never reach it.
So I'm teaching myself some graph algorithms, now on Kruskal's, and understand that it's recommended to use union-find so checking whether adding an edge creates a cycle only takes O(Log V) time. For practical purposes, I see why you'd want to, but strictly looking through Big O notation, does doing so actually affect the worst-case complexity?
My reasoning: If instead of union find, we did a DFS to check for cycles, the runtime for that would be O(E+V), and you have to perform that V times for a runtime of O(V^2 + VE). It's more than with union find, which would be O(V * LogV), but the bulk of the complexity of Kruskal's comes from deleting the minimum element of the priority queue E times, which is O(E * logE), the Big O answer. I don't really see a space advantage either since the union-find takes O(V) space and so too do the data structures you need to maintain to find a cycle using DFS.
So a probably overly long explanation for a simple question: Does using union-find in Kruskal's algorithm actually affect worst-case runtime?
and understand that it's recommended to use union-find so checking whether adding an edge creates a cycle only takes O(Log V) time
This isn't right. With union-find, m operations take O(alpha(n) * m) time in total, where alpha(n) is the inverse of the Ackermann function and, for all intents and purposes, can be considered constant. So it's much faster than logarithmic:
Since alpha(n) is the inverse of this function, alpha(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.
but the bulk of the complexity of Kruskal's comes from deleting the minimum element of the priority queue E times
This is also wrong. Kruskal's algorithm does not involve a priority queue; it sorts the edges by cost at the beginning. The asymptotic complexity of this step is still the one you mention, but sorting is likely faster in practice than a priority queue (using a priority queue is, at best, equivalent to heap sort, which is not the fastest sorting algorithm).
Bottom line, if m is the number of edges and n the number of nodes:
Sorting the edges: O(m log m).
For each edge, calling union-find: O(m * alpha(n)), or basically just O(m).
Total complexity: O(m log m + m * alpha(n)).
If you don't use union-find, total complexity will be O(m log m + m * (n + m)), if we use your O(n + m) cycle finding algorithm. Although O(n + m) for this step is probably an understatement, since you must also update your structure somehow (insert an edge). The naive disjoint-set algorithm is actually O(n log n), so even worse.
Note: in this case, you can write log n instead of log m if you prefer, because m = O(n^2) and log(n^2) = 2log n.
In conclusion: yes, union-find helps a lot.
Even if you use the O(log n) variant of union-find, which would lead to O(m log m + m log n) total complexity, which simplifies to O(m log m), in practice you'd rather keep the second part faster if you can. Since union-find is very easy to implement, there's really no reason not to.
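To make the bottom line concrete, here is a sketch of Kruskal's algorithm using sorting plus union-find with path compression and union by rank. The function name and the (cost, u, v) edge format are assumptions made for the example:

```python
# Sketch of Kruskal's algorithm: sort edges O(m log m), then m near-constant
# union-find operations. Nodes are 0..n-1; edges are (cost, u, v) tuples.
def kruskal(n, edges):
    parent = list(range(n))
    rank = [0] * n

    def find(x):                      # find root with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for cost, u, v in sorted(edges):  # O(m log m) sort dominates
        ru, rv = find(u), find(v)
        if ru == rv:                  # same component: edge would close a cycle
            continue
        if rank[ru] < rank[rv]:       # union by rank
            ru, rv = rv, ru
        parent[rv] = ru
        if rank[ru] == rank[rv]:
            rank[ru] += 1
        mst.append((u, v))
        total += cost
    return total, mst

# Example: a 4-cycle with one diagonal; the two heaviest edges close cycles.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(kruskal(4, edges))  # (6, [(0, 1), (1, 2), (2, 3)])
```

The cycle check per edge is the part that replaces the O(n + m) DFS from the question.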
I am trying to understand big oh notations. Any help would be appreciated.
Say there is a program that creates a max heap and then pushes and removes the item.
Say there is n items.
To create a heap, it takes O(n) to heapify, if you have read the items into an array and then heapify it.
To push an item takes O(1), and to remove it takes O(1).
To re-heapify after each removal takes O(log n), so O(n log n) for n items.
So the big oh notation is O(n + n log n)
OR, is it O(n log n) only because we choose the biggest one.
The complexity to heapify the new element in the heap is O(logN), not O(1) (unless you use a Fibonacci heap, which does not seem to be the case here).
Also, the notation O(N + NlogN) is redundant: since NlogN grows faster than N, it is simply written O(NlogN).
EDIT: The big-oh notation only describes the asymptotic behavior of a function, that is - how fast it grows. As you get close to infinity 2*f(x) and 11021392103*f(x) behave similarly and that is why when writing big-oh notation, we ignore any constants in front of the function.
Formally speaking, O(N + N log N) is equivalent to O(N log N).
That said, it's assumed that there are coefficients buried in each of these, ala: O( aN + bN log(cN) ). If you have very large N values, these coefficients become unimportant and the algorithm is bounded only by its largest term, which, in this case, is N log(N).
But it doesn't mean the coefficients are entirely unimportant. This is why in discussions of graph algorithms you'll often see authors say something like "the Floyd-Warshall algorithm runs in O(N^3) time, but has small coefficients".
If we could somehow write O(0.5N^3) in this case, we would. But it turns out that the coefficients vary depending on how you implement an algorithm and which computer you run it on. Thus, we settle for asymptotic comparisons, not necessarily because it is the best way, but because there isn't really a good alternative.
You'll also see things like "Worst-case: O(N^2), Average case: O(N)". This is an attempt to capture how the behavior of the algorithm varies with the input. Often times, presorted or random inputs can give you that average case, whereas an evil villain can construct inputs that produce the worst case.
Ultimately, what I am saying is this: O(N + N log N)=O(N log N). This is true, and it's the right answer for your homework. But we use this big-O notation to communicate and, in the fullness of time, you may find situations where you feel that O(N + N log N) is more expressive, perhaps if your algorithm is generally used for small N. In this case, do not worry so much about the formalism - just be clear about what it is you are trying to convey with it.
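The totals in this thread can be sketched with Python's heapq. Note that heapq is a min-heap, not the max heap from the question, but the cost accounting is the same:

```python
import heapq
import random

data = [random.randrange(1000) for _ in range(100)]

# Building the heap in place is O(n)...
heap = list(data)
heapq.heapify(heap)

# ...and each of the n pops re-heapifies in O(log n), so the total is
# O(n + n log n), which is the same class as O(n log n).
result = [heapq.heappop(heap) for _ in range(len(heap))]
assert result == sorted(data)  # popping a min-heap yields ascending order
```

This build-then-pop-everything pattern is exactly heap sort, which is why its overall bound is O(n log n).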
I am currently reading about amortized analysis. I am not able to fully understand how it is different from the normal analysis we perform to calculate the average or worst-case behaviour of algorithms. Can someone explain it with an example of sorting or something?
Amortized analysis gives the average performance (over time) of each operation in
the worst case.
In a sequence of operations the worst case does not occur often in each operation: some operations may be cheap, some may be expensive. Therefore, a traditional worst-case per-operation analysis can give an overly pessimistic bound. For example, in a dynamic array only some inserts take linear time; the others take constant time.
When different inserts take different times, how can we accurately calculate the total time? The amortized approach is going to assign an "artificial cost" to each operation in the sequence, called the amortized cost of an operation. It requires that the total real cost of the sequence should be bounded by the total of the amortized costs of all the operations.
Note, there is sometimes flexibility in the assignment of amortized costs.
Three methods are used in amortized analysis:
Aggregate Method (or brute force)
Accounting Method (or the banker's method)
Potential Method (or the physicist's method)
For instance assume we’re sorting an array in which all the keys are distinct (since this is the slowest case, and takes the same amount of time as when they are not, if we don’t do anything special with keys that equal the pivot).
Quicksort chooses a random pivot. The pivot is equally likely to be the smallest key,
the second smallest, the third smallest, ..., or the largest. For each key, the
probability is 1/n. Let T(n) be a random variable equal to the running time of quicksort on
n distinct keys. Suppose quicksort picks the ith smallest key as the pivot. Then we run quicksort recursively on a list of length i − 1 and on a list of
length n − i. It takes O(n) time to partition and concatenate the lists, let's
say at most n dollars, so the running time is

T(n) ≤ n + T(i − 1) + T(n − i).

Here i is a random variable that can be any number from 1 (pivot is the
smallest key) to n (pivot is the largest key), each chosen with probability 1/n,
so

E[T(n)] ≤ n + (1/n) Σ_{i=1}^{n} (E[T(i − 1)] + E[T(n − i)]).
This equation is called a recurrence. The base cases for the recurrence are T(0) = 1 and T(1) = 1. This means that sorting a list of length zero or one takes at most one dollar (unit of time).
So you solve the recurrence, for instance by guessing a bound of the form E[T(j)] ≤ 1 + 8j log_2 j and verifying it by induction. The expression 1 + 8j log_2 j might be an overestimate, but it doesn't matter. The important point is that this proves that E[T(n)] is in O(n log n).
In other words, the expected running time of quicksort is in O(n log n).
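The algorithm being analyzed above can be sketched as quicksort with a uniformly random pivot (assuming distinct keys, as in the analysis):

```python
import random

# Quicksort with a uniformly random pivot: O(n log n) expected time,
# Θ(n^2) worst case (when unlucky pivots keep splitting 1 vs n-1).
def quicksort(keys):
    if len(keys) <= 1:
        return keys
    pivot = random.choice(keys)               # each key with probability 1/n
    left = [k for k in keys if k < pivot]     # i - 1 smaller keys
    right = [k for k in keys if k > pivot]    # n - i larger keys
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```

The two recursive calls on lists of length i − 1 and n − i, plus the O(n) partitioning, are exactly the terms in the recurrence.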
Also there’s a subtle but important difference between amortized running time
and expected running time. Quicksort with random pivots takes O(n log n) expected running time, but its worst-case running time is in Θ(n^2). This means that there is a small
possibility that quicksort will cost Θ(n^2) dollars, but the probability that this
will happen approaches zero as n grows large.
Quicksort O(n log n) expected time
Quickselect Θ(n) expected time
The comparison-based sorting lower bound is Ω(n log n).
Finally you can find more information about quicksort average case analysis here
average - a probabilistic analysis; the average is taken over all possible inputs, and it is an estimate of the likely running time of the algorithm.
amortized - a non-probabilistic analysis, calculated in relation to a batch of calls to the algorithm.
example - dynamic sized stack:
say we define a stack of some size, and whenever we use up the space, we allocate twice the old size, and copy the elements into the new location.
overall our costs are:
O(1) per insertion \ deletion
O(n) per insertion ( allocation and copying ) when the stack is full
so now we ask, how much time would n insertions take?
one might say O(n^2); however, we don't pay O(n) for every insertion.
That would be too pessimistic: the correct answer is O(n) time for n insertions. Let's see why.
Let's say we start with array size = 1.
Ignoring copying, we would pay O(n) total for the n insertions.
now, we do a full copy only when the stack has these number of elements:
1,2,4,8,...,n/2,n
for each of these sizes we do a copy and alloc, so to sum the cost we get:
const*(1+2+4+8+...+n/4+n/2+n) = const*(n+n/2+n/4+...+8+4+2+1) <= const*n(1+1/2+1/4+1/8+...)
where (1+1/2+1/4+1/8+...) = 2
so we pay O(n) for all of the copying + O(n) for the actual n insertions
O(n) worst case for n operations -> O(1) amortized per operation.
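The geometric sum above can be checked directly. This is a small sketch (`copy_cost` is a made-up helper, and n is taken to be a power of two):

```python
# Total copying cost 1 + 2 + 4 + ... + n for the doubling stack:
# the geometric series sums to 2n - 1, i.e. O(n) across all n insertions.
def copy_cost(n):
    # allocation sizes at which a full copy happens: 1, 2, 4, ..., n
    return sum(2 ** k for k in range(n.bit_length()))

n = 1 << 10
assert copy_cost(n) == 2 * n - 1  # matches const*(1+2+...+n) <= 2*const*n
```

Dividing that O(n) total by n insertions gives the O(1) amortized cost per operation stated above.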