Are my Big O approximations correct? - big-o

I have an algorithm to determine if two strings are permutations of each other. The code can be found here: https://jsfiddle.net/bxohcgjn/
N = length of string A
M = length of string B
For time complexity, I have: O(N(N logN + M logM))
For space complexity, I have: O(N + M)
N logN = for sorting A
M logM = for sorting B
I understand that the result depends on the browser's implementation of sort, but I am assuming quicksort.
I just want to check whether my thinking is correct on this.

Regarding your time complexity: the cost of the for loop (which is linear) must not be multiplied by the sum of both sorts.
If an algorithm consists of k sequential steps, the order of the algorithm is the sum of their orders:
O(alg) = O(step_1) + O(step_2) + ... + O(step_k)
In your case there are three steps (both sorts and the for loop):
O(is_permutation) = O(sort_A) + O(sort_B) + O(for)
= O(n logn) + O(m logm) + O(n)
A sum of orders is dominated by its largest term, so it equals the maximum of them:
O(is_permutation) = max(O(n logn), O(m logm), O(n))
But since you check that both strings have the same length before sorting, in the worst case (which is what you are analyzing) there is only one size, n, so the expression becomes:
O(is_permutation) = max(O(n logn), O(n logn), O(n))
= max(O(n logn), O(n))
= O(n logn)
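
The linked fiddle is not reproduced here, so the following is only a minimal Python sketch of the approach described above (a length check, two sorts, then one linear comparison loop). The name `is_permutation` follows the notation of the answer; everything else is an assumption made for illustration. It shows the three steps running one after another, which is why their costs add instead of multiplying.

```python
def is_permutation(a: str, b: str) -> bool:
    """Return True if b is a rearrangement of the characters of a."""
    # Early exit: strings of different lengths cannot be permutations.
    if len(a) != len(b):
        return False
    sorted_a = sorted(a)  # step 1: O(n log n)
    sorted_b = sorted(b)  # step 2: O(n log n), because len(b) == len(a) here
    # Step 3: one linear pass, O(n). It runs after the sorts, not inside them.
    for ch_a, ch_b in zip(sorted_a, sorted_b):
        if ch_a != ch_b:
            return False
    return True
```

Both sorted copies are kept in memory at the same time, which is consistent with the O(N + M) space estimate in the question.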

Related

Time Complexity in asymptotic analysis log n and log (n+m)

Just some interesting discussion inspired by a conversation in my class.
There are two algorithms, one has time complexity log n and another log (n+m).
Am I correct to argue that, in the average case, the log (n+m) one will perform faster, while asymptotically there is no difference in running time? Taking the limit of the ratio of both functions (via f1'/f2') gives a constant, so they have the same order of growth.
Thanks!
As far as I can see from the question, n and m are independent variables. So when stating that
O(log(m + n)) = O(log(n))
it should hold for any m, which it does not. A counterexample is
m = exp(n)
O(log(m + n)) = O(log(n + exp(n))) = O(log(exp(n))) = O(n) > O(log(n))
That's why, in the general case, we can only say that
O(log(m + n)) >= O(log(n))
An interesting question is when O(log(m + n)) = O(log(n)) does hold. If m grows no faster than a polynomial in n, i.e. O(m) <= O(P(n)), then:
O(log(m + n)) = O(log(P(n) + n)) = O(log(P(n))) = k * O(log(n)) = O(log(n))
where k is the degree of P. In the case of (multi)graphs we seldom have so many edges that O(m) > O(P(n)): even the complete graph Kn contains only m = n * (n - 1) / 2 = P(n) edges. That's why
O(log(m + n)) = O(log(n))
holds for ordinary graphs (no parallel/multiple edges, no loops).
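
As a rough numeric illustration of the two cases, the sketch below compares log(m + n) with log(n) for a polynomial m (the edge count of a complete graph) and for an exponential m. The helper name `log_ratio` is hypothetical, and 2^n is used in place of exp(n) only so that the values stay exact integers.

```python
import math


def log_ratio(n: int, m: int) -> float:
    """Return log(m + n) / log(n) for n > 1."""
    return math.log(m + n) / math.log(n)


for n in (10, 100, 1000):
    # m = number of edges of the complete graph K_n, a polynomial in n.
    polynomial = log_ratio(n, n * (n - 1) // 2)
    # m = 2^n, an exponential in n (stand-in for exp(n)).
    exponential = log_ratio(n, 2 ** n)
    print(n, round(polynomial, 3), round(exponential, 3))
```

With the polynomial m the ratio stays below 2, so the two logarithms differ only by a constant factor. With the exponential m the ratio keeps growing with n, so no constant factor can cover the difference.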

How to simplify Big O algebra with multiple variables

Assume that the worst-case runtime of an algorithm can be described as:
T(n) = O(n) + O(r^2) + O(n-r)
With n being the input size and r being the index at which a partition was created per the algorithm.
Can this equation be simplified further? If the variables were all n then it would be O(n^2) but can the same idea be applied when r is involved?
As O(n-r) is dominated by O(n), you can write T(n) = O(n) + O(r^2). Also, since you know that r is between 0 and n, you can write T(n) = O(n^2) as an upper bound in n alone. However, the more precise statement is T(n,r) = O(n + r^2).
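
To see why the two-variable form is the more informative one, here is a small sketch that treats each O(...) term as an exact step count. The function `cost` is a hypothetical model for illustration, not the algorithm from the question.

```python
def cost(n: int, r: int) -> int:
    """Hypothetical step count modeled on T(n, r) = n + r^2 + (n - r)."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return n + r * r + (n - r)


n = 10_000
for r in (0, 100, n):  # r = 0, r = sqrt(n), r = n
    # Compare the full count with the simplified expression n + r^2.
    print(r, cost(n, r), n + r * r)
```

In this model the full count always lies between n + r^2 and 2 * (n + r^2), so dropping the (n - r) term changes it by at most a constant factor. The bound O(n^2) is only reached when r is close to n; for r around sqrt(n) the cost stays linear in n.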

Can this be approximated?

The time complexity of finding k largest element using min-heap is given as
O(k + (n-k) log k), as mentioned here (link). Can it be approximated to O((n-k) log k)?
Since O(N + N log(k)) = O(N log(k)), is the above approximation also true?
No, you can't simplify it like that. This can be shown with a few example values for k that are close to n:
k = n
Now the complexity is: O(n + 0 * log n) = O(n). If you had left out the first term of the sum, you would have ended up with O(0), which is obviously wrong.
k = n - 1
We get: O((n-1) + 1 * log(n-1)) = O(n + log(n)) = O(n). Without the first term, you would get O(log(n)), which again is wrong.
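
For reference, a minimal sketch of the min-heap approach using Python's `heapq` module shows where each term comes from. The function name `k_largest` is an assumption, and the linked description may differ in detail.

```python
import heapq


def k_largest(values, k):
    """Return the k largest items of values, in no particular order."""
    if k <= 0:
        return []
    heap = list(values[:k])
    heapq.heapify(heap)  # build a min-heap of the first k items: O(k)
    for x in values[k:]:  # n - k iterations
        if x > heap[0]:  # heap[0] is the smallest of the current candidates
            heapq.heapreplace(heap, x)  # pop smallest, push x: O(log k)
    return heap
```

When k equals n the loop body never runs and only the O(k) heapify remains, which is the first example above: dropping the k term would leave nothing to account for that work.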

Is the big-O complexity of these functions correct?

I am learning about algorithm complexity, and I just want to verify my understanding is correct.
1) T(n) = 2n + 1 = O(n)
This is because we drop the constants 2 and 1, and we are left with n. Therefore, we have O(n).
2) T(n) = n * n - 100 = O(n^2)
This is because we drop the constant -100, and are left with n * n, which is n^2. Therefore, we have O(n^2)
Am I correct?
Basically, the complexity classes are determined by the "dominant" term of your function. Starting from the lowest complexity:
O(1) if your function only contains constants
O(log(n)) if the dominant part is logarithmic (log, ln, ...)
O(n^p) if the dominant part is polynomial and the highest power is p (e.g. O(n^3) for T(n) = n*(3n^2 + 1) -3 )
O(p^n) if the dominant part is a fixed number to n-th power (e.g. O(3^n) for T(n) = 3 + n^99 + 2*3^n)
O(n!) if the dominant part is factorial
and so on...
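
One informal way to check the two claims from the question is to divide each function by its proposed bound and watch the ratio for growing n. The names `t1` and `t2` below are just labels for the two functions in the question; this is a numeric sanity check, not a proof.

```python
def t1(n: int) -> int:
    """First function from the question: T(n) = 2n + 1."""
    return 2 * n + 1


def t2(n: int) -> int:
    """Second function from the question: T(n) = n * n - 100."""
    return n * n - 100


for n in (10, 1_000, 100_000):
    # If T(n) = O(g(n)), the ratio T(n) / g(n) stays bounded as n grows.
    print(n, t1(n) / n, t2(n) / n ** 2)
```

Both ratios settle near a constant (2 and 1 respectively), which is consistent with O(n) and O(n^2).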

Drawing Recurrence Tree and Analysis

I am watching Intro to Algorithms (MIT), lecture 1. There's something like the following (analysis of merge sort):
T(n) = 2T(n/2) + O(n)
A few questions:
Why does the work at the bottom level become O(n)? It was said that the boundary case may have a different constant ... but I still don't get it ...
It's said that total = cn(lg n) + O(n). Where does the O(n) part come from? The original O(n)?
Although this one's been answered a lot, here's one way to reason it:
If you expand the recursion you get:
t(n) = 2 * t(n/2) + O(n)
t(n) = 2 * (2 * t(n/4) + O(n/2)) + O(n)
t(n) = 2 * (2 * (2 * t(n/8) + O(n/4)) + O(n/2)) + O(n)
...
t(n) = 2^k * t(n / 2^k) + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n / 2^(k-1))
The expansion stops when 2^k = n, which means k = log_2(n).
That makes n / 2^k = 1, which makes the first part of the equality simple to express if we consider t(1) = c (a constant):
t(n) = n * c + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n / 2^(k-1))
If we consider the sum O(n) + ... + 2^(k-1) * O(n / 2^(k-1)), we can observe that there are exactly k terms, and that each term is equivalent to O(n). So we can rewrite it like so:
t(n) = n * c + {n + n + n + .. + n} <-- where n appears k times
t(n) = n * c + n *k
but since k = log_2(n), we have
t(n) = n * c + n * log_2(n)
And since in Big-Oh notation n * log_2(n) is equivalent to n * log n, and it grows faster than n * c, it follows that the Big-O of the closed form is:
O(n * log n)
I hope this helps!
EDIT
To clarify your first question: the work at the bottom becomes O(n) because you have n unit operations taking place (there are n leaf nodes in the expansion tree, and each takes a constant time c to complete). In the closed formula, the work at the bottom is expressed as the first term of the sum: 2^k * t(1) = n * c, since the tree has 2^k = n leaves and the unit operation t(1) takes constant time.
To answer the second question, the O(n) does not actually come from the original O(n); it represents the work at the bottom (see answer to first question above).
The original O(n) is the time complexity required to merge the two sub-solutions t(n/2). Since the cost of the merge operation is assumed to grow (or shrink) linearly with the size of the problem, at each level you have 2^level merges of cost O(n / 2^level) each; their sum is equivalent to one O(n) operation. Since there are k levels, the merge cost for the initial problem is {O(n) at each level} * {number of levels}, which is O(n) * k. With k = log(n) levels, it follows that the total time complexity of the merging is O(n * log n).
Finally, when you examine all the operations performed, you see that the work at the bottom is less than the actual work performed to merge the solutions. Mathematically speaking, the work performed for each of the n items, grows asymptotically slower than the work performed to merge the sub-solutions; put differently, for large values of n, the merge operation dominates. So in Big-Oh analysis, the formula becomes: O(n * log(n)).
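
The closed form can be sanity-checked by evaluating the recurrence directly. The sketch below assumes the merge cost is exactly n (instead of O(n)) and that n is a power of two; the value c = 3 is an arbitrary choice for the base-case constant.

```python
def t(n: int, c: int = 1) -> int:
    """Evaluate t(n) = 2 * t(n / 2) + n with t(1) = c, for n a power of two."""
    if n == 1:
        return c  # work at one leaf of the recursion tree
    return 2 * t(n // 2, c) + n  # two half-size subproblems plus a linear merge


for k in range(0, 11):
    n = 2 ** k
    # Closed form from the derivation: n * c (leaves) + n * k (merging), k = log_2(n).
    assert t(n, c=3) == 3 * n + n * k
```

The n * c part is the work at the bottom level and the n * k part is the merge work across the k levels, which matches the cn(lg n) + O(n) total quoted in the question.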
