What is the Big-O of this two-part algorithm?

Given the following algorithm on a dataset of size N:
1. Separate the data into M = (N/lg N) blocks in O(N) time.
2. Partition the blocks in O(M lg M) time.*
What is the big-O? How do I evaluate (N/lg N) * lg (N/lg N) ?
If it is not O(N), is there an M small enough that the whole thing does become O(N)?
* The partition algorithm is the STL's stable_partition which, in this example, will do M tests and at most M lg M swaps. But, the items being swapped are blocks of size lg N. Does this push the practical time of step 2 back up to O(N lg N) if they must be swapped in place?
Not homework, just a working engineer poking around in comp-sci stuff.

You evaluate by doing a bit of math.
log(x/y) = log(x) - log(y)
->
log(N / log(N)) = log(N) - log(log(N))
So, plugging this back in and combining into a single fraction.
N(log(N) - log(log(N))) / log(N)
=
N - N(log(log(N)) / log(N))
<= (since 0 <= log(log(N)) / log(N) <= 1 for large N, we are subtracting a nonnegative amount from N)
N
So, the whole thing is O(N).
You can pretty easily guess an upper bound of O(N log N) by noticing that M = N / log N is, itself, O(N), so M log M is O(N log N). I don't know of a quicker way to see that it's actually O(N) than working through the log M term as above.
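If you want to see it numerically, here's a quick standalone check (my own illustration, not part of the algorithm in the question) using base-2 logs:

#include <cmath>
#include <cstdio>

int main() {
    // Compare f(N) = (N / lg N) * lg(N / lg N) against N for growing N.
    for (double N = 1e3; N <= 1e15; N *= 100) {
        double M = N / std::log2(N);   // number of blocks
        double f = M * std::log2(M);   // cost of partitioning the M blocks
        std::printf("N = %.0e   f(N) = %.3e   f(N)/N = %.4f\n", N, f, f / N);
    }
    // The ratio f(N)/N stays below 1 (and slowly creeps toward 1),
    // matching f(N) = N * (1 - lg lg N / lg N) <= N.
    return 0;
}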

It is O(N):
(N / lg N) * lg(N / lg N) = (N / lg N) * (lg N - lg lg N) = N * (1 - lg lg N / lg N) <= N

Related

Determine big o for these functions

For the two questions below, the solution manual states:
b. O(n log n)
c. O(n log n)
But as per my understanding, (b) is O(log n) and (c) is O(sqrt(n)).
I agree with you on (c): O(c(n)) is O(sqrt(n)). n^n = e^(n log n), so log log(n^n) is log(n) + log log(n), and the square root grows much faster than that.
For (b), I'm interpreting log(n > 2) to mean log(n) with n > 2. Maybe I'm wrong. In that case the second term is O(n log n), which is bigger than the first term, 5 log(n^10) = 50 log(n).

I thought "NlogN" is "N" times "logN", but why is it described as "double PLUS an amount proportional to N"

I'm currently learning about big-O notation. In the material, doubling N for an O(NlogN) algorithm was described as "doubled plus an amount proportional to N". But I thought that would be O(N + logN) and not O(NlogN) (I thought O(NlogN) means doubled times logN).
Is there something logically wrong with my understanding?
Replace N with 2N as stated:
2N log(2N) = 2N * (log N + log 2) = 2 * (N log N) + (2 log 2) * N (using logarithm rules)
Doubled original term: 2 * (N log N)
Additional term: (2 log 2) * N, i.e. "proportional to N".
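A quick numeric check of this (my own sketch, using base-2 logs so the extra term is exactly 2N) makes the "doubled plus proportional to N" behaviour visible:

#include <cmath>
#include <cstdio>

int main() {
    // For T(N) = N * log2(N), doubling N gives 2*T(N) plus an extra term
    // proportional to N (here exactly 2N, since log2(2) = 1).
    for (double N = 1e3; N <= 1e9; N *= 1000) {
        double t       = N * std::log2(N);
        double doubled = 2 * N * std::log2(2 * N);
        double extra   = doubled - 2 * t;
        std::printf("N = %.0e   extra = %.3e   extra/N = %.3f\n", N, extra, extra / N);
    }
    return 0;
}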

Running time of merge sort on linked lists?

I came across this piece of code to perform merge sort on a linked list.
The author claims that it runs in time O(n log n).
Here is the link for it:
http://www.geeksforgeeks.org/merge-sort-for-linked-list/
My claim is that it takes at least O(n^2) time, and here is my argument.
You divide the list (be it an array or a linked list) log n times (refer to the recursion tree). During each partition, given a list of size i = n, n/2, ..., n/2^k, we take O(i) time to partition the original (or already-divided) list. Since the sum of the O(i) terms is O(n), we can say, sloppily, that any given call of partition takes O(n) time. Given the time taken to perform a single partition, the question is how many partitions happen in all. The number of partitions at level i is 2^i, so summing 2^0 + 2^1 + ... + 2^(lg n) gives us about 2^(lg n) - 1, which is nothing but (n - 1) on simplification, implying that we call partition n - 1 times (let's approximate it to n). So the complexity is at least Omega(n^2).
If I am wrong, please let me know where. Thanks :)
Then, after some retrospection, I applied the master method to the recurrence relation, replacing the Theta(1) term of conventional merge sort on arrays with Theta(n) for this kind of merge sort (because the divide and combine operations each take Theta(n) time), and the running time turned out to be Theta(n lg n).
I also noticed that the cost at each level is n (because 2^i * (n / 2^i) is the time taken at each level), so it is Theta(n) per level times lg n levels, implying Theta(n lg n). Did I just solve my own question? Please help, I am kind of confused.
The recursive complexity definition for an input list of size n is
T(n) = O(n) + 2 * T(n / 2)
Expanding this we get:
T(n) = O(n) + 2 * (O(n / 2) + 2 * T(n / 4))
= O(n) + O(n) + 4 * T(n / 4)
Expanding again we get:
T(n) = O(n) + O(n) + O(n) + 8 * T(n / 8)
Clearly there is a pattern here. Since we can repeat this expansion exactly O(log n) times, we have
T(n) = O(n) + O(n) + ... + O(n) (O(log n) terms)
= O(n log n)
You are performing the sum twice, for some reason.
To split and merge a linked list of size n takes O(n) time, and the depth of recursion is O(log n).
Your argument was that a splitting step takes O(i) time and that the split steps sum to O(n), and then you treated that O(n) as the cost of a single split.
Instead, consider this: a problem of size n forms two n/2 problems, four n/4 problems, eight n/8 problems, and so on, until 2^(log n) subproblems of size n/2^(log n) are formed. Summing these up, you get O(n log n) to perform the splits, and another O(n log n) to combine the subproblems.
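For concreteness, here is a minimal sketch of merge sort on a singly linked list (my own illustration, not the code from the geeksforgeeks link): the split and merge passes each do O(n) work per level of recursion, and there are O(log n) levels.

#include <cstdio>

struct Node { int val; Node* next; };

// Split the list after its midpoint (slow/fast pointers); returns the second half.
Node* split(Node* head) {
    Node* slow = head;
    Node* fast = head->next;
    while (fast && fast->next) { slow = slow->next; fast = fast->next->next; }
    Node* second = slow->next;
    slow->next = nullptr;
    return second;
}

// Merge two already-sorted lists into one sorted list.
Node* merge(Node* a, Node* b) {
    Node dummy{0, nullptr};
    Node* tail = &dummy;
    while (a && b) {
        if (a->val <= b->val) { tail->next = a; a = a->next; }
        else                  { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return dummy.next;
}

// T(n) = 2*T(n/2) + O(n): the split and merge are the O(n) work per call.
Node* merge_sort(Node* head) {
    if (!head || !head->next) return head;                // 0 or 1 nodes: already sorted
    Node* second = split(head);                           // O(n) split
    return merge(merge_sort(head), merge_sort(second));   // O(n) merge
}

int main() {
    // Build the list 4 -> 1 -> 3 -> 5 -> 2 (leaks memory; it's only a demo).
    int vals[] = {4, 1, 3, 5, 2};
    Node* head = nullptr;
    for (int i = 4; i >= 0; --i) head = new Node{vals[i], head};
    head = merge_sort(head);
    for (Node* p = head; p; p = p->next) std::printf("%d ", p->val);  // prints: 1 2 3 4 5
    std::printf("\n");
    return 0;
}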

Where do the functions 2n^2 , 100n log n, and (log n) ^3 fit in the big-O hierarchy?

The big-O hierarchy for any constants a, b > 0 is:
O(a) ⊂ O(log n) ⊂ O(n^b) ⊂ O(C^n).
I need some explanations, thanks.
Leading constants don't matter, so O(2n^2) = O(n^2) and O(100 n log n) = O (n log n).
If f and g are functions then O(f * g) = O(f) * O(g). Now, apparently you are ok accepting that O(log n) < O(n). Multiply both sides by O(n) and you get O(n) * O(log n) = O(n * log n) < O(n * n) = O(n^2).
To see that O((log n)^3) is less than O(n^a) for any positive a is a little trickier, but if you are willing to accept that O(log n) is less than O(n^a) for any positive a, then you can see it by taking the third root of O((log n)^3) and O(n^a). You get O(log n) on the one side, and O(n^(a/3)) on the other side, and the inequality you are looking for is easy to deduce from this.
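A small numeric illustration of that last point (my own; it compares against n^0.1 as an arbitrary example exponent) shows that even a tiny polynomial power eventually overtakes (log n)^3, although the crossover happens at an astronomically large n:

#include <cmath>
#include <cstdio>

int main() {
    // Compare (ln n)^3 with n^0.1 for n = 10^10, 10^20, ..., 10^100.
    for (double e = 10; e <= 100; e += 10) {
        double n = std::pow(10.0, e);
        double log_cubed  = std::pow(std::log(n), 3);
        double small_poly = std::pow(n, 0.1);
        std::printf("n = 1e%.0f   (ln n)^3 = %.3e   n^0.1 = %.3e\n", e, log_cubed, small_poly);
    }
    // (ln n)^3 stays ahead for a long while, but n^0.1 overtakes it
    // somewhere between n = 1e50 and n = 1e100.
    return 0;
}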
You can think about the number each function produces for a given input size: generally, the smaller that number, the faster the algorithm, and the larger the number, the slower. For example, at n = 10: log 10 < 10^b < C^10.

O(n^2) vs O (n(logn)^2)

Is time complexity O(n^2) or O (n(logn)^2) better?
I know that when we simplify it, it becomes
O(n) vs O((logn)^2)
and log n < n, but what about (log n)^2?
n is only less than (log n)^2 for values of n less than about 0.49...
So in general (log n)^2 is better for large n...
But since these O(something) notations always leave out constant factors, in your case it might not be possible to say for sure which algorithm is better...
Here's a graph (the blue line is n and the green line is (log n)^2).
Notice how the difference for small values of n isn't so big and might easily be dwarfed by the constant factors not included in the big-O notation. But for large n, (log n)^2 wins hands down.
For each constant k, asymptotically log(n)^k < n.
The proof is simple: take the log of both sides of the inequality, and you get:
k * log(log(n)) < log(n)
It is easy to see that asymptotically, this is correct.
Semantic note: Assuming here log(n)^k == log(n) * log(n) * ... * log(n) (k times) and NOT log(log(log(...log(n)))..) (k times) as it is sometimes also used.
O(n^2) vs. O(n*log(n)^2)
<=> O(n) vs. O(log(n)^2) (divide by n)
<=> O(sqrt(n)) vs. O(log(n)) (square root)
<=> polynomial vs. logarithmic
Logarithmic wins.
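If you want to see the gap numerically, here's a quick comparison (my own illustration, using base-2 logs):

#include <cmath>
#include <cstdio>

int main() {
    // Compare n^2 with n * (log2 n)^2 for growing n.
    for (double n = 10; n <= 1e8; n *= 100) {
        double quadratic = n * n;
        double n_log_sq  = n * std::pow(std::log2(n), 2);
        std::printf("n = %.0e   n^2 = %.3e   n(log n)^2 = %.3e\n", n, quadratic, n_log_sq);
    }
    // n (log n)^2 falls further and further behind n^2 as n grows.
    return 0;
}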
(log n)^2 is better: if you make the change of variable n = exp(m), the comparison becomes m^2 versus exp(m), and m^2 is better than exp(m).
(log n)^2 is also < n (using base-10 logs here).
Take an example:
n = 5
log n = 0.6989...
(log n)^2 = 0.4885...
You can see, (log n)^2 is further reduced.
Even if you take a much bigger value of n, e.g. 100,000,000, then
log n = 8
(log n)^2 = 64
which is far less than n.
O(n(log n)^2) is better (faster) for large n!
Take the log of both sides:
log(n^2) = 2 log(n)
log(n(log n)^2) = log(n) + 2 log(log(n))
lim (n -> infinity) [log(n) + 2 log(log(n))] / [2 log(n)] = 0.5 (use l'Hôpital's rule: http://en.wikipedia.org/wiki/L'H%C3%B4pital's_rule)

Resources