I am looking for an explanation for this question since I'm studying for the GRE:
An algorithm runs in 10 seconds for an input of size 50. If the algorithm is quadratic, approximately how long in seconds does it take on the same computer if the input has size 100?
I can't see a relation between time and the input.
Considering: O(n^2) -> O(50^2) != 10 (seconds)
Also, I would like to study more about this topic, so please add any sources if you can.
Thanks.
Note that the terminology is sloppy: time complexity has no notion of time (yes, the name is deceiving).
Neglecting terms smaller than O(n^2) since we are working under the big-O framework, a quadratic function can be expressed as
t = c * n^2
Given the (t, n) pair with value (10, 50) we have
10 = c * 2500
c = 1/250 = 4/1000
Solving for (t, 100) we get
t = 4/1000 * 10000 = 40
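The same two steps (fit c from the one measurement, then plug in the new size) can be sketched in Python. This is only an illustration under the assumption that t = c * n^k holds exactly; the names fit_constant and predict_time are made up for the example.

```python
def fit_constant(t, n, k=2):
    """Return c in t = c * n**k, given one measured (t, n) pair."""
    return t / n**k


def predict_time(c, n, k=2):
    """Estimate the running time for input size n under t = c * n**k."""
    return c * n**k


c = fit_constant(10, 50)       # 10 / 2500, i.e. 4/1000
print(c)
print(predict_time(c, 100))    # roughly 40 seconds
```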
There is a faster and more insightful way to solve this problem.
The trained eye can spot the answer as 40 immediately.
Consider this power function:
t = c * n^k
Now let's consider the inputs m0 and m1, with m1 = a * m0 being an (integer) multiple of m0.
Let's compare the respective t0, t1:
t0 = c * (m0)^k
t1 = c * (m1)^k = c * a^k * (m0)^k
So we see that for a polynomial time function t1/t0 = a^k, or t1 = a^k * t0.
The ratio between two outputs is the ratio between their inputs, to the k-th power.
For a quadratic function k = 2, thus the ratio between two outputs is the square of the ratio between two inputs.
If the input doubles (ratio = 2) the output quadruples (ratio = 2^2 = 4).
If the input triples (ratio = 3) the output is nine-fold (ratio = 3^2 = 9).
A good mnemonic trick is to remember that the function from the input ratio to the output ratio is of the same kind as the given function.
We are given a quadratic function, so that is the kind of relationship between the input and output ratios:
input ratio    output ratio
1              1
2              4
3              9
4              16
5              25
...            ...
The problem tells you that the input doubles (from 50 to 100 entries) so the output must quadruple (from 10 to 40 seconds) as the function is quadratic.
As you can see all the data from the problem is used elegantly and without any hardcore computation.
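The ratio rule t1 = a^k * t0 is short enough to write down as code. A minimal sketch, assuming the pure power-law model with no lower-order terms; scaled_time is a hypothetical helper name.

```python
def scaled_time(t0, n0, n1, k=2):
    """Scale a measured time t0 at size n0 to size n1, assuming t = c * n**k."""
    ratio = n1 / n0              # the input ratio a
    return t0 * ratio**k         # the output ratio is a**k


# Reproduce the ratio table for a quadratic function (k = 2).
for a in (1, 2, 3, 4, 5):
    print(a, scaled_time(1, 1, a))

# The original problem: 10 seconds at size 50, scaled to size 100.
print(scaled_time(10, 50, 100))
```

Note that the constant c never appears: it cancels in the ratio, which is why a single measurement is enough under this model.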
Suggesting off-site sources is frowned upon, but in this case I can't help but recommend reading:
Introduction to the Theory of Computation - Michael Sipser
See if you can answer these questions:
If the input is now 150 entries, how much time does it take?
90
If the input is now 30 entries, how much time does it take?
90/25 ~ 4
If the input is now 100 entries but the program is run in a computer that is twice as fast, how much time does it take?
20
What size of the input is necessary to make the program run for at least 1000 seconds?
500
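If you want to check the four answers, here is a small sketch. It assumes the pure quadratic model and treats "twice as fast" as dividing the time by 2; the function names and the speedup parameter are made up for the illustration.

```python
def scaled_time(t0, n0, n1, k=2, speedup=1.0):
    """Time at size n1, given time t0 at size n0, on a machine `speedup` times faster."""
    return t0 * (n1 / n0) ** k / speedup


def size_for_time(t0, n0, target, k=2):
    """Input size at which the running time reaches `target` (inverse of the rule above)."""
    return n0 * (target / t0) ** (1 / k)


print(scaled_time(10, 50, 150))              # input ratio 3   -> about 90 seconds
print(scaled_time(10, 50, 30))               # input ratio 3/5 -> about 3.6 seconds
print(scaled_time(10, 50, 100, speedup=2))   # about 20 seconds
print(size_for_time(10, 50, 1000))           # about 500 entries
```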
Given the time complexity you can't calculate the exact running time (in seconds) of the algorithm. However, it does give you a good idea of the growth rate of measured time.
In a linear time algorithm (O(n)), time is expected to increase linearly as a function of the input size. For example, if 10,000 items take one second to process on some machine, then you should expect 50,000 items to take about 5 seconds, and so on. This isn't the case for a quadratic algorithm (O(n^2)), which "punishes" you more as the input size grows: a 2x increase in input size results in 4x the processing time, a 5x increase in input size results in 25x the processing time, and so on (just as the function F(x) = x^2 behaves).
You can use this post as an introduction, and this one as a more formal explanation.
The Big-O doesn't inform you about the exact correlation between entry size and the execution time, but instead it tells what's the approximate correlation between the entry size growth and the execution time growth. So: O(n) complexity means that the entry size and the execution time are in direct proportion (if the input is 3 times bigger, then the execution time will also be 3 times longer), O(n^2) means that if the entry size is 3 times bigger, then the execution time will be 9 times longer etc.
In your case, the initial execution time for a size 50 entry is 10 seconds. If the input changes to a size 100 entry, then by simply dividing we can tell that it's 100 / 50 = 2 times bigger. Knowing that the algorithm's Big-O is O(n^2), we can tell that the execution time will be 2^2 = 4 times longer. So if the initial execution time was 10 seconds, then the execution time for the bigger input will be approximately 10 seconds * 4 = 40 seconds.
Related SO question: Rough estimate of running time from Big O
Nice article about the big-o: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
I have these two algorithms and I can't work out which one takes more time.
O((n^2)*log(n))
O(n*(2^n))
I calculated the log of each, but I still can't tell which one takes more time.
log((n^2)*log(n)) = 2log(n)+log(log(n))
log(n*(2^n))=log(n)+n*log(2)
The second one, because:
2^n > n^2
and
n > log(n)
For a small range of values of n (roughly n = 5 (~4.738) to 26 (~25.783)) assuming natural logarithms, the first is larger than the second, but above that the second is always larger and becomes increasingly so as n increases.
Plotting it confirms this, here using Mathematica:
f1[n_] := Log[n^2]*Log[n]
f2[n_] := Log[n*(2^n)]
Plot[{f1[n], f2[n]}, {n, 1, 50}]
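For anyone without Mathematica, the two crossover points can also be located numerically. This is a rough Python sketch using plain bisection on the difference of the same two curves (natural logarithms); the helper names and the search brackets [2, 10] and [10, 50] are choices made for this illustration.

```python
import math


def gap(n):
    """f1(n) - f2(n) for the two plotted curves, using natural logarithms."""
    f1 = math.log(n**2) * math.log(n)
    f2 = math.log(n) + n * math.log(2)   # log(n * 2**n), expanded to avoid huge powers
    return f1 - f2


def bisect_root(f, lo, hi, tol=1e-9):
    """Locate a sign change of f inside [lo, hi] by repeated halving."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid        # same sign as the left end: root is to the right
        else:
            hi = mid                     # sign flipped: root is to the left
    return (lo + hi) / 2


print(bisect_root(gap, 2, 10))    # first crossover, near 4.74
print(bisect_root(gap, 10, 50))   # second crossover, near 25.78
```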
Not good enough at math to show proof, but simply filling in a value for "n" will give you a good indication. Let's take n = 100, base for log is 10.
# First algorithm
100^2 * log(100) = 10000 * 2 = 20000
# Second algorithm
100 * 2^100 = 100 * 1267650600228229401496703205376 = 126765060022822940149670320537600
I think it should be pretty obvious to always choose the first algorithm.
You can calculate, reduce, or simplify those expressions.
...but hold on a moment and notice that there is a highest-order term, 2^n, which grows exponentially, way faster (for sufficiently large n) than any of the other terms you have, namely n^2 or log_2(n).
Therefore, O(n*(2^n)) would take far longer for sufficiently large inputs.
input 1 → 2 operations
input 2 → 8 operations
input 3 → 24 operations
input 4 → 64 operations
...
input 10 → 10240 operations
I think you can see the pattern.
Note, that in terms of asymptotic analysis, we are always interested in large inputs, otherwise, for 1, 2 or very small inputs behaviour will be different.
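To see both growth rates side by side, here is a small sketch that evaluates the two expressions directly. It takes the logarithm to be base 2, which is an assumption (the question does not state a base); poly_log and exponential are names made up for the example.

```python
import math


def poly_log(n):
    """n^2 * log2(n), the first expression."""
    return n**2 * math.log2(n)


def exponential(n):
    """n * 2^n, the second expression."""
    return n * 2**n


for n in (1, 2, 3, 4, 10, 20, 40):
    print(n, poly_log(n), exponential(n))
```

Changing the base of the logarithm only rescales the first column by a constant factor, so it does not affect which expression wins for large n.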
Suppose an algorithm is known to be O(N^2) and solving a problem of size M takes 5 minutes. About how long will it take to solve a problem of size 4M?
Is it as simple as ...
M=5min
4M=20min
?
Since Big O is only an approximation, you cannot compute the real time, but you can get an estimate. In your case it would be
1 M ~ 5 min
4 M ~ 5 *(4*4) min ~ 80 min.
Note : I used symbol ~ to show approximation.
O(N^2) => problem with size N will take approximately N^2 time
M will take approximately M^2 time
O(M)~ O(1M)
=> 1^2*M^2
=> M^2
=> 5 min
O(4M) ~ (4M)^2
=> 4^2*M^2
=> 16*M^2
=> 16*5
=> 80 min
If the complexity is O(N^2), this means the time for a problem of size N is roughly k*N^2 for some fixed but unknown value of k.
If you represent the approximate time to run the algorithm on a problem of size N as T(N), then mathematically you have this:
T(N) = k*N^2
T(M) = k*M^2
T(M) = 5 minutes
T(4*M) = k*(4*M)^2
= 16*k*M^2
= 16*T(M)
= 80 minutes
In a nutshell, not necessarily.
When we say that a problem's time complexity is O(N^2), what that means is that given a problem of size N, the time it takes to run conforms roughly to some equation of the form a + bN + cN^2, where a, b, and c are unknown coefficients.
This does mean that eventually the N^2 term will dominate the run-time. But eventually might be a long time away. There might be a large constant set-up time built in (that is, a in the formula above is big), such that 4 of the 5 minutes of your hypothetical scenario don't vary with problem size. In that case, a problem of size 4M would take far less than 16 times as long to run (and, if the remaining minute is mostly the linear term, possibly less than twice as long).
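To make that concrete, here is a sketch of how the estimate for 4M changes with the way the 5 minutes split between the three terms. The splits below are invented for illustration; nothing in the question tells you which one applies.

```python
def time_at_4m(fixed, linear_at_m, quadratic_at_m):
    """T(4M) for T(N) = a + b*N + c*N**2, given each term's share of T(M) in minutes.

    fixed          = a          (does not change with N)
    linear_at_m    = b * M      (grows 4x when N goes from M to 4M)
    quadratic_at_m = c * M**2   (grows 16x when N goes from M to 4M)
    """
    return fixed + 4 * linear_at_m + 16 * quadratic_at_m


print(time_at_4m(0, 0, 5))   # 80: all 5 minutes are quadratic work
print(time_at_4m(4, 0, 1))   # 20: 4 minutes of set-up, 1 minute quadratic
print(time_at_4m(4, 1, 0))   # 8:  4 minutes of set-up, 1 minute linear
```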
Situations along these lines can happen frequently with algorithms that involve hashing (such as some associative array implementations), particularly if a slow hash function such as SHA2 is being used. Which is why for small collections of elements searching a simple array to see if it contains an element might be faster than searching a hash table, even though searching an array is O(N) and searching a hash set is O(1).
Yes, it is simple, but your calculation is wrong.
What you have calculated is linear growth, i.e. O(n): if some input takes five minutes and you double the size of your input, then the time spent is twice that. You state that your algorithm runs in O(n^2), which is quadratic growth.
So your calculation would look like this:
M^2 = 5 minutes <=>
M = sqrt(5) = 2.23607 (approx)
so
(4M)^2 = (4*2.23607)^2 = 80 minutes
Which is quadratic growth.
This is also why you never talk about specific run times in computer science. Whether something takes 5 minutes or 5 hours is not interesting. What is interesting is what happens when we change the size of the input. Because when we implement algorithms we want something that runs faster, no matter what computer is used for testing, as the size of the input grows towards infinity.
Your guess is correct in case you have O(n), but we have O(n^2), which means you need to square the factor of 4
T(M) = (M)^2 = M^2 = 5 minutes
T(4M) = (4 * M)^2 = 4^2 * M^2 = 16 * M^2
Substitute: M^2 = 5 minutes
T(4M) = 16 * M^2 = 16 * 5 minutes
T(4M) = 80 minutes
I'm trying to calculate an appropriate timeout time for a real time simulator that I'm writing:
For p = probability of success, the time for a successful request = m, and the time for a failed attempt = f. What is the average time for 5 successful requests?
Let's call the total number of tries x.
x * p = 5    (on average a fraction p of the x tries succeeds, and we need 5 successes)
x = 5 / p
The total time would be
t = 5m + (x-5)f
or
t = 5m + (5 / p - 5)f
If m=1, f=2, and p=0.1, the answer should be 5 * 1 + 45 * 2 = 95. This checks out.
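One way to sanity-check the formula t = 5m + (5/p - 5)f is to compare it against a quick simulation. This is a rough sketch, assuming every attempt succeeds independently with probability p; the function names, the seed and the run count are arbitrary choices for the example.

```python
import random


def expected_total_time(m, f, p, successes=5):
    """Closed form: successes*m plus the expected number of failures times f."""
    expected_tries = successes / p
    return successes * m + (expected_tries - successes) * f


def simulate_once(m, f, p, rng, successes=5):
    """Keep trying until `successes` attempts have succeeded; return the time spent."""
    total, done = 0.0, 0
    while done < successes:
        if rng.random() < p:
            total += m       # successful request
            done += 1
        else:
            total += f       # failed attempt
    return total


rng = random.Random(0)       # fixed seed so the run is repeatable
runs = 20000
average = sum(simulate_once(1, 2, 0.1, rng) for _ in range(runs)) / runs

print(expected_total_time(1, 2, 0.1))   # the closed-form value, 95 up to rounding
print(average)                          # should land close to it
```

Keep in mind this is only the average. For a timeout you would probably want a high percentile of the simulated times rather than the mean, since roughly half of all runs take longer than average.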
There might be errors in here, but I did my best.
This is probably more appropriate on the stats exchange, by the way, but here's the answer:
If you want the average time, you need to average over the possible total number of trials required to get your five successes. This could be anywhere from 5 to infinity (it requires at least 5 trials to get 5 successes, and you could in theory have an infinitely long sequence of failures). I would suggest you could happily cut this off at a reasonable point to get an answer accurate to several decimal places for anything other than pathological values of p. Let n be the number of trials needed to observe the x=5 successes. The probability that the fifth success arrives exactly on trial n is given by the negative binomial distribution, NB(n; 5, p) = C(n-1, 4) * p^5 * (1-p)^(n-5); the plain binomial Bin(5; n, p) would also count sequences whose fifth success came before trial n. The time associated with n trials is:
5m + (n-5)f
To get the expectation (average) of this quantity, you want the sum of:
NB(n; 5, p) * (5m + (n-5)f)
for n=5 to n=inf. Depending on your value of p, you should be able to stop at n=20 to 30 and still obtain fairly accurate answers (a small p needs a much later cutoff, well past the expected 5/p trials). Beware if you're using a naive implementation that the computation of the binomial coefficient, which involves a term of n!, may fail for moderately large n, so you may want to consider the normal approximation to the binomial.
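A sketch of the truncated sum, under one explicit assumption: each n is weighted by the probability that the fifth success lands exactly on trial n, i.e. C(n-1, 4) * p^5 * (1-p)^(n-5). The function name and the cutoffs are made up for the illustration; math.comb needs Python 3.8 or later and uses exact integers, so the factorial overflow worry does not arise here.

```python
import math


def truncated_expected_time(m, f, p, n_max, successes=5):
    """Sum P(fifth success on trial n) * (time for n trials) for n = successes..n_max."""
    total = 0.0
    for n in range(successes, n_max + 1):
        ways = math.comb(n - 1, successes - 1)     # positions of the first 4 successes
        prob = ways * p**successes * (1 - p) ** (n - successes)
        total += prob * (successes * m + (n - successes) * f)
    return total


# With p = 0.1 the expected number of trials is 50, so a cutoff of 30 misses most of the mass.
print(truncated_expected_time(1, 2, 0.1, 30))
print(truncated_expected_time(1, 2, 0.1, 2000))   # converges to about 95
```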
We have 3 functions with big o notations:
Func A: O(n)
Func B: O(n^2)
Func C: O(2^n)
If each of these functions processes n items in 10 seconds, how much time does each need to process 2 * n items? I would also be happy if you could explain why.
Thank You
Actually, you really can't tell with only one data point. By way of example, a simplistic answer for the first one would be "twice as long", 20 seconds, since O(n) means the time complexity rises directly proportional to the input parameter.
However, that fails to take into account that the big-O is usually simplified to show only the highest effect. The actual time taken may well be proportional to n plus a constant 5 - in other words, there's a constant 5 second set-up time that doesn't depend on n at all, then half a second per n after that.
That would mean the time taken would be 15 seconds rather than 20. And, for the other cases mentioned it's even worse since O(n^2) may actually be proportional to n^2 + 52n + 7, which means you would need three data points, assuming you even know all the components of the equation. It could even be something hideous like:
n^2 + 52*n + 7 + 1/n + 12/(47*n^5)
which would still technically be O(n^2).
If they are simple equations (which is likely for homework), then you just need to put together the equations and then plug in 2n wherever you have n, then re-do the equation in terms of the original:
Complexity  Equation   Double N        Time Multiplier
----------  --------   -------------   ---------------
O(n)        t = n      t = 2n          2
O(n^2)      t = n^2    t = (2n)^2
                         = 4 * n^2     4
O(2^n)      t = 2^n    t = 2^(2n)
                         = 2^n * 2^n   2^n
                                       (i.e., depends on
                                       original n)
So, the answers I would have given would have been:
(A) 20 seconds;
(B) 40 seconds; and
(C) 10 x 2^n seconds.
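The "plug in 2n" step can be done mechanically. A minimal sketch, assuming the simple equations t = n, t = n^2 and t = 2^n with no lower-order terms; the GROWTH table and time_after_doubling are names invented for this example.

```python
# One growth function per complexity class, under the simplifying assumption above.
GROWTH = {
    "O(n)": lambda n: n,
    "O(n^2)": lambda n: n**2,
    "O(2^n)": lambda n: 2**n,
}


def time_after_doubling(name, n, base_time=10):
    """Time for 2n items, given that n items took base_time seconds."""
    g = GROWTH[name]
    return base_time * g(2 * n) / g(n)


for n in (1, 5, 10):
    print(n, [time_after_doubling(name, n) for name in GROWTH])
```

The first two columns stay at 20 and 40 whatever n is; only the exponential one depends on the original n.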
A: 2 times as much
B: 4 times as much
C: 2^n times as much
?
time depends on n now
given time is 10 seconds and n is also 10, this makes 20, 40 and 10240 seconds respectively :)
but if n is 1, it will be 20, 40 and 20...
Here's a hint
Func A is your base measure that takes 1 unit of time to complete. In this problem, your unit of time is 10 seconds. So O(2*n) = 2*O(n) = 2 * Units = 2 * 10 sec = 20 sec.
Just plug 2*n into the n^2 and 2^n functions to get
O((2*n)^2) = O(2^2 * n^2) = O(4 * n^2)
O(2^(2*n)) = O((2^2)^n) = O(4^n)
Now just figure out how many time units each represents and multiply by 10 seconds.
EDIT: C is 10*2^n, I made a mistake in my answer. I'll leave it below but here is the mistake:
The real formula includes the processing rate, which I left out in my original answer. The processing rate falls away in the first two, but it doesn't in C.
2^n/p=10 (where p is the processing rate of units/second)
2^n=10*p
y=2^(n*2)/p
y=(2^n)^2/p
y=(10*p)^2/p
y=100*p^2/p
y=100*p (so we need to know the processing rate. If we know n, then we know the processing rate)
The units are fine above, as we have seconds^2/seconds = seconds.
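As a quick check that y = 100*p agrees with 10*2^n once n is known, here is a small sketch. It assumes the model 2^n / p = 10 from the derivation above; the function names are hypothetical.

```python
def rate_from_measurement(n, seconds=10):
    """Processing rate p implied by 2**n units of work taking `seconds` seconds."""
    return 2**n / seconds


def doubled_time(n, seconds=10):
    """Time for input 2n: 2**(2n) units of work at the same rate p."""
    p = rate_from_measurement(n, seconds)
    return 2 ** (2 * n) / p


for n in (1, 5, 10):
    p = rate_from_measurement(n)
    print(n, doubled_time(n), 100 * p, 10 * 2**n)   # the three values should agree
```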
Original Answer:
A: 20
B: 40
C: 100
The existing answers already explain A and B. Regarding the C equation, if my mathematics serve me correctly....
Part 1: What does n equal
2^n=10
log(2^n)=log(10)
n*log(2)=log(10)
n=log(10)/log(2)
Part 2: Now replace n with 2*n
x = 2^(2*n)
x = 2^(2*log(10)/log(2))
x = 100 (thanks excel: =2^(2*LOG(10)/LOG(2)))
Still I haven't used logarithms in 6 years, so please be forgiving if I'm wrong.
EDIT: Found a simpler way.
Given t is the original time, and y is the new time.
t=2^n
y=2^(n*2)
y=(2^n)^2
y=t^2
y=10^2
y=100
An algorithm having a worst-case running time of O(N^2) took 30 seconds to run for input size N=20. How long will the same algorithm take for input size N=400?
O(n^2) implies proportionality to the square of n (see this guide). So
T = K (n^2)
30 = K (20^2)
K = 30 / 400
Hence time for 400 items
= (30 / 400)( 400 ^ 2 )
So that's 12000 seconds.
Now, that's not necessarily true unless you know that the original 20 item test was a worst case scenario; if it wasn't, then we have a bad estimate of K. Even if we have a good estimate of K, so that we know the worst case scenario for 400 items, we don't know that these particular 400 items will take that long.