Assuming n is a positive integer, the composite? procedure is defined as follows:
(define (composite? n)
  (define (iter i)
    (cond ((= i n) #f)
          ((= (remainder n i) 0) #t)
          (else (iter (+ i 1)))))
  (iter 2))
It seems to me that the time complexity here is O(n), or, as a tight bound, Θ(n). I am just eyeballing it right now: because we add 1 to the argument of iter on every pass through the loop, it seems to be O(n). Is there a better explanation?
The function as written is O(n). But if you change the test (= i n) to (< n (* i i)), the time complexity drops to O(sqrt(n)), which is a considerable improvement; if n is a million, the number of iterations drops from about a million to about a thousand. That test works because if n = pq, one of p and q must be at most the square root of n while the other is at least the square root of n; thus, if no factor is found up to the square root of n, n cannot be composite. Newacct's answer correctly suggests that the cost of the arithmetic matters if n is large, but the cost of the arithmetic is log log n, not log n as newacct suggests.
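A sketch of the modified procedure (only the termination test changes from the code above):

(define (composite? n)
  (define (iter i)
    (cond ((< n (* i i)) #f)            ; no divisor found up to sqrt(n), so not composite
          ((= (remainder n i) 0) #t)    ; i divides n, so n is composite
          (else (iter (+ i 1)))))
  (iter 2))

;; For n around a million, iter now runs at most about 1000 times instead of up to a million.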
Different people will give you different answers depending on what they assume and what they factor into the problem.
It is O(n) assuming that the equality and remainder operations you do inside each loop are O(1). It is true that the processor does these in O(1), but that only works for fixed-precision numbers. Since we are talking about asymptotic complexity, and since "asymptotic", by definition, deals with what happens when things grow without bound, we need to consider numbers that are arbitrarily big. (If the numbers in your problem were bounded, then the running time of the algorithm would also be bounded, and thus the entire algorithm would be technically O(1), obviously not what you want.)
For arbitrary-precision numbers, I would say that equality and remainder in general take time proportional to the size of the number, which is log n (unless you can optimize that away with some amortized analysis). So, if we take that into account, the algorithm would be O(n log n). Some might consider this nitpicky.
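To see why the operand size is what matters, note that the number of bits in n grows like log2(n); a small sketch (integer-length is available in many Schemes, e.g. via SRFI 60, Guile or Racket, so treat its availability as an assumption):

(integer-length 1000000)        ; => 20   bits, roughly log2 of a million
(integer-length (expt 10 100))  ; => 333  bits for a 100-digit number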
Related
I just came across this weird discovery: in normal maths, n*log(n) would be less than n, because log n is usually less than 1.
So why is O(nlog(n)) greater than O(n)? (i.e. why is n log n considered to take more time than n)
Does Big-O follow a different system?
It turned out I had misunderstood log n to be less than 1.
After asking a few of my seniors, I learned today that when the value of n is large (which it usually is when we consider Big-O, i.e. the worst case), log n can be greater than 1.
So yeah,
O(1) < O(logn) < O(n) < O(nlogn) holds true.
(I thought this was a dumb question and was about to delete it, but then realised that no question is a dumb question and there might be others with the same confusion, so I left it here.)
...because log n is always less than 1.
This is a faulty premise. In fact, log_b(n) > 1 for all n > b. For example, log_2(32) = 5.
Colloquially, you can think of log n as the number of digits in n. If n is an 8-digit number then log n ≈ 8. Logarithms are usually bigger than 1 for most values of n, because most numbers have multiple digits.
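To put rough numbers on it (my own sketch, using base-2 logs; exact values depend on the chosen base):

(define (log2 x) (/ (log x) (log 2)))

;; n               ~log2(n)    ~n * log2(n)
;; 1,000           10          10,000
;; 1,000,000       20          20,000,000
;; 1,000,000,000   30          30,000,000,000
(map (lambda (n) (list n (log2 n) (* n (log2 n))))
     '(1000 1000000 1000000000))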
Plot both graphs (on Desmos (https://www.desmos.com/calculator) or any other site) and look at the result yourself for large values of n (y = f(n)). I say large values because for small values of n the program will not have a time issue. For convenience I have attached a graph below; you can also try other bases for the log.
The red curve represents time = n and the blue curve represents time = n*log(n).
Here is a graph of the popular time complexities
n*log(n) is clearly greater than n for n>2 (log base 2)
An easy way to remember might be to take two examples:
Imagine the binary search algorithm, whose time complexity is O(log(N)).
If, for each step of binary search, you had to iterate the array of N elements
The time complexity of that task would be O(N*log(N))
Which is more work than iterating the array once: O(N)
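A sketch along those lines (my own example, not from the answer above), assuming a sorted Scheme vector: a single binary search is O(log N) because the range halves at every step; running one search for each of the N elements then costs O(N log N), which is indeed more work than one O(N) pass.

(define (binary-search vec target)   ; O(log n): the search range halves each step
  (let loop ((lo 0) (hi (- (vector-length vec) 1)))
    (if (> lo hi)
        #f
        (let ((mid (quotient (+ lo hi) 2)))
          (cond ((= (vector-ref vec mid) target) mid)
                ((< (vector-ref vec mid) target) (loop (+ mid 1) hi))
                (else (loop lo (- mid 1))))))))

(define (search-all vec)             ; n searches of cost O(log n) each: O(n log n)
  (let loop ((k 0) (found 0))
    (if (= k (vector-length vec))
        found
        (loop (+ k 1)
              (if (binary-search vec (vector-ref vec k)) (+ found 1) found)))))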
In computing, the log is usually base 2, not base 10. So log(2) is 1, and log(n) for n > 2 is a positive number greater than 1.
Only log(1) (which is 0) gives a value less than 1; for every integer n ≥ 2 it is at least 1.
log(n) can be greater than 1 when n is greater than the base b, but that alone doesn't answer why O(n*log n) is greater than O(n).
In complexity analysis the base is usually small (2 or e), so for larger values of n, n*log(n) becomes much greater than n. That is why O(n log n) > O(n).
This graph may help: n·log(n) rises faster than n, and log(n) is greater than 1 once n exceeds the logarithm's base. https://stackoverflow.com/a/7830804/11617347
No matter how two functions behave for small values of n, they are compared against each other when n is large enough. Formally, there is an N such that for every n > N, n·log n >= n. If you choose N = 10, n·log n is greater than n from then on.
Such comparisons are only about large n. When n is small, a function with the worse asymptotic order can actually take less time, because its values may start out smaller. But as n increases, n² grows dramatically, whereas log n grows far more slowly than both n and n², so for large n, log n is by far the most efficient of the three.
For larger values of n, log n becomes greater than 1. Since Big-O is concerned with arbitrarily large n, log n exceeds 1 for all but finitely many values, and hence O(n log n) > O(n).
Remember "big O" is not about values, it is about the shape of the function I can have an O(n2) function that runs faster, even for values of n over a million than a O(1) function...
I just studied how to find the prime factors of a number with this algorithm that basically works this way:
#include <stdio.h>

void printPrimeFactors(long n) {
    while (n % 2 == 0) {                    /* strip out every factor of 2 */
        printf("2\n");
        n /= 2;
    }
    /* at this point n is odd */
    for (long i = 3; i * i <= n; i += 2) {  /* only odd candidates up to sqrt(n) */
        while (n % i == 0) {                /* n is divisible by i */
            printf("%ld\n", i);
            n /= i;
        }
    }
    if (n > 1)                              /* nothing divided the rest, so it is prime */
        printf("%ld\n", n);
}
/* e.g. printPrimeFactors(360) prints 2 2 2 3 3 5 (one factor per line) */
Everything is clear, but I'm not sure how to calculate the complexity in big-O of the program above.
Since division is (I suppose) the most expensive operation, I'd say that in the worst case there could be at most log(N) divisions, but I'm not totally sure about that.
You can proceed like this. First of all, we are only interested in the behavior of the application when N is really big. In that case we can simplify a lot: if two parts have different asymptotic performance, we need only keep the one that performs worst.
The first while can loop at most m times, where m is the smallest integer such that 2^m >= N. Therefore it will loop, at worst, log2(N) times -- which means it performs as O(log N). Note that the base of the log is irrelevant when N is large enough.
The for loop runs O(sqrt N) times. For large N this dwarfs log N, so we can drop the log term.
Finally, we need to evaluate the while inside the for loop. That while only runs when i actually divides N, so in total it executes at most as many times as N can be divided down; with a little thought we can see that, at worst, that is log3(N) times, because 3 is the smallest possible odd divisor.
So the inner while is executed only O(log N) times overall, while the outer for is executed O(sqrt N) times (and most of the time the while body doesn't run at all, because the current candidate doesn't divide N).
In conclusion, the part that takes the longest is the for loop, and that makes the algorithm O(sqrt N). For N = 10^12, for example, that is about a million candidate divisors, compared with at most about 40 successful divisions.
I am currently working on some problems from my textbook, about Big-O notation, and how functions can dominate each other.
These are the functions that I am looking at from my book.
1. n²
2. n² + 1000n
3. n (if n is odd); n³ (if n is even)
4. n (if n ≤ 100); n³ (if n > 100)
I am trying to figure out which functions #1 dominates. I know that both #1 and #2 simplify to n², so it does not dominate #2. However, the split functions (#3 and #4) are giving me problems. #1 dominates the function only under a certain condition, and under the other condition, #1 is dominated by the other function. So does this mean that, since it is not always dominating it, it doesn't technically count as dominating it at all? Does function #1 not dominate any of these functions, or does it dominate #3 for all odd numbers, and #4 for all numbers ≤ 100? The way I see it, #1 does not dominate #2, only dominates #3 for odd numbers, and only dominates #4 for numbers ≤ 100. Am I on the right track?
Thanks for any help anyone can provide. I'm having a real tough time trying to reason this out to myself.
I am not sure what "dominates" means in your case. Let's say "f(n) dominates g(n)" translates to g(n) ∈ O(f(n)), i.e. f grows at least as fast as g, and let's compare worst-case complexities.
So we should calculate the worst case complexity first:
n² is in Θ(n²)
n² + 1000n is also in Θ(n²)
n (if n is odd); n³ (if n is even) is in Θ(n³) (just picking the worst case, which occurs for even n, i.e. half of all values of n)
n (if n ≤ 100); n³ (if n > 100) is also in Θ(n³), since Big-O is about asymptotics (large values of n)
Now we can compare the worst-case complexities and see that #1 dominates only #2.
Maybe you want to use the average case instead of the worst case. But only for #3 could there be a change.
After calculating (n³ + n) / 2 we notice that even the average case of #3 is in Θ(n³).
If you look at the best case you get the first change, and again only for #3. Here the best case is Θ(n), so #3 is dominated by #1.
Notice that the best case of #4 is not Θ(n), since the complexity holds only for n → ∞, so we ignore all cases of n < c₀ where c₀ is a constant.
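As a quick worked check of the first two bounds (the threshold n ≥ 1000 is my own choice of constant, not from the answer above):

n^2 \;\le\; n^2 + 1000n \;\le\; 2n^2 \quad \text{for } n \ge 1000 \qquad\Rightarrow\qquad n^2 + 1000n \in \Theta(n^2)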
I know that the Fibonacci series grows exponentially, so a tree-recursive algorithm requires a number of steps that grows exponentially. However, SICP (2nd edition) says that a tree-recursive Fibonacci algorithm requires only linear space, because we only need to keep track of the nodes above us in the tree.
I understand that the required number of steps grows linearly with Fib(n), but I would also assume that, because the tree expands exponentially, the memory required would need to be exponential as well. Can someone explain why the memory required only grows linearly with n, and not exponentially?
I am guessing this is a consequence of the use of applicative order in evaluation. Given
(define (fib n)
  (cond ((= n 0) 0)
        ((= n 1) 1)
        (else (+ (fib (- n 1))
                 (fib (- n 2))))))
[from Structure and Interpretation of Computer Programs]
normal-order evaluation of (fib 5) would keep expanding until it got to primitive expressions:
(+ (+ (+ (+ (fib 1) (fib 0)) (fib 1)) (+ (fib 1) (fib 0))) (+ (+ (fib 1) (fib 0)) (fib 1)))
That would result in all the leaves of the tree being stored in memory, which would require space exponentially related to n.
But applicative-order evaluation should proceed differently, driving down to primitive expressions along one branch to two leaves, and then ascending the branch and accumulating any side branches. This would result in a maximum length expression for (fib 5) of:
(+ (+ (+ (+ (fib 1) (fib 0)) (fib 1)) (fib 2)) (fib 3))
This expression is much shorter than the expression used in normal-order evaluation. The length of this expression is not affected by the number of leaves in the tree, only the depth of the tree.
This is my answer after staring at that sentence in SICP for more time than I care to admit.
You do not store the whole tree, only as many stack frames as the depth you are currently at.
The difference between normal-order evaluation and applicative-order evaluation is similar to the difference between a breadth-first search and a depth-first search.
In this case the interpreter uses applicative-order evaluation: the combinations appearing as arguments are evaluated one by one, from left to right, and whenever the combination currently being evaluated still contains combinations inside it, the first of those is evaluated next, and so on.
This means the space in use first expands and then shrinks as the first argument combination gets fully evaluated, and the same then happens for the second, the third, and so on. The maximum space needed for the whole evaluation therefore depends only on the depth of the evaluation process.
Hence a tree-recursive Fibonacci algorithm requires linear space. Hope it helps.
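A small sketch (my own instrumentation, not from SICP) that makes this concrete: record the deepest point the recursion ever reaches.

(define max-depth 0)

(define (fib-depth n depth)
  (if (> depth max-depth) (set! max-depth depth))   ; remember the deepest frame seen
  (cond ((= n 0) 0)
        ((= n 1) 1)
        (else (+ (fib-depth (- n 1) (+ depth 1))
                 (fib-depth (- n 2) (+ depth 1))))))

;; (fib-depth 20 1) => 6765 after thousands of recursive calls,
;; yet max-depth is only 20: pending frames grow with n, not with fib(n).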
Binary search has average-case performance of O(log n) and Quick Sort is O(n log n). Is O(n log n) the same as O(n) + O(log n)?
Imagine a database with every person in the world. That's 6.7 billion entries. O(log n) is a lookup on an indexed column (e.g. primary key). O(n log n) is returning the entire population in sorted order on an unindexed column.
O(log n) was finished before you finished reading the first word of that sentence.
O(n log n) is still calculating...
Another way to imagine it:
log n is proportional to the number of digits in n.
n log n is n times greater.
Try writing the number 1000 once versus writing it one thousand times. The first takes O(log n) time, the second takes O(n log n) time.
Now try that again with 6700000000. Writing it once is still trivial. Now try writing it 6.7 billion times. Even if you could write it once per second you'd be dead before you finished.
You could visualize it by plotting both functions against n and comparing them for large n.
No, O(n log n) = O(n) * O(log n)
In mathematics, when you write an expression (e.g. e = mc²) with no operator between the terms, multiplication is implied; so n log n means n × log n.
Normally the way to visualize O(n log n) is "do something which takes log n computations n times."
If you had an algorithm which first iterated over a list and then did a binary search of that list (which would be N + log N), you can express that simply as O(N), because the N dwarfs the log N for large values of N.
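A quick worked check of that simplification (my own inequality, not from the answer above): for n ≥ 2,

n \;\le\; n + \log n \;\le\; 2n \;\Rightarrow\; n + \log n \in \Theta(n), \qquad \frac{n \log n}{n} = \log n \to \infty \;\Rightarrow\; n \log n \notin O(n)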
A log(n) plot increases, but is concave downward, which means:
It increases when n gets larger
Its rate of increase decreases when n gets larger
An n·log(n) plot increases, and is (slightly) concave upward, which means:
It increases when n gets larger
Its rate of increase (slightly) increases when n gets larger
Depends on whether you tend to visualize n as having a concrete value.
If you tend to visualize n as having a concrete value, and the units of f(n) are time or instructions, then O(log n) is n times faster than O(n log n) for a given task of size n. For memory or space units, O(log n) is n times smaller for a given task of size n. In this case, you are focusing on the codomain of f(n) for some known n. You are visualizing answers to questions about how long something will take or how much memory an operation will consume.
If you tend to visualize n as a parameter having any value, then O(log n) is n times more scalable. O(log n) can complete n times as many tasks of size n. In this case, you are focused on the domain of f(n). You are visualizing answers to questions about how big n can get, or how many instances of f(n) you can run in parallel.
Neither perspective is better than the other. The former can be used to compare approaches to solving a specific problem. The latter can be used to compare the practical limitations of the given approaches.