Calculating the running time of a function - performance

I have trouble coming up with the running time of a function that calls other functions. For example, here is a function that converts a binary tree to a list:
(define (tree->list-1 tree)
  (if (null? tree)
      '()
      (append (tree->list-1 (left-branch tree))
              (cons (entry tree)
                    (tree->list-1 (right-branch tree))))))
The explanation is T(n) = 2*T(n/2) + O(n/2), because the append procedure takes linear time.
Solving this recurrence gives T(n) = O(n * log n).
However, cons is also a procedure that combines two elements. Since it is applied at every entry node, why don't we add another O(n) to the solution?
Thank you for any help.

Consider O(n^2), which is clearly quadratic.
Now consider O(n^2 + n). This is still quadratic, so we can reduce it to O(n^2): the + n term is not significant, because it does not change the order of growth.
The same applies here, so we can reduce O(n*log(n) + n) to O(n*log(n)). However, we may not reduce this further to O(log(n)), because the function is not logarithmic.
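Written out, this is the whole argument (just the standard recurrence calculation, nothing beyond what is said above): the single cons per call costs O(1), so across the O(n) calls it contributes only O(n), which the n*log(n) term dominates.

T(n) = 2\,T(n/2) + O(n/2) \;\Longrightarrow\; T(n) = O(n \log n)

T(n) = 2\,T(n/2) + O(n/2) + O(1) \;\Longrightarrow\; T(n) = O(n \log n + n) = O(n \log n)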

If I understand correctly, you are asking about the difference between append and cons.
The time used by (cons a b) does not depend on the values of a and b. The call allocates some memory, tags it with a type tag ("pair") and stores pointers to the values a and b in the pair.
Compare this to (append xs ys). Here append needs to make a new list consisting of the elements in both xs and ys. This means that if xs is a list of n elements, then append needs to allocate n new pairs to hold the elements of xs.
In short: append needs to copy the elements in xs, so its time is proportional to the length of xs. The function cons takes the same (constant) time no matter what arguments it is called with.
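To make the copying visible, here is the textbook list append written in Haskell (only an illustration of the idea, not the implementation of any particular Scheme's append):

-- Rebuilds one new cell per element of xs, so the cost is proportional to length xs.
append :: [a] -> [a] -> [a]
append []     ys = ys
append (x:xs) ys = x : append xs ys

-- By contrast, (:) -- the Haskell analogue of cons -- is a single allocation, O(1).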

Related

Analysing running time, Big O

I am having problems with analysing the time complexity of an algorithm.
For example, the following Haskell code sorts a list.
sort xs
  | isSorted xs = xs
  | otherwise   = sort (check xs)
  where
    isSorted xs = all (== True) (zipWith (<=) xs (drop 1 xs))
    check []  = []
    check [x] = [x]
    check (x:y:xs)
      | x <= y    = x : check (y:xs)
      | otherwise = y : check (x:xs)
So, with n being the length of the list and t_isSorted(n) the running-time function, there is a constant t_drop(n) = c, and t_all(n) = n, t_zipWith(n) = n:
t_isSorted(n) = c + n + n
For t_check:
t_check(1) = c1
t_check(n) = c2 + t_check(n-1), where c2 is the cost of comparing and possibly swapping one element
...
t_check(n) = i*c2 + t_check(n-i), with i = n-1
           = (n-1)*c2 + t_check(1)
           = n*c2 - c2 + c1
And how exactly do I have to combine those to get t_sort(n)? I guess in the worst case, sort xs has to run n-1 times.
isSorted is indeed O(n), since it is dominated by zipWith, which in turn is O(n) because it does a linear pass over its argument.
check itself is O(n): it calls itself at most once per element, and each recursive call works on a list that is shorter by one. The fastest comparison-based sorting algorithm (without knowing anything more about the list) runs in O(n*log(n)) time (equivalently, O(log(n!))). There is a mathematical proof of this, and a single O(n) pass is faster than that, so it cannot possibly be sorting the whole list.
check only moves things one step; it's effectively a single pass of bubble sort.
Consider sorting this list: [3,2,1]
check [3,2,1] = 2:(check [3,1]) -- since 3 > 2
check [3,1] = 1:(check [3]) -- since 3 > 1
check [3] = [3]
which would return the "sorted" list [2,1,3].
Then, as long as the list is not sorted, we loop. Since we might only put one element in its correct position (as 3 did in the example above), we might need O(n) loop iterations.
This gives a total time complexity of O(n) * O(n) = O(n^2).
The time complexity is O(n^2).
You're right: one step takes O(n) time (for both the isSorted and check functions). It is called no more than n times (maybe even n - 1; it doesn't really matter for the complexity), because after the first call the largest element is guaranteed to be last, after the second call the second largest is in place, and in general the last k elements are the largest and properly sorted after k calls. Since check swaps only adjacent elements, it removes at most one inversion per step, and the number of inversions is O(n^2) in the worst case (namely n * (n - 1) / 2), so the time complexity is O(n^2).
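To make the inversion count concrete, here is a small, deliberately naive counter (inversions is a throwaway helper written for this explanation, not part of the code being analysed):

-- Counts the pairs that are out of order. A reversed list of length n has
-- n*(n-1)/2 of them, and each pass of check removes at most one.
inversions :: Ord a => [a] -> Int
inversions xs = length [() | (i, x) <- zip [0 ..] xs, y <- drop (i + 1) xs, x > y]

-- inversions [3,2,1] == 3; inversions (reverse [1..100]) == 4950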

The smallest free number - divide and conquer algorithm

I'm reading the book Pearls of Functional Algorithm Design and tried implementing the divide-and-conquer solution for the smallest free number problem.
import Data.List (partition)

minfree xs = minfrom 0 (length xs) xs

minfrom a 0 _  = a
minfrom a n xs = if m == b - a
                 then minfrom b (n - m) vs
                 else minfrom a m us
  where
    b        = a + 1 + (n `div` 2)
    (us, vs) = partition (< b) xs
    m        = length us
But it works no faster than what one might call the "naive" solution, which is:
import Data.List ((\\))
minfree' = head . (\\) [0..]
I don't know why this is, what's wrong with the divide-and-conquer algorithm, or how to improve it.
I tried using BangPatterns and implementing a version of partition that also returns the first list's length in the tuple, so that the extra traversal for m = length us is eliminated. Neither made an improvement.
The first one takes more than 5 seconds, whereas the second one finishes almost instantly in GHCi on the input [0..9999999].
You have pathological input on which head . (\\) [0..] performs in O(N) time. \\ is defined as follows:
(\\) = foldl (flip delete)
delete x xs is an O(N) operation that removes the first x from xs. foldl (flip delete) xs ys deletes all elements of ys from xs one by one.
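For reference, delete behaves like this hand-rolled version (a sketch of its semantics, not the actual library source):

delete :: Eq a => a -> [a] -> [a]
delete _ []     = []
delete x (y:ys)
  | x == y    = ys               -- found the first occurrence: drop it and stop
  | otherwise = y : delete x ys  -- otherwise keep scanning, one cell at a time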
In [0..] \\ [0..9999999], we always find the next element to be deleted at the head of the list, so the result can be evaluated in linear time.
If you instead type minfree' (reverse [0..9999999]) into GHCi, that takes quadratic time and you find that it pretty much never finishes.
The divide-and-conquer algorithm on the other hand would not slow down on the reversed input.

Complexity of a sortBy in Haskell

I came up with an idea to solve another SO question and am hoping for some help with determining the function's complexity, since I'm not too knowledgeable about that. Would I be correct in guessing that "unsorting" each of the chunks would be O(n * log 2)? And then what would be the complexity of the sortBy call that compares the last of one chunk to the head of another? My guess is that that function would only compare pairs and not require finding one chunk's order in terms of the total list. Also, would Haskell offer a different complexity because of lazy evaluation of the overall function? Thanks in advance!
import Data.List.Split (chunksOf)
import Data.List (sortBy)
rearrange :: [Int] -> [Int]
rearrange = concat
          . sortBy (\a b -> compare (last a) (head b))
          . map (sortBy (\a b -> compare b a))
          . chunksOf 2
Well, step by step:
chunksOf 2 must iterate through the whole list, so it is O(n), and it halves the length of the list. However, since constant multiples don't affect complexity, we can ignore this.
map (sortBy ...) iterates through the whole list, doing a constant-time operation* at each element: O(1 * n) = O(n).
sortBy with a constant-time comparison* is O(n * log n).
concat is O(n).
So in total: O(n + n + n*log n + n) = O((3 + log n) * n) = O(n*log n).
*Since the lists are guaranteed to be of length 2 or less, operations like sorting them and accessing their last element are O(2 * log 2) and O(2) respectively, which are both constant time, O(1).
Let's look at the parts in isolation (let n be the length of the list argument):
chunksOf 2 is O(n), resulting in a list of length (n+1) `quot` 2.
map (sortBy ...): Since all lists that are passed to sortBy ... have length <= 2, each of those sorts is O(1), and thus the entire map is O(n) again.
sortBy (\a b -> compare (last a) (head b)): The comparison is always O(1), since the lists whose last element is taken are of bounded length (<= 2), thus the entire sortBy operation is O(n*log n)
concat is O(n) again.
So overall, we have O(n*log n).
Note, however, that
cmp = \a b -> compare (last a) (head b)
is an inconsistent comparison: for two lists a and b (say [30,10] and [25,15]), you can have both
cmp a b = LT and cmp b a = LT
I'm not sure that your algorithm always works.
After looking at the implementation of sortBy and tracing the sort a bit in my head, I think that for the given purpose, it works (provided the list elements are distinct) and the inconsistent comparison does no harm. For some sorting algorithms, an inconsistent comparison might cause the sorting to loop, but for merge sort variants, that should not occur.
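For concreteness, evaluating that comparison both ways on the example above (a GHCi session sketch):

ghci> let cmp a b = compare (last a) (head b)
ghci> cmp [30,10] [25,15]
LT
ghci> cmp [25,15] [30,10]
LT

Both directions report LT, so each of the two lists compares as "less than" the other.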

More Efficient Runtime in Scheme - AVL trees

So I basically have a function avl? that runs in O(n^2). This is because every time I recurse, I call height, which is an O(n) function (where n is the number of nodes in the tree).
(define (height t)
  (cond
    [(empty? t) 0]
    [else (+ 1 (max (height (BST-left t)) (height (BST-right t))))]))

(define (avl? t)
  (cond
    [(empty? t) #t]
    [else (and (avl? (BST-left t))
               (avl? (BST-right t))
               (>= 1 (abs (- (height (BST-left t))
                             (height (BST-right t))))))]))
My problem is that I want to make avl? run in O(n) time. I was given the hint: "You should try to limit calling your height function within a constant time no matter how large the BST you are applied to. In this way, you can get a O(n) running time over all." I'm not sure how to make height run in constant time, though. Any suggestions for making avl? run in O(n) rather than O(n^2)?
If you are not allowed to store the height in the tree, you can avoid recomputing it by having a worker function that returns both the height of a tree and whether it is an AVL tree. Then each node is looked at exactly once, and you have an O(n) algorithm. Call the worker from a wrapper that discards the height part of the worker's result. You should of course short-circuit: if some subtree is found to violate the balancing condition, don't bother checking any more subtrees; just return #f and a bogus height. A sketch of the idea follows.
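Here is a minimal sketch of that worker, written in Haskell for brevity (the question is in Racket, but the structure carries over); the Tree type and the names avlWorker/isAVL are made up for this illustration, and the short-circuiting is left out:

data Tree a = Leaf | Node (Tree a) a (Tree a)

-- One O(n) traversal returning (is this subtree an AVL tree?, its height).
avlWorker :: Tree a -> (Bool, Int)
avlWorker Leaf = (True, 0)
avlWorker (Node l _ r) =
  let (okL, hL) = avlWorker l
      (okR, hR) = avlWorker r
      balanced  = abs (hL - hR) <= 1
  in (okL && okR && balanced, 1 + max hL hR)

-- The wrapper simply forgets the height part of the result.
isAVL :: Tree a -> Bool
isAVL = fst . avlWorker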
Another option would be storing the height in every node, where the value represents the height of the subtree rooted in that node. Clearly, with this approach returning the height of a subtree would be an O(1) operation.
That implies that all the operations that modify the tree (insertion, deletion, etc.) must keep the height up to date whenever there's a structural change in the tree.
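A sketch of that height-caching idea, again in Haskell with made-up names; the point is that nodes are only ever built through a "smart constructor" that maintains the cached height:

-- Each node caches the height of the subtree rooted at it.
data HTree a = HLeaf | HNode Int (HTree a) a (HTree a)

-- O(1): just read the cached value.
heightH :: HTree a -> Int
heightH HLeaf           = 0
heightH (HNode h _ _ _) = h

-- Every insertion/deletion should rebuild nodes via this constructor,
-- so the cached heights stay correct after structural changes.
node :: HTree a -> a -> HTree a -> HTree a
node l x r = HNode (1 + max (heightH l) (heightH r)) l x r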

Time complexity of the acc function in Scheme?

I have been trying to find a tight bound on the time complexity of this function with respect to just one of the arguments. I thought it was O(p^2) (or rather big Theta), but I am not sure anymore.
(define (acc p n)
  (define (iter p n result)
    (if (< p 1)
        result
        (iter (/ p 2) (- n 1) (+ result n))))
  (iter p n 1))
#sarahamedani, why would this be O(p^2)? It looks like O(log p) to me. The runtime should be insensitive to the value of n.
You are summing a series of numbers, counting down from n. The number of times iter will iterate depends on how many times p can be halved before it becomes less than 1 - roughly the position of the leftmost '1' bit in p. That means the number of times iter runs is proportional to log p.
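One quick way to see this is to count the loop steps directly (iterCount is a throwaway helper, written here in Haskell rather than Scheme, purely for illustration):

-- Counts how many times the loop body runs before p drops below 1.
iterCount :: Double -> Int
iterCount p
  | p < 1     = 0
  | otherwise = 1 + iterCount (p / 2)

-- iterCount 1024 == 11, iterCount 2048 == 12: doubling p adds one step.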
You might try to eyeball it, or approach it more systematically. Assuming we're doing this from scratch, we should try to build a recurrence relation from the function definition.
We can assume, for the moment, a very simple machine model where arithmetic operations and variable lookups are constant time.
Let iter-cost be the name of the function that counts how many steps it takes to compute iter, and let it be a function of p, since iter's termination depends only on p. Then you should be able to write an expression for iter-cost(0). Can you do that for iter-cost(1), iter-cost(2), iter-cost(3), and iter-cost(4)?
More generally, given a p greater than zero, can you express iter-cost(p)? It will be in terms of constants and a recursive call to iter-cost. If you can express it as a recurrence, then you're in a better position to express it in a closed form.
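For reference, one shape the finished recurrence could take under that machine model (the constants c_0 and c_1 are placeholders; this only sketches where the questions above lead):

\text{iter-cost}(p) =
  \begin{cases}
    c_0 & \text{if } p < 1 \\
    c_1 + \text{iter-cost}(p/2) & \text{if } p \ge 1
  \end{cases}
\;\Longrightarrow\;
\text{iter-cost}(p) = \Theta(\log p)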
