function foo(n)
if n = 1 then
return 1
else
return foo(rand(1, n))
end if
end function
If foo is initially called with m as the parameter, what is the expected number of times that rand() would be called?
BTW, rand(1,n) returns a uniformly distributed random integer in the range 1 to n.
A simple example is the expected number of calls needed to compute foo(2). Say this number is x; then x = 1 + 0/2 + x/2, because we make the one actual call, and then with probability 1/2 we move to foo(1) (which makes no further calls) and with probability 1/2 we stay at foo(2). Solving the equation gives x = 2.
As with most running-time analyses of recursion, we try to get a recursive formula for the running time. We can use linearity of expectation to proceed through the random call:
E[T(1)] = 0
E[T(2)] = 1 + (E[T(1)] + E[T(2)])/2 = 2
E[T(n)] = 1 + (E[T(1)] + E[T(2)] + ... + E[T(n)])/n
= 1 + (E[T(1)] + E[T(2)] + ... + E[T(n-1)])/n + E[T(n)]/n
= 1 + (E[T(n-1)] - 1)(n-1)/n + E[T(n)]/n
Hence
E[T(n)](n-1) = n + (E[T(n-1)] - 1)(n-1)
And so, for n > 1:
E[T(n)] = 1/(n-1) + E[T(n-1)]
= 1/(n-1) + 1/(n-2) + ... + 1/2 + 2
= Harmonic(n-1) + 1
= O(log n)
This is also what we might intuitively have expected, since n should approximately halve at each call to foo.
We may also consider the worst case 'with high probability'. For this it's easy to use Markov's inequality, which says P[X >= a*E[X]] <= 1/a, or equivalently P[X < a*E[X]] >= 1 - 1/a. Setting a = 100, we get that with 99% probability the algorithm makes fewer than 100 * (Harmonic(n-1) + 1) = O(log n) calls to rand.
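To sanity-check the formula, here is a small Python simulation (a throwaway sketch; it models rand(1, n) with random.randint and averages over an arbitrary number of trials):

import random

def count_rand_calls(n):
    # foo is tail recursive, so run it as a loop and count
    # how many times rand(1, n) would be invoked.
    calls = 0
    while n != 1:
        calls += 1
        n = random.randint(1, n)
    return calls

def expected(n):
    # Harmonic(n-1) + 1
    return sum(1.0 / k for k in range(1, n)) + 1

trials = 100000
for n in (2, 10, 100):
    avg = sum(count_rand_calls(n) for _ in range(trials)) / trials
    print(n, avg, expected(n))

For n = 2 the average settles near 2, matching E[T(2)] above.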
Related
T(n) = 4T(n/4) + log(4n) is the recurrence provided, and I was wondering how to write pseudocode based on it.
This says: make four recursive calls on quarters of the input, and additionally do an amount of work equal to log(4n) (rounded down) in the current invocation.
Note that log(4n) = log 4 + log n = 2 + log n, with base-2 logarithms.
Something like this gets pretty close:
function foo(arr[1...n])
    if n <= 1 then return 1
    c = 1
    while c < n do
        c *= 4
    return c + foo(arr[1...n/4]) + foo(arr[n/4+1...n/2]) + foo(arr[n/2+1...3n/4]) + foo(arr[3n/4+1...n])
The recurrence here is T(n) = 4T(n/4) + log(4n) + 15, if I counted correctly and assuming all basic operations take the same time to run.
We can bring that 15 down by making the algorithm sillier and sillier. Store n/4 in a variable, call foo on the first quarter of arr four times, but return only c; this brings the 15 down to 1, I think. To get rid of that one remaining operation, start c at 4 instead, killing one loop iteration (2 ops), and add one op back at the end, e.g. return c + 1 instead of return c.
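For what it's worth, here is the pseudocode above as runnable Python (a sketch; the rounding details are glossed over, and the base case is widened to n < 4 so each quarter slice is non-empty):

def foo(arr):
    n = len(arr)
    if n < 4:  # widened base case so the quarter slices below are non-empty
        return 1
    # The loop body runs about log4(n) times; at two operations per
    # iteration that is roughly log2(n) work, i.e. the log(4n) term.
    c = 1
    while c < n:
        c *= 4
    q = n // 4
    # Four recursive calls on (roughly) quarters of the array: the 4T(n/4) term.
    return (c + foo(arr[0:q]) + foo(arr[q:2*q])
              + foo(arr[2*q:3*q]) + foo(arr[3*q:n]))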
How can I turn the following recursive algorithm into an iterative algorithm?
count(integer: n)
    for i = 1...n
        return count(n-i) + count(n-i)
    return 1
Essentially this algorithm computes the following:
count(n-1) + count(n-2) + ... + count(1)
This is not a tail recursion, so it is not trivial to transform it into an iterative algorithm.
However, a recursion can be simulated using a stack and loop pretty easily, by pushing to the stack rather than recursing.
stack = Stack()
stack.push(n)
count = 0
while (stack.empty() == false):
    current = stack.pop()
    count++
    for i from current-1 to 1 inclusive (and descending):
        stack.push(i)
return count
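Here is the same idea as runnable Python (a quick sketch; a plain list serves as the stack):

def count_iterative(n):
    # Each pop stands in for one call in the recursion tree; instead of
    # recursing on current-1 .. 1, we push those values for later.
    stack = [n]
    count = 0
    while stack:
        current = stack.pop()
        count += 1
        for i in range(current - 1, 0, -1):
            stack.append(i)
    return count

print([count_iterative(n) for n in range(1, 8)])  # 1, 2, 4, 8, 16, 32, 64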
Another solution is doing it with Dynamic Programming, since you don't need to calculate the same thing multiple times:
DP = new int[n+1]
DP[0] = 1
for i from 1 to n:
    DP[i] = 0
    for j from 0 to i-1:
        DP[i] += DP[j]
return DP[n]
Note that you can even optimize it to run in O(n) rather than O(n^2), by keeping a running sum:
sum = 1
current = 1
for i from 1 to n:
    current = sum
    sum = sum + current
return current
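Both iterative versions are easy to cross-check in Python (a throwaway sketch; the variable total replaces the pseudocode's sum so Python's built-in sum stays usable):

def count_dp(n):
    # O(n^2): DP[i] is the sum of all previous entries.
    dp = [0] * (n + 1)
    dp[0] = 1
    for i in range(1, n + 1):
        dp[i] = sum(dp[0:i])
    return dp[n]

def count_fast(n):
    # O(n): carry the running sum instead of re-adding every prefix.
    total = 1
    current = 1
    for i in range(1, n + 1):
        current = total
        total = total + current
    return current

assert all(count_dp(n) == count_fast(n) for n in range(12))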
Lastly, this actually sums to something you can easily pre-calculate: count(n) = 2^(n-1) for n >= 1, with count(0) = 1. (You can suspect it from the last iterative solution above...)
Base: count(0) automatically yields 1, as the loop's body is not reached.
Hypothesis: T(k) = 2^(k-1) for all 1 <= k < n
Proof:
T(n) = T(n-1) + T(n-2) + ... + T(1) + T(0) =    (induction hypothesis)
= 2^(n-2) + 2^(n-3) + ... + 2^0 + 1 =
= sum { 2^i | i = 0,...,n-2 } + 1 =             (sum of a geometric series)
= (1 - 2^(n-1))/(1 - 2) + 1 = (2^(n-1) - 1) + 1 = 2^(n-1)
If you define your problem in the following recursive way:
count(integer: n)
    if n == 0 return 1
    return count(n-1) + count(n-1)
Converting to an iterative algorithm is a typical application of backwards induction where you should keep all previous results:
count(integer: n):
    result[0] = 1
    for i = 1..n
        result[i] = result[i-1] + result[i-1]
    return result[n]
It is clear that this is more complicated than it needs to be, because the point is to exemplify backwards induction. I could accumulate into a single variable, but I wanted to present a more general scheme that can be extended to other cases. In my opinion the idea is clearer this way.
The pseudocode can be improved after the key idea is clear. In fact, there are two very simple improvements that are applicable only to this specific case:
instead of keeping all previous values, only the last one is necessary
there is no need for two identical calls as there are no side-effects expected
Going beyond that, it is possible to calculate from the definition of the function that count(n) = 2^n.
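In Python, the same backwards induction (with a check of the 2^n claim) might look like this sketch:

def count_iter(n):
    # Keep all previous results, as in the backwards-induction version.
    result = [0] * (n + 1)
    result[0] = 1
    for i in range(1, n + 1):
        result[i] = result[i - 1] + result[i - 1]
    return result[n]

assert all(count_iter(n) == 2 ** n for n in range(10))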
The statement return count(n-i) + count(n-i) appears to be equivalent to return 2 * count(n-i). In that case:
count(integer: n)
    result = 1
    for i = 1...n
        result = 2 * result
    return result
What am I missing here?
I am expressing the algorithms in pseudocode. I'm just wondering if my design works as well as the original one displayed below. The algorithm is supposed to compute the sum of the first n odd positive integers.
This is how the algorithm should look:
procedure sumofodds(n: positive integer)
    if n = 1
        return 1
    else
        return sumofodds(n-1) + (2n-1)
This is how I designed my algorithm:
procedure odd(n: positive integer)
    if n = 1
        return 1
    if n % 2 > 0
        return n + odd(n-1)    // this means n is odd
    if n % 2 = 0
        return 0 + odd(n-1)    // this means n is even
Your algorithm is not the same as the original.
The original computes the sum of the first n odd numbers.
Your algorithm computes the sum of all the odd numbers in the range 1..n.
So for an input of n=3, the first algorithm will compute 1+3+5, while your algorithm will compute 1+3.
(If you want a quicker way, then the formula n*n computes the sum of the first n odd numbers)
One small improvement that might help is defining it with tail recursion. Tail recursion happens when the very last thing to execute is the recursive call. To make this tail recursive, use a helper method and pass the running sum as a parameter. The pseudocode below is tail recursive because, regardless of the result of the odd/even check, the final step is the recursive call (the math happens before the recursive call).
procedure SumOdds(n)
    return SumOddsHelper(n, 0)

procedure SumOddsHelper(n, sum)
    if n = 1 return sum + 1
    if n is odd return SumOddsHelper(n-1, sum + n)
    else return SumOddsHelper(n-1, sum)
Let me suggest that you implement your idea in Python. You may be surprised to see that the working code is very similar to pseudocode.
This is the original algorithm:
def sum_of_n_odds(n):
    if n == 1:
        return 1
    else:
        return sum_of_n_odds(n-1) + (2*n-1)
And this is the one you wrote:
def sum_of_odds_up_to_n(n):
    if n == 1:
        return 1
    if n % 2 > 0:  # this means n is odd
        return n + sum_of_odds_up_to_n(n-1)
    if n % 2 == 0:  # this means it's even
        return 0 + sum_of_odds_up_to_n(n-1)
These two algorithms compute different things. Calling sum_of_n_odds(10) yields the same result as calling sum_of_odds_up_to_n(19) or sum_of_odds_up_to_n(20). In general, sum_of_odds_up_to_n(n) is equivalent to sum_of_n_odds((n+1)//2), where // means integer division.
If you're interested in making your implementation a little more efficient, I suggest that you omit the final if condition, where n % 2 == 0. An integer is either odd or even, so if it isn't odd, it must be even.
You can get another performance gain by making the recursive call sum_of_odds_up_to_n(n-2) when n is odd. Currently you are wasting half of your function calls on even numbers.
With these two improvements, the code becomes:
def sum_of_odds_up_to_n(n):
    if n <= 0:
        return 0
    if n % 2 == 0:
        return sum_of_odds_up_to_n(n-1)
    return n + sum_of_odds_up_to_n(n-2)
And this is the tail-recursive version:
def sum_of_odds_up_to_n(n, partial=0):
    if n <= 0:
        return partial
    if n % 2 == 0:
        return sum_of_odds_up_to_n(n-1, partial)
    return sum_of_odds_up_to_n(n-2, partial+n)
You should not expect performance gains from the above because Python does not optimize for tail recursion. However, you can rewrite tail recursion as iteration, which will run faster because it doesn't spend time allocating a stack frame for each recursive call:
def sum_of_odds_up_to_n(n):
    partial = 0
    if n % 2 == 0:
        n -= 1
    while n > 0:
        partial += n
        n -= 2
    return partial
The fastest implementation of all relies on mathematical insight. Consider the sum:
1 + 3 + 5 + ... + (n-4) + (n-2) + n
Observe that you can pair the first element with the last element, the second element with the second last element, the third element with the third last element, and so on:
(1 + n) + (3 + n-2) + (5 + n-4) + ...
It is easy to see that this is equal to:
(n + 1) + (n + 1) + (n + 1) + ...
How many terms (n + 1) are there? Since we're pairing up two terms at a time from the original sequence, there are half as many terms in the (n + 1) sequence.
You can check for yourself that the original sequence has (n + 1) / 2 terms. (Hint: see what you get if you add 1 to every term.)
The new sequence has half as many terms as that, or (n + 1) / 4. And each term in the sequence is (n + 1), so the sum of the whole sequence is:
(n + 1) * (n + 1) / 4
The resulting Python program is this:
def sum_of_odds_up_to_n(n):
    if n <= 0:
        return 0
    if n % 2 == 0:
        n -= 1
    return (n+1)*(n+1)//4
The Fizz-Buzz function (in pseudocode) takes any positive integer n. I'm especially curious about the algebraic breakdown of the cost and time required by the if-else statements. I know its worst-case running time is O(n).
Fizz-Buzz(n)
    for i = 1 to n
        if (i % 3 == 0)
            print "fizz"
        if (i % 5 == 0)
            print "buzz"
        if (i % 3 != 0 and i % 5 != 0)
            print i
Example of the breakdown of another algorithm: [per-statement cost/times table omitted]
The time complexity is O(n) because the if statements have no real effect on that: the work they contribute is bounded by a constant per iteration.
The if statements may actually do a different amount of work in iterations where i is a multiple of three or five, but the amount of extra work per loop iteration does not depend on n. In fact, it averages out to a constant as n becomes bigger.
And, as an aside, I think that code may be wrong. At multiples of fifteen, it should print both fizz and buzz.
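Here is the pseudocode transcribed into runnable Python (a sketch), so you can check the behavior at multiples of fifteen yourself:

def fizz_buzz(n):
    for i in range(1, n + 1):
        if i % 3 == 0:
            print("fizz")
        if i % 5 == 0:
            print("buzz")
        if i % 3 != 0 and i % 5 != 0:
            print(i)

fizz_buzz(15)  # at i = 15 this prints "fizz" and then "buzz"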
If you want to do it to the level in your edit (the added breakdown), you simply need to assign an arbitrary cost ci to each statement (and this cost is constant for a single execution of that statement) then figure out how many times each statement is run.
For example, the first if sequence runs n times, the print "fizz" runs for one-third of those, n/3. So you end up with something like this table:
                              cost    times
Fizz-Buzz(n)
    for i = 1 to n             c1     n
      if (i % 3 == 0)          c2     n
        print "fizz"           c3     n / 3        [call this a]
      else
        if (i % 5 == 0)        c4     n - a
          print "buzz"         c5     (n - a) / 5  [call this b]
        else
          print i              c6     n - a - b
Add up all of those as per your example (substituting the n-equations for a and b) and, in the end, you'll still end up with something dependent on n, hence an O(n) algorithm. It'll look something like:
c1*n + c2*n + c3*n/3 + c4*(n-a) + c5*(n-a)/5 + c6*(n-a-b)
= c1*n + c2*n + (c3/3)*n + c4*(n-n/3) + (c5/5)*(n-n/3) + c6*(n-n/3-(n-n/3)/5)
= c1*n + c2*n + (c3/3)*n + c4*(2/3)*n + (c5/5)*(2/3)*n + c6*(n-n/3-(n-n/3)/5)
= c1*n + c2*n + (c3/3)*n + (c4*2/3)*n + (c5*2/15)*n + c6*(n*8/15)
= c1*n + c2*n + (c3/3)*n + (c4*2/3)*n + (c5*2/15)*n + (c6*8/15)*n
= (c1 + c2 + (1/3)*c3 + (2/3)*c4 + (2/15)*c5 + (8/15)*c6) * n
All those values inside parentheses are in fact constants (since they're multiples of constants) so the whole thing is a constant multiplier of n.
Now, if you find a minor mistake in those equations above, I wouldn't be too surprised - I haven't done this level of math for quite a few years, and I may well have thrown in a furphy in case this is for homework and you try copying it verbatim :-)
But the only mistake you're likely to find is the value of the constant multiplier itself. It will still be a constant multiplier of some description, I'll guarantee that.
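If you want to see those constants empirically, a quick Python sketch (mirroring the else-structured table above) can tally how often each statement runs; the ratios settle near 1/3, 2/3, 2/15 and 8/15:

def fizz_buzz_cost(n):
    # Count how many times each statement executes.
    counts = {"loop": 0, "if3": 0, "fizz": 0, "if5": 0, "buzz": 0, "print_i": 0}
    for i in range(1, n + 1):
        counts["loop"] += 1
        counts["if3"] += 1
        if i % 3 == 0:
            counts["fizz"] += 1
        else:
            counts["if5"] += 1
            if i % 5 == 0:
                counts["buzz"] += 1
            else:
                counts["print_i"] += 1
    return counts

for n in (15, 150, 1500):
    c = fizz_buzz_cost(n)
    print(n, {k: round(v / n, 3) for k, v in c.items()})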
The rule in a particular game is that a character's power is proportional to the triangular root of the character's experience. For example, 15-20 experience gives 5 power, 21-27 experience gives 6 power, 28-35 experience gives 7 power, etc. Some players are known to have achieved experience in the hundreds of billions.
I am trying to implement this game on an 8-bit machine that has only three arithmetic instructions: add, subtract, and divide by 2. For example, to multiply a number by 4, a program would add it to itself twice. General multiplication is much slower; I've written a software subroutine to do it using a quarter-square table.
I had considered calculating the triangular root T(p) through bisection search for the successive triangular numbers bounding an experience number from above and below. My plan was to use a recurrence identity for T(2*p) until it exceeds experience, then use that as the upper bound for a bisection search. But I'm having trouble finding an identity for T((x+y)/2) in the bisection that doesn't use either x*y or (x+y)^2.
Is there an efficient algorithm to calculate the triangular root of a number with just add, subtract, and halve? Or will I end up having to perform O(log n) multiplications, one to calculate each midpoint in the bisection search? Or would it be better to consider implementing long division to use Newton's method?
Definition of T(x):
T(x) = (x * (x + 1))/2
Identities that I derived:
T(2*x) = 4*T(x) - x
# e.g. T(5) = 15, T(10) = 4*15 - 5 = 55
T(x/2) = (T(x) + x/2)/4
# e.g. T(10) = 55, T(5) = (55 + 5)/4 = 15
T(x + y) = T(x) + T(y) + x*y
# e.g. T(3) = 6, T(7) = 28, T(10) = 6 + 28 + 21 = 55
T((x + y)/2) = (T(x) + T(y) + x*y + (x + y)/2)/4
# e.g. T(3) = 6, T(7) = 28, T(5) = (6 + 28 + 21 + 10/2)/4 = 15
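These identities are easy to spot-check in Python (a throwaway sketch; the T(x/2) identity is applied to 2*x so everything stays an integer):

def T(x):
    # Triangular number: T(x) = x*(x+1)/2
    return x * (x + 1) // 2

for x in range(1, 40):
    assert T(2 * x) == 4 * T(x) - x                 # doubling identity
    assert T(x) == (T(2 * x) + x) // 4              # halving identity
    for y in range(1, 40):
        assert T(x + y) == T(x) + T(y) + x * y      # addition identity
        if (x + y) % 2 == 0:
            m = (x + y) // 2
            assert T(m) == (T(x) + T(y) + x * y + m) // 4  # midpoint identity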
Do bisection search, but make sure that y - x is always a power of two. (This does not increase the asymptotic running time.) Then T((x + y) / 2) = T(x) + T(h) + x * h, where h is a power of two, so x * h is computable with a shift.
Here's a Python proof of concept (hastily written, more or less unoptimized but avoids expensive operations).
def tri(n):
    return ((n * (n + 1)) >> 1)

def triroot(t):
    y = 1
    ty = 1
    # Find a starting point for bisection search by doubling y using
    # the identity T(2*y) = 4*T(y) - y. Stop when T(y) exceeds t.
    # At the end, y = 2*x, tx = T(x), and ty = T(y).
    while (ty <= t):
        assert (ty == tri(y))
        tx = ty
        ty += ty
        ty += ty
        ty -= y
        x = y
        y += y
    # Now do bisection search on the interval [x .. x + h),
    # using these identities:
    #   T(x + h) = T(x) + T(h) + x*h
    #   T(h/2) = (T(h) + h/2)/4
    th = tx
    h = x
    x_times_h = ((tx + tx) - x)
    while True:
        assert (tx == tri(x))
        assert (x_times_h == (x * h))
        # Divide h by 2
        h >>= 1
        x_times_h >>= 1
        if (not h):
            break
        th += h
        th >>= 1
        th >>= 1
        # Calculate the midpoint of the search interval
        tz = ((tx + th) + x_times_h)
        z = (x + h)
        assert (tz == tri(z))
        # If the midpoint is below the target, move the lower bound
        # of the search interval up to the midpoint
        if (t >= tz):
            tx = tz
            x = z
            x_times_h += ((th + th) - h)
    return x

for q in range(1, 100):
    p = triroot(q)
    assert (tri(p) <= q < tri((p + 1)))
    print(q, p)
As observed on the linked math.stackexchange.com page, there is a direct formula for the solution of this problem: if x = n*(n+1)/2, then the inverse is:
n = (sqrt(1+8*x) - 1)/2
There is still the square root (and some other things) to compute, but I would suggest using this direct formula, with an implementation like the following:
tmp = x + x;          '2*x
tmp += tmp;           '4*x
tmp += tmp + 1;       '8*x + 1
n = 0;
n2 = 0;
while (n2 <= tmp) {
    n2 += n + n + 1;  'remember that (n+1)^2 - n^2 = 2*n + 1
    n++;
}
'here, after the loop, n = floor(sqrt(8*x+1)) + 1
n -= 2;               'floor(sqrt(8*x+1)) - 1
n /= 2;               '(floor(sqrt(8*x+1)) - 1) / 2
Of course this can be improved for better performance if needed, for example by considering that the integer values of floor(sqrt(8*x+1)) + 1 are even, so n can be incremented in steps of 2 (rewriting the n2 calculation accordingly: n2 += n + n + n + n + 4, which can itself be written better than this).
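Here is that approach as runnable Python (a sketch; >> 1 stands in for the machine's divide-by-2 instruction, and everything else is addition and subtraction):

def triroot_direct(x):
    # n = (sqrt(8*x + 1) - 1) / 2, using only add, subtract and halve.
    tmp = x + x          # 2*x
    tmp += tmp           # 4*x
    tmp += tmp + 1       # 8*x + 1
    n = 0
    n2 = 0               # invariant: n2 == n*n
    while n2 <= tmp:
        n2 += n + n + 1  # (n+1)^2 - n^2 == 2*n + 1
        n += 1
    # here n == floor(sqrt(8*x + 1)) + 1
    n -= 2               # floor(sqrt(8*x + 1)) - 1
    return n >> 1        # (floor(sqrt(8*x + 1)) - 1) / 2

# quick check against the triangular numbers
for q in range(1, 100):
    p = triroot_direct(q)
    assert p * (p + 1) // 2 <= q < (p + 1) * (p + 2) // 2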