Understanding why Floyd's tortoise and hare algorithm works when applied to an array of integers - algorithm

I was trying to solve this leetcode problem https://leetcode.com/problems/find-the-duplicate-number/ using my own implementation of the tortoise and hare algorithm which resulted in an infinite loop when given the following array of integers:
[3,1,3,4,2]
Only after tracing through my algorithm was I able to see that the slow and fast runners never take on the two duplicate values at the same time. Here is my algorithm in pseudocode:
initialize fast and slow runners to 0
while (true)
    move fast runner two indices forward
    move slow runner one index forward
    if arr[fast] == arr[slow] and fast != slow
        return arr[fast] // this is the duplicate
Now, I'm sure someone who is skilled in discrete mathematics would have been able to intuitively know that this approach would not have led to the correct solution without first having to trace through an example like I had to do.
What inferences or observations could I have made that would have led me to see that this algorithm was not going to work? I'd like to know how one could intuitively identify a flaw in this logic through a series of logical statements. In other words, what's the explanation for why the two runners will never find the duplicates in this example? I feel like it may have something to do with counting, but I do not have a very strong background in discrete mathematics.
And to clarify, I have looked at the correct implementation, so I do know what the correct way to solve it is. I just thought that this way would have worked too, similar to applying it to linked lists, where you'd move the fast runner two nodes up and the slow runner one node up. Thank you for your help.

Floyd's tortoise and hare algorithm works when you're detecting a cycle in a linked list. It relies on the fact that the two pointers move at different paces: once both are inside a cycle, the gap between them changes by one node per step, so it eventually becomes a multiple of the cycle length and the pointers land on the same node.
In this case, the algorithm does find a cycle, since both pointers converge to the index 0 after some iterations. However, you're not looking to detect a cycle here; you're trying to find a duplicate. That's why this gets stuck in an infinite loop: it is meant to detect a cycle (which it correctly does), but not to detect duplicates in its basic implementation.
To clarify, here's a sample linked list created on your sample array.
3 -> 1 -> 3 -> 4 -> 2
'--<----<----<----<-'
If you run Floyd's algorithm, you find that the cycle will get detected at index 0, since both pointers will converge there. It works by checking if fast and slow point to the same location and not if they have the same values of nodes (fast==slow isn't the same as fast.value==slow.value).
You are attempting to check duplicates by comparing the value on the nodes, and checking if the nodes don't point to the same location. That is actually the flaw, since Floyd's algorithm works to check if both pointers point to the same location in order to detect a cycle.
You can read this simple, informative proof to improve your intuition as to why the pointers will converge.
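To make the counting argument concrete, here is a minimal Python sketch of the index-stepping approach from the question. It assumes the runners wrap around at the end of the array (the pseudocode does not say what happens there), and `simulate_index_runners` is a hypothetical helper name, not part of any library.

```python
def simulate_index_runners(arr, max_steps=100):
    """Step slow by 1 index and fast by 2 indices, wrapping around (assumed).

    Returns the duplicate value if the question's check ever succeeds,
    otherwise None after max_steps iterations.
    """
    n = len(arr)
    slow = fast = 0
    for _ in range(max_steps):
        fast = (fast + 2) % n
        slow = (slow + 1) % n
        if arr[fast] == arr[slow] and fast != slow:
            return arr[fast]
    return None


arr = [3, 1, 3, 4, 2]
# The duplicates sit at indices 0 and 2. After i steps slow = i mod 5 and
# fast = 2i mod 5, so (slow, fast) only ever takes the values
# (1, 2), (2, 4), (3, 1), (4, 3), (0, 0) and never (0, 2) or (2, 0).
print(simulate_index_runners(arr))  # None
```

Because the positions depend only on the step count and not on the array contents, the pair of indices visited is fixed in advance; whether it ever covers the duplicate positions is a coincidence of where the duplicates happen to be.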

That's not a bad idea. Here is an implementation in Python:
class Solution:
    def findDuplicate(self, nums):
        slow, fast = 0, 0
        while True:
            slow = nums[nums[slow]]
            fast = nums[fast]
            if slow == fast:
                break
        fast = 0
        while True:
            slow, fast = nums[slow], nums[fast]
            if slow == fast:
                break
        return slow
We can also use Binary Search:
class Solution:
    def findDuplicate(self, nums):
        lo, hi = 0, len(nums) - 1
        mid = lo + (hi - lo) // 2
        while hi - lo > 1:
            count = 0
            for num in nums:
                if mid < num <= hi:
                    count += 1
            if count > hi - mid:
                lo = mid
            else:
                hi = mid
            mid = lo + (hi - lo) // 2
        return hi
In C++:
// Most of headers are already included;
// Can be removed;
#include <iostream>
#include <cstdint>
#include <vector>
// The following block might slightly improve the execution time;
// Can be removed;
static const auto __optimize__ = []() {
    std::ios::sync_with_stdio(false);
    std::cin.tie(nullptr);
    std::cout.tie(nullptr);
    return 0;
}();
struct Solution {
    static int findDuplicate(const std::vector<int>& nums) {
        int slow = 0;
        int fast = 0;
        // Phase 1: advance at different speeds until the runners meet inside the cycle.
        while (true) {
            slow = nums[nums[slow]];
            fast = nums[fast];
            if (slow == fast) {
                break;
            }
        }
        // Phase 2: restart one runner from index 0; they meet at the cycle entry.
        fast = 0;
        while (slow != fast) {
            slow = nums[slow];
            fast = nums[fast];
        }
        return slow;
    }
};

Related

Difference in performance between the following two implementations of binary search

I came across these two implementations of binary search in the book "Competitive Programmer's Handbook" https://cses.fi/book/book.pdf.
Method 1:
int a = 0, b = n-1;
while (a <= b) {
    int k = (a+b)/2;
    if (array[k] == x) {
        // x found at index k
    }
    if (array[k] > x)
        b = k-1;
    else
        a = k+1;
}
Method 2:
int k = 0;
for (int b = n/2; b >= 1; b /= 2) {
    while (k+b < n && array[k+b] <= x)
        k += b;
}
if (array[k] == x) {
    // x found at index k
}
I guess method 2 is not exactly binary search.
I understand that both method 1 and method 2 have O(log n) complexity. Also, the code for method 2 is simpler and therefore might result in fewer bugs.
My questions are:
Is there any improvement in performance when method-2 is used?
Does method-2 have any other advantage?
For such short code and so few differences, it is impossible to do any prediction. The time performance will depend on how the compiler optimizes, but also on the distribution of the keys in the array (in particular, the probability of a hit rather than a miss).
I disagree with all comments "against" the second method (even claimed buggy when it is perfectly correct). It is based on a principle that potentially makes it better: there's only one test in the body of the loop.
Having a comparison for equality (Method 1) gives the false feeling that the algorithm will terminate early when the key is found and make the search faster*. But this is not quite true: for half of the keys the full depth of the decision tree is necessary anyway, and the small saving is not enough to counterbalance the fact that there are two comparisons per step instead of one.
*In fact, you just spare one test on average!
Only benchmarking can tell you if one of the methods is faster with particular test cases. My bet is that the distributions of the running times overlap a lot. (Not counting that it is virtually impossible to benchmark such a fast algorithm in a way that is representative of its behavior in real context.)
Last comment: method 2 is a binary search, while method 1 is in fact ternary, since each step makes a three-way decision (equal, greater, smaller)!
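For readers who want to experiment with the two variants, here is a rough Python transcription of both methods. The function names and the choice to return -1 on a miss are assumptions made for this sketch; Method 1 here returns as soon as it hits the key, which the original leaves as a placeholder comment.

```python
def search_interval(array, x):
    """Method 1: shrink the interval [a, b]; two comparisons per step."""
    a, b = 0, len(array) - 1
    while a <= b:
        k = (a + b) // 2
        if array[k] == x:
            return k
        if array[k] > x:
            b = k - 1
        else:
            a = k + 1
    return -1


def search_jumps(array, x):
    """Method 2: jump forward by halving lengths; one comparison per step."""
    n = len(array)
    if n == 0:
        return -1
    k = 0
    b = n // 2
    while b >= 1:
        # Keep jumping while the target position holds a value <= x.
        while k + b < n and array[k + b] <= x:
            k += b
        b //= 2
    return k if array[k] == x else -1


data = [1, 3, 5, 7, 9, 11]
print(search_interval(data, 7), search_jumps(data, 7))  # 3 3
print(search_interval(data, 4), search_jumps(data, 4))  # -1 -1
```

Timing the two with something like `timeit` on your own key distribution is the only reliable way to compare them, as the answer says.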

DTW algorithm: simple implementation - Verification

I have tried to make a simple implementation of the DTW algorithm in C, without using any substantial optimization techniques. I am trying to use this implementation for some simple sketch recognition, which is to say finding the k closest neighbors of a given sketch from within a set. I have gotten some results that seem weird to me and I would like to know if this is because of my DTW implementation. I need someone to verify my algorithm.
As I said, I am trying to find the k closest neighbors, so the only 'optimization' I have implemented to make calculations faster is that if the minimum cost of a given line calculated is at any point greater than the maximum distance between the k sketches currently considered as the closest neighbors, I stop calculating and return +inf.
Here is the corresponding algorithm:
(returnValue totalCost) dtw(sketch1, sketch2, curMaxDist){
    distMatrix = 'empty matrix of size (sketch1.size) x (sketch2.size)'
    totalCostMatrix = 'empty matrix of size (sketch1.size) x (sketch2.size)'
    for(i = 0 to sketch1.size - 1){
        for(j = 0 to sketch2.size - 1){
            distMatrix[i][j] = euclidianDistance(sketch1.point[i], sketch2.point[j])
            totalCostMatrix[i][j] = +inf
        }
    }
    //I am forcing the first points of each sketch to correspond to one
    // and continue applying the algorithm from the next points.
    for(i = 1 to sketch1.size - 1){
        curMinDist = +inf
        for(j = 1 to sketch2.size - 1){
            totalCostMatrix[i][j] = min(totalCostMatrix[i-1][j-1],
                                        totalCostMatrix[i-1][j],
                                        totalCostMatrix[i][j-1]) + distMatrix[i][j]
            if(totalCostMatrix[i][j] < curMinDist)
                curMinDist = totalCostMatrix[i][j]
        }
        if(curMinDist > curMaxDist)
            return +inf
    }
    return totalCostMatrix[sketch1.size - 1][sketch2.size - 1]
}
I am sure there is nothing wrong with the implementation as far as the syntax, C language, etc. is concerned, since I have checked that and I always get the expected result. I was just wondering if there is something wrong with the reasoning behind the algorithm. I am asking because it is a really well known algorithm and a really simple implementation, so maybe it is easy for someone to spot an error there.
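For comparison, here is a minimal Python sketch of textbook DTW with the same per-row early-abandoning idea. It is not the poster's C code: the function name is hypothetical, points are assumed to be coordinate tuples, and two choices are assumptions worth checking against the pseudocode above. The accumulated cost at [0][0] is seeded with the distance between the first points (if every cell starts at +inf and nothing is seeded, every later cell stays +inf), and row 0 and column 0 are filled in as well.

```python
import math


def dtw_distance(sketch1, sketch2, cur_max_dist=math.inf):
    """DTW between two point sequences with per-row early abandoning."""
    n, m = len(sketch1), len(sketch2)
    cost = [[math.inf] * m for _ in range(n)]
    # Seed the accumulated cost: the first points are matched to each other.
    cost[0][0] = math.dist(sketch1[0], sketch2[0])
    for i in range(n):
        row_min = math.inf
        for j in range(m):
            if i == 0 and j == 0:
                row_min = cost[0][0]
                continue
            best_prev = min(
                cost[i - 1][j - 1] if i > 0 and j > 0 else math.inf,
                cost[i - 1][j] if i > 0 else math.inf,
                cost[i][j - 1] if j > 0 else math.inf,
            )
            cost[i][j] = best_prev + math.dist(sketch1[i], sketch2[j])
            row_min = min(row_min, cost[i][j])
        # Every warping path crosses this row and costs are non-negative,
        # so the final cost can never be below row_min.
        if row_min > cur_max_dist:
            return math.inf
    return cost[n - 1][m - 1]


a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 1), (1, 1), (2, 1)]
print(dtw_distance(a, b))  # 3.0
```

The early exit is sound for the reason given in the comment; it only prunes candidates that could not have entered the current k nearest neighbors.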

Sorting algorithm for list of integers

I have a list of about 200 integers whose values are between 1 and 5.
I want to get into learning about sorting algorithms and knowing where to apply each because at the moment I use bubble-sort for everything which I've been told is a terrible way to do things.
What would be the fastest sorting algorithm for this integer sorting?
EDIT: It turns out that because I know the numbers are 1 to 5 then I can use a bucket sort (?) algorithm which if I'm not mistaken - and I definitely could be - means that for each integer of value 1, I put it in the 1 group, value 2 I put it in the 2 group etc, then concatenate the groups at the end. This seems like a simple and efficient way to do it.
However, since this is (currently) a learning exercise for me, I am going to remove the 1 - 5 limitation and try to implement bubble-sort and merge-sort, then compare the two to see which is faster.
Thanks for your help!
... which I've been told is a terrible way to do things.
First off, don't accept as gospel anything you hear from random bods on the internet (even me).
Bubble sort is fine under certain conditions, such as when the data is already mostly sorted, or the item count is relatively small (such as 200) (a), or you have no sort functionality built into the language and you're on a tight deadline where lack of performance will annoy the customer but lack of functionality will get you fired :-)
This bias against bubble sort is similar to the "only one exit point from a function" and "no goto" rules. You should understand the reasoning behind them so that you know when the rules can be ignored safely.
Anyway, on to the question proper. An efficient way for your specific case is to just count the items then output them, something like:
dim count[1..5] = {0, 0, 0, 0, 0};
for each item in list:
    count[item] = count[item] + 1
for val in 1..5:
    for quant in 1..count[val]:
        output val
That's an O(n) time and O(1) space solution and you won't find a more efficient big-O for a generalised sort routine - it's only possible in this case because of the extra information you have about the data (limited to the values 1 through 5).
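The counting pseudocode above translates almost line for line into Python. This is only a sketch; the function name and the `lo`/`hi` parameters (defaulting to the 1 to 5 range from the question) are assumptions added for illustration.

```python
def counting_sort(items, lo=1, hi=5):
    """Sort integers known to lie in [lo, hi] by counting occurrences."""
    counts = [0] * (hi - lo + 1)
    for item in items:
        counts[item - lo] += 1
    result = []
    # Emit each value as many times as it was seen, in increasing order.
    for offset, quantity in enumerate(counts):
        result.extend([lo + offset] * quantity)
    return result


print(counting_sort([3, 1, 5, 2, 2, 4, 1]))  # [1, 1, 2, 2, 3, 4, 5]
```

The extra space is proportional to the size of the value range, which is why it counts as constant here but would not for arbitrary integers.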
If you wanted to examine all the different sort algorithms, the Wikipedia Sorting Algorithm page is a useful starting point, including the major algorithms and their properties.
(a) As an aside, the following code (using worst case data for bubble sort), when run under CygWin on a not-very-powerful IBM T60 (2GHz dual core) laptop, completes in, on average, 0.157 seconds (5 samples: 0.150, 0.125, 0.192, 0.199, 0.115).
I wouldn't use it for sorting a million items (everyone knows bubble sort scales poorly) but 200 should be fine in most cases:
#include <stdio.h>
#define COUNT 200
int main (void) {
    int i, swapped, tmp, item[COUNT];
    // Set up worst case (reverse order) data.
    for (i = 0; i < COUNT; i++)
        item[i] = COUNT - i;
    // Slightly optimised bubble sort.
    swapped = 1;
    while (swapped) {
        swapped = 0;
        for (i = 1; i < COUNT; i++) {
            if (item[i-1] > item[i]) {
                tmp = item[i-1];
                item[i-1] = item[i];
                item[i] = tmp;
                swapped = 1;
            }
        }
    }
    // for (i = 0; i < COUNT; i++)
    //     printf ("%d ", item[i]);
    // putchar ('\n');
    return 0;
}
You may not need sorting here, since you only have 5 possible values.
You could use 5 containers (or buckets) and as you scan your list of integers you place the values in the right bucket.
At the end, join the buckets together, in order.
Merge sort is O(n log n); I think it's way better than QuickSort.
You can find some C# code here.

Tail-recursive pow() algorithm with memoization?

I'm looking for an algorithm to compute pow() that's tail-recursive and uses memoization to speed up repeated calculations.
Performance isn't an issue; this is mostly an intellectual exercise - I spent a train ride coming up with all the different pow() implementations I could, but was unable to come up with one that I was happy with that had these two properties.
My best shot was the following:
def calc_tailrec_mem(base, exp, cache_line={}, acc=1, ctr=0):
    if exp == 0:
        return 1
    elif exp == 1:
        return acc * base
    elif exp in cache_line:
        val = acc * cache_line[exp]
        cache_line[exp + ctr] = val
        return val
    else:
        cache_line[ctr] = acc
        return calc_tailrec_mem(base, exp-1, cache_line, acc * base, ctr + 1)
It works, but it doesn't memoize the results of all calculations - only those with exponents 1..exp/2 and exp.
You'll get better performance if you use the successive squaring technique described in SICP section 1.2.4 Exponentiation. It doesn't use memoization, but the general approach is O(log n) instead of O(n), so you should still see an improvement.
I talk about the solution to the iterative process from exercise 1.16 here.
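As a rough illustration of the successive-squaring idea in tail-recursive form, here is a small Python sketch. The function name is hypothetical, it assumes a non-negative integer exponent, and it does no memoization. Note that CPython does not eliminate tail calls, but the recursion depth here is only O(log exp).

```python
def fast_pow(base, exp, acc=1):
    """Tail-recursive successive squaring.

    Invariant: acc * base**exp stays equal to the original base**exp.
    """
    if exp == 0:
        return acc
    if exp % 2 == 0:
        # Even exponent: square the base and halve the exponent.
        return fast_pow(base * base, exp // 2, acc)
    # Odd exponent: move one factor of base into the accumulator.
    return fast_pow(base, exp - 1, acc * base)


print(fast_pow(2, 10))  # 1024
```

Every recursive call is in tail position, so the same logic can be rewritten mechanically as a `while` loop over `(base, exp, acc)`.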
I don't think you're recording the correct thing in your cache; the mapping changes when you call it with different arguments.
I think you need to have a cache of (base,exp) -> pow(base,exp).
I understand what ctr is for, and why only half of what you expect is recorded.
Consider calc_tailrec_mem(2,4): First level, pow(2,1) is recorded as 2, the next level = calc_tailrec_mem(2,3,...), and pow(2,2) is recorded. The next level is calc_tailrec_mem(2,2,...), but that is already saved in the cache, so the recursion stops.
The function is very confusing because it's caching something completely different from what it's supposed to be calculating, due to the accumulator and ctr.
This is way too late, but anyone out there looking for the answer, here it is:
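#include <map>
#include <utility>
int powMem(int base, int exp) {
    // Initialized once and for all; keyed by (base, exp) so that
    // calls with different bases do not share results.
    static std::map<std::pair<int, int>, int> memo;
    // Base cases to stop the recursion.
    if (exp == 0) return 1;
    if (exp == 1) return base;
    // Check if the value is already calculated. If yes, just return it.
    auto it = memo.find({base, exp});
    if (it != memo.end())
        return it->second;
    // Else compute it from the half power and store it in memo for further use.
    int x = powMem(base, exp / 2);
    int result = x * x;
    if (exp % 2 == 1)
        result *= base;  // odd exponents need one extra factor of base
    memo[{base, exp}] = result;
    // Return the answer.
    return result;
}
This uses the memo array - a map, to be exact - to store the already calculated values.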

Recursion or Iteration?

Is there a performance hit if we use a loop instead of recursion or vice versa in algorithms where both can serve the same purpose? Eg: Check if the given string is a palindrome.
I have seen many programmers using recursion as a means to show off when a simple iteration algorithm can fit the bill.
Does the compiler play a vital role in deciding what to use?
Loops may achieve a performance gain for your program. Recursion may achieve a performance gain for your programmer. Choose which is more important in your situation!
It is possible that recursion will be more expensive, depending on if the recursive function is tail recursive (the last line is recursive call). Tail recursion should be recognized by the compiler and optimized to its iterative counterpart (while maintaining the concise, clear implementation you have in your code).
I would write the algorithm in the way that makes the most sense and is the clearest for the poor sucker (be it yourself or someone else) that has to maintain the code in a few months or years. If you run into performance issues, then profile your code, and then and only then look into optimizing by moving over to an iterative implementation. You may want to look into memoization and dynamic programming.
Comparing recursion to iteration is like comparing a phillips head screwdriver to a flat head screwdriver. For the most part you could remove any phillips head screw with a flat head, but it would just be easier if you used the screwdriver designed for that screw right?
Some algorithms just lend themselves to recursion because of the way they are designed (Fibonacci sequences, traversing a tree like structure, etc.). Recursion makes the algorithm more succinct and easier to understand (therefore shareable and reusable).
Also, some recursive algorithms use "Lazy Evaluation" which makes them more efficient than their iterative brothers. This means that they only do the expensive calculations at the time they are needed rather than each time the loop runs.
That should be enough to get you started. I'll dig up some articles and examples for you too.
Link 1: Haskell vs PHP (Recursion vs Iteration)
Here is an example where the programmer had to process a large data set using PHP. He shows how easy it would have been to deal with in Haskell using recursion, but since PHP had no easy way to accomplish the same method, he was forced to use iteration to get the result.
http://blog.webspecies.co.uk/2011-05-31/lazy-evaluation-with-php.html
Link 2: Mastering Recursion
Most of recursion's bad reputation comes from the high costs and inefficiency in imperative languages. The author of this article talks about how to optimize recursive algorithms to make them faster and more efficient. He also goes over how to convert a traditional loop into a recursive function and the benefits of using tail-end recursion. His closing words really summed up some of my key points I think:
"recursive programming gives the programmer a better way of organizing
code in a way that is both maintainable and logically consistent."
https://developer.ibm.com/articles/l-recurs/
Link 3: Is recursion ever faster than looping? (Answer)
Here is a link to an answer for a stackoverflow question that is similar to yours. The author points out that a lot of the benchmarks associated with either recursing or looping are very language specific. Imperative languages are typically faster using a loop and slower with recursion and vice-versa for functional languages. I guess the main point to take from this link is that it is very difficult to answer the question in a language agnostic / situation blind sense.
Is recursion ever faster than looping?
Recursion is more costly in memory, as each recursive call generally requires a memory address to be pushed to the stack - so that later the program could return to that point.
Still, there are many cases in which recursion is a lot more natural and readable than loops - like when working with trees. In these cases I would recommend sticking to recursion.
Typically, one would expect the performance penalty to lie in the other direction. Recursive calls can lead to the construction of extra stack frames; the penalty for this varies. Also, in some languages like Python (more correctly, in some implementations of some languages...), you can run into stack limits rather easily for tasks you might specify recursively, such as finding the maximum value in a tree data structure. In these cases, you really want to stick with loops.
Writing good recursive functions can reduce the performance penalty somewhat, assuming you have a compiler that optimizes tail recursions, etc. (Also double check to make sure that the function really is tail recursive---it's one of those things that many people make mistakes on.)
Apart from "edge" cases (high performance computing, very large recursion depth, etc.), it's preferable to adopt the approach that most clearly expresses your intent, is well-designed, and is maintainable. Optimize only after identifying a need.
Recursion is better than iteration for problems that can be broken down into multiple, smaller pieces.
For example, to make a recursive Fibonacci algorithm, you break down fib(n) into fib(n-1) and fib(n-2) and compute both parts. Iteration only allows you to repeat a single function over and over again.
However, Fibonacci is actually a broken example and I think iteration is actually more efficient. Notice that fib(n) = fib(n-1) + fib(n-2) and fib(n-1) = fib(n-2) + fib(n-3). fib(n-2) gets calculated twice!
A better example is a recursive algorithm for a tree. The problem of analyzing the parent node can be broken down into multiple smaller problems of analyzing each child node. Unlike the Fibonacci example, the smaller problems are independent of each other.
So yeah - recursion is better than iteration for problems that can be broken down into multiple, smaller, independent, similar problems.
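The Fibonacci point can be sketched in Python as follows. The three function names are hypothetical; the memoized version uses the standard library's `functools.lru_cache` to avoid recomputing overlapping subproblems.

```python
from functools import lru_cache


def fib_naive(n):
    """Direct translation of the recurrence; recomputes overlapping subproblems."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recurrence, but each fib(k) is computed only once."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


def fib_iter(n):
    """Iterative version: constant extra space, no call stack growth."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(fib_naive(10), fib_memo(10), fib_iter(10))  # 55 55 55
```

The naive version takes exponential time because the subproblems overlap, while the other two take a linear number of additions.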
Your performance deteriorates when using recursion because calling a method, in any language, implies a lot of preparation: the calling code posts a return address, call parameters, some other context information such as processor registers might be saved somewhere, and at return time the called method posts a return value which is then retrieved by the caller, and any context information that was previously saved will be restored. The performance difference between an iterative and a recursive approach lies in the time these operations take.
From an implementation point of view, you really start noticing the difference when the time it takes to handle the calling context is comparable to the time it takes for your method to execute. If your recursive method takes longer to execute than the calling context management part, go the recursive way, as the code is generally more readable and easy to understand and you won't notice the performance loss. Otherwise go iterative for efficiency reasons.
I believe tail recursion in java is not currently optimized. The details are sprinkled throughout this discussion on LtU and the associated links. It may be a feature in the upcoming version 7, but apparently it presents certain difficulties when combined with Stack Inspection since certain frames would be missing. Stack Inspection has been used to implement their fine-grained security model since Java 2.
http://lambda-the-ultimate.org/node/1333
There are many cases where it gives a much more elegant solution over the iterative method, the common example being traversal of a binary tree, so it isn't necessarily more difficult to maintain. In general, iterative versions are usually a bit faster (and during optimization may well replace a recursive version), but recursive versions are simpler to comprehend and implement correctly.
Recursion is very useful in some situations. For example, consider the code for finding the factorial:
int factorial ( int input )
{
    int x, fact = 1;
    for ( x = input; x > 1; x--)
        fact *= x;
    return fact;
}
Now consider it by using the recursive function
int factorial ( int input )
{
    if (input == 0)
    {
        return 1;
    }
    return input * factorial(input - 1);
}
By observing these two, we can see that the recursive version is easy to understand.
But if it is not used with care, it can be quite error-prone too.
Suppose we omit the if (input == 0) check: the code will then run for some time and usually end with a stack overflow.
In many cases recursion is faster because of better cache behavior. For example, here is an iterative version of merge sort using the traditional merge routine. It will run slower than the recursive implementation because the recursive version makes better use of the cache.
Iterative implementation
public static void sort(Comparable[] a)
{
    int N = a.length;
    aux = new Comparable[N];
    for (int sz = 1; sz < N; sz = sz+sz)
        for (int lo = 0; lo < N-sz; lo += sz+sz)
            merge(a, lo, lo+sz-1, Math.min(lo+sz+sz-1, N-1));
}
Recursive implementation
private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi)
{
    if (hi <= lo) return;
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);
    sort(a, aux, mid+1, hi);
    merge(a, aux, lo, mid, hi);
}
PS - this is what was told by Professor Kevin Wayne (Princeton University) on the course on algorithms presented on Coursera.
Using recursion, you're incurring the cost of a function call with each "iteration", whereas with a loop, the only thing you usually pay is an increment/decrement. So, if the code for the loop isn't much more complicated than the code for the recursive solution, loop will usually be superior to recursion.
Whether to use recursion or iteration depends on the business logic that you want to implement, though in most cases they can be used interchangeably. Most developers go for recursion because it is easier to understand.
It depends on the language. In Java you should use loops. Functional languages optimize recursion.
Recursion has the disadvantage that the call stack takes space proportional to the recursion depth, which is O(n) for a linear recursion.
An iterative approach can often run in O(1) extra space. This is the advantage of using iteration over recursion.
Then why do we use recursion?
See below.
Sometimes it is easier to write an algorithm using recursion, while it's slightly tougher to write the same algorithm using iteration. In this case, if you opt to follow the iteration approach, you would have to handle the stack yourself.
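As a small illustration of "handling the stack yourself", here is a Python sketch that sums the values in a tree both ways. The tree representation, a `(value, children)` tuple, and both function names are assumptions made for this example.

```python
def tree_sum_recursive(node):
    """Recursive version: the call stack tracks the pending subtrees."""
    value, children = node
    return value + sum(tree_sum_recursive(child) for child in children)


def tree_sum_iterative(root):
    """Iterative version: an explicit list plays the role of the call stack."""
    total = 0
    stack = [root]
    while stack:
        value, children = stack.pop()
        total += value
        stack.extend(children)
    return total


tree = (1, [(2, [(4, [])]), (3, [])])
print(tree_sum_recursive(tree), tree_sum_iterative(tree))  # 10 10
```

The iterative version still uses memory proportional to the number of pending nodes, but that memory lives on the heap, so a very deep tree does not hit the interpreter's recursion limit.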
If you're just iterating over a list, then sure, iterate away.
A couple of other answers have mentioned (depth-first) tree traversal. It really is such a great example, because it's a very common thing to do to a very common data structure. Recursion is extremely intuitive for this problem.
Check out the "find" methods here:
http://penguin.ewu.edu/cscd300/Topic/BSTintro/index.html
Recursion is simpler (and thus more fundamental) than any possible definition of an iteration. You can define a Turing-complete system with only a pair of combinators (yes, even recursion itself is a derivative notion in such a system). Lambda calculus is an equally powerful fundamental system, featuring recursive functions. But if you want to define an iteration properly, you'd need many more primitives to start with.
As for the code - no, recursive code is in fact much easier to understand and to maintain than a purely iterative one, since most data structures are recursive. Of course, in order to get it right one would need a language with a support for high order functions and closures, at least - to get all the standard combinators and iterators in a neat way. In C++, of course, complicated recursive solutions can look a bit ugly, unless you're a hardcore user of FC++ and alike.
I would think in (non tail) recursion there would be a performance hit for allocating a new stack etc every time the function is called (dependent on language of course).
It depends on the "recursion depth".
It depends on how much the function call overhead will influence the total execution time.
For example, calculating the classical factorial in a recursive way is very inefficient due to:
- risk of data overflowing
- risk of stack overflowing
- function call overhead occupying 80% of execution time
while a min-max algorithm for position analysis in the game of chess, which analyzes the subsequent N moves, can be implemented with recursion over the "analysis depth" (as I'm doing ^_^)
Recursion? Where do I start? Wiki will tell you "it's the process of repeating items in a self-similar way".
Back in the day when I was doing C and C++, recursion was a godsend, stuff like "tail recursion". You'll also find many sorting algorithms use recursion. Quick sort example: http://alienryderflex.com/quicksort/
Recursion is like any other algorithm: useful for a specific problem. Perhaps you won't find a use straight away or often, but there will be a problem where you'll be glad it's available.
In C++ if the recursive function is a templated one, then the compiler has more chance to optimize it, as all the type deduction and function instantiations will occur in compile time. Modern compilers can also inline the function if possible. So if one uses optimization flags like -O3 or -O2 in g++, then recursions may have the chance to be faster than iterations. In iterative codes, the compiler gets less chance to optimize it, as it is already in the more or less optimal state (if written well enough).
In my case, I was trying to implement matrix exponentiation by squaring using Armadillo matrix objects, in both recursive and iterative way. The algorithm can be found here... https://en.wikipedia.org/wiki/Exponentiation_by_squaring.
My functions were templated and I have calculated 1,000,000 12x12 matrices raised to the power 10. I got the following result:
iterative + optimisation flag -O3 -> 2.79.. sec
recursive + optimisation flag -O3 -> 1.32.. sec
iterative + No-optimisation flag -> 2.83.. sec
recursive + No-optimisation flag -> 4.15.. sec
These results have been obtained using gcc-4.8 with c++11 flag (-std=c++11) and Armadillo 6.1 with Intel mkl. Intel compiler also shows similar results.
Mike is correct. Tail recursion is not optimized out by the Java compiler or the JVM. You will always get a stack overflow with something like this:
int count(int i) {
    return i >= 100000000 ? i : count(i+1);
}
You have to keep in mind that with too deep a recursion you will run into a stack overflow, depending on the allowed stack size. To prevent this, make sure to provide a base case which ends your recursion.
Using just Chrome 45.0.2454.85 m, recursion seems to be a nice amount faster.
Here is the code:
(function recursionVsForLoop(global) {
    "use strict";
    // Perf test
    function perfTest() {}
    perfTest.prototype.do = function(ns, fn) {
        console.time(ns);
        fn();
        console.timeEnd(ns);
    };
    // Recursion method
    (function recur() {
        var count = 0;
        global.recurFn = function recurFn(fn, cycles) {
            fn();
            count = count + 1;
            if (count !== cycles) recurFn(fn, cycles);
        };
    })();
    // Looped method
    function loopFn(fn, cycles) {
        for (var i = 0; i < cycles; i++) {
            fn();
        }
    }
    // Tests
    var curTest = new perfTest(),
        testsToRun = 100;
    curTest.do('recursion', function() {
        recurFn(function() {
            console.log('a recur run.');
        }, testsToRun);
    });
    curTest.do('loop', function() {
        loopFn(function() {
            console.log('a loop run.');
        }, testsToRun);
    });
})(window);
RESULTS
// 100 runs using standard for loop
100x for loop run.
Time to complete: 7.683ms
// 100 runs using functional recursive approach w/ tail recursion
100x recursion run.
Time to complete: 4.841ms
When run at 300 cycles per test, recursion wins again by a bigger margin.
If the iterations are atomic and orders of magnitude more expensive than pushing a new stack frame and creating a new thread and you have multiple cores and your runtime environment can use all of them, then a recursive approach could yield a huge performance boost when combined with multithreading. If the average number of iterations is not predictable then it might be a good idea to use a thread pool which will control thread allocation and prevent your process from creating too many threads and hogging the system.
For example, in some languages, there are recursive multithreaded merge sort implementations.
But again, multithreading can be used with looping rather than recursion, so how well this combination will work depends on more factors including the OS and its thread allocation mechanism.
I found another difference between those approaches.
It looks simple and unimportant, but it has a very important role while you prepare for interviews and this subject arises, so look closely.
In short:
1) iterative post-order traversal is not easy - that makes DFT more complex
2) cycle checks are easier with recursion
Details:
In the recursive case, it is easy to create pre and post traversals:
Imagine a pretty standard question: "print all tasks that should be executed to execute the task 5, when tasks depend on other tasks"
Example:
//key-task, value-list of tasks the key task depends on
//"adjacency map":
Map<Integer, List<Integer>> tasksMap = new HashMap<>();
tasksMap.put(0, new ArrayList<>());
tasksMap.put(1, new ArrayList<>());
List<Integer> t2 = new ArrayList<>();
t2.add(0);
t2.add(1);
tasksMap.put(2, t2);
List<Integer> t3 = new ArrayList<>();
t3.add(2);
t3.add(10);
tasksMap.put(3, t3);
List<Integer> t4 = new ArrayList<>();
t4.add(3);
tasksMap.put(4, t4);
List<Integer> t5 = new ArrayList<>();
t5.add(3);
tasksMap.put(5, t5);
tasksMap.put(6, new ArrayList<>());
tasksMap.put(7, new ArrayList<>());
List<Integer> t8 = new ArrayList<>();
t8.add(5);
tasksMap.put(8, t8);
List<Integer> t9 = new ArrayList<>();
t9.add(4);
tasksMap.put(9, t9);
tasksMap.put(10, new ArrayList<>());
//task to analyze:
int task = 5;
List<Integer> res11 = getTasksInOrderDftReqPostOrder(tasksMap, task);
System.out.println(res11);//note, no reverse required
List<Integer> res12 = getTasksInOrderDftReqPreOrder(tasksMap, task);
Collections.reverse(res12);//note reverse!
System.out.println(res12);
private static List<Integer> getTasksInOrderDftReqPreOrder(Map<Integer, List<Integer>> tasksMap, int task) {
List<Integer> result = new ArrayList<>();
Set<Integer> visited = new HashSet<>();
reqPreOrder(tasksMap,task,result, visited);
return result;
}
private static void reqPreOrder(Map<Integer, List<Integer>> tasksMap, int task, List<Integer> result, Set<Integer> visited) {
if(!visited.contains(task)) {
visited.add(task);
result.add(task);//pre order!
List<Integer> children = tasksMap.get(task);
if (children != null && children.size() > 0) {
for (Integer child : children) {
reqPreOrder(tasksMap,child,result, visited);
}
}
}
}
private static List<Integer> getTasksInOrderDftReqPostOrder(Map<Integer, List<Integer>> tasksMap, int task) {
List<Integer> result = new ArrayList<>();
Set<Integer> visited = new HashSet<>();
reqPostOrder(tasksMap,task,result, visited);
return result;
}
private static void reqPostOrder(Map<Integer, List<Integer>> tasksMap, int task, List<Integer> result, Set<Integer> visited) {
if(!visited.contains(task)) {
visited.add(task);
List<Integer> children = tasksMap.get(task);
if (children != null && children.size() > 0) {
for (Integer child : children) {
reqPostOrder(tasksMap,child,result, visited);
}
}
result.add(task);//post order!
}
}
Note that the recursive post-order traversal does not require a subsequent reversal of the result. Children are printed first and the task in question is printed last. Everything is fine. You can also do a recursive pre-order traversal (also shown above), and that one will require a reversal of the result list.
Not that simple with the iterative approach! In the iterative (one stack) approach you can only do a pre-order traversal, so you are obliged to reverse the result list at the end:
List<Integer> res1 = getTasksInOrderDftStack(tasksMap, task);
Collections.reverse(res1);//note reverse!
System.out.println(res1);
private static List<Integer> getTasksInOrderDftStack(Map<Integer, List<Integer>> tasksMap, int task) {
List<Integer> result = new ArrayList<>();
Set<Integer> visited = new HashSet<>();
Stack<Integer> st = new Stack<>();
st.add(task);
visited.add(task);
while(!st.isEmpty()){
Integer node = st.pop();
List<Integer> children = tasksMap.get(node);
result.add(node);
if(children!=null && children.size() > 0){
for(Integer child:children){
if(!visited.contains(child)){
st.add(child);
visited.add(child);
}
}
}
//If you put it here - it does not matter - it is anyway a pre-order
//result.add(node);
}
return result;
}
Looks simple, no?
But it is a trap in some interviews.
It means the following: with the recursive approach, you can implement Depth First Traversal and then select which order you need, pre or post, simply by changing the location of the "print" (in our case, of the "adding to the result list"). With the iterative (one stack) approach you can easily do only a pre-order traversal, so whenever children need to be printed first (pretty much every situation where you need to start printing from the bottom nodes and go upwards) you are in trouble. If you run into that trouble you can reverse the result afterwards, but it is an addition to your algorithm, and if an interviewer is looking at his watch it may be a problem for you. There are ways to do an iterative post-order traversal, but they are not simple. Example: https://www.geeksforgeeks.org/iterative-postorder-traversal-using-stack/
Thus, the bottom line: I would use recursion during interviews, since it is simpler to manage and to explain. You have an easy way to go from pre-order to post-order traversal in any urgent case. With iteration you are not that flexible.
I would use recursion and then say: "Ok, but iteration can give me more direct control over the memory used; I can easily measure the stack size and prevent a dangerous overflow."
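To make the "not simple" part concrete, here is a minimal Python sketch of one way to get a true post-order with a single explicit stack. It is not the method from the linked article. The function name `post_order_iterative` and the trimmed-down `tasks` dictionary are made up for illustration; the idea is to keep each node's child iterator on the stack and emit the node only once that iterator is exhausted.

```python
def post_order_iterative(tasks, start):
    """Post-order DFS with one explicit stack (hypothetical helper).

    Each stack entry pairs a node with an iterator over its children, so we
    can resume where we left off after returning from a child.
    """
    result = []
    visited = {start}
    stack = [(start, iter(tasks.get(start, [])))]
    while stack:
        node, children = stack[-1]
        for child in children:
            if child not in visited:
                visited.add(child)
                stack.append((child, iter(tasks.get(child, []))))
                break  # descend into this child first
        else:
            # No unvisited children left: the node is finished (post-order).
            stack.pop()
            result.append(node)
    return result


if __name__ == "__main__":
    # Same dependencies as the tasks reachable from task 5 in the Java example.
    tasks = {0: [], 1: [], 2: [0, 1], 3: [2, 10], 5: [3], 10: []}
    print(post_order_iterative(tasks, 5))  # dependencies come before dependents
```

The extra bookkeeping (saved iterators, the for/else) is exactly what the recursive version gets for free from the call stack.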
Another plus of recursion - it is simpler to avoid / notice cycles in a graph.
Example (pseudocode):
dft(n){
    mark(n)
    for(child: n.children){
        if(marked(child))
            explode - cycle found!!!
        dft(child)
    }
    unmark(n)
}
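A runnable Python sketch of the same idea, assuming the graph is given as a dictionary from node to list of children. The name `has_cycle` is made up for illustration, and instead of "exploding" it returns True. Like the pseudocode, it only tracks the nodes on the current recursion path, so it may revisit shared subgraphs.

```python
def has_cycle(graph, start):
    """Return True if a cycle is reachable from start (illustrative sketch)."""
    on_path = set()  # nodes on the current recursion path, i.e. "marked"

    def dft(node):
        on_path.add(node)              # mark(n)
        for child in graph.get(node, []):
            if child in on_path:       # child is one of our own ancestors
                return True
            if dft(child):
                return True
        on_path.remove(node)           # unmark(n)
        return False

    return dft(start)


if __name__ == "__main__":
    print(has_cycle({1: [2, 3], 2: [3], 3: []}, 1))   # acyclic graph
    print(has_cycle({1: [2], 2: [3], 3: [1]}, 1))     # 1 -> 2 -> 3 -> 1
```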
It may be fun to write it as recursion, or as a practice.
However, if the code is to be used in production, you need to consider the possibility of stack overflow.
Tail recursion optimization can eliminate stack overflow, but do you want to go through the trouble of making your function tail-recursive? You also need to know that you can count on your environment actually performing the optimization.
Every time the algorithm recurses, how much is the data size or n reduced by?
If you are reducing the size of the data or n by half every time you recurse, then in general you don't need to worry about stack overflow. Say the program overflows the stack at 4,000 or 10,000 levels deep; then your data size needs to be roughly 2^4000 before that happens. To put that into perspective: if a very large storage device can hold 2^61 bytes, and you have 2^61 of such devices, you are only dealing with a data size of 2^122. The number of atoms in the observable universe is commonly estimated at around 10^80, which is roughly 2^266. If you need to deal with all of those atoms and their states for every millisecond since the birth of the universe, estimated to be 14 billion years ago (roughly 2^69 milliseconds), that is still only about 2^335. So if your program can handle 2^4000 units of data or n, you can handle all the data in the universe and the program will not stack overflow. If you don't need to deal with numbers as big as 2^4000 (a 4000-bit integer), then in general you don't need to worry about stack overflow.
However, if you reduce the size of the data or n by a constant amount every time you recurse, then you can run into stack overflow when n is merely 20000. That is, the program runs well when n is 1000, and you think the program is good, and then some time in the future it overflows the stack when n is 5000 or 20000.
So if you have a possibility of stack overflow, try to make it an iterative solution.
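The difference between the two cases can be seen with a small Python experiment. This is only a sketch: the function names are made up, and the exact point at which the second function fails depends on the interpreter's recursion limit (1000 by default in CPython).

```python
import sys


def depth_when_halving(n, depth=1):
    """Recursion depth reached when n is halved on every call."""
    if n <= 1:
        return depth
    return depth_when_halving(n // 2, depth + 1)


def depth_when_decrementing(n, depth=1):
    """Recursion depth reached when n shrinks by a constant (1) per call."""
    if n <= 1:
        return depth
    return depth_when_decrementing(n - 1, depth + 1)


def overflows(func, n):
    """Return True if calling func(n) exhausts the recursion limit."""
    try:
        func(n)
        return False
    except RecursionError:
        return True


if __name__ == "__main__":
    print("recursion limit:", sys.getrecursionlimit())
    # Halving: a billion items need only about 30 nested calls.
    print(depth_when_halving(10**9))
    # Constant decrement: a million items need about a million nested calls.
    print(overflows(depth_when_decrementing, 10**6))
```

With halving, the depth grows with the number of bits in n; with a constant decrement, it grows with n itself.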
As far as I know, Perl does not optimize tail-recursive calls, but you can fake it.
sub f{
  my($l,$r) = @_;
  if( $l >= $r ){
    return $l;
  } else {
    # return f( $l+1, $r );
    @_ = ( $l+1, $r );
    goto &f;
  }
}
When first called it will allocate space on the stack. Then it will change its arguments and restart the subroutine, without adding anything more to the stack. It will therefore pretend that it never called itself, changing it into an iterative process.
Note that there is no "my @_;" or "local @_;"; if you added one, it would no longer work.
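For comparison, CPython does not eliminate tail calls either, and it has no equivalent of goto &f, so there the same rewrite is done by hand. A minimal sketch (the names `f_recursive` and `f_loop` are made up for illustration): reassigning the arguments and jumping back to the top becomes a plain loop.

```python
def f_recursive(l, r):
    """Direct translation of the tail-recursive version."""
    if l >= r:
        return l
    return f_recursive(l + 1, r)   # tail call: each call adds a stack frame


def f_loop(l, r):
    """Same logic with the tail call replaced by rebinding the arguments."""
    while l < r:
        l, r = l + 1, r            # plays the role of "@_ = (...); goto &f;"
    return l


if __name__ == "__main__":
    print(f_recursive(0, 50))
    print(f_loop(0, 100_000))      # far deeper than the recursive form could go by default
```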
"Is there a performance hit if we use a loop instead of
recursion or vice versa in algorithms where both can serve the same purpose?"
Usually yes. If you are writing in an imperative language, iteration will generally run faster than recursion. The performance hit is minimized in problems where the iterative solution has to manipulate a stack explicitly (pushing and popping items) because of the recursive nature of the problem. There are many cases where the recursive implementation is much easier to read because the code is much shorter,
so you do want to consider maintainability, especially when the problem is recursive by nature. So take for example:
The recursive implementation of Tower of Hanoi:
def TowerOfHanoi(n , source, destination, auxiliary):
    if n==1:
        print ("Move disk 1 from source",source,"to destination",destination)
        return
    TowerOfHanoi(n-1, source, auxiliary, destination)
    print ("Move disk",n,"from source",source,"to destination",destination)
    TowerOfHanoi(n-1, auxiliary, destination, source)
Fairly short and pretty easy to read. Compare this with its Counterpart iterative TowerOfHanoi:
# Python3 program for iterative Tower of Hanoi
import sys
# A structure to represent a stack
class Stack:
    # Constructor to set up an empty
    # stack of the given capacity
    def __init__(self, capacity):
        self.capacity = capacity
        self.top = -1
        self.array = [0]*capacity
# function to create a stack of given capacity.
def createStack(capacity):
    stack = Stack(capacity)
    return stack
# Stack is full when top is equal to the last index
def isFull(stack):
    return (stack.top == (stack.capacity - 1))
# Stack is empty when top is equal to -1
def isEmpty(stack):
    return (stack.top == -1)
# Function to add an item to stack.
# It increases top by 1
def push(stack, item):
    if(isFull(stack)):
        return
    stack.top+=1
    stack.array[stack.top] = item
# Function to remove an item from stack.
# It decreases top by 1
def Pop(stack):
    if(isEmpty(stack)):
        return -sys.maxsize
    Top = stack.top
    stack.top-=1
    return stack.array[Top]
# Function to implement legal
# movement between two poles
def moveDisksBetweenTwoPoles(src, dest, s, d):
    pole1TopDisk = Pop(src)
    pole2TopDisk = Pop(dest)
    # When pole 1 is empty
    if (pole1TopDisk == -sys.maxsize):
        push(src, pole2TopDisk)
        moveDisk(d, s, pole2TopDisk)
    # When pole 2 is empty
    elif (pole2TopDisk == -sys.maxsize):
        push(dest, pole1TopDisk)
        moveDisk(s, d, pole1TopDisk)
    # When top disk of pole1 > top disk of pole2
    elif (pole1TopDisk > pole2TopDisk):
        push(src, pole1TopDisk)
        push(src, pole2TopDisk)
        moveDisk(d, s, pole2TopDisk)
    # When top disk of pole1 < top disk of pole2
    else:
        push(dest, pole2TopDisk)
        push(dest, pole1TopDisk)
        moveDisk(s, d, pole1TopDisk)
# Function to show the movement of disks
def moveDisk(fromPeg, toPeg, disk):
    print("Move the disk", disk, "from '", fromPeg, "' to '", toPeg, "'")
# Function to implement TOH puzzle
def tohIterative(num_of_disks, src, aux, dest):
    s, d, a = 'S', 'D', 'A'
    # If number of disks is even, then interchange
    # destination pole and auxiliary pole
    if (num_of_disks % 2 == 0):
        temp = d
        d = a
        a = temp
    total_num_of_moves = int(pow(2, num_of_disks) - 1)
    # Larger disks will be pushed first
    for i in range(num_of_disks, 0, -1):
        push(src, i)
    for i in range(1, total_num_of_moves + 1):
        if (i % 3 == 1):
            moveDisksBetweenTwoPoles(src, dest, s, d)
        elif (i % 3 == 2):
            moveDisksBetweenTwoPoles(src, aux, s, a)
        elif (i % 3 == 0):
            moveDisksBetweenTwoPoles(aux, dest, a, d)
# Input: number of disks
num_of_disks = 3
# Create three stacks of size 'num_of_disks'
# to hold the disks
src = createStack(num_of_disks)
dest = createStack(num_of_disks)
aux = createStack(num_of_disks)
tohIterative(num_of_disks, src, aux, dest)
Now the first one is way easier to read because, surprise surprise, shorter code is usually easier to understand than code that is 10 times longer. Sometimes you want to ask yourself: is the extra performance gain really worth the hours spent debugging the code? Is the iterative TowerOfHanoi faster than the recursive TowerOfHanoi? Probably, but not by a big margin. Would I like to program recursive problems like TowerOfHanoi using iteration? Hell no. Next we have another recursive function, the Ackermann function:
Using recursion:
def ackermann(m, n):
    if m == 0:
        # BASE CASE
        return n + 1
    elif m > 0 and n == 0:
        # RECURSIVE CASE
        return ackermann(m - 1, 1)
    elif m > 0 and n > 0:
        # RECURSIVE CASE
        return ackermann(m - 1, ackermann(m, n - 1))
Using Iteration:
callStack = [{'m': 2, 'n': 3, 'indentation': 0, 'instrPtr': 'start'}]
returnValue = None
while len(callStack) != 0:
    m = callStack[-1]['m']
    n = callStack[-1]['n']
    indentation = callStack[-1]['indentation']
    instrPtr = callStack[-1]['instrPtr']
    if instrPtr == 'start':
        print('%sackermann(%s, %s)' % (' ' * indentation, m, n))
        if m == 0:
            # BASE CASE
            returnValue = n + 1
            callStack.pop()
            continue
        elif m > 0 and n == 0:
            # RECURSIVE CASE
            callStack[-1]['instrPtr'] = 'after first recursive case'
            callStack.append({'m': m - 1, 'n': 1, 'indentation': indentation + 1, 'instrPtr': 'start'})
            continue
        elif m > 0 and n > 0:
            # RECURSIVE CASE
            callStack[-1]['instrPtr'] = 'after second recursive case, inner call'
            callStack.append({'m': m, 'n': n - 1, 'indentation': indentation + 1, 'instrPtr': 'start'})
            continue
    elif instrPtr == 'after first recursive case':
        returnValue = returnValue
        callStack.pop()
        continue
    elif instrPtr == 'after second recursive case, inner call':
        callStack[-1]['innerCallResult'] = returnValue
        callStack[-1]['instrPtr'] = 'after second recursive case, outer call'
        callStack.append({'m': m - 1, 'n': returnValue, 'indentation': indentation + 1, 'instrPtr': 'start'})
        continue
    elif instrPtr == 'after second recursive case, outer call':
        returnValue = returnValue
        callStack.pop()
        continue
print(returnValue)
And once again I will argue that the recursive implementation is much easier to understand. So my conclusion is use recursion if the problem by nature is recursive and requires manipulating items in a stack.
I'm going to answer your question by designing a Haskell data structure by "induction", which is a sort of "dual" to recursion. And then I will show how this duality leads to nice things.
We introduce a type for a simple tree:
data Tree a = Branch (Tree a) (Tree a)
            | Leaf a
            deriving (Eq)
We can read this definition as saying "A tree is a Branch (which contains two trees) or is a leaf (which contains a data value)". So the leaf is a sort of minimal case. If a tree isn't a leaf, then it must be a compound tree containing two trees. These are the only cases.
Let's make a tree:
example :: Tree Int
example = Branch (Leaf 1)
                 (Branch (Leaf 2)
                         (Leaf 3))
Now, let's suppose we want to add 1 to each value in the tree. We can do this by calling:
addOne :: Tree Int -> Tree Int
addOne (Branch a b) = Branch (addOne a) (addOne b)
addOne (Leaf a) = Leaf (a + 1)
First, notice that this is in fact a recursive definition. It takes the data constructors Branch and Leaf as cases, and since Leaf is minimal and these are the only possible cases, we are sure that the function will terminate.
What would it take to write addOne in an iterative style? What will looping into an arbitrary number of branches look like?
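As a rough answer to that question, here is a sketch of what an iterative addOne could look like. It is written in Python rather than Haskell because the point is the explicit loop and stack; the classes `Leaf` and `Branch` and the function `add_one_iterative` are stand-ins invented for this illustration, not part of the Haskell code above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: int


@dataclass
class Branch:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Branch]


def add_one_iterative(tree):
    """Rebuild the tree with every leaf incremented, without recursion."""
    work = [("visit", tree)]   # pending instructions, used as a stack
    built = []                 # finished subtrees waiting to be combined
    while work:
        action, node = work.pop()
        if action == "visit":
            if isinstance(node, Leaf):
                built.append(Leaf(node.value + 1))
            else:
                # Combine the two results once both children are done.
                work.append(("build", None))
                work.append(("visit", node.right))
                work.append(("visit", node.left))
        else:  # "build": the two most recent results are left and right
            right = built.pop()
            left = built.pop()
            built.append(Branch(left, right))
    return built.pop()


if __name__ == "__main__":
    example = Branch(Leaf(1), Branch(Leaf(2), Leaf(3)))
    print(add_one_iterative(example))
```

Two stacks and a small instruction set are needed to reproduce what the three-line recursive definition expresses directly.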
Also, this kind of recursion can often be factored out, in terms of a "functor". We can make Trees into Functors by defining:
instance Functor Tree where
  fmap f (Leaf a) = Leaf (f a)
  fmap f (Branch a b) = Branch (fmap f a) (fmap f b)
and defining:
addOne' = fmap (+1)
We can factor out other recursion schemes, such as the catamorphism (or fold) for an algebraic data type. Using a catamorphism, we can write:
addOne'' = cata go where
  go (Leaf a) = Leaf (a + 1)
  go (Branch a b) = Branch a b

Resources