Do "algorithms" exist in Functional Programming? [closed] - algorithm

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
Is it meaningless to be asked to document "algorithms" of your software (say, in a design specification) if it is implemented in a functional paradigm? Whenever I think of algorithms in technical documents I imagine a while loop with a bunch of sequential steps.
Looking at the informal dictionary meaning of an algorithm:
In mathematics and computer science, an algorithm is a step-by-step procedure for calculations.
The phrase "step-by-step" appears to be against the paradigm of functional programming (as I understand it), because functional programs, in contrast to imperative programs, have no awareness of time in their hypothetical machine. Is this argument correct? Or does lazy evaluation enforce an implicit time component that makes it "step by step"?
EDIT - so many good answers, it's unfair for me to choose a best response :( Thanks for all the perspectives, they all make great observations.

Yes, algorithms still exist in functional languages, although they don't always look the same as imperative ones.
Instead of using an implicit notion of "time" based on state to model steps, functional languages do it with composed data transformations. As a really nice example, you could think of heap sort in two parts: a transformation from a list into a heap and then from a heap into a list.
You can model step-by-step logic quite naturally with recursion, or, better yet, using existing higher-order functions that capture the various computations you can do. Composing these existing pieces is probably what I'd really call the "functional style": you might express your algorithm as an unfold followed by a map followed by a fold.
Laziness makes this even more interesting by blurring the lines between "data structure" and "algorithm". A lazy data structure, like a list, never has to completely exist in memory. This means that you can compose functions that build up large intermediate data structures without actually needing to use all that space or sacrificing asymptotic performance. As a trivial example, consider this definition of factorial (yes, it's a cliche, but I can't come up with anything better :/):
factorial n = product [1..n]
This has two composed parts: first, we generate a list from 1 to n and then we fold it by multiplying (product). But, thanks to laziness, the list never has to exist in memory completely! We evaluate as much of the generating function as we need at each step of product, and the garbage collector reclaims old cells as we're done with them. So even though this looks like it'll need O(n) memory, it actually gets away with O(1). (Well, assuming numbers all take O(1) memory.)
In this case, the "structure" of the algorithm, the sequence of steps, is provided by the list structure. The list here is closer to a for-loop than an actual list!
So in functional programming, we can create an algorithm as a sequence of steps in a few different ways: by direct recursion, by composing transformations (maybe based on common higher-order functions) or by creating and consuming intermediate data structures lazily.

I think you might be misunderstanding the functional programming paradigm.
Whether you use a functional language (Lisp, ML, Haskell) or an imperative/procedural one (C/Java/Python), you are specifying the operations and their order (sometimes the order might not be specified, but this is a side issue).
The functional paradigm sets certain limits on what you can do (e.g., no side effects), which makes it easier to reason about the code (and, incidentally, easier to write a "Sufficiently Smart Compiler").
Consider, e.g., a functional implementation of factorial:
(defun ! (n)
(if (zerop n)
1
(* n (! (1- n)))))
One can easily see the order of execution: 1 * 2 * 3 * .... * n and the fact that there are
n-1 multiplications and subtractions for argument n.
The most important part of the Computer Science is to remember that the language is just the means of talking to computers. CS is about computers no more than Astronomy is about telescopes, and algorithms are to be executed on an abstract (Turing) machine, emulated by the actual box in front of us.

No, I think if you solved a problem functionally and you solved it imperatively, what you have come up with are two separate algorithms. Each is an algorithm. One is a functional algorithm and one is an imperative algorithm. There's many books about algorithms in functional programming languages.
It seems like you are getting caught up in technicalities/semantics here. If you are asked to document an algorithm to solve a problem, whoever asked you wants to know how you are solving the problem. Even if its functional, there will be series of steps to reach the solution (even with all of the lazy evaluation). If you can write the code to reach the solution, then you can write the code in pseudocode, which means you can write the code in terms of an algorithm as far as I'm concerned.
And, since it seems like you are getting very hung up on definitions here, I'll posit a question your way that proves my point. Programming languages, whether functional or imperative, ultimately are run on a machine. Right? Your computer has to be told a step-by-step procedure of low level instructions to run. If this statement holds true, then every high-level computer program can be described in terms of their low level instructions. Therefore, every program, whether functional or imperative, can be described by an algorithm. And if you can't seem to find a way to describe the high-level algorithm, then output the bytecode/assembly and explain your algorithm in terms of these instructions

Consider this functional Scheme example:
(define (make-list num)
(let loop ((x num) (acc '()))
(if (zero? x)
acc
(loop (- x 1) (cons x acc)))))
(make-list 5) ; dumb compilers might do this
(display (make-list 10)) ; force making a list (because we display it)
With your logic make-list wouldn't be considered an algorithm since it doesn't do it's calculation in steps, but is that really true?
Scheme is eager and follows computation in order. Even with lazy languages everything becomes calculations in stages until you have a value. The differences is that lazy languages does calculations in the order of dependencies and not in the order of your instructions.
The underlying machine of a functional language is a register machine so it's hard avoid your beautiful functional program to actually become assembly instructions that mutate registers and memory. Thus a functional language (or a lazy language) is just an abstraction to ease writing code with less bugs.

Related

Can every recursion function be called DFS(Depth-First Search)? [duplicate]

A reddit thread brought up an apparently interesting question:
Tail recursive functions can trivially be converted into iterative functions. Other ones, can be transformed by using an explicit stack. Can every recursion be transformed into iteration?
The (counter?)example in the post is the pair:
(define (num-ways x y)
(case ((= x 0) 1)
((= y 0) 1)
(num-ways2 x y) ))
(define (num-ways2 x y)
(+ (num-ways (- x 1) y)
(num-ways x (- y 1))
Can you always turn a recursive function into an iterative one? Yes, absolutely, and the Church-Turing thesis proves it if memory serves. In lay terms, it states that what is computable by recursive functions is computable by an iterative model (such as the Turing machine) and vice versa. The thesis does not tell you precisely how to do the conversion, but it does say that it's definitely possible.
In many cases, converting a recursive function is easy. Knuth offers several techniques in "The Art of Computer Programming". And often, a thing computed recursively can be computed by a completely different approach in less time and space. The classic example of this is Fibonacci numbers or sequences thereof. You've surely met this problem in your degree plan.
On the flip side of this coin, we can certainly imagine a programming system so advanced as to treat a recursive definition of a formula as an invitation to memoize prior results, thus offering the speed benefit without the hassle of telling the computer exactly which steps to follow in the computation of a formula with a recursive definition. Dijkstra almost certainly did imagine such a system. He spent a long time trying to separate the implementation from the semantics of a programming language. Then again, his non-deterministic and multiprocessing programming languages are in a league above the practicing professional programmer.
In the final analysis, many functions are just plain easier to understand, read, and write in recursive form. Unless there's a compelling reason, you probably shouldn't (manually) convert these functions to an explicitly iterative algorithm. Your computer will handle that job correctly.
I can see one compelling reason. Suppose you've a prototype system in a super-high level language like [donning asbestos underwear] Scheme, Lisp, Haskell, OCaml, Perl, or Pascal. Suppose conditions are such that you need an implementation in C or Java. (Perhaps it's politics.) Then you could certainly have some functions written recursively but which, translated literally, would explode your runtime system. For example, infinite tail recursion is possible in Scheme, but the same idiom causes a problem for existing C environments. Another example is the use of lexically nested functions and static scope, which Pascal supports but C doesn't.
In these circumstances, you might try to overcome political resistance to the original language. You might find yourself reimplementing Lisp badly, as in Greenspun's (tongue-in-cheek) tenth law. Or you might just find a completely different approach to solution. But in any event, there is surely a way.
Is it always possible to write a non-recursive form for every recursive function?
Yes. A simple formal proof is to show that both µ recursion and a non-recursive calculus such as GOTO are both Turing complete. Since all Turing complete calculi are strictly equivalent in their expressive power, all recursive functions can be implemented by the non-recursive Turing-complete calculus.
Unfortunately, I’m unable to find a good, formal definition of GOTO online so here’s one:
A GOTO program is a sequence of commands P executed on a register machine such that P is one of the following:
HALT, which halts execution
r = r + 1 where r is any register
r = r – 1 where r is any register
GOTO x where x is a label
IF r ≠ 0 GOTO x where r is any register and x is a label
A label, followed by any of the above commands.
However, the conversions between recursive and non-recursive functions isn’t always trivial (except by mindless manual re-implementation of the call stack).
For further information see this answer.
Recursion is implemented as stacks or similar constructs in the actual interpreters or compilers. So you certainly can convert a recursive function to an iterative counterpart because that's how it's always done (if automatically). You'll just be duplicating the compiler's work in an ad-hoc and probably in a very ugly and inefficient manner.
Basically yes, in essence what you end up having to do is replace method calls (which implicitly push state onto the stack) into explicit stack pushes to remember where the 'previous call' had gotten up to, and then execute the 'called method' instead.
I'd imagine that the combination of a loop, a stack and a state-machine could be used for all scenarios by basically simulating the method calls. Whether or not this is going to be 'better' (either faster, or more efficient in some sense) is not really possible to say in general.
Recursive function execution flow can be represented as a tree.
The same logic can be done by a loop, which uses a data-structure to traverse that tree.
Depth-first traversal can be done using a stack, breadth-first traversal can be done using a queue.
So, the answer is: yes. Why: https://stackoverflow.com/a/531721/2128327.
Can any recursion be done in a single loop? Yes, because
a Turing machine does everything it does by executing a single loop:
fetch an instruction,
evaluate it,
goto 1.
Yes, using explicitly a stack (but recursion is far more pleasant to read, IMHO).
Yes, it's always possible to write a non-recursive version. The trivial solution is to use a stack data structure and simulate the recursive execution.
In principle it is always possible to remove recursion and replace it with iteration in a language that has infinite state both for data structures and for the call stack. This is a basic consequence of the Church-Turing thesis.
Given an actual programming language, the answer is not as obvious. The problem is that it is quite possible to have a language where the amount of memory that can be allocated in the program is limited but where the amount of call stack that can be used is unbounded (32-bit C where the address of stack variables is not accessible). In this case, recursion is more powerful simply because it has more memory it can use; there is not enough explicitly allocatable memory to emulate the call stack. For a detailed discussion on this, see this discussion.
All computable functions can be computed by Turing Machines and hence the recursive systems and Turing machines (iterative systems) are equivalent.
Sometimes replacing recursion is much easier than that. Recursion used to be the fashionable thing taught in CS in the 1990's, and so a lot of average developers from that time figured if you solved something with recursion, it was a better solution. So they would use recursion instead of looping backwards to reverse order, or silly things like that. So sometimes removing recursion is a simple "duh, that was obvious" type of exercise.
This is less of a problem now, as the fashion has shifted towards other technologies.
Recursion is nothing just calling the same function on the stack and once function dies out it is removed from the stack. So one can always use an explicit stack to manage this calling of the same operation using iteration.
So, yes all-recursive code can be converted to iteration.
Removing recursion is a complex problem and is feasible under well defined circumstances.
The below cases are among the easy:
tail recursion
direct linear recursion
Appart from the explicit stack, another pattern for converting recursion into iteration is with the use of a trampoline.
Here, the functions either return the final result, or a closure of the function call that it would otherwise have performed. Then, the initiating (trampolining) function keep invoking the closures returned until the final result is reached.
This approach works for mutually recursive functions, but I'm afraid it only works for tail-calls.
http://en.wikipedia.org/wiki/Trampoline_(computers)
I'd say yes - a function call is nothing but a goto and a stack operation (roughly speaking). All you need to do is imitate the stack that's built while invoking functions and do something similar as a goto (you may imitate gotos with languages that don't explicitly have this keyword too).
Have a look at the following entries on wikipedia, you can use them as a starting point to find a complete answer to your question.
Recursion in computer science
Recurrence relation
Follows a paragraph that may give you some hint on where to start:
Solving a recurrence relation means obtaining a closed-form solution: a non-recursive function of n.
Also have a look at the last paragraph of this entry.
It is possible to convert any recursive algorithm to a non-recursive
one, but often the logic is much more complex and doing so requires
the use of a stack. In fact, recursion itself uses a stack: the
function stack.
More Details: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Functions
tazzego, recursion means that a function will call itself whether you like it or not. When people are talking about whether or not things can be done without recursion, they mean this and you cannot say "no, that is not true, because I do not agree with the definition of recursion" as a valid statement.
With that in mind, just about everything else you say is nonsense. The only other thing that you say that is not nonsense is the idea that you cannot imagine programming without a callstack. That is something that had been done for decades until using a callstack became popular. Old versions of FORTRAN lacked a callstack and they worked just fine.
By the way, there exist Turing-complete languages that only implement recursion (e.g. SML) as a means of looping. There also exist Turing-complete languages that only implement iteration as a means of looping (e.g. FORTRAN IV). The Church-Turing thesis proves that anything possible in a recursion-only languages can be done in a non-recursive language and vica-versa by the fact that they both have the property of turing-completeness.
Here is an iterative algorithm:
def howmany(x,y)
a = {}
for n in (0..x+y)
for m in (0..n)
a[[m,n-m]] = if m==0 or n-m==0 then 1 else a[[m-1,n-m]] + a[[m,n-m-1]] end
end
end
return a[[x,y]]
end

What are data structures at the lowest level?

I recetly watched a SICP lecture in which Sussman demonstrated how it was possible to implement Scheme's cons car and cdr using nothing but procudures.
It went something like this:
(define (cons x y)
(lambda (m) (m x y)))
(define (car z)
(z (lambda (p q) p)))
This got me thinking; What excatly are data structures? When a language is created, are data structures implemented as abstractions built up by procedures? If they are just made up of procedures, what are procedures really at the lowest level?
I guess what I want to know is what is at the bottom of the abstraction chain (unless it happens to be abstractions all the way down). At what point does it become hardware?
The trick is that you don't have to care. cons, cdr and car are abstractions and so their underlying implementation shouldn't matter.
What you have there are what are called "Church pairs", we build everything from functions. In modern machines we build everything from strings of 1s and 0s, but really, it doesn't matter.
Now, if you're wondering how all of these abstractions are implemented at in your particular implementation, that depends. In all likelihood your compiler/interpeter is jumping through hoops behind the scenes allocating cons cells as tightly packed pair of pointers (or similar) and turning your functions into a string of 0's and 1's forming the appropriate machine code and pairing this with a pointer to its environment.
But like I said, the whole beauty of building these abstractions is that you don't have to care as a user :)
Cool question!
It becomes hardware whenever someone hands it off to calls to the processor (CPU). If you have to read a manual from a chip manufacturer to understand how to do what you want to do, you're at the level you describe.
(source: micro-examples.com)
Here's an overview of the path from physics to hardware to code to users: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-002-circuits-and-electronics-spring-2007/video-lectures/lecture-1/ although I don't think it's to do with lambda calculus. (But you're just saying that's what inspired the question—it's not the question per se, right?)
There is a step where the person who writes the language has to interface with processor instructions, eg https://en.wikipedia.org/wiki/X86_instruction_listings || https://en.wikipedia.org/wiki/X86_calling_conventions.
Peek at kernel programming or think about embedded systems (in a microwave, on the wing of an airplane), ARMs, or mobile devices—these are the things people program with that have non-laptop chipsets.
People who write BLAS (linear algebra solver libraries) sometimes get down to this level of detail. For example https://en.wikipedia.org/wiki/Math_Kernel_Library or https://en.wikipedia.org/wiki/Automatically_Tuned_Linear_Algebra_Software. When they say the BLAS is "tuned" what do they mean? They're talking about sniffing out facts about your processor and changing how inner-inner-inner loops make their decisions to waste less time with the way the physical thing is configured.
(source: hswstatic.com)
If I recall correctly, high-level programming languages (like C ;) ) make no assumptions about what system they will be running on so they make agnostic calls that run like ten times† slower than they would if they knew ahead of time which kind of call to make. Every. Time. This is the kind of thing that could drive you insane, but it's that classic engineering tradeoff of the technical people's time versus the end-user's performance. I guess if you become a kernel programmer or embedded-systems programmer you can help put an end to all of the wasted clock cycles on computers across the globe—processors getting hot as they waste a lot of pointless going back-and-forth. (Although there are clearly worse things that go on in the world.)
†: I just quickly searched how much BLAS speedups are and yeah, it can be a factor of like 15 or 20. So I don't think I'm exaggerating / misremembering what I heard about wasted movement. BTW, there is something similar goes on in power generation: the final step (the turbine) in power generation is only like 20% efficient. Doesn't all that waste just drive you crazy?! Time to become an engineer. ;)
A cool project you could check out is MenuetOS; someone wrote an operating system in assembler.
Yet more cool stuff to look at could be this guy who says it's actually pretty easy and fun to learn x86 assembly language. (!)
(source: iqsdirectory.com)
You can also read back on the olden days when there was less distance between software and hardware (eg, programming with a punchcard). Thankfully people have written "high-level" languages that are more like the way people talk and think and less like moving some tape around. Data structures may not be the most obvious thing from everyday conversation but they are more abstract than [lists a sequence of GOTO and STORE instructions...].
HTH
The substrate at the bottom of the abstraction chain varies depending on the hardware, but it's probably going to be some sort of register machine, which will have some kind of native instruction set.
SICP's last chapter is all about this layer. The book doesn't present any real-life hardware, but an abstract (ahem turtles!) register machine that sort of resembles what might actually be going on ... which happens to be modeled in Scheme, just for extra metacircular fun. I found it a tough read, but worthwhile.
If you're interested in this sort of stuff you might want to check out Knuth, too.
There exists several bottoms for abstraction chains. Depending on the underlying task computer scientist can choose one, that suits better. The most common are Turing Machine and Lambda Calculus.

Why is computing a factorial faster/efficient in functional programming?

Here's a claim made in one of the answers at: https://softwareengineering.stackexchange.com/a/136146/18748
Try your hand at a Functional language or two. Try implementing
factorial in Erlang using recursion and watch your jaw hit the floor
when 20000! returns in 5 seconds (no stack overflow in site)
Why would it be faster/more efficient than using recursion/iteration in Java/C/C++/Python (any)? What is the underlying mathematical/theoretical concept that makes this happen? Unfortunately, I wasn't ever exposed to functional programming in my undergrad (started with 'C') so I may just lack knowing the 'why'.
A recursive solution would probably be slower and clunkier in Java or C++, but we don't do things that way in Java and C++, so. :)
As for why, functional languages invariably make very heavy usage of recursion (as by default they either have to jump through hoops, or simply aren't allowed, to modify variables, which on its own makes most forms of iteration ineffective or outright impossible). So they effectively have to optimize the hell out of it in order to be competitive.
Almost all of them implement an optimization called "tail call elimination", which uses the current call's stack space for the next call and turns a "call" into a "jump". That little change basically turns recursion into iteration, but don't remind the FPers of that. When it comes to iteration in a procedural language and recursion in a functional one, the two would prove about equal performancewise. (If either one's still faster, though, it'd be iteration.)
The rest is libraries and number types and such. Decent math code ==> decent math performance. There's nothing keeping procedural or OO languages from optimizing similarly, other than that most people don't care that much. Functional programmers, on the other hand, love to navel gaze about how easily they can calculate Fibonacci sequences and 20000! and other numbers that most of us will never use anyway.
Essentially, functional languages make recursion cheap.
Particularly via the tail call optimization.
Tail-call optimization
Lazy evaluation: "When using delayed evaluation, an expression is not evaluated as soon as it gets bound to a variable, but when the evaluator is forced to produce the expression's value."
Both concepts are illustrated in this question about various ways of implementing factorial in Clojure. You can also find a detailed discussion of lazy sequences and TCO in Stuart Halloway and Aaron Bedra's book, Programming Clojure.
The following function, adopted from Programming Clojure, creates a lazy sequence with the first 1M members of the Fibonacci sequence and realizes the one-hundred-thousandth member:
user=> (time (nth (take 1000000 (map first (iterate (fn [[a b]] [b (+ a b)]) [0N 1N]))) 100000))
"Elapsed time: 252.045 msecs"
25974069347221724166155034....
(20900 total digits)
(512MB heap, Intel Core i7, 2.5GHz)
It's not faster (than what?), and it has nothing to do with tail call optimization (it's not enough to throw a buzzword in here, one should also explain why tail call optimization should be faster than looping? it's simply not the case!)
Mind you that I am not a functional programming hater, to the contrary! But spreading myths does not serve the case for functional programming.
BTW, has any one here actually tried how long it takes to compute (and print, which should consume at least 50% of the CPU cycles needed) 20000!?
I did:
main _ = println (product [2n..20000n])
This is a JVM language compiled to Java, and it uses Java big integers (known to be slow). It also suffers of JVM startup costs. And it is not the fastest way to do it (explicit recursion could save list construction, for example).
The result is:
181920632023034513482764175686645876607160990147875264
...
... many many lines with digits
...
000000000000000000000000000000000000000000000000000000000000000
real 0m3.330s
user 0m4.472s
sys 0m0.212s
(On a Intel® Core™ i3 CPU M 350 # 2.27GHz × 4)
We can safely assume that the same in C with GMP wouldn't even use 50% of that time.
Hence, functional is faster is a myth, as well as functional is slower. It's not even a myth, it's simply nonsense as long as one does not say compared to what it is faster/slower.

Reasoning about performance in Haskell

The following two Haskell programs for computing the n'th term of the Fibonacci sequence have greatly different performance characteristics:
fib1 n =
case n of
0 -> 1
1 -> 1
x -> (fib1 (x-1)) + (fib1 (x-2))
fib2 n = fibArr !! n where
fibArr = 1:1:[a + b | (a, b) <- zip fibArr (tail fibArr)]
They are very close to mathematically identical, but fib2 uses the list notation to memoize its intermediate results, while fib1 has explicit recursion. Despite the potential for the intermediate results to be cached in fib1, the execution time gets to be a problem even for fib1 25, suggesting that the recursive steps are always evaluated. Does referential transparency contribute anything to Haskell's performance? How can I know ahead of time if it will or won't?
This is just an example of the sort of thing I'm worried about. I'd like to hear any thoughts about overcoming the difficulty inherent in reasoning about the performance of a lazily-executed, functional programming language.
Summary: I'm accepting 3lectrologos's answer, because the point that you don't reason so much about the language's performance, as about your compiler's optimization, seems to be extremely important in Haskell - more so than in any other language I'm familiar with. I'm inclined to say that the importance of the compiler is the factor that differentiates reasoning about performance in lazy, functional langauges, from reasoning about the performance of any other type.
Addendum: Anyone happening on this question may want to look at the slides from Johan Tibell's talk about high performance Haskell.
In your particular Fibonacci example, it's not very hard to see why the second one should run faster (although you haven't specified what f2 is).
It's mainly an algorithmic issue:
fib1 implements the purely recursive algorithm and (as far as I know) Haskell has no mechanism for "implicit memoization".
fib2 uses explicit memoization (using the fibArr list to store previously computed values.
In general, it's much harder to make performance assumptions for a lazy language like Haskell, than for an eager one. Nevertheless, if you understand the underlying mechanisms (especially for laziness) and gather some experience, you will be able to make some "predictions" about performance.
Referential transparency increases (potentially) performance in (at least) two ways:
First, you (as a programmer) can be sure that two calls to the same function will always return the same, so you can exploit this in various cases to benefit in performance.
Second (and more important), the Haskell compiler can be sure for the above fact and this may enable many optimizations that can't be enabled in impure languages (if you've ever written a compiler or have any experience in compiler optimizations you are probably aware of the importance of this).
If you want to read more about the reasoning behind the design choices (laziness, pureness) of Haskell, I'd suggest reading this.
Reasoning about performance is generally hard in Haskell and lazy languages in general, although not impossible. Some techniques are covered in Chris Okasaki's Purely Function Data Structures (also available online in a previous version).
Another way to ensure performance is to fix the evaluation order, either using annotations or continuation passing style. That way you get to control when things are evaluated.
In your example you might calculate the numbers "bottom up" and pass the previous two numbers along to each iteration:
fib n = fib_iter(1,1,n)
where
fib_iter(a,b,0) = a
fib_iter(a,b,1) = a
fib_iter(a,b,n) = fib_iter(a+b,a,n-1)
This results in a linear time algorithm.
Whenever you have a dynamic programming algorithm where each result relies on the N previous results, you can use this technique. Otherwise you might have to use an array or something completely different.
Your implementation of fib2 uses memoization but each time you call fib2 it rebuild the "whole" result. Turn on ghci time and size profiling:
Prelude> :set +s
If it was doing memoisation "between" calls the subsequent calls would be faster and use no memory. Call fib2 20000 twice and see for yourself.
By comparison a more idiomatic version where you define the exact mathematical identity:
-- the infinite list of all fibs numbers.
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
memoFib n = fibs !! n
actually do use memoisation, explicit as you see. If you run memoFib 20000 twice you'll see the time and space taken the first time then the second call is instantaneous and take no memory. No magic and no implicit memoization like a comment might have hinted at.
Now about your original question: optimizing and reasoning about performance in Haskell...
I wouldn't call myself an expert in Haskell, I have only been using it for 3 years, 2 of which at my workplace but I did have to optimize and get to understand how to reason somewhat about its performance.
As mentionned in other post laziness is your friend and can help you gain performance however YOU have to be in control of what is lazily evaluated and what is strictly evaluated.
Check this comparison of foldl vs foldr
foldl actually stores "how" to compute the value i.e. it is lazy. In some case you saves time and space beeing lazy, like the "infinite" fibs. The infinite "fibs" doesn't generate all of them but knows how. When you know you will need the value you might as well just get it "strictly" speaking... That's where strictness annotation are usefull, to give you back control.
I recall reading many times that in lisp you have to "minimize" consing.
Understanding what is stricly evaluated and how to force it is important but so is understanding how much "trashing" you do to the memory. Remember Haskell is immutable, that means that updating a "variable" is actually creating a copy with the modification. Prepending with (:) is vastly more efficient than appending with (++) because (:) does not copy memory contrarily to (++). Whenever a big atomic block is updated (even for a single char) the whole block needs to be copied to represent the "updated" version. The way you structure data and update it can have a big impact on performance. The ghc profiler is your friend and will help you spot these. Sure the garbage collector is fast but not having it do anything is faster!
Cheers
Aside from the memoization issue, fib1 also uses non-tailcall recursion. Tailcall recursion can be re-factored automatically into a simple goto and perform very well, but the recursion in fib1 cannot be optimized in this way, because you need the stack frame from each instance of fib1 in order to calculate the result. If you rewrote fib1 to pass a running total as an argument, thus allowing a tail call instead of needing to keep the stack frame for the final addition, the performance would improve immensely. But not as much as the memoized example, of course :)
Since allocation is a major cost in any functional language, an important part of understanding performance is to understand when objects are allocated, how long they live, when they die, and when they are reclaimed. To get this information you need a heap profiler. It's an essential tool, and luckily GHC ships with a good one.
For more information, read Colin Runciman's papers.

Can any algorithmic puzzle be implemented in a purely functional way?

I've been contemplating programming language designs, and from the definition of Declarative Programming on Wikipedia:
This is in contrast from imperative programming, which requires a detailed description of the algorithm to be run.
and further down:
... Any style of programming that is not imperative. ...
It then goes on to express that functional languages, because they are not imperative, are declarative by their very nature.
However, this makes me wonder, are purely functional programming languages able to solve any algorithmic problem, or are the constraints based upon what functions are available in that language?
I'm mostly interested in general thoughts on the subject, although if specific examples can illustrate the point, I certainly welcome them.
According to the Church-Turing Thesis ,
the three computational processes (recursion, λ-calculus, and Turing machine) were shown to be equivalent"
where Turing machine can be read as "procedural" and lambda calculus as "functional".
Yes, Haskell, Erlang, etc. are Turing complete languages. In principle, you don't need mutable state to solve a problem, since you can always create a new object instead of mutating the old one. Of course, Brainfuck is also Turing complete. In other words, just because an algorithm can be expressed in a functional language doesn't mean it's not horribly awkward.
OK, so Church and Turing provied it is possible, but how do we actually do something?
Rewriting imperative code in pure functional style is an exercise I frequently assign to undergraduate students:
Each mutable variable becomes a function parameter
Loops are rewritten using recursion
Each goto is expressed as a function call with arguments
Sometimes what comes out is a mess, but often the results are surprisingly elegant. The only real trick is not to pass arguments that never change, but instead to let-bind them in the outer environment.
The big difference with functional style programming is that it avoids mutable state. Where imperative programming will typically update variables, functional programming will define new, read-only values.
The main place where this will hit performance is with algorithms that use updatable arrays. An imperative implementation can update an array element in O(1) time, while the best a purely functional style of implementation can achieve is O(log N) (using a sorted tree).
Note that functional languages generally have some way to use updateable arrays with O(1) access time (e.g., Haskell provides this with its state transformer monad). However, this is arguably an imperative programming method... nothing wrong with that; you want to use the best tools for a particular job, after all.
The functional style of O(log N) incremental array update is not all bad, though, as functional style algorithms seem to lend themselves well to parallellization.
Too long to be posted as a comment on #SteveB's answer.
Functional programming and imperative programming have equal capability: whatever one can do, the other can do. They are said to be Turing complete. The functions that a Turing machine can compute are exactly the ones that recursive function theory and λ-calculus express.
But the Church-Turing Thesis, as such, is irrelevant. It asserts that any computation can be carried out by a Turing machine. This relates an informal idea - computation - to a formal one - the Turing machine. Nobody has yet found anything we would recognise as computation that a Turing machine can't do. Will someone find such a thing in future? Who can tell.
Using state monads you can program in an imperative style in Haskell.
So the assertion that Haskell is declarative by its very nature needs to be taken with a grain of salt. On the positive side it then is equivalent to imperative programming languages, also in a practical sense which doesn't completely ignore efficiency.
While I completely agree with the answer that invokes Church-Turing thesis, this begs an interesting question actually. If I have a parallel computation problem (which is not algorithmic in a strict mathematical sense), such as multiple producer/consumer queue or some network protocol between several machines, can this be adequately modeled by Turing machine? It can be simulated, but if we simulate it, we lose the purpose why we have the parallelism in the problem (because then we can find simpler algorithm on the Turing machine). So what if we were not to lose parallelism inherent to the problem (and thus the reason why are we interested in it), we couldn't remove the notion of state?
I remember reading somewhere that there are problems which are provably harder when solved in a purely functional manner, but I can't seem to find the reference.
As noted above, the primary problem is array updates. While the compiler may use a mutable array under the hood in some conditions, it must be guaranteed that only one reference to the array exists in the entire program.
Not only is this a hard mathematical fact, it is also a problem in practice, if you don't use impure constructs.
On a more subjective note, stating that all Turing complete languages are equivalent is only true in a narrow mathematical sense. Paul Graham explores the issue in Beating the Averages in the section "The Blub Paradox."
Formal results such as Turing-completeness may be provably correct, but they are not necessarily useful. The travelling salesman problem may be NP-complete, and yet salesman travel all the time. It seems they don't feel the need to follow an "optimal" path, so the theorem is irrelevant.
NOTE: I am not trying to bash functional programming, since I really like it. It is just important to remember that it is not a panacea.

Resources