I can't wrap my head around the differences between seq and LazyList. They're both lazy and potentially infinite. While seq<'T> is IEnumerable<'T> from the .NET framework, LazyList is included in the F# PowerPack. In practice, I encounter sequences much more often than LazyLists.
What are their differences in terms of performance, usage, readability, etc.? And why does LazyList have such a poor reputation compared to seq?
LazyList computes each element only once regardless of how many times the list is traversed. In this way, it's closer to a sequence returned from Seq.cache (rather than a typical sequence). But, other than caching, LazyList behaves exactly like a list: it uses a list structure under the hood and supports pattern matching. So you might say: use LazyList instead of seq when you need list semantics and caching (in addition to laziness).
Regarding both being infinite: as you traverse them, seq's memory usage stays constant, while LazyList's grows linearly, since it retains every element it has computed.
These docs may be worth a read.
In addition to Daniel's answer, I think the main practical difference is how you process the LazyList or seq structures (or computations).
If you want to process a LazyList, you would typically write a recursive function using pattern matching (quite similar to processing normal F# lists).
If you want to process a seq, you can either use built-in functions or write imperative code that calls GetEnumerator and then uses the returned enumerator in a loop (which may be written as a recursive function, but it will mutate the enumerator). You cannot use the usual head/tail style (Seq.head and Seq.tail), because that is extremely inefficient: seq does not keep the evaluated elements, so each Seq.head/Seq.tail step has to re-iterate from the start.
Regarding the reputation of seq and LazyList, I think the F# library design takes a pragmatic approach: since seq is actually .NET's IEnumerable, it is quite convenient for .NET programming (and it is also nice that you can treat other collections as seq). Lazy lists are needed less frequently, so the normal F# list and seq are sufficient in most scenarios.
I was wondering whether boost::range or range-v3 will reconcile free functions and member functions in a similar way that std::begin reconciles STL containers and C-style arrays (in terms of generic coding, I mean).
In particular, it would be convenient to call std::sort on a std::list and have it automatically dispatch to the better implementation provided by std::list::sort.
In the end, could member functions be seen as mere backends for their generic counterparts, with std::list::sort never called directly in client code?
AFAIK, neither library you mention deals with this directly. There is a push to deal with this kind of thing more generally in C++17, including a proposal to make f(x) and x.f() equivalent, but as I mentioned in the comment above, I'm unclear if it will work with range-v3's algorithms.
I did notice an interesting comment in range-v3's sort.hpp: // TODO Forward iterators, like EoP?. So, perhaps Niebler does have ideas to support a more generic sort. ("EoP" is Elements of Programming by Alex Stepanov.)
One complication: A generic sort uses iterators to reorder values, while list::sort() reorders the links themselves. The distinction is important if you care what iterators point to after the sort, so you'd still need a way to select which sort you want. One could even argue that sort() should never call list::sort(), given the different semantics.
From what I've learned, using streams in large programs is much more efficient than using ordinary lists in DrRacket. So why isn't lazy evaluation the default in DrRacket? I wrote a timer procedure that measures how long the work takes to complete, and in every complex program I tried, lazy evaluation was a lot faster.
AFAIK, using streams for something like sorting is a waste of cycles, since you need to finish the whole sort before you know the first element. For any task that, like a sort, needs to evaluate the whole set before producing a result, you'll end up spending more time than you would without streams, because the whole stream system has costs as well as benefits.
The benefit of streams is that you can interleave the calculations, so the program doesn't need to run a whole loop before processing the first element. If you have n layers of stream processing, you benefit whenever your program finishes while the other layers haven't yet served you the whole thing.
DrRacket is not a language but an IDE. Racket is both a language (#!racket as first line of source) and the name of the implementation that implements it.
Racket supports #!lazy, which is a lazy version of Racket. Basically everything works just like streams do, everywhere. You get the same benefits and costs.
None of the mentioned languages is Scheme, but #!racket was based on, and is a superset of, #!r5rs. Since then #!r6rs and the new #!r7rs have appeared. None of the official Scheme reports is lazy. The reason is that each report's predecessor was eager, and making the language lazy would change it completely and ruin all backwards compatibility.
The innovation of Scheme in 1975 was lexical closures. The creators added lazy evaluation by need in a later report (via delay and force). Other languages, like Haskell, are built to be lazy from the ground up, and they have more advanced compilers that constant-fold and otherwise make the code snappy.
I assume it doesn't.
My reasoning is that Haskell is purely functional (outside the I/O monad), so the implementation could have made every "call by name" reuse the same evaluated value whenever the "name"s are the same.
I don't know anything about the implementation details but I'm really interested.
Detailed explanations will be much appreciated :)
BTW, I tried Google; it was quite hard to find anything useful.
First of all, Haskell is a specification, not an implementation; the report does not actually require use of call-by-name evaluation, or lazy evaluation for that matter. Haskell implementations are only required to be non-strict, which does rule out call-by-value and similar strategies.
So, strictly (ha, ha) speaking, evaluation strategies can't slow down Haskell. I'm not sure what can slow down Haskell, though clearly something has or else it wouldn't have taken 12 years to get the next version of the Report out after Haskell 98. My guess is that it involves committees somehow.
Anyway, "lazy evaluation" refers to a "call by need" strategy, which is the most common implementation choice for Haskell. This differs from call-by-name in that if a subexpression is used in multiple places, it will be evaluated at most once.
The details of what qualifies as a subexpression that will be shared are a bit subtle and probably somewhat implementation-dependent, but to use an example from GHC Haskell: consider the function cycle, which repeats an input list infinitely. A naive implementation might be:
cycle xs = xs ++ cycle xs
This ends up being inefficient because there is no single cycle xs expression that can be shared, so the resulting list has to be constructed continually as it's traversed, allocating more memory and doing more computation each time.
In contrast, the actual implementation looks like this:
cycle xs = xs' where xs' = xs ++ xs'
Here the name xs' is defined recursively as itself appended to the end of the input list. This time xs' is shared, and evaluated only once; the resulting infinite list is actually a finite, circular linked list in memory, and once the entire loop has been evaluated no further work is needed.
In general, GHC will not memoize functions for you: given f and x, each use of f x will be re-evaluated unless you give the result a name and use that. The resulting value will be the same in either case, but the performance can differ significantly. This is mostly a matter of avoiding pessimizations--it would be easy for GHC to memoize things for you, but in many cases this would cost large amounts of memory to gain tiny or nonexistent amounts of speed.
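To make the sharing point concrete, here is a minimal sketch (my own illustration; expensive is a made-up placeholder for any costly pure function): naming the result is what lets two uses share one evaluation.
import Data.List (foldl')

-- A deliberately costly pure function (hypothetical, just for illustration).
expensive :: Int -> Int
expensive n = foldl' (+) 0 [1 .. n * 1000000]

unshared :: Int -> Int
unshared n = expensive n + expensive n   -- typically evaluated twice, as described above

shared :: Int -> Int
shared n = let r = expensive n           -- give the result a name once...
           in r + r                      -- ...and both uses share a single evaluation

main :: IO ()
main = print (unshared 50, shared 50)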
The flip side is that shared values are retained; if you have a data structure that's very expensive to compute, naming the result of constructing it and passing that to functions using it ensures that no work is duplicated--even if it's used simultaneously by different threads.
You can also pessimize things yourself this way--if a data structure is cheap to compute and uses lots of memory, you should probably avoid sharing references to the full structure, as that will keep the whole thing alive in memory as long as anything could possibly use it later.
Yes, it does, somewhat. The problem is that Haskell can't, in general, calculate the value too early (e.g. if it would lead to an exception), so it sometimes needs to keep a thunk (code for calculating the value) instead of the value itself, which uses more memory and slows things down. The compiler tries to detect cases where this can be avoided, but it's impossible to detect all of them.
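A classic concrete case (my own sketch, not from the answer above) is the lazy left fold: foldl defers the additions, building a chain of thunks, while the strict foldl' forces the accumulator at each step.
import Data.List (foldl')

-- Lazy left fold: without optimization this builds the thunk (((0+1)+2)+...)
-- before evaluating it, which costs memory and can overflow the stack.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- Strict left fold: the accumulator is forced at each step, so it stays a
-- plain number instead of a growing thunk.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 10000000])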
I'm new to programming and learning Haskell by reading and working through Project Euler problems. Of course, the most important thing one can do to improve performance on these problems is to use a better algorithm. However, it is clear to me that there are other simple and easy to implement ways to improve performance. A cursory search brought up this question, and this question, which give the following tips:
Use the ghc flags -O2 and -fllvm.
Use the type Int instead of Integer, because it is unboxed (or even Int64 instead of Integer when Int is too small). This requires giving the functions explicit type signatures rather than letting the compiler decide on the fly.
Use rem, not mod, for division testing.
Use Schwartzian transformations when appropriate.
Using an accumulator in recursive functions (a tail-recursion optimization, I believe); see the sketch after this list.
Memoization (?)
(One answer also mentions worker/wrapper transformation, but that seems fairly advanced.)
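To make a few of these tips concrete for myself, here is a small sketch (my own illustration, not taken from the linked answers) combining an explicit Int signature, rem for divisibility testing, and a strict accumulator:
{-# LANGUAGE BangPatterns #-}

-- Count the multiples of 3 or 5 below a bound (in the spirit of Project Euler 1).
-- The explicit Int signature avoids defaulting to Integer, rem is used instead
-- of mod, and the bang patterns keep the accumulator evaluated as we go.
countMultiples :: Int -> Int
countMultiples bound = go 1 0
  where
    go !i !acc
      | i >= bound                       = acc
      | i `rem` 3 == 0 || i `rem` 5 == 0 = go (i + 1) (acc + 1)
      | otherwise                        = go (i + 1) acc

main :: IO ()
main = print (countMultiples 1000)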
Question: What other simple optimizations can one make in Haskell to improve performance on Project Euler-style problems? Are there any other Haskell-specific (or functional programming specific?) ideas or features that could be used to help speed up solutions to Project Euler problems? Conversely, what should one watch out for? What are some common yet inefficient things to be avoided?
Here are some good slides by Johan Tibell that I frequently refer to:
Haskell Performance Patterns
One easy suggestion is to use hlint, a program that checks your source code and suggests syntactic improvements. This might not increase speed, because the compiler or lazy evaluation most likely handles it already, but it may help the compiler in some cases. Furthermore, it will make you a better Haskell programmer, since you will learn better ways to do things, and it may make your program easier to understand and analyze.
Examples taken from http://community.haskell.org/~ndm/darcs/hlint/hlint.htm, such as:
darcs-2.1.2\src\CommandLine.lhs:94:1: Error: Use concatMap
Found:
concat $ map escapeC s
Why not:
concatMap escapeC s
and
darcs-2.1.2\src\Darcs\Patch\Test.lhs:306:1: Error: Use a more efficient monadic variant
Found:
mapM (delete_line (fn2fp f) line) old
Why not:
mapM_ (delete_line (fn2fp f) line) old
I think the largest gains in Project Euler problems come from understanding the problem and removing unnecessary computation. Even if you don't understand everything, you can make small fixes that double your program's speed. Say you are looking for primes up to 1,000,000: you can of course write filter isPrime [1..1000000]. But if you think a bit, you realize that no even number above 2 is prime, so you have removed (about) half the work by instead writing 2 : filter isPrime [3,5..999999].
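As a runnable sketch of that idea (isPrime here is my own naive trial-division helper, purely for illustration):
-- Naive trial-division primality test, assumed here only for the example.
isPrime :: Int -> Bool
isPrime n
  | n < 2     = False
  | otherwise = all (\d -> n `rem` d /= 0) [2 .. isqrt n]
  where
    isqrt m = floor (sqrt (fromIntegral m :: Double))

-- Skipping the even numbers does roughly half the work of filter isPrime [1..limit].
primesUpTo :: Int -> [Int]
primesUpTo limit = 2 : filter isPrime [3, 5 .. limit]

main :: IO ()
main = print (length (primesUpTo 1000000))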
There is a fairly large section of the Haskell wiki about performance.
One fairly common problem is too little (or too much) strictness (this is covered by the sections listed under General techniques on the performance page above). Too much laziness causes a large number of thunks to accumulate; too much strictness can cause too much to be evaluated.
These considerations are especially important when writing tail-recursive functions (i.e., those with an accumulator). And, on that note, depending on how the function is used, a tail-recursive function is sometimes less efficient in Haskell than the equivalent non-tail-recursive function, even with the optimal strictness annotations.
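As an illustration of that last point (my own sketch, not from the wiki): when producing a list, the "naive" guarded recursion can beat a tail-recursive accumulator, because its output can be consumed lazily and incrementally.
-- Guarded (non-tail) recursion: each cons cell is available immediately,
-- so take 5 (doubleAll [1..]) works and only evaluates what is consumed.
doubleAll :: [Int] -> [Int]
doubleAll []       = []
doubleAll (x : xs) = 2 * x : doubleAll xs

-- Tail-recursive version with an accumulator: nothing is available until the
-- whole input has been consumed (and then reversed), so it cannot handle
-- infinite input and holds the entire result in memory at once.
doubleAllTR :: [Int] -> [Int]
doubleAllTR = go []
  where
    go acc []       = reverse acc
    go acc (x : xs) = go (2 * x : acc) xs

main :: IO ()
main = print (take 5 (doubleAll [1 ..]))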
Also, as demonstrated by this recent question, sharing can make a huge difference to performance (in many cases, this can be considered a form of memoisation).
Project Euler is mostly about finding clever algorithmic solutions to the problems. Once you have the right algorithm, micro-optimization is rarely an issue, since even a straightforward or interpreted (e.g. Python or Ruby) implementation should run well within the speed constraints. The main technique you need is understanding lazy evaluation so you can avoid thunk buildups.
The following two Haskell programs for computing the n'th term of the Fibonacci sequence have greatly different performance characteristics:
fib1 n =
  case n of
    0 -> 1
    1 -> 1
    x -> fib1 (x - 1) + fib1 (x - 2)

fib2 n = fibArr !! n
  where
    fibArr = 1 : 1 : [a + b | (a, b) <- zip fibArr (tail fibArr)]
They are very close to mathematically identical, but fib2 uses the list notation to memoize its intermediate results, while fib1 has explicit recursion. Despite the potential for the intermediate results to be cached in fib1, the execution time gets to be a problem even for fib1 25, suggesting that the recursive steps are always evaluated. Does referential transparency contribute anything to Haskell's performance? How can I know ahead of time if it will or won't?
This is just an example of the sort of thing I'm worried about. I'd like to hear any thoughts about overcoming the difficulty inherent in reasoning about the performance of a lazily-executed, functional programming language.
Summary: I'm accepting 3lectrologos's answer, because the point that you don't reason so much about the language's performance as about your compiler's optimizations seems to be extremely important in Haskell - more so than in any other language I'm familiar with. I'm inclined to say that the importance of the compiler is the factor that differentiates reasoning about performance in lazy, functional languages from reasoning about the performance of any other type.
Addendum: Anyone happening on this question may want to look at the slides from Johan Tibell's talk about high performance Haskell.
In your particular Fibonacci example, it's not very hard to see why the second one should run faster.
It's mainly an algorithmic issue:
fib1 implements the purely recursive algorithm and (as far as I know) Haskell has no mechanism for "implicit memoization".
fib2 uses explicit memoization (using the fibArr list to store previously computed values).
In general, it's much harder to make performance assumptions for a lazy language like Haskell, than for an eager one. Nevertheless, if you understand the underlying mechanisms (especially for laziness) and gather some experience, you will be able to make some "predictions" about performance.
Referential transparency increases (potentially) performance in (at least) two ways:
First, you (as a programmer) can be sure that two calls to the same function with the same arguments will always return the same result, so you can exploit this in various cases to benefit performance.
Second (and more importantly), the Haskell compiler can be sure of the above fact, and this enables many optimizations that can't be enabled in impure languages (if you've ever written a compiler or have any experience with compiler optimizations, you are probably aware of how important this is).
If you want to read more about the reasoning behind the design choices (laziness, pureness) of Haskell, I'd suggest reading this.
Reasoning about performance is generally hard in Haskell and in lazy languages in general, although not impossible. Some techniques are covered in Chris Okasaki's Purely Functional Data Structures (also available online in an earlier version).
Another way to ensure performance is to fix the evaluation order, either using annotations or continuation passing style. That way you get to control when things are evaluated.
In your example you might calculate the numbers "bottom up" and pass the previous two numbers along to each iteration:
fib n = fib_iter (1, 1, n)
  where
    fib_iter (a, b, 0) = a
    fib_iter (a, b, 1) = a
    fib_iter (a, b, n) = fib_iter (a + b, a, n - 1)
This results in a linear time algorithm.
Whenever you have a dynamic programming algorithm where each result relies on the N previous results, you can use this technique. Otherwise you might have to use an array or something completely different.
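For example (my own sketch), a "tribonacci" recurrence that depends on the three previous results can be computed the same way, just by carrying three accumulators:
-- Each term is the sum of the three previous ones; carry them along.
trib :: Integer -> Integer
trib n = go 0 1 1 n
  where
    go a _ _ 0 = a
    go a b c k = go b c (a + b + c) (k - 1)

main :: IO ()
main = print (map trib [0 .. 10])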
Your implementation of fib2 uses memoization, but each time you call fib2 it rebuilds the "whole" result. Turn on ghci time and size profiling:
Prelude> :set +s
If it were doing memoisation "between" calls, the subsequent calls would be faster and use no memory. Call fib2 20000 twice and see for yourself.
By comparison, a more idiomatic version where you define the exact mathematical identity:
-- the infinite list of all Fibonacci numbers
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
memoFib n = fibs !! n
actually does use memoisation, explicitly as you can see. If you run memoFib 20000 twice, you'll see the time and space taken the first time, while the second call is instantaneous and takes no memory. No magic and no implicit memoization, as a comment might have hinted at.
Now about your original question: optimizing and reasoning about performance in Haskell...
I wouldn't call myself an expert in Haskell; I have only been using it for 3 years, 2 of those at my workplace, but I did have to optimize it and learn how to reason somewhat about its performance.
As mentioned in the other post, laziness is your friend and can help you gain performance; however, YOU have to be in control of what is lazily evaluated and what is strictly evaluated.
Check out this comparison of foldl vs foldr.
foldl actually stores "how" to compute the value, i.e. it is lazy. In some cases you save time and space by being lazy, as with the "infinite" fibs: the infinite fibs list doesn't generate all the numbers, but it knows how to. When you know you will need the value, you might as well just get it, "strictly" speaking... That's where strictness annotations are useful, to give you back control.
I recall reading many times that in lisp you have to "minimize" consing.
Understanding what is strictly evaluated and how to force it is important, but so is understanding how much "trashing" of memory you do. Remember Haskell is immutable, which means that updating a "variable" actually creates a copy with the modification. Prepending with (:) is vastly more efficient than appending with (++), because (:) does not copy memory, unlike (++). Whenever a big atomic block is updated (even for a single char), the whole block needs to be copied to represent the "updated" version. The way you structure your data and update it can have a big impact on performance. The GHC profiler is your friend and will help you spot these issues. Sure, the garbage collector is fast, but not having it do anything at all is faster!
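A small sketch of the (:) versus (++) point (my own illustration): building a list by repeatedly appending copies the already-built prefix at every step, whereas prepending and reversing once at the end does not.
-- Quadratic: each (++) walks and copies the entire list built so far.
buildAppend :: Int -> [Int]
buildAppend n = go [] [1 .. n]
  where
    go acc []       = acc
    go acc (x : xs) = go (acc ++ [x]) xs

-- Linear: (:) allocates one cons cell per element; reverse once at the end.
buildPrepend :: Int -> [Int]
buildPrepend n = reverse (go [] [1 .. n])
  where
    go acc []       = acc
    go acc (x : xs) = go (x : acc) xs

main :: IO ()
main = print (buildAppend 10000 == buildPrepend 10000)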
Cheers
Aside from the memoization issue, fib1 also uses non-tail-call recursion. Tail-call recursion can be refactored automatically into a simple goto and performs very well, but the recursion in fib1 cannot be optimized this way, because you need the stack frame from each instance of fib1 in order to calculate the result. If you rewrote fib1 to pass a running total as an argument, thus allowing a tail call instead of needing to keep the stack frame for the final addition, the performance would improve immensely. But not as much as the memoized example, of course :)
Since allocation is a major cost in any functional language, an important part of understanding performance is to understand when objects are allocated, how long they live, when they die, and when they are reclaimed. To get this information you need a heap profiler. It's an essential tool, and luckily GHC ships with a good one.
For more information, read Colin Runciman's papers.