Mutability in functional programming - algorithm

First I am a Haskell newbie.
I've read this:
Immutable functional objects in highly mutable domain
And my question is nearly the same -- how to efficiently write algorithms where the state is supposed to change. Let's take for example Dijkstra's algorithm. There will be new paths found and distances should be updated. And in traditional languages this is simple while in Haskell for example I can only think of creating entirely new distances which will be too slow and memory consuming. Are there something like design patterns for such cases where one should implement algorithm with mutable data structure and speed and memory usage are main concerns?

There are of course many ways functional languages address this issue.
Different data structures - many data structures can be implemented in a purely functional manner, with the same algorithmic complexity as imperative versions. Probably the most well-known work in this area is Chris Okasaki's Purely Functional Data Structures, but there are many other resources as well. For Dijkstra's algorithm, Martin Erwig's work on functional graphs is appropriate. See this question as well.
Different algorithms - some algorithms have assumptions of mutability built-in, Quicksort is an example of this. In this case an alternative algorithm can be used that's more amenable to immutability.
Mutable state - every functional language can model functional state with a State monad. Most provide other forms of mutability as well, such as Haskell's ST monad and IORef's.

The ST Monad lets you use mutable state internally, but present a pure external interface.

Creating new immutable objects isn't nearly as expense as you might think, since large amounts of structural sharing can occur because the compiler KNOWS they can't change and thus can be safely shared. That said, using highly imperative algorithms with lots of mutable state in Haskell is a bit of a code smell.

In ML derivatives (such as OCaml, SML, F#), there are "references", which can be used as mutable variables.
In Haskell, this isn't cleanly handled. State is simply not covered by the usual "purely functional" style. Pure FP languages deal with "eternal truths", and are thus not very suitable for working with "ephemeral truths" (although it can be done, definitely).
However, yes, sometimes we need mutable state. A language such as ATS incorporates linear types for handling destructive updates and safe resource manipulation.

Related

Is Data-Structure and Algorithm same for all programming languages?

If a person learns data-structure and algorithm in one programming language does it needs to learn other language's data-structure and algorithm ?
As i am about to start a book Data-structure and algorithm in JavaScript as i also want to learn Web
will it help me for other languages too?
Data structures and algorithms are concepts that are independent of language. Therefore once you master them in you favorite language it's relatively easy to switch to another one.
Now if you're asking about built-in methods and apis that different languages have, they do differ, but you shouldn't learn specific APIs in your data-structure and algorithms book anyways.
Yes... and no.
While the concepts behind algorithms and data structures, like space and time complexity or mutability, are language agnostic, some languages might not let you implement some of those patterns at all.
A good example would be a recursive algorithm. In some languages like haskell, recursivity is the best way to iterate over a collection of element. In other languages like C, you should avoid using recursive algorithm on unbound collections as you can easily hit the dept limit of the stack. You could also easily imagine a language that is not even stack based and in which a recursive algorithm would be completely impossible to implement. You could implement a stack on top of such a language but it would most definitely be slower than implementing the algorithm in a different fashion.
An other example would be object oriented data structures. Some languages like haskell do not let you change values. All elements in such language are immutable and must be copied to be changed. This is analog to how numbers are handled in javascript where you cannot change the value 2, but you can take the value 2, add 1 to it and then store it to a new location. Other languages like C do not have (or very poorly handle) object oriented programming. This will break down most data structure pattern you will learn about in a javascript oriented book.
In the end, it all boils down to performance. You don't write C code like you write JavaScript or F# code. They all have their quirks and thus need different implementations even though the idea behind those algorithms and structures will stay the same. You can always emulate a pattern on a language that does not supports it, like OOP in C, but it will always feel more natural to solve the problem in a different way.
That being said, as long as you stay within the same kind of language, you can probably reuse 80%+ of that book. There are many OOP languages out there. Javascript is probably the most exotic of them all with its ability to treat all objects like dictionaries and its weird concept of "this" so a lot of patterns in there will not apply in those other languages.
You need not learn data structure and algorithm when you use another language.The reason is evident, all of data structures and algorithm is a realization of some kinds of "mathmatical or logical thought".
for example,if you learn the sort algorithm, you must hear about the quick sort and merge sort and any others, the realization of different sort algotithm is based on fundamental element that almost every language has,such as varible,arrays,loop and so on. i mean you can realize it without using the language(such as JavaScript) characteristics.
although it is nothing relevant to language,i still suggest you stduy it with C.The reason is that C is a lower high-level language which means it is near the operating system.And the algorithm you write with C is faster than Java or Python.(Most cases).And C don't have so many characteristic like c++ stl or java collection. In C++ or Java, it realize hashmap itself.If you are a green hand to data structure, you'd better realize it from 0 to 1 yourself rather directly use other "tools" to lazy.
The data structure and algorithm as concepts are the same across languages, the implementation however varies greatly.
Just look at the implementation of quicksort in an imperative language like C and in a functional language like Haskell. This particular algorithm is a poster boy for functional languages as it can be implemented in just about two lines (and people are particularly fond of stressing that's it).
There is a finer point. Depending on the language, many data structures and algorithms needn't be implemented explicitly at all, other than as an academic exercise to learn about them. For example, in Python you don't really need to implement an associative container whereas in C++ you need to.
If it helps, think of using DS and algo in different programming languages as narrating a story in multiple human languages. The plot elements remain the same but the ease of expression, or the lack thereof, varies greatly depending on the language used to narrate it.
(DSA) Data structures and algorithms are like emotions in humans (in all humans emotions are same like happy, sad etc)
and, programming languages are like different languages that humans speak (like spanish, english, german, arabic etc)
all humans have same emotions (DSA) but when they express them in different languages (programming languages) , the way of expressing (implementation) of these emotions (DSA) are different.
so when you switch to using different or new language, just have a look at how those DSA are implemented in that languages.

Is there a language that takes advantage of massively parallel computers?

There's a company that have/are developing a very parallel computer called Parallella. It looks like it has lots of potential, but it runs some C style language.
Q. Has anyone written a language specifically to take advantage of massively parallel computers like this?
Clause 1. It has to be a managed garbage collected language.
Clause 2. It has to make it very easy to write parallel code without requiring the developer to look after low-level locking.
Clause 3. Bonus points for functional languages.
Clause 4. Super bonus points for languages with lambdas.
There are a definitely languages that have been designed to deal with the rising popularity of parallel computing. Parallel processors have sky rocketed in popularity since the death of Moore's Law. Support for better parallel computing in programming languages has followed quickly in its path.
My personal recommendation would be either Haskell or Clojure. Both are functional languages which have made great strides in parallel and concurrent computing leveraging their functional nature to gain advantages. Haskell has a really nice book called Parallel and Concurrent Programming in Haskell by Simon Marlow. I've read it and it's excellent. Clojure has also been built from the ground up with concurrency in mind. An interesting new player in this space is Julia, but I can't say I know much about it at all.
As for clause 1, I don't know what a managed language means. EDIT: What you're calling a managed language is more commonly called garbage collected language. You might want to use that term to help get more effective answers. Also all the languages I recommended have garbage collection.
As for clause 2, Haskell definitely makes parallel computing fairly automatic without any worrying about low level concepts or locking. There is a simple function called 'par' which allows the programmer to annotate two computations to be executed in parallel. The semantics guarantee that the expressions be evaluated when they're necessary and since the computations are functional they are guaranteed not to interact in non-thread-safe ways.
As for clause 3, you're on the right track to be looking for a functional language. Functional subcomputations have automatic thread safety which pays big dividends when it comes to ensuring parallel execution doesn't cause problems. It can't cause any if the computations are functional.
As for clause 4, good luck finding a functional language that doesn't have lambda ;) EDIT: It's not, strictly speaking, part of the definition of a functional language because there is no formal definition for what a functional programing language is. Informally I think a lot of people would mention it as one of the most important features. Concatenative languages or languages that are based on tacit programming (aka point-free style) can be functional and get away with not having lambda. I wouldn't be surprised if the K language didn't have lambda despite being functional. Also, I know for sure combinatory logic (which is the basis for K) does not have lambda. Though combinatory logic is just a theoretical basis and not a practical programming language.

Is there a functional algorithm which is faster than an imperative one?

I'm searching for an algorithm (or an argument of such an algorithm) in functional style which is faster than an imperative one.
I like functional code because it's expressive and mostly easier to read than it's imperative pendants. But I also know that this expressiveness can cost runtime overhead. Not always due to techniques like tail recursion - but often they are slower.
While programming I don't think about runtime costs of functional code because nowadays PCs are very fast and development time is more expensive than runtime. Furthermore for me readability is more important than performance. Nevertheless my programs are fast enough so I rarely need to solve a problem in an imperative way.
There are some algorithms which in practice should be implemented in an imperative style (like sorting algorithms) otherwise in most cases they are too slow or requires lots of memory.
In contrast due to techniques like pattern matching a whole program like a parser written in an functional language may be much faster than one written in an imperative language because of the possibility of compilers to optimize the code.
But are there any algorithms which are faster in a functional style or are there possibilities to setting up arguments of such an algorithm?
A simple reasoning. I don't vouch for terminology, but it seems to make sense.
A functional program, to be executed, will need to be transformed into some set of machine instructions.
All machines (I've heard of) are imperative.
Thus, for every functional program, there's an imperative program (roughly speaking, in assembler language), equivalent to it.
So, you'll probably have to be satisfied with 'expressiveness', until we get 'functional computers'.
The short answer:
Anything that can be easily made parallel because it's free of side-effects will be quicker on a multi-core processor.
QuickSort, for example, scales up quite nicely when used with immutable collections: http://en.wikipedia.org/wiki/Quicksort#Parallelization
All else being equal, if you have two algorithms that can reasonably be described as equivalent, except that one uses pure functions on immutable data, while the second relies on in-place mutations, then the first algorithm will scale up to multiple cores with ease.
It may even be the case that your programming language can perform this optimization for you, as with the scalaCL plugin that will compile code to run on your GPU. (I'm wondering now if SIMD instructions make this a "functional" processor)
So given parallel hardware, the first algorithm will perform better, and the more cores you have, the bigger the difference will be.
FWIW there are Purely functional data structures, which benefit from functional programming.
There's also a nice book on Purely Functional Data Structures by Chris Okasaki, which presents data structures from the point of view of functional languages.
Another interesting article Announcing Intel Concurrent Collections for Haskell 0.1, about parallel programming, they note:
Well, it happens that the CnC notion
of a step is a pure function. A step
does nothing but read its inputs and
produce tags and items as output. This
design was chosen to bring CnC to that
elusive but wonderful place called
deterministic parallelism. The
decision had nothing to do with
language preferences. (And indeed, the
primary CnC implementations are for
C++ and Java.)
Yet what a great match Haskell and CnC
would make! Haskell is the only major
language where we can (1) enforce that
steps be pure, and (2) directly
recognize (and leverage!) the fact
that both steps and graph executions
are pure.
Add to that the fact that Haskell is
wonderfully extensible and thus the
CnC "library" can feel almost like a
domain-specific language.
It doesn't say about performance – they promise to discuss some of the implementation details and performance in future posts, – but Haskell with its "pureness" fits nicely into parallel programming.
One could argue that all programs boil down to machine code.
So, if I dis-assemble the machine code (of an imperative program) and tweak the assembler, I could perhaps end up with a faster program. Or I could come up with an "assembler algorithm" that exploits some specific CPU feature, and therefor it really is faster than the imperative language version.
Does this situation lead to the conclusion that we should use assembler everywhere? No, we decided to use imperative languages because they are less cumbersome. We write pieces in assembler because we really need to.
Ideally we should also use FP algorithms because they are less cumbersome to code, and use imperative code when we really need to.
Well, I guess you meant to ask if there is an implementation of an algorithm in functional programming language that is faster than another implementation of the same algorithm but in an imperative language. By "faster" I mean that it performs better in terms of execution time or memory footprint on some inputs according to some measurement that we deem trustworthy.
I do not exclude this possibility. :)
To elaborate on Yasir Arsanukaev's answer, purely functional data structures can be faster than mutable data structures in some situations becuase they share pieces of their structure. Thus in places where you might have to copy a whole array or list in an imperative language, where you can get away with a fraction of the copying because you can change (and copy) only a small part of the data structure. Lists in functional languages are like this -- multiple lists can share the same tail since nothing can be modified. (This can be done in imperative languages, but usually isn't, because within the imperative paradigm, people aren't usually used to talking about immutable data.)
Also, lazy evaluation in functional languages (particularly Haskell which is lazy by default) can also be very advantageous because it can eliminate code execution when the code's results won't actually be used. (One can be very careful not to run this code in the first place in imperative languages, however.)

OOP vs PP for algorithms

Which paradigm is better for design and analysis of algorithms?
Which is faster? Because I have a subject called Design and Analysis of Algorithms in university and have a time limit for programs. Is OOP slower than Procedure programming? Or the time difference is not big?
Object-Oriented programming isn't particularly relevant to algorithms. Procedural programming you will need, but as far as algorithms are concerned, object-oriented programming is just another way to package up procedural programming. You have methods instead of functions and classes instead of records/structs, but the only relevant difference is run-time dispatch, and that's just a declarative way to handle a run-time decision that could have been handled some other way.
Object-Oriented programming is more relevant to the larger scale - design patterns etc - whereas algorithms are more relevant to the smaller scale involving a small number (often just one) of procedures.
IMO algorithms exist separat from the OO or PP issue.
Neither OO or PP are 'slow', in either design-time or program performance, they are different approaches.
I would think that Functional Programming would produce cleaner implementation of algorithms.
Having said that, you shouldn't see much of a difference whatever approach you take. An algorithm can be expressed in any language or development paradigm.
Update: (following comments)
Apparently functional programming does not lend itself to implementing algorithms as well as I thought it may. It has other strengths and I mostly mentioned it for completeness sake, as the question only mentioned OOP (object oriented programming) and PP (procedural programming).
the weak link is liekly to be your knowledge - what language & paradigm are you most comfortable with. use that
For design, analysis and development: definitely OOP. It was invented solemnly for the benefit of designers and developers.
For program runtime execution: sometimes PP is more efficient, but often OOP gets reduced to plain PP by the compiler, making them equivalent.
The difference (in execution time) is marginal at best.
Note that there is a more important factor than sheer performance: OOP provide the programmer with better means to organize his code which results in programs that are well structured, understandable, and more reliable (less bugs).
Object oriented programming abstracts many low level details from the programmer. It is designed with the goal
to make it easier to write and read (and understand) programs
to make programs look closer to the real world (and hence, easier to understand).
Procedural programming does not have many abstractions like objects, methods, virtual functions etc.
So, talking about speed: a seasoned expert who knows the internals of how an object oriented system will work can write a program that runs just as fast.
That being said, the speed advantage achieved by use PP over OOP will be very marginal. It boils down to which way you can write programs comfortably.
EDIT:
An interesting anecdote comes to my mind: in the Microsoft Foundation Classes, message passing from one object to the other was implemented using macros that looked like BEGIN_MESSAGE_MAP() and END_MESSAGE_MAP(), and the reason was that it was faster than using virtual functions.
This is one case where the library developers have used OOP, but have knowingly sidestepped a performance bottleneck.
My guess is that the difference is not big enough to worry about, and the time limit should allow using a slower language, since the algorithm used would be what's important.
The purpose of the time limit should IMO be to get you to avoid using for example a O(n3) algorithm when there is a O(n log n)
To make writing code easy and less error prone, you need a language that supports Generics - such as C++ with STL or Java with the Java Collections Framework. If you are implementing an algorithm against a deadline, you may be able to save time by not providing your algorithm with a nice O-O or Generic interface, so making the code you write yourself entirely procedural.
For run time efficiency, you would probably be best writing everything in procedural C - see e.g. the examples in "The Practice Of Programming" - but it will take a lot longer to write, and you are more likely to make mistakes. This also assumes that all the building blocks you need are available in their most up to date and efficient from in procedural C as well, which is quite an assumption these days. Most likely making use of the STL or the JFC will in practice save you cpu time as well as development time.
As for functional languages, I remember hearing functional programming enthusiasts point out how much easier to use their languages were than the competition, and then observing that those members of the class who chose a functional language were still struggling when those who wrote in Fortran 77 had finished and gone on to draw graphs of the performance of their program. I see that the claims of the functional programming community have not changed. I do not know if the underlying reality has.
Steve314 said it well. OOP is more about the design patterns and organization of large applications. It also lets you deal with unknowns better, which is great for user apps. However, for analyzing algorithms, most likely you are going to be thinking functionally about what you want to do. In that case, I'd stick to more simple PP and not try to create a fully OO design, when you care about the algorithm. I'd want to work with C or Matlab (depending on how math intensive the algorithm is). Just my opinion on it.
I once adapted the Knuth-Morris-Pratt string search algorithm so that I could have an object that would take a character at a time and return a match/no-match status. It wasn't a straight-forward translation.

Can any algorithmic puzzle be implemented in a purely functional way?

I've been contemplating programming language designs, and from the definition of Declarative Programming on Wikipedia:
This is in contrast from imperative programming, which requires a detailed description of the algorithm to be run.
and further down:
... Any style of programming that is not imperative. ...
It then goes on to express that functional languages, because they are not imperative, are declarative by their very nature.
However, this makes me wonder, are purely functional programming languages able to solve any algorithmic problem, or are the constraints based upon what functions are available in that language?
I'm mostly interested in general thoughts on the subject, although if specific examples can illustrate the point, I certainly welcome them.
According to the Church-Turing Thesis ,
the three computational processes (recursion, λ-calculus, and Turing machine) were shown to be equivalent"
where Turing machine can be read as "procedural" and lambda calculus as "functional".
Yes, Haskell, Erlang, etc. are Turing complete languages. In principle, you don't need mutable state to solve a problem, since you can always create a new object instead of mutating the old one. Of course, Brainfuck is also Turing complete. In other words, just because an algorithm can be expressed in a functional language doesn't mean it's not horribly awkward.
OK, so Church and Turing provied it is possible, but how do we actually do something?
Rewriting imperative code in pure functional style is an exercise I frequently assign to undergraduate students:
Each mutable variable becomes a function parameter
Loops are rewritten using recursion
Each goto is expressed as a function call with arguments
Sometimes what comes out is a mess, but often the results are surprisingly elegant. The only real trick is not to pass arguments that never change, but instead to let-bind them in the outer environment.
The big difference with functional style programming is that it avoids mutable state. Where imperative programming will typically update variables, functional programming will define new, read-only values.
The main place where this will hit performance is with algorithms that use updatable arrays. An imperative implementation can update an array element in O(1) time, while the best a purely functional style of implementation can achieve is O(log N) (using a sorted tree).
Note that functional languages generally have some way to use updateable arrays with O(1) access time (e.g., Haskell provides this with its state transformer monad). However, this is arguably an imperative programming method... nothing wrong with that; you want to use the best tools for a particular job, after all.
The functional style of O(log N) incremental array update is not all bad, though, as functional style algorithms seem to lend themselves well to parallellization.
Too long to be posted as a comment on #SteveB's answer.
Functional programming and imperative programming have equal capability: whatever one can do, the other can do. They are said to be Turing complete. The functions that a Turing machine can compute are exactly the ones that recursive function theory and λ-calculus express.
But the Church-Turing Thesis, as such, is irrelevant. It asserts that any computation can be carried out by a Turing machine. This relates an informal idea - computation - to a formal one - the Turing machine. Nobody has yet found anything we would recognise as computation that a Turing machine can't do. Will someone find such a thing in future? Who can tell.
Using state monads you can program in an imperative style in Haskell.
So the assertion that Haskell is declarative by its very nature needs to be taken with a grain of salt. On the positive side it then is equivalent to imperative programming languages, also in a practical sense which doesn't completely ignore efficiency.
While I completely agree with the answer that invokes Church-Turing thesis, this begs an interesting question actually. If I have a parallel computation problem (which is not algorithmic in a strict mathematical sense), such as multiple producer/consumer queue or some network protocol between several machines, can this be adequately modeled by Turing machine? It can be simulated, but if we simulate it, we lose the purpose why we have the parallelism in the problem (because then we can find simpler algorithm on the Turing machine). So what if we were not to lose parallelism inherent to the problem (and thus the reason why are we interested in it), we couldn't remove the notion of state?
I remember reading somewhere that there are problems which are provably harder when solved in a purely functional manner, but I can't seem to find the reference.
As noted above, the primary problem is array updates. While the compiler may use a mutable array under the hood in some conditions, it must be guaranteed that only one reference to the array exists in the entire program.
Not only is this a hard mathematical fact, it is also a problem in practice, if you don't use impure constructs.
On a more subjective note, stating that all Turing complete languages are equivalent is only true in a narrow mathematical sense. Paul Graham explores the issue in Beating the Averages in the section "The Blub Paradox."
Formal results such as Turing-completeness may be provably correct, but they are not necessarily useful. The travelling salesman problem may be NP-complete, and yet salesman travel all the time. It seems they don't feel the need to follow an "optimal" path, so the theorem is irrelevant.
NOTE: I am not trying to bash functional programming, since I really like it. It is just important to remember that it is not a panacea.

Resources