I come from a Python background and I am currently learning Elixir. I can understand some of the concepts from Elixir with my knowledge from Python.[1]
In my mental model I understand Elixir's Enum to be what is called an Iterable in Python: it can be sliced, indexed, sorted, zipped, it can be enumerated (!), and so on. However, I spotted one big difference: in Python, strings comply with the Iterable protocol, but Elixir's Strings are not Enums. I guess there must be a deeper conceptual or computational reason behind this. Just looking at the specs of Enum, all of its functions would make sense (from a business logic point of view) when applied to a String. But in order to apply them, I always have to convert the String to an Enum of its graphemes.
So, my question is:
Why is the String data type not implemented as an Enum in the specific context of Elixir?
[1] For instance, both Elixir Stream and Process seem to share ideas with Python's Generator — a Stream is basically a Generator which does not allow sending values into it, and a Process is a full-blown Generator.
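To make the grapheme remark concrete on the Python side, here is a small sketch. It only shows that iterating a Python str yields code points rather than user-perceived characters (graphemes); whether this ambiguity is the reason behind Elixir's design is an assumption on my part, not something established above.

```python
import unicodedata

# "e" followed by U+0301 (combining acute accent):
# one visible character, but two code points.
decomposed = "e\u0301"

# Iterating a str walks code points, so the accent is a separate item.
items = list(decomposed)
print(len(items))  # 2

# Normalizing to NFC merges the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
print(len(composed))  # 1

# Both strings render the same, yet they iterate differently.
print(composed == decomposed)  # False
```

So "the elements of a string" is not a single obvious notion: it could mean bytes, code points, or graphemes, and each choice gives different results for slicing, reversing, and counting.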
What is a purely functional language? And what is a purely functional data structure?
I kind of know what a functional language is, but I don't know what "pure" means. Does anyone know about it?
Can someone explain it to me? Thanks!
When functional programmers refer to the concept of a pure function, they are referring to the concept of referential transparency.
Referential transparency is actually about value substitution without changing the behaviour of the program.
Consider some function that adds 2 to a number:
let add2 x = x + 2
Any call to add2 2 in the program can be substituted with the value 4 without changing any behaviour.
Now consider we throw a print into the mix:
let add2 x =
    print x
    x + 2
This function still returns the same result as the previous one but we can no longer do the value substitution without changing the program behaviour because add2 2 has the side-effect of printing 2 to the screen.
It is therefore not referentially transparent and thus an impure function.
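The same substitution test can be sketched in Python. This is only an illustration: add2_noisy and the log list are hypothetical names, and appending to the list stands in for printing so that the side effect can be inspected.

```python
def add2(x):
    """Pure: the result depends only on x and nothing else happens."""
    return x + 2


def add2_noisy(x, log):
    """Impure: same result, but it also records x in `log`
    (a stand-in for printing to the screen)."""
    log.append(x)
    return x + 2


# Pure case: replacing each call with its value changes nothing.
assert add2(2) + add2(2) == 4 + 4

# Impure case: the result is the same, but the calls leave a trace.
log = []
total = add2_noisy(2, log) + add2_noisy(2, log)
print(total)  # 8
print(log)    # [2, 2]

# Substituting the value 4 for each call would still give 8,
# but `log` would stay empty, so the program's behaviour differs.
```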
Now that we have a good definition of a pure function, we can define a purely functional language as a language where we work pretty much exclusively with pure functions.
Note: It is still possible to perform effects (such as printing to the console) in a purely functional language but this is done by treating the effect as a value that represents the action to be performed rather than as a side-effect within some function. These effect values are then composed into a larger set of program behaviour.
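A minimal sketch of "effects as values" in Python, under the assumption that a tiny interpreter at the edge of the program is acceptable. Print, greet and run are hypothetical names made up for this example; this is not how any particular language implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Print:
    """A description of a print action. Creating it prints nothing."""
    text: str


def greet(name):
    # Pure: it only builds and returns a list of effect values.
    return [Print(f"Hello, {name}"), Print("Goodbye")]


def run(effects, write=print):
    # The only impure part: it carries out the described actions.
    for effect in effects:
        write(effect.text)


program = greet("world")   # nothing has been printed yet
output = []
run(program, output.append)  # collect the output instead of printing
print(output)  # ['Hello, world', 'Goodbye']
```

Because greet returns plain values, two calls with the same argument are interchangeable, and the effects only happen when run is applied.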
A purely functional data structure is then simply a data structure that is designed to be used from a purely functional language.
Since mutating a data structure with a function would break this referential transparency property, we need to return a new data structure each time we e.g. add or remove elements.
There are particular types of data structures where we can do this efficiently, sharing lots of memory from prior copies: singly-linked lists and various tree based structures are the most common examples but there are many others.
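As a rough sketch of the memory sharing described above, here is an immutable singly-linked list in Python. Node, cons and to_list are names chosen for the example; "adding" an element builds one new node that points at the existing list instead of copying it.

```python
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """An immutable list cell: a value plus the rest of the list."""
    head: Any
    tail: Optional["Node"]


def cons(value, lst):
    """Return a new list with `value` in front; `lst` is left untouched."""
    return Node(value, lst)


def to_list(lst):
    """Convert to a Python list, for display only."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


xs = cons(2, cons(3, None))
ys = cons(1, xs)  # new list, one new node
zs = cons(0, xs)  # another new list, also one new node

print(to_list(xs))  # [2, 3]
print(to_list(ys))  # [1, 2, 3]
print(to_list(zs))  # [0, 2, 3]

# Both new lists share the very same cells of xs; nothing was copied.
print(ys.tail is xs and zs.tail is xs)  # True
```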
Most functional languages that are in use today are not pure in that they provide ways to interact with the real world. Long ago, for example, Haskell had a few pure variants.
Purely functional data = persistent data (i.e. immutable)
Pure function = given the same input, always produces the same output, and neither causes nor is affected by side effects.
I started teaching myself design patterns from Design Patterns by the Gang of Four, which says:
Parameterized types give us a third way (in addition to class
inheritance and object composition) to compose behavior in
object-oriented systems. Many designs can be implemented using any of
these three techniques. To parameterize a sorting routine by the
operation it uses to compare elements, we could make the comparison
an operation implemented by subclasses (an application of Template Method (325)),
the responsibility of an object that's passed to the sorting routine (Strategy (315)), or
an argument of a C++ template or Ada generic that specifies the name of the function to call to compare the elements.
I looked up the strategy pattern, but was still wondering how the second way "make the comparison the responsibility of an object that's passed to the sorting routine (Strategy)" is done?
I'd appreciate some example(s) in whichever OO language: C++, C#, Java, Python, ...
Thanks.
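One way the second option (Strategy) might look in Python: the sorting routine receives an object and delegates every comparison to it. Comparator, Ascending, ByLength and sort_with are hypothetical names for this sketch; functools.cmp_to_key is the standard-library helper that adapts a comparison function for sorted.

```python
from abc import ABC, abstractmethod
from functools import cmp_to_key


class Comparator(ABC):
    """The strategy interface: how to compare two elements."""

    @abstractmethod
    def compare(self, a, b) -> int:
        """Return a negative number, zero, or a positive number."""


class Ascending(Comparator):
    def compare(self, a, b):
        return (a > b) - (a < b)


class ByLength(Comparator):
    def compare(self, a, b):
        return len(a) - len(b)


def sort_with(items, comparator):
    """The sorting routine: it knows nothing about the ordering itself,
    it just asks the strategy object it was given."""
    return sorted(items, key=cmp_to_key(comparator.compare))


words = ["pear", "fig", "banana"]
alphabetical = sort_with(words, Ascending())
by_length = sort_with(words, ByLength())
print(alphabetical)  # ['banana', 'fig', 'pear']
print(by_length)     # ['fig', 'pear', 'banana']
```

The routine stays the same and only the object passed in changes. In everyday Python one would usually pass a key function directly; the classes are kept here to mirror the structure the book describes.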
This may sound like I'm begging to start a flame war, but hear me out.
In some languages laziness is expensive. For example, in Ruby, where I have the most recent experience, laziness is slow because it's achieved using fibers, so it's only attractive when:
you must trade off cpu for memory (think paging through large data set)
the performance penalty is worth it to hide details (yielding to fibers is a great way to abstract away complexity instead of passing down blocks to run in mysterious places)
Otherwise you'll definitely want to use the normal, eager methods.
My initial investigation suggests that the overhead for laziness in Elixir is much lower (this thread on reddit backs me up), so there seems little reason to ever use Enum instead of Stream for those things which Stream can do.
Is there something I'm missing? I assume Enum exists for a reason, and it implements some of the same functions as Stream. In what cases, if any, would I want to use Enum instead of Stream when I could use Stream?
For short lists, Stream will be slower than simply using Enum, but there's no clear rule there without benchmarking exactly what you are doing. There are also some functions that exist in Enum but don't have corresponding functions in Stream (for example, Enum.reverse).
The real reason you need both is that Stream is just a composition of functions. Every pipeline that needs results, rather than side effects, needs to end in an Enum function to get the pipeline to run.
They go hand in hand, Stream couldn't stand alone. What Stream is largely doing is giving you a very handy abstraction for creating very complex reduce functions.
The methods in Stream essentially create a "recipe list" of transformations over your data while the methods in Enum actually resolve these transformations. So you eventually will have to use an Enum function to resolve your data transformation even if everything else is a Stream.
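A rough Python analogy of the "recipe" idea, assuming lazy iterators (map, generator expressions) play the role of Stream and list plays the role of the Enum call that resolves the pipeline. This is not Elixir code and says nothing about Elixir's performance; the calls list is only there to show when the work happens.

```python
from itertools import islice

calls = []


def double(x):
    calls.append(x)  # record when the function actually runs
    return x * 2


# Building the pipeline only writes the recipe; nothing is computed yet.
recipe = (y for y in map(double, range(1, 1_000_000)) if y % 3 == 0)
print(calls)  # []

# Consuming the pipeline runs it, and only as far as needed.
first_three = list(islice(recipe, 3))
print(first_three)  # [6, 12, 18]
print(calls)        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Only nine of the nearly one million inputs were touched, which is the same reason a lazy pipeline suits large or infinite sources.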
Also some concepts, namely Reduce, have no real meaning in Stream and you must use Enum.
As for performance, if you have a series of transformations you're performing, a possibly infinite stream of data, or you're reading a file, use Stream. If you've just one transformation over a finite enumerable or you need to resolve a Stream, use Enum.
I saw a SO question yesterday about implementing a classic linked list in Java. It was clearly an assignment from an undergraduate data structures class. It's easy to find questions and implementations for lists, trees, etc. in all languages.
I've been learning about Java lambdas and trying to use them at every opportunity to get the idiom under my fingers. This question made me wonder: How would I write a custom list or tree so I could use it in all the Java 8 lambda machinery?
All the examples I see use the built in collections. Those work for me. I'm more curious about how a professor teaching data structures ought to rethink their techniques to reflect lambdas and functional programming.
I started with an Iterator, but it doesn't appear to be fully featured.
Does anyone have any advice?
Exposing a stream view of arbitrary data structures is pretty easy. The key interface you have to implement is Spliterator, which, as the name suggests, combines two things -- sequential element access (iteration) and decomposition (splitting).
Once you have a Spliterator, you can turn that into a stream easily with StreamSupport.stream(). In fact, here's the stream() method from Collection (a default method that most collections just inherit):
default Stream<E> stream() {
    return StreamSupport.stream(spliterator(), false);
}
All the real work is in the spliterator() method, and there's a broad range of spliterator quality. The absolute minimum you need to implement is tryAdvance, but if that's all you implement, the stream will work sequentially and lose out on most of the stream optimizations. Look in the JDK sources (Arrays.stream(), IntStream.range()) for examples of how to do better.
I'd look at http://www.javaslang.io for inspiration, a library that does exactly what you want to do: Implement custom lists, trees, etc. in a Java 8 manner.
It specifically doesn't closely couple with the JDK collections outside of importing/exporting methods, but re-implements all the immutable collection semantics that a Scala (or other FP language) developer would expect.
I've been working with Rust the past few days to build a new library (related to abstract algebra) and I'm struggling with some of the best practices of the language. For example, I implemented a longest common subsequence function taking &[&T] for the sequences. I figured this was Rust convention, as it avoided copying the data (T, which may not be easily copy-able, or may be big). When changing my algorithm to work with simpler &[T]'s, which I needed elsewhere in my code, I was forced to put the Copy type constraint in, since it needed to copy the T's and not just copy a reference.
So my higher-level question is: what are the best-practices for passing data between threads and structures in long-running processes, such as a server that responds to queries requiring big data crunching? Any specificity at all would be extremely helpful as I've found very little. Do you generally want to pass parameters by reference? Do you generally want to avoid returning references as I read in the Rust book? Is it better to work with &[&T] or &[T] or Vec<T> or Vec<&T>, and why? Is it better to return a Box<T> or a T? I realize the word "better" here is considerably ill-defined, but hope you'll understand my meaning -- what pitfalls should I consider when defining functions and structures to avoid realizing my stupidity later and having to refactor everything?
Perhaps another way to put it is, what "algorithm" should my brain follow to determine where I should use references vs. boxes vs. plain types, as well as slices vs. arrays vs. vectors? I hesitate to start using references and Box<T> returns everywhere, as I think that'd get me a sort of "Java in Rust" effect, and that's not what I'm going for!