Right rotate of a tree in Haskell: how does it work?

I don't know Haskell syntax, but I know some FP concepts (like algebraic data types, pattern matching, higher-order functions etc.).
Can someone explain please, what does this code mean:
data Tree ? = Leaf ? | Fork ? (Tree ?) (Tree ?)
rotateR tree = case tree of
Fork q (Fork p a b) c -> Fork p a (Fork q b c)
As I understand it, the first line is something like a Tree type declaration (but I don't understand it exactly). The second line involves pattern matching (I also don't understand why we need pattern matching here). And the third line does something absolutely unreadable for a non-Haskell developer. I've found a definition of fork as fork (f,g) x = (f x, g x), but I can't get any further.

First of all the data type definition should not contain question marks, but normal letters:
data Tree a = Leaf a | Fork a (Tree a) (Tree a)
It defines a type Tree that contains elements of some otherwise unspecified type a.
The tree is either a Leaf, containing an element of type a, or it is a Fork, containing also an element of type a and two subtrees. The subtrees are Tree structures that contain elements of type a.
Important to note is that Haskell uses parentheses purely for grouping, as in 2 * (2+3), not for function calls. To call a function, the arguments are simply written after the function name, separated by spaces, as in sin 30 or compare "abc" "abd".
In the case statement, the part to the left of -> is a pattern match; the part to the right is the function's result in case the tree actually has the form specified on the left. The pattern Fork q (Fork p a b) c matches if the tree is a Fork (that's the Fork from the data type definition) and its first subtree is another Fork. The lowercase letters are all just variables, capturing the different parts of the matched tree structure. So p would be the element contained in the subtree, a would be the subtree's first branch and b its second one.
The right side of the ->, Fork p a (Fork q b c), now builds a new tree from the parts matched in the pattern. The lowercase variables are all the tree parts matched on the left, and the Forks are the constructors from the data type definition. It builds a tree that is a Fork and whose second subtree is also a Fork (the part in parentheses). The remaining pieces are just the parts of the tree that were "dissolved" on the left side.
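Putting the pieces together, here is a small runnable sketch. The Show instance and the catch-all case are my additions (the original snippet's case is non-exhaustive); everything else is the code from the question:

```haskell
-- Sketch of the complete program; deriving Show and the catch-all
-- case are additions not present in the original snippet.
data Tree a = Leaf a | Fork a (Tree a) (Tree a)
  deriving Show

rotateR :: Tree a -> Tree a
rotateR tree = case tree of
  Fork q (Fork p a b) c -> Fork p a (Fork q b c)
  other                 -> other  -- nothing to rotate, return unchanged

main :: IO ()
main = print (rotateR (Fork 2 (Fork 1 (Leaf 0) (Leaf 10)) (Leaf 3)))
-- prints: Fork 1 (Leaf 0) (Fork 2 (Leaf 10) (Leaf 3))
```

In the example, q = 2, p = 1, a = Leaf 0, b = Leaf 10 and c = Leaf 3: the left child moves up to become the new root, exactly as the pattern and the rebuilt tree describe.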

I think you misunderstand Fork. It is not a function, but a constructor for type Tree. It is essentially a node in the Tree data structure... Each node in Tree is either a Leaf (with a value) or a Fork (with a value and two sub-nodes).
Pattern matching is used to transform the structure. My ASCII art is not good enough to give you a drawing, but it sort-of moves 'left nodes' up and 'right nodes' down.
Note: I say you may be misunderstanding Fork, because fork (f,g) x = (f x, g x) is something completely different. It is a higher order function in this case and has nothing to do with your Tree structure.
Hope that helps :),
Carl

Related

Prolog pathfinder

I'm interested in pathfinding and found a cool example on a site from 2011, but there isn't any explanation of the code. I understand what it does, but I don't understand how the steps work. So, for example, these are the edges:
edge(1,2).
edge(1,4).
edge(2,4).
edge(3,6).
edge(3,7).
edge(4,3).
edge(4,5).
edge(5,6).
edge(5,7).
edge(6,5).
edge(7,5).
edge(8,6).
edge(8,7).
and with this I can tell if there is a path between two nodes:
path(X,Y,[X,Y]):- edge(X,Y).
path(X,Y,[X|Xs]):- edge(X,W), path(W,Y,Xs).
The output is something like this:
path(1,6,Xs).
Xs = [1,2,4,5,6];
Xs = [1,4,5,6];
...
But how does it exactly work?
What does [X,Y] do in the first line and what happens in the second?
The crucial thing to understand in this example is how recursive predicates work. First of all, recursion always needs a recursion step (the recursive use of the current predicate) and a recursion anchor (the step where the recursion stops). The resolution algorithm is a depth-first search, and wherever there are multiple options to choose from (i.e., a ; or different rules or facts with the same signature), the interpreter chooses from top to bottom and from left to right. To avoid infinite evaluations, the recursion anchor needs to be at the top, as it is here, and the recursion step should be at the end of the second rule's body.
In the above example, the recursion stops when there is a direct edge between X and Y, because that's where the path ends. Keep in mind that the rules are implications from right to left. As the third parameter is an output argument (the result you want to get), it needs to be initialized in the anchor. [X,Y] does that by starting the result with a list containing the last two elements of the path. The rule is equivalent to the following:
path(X,Y,Result):- edge(X,Y), Result = [X,Y].
The second rule aims to find intermediate path elements: It assumes there is an edge(X,W) to an intermediate element W, and then a path from W to Y. The interpreter will try every edge from X to possible Ws. If there exists a path from a W to Y, there also is a path from X to Y, and the second rule becomes true for that step. The result of the recursive use to the predicate (the path list in the third parameter) will be Xs. So all that needs to be done in the current step is to add the X to the result list ([X|Xs]). Again, that is equivalent to:
path(X,Y,Result):- edge(X,W), path(W,Y,Xs), Result=[X|Xs].
Long story short: the resulting list is started with the last two elements in the recursion anchor, then gets passed backwards through all recursive steps, and each step adds its current X to the front of the list.
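To make the unwinding concrete, here is a hand-expansion (as comments) of the derivation of one of the reported solutions, Xs = [1,2,4,5,6]:

```prolog
% Query: path(1,6,Xs)
%   anchor fails: there is no fact edge(1,6)
%   rule 2: edge(1,2) holds, so solve path(2,6,Xs1) with Xs = [1|Xs1]
%     rule 2: edge(2,4) holds, so solve path(4,6,Xs2) with Xs1 = [2|Xs2]
%       rule 2: edge(4,5) holds, so solve path(5,6,Xs3) with Xs2 = [4|Xs3]
%         anchor: edge(5,6) holds, so Xs3 = [5,6]   (recursion stops)
% Unwinding: Xs2 = [4,5,6], Xs1 = [2,4,5,6], Xs = [1,2,4,5,6]
```

On backtracking, other choices of edge(X,W) at each step yield the remaining solutions.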
Of course, recursion can still be infinite when there are cycles in the data (and paths), as in this example. If you want to avoid such cycles (and likely unwanted solutions such as paths where elements appear multiple times), you can keep track of the elements already visited:
path(X,Y,[X,Y],V):- \+member(X,V),\+member(Y,V),edge(X,Y).
path(X,Y,[X|Xs],V):- \+member(X,V),edge(X,W), path(W,Y,Xs,[X|V]).
In this solution, the additional fourth parameter collects the items already visited. With \+member(X,V) you can check whether the current X is already contained in V. There are other ways this can be implemented, for example by just using V as the result and reversing it in the anchor. V needs to be initialized to the empty list in the query:
?- path(1,6,R,[]).
R = [1, 2, 4, 3, 6] ;
R = [1, 2, 4, 3, 7, 5, 6] ;
...

Recursively finding sum in prolog

I'm trying to use Prolog to find a sum of a path using recursion. I need to pass a list of nodes of a graph, then have it return the sum of their weights.
This is what I've tried but I'm not sure if I'm on the right track.
connect(a,b,5).
connect(b,c,8).
connect(a,d,10).
connect(d,e,6).
connect(d,f,11).
connect(d,g,4).
connect(b,d,2).
connect(b,e,9).
connect(c,d,4).
connect(c,f,5).
connect(e,g,2).
connect(f,g,1).
list_sum([], 0).
list_sum([Head | Tail], TotalSum) :-
list_sum(connect(Head,Tail,X), Sum1),
TotalSum is Head + Sum1.
Example goal:
list_sum([a,b,c],Sum).
Sum = 13
I see three problems with your code. The first is that you have a logic variable X that you are not using; the second is that your predicate list_sum takes a list as its first argument and yet you are giving it a term connect(Head,Tail,X); the third is that you are using Head in an addition whereas Head is apparently an atom, not an integer (maybe you meant X here); the fourth (I'm finding them as I go) is that the second argument of the predicate connect is an atom (representing a node, in this case) and you are giving it a list.
And a fifth problem with your question: you seem to think that the weights are on the nodes where they are clearly on the edges.
So I think your assignment's question is twofold:
Check that the path given to you is actually a path (in that there is a connection between each element and the next)
If it is indeed a path, sum the weights of your connections along the way.
In Prolog, the core programming artifacts are predicates, not functions. So to get the weight of a given link, you call connect(Head, NextNode, Weight), which gives you (through unification) both the Weight and a possible NextNode; the recursive call then checks whether that NextNode is indeed the next element in the list. After the recursive call, use Weight instead of Head in the addition, and you should be a bit closer to the solution.
PS: Feel free to create a list_sum_aux/3 and use it instead.
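For reference, a completed version along those lines might look like this. The base case (a single-node path has weight 0) and the clause shape are my choices, not part of the original assignment:

```prolog
% A path of a single node has weight 0; otherwise the first two
% nodes must be connected, and that edge's weight is added to the
% sum of the rest of the path.
list_sum([_], 0).
list_sum([X,Y|Rest], TotalSum) :-
    connect(X, Y, Weight),
    list_sum([Y|Rest], Sum1),
    TotalSum is Weight + Sum1.
```

With the connect/3 facts above, the example goal list_sum([a,b,c],Sum) then resolves as 5 + 8, giving Sum = 13, and the goal simply fails when the list is not a valid path.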

List Recursive Predicates, advanced

Just going through some tutorials, I found some advanced recursive procedures like flatten. I have tried to google for similar examples that involve multiple recursions (on both head and tail) but could not get the results I required.
Could you suggest some predicates or tutorials that cover advanced list recursion (on both head and tail)?
Just to expand a bit on what @hardmath is saying, let's look at the definition of lists:
Base case: []
Inductive case: [Head|Tail]
What makes this a recursive data structure is that Tail is also a list. So when you see [1,2,3], you're also seeing [1|[2|[3|[]]]]. Let's prove it:
?- X = [1|[2|[3|[]]]].
X = [1, 2, 3].
So more "advanced" forms of recursion are forms that either involve more complex recursive data types or more complex computations. The next recursive data type most people are exposed to are binary trees, and binary trees have the nice property that they have two branches per node, so let's look at trees for a second.
First we need a nice definition like the definition from lists. I propose the following:
Base case: empty
Inductive case: tree(LeftBranch, Value, RightBranch)
Now let's create some example trees just to get a feel for how they look:
% this is like the empty list: no data
empty
% this is your basic tree of one node
tree(empty, 1, empty)
% this is a tree with two nodes
tree(tree(empty, 1, empty), 2, empty).
Structurally, the last example there would probably look something like this:
  2
 /
1
Now let's make a fuller example with several levels. Let's build this tree:
10
/ \
5 9
/ \ / \
4 6 7 14
In our Prolog syntax it's going to look like this:
tree(tree(tree(empty, 4, empty), 5, tree(empty, 6, empty)),
     10,
     tree(tree(empty, 7, empty), 9, tree(empty, 14, empty)))
The first thing we're going to want is a way to add up the size of the tree. Like with lists, we need to consider our base case and then our inductive cases.
% base case
tree_size(empty, 0).
% inductive case
tree_size(tree(Left, _, Right), Size) :-
    tree_size(Left, LeftSize),
    tree_size(Right, RightSize),
    Size is LeftSize + RightSize + 1.
For comparison, let's look at list length:
% base case
length([], 0).
% inductive case
length([_|Rest], Length) :-
    length(Rest, LengthOfRest),
    Length is LengthOfRest + 1.
Edit: @false points out that though the above is intuitive, a version with better logical properties can be produced by changing the inductive case to:
length([_|Rest], Length) :-
    length(Rest, LengthOfRest),
    succ(LengthOfRest, Length).
So you can see the hallmarks of recursively processing data structures clearly by comparing these two:
You are given a recursive data structure, defined in terms of base cases and inductive cases.
You write the base of your rule to handle the base case.
This step is usually obvious; in the case of length or size, your data structure will have a base case that is empty so you just have to associate zero with that case.
You write the inductive step of your rule.
The inductive step takes the recursive case of the data structure, handles whatever that case adds, and combines that with the result of recursively calling your rule to process "the rest" of the data structure.
Because lists are only recursive in one direction there's only one recursive call in most list processing rules. Because trees have two branches there can be one or two depending on whether you need to process the whole tree or just go down one path. Both lists and trees effectively have two "constructors," so most rules will have two bodies, one to handle the empty case and one to handle the inductive case. More complex structures, such as language grammars, can have more than two basic patterns, and usually you'll either process all of them separately or you'll just be seeking out one pattern in particular.
As an exercise, you may want to try writing search, insert, height, balance or is_balanced and various other tree queries to get more familiar with the process.

Purely functional set

Is there an algorithm that implements a purely functional set?
Expected operations would be union, intersection, difference, element?, empty? and adjoin.
Those are not hard requirements though and I would be happy to learn an algorithm that only implements a subset of them.
You can use a purely functional map implementation, where you just ignore the values.
See http://hackage.haskell.org/packages/archive/containers/0.1.0.1/doc/html/Data-IntMap.html (linked to from https://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki ).
(sidenote: For more information on functional datastructures, see http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504 )
A purely functional implementation exists for almost any data structure. In the case of sets or maps, you typically use some form of search tree, e.g. red/black trees or AVL trees. The standard reference for functional data structures is the book by Okasaki:
http://www.cambridge.org/gb/knowledge/isbn/item1161740/
Significant parts of it are available for free via his thesis:
http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
The links from the answer by @ninjagecko are good. What I've been following recently are the persistent data structures used in Clojure, which are functional, immutable and persistent.
A description of the implementation of the persistent hash map can be found in this two-part blog post:
http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice/
http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii/
These are implementations of some of the ideas (see the first answer, first entry) found in this reference request question.
The sets that come out of these structures support the functions you need:
http://clojure.org/data_structures#Data Structures-Sets
All that's left is to browse the source code and try to wrap your head around it.
Here is an implementation of a purely functional set in OCaml (it is part of the OCaml standard library).
Is there an algorithm that implements a purely functional set?
You can implement set operations using many different purely functional data structures. Some have better complexity than others.
Examples include:
Lists
Where we have:
List Difference:
(\\) :: Eq a => [a] -> [a] -> [a]
The \\ function is list difference (non-associative). In the result of xs \\ ys, the first occurrence of each element of ys in turn (if any) has been removed from xs. Thus (xs ++ ys) \\ xs == ys.
union :: Eq a => [a] -> [a] -> [a]
The union function returns the list union of the two lists. For example,
"dog" `union` "cow" == "dogcw"
Duplicates, and elements of the first list, are removed from the second list, but if the first list contains duplicates, so will the result. It is a special case of unionBy, which allows the programmer to supply their own equality test.
intersect :: Eq a => [a] -> [a] -> [a]
The intersect function takes the list intersection of two lists. For example,
[1,2,3,4] `intersect` [2,4,6,8] == [2,4]
If the first list contains duplicates, so will the result.
Immutable Sets
More efficient data structures can be designed to improve the complexity of set operations. For example, the standard Data.Set library in Haskell implements sets as size-balanced binary trees:
Stephen Adams, "Efficient sets: a balancing act", Journal of Functional Programming 3(4):553-562, October 1993, http://www.swiss.ai.mit.edu/~adams/BB/.
Which is this data structure:
data Set a = Bin !Size !a !(Set a) !(Set a)
           | Tip

type Size = Int
Yielding complexity of:
union, intersection, difference: O(n+m)
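A quick sketch of the requested operations using that library (Data.Set from the containers package; the function names below are its actual API, covering union, intersection, difference, element?, empty? and adjoin):

```haskell
import qualified Data.Set as Set

main :: IO ()
main = do
  let a = Set.fromList [1, 2, 3, 4 :: Int]
      b = Set.fromList [3, 4, 5]
  print (Set.union a b)                        -- fromList [1,2,3,4,5]
  print (Set.intersection a b)                 -- fromList [3,4]
  print (Set.difference a b)                   -- fromList [1,2]
  print (Set.member 2 a)                       -- element?: True
  print (Set.null (Set.empty :: Set.Set Int))  -- empty?:   True
  print (Set.insert 9 b)                       -- adjoin:   fromList [3,4,5,9]
```

Each operation returns a new set and leaves its arguments untouched, which is exactly the purely functional behaviour asked for.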

Converting an expression to conjunctive normal form with a twist

I've got a library that I have to interface with which acts basically as a data source. When retrieving data, I can pass special "filter expressions" to that library, which later get translated to SQL WHERE part. These expressions are pretty limited. They must be in conjunctive normal form. Like:
(A or B or C) and (D or E or F) and ...
This of course isn't very comfortable for programming. So I want to make a little wrapper which can parse arbitrary expressions and translate them to this normal form. Like:
(A and (B or C) and D) or E
would get translated to something like:
(A or E) and (B or C or E) and (D or E)
I can parse an expression to a tree with the Irony library. Now I need to normalize it, but I don't know how... Oh, also, here's the twist:
The final expression may not contain the NOT operator. However, I can invert the individual terms by replacing the operators with their inverse operators. So, this is OK:
(not A or not B) and (not C or not D)
but this is not:
not (A or B) and not (C or D)
I would like to make the expression as simple as possible, because it will be translated into a practically identical SQL WHERE clause, and a complex statement would most likely slow down execution.
I'd use two iterations over the tree, although it's probably possible in one.
First iteration: get rid of the NOT nodes by walking the tree, applying De Morgan's laws and removing double negations wherever applicable.
Second iteration (the NOTs now appear only directly before leaf nodes):
Go through your tree:
Case "AND node":
    fine, inspect the children
Case "OR node":
    if there is a child which is neither a leaf nor a NOT node:
        apply the distributive law
        start again from the parent of the current node
    else:
        fine, inspect the children
After that you should be done.
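The two passes can be sketched like this in Haskell (the expression type and the function names are illustrative, not from Irony or the library in question):

```haskell
-- Illustrative sketch of the two passes described above.
data Expr = Var String | Not Expr | And Expr Expr | Or Expr Expr
  deriving Show

-- Pass 1: push NOT down to the leaves (De Morgan's laws) and
-- remove double negations along the way.
pushNot :: Expr -> Expr
pushNot (Not (And a b)) = Or  (pushNot (Not a)) (pushNot (Not b))
pushNot (Not (Or  a b)) = And (pushNot (Not a)) (pushNot (Not b))
pushNot (Not (Not a))   = pushNot a
pushNot (And a b)       = And (pushNot a) (pushNot b)
pushNot (Or  a b)       = Or  (pushNot a) (pushNot b)
pushNot e               = e   -- Var, or NOT directly before a leaf: allowed

-- Pass 2: distribute OR over AND until no OR node has an AND below it.
distribute :: Expr -> Expr
distribute (And a b) = And (distribute a) (distribute b)
distribute (Or a b)  = case (distribute a, distribute b) of
  (And x y, c) -> And (distribute (Or x c)) (distribute (Or y c))
  (c, And x y) -> And (distribute (Or c x)) (distribute (Or c y))
  (x, y)       -> Or x y
distribute e         = e

toCNF :: Expr -> Expr
toCNF = distribute . pushNot
```

Applied to the question's example, toCNF turns (A and (B or C) and D) or E into a conjunction of the clauses (A or E), (B or C or E) and (D or E).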
