Given the following simple BST definition:
data Tree x = Empty | Leaf x | Node x (Tree x) (Tree x)
deriving (Show, Eq)
inOrder :: Tree x -> [x]
inOrder Empty = []
inOrder (Leaf x) = [x]
inOrder (Node root left right) = inOrder left ++ [root] ++ inOrder right
I'd like to write an in-order function that can have side effects. I achieved that with:
inOrderM :: (Show x, Monad m) => (x -> m a) -> Tree x -> m ()
inOrderM f (Empty) = return ()
inOrderM f (Leaf y) = f y >> return ()
inOrderM f (Node root left right) = inOrderM f left >> f root >> inOrderM f right
-- print tree in order to stdout
inOrderM print tree
This works fine, but it seems repetitive - the same logic is already present in inOrder and my experience with Haskell leads me to believe that I'm probably doing something wrong if I'm writing a similar thing twice.
Is there any way that I can write a single function inOrder that can take either pure or monadic functions?
In inOrder you are mapping a Tree x to a [x], i. e. you sequentialize your tree. Why not just use mapM or mapM_ on the resulting list?
mapM_ print $ inOrder tree
Just to remind the types of the functions I've mentioned:
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
You might want to look at implementing the Data.Traversable class or Data.Foldable class for your tree structure. Each only requires the definition of a single method.
In particular, if you implement the Data.Foldable class, you get the following two functions for free:
mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
toList :: Foldable t => t a -> [a]
It will also give you the rich set of functions (foldr, concatMap, any, ...) that you are used to using with the list type.
You only have to implement one of the following functions to create an instance of Data.Foldable:
foldMap :: Monoid m => (a -> m) -> t a -> m
foldr :: (a -> b -> b) -> b -> t a -> b
Related
I am trying to write my own graph library in Haskell for use in the advent of code. I am trying to use a class for graphs and one concrete implementation using Data.Map. I am trying to write Dijkstra's algorithm, but I am running into some trouble with type families. I have the following typeclass and concrete implementation:
{-# LANGUAGE TypeFamilies, AllowAmbiguousTypes, ScopedTypeVariables, TypeFamilyDependencies #-}
class Graph g where
type Node g
type Edge g
nodeSet :: g -> S.Set (Node g)
neighbours :: g -> (Node g) -> Maybe [(Edge g, Node g)]
data MapGraph e n = MapGraph {mGraph :: M.Map n [(e,n)]} deriving Show
instance (Num e,Ord e,Ord n) => Graph (MapGraph e n) where
type Node (MapGraph e n) = n
type Edge (MapGraph e n) = e
nodeSet mapGraph = S.fromList $ M.keys $ mGraph mapGraph
neighbours mapGraph node = M.lookup node (mGraph mapGraph)
To represent the Infinity value of unvisited nodes in Dijkstra's algorithm I have created a sum data type:
data MaxBoundedNum a = Inf | Num a deriving Show
I am trying to work on the recursive function for the algorithm which will take in the graph, the current node, the destination node, the unvisited set, and a map of nodes and their length from the source node. The following skeleton function seems to be what I want:
go :: (Graph g) =>
g -> (Node g) -> (Node g) ->
S.Set (Node g) ->
M.Map (Node g) (MaxBoundedNum (Edge g)) ->
Maybe (M.Map (Node g) (MaxBoundedNum (Edge g)))
go graph curr dest uset vals = do
currNeighbours <- neighbours graph curr
undefined
This appears to work correctly for a graph g where graph :: MapGraph Int String
go graph
:: [Char]
-> [Char]
-> S.Set [Char]
-> M.Map [Char] (MaxBoundedNum Int)
-> Maybe (M.Map [Char] (MaxBoundedNum Int))
The next part of my go function needs to lookup the current distance from the vals map.
currDist <- M.lookup curr vals
This works outside the go function if I do the following:
currDist = M.lookup current vals
*Main> :t currDist
currDist :: Maybe (MaxBoundedNum Integer)
However, inside the do block I get this error:
Could not deduce (Ord (Node g)) arising from a use of ‘M.lookup’
from the context: Graph g
bound by the type signature for:
go :: forall g.
Graph g =>
g
-> Node g
-> Node g
-> S.Set (Node g)
-> M.Map (Node g) (MaxBoundedNum (Edge g))
-> Maybe (M.Map (Node g) (MaxBoundedNum (Edge g)))
at WithClass.hs:(96,1)-(100,49)
• In a stmt of a 'do' block: currDist <- M.lookup curr vals
The part Could not deduce made me think I need to give it a type annotation, so I did that:
currDist <- M.lookup curr vals :: Maybe (MaxBoundedNum (Edge g))
But that gives me this error:
WithClass.hs:102:15: error:
• Couldn't match type ‘Edge g’ with ‘Edge g1’
Expected type: Maybe (MaxBoundedNum (Edge g1))
Actual type: Maybe (MaxBoundedNum (Edge g))
NB: ‘Edge’ is a non-injective type family
• In a stmt of a 'do' block:
currDist <- M.lookup curr vals :: Maybe (MaxBoundedNum (Edge g))
In the expression:
do currDist <- M.lookup curr vals :: Maybe (MaxBoundedNum (Edge g))
currNeighbours <- neighbours graph curr
undefined
In an equation for ‘go’:
go graph curr dest uset vals
= do currDist <- M.lookup curr vals ::
Maybe (MaxBoundedNum (Edge g))
currNeighbours <- neighbours graph curr
undefined
• Relevant bindings include
vals :: M.Map (Node g) (MaxBoundedNum (Edge g))
(bound at WithClass.hs:101:25)
uset :: S.Set (Node g) (bound at WithClass.hs:101:20)
dest :: Node g (bound at WithClass.hs:101:15)
curr :: Node g (bound at WithClass.hs:101:10)
graph :: g (bound at WithClass.hs:101:4)
go :: g
-> Node g
-> Node g
-> S.Set (Node g)
-> M.Map (Node g) (MaxBoundedNum (Edge g))
-> Maybe (M.Map (Node g) (MaxBoundedNum (Edge g)))
(bound at WithClass.hs:101:1)
I had a look at this question but the accepted answer just said to add the TypeFamilyDependencies language extension which appears to not do anything for me. What am I doing wrong and how can I fix my code? Thank you in advance.
Operations on Set (Node g) and Map (Node g) require you to be able to compare nodes. This is what the Ord (Node g) constraint represents. The type you gave for go says it works for any definition of Node g, even those that can't be compared. The error tells you that when you use M.lookup, that needs an Ord (Node g) constraint, but there is no way to satisfy it.
You can satisfy that constraint by adding it to the signature of go:
{-# LANGUAGE FlexibleConstraints #-} -- Also enable this extension
go :: (Graph g, Ord (Node g)) => ...
I've been learning Haskell in my spare time working through LYAH. Would like to improve upon my Haskell (/ Functional programming) skills by solving some problems from the imperative world. One of the problems from EPI is to print an "almost sorted array", in a sorted fashion where it is guaranteed that no element in the array is more than k away from its correct position. The input is a stream of elements and the requirement is to do this in O(n log k) time complexity and O(k) space complexity.
I've attempted to re-implement the imperative solution in Haskell as follows:
import qualified Data.Heap as Heap
-- print the k-sorted list in a sorted fashion
ksorted :: (Ord a, Show a) => [a] -> Int -> IO ()
ksorted [] _ = return ()
ksorted xs k = do
heap <- ksorted' xs Heap.empty
mapM_ print $ (Heap.toAscList heap) -- print the remaining elements in the heap.
where
ksorted' :: (Ord a, Show a) => [a] -> Heap.MinHeap a -> IO (Heap.MinHeap a)
ksorted' [] h = return h
ksorted' (x:xs) h = do let (m, h') = getMinAndBuildHeap h x in
(printMin m >> ksorted' xs h')
printMin :: (Show a) => Maybe a -> IO ()
printMin m = case m of
Nothing -> return ()
(Just item) -> print item
getMinAndBuildHeap :: (Ord a, Show a) => Heap.MinHeap a -> a -> (Maybe a, Heap.MinHeap a)
getMinAndBuildHeap h item= if (Heap.size h) > k
then ((Heap.viewHead h), (Heap.insert item (Heap.drop 1 h)))
else (Nothing, (Heap.insert item h))
I would like to know a better way of solving this in Haskell. Any inputs would be appreciated.
[Edit 1]: The input is stream, but for now I assumed a list instead (with only a forward iterator/ input iterator in some sense.)
[Edit 2]: added Data.Heap import to the code.
Thanks.
I think the main improvement is to separate the production of the sorted list from the printing of the sorted list. So:
import Data.Heap (MinHeap)
import qualified Data.Heap as Heap
ksort :: Ord a => Int -> [a] -> [a]
ksort k xs = go (Heap.fromList b) e where
(b, e) = splitAt (k-1) xs
go :: Ord a => MinHeap a -> [a] -> [a]
go heap [] = Heap.toAscList heap
go heap (x:xs) = x' : go heap' xs where
Just (x', heap') = Heap.view (Heap.insert x heap)
printKSorted :: (Ord a, Show a) => Int -> [a] -> IO ()
printKSorted k xs = mapM_ print (ksort k xs)
If I were feeling extra-special-fancy, I might try to turn go into a foldr or perhaps a mapAccumR, but in this case I think the explicit recursion is relatively readable, too.
I have the following code to do an inorder traversal of a Binary Tree:
data BinaryTree a =
Node a (BinaryTree a) (BinaryTree a)
| Leaf
deriving (Show)
inorder :: (a -> b -> b) -> b -> BinaryTree a -> b
inorder f acc tree = go tree acc
where go Leaf z = z
go (Node v l r) z = (go r . f v . go l) z
Using the inorder function above I'd like to get the kth element without having to traverse the entire list.
The traversal is a little like a fold given that you pass it a function and a starting value. I was thinking that I could solve it by passing k as the starting value, and a function that'll decrement k until it reaches 0 and at that point returns the value inside the current node.
The problem I have is that I'm not quite sure how to break out of the recursion of inorder traversal short of modifying the whole function, but I feel like having to modify the higher order function ruins the point of using a higher order function in the first place.
Is there a way to break after k iterations?
I observe that the results of the recursive call to go on the left and right subtrees are not available to f; hence no matter what f does, it cannot choose to ignore the results of recursive calls. Therefore I believe that inorder as written will always walk over the entire tree. (edit: On review, this statement may be a bit strong; it seems f may have a chance to ignore left subtrees. But the point basically stands; there is no reason to elevate left subtrees over right subtrees in this way.)
A better choice is to give the recursive calls to f. For example:
anyOldOrder :: (a -> b -> b -> b) -> b -> BinaryTree a -> b
anyOldOrder f z = go where
go Leaf = z
go (Node v l r) = f v (go l) (go r)
Now when we write
flatten = anyOldOrder (\v ls rs -> ls ++ [v] ++ rs) []
we will find that flatten is sufficiently lazy:
> take 3 (flatten (Node 'c' (Node 'b' (Node 'a' Leaf Leaf) Leaf) undefined))
"abc"
(The undefined is used to provide evidence that this part of the tree is never inspected during the traversal.) Hence we may write
findK k = take 1 . reverse . take k . flatten
which will correctly short-circuit. You can make flatten slightly more efficient with the standard difference list technique:
flatten' t = anyOldOrder (\v l r -> l . (v:) . r) id t []
Just for fun, I also want to show how to implement this function without using an accumulator list. Instead, we will produce a stateful computation which walks over the "interesting" part of the tree, stopping when it reaches the kth element. The stateful computation looks like this:
import Control.Applicative
import Control.Monad.State
import Control.Monad.Trans.Maybe
kthElem k v l r = l <|> do
i <- get
if i == k
then return v
else put (i+1) >> r
Looks pretty simple, hey? Now our findK function will farm out to kthElem, then do some newtype unwrapping:
findK' k = (`evalState` 1) . runMaybeT . anyOldOrder (kthElem 3) empty
We can verify that it is still as lazy as desired:
> findK' 3 $ Node 'c' (Node 'b' (Node 'a' Leaf Leaf) Leaf) undefined
Just 'c'
There are (at least?) two important generalizations of the notion of folding a list. The first, more powerful, notion is that of a catamorphism. The anyOldOrder of Daniel Wagner's answer follows this pattern.
But for your particular problem, the catamorphism notion is a bit more power than you need. The second, weaker, notion is that of a Foldable container. Foldable expresses the idea of a container whose elements can all be mashed together using the operation of an arbitrary Monoid. Here's a cute trick:
{-# LANGUAGE DeriveFoldable #-}
-- Note that for this trick only I've
-- switched the order of the Node fields.
data BinaryTree a =
Node (BinaryTree a) a (BinaryTree a)
| Leaf
deriving (Show, Foldable)
index :: [a] -> Int -> Maybe a
[] `index` _ = Nothing
(x : _) `index` 0 = Just x
(_ : xs) `index` i = xs `index` (i - 1)
(!?) :: Foldable f => Int -> f a -> Maybe a
xs !? i = toList xs `index` i
Then you can just use !? to index into your tree!
That trick is cute, and in fact deriving Foldable is a tremendous convenience, but it won't help you understand anything. I'll start by showing how you can define treeToList fairly directly and efficiently, without using Foldable.
treeToList :: BinaryTree a -> [a]
treeToList t = treeToListThen t []
The magic is in the treeToListThen function. treeToListThen t more converts t to a list and appends the list more to the end of the result. This slight generalization turns out to be all that's required to make conversion to a list efficient.
treeToListThen :: BinaryTree a -> [a] -> [a]
treeToListThen Leaf more = more
treeToListThen (Node v l r) more =
treeToListThen l $ v : treeToListThen r more
Instead of producing an inorder traversal of the left subtree and then appending everything else, we tell the left traversal what to stick on the end when it's done! This avoids the potentially serious inefficiency of repeated list concatenation that can turn things O(n^2) in bad cases.
Getting back to the Foldable notion, turning things into lists is a special case of foldr:
toList = foldr (:) []
So how can we implement foldr for trees? It ends up being somewhat similar to what we did with toList:
foldrTree :: (a -> b -> b) -> b -> BinaryTree a -> b
foldrTree _ n Leaf = n
foldrTree c n (Node v l r) = foldrTree c rest l
where
rest = v `c` foldrTree c n r
That is, when we go down the left side, we tell it that when it's done, it should deal with the current node and its right child.
Now foldr isn't quite the most fundamental operation of Foldable; that is actually
foldMap :: (Foldable f, Monoid m)
=> (a -> m) -> f a -> m
It is possible to implement foldr using foldMap, in a somewhat tricky fashion using a peculiar Monoid. I don't want to overload you with details of that right now, unless you ask (but you should look at the default definition of foldr in Data.Foldable). Instead, I'll show how foldMap can be defined using Daniel Wagner's anyOldOrder:
instance Foldable BinaryTree where
foldMap f = anyOldOrder bin mempty where
bin lres v rres = lres <> f v <> rres
I've been exploring the Foldable class and also the the Monoid class.
Firstly, lets say I want to fold over a list of the Monoid First. Like so:
x :: [First a]
fold? mappend mempty x
Then I assume in this case the most appropriate fold would be foldr, as mappend for First is lazy in it's second argument.
Conversely, for Last we'd want to foldl' (or just foldl I'm not sure).
Now moving away from lists, I've defined a simple binary tree like so:
{-# LANGUAGE GADTs #-}
data BinaryTree a where
BinaryTree :: BinaryTree a -> BinaryTree a -> BinaryTree a
Leaf :: a -> BinaryTree a
And I've made it Foldable with the most straightforward definition:
instance Foldable BinaryTree where
foldMap f (BinaryTree left right) =
(foldMap f left) `mappend` (foldMap f right)
foldMap f (Leaf x) = f x
As Foldable defines fold as simply foldMap id we can now do:
x1 :: BinaryTree (First a)
fold x1
x2 :: BinaryTree (Last a)
fold x2
Assuming our BinaryTree is balanced, and there's not many Nothing values, these operations should take O(log(n)) time I believe.
But Foldable also defines a whole lot of default methods like foldl, foldl', foldr and foldr' based on foldMap.
These default definitions seem to be implemented by composing a bunch of functions, wrapped in a Monoid called Endo, one for each element in the collection, and then composing them all.
For the purpose of this discussion I am not modifying these default definitions.
So lets now consider:
x1 :: BinaryTree (First a)
foldr mappend mempty x1
x2 :: BinaryTree (Last a)
foldl mappend mempty x2
Does running these retain O(log(n)) performance of the ordinary fold? (I'm not worried about constant factors for the moment). Does laziness result in the tree not needing to be fully traversed? Or will the default definitions of foldl and foldr require an entire traversal of the tree?
I tried to go though the algorithm step by step (much like they did on the Foldr Foldl Foldl' article) but I ended up completely confusing myself as this is a bit more complex as it involves an interaction between Foldable, Monoid and Endo.
So what I'm looking for is an explanation of why (or why not) the default definition of say foldr, would only take O(log(n)) time on a balanced binary tree like above. A step by step example like what's from the Foldr Foldl Foldl' article would be really helpful, but I understand if that's too difficult, as I totally confused myself attempting it.
Yes, it has O(log(n)) best case performance.
Endo is a wrapper around (a -> a) kind of functions that:
instance Monoid (Endo a) where
mempty = Endo id
Endo f `mappend` Endo g = Endo (f . g)
And the default implementation of foldr in Data.Foldable:
foldr :: (a -> b -> b) -> b -> t a -> b
foldr f z t = appEndo (foldMap (Endo #. f) t) z
The definition of . (function composition) in case:
(.) f g = \x -> f (g x)
Endo is defined by newtype constructor, so it only exists at compile stage, not run-time.
#. operator changes the type of it's second operand and discard the first.
The newtype constructor and #. operator guarantee that you can ignore the wrapper when considering performance issues.
So the default implementation of foldr can be reduced to:
-- mappend = (.), mempty = id from instance Monoid (Endo a)
foldr :: (a -> b -> b) -> b -> t a -> b
foldr f z t = foldMap f t z
For your Foldable BinaryTree:
foldr f z t
= foldMap f t z
= case t of
Leaf a -> f a z
-- what we care
BinaryTree l r -> ((foldMap f l) . (foldMap f r)) z
The default lazy evaluation in Haskell is ultimately simple, there are just two rules:
function application first
evaluate the arguments from left to right if the values matter
That makes it easy to trace the evaluation of the last line of the code above:
((foldMap f l) . (foldMap f r)) z
= (\z -> foldMap f l (foldMap f r z)) z
= foldMap f l (foldMap f r z)
-- let z' = foldMap f r z
= foldMap f l z' -- height 1
-- if the branch l is still not a Leaf node
= ((foldMap f ll) . (foldMap f lr)) z'
= (\z -> foldMap f ll (foldMap f lr)) z'
= foldMap f ll (foldMap f lr z')
-- let z'' = foldMap f lr z'
= foldMap f ll z'' -- height 2
The right branch of the tree is never expanded before the left has been fully expanded, and it goes one level higher after an O(1) operation of function expansion and application, therefore when it reached the left-most Leaf node:
= foldMap f leaf#(Leaf a) z'heightOfLeftMostLeaf
= f a z'heightOfLeftMostLeaf
Then f looks at the value a and decides to ignore its second argument (like what mappend will do to First values), the evaluation short-circuits, results O(height of the left-most leaf), or O(log(n)) performance when the tree is balanced.
foldl is all the same, it's just foldr with mappend flipped i.e. O(log(n)) best case performance with Last.
foldl' and foldr' are different.
foldl' :: (b -> a -> b) -> b -> t a -> b
foldl' f z0 xs = foldr f' id xs z0
where f' x k z = k $! f z x
At every step of reduction, the argument is evaluated first and then the function application, the tree will be traversed i.e. O(n) best case performance.
Recently, I've asked a question for building DFS tree from Graph in Stackoverflow and had learned that it can be simply implemented by using State Monad.
DFS in haskell
While DFS requires to track only visited nodes, so that we can use 'Set' or 'List' or some sort of linear data structure to track visited nodes, BFS requires 'visited node' and 'queue' data structure to be accomplished.
My pseudocode for BFS is
Q = empty queue
T = empty Tree
mark all nodes except u as unvisited
while Q is nonempty do
u = deq(Q)
for each vertex v ∈ Adj(u)
if v is not visited
then add edge (u,v) to T
Mark v as visited and enq(v)
As can be inferred from pseudocode, we only have to do 3 processes per iteration.
dequeue point from queue
add all unvisited neighbors of the point to current tree's child, queue and 'visited' list
repeat this for next in queue
Since we are not using recursive traversal for BFS search, we need some other traversal method such as while loop. I've looked up loop-while package in hackage, but it seems somewhat deprecated.
What I assume is that I require some sort of code like this :
{-...-}
... = evalState (bfs) ((Set.singleton start),[start])
where
neighbors x = Map.findWithDefault [] x adj
bfs =do (vis,x:queue)<-get
map (\neighbor ->
if (Set.member neighbor vis)
then put(vis,queue)
else put ((Set.insert neighbor vis), queue++[neighbor]) >> (addToTree neighbor)
) neighbors x
(vis,queue)<-get
while (length queue > 0)
I understand that this implementation is very erroneous but this should give minimalistic view for how I think BFS should be implemented. Also, I really don't know how to circumvent using while loop for do blocks.(i.e should I use recursive algorithm to overcome it or should I think of completely different strategy)
Considering one of the answer I've found in previous question linked above, it seems like the answer should look like this :
newtype Graph a = Graph (Map.Map a [a]) deriving (Ord, Eq, Show)
data Tree a = Tree a [Tree a] deriving (Ord, Eq, Show)
bfs :: (Ord a) => Graph a -> a -> Tree a
bfs (Graph adj) start = evalState (bfs') ((Set.singleton start),[start])
where
bfs' = {-part where I don't know-}
Finally, if such implementation for BFS using state monad is impossible due to some reason, (which I believe not to be) please correct my false assumption.
I've seen some of the examples for BFS in Haskell without using state monad but I want to learn more about how state monad can be processed and couldn't have found any of examples of BFS implemented using state monad.
Thanks in advance.
EDIT:
I came up with some sort of algorithm using state monad but I fall in infinite loop.
bfs :: (Ord a) => Graph a -> a -> Tree a
bfs (Graph adj) start = evalState (bfs' (Graph adj) start) (Set.singleton start)
bfs' :: (Ord a) => Graph a -> a -> State (Set.Set a) (Tree a)
bfs' (Graph adj) point= do
vis <- get
let neighbors x = Map.findWithDefault [] x adj
let addableNeighbors (x:xs) = if Set.member x vis
then addableNeighbors(xs)
else x:addableNeighbors(xs)
let addVisited (vis) (ns) = Set.union (vis) $ Set.fromList ns
let newVisited = addVisited vis $ addableNeighbors $ neighbors point
put newVisited
return (Tree point $ map (flip evalState newVisited) (map (bfs' (Graph adj)) $ addableNeighbors $ neighbors point))
EDIT2: With some expense of space complexity, I've came out with a solution to get BFS graph using graph to return and queue to process. Despite it is not the optimal solution for generating BFS tree/graph, it will work.
bfs :: (Ord a) => Graph a -> a -> Graph a
bfs (Graph adj) start = evalState (bfs' (Graph adj) (Graph(Map.empty)) [start]) (Set.singleton start)
bfs':: (Ord a) => Graph a -> Graph a -> [a] -> State (Set.Set a) (Graph a)
bfs' _ (Graph ret) [] = return (Graph ret)
bfs' (Graph adj) (Graph ret) (p:points)= do
vis <- get
let neighbors x = Map.findWithDefault [] x adj
let addableNeighbors ns
| null ns = []
| otherwise = if Set.member (head ns) vis
then addableNeighbors(tail ns)
else (head ns):addableNeighbors(tail ns)
let addVisited (v) (ns) = Set.union (v) $ Set.fromList ns
let unVisited = addableNeighbors $ neighbors p
let newVisited = addVisited vis unVisited
let unionGraph (Graph g1) (Graph g2) = Graph (Map.union g1 g2)
put newVisited
bfs' (Graph adj) (unionGraph (Graph ret) (Graph (Map.singleton p unVisited))) (points ++ unVisited)
EDIT3: I've added convert function for graph to tree. Running function in EDIT2, and EDIT3 will yield BFS Tree. It is not the best algorithm for computation time wise, but I believe it is intuitive and easy to understand for newbies like me :)
graphToTree :: (Ord a) => Graph a -> a -> Tree a
graphToTree (Graph adj) point = Tree point $ map (graphToTree (Graph adj)) $ neighbors point
where neighbors x = Map.findWithDefault [] x adj
Converting a graph into a Tree breadth-first is a bit more difficult than simply searching the graph breadth-first. If you are searching the graph, you only ever need to return from a single branch. When converting the graph into a tree, the result needs to include results from multiple branches.
We can use a more general type than Graph a for what we can search or convert to trees. We can search or convert to trees anything with a function a -> [a]. For a Graph we'd use the function (Map.!) m, where m is the Map. Searching with a transposition table has a signature like
breadthFirstSearchUnseen:: Ord r => (a -> r) -> -- how to compare `a`s
(a -> Bool) -> -- where to stop
(a -> [a]) -> -- where you can go from an `a`
[a] -> -- where to start
Maybe [a]
Converting the function to a tree that contains each reachable node at the earliest depth has a signature like
shortestPathTree :: Ord r => (a -> r) -> -- how to compare `a`s
(a -> l) -- what label to put in the tree
(a -> [a]) -> -- where you can go from an `a`
a -> -- where to start
Tree l
We can slightly more generally start at any number of nodes and build a Forest that contains each reachable node at the earliest depth.
shortestPathTrees :: Ord r => (a -> r) -> -- how to compare `a`s
(a -> l) -- what label to put in the tree
(a -> [a]) -> -- where you can go from an `a`
[a] -> -- where to start
[Tree l]
Searching
Performing the conversion to a tree doesn't really help us search, we can perform breadth first searches on the original graph.
import Data.Sequence (viewl, ViewL (..), (><))
import qualified Data.Sequence as Seq
import qualified Data.Set as Set
breadthFirstSearchUnseen:: Ord r => (a -> r) -> (a -> Bool) -> (a -> [a]) -> [a] -> Maybe [a]
breadthFirstSearchUnseen repr p expand = combine Set.empty Seq.empty []
where
combine seen queued ancestors unseen =
go
(seen `Set.union` (Set.fromList . map repr $ unseen))
(queued >< (Seq.fromList . map ((,) ancestors) $ unseen))
go seen queue =
case viewl queue of
EmptyL -> Nothing
(ancestors, a) :< queued ->
if p a
then Just . reverse $ ancestors'
else combine seen queued ancestors' unseen
where
ancestors' = a:ancestors
unseen = filter (flip Set.notMember seen . repr) . expand $ a
The state maintained in the above search algorithm is a Seq queue of what nodes to visit next and a Set of nodes that have already been seen. If we instead kept track of nodes that have already been visited, then we could visit the same node multiple times if we find multiple paths to the node at the same depth. There's a more complete explanation in the answer I wrote this breadth first search for.
We can easily write searching Graphs in terms of our general search.
import qualified Data.Map as Map
newtype Graph a = Graph (Map.Map a [a]) deriving (Ord, Eq, Show)
bfsGraph :: (Ord a) => Graph a -> (a -> Bool) -> [a] -> Maybe [a]
bfsGraph (Graph adj) test = breadthFirstSearchUnseen id test ((Map.!) adj)
We can also write how to search Trees themselves.
import Data.Tree
bfsTrees :: (Ord a) => (a -> Bool) -> [Tree a] -> Maybe [a]
bfsTrees test = fmap (map rootLabel) . breadthFirstSearchUnseen rootLabel (test . rootLabel) subForest
Building trees
Building trees breadth-first is a lot more difficult. Fortunately Data.Tree already provides ways to build Trees in breadth first order from a monadic unfold. The breadth first order will take care of the queuing, we will only need to keep track of the state for the nodes we've already seen.
unfoldTreeM_BF has the type Monad m => (b -> m (a, [b])) -> b -> m (Tree a). m is the Monad our computations will be in, b is the type of data we are going to build the tree based on, and a is the type for the labels of the tree. In order to use it to build a tree we need to make a function b -> m (a, [b]). We're going to rename a to l for label, and b to a, which is what we've been using for our nodes. We need to make an a -> m (l, [a]). For m, we'll use the State monad from transformers to keep track of some state; the state will be the Set of nodes whose representation r we've already seen; we'll be using the State (Set.Set r) monad. Overall, we need to provide a function a -> State (Set.Set r) (l, [a]).
expandUnseen :: Ord r => (a -> r) -> (a -> l) -> (a -> [a]) -> a -> State (Set.Set r) (l, [a])
expandUnseen repr label expand a = do
seen <- get
let unseen = filter (flip Set.notMember seen . repr) . uniqueBy repr . expand $ a
put . Set.union seen . Set.fromList . map repr $ unseen
return (label a, unseen)
To build the trees, we run the state computation built by unfoldForestM_BF
shortestPathTrees :: Ord r => (a -> r) -> (a -> l) -> (a -> [a]) -> [a] -> [Tree l]
shortestPathTrees repr label expand = run . unfoldForestM_BF k . uniqueBy repr
where
run = flip evalState Set.empty
k = expandUnseen repr label expand
uniqueBy is a nubBy that takes advantage of an Ord instance instead of Eq.
uniqueBy :: Ord r => (a -> r) -> [a] -> [a]
uniqueBy repr = go Set.empty
where
go seen [] = []
go seen (x:xs) =
if Set.member (repr x) seen
then go seen xs
else x:go (Set.insert (repr x) seen) xs
We can write building shortest path trees from Graphs in terms of our general shortest path tree building
shortestPathsGraph :: Ord a => Graph a -> [a] -> [Tree a]
shortestPathsGraph (Graph adj) = shortestPathTrees id id ((Map.!) adj)
We can do the same for filtering a Forest to only the shortest paths through the Forest.
shortestPathsTree :: Ord a => [Tree a] -> [Tree a]
shortestPathsTree = shortestPathTrees rootLabel rootLabel subForest
My solution is based on working level-by-level (wrt. to BFS), see also this question and answer.
The general idea is: Assume we already know the sets of visited elements prior each level of our BFS as a list of sets. Then we can traverse the graph, level by level, updating our list of sets, constructing the output Tree on the way.
The trick is that after such a level-by-level traversal, we'll have the sets of visited elements after each level. And this is the same as the list before each level, just shifted by one. So by tying the knot, we can use the shifted output as the input for the procedure.
import Control.Monad.State
import qualified Data.Map as M
import Data.Maybe (fromMaybe, catMaybes)
import qualified Data.Set as S
import Data.Tree
newtype Graph a = Graph (M.Map a [a])
deriving (Ord, Eq, Show)
tagBfs :: (Ord a) => Graph a -> a -> Maybe (Tree a)
tagBfs (Graph g) s = let (t, sets) = runState (thread s) (S.empty : sets)
in t
where
thread x = do
sets#(s : subsets) <- get
case M.lookup x g of
Just vs | not (S.member x s) -> do
-- recursively create sub-nodes and update the subsets list
let (nodes, subsets') = runState
(catMaybes `liftM` mapM thread vs) subsets
-- put the new combined list of sets
put (S.insert x s : subsets')
-- .. and return the node
return . Just $ Node x nodes
_ -> return Nothing -- node not in the graph, or already visited
Running tagBfs example2 'b' it on the following example
example2 :: Graph Char
example2 = Graph $ M.fromList
[ ('a', ['b', 'c', 'd'])
, ('b', ['a'])
, ('c', [])
, ('d', [])
]
yields
Just (Node {rootLabel = 'b',
subForest = [Node {rootLabel = 'a',
subForest = [Node {rootLabel = 'c',
subForest = []},
Node {rootLabel = 'd',
subForest = []}
]}
]}
)