DPLL algorithm and number of visited nodes


I'm implementing a DPLL algorithm that counts the number of visited nodes. I managed to implement DPLL without the counting, but I can't think of a way to add it. The main problem is that once the algorithm finds a satisfying valuation and returns True, the recursion unwinds and returns the counter value from the moment the recursion started. In an imperative language I would just use a global variable and increment it whenever the function is invoked, but that is not an option in Haskell.
The code I pasted here does not represent my attempts to solve the counting problem; it is just my solution without counting. I tried to use tuples such as (True,Int), but that always returns the integer value from the moment the recursion started.
This is my implementation, where (Node -> Variable) is a heuristic function, Sentence is a list of clauses in CNF to be satisfied, [Variable] is a list of literals not yet assigned, and Model is just a truth valuation.
dpll' :: (Node -> Variable) -> Sentence -> [Variable] -> Model -> Bool
dpll' heurFun sentence vars model
  | satisfiesSentence model sentence = True
  | falsifiesSentence model sentence = False
  | otherwise = applyRecursion
  where
    applyRecursion
      | pureSymbol /= Nothing = recurOnPureSymbol
      | unitSymbol /= Nothing = recurOnUnitSymbol
      | otherwise = recurUsingHeuristicFunction
      where
        pureSymbol = findPureSymbol vars sentence model
        unitSymbol = findUnitClause sentence model
        heurVar = heurFun (sentence,(vars,model))
        recurOnPureSymbol =
          dpll' heurFun sentence (vars \\ [getVar pureSymbol]) ((formAssignment pureSymbol):model)
        recurOnUnitSymbol =
          dpll' heurFun sentence (vars \\ [getVar unitSymbol]) ((formAssignment unitSymbol):model)
        recurUsingHeuristicFunction = case vars of
          (v:vs) -> (dpll' heurFun sentence (vars \\ [heurVar]) ((AS (heurVar,True)):model)
                     || dpll' heurFun sentence (vars \\ [heurVar]) ((AS (heurVar,False)):model))
          [] -> False
I would really appreciate any advice on how to count the visited nodes. Thank you.
EDIT:
The only libraries I'm allowed to use are System.Random, Data.Maybe and Data.List.
EDIT:
One possible solution I tried to implement is to use a tuple (Bool,Int) as the return value of the DPLL function. The code looks like:
dpll'' :: (Node -> Variable) -> Sentence -> [Variable] -> Model -> Int -> (Bool,Int)
dpll'' heurFun sentence vars model counter
  | satisfiesSentence model sentence = (True,counter)
  | falsifiesSentence model sentence = (False,counter)
  | otherwise = applyRecursion
  where
    applyRecursion
      | pureSymbol /= Nothing = recurOnPureSymbol
      | unitSymbol /= Nothing = recurOnUnitSymbol
      | otherwise = recurUsingHeuristicFunction
      where
        pureSymbol = findPureSymbol vars sentence model
        unitSymbol = findUnitClause sentence model
        heurVar = heurFun (sentence,(vars,model))
        recurOnPureSymbol =
          dpll'' heurFun sentence (vars \\ [getVar pureSymbol]) ((formAssignment pureSymbol):model) (counter + 1)
        recurOnUnitSymbol =
          dpll'' heurFun sentence (vars \\ [getVar unitSymbol]) ((formAssignment unitSymbol):model) (counter + 1)
        recurUsingHeuristicFunction = case vars of
          (v:vs) -> ((fst $ dpll'' heurFun sentence (vars \\ [heurVar]) ((AS (heurVar,True)):model) (counter + 1))
                     || (fst $ dpll'' heurFun sentence (vars \\ [heurVar]) ((AS (heurVar,False)):model) (counter + 1)),counter)
          [] -> (False,counter)
The basic idea of this approach is to increment the counter at each recursive call. However, I have no idea how to retrieve the counter from the recursive calls in the OR expression. I'm not even sure if this is possible in Haskell.

You can retrieve the counter from the recursive call using case or similar.
recurUsingHeuristicFunction = case vars of
  v:vs -> case dpll'' heurFun sentence (vars \\ [heurVar]) (AS (heurVar,True):model) (counter + 1) of
    (result, counter') -> case dpll'' heurFun sentence (vars \\ [heurVar]) (AS (heurVar,False):model) counter' of
      (result', counter'') -> (result || result', counter'')
  [] -> (False,counter)
This is a manual implementation of the State monad. However, it's not clear to me why you are passing in a counter at all. Just return it. Then it is the simpler Writer monad instead. The code for this helper would look something like this:
recurUsingHeuristicFunction = case vars of
  v:vs -> case dpll'' heurFun sentence (vars \\ [heurVar]) (AS (heurVar,True):model) of
    (result, counter) -> case dpll'' heurFun sentence (vars \\ [heurVar]) (AS (heurVar,False):model) of
      (result', counter') -> (result || result', counter + counter' + 1)
  [] -> (False,0)
Other results would be similar -- returning 0 instead of counter and 1 instead of counter+1 -- and the call to the function would be simpler, with one fewer argument to worry about setting up correctly.
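As a self-contained illustration of the "just return the count" idea, here is a minimal sketch on a toy backtracking search rather than the asker's real DPLL. The names search, Literal, Clause and Assignment are hypothetical, and there is no unit propagation or pure-literal rule. Each call counts itself as one visited node, and the second branch is only explored (and counted) when the first one fails:

```haskell
-- Toy CNF search, NOT the asker's dpll'': all names here are hypothetical.
-- A positive Int is a variable, a negative Int is its negation.
type Literal = Int
type Clause = [Literal]
type Assignment = [Literal] -- literals currently assumed true

-- Returns (satisfiable?, number of visited nodes).
search :: [Clause] -> [Int] -> Assignment -> (Bool, Int)
search clauses vars model
  -- every clause contains a literal that is true: success, one node visited
  | all (any (`elem` model)) clauses = (True, 1)
  -- some clause has all of its literals false: dead end, one node visited
  | any (all ((`elem` model) . negate)) clauses = (False, 1)
  | otherwise = case vars of
      [] -> (False, 1)
      v : vs ->
        let (r1, n1) = search clauses vs (v : model)
        in if r1
             then (True, n1 + 1) -- second branch never visited, never counted
             else
               let (r2, n2) = search clauses vs (negate v : model)
               in (r2, n1 + n2 + 1)
```

For example, search [[1,2],[-1]] [1,2] [] visits the root, the failed branch with variable 1 set to true, and two nodes on the successful branch, so it returns (True,4).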

Basically, what you described as your solution in an imperative language can be modeled by passing around a counting variable and adding it to the result at the time you return it (the bottom of the recursion that reaches a satisfiable assignment), i.e. for a function a -> b you would create a new function a -> Int -> (b, Int). The Int argument is the current state of the counter, and the result is enriched with the updated state of the counter.
This can further be re-expressed elegantly using the state monad. A very nice tutorial on Haskell in general and the state monad is here. Basically, the transformation of a -> b into a -> Int -> (b, Int) can be seen as a transformation of a -> b into a -> State Int b, simply by giving a nicer name to the function Int -> (b, Int). There is a very nice blog that explains where these nice abstractions come from in a very accessible way.
import Control.Monad.Trans.State (State, evalState, modify)

type Count = Int

dpllM :: (Node -> Variable) -> Sentence -> [Variable] -> Model -> State Count Bool
dpllM heurFun sentence vars model | ... = do
  -- do your thing
  modify (+1)
  -- do your thing

-- evalState keeps only the Bool and discards the final counter;
-- use runState instead to get (result, count) back.
dpll' :: (Node -> Variable) -> Sentence -> [Variable] -> Model -> Bool
dpll' heurFun sentence vars model = evalState (dpllM heurFun sentence vars model) 0

Maybe you want something like
f :: A -> Int -> (Bool, Int)
f a c =
  let a' = ...
      a'' = ...
      (b', c') = f a' c in f a'' c'
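Since the question restricts the allowed libraries, the state-passing idea from the answers above can also be packaged by hand. The following is a minimal sketch, assuming nothing beyond the Prelude: Counter, tick and subsetSum are hypothetical names, and the toy subset-sum search merely stands in for the DPLL recursion to show how a visited-node count threads through both branches:

```haskell
-- A hand-rolled state monad specialised to an Int counter (hypothetical names).
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap f (Counter g) = Counter $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Counter where
  pure a = Counter $ \n -> (a, n)
  Counter f <*> Counter g = Counter $ \n ->
    let (h, n') = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Counter where
  Counter g >>= k = Counter $ \n ->
    let (a, n') = g n in runCounter (k a) n'

-- Record one visited node.
tick :: Counter ()
tick = Counter $ \n -> ((), n + 1)

-- Toy search: does some subset of xs sum to target?
-- Every call ticks once, so the final state is the number of visited nodes.
subsetSum :: Int -> [Int] -> Counter Bool
subsetSum target xs = do
  tick
  case xs of
    _ | target == 0 -> return True
    [] -> return False
    y : ys -> do
      found <- subsetSum (target - y) ys
      if found then return True else subsetSum target ys
```

Running runCounter (subsetSum 3 [1,2]) 0 gives (True,3): the counter survives the unwinding of the recursion because each call hands its updated count to whatever runs next.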

Related

debugging a Haskell application

After learning a few basics I wanted to try a "real world application" in Haskell, and started with a BitTorrent client. Following the explanation from this blog post, I did NOT use the Attoparsec parser combinator library. Instead, following Hutton's book, I started writing the parser combinators myself. This is the code that I have so far (still at the parsing stage, a long journey ahead):
module Main where

import System.Environment (getArgs)
import qualified Data.Map as Map
import Control.Monad (liftM, ap)
import Data.Char (isDigit, isAlpha, isAlphaNum, ord)
import Data.List(foldl')

main :: IO ()
main = do
  [fileName] <- getArgs
  contents <- readFile fileName
  download . parse $ contents

parse :: String -> Maybe BenValue
parse s = case runParser value s of
  [] -> Nothing
  [(p, _)] -> Just p

download :: Maybe BenValue -> IO ()
download (Just p) = print p
download _ = print "Oh!! Man!!"

data BenValue = BenString String
              | BenNumber Integer
              | BenList [BenValue]
              | BenDict (Map.Map String BenValue)
              deriving(Show, Eq)
-- From Hutton, this follows: a Parser is a function
-- that takes a string and returns a list of results
-- each containing a pair : a result of type a and
-- an output string. (the string is the unconsumed part of the input).
newtype Parser a = Parser (String -> [(a, String)])

-- Unit takes a value and returns a Parser (a function)
unit :: a -> Parser a
unit v = Parser (\inp -> [(v, inp)])

failure :: Parser a
failure = Parser (\inp -> [])

one :: Parser Char
one = Parser $ \inp -> case inp of
  [] -> []
  (x: xs) -> [(x, xs)]

runParser :: Parser a -> String -> [(a, String)]
runParser (Parser p) inp = p inp

bind :: Parser a -> (a -> Parser b) -> Parser b
bind (Parser p) f = Parser $ \inp -> case p inp of
  [] -> []
  [(v, out)] -> runParser (f v) out

instance Monad Parser where
  return = unit
  p >>= f = bind p f

instance Applicative Parser where
  pure = unit
  (<*>) = ap

instance Functor Parser where
  fmap = liftM

choice :: Parser a -> Parser a -> Parser a
choice p q = Parser $ \inp -> case runParser p inp of
  [] -> runParser q inp
  x -> x

satisfies :: (Char -> Bool) -> Parser Char
satisfies p = do
  x <- one
  if p x
    then unit x
    else failure
digit :: Parser Char
digit = satisfies isDigit

letter :: Parser Char
letter = satisfies isAlpha

alphanum :: Parser Char
alphanum = satisfies isAlphaNum

char :: Char -> Parser Char
char x = satisfies (== x)

many :: Parser a -> Parser [a]
many p = choice (many1 p) (unit [])

many1 :: Parser a -> Parser [a]
many1 p = do
  v <- p
  vs <- many p
  unit (v:vs)

peek :: Parser Char
peek = Parser $ \inp -> case inp of
  [] -> []
  v@(x:xs) -> [(x, v)]

taken :: Int -> Parser [Char]
taken n = do
  if n > 0
    then do
      v <- one
      vs <- taken (n-1)
      unit (v:vs)
    else unit []

takeWhile1 :: (Char -> Bool) -> Parser [Char]
takeWhile1 pred = do
  v <- peek
  if pred v
    then do
      one
      vs <- takeWhile1 pred
      unit (v:vs)
    else unit []

decimal :: Integral a => Parser a
decimal = foldl' step 0 `fmap` takeWhile1 isDigit
  where step a c = a * 10 + fromIntegral (ord c - 48)
string :: Parser BenValue
string = do
  n <- decimal
  char ':'
  BenString <$> taken n

signed :: Num a => Parser a -> Parser a
signed p = (negate <$> (char '-' *> p) )
  `choice` (char '+' *> p)
  `choice` p

number :: Parser BenValue
number = BenNumber <$> (char 'i' *> (signed decimal) <* char 'e')

list :: Parser BenValue
list = BenList <$> (char 'l' *> (many value) <* char 'e')

dict :: Parser BenValue
dict = do
  char 'd'
  pair <- many ((,) <$> string <*> value)
  char 'e'
  let pair' = (\(BenString s, v) -> (s,v)) <$> pair
  let map' = Map.fromList pair'
  unit $ BenDict map'

value = string `choice` number `choice` list `choice` dict
The above is a mix of code read and understood from three sources: the blog, the library, and the book. The download function just prints the "parse tree" obtained from the parser. Once I get the parser working, I will fill in the download function and test it out.
The parser is NOT working on a few of the torrent files. :( There is definitely a chance that I used code from the references incorrectly, and I would like to know if there is anything obviously wrong.
It works on "toy" examples and also on the test file picked from combinatorrent.
When I pick a real-world torrent such as Debian or Ubuntu, it fails.
I would like to debug and see what is happening, but debugging with GHCi does not seem straightforward. I've tried the :trace / :history style debugging mentioned in this document, but it looks very primitive :-) .
My question to the experts out there is: "how to debug!!" :-)
Would really appreciate any hints on approaching how to debug this.
Thanks.

Because Haskell code is pure, "stepping" through it is less essential than in other languages. When I step through some Java code, I am often trying to see where a certain variable gets changed. That is obviously a non-issue in Haskell given that things are immutable.
That means we can also run snippets of code in GHCi to debug what is happening without worrying that what we run is going to change some global state, or what we run will work any differently than how it would if called deep inside our program. This mode of work benefits from iterating your design slowly building it to work on the full range of expected inputs.
Parsing is always a bit unpleasant, even in imperative languages. Nobody wants to run a parser just to get back Nothing; you want to know why you got back nothing. To that effect, most parser libraries are helpful in giving you some information about what went wrong. That is a point in favour of using a parser like attoparsec. Also, attoparsec works with ByteString by default, which is perfect for binary data. If you want to roll your own parser implementation, you'll have to debug it too.
Finally, based on your comments, it looks like you are having issues with character encodings. This is exactly the reason why we have ByteString - it represents a packed sequence of bytes - no encodings. The extension OverloadedStrings even makes it pretty easy to make ByteString literals that look just like regular strings.
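If you do keep a hand-rolled parser, one way to find out why it returned nothing is to make failure carry a message. The following is a minimal sketch of that idea, not the asker's parser and not attoparsec's API: it swaps the list-of-results type for Either String, and the names satisfy, digits and benInt are hypothetical. It parses only a bencoded integer such as i42e:

```haskell
import Data.Char (isDigit)

-- A parser that either fails with a message or returns a value and the rest of the input.
newtype Parser a = Parser { runParser :: String -> Either String (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \inp -> case p inp of
    Left err -> Left err
    Right (a, rest) -> Right (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \inp -> Right (a, inp)
  Parser pf <*> Parser pa = Parser $ \inp -> case pf inp of
    Left err -> Left err
    Right (f, rest) -> case pa rest of
      Left err -> Left err
      Right (a, rest') -> Right (f a, rest')

instance Monad Parser where
  Parser p >>= k = Parser $ \inp -> case p inp of
    Left err -> Left err
    Right (a, rest) -> runParser (k a) rest

-- Consume one character matching the predicate, or report what was expected
-- together with the first few characters of the remaining input.
satisfy :: String -> (Char -> Bool) -> Parser Char
satisfy what p = Parser $ \inp -> case inp of
  c : rest | p c -> Right (c, rest)
  _ -> Left ("expected " ++ what ++ " but found " ++ show (take 10 inp))

-- Zero or more digits; never fails.
digits :: Parser String
digits = Parser $ \inp -> Right (span isDigit inp)

-- Unsigned bencoded integer only: i<digits>e
benInt :: Parser Integer
benInt = do
  _ <- satisfy "'i'" (== 'i')
  d <- satisfy "a digit" isDigit
  ds <- digits
  _ <- satisfy "'e'" (== 'e')
  return (read (d : ds))
```

With this shape, runParser benInt "i4x" yields a Left value naming the expected 'e' and the offending input, which can then be tried on small snippets in GHCi.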

Haskell Optimizations for List Processing stymied by Lazy Evaluation

I'm trying to improve the efficiency of the following code. I want to count all occurrences of a symbol before a given point (as part of pattern-matching using a Burrows-Wheeler transform). There's some overlap in how I'm counting symbols. However, when I have tried to implement what looks like it should be more efficient code, it turns out to be less efficient, and I'm assuming that lazy evaluation and my poor understanding of it is to blame.
My first attempt at a counting function went like this:
count :: Ord a => [a] -> a -> Int -> Int
count list sym pos = length . filter (== sym) . take pos $ list
Then in the body of the matching function itself:
matching str refCol pattern = match 0 (n - 1) (reverse pattern)
  where n = length str
        refFstOcc sym = length $ takeWhile (/= sym) refCol
        match top bottom [] = bottom - top + 1
        match top bottom (sym : syms) =
          let topCt = count str sym top
              bottomCt = count str sym (bottom + 1)
              middleCt = bottomCt - topCt
              refCt = refFstOcc sym
          in if middleCt > 0
               then match (refCt + topCt) (refCt + bottomCt - 1) syms
               else 0
(Stripped down for brevity - I'm memoizing first occurrences of symbols in refCol through a Map, and a couple other details as well).
Edit: Sample use would be:
matching "AT$TCTAGT" "$AACGTTTT" "TCG"
which should be 1 (assuming I didn't mistype anything).
Now, I'm recounting everything in the middle between the top pointer and the bottom twice, which adds up when I count a million character DNA string with only 4 possible choices for characters (and profiling tells me that this is the big bottleneck, too, taking 48% of my time for bottomCt and around 38% of my time for topCt). For reference, when calculating this for a million character string and trying to match 50 patterns (each of which is between 1 and 1000 characters), the program takes about 8.5 to 9.5 seconds to run.
However, if I try to implement the following function:
countBetween :: Ord a => [a] -> a -> Int -> Int -> (Int, Int)
countBetween list sym top bottom =
  let (topList, bottomList) = splitAt top list
      midList = take (bottom - top) bottomList
      getSyms = length . filter (== sym)
  in (getSyms topList, getSyms midList)
(with changes made to the matching function to compensate), the program takes between 18 and 22 seconds to run.
I've also tried passing in a Map which can keep track of previous calls, but that also takes about 20 seconds to run and runs up the memory usage.
Similarly, I've shortened length . filter (== sym) to a fold, but again: 20 seconds for foldr, and 14-15 for foldl.
So what would be a proper Haskell way to optimize this code through rewriting it? (Specifically, I'm looking for something that doesn't involve precomputation - I may not be reusing strings very much - and which explains something of why this is happening).
Edit: More clearly, what I am looking for is the following:
a) Why does this behaviour happen in Haskell? How does lazy evaluation play a role, what optimizations is the compiler making to rewrite the count and countBetween functions, and what other factors may be involved?
b) What is a simple code rewrite which would address this issue so that I don't traverse the lists multiple times? I'm looking specifically for something which addresses that issue, rather than a solution which sidesteps it. If the final answer is, count is the most efficient possible way to write the code, why is that?

I'm not sure lazy evaluation has much to do with the performance of the code. I think the main problem is the use of String, which is a linked list, instead of a more performant string type.
Note that this call in your countBetween function:
let (topList, bottomList) = splitAt top list
will re-create the linked list corresponding to topList, meaning
a lot more allocations.
A Criterion benchmark to compare splitAt versus using take n/drop n
may be found here: http://lpaste.net/174526. The splitAt version is
about 3 times slower and, of course, has a lot more allocations.
Even if you don't want to "pre-compute" the counts you can improve
matters a great deal by simply switching to either ByteString or Text.
Define:
countSyms :: Char -> ByteString -> Int -> Int -> Int
countSyms sym str lo hi =
  length [ i | i <- [lo..hi], BS.index str i == sym ]
and then:
countBetween :: ByteString -> Char -> Int -> Int -> (Int,Int)
countBetween str sym top bottom = (a,b)
  where a = countSyms sym str 0 (top-1)
        b = countSyms sym str top (bottom-1)
Also, don't use reverse on large lists - it will reallocate the
entire list. Just index into a ByteString / Text in reverse.
Memoizing counts may or may not help. It all depends on how it's done.
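For the "traverse the list only once" part of the question, here is a minimal sketch that stays with plain lists and allocates no intermediate lists. The name countBetweenOnce is hypothetical, and this is an illustration of a strict single-pass loop rather than a measured fix; switching to ByteString as described above is still likely to matter more:

```haskell
-- Count occurrences of sym at indices [0, top) and [top, bottom) in one pass.
-- Hypothetical helper; the accumulators are forced as they are updated so that
-- no chain of (+1) thunks builds up.
countBetweenOnce :: Eq a => [a] -> a -> Int -> Int -> (Int, Int)
countBetweenOnce list sym top bottom = go 0 0 0 list
  where
    go t m _ [] = (t, m)
    go t m i (x : xs)
      | i >= bottom = (t, m) -- past the window: stop early
      | x /= sym = go t m (i + 1) xs
      | i < top = let t' = t + 1 in t' `seq` go t' m (i + 1) xs
      | otherwise = let m' = m + 1 in m' `seq` go t m' (i + 1) xs
```

On the sample string, countBetweenOnce "AT$TCTAGT" 'T' 2 6 returns (1,2), matching count str 'T' 2 = 1 and count str 'T' 6 - count str 'T' 2 = 2 from the original definition.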

It seems that the main point of the match routine is
to transform an interval (bottom,top) into another interval
based on the current symbol sym. The formulas are
basically:
ref_fst = index of sym in ref_col
-- defined in an outer scope
match :: Char -> (Int,Int) -> (Int,Int)
match sym (bottom, top) | bottom > top = (bottom, top) -- if the empty interval
match sym (bottom, top) =
  let
    top_count = count of sym in str from index 0 to top
    bot_count = count of sym in str from index 0 to bottom
    mid_count = top_count - bot_count
  in if mid_count > 0
       then (ref_fst + bot_count, ref_fst + top_count)
       else (1,0) -- the empty interval
And then matching is just a fold over pattern using match
with the initial interval (0, n-1).
Both top_count and bot_count can be computed efficiently
using a precomputed lookup table, and below is code which
does that.
If you run test1 you'll see a trace of how the interval
is transformed via each symbol in the pattern.
Note: There may be off-by-1 errors, and I've hard coded
ref_fst to be 0 - I'm not sure how this fits into the
larger algorithm, but the basic idea should be sound.
Note that once the counts vector has been created
there is no need to index into the original string anymore.
Therefore, even though I use a ByteString here for
the (larger) DNA sequence, it's not crucial, and the
mkCounts routine should work just as well if passed a String
instead.
Code also available at http://lpaste.net/174288
{-# LANGUAGE OverloadedStrings #-}

import Data.Vector.Unboxed ((!))
import qualified Data.Vector.Unboxed as UV
import qualified Data.Vector.Unboxed.Mutable as UVM
import qualified Data.ByteString.Char8 as BS
import Debug.Trace
import Text.Printf
import Data.List

mkCounts :: BS.ByteString -> UV.Vector (Int,Int,Int,Int)
mkCounts syms = UV.create $ do
  let n = BS.length syms
  v <- UVM.new (n+1)
  let loop x i | i >= n = return x
      loop x i = let s = BS.index syms i
                     (a,t,c,g) = x
                     x' = case s of
                            'A' -> (a+1,t,c,g)
                            'T' -> (a,t+1,c,g)
                            'C' -> (a,t,c+1,g)
                            'G' -> (a,t,c,g+1)
                            _ -> x
                 in do UVM.write v i x
                       loop x' (i+1)
  x <- loop (0,0,0,0) 0
  UVM.write v n x
  return v

data DNA = A | C | T | G
  deriving (Show)

getter :: DNA -> (Int,Int,Int,Int) -> Int
getter A (a,_,_,_) = a
getter T (_,t,_,_) = t
getter C (_,_,c,_) = c
getter G (_,_,_,g) = g

-- narrow a window
narrow :: Int -> UV.Vector (Int,Int,Int,Int) -> DNA -> (Int,Int) -> (Int,Int)
narrow refcol counts sym (lo,hi) | trace msg False = undefined
  where msg = printf "-- lo: %d hi: %d refcol: %d sym: %s top_cnt: %d bot_count: %d" lo hi refcol (show sym) top_count bot_count
        top_count = getter sym (counts ! (hi+1))
        bot_count = getter sym (counts ! lo)
narrow refcol counts sym (lo,hi) =
  let top_count = getter sym (counts ! (hi+1))
      bot_count = getter sym (counts ! (lo+0))
      mid_count = top_count - bot_count
  in if mid_count > 0
       then ( refcol + bot_count, refcol + top_count-1 )
       else (lo+1,lo) -- signal an empty window

findFirst :: DNA -> UV.Vector (Int,Int,Int,Int) -> Int
findFirst sym v =
  let n = UV.length v
      loop i | i >= n = n
      loop i = if getter sym (v ! i) > 0
                 then i
                 else loop (i+1)
  in loop 0

toDNA :: String -> [DNA]
toDNA str = map charToDNA str

charToDNA :: Char -> DNA
charToDNA = go
  where go 'A' = A
        go 'C' = C
        go 'T' = T
        go 'G' = G

dnaToChar A = 'A'
dnaToChar C = 'C'
dnaToChar T = 'T'
dnaToChar G = 'G'

first :: DNA -> BS.ByteString -> Int
first sym str = maybe len id (BS.elemIndex (dnaToChar sym) str)
  where len = BS.length str

test2 = do
  -- matching "AT$TCTAGT" "$AACGTTTT" "TCG"
  let str = "AT$TCTAGT"
      refcol = "$AACGTTTT"
      syms = toDNA "TCG"
      -- hard coded for now
      -- may be computed and memoized
      refcol_G = 4
      refcol_C = 3
      refcol_T = 5
      counts = mkCounts str
      w0 = (0, BS.length str -1)
      w1 = narrow refcol_G counts G w0
      w2 = narrow refcol_C counts C w1
      w3 = narrow refcol_T counts T w2
      firsts = (first A refcol, first T refcol, first C refcol, first G refcol)
  putStrLn $ "firsts: " ++ show firsts
  putStrLn $ "w0: " ++ show w0
  putStrLn $ "w1: " ++ show w1
  putStrLn $ "w2: " ++ show w2
  putStrLn $ "w3: " ++ show w3
  let (lo,hi) = w3
      len = if lo <= hi then hi - lo + 1 else 0
  putStrLn $ "length: " ++ show len

matching :: BS.ByteString -> BS.ByteString -> String -> Int
matching str refcol pattern =
  let counts = mkCounts str
      n = BS.length str
      syms = toDNA (reverse pattern)
      firsts = (first A refcol, first T refcol, first C refcol, first G refcol)
      go (lo,hi) sym = narrow refcol counts sym (lo,hi)
        where refcol = getter sym firsts
      (lo, hi) = foldl' go (0,n-1) syms
      len = if lo <= hi then hi - lo + 1 else 0
  in len

test3 = matching "AT$TCTAGT" "$AACGTTTT" "TCG"

F# Tree: Node Insertion

This is a question that extends F# Recursive Tree Validation, which I had nicely answered yesterday.
This question concerns inserting a child in an existing tree. This is the updated type I'd like to use:
type Name = string
type BirthYear = int
type FamilyTree = Person of Name * BirthYear * Children
and Children = FamilyTree list
My last question concerned checking the validity of the tree. This was the solution I decided to go with:
let rec checkAges minBirth = function
    | Person(_,b,_) :: t -> b >= minBirth && checkAges b t
    | [] -> true
let rec validate (Person(_,b,c)) =
    List.forall validate c && checkAges (b + 16) c
Now I would like to be able to insert a Person Simon as a child of a specific Person Hans, in the following form:
insertChildOf "Hans" simon:Person casperFamily:FamilyTree;;
So the input should be the parent name, the child and the family tree. Ideally it should then return a modified family tree, that is, a FamilyTree option.
What I am struggling with is incorporating the validate function to make sure the insertion is legal, and finding a way to insert the new Person properly in the list of children if that Person is already a parent, maybe as a separate function.
All help is welcome and very appreciated - thanks! :)

After your comment, here's code that will behave as expected:
let insert pntName (Person(_, newPrsnYear, _) as newPrsn) (Person (n,y,ch)) =
    let rec ins n y = function
        | [] -> if y < newPrsnYear && n = pntName then Some [newPrsn] else None
        | (Person (name, year, childs) as person) :: bros ->
            let tryNxtBros() = Option.map (fun x -> person::x) (ins n y bros)
            if y < newPrsnYear && n = pntName then // father OK
                if newPrsnYear < year then // brother OK -> insert here
                    Some (newPrsn::person::bros)
                else tryNxtBros()
            else // keep looking, first into eldest child ...
                match ins name year childs with
                | Some i -> Some (Person (name, year, i) :: bros)
                | _ -> tryNxtBros() // ... then into other childs
    Option.map (fun x -> Person (n, y, x)) (ins n y ch)
As in my previous answer I keep avoiding using List functions since I don't think they are a good fit in a tree structure unless the tree provides a traverse.
I might be a bit purist in the sense I use either List functions (with lambdas and combinators) or pure recursion, but in general I don't like mixing them.

Caching possible values of a function constructed at runtime

I have a data type with a few value constructors:
data DataType = C1 | C2 | C3 | ... | Cn
I'd like to build a function at run time from that data type to some other values (in fact, I'm doing this in an IO monad):
buildFun :: IO (DataType -> b)
buildFun = do
  ....
  return $ \x -> case x of
    C1 -> someProcessesToGetTheValue C1
    ...
    Cn -> someProcessesToGetTheValue Cn
Will this mean that someProcessesToGetTheValue will be called each time I call the returned function?
I'd prefer Haskell to evaluate someProcessesToGetTheValue inside buildFun (since those calls are quite expensive) and return a function which returns these fully evaluated expressions.
Can I force that behaviour? Perhaps by doing something like the following?:
buildFun :: IO (DataType -> b)
buildFun = do
  c1value <- return $ someProcessesToGetTheValue C1
  ...
  cnvalue <- return $ someProcessesToGetTheValue Cn
  return $ \x -> case x of
    C1 -> c1value
    ...
    Cn -> cnvalue

You don't have to involve the IO monad at all (and indeed do { x <- return v; ... } is identical to let x = v in ...), just bind the values outside the lambda:
buildFun :: IO (DataType -> b)
buildFun = do
  let v1 = someProcessesToGetTheValue C1
  ...
  return $ \x -> case x of { C1 -> v1; ... }
Haskell doesn't really specify anything about runtime evaluation behaviour, but on all common implementations this will ensure that the results are shared; see What does "floated out" mean? for more information.
However, it still won't evaluate v1…vn inside buildFun; instead, they will each be evaluated the first time the corresponding result of the function you return is evaluated. If you want to force them to be evaluated up-front, you can say let !v1 = someProcessesToGetTheValue C1 (this requires the BangPatterns language extension), or v1 <- evaluate $ someProcessesToGetTheValue C1 (from Control.Exception; this behaves better if someProcessesToGetTheValue C1 might throw an exception).
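A minimal runnable sketch of the evaluate variant, assuming a three-constructor DataType and a hypothetical stand-in named expensive for someProcessesToGetTheValue. Note that evaluate only forces to weak head normal form; for an Int result that is the whole value, but a lazy structure would need deeper forcing:

```haskell
import Control.Exception (evaluate)

data DataType = C1 | C2 | C3 deriving (Show, Eq)

-- Hypothetical stand-in for the expensive computation in the question.
expensive :: DataType -> Int
expensive C1 = sum [1 .. 1000]
expensive C2 = product [1 .. 10]
expensive C3 = 42

buildFun :: IO (DataType -> Int)
buildFun = do
  -- Each value is computed here, once, before the function is returned.
  v1 <- evaluate (expensive C1)
  v2 <- evaluate (expensive C2)
  v3 <- evaluate (expensive C3)
  -- The lambda only selects among the already-evaluated results.
  return $ \x -> case x of
    C1 -> v1
    C2 -> v2
    C3 -> v3
```

Calling the returned function repeatedly, for example f <- buildFun followed by several uses of f C1, reuses v1 rather than recomputing it.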

Instead of a function, why not define some data structure, like a list, of all the results of evaluating this function (indexed by the position of the constructor in the data type)? For example, something like this (not tested):
data DataType = C1 | C2 | C3 | ... | Cn deriving (Enum, Bounded)
cachedValues :: [b]
cachedValues = map someProcessesToGetTheValue ([minBound .. maxBound] :: [DataType])
getCachedValue :: DataType -> b
getCachedValue x = cachedValues !! (fromEnum x)
Since Haskell is lazy, it will store a thunk until it is run for the first time, after which it will remember the value.
(If list traversal over the list of size n is inefficient; you can use an array or Map instead. The idea is the same.)

Create (pseudo) Cyclic Discriminated Unions in F#

I've run into a small problem here. I wrote the Tortoise and Hare cycle detection algorithm.
type Node =
    | DataNode of int * Node
    | LastNode of int
let next node =
    match node with
    | DataNode(_,n) -> n
    | LastNode(_) -> failwith "Error"
let findCycle(first) =
    try
        let rec fc slow fast =
            match (slow,fast) with
            | LastNode(a),LastNode(b) when a=b -> true
            | DataNode(_,a), DataNode(_,b) when a=b -> true
            | _ -> fc (next slow) (next <| next fast)
        fc first <| next first
    with
    | _ -> false
This is working great for
let first = DataNode(1, DataNode(2, DataNode(3, DataNode(4, LastNode(5)))))
findCycle(first)
It shows false. Right. Now when I try to test it for a cycle, I'm unable to create a loop!
Obviously this would never work:
let first = DataNode(1, DataNode(2, DataNode(3, DataNode(4, first))))
But I need something of that kind! Can you tell me how to create one?

You can't do this with your type as you've defined it. See How to create a recursive data structure value in (functional) F#? for some alternative approaches which would work.
As an alternative to Brian's solution, you might try something like:
type Node =
    | DataNode of int * NodeRec
    | LastNode of int
and NodeRec = { node : Node }
let rec cycle = DataNode(1, { node =
                DataNode(2, { node =
                DataNode(3, { node =
                DataNode(4, { node = cycle}) }) }) })

Here is one way:
type Node =
    | DataNode of int * Lazy<Node>
    | LastNode of int
let next node = match node with |DataNode(_,n) -> n.Value |LastNode(_) -> failwith "Error"
let findCycle(first) =
    try
        let rec fc slow fast =
            match (slow,fast) with
            | LastNode(a),LastNode(b) when a=b->true
            | DataNode(a,_), DataNode(b,_) when a=b -> true
            | _ -> fc (next slow) (next <| next fast)
        fc first <| next first
    with
    | _ -> false
let first = DataNode(1, lazy DataNode(2, lazy DataNode(3, lazy DataNode(4, lazy LastNode(5)))))
printfn "%A" (findCycle(first))
let rec first2 = lazy DataNode(1, lazy DataNode(2, lazy DataNode(3, lazy DataNode(4, first2))))
printfn "%A" (findCycle(first2.Value))

Even though both Brian and kvb posted answers that work, I still felt I needed to see if it was possible to achieve the same thing in a different way. This code will give you a cyclic structure wrapped as a Seq<'a>
type Node<'a> = Empty | Node of 'a * Node<'a>
let cyclic (n:Node<_>) : _ =
    let rn = ref n
    let rec next _ =
        match !rn with
        | Empty -> rn := n; next Unchecked.defaultof<_>
        | Node(v, x) -> rn := x; v
    Seq.initInfinite next
let nodes = Node(1, Node(2, Node(3, Empty)))
cyclic <| nodes |> Seq.take 40 // val it : seq<int> = seq [1; 2; 3; 1; ...]
The structure itself is not cyclic, but it looks like it from the outside.
Or you could do this:
//removes warning about x being recursive
#nowarn "40"
type Node<'a> = Empty | Node of 'a * Lazy<Node<'a>>
let rec x = Node(1, lazy Node(2, lazy x))
let first =
    match x with
    | Node(1, Lazy(Node(2,first))) -> first.Value
    | _ -> Empty

Can you tell me how to create one?
There are various hacks to get a directly cyclic value in F# (as Brian and kvb have shown) but I'd note that this is rarely what you actually want. Directly cyclic data structures are a pig to debug and are usually used for performance and, therefore, made mutable.
For example, your cyclic graph might be represented as:
> Map[1, 2; 2, 3; 3, 4; 4, 1];;
val it : Map<int,int> = map [(1, 2); (2, 3); (3, 4); (4, 1)]
The idiomatic way to represent a graph in F# is to store a dictionary that maps from handles to vertices and, if necessary, another for edges. This approach is much easier to debug because you traverse indirect recursion via lookup tables that are comprehensible as opposed to trying to decipher a graph in the heap. However, if you want to have the GC collect unreachable subgraphs for you then a purely functional alternative to a weak hash map is apparently an unsolved problem in computer science.
