How does one debug infinite recursion in Haskell? - debugging

How would one debug this (obviously) flawed program using GHC's profiling tools? The program enters an infinite recursion within the second clause of frobnicate.
    -- Foo.hs
    frobnicate :: Show a => Maybe a -> String
    frobnicate Nothing = ""
    frobnicate x = case frobnicate x of
      "" -> "X"
      _ -> show x

    main :: IO ()
    main = print (frobnicate (Just "" :: Maybe String))
The example might look contrived, but it's actually a stripped-down version of a real bug I encountered today.
In an imperative language the mistake would be obvious, as the stack trace would say something like frobnicate -> frobnicate -> frobnicate -> .... But how would one discover this in Haskell? How would one narrow down the blame to this one particular function?
I tried something akin to the following:
ghc -fforce-recomp -rtsopts -prof -fprof-auto Foo.hs
./Foo +RTS -M250M -i0.001 -h
hp2ps -c Foo.hp
where -M250M caps the heap so the run doesn't kill the machine, and -i0.001 raises the sampling frequency in an attempt to catch the overflow in action (it happens very fast).
This produces this rather unhelpful plot:
There is no obvious overflow in this plot. The y-axis doesn't go past even a single megabyte! What am I doing wrong here?
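One low-tech way to narrow down the blame without the profiler, offered as a sketch and not as the accepted answer: thread an explicit depth budget ("fuel") through the suspect function, so that runaway recursion becomes a reportable Left instead of a hang. The name frobnicateFuel, the budget parameter, and the error message are all hypothetical; they are not part of the original program.

```haskell
-- Hypothetical fuel-limited variant of frobnicate.
-- The extra Int is a recursion budget; when it runs out we report
-- which clause was still recursing instead of looping forever.
frobnicateFuel :: Show a => Int -> Maybe a -> Either String String
frobnicateFuel _ Nothing = Right ""
frobnicateFuel fuel x
  | fuel <= 0 = Left "frobnicate: recursion budget exhausted in second clause"
  | otherwise =
      case frobnicateFuel (fuel - 1) x of
        Left err -> Left err          -- propagate the diagnosis upwards
        Right "" -> Right "X"
        Right _  -> Right (show x)
```

In GHCi, frobnicateFuel 1000 (Just "") comes back as a Left carrying the message, which points straight at the clause that never reaches a base case.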

Related

Haskell: is it possible to output type as a part of a program?

To help with debugging and writing programs in Haskell, I am wondering whether a Haskell program can output the types of its variables as part of the program. For example, I have the following code:
    listHEADFiles :: ReaderT LgRepo IO ()
    listHEADFiles = do
      ref <- resolveReference $ T.pack "HEAD"
      case ref of
        Nothing -> fail "Could not resolve reference named 'HEAD'"
        Just reference -> do
          obj <- lookupObject reference
          case obj of
            CommitObj commit -> do
              objects <- listAllObjects Nothing (commitOid commit)
              for_ objects (\case
                TreeObjOid toOid -> do
                  tree <- lookupTree toOid
                  treeEntries <- sourceTreeEntries tree
                  entries <- lift $ treeEntries
                  outputTypeOf entries
                )
            _ -> fail "'HEAD' is not a commit object"
I want to output the type of the variable entries because I fail to understand what exactly happens after I lift the value. I can go to the documentation, but working the type out by hand always perplexes me. I would like to know for sure what type it is when my program is executed. In other words, I want the functionality of :t in GHCi as part of my program. Is it possible?
You don't really want your program to output a type: you want the compiler to output a type when it compiles your program. The feature you're looking for is Partial type signatures. The idea is you put an incomplete signature on an expression, and you get out a compiler "error" telling you how to fill in the blanks. If you have no idea at all of the type, an acceptable incomplete signature would be just _:
(entries :: _) <- lift $ treeEntries
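
A minimal, self-contained sketch of the same trick on a top-level binding (indexed is a made-up function, not part of the question's code). Without the PartialTypeSignatures extension GHC rejects the wildcard with an error that names the inferred type; with the extension enabled, as assumed here, the module compiles and the inferred type is reported as a warning instead.

```haskell
{-# LANGUAGE PartialTypeSignatures #-}

-- | Pair every element with its position.
-- The result type is deliberately left as a wildcard so that GHC
-- reports what it inferred for it when the module is compiled.
indexed :: [a] -> _
indexed xs = zip [0 :: Int ..] xs
```

Once the compiler has told you the type, you would normally paste it in place of the wildcard and drop the extension again.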

'Share' or 'cache' an expression parameterized by only ambiguous types?

I have a tricky question;
So, I know that GHC will ‘cache’ (for lack of a better term) top level definitions and only compute them once, e.g. :
myList :: [Int]
myList = fmap (*10) [0..10]
Even if I use myList in several spots, GHC notices the value has no params, so it can share it and won’t ‘rebuild’ the list.
I want to do that, but with a computation which depends on a type-level context; a simplified example is:
    dependentList :: forall n. (KnownNat n) => [Nat]
    dependentList = [0..natVal (Proxy @n)]
So the interesting thing here, is that there isn’t a ‘single’ cacheable value for dependentList; but once a type is applied it reduces down to a constant, so in theory once the type-checker runs, GHC could recognize that several spots all depend on the ‘same’ dependentList; e.g. (using TypeApplications)
    main = do
      print (dependentList @5)
      print (dependentList @10)
      print (dependentList @5)
My question is, will GHC recognize that it can share both of the 5 lists? Or does it compute each one separately? Technically it would even be possible to compute those values at Compile-time rather than run-time, is it possible to get GHC to do that?
My case is a little more complicated, but should follow the same constraints as the example, however my dependentList-like value is intensive to compute.
I’m not at all opposed to doing this using a typeclass if it makes things possible; does GHC cache and re-use typeclass dictionaries? Maybe I could bake it into a constant in the typeclass dict to get caching?
Ideas anyone? Or anyone have reading for me to look at for how this works?
I'd prefer to do this in such a way that the compiler can figure it out rather than using manual memoization, but I'm open to ideas :)
Thanks for your time!
As suggested by @crockeea I ran an experiment; here's an attempt using a top-level constant with a polymorphic ambiguous type variable, and also an actual constant just for fun. Each one contains a 'trace':
    dependant :: forall n . KnownNat n => Natural
    dependant = trace ("eval: " ++ show (natVal (Proxy @n))) (natVal (Proxy @n))

    constantVal :: Natural
    constantVal = trace "constant val: 1" 1

    main :: IO ()
    main = do
      print (dependant @1)
      print (dependant @1)
      print constantVal
      print constantVal
The results are unfortunate:
λ> main
eval: 1
1
eval: 1
1
constant val: 1
1
1
So clearly it re-evaluates the polymorphic constant each time it's used.
But if we write the constants into a typeclass (still using Ambiguous Types) it appears that it will resolve the Dictionary values only once per instance, which makes sense when you know that GHC passes the same dict for the same class instances. It does of course re-run the code for different instances:
    class DependantClass n where
      classNat :: Natural

    instance (KnownNat n) => DependantClass (n :: Nat) where
      classNat = trace ("dependant class: " ++ show (natVal (Proxy @n))) (natVal (Proxy @n))

    main :: IO ()
    main = do
      print (classNat @1)
      print (classNat @1)
      print (classNat @2)
Result:
λ> main
dependant class: 1
1
1
dependant class: 2
2
As far as getting GHC to do these at compile-time, it looks like you'd do that with lift from TemplateHaskell using this technique.
Unfortunately you can't use this within the typeclass definition since TH will complain that the '@n' must be imported from a different module (yay TH) and it's not known concretely at compile time. You CAN do it wherever you USE the typeclass value, but it'll evaluate it once per lift and you have to lift EVERYWHERE you use it to get the benefits; pretty impractical.
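If pinning down the instantiations by hand is acceptable, one plain workaround is to give each instantiation you need its own monomorphic top-level name, so it becomes an ordinary constant like myList in the question. This is only a sketch: listFive and listTen are hypothetical names, the extension list is an assumption about what the question's snippet needs, and natVal is taken from GHC.TypeLits, where it returns an Integer.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, natVal)

-- Polymorphic in the type-level number n, so each use site receives its
-- own KnownNat dictionary and builds the list from scratch.
dependentList :: forall n. KnownNat n => [Integer]
dependentList = [0 .. natVal (Proxy @n)]

-- Monomorphic top-level constants: no parameters and no class constraint,
-- so every use refers to the same binding.
listFive :: [Integer]
listFive = dependentList @5

listTen :: [Integer]
listTen = dependentList @10
```

The cost is that every instantiation has to be named by hand, which is exactly the manual bookkeeping the question hoped to avoid.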

How to debug variables / recursive data types in Haskell

https://www.inf.ed.ac.uk/teaching/courses/inf1/fp/exams/exam-2016-paper1-answers.pdf
-- 3b
    trace :: Command -> State -> [State]
    trace Nil s = [s]
    trace (com :#: mov) s = t ++ [state mov (last t)]
      where t = trace com s
I am having trouble understanding section 3b. I try to debug the variables one by one, but I always end up violating the defined data types. The code confuses me and I want to see what the variables contain. How can I do that using Debug.Trace?
https://downloads.haskell.org/~ghc/latest/docs/html/libraries/base-4.12.0.0/Debug-Trace.html
Thank you.
Looks like section 3b is teaching 'How Recursion works'. Those slides cover only conventional Haskell lists. What you have with :#: is the opposite, sometimes called a snoc list. A sensible design for trace would then be to produce a snoc list result as well, but it doesn't (presumably because the lecturer thinks that by torturing beginners like this, they'll learn something). Haskell list comprehensions only work for lists, not snoc lists, so half the content in those slides is useless for this exercise. What 3b is really teaching you is structural inversion via recursion (which is the useful half of the slides).
"violating the defined data types"
Variable t in the code you give is local to function trace, so it seems difficult to access. But its definition
t = trace com s
is not:
We know trace :: Command -> State -> [State]
We can see in the equation for t that trace is applied to two arguments.
So the type of t must be the type of the result of trace, that is [State].
Are you unsure about the types of the arguments to trace in the equation for t? In particular, com is unpacked from the Command argument to trace at top level.
Then we need to understand the type for :#:. We have for Question 3
    data Command =
        Nil
      | Command :#: Move
That makes (:#:) an infix operator (which is why I've put it in parens).
Then we can ask GHCi for its type, to make sure.
Use the :type command, make sure to put parens.
For contrast, also ask for the :type of the usual Haskell List constructor (:) -- see the left-right inversion?
The term to the left of (:#:) is of type Command.
Then variable com in the equation for t must be type Command; and that fits what the call to trace is expecting.
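To actually watch the variables with Debug.Trace, something along these lines should work. It is a sketch under assumptions: the exam's definitions of Move, State and state are not reproduced in this thread, so the ones below are hypothetical stand-ins. Note that the exercise's own function is called trace, so Debug.Trace has to be imported qualified to avoid a name clash.

```haskell
import qualified Debug.Trace as Debug

-- Hypothetical stand-ins for the exam's types; only Command is from the question.
data Move = Step Int deriving Show
type State = Int

data Command
  = Nil
  | Command :#: Move
  deriving Show

-- Hypothetical state transition: a move shifts the state by its amount.
state :: Move -> State -> State
state (Step n) s = s + n

trace :: Command -> State -> [State]
trace Nil s = [s]
trace (com :#: mov) s = t ++ [state mov (last t)]
  where
    -- traceShow prints its first argument to stderr when t is first
    -- demanded, then returns the second argument unchanged.
    t = Debug.traceShow ("com", com, "mov", mov) (trace com s)
```

Because of laziness the messages appear in the order the values are demanded, which is not necessarily the order in which the equations are written.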

How to print with line number and stack trace in Haskell? [duplicate]

This question already has an answer here:
How can I figure out the line number where exception occured in Haskell?
(1 answer)
Closed 1 year ago.
In Java I appended this to my print statements and they had a stack trace...
How can we print line numbers to the log in java
    public static int getLineNumber() {
        // The second row of the stack trace had the caller file name, etc.
        return Thread.currentThread().getStackTrace()[2].getLineNumber();
    }
How do I do this in Haskell?
One option appears to be the use of a library like loc-th where you can, for example, write an error message with the line information:
    {-# LANGUAGE TemplateHaskell #-}
    -- app/Main.hs
    module Main where

    import Debug.Trace.LocationTH

    main :: IO ()
    main = do
        $failure "Error"
        putStrLn "Hello"
gives me
my-exe: app/Main.hs:10:5-12: Error
It also provides a string which one could look at, in order to determine the line number. However, I'd imagine that's a bit frowned upon, depending on your use-case. For example, I wouldn't want to see this method used to just log line numbers.
There's more on Haskell debugging techniques here.
Honestly, though, maybe this isn't the greatest idea. What are you planning on doing with the line number?
I think I found a solution:
Debug.Trace: Functions for tracing and monitoring execution.
traceStack :: String -> a -> a
like trace, but additionally prints a call stack if one is available.
In the current GHC implementation, the call stack is only available if
the program was compiled with -prof; otherwise traceStack behaves
exactly like trace. Entries in the call stack correspond to SCC
annotations, so it is a good idea to use -fprof-auto or
-fprof-auto-calls to add SCC annotations automatically.
Since: 4.5.0.0
^ https://hackage.haskell.org/package/base-4.8.2.0/docs/Debug-Trace.html
As noted in @user2407038's comment, modern GHC makes a CallStack available, see docs at https://www.stackage.org/haddock/lts-17.13/base-4.14.1.0/GHC-Stack.html#t:HasCallStack
Print the callstack like this:
import GHC.Stack
msgStacktraced :: HasCallStack => String -> IO ()
msgStacktraced msg = putStrLn (msg ++ "\n" ++ prettyCallStack callStack)
You'll need that HasCallStack constraint on anything that can call msgStacktraced as well, or it'll be hidden from the call stack.
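A small sketch of both points, using only GHC.Stack from base. The names stackNames, withLocation, tracked and untracked are made up for illustration. The first entry of the call stack describes the call site of the function holding the constraint, which is where the file and line number come from.

```haskell
import GHC.Stack (HasCallStack, SrcLoc (..), callStack, getCallStack)

-- | Names of the functions on the current call stack, most recent first.
stackNames :: HasCallStack => [String]
stackNames = map fst (getCallStack callStack)

-- | Hypothetical logging helper: prefix a message with the file and line
-- of the place where withLocation was called.
withLocation :: HasCallStack => String -> String
withLocation msg = case getCallStack callStack of
  (_, loc) : _ -> srcLocFile loc ++ ":" ++ show (srcLocStartLine loc) ++ ": " ++ msg
  []           -> msg

-- Carries the constraint, so it shows up in the stack that stackNames sees.
tracked :: HasCallStack => [String]
tracked = stackNames

-- No constraint: the stack is cut off here and only stackNames is recorded.
untracked :: [String]
untracked = stackNames
```

Calling tracked from ordinary code yields a list starting with "stackNames" followed by "tracked", whereas untracked yields only "stackNames", which is the hiding effect described above.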

Infinite recursion in Haskell

This question is essentially a duplicate of Debugging infinite loops in Haskell programs with GHCi. The author there solved it manually, though I'd like to know other solutions.
(my particular problem)
I have an arrow code which contains a recursive invocation,
    testAVFunctor = proc x -> do
        y <- errorArrow "good error" -< x
        z <- isError -< y
        (passError ||| testAVFunctor) -< trace "value of z" z
The errorArrow should make the recursive testAVFunctor not execute, since that will cause isError to return a Left (AVError "good error") which should in turn choose the passError route and bypass the recursive call.
The very odd thing is that inserting "trace" calls at popular sites like the function composition results in the program emitting a finite amount of output, then freezing. Not what I'd expect from an infinite term expansion problem. (see edit 1)
I've uploaded my source code here if anyone is so curious.
EDIT 1
I wasn't looking in the right place (if you care to look at the source, apparently avEither was looping). The way I got there was by compiling a binary, and running gdb:
gdb Main
r (runs code)
Ctrl+C (send interrupt). The backtrace will be useless, but what you can do, is hit
s (step). Then, hold down the enter key; you should see a lot of method names fly by. Hopefully one of them will be recognizable.
You can compile with ghc flag -O0 to disable optimization, which can reveal more method names.
EDIT 3
Apparently, the proc x -> do block above was causing the code to generate combinators, which in turn caused the AVFunctor.arr lifting method to be called -- something in there must be violating laziness. If I rewrite the top level function as
    testAVFunctor = errorArrow "good error" >>>
        isError >>> (passError ||| testAVFunctor)
then everything works fine. I guess it's time to try learning and using garrows (by a grad student here at Berkeley).
My general takeaway from the experience is that ghci debugging can be frustrating. For example, I managed to make the argument f of AVFunctor.arr show up as a local variable, but I can't get anything terribly informative from it:
> :i f
f :: b -> c -- <no location info>
Revised source code is here
Keep in mind that the meaning of (|||) depends on the arrow, and testAVFunctor is an infinite object of your arrow:
    testAVFunctor = proc x -> do
        ...
        (passError ||| proc x -> do
            ...
            (passError ||| proc x -> ...) -< trace "value of z" z)
          -< trace "value of z" z
I'm not sure if you were aware of that. Examine the definition of (|||) (or if there isn't one, left) to see if it can handle infinite terms. Also check (>>>) (er, (.) in modern versions I think). Make sure the combinators are not strict, because then an infinite term will diverge. This may involve making patterns lazier with ~ (I have had to do this a lot when working with arrows). The behavior you're seeing might be caused by too much strictness in one of the combinators, so it evaluates "far enough" to give some output but then gets stuck later.
Good luck. You're into the deep subtleness of Haskell.
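As a small, self-contained illustration of the strictness point (it is unrelated to the asker's AVFunctor code, and weave and knot are made-up names): a self-referential value only produces output if the function it is fed through does not force its argument before yielding a constructor. The lazy pattern marked with ~ is what makes that possible here.

```haskell
-- The ~ makes the tuple pattern irrefutable: the argument is not
-- evaluated until xs or ys is actually used.
weave :: ([Int], [Int]) -> ([Int], [Int])
weave ~(xs, ys) = (1 : ys, 2 : xs)

-- A value defined in terms of itself. With the lazy pattern above this is
-- a productive pair of infinite lists; with a strict pattern, weave would
-- have to evaluate knot before returning anything, and knot would diverge.
knot :: ([Int], [Int])
knot = weave knot
```

In GHCi, take 4 (fst knot) gives [1,2,1,2]. Removing the ~ from weave and evaluating the same expression never produces a result, which mirrors a combinator that is too strict for an infinite arrow term.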
