Looking at the description for traceIO, I feel that it does exactly what hPutStrLn stderr does. However, when I looked into its source code:
traceIO :: String -> IO ()
traceIO msg = do
    withCString "%s\n" $ \cfmt -> do
        -- NB: debugBelch can't deal with null bytes, so filter them
        -- out so we don't accidentally truncate the message. See Trac #9395
        let (nulls, msg') = partition (=='\0') msg
        withCString msg' $ \cmsg ->
            debugBelch cfmt cmsg
        when (not (null nulls)) $
            withCString "WARNING: previous trace message had null bytes" $ \cmsg ->
                debugBelch cfmt cmsg
It seems it uses a foreign routine called debugBelch, which I failed to find any documentation about. So what does traceIO do that can't be done by hPutStrLn stderr?
One thing I can think of is that it might ensure that the string is printed as a unit, without any other trace messages inside. Indeed an experiment seems to confirm this:
Prelude Debug.Trace System.IO> traceIO $ "1" ++ trace "2" "3"
2
13
Prelude Debug.Trace System.IO> hPutStrLn stderr $ "1" ++ trace "2" "3"
12
3
Another difference is that it seems to remove characters that cannot safely be printed to stderr:
Prelude Debug.Trace System.IO> hPutStrLn stderr "\9731"
*** Exception: <stderr>: hPutChar: invalid argument (invalid character)
Prelude Debug.Trace System.IO> traceIO "\9731"
Prelude Debug.Trace System.IO>
As @dfeuer reminded me, none of these features would be impossible to write in Haskell. So the deciding factor is probably this: debugBelch is already a predefined C function, used all over the place in GHC's runtime system, which is written in C and C--, not Haskell.
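To try the two observations above outside GHCi, here is a minimal sketch. It assumes only GHC's base library. splitNulls is a hypothetical helper that mirrors the partition step in the traceIO source quoted above, and the labels nested-a and nested-b are made up so that the two nested traces stay distinct.

```haskell
import Data.List (partition)
import Debug.Trace (trace, traceIO)
import System.IO (hPutStrLn, stderr)

-- Hypothetical helper: the same split traceIO performs before calling
-- debugBelch (null bytes first, remaining characters second).
splitNulls :: String -> (String, String)
splitNulls = partition (== '\0')

main :: IO ()
main = do
  -- traceIO has to walk the whole string before it prints anything, so the
  -- nested trace fires before the outer message appears.
  traceIO ("1" ++ trace "nested-a" "3")
  -- hPutStrLn consumes the string lazily, so the nested trace can fire
  -- part-way through the outer message.
  hPutStrLn stderr ("1" ++ trace "nested-b" "3")
  -- Null bytes are filtered out of the message and reported separately.
  traceIO "nul\0byte"
  print (splitNulls "nul\0byte")
```

Everything except the final print goes to stderr. Going by the source quoted above, the last traceIO call should print the message without the null byte, followed by the warning line.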
Issue
Following is a minimal, contrived example:
read :: FilePath -> Aff String
read f = do
  log ("File: " <> f) -- (1)
  readTextFile UTF8 f -- (2)
I would like to do some debug logging in (1), before a potential error in (2) occurs. Executing the following code in the Spago REPL works for the success case:
$ spago repl
> launchAff_ $ read "test/data/tree/root.txt"
File: test/data/tree/root.txt
unit
Problem: if (2) fails (here because the path is a directory), (1) seems not to be executed at all:
$ spago repl
> launchAff_ $ read "test/data/tree"
~/purescript-book/exercises/chapter9/.psci_modules/node_modules/Effect.Aff/foreign.js:532
throw util.fromLeft(step);
^
[Error: EISDIR: illegal operation on a directory, read] {
errno: -21,
code: 'EISDIR',
syscall: 'read'
}
The original problem is more complex and includes several layers of recursion (see E-Book exercise 3), where I need logging to debug the above error.
Questions
How can I properly log regardless of upcoming errors here?
(Optional) Is there a more sophisticated, well-established debugging alternative - purescript-debugger? A dedicated VS Code debug extension/functionality would be the cherry on the cake.
First of all, the symptoms you observe do not mean that the first line doesn't execute. It does always execute, you're just not seeing output from it due to how console works in the PureScript REPL. The output gets swallowed. Not the only problem with REPL, sadly.
You can verify that the first line is always executed by replacing log with throwError and observing that the error always gets thrown. Or, alternatively, you can make the first line modify a mutable cell instead of writing to the console, and then examine the cell's contents.
Finally, this only happens in REPL. If you put that launchAff_ call inside main and run the program, you will always get the console output.
Now to the actual question at hand: how to debug trace.
Logging to console is fine if you can afford it, but there is a more elegant way: Debug.trace.
This function has a hidden effect - i.e. its type says it's pure, but it really produces an effect when called. This little lie lets you use trace in a pure setting and thus debug pure code. No need for Effect! This is ok as long as used for debugging only, but don't put it in production code.
The way it works is that it takes two parameters: the first one gets printed to console and the second one is a function to be called after printing, and the result of the whole thing is whatever that function returns. For example:
calculateSomething :: Int -> Int -> Int
calculateSomething x y =
  trace ("x = " <> show x) \_ ->
    x + y
main :: Effect Unit
main =
  log $ show $ calculateSomething 37 5
> npx spago run
'x = 37'
42
The first parameter can be anything at all, not just a string. This lets you easily print a lot of stuff:
calculateSomething :: Int -> Int -> Int
calculateSomething x y =
  trace { x, y } \_ ->
    x + y
> npx spago run
{ x: 37, y: 5 }
42
Or, applying this to your code:
read :: FilePath -> Aff String
read f = trace ("File: " <> f) \_ -> do
  readTextFile UTF8 f
But here's a subtle detail: this tracing happens as soon as you call read, even if the resulting Aff will never be actually executed. If you need tracing to happen on effectful execution, you'll need to make the trace call part of the action, and be careful not to make it the very first action in the sequence:
read :: FilePath -> Aff String
read f = do
  pure unit
  trace ("File: " <> f) \_ -> pure unit
  readTextFile UTF8 f
It is, of course, a bit inconvenient to do this every time you need to trace in an effectful context, so there is a special function that does it for you - it's called traceM:
read :: FilePath -> Aff String
read f = do
  traceM ("File: " <> f)
  readTextFile UTF8 f
If you look at its source code, you'll see that it does exactly what I did in the example above.
The sad part is that trace won't help you in REPL when an exception happens, because it's still printing to console, so it'll still get swallowed for the same reasons.
But even when it doesn't get swallowed, the output is a bit garbled, because trace actually outputs in color (to help you make it out among other output), and PureScript REPL has a complicated relationship with color:
> calculateSomething 37 5
←[32m'x = 37'←[39m
42
In addition to Fyodor Soikin's great answer, I found a variant using VS Code debug view.
1.) Make sure to build with sourcemaps:
spago build --purs-args "-g sourcemaps"
2.) Add debug configuration to VS Code launch.json:
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "pwa-node",
      "request": "launch",
      "name": "Launch Program",
      "skipFiles": ["<node_internals>/**"],
      "runtimeArgs": ["-e", "require('./output/Main/index.js').main()"],
      "smartStep": true // skips files without (valid) source map
    }
  ]
}
Replace "./output/Main/index.js" / .main() with the compiled .js file / function to be debugged.
3.) Set break points and step through the .purs file via sourcemap support.
The debugger can be programmatically invoked by executing (break). For example, the debugging banner then displays what caused the interrupt, the HELP line, the available restarts, some related info, and finally the source of the interrupt:
debugger invoked on a SIMPLE-CONDITION in thread
#<THREAD "main thread" RUNNING {10010B0523}>:
break
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [CONTINUE] Return from BREAK.
1: [ABORT ] Exit debugger, returning to top level.
#(
NODE: STATE=<NIL NIL NIL 0.0 0.0
( )> DEPTH=0)
#(
NODE: STATE=<NIL NIL NIL 0.0 0.0
((ACTIVE GATE1) (ACTIVE GATE2) (COLOR RECEIVER1 BLUE) (COLOR RECEIVER2 RED) (COLOR TRANSMITTER1 BLUE) (COLOR TRANSMITTER2 RED) (FREE ME) (LOC CONNECTOR1 AREA5) (LOC CONNECTOR2 AREA7) (LOC ME AREA5))> DEPTH=0)
(DF-BNB1 )
source: (BREAK)
0]
I don't understand the related info between the restarts and the source. Can this info be suppressed, as sometimes it is many lines long in my application? I've tried changing the debug & safety optimization settings, but to no effect.
The output you are confused by is related to the place in the code where break was invoked. When I call it from the vanilla Lisp REPL (without SLIME), it displays:
(SB-INT:SIMPLE-EVAL-IN-LEXENV (BREAK) #<NULL-LEXENV>)
However, if I do something wrong in the debugger, here's what happens:
0] q
; in: PROGN (PRINT 1)
; (PROGN Q)
;
; caught WARNING:
; undefined variable: COMMON-LISP-USER::Q
;
; compilation unit finished
; Undefined variable:
; Q
; caught 1 WARNING condition
debugger invoked on a UNBOUND-VARIABLE in thread
#<THREAD "main thread" RUNNING {10005204C3}>:
The variable Q is unbound.
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [CONTINUE ] Retry using Q.
1: [USE-VALUE ] Use specified value.
2: [STORE-VALUE] Set specified value and use it.
3: [ABORT ] Reduce debugger level (to debug level 1).
4: Return from BREAK.
5: Exit debugger, returning to top level.
((LAMBDA (#:G498)) #<unused argument>)
source: (PROGN Q)
You can see that the last line resembles the output you got with the line starting at source:. Actually, the output we saw consists of 3 main parts:
1. Description of the condition
2. Listing of the available restarts
3. Debug REPL prompt printed by debug-loop-fun
The last output is part of the prompt and it is generated by the invocation of:
(print-frame-call *current-frame* *debug-io* :print-frame-source t)
So, you can recompile the call providing :print-frame-source nil or try to understand why your current frame looks this way...
I noticed that the base package uses errorWithoutStackTrace to implement lots of functions. Is there any performance difference between the following two definitions?
head :: [a] -> a
head (x:_) = x
head [] = errorWithoutStackTrace ("Prelude.head: empty list")
head :: [a] -> a
head (x:_) = x
head [] = withFrozenCallStack $ error ("Prelude.head: empty list")
error means something bad happened so for most, if not all purposes, it does not matter how fast it is, because it indicates a program that's not working.
That said, a quick glance at the code is enough to reasonably guess that error does strictly more work than errorWithoutStackTrace (and that is compounded by the addition of withFrozenCallStack to the error variant of your code). Confirming that with benchmarks is left as an exercise to the reader.
Here's the definition of error and errorWithoutStackTrace:
https://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.Err.html#error
error s = raise# (errorCallWithCallStackException s ?callStack)
errorWithoutStackTrace s = raise# (errorCallException s)
Now those two internal functions are defined as follows:
errorCallException :: String -> SomeException
errorCallException s = toException (ErrorCall s)
errorCallWithCallStackException :: String -> CallStack -> SomeException
errorCallWithCallStackException s stk = unsafeDupablePerformIO $ do
  ...
  return $ toException (ErrorCallWithLocation s stack)
Note that both essentially do toException (something s), but errorCallWithCallStackException also has a whole lot more code to handle the stack (in "...").
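To see that the two variants raise the same kind of exception with the same message, here is a minimal sketch. The names headPlain, headFrozen and errorMessage are hypothetical; the two head functions mirror the definitions in the question, with the HasCallStack constraint added to the second one as an assumption so that there is a call stack to freeze.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)
import GHC.Stack (HasCallStack, withFrozenCallStack)

-- Variant 1: no call stack is captured at all.
headPlain :: [a] -> a
headPlain (x : _) = x
headPlain [] = errorWithoutStackTrace "headPlain: empty list"

-- Variant 2: error captures a call stack; withFrozenCallStack stops it
-- from growing any further at this call site.
headFrozen :: HasCallStack => [a] -> a
headFrozen (x : _) = x
headFrozen [] = withFrozenCallStack (error "headFrozen: empty list")

-- Force a value and return the ErrorCall message if one is raised.
errorMessage :: a -> IO (Maybe String)
errorMessage x = do
  r <- try (evaluate x)
  pure $ case r of
    Left (ErrorCall msg) -> Just msg
    Right _ -> Nothing

main :: IO ()
main = do
  errorMessage (headPlain ([] :: [Int])) >>= print
  errorMessage (headFrozen ([] :: [Int])) >>= print
```

Both calls yield an ErrorCall carrying just the message text. The extra work in the error variant goes into building the call-stack information around it, which only happens on the failing path.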
I have a small text file in Markdown:
---
title: postWithReference
author: auf
date: 2010-07-29
keywords: homepage
abstract: |
What are the objects of
ontologists .
bibliography: "/home/frank/Workspace8/SSG/site/resources/BibTexLatex.bib"
csl: "/home/frank/Workspace8/SSG/site/resources/chicago-fullnote-bibliography-bb.csl"
---
An example post. With a reference to [@Frank2010a] and more[@navratil08].
## References
and process it in Haskell with processCites' which has a single argument, namely the Pandoc data resulting from readMarkdown. The bibliography and the csl style should be taken from the input file.
The process does not produce errors, but the result of processCites is the same text as the input; references are not treated at all. For the same input the references are resolved with the standalone pandoc (this excludes errors in the bibliography and the csl style)
pandoc -f markdown -t html --filter=pandoc-citeproc -o p1.html postWithReference.md
The issue is therefore in the API. The code I have is:
markdownToHTML4 :: Text -> PandocIO Value
markdownToHTML4 t = do
    pandoc <- readMarkdown markdownOptions t
    let meta2 = flattenMeta (getMeta pandoc)
    -- test if biblio is present and apply
    let bib = Just $ ( meta2) ^? key "bibliography" . _String
    pandoc2 <- case bib of
        Nothing -> return pandoc
        _ -> do
            res <- liftIO $ processCites' pandoc -- :: Pandoc -> IO Pandoc
            when (res == pandoc) $
                liftIO $ putStrLn "*** markdownToHTML3 result without references ***"
            return res
    htmltex <- writeHtml5String html5Options pandoc2
    let withContent = ( meta2) & _Object . at "contentHtml" ?~ String ( htmltex)
    return withContent
getMeta :: Pandoc -> Meta
getMeta (Pandoc m _) = m
What do I misunderstand? Are there any reader options necessary for citeproc? The bibliography is a BibLaTeX file.
I found in hakyll code a comment, which I cannot understand in light of the code there - perhaps somebody knows what the intention is.
-- We need to know the citation keys, add then *before* actually parsing the
-- actual page. If we don't do this, pandoc won't even consider them
-- citations!
I have a workaround (not an answer to the original question; I still hope that somebody can identify my error!). It is simple to call the standalone pandoc with System.readProcess, pass it the text, and get the result back, without even reading or writing files:
processCites2x :: Maybe FilePath -> Maybe FilePath -> Text -> ErrIO Text
-- process the cites in the text (not with the API)
-- using systemcall because the standalone pandoc works with
-- call: pandoc -f markdown -t html --filter=pandoc-citeproc
-- with the input text on stdin and the result on stdout
-- the csl and bib file are used from text, not from what is in the arguments
processCites2x _ _ t = do
    putIOwords ["processCite2" ] -- - filein\n", showT styleFn2, "\n", showT bibfn2]
    let cmd = "pandoc"
    let cmdargs = ["--from=markdown", "--to=html5", "--filter=pandoc-citeproc" ]
    let cmdinp = t2s t
    res :: String <- callIO $ System.readProcess cmd cmdargs cmdinp
    return . s2t $ res
    -- errors are properly caught and reported in ErrIO
t2s and s2t are conversion utilities between String and Text, ErrIO is ErrorT Text IO, and callIO is essentially liftIO with error handling.
The original problem was very simple: I had not included the option Ext_citations in the markdownOptions. When it is included, the example works (thanks to help I received from the pandoc-citeproc issue page). The referenced code is updated...
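For readers who hit the same problem, a reader configuration with citations enabled might look roughly like the sketch below. It assumes the pandoc library's Text.Pandoc.Options and Text.Pandoc.Extensions modules. The actual markdownOptions from the question is not shown in the post, so this definition is an assumption and not the author's code.

```haskell
import Text.Pandoc.Extensions
  ( Extension (Ext_citations)
  , enableExtension
  , extensionEnabled
  , pandocExtensions
  )
import Text.Pandoc.Options (ReaderOptions (..), def)

-- Assumed shape of markdownOptions: start from pandoc's own Markdown
-- extension set and switch citations on explicitly, so that [@key]
-- is parsed as a citation rather than as plain text.
markdownOptions :: ReaderOptions
markdownOptions =
  def { readerExtensions = enableExtension Ext_citations pandocExtensions }

main :: IO ()
main =
  -- Quick check that the extension is enabled in the options.
  print (extensionEnabled Ext_citations (readerExtensions markdownOptions))
```

Without this extension the reader does not produce citation elements, which would explain why processCites' returned the document unchanged.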
Is there any way to view the reduction steps in Haskell, i.e., to trace the recursive function calls made? For example, Chez Scheme provides us with trace-lambda. Is there an equivalent form in Haskell?
You could try inserting Debug.Trace.trace in places you want to trace, but this has the tendency of (a) producing wildly out-of-order output, as your trace statement may belong to a thunk that isn't evaluated until far far away from the original call, and (b) changing the runtime behavior of your program, if tracing requires evaluating things that wouldn't otherwise have been evaluated (yet).
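As a minimal sketch of that approach, the following traces every call of a recursive function. fact is a made-up example. The first guard always fails, so it exists only to emit the trace message before the real equations are tried.

```haskell
import Debug.Trace (trace)

fact :: Integer -> Integer
fact n
  -- Always False: logs the call, then falls through to the equations below.
  | trace ("fact " ++ show n) False = undefined
fact 0 = 1
fact n = n * fact (n - 1)

main :: IO ()
main = print (fact 3)
```

The trace lines go to stderr and the result goes to stdout. In this small example each recursive call is forced as soon as it is reached, so the messages appear in call order; with lazier code, caveat (a) above applies.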
Is this for debugging? If so...
Hat modifies your source code to output tracing which can be viewed after running. The output should be pretty close to what you want: the example on their homepage is
For example, the computation of the faulty program
main = let xs :: [Int]
           xs = [4*2,5 `div` 0,5+6]
       in print (head xs,last' xs)
last' (x:xs) = last' xs
last' [x] = x
gives the result
(8, No match in pattern.
and the Hat viewing tools can be used to explore its behaviour as follows:
Hat-stack
For aborted computations, that is computations that terminated with an error message or were interrupted, hat-stack shows in which function call the computation was aborted. It does so by showing a virtual stack of function calls (redexes). Thus, every function call shown on the stack caused the function call above it. The evaluation of the top stack element caused the error (or during its evaluation the computation was interrupted). The stack shown is virtual, because it does not correspond to the actual runtime stack. The actual runtime stack enables lazy evaluation whereas the virtual stack corresponds to a stack that would be used for eager (strict) evaluation.
Using the same example program as above, hat-stack shows
$ hat-stack Example
Program terminated with error:
No match in pattern.
Virtual stack trace:
(Last.hs:6) last' []
(Last.hs:6) last' [_]
(Last.hs:6) last' [_,_]
(Last.hs:4) last' [8,_,_]
(unknown) main
$
These days, GHCi (≥6.8.1) also comes with a debugger:
$ ghci -fbreak-on-exception
GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> :l Example.hs
[1 of 1] Compiling Main ( Example.hs, interpreted )
Example.hs:5:0:
Warning: Pattern match(es) are overlapped
In the definition of `last'': last' [x] = ...
Ok, modules loaded: Main.
*Main> :trace main
(8,Stopped at <exception thrown>
_exception :: e = _
[<exception thrown>] *Main> :back
Logged breakpoint at Example.hs:(5,0)-(6,12)
_result :: t
[-1: Example.hs:(5,0)-(6,12)] *Main> :hist
-1 : last' (Example.hs:(5,0)-(6,12))
-2 : last' (Example.hs:5:15-22)
-3 : last' (Example.hs:(5,0)-(6,12))
-4 : last' (Example.hs:5:15-22)
-5 : last' (Example.hs:(5,0)-(6,12))
-6 : last' (Example.hs:5:15-22)
-7 : last' (Example.hs:(5,0)-(6,12))
-8 : main (Example.hs:3:25-32)
-9 : main (Example.hs:2:17-19)
-10 : main (Example.hs:2:16-34)
-11 : main (Example.hs:3:17-23)
-12 : main (Example.hs:3:10-33)
<end of history>
[-1: Example.hs:(5,0)-(6,12)] *Main> :force _result
*** Exception: Example.hs:(5,0)-(6,12): Non-exhaustive patterns in function last'
[-1: Example.hs:(5,0)-(6,12)] *Main> :back
Logged breakpoint at Example.hs:5:15-22
_result :: t
xs :: [t]
[-2: Example.hs:5:15-22] *Main> :force xs
xs = []
While not as nice, it has the benefit of being easily available, and being usable without recompiling your code.
There's a reduction count in hugs, if that helps?
Alternatively, could you use something like the hugs hood to wrap your code, to get more detail around what it's doing at each step?
Nothing of the kind is built into the Haskell standard.
I would hope that the Helium graphical interpreter would offer something like this, but the web page is silent on the topic.
A partial solution is to use vacuum to visualize data structures.
I've seen some gif animations of fold, scan and others, but I can't find them at the moment. I think Cale Gibbard made the animations.