I'm using GHC 8.4.2 on Windows. I have this program that depends on the library red-black-record, version 2.0.2.2:
{-# LANGUAGE DataKinds, TypeApplications #-}
module Main where
import Data.RBR (FromList,Delete,Variant,I,injectI,winnowI,match)
import GHC.TypeLits
type Phase01 = FromList '[
'("ctor1",Int), '("ctor2",Bool), '("ctor4",Char), '("ctor3",Char),
'("ctor6",Char), '("ctor5",Char), '("ctor10",Char), '("ctor11",Char),
'("ctor13",Char), '("ctor14",Char), '("ctor39",Char), '("ctor46",Char),
'("ctor47",Char), '("ctor44",Char), '("ctor43",Char), '("ctor7",Char),
'("ctor9",Char), '("ctor20",Char), '("ctor45",Char), '("ctor21",Char),
'("ctor48",Char), '("ctor49",Char), '("ctor50",Char), '("ctor41",Char),
'("ctor33",Char), '("ctor32",Char), '("ctor42",Char), '("ctor22",Char),
'("ctor23",Char), '("ctor8",Char), '("ctor40",Char), '("ctor29",Char),
'("ctor24",Char), '("ctor38",Char), '("ctor25",Char), '("ctor26",Char),
'("ctor27",Char), '("ctor28",Char), '("ctor36",Char), '("ctor52",Char),
'("ctor51",Char), '("ctor53",Char), '("ctor12",Char), '("ctor54",Char),
'("ctor15",Char), '("ctor31",Char), '("ctor30",Char), '("ctor34",Char),
'("ctor35",Char), '("ctor17",Char), '("ctor16",Char), '("ctor18",Char),
'("ctor19",Char), '("ctor37",Char)
]
type Phase02 = Delete "ctor1" Int Phase01
main :: IO ()
main = print (match #"ctor17" (fromPhase1ToPhase2 (injectI #"ctor1" 2)))
where
fromPhase1ToPhase2 :: Variant I Phase01 -> Variant I Phase02
fromPhase1ToPhase2 v = case winnowI #"ctor1" #Int v of
Right z -> injectI #"ctor2" False
Left l -> l
(The library itself is not important, except as an example of code that makes heavy use of type families.)
This code takes ~ 9 seconds to compile on my machine. But when I try to move the fromPhase1ToPhase2 function out of the where clause and make it a top-level function, compilation time spikes to ~ 29 seconds!
Is there a reason why lifting a function to the top level makes it compile slower?
Edit: as another data point, moving the function to the top-level but using -XPartialTypeSignatures like this:
{-# LANGUAGE PartialTypeSignatures #-}
{-# OPTIONS_GHC -Wno-partial-type-signatures #-}
fromPhase1ToPhase2 :: Variant I _ -> Variant I _
fromPhase1ToPhase2 v = case winnowI #"ctor1" #Int #Phase01 v of
Right z -> injectI #"ctor2" False
Left l -> l
keeps the original compilation time of ~ 9 seconds.
Related
I have to simulate a discrete environment in F#, to be called by Python, for a reinforcement learning problem. I had a function with primitive types (mainly float) to make the exchange of data smoother. Now I am in the position to run this function many times with different data, so to run it in parallel seems a good idea.
I have the following code:
type AscentStrategy = |Strategy of seq<float>
let simulateAscent env ascentLimiter initState (sequenceOfDepths:seq<float>) =
//let infinitSeqOfConstantValues = (fun _ -> constantDepth) |> Seq.initInfinite
sequenceOfDepths
|> Seq.scan ( fun ( nextState, rew, isTerminal, _ ) depth -> getNextEnvResponseAndBoundForNextAction(env, nextState , depth , ascentLimiter) ) ( initState, 0.0 , false, 0.0)
|> SeqExtension.takeWhileWithLast (fun (_ , _, isTerminalState, _) -> not isTerminalState)
|> Seq.toArray
and then
let simulateStrategy ({MaxPDCS = maxPDCS ; MaxSimTime = maximumSimulationTime ; PenaltyForExceedingRisk = penaltyForExceedingRisk ;
RewardForDelivering = rewardForDelivering ; PenaltyForExceedingTime = penaltyForExceedingTime ; IntegrationTime = integrationTime
ControlToIntegrationTimeRatio = controlToIntegrationTimeRatio; DescentRate = descentRate; MaximumDepth = maximumDepth ;
BottomTime = bottomTime ; LegDiscreteTime = legDiscreteTime } : SimulationParameters) (Strategy ascentStrategy : AscentStrategy) =
let env, initState , ascentLimiter , _ = getEnvInitStateAndAscentLimiter ( maxPDCS , maximumSimulationTime ,
penaltyForExceedingRisk , rewardForDelivering , penaltyForExceedingTime ,
integrationTime ,
controlToIntegrationTimeRatio,
descentRate ,
maximumDepth ,
bottomTime ,
legDiscreteTime )
ascentStrategy
|> simulateAscent env ascentLimiter initState
finally I call the function for testing:
let commonSimulationParameters = {MaxPDCS = 0.32 ; MaxSimTime = 2000.0 ; PenaltyForExceedingRisk = 1.0 ; RewardForDelivering = 10.0; PenaltyForExceedingTime = 0.5 ;
IntegrationTime = 0.1; ControlToIntegrationTimeRatio = 10; DescentRate = 60.0; MaximumDepth = 20.0 ; BottomTime = 10.0; LegDiscreteTime = 0.1}
printfn"insert number of elements"
let maxInputsString = Console.ReadLine()
let maxInputs = maxInputsString |> Double.Parse
let inputsStrategies = [|0.0 .. maxInputs|] |> Array.map (fun x -> Seq.initInfinite (fun _ -> x ) )
let testParallel = inputsStrategies
|> Array.Parallel.map (fun x -> (simulateStrategy commonSimulationParameters ( Strategy x )) )
I have compared this with Array.map and, while it is faster and uses 70% of the CPU on my laptop, still does not seem to use the whole processing power. I have run it on a machine with many more cores ( ~50) and it barely increases the CPU usage (it gets up to 3/4% of total usage with 50ish independent inputs). I think there must be a deadlock generated somewhere, but how can I detect and get rid of it?
Also, why does this happen? One of the advantages of functional programming, as I see it, is also to be able to parallelize easily.
PS: SeqExtension.takeWhileWithLast is a function I have found on SO, kindly provided by Tomas Petricek in one of his brilliant answers, if needed I can post it.
PPS: env is the environment, whose type is defined as:
type Environment<'S, 'A ,'I> = |Environment of (State<'S> -> Action<'A> -> EnvironmentOutput<'S ,'I>)
I have tried the same with Async.Parallel and ParallelSeq, reporting the same problem.
Would a message-based solution solve the problem>? I am looking into it, although I am not familiar at all, but would it be a good way of getting the code parallel, using MailboxProcessor?
Following my question,
I have tried also this great library for parallel code, based on streams of data. https://nessos.github.io/Streams/.
I have added the following code:
let nessosResult = inputsStrategies
|> ParStream.ofArray
|> ParStream.map simulateStrategy
|> ParStream.toArray
I have defined an ad hoc type for inputStrategy (basic the old tuple I had) so that simulateStrategy accepts only one input. Unfortunately the problem seems very well hidden somewhere. I attach a graph with CPU usage. Time spent on my machine for the different cases is: ~8.8 sec (sequential); ~6.2 sec (Array.Parallel.map); ~ 6.1 sec (Nessos.Streams)
I have found that server garbage collection is necessary to get the best parallel performance on .NET. Something like this in your app.config:
<configuration>
<runtime>
<gcServer enabled="true" />
</runtime>
</configuration>
Hello i am encountering this error message in a Haskell program and i do not know where is the loop coming from.There are almost no IO methods so that i can hook myself to them and print the partial result in the terminal.
I start with a file , i read it and then there are only pure methods.How can i debug this ?
Is there a way to attach to methods or create a helper that can do the following:
Having a method method::a->b how can i somehow wrap it in a iomethod::(a->b)->IO (a->b) to be able to test in in GHCI (i want to insert some putStrLn-s etc ?
P.S My data suffer transformations IO a(->b->c->d->......)->IO x and i do not know how to debug the part that is in the parathesis (that is the code that contains the pure methods)
Types and typeclass definitions and implementations
data TCPFile=Rfile (Maybe Readme) | Dfile Samples | Empty
data Header=Header { ftype::Char}
newtype Samples=Samples{values::[Maybe Double]}deriving(Show)
data Readme=Readme{ maxClients::Int, minClients::Int,stepClients::Int,maxDelay::Int,minDelay::Int,stepDelay::Int}deriving(Show)
data FileData=FileData{ header::Header,rawContent::Text}
(>>?)::Maybe a->(a->Maybe b)->Maybe b
(Just t) >>? f=f t
Nothing >>? _=Nothing
class TextEncode a where
fromText::Text-> a
getHeader::TCPFile->Header
getHeader (Rfile _ ) = Header { ftype='r'}
getHeader (Dfile _ )= Header{ftype='d'}
getHeader _ = Header {ftype='e'}
instance Show TCPFile where
show (Rfile t)="Rfile " ++"{"++content++"}" where
content=case t of
Nothing->""
Just c -> show c
show (Dfile c)="Dfile " ++"{"++show c ++ "}"
instance TextEncode Samples where
fromText text=Samples (map (readMaybe.unpack) cols) where
cols=splitOn (pack ",") text
instance TextEncode Readme where
fromText txt =let len= length dat
dat= case len of
6 ->Prelude.take 6 .readData $ txt
_ ->[0,0,0,0,0,0] in
Readme{maxClients=Prelude.head dat,minClients=dat!!1,stepClients=dat!!2,maxDelay=dat!!3,minDelay=dat!!4,stepDelay=dat!!5} where
instance TextEncode TCPFile where
fromText = textToFile
Main
module Main where
import Data.Text(Text,pack,unpack)
import Data.Text.IO(readFile,writeFile)
import TCPFile(TCPFile)
main::IO()
main=do
dat<-readTcpFile "test.txt"
print dat
readTcpFile::FilePath->IO TCPFile
readTcpFile path =fromText <$> Data.Text.IO.readFile path
textToFile::Text->TCPFile
textToFile input=case readHeader input >>? (\h -> Just (FileData h input)) >>? makeFile of
Just r -> r
Nothing ->Empty
readHeader::Text->Maybe Header
readHeader txt=case Data.Text.head txt of
'r' ->Just (Header{ ftype='r'})
'd' ->Just (Header {ftype ='d'})
_ -> Nothing
makeFile::FileData->Maybe TCPFile
makeFile fd= case ftype.header $ fd of
'r'->Just (Rfile (Just (fromText . rawContent $ fd)))
'd'->Just (Dfile (fromText . rawContent $ fd))
_ ->Nothing
readData::Text->[Int]
readData =catMaybes . maybeValues where
maybeValues=mvalues.split.filterText "{}"
#all the methods under this line are used in the above method
mvalues::[Text]->[Maybe Int]
mvalues arr=map (\x->(readMaybe::String->Maybe Int).unpack $ x) arr
split::Text->[Text]
split =splitOn (pack ",")
filterText::[Char]->Text->Text
filterText chars tx=Data.Text.filter (\x -> not (x `elem` chars)) tx
I want first to clean the Text from given characters , in our case }{ then split it by ,.After the text is split by commas i want to parse them, and create either a Rfile which contains 6 integers , either a Dfile (datafile) which contains any given number of integers.
Input
I have a file with the following content: r,1.22,3.45,6.66,5.55,6.33,2.32} and i am running runghc main 2>err.hs
Expected Output : Rfile (Just (Readme 1.22 3.45 6.66 5.55 6.33 2.32))
In the TextEncode Readme instance, len and dat depend on each other:
instance TextEncode Readme where
fromText txt =let len= length dat
dat= case len of
To debug this kind of thing, other than staring at the code, one thing you can do is compile with -prof -fprof-auto -rtsopts, and run your program with the cmd line options +RTS -xc. This should print a trace when the <<loop>> exception is raised (or if the program loops instead, when you kill it (Ctrl+C)). See the GHC manual https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#rts-flag--xc
As Li-yao Xia said part of the problem is the infinite recursion, but if you tried the following code, then the problem still remains.
instance TextEncode Readme where
fromText txt =let len= length [1,2,3,4,5,6] --dat
dat= case len of
The second issue is that the file contains decimal numbers but all the conversion function are expecting Maybe Int, changing the definitions of the following functions should give the expected results, on the other hand probably the correct fix is that the file should have integers and not decimal numbers.
readData::Text->[Double]
--readData xs = [1,2,3,4,5,6,6]
readData =catMaybes . maybeValues where
maybeValues = mvalues . split . filterText "{}"
--all the methods under this line are used in the above method
mvalues::[Text]->[Maybe Double]
mvalues arr=map (\x->(readMaybe::String->Maybe Double).unpack $ x) arr
data Readme=Readme{ maxClients::Double, minClients::Double,stepClients::Double,maxDelay::Double,minDelay::Double,stepDelay::Double}deriving(Show)
I found that the following Haskell code uses 100% CPU and takes about 14secs to finish on my Linux server.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString.Lazy.Char8 as L
import System.IO
str = L.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
main = do
hSetBuffering stdout (BlockBuffering (Just 1000))
sequence (take 1000000 (repeat (L.hPutStr stdout str >> hFlush stdout)))
return ()
On the other hand, very similar Python code finishes the same task in about 3secs.
import sys
str = "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
def main():
for i in xrange(0, 1000000):
print str,
sys.stdout.flush()
# doIO()
main()
By using strace, I found that select is called every time hFlush is called in Haskell version. On the other hand, select is not called in Python version. I guess this is one of the reason that Haskell version is slow.
Are there any way to improve performance of Haskell version?
I already tried to omit hFlush and it certainly decreased CPU usage a lot. But this solution is not satisfiable because it does not flush.
Thanks.
EDITED
Thank you very very much for your help! By changing sequence and repeat to replicateM_, runtime is reduced from 14s to 3.8s.
But now I have another question. I asked the above question because when I removed hFlush from the above program, it runs fast despite it repeats I/O using sequence and repeat.
Why only the combination of sequence and hFlush makes it slow?
To confirm my new question, I changed my program as follows to do profiling.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString.Char8 as S
import System.IO
import Control.Monad
str = S.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
doIO = S.hPutStr stdout str >> hFlush stdout
doIO' = S.hPutStr stdout str >> hFlush stdout
doIOWithoutFlush = S.hPutStr stdout str
main = do
hSetBuffering stdout (BlockBuffering (Just 1000))
sequence (take 1000000 (repeat doIO))
replicateM_ 1000000 doIO'
sequence (take 1000000 (repeat doIOWithoutFlush))
return ()
By compiling and running as follows:
$ ghc -O2 -prof -fprof-auto Fuga.hs
$ ./Fuga +RTS -p -RTS > /dev/null
I got the following result.
COST CENTRE MODULE %time %alloc
doIO Main 74.7 35.8
doIO' Main 21.4 35.8
doIOWithoutFlush Main 2.6 21.4
main Main 1.3 6.9
What makes the difference between doIO and doIO' which do the same task? And why doIOWithoutFlush runs fast even in sequence and repeat? Are there any reference about this behavior?
Thanks.
Calling hFlush on every write seems wrong.
This simple change, to use strict bytestrings, forM_ or replicateM_ instead of your explicit sequence, and block buffering, reduces runtime from 16.2s to 0.3s
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString.Char8 as S
import Control.Monad
import System.IO
str = S.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
main = replicateM_ 1000000 $ S.putStr str
Though more idiomatic would be to use a single write of a lazy bytestring, relying on the bytestring subsystem to coordinate the writes.
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L
import Control.Monad
import System.IO
str :: S.ByteString
str = S.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
main = L.putStr $ L.fromChunks (replicate 1000000 str)
With marginally improved performance (0.27s)
I'm not sure about the Python code (what's doIO()?), but an obvious way to improve the Haskell is to use sequence_ instead of sequence, so it doesn't need to build up the huge list of ()s. That small change makes it 6-7 times faster on my machine.
(A simpler way of expressing that line would be replicateM_ 1000000 (L.hPutStr stdout str >> hFlush stdout).)
It might be that the number of system calls is significant -- GHC's RTS does do non-blocking I/O, and possibly makes unnecessary select calls -- but going by your numbers, this change might be enough to bring it into the Python range on its own.
The big problem is that
sequence (take 1000000 (repeat (L.hPutStr stdout str >> hFlush stdout)))
collects the results of the IO-actions performed in a list. If you discard the results,
sequence_ (take 1000000 (repeat (L.hPutStr stdout str >> hFlush stdout)))
It'll be much faster and do less allocation.
I am new to Haskell. Previously I have programmed in Python and Java. When I am debugging some code I have a habit of littering it with print statements in the middle of code. However doing so in Haskell will change semantics, and I will have to change my function signatures to those with IO stuff. How do Haskellers deal with this? I might be missing something obvious. Please enlighten.
Other answers link the official doco and the Haskell wiki but if you've made it to this answer let's assume you bounced off those for whatever reason. The wikibook also has an example using Fibonacci which I found more accessible. This is a deliberately basic example which might hopefully help.
Let's say we start with this very simple function, which for important business reasons, adds "bob" to a string, then reverses it.
bobreverse x = reverse ("bob" ++ x)
Output in GHCI:
> bobreverse "jill"
"llijbob"
We don't see how this could possibly be going wrong, but something near it is, so we add debug.
import Debug.Trace
bobreverse x = trace ("DEBUG: bobreverse" ++ show x) (reverse ("bob" ++ x))
Output:
> bobreverse "jill"
"DEBUG: bobreverse "jill"
llijbob"
We are using show just to ensure x is converted to a string correctly before output. We also added some parenthesis to make sure the arguments were grouped correctly.
In summary, the trace function is a decorator which prints the first argument and returns the second. It looks like a pure function, so you don't need to bring IO or other signatures into the functions to use it. It does this by cheating, which is explained further in the linked documentation above, if you are curious.
Read this. You can use Debug.Trace.trace in place of print statements.
I was able to create a dual personality IO / ST monad typeclass, which will print debug statements when a monadic computation is typed as IO, them when it's typed as ST. Demonstration and code here: Haskell -- dual personality IO / ST monad? .
Of course Debug.Trace is more of a swiss army knife, especially when wrapped with a useful special case,
trace2 :: Show a => [Char] -> a -> a
trace2 name x = trace (name ++ ": " ++ show x) x
which can be used like (trace2 "first arg" 3) + 4
edit
You can make this even fancier if you want source locations
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH
import Language.Haskell.TH.Syntax as TH
import Debug.Trace
withLocation :: Q Exp -> Q Exp
withLocation f = do
let error = locationString =<< location
appE f error
where
locationString :: Loc -> Q Exp
locationString loc = do
litE $ stringL $ formatLoc loc
formatLoc :: Loc -> String
formatLoc loc = let file = loc_filename loc
(line, col) = loc_start loc
in concat [file, ":", show line, ":", show col]
trace3' (loc :: String) msg x =
trace2 ('[' : loc ++ "] " ++ msg) x
trace3 = withLocation [| trace3' |]
then, in a separate file [from the definition above], you can write
{-# LANGUAGE TemplateHaskell #-}
tr3 x = $trace3 "hello" x
and test it out
> tr3 4
[MyFile.hs:2:9] hello: 4
You can use Debug.Trace for that.
I really liked Dons short blog about it:
https://donsbot.wordpress.com/2007/11/14/no-more-exceptions-debugging-haskell-code-with-ghci/
In short: use ghci, example with a program with code called HsColour.hs
$ ghci HsColour.hs
*Main> :set -fbreak-on-exception
*Main> :set args "source.hs"
Now run your program with tracing on, and GHCi will stop your program at the call to error:
*Main> :trace main
Stopped at (exception thrown)
Ok, good. We had an exception… Let’s just back up a bit and see where we are. Watch now as we travel backwards in time through our program, using the (bizarre, I know) “:back” command:
[(exception thrown)] *Main> :back
Logged breakpoint at Language/Haskell/HsColour/Classify.hs:(19,0)-(31,46)
_result :: [String]
This tells us that immediately before hitting error, we were in the file Language/Haskell/HsColour/Classify.hs, at line 19. We’re in pretty good shape now. Let’s see where exactly:
[-1: Language/Haskell/HsColour/Classify.hs:(19,0)-(31,46)] *Main> :list
18 chunk :: String -> [String]
vv
19 chunk [] = head []
20 chunk ('\r':s) = chunk s -- get rid of DOS newline stuff
21 chunk ('\n':s) = "\n": chunk s
^^
I want to see what functions are called in my user-space C99 program and in what order. Also, which parameters are given.
Can I do this with DTrace?
E.g. for program
int g(int a, int b) { puts("I'm g"); }
int f(int a, int b) { g(5+a,b);g(8+b,a);}
int main() {f(5,2);f(5,3);}
I wand see a text file with:
main(1,{"./a.out"})
f(5,2);
g(10,2);
puts("I'm g");
g(10,5);
puts("I'm g");
f(5,3);
g(10,3);
puts("I'm g");
g(11,5);
puts("I'm g");
I want not to modify my source and the program is really huge - 9 thousand of functions.
I have all sources; I have a program with debug info compiled into it, and gdb is able to print function parameters in backtrace.
Is the task solvable with DTrace?
My OS is one of BSD, Linux, MacOS, Solaris. I prefer Linux, but I can use any of listed OS.
Here's how you can do it with DTrace:
script='pid$target:a.out::entry,pid$target:a.out::return { trace(arg1); }'
dtrace -F -n "$script" -c ./a.out
The output of this command is like as follows on FreeBSD 14.0-CURRENT:
dtrace: description 'pid$target:a.out::entry,pid$target:a.out::return ' matched 17 probes
I'm g
I'm g
I'm g
I'm g
dtrace: pid 39275 has exited
CPU FUNCTION
3 -> _start 34361917680
3 -> handle_static_init 140737488341872
3 <- handle_static_init 2108000
3 -> main 140737488341872
3 -> f 2
3 -> g 2
3 <- g 32767
3 -> g 5
3 <- g 32767
3 <- f 0
3 -> f 3
3 -> g 3
3 <- g 32767
3 -> g 5
3 <- g 32767
3 <- f 0
3 <- main 0
3 -> __do_global_dtors_aux 140737488351184
3 <- __do_global_dtors_aux 0
The annoying thing is that I've not found a way to print all the function arguments (see How do you print an associative array in DTrace?). A hacky workaround is to add trace(arg2), trace(arg3), etc. The problem is that for nonexistent arguments there will be garbage printed out.
Yes, you can do this with dtrace. But you probably will never be able to do it on linux. I've tried multiple versions of the linux port of dtrace and it's never done what I wanted. In fact, it once caused a CPU panic. Download the dtrace toolkit from http://www.brendangregg.com/dtrace.html. Then set your PATH accordingly. Then execute this:
dtruss -a yourprogram args...
Your question is exceedingly likely to be misguided. For any non-trivial program, printing the sequense of all function calls executed with their parameters will result in multi-MB or even multi-GB output, that you will not be able to make any sense of (too much detail for a human to understand).
That said, I don't believe you can achieve what you want with dtrace.
You might begin by using GCC -finstrument-functions flag, which would easily allow you to print function addresses on entry/exit to every function. You can then trivialy convert addresses into function names with addr2line. This gives you what you asked for (except parameters).
If the result doesn't prove to be too much detail, you can set a breakpoint on every function in GDB (with rb . command), and attach continue command to every breakpoint. This will result in a steady stream of breakpoints being hit (with parameters), but the execution will likely be at least 100 to 1000 times slower.