Performance gain on call chaining? - performance

Do you gain any performance, even if it's minor, by chaining function calls as shown below or is it just coding style preference?
execute() ->
step4(step3(step2(step1())).
Instead of
execute() ->
S1 = step1(),
S2 = step2(S1),
S3 = step3(S2),
step4(S3).
I was thinking whether in the 2nd version the garbage collector has some work to do for S1, S2, S3. Should that apply for the 1st version as well?

They are identical after compilation. You can confirm this by running the erl file through erlc -S and reading the generated .S file:
$ cat a.erl
-module(a).
-compile(export_all).
step1() -> ok.
step2(_) -> ok.
step3(_) -> ok.
step4(_) -> ok.
execute1() ->
step4(step3(step2(step1()))).
execute2() ->
S1 = step1(),
S2 = step2(S1),
S3 = step3(S2),
step4(S3).
$ erlc -S a.erl
$ cat a.S
{module, a}. %% version = 0
...
{function, execute1, 0, 10}.
{label,9}.
{line,[{location,"a.erl",9}]}.
{func_info,{atom,a},{atom,execute1},0}.
{label,10}.
{allocate,0,0}.
{line,[{location,"a.erl",10}]}.
{call,0,{f,2}}.
{line,[{location,"a.erl",10}]}.
{call,1,{f,4}}.
{line,[{location,"a.erl",10}]}.
{call,1,{f,6}}.
{call_last,1,{f,8},0}.
{function, execute2, 0, 12}.
{label,11}.
{line,[{location,"a.erl",12}]}.
{func_info,{atom,a},{atom,execute2},0}.
{label,12}.
{allocate,0,0}.
{line,[{location,"a.erl",13}]}.
{call,0,{f,2}}.
{line,[{location,"a.erl",14}]}.
{call,1,{f,4}}.
{line,[{location,"a.erl",15}]}.
{call,1,{f,6}}.
{call_last,1,{f,8},0}.
...
As you can see, both execute1 and execute2 result in identical code (the only thing different are line numbers and label numbers.

Related

Debugging <<loop>> error message in haskell

Hello i am encountering this error message in a Haskell program and i do not know where is the loop coming from.There are almost no IO methods so that i can hook myself to them and print the partial result in the terminal.
I start with a file , i read it and then there are only pure methods.How can i debug this ?
Is there a way to attach to methods or create a helper that can do the following:
Having a method method::a->b how can i somehow wrap it in a iomethod::(a->b)->IO (a->b) to be able to test in in GHCI (i want to insert some putStrLn-s etc ?
P.S My data suffer transformations IO a(->b->c->d->......)->IO x and i do not know how to debug the part that is in the parathesis (that is the code that contains the pure methods)
Types and typeclass definitions and implementations
data TCPFile=Rfile (Maybe Readme) | Dfile Samples | Empty
data Header=Header { ftype::Char}
newtype Samples=Samples{values::[Maybe Double]}deriving(Show)
data Readme=Readme{ maxClients::Int, minClients::Int,stepClients::Int,maxDelay::Int,minDelay::Int,stepDelay::Int}deriving(Show)
data FileData=FileData{ header::Header,rawContent::Text}
(>>?)::Maybe a->(a->Maybe b)->Maybe b
(Just t) >>? f=f t
Nothing >>? _=Nothing
class TextEncode a where
fromText::Text-> a
getHeader::TCPFile->Header
getHeader (Rfile _ ) = Header { ftype='r'}
getHeader (Dfile _ )= Header{ftype='d'}
getHeader _ = Header {ftype='e'}
instance Show TCPFile where
show (Rfile t)="Rfile " ++"{"++content++"}" where
content=case t of
Nothing->""
Just c -> show c
show (Dfile c)="Dfile " ++"{"++show c ++ "}"
instance TextEncode Samples where
fromText text=Samples (map (readMaybe.unpack) cols) where
cols=splitOn (pack ",") text
instance TextEncode Readme where
fromText txt =let len= length dat
dat= case len of
6 ->Prelude.take 6 .readData $ txt
_ ->[0,0,0,0,0,0] in
Readme{maxClients=Prelude.head dat,minClients=dat!!1,stepClients=dat!!2,maxDelay=dat!!3,minDelay=dat!!4,stepDelay=dat!!5} where
instance TextEncode TCPFile where
fromText = textToFile
Main
module Main where
import Data.Text(Text,pack,unpack)
import Data.Text.IO(readFile,writeFile)
import TCPFile(TCPFile)
main::IO()
main=do
dat<-readTcpFile "test.txt"
print dat
readTcpFile::FilePath->IO TCPFile
readTcpFile path =fromText <$> Data.Text.IO.readFile path
textToFile::Text->TCPFile
textToFile input=case readHeader input >>? (\h -> Just (FileData h input)) >>? makeFile of
Just r -> r
Nothing ->Empty
readHeader::Text->Maybe Header
readHeader txt=case Data.Text.head txt of
'r' ->Just (Header{ ftype='r'})
'd' ->Just (Header {ftype ='d'})
_ -> Nothing
makeFile::FileData->Maybe TCPFile
makeFile fd= case ftype.header $ fd of
'r'->Just (Rfile (Just (fromText . rawContent $ fd)))
'd'->Just (Dfile (fromText . rawContent $ fd))
_ ->Nothing
readData::Text->[Int]
readData =catMaybes . maybeValues where
maybeValues=mvalues.split.filterText "{}"
#all the methods under this line are used in the above method
mvalues::[Text]->[Maybe Int]
mvalues arr=map (\x->(readMaybe::String->Maybe Int).unpack $ x) arr
split::Text->[Text]
split =splitOn (pack ",")
filterText::[Char]->Text->Text
filterText chars tx=Data.Text.filter (\x -> not (x `elem` chars)) tx
I want first to clean the Text from given characters , in our case }{ then split it by ,.After the text is split by commas i want to parse them, and create either a Rfile which contains 6 integers , either a Dfile (datafile) which contains any given number of integers.
Input
I have a file with the following content: r,1.22,3.45,6.66,5.55,6.33,2.32} and i am running runghc main 2>err.hs
Expected Output : Rfile (Just (Readme 1.22 3.45 6.66 5.55 6.33 2.32))
In the TextEncode Readme instance, len and dat depend on each other:
instance TextEncode Readme where
fromText txt =let len= length dat
dat= case len of
To debug this kind of thing, other than staring at the code, one thing you can do is compile with -prof -fprof-auto -rtsopts, and run your program with the cmd line options +RTS -xc. This should print a trace when the <<loop>> exception is raised (or if the program loops instead, when you kill it (Ctrl+C)). See the GHC manual https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#rts-flag--xc
As Li-yao Xia said part of the problem is the infinite recursion, but if you tried the following code, then the problem still remains.
instance TextEncode Readme where
fromText txt =let len= length [1,2,3,4,5,6] --dat
dat= case len of
The second issue is that the file contains decimal numbers but all the conversion function are expecting Maybe Int, changing the definitions of the following functions should give the expected results, on the other hand probably the correct fix is that the file should have integers and not decimal numbers.
readData::Text->[Double]
--readData xs = [1,2,3,4,5,6,6]
readData =catMaybes . maybeValues where
maybeValues = mvalues . split . filterText "{}"
--all the methods under this line are used in the above method
mvalues::[Text]->[Maybe Double]
mvalues arr=map (\x->(readMaybe::String->Maybe Double).unpack $ x) arr
data Readme=Readme{ maxClients::Double, minClients::Double,stepClients::Double,maxDelay::Double,minDelay::Double,stepDelay::Double}deriving(Show)

Lists garbage collection

If I do the following:
List2 = [V || V <- List1, ...]
It seems that the List2 refers to the List1 and erlang:garbage_collect() doesn't clear memory. How is it possible to create a new list without references and discard the old?
In any language with garbage collection you simply need to 'lose' all references to a piece of data before it can be garbage collected. Simply returning from the function that generates the original list, while not storing it in any other 'persistent' location (e.g. the process dictionary), should allow the memory to be reclaimed.
The VM is supposed to manage the garbage collecting. If you use a gen_server, or if you use a "home made" server_loop(State), you should have always the same pattern:
server_loop(State) ->
A = somefunc(State),
B = receive
mesg1 -> func1(...);
...
after Timeout ->
func2(...)
end,
NewState = func3(...),
server_loop(NewState).
As long as a process is alive, executing this loop, the VM will allocate and manage memory areas to store all needed information (variables, message queue...+ some margin) As far as I know, there is some spare memory allocated to the process, and if the VM does not try to recover the memory very fast after it has been released, but if you force a garbage collecting, using erlang:garbage_collect(Pid) you can verify that the memory is free - see example bellow.
startloop() -> spawn(?MODULE,loop,[{lists:seq(1,1000),infinity}]).
loop(endloop) -> ok;
loop({S,T}) ->
NewState = receive
biglist -> {lists:seq(1,5000000),T};
{timeout,V} -> {S,V};
sizelist -> io:format("Size of the list = ~p~n",[length(S)]),
{S,T};
endloop -> endloop
after T ->
L = length(S) div 2,
{lists:seq(1,L),T}
end,
loop(NewState).
%% Here, NewState is a copy of State or a totally new data, depending on the
%% received message. In general, for performance consideration it can be
%% interesting to take care of the function used to avoid big copies,
%% and allow the compiler optimize the beam code
%% [H|Q] rather than Q ++ [H] to add a term to a list for example
and the results in the VM:
2> P = lattice:startloop().
<0.57.0>
...
6> application:start(sasl).
....
ok
7> application:start(os_mon).
...
ok
...
11> P ! biglist.
biglist
...
% get_memory_data() -> {Total,Allocated,Worst}.
14> memsup:get_memory_data().
{8109199360,5346488320,{<0.57.0>,80244336}}
...
23> P ! {timeout,1000}.
{timeout,1000}
24> memsup:get_memory_data().
{8109199360,5367361536,{<0.57.0>,80244336}}
the worst case is the loop process: {<0.57.0>,80244336}
...
28> P ! sizelist.
Size of the list = 0
sizelist
...
31> P ! {timeout,infinity}.
{timeout,infinity}
32> P ! biglist.
biglist
33> P ! sizelist.
Size of the list = 5000000
sizelist
...
36> P ! {timeout,1000}.
{timeout,1000}
37> memsup:get_memory_data().
{8109199360,5314289664,{<0.57.0>,10770968}}
%% note the garbage collecting in the previous line: {<0.57.0>,10770968}
38> P ! sizelist.
sizelist
Size of the list = 156250
39> memsup:get_memory_data().
{8109199360,5314289664,{<0.57.0>,10770968}}
...
46> P ! sizelist.
Size of the list = 0
sizelist
47> memsup:get_memory_data().
{8109199360,5281882112,{<0.57.0>,10770968}}
...
50> erlang:garbage_collect(P).
true
51> memsup:get_memory_data().
{8109199360,5298778112,{<0.51.0>,688728}}
%% after GC, the process <0.57.0> is no more the worst case
If you create new list like this, the new list will have elements from the first one, some elements will be shared between both the lists. And if you throw the first list away, shared elements will still be reachable from the new list and won't count as garbage.
How do you check if the first list is garbage collected? Do you test this in erlang console? The console stores results of evaluation each expression that may be the cause you don't see the list garbage collected.

ActiveState Perl 5.14 fails to compare certain values?

Basically, using the following code on a file stream, I get the following:
$basis = $2 * 1.0;
$cost = ($basis - 2500.0) ** 1.05;
# The above should ensure that both cost & basis are floats
printf " %f -> %f", $basis, $cost;
if ($basis gt $cost) { # <- *** THIS WAS MY ERROR: gt forces lexical!
$cost = $basis;
printf " -> %f", $cost;
}
Outputs:
10667.000000 -> 12813.438340
30667.000000 -> 47014.045519
26667.000000 -> 40029.842300
66667.000000 -> 111603.373367 -> 66667.000000
8000.000000 -> 8460.203780
10667.000000 -> 12813.438340
73333.000000 -> 123807.632158 -> 73333.000000
6667.000000 -> 6321.420427 -> 6667.000000
80000.000000 -> 136071.379474 -> 80000.000000
As you can see, for most values, the code appears to work fine.
But for some values.... 66667, 80000, and a few others, ActivePerl 5.14 tells me that 66667 > 1111603!!!
Does anyone know anything about this - or have an alternate Perl interpreter I might use (Windows). Because this is ridiculous.
You are using a lexical comparison instead of the numerical one
$cost = ($basis - 2500.0) ** 1.05;
printf " %f -> %f", $basis, $cost;
if ($basis > $cost) {
$cost = $basis;
printf " -> %f", $cost;
}
ps: revised to match the updated question
The first few chapters of Learning Perl will clear this up for you. Scalar values are can be either strings or numbers, or both at the same time. Perl uses the operator to decide how to treat them. If you want to do numeric comparisons, you use the numeric comparison operators. If you want to do string comparisons, you use the string comparison operators.
The scalar values themselves don't have a type, despite other answers and comments using words like "float" and "cast". It's just strings and numbers.
Not sure why you need to compare as lexical, but you can force it using sprintf
$basis_real = sprintf("%015.6f",$basis);
$cost_real = sprintf("%015.6f",$cost);
printf " %s -> %s", $basis_real, $cost_real;
if ($basis_real gt $cost_real) {
$cost = $basis;
printf " -> %015.6f", $cost;
}
Output:
00010667.000000 -> 00012813.438340
00030667.000000 -> 00047014.045519
00026667.000000 -> 00040029.842300
00066667.000000 -> 00111603.373367
00008000.000000 -> 00008460.203780
00010667.000000 -> 00012813.438340
00073333.000000 -> 00123807.632158
00006667.000000 -> 00006321.420427 -> 00006667.000000
00080000.000000 -> 00136071.379474
The reason it was failing as you noted, the lexical compare does character to character compare, so when it hits the decimal point in 6667. it actually is alphabetically before 111603. , so it is greater.
To fix this, you must make all the numbers the same size, especially where the decimal lines up. The %015 is the total size of number, including the period and the decimals.

Tracing all functions calls and printing out their parameters (userspace)

I want to see what functions are called in my user-space C99 program and in what order. Also, which parameters are given.
Can I do this with DTrace?
E.g. for program
int g(int a, int b) { puts("I'm g"); }
int f(int a, int b) { g(5+a,b);g(8+b,a);}
int main() {f(5,2);f(5,3);}
I wand see a text file with:
main(1,{"./a.out"})
f(5,2);
g(10,2);
puts("I'm g");
g(10,5);
puts("I'm g");
f(5,3);
g(10,3);
puts("I'm g");
g(11,5);
puts("I'm g");
I want not to modify my source and the program is really huge - 9 thousand of functions.
I have all sources; I have a program with debug info compiled into it, and gdb is able to print function parameters in backtrace.
Is the task solvable with DTrace?
My OS is one of BSD, Linux, MacOS, Solaris. I prefer Linux, but I can use any of listed OS.
Here's how you can do it with DTrace:
script='pid$target:a.out::entry,pid$target:a.out::return { trace(arg1); }'
dtrace -F -n "$script" -c ./a.out
The output of this command is like as follows on FreeBSD 14.0-CURRENT:
dtrace: description 'pid$target:a.out::entry,pid$target:a.out::return ' matched 17 probes
I'm g
I'm g
I'm g
I'm g
dtrace: pid 39275 has exited
CPU FUNCTION
3 -> _start 34361917680
3 -> handle_static_init 140737488341872
3 <- handle_static_init 2108000
3 -> main 140737488341872
3 -> f 2
3 -> g 2
3 <- g 32767
3 -> g 5
3 <- g 32767
3 <- f 0
3 -> f 3
3 -> g 3
3 <- g 32767
3 -> g 5
3 <- g 32767
3 <- f 0
3 <- main 0
3 -> __do_global_dtors_aux 140737488351184
3 <- __do_global_dtors_aux 0
The annoying thing is that I've not found a way to print all the function arguments (see How do you print an associative array in DTrace?). A hacky workaround is to add trace(arg2), trace(arg3), etc. The problem is that for nonexistent arguments there will be garbage printed out.
Yes, you can do this with dtrace. But you probably will never be able to do it on linux. I've tried multiple versions of the linux port of dtrace and it's never done what I wanted. In fact, it once caused a CPU panic. Download the dtrace toolkit from http://www.brendangregg.com/dtrace.html. Then set your PATH accordingly. Then execute this:
dtruss -a yourprogram args...
Your question is exceedingly likely to be misguided. For any non-trivial program, printing the sequense of all function calls executed with their parameters will result in multi-MB or even multi-GB output, that you will not be able to make any sense of (too much detail for a human to understand).
That said, I don't believe you can achieve what you want with dtrace.
You might begin by using GCC -finstrument-functions flag, which would easily allow you to print function addresses on entry/exit to every function. You can then trivialy convert addresses into function names with addr2line. This gives you what you asked for (except parameters).
If the result doesn't prove to be too much detail, you can set a breakpoint on every function in GDB (with rb . command), and attach continue command to every breakpoint. This will result in a steady stream of breakpoints being hit (with parameters), but the execution will likely be at least 100 to 1000 times slower.

Nice formatting of numbers inside Messages

When printing string with StyleBox by default we get nicely formatted numbers inside string:
StyleBox["some text 1000000"] // DisplayForm
I mean that the numbers look as if would have additional little spaces: "1 000 000".
But in Messages all numbers are displayed without formatting:
f::NoMoreMemory =
"There are less than `1` bytes of free physical memory (`2` bytes \
is free). $Failed is returned.";
Message[f::NoMoreMemory, 1000000, 98000000]
Is there a way to get numbers inside Messages to be formatted?
I'd use Style to apply the AutoNumberFormatting option:
You can use it to target specific messages:
f::NoMoreMemory =
"There are less than `1` bytes of free physical memory (`2` bytes is free). $Failed is returned.";
Message[f::NoMoreMemory,
Style[1000000, AutoNumberFormatting -> True],
Style[98000000, AutoNumberFormatting -> True]]
or you can use it with $MessagePrePrint to apply it to all the messages:
$MessagePrePrint = Style[#, AutoNumberFormatting -> True] &;
Message[f::NoMoreMemory, 1000000, 98000000]
I think you want $MessagePrePrint
$MessagePrePrint =
NumberForm[#, DigitBlock -> 3, NumberSeparator -> " "] &;
Or, incorporating Sjoerd's suggestion:
With[
{opts =
AbsoluteOptions[EvaluationNotebook[],
{DigitBlock, NumberSeparator}]},
$MessagePrePrint = NumberForm[#, Sequence ## opts] &];
Adapting Brett Champion's method, I believe this allows for copy & paste as you requested:
$MessagePrePrint = StyleForm[#, AutoNumberFormatting -> True] &;

Resources