How to debug Erlang code? - debugging

I have some Ruby and Java background and I'm accustomed to having exact numbers of lines in the error logs.
So, if there is an error in the compiled code, I will see the number of line which caused the exception in the console output.
Like in this Ruby example:
my_ruby_code.rb:13:in `/': divided by 0 (ZeroDivisionError)
from my_ruby_code.rb:13
It's simple and fast - I just go to the line number 13 and fix the error.
On the contrary, Erlang just says something like:
** exception error: no match of right hand side value [xxxx]
in function my_module:my_fun/1
in call from my_module:other_fun/2
There are no line numbers to look at.
And if I have two lines like
X = Param1,
Y = Param2,
in 'my_fun', how can understand in which line the problem lies?
Additionally, I have tried to switch to Emacs+Elang-mode from Vim, but the only bonus I've got so far is the ability to cycle through compilation errors inside Emacs (C-k `).
So, the process of writing code and seeking for simple logical errors like 'no match of right hand side' seems to be a bit cumbersome.
I have tried to add a lot of "io:format" lines in the code, but it is additional work which takes time.
I have also tried to use distel, but it requires 10 steps to just open a debugger once.
Questions:
What is the most straight and simple way to debug Erlang code?
Does Emacs' erlang-mode has something superior in terms of Erlang development comparing to Vim?
What development 'write-compile-debug' cycle do you prefer? Do you leave Emacs to compile and run the code in the terminal? How do you search for errors in your Erlang code?

Debugging Erlang code can be tricky at times, especially dealing with badmatch errors. In general, two good guidelines to keep are:
Keep functions short
Use return values directly if you can, instead of binding temporary variables (this will give you the benefit of getting function_clause errors etc which are way more informative)
That being said, using the debuggers are usually required to quickly get to the bottom of errors. I recommend to use the command line debugger, dbg, instead of the graphical one, debugger (it's way faster when you know how to use it, and you don't have to context switch from the Erlang shell to a GUI).
Given the sample expression you provided, the case is often that you have more than just variables being assigned to other variables (which is absolutely unnecessary in Erlang):
run(X, Y) ->
X = something(whatever),
Y = other:thing(more_data),
Debugging a badmatch error here is aided by using the command line debugger:
1> dbg:tracer(). % Start the CLI debugger
{ok,<0.55.0>}
2> dbg:p(all, c). % Trace all processes, only calls
{ok,[{matched,nonode#nohost,29}]}
3> dbg:tpl(my_module, something, x). % tpl = trace local functions as well
{ok,[{matched,nonode#nohost,1},{saved,x}]}
4> dbg:tp(other, do, x). % tp = trace exported functions
{ok,[{matched,nonode#nohost,1},{saved,x}]}
5> dbg:tp(my_module, run, x). % x means print exceptions
{ok,[{matched,nonode#nohost,1},{saved,x}]} % (and normal return values)
Look for {matched,_,1} in the return value... if this would have been 0 instead of 1 (or more) that would have meant that no functions matched the pattern. Full documentation for the dbg module can be found here.
Given that both something/1 and other:do/1 always returns ok, the following could happen:
6> my_module:run(ok, ok).
(<0.72.0>) call my_module:run(ok,ok)
(<0.72.0>) call my_module:something(whatever)
(<0.72.0>) returned from my_module:something/1 -> ok
(<0.72.0>) call other:thing(more_data)
(<0.72.0>) returned from other:thing/1 -> ok
(<0.72.0>) returned from my_module:run/2 -> ok
ok
Here we can see the whole call procedure, and what return values were given. If we call it with something we know will fail:
7> my_module:run(error, error).
** exception error: no match of right hand side value ok
(<0.72.0>) call my_module:run(error,error)
(<0.72.0>) call my_module:something(whatever)
(<0.72.0>) returned from my_module:something/1 -> ok
(<0.72.0>) exception_from {my_module,run,2} {error,{badmatch,ok}}
Here we can see that we got a badmatch exception, something/1 was called, but never other:do/1 so we can deduce that the badmatch happened before that call.
Getting proficient with the command line debugger will save you a lot of time, whether you debug simple (but tricky!) badmatch errors or something much more complex.

You can use the Erlang debugger to step through your code and see which line is failing.
From erl, start the debugger with:
debugger:start().
Then you can choose which modules you want to in interpreted mode (required for debugging) using the UI or using the console with ii:
ii(my_module).
Adding breakpoints is done in the UI or console again:
ib(my_module, my_func, func_arity).
Also, in Erlang R15 we'll finally have line number in stack traces!

If you replace your erlang installation with a recent one, you will have line numbers, they were added starting with version 15.
If the new versions are not yet available on your operating system, you could build from source or try to get a packaged version here: http://www.erlang-solutions.com/section/132/download-erlang-otp

You can use "debug_info" at compile time of the file and "debugger"
1> c(test_module, [debug_info]).
{ok, test_module}
2> debugger:start().
More details about how do Debugging in Erlang you can follow by link to video - https://vimeo.com/32724400

Related

OpenEdge Progress-4GL development: how to know from an error message the procedure I'm running?

As stated in previous questions, I'm quite new at Progress-4GL development.
I've just created a windows (*.w file), together with a procedure file (*.p file), which are based on an include file (*.i file).
I've done something wrong and I get an error message, copy-paste reveals the following:
---------------------------
Fout
---------------------------
** Begin positie voor SUBSTRING, OVERLAY, enz. moet 1 of groter zijn. (82)
---------------------------
OK
---------------------------
As you can see, this is the Dutch translation of error 82:
** Starting position for SUBSTRING, OVERLAY, etc. must be 1 or greater. (82)
The SUBSTRING, OVERLAY, etc, functions require that the start position (second argument) be greater than or equal to 1.
P
I'd like to know which procedure/function is launching this error message. I'm working with the AppBuilder release 11.6 and the corresponding procedure editor, so the debugging possibilities are very limited. One thing I'm thinking of, is taking a dump of the Windows process, in order to determine the call stack, but I'm not sure how to do this. I also tried using Process Explorer and check the stack of the individual stacks of the threads inside the "procwin32.exe" process, but I'm not sure how to proceed.
By the way, I'm regularly adding message boxes to my code, which look as follows (just an example):
MESSAGE "begin procedure combobox-value-changed" VIEW-AS ALERT-BOX.
As you see, the name of the procedure is hardcoded, while in other programming languages (like C++) the procedure/function name can be shown as follows:
OUTPUT("begin procedure %s", __FUNCTION__);
Next to __FUNCTION__, C++ also knows __FILE__ (for filename) and __LINE__ (for line number).
Do such predefined values also exist in Progress 4GL, preferably release 11.6 or previous?
As ABL code is not compiled into Windows byte-code a windows debugger will not be really helpful.
You should start by adding the -debugalert startup parameter to prowin/prowin32.exe. Or add this
ASSIGN SESSION:DEBUG-ALERT = TRUE .
That will add a HELP button to all (error) messages which will open a dialog with the ABL stack trace.
As you'Re using include files, be aware that the line numbers referenced in the stack-trace are based on the debug listing, not the actual source code. So execute
COMPILE myfile.w DEBUG-LIST c:\temp\myfile.debuglist .
to receive the debug-listing file with the correct line numbers.
Are you aware of the visual debugger that's available for the AVM? https://docs.progress.com/de-DE/bundle/openedge-abl-troubleshoot-applications-117/page/Introduction.html
%DLC%\bin\proDebugger.bat
Or the Compile -> Debug menu in the AppBuilder.
It looks a bit antique, but usually does it's job.
Debugging needs to be enabled as an Administrator in proenv:
prodebugenable -enable-all
Of course the grass is greener when you switch to Progress Developer Studio as your IDE.
Regarding the second part of your question. See the PROGRAM-NAME function.
message
program-name(1) skip
program-name(2)
.
Additionally see the {} preprocessor name reference.
message 'file: {&file-name} line: {&line-number}'.

Windows pintool mismatch between call/ret instructions

So i've been trying to write a pintool that monitors call/ret instructions but i've noticed that threre was a significant inconsistency between the two. For example ret instructions without previous call.
I've run the tool in a console application from which i got the following logs showing this inconsistency (this is an example, there are more inconsistencies like the one listed below in the other call/ret instructions):
1. Call from ntdll!LdrpCallInitRoutine+0x69, expected to return to 7ff88f00502a
2. RETURN to 7ff88f00502a
//call from ntdll!LdrpInitializeNode+0x1ac which is supposed to return at 7ff88f049385 is missing (the previous instruction)
3. RETURN to 7ff88f049385 (ntdll!LdrpInitializeNode+0x1b1)
The above are the first 3 log entries for the call/ret instructions. As one can see, the monitoring started a bit late, at the call found at ntdll!LdrpCallInitRoutine+0x69, it returned to the expected address but then returned to 7ff88f049385 without first tracking the call found in the previous instruction.
Any ideas of what could be the fault?
The program is traced with INS_AddInstrumentFunction with a callback that more or less does:
if INS_IsCall(ins) INS_InsertCall(ins,...
if INS_IsRet(ins) INS_InsertCall(ins,...
I've tried the same program on Linux which worked as expected, without any mismatch.
Any ideas of the reason behind this behavior?

Segmentation Fault Using LLDB

When I was debugging my .c file using lldb on terminal for Mac, I some how cannot find the location of the segmentation fault. I have debugged the code numerous of times and it is still producing the same error. Can someone help me on why I can find the location of segmentation fault. enter image description here
Use the bt command in lldb to see the call stack. You've called a libc function like scanf() and are most likely passing an invalid argument to it. When you see the call stack, you will see a stack frame with your own code on it, say it is frame #3. You can select that frame with f 3, and you can look at variables with the v command to understand what arguments were passed to the libc function that led to a crash.
Without knowing what your code is doing, I would suggest using a tool like valgrind instead of just a normal debugger. It's designed to look for memory issues for lower-level languages like C/C++/FORTRAN. For example, it will tell you if you're trying to use an index that is too large for an array.
From the quick start guide, try valgrind --leak-check=yes myprog arg1 arg2

What is RUST_BACKTRACE supposed to tell me?

My program is panicking so I followed its advice to run RUST_BACKTRACE=1 and I get this (just a little snippet).
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
2: 0x800c22ed - std::panicking::default_hook::{{closure}}::h59672b733cc6a455
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:351
If the program panics it stops the whole program, so where can I figure out at which line it's panicking on?
Is this line telling me there is a problem at line 42 and line 351?
The whole backtrace is on this image, I felt it would be to messy to copy and paste it here.
I've never heard of a stack trace or a back trace. I'm compiling with warnings, but I don't know what debugging symbols are.
What is a stack trace?
If your program panics, you encountered a bug and would like to fix it; a stack trace wants to help you here. When the panic happens, you would like to know the cause of the panic (the function in which the panic was triggered). But the function directly triggering the panic is usually not enough to really see what's going on. Therefore we also print the function that called the previous function... and so on. We trace back all function calls leading to the panic up to main() which is (pretty much) the first function being called.
What are debug symbols?
When the compiler generates the machine code, it pretty much only needs to emit instructions for the CPU. The problem is that it's virtually impossible to quickly see from which Rust-function a set of instructions came. Therefore the compiler can insert additional information into the executable that is ignored by the CPU, but is used by debugging tools.
One important part are file locations: the compiler annotates which instruction came from which file at which line. This also means that we can later see where a specific function is defined. If we don't have debug symbols, we can't.
In your stack trace you can see a few file locations:
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
The Rust standard library is shipped with debug symbols. As such, we can see where the function is defined (gcc_s.rs line 42).
If you compile in debug mode (rustc or cargo build), debug symbols are activated by default. If you, however, compile in release mode (rustc -O or cargo build --release), debug symbols are disabled by default as they increase the executable size and... usually aren't important for the end user. You can tweak whether or not you want debug symbols in your Cargo.toml in a specific profile section with the debug key.
What are all these strange functions?!
When you first look at a stack trace you might be confused by all the strange function names you're seeing. Don't worry, this is normal! You are interested in what part of your code triggered the panic, but the stack trace shows all functions somehow involved. In your example, you can ignore the first 9 entries: those are just functions handling the panic and generating the exact message you are seeing.
Entry 10 is still not your code, but might be interesting as well: the panic was triggered in the index() function of Vec<T> which is called when you use the [] operator. And finally, entry 11 shows a function you defined. But you might have noticed that this entry is missing a file location... the above section describes how to fix that.
What do to with a stack trace? (tl;dr)
Activate debug symbols if you haven't already (e.g. just compile in debug mode).
Ignore any functions from std and core at the top of the stack trace.
Look at the first function you defined, find the corresponding location in your file and fix the bug.
If you haven't already, change all camelCase function and method names to snake_case to stick to the community wide style guide.

Debugger implementation - Step over issue

I am currently writing a debugger for a script virtual machine.
The compiler for the scripts generates debug information, such as function entry points, variable scopes, names, instruction to line mappings, etc.
However, and have run into an issue with step-over.
Right now, I have the following:
1. Look up the current IP
2. Get the source line from that
3. Get the next (valid) source line
4. Get the IP where the next valid source line starts
5. Set a temporary breakpoint at that instruction
or: if the next source line no longer belongs to the same function, set the temp breakpoint at the next valid source line after return address.
So far this works well. However, I seem to be having problems with jumps.
For example, take the following code:
n = 5; // Line A
if(n == 5) // Line B
{
foo(); // Line C
}
else
{
bar(); // Line D
--n;
}
Given this code, if I'm on line B and choose to step-over, the IP determined for the breakpoint will be on line C. If, however, the conditional jump evaluates to false, it should be placed on line D. Because of this, the step-over wouldn't halt at the expected location (or rather, it wouldn't halt at all).
There seems to be little information on debugger implementation of this specific issue out there. However, I found this. While this is for a native debugger on Windows, the theory still holds true.
It seems though that the author has not considered this issue, either, in section "Implementing Step-Over" as he says:
1. The UI-threads calls CDebuggerCore::ResumeDebugging with EResumeFlag set to StepOver.
This tells the debugger thread (having the debugger-loop) to put IBP on next line.
2. The debugger-thread locates next executable line and address (0x41141e), it places an IBP on that location.
3. It calls then ContinueDebugEvent, which tells the OS to continue running debuggee.
4. The BP is now hit, it passes through EXCEPTION_BREAKPOINT and reaches at EXCEPTION_SINGLE_STEP. Both these steps are same, including instruction reversal, EIP reduction etc.
5. It again calls HaltDebugging, which in turn, awaits user input.
Again:
The debugger-thread locates next executable line and address (0x41141e), it places an IBP on that location.
This statement does not seem to hold true in cases where jumps are involved, though.
Has anyone encountered this problem before? If so, do you have any tips on how to tackle this?
Since this thread comes in Google first when searching for "debugger implement step over". I'll share my experiences regarding the x86 architecture.
You start first by implementing step into: This is basically single stepping on the instructions and checking whether the line corresponding to the current EIP changes. (You use either the DIA SDK or the read the dwarf debug data to find out the current line for an EIP).
In the case of step over: before single stepping to the next instruction, you'll need to check if the current instruction is a CALL instuction. If it's a CALL instruction then put a temporary breakpoint on the instruction following it and continue execution till the execution stops (then remove it). In this case you effectively stepped over function calls literally in the assembly level and so in the source too.
No need to manage stack frames (unless you'll need to deal with single line recursive functions). This analogy can be applied to other architectures as well.
Ok, so since this seems to be a bit of black magic, in this particular case the most intelligent thing was to enumerate the instruction where the next line starts (or the instruction stream ends + 1), and then run that many instructions before halting again.
The only gotcha was that I have to keep track of the stack frame in case CALL is executed; those instructions should run without counting in case of step-over.

Resources