LLDB stop at breakpoint while expr function - xcode

For example:
void foo()
{
int a = 0;
printf("%d",a);
}
I set a breakpoint on the foo function and then use expr to evaluate it, but it just runs the function and doesn't stop at the breakpoint.
(lldb) br set -n foo
(lldb) expr foo()
Is there a way to run a function or arbitrary code with expr while lldb breakpoints still fire?

Yes, I found that expr has an option, --ignore-breakpoints. Setting it to false makes the expression stop at breakpoints:
expr --ignore-breakpoints false -- foo()

Related

Sequence of Raku program compilation and execution (maybe nested compile phases?)

The following program correctly fails to compile:
sub f(Int $a) { my Str $b = $a }
say f 42;
say f 'foo';
Specifically, line 3 causes a compilation error (with a ===SORRY!=== error message); this error occurs before line 2 is executed, so the type mismatch within &f is never reached.
But when, specifically, does this error occur? I thought it occurred during the CHECK phase, but was surprised to notice that raku -c does not generate a compile error; it reports Syntax OK.
To dig into this a bit more, I added logging code to the snippet above:
BEGIN note 'begin';
CHECK note 'check';
INIT note 'init';
END note 'end';
sub f(Int $a) { my Str $b = $a }
say f 42;
say f 'foo';
Running this revised code with raku -c prints "begin\n check\n Syntax OK"; running it with raku prints "begin\n check\n ===SORRY!===" (and the rest of the error message).
If I remove the say f 'foo' line (and thus the compile error), raku -c still prints "begin\n check\n Syntax OK" but raku prints "begin\n check\n init\n Type check failed… \n end" (again omitting the body of the error message).
What's going on here? Does the compile error that generated the ===SORRY!=== occur some time between CHECK and INIT (is there any such time?)? Or does raku -c not actually "run BEGIN and CHECK blocks" as raku --help indicates? Or something else?
Relatedly: how, if at all, is any of this connected to the idea of "nested compile times"? Does the execution of this code involve any nested compile times, or does that only occur when using modules? Is there any way to note/log separate compile phases (maybe with correctly placed BEGIN blocks?) or is that something that isn't exposed?
The SORRY message is a side-effect of the static optimizer. Observe the difference in behaviour between:
$ raku -e 'sub foo(Int $a) { }; foo "foo"'
===SORRY!=== Error while compiling -e
Calling foo(Str) will never work with declared signature (Int $a)
and:
$ raku --optimize=off -e 'sub foo(Int $a) { }; foo "foo"'
Type check failed in binding to parameter '$a'; expected Int but got Str ("foo")
in sub foo at -e line 1
The static optimizer runs somewhere between CHECK and INIT time, unless it has been disabled. Note that disabling the static optimizer turns the error into a runtime error.

GDB: How to force a watchpoint to not be deleted after a function returned?

Watchpoints on function-local variables usually get removed upon the function return, with a message «Watchpoint 7 deleted because the program has left the block in». Illustration:
struct mystruct{
int a, b, c;
};
void MyFunc(){
mystruct obj;
obj.a = 2;
}
int main(){
MyFunc();
}
Example gdb session:
(gdb) b 7
Breakpoint 1 at 0x4004f1: file /tmp/test2.cpp, line 7.
(gdb) r
Starting program: /tmp/test2
Breakpoint 1, MyFunc () at /tmp/test2.cpp:7
7 obj.a = 2;
(gdb) wa obj
Hardware watchpoint 2: obj
(gdb) c
Continuing.
Hardware watchpoint 2: obj
Old value = {a = 4195600, b = 0, c = 4195328}
New value = {a = 2, b = 0, c = 4195328}
MyFunc () at /tmp/test2.cpp:8
8 }
(gdb) c
Continuing.
Watchpoint 2 deleted because the program has left the block in
which its expression is valid.
main () at /tmp/test2.cpp:12
12 }
I tried casting it like wa *(mystruct *)&obj and wa *(mystruct *)(void*)&obj, to no avail.
I need this because GDB on the embedded ARM device I'm working with is broken: sometimes it removes a watchpoint for no reason. The backtrace then shows lines marked with "??" and a message about a corrupted stack, even though the application is actually fine.
As GDB: Setting Watchpoints says,
GDB automatically deletes watchpoints that watch local (automatic) variables, or expressions that involve such variables, when they go out of scope, that is, when the execution leaves the block in which these variables were defined.
However, as of release 7.3 (thanks to @Hi-Angel and user parcs on IRC for pointing this out; I missed seeing it right there in the documentation), the watch command accepts a -location argument:
Ordinarily a watchpoint respects the scope of variables in expr (see below). The -location argument tells GDB to instead watch the memory referred to by expr. In this case, GDB will evaluate expr, take the address of the result, and watch the memory at that address. The type of the result is used to determine the size of the watched memory.
On older versions of GDB, you can run this instead, using the example from your question:
eval "watch *(mystruct *)%p", &obj
Note that watching locations on the stack may cause spurious notifications if the memory you're watching gets reused by another function's local variables.
As an alternative, you can automate the setting of a watchpoint on an automatic variable that keeps coming into and out of scope. Set a breakpoint at a point where it's in scope - for example, at the beginning of the function or block in which it's declared - then attach a watch and continue command:
(gdb) break MyFunc
(gdb) commands $bpnum
>watch obj
>continue
>end

minimal typing command line calculator - tcsh vs bash

I like to have a command-line calculator handy. The requirements are:
Support all the basic arithmetic operators: +, -, /, *, ^ for exponentiation, plus parentheses for grouping.
Require minimal typing: I don't want to have to start a program, interact with it, and then ask it to exit.
Ideally only one character and a space in addition to the expression itself should be entered into the command line.
It should know how to ignore commas and dollar (or other currency symbols)
in numbers to allow me to copy/paste from the web without worrying
about having to clean every number before pasting it into the calculator
Be white-space tolerant, presence or lack of spaces shouldn't cause errors
No need for quoting anything in the expression to protect it from the shell - again for the benefit of minimal typing
Since tcsh supports alias positional arguments, and since alias expansion precedes all other expansions except history expansion, it was straightforward to implement something close to my ideal in tcsh.
I used this:
alias C 'echo '\''\!*'\'' |tr -d '\'',\042-\047'\'' |bc -l'
Now I can do stuff like the following with minimal typing:
# the basic stuff:
tcsh> C 1+2
3
# dollar signs, multiplication, exponentiation:
tcsh> C $8 * 1.07^10
15.73721085831652257992
# parentheses, mixed spacing, zero power:
tcsh> C ( 2+5 ) / 8 * 2^0
.87500000000000000000
# commas in numbers, no problem here either:
tcsh> C 1,250.21 * 1.5
1875.315
As you can see there's no need to quote anything to make all these work.
Now comes the problem. Trying to do the same in bash, where aliases with parameters aren't supported, forces me to implement the calculator as a shell function and pass the parameters using "$@":
function C () { echo "$@" | tr -d ', \042-\047' | bc -l; }
This breaks in various ways, e.g.:
# works:
bash$ C 1+2
3
# works:
bash$ C 1*2
2
# Spaces around '*' lead to file expansion with everything falling apart:
bash$ C 1 * 2
(standard_in) 1: syntax error
(standard_in) 1: illegal character: P
(standard_in) 1: illegal character: S
(standard_in) 1: syntax error
...
# Non-leading parentheses seem to work:
bash$ C 2*(2+1)
6
# but leading-parentheses don't:
bash$ C (2+1)*2
bash: syntax error near unexpected token `2+1'
Of course, adding quotes around the expression solves these issues, but is against the original requirements.
I understand why things break in bash. I'm not looking for explanations. Rather, I'm looking for a solution which doesn't require manually quoting the arguments. My question to bash wizards: is there any way to make bash support the handy minimal-typing calculator alias without requiring quoting, like tcsh does? Is this impossible? Thanks!
If you're prepared to type C Enter instead of C Space, the sky's the limit. The C command can take input in whatever form you desire, unrelated to the shell syntax.
C () {
local line
read -p "Arithmetic: " -e line
echo "$line" | tr -d \"-\', | bc -l
}
In zsh:
function C {
local line=
vared -p "Arithmetic: " line
echo $line | tr -d \"-\', | bc -l
}
In zsh, you can turn off globbing for the arguments of a specific command with the noglob modifier. It is commonly hidden in an alias. This prevents *^() from being interpreted as glob characters, but it does not affect quotes or $.
quickie_arithmetic () {
echo "$*" | tr -d \"-\', | bc -l
}
alias C='noglob quickie_arithmetic'
At least preventing the expansion of * is possible using 'set -f' (following someone's blog post):
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | bc -l; set +f; };
The alias turns globbing off before the calculation, and the function turns it back on afterwards:
$ C 2 * 3
6
I downloaded the bash sources and looked very closely. It seems the parenthesis error occurs directly during the parsing of the command line, before any command is run or any alias is expanded, and there is no flag to turn it off.
So it would be impossible to do this from a bash script.
This means it is time to bring out the heavy weapons. Before parsing, the command line is read from stdin using readline. Therefore, if we intercept the call to readline, we can do whatever we want with the command line.
Unfortunately bash is statically linked against readline, so the call cannot be intercepted directly. But at least readline is a global symbol, so we can get the address of the function using dlsym, and with that address we can insert arbitrary instructions into readline.
Modifying readline directly is prone to errors if readline changes between bash versions, so we modify the function that calls readline instead, leading to the following plan:
Locate readline with dlsym
Replace readline with our own function that uses the current stack to locate the function calling readline (yy_readline_get) on its first call and then restores the original readline
Modify yy_readline_get to call our wrapper function
Within the wrapper function: Replace the parentheses with non problematic symbols, if the input starts with "C "
Written in C for amd64, we get:
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#ifndef __USE_GNU
#define __USE_GNU
#endif
#ifndef __USE_MISC
#define __USE_MISC
#endif
#include <dlfcn.h>
#include <unistd.h>
#include <sys/mman.h>
#include <errno.h>
//-----------Assembler helpers----------
#if (defined(x86_64) || defined(__x86_64__))
//assembler instruction to read rbp, which we need to read the stack
#define MOV_EBP_OUT "mov %%rbp, %0"
//size of a call instruction
#define RELATIVE_CALL_INSTRUCTION_SIZE 5
#define IS64BIT (1)
/*
To replace a function with a new one, we use the push-ret trick, pushing the destination address on the stack and let ret jump "back" to it
This has the advantage that we can set an additional return address in the same way, if the jump goes to a function
This struct corresponds to the following assembler fragment:
68 ???? push <low_dword (address)>
C7442404 ???? mov DWORD PTR [rsp+4], <high_dword (address)>
C3 ret
*/
typedef struct __attribute__((__packed__)) LongJump {
char push; unsigned int destinationLow;
unsigned int mov_dword_ptr_rsp4; unsigned int destinationHigh;
char ret;
// char nopFiller[16];
} LongJump;
void makeLongJump(void* destination, LongJump* res) {
res->push = 0x68;
res->destinationLow = (uintptr_t)destination & 0xFFFFFFFF;
res->mov_dword_ptr_rsp4 = 0x042444C7;
res->destinationHigh = ((uintptr_t)(destination) >> 32) & 0xFFFFFFFF;
res->ret = 0xC3;
}
//Macros to save and restore the rdi register, which is used to pass an address to readline (standard amd64 calling convention)
typedef unsigned long SavedParameter;
#define SAVE_PARAMETERS SavedParameter savedParameters; __asm__("mov %%rdi, %0": "=r"(savedParameters));
#define RESTORE_PARAMETERS __asm__("mov %0, %%rdi": : "r"(savedParameters));
#else
#error only implemented for amd64...
#endif
//Simulates the effect of the POP instructions, popping from a passed "stack pointer" and returning the popped value
static void * pop(void** stack){
void* temp = *(void**)(*stack);
*stack += sizeof(void*);
return temp;
}
//Disables the write protection of an address, so we can override it
static int unprotect(void * POINTER){
const int PAGESIZE = sysconf(_SC_PAGE_SIZE);
if (mprotect((void*)(((uintptr_t)POINTER & ~(PAGESIZE-1))), PAGESIZE, PROT_READ | PROT_WRITE | PROT_EXEC)) {
fprintf(stderr, "Failed to set permission on %p\n", POINTER);
return 1;
}
return 0;
}
//Debug stuff
static void fprintfhex(FILE* f, void * hash, int len) {
for (int i=0;i<len;i++) {
if ((uintptr_t)hash % 8 == 0 && (uintptr_t)i % 8 == 0 && i ) fprintf(f, " ");
fprintf(f, "%.2x", ((unsigned char*)(hash))[i]);
}
fprintf(f, "\n");
}
//---------------------------------------
//Address of the original readline function
static char* (*real_readline)(const char*)=0;
//The wrapper around readline we want to inject.
//It replaces () with [], if the command line starts with "C "
static char* readline_wrapper(const char* prompt){
if (!real_readline) return 0;
char* result = real_readline(prompt);
char* temp = result; while (*temp == ' ') temp++;
if (temp[0] == 'C' && temp[1] == ' ')
for (int len = strlen(temp), i=0;i<len;i++)
if (temp[i] == '(') temp[i] = '[';
else if (temp[i] == ')') temp[i] = ']';
return result;
}
//Backup of the changed readline part
static unsigned char oldreadline[2*sizeof(LongJump)] = {0x90};
//A wrapper around the readline wrapper, needed on amd64 (see below)
static LongJump* readline_wrapper_wrapper = 0;
static void readline_initwrapper(){
SAVE_PARAMETERS
if (readline_wrapper_wrapper) { fprintf(stderr, "ERROR!\n"); return; }
//restore readline
memcpy(real_readline, oldreadline, 2*sizeof(LongJump));
//find call in yy_readline_get
void * frame;
__asm__(MOV_EBP_OUT: "=r"(frame)); //current stackframe
pop(&frame); //pop current stackframe (??)
void * returnToFrame = frame;
if (pop(&frame) != real_readline) {
//now points to current return address
fprintf(stderr, "Got %p instead of %p=readline, when searching caller\n", frame, real_readline);
return;
}
void * caller = pop(&frame); //now points to the instruction following the call to readline
caller -= RELATIVE_CALL_INSTRUCTION_SIZE; //now points to the call instruction
//fprintf(stderr, "CALLER: %p\n", caller);
//caller should point to 0x00000000004229e1 <+145>: e8 4a e3 06 00 call 0x490d30 <readline>
if (*(unsigned char*)caller != 0xE8) { fprintf(stderr, "Expected CALL, got: "); fprintfhex(stderr, caller, 16); return; }
if (unprotect(caller)) return;
//We can now override caller to call an arbitrary function instead of readline.
//However, the CALL instruction accepts only a 32-bit relative operand, so the called function has to be within 32-bit reach of the call site
//Solution: Allocate memory at an address close to that CALL instruction and put a long jump to our real function there
void * hint = caller;
readline_wrapper_wrapper = 0;
do {
if (readline_wrapper_wrapper) munmap(readline_wrapper_wrapper, 2*sizeof(LongJump));
readline_wrapper_wrapper = mmap(hint, 2*sizeof(LongJump), PROT_EXEC | PROT_READ | PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if (readline_wrapper_wrapper == MAP_FAILED) { fprintf(stderr, "mmap failed: %i\n", errno); return; }
hint += 0x100000;
} while ( IS64BIT && ( (uintptr_t)readline_wrapper_wrapper >= 0xFFFFFFFF + ((uintptr_t) caller) ) ); //repeat until we get an address really close to caller
//fprintf(stderr, "X:%p\n", readline_wrapper_wrapper);
makeLongJump(readline_wrapper, readline_wrapper_wrapper); //Write the long jump in the newly allocated space
//fprintfhex(stderr, readline_wrapper_wrapper, 16);
//fprintfhex(stderr, caller, 16);
//patch caller to become call <readline_wrapper_wrapper>
//called address is relative to address of CALL instruction
*(uint32_t*)(caller+1) = (uint32_t) ((uintptr_t)readline_wrapper_wrapper - (uintptr_t)(caller + RELATIVE_CALL_INSTRUCTION_SIZE) );
//fprintfhex(stderr, caller, 16);
*(void**)(returnToFrame) = readline_wrapper_wrapper; //change stack to jump to wrapper instead real_readline (or it would not work on the first entered command)
RESTORE_PARAMETERS
}
static void _calc_init(void) __attribute__ ((constructor));
static void _calc_init(void){
if (!real_readline) {
//Find readline
real_readline = (char* (*)(const char*)) dlsym(RTLD_DEFAULT, "readline");
if (!real_readline) return;
//fprintf(stdout, "loaded %p\n", real_readline);
//fprintf(stdout, " => %x\n", * ((int*) real_readline));
if (unprotect(real_readline)) { fprintf(stderr, "Failed to unprotect readline\n"); return; }
memcpy(oldreadline, real_readline, 2*sizeof(LongJump)); //backup readline's instructions
//Replace readline with readline_initwrapper
makeLongJump(real_readline, (LongJump*)real_readline); //add a push/ret long jump from readline to readline, to have readline's address on the stack in readline_initwrapper
makeLongJump(readline_initwrapper, (LongJump*)((char*)real_readline + sizeof(LongJump) - 1)); //add a push/ret long jump from readline to readline_initwrapper, overriding the previous RET
}
}
This can be compiled to an intercepting library with:
gcc -g -std=c99 -shared -fPIC -o calc.so calc.c -ldl
and then loaded in bash with:
gdb --batch-silent -ex "attach $BASHPID" -ex 'print dlopen("calc.so", 0x101)'
Now, when the previous alias extended with parenthesis replacement is loaded:
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | tr [ '(' | tr ] ')' | bc -l; set +f; };
We can write:
$ C 1 * 2
2
$ C 2*(2+1)
6
$ C (2+1)*2
6
It gets even better if we switch from bc to qalculate:
alias C='set -f -B; Cf '
function Cf () { echo "$@" | tr -d ', \042-\047' | tr [ '(' | tr ] ')' | xargs qalc ; set +f; };
Then we can do:
$ C e ^ (i * pi)
e^(i * pi) = -1
$ C 3 c
3 * speed_of_light = approx. 899.37737(km / ms)

How to call methods or execute code in LLDB debugger?

I know I can type print someFloatVariable when I set a breakpoint or po [self someIvarHoldingAnObject], but I can't do useful things like:
[self setAlpha:1];
Then it spits out:
error: '[self' is not a valid command.
The weird thing is that I can call po [self someIvarHoldingAnObject] and it will print its description.
I believe I've seen a video a year ago where someone demonstrated how to execute code through the console at runtime, and if I am not mistaken this guy also provided arguments and assigned objects to pointers. How can I do that?
The canonical reference for gdb v. lldb commands is http://lldb.llvm.org/lldb-gdb.html
You want to use the expr command which evaluates an expression. It's one of the lldb commands that takes "raw input" in addition to arguments so you often need a "--" to indicate where the arguments (to expr) end and the command(s) begin. e.g.
(lldb) expr -- [self setAlpha:1]
There is a shortcut, "p", which does the -- for you (but doesn't allow any arguments), e.g.
(lldb) p [self setAlpha:1]
If the function(s) you're calling are not part of your program, you'll often need to explicitly declare their return type so lldb knows how to call them. e.g.
(lldb) p printf("hi\n")
error: 'printf' has unknown return type; cast the call to its declared return type
error: 1 errors parsing expression
(lldb) p (int)printf("hi\n")
(int) $0 = 3
hi
(lldb)
There is a neat way to work around the floating point argument problem, BTW. You create an "expression prefix" file which is added to every expression you enter in lldb, with a prototype of your class methods. For instance, I have a class MyClass which inherits from NSObject, it has two methods of interest, "setArg:" and "getArg" which set and get a float ivar. This is a silly little example, but it shows how to use it. Here's a prefix file I wrote for lldb:
@interface NSObject
@end
@interface MyClass : NSObject
- init;
- setArg: (float)arg;
- (float) getArg;
@end
extern "C" {
int strcmp (const char *, const char *);
int printf(const char * __restrict, ...);
int puts (const char *);
}
In my ~/.lldbinit file I add
settings set target.expr-prefix /Users/jason/lldb-prefix.h
and now I can do
(lldb) p [var getArg]
(float) $0 = 0.5
(lldb) p [var setArg:0.7]
(id) $1 = 0x0000000100104740
(lldb) p [var getArg]
(float) $2 = 0.7
You'll notice I included a couple of standard C library functions in here too. After doing this, I don't need to cast the return types of these any more, e.g.
(lldb) p printf("HI\n")
<no result>
HI
(lldb) p strcmp ("HI", "THERE")
(int) $3 = -12
(a fix for that "<no result>" thing has been committed to the lldb TOT sources already.)
If you need multi-line input, run expression with no arguments:
expression
do {
try thing.save()
} catch {
print(error)
}
// code will execute now
Blank line to finish and execute the code.

Is it possible to set a conditional breakpoint at the end of a function based on what the function is about to return?

I have a more complicated version of the following:
unsigned int foo ();
unsigned int bar ();
unsigned int myFunc () {
return foo()+bar();
}
In my case, myFunc is called from lots of places. In one of the contexts there is something going wrong. I know from debugging further down what the return value of this function is when things are bad, but unfortunately I don't know what path resulted in this value.
I could add a temporary variable that stored the result of the expression "foo()+bar()" and then add the conditional breakpoint on that value, but I was wondering if it is possible to do in some other way.
I'm working on x86 architecture.
From this and this answer I thought I could set a breakpoint at the exact location of the return from the function:
gdb> break *$eip
And then add a conditional breakpoint based on the $eax register, but at least in my tests here the return is not in this register.
Is this possible?
Agree with previous commenter that this is probably something you don't want to do, but for me, setting a conditional breakpoint at the last instruction on $eax (or $rax if you are on 64-bit x86) works just fine.
For the code
unsigned int foo(void) { return 1; }
unsigned int bar(void) { return 4; }
unsigned int myFunc(void) { return foo()+bar(); }
using gdb:
(gdb) disass myFunc
Dump of assembler code for function myFunc:
0x080483d8 <myFunc+0>: push %ebp
0x080483d9 <myFunc+1>: mov %esp,%ebp
0x080483db <myFunc+3>: push %ebx
0x080483dc <myFunc+4>: call 0x80483c4 <foo>
0x080483e1 <myFunc+9>: mov %eax,%ebx
0x080483e3 <myFunc+11>: call 0x80483ce <bar>
0x080483e8 <myFunc+16>: lea (%ebx,%eax,1),%eax
0x080483eb <myFunc+19>: pop %ebx
0x080483ec <myFunc+20>: pop %ebp
0x080483ed <myFunc+21>: ret
End of assembler dump.
(gdb) b *0x080483ed if $eax==5
Breakpoint 1 at 0x80483ed
(gdb) run
Starting program: /tmp/x
Breakpoint 1, 0x080483ed in myFunc ()
(gdb)
I can't tell whether you're compiling from the command line or not, but in Visual Studio, once you have set your breakpoint, right-click it and choose the "Condition..." option to open a dialog where you can edit the condition under which the breakpoint breaks.
Hope this helps! :-)
