Is there a way to get the instruction pointer of a running application Unix?
I have a running process (C++) and want to get its current location, and thereafter in GDB (on a different machine) map the location to source location ('list' command).
On Linux, there is /proc/[pid]/stat.
From "man proc":
stat Status information about the process. This is used by
ps(1). It is defined in /usr/src/linux/fs/proc/array.c.
...
kstkeip %lu
The current EIP (instruction pointer).
AFAICT, the 29th field of the output corresponds to the current instruction pointer of the process. For example:
gdb date
GNU gdb Red Hat Linux (6.0post-0.20040223.20rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)...Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(gdb) set stop-on-solib-events 1
(gdb) run
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[Thread debugging using libthread_db enabled]
[New Thread 182896391360 (LWP 27968)]
(no debugging symbols found)...Stopped due to shared library event
(gdb) c
[Switching to Thread 182896391360 (LWP 27968)]
Stopped due to shared library event
(gdb) where
#0 0x00000036b060bb20 in _dl_debug_state_internal () from /lib64/ld-linux-x86-64.so.2
#1 0x00000036b060b51c in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#2 0x00000036b0600f72 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#3 0x0000000000000001 in ?? ()
#4 0x0000007fbff62728 in ?? ()
#5 0x0000000000000000 in ?? ()
(gdb) shell cat /proc/27968/stat
27968 (date) T 27839 27968 8955 34817 27839 4194304 42 0 330 0 0 0 0 0 18 0 0 0 1881668573 6144000 78 18446744073709551615 4194304 4234416 548680739552 18446744073709551615 234887363360 0 0 0 0 18446744071563322838 0 0 17 0 0 0 0 0 0 0
(gdb) p/a 234887363360 <--- the value of 29th field
$1 = 0x36b060bb20 <_dl_debug_state_internal>
The instruction pointer can be retrieved on Linux with the following code:
pid_t traced_process;
struct user_regs_struct regs;
ptrace(PTRACE_ATTACH, traced_process, NULL, NULL);
ptrace(PTRACE_GETREGS, traced_process, NULL, ®s);
printf("EIP: %lx\n", regs.eip);
You will need to temporarily stop the process or thread in order to get its current instruction pointer. You can do it by attaching to the process with ptrace() or (on HP-UX) ttrace() and accessing the registers.
If you're using gdb anyway, you can simply attach yourself to a running process like this:
gdb program 1234
where program is the name of the executable you're debugging, and 1234 is the PID. You can then use all of the facilities of gdb to debug the process.
Related
I am familiar with using windbg or IDA for remote kernel debugging, but right now i have extracted a kernel driver from an executable, and have done static analysis on its IDB and renamed a lot of variables, what is the easiest way of using my IDB file to debug the driver on the remote debugee when it gets loaded by the executable?
I know how to attach to remote kernel using IDA, but how can i use my current IDB file, and put breakpoint on some of its functions so it they get hit when the driver is loaded? (I dont have the corresponding pdb file for the driver so i can't use symbols for breakpoint)
this is a vanilla windbg answer to break on any DriverInit
once you have Broken on DriverInit You Can Lookup and Set bp on all MajorFunctions
Assuming you have a regular kd Connection
use sxe -ibp;.reboot to reboot the target
on reconnection the target will break very Early as below
kd> sxe ibp;.reboot
Shutdown occurred at (Sun Oct 18 02:58:09.077 2020 )...unloading all symbol tables.
Waiting to reconnect...
Connected to Windows 7 7601 x86 compatible target at (xxx), ptr64 FALSE
Kernel Debugger connection established. (Initial Breakpoint requested)
once broken set a breakpoint on nt!IopLoadDriver
inside this function search for an indirect Call
that Calls the _DRIVER_OBJECT->DriverInit
kd> ?? #FIELD_OFFSET(nt!_DRIVER_OBJECT , DriverInit)
long 0x2c
like
nt!IopLoadDriver+0x7ea:
829d5355 ff562c call dword ptr[esi+2Ch] ds:84f2928c={cdrom!FxDriverEntry (87eb53cf)}
set a break point here to
you are now set to enter almost every driver that is loaded
once you are on entrypoint of any Driver
use the DriverObject (an argument the DriverEntry Takes )
and Set Breakpoints on each MAJORFunction
kd> bp . "du poi(#esi+1c+4);gc"
kd> bl
0 e Disable Clear 829d4b6a 0001 (0001) nt!IopLoadDriver
1 e Disable Clear 829d5355 0001 (0001) nt!IopLoadDriver+0x7ea "du poi(#esi+1c+4);gc"
kd> bd 0
kd> bl
0 d Enable Clear 829d4b6a 0001 (0001) nt!IopLoadDriver
1 e Disable Clear 829d5355 0001 (0001) nt!IopLoadDriver+0x7ea "du poi(#esi+1c+4);gc"
kd> g
841bd1d0 "\Driver\Null.Ѕ捁印䍁䥐停偎〰〰"
84f18718 "\Driver\Beep.Б浍摌䂈蓶䈸蓶...."
84eef210 "\Driver\VgaSave"
84eb2860 "\Driver\RDPCDDᛛ..В浍慃憠褎.蓫菌蓲"
84e903c0 "\Driver\RDPENCDD..浍摌읨蓤潤獷獜獹整.尲牤癩牥"
84e90400 "屳摲数据摤献獹"
84ef15c0 "\Driver\RDPREFMP..牉..蓧"
84ef4a78 "\FileSystem\Msfs.В浍慃冀褘蝴蓳荤蓶"
84f191f0 "\FileSystem\Npfs.З獍䑆.°"
I have a full dump of a VM with windows 10 installed. This dump was taken from a hard hanged system, frozen mouse and keyboard, totally unresponsive.
While analyzing I found that there are no thread in running or ready state. No deadlocks. Only suspicious thing is that there are a lot of thread waiting for a reply from ALPC and also there is page fault pattern in as lot of threads that looks like this:
ffffbc0f`151380f0 fffff805`1e4e081c Ntfs!NtfsNonCachedIo+0x4ea
ffffbc0f`151383b0 fffff805`1e4df8bc Ntfs!NtfsCommonRead+0xd2c
ffffbc0f`151385b0 fffff805`19687d3a Ntfs!NtfsFsdRead+0x1fc
ffffbc0f`15138680 fffff805`19687ce7 nt!IopfCallDriver+0x46
ffffbc0f`151386c0 fffff805`1d926ccf nt!IofCallDriver+0x17
ffffbc0f`151386f0 fffff805`1d9248d3 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x28f
ffffbc0f`15138760 fffff805`19687d3a FLTMGR!FltpDispatch+0xa3
ffffbc0f`151387c0 fffff805`19687ce7 nt!IopfCallDriver+0x46
ffffbc0f`15138800 fffff805`196215b2 nt!IofCallDriver+0x17
ffffbc0f`15138830 fffff805`196221e2 nt!IoPageReadEx+0x1e6
ffffbc0f`151388a0 fffff805`19622eee nt!MiIssueHardFaultIo+0xb6
ffffbc0f`151388f0 fffff805`19666566 nt!MiIssueHardFault+0x48e
ffffbc0f`151389f0 fffff805`197aba1e nt!MmAccessFault+0x276
ffffbc0f`15138b00 00007ffd`2e42ec10 nt!KiPageFault+0x35e (TrapFrame # ffffbc0f`15138b00)
also almost every thread (maybe I've seen 1 or 2 that don't) in every process ends with this:
ffffbc0f`15d08df0 fffff805`1966aad4 nt!KiSwapContext+0x76
ffffbc0f`15d08f30 fffff805`196657ca nt!KiSwapThread+0x190
ffffbc0f`15d08fa0 fffff805`19666fb0 nt!KiCommitThreadWait+0x13a
ffffbc0f`15d09050 fffff805`1e4e261a nt!KeWaitForSingleObject+0x140
I have one particular example of a thread with a page fault belonging to a prl_tools_service.exe (which is Parallels VM related service) that has same pattern and when looking into trap frame at the moment of KiPageFault there was an attempt to get value from an address in eax and in trap frame rax=0000000000000001 which can't be a valid address and I can't see how this page fault can be resolved.
IRQLs of both processors are LOW_LEVEL
The question, basically, is - where should I look for any faults since there must be a kernel problem (hence mouse and keyboard freeze) and how do I find wether this example of page fault pattern could stall the kernel.
Since the question is pretty vague any kind of response, a direction where to look, a hint - every thing will be much appreciated
UPD: as requested by Lieven Keersmaekers and blabb here is !analyze -hang output:
0: kd> !analyze -hang
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Unknown bugcheck code (0)
Unknown bugcheck description
Arguments:
Arg1: 0000000000000000
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000
Debugging Details:
------------------
Scanning for threads blocked on locks ...
Cannot get _ERESOURCE type
BUGCHECK_CODE: 0
BUGCHECK_P1: 0
BUGCHECK_P2: 0
BUGCHECK_P3: 0
BUGCHECK_P4: 0
PROCESS_NAME: System
ERROR_CODE: (NTSTATUS) 0x45474150 - <Unable to get error code text>
SYMBOL_NAME: nt!PpmIdleGuestExecute+1d
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
FAILURE_BUCKET_ID: 0x0_STACKPTR_ERROR_nt!PpmIdleGuestExecute
FAILURE_ID_HASH: {94784d45-ed21-c95f-fc42-87fec626bbee}
Followup: MachineOwner
---------
The main topic is to write a code in C++ for Hash with Chaining (using single link list).here, data has been provided in terms of array of long datatype and we have to store them in hash(table size 13) in a sorted manner.
Here is my code for the same.
https://onlinegdb.com/B1pbgjxAI
There is no compiler error in the code but while running the code the following error arises.
*** buffer overflow detected ***: ./Solution terminated
Reading symbols from Solution...done.
[New LWP 86657]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./Solution'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig#entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
To enable execution of this file add
add-auto-load-safe-path /usr/local/lib64/libstdc++.so.6.0.25-gdb.py
line to your configuration file "//.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "//.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
Here, for the testcase, input is
201911169
and the output should be
93
In line 36 you're calling strcpy(p->name,Name), but Name is passed from x in main, and char x[4] isn't null-terminated as you only assign to x[j] for j from 2 downto 0. Add a statement x[3] = '\0';.
Yes, this question has been asked before, but reading the answers didn't enlighten me much.
I wrote a C program that crashes after a few days of use. An important point is that it does NOT generate a core file, even though everything is set up so that it should (core_pattern, ulimit -c unlimited, etc. I can trigger a core dump fine with kill -SIGQUIT).
The programs extensively logs what it does, but there's no hint about the crash in the log.
The only message displayed at the crash (or before?) is:
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
after 2322 requests (2322 known processed) with 0 events remaining.
So two questions:
- how is it possible for a program to crash (return $?=1) without core dump.
- what is this error message about and what can I do ?
System is RedHat Enterprise 6.4
Edit:
I managed to force a core dump by calling abort() from inside an atexit() callback:
(gdb) bt
#0 0x00bc8424 in __kernel_vsyscall ()
#1 0x0085a861 in raise () from /lib/libc.so.6
#2 0x0085c13a in abort () from /lib/libc.so.6
#3 0x0808f5cf in Unexpected () at MyCode.c:1378
#4 0x0085de9f in exit () from /lib/libc.so.6
#5 0x00c85701 in _XDefaultIOError () from /usr/lib/libX11.so.6
#6 0x00c85797 in _XIOError () from /usr/lib/libX11.so.6
#7 0x00c84055 in _XReply () from /usr/lib/libX11.so.6
#8 0x00c68b8f in XGetImage () from /usr/lib/libX11.so.6
#9 0x004fd6a7 in ?? () from /usr/local/lib/libcvi.so
#10 0x00478ad5 in ?? () from /usr/local/lib/libcvi.so
...
#29 0x001eed9d in ?? () from /usr/local/lib/libcvi.so
#30 0x001eee41 in RunUserInterface () from /usr/local/lib/libcvi.so
#31 0x0808fab4 in main (argc=2, argv=0xbfbdc984) at MyCode.c:1540
Anyone can enlighten me as to this X11 problem ? libcvi.so is not mine, only MyCode.c (LabWindows/CVI).
Edit 2014-12-05:
Here's an even more precise backtrace. Things definitely happen in X11, but I'm no X11 programmer, so looking at the source code for X from the provided linestell me only that the X server (?) is temporarily unavailable. Is there any way to simply tell it to ignore this error if it's only temporary ?
#4 0x00965eaf in __run_exit_handlers (status=1) at exit.c:78
#5 exit (status=1) at exit.c:100
#6 0x00c356b1 in _XDefaultIOError (dpy=0x88aeb80) at XlibInt.c:1292
#7 0x00c35747 in _XIOError (dpy=0x88aeb80) at XlibInt.c:1498
#8 0x00c340a6 in _XReply (dpy=0x88aeb80, rep=0xbf82fa90, extra=0, discard=0) at xcb_io.c:708
#9 0x00c18c0f in XGetImage (dpy=0x88aeb80, d=27263845, x=0, y=0, width=60, height=20, plane_mask=4294967295, format=2) at GetImage.c:75
#10 0x005f46a7 in ?? () from /usr/local/lib/libcvi.so
Corresponding lines:
XlibInt.c: _XDefaultIOError()
1292: exit(1);
XlibInt.c: _XIOError
1498: _XDefaultIOError(dpy);
xcb_io.c: _XReply()
708: if(!reply) _XIOError(dpy);
GetImage.c: XGetImage()
74: if (_XReply (dpy, (xReply *) &rep, 0, xFalse) == 0 || ...
OK, I finally found the cause (thanks to someone at National Instruments), a better diagnostic and a workaround.
The bug is in many versions of libxcb and is a 32-bit counter rollover problem that has been known for a few years: https://bugs.freedesktop.org/show_bug.cgi?id=71338
Not all versions of libxcb are affected libxcb-1.9-5 has it, libxcb-1.5-1 doesn't. From the bug list, 64-bits OS shouldn't be affected, but I managed to trigger it on at least one version.
Which brings me to a better diagnostic. The following program will crash in less than 15 minutes on affected libraries (better than the entire week it previously took):
// Compile with: gcc test.c -lX11 && time ./a.out
#include <X11/Xlib.h>
void main(void) {
Display *d = XOpenDisplay(NULL);
if (d)
for(;;)
XNoOp(d);
}
And one final thing, the above prog compiled and ran on a 64-bit system works fine, compiled and ran on an old 32-bit system also works fine, but if I transfer the 32-bit version to the 64-bit system, it crashes after a few minutes.
I just had a program that acted exactly like this, with exactly the same error message. I would expect the counter error to process 2^32 events before crashing.
The program was structured so that a worker thread has a separate X connection to the X thread so that it can send messages to the X thread to update the window.
In the end I traced the problem down to a place where the function sending the events to the window to redraw it was called by multiple threads, without a mutex on it, and since X to the same X connection is not re-entrant, crashed with this exact error. Put in a mutex on the function and no problems since.
In the book "Rootkit Arsenal" page 84 (Chapter 3) mentions:
..., we can view the contents of the
target machine's descriptor registers
using the command with the 0x100 mask:
kd> rM 0x100
and a paragraph below:
Note that the same task can be
accomplished by specifying the GDTR
components explicitly: kd> r gdtr ....
I run Windbg on my Win XP (inside VMWare) and choose the Kernel Debug -> Local.
My problem is in case of first command, windbg errors with:
lkd> rM 0x100
^ Operation not supported in current debug session 'rM 0x100'
and in the second command:
lkd> r gdtr
^ Bad register error in 'r gdtr'
Can anyone guide me ?
Right, you can't look at registers in a local kernel debug session. LiveKD works and you can also get the address indirectly through the PCR (!pcr).
-scott
I think I've found the solution:
Use two computers for kernel debugging instead of Local Kernel Debug.
(I used VMWare and am debugging through the COM port/named pipe)
I am thinking why this facility/feature (Local Kernel Debugging) is there if it's not complete ?