Debugging dyld under OS X - macos

I've got some obscure errors in an OS X program concerning the loading and unloading and symbol bindings of dynamically loaded libraries. First attempts to analyse the problem by using the DYLD_PRINT_* environment variables failed.
I solved similar problems under GNU/Linux by installing the glibc with debug symbols and the corresponding sources. Since the sources for dyld are also available, something similar must be possible under OS X.
How do I have to proceed in order to set up a debugging session and step through the sources of dyld in order to understand what went wrong with the program? Is it possible to start an application using a different hand-crafted dyld?

You can set a symbolic breakpoint on dyld`dyldbootstrap::start.
That is, the Symbol is "dyldbootstrap::start" and the Module is "dyld".
Actually, we can also set a symbolic breakpoint on dyld`_dyld_start, and we can see it enabled after the process has launched, but it won't be hit.

Yes, and it's actually designed to be this way. You can drop your custom dyld in the file system, making sure its LC_ID_DYLINKER command is set properly. Then, to use it, edit the Mach-O you are loading so that its LC_LOAD_DYLINKER points to it.
Mind you, it's possible to just step through dyld anyway without all this: use lldb and do process launch -s, then you can single-step right through dyld as well, albeit in assembly.
Caveat: Don't touch or move /usr/lib/dyld in the process; drop the custom dyld next to it instead. Since virtually everything requires dyld, moving it can be a pain to undo (it requires booting from a ramdisk and mounting the root file system as a secondary volume just to issue the correcting mv).
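To check which dynamic linker a binary asks for before and after editing it, the load commands can be read directly. The following is a minimal sketch, assuming a thin (non-fat) little-endian 64-bit Mach-O file; the function name dylinker_paths is made up for illustration, and the constants are the values from <mach-o/loader.h>.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic of a little-endian 64-bit Mach-O
LC_LOAD_DYLINKER = 0xE     # in an executable: path of the dyld to use
LC_ID_DYLINKER = 0xF       # in dyld itself: its own install path


def dylinker_paths(data: bytes) -> dict:
    """Return {load command value: path} for the dylinker commands found.

    Assumption: `data` holds a thin 64-bit little-endian Mach-O image.
    """
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsv = struct.unpack_from(
        "<8I", data, 0
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")

    found = {}
    offset = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd in (LC_LOAD_DYLINKER, LC_ID_DYLINKER):
            # dylinker_command: cmd, cmdsize, then the offset of the
            # NUL-terminated path relative to the start of this command.
            (name_off,) = struct.unpack_from("<I", data, offset + 8)
            raw = data[offset + name_off : offset + cmdsize]
            found[cmd] = raw.split(b"\0", 1)[0].decode()
        offset += cmdsize
    return found
```

Calling dylinker_paths(open(path, "rb").read()) on an ordinary executable should report /usr/lib/dyld under LC_LOAD_DYLINKER; after the edit described above it should report the path of the custom dyld instead.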

Related

How to load debug symbols for the whole OVMF UEFI image in gdb?

I am trying to debug a driver in UEFI firmware (OVMF) via gdb as described here:
https://github.com/tianocore/tianocore.github.io/wiki/How-to-debug-OVMF-with-QEMU-using-GDB
It works well, but I discovered that just having debug symbols for my driver is not enough. I also need debug symbols for the whole OVMF image to properly see what's going on. I have a lot of .debug files after OVMF is built with edk2, but I don't understand which ones I need to load into gdb, and what addresses I should use.
I found some instructions involving DebugPkg, but I couldn't make gdb_uefi.py work no matter what. It always failed to locate EFI_SYSTEM_TABLE_POINTER.
In the end, I wrote my own script, which implements a gdb command that does manage to load all debug symbols. It is probably a worse solution, since it requires some setup: a "debug.log" with the driver addresses must be present when loading is performed, so you need to run QEMU at least once first. But this is good enough for me.
My script can be found here:
https://github.com/artem-nefedov/uefi-gdb

How can I remove the need of wpcap.dll in my go program?

I use gopacket in my program. On Linux, it runs perfectly.
But on Windows the whole program crashes if I have not installed WinPcap beforehand.
My plan was to check whether WinPcap is installed and, if not, to inform the user that it is needed in order to use all features.
But I never get to that point. I can't use gopacket at all if WinPcap is not available. I mean, not a single line of its code (=> crash).
Does anyone have an idea how I can solve this? I do not actually need gopacket. My plan was: if it is installed, fine, super! If not, don't care, do other things.
But now I have two choices: remove gopacket completely, or find a way to start my program without needing wpcap.dll, at least far enough to tell the user that it is required.
Please help me :(
You're wrong in that you are «not [using] a single line of code of it»: it's not hard to see that its Windows-specific code calls into wpcap.dll.
What is more fun is that its Unix-specific code calls into libpcap.so, which means it works on your local system simply because you have the libpcap package installed (or whatever it is named on your system).
All this means that currently your program is not really portable anyway (I mean, in the sense you supposedly think it is portable).
You can run something like
$ ldd ./yourbinary
and see it printing a reference to libpcap.so of some version.
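If you want to do that check from a script rather than by eye, a small filter over the ldd output is enough. This is only a sketch: the function name pcap_dependencies is made up, and it assumes the usual "name => path (address)" line format printed by glibc's ldd.

```python
def pcap_dependencies(ldd_output: str) -> list:
    """Return (library name, resolved path or None) for pcap libraries.

    Assumption: lines look like "libfoo.so.1 => /path/libfoo.so.1 (0x...)"
    or "libfoo.so.1 => not found", as printed by glibc's ldd.
    """
    deps = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[1] != "=>":
            continue  # skips lines such as the vdso or the loader itself
        name, target = fields[0], fields[2]
        if "pcap" in name:
            # "not found" splits into "not", "found"; report it as None.
            deps.append((name, None if target == "not" else target))
    return deps
```

Feed it the text captured from ldd ./yourbinary; an entry whose path is None means the binary would fail to start on that machine for the same reason it fails on Windows without the DLL.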
There are several ways to solve this.
The easiest is to just try shipping wpcap.dll with your binary. Windows by default looks for DLLs in the directory of the application trying to load them (and in its current working directory). Since gopacket uses cgo, the loader attempts to link wpcap.dll at application startup, so the application has no chance of changing its working directory before the system tries to find and link that library.
A more complicated approach is to make (or obtain) a static version of the WinPcap library (remember that a DLL is a library, just a special form of it) and then jump through the hoops of building gopacket so that it picks up that static library.
Install Npcap in "WinPcap API-compatible Mode".

I need to find the point in my userland code that crashes my kernel

I have a big application that makes my system crash hard. When I boot up again, I don't even have a coredump.
If I log every line that gets executed until my system goes down, I will find that evil code.
Can I log every source code line in GDB to a file?
UPDATE:
OK, I found the bug. It was nasty. The application I started was not the one that took the system down. After learning about coredump inspection with mdb, and after some gdb stepping, I found out that the system call causing the dump was not implemented. Updating the system to the latest kernel will fix my problem. Thanks to all of you.
MY LESSON:
make sure you know what process causes the coredump. It's not always the one you started.
Sounds like a tricky little problem.
I often try to eliminate as many possible suspects as I can by commenting out large chunks of code, configuring the system to not run certain pieces (if it allows you to do that) etc. This amounts to doing an ad-hoc binary search on the problem, and is a surprisingly effective way of zooming in on offending code relatively quickly.
A potential problem with logging is that the log might not hit the disk before the system locks up - if you don't get a core dump, you might not get the log.
Speaking of core dumps, make sure you don't have a limit on your core dump size (man ulimit.)
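If the crashing program is started from a launcher script rather than an interactive shell, the same limit can be lifted programmatically. A minimal sketch using Python's standard resource module (the function name raise_core_limit is made up); it does the equivalent of raising ulimit -c to the hard limit for the current process and its children:

```python
import resource


def raise_core_limit():
    """Raise the soft core-dump size limit to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. An unprivileged
    process cannot exceed the hard limit, so that is the ceiling here.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft != hard:
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

Call it before spawning the program under test, since resource limits are inherited by child processes. If the hard limit itself is 0, no core will be written until an administrator raises it.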
You could try to obtain a list of all the functions in your code using objdump, process it a little bit and create a bunch of GDB trace statements on those functions - basically creating a GDB script automatically. If that turns out to be overkill, then a binary search on the code using tracepoints can also help you zoom in on the problem.
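A rough sketch of that idea, assuming the symbol table comes from objdump -t and that logging each function entry is enough. Note that it uses ordinary breakpoints with attached commands rather than GDB tracepoints; the helper names and the log file name are made up for illustration.

```python
def function_names(objdump_t_output: str) -> list:
    """Extract names of functions defined in .text from `objdump -t` text.

    Assumption: function symbols carry an "F" flag and the symbol name is
    the last whitespace-separated field on the line.
    """
    names = []
    for line in objdump_t_output.splitlines():
        fields = line.split()
        if "F" in fields and ".text" in fields:
            names.append(fields[-1])
    return names


def gdb_trace_script(names, log_file="gdb-trace.log") -> str:
    """Build a GDB command file that logs one frame at every function entry."""
    lines = [
        "set pagination off",
        f"set logging file {log_file}",
        "set logging on",  # newer GDB releases spell this "set logging enabled on"
    ]
    for name in names:
        # Each breakpoint prints the current frame and resumes immediately.
        lines += [f"break {name}", "commands", "silent", "backtrace 1", "continue", "end"]
    lines.append("run")
    return "\n".join(lines) + "\n"
```

Write the returned text to a file and start the program with gdb -x that-file ./yourprogram. Expect it to be slow, and keep in mind that the last log lines may not reach the disk before a hard crash.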
And don't panic. You're smarter than the bug - you'll find it.
You cannot reasonably track every line of your source using GDB (too slow). Besides, a system crash is most likely the result of a system call, and libc is probably making the system call on your behalf. Even if you find the line of the application that caused the OS crash, you still don't really know anything.
You should start by clarifying which OS is crashing. For Linux, you can try the following approaches:
strace -fo trace.out /path/to/app
After reboot, trace.out will contain syscalls the application was doing just before the crash. If you are lucky, you'll see the last syscall-of-death, but I wouldn't count on it.
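To pick out the candidates after the reboot, it can help to reduce trace.out to the last system call seen for each process. A minimal sketch, assuming the default strace -f -o line format in which every line starts with the PID; the function name last_syscalls is made up.

```python
import re

# "1234 openat(AT_FDCWD, ...) = 3" or "1234 read(0,  <unfinished ...>"
CALL = re.compile(r"^(\d+)\s+(\w+)\(")
# "1234 <... read resumed>..., 1) = 1"
RESUMED = re.compile(r"^(\d+)\s+<\.\.\. (\w+) resumed>")


def last_syscalls(trace_lines) -> dict:
    """Map each PID to (last syscall name, still_unfinished).

    A call that was entered but never returned before the log ends is the
    most interesting suspect, so that state is tracked separately.
    """
    last = {}
    for line in trace_lines:
        call = CALL.match(line)
        if call:
            last[int(call.group(1))] = (call.group(2), "<unfinished" in line)
            continue
        resumed = RESUMED.match(line)
        if resumed:
            last[int(resumed.group(1))] = (resumed.group(2), False)
    return last
```

Run it as last_syscalls(open("trace.out")) and look first at entries flagged as unfinished. The same caveat applies as above: the final lines may never have been flushed to disk.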
Alternatively, try to reproduce the crash on user-mode Linux, or on a kernel with KGDB compiled in.
These will tell you where the problem in the kernel is. Finding the matching system call in your application will likely be trivial.
Please clarify your problem: What part of the system is crashing?
Is it an application?
If so, which application? Is this an application which you have written yourself? Is this an application you have obtained from elsewhere? Can you obtain a clean interrupt if you use a debugger? Can you obtain a backtrace showing which functions are calling the section of code which crashes?
Is it a new hardware driver?
Is it based on an older driver? If so, what has changed? Is it based on a manufacturer's data sheet? Is that data sheet the latest and most correct?
Is it somewhere in the kernel? Which kernel?
What is the OS? I assume it is Linux, seeing that you are using the GNU debugger. But of course, that is not necessarily so.
You say you have no coredump. Have you enabled coredumps on your machine? Most systems these days do not have coredumps enabled by default.
Regarding logging GDB output, you may have some success, but it depends where the problem is whether or not you will have the right output logged before the system crashes. There is plenty of delay in writing to disk. You may not catch it in time.
I'm not familiar with the gdb way of doing this, but with windbg the way to go is to have a debugger attached to the kernel and control the debugger remotely over a serial cable (or firewire) from a second debugger. I'm pretty sure gdb has similar capabilities, I could quickly find some hints here: http://www.digipedia.pl/man/gdb.4.html

Finding out why a process is spending time in the kernel in win32

I'm compiling a vc8 C++ project in a WinXp VmWare session. It's a hell of a lot slower than gcc3.2 in a RedHat VmWare session, so I'm looking at Task Manager. It's saying a very large percentage of my compile process is spent in the kernel. That doesn't sound right to me.
Is there an equivalent of strace for Win32? At least something which will give me an overview of which kernel functions are being called. There might be something that stands out as being the culprit.
Windows Resource Kit contains a tool called kernrate. It's a sampling profiler. It can profile entire system or a particular process. By default, its resolution is on a module level, but can be tuned down to several bytes. You should be fine with default resolution as you'll see which modules/drivers are consuming most of the time.
Here is some info regarding its use.
Not exactly strace, but there is a way of getting visibility into the kernel call stack, and by sampling it at times of high CPU usage, you can usually estimate what's using up all the time.
Install Process Explorer and make sure you configure it with symbol server support. You can do this as follows:
Install WinDbg (Debugging Tools for Windows) to get an updated dbghelp.dll.
Set Process Explorer to use this version of dbghelp.dll by setting the path in the Options | Configure Symbols menu of Process Explorer.
In the same dialog, set the symbols path so that it includes the MS symbol server and a local cache.
Here's an example value for the symbol path:
SRV*C:\symbolcache*http://msdl.microsoft.com/download/symbols
(You can set _NT_SYMBOL_PATH environment variable to the same value to have the debugging tools use the same symbol server and cache path.) This path will cause dbghelp.dll to download symbols to local disk when asked for symbols for a module that doesn't have symbols locally.
After having set up Process Explorer like this, you can then open a process's properties, go to the Threads tab, and double-click on the busiest thread. This will cause Process Explorer to temporarily hook into the process and scan the thread's stack, and then look up the symbols for the various return addresses on the stack. The symbols for the return addresses, and the module names (for non-MS third-party drivers), should give you a strong clue as to where your CPU time is being spent.
VmWare support should be able to address that question. The cause is probably somewhere in the VmWare implementation.
You can use, for example, IrpTracker, which gives you an idea of what is going on in the kernel.
Another option is using a kernel debugger, i.e. WinDbg. If the CPU load is very high, just randomly breaking into the debugger and looking at the call stack can give you an idea of which driver is behind the CPU load. But as I stated, my guess is that it will be some VmWare component. It is worth checking whether the problem persists on the same computer running WinXP without emulation.

How to insert a LC_LOAD_DYLIB command into a Mach-O binary (OSX)

I'm looking to patch a piece of abandonware with some code.
The software is Carbon-based, so I cannot use an InputManager (at least, I do not think I can). My idea was to add a dylib reference to the Mach-O header, and launch a new thread when the initialization routine is called.
I have mucked around with the Mach-O header using a hex editor to add the appropriate load command (LC_LOAD_DYLIB).
otool reports what I expect to see, so I'm fairly confident that the file is correctly formatted.
Load command 63
cmd LC_LOAD_DYLIB
cmdsize 60
name #executable_path/libAltInput.dylib (offset 24)
time stamp 1183743291 Fri Jul 6 19:34:51 2007
current version 0.0.0
compatibility version 0.0.0
However, launching the binary gives me the following error
dyld: bad external relocation length
All I can guess is that this means I need to modify the LC_SYMTAB or LC_DYSYMTAB load commands...
Anyone have any ideas?
I'm not entirely sure what you're trying to accomplish, but the easiest way to do this is probably to inject a thread into the mach task after it starts. A great source of information on doing this (as well as running code to do it) can be found here: http://rentzsch.com/mach_inject/.
Some caveats that you should be aware of:
The Mach task_for_pid() call necessary to get the Mach port to the task is now privileged and requires authorization to call. The reason for this is pretty self-evident, but if you were planning on releasing something with injected code, you should be aware of this.
Your code will be running in the same process space as the original application but on a separate thread. You will, therefore, have full access to the application, however, if it is not thread-aware be very careful about using and manipulating data from outside of your injected code. Obviously all multithreaded issues will be amplified here because the original code was never aware of your additions.
The easiest solution that doesn't involve patching the binary is to simply use the DYLD_INSERT_LIBRARIES environment variable and then run your application.
set DYLD_INSERT_LIBRARIES to /my/path/libAltInput.dylib
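If the application is started from a launcher script, the variable can be set for just that one child process instead of the whole session. A minimal sketch (the function name env_with_inserted_dylib is made up); it relies on dyld treating DYLD_INSERT_LIBRARIES as a colon-separated list, so a value that is already set is preserved:

```python
import os


def env_with_inserted_dylib(dylib_path, base_env=None) -> dict:
    """Return a copy of the environment with dylib_path added to
    DYLD_INSERT_LIBRARIES, keeping any libraries already listed there."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("DYLD_INSERT_LIBRARIES")
    # dyld reads this variable as a colon-separated list of dylibs.
    env["DYLD_INSERT_LIBRARIES"] = (
        f"{existing}:{dylib_path}" if existing else dylib_path
    )
    return env


# Hypothetical usage; the application path is a placeholder:
#   import subprocess
#   env = env_with_inserted_dylib("/my/path/libAltInput.dylib")
#   subprocess.run(["/path/to/TheApp"], env=env)
```

Only the launched process and its children see the variable, which keeps the injected library out of unrelated programs.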
I assume the dynamic linker reported an error because many fields in the Mach-O file format contain addresses specified as offsets from the beginning of the file, so inserting another load command that shifts the rest of the file would invalidate every such address. For example, see the symoff and stroff entries in the Mac OS X ABI Mach-O File Format Reference.
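To see those absolute offsets for a given file, the LC_SYMTAB command can be read directly. This is only a sketch under stated assumptions: it handles a thin little-endian 64-bit Mach-O, whereas a Carbon binary of that era is more likely 32-bit or fat, which would need the 28-byte mach_header and possibly a different byte order. The function name symtab_offsets is made up.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # little-endian 64-bit Mach-O
LC_SYMTAB = 0x2


def symtab_offsets(data: bytes):
    """Return symoff, stroff and the end of the load commands, or None.

    symoff and stroff are file offsets counted from the start of the file,
    so they go stale if bytes are inserted anywhere before them.
    """
    magic, _cpu, _sub, _ftype, ncmds, sizeofcmds, _flags, _rsv = struct.unpack_from(
        "<8I", data, 0
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")

    offset = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_SYMTAB:
            # symtab_command: cmd, cmdsize, symoff, nsyms, stroff, strsize
            symoff, _nsyms, stroff, _strsize = struct.unpack_from(
                "<4I", data, offset + 8
            )
            return {
                "symoff": symoff,
                "stroff": stroff,
                "load_commands_end": 32 + sizeofcmds,
            }
        offset += cmdsize
    return None
```

Comparing load_commands_end with the file offset of the first section shows how much padding is available for an extra load command; a command written into that padding does not move anything, whereas inserting bytes shifts everything that symoff and stroff point to.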

Resources