How to use atos to properly symbolicate addresses from OSReportWithBacktrace? - macos

I am trying to hunt down retain leaks in an open-source project to support I2C based trackpads (https://github.com/kprinssu/VoodooI2CHID).
The reason why I believe that there are retain leaks is because when I attempt to unload the kernel extension via the following commands:
sudo kextunload -verbose 6 VoodooI2CHID.kext
I get the following output:
Kext user-space log filter changed from 0xff2 to 0xfff.
Kext kernel-space log filter changed from 0xff2 to 0xfff.
Kext library architecture set to x86_64.
Requesting unload of com.alexandred.VoodooI2CHID (with termnation of IOServices).
(kernel) User-space log flags changed from 0x0 to 0xfff.
(kernel) Received 'Unload' request from user space.
(kernel) Rescheduling scan for unused kexts in 60 seconds.
(kernel) Can't unload kext com.alexandred.VoodooI2CHID; classes have instances:
(kernel) Kext com.alexandred.VoodooI2CHID class VoodooI2CPrecisionTouchpadHIDEventDriver has 1 instance.
(kernel) Kext com.alexandred.VoodooI2CHID class VoodooI2CMultitouchHIDEventDriver has 1 instance.
Kernel error handling kext request - (libkern/kext) kext is in use or retained (cannot unload).
Failed to unload com.alexandred.VoodooI2CHID - (libkern/kext) kext is in use or retained (cannot unload).
I came across pmdj's excellent answer on tracking down retain leaks (Can't Unload Kernel Extension; Classes Have Instances). I verified that my situation is the second case via ioreg (classes are being terminated but are not properly freed). Additionally, I used pmdj's hint by overriding taggedRelease and taggedRetain (https://stackoverflow.com/a/13471512/48660) to print the stack trace of the function calls.
Here's where I run into problems: I cannot use atos to convert the hex addresses back into human-readable symbols. I use the following command to generate the symbols:
atos -arch x86_64 -o VoodooI2C.kext/Contents/MacOS/VoodooI2C -l 0xffffff7f8432b000 0xffffff804588dfa0
The load address parameter is retrieved from kextstat and I expect the -l argument should handle the slide arithmetic.
atos should return a valid symbol but all I get is the hex address back. In the above example, I get 0xffffff804588dfa0 as the output. Can anybody point out what exactly I am missing?

Both kextstat and OSReportWithBacktrace report unslid addresses, so KASLR is not your problem.
Notice that your kext is apparently loaded at 0xffffff7f8432b000, whereas your backtrace frame address is 0xffffff804588dfa0. This is quite far apart, and indeed kexts are always loaded in the 0xffffff7f8??????? (unslid) range, so 0xffffff804588dfa0 can't be anywhere near kext code. (the offset is about 3GB) It's almost certainly a function in the kernel proper. If you use atos with the appropriate running kernel's binary, it should be able to locate which one. For example:
atos -o /Library/Developer/KDKs/KDK_10.14.5_18F132.kdk/System/Library/Kernels/kernel 0xffffff804588dfa0
(I don't know what kernel version you are using, and this address doesn't seem to be meaningful in the 18F132 kernel, but you get the idea.)

Related

How to access unexported symbol from Kext?

I tried to load a kext module on an M1 machine running 11.4 Beta (20F5046g) Big Sur and encountered some error messages on binding at kext module loading.
Accessing kernel symbol exported from Apple kext modules
First, to access the kernel functions exported from apple's kext module, com.apple.kpi.unsupported, I used the below extern declaration.
extern int cpu_number(void);
Also, I added com.apple.kpi.unsupported to the Info.plist:
<key>OSBundleLibraries</key>
<dict>
    <key>com.apple.kpi.libkern</key>
    <string>20.5</string>
    <key>com.apple.kpi.unsupported</key>
    <string>20.5.0</string>
</dict>
The compilation doesn't raise any errors, but when I try to load the module, it prints below message.
Error Domain=KMErrorDomain Code=31 "Error occurred while building a collection:
1: One or more binaries has an error which prevented linking. See other errors.
2: Could not use 'kext' because: Failed to bind '_cpu_number' in 'kext' (at offset 0x0 in __DATA_CONST, __got) as could not find a kext which exports this symbol
kext specific:
1: Failed to bind '_cpu_number' in 'kext' (at offset 0x0 in __DATA_CONST, __got) as could not find a kext which exports this symbol
" UserInfo={NSLocalizedDescription=Error occurred while building a collection:
1: One or more binaries has an error which prevented linking. See other errors.
2: Could not use 'kext' because: Failed to bind '_cpu_number' in 'kext' (at offset 0x0 in __DATA_CONST, __got) as could not find a kext which exports this symbol
kext specific:
1: Failed to bind '_cpu_number' in 'kext' (at offset 0x0 in __DATA_CONST, __got) as could not find a kext which exports this symbol
Can I access the kernel symbol specified in the kernel symbol list but not exported from apple's kext module?
I also would like to access kernel function called SecureDTInitEntryIterator. I found that this symbol is listed on the kernel symbol located in the /System/Library/Kernels/kernel. However, $kextfind -defines-symbol _SecureDTIterateEntries doesn't return any corresponding kext module names.
As an IOS newbie, I guess that this symbol is not exported from any of Apple's kext modules. Is there any way to access this function from my kext module? I think I can just type cast the address where the symbol is located within the kernel space with the function prototype, but I am looking for a systematic approach if one exists.
I have just checked, and the crucial detail appears to be that you are trying to access this function on arm64/aarch64. As it turns out, it's exported in the "unsupported" KPI for x86_64, but not on arm64:
Unsupported.x86_64.exports in xnu source
Unsupported.arm64.exports in xnu source
There's no straightforward way of accessing unexported symbols. If you know the offset of a symbol in the exact version of the running kernel, you should be able to compute the address by offsetting from a known function address; at least, this worked on x86-64. arm64 may require extra effort due to PAC (pointer authentication).
As this circumvents Apple's policies, I don't recommend using this type of technique in a shipping product.
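The answer's fallback is to take the offset between two symbols in the exact kernel binary that is running and apply it to a function whose runtime address the kext can obtain. The Python script below is an added example (not code from the question or the answer) that does the static half of that: it parses `nm` output for the matching KDK kernel, refuses symbols that are missing or defined more than once, computes the slide from one exported function's known runtime address, and applies it to the unexported target. The nm lines, the runtime address and the choice of `_printf` as the exported anchor are all constructed assumptions; on arm64 the address a kext takes from a function pointer may be PAC-signed, so strip it before using it as the anchor.

```python
"""Derive the runtime address of an unexported kernel symbol from an exported one.

Added example. Static half of the "offset from a known function" approach:
  slide   = runtime(anchor) - static(anchor)      # anchor is exported, linkable
  runtime(target) = static(target) + slide         # target is not exported
Both static addresses must come from the kernel that is actually booted.
"""
import re

# Constructed `nm kernel` output; every address here is invented.
NM_SAMPLE = """\
ffffff80001f2300 T _printf
ffffff8000210a40 T _cpu_number
ffffff80002c3d00 T _SecureDTInitEntryIterator
ffffff80002c3e80 T _SecureDTIterateEntries
ffffff8000400000 D _version
                 U _undefined_import
"""

NM_RE = re.compile(r"^\s*([0-9a-fA-F]*)\s+([A-Za-z?])\s+(\S+)$")

# Assumed: a function the kext really links against; runtime value would come
# from e.g. (uintptr_t)&printf inside the kext (PAC-stripped on arm64).
KNOWN_SYMBOL = "_printf"


def parse_nm(text):
    """{symbol: (static_address, type)}; raises on duplicate definitions."""
    syms = {}
    for line in text.splitlines():
        m = NM_RE.match(line)
        if not m:
            continue
        addr, kind, name = m.group(1), m.group(2), m.group(3)
        if kind in "Uu" or not addr:          # undefined reference, no address
            continue
        if name in syms:
            raise ValueError(f"{name} defined more than once; wrong or fat binary?")
        syms[name] = (int(addr, 16), kind)
    return syms


def kernel_slide(syms, known_symbol, known_runtime_addr):
    """Slide between the on-disk kernel and the booted one, from one known symbol."""
    if known_symbol not in syms:
        raise KeyError(f"{known_symbol} not in this kernel's symbol table")
    static, kind = syms[known_symbol]
    if kind not in "Tt":
        raise ValueError(f"{known_symbol} is not a text symbol")
    return known_runtime_addr - static


def resolve(syms, target_symbol, slide):
    """Runtime address of target_symbol once the slide is known."""
    if target_symbol not in syms:
        raise KeyError(f"{target_symbol} not in this kernel; check the kernel version")
    static, kind = syms[target_symbol]
    if kind not in "Tt":
        raise ValueError(f"{target_symbol} is not a text symbol; refusing to call it")
    return static + slide


if __name__ == "__main__":
    table = parse_nm(NM_SAMPLE)
    # Constructed runtime address of _printf, i.e. a slide of 0x10000000.
    slide = kernel_slide(table, KNOWN_SYMBOL, 0xFFFFFF80101F2300)
    target = resolve(table, "_SecureDTInitEntryIterator", slide)
    print(f"slide={slide:#x} _SecureDTInitEntryIterator={target:#x}")
```

The kext side is then a cast of the computed integer to the function prototype, which is exactly the unsupported, version-locked technique the answer warns against shipping.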

Is there a way to log a stack trace with symbols from a kext on osx?

I would like to use it to debug kernel drivers but I would try to avoid to add logging to all functions. OSReportWithBacktrace seems to work but I need symbols.
I'm not aware of a way to print symbolicated stack traces directly from a kext. You can get symbolicated panic logs by adding keepsyms=1 to the boot-args nvram variable. I suspect the data structures for this have private linkage so you probably can't replicate the symbolicated panic code in your own kext. (It's in osfmk/i386/AT386/model_dep.c of the xnu source though if you want to try.)
Your other option is to send the output from OSReportWithBacktrace through the atos command-line tool. For kext symbols, you'll need to find the kext's load address from kextstat and pass that to the -l command line argument.
Finally, you can of course use lldb kernel debugging to get a stack trace. If you need to set a breakpoint during early kext load, before you get a chance to do it from the lldb command line, you can insert __asm__("int $3") (IIRC) at the point in the code where you want to break into the debugger.
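The first answer on this page points out that atos can only symbolicate an address against the binary that actually contains it (a frame outside every kext's range belongs to the kernel proper), and this answer adds that kext frames need the kextstat load address via `-l`. The Python script below is an added example (not code from either answer or from the VoodooI2C project) that automates that routing: it parses `kextstat` output, checks which loaded kext's [Address, Address+Size) range each backtrace frame falls in, and builds the atos argv for each frame, using the kext binary with `-l` for kext frames and the KDK kernel without `-l` for the rest. The kextstat lines, the backtrace and the kext binary and KDK paths are constructed assumptions; replace them with your own output and paths.

```python
"""Route OSReportWithBacktrace frames to the right atos invocation.

Added example. A frame is symbolicated against the kext whose load range
contains it (with -l <load address>); anything else is treated as kernel
text and symbolicated against the KDK kernel for the running build.
"""
import re

# Constructed kextstat output (Index Refs Address Size Wired Name ...).
KEXTSTAT_SAMPLE = """\
Index Refs Address            Size       Wired      Name (Version) UUID <Linked Against>
  159    1 0xffffff7f84320000 0xb000     0xb000     com.alexandred.VoodooI2C (1.0) <15 6 5 3>
  160    0 0xffffff7f8432b000 0x7000     0x7000     com.alexandred.VoodooI2CHID (1.0) <159 15 6 5 3>
"""

# Constructed backtrace: one frame in VoodooI2CHID, one in VoodooI2C, one in the kernel.
BACKTRACE_SAMPLE = "Backtrace 0xffffff7f8432c1a4 0xffffff7f843221b0 0xffffff804588dfa0\n"

# Assumed paths; point these at your built kexts and the KDK matching `uname -v`.
KEXT_BINARIES = {
    "com.alexandred.VoodooI2CHID": "VoodooI2CHID.kext/Contents/MacOS/VoodooI2CHID",
    "com.alexandred.VoodooI2C": "VoodooI2C.kext/Contents/MacOS/VoodooI2C",
}
KERNEL_PATH = "/Library/Developer/KDKs/KDK_10.14.5_18F132.kdk/System/Library/Kernels/kernel"

KEXTSTAT_RE = re.compile(
    r"^\s*\d+\s+\d+\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+0x[0-9a-fA-F]+\s+(\S+)")
ADDR_RE = re.compile(r"0x[0-9a-fA-F]{8,16}")


def parse_kextstat(text):
    """Return {bundle_id: (load_address, size)} from kextstat output."""
    kexts = {}
    for line in text.splitlines():
        m = KEXTSTAT_RE.match(line)
        if m:
            kexts[m.group(3)] = (int(m.group(1), 16), int(m.group(2), 16))
    return kexts


def find_kext(addr, kexts):
    """Bundle id of the kext whose [load, load+size) range contains addr, else None."""
    for bundle, (load, size) in kexts.items():
        if load <= addr < load + size:
            return bundle
    return None


def atos_command(addr, kexts, kext_binaries=KEXT_BINARIES, kernel_path=KERNEL_PATH):
    """argv for the atos call that can symbolicate addr."""
    bundle = find_kext(addr, kexts)
    if bundle is None:
        # Not inside any loaded kext: kernel proper, so no -l (kextstat addresses
        # and the KDK kernel agree per the accepted answer).
        return ["atos", "-arch", "x86_64", "-o", kernel_path, hex(addr)]
    load, _ = kexts[bundle]
    binary = kext_binaries.get(bundle, f"<path to {bundle} executable>")
    return ["atos", "-arch", "x86_64", "-o", binary, "-l", hex(load), hex(addr)]


def symbolication_plan(backtrace_text, kextstat_text):
    """[(address, owner, argv)] for every 0x... token in the pasted backtrace."""
    kexts = parse_kextstat(kextstat_text)
    plan = []
    for token in ADDR_RE.findall(backtrace_text):
        addr = int(token, 16)
        plan.append((addr, find_kext(addr, kexts) or "kernel", atos_command(addr, kexts)))
    return plan


if __name__ == "__main__":
    for addr, owner, argv in symbolication_plan(BACKTRACE_SAMPLE, KEXTSTAT_SAMPLE):
        print(f"{addr:#018x}  {owner:28s}  {' '.join(argv)}")
```

Run it with real `kextstat` output captured while the kext is still loaded; once the kext has been unloaded and reloaded its load address changes and the old backtrace can no longer be symbolicated against the new listing.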

Is it possible to use gdb and qemu to debug linux user space programs and kernel space simultaneously?

So far, with gdb + qemu, I can step into/over linux kernel source code. Is it possible to debug the user space programs simultaneously? For example, single step a program from user space to kernel space so I can observe the changes of registers on the qemu monitor by issuing info registers?
Minimal step-by-step setup
Mahouk is right, but here is a fully automated QEMU + Buildroot example which presupposes that you already know how to debug the kernel with QEMU + gdb, and a more detailed explanation:
readelf -h myexecutable | grep Entry
Gives:
Entry point address: 0x4003a0
So inside GDB we need to do:
add-symbol-file myexecutable 0x4003a0
b main
And only then start the executable in QEMU:
myexecutable
A more reliable way to do that is to set myexecutable as the init process if you can do that.
add-symbol-file is also mentioned at: How to load multiple symbol files in gdb
Why would you ever want to do this instead of gdbserver?
I can only see one use case for this so far: debugging init: Debug init on Qemu using gdb
Otherwise, why not just use the following more reliable method, e.g. to step into a syscall:
start two remote GDBs:
one with qemu-system-* -s
the other gdbserver myexecutable as explained at: https://reverseengineering.stackexchange.com/questions/8829/cross-debugging-for-mips-elf-with-qemu-toolchain/16214#16214
step in gdbserver's GDB as close as possible to the system call, which often mean stepping into the libc
on the QEMU's GDB, do e.g. b sys_read for the read syscall
back on gdbserver, do continue
I propose this because:
using the QEMU GDB for userland can lead to random jumps as the kernel context switches to another process that uses the same virtual addresses
I was not able to load shared libraries properly without gdbserver: attempting sharedlibrary directly gives:
(gdb) sharedlibrary ../../staging/lib/libc.so.0
No loaded shared libraries match the pattern `../../staging/lib/libc.so.0'.
As a consequence, since most kernel interactions go through the stdlib, you would need to do a lot of smart assembly stepping to find the kernel entry, which could be impractical.
Until, that is, someone writes a smarter GDB script that steps every instruction until a context switch happens or until source becomes available. I wonder if such scripts wouldn't be too slow, as the naive approach has the overhead of communication to and from GDB for every instruction.
This might get you started: Tell gdb to skip standard files
Parsing Linux kernel data structures
To do userland process debug properly, that's what we would have to do eventually: thread-aware gdb for the Linux kernel
I achieve it by using the gdb command add-symbol-file to add userspace programs' debugging information. But you must know these programs' loading addresses. So, to be precise, you have to launch the kernel debugging by connecting gdb to gdbserver as usual; and then, you can add those programs' debugging information. You can also use a .gdbinit script, though. Read this
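Both answers above hinge on knowing the address to hand to `add-symbol-file` for the userspace binary; the first reads the entry point off `readelf -h`. The Python script below is an added example (not code from either answer) that reads the ELF64 header and section headers itself, reports both the entry point and the `.text` address, and prints the gdb commands. It uses the `.text` address, because gdb documents the positional argument of `add-symbol-file` as the address of the file's text section; for a non-PIE static binary like the one in the answer that is what puts the symbols in the right place. `build_sample_elf()` constructs a minimal ELF64 image in memory so the example runs without a real binary; its addresses are made up, and the sample file name `myexecutable` is the answer's placeholder.

```python
"""Print gdb add-symbol-file commands from an ELF64 binary's headers.

Added example. Parses the ELF64 file header and section header table
(little-endian only), then emits the gdb lines the answers type by hand.
"""
import struct
import sys

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # ELF64 file header, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")         # ELF64 section header, 64 bytes


def parse_elf64(data):
    """Return (e_entry, {section_name: sh_addr}) for a little-endian ELF64 image."""
    (ident, _, _, _, e_entry, _, e_shoff, _, _, _, _,
     e_shentsize, e_shnum, e_shstrndx) = EHDR.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("only little-endian ELF64 is handled here")
    if e_shnum == 0:
        raise ValueError("no section headers (stripped with sstrip?)")
    shdrs = [SHDR.unpack_from(data, e_shoff + i * e_shentsize) for i in range(e_shnum)]
    strtab = shdrs[e_shstrndx]                       # .shstrtab: sh_offset, sh_size
    names = data[strtab[4]:strtab[4] + strtab[5]]

    def name_at(off):
        return names[off:names.index(b"\0", off)].decode()

    return e_entry, {name_at(sh[0]): sh[3] for sh in shdrs}   # sh[3] is sh_addr


def gdb_commands(path, e_entry, sections, breakpoint="main"):
    """gdb lines to load the userspace symbols into the QEMU kernel session."""
    text = sections.get(".text")
    if text is None:
        raise ValueError("no .text section; cannot place symbols")
    return [
        f"# entry point {e_entry:#x}, .text at {text:#x}",
        f"add-symbol-file {path} {text:#x}",
        f"b {breakpoint}",
    ]


def build_sample_elf(entry=0x4003A0, text_addr=0x400360):
    """Constructed minimal ELF64 (x86-64) with sections [null, .text, .shstrtab]."""
    shstrtab = b"\0.text\0.shstrtab\0"            # .text at name offset 1, .shstrtab at 7
    text = b"\xc3" * 16                            # 16 ret instructions as filler
    text_off = EHDR.size
    str_off = text_off + len(text)
    sh_off = str_off + len(shstrtab)
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\0" * 8   # ELF64, LE, version 1, SysV ABI
    hdr = EHDR.pack(ident, 2, 0x3E, 1, entry, 0, sh_off, 0,
                    EHDR.size, 0, 0, SHDR.size, 3, 2)       # ET_EXEC, EM_X86_64, shstrndx=2
    null_sh = SHDR.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    text_sh = SHDR.pack(1, 1, 6, text_addr, text_off, len(text), 0, 0, 16, 0)  # PROGBITS, AX
    str_sh = SHDR.pack(7, 3, 0, 0, str_off, len(shstrtab), 0, 0, 1, 0)          # STRTAB
    return hdr + text + shstrtab + null_sh + text_sh + str_sh


if __name__ == "__main__":
    if len(sys.argv) > 1:
        path = sys.argv[1]
        with open(path, "rb") as f:
            image = f.read()
    else:
        path, image = "myexecutable", build_sample_elf()
    entry, sections = parse_elf64(image)
    print("\n".join(gdb_commands(path, entry, sections)))
```

Pipe the output into `gdb -x` or paste it into the kernel session before starting the program in the guest, as the answer describes. For a PIE binary the addresses printed are link-time values and you would need the load base from the guest (e.g. /proc/PID/maps) and `add-symbol-file -o <base>` instead.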

Can't Unload Kernel Extension; Classes Have Instances

I'm writing an OSX kernel extension for an audio device driver (it's software, but emulates a hardware device).
During development, it'd be convenient to completely uninstall existing old versions and then build and install the new version from scratch. However, this occasionally seems to not be possible without a system restart.
The program itself is not running and the source files have been deleted from the /System/Library/Extensions/ dir.
But kextstat reveals a single instance:
$ kextstat | grep 'com.foo.driver.bar'
219 0 0xfff123 0x5000 0x5000 com.foo.driver.bar (0.0.1) <102 5 4 3>
(...meaning:)
Index Refs Address Size Wired Name (Version) <Linked Against>
So there are 0 Refs to my driver instance, but kextunload will sometimes fail, complaining of existing instances:
$ sudo kextunload -b com.foo.driver.bar
(kernel) Can't unload kext com.foo.driver.bar; classes have instances:
(kernel) Kext com.foo.driver.bar class FooBarDriver has 1 instance.
(kernel) Kext com.foo.driver.bar class com_foo_driver_bar has 1 instance.
Failed to unload com.foo.driver.bar - (libkern/kext) kext is in use or retained (cannot unload).
When this happens, there's no way to "force" unload the kext (that I know of).
Am I right in guessing that this single instance still exists because of a reference held in memory by the running OS kernel? That doesn't seem right, because then kextunload would always fail. So why does kextunload only sometimes require a system restart to "fully" unload all driver instances?
Running kextunload for an IOKit kext will (if no other kexts depend on it) cause the kernel to attempt to terminate() any instances of classes in that kext which are in the I/O Kit registry. It will then wait a bit and check if any of that kext's classes still have instances. If not, it will unload the kext. If instances remain, kextunload fails (the terminated instances stay terminated, though; by this I mean that I/O kit matching is not re-run on their providers).
So somehow, you're still ending up with live instances.
One possibility is that your objects are refusing to terminate(). This can happen if they have clients that won't give up control, e.g. you can't unload the driver for a disk with a mounted file system on top. Userspace clients that don't respond to termination messages are another example.
Otherwise, the instances terminate, but are not freed. Since they seem to be of two of your main driver classes, if you don't have any user clients that won't give up their claim, I'm going to go out on a limb and suggest that you might have a circular reference. If that's not it, you'll just have to hunt for retain()s which are not matched by a release(). I give some tips on how to track these down in this answer.
If the instances terminate and are deregistered, they will no longer appear in the output of the ioreg commandline tool, so that's an easy way of checking which of the two cases applies here.

Backtrace while Kernel Panic

Is it possible to get backtrace of kext without attaching with gdb as described
at
http://developer.apple.com/library/mac/#documentation/Darwin/Conceptual/KEXTConcept/KEXTConceptDebugger/debug_tutorial.html
if I have the panic log?
Somehow like this:
Get the address of kext caused panic from panic log
Generate dSYM file with kextutil
Paste the method's names from dSYM file into panic log to get backtrace?
Apple's tech note tn2063 describes analysing panics in detail. http://developer.apple.com/library/mac/ipad/#technotes/tn2063/_index.html
In addition, tn2118 describes analyzing kernel core dumps:
http://developer.apple.com/library/mac/#technotes/tn2004/tn2118.html
You can get the kernel to dump on panic, then take that core dump and analyze it against the symbolicated kernel. You add your own kext's symbols to the kernel's with gdb's add-symbol-file command.
