Do I need two machines to develop IOKit Mac drivers? - macos

I'm building an IOKit CFPlugin driver for OS X. I'll be working with network data coming in that will be translated to MIDI data. No hardware is involved other than the built-in Airport. I have experience with drivers on Windows machines and firmware but this is my first dip into doing it on the Mac. So far things are going pretty well, but the Apple documentation sez: "For safety reasons, you should not load your driver on your development machine."
I only have one Mac. I really don't want two Macs- sorry, Apple. Should I take this warning seriously? Are there things I need to know?
Thanks, Tom Jeffries

You could also consider running OS X inside a VM as your testbed. It would surely be much more convenient that having a separate boot volume.

The warning is rather poorly worded; what you should consider doing is using a separate boot volume (partition) for trying out your driver, since it's possible to arbitrarily hose your system with your driver.
If you're doing kernel development on any OS that isn't isolated from your main system (via a VM, alternate boot disk, etc.), you're crazy!
What may be a bigger issue is that you can't do any kernel debugging, because the only option for that is to use GDB on a remote OS X system. For this, you may want to consider running OS X in virtualization.

you DEFINITELY want to have some way to recover a fubar kext installation: a bootable external drive or something you can quickly restore from-- this is the main reason for Apple's warning against running in-development-kernel-extensions on your production machine.
Nicholas is right that in order to debug using gdb (the only way in kernel space) you do need two machines. I've never tried using a VM as Coxy suggests: but I guess it's feasible (assuming that you run your kext on the virtual machine and use the real host machine to run gdb).
My preferred method for tracing and debugging in the kernel is kprintf() routed to firewire (aka firewire kprintf (man fwkpfv) ). for this you do need two machines with firewire ports.
finally, being an old computer musician myself, I wonder why you want to program a MIDI synthesizer (or transformer) on the network stack level. my guess is that you would have a much more gratifying experience working in userland (where you can use floating point math...)
if you need some hints or tips, feel free to get in touch...
|K<
from the ADC Kernel Programming Guide
Kernel programming is a black art that
should be avoided if at all possible.
Fortunately, kernel programming is
usually unnecessary. You can write
most software entirely in user space.
Even most device drivers (FireWire and
USB, for example) can be written as
applications, rather than as kernel
code. A few low-level drivers must be
resident in the kernel's address
space, however, and this document
might be marginally useful if you are
writing drivers that fall into this
category.

Related

How does the ARM Linux kernel map console output to a hardware device on boot?

I'm currently struggling to determine how I can get an emulated environment via QEMU to correctly display output on the command line. I have an environment that displays perfectly well using the virt reference board, a cortex-a9CPU, and the 4.1 Linux kernel cross-compiled for ARM. However, if I swap out the 4.1 kernel for 2.6 or 3.1, suddenly I can no longer see console output.
While solving this issue is my main goal, I feel like I lack a critical understanding of how Linux and the hardware initially integrate before userspace configurations via boot scripts and whatnot have a chance to execute. I am aware of the device tree, and have a loose understanding of how it works. But the issue I ran into where a different kernel version broke console availability entirely confounds me. Can someone explain how Linux initially maps console output to a hardware device on the ARM architecture?
Thank you!
The answer depends quite a bit on which kernel version, what config options are set, what hardware, and also possibly on kernel command line arguments.
For modern kernels, the answer is that it looks in the device tree blob it is passed for descriptions of devices, some of which will be serial ports, and it initializes those. The kernel config or command line will specify which of those is to be used for the console. For earlier kernels, especially if you go all the way back to 2.6, use of device tree was less universal, and for some hardware the boot loader simply said "this is a versatile express board" (for instance) and the kernel had compiled-in data structures to tell it where the devices were for each board that it supported. As the transition to device tree progressed, boards were converted one by one, and sometimes a few devices at a time, so what exactly the situation was for any specific kernel version depends on which board you're using.
The other thing that I rather suspect you're running into is that if the kernel crashes early in bootup (ie before it finds the serial port at all) then it will never output anything. So if the kernel is just too early to support the "virt" board properly at all, or if your kernel config is missing something important, then the chances are good that it crashes in early boot without being able to print you a useful message. (Sometimes "earlycon" or "earlyprintk" kernel arguments can assist here, but not always.)

Immidiate and latent effects of modifying my own kernel binary at runtime?

I'm more of a web-developer and database guy, but severely inconvenient performance issues relating to kernel_task and temperature on my personal machine have made me interested in digging into the details of my Mac OS (I notices some processes would trigger long-lasting spikes in kernel-task, despite consistently low CPU temperature and newly re-imaged machine).
I am a root user on my own OSX machine. I can read /System/Library/Kernels/kernel. My understanding is this is "Mach/XNU" Kernel of this machine (although I don't know a lot about those, but I'm surprised that it's only 13Mb).
What happens if I modify or delete /System/Library/Kernels/kernel?
I imagine since it's at run-time, things might be okay until I try to reboot. If this is the case, would carefully modifying this file change the behavior of my OS, only effective on reboot, presuming it didn't cause a kernel panic? (is kernel-panic only a linux thing?)
What happens if I modify or delete /System/Library/Kernels/kernel?
First off, you'll need to disable SIP (system integrity protection) in order to be able to modify or edit this file, as it's protected even from the root user by default for security reasons.
If you delete it, your system will no longer boot. If you replace it with a different xnu kernel, that kernel will in theory boot next time, assuming it's sufficiently matched to both the installed device drivers and other kexts, and the OS userland.
Note that you don't need to delete/replace the kernel file to boot a different one, you can have more than one installed at a time. For details, see the documentation that comes with Apple's Kernel Debug Kits (KDKs) which you can download from the Apple Developer Downloads Area.
I imagine since it's at run-time, things might be okay until I try to reboot.
Yes, the kernel is loaded into memory by the bootloader early on during the boot process; the file isn't used past that, except for producing prelinked kernels when your device drivers change.
Finally, I feel like I should explain a little about what you actually seem to be trying to diagnose/fix:
but severely inconvenient performance issues relating to kernel_task and temperature on my personal machine have made me interested in digging into the details of my Mac OS
kernel_task runs more code than just the kernel core itself. Specifically, any kexts that are loaded (see kextstat command) - and there are a lot of those on a modern macOS system - are loaded into kernel space, meaning they are counted under kernel_task.
Long-running spikes of kernel CPU usage sound like they might be caused by file system self-maintenance, or volume encryption/decryption activity. They are almost certainly not basic programming errors in the xnu kernel itself. (Although I suppose stupid mistakes are easy to make.)
Another possible culprits are device drivers; especially GPU drivers are incredibly complex pieces of software, and of course are busy even if your system is seemingly idle.
The first step to dealing with this problem - if there indeed is one - would be to find out what the kernel is actually doing with those CPU cycles. So for that you'd want to do some profiling and/or tracing. Doing this on the running kernel most likely again requires SIP to be disabled. The Instruments.app that ships with Xcode is able to profile processes; I'm not sure if it's still possible to profile kernel_task with it, I think it at least used to be possible in earlier versions. Another possible option is DTrace. (there are entire books written on this topic)

Fuzzing virtual drivers tools

I'm looking to fuzz virtual drivers, I've read the other questions about this but they don't really go anywhere. Basically looking to see if there's an obvious tool I've missed and want to know if fuzzing IOCTLs from a windows guest would work? Or if I need to write one in low level eg IN/OUT?
Any tools out there for fuzzing drivers in a windows guest to hit the hypervisor either hyper-v or VMware
There are a number of ways to exercise virtualization code.
First, of course, if you're on Windows, is the IOCTL interface.
Then you should remember that all virtual devices are emulated in some way by some code in the guest OS and in the host OS. So, accessing input devices (keyboard and mouse), video device, storage (disks), network card, communication ports (serial, parallel), standard PC devices (PIC, PIT, RTC, DMA), CPU APIC, etc etc will also exercise virtualization code.
It's also very important to remember that virtualization of the various PC devices (unless we're talking about synthetic devices working over the VMBUS in Windows) is done by intercepting, parsing and emulating/executing instructions that access device memory-mapped buffers and registers and I/O ports. This gives you yet another "interface" to pound on.
By using it you might uncover not only device-related bugs but also instruction-related bugs. If you're interested in the latter, you need to have a good understanding of how the x86 CPU works at the instruction level in various modes (real, virtual 8086, protected, 64-bit), how it handles interrupts and exceptions and you'll also need to know how to access those PC devices (how and at what memory addresses and I/O port numbers).
Btw, Windows won't let you directly access these things unless your code is running in the kernel. You may want to have a non-Windows guest VM for things like this just to avoid overprotective functionality of Windows. Look for edge cases, unusual instruction encodings (including invalid encodings) or unusual instructions for usual tasks (e.g. using FPU/MMX/SSE/etc or special protected-mode instructions (like SIDT) to access devices). Think and be naughty.
Another thing to consider is race conditions and computational or I/O load. You may have some luck exploring in that direction too.

Can anyone please tell me how is Kernel Programming done in Linux, as Windows DDK in Windows

I am aware of windows kernel but new to linux kernel. I just need to know how its done in linux, i.e. the program development.
You can check there (free-electrons.com), it's a good informations source for kernel developement. (specialized in embedded linux, but most of the docs are available for standard development)
You have also the classical Linux Devices Drivers, which is very complete and detailled.
And last but not least, the Linux kernel documentation.
Linux does not have a stable kernel API. This is by design, so you should generally avoid writing kernel code if you can; it is unlikely to remain source-compatible indefinitely, and will definitely NOT be binary-compatible, even between minor releases.
This is less-or-more true for vendor kernels; Redhat etc DO maintain source & binary kernel compatibility between major revisions.
More work is gradually being done in the kernel to reduce the amount of kernel-code required to carry out various tasks, such as driver development (for example, USB drivers can typically be done in userspace with libusb), filesystem development (FUSE) and network filtering (NFQUEUE). However, there are still some cases where you need to; in particular, block devices still need to be in the kernel to be able to be usefully used for boot devices and swap.

kvm vs. vmware for kernel debugging / USB driver development

I'm currently setting up vmware Server 2.0 for kernel debugging with gdb ( see this setup guide ) and someone asked me why not use kvm?
So I ask: kvm vs. vmware for kernel debugging / USB driver development
what are the pros and cons of each?
Driver development? are you working on a driver for a particular piece of hardware? if so, then you probably won't be able to use virtualization, because the virtualized instance won't have access to the new hardware.
For this you will need two machines, one running a remote debugger on the other.
*Edit: * Apparently you're developing a driver for a USB Device? this is one area in particular that a VM actually Can help. These days most VM's have the ability to delegate specific USB devices to a guest OS.
That said, this situation doesn't really offer any benefits over the remote debugger option, because you still need a way to inspect the state of the running or crashed OS, and VM's offer very little assistance in this regard. You might be able to replay saved states from just before a crash.
You might be able to get a bit of traction using UML, which would allow you to do local debugging as on a regular user process, which is a little bit less trouble.
Instead of answering the direct question I'll add another option... Depending on if the kernel in question is a Linux kernel, and what part(s) of it you are working on, you might find that UserModeLinux (included in the 2.6.x source, and available as patch sets for 2.4 and 2.2) may trump both of those options.
As it runs the kernel as a userland process under the host kernel it is easier to attach common debugging tools to. I believe it is very commonly used in the early stages of updates/additions to file-system related code. If you are developing/debugging modules that interact directly with hardware it may be much less use to you though.
Reference links: home,
other
I recently started building GNU Mach/HURD and found the combination of QEmu/KVM to work really quite well.. for the following reasons:
QEmu presents quite a clean environment
Networking has alot of options
I can easily mount the filesystem using a raw device file / loopback
Bottom line is, for kernel work I just want the minimum of functionality to boot and see the result. VMWare is much more for usable virtualization rather than down-and-dirty.
There is however no comparison to booting on a real machine with real hardware. The VM environment can seem like a safety blanket somtimes ... because even my toaster would know what a Realtek RTL8139C was.
If it is a "real hardware" device, of course, vmware will not emulate it, so you won't be able to debug the driver under it (nor will any other virtualisation software, unless you extend one to do so).
Device driver debugging can be done to some extent with a real hardware machine with a normal kernel - although there are obviously things you can't do - like set breakpoints.
It is still possible to attach a debugger to the kernel and inspect stuff. Moreover, traditional printf() debugging is quite possible (printk, anyone), and there are various features in the kernel which make debugging easier. It's possible to build the kernel with various debug options to try to detect pointer problems, memory leaks etc.
By default, the kernel even gives a nice-ish stack trace on the log when it encounters an OOPS or BUG condition (obviously this does not necessarily get written anywhere if the system hangs or crashes). Of course a pointer-out-of-range condition happening inside an interrupt is a recipe for disaster, but you could still get a stack trace on the screen immediately before the panic :)

Resources