How to debug a PCI device and Linux driver

I am programming a PCI device in Verilog and also writing its driver.
I have probably introduced a bug in the hardware design: when I load the driver with insmod, the kernel just gets stuck and doesn't respond. Now I'm trying to figure out which driver code line is the last one to run before my computer gets stuck. I have inserted printk calls in all relevant functions, like probe and init, but none of them get printed.
What other code runs when I use insmod, before it gets to my init function? (I guess the kernel gets stuck there.)

printks are often not useful for debugging such a problem. They are buffered enough that you won't see them in time if the system hangs shortly after printk is called.
It is far more productive to selectively comment out sections of your driver and, by process of elimination, determine which line is the (first) problem.
Begin by commenting out the module's entire init section, leaving only return 0;. Build it and load it. Does it hang? Reboot the system, re-enable the next few lines (class_create()?) and repeat.

From what you describe, it looks like the Linux scheduler is being deadlocked by your driver. That means that interrupts from the system timer either don't arrive or never get a chance to be handled by the kernel. There are three possible reasons:
You hang somewhere in your driver's interrupt handler (the handler starts its work but never finishes it).
Your device creates an interrupt storm (the device generates interrupts so frequently that the system does nothing but handle them).
You explicitly disable all interrupts in your driver but don't re-enable them.
In all other cases the system will either crash (oops or panic, with the appropriate output) or tolerate the potential misbehavior of your device.
I guess that printk won't work in a scenario as extreme as a hang in kernel mode. It is quite heavyweight, and therefore an unreliable diagnostic tool for scenarios like yours.
Tracing via debug output works only in simpler environments such as bootloaders or simpler kernels, where the system runs in a default low-end video mode and there is no need to synchronize access to video memory. In such systems, tracing by writing directly to video memory can be a great tool, and often the only one available for debugging. Linux is not such a case.
Techniques I can recommend from the software debugging point of view:
Review your driver code, paying special attention to the interrupt handler and to the places where you disable/enable interrupts for synchronization.
Commenting out all the driver logic and gradually uncommenting it can help a lot with localizing the issue.
You can try remote kernel debugging of your driver. I advise trying a virtual machine for that purpose, but I don't know whether they allow passing a PCI device through to the virtual machine.
You can try the trick with in-memory tracing. The idea is to preallocate a memory chunk with well-known virtual and physical addresses and zero it. Then modify your driver to write trace data into this chunk using its virtual address. (For example, assign a unique integer value to each event that you want to trace, and write '1' at the corresponding index of a byte array in the preallocated chunk.) When the system hangs, you force a full memory dump and then analyze the memory layout in the dump using the physical address of the chunk with the traces. I used this technique with a VMware Workstation VM on Windows. When the system hung, I just paused the VM instance and looked at the corresponding .vmem file, which contains the raw layout of the VM instance's physical memory. I'm not sure this trick will work easily, or even at all, on Linux, but I would try it.
Finally, you can try to trace the messages on the PCI bus, but I'm not an expert in this field and not sure whether it can help in your case.
In general, kernel debugging is quite a tricky task: a lot of tricks are in use, and each of them works only for a specific set of cases. :(
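The analysis side of the in-memory tracing trick can be sketched in a few lines of Python. This is only an illustration under assumptions: the event names, the IDs, and the offset of the trace chunk inside the dump are all hypothetical, and `record` merely mimics what the driver would do in C (`trace[event_id] = 1`).

```python
# Hypothetical event IDs. The real driver would define its own set and
# write a 1 into the trace chunk at index <event_id> when the event happens.
EVENTS = {
    0: "init_enter",
    1: "pci_enable_done",
    2: "irq_registered",
    3: "irq_handler_enter",
    4: "irq_handler_exit",
}


def record(mem, base, event_id):
    """Driver-side analogue: mark an event as reached in the trace chunk."""
    mem[base + event_id] = 1


def decode_trace(dump, phys_offset, events=EVENTS):
    """Return the names of all events whose marker byte is set in a raw dump.

    dump        -- raw bytes of physical memory (for example a .vmem file)
    phys_offset -- offset of the preallocated trace chunk inside the dump
    """
    size = max(events) + 1
    chunk = dump[phys_offset:phys_offset + size]
    return [
        name
        for idx, name in sorted(events.items())
        if idx < len(chunk) and chunk[idx] == 1
    ]
```

If the decoded list ends with `irq_handler_enter` and has no `irq_handler_exit`, that points at a hang inside the interrupt handler, which is the first of the three reasons listed above.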

I would put a logic analyzer on the bus lines (on an FPGA you could use ChipScope or similar). You'll then be able to tell which access is the cause (and fix the hardware). It will be useful anyway for debugging or analyzing future issues.
Another way would be to use the kernel crash dump utility, which saved me some headaches in the past. Depending on your Linux distribution, it may need to be installed first (it is available by default in RH). See http://people.redhat.com/anderson/crash_whitepaper/

There isn't really anything that is run before your init. Bus enumeration is done at boot, if that goes by without a hitch the earliest cause for freezing should be something in your driver init AFAIK.
You should be able to see printks as they are printed; they aren't buffered and should not get lost. That applies only in situations where you can see kernel output directly, such as on the text console or over a serial line. If some other application is in the way, such as a terminal in X11 or an ssh session displaying the kernel logs, it may not get a chance to read and display the logs before the computer freezes.
If for some other reason the printks still do not work for you, you can instead have your init function return early. Just test, then move the return to later in the init until you find the point where it crashes.
It's hard to say what is causing your freezes, but interrupts are one of the things I would look at first. Make sure the device really doesn't signal interrupts until the driver enables them (that includes clearing interrupt enables on system reset), and enable them in the driver only after all handlers are registered (also, clear interrupt status before enabling interrupts).
Second thing to look at would be bus master transfers, same thing applies: Make sure the device doesn't do anything until it's asked to and let the driver make sure that no busmaster transfers are active before enabling busmastering at the device level.

The fact that the kernel gets stuck as soon as you install your driver module makes me wonder whether another driver (built into the kernel?) is already driving the device. I made this mistake once, which is why I am asking. I'd look for the string "Kernel driver in use" in the output of 'lspci -k' before installing the module. In any case, your printks should be visible in dmesg output.
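As a rough sketch of that check, the following Python function scans the text printed by `lspci -k` and reports which driver, if any, is bound to each slot. The function name and the sample slot addresses in the usage note are made up for illustration; only the "Kernel driver in use:" line format comes from lspci itself.

```python
import re

# Start of a device entry in lspci output, e.g. "03:00.0 Memory controller: ..."
# An optional PCI domain prefix ("0000:") is accepted as well.
SLOT_RE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s"
)


def drivers_in_use(lspci_text):
    """Map each PCI slot to its bound kernel driver (None if nothing is bound)."""
    drivers = {}
    slot = None
    for line in lspci_text.splitlines():
        match = SLOT_RE.match(line)
        if match:
            # A new device entry starts; assume no driver until we see one.
            slot = match.group(1)
            drivers[slot] = None
        elif slot is not None and "Kernel driver in use:" in line:
            drivers[slot] = line.split(":", 1)[1].strip()
    return drivers
```

Feed it the captured output of `lspci -k` and look up your device's slot (say, a hypothetical `03:00.0`): a value other than `None` before you run insmod means some other driver already owns the device.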

In addition to Claudio's suggestion, a couple more debug ideas:
1. Try kgdb (https://www.kernel.org/doc/htmldocs/kgdb/EnableKGDB.html)
2. Use JTAG interfaces to connect to debug tools (I think these vary between devices and vendors, so you'll have to figure out which debug tools you need for the particular hardware)

Related

Is There Ever an Advantage to User Mode Debug over Kernel Mode Debug?

From what I understand, on a high level, user mode debugging provides you with access to the private virtual address for a process. A debug session is limited to that process and it cannot overwrite or tamper w/ other process' virtual address space/data.
Kernel mode debug, I understand, provides access to other drivers and kernel processes that need full access to multiple resources, in addition to the original process address space.
From this, I get to thinking that kernel mode debugging seems more robust than user mode debugging. This raises the question for me: is there a time, when both options of debug mode are available, that it makes sense to choose user mode over a more robust kernel mode?
I'm still fairly new to the concept, so perhaps I am thinking of the two modes incorrectly. I'd appreciate any insight there, as well, to better understand anything I may be missing. I just seem to notice that a lot of people seem to try to avoid kernel debugging. I'm not entirely sure why, as it seems more robust.
The following is mainly from a Windows background, but I guess it should be fine for Linux too. The concepts are not so different.
Some inline answers first
From what I understand, on a high level, user mode debugging provides you with access to the private virtual address for a process.
Correct.
A debug session is limited to that process
No. You can attach to several processes at the same time, e.g. with WinDbg's .tlist/.attach command.
and it cannot overwrite or tamper w/ other process' virtual address space/data.
No. You can modify the memory, e.g. with WinDbg's ed command.
Kernel mode debug, I understand, provides access to other drivers and kernel processes that need full access to multiple resources,
Correct.
in addition to the original process address space.
As far as I know, you have access to physical RAM only. Some of the virtual address space may be swapped, so not the full address space is available.
From this, I get to thinking that kernel mode debugging seems more robust than user mode debugging.
I think the opposite. If you write incorrect values somewhere in kernel mode, the PC crashes with a blue screen. If you do that in user mode, it's only the application that crashes.
This raises the question for me: is there a time, when both options of debug mode are available, that it makes sense to choose user mode over a more robust kernel mode?
If you debug an application only and no drivers are involved, I prefer user mode debugging.
IMHO, kernel mode debugging is not more robust, it's more fragile - you can really break everything at the lowest level. User mode debugging provides the typical protection against crashes of the OS.
I just seem to notice that a lot of people seem to try to avoid kernel debugging
I observe the same. And usually it's not so difficult once they try it. In my debugging workshops, I explain processes and threads from kernel point of view and do it live in the kernel. And once people try kernel debugging, it's not such a mystery any more.
I'm not entirely sure why, as it seems more robust.
Well, you really can blow up everything in kernel mode.
User mode debugging
User mode debugging is the default that any IDE will do. The integration is usually good, in some IDEs it feels quite native.
During user mode debugging, things are easy. If you access memory that is paged out to disk, the OS is still running and will simply page it in, so you can read and write it.
You have access to everything that you know from application development. There are threads and you can suspend or resume them. The knowledge you have from application development will be sufficient to operate the debugger.
You can set breakpoints and inspect variables (as long as you have correct symbols).
Some kinds of debugging are only available in user mode. E.g. the SOS extension for WinDbg, used to debug .NET applications, only works in user mode.
Kernel debugging
Kernel debugging is quite complex. Typically, you can't simply do local kernel debugging - if you stop somewhere in the kernel, how do you control the debugger? The system will just freeze. So, for kernel debugging, you need 2 PCs (or virtual PCs).
During kernel mode debugging, things are complex. While you are just inside an application, a millisecond later, some interrupt occurs and does something completely different. You don't only have threads, you also need to deal with call stacks that are outside your application, you'll see CPU register content, instruction pointers etc. That's all stuff a "normal" app developer does not want to care about.
You don't only have access to everything that you implemented. You also have access to everything that Microsoft, Intel, NVidia and lots of other companies developed.
You cannot simply access all memory, because some memory that is paged out to the swap file will first generate a page fault, then involve some disk driver to fetch the data, potentially page out some other data, etc.
There is so much going on in kernel mode that, in order not to break it, you need a really professional understanding of all those topics.
Conclusion
Most developers just want to care about their source code. So if they are writing programs (aka. applications, scripts, tools, games), they just want user mode debugging. If "their code" is driver code, of course they want kernel debugging.
And of course Security Specialists and Crackers want kernel mode debugging because they want privileges.

Kernel panic error in ARM board

I have an ARM board at a remote location. Sometimes it hits a kernel panic. When that happens there is no way to do a hardware restart, because no one is available on site to restart it.
I want the board to restart automatically after a kernel panic. What do I need to do in the kernel?
If your hardware contains a watchdog timer, compile the kernel with watchdog support and configure it. I suggest following this blog post: http://www.jann.cc/2013/02/02/linux_watchdog.html
Caution: I have never tried this. If it solves the problem, please post an update here.
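For context, the user-space half of a watchdog setup usually looks like the sketch below: a small daemon keeps writing to the watchdog device, and if the kernel hangs or panics the writes stop and the hardware resets the board. This is a minimal illustration, not the blog's code. It assumes the driver exposes the standard `/dev/watchdog` node; the function name, the interval, and the `rounds`/`disarm` parameters are my own.

```python
import time


def pet_watchdog(device="/dev/watchdog", interval=10.0, rounds=None, disarm=False):
    """Keep a Linux watchdog alive by writing to its device node.

    rounds=None loops forever, which is what a real daemon would do.
    disarm=True writes the "magic close" character 'V' before closing, which
    asks the driver to stop the timer (ignored if the kernel was built with
    the "no way out" option).
    Returns the number of keepalive writes performed.
    """
    count = 0
    with open(device, "wb", buffering=0) as wd:
        while rounds is None or count < rounds:
            wd.write(b"\0")  # any write counts as a keepalive
            count += 1
            time.sleep(interval)
        if disarm:
            wd.write(b"V")
    return count
```

The interval has to be comfortably shorter than the hardware timeout. Opening the device arms the timer, so closing it without the magic character (or having the process die) leads to a reset once the timeout expires.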
You can modify the panic() function in kernel/panic.c to call kernel_restart(*cmd) at the point where you want it to restart (probably after printing the required debug information).
I am assuming you are bringing up a board, so please note that you need to supply the ops for the associated functions in machine_restart() (called by kernel_restart) according to the MACH. If you are just using the board as is, then I guess rebuilding the kernel with kernel_restart(*cmd) should do.
The panic() is usually due to events that the kernel can not recover from. If you do not have a watchdog, you need to look at your hardware to see if a GPIO, etc is connected to the RESET line. If so, you can toggle this pin to reboot the CPU. Trying to alter panic() may just make things worse, depending on the root cause and the type of features you use.
You may hook arm_pm_restart with your custom restart functionality. You can test it with the shell command reboot, if present. panic() should call the same routine. With current ARM Linux versions
You may wish to turn off the MMU and block interrupts in this routine. It will make it more resilient when called from panic(). As you are going to reset, you can copy the routine to any physical address you like.
The watchdog may be better; it may catch cases where even panic() is not called. You may have a watchdog and not realize it. Many Cortex-A CPUs have one built in. It is fairly rare for hardware not to have a watchdog.
However, if you don't have a watchdog, you can use the GPIO mechanism above; hardware should usually provide some way for software to restart the device (and peripherals). The panic() may be due to some misbehaving device trampling memory, latched-up DRAM/Flash, etc. Toggling a RESET line may be better than a watchdog in this case, if the RESET is also connected to other hardware besides the CPU.
Related: How to debug kernel freeze, How to change watchdog timer
AFAIK, a simple way to restart the board after kernel panic is to pass a kernel parameter (from the bootloader usually)
panic=1
The board will then auto-reboot '1' second(s) after a panic.
Search the Documentation for more.
Some examples from the documentation:
...
panic=          [KNL] Kernel behaviour on panic: delay <timeout>
                timeout > 0: seconds before rebooting
                timeout = 0: wait forever
                timeout < 0: reboot immediately
                Format: <timeout>
...
oops=panic      Always panic on oopses. Default is to just kill the
                process, but there is a small probability of
                deadlocking the machine.
                This will also cause panics on machine check exceptions.
                Useful together with panic=30 to trigger a reboot.
...
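To double-check what a given command line will do, the timeout rules quoted above can be encoded directly. The sketch below is only a convenience for reading a command line string (for example the contents of /proc/cmdline); both function names are made up, and it does not look at any compiled-in default.

```python
def panic_timeout(cmdline):
    """Return the value of the last panic=<int> option on a kernel command line.

    Returns None if the option is absent or not an integer.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("panic="):
            try:
                value = int(token[len("panic="):])
            except ValueError:
                pass  # ignore malformed values
    return value


def describe(timeout):
    """Translate a panic= timeout into the behaviour described in the docs."""
    if timeout is None:
        return "panic= not set on the command line"
    if timeout > 0:
        return f"reboot {timeout} s after a panic"
    if timeout == 0:
        return "wait forever after a panic"
    return "reboot immediately after a panic"
```

On a running system the current value can also be read from, and changed through, /proc/sys/kernel/panic, so the bootloader arguments are not the only place to set it.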
As suggested in previous comments, a watchdog timer is your friend here. If your hardware contains a watchdog timer, enable it in the kernel options and configure it.
Another alternative is to use a Phidget, if you have a USB connection available at the remote location. The Phidget controller/software is used to control your board over USB. Check whether your board is supported.

How to know that the kernel has panicked?

I want to be able to monitor kernel panics - know if and when they have happened.
Is there a way to know, after the machine has booted, that it went down due to a kernel panic (and not, for example, an ordered reboot or a power failure)?
The machine may be configured with KDUMP and/or KDB, but I prefer not to assume that either is or is not installed.
Patching the kernel is an option, though I prefer to avoid it. But even if I do it, I'm not sure what can the patch do.
I'm using kernel 2.6.18 (ancient, I know). Solutions for newer kernels may be interesting too.
Thanks.
The kernel module 'netconsole' may help you: it logs kernel printk messages over UDP.
You can view the log messages on a remote syslog server, even if the machine has rebooted.
Introduction:
=============
This module logs kernel printk messages over UDP allowing debugging of
problem where disk logging fails and serial consoles are impractical.
It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after NIC cards and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.
See the kernel documentation for more information: https://www.kernel.org/doc/Documentation/networking/netconsole.txt
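On the receiving machine, any UDP listener will do. As a minimal sketch, the Python below appends every netconsole datagram to a file and flags panic lines. The port 6666 is the default target port named in the netconsole documentation; the log file name and function names are assumptions, and the marker string is the prefix the kernel prints when it panics.

```python
import socket

PANIC_MARKER = "Kernel panic - not syncing"


def find_panics(lines):
    """Return the log lines that announce a kernel panic."""
    return [line.strip() for line in lines if PANIC_MARKER in line]


def listen(port=6666, bind_addr="0.0.0.0", logfile="netconsole.log"):
    """Append every received datagram to logfile and print panic lines.

    Runs until interrupted. Each record is prefixed with the sender's
    address so that several monitored machines can share one log.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    with open(logfile, "a", encoding="utf-8", errors="replace") as log:
        while True:
            data, (host, _port) = sock.recvfrom(65535)
            text = data.decode("utf-8", errors="replace")
            log.write(f"{host} {text}")
            log.flush()  # keep the file current even if we are killed
            for hit in find_panics(text.splitlines()):
                print(f"PANIC on {host}: {hit}")
```

After the monitored machine comes back up, running `find_panics` over the saved log tells you whether the previous boot ended in a panic rather than an ordered reboot or a power failure.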

Debugging kernel hang

I am trying to run an app that uses a kernel mode driver. The system locks up every hour, and the only way to recover is a hard reset. SysRq stops responding, telnet sessions hang, and there are no error messages of any kind. Unfortunately the board does not have EJTAG support. I have been trying to isolate the problem functionally, but this is like looking for a needle in a haystack. Any suggestions?
PS: This is a mips linux system (2.6.31).
Here are some options, depending on the specifics of your situation. If you can provide more detail about the platform and the nature of the kernel mode driver, it would be helpful.
Assuming you have reason to be confident in the hardware, your likely sources of lockups are locking problems in the kernel, uninitialized variables, and infinite loops with preemption disabled.
Can you configure a timer interrupt to run periodically and blink a LED? You might find it useful to see if interrupts continue to be handled while in a lockup.
Enable soft lockup detection in the Linux kernel hacking menu, along with any other relevant kernel hacking features. It may take Linux a minute or two to detect and report a soft lockup. Have you waited long enough to check for this?
Enable lock dependency checking in kernel hacking, and fix any reported locking errors in your driver.
Try changing the kernel preemption mode. This changes the behaviour of some system locks, in some cases turning deadlocks into less harmful locks. If it's relevant/possible, disable SMP.
Unfortunately, without SysRq working or some other way of poking the underlying system, you are out of luck.
If you can get some behavior out of the system (perhaps a hardware watchdog?), I would recommend kdump.
Furthermore, if this is a more recent problem, start by bisecting the code of the driver to determine where the crash is occurring.
If the kernel isn't totally hung and you are still getting interrupts, you might be able to use KGDB.
If you can't do that, you could add more logging code to your driver to track down the source of the problem. I'd put a printk() on every function's entry at a minimum and probably on every exit of each function as well. That should at least help you find out where the problem is happening.

Temporarily suspend the PC operating system

How does one programmatically cause the OS to switch off, go away and stop doing anything at all so that a program may have complete control of a PC system?
I'm interested in doing this from both an MS Windows and Linux environments. Any languages or APIs considered.
I want the OS to stop preempting my program, stop its virtual memory management, stop its device drivers and interrupt service routines from running and basically just go away. Then, when my program has had its evil way with the bare metal, I want the OS to come back again without a reboot.
Is this even possible?
With Linux, you could use kexec jump to transfer control completely to another kernel (ie, your program). Of course, with great power comes great responsibility - it is entirely up to you to service interrupts, and avoid corrupting the old kernel's memory. You'll end up having to write your own OS kernel to do this. Also, the transfer of control takes quite some time, as the kernel has to de-initialize all hardware, then reinitialize it when it's time to resume. Since kexec jump was originally designed for hibernation support, this isn't a problem in its original context, but depending on what you're doing, it might be a problem.
You may want to consider instead working within the framework given to you by the OS - just write a normal driver for whatever you're doing.
Finally, one more option would be using the linux Real-Time patchset. This lets you assign static priorities to everything, even interrupt handlers; by running a process with higher priority than anything else, you could suspend /nearly/ everything - the system will still service a small stub for interrupts, as well as certain interrupts that can't be deferred, like timing interrupts, but for the most part the heavy work will be deferred until you relinquish control of the CPU.
Note that the RT patchset won't stop virtual memory and the like - mlockall will prevent page faults on valid pages though, if that's enough for you.
Also, keep in mind that whatever you do, the system BIOS can still cause SMM traps, which cannot be disabled, except by motherboard-model-specific methods.
There are lots of really ugly ways to do this. You could modify the running kernel by writing some trampoline code to /dev/kmem that passes control to your application. But I wouldn't recommend attempting something like that!
Basically, you would need to have your application act as its own operating system. If you want to read data from a file, you would have to figure out where the data lives on disk, and generate your own SCSI requests to talk to the disk drive. You would have to implement your own interrupt handler to get notified when the data is ready. Likewise you would have to handle page faults, memory allocation, etc. Most users feel that this isn't worth the effort...
Why do you want to do this?
Is there something that your application needs to do that the OS won't let it do? Are you concerned with the OS impact on performance? Something else?
If you don't mind shelling out some cash, you could use IntervalZero's RTX to do this for a Windows system. It's a hard realtime subsystem that gets installed on a Windows box as sort of a hack into the HAL and takes over the machine, letting Windows have whatever CPU cycles are left over.
It has its own scheduler and device drivers, but if you run your program at the top RTX priority, don't install any RTX device drivers (or disable interrupts for the duration), then nothing will interrupt it.
It also supports a small amount of interaction with programs on the Windows side.
We use it as a nice way to get a hard realtime box that runs Windows.
coLinux loads CoLinuxDriver into the NT kernel or a colinux.ko into the Linux kernel. It does exactly what you asked – it "unschedules" the host OS, and runs its own code, with its own memory management, interrupts, etc. Then, when it's done, it "reschedules" the host OS, allowing it to continue from where it left off. coLinux uses this to run a modified Linux kernel parallel to the host OS.
Unlike more common virtualization techniques, there are no barriers between coLinux and the bare metal hardware at all. However, hardware and the host OS tend to get confused if the coLinux guest touches anything without restoring it before returning to the host OS.
Not really. Operating Systems are a foundation, and your program runs on top of them. The OS handles memory access, disk writing operations, communications, etc. when your application makes requests, and asking the OS to move out of the way would mean that your program would have to do the OS's job instead.
Not as such, no.
What you want is basically an application that becomes an OS; a severely stripped down Linux kernel coupled with some highly customized and minimized tools might be the way to go for this.
If you were devious and wanted to avoid a lot of the operating system housekeeping, you could probably hook yourself into a driver routine. I'm thinking out loud here, and this is verging on hacking: search for how to write rootkits.
Yeah dude, you can totally do that, you can also write a program to tell my bank to give you all my money and send you a hot Russian.
