I suspect a double kfree in my kernel code. Basically, I have a data structure that is kzalloc'ed and kfree'd in a module. I notice that the same address is allocated and then allocated again without being freed in the module.
I would like to know which technique I should use to find where the erroneous kfree is issued.
1.
Yes, kmemleak is an excellent tool, especially suitable for system-wide analysis.
Note that if you are going to use it to analyze a kernel module, you may need to save the addresses of the ELF sections containing the code of the module (.text, .init.text, ...) when the module is loaded. This may help you decipher the call stacks in kmemleak's report. It usually makes sense to ask kmemleak to produce a report after the module has been unloaded, but kmemleak cannot resolve the module's addresses at that time.
While a module is loaded, the addresses of its sections can be found in the files in /sys/module/<module_name>/sections/.
After you have found the section each code address in the report belongs to and the corresponding offset into that section, you can use objdump, gdb, addr2line or a similar tool to obtain more detailed information about where the event of interest occurred.
2.
Besides that, if you are working on an x86 system and you would like to analyze a single kernel module, you can also use the KEDR LeakCheck tool.
Unlike with kmemleak, most of the time you do not need to rebuild the kernel to be able to use KEDR.
The instructions on how to build and use KEDR are here. A simple example of how LeakCheck can be used is described in the "Detecting Memory Leaks" section.
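For the kmemleak case in point 1, the step of mapping a code address from the report back to a section and an offset can be sketched in a few lines of C. This is only an illustration under assumptions: the struct and function names are made up for this example, and the section addresses are expected to have been saved from /sys/module/<module_name>/sections/ while the module was still loaded.

```c
#include <stddef.h>
#include <stdint.h>

/* One ELF section of the module, as saved from sysfs while it was loaded.
 * The type and field names are hypothetical, chosen for this sketch. */
struct mod_section {
    const char *name;  /* e.g. ".text" or ".init.text" */
    uint64_t    base;  /* load address of the section */
};

/* Find the section whose base is the highest one not above addr and
 * store addr's offset into that section in *offset.
 * Returns NULL if addr lies below every saved section.
 * Caveat: sysfs exposes only start addresses, not sizes, so an address
 * past the real end of a section is still attributed to it. */
const struct mod_section *find_section(const struct mod_section *secs,
                                       size_t n, uint64_t addr,
                                       uint64_t *offset)
{
    const struct mod_section *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (secs[i].base <= addr &&
            (best == NULL || secs[i].base > best->base))
            best = &secs[i];
    }
    if (best != NULL && offset != NULL)
        *offset = addr - best->base;
    return best;
}
```

With made-up values such as .text loaded at 0xffffffffa0000000 and a report address of 0xffffffffa0000123, this gives section .text and offset 0x123. That pair can then be handed to a tool like addr2line (its -j option reads offsets relative to a named section) or objdump, run against the module's .ko file.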
Have you tried enabling the kmemleak detection code?
See Documentation/kmemleak.txt for details.
Related
Is there a straightforward way, with ready-at-hand tooling, to suspend a traced process's execution when certain syscalls are called with specific parameters? Specifically, I want to suspend program execution whenever
stat("/${SOME_PATH}")
or
readlink("/${SOME_PATH}")
are called. I then want to attach a debugger, so that I can identify which of the hundreds of shared objects linked into the process is trying to access that specific path.
strace shows me the syscalls all right, and gdb does the rest. The question is how to bring them together. This can surely be solved with custom glue scripting, but I'd rather use a clean solution.
The problem at hand is a third-party tool suite that is available only in binary form and whose distribution package completely violates the LSB/FHS and good manners: it places shared objects all over the filesystem, some of which are loaded from unconfigurable paths. I'd like to identify which modules of the tool suite do this and either patch the binaries or file an issue with the vendor.
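This is the approach that I use for similar situations when debugging on Windows. Even though I think it should be possible for you too, I have not tried it with gdb on Linux.
Once you have attached to your process, set a breakpoint on the system call wrapper, which in your case is, for example, stat.
Add a condition based on esp to your breakpoint. For example, suppose you want to catch stat("/$te"). On 32-bit x86, the value at [esp+4] on function entry is the first argument, i.e. the address of the path string, which in this case is "/$te". Comparing that pointer directly against a string literal, as in *(uint32_t*)[esp+4] == "/$te", would only compare addresses, so the condition has to compare the string contents instead. It seems that you can use strcmp() in your condition for this, as described here; in gdb syntax that would be something like strcmp(*(char **)($esp + 4), "/$te") == 0. On x86-64 the first argument is passed in rdi rather than on the stack, so the condition would use $rdi instead.
I think something similar to this should work for you too.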
As part of understanding virtualization, I am trying to extend KVM and define a new instruction. The instruction will use a previously unused opcode.
ref- ref.x86asm.net/coder32.html.
Now, let's take an instruction like 'CPUID' (which causes a VM exit). I want to add a new instruction, say 'NEWCPUID', which is similar to 'CPUID' in privilege and is trapped by the hypervisor, but differs in its implementation.
After going through some online resources, I was able to understand how to define new system calls, but I am not sure which files in the Linux source code I need to add the code for NEWCPUID to. Is there a better way than relying only on the 'find' command?
I am facing the following challenges:
1. In which places in the Linux source code do I need to add code?
2. I am not sure how this new instruction can be mapped to a previously unused opcode.
As I am completely new to this field and willing to learn, can someone briefly explain how to go about this task? I need a pointer in the right direction. If there is a reference/tutorial/blog describing the process, it would be of great help!
Here are answers to some of your questions:
... but I am not sure about which all files in linux source code do I need to add the code for NEWCPUID?
A - The right place to add emulation for KVM is arch/x86/kvm/emulate.c. Take a look at how opcode_table[] is defined and at the hooks to the functions that its entries execute. The basic idea is that the guest executes an undefined instruction such as "db 0xunused"; this results in an exit since the instruction is undefined. In KVM, you look at the rip from the VMCS/VMCB, determine whether it is an instruction KVM knows about (such as NEWCPUID), and then KVM calls x86_emulate_instruction().
...Is there a better way than only relying on 'find' command?
A - Yes, pick an example system call and then use a symbol cross reference such as cscope.
...n me in short how to go about this task?
A - As I mentioned in 1, first of all find a way for the guest to attempt to execute this unused opcode (such as the db trick). I think the assembler will try to reject unknown opcodes, so that is the first step. Second, check whether your instruction causes a VM exit. For this, you can use tracing. Tracing emits a lot of output, so you have to use some filter options. If tracing is overwhelming, simply printk something in vmx_handle_exit (vmx.c). Finally, find a way to hook into your custom function from here. KVM already has handle_exception() to handle guest exceptions; that would be a good place to insert your custom function. See how this function calls emulate_instruction to emulate an exception to be injected into the guest.
I have deliberately skipped some of the questions since I consider them essential to figure out yourself in the process of learning. BTW, I don't think this is the best way to understand virtualization. A better way might be to write your own userspace hypervisor that utilizes KVM services via /dev/kvm, or maybe just a standalone hypervisor.
I am working on ARM Linux. When CONFIG_PAGE_POISONING is enabled, pages are filled with a poison byte pattern after free_pages(), and the pattern is verified before alloc_pages().
This helps me identify bit flips or page memory corruption, because the poison byte pattern is checked before a new page is allocated. But how do I identify the culprit? I searched on Google but could not find an answer.
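To make the mechanism described in the question concrete, here is a minimal user-space sketch of the poison-and-verify idea. It is an illustration only, not the kernel's implementation: the function names and the fixed 4096-byte page size are assumptions made for this example, while 0xaa is the value of PAGE_POISON in include/linux/poison.h.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size for this sketch */
#define POISON_BYTE      0xaa  /* PAGE_POISON in include/linux/poison.h */

/* "Free" path: fill the whole page with the poison pattern. */
void poison_page(unsigned char *page)
{
    memset(page, POISON_BYTE, SKETCH_PAGE_SIZE);
}

/* "Alloc" path: return the offset of the first byte that no longer holds
 * the poison pattern, or -1 if the page is intact. */
long check_poison(const unsigned char *page)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != POISON_BYTE)
            return (long)i;
    }
    return -1;
}

/* Return 1 if the byte differs from the poison value in exactly one bit,
 * which points to a bit flip rather than a stray write; 0 otherwise. */
int is_single_bit_flip(unsigned char got)
{
    unsigned char diff = got ^ POISON_BYTE;

    return diff != 0 && (diff & (diff - 1)) == 0;
}
```

The sketch also shows why poisoning alone cannot name the culprit: the check only sees that a byte changed at some point between free and allocation, not who wrote it. Distinguishing a single-bit difference from a larger one at least hints at whether to suspect hardware or a stray write.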
I know it's an old question, but I've just had a similar issue, and it took me a while to debug it. So I will recommend two tools here that were very useful to me (I used both of them simultaneously):
First tool: KMEMLEAK
To use it, you must enable it in your kernel configuration:
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
And if you get this kernel log:
kmemleak: Kernel memory leak detector disabled
kmemleak: Early log buffer exceeded (xxxx), please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
Then I also suggest adding this to your kernel configuration:
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=4096
In addition, I think you need to add kmemleak=on to the boot arguments (I'm not sure it's enabled by default).
After all that, the KMEMLEAK tool is ready to run.
Now I suggest having a look at the examples given in the links below. They helped me understand how to use the API and how to read its logs:
KMEMLEAK first example
KMEMLEAK second example
Second tool: SLUB_DEBUG
SLUB_DEBUG is very useful for finding memory corruption caused by use-after-free, double-free and buffer-overrun errors.
To enable it, update your kernel configuration:
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_PAGE_POISONING=y
And I also suggest adding:
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_PAGE_OWNER=y
In addition, you will probably need to add the following flags to the boot arguments: page_poison=on and slub_debug=FZP, and, if you've also set CONFIG_PAGE_OWNER, page_owner=on.
Once all the debug configuration is set, have a look at the following example. It's clear and useful.
SLUB_DEBUG example
Good Luck finding your bug !!!
I'm doing various tasks on the Linux kernel, and I end up reading source code from time to time. I haven't really needed to change the kernel yet (I'm fine with so-called "Loadable Kernel Modules"), so I haven't downloaded the kernel source and just use http://lxr.free-electrons.com/ . Quite often I come across a function that has many implementations and start guessing which one is the one I need.
For example, in the file Linux/virt/kvm/kvm_main.c, line 496 is a call to list_add. Clicking on it gives me two options: drivers/gpu/drm/radeon/mkregtable.c, line 84 and include/linux/list.h, line 60 - It's quite clear that kvm will not send me to something under "gpu", but this is not always the case. I have looked at the includes of the file, but that was not much help.
So my question: given a file from the kernel and a function call at line ###, what is the nicest way to find where that function call actually leads?
(I'll be happy to hear also about ways that don't include the website and\or require me to download the source code)
There are many things in the kernel that are #define'd or typedef'd, or functions that are mapped inside structs (such as the fop struct in drivers). So there's no easy way to browse the kernel source. The lxr site helps, but it can't go any further when you encounter any of the above constructs. The same goes for cscope/ctags. The best way, even though you explicitly asked to avoid it, is to download the source and browse through it.
Another method would be to use kgdb and inspect the code function by function, but to save time that requires some knowledge of which functions you want to step into. And last but not least, you can increase the kernel log level and print logs that are accessible through dmesg. But all of these require you to have the kernel source.
I am trying to understand whether we can add our own page fault handlers / exception handlers in kernel / user mode and handle a fault we induced before giving control back to the kernel.
The task here is not to modify the existing kernel code (the do_page_fault function) but to add a user-defined handler that is looked up when a page fault or an exception is triggered.
There are tools like "kprobe" that provide hooks at the instruction level, but it looks like they will not serve my purpose.
It would be great if somebody could help me understand this or point me to good references.
From user space, you can define a signal handler for SIGSEGV, so your own function will be invoked whenever an invalid memory access is made. When combined with mprotect(), this lets a program manage its own virtual memory, all from user-space.
However, I get the impression that you're looking for a way to intercept all page faults (major, minor, and invalid) and invoke an arbitrary kernel function in response. I don't know a clean way to do this. When I needed this functionality in my own research projects, I ended up adding code to do_page_fault(). It works fine for me, but it's a hack. I would be very interested if someone knew of a clean way to do this (i.e., that could be used by a module on a vanilla kernel).
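The user-space approach from the first paragraph (a SIGSEGV handler combined with mprotect()) can be sketched roughly as follows. This is a minimal illustration under assumptions, not a full user-level pager: the function names are made up, it handles a single anonymous page, and it calls mprotect() from a signal handler, which works on Linux in practice but is not on POSIX's list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;  /* faults handled so far */
static long page_size;                     /* set before the handler is installed */

/* SIGSEGV handler: make the faulting page readable and writable, then
 * return so that the faulting instruction is retried. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);

    (void)sig;
    (void)ctx;
    fault_count++;
    if (mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);  /* a fault we cannot fix, e.g. a genuinely bad pointer */
}

/* Map one inaccessible page, touch it, and let the handler "page it in".
 * Returns the number of faults taken, or -1 on a setup or data error. */
int demo_user_pager(void)
{
    struct sigaction sa, old;
    void *mem;
    volatile char *p;
    int faults;

    page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    mem = mmap(NULL, (size_t)page_size, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        sigaction(SIGSEGV, &old, NULL);
        return -1;
    }

    p = mem;
    fault_count = 0;
    p[0] = 42;  /* faults; the handler unprotects the page; the store is retried */
    p[1] = 43;  /* same page, already accessible: no further fault */
    faults = (p[0] == 42 && p[1] == 43) ? (int)fault_count : -1;

    munmap(mem, (size_t)page_size);
    sigaction(SIGSEGV, &old, NULL);  /* restore the previous handler */
    return faults;
}
```

A real user-level pager would fill the page with its own contents before returning from the handler and would keep track of which regions it manages, so that unrelated segfaults still terminate the program.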
If you don't want to change the way the kernel handles these faults and just want to add your own handling before it, then kprobes will serve your purpose. They are a little difficult to handle, because you get the arguments of the probed function in a structure containing registers and on the stack, and you have to know exactly where the compiler put each of them. BUT, if you need it for specific functions (known when the probes are created), then you can use jprobes (here is a nice example of how to use both), which require probe functions with exactly the same arguments as the probed one (so no digging through registers/stack).
You can dynamically load a kernel module and install jprobes on chosen functions without having to modify your kernel.
You can install a user-level pager with GNU libsigsegv. I haven't used it, but it seems to be just what you are looking for.
I do not think it would be possible. First of all, the page fault handler is a complex function that needs direct access to the virtual memory subsystem's structures.
Secondly, even if that were not an issue, in order to write a page fault handler in user space you would have to be able to capture a fault, which by default forces a transfer to kernel space, so at the very least you would have to prevent that from happening.
To this end you would need a supervisor to keep track of all memory accesses, but you cannot guarantee that the supervisor code is already mapped and present in memory.