Instrumenting Linux kernel functions - gcc

I am looking for a way to instrument a function in Linux kernel. It seems that GCC's -finstrument-functions flag allows instrumentation, but is there any way to instrument only a particular Linux function using compiler directives (i.e., function attributes) instead of instrumenting all functions?
It seems that Kprobes can also instrument functions, but Kprobes maintains a blacklist of functions that it refuses to probe, which limits the scope of instrumentation.
I am running Ubuntu-16.04 with kernel version 4.8.11 on x86_64.
The purpose of instrumentation is to monitor the entry and exit of the target function by setting and clearing a flag.

The answer is that it depends, and it is not as straightforward as one might think.
For example, you can compile only specific C files with -finstrument-functions and then use __attribute__((no_instrument_function)) to exclude the functions you don't want instrumented.
You would also have to keep track of preemption and fastcalls, and make sure that you're not accidentally stopping or removing instrumentation. You also have to account for any unusual stack layout that the kernel sets up for itself.
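A minimal user-space sketch of the entry/exit flag from the question, assuming the file is built with `gcc -finstrument-functions`. The hook names `__cyg_profile_func_enter` and `__cyg_profile_func_exit` are the ones GCC calls when that flag is on; `target_function`, `helper`, and `in_target` are hypothetical names used only for illustration. The kernel does not provide these hooks, so a kernel build would need its own definitions and would still face the preemption caveats mentioned above.

```c
#include <stddef.h>

/* Hypothetical flag: 1 while target_function is executing, 0 otherwise. */
static volatile int in_target;

int target_function(int x);

/* The hooks themselves must not be instrumented, or they would recurse. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    if (this_fn == (void *)target_function)
        in_target = 1;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    if (this_fn == (void *)target_function)
        in_target = 0;
}

/* Instrumented when the file is compiled with -finstrument-functions. */
int target_function(int x)
{
    return x * 2;
}

/* Excluded from instrumentation by the attribute. */
__attribute__((no_instrument_function))
int helper(int x)
{
    return x + 1;
}
```

Filtering on `this_fn` inside the hook is one way to react to a single function while still compiling the whole file with the flag; every other instrumented function still pays the cost of the hook call.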

Related

Measuring time in omp_fn routines

I am writing a pintool that gathers metrics for a subset of an application's routines (some of which are generated by the compiler).
The goal is to get the execution time of those routines.
Here is what I have tried or considered so far:
Doing it with pin is a bad idea because of the virtual machine overhead.
The gcc option -finstrument-functions does not cover the OpenMP functions that gcc generates.
LD_PRELOAD does not work with OpenMP functions that are statically linked.
Maybe if pin could dump statically instrumented assembly, we could avoid the virtual environment overhead, but as far as I know that isn't possible.
I know about the Maqao instrumentation tool, which does not use a virtual environment, but I want to avoid using too many frameworks or translating my pintool into a Maqao Lua script.
I guess I am left with manual binary instrumentation, but if anybody has a better solution, the help will be appreciated.
If you just want the results - use a comprehensive measurement infrastructure that supports OpenMP such as Intel VTune, Extrae/Paraver, Score-P. This will provide you profiling or tracing information about the OpenMP regions.
If you want to implement the measurement yourself, you can use the underlying source-to-source transformation tool Opari. You could also use the much cleaner OpenMP tools interface (OMPT), but AFAIK it is not widely supported yet. You might have some luck with recent Intel OpenMP runtimes.
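As a rough illustration of implementing the measurement yourself, the sketch below places timing calls by hand around an OpenMP region, which is similar in spirit to what a source-to-source tool inserts. It is not Opari's actual output. The names `timed_sum`, `now_seconds`, and `region_seconds` are hypothetical, and `clock_gettime` is used instead of an OpenMP timer so that the file builds with or without `-fopenmp`.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Hypothetical accumulator for wall time spent in the region, in seconds. */
static double region_seconds;

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Sums 0..n-1 in a parallel region and records the region's wall time. */
long timed_sum(long n)
{
    long sum = 0;
    double start = now_seconds();

    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += i;

    region_seconds += now_seconds() - start;
    return sum;
}
```

The measured interval is taken on the encountering thread, so it includes the fork and join overhead of the region and not only the body of the outlined function.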

How to add a custom semaphore to the linux kernel?

Basically, I want to implement my own semaphore inside the Linux kernel and be able to use it in user programs.
I've made some progress implementing the kernel code, but I do not know how to make the semaphore type and the functions I've written available to user programs.
User programs would need to have access to my semaphore type and its functions (wait, signal, ...).
Is there any way to do this so that a Linux system running a kernel compiled with my code would be able to use my semaphore simply by including a header file?
I'm no pro when it comes to the Linux kernel, so if I'm making any obvious mistakes feel free to point them out. Thanks.
The kernel version I'm using is 2.6.32.
I would recommend looking at the user-space libraries to see how a semaphore is implemented for user-space programs.
Mutexes were introduced in kernel 2.6.16; before that, semaphores were used for mutual exclusion. Semaphores are still available in later kernels, but newer code should use mutexes instead, which can be used only in process context. You may want to look at the following headers, structs, and APIs:
#include <linux/mutex.h>
struct mutex
mutex_{lock,trylock,unlock,lock_interruptible}()
You may also want to look at semaphore.c for the implementation.

How to use __sync_fetch_and_add for a Linux userspace program on beagleboard/gumstix

I am looking to use the __sync_fetch_and_xxx functions for thread safe shared memory access on my Linux application with a beagleboard and gumstix. I can't seem to find the correct header to include. Are these functions only available for kernel development?
Thanks
These are compiler builtins, and they are available for user-space development. You do not need to include any header. If gcc supports them on your architecture, it will generate the correct code; if not, the build will fail with an error.
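A minimal sketch of the builtin in use, with no header included. `shared_counter`, `counter_add`, and `counter_read` are hypothetical names; in the scenario from the question, the counter would live in the shared memory segment.

```c
/* No #include is needed: __sync_fetch_and_add is a GCC builtin. */

/* Hypothetical shared counter, e.g. a word in a shared memory segment. */
static int shared_counter;

/* Atomically adds delta and returns the value held before the addition. */
int counter_add(int delta)
{
    return __sync_fetch_and_add(&shared_counter, delta);
}

/* Atomic read expressed as "add zero"; returns the current value. */
int counter_read(void)
{
    return __sync_fetch_and_add(&shared_counter, 0);
}
```

Note that the builtin returns the old value, so the first call to `counter_add(5)` on a zeroed counter returns 0 and leaves 5 in the counter.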

What are the possible side effects of using GCC profiling flag -pg?

There is a device driver for a camera device provided to us as a .so library file by the vendor.
Only the header file with the APIs is available, which lists the functions we can use to work with the device. Our application is linked with the .so library file provided by the vendor and uses those interface functions.
When we wanted to measure the time our application takes to handle different tasks, we added the GCC -pg flag and rebuilt our application.
But with the executable built with -pg, we observe random failures in the camera image acquire functions. Since we only have the .so library file, we do not know what is going wrong inside those functions.
So in general I wanted to understand what could be the possible reasons of such a failure mode. Any pointers or documents that can help what goes inside profiling and its side effects is appreciated.
This answer is a helpful overview of how the gcc -pg flag profiler actually works. The take-home point is mostly to do with possible changes to timing. If your library has any kind of time-sensitivity in it, introducing profiler overheads might be changing the time it takes to execute parts of the code, and perhaps violating some kind of constraint.
The gprof documentation explains the implementation details:
Profiling works by changing how every function in your program is
compiled so that when it is called, it will stash away some
information about where it was called from. From this, the profiler
can figure out what function called it, and can count how many times
it was called. This change is made by the compiler when your program
is compiled with the `-pg' option, which causes every function to call
mcount (or _mcount, or __mcount, depending on the OS and compiler) as
one of its first operations.
So the timing of your application would change quite a bit when you turn on -pg.
If you would like to instrument your code without significantly affecting the timings, you could possibly look at oprofile. It does not pose as significant an overhead as gprof does.
Another fairly recent tool that serves as a good lightweight profiling tool is perf.
Profiling tools are useful primarily for understanding the CPU-bound pieces of your library/application and can help you optimize those critical pieces. Most of the time they serve to identify some culprit function/method that wastes CPU cycles. So do not rely on them as the sole tool for debugging any and all issues.
Most vendor libraries would also provide means to turn on extra debugging or dumping extra information during runtime. They include means such as environment variables, log files, /proc or /sys interfaces for drivers, etc. and sometimes even tools to increase debugging levels at runtime. See if you can leverage these.
If you have defined APIs in a library/driver, you should run unit-tests on them instead of trying to debug the whole application you've built.
If you find a certain unit-test fails, send the source code of the unit-test to your vendor, and ask them to fix the bug. If it is not a bug, your vendor would at least point you towards the right set of APIs or the semantics to use.

Can I use -fstack-check when compiling my Ubuntu 10.04 kernel module?

It looks like my kernel module is performing some stack smashing under heavy loads. Can I use the -fstack-check compile option for kernel modules? It appears as if that compile option causes the compiler to emit additional code, but not link to a library or runtime. Is that correct?
I have a very simplified kernel that does not do much. I can load that simple kernel with and without slub debugging enabled, and it will also load with and without -fstack-check at compile. When I start testing my module, it starts crashing when I use the -fstack-check compile option, whereas it seems to not trip errors with just slub debugging.
A different question (How does the gcc option -fstack-check exactly work?) provided some information but I haven't been able to find examples of people using the -fstack-check option in kernel module compilations.
The stack space inside the Linux kernel is severely limited. Go over your code with a fine-tooth comb to check that no path uses too much stack in local variables; alloca() is not allowed at all. Other than that, the kernel environment is harsh. Check your logic carefully. Add tests for possibly out-of-range data, trace data back to wherever it comes from, and make sure it is always what you believe it to be. Data from userland is always a reason for extra paranoia.
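A small sketch of the kind of range test meant here, written as plain C so it can be tried outside the kernel. `MSG_MAX`, `struct request`, and `copy_request` are hypothetical names; in a real module the copy would go through `copy_from_user()` and a large buffer would be allocated with `kmalloc()` instead of living on the stack.

```c
#include <stddef.h>
#include <string.h>

#define MSG_MAX 64  /* hypothetical upper bound on a request payload */

/* Hypothetical request layout; in a module the data would be user memory. */
struct request {
    size_t len;
    const unsigned char *data;
};

/*
 * Copies a request payload into a caller-provided buffer of MSG_MAX bytes.
 * Returns 0 on success, -1 if the request is missing or its length is
 * out of range.
 */
int copy_request(unsigned char *dst, const struct request *req)
{
    if (req == NULL || req->data == NULL)
        return -1;
    if (req->len > MSG_MAX)   /* reject before touching the buffer */
        return -1;
    memcpy(dst, req->data, req->len);
    return 0;
}
```

Checking the length before the copy, and never trusting a length field supplied by the caller, is the point of the example; the limit itself is arbitrary.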

Resources