I would like to write an eBPF program in order to track the functions being called in a separate running eBPF program. Also, I would like to count the number of times the respective functions have been called.
Is this possible? And if so, could someone please hint me towards what could be used in order to achieve this?
(Note: I am looking for the idea/concept behind achieving this functionality (i.e. the specific kprobe to be used) rather than a fully developed solution.)
Preferably, I am looking for a solution which can be implemented using python bcc or bpftrace.
Update: I would like to count the number of times "user-defined" functions are being called inside an eBPF program.
For example, if I create an eBPF program in the kernel code:
SEC("kprobe/tcp_v4_connect")
int bpf_sample_prog(struct pt_regs *ctx) {
int x, y, ...; /* local variables to which I assign data from context */
foo(x); /* user-defined function */
bar(y); /* user-defined function */
return 0;
}
, and I attach it to the kprobe from the userspace code, I would like to be able to count using a completely separate eBPF program the number of times the foo and bar functions have been called inside the bpf_sample_prog program.
Thank you in advance.
Related
I'm trying to figure out how an ebpf program can change the outcome of a function (not a syscall, in my case) in kernel space. I've found numerous articles and blog posts about how ebpf turns the kernel into a programmable kernel, but it seems like every example is just read-only tracing and collecting statistics.
I can think of a few ways of doing this: 1) make a kernel application read memory from an ebpf program, 2) make ebpf change the return value of a function, 3) allow an ebpf program to call kernel functions.
The first approach does not seem like a good idea.
The second would be enough, but as far as I understand it's not easy. This question says syscalls are read-only. This bcc document says it is possible but the function needs to be whitelisted in the kernel. This makes me think that the whitelist is fixed and can only be changed by recompiling the kernel, is this correct?
The third seems to be the most flexible one, and this blog post encouraged me to look into it. This is the one I'm going for.
I started with a brand new 5.15 kernel, which should have this functionality
As the blog post says, I did something no one should do (security is not an issue since I'm just toying with this) and opened every function to ebpf by adding this to net/core/filter.c (which I'm not sure is the correct place to do so):
static bool accept_the_world(int off, int size,
enum bpf_access_type type,
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info)
{
return true;
}
bool export_the_world(u32 kfunc_id)
{
return true;
}
const struct bpf_verifier_ops all_verifier_ops = {
.check_kfunc_call = export_the_world,
.is_valid_access = accept_the_world,
};
How does the kernel know of the existence of this struct? I don't know. None of the other bpf_verifier_ops declared are used anywhere else, so it doesn't seem like there is a register_bpf_ops
Next I was able to install bcc (after a long fight due to many broken installation guides).
I had to checkout v0.24 of bcc. I read somewhere that pahole is required when compiling the kernel, so I updated mine to v1.19.
My python file is super simple, I just copied the vfs example from bcc and simplified it:
bpf_text_kfunc = """
extern void hello_test_kfunc(void) __attribute__((section(".ksyms")));
KFUNC_PROBE(vfs_open)
{
stats_increment(S_OPEN);
hello_test_kfunc();
return 0;
}
"""
b = BPF(text=bpf_text_kfunc)
Where hello_test_kfunc is just a function that does a printk, inserted as a module into the kernel (it is present in kallsyms).
When I try to run it, I get:
/virtual/main.c:25:5: error: cannot call non-static helper function
hello_test_kfunc();
^
And this is where I'm stuck. It seems like it's the JIT that is not allowing this, but who exactly is causing this issue? BCC, libbpf or something else? Do I need to manually write bpf code to call kernel functions?
Does anyone have an example with code of what the lwn blog post I linked talks about actually working?
eBPF is fundamentally made to extend kernel functionality in very specific limited ways. Essentially a very advanced plugin system. One of the main design principles of the eBPF is that a program is not allowed to break the kernel. Therefor it is not possible to change to outcome of arbitrary kernel functions.
The kernel has facilities to call a eBPF program at any time the kernel wants and then use the return value or side effects from helper calls to effect something. The key here is that the kernel always knows it is doing this.
One sort of exception is the BPF_PROG_TYPE_STRUCT_OPS program type which can be used to replace function pointers in whitelisted structures.
But again, explicitly allowed by the kernel.
make a kernel application read memory from an ebpf program
This is not possible since the memory of an eBPF program is ephemaral, but you could define your own custom eBPF program type and pass in some memory to be modified to the eBPF program via a custom context type.
make ebpf change the return value of a function
Not possible unless you explicitly call a eBPF program from that function.
allow an ebpf program to call kernel functions.
While possible for a number for purposes, this typically doesn't give you the ability to change return values of arbitrary functions.
You are correct, certain program types are allowed to call some kernel functions. But these are again whitelisted as you discovered.
How does the kernel know of the existence of this struct?
Macro magic. The verifier builds a list of these structs. But only if the program type exists in the list of program types.
/virtual/main.c:25:5: error: cannot call non-static helper function
This seems to be a limitation of BCC, so if you want to play with this stuff you will likely have to manually compile your eBPF program and load it with libbpf or cilium/ebpf.
I am trying to monitor all times the PCIe stack writes configures to a device. In the absence of a PCI equivalent of usbmon, I thought to monitor all times the pci_bus_write_config_byte() function is called. I wanted to write a kernel module that essentially did this:
int (*original)(struct pci_bus *, unsigned int, int, u8);
original = &pci_bus_write_config_byte;
pci_bus_write_config_byte = &my_custom_func;
And then my custom function will printk() whatever data is passed, and return the original pci_bus_write_config_byte. However, when I load the module nothing happens. I suspect this is due to some sort of RW protection.
My google searches revealed that set_memory_rw() is supposed to make a function pointer writable, but I am not able to properly include it or use this function - when I go to insmod the module, the kernel says there are unknown symbols.
Any ideas on how one would do this?
I'm new on modules programming and I need to make a system call to retrieve the system processes and show how much CPU they are consuming.
How can I make this call?
Why would you implement a system call for this? You don't want to add a syscall to the existing Linux API. This is the primary Linux interface to userspace and nobody touches syscalls except top kernel developers who know what they do.
If you want to get a list of processes and their parameters and real-time statuses, use /proc. Every directory that's an integer in there is an existing process ID and contains a bunch of useful dynamic files which ps, top and others use to print their output.
If you want to get a list of processes within the kernel (e.g. within a module), you should know that the processes are kept internally as a doubly linked list that starts with the init process (symbol init_task in the kernel). You should use macros defined in include/linux/sched.h to get processes. Here's an example:
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/sched.h>
static int __init ex_init(void)
{
struct task_struct *task;
for_each_process(task)
pr_info("%s [%d]\n", task->comm, task->pid);
return 0;
}
static void __exit ex_fini(void)
{
}
module_init(ex_init);
module_exit(ex_fini);
This should be okay to gather information. However, don't change anything in there unless you really know what you're doing (which will require a bit more reading).
There are syscalls for that, called open, and read. The information of all processes are all kept in /proc/{pid} directories. You can gather process information by reading corresponding files.
More explained here: http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html
I'm developing a launcher for a game.
Want to intercept game's call for a function that prints text.
I don't know whether the code that contains this function is dynamically linked or statically. So I dont even know the function name.
I did intercepted some windows-api calls of this game through microsoft Detours, Ninject and some others.
But this one is not in import table either.
What should I do to catch this function call? What profiler should be used? IDA? How this could be done?
EDIT:
Finally found function address. Thanks, Skino!
Tried to hook it with Detours, injected dll. Injected DllMain:
typedef int (WINAPI *PrintTextType)(char *, int, float , int);
static PrintTextType PrintText_Origin = NULL;
int WINAPI PrintText_Hooked(char * a, int b, float c, int d)
{
return PrintText_Origin(a, b, c , d);
}
HMODULE game_dll_base;
/* game_dll_base initialization goes here */
BOOL APIENTRY DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
if(fdwReason==DLL_PROCESS_ATTACH)
{
DisableThreadLibraryCalls(hinstDLL);
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
PrintText_Origin = (PrintTextType)((DWORD)game_dll_base + 0x6049B0);
DetourAttach((PVOID *)&PrintText_Origin , PrintText_Hooked);
DetourTransactionCommit();
}
}
It hooks as expected. Parameter a has text that should be displayed. But when calling original function return PrintText_Origin (a, b, c , d); application crashes(http://i46.tinypic.com/ohabm.png, http://i46.tinypic.com/dfeh4.png)
Original function disassembly:
http://pastebin.com/1Ydg7NED
After Detours:
http://pastebin.com/eM3L8EJh
EDIT2:
After Detours:
http://pastebin.com/GuJXtyad
PrintText_Hooked disassembly http://pastebin.com/FPRMK5qt w3_loader.dll is the injected dll
Im bad at ASM, please tell what can be wrong ?
Want to intercept game's call for a function that prints text.
You can use a debugger for the investigative phase. Either IDA, or even Visual Studio (in combination with e.g. HxD), should do. It should be relatively easy to identify the function using the steps below:
Identify a particular fragment of text whose printing you want to trace (e.g. Hello World!)
Break the game execution at any point before the game normally prints the fragment you identified above
Search for that fragment of text† (look for either Unicode or ANSI) in the game's memory. IDA will allow you to do that IIRC, as will the free HxD (Extras > Open RAM...)
Once the address of the fragment has been identified, set a break-on-access/read data breakpoint so the debugger will give you control the moment the game attempts to read said fragment (while or immediately prior to displaying it)
Resume execution, wait for the data breakpoint to trigger
Inspect the stack trace and look for a suitable candidate for hooking
Step through from the moment the fragment is read from memory until it is printed if you want to explore additional potential hook points
†provided text is not kept compressed (or, for whatever reason, encrypted) until the very last moment
Once you are done with the investigative phase and you have identified where you'd like to inject your hook, you have two options when writing your launcher:
If, based on the above exercise, you were able to identify an export/import after all, then use any API hooking techniques
EDIT Use Microsoft Detours, making sure that you first correctly identify the calling convention (cdecl, fastcall, stdcall) of the function you are trying to detour, and use that calling convention for both the prototype of the original as well as for the implementation of the dummy. See examples.
If not, you will have to
use the Debugging API to programatically load the game
compute the hook address based on your investigative phase (either as a hard-coded offset from the module base, or by looking for the instruction bytes around the hook site‡)
set a breakpoint
resume the process
wait for the breakpoint to trigger, do whatever you have to do
resume execution, wait for the next trigger etc. again, all done programatically by your launcher via the Debugging API.
‡to be able to continue to work with eventual patch releases of the game
At this stage it sounds like you don't have a notion of what library function you're trying to hook, and you've stated it's not (obviously at least) an imported external function in the import table which probably means that the function responsible for generating the text is likely located inside the .text of the application you are disassembling directly or loaded dynamically, the text generation (especially in a game) is likely a part of the application.
In my experience, this simplest way to find code that is difficult to trace such as this is by stopping the application shortly during or before/after text is displayed and using IDA's fabulous call-graph functionality to establish what is responsible for writing it out (use watches and breakpoints liberally!)
Look carefully to calls to CreateRemoteThread or any other commonly used dynamic loading mechanism if you have reason to believe this functionality might be provided by an exported function that isn't showing up in the import table.
I strongly advice against it but for the sake of completeness, you could also hook NtSetInformationThread in the system service dispatch table. here's a good dump of the table for different Windows versions here. If you want to get the index in the table yourself you can just disassemble the NtSetInformationThread export from ntdll.dll.
I've started to learn Linux driver programs, but I'm finding it a little difficult.
I've been studying the i2c driver, and I got quite confused regarding the entry-point of the driver program. Does the driver program start at the MOUDULE_INIT() macro?
And I'd also like to know how I can know the process of how the driver program runs. I got the book, Linux Device Driver, but I'm still quite confused. Could you help me? Thanks a lot.
I'll take the i2c driver as an example. There are just so many functions in it, I just wanna know how I can get coordinating relation of the functions in the i2c drivers?
A device driver is not a "program" that has a main {} with a start point and exit point. It's more like an API or a library or a collection of routines. In this case, it's a set of entry points declared by MODULE_INIT(), MODULE_EXIT(), perhaps EXPORT_SYMBOL() and structures that list entry points for operations.
For block devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/blkdev.h):
struct block_device_operations {
int (*open) ();
int (*release) ();
int (*ioctl) ();
int (*compat_ioctl) ();
int (*direct_access) ();
unsigned int (*check_events) ();
/* ->media_changed() is DEPRECATED, use ->check_events() instead */
int (*media_changed) ();
void (*unlock_native_capacity) ();
int (*revalidate_disk) ();
int (*getgeo)();
/* this callback is with swap_lock and sometimes page table lock held */
void (*swap_slot_free_notify) ();
struct module *owner;
};
For char devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/fs.h):
struct file_operations {
struct module *owner;
loff_t (*llseek) ();
ssize_t (*read) ();
ssize_t (*write) ();
ssize_t (*aio_read) ();
ssize_t (*aio_write) ();
int (*readdir) ();
unsigned int (*poll) ();
long (*unlocked_ioctl) ();
long (*compat_ioctl) ();
int (*mmap) ();
int (*open) ();
int (*flush) ();
int (*release) ();
int (*fsync) ();
int (*aio_fsync) ();
int (*fasync) ();
int (*lock) ();
ssize_t (*sendpage) ();
unsigned long (*get_unmapped_area)();
int (*check_flags)();
int (*flock) ();
ssize_t (*splice_write)();
ssize_t (*splice_read)();
int (*setlease)();
long (*fallocate)();
};
For platform devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/platform_device.h):
struct platform_driver {
int (*probe)();
int (*remove)();
void (*shutdown)();
int (*suspend)();
int (*resume)();
struct device_driver driver;
const struct platform_device_id *id_table;
};
The driver, especially char drivers, does not have to support every operation listed. Note that there are macros to facilitate the coding of these structures by naming the structure entries.
Does the driver program starts at the MOUDLUE_INIT() macro?
The driver's init() routine specified in MODULE_INIT() will be called during boot (when statically linked in) or when the module is dynamically loaded. The driver passes its structure of operations to the device's subsystem when it registers itself during its init().
These device driver entry points, e.g. open() or read(), are typically executed when the user app invokes a C library call (in user space) and after a switch to kernel space. Note that the i2c driver you're looking at is a platform driver for a bus that is used by leaf devices, and its functions exposed by EXPORT_SYMBOL() would be called by other drivers.
Only the driver's init() routine specified in MODULE_INIT() is guaranteed to be called. The driver's exit() routine specified in MODULE_EXIT() would only be executed if/when the module is dynamically unloaded. The driver's op routines will be called asynchronously (just like its interrupt service routine) in unknown order. Hopefully user programs will invoke an open() before issuing a read() or an ioctl() operation, and invoke other operations in a sensible fashion. A well-written and robust driver should accommodate any order or sequence of operations, and produce sane results to ensure system integrity.
It would probably help to stop thinking of a device driver as a program. They're completely different. A program has a specific starting point, does some stuff, and has one or more fairly well defined (well, they should, anyway) exit point. Drivers have some stuff to do when the first get loaded (e.g. MODULE_INIT() and other stuff), and may or may not ever do anything ever again (you can forcibly load a driver for hardware your system doesn't actually have), and may have some stuff that needs to be done if the driver is ever unloaded. Aside from that, a driver generally provides some specific entry points (system calls, ioctls, etc.) that user-land applications can access to request the driver to do something.
Horrible analogy, but think of a program kind of like a car - you get in, start it up, drive somewhere, and get out. A driver is more like a vending machine - you plug it in and make sure it's stocked, but then people just come along occasionaly and push buttons to make it do something.
Actually you are taking about (I2C) platform (Native)driver first you need to understand how MOUDULE_INIT() of platform driver got called versus other loadable modules.
/*
* module_init() - driver initialization entry point
* #x: function to be run at kernel boot time or module insertion
* module_init() will either be called during do_initcalls() (if
* builtin) or at module insertion time (if a module). There can only
* be one per module.*/
and for i2c driver you can refer this link http://www.linuxjournal.com/article/7136 and
http://www.embedded-bits.co.uk/2009/i2c-in-the-2632-linux-kernel/
Begin of a kernel module is starting from initialization function, which mainly addressed with macro __init just infront of the function name.
The __init macro indicate to linux kernel that the following function is an initialization function and the resource that will use for this initialization function will be free once the code of initialization function is executed.
There are other marcos, used for detect initialization and release function, named module_init() and module_exit() [as described above].
These two macro are used, if the device driver is targeted to operate as loadable and removeable kernel module at run time [i.e. using insmod or rmmod command]
IN short and crisp way : It starts from .probe and go all the way to init as soon you do insmod .This also registers the driver with the driver subsystem and also initiates the init.
Everytime the driver functionalities are called from the user application , functions are invoked using the call back.
"Linux Device Driver" is a good book but it's old!
Basic example:
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Name and e-mail");
MODULE_DESCRIPTION("my_first_driver");
static int __init insert_mod(void)
{
printk(KERN_INFO "Module constructor");
return 0;
}
static void __exit remove_mod(void)
{
printk(KERN_INFO "Module destructor");
}
module_init(insert_mod);
module_exit(remove_mod);
An up-to-date tutorial, really well written, is "Linux Device Drivers Series"