I am writing a driver code for first time. I have checked many resources which state that the initialization function for a kernel module is either init_module() or any another function that is specified using MODULE_INIT() macro. And this initialization function is called when a kernel module is loaded.
But I could not find any information about what happens when initialization function doesn't exist in a Loadable kernel module and is tried to load.
What would be the scenario if the kernel module is in-tree and doesn't have initialization function??
---Thanks in advance..
In the routine do_init_module, it tests whether an init routine is provided. If not, it just skips the call. See https://elixir.bootlin.com/linux/v3.10.108/source/kernel/module.c#L3092
if (mod->init != NULL)
ret = do_one_initcall(mod->init);
Related
I'm trying to figure out how an ebpf program can change the outcome of a function (not a syscall, in my case) in kernel space. I've found numerous articles and blog posts about how ebpf turns the kernel into a programmable kernel, but it seems like every example is just read-only tracing and collecting statistics.
I can think of a few ways of doing this: 1) make a kernel application read memory from an ebpf program, 2) make ebpf change the return value of a function, 3) allow an ebpf program to call kernel functions.
The first approach does not seem like a good idea.
The second would be enough, but as far as I understand it's not easy. This question says syscalls are read-only. This bcc document says it is possible but the function needs to be whitelisted in the kernel. This makes me think that the whitelist is fixed and can only be changed by recompiling the kernel, is this correct?
The third seems to be the most flexible one, and this blog post encouraged me to look into it. This is the one I'm going for.
I started with a brand new 5.15 kernel, which should have this functionality
As the blog post says, I did something no one should do (security is not an issue since I'm just toying with this) and opened every function to ebpf by adding this to net/core/filter.c (which I'm not sure is the correct place to do so):
static bool accept_the_world(int off, int size,
enum bpf_access_type type,
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info)
{
return true;
}
bool export_the_world(u32 kfunc_id)
{
return true;
}
const struct bpf_verifier_ops all_verifier_ops = {
.check_kfunc_call = export_the_world,
.is_valid_access = accept_the_world,
};
How does the kernel know of the existence of this struct? I don't know. None of the other bpf_verifier_ops declared are used anywhere else, so it doesn't seem like there is a register_bpf_ops
Next I was able to install bcc (after a long fight due to many broken installation guides).
I had to checkout v0.24 of bcc. I read somewhere that pahole is required when compiling the kernel, so I updated mine to v1.19.
My python file is super simple, I just copied the vfs example from bcc and simplified it:
bpf_text_kfunc = """
extern void hello_test_kfunc(void) __attribute__((section(".ksyms")));
KFUNC_PROBE(vfs_open)
{
stats_increment(S_OPEN);
hello_test_kfunc();
return 0;
}
"""
b = BPF(text=bpf_text_kfunc)
Where hello_test_kfunc is just a function that does a printk, inserted as a module into the kernel (it is present in kallsyms).
When I try to run it, I get:
/virtual/main.c:25:5: error: cannot call non-static helper function
hello_test_kfunc();
^
And this is where I'm stuck. It seems like it's the JIT that is not allowing this, but who exactly is causing this issue? BCC, libbpf or something else? Do I need to manually write bpf code to call kernel functions?
Does anyone have an example with code of what the lwn blog post I linked talks about actually working?
eBPF is fundamentally made to extend kernel functionality in very specific limited ways. Essentially a very advanced plugin system. One of the main design principles of the eBPF is that a program is not allowed to break the kernel. Therefor it is not possible to change to outcome of arbitrary kernel functions.
The kernel has facilities to call a eBPF program at any time the kernel wants and then use the return value or side effects from helper calls to effect something. The key here is that the kernel always knows it is doing this.
One sort of exception is the BPF_PROG_TYPE_STRUCT_OPS program type which can be used to replace function pointers in whitelisted structures.
But again, explicitly allowed by the kernel.
make a kernel application read memory from an ebpf program
This is not possible since the memory of an eBPF program is ephemaral, but you could define your own custom eBPF program type and pass in some memory to be modified to the eBPF program via a custom context type.
make ebpf change the return value of a function
Not possible unless you explicitly call a eBPF program from that function.
allow an ebpf program to call kernel functions.
While possible for a number for purposes, this typically doesn't give you the ability to change return values of arbitrary functions.
You are correct, certain program types are allowed to call some kernel functions. But these are again whitelisted as you discovered.
How does the kernel know of the existence of this struct?
Macro magic. The verifier builds a list of these structs. But only if the program type exists in the list of program types.
/virtual/main.c:25:5: error: cannot call non-static helper function
This seems to be a limitation of BCC, so if you want to play with this stuff you will likely have to manually compile your eBPF program and load it with libbpf or cilium/ebpf.
Current scenario: ko registers a callback function to the kernel(not a ko) in the ko's init function. Will the kernel call try_module_get/module_get before calling back the function? If it is called, please give a example in kernel. If it is not called, what mechanism is used to ensure that when uninstalling, the kernel does not use ko.Thanks.
How can I read from the PMU from inside Kernel space?
For a profiling task I need to read the retired instructions provided by the PMU from inside the kernel. The perf_event_open systemcall seems to offer this capability. In my source code I
#include <linux/syscalls.h>
set my parameters for the perf_event_attr struct and call the sys_perf_event_open(). The mentioned header contains the function declaration. When checking "/proc/kallsyms", it is confirmed that there is a systemcall with the name sys_perf_event_open. The symbol is globally available indicated by the T:
ffffffff8113fe70 T sys_perf_event_open
So everything should work as far as I can tell.
Still, when compiling or inserting the LKM I get a warning/error that sys_perf_event_open does not exist.
WARNING: "sys_perf_event_open" [/home/vagrant/mods/lkm_read_pmu/read_pmu.ko] undefined!
What do I need to do in order to get those retired instructions counter?
The /proc/kallsyms file shows all kernel symbols defined in the source. Right, the capital T indicates a global symbol in the text section of the kernel binary, but the meaning of "global" here is according to the C language. That is, it can be used in other files of the kernel itself. You can't call a kernel function from a kernel module just because it's global.
Kernel modules can only use kernel symbols that are exported with EXPORT_SYMBOL in the kernel source code. Since kernel 2.6.0, none of the system calls are exported, so you can't call any of them from a kernel module, including sys_perf_event_open. System calls are really designed to be called from user space. What this all means is that you can't use the perf_event subsystem from within a kernel module.
That said, I think you can modify the kernel to add EXPORT_SYMBOL to sys_perf_event_open. That will make it an exported symbol, which means it can be used from a kernel module.
I am trying to monitor all times the PCIe stack writes configures to a device. In the absence of a PCI equivalent of usbmon, I thought to monitor all times the pci_bus_write_config_byte() function is called. I wanted to write a kernel module that essentially did this:
int (*original)(struct pci_bus *, unsigned int, int, u8);
original = &pci_bus_write_config_byte;
pci_bus_write_config_byte = &my_custom_func;
And then my custom function will printk() whatever data is passed, and return the original pci_bus_write_config_byte. However, when I load the module nothing happens. I suspect this is due to some sort of RW protection.
My google searches revealed that set_memory_rw() is supposed to make a function pointer writable, but I am not able to properly include it or use this function - when I go to insmod the module, the kernel says there are unknown symbols.
Any ideas on how one would do this?
I have created one kernel module. within the module i have defined some functions say function1(int n) and function2().
There was no error in compiling and inserting the module. What i don't understand is how to call the function1(n) and function2() from a user space program.
I think there is no direct way to do it, you can't link userspace code with the kernel like you do with a library. First, you have to register your function as syscall and then call the syscall with the syscall() function.
See here
Also some interface between kernel and user space possible using socket communication see
this link
And find use full link related to this topic at right side of page.
You can make your driver to react on writes to a /dev/file file or a /proc/file file.
EDIT
Form name file my point is device is as file in kernel and you can access via ioctl()
the pretty good explanation is http://tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN885
See Link