I would like to modify the Linux kernel.
I would like to use functions from a shared library (a .so file) in kernel/panic.c.
Unfortunately, I don't know how to compile it.
When I add the library to the Makefile I receive the following error:
ld: attempted static link of dynamic object.
Is there a way to link the shared library into the Linux kernel, or do I need to recompile my library to get an object file?
It is not possible to link a shared library into kernel code (ELF shared objects are a user-space thing, loaded by ld-linux(8)...). You should consider making a kernel module instead (and use modprobe(8) to load it). Read the Loadable Kernel Module HOWTO.
Kernel modules (*.ko) are conceptually similar to shared objects (*.so), but the linking mechanism is different.
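For a sense of what that looks like, here is a minimal module sketch (the module name, function names, and messages are placeholders); it is built against the kernel's kbuild system with the usual obj-m makefile and loaded with insmod or modprobe, not linked against any .so:

    /* hello_mod.c -- minimal loadable kernel module sketch (names are placeholders) */
    #include <linux/init.h>
    #include <linux/module.h>

    MODULE_LICENSE("GPL");

    static int __init hello_mod_init(void)
    {
            pr_info("hello_mod: loaded\n");     /* printk-based logging, no printf here */
            return 0;
    }

    static void __exit hello_mod_exit(void)
    {
            pr_info("hello_mod: unloaded\n");
    }

    module_init(hello_mod_init);
    module_exit(hello_mod_exit);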
BTW, you should generally avoid writing kernel code and prefer application code; modifying the kernel is usually a bad idea and is frowned upon.
Also, the API available in kernel space is not the same as the user-space API (which includes the C standard library and POSIX functions). For example, kernel modules (and kernel code in general) don't have, and so cannot call, fopen, fprintf, or fork; the kernel is a freestanding C program. Kernel code also cannot use any floating-point operations!
Userland applications interact with the kernel through the system calls listed in syscalls(2) (the libc uses them under the hood, e.g. for printf or system(3)). Kernel code (including kernel modules) cannot invoke syscalls directly, since syscalls are what the kernel itself provides; see syscalls(2).
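To make that boundary concrete, here is a small user-space sketch (the message text is arbitrary) showing the same write(2) system call reached through the libc wrapper and through the generic syscall(2) wrapper:

    /* user-space sketch: write(2) via the libc wrapper and via syscall(2) */
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
            const char msg[] = "hello via a system call\n";

            write(STDOUT_FILENO, msg, strlen(msg));               /* libc wrapper */
            syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));  /* raw syscall number */
            return 0;
    }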
Read also Advanced Linux Programming (mostly about application programming) and Operating Systems: Three Easy Pieces (to get a broader view about OSes).
Related
I wrote a kernel-mode driver using C. When I examined it using Dependency Walker I saw that it depends on some NT*.dll files and HAL.dll.
I have several questions:
When does the OS load these DLLs? I thought the kernel is responsible for loading DLLs; in that case, how can a driver load a DLL if it is already in kernel mode?
Why don't the standard C dependencies (ucrtbase, concrt, vcruntime, msvcp, etc.) show up? Would it be possible for a driver to have these dependencies and still function?
(A continuation of the last question.) If Windows will still load DLLs even in kernel mode, I don't see why drivers cannot be written in (MS) C++.
Thanks,
Most of the APIs a driver uses are exported by ntoskrnl.exe.
Your driver is actually a "kernel module", which is part of a process, just like modules are in Ring3.
The driver's "process" is "System", with a PID of 4, which you can see in Task Manager.
ntoskrnl.exe and HAL.dll are modules in "System"; they are loaded at system startup, while other modules (such as your driver) are loaded when they are first needed.
You can write and load "driver DLLs", but I haven't done so yet, so I can't answer that.
Ring3 modules are not loaded into the kernel, so you can't call most common Ring3 APIs, but Microsoft provides kernel-mode alternatives for most of them.
You can't load a Ring3 module directly into the kernel and call its exported functions. There may be some very complicated methods or tricks to do this, but it's really not necessary.
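As a rough illustration of how a driver uses only what ntoskrnl.exe exports (the driver name, callback name, and messages here are arbitrary), a minimal WDM-style entry point looks something like this:

    /* minimal kernel-mode driver sketch: everything it calls (DbgPrint, the WDM
       types) comes from ntoskrnl.exe via the WDK headers -- no ucrtbase/vcruntime */
    #include <ntddk.h>

    DRIVER_UNLOAD SampleUnload;

    VOID SampleUnload(PDRIVER_OBJECT DriverObject)
    {
        UNREFERENCED_PARAMETER(DriverObject);
        DbgPrint("sample: unloading\n");
    }

    NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
    {
        UNREFERENCED_PARAMETER(RegistryPath);
        DriverObject->DriverUnload = SampleUnload;
        DbgPrint("sample: DriverEntry\n");
        return STATUS_SUCCESS;
    }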
You can write drivers in C++, but Microsoft does not officially recommend it at this time, because you will run into many problems, such as:
Constructors and destructors of global variables cannot be called automatically.
You can't use C++ standard libraries directly.
You can't use new and delete directly; you have to provide your own operator new and operator delete.
C++ exceptions cannot be used directly, and supporting them manually consumes a lot of stack space. Ring0 driver stacks are usually much smaller than Ring3 application stacks, so this can easily cause a BSOD.
Fortunately:
Some great people have solved most of the problems, such as the automatic calling of constructors and destructors and the use of standard libraries.
GitHub Project Link (But I still don't recommend using standard libraries in the kernel unless it's necessary, because they are too complex and large and can lead to some unanticipated issues)
A friend told me that Microsoft apparently has a small team currently working on C++ support for drivers, but I haven't had time to confirm this.
I've been reading that in most cases (like gcc) the compiler reads source code in a high-level language and spits out the corresponding machine code. Now, machine code by definition is the code that a processor can understand directly. So machine code should only be machine (processor) dependent and OS independent. But this is not the case: even if two different operating systems are running on the same processor, I cannot run the same compiled file (.exe for Windows or .out for Linux) on both operating systems.
So, what am I missing? Is the output of gcc (and most compilers) not machine code? Or is machine code not the lowest level of code, and the OS translates it further into a set of instructions that the processor can execute?
You are confusing a few things. A retargetable compiler like gcc (and other generic compilers) compiles source files to object files; the linker then links those objects with other libraries as needed to make a so-called binary, which the operating system can read, parse, load the loadable sections of, and start executing.
A sane compiler author will have the compiler output assembly language; then the compiler driver, or the user in their makefile, calls the assembler, which creates the object file. This is how gcc works, and roughly how clang works, although llc can now produce objects directly rather than only assembly that then gets assembled.
It makes far more sense to generate debuggable assembly language than to produce raw machine code directly. You really need a good reason, like JIT compilation, to skip that step. I would avoid toolchains that go straight to machine code just because they can; they are harder to maintain and more likely to have bugs, or to take longer to have bugs fixed.
If the architecture is the same, there is no reason why you can't have a generic toolchain generate code for incompatible operating systems; the GNU tools, for example, can do this. Operating system differences are not, by definition, at the machine-code level: most are at the high-level-language level. The C libraries you call to create GUI windows, etc. have nothing to do with the machine code or the processor architecture, and for some operating systems the same OS-specific C code can be used on MIPS or ARM or PowerPC or x86. Where things become architecture-specific is the mechanism by which actual system calls are invoked: a specific instruction is often used, and machine code is eventually involved, yes, but there is no reason this can't be written in real or inline assembly.
And then this leads to libraries: even fopen and printf, which are generic C calls, eventually have to make a system call. Much of the library support code can be written in a high-level language that is portable across systems, but there has to be a system- and architecture-specific bit of code for the last mile. You can see this in the glibc sources, or in the hooks into newlib, for example, in other library solutions.
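As a sketch of that last mile, here is a raw write(2) on x86-64 Linux; the syscall number, the registers, and the syscall instruction are the OS- and architecture-specific parts, while the code calling it could be ordinary portable C (the message text is arbitrary):

    /* raw write(2) on x86-64 Linux: the architecture/OS-specific "last mile" */
    static long raw_write(int fd, const void *buf, unsigned long len)
    {
            long ret;
            /* rax = 1 (write), rdi = fd, rsi = buf, rdx = len;
               the syscall instruction clobbers rcx and r11 */
            __asm__ volatile ("syscall"
                              : "=a"(ret)
                              : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                              : "rcx", "r11", "memory");
            return ret;
    }

    int main(void)
    {
            raw_write(1, "hello from a raw system call\n", 29);
            return 0;
    }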
The same is true for other languages like C++ as it is for C. Interpreted languages have additional layers, but their virtual machines are just programs that sit on similar layers.
Low-level programming doesn't mean machine or assembly language; it just means that whatever programming language you are using accesses things at a lower level, below the application or below the operating system, etc.
Compilers produce assembly code, which is a human-readable version of machine code (e.g., instead of 1s and 0s you have actual instructions). However, the assembly/machine code needed to make your program run correctly differs depending on the operating system. So the language the processor understands is the same, but your program needs to talk to the operating system, which is different.
For example, say you're writing a Hello World program. You need to print the phrase "Hello, World" onto the screen. Your program will need to go through the OS to actually do that, and different OSes have different interfaces.
I'm deliberately avoiding technical terms here to keep the answer understandable for beginners. To be more precise, your program needs to go through the operating system to interact with the other hardware on your computer (e.g., keyboard, display). This is done through system calls, which are different for each family of OS.
The machine code that is generated can run on any processor of the same type it was generated for. The challenge is that your code will interact with other modules or programs on the system, and to do that you need conventions for calling and returning. The generated code assumes a runtime environment (the OS) as well as library support (calling conventions). Those are not consistent across operating systems.
So things break when they need to transition to, and depend on, other modules using the conventions defined by the operating system.
Even if the machine code instructions were identical for the compiled program on two different operating systems (not at all likely, since different operating systems provide different services in different ways), the machine code needs to be stored in a format that the host OS can use to load it into a process for execution. And those formats are frequently different between operating systems.
Scenario: two unrelated pieces of software are going to be distributed with their own copy of the same shared library. They will both be installed on the same machine (running Windows), and they're going to be run at the same time.
In this scenario, as I understand it, the two programs won't share the library in memory unless that is somehow arranged explicitly, which doesn't seem to be the norm (correct me if I'm wrong). In other words, most or all of the programs that use this library will have their own copy of it, both in memory and on disk, which is roughly what statically linked programs would have.
Is it preferable for the writers of each program to ship the shared library together with their program rather than link with the library statically, or is the difference negligible?
I am looking to use the __sync_fetch_and_xxx functions for thread-safe shared-memory access in my Linux application on a BeagleBoard and a Gumstix. I can't seem to find the correct header to include. Are these functions only available for kernel development?
Thanks
These are compiler builtins, and they are available for user development. There is no header to include: if gcc on your architecture supports them, it will produce the correct assembly; if not, it will produce an error.
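A minimal user-space sketch (plain pthreads and a process-local counter here, but the same builtin works on memory shared between processes); compile with gcc -pthread:

    /* no special header for the builtin itself; gcc emits the atomic
       instruction sequence for the target (e.g. ldrex/strex loops on ARM) */
    #include <pthread.h>
    #include <stdio.h>

    static int counter;                  /* could just as well live in shared memory */

    static void *worker(void *arg)
    {
            for (int i = 0; i < 100000; i++)
                    __sync_fetch_and_add(&counter, 1);
            return arg;
    }

    int main(void)
    {
            pthread_t a, b;
            pthread_create(&a, NULL, worker, NULL);
            pthread_create(&b, NULL, worker, NULL);
            pthread_join(a, NULL);
            pthread_join(b, NULL);
            printf("counter = %d\n", counter);   /* 200000, no torn updates */
            return 0;
    }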
I want to intercept all file system access that occurs inside of dlopen(). At first, it would seem like LD_PRELOAD or -Wl,-wrap, would be viable solutions, but I have had trouble making them work due to some technical reasons:
ld.so has already mapped its own symbols by the time LD_PRELOAD is processed. It's not critical for me to intercept the initial loading, but the _dl_* worker functions are resolved at this time, so future calls go through them. I think LD_PRELOAD is too late.
Somehow malloc circumvents the issue above, because the malloc() inside of ld.so does not have a functional free(); it just calls memset().
The file system worker functions, e.g. __libc_read(), contained in ld.so are static so I can't intercept them with -Wl,-wrap,__libc_read.
This might all mean that I need to build my own ld.so directly from source instead of linking it into a wrapper. The challenge there is that both libc and rtld-libc are built from the same source. I know that the macro IS_IN_rtld is defined when building rtld-libc, but how can I guarantee that there is only one copy of static data structures while still exporting a public interface function? (This is a glibc build system question, but I haven't found documentation of these details.)
Are there any better ways to get inside dlopen()?
Note: I can't use a Linux-specific solution like FUSE because this is for minimal "compute-node" kernels that do not support such things.
it would seem like LD_PRELOAD or -Wl,-wrap, would be viable solutions
The --wrap solution could not possibly be viable: it works only at (static) link time, and your ld.so and libc.so.6 and libdl.so.2 have all already been linked, so now it is too late to use --wrap.
LD_PRELOAD could have worked, except that ld.so considers the fact that dlopen() calls open() an internal implementation detail. As such, it just calls the internal __open function, bypassing the PLT, and with it your ability to interpose open.
Somehow malloc circumvents the issue
That's because libc supports users who implement their own malloc (e.g. for debugging purposes). So the call to e.g. calloc from dlopen does go through the PLT, and is interposable via LD_PRELOAD.
This might all mean that I need to build my own ld.so directly from source instead of linking it into a wrapper.
What will the rebuilt ld.so do? I think you want it to call __libc_open (in libc.so.6), but that can't possibly work for an obvious reason: it is ld.so that opens libc.so.6 in the first place (at process startup).
You could rebuild ld.so with the call to __open replaced with a call to open. That will cause ld.so to go through PLT, and expose it to LD_PRELOAD interposition.
If you go that route, I suggest that you don't overwrite the system ld.so with your new copy (the chance of making a mistake and rendering the system unbootable is just too great). Instead, install it to e.g. /usr/local/my-ld.so, and then link your binaries with -Wl,--dynamic-linker=/usr/local/my-ld.so.
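For completeness, here is a sketch of the LD_PRELOAD interposer for open() that you would pair with such a rebuilt ld.so (built as a shared object and set via LD_PRELOAD; the logging is deliberately allocation-free):

    /* LD_PRELOAD interposer sketch for open(2); on its own it will not see the
       opens that dlopen() performs internally, but it will once ld.so reaches
       open through the PLT */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <string.h>
    #include <unistd.h>

    int open(const char *path, int flags, ...)
    {
            static int (*real_open)(const char *, int, ...);
            mode_t mode = 0;

            if (!real_open)
                    real_open = (int (*)(const char *, int, ...))
                                dlsym(RTLD_NEXT, "open");

            if (flags & O_CREAT) {          /* mode is only passed with O_CREAT */
                    va_list ap;
                    va_start(ap, flags);
                    mode = va_arg(ap, mode_t);
                    va_end(ap);
            }

            write(2, "open: ", 6);          /* allocation-free logging */
            write(2, path, strlen(path));
            write(2, "\n", 1);

            return real_open(path, flags, mode);
    }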
Another alternative: runtime patching. This is a bit of a hack, but you can (once you gain control in main) simply scan the .text of ld.so, and look for CALL __open instructions. If ld.so is not stripped, then you can find both the internal __open, and the functions you want to patch (e.g. open_verify in dl-load.c). Once you find the interesting CALL, mprotect the page that contains it to be writable, and patch in the address of your own interposer (which can in turn call __libc_open if it needs to), then mprotect it back. Any future dlopen() will now go through your interposer.
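Here is a hedged sketch of just the patching step; the call_site and interposer arguments are assumed to have been located already (by scanning ld.so's mapped .text and reading its symbol table, which is the part omitted here):

    /* rewrite the rel32 of one x86-64 CALL (opcode 0xE8) so it targets our
       interposer instead of the internal __open */
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static int patch_call(uint8_t *call_site, void *interposer)
    {
            long pagesz = sysconf(_SC_PAGESIZE);
            uint8_t *page = (uint8_t *)((uintptr_t)call_site & ~(uintptr_t)(pagesz - 1));
            size_t len = (size_t)(call_site + 5 - page);   /* CALL rel32 is 5 bytes */

            if (call_site[0] != 0xE8)                      /* sanity check */
                    return -1;
            if (mprotect(page, len, PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
                    return -1;

            /* rel32 is relative to the end of the CALL instruction; this assumes
               the interposer is within +/- 2 GiB of the call site */
            int32_t rel = (int32_t)((uint8_t *)interposer - (call_site + 5));
            memcpy(call_site + 1, &rel, sizeof rel);

            return mprotect(page, len, PROT_READ | PROT_EXEC);
    }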