Why do user space apps need kernel headers? - linux-kernel

I am studying a smartphone project. During the compilation process it installs kernel header files for user-space builds.
Why do user space apps need kernel headers?

In general, those headers are needed because userspace applications often talk to the kernel, passing data back and forth. To do this, both sides have to agree on the structure of the data being passed.
Most of the kernel headers are only needed by the libc library (if you're using one), as it usually hides all the low-level details from you by providing abstractions that conform to standards like POSIX (it will usually provide its own include files). Those headers provide, for example, all the syscall numbers and the definitions of all the structures used by syscall arguments.
There are, however, some "custom services" provided by the kernel that are not handled by libc. One example is writing userspace programs that talk directly to hardware drivers. That may require passing data structures (so you need struct definitions), knowing some magic numbers (so you need some defines), and so on.
As an example, take a look at hid-example.c from the kernel sources. It will, for example, make this ioctl call:
struct hidraw_report_descriptor rpt_desc;
[...]
ioctl(fd, HIDIOCGRDESC, &rpt_desc);
But where does it get HIDIOCGRDESC, and how does it know the layout of struct hidraw_report_descriptor? Both are, of course, defined in linux/hidraw.h, which this application includes.
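Putting those pieces together, a stripped-down sketch in the spirit of hid-example.c could look like the following. The ioctl numbers and the descriptor struct come straight from the kernel header; the device path /dev/hidraw0 is just an example, and error handling is kept minimal.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/hidraw.h>   /* kernel header: HIDIOCGRDESC*, struct hidraw_report_descriptor */

int main(void)
{
    struct hidraw_report_descriptor rpt_desc;
    int fd, desc_size = 0;

    fd = open("/dev/hidraw0", O_RDONLY);   /* example device node */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Both the magic numbers and the struct layout are agreed on via the header */
    ioctl(fd, HIDIOCGRDESCSIZE, &desc_size);
    rpt_desc.size = desc_size;
    ioctl(fd, HIDIOCGRDESC, &rpt_desc);

    printf("Report descriptor is %d bytes\n", desc_size);
    close(fd);
    return 0;
}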

Related

If both Mac OS and Windows use the x86 instruction set, why do we have to recompile for each platform?

If both Mac OS and Windows, running on Intel processors, use the x86 instruction set, why can't a program written using only C++11 (no OS-specific libraries, frameworks or APIs) run on both without having to recompile for each platform?
Ultimately the program gets compiled to machine code, so if the instruction set is the same, what's the difference? What's really going on?
EDIT: I'm really just talking about a simple "Hello world" program compiled with something like gcc. Not Apps!
EDIT: For example:
#include <iostream>
using namespace std;

int main()
{
    cout << "Hello World!";
    return 0;
}
EDIT: An even simpler program:
int main()
{
    int j = 2;
    j = j + 3;
}
Because a "program" nowadays consists of more than just a blob of binary code. Their file formats are not cross-compatible (PE/COFF vs. ELF vs. Mach-O). It's kind of silly when you think about it, yes, but that's the reality. It wouldn't have to be this way if you could start history over again.
Edit:
You may also want to see my longer answer on SoftwareEngineering.StackExchange (and others').
Even "Hello, world" needs to generate output. That will either be OS calls, BIOS calls at a somewhat lower level, or, as was common in DOS days for performance reasons, direct output to video via I/O calls or memory mapped video. Any of those methods will be highly specific to the operating system and other design issues. In the example you've listed, iostream hides those details and will be different for each target system.
One reason is given by #Mehrdad in their answer: even if the assembly code is the same on all platforms, the way it's "wrapped" into an executable file may differ. Back in the day, there were COM files in MS-DOS. You could load such a file into memory and just start executing it from the very beginning.
Eventually we got read-only memory pages, .bss, non-executable read-write memory pages (non-executable for safety reasons), embedded resources (like icons on Windows), and other things the OS needs to know about before running the code, so it can properly configure the isolated environment for the newly created process. Of course, there are also shared libraries (which have to be loaded by the OS), and any program that does anything meaningful has to output some result via an OS call, i.e. it has to know how to perform system calls.
So it turns out that on modern multi-process OSes, executable files have to contain a lot of meta-information in addition to the code. That's why we have file formats. They are different on different platforms mainly for historical reasons. Think of it as PNG vs. JPEG: both are compressed raster image formats, but they're incompatible and use different compression algorithms and different storage layouts.
no OS Specific libraries, frameworks or API's
That's not quite true. Since we live in a multi-process OS world, no process has any kind of direct access to the hardware, be it the network card or the display. In general, it can only access the CPU and memory (and memory only in a very limited way).
E.g. when you run your program in a terminal, its output has to reach the terminal emulator so it can be displayed in a window, which you can drag across the screen, all transparently to your "Hello World". So the OS gets involved anyway.
Even your "hello world" application has to:
Load the dynamic C++ runtime, which will initialize the cout object before your main starts. Who else would initialize cout and call destructors when main ends?
When you try to print something, your C++ runtime eventually has to make a call to the OS. Nowadays that is typically abstracted away by the C standard library (libc), which has to be loaded dynamically even before the C++ runtime.
That C standard library invokes some x86 instructions which make the system call that "prints" the string on the screen. Note that different OSes and different CPUs (even within the x86 family) have different mechanisms and conventions for system calls. Some use interrupts, some use the specially designed sysenter/syscall instructions (hello from Intel and AMD), some pass arguments in known memory locations, some pass them in registers. Again, that's why this code is abstracted away by the OS's standard library: it typically provides a simple C interface that performs the necessary assembly-level magic.
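To make that last point concrete, here is a minimal sketch of what libc hides, assuming Linux on x86-64 (syscall number 1 = write, arguments in rdi/rsi/rdx, rcx and r11 clobbered by the syscall instruction). The same program on 32-bit x86, or on Windows or macOS, would need completely different code.

#include <stddef.h>

/* Bypass libc and invoke write(2) directly - Linux x86-64 only. */
static long raw_write(int fd, const void *buf, size_t count)
{
    long ret;
    asm volatile("syscall"
                 : "=a"(ret)
                 : "a"(1L), "D"((long)fd), "S"(buf), "d"(count)  /* 1 = __NR_write */
                 : "rcx", "r11", "memory");
    return ret;
}

int main(void)
{
    raw_write(1, "Hello, world!\n", 14);   /* fd 1 = stdout */
    return 0;
}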
All in all, answering your question: because your program has to interact with the OS, and different OSes use completely different mechanisms for that.
If your program has no side effects (like your second example), it still has to be stored in the platform's executable format. And, as those formats differ between platforms, we have to recompile. It's just not worth inventing a common, compatible format for simple programs with no side effects, because such programs are useless.

Which methods/calls perform the disk I/O operations and how to find them?

Which methods and system calls should I hook into, so I can replace how an OS X app (the target) reads and writes to/from the hard disk?
How can I determine that list of functions or system calls?
Adding more context:
This is a final project and I'm looking for advice. The goal is to alter the behavior of an OS X app, adding data encryption and decryption capabilities to it.
Which tools could I use to achieve my goal, and why?
For instance, assume the target app is TextEdit. Instead of saving "hello world" as plain text in a .txt file on disk, it'll save "ifmmnXxnpme". Opening the file will show the original text.
I think it's better to be realistic, or at least conscious, about what you want to do.
The lowest level in software is a kernel module sitting on top of the storage modules, which "encrypts" the data.
In Windows you can stack drivers, so conceptually you simply intercept the read/write call, edit it, and pass it down the driver stack.
Under BSD there is surely an equivalent mechanism, but I don't know precisely what it is.
I don't think you want to dig into kernel programming.
At the lowest level, from a user-space application's point of view, there are the system calls.
The system calls used to read and write are numbers 3 and 4 respectively (see here); in BSD-derived OSes like OS X they become 2000003h and 2000004h (see here).
This is IA-32e specific, since you are using Apple computers.
Files can also be read and written by memory-mapping them, so you would need to hijack the sys_mmap system call too.
This is more complex, as you need to detect page faults or whatever mechanism is used to implement file mapping.
To hijack system calls you need a kernel module again.
The next level of abstraction up is the runtime, which is probably the Obj-C runtime (as of today, Swift still uses the Obj-C runtime, AFAIK).
An Obj-C application uses the Cocoa framework and can read/write a file with calls like [NSData dataWithContentsOfFile: myFileName] or [myData writeToFile: myFileName atomically: myAtomicalBehavior].
There are plenty of Cocoa methods that write to or read from files, but internally the framework funnels them through a few low-level methods of the Obj-C runtime.
I'm not an expert on Cocoa internals, so you'll need to take a debugger and find out what the invocation chain is.
Once you have found the "low-level" methods that read or write files, you can use method swizzling.
If the target app loads your code as part of a library, this is really simple; otherwise you need more clever techniques (like infecting the binary or directly manipulating the memory of the other process). You can google around for more info.
Again, to be honest, this is still a lot of work, although manageable.
You may consider simply hijacking a limited set of Cocoa methods, for example writeToFile:atomically: on NSData (and the equivalents on NSString), and treating the project as a work-in-progress demo.
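For the "your code is loaded as a library" case, a rough sketch of the swizzling mechanism using the Objective-C runtime's C API might look like this (compile it into a dylib, link against libobjc, and get it loaded into the target, e.g. via DYLD_INSERT_LIBRARIES). Note that NSData is a class cluster, so in practice you may have to target a concrete subclass; the encryption step is left out and all names are illustrative only.

#include <objc/runtime.h>
#include <stdio.h>

/* Original implementation of -[NSData writeToFile:atomically:], saved so we can call through. */
static IMP original_writeToFile;

static BOOL swizzled_writeToFile(id self, SEL _cmd, id path, BOOL atomically)
{
    /* ...encrypt the receiver's bytes here before handing off (omitted in this sketch)... */
    fprintf(stderr, "writeToFile:atomically: intercepted\n");
    return ((BOOL (*)(id, SEL, id, BOOL))original_writeToFile)(self, _cmd, path, atomically);
}

__attribute__((constructor))
static void install_hook(void)
{
    Class cls = objc_getClass("NSData");
    SEL sel = sel_registerName("writeToFile:atomically:");
    Method m = class_getInstanceMethod(cls, sel);
    if (m)
        original_writeToFile = method_setImplementation(m, (IMP)swizzled_writeToFile);
}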
A similar question has been asked and answered here.

How should different Linux device tree drivers share common registers?

I'm working on a port of the Linux kernel to an unsupported ARM SoC platform. Unfortunately, on this SoC different peripherals sometimes share registers, or have their registers commingled within the same region of memory. This is giving me grief with the Device Tree specification, which doesn't seem to support the notion of different devices sharing the same set of registers, or registers commingled in the same address range. The various documents I've read on the device tree don't suggest the proper way to handle this.
My simple approach of specifying the same register region in multiple drivers throws "can't request region for resource" for the second device that attempts to map the same register region as another driver. From my understanding, this results from the kernel enforcing device tree rules regarding register regions.
What is the preferred general solution for solving this dilemma? Should there be a higher level driver that marshals access to the shared register region? Are there examples in the existing Linux kernel that address this specific issue (I couldn't find any, but I may not be sure what to look for)?
I am facing exactly the same problem. My solution is to create a separate module that guards the common resources, and then write 'client' modules that use the symbols exported from that common module.
Note that this makes sense from a safety point of view as well: how else would you implement proper locking around the shared memory and ensure coherency of operations across several independent modules?
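As an illustration of that approach, a guard module might look roughly like the sketch below. This is a bare-bones example not tied to any particular SoC: the base address, size, and function names are made up, and a real driver would normally take the address from its own device tree node.

#include <linux/module.h>
#include <linux/io.h>
#include <linux/spinlock.h>

#define SHARED_BASE 0x10000000   /* invented base address of the shared register window */
#define SHARED_SIZE 0x1000

static void __iomem *shared_base;
static DEFINE_SPINLOCK(shared_lock);

/* All read-modify-write access to the shared registers funnels through here. */
void shared_regs_update(unsigned int offset, u32 mask, u32 val)
{
    unsigned long flags;
    u32 tmp;

    spin_lock_irqsave(&shared_lock, flags);
    tmp = readl(shared_base + offset);
    tmp = (tmp & ~mask) | (val & mask);
    writel(tmp, shared_base + offset);
    spin_unlock_irqrestore(&shared_lock, flags);
}
EXPORT_SYMBOL_GPL(shared_regs_update);

static int __init shared_regs_init(void)
{
    shared_base = ioremap(SHARED_BASE, SHARED_SIZE);
    return shared_base ? 0 : -ENOMEM;
}

static void __exit shared_regs_exit(void)
{
    iounmap(shared_base);
}

module_init(shared_regs_init);
module_exit(shared_regs_exit);
MODULE_LICENSE("GPL");

Client drivers then call shared_regs_update() instead of mapping the region themselves, so there is a single owner for the mapping and a single lock serializing access.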
You can still use devm_ioremap() directly but extra caution has to be exercised with some synchronization.
Below is an example from upstream,
https://github.com/torvalds/linux/blob/master/drivers/usb/phy/phy-tegra-usb.c#L1368

Creating a list similar to .ctors from multiple object files

I'm currently at a point where I need to link in several modules (basically ELF object files) to my main executable due to a limitation of our target (background: kernel, targeting the ARM architecture). On other targets (x86 specifically) these object files would be loaded at runtime and a specific function in them would be called. At shutdown another function would be called. Both of these functions are exposed to the kernel as symbols, and this all works fine.
When the object files are statically linked, however, there's no way for the kernel to "detect" their presence, so to speak, and therefore I need a way of telling the kernel about the init/fini functions without hardcoding them into the kernel - it needs to be extensible. I thought a solution might be to put all the init/fini function pointers into their own section - in much the same way as .ctors and .dtors - and call through them at the relevant time.
Note that they can't actually go into .ctors, as they require specific support to be running by the time they're called (specifically threads and memory management, if you're interested).
What's the best way of going about putting a bunch of arbitrary function pointers into a specific section? Even better - is it possible to put arbitrary data into a section, so I could also store things like the module name (basically a struct rather than a bare function pointer)? This is GCC targeting arm-elf.
GCC attributes can be used to specify a section:
__attribute__((section("foobar")))
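Here is a minimal sketch of how that can be combined with the start/stop symbols the GNU linker generates automatically for sections whose names are valid C identifiers (GCC plus GNU ld on ELF targets; the section and symbol names below are only illustrative):

#include <stdio.h>

typedef void (*init_fn)(void);

#define REGISTER_INIT(fn) \
    static init_fn fn##_entry __attribute__((used, section("modinit"))) = fn

static void module_a_init(void) { puts("module A init"); }
REGISTER_INIT(module_a_init);

/* Symbols created automatically by the GNU linker for section "modinit". */
extern init_fn __start_modinit[];
extern init_fn __stop_modinit[];

int main(void)
{
    for (init_fn *p = __start_modinit; p < __stop_modinit; p++)
        (*p)();
    return 0;
}

Storing a struct per module (e.g. a name plus init/fini pointers) in the same section works the same way; the walker just iterates over structs instead of bare pointers.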

C Runtime objects, dll boundaries

What is the best way to design a C API for DLLs which deals with the problem of passing "objects" that are C-runtime dependent (FILE*, pointers returned by malloc, etc.)? For example, if two DLLs are linked against different versions of the runtime, my understanding is that you cannot safely pass a FILE* from one DLL to the other.
Is the only solution to use Windows-specific APIs (which are guaranteed to work across DLLs)? The C API already exists and is mature, but it was designed mostly from a Unix point of view (and still has to work on Unix, of course).
You asked for a C, not a C++ solution.
The usual methods for doing this kind of thing in C are:
Design the module's API so that it simply doesn't require CRT objects. Pass things across in raw C types - i.e. have the consumer load the file and simply pass you the pointer. Or have the consumer pass a fully qualified file name that is opened, read, and closed internally.
An approach used by other C modules - the MS Cabinet SDK and parts of the OpenSSL library come to mind, IIRC - is to have the consuming application pass pointers to functions into the initialization function. So any API you would pass a FILE* to would, at some point during initialization, have taken a pointer to a struct of function pointers matching the signatures of fread, fopen, etc. When dealing with those external FILE*s, the DLL always uses the passed-in functions rather than its own CRT functions (see the sketch after this list).
With some simple tricks like this you can make your C DLL's interface entirely independent of the host's CRT - or, in fact, not require the host to be written in C or C++ at all.
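A hypothetical sketch of that second approach, with all names invented for illustration: the host fills a table of function pointers with thin wrappers around its own fopen/fwrite/fclose, and the DLL only ever goes through that table, so every FILE* stays inside the CRT that created it.

#include <stddef.h>

typedef struct mylib_io {
    void  *(*open_fn)(const char *path, const char *mode);
    size_t (*read_fn)(void *buf, size_t size, size_t count, void *stream);
    size_t (*write_fn)(const void *buf, size_t size, size_t count, void *stream);
    int    (*close_fn)(void *stream);
} mylib_io;

static mylib_io g_io;   /* filled once by the host at init time */

__declspec(dllexport) void mylib_init(const mylib_io *io)
{
    g_io = *io;
}

__declspec(dllexport) int mylib_save(const char *path, const void *data, size_t len)
{
    size_t written;
    void *f = g_io.open_fn(path, "wb");

    if (!f)
        return -1;
    written = g_io.write_fn(data, 1, len, f);
    g_io.close_fn(f);
    return written == len ? 0 : -1;
}

The host-side wrappers are trivial (open_fn calls its own fopen and returns the FILE* as a void*, and so on), and the DLL never interprets the stream pointer itself.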
Neither existing answer is correct. Consider the following situation on Windows: you have two DLLs, each statically linked with a different version of the C/C++ standard library.
In this case, you should not pass pointers to structures created by the C/C++ standard library in one DLL to the other. The reason is that these structures may be different between the two C/C++ standard library implementations.
The other thing you should not do is free, in one DLL, a pointer that was allocated by new or malloc in the other. The heap manager may be implemented differently as well.
Note, you can use the pointers between the DLLs - they just point to memory. It is the free that is the issue.
Now, you may find that this works anyway, but if it does, you are just getting lucky. It is likely to cause you problems in the future.
One potential solution to your problem is to link the CRT dynamically. For example, you could dynamically link against MSVCRT.DLL. That way your DLLs will always use the same CRT.
Note, I suggest that passing CRT data structures between DLLs is not a best practice. You might want to see if you can factor things better.
Note, I am not a Linux/Unix expert - but you will have the same issues on those OSes as well.
The problem with the different runtimes isn't solvable, because the FILE* struct belongs to one particular runtime on a Windows system.
But if you write a small wrapper interface, you're done, and it does not really hurt:
class IFile;
IFile* __stdcall IFileFactory(const char* filename, const char* mode);

class IFile {
public:
    virtual size_t fwrite(const void* buffer, size_t size, size_t count) = 0;
    virtual size_t fread(void* buffer, size_t size, size_t count) = 0;
    virtual void destroy() = 0;   // released by the DLL that created the object
};
This is safe to pass across DLL boundaries everywhere and does not really hurt.
P.S.: Be careful if you start throwing exceptions across DLL boundaries. This will work quite well if you fulfill certain design criteria on Windows, but will fail on some other OSes.
If the C API exists and is mature, bypassing the CRT internally by using pure Win32 API calls gets you halfway. The other half is making sure the DLL's user uses the corresponding Win32 API functions. This will make your API less portable, in both use and documentation. Also, even if you go this way for memory allocation, where both the CRT functions and the Win32 ones deal with void*, you're still in trouble with the file stuff - the Win32 API uses handles and knows nothing about the FILE structure.
I'm not quite sure what the limitations of the FILE* are, but I assume the problem is the same as with CRT allocations across modules. MSVCRT uses Win32 internally to handle file operations, and the underlying file handle can be used from any module within the same process. What might not work is closing a file that was opened by another module, which involves freeing the FILE structure on a possibly different CRT.
What I would do, if changing the API is still an option, is export cleanup functions for every kind of "object" created within the DLL. These cleanup functions dispose of the given object in the way that corresponds to how it was created within that DLL. This also makes the DLL absolutely portable in terms of usage. The only worry left is making sure the DLL's user actually calls your cleanup functions rather than the regular CRT ones. That can be ensured with several tricks, which deserve another question...
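A Windows-flavoured sketch of that idea, with invented names: the DLL exports a matching destroy function for each kind of object it hands out, so allocation and deallocation always happen inside the same CRT.

#include <stdio.h>
#include <stdlib.h>

typedef struct mylib_file mylib_file;   /* opaque to the caller */

struct mylib_file {
    FILE *fp;
};

__declspec(dllexport) mylib_file *mylib_open(const char *path, const char *mode)
{
    mylib_file *f = malloc(sizeof *f);
    if (!f)
        return NULL;
    f->fp = fopen(path, mode);
    if (!f->fp) {
        free(f);
        return NULL;
    }
    return f;
}

__declspec(dllexport) void mylib_close(mylib_file *f)
{
    if (!f)
        return;
    fclose(f->fp);   /* closed and freed by the same CRT that created them */
    free(f);
}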
