What is the real significance of __cpuinit in the Linux kernel code?
I have come across normal kernel code accessing __cpuinit functions. This gives me loads of modpost warnings. Is this normal or a serious error?
__cpuinit actually tells the compiler to put the function into the specified elf section.
#define __cpuinit __section(.cpuinit.text) __cold
The kernel source says in include/linux/init.h:
/* modpost check for section mismatches during the kernel build.
* A section mismatch happens when there are references from a
* code or data section to an init section (both code or data).
* The init sections are (for most archs) discarded by the kernel
* when early init has completed so all such references are potential bugs.
* For exit sections the same issue exists. ......
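As a user-space illustration of the mechanism (not kernel code), here is a minimal sketch assuming GCC or Clang targeting ELF. The macro name __demo_init, the section name .demo_init.text, and both function names are hypothetical. They only mirror how a macro like __cpuinit expands to a section attribute plus a cold hint.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel macro: place the function in a
 * named ELF section and hint that it is rarely executed. */
#define __demo_init __attribute__((section(".demo_init.text"), cold))

/* Lives in .demo_init.text instead of the default .text section. */
__demo_init int demo_setup(int x)
{
    return x + 1;
}

/* Ordinary function in .text. A call like this, from a normal section
 * into an init section, is the kind of reference modpost reports in the
 * kernel, because the init section may already have been discarded. */
int call_setup(int x)
{
    return demo_setup(x);
}
```

Running objdump -h on the resulting object file should list .demo_init.text as a separate section. In user space nothing discards it, so call_setup(41) simply returns 42.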
How can I read from the PMU from inside Kernel space?
For a profiling task I need to read the retired instructions provided by the PMU from inside the kernel. The perf_event_open system call seems to offer this capability. In my source code I
#include <linux/syscalls.h>
set my parameters for the perf_event_attr struct and call sys_perf_event_open(). The mentioned header contains the function declaration. Checking "/proc/kallsyms" confirms that there is a system call with the name sys_perf_event_open. The symbol is globally available, as indicated by the T:
ffffffff8113fe70 T sys_perf_event_open
So everything should work as far as I can tell.
Still, when compiling or inserting the LKM I get a warning/error that sys_perf_event_open does not exist.
WARNING: "sys_perf_event_open" [/home/vagrant/mods/lkm_read_pmu/read_pmu.ko] undefined!
What do I need to do in order to read the retired-instructions counter?
The /proc/kallsyms file shows all kernel symbols defined in the source. You are right that the capital T indicates a global symbol in the text section of the kernel binary, but "global" here has its C-language meaning: the symbol can be used from other files of the kernel itself. You can't call a kernel function from a kernel module just because it's global.
Kernel modules can only use kernel symbols that are exported with EXPORT_SYMBOL in the kernel source code. Since kernel 2.6.0, none of the system calls are exported, so you can't call any of them from a kernel module, including sys_perf_event_open. System calls are really designed to be called from user space. What this means is that you can't reach the perf_event subsystem through its system call from within a kernel module.
That said, I think you can modify the kernel to add EXPORT_SYMBOL to sys_perf_event_open. That will make it an exported symbol, which means it can be used from a kernel module.
I am using a syscall checker in combination with -fsanitize=address and when ASAN finds a bug, it calls some syscalls (ioctl(ISATTY), etc) when printing out the report. The syscall checker interrupts ASAN's ioctls and the error report is not collected properly.
What I would like is for ASAN to simply abort without printing the report, or failing that, a way to determine (using a libasan4 API call maybe) that ASAN found an error, so I can stop the syscall checker from intercepting syscalls.
Unfortunately __asan_error_report, __sanitizer_set_death_callback and __asan_set_error_report_callback from libasan4 all kick in after ASAN has collected the report:
0 __asan_error_report()
1 syscall_checker()
2 ioctl(ISATTY)
3 asan::PrintReport()
4 app_code_that_crashes()
And the syscall checker does not handle ASAN's ioctl() calls properly, so it exit()s normally, while I am hoping to keep ASAN's behavior of abort()ing.
You should be able to intercept before the report is printed by overriding __asan_on_error (declared in asan_interface.h, empty by default):
// User may provide function that would be called right when ASan detects
// an error. This can be used to notice cases when ASan detects an error, but
// the program crashes before ASan report is printed.
void __asan_on_error();
Note that, because of the way the ASan callback interface works, you should implement this callback in the main binary (definitions in shared libraries are likely to be unable to override the default definition from libasan.a).
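A minimal sketch of such an override, assuming the syscall checker can poll a flag. The names asan_error_seen and checker_should_pass_through are hypothetical and not part of any ASan API. Only __asan_on_error comes from asan_interface.h. In a C++ program the definition would also need extern "C".

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag the syscall checker can poll to learn that ASan
 * has detected an error. */
volatile sig_atomic_t asan_error_seen = 0;

/* Overrides the empty default from the ASan runtime. Per the header
 * comment, ASan calls this right when it detects an error. Keep it
 * minimal: the process is already in a failing state. */
void __asan_on_error(void)
{
    asan_error_seen = 1;
}

/* Hypothetical hook for the syscall checker: nonzero means "stop
 * intercepting", so ASan's own ioctl() calls go through and it can
 * finish the report and abort. */
int checker_should_pass_through(void)
{
    return asan_error_seen != 0;
}
```

The checker would call checker_should_pass_through() at the top of its interception path and forward the syscall untouched once it returns nonzero.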
I'm intrigued by the DISCARDABLE flag in the section flags in PE files, specifically in the context of Windows drivers (in this case NDIS). I noticed that the INIT section was marked as RWX in a driver I'm reviewing, which seems odd - good security practice says you should adopt a W^X policy.
The dump of the section is as follows:
Name  Virtual Size  Virtual Addr  Raw Size  Raw Addr  Reloc Addr  LineNums  RelocCount  LineNumCount  Characteristics
INIT  00000B7E      0000E000      00000C00  0000B200  00000000    00000000  0000        0000          E2000020
The characteristics map to:
IMAGE_SCN_MEM_EXECUTE
IMAGE_SCN_MEM_READ
IMAGE_SCN_MEM_WRITE
IMAGE_SCN_MEM_DISCARDABLE
IMAGE_SCN_CNT_CODE
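The mapping from E2000020 to these flags can be checked with a short sketch. The bit values are the IMAGE_SCN_* constants from winnt.h, redefined here under shorter names so the code builds without Windows headers. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Section characteristic bits, same values as IMAGE_SCN_* in winnt.h. */
#define SCN_CNT_CODE        0x00000020u
#define SCN_MEM_DISCARDABLE 0x02000000u
#define SCN_MEM_EXECUTE     0x20000000u
#define SCN_MEM_READ        0x40000000u
#define SCN_MEM_WRITE       0x80000000u

/* Nonzero if the section is both writable and executable (violates W^X). */
int section_is_wx(uint32_t characteristics)
{
    return (characteristics & SCN_MEM_WRITE) != 0 &&
           (characteristics & SCN_MEM_EXECUTE) != 0;
}

/* Prints the name of each of the five flags above that is set and
 * returns how many were set. */
int print_section_flags(uint32_t characteristics)
{
    static const struct { uint32_t bit; const char *name; } flags[] = {
        { SCN_MEM_EXECUTE,     "IMAGE_SCN_MEM_EXECUTE" },
        { SCN_MEM_READ,        "IMAGE_SCN_MEM_READ" },
        { SCN_MEM_WRITE,       "IMAGE_SCN_MEM_WRITE" },
        { SCN_MEM_DISCARDABLE, "IMAGE_SCN_MEM_DISCARDABLE" },
        { SCN_CNT_CODE,        "IMAGE_SCN_CNT_CODE" },
    };
    int count = 0;
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (characteristics & flags[i].bit) {
            puts(flags[i].name);
            count++;
        }
    }
    return count;
}
```

Calling print_section_flags(0xE2000020u) lists all five flags, and section_is_wx() reports the W^X violation. Clearing the write bit gives 0x62000020, which no longer trips the check.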
The INIT section seems to contain the driver entry, which implies that it might be used to ensure that the driver entry function resides in nonpaged memory, whereas the rest of the code is allowed to be paged. I'm not entirely sure, though. I can see no evidence in the driver code to say that the developers explicitly set the page flags, or forced the driver entry into a separate section, so it looks like the compiler did it automatically. I also manually flipped the writeable flag in the driver binary to test it out, and it works fine without writing enabled, so that implies that having it RWX is unnecessary.
So, my questions are:
What is the INIT section used for in the context of a Windows driver and why is it marked discardable?
How are discardable sections treated in the Windows kernel? I have some idea of how ReactOS handles them but that's still fuzzy and not massively helpful.
Why would the compiler move the driver entry to an INIT section?
Why would the compiler mark the section as RWX, when RX is sufficient and RWX may constitute a security issue?
References I've looked at so far:
What happens when you mark a section as DISCARDABLE? - The Old New Thing
Windows Executable Files - x86 Disassembly Book
Pageable and Discardable Code in a Protocol Driver - MSDN
EDIT, 2022: I forgot to update this, but a while after I posted this question I passed it on to Microsoft and it did turn out to be a bug in the MSVC linker. They were mistakenly marking the discard section that contained DriverEntry as RWX. The issue was fixed in VS2015.
What is the INIT section used for in the context of a Windows...
It is normally used for the DriverEntry() function.
How are discardable sections treated in the Windows kernel?
It allows the page(s) that contain the DriverEntry() function code to be discarded. They are no longer needed after the driver is initialized.
Why would the compiler move the driver entry to an INIT section?
An NDIS driver normally contains
#pragma NDIS_INIT_FUNCTION(DriverEntry)
Which is a macro in the WDK's inc/ddk/ndis.h header file:
#define NDIS_INIT_FUNCTION(_F) alloc_text(INIT,_F)
#pragma alloc_text is one of the ways to move a function into a particular section. Another common way it is done is by bracketing the DriverEntry function with #pragma code_seg(INIT) and #pragma code_seg().
Why would the compiler mark the section as RWX
That requires an archeological dig. Many drivers were started a long time ago and are likely to still use ~VS6, back when life was still uncomplicated and programmers wore white hats. Or perhaps the programmer used #pragma section, yet another way to name sections, which permits setting the attributes directly. A modern toolchain certainly won't do this; you get RX from #pragma alloc_text. There is very little point in fretting about it, given that DriverEntry() lives for a very short time and any malware code that runs with ring0 privileges can do a lot more practical damage.
I passed this information on to Microsoft and it did turn out to be a bug in the MSVC linker. They were mistakenly marking the discard section that contained DriverEntry as RWX. This issue was fixed in Visual Studio 2015.
I wrote about the issue in more detail here.
I would like to use the Linux mmc_spi on a system with highmem enabled. I can't see why the mmc_spi module won't work with highmem.
The module uses kmap() and kmalloc(), so I am unsure as to why high memory would be a problem.
The Kconfig file indicates that the module depends on !HIGHMEM, and the source code has the comment:
/* allow pio too; we don't allow highmem */ on line 939.
Any help would be greatly appreciated.
My question practically repeats this one, which asks why this issue occurs. I would like to know if it is possible to avoid it.
The issue is: if I allocate a huge amount of memory statically:
unsigned char static_data[ 8 * BYTES_IN_GYGABYTE ];
then the linker (ld) takes a very long time to produce an executable. There is a good explanation of this behaviour from #davidg in the question I linked above:
This leaves us with the follow series of steps:
1. The assembler tells the linker that it needs to create a section of memory that is 1GB long.
2. The linker goes ahead and allocates this memory, in preparation for placing it in the final executable.
3. The linker realizes that this memory is in the .bss section and is marked NOBITS, meaning that the data is just 0, and doesn't need to be physically placed into the final executable. It avoids writing out the 1GB of data, instead just throwing the allocated memory away.
4. The linker writes out to the final ELF file just the compiled code, producing a small executable.
A smarter linker might be able to avoid steps 2 and 3 above, making your compile time much faster
OK. #davidg explained why the linker takes so long, but I want to know how I can avoid it. Maybe GCC has some option that tells the linker to be a little smarter and avoid steps 2 and 3 above?
Thank you.
P.S. I use GCC 4.5.2 on Ubuntu.
You can allocate the static memory in the release version only:
#ifndef _DEBUG
unsigned char static_data[ 8 * BYTES_IN_GYGABYTE ];
#else
unsigned char *static_data;
#endif
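In the debug branch the pointer still has to be given storage at run time. A minimal sketch, assuming heap allocation with calloc is acceptable. The helper names static_data_init and static_data_free are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Debug-build counterpart of the static array: a pointer that is
 * filled in at startup instead of a multi-gigabyte .bss object. */
unsigned char *static_data;

/* Allocates `size` zeroed bytes, matching the zero initialisation a
 * static array would get. Returns 0 on success, -1 on failure. */
int static_data_init(size_t size)
{
    static_data = calloc(size, 1);
    return static_data ? 0 : -1;
}

/* Releases the buffer and resets the pointer. */
void static_data_free(void)
{
    free(static_data);
    static_data = NULL;
}
```

The program would call static_data_init() once early in main() with the same size the static array would have had, and check the return value before touching static_data.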
I have two ideas that could help:
As already mentioned in a comment: place the array in a separate compilation unit. That alone will not reduce link time, but together with incremental linking (ld option -r) it may help.
The other is similar: place the array in a separate compilation unit, build a shared library from it, and link against that shared library later.
Sadly, I cannot promise that either of these helps, as I have no way to test: my gcc (4.7.2) and binutils don't show this time-consuming behaviour. Test programs with 8, 16 or 32 gigabytes compile and link in under a second.