Linux Memory Management - memory-management

Is there any way I can mark a page execute-only, with no read permission (i.e., be able to execute instructions on that page without being able to read it)?
My final goal is to make a page that I can execute but that no other process can make any data access to.

This is one of the things that is kernel and hardware-dependent, as mentioned in the mprotect() manual page:
Whether PROT_EXEC has any effect different from PROT_READ is architecture and kernel version dependent.
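On recent Linux/x86 kernels, those flags are definitely distinct if your CPU supports the NX bit. On other x86 CPUs, it depends on whether your kernel has support for Exec-Shield or another similar NX-bit emulation. Note, however, that x86 page-table entries have no separate read-permission bit, so the NX bit lets a page be readable without being executable but does not by itself make a page execute-only.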
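To see what your own kernel does with a PROT_EXEC-only request, something like the following sketch may help. It assumes Linux (it reads /proc/self/maps) and calls the C library's mmap through ctypes; the helper name probe_exec_only is made up for illustration. On Linux it typically reports "--xp", but that string only reflects the permissions the kernel recorded for the mapping, not whether the hardware actually blocks reads.

```python
import ctypes
import mmap
import sys


def probe_exec_only(length=mmap.PAGESIZE):
    """Map one anonymous page with PROT_EXEC only and return the permission
    string the kernel reports for it in /proc/self/maps (Linux only)."""
    if not sys.platform.startswith("linux"):
        return None  # /proc/self/maps is Linux-specific
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    addr = libc.mmap(None, length, mmap.PROT_EXEC,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == ctypes.c_void_p(-1).value:  # MAP_FAILED
        raise OSError(ctypes.get_errno(), "mmap(PROT_EXEC) failed")
    perms = None
    try:
        with open("/proc/self/maps") as maps:
            for line in maps:
                fields = line.split()
                start, end = (int(x, 16) for x in fields[0].split("-"))
                if start <= addr < end:
                    perms = fields[1]  # e.g. "--xp"
                    break
    finally:
        libc.munmap(addr, length)  # release the probe page again
    return perms


if __name__ == "__main__":
    try:
        print(probe_exec_only())
    except OSError as exc:
        print("probe failed:", exc)
```

The probe never reads from or jumps into the page, so it is safe to run; it only shows how the request was recorded.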

Related

Does the existence of PAGE_EXECUTE_READWRITE as an option in VirtualAlloc mean that the W^X is only facilitated in Windows by DEP?

W^X ("write xor execute", pronounced W xor X) is a security feature in operating systems and virtual machines. It is a memory protection policy whereby every page in a process's or kernel's address space may be either writable or executable, but not both.
My basic perspective on why this is a good security feature is that the owner of the system theoretically has an opportunity, within the kernel or specifically within the VirtualAlloc function, to hook some analysis function that performs security validation before newly written code is allowed to execute on the machine.
I was already familiar with DEP, but I am only just now realizing it has something to do with W^X in Windows:
Executable space protection on Windows is called "Data Execution Prevention" (DEP).
Under Windows XP or Server 2003 NX protection was used on critical Windows services exclusively by default. If the x86 processor supported this feature in hardware, then the NX features were turned on automatically in Windows XP/Server 2003 by default. If the feature was not supported by the x86 processor, then no protection was given.
Early implementations of DEP provided no address space layout randomization (ASLR), which allowed potential return-to-libc attacks that could have been feasibly used to disable DEP during an attack.
It was my impression that W^X applied to Windows in general, without requiring configuration of the process. But I just noticed that VirtualProtect allows the option PAGE_EXECUTE_READWRITE, which is documented as:
Enables execute, read-only, or read/write access to the committed region of pages.
This seems to entirely defy the concept of W^X. So is W^X not an enforced security policy on Windows, except when DEP is enabled?
If you turn DEP off, W^X is not enforced. When DEP is on, W^X is enforced for all memory pages that ask for it (when the hardware supports it). The mechanism is bit 63 of the page-table entry on x86, known as the NX bit.
Now the question becomes, when is this bit set?
The PE header has a bit indicating whether DEP/W^X is supported (IMAGE_DLLCHARACTERISTICS_NX_COMPAT). If it is set, the sections in the file without the execute attribute get the NX bit set when they are mapped into memory.
For memory dynamically allocated at run time, the developer gets to choose. Pages allocated with PAGE_EXECUTE_READWRITE deliberately do not get the NX bit set. This is useful for legacy code that dynamically alters executable code while still having the DEP bit set in the PE header, so that the majority of the code is W^X.
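If you want to check the NX_COMPAT flag of a given executable yourself, here is a minimal sketch. It relies on the documented PE layout (e_lfanew at offset 0x3C, a 4-byte signature, a 20-byte COFF header, and DllCharacteristics at offset 70 of the optional header for both PE32 and PE32+). The function name is_nx_compat is an assumption for illustration, and the sketch does no validation beyond the two signatures.

```python
import struct

IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100


def is_nx_compat(image: bytes) -> bool:
    """Return True if the PE image opts in to DEP via NX_COMPAT."""
    if image[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew: file offset of the "PE\0\0" signature
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The optional header follows the signature (4 bytes) and COFF header (20 bytes)
    optional_header = pe_offset + 4 + 20
    (dll_characteristics,) = struct.unpack_from("<H", image, optional_header + 70)
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_NX_COMPAT)
```

Usage would be along the lines of is_nx_compat(open(path, "rb").read()), where path is whatever executable you want to inspect.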
Early x86 CPUs had no support for pages without eXec permission. In legacy 32-bit x86 page tables, there was only a bit for write permission, the R/W bit. (Read permission is always implicit in the page being valid, whether the page is writeable or not). The PAE format for page-table entries, which x86-64 also uses, added an NX bit ("no exec"), aka XD (eXecute Disable).
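To make the bit layout concrete, here is a small sketch that decodes the permission-related bits of a PAE / x86-64 page-table entry. The bit positions (P = bit 0, R/W = bit 1, U/S = bit 2, NX = bit 63) are the architectural ones; the function name and the sample values are made up for illustration, and the sketch assumes NX support is enabled.

```python
PTE_PRESENT = 1 << 0   # P: entry is valid (and therefore readable)
PTE_WRITE = 1 << 1     # R/W: writes allowed
PTE_USER = 1 << 2      # U/S: accessible from user mode
PTE_NX = 1 << 63       # NX / XD: instruction fetch disallowed


def describe_pte(pte):
    """Summarize the access rights encoded in a 64-bit page-table entry."""
    if not pte & PTE_PRESENT:
        return "not present"
    # There is no read bit: a present page is always readable.
    perms = "r" + ("w" if pte & PTE_WRITE else "-") + ("-" if pte & PTE_NX else "x")
    mode = "user" if pte & PTE_USER else "supervisor"
    return f"{perms} ({mode})"


print(describe_pte(0x8000000012345067))  # data page: NX set, writable
print(describe_pte(0x0000000012345025))  # code page: NX clear, read-only
```

Note that no combination of these bits yields "--x", which is why execute-only pages cannot be expressed with the page-table permission bits alone.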
An OS still had to decide which pages to make non-executable.
Windows seems to use DEP to describe the feature of actually mapping logical page permissions to the hardware page tables, to be enforced by the CPU.
Some programs written in the bad old days when every readable page was executable may have been sloppy about telling the OS that they wanted a page to be executable. Especially ones that only targeted 32-bit x86. This is what Windows caters for by requiring executables to opt in to DEP, to indicate that they're aware of and compatible with not having exec permission for pages that aren't explicitly marked that way.
Some OSes, notably OpenBSD, truly enforce W^X. For example, mmap(..., PROT_WRITE | PROT_EXEC, ...) will return an error on OpenBSD. Their mmap(2) man page documents that such an mmap or mprotect system call will return
[ENOTSUP] The accesses requested in the prot argument are not allowed. In particular, PROT_WRITE | PROT_EXEC mappings are not permitted unless the filesystem is mounted wxallowed and the process is link-time tagged with wxneeded. (See also kern.wxabort in sysctl(2) for a method to diagnose failure).
Most other OSes (including Linux and Windows) allow user-space to create pages that are writeable and executable at the same time. But the standard toolchains and dynamic linking mechanisms aim for W^X compliance by default, if you don't use any options like gcc -zexecstack that will get the OS to create a process image with some R|W|X pages.
Older 32-bit x86 Linux for example used to use PLT entries (dynamic linking stubs) with jmp rel32 direct jumps, and rewrite the machine code to have the right displacement to reach wherever the shared library got loaded in memory. But these days, the PLT code uses indirect jumps (through the GOT = global offset table), so the executable PLT code can be in read-only page(s).
Changes like this have weeded out any need for write+exec pages in a normal process built with the standard tools.
But on Windows, macOS, and Linux, W^X is not enforced by the OS. System calls like Windows VirtualAlloc / VirtualProtect and their POSIX equivalents mmap / mprotect will work just fine.
@Anders' answer says DEP does not enforce W^X; it just gets the OS to respect the exec permission settings in the executable when creating the initial mappings for .text / .data / .bss and stack space, and similar regions during process startup.
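Since Linux leaves W^X up to user space, one way to check whether a process actually has any write+exec pages is to scan its memory map. Below is a sketch that works on text in the /proc/<pid>/maps format; the function name find_wx_mappings and the sample lines are invented for illustration.

```python
def find_wx_mappings(maps_text):
    """Return (address_range, perms, path) for every mapping that is
    both writable and executable."""
    hits = []
    for line in maps_text.splitlines():
        # Fields: range, perms, offset, device, inode, optional path
        fields = line.split(None, 5)
        if len(fields) < 2:
            continue
        perms = fields[1]
        if "w" in perms and "x" in perms:
            path = fields[5].strip() if len(fields) > 5 else ""
            hits.append((fields[0], perms, path))
    return hits


SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
00651000-00652000 r--p 00051000 08:02 173521 /usr/bin/example
00e03000-00e24000 rw-p 00000000 00:00 0 [heap]
7f2c00000000-7f2c00021000 rwxp 00000000 00:00 0
7ffc1a5e0000-7ffc1a601000 rw-p 00000000 00:00 0 [stack]
"""

print(find_wx_mappings(SAMPLE))
```

On a real system you would pass in open("/proc/self/maps").read() (or another pid's maps file, given permission) instead of the sample text. A normally built process should come back with an empty list; JIT-heavy programs may not.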

How does the OS protect against malicious memory access from assembly-level code?

I know about the system calls that the OS provides to protect programs from accessing other programs' memory. But that can only help if I have used the system call library provided by the OS. What if I write assembly code myself that sets the CPU bit for kernel mode and executes a privileged instruction (let's say, modifying the OS's program segment in memory)? Can the OS protect against that?
P.S. Out of curiosity question. If any good blog or book reference can be provided, that would be helpful as I want to study OS in as much detail as possible.
The processor protects against such malicious mischief by (1) requiring you to be in an elevated mode (for our example here, KERNEL); and (2) limiting access to kernel mode.
In order to enter kernel mode from user mode there has to be either an interrupt (not applicable here) or an exception. Usually both are handled the same way, but there are some bizarre processors (did anyone say Intel?) that do things a bit differently.
The operating system's exception and interrupt handlers must limit what the user-mode program can do.
What if I write assembly code myself that sets the CPU bit for kernel mode and executes a privileged instruction
You can't just set the kernel mode bit in the processor status register to enter kernel mode.
Can OS protect against that ?
The CPU protects against that.
If any good blog or book reference can be provided, that would be helpful as I want to study OS in as much detail as possible.
The VAX/VMS Systems Internals book is old but it is cheap and shows how a real OS has been implemented.
This blog clearly explains what my confusion was.
http://minnie.tuhs.org/CompArch/Lectures/week05.html
User programs can switch to kernel mode, but they have to do it through an interrupt instruction (int in the case of x86), and the handler for this interrupt is written by the OS (and installed, presumably, while it was in kernel mode at boot time). This way, privileged instructions can only be executed by OS code.
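The idea can be illustrated with a toy model. This is only a simulation in Python, not how any real CPU is programmed: the ToyCPU class, the PrivilegeFault exception, and the trap number 0x80 are all made up for the sketch. It shows the two rules from above: flipping the mode bit is itself privileged, and the only way into kernel mode is a trap whose handler was chosen by the OS.

```python
class PrivilegeFault(Exception):
    """Raised when user-mode code attempts a privileged operation."""


class ToyCPU:
    def __init__(self, trap_table):
        self.kernel_mode = False
        self._trap_table = trap_table  # installed by the "OS" at boot

    def privileged(self, action):
        """Run a privileged operation; faults unless in kernel mode."""
        if not self.kernel_mode:
            raise PrivilegeFault("privileged instruction in user mode")
        return action()

    def set_mode_bit(self):
        """Writing the mode bit is itself a privileged operation."""
        return self.privileged(lambda: setattr(self, "kernel_mode", True))

    def trap(self, number, *args):
        """Enter kernel mode, but only to run the OS-installed handler."""
        handler = self._trap_table[number]  # caller cannot pick the code
        self.kernel_mode = True
        try:
            return handler(self, *args)
        finally:
            self.kernel_mode = False  # back to user mode on return


def sys_demo(cpu):
    # Hypothetical OS handler: allowed to do privileged work.
    return cpu.privileged(lambda: "ran in kernel mode")


cpu = ToyCPU({0x80: sys_demo})
try:
    cpu.set_mode_bit()
except PrivilegeFault as fault:
    print("user code:", fault)
print("via trap:", cpu.trap(0x80))
```

In a real system the enforcement is done in hardware, of course; the model just shows why user code never gets to run its own instructions with the mode bit set.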

In Linux, how to forbid code execution on the heap

Imagine this way to attack Linux: 1. malloc a region. 2. Write binary code to this region. 3. Jump to this code.
I want to forbid running code this way, and only run code in the .text section. What should I do to the Linux kernel? Thank you!
The PaX security patch to Linux addresses this concern by ensuring that no memory is both writable and executable. This ensures that one cannot allocate memory, write code to it, and then execute it (which seems to be exactly what you are trying to prevent).
https://en.wikipedia.org/wiki/PaX#Executable_space_protections
Note that you may have to compile a custom kernel to install this patch. Alternatively, check whether your distribution offers a Linux kernel with the patch installed (search for linux-grsec or linux-pax).

Why don't processes have the ability to run in kernel mode?

OSes use kernel mode (privileged mode) and user mode. This seems very reasonable for security reasons: a process can't execute any instruction it wants; only the operating system can execute those instructions.
On the other hand, all the switching between user mode and kernel mode and back takes a long time.
The trap to the operating system takes a long time.
Why doesn't the operating system give processes the ability to run in kernel mode to increase their performance (this could be a very big improvement)?
Do real-time systems work the same way?
Thanks.
There are safety and stability reasons that prevent user-space processes from accessing kernel-space functions directly.
Kernel code guarantees that no user-space process (unless it is executed with root privileges) can break the operating system. This is a vital property of a modern OS. It is also important that developing user-space apps is much simpler than developing kernel modules.
When an application needs more performance than is available in user space, it is possible to move its code (or part of it) into kernel space. E.g., network protocols and filesystems are implemented as kernel drivers mostly for performance reasons.
Real-time applications are even more demanding of stability. They also use system calls.
I think there is no sense in doing this.
1.) If you want something to run in kernel context, use the kernel module API; what is the problem with that?
2.) Why do you think it will multiply process speed? A switch between kernel and user space is just an additional save / restore of register state. It will run faster, but I don't think the user will even notice it.
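If you want a rough feel for the cost on your own machine, you could time a cheap system call against a plain function call. This is only a crude sketch: the helper name per_call_ns is made up, os.getppid is used on the assumption that it enters the kernel on every call, and Python's interpreter overhead is included in both numbers, so only the difference between them is meaningful.

```python
import os
import time


def per_call_ns(fn, iterations=200_000):
    """Average wall-clock time per call of fn, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


def no_syscall():
    # Baseline: a call that stays entirely in user space.
    return 0


if __name__ == "__main__":
    print(f"user-space call: {per_call_ns(no_syscall):.0f} ns")
    print(f"getppid syscall: {per_call_ns(os.getppid):.0f} ns")
```

The numbers vary a lot with the CPU, the kernel version, and any mitigations that are enabled, so treat them as an order-of-magnitude hint only.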

Does Windows XP have an equivalent to VAX/VMS Installed Shared Images?

Back in the good old/bad old days when I developed on VAX/VMS it had a feature called 'Installed Shared Images' whereby if one expected one's executable program would be run by many users concurrently one could invoke the INSTALL utility thus:
$ INSTALL
INSTALL> ADD ONES_PROGRAM.EXE/SHARE
INSTALL> EXIT
The /SHARE flag had the effect of separating out the code from the data so that concurrent users of ONES_PROGRAM.EXE would all share the code (on a read-only basis of course) but each would have their own copy of the data (on a read-write basis). This technique/feature saved Mbytes of memory (which was necessary in those days) as only ONE copy of the program's code ever needed to be resident in VAX memory irrespective of the number of concurrent users.
Does Windows XP have something similar? I can't figure out if the Control Panel's 'Add Programs/Features' is the equivalent (I think it is, but I'm not sure)
Many thanks for any info
Richard
p.s. INSTALL would also share Libraries as well as Programs in case you were curious
The Windows virtual memory manager will do this automatically for you. So long as the module can be loaded at the same address in each process, the physical memory for the code will be shared between each process that loads that module. That is true for all modules, libraries as well as executables.
This is achieved by the linker marking code segments as shareable and data segments as not shareable.
The bottom line is that you do not have to do anything explicit to make this happen.

Resources