How portable is mmap? - windows

I've been considering using mmap for file reading, and was wondering how portable that is.
I'm developing on a Linux platform, but would like my program to work on Mac OS X and Windows.
Can I assume mmap is working on these platforms?

The mmap() function is a POSIX call. It works fine on MacOS X (and Linux, and HP-UX, and AIX, and Solaris).
The problem area will be Windows. I'm not sure whether there is an _mmap() call in the POSIX 'compatibility' sub-system. It is likely to be there — but will have the name with the leading underscore because Microsoft has an alternative view on namespaces and considers mmap() to intrude on the user name space, even if you ask for POSIX functionality. You can find a definition of an alternative Windows interface MapViewOfFile() and discussion about performance in another SO question (mmap() vs reading blocks).
If you try to map large files on a 32-bit system, you may find there isn't enough contiguous space to allocate the whole file in memory, so the memory mapping will fail. Do not assume it will work; decide what your fallback strategy is if it fails.
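The fallback strategy mentioned above could look roughly like the following sketch. It uses Python's mmap module, which wraps mmap() on POSIX systems and the file-mapping API on Windows; open_view is a hypothetical helper name, not part of any library, and a real fallback for very large files would read in blocks rather than all at once.

```python
import mmap
import os
import tempfile


def open_view(path):
    """Return (buffer, close_fn); try mmap first, fall back to a plain read."""
    f = open(path, "rb")
    try:
        # Length 0 asks for the whole file. Python raises ValueError for an
        # empty file and OSError when the OS cannot create the mapping.
        view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    except (ValueError, OSError):
        data = f.read()  # fallback: ordinary read into a bytes object
        f.close()
        return data, lambda: None

    def close_fn():
        view.close()
        f.close()

    return view, close_fn


# Usage: both return shapes support slicing, so callers need not care
# which path was taken.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
buf, close_fn = open_view(tmp.name)
first_word = bytes(buf[:5])
close_fn()
os.remove(tmp.name)
```

The caller only ever sees something sliceable plus a cleanup function, so a failed mapping changes performance, not behaviour.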

Using mmap for reading files isn't portable if you rely on mapping large parts of large files into your address space. A 32-bit process can easily lack a single contiguous free region of address space that large (say 1 GB), so mmap would fail quite often for a 1 GB mapping.

The principle of a memory-mapped file is fairly portable, but you don't have mmap() on Windows (although equivalents such as MapViewOfFile() exist). You could take a peek at the C code of Python's mmap module to see how it handles the various platforms.

I consider memory-mapped I/O on UNIX systems unusable for interactive applications, because it may result in a SIGSEGV/SIGBUS (if the file has been truncated in the meantime by some other process). Ignoring such sick "solutions" as setjmp/longjmp, there is nothing one can do other than terminate the process after getting SIGSEGV/SIGBUS.
The new G++ feature to convert such signals into exceptions seems to be intended mainly for Apple's OS, since the description states that one needs runtime support for this G++ feature, and there is no information to be found about it anywhere.
We probably have to wait a couple of years until structured exception handling, which has been available on Windows for more than 20 years, makes its way into UNIX systems.

Related

How can an Operating System be coded in high level languages?

I just started diving into the world of operating systems and I've learned that processes have a certain memory space they can address which is handled by the operating system. I don't quite understand how an operating system written in high-level languages like C and C++ can obtain this kind of memory management functionality.
You have caught the bug and there is no cure for it :-)
The language you use to write your OS has very little to do with the way your OS operates. Yes, most people use C/C++, but there are others. As for the language, you do need a language that will let you directly communicate with the hardware you plan to manage, assembly being the main choice for this part. However, this is less than 5% of the whole project.
The code that you write must not rely upon any existing operating system, i.e. you must code all of the functionality yourself, or call existing libraries. However, these existing libraries must be written so that they don't rely upon anything else.
Once you have a base, you can write your OS in any language you choose, with the minor part in assembly, something a high level language won't allow. In fact, in 64-bit code, some compilers no longer allow inline assembly, so this makes that 5% I mentioned above more like 15%.
Find out what you would like to do and then find out if that can be done in the language of choice. For example, the main operating system components can be written in C, while the actual processor management (interrupts, etc) must be done in assembly. Your boot code must be in assembly as well, at least most of it.
As mentioned in a different post, I have some early example code that you might want to look at. The boot is done in assembly, while the loader code, both Legacy BIOS and EFI, are mostly C code.
To clarify fysnet's answer, the reason you have to use at least a bit of assembly is that you can only explicitly access addressable memory in C/C++ (through pointers), while hardware registers (such as the program counter or stack pointer) often don't have memory addresses. Not only that, but some registers have to be manipulated with CPU architecture-dependent special instructions, and that, too, is only possible in machine language.
I don't quite understand how an operating system written in high-level languages like C and C++ can obtain this kind of memory management functionality.
As described above, depending on the architecture, this could be achieved by having special instructions to manage the MMU, TLB etc. INVLPG is one example of such an instruction in the x86 architecture. Note that having a special instruction requiring kernel privileges is probably the simplest way to implement such a feature in hardware in a secure manner, because then it is simply sufficient to check if the CPU is in kernel mode in order to determine whether the instruction can be executed or not.
Compilers turn high-level languages into asm / machine code for you, so you don't have to write asm yourself. You pick a compiler that handles memory the way you want your OS to; e.g. using the callstack for automatic storage, and not implicitly calling malloc / free (because those won't exist in your kernel).
To link your compiled C/C++ into a kernel, you typically have to know more about the ABI it targets and about the toolchain, especially the linker.
The ISO C standard treats implementation details very much as a black box. But real compilers that people use for low level stuff work in well-known ways (i.e. make the expected/useful implementation choices) that kernel programmers depend on, in terms of compiling code and static data into contiguous blocks that can be linked into a single kernel executable that can be loaded all as one chunk.
As for actually managing the system's memory, you write code yourself to do that, with a bit of inline asm where necessary for special instructions like invlpg as other answers mention.
The entry point (where execution starts) will normally be written in pure asm, to set up a callstack with the stack pointer register pointing to it, and to set up virtual memory and so on so that code is executable, data is read/write, and read-only data is readable.
All of this happens before jumping to any compiled C code. The first C you jump to is probably more kernel init code, e.g. initializing data structures for an allocator to manage all the memory that isn't already in use by static code/data.
Creating a stack and mapping code/data into memory is the kind of setup that's normally done by an OS when starting a user-space program. The asm emitted by a compiler will assume that code, static data, and the stack are all there already.

Reading huge files using Memory Mapped Files

I see many articles suggesting not to map huge files as mmap files so the virtual address space won't be taken solely by the mmap.
How does that change with 64 bit process where the address space dramatically increases?
If I need to randomly access a file, is there a reason not to map the whole file at once? (dozens of GBs file)
On 64bit, go ahead and map the file.
One thing to consider, based on Linux experience: if the access is truly random and the file is much bigger than you can expect to cache in RAM (so the chances of hitting a page again are slim), then it can be worth passing MADV_RANDOM to madvise to stop the accumulating file pages from steadily and pointlessly swapping out other, actually useful stuff. No idea what the equivalent Windows API is though.
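As a rough sketch of the MADV_RANDOM hint described above: Python exposes madvise on mmap objects from version 3.8 onward, but only on platforms that provide the underlying call, so the example guards for it. map_for_random_access is a hypothetical helper name, and the scratch file merely stands in for the real multi-gigabyte file.

```python
import mmap
import os
import tempfile


def map_for_random_access(fileobj):
    """Map a file read-only and, where supported, hint random access."""
    view = mmap.mmap(fileobj.fileno(), 0, access=mmap.ACCESS_READ)
    # mmap.madvise and the MADV_* constants exist only on Python 3.8+ and
    # only on platforms that provide madvise(); elsewhere the hint is
    # simply skipped.
    if hasattr(view, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        view.madvise(mmap.MADV_RANDOM)
    return view


# Usage on a small scratch file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)   # 16 KiB of repeating 0..255
with open(tmp.name, "rb") as f:
    view = map_for_random_access(f)
    sample = view[5000]                 # indexing an mmap yields an int
    view.close()
os.remove(tmp.name)
```

The hint is advisory only: the mapping behaves identically without it, which is why skipping it on unsupported platforms is safe.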
There's a reason to think carefully about using memory-mapped files, even on a 64-bit platform (where virtual address space size is not an issue). It's related to the (potential) error handling.
When reading the file "conventionally" - any I/O error is reported by the appropriate function return value. The rest of error handling is up to you.
OTOH if the error arises during the implicit I/O (resulting from the page fault and attempt to load the needed file portion into the appropriate memory page) - the error handling mechanism depends on the OS.
In Windows the error handling is performed via SEH - so-called "structured exception handling". The exception propagates to the user mode (application's code) where you have a chance to handle it properly. The proper handling requires you to compile with the appropriate exception handling settings in the compiler (to guarantee the invocation of the destructors, if applicable).
I don't know how the error handling is performed in unix/linux though.
P.S. I'm not saying don't use memory-mapped files. I'm saying do it carefully.
One thing to be aware of is that memory mapping requires big contiguous chunks of (virtual) memory when the mapping is created; on a 32-bit system this particularly sucks because in a process whose address space is already fragmented, finding a long contiguous run of free addresses is unlikely and the mapping will fail. On a 64-bit system this is much easier as the upper bound of the 64-bit address space is... huge.
If you are running code in controlled environments (e.g. 64-bit server environments you are building yourself and know to run this code just fine) go ahead and map the entire file and just deal with it.
If you are trying to write general purpose code that will be in software that could run on any number of types of configurations, you'll want to stick to a smaller chunked mapping strategy. For example, mapping large files to collections of 1GB chunks and having an abstraction layer that takes operations like read(offset) and converts them to the offset in the right chunk before performing the op.
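The chunked strategy above might be sketched like this. ChunkedReader is a hypothetical class for illustration only; it keeps a single chunk mapped at a time and translates read(offset, n) into positions inside that chunk. The demo uses a tiny chunk size so it runs quickly; a real deployment would use something like the 1GB chunks mentioned above.

```python
import mmap
import os
import tempfile


class ChunkedReader:
    """Read bytes at arbitrary offsets by mapping one fixed-size chunk at a time.

    chunk_size must be a multiple of mmap.ALLOCATIONGRANULARITY because
    mmap offsets have to be aligned to it.
    """

    def __init__(self, path, chunk_size):
        if chunk_size <= 0 or chunk_size % mmap.ALLOCATIONGRANULARITY:
            raise ValueError("chunk_size must be a multiple of ALLOCATIONGRANULARITY")
        self._f = open(path, "rb")
        self._size = os.fstat(self._f.fileno()).st_size
        self._chunk_size = chunk_size
        self._index = None  # index of the currently mapped chunk
        self._view = None

    def _chunk(self, index):
        # Remap only when a different chunk is requested.
        if index != self._index:
            if self._view is not None:
                self._view.close()
            start = index * self._chunk_size
            length = min(self._chunk_size, self._size - start)
            self._view = mmap.mmap(self._f.fileno(), length,
                                   access=mmap.ACCESS_READ, offset=start)
            self._index = index
        return self._view

    def read(self, offset, n):
        """Return up to n bytes starting at offset, crossing chunks if needed."""
        out = bytearray()
        end = min(offset + n, self._size)
        while offset < end:
            index, within = divmod(offset, self._chunk_size)
            take = min(end - offset, self._chunk_size - within)
            out += self._chunk(index)[within:within + take]
            offset += take
        return bytes(out)

    def close(self):
        if self._view is not None:
            self._view.close()
        self._f.close()


# Usage: a small file spanning three chunks, read across a chunk boundary.
G = mmap.ALLOCATIONGRANULARITY
payload = bytes(i % 251 for i in range(2 * G + 100))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
reader = ChunkedReader(tmp.name, chunk_size=G)
straddling = reader.read(G - 3, 6)   # 3 bytes from chunk 0, 3 from chunk 1
tail = reader.read(2 * G + 90, 50)   # clipped at end of file
reader.close()
os.remove(tmp.name)
```

Keeping only one chunk mapped bounds the address space used to chunk_size, at the cost of a remap whenever access moves to another chunk; caching a few recent chunks would be a natural refinement.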
Hope that helps.

Does windows have same maximum path lengths (name of directory entry) for different filesystems it mounts?

I have to know if a specific vulnerability in TCL 8.4 affects the Windows platform.
The vulnerability is: http://www.securityfocus.com/bid/15259/info
As per the link:
Operating systems with no difference in the maximum path lengths among differing file systems are not affected by this issue
I am using TCL on Windows and want to know if this vulnerability affects TCL on Windows, and how.
Further, how can a person exploit this vulnerability on Windows?
Thanks
The windows header files define MAX_PATH - as 260 - as the usual maximum path size. This isn't really universally applied. There are a number of ways to bypass this limit, in which case the effective path limit is, well, unlimited. Or 32,767 characters. Whichever is shorter.
Naming, Files, Paths and Namespaces has more info.
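One of the ways around MAX_PATH is the extended-length prefix `\\?\`, which the wide-character file APIs accept for absolute paths. The sketch below only shows the string transformation; to_extended_length is a hypothetical helper, and it assumes the input is already an absolute, normalized path using backslashes.

```python
MAX_PATH = 260  # value from the Windows headers, as quoted above


def to_extended_length(path):
    """Prefix an absolute Windows path so it may exceed MAX_PATH."""
    if path.startswith("\\\\?\\"):
        return path                        # already prefixed
    if path.startswith("\\\\"):
        return "\\\\?\\UNC\\" + path[2:]   # UNC share: insert the UNC marker
    return "\\\\?\\" + path                # drive-letter path


local = to_extended_length("C:\\data\\file.txt")
share = to_extended_length("\\\\server\\share\\file.txt")
```

Here local becomes `\\?\C:\data\file.txt` and share becomes `\\?\UNC\server\share\file.txt`.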
While there exist common conventions regarding maximum file name and path length, certain file system drivers (or third-party file system implementations) might have their own limits, which can be lower than the commonly used ones.
That article does not mention any vulnerability of systems hosted on Windows to this at all; the standard recommended size of buffer to be allocated there is sufficiently long to hold any legal filename. This is specifically true for Tcl (Tk does not do directory scanning except via Tcl's interfaces).
Exploiting the vulnerability on Windows is going to be hard (and impossible with Tcl, which is very careful with buffer management). If you're on another platform, you are recommended to switch to a later patchlevel of Tcl; the current version is 8.4.19. (Actually, you're recommended to switch to the 8.5 series – currently 8.5.9 – as 8.4 has basically been EOLed; there will be maybe one more roll-up release on that branch but bugfixes are now only committed to 8.4 for critical things like demonstrated security issues or build-chain problems.)
Note that, since Tcl has never allocated buffers for holding a whole path directly anyway, it's not clear how this sort of thing could cause an exploit in the first place. The article does state that there is no instance of this issue in the wild.

Drawbacks of using /LARGEADDRESSAWARE for 32-bit Windows executables?

We need to link one of our executables with this flag as it uses lots of memory.
But why give one EXE file special treatment. Why not standardize on /LARGEADDRESSAWARE?
So the question is: Is there anything wrong with using /LARGEADDRESSAWARE even if you don't need it. Why not use it as standard for all EXE files?
blindly applying the LargeAddressAware flag to your 32bit executable deploys a ticking time bomb!
by setting this flag you are testifying to the OS:
"yes, my application (and all DLLs being loaded during runtime) can cope with memory addresses up to 4 GB.
so don't restrict the VAS for the process to 2 GB but unlock the full range (of 4 GB)".
but can you really guarantee?
do you take responsibility for all the system DLLs, microsoft redistributables and 3rd-party modules your process may use?
usually, memory allocation returns virtual addresses in low-to-high order. so, unless your process consumes a lot of memory (or it has a very fragmented virtual address space), it will never use addresses beyond the 2 GB boundary. this is hiding bugs related to high addresses.
if such bugs exist they are hard to identify. they will sporadically show up "sooner or later". it's just a matter of time.
luckily there is an extremely handy system-wide switch built into the windows OS:
for testing purposes use the MEM_TOP_DOWN registry setting.
this forces all memory allocations to go from the top down, instead of the normal bottom up.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
"AllocationPreference"=dword:00100000
(this is hex 0x100000. requires windows reboot, of course)
with this switch enabled you will identify issues "sooner" rather than "later".
ideally you'll see them "right from the beginning".
side note: for first analysis i strongly recommend the tool VMmap (SysInternals).
conclusions:
when applying the LAA flag to your 32bit executable it is mandatory to fully test it on a x64 OS with the TopDown AllocationPreference switch set.
for issues in your own code you may be able to fix them.
just to name one very obvious example: use unsigned integers instead of signed integers for memory pointers.
when encountering issues with 3rd-party modules you need to ask the author to fix the bugs. until this is done you had better remove the LargeAddressAware flag from your executable.
a note on testing:
the MemTopDown registry switch is not achieving the desired results for unit tests that are executed by a "test runner" that itself is not LAA enabled.
see: Unit Testing for x86 LargeAddressAware compatibility
PS:
also very "related" and quite interesting is the migration from 32bit code to 64bit.
for examples see:
As a programmer, what do I need to worry about when moving to 64-bit windows?
https://www.sec.cs.tu-bs.de/pubs/2016-ccs.pdf (twice the bits, twice the trouble)
Because lots of legacy code is written with the expectation that "negative" pointers are invalid. Anything in the top two GB of a 32-bit process has the MSB set.
As such, it's far easier for Microsoft to play it safe and require applications that (a) need the full 4 GB and (b) have been developed and tested in a large-memory scenario to simply set the flag.
It's not - as you have noticed - that hard.
Raymond Chen - in his blog The Old New Thing - covers the issues with turning it on for all (32bit) applications.
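To make the "negative pointer" problem concrete, here is a small simulation. It uses ctypes.c_int32 to reinterpret a 32-bit address as a signed integer, which is what happens when C code stores a pointer in an int; legacy_is_valid is a made-up example of the kind of check such legacy code might contain.

```python
import ctypes

TWO_GB = 0x80000000


def as_signed_32(address):
    """Reinterpret a 32-bit address as code storing it in a signed int sees it."""
    return ctypes.c_int32(address).value


def legacy_is_valid(address):
    """Hypothetical legacy check that treats 'negative' pointers as invalid."""
    return as_signed_32(address) > 0


low = 0x7FFF0000   # below the 2 GB boundary
high = 0x80010000  # above it, only handed out to large-address-aware processes

low_ok = legacy_is_valid(low)
high_ok = legacy_is_valid(high)
```

With these values low_ok is True and high_ok is False: a perfectly good address above the boundary is rejected, which is exactly the class of bug that stays hidden until the process actually receives such an address.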
No, "legacy code" in this context (C/C++) is not exclusively code that plays ugly tricks with the MSB of pointers.
It also includes all the code that uses 'int' to store the difference between two pointers, or the length of a memory area, instead of using the correct type 'size_t': 'int', being signed, has only 31 value bits and cannot hold a value of 2 GB or more.
A way to cure a good part of your code is to go over it and correct all of those innocuous "mixing signed and unsigned" warnings. It should do a good part of the job, at least if you haven't defined functions where an argument of type int is actually a memory length.
However, that "legacy code" will probably appear to work correctly for quite a while, even if you correct nothing.
It will only break when you allocate more than 2 GB in one block, or when you compare two unrelated pointers that are more than 2 GB apart.
As comparing unrelated pointers is technically an undefined behaviour anyway, you won't encounter that much code that does it (but you can never be sure).
And very frequently, even if in total you need more than 2 GB, your program never actually makes a single allocation larger than that. In fact, on Windows, even with LARGEADDRESSAWARE you won't by default be able to allocate that much, given the way the memory is organized. You'd need to shuffle the system DLLs around to get a contiguous block of more than 2 GB.
But Murphy's law says that kind of code will break one day; it's just that it will happen long after you've enabled LARGEADDRESSAWARE without checking, when nobody remembers that this was done.
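The truncation of lengths and pointer differences held in 'int' can be simulated in a few lines. This is only an illustration using ctypes.c_int32 to mimic a 32-bit signed int; the addresses are arbitrary example values.

```python
import ctypes


def stored_in_int(value):
    """Result of storing a byte count or pointer difference in a 32-bit signed int."""
    return ctypes.c_int32(value).value


MB = 1024 ** 2
GB = 1024 ** 3

small_length = stored_in_int(100 * MB)              # fits, unchanged
huge_length = stored_in_int(3 * GB)                 # wraps to a negative number
far_apart = stored_in_int(0xD0000000 - 0x10000000)  # pointers 3 GB apart
```

Only the last two cases go wrong, which matches the point above: the code keeps working until a single block or a pointer distance crosses the 2 GB mark.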

Why code segment is common for different instances of same program

I wanted to know why the code segment is common to different instances of the same program.
For example, consider a running program P1.exe: if another copy of P1.exe is started, the code segment will be common to both running instances. Why is that?
If the code segment in question is loaded from a DLL, it might be the operating system being clever and re-using the already loaded library. This is one of the core points of using dynamically loaded library code, it allows the code to be shared across multiple processes.
Not sure if Windows is clever enough to do this with the code sections of regular EXE files, but it would make sense if possible.
It could also be virtual memory fooling you; two processes can look like they have the same thing on the same address, but that address is virtual, so they really are just showing mappings of physical memory.
Code is typically read-only, so it would be wasteful to make multiple copies of it.
Also, Windows (at least, I can't speak for other OS's at this level) uses the paging infrastructure to page code in and out direct from the executable file, as if it were a paging file. Since you are dealing with the same executable, it is paging from the same location to the same location.
Self-modifying code is effectively no longer supported by modern operating systems. Generating new code is possible (by setting the correct flags when allocating memory) but this is separate from the original code segment.
The code segment is (supposed to be) static (does not change) so there is no reason not to use it for several instances.
Just to start at a basic level, segmentation is just a way to implement memory isolation and partitioning. Paging is another way to achieve this. For the most part, anything you can achieve via segmentation, you can achieve via paging. As such, most modern operating systems on the x86 forego using segmentation at all, instead relying completely on paging facilities.
Because of this, all processes will usually be running under the trivial segment of (Base = 0, Limit = 4GB, Privilege level = 3), which means the code/data segment registers play no real part in determining the physical address, and are just used to set the privilege level of the process. All processes will usually be run at the same privilege, so they should all have the same value in the segment register.
Edit
Maybe I misinterpreted the question. I thought the question author was asking why both processes have the same value in the code segment register.

Resources