I'm working on a JIT compiler which will generate machine code in memory. This JIT is targeted at 64-bit POSIX x86 systems primarily, and I'm concerned about jumps in the code always being encodeable as 32-bit relative offsets. What I'd like to do is to mmap a 2-4GB chunk of executable memory for machine code, and manage this memory area myself.
What I'm wondering about specifically is: is it safe for me to mmap 4GB of memory at once, on a 64-bit system, even if the system doesn't have 4GB of memory? I'm assuming that most (or all) OSes won't actually allocate the pages I don't write to, so if I always allocate in the lower addresses first, I should be OK as long as I don't actually use more memory than the system physically has.
I would also be curious to hear alternative suggestions as to how to manage machine code allocation so that the machine code always resides in the same 4GB space on a 64-bit machine.
Your mmap of 4GB may succeed in allocating the virtual memory, and physical pages will be allocated as they are "dirtied", or modified by your program. If you run out of physical memory, your process may be terminated. See, also, this question.
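One way to act on this is to reserve the address range with no access permissions and enable pages only as the code allocator needs them. The following is a minimal sketch, not a definitive implementation. It assumes a Linux-style mmap where MAP_ANONYMOUS (and optionally MAP_NORESERVE) is available, and the names code_arena_reserve and code_arena_commit are hypothetical helpers invented for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes MAP_ANONYMOUS under strict -std */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `size` bytes of virtual address space without touching any pages.
   PROT_NONE means every access faults until the range is committed.
   Returns NULL on failure. */
static void *code_arena_reserve(size_t size)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;   /* Linux: do not reserve swap for this range */
#endif
    void *base = mmap(NULL, size, PROT_NONE, flags, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Make `len` bytes starting at base + offset readable and writable.
   `offset` should be a multiple of the page size. Returns 0 on success. */
static int code_arena_commit(void *base, size_t offset, size_t len)
{
    return mprotect((char *)base + offset, len, PROT_READ | PROT_WRITE);
}
```

A JIT would typically call code_arena_reserve once at startup, commit a chunk before emitting code into it, and then switch that chunk to PROT_READ | PROT_EXEC with another mprotect call before jumping to it. Physical pages are only consumed for the chunks that are actually written.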
This image and others like it have been bothering me for a while now. When I use malloc, the memory should come from the dynamic data region, the heap. However, this seems to be bounded from above by the stack, which seems very inefficient. A program cannot predict how much memory I plan on allocating, so how does the program judge how far up to put the stack? It seems as though all of the memory in the middle is wasted, and I would like to know how this works for programs that could range from a small service program that doesn't use any dynamic memory to a videogame that could potentially allocate huge sections of memory.
Say, for example, I open up Microsoft paint. If I paste a high resolution picture into it, the memory allocation of paint skyrockets. Where was this memory taken from? What I would truly like is a snapshot of my entire RAM stick labelled as above to visualize how the many programs of a computer partition the computer's memory as a whole, but I can only find diagrams like this one for a single process and a single section of RAM.
Your picture is not of the RAM, but of the address space of some process in virtual memory. The kernel configures the MMU to manage virtual memory and to provide the virtual address space of each process; it also does the paging and manages the page cache.
BTW, it is not the compiler which grows the stack (so your picture is wrong). The compiler generates machine code which may push or pop things on the call stack. As for the malloc-allocated heap, the C standard library implements the malloc function on top of operating system primitives or system calls that allocate pages of virtual memory (e.g. mmap(2) on Linux).
On Linux, a process can ask for its address space to be changed with mmap(2), munmap and mprotect. When a program is started with execve(2), the kernel sets up its initial address space. See also /proc/ (see proc(5) and try cat /proc/$$/maps). BTW, mmap is often used to implement malloc(3) and dlopen(3) (runtime loading of plugins), both heavily used in the RefPerSys project (a free software artificial intelligence project for Linux).
Since most Linux systems are open source, I suggest you dive into the implementation details by downloading and then looking inside the source code of GNU libc or musl-libc: both implement malloc and dlopen on top of mmap and other syscalls(2).
Windows has similar facilities, but I don't know Windows. Refer to the documentation of the WinAPI.
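The same information that cat /proc/$$/maps prints for the shell can be read by a program about itself through /proc/self/maps. Here is a small sketch that counts the mappings of the calling process. It is Linux-specific, and count_mappings is a hypothetical helper name, not a library function.

```c
#include <stdio.h>

/* Count the mappings in the calling process's address space by counting
   the lines of /proc/self/maps (one mapping per line, Linux-specific).
   Returns -1 if the file cannot be opened, e.g. on a non-Linux system. */
static int count_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    int lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            lines++;
    }
    fclose(f);
    return lines;
}
```

Calling it before and after a large malloc or a dlopen is one way to watch the address space change, since both usually add mappings.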
Read also Operating Systems: Three Easy Pieces, and, if you code in C, some good book such as Modern C and some C reference website. Be sure to read the documentation of your C compiler (e.g. GCC). See also the OSDEV website.
Be aware that modern C compilers are permitted to make extensive optimizations. Read what every C programmer should know about undefined behavior. See also this draft report.
In modern systems, a technique called virtual memory is used to give the program its own memory space. Heap and stack locations are at specific locations in virtual memory. The kernel then takes care of mapping memory locations between physical memory and virtual memory. Your program may have a chunk of memory allocated at 0x80000000, but that chunk of memory might be stored in location 0x49BA5400. Your actual RAM stick would be a jumbled mess of chunks of all of those sections in seemingly random locations.
I'm a fresher and was asked this question in the Microsoft recruitment process.
I'd read somewhere that the maximum memory allocated to a process can be the maximum physical memory available. So is it that if the RAM is 4GB, that's the answer? If yes, then how? Because some part of the RAM is always occupied by the Operating System, right? If no, then could you tell me the answer and what are the factors it really depends on?
First of all, the base of your question is totally related to Virtual Memory which has already been pointed out by Chris O!
Now, proceeding to your questions step by step:
"I'd read somewhere that the maximum memory allocated to a process can be the maximum physical memory available. So is it that if the RAM is 4GB, that's the answer?"
No, the maximum memory which your process can use depends on the virtual memory assigned and the swap size. Swap space is commonly sized at twice the physical memory, though it can be more or less depending on requirements.
Also, PAE (Physical Address Extension) allows more memory to be allocated. PAE allows a 32-bit OS to use more RAM, that is, more physical memory. This has nothing whatsoever to do with the 4GB virtual address space limitation that 32-bit OSes have.
A 32-bit OS uses 32-bit virtual addresses. That limits it to 4GB of addressable virtual memory at any one time. If a 32-bit OS also uses 32-bit physical addresses, it is limited to 4GB of physical memory as well. PAE allows a 32-bit OS to use 36-bit physical addresses, which raises the limit to 64GB.
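The 4GB and 64GB figures follow directly from the address width, since n address bits can name 2^n bytes. A tiny sketch of that arithmetic (the function names are made up for illustration):

```c
#include <stdint.h>

/* Number of bytes addressable with `bits` address bits (bits must be < 64). */
static uint64_t addressable_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* The same quantity expressed in GiB (2^30 bytes); meaningful for bits >= 30. */
static uint64_t addressable_gib(unsigned bits)
{
    return addressable_bytes(bits) >> 30;
}
```

With 32 bits this gives 4 GiB, and with the 36-bit physical addresses of PAE it gives 64 GiB.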
Next, the point you mentioned applies to atomic processes which can't be broken down further into threads or the like. I doubt one would often face a situation in which the size of an atomic process is larger than the physical memory.
"If yes, then how? Because some part of the RAM is always occupied by the Operating System, right?"
No, it's not, as I have already mentioned above.
"If no, then could you tell me the answer and what are the factors it really depends on?"
The memory requirement of a process is not fixed in advance. You may have noticed that many programs state a minimum amount of memory needed to run. This is the minimal requirement without which the process won't run properly, because it needs enough physical memory to handle its work. Next, the term swapping comes into the picture whenever we talk about virtual memory. Processes which are not currently running are sent to disk, and the processes which are to be executed are brought into physical memory. So more than one process can be serviced and executed through continuous swapping.
Some other things which are kept in main memory continuously are:
System processes OR daemons
cache memory or cache maintenance
Sorry for my rather general question, but I could not find a definite answer to it:
Given that I have free swap memory left and I allocate memory in reasonable chunks (~1MB) -> can memory allocation still fail for any reason?
The smartass answer would be "yes, memory allocation can fail for any reason". That may not be what you are looking for.
Generally, whether your system has free memory left is not related to whether allocations succeed. Rather, the question is whether your process address space has free virtual address space.
The allocator (malloc, operator new, ...) first looks if there is free address space in the current process that is already mapped, that is, the kernel is aware that the addresses should be usable. If there is, that address space is reserved in the allocator and returned.
Otherwise, the kernel is asked to map new address space to the process. This may fail, but generally doesn't, as mapping does not imply using physical memory yet -- it is just a promise that, should someone try to access this address, the kernel will try to find physical memory and set up the MMU tables so the virtual->physical translation finds it.
When the system is out of memory, that is, when there is no physical memory left, the process is suspended and the kernel attempts to free physical memory by moving other processes' memory to disk. The application does not notice this, except that executing a single assembler instruction apparently took a long time.
Memory allocations in the process fail if there is no mapped free region large enough and the kernel refuses to establish a mapping. For example, not all virtual addresses are useable, as most operating systems map the kernel at some address (typically, 0x80000000, 0xc0000000, 0xe0000000 or something such on 32 bit architectures), so there is a per-process limit that may be lower than the system limit (for example, a 32 bit process on Windows can only allocate 2 GB, even if the system is 64 bit). File mappings (such as the program itself and DLLs) further reduce the available space.
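From the program's side, the practical consequence is that the return value of malloc has to be checked, and that a successful return only promises address space. A minimal sketch of a checked allocation that also touches the memory so pages really get backed; alloc_and_touch is a hypothetical name, and zero-filling is just one simple way to touch every page.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate `size` bytes and write to every byte so the kernel has to back
   the pages with physical memory (or swap). Returns NULL if the allocator
   cannot find or map a large enough region of address space. */
static void *alloc_and_touch(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;      /* no mapped free region and no new mapping granted */
    memset(p, 0, size);   /* first write to each page triggers the real allocation */
    return p;
}
```

A 1 MB request like the one in the question will normally succeed, while a request larger than the remaining address space, such as SIZE_MAX bytes, comes back as NULL regardless of how much swap is free.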
A very general and theoretical answer would be no, it cannot. One of the reasons it could fail, under very peculiar circumstances, is some weird fragmentation of your available / allocatable memory. I wonder whether you're trying to get a (probably very minor) performance boost (skipping the if pointer == NULL kind of check) or you're just wondering and want to discuss it, in which case you should probably use chat.
Yes, memory allocation often fails when you run out of memory space in a 32-bit application (can be 2, 3 or 4 GB depending on OS version and settings). This would be due to a memory leak. It can also fail if your OS runs out of space in your swap file.
I have the following question.
I have 2.5 GB of RAM in my computer. What I want to know is whether, if I allocate almost all of that memory to one process, for example
char *buffer = malloc(2.4GB), no other process (Google Chrome, Microsoft games on the computer, etc.) can run?
Probably not. First, your operating system will have protections, i.e., malloc eventually becomes a system call in your OS, so it will fail instead of killing everything.
Second, because of virtual memory you can have more allocated memory than RAM so that even if your OS were to let you allocate 2.5 gigs it will still be able to function and run processes.
While it is OS and compiler dependent, on Visual C++ under 32-bit Windows, you will typically be unable to malloc more than 512MB at a time. This is controlled by the preprocessor constant _HEAP_MAXREQ. For details of the approach I used to work around this limitation, see the following thread. If you go to 64 bits, this also ceases to be an issue, although you might end up using much more virtual memory than you would expect.
In an OS like Windows where each process gets a 4GB virtual address space (assuming a 32-bit OS), it doesn't matter how much RAM you have. In such a case malloc(2.4GB) will surely fail, as the user address space is limited to 2GB. Even allocating 2GB will most probably fail, as the system has to find 2GB of contiguous virtual address space for malloc. That much contiguous free address space is nearly impossible to find due to fragmentation.
Computers work with virtual memory, which has no direct relation to the real size of RAM.
I am playing with an MSDN sample to do memory stress testing (see: http://msdn.microsoft.com/en-us/magazine/cc163613.aspx) and an extension of that tool that specifically eats physical memory (see http://www.donationcoder.com/Forums/bb/index.php?topic=14895.0;prev_next=next). I am obviously confused though on the differences between Virtual and Physical Memory. I thought each process has 2 GB of virtual memory (although I also read 1.5 GB because of "overhead"). My understanding was that some/all/none of this virtual memory could be physical memory, and the amount of physical memory used by a process could change over time (memory could be swapped out to disk, etc.). I further thought that, in general, when you allocate memory, the operating system could use physical memory or virtual memory. From this, I conclude that dwAvailVirtual should always be equal to or greater than dwAvailPhys in the call to GlobalMemoryStatus. However, I often (always?) see the opposite. What am I missing?
I apologize in advance if my question is not well formed. I'm still trying to get my head around the whole memory management system in Windows. Tutorials/Explanations/Book recs are most welcome!
Andrew
That was only true in the olden days, back when RAM was expensive. The operating system maps pages of virtual memory to RAM as needed. If there isn't enough RAM to satisfy a program's request, it starts unmapping pages to make room. If such a page contains data instead of code, it gets written to the paging file. Whenever the program accesses that page again, it generates a page fault, letting the operating system read the page back from disk.
If the machine has little RAM and lots of processes consuming virtual memory pages, that can cause a very unpleasant effect called "thrashing". The operating system is constantly accessing the disk and machine performance slows down to a crawl.
More RAM means less disk access. There's very little reason not to use 3 or 4 GB of RAM on a 32-bit operating system; it's cheap. You may not get to use all 4 GB, since not all of it will be addressable due to hardware devices taking space on the address bus (video, mostly). But that won't change the size of the virtual memory accessible by user code, which is still 2 gigabytes.
Windows Internals is a good book.
The amount of virtual memory is limited by the size of the address space, which is 4GB per process on a 32-bit system. You have to subtract from this the size of the regions reserved for system use and the amount of VM already used by your process (including all the libraries mapped into its address space).
On the other hand, the total amount of physical memory may be higher than the amount of virtual memory space the system has left free for your process to use (and these days it often is).
This means that if you have more than ~2GB of RAM, you can't use all your physical memory in one process (since there's not enough virtual memory space to map it to), but it can be used by many processes. Note that this limitation is removed in a 64-bit system.
I don't know if this is your issue, but the MSDN page for the GlobalMemoryStatus function contains the following warning:
On computers with more than 4 GB of memory, the GlobalMemoryStatus function can return incorrect information, reporting a value of –1 to indicate an overflow. For this reason, applications should use the GlobalMemoryStatusEx function instead.
Additionally, that page says:
On Intel x86 computers with more than 2 GB and less than 4 GB of memory, the GlobalMemoryStatus function will always return 2 GB in the dwTotalPhys member of the MEMORYSTATUS structure. Similarly, if the total available memory is between 2 and 4 GB, the dwAvailPhys member of the MEMORYSTATUS structure will be rounded down to 2 GB. If the executable is linked using the /LARGEADDRESSAWARE linker option, then the GlobalMemoryStatus function will return the correct amount of physical memory in both members.
Since you're referring to members like dwAvailPhys instead of ullAvailPhys, it sounds like you're using a MEMORYSTATUS structure instead of a MEMORYSTATUSEX structure. I don't know the consequences of that on a 64-bit platform, but on a 32-bit platform that definitely could cause incorrect memory sizes to be reported.