I am studying memory management. In particular, I am studying the MMU and the mapping between a process's logical address space pages and RAM frames.
My question is: what about low-end embedded systems? If I'm correct, an MMU can't be used in these systems due to their smaller memory. So how can computers with less memory available avoid the problem of memory being shared between processes?
For embedded systems, the kind of MMU you speak of is only present in high-end microcontrollers like PowerPC or Cortex-A.
Low-end to mid-range microcontrollers often do have a simpler form of MMU, though. It is not advanced enough to create virtual memory sections, but it allows remapping of RAM, flash, registers and so on. Similarly, they often have various mechanisms for protecting certain parts of memory from accidental writes. These may or may not be smart enough to detect, MMU-style, that code is executing from data memory or that a data access is hitting code memory. Harvard vs von Neumann architecture also matters here.
As for multiple processes in an RTOS, they can't be compared with multiple processes on a desktop computer. Each process in an RTOS typically gets its own stack, but that's about it. The MMU isn't involved in that; it's handled by the RTOS. Code in embedded systems is typically executed directly from flash, so it doesn't make sense to assign chunks of RAM to executable code as on a PC. Several processes simply execute code from flash, and it may be the same code or different code depending on whether they share common code.
Similarly, it is senseless to use heap allocation in embedded systems (see Why should I not use dynamic memory allocation in embedded systems?) so we don't need to create a RAM image for that purpose either. The only thing left as unique per process is the stack, as well as separate parts of .data/.bss.
I am intending to write a program to create huge relational networks out of unstructured data - the exact implementation is irrelevant but imagine a GPT-3-style large language model. Training such a model would require potentially 100+ gigabytes of available random access memory as links get reinforced between new and existing nodes in the graph. Only a small portion of the entire model would likely be loaded at any given time, but potentially any region of memory may be accessed randomly.
I do not have a machine with 512 GB of physical RAM. However, I do have one with a 512 GB NVMe SSD that I can dedicate to the purpose. I see two potential options for making this program work without specialized hardware:
1. I can write my own memory manager that would swap pages between "hot" resident memory and "cold" on the hard disk, probably using memory-mapped files or some similar construct. This would require me coding all memory accesses in the modeling program to use this custom memory manager, and coding the page cache and concurrent access handlers and all of the other low-level stuff that comes along with it, which would take days and very likely introduce bugs. Also performance would likely be poor. Or,
2. I can configure the operating system to use the entire SSD as a page file / SWAP filesystem, and then just have the program reserve as much virtual memory as it needs - the same as any other normal program, relying on the kernel's memory manager which is already doing the page mapping + swapping + caching for me.
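To make the first option concrete, here is a minimal sketch of a memory-mapped file used as a backing store. It is written in Python purely for illustration; the helper names (`open_backing_store`, `write_weight`, `read_weight`) and the file name `model.bin` are hypothetical, and a real model would need its own layout. The point is only that, once the file is mapped, the kernel's page cache decides which parts are resident, so no custom page cache is written here.

```python
import mmap
import os
import struct
import tempfile

def open_backing_store(path, size_bytes):
    """Create (or resize) a file of size_bytes and map it read/write."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.ftruncate(fd, size_bytes)       # new bytes read back as zero
        return mmap.mmap(fd, size_bytes)   # shared mapping of the whole file
    finally:
        os.close(fd)                       # the mapping keeps its own handle

def write_weight(store, index, value):
    """Store one 8-byte float at slot `index` (hypothetical layout)."""
    struct.pack_into("<d", store, index * 8, value)

def read_weight(store, index):
    """Load the 8-byte float at slot `index`."""
    return struct.unpack_from("<d", store, index * 8)[0]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = open_backing_store(os.path.join(tmp, "model.bin"), 1 << 20)
        write_weight(store, 1000, 0.25)
        store.flush()                      # ask the OS to write dirty pages
        print(read_weight(store, 1000))
        store.close()
```

Only the pages that are actually touched need to be resident; the rest of the file stays on disk until it is accessed.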
The problem I foresee with #2 is making the operating system understand what I am trying to do in a "cooperative" way. Ideally I would like to hint to the OS that I would only like a specific fraction of resident memory and swap the rest, to keep overall system RAM usage below 90% or so. Otherwise the OS will allocate 99% of physical RAM and then start aggressively compacting and cutting down memory from other background programs, which ends up making the whole system unresponsive. Linux apparently just starts sacrificing entire processes if it gets too bad.
Does there exist a kernel call in any language or operating system that would let me tell the OS to chill out and proactively swap user memory to disk? I have looked through the VMM functions in kernel32.dll and the Linux paging and swap daemon (kswapd) documentation, but nothing looks like what I need. Perhaps some way to reserve, say, 1 GB of pages and then "donate" them back to the kernel to make sure they get used for processes that aren't my own? Some way to configure memory pressure or limits, or to make kswapd work more aggressively for just my process?
I have an application where I have 3 different processes that need to run concurrently, in 3 different languages, on Windows:
A "data gathering" process, which interfaces with a sensor array. The developers of the sensor array have been kind enough to provide us with their C# source code, which I can modify. This should be generating raw data and shoving it into shared memory
A "post-processing" process. This is C++ code that uses CUDA to get the processing done as fast as possible. This should be taking raw data, moving it to the GPU, then taking the results from the GPU and communicating it to--
A feedback controller written in Matlab, which takes the results of the post-processing and uses it to make decisions on how to control a mechanical system.
I've done coursework on parallel programming, but that coursework was all on Linux, where I used mmap (from sys/mman.h) for coordination between multiple processes. This makes sense to me--you ask the OS for a page in virtual memory to be mapped to shared physical memory addresses, and the OS gives you some shared memory.
Googling around, it seems like the preferred way to set up shared memory between processes in Windows (in fact, the only "easy" way to do it in Matlab) is to use memory-mapped files, but this seems completely bonkers to me. If I'm understanding correctly, a memory-mapped file grabs some disk space and maps it to the physical address space, which is then mapped into the virtual address space for any process that accesses the same memory-mapped file.
This seems about three times more complex than it needs to be just to get multiple processes to map pages in their virtual address space to the same physical memory. I don't feel like I should be doing anything remotely related to disk I/O for what I'm trying to accomplish, especially since performance is a big issue for me (ideally I should be able to process 1000 sets of data per second, though that's not a hard limit). Is this really the right way to coordinate my processes?
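For illustration, here is a minimal sketch of the named shared-memory pattern being described, using Python's `multiprocessing.shared_memory` rather than C#, C++ or Matlab. On Windows, a file mapping can be backed by the system paging file instead of a file on disk, and to my understanding that is what this Python module uses there, so no explicit disk file is involved. The function names and the block name are hypothetical, and a real pipeline would also need synchronization between producer and consumer, which this sketch omits.

```python
import os
import struct
from multiprocessing import shared_memory

def publish(name, values):
    """Producer side: create a named block and write doubles into it."""
    shm = shared_memory.SharedMemory(name=name, create=True,
                                     size=8 * len(values))
    struct.pack_into(f"<{len(values)}d", shm.buf, 0, *values)
    return shm  # the owner keeps it alive, then calls close() and unlink()

def consume(name, count):
    """Consumer side: attach to the existing block by name and read it."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return list(struct.unpack_from(f"<{count}d", shm.buf, 0))
    finally:
        shm.close()

if __name__ == "__main__":
    block_name = f"sensor_frame_{os.getpid()}"   # hypothetical name
    owner = publish(block_name, [1.0, 2.5, -3.0])
    try:
        # In a real system this call would run in a different process.
        print(consume(block_name, 3))
    finally:
        owner.close()
        owner.unlink()
```

Both sides see the same physical pages; the name is only how the second process finds the mapping.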
I've recently been learning about I/O buffering in operating systems, and according to the book I use:
When a user process issues an I/O request, the OS assigns a buffer in the system portion of main memory to the operation.
I understand how this method avoids the swapping problem that occurs without buffering. But is it assumed that the OS buffer created for the process will never be swapped out?
To extend my question, I was wondering if there is any mechanism where the kernel portion of an OS in memory may also be swapped?
It is common for operating systems to page out parts of the kernel. The kernel has to define what parts may be paged out and which may not be paged out. For example, typically, there will be separate memory allocators for paged pool and non-paged pool.
Note that on most processors the page table format is the same for system pages as for user pages, thus supporting paging of the kernel.
Determining what parts of the kernel may be paged out is part of the system design and is done up front. You cannot page out the system interrupt table. You can page out system service code for the most part. You cannot page out interrupt handling code for the most part.
I was wondering if there is any mechanism where the kernel portion of an OS in memory may also be swapped?
IIRC some old versions of AIX might have been able to swap (i.e. page out) some kernel code. And probably older OSes too (perhaps even Multics).
However, it is practically useless today, because kernel memory is a tiny fraction of the RAM on current (desktop & server) computers. Total kernel memory is only tens of megabytes, while most computers have tens of gigabytes of RAM.
BTW, microkernel systems (e.g. GNU Hurd) can run their server programs in pageable processes.
See Operating Systems: Three Easy Pieces
I've been doing research on operating systems lately, particularly regarding memory management. However, I'm not sure what the difference is between memory management schemes like those found at http://en.wikipedia.org/wiki/Memory_management such as memory pools or the buddy system, and components of virtual memory, such as paging. Do they both accomplish the same thing or different things? How are they typically implemented in modern operating systems?
They are complementary. Memory management generally refers to how virtual address space is allocated to hold objects in a program. The goal is to reduce fragmentation.
Virtual memory is a system that allows processes to believe they have more memory than actually exists, allows processes to share parts of their memory without worrying about protecting the rest, and so on. The OS's job here is to decide which pages should be backed by physical memory, and how to swap out the ones that aren't in use.
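To illustrate the allocator side of that split, here is a minimal sketch of a fixed-size memory pool. The class name, base address and block size are hypothetical. Note that it only hands out addresses within a range; whether those addresses are currently backed by physical memory is the virtual memory system's concern, not the pool's.

```python
class FixedBlockPool:
    """Hands out equally sized blocks from one contiguous address range."""

    def __init__(self, base, block_size, block_count):
        self.block_size = block_size
        # Free list, arranged so the lowest address is handed out first.
        self.free = [base + i * block_size
                     for i in reversed(range(block_count))]
        self.in_use = set()

    def alloc(self):
        """Return the address of a free block, or None if exhausted."""
        if not self.free:
            return None
        addr = self.free.pop()
        self.in_use.add(addr)
        return addr

    def release(self, addr):
        """Return a block to the pool so it can be reused."""
        if addr not in self.in_use:
            raise ValueError("address was not allocated from this pool")
        self.in_use.remove(addr)
        self.free.append(addr)

if __name__ == "__main__":
    pool = FixedBlockPool(base=0x1000, block_size=64, block_count=3)
    a, b, c = pool.alloc(), pool.alloc(), pool.alloc()
    print(hex(a), hex(b), hex(c), pool.alloc())
    pool.release(b)
    print(hex(pool.alloc()))   # the freed block is reused
```

Because every block has the same size, a freed block can always satisfy the next request, which is how pools avoid fragmentation.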
This is a re-submission, because I am not getting any response from superuser.com. Sorry for the misunderstanding.
I need to understand the difference between physical addressing and virtual addressing in embedded systems.
Why is virtual addressing implemented in embedded systems?
What is the advantage of virtual addressing over physical addressing in embedded systems?
How is the mapping from virtual addresses to physical addresses done in embedded systems?
Please explain these concepts with some simple examples on a simple architecture.
Physical addressing means that your program actually knows the real layout of RAM. When you access a variable at address 0x8746b3, that's where it's really stored in the physical RAM chips.
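With virtual addressing, all application memory accesses go through a page table, which maps the virtual address to a physical address. So every application has its own "private" address space, and no program can read or write another program's memory. This is per-process address-space isolation. (It is sometimes loosely called segmentation, but segmentation strictly refers to a different scheme based on variable-sized segments rather than fixed-size pages.)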
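The lookup described above can be sketched in a few lines. This is a simplified model, assuming 4 KiB pages and a flat, single-level page table represented as a dictionary; real hardware typically uses multi-level tables, and the names `translate` and `PageFault` are chosen for this example only.

```python
PAGE_SIZE = 4096  # assumed page size of 4 KiB

class PageFault(Exception):
    """Raised when a virtual page has no mapping in the page table."""

def translate(page_table, vaddr):
    """Map a virtual address to a physical one via a per-process table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # virtual page number + offset
    if vpn not in page_table:
        raise PageFault(f"no mapping for virtual page {vpn}")
    return page_table[vpn] * PAGE_SIZE + offset

if __name__ == "__main__":
    # Two processes use the same virtual address but different frames.
    proc_a = {0: 5, 1: 9}
    proc_b = {0: 2}
    print(hex(translate(proc_a, 0x0010)))   # 0x5010
    print(hex(translate(proc_b, 0x0010)))   # 0x2010
    try:
        translate(proc_b, 0x1234)           # page 1 is unmapped for B
    except PageFault as exc:
        print("fault:", exc)
```

The same virtual address lands in different physical frames for each process, and an address without a mapping cannot be reached at all, which is where the isolation comes from.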
Virtual addressing has many benefits. It protects programs from crashing each other through poor pointer manipulation, etc. Because each program has its own distinct virtual address space, no program can read another's data - this is both a safety and a security plus. Virtual memory also enables swapping (paging to disk), where a program's pages may be moved from physical RAM to a disk (or, now, slower flash) when not in use, then brought back when the application attempts to access them. Also, since only one program can occupy a particular physical address, a system with physical addressing requires that either a) all programs be compiled to load at different memory addresses, b) every program use Position-Independent Code, or c) some sets of programs cannot run simultaneously.
The virtual-to-physical mapping may be done in software (with hardware support for memory traps) or in pure hardware. Sometimes even the page tables themselves are held in dedicated hardware memory. I don't know off the top of my head which embedded system does what, but every desktop has a hardware TLB (Translation Lookaside Buffer, basically a cache for the virtual-to-physical mappings) and some now have advanced Memory Management Units that help with virtual machines and the like.
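As a rough model of what a TLB does, here is a sketch of a small cache placed in front of a page table. It assumes 4 KiB pages and a least-recently-used replacement policy, which is an assumption made for this example; real TLBs vary in size, associativity and replacement policy. The class name and counters are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size of 4 KiB

class TLB:
    """Tiny LRU cache of virtual-page -> physical-frame mappings."""

    def __init__(self, page_table, capacity=2):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            frame = self.entries[vpn]
            self.entries.move_to_end(vpn)        # mark as recently used
        else:
            self.misses += 1
            frame = self.page_table[vpn]         # the slow page-table walk
            self.entries[vpn] = frame
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False) # evict least recently used
        return frame * PAGE_SIZE + offset

if __name__ == "__main__":
    tlb = TLB({0: 7, 1: 3, 2: 8}, capacity=2)
    for addr in (0x0000, 0x0004, 0x1000, 0x2000, 0x0008):
        tlb.translate(addr)
    print(tlb.hits, tlb.misses)
```

Repeated accesses to the same page skip the page-table walk, which is why a TLB recovers most of the speed that address translation would otherwise cost.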
The only downsides of virtual memory are added complexity in the hardware implementation and slower performance.
The VAX (Virtual Address eXtension, by Digital Equipment Corp., which became part of Compaq, which became part of HP) is a very good example of a hardware system with virtual memory built in. It was a 32-bit minicomputer that had an OS called VMS, or Virtual Memory System. Dave Cutler was one of the principal architects of the system, and he much later wrote the kernel for Windows NT. His work is well worth reading about for this and other topics. The VAX had special hardware for control of the virtual address space and for control of opcode access, giving hardware-enforced security... very secure. This system was, or is, the grandfather of the modern-day PC at the kernel level. The first BSOD I saw on WNT 3.51 I was able to read, because it came from the crash dump used in VMS to stop the system when unstable. By the way, look at the names VMS and WNT: shifting each letter of VMS to the next letter of the alphabet gives WNT. This was not an accident; maybe it was a jab at DEC for letting him go.