OS X Shared Memory Cleanup

I'm sharing memory between a parent process and multiple child processes by allocating shared memory segments with shm_open/mmap on OS X. Either the parent or a child may create a segment and then communicate its identifying name to the others. My understanding is that the parent has to call shm_unlink on each of these segments when it quits in order to clean up the memory; otherwise the shared memory is permanently leaked.
From reading the documentation I had initially thought that shared segments are cleaned up once no process that has them mapped is still alive. However, experiments show that this isn't the case and that someone has to explicitly call shm_unlink.
Is there any way in OS X to list all the currently existing shared memory segments? The problem is that the parent may crash and so never get a chance to call shm_unlink. On Linux my solution is to clean out /dev/shm, but on OS X I would need some way of listing the open shared segments.
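For context, here is a stripped-down sketch of the lifecycle I'm describing (the segment name "/my_segment" and its size are made up for illustration); unlinking the known name first thing at startup guards against a previous crash, and without the final shm_unlink the name, and the memory behind it, sticks around:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_NAME "/my_segment"   /* hypothetical rendezvous name */
#define SEG_SIZE 4096

int main(void)
{
	/* If a previous run crashed, remove any stale segment first. */
	shm_unlink(SEG_NAME);

	int fd = shm_open(SEG_NAME, O_CREAT | O_RDWR, 0600);
	if (fd < 0) { perror("shm_open"); return 1; }
	if (ftruncate(fd, SEG_SIZE) != 0) { perror("ftruncate"); return 1; }

	char *p = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) { perror("mmap"); return 1; }
	close(fd);                          /* the mapping stays valid */

	strcpy(p, "hello from the parent"); /* ... share data with children ... */

	munmap(p, SEG_SIZE);
	shm_unlink(SEG_NAME);               /* without this the segment leaks */
	return 0;
}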

The answer seems to be: you can't.
First, see this question, which quotes a comment in the kernel:
TODO:
(2) Need to export data to a userland tool via a
sysctl. Should ipcs(1) and ipcrm(1) be expanded or should new
tools to manage both POSIX kernel semaphores and POSIX shared
memory be written?
Also see this post on the Apple mailing list unix-porting:
There is no "picps"/"picprm" utility, you are expected to remember what
you create and clean up afterward, or clean up first thing on
restart if you crash a lot, there is nothing exposed directly
in the filesystem namespace, and you are expected to do
the shm_unlink because it is a rendezvous for potentially a
lot of unrelated programs.

Hope you figure out your problem. You can use ipcs -a and look under the Shared Memory heading for NATTCH. That value tells you how many processes are attached to a particular shared memory segment ID.

Related

Storing Per-Process Data in Kernel Module / Passing Data Between sys_enter and sys_exit Probe

Familiarity with how Linux Kernel Tracepoints work is not necessarily required to help with this question, it is just what is motivating this problem. In essence, I am looking for a way to store per-process data for a kernel module, without modifying the Linux source (e.g. struct task_struct), and ideally without using locks. Here is my specific question:
I have a kernel module that hooks into the sys_enter (defined here for x86_64, aarch64) and sys_exit (x86_64, aarch64) tracepoints. For each system call issued, I need to pass some data between the enter probe and the exit probe.
Some things I have considered: I could ...
...use one global variable -- but that will be shared between concurrently executing system calls on different CPUs, creating a race.
...use one global map from PID (of the process issuing the system call) to my data, together with locks -- but that will unnecessarily require synchronization between all CPUs on each system call. I would like to avoid this, since the data is "local" to each issued system call, so I feel like there should be a way to keep it local and not add costly synchronization.
...use a per-CPU global variable -- but (it is my understanding that) a process may move to another CPU during the system call execution, making this approach incorrect.
...kmallocing some memory for my custom data upon each system call entry, then passing the address of that memory by clobbering one of the registers in struct pt_regs (both the entry and exit probes receive a pointer to said struct) -- but then I will have a memory leak for system calls that do not trigger the exit probe (such as sys_exit, which never returns).
I am open to any suggestions on how these ideas could be refined to address the problems I listed, or to any completely different ideas that I am not thinking of.
I'd use an RCU enabled hashtable, for safety.
The first option isn't actually doable, as you stated.
The third one requires you to track which process is using which CPU, which seems unnecessary.
The leaking problem of the fourth option can probably be solved somehow, but allocating memory on each system call can introduce a serious delay.
Of course accessing the hashtable will also slow the system down, but it won't trigger a memory allocation on each system call, so I assume it'll be less harmful.
Also, I may be wrong here, but if you assume that only process creation/destruction will introduce changes to the table itself (not to the data within each entry, but to the location and hash value of each row), then maybe you won't even have to synchronize on each system call, but only on the ones that cause process creation/destruction. A sketch of this idea follows.
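A minimal sketch of that design, assuming entries are added and removed only on process creation/exit, so the syscall probes themselves only take the RCU read lock (all names here -- struct sc_ctx, sc_table, sc_lookup, and so on -- are made up for illustration):

#include <linux/hashtable.h>
#include <linux/printk.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct sc_ctx {
	pid_t pid;                 /* key: the task issuing the syscall */
	unsigned long enter_data;  /* data handed from the enter to the exit probe */
	struct hlist_node node;
	struct rcu_head rcu;
};

static DEFINE_HASHTABLE(sc_table, 10);  /* 1024 buckets */
static DEFINE_SPINLOCK(sc_lock);        /* serializes writers only */

/* Lookup; call inside rcu_read_lock()/rcu_read_unlock(). */
static struct sc_ctx *sc_lookup(pid_t pid)
{
	struct sc_ctx *ctx;

	hash_for_each_possible_rcu(sc_table, ctx, node, pid)
		if (ctx->pid == pid)
			return ctx;
	return NULL;
}

/* Add an entry when a process of interest appears (fork/exec hook). */
static int sc_track(pid_t pid)
{
	struct sc_ctx *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);

	if (!ctx)
		return -ENOMEM;
	ctx->pid = pid;
	spin_lock(&sc_lock);
	hash_add_rcu(sc_table, &ctx->node, pid);
	spin_unlock(&sc_lock);
	return 0;
}

/* Remove it when the process exits. */
static void sc_untrack(pid_t pid)
{
	struct sc_ctx *ctx;

	spin_lock(&sc_lock);
	rcu_read_lock();
	ctx = sc_lookup(pid);
	rcu_read_unlock();
	if (ctx)
		hash_del_rcu(&ctx->node);
	spin_unlock(&sc_lock);
	if (ctx)
		kfree_rcu(ctx, rcu);
}

/* sys_enter probe: only this task ever writes its own entry, so no lock. */
static void on_sys_enter(unsigned long data)
{
	struct sc_ctx *ctx;

	rcu_read_lock();
	ctx = sc_lookup(current->pid);
	if (ctx)
		ctx->enter_data = data;
	rcu_read_unlock();
}

/* sys_exit probe: read the stashed data back. */
static void on_sys_exit(void)
{
	struct sc_ctx *ctx;

	rcu_read_lock();
	ctx = sc_lookup(current->pid);
	if (ctx)
		pr_debug("enter data was %lu\n", ctx->enter_data);
	rcu_read_unlock();
}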

Can many (similar) processes use a common RAM cache?

As I understand the creation of processes, every process has its own space in RAM for its heap, data, etc., which is allocated upon its creation. Many processes can share their data and storage space in some ways. But since terminating a process erases its allocated memory (and thus its caches), I was wondering whether many (similar) processes could share a cache in memory that is not allocated to any specific process, so that it can still be used after those processes are terminated and new ones are created.
This is a theoretical question from a student's perspective, so I am merely interested in how operating systems handle this in general, without adding extra functionality to achieve it.
For example I think of a webserver that uses only single-threaded processes (maybe due to lack of multi-threading support), so that most of the processes created do similar jobs, like retrieving a certain page.
There are at least four ways in which what you describe can occur.
First, the system address space is shared by all processes. The operating system can save data there that survives the death of a process.
Second, processes can map logical pages to the same physical page frame. The termination of one process does not cause the page frame to be deallocated to the other processes.
Third, some operating systems have support for writable shared libraries.
Fourth, memory mapped files.
There are probably others as well.
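As an illustration of the second and fourth points, a minimal sketch (the file name and size are made up): any process that maps the same file with MAP_SHARED sees the same pages, and the data survives the process exiting because it lives in the page cache and on disk rather than in any one process's heap:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/tmp/shared-cache.bin", O_CREAT | O_RDWR, 0600);
	if (fd < 0) { perror("open"); return 1; }
	if (ftruncate(fd, 4096) != 0) { perror("ftruncate"); return 1; }

	char *cache = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cache == MAP_FAILED) { perror("mmap"); return 1; }
	close(fd);

	if (cache[0] == '\0')
		strcpy(cache, "expensive result computed by an earlier process");
	else
		printf("reused cached result: %s\n", cache);

	munmap(cache, 4096);  /* the data stays behind for the next process */
	return 0;
}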
I think so; when a process is terminated, the RAM it used is cleared. However, you're right that things such as webpages are stored in a cache for when they're recalled. For example:
You open Google, then go to another tab and close the open Google page; when you next go to Google it loads faster.
However, what I think you're asking is: if the entire program (e.g. Google Chrome or Safari) is closed, does the webpage you just had open stay in the cache? No, when the program is closed all of its data is also released in order to fully close the program.
I guess this page has some info on it -
https://www.wikipedia.org/wiki/Shared_memory

Is it possible to associate data with a running process?

As the title says, I want to associate a random bit of data (a ULONG) with a running process on the local machine. I want that data persisted with the process it's associated with, not the process that's reading and writing the data. Is this possible in Win32?
Yes but it can be tricky. You can't access an arbitrary memory address of another process and you can't count on shared memory because you want to do it with an arbitrary process.
The tricky way
What you can do is to create a window (with a special and known name) inside the process you want to decorate. See the end of the post for an alternative solution without windows.
First of all you have to get a handle to the process with OpenProcess.
Allocate memory with VirtualAllocEx in the other process to hold a short method that will create a (hidden) window with a special known name.
Copy that function from your own code with WriteProcessMemory.
Execute it with CreateRemoteThread.
Now you need a way to identify and read back this memory from a process other than the one that created it. For this you can simply find the window with that known name, and you have your holder for a small chunk of data.
Please note that this technique may be used to inject code in another process so some Antivirus may warn about it.
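For the allocate-and-write half of those steps, a rough sketch (the target PID and the tag value are placeholders; rediscovering the remote address later is what the hidden-window trick above is for, and it is omitted here):

#include <windows.h>
#include <stdio.h>

int main(void)
{
	DWORD pid = 1234;        /* hypothetical target process ID */
	ULONG tag = 0xC0FFEE;    /* the data we want to associate  */

	HANDLE hProc = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_READ |
	                           PROCESS_VM_WRITE, FALSE, pid);
	if (!hProc) { printf("OpenProcess failed: %lu\n", GetLastError()); return 1; }

	/* Reserve a small committed block inside the target process. */
	LPVOID remote = VirtualAllocEx(hProc, NULL, sizeof(tag),
	                               MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
	if (!remote) { CloseHandle(hProc); return 1; }

	/* Write the tag into the target's address space. */
	SIZE_T written = 0;
	WriteProcessMemory(hProc, remote, &tag, sizeof(tag), &written);

	/* Later, from any process holding a handle and the address, read it back. */
	ULONG readback = 0;
	SIZE_T got = 0;
	ReadProcessMemory(hProc, remote, &readback, sizeof(readback), &got);
	printf("tag stored at %p in pid %lu: 0x%lX\n", remote, pid, readback);

	CloseHandle(hProc);
	return 0;
}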
Final notes
If Address Space Layout Randomization is disabled you may not need to inject code into the process's memory: you can call CreateRemoteThread with the address of a function exported by kernel32.dll that has a compatible signature, a single pointer-sized parameter (for example LoadLibrary). You can't do this with native applications (not linked to kernel32.dll).
You can't inject into system processes unless you have debug privileges for your process (with AdjustTokenPrivileges).
As an alternative to the fake window, you may create a suspended thread with a local variable, a TLS slot, or a stack entry used as the data chunk. To find this thread you have to give it a name using, for example, this (but it's seldom applicable).
The naive way
A poor man's solution (but probably much easier to implement and in some ways even more robust) can be to use an ADS (NTFS Alternate Data Stream) to hide a small data file for each process you want to monitor (since the ADS is attached to the process's image file, this isn't applicable to services and rundll'ed processes unless you make it much more complicated).
Iterate all processes and for each one create an ADS with a known name (and the process ID).
Inside it you have to store the system startup time and all the data you need.
To read back that information:
Iterate all processes and check for that ADS, read it, and compare the system startup time (if they mismatch, it means you found a widowed ADS and it should be deleted).
Of course you have to take care of these widows, so periodically you may need to check for them. You can avoid this by storing ALL these small chunks of data in a well-known location; your "reader" can then check them all each time, deleting files no longer associated with a running process.
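A tiny sketch of writing such an ADS, assuming an NTFS volume; the host path and the stream name "proc.tag" are made up, and the stream rides along with the host file:

#include <windows.h>
#include <stdio.h>

int main(void)
{
	ULONG tag = 42;   /* the per-process data to persist */

	/* "file:stream" opens an alternate data stream on NTFS. */
	HANDLE h = CreateFileA("C:\\target\\app.exe:proc.tag",
	                       GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
	                       FILE_ATTRIBUTE_NORMAL, NULL);
	if (h == INVALID_HANDLE_VALUE) {
		printf("CreateFile failed: %lu\n", GetLastError());
		return 1;
	}

	DWORD written = 0;
	WriteFile(h, &tag, sizeof(tag), &written, NULL);  /* plus startup time, PID, ... */
	CloseHandle(h);
	return 0;
}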

free mem as function of command 'purge'

One of my apps needs functionality that frees inactive/used/wired memory, just like the command 'purge'.
I've checked and googled a lot, but cannot find anything.
Any comments are welcome.
Purge doesn't do what you seem to think it does. It doesn't "free inactive/used/wired memory". As the manpage says:
It does not affect anonymous memory that has been allocated through malloc, vm_allocate, etc.
All it does is purge the disk cache. This is only useful if you're running performance tests and want to simulate the effects of "first run after cold boot" without actually cold booting. Again, from the manpage:
Purge can be used to approximate initial boot conditions with a cold disk buffer cache for performance analysis.
There is no public API for this, although a quick scan of the symbols shows that it seems to call a function CPOSXPurgeAllDiskBuffers from the CoreProfile private framework. I believe the underlying kernel and userland disk cache code is all or mostly available on http://www.opensource.apple.com, so you could probably implement the same thing yourself, if you really want.
As iMysak says, you can just exec (or NSTask, etc.) the tool if you want to.
As a side note, if you could free used/wired memory, presumably that memory is being used by something; even if you don't have pointers into it in your own data structures, malloc probably does. Are you trying to segfault your code?
Freeing inactive memory is a different story. Just freeing something up to malloc doesn't necessarily make malloc return it to the OS. And there's no way you can force it to. If you think about the way traditional UNIX works, it makes sense: When you ask it to allocate more memory, it uses sbrk to expand your data segment; if you free up memory at the top, it can sbrk back down, but if you free up memory in the middle, there's no way it can do that. Of course modern UNIX systems don't work that way, but the POSIX and C APIs are all designed to be compatible with systems that do. So, if you want to make sure memory gets freed, you have to handle memory allocation directly.
The simplest and most portable way to do this is to create and mmap a temporary backing file, or just MAP_ANON, and explicitly unmap pages when you're done with them. (This works on all POSIX systems—and, with a pretty simple wrapper, even Windows.) If you need even more control (e.g., to manually handle flushing pages to disk, etc.), you can use the mach/mach_vm.h APIs.
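A minimal sketch of that approach (the buffer size is arbitrary; the flag is spelled MAP_ANON on OS X and the BSDs, MAP_ANONYMOUS on Linux):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 64 * 1024 * 1024;   /* 64 MiB working buffer */

	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
	                 MAP_PRIVATE | MAP_ANON, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(buf, 0, len);             /* ... use the buffer ... */

	/* Unlike free(), munmap() is guaranteed to return the pages to the OS. */
	if (munmap(buf, len) != 0) {
		perror("munmap");
		return 1;
	}
	return 0;
}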
You can run it directly from the OS with the exec() function.
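For instance, a bare-bones sketch (using system() rather than a raw exec() for brevity; the tool has to be installed, and on newer systems it typically needs root):

#include <stdlib.h>

int main(void)
{
	/* Runs the purge tool found on PATH and returns its exit status. */
	return system("purge");
}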

FreeBSD: what is shpgperproc responsible for?

I've googled a lot about what "page share factor per proc" is responsible for and found nothing. I have no current problem with it; I'm just curious and want to know more. In sysctl it is:
vm.pmap.shpgperproc
Thanks in advance
The first thing to note is that shpgperproc is a loader tunable, so it can only be set at boot time with an appropriate directive in loader.conf, and it's read-only after that.
The second thing to note is that it's defined in <arch>/<arch>/pmap.c, which handles the architecture-dependent portions of the vm subsystem. In particular, it's actually not present in the amd64 pmap.c - it was removed fairly recently, and I'll discuss this a bit below. However, it's present for the other architectures (i386, arm, ...), and it's used identically on each architecture; namely, it appears as follows:
void
pmap_init(void)
{
	...
	TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc);
	pv_entry_max = shpgperproc * maxproc + cnt.v_page_count;
	...
}
and it's not used anywhere else. pmap_init() is called only once: at boot time, as part of the vm subsystem initialization. maxproc is just the maximum number of processes that can exist (i.e. kern.maxproc), and cnt.v_page_count is just the number of physical pages of memory available (i.e. vm.stats.v_page_count).
A pv_entry is basically just a virtual mapping of a physical page (or more precisely of a struct vm_page), so if two processes share a page and both have it mapped, there will be a separate pv_entry structure for each mapping. Thus, given a page (struct vm_page) that needs to be dirtied or paged out or something else requiring a hardware page table update, the list of corresponding mapped virtual pages can easily be found by walking the corresponding list of pv_entrys (as an example, take a look at i386/i386/pmap.c:pmap_remove_all()).
The use of pv_entrys makes certain VM operations more efficient, but the current implementation (for i386 at least) seems to allocate a static amount of space (see pv_maxchunks, which is set based on pv_entry_max) for pv_chunks, which are used to manage pv_entrys. If the kernel can't allocate a pv_entry after deallocating inactive ones, it panics.
Thus we want to set pv_entry_max based on how many pv_entrys we want space for; clearly we'll want at least as many as there are pages of RAM (which is where cnt.v_page_count comes from). Then we'll want to allow for the fact that many pages will be multiply-virtually-mapped by different processes, since a pv_entry will need to be allocated for each such mapping. Thus shpgperproc - which has a default value of 200 on all arches - is just a way to scale this. On a system where many pages will be shared among processes (say on a heavily-loaded web server running apache), it's apparently possible to run out of pv_entrys, so one will want to bump it up.
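To make the formula concrete (numbers purely illustrative): on a machine with 4 GiB of RAM (about 1,048,576 4 KiB pages) and kern.maxproc at, say, 6164, the default shpgperproc of 200 gives pv_entry_max = 200 * 6164 + 1,048,576, roughly 2.28 million pv_entrys. If a heavily shared workload runs out, the tunable can be raised at boot time in /boot/loader.conf (remember it is read-only afterwards):

# /boot/loader.conf
vm.pmap.shpgperproc="400"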
I don't have a FreeBSD machine close to me right now but it seems this parameter is defined and used in pmap.c, http://fxr.watson.org/fxr/ident?v=FREEBSD7&im=bigexcerpts&i=shpgperproc
