Deleting shared memory with ipcrm in Linux - shared-memory

I am working with a shared memory application, and to delete the segments I use the following command:
ipcrm -M 0x0000162e (this is the key)
But I do not know if I'm doing the right things, because when I run ipcs I see the same segment but with the key 0x0000000. So is the memory segment really deleted? When I run my application several times I see different memory segments with the key 0x000000, like this:
key shmid owner perms bytes nattch status
0x00000000 65538 me 666 27 2 dest
0x00000000 98307 me 666 5 2 dest
0x00000000 131076 me 666 5 1 dest
0x00000000 163845 me 666 5 0
What is actually happening? Is the memory segment really deleted?
Edit: The problem was - as said below in the accepted answer - that there were two processes using the shared memory, until all the process were closed, the memory segment is not going to disappear.

I vaguely remember from my UNIX (AIX and HPUX, I'll admit I've never used shared memory in Linux) days that deletion simply marks the block as no longer attachable by new clients.
It will only be physically deleted at some point after there are no more processes attached to it.
This is the same as with regular files that are deleted, their directory information is removed but the contents of the file only disappear after the last process closes it. This sometimes leads to log files that take up more and more space on the file system even after they're deleted as processes are still writing to them, a consequence of the "detachment" between a file pointer (the zero or more directory entries pointing to an inode) and the file content (the inode itself).
You can see from your ipcs output that 3 of the 4 still have attached processes so they won't be going anywhere until those processes detach from the shared memory blocks. The other's probably waiting for some 'sweep' function to clean it up but that would, of course, depend on the shared memory implementation.
A well-written client of shared memory (or log files for that matter) should periodically re-attach (or roll over) to ensure this situation is transient and doesn't affect the operation of the software.

You said that you used the following command
ipcrm -M 0x0000162e (this is the key)
From the man page for ipcrm
-M shmkey
Mark the shared memory segment associated with key shmkey for
removal. This marked segment will be destroyed after the
last detach.
So the behaviour of -M options does exactly what you observed, ie set the segment to be destroyed only after the last detach.

Related

accessing process memory parts

I'm currently studying memory management of OS by the video lecture. The instructor says,
In fact, you may have, and it is quite often the case that there may
be several parts of the process memory, which are not even accessed at
all. That is, they are neither executed, loaded or stored from memory.
I don't understand the saying since even if in a simple C program, we access whole address space of it. Don't we?
#include <stdio.h>
int main()
{
printf("Hello, World!");
return 0;
}
Could you elucidate the saying? If possible could you provide an example program wherein "several parts of the process memory, which are not even accessed at all" when it is run.
Imagine you have a large and complicated utility (e.g. a compiler), and the user asks it for help (e.g. they type gcc --help instead of asking it to compile anything). In this case, how much of the utility's code and data is used?
Most programs have various optional parts that aren't used (e.g. maybe something that works with graphics will have some code for 16 bits per pixel and other code for 32 bits per pixel, and will determine which code to use and not use the other code). Most heap allocators are "eager" (e.g. they'll ask the OS for 20 MiB of space and then might only "malloc() 2 MiB of it). Sometimes a program will memory map a huge file but then only access a small part of it.
Even for your trivial "hello world" example code; the virtual address space probably contains a huge (several MiB) shared library to support lots of C standard library functions (e.g. puts(), fprintf(), sprintf(), ...) and your program only uses a small part of that shared library; and your program probably reserves a conservative amount of space for its stack (e.g. maybe 20 KiB of space for its stack) and then probably only uses a few hundred bytes of stack.
In a virtual memory system, the address space of the process is created in secondary store at start up. Little or nothing gets placed in memory. For example, the operating system may use the executable file as the page file for the code and static data. It just sets up an internal structure that says some range of memory is mapped to these blocks in the executable file. The same goes for shared libraries. The other data gets mapped to the page file.
As your program runs it starts page faulting rapidly because nothing is in memory and the operating system has to load it from secondary storage.
If there is something that your program does not reference, it never gets loaded into memory.
If you had global variable declared like
char somedata [1045] ;
and your program never references that variable, it will never get loaded into memory. The same goes for code. If you have pages of code that done get execute (e.g. error handling code) it does not get loaded. If you link to shared libraries, you will likely bece including a lot of functions that you never use. Likewise, they will not get loaded if you do not execute them.
To begin with, not all of the address space is backed by physical memory at all times, especially if your address space covers 248+ bytes, which your computer doesn't have (which is not to say you can't map most of the address space to a single physical page of memory, which would be of very little utility for anything).
And then some portions of the address space may be purposefully permanently inaccessible, like a few pages near virtual address 0 (to catch NULL pointer dereferences).
And as it's been pointed out in the other answers, with on-demand loading of programs, you may have some portions of the address space reserved for your program but if the program doesn't happen to need any of its code or data there, nothing needs to be loader there either.

ext4 commit= mount option and dirty_writeback_centisecs

I'm tring to understand the way bytes go from write() to the phisical disk plate to tune my picture server performance.
Thing I don't understand is what is the difference between these two: commit= mount option and dirty_writeback_centisecs. Looks like they are about the same procces of writing changes to the storage device, but still different.
I do not get it clear which one fires first on the way to the disk for my bytes.
Yeah, I just ran into this investigating mount options for an SDCard Ubuntu install on an ARM Chromebook. Here's what I can tell you...
Here's how to see the dirty and writeback amounts:
user#chrubuntu:~$ cat /proc/meminfo | grep "Dirty" -A1
Dirty: 14232 kB
Writeback: 4608 kB
(edit: This dirty and writeback is rather high, I had a compile running when I ran this.)
So data to be written out is dirty. Dirty data can still be eliminated (if say, a temporary file is created, used, and deleted before it goes to writeback, it'll never have to be written out). As dirty data is moved into writeback, the kernel tries to combine smaller requests that may be into dirty into single larger I/O requests, this is one reason why dirty_expire_centisecs is usually not set too low. Dirty data is usually put into writeback when a) Enough data is cached to get up to vm.dirty_background_ratio. b) As data gets to be vm.dirty_writeback_centisecs centiseconds old (3000 default is 30 seconds) it is put into writeback. vm.dirty_writeback_centisecs, a writeback daemon is run by default every 500 centiseconds (5 seconds) to actually flush out anything in writeback.
fsync will flush out an individual file (force it from dirty into writeback and wait until it's flushed out of writeback), and sync does that with everything. As far as I know, it does this ASAP, bypassing any attempt to try to balance disk reads and writes, it stalls the device doing 100% writes until the sync completes.
commit=5 default ext4 mount option actually forces a sync() every 5 seconds on that filesystem. This is intended to ensure that writes are not unduly delayed if there's heavy read activity (ideally losing a maximum of 5 seconds of data if power is cut or whatever.) What I found with an Ubuntu install on SDCard (in a Chromebook) is that this actually just leads to massive filesystem stalls like every 5 seconds if you're writing much to the card, ChromeOS uses commit=600 and I applied that Ubuntu-side to good effect.
The dirty_writeback_centisecs, configures the daemons of the kernel Linux related to the virtual memory (that's why the vm). Which are in charge of making a write back from the RAM memory to all the storage devices, so if you configure the dirty_writeback_centisecs and you have 25 different storage devices mounted on your system it will have the same amount of time of writeback for all the 25 storage systems.
While the commit is done per storage device (actually is per filesystem) and is related to the sync process instead of the daemons from the virtual memory.
So you can see it as:
dirty_writeback_centisecs
writing from RAM to all filesystems
commit
each filesystem fetches from RAM

Does Windows clear memory pages?

I know that Windows has an option to clear the page file when it shuts down.
Does Windows do anything special with the actual physical/virtual memory when it goes in or out of scope?
For instance, let's say I run application A, which writes a recognizable string to a variable in memory, and then I close the application. Then I run application B. It allocates a large chunk of memory, leaves the contents uninitialized, and searches it for the known string written by application A.
Is there ANY possibility that application B will pick up the string written by application A? Or does Windows scrub the memory before making it available?
Windows does "scrub" the freed memory returned by a process before allocating it to other processes. There is a kernel thread specifically for this task alone.
The zero page thread runs at the lowest priority and is responsible for zeroing out free pages before moving them to the zeroed page list[1].
Rather than worrying about retaining sensitive data in the paging file, you should be worried about continuing to retain it in memory (after use) in the first place. Clearing the page-file on shutdown is not the default behavior. Also a system crash dump will contain any sensitive info that you may have in "plain-text" in RAM.
Windows does NOT "scrub" the memory as long as it is allocated to a process (obviously). Rather it is left to the program(mer) to do so. For this very purpose one can use the SecureZeroMemory() function.
This function is defined as the RtlSecureZeroMemory() function ( see WinBase.h). The implementation of RtlSecureZeroMemory() is provided inline and can be used on any version of Windows ( see WinNT.h)
Use this function instead of ZeroMemory() when you want to ensure that your data will be overwritten promptly, as some C++ compilers can optimize a call to ZeroMemory() by removing it entirely.
WCHAR szPassword[MAX_PATH];
/* Obtain the password */
if (GetPasswordFromUser(szPassword, MAX_PATH))
{
UsePassword(szPassword);
}
/* Before continuing, clear the password from memory */
SecureZeroMemory(szPassword, sizeof(szPassword));
Don't forget to read this interesting article by Raymond Chen.

redis bgsave failed because fork Cannot allocate memory

all:
here is my server memory info with 'free -m'
total used free shared buffers cached
Mem: 64433 49259 15174 0 3 31
-/+ buffers/cache: 49224 15209
Swap: 8197 184 8012
my redis-server has used 46G memory, there is almost 15G memory left free
As my knowledge,fork is copy on write, it should not failed when there has 15G free memory,which is enough to malloc necessary kernel structures .
besides, when redis-server used 42G memory, bgsave is ok and fork is ok too.
Is there any vm parameter I can tune to make fork return success ?
More specifically, from the Redis FAQ
Redis background saving schema relies on the copy-on-write semantic of fork in modern operating systems: Redis forks (creates a child process) that is an exact copy of the parent. The child process dumps the DB on disk and finally exits. In theory the child should use as much memory as the parent being a copy, but actually thanks to the copy-on-write semantic implemented by most modern operating systems the parent and child process will share the common memory pages. A page will be duplicated only when it changes in the child or in the parent. Since in theory all the pages may change while the child process is saving, Linux can't tell in advance how much memory the child will take, so if the overcommit_memory setting is set to zero fork will fail unless there is as much free RAM as required to really duplicate all the parent memory pages, with the result that if you have a Redis dataset of 3 GB and just 2 GB of free memory it will fail.
Setting overcommit_memory to 1 says Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Redis.
Redis doesn't need as much memory as the OS thinks it does to write to disk, so may pre-emptively fail the fork.
Modify /etc/sysctl.conf and add:
vm.overcommit_memory=1
Then restart sysctl with:
On FreeBSD:
sudo /etc/rc.d/sysctl reload
On Linux:
sudo sysctl -p /etc/sysctl.conf
From proc(5) man pages:
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
In mode 0, calls of mmap(2) with MAP_NORESERVE set are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed". Under Linux 2.4
any non-zero value implies mode 1. In mode 2 (available since Linux 2.6), the total virtual address space on the system is limited to (SS + RAM*(r/100)), where SS is the size
of the swap space, and RAM is the size of the physical memory, and r is the contents of the file /proc/sys/vm/overcommit_ratio.
Redis's fork-based snapshotting method can effectively double physical memory usage and easily OOM in cases like yours. Reliance on linux virtual memory for doing snapshotting is problematic, because Linux has no visibility into Redis data structures.
Recently a new redis-compatible project Dragonfly has been released. Among other things, it solves the OOM problem entirely. (disclosure - I am the author of this project).

when using shared memory in unix

When you code a data supplier app in C for Unix that uses shared memory when do you detach the shared memory only when the server exits or when you are finished updating the shared memory ?
AFAIK, keeping it attached will not bother.
However since the attachment tracks the number of processes attached, if that count is >0, then you won't be allowed to destroy your shm until that count get back to 0 (in other words when all process are detached).
If you have a main process attached, I'm not sure you will be able to destroy it from an external "administrative" process for you shm.
In my personnal experience, I don't detach the SHM after write operations, only at process exit.

Resources