Tachyon Doesn't Seem to be Aware of Available Memory - alluxio

Just to see if Tachyon would give me an error about configured memory being more than available I set:
# Some value over combined available mem and disk space.
export TACHYON_WORKER_MEMORY_SIZE=1000GB
And observed the allocation in the web UI without error.
Is some of the info going to be pushed to disk when available RAM is exceeded?
What happens when it exceeds disk space? Dropped file errors or system failure?

This is the expected (if perhaps unhelpful behaviour) and ultimately it is to do with the fact that Tachyon uses Linux ramfs as the in-memory storage.
As this article explains:
ramfs file systems cannot be limited in size like a disk base file
system which is limited by it’s capacity. ramfs will continue using
memory storage until the system runs out of RAM and likely crashes or
becomes unresponsive.
Note that Tachyon will enforce the size constraint based on the size you give it. However as you've found you can allocate more RAM than is actually available and Tachyon won't check this so you may want to go ahead and file a bug report.
To answer your specific questions:
No excess data will not be pushed to disk automatically
When RAM is full behaviour is OS dependent
Note that the setting you are referring to only controls the in-memory space, if you want to use local disks in addition to RAM then you need to use Tachyon's Tiered Storage.

Related

mlock usage in RAM only system

We have a time-critical process (third party code) and it has mlockall. I am porting this code to our embedded system which doesnt have a hard disk.
It is RAM only system, which boots from SD card and has storage as well in SD card.
Does mlockall has any performance benefits at all in an RAM only system?
Just to check whether there is any swap partition it tried:
cat /proc/swaps
Filename Type Size Used Priority
In this case can i remove mlockall as it is not going to add any value in our system. Kindly help.
When under memory pressure the linux kernel might decide to evict pages from RAM. Dirty pages (containing writeable data) can only be swapped out. Clean pages (read only) like the text section of your process might just be dropped. Such pages would be brought back in via a page fault when needed.
The first case can not happen to you as there is no swap. The second case might happen and could be prevented via the mlockall call.
Overall the discussion is theoretical as running a system under memory pressure introduces a lot of non deterministic behaviour which is bad for a real time system.

what are pagecache, dentries, inodes?

Just learned these 3 new techniques from https://unix.stackexchange.com/questions/87908/how-do-you-empty-the-buffers-and-cache-on-a-linux-system:
To free pagecache:
# echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
# echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
# echo 3 > /proc/sys/vm/drop_caches
I am trying to understand what exactly are pagecache, dentries and inodes. What exactly are they?
Do freeing them up also remove the useful memcached and/or redis cache?
--
Why i am asking this question? My Amazon EC2 server RAM was getting filled up over the days - from 6% to up to 95% in a matter of 7 days. I am having to run a bi-weekly cronjob to remove these cache. Then memory usage drops to 6% again.
With some oversimplification, let me try to explain in what appears to be the context of your question because there are multiple answers.
It appears you are working with memory caching of directory structures. An inode in your context is a data structure that represents a file. A dentries is a data structure that represents a directory. These structures could be used to build a memory cache that represents the file structure on a disk. To get a directly listing, the OS could go to the dentries--if the directory is there--list its contents (a series of inodes). If not there, go to the disk and read it into memory so that it can be used again.
The page cache could contain any memory mappings to blocks on disk. That could conceivably be buffered I/O, memory mapped files, paged areas of executables--anything that the OS could hold in memory from a file.
Your commands flush these buffers.
I am trying to understand what exactly are pagecache, dentries and
inodes. What exactly are they?
user3344003 already gave an exact answer to that specific question, but it's still important to note those memory structures are dynamically allocated.
When there's no better use for "free memory", memory will be used for those caches, but automatically purged and freed when some other "more important" application wants to allocate memory.
No, those caches don't affect any caches maintained by any applications (including redis and memcached).
My Amazon EC2 server RAM was getting filled up over the days - from 6%
to up to 95% in a matter of 7 days. I am having to run a bi-weekly
cronjob to remove these cache. Then memory usage drops to 6% again.
Probably you're mis-interpreting the situation: your system may just be making efficient usage of its ressources.
To simplify things a little bit: "free" memory can also be seen as "unused", or even more dramatic - a waste of resources: you paid for it, but don't make use of it. That's a very un-economic situation, and the linux kernel tries to make some "more useful" use of your "free" memory.
Part of its strategy involves using it to save various kinds of disk I/O by using various dynamically sized memory caches. A quick access to cache memory saves "slow" disk access, so that's often a useful idea.
As soon as a "more important" process wants to allocate memory, the Linux kernel voluntarily frees those caches and makes the memory available to the requesting process. So there's usually no need to "manually free" those caches.
The Linux kernel may even decide to swap out memory of an otherwise idle process to disk (swap space), freeing RAM to be used for "more important" tasks, probably also including to be used as some cache.
So as long as your system is not actively swapping in/out, there's little reason to manually flush caches.
A common case to "manually flush" those caches is purely for benchmark comparison: your first benchmark run may run with "empty" caches and so give poor results, while a second run will show much "better" results (due to the pre-warmed caches). By flushing your caches before any benchmark run, you're removing the "warmed" caches and so your benchmark runs are more "fair" to be compared with each other.
Common misconception is that "Free Memory" is important.
Memory is meant to be used.
So let's clear that out :
There's used memory, which is where important data is stored, and if that reaches 100% you're dead
Then there's cache/buffer, which is used as long as there is space to do so. It's facultative memory to access disk files faster, mostly. If you run out of free memory, this will just free itself and let you access disk directly.
Clearing cached memory as you suggest is most of the case useless and means you're deactivating an optimization, therefore you'll get a slow down.
If you really run out of memory, that is if your "used memory" is high, and you begin to see swap usage, then you must do something.
HOWEVER : there's a known bug running on AWS instances, with dentry cache eating memory with no apparent reason. It's clearly described and solved in this blog.
My own experience with this bug is that "dentry" cache consumes both "used" and "cached" memory and does not seem to release it in time, eventually causing swap. The bug itself can consume resources anyway, so you need to look into it.
Hate to bring an old thread back from the dead, but I've been dealing with memory issues lately on my Linux Virtual Machines. Unfortunately, even with the virtualization of computing machines being great and the advancements of Linux memory and resource allocation being superb, conflicts occur when the hypervisor acts out what it calls "performance features".
VMWare will actively send RAM that hasn't been "written or modified" recently, to the disk. When your disk is on a SAN, that means reading from the RAM is now at 1Gbps to 10Gbps at best if you have a REALLY performant RAID and steady network access (ignoring the fact that now the RAM of say 100 VMs are all using the same SAN). DDR3 RAM operates at 25Gbps+ on modern systems, so I'll assume you can see the problem with systems running at 1/25th to less than 1/2 of the speed anticipated.
The caches on my linux systems are literally the same speed as disk I/O of the filesystem, meaning they do not help our performance and are actively sending the OS's RAM into Swap instead of clearing caches. This is a huge problem thanks to VMWare, not because of Linux, but be aware that cloud infrastructure often does stupid crap like this all the time unfortunately. You can read more here: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/perf-vsphere-memory_management.pdf or if you use VMWare, surely you'll notice the "allocated memory" vs "active memory" and where your VMs will always display a different amount than VMWare because of this distinction and treatment of the memory.

Will Windows be still able to allocate memory when free space in physical memory is very low?

On Windows 32-bit system the application is being developed using Visual Studio:
Lets say lots of other application running on my machine and they have occupied almost all of physical memory and only 1 MB memory is left free. If my application (which has not yet allocated any memory) tries to allocate, say 2 MB, will the call be successful?
My guess: In theory, each Windows application has 2GB of virtual memory available.
So I believe this call should be successful (regardless how much physical memory is available). But I am not sure on this. That's why asking here.
Windows gives a rock-hard guarantee that this will always work. A process can only allocate virtual memory when Windows can commit space in the paging file for the allocation. If necessary, it will grow the paging file to make the space available. If that fails, for example when the paging file grows beyond the preset limit, then the allocation fails as well. Windows doesn't have the equivalent of the Linux "OOM killer", it does not support over-committing that may require an operating system to start randomly killing processes to find RAM.
Do note that the "always works" clause does have a sting. There is no guarantee on how long this will take. In very extreme circumstances the machine can start thrashing where just about every memory access in the running processes causes a page fault. Code execution slows down to a crawl, you can lose control with the mouse pointer frozen when Explorer or the mouse or video driver start thrashing as well. You are well past the point of shopping for RAM when that happens. Windows applies quotas to processes to prevent them from hogging the machine, but if you have enough processes running then that doesn't necessarily avoid the problem.
Of course. It would be lousy design if memory had to be wasted now in order to be used later. Operating systems constantly re-purpose memory to its most advantageous use at any moment. They don't have to waste memory by keeping it free just so that it can be used later.
This is one of the benefits of virtual memory with a page file. Because the memory is virtual, the system can allocate more virtual memory than physical memory. Virtual memory that cannot fit in physical memory, is pushed out to the page file.
So the fact that your system may be using all of the physical memory does not mean that your program will not be able to allocate memory. In the scenario that you describe, your 2MB memory allocation will succeed. If you then access that memory, the virtual memory will be paged in to physical memory and very likely some other pages (maybe in your process, maybe in another process) will be pushed out to the page file.
Well, it will succeed as long as there's some memory for it - apart from physical memory, there's also the page file.
However, once you reach the limit of both RAM and the page file, you're done for and that's when the out of memory situation really starts being fun.
Now, systems like Windows Vista will try to use all of your available RAM, pretty much for caching. That's a good thing, and when there's a request for memory from an application, the cache will be thrown away as needed.
As for virtual memory, you can request much more than you have available, regardless of your RAM or page file size. Only when you commit the memory does it actually need some backing - either RAM or the page file. On 64-bit, you can easily request terabytes of virtual memory - that doesn't mean you'll get it when you try to commit it, though :P
If your application is unable to allocate a physical memory (RAM) block to store information, the operating system takes over and 'pages' or stores sections that are in RAM on disk to free up physical memory so that your program is able to perform the allocation. This is done automatically and is completely invisible to your applications.
So, in your example, on a system that has 1MB RAM free, if your application tries to allocate memory, the operating system will page certain contents of physical memory to disk and free up RAM for your application. Your application will not crash in this case.
This, obviously is much more complicated than that.
There are several ways to configure a page file on Windows (fixed size, variable size and on which disk). If you run out of physical memory, and out of hard drive space (because your page file has grown very large due to excessive 'paging') or reach the limit of your paging file (if it is a static limit) then your applications will fail due out an out-of-memory exception. With today's systems with large local storage however, this is a rare event.
Be sure to read about paging for the full picture. Check out:
http://en.wikipedia.org/wiki/Paging
In certain cases, you will notice that you have sufficient free physical memory. Say 100MB and your program tries to allocate a 10MB block to store a large object but fails. This is caused by physical memory fragmentation. Although the total free memory is 100MB, there is no single contiguous block of 10MB that can be used to store your object. This will result in an exception that needs to be handled in your code. If you allocate large objects in your code you may want to separate the allocation into smaller blocks to facilitate allocation, and then aggregate them back in your code logic. For example, instead of having a single 10m vector, you can declare 10 x 1m vectors in an array and allocate memory for each individual one.

What pitfalls should I be wary of when memory mapping BIG files?

I have a bunch of big files, each file can be over 100GB, the total amount of data can be 1TB and they are all read-only files (just have random reads).
My program does small reads in these files on a computer with about 8GB main memory.
In order to increase performance (no seek() and no buffer copying) i thought about using memory mapping, and basically memory-map the whole 1TB of data.
Although it sounds crazy at first, as main memory << disk, with an insight on how virtual memory works you should see that on 64bit machines there should not be problems.
All the pages read from disk to answer to my read()s will be considered "clean" from the OS, as these pages are never overwritten. This means that all these pages can go directly to the list of pages that can be used by the OS without writing back to disk OR swapping (wash them). This means that the operating system could actually store in physical memory just the LRU pages and would operate just reads() when the page is not in main memory.
This would mean no swapping and no increase in i/o because of the huge memory mapping.
This is theory; what I'm looking for is any of you who has every tried or used such an approach for real in production and can share his experience: are there any practical issues with this strategy?
What you are describing is correct. With a 64-bit OS you can map 1TB of address space to a file and let the OS manage reading and writing to the file.
You didn't mention what CPU architecture you are on but most of them (including amd64) the CPU maintains a bit in each page table entry as to whether data in the page has been written to. The OS can indeed use that flag to avoid writing pages that haven't been modified back to disk.
There would be no increase in IO just because the mapping is large. The amount of data you actually access would determine that. Most OSes, including Linux and Windows, have a unified page cache model in which cached blocks use the same physical pages of memory as memory mapped pages. I wouldn't expect the OS to use more memory with memory mapping than with cached IO. You're just getting direct access to the cached pages.
One concern you may have is with flushing modified data to disk. I'm not sure what the policy is on your OS specifically but the time between modifying a page and when the OS will actually write that data to disk may be a lot longer than your expecting. Use a flush API to force the data to be written to disk if it's important to have it written by a certain time.
I haven't used file mappings quite that large in the past but I would expect it to work well and at the very least be worth trying.

Memory mapped files causes low physical memory

I have a 2GB RAM and running a memory intensive application and going to low available physical memory state and system is not responding to user actions, like opening any application or menu invocation etc.
How do I trigger or tell the system to swap the memory to pagefile and free physical memory?
I'm using Windows XP.
If I run the same application on 4GB RAM machine it is not the case, system response is good. After getting choked of available physical memory system automatically swaps to pagefile and free physical memory, not that bad as 2GB system.
To overcome this problem (on 2GB machine) attempted to use memory mapped files for large dataset which are allocated by application. In this case virtual memory of the application(process) is fine but system cache is high and same problem as above that physical memory is less.
Even though memory mapped file is not mapped to process virtual memory system cache is high. why???!!! :(
Any help is appreciated.
Thanks.
If your data access pattern for using the memory mapped file is sequential, you might get slightly better page recycling by specifying the FILE_FLAG_SEQUENTIAL_SCAN flag when opening the underlying file. If your data pattern accesses the mapped file in random order, this won't help.
You should consider decreasing the size of your map view. That's where all the memory is actually consumed and cached. Since it appears that you need to handle files that are larger than available contiguous free physical memory, you can probably do a better job of memory management than the virtual memory page swapper since you know more about how you're using the memory than the virtual memory manager does. If at all possible, try to adjust your design so that you can operate on portions of the large file using a smaller view.
Even if you can't get rid of the need for full random access across the entire range of the underlying file, it might still be beneficial to tear down and recreate the view as needed to move the view to the section of the file that the next operation needs to access. If your data access patterns tend to cluster around parts of the file before moving on, then you won't need to move the view as often. You'll take a hit to tear down and recreate the view object, but since tearing down the view also releases all the cached pages associated with the view, it seems likely you'd see a net gain in performance because the smaller view significantly reduces memory pressure and page swapping system wide. Try setting the size of the view based on a portion of the installed system RAM and move the view around as needed by your file processing. The larger the view, the less you'll need to move it around, but the more RAM it will consume potentially impacting system responsiveness.
As I think you are hinting in your post, the slow response time is probably at least partially due to delays in the system while the OS writes the contents of memory to the pagefile to make room for other processes in physical memory.
The obvious solution (and possibly not practical) is to use less memory in your application. I'll assume that is not an option or at least not a simple option. The alternative is to try to proactively flush data to disk to continually keep available physical memory for other applications to run. You can find the total memory on the machine with GlobalMemoryStatusEx. And GetProcessMemoryInfo will return current information about your own application's memory usage. Since you say you are using a memory mapped file, you may need to account for that in addition. For example, I believe the PageFileUsage information returned from that API will not include information about your own memory mapped file.
If your application is monitoring the usage, you may be able to use FlushViewOfFile to proactively force data to disk from memory. There is also an API (EmptyWorkingSet) that I think attempts to write as many dirty pages to disk as possible, but that seems like it would very likely hurt performance of your own application significantly. Although, it could be useful in a situation where you know your application is going into some kind of idle state.
And, finally, one other API that might be useful is SetProcessWorkingSetSizeEx. You might consider using this API to give a hint on an upper limit for your application's working set size. This might help preserve more memory for other applications.
Edit: This is another obvious statement, but I forgot to mention it earlier. It also may not be practical for you, but it sounds like one of the best things you might do considering that you are running into 32-bit limitations is to build your application as 64-bit and run it on a 64-bit OS (and throw a little bit more memory at the machine).
Well, it sounds like your program needs more than 2GB of working set.
Modern operating systems are designed to use most of the RAM for something at all times, only keeping a fairly small amount free so that it can be immediately handed out to processes that need more. The rest is used to hold memory pages and cached disk blocks that have been used recently; whatever hasn't been used recently is flushed back to disk to replenish the pool of free pages. In short, there isn't supposed to be much free physical memory.
The principle difference between using a normal memory allocation and memory mapped a files is where the data gets stored when it must be paged out of memory. It doesn't necessarily have any effect on when the memory will be paged out, and will have little effect on the time it takes to page it out.
The real problem you are seeing is probably not that you have too little free physical memory, but that the paging rate is too high.
My suggestion would be to attempt to reduce the amount of storage needed by your program, and see if you can increase the locality of reference to reduce the amount of paging needed.

Resources