Use 3GB Free Space to Access 30 GB info without Virtual Memory Paging? - algorithm

I have a quick question:
How can we use 3 GB of free space to access roughly 30 GB of data without virtual memory or compression? It's more of a data structure question.
Thanks

You should somehow mimic the paging mechanism.
One way to do it is hashing (1).
Hash all your data into bins, and store these bins on disk. In your main memory (RAM) you will hold only an array of pointers to disk. Once you need an address, you know where it is on disk by accessing the RAM and taking the pointer from the location hash(address).
You can of course optimize it by keeping a portion of the data in memory, using the principle of locality, and hoping for a hit so you avoid reloading a chunk from disk.
(1) The hashing does not have to be complex or uniformly distributed. I believe using the most significant bits (MSBs) of the address will be just fine, and will actually mimic the paging mechanism better.
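For illustration, here is a minimal C sketch of that scheme, assuming fixed-size bins stored contiguously in a single data file; the names (BIN_COUNT, BIN_SIZE, load_bin) are mine, not anything standard:

```c
#include <stdio.h>
#include <stdint.h>

#define BIN_BITS   10                      /* 1024 bins                     */
#define BIN_COUNT  (1u << BIN_BITS)
#define BIN_SIZE   (30u * 1024 * 1024)     /* ~30 MB per bin -> ~30 GB data */

/* RAM-resident index: byte offset of each bin inside the data file. */
static uint64_t bin_offsets[BIN_COUNT];

/* "Hash" an address by taking its most significant bits, as the note
 * above suggests; address_space_bits is the width of your key space. */
static uint32_t bin_of(uint64_t address, unsigned address_space_bits)
{
    return (uint32_t)(address >> (address_space_bits - BIN_BITS));
}

/* Load one bin from disk into a caller-supplied buffer (the "page-in"). */
static int load_bin(FILE *data_file, uint32_t bin, void *buffer)
{
    if (fseeko(data_file, (off_t)bin_offsets[bin], SEEK_SET) != 0)
        return -1;
    return fread(buffer, 1, BIN_SIZE, data_file) == BIN_SIZE ? 0 : -1;
}
```

With roughly 3 GB of RAM free you could keep a small cache of recently loaded bins in memory and only call load_bin on a miss, which is exactly the locality optimization mentioned above.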

The most obvious way would be through a typical filesystem API with read, write, and seek functions.
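For example, a hedged C sketch of that approach: keep the 30 GB on disk as fixed-size records and pull in only the record you need with seek + read (the record size and helper name are assumptions for illustration):

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define RECORD_SIZE 4096   /* assumed fixed record size */

/* Fetch record `index` from an already-opened data file into `buffer`. */
static ssize_t read_record(int fd, uint64_t index, void *buffer)
{
    off_t offset = (off_t)(index * RECORD_SIZE);
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buffer, RECORD_SIZE);   /* only one record resident per access */
}
```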

Related

How to determine the starting address of unused memory region in operating system?

I am working on a project involving huge objects in physical memory on Windows.
I want to create a really big data structure, but I have run into some problems.
When I try to allocate a huge amount of data, I can only create an object as large as the heap allows (which also depends on the operating system architecture).
I am not sure whether this is restricted by the thread's private heap or in some other way.
When I looked into how the operating system places data in memory, I found that the data is stored in a particular order.
And here come some questions...
If I want to create large objects, should I have one very large heap region to allocate memory inside? If so, I would have to fragment the data.
Alternatively, I had the idea of finding the starting addresses of empty regions and then using this unused space to hold the data in some data structure.
If this idea is feasible, how could it be done?
Another question: do you think a list would be the best option for that sort of huge object, or would it be better to use another data structure?
Do you think the chosen data structure could be split across two separate regions of memory while still behaving as one object?
Thanks in advance; any answer to my questions would be helpful.
There seems to be some kind of misconception about memory allocation here.
(1) Most operating systems do not allocate memory linearly. There are usually discontinuities in the memory mapped into a process's address space.
(2) If you want to allocate a huge amount of memory, you should do it directly from the operating system, not through a heap.
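As a hedged illustration of point (2) on Windows (since the question mentions Windows), you can reserve a large range of address space directly from the OS with VirtualAlloc and commit physical pages only where you actually need them; the sizes here are made up:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* 4 GB of address space; requires a 64-bit build. */
    SIZE_T size = (SIZE_T)4 * 1024 * 1024 * 1024;

    /* Reserve the range now; no physical memory is used yet. */
    void *region = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
    if (region == NULL) {
        fprintf(stderr, "reserve failed: %lu\n", GetLastError());
        return 1;
    }

    /* Commit only the first 64 MB when it is actually needed. */
    if (VirtualAlloc(region, 64 * 1024 * 1024, MEM_COMMIT, PAGE_READWRITE) == NULL) {
        fprintf(stderr, "commit failed: %lu\n", GetLastError());
        return 1;
    }

    /* ... use the committed part, commit further regions on demand ... */

    VirtualFree(region, 0, MEM_RELEASE);
    return 0;
}
```

This sidesteps the heap entirely, and the commit-on-demand pattern also avoids the fragmentation worry in the question.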

What is the cost of mmaping on Mac OS X?

I have an algorithm where my life would be greatly simplified if I could reserve about 20 blocks of memory addresses, each 4 GB in size. In practice, I never use more than 4 GB, but I do not know in advance which block will fill up.
If I mmap 20 blocks of 4 GB, everything seems to work fine; the OS does not seem to actually allocate anything until I write to the memory.
Is there any reason I should not use mmap to allocate 80 GB of memory and then only use a small amount of it? I assume there is some cost to setting up these buffers. Can I measure it?
The only drawback of mmap-ing 80 GB at once is that page table entries have to be created for the full 80 GB, so if the pages are 4 kB, the tables could consume a lot of memory (unless huge pages are used).
For sizes like that, it is probably better to use one or more sliding mmap-ed views (i.e., create and remove them as needed).
On Windows, memory usage for mmap/page tables can be checked with RamMap; I'm not sure about an equivalent on the Mac.
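As a rough way to see this for yourself, here is a minimal C sketch (error handling mostly omitted) that reserves about 80 GB of anonymous address space and touches only a few megabytes; on both macOS and Linux, physical pages are only allocated for the parts that are written:

```c
#include <sys/mman.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    size_t reserve = (size_t)80 * 1024 * 1024 * 1024;   /* 80 GB of address space */

    /* MAP_ANON: not backed by a file; pages materialize on first write. */
    char *base = mmap(NULL, reserve, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Only the pages actually written to consume physical memory. */
    memset(base, 0xAB, 4 * 1024 * 1024);   /* touch 4 MB out of 80 GB */

    munmap(base, reserve);
    return 0;
}
```

Timing the mmap call itself versus the first writes should give you a feel for the setup cost the question asks about.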

Can chronicle-map handle data larger than memory?

I'm a bit confused by how off-heap memory works. I have a server that has 32 GB of RAM and a data set of key-value mappings about 1 TB in size. I'm looking for a simple and fast embedded Java database that would let me map a key to a value according to this 1 TB dataset, which will mostly have to be read from disk. Each entry in this data set is small (<500 bytes), so I think using the file system would be inefficient.
I'd like to use Chronicle Map for this. I read that off-heap memory usage can exceed RAM size and that it interacts with the filesystem somehow, but at the same time, Chronicle Map is described as an in-memory database. Can Chronicle Map handle the 1 TB data set for my server, or am I limited to data sets of 32 GB or less?
The answer is: it depends on your operating system. On Windows a Chronicle Map must fit inside main memory, but on Linux and MacOSX it doesn't have to fit in main memory (the difference is in how memory mapping is implemented on these OSes). Note: Linux even allows you to map a region larger than your disk space (MacOSX and Windows don't).
So on Linux you could map 1 TB or even 100 TB on a machine with 32 GB of memory. It is important to remember that your access pattern and your choice of drive will be critical to performance. If you generally access the same data most of the time and you have an SSD, this will perform well. If you have a spinning disk and a random access pattern, you will be limited by the speed of your drive.
Note: we have tested Chronicle Map to 2.5 billion entries and it performs well as it uses 64-bit hashing of keys.
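For what it's worth, the underlying mechanism is just OS file memory-mapping; a minimal C sketch (not Chronicle Map code; the file name and size are made up) of mapping a file far larger than RAM looks like this:

```c
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    size_t size = (size_t)100 * 1024 * 1024 * 1024;   /* 100 GB, far beyond 32 GB RAM */

    int fd = open("store.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ftruncate(fd, (off_t)size) != 0) {   /* creates a sparse file */
        perror("open/ftruncate");
        return 1;
    }

    /* The mapping is backed by the page cache; the OS pages chunks in and
       out on demand, so resident memory stays bounded by available RAM. */
    char *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    map[0] = 1;             /* touches one page */
    map[size - 1] = 2;      /* touches another page, ~100 GB away */

    munmap(map, size);
    close(fd);
    return 0;
}
```

Chronicle Map builds its off-heap store on this kind of mapping, which is why the RAM limit only applies on Windows as described above.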

Memory mapped files causes low physical memory

I have 2 GB of RAM and am running a memory-intensive application; the machine goes into a low-available-physical-memory state and the system stops responding to user actions, like opening any application, invoking a menu, etc.
How do I trigger or tell the system to swap memory to the pagefile and free physical memory?
I'm using Windows XP.
If I run the same application on a 4 GB RAM machine, this is not the case; system response is good. After available physical memory becomes scarce, the system automatically swaps to the pagefile and frees physical memory, and it is not as bad as on the 2 GB system.
To overcome this problem (on the 2 GB machine), I attempted to use memory-mapped files for the large datasets allocated by the application. In this case the virtual memory of the application (process) is fine, but the system cache is high and there is the same problem as above: physical memory is low.
Even though the memory-mapped file is not mapped into the process's virtual memory, the system cache is high. Why?
Any help is appreciated.
Thanks.
If your data access pattern for using the memory mapped file is sequential, you might get slightly better page recycling by specifying the FILE_FLAG_SEQUENTIAL_SCAN flag when opening the underlying file. If your data pattern accesses the mapped file in random order, this won't help.
You should consider decreasing the size of your map view. That's where all the memory is actually consumed and cached. Since it appears that you need to handle files that are larger than available contiguous free physical memory, you can probably do a better job of memory management than the virtual memory page swapper since you know more about how you're using the memory than the virtual memory manager does. If at all possible, try to adjust your design so that you can operate on portions of the large file using a smaller view.
Even if you can't get rid of the need for full random access across the entire range of the underlying file, it might still be beneficial to tear down and recreate the view as needed to move the view to the section of the file that the next operation needs to access. If your data access patterns tend to cluster around parts of the file before moving on, then you won't need to move the view as often. You'll take a hit to tear down and recreate the view object, but since tearing down the view also releases all the cached pages associated with the view, it seems likely you'd see a net gain in performance because the smaller view significantly reduces memory pressure and page swapping system wide. Try setting the size of the view based on a portion of the installed system RAM and move the view around as needed by your file processing. The larger the view, the less you'll need to move it around, but the more RAM it will consume potentially impacting system responsiveness.
As I think you are hinting in your post, the slow response time is probably at least partially due to delays in the system while the OS writes the contents of memory to the pagefile to make room for other processes in physical memory.
The obvious solution (and possibly not practical) is to use less memory in your application. I'll assume that is not an option or at least not a simple option. The alternative is to try to proactively flush data to disk to continually keep available physical memory for other applications to run. You can find the total memory on the machine with GlobalMemoryStatusEx. And GetProcessMemoryInfo will return current information about your own application's memory usage. Since you say you are using a memory mapped file, you may need to account for that in addition. For example, I believe the PageFileUsage information returned from that API will not include information about your own memory mapped file.
If your application is monitoring the usage, you may be able to use FlushViewOfFile to proactively force data to disk from memory. There is also an API (EmptyWorkingSet) that I think attempts to write as many dirty pages to disk as possible, but that seems like it would very likely hurt performance of your own application significantly. Although, it could be useful in a situation where you know your application is going into some kind of idle state.
And, finally, one other API that might be useful is SetProcessWorkingSetSizeEx. You might consider using this API to give a hint on an upper limit for your application's working set size. This might help preserve more memory for other applications.
Edit: This is another obvious statement, but I forgot to mention it earlier. It also may not be practical for you, but it sounds like one of the best things you might do considering that you are running into 32-bit limitations is to build your application as 64-bit and run it on a 64-bit OS (and throw a little bit more memory at the machine).
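To make the "smaller sliding view" and FlushViewOfFile suggestions concrete, here is a hedged Win32 C sketch (file name and view size are illustrative, error handling trimmed) that walks a large file in 64 MB views and proactively flushes each one before unmapping it:

```c
#include <windows.h>

#define VIEW_SIZE (64 * 1024 * 1024)   /* 64 MB; a multiple of the 64 KB granularity */

static void process_chunk(void *view, SIZE_T bytes)
{
    /* application-specific work on the mapped chunk */
    (void)view; (void)bytes;
}

int main(void)
{
    HANDLE file = CreateFileA("bigdata.bin", GENERIC_READ | GENERIC_WRITE,
                              0, NULL, OPEN_EXISTING,
                              FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);

    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE, 0, 0, NULL);

    for (LONGLONG offset = 0; offset < size.QuadPart; offset += VIEW_SIZE) {
        LONGLONG remaining = size.QuadPart - offset;
        SIZE_T span = (SIZE_T)(remaining < VIEW_SIZE ? remaining : VIEW_SIZE);

        /* The offset must be a multiple of the allocation granularity (64 KB). */
        void *view = MapViewOfFile(mapping, FILE_MAP_WRITE,
                                   (DWORD)(offset >> 32), (DWORD)offset, span);
        process_chunk(view, span);

        FlushViewOfFile(view, 0);   /* push dirty pages to disk now, not later */
        UnmapViewOfFile(view);      /* releases the cached pages for this view */
    }

    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```

The small view keeps system-cache pressure bounded, and flushing before unmapping means the pager doesn't have to do it while other applications are fighting for memory.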
Well, it sounds like your program needs more than 2GB of working set.
Modern operating systems are designed to use most of the RAM for something at all times, only keeping a fairly small amount free so that it can be immediately handed out to processes that need more. The rest is used to hold memory pages and cached disk blocks that have been used recently; whatever hasn't been used recently is flushed back to disk to replenish the pool of free pages. In short, there isn't supposed to be much free physical memory.
The principal difference between using a normal memory allocation and a memory-mapped file is where the data gets stored when it must be paged out of memory. It doesn't necessarily have any effect on when the memory will be paged out, and it will have little effect on the time it takes to page it out.
The real problem you are seeing is probably not that you have too little free physical memory, but that the paging rate is too high.
My suggestion would be to attempt to reduce the amount of storage needed by your program, and see if you can increase the locality of reference to reduce the amount of paging needed.

osx: how do I find the size of the disk i/o cache (write cache, for example)

I am looking to optimize my disk I/O and am trying to find out what the disk cache size is. system_profiler is not telling me; where else can I look?
Edit: my program processes entire volumes: I'm doing a secure wipe, so I loop through all of the blocks on the volume, reading, randomizing the data, and writing. If I read/write 4k blocks per I/O operation, the entire job is significantly faster than reading/writing a single block per operation, so my question stems from my search for the ideal size of a read/write operation (ideal in terms of speed). Please do not point out that a wipe program doesn't need the read operation; just assume that I do. Thanks.
Mac OS X uses a Unified Buffer Cache. What that means is that, in the kernel, VM objects and files are at some level the same thing, and the amount of memory available for caching depends entirely on the VM pressure in the rest of the system. It also means that read and write caching is unified: if an item in the read cache is written to, it just gets marked dirty and will then be written to disk when changes are committed.
So the disk cache may be very small or gigabytes large, and it changes dynamically as the system is used. Because of this, trying to determine the cache size and optimize based on it is a losing fight. You are much better off doing things that inform the cache how to operate better, like checking what the underlying device's optimal I/O size is, or identifying data that should not be cached and using F_NOCACHE.
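If it helps, here is a hedged C sketch of the F_NOCACHE route for a wipe-style workload: the data will never be re-read, so you tell the kernel not to cache it and do large, aligned read/overwrite passes. The device path and chunk size are made-up examples:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)   /* try large, block-aligned I/Os */

int main(void)
{
    int fd = open("/dev/rdisk9", O_RDWR);   /* hypothetical target device */
    if (fd < 0) { perror("open"); return 1; }

    /* macOS: bypass the unified buffer cache for this file descriptor. */
    if (fcntl(fd, F_NOCACHE, 1) == -1)
        perror("fcntl(F_NOCACHE)");

    /* With caching off, buffers and transfer sizes should be block-aligned. */
    char *buf;
    if (posix_memalign((void **)&buf, 4096, CHUNK) != 0) { close(fd); return 1; }

    ssize_t n;
    while ((n = read(fd, buf, CHUNK)) > 0) {
        /* ... randomize buf here ... */
        lseek(fd, -n, SEEK_CUR);              /* step back over what was just read */
        if (write(fd, buf, n) != n) { perror("write"); break; }
    }

    free(buf);
    close(fd);
    return 0;
}
```

Benchmarking a few chunk sizes (for example 128 KB, 1 MB, 4 MB) against the device's reported optimal transfer size is more likely to find the sweet spot than trying to second-guess the cache.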
