What I mean is: memory is getting larger and larger, and operating systems and compilers smarter and smarter. So my question is: if I have to read data from a file, does it make sense to implement my own cache? Isn't the operating system already keeping that data in memory?
Edit: to be more practical, I have 1 TB of data spread across several files and 180 GB of RAM. I need to read some of this data more than once. Does it make sense to implement a cache such as an LRU cache, or will the operating system, when I read from a file (using C++), already have been smart enough to keep that data somewhere so it is read from memory instead of from disk?
It depends on the language and library you are using, but it is highly likely that the data is already being cached in memory for you.
In general, you want to cache the data you are currently working with until you are ready to commit the updated buffer back to the file on disk, simply because disk I/O is a very slow operation.
For very big files you may not want to cache the entire contents due to memory constraints, but you would still want to cache the block of data that you are currently working on.
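If you do decide to roll your own cache for the LRU scenario in the question, a minimal sketch could look like the one below. Everything here is illustrative: the class and constant names are made up, the block size and capacity would need tuning, and real code would need proper error handling.

// Minimal sketch of an LRU block cache layered over an std::ifstream.
#include <cstdint>
#include <fstream>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

class BlockCache {
public:
    BlockCache(const std::string& path, std::size_t capacityBlocks)
        : file_(path, std::ios::binary), capacity_(capacityBlocks) {}

    // Returns the cached block containing byte `offset`, reading it from disk
    // on a miss and evicting the least recently used block when full.
    const std::vector<char>& block(std::uint64_t offset) {
        const std::uint64_t id = offset / kBlockSize;
        auto it = index_.find(id);
        if (it != index_.end()) {                   // hit: mark as most recently used
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->data;
        }
        if (index_.size() >= capacity_ && !lru_.empty()) {  // miss and full: evict LRU block
            index_.erase(lru_.back().id);
            lru_.pop_back();
        }
        lru_.push_front(Entry{id, readBlock(id)});  // miss: load the block from disk
        index_[id] = lru_.begin();
        return lru_.front().data;
    }

private:
    static constexpr std::uint64_t kBlockSize = 1 << 20;   // 1 MiB blocks (assumption)

    struct Entry { std::uint64_t id; std::vector<char> data; };

    std::vector<char> readBlock(std::uint64_t id) {
        std::vector<char> buf(kBlockSize);
        file_.seekg(static_cast<std::streamoff>(id * kBlockSize));
        file_.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        buf.resize(static_cast<std::size_t>(file_.gcount()));  // last block may be short
        file_.clear();                                          // reset EOF state for reuse
        return buf;
    }

    std::ifstream file_;
    std::size_t capacity_;
    std::list<Entry> lru_;   // front = most recently used
    std::unordered_map<std::uint64_t, std::list<Entry>::iterator> index_;
};

You would construct one BlockCache per file and call block(offset) wherever you currently seek and read; repeated reads of hot offsets then come out of the in-process cache instead of going back to the stream.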
Here's a general diagram of different means of storing data from the fastest (most expensive) to the slowest (least expensive):
CPU data registers -> CPU cache -> RAM -> SSD -> hard disk -> keyboard, etc.
HowStuffWorks.com has a pretty good illustration of this hierarchy and the entire article itself is actually a pretty good read as well: http://computer.howstuffworks.com/computer-memory4.htm
EDIT: There is also another similar discussion here that you may want to check out.
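If you want to check whether the OS page cache is already doing this work for you, a quick experiment is to read the same large file twice and time both passes; on most systems the second pass is served largely from RAM and is noticeably faster, unless the file is much bigger than memory or the cache has since been evicted. The file name below is just a placeholder.

// Rough sketch: read the same large file twice and time both passes. On the
// second pass the data is typically served from the OS page cache.
#include <chrono>
#include <fstream>
#include <iostream>
#include <vector>

static double secondsToReadAll(const char* path) {
    const auto start = std::chrono::steady_clock::now();
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(1 << 20);                        // 1 MiB read buffer
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size()))) {
        // keep reading; we only care about the elapsed time
    }
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - start).count();
}

int main() {
    const char* path = "large_input.bin";                  // placeholder file name
    std::cout << "first pass:  " << secondsToReadAll(path) << " s\n";
    std::cout << "second pass: " << secondsToReadAll(path) << " s\n";
}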
If in-memory databases are as fast as they claim to be, why aren't they more commonly utilized?
One of the main reasons in-memory databases aren't commonly used is cost. As you stated, in-memory databases are usually an order of magnitude faster than disk-resident databases, for obvious reasons. However, RAM is also significantly more expensive than hard drives and consequently not viable for large databases. That said, with RAM getting much cheaper, in-memory databases are today more than viable for enterprise use.
Another reason is that in-memory databases are often not ACID compliant. This is because memory is volatile, and unforeseen events like power losses may result in complete loss of data. As this is unacceptable for the vast majority of use cases, most in-memory databases do end up utilizing disks to persist data. Of course, this ends up undermining some of the benefits of in-memory databases by re-introducing disk I/O as a performance bottleneck.
In any case, in-memory databases will likely become predominant as RAM becomes cheaper. The performance differences between the two are too drastic to be ignored. Knowing this, multiple vendors have thrown their hats into the in-memory space, such as Oracle TimesTen, SAP HANA, and many others. Also, some companies like Altibase offer "hybrid" DBMS systems, which contain both in-memory and disk-resident components.
You may want to read up on these in-memory offerings to get a better understanding of in-memory databases.
http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html
http://www.saphana.com/
http://altibase.com/in-memory-database-hybrid-products/hdbtm-hybrid-dbms/
Possibly because one or more of:
there is often a mismatch between the data size and the available RAM size
when the data is small normal disk caching and OS memory/disk management may be as effective
when the data is large, swapping to disk is likely to void any benefit
fast enough to meet performance requirements and service levels does not mean as fast as possible; fast enough is good enough.
I am looking for rules of thumb for designing algorithms where the data is accessed slowly due to limitations of disk speed, PCI speed (GPGPU), or some other bottleneck.
Also, how does one manage GPGPU programs where the application's memory requirements exceed GPGPU memory?
In general, the GPU memory should not be an arbitrary limitation on the size of data for algorithms. The GPU memory could be considered to be a "cache" of data that the GPU is currently operating on, but many GPU algorithms are designed to operate on more data than can fit in the "cache". This is accomplished by moving data to and from the GPU while computation is going on, and the GPU has specific concurrent execution and copy/compute overlap mechanisms to enable this.
This usually implies that independent work can be completed on sections of the data, which is typically a good indicator for acceleration in a parallelizable application. Conceptually, this is similar to large scale MPI applications (such as high performance linpack) which break the work into pieces and then send the pieces to various machines (MPI ranks) for computation.
If the amount of work to be done on the data is small compared to the cost to transfer the data, then the data transfer speed will still become the bottleneck, unless it is addressed directly via changes to the storage system.
The basic approach to handling out-of-core algorithms, i.e. those where the data set is too large to fit in GPU memory all at once, is to determine a version of the algorithm that can work on separable data, and then craft a "pipelined" algorithm that works on the data in chunks. An example tutorial which covers such a programming technique is here (focus starts around the 40 minute mark, but the whole video is relevant).
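To make the chunked, double-buffered pattern concrete, here is a small host-side C++ sketch: while chunk i is being processed, chunk i+1 is already being fetched. In real GPGPU code the fetch would be an asynchronous host-to-device copy on a CUDA stream and process() would be a kernel launch; loadChunk and process below are made-up stand-ins for those steps.

// Host-side sketch of a chunked, double-buffered pipeline: chunk i+1 is
// fetched while chunk i is being processed.
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Stand-in for reading one chunk from disk (or copying it toward the GPU).
std::vector<float> loadChunk(std::size_t index) {
    return std::vector<float>(1 << 20, static_cast<float>(index));
}

// Stand-in for the computation performed on one chunk.
float process(const std::vector<float>& chunk) {
    return std::accumulate(chunk.begin(), chunk.end(), 0.0f);
}

float processAll(std::size_t numChunks) {
    float result = 0.0f;
    auto next = std::async(std::launch::async, loadChunk, std::size_t{0});
    for (std::size_t i = 0; i < numChunks; ++i) {
        std::vector<float> current = next.get();           // wait for chunk i
        if (i + 1 < numChunks)                             // start fetching chunk i+1 ...
            next = std::async(std::launch::async, loadChunk, i + 1);
        result += process(current);                        // ... while chunk i is processed
    }
    return result;
}

int main() { return processAll(8) > 0.0f ? 0 : 1; }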
I got asked this in an interview. They asked me to order the following in terms of speed:
CPU register access
Context switch
Memory access
Disk seek
I'm pretty sure the disk seek is the slowest and register access is the fastest, but I'm not quite sure about the two in between. Can anyone explain it a bit?
I happened to find a surprisingly good answer on Yahoo!:
Fastest to slowest:
CPU
Memory
Context switching
Disk
Although:
Disk access may be significantly faster at times due to caching ... and so can memory access (CPUs sometimes manage a cache of main memory to help speed up access and avoid competition for the bus).
Memory access could also be as slow as or slightly slower than disk access at times, due to virtual memory page swapping.
Context switching needs to be extremely fast in general ... if it were slow, your CPU could begin to spend more time switching between processes than actually performing meaningful work when several processes are running concurrently.
Register access is nearly instantaneous.
(emphasis mine)
I agree with that answer.
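To get a feel for the two middle entries, below is a rough C++ sketch that contrasts the per-element cost of touching memory with the cost of forcing the scheduler to hand control back and forth between two threads. A condition-variable ping-pong is only a crude proxy for a full process context switch, and the sequential array scan is a best case for memory thanks to hardware prefetching, but the orders of magnitude it prints are still instructive.

// Rough sketch: compare sequential memory access with forced thread switches.
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    using clock = std::chrono::steady_clock;

    // (1) Memory access: scan ~200 MB of ints (sequential, so prefetching helps).
    std::vector<int> data(50'000'000, 1);
    auto t0 = clock::now();
    long long sum = std::accumulate(data.begin(), data.end(), 0LL);
    double memNs =
        std::chrono::duration<double, std::nano>(clock::now() - t0).count() / data.size();

    // (2) Thread handoffs: bounce a flag between two threads 100,000 times.
    std::mutex m;
    std::condition_variable cv;
    bool ping = true;
    const int rounds = 100'000;
    auto t1 = clock::now();
    std::thread other([&] {
        for (int i = 0; i < rounds; ++i) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return !ping; });
            ping = true;
            cv.notify_one();
        }
    });
    for (int i = 0; i < rounds; ++i) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return ping; });
        ping = false;
        cv.notify_one();
    }
    other.join();
    double switchNs =
        std::chrono::duration<double, std::nano>(clock::now() - t1).count() / (2.0 * rounds);

    std::cout << "memory access:  ~" << memNs << " ns per element (sum " << sum << ")\n";
    std::cout << "thread handoff: ~" << switchNs << " ns per switch\n";
}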
Typically in a working environment I have many windows open: Outlook, two or three Word documents, a few browser windows, Notepad++, some VPN client, Excel, etc.
That said, chances are that about 40% of these apps are not frequently used and are referred to only sparingly. They occupy memory nonetheless.
Now, how does a typical OS deal with that kind of memory consumption? Does it suspend those apps to the hard disk (the pagefile, the Linux swap area, etc.), thereby freeing up that memory for other use, or do they keep occupying memory as they are?
Is this kind of suspension a practical, doable thing? Are there any downsides, for example response time?
Is there some study material I can refer to for reading on this topic? I would appreciate the help.
The detailed answer depends on your OS and how it implements its memory management, but here is a generality:
The OS doesn't look at memory in terms of how many processes are in RAM; it looks at it in terms of discrete units called pages. Most processes have several pages of RAM. Pages that are least referenced can be swapped out of RAM and onto the hard disk when physical RAM becomes scarce. Rarely, therefore, is an entire process swapped out of RAM; usually only certain parts of it are. It could be, for example, that some aspect of your currently running program is idle (i.e. its pages are rarely accessed). In that case, those pages could be swapped out even though the process is in the foreground.
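On Linux you can observe this page-level bookkeeping directly: mincore() reports, for each page of a mapping, whether it is currently resident in physical RAM. A minimal, Linux-specific sketch (the file name is a placeholder and error handling is pared down):

// Linux-specific sketch: map a file and ask mincore() which of its pages are
// currently resident in physical memory.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

int main() {
    const char* path = "some_file.bin";                    // placeholder
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st{};
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    void* addr = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    const long pageSize = sysconf(_SC_PAGESIZE);
    const size_t pages = (st.st_size + pageSize - 1) / pageSize;
    std::vector<unsigned char> vec(pages);
    if (mincore(addr, st.st_size, vec.data()) != 0) { perror("mincore"); return 1; }

    size_t resident = 0;
    for (unsigned char v : vec)
        if (v & 1) ++resident;                             // low bit set: page is in RAM
    std::printf("%zu of %zu pages resident\n", resident, pages);

    munmap(addr, st.st_size);
    close(fd);
    return 0;
}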
Try the wiki article for starters; it covers how this process works and the many methods used to implement it.
What are the drawbacks (if any) of using a memory-mapped file for reading regular-sized files, compared to doing the same with the CreateFile/ReadFile combination?
With ReadFile/WriteFile you have deterministic error-handling semantics. When you use memory-mapped files, errors are reported by raising an exception.
In addition, if the memory-mapped file has to hit the disk (or, even worse, the network), your memory read may take several seconds (or even minutes) to complete. Depending on your application, this can cause unexpected stalls.
If you use ReadFile/WriteFile you can use asynchronous variants of the API to allow you to control this behavior.
You also have more deterministic performance if you use ReadFile, especially if your I/O pattern is predictable: memory-mapped I/O is often random, whereas ReadFile is almost always serial (since ReadFile reads at the current file position and advances the current file position).
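A minimal Windows sketch of the two error models might look like the following (compiled with MSVC for the __try/__except part). The file name is a placeholder and error handling is pared down; the point is that ReadFile reports failure through its return value, while an I/O error on a mapped view surfaces as an EXCEPTION_IN_PAGE_ERROR structured exception at the moment you dereference the memory, so the access has to be wrapped in __try/__except.

// Windows sketch contrasting the two error models.
#include <windows.h>
#include <cstdio>

// ReadFile path: failures come back as a BOOL you can test (plus GetLastError()).
static bool readFirstBytes(const wchar_t* path, char* buf, DWORD size, DWORD* got) {
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;
    BOOL ok = ReadFile(h, buf, size, got, nullptr);
    CloseHandle(h);
    return ok != FALSE;
}

// Mapped path: the only way to catch a failed page-in is structured exception handling.
static bool sumMappedBytes(const unsigned char* view, SIZE_T size, unsigned long long* sum) {
    __try {
        unsigned long long s = 0;
        for (SIZE_T i = 0; i < size; ++i) s += view[i];    // may fault if a page-in fails
        *sum = s;
        return true;
    } __except (GetExceptionCode() == EXCEPTION_IN_PAGE_ERROR
                    ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH) {
        return false;                                      // I/O error while reading the view
    }
}

int main() {
    const wchar_t* path = L"input.dat";                    // placeholder file name

    char buf[4096];
    DWORD got = 0;
    if (!readFirstBytes(path, buf, static_cast<DWORD>(sizeof(buf)), &got))
        std::printf("CreateFile/ReadFile path reported an error\n");

    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;
    LARGE_INTEGER size{};
    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (mapping && GetFileSizeEx(file, &size) && size.QuadPart > 0) {
        const unsigned char* view = static_cast<const unsigned char*>(
            MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
        if (view) {
            unsigned long long total = 0;
            if (sumMappedBytes(view, static_cast<SIZE_T>(size.QuadPart), &total))
                std::printf("sum of mapped bytes: %llu\n", total);
            else
                std::printf("I/O error while touching the mapped view\n");
            UnmapViewOfFile(view);
        }
    }
    if (mapping) CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}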
A big advantage of file mapping is that it doesn't influence the system cache. If your application does excessive I/O by means of ReadFile, your system cache will grow, consuming more and more physical memory. If your OS is 32-bit and you have much more than 1 GB of memory, then you're lucky, since on 32-bit Windows the size of the system cache is limited to 1 GB. Otherwise the system cache will consume all available physical memory, and the memory manager will soon start purging pages of other processes to disk, intensifying disk operations instead of actually lessening them. The effect is especially noticeable on 64-bit Windows, where the cache size is limited only by available physical memory. File mapping, on the other hand, doesn't lead to overgrowth of the system cache and at the same time doesn't degrade performance.
You'll need more complex code for establishing the file mapping than for just opening and reading. File mapping is intended for random access to a section of a file. If you don't need that, just don't bother with file mapping.
Also, if you ever need to port your code to another platform, it will be much easier and faster if you don't use file mapping.