Which is faster/better for caching, File System or Memcached? - performance

It's still not clear to me: is it faster to read things from a file or from memcached? Why?

Memcached is faster, but memory is limited. An HDD is huge, but its I/O is slow compared to memory. Put the hottest items in memcached, and everything else can go to cache files.
(Or man up and invest some money in more memory, like these guys :)
For some benchmarks see: Cache Performance Comparison (File, Memcached, Query Cache, APC)
In theory:
Read 1 MB sequentially from memory: 250,000 ns
Disk seek: 10,000,000 ns
http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
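A minimal sketch of that hot/cold split, assuming the python-memcached client and a local memcached daemon; the spill directory and helper names are made up for illustration:

    import hashlib
    import os
    import memcache

    CACHE_DIR = "/tmp/app-cache"              # hypothetical spill directory
    mc = memcache.Client(["127.0.0.1:11211"])

    def _path(key):
        # Hash the key so arbitrary strings map to safe file names.
        return os.path.join(CACHE_DIR, hashlib.sha1(key.encode()).hexdigest())

    def cache_get(key):
        value = mc.get(key)                   # hot tier: memory, one network hop
        if value is not None:
            return value
        try:
            with open(_path(key), "rb") as f: # cold tier: local cache file
                return f.read()
        except FileNotFoundError:
            return None

    def cache_set(key, value, hot=False, ttl=300):
        if hot:
            mc.set(key, value, time=ttl)      # hottest items go to memcached
        else:
            os.makedirs(CACHE_DIR, exist_ok=True)
            with open(_path(key), "wb") as f: # everything else goes to disk
                f.write(value)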

There are quite a few different aspects that might favour one or the other:
Do you need/want to share this data between multiple servers? The filesystem is local, memcached is accessed over a network.
How large are the items you're caching? The filesystem is likely to be better for large objects.
How many memcached requests might there be per page? TCP connection set-ups and tear-downs can take more time than a simple filesystem stat() on the local machine.
I would suggest you look at your use case and do some profiling of both approaches. If you can get away with using the filesystem then I would. Adding memcached adds another layer of complexity and potential points of failure (memcached client/server).
For what it's worth, the other comments about disk vs. memory performance may well be academic: if the filesystem data is accessed regularly, it will likely be sitting in the OS or disk cache anyway.
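A rough harness for the profiling suggested above, assuming a local memcached on the default port; the payload size, file path, and iteration count are arbitrary:

    import time
    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])
    mc.set("k", b"x" * 10000)
    with open("/tmp/cache-test.bin", "wb") as f:
        f.write(b"x" * 10000)

    def bench(label, fn, n=10000):
        start = time.perf_counter()
        for _ in range(n):
            fn()
        print(f"{label}: {(time.perf_counter() - start) / n * 1e6:.1f} us/op")

    bench("memcached get", lambda: mc.get("k"))
    bench("file read", lambda: open("/tmp/cache-test.bin", "rb").read())

Numbers will vary wildly by machine and setup, which is exactly why measuring your own use case matters.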

"Faster" can not be used without context.
For example, accessing data in memcached on remote server can be "slower" due to network latency. In the other hand, reading data from remote server memory via 10Gb network can be "faster" than reading same data from local disk.
The main difference between caching on the filesystem and using memcached is that memcached is a complete caching solution. So there is LRU lists, expiration concept (data freshness), some high-level operations, like cas/inc/dec/append/prepend/replace.
Memcached is easy to deploy and monitor (how can we distinguish "cache" workload on filesystem from, let's say kernel? Can we calculate total amount of cached data? Data distribution? Capacity planning? And so on).
There are also some hybrid systems, like cachelot
Basically, it's memcached that can be embedded right into the application, so the cache would be accessible without any syscalls or network IO.
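A quick tour of those higher-level operations, assuming the python-memcached client (cache_cas=True enables CAS-token tracking in that library):

    import memcache

    mc = memcache.Client(["127.0.0.1:11211"], cache_cas=True)

    mc.set("hits", "0")
    mc.incr("hits")           # atomic server-side increment -> "1"
    mc.decr("hits")           # atomic decrement -> "0"

    mc.set("log", "a")
    mc.append("log", "b")     # server-side append  -> "ab"
    mc.prepend("log", "_")    # server-side prepend -> "_ab"

    value = mc.gets("hits")   # fetch value plus a CAS token
    mc.cas("hits", "42")      # store only if nothing changed since gets()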

In fact, it is not as simple as "reading from memory is much faster than reading from HDD." As you know, memcached is accessed over a TCP connection. If you open a new connection each time you want to get or set something on the memcached server (as most programmers do), it will probably perform worse than a file cache. You should create a single memcached client object and reuse it. Secondly, modern OSes cache frequently used files, which can make a file cache faster than memcached calls that actually travel over TCP.
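The difference between the two patterns, sketched with the python-memcached client (function names are illustrative):

    import memcache

    def get_slow(key):
        # Anti-pattern: a fresh client (and TCP handshake) on every call.
        mc = memcache.Client(["127.0.0.1:11211"])
        value = mc.get(key)
        mc.disconnect_all()
        return value

    _MC = memcache.Client(["127.0.0.1:11211"])  # created once at module load

    def get_fast(key):
        return _MC.get(key)                     # reuses the open socket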

Cache Type                         | Cache Gets/sec
Array Cache                        | 365,000
APC Cache                          |  98,000
File Cache                         |  27,000
Memcached Cache (TCP/IP)           |  12,200
MySQL Query Cache (TCP/IP)         |   9,900
MySQL Query Cache (Unix Socket)    |  13,500
Selecting from table (TCP/IP)      |   5,100
Selecting from table (Unix Socket) |   7,400
Source:
https://surniaulula.com/os/unix/memcached-vs-disk-cache/
Source of my source :)
https://www.percona.com/blog/2006/08/09/cache-performance-comparison/

You're being awfully vague on the details, and I believe the answer you're looking for depends on the situation. To my knowledge, very few things are better than the alternatives all of the time.
Obviously it wouldn't be faster to read things off the file system (assuming it's a hard drive); even an SSD will be noticeably slower than in-memory reads. The reason is that HDDs/filesystems are built for capacity, not speed, while DDR memory is built to be particularly fast.
Good caching means keeping frequently accessed items in memory and the less frequently accessed ones on disk (persistent storage). That way the common case is vastly improved by your caching implementation. That's your goal, so make sure you have a good understanding of your ideal caching policy; that will require extensive benchmarking and testing.

It depends on whether the cache is stored locally. Memcached can store data across a network, which isn't necessarily faster than a local disk.

If that file is stored on disk and is accessed frequently, there will be a high probability of finding it in RAM (as a recently accessed file), or am I missing something here?
Yes, the first read will be from the disk, which is awfully slow, but what about the subsequent reads (assuming the file is hot and getting lots of reads)? Those should be even faster than memcached, as they are pure RAM reads.
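That intuition is easy to check. A minimal timing sketch, assuming Linux and an existing test file (the path is hypothetical); drop the page cache first (as root: echo 3 > /proc/sys/vm/drop_caches) so the first read is genuinely cold:

    import time

    def timed_read(path):
        start = time.perf_counter()
        with open(path, "rb") as f:
            f.read()
        return (time.perf_counter() - start) * 1000

    print(f"cold read: {timed_read('/tmp/hot-file.bin'):.2f} ms")  # disk
    print(f"warm read: {timed_read('/tmp/hot-file.bin'):.2f} ms")  # page cache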

Related

SAN Performance

I have a question regarding SAN performance, specifically on an EMC VNX SAN. I have a significant number of processes spread over a number of blade servers running concurrently, typically around 200. Each process loads two small files from storage, one 3KB and one 30KB. There are millions of files (around 20 million) to be processed. The processes run on Windows Server on VMware. The way this was originally set up was 1TB LUNs on the SAN bundled into a single 15TB drive in VMware and then shared as a network share from one Windows instance to all the processes. With the processes running concurrently, the performance is abysmal: essentially, 200 simultaneous requests are being serviced by the SAN through the Windows share at the same time, and the SAN is not handling it too well. I'm looking for suggestions to improve performance.
With all performance questions, there's a degree of 'it depends'.
When you're talking about accessing a SAN, there's a chain of potential bottlenecks to unravel. First though, we need to understand what the actual problem is:
Do we have problems with throughput - e.g. sustained transfer, or latency?
It sounds like we're looking at random read IO - which is one of the hardest workloads to service, because predictive caching doesn't work.
So begin at the beginning:
What sort of underlying storage are you using?
Have you fallen into the trap of buying big SATA and configuring it as RAID-6? I've seen plenty of places do this because it looks like cheap terabytes, without really doing the sums on the performance. A SATA drive starts to slow down at about 75 IO operations per second. If you've got big drives - 3TB for example - that's 25 IOPs per terabyte. As a rough rule of thumb, figure 200 per drive for FC/SAS and 1500 for SSD.
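Back-of-envelope spindle math from those rules of thumb; the workload figure (200 concurrent readers at roughly 2 random reads per second each) is hypothetical:

    # IOPS per drive, per the rough rules of thumb above.
    IOPS_PER_DRIVE = {"SATA": 75, "FC/SAS": 200, "SSD": 1500}

    required_iops = 200 * 2   # e.g. 200 processes x ~2 random reads/sec
    for kind, iops in IOPS_PER_DRIVE.items():
        drives = -(-required_iops // iops)   # ceiling division
        print(f"{kind}: ~{drives} drives to sustain {required_iops} IOPS")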
Are you tiering?
Storage tiering is a clever trick of making a 'sandwich' out of different speeds of disk. This usually works, because usually only a small fraction of a filesystem is 'hot' - so you can put the hot part on fast disk, and the cold part on slow disk, and average performance looks better. This doesn't work for random IO or cold read accesses. Nor does it work for full disk transfers - as only 10% of it (or whatever proportion) can ever be 'fast' and everything else has to go the slow way.
What's your array level contention?
The point of SAN is that you aggregate your performance, such that each user has a higher peak and a lower average, as this reflects most workloads. (When you're working on a document, you need a burst of performance to fetch it, but then barely any until you save it again).
How are you accessing your array?
Typically a SAN is accessed using a Fibre Channel network. There's a whole bunch of technical differences from 'real' networks, but they don't matter to you - contention and bandwidth still do. With ESX in particular, I find there's a tendency to underestimate storage IO needs. (Multiple VMs using a single pair of HBAs means you get contention on the ESX server.)
What sort of workload are we dealing with?
One of the other core advantages of storage arrays is caching mechanisms. They generally have very large caches and some clever algorithms to take advantage of workload patterns such as temporal locality and sequential or semi-sequential IO. Write loads are easier to handle for an array, because despite the horrible write penalty of RAID-6, write operations are under a soft time constraint (they can be queued in cache) but read operations are under a hard time constraint (the read cannot complete until the block is fetched).
This means that for true random read, you're basically not able to cache at all, which means you get worst case performance.
Is the problem definitely your array? It sounds like you've got a single VM with 15TB presented, and that VM is handling the IO. That's a bottleneck right there. How many IOPs is the VM generating to the ESX server, and what's the contention like there? What's the networking like? How many other VMs are using the same ESX server and might be sources of contention? Is it a pass-through LUN, or a VMFS datastore with a VMDK?
So - there's a bunch of potential problems, and as such it's hard to roll it back to a single source. All I can give you is some general recommendations to getting good IO performance.
Fast disks (they're expensive, but if you need the IO, you need to spend money on it).
Shortest path to storage (don't put a VM in the middle if you can possibly avoid it. For CIFS shares a NAS head may be the best approach).
Try to make your workload cacheable - I know, easier said than done. But with millions of files, if you've got a predictable fetch pattern your array will start prefetching, and it'll go a LOT faster. You may find that if you start archiving the files into large 'chunks' you'll gain performance (because the array/client will fetch the whole chunk, and it'll be available for the next client).
Basically the 'lots of small random IO operations' especially on slow disks is really the worst case for storage, because none of the clever tricks for optimization work.

Cache size on heroku postgres smaller than advertised?

I fired up a Zilla instance of Heroku Postgres, which is advertised as having a 17GB memory cache.
When I run show all; I see:
effective_cache_size | 12240000kB
Does this mean the cache is 12GB and not 17GB? Or am I missing something? Queries run much slower when my dataset goes above the 12GB point.
There is a limit on the available memory on the underlying hardware (17G for a zilla). This amount of memory cannot be used entirely for the "hot dataset" cache, however. Many other aspects of normal postgres operations also require memory, as you may imagine. Some of that includes establishing a connection (which spawns a backend), queries requiring joins, queries requiring sorts, or aggregates like count, sum, max, etc. Additionally, processes such as autovacuum also use part of that available memory.

Looking for a FIFO/LRU file storage system

I'm looking to implement a disk-based caching system. The idea is to allocate a certain amount of disk space and save however much data fits in there, discarding old files as I run out of space.
LRU is my first choice of deletion strategy, but I'm willing to settle for FIFO. When googling for cache algorithms, the discussion seems to be dominated by memory-based caching. Memcached, for example, would be exactly what I'm looking for, except that it's memory based. On the other hand, solutions like Memcachedb, couchdb etc. don't seem to have LRU capabilities.
The closest thing I've found is the squid proxy server storage systems. COSS seems to be the most documented one, but to use it I would probably have to rewrite it as a stand-alone process (or library).
What project or (java/python) library can I use for such a thing?
EDIT: found this related question.
I guess most memory caching libraries have an option to persist or expand onto disk; at least Ehcache does.
So you can just configure a cache library to write to disk (either because you want the data to be persistent, or to expand the cache size beyond your memory limits).
Note that Ehcache has LRU capabilities.
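If you did want to roll your own, here is a minimal sketch of a disk-backed LRU under stated assumptions: a single process, keys that are already safe file names, and recency tracked via file atime (so the filesystem must not be mounted noatime). Real systems like Ehcache's disk store or squid's COSS also handle crash safety and concurrency:

    import os

    class DiskLRU:
        def __init__(self, root, max_bytes):
            self.root, self.max_bytes = root, max_bytes
            os.makedirs(root, exist_ok=True)

        def _files_by_age(self):
            paths = [os.path.join(self.root, n) for n in os.listdir(self.root)]
            return sorted(paths, key=os.path.getatime)  # oldest access first

        def put(self, key, data):
            with open(os.path.join(self.root, key), "wb") as f:
                f.write(data)
            self._evict()

        def get(self, key):
            try:
                with open(os.path.join(self.root, key), "rb") as f:
                    return f.read()                     # read refreshes atime
            except FileNotFoundError:
                return None

        def _evict(self):
            files = self._files_by_age()
            total = sum(os.path.getsize(p) for p in files)
            while total > self.max_bytes and files:
                victim = files.pop(0)                   # least recently used
                total -= os.path.getsize(victim)
                os.remove(victim)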

Estimation of commodity hardware for an application

Suppose I wanted to develop the Stack Overflow website. How do I estimate the amount of commodity hardware required to support it, assuming 1 million requests per day? Are there any case studies that explain the performance improvements possible in this situation?
I know that I/O is the major bottleneck in most systems. What are the possible options to improve I/O performance? The few I know of are:
caching
replication
You can improve I/O performance in several ways, depending on what you use for your storage setup:
Increase filesystem block size if your app displays good spatial locality in its I/Os or uses large files.
Use RAID 10 (striping + mirroring) for performance + redundancy (disk failure protection).
Use fast disks (Performance Wise: SSD > FC > SATA).
Segregate workloads at different times of day. e.g. Backup during night, normal app I/O during day.
Turn off atime updates in your filesystem.
Cache NFS file handles a.k.a. Haystack (Facebook), if storing data on NFS server.
Combine small files into larger chunks, a.k.a. BigTable, HBase (see the sketch after this list).
Avoid very large directories i.e. lots of files in the same directory (instead divide files between different directories in a hierarchy).
Use a clustered storage system (yeah not exactly commodity hardware).
Optimize/design your application for sequential disk accesses whenever possible.
Use memcached. :)
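To make the "combine small files into larger chunks" item concrete, here is a heavily simplified, Haystack/BigTable-flavoured sketch (all names are illustrative): blobs are appended to one large file and located via an in-memory index of offsets, so a read costs one seek instead of a per-file open:

    import os

    class ChunkStore:
        def __init__(self, path):
            self.path = path
            self.index = {}                  # name -> (offset, length)
            open(path, "ab").close()         # ensure the chunk file exists

        def add(self, name, data):
            with open(self.path, "ab") as f:
                offset = f.tell()
                f.write(data)
            self.index[name] = (offset, len(data))

        def read(self, name):
            offset, length = self.index[name]
            with open(self.path, "rb") as f:
                f.seek(offset)               # one seek instead of open+stat
                return f.read(length)

A real implementation would also persist the index and handle deletes/compaction.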
You may want to look at "Lessons Learned" section of StackOverflow Architecture.
Check out this handy tool:
http://www.sizinglounge.com/
And another guide from Dell:
http://www.dell.com/content/topics/global.aspx/power/en/ps3q01_graham?c=us&l=en&cs=555
If you want your own stackoverflow-like community, you can sign up with StackExchange.
you can read some case studies here:
High Scalability - How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
http://www.gear6.com/gear6-downloads?fid=56&dlt=case-study&ls=Veoh-Case-Study
1 million requests per day is about 12 per second (1,000,000 / 86,400 ≈ 11.6). Stack Overflow is small enough that you could (with interesting normalization and compression tricks) fit it entirely in the RAM of a 64GB Dell PowerEdge 2970. I'm not sure where caching and replication should play a role.
If normalization is a problem for you, a PowerEdge R900 with 256GB is available.
If you don't like a single point of failure, you can connect a few of those and just push updates over a socket (preferably on a separate network card). Even a peak load of 12K requests/second should not be a problem for a main-memory system.
The best way to avoid the I/O bottleneck is to not do I/O (as much as possible). That means a Prevayler-like architecture with batched writes (losing a few seconds of data is no problem) - basically a log file - and, for replication, writing them out to a socket as well.
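A toy version of that batched-write idea, assuming it is acceptable to lose a couple of seconds of data on a crash; the path and flush interval are illustrative:

    import os
    import time

    class BatchedLog:
        def __init__(self, path, flush_every=2.0):
            self.f = open(path, "ab")
            self.flush_every = flush_every
            self.last_flush = time.monotonic()

        def append(self, record: bytes):
            self.f.write(record + b"\n")      # buffered in user space
            if time.monotonic() - self.last_flush >= self.flush_every:
                self.f.flush()
                os.fsync(self.f.fileno())     # one disk sync per batch
                self.last_flush = time.monotonic()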

Problem with Caching on the client side?

I want to cache data on the client. What is the best algorithm/data structure that can be employed?
Case 1. The data to be stored requires extremely fast string searching capability.
Case 2. The cached data set can be large. I don't want to explode the client's memory usage, and I also don't want to make network and disk access calls, which slow down my processing time on the client side.
Solutions:
Case 1: I think a suffix tree/trie provides a good solution in this case.
Case 2: The two problems to consider here are:
Storing large data with minimum memory consumption.
Not making any network calls to access data that is not available in the cache.
An LRU caching model is one solution I can think of, but it does not prevent me from bloating the memory.
Is there any way to write the data down to a file and access it without compromising it (security aspect)?
Let me know if any point is not clear.
EDIT:
Josh, I know my requirements are unrealistic. To narrow them down, I am looking for something that stores data using an LRU algorithm, ideally with a dynamically configurable size up to a maximum limit. This would reduce the number of calls going to the network/database and provide good performance as well.
It would be even better if this LRU algorithm worked on compressed data that can be decompressed with a slight overhead (but less than a network call).
Check out all the available caching frameworks/libraries - I've found Ehcache to be very useful. You can have it keep just some (most recent) items in memory and fail over to disk at a specified memory usage. The disk calls will still be a lot faster than network calls, and you avoid taking up all the memory.
Ehcache
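As a rough in-memory illustration of what the EDIT asks for - an LRU with a byte budget whose values are stored zlib-compressed so more fits under the limit (purely a sketch; libraries like Ehcache add disk overflow and eviction tuning on top):

    import zlib
    from collections import OrderedDict

    class CompressedLRU:
        def __init__(self, max_bytes):
            self.max_bytes, self.size = max_bytes, 0
            self.items = OrderedDict()       # key -> compressed bytes

        def put(self, key, data: bytes):
            blob = zlib.compress(data)
            if key in self.items:
                self.size -= len(self.items.pop(key))
            self.items[key] = blob
            self.size += len(blob)
            while self.size > self.max_bytes:
                _, evicted = self.items.popitem(last=False)  # evict LRU end
                self.size -= len(evicted)

        def get(self, key):
            blob = self.items.get(key)
            if blob is None:
                return None                  # caller falls back to network
            self.items.move_to_end(key)      # mark as recently used
            return zlib.decompress(blob)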
Unfortunately, I think your expectations are unrealistic.
Keeping memory usage small, but also not making disk access calls means that you have nowhere to store the data.
Furthermore, to answer your question about security: there is no client-side data storage (assuming you are talking about a web application) that is "secure". You could encrypt it, but this would destroy your speed requirements as well as require server-side processing. Everything stored at and sent from the client is suspect.
Perhaps if you could describe the problem in greater detail we can suggest some realistic solutions.
