Is it a good idea to provide a cache as a service?

We have many web services and web applications with caching needs, so we are trying to come up with a caching strategy that can help all teams irrespective of their technical choices. So far we have run Memcached (not replicated) and Couchbase (multi-master) locally on each server node, with applications connecting to them locally using the Memcached protocol. Going forward, we are planning to move to a centralized cache cluster exposed via REST APIs, which can be used by all applications running on different server nodes in a datacenter. The reasons behind this thinking are:
Easy maintenance of the cluster without worrying about app server nodes.
A single protocol (HTTP) is used to access the cache without worrying about the underlying implementation. (We might use a Redis, Couchbase, or Aerospike cluster.)
But we are not sure about this strategy, because we are worried about the performance impact of the network overhead that HTTP adds.
Has anyone tried this strategy? Is it a good idea to make the cache a centralized service, or are local caches best?

While it's true that HTTP and network add latency, generally you need a cache because the actual operation takes significantly longer. So the question is: if you add 1-2 milliseconds to the cache access, does that still shorten the un-cached operation time significantly? If the answer is yes, and you follow some common best practices, having a centralized cache could be a good idea.
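To make that question concrete, here is a back-of-the-envelope sketch in Python. The hit ratio and every timing below are made-up illustration values, not measurements, and the model assumes each request checks the cache first and falls through to the slow operation on a miss.

```python
def expected_latency_ms(hit_ratio, cache_ms, origin_ms):
    """Average latency when every request checks the cache first.

    A hit costs cache_ms; a miss costs cache_ms plus the uncached origin_ms.
    """
    return cache_ms + (1.0 - hit_ratio) * origin_ms


# Hypothetical numbers: the uncached operation takes 50 ms and 90% of requests hit.
ORIGIN_MS = 50.0
local = expected_latency_ms(hit_ratio=0.9, cache_ms=0.1, origin_ms=ORIGIN_MS)
remote = expected_latency_ms(hit_ratio=0.9, cache_ms=2.0, origin_ms=ORIGIN_MS)

print(f"local cache:  {local:.1f} ms on average")
print(f"remote cache: {remote:.1f} ms on average")
print(f"no cache:     {ORIGIN_MS:.1f} ms")
```

With these assumed numbers the extra 2 ms per lookup is small next to the 50 ms operation it avoids; if the uncached operation itself took only a few milliseconds, the same arithmetic would argue against the remote cache.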
You might want to look into low latency, high throughput server-side frameworks for the HTTP service, like Node.js or Go. Also, you will probably benefit from implementing proper ETag support in your cache HTTP API.
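As a rough illustration of what ETag support could look like in such an API, here is a framework-independent sketch in Python. The function names are hypothetical, the ETag is simply a hash of the cached bytes, and the matching is simplified (it ignores weak validators and the `*` wildcard).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the cached value's bytes (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on a cached value.

    if_none_match is the raw If-None-Match header sent by the client, if any.
    """
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # The client already holds this version, so skip the payload.
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First request: full response. Second request: the client replays the ETag.
status, headers, payload = conditional_get(b"cached value")
status_again, _, payload_again = conditional_get(b"cached value", headers["ETag"])
print(status, status_again, len(payload_again))
```

The point of the 304 path is that a client holding a current copy pays only for the round trip, not for transferring the value again.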
Another alternative might be centralizing the cache server(s) without wrapping them in an HTTP layer. Standard cache provider implementations for all the technologies you mentioned are available for most modern web frameworks.

Disclaimer: I work for Redis Labs, a commercial company that makes tools for managing Redis and Memcached clusters. My employer, Redis Labs, has made a business of the strategy that you want affirmed :)
Cache is a dish best served close, but remote caching has benefits (e.g., offloading the DB) even if the latency penalty suggests differently. In most cases, compared to the time spent in the application, the local area network latency becomes negligible, so using a shared network-attached cache makes a lot of sense.
To get the best performance, interact from your app directly with the shared cache using its own protocol. An HTTP API, unless provided by the caching engine itself, could add latency to the client app's requests. OTOH, formalizing your apps' access to the cache with a custom layer (such as a REST API) has a lot of nice benefits too, so you should evaluate the cost in the context of your latency budgets.
Your strategy is sound and it is used everywhere to build scalable and performant applications. Feel free to hit me up if you need further advice.

Related

What are the best practices for implementing a caching layer?

I'm going to use Redis as a cache service.
What are the best practices for accessing the caching service?
Through a service/API or in-memory component?
I'm not sure I want to have access to the DB from all the services.
Thanks
All your questions depend on the topology and/or architecture of your system. I don't think you would provide a service on a separate computer if your application resided entirely on one computer.
But suppose you have a distributed app.
In this case it makes sense to do caching using a separate service on a separate node. It's the same as encapsulation in OOP: you can simply encapsulate the data in the cache as well. Other services depend on your caching service, not directly on Redis, so you can decide to swap Redis for something else. Another advantage of a caching service is that it can keep data in its own memory, depending on throughput, and fetch from Redis only from time to time. Note that you can simply buy a server with a lot of RAM, e.g. 192 GB, because a caching service needs memory more than anything else.
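A minimal Python sketch of that idea follows. Everything in it is hypothetical: `DictBackend` is a stand-in for Redis (so the example runs without a server), and `CacheService` is an assumed name for the facade that other services would depend on. The facade keeps a short-lived in-memory copy and goes back to the backend only when that copy has expired.

```python
import time


class DictBackend:
    """Stand-in for the shared store (Redis in the answer); purely illustrative."""

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts how often the backend is actually consulted

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class CacheService:
    """Facade that callers depend on instead of talking to the backend directly."""

    def __init__(self, backend, local_ttl_seconds=5.0, clock=time.monotonic):
        self._backend = backend
        self._ttl = local_ttl_seconds
        self._clock = clock
        self._local = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # served from this process's memory
        value = self._backend.get(key)  # local copy missing or stale
        if value is not None:
            self._local[key] = (self._clock() + self._ttl, value)
        return value

    def set(self, key, value):
        self._backend.set(key, value)
        self._local[key] = (self._clock() + self._ttl, value)


backend = DictBackend()
cache = CacheService(backend)
cache.set("user:1", "alice")
print(cache.get("user:1"), "backend reads:", backend.reads)
```

Because callers only see `get` and `set`, replacing the backend with a different store would not touch them; the trade-off of the in-memory layer is that a value can be stale for up to the chosen TTL.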

How do you distribute your app across multiple servers using EC2?

For the first time I am developing an app that requires quite a bit of scaling; I have never had an application that needed to run on multiple instances before.
How is this normally achieved? Do I cluster SQL servers, then mirror the code across all servers and use load balancing?
Or do I separate out the functionality to run some on one server and some on another?
Also, how do I push out code to all my EC2 Windows instances?
This will depend on your requirements. But as a general guideline (I am assuming a website), I would separate the DB, web server, caching server, etc. onto different instance(s) and use S3 (+ CloudFront) for static assets. I would also make sure that proper rate limiting is in place so that only legitimate load reaches the infrastructure.
For the RDBMS server I might set up a master-slave DB configuration (RDS makes this easier), use DB sharding, etc. DB cluster solutions also exist; they are more complex to set up but simplify database access for the application programmer. I would also check all the DB queries and then tune the DB/SQL queries accordingly. In some cases pure NoSQL-type databases might be better than an RDBMS, or a mix of both where the application switches between them depending on the data required.
For the web server I would set up a load balancer and then use autoscaling on the web server instance(s) behind the load balancer. Something similar applies to the app server, if any. I would also tune the web server's settings.
The caching server would also be separated into its own cluster of instance(s). ElastiCache seems like a nice service. Redis has comparable performance to Memcached but has more features (like lists, sets, etc.) which might come in handy when scaling.
Disclaimer - I'm not going to mention any Windows specifics because I have always worked on Unix machines. These guidelines are fairly generic.
This is a subjective question and everyone would tailor one's own system in a unique style. Here are a few guidelines I follow.
If it's a web application, separate the presentation (front-end), middleware (APIs) and database layers. A sliced architecture scales better than a monolithic application.
Database - Amazon provides excellent and highly available services (unless you are in the us-east region) for SQL and NoSQL data stores. You might want to check out RDS for relational databases and DynamoDB for NoSQL. Both scale well, and you need not worry about managing and sharding/clustering your data stores once you launch them.
Middleware APIs - This is a crucial part. It is important to have a set of APIs (preferably REST, but you could pretty much use anything here) which expose your back-end functionality as a service. A service-oriented architecture can be scaled very easily to cater to multiple front-facing clients such as web, mobile, desktop, third-party widgets, etc. Middleware APIs should typically NOT be where your business logic is processed; most of it (or all of it) should be translated to database lookups/queries for higher performance. These services could be load balanced for high availability. Amazon's Elastic Load Balancers (ELB) are good for starters. If you want to get into more customization, like blocking traffic from a certain set of IP addresses or performing Blue/Green deployments, then maybe you should consider HAProxy load balancers deployed to separate instances.
Front-end - This is where your presentation layer should reside. It should avoid any direct database queries except for the ones which are limited to the scope of the front-end, e.g. a simple Redis call to get the latest cache keys for front-end fragments. This is where you can do a lot of caching, from the service calls right through to the front-end fragments. You could use AWS CloudFront for static asset delivery and AWS ElastiCache for your cache store. ElastiCache is essentially a managed cache cluster (Memcached or Redis). You should also consider load balancing the front-end nodes behind an ELB.
All this can be bundled and deployed with AutoScaling using AWS Elastic Beanstalk. It currently supports ASP.NET, PHP, Python, Java and Ruby containers. AWS Elastic Beanstalk still has its own limitations but is a very cool way to manage your infrastructure with the least hassle for monitoring, scaling and load balancing.
Tip: Identifying the read and write intensive areas of your application helps a lot. You could then go ahead and slice your infrastructure accordingly and perform required optimizations with a read or write focus at a time.
To sum it all up, AWS has pretty much everything you could possibly use to craft your server topology. It's up to you to choose the components.
Hope this helps!
The way I would do it is to have one server as the DB server with MySQL running on it. All my data lives in Memcached, which can span multiple servers, and my clients use a simple rule: "if not in Memcached, read from the DB, put it in Memcached, and return".
Memcached is very easy to scale compared to a DB. Scaling a DB takes a lot of administrative effort; it's a pain to get it right and working. So I choose Memcached. In fact, I keep extra Memcached servers up just to cover downtime (if any of my Memcached servers goes down).
My data is mostly reads, with few writes. And when writes happen, I push the data to Memcached too. All in all this works better for me in terms of code, administration, fallback, failover, and load balancing. All win. You just need to code a "little" bit better.
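The read rule and the push-on-write described above could be sketched in Python roughly as follows. This is an illustration only: plain dicts stand in for the Memcached client and the MySQL table, and `CachedRepository` is a hypothetical name.

```python
class CachedRepository:
    """Cache-aside reads plus push-on-write, with dicts as stand-ins."""

    def __init__(self, cache, db):
        self.cache = cache  # stand-in for a Memcached client
        self.db = db        # stand-in for the MySQL table
        self.db_reads = 0   # counts how often the DB is actually hit

    def read(self, key):
        value = self.cache.get(key)
        if value is None:
            # Not in the cache: read from the DB, then populate the cache.
            self.db_reads += 1
            value = self.db.get(key)
            if value is not None:
                self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes go to the DB and are pushed to the cache as well.
        self.db[key] = value
        self.cache[key] = value


repo = CachedRepository(cache={}, db={"user:1": "alice"})
repo.read("user:1")   # miss: goes to the DB and fills the cache
repo.read("user:1")   # hit: served from the cache
print("DB reads:", repo.db_reads)
```

One caveat of this sketch is that a cache entry lost between the DB write and the cache write is simply reloaded on the next read, so the DB remains the source of truth.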
Clustering MySQL is more tempting, as it seems easier to code, deploy, maintain, and keep up and performing. But remember that MySQL is hard-disk based and Memcached is memory based, so by nature it is much faster (at least 10 times). And since it takes over all the read load from the DB, your DB config can be REALLY simple.
I really hope someone points to a contrary argument here, I would love to hear it.

MongoLab and network latency?

I'm just curious: there are a few services like MongoLab where data is hosted on remote servers. Anyone who's worked with databases knows that there's a certain amount of network latency, even when all servers are internal. Is a remote data storage service such as MongoLab a good idea for production environments?
This question is mainly for AJAX based web apps or websites in general.
I've found MongoLab to be pretty good. Obviously, you need to think about round-trips in general, and optimising those will minimise your overall latency.
It also makes sense to put yourself into the same data-center as MongoLab (you can choose where). They also have a (beta) service on Azure now.
I've been running services with high latency (three different geographical regions for browser, web servers and Mongo) and it still performs adequately in my case because my interactions are not "chatty".
As you probably know, one of the design constraints with Mongo is a lack of joins, so my data structures have naturally lent themselves to simple Q&A fetching of data. I don't read one collection and then use that information to go look in another (manual joins). As a result, I'm not adding up latency costs with those complex interactions. The worst case is generally a single request/response (or a series of parallel, single request/response queries) so it's the difference of about 200ms total which is acceptable.
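The effect of "chatty" versus simple interactions can be shown with a little arithmetic in Python. The 100 ms round trip is a hypothetical figure, and the model deliberately ignores server processing time and bandwidth.

```python
def sequential_total_ms(round_trip_ms, dependent_queries):
    """Each query must wait for the previous answer, so round trips add up."""
    return round_trip_ms * dependent_queries


def parallel_total_ms(round_trip_ms, independent_queries):
    """Independent queries sent together overlap, costing about one round trip."""
    return round_trip_ms if independent_queries > 0 else 0


ROUND_TRIP_MS = 100  # hypothetical cross-region round trip, not a measurement

chatty = sequential_total_ms(ROUND_TRIP_MS, 5)  # manual join across 5 collections
simple = parallel_total_ms(ROUND_TRIP_MS, 5)    # 5 independent fetches in parallel
print(f"chatty: {chatty} ms, parallel: {simple} ms")
```

Under these assumptions, five dependent lookups cost five round trips, while five independent lookups issued together cost roughly one.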
But of course, the closer you can get your web servers to your DB the better you'll be.
Presumably, if you're spending enough money, MongoLab et al could roll you a custom configuration, possibly where you can have local secondaries.

How many connections/how much bandwidth can Apache handle?

This is a request for pointers to good documentation/good articles. I'm looking for information on how many connections an Apache server can reasonably handle, and potentially how to load balance between multiple servers. I've done Google searches but it's harder for beginners to judge what are good docs.
Apache 1.3 had some nasty scalability limitations, but later versions are designed to scale with the hardware and operating system, making them the bottleneck rather than the web server itself. As always, though, it comes down to how you configure and tune it if you want uber performance. Each situation has its own demands, and they're documented here:
http://httpd.apache.org/docs/2.2/misc/perf-tuning.html
The above assumes you're serving static content, which is where Apache excels. If you run webapps behind it, that's your bottleneck, not Apache.
Unfortunately you'll be disappointed.
Apache's ability to handle connections (and indeed any other web server's) is limited by what the web application sitting on top of it is doing. If you're serving static pages, you will be able to serve a lot of requests with very little hardware.
Depending on the IO workload (Apache cannot work faster than the IO subsystem - install enough RAM to cache your entire content, if you can), you will be able to fill up a gigabit network on any reasonably specced modern box.
Once you've filled a gigabit network, you'll have other things to worry about.
But the reason you really need load balancers is that your application slows down Apache and uses up the box's resources. Your application will not be infinitely fast, nor infinitely scalable. You'll need to address those issues.
As the previous answers have pointed out, it is generally not Apache that becomes the bottleneck; instead it is usually the application server (PHP, Mongrel, etc.). However, if you are only serving static content, then you will want to do some benchmarking to see how fast it can go. Of course, it is hard to pin down the exact number of requests Apache will be able to serve, since a lot depends on how you configure it (e.g. disabling persistent connections) and the specs of the server. However, to get a ballpark estimate you can use this benchmark as a reference, since it is run on 1-8 cores (using one or two servers), so you should be able to find something reasonably comparable to the hardware you are considering.
Of course in order to get the most accurate results you will want to test it yourself using a load generator like ab or httperf.

Caching with multiple servers

I'm building an application that involves multiple servers (4 servers, each with a database and a web server: 1 master database and 3 slaves, plus one load balancer).
There are several approaches to enabling caching. Right now it's fairly simple and not efficient at all.
All the caching is done on an NFS partition shared between all servers. NFS is the bottleneck in the architecture.
I have several ideas for implementing caching:
It can be done at the server level (local file system), but the problem is invalidating a cache file on all servers when the content has been updated. That can be done with a short cache lifetime (not efficient, because most of the time the cache will be refreshed sooner than it should be).
It can also be done with a messaging system (XMPP, for example) where the servers communicate with each other. The server responsible for invalidating the cache sends a request to all the others to let them know that the cache has been invalidated. Latency is probably higher (it takes more time for everybody to know that the cache has been invalidated), but my application doesn't require atomic cache invalidation.
A third approach is to use a cloud system to store the cache (like CouchDB), but I have no idea about its performance. Is it faster than using a SQL database?
I planned to use Zend Framework, but I don't think it's really relevant (except that packages for dealing with XMPP or CouchDB probably exist in other frameworks).
Requirements: persistent cache (if a server restarts, the cache shouldn't be lost, to avoid bringing down the server while re-creating the cache).
http://www.danga.com/memcached/
Memcached covers most of the requirements you lay out - message-based read, commit and invalidation. High availability and high speed, but very little atomic reliability (sacrificed for performance).
(Also, memcached powers things like YouTube, Wikipedia, Facebook, so I think it can be fairly well-established that organizations with the time, money and talent to seriously evaluate many distributed caching options settle on memcached!)
Edit (in response to comment)
The idea of a cache is for it to be relatively transitory compared to your backing store. If you need to persist the cache data long-term, I recommend looking at either (a) denormalizing your data tier to get more performance, or (b) adding a middle-tier database server that stores high-volume data in straight key-value-pair tables, or something closely approximating that.
In defence of memcached as a cache store, if you want high performance with low impact of a server reboot, why not just have 4 memcached servers? Or 8? Each 'reboot' would have correspondingly less effect on the database server.
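To illustrate why more servers soften a reboot, here is a small Python sketch that spreads keys over N cache servers and measures what share of them one rebooted server would take with it. The key names and server names are made up, and the simple hash-modulo placement is an assumption for illustration; real client libraries often use consistent hashing instead.

```python
import hashlib


def server_for(key, servers):
    """Pick a server for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def share_lost(keys, servers, rebooted):
    """Fraction of keys that lived on the rebooted server."""
    lost = sum(1 for key in keys if server_for(key, servers) == rebooted)
    return lost / len(keys)


keys = [f"page:{i}" for i in range(10000)]  # hypothetical cache keys
for count in (1, 4, 8):
    servers = [f"cache{i}" for i in range(count)]
    lost = share_lost(keys, servers, "cache0")
    print(f"{count} server(s): about {lost:.0%} of keys lost when cache0 reboots")
```

With a single server every key is lost on a reboot; with 4 or 8 servers roughly a quarter or an eighth of the keys need to be reloaded from the database.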
I think I found a relatively good solution.
I use Zend_Cache to store each cache file locally.
I've created a small daemon based on nanoserver which also manages cache files locally.
When one server creates, modifies, or deletes a cache file locally, it sends the same action through the daemon to all the other servers, which perform the same action.
That means I have local cache files and remote actions at the same time.
Probably not perfect, but should work for now.
CouchDB was too slow and NFS is not reliable enough.
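The replay-the-action idea described in this answer can be modelled in a few lines of Python. This is not the author's Zend_Cache/nanoserver code: it is an in-process sketch with no networking, where a dict stands in for each server's local cache files and the `Node` class and `connect` helper are hypothetical names.

```python
class Node:
    """One web server with its own local cache (a dict standing in for cache files)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.peers = []

    def apply(self, action, key, value=None):
        """Apply an action locally; this is what the daemon would do on receipt."""
        if action == "set":
            self.cache[key] = value
        elif action == "delete":
            self.cache.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action}")

    def local_change(self, action, key, value=None):
        """Change the local cache, then replay the same action on every peer."""
        self.apply(action, key, value)
        for peer in self.peers:
            peer.apply(action, key, value)


def connect(nodes):
    """Make every node aware of all the others."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]


nodes = [Node(f"web{i}") for i in range(4)]
connect(nodes)
nodes[0].local_change("set", "home", "<html>v1</html>")
nodes[2].local_change("delete", "home")
print([sorted(node.cache) for node in nodes])
```

In a real deployment the peer calls would be network messages, so a server that is down at the time misses the action; that is presumably part of what "probably not perfect" refers to.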

Resources