Appfabric Caching: Configuration Provider as single point of failure - caching

After doing some initial research into using Appfabric for caching, my understanding is that the configuration provider for the cluster is a single point of failure as mentioned here:
MSDN
I want to use appfabric just for distributed caching, particularly for the tagging features. What are the options to avoid having the configuration provider as this failure point? I thought of two but not sure if one is better or if there are any other options.
(1) Create my own caching service configuration provider. I'm guessing this is possible (?) but I'm not sure how to go about it. I'd probably make a provider that fetched the xml file from S3 since I'm already using AWS.
(2) Configure each cache as a single node cluster and then create a proxy client that uses the individual nodes as a distributed cache, a la a memcached type client.
Thoughts or recommendations, or anything else I should consider in making this decision?

Yes, it is a single point of failure.
Microsoft's recomended solutions seem to be:
(SQL Server provider) Use SQL Server
clustering. In my limited
experience of it, using SQL Server
clustering for this is probably a
case of 'the cure is worse than the
disease' i.e. it brings a lot of
pain. Unless you've already got a SQL
Server cluster available, avoid!
(XML
provider) Use Windows Server
clustering. I have even less
knowledge of this than SQL
clustering, so I can't say how well (or otherwise)
this might work. It doesn't strike me as a trivial thing to do, though.
You can create your own configuration provider by implementing the ICustomProvider interface and making some registry entries. Using AWS seems like a really good idea to make the config provider resilient, I'd be interested to see how you got on with this.
Creating a proxy client seems to me like you'd be making a lot of work for yourself, at that point it feels like you'd be more fighting against AppFabric rather than working with it.

We have also tried AppFabric but it gave us fair few headaches like for one there's no API access which is making it very difficult to use our current unit testing strategy. We have now moved to NCache that is better option than AppFabric. NCache provides tagging feature and it is not a single point of failure.

Related

Using org.springframework.cache.support.SimpleCacheManager in the cloud

I noticed that Spring reference application (Sagan) uses the SimpleCacheManager implementation. See here for source code of Sagan.
I was surprised by this choice because I thought that all but small applications running on a single node would use something like a Redis cache manager and not the simple cache manager.
How can a large application like Sagan -which I assume runs on cloudfoundry- use this simple implementation?
Any comment welcome.
Well, the SimpleCacheManager choice has been made because it was the simplest solution that could possibly work. Note that Sagan is, at least for now, not storing a lot of data in that cache and merely using it to respect various APIs rate-limiting and get better performance on some parts of the application.
Yes, Sagan is running on CloudFoundry (see this presentation) and is using CF marketplace services.
Even if cache consistency between instances is not a constraint for now, we could definitely add another marketplace service, here a Redis Cloud instance, and use this as a central cache repository.
Now that we're considering using that cache for more features, it even makes sense to at least consider that use case, since it could lower our monthly bill (pay a small fee for a redis service and use less memory for our CF instances).
In any case, thanks a lot balteo for this insightful question, we've created a Github issue for that.

MemoryCache object and load balancing

I'm writing a web application using ASP .NET MVC 3. I want to use the MemoryCache object but I'm worried about it causing issues with load balanced web servers. When I google for it looks like that problem is solved on the server ie using AppFabric. If a company has load balanced servers is it on them to make sure they have AppFabric or something similar running? or is there anything I can or should do as a developer for this?
First of all, for ASP.NET you should look at the ASP.NET Cache instead of MemoryCache. MemoryCache is a generic caching API that was introduced in .NET 4.0 to provide an equivalent of the ASP.NET Cache in non-web applications.
You're correct to say that AppFabric resolves the issue of multiple servers having their own instances of cached data, in that it provides a single logical cache accessible from all your web servers. Before you leap on it as the solution to your problem, there's a couple of things to consider:
It does not ship as part of Windows Server - it is, as you say, on
you to install it on your servers if you want to use it. When
AppFabric was released, there was a suggestion that it would ship as
part of the next release of Windows Server, but I haven't seen
anything about Windows Server 2012 that confirms that to be the case.
You need extra servers for it, or at least you're advised to have
them. Microsoft's recommendation for AppFabric is that you run it on
dedicated servers. Which means that whilst AppFabric itself is a free
download, you may be incurring additional Windows Server licence
costs. Speaking of which...
You might need Enterprise Edition licences. If you want to use the
High Availability features of AppFabric, you can only do this with
servers running Enterprise Edition, which is a more expensive licence
than Standard Edition.
You might not need it after all. Some of this will depend on your application and why you want to use a shared caching layer. If your concern is that caches on multiple servers could get out of sync with the database (or indeed each other), some judicious use of SqlCacheDependency objects might get you past the issue.
This CodeProject article Implementing Local MemoryCache Invalidation with Redis suggests an approach for handling the scenario you describe.
You didn't mention the flavor of load balancing that you are using: "sticky" or "stateless". By far the easiest solution is to use sticky sessions.
If you want to use local memory caches and stateless load balancing, you can end up with race conditions the cross-server invalidation messages arrive late. This can be particularly problematic if you use the Post-Redirect-Get pattern so common in ASP.Net MVC. This can be overcome by using cookies to supplement the cache invalidation broadcasts. I detail this in a blog post here.

Create a LDAP cache using unboundid LDAP SDK?

I would like to make a LDAP cache with the following goals
Decrease connection attempt to the ldap server
Read local cache if entry is exist and it is valid in the cache
Fetch from ldap if there is no such request before or the entry in the cache is invalid
Current i am using unboundid LDAP SDK to query LDAP and it works.
After doing some research, i found a persistent search example that may works. Updated entry in the ldap server will pass the entry to searchEntryReturned so that cache updating is possible.
https://code.google.com/p/ldap-sample-code/source/browse/trunk/src/main/java/samplecode/PersistentSearchExample.java
http://www.unboundid.com/products/ldapsdk/docs/javadoc/com/unboundid/ldap/sdk/AsyncSearchResultListener.html
But i am not sure how to do this since it is async or is there a better way to implement to cache ? Example and ideas is greatly welcomed.
Ldap server is Apache DS and it supports persistent search.
The program is a JSF2 application.
I believe that Apache DS supports the use of the content synchronization controls as defined in RFC 4533. These controls may be used to implement a kind of replication or data synchronization between systems, and caching is a somewhat common use of that. The UnboundID LDAP SDK supports these controls (http://www.unboundid.com/products/ldap-sdk/docs/javadoc/index.html?com/unboundid/ldap/sdk/controls/ContentSyncRequestControl.html). I'd recommend looking at those controls and the information contained in RFC 4533 to determine whether that might be more appropriate.
Another approach might be to see if Apache DS supports an LDAP changelog (e.g., in the format described in draft-good-ldap-changelog). This allows you to retrieve information about entries that have changed so that they can be updated in your local copy. By periodically polling the changelog to look for new changes, you can consume information about changes at your own pace (including those which might have been made while your application was offline).
Although persistent search may work in your case, there are a few issues that might make it problematic. The first is that you don't get any control over the rate at which updated entries are sent to your client, and if the server can apply changes faster than the client can consume them, then this can overwhelm the client (which has been observed in a number of real-world cases). The second is that a persistent search will let you know what entries were updated, but not what changes were made to them. In the case of a cache, this may not have a huge impact because you'll just replace your copy of the entire entry, but it's less desirable in other cases. Another big problem is that a persistent search will only return information about entries updated while the search was active. If your client is shut down or the connection becomes invalid for some reason, then there's no easy way to get information about any changes while the client was in that state.
Client-side caching is generally a bad thing, for many reasons. It can serve stale data to applications, which has the potential to cause incorrect behavior or in some cases pose a security risk, and it's absolutely a huge security risk if you're using it for authentication. It could also pose a security risk if not all of the clients have the same level of access to the data contained in the cache. Further, implementing a cache for each client application isn't a scalable solution, and if you were to try to share a cache across multiple applications, then you might as well just make it a full directory server instance. It's much better to use a server that can simply handle the desired load without the need for any additional caching.

Horizontal Scaling of Tomcat in Microsoft Azure

I am working on this quiet a while, but still no conclustion.
I want to do horizontal scaling of Tomcat instances in Microsoft Azure (1,2,3,... Tomcat instances for one service). I read lots of articles about session replication, clustering,... with Tomcat. Since Azure does not support Multicasts, there is no easy way to cluster Tomcat. Also sticky sessions is no options, because Azure does round robin load balancing. Setting up two services - one with Terracotta or Apache mod_jk - and the other with Tomcat instances seems overkill for me (if even doable)...
Is this even possible?
Thanks in advance for reading and answering my question. Every comment/idea is highly appreciated.
There is the new appFabric caching service you could use, or there are examples of using Memcache on Azure, would that help?
http://code.msdn.microsoft.com/winazurememcached
Why do you feel that running 2 services is overkill, exactly? If you have no issue with scaling out to n Tomcat instances, adding another service for load distribution is a perfectly acceptable solution in my book. By running that service on a minimum of 2 instances, that service itself meets the Azure SLA requirements: your uptime will be as good as it is going to get on Azure, and you avoid a SPoF (single point of failure).
You could go with a product like terracotta, but it is also pretty straightforward to write a simple socket server to route HTTP sessions back to a particular instance running in Windows Azure. You would have to be aware of node recyles, but that is quite doable.
Be aware that memcached requires an additional Azure service as well (web roles), the appFabric caching service does not (but also has cost associated with it). I do not know Tomcat, but for IIS you can easily move session state from in memory to persisted (either SQL Azure or Azure Storage). Something to be aware of: for high volume sites, the transaction cost to Azure Storage can actually become a cost driver for your deployment if you store session info there. SQL Azure could well be the more cost-effective solution, but on the other hand might not be supported out-of-the-box for your solution.
I do not think that you can run Tomcat on Azure. Even if you could (using the virtual machine role) it is probably cheaper to run it on a Linux VM on Amazon EC2.
Edit
I see that this is possible using the Tomcat Solution Accelerator. But look at the disclaimer:
This solution accelerator is provided
for informational purposes only and
Microsoft or Infosys makes no
warranties, either express or implied
This is an unsupported solution. I know that it is often difficult to question management decisions. But using unsupported software for production systems, when a cheaper supported alternative is available, is generally not a good idea.

Move application to Websphere clusters

What should we take care of before moving an application from a single Websphere Application Server to a Websphere cluster
This is my list from experience. It is not complete but should cover the most common problem areas:
Plan head the distributed session management configuration (ie. will you use memory-to-memory or database based replicaton). Make a notice that if you are still on 32-bit platform the resource requirement overhead from clustering might cause you instability issues if your application uses already lots of memory.
Make sure that everything you put into user sessions can be serialized with the default serializer (implements Serializable). You might otherwise run into problems with distributed sessions.
The same goes for everything you put into DynaCache. Make sure everything serializes properly.
Specify and make sure all the resource definitions (JDBC providers etc) will be made to a proper scope. I would usually recommend using the actual Cluster scope for everything that your applications installed to cluster use. That ensures the testing features work properly from proper points, and that you don't make conflicting definitions.
Make sure your application uses relative paths for resources in web interfaces. Once you start load balancing and stuff you can run into some serious problems if you have bolted down a lot of stuff.
If you had any sort of timers make sure they work well with clusters. With Quartz that means probably that you should use the JDBC store for timer tasks. With EJB Timers make sure you register the timers only once (it is possible to corrupt the timer database of WAS if you have several nodes attempting the registering at the exactly same time) and make sure you install them to Cluster scope.
Make sure you use the WAS provided SSO mechanisms. If you have a custom implementation please make sure it handles moving the user between servers in cluster well.
Keep it simple, depending on your requirements, try configuring your load balancer to use sticky sessions and not hold state in your HTTP Session. That way you don't need to use resource hungry in memory session replication.
Single Sign On isn't an issue for a single cluster as your HTTP clients will not be moving off the same http://server.acme.com/... host domain name.
Most of your testing should focus on database contention. If you have a highly transactional application (i.e. many writes to the same table) make sure you look at your database Isolation levels so that locks are not held unecessarily. Same goes for your transaction demarkaction. Keep transactions as brief as possible. If you dont have database skills yourself make sure you get a Database Analyst to help you monitor the database while you test.
Also a good advice to raise a PMR to IBM Support up front of any major changes, such as this one or upgrading to new versions etc. Raise it as a "Software Usage Question" and they can provide you with feedback from their knowledge database based on other customers input. Same would apply for any type of product which you have a support agreement for - ask support before problems occur.

Resources