What's the best place for a database-backed, memory-resident global cache in an ASP.NET web server? - performance

I have to cache an object hierarchy in-memory for performance reasons, which reflects a simple database table with columns (ObjectID, ParentObjectID, Timestamp) and view CurrentObjectHierarchy. I query the CurrentObjectHierarchy and use a hash table to cache the current parents of each object for quickly looking up the parent object ID, given any object ID. Querying the database table and constructing the cache is a 77ms operation on average, and ideally this refresh occurs only when a method in my database API is called that would change the hierarchy (adding/removing/reparenting an object).
Where is the best place for such a cache, if it must be accessed by multiple ASP.NET web applications, possibly running in different application pools?
Originally, I was storing the cache in a static variable in a C# dll shared by the different web applications. The problem, of course, is that while static variables can be accessed across threads, they cannot be accessed across processes, which is a problem when multiple web-apps are involved (possibly running in separate application pools). As a result... synchronized, thread-safe modifications to the object hierarchy cache in one application are not reflected in other applications, even though they are using the same code-base.
So I need a more global location for this cache. I cannot use static variables (as I just explained), session state (which is basically a per-user store), or application state (which is not shared across applications).
Potential places I've been considering are:
Some kind of global object storage within IIS itself, accessible from any thread in any application in any application pool (if such a place exists. Does it?)
A separate, custom web service that manages an exclusive cache.
Right now, I think the BEST solution is SQL CLR integration, because:
I can keep my current design using static variables
It's a separate service that already exists, so I don't have to write a custom one
It will be running in a single process (SQL Server), so the existing lock-based synchronization will work fine
The cache would be sitting as close as possible to the data structures it represents!
I would embed the hierarchy-traversing methods in the SQL CLR DLL, so that I could make a single SQL call where I would normally make a regular method call. This all depends on SQL Server running in a single process and the CLR being loaded into that process, which I think is the case. What do you think of this? Can you see anything obviously wrong with this idea that I may be missing? Is this not an awesome idea?
EDIT:
After looking more closely, it seems that different ASP.NET applications actually run in the same process, but are isolated by AppDomains. If I could find a way to share and synchronize data across AppDomains, that would be very very useful. I'm reading about .NET Remoting now.

Microsoft is working on a distributed caching framework: Velocity. However, the latest release is a CTP3 version, so it may not be production ready...

Related

Handling dictionary values stored in DB - Spring

I am developing an SPA with a backend written in Java (Spring Boot). In the relational DB that the backend connects to, there is a table with some dictionary values. The values can be edited by users of the app, but that happens really, really rarely (almost never).
Those dictionary values are used in a lot of pages on UI and because of that I would like to "cache" them in a way. What I want to achieve is that I want to load dictionary values on startup to avoid asking DB for values during every request between UI and Backend.
Firstly, I thought about just loading it on the UI part of the app, when user enters the page for the first time. Then I ruled it out, since when one of the users changes the values, it should be reloaded.
What I think might work is loading them on startup of the backend into some collection (one that can be safely used in a concurrent environment, probably a ConcurrentMap) and then, during GET requests, asking that collection for the values (instead of the DB). When the values are changed, that request just updates the DB table and reloads them into the collection.
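The in-memory approach described above could look roughly like the following. This is a minimal sketch rather than a Spring-specific implementation: `DictionaryCache`, its `Supplier`-based loader, and `reload()` are hypothetical names, and the loader stands in for the real DB query. Instead of mutating a ConcurrentMap in place, the sketch swaps an immutable snapshot behind a volatile field, so readers never observe a half-reloaded dictionary.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical holder for the dictionary values; in a Spring app this
// would typically be a singleton bean (an assumption, not shown here).
class DictionaryCache {
    // volatile: a reload publishes the new snapshot to all request threads at once
    private volatile Map<String, String> values = Map.of();

    // Stands in for the real DB query (e.g. a repository call)
    private final Supplier<Map<String, String>> loader;

    DictionaryCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        reload(); // load once on startup
    }

    // Call again after every successful update of the dictionary table
    void reload() {
        values = Map.copyOf(loader.get()); // immutable copy of the current DB state
    }

    // Served from memory; no DB round trip per request
    String get(String key) {
        return values.get(key);
    }
}
```

The update endpoint would write to the DB and then call `reload()`. Each backend instance still holds its own copy, so this alone does not keep several instances in sync.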
Then I thought that the collection solution won't be enough once my backend is scaled up to more than one instance. In that case, only one of the instances will be updated and the other will serve outdated data. We could avoid that by forcing periodic refreshes, e.g. every 15 minutes (instead of on demand when the values are updated).
But what I think is the best solution is to start some redis service on a side, load dictionary values into it and after every DB update of the values just update the redis instance with the new ones. Every instance of backend would use the same instance of redis, which seems quicker than executing query (select * from _ where _ = _) on DB.
What do you think? Is my thought process correct? Do you have any ideas that could help solve my issue?
If you are using Spring you could check out Spring Cache Abstraction. That way your cache will be up-to-date whenever some change occurs.
A few implementations are supported by Spring out of the box:
Spring provides a few implementations of that abstraction: JDK java.util.concurrent.ConcurrentMap based caches, Ehcache 2.x, Gemfire cache, Caffeine, and JSR-107 compliant caches (such as Ehcache 3.x). See Plugging-in Different Back-end Caches for more information on plugging in other cache stores and providers.
If you decide to use Memcached implementation you can check out this library (uses Xmemcached under the hood) here.
You could also check a small demo app of how to use Spring Cache Abstraction in your project (link).
I think you're on the right path with your approach to caching. I suggest you also check out Memcached for its simplicity. Redis is a good choice, but it depends on your requirements and whether you need that many features. Just my 2 cents.
https://aws.amazon.com/elasticache/redis-vs-memcached/
https://devcenter.heroku.com/articles/spring-boot-memcache#add-caching-to-spring-boot
Thanks,

Clarification on database caching

Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this:
[Diagram: Write Through strategy]
and the Cache Aside strategy looks like this:
[Diagram: Cache Aside strategy]
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I guess there is one point you have misunderstood: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated from the web server onto its own server. If you think of APIs, like HTTP REST APIs, you can use caching to avoid spending too many resources on database connections and queries. Generally, you want to use as few database connections and queries as possible. Now imagine the following setting:
You have a web server that serves your application and a REST API, which the web server uses to work with some resources. Those resources come from a database (let's say a relational database) which is also hosted on the same server. Now there is one endpoint which serves, e.g., a list of posts (like blog posts). Every user can fetch all posts (to keep this example simple). This is a case where the API request could be cached, so that users do not all hit the database just to query the same resources (via the REST API) over and over again. This is where caching comes in. Redis is one of many tools which can be used for caching. Since Redis is a simple in-memory key-value store, you can just put all of your posts (remember the REST API) into the cache after the first DB query. All future requests for the posts list would first check whether the posts are already cached. If they are, the API returns the cached content for this specific request.
This is one simple example of what caching can be used for.
Answers to your questions
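The check-the-cache-first flow from the posts example can be sketched as follows. This is only an illustration under stated assumptions: `CacheAsideRepository` is a hypothetical name, a `ConcurrentHashMap` stands in for Redis, and a plain `Function` stands in for the database query. In this pattern the application code talks to both the cache and the database; the cache itself never contacts the database.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Cache-aside: check the cache first, fall back to the database on a miss,
// then keep the result in the cache for later requests.
class CacheAsideRepository<K, V> {
    // Stand-in for an external cache such as Redis or Memcached (assumption)
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();

    // Stand-in for the real database query (assumption)
    private final Function<K, V> database;

    CacheAsideRepository(Function<K, V> database) {
        this.database = database;
    }

    V read(K key) {
        // The database function is invoked only when the key is not cached yet
        return cache.computeIfAbsent(key, database);
    }

    // Call after the underlying data changes so the next read hits the database again
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

With a real external cache the same three steps (lookup, fallback query, store) would be separate network calls, and entries would usually also carry an expiry time.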
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; instead, it helps you spend fewer resources. It also helps with temporarily backing up some data, but only as a very minor side effect. For that there are more attractive solutions, since Redis is not persistent by default (although it supports persistence).
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it's a NoSQL or SQL DB. It strongly depends on what you want to cache and how the database and its tables are set up. Do your resources change frequently? Do resources get updated manually or only on user-initiated actions? Those are the questions that lead you to the right caching implementation.
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it also depends heavily on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I would also guess that mixing those strategies is not recommended, because they are coupled to your application type; in other words, it will probably not work out well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess that both are possible. Usually you have one database, maybe clustered or synchronized with replicas, to which your web servers (e.g. REST APIs) make their requests. Then either each of your API servers has its own cache, so that it often does not need to query the database at all (in cloud-based applications your database may also live on a separate server, which means another "hop" in terms of networking), or (what I can also imagine) you have another middleware between your clustered APIs and your DB (maybe also clustered). I guess that no one would do the latter because of the network traffic: it would result in a higher response time, which you usually want to prevent.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be much slower. A machine can access in-memory data faster than it can open another (local) connection to a database and query your cached entries. In addition, the database has to write the entries to files on your machine in order to persist the data.
Conclusion
All in all, it is about keeping response times low and avoiding unnecessary network traffic. I hope I could help you out a little bit.

Do Different CRM Orgs Running On The Same Box Share The Same App Domain?

I'm doing some in memory Caching for some Plugins in Microsoft CRM. I'm attempting to figure out if I need to be concerned about different orgs populating the same cache:
// In Some Plugin
var settings = Singleton.GetCache["MyOrgSpecificSetting"];
// Use Org specific cached Setting:
or do I need to do something like this to be sure I don't cross contaminate settings:
// In Some Plugin
var settings = Singleton.GetCache[GetOrgId() + "MyOrgSpecificSetting"];
// Use Org specific cached Setting:
I'm guessing this would also need to be factored in for Custom Activities in the AsyncWorkflowService as well?
Great question. As far as I understand, you would run into the issue you describe if you stored static data and your assemblies were not registered in Sandbox Mode, so you would have to create some way to uniquely qualify the reference (as your second example does).
However, this goes against Microsoft's best practices for Plugin/Workflow Activity development. A plugin should not rely on any state other than the state that is passed into it. Here is what MSDN says:
The plug-in's Execute method should be written to be stateless because
the constructor is not called for every invocation of the plug-in.
Also, multiple system threads could execute the plug-in at the same
time. All per invocation state information is stored in the context,
so you should not use global variables or attempt to store any data in
member variables for use during the next plug-in invocation unless
that data was obtained from the configuration parameter provided to
the constructor.
So the ideal way to manage caching would be to use either one or more CRM records (likely custom) or a different service to cache this data.
Synchronous plugins of all organizations within the CRM front-end run in the same AppDomain, so your second approach will work. Unfortunately, the async services run in a separate process, from which it is not possible to access your in-process cache.
I think it's technically impossible for Microsoft NOT to implement each CRM organization in at least its own AppDomain, let alone an AppDomain per loaded assembly. I'm trying to imagine how multiple versions of a plugin-assembly are deployed to multiple organizations and loaded and executed in the same AppDomain and I can't think of a realistic way. But that may be my lack of imagination.
I think your problem lies more in the concurrency (multi-threading) than in sharing of the same plugin across organizations. #BlueSam quotes Microsoft where they seem to be saying that multiple instances of the same plugin can live in one AppDomain. Make sure multiple threads can concurrently read/write to your in-mem cache and you'll be fine. And if you really really want to be sure, prepend the cache key with the OrgId, like in your second example.
I figure you'll be able to implement a concurrent cache, so I won't go into detail there.

Handling large object in stateless environment

We have various Windows services that load a large amount of data, mostly settings, from a database into an object which is used whenever calls are made to our various .NET Remoting functions (I know it's old!!). Having this object containing all these settings in memory saves us from having to query the database constantly or load the data from a cache whenever queries are executed.
Settings in this "large" object are collections of data, from id, path, text, etc...
We want to move away from .net remoting to wcf and potentially get rid of our windows services and run the lot under IIS (and eventually Azure), but being stateless, I'm wondering how should we handle this?
1) What's the best method you can think of? Preferably from experience.
One suggestion that was made to me was to return all of this to the client, cache it and use only the relevant settings when making a wcf call.
2) Numerous services we have are polling services, constantly monitoring, databases, file locations, ftp locations, etc... How would you recommend to handle this in a stateless environment?? I can't see how this will be handled.
We use SQL Server, but I don't want to rely too heavily on its built-in features, as we could potentially have to support the likes of MySQL and Oracle.
Thanks.
Thierry
You could store these settings in the AppSettings section of the config file (Web.config for IIS). Using the ConfigurationManager class, you can retrieve the relevant values as needed.
If you prefer to keep a static instance of your settings object, I suggest implementing it with the Singleton pattern. Jon Skeet's article is a great starting point.
Hope this helps.

What level is ObjectCache at

If I add an object to the ObjectCache, at what level is it stored? Would it be accessible by all users of the application or only by a specific instance?
I've read articles that claim it is at application level but when I enumerate the cache, all I can see are the objects that instance of the application created.
As far as I know it depends on the application pool (since it stays on top of the ASP.NET stack).
This means that if you have multiple instances of the same cache on the same machine, each using a different app pool, you'll have different caches. The same if you have multiple machines.
If you want a single cache across multiple machines, use a distributed cache like Windows Server AppFabric.
