I need to cache a collection of objects that is mostly static (it might change once per day) and is available in my ASP.NET Web API OData service. This result set is used across calls (i.e. it is not specific to any one client call), so it needs to be cached at the application level.
I did a bunch of searching on 'caching in Web API' but all of the results were about 'output caching'. That is not what I'm looking for here. I want to cache a 'People' collection to be reused on subsequent calls (might have a sliding expiration).
My question is, since this is still just ASP.NET, do I use traditional Application caching techniques to persist this collection in memory, or is there something else I need to do? This collection is not directly returned to the user, but rather used as the source behind the scenes for OData queries via API calls. There is no reason for me to go out to the database on every call to get the exact same information. Expiring it hourly should suffice.
Anyone know how to properly cache the data in this scenario?
The solution I ended up using involved MemoryCache in the System.Runtime.Caching namespace. Here is the code that ended up working for caching my collection:
// If the data exists in the cache, pull it from there; otherwise call the database to get the data
ObjectCache cache = MemoryCache.Default;
var peopleData = cache.Get("PeopleData") as List<People>;
if (peopleData != null)
    return peopleData;

peopleData = GetAllPeople();
CacheItemPolicy policy = new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30) };
cache.Add("PeopleData", peopleData, policy);
return peopleData;
Here is another way I found using Lazy<T> to take into account locking and concurrency. Total credit goes to this post: How to deal with costly building operations using MemoryCache?
private IEnumerable<TEntity> GetFromCache<TEntity>(string key, Func<IEnumerable<TEntity>> valueFactory) where TEntity : class
{
    ObjectCache cache = MemoryCache.Default;
    var newValue = new Lazy<IEnumerable<TEntity>>(valueFactory);
    CacheItemPolicy policy = new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30) };
    // AddOrGetExisting returns the existing item for the key, or null if the key
    // was not present (in which case newValue has just been added)
    var value = cache.AddOrGetExisting(key, newValue, policy) as Lazy<IEnumerable<TEntity>>;
    return (value ?? newValue).Value; // Lazy<T> handles the locking itself
}
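The guarantee this pattern gives you is that the expensive value factory runs at most once per key, even under concurrency. A close analog in Java, for comparison, is `ConcurrentHashMap.computeIfAbsent`, which invokes the mapping function atomically. The sketch below is an illustration only; the `OncePerKeyCache` class name and the string-valued cache are assumptions of this example, not part of the answer above:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class OncePerKeyCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    // computeIfAbsent runs the factory atomically: concurrent callers for the
    // same key block until the first caller's factory finishes, so the
    // expensive work happens at most once per key (like Lazy<T> in the C# version).
    public String get(String key, Function<String, String> factory) {
        return cache.computeIfAbsent(key, factory);
    }

    public static void main(String[] args) {
        OncePerKeyCache c = new OncePerKeyCache();
        AtomicInteger calls = new AtomicInteger();
        Function<String, String> factory = k -> {
            calls.incrementAndGet(); // count how often the expensive work runs
            return "value-" + k;
        };
        c.get("people", factory);
        c.get("people", factory);
        System.out.println(calls.get()); // factory ran once despite two gets
    }
}
```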
Yes, output caching is not what you are looking for. You can cache the data in memory with MemoryCache, for example: http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.aspx . However, you will lose that data if the application pool gets recycled. Another option is to use a distributed cache, such as AppFabric Cache or memcached, to name a few.
I am using Spring Cache to cache some objects through @Cacheable. However, one of the requirements is that I be able to know whether a returned object came from a cache hit or from the standard call. Is there any flag or indicator that gets set which I can use to check this?
I have seen past questions about logging cache hits, but that is not really useful for my situation. I am currently using Spring Cache with the simple provider, and I am open to using any external cache manager that can do this.
Yes, we can know whether it was a cache hit or a cache miss (a direct REST or database call) using a flag.
With @Cacheable, the cache is always checked first, before the method executes; if the value is found in the cache, the method execution is skipped. @CachePut works slightly differently: it always executes the advised method and then updates the cache, so it always misses the cache.
For example:
private volatile boolean cacheMiss = false;

public boolean isCacheMiss() {
    boolean cacheMiss = this.cacheMiss;
    this.cacheMiss = false; // resetting for the next read
    return cacheMiss;
}

protected void setCacheMiss() {
    this.cacheMiss = true;
}

@Cacheable("Quotes")
public Quote requestQuote(Long id) {
    setCacheMiss();
    // REST call here
    return requestQuote(ID_BASED_QUOTE_SERVICE_URL,
            Collections.singletonMap("id", id));
}
The cacheMiss variable gives the status: whether the result came from the cache or not.
This is discussed in Spring Caching with GemFire; the underlying caching provider there is Pivotal GemFire, but you can use any such caching provider.
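To show how a caller would use the flag, here is a minimal, self-contained sketch. Note the `QuoteService` class and its `HashMap`-based cache are stand-ins invented for this example; they simulate what Spring does around a real @Cacheable bean (check the cache, skip the method on a hit) so the flag semantics can be demonstrated without a Spring context:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Spring-managed service: the map plays the
// role of the "Quotes" cache region.
class QuoteService {
    private final Map<Long, String> cache = new HashMap<>();
    private volatile boolean cacheMiss = false;

    public boolean isCacheMiss() {
        boolean miss = this.cacheMiss;
        this.cacheMiss = false; // reset for the next read
        return miss;
    }

    public String requestQuote(Long id) {
        String cached = cache.get(id);   // what @Cacheable does first
        if (cached != null) {
            return cached;               // hit: advised method body is skipped
        }
        this.cacheMiss = true;           // method body ran, so it was a miss
        String quote = "quote-" + id;    // stands in for the REST call
        cache.put(id, quote);
        return quote;
    }
}

public class Main {
    public static void main(String[] args) {
        QuoteService service = new QuoteService();
        service.requestQuote(1L);
        System.out.println(service.isCacheMiss()); // first call: true (miss)
        service.requestQuote(1L);
        System.out.println(service.isCacheMiss()); // second call: false (hit)
    }
}
```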
I get a file with 4000 entries and debatch it, so I don't lose the whole message if one entry has corrupt data.
The BizTalk map accesses a SQL Server. Before I debatched the message I simply cached the SQL data in the map, but now I have 4000 independent maps.
Without caching, the process takes about 30 times longer.
Is there a way to cache the data from the SQL Server somewhere outside the map without losing much performance?
It is not a recommended pattern to access a database in a Map.
Since what you describe sounds like you're retrieving static reference data, another option is to move the process to an Orchestration where the reference data is retrieved one time into a Message.
Then, you can use a dual input Map supplying the reference data and the business message.
With this pattern, you can either debatch in the Orchestration or use a Sequential Convoy.
I would always avoid accessing SQL Server in a map - it gets very easy to inadvertently make many more calls than you intend (whether because of a mistake in the map design or because of unexpected volume or usage of the map on a particular port or set of ports). In fact, I would generally avoid making any kind of call in a map that has to access another system or service, but if you must, then caching can help.
You can cache using, for example, MemoryCache. The pattern I use generally involves a custom C# library where you first check the cache for your value, and on a miss you query SQL (either for the particular entry or to repopulate the entire cache), e.g.:
object _syncRoot = new object();

...

public string CheckCache(string key)
{
    string check = MemoryCache.Default.Get(key) as string;
    if (check != null)
        return check;

    lock (_syncRoot)
    {
        // make sure someone else didn't get here before we acquired the lock; avoids duplicate work
        check = MemoryCache.Default.Get(key) as string;
        if (check != null) return check;

        string sql = @"SELECT ...";
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            using (SqlCommand cmd = conn.CreateCommand())
            {
                cmd.CommandText = sql;
                cmd.Parameters.AddWithValue(...);
                // ExecuteScalar or ExecuteReader as appropriate, read the values out,
                // then use MemoryCache.Default.Add with a sensible expiration to cache your data
            }
        }
        return check; // return the value read from SQL (assigned in the elided code above)
    }
}
A few things to keep in mind:
This will work on a per-AppDomain basis, and pipelines and orchestrations run in separate AppDomains. If you are executing this map in both places, you'll end up with caches in both places. The complexity added by trying to share this across AppDomains is probably not worth it, but if you really need that, you should isolate your caching into something like a WCF NetTcp service.
This will use more memory - you shouldn't just throw everything and anything into a cache in BizTalk, and if you're going to cache stuff make sure you have lots of available memory on the machine and that BizTalk is configured to be able to use it.
The MemoryCache can store whatever you want - I'm using strings here, but it could be other primitive types or objects as well.
The documentation says:
/// The cache manager must have at least one cache handle configured with <see cref="CacheHandleConfiguration.IsBackplaneSource"/> set to <c>true</c>.
/// Usually this is the redis cache handle, if configured. It should be the distributed and bottom most cache handle.
I know how to do it with RedisCacheHandle, since it's given as an example on CacheManager's website:
var cache = CacheFactory.Build<int>("myCache", settings =>
{
    settings
        .WithSystemRuntimeCacheHandle("inProcessCache")
        .And
        .WithRedisConfiguration("redis", config =>
        {
            config.WithAllowAdmin()
                .WithDatabase(0)
                .WithEndpoint("localhost", 6379);
        })
        .WithMaxRetries(1000)
        .WithRetryTimeout(100)
        .WithRedisBackplane("redis")
        .WithRedisCacheHandle("redis", true);
});
The problem is that I don't want to use Redis as the cache store; I just want to build a distributed cache on top of Redis's Pub/Sub mechanism. According to my debugging through the code, using the Redis backplane feature I am indeed able to send messages to and receive messages from Redis. So why not drop RedisCacheHandle and use SystemRuntimeCacheHandle instead?
So my expectation was successful execution with the following cache configuration:
var cache = CacheFactory.Build<int>("myCache", settings =>
{
    settings
        .WithSystemRuntimeCacheHandle("inProcessCache")
        .And
        .WithRedisConfiguration("redis", config =>
        {
            config.WithAllowAdmin()
                .WithDatabase(0)
                .WithEndpoint("localhost", 6379);
        })
        .WithMaxRetries(1000)
        .WithRetryTimeout(100)
        .WithRedisBackplane("redis")
        .WithSystemRuntimeCacheHandle("inProcessCache", true);
});
But it's not working. Can you please show me a solution? What am I doing wrong? Or, even though the documentation says
...Usually this is the redis cache handle...
is there any way to use the cache synchronization feature without RedisCacheHandle?
https://github.com/MichaCo/CacheManager/issues/111
I guess by "not working" you mean that the other caches do not get synced, e.g. if I delete a key from cacheA, it doesn't get deleted from cacheB?
Yes this is currently an expected behavior.
The backplane is intended to work with out of process caches where you have only one state.
With 2 instances of the cache both using system runtime caching, you have two totally disconnected in proc caches.
Normally, if you have a Redis layer and you remove a key from cache instance A, the item is removed from the Redis layer. The message gets sent to the other instances of the same cache and removes the key from every cache layer except Redis (the one marked as the backplane source).
This means, we expect that the backplane source is already in sync.
Now, what if you have an in-process cache as the backplane source? That doesn't work, because the two instances will always be out of sync.
Let's look at this example:
var cacheConfig = ConfigurationBuilder.BuildConfiguration(settings =>
{
    settings
        .WithSystemRuntimeCacheHandle("inProcessCache")
        .And
        .WithRedisConfiguration("redis", config =>
        {
            config.WithAllowAdmin()
                .WithDatabase(0)
                .WithEndpoint("localhost", 6379);
        })
        .WithMaxRetries(1000)
        .WithRetryTimeout(100)
        .WithRedisBackplane("redis")
        .WithSystemRuntimeCacheHandle("inProcessCache", true);
});
var cacheA = new BaseCacheManager<string>(cacheConfig);
var cacheB = new BaseCacheManager<string>(cacheConfig);

cacheB.Backplane.Removed += (obj, args) =>
{
    Console.WriteLine(args.Key + " removed from B.");
};

cacheA.Add("key", "value");
var result = cacheB.Get("key");
Console.WriteLine("Result should be null: " + result);

cacheB.Add("key", "value");
result = cacheB.Get("key");
Console.WriteLine("Result should not be null: " + result);

// triggers the backplane remove event
cacheA.Remove("key");

// give Redis some time to send the messages
Thread.Sleep(100);

result = cacheB.Get("key");
Console.WriteLine("Result should be null again but isn't: " + result);

Console.ReadKey();
If you run this, you can see that the backplane event actually fires but because the only in-proc cache is the backplane source, the key does not get deleted.
That's why at the end, you still get the key returned back to you.
As I said, that is currently the expected behavior.
You could implement custom logic with listening to those events though.
(the events will change slightly in the next version, currently there are a few bugs and inconsistencies).
Also, don't expect the backplane to transfer cache values over to other instances. That's never going to happen: CacheManager only sends key events, not the data, because the data is usually held by the out-of-process cache.
Meaning, if you have only an in-proc cache with a backplane, adding an item to cacheA will NOT copy the item to cacheB! You might get a change event for the key on cacheB, though.
I hope that makes sense ;)
I have a form that searches via AJAX against two different data sources. The data is relatively small but the speed at which it returns is slow.
I built a cache layer to store the full result after the first query... however, I would like to prime the cache with data before the user executes the search.
Should I be looking at an AsyncController to do this? Any recommendations?
My desired behavior is (updated):
User requests any ActionABC of some controller (not necessarily the search action)
Server-side, that action checks the cache and asynchronously requests data if empty
ActionABC returns requested view while cache continues to populate on server
If the user subsequently performs a search while the cache is being populated, their request waits until the population is complete; otherwise the cached data is immediately available
You would get a benefit from an async controller only if you could perform the 2 searches in parallel.
In this case your logic could be:
If the data is found in the cache return the result immediately.
If the data is not found in the cache launch 2 parallel async tasks to perform the search.
Synchronize those tasks so that once they both finish you populate the cache and return the final result.
Also, if you are going the AsyncController route, make sure you use the asynchronous ADO.NET API to query your database (SqlCommand.BeginExecuteReader/EndExecuteReader) so that you can take full advantage of I/O completion ports and do not block worker threads during the execution of the expensive search operations.
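The check-cache, search-in-parallel, synchronize-and-populate flow described above can be sketched as follows. Java is used here purely as a self-contained illustration of the language-neutral idea; the two source methods, the way their results are combined, and the map-based cache are all assumptions of this sketch, not part of the answer:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class ParallelSearch {
    // stand-in for the cache layer
    private static final Map<String, String> cache = new ConcurrentHashMap<>();

    // stand-ins for the two slow data sources
    static String searchSourceA(String term) { return "A:" + term; }
    static String searchSourceB(String term) { return "B:" + term; }

    public static String search(String term) {
        // 1. return immediately on a cache hit
        String cached = cache.get(term);
        if (cached != null) return cached;

        // 2. launch the two searches in parallel
        CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> searchSourceA(term));
        CompletableFuture<String> b = CompletableFuture.supplyAsync(() -> searchSourceB(term));

        // 3. synchronize: once both finish, combine, populate the cache, return
        String combined = a.thenCombine(b, (ra, rb) -> ra + "|" + rb).join();
        cache.put(term, combined);
        return combined;
    }

    public static void main(String[] args) {
        System.out.println(search("widget")); // first call: both sources queried
        System.out.println(search("widget")); // second call: served from the cache
    }
}
```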
I ended up not having to use AsyncControllers.
I used the Task Factory to "fire and forget" a call to load the data initially upon any call to the controller.
Task.Factory.StartNew(() => { var x = GetData(); });
Inside the "GetData" call I used a lock to force subsequent calls to wait until the cache was populated (this addresses #4):
private static object ThisLock = new object();

protected MyData GetData()
{
    // fast path: cache already populated
    var data = MemoryCache.Default.Get("MyData") as MyData;
    if (data != null)
        return data;

    lock (ThisLock)
    {
        // just entered the lock; see if the cache was set by a previous blocking thread
        data = MemoryCache.Default.Get("MyData") as MyData;
        if (data != null)
            return data;

        // ... load MyData from the database ...
        // ... save MyData to the cache ...
        return data; // the instance just loaded and cached
    }
}
I had my first go at AppFabric caching (a.k.a. MS Velocity) today and checked out the MSDN virtual labs.
https://cmg.vlabcenter.com/default.aspx?moduleid=4d352091-dd7d-4f6c-815c-db2eafe608c7
There is this code sample in it that I don't get. It creates a cache object and stores it in Session state. The documentation just says:
We need to store the cache object in Session state and retrieve the same instance of that object each time we need to use it.
That's not the way I'm used to using the cache in ASP.NET. What is the reason for this pattern, and do I have to use it?
private DataCache GetCache()
{
    DataCache dCache;
    if (Session["dCache"] != null)
    {
        dCache = (DataCache)Session["dCache"];
        if (dCache == null)
            throw new InvalidOperationException("Unable to get or create distributed cache");
    }
    else
    {
        var factory = new DataCacheFactory();
        dCache = factory.GetCache("default");
        Session["dCache"] = dCache;
    }
    return dCache;
}
This is because DataCacheFactory is an expensive object to create - you don't want to be creating an instance of it every time you want to access the cache.
What they're showing you in the lab is how to create an instance of DataCacheFactory once to get hold of a DataCache instance, and then storing that DataCache instance in Session state so you can go back to that one each time you access the cache.
Of course, this still means you're creating an instance of DataCacheFactory per user; I think storing it in Application state would be an even better design.
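The application-wide version of that idea, constructing the expensive factory a single time and reusing its cache handle everywhere, can be sketched with a lazy holder. Java here for illustration only; `ExpensiveFactory` is a hypothetical stand-in for DataCacheFactory, not the real API:

```java
public class CacheHandleHolder {
    // hypothetical stand-in for the expensive-to-construct DataCacheFactory
    static class ExpensiveFactory {
        ExpensiveFactory() { /* costly setup would happen here */ }
        String getCache(String name) { return "cache:" + name; }
    }

    // lazy holder idiom: the JVM initializes Holder (and thus the factory)
    // only on first use, and class initialization guarantees exactly one instance
    private static class Holder {
        static final ExpensiveFactory FACTORY = new ExpensiveFactory();
    }

    // every caller shares the single factory instead of building one per user
    public static String getCache() {
        return Holder.FACTORY.getCache("default");
    }
}
```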