Multiple webRole instances at Azure and session state - session

I have webRole with some data stored in Session. The data is some tens of small variables (strings), and one-two big objects (some megabytes). I need to run this webRole in multiple instances. Since two requests from the single user can go to different instances, Session became useless. So, i am looking for most efficient and simplest method of storing volatile user data for this case. I know that i can store it in cookies at client side, but this will fail for big objects. I also know that i can user data in Azure storage - but this seems to be more complicated than Session. Can anybody suggest both efficient and simple method, like Session state? Or probably some workaround to get Session state working correctly when multiple instances enabled.

This may help
http://social.msdn.microsoft.com/Forums/en-US/windowsazure/thread/7ddc0ca8-0cc5-4549-b44e-5b8c39570896

You need to use another session state storage than memory. In Azure you can use Cache, Storage tables or SQL server to share session data between instances.

Related

Clarification on database caching

Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this:
and the Cache Aside strategy looks like this:
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of webs servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I guess you have one misunderstood point, that the cache is NOT expclicitely stored on the same server as the werbserver. Sometimes, not even the database is sperated on it's own server from the webserver. If you think of APIs, like HTTP REST APIs, you can use caching to not spend too many resources on database connections & queries. Generally, you want to use as few database connections & queries as possible. Now imagine the following setting:
You have a werbserver who serves your application and a REST API, which is used by the webserver to work with some resources. Those resources come from a database (lets say a relational database) which is also stored on the same server. Now there is one endpoint which serves e.g. a list of posts (like blog-posts). Every user can fetch all posts (to make it simple in this example). Now we have a case where one can say that this API request could be cached, to not let all users always trigger the database, just to query the same resources (via the REST API) over and over again. Here comes caching. Redis is one of many tools which can be used for caching. Since redis is a simple in-memory key-value storage, you can just put all of your posts (remember the REST API) after the first DB-query, into the cache. All future requests for the posts-list would first check whether the posts are alreay cached or not. If they are, the API will return the cache-content for this specific request.
This is one simple example to show off, what caching can be used for.
Answers on your question
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating, but instead it helps you with spending less resources. It also helps you in terms of "temporary backing up" some data - but that only as a very little side effect. For this, out there are more attractive solutions (Since redis is also not persistent by default. But it supports persistence.)
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it is not important which database you use. Whether it's a NoSQL or SQL DB. It strongly depends on what you want to cache and how the database and it's tables are set up. Do you have frequent changes in your recources? Do resources get updated manually or only on user-initiated actions? Those are questions, leading you to the right caching implementation.
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible but it also, highly depends on what you are doing in your DB and what kind of application you have. I guess if you find out what kind of application you are building up, then you will know, what strategy you have to use - i guess it is also not recommended to mix those strategies up, because those strategies are coupled to your application type - in other words: It will not work out pretty well.
What happens if you have a cluster of webs servers? Do they each have their own local in-memory database that acts as a cache?
I guess that both is possible. Usually you have one database, maybe clustered or synchronized with copies, to which your webservers (e.g. REST APIs) make their requests. Then whether each of you API servers would have it's own cache, to not query the database at all (in cloud-based applications your database is also maybe on another separated server - so another "hop" in terms of networking). OR (what i also can imagine) you have another middleware between your APIs (clusterd up) and your DB (maybe also clustered up) - but i guess that no one would do that because of the network traffic. It would result in a higher response-time, what you usually want to prevent.
Could you implement a cache using a normal (not in-memory) database?
Yes you could, but it would be way slower. A machine can access in-memory data faster then building up another (local) connection to a database and query your cached entries. Also, because your database has to write the entries into files on your machine, to persist the data.
Conclusion
All in all, it is all about being fast in terms of response times and to prevent much network traffic. I hope that i could help you out a little bit.

How to use redis for number of micro-services?

I am very much new to redis. I have been investigating on redis for past few days.I read the documentation on cache management(lru cache), commands ,etc. I want to know how to implement caching for multiple microservice(s) data .
I have few questions:
Can all microservices data(cached) be kept under a single instance of redis
server?
Should every microservice have its own cache database in redis?
How to refresh cache data without setting EXPIRE? Since it would consume more memory.
Some more information on best practices on redis with microservices will be helpful.
It's possible to use the same Redis for multiple microservices, just make sure to prefix your redis cache keys to avoid conflict between all microservices.
You can use multi db in the same redis instance (i.e one for each microservice) but it's discouraged because Redis is single threaded.
The best way is to use one Redis for each microservices, then you can easily flush one of them without touching others.
From my personal experience with a redis cache in production (with 2 million keys), there is no problem using EXPIRE. I encourage you to use it.
Please find below the answer to all your questions -
Can all microservices data(cached) be kept under a single instance of redis server? Ans - Yes you can keep all the data under single redis instance, all you need to do is to set that data using different key Name. As redis is basically a Key-Value Database.
Should every microservice have its own cache database in redis? Ans - Not required. Just make different key for each microservice. Also please note that you can use colon (:) to make folders in redis, to identify different microservices easily on Redis Desktop Manager.
Example - Key Name X:Y:Z, here Z is placed in Y folder and Y is in X. SO you will get a folder kind of structure. That would be helpful to differentiate different microservices.
How to refresh cache data without setting EXPIRE? Since it would consume more memory. Ans - You can set data again on the same key if you have any change in Microservice response. That Key value will get over written in that case.
Can all microservices data(cached) be kept under a single instance of redis server?
In microservice architecture it's prefirible "elastic scale SaaS". You can think your Cache service is perse a microservice (that will response on demand) Then you have multiple options here. The recommended practice on data storage is sharding https://azure.microsoft.com/en-us/documentation/articles/best-practices-caching/#partitioning-a-redis-cache .See the diagram below for book Microservices, IoT and Azure
Should every microservice have its own cache database in redis? It's possible to still thinking "vertical partition" but you should consider "horizontal partitions" so again consider sharding; additionally It's not a bad idea to have "local cache" specialy to avoid DoS
"Be careful not to introduce critical dependencies on the availability of a shared cache service into your solutions. An application should be able to continue functioning if the service that provides the shared cache is unavailable. The application should not hang or fail while waiting for the cache service to resume."
How to refresh cache data without setting EXPIRE? Since it would consume more memory.
You can define your synch polices; I think cache is suitable for things that have few changes.
"It might also be appropriate to have a background process that periodically updates reference data in the cache to ensure it is up to date, or that refreshes the cache when reference data changes."
For cahe best practices check
Caching Best Practices

Can ColdFusion sessions be stored in dynamoDB?

Does anyone know if there's any way to get ColdFusion 10+ to store sessions in dynamoDB using the SDK?
http://docs.aws.amazon.com/AWSSdkDocsJava/latest//DeveloperGuide/java-dg-tomcat-session-manager.html
http://java.awsblog.com/post/Tx12CFK2FZ7PXRN/Amazon-DynamoDB-Session-Manager-for-Apache-Tomcat
AFAIK no, because Session scope can store any variables, whereas dynamoDB is a DB so variables have to be serialized first at least. However, if your new or existing app does not reference session directly and use an abstration layer like Coldbox's SessionStorage then you may still be able to do it, but I will then worry about the latency.

Store 600 KB of session data per registered user in Heroku

For a given application, we would need to store about 600 Kb of data in the web session per registered user who connects on our website. We would have about 1,000 registered users in parallel hence we need to store 600 Mb of session data.
The reason we need so much data in the session is to avoid querying frequently a table with about 1 billion rows in the database.
I understood Heroku stores session information in the database. This is fine as it means the session data is available cross-dynos (no session affinity).
Is there another way of storing more efficiently information across dynos ? Reading the docs, I found memcachier.
My questions would be the following :
Do you think storing that amount of session in the database would be performant enough
Do you suggest other caching systems than memcachier to store session information available across different dynos ?
Thanks a lot for your help !
Olivier
Heroku does not store session information at all -- how session information is stored depends entirely on your application and your application's framework and that will work in the same way regardless of whether you are deployed on Heroku or any other system.
As far as what kind of storage is sensible, however: it sounds like cookie storage is right out, due to the volume of data. Database storage was the de facto default for web applications for a long time and there's nothing wrong with it. Memcached would be faster, and how much faster exactly depends on your configuration (are you using connection pooling? does each page view hit the database for something else anyways? what is your caching system like?). But as long as you're sure this strategy of storing so much info in session data is sound, then the difference between database and memcached storage will not be great.

what way to store data by key and value?

I store data in
HttpContext.Current.Application.Add(appKey, value);
And read data by this one:
HttpContext.Current.Application[appKey];
This has the advantage for me that is using a key for a value but after a short time (about 20 minutes) it does not work, and I can not find [appKey],because the application life cycle in iis data will lose.
i want to know is that another way to store my data by key and value?
i do not want sql server,file,... and want storing data on server not on client
i store users some data in it.
thanks for your helping
Since IIS may recycle and throw away any cache/memory contents at any time, the only way you will get data persisted is to store it outside IIS. Some examples are; (and yes, I included the ones you stated you didn't want just to have the list a bit more complete, feel free to skip them)
A SQL database (there are quite a few free ones if the price is prohibitive)
A NoSQL database (same thing there, quite a few free ones and usually simpler to use for key/value)
File (which you also stated you didn't want)
Some kind of external memory cache, a'la AppFabric cache or memcached.
Cookies (somewhat limited in size and not secure in any way by default)
you could create a persistent cookie on the user's machine so that the session doesn't expire, or increase the session timeout to a value that would work better for your situation/users
How to create persistent cookies in asp.net?
Session timeout in ASP.NET
You're talking about persisting data beyond the scope of a session. So you're going to have to use some form of persistent storage (Database, File, Caching Server).
Have you considered using AppFabric. It's actually pretty easy to implement. You could either access it directly from your code using the nuget packages, or you could just configured it as a session store. (I think) doing the latter would mean you'd get rid of the session timeout issue.
Do you understand that whatever you decide to store in Application, will be available for all users in your application?
Now regarding your actual question, what kind of data do you plan on storing? If its user sensitive data, then it probably makes sense to store it in the session. If it's client specific and it doesn't contain any sensitive information, than cookies is probably a reasonable way forward.
If it is indeed an application wide data and it must be the same for every user of your application, then you can make configuration changes to make sure that it doesn't expiry after 20 minutes.

Resources