What is a ColdFusion session?

I've used ColdFusion sessions for quite a while, so I know how they are used, but now I need to know how they work so that I can plan for scaling my website.
Is a ColdFusion user 'session' simply a quick way to set up two cookies (CFID and CFTOKEN) and an associated server-side memory structure (the SESSION scope)? Does it do anything else? I'm trying to identify the overhead associated with user sessions versus other methods such as cookies.

Your understanding is basically correct, although sessions are not bound to the cookies. The cookies are a recording of a token, and that token can be passed in the URL string if cookies are not enabled in the browser.
There are two main advantages I see to saving things in the session instead of cookies:
You control the session scope. People can't edit the data in the session scope without you providing them an interface. Cookies can be modified by the client.
Complex data like structures, arrays, objects, and network sessions (FTP, Exchange) can be stored there (see the sketch below).
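For comparison, ColdFusion's SESSION scope behaves much like a Servlet HttpSession: the client carries only a token, and the rich data lives in server memory. A minimal Java sketch of that idea (the "cart" attribute name is just illustrative):

import java.util.ArrayList;
import java.util.List;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

// Complex objects can live server-side in the session, keyed by the
// client's token; a cookie could only carry a small string that the
// client is free to tamper with.
public class CartHolder {

    @SuppressWarnings("unchecked")
    static List<String> cart(HttpServletRequest request) {
        HttpSession session = request.getSession(true); // create if absent
        List<String> cart = (List<String>) session.getAttribute("cart");
        if (cart == null) {
            cart = new ArrayList<>();
            session.setAttribute("cart", cart); // lives in server memory
        }
        return cart;
    }
}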
Their memory overhead is "low" but that's a relative term. Use the ColdFusion Admin Server Monitor to drill into how much memory your sessions are actually using.

First of all, SESSION is a scope: a secure and efficient way to keep current-user attributes like permissions or preferences. I'm not sure what you mean by "other methods", but I doubt that you'll be able to keep complex data structures (query, object, array) in cookies.
Second, the application server provides you with really handy event handlers specifically for sessions: onSessionStart() and onSessionEnd().
Third, sessions can be shared and clustered pretty easily: between CF applications, or between CF and J2EE.
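Since sessions can be shared with J2EE, it is worth noting that the Servlet side exposes the same lifecycle hooks as onSessionStart()/onSessionEnd(); a minimal Java sketch using the standard HttpSessionListener (registration via @WebListener assumed):

import javax.servlet.annotation.WebListener;
import javax.servlet.http.HttpSessionEvent;
import javax.servlet.http.HttpSessionListener;

// The Servlet counterpart of ColdFusion's onSessionStart()/onSessionEnd().
@WebListener
public class SessionLifecycleLogger implements HttpSessionListener {

    @Override
    public void sessionCreated(HttpSessionEvent event) {
        System.out.println("Session started: " + event.getSession().getId());
    }

    @Override
    public void sessionDestroyed(HttpSessionEvent event) {
        System.out.println("Session ended: " + event.getSession().getId());
    }
}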

Sessions are per-user memory space assigned within a particular application space within the JVM memory. The two cookies are pointers to (the token of) that memory space. Yes, there is overhead to using sessions (RAM, swap space, etc.), but unless you're shoving massive amounts of data into the session scope, it shouldn't be that bad.

One aspect of sessions not mentioned is that they have a lifetime: by default, 20 minutes of inactivity. This lifetime can be set per application, but can never exceed the limit set in the ColdFusion Administrator.
If memory usage is a concern, the time limit could be reduced, although much still depends on Java garbage collection.
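The Servlet world has the same knob. A sketch of a per-session idle timeout in Java (the 20 minutes here only mirrors ColdFusion's default; servlet containers commonly default to 30):

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class TimeoutConfig {

    // Expire this session after 20 minutes of inactivity.
    static void applyTimeout(HttpServletRequest request) {
        HttpSession session = request.getSession();
        session.setMaxInactiveInterval(20 * 60); // value is in seconds
    }
}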

Related

Spring Session - asynchronous call handling

Does Spring Session management take care of asynchronous calls?
Say that we have multiple controllers and each one is reading/writing different session attributes. Will there be a concurrency issue as the session object is entirely written/read to/from external servers and not the attributes alone?
We are facing such an issue, where the attributes set from one controller are not present in the next read... it is intermittent, depending on the execution of other controllers in parallel.
When we used the session object from the container we never faced this issue... assuming that it is a direct attribute set/get happening right on the session object in memory.
The general use case for the session is storing some user-specific data. If I am understanding your context correctly, your issue describes a scenario in which a user, while being authenticated from two devices (for example a PC and a phone, hence within the bounds of the same session), is hitting your backend with requests so fast that you face concurrency issues around reading and writing the session data.
This is not a common (nor, IMHO, a reasonable) scenario for the session, so projects such as spring-data-redis or spring-data-gemfire won't support it out of the box.
The good news is that spring-session was built with flexibility in mind, so you could of course achieve what you want. You could implement your own version of SessionRepository and manually synchronize the relevant methods (for example via Redis distributed locks). But before doing that, check your design and make sure you are using the session for the right data-storage job.
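A minimal sketch of that idea against Spring Session's SessionRepository interface: a delegating repository that serializes reads and writes per session id. The in-JVM ReentrantLock used here would have to be swapped for a distributed lock (e.g. a Redis-based one) in a multi-node deployment, and the class name is made up:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import org.springframework.session.Session;
import org.springframework.session.SessionRepository;

// Wraps any SessionRepository so that save/find for one session id
// cannot interleave within this JVM.
public class LockingSessionRepository<S extends Session> implements SessionRepository<S> {

    private final SessionRepository<S> delegate;
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public LockingSessionRepository(SessionRepository<S> delegate) {
        this.delegate = delegate;
    }

    private ReentrantLock lockFor(String id) {
        return locks.computeIfAbsent(id, k -> new ReentrantLock());
    }

    @Override
    public S createSession() {
        return delegate.createSession();
    }

    @Override
    public void save(S session) {
        ReentrantLock lock = lockFor(session.getId());
        lock.lock();
        try {
            delegate.save(session);
        } finally {
            lock.unlock();
        }
    }

    @Override
    public S findById(String id) {
        ReentrantLock lock = lockFor(id);
        lock.lock();
        try {
            return delegate.findById(id);
        } finally {
            lock.unlock();
        }
    }

    @Override
    public void deleteById(String id) {
        delegate.deleteById(id);
        locks.remove(id);
    }
}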
This question is very similar in nature to your last question. And, you should read my answer to that question before reading my response/comments here.
The previous answer (and insight) posted by the anonymous user is fairly accurate.
Anytime you have a highly concurrent (Web) application/environment where many different, simultaneous HTTP requests are coming in, accessing the same HTTP session, there is always a possibility for lost updates caused by race conditions between competing concurrent HTTP requests. This is due to the very nature of a Servlet container (e.g. Apache Tomcat, or Eclipse Jetty) since each HTTP request is processed by, and in, a separate Thread.
Not only does the HTTP session object provided by the Servlet container need to be Thread-safe, but so too do all the application domain objects that your Web application puts into the HTTP session. So, be mindful of this.
In addition, most HTTP session implementations, such as Apache Tomcat's, or even Spring Session's implementations backed by different session-management providers (e.g. Spring Session Data Redis, or Spring Session Data GemFire), make extensive use of "deltas" to send only the changes (or differences) to the session state, thereby minimizing the chance of lost updates due to race conditions.
For instance, suppose the HTTP session currently has an attribute key/value of 1/A. HTTP request 1 (processed by Thread 1) reads the HTTP session (seeing only 1/A) and adds attribute 2/B, while a concurrent HTTP request 2 (processed by Thread 2) reads the same HTTP session by session ID (seeing the same initial state, 1/A) and wants to add 3/C. As Web application developers, we expect the HTTP session state after both requests complete to include attributes [1/A, 2/B, 3/C].
However, if two (or more) competing HTTP requests are both modifying, say, HTTP session attribute 1/A, and HTTP request/Thread 1 wants to set the attribute to 1/B while the competing HTTP request/Thread 2 wants to set the same attribute to 1/C, then who wins?
Well, it turns out the last one wins, or rather, the last Thread to write the HTTP session state wins. The result could be either 1/B or 1/C, which is indeterminate and subject to the vagaries of scheduling, network latency, load, etc. In fact, it is nearly impossible to reason about which one will happen, much less which will always happen.
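A toy Java demonstration of exactly that lost update, with a map standing in for the external session store; each "request" reads the whole session state, mutates a copy, and writes the whole thing back:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Whole-session read-modify-write: the last writer silently discards
// the other thread's change.
public class LostUpdateDemo {

    public static void main(String[] args) throws InterruptedException {
        Map<String, Map<String, String>> store = new ConcurrentHashMap<>();
        store.put("session-1", Map.of("1", "A")); // initial state: 1/A

        Runnable wantsB = () -> {
            Map<String, String> copy = new HashMap<>(store.get("session-1"));
            copy.put("1", "B");           // request/Thread 1 sets 1/B
            store.put("session-1", copy); // writes whole session back
        };
        Runnable wantsC = () -> {
            Map<String, String> copy = new HashMap<>(store.get("session-1"));
            copy.put("1", "C");           // request/Thread 2 sets 1/C
            store.put("session-1", copy);
        };

        Thread t1 = new Thread(wantsB);
        Thread t2 = new Thread(wantsC);
        t1.start(); t2.start();
        t1.join(); t2.join();

        // Prints {1=B} or {1=C} depending on scheduling; never both.
        System.out.println(store.get("session-1"));
    }
}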
While our anonymous user provided some context (say, a user using multiple devices, a Web browser and perhaps a mobile device, concurrently), reproducing this sort of error with a single user, or even multiple users, would not be impossible, but it is very improbable.
But think about this in a production context, where you might have several hundred Web application instances spread across multiple physical machines, VMs, or containers, load balanced by some network load balancer/appliance. Then throw in the fact that many Web applications today are "single page apps": highly sophisticated, no longer thin, but thick clients making JavaScript and AJAX calls. We begin to understand that this scenario is much more likely, especially in a highly loaded Web application; think Amazon or Facebook. There are not only many concurrent users, but many concurrent requests by a single user, given all the dynamic, asynchronous calls that a Web application can make.
Still, as our anonymous user pointed out, this does not excuse the Web application developer from responsibly designing and coding the Web application.
In general, I would say the HTTP session should only be used to track very minimal (i.e. in quantity) and necessary information to maintain a good user experience and preserve the proper interaction between the user and the application as the user transitions through different parts or phases of the Web app, like tracking preferences or items in a shopping cart. The HTTP session should not be used to store "transactional" data; to do so is to get yourself into trouble. The HTTP session should be a primarily read-heavy (rather than write-heavy) data structure, particularly because the HTTP session can be, and most likely will be, accessed from multiple Threads.
Of course, different backing data stores (like Redis, and even GemFire) provide locking mechanisms. GemFire even provides cache-level transactions, which are very heavyweight and arguably not appropriate when processing Web interactions managed in and by an HTTP session object (not to be confused with transactions). Even locking is going to introduce serious contention and latency into the application.
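For a sense of what such a lock looks like, here is a sketch using Redis's SET ... NX PX through the Jedis client. The key naming and TTL are illustrative, and a production-grade release would use a Lua script to make the check-and-delete atomic:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Best-effort distributed lock around one session id.
public class SessionLock {

    private final Jedis jedis;

    public SessionLock(Jedis jedis) {
        this.jedis = jedis;
    }

    // Try to acquire the lock; true on success. The TTL guards against
    // a crashed holder keeping the lock forever.
    public boolean tryLock(String sessionId, String owner, long ttlMillis) {
        String reply = jedis.set("lock:session:" + sessionId, owner,
                SetParams.setParams().nx().px(ttlMillis));
        return "OK".equals(reply);
    }

    // Release only if we still own it (non-atomic check-and-delete;
    // use a Lua script in production).
    public void unlock(String sessionId, String owner) {
        String key = "lock:session:" + sessionId;
        if (owner.equals(jedis.get(key))) {
            jedis.del(key);
        }
    }
}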
Anyway, all of this is to say that you very much need to be conscious of the interactions and data access patterns, otherwise you will find yourself in hot water, so be careful, always!
Food for thought!

Store persistent data in session

This might be super stupid. Shoot me, but I was in a strange mood yesterday and thought about the following:
What if I store webapp data in a persistent way, just by using sessions? So I store a session cookie with a hash, much longer so it's not brute-forceable, then just save all the stored data in the session. I also set the session lifetime to unlimited...
Would there be any use for this? :D
Not really. Most session-state implementations keep sessions in memory. On app restart (or hardware failure, etc.) memory is cleared and the session cache is lost.
You could do so if you have your sessions stored in a database rather than in-proc, but that could be a bit of work depending on what platform you're working with. It's slower as well.
Generally you don't want to keep sessions very large, because if they are in-proc sessions you're going to eat up your server's memory real fast. Even if you go with the database approach, sessions are often kept in in-memory temp tables, which will likewise eat up the RAM of the database server.
Sessions should be lightweight and non-essential to the application's functionality. For anything important that must be persisted, keep it in a database.
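A sketch of that division of labor in servlet terms: the session carries only a small key, and everything that matters is loaded from the database. The User and UserDao types are placeholders, not a real library:

import javax.servlet.http.HttpServletRequest;

// Placeholders standing in for your persistence layer.
interface User { }
interface UserDao { User findById(long id); }

public class ProfileController {

    private final UserDao userDao;

    public ProfileController(UserDao userDao) {
        this.userDao = userDao;
    }

    // The session holds only a lightweight key; losing the session
    // costs a re-login, never the user's actual data.
    public User currentUser(HttpServletRequest request) {
        Long userId = (Long) request.getSession().getAttribute("userId");
        return userId == null ? null : userDao.findById(userId);
    }
}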

Play framework session via client cookie

In my application I want to keep a large amount of data in memory, specific to a user currently accessing my web application, in a user-specific session. As far as I know, the Play framework uses a cookie to store session data, which has a limit of 4k. How can I have much larger session data? Do Ehcache or memcached help here? This session has an expiration time from the last activity of the user.
If session data is cacheable, it's better to keep it in a cache with the userid as key and clear it when the user logs off; get it reloaded from the DB on a relevant DB update/delete. Keeping the content in an external cache like memcached will help you scale well and will enable you to move to a distributed cache in the long run, if required. Check this interesting article on Share Nothing.
The idea with Play is to dispel the need for the session and the keeping of lots of information in memory. The problem with the in-memory approach is that you tie the user to the specific server where their data is held, whereas the Play share-nothing approach means you can scale horizontally easily, without worrying about sticky sessions and the like.
The options you have are
- store transient data in a temporary database that can be accessed via a userId or other unique identifier of the user's session. This database would be the equivalent of your server-side session.
- use a cache. However, the idea of a cache is that if the information is not in the cache, it can be retrieved from the database (or other source) instead; a cache should not have to guarantee that the data will be available. In the case of an in-memory cache (like Ehcache), if you have a load-balanced set of servers you may not be able to guarantee that all requests go back to the same server, so data in the cache may not be available on all servers for a particular session (see the cache-aside sketch below).
The answer to your question depends on your use case, but I think the database is your best approach based on the information you have supplied.
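A sketch of the cache-keyed-by-userId suggestion from the first answer, in cache-aside style. The plain map stands in for memcached/Ehcache, and loadFromDb is a placeholder:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Key by userId, fall back to the DB on a miss, and invalidate on
// logout or on a relevant DB write.
public class UserDataCache {

    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    public Object get(String userId) {
        return cache.computeIfAbsent(userId, this::loadFromDb);
    }

    public void onLogout(String userId) {
        cache.remove(userId); // clear when the user logs off
    }

    public void onDbUpdate(String userId) {
        cache.remove(userId); // next read reloads fresh data
    }

    private Object loadFromDb(String userId) {
        // placeholder: fetch the user's data from the database
        return new Object();
    }
}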

What's the best way to share "session" information between 4 datacenters with 40 servers?

Currently our DNS routes the user to the correct datacenter, and then we have a round-robin situation for the servers. We currently store the session information in the cookie, but it's grown too large, so we want to move it out of the browser and into a database. I'm worried that if we create a mid-tier box that they all hit, the response times will be affected. It's not feasible to store the session info on all machines because we're talking about 200M+ unique sessions a month. Any suggestions, thoughts?
A job for memcached or, if you want to save session data to disk, MemcachedDB.
Memcached is a free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
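For a sense of what session storage on memcached looks like from application code, here is a minimal sketch using the spymemcached Java client (host, port, key format, and TTL are illustrative):

import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class SessionStore {

    public static void main(String[] args) throws Exception {
        MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("cache1.example.com", 11211));

        String sessionId = "abc123";
        // Store the serialized session for 30 minutes.
        client.set("session:" + sessionId, 30 * 60, "serialized-session-data");

        // Any of the 40 servers can now fetch it by session id.
        Object data = client.get("session:" + sessionId);
        System.out.println(data);

        client.shutdown();
    }
}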
Let's understand the role of browser-based cookies:
- Cookies are stored per browser profile.
- The same user logged on from different computers or browsers is considered different users.
- State cookies are mixed with user cookies.
Segregate the cookies into:
- long-term state cookies, e.g. the currently-remembered userId
- session state cookies
- user cookies
Reading that your site is only beginning to consider server-side cookies implies that a segregation of cookies has not yet been done. User cookies should be stored on the server as much as possible, so that when a user logs on at another computer or browser, the preferences and shopping carts are preserved. Your development team has to decide, for some cookies (shopping carts, for example), whether they are session-state or user-info cookies.
User cookies
These need to be accessible across the web site, regardless of where the user logs in. Your developers have to decide, when a user updates a preference or shopping cart, how immediately that change should be visible if the same userId is logged in at another location.
That means you have to implement a distributed database system. You have a master DB server. Let us say you have 20 web servers, each with its own local database.
Store only frequently changed cookies in the local DBs and leave the infrequently changing cookies on the master.
Every time a cookie is updated at a local DB, an update flag is queued for the master. The cookie record in the master is not updated, only marked as stale, with the location number where the fresh data resides. So if that userid somehow gets activated 3,000 miles away simultaneously, that session would find the stale record and trigger a request to copy the fresh data to its own local DB and to the master DB, after which the record is no longer marked as stale on the master.
Then you schedule a regular sync of the most frequently used cookies. The frequency of sync could be nightly, or depend on the results of characterizing cookie modification.
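To make the scheme concrete, here is a toy Java model of the stale-flag protocol, with in-memory maps standing in for the master and local databases (all names and structures are illustrative):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local DBs hold fresh copies of hot cookies; the master only records
// where the fresh copy lives.
public class CookieSync {

    record MasterEntry(String value, boolean stale, int freshLocation) { }

    private final Map<String, MasterEntry> master = new ConcurrentHashMap<>();
    private final Map<Integer, Map<String, String>> localDbs = new ConcurrentHashMap<>();

    // A local write marks the master record stale, pointing at this location.
    public void writeLocal(int location, String cookieKey, String value) {
        localDbs.computeIfAbsent(location, k -> new ConcurrentHashMap<>())
                .put(cookieKey, value);
        master.compute(cookieKey, (k, old) ->
                new MasterEntry(old == null ? null : old.value(), true, location));
    }

    // A read elsewhere sees the stale flag, pulls from the fresh
    // location, and clears the flag on the master.
    public String read(int location, String cookieKey) {
        Map<String, String> local =
                localDbs.computeIfAbsent(location, k -> new ConcurrentHashMap<>());
        MasterEntry entry = master.get(cookieKey);
        if (entry != null && entry.stale() && entry.freshLocation() != location) {
            String fresh = localDbs.get(entry.freshLocation()).get(cookieKey);
            local.put(cookieKey, fresh);
            master.put(cookieKey, new MasterEntry(fresh, false, location));
        }
        return local.get(cookieKey);
    }
}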
First, your programmers would need to write a routine to log all cookie read/writes. You should collect a week's worth of cookie read/write activity to perform your initial component analysis.
You perform a simple statistical characterization per cookie: userid and frequency of change. Then you slide along your preferences, deciding which cookies get pushed to all the local DBs and which stay on the master. The decision balances the size of the cookie block on the local DBs against the frequency of database sync you are willing to allow, which means not every user has the same set of cookies propagated. Of course, your programmers would need to write routines to automate the regular recharacterization. Rather than per user, you might wish to lighten the processing load of cookie propagation by grouping your users using cluster analysis. Maybe the grouping of users for your site is so obvious that you need not perform cluster analysis.
You might be surprised to find that most of the cookies fall into the longer-than-weekly-update bucket, or in the worse case the daily-update bucket. The worst case you should accept is hourly updates for cookie fields which are not pushed onto the local DBs. You want to increase the chances that a cookie access occurs on the local DB rather than being pulled from the master database. So when a user decides to click on "preferences", which is seldom changed, you preemptively pull the preferences records from the master while distracting the user with some frills like "Have you considered previewing our new service?", "Would you like to answer our usability survey?", "New Gibson rant, would you comment?", etc., until the "preferences" cookies are copied over.
The characterization of cookies could be done per userid, or per cluster of users to decide which cookie field to push around to local dbs.
It is simpler to characterize per userid because it barely involves any statistical analysis skills on the part of the programmer. The disadvantage is that the web server would have to make decisions for each of 200 million users. The database cookie table would be
Cookie[id, param, value, expectedMutationInterval].
Your web server would decide, per user, which cookies to push regularly based on the threshold time:
SELECT param, value
FROM Cookie
WHERE expectedMutationInterval < $thresholdTime
  AND id = $userId
You have to perform a regular recharacterization of cookies to update expectedMutationInterval per user per cookie. A simple SQL query would be able to perform the update; a more complex analysis could be performed to produce the value of expectedMutationInterval.
If each cookie field change is logged by time, userid and ipaddr then your Cookie log table would be
CookieLog[id, time, ipaddr, param, value].
which would help your automated recharacterization routine decide which fields to push depending on the day of week/month/season and location/region/ipaddr.
Then, after removing user-info cookies from the browser, if you still find your session cookies overflowing, you decide which session cookies to push to the browser and which stay on the local server. You use the same master-local DB analysis technique, but now to decide between the local DB and pushing to the browser. You leave your least frequently accessed session cookies on the local server, either as session attributes or in an in-memory DB. So when a client finds a cookie is missing, it makes a request to the server for the cookie, sacrificing some least-recently/frequently-used cookie space on the browser to accommodate the fresh cookie.
Since these are session cookies, they need not be propagated to other locations, because if the same userid is logged on 3,000 miles away, it should have its own set of session cookies.
Characterization of browser cookies is an irony because, for AJAX apps, the client accesses the cookies without letting the server know, and letting the server know might defeat the purpose of placing the cookies in the browser in the first place. So you would have to choose idle times to send cookie accesses to the server to log, for characterization purposes.
Such a level of granularity is good for cookies that are short in length (parameter value + parameter name), be they session-based or user-based cookies.
Therefore, if your parameter names and values of cookie fields are long, you might seek to quantize them.
However, quantization is a little more complex. Browser cookies have a lot of commonality. Just like any quantization/compression method, you look for the clusters of commonalities and assign each commonality block a signature. Then the cookies are stored in terms of the quantized signature.
How do you facilitate quantization of browser-based cookies? Using GWT as an example, use the Dictionary or Map class.
E.g., the cookie "%1"="^$Kdm3i" might translate to LastConnectedFriend=MohammadAli#jinnah.
You might not even need to perform characterization here: for example, why store your cookie as "LastConnectedFriend" when you could map it to "%1"? When a user logs in, why not map the most frequently accessed friends, etc., and place that map on the GWT/AJAX launching page? In that way you can shorten your session cookie lengths.
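A sketch of that mapping with GWT's Dictionary class, which reads a JavaScript object defined on the launching page; the "cookieMap" name and its contents are made up:

import com.google.gwt.i18n.client.Dictionary;

// The host page would define something like:
//   var cookieMap = { "%1": "LastConnectedFriend" /* , ... */ };
public class CookieQuantizer {

    private final Dictionary map = Dictionary.getDictionary("cookieMap");

    // Expand a quantized cookie name like "%1" to its full meaning.
    public String expand(String signature) {
        return map.get(signature);
    }
}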
So, is your company looking for a statistical programmer? Disclaimer: this was written off-the-cuff and might need some factual realignment.

Pros and Cons of Sticky Session / Session Affinity load balancing strategy?

One approach to high scalability is to use network load balancing to split processing load between several servers.
One challenge that this approach presents is where servers are state aware - storing user state in a "session".
One solution to this problem is "sticky session" (aka "session affinity") where each user is assigned to a single server and his/her state data is contained on that server exclusively throughout the duration of the session.
What are the Pros and Cons of the "sticky session" approach? Do you use it and if so are you satisfied with it?
Pros:
It's easy-- no app changes required.
Better utilizes local RAM caches (e.g. look up user profile once, cache it, and can re-use it on subsequent visits from same user)
Cons:
If the server goes down, session is lost. (Note that this is a con of storing session info locally on the web server, not of sticky sessions per se). If what's in the session is really important to the user (e.g. a draft email) or to the site (e.g. a shopping cart) then losing one of your servers can be very painful.
Depending on the "sticky" implementation in your load balancer, it may direct unequal load to some servers vs. others
Bringing a new server online doesn't immediately give the new server lots of load. If you have a dynamic load-balancing system to deal with spikes, stickiness may slow your ability to respond quickly to a spike. That said, this is somewhat of a corner case and really only applies to very large and sophisticated sites.
If you have relatively few users but a single user's traffic can swamp one server (e.g. complex pages with SSL, AJAX, dynamically-generated images, dynamic compression, etc.), then stickiness may hurt end-user response time since you're not spreading a single user's load evenly across servers. If you have a lot of concurrent users, this is a non-issue since all your servers will be swamped!
But if you must use server-local session state, sticky sessions are definitely the way to go. Even if you don't use server-local session state, stickiness has benefits when it comes to cache utilization (see above). Your load balancer should be able to look at HTTP cookies (not only IP address) to determine stickiness, since IP addresses can change during a single session (e.g. docking a laptop between a wired and wireless network).
Even better, don't use session state on the web server at all! If session state is very painful to lose (e.g. shopping carts), store it in a central database and clear out old sessions periodically. If session state is not critical (e.g. username/avatar URL), then stick it in a cookie-- just make sure you're not shoving too much data into the cookie.
Modern versions of Rails, by default, store session variables in a cookie for the reasons above. Other web frameworks may have a "store in cookie" and/or "store in DB" option.
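A sketch of the "stick it in a cookie" option in servlet terms, for non-critical state only (the attribute name and lifetime are illustrative; browsers cap a cookie at roughly 4 KB, and anything you must trust should be signed or encrypted, as Rails does):

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public class PreferenceCookie {

    // Store a small, non-critical preference client-side.
    public void remember(HttpServletResponse response, String avatarUrl) {
        Cookie cookie = new Cookie("avatarUrl", avatarUrl);
        cookie.setMaxAge(60 * 60 * 24 * 30); // 30 days, in seconds
        cookie.setHttpOnly(true);            // hide from page JavaScript
        response.addCookie(cookie);
    }
}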
