I need more session management in my project than I thought at the beginning. The major feature I need is to list all sessions for (or per) an identified principal, for example to delete/invalidate all of that user's session IDs. I don't want to use SessionRegistry because the system is distributed.
So two questions:
How do I list session IDs in Spring Session (do I need to come up with a custom implementation)?
Is there a way to set a session timeout that is not the interval between requests but a maximum session lifetime?
The typical use case for such functionality is preventing a malicious user from continuing his activity by blocking his account and invalidating all of his sessions across the servers.
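For the listing question, Spring Session provides FindByIndexNameSessionRepository, which indexes sessions by principal name. A minimal sketch, assuming Spring Session 2.x and a backing store that maintains this index (the Redis implementation does); the class name is illustrative:

```java
import java.util.Map;

import org.springframework.session.FindByIndexNameSessionRepository;
import org.springframework.session.Session;

public class SessionInvalidator {

    private final FindByIndexNameSessionRepository<? extends Session> sessions;

    public SessionInvalidator(FindByIndexNameSessionRepository<? extends Session> sessions) {
        this.sessions = sessions;
    }

    /** Find and delete every session belonging to the given principal. */
    public void invalidateAll(String principalName) {
        Map<String, ? extends Session> userSessions =
                sessions.findByIndexNameAndIndexValue(
                        FindByIndexNameSessionRepository.PRINCIPAL_NAME_INDEX_NAME,
                        principalName);
        userSessions.keySet().forEach(sessions::deleteById);
    }
}
```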
Related
Does Spring Session management take care of asynchronous calls?
Say that we have multiple controllers, each reading/writing different session attributes. Will there be concurrency issues, given that the session object is written/read to/from the external store in its entirety rather than attribute by attribute?
We are facing exactly such an issue: attributes set from one controller are not present on the next read. It is intermittent and depends on which other controllers execute in parallel.
When we used the session object from the container, we never faced this issue, presumably because attribute sets/gets happen directly on the in-memory session object.
The general use case for the session is storing some user-specific data. If I am understanding your context correctly, your issue describes a scenario in which a user who is, for example, authenticated from two devices (say a PC and a phone, hence within the bounds of the same session) hits your backend with requests so fast that you face concurrency issues around reading and writing the session data.
This is not a common (nor, IMHO, a reasonable) scenario for the session, so projects such as spring-data-redis or spring-data-gemfire won't support it out of the box.
The good news is that spring-session was built with flexibility in mind, so you can of course achieve what you want. You could implement your own version of SessionRepository and manually synchronize the relevant methods, for example via Redis distributed locks (see the sketch below). But before doing that, check your design and make sure you are using the session for the right data-storage job.
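A minimal sketch of that wrapper approach, assuming Spring Session 2.x and Redisson for the distributed lock (the class and lock-key names are illustrative, not an official API):

```java
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.springframework.session.Session;
import org.springframework.session.SessionRepository;

/** Delegating repository that serializes saves per session id via a Redis lock. */
public class LockingSessionRepository<S extends Session> implements SessionRepository<S> {

    private final SessionRepository<S> delegate;
    private final RedissonClient redisson;

    public LockingSessionRepository(SessionRepository<S> delegate, RedissonClient redisson) {
        this.delegate = delegate;
        this.redisson = redisson;
    }

    @Override
    public S createSession() {
        return delegate.createSession();
    }

    @Override
    public void save(S session) {
        // Cluster-wide lock on the session id, so concurrent requests
        // cannot interleave whole-session writes.
        RLock lock = redisson.getLock("session-lock:" + session.getId());
        lock.lock();
        try {
            delegate.save(session);
        } finally {
            lock.unlock();
        }
    }

    @Override
    public S findById(String id) {
        return delegate.findById(id);
    }

    @Override
    public void deleteById(String id) {
        delegate.deleteById(id);
    }
}
```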
This question is very similar in nature to your last question, and you should read my answer to that question before reading my response/comments here.
The previous answer (and insight) posted by the anonymous user is fairly accurate.
Anytime you have a highly concurrent (Web) application/environment where many different, simultaneous HTTP requests are coming in, accessing the same HTTP session, there is always a possibility for lost updates caused by race conditions between competing concurrent HTTP requests. This is due to the very nature of a Servlet container (e.g. Apache Tomcat, or Eclipse Jetty) since each HTTP request is processed by, and in, a separate Thread.
Not only does the HTTP session object provided by the Servlet container need to be Thread-safe, but so too do all the application domain objects that your Web application puts into the HTTP session. So, be mindful of this.
In addition, most HTTP session implementations, such as Apache Tomcat's, or even Spring Session's implementations backed by different session-management providers (e.g. Spring Session Data Redis, or Spring Session Data GemFire), make extensive use of "deltas" to send only the changes (or differences) to the session state, thereby minimizing the chance of lost updates due to race conditions.
For instance, suppose the HTTP session currently has an attribute key/value of 1/A. HTTP request 1 (processed by Thread 1) reads the session (seeing only 1/A) and adds attribute 2/B, while a concurrent HTTP request 2 (processed by Thread 2) reads the same session by session ID (seeing the same initial state, 1/A) and wants to add 3/C. As Web application developers, we expect the final session state, after requests 1 and 2 complete in Threads 1 and 2, to include the attributes [1/A, 2/B, 3/C].
However, if 2 (or more) competing HTTP requests are both modifying, say, HTTP session attribute 1/A, and HTTP request/Thread 1 wants to set the attribute to 1/B while the competing HTTP request/Thread 2 wants to set the same attribute to 1/C, then who wins?
Well, it turns out the last one wins, or rather, the last Thread to write the HTTP session state wins. The result could be either 1/B or 1/C; it is indeterminate and subject to the vagaries of scheduling, network latency, load, etc. In fact, it is nearly impossible to reason about which one will happen, much less which will always happen.
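A toy Java sketch (plain Java to illustrate the race, not Spring Session code) of the whole-session read-modify-write problem from the example above; depending on how the two threads interleave, the final state may drop 2=B or 3=C:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Demonstrates the whole-session last-write-wins race described above. */
public class LostUpdateDemo {

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put("session", "1=A");                      // initial state: [1/A]

        Thread t1 = new Thread(() -> {
            String snapshot = store.get("session");       // reads "1=A"
            store.put("session", snapshot + ",2=B");      // writes back [1/A, 2/B]
        });
        Thread t2 = new Thread(() -> {
            String snapshot = store.get("session");       // may also read "1=A"
            store.put("session", snapshot + ",3=C");      // writes back [1/A, 3/C]
        });
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Expected [1/A, 2/B, 3/C]; with whole-session writes, one update can be lost.
        System.out.println(store.get("session"));
    }
}
```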
While our anonymous user provided some context (say, a user using multiple devices, a Web browser and perhaps a mobile device such as a smart phone or tablet, concurrently), reproducing this sort of error with a single user, or even multiple users, would not be impossible, but it is very improbable.
But if we think about this in a production context, where you might have several hundred Web application instances spread across multiple physical machines, VMs, or containers, load balanced by some network load balancer/appliance, and then throw in the fact that many Web applications today are "single page apps" (highly sophisticated, no-longer-thin clients making JavaScript and AJAX calls), then we begin to understand that this scenario is much more likely, especially in a highly loaded Web application; think Amazon or Facebook. There are not only many concurrent users, but many concurrent requests by a single user, given all the dynamic, asynchronous calls a Web application can make.
Still, as our anonymous user pointed out, this does not excuse the Web application developer from responsibly designing and coding our Web application.
In general, I would say the HTTP session should only be used to track a very minimal amount of necessary information to maintain a good user experience and preserve the proper interaction between the user and the application as the user transitions through different parts or phases of the Web app, such as tracking preferences or items in a shopping cart. In general, the HTTP session should not be used to store "transactional" data; to do so is to get yourself into trouble. The HTTP session should be a primarily read-heavy (rather than write-heavy) data structure, particularly because the HTTP session can be, and most likely will be, accessed from multiple Threads.
Of course, different backing data stores (like Redis, and even GemFire) provide locking mechanisms. GemFire even provides cache-level transactions, which are heavyweight and arguably not appropriate for processing Web interactions managed in and by an HTTP session object (not to be confused with business transactions). Even locking is going to introduce serious contention and latency into the application.
Anyway, all of this is to say that you very much need to be conscious of the interactions and data access patterns, otherwise you will find yourself in hot water, so be careful, always!
Food for thought!
If I have a number of Grails domain objects that I do not want to save just yet, but still want to access throughout my application, is it wise to store them in the Grails/Hibernate session (especially as regards performance)? If not, what is the alternative?
What do you mean by the Grails/Hibernate session?
If you really mean the Hibernate session, adding an object to it will cause the object to be saved automatically when the session is flushed (unless the object doesn't validate, in which case it will be lost once the session is discarded). A Hibernate session is created and discarded per request.
If you mean the session object that gets automatically injected into controllers and views, it's neither Grails- nor Hibernate-specific, but just the plain old HttpSession from the Servlet specification (see http://docs.oracle.com/javaee/7/api/javax/servlet/http/HttpServletRequest.html).
You can use that to store any kind of object you need to access across multiple requests from the same client. The session is private to a given client (who identifies it through the jsessionid cookie) and survives multiple requests. If you don't need the multiple-request bit, adding the objects as request attributes would suffice.
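For reference, a minimal sketch of the session-scope vs. request-scope distinction described above, using the plain Servlet API (method and attribute names are illustrative):

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class AttributeScopes {

    static void demo(HttpServletRequest request) {
        // Survives multiple requests from the same client (via the jsessionid cookie):
        HttpSession session = request.getSession();
        session.setAttribute("draftObjects", new java.util.ArrayList<Object>());

        // Lives only for the duration of the current request:
        request.setAttribute("draftObjects", new java.util.ArrayList<Object>());
    }
}
```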
Putting things in the session is generally fine and fast (since by default it is memory-based), but it will increase the memory footprint of the application if abused, and it will prevent horizontal scaling (i.e. deploying the same application in multiple instances) unless sticky-session mechanisms are used (or the session is persisted).
Bear in mind, though, that Grails uses a new Hibernate session per request (not an HTTP session :), so if you add objects that are attached to a Hibernate session to the HTTP session, and the Hibernate session is then closed, you might encounter problems. This shouldn't affect unsaved objects (they don't come from a Hibernate session), but it might affect their associations (other domain class instances that do come from the database and therefore from a Hibernate session). If that's the case, you might need to re-attach them. See https://grails.github.io/grails-doc/latest/ref/Domain%20Classes/attach.html
Also, if the session is invalidated (because the user logs out, or the server is re-deployed) everything that was stored in there will be gone.
If you don't want to rely on sessions at all, you can create your own MemoryBasedStoreService and use a ConcurrentHashMap or a similar mechanism to store and retrieve the objects. Since services are singletons in Grails, you can use it across the whole application, regardless of requests or clients, as long as your application is deployed as a single instance, of course :).
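A minimal sketch of such a service, written here in plain Java for illustration (in Grails this would be a Groovy service class; the name MemoryBasedStoreService comes from the suggestion above):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Singleton in-memory store, usable across requests and clients,
 * as long as the application runs as a single instance.
 */
public class MemoryBasedStoreService {

    private final Map<String, Object> store = new ConcurrentHashMap<>();

    public void put(String key, Object value) {
        store.put(key, value);
    }

    @SuppressWarnings("unchecked")
    public <T> T get(String key) {
        return (T) store.get(key);
    }

    public Object remove(String key) {
        return store.remove(key);
    }
}
```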
I'm looking to store sessions in an external data store instead of in memory. I want to do this in order to have session data available to all my Jetty server instances.
So I read the Jetty docs:
Session Clustering with a Database
Session Clustering with MongoDB
Both docs have the following statement:
The persistent session mechanism works in conjunction with a load balancer that supports stickiness.
My question is: why do I need stickiness in this mechanism? The whole point is to allow any server in the cluster to serve any request, because the session data is stored externally.
The problem is that the servlet specification does not provide any transactional semantics around sessions. Thus, unless your sessions are sticky, it's possible for concurrent requests involving the same session to go to multiple servers and produce inconsistent results, depending on the interleaving of the writes. If your sessions are read-mostly you may get away with non-sticky sessions, although you may find that sessions time out a little sooner or a little later than you'd expect, due to multiple servers trying to manage the same session.
I have been given a requirement to persist user data once the user has authenticated initially. We don't want to hit the database to look up the user every time they navigate to a new view etc...
I have a User class that is [Serializable] so it could be stored in a session. I am using SQL server for session state as well. I was thinking of storing the object in session but I really hate doing that.
How are developers handling this type of requirement these days?
Three ways:
Encrypting the data in cookies and sending it to the client, decrypting it whenever you need it
Storing it server-side by an Id (e.g. UserId) in Cache, Session, or any other storage (which is safer than a cookie)
Using a second-level caching strategy if you use an ORM
Assuming your user object is not huge and does not change often, I think it is acceptable to store it in the session.
Since you already have a SQL Server session store, you will be making stored-procedure calls to pull/push the data anyway, and adding a small object to that should have minimal performance impact compared to other options, like persisting it down to the client and sending it back on every request.
I would also consider the server a much more secure place to keep this info.
You want to minimize the number of times you write to the session (and thereby request a lock) when it is stored in SQL, as it is implemented in a sealed class that exclusively locks the session. If any other requests in this session require write access to the SQL session, they will be blocked by the initial request until it releases the session lock. (There are some new hooks in .NET 4 that allow you to change the SessionStateBehavior in the pipeline before the session is accessed.)
You might consider a session state server (AppFabric) if the performance of your SQL session store is an issue.
Currently our DNS routes the user to the correct datacenter, and then we have a round-robin situation for the servers. We currently store the session information in a cookie, but it has grown too large, so we want to move it out of the browser and into a database. I'm worried that if we create a mid-tier box that they all hit, the response times will be affected. It's not feasible to store the session info on all machines, because we're talking about 200M+ unique sessions a month. Any suggestions or thoughts?
This is a job for memcached or, if you want to save session data to disk, MemcachedDB.
Memcached is a free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
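A minimal sketch of using memcached as the external session store from Java, via the spymemcached client (class, key prefix, and TTL here are illustrative assumptions):

```java
import java.io.IOException;
import java.io.Serializable;
import java.net.InetSocketAddress;

import net.spy.memcached.MemcachedClient;

/** Stores serialized session blobs in memcached, keyed by session id. */
public class MemcachedSessionStore {

    private static final int TTL_SECONDS = 30 * 60; // 30-minute sliding window

    private final MemcachedClient client;

    public MemcachedSessionStore(String host, int port) throws IOException {
        this.client = new MemcachedClient(new InetSocketAddress(host, port));
    }

    public void save(String sessionId, Serializable sessionData) {
        // set() overwrites the whole blob; the TTL acts as the session timeout.
        client.set("session:" + sessionId, TTL_SECONDS, sessionData);
    }

    public Object load(String sessionId) {
        return client.get("session:" + sessionId);
    }

    public void delete(String sessionId) {
        client.delete("session:" + sessionId);
    }
}
```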
Let's understand the role of browser-based cookies:

- Cookies are stored per browser profile.
- The same user logged on from different computers or browsers is considered different users.
- State cookies are mixed with user cookies.

Segregate the cookies:

- long-term state cookies, e.g. the currently-remembered userId
- session state cookies
- user cookies
Reading that your site is only beginning to consider server-side cookies implies that a segregation of cookies has not yet been done. User cookies should be stored on the server as much as possible, so that when a user logs on at another computer or browser, preferences and shopping carts are preserved. Your development team has to decide, for some cookies (shopping carts, for example), whether they are session-state or user-info cookies.
User cookies
These need to be accessible across the web site, regardless of where the user logs in. Your developers have to decide, when a user updates a preference or shopping cart, how immediately that change should be visible if the same userId is logged in at another location.
This means you have to implement a distributed database system. You have a master db server. Let us say you have 20 web servers, each with its own database.
Store only frequently changed cookies on the local db and leave the infrequently changing cookies on the master.
Every time a cookie is updated at a local db, an update flag is queued for the master. The cookie record in the master is not updated, only marked as stale, with the location number where the fresh data resides. So if that userId somehow gets activated 3000 miles away simultaneously, that session would find the stale records and trigger a request to copy those records from the fresh location to its own local db and to the master db, after which the records are no longer marked as stale on the master.
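A rough JDBC sketch of that stale-flag bookkeeping on the master; the CookieMaster table and its stale/freshLocation columns are hypothetical names for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Marks a master cookie record stale, recording where the fresh copy lives. */
public class MasterStaleFlagger {

    private final Connection master;

    public MasterStaleFlagger(Connection master) {
        this.master = master;
    }

    /** Called (asynchronously) whenever a local db accepts a cookie write. */
    public void markStale(long userId, String param, int freshLocation) throws SQLException {
        // Hypothetical schema: CookieMaster(id, param, value, stale, freshLocation)
        String sql = "UPDATE CookieMaster SET stale = 1, freshLocation = ? "
                   + "WHERE id = ? AND param = ?";
        try (PreparedStatement ps = master.prepareStatement(sql)) {
            ps.setInt(1, freshLocation);
            ps.setLong(2, userId);
            ps.setString(3, param);
            ps.executeUpdate();
        }
    }
}
```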
Then you schedule a regular sync of the most frequently used cookies. The frequency of sync could be nightly, or could depend on the results of characterizing cookie modification.
First, your programmers would need to write a routine to log all cookie reads/writes. You should collect a week's worth of cookie read/write activity to perform your initial component analysis.
You perform a simple statistical characterization per cookie, userId, and frequency of change. Then you slide along your preferences, deciding which cookies are pushed to all the local dbs and which stay on the master. The decision balances the size of the cookie block on the local dbs against the frequency of database sync you are willing to allow. This means not every user has the same set of cookies propagated. Of course, your programmers would need to write routines to automate the regular recharacterization. Rather than per user, you might wish to lighten the processing load of cookie propagation by grouping your users using cluster analysis; maybe the grouping of users for your site is so obvious that you need not perform cluster analysis at all.
You might be surprised to find that most cookies fall into the longer-than-weekly-update bucket, or, worse, the daily-update bucket; the worst case you should accept is hourly updates for cookie fields that are not pushed onto the local dbs. You want to increase the chances that a cookie access occurs on the local db rather than being pulled from the master database. So when a user decides to click on "preferences", which is seldom changed, you preemptively pull the preferences records from the master while distracting the user with some frills like "have you considered previewing our new service?", "would you like to answer our usability survey?", "new Gibson rant, would you comment?", etc., until the "preferences" cookies are copied over.
The characterization of cookies could be done per userId, or per cluster of users, to decide which cookie fields to push to the local dbs.
It is simpler to characterize per userId because it barely involves any statistical-analysis skills on the part of the programmer. The disadvantage is that the web server would have to make decisions for each of 200 million users. The database cookie table would be
Cookie[id, param, value, expectedMutationInterval].
Your web server would decide, per user, which cookies to push regularly based on the threshold time:
SELECT param, value
FROM Cookie
WHERE expectedMutationInterval < $thresholdTime
  AND id = $userId
You have to perform a regular recharacterization of cookies to update expectedMutationInterval per user per cookie. A simple SQL query would be able to perform the update of expectedMutationInterval; a more complex analysis could be performed to produce its value.
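For illustration, a rough JDBC sketch of that recharacterization pass; the averaging heuristic and the GREATEST function (MySQL/PostgreSQL) are assumptions, not part of the original design:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Recomputes expectedMutationInterval per user/cookie from the CookieLog table. */
public class Recharacterizer {

    /**
     * Sets expectedMutationInterval to the observation window divided by the
     * number of logged changes: a crude estimate of the average change gap.
     */
    public void recharacterize(Connection db, long windowSeconds) throws SQLException {
        String sql =
            "UPDATE Cookie c SET expectedMutationInterval = "
          + "  (SELECT ? / GREATEST(COUNT(*), 1) "
          + "     FROM CookieLog l "
          + "    WHERE l.id = c.id AND l.param = c.param)";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setLong(1, windowSeconds);
            ps.executeUpdate();
        }
    }
}
```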
If each cookie field change is logged by time, userId, and ipaddr, then your cookie log table would be
CookieLog[id, time, ipaddr, param, value].
which would help your automated recharacterization routine decide what fields to push depending on the dayofweek/month/season and location/region/ipaddr.
Then, after removing user-info cookies from the browser, if you still find your session cookies overflowing, you decide which session cookies to push to the browser and which stay on the local server. You use the same master-local db analysis technique, but now to decide between the local db and pushing to the browser. You leave your least frequently accessed session cookies on the local server, either as session attributes or in an in-memory db. So when a client finds a cookie is missing, it makes a request to the server for the cookie, sacrificing some least recently/frequently used cookie space on the browser to accommodate the fresh cookie.
Since these are session cookies, they need not be propagated to other locations, because if the same userId is logged on 3000 miles away, that session should have its own set of session cookies.
Characterization of browser cookies is ironic because, for AJAX apps, the client accesses the cookies without letting the server know, and letting the server know might defeat the purpose of placing the cookies in the browser in the first place. So you would have to choose idle times to send cookie-access logs to the server, for characterization purposes.
Such a level of granularity is good for cookies that are short in length (parameter value + parameter name), be they session-based or user-based cookies.
Therefore, if the parameter names and values of your cookie fields are long, you might seek to quantize them.
However, quantization is a little more complex. Browser cookies have a lot of commonality. Just like any quantization/compression method, you look for the clusters of commonalities and assign each commonality block a signature. Then the cookies are stored in terms of the quantized signature.
How do you facilitate quantization of browser-based cookies? Using GWT as an example, use the Dictionary or Map class.
e.g., the cookie "%1"="^$Kdm3i" might translate to LastConnectedFriend=MohammadAli#jinnah.
You should not even need to perform characterization for these: for example, why store your cookie as "LastConnectedFriend" when you could map it to "%1"? When a user logs in, why not map the most frequently accessed friends, etc., and place that map on the GWT/AJAX launching page? In that way you can shorten your session cookie lengths.
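A toy Java sketch of this name-quantization idea; the LastConnectedFriend/"%1" mapping is the illustrative one from above, and the class name is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

/** Maps verbose cookie names to short signatures and back. */
public class CookieQuantizer {

    private final Map<String, String> nameToSig = new HashMap<>();
    private final Map<String, String> sigToName = new HashMap<>();

    public CookieQuantizer() {
        register("LastConnectedFriend", "%1"); // illustrative mapping from the answer
    }

    private void register(String name, String signature) {
        nameToSig.put(name, signature);
        sigToName.put(signature, name);
    }

    /** Encode a cookie for the browser: shortens the name if a signature exists. */
    public String encode(String name, String value) {
        return nameToSig.getOrDefault(name, name) + "=" + value;
    }

    /** Decode a cookie received from the browser back to its verbose name/value pair. */
    public String[] decode(String cookie) {
        int eq = cookie.indexOf('=');
        String sig = cookie.substring(0, eq);
        return new String[] { sigToName.getOrDefault(sig, sig), cookie.substring(eq + 1) };
    }
}
```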
So, is your company looking for a statistical programmer? Disclaimer: this is written off-the-cuff and might need some factual realignment.