ColdFusion 2018 clustering and session replication not working - session

I am setting up a couple of new ColdFusion 2018 servers, will be using clustering for the first time, and have run into some problems.
I am having trouble with session replication. Basically, session variables appear to be replicated between nodes in a cluster, but they are lost at random after a short while.
A little setup info:
* 2 web servers (Windows Server 2012) behind load balancers
* On each web server sits a ColdFusion cluster consisting of 2 local instances (still unclear whether this is useful; I will ask in a separate question) and 2 remote instances (the remotes reference the local instances of the opposite server)
* For simplicity, I am currently testing on a single server with local ColdFusion instances only, leaving the remotes out of the equation until I can get things working reliably locally
* Using J2EE session variables
* ColdFusion session timeout set to 2 hours
* In each ColdFusion instance, channelSendOptions is set to "6"
Here is what I did/experienced:
* We have a web application that requires login and stores user information in the session upon login.
* I made a small modification to the web app to show me which cluster instance serviced my current request.
* After setting up the cluster, I started the web application and logged in, noting the instance that displayed the login page.
* Upon logging in, I was immediately returned to the login screen (the app checks for user info in the session and redirects to login if it is not found).
* Debugging revealed that I was actually being logged in, but after the redirect to a new page following login, the user info would be gone from the session.
* Multiple login attempts in a row (same credentials, tried repeatedly) revealed that sometimes login would proceed just fine and I would get into the app. However, if I refreshed the page or went to another page, the session would soon be lost, at random (within a few page refreshes).
* To simplify the problem and figure out what is going on, I created a simple .cfm that bypasses all the login logic and does one thing: it adds a simple string value to the session and then dumps the session and the instance name.
** I ran the script once, noting which instance was being used and that the session contained my value.
** I then edited the script so it no longer set the session value.
** I then hit refresh over and over so I could confirm:
*** That requests were being serviced by both instances in the cluster
*** That as I flip-flopped between instances, the session value was available all the time.
Again, the replication would work, and for several refreshes I could see my session variable available on each instance... until it wasn't. After a random number of refreshes or seconds (say, between 2 and 10 refreshes) the value would disappear.
I am at a loss to explain why this is happening. We considered using Redis as a session store to see if it helped, but frankly, our team has no experience with it, it is clunky to get working on Windows, and we really don't want any more moving pieces in our infrastructure if we can help it.
Any insight into what is occurring, as well as advice on how to look behind the scenes and see what is going on with session replication, would be greatly appreciated.
Thanks
Adding some code and screenshots. The screenshots show the state of session after each page refresh and which instance is currently serving the page. The last two images represent refreshes 11 and 13 - the session variable was lost in 11 and I went to 13 so that we can see that the variable was lost on the other instance as well. Also a couple pictures of cluster/session setup.
Following is the simple test script. The first line is un-commented on first run to create the session variable and commented out for each subsequent run.
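<!--- Uncomment the next line on the first run only, to create the session variable. --->
<!--- <cfset Session.svar="cake!"> --->
<cfdump var="#Session#" />
<cfscript>
    // Address of the machine serving this request (computed but not displayed below)
    hostaddress = createObject("java", "java.net.InetAddress").getLocalHost().getHostAddress();
</cfscript>
<cfoutput>
    <h3>
        Instance: #createObject("component", "CFIDE.adminapi.runtime").getInstanceName()#
    </h3>
</cfoutput>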
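To see what happens behind the scenes on each refresh, it may help to log the session ID next to the instance name, so that a changed session ID can be told apart from a session that kept its ID but lost its variables. The following is only a sketch under stated assumptions: it assumes J2EE sessions are enabled (so `session.sessionid` holds the J2EE session ID), it reuses the `svar` key from the test script above, and `sessiontrace` is a hypothetical log file name.

```cfml
<!--- Sketch only: "sessiontrace" is a hypothetical log file name. --->
<cfset instanceName = createObject("component", "CFIDE.adminapi.runtime").getInstanceName()>

<!--- structKeyExists avoids an error once svar has disappeared from the session --->
<cfset hasValue = structKeyExists(session, "svar")>

<!--- One log line per request: which instance answered, which session ID it saw,
      which cookies the browser sent, and whether the test value was still present --->
<cflog file="sessiontrace"
       type="information"
       text="instance=#instanceName# sessionid=#session.sessionid# cookie=#cgi.http_cookie# svarPresent=#hasValue#">

<cfoutput>
    <p>Instance: #instanceName#, session ID: #session.sessionid#, svar present: #hasValue#</p>
</cfoutput>
```

If the logged session ID stays the same while `svarPresent` flips to NO, the session itself survived and only its contents were lost; if the ID changes, the browser was handed a new session. Note that the cookie header can contain sensitive values, so this kind of logging is best kept to a test environment.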

Related

All Session Variables are removed

We are having a problem with all session variables being deleted at random times.
This happens without calling Session.Abandon() or Session.Clear(). IIS is also not recycled, and no App_Code, Bin, Global.asax or Web.config changes are made when this happens. We have enabled logging in IIS to confirm that the app pool is not being recycled. IIS is set to recycle once a day in the morning, and no limits are set that would force a recycle.
This happens at random and we are not able to reproduce it at all. We use FormsAuthentication, but the site determines whether to redirect back to the login page by checking whether 2 critical session variables exist (sometimes the sessions are cleared even while the user is still authenticated). We use the default InProc session state.
We have tried Response.Redirect(..., false) when setting variables, without any luck. This happens on a single server.
We are running a web farm of sorts (the login screen handles the load and redirects to a server, but the user stays on that server until he logs out).
Any help in the correct direction will be appreciated!

Session corrupt using aspnet_state service

We have for some time now been experiencing problems with data being saved in our SQL database.
Sometimes records are saved with data that does not match the rest of the row, making it seem like at some point, data is being 'swapped' for something else, perhaps, another user's data, before being passed to the database.
We do use TransactionScopes throughout with Isolation Level of ReadCommitted which makes me think the data integrity issue lies within the application rather than at the Database level.
We do use the session extensively and we are starting to think that the times of the corrupt data are similar to the times we deploy updates to the system during the day.
We do use the aspnet_state service to persist the session over application restarts.
Our users rely on terminal sessions therefore multiple users all log into the same server and launch the system via a browser.
We have in the past noticed users logging in with the same domain credentials but we are now relatively confident that users now log in with unique accounts.
99.9% of the data is correct but we have been struggling to understand what could be causing this intermittent data integrity issue.
We are now limiting our deploys to outside working hours on pain of death, but this is not always possible.
Can anyone shed light on why/how this might be happening?
EDIT: We have now isolated this to the DAL; see "SQL query returns incorrect value in multi user environment".
I have recently been fighting this, and had a similar problem to yours: around 95% of the data written back was correct. I looked at various possible reasons. The main culprit was that some users on the network had downloaded Chrome and were opening records in it, which broke our session IDs, as Chrome appeared to ignore our sessions.
The other cause was users either not closing the browser or not logging off the application, which allowed the same user or a completely different user to pick up and reuse the session ID.
After introducing a browser check that rejects Chrome, educating the users to make sure they log off, and restricting updates to outside busy periods, the problem was just about gone.
I forgot to mention: in IIS it's also best to turn off caching under Output Caching, with both the user-mode and kernel-mode settings set to prevent caching.

Sessions dropped intermittently in ColdFusion/IIS

Several times per day (though we cannot reproduce it ourselves), we're seeing instances of sessions being dropped.
What I mean is I have logs of the user coming to the site, performing a few requests, and then having each of their next few requests get a different session identifier and thus wiping out everything in their session. Same IP, same browser, and all of this happens in the course of a couple seconds. The session timeout is configured to 20 minutes.
It doesn't appear to be related to a specific browser, as users have claimed coworkers don't experience the issue on the same machine.
What's really bizarre is that for some requests I can clearly see one session ID coming in through CGI.HTTP_COOKIE and another one is assigned during the course of the request (by the time we get an error email, which is caused by their lack of session). WTF?
To my knowledge, nothing in our application code could be causing this. We use session variables of course, but don't wipe or reset the session ID cookies. I was under the impression that's completely handled by the server.
I'm ripping my hair out here. Any ideas on even how to go about debugging this would be appreciated.

ColdFusion Session issue - multiple users behind one proxy IP -- cftoken and cfid seems to be shared

I have an application that uses ColdFusion's session management (instead of J2EE session management).
We have one client who has recently switched their company's traffic to reach us via a proxy server in their network.
So, to our ColdFusion server, it appears that all traffic for all of this company's accounts is coming from this one IP address.
Of the session variables, Part 1 is kept in a cflock, and Part 2 is kept in editable session variables. I may be misunderstanding, but we have done it this way because we modify some values as needed throughout the application's usage.
We are now running into an issue where this client's session variables appear to get mixed up. In one case we set a timestamp, and when it comes time to look it up, it's empty. From the looks of it, this is happening because another user is on the same token.
My initial thoughts are to look into modifying our existing session management to somehow generate a unique cftoken/cfid, or to start using jsession_ID, if this solves the problem at all.
I have done some basic research on this issue and couldn't find anything similar, so I thought I'd ask here.
Thanks!
I've run into similar problems on and off for years.
JSession cookies seem to help (no hard data on that), but one solution that I've implemented repeatedly is using no-cache and cache expiry headers on every page.
http://www.bpurcell.org/blog/index.cfm?entry=1075&mode=entry gives some specifics on how to implement this.
In extreme cases, we've been forced to pass the token and cfid in the links/forms, but that is a PITA to implement, so I'd try the cache expiry/prevention solution first.
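As a rough illustration of the no-cache and cache expiry headers mentioned above, here is a minimal sketch using `cfheader`. The exact set of headers is an assumption on my part and is not necessarily what the linked post recommends; the idea is only that a proxy or browser should not serve one user's cached page (and its session identifiers) to another.

```cfml
<!--- Sketch: send these on every page, e.g. from Application.cfm or onRequestStart. --->
<!--- HTTP/1.1 clients and proxies: do not store or reuse the response --->
<cfheader name="Cache-Control" value="no-cache, no-store, must-revalidate">
<!--- HTTP/1.0 clients and proxies --->
<cfheader name="Pragma" value="no-cache">
<!--- Treat the response as already expired --->
<cfheader name="Expires" value="0">
```

Placing the headers in one central file keeps them consistent across pages, which matters here because a single cacheable page is enough for a shared proxy to hand out stale content.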
As far as I know, there are no "cons" in using J2EE session variables, unless you really need session to be active after user closes the browser. I think you should try and see how application behaves with it and see if that saves you trouble of refactoring.
To check which other settings your application is using, try this:
<cfdump var="#APPLICATION.GetApplicationSettings()#" label="Application settings" />
If you have sessionmanagement and client cookies turned on, everything is fine, so try j2ee session variables.

ColdFusion sessions not being timed out

We have 2 core applications running on our servers on CF 8, and both have the exact same session timeout set in the application CFC (2 hours at the moment). However, we're seeing that sessions are spiralling out of control for one of the applications (currently at 120,000+ on one server), let's call it AppA, whereas AppB seems fine (and AppB is the one we'd expect a lot more traffic to).
So I did some further digging and found out that most of the sessions for AppA have been idle for many hours with the highest value I've seen so far being over 11 hours.
We're not actually doing that much with sessions so I'm a little confused as to why they're not being timed out as expected. Also I've dumped the this scope in the application CFC and it is showing the expected value for sessionTimeout.
The only thing I had noticed is that in one instance we're assigning a variable on the Request scope from a Session variable. If it were a different scope I would maybe think that is causing some sort of reference that GC (or whatever) can't clear.
In terms of the spiral, I'd say that's to do with some requests which aren't passing the CFID/CFTOKEN through to maintain the session, so every such request starts a new session. These could be web service calls, CFHTTP requests, search engine bots, etc. It sounds like one of your apps is experiencing this. If so, then for CFHTTP, pass the CFID/CFTOKEN through to maintain sessions. Web services are a bit more tricky: you'll need to create a 'key' which is passed back and forth, which is a whole separate topic! Bots can be handled by having some conditionals to set the session timeout value.
For the 11 hours, I'd say that's due to the session being kept alive by something. Some continual polling? A recurring AJAX request? It would have to be something that continues to pass the ID/TOKEN through.
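The conditional session timeout for bots could look roughly like the following Application.cfc fragment. This is a sketch, not a definitive implementation: the user-agent pattern and the 10-second timeout are assumptions chosen for illustration, and the 2-hour value mirrors the timeout mentioned in the question.

```cfml
<cfcomponent>
    <cfset this.name = "AppA">
    <cfset this.sessionManagement = true>

    <!--- Assumption: a crude user-agent match is good enough to spot most crawlers --->
    <cfif reFindNoCase("bot|crawl|spider|slurp", cgi.http_user_agent)>
        <!--- Bots rarely send cookies back, so each hit creates a session; expire it quickly --->
        <cfset this.sessionTimeout = createTimeSpan(0, 0, 0, 10)>
    <cfelse>
        <!--- Regular visitors keep the normal 2-hour timeout --->
        <cfset this.sessionTimeout = createTimeSpan(0, 2, 0, 0)>
    </cfif>
</cfcomponent>
```

Because the code sits in the component's pseudo-constructor, it runs on every request, so the timeout is chosen per request rather than once per application.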
I used to get server lockups in CF6.1 when I was persisting CFCs in the application or session scopes. Now I instantiate them in the request scope and the lockups stopped happening (with no noticeable performance drop). Maybe you have a similar issue.
Actually, it turns out the sessions were started from another app which wasn't overriding the default values in the base Application.cfc (including the application name).
