SQLAlchemy session: how to keep it alive?

I have a session object that gets passed around a whole lot and at some point the following lines of code are called (this is unavoidable):
import transaction
transaction.commit()
This renders the session unusable (by closing it I think).
My question is two part:
How do I check if a session is still alive and well?
Is there a quick way to revitalize a dead session?
For 2: The only way I currently know is to use sqlalchemy.orm.scoped_session, then call query(...).get(id) many times to recreate the necessary model instances, but this seems pretty darn inefficient.
EDIT
Here's an example of the sequence of events that causes the error:
modelInstance = DBSession.query(ModelClass).first()
import transaction
transaction.commit()
modelInstance.some_relationship
And here is the error:
sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <CategoryNode at 0x7fdc4c4b3110> is not bound to a Session; lazy load operation of attribute 'children' cannot proceed
I don't really want to turn off lazy loading.
EDIT
DBSession.is_active seems to be no indication of whether or not the session is in fact alive and well in this case:
transaction.commit()
print(DBSession.is_active)
this prints True...
EDIT
This seemed too big for a comment so I'm putting it here.
zzzeek said:
"An expired object will automatically load new state from the database, via the Session, as soon as you access anything on it, so there's no need to tell the Session to do anything here."
So how do I get stuff committed in such a way that this will happen? Calling transaction.commit is apparently wrong, so what's the correct way?

The first thing to observe here is that "import transaction" refers to a package called zope.transaction: a generic transaction manager that takes hold of any number of sub-tasks, of which the SQLAlchemy Session is one, via the zope.sqlalchemy extension.
What zope.sqlalchemy does is call the begin()/rollback()/commit() methods of the Session itself, in response to its own management of the "transaction".
The Session itself works in such a way that it is almost always ready for use, even if its internal transaction has been committed. When this happens, the Session upon next use just keeps going: starting a new transaction if autocommit=False, or continuing in "autocommit" mode if autocommit=True. Basically it is auto-revitalizing.
The one time the Session is not able to proceed is when a flush has failed and the rollback() method has not yet been called; in autocommit=False mode the Session wants you to call rollback() explicitly when flush() fails. To see if the Session is in this specific state, check the session.is_active property, which returns False in that case.
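As a minimal sketch (assuming the question's DBSession and the default autocommit=False), detecting and recovering from that one unrecoverable state might look like:

from sqlalchemy.exc import SQLAlchemyError

try:
    DBSession.flush()
except SQLAlchemyError:
    # A failed flush is the one state the Session won't recover from on its own;
    # is_active stays False until rollback() is called explicitly.
    print(DBSession.is_active)  # False
    DBSession.rollback()

print(DBSession.is_active)  # True; the Session is usable again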
I'm not 100% sure what the implications are of continuing to use the Session when zope.transaction is in use. I think it depends on how you're using zope.transaction in the bigger scheme.
Which leads us to where lots of these questions end up: what are you really trying to do? "Recreate the necessary model instances" is not something the Session does, unless you are referring to existing instances which have been expired (their guts emptied out). An expired object will automatically load new state from the database, via the Session, as soon as you access anything on it, so there's no need to tell the Session to do anything here.
It's of course an option to turn off auto-expiration entirely, but the fact that you're even arriving at a problem here implies something is not working as it should, like some error message you're getting. More detail would be needed to understand exactly what the issue is.
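To tie that back to the question's traceback: one hedged way to keep working with modelInstance after transaction.commit() is to re-attach the now-detached instance to the session, so that the next attribute access loads fresh state (a sketch using the question's own names):

modelInstance = DBSession.query(ModelClass).first()

import transaction
transaction.commit()

# The commit detached/expired the instance. Session.add() re-attaches a
# detached instance, after which expired attributes reload on access.
DBSession.add(modelInstance)
print(modelInstance.some_relationship)  # lazy load proceeds via the Session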

Related

Community-auth Codeigniter 3 - randomly seems to lose session variable

I am using Community-Auth with CodeIgniter V3 to do authentication and to store authorization levels, etc.
The problem I am having is that my users are sometimes being redirected to the login page, even though they have not been inactive. I cannot seem to isolate a particular behavior or pattern to duplicate the problem.
The problem occurs when a controller calls the verify_min_level routine which should just verify that they are logged on. But it returns FALSE, which means Community-Auth believes they are not logged in, and the code redirects to the login screen.
Since it seems to happen randomly and for no apparent reason (the user was not inactive for a while, etc) it is driving my users crazy.
Has anyone else seen this kind of behavior?
I seem to have identified the problem. This particular client wanted sessions that would only end when they logged out or closed their browser window. So I set the session expiration to zero (0).
I thought that garbage collection would only delete sessions occasionally (given that in CodeIgniter, as I understand it, 0 means the session ends in two years) and that I would catch up with it with my own garbage collection. However, I started noticing that the ci_sessions table (I moved session data from the file system to the database to help debug this issue) would frequently have multiple sessions removed, even though none of them were anywhere near two years old.
What seems to have solved the problem is turning off garbage collection completely by setting the PHP parameter session.gc_probability to 0.
No garbage collection, no premature deletion of session variables.
I am implementing a nightly CRON job to do garbage collection of the ci_sessions table.
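For illustration, a nightly cleanup along those lines might look like the sketch below (Python with the pymysql driver as an arbitrary choice; CI3's ci_sessions table keeps a Unix timestamp column, and the 24-hour cutoff is an assumption, not CodeIgniter's default):

import time
import pymysql  # hypothetical driver choice for the example

CUTOFF_SECONDS = 24 * 60 * 60  # assumed policy: drop sessions idle for a day

conn = pymysql.connect(host="localhost", user="app", password="secret", database="app")
try:
    with conn.cursor() as cur:
        # ci_sessions.timestamp holds the session's last-activity Unix time
        cur.execute(
            "DELETE FROM ci_sessions WHERE timestamp < %s",
            (int(time.time()) - CUTOFF_SECONDS,),
        )
    conn.commit()
finally:
    conn.close()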

gorilla/sessions persistent between server restarts?

I have a general question about sessions. I am not very seasoned when it comes to this subject. I've tried with:
NewRediStore (gopkg.in/boj/redistore.v1)
NewCookieStore
NewFileSystemStore
I was under the impression that sessions could last between server restarts, hence the need for a 'store'. While my golang backend is running, I am able to set new sessions and retrieve them for multiple users/browsers. No problems there.
When I restart my server, I notice that all session access results in session.IsNew == true.
In Redis, I can see all the session keys after the restart, and I even verified that .Get()-ing the session returns the right ID, but IsNew is still set.
I guess intuitively this makes sense, because there must be some map in memory that leads to the setting of IsNew; but I would think that if there was any hit for the cookie key in the store, IsNew should not be set. Am I going crazy? Is there something easy that I am doing wrong? Is this a fundamental misunderstanding of how to use sessions?
Please let me know if I need to include code or additional details.
I would have had the same assumptions you did, and browsing the source, it looks like it should work as you described. You might try debugging and stepping through it, particularly the New method for the store you're using (e.g. FilesystemStore.New or RediStore.New). If that method successfully reads the cookie and finds the session in the store, it should set IsNew = false, according to the source.
Also note that just checking the session ID is not a good way of validating this behavior. If you look at the source, it decodes the session ID from the cookie, then tries to look that up in the backing store. If the lookup fails, then the session ID will match, but IsNew will be true and there won't be any values in the session. Make sure you're setting some value in the session and check for that instead of the session ID. The behavior is different for the CookieStore since it stores the session data in the cookie itself.
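To make that distinction concrete, here is the lookup logic described above as a rough Python sketch (the real implementation is Go in gorilla/sessions; every name below is illustrative):

def decode_cookie(cookie_value):
    # Stand-in for gorilla's securecookie decoding; here the cookie is the ID.
    return cookie_value

def load_session(cookie_value, store):
    session_id = decode_cookie(cookie_value)  # ID comes back whenever the cookie decodes
    data = store.get(session_id)              # lookup in the backing store (Redis, files, ...)
    if data is None:
        # Decoded fine, but the store has no record (e.g. wiped on restart):
        # the ID matches, the values are empty, and IsNew stays True.
        return {"id": session_id, "is_new": True, "values": {}}
    return {"id": session_id, "is_new": False, "values": data}

store = {}  # an empty backing store, as after a restart
session = load_session("abc123", store)
assert session["is_new"] and session["id"] == "abc123"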

Coldfusion: is it better to keep just the user_id in the session, or the whole user object?

I've got a cfc to handle the user object. My question is: is it better to store just the user_id in the session and create the user object anew with each request? Or is it better to store the whole user object in the session?
Here are my thoughts either way:
If I store the whole object in the session:
There will be potentially less processor overhead
There will be potentially more memory overhead
All of the methods/functions are stored in the actual object, so new functions that I update in the cfc will not be available unless users log out and back in, or unless I devise some way to make the object refresh itself.
There could potentially be mutex or lock problems if I'm messing with the object via concurrent ajax calls
If I store just the user_id in the session:
I'll have to create the user object with each page request (potentially more processor overhead)
There will be potentially less memory overhead
There won't be a chance for mutex/lock/race conditions since each request will have its own copy of the user object
Updates to the CFC model itself will be immediately recognized across the system and users wouldn't have to log out and back in
Is there a normal practice for this sort of thing? Am I over-thinking it?
All of the CF apps I've written were targeted at high traffic levels and high availability, so we never had the luxury of being able to think about single-server practices.
So, in my experience, I always had to a) allow for multiple load-balanced servers, and b) avoid sticky-sessions on the load balancer for a number of reasons. Therefore, we needed to, at the very least, have a server become part of a cluster on the fly and pick up mid-session traffic.
So, we always pulled "session" data from a shared datastore on every request.
My suggestion is to implement a session facade.
This affords you the option to change how you persist session data (like the user record) without changing the rest of your app.
You can choose, behind the scenes, to store everything in the session scope, load it up for every request, do a hybrid, use a key-value store, whatever.
You can choose whether to eager-load data, or lazy-load data, or any mix in between, and the rest of the app doesn't need to be aware of what you've done.
On Race Conditions
If you're concerned about race conditions then I would suggest using named locks around data commit and access. This is another bonus of using a facade - your application code doesn't need to know about this, and you can choose to put locks around certain objects, as opposed to locking the whole session.
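As a rough illustration of the facade idea (Python-style pseudocode rather than the original ColdFusion; all names are hypothetical), the rest of the app only ever talks to the facade and never to the storage directly:

import threading

class SessionFacade:
    """The app calls get_user()/set_user(); the storage behind it can change."""

    def __init__(self, backing_store):
        self._store = backing_store       # session scope, Redis, a database, ...
        self._lock = threading.Lock()     # named-lock stand-in scoped to user data

    def get_user(self, load_user):
        with self._lock:                  # lock just this object, not the whole session
            user = self._store.get("user")
            if user is None:              # lazy-load on first access
                user = load_user()
                self._store["user"] = user
            return user

    def set_user(self, user):
        with self._lock:
            self._store["user"] = user

facade = SessionFacade({})
print(facade.get_user(lambda: {"id": 42, "name": "demo"}))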
You haven't indicated whether you're using an ORM, so this is a general answer.
For typical applications, I recommend instantiating the user object into the session scope. There's a big downside to creating the object anew with each request that you didn't include in your list: changes to the user object's properties and state will not persist across requests unless you intend to flush the user object's state to your persistence layer (e.g. database) on every hit. That is likely to be a much more expensive operation than object instantiation, and it doesn't necessarily insulate you from the kinds of problems you're thinking about with respect to ajax calls, race conditions, etc -- it just transfers the manifestation of those problems to the persistence layer, where your object's data could be in an unpredictable state.
Since every new request would be an "implicit save", you would also have to design your "ephemeral" object to be able to persist itself regardless of whether it's in a valid state (imagine the case of a multi-page form that modifies some aspect of the user object).
For session-stored objects, your concerns about memory can be mitigated by careful design practices. For instance, if your user has many tasks, and each task has many items, it might be a bad idea to instantiate and compose all those objects into your user object (i.e., lazy loading would be a better approach than eager loading).
If you really must be able to change your CFCs on the fly, you can achieve that goal even with session-stored objects. One way is to store a version flag in both the application and session scopes. With each request, your app would compare those flags. When they differ, the app would run a session-reload routine that snapshots current properties, rebuilds the session-stored objects, and finally updates the session flag to match the application flag.
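A hedged sketch of that version-flag comparison (Python-style pseudocode; the dicts stand in for CF's application and session scopes, and rebuild_user is a hypothetical helper):

def ensure_fresh(application, session, rebuild_user):
    # Compare the app-wide CFC version with the one this session was built against.
    if session.get("cfc_version") != application["cfc_version"]:
        old_props = dict(session.get("user_props", {}))  # snapshot current properties
        session["user"] = rebuild_user(old_props)        # rebuild against the new CFCs
        session["cfc_version"] = application["cfc_version"]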
This is piggy-backing partially off Ken Redler's answer but I don't have enough reputation to comment.
The way we do it, and the way I prefer, is to store the user data in Session as a struct. Then on request start, our Auth Model creates the user object in the Request scope and overrides any default values with the Session data. There are a few advantages to this:
Less hits to the database, less CPU
Always run newest code without a complex custom system ensuring that
Clustered environment friendly (complex objects in Session can't be clustered)
Can add or remove properties without corruption (assuming your User object only updates dirty columns)
Also, if you're using CF9, one of the features they were really proud of is how much they optimized object instantiation. If you haven't, test it yourself!
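Sketched in Python-style pseudocode (the original is ColdFusion; names are illustrative), the request-start step described above looks roughly like this:

DEFAULTS = {"role": "guest", "theme": "light"}  # assumed default properties

def on_request_start(session):
    # Build a fresh object each request from the newest class definition,
    # then override the defaults with the plain data kept in the session.
    user = dict(DEFAULTS)
    user.update(session.get("user_data", {}))
    return user  # lives only for this request; the session keeps only plain data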
It depends.
If you have a lot of traffic - in the thousands of unique visitors per minute range - the memory overhead of storing your User.cfc in the session will eventually weigh you down. This can be easily overcome by throwing hardware at it (more memory for a while, eventually more servers and a hardware load balancer). Of course popularity is a good problem to have.
If you seem to have a CPU, network or other bottleneck in your database space, you may want to have the object cached in session memory so that you have fewer hits to the database.
Why do I mention these scenarios? You may be prematurely optimizing - don't fix a problem that you don't have. Don't optimize your memory, CPU and database access until those are, or soon will be, problems.
Now from an architectural best practice - not from an optimized "what's best for my processor" - well, I can only say: It depends.
Truthfully, neither way is wrong. If you are going to find yourself needing to check credentials against your database on every request, don't cache it. If you like the feel of an object in the session, then cache it. Because you know your own domain, you can probably go back and forth all day on why you should or should not cache the user object in the session. If it's going to make it easier, do it. If it's going to make it harder, don't.
I would just warn you against doing something incredibly convoluted or anything that is not immediately obvious to a developer looking at your application - the more you write, the more you have to maintain forever, the more your co-workers will associate your name with evil.
Finally, last note, if this is a vote - I say you cache it. It makes sense and always feels good to call session.user.hasRole("xyz") or the like.

How do I set up a lock that will automatically time out if it does not get a keep-alive signal?

I have a certain resource I want to limit access to. Basically, I am using a session-level lock. However, it is getting to be a pain writing JavaScript that covers every possible way a window can close.
Once the user leaves that page I would like to unlock the resource.
My basic idea is to use some sort of server-side timeout to unlock the resource. Basically, if I fail to unlock the resource, I want a timer to kick in and unlock it.
For example, after 30 seconds with no update from the client side, unlock the resource.
My basic question is: what sort of server-side trick can I use to do this? It is my understanding that I can't just create a thread in JSF, because it would be unmanaged.
I am sure other people do this kind of thing, what is the correct thing to use?
Thanks,
Grae
As BalusC rightfully asked, the big question is at what level of granularity you would like to do this locking: per logged-in user, for all users, or perhaps you could get away with locking per request?
Or, and this will be a tougher one, is the idea that a single page request grabs the lock and that specific page then keeps the lock between requests, e.g. as a kind of reservation? I'm browsing a hotel page, and when I merely look at a room I have made an implicit reservation in the system for that room, so it can't happen that somebody else reserves the room for real while I'm looking at it?
In the latter case, maybe the following scheme would work:
In application scope, define a global concurrent map.
Keys of the map represent the resources you want to protect.
Values of the map are a custom structure which hold a read write lock (e.g. ReentrantReadWriteLock), a token and a timestamp.
In application scope, there also is a single global lock (e.g. ReentrantLock)
Code in a request first grabs the global lock, and quickly checks if the entry in the map is there.
If the entry is there it is taken, otherwise it's created. Creation time should be very short. The global lock is quickly released.
If the entry was new, it's locked via its write lock and a new token and timestamp are created.
If the entry was not new, it's locked via its read lock.
If the code has the same token, it can go ahead and access the protected resource; otherwise it checks the timestamp.
If the timestamp has expired, it tries to grab the write lock.
The write lock has a time-out. When the time-out occurs give up and communicate something to the client. Otherwise a new token and timestamp are created.
This is just the general idea (sketched below). In a Java EE application that I have built I used something similar (though not exactly the same) and it worked quite well.
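For what it's worth, a minimal Python sketch of the token-and-timestamp idea (collapsing the per-entry read/write locks into a single lease table for brevity; the TTL value and all names are assumptions):

import threading
import time
import uuid

_global_lock = threading.Lock()
_leases = {}        # resource key -> (token, expiry time)
TTL = 30.0          # seconds without a keep-alive before the lock expires

def acquire(resource, token=None):
    """Return a token if the caller now holds the resource, else None."""
    now = time.time()
    with _global_lock:
        lease = _leases.get(resource)
        if lease is not None and lease[0] == token:
            _leases[resource] = (token, now + TTL)   # keep-alive: extend the lease
            return token
        if lease is None or lease[1] < now:          # free, or the holder timed out
            new_token = uuid.uuid4().hex
            _leases[resource] = (new_token, now + TTL)
            return new_token
        return None                                  # held by someone else, not expired

def release(resource, token):
    with _global_lock:
        if resource in _leases and _leases[resource][0] == token:
            del _leases[resource]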
Alternatively you could use a Quartz job that periodically removes the stale entries. Yet another alternative is replacing the global concurrent map with e.g. a JBoss Cache or Infinispan instance. These allow you to define an eviction policy for their entries, which saves you from having to code this yourself. If you have never used those caches though, learning how to set them up and configure them correctly can be more trouble than just building a simple Quartz job yourself.

What's a good way to handle "async" commits?

I have a WCF service that uses ODP.NET to read data from an Oracle database. The service also writes to the database, but indirectly, as all updates and inserts are achieved through an older layer of business logic that I access via COM+, which I wrap in a TransactionScope. The older layer connects to Oracle via ODBC, not ODP.NET.
The problem I have is that because Oracle uses a two-phase commit, and because the older business layer is using ODBC and not ODP.NET, the transaction sometimes returns from TransactionScope.Commit() before the data is actually available for reads from the service layer.
I see a similar post about a Java user having trouble like this as well on Stack Overflow.
A representative from Oracle posted that there isn't much I can do about this problem:
This may be due to the way the OLETx ITransaction::Commit() method behaves. After phase 1 of the 2PC (i.e. the prepare phase), if all is successful, commit can return even if the resource managers haven't actually committed. After all, the successful "prepare" is a guarantee that the resource managers cannot arbitrarily abort after this point. Thus even though a resource manager couldn't commit because it didn't receive a "commit" notification from the MSDTC (due to, say, a communication failure), the component's commit request returns successfully. If you select rows from the table(s) immediately, you may sometimes see the actual commit occur in the database after you have already executed your select. Your select will not therefore see the new rows, due to consistent read semantics. There is nothing we can do about this in Oracle, as the "commit success after successful phase 1" optimization is part of the MSDTC's implementation.
So, my question is this:
How should I go about dealing with the possible delay ("async" per the title) problem of figuring out when the second part of the 2PC actually occurs, so I can be sure that data I inserted (indirectly) is actually available to be selected after the Commit() call returns?
How do big systems deal with the fact that the data might not be ready for reading immediately?
I assume that the whole transaction has prepared and a commit outcome has been decided by the Transaction Manager, so eventually (barring heuristic damage) the Resource Managers will receive their commit message and complete. However, there are no guarantees as to how long that might take: it could be days; no timeouts apply; having voted "commit" in the prepare phase, the Resource Manager must wait to hear the collective outcome.
Under these conditions, the simplest approach is to take an "understood, we're thinking about it" approach: the request has been understood, but you actually don't know the outcome, and that's what you tell the user. Yes, in all sane circumstances the request will complete, but under some conditions operators could actually choose to intervene in the transaction manually (and maybe cause heuristic damage in doing so).
To go one step further, you could start a new transaction and perform some queries to see if the data is there. If you are populating a result screen you will naturally be doing such a query anyway. The question is what to do if the expected results are not there. So again, tell the user "your recent request is being processed, hit refresh to see if it's complete". Or retry automatically (I don't much like auto-retry; I prefer to educate the user that it's effectively an asynchronous operation).
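If you do go the query-and-retry route, a small polling helper is enough. A hedged Python sketch (fetch_row is a hypothetical callable that runs its own fresh transaction and returns None until the row is visible):

import time

def wait_for_commit(fetch_row, key, attempts=5, delay=0.5):
    """Poll in fresh transactions until the data is visible or we give up."""
    for _ in range(attempts):
        if fetch_row(key) is not None:   # each call opens its own transaction
            return True                  # the 2PC has fully landed
        time.sleep(delay)
    return False  # still in flight: tell the user "processing, hit refresh"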
