How to I set up a lock that will automatically time out if it does not get a keep alive signal? - ajax

I have a certain resouce I want to limit access to. Basically, I am using a session level lock. However, it is getting to be a pain writing JavaScript that covers every possible way a window can close.
Once the user leaves that page I would like to unlock the resouce.
My basic idea is to use some sort of server side timeout, to unlock the resouce. Basically, if I fail to unlock the resource, I want a timer to kick in and unlock the resouce.
For example, after 30 seconds with now update from the clientside, unlock the resouce.
My basic question, is what sort of side trick can I use to do this? It is my understanding, that I can't just create a thread in JSF, because it would be unmanaged.
I am sure other people do this kind of thing, what is the correct thing to use?
Thanks,
Grae

As BalusC right fully asked, the big question is at what level of granularity would you like to do this locking? Per logged-in user, for all users, or perhaps you could get away with locking per request?
Or, and this will be a tougher one, is the idea that a single page request grabs the lock and then that specific page is intended to keep the lock between requests? E.g. as a kind of reservation. I'm browsing a hotel page, and when I merely look at a room I have made an implicit reservation in the system for that room so it can't happen that somebody else reserves the room for real while I'm looking at it?
In the latter case, maybe the following scheme would work:
In application scope, define a global concurrent map.
Keys of the map represent the resources you want to protect.
Values of the map are a custom structure which hold a read write lock (e.g. ReentrantReadWriteLock), a token and a timestamp.
In application scope, there also is a single global lock (e.g. ReentrantLock)
Code in a request first grabs the global lock, and quickly checks if the entry in the map is there.
If the entry is there it is taken, otherwise it's created. Creation time should be very short. The global lock is quickly released.
If the entry was new, it's locked via its write lock and a new token and timestamp are created.
If the entry was not new, it's locked via its read lock
if the code has the same token, it can go ahead and access the protected resource, otherwise it checks the timestamp.
If the timestamp has expired, it tries to grab the write lock.
The write lock has a time-out. When the time-out occurs give up and communicate something to the client. Otherwise a new token and timestamp are created.
This just the general idea. In a Java EE application that I have build I have used something similar (though not exactly the same) and it worked quite well.
Alternatively you could use a quartz job anyway that periodically removed the stale entries. Yet another alternative for that is replacing the global concurrent map with e.g. a JBoss Cache or Infinispan instance. These allow you to define an eviction policy for their entries, which saves you from having to code this yourself. If you have never used those caches though, learning how to set them up and configuring them correctly can be more trouble than just building a simple quartz job yourself.

Related

KStreams: implementing session window with pocessor API

I need to implement a logic similar to session windows using processor API in order to have a full control over state store. Since processor API doesn't provide windowing abstraction, this needs to be done manually. However, I fail to find the source code for KStreams session window logic, to get some initial ideas (specifically regarding session timeouts).
I was expecting to use punctuate method, but it's a per processor timer rather than per key timer. Additionally SessionStore<K, AGG> doesn't provide an API to traverse the database for all keys.
[UPDATE]
As an example, assume processor instance is processing K1 and stream time is incremented which causes the session for K2 to timeout. K2 may or may not exist at all. How do you know that there exists a specific key (like K2 when stream time is incremented (while processing a different key)? In other words when stream time is incremented, how do you figure out which windows are expired (because you don't know those keys exists)?
This is the DSL code: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamSessionWindowAggregate.java -- hope it helps.
It's unclear what your question is though -- it's mostly statements. So let me try to give some general answer.
In the DSL, sessions are close based on "stream time" progress. Only relying on the input data makes the operation deterministic. Using wall-clock time would introduce non-determinism. Hence, using a Punctuation is not necessary in the DSL implementation.
Additionally SessionStore<K, AGG> doesn't provide an API to traverse the database for all keys.
Sessions in the DSL are based on keys and thus it's sufficient to scan the store on a per-key basis over a time range (as done via findSessions(...)).
Update:
In the DSL, each time a session window is updated, as corresponding update event is sent downstream immediately. Hence, the DSL implementation does not wait for "stream time" to advance any further but publishes the current (potentially intermediate) result right away.
To obey the grace period, the record timestamp is compared to "stream time" and if the corresponding session window is already closed, the record is skipped (cf. https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamSessionWindowAggregate.java#L146). I.e., closing a window is just a logical step (not an actually operation); the session will still be stored and if a window is closed no additional event needs to be sent downstream because the final result was sent downstream in the last update to the window already.
Retention time itself must not be handled by the Processor implementation because it's a built-in feature of the SessionStore: internally, the session store maintains so-called "segments" that store sessions for a certain time period. Each time a put() is done, the store checks if old segments can be dropped (based on the timestamp provided by put()). I.e., old sessions are deleted lazily and as bulk deletes (i.e., all session of the whole segment will be deleted at once) as it's more efficient than individual deletes.

Compensating Events on CQRS/ES Architecture

So, I'm working on a CQRS/ES project in which we are having some doubts about how to handle trivial problems that would be easy to handle in other architectures
My scenario is the following:
I have a customer CRUD REST API and each customer has unique document(number), so when I'm registering a new customer I have to verify if there is another customer with that document to avoid duplicity, but when it comes to a CQRS/ES architecture where we have eventual consistency, I found out that this kind of validations can be very hard to address.
It is important to notice that my problem is not across microservices, but between the command application and the query application of the same microservice.
Also we are using eventstore.
My current solution:
So what I do today is, in my command application, before saving the CustomerCreated event, I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right? Because my query can be desynchronized, so I cannot trust it 100%. That's when my second validation kicks in, when my query application is processing the events and saving them to my PostgreSQL, I check again if there is a customer with that document and if there is, I reject that event and emit a compensating event to undo/cancel/inactivate the customer with the duplicated document, therefore finishing that customer stream on eventstore.
Altough this works, there are 2 things that bother me here, the first thing is my command application relying on the query application, so if my query application is down, my command is affected (today I just return false on my validation if query is down but still...) and second thing is, should a query/read model really be able to emit events? And if so, what is the correct way of doing it? Should the command have some kind of API for that? Or should the query emit the event directly to eventstore using some common shared library? And if I have more than one view/read? Which one should I choose to handle this?
Really hope someone could shine a light into these questions and help me this these matters.
For reference, you may want to be reviewing what Greg Young has written about Set Validation.
I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right?
That's exactly right - your read model is stale copy, and may not have all of the information collected by the write model.
That's when my second validation kicks in, when my query application is processing the events and saving them to my PostgreSQL, I check again if there is a customer with that document and if there is, I reject that event and emit a compensating event to undo/cancel/inactivate the customer with the duplicated document, therefore finishing that customer stream on eventstore.
This spelling doesn't quite match the usual designs. The more common implementation is that, if we detect a problem when reading data, we send a command message to the write model, telling it to straighten things out.
This is commonly referred to as a process manager, but you can think of it as the automation of a human supervisor of the system. Conceptually, a process manager is an event sourced collection of messages to be sent to the command model.
You might also want to consider whether you are modeling your domain correctly. If documents are supposed to be unique, then maybe the command model should be using the document number as a key in the book of record, rather than using the customer. Or perhaps the document id should be a function of the customer data, rather than being an arbitrary input.
as far as I know, eventstore doesn't have transactions across different streams
Right - one of the things you really need to be thinking about in general is where your stream boundaries lie. If set validation has significant business value, then you really need to be thinking about getting the entire set into a single stream (or by finding a way to constrain uniqueness without using a set).
How should I send a command message to the write model? via API? via a message broker like Kafka?
That's plumbing; it doesn't really matter how you do it, so long as you are sure that the command runs within its own transaction/unit of work.
So what I do today is, in my command application, before saving the CustomerCreated event, I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right? Because my query can be desynchronized, so I cannot trust it 100%.
No, you cannot safely rely on the query side, which is eventually consistent, to prevent the system to step into an invalid state.
You have two options:
You permit the system to enter in a temporary, pending state and then, eventually, you will bring it into a valid permanent state; for this you could allow the command to pass, yield CustomerRegistered event and using a Saga/Process manager you verify against a uniquely-indexed-by-document-collection and issue a compensating command (not event!), i.e. UnregisterCustomer.
Instead of sending a command, you create&start a Saga/Process that preallocates the document in a uniquely-indexed-by-document-collection and if successfully then send the RegisterCustomer command. You can model the Saga as an entity.
So, in both solution you use a Saga/Process manager. In order for the system to be resilient you should make sure that RegisterCustomer command is idempotent (so you can resend it if the Saga fails/is restarted)
You've butted up against a fairly common problem. I think the other answer by VoicOfUnreason is worth reading. I just wanted to make you aware of a few more options.
A simple approach I have used in the past is to create a lookup table. Your command tries to register the key in a unique constraint table. If it can reserve the key the command can go ahead.
Depending on the nature of the data and the domain you could let this 'problem' occur and raise additional events to mark it. If it is something that's important to the business/the way the application works then you can deal with it either manually or at the time via compensating commands. if the latter then it would make sense to use a process manager.
In some (rare) cases where speed/capacity is less of an issue then you could consider old-fashioned locking and transactions. Admittedly these are much better suited to CRUD style implementations but they can be used in CQRS/ES.
I have more detail on this in my blog post: How to Handle Set Based Consistency Validation in CQRS
I hope you find it helpful.

SQLAlchemy session: how to keep it alive?

I have a session object that gets passed around a whole lot and at some point the following lines of code are called (this is unavoidable):
import transaction
transaction.commit()
This renders the session unusable (by closing it I think).
My question is two part:
How do I check if a session is still alive and well?
Is there a quick way to revitalize a dead session?
For 2: The only way I currently know is to use sqlalchemy.orm.scoped_session, then call query(...)get(id) many times to recreate the necessary model instances but this seems pretty darn inefficient.
EDIT
Here's an example of the sequence of events that causes the error:
modelInstance = DBSession.query(ModelClass).first()
import transaction
transaction.commit()
modelInstance.some_relationship
And here is the error:
sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <CategoryNode at 0x7fdc4c4b3110> is not bound to a Session; lazy load operation of attribute 'children' cannot proceed
I don't really want to turn off lazy loading.
EDIT
DBSession.is_active seems to be no indication of whether or not the session is in fact alive and well in this case:
transaction.commit()
print(DBSession.is_active)
this prints True...
EDIT
This seemed too big for a comment so I'm putting it here.
zzzeek said:
"An expired object will automatically load new state from the database, via the Session, as soon as you access anything on it, so there's no need to tell the Session to do anything here."
So how do I get stuff committed in such a way that this will happen? calling transaction.commit is wrong, what's the correct way?
so the first thing to observe here is "import transaction" is a package called zope.transaction. this is a generic transaction that takes hold of any number of sub-tasks, of which the SQLAlchemy Session is one of them, via the zope.sqlalchemy extension.
What zope.sqlalchemy here is going to do is call the begin()/rollback()/commit() methods of the Session itself, in response to it's own management of the "transaction".
The Session itself works in such a way that it is almost always ready for use, even if its internal transaction has been committed. When this happens, the Session upon next use just keeps going, either starting a new transaction if it's in autocommit=False, or if autocommit=True it continues in "autocommit" mode. Basically it is auto-revitalizing.
The one time that the Session is not able to proceed is if a flush has failed, and the rollback() method has not been called, which, when in autocommit=False mode, the Session would like you do to explicitly when flush() fails. To see if the Session is in this specific state, the session.is_active property will return False in that case.
I'm not 100% sure what the implications are of continuing to use the Session when zope.transaction is in use. I think it depends on how you're using zope.transaction in the bigger scheme.
Which leads us where lots of these questions do, which is what are you really trying to do. Like, "recreate the necessary model instances" is not something the Session does, unless you are referring to existing instances which have been expired (their guts emptied out). An expired object will automatically load new state from the database, via the Session, as soon as you access anything on it, so there's no need to tell the Session to do anything here.
It's of course an option to even turn off auto-expiration entirely, but that you are even arriving at a problem here implies something is not working as it should. Like there's some error message you're getting. More detail would be needed to understand exactly what the issue you're having is.

Plone 4.2 how to make PAS cache external usera data

I'm implementing a PAS plugin that handles authentications against mailservers. Actually only DBMail is implemented.
I realized, that the enumerateUsers function from the PAS plugin is called numerous times per request and requires my plugin to open/close an SQL connections for every (subsequent) request. Of course, this is very expensive.
The connections itself are handled in a plone tool, which is able to handle multiple different mailservers and delegeates the enumerateUsers call to wrapper objects that represent registered servers.
My question is now, what sort of cache (OOBTree, Session?) I should use to provide a temporary local storage for repeating enumerations and avoid subsequent SQL connections?
Another idea was, to hook into the user creation process that takes place on the first login, an external user issues and completely "localize" the users.
Third idea was, to store the needed data in the specific member, if possible.
What would be best practice here?
I'd cache the query results, indeed. You need to make a decision on how long to cache the results, and if stored long term, how to invalidate that cache or check for changes.
There are no best practices for these decisions, as they depend entirely on the type of data stored and the APIs of the backends. If they support some kind of freshness query, for example, then you store everything forever and poll the backend to see if the cache needs updating.
You can start with a simple request cache; query once per request, store it on the request object. Your cache will automatically be invalidated at the end of the request as the request object is cleaned up, the next request will be a clean slate.
If your backend users rarely change, you can cache information for longer, in a local cache. I'd use a volatile attribute on the plugin. Any attribute starting with _v_ is ignored by the persistence machinery. Thus, anything stored in a _v_ volatile attribute is both thread-local and only exists for the lifetime of the process, a restart of the server clears these automatically.
At the very least you should use an _v_ volatile attribute to store your backend SQL connections. That way they can stay open between requests, and can be re-used. Something like the following method would do nicely:
def _connection(self):
# Return a backend connection
if getattr(self, '_v_connection', None) is None:
# Create connection here
self._v_connection = yourdatabaseconnection
return self._v_connection
You could also use a persistent attribute on your plugin to store your cache. This cache would be committed to the ZODB and persist across restarts. You then really need to work out how to invalidate the contents; store timestamps and evict data when to old, etc.
Your cache datastructure depends entirely on your application needs. If you don't persist information, a dictionary (username -> information) could be more than enough. Persisted caches could benefit from using a OOBTree instead of a dictionary as they reduce chances of conflicts between different threads and are more efficient when it comes to large sets of data.
Whatever you do, you do not need to use a Session. Sessions are prone to conflicts, do not scale well, and are in any case not the place to store a cache of this kind.

Coldfusion: is it better to keep just the user_id in the session, or the whole user object?

I've got a cfc to handle the user object. My question is: is it better to store just the user_id in the session and create the user object anew with each request? Or is is better to store the whole user object in the session?
Here are my thoughts either way:
If I store the whole object in the session:
There will be potentially less processor overhead
There will be potentially more memory overhead
all of the methods/functions are stored in the actual object, and new functions that I update in the cfc will not be available unless users logout and back in, or if I devise some way to make it refresh itself.
There could potentially be mutex or lock problems if I'm messing with the object via concurrent ajax calls
If I store just the user_id in the session:
I'll have to create the user object with each page request (potentially more processor overhead)
There will be potentially less memory overhead
There won't be a chance for mutex/lock/race conditions since each request will have its own copy of the user object
Updates to the CFC model itself will be immediately recognized across the system and users wouldn't have to log out and back in
Is there a normal practice for this sort of thing? Am I over-thinking it?
All of the CF apps I've written were targeted at high traffic levels and high availability, so we never had the luxury of being able to think about single-server practices.
So, in my experience, I always had to a) allow for multiple load-balanced servers, and b) avoid sticky-sessions on the load balancer for a number of reasons. Therefore, we needed to, at the very least, have a server become part of a cluster on the fly and pick up mid-session traffic.
So, we always pulled "session" data from a shared datastore on every request.
My suggestion is to implement a session facade.
This affords you the option to change how you persist session data (like the user record) without changing the rest of your app.
You can choose, behind the scenes, to store everything in the session scope, load it up for every request, do a hybrid, use a key-value store, whatever.
You can choose whether to eager-load data, or lazy-load data, or any mix in between, and the rest of the app doesn't need to be aware of what you've done.
On Race Conditions
If you're concerned about race conditions then I would suggest using named locks around data commit and access. This is another bonus of using a facade - your application code doesn't need to know about this, and you can choose to put locks around certain objects, as opposed to locking the whole session.
You haven't indicated whether you're using an ORM, so this is a general answer.
For typical applications, I recommend instantiating the user object into the session scope. There's a big downside to creating the object anew with each request that you didn't include in your list: changes to the user object's properties and state will not persist across requests unless you intend to flush the user object's state to your persistence layer (e.g. database) on every hit. That is likely to be a much more expensive operation than object instantiation, and it doesn't necessarily insulate you from the kinds of problems you're thinking about with respect to ajax calls, race conditions, etc -- it just transfers the manifestation of those problems to the persistence layer, where your object's data could be in an unpredictable state.
Since every new request would be an "implicit save", you would also have to design your "ephemeral" object to be able to persist itself regardless of whether it's in a valid state (imagine the case of a multi-page form that modifies some aspect of the user object).
For session-stored objects, your concerns about memory can be mitigated by careful design practices. For instance, if your user has many tasks, and each task has many items, it might be a bad idea to instantiate and compose all those objects into your user object (i.e., lazy loading would be a better approach than eager loading).
If you really must to be able to change your CFCs on the fly, you can achieve that goal even with session-stored objects. One way is to store a version flag in both the application and session. With each request, your app would compare those flags. When they differ, the app would run a session-reload routine that snapshots current properties, rebuilds the session-stored objects, and finally updates the session flag to match the application flag.
This is piggy-backing partially off Ken Redler's answer but I don't have enough reputation to comment.
The way we do it, and the way I prefer, is to store the user data in Session as a struct. Then on request start, our Auth Model creates the user object in the Request scope and overrides any default values with the Session data. There are a few advantages to this:
Less hits to the database, less CPU
Always run newest code without a complex custom system ensuring that
Clustered environment friendly (complex objects in Session can't be clustered)
Can add or remove properties without corruption (assuming your User object only updates dirty columns)
Also, if you're using CF9, one of the features they were really proud of is how much they optimized object instantiation. If you haven't, test it yourself!
It depends.
If you have a lot of traffic - in the thousands of unique visitors per minute range - the memory overhead of storing your User.cfc in the session will eventually weigh you down. This can be easily overcome by throwing hardware at it (more memory for a while, eventually more servers and a hardware load balancer). Of course popularity is a good problem to have.
If you seem to have a CPU, network or other bottleneck in your database space, you may want to have the object cached in session memory so that you have fewer hits to the database.
Why do I mention these scenarios? You may be prematurely optimizing - don't fix a problem that you don't have. Don't optimize your memory, CPU and database access until those are, or soon will be, problems.
Now from an architectural best practice - not from an optimized "what's best for my processor" - well, I can only say: It depends.
Truthfully, neither way is wrong. If you are going to find yourself needing to check credentials against your database on every request, don't cache it. If you like the feel of an object in the session, then cache it. Because you know your own domain, you can probably go back and forth all day on why you should or should not cache the user object in the session. If it's going to make it easier, do it. If it's going to make it harder, don't.
I would just warn you against doing something incredibly convoluted or anything that is not immediately obvious to a developer looking at your application - the more you write, the more you have to maintain forever, the more your co-workers will associate your name with evil.
Finally, last note, if this is a vote - I say you cache it. It makes sense and always feels good to call session.user.hasRole("xyz") or the like.

Resources