Multiple frontends, shared backend, and dealing with concurrent requests attached to one session

Let's say I have a simple architecture where sessions are shared through a database, with multiple frontends (say F1 and F2) speaking to the same backend.
My issue is the case where both frontends receive a request corresponding to the same session: a naive implementation would cause sessions to overwrite each other (I looked at Django, which seems to fall into that case). I could try to design the backend so that it guarantees that no more than one frontend can deal with a given session, but this seems hard to do correctly, especially if I want to handle frontend failures.
I can't help thinking that the case is pathological in the first place (there should not be more than one request for a given session at any time) and is not worth dealing with, but I don't have much experience in web development, so maybe I am missing something. How does one usually deal with this case?
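To make the race concrete, here is a minimal, self-contained Python sketch of the lost update with a whole-session read-modify-write store (the SessionStore class, the session id, and the field names are hypothetical, for illustration only):

```python
import copy

class SessionStore:
    # Toy whole-session store that loads and saves entire session blobs,
    # mimicking a database-backed session backend with no concurrency control.
    def __init__(self):
        self._rows = {"abc": {"cart": [1]}}

    def load(self, session_id):
        return copy.deepcopy(self._rows[session_id])

    def save(self, session_id, data):
        self._rows[session_id] = data  # last write wins

store = SessionStore()

# Both frontends load the same snapshot before either one saves...
f1_session = store.load("abc")
f2_session = store.load("abc")

f1_session["cart"].append(2)            # F1 adds an item to the cart
store.save("abc", f1_session)

f2_session["last_page"] = "/checkout"   # F2 records navigation
store.save("abc", f2_session)           # overwrites F1's save

print(store.load("abc"))                # {'cart': [1], 'last_page': '/checkout'}
                                        # F1's cart update is silently lost
```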
Possible solutions that I would like to avoid:
Sticky sessions: that's the solution I currently use; it is difficult to support once you have several load balancers, and more significantly it goes against the spirit of load balancing in the first place.
Putting data in cookies: for technical reasons outside my control, I cannot use cookies.

One common solution is known as session persistence: whatever routes your requests to F1 or F2 ensures that, as long as a session is active, the client with that session only goes to one frontend.
It is a common feature in almost all load balancers. For example, nginx has the ip_hash directive: http://wiki.nginx.org/NginxHttpUpstreamModule
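For illustration, a minimal nginx configuration using ip_hash might look like this (the upstream name and hosts are placeholders):

```nginx
upstream frontends {
    ip_hash;                  # pin each client IP to the same frontend
    server f1.example.com;
    server f2.example.com;
}

server {
    listen 80;
    location / {
        proxy_pass http://frontends;
    }
}
```

Note that ip_hash pins by client IP, so clients behind a shared proxy will all land on the same frontend.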

Related

Understanding web caching (Redis)

I'm working on a web app that receives data from an API provider. Now I need a way to cache the data to avoid calling the provider again for the same data.
Then I stumbled upon Redis, which seems to serve my purpose, but I'm not 100% clear about the concept of caching using Redis. I've checked their documentation, but I don't really follow what they have to say.
Let's suppose I have just deployed my website to live and I have my first visitor, A. Since A is the first person to visit, my website will request a new set of data from the API provider, and after a couple of seconds the page will be loaded with the data that A wanted.
My website caches this data to Redis to serve future visitors that will hit the same page.
Now I have my second visitor B.
B hits the same page URL as A did, and because my website has this data stored in the cache, B is served from the cache and experiences a much faster loading time than A did.
Is my understanding in line with the concept of web caching?
I always thought of caching on a per-user basis, so that my interactions on a website have no influence whatsoever on other people, but Redis seems to work on a per-application basis.
Yes, your understanding of web caching is spot on, but it can get much more complex, depending on your use case. Redis is essentially a key-value store. So, if you want application-level caching, your theoretical key/value pair would look like this:
key: /path/to/my/page
value: <html><...whatever...></html>
If you want user-level caching, your theoretical key would just change slightly:
key: visitorA|/path/to/my/page
value: <html><...whatever...></html>
Make sense? Essentially, there would be a tag in the key to define the user (but it would generally be a hash or something, not a plain-text string).
There are Redis client libraries written for different web-development frameworks and content-management systems that define how they handle caching (i.e. user-specific or application-specific). If you are writing a custom web app, then you can choose application-level caching or user-level caching and do whatever else you want with caching.
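For illustration, a minimal get-or-set pattern with the redis-py client might look like this (the key scheme, TTL, and fetch_from_api are stand-ins, not a prescribed design):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def fetch_from_api(path):
    # Stand-in for the real call to the API provider.
    return "<html>...rendered data for %s...</html>" % path

def get_page(path, user_id=None):
    # Application-level key by default; prefix with the user id
    # (or a hash of it) for user-level caching.
    key = "%s|%s" % (user_id, path) if user_id else path
    cached = r.get(key)
    if cached is not None:
        return cached.decode("utf-8")   # cache hit: no API call
    value = fetch_from_api(path)        # cache miss: call the provider
    r.setex(key, 3600, value)           # cache for an hour
    return value
```

Visitor A triggers the fetch_from_api call and populates the key; visitor B hitting the same path is then served straight from Redis.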

Should I make my CouchDB database server public-facing?

I'm new to CouchDB and am trying to understand how to make proper use of it. I'm coming from MongoDB, where I would always write a web layer and put it in front of Mongo so that I could allow users to access the data inside it. In fact, this is how I've used every database for every web site I've ever written. Looking at Couch, I see that its native API is HTTP and that it has built-in things like OAuth support and other features, which hints to me that perhaps I should no longer have my code layer sitting in front of Couch, but instead write views and simply give my users accounts on Couch. I'm thinking in terms of an HTTP-based API for a site of mine, or something through which users would consume my data. Opening up Couch like this seems odd to me, though. Is OAuth, in Couch's sense, meant more for remote access by software that I'd write and run internally on my own network "officially", or is it literally meant for end users?
I know there might be things that could only be done through a code layer on top of CouchDB, for instance if you wanted additional non-database-related things to occur during API requests. So, thinking along those lines, I suspect I will still need a code layer anyway.
Dealer's choice.
Nodejitsu has a great writeup on this sort of topic here.
Not knowing your application specifics I'll take a broad approach...
Back-end
If you want to prevent users from ever seeing your database, then keep it on the back end. You can pipe everything through something like Node.js and present only what the user needs to see; they'll never know anything about the database.
See Resource View Presenter
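As a rough sketch of such a code layer (Flask and requests here purely for illustration; the database URL, document shape, and exposed fields are assumptions):

```python
import requests
from flask import Flask, abort, jsonify

app = Flask(__name__)
COUCH_URL = "http://127.0.0.1:5984/mydb"   # CouchDB only reachable internally

@app.route("/articles/<doc_id>")
def get_article(doc_id):
    resp = requests.get("%s/%s" % (COUCH_URL, doc_id))
    if resp.status_code != 200:
        abort(404)
    doc = resp.json()
    # Expose only what the user needs to see; internal fields
    # (_rev, audit data, etc.) never leave the server.
    return jsonify({"title": doc.get("title"), "body": doc.get("body")})
```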
Front-end
If you are not concerned about data security, you can host an entire app on CouchDB; see CouchApp. This approach has the benefit of using the replication mechanism to control publishing your site/data. The drawback here is that you will almost certainly run into some technical limitations that will require moving CouchDB closer to the backend.
Bl-end
Have the app server present the interface and the client pull the data from the database separately. This gives the most flexibility, but it can be a bag of hurt: even with good design, it could lead to supportability and scalability issues.
My recommendation
Use CouchDB on the backend. If you need mobile clients to synchronize, then use a secondary DB, publicly exposed for this purpose, and selectively sync this data to wherever it needs to go.
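For illustration, a one-shot sync into such a public DB can be triggered through CouchDB's _replicate endpoint (hosts, database names, and the filter function are placeholders):

```python
import requests

# Push only the documents passing a (hypothetical) filter function
# from the internal DB to the publicly exposed one.
requests.post(
    "http://127.0.0.1:5984/_replicate",
    json={
        "source": "internal_db",
        "target": "http://public-host:5984/public_db",
        "filter": "public/visible",
    },
)
```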
Simply put, no.
There's no way to secure Couch properly on a public-facing site: there's no way to discriminate access at a fine enough level of granularity. If someone has access to any of the data, they have access to all of the data.
Not all data on a site is meant for public consumption, save for the most trivial of sites.

CQRS: what backend to use for my View-Store?

My CQRS-based architecture currently has 4 components. It is more of a prototype so nothing is set in stone yet.
CommandProcessor: gets commands, executes them, etc. (duh ^^), publishes events. Azure-based.
ViewProcessor: gets view requests and responds with the view; subscribes to events to update the view store. Azure-based.
WebClient: AJAX-heavy web portal; sends commands and requests (JSON) views. Azure-based.
DesktopClient: not much to say; also sends commands and requests views (undecided whether JSON or some other format). Obviously not Azure-based.
My original approach was to use an in-memory view store. Azure VMs have quite a bit of memory available, and I didn't really see the need to add the complexity of Blob Storage, etc.
Additionally, I am trying to minimize the command-execution latency to at least partially get around the whole asynchronous UI problem so that I can (where needed) simulate a synchronous UI with (fast) callbacks (I hope that sentence made sense ^^).
In creating the web client, I noticed a potential flaw in my plan. The URL of the ViewProcessor is obviously different from the WebClient URL, so JSON requests would fail because of the same-origin policy. Alternatives/workarounds like JSONP did not seem that attractive because they don't solve the inherent security problem. Implementing the AJAX requests to target the WebClient itself would be an option, but then I would have redundant functionality (a view store in both the WebClient and the ViewProcessor).
I guess saving the views in Blob Storage would solve this problem, but I can't shake the feeling that I am overlooking something important/obvious.
Client --command--> CommandProcessor
CommandProcessor --event--> ViewProcessor
ViewProcessor --view--> Blob
(ViewProcessor or CommandProcessor) --notification--> Client
Blob --view--> Client
That scenario would have quite a bit of latency :|
I would look again at the blob storage option. We store serialized view objects in blob storage, and it is very fast and stable. Is there some aspect of blob storage that concerns you?
Erick
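For illustration, reading and writing serialized views with today's azure-storage-blob Python SDK might look roughly like this (the connection string, container, and blob naming are placeholders):

```python
import json
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("views")

def save_view(view_id, view):
    # The ViewProcessor serializes the view and overwrites the blob.
    blob = container.get_blob_client(view_id)
    blob.upload_blob(json.dumps(view), overwrite=True)

def load_view(view_id):
    # Clients (or the web tier) fetch the serialized view by id.
    blob = container.get_blob_client(view_id)
    return json.loads(blob.download_blob().readall())
```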

What are the benefits of a stateless web application?

It seems some web architects aim to have a stateless web application. Does that mean basically not storing user sessions? Or is there more to it?
If it is just the user session storing, what is the benefit of not doing that?
Reduces memory usage. Imagine if Google stored session information about every one of their users.
Easier to support server farms. If you need session data and you have more than one server, you need a way to sync that session data across servers. Normally this is done using a database.
Reduces session-expiration problems. Sometimes expiring sessions cause issues that are hard to find and test for. Sessionless applications don't suffer from these.
URL linkability. Some sites store the ID of what the user is looking at in the session. This makes it impossible for users to simply copy and paste the URL or send it to friends.
NOTE: session data is really cached data. This is what it should be used for. If you have an expensive query whose result is going to be reused, then save it into the session. Just remember that you cannot assume it will be there when you try to get it later: always check whether it exists before retrieving it.
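A minimal sketch of that defensive pattern (session is any dict-like session object; run_expensive_query is a placeholder):

```python
def get_report(session, run_expensive_query):
    # Treat the session as a cache: recompute if the entry has vanished
    # (expired, evicted, or the user arrived with a fresh session).
    report = session.get("report")
    if report is None:
        report = run_expensive_query()   # the expensive, reusable result
        session["report"] = report       # best-effort cache only
    return report
```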
From a developer's perspective, statelessness can help make an application more maintainable and easier to work with. If I know a website I'm working on is stateless, I need not worry about things being correctly initialized in the session before loading a particular page.
From a user's perspective, statelessness allows resources to be linkable. If a page is stateless, then when I link a friend to that page, I know that they'll see what I'm seeing.
From the scaling and performance perspective, see tster's answer.

Client-side caching in Rich Internet Applications

I'm starting to step into unfamiliar territory with regards to performance improvement and our RIA (Rich Internet Application) built with GWT. For those unfamiliar with GWT, essentially when deployed it's just pure JavaScript. We're interfacing with the server side using a REST-style XML web service via XMLHttpRequest.
Our XML is unmarshalled into JavaScript objects and used within the application to represent the data model behind the interface. When changes occur, the model is updated and marshalled back to XML and sent back to the server.
I've learned that the number one rule of performance (in terms of user experience) is to make as few requests as possible. Obviously this brings up the possibility of caching. Caching is great for static data, but things get tricky in a multi-user system where data on the server may be changing. Also, using "Last-Modified" and "If-Modified-Since" headers doesn't quite do enough, since we'd like to avoid unnecessary requests altogether.
I'm trying to figure out whether caching data in the browser is even right for us before researching the approaches. I hope someone has trodden this path before. I'm looking for similar approaches, lessons learned, things to avoid, etc.
I'm happy to provide more specific info if needed...
For GWT, if performance matters that much to you, you get better performance by sending all the data you need in a single request instead of making multiple small queries. I would recommend against client-side data caching, as there are lots of issues, like keeping the data in sync with the database.
Besides, you already have a good advantage with GWT over traditional HTML apps. Unless you are dealing with special data (e.g., data that does not become stale too quickly, which implies mostly-read queries), I have found that there is no special need for client-side caching. You are better off doing service-layer caching, since most of the time should be spent in server-side processing.
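As an illustration, service-layer caching with a simple time-to-live could look like this (the key scheme and TTL are arbitrary):

```python
import time

_cache = {}   # query key -> (expiry timestamp, result)

def cached_query(key, compute, ttl_seconds=60):
    # Return a cached result while it is still fresh, else recompute.
    # Suitable for mostly-read queries that don't go stale too quickly;
    # writes elsewhere can still leave the cache briefly out of date.
    now = time.time()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]                   # fresh hit
    result = compute()                    # e.g. the real database query
    _cache[key] = (now + ttl_seconds, result)
    return result
```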
If you can provide more details about the nature of the app, maybe some different conclusions can be taken.
