How do web application sessions work when running on more than one server?

This is a general question about how web sessions work across multiple servers. My knowledge of web sessions is not very deep, but as far as I know a web session is typically stored directly in the memory of the running web server application, so when a request comes in it doesn't have to make database requests to fetch the session data. If a popular website needs multiple servers to handle the level of traffic it is receiving, I assume an incoming request could get directed to any of the servers by some load balancer. But how does the server handling that request get the associated session data if the previous request was handled by a different server? Do multi-server sites require special session-handling infrastructure, or do the load balancers somehow know to route requests from the same client to the same server?

This question on ServerFault is the same as this one and has a good answer. In overview, there are 3 common methods:
Session information stored in cookies only
Load balancer always directs user to the same machine
Shared backend database or key/value store.
See the link for more in-depth details of each.
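As a rough sketch of the third method (shared key/value store), here is what session handling might look like with a Redis instance and the Jedis client; the host, TTL, and key names are illustrative:

```java
import redis.clients.jedis.Jedis;
import java.util.UUID;

// Sketch of method 3: sessions live in a shared Redis store, so any app
// server can load the session and the load balancer is free to route each
// request anywhere.
public class SharedSessionStore {
    private static final int SESSION_TTL_SECONDS = 30 * 60;     // illustrative TTL
    private final Jedis redis = new Jedis("redis-host", 6379);  // hypothetical host

    // Called on login: create the session server-side, hand the id to the
    // client as a cookie value.
    public String createSession(String userId) {
        String sessionId = UUID.randomUUID().toString();
        redis.setex("session:" + sessionId, SESSION_TTL_SECONDS, userId);
        return sessionId;
    }

    // Called on every request, on whichever server receives it.
    public String lookupUser(String sessionId) {
        return redis.get("session:" + sessionId); // null if expired or unknown
    }
}
```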

Related

If requests are sent through JMeter, GlassFish clustering does not segregate requests to different servers

The application server is set up as a cluster in GlassFish. I have sent requests through JMeter and all the requests hit only one server. The expected behavior was that requests would be distributed to multiple servers in the cluster; if I send requests manually, clustering works. Please help me sort out this issue.
There could be different cluster load-balancing mechanisms; as far as I can see from the GlassFish Server High Availability Administration Guide:
Cookie Method
The Loadbalancer Plug-In uses a separate cookie to record the route information. The HTTP client (typically, the web browser) must support cookies to use the cookie-based method. If the HTTP client is unable to accept cookies, the plug-in uses the following method.
Explicit URL Rewriting
The sticky information is appended to the URL. This method works even if the HTTP client does not support cookies. To implement explicit URL rewriting, the application developer must use HttpResponse.encodeURL() and encodeRedirectURL() calls to ensure that any URLs in the application have the session information appended to them.
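As a rough illustration of that requirement (not GlassFish-specific), a servlet would wrap every URL it emits in encodeURL() so the container can append the session id when cookies are unavailable; the paths below are illustrative:

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Explicit URL rewriting: every URL emitted by the app passes through
// encodeURL()/encodeRedirectURL() so the container can append
// ;jsessionid=... when the client does not accept cookies.
public class CartServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        req.getSession(true); // ensure a session exists
        PrintWriter out = resp.getWriter();
        // "/checkout" is an illustrative path within the same web app
        out.println("<a href=\"" + resp.encodeURL("/checkout") + "\">Checkout</a>");
        // redirects must use encodeRedirectURL() for the same reason:
        // resp.sendRedirect(resp.encodeRedirectURL("/login"));
    }
}
```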
So depending on your Load Balancer configuration you need to
Either define different cookies in the HTTP Cookie Manager
Or make sure different threads send requests to different URLs i.e. via HTTP URL Re-writing Modifier
In any case, it is recommended to add a DNS Cache Manager so that each virtual user resolves the underlying IP address of the application under test on its own.

User state in a big (high-traffic) application

Assumptions -
There are 4 servers sitting behind a reverse proxy which acts as a load balancer
Load Balancer is purely load balancing and sends a request to any of the 4 servers depending on their current load
Users need to be authenticated to access this application, and something must hold the state of all users, since the reverse proxy is only load balancing
Application needs to scale beyond 4 servers, say to 4000 servers.
Question -
In a large-scale multi-server system, who holds the state of all the users: the load balancer, each server, or a separate server?
Is the state of all users saved on all servers so that the load balancer can send a request to any server? How does this scale to 100m users?
You can use sticky sessions. They enable the load balancer to bind a user's session to a specific instance, which ensures that all requests from the user during the session are sent to the same instance. Read Sticky and NON-Sticky sessions.
Also, if the instance gets killed for some reason, then in order to maintain state, the authentication token and other information can be saved in a separate Redis cache, which is much faster to query. Read Session Management in microservices.
In a stateless multi-server system, a separate server (authentication server) or a separate server cluster (authentication API) holds the state of all users. If it's a single authentication server for a large application, you can expect it to have RAM in the range of hundreds of GBs, maybe more.
No, the state of all users is usually not replicated on all application servers; that would be a huge waste of resources. The authentication server (or server cluster) may act as a load balancer itself or forward all requests to a separate load balancer. This is true for a stateless application.
In a stateful application, individual servers hold the state of users through sticky sessions.
If possible, try to keep your application stateless. A stateless application will have better performance and will be easier to scale out than a stateful application!
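To visualize the stateless option, here is a minimal sketch where the session data travels in the cookie itself, signed with an HMAC key distributed to every instance; all names are illustrative, and real session cookies would carry more than a user id:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

// Stateless session sketch: the user id travels in the cookie itself,
// protected by an HMAC, so no application server has to hold session state
// and the load balancer can route requests anywhere.
public class StatelessSessionCookie {
    private final byte[] key; // the same secret is distributed to every instance

    public StatelessSessionCookie(byte[] key) { this.key = key; }

    // Issued at login; the value goes into the session cookie.
    public String issue(String userId) throws Exception {
        return userId + "." + sign(userId);
    }

    // Any instance can verify the cookie; returns the user id or null.
    public String verify(String cookie) throws Exception {
        int dot = cookie.lastIndexOf('.');
        if (dot < 0) return null;
        String userId = cookie.substring(0, dot);
        byte[] expected = sign(userId).getBytes(StandardCharsets.UTF_8);
        byte[] actual = cookie.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        // constant-time comparison to avoid leaking the MAC byte by byte
        return MessageDigest.isEqual(expected, actual) ? userId : null;
    }

    private String sign(String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
    }
}
```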

Sticky and NON-Sticky sessions

I want to know the difference between sticky and non-sticky sessions. What I understood after reading on the internet:
Sticky: only a single session object will be there.
Non-sticky session: a session object for each server node
When your website is served by only one web server, for each client-server pair, a session object is created and remains in the memory of the web server. All the requests from the client go to this web server and update this session object. If some data needs to be stored in the session object over the period of interaction, it is stored in this session object and stays there as long as the session exists.
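In a Java servlet container, for example, that per-client session object is the HttpSession; a minimal sketch of the single-server case just described:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Single-server case: the session object lives in this server's memory and
// every request from the same client updates it.
public class VisitCounterServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        HttpSession session = req.getSession(true); // created on the first request
        Integer visits = (Integer) session.getAttribute("visits");
        visits = (visits == null) ? 1 : visits + 1;
        session.setAttribute("visits", visits);     // stays in memory between requests
        resp.getWriter().println("Visits this session: " + visits);
    }
}
```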
However, if your website is served by multiple web servers sitting behind a load balancer, the load balancer decides which actual (physical) web server each request should go to. For example, if there are 3 web servers A, B and C behind the load balancer, it is possible that one request to www.mywebsite.com is served from server A, the next from server B, and yet another from server C.
Now, if the requests are being served from 3 (physically) different servers, each server has created a session object for you, and because these session objects sit on three independent boxes, there's no direct way for one to know what is in the session object of the others. In order to synchronize these server sessions, you may have to write/read the session data into a layer common to all of them, like a DB. But writing and reading data to/from a DB for this use case may not be a good idea. This is where sticky sessions come in.
If the load balancer is instructed to use sticky sessions, all of your interactions will happen with the same physical server, even though other servers are present. Thus, your session object will be the same throughout your entire interaction with this website.
To summarize: in the case of sticky sessions, all your requests will be directed to the same physical web server, while a non-sticky load balancer may choose any web server to serve your requests.
As an example, you may read about Amazon's Elastic Load Balancer and sticky sessions here : http://aws.typepad.com/aws/2010/04/new-elastic-load-balancing-feature-sticky-sessions.html
I've written an answer with some more details here: https://stackoverflow.com/a/11045462/592477. Or you can read it below:
When you use load balancing, it means you have several instances of Tomcat and you need to divide the load.
If you're using session replication without sticky sessions: imagine you have only one user using your web app and 3 Tomcat instances. This user sends several requests to your app; the load balancer will send some of these requests to the first Tomcat instance, some others to the second instance, and others to the third.
If you're using sticky sessions without replication: imagine you have only one user using your web app and 3 Tomcat instances. This user sends several requests to your app; the load balancer will send the first user request to one of the three Tomcat instances, and all the other requests sent by this user during his session will go to the same Tomcat instance. During these requests, if you shut down or restart this Tomcat instance (the one being used), the load balancer sends the remaining requests to another Tomcat instance that is still running. BUT, as you don't use session replication, the Tomcat instance that receives the remaining requests doesn't have a copy of the user's session, so for this Tomcat the user begins a new session: the user loses his session and is disconnected from the web app, although the web app is still running.
If you're using sticky sessions WITH session replication: imagine you have only one user using your web app and 3 Tomcat instances. This user sends several requests to your app; the load balancer will send the first user request to one of the three Tomcat instances, and all the other requests sent by this user during his session will go to the same Tomcat instance. During these requests, if you shut down or restart this Tomcat instance (the one being used), the load balancer sends the remaining requests to another Tomcat instance that is still running. As you use session replication, that Tomcat instance has a copy of the user's session, so the user keeps his session: he continues to browse your web app without being disconnected, and the shutdown of the Tomcat instance doesn't impact his navigation.
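For reference, the minimal switches that enable this replicated behavior in Tomcat are the default cluster element in server.xml and the distributable flag in web.xml, per Tomcat's clustering documentation; real deployments usually tune membership and replication settings beyond these defaults:

```xml
<!-- server.xml: enable the default all-to-all in-memory session replication -->
<Engine name="Catalina" defaultHost="localhost">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>
</Engine>

<!-- web.xml: mark the application's sessions as replicable -->
<distributable/>
```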
Let's say the user sends a request to get their profile; there won't be anything in the memory of our web application instance. We get the user profile from the DB, but before sending the response, we save the data in the memory of, let's say, Instance3. But the next request from the same user can go to any instance.
When the request first comes to Instance3, a session with a session id is created there. When the response is sent to the client, the client is supplied with a cookie. So the next time this client makes a request, the cookie is attached to the request; the load balancer looks at the cookie and knows the request has to be forwarded to Instance3. This is the sticky-session solution. Its downside: what if Instance3 goes down? The load balancer will route requests to the other instances, but they do not have the cached state, so all the users whose state was stored on Instance3 will experience high latency. This impacts the reliability of your system.
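A toy sketch of the routing decision a sticky load balancer makes; the instance names and the round-robin fallback are illustrative:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy sticky load balancer: if the request carries a valid route cookie,
// stick to that instance; otherwise assign one round-robin (the response
// would then set the cookie so later requests stay on that instance).
public class StickyBalancer {
    private final List<String> instances; // e.g. ["Instance1", "Instance2", "Instance3"]
    private final AtomicInteger next = new AtomicInteger();

    public StickyBalancer(List<String> instances) { this.instances = instances; }

    // routeCookie is the value of e.g. a "ROUTEID" cookie, or null on the first request
    public String pick(String routeCookie) {
        if (routeCookie != null && instances.contains(routeCookie)) {
            return routeCookie; // stick to the assigned instance
        }
        // first request, or the assigned instance is gone: pick a new one
        return instances.get(Math.floorMod(next.getAndIncrement(), instances.size()));
    }
}
```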
If you instead store every session on all instances, you run into memory issues. Say an instance can store 100 user sessions and you have 3 instances: without replication you could store 300 sessions, but if each instance stores every session, you can store only 100 sessions across all 3 instances. So this impacts the scalability of your application.
Sticky and non-sticky sessions are both forms of stateful replication. If you want higher scalability, you can avoid caching anything on your web application instances; your instances will then hit the DB on every request, but this causes high latency.
A better way is stateless replication, where you do not store anything on your application instances but instead use server-side caching (Memcached/Redis).
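A sketch of that cache-aside pattern against a shared Redis cache, again assuming the Jedis client; the profile is reduced to a plain string for brevity, and the host and TTL are illustrative:

```java
import redis.clients.jedis.Jedis;

// Cache-aside against a shared Redis cache: any instance can serve the user,
// and a restarted instance loses nothing, because the cache lives outside
// the application servers.
public class ProfileService {
    private final Jedis cache = new Jedis("redis-host", 6379); // hypothetical host

    public String getProfile(String userId) {
        String cached = cache.get("profile:" + userId);
        if (cached != null) {
            return cached; // cache hit: no DB round trip
        }
        String profile = loadProfileFromDb(userId);
        cache.setex("profile:" + userId, 600, profile); // illustrative 10-minute TTL
        return profile;
    }

    private String loadProfileFromDb(String userId) {
        // stand-in for the real database query
        return "profile-of-" + userId;
    }
}
```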

Authenticating a client-side web service request in a cached environment

We're building a set of external web services to be consumed client-side (using jquery/AJAX) by visitors to our site. The web services need to be publicly available but we'd like to limit access to site visitors.
Importantly, the site in question sits behind a CDN and we cache page content for 24 hours; AJAX requests would preferably be cached as well, but I'm conscious that doing so will limit our authentication options. Our visitors access the site and services anonymously.
What are some standard "patterns" for authenticating client requests? I'm not dealing with confidential data per se, but I do want to deter other users/sites from hijacking these services for liability (think data distribution) and performance reasons.
I'm thinking of a shared secret that's refreshed daily and used site-wide by all clients; any web service request would include the secret. Pretty basic, but are there other, better ways for the service to detect the caller's origin in a manner that can't be spoofed?
If the threat to your web service is someone automating the client calls, you can implement rate limiting on the server side. As you rightly mentioned, the client can be required to provide a key with each request. Alternatively, if only mortals are going to interact with the web service, you can implement a Human Interaction Proof like a captcha. One thing to make sure of is that the "key" used by clients is handed out in a controlled manner: I once came across a system which basically gave away unlimited keys, which made the automation control ineffective, since an attacker could request as many keys as needed and make unlimited calls. If you are limiting by IP address, make sure that you throttle requests on the network part of the IP address (A.B.C.x), as the host part (x) can change (when users are behind proxies). If your clients are anonymous, the best/closest "identifier" is indeed the address.
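As a sketch of that rate-limiting suggestion, keyed on the network part of the IP as described; the fixed one-minute window and the limit are illustrative, and a production limiter would more likely use a token bucket or sliding window:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Server-side rate limiting keyed on the network part of the client IP
// (A.B.C), so hosts behind the same proxy pool share one budget.
// Assumes dotted IPv4 addresses.
public class NetworkRateLimiter {
    private static final int MAX_REQUESTS_PER_MINUTE = 60; // illustrative limit
    private final Map<String, int[]> windows = new ConcurrentHashMap<>();
    private long windowStart = System.currentTimeMillis();

    public synchronized boolean allow(String clientIp) {
        long now = System.currentTimeMillis();
        if (now - windowStart > 60_000) { // start a fresh window each minute
            windows.clear();
            windowStart = now;
        }
        // drop the host part (X) and count per network part (A.B.C)
        String network = clientIp.substring(0, clientIp.lastIndexOf('.'));
        int[] count = windows.computeIfAbsent(network, k -> new int[1]);
        return ++count[0] <= MAX_REQUESTS_PER_MINUTE;
    }
}
```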

When would you need multiple servers to host one web application?

Is that called "clustering" of servers? When a web request is sent, does it go through the main server, and if the main server can't handle the extra load, does it forward the request to secondary servers that can handle it? Also, is one "server" that's up and running the application called an "instance"?
[...] Is that called "clustering" of servers?
Clustering is indeed transparently using multiple nodes that are seen as a single entity: the cluster. Clustering allows you to scale: you can spread your load over all the nodes and, if you need more power, you can add more nodes (short version). Clustering also allows you to be fault tolerant: if one node (physical or logical) goes down, the other nodes can still process requests and your service remains available (short version).
When a web request is sent, does it go through the main server, and if the main server can't handle the extra load, then it forwards it to the secondary servers that can handle the load?
In general, this is the job of a dedicated component called a "load balancer" (hardware or software) that can use many algorithms to balance the requests: round-robin, FIFO, LIFO, load-based...
In the case of EC2, you previously had to load balance with round-robin DNS and/or HAProxy. See Introduction to Software Load Balancing with Amazon EC2. But for some time now, Amazon has offered load balancing and auto-scaling (beta) as part of its EC2 offering. See Elastic Load Balancing.
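To make two of the algorithms mentioned above concrete, a toy sketch; the backend names are illustrative and a non-empty pool is assumed:

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Two of the balancing algorithms named above, in toy form.
public class Balancing {
    // Round-robin: cycle through the pool regardless of load.
    static class RoundRobin {
        private final List<String> pool;
        private final AtomicInteger next = new AtomicInteger();
        RoundRobin(List<String> pool) { this.pool = pool; }
        String pick() {
            return pool.get(Math.floorMod(next.getAndIncrement(), pool.size()));
        }
    }

    // Load-based: pick the backend with the fewest active connections.
    record Backend(String name, int activeConnections) {}
    static String leastLoaded(List<Backend> backends) {
        return backends.stream()
                .min(Comparator.comparingInt(Backend::activeConnections))
                .map(Backend::name)
                .orElseThrow(); // assumes the pool is non-empty
    }
}
```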
Also, is one "server" that's up and running the application called an "instance"?
Actually, an instance can be many things (depending on who's speaking): a machine, a virtual machine, a server (software) up and running, etc.
In the case of EC2, you might want to read Amazon EC2 Instance Types.
Here is a real example:
This specific configuration is hosted at RackSpace in their Managed Colo group.
Requests pass through a Cisco firewall. They are then routed across a gigabit LAN to a Cisco CSS 11501 Content Services Switch (i.e., a load balancer). The load balancer matches the incoming content to a content rule, handles the SSL decryption if necessary, and then forwards the traffic to one of several back-end web servers.
Every 5 seconds, the load balancer requests a URL on each web server. If a web server fails (two times in a row, IIRC) to respond with the correct value, that server is not sent any traffic until the URL starts responding correctly.
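A sketch of that health-check behavior; the /health path, the 5-second schedule, and the two-failure threshold mirror the description above and are otherwise illustrative:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Poll each backend periodically and take it out of rotation after two
// consecutive failures; a success puts it straight back in.
public class HealthChecker {
    private static final int FAILURES_TO_EJECT = 2;
    private final HttpClient http = HttpClient.newHttpClient();
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();

    public boolean isHealthy(String backend) {
        return failures.getOrDefault(backend, 0) < FAILURES_TO_EJECT;
    }

    // Invoked every 5 seconds per backend by a scheduler (not shown).
    public void probe(String backend) {
        try {
            HttpRequest req = HttpRequest.newBuilder(URI.create(backend + "/health")).build();
            HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
            if (resp.statusCode() == 200) {
                failures.put(backend, 0); // healthy again: back into rotation
                return;
            }
        } catch (Exception ignored) {
            // fall through: connection errors count as failures
        }
        failures.merge(backend, 1, Integer::sum);
    }
}
```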
Further behind the web servers is a MySQL master/slave configuration. Connections may be made to the master (for transactions) or to the slaves for read-only requests.
Memcached is installed on each of the web servers, with 1 GB of RAM dedicated to caching. Each web application may utilize the cluster of memcache servers to cache all kinds of content.
Deployment is handled using rsync to sync specific directories on a management server out to each web server. Apache restarts, etc., are handled through similar scripting over ssh from the management server.
The amount of traffic that can be handled through this configuration is significant. The advantages of easy scaling and easy maintenance are great as well.
For clustering, any web request would be handled by a load balancer which, being kept updated on the current load of the servers forming the cluster, sends the request to the least burdened server. As for whether it's called an "instance": I believe so, but I'd wait for confirmation on that.
You'd need a very large application to be bothered with thinking about clustering and the "fun" that comes with it, software- and hardware-wise, though. Unless you're looking to start, or are already running, something big, it wouldn't be anything to worry about.
Yes, it can be required for clustering. Typically, as the load goes up you might find yourself with a frontend server that does URL rewriting, HTTPS if required, and caching with, say, Squid. The requests get passed on to multiple backend servers, probably using cookies to associate a session with a particular backend if necessary. You might have the database on a separate server as well.
I should add that there are other reasons why you might need multiple servers; for instance, there may be a requirement that the database not be on the frontend server for security reasons.
