HTTP session management while using Nginx as a round-robin load balancer?

I'm trying to load-balance two web servers (running Apache/PHP) by putting Nginx in front of them. I need to use the round-robin algorithm, but when I do, I can't keep sessions stable.
(I understand that with round robin the session information is lost as soon as the next request lands on the other server.)
Is there a proper way to achieve this? Any advice on the industry-standard approach, please?
FYI, I have already put these two web servers into a GlusterFS cluster, so I have common storage (in case you are going to suggest something based on that).

The nginx manual says that session affinity (the "sticky" directive) is available in the commercial distribution only. If you don't use the commercial distribution, you'll have to grab a third-party module and rebuild the server with support for it.
(Searching for "sticky" should help you find the third-party add-ons.)

If there isn't any specific reason for using round robin, you can try the ip_hash load balancing mechanism:
upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
If there is the need to tie a client to a particular application server — in other words, make the client’s session “sticky” or “persistent” in terms of always trying to select a particular server — the ip-hash load balancing mechanism can be used.
Please refer to the nginx load balancing documentation for more information.
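To make the difference between the two strategies concrete, here is a rough Python sketch. It is not nginx's implementation: the server names are copied from the config above, `pick_ip_hash` is a hypothetical helper, and the MD5-based hash is only a stand-in. The one detail taken from the nginx documentation is that ip_hash keys on the first three octets of an IPv4 address.

```python
import hashlib
from itertools import cycle

SERVERS = ["srv1.example.com", "srv2.example.com", "srv3.example.com"]


def pick_ip_hash(client_ip: str, servers=SERVERS) -> str:
    """Map a client IPv4 address to a fixed server (simplified stand-in for ip_hash)."""
    # Key on the first three octets, so a whole /24 network lands on one server.
    key = ".".join(client_ip.split(".")[:3])
    digest = hashlib.md5(key.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


# Round robin hands out servers in turn, regardless of who is asking.
round_robin = cycle(SERVERS)
rr_picks = [next(round_robin) for _ in range(4)]

# The hash-based picker gives the same client the same server every time.
sticky_picks = {pick_ip_hash("203.0.113.7") for _ in range(4)}
```

With round robin, four consecutive requests from one client touch all three servers, which is why sessions stored locally on each server appear to get lost.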

Related

How to prevent being affected by data-center DDoS attacks & maintenance-related downtime?

I'm hosting a web application which should be highly available. I'm hosting on multiple Linodes and using a NodeBalancer to distribute the traffic. My question might be stupidly simple, but not long ago I was affected by a DDoS hitting the data center. That made me think about how I can be better prepared the next time this happens.
The NodeBalancer and servers are all in the same data center, which should, of course, be fixed. But how does one go about doing this? If I have two load balancers in two different data centers, how can I set up the domain to point to both but ignore the one affected by the DDoS? Should I look into the DNS manager? Am I making things too complicated?
I would really appreciate some insights.
Thanks everyone...
You have to look at ways to load balance across data centers. There are a few ways to do this, each with pros and cons.
If you have a lot of DB calls, running two data centers hot can introduce a lot of latency problems. What I would do is as follows.
Have the second data center (DC2) be a warm location. It is configured so that everything works and is constantly receiving data from the master DB in DC1, but it isn't actively getting traffic.
Use a service like CloudFlare for their extremely fast DNS switching. Have a service in DC2 that constantly pings the load balancer in DC1 to make sure that everything is up and well. When it has trouble contacting DC1, it can connect to CloudFlare via the API and switch the main 'A' record to point to DC2, which then picks up the traffic.
I forget what CloudFlare calls it, but it has a DNS feature that allows you to switch 'A' records almost instantly because the actual IP address given to the public is their own; they just route the traffic for you.
Amazon also has a similar feature with CloudFront, I believe.
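The watchdog described above could be sketched roughly as follows in Python. Everything here is an assumption for illustration: `make_watchdog`, `check_dc1` and `switch_to_dc2` are hypothetical names, the actual health probe and DNS API call are left as callbacks because the answer does not specify them, and requiring several consecutive failures before switching is an added safeguard against flapping, not something the answer prescribes.

```python
from typing import Callable


def make_watchdog(check_dc1: Callable[[], bool],
                  switch_to_dc2: Callable[[], None],
                  max_failures: int = 3) -> Callable[[], bool]:
    """Return a tick() function meant to be called on a timer from DC2."""
    state = {"failures": 0, "failed_over": False}

    def tick() -> bool:
        if state["failed_over"]:
            return True  # already switched; nothing more to do
        if check_dc1():
            state["failures"] = 0  # DC1 answered, reset the counter
        else:
            state["failures"] += 1
            if state["failures"] >= max_failures:
                switch_to_dc2()  # hypothetical: update the 'A' record via the DNS API
                state["failed_over"] = True
        return state["failed_over"]

    return tick


# Simulated run: DC1 answers once, then stops responding.
probe_results = iter([True, False, False, False])
dns_changes = []
tick = make_watchdog(lambda: next(probe_results),
                     lambda: dns_changes.append("A -> DC2"))
outcomes = [tick() for _ in range(4)]
```

In a real deployment the probe would be an HTTP or TCP check against the DC1 load balancer, and switching back to DC1 would most likely remain a manual decision.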
This plan is costly, however, as you're running much more infrastructure that rarely gets used. Linode is rolling out, and will continue to roll out, more network improvements, so hopefully this becomes less necessary.
For more advanced load balancing and HA, you can go with more "cloud" providers, but it does come at a cost.
-Ricardo
Developer Evangelist, CircleCI, formerly Linode

PHP real time application

I have to create a little AJAX chat in my web application, and I'm dealing with the problem of real-time communication between a JavaScript client and a PHP server.
I want my JS client to be able to catch new messages from the server as quickly as possible. My first idea was to send an AJAX request, for example, every 5 seconds to see whether there are new messages.
However, I'm not sure what happens if my application is used by, for example, 1000 people; it must be a huge load on Apache httpd.
I also know about the technique called long polling, but when I tried it locally on my server, I completely brought down my Apache (I've read something about problems with Apache and long polling). The next approach I know about is WebSocket.
However, is it true that I have to be able to open a port on the web server to use it? On regular web hosting I think that's not possible, and I can't change any Apache/PHP settings on my hosting.
Do you have any suggestions for how to solve this?
If you want to use WebSockets, you had better have full control over your server, as you may need to start and stop the WebSocket daemon whenever necessary.
I wouldn't recommend using "regular web hosting" because of its restrictions.
I think you are looking for "virtual server providers", which give you full control over the server you manage. You should look at Amazon Web Services; there are many others as well.
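To put the polling concern from the question into numbers, here is a back-of-the-envelope calculation in Python. It assumes every client sends exactly one request per interval and that requests are spread evenly over time; `polling_requests_per_second` is a hypothetical helper name.

```python
def polling_requests_per_second(clients: int, interval_seconds: float) -> float:
    """Average request rate when every client polls once per interval."""
    return clients / interval_seconds


# Figures from the question: 1000 users, one poll every 5 seconds.
load = polling_requests_per_second(1000, 5)
```

That works out to an average of 200 requests per second, nearly all of which would return no new messages.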

Why should one use a http server in front of a framework web server?

Web application frameworks such as Sinatra (Ruby), Play (Scala), and Lift (Scala) produce a web server listening on a specific port.
I know there are some reasons, like security, clustering and, in some cases, performance, that may lead me to put an Apache web server in front of my web application's own server.
Do you have any reasons for this from your experience?
Part of any web application is fully standardized and commoditized functionality. Mature web servers like nginx or Apache can do the following things in a way that is very likely more correct, more efficient, more stable, more secure, more familiar to sysadmins, and easier to configure than anything you could rewrite in your application server.
Serve static files such as HTML, images, CSS, javascript, fonts, etc
Handle virtual hosting (multiple domains on a single IP address)
URL rewriting
hostname rewriting/redirecting
TLS termination (thanks #emt14)
compression (thanks #JacobusR)
A separate web server provides the ability to serve a "down for maintenance" page while your application server restarts or crashes
Reverse proxies can provide load balancing and fault tolerance for your application framework
Web servers have built-in and tested mechanisms for binding to privileged ports (below 1024) as root and then executing as a non-privileged user. Most web application frameworks do not do this by default.
Mature web servers are battle hardened and stable. By stable, I mean that they quite literally almost never crash. Your web application is almost certainly far less stable. This gives you the ability to at least serve a pretty error page to the user saying your application is down instead of the web browser just displaying a generic "could not connect" error.
Anecdotal case in point: nginx handles attack that would otherwise DoS node.js: http://blog.nodejs.org/2013/10/22/cve-2013-4450-http-server-pipeline-flood-dos/
And just in case you want the semi-official answer: in the Airbnb tech talk on January 30, 2013, around 40 minutes in, Isaac Schlueter addresses the question of whether node is stable and secure enough to serve connections directly to the Internet. His answer is essentially "yes", it is fine. So you can do it, and you will probably be fine from a stability and security standpoint (assuming you are using cluster to handle unexpected termination of an app server process), but as detailed above, the reality of current operations is that almost everybody still runs node behind a separate web server or reverse proxy/cache.
I would add:
SSL handling
for some servers like Apache, lots of modules (e.g. NTLM/Kerberos authentication)
Web servers are much better than your application at some things, like serving static files.
Quite often the frameworks do everything you need, but sometimes adding a layer on top can give you seemingly free functionality like compression, security, session management, load balancing, etc. Still, adding a web server may also introduce security issues; for example, your web server may be easier to compromise than Lift by itself. Also, some web frameworks are extremely scalable and may even be hampered by an ill-chosen web server.
In summary, if you require web-server-like functionality that is not provided by the framework, then a web server may be a very good option, but keep in mind that it's one more thing to configure properly and update regularly with security patches, etc.
If, for example, you just need encryption or compression, then you may find that adding the correct library or plug-in to your framework does just that (and only that).
With a proxy HTTP server in front, the framework doesn't need to keep an HTTP connection open while the computed content is delivered, so it can start serving another request. The proxy acts as a buffer.
It's an issue of reinventing the wheel. Most frameworks will give you a development environment, but for production it's usually good practice to use a commercial or open-source server that is able to deal with all the issues that arise in production.
The people building a framework have the framework to concentrate on, while the people building a server are doing the same for the server (perfecting it).

Should I host Website and REST API on the same server or split?

I have a web application that consists of a website and a REST API. Should I host them on the same server or on different servers? By "server" I mean a server cluster: several servers behind a load balancer.
The API is mostly inbound traffic; the website is mostly outbound.
If it matters, it is hosted on Rackspace and/or AWS.
Here is what I see so far:
Benefits of having Website and REST API on the same server
Simple deployment
Simple scaling - something is slow - just launch another instance
Single load balancer configuration
Simple monitoring
Simple, simple, simple ...
Effective use of full duplex network (API - inbound, website - outbound)
Benefits of splitting
API overload will not affect website load time
Detailed monitoring (I will know which component uses resources at this moment)
Any comments?
Thank you
Alexander
Just as you stated, in most situations there are more advantages to hosting the API on the same server as the website, so I would stick with that option.
But if you predict a lot of traffic for either the website or the API, then a separate server may be more suitable.
If this is behind a load balancer, why don't you leave the services and pages on the same site and let the load balancer/cluster do its job?
Your lists of advantages and disadvantages are operational considerations, but you should consider application needs as well.
Caching?
Security?
Other resources, e.g. the filesystem
These may or may not apply, but if your application architecture is different between the two, be sure to factor this into your decision.

1 A-record for every subdomain (10000+); any potential issues? Any other solution?

Most solutions I've read here for supporting subdomain-per-user at the DNS level point everything to one IP using *.domain.com.
That is an easy and simple solution, but what if I want to point the first 1000 registered users to serverA and the next 1000 registered users to serverB? This is our preferred solution, since it keeps down the software and hardware cost of clustering.
[Diagram quoted from the MS IIS site: http://learn.iis.net/file.axd?i=1101]
The most logical solution seems to be one A record per subdomain in the zone data files. BIND doesn't seem to have any size limit on zone data files; they are restricted only by available memory.
However, my team is worried about the latency of getting a new subdomain up and ready, since creating a new subdomain consists of inserting a new A record and restarting the DNS server.
Is the performance of restarting the DNS server something we should worry about?
Thank you in advance.
UPDATE:
It seems most of you suggest that I use a reverse proxy setup instead:
[Diagram: http://learn.iis.net/file.axd?i=1102] (ARR is IIS7's reverse proxy solution)
However, here are the cons I can see:
single point of failure
cannot strategically set up servers in different locations based on IP geolocation.
Use the wildcard DNS entry, then use load balancing to distribute the load between servers, regardless of what client they are.
While you're at it, skip the URL rewriting step and have your application determine which account it is based on the URL as entered (you can just as easily determine what X is in X.domain.com as in domain.com?user=X).
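As a rough illustration of determining X from X.domain.com, here is a minimal Python sketch that pulls the account name out of the HTTP Host header. `account_from_host` is a hypothetical helper, `domain.com` is the placeholder domain used above, and accepting only a single-label subdomain is an assumption about how accounts are named.

```python
from typing import Optional

BASE_DOMAIN = "domain.com"  # placeholder domain from the answer above


def account_from_host(host_header: str,
                      base_domain: str = BASE_DOMAIN) -> Optional[str]:
    """Return the account label for 'label.base_domain', or None if there is none."""
    # Drop an optional port, normalise case, and strip a trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + base_domain
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    # Accept exactly one label, so "a.b.domain.com" is rejected.
    if not label or "." in label:
        return None
    return label
```

A request for alice.domain.com would then be treated exactly like one for domain.com?user=alice, without any URL rewriting in the web server.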
EDIT:
Based on your additional info, you may want to develop a "broker" that stores which clients are to access which servers. Make that public facing then draw from the resources associated with the client stored with the broker. Your front-end can be load balanced, then you can grab from the file/db servers based on who they are.
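A "broker" of this kind could be as small as a lookup from user to back-end server. The Python sketch below uses the allocation rule from the question (first 1000 registered users on serverA, next 1000 on serverB). `server_for_user`, the 1-based registration number, and the error raised when capacity runs out are all assumptions for illustration; a real broker would more likely keep the mapping in a database.

```python
USERS_PER_SERVER = 1000
SERVERS = ["serverA", "serverB"]  # names taken from the question


def server_for_user(registration_number: int) -> str:
    """Map a 1-based registration number to the server holding that user's data."""
    if registration_number < 1:
        raise ValueError("registration numbers start at 1")
    index = (registration_number - 1) // USERS_PER_SERVER
    if index >= len(SERVERS):
        raise LookupError("no server provisioned for this user yet")
    return SERVERS[index]
```

With this in place the front end can sit behind a wildcard DNS entry and a load balancer, and only the file/db lookups need to know which server a given user lives on.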
The front-end proxy with a wild-card DNS entry really is the way to go with this. It's how big sites like LiveJournal work.
Note that this is not just a TCP-layer load balancer: there are plenty of solutions that'll examine the host part of the URL to figure out which back-end server to forward the query to. You can easily do it with Apache running on a low-spec server with suitable configuration.
The proxy ensures that each user's session always goes to the right back-end server and most any session handling methods will just keep on working.
Also the proxy needn't be a single point of failure. It's perfectly possible and pretty easy to run two or more front-end proxies in a redundant configuration (to avoid failure) or even to have them share the load (to avoid stress).
I'd also second John Sheehan's suggestion that the application just look at the left-hand part of the URL to determine which user's content to display.
If using Apache for the back-end, see this post too for info about how to configure it.
If you use tinydns, you don't need to restart the nameserver when you modify its database, and it should not be a bottleneck because it is generally very fast. I don't know whether it performs well with 10000+ entries, though (it would surprise me if it didn't).
http://cr.yp.to/djbdns.html

Resources