haproxy: What are its uses? - performance

I got information about HAProxy from the Stack Exchange Gives Back 2014 page. Stack Overflow is also using this excellent application. After visiting the HAProxy website, I found that it is used as a load balancer. So,
Does it work like F5 (reverse proxy), and can it replace F5?
Can someone list all its features and similar competitor applications?

You may check this list of alternative load-balancing tools.
Cloud providers (Amazon, Rackspace, Google Compute Engine, SoftLayer, etc.), but also some dedicated/VM server providers, usually offer cheap load balancing as a service.
HAProxy currently seems to be one of the most popular open-source applications for reverse proxying, load balancing and failover. Versions 1.5+ support SSL offloading, but it still doesn't support content caching (F5 does).
The usual configuration combines it with a content-caching system and/or a CDN, and with the latest version you no longer even need a separate SSL offloader. It offers a nice set of load-balancing algorithms (round robin, leastconn, IP hash, etc.); apart from their website, check this nicely formatted link.
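As a minimal sketch of such a setup - SSL offloading on the frontend plus a leastconn backend, as supported in 1.5+ - something like the following; the hostnames, addresses and certificate path are purely illustrative:

```
frontend https-in
    bind *:443 ssl crt /etc/haproxy/site.pem   # SSL offloading (1.5+)
    mode http
    default_backend webservers

backend webservers
    mode http
    balance leastconn            # or roundrobin, source (IP hash), etc.
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

The `check` keyword enables health checks, which is what gives you the failover behavior mentioned above.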

Related

HTTP session management when using Nginx as a "round robin" load balancer?

I'm trying to load-balance two web servers (running Apache/PHP) by putting Nginx in front of them. I need to use the round-robin algorithm, but when I do, I can't keep sessions stable.
(I understand that with round robin, the session information is lost as soon as the next request hits the other server.)
Is there a proper way to achieve this? Any advice on the industry standard for this, please?
FYI, I have already put these two web servers into a GlusterFS cluster, so I have common storage (in case you want to suggest something based on that).
The nginx manual says that session affinity (the "sticky" directive) is in the commercial distribution only. If you don't use the commercial distribution, you'll have to grab a third-party module and rebuild the server with its support
(searching for "sticky" should help you find the third-party add-ons).
If there isn't any specific reason for using round robin, you can try the ip_hash load-balancing mechanism:
upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
If there is the need to tie a client to a particular application server — in other words, make the client’s session “sticky” or “persistent” in terms of always trying to select a particular server — the ip-hash load balancing mechanism can be used.
Please refer to the nginx load-balancing documentation for more information.

Sitecore with DMS vs caching server - how do you handle it?

We're planning to introduce DMS to our customer's Sitecore installation. It's a rather popular site in our country and we have to use a caching proxy server (Nginx in this case) to make it high-traffic-proof.
However, as far as we know, it's not possible to use all the DMS features with a caching proxy enabled - for example, personalization of content: if it gets cached, it won't be personalized.
Is there a way to make use of all the DMS features with proxy cache turned on? If not, how do you handle this problem for high-traffic sites - is it buying more Content Delivery servers to carry the load, or extending current server with better hardware (RAM, CPU, bandwidth)?
You might try moving away from your proxy caching for some pages, or even all.
There's no reason not to use a CDN for static assets and media library assets, so stick with that.
Leverage Sitecore's built-in html cache for sublayouts/renderings - there are quite a few options for caching
Use Sitecore's Debug feature to track down the slowest components on your site
Consider using indexes instead of doing "fast" or Sitecore queries
Don't do descendants query "//*" (I often see this when calculating selected state for navigation - hint: go the other way, calculate the ancestors of the current page)
@jammykam wrote an excellent answer on this over here.
John West wrote a great blog post on this also, though a bit older.
Good luck!
I've been wondering about this myself.
I have been thinking of implementing an ajax web service that:
- talks to the DMS and returns JSON
- allows you to render the personalised components client side
- allows you to trigger analytics events
I have been googling around and I haven't found anyone that has done it and published the information yet. The only place I have found something similar is actually in the mobile sdk, but I haven't had a chance to delve into it yet.
I have also not been able to use proxy server caching and DMS together successfully. For extremely high loads, I have recommended to clients to follow the standard optimization and scaling guidelines, especially architecting for proper Sitecore sublayout and layout caching for as much of the site as possible. With that caching done, follow it up by distributing across multiple Content Delivery nodes with load balancing to help support high volume with personalization at the same time.
I've heard that other CMS's with personalization use a javascript approach to load the personalized content on the client-side, but I would be worried about losing track of the analytics data that is gathered when personalized content is loaded and interacted with.
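As a rough sketch of that client-side idea (the endpoint and function names here are hypothetical illustrations, not part of Sitecore's API): the proxy-cached page ships generic markup, and a small script fetches JSON from a personalization service and renders the components in the browser.

```javascript
// Sketch: render personalized components fetched as JSON from a
// hypothetical /api/personalize endpoint. The render helpers are pure,
// so the cached page stays generic and only this script produces
// per-visitor markup.

function renderComponent(component) {
  // component: { id: "...", html: "..." } as returned by the service
  return '<div data-component="' + component.id + '">' + component.html + '</div>';
}

function applyPersonalization(components) {
  // Map of placeholder id -> rendered markup; in the browser you would
  // assign each entry to document.getElementById(id).innerHTML, then
  // fire the corresponding analytics event so the DMS data is not lost.
  const out = {};
  for (const c of components) {
    out[c.id] = renderComponent(c);
  }
  return out;
}

// In the page (endpoint name is hypothetical):
// fetch('/api/personalize?page=/products')
//   .then(r => r.json())
//   .then(json => applyPersonalization(json.components));
```

The analytics concern mentioned above is why the script should report back to the service after rendering, rather than relying on the (cached) page request itself.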

Why should one use a http server in front of a framework web server?

Web application frameworks such as Sinatra (Ruby), Play (Scala) and Lift (Scala) produce a web server listening on a specific port.
I know there are some reasons, like security, clustering and, in some cases, performance, that may lead me to put an Apache web server in front of my web application's one.
Do you have any reasons for this from your experience?
Part of any web application is fully standardized and commoditized functionality. Mature web servers like nginx or Apache can do the following things in a way that is very likely more correct, more efficient, more stable, more secure, more familiar to sysadmins, and easier to configure than anything you could rewrite in your application server.
Serve static files such as HTML, images, CSS, javascript, fonts, etc
Handle virtual hosting (multiple domains on a single IP address)
URL rewriting
hostname rewriting/redirecting
TLS termination (thanks @emt14)
compression (thanks @JacobusR)
A separate web server provides the ability to serve a "down for maintenance" page while your application server restarts or crashes
Reverse proxies can provide load balancing and fault tolerance for your application framework
Web servers have built-in and tested mechanisms for binding to privileged ports (below 1024) as root and then executing as a non-privileged user. Most web application frameworks do not do this by default.
Mature web servers are battle hardened and stable. By stable, I mean that they quite literally almost never crash. Your web application is almost certainly far less stable. This gives you the ability to at least serve a pretty error page to the user saying your application is down instead of the web browser just displaying a generic "could not connect" error.
Anecdotal case in point: nginx handles attack that would otherwise DoS node.js: http://blog.nodejs.org/2013/10/22/cve-2013-4450-http-server-pipeline-flood-dos/
And just in case you want the semi-official answer: at the Airbnb tech talk on January 30, 2013, around 40 minutes in, Isaac Schlueter addresses the question of whether node is stable and secure enough to serve connections directly to the Internet. His answer is essentially "yes, it is fine." So you can do it, and you will probably be fine from a stability and security standpoint (assuming you are using cluster to handle unexpected termination of an app server process), but as detailed above, the reality of current operations is that almost everybody still runs node behind a separate web server or reverse proxy/cache.
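To make the usual arrangement concrete, here is a minimal nginx sketch: nginx listens on port 80, serves static files itself, and proxies everything else to a framework server assumed to listen on 127.0.0.1:3000 (all names, paths and ports are illustrative):

```
server {
    listen 80;
    server_name example.com;

    # Static assets served directly by nginx, never touching the app
    location /static/ {
        root /var/www/app;
    }

    # Everything else goes to the framework's own web server
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Because nginx binds the privileged port, the framework server can run entirely as an unprivileged user on a high port, which addresses the port-binding point above.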
I would add:
SSL handling
for some servers, like Apache, lots of modules (e.g. NTLM/Kerberos authentication)
Web servers are much better than your application at some things, like serving static files.
Quite often the frameworks do everything you need, but sometimes adding a layer on top can give you seemingly free functionality like compression, security, session management, load balancing, etc. Still, adding a web server may also introduce security issues; for example, chances are your web server's security may be compromised more easily than Lift by itself. Also, some of the web frameworks are extremely scalable and may even be hampered by an ill-chosen web server.
In summary, if you require web server like functionality that is not provided by the framework, then a web server may be a very good option, but keep in mind that it's one more thing to configure properly and update regularly with security patches, etc.
If, for example, you just need encryption or compression, then you may find that adding the correct library or plug-in to your framework does just that (and only that).
With a proxy HTTP server in front, the framework doesn't need to keep an HTTP connection open while delivering the computed content, and can move on to serving other requests sooner. The proxy acts as a buffer.
It's an issue of reinventing the wheel. Most frameworks will give you a development environment but for production it's usually good practice to use a commercial/open source project that is able to deal with all issues that arise during production.
Guys building a framework have the framework to concentrate on, while guys building a server are doing just the same (perfecting the server).

MemoryCache object and load balancing

I'm writing a web application using ASP.NET MVC 3. I want to use the MemoryCache object, but I'm worried about it causing issues with load-balanced web servers. When I google it, it looks like that problem is solved on the server side, e.g. using AppFabric. If a company has load-balanced servers, is it on them to make sure they have AppFabric or something similar running? Or is there anything I can or should do as a developer about this?
First of all, for ASP.NET you should look at the ASP.NET Cache instead of MemoryCache. MemoryCache is a generic caching API that was introduced in .NET 4.0 to provide an equivalent of the ASP.NET Cache in non-web applications.
You're correct to say that AppFabric resolves the issue of multiple servers having their own instances of cached data, in that it provides a single logical cache accessible from all your web servers. Before you leap on it as the solution to your problem, there's a couple of things to consider:
It does not ship as part of Windows Server - it is, as you say, on you to install it on your servers if you want to use it. When AppFabric was released, there was a suggestion that it would ship as part of the next release of Windows Server, but I haven't seen anything about Windows Server 2012 that confirms that to be the case.
You need extra servers for it, or at least you're advised to have them. Microsoft's recommendation for AppFabric is that you run it on dedicated servers, which means that whilst AppFabric itself is a free download, you may be incurring additional Windows Server licence costs. Speaking of which...
You might need Enterprise Edition licences. If you want to use the High Availability features of AppFabric, you can only do this with servers running Enterprise Edition, which is a more expensive licence than Standard Edition.
You might not need it after all. Some of this will depend on your application and why you want to use a shared caching layer. If your concern is that caches on multiple servers could get out of sync with the database (or indeed each other), some judicious use of SqlCacheDependency objects might get you past the issue.
This CodeProject article Implementing Local MemoryCache Invalidation with Redis suggests an approach for handling the scenario you describe.
You didn't mention the flavor of load balancing that you are using: "sticky" or "stateless". By far the easiest solution is to use sticky sessions.
If you want to use local memory caches and stateless load balancing, you can end up with race conditions when the cross-server invalidation messages arrive late. This can be particularly problematic if you use the Post-Redirect-Get pattern so common in ASP.NET MVC. It can be overcome by using cookies to supplement the cache invalidation broadcasts. I detail this in a blog post here.
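A minimal sketch of that cookie-supplemented invalidation idea, with an in-memory bus standing in for the real broadcast channel (e.g. Redis pub/sub, as in the CodeProject approach); all names here are illustrative:

```javascript
// Bus stands in for the real cross-server broadcast channel.
class Bus {
  constructor() { this.subs = []; }
  subscribe(fn) { this.subs.push(fn); }
  publish(key) { this.subs.forEach(fn => fn(key)); }
}

// A per-server cache whose entries carry a version number.
class LocalCache {
  constructor(bus) {
    this.entries = new Map();                        // key -> { value, version }
    bus.subscribe(key => this.entries.delete(key));  // normal invalidation path
  }
  set(key, value, version) { this.entries.set(key, { value, version }); }
  // The client's cookie says "I last wrote version N". If our cached copy
  // is older, the broadcast must be late, so treat it as a miss.
  get(key, cookieVersion) {
    const e = this.entries.get(key);
    if (!e) return undefined;
    if (cookieVersion !== undefined && e.version < cookieVersion) {
      this.entries.delete(key);
      return undefined;
    }
    return e.value;
  }
}
```

In a Post-Redirect-Get flow, the POST handler bumps the version, publishes the invalidation and sets the cookie; if the subsequent GET lands on a server the broadcast has not reached yet, the cookie still forces a fresh read instead of serving stale data.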

Should I host Website and REST API on the same server or split?

I have a web application that consists of Website and REST API. Should I host them on the same server or should I host them on different servers? By "server" I mean a server cluster - several servers behind load balancer.
The API is mostly inbound traffic; the website, mostly outbound.
If it matters - hosted on Rackspace and/or AWS.
Here is what I see so far:
Benefits of having Website and REST API on the same server
Simple deployment
Simple scaling - something is slow - just launch another instance
Single load balancer configuration
Simple monitoring
Simple, simple, simple ...
Effective use of full duplex network (API - inbound, website - outbound)
Benefits of splitting
API overload will not affect website load time
Detailed monitoring (I will know which component uses resources at this moment)
Any comments?
Thank you
Alexander
Just as you stated, in most situations there are more advantages to hosting the API on the same server as the website, so I would stick with that option.
But if you predict a lot of traffic for either the website or the API, then maybe a separate server would be better suited.
If this is behind a load balancer, why don't you leave the services and pages on the same site and let the load balancer/cluster do its job?
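As a sketch of that idea: a single nginx load balancer can keep one public site while still routing API and website traffic to separate upstream pools later, if the need arises (names and addresses are illustrative):

```
upstream website { server 10.0.0.21:8080; }
upstream api     { server 10.0.0.31:8080; }

server {
    listen 80;

    # API requests go to their own pool...
    location /api/ {
        proxy_pass http://api;
    }
    # ...everything else to the website pool
    location / {
        proxy_pass http://website;
    }
}
```

This keeps the "simple, simple, simple" deployment while you only have one pool, and turning it into a real split is a config change rather than a re-architecture.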
Your list of advantages/disadvantages are operational considerations, but you should consider application needs as well.
Caching?
Security?
Other resources, e.g. the filesystem
These may or may not apply, but if your application architecture is different between the two, be sure to factor this into your decision.
