IIS cache with PURGE support - windows

On Unix, I normally deploy nginx in front of Varnish in front of my application server. Both nginx and Varnish are acting as reverse proxies here. Varnish maintains a cache and supports things like If-Modified-Since, Cache-Control response headers and PURGE requests from the application. nginx is good at receiving a lot of connections. I also use it to serve some static content, enable gzip compression etc.
On Windows, I can manage with Squid in front of IIS. I'm planning to deploy my (Python) application as an ISAPI wildcard filter (using the isapi-wsgi package), so the application will live in a thread pool managed by IIS.
However, Squid development on Windows appears to have stalled, and I'd prefer to keep IIS on port 80, so that I can serve certain things directly from disk. I also suspect IIS is more resilient in handling lots of connections than Squid on Windows.
What do people normally use here? One option would be to use another free-standing caching proxy in front of IIS. Another option may be something installed as an ISAPI filter, which would intercept requests and respond to things like If-Modified-Since, requets for images and other cached resources, and PURGE requests from the application.
Does such a thing exist? Or are the only real choices Squid and MS ISA (too expensive).
Cheers,
Martin

IIS7 with Application Request Routing (see http://www.iis.net/download/ApplicationRequestRouting) supports full proxy caching on the same box or with the cache server in front of your middle tier.
Once ARR is installed, to enable proxy caching from the command line run the following:
%windir%\System32\inetsrv\appcmd.exe set config -section:system.webServer/diskCache /+"[path='C:\MyCacheFolder',maxUsage='0']" /commit:apphost
To vary caching based on query string, execute the following:
%windir%\System32\inetsrv\appcmd.exe set config -section:system.webServer/proxy /cache.queryStringHandling:"Accept" /commit:apphost
See the documentation link above for more details. Notice that static and dynamic content can have different caching strategies, etc. If you pursue using this, follow up with specific questions--it can be a bit of a trick lining everything up if you're looking for fine-grained control.

Related

Is it better to use CORS or nginx proxy_pass for a RESTful client-server app?

I have a client-server app where the server is a Ruby on rails app that renders JSON and understands RESTful requests. It's served by nginx+passenger and it's address is api.whatever.com.
The client is an angular js application that consumes these services (whatever.com). It is served by a second nginx server and it's address is whatever.com.
I can either use CORS for cross subdomain ajax calls or configure the client' nginx to proxy_pass requests to the rails application.
Which one is better in terms of performance and less trouble for developers and server admins?
Unless you're Facebook, you are not going to notice any performance hit from having an extra reverse proxy. The overhead is tiny. It's basically parsing a bunch of bytes and then sending them over a local socket to another process. A reverse proxy in Nginx is easy enough to setup, it's unlikely to be an administrative burden.
You should worry more about browser support. CORS is supported on almost every browser, except of course for Internet Explorer and some mobile browsers.
Juvia uses CORS but falls back to JSONP. No reverse proxy setup.

Load balancing with nginx

I want to stop serving requests to my back end servers if the load on those servers goes above a certain level. Anyone who is already surfing the site will still get routed but new connection will be sent to a static server busy page until the load drops below a pre determined level.
I can use cookies to let the current customers in but I can't find information on how to to routing based on a custom load metric.
Can anyone point me in the right direction?
Nginx has an HTTP Upstream module for load balancing. Checking the responsiveness of the backend servers is done with the max_fails and fail_timeout options. Routing to an alternate page when no backends are available is done with the backup option. I recommend translating your load metrics into the options that Nginx supplies.
Let's say though that Nginx is still seeing the backend as being "up" when the load is higher than you want. You may be able to adjust that further by tuning the max connections of the backend servers. So, maybe the backend servers can only handle 5 connections before the load is too high, so you tune it only allow 5 connections. Then on the front-end, Nginx will time-out immediately when trying to send a sixth connection, and mark that server as inoperative.
Another option is to handle this outside of Nginx. Software like Nagios can not only monitor load, but can also proactively trigger actions based on the monitor it does.
You can generate your Nginx configs from a template that has options to mark each upstream node as up or down. When a monitor detects that the upstream load is too high, it could re-generate the Nginx config from the template as appropriate and then reload Nginx.
A lightweight version of the same idea could done with a script that runs on the same machine as your Nagios server, and performs simple monitoring as well as the config file updates.

How to build local web proxy without configuring the browsers

How does Netnanny or k9 Web Protection setup web proxy without configuring the browsers?
How can it be done?
Using WinSock directly, or at the NDIS or hardware driver level, and
then filter at those levels, just like any firewalls soft does. NDIS being the easy way.
Download this ISO image: http://www.microsoft.com/downloads/en/confirmation.aspx?displaylang=en&FamilyID=36a2630f-5d56-43b5-b996-7633f2ec14ff
it has bunch of samples and tools to help you build what you want.
After you mount or burn it on CD and install it go to this folder:
c:\WinDDK\7600.16385.1\src\network\ndis\
I think what you need is a transparent proxy that support WCCP.
Take a look at squid-cache FAQ page
And the Wikipedia entry for WCCP
With that setup you just need to do some firewall configuration and all your web traffic will be handled by the transparent proxy. And no setup will be needed on your browser.
netnanny is not a proxy. It is tied to the host machine and browser (and possibly other applications as well. It then filters all incoming and outgoing "content" from the machine/application.
Essentially Netnanny is a content-control system as against destination-control system (proxy).
Easiest way to divert all traffic to a certain site to some other address is by changing hosts file on local host
You might want to have a look at the explanation here: http://www.fiddlertool.com/fiddler/help/hookup.asp
This is how Fiddler2 achieves inserting a proxy in between most apps and the internet without modifying the apps (although lots of explanation of how-to failing the default setup). This does not answer how NetNanny/K9 etc work though, as noted above they do a little more and may be a little more intrusive.
I believe you search for BrowserHelperObjects. These little gizmos capture ALL browser communication, and as such can either remote ads from the HTML (good gizmo), or redirect every second click to a spam site (bad gizmo), or just capture every URL you type and send it home like all the WebToolBars do.
What you want to do is route all outgoing http(s) requests from your lan through a reverse proxy (like squid). This is the setup for a transparent web proxy.
There are different ways to do this, although I've only ever set it up OpenBSD and Linux; and using Squid as the reverse proxy.
At a high level you have a firewall with rules to send all externally bound http traffic to a local squid server. The Squid server is configured to:
accept all http requests
forward the requests on to the real external hosts
cache the reply
forward the reply back to the requestor on the local lan
You can then add more granular rules in Squid to control access to websites, filter content, etc.
I pretty sure you can also get this functionality in different networking gear. I bet F5 has some products that do some or all of what I described, and probably Cisco as well. There is probably other proxies out there besides Squid that you can use too.
PS. I have no idea if this is how K9 Web Protection or NetNanny works.
Squid could provide an intercept proxy for HTTP and HTTPs ports, without configuring the browsers and it also supports WCCP.

What is the fastest way(/make loading faster) to serve images and other static content on a site?

Ours is an e-commerce site with lots of images and flash(same heavy flash rendered across all pages). All the static content is stored and served up from the webserver(IHS clustered-2 nodes). We still notice that the image delivery is slow. Is this approach correct at all? What are the alternative ways of doing this, like maybe serving up images using a third party vendor or implementing some kind of caching?
P.S. All our pages are https. Could this be a reason?
Edit1: The images are served up from
https too so the alerts are a non
issue?
Edit2: The loading is slower on IE and
most of our users are IE. I am not
sure if browser specific styling could
be causing the slower IE loading?(We
have some browser specific styling for
IE)
While serving pages over HTTP may be faster (though I doubt https is not monstrously slow, for small files), a good lot of browsers will complain if included resources such as images and JS are not on https:// URL's. This will give your customers annoying popup notifications.
There are high-performance servers for static file serving, but unless your SSL certificate works for multiple subdomains, there are a variety of complications. Putting the high-performance server in front of your dynamic content server and reverse proxying might be an option, if that server can do the SSL negotiation. For unix platforms, Nginx is pretty highly liked for its reverse proxying and static file serving. Proxy-cache setups like Squid may be an option too.
Serving static content on a cloud like amazon is an option, and some of the cloud providers let you use https as well, as long as you are fine with using a subdomain of their domain name (due to technical limitations in SSL)
You could create a CDN. If your site uses HTTPS, you'll need a HTTPS domain too (otherwise you might get a non-secure items warning).
Also, if your site uses cookies (and most do), having a CDN on a different domain (or if you use www.example.com, on cdn.example.com) you could host them there and not have the cookie data in the HTTP request come in.

Any HTTP proxies with explicit, configurable support for request/response buffering and delayed connections?

When dealing with mobile clients it is very common to have multisecond delays during the transmission of HTTP requests. If you are serving pages or services out of a prefork Apache the child processes will be tied up for seconds serving a single mobile client, even if your app server logic is done in 5ms. I am looking for a HTTP server, balancer or proxy server that supports the following:
A request arrives to the proxy. The proxy starts buffering in RAM or in disk the request, including headers and POST/PUT bodies. The proxy DOES NOT open a connection to the backend server. This is probably the most important part.
The proxy server stops buffering the request when:
A size limit has been reached (say, 4KB), or
The request has been received completely, headers and body
Only now, with (part of) the request in memory, a connection is opened to the backend and the request is relayed.
The backend sends back the response. Again the proxy server starts buffering it immediately (up to a more generous size, say 64KB.)
Since the proxy has a big enough buffer the backend response is stored completely in the proxy server in a matter of miliseconds, and the backend process/thread is free to process more requests. The backend connection is immediately closed.
The proxy sends back the response to the mobile client, as fast or as slow as it is capable of, without having a connection to the backend tying up resources.
I am fairly sure you can do 4-6 with Squid, and nginx appears to support 1-3 (and looks like fairly unique in this respect). My question is: is there any proxy server that empathizes these buffering and not-opening-connections-until-ready capabilities? Maybe there is just a bit of Apache config-fu that makes this buffering behaviour trivial? Any of them that it is not a dinosaur like Squid and that supports a lean single-process, asynchronous, event-based execution model?
(Siderant: I would be using nginx but it doesn't support chunked POST bodies, making it useless for serving stuff to mobile clients. Yes cheap 50$ handsets love chunked POSTs... sigh)
What about using both nginx and Squid (client — Squid — nginx — backend)? When returning data from a backend, Squid does convert it from C-T-E: chunked to a regular stream with Content-Length set, so maybe it can normalize POST also.
Nginx can do everything you want. The configuration parameters you are looking for are
http://wiki.codemongers.com/NginxHttpCoreModule#client_body_buffer_size
and
http://wiki.codemongers.com/NginxHttpProxyModule#proxy_buffer_size
Fiddler, a free tool from Telerik, does at least some of the things you're looking for.
Specifically, go to Rules | Custom Rules... and you can add arbitrary Javascript code at all points during the connection. You could simulate some of the things you need with sleep() calls.
I'm not sure this method gives you the fine buffering control you want, however. Still, something might be better than nothing?
Squid 2.7 can support 1-3 with a patch:
http://www.squid-cache.org/Versions/v2/HEAD/changesets/12402.patch
I've tested this and found it to work well, with the proviso that it only buffers to memory, not disk (unless it swaps, of course, and you don't want this), so you need to run it on a box that's appropriately provisioned for your workload.
Chunked POSTs are a problem for most servers and intermediaries. Are you sure you need support? Usually clients should retry the request when they get a 411.
Unfortunately, I'm not aware of a ready-made solution for this. In the worst case scenario, consider developing it yourself, say, using Java NIO -- it shouldn't take more than a week.

Resources