Can apache's ProxyRemote be used to proxy HTTPS requests to mongrel for processing? - ruby

So I have a custom proxy that is written in ruby using mongrel to handle some fairly complex caching logic. This works great for both http and ftp requests, however since mongrel is not designed to handle https requests, I wish to front the whole thing with apache and make use of the ProxyRemote command to pass through to mongrel for https requests.
This sort of thing is easily accomplished to mirror certain site directory structures via the ProxyPass and ProxyPassReverse commands in apache, but I don't see a way to do this using ProxyRemote.
The problem is that mongrel does not handle CONNECT requests which are made to establish a secure request. So while I am able to handle https requests within the proxy itself, actually using the proxy with an https request directly is not supported.
It seems that the simplest solution would be to have apache handle the https request and then simply pass the http request itself (minus the CONNECT) to mongrel and have it handle it appropriately and return it to apache and then to the client.
So my question is, Is there a way to make ProxyRemote work the same way that ProxyPass does with HTTP requests (i.e. pass an unencrypted request to mongrel)?

Just use ProxyPass and ProxyPassReverse, the connection between your reverse proxy (apache) and your mongrel will see normal plain http :), no magic necessary (especially not CONNECT, afaik thats only possbile for forward proxies, but I'm not sure).

Hum, have you tried to do so ?
I've been using apache to do the https and just pass the requests with the old default .htaccess mod_rewrite rules.

Related

How to use forward proxy for HTTPS?

I have a use case where I have to put a middle server or relay or tunnel to do network communication with the following points:
I have a web server running, let say when I hit an API /request hosted my web server, it creates a post request to https://www.google.com and gives me a response through the endpoint.
I want a middle server (proxy etc.) which I will call while creating this post request instead of communicating through my webserver,
the call goes to the middle server and gives me the same response as I was getting directly.
For this, the SQUID proxy worked for me.
I came across NGINX, but we can not use NGINX as a forward proxy, also there are some observations that might be useful with this regard.
SQUID proxy also uses the conf file as similar to NGINX,
HTTPS traffic is encrypted, the proxy server need to do some more work to get something with Https requests,
For intercepting, and creating ACL rules, someone will need to have a dummy certificate to be used by the server to act as the owner of the requested content through the proxy,
a list of rules can be incorporated within SQUID.conf to achieve the filtering.
I hope this could be useful to achieve something like this.

URL Rewrite for Forwarding Proxy for HTTPS

Is it possible to create forwarding HTTPS Proxy (not reverse proxy) that would be able to:
block some urls based on the url regexp (ads, flash, movies, ...)
cache images based on the url regexp
It seems to me that in the usual case it is impossible because the HTTPS stream is encrypted and there's no way to process or alter it.
But, this case is special, it is a proxy for the web crawler, I don't need HTTPS at all, but some sites allow access via HTTPS only, and I have to somehow support it.
So, maybe it would be possible to do something like that?
Crawler --http--> Proxy --https--> Site
So, the proxy would be able to decode HTTPS stream and post-process it. Would it work? Is there any docs or details about such approach?
Pretty sure Apache 2.2 provides this functionality with mod_proxy in conjunction with mod_ssl and mod_cache.
Note: blocking is done using the 'ProxyBlock' directive in mod_proxy.

How can a web page send a message to the local network

Our web application has a button that is supposed to send data to a server on the local network that in turn prints something on a printer.
So far it was easy: The button triggered an AJAX POST request to http://printerserver/print.php with a token, that page connected to the web application to verify the token and get the data to print and then printed.
However, we are now delivering our web application via HTTPs (and I would rather not go back to HTTP for this) and newer versions of Chrome and Firefox don't make the request to the HTTP address anymore, they don't even send the request to check CORS headers.
Now, what is a modern alternative to the cross-protocol XHR? Do Websockets suffer from the same problem? (A Google search did not make clear what is the current state here.) Can I use TCP Sockets already? I would rather not switch to GET requests either, because the action is not idempotent and it might have practical implications with preloading and caching.
I can change the application on the printerserver in any way (so I could replace it with NodeJS or something) but I cannot change the users' browsers (to trust a self-signed certificate for printerserver for example).
You could store the print requests on the webserver in a queue and make the printserver periodically poll for requests to print.
If that isn't possible I would setup a tunnel or VPN between the webserver and printserver networks. That way you can make the print request from the webserver on the server-side instead of the client. If you use curl, there are flags to ignore invalid SSL certificates etc. (I still suspect it's nicer to introduce a queue anyway, so the print requests aren't blocking).
If the webserver can make an ssh connection to something on the network where the printserver is on, you could do something like: ssh params user#host some curl command here.
Third option I can think of, if printserver can bind to for example a subdomain of the webserver domain, like: print.somedomain.com, you may be able to make it trusted by the somedomain.com certificate, IIRC you have to create a CSR (Certificate Signing Request) from the printserver certificate, and sign it with the somedomain.com certificate. Perhaps it doesn't even need to be a subdomain for this per se, but maybe that's a requirement for the browser to do it client-side.
The easiest way is to add a route to the webapp that does nothing more than relay the request to the print server. So make your AJAX POST request to https://myapp.com/print, and the server-side code powering that makes a request to http://printerserver/print.php, with the exact same POST content it received itself. As #dnozay said, this is commonly called a reverse proxy. Yes, to do that you'll have to reconfigure your printserver to accept (authenticated) requests from the webserver.
Alternatively, you could switch the printserver to https and directly call it from the client.
Note that an insecure (http) web-socket connection on a secure (https) page probably won't work either. And for good reason: generally it's a bad idea to mislead people by making insecure connections from what appears to them to be a secure page.
The server hosting the https webapp can reverse proxy the print server,
but since the printer is local to the user, this may not work.
The print server should have the correct CORS headers
Access-Control-Allow-Origin: *
or:
Access-Control-Allow-Origin: https://www.example.com
However there are pitfalls with using the wildcard.
From what I understand from the question, printserver is not accessible from the web application so the reverse proxy solution won't work here.
You are restricted from making requests from the browser to the printserver by cross-origin-policy.
If wish to communicate with the printserver from an HTTPS page you will need the printserver to expose print.php as HTTPS too.
You could create a DNS A record as a subdomain of your web application that resolves to the internal address of your printserver.
With those steps in place you should be able to update your printserver page to respond with permissive CORS headers which the browser should then respect. I don't think the browser will even issue CORS requests across different protocol schemes (HTTPS vs HTTP) or to internal domains, without a TLD.

What's the de facto standard for a Reverse Proxy to tell the backend SSL is used?

I have a reverse proxy that does HTTPS on the outside, but HTTP on the inside.
This means that by default in-app URLs will have HTTP as the scheme, as this is the way it's being contacted.
How can the proxy tell the backend that HTTPS should be used?
The proxy can add extra (or overwrite) headers to requests it receives and passes through to the back-end. These can be used to communicate information to the back-end.
So far I've seen a couple used for forcing the use of https in URL scheme:
X-Forwarded-Protocol: https
X-Forwarded-Ssl: on
X-Url-Scheme: https
And wikipedia also mentions:
# a de facto standard:
X-Forwarded-Proto: https
# Non-standard header used by Microsoft applications and load-balancers:
Front-End-Https: on
This what you should add to the VirtualHost on apache: other proxies should have similar functionality
RequestHeader set X-FORWARDED-PROTOCOL https
RequestHeader set X-Forwarded-Ssl on
# etc.
I think it's best to set them all, or set one that works and remove the other known ones. To prevent evil clients messing with them.
It took me several hours of googling to find the magic setting for my environment. I have a SSL httpd Apache reverse proxy in front of a jetty app server and an apache2 http server. This answer actually gave me the information that worked. For me, adding:
RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
to the site conf file was enough for the destination to use https instead of http as the protocol when building links in the response. I tried the X-FORWARDED-PROTOCOL above, but that didn't work. Hopefully this will help in future Google searches!

AWS ELB Server-Side HTTPS

I have been working my way though setting up HTTPS using AWS. I have been attempting this with a self-signed certificate and am finding the process a bit problematic.
One question that has come up along the way is this business of server-side HTTPS. The client that I am working with requests that when a user hits the server the URL change to HTTPS. I am wondering if "Server-Side HTTPS" means that the protocol is transparent to the end-user?
Will they still see HTTP int the browser?
Thanks.
Don't know if this is an exact answer to your question, but rather perhaps a piece of advice. When using ELB, I have found it MUCH easier to install the SSL cert on the ELB and use SSL offload to forward requests from port 443 on ELB to port 80 on the EC2 instances.
The pros of this:
There is only one place where you need to install the cert rather than having to install across a number of instances (or update AMI and relaunch instances), making cert updates much easier to perform.
You get better performance on your web servers as they don't have to deal with SSL encryption.
Some cons:
The communication is not encrypted end-to-end so there is the technical (albeit unlikely) chance that the communication could be intercepted between ELB and servers. If you are dealing with something like PCI compliance this might matter to you.
If you needed to directly access one of the instances over HTTPS that would not be possible.
You may need to make sure your application is aware of the https-related headers (i.e. x-forwarded-proto) that the ELB injects into the request if your application needs to check whether the request is over HTTPS.
There is no reason that this configuration would disallow you from redirecting incoming requests over HTTP to HTTPS. You might however need to look the x-forwarded-proto header to do any web-server or application level redirects to HTTPS. The end user would not have any way of knowing that their HTTPS wrapper for their request was being offloaded at the ELB.

Resources