How do I write a simple HTTPS proxy server in Ruby?

I've seen several examples of writing an HTTP proxy in Ruby, e.g. this gist by Torsten Becker, but how would I extend it to handle HTTPS, i.e. act as a "man in the middle" SSL proxy?
I'm looking for a simple source code framework which I can extend for my own logging and testing needs.
update
I already use Charles, a nifty HTTPS proxy app similar to Fiddler and it is essentially what I want except that it's packaged up in an app. I want to write my own because I have specific needs for filtering and presentation.
update II
Having poked around, I understand the terminology a little better. I'm NOT after a full "Man in the Middle" SSL proxy. Instead, it will run locally on my machine and so I can honor whatever SSL cert it offers. However, I need to see the decrypted contents of packets of my requests and the decrypted contents of the responses.

Just for background information, a normal HTTP proxy handles HTTPS requests via the CONNECT method: it reads the host name and port, establishes a TCP connection to this target server on this port, returns 200 OK and then merely tunnels that TCP connection to the initial client (the fact that SSL/TLS is exchanged on top of that TCP connection is barely relevant).
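For illustration, a rough sketch of such a CONNECT tunnel in plain Ruby might look like this (a sketch only; header parsing, error handling and timeouts are omitted):
require 'socket'

# Rough sketch (not production code) of the CONNECT handling described above.
server = TCPServer.new(8080)
loop do
  Thread.start(server.accept) do |client|
    verb, target = client.readline.split            # e.g. "CONNECT example.com:443 HTTP/1.1"
    loop { break if client.readline.strip.empty? }  # skip the remaining request headers
    next client.close unless verb == 'CONNECT'
    host, port = target.split(':')
    upstream = TCPSocket.new(host, port.to_i)
    client.write "HTTP/1.1 200 Connection established\r\n\r\n"
    # From here on, just shovel bytes both ways until either side closes.
    [[client, upstream], [upstream, client]].map { |src, dst|
      Thread.new { IO.copy_stream(src, dst) rescue nil; dst.close rescue nil }
    }.each(&:join)
  end
end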
This is essentially what the do_CONNECT method of WEBrick::HTTPProxyServer does.
If you want a MITM proxy, i.e. if you want to be able to look inside the SSL/TLS traffic, you can certainly use WEBrick::HTTPProxyServer, but you'll need to change do_CONNECT completely:
Firstly, your proxy server will need to embed a mini CA, capable of generating certificates on the fly (failing that, you might be able to use self-signed certificates, if you're willing to bypass warning messages in the browser). You would then import that CA certificate into the browser.
When you get the CONNECT request, you'll need to generate a certificate valid for that host name (preferably with a Subject Alternative Name for that host name, or in the Subject DN's Common Name), and upgrade the socket into an SSL/TLS server socket (using that certificate). If the browser agrees to trust that certificate, what you get from then on over this SSL/TLS socket is the plaintext traffic.
You would then have to handle each request (read the request line, headers and entity) and replay it to the target server using a normal HTTPS client library. You might be able to send that traffic to a second instance of WEBrick::HTTPProxyServer, but it would have to be tweaked to make outgoing HTTPS requests instead of plain HTTP requests.
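A minimal sketch of that certificate part with Ruby's OpenSSL bindings could look like this (the CA cert/key are ones you generate and import into the browser yourself; a real implementation would cache generated certificates and handle errors):
require 'openssl'

# Sketch only: issue a certificate for a given host on the fly, signed by our mini CA.
def certificate_for(host, ca_cert, ca_key)
  key  = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version     = 2
  cert.serial      = Time.now.to_i
  cert.subject     = OpenSSL::X509::Name.parse("/CN=#{host}")
  cert.issuer      = ca_cert.subject
  cert.public_key  = key.public_key
  cert.not_before  = Time.now
  cert.not_after   = Time.now + 365 * 24 * 3600
  ef = OpenSSL::X509::ExtensionFactory.new(ca_cert, cert)
  cert.add_extension(ef.create_extension('subjectAltName', "DNS:#{host}"))
  cert.sign(ca_key, OpenSSL::Digest::SHA256.new)
  [cert, key]
end

# After answering "200 Connection established" to the CONNECT request,
# upgrade the client socket to an SSL/TLS *server* socket using that certificate.
def upgrade_to_tls(client_socket, host, ca_cert, ca_key)
  cert, key = certificate_for(host, ca_cert, ca_key)
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.cert = cert
  ctx.key  = key
  ssl = OpenSSL::SSL::SSLSocket.new(client_socket, ctx)
  ssl.accept   # server-side TLS handshake; plaintext HTTP can now be read from ssl
  ssl
end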

WEBrick can proxy SSL:
require 'webrick'
require 'webrick/httpproxy'
WEBrick::HTTPProxyServer.new(:Port => 8080).start
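For the asker's logging needs, WEBrick's proxy also accepts a :ProxyContentHandler callback, roughly as below; note it only sees plain-HTTP exchanges, since HTTPS traffic going through CONNECT is just tunnelled and stays opaque:
require 'webrick'
require 'webrick/httpproxy'

# Log each proxied plain-HTTP request/response pair.
handler = proc do |req, res|
  puts "#{req.request_method} #{req.request_uri} -> #{res.status}"
end

WEBrick::HTTPProxyServer.new(:Port => 8080, :ProxyContentHandler => handler).start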

From my experience HTTPS is nowhere near "simple". Do you need a proxy that catches traffic from your own machine? There are several applications for that, like Fiddler (or google for alternatives), which come with everything you need to debug web traffic.

That blog post is not the way to write a proxy. It's actually very easy: you just accept a connection, read one line which tells you what to connect to, and attempt the upstream connection; if it fails, send the appropriate response and close the socket, otherwise just start copying bytes in both directions, simultaneously, until EOS has occurred in both directions. The only difference HTTPS makes is that you have to speak SSL instead of plaintext.

Related

Handling encrypted request depending on cert trust state using mitmproxy

I've read a lot of related topics in the net, but I still don't have an answer to my question.
Is it possible to implement the flow described below?
The proxy receives a request.
If the request is encrypted and the proxy cert is trusted, then intercept.
If the request is not encrypted, then intercept.
If the request is encrypted and the proxy cert is NOT trusted, then pass it through without interception.
This behaviour should be the default for all traffic going through the proxy.
It'd also be really nice to be able to get all possible info about the encrypted requests that are passed through (src and dst IP addresses etc.), basically the same info I can get with Fiddler.
Not really. The main problem is that mitmproxy cannot know whether the proxy cert is trusted by the client or not.
In the SSL/TLS protocol the client starts with the CLIENT_HELLO and in response the server (in this case mitmproxy) sends back the SERVER_HELLO message containing the generated server certificate.
The client now checks whether the received server certificate is trusted. If not, the connection is terminated. As far as I know the SSL/TLS spec does not define how to do so. Some clients send back an SSL alert message, others simply drop the connection, and a third group continues the SSL/TLS handshake but has certain internal values set in a way that always lets the handshake fail.
There is a mitmproxy script that tries to identify connections that were not successful and then, if the client asks for the same domain a second time, bypasses interception.
Of course this requires that the client resends requests, which is not always the case.
https://github.com/sociam/x-ray/blob/master/mitmproxy/examples/tls_passthrough.py

Not able to receive and forward remote request using Charles Web Proxy as a Reverse Proxy

I am trying to capture traffic from an old application that doesn't honour the system's proxy settings. The only config I can change is the server IP address.
I am capturing the packets with Wireshark. Without the Charles reverse proxy, I can see requests after the initial three-way handshake.
With the reverse proxy, the connection gets stuck after the handshake.
I notice that Charles receives a request and starts connecting to somewhere, but then it just gets stuck there:
Following is the config of the reverse proxy (Remote host removed):
Any help, solution and workarounds would be appreciated!
First of all, your app uses neither HTTP nor HTTPS. Studying the screenshot of the successful connection gives some details on the protocol used:
the first message after the handshake originates from the server, contrary to the common client-server approach where the client is responsible for sending the query. This fact alone is enough to rule out HTTP and HTTPS.
the payload data isn't human-readable, so it's a binary protocol.
based on the PUSH flags, the protocol is much more likely message-based than stream-based.
So the client establishes a connection, immediately gets some command from the server and replies to it. Then the communication continues. I can't guess the exact protocol. The port number might be irrelevant, but even if it's not, there are only a few protocols using port 4321 by default. Anyway, it could always be a custom private protocol.
I'm not familiar with Charles, but forwarding an arbitrary TCP stream is probably covered by its port forwarding feature rather than the reverse proxy. However, I don't really see any benefit in sending traffic through Charles in this case; capturing data on your PC should be enough to study the details.
If you are looking for traffic manipulation, for an arbitrary TCP stream it's not an easy task, but it must be possible. I'm not aware of suitable tools; quick googling shows lots of utilities, but some of them look applicable to text-based streams only, so deeper study is required.
Reason for Failure
It may be because you are requesting a local IP address from a remote scope, which Charles Proxy doesn't handle. For proof, please refer to the link below:
https://www.charlesproxy.com/documentation/faqs/localhost-traffic-doesnt-appear-in-charles/
Solution
So, in order to solve the problem in the current scenario, use
http://192.168.86.22.charlesproxy.com/
Note: the URL that you request will only be proxied properly by Charles, not by any other proxy services.

How can a web page send a message to the local network

Our web application has a button that is supposed to send data to a server on the local network that in turn prints something on a printer.
So far it was easy: The button triggered an AJAX POST request to http://printerserver/print.php with a token, that page connected to the web application to verify the token and get the data to print and then printed.
However, we are now delivering our web application via HTTPS (and I would rather not go back to HTTP for this), and newer versions of Chrome and Firefox don't make the request to the HTTP address anymore; they don't even send the request to check the CORS headers.
Now, what is a modern alternative to the cross-protocol XHR? Do Websockets suffer from the same problem? (A Google search did not make clear what is the current state here.) Can I use TCP Sockets already? I would rather not switch to GET requests either, because the action is not idempotent and it might have practical implications with preloading and caching.
I can change the application on the printerserver in any way (so I could replace it with NodeJS or something) but I cannot change the users' browsers (to trust a self-signed certificate for printerserver for example).
You could store the print requests on the webserver in a queue and make the printserver periodically poll for requests to print.
If that isn't possible I would set up a tunnel or VPN between the webserver and printserver networks. That way you can make the print request from the webserver on the server side instead of from the client. If you use curl, there are flags to ignore invalid SSL certificates etc. (I still suspect it's nicer to introduce a queue anyway, so the print requests aren't blocking).
If the webserver can make an ssh connection to something on the network where the printserver is, you could do something like: ssh <params> user@host '<some curl command here>'.
A third option I can think of: if the printserver can bind to, for example, a subdomain of the webserver domain, like print.somedomain.com, you may be able to make it trusted via the somedomain.com certificate. IIRC you have to create a CSR (Certificate Signing Request) on the printserver and sign it with the somedomain.com certificate. Perhaps it doesn't even need to be a subdomain for this per se, but maybe that's a requirement for the browser to do it client-side.
The easiest way is to add a route to the webapp that does nothing more than relay the request to the print server. So make your AJAX POST request to https://myapp.com/print, and the server-side code powering that makes a request to http://printerserver/print.php, with the exact same POST content it received itself. As @dnozay said, this is commonly called a reverse proxy. Yes, to do that you'll have to reconfigure your printserver to accept (authenticated) requests from the webserver.
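A minimal sketch of such a relay route, assuming a Sinatra-style webapp (the /print route name and printerserver URL are taken from the question; everything else is illustrative):
require 'sinatra'
require 'net/http'

# The browser POSTs to https://myapp.com/print; we forward the form data to the
# print server over plain HTTP from the server side and echo its answer back.
post '/print' do
  upstream = URI('http://printerserver/print.php')
  response = Net::HTTP.post_form(upstream, params)
  status response.code.to_i
  response.body
end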
Alternatively, you could switch the printserver to https and directly call it from the client.
Note that an insecure (http) web-socket connection on a secure (https) page probably won't work either. And for good reason: generally it's a bad idea to mislead people by making insecure connections from what appears to them to be a secure page.
The server hosting the https webapp can reverse proxy the print server, but since the printer is local to the user, this may not work.
The print server should have the correct CORS headers
Access-Control-Allow-Origin: *
or:
Access-Control-Allow-Origin: https://www.example.com
However there are pitfalls with using the wildcard.
From what I understand from the question, printserver is not accessible from the web application so the reverse proxy solution won't work here.
You are restricted from making requests from the browser to the printserver by cross-origin-policy.
If you wish to communicate with the printserver from an HTTPS page you will need the printserver to expose print.php over HTTPS too.
You could create a DNS A record as a subdomain of your web application that resolves to the internal address of your printserver.
With those steps in place you should be able to update your printserver page to respond with permissive CORS headers which the browser should then respect. I don't think the browser will even issue CORS requests across different protocol schemes (HTTPS vs HTTP) or to internal domains, without a TLD.

When should one use CONNECT and GET HTTP methods at HTTP Proxy Server?

I'm building a WebClient library. Now I'm implementing a proxy feature, so I am doing some research and I saw some code using the CONNECT method to request a URL.
But checking it within my web browser, it doesn't use the CONNECT method but calls the GET method instead.
So I'm confused. When should I use each method?
TL;DR a web client uses CONNECT only when it knows it talks to a proxy and the final URI begins with https://.
When a browser says:
CONNECT www.google.com:443 HTTP/1.1
it means:
Hi proxy, please open a raw TCP connection to Google; any following bytes I write, you just repeat over that connection without any interpretation. Oh, and one more thing: do that only if you talk to Google directly, but if you use another proxy yourself, just pass the same CONNECT on to them instead.
Note how this says nothing about TLS (https). In fact CONNECT is orthogonal to TLS; you can have only one, only the other, or both of them.
That being said, the intent of CONNECT is to allow an end-to-end encrypted TLS session, so the data is unreadable to a proxy (or a whole proxy chain). It works even if a proxy doesn't understand TLS at all, because CONNECT can be issued inside plain HTTP and requires nothing more from the proxy than copying raw bytes around.
But the connection to the first proxy can be TLS (https) although it means a double encryption of traffic between you and the first proxy.
Obviously, it makes no sense to CONNECT when talking directly to the final server. You just start talking TLS and then issue HTTP GET. The end servers normally disable CONNECT altogether.
To a proxy, CONNECT support adds security risks. Any data can be passed through CONNECT, even an ssh hacking attempt against a server on 192.168.1.*, even SMTP sending spam. The outside world sees these attacks as regular TCP connections initiated by the proxy; it doesn't care what the reason is and cannot check whether an HTTP CONNECT is to blame. Hence it's up to proxies to secure themselves against misuse.
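As an aside, with Ruby's Net::HTTP you don't normally issue CONNECT yourself; the library picks the method depending on whether TLS is enabled. A small sketch (the proxy address below is made up):
require 'net/http'

proxy_host, proxy_port = 'proxy.example.local', 8080

# Plain HTTP through the proxy: the proxy sees "GET http://example.com/ HTTP/1.1".
http = Net::HTTP.new('example.com', 80, proxy_host, proxy_port)
puts http.get('/').code

# HTTPS through the proxy: the proxy sees "CONNECT example.com:443 HTTP/1.1"
# and only ciphertext after that.
https = Net::HTTP.new('example.com', 443, proxy_host, proxy_port)
https.use_ssl = true
puts https.get('/').code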
A CONNECT request urges your proxy to establish an HTTP tunnel to the remote end-point.
Usually it is used for SSL connections, though it can be used with HTTP as well (for the purposes of proxy chaining and tunneling).
CONNECT www.google.com:443
The above line opens a connection from your proxy to www.google.com on port 443.
After this, content that is sent by the client is forwarded by the proxy to www.google.com:443.
If a user tries to retrieve a page http://www.google.com, the proxy can send the exact same request and retrieve the response on his behalf.
With SSL(HTTPS), only the two remote end-points understand the requests, and the proxy cannot decipher them. Hence, all it does is open that tunnel using CONNECT, and lets the two end-points (webserver and client) talk to each other directly.
Proxy Chaining:
If you are chaining 2 proxy servers, this is the sequence of requests to be issued.
GET1 is the original GET request (HTTP URL)
CONNECT1 is the original CONNECT request (SSL/HTTPS URL or Another Proxy)
User Request ==CONNECT1==> (Your_Primary_Proxy ==CONNECT==> AnotherProxy-1 ... ==CONNECT==> AnotherProxy-n) ==GET1(IF is http)/CONNECT1(IF is https)==> Destination_URL
As a rule of thumb, GET is used for plain HTTP and CONNECT for HTTPS.
There are more details though, so you probably want to read the relevant RFCs:
http://www.ietf.org/rfc/rfc2068.txt
http://www.ietf.org/rfc/rfc2817.txt
The CONNECT method converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.

XMPP Proxy TLS Encryption

I'm trying to develop an XMPP "proxy" which will sit in the middle of a standard Jabber communication.
The schema will be something like this:
Pidgin ---> Proxy <--- eJabberD
              |
              v
           Console
The purpose of this proxy is to log all the stanzas which go over the wire. IMHO, this is very convenient when you're developing XMPP based solutions.
I'm doing this with EventMachine and Ruby, and the main problem is knowing how to decipher the traffic after the TLS/SASL handshake.
Before the starttls everything works perfectly and the server and client can talk to each other, but once the TLS handshake begins, although it still works, it is impossible to dump the clear content as all of the traffic is encrypted.
I'm not an expert in TLS/SASL matters, so I don't know what the best approach is. I think one way to achieve this would be to grab the certificate from the handshake and use it to decipher the content as it goes through the proxy.
Thanks!
If you could do what you say (grab the certificate on the wire and use it to decrypt), then TLS would be pretty worthless. This is one of the primary attacks TLS exists to prevent.
If the server will allow it, just don't send starttls. This is not required by spec. If starttls is required by your server, then you can configure it to use a null cipher, which will leave the traffic unencrypted. Not all servers will support that of course.
You can man-in-the-middle the starttls. Respond with your own tunnel to the client, and send a separate starttls negotiation to the server. This should generate certificate warnings on the client, but since you control the client you can tell it to accept the certificate anyway.
If you control the server, you can use its private key to decrypt the traffic. I'm not aware of any off-the-shelf code to do that easily, but it could be written.
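For the man-in-the-middle starttls option above, a very rough EventMachine sketch (placeholder certificate paths; the server-side leg and real XMPP stream parsing are omitted):
require 'eventmachine'

# Client-facing side only: when the client asks for starttls, answer <proceed/>
# ourselves and switch our socket to TLS with our own certificate. A separate
# outgoing connection would negotiate starttls with the real server.
module ClientSide
  def receive_data(data)
    if data.include?('<starttls')
      send_data "<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>"
      start_tls(:private_key_file => 'proxy.key', :cert_chain_file => 'proxy.crt')
    else
      puts "C -> S: #{data}"   # after start_tls this is already decrypted
      # ...forward the stanza to the server-side connection here...
    end
  end
end

EM.run { EM.start_server('0.0.0.0', 5222, ClientSide) }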
