Web proxy with WebSocket support, preferably in python - websocket

Hello. I seek some guidance. I need a http web proxy with websocket support. The catch is, that I need to retrieve the context of HTTP request (headers, paramters, etc, used for retrieiving routing table) and in the same time I need websockets to work. So far I found onlt either HTTP-only proxies or TCP proxies that lack the context of HTTP request.
Is anyone aware of existence of a library or tool that will let me accomplish that?
I considered:
proxy.py, seems ugly and I couldn't make it work for my case in ~6 hours.
twisted, no experience nor success with this one either
nginx module in lua, looks promising and fast

Related

Caching proxy for all traffic

I am trying to find (or write) a caching proxy tool that accepts all traffic from a specific container in my localhost (using Iptables). What I want to do with this traffic is to save it and cache the response, and later, if I see that a request was already sent to a server, return the cached response to the requesting party (and not sending the request to the server again, because a previous similar request was already sent).
Here's a diagram to demonstrate what I'm trying to do:
I'm not sure exactly how big is the problem I'm trying to deal with here. I want to do it for all traffic, including HTTP, TLS and other TCP based traffic (database connections and such). I tried to check mitmproxy, and it seems to deal pretty good with HTTP and the TLS part, but intercepting raw TCP traffic (for databases etc.) is not possible.
Any advices or resources I can use to accomplish that? (Not necessarily in Python). How complex do you think this problem is? Do you think I can find a generic solution?
Thanks in advance!

Is there a Socket.IO alternative that is not based on WebSockets?

I built a realtime application that, thanks to Socket.IO, can serve a lot of different client types (C#, Java, Browser, ...)! I know that there are a lot of Socket.IO alternatives, but from my understanding, everything is more or less based on WebSockets. (I know that Socket.IO has fallbacks if WebSockets are not working, but that they are more less "inferior workarounds" so to speak...)
My question is: Is there any comparable real-time engine available that is NOT based on WebSockets, but can still serve all those different clients?
You don't say what your endpoints are. If one of the endpoints is a browser with purely the built-in capabilities of the browser and Javascript, then a webSocket is your only way to get a continuous connection from the browser to some other destination.
If a webSocket is not supported (in an older browser), then the other socket.io fallbacks (such as xhr-long-polling) are the next best alternatives. As the browser has limited communication capabilities, if you can't use a webSocket, then an ajax call is your only other generally supported option without requiring plug-ins on each browser (such as Flash or Java or something like that). socket.io already supports the next-best options that are available in a browser - you can't do better than that if you're talking about a standard browser with no custom plug-ins.
If your endpoints don't necessarily include a browser and you can use any language or library you want, then you can use plain TCP sockets and then use whatever protocol you want over a TCP socket.
The WebSocket protocol establishes a bidirectional communication channel between server and client; they kind of speak more naturally with each other. The server can just send something to the client and the other way around. In http it just goes in one direction, there's a request and a response and everything needs to be initiated with a request from the client.
From my experience, realtime webApps like a multiplayer game or a chat become easier to develop and it apparently creates less overhead than using http - but still you can do the same things more or less elegant with http as well (see e.g. long polling).
Look at gmail or other existing webApps, they all use http (so does Socket.io as a fallback) and it works quite well.

How can web requests be made and go undetected by a packet sniffer tool like Charles?

I am using a third party (OS X) tool to help me process OFX financial data. It works, but I am interested in knowing what exactly is going on behind the scenes to make it work (the structure of the HTTP requests).
I setup Charles as an SSL proxy for all traffic in hopes that I could observe the requests being made by this tool, but the program runs and Charles gets nothing. No requests show up whatsoever. How is that possible? Is there something I am not understanding about how Charles or other packet sniffing tools work? What are some ways that web requests could be made that wouldn't show up in a tool like Charles?
Charles is not a packet sniffer. It's a proxy. The app initiating the connection has to "voluntarily" use the proxy for the proxy to see anything. If an app uses a high-level networking API like NSURLConnection then it will, by virtue of the frameworks, automatically pick up the system-wide proxy settings and use the proxy. If, instead, the app wrote their networking using low-level socket API, then they will not end up going through the proxy unless they specifically re-implement that functionality.
If you want to see everything, you will need a real promiscuous-mode packet sniffer, which Charles is not. Unfortunately, using a "real" packet sniffer will just show you the gibberish going over the wire for SSL connections, so that's probably not what you want either. If an app has "in-housed" its SSL implementation and is not using a properly configured system-wide proxy, sniffing its traffic unencrypted will be considerably harder (you'll probably have to use a debugger or some other runtime hooking approach.)

Are Websockets more secure for communication between web pages?

This might sound really naive but I would really find a descriptive answer helpful.
So, my question is this:
I can use Firebug to look at AJAX requests made from any website I visit. So, am I right in saying that I wouldn't be able to examine the same communication between the client and the server if the website choses to use Websockets? In other words, does this make it more secure?
No. Not at all. Just because the browser does not (yet) have a tool to show WebSocket traffic, doesn't make it any more secure. You can always run a packet sniffer to monitor the traffic, for example.
No, because there will be other ways beside the browser-build in tools to read your traffic.
Have a try: Install and run Wireshark and you will be able to see all packets you send and receive via Websockets.
Depends on the application. If you are fully Ajax without reloading the document for data then I would think websockets would provide a better authentication for data requests then a cookie session in regards to connection hijack. Given that you are using SSL of course.
Never rely on secrecy of algorithm cause it only gives you false sense of security. Wiki: Security by obscurity
Remember that browser is a program on my computer and I am the one who have a full control over what is send to you, not my browser.
I guess it's only matter of time (up to few months IMO) when developer tools such as Firebug will provide some fancy tool for browsing data send/received by WebSockets.
WebSockets has both an unencrypted (ws://) and encrypted mode (wss://). This is analogous to HTTP and HTTPS. WebSockets protocol payload is simply UTF-8 encoded. From a network sniffing perspective there is no advantage to using WebSockets (use wss and HTTPS for everything at all sensitive). From the browser perspective there is no benefit to using WebSockets for security. Anything running in the browser can be examined (and modified) by a sufficiently knowledgeable user. The tools for examining HTTP/AJAX requests just happen to be better right now.

Any HTTP proxies with explicit, configurable support for request/response buffering and delayed connections?

When dealing with mobile clients it is very common to have multisecond delays during the transmission of HTTP requests. If you are serving pages or services out of a prefork Apache the child processes will be tied up for seconds serving a single mobile client, even if your app server logic is done in 5ms. I am looking for a HTTP server, balancer or proxy server that supports the following:
A request arrives to the proxy. The proxy starts buffering in RAM or in disk the request, including headers and POST/PUT bodies. The proxy DOES NOT open a connection to the backend server. This is probably the most important part.
The proxy server stops buffering the request when:
A size limit has been reached (say, 4KB), or
The request has been received completely, headers and body
Only now, with (part of) the request in memory, a connection is opened to the backend and the request is relayed.
The backend sends back the response. Again the proxy server starts buffering it immediately (up to a more generous size, say 64KB.)
Since the proxy has a big enough buffer the backend response is stored completely in the proxy server in a matter of miliseconds, and the backend process/thread is free to process more requests. The backend connection is immediately closed.
The proxy sends back the response to the mobile client, as fast or as slow as it is capable of, without having a connection to the backend tying up resources.
I am fairly sure you can do 4-6 with Squid, and nginx appears to support 1-3 (and looks like fairly unique in this respect). My question is: is there any proxy server that empathizes these buffering and not-opening-connections-until-ready capabilities? Maybe there is just a bit of Apache config-fu that makes this buffering behaviour trivial? Any of them that it is not a dinosaur like Squid and that supports a lean single-process, asynchronous, event-based execution model?
(Siderant: I would be using nginx but it doesn't support chunked POST bodies, making it useless for serving stuff to mobile clients. Yes cheap 50$ handsets love chunked POSTs... sigh)
What about using both nginx and Squid (client — Squid — nginx — backend)? When returning data from a backend, Squid does convert it from C-T-E: chunked to a regular stream with Content-Length set, so maybe it can normalize POST also.
Nginx can do everything you want. The configuration parameters you are looking for are
http://wiki.codemongers.com/NginxHttpCoreModule#client_body_buffer_size
and
http://wiki.codemongers.com/NginxHttpProxyModule#proxy_buffer_size
Fiddler, a free tool from Telerik, does at least some of the things you're looking for.
Specifically, go to Rules | Custom Rules... and you can add arbitrary Javascript code at all points during the connection. You could simulate some of the things you need with sleep() calls.
I'm not sure this method gives you the fine buffering control you want, however. Still, something might be better than nothing?
Squid 2.7 can support 1-3 with a patch:
http://www.squid-cache.org/Versions/v2/HEAD/changesets/12402.patch
I've tested this and found it to work well, with the proviso that it only buffers to memory, not disk (unless it swaps, of course, and you don't want this), so you need to run it on a box that's appropriately provisioned for your workload.
Chunked POSTs are a problem for most servers and intermediaries. Are you sure you need support? Usually clients should retry the request when they get a 411.
Unfortunately, I'm not aware of a ready-made solution for this. In the worst case scenario, consider developing it yourself, say, using Java NIO -- it shouldn't take more than a week.

Resources