Outbound https proxy sidecar - proxy

Let's say I have a Kubernetes Job that makes HTTPS requests to changing URLs, and I want to allow specific URLs only and block all other requests. My idea is to deploy an HTTPS proxy Pod and use NetworkPolicies to make sure the Job Pod can only communicate with the HTTPS proxy Pod. See the following sketch for a better understanding:
[Figure: sketch of https-proxy sidecar deployment]
I know how to do that but have no idea which HTTPS proxy to use. As far as I understand, Envoy is not a suitable solution for what I want to do: https://github.com/envoyproxy/envoy/issues/1606
Does anyone have a better solution, or can anyone tell me which proxy to use?

Mitmproxy is an open source tool that you can use to filter HTTP and HTTPS requests transparently using the Python scripting language.
There's also a fairly detailed tutorial on how to use it.
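As a rough illustration of the filtering described above, a mitmproxy script could look something like the sketch below. It assumes mitmproxy 7 or later (where `http.Response.make` is available), and the patterns in `ALLOWED_URL_PATTERNS` are made-up placeholders, not anything from the question. For HTTPS the proxy can only see full URLs if the Job Pod trusts mitmproxy's CA certificate; otherwise only the host name from the CONNECT request is visible.

```python
import re

# Hypothetical allowlist: regexes matched against the full request URL.
ALLOWED_URL_PATTERNS = [
    re.compile(r"^https://api\.example\.com/v1/"),
    re.compile(r"^https://downloads\.example\.org/"),
]


def is_allowed(url: str) -> bool:
    """Return True if the URL matches at least one allowlist pattern."""
    return any(pattern.match(url) for pattern in ALLOWED_URL_PATTERNS)


def request(flow):
    """mitmproxy event hook, called once for every client request."""
    # Imported here so is_allowed() can be exercised without mitmproxy installed.
    from mitmproxy import http

    if not is_allowed(flow.request.pretty_url):
        # Setting flow.response answers the client directly;
        # the request is never forwarded upstream.
        flow.response = http.Response.make(
            403,
            b"Blocked by proxy allowlist\n",
            {"Content-Type": "text/plain"},
        )
```

Saved as, say, `allowlist.py` (a hypothetical file name), it would be loaded with `mitmdump -s allowlist.py`.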

Related

How to listen to GET and POST requests for all connections using GO

I am using python and mitmproxy to listen to all incoming and outgoing traffic so that I can capture the URLs. I run the script and it tells me all URLs my computer is trying to connect to.
I need to implement the same using Go but have not got a clue on how to start or what package to use. Can anyone guide me in the right direction please?
Thanks
You can either build a simple proxy with httputil (example here), or you might want to use gopacket as described here.
You would need a local intermediate proxy in order to capture the traffic and display the URLs used.
See, for example, sipt/shuttle, which is written in Go and has a web GUI.

URL Rewrite for Forwarding Proxy for HTTPS

Is it possible to create a forwarding HTTPS proxy (not a reverse proxy) that would be able to:
block some urls based on the url regexp (ads, flash, movies, ...)
cache images based on the url regexp
It seems to me that in the usual case it is impossible because the HTTPS stream is encrypted and there's no way to process or alter it.
But, this case is special, it is a proxy for the web crawler, I don't need HTTPS at all, but some sites allow access via HTTPS only, and I have to somehow support it.
So, maybe it would be possible to do something like that?
Crawler --http--> Proxy --https--> Site
So, the proxy would be able to decode the HTTPS stream and post-process it. Would it work? Are there any docs or details about such an approach?
Pretty sure Apache 2.2 provides this functionality with mod_proxy in conjunction with mod_ssl and mod_cache.
Note: blocking is done using the 'ProxyBlock' directive in mod_proxy.
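Independent of Apache, the core of the `Crawler --http--> Proxy --https--> Site` idea from the question can be sketched in a few lines: the proxy receives a plain-HTTP URL, drops it if it matches a block pattern, and otherwise rewrites the scheme before fetching upstream. This is only an illustration; the `BLOCKED_PATTERNS` list and the `to_upstream` helper are hypothetical names, not part of mod_proxy.

```python
import re
from urllib.parse import urlsplit

# Hypothetical block rules, matched anywhere in the URL.
BLOCKED_PATTERNS = [
    re.compile(r"/ads/"),
    re.compile(r"\.(swf|mp4)$"),
]


def to_upstream(url: str):
    """Return the https:// URL to fetch upstream, or None if the URL is blocked."""
    if any(pattern.search(url) for pattern in BLOCKED_PATTERNS):
        return None
    # Keep host, path, query and fragment; only the scheme changes.
    return urlsplit(url)._replace(scheme="https").geturl()
```

Because the crawler talks plain HTTP to the proxy, the proxy sees the full URL and can filter or cache on it; only the proxy-to-site leg is encrypted.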

What is a bootstrap-proxy in Chef?

I am looking at this http://tickets.opscode.com/browse/CHEF-2375 and I am a bit confused as to what this option is useful for. I understand the idea of a proxy node in general as described here: http://en.wikipedia.org/wiki/Proxy_server
Can someone explain the use of a bootstrap-proxy in Chef?
Basically, the bootstrap-proxy option allows you to specify an HTTP proxy server when setting up your servers.
This is probably not a big deal for most people, but there are a few cases where you need/want an HTTP proxy.
Some networks are set up to require the use of an HTTP proxy in order to make outbound HTTP connections. This can provide some added security as well as a wealth of control on the part of the network admin. Without HTTP proxy support, the knife bootstrap command would be unable to make any HTTP connections.
Adding an HTTP proxy as a cache can make commands like apt-get run much faster, especially on a slow connection. Essentially, your proxy can cache any packages you download: while the first server won't see much of a difference, any subsequent servers will download packages directly from the cache, which can be much faster.

How to build local web proxy without configuring the browsers

How do Netnanny or K9 Web Protection set up a web proxy without configuring the browsers?
How can it be done?
Using WinSock directly, or at the NDIS or hardware driver level, and then filtering at those levels, just like any firewall software does. NDIS is the easy way.
Download this ISO image: http://www.microsoft.com/downloads/en/confirmation.aspx?displaylang=en&FamilyID=36a2630f-5d56-43b5-b996-7633f2ec14ff
It has a bunch of samples and tools to help you build what you want.
After you mount it or burn it to a CD and install it, go to this folder:
c:\WinDDK\7600.16385.1\src\network\ndis\
I think what you need is a transparent proxy that supports WCCP.
Take a look at the squid-cache FAQ page and the Wikipedia entry for WCCP.
With that setup you just need to do some firewall configuration and all your web traffic will be handled by the transparent proxy. And no setup will be needed on your browser.
Netnanny is not a proxy. It is tied to the host machine and browser (and possibly other applications as well). It then filters all incoming and outgoing "content" from the machine/application.
Essentially, Netnanny is a content-control system, as opposed to a destination-control system (proxy).
The easiest way to divert all traffic for a certain site to some other address is to change the hosts file on the local host.
You might want to have a look at the explanation here: http://www.fiddlertool.com/fiddler/help/hookup.asp
This is how Fiddler2 manages to insert a proxy between most apps and the internet without modifying the apps (although there is a lot of explanation of what to do when the default setup fails). This does not answer how NetNanny/K9 etc. work though; as noted above, they do a little more and may be a little more intrusive.
I believe you are looking for Browser Helper Objects. These little gizmos capture ALL browser communication, and as such can either remove ads from the HTML (good gizmo), or redirect every second click to a spam site (bad gizmo), or just capture every URL you type and send it home like all the web toolbars do.
What you want to do is route all outgoing http(s) requests from your LAN through an intercepting proxy (like Squid). This is the setup for a transparent web proxy.
There are different ways to do this, although I've only ever set it up on OpenBSD and Linux, using Squid as the proxy.
At a high level you have a firewall with rules to send all externally bound http traffic to a local squid server. The Squid server is configured to:
accept all http requests
forward the requests on to the real external hosts
cache the reply
forward the reply back to the requestor on the local lan
You can then add more granular rules in Squid to control access to websites, filter content, etc.
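The four steps above (accept, forward, cache, reply) can be illustrated with a toy sketch. This is not how Squid is implemented, and `CachingForwarder` is a made-up name; a real proxy also honours cache headers, expiry, and request methods, none of which is handled here.

```python
import urllib.request


class CachingForwarder:
    """Toy model of a caching forward proxy: fetch once, then serve from memory."""

    def __init__(self, fetch=None):
        # `fetch` is injectable so the caching logic can be tried without network access.
        self._fetch = fetch or self._fetch_upstream
        self._cache = {}

    @staticmethod
    def _fetch_upstream(url):
        # Forward the request on to the real external host.
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read()

    def handle(self, url):
        """Accept a request for `url` and return the body, using the cache if possible."""
        if url not in self._cache:
            # Cache the reply the first time it is fetched.
            self._cache[url] = self._fetch(url)
        # Forward the (possibly cached) reply back to the requestor.
        return self._cache[url]
```

Access rules like the ones mentioned above would sit in front of `handle`, rejecting a URL before anything is fetched.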
I'm pretty sure you can also get this functionality in different networking gear. I bet F5 has some products that do some or all of what I described, and probably Cisco as well. There are probably other proxies out there besides Squid that you can use too.
PS. I have no idea if this is how K9 Web Protection or NetNanny works.
Squid can act as an intercepting proxy for the HTTP and HTTPS ports without configuring the browsers, and it also supports WCCP.

Open source HTTP or HTTPS proxy

I want to log all HTTP requests made by the browser to a file, so I thought I could run an HTTP/S proxy locally to do this. However, the proxies at proxies.xhaus.com/ don't meet my needs: they have either no HTTPS support or no logging. Does anyone know of a proxy that can handle both HTTPS and HTTP and lets me log the browser traffic to a file?
Squid can do that.
http://wiki.squid-cache.org/Features/SslBump
Squid was also my first thought given your description, but for development use you might prefer a more powerful intercepting proxy like:
Fiddler2
Paros Proxy
Burp Suite (despite the name, my personal favorite)
