My system:
Ubuntu 16.04
Mitmproxy version 3.0.4
Python version 3.5.2
I've successfully installed mitmproxy from: docs.mitmproxy.org on my server. But now I got confused how to save log mitmproxy to file? I try use mitmdump --mode transparent --showhost -p 9001 -w output_file
While I open output_file, it's not human readable. I read the docs and try scripts from the mitmproxy's Github, but no clue.
Anyone know how to save log mitmproxy to file, but human readable?
Thank you!
As you have probably noticed, mitmproxy generates streams in a binary format. If you want to save the streams in a human readable format, you can pass a script when you run mitmproxy to do so.
save.py
from mitmproxy.net.http.http1.assemble import assemble_request, assemble_response
f = open('/tmp/test/output.txt', 'w')
def response(flow):
f.write(assemble_request(flow.request).decode('utf-8'))
And now run mitmproxy -s save.py and the output will be written to output.txt in a human readable format.
Do pay attention to the responses because they might contain a lot of binary data. but if you do want to write the responses also in a human readable format, then you can add f.write(assemble_response(flow.response).decode('utf-8', 'replace')) to the script.
Example output from the script:
❯❯ tail -f output.txt
GET / HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
If-Modified-Since: Thu, 17 Oct 2019 07:18:26 GMT
If-None-Match: "3147526947"
Cache-Control: max-age=0
I initially had trouble getting the requests and responses output to file using what #securisec proposed but what worked for me was to replace mitmproxy -s save.py with mitmdump -s save.py instead.
Once I killed the mitmdump process, the output file was written to the file system.
Related
When using Bash on Windows, copying from the terminal results in terribly formatted content when I paste. Here is a curl request I made and copied directly from Bash on Ubuntu on Windows:
$ curl -IL aol.com HTTP/1.1 301 Moved Permanently Date: Fri, 27 Apr 2018 15:58:31 GMT Server: Apache Location: https://www.aol.com/ Content-Type: text/html; charset=iso-8859-1
It's basically all on one line, almost like the carriage returns are not respected. I would expect the paste output to look like this:
$ curl -IL aol.com
HTTP/1.1 301 Moved Permanently
Date: Fri, 27 Apr 2018 15:19:49 GMT
Server: Apache
Location: https://www.aol.com/
Content-Type: text/html; charset=iso-8859-1
Is there a way to fix this? Right now I have to go into a text editor to format every copy/paste I need to make.
Edit: Here is a screen shot of the above code blocked copied to the clipboard directly from Bash on Win10: https://i.imgur.com/VUKRhLX.png
I've got the following link, which is downloading a CSV file when put through a web browser.
http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre=
However, when using Wget with Cygwin, with the command below, Wget retrieves a file, which is not a CSV file, but a file without extension. The file is empty, that is, has no data at all.
wget 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
So as I hate to be stuck, I tried the following as well. I put the URL in a text file and used Wget with the file option:
inside fic.txt
'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
I used Wget in the following way:
wget -i fic.txt
I got the following errors:
Scheme missing
No URLs found in toto.txt
I think I can suggest some other options that will make your underlying problem more clear which is that it's supposed to be html, but there is no content (content-length = 0).
More concretely, this
wget -S -O export_classement.html 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
produces this
Resolving pro.allocine.fr... 62.39.143.50
Connecting to pro.allocine.fr|62.39.143.50|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 28 Mar 2014 09:54:44 GMT
Content-Type: text/html; Charset=iso-8859-1
Connection: close
X-ServerName: WEBNX2
akamainocache: no-store
Content-Length: 0
Cache-control: private
X-KompressorName: kompressor7
Length: 0 [text/html]
2014-03-28 05:54:52 (0.00 B/s) - ‘export_classement.html’ saved [0/0]
Additionally the server is tailoring it's output based on how the browser identifies itself. using wget does have an option to include an arbitrary user-agent in the headers. Here's an example what happens when you make wget identify itself as Chrome. Here's a list of other possibiities.
wget -S --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36" 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
Now the output changes to export.csv, with type "application/octet-stream" instead of "text/html"
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 28 Mar 2014 10:34:09 GMT
Content-Type: application/octet-stream; Charset=iso-8859-1
Transfer-Encoding: chunked
Connection: close
X-ServerName: WEBNX2
Edge-Control: no-store
Last-Modified: Fri, 28 Mar 2014 10:34:17 GMT
Content-Disposition: attachment; filename=export.csv
I have a one line bash command which gets me an HTML site over an SSL encrypted HTTPS connection:
echo "GET / HTTP/1.1\nHost: www.example.com\n\n" | openssl s_client -connect www.example.com:443 -quiet 2> /dev/null
The site is being loaded but with HTTP headers like:
HTTP/1.1 200 OK
Date: Fri, 01 Feb 2013 13:15:59 GMT
Server: Apache/2.2.20 (Ubuntu)
and more like this. With 2> /dev/null I can hide the output of wrong SSL certificates and more.
I do not want to take another script because curl does not what I want to do.
It is not possible due to the nature of openssl s_client which gives you the direct and plain output from a service which runs behind the connecting port (443 in my example from the question where I want to get / on a webserver with SSL).
telnet would also give you the plain output from the HTTP protocol and curl would show me the HTML site without headers and with HTTPS but does not allow self written commands to the web server.
You may use expect command to do the interaction with openssl. I'm not sure it will work but it worth a try.
Just been tinkering with Sinatra and trying to get a bit of a restful web service going.
The error I'm getting at the moment is very specific though.
Take this example post method
post '/postMan/:someParam' do
#Edited here. This code can be anything. 411 is still the response
puts params[:someParam]
end
Seems simple enough. Take a param, make an object out of it, then go store it in whatever way the objects save method defines.
Heres what I use to post the data using Curl
$curl -I -X POST http://127.0.0.1/postman/123456
The only problem is, I'm getting 411 back and have no idea why.
To the best of my knowledge, 411 is length required. Here is the trace
HTTP/1.1 411 Length Required
Content-Type: text/html; charset=ISO-8859-1
Server: WEBrick/1.3.1 (Ruby/1.9.2/2011-07-09)
Date: Fri, 02 Mar 2012 22:27:09 GMT
Content-Length: 303
Connection: close
I 'Cannot' change the curl message in any way. So might anyone have a way to set the content length to be ignored in sinatra? Or some fix which doesn't involve changing the curl request?
For the record, it doesn't matter whether I use the parameters in the Post method or not. I could have some crazy code inside it, it will still throw the same error
As others said above, WEBrick wrongly requires POST requests to have a Content-Length header. I just pass an empty body, because it's less typing than passing in the header:
curl -X POST -d '' http://webrickwhyyounotakeemptyposts.com/
Are you sure you're on port 80 for your app?
When I run:
ruby -r sinatra -e "post('/postMan/:someParam'){puts params[:someParam]}"
and curl it:
curl -I -X POST http://127.0.0.1:4567/postMan/123456
HTTP/1.1 200 OK
X-Frame-Options: sameorigin
X-XSS-Protection: 1; mode=block
Content-Type: text/html;charset=utf-8
Content-Length: 0
Connection: keep-alive
Server: thin 1.3.1 codename Triple Espresso
it's ok. Had to change the URL to postManthough, your example threw a 404because you had postman.
The output was also as expected:
== Sinatra/1.3.2 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.3.1 codename Triple Espresso)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop
123456
Ah. Try it without -I. It's probably sending a HEAD request and as such, not sending what you expect. Use -v if you want to show the headers.
curl -v -X POST http://127.0.0.1/postman/123456
curl -v -X POST -d "key=val" http://127.0.0.1/postman/123456
WEBrick erroneously requires POST requests to include the Content-Length header.
curl -H 'Content-Length: 0' -X POST http://example.com
Standardly, however, POST requests don't require a body and therefore don't require the Content-Length header.
I wrote a bash script that gets output from a website using curl and does a bunch of string manipulation on the html output. The problem is when I run it against a site that is returning its output gzipped. Going to the site in a browser works fine.
When I run curl by hand, I get gzipped output:
$ curl "http://example.com"
Here's the header from that particular site:
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
X-Powered-By: PHP/5.2.17
Last-Modified: Sat, 03 Dec 2011 00:07:57 GMT
ETag: "6c38e1154f32dbd9ba211db8ad189b27"
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: must-revalidate
Content-Encoding: gzip
Content-Length: 7796
Date: Sat, 03 Dec 2011 00:46:22 GMT
X-Varnish: 1509870407 1509810501
Age: 504
Via: 1.1 varnish
Connection: keep-alive
X-Cache-Svr: p2137050.pubip.peer1.net
X-Cache: HIT
X-Cache-Hits: 425
I know the returned data is gzipped, because this returns html, as expected:
$ curl "http://example.com" | gunzip
I don't want to pipe the output through gunzip, because the script works as-is on other sites, and piping through gzip would break that functionality.
What I've tried
changing the user-agent (I tried the same string my browser sends, "Mozilla/4.0", etc)
man curl
google search
searching stackoverflow
Everything came up empty
Any ideas?
curl will automatically decompress the response if you set the --compressed flag:
curl --compressed "http://example.com"
--compressed
(HTTP) Request a compressed response using one of the algorithms libcurl supports, and save the uncompressed document. If this option is used and the server sends an unsupported encoding, curl will report an error.
gzip is most likely supported, but you can check this by running curl -V and looking for libz somewhere in the "Features" line:
$ curl -V
...
Protocols: ...
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz
Note that it's really the website in question that is at fault here. If curl did not pass an Accept-Encoding: gzip request header, the server should not have sent a compressed response.
In the relevant bug report Raw compressed output when not using --compressed but server returns gzip data #2836 the developers says:
The server shouldn't send content-encoding: gzip without the client having signaled that it is acceptable.
Besides, when you don't use --compressed with curl, you tell the command line tool you rather store the exact stream (compressed or not). I don't see a curl bug here...
So if the server could be sending gzipped content, use --compressed to let curl decompress it automatically.