I am trying to write a batch script that will do the following tasks:
Telnet to any website on port 80 (telnet google.com 80)
Execute an HTTP GET request (GET / HTTP/1.1 with Host: google.com)
Redirect this output to a .txt file.
I want the output in a .txt file as shown below.
Filename: HTTP.txt
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.co.in/?gfe_rd=cr&ei=ClUXV8WSEujI8Aep2L7oAQ
Content-Length: 261
Date: Wed, 20 Apr 2016 10:08:10 GMT
I want the batch file to send the HTTP GET request automatically and to set the HTTP Host header as well.
For HTTP requests you can try winhttpjs.bat:
call winhttpjs.bat "http://google.com" -method GET -reportfile reportfile.txt
Though the format of the report is not exactly what you want, you can rewrite it to fit your needs.
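Alternatively, if curl is available on the machine (recent Windows 10 builds bundle it, and it can be installed separately otherwise), a minimal sketch that writes just the status line and response headers into HTTP.txt from a batch file could look like this:
rem -s silences progress output, -D dumps the received headers to HTTP.txt, -o NUL discards the body
curl -s -D HTTP.txt -o NUL "http://google.com/"
The Host: google.com header is sent automatically, since curl derives it from the URL.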
How come this works from the BASH prompt:
/testproj> http http://localhost:5000/ping/ &
[1] 10733
(env)
/testproj> HTTP/1.0 200 OK
Content-Length: 2
Content-Type: application/json
Date: Sat, 17 Nov 2018 19:27:01 GMT
Server: Werkzeug/0.14.1 Python/3.6.4
{}
... but fails when executed from a .sh script:
/testproj> cat x.sh
http http://localhost:5000/ping/ &
(env)
/testproj> ./x.sh
(env)
/testproj> HTTP/1.0 405 METHOD NOT ALLOWED
Allow: GET, HEAD, OPTIONS
Content-Length: 178
Content-Type: text/html
Date: Sat, 17 Nov 2018 19:29:00 GMT
Server: Werkzeug/0.14.1 Python/3.6.4
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>
Why?
EDIT: http is HTTPie
EDIT: type http gives http is hashed (/testproj/env/bin/http)
EDIT: One can reproduce the error with just http http://www.google.com </dev/null & (Thanks @e36freak)
EDIT: from e36freak on IRC:
it appears to be an issue with stdin
i get the same error with just http http://www.google.com </dev/null
http wants stdin to be attached to a tty it looks like
for whatever reason
couldn't find it in the man page but i'm sure it's out there
You most likely need to include the --ignore-stdin option to prevent HTTPie from trying to read it. See: https://httpie.org/doc#scripting
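For example, a minimal sketch of the fixed x.sh (same endpoint as in the question):
#!/bin/bash
# --ignore-stdin stops HTTPie from trying to read the detached standard input
http --ignore-stdin http://localhost:5000/ping/ &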
For example, if I want to know if I am connected I can use:
ping 8.8.8.8
to send a ping to Google's DNS server. Can I ask a server for its date in a similar way? Something like:
give_me_your_date 8.8.8.8
If you have curl, you could run:
curl -I http://example.com
HTTP/1.1 200 OK
Date: Sun, 16 Oct 2016 23:37:15 GMT
Server: Apache/2.4.23 (Unix)
X-Powered-By: PHP/5.6.24
Connection: close
Content-Type: text/html; charset=UTF-8
If that server is a web server, you will get back HTTP headers from which you can retrieve the date.
You can get curl for Windows here: https://curl.haxx.se/download.html
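For example, a small follow-up sketch that keeps only the Date header (assuming a Unix-like shell with grep available):
# -s hides the progress meter, -I requests headers only
curl -sI http://example.com | grep -i '^Date:'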
I've got the following link, which downloads a CSV file when opened in a web browser.
http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre=
However, when I use Wget under Cygwin with the command below, Wget retrieves a file that is not a CSV file but a file without an extension. The file is empty, that is, it has no data at all.
wget 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
Since I hate being stuck, I also tried the following: I put the URL in a text file and used Wget with the file option:
inside fic.txt
'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
I used Wget in the following way:
wget -i fic.txt
I got the following errors:
Scheme missing
No URLs found in toto.txt
I can suggest some other options that will make your underlying problem clearer: the response is supposed to be HTML, but there is no content (Content-Length: 0).
More concretely, this
wget -S -O export_classement.html 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
produces this
Resolving pro.allocine.fr... 62.39.143.50
Connecting to pro.allocine.fr|62.39.143.50|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 28 Mar 2014 09:54:44 GMT
Content-Type: text/html; Charset=iso-8859-1
Connection: close
X-ServerName: WEBNX2
akamainocache: no-store
Content-Length: 0
Cache-control: private
X-KompressorName: kompressor7
Length: 0 [text/html]
2014-03-28 05:54:52 (0.00 B/s) - ‘export_classement.html’ saved [0/0]
Additionally, the server is tailoring its output based on how the browser identifies itself. Wget has an option to include an arbitrary user agent in the request headers. Here's an example of what happens when you make Wget identify itself as Chrome; many other user-agent strings are possible.
wget -S --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36" 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='
Now the output changes to export.csv, with type "application/octet-stream" instead of "text/html".
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 28 Mar 2014 10:34:09 GMT
Content-Type: application/octet-stream; Charset=iso-8859-1
Transfer-Encoding: chunked
Connection: close
X-ServerName: WEBNX2
Edge-Control: no-store
Last-Modified: Fri, 28 Mar 2014 10:34:17 GMT
Content-Disposition: attachment; filename=export.csv
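Putting both options together, a sketch that saves the response directly as export.csv (the -O name here is my choice, suggested by the server's Content-Disposition header):
wget -S --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36" -O export.csv 'http://pro.allocine.fr/film/export_classement.html?typeaffichage=2&lsttype=1001&lsttypeperiode=3002&typedonnees=visites&cfilm=&datefiltre='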
I can download a file from a URL in a browser, but when I try it from bash I get an HTML page instead of the file.
How to download file with url redirection (301 Moved Permanently) using curl, wget or something else?
UPD
Headers from the URL request:
curl -I http://www.somesite.com/data/file/file.rar
HTTP/1.1 301 Moved Permanently
Date: Sat, 07 Dec 2013 10:15:28 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3
Location: http://www.somesite.com/files/html/archive.html
Vary: Accept-Encoding
Content-Type: text/html
X-Pad: avoid browser bug
Use -L, --location to follow redirects:
$ curl -L http://httpbin.org/redirect/1
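Applied to the URL from the question, a sketch that follows the redirect and saves the file (assuming the final location really serves the .rar and not an HTML page):
# -L follows redirects, -O saves the output under the file name taken from the URL (file.rar here)
curl -L -O http://www.somesite.com/data/file/file.rar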
I wrote a bash script that gets output from a website using curl and does a bunch of string manipulation on the HTML output. The problem arises when I run it against a site that returns its output gzipped. Going to the site in a browser works fine.
When I run curl by hand, I get gzipped output:
$ curl "http://example.com"
Here's the header from that particular site:
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
X-Powered-By: PHP/5.2.17
Last-Modified: Sat, 03 Dec 2011 00:07:57 GMT
ETag: "6c38e1154f32dbd9ba211db8ad189b27"
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: must-revalidate
Content-Encoding: gzip
Content-Length: 7796
Date: Sat, 03 Dec 2011 00:46:22 GMT
X-Varnish: 1509870407 1509810501
Age: 504
Via: 1.1 varnish
Connection: keep-alive
X-Cache-Svr: p2137050.pubip.peer1.net
X-Cache: HIT
X-Cache-Hits: 425
I know the returned data is gzipped, because this returns html, as expected:
$ curl "http://example.com" | gunzip
I don't want to pipe the output through gunzip, because the script works as-is on other sites, and piping through gunzip would break that functionality.
What I've tried
changing the user-agent (I tried the same string my browser sends, "Mozilla/4.0", etc)
man curl
google search
searching stackoverflow
Everything came up empty
Any ideas?
curl will automatically decompress the response if you set the --compressed flag:
curl --compressed "http://example.com"
--compressed
(HTTP) Request a compressed response using one of the algorithms libcurl supports, and save the uncompressed document. If this option is used and the server sends an unsupported encoding, curl will report an error.
gzip is most likely supported, but you can check this by running curl -V and looking for libz somewhere in the "Features" line:
$ curl -V
...
Protocols: ...
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz
Note that it's really the website in question that is at fault here. If curl did not pass an Accept-Encoding: gzip request header, the server should not have sent a compressed response.
In the relevant bug report, Raw compressed output when not using --compressed but server returns gzip data #2836, the developer says:
The server shouldn't send content-encoding: gzip without the client having signaled that it is acceptable.
Besides, when you don't use --compressed with curl, you tell the command line tool you rather store the exact stream (compressed or not). I don't see a curl bug here...
So if the server could be sending gzipped content, use --compressed to let curl decompress it automatically.
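In a script that does string manipulation on the output, that might look like this minimal sketch (the variable name is just illustrative):
#!/bin/bash
# --compressed signals Accept-Encoding and transparently decompresses the response;
# uncompressed responses pass through unchanged, so the other sites keep working
html=$(curl -s --compressed "http://example.com")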