Why does WGET return 2 error messages before succeeding? - shell

I am using a script that calls WGET to pull down some XML data from a URL that requires authentication.
In doing so, my script produces the following output for each url accessed (IPs and hostnames changed to protect the guilty):
> Resolving host.name.com... 127.0.0.1
> Connecting to host.name.com|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 401 Access denied
> Connecting to host.name.com|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 401 Unauthorized
> Reusing existing connection to host.name.com:80.
> HTTP request sent, awaiting response... 200 OK
Why does WGET complain that accessing the URL fails twice before successfully connecting? Is there a way to silence these messages, or to get it to connect properly on the first attempt?
For reference, here's the line I am using to call WGET:
wget --http-user=USERNAME --password=PASSWORD -O file.xml http://host.name.com/file.xml

This appears to be by design. Following the advice of @Wayne Conrad, I added the -d switch and observed that the first attempt failed because NTLM was required, and the second attempt failed because the first NTLM attempt was only level 1, where a level 3 NTLM challenge-response was required. WGET finally provides the needed authentication on the third attempt.
WGET does get a cookie to prevent re-authenticating for the duration of the session, which would prevent this if the connection wasn't terminated between files. I would need to pass WGET a list of files for this to occur; however, I am unable to do so because I do not know the file names in advance.

You seem to have a newer version of wget. After 1.10.2, wget does not send authentication unless the server challenges it first, and that is why the first request fails. The second fails because of what you described.
You can eliminate one of the failures by adding the parameter --auth-no-challenge. With it, the first request is sent with "basic" authentication, which will fail, and the second is sent in "digest" mode, which should work.
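For context, the credentials that --auth-no-challenge sends up front are ordinary HTTP Basic credentials. A minimal Python sketch of how such an Authorization header value is built follows; the function name is made up for illustration, and the user name and password are the well-known example pair from the HTTP Basic specification, not values from the question.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    # Basic auth is base64("user:password"); it is an encoding, not encryption.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Example credentials only; substitute real ones when experimenting.
    print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
```

If the server only accepts a challenge-response scheme, a request carrying this header is still rejected once, which is consistent with the answer's claim that the option removes only one of the two failures.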

Related

Not able to download Oracle's jdk-8u181 package using wget behind an HTTP proxy

I'm trying to use the WebUpd8 team's oracle-java8-installer to install Java 8 on my Ubuntu 14.04 computers. Some installs succeeded but others failed. After some debugging, I realized the failures were caused by the HTTP proxy setting. I'll provide more details below, but basically my questions are: Why does the use of http_proxy cause the problem? I believe it must be related to how an HTTP proxy works, but since I have little experience with that, could someone tell me what I should learn to understand this issue?
Here are more details.
Under the hood, the oracle-java8-installer uses wget to download the jdk-8u181 package. So I can reproduce the issue with the steps below:
Install apt-cacher-ng: sudo apt-get install apt-cacher-ng
You don't have to configure anything in the APT configuration to reproduce this problem. apt-cacher-ng uses localhost:3142 by default to cache the packages.
Run http_proxy="http://localhost:3142" wget --continue --no-check-certificate -O jdk-8u181-linux-x64.tar.gz --header "Cookie: oraclelicense=a" http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
Here are some notes:
The http://localhost:3142 is configured for apt-cacher-ng. Those machines that failed had apt-cacher-ng installed before I tried to install jdk-8u181.
The Cookie: oraclelicense=a is to indicate the user has accepted the license.
If you run the last command, the download of the jdk-8u181-linux-x64.tar.gz is finished instantly. There is a line saying "Proxy request sent, awaiting response... 200 OK". But if you open the received ".tar.gz", you'll see it's merely an HTML page that contains error information.
If you remove the http_proxy environment variable and run:
wget --continue --no-check-certificate -O jdk-8u181-linux-x64.tar.gz --header "Cookie: oraclelicense=a" http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
You will have the full package downloaded correctly.
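A quick way to tell the two outcomes apart without opening the file by hand is to look at its first two bytes: a real gzip stream starts with 0x1f 0x8b, while an HTML error page does not. The following Python sketch shows that check; the function names are illustrative and not part of any tool mentioned above.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(first_bytes: bytes) -> bool:
    """Return True if the given leading bytes carry the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC


def file_looks_like_gzip(path: str) -> bool:
    """Read the first two bytes of a file and test them for the gzip magic."""
    with open(path, "rb") as handle:
        return looks_like_gzip(handle.read(2))


if __name__ == "__main__":
    # An HTML error page saved under a .tar.gz name fails the check.
    print(looks_like_gzip(b"<html><body>error</body></html>"))
```

Calling file_looks_like_gzip("jdk-8u181-linux-x64.tar.gz") after each of the two wget runs would distinguish the HTML page from the real archive.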
My best guess is that an HTTP proxy works with wget if the target URL is the final URL, so the proxy would cache it in its storage. Conceptually, it's like a key-value store:
proxy['URL'] = result
However, in this case, the target URL (http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz) actually returns a "302" code and a "Location" header field for the new URL. This can be seen from the output:
ywen@ubuntu:~$ wget --continue --no-check-certificate -O jdk-8u181-linux-x64.tar.gz --header "Cookie: oraclelicense=a" http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
--2018-08-01 11:10:04-- http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
Resolving download.oracle.com (download.oracle.com)... 23.32.72.143
Connecting to download.oracle.com (download.oracle.com)|23.32.72.143|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://edelivery.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz [following]
--2018-08-01 11:10:04-- https://edelivery.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz
Resolving edelivery.oracle.com (edelivery.oracle.com)... 23.216.148.161, 2001:559:19:3081::2d3e, 2001:559:19:3086::2d3e
Connecting to edelivery.oracle.com (edelivery.oracle.com)|23.216.148.161|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz?AuthParam=1533136324_72efc4e6208a5a7fc1cbba0527c741b6 [following]
--2018-08-01 11:10:04-- http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz?AuthParam=1533136324_72efc4e6208a5a7fc1cbba0527c741b6
Connecting to download.oracle.com (download.oracle.com)|23.32.72.143|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 185646832 (177M) [application/x-gzip]
Saving to: ‘jdk-8u181-linux-x64.tar.gz’
My guess is that handling the redirection is beyond what a proxy can do (am I right?), and that this is why the machines configured with the HTTP proxy failed.
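To make the redirect chain in the log easier to reason about, here is a small Python model of it. It is only a sketch: the URLs are shortened placeholders rather than the real Oracle paths, the table is hard-coded instead of fetched, and nothing here reproduces what apt-cacher-ng actually does. It illustrates that each 302 is answered by the client issuing a new request to the Location URL.

```python
from typing import Dict, List, Optional, Tuple

# (status code, Location header or None); placeholder URLs modelled on the log.
Response = Tuple[int, Optional[str]]

FIRST_URL = "http://download.oracle.com/jdk.tar.gz"
HOPS: Dict[str, Response] = {
    FIRST_URL: (302, "https://edelivery.oracle.com/jdk.tar.gz"),
    "https://edelivery.oracle.com/jdk.tar.gz": (
        302,
        "http://download.oracle.com/jdk.tar.gz?AuthParam=TOKEN",
    ),
    "http://download.oracle.com/jdk.tar.gz?AuthParam=TOKEN": (200, None),
}


def follow_redirects(url: str, hops: Dict[str, Response], limit: int = 20) -> List[str]:
    """Return every URL the client requests until a non-redirect response."""
    visited = [url]
    for _ in range(limit):
        status, location = hops[url]
        if status not in (301, 302, 303, 307, 308) or location is None:
            return visited
        url = location  # the client, not the server, issues the next request
        visited.append(url)
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    for hop in follow_redirects(FIRST_URL, HOPS):
        print(hop)
```

Note that the middle hop is an HTTPS URL, for which wget consults https_proxy rather than http_proxy, so the three requests do not necessarily travel the same path.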

Why is wget failing on this image?

I'm attempting to download an image from Google Books with wget (I've tried curl as well), and I continually get a 500 error.
// COMMAND
wget "http://books.google.com/books/content?id=pztHgTT4BGUC&printsec=frontcover&img=1"
// OUTPUT
--2016-07-13 20:58:06-- http://books.google.com/books/content?id=pztHgTT4BGUC&printsec=frontcover&img=1
Resolving books.google.com... 216.58.194.206, 2607:f8b0:4005:801::200e
Connecting to books.google.com|216.58.194.206|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2016-07-13 20:58:06 ERROR 500: Internal Server Error.
It fails for the same reason the URL will fail in a browser if you're not logged into Google: The server refuses to serve you the content unless you're logged in.
You can probably copy a session cookie from an existing session if you log in with a browser and use it in wget.
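As a rough illustration of reusing a browser session, the Python sketch below builds a request that carries a Cookie header. The cookie name and value are placeholders to be replaced with whatever the logged-in browser session actually holds, and the helper name is hypothetical. The request is only constructed here, not sent.

```python
import urllib.request
from typing import Dict


def request_with_cookies(url: str, cookies: Dict[str, str]) -> urllib.request.Request:
    """Build a Request whose Cookie header carries the given name/value pairs."""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return urllib.request.Request(url, headers={"Cookie": cookie_header})


if __name__ == "__main__":
    url = "http://books.google.com/books/content?id=pztHgTT4BGUC&printsec=frontcover&img=1"
    # Placeholder cookie; copy the real name and value from the browser session.
    req = request_with_cookies(url, {"session": "PASTE_VALUE_HERE"})
    print(req.get_header("Cookie"))
```

Passing the resulting object to urllib.request.urlopen would send it. Whether the server then serves the image depends on the cookie still being valid.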

wget magento-2.0.5 to download magento CE- not found

I was trying to download Magento 2.0.5 on Linux using the command
wget http://www.magentocommerce.com/downloads/assets/2.0.4/magento-2.0.4.tar.gz
The error returned was:
Connecting to magento.com (magento.com)|66.211.190.110|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
Has the download URL changed? Or are there new permission constraints I need to satisfy?
Try the releases page:
https://github.com/magento/magento2/releases
If you are looking for a direct download link, use:
https://github.com/magento/magento2/archive/2.3.0.tar.gz
I tried this link and it works.
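The archive link above follows a simple pattern, so a URL for another release can be derived from the version string. The Python sketch below does that. It assumes the pattern holds for the version in question and that a matching tag exists on GitHub, neither of which is checked here; the helper name is made up.

```python
import re


def magento_archive_url(version: str) -> str:
    """Build a GitHub archive URL following the pattern of the 2.3.0 link."""
    # Only plain x.y.z versions are accepted in this sketch.
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"unexpected version format: {version!r}")
    return f"https://github.com/magento/magento2/archive/{version}.tar.gz"


if __name__ == "__main__":
    print(magento_archive_url("2.3.0"))
```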

Is greeting or handshake required when FTP data connection established?

I'm implementing a simple FTP server. When debugging, I use the FileZilla client to connect to my server. The request and response pattern shown in the command panel is listed below:
GREETING: 220 (FTP v1.0)
REQUEST: USER ***
RESPONS: 331 Password?
REQUEST: PASS ********
RESPONS: 230 login successfully.
REQUEST: PWD
RESPONS: 257 "/a/" is current directory.
REQUEST: TYPE I
RESPONS: 200 Type set to I.
REQUEST: PASV
RESPONS: 200 127,255,0,0,175,200(I specify local port 45000)
REQUEST: LIST
RESPONS: 150 here is the listing
RESPONS: 226 Transfer done.
However, this is followed by the error "Fail to read directory". I think the passive connection is indeed established, since I can get a stream on the socket (I implement the server in C#), but I have no idea why the error occurs. Is it because I should send some handshake/greeting information, like on the control connection, to sync server and client instead of sending the data directly? If yes, what is the status code for this information?
Thanks and Best Regards.
There is no handshake on the data connection.
Maybe 'Fail to read directory' error is a result of incorrect format of the folder list your server returns?
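Two details may be worth checking against the transcript. RFC 959 assigns reply code 227 to PASV, with the address and port in parentheses, whereas the transcript shows 200. RFC 959 also leaves the LIST format unspecified, but many clients expect Unix "ls -l" style lines terminated by CRLF. The Python sketch below formats both. The helper names, the fixed permission string, and the owner/group fields are illustrative assumptions, not what FileZilla strictly requires.

```python
from datetime import datetime


def format_pasv_reply(host: str, port: int) -> str:
    """Format a 227 reply: four address octets, then the port as two bytes."""
    octets = host.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("expected a dotted IPv4 address and a 16-bit port")
    high, low = divmod(port, 256)  # port = high * 256 + low
    return f"227 Entering Passive Mode ({','.join(octets)},{high},{low})\r\n"


def format_list_line(name: str, size: int, is_dir: bool, modified: datetime) -> str:
    """Format one directory entry in the common Unix 'ls -l' style."""
    kind = "d" if is_dir else "-"
    stamp = modified.strftime("%b %d %H:%M")
    # Permissions, link count, owner and group are fixed placeholders here.
    return f"{kind}rw-r--r-- 1 owner group {size:>10} {stamp} {name}\r\n"


if __name__ == "__main__":
    print(format_pasv_reply("127.0.0.1", 45000), end="")
    print(format_list_line("readme.txt", 1024, False, datetime(2020, 1, 2, 3, 4)), end="")
```

For port 45000 the last two numbers come out as 175,200, matching the transcript. The listing itself goes over the data connection, which the server closes when the listing is complete, and the 226 reply on the control connection reports that the transfer is done.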

Why smtp.gmail.com returns Unrecognized command?

I've written a Winsock client that connects to smtp.gmail.com, but after the first EHLO command, every other command returns "Unrecognized command". I tried AUTH, AUTH LOGIN, MAIL... but they all return the same thing. Where can I find the commands that work with this server? I think they use SMTP commands differently.
They appear to be using an RFC 3207 SMTP/TLS server, that is, one that expects the STARTTLS extension to be used before other commands. See RFC 3207 for more information.
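The command order that RFC 3207 describes is EHLO, STARTTLS, the TLS handshake, then EHLO again before AUTH or MAIL. The Python sketch below shows that order using the standard smtplib module, plus a small helper that checks whether an EHLO reply advertises STARTTLS. Port 587 and the helper names are assumptions, and a Winsock client would have to perform the same steps with its own TLS library.

```python
import smtplib
import ssl


def advertises_starttls(ehlo_reply: str) -> bool:
    """Return True if a raw multi-line EHLO reply lists the STARTTLS extension."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-STARTTLS" or "250 STARTTLS".
        if line[4:].strip().upper() == "STARTTLS":
            return True
    return False


def open_tls_session(host: str, port: int = 587) -> smtplib.SMTP:
    """Connect, upgrade to TLS via STARTTLS, and return the ready session."""
    smtp = smtplib.SMTP(host, port, timeout=30)
    smtp.ehlo()
    if not smtp.has_extn("starttls"):
        smtp.quit()
        raise RuntimeError("server did not advertise STARTTLS")
    smtp.starttls(context=ssl.create_default_context())
    smtp.ehlo()  # capabilities must be requested again after the TLS handshake
    return smtp


if __name__ == "__main__":
    sample = "250-mx.example.com at your service\r\n250-STARTTLS\r\n250 SMTPUTF8"
    print(advertises_starttls(sample))
```

After open_tls_session returns, commands such as AUTH and MAIL are sent over the encrypted channel, for example via the session's login and sendmail methods.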

Resources