"wget" command question from Blobtool tutorial - bioinformatics

I am following a tutorial (https://blobtoolkit.genomehubs.org/install/), specifically the section "2. Fetch the nt database".
The first step is mkdir -p nt (I am done with that part).
The second step is:
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.??.tar.gz" -P nt/ && \
for file in nt/*.tar.gz; \
do tar xf $file -C nt && rm $file; \
done
If I copy and paste the second step's command, it doesn't work. Maybe that's because I am not sure what
&& \
for file in nt/*.tar.gz; \
do tar xf $file -C nt && rm $file; \
done
means, so I tried using
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt/*.tar.gz"
first, but I received these error messages:
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.13, 2607:f220:41e:250::13, 2607:f220:41e:250::11, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.13|:21... failed: Connection refused.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::13|:21... failed: Network is unreachable.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::11|:21... failed: Network is unreachable.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::10|:21... failed: Network is unreachable.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::12|:21... failed: Network is unreachable.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::7|:21... failed: Network is unreachable.
Any idea what the problem is? How do I adjust the second step's command to download the database? Please let me know, thank you.

wildcards not supported in HTTP.
http://ftp.ncbi.nlm.nih.gov/blast/db/nt/*.tar.gz
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)
The host looks like an FTP server. You shouldn't be requesting from it over HTTP; it should be wget ftp://ftp.ncbi.... instead.
I can't seem to find where in the tutorial you linked they have wget http://ftp... The command before the one you referenced (2. Fetch the nt database) is a curl command and uses ftp.
Perhaps edit the question with where in the docs it tells you to do what you did, and I can look closer.
Edit:
First try this: wget "ftp://ftp.ncbi.nlm.nih.gov". It's a simpler command. It should tell you that you logged in as anonymous.
Given the extra info now in the question, I tried both of the commands given.
The first one worked for me out of the box. I got the following output:
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.??.tar.gz" -P nt/ && \ for file in nt/*.tar.gz; \ do tar xf $file -C nt && rm $file; \ done
--2020-11-15 13:16:30-- ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.??.tar.gz
=> ‘nt/.listing’
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 2607:f220:41e:250::13, 2607:f220:41e:250::10, 2607:f220:41e:250::11, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::13|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /blast/db ... done.
==> EPSV ... done. ==> LIST ... done.
.listing [ <=> ] 43.51K 224KB/s in 0.2s
2020-11-15 13:16:32 (224 KB/s) - ‘nt/.listing’ saved [44552]
Removed ‘nt/.listing’.
--2020-11-15 13:16:32-- ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.00.tar.gz
=> ‘nt/nt.00.tar.gz’
==> CWD not required.
==> EPSV ... done. ==> RETR nt.00.tar.gz ... done.
Length: 3937869770 (3.7G)
nt.00.tar.gz 3%[ ] 133.87M 10.2MB/s eta 8m 31s
The second one also connected and logged in fine; it only failed because of the path: there is no blast/db/nt directory on the server, the archives sit directly under /blast/db (as in the first command). Nothing big.
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt/*.tar.gz"
--2020-11-15 13:17:14-- ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt/*.tar.gz
=> ‘.listing’
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 2607:f220:41e:250::10, 2607:f220:41e:250::11, 2607:f220:41e:250::7, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::10|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /blast/db/nt ...
No such directory ‘blast/db/nt’.
About && and \, those are just syntactic sugar. && means 'and', allowing you to chain multiple commands in one. \ means new line, so you can write a new line in the command line without it treating as you pressing enter.
Neither of these is the root of your problem.
The errors you're getting seem to have nothing to do with the actual commands and everything to do with the network: "Connection refused" on port 21 suggests you're behind a firewall or a proxy or something. I would try the commands on a different Wi-Fi network. Or, if you know how to change the firewall settings on your router (I don't), try to fiddle around with that.
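If you want to confirm it's the network rather than the command, a minimal check (assuming nc/netcat is installed; otherwise the wget line alone is enough) is:
nc -vz ftp.ncbi.nlm.nih.gov 21        # is port 21 (FTP) reachable at all?
wget "ftp://ftp.ncbi.nlm.nih.gov"     # should report "Logged in!", as in the Edit above
If both fail on your current network but work on another one (a phone hotspot, say), it's the firewall/proxy and not the tutorial command.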

Related

--up script fails with '/etc/openvpn/update-systemd-resolved': No such file or directory (errno=2)

Since I reinstalled my Arch Linux distro, I get an error when I want to use OpenVPN. Here is the full output:
quentin@QuentinDesktop ~/Documents> openvpn --config ulille-vpn.ovpn
2022-01-04 21:52:15 WARNING: Compression for receiving enabled. Compression has been used in the past to break encryption. Sent packets are not compressed unless "allow-compression yes" is also set.
2022-01-04 21:52:15 WARNING: Compression for receiving enabled. Compression has been used in the past to break encryption. Sent packets are not compressed unless "allow-compression yes" is also set.
Options error: --up script fails with '/etc/openvpn/update-systemd-resolved': No such file or directory (errno=2)
Options error: Please correct this error.
Use --help for more information.
Here is the truncated ulille-vpn.ovpn file content (I just truncated the CA certificates):
ignore-unknown-option comp-lzo compress
dev tun
persist-tun
persist-key
cipher AES-256-CBC
tls-client
client
resolv-retry infinite
proto udp
remote vpn-etudiant.univ-lille.fr 443
verify-x509-name "vpn-etudiant.univ-lille.fr" name
auth SHA256
auth-user-pass
comp-lzo
compress lzo
#route-nopull
verb 3
pull-filter ignore "dhcp-option DOMAIN"
dhcp-option DOMAIN univ-lille.fr
dhcp-option DOMAIN univ-lille1.fr
script-security 2
setenv PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
up /etc/openvpn/update-systemd-resolved
up-restart
down /etc/openvpn/update-systemd-resolved
down-pre
Note that I didn't write this one myself, it is given by my university to access its local network.
I already tried installing the openvpn-update-systemd-resolved AUR package and enabling it in systemd, but it changed nothing.
How can I fix it?
Okay, after taking a quick look at the configuration file (which I did not think to do before asking this question), I commented out the last 4 lines of the chunk I posted, and it works!
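For reference, the commented-out lines (the last four of the chunk above) now look like this:
#up /etc/openvpn/update-systemd-resolved
#up-restart
#down /etc/openvpn/update-systemd-resolved
#down-pre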
I am sorry for asking this question; I thought the config file my university distributes was valid, but it looks like it is Fedora/Debian specific, which is kind of weird because it works perfectly fine without these four lines.
I hope this short-lived topic can help someone else in a similar case! :^)
I had the very same problem and it was also the config file trying to run up /etc/openvpn/update-systemd-resolved. Seems to be a distro problem as I'm also running arch.

Send an HTTPS request to TLS1.0-only server in Alpine linux

I'm writing a simple web crawler inside a Docker Alpine image. However, I cannot send HTTPS requests to servers that only support TLS 1.0. How can I configure Alpine Linux to allow obsolete TLS versions?
I tried adding MinProtocol to /etc/ssl/openssl.cnf with no luck.
Example Dockerfile:
FROM node:12.0-alpine
RUN printf "[system_default_sect]\nMinProtocol = TLSv1.0\nCipherString = DEFAULT@SECLEVEL=1" >> /etc/ssl/openssl.cnf
CMD ["/usr/bin/wget", "https://www.restauracesalanda.cz/"]
When I build and run this container, I get
Connecting to www.restauracesalanda.cz (93.185.102.124:443)
ssl_client: www.restauracesalanda.cz: handshake failed: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
wget: error getting response: Connection reset by peer
I can reproduce your issue using the built-in busybox wget. However, using the "regular" wget works:
root@a:~# docker run --rm -it node:12.0-alpine /bin/ash
/ # wget -q https://www.restauracesalanda.cz/; echo $?
ssl_client: www.restauracesalanda.cz: handshake failed: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
wget: error getting response: Connection reset by peer
1
/ # apk add wget
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/1) Installing wget (1.20.3-r0)
Executing busybox-1.29.3-r10.trigger
OK: 7 MiB in 17 packages
/ # wget -q https://www.restauracesalanda.cz/; echo $?
0
/ #
I'm not sure, but maybe you should post an issue at https://bugs.alpinelinux.org
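If you want the same fix baked into the image rather than applied interactively, a minimal Dockerfile sketch of the session above (same base image as in the question; untested beyond that session) would be:
FROM node:12.0-alpine
# replace the busybox wget applet with GNU wget, which handled this server in the session above
RUN apk add --no-cache wget
CMD ["wget", "https://www.restauracesalanda.cz/"]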
Putting this magic one-liner into my Dockerfile solved my issue and I was able to use TLS 1.0:
RUN sed -i 's/MinProtocol = TLSv1.2/MinProtocol = TLSv1/' /etc/ssl/openssl.cnf \
    && sed -i 's/CipherString = DEFAULT@SECLEVEL=2/CipherString = DEFAULT@SECLEVEL=1/' /etc/ssl/openssl.cnf
Credit goes to this dude: http://blog.travisgosselin.com/tls-1-0-1-1-docker-container-support/

Parse download speed from wget output in terminal

I have the following command:
sudo wget --output-document=/dev/null http://speedtest.pixelwolf.ch
which outputs:
--2016-03-27 17:15:47-- http://speedtest.pixelwolf.ch/
Resolving speedtest.pixelwolf.ch (speedtest.pixelwolf.ch)... 178.63.18.88, 2a02:418:3102::6
Connecting to speedtest.pixelwolf.ch (speedtest.pixelwolf.ch) | 178.63.18.88|:80... connected.
HTTP Request sent, awaiting response... 200 OK
Length: 85 [text/html]
Saving to: `/dev/null`
100%[======================>]85 --.-K/s in 0s
2016-03-27 17:15:47 (8.79 MB/s) - `dev/null` saved [85/85]
I'd like to be able to parse the (8.79 MB/s) from the last line and store it in a file (or any other way I can get it into a local PHP file easily). I tried to store the full output by changing my command to --output-document=/dev/speedtest, but that just saved "Could not reach website" to the file, not the terminal output of the command.
Not quite sure where to start with this, so any help would be awesome.
Not sure if it helps, but my intention is for this stored value (8.79 in this instance) to be read by a PHP file and handled there every 30 seconds, which I'll achieve with: while true; do (run speed test and save speed variable to a file cmd); php handleSpeedTest.php; sleep 5; done, where handleSpeedTest.php will read that stored value and handle it accordingly.
I changed the URL to one that works. Redirected stderr onto stdout. Used grep --only-matching (-o) and a regex.
sudo wget -O /dev/null http://www.google.com 2>&1 | grep -o '\([0-9.]\+ [KM]B/s\)'
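To feed the parsed value into the loop from the question, one possible sketch (the filename speed.txt and the awk step are my additions, not part of the answer above) is:
while true; do
  # wget logs to stderr, so redirect it onto stdout before grepping
  sudo wget -O /dev/null http://www.google.com 2>&1 \
    | grep -o '[0-9.]\+ [KM]B/s' \
    | awk '{print $1}' > speed.txt   # keep just the number, e.g. 8.79
  php handleSpeedTest.php            # reads speed.txt
  sleep 5
done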

Executing multiple (wget) commands in Mac Terminal properly?

I'm trying to execute a long list of repetitive commands on Terminal.
The commands look like this:
wget 'http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/-180,52,9/1280x1280.png' -O '/Volumes/Alaya/XXXXXXXXX/Downloads/MapTiles/Tile (52.-180) 0.png' \
wget 'http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/-177,52,9/1280x1280.png' -O '/Volumes/Alaya/XXXXXXXXX/Downloads/MapTiles/Tile (52.-177) 1.png' \
If I copy the entire list into Terminal, it executes them all, but it seems to do so in such a rush that some only get partially downloaded and some are missed out entirely. It doesn't seem to take them one by one and wait until each has finished before attempting the next.
I tried putting the entire list into a shell script and running it, but then for some reason it seems to download everything yet only produces one file, and looking at the output, it seems to be trying to save each file under the same filename:
2014-03-29 09:56:31 (4.15 MB/s) - `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png' saved [28319/28319]
--2014-03-29 09:56:31-- http://%20%0Dwget/
Resolving \rwget... failed: nodename nor servname provided, or not known.
wget: unable to resolve host address ` \rwget'
--2014-03-29 09:56:31-- http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/171,52,9/1280x1280.png
Reusing existing connection to api.tiles.mapbox.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 33530 (33K) [image/jpeg]
Saving to: `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png'
100%[======================================>] 33,530 --.-K/s in 0.008s
2014-03-29 09:56:31 (3.90 MB/s) - `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png' saved [33530/33530]
--2014-03-29 09:56:31-- http://%20%0Dwget/
Resolving \rwget... failed: nodename nor servname provided, or not known.
wget: unable to resolve host address ` \rwget'
--2014-03-29 09:56:31-- http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/174,52,9/1280x1280.png
Reusing existing connection to api.tiles.mapbox.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 48906 (48K) [image/jpeg]
Saving to: `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png'
100%[======================================>] 48,906 --.-K/s in 0.01s
2014-03-29 09:56:31 (4.88 MB/s) - `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png' saved [48906/48906]
--2014-03-29 09:56:31-- http://%20%0Dwget/
Resolving \rwget... failed: nodename nor servname provided, or not known.
wget: unable to resolve host address ` \rwget'
--2014-03-29 09:56:31-- http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/177,52,9/1280x1280.png
Reusing existing connection to api.tiles.mapbox.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 45644 (45K) [image/jpeg]
Saving to: `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png'
100%[======================================>] 45,644 --.-K/s in 0.01s
2014-03-29 09:56:31 (4.36 MB/s) - `/Volumes/Alaya/XXXXXXXX/Downloads/MapTiles/Tile (52.180) 120.png' saved [45644/45644]
So it's saving every file to this name: Tile (52.180) 120.png
Note that it doesn't do this if I put in each command separately...so I don't understand why it's doing that.
Can someone tell me how to execute this list of commands so that it does each one properly?
Thanks!
Your file should look like this:
#!/bin/bash
wget -q 'http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/-180,52,9/1280x1280.png' -O 'a.png'
wget -q 'http://api.tiles.mapbox.com/v3/localstarlight.hl2o31b8/-177,52,9/1280x1280.png' -O 'b.png'
BUT... you have a backslash at the end of each wget line, which is a continuation character for long lines and which you don't need. Remove it.
Essentially you are asking wget to get a file and then another file called wget and then another file and then another file. Your script only does a single wget - the first one. All the other wget commands are seen as parameters to the first wget because of the continuation character.
You are doing this:
wget URL file wget URL file wget URL file
Quoting from the log you've posted:
http://%20%0Dwget/
This suggests that your script contains CR+LF line endings. Remove those before executing the script:
sed $'s/\r//' scriptname
or
tr -d '\r' < scriptname

Recursive wget won't work

I'm trying to crawl a local site with wget -r, but I'm unsuccessful: it just downloads the first page and doesn't go any deeper. By the way, I'm so unsuccessful that it doesn't work for whatever site I try... :)
I've tried various options but nothing better happens. Here's the command I thought I'd make it with:
wget -r -e robots=off --user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.79 Safari/537.4" --follow-tags=a,ref --debug http://rocky:8081/obix
Really, I've no clue. Whatever site or documentation I read about wget tells me that it should simply work with wget -r so I'm starting to think my wget is buggy (I'm on Fedora 16).
Any idea?
EDIT: Here's the output I'm getting for wget -r --follow-tags=ref,a http://rocky:8081/obix/:
wget -r --follow-tags=ref,a http://rocky:8081/obix/
--2012-10-19 09:29:51-- http://rocky:8081/obix/
Resolving rocky... 127.0.0.1
Connecting to rocky|127.0.0.1|:8081... connected.
HTTP request sent, awaiting response... 200 OK
Length: 792 [text/xml]
Saving to: “rocky:8081/obix/index.html”
100%[==============================================================================>] 792 --.-K/s in 0s
2012-10-19 09:29:51 (86,0 MB/s) - “rocky:8081/obix/index.html” saved [792/792]
FINISHED --2012-10-19 09:29:51-- Downloaded: 1 files, 792 in 0s (86,0 MB/s)
Usually there's no need to give the user-agent.
It should be sufficient to give:
wget -r http://stackoverflow.com/questions/12955253/recursive-wget-wont-work
To see why wget doesn't do what you want, look at the output it is giving you and post it here.
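If the normal output doesn't show anything obvious, one way to dig deeper (reusing the --debug idea already in your original command) is to capture the debug trace to a file and read through which URLs wget considers; a rough sketch (wget.log is just an example name):
# -d/--debug writes a verbose trace to stderr
wget -r -d http://rocky:8081/obix/ 2> wget.log
less wget.log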
