I'm using wget inside Python to test internet speed. My goal is to track the throughput over the course of the download, so I need to know the speed in Mbps at least once per second while it runs.
If I manually run script and then wget I get the desired output ...
--2022-06-20 04:14:13-- https://speed.hetzner.de/100MB.bin
Resolving speed.hetzner.de (speed.hetzner.de)... 88.198.248.254
Connecting to speed.hetzner.de (speed.hetzner.de)|88.198.248.254|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: ‘100MB.bin.9’
100MB.bin.9 0%[ ] 0 --.-KB/s
100MB.bin.9 0%[ ] 119.69K 516KB/s
100MB.bin.9 0%[ ] 231.69K 488KB/s
100MB.bin.9 0%[ ] 343.69K 494KB/s
100MB.bin.9 0%[ ] 423.69K 447KB/s
100MB.bin.9 0%[ ] 519.69K 431KB/s
But if I run wget <address> -o wget.log I get the following...
Resolving speed.hetzner.de (speed.hetzner.de)... 88.198.248.254
Connecting to speed.hetzner.de (speed.hetzner.de)|88.198.248.254|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: ‘100MB.bin.7’
0K .......... .......... .......... .......... .......... 0% 1.22M 82s
50K .......... .......... .......... .......... .......... 0% 1.68M 71s
100K .......... .......... .......... .......... .......... 0% 1.70M 67s
150K .......... .......... .......... .......... .......... 0% 8.41M 53s
200K .......... .......... .......... .......... .......... 0% 3.89M 47s
(snip)
102250K .......... .......... .......... .......... .......... 99% 3.45M 0s
102300K .......... .......... .......... .......... .......... 99% 2.86M 0s
102350K .......... .......... .......... .......... ..........100% 3.22M 0s
102400K 100% 0.00 =30s
2022-06-20 03:44:11 (3.37 MB/s) - ‘100MB.bin.7’ saved [104857600/104857600]
What exactly does each column mean?
For example in the lines...
50K .......... .......... .......... .......... .......... 0% 1.68M 71s
100K .......... .......... .......... .......... .......... 0% 1.70M 67s
Does the 1.68M mean that the first 50 Kilo(bytes?) of data was downloaded at 1.68Mega(bits?) per second, and the 1.70M means the next 50K was at 1.70Mbps?
GNU wget has two distinct ways of displaying progress: bar (the "thermometer") and dot. The bar is used by default when the output is a TTY, as in your first example. Writing to a file is non-TTY, so wget falls back to dots. If you want the first style written to a file, you need to request the bar explicitly, that is
wget --progress=bar:force <address> -o wget.log
For a more detailed description, see --progress in the wget man page.
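On the column question: as far as I know, in the default dot style each dot stands for 1K of received data and a row holds 50 dots, so the leading number is the offset at the start of the row. The rate column is reported in bytes per second for that row unless --report-speed=bits is given, so 1.68M would be roughly 1.68 MB/s rather than 1.68 Mbit/s. Under those assumptions, a minimal Python sketch for turning one log row into Mbit/s could look like this (the names parse_dot_row, ROW and UNIT are made up for illustration):

```python
import re

# Multipliers for the size suffixes, assuming binary units (1K = 1024 bytes).
UNIT = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# One dot-style row: offset, dot clusters, percent, then the rate column.
ROW = re.compile(
    r"^\s*(?P<offset>\d+)K[ .]*"
    r"(?P<percent>\d+)%\s+"
    r"(?P<rate>\d+(?:\.\d+)?)(?P<unit>[KMG]?)"
)


def parse_dot_row(line):
    """Return (byte_offset, percent, mbit_per_s), or None for non-progress lines."""
    m = ROW.match(line)
    if m is None:
        return None
    # Assumption: the rate column is in bytes per second.
    bytes_per_s = float(m["rate"]) * UNIT[m["unit"]]
    mbit_per_s = bytes_per_s * 8 / 1_000_000
    return int(m["offset"]) * 1024, int(m["percent"]), mbit_per_s
```

Feeding each new line of wget.log through parse_dot_row gives one sample per 50K row. Rows do not arrive at fixed time intervals, so per-second figures would still need to be derived from timestamps taken while reading the log.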
Related
Using wget is it possible to spider a host for a specific file type? I'm archiving some documents from an FTP and I need to have it crawl the entire host only downloading .txt files.
I've attempted like so:
wget mysite.com/ftplist --config=./.wgetrc
With the following .wgetrc:
accept = txt
check_certificate = off
connect_timeout = 3
cookies = off
dns_cache = off
follow_ftp = on
logfile = amz.log
max_redirect = 3
no_clobber = on
recursive = on
save_headers = on
This will make a call to mysite.com/ftplist. This page contains a list of ftp:// URLs. wget makes a request to this page but won't proceed any further and seems to stop on that page.
Here is the amz.log
Saving to: ‘mysite.com/ftplinks/index.html.tmp’
0K .......... .......... .......... .......... .......... 656K
50K .......... .......... .......... .......... .......... 741K
100K .......... .......... .......... .......... .......... 1.12M
150K .......... .......... .......... .......... .......... 975K
200K .......... .......... .......... .......... .......... 935K
250K .......... .......... .......... .......... .......... 835K
300K .......... .......... .......... .......... .......... 870K
350K .......... .......... .......... .......... .......... 1.07M
400K .......... .......... .......... ....... 907K=0.5s
2018-12-20 17:55:54 (881 KB/s) - ‘mysite.com/ftplinks/index.html.tmp’ saved [447555]
Removing mysite.com/ftplinks/index.html.tmp since it should be rejected.
Am I missing something?
The typical output recorded into chk file from the command:
wget -O - http://website/file > /dev/null 2>chk &
is something like :
0K .......... .......... .......... .......... .......... 0% 143K 62s
50K .......... .......... .......... .......... .......... 1% 433K 41s
100K .......... .......... .......... .......... .......... 1% 1.20M 30s
150K .......... .......... .......... .......... .......... 2% 259K 31s
200K .......... .......... .......... .......... .......... 2% 83.2M 24s
...
8800K .......... .......... .......... .......... .......... 98% 260K 1s
8850K .......... .......... .......... .......... .......... 98% 329K 0s
8900K .......... .......... .......... .......... .......... 99% 433K 0s
8950K .......... .......... .......... .......... ......... 100% 331K=31s
2017-01-13 13:16:59 (288 KB/s) - written to stdout [9215609/9215609]
The file is updated, line after line, during the whole download process.
Well, I need to get only the percentage: 0, 1, 2 ... 99 and nothing more.
The following command does the job, even if not perfectly:
tail -n 5 chk | tail -n 1 | colrm 1 63 | cut -d '%' -f 1
The problem arises when I need to do the same into a bash script, as in the following:
#!/bin/bash
# Test script for getting the percentage number from 'wget' output
i=0
wget -O - http://website/file > /dev/null 2>chk &
sleep 1
while (( $i < 90 ))
do
i=`tail -n 5 chk | tail -n 1 | colrm 1 63 | cut -d '%' -f 1`
echo $i
done
The script starts getting the wanted file, it writes out the chk file, but stops with the error message:
line 9: ((: < 90 : syntax error: operand expected (error token is "< 90 ")
I have tried using [[ ]], quotes... but it doesn't work.
Any ideas on how to do this better?
Progress bar with wget, whiptail and GNU sed:
wget --progress=dot 'URL' 2>&1 | sed -un 's/.* \([0-9]\+\)% .*/\1/p' | whiptail --gauge "Download" 7 50 0
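The error message in the question suggests that i was empty when the test ran, which happens whenever the last line of chk does not yet contain a percentage (for example while wget is still printing its header lines). A hedged alternative is to search the log for the last percentage and fall back to 0 when there is none. The sketch below does that in Python; last_percent and wait_until are hypothetical helper names, and the chk path is the one from the question:

```python
import re
import time

PERCENT = re.compile(r"(\d+)%")


def last_percent(log_text):
    """Return the most recent percentage in dot-style output, or 0 if none yet."""
    matches = PERCENT.findall(log_text)
    return int(matches[-1]) if matches else 0


def wait_until(path, target=90, interval=1.0):
    """Poll the log file and print the percentage until it reaches target."""
    while True:
        with open(path, errors="replace") as f:
            percent = last_percent(f.read())
        print(percent)
        if percent >= target:
            return percent
        time.sleep(interval)
```

Calling wait_until("chk") after starting wget in the background would print 0 until the first progress row appears. The loop has no timeout, so a failed download would need extra handling.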
For example, when using wget
$ wget https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz
The output looks like
--2016-03-05 20:01:58-- https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz
Resolving pypi.python.org (pypi.python.org)... 199.27.74.223
Connecting to pypi.python.org (pypi.python.org)|199.27.74.223|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 544247 (531K) [application/octet-stream]
Saving to: ‘Flask-0.10.1.tar.gz’
100%[====================================================================================================================================================================================================>] 544,247 2.38MB/s in 0.2s
2016-03-05 20:01:59 (2.38 MB/s) - ‘Flask-0.10.1.tar.gz’ saved [544247/544247]
But when I redirect it to a log
$ wget https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz &> tmp.log
$ cat tmp.log
--2016-03-05 20:02:54-- https://pypi.python.org/packages/source/F/Flask/Flask-0.10.1.tar.gz
Resolving pypi.python.org (pypi.python.org)... 199.27.74.223
Connecting to pypi.python.org (pypi.python.org)|199.27.74.223|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 544247 (531K) [application/octet-stream]
Saving to: ‘Flask-0.10.1.tar.gz.3’
0K .......... .......... .......... .......... .......... 9% 443K 1s
50K .......... .......... .......... .......... .......... 18% 409K 1s
100K .......... .......... .......... .......... .......... 28% 433K 1s
150K .......... .......... .......... .......... .......... 37% 374K 1s
200K .......... .......... .......... .......... .......... 47% 374K 1s
250K .......... .......... .......... .......... .......... 56% 338K 1s
300K .......... .......... .......... .......... .......... 65% 337K 0s
350K .......... .......... .......... .......... .......... 75% 241K 0s
400K .......... .......... .......... .......... .......... 84% 346K 0s
450K .......... .......... .......... .......... .......... 94% 384K 0s
500K .......... .......... .......... . 100% 693K=1.4s
2016-03-05 20:02:55 (369 KB/s) - ‘Flask-0.10.1.tar.gz.3’ saved [544247/544247]
I am curious: what happens when the output is written to the screen?
How is the incremental appearance of those equals signs made possible, and where
do they go when the output is redirected to a log?
wget calls isatty() on stderr to decide whether or not to display the incremental equals-sign progress bar for the download. On a terminal, it can send control characters to return to the start of the line and rewrite it, which is what makes the bar appear to grow in place. When stderr is not a terminal (a file or a pipe), rewriting earlier output in place is not meaningful, so wget falls back to the dot-style output, which only ever appends.
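The same decision can be illustrated outside wget. The following is only a sketch of the general technique, not wget's actual code; render_progress and report are hypothetical names:

```python
import sys


def render_progress(percent, is_tty, width=20):
    """Return the text to write for one progress update."""
    if is_tty:
        filled = width * percent // 100
        # '\r' moves the cursor back to the start of the line,
        # so the next write redraws the bar in place.
        return "\r{:3d}%[{}{}]".format(percent, "=" * filled, " " * (width - filled))
    # Not a terminal: earlier output cannot be erased, so append a plain line.
    return "{:3d}%\n".format(percent)


def report(percent, stream=sys.stderr):
    """Write one update, choosing the style from the stream's isatty()."""
    stream.write(render_progress(percent, stream.isatty()))
    stream.flush()
```

Running a loop over report(...) in a terminal shows a single growing bar, while the same loop with stderr redirected to a file produces one line per update.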
I have this text file (wget.log):
1400K .......... .......... .......... .......... .......... 5% 78.5K 4m10s
1450K .......... .......... .......... .......... .......... 5% 46.6K 4m19s
1500K .......... .......... .......... .......... .......... 5% 105K 4m17s
1550K .......... .......... .......... .......... .......... 6% 63.0K 4m21s
1600K ..........
I just want to replace the .......... in the last 3 lines, so I tried this command:
tail -n 3 /www/wget.log | sed 's/. /=>/g'
but it won't replace anything. I want the output to be like this:
1500K => 5% 105K 4m17s
1550K => 5% 105K 4m17s
1600K =>
How should I do that?
You can use:
tail -n 3 /www/wget.log | sed -r 's/(\.+ *)+/=> /'
1500K => 5% 105K 4m17s
1550K => 6% 63.0K 4m21s
1600K =>
On OSX use:
tail -n 3 /www/wget.log | sed -E 's/(\.+ *)+/=> /'
You could try the below sed command.
$ tail -n 3 /www/wget.log | sed 's/ \..*\.\( \|$\)/ => /g'
1500K => 5% 105K 4m17s
1550K => 6% 63.0K 4m21s
1600K =>
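If the log is being post-processed from a script rather than a shell pipeline, the same substitution can be sketched in Python. This mirrors the first sed command by replacing only the first run of dot clusters on each line; collapse_dots and last_lines are illustrative names, not part of any library:

```python
import re

# One or more clusters of dots, each optionally followed by spaces.
DOTS = re.compile(r"(?:\.+ *)+")


def collapse_dots(line):
    """Replace the first run of dot clusters with '=> ' and trim trailing space."""
    return DOTS.sub("=> ", line, count=1).rstrip()


def last_lines(text, n=3):
    """Return the last n lines of text with their dot clusters collapsed."""
    return [collapse_dots(line) for line in text.splitlines()[-n:]]
```

Because count=1 limits the substitution to the first match, the dot inside a rate such as 63.0K is left alone.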
I have made a script that I am going to call using the Windows Task Scheduler to back up a Ruby on Rails app I have made.
When I call the command normally in a command window, the output looks like this
C:\Users\admin\Desktop\app>heroku db:pull --confirm app
Loaded Taps v0.3.23
Auto-detected local database: postgres://db:pass#127.0.0.1/app?encoding=utf8
Warning: Data in the database 'postgres://db:pass#127.0.0.1/app?encoding=utf8' will be overwritten and
will not be recoverable.
Receiving schema
Schema: 0% | | ETA: --:--:--
Schema: 20% |======== | ETA: 00:00:21
Schema: 40% |================ | ETA: 00:00:18
Schema: 60% |========================= | ETA: 00:00:12
Schema: 80% |================================= | ETA: 00:00:05
Schema: 100% |==========================================| Time: 00:00:29
Receiving indexes
schema_migrat: 0% | | ETA: --:--:--
schema_migrat: 100% |==========================================| Time: 00:00:05
users: 0% | | ETA: --:--:--
users: 50% |===================== | ETA: 00:00:05
users: 100% |==========================================| Time: 00:00:10
Receiving data
5 tables, 1,000 records
table1: 100% |==========================================| Time: 00:00:00
table2: 100% |==========================================| Time: 00:00:00
table3: 100% |==========================================| Time: 00:00:00
table4: 100% |==========================================| Time: 00:00:00
table5: 100% |==========================================| Time: 00:00:01
Resetting sequences
Here is my .bat:
heroku db:pull --confirm app >> log.txt
If I run this twice, this is the output that goes into a file, log.txt
Loaded Taps v0.3.23
Auto-detected local database: postgres://db:pass#127.0.0.1/webapp_development?encoding=utf8
Warning: Data in the database 'postgres://db:pass#127.0.0.1/webapp_development?encoding=utf8' will be overwritten and will not be recoverable.
Receiving schema
Receiving indexes
Receiving data
5 tables, 1,000 records
Resetting sequences
Loaded Taps v0.3.23
Auto-detected local database: postgres://db:pass#127.0.0.1/webapp_development?encoding=utf8
Warning: Data in the database 'postgres://db:pass#127.0.0.1/webapp_development?encoding=utf8' will be overwritten and will not be recoverable.
Receiving schema
Receiving indexes
Receiving data
5 tables, 1,000 records
Resetting sequences
Is there any way to include the exact console output, and also include dates and times of when the script was run? Thanks in advance.
UPDATE:
Start: 19/10/2012 12:08:04.90
Schema: 0% | | ETA: --:--:--Schema: 20% |======== | ETA: 00:00:24Schema: 40% |================ | ETA: 00:00:18Schema: 60% |========================= | ETA: 00:00:13Schema: 80% |================================= | ETA: 00:00:06Schema: 100% |==========================================| ETA: 00:00:00Schema: 100% |==========================================| Time: 00:00:32
schema_migrat: 0% | | ETA: --:--:--schema_migrat: 100% |==========================================| ETA: 00:00:00schema_migrat: 100% |==========================================| Time: 00:00:05
users: 0% | | ETA: --:--:--users: 50% |===================== | ETA: 00:00:05users: 100% |==========================================| ETA: 00:00:00users: 100% |==========================================| Time: 00:00:08
schema_migrat: 0% | | ETA: --:--:--schema_migrat: 7% |== | ETA: 00:00:06schema_migrat: 100% |==========================================| Time: 00:00:00
users: 0% | | ETA: --:--:--users: 3% |= | ETA: 00:00:11users: 100% |==========================================| Time: 00:00:00
projecttechno: 0% | | ETA: --:--:--projecttechno: 6% |== | ETA: 00:00:05projecttechno: 100% |==========================================| Time: 00:00:00
technols: 0% | | ETA: --:--:--technols: 7% |== | ETA: 00:00:05technols: 100% |==========================================| Time: 00:00:00
projects: 0% | | ETA: --:--:--projects: 1% | | ETA: 00:00:54projects: 100% |==========================================| Time: 00:00:00
Loaded Taps v0.3.23
Auto-detected local database: postgres://postgres:a#127.0.0.1/webapp_development?encoding=utf8
Warning: Data in the database 'postgres://postgres:a#127.0.0.1/webapp_development?encoding=utf8' will be overwritten and will not be recoverable.
Receiving schema
Receiving indexes
Receiving data
5 tables, 1,000 records
Resetting sequences
Adding the date and time is easy using %date% and %time%.
You could also redirect stderr (2) to stdout (&1); that may capture the missing output.
echo Start: %date% %time% >>log.txt
heroku db:pull --confirm app >>log.txt 2>&1
echo Stop: %date% %time% >>log.txt
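The log in the UPDATE shows each bar's redraws run together on one line, which is what happens when a progress bar that redraws itself with carriage returns is captured to a file. If that matters, one option is a small wrapper that merges stderr into stdout, splits the redraws into separate lines, and adds the timestamps. The sketch below assumes Python is available on the machine; run_and_log is a hypothetical helper, and the exact command list for heroku on Windows is left as an assumption:

```python
import subprocess
from datetime import datetime


def run_and_log(cmd, log_path):
    """Append timestamps and the command's combined stdout/stderr to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("Start: {}\n".format(datetime.now().isoformat(sep=" ")))
        # stderr=STDOUT plays the role of 2>&1 in the batch file.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        text = result.stdout.decode("utf-8", errors="replace")
        # Progress bars redraw with '\r'; give each redraw its own line.
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        log.write(text)
        if not text.endswith("\n"):
            log.write("\n")
        log.write("Stop: {}\n".format(datetime.now().isoformat(sep=" ")))
    return result.returncode
```

A call such as run_and_log(["heroku", "db:pull", "--confirm", "app"], "log.txt") would be the rough equivalent of the three batch lines above, although on Windows the heroku launcher may need to be invoked differently.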