Removing progress bar from program output redirected into log file - bash

I was running a program that outputs a progress bar, and I invoked it like this:
python train.py |& tee train.log
The train.log looks like the following.
This is line 1
Training ...
This is line 2
...
[000] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:24<00:00, 2.98 track/s]
[000] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:25<00:00, 3.02 track/s]
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
This is line 3
...
[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2629/2631 [15:11<00:00, 2.90
[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:11<00:00, 2.89
[001] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:12<00:00, 2.88
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days
On the terminal these lines are meant to overwrite each other in place, so the log file ends up with a lot of repetitions. When I ran wc -l train.log, it reported only 3 lines, yet when I opened this 5 MB file in a text editor it showed around 20000 lines.
My objective is to only get these details:
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days
My questions are:
How do I, without stopping my current training run, extract my desired details from the supposedly "3" lines of train.log? Keep in mind that this training will continue for 10 more epochs, so I don't want to open the whole pile of progress-bar junk in an editor.
In the future, how should I store my log file (instead of calling python train.py |& tee train.log) such that while I can see the progress bar in the terminal, I only keep the important information in a text file?
Edit 1:
Here's a link to the file train.log

The progress bars are probably written to stderr, which you send to tee together with stdout by using |&.
To write only stdout to the file, use the normal pipe | instead.
The progress bar was generated by writing a line ending with a carriage return character (\r) but no newline character (\n). To fix that so you can process the file further, you can for example convert the carriage returns to newlines with sed 's/\r/\n/g'.
The following works with the file linked in the question:
$ sed 's/\r/\n/g' train.log | grep Epoch
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
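For the second question, a sketch of future invocations (assuming, as above, that the progress bars go to stderr): use a plain pipe so stderr stays on the terminal, and optionally filter what reaches the file. Note that --line-buffered is a GNU grep option, and python -u is only needed if Python block-buffers its stdout when piped.

# progress bars (stderr) stay on the terminal; stdout goes to terminal and file
python train.py | tee train.log

# everything is still shown live, but only the Epoch summary lines reach the file
python -u train.py | tee /dev/tty | grep --line-buffered '^Epoch' > train.log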

Ok, I solved it already.
According to this question,
You make a progress bar by doing echo -ne "your text \r" > log.file.
So because some editors that I used (Notepad, Sublime Text 3) recognise \r as a line break, you see them as separate lines, but in actual fact they are stored on a single line.
So to reverse it, you can turn the carriage returns into actual line breaks with sed -i "s,\r,\n,g" train.log, and then grep accordingly.
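To see why wc -l and the editors disagree, here is a minimal sketch of how such output ends up stored (demo.log is just an illustrative file):

# three updates end in \r, only the final summary ends in \n
for i in 1 2 3; do printf 'progress %d/3\r' "$i"; done > demo.log
printf 'done\n' >> demo.log

wc -l demo.log             # reports 1, because wc -l counts \n characters only
sed 's/\r/\n/g' demo.log   # splits the updates back onto separate lines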
Anyhoo, thanks @mkrieger1 for helping me out anyway!

Related

Adding Line Breaks in QR Code / 2D Barcode

In Windows 10, I am using qrencode for printing QR codes to image files. The printing is successful but I am stuck at adding line breaks. I have tried the method below, with no success in getting the line breaks.
Windows Command Prompt:
d:\> qrencode -o qrcode.png "INDO GERMAN ALKALOIDS \nUnique ID: ABC-123456789 \nAPI-Name: ABCDEFGH \nBrand: Indo-101 \nAddress: Inga House, Mahakali Road, Andheri-East, Mumbai-400093, \nTel-022-28202932/33, \nMobile: 9833942075, \nBatch No.: XYZ888999000, \nBatch Size: 1020, \nMfgd.Date: 29-12-2022, \nExpiry Date: 31-12-2023, \nContainer Code: RRR-101020, \nMfgr Lic.No.: ------------, \nStorage Instruction: Store in cool area 20deg"
After playing around for some time, I came across the below command in the Ubuntu Manuals:
cat bigfile.txt | qrencode -S -v 40 -l L -o output.png
I placed the required content as below in a text file named qr-data.txt
INDO GERMAN ALKALOIDS
Unique ID: ABC-123456789
API-Name: ABCDEFGH
Brand: Indo-101
Then at the DOS prompt I typed:
type qr-data.txt | qrencode -o qr-code.png
It now works perfectly for me.
Note that I used [type] instead of [cat] because this is the Windows command prompt, not a Unix shell.
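A sketch of the same thing without the intermediate file: in a POSIX shell, printf expands the \n escapes into real newlines before qrencode reads them from stdin (shortened here to the first few fields):

printf 'INDO GERMAN ALKALOIDS\nUnique ID: ABC-123456789\nAPI-Name: ABCDEFGH\nBrand: Indo-101\n' | qrencode -o qr-code.png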

How to consistently record command output in a variable in UNIX

In some cases what I see in the console output is different from what gets recorded after redirection. I see this on Linux/bash, but this example is ksh/OpenBSD. Is there a way around this?
For example:
# pfctl -ttable -Ttest 123.123.123.123 > result.txt
0/1 addresses match.
# more result.txt
result.txt (END)
In other words, "0/1 addresses match." is printed on the console, but I cannot for the life of me get it into a file, a variable, or anything else. I've used $() and >, which work for most commands, but every now and then there is a command that spits out stuff on the screen and yet I get nothing via the redirect/pipe. I hope someone can shed light on this peculiarity.
So again contrast this:
# OUTP=$(pfctl -tscanners -Ttest 123.123.123.123)
0/1 addresses match.
# echo $OUTP
#
(nothing echoing, the variable does not hold the console output) with this:
# OUTP=$(date)
# echo $OUTP
Sun Aug 21 08:33:37 PDT 2016
#
(the variable contains the entire console output)
Thanks again for any help.
Your command has 2 different output streams.
You need to redirect the second stream (stderr, file descriptor 2) into the first (stdout, file descriptor 1):
pfctl -ttable -Ttest 123.123.123.123 > result.txt 2>&1
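The same redirection works inside a command substitution, which is what the variable case in the question needs (a sketch using the command from the question):

OUTP=$(pfctl -ttable -Ttest 123.123.123.123 2>&1)
echo "$OUTP"
# 0/1 addresses match.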

How to resume reading a file?

I'm trying to find the best and most efficient way to resume reading a file from a given point.
The given file is being written frequently (this is a log file).
This file is rotated on a daily basis.
In the log file I'm looking for the pattern 'slow transaction'. Such lines end with a number in parentheses. I want the sum of those numbers.
Example of log line:
Jun 24 2015 10:00:00 slow transaction (5)
Jun 24 2015 10:00:06 slow transaction (1)
That's the easy part, which I can do with an awk command to get a total of 6 for the example above.
Now my challenge is that I want to get the values from this file on a regular basis. I've an external system that polls a custom OID using SNMP. When hitting this OID the Linux host runs a couple of basic commands.
I want this SNMP polling event to get the number of events since the last polling only. I don't want to have the total every time, just the total of the newly added lines.
Just to mention that only bash and basic commands such as awk, sed, tail etc. can be used. No Perl or other higher-level programming language.
I hope my description is clear enough. Apologies if this is a duplicate; I did some research before posting but did not find anything that precisely corresponds to my need.
Thank you for any assistance
In addition to the methods in the comment link, you can also simply use stat to read the logfile size, save it, sleep 300, then check the logfile size again. If the file size has changed, skip over the old information with dd and read only the new information.
Note: you can add a test to handle the case where the logfile is rotated and restarts at size 0 (e.g. if ((newsize < size)); then read the whole file).
Here is a short example with 5 minute intervals:
#!/bin/bash

lfn=${1:-/path/to/logfile}

size=$(stat -c "%s" "$lfn")                ## save original log size

while :; do
    newsize=$(stat -c "%s" "$lfn")         ## get new log size
    if ((size != newsize)); then           ## if it changed, use the new info
        ## use dd to skip over the existing text to the new text
        newtext=$(dd if="$lfn" bs="$size" skip=1 2>/dev/null)
        ## process newtext however you need
        printf "\nnewtext:\n\n%s\n" "$newtext"
        size=$newsize                      ## update size to newsize
    fi
    sleep 300
done
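For the "process newtext" step, a sketch of the awk sum described in the question, assuming the parenthesised number is always the last field of a 'slow transaction' line:

## sum the numbers in parentheses on the newly added lines only
printf '%s\n' "$newtext" |
    awk '/slow transaction/ { n = $NF; gsub(/[()]/, "", n); sum += n }
         END { print sum + 0 }'

For the rotation case mentioned in the note above, resetting size to 0 alone is not quite enough, since dd would then be called with bs=0; one option is to test ((newsize < size)) and re-read the whole file with cat "$lfn" instead of dd for that iteration.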

Bash scripting: date -d won't accept my string format of hhmmss. I need a workaround

I need a Bash script to accept 1 argument representing a time in hhmmss format, and from that derive a second time 3 minutes before that.
I've been trying to use date -d:
#! /bin/bash
DATE=`date +%Y%m%d`
TIME=$1
NEWTIME=`date -d "$DATE $TIME - 3 minutes" +%H%M%S`
echo $NEWTIME
In action:
$ ./myscript.sh 123456
invalid date `20141022 123456 - 3 minutes'
It seems the problem is with the 6-character time format, because 4 characters (e.g. 1234) work. The subtraction of the 3 minutes is not the problem, because I get the same error when I remove it.
It has occurred to me I could parse the time into a more palatable format before sending it to date. I tried inserting delimiters by adding this line:
TIME=${TIME:0:2}:${TIME:2:2}:${TIME:4:2}
It accepted that format but the answer to the - 3 minutes part was inexplicably very wrong (it subtracted 2 hours and 1 minute):
$ ./myscript.sh 123456
103356
Vexing.
It has also occurred to me that I might be able to provide date with an input format, like strptime which I'm familiar with from Python. I've found references to strptime in the context of Bash but I've been unable to get it to do anything.
Does anyone have any suggestions on getting the hhmmss time-string to work? Any help is much appreciated.
FYI: I'm trying to avoid changing the 6 character input format because that would involve changing other scripts as well as getting certain human users to alter long-entrenched habits. I'm also trying to avoid outsourcing this task to another language. (I could easily do this in Python). I want a Bash solution to this problem, if there is one.
TIME=093000
TIME=${TIME:0:2}:${TIME:2:2}:${TIME:4:2} # your line
date -d "2014-10-20 $TIME 3 mins ago" +%H%M%S
Output:
092700
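Folded back into the original script, a sketch of the whole thing (using an ISO date, as in the working example above, rather than the original %Y%m%d):

#!/bin/bash
# usage: ./myscript.sh hhmmss    e.g. ./myscript.sh 123456
TIME=$1
TIME=${TIME:0:2}:${TIME:2:2}:${TIME:4:2}   # 123456 -> 12:34:56
DATE=$(date +%Y-%m-%d)
NEWTIME=$(date -d "$DATE $TIME 3 mins ago" +%H%M%S)
echo "$NEWTIME"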

Hex Dump Specific Parts of File - Bash

I'm trying to write a bash script to audit hard drives that have been wiped to ensure the wiping system is working properly. I would like to find a way to hex dump specific parts of a drive without having to hex dump the entire drive and extract the parts I'd like (as this seems to run for too long to make the script worth writing). Ideally, I'd be able to grab parts from the beginning, middle, and end of the drive.
I would like to take the output of the hex dump and check it for the existence of only one character (indicating the drive has been successfully wiped). This part, I can handle, but I thought it may affect any advice I may get.
I've used head piped into xxd to get the beginning of the file which has worked, but I'm still stuck on the other parts. I've tried using tail to just get the end of the drive, but that doesn't seem to work quickly either. Is it possible to do this efficiently? Possibly using dd or something else and pipe it into a hex editor? I've looked through options for xxd as well as hexdump to no avail. If someone could point me in the right direction, it would be greatly appreciated!
xxd has options to skip a ways into the file (-s) and dump a limited length (-l). If you use its plain hex (-p) option, you may be able to use grep to find any anomalies:
$ xxd -s 8192 -l 256 -p /dev/disk3s2 | grep '[^0]'
000000010000000000000000000000000000000000000000000000000000
000000000000000000000000300000000000000800000000000000000000
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
od has similar skip (-j) and limit length (-N). Similarly, dd has skip= and count= (although these are counted in blocks, not bytes; you can change the block size with bs=).
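For the "middle of the drive" part, a sketch that computes the midpoint from the device size (it assumes a Linux block device, here the hypothetical /dev/sdX, and blockdev from util-linux):

size=$(blockdev --getsize64 /dev/sdX)    # device size in bytes
mid=$(( size / 2 ))
# dump 4 KiB at the midpoint; any grep output means a non-zero byte was found
xxd -s "$mid" -l 4096 -p /dev/sdX | grep '[^0]' || echo "middle 4 KiB is all zeroes"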
EDIT: Since xxd -p is giving weird results (not stopping at what should be the end of the device), I'd recommend running some tests to figure out what's going on. First, back up anything important on the computer, because if something is weird at the device access level, it's possible that some of these tests might overwrite something unexpected, possibly even on another disk.
Next, try dumping to the end of the device with different tools, and see if they all behave the same way:
xxd -s 65451982336 /dev/sdb | more # This *should* dump 512 bytes (32 lines) then stop, but apparently keeps going
od -xv -j 65451982336 /dev/sdb | more # This also *should* dump 512 bytes then stop
dd if=/dev/sdb skip=127835903 | xxd | more # This again should do the same thing (note that the skip value is in 512-byte blocks)
Do the other tools read past what fdisk reports as the end of the disk? If all three read more data, I'm going with the "fdisk is wrong/misleading" answer. You can test further by writing some nonzero data past the "end" and seeing what the results are:
dd if=/dev/random of=/dev/sdb seek=127835903 count=2
...then repeat the various dump commands. If they show two blocks (=64 lines) of random data followed by zeroes, I'm pretty sure the device is bigger than you think it is.
I am not near my shell, but something along these lines should get you started:
dd if=/dev/hda1 | hexdump -v -e '/1 "%02x\n"' | grep -v '^00$'
will print every non-zero byte, one per line.
dd if=/dev/hda1 | od -x -j100
will give you a hexadecimal dump with offsets, starting 100 bytes in.
