How to consistently record command output in a variable in UNIX - shell

In some cases, what I see in console output is different from what gets recorded after redirection. I see this on Linux/bash, but this example is ksh/OpenBSD. Is there a way around this?
For example:
# pfctl -ttable -Ttest 123.123.123.123 > result.txt
0/1 addresses match.
# more result.txt
result.txt (END)
In other words, the "0/1 addresses match." is printed on the console, but I cannot for the life of me get it into a file, a variable, or anything else. I've used $() and >, which work for most commands, but every now and then there is a command that prints to the screen yet I get nothing via the redirect/pipe. I hope someone can shed light on this peculiarity.
So again contrast this:
# OUTP=$(pfctl -tscanners -Ttest 123.123.123.123)
0/1 addresses match.
# echo $OUTP
#
(nothing echoing, the variable does not hold the console output) with this:
# OUTP=$(date)
# echo $OUTP
Sun Aug 21 08:33:37 PDT 2016
#
(the variable contains the entire console output)
Thanks again for any help.

Your command writes to two different output streams.
You need to redirect the second (stderr) into the first (stdout):
pfctl -ttable -Ttest 123.123.123.123 > result.txt 2>&1
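The same fix applies when capturing into a variable: redirect stderr inside the command substitution. For example, with the command from the question:
# OUTP=$(pfctl -tscanners -Ttest 123.123.123.123 2>&1)
# echo "$OUTP"
0/1 addresses match.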

Related

Removing progress bar from program output redirected into log file

I was running a program that outputs a progress bar, and I captured its output like this:
python train.py |& tee train.log
The train.log looks like the following.
This is line 1
Training ...
This is line 2
...
[000] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:24<00:00, 2.98 track/s]
[000] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:25<00:00, 3.02 track/s]
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
This is line 3
...
[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2629/2631 [15:11<00:00, 2.90
[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:11<00:00, 2.89
[001] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:12<00:00, 2.88
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days
On the terminal, these lines are meant to overwrite themselves in place, so the log file contains a lot of repetition. When I ran wc -l train.log, it returned only 3 lines; however, when I opened this 5 MB text file in a text editor, there were around 20000 lines.
My objective is to only get these details:
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days
My questions are:
How do I, without stopping my current training progress, extract my desired details from the supposedly "3" lines of train.log? Keep in mind that the training will keep running for 10 more epochs, so I don't want to open the whole mass of progress-bar junk in an editor.
In the future, how should I store my log file (instead of calling python train.py |& tee train.log) such that while I can see the progress bar in the terminal, I only keep the important information in a text file?
Edit 1 :
Here's a link to the file train.log
The progress bars are probably written to stderr, which you send to tee together with stdout by using |&.
To write only stdout to the file, use the normal pipe | instead.
The progress bar was generated by writing one line and then a carriage return character (\r) but no newline character (\n). To fix that and to be able to process the file further, you can use for example sed 's/\r/\n/g'.
The following works with the file linked in the question:
$ sed 's/\r/\n/g' train.log | grep Epoch
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
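For the second question: because the progress bars go to stderr, the plain pipe separates them naturally, so the bar still reaches the terminal while only stdout lands in the file. A sketch (the second variant assumes GNU grep, whose --line-buffered option keeps the output flowing as lines arrive):
python train.py | tee train.log
python train.py | grep --line-buffered Epoch | tee train.log
The second form stores only the Epoch summaries in train.log while the progress bar remains visible live.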
Ok, I solved it already.
According to this question,
You make a progress bar by doing echo -ne "your text \r" > log.file.
Because some editors I used (Notepad, Sublime Text 3) recognise \r as a line break, you see them as separate lines, but in actual fact they are stored on a single line.
To reverse it, you can turn them into actual line breaks with sed -i "s,\r,\n,g" train.log, and then grep accordingly.
Anyhoo, thanks @mkrieger1 for helping me out anyway!

Bash commands putting out extra information which results into issues with scripts

Okay, hopefully I can explain this correctly as I have no idea what's causing this or how to resolve this.
For some reason, bash commands (on a CentOS 6.x server) are displaying more information than they normally do, and that causes issues with certain scripts. I have no clue if there is a name for this, but hopefully someone knows a solution.
First example.
Correct / good server:
[root@goodserver ~]# vzctl enter 3567
entered into CT 3567
[root@example /]#
(this is the correct behaviour)
Incorrect / bad server:
[root@badserver /]# vzctl enter 3127
Entering CT
entered into CT 3127
Open /dev/pts/0
[root@example /]#
With the "bad" server it will display more information as usual, like:
Entering CT
Open /dev/pts/0
It's as if it's printing extra information about what it's doing.
Of course, the above is purely cosmetic, but with several bash scripts we use, this extra output causes real problems.
Part of the script uses the following command (there are more, but this is mainly an example of what's wrong):
DOMAIN=`vzctl exec $VEID 'hostname -d'`
The result of the above command is written into /etc/named.conf.
On the GOOD server it would be added in the named.conf like this:
zone "example.com" {
type master;
file "example.com";
allow-transfer {
200.190.100.10;
200.190.101.10;
common-allow-transfer;
};
};
The above is correct.
On the BAD server it would be added in the named.conf like this:
zone "Executing command: hostname -d
example.com" {
type master;
file "Executing command: hostname -d
example.com";
allow-transfer {
200.190.100.10;
200.190.101.10;
common-allow-transfer;
};
};
So it adds output describing the action it performs, in this example "Executing command: hostname -d".
Here is another example, running the command on a good server and on the bad server.
Bad server:
[root@bad-server /]# DOMAIN=`vzctl exec 3333 'hostname -d'`
[root@bad-server /]# echo $DOMAIN
Executing command: hostname -d example.com
Good server:
[root@good-server ~]# DOMAIN=`vzctl exec 4444 'hostname -d'`
[root@good-server ~]# echo $DOMAIN
example.com
My knowledge is limited, but I have tried several things, checking rsyslog and grub.conf, and nothing seems out of the ordinary.
I have no clue why it's displaying the extra information.
Probably it's something simple / stupid, but I have been trying to solve this for hours now and I really have no clue...
So any help is really appreciated.
Added information:
Both servers use: kernel.printk = 7 4 1 7
(I don't know if that's useful)
Well (thanks to Aaron for pointing me in the right direction) I finally found the little culprit which was causing all the issues I experienced with this script (which worked for every other server, so no need to change that, obviously).
The issues were caused by the VERBOSE level set in vz.conf (located in the /etc/vz/ directory). There is an option in there called "VERBOSE", and in my case it was set to 3.
According to OpenVZ's website it does the following:
Increments logging level up from the default. Can be used multiple times.
Default value is set to the value of VERBOSE parameter in the global
configuration file vz.conf(5), or to 0 if not set by VERBOSE parameter.
After I changed VERBOSE=3 to VERBOSE=0 my script worked fine once again (as it did for every other server). :-)
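For reference, a minimal way to check and change the setting (assuming the stock /etc/vz/vz.conf path mentioned above):
grep -n '^VERBOSE' /etc/vz/vz.conf
sed -i 's/^VERBOSE=.*/VERBOSE=0/' /etc/vz/vz.conf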
So a big shoutout to Aaron for pointing me in the right direction. The answer is easy when you know where to look!
Sorry to say, but I am kinda disappointed by ndim's reaction. This is the 2nd time he was very unhelpful and rude in his response after that. He clearly didn't read the issue I posted correctly. Oh well.
I would make sure to properly parse the output of the command. In this case, we are only interested in lines of the form
entered into CT 12345
One way of doing this would be to pipe everything through sed and having sed print only the number when the line looks as above (untested, and I always forget which braces/brackets/parens need a backslash in front of them):
whateverthecommand | sed -n 's/^entered into CT \([0-9]\{1,\}\)$/\1/p'
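(For the record, in a POSIX basic regular expression the grouping parentheses and the interval braces all need backslashes, as above. Alternatively, the -E flag of GNU and BSD sed switches to extended regular expressions, where none of them do:)
whateverthecommand | sed -nE 's/^entered into CT ([0-9]+)$/\1/p'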

How to redirect stderr and stdout inside a while loop

I have a loop set up to search through a datafile and perform COMMANDS using each line of the datafile as parameters. The PUBLISH parameter can have spaces so that's why I have it escaped differently. I want to be able to output stderr and stdout for each iteration of the loop to a different logfile:
while read VARIABLE1 PUBLISH LOGFILENAME
do
/home/script_name -switch $VARIABLE1 -switch2 \"$PUBLISH\" >> /log/$LOGFILENAME 2>&1
done < $INPUTFILE
The logfile LOOKS like it gets created - that is, I can see it when I do an ll and it appears to have size - but when I attempt to cat the file, I get:
# ll logfile
-rw-r--r-- 1 root root 3205 2014-10-15 12:52 logfile
# pg logfile
pg: logfile: No such file or directory
# cat logfile
cat: logfile: No such file or directory
# read -r logfile
^C (ctrl-c'd after no display)
# file logfile
logfile: ERROR: cannot open `logfile' (No such file or directory)
The COMMANDS are running fine - but I cannot get the logfile created. If I remove the redirect string (>> /log/$LOGFILENAME 2>&1) - the COMMAND output goes to the screen just fine. I'm executing the script and reading the logfile as root - so permissions should not be an issue.
Thanks to n.m. who got me on the right track. I changed the output filename to a hardcoded name and it worked - I was able to see the logfile and page it. I went back and looked at the original $INPUTFILE and discovered that there was a SPACE after the value read into the LOGFILENAME variable.
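To guard against this class of problem in the future, it is cheap to trim each field before using it. A sketch of the loop with a trailing-whitespace trim (the parameter expansion strips trailing spaces, tabs, and stray carriage returns; plain double quotes are enough to keep the spaces in PUBLISH together):
while read -r VARIABLE1 PUBLISH LOGFILENAME
do
    LOGFILENAME=${LOGFILENAME%"${LOGFILENAME##*[![:space:]]}"}   # drop trailing whitespace
    /home/script_name -switch "$VARIABLE1" -switch2 "$PUBLISH" >> "/log/$LOGFILENAME" 2>&1
done < "$INPUTFILE"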

shell script display grep results

I need some help with displaying how many times two strings are found on the same line! Let's say I want to search the file 'test.txt', which contains names and IPs. I want to enter a name as a parameter when running the script; the script will search the file for that name and check whether there's an IP address on the same line. I have tried using the 'grep' command, but I don't know how to display the results in a good way. I want it like this:
Name: John Doe IP: xxx.xxx.xx.x count: 3
The count is how many times this line was found; this is what my grep script looks like right now:
#!/bin/bash
echo "Searching $1 for the Name '$2'"
result=$(grep "$2" $1 | grep -E "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)")
echo $result
I will run the script like 'sh search test.txt John'.
I'm having trouble displaying the information I get from the grep command; maybe there's a better way to do this?
EDIT:
Okay, I will try to explain a little better. Let's say I want to search a .log file; I want a script to search that file for a string the user enters as a parameter, i.e. if the user enters 'sh search test.log logged in', the script will search for the string "logged in" within the file 'test.log'. If the script finds this string on the same line as an IP address, the IP address is printed, along with how many times this line was found.
I simply don't know how to do it; I'm new to shell scripting and was hoping I could use grep along with regular expressions for this! I will keep trying and update this question with an answer if I figure it out.
I don't have said file on my computer, but it looks something like this:
Apr 25 11:33:21 Admin CRON[2792]: pam_unix(cron:session): session opened for user 192.168.1.2 by (uid=0)
Apr 25 12:39:01 Admin CRON[2792]: pam_unix(cron:session): session closed for user 192.168.1.2
Apr 27 07:42:07 John CRON[2792]: pam_unix(cron:session): session opened for user 192.168.2.22 by (uid=0)
Apr 27 14:23:11 John CRON[2792]: pam_unix(cron:session): session closed for user 192.168.2.22
Apr 29 10:20:18 Admin CRON[2792]: pam_unix(cron:session): session opened for user 192.168.1.2 by (uid=0)
Apr 29 12:15:04 Admin CRON[2792]: pam_unix(cron:session): session closed for user 192.168.1.2
Here is a simple Awk script which does what you request, based on the log snippet you posted.
awk -v user="$2" '$4 == user { i[$11]++ }
END { for (a in i) printf ("Name: %s IP: %s count: %i\n", user, a, i[a]) }' "$1"
If the fourth whitespace-separated field in the log file matches the requested user name (which was passed to the shell script as its second parameter), add one to the count for the IP address (from field 11).
At the end, loop through all non-zero IP addresses, and print a summary for each. (The user name is obviously whatever was passed in, but matches your expected output.)
This is a very basic Awk script; if you think you want to learn more, I urge you to consult a simple introduction, rather than follow up here.
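Run against the sample log above, assuming the two-line command is saved as a script named search.sh, it would print:
$ sh search.sh test.log John
Name: John IP: 192.168.2.22 count: 2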
If you want a simpler grep-only solution, something like this provides the information in a different format:
grep "$2" "$1" |
grep -o -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' |
sort | uniq -c | sort -rn
The trick here is the -o option to the second grep, which extracts just the IP address from the matching line. It is however less precise than the Awk script; for example, a user named "sess" would match every input line in the log. You can improve on that slightly by using grep -w in the first grep -- that still won't help against users named "pam" --, but Awk really gives you a lot more control.
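With the sample log and "John", for example, the pipeline prints uniq -c's count followed by the address:
      2 192.168.2.22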
My original answer is below this line, partly because it's tangentially useful, partly because it is required in order to understand the pesky comment thread below.
The following
result=$(command)
echo $result
is wrong. You need the second line to be
echo "$result"
but in addition, the detour through echo is superfluous; the simple way to write that is just
command
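A quick illustration of why the quotes matter:
result=$(printf 'one\ntwo')
echo $result      # word splitting collapses the newline: prints "one two"
echo "$result"    # quoted: prints the two lines intact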

Is echo atomic when writing single lines

I am currently trying to get a script to correctly write output from other started commands into a log file. The script writes its own messages to the log file using echo, and there is a method to which I can pipe the lines from the other program.
The main problem is that the program which produces the output is started in the background, so my function that does the read may write concurrently to the logfile. Could this be a problem? echo only ever writes a single line, so it should not be too hard to ensure atomicity. However, I have looked on Google and found no way to make sure it actually is atomic.
Here is the current script:
LOG_FILE=/path/to/logfile
write_log() {
echo "$(date +%Y%m%d%H%M%S);$1" >> ${LOG_FILE}
}
write_output() {
while read data; do
write_log "Message from SUB process: [ $data ]"
done
}
write_log "Script started"
# do some stuff
call_complicated_program 2>&1 | write_output &
SUB_PID=$!
#do some more stuff
write_log "Script exiting"
wait $SUB_PID
As you can see, the script might write both on its own and because of redirected output. Could this cause havoc in the file?
echo is just a simple wrapper around write (this is a simplification; see the edit below for the gory details), so to determine whether echo is atomic, it's useful to look at write. From the Single UNIX Specification:
Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically. This maximum is called {PIPE_BUF}. This volume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF} or fewer bytes shall be atomic.
You can check PIPE_BUF on your system with a simple C program. If you're just printing a single line of output that is not ridiculously long, it should be atomic.
Here is a simple program to check the value of PIPE_BUF:
#include <limits.h>
#include <stdio.h>
int main(void) {
printf("%d\n", PIPE_BUF);
return 0;
}
On Mac OS X, that gives me 512 (the minimum allowed value for PIPE_BUF). On Linux, I get 4096. So if your lines are fairly long, make sure you check it on the system in question.
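If you don't want to compile anything, getconf usually reports the same limit; PIPE_BUF is a pathconf value, so it takes a path argument:
$ getconf PIPE_BUF /
4096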
edit to add: I decided to check the implementation of echo in Bash, to confirm that it will print atomically. It turns out, echo uses putchar or printf depending on whether you use the -e option. These are buffered stdio operations, which means that they fill up a buffer, and actually write it out only when a newline is reached (in line-buffered mode), the buffer is filled (in block-buffered mode), or you explicitly flush the output with fflush. By default, a stream will be in line-buffered mode if it is an interactive terminal, and block-buffered mode if it is any other file. Bash never sets the buffering type, so for your log file, it should default to block-buffered mode. At the end of the echo builtin, Bash calls fflush to flush the output stream. Thus, the output will always be flushed at the end of echo, but may be flushed earlier if it doesn't fit into the buffer.
The size of the buffer used may be BUFSIZ, though it may be different; BUFSIZ is the default size if you set the buffer explicitly using setbuf, but there's no portable way to determine the actual size of your buffer. There are also no portable guidelines for what BUFSIZ is, but when I tested it on Mac OS X and Linux, it was twice the size of PIPE_BUF.
What does this all mean? Since the output of echo is all buffered, it won't actually call the write until the buffer is filled or fflush is called. At that point, the output should be written, and the atomicity guarantee I mentioned above should apply. If the stdout buffer size is larger than PIPE_BUF, then PIPE_BUF will be the smallest atomic unit that can be written out. If PIPE_BUF is larger than the stdout buffer size, then the stream will write the buffer out when the buffer fills up.
So, echo is only guaranteed to atomically write sequences shorter than the smaller of PIPE_BUF and the size of the stdout buffer, which is most likely BUFSIZ. On most systems, BUFSIZ is larger than PIPE_BUF.
tl;dr: echo will atomically output lines, as long as those lines are short enough. On modern systems, you're probably safe up to 512 bytes, but it's not possible to determine the limit portably.
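If your lines can exceed that limit, one way to serialize the writers explicitly is to take an advisory lock around each append. A sketch assuming Linux's flock(1) from util-linux:
write_log() {
    {
        flock -x 9                               # wait for an exclusive lock on fd 9
        echo "$(date +%Y%m%d%H%M%S);$1" >&9
    } 9>>"${LOG_FILE}"                           # fd 9 appends to the log; the lock drops when it closes
}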
There is no involuntary file locking, but the >> operator is safe while the > operator is unsafe, so your practice is safe to do.
I tried the approach from user:pizza and could not get it to work like the answer from user:Brian Campbell. Let me know if I am doing something wrong and I'll update the answer. (And yes, this is an answer, because I'm actually giving a complete working demo.)
basic concurrency
This just illustrates the problem
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done | grep GET
> GET / HTTP/1.1
>> GET / HTTP/1.1
GET / HTTP/1.1
>>> GET / HTTP/1.1
>>GET / HTTP/1.1
using echo on each line of output
This solves the problem using Brian Campbell's method. (Note that the length of the line for which this works is limited.)
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 | while read; do echo "${REPLY}"; done &) done | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
redirecting the for loop to append to stdout
Instinct should tell you that this is not going to work, because it redirects after all the output of the forked curls has been merged.
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done >> /dev/stdout | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
>> >GET / HTTP/1.1
> GET / HTTP/1.1
GET / HTTP/1.1
redirecting each curl to append to stdout
I suspect this failure is due to the fact that the entire content of each curl is being redirected and the size is greater than what the kernel is willing to block for. I have not taken the time to confirm that, but what Brian Campbell did share seems to support it.
$ for n in {1..5}; do (curl -svo /dev/null example.com >>/dev/stdout 2>&1 &) done | grep GET
>> GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1
