Stdout based on a PID (OSX) - macos

I am processing many files using many separate scripts. To speed up the processing I placed them in the background using &, however with this I lost the ability to keep track of what they are doing (I can't see the output).
Is there a simple way of getting output based on PID? I found some answers which are based on fg [job number], but I can't figure out job number from PID.

You might consider running your scripts from screen then return to them whenever you want:
$ screen ./script.sh
To "detach" and keep the script running, press Ctrl-A followed by Ctrl-D
$ screen -ls
Will list your screen sessions
$ screen -r <screen pid number>
Returns to a screen session
The few commands above barely touch on the abilities that screen has, so check out its man page and you might be surprised by all it can do.

A script that is backgrounded will normally just continue to write to standard output; if you run several, they will all be dumping their output intermingled with each other. Dump them to a file instead. For example, generate an output file name using $$ (current process ID) and write to that file.
outfile=process.$$.out
# ...
echo Output >$outfile
will write to, say, process.27422.out.
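Since the question asks for output based on a PID, the same idea can be applied from the launching side. The following is only a sketch under assumptions: `worker` is a hypothetical stand-in for one of the real scripts, and `log_for_pid` and the index file are names made up for illustration. It gives every background job its own log file and records which PID owns which log.

```shell
#!/usr/bin/env bash
# Sketch: one log file per background job, plus a PID -> log-file index.
# `worker` is a hypothetical stand-in for one of the real processing scripts.
worker() { echo "start $1"; echo "done $1"; }

logdir=$(mktemp -d)
index="$logdir/index"

for name in a b c; do
    worker "$name" > "$logdir/$name.out" 2>&1 &
    # $! is the PID of the job just started; record which log belongs to it.
    echo "$! $logdir/$name.out" >> "$index"
done

# Print the log file recorded for a given PID (hypothetical helper).
log_for_pid() {
    awk -v pid="$1" '$1 == pid { print $2 }' "$index"
}

wait   # only so the demo finishes cleanly
```

For long-running jobs you would skip the final wait and instead follow one job with something like tail -f "$(log_for_pid 27422)".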

The other answers are right: putting exec &>$outfile (or exec &>$outfifo, or exec &>$another_tty) in the script is the correct way to do this.
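As a rough illustration of what such an exec line does, here is a minimal sketch. The subshell is there only so that the demo does not redirect the output of the shell you try it in; in a real script the exec line would simply sit near the top. The file name comes from mktemp and is not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch: redirect everything a script prints from one point onward.
outfile=$(mktemp)

(
    # `exec` with only redirections rewires this shell's own stdout/stderr.
    # `>file 2>&1` is the portable spelling of bash's `&>file`.
    exec >"$outfile" 2>&1
    echo "to stdout"
    echo "to stderr" >&2
)

cat "$outfile"
```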
However, if you have already started the scripts, then there is a workaround that you can use. I had written this script to redirect the stdout/stderr of any running process to another file/terminal.
$ cat redirect_terminal
#!/bin/bash
PID=$1
stdout=$2
stderr=${3:-$2}
if [ -e "/proc/$PID" ]; then
gdb -q -n -p $PID <<EOF >/dev/null
p dup2(open("$stdout",1),1)
p dup2(open("$stderr",1),2)
detach
quit
EOF
else
echo No such PID : $PID
fi
Sample usage:
./redirect_terminal 1234 /dev/pts/16
Where,
1234 is the PID of the script process.
/dev/pts/16 is another terminal opened separately.
Note that the updated stdout/stderr will not be inherited by children of that process that are already running.

Consider using GNU Parallel - it is easily installed on OSX with homebrew. Not only will it tag your output lines, but it will also keep your CPUs busy, scheduling another job immediately the previous one finishes. You can make up your own tags with substitution parameters.
Let's say you have 11 files called file{10..20}.txt to process:
parallel --tagstring "MyTag-{}" 'echo Start; echo Processing file {}; echo Done' ::: file*txt
MyTag-file15.txt Start
MyTag-file15.txt Processing file file15.txt
MyTag-file15.txt Done
MyTag-file16.txt Start
MyTag-file16.txt Processing file file16.txt
MyTag-file16.txt Done
MyTag-file17.txt Start
MyTag-file17.txt Processing file file17.txt
MyTag-file17.txt Done
MyTag-file18.txt Start
MyTag-file18.txt Processing file file18.txt
MyTag-file18.txt Done
MyTag-file14.txt Start
MyTag-file14.txt Processing file file14.txt
MyTag-file14.txt Done
MyTag-file13.txt Start
MyTag-file13.txt Processing file file13.txt
MyTag-file13.txt Done
MyTag-file12.txt Start
MyTag-file12.txt Processing file file12.txt
MyTag-file12.txt Done
MyTag-file19.txt Start
MyTag-file19.txt Processing file file19.txt
MyTag-file19.txt Done
MyTag-file20.txt Start
MyTag-file20.txt Processing file file20.txt
MyTag-file20.txt Done
MyTag-file11.txt Start
MyTag-file11.txt Processing file file11.txt
MyTag-file11.txt Done
MyTag-file10.txt Start
MyTag-file10.txt Processing file file10.txt
MyTag-file10.txt Done
If you want the output in order, use parallel -k to keep the output order
If you want a progress report, use parallel --progress
If you want a log of when jobs started/ended, use parallel --joblog log.txt
If you want to run 32 jobs in parallel, instead of the default 1 job per CPU core, use parallel -j 32
Example joblog:
Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
6 : 1461141901.514 0.005 0 38 0 0 echo Start; echo Processing file file15.txt; echo Done
7 : 1461141901.517 0.006 0 38 0 0 echo Start; echo Processing file file16.txt; echo Done
8 : 1461141901.519 0.006 0 38 0 0 echo Start; echo Processing file file17.txt; echo Done
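If GNU Parallel is not available, the tagging idea on its own can be approximated in plain shell. This is not GNU Parallel and has none of its scheduling; it is only a sketch, and `tagged` and `process` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: prefix every output line of a command with a tag.
# usage: tagged <tag> <command> [args...]
tagged() {
    local tag=$1
    shift
    "$@" 2>&1 | awk -v tag="$tag" '{ print tag, $0 }'
}

# Hypothetical stand-in for the real per-file processing.
process() { echo "Start"; echo "Processing file $1"; echo "Done"; }

for f in file10.txt file11.txt; do
    tagged "MyTag-$f" process "$f" &
done
wait
```

The order in which the two jobs print is not guaranteed, but every line carries the tag of the job that produced it.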

Related

Background process appears to hang

Editor's note: The OP is ultimately looking to package the code from this answer as a script. Said code creates a stay-open FIFO from which a background command reads data to process as it arrives.
It works if I type it in the terminal, but it won't work if I enter those commands in a script file and run it.
#!/bin/bash
cat >a&
pid=$!
it seems that the program is stuck at cat>a&
$pid has no value after running the script, but the cat process seems to exist.
cdarke's answer contains the crucial pointer: your script mustn't run in a child process, so you have to source it.
Based on the question you linked to, it sounds like you're trying to do the following:
Open a FIFO (named pipe).
Keep that FIFO open indefinitely.
Make a background command read from that FIFO whenever new data is sent to it.
See bottom for a working solution.
As for an explanation of your symptoms:
Running your script NOT sourced (NOT with .) means that the script runs in a child process, which has the following implications:
Variables defined in the script are only visible inside that script, and the variables cease to exist altogether when the script finishes running.
That's why you didn't see the script's $pid variable after running the script.
When the script finishes running, its background tasks (cat >a&) are killed (as cdarke explains, the SIGHUP signal is sent to them; any process that doesn't explicitly trap that signal is terminated).
This contradicts your claim that the cat process continues to exist, but my guess is that you mistook an interactively started cat process for one started by a script.
By contrast, any FIFO created by your script (with mkfifo) does persist after the script exits (a FIFO behaves like a file - it persists until you explicitly delete it).
However, when you write to that FIFO without another process reading from it, the writing command will block and thus appear to hang (the writing process blocks until another process reads the data from the FIFO).
That's probably what happened in your case: because the script's background processes were killed, no one was reading from the FIFO, causing an attempt to write to it to block. You incorrectly surmised that it was the cat >a& command that was getting "stuck".
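The blocking behaviour described above can be reproduced with a small sketch. The FIFO lives in a temporary directory and its name is made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: a write to a FIFO blocks until some process opens it for reading.
dir=$(mktemp -d)
fifo="$dir/demo_fifo"      # hypothetical name
mkfifo "$fifo"

echo "hello" > "$fifo" &   # blocks: nobody is reading yet
writer=$!

sleep 1
if kill -0 "$writer" 2>/dev/null; then
    writer_state="blocked"
else
    writer_state="finished"
fi

received=$(cat "$fifo")    # attaching a reader releases the writer
wait "$writer"
rm -rf "$dir"

echo "writer was $writer_state; reader received: $received"
```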
The following script, when sourced, adds functions to the current shell for setting up and cleaning up a stay-open FIFO with a background command that processes data as it arrives. Save it as file bgfifo_funcs:
#!/usr/bin/env bash
[[ $0 != "$BASH_SOURCE" ]] || { echo "ERROR: This script must be SOURCED." >&2; exit 2; }
# Set up a background FIFO with a command listening for input.
# E.g.:
# bgfifo_setup bgfifo "sed 's/^/# /'"
# echo 'hi' > bgfifo # -> '# hi'
# bgfifo_cleanup
bgfifo_setup() {
(( $# == 2 )) || { echo "ERROR: usage: bgfifo_setup <fifo-file> <command>" >&2; return 2; }
local fifoFile=$1 cmd=$2
# Create the FIFO file.
mkfifo "$fifoFile" || return
# Use a dummy background command that keeps the FIFO *open*.
# Without this, it would be closed after the first time you write to it.
# NOTE: This call inevitably outputs a job control message that looks
# something like this:
# [1]+ Stopped cat > ...
{ cat > "$fifoFile" & } 2>/dev/null
# Note: The keep-the-FIFO-open `cat` PID is the only one we need to save for
# later cleanup.
# The background processing command launched below will terminate
# automatically when the FIFO is closed, i.e., when the `cat` process is killed.
__bgfifo_pid=$!
# Now launch the actual background command that should read from the FIFO
# whenever data is sent.
{ eval "$cmd" < "$fifoFile" & } 2>/dev/null || return
# Save the *full* path of the FIFO file in a global variable for reliable
# cleanup later.
__bgfifo_file=$fifoFile
[[ $__bgfifo_file == /* ]] || __bgfifo_file="$PWD/$__bgfifo_file"
echo "FIFO '$fifoFile' set up, awaiting input for: $cmd"
echo "(Ignore the '[1]+ Stopped ...' message below.)"
}
# Cleanup function that you must call when done, to remove
# the FIFO file and kill the background commands.
bgfifo_cleanup() {
[[ -n $__bgfifo_file ]] || { echo "(Nothing to clean up.)"; return 0; }
echo "Removing FIFO '$__bgfifo_file' and terminating associated background processes..."
rm "$__bgfifo_file"
kill $__bgfifo_pid # Note: We let the job control messages display.
unset __bgfifo_file __bgfifo_pid
return 0
}
Then, source script bgfifo_funcs, using the . shell builtin:
. bgfifo_funcs
Sourcing executes the script in the current shell (rather than in a child process that terminates after the script has run), and thus makes the script's functions and variables available to the current shell. Functions by definition run in the current shell, so any background commands started from functions stay alive.
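The difference between running and sourcing can be seen with a throwaway script. This is a sketch only; setvar.sh and my_var are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: a variable set by a script survives only if the script is sourced.
dir=$(mktemp -d)
script="$dir/setvar.sh"                   # hypothetical script name
echo 'my_var="set by script"' > "$script"

unset my_var
bash "$script"                            # runs in a child process
after_run=${my_var:-unset}

. "$script"                               # runs in the current shell
after_source=${my_var:-unset}

echo "after running:  $after_run"
echo "after sourcing: $after_source"
rm -rf "$dir"
```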
Now you can set up a stay-open FIFO with a background process that processes input as it arrives as follows:
# Set up FIFO 'bgfifo' in the current dir. and process lines sent to it
# with a sample Sed command that simply prepends '# ' to every line.
$ bgfifo_setup bgfifo "sed 's/^/# /'"
# Send sample data to the FIFO.
$ echo 'Hi.' > bgfifo
# Hi.
# ...
$ echo 'Hi again.' > bgfifo
# Hi again.
# ...
# Clean up when done.
$ bgfifo_cleanup
The reason that cat >a "hangs" is that it is reading from the standard input stream (stdin, file descriptor zero), which defaults to the keyboard.
Adding the & causes it to run in the background, which disconnects it from the keyboard. Normally that would leave a suspended job in the background, but since your script exits, its background tasks are killed (they are sent a SIGHUP signal).
EDIT: although I followed the link in the question, it was not stated originally that the OP was actually using a FIFO at that stage. So thanks to @mklement0.
I don't understand what you are trying to do here, but I suspect you need to run it as a "sourced" file, as follows:
. gash.sh
Where gash.sh is the name of your script. Note the preceding .
You need to specify a file with "cat":
#!/bin/bash
cat SOMEFILE >a &
pid=$!
echo PID $pid
Although that seems a bit silly - why not just "cp" the file (cp SOMEFILE a)?
Q: What exactly are you trying to accomplish?

how to get value before child process halt in bash

The requirements are:
The child process returns a value (an IP address), which it obtains with wget,
but the child process may hang.
The parent process cannot wait for the child indefinitely; it needs the return value after a fixed number of seconds.
A possible script is:
parent.sh:
./child.sh &
sleep 60
echo child_return_value
child.sh:
child_return_value=$(wget ipaddress)
Just to add another possible approach, you can capture the output of a background process without (manually) using files by using process substitution, if your shell supports it. You can use the read builtin to get the output, which allows setting a timeout value:
exec 3< <(wget -O- ipaddress);
read -r -u3 -t60;
return_value="$REPLY";
exec 3<&-;
echo "$return_value";
The shell will actually create a FIFO or /dev/fd/xx special file on your behalf under this solution.
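The same pattern can be wrapped in a function and tried without network access. This is a sketch under assumptions: `first_line_within` is a made-up helper name, `fake_lookup` is a hypothetical stand-in for the wget call, and file descriptor 3 is assumed to be free.

```shell
#!/usr/bin/env bash
# Sketch: run a command asynchronously and wait at most N seconds
# for its first line of output.
# usage: first_line_within <seconds> <command> [args...]
first_line_within() {
    local timeout=$1 line status
    shift
    exec 3< <("$@")                  # command runs asynchronously, writing to fd 3
    if read -r -u 3 -t "$timeout" line; then
        printf '%s\n' "$line"
        status=0
    else
        status=1                     # timed out, or the command printed nothing
    fi
    exec 3<&-                        # close our end of the pipe
    return "$status"
}

# Hypothetical stand-in for `wget -O- ipaddress`.
fake_lookup() { sleep "$1"; echo "192.0.2.1"; }

first_line_within 5 fake_lookup 0 || echo "no answer in time"
```

Note that a timeout does not stop the command; it only stops the parent from waiting for it.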
I would use the -T|--timeout option of wget to have the request time out after a specified number of seconds. If you do this, you can avoid messing with background processes and IPC entirely:
return_value=$(wget -T60 -O- ipaddress); ## 60 sec timeout
echo "$return_value";
You could have the child process write the result to a file that the parent process can read.
child_out="$(mktemp)"
./child.sh > "$child_out" &
sleep 60
if [ -s "$child_out" ]
then
    child_return_value=$(cat "$child_out")
else
    # Child did not produce a result yet.
    child_return_value=""
fi
Don't forget to remove the temporary file in the parent script. Preferably using a trap so it will be removed under all (well, most) circumstances.
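One way to combine the temporary file with a trap is sketched below. The names are hypothetical: `child` stands in for child.sh, and `child_value_after` is made up for the demo. The function body is a subshell, so its EXIT trap runs when the function finishes and does not disturb traps in the calling shell.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for child.sh: prints its result after a delay.
child() { sleep "$1"; echo "192.0.2.1"; }

# Sketch: collect the child's output via a temp file that is always removed.
# usage: child_value_after <child-delay-seconds> <parent-wait-seconds>
child_value_after() (
    child_out=$(mktemp)
    trap 'rm -f "$child_out"' EXIT   # fires when this subshell exits
    child "$1" > "$child_out" &
    sleep "$2"
    # Succeeds, and prints the value, only if the child has written something.
    [ -s "$child_out" ] && cat "$child_out"
)

child_value_after 0 1 || echo "child produced nothing in time"
```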

How to get a stdout message once a background process finishes?

I realize that there are several other questions on SE about notifications upon completion of background tasks, and how to queue up jobs to start after others end, and questions like these, but I am looking for a simpler answer to a simpler question.
I want to start a very simple background job, and get a simple stdout text notification of its completion.
For example:
cp My_Huge_File.txt New_directory &
...and when it's done, my bash shell would display a message. This message could just be the completed job's PID, but if I could program unique messages per background process, that would be cool too, so I could have numerous background jobs running without confusion.
Thanks for any suggestions!
EDIT: user000001's answer separates commands with ;. I separated commands with && in my original example. The only difference I notice is that you don't have to surround your base command with braces if you use &&. Semicolons are a bit more flexible, so I've updated my examples.
The first thing that comes to mind is
{ sleep 2; echo "Sleep done"; } &
You can also suppress the accompanying stderr output from the above line:
{ { sleep 2; echo "Sleep done"; } & } 2>/dev/null
If you want to save your program output (stdout) to a log file for later viewing, you can use:
{ { sleep 2; echo "Sleep done"; } & } 2>/dev/null 1>myfile.log
Here's even a generic form you might use (You can even make an alias so that you can run it at any time without having to type so much!):
# Don't hesitate to add semicolons for multiple commands
CMD="cp My_Huge_File.txt New_directory"
{ eval "$CMD" & } 2>/dev/null 1>myfile.log
You might also pipe stdout into another process using | in case you wish to process output in real time with other scripts or software. tee is also a helpful tool in case you wish to use multiple pipes. For reference, there are more examples of I/O redirection here.
You could use command grouping:
{ slow_program; echo ok; } &
or the wait command
slow_program &
wait
echo ok
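To get a distinct message per background job, as the question asks, the command grouping can be wrapped in a small function. This is a sketch; `notify_when_done` is a hypothetical name, and the sleep and true commands stand in for real work such as the cp.

```shell
#!/usr/bin/env bash
# Sketch: run a command in the background and print a custom message,
# including its exit status, when it finishes.
# usage: notify_when_done <message> <command> [args...]
notify_when_done() {
    local msg=$1
    shift
    { "$@"; echo "[$msg] finished with exit status $?"; } &
}

notify_when_done "copy A" sleep 1
notify_when_done "copy B" true
wait    # only so this demo does not exit before the jobs report
```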
The most reliable way is to simply have the output from the background process go to a temporary file and then consume the temporary file.
When you have background processes running, it can be difficult to capture the output in a useful form, because the output of multiple jobs can get interleaved.
For example, if you have two processes which each print out a string with a number "this is my string1" "this is my string2" then it is possible for you to end up with output that looks like this:
"this is mthis is my string2y string1"
instead of:
this is my string1
this is my string2
By using temporary files you guarantee that the output will be correct.
As I mentioned in my comment above, bash already does this kind of notification by default, as far as I know. Here's an example I just made:
$ sleep 5 &
[1] 25301
$ sleep 10 &
[2] 25305
$ sleep 3 &
[3] 25309
$ jobs
[1] Done sleep 5
[2]- Running sleep 10 &
[3]+ Running sleep 3 &
$ :
[3]+ Done sleep 3
$ :
[2]+ Done sleep 10
$

How can I have output from one named pipe fed back into another named pipe?

I'm adding some custom logging functionality to a bash script, and can't figure out why it won't take the output from one named pipe and feed it back into another named pipe.
Here is a basic version of the script (http://pastebin.com/RMt1FYPc):
#!/bin/bash
PROGNAME=$(basename $(readlink -f $0))
LOG="$PROGNAME.log"
PIPE_LOG="$PROGNAME-$$-log"
PIPE_ECHO="$PROGNAME-$$-echo"
# program output to log file and optionally echo to screen (if $1 is "-e")
log () {
if [ "$1" = '-e' ]; then
shift
"$@" > $PIPE_ECHO 2>&1
else
"$@" > $PIPE_LOG 2>&1
fi
}
# create named pipes if not exist
if [[ ! -p $PIPE_LOG ]]; then
mkfifo -m 600 $PIPE_LOG
fi
if [[ ! -p $PIPE_ECHO ]]; then
mkfifo -m 600 $PIPE_ECHO
fi
# cat pipe data to log file
while read data; do
echo -e "$PROGNAME: $data" >> $LOG
done < $PIPE_LOG &
# cat pipe data to log file & echo output to screen
while read data; do
echo -e "$PROGNAME: $data"
log echo $data # this doesn't work
echo -e $data > $PIPE_LOG 2>&1 # and neither does this
echo -e "$PROGNAME: $data" >> $LOG # so I have to do this
done < $PIPE_ECHO &
# clean up temp files & pipes
clean_up () {
# remove named pipes
rm -f $PIPE_LOG
rm -f $PIPE_ECHO
}
#execute "clean_up" on exit
trap "clean_up" EXIT
log echo "Log File Only"
log -e echo "Echo & Log File"
I thought the two commands marked "this doesn't work" and "and neither does this" (lines 34 & 35 of the pastebin) would take the $data from $PIPE_ECHO and output it to the $PIPE_LOG. But it doesn't work. Instead I have to send that output directly to the log file, without going through the $PIPE_LOG.
Why is this not working as I expect?
EDIT: I changed the shebang to "bash". The problem is the same, though.
SOLUTION: A.H.'s answer helped me understand that I wasn't using named pipes correctly. I have since solved my problem by not even using named pipes. That solution is here: http://pastebin.com/VFLjZpC3
It seems to me that you do not understand what a named pipe really is. A named pipe is not one stream like a normal pipe. It is a series of normal pipes, because a named pipe can be closed, and a close on the producer side might be seen as a close on the consumer side.
The "might" part is this: the consumer will read data until there is no more data. "No more data" means that, at the time of the read call, no producer has the named pipe open. This means that multiple producers can feed one consumer only if there is no point in time without at least one producer. Think of it as a door which closes automatically: if there is a steady stream of people keeping the door open, either by handing the doorknob to the next one or by squeezing multiple people through it at the same time, the door stays open. But once the door is closed, it stays closed.
A little demonstration should make the difference a little clearer:
Open three shells. First shell:
1> mkfifo xxx
1> cat xxx
no output is shown because cat has opened the named pipe and is waiting for data.
Second shell:
2> cat > xxx
no output, because this cat is a producer which keeps the named pipe open until we tell him to close it explicitly.
Third shell:
3> echo Hello > xxx
3>
This producer immediately returns.
First shell:
Hello
The consumer received data, wrote it and, since one more producer keeps the door open, continues to wait.
Third shell
3> echo World > xxx
3>
First shell:
World
The consumer received data, wrote it and, since one more producer keeps the door open, continues to wait.
Second Shell: write into the cat > xxx window:
And good bye!
(control-d key)
2>
First shell
And good bye!
1>
The ^D key closed the last producer, the cat > xxx, and hence the consumer exits also.
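The three-shell demonstration can also be condensed into one script. This is a sketch; the comments map each step to the shell it came from, and file descriptor 3 plays the role of the long-lived cat > xxx producer.

```shell
#!/usr/bin/env bash
# Sketch: one long-lived producer keeps the FIFO "door" open for the consumer.
dir=$(mktemp -d)
fifo="$dir/xxx"
mkfifo "$fifo"

cat "$fifo" > "$dir/received" &   # shell 1: the consumer
consumer=$!

exec 3> "$fifo"                   # shell 2: a producer that stays open
echo Hello > "$fifo"              # shell 3: short-lived producers
echo World > "$fifo"
echo "And good bye!" >&3          # shell 2 writes its line ...
exec 3>&-                         # ... then closes: the last producer is gone

wait "$consumer"                  # returns because cat saw end-of-file
received=$(cat "$dir/received")
rm -rf "$dir"
echo "$received"
```

Without the exec 3> line, the consumer could see end-of-file as soon as the first echo closed the FIFO.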
In your case this means:
Your log function will try to open and close the pipes multiple times. Not a good idea.
Both your while loops exit earlier than you think. (Check this with (while ... done < $PIPE_X; echo FINISHED; ) & )
Depending on the scheduling of your various producers and consumers, the door might slam shut sometimes and sometimes not: you have a race condition built in. (For testing, you can add a sleep 1 at the end of the log function.)
Your "test cases" try each possibility only once. Try using them multiple times (you will block, especially with the sleeps), because your producer might not find any consumer.
So I can explain the problems in your code but I cannot tell you a solution because it is unclear what the edges of your requirements are.
It seems the problem is in the "cat pipe data to log file" part.
Let's see: you use a "&" to put the loop in the background, I guess you mean it must run in parallel with the second loop.
But the problem is that you don't even need the "&", because as soon as no more data is available in the fifo, the while..read stops (you still need some data at first for the first read to work). The next read doesn't hang if no more data is available (which would pose another problem: how would your program stop?).
I guess the while read checks if more data is available in the file before doing the read and stops if it's not the case.
You can check with this sample:
mkfifo foo
while read data; do echo $data; done < foo
This script will hang, until you write anything from another shell (or bg the first one). But it ends as soon as a read works.
Edit:
I've tested on RHEL 6.2 and it works as you say (i.e., badly!).
The problem is that, after running the script (let's say script "a"), you've got an "a" process remaining. So, yes, in some way the script hangs as I wrote before (not that stupid answer as I thought then :) ). Except if you write only one log (be it log file only or echo,in this case it works).
(It's the read loop from PIPE_ECHO that hangs when writing to PIPE_LOG and leaves a process running each time).
I've added a few debug messages, and here is what I see:
only one line is read from PIPE_LOG and after that, the loop ends
then a second message is sent to the PIPE_LOG (after been received from the PIPE_ECHO), but the process no longer reads from PIPE_LOG => the write hangs.
When you ls -l /proc/[pid]/fd, you can see that the fifo is still open (but deleted).
In fact, the script exits and removes the fifos, but there is still one process using it.
If you don't remove the log fifo at the cleanup and cat it, it will free the hanging process.
Hope it will help...

Shell script that continuously checks a text file for log data and then runs a program

I have a Java program that often stops due to errors, which are logged in a .log file. What would be a simple shell script to detect a particular text in the last/latest line, say
[INFO] Stream closed
and then run the following command
java -jar xyz.jar
This should keep on happening forever(possibly after every two minutes or so) because xyz.jar writes the log file.
The text stream closed can arrive a lot of times in the log file. I just want it to take an action when it comes in the last line.
How about
while true
do
    sleep 120
    # -F matches the text literally; unescaped, [INFO] is a bracket expression.
    if tail -n 1 logfile | grep -qF "[INFO] Stream closed"
    then
        java -jar xyz.jar &
    fi
done
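The last-line check can be tried on its own against a scratch file. This is a sketch; `last_line_has` is a hypothetical helper name and the log contents are made up. It uses grep -F because, as a regular expression, [INFO] matches a single character (I, N, F or O) rather than the literal text.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the last line of a file contains a literal string.
# usage: last_line_has <file> <literal text>
last_line_has() {
    tail -n 1 "$1" | grep -qF -- "$2"
}

log=$(mktemp)                      # hypothetical log file for the demo
printf '%s\n' '[INFO] Stream closed' '[INFO] Reconnected' > "$log"
last_line_has "$log" '[INFO] Stream closed' && echo "restart" || echo "leave alone"

echo '[INFO] Stream closed' >> "$log"
last_line_has "$log" '[INFO] Stream closed' && echo "restart" || echo "leave alone"
rm -f "$log"
```

The first check finds the text only on an earlier line and leaves the program alone; the second finds it on the last line.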
There may be a condition where the tailed last line, "Stream closed", is not really the last log entry and the process is still logging messages. We can avoid this by checking whether the process is alive. If the process has exited and the last line is "Stream closed", then we need to restart the application.
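#!/bin/bash
java -jar xyz.jar &
PID=$!    # $! is the PID of the last background command ($1 would be the first argument)
while true
do
    sleep 20
    # Still running: nothing to do.
    kill -0 "$PID" 2>/dev/null && continue
    # Exited, and the last log line confirms the stream was closed: restart.
    if tail -n 1 logfile | grep -q "Stream closed"
    then
        java -jar xyz.jar &
        PID=$!
    fi
done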
I would prefer checking whether the corresponding process is still running and restart the program on that event. There might be other errors that cause the process to stop. You can use a cronjob to periodically (like every minute) perform such a check.
Also, you might want to improve your java code so that it does not crash that often (if you have access to the code).
I solved this using a watchdog script that checks directly (with grep) whether the programs are running. By calling the watchdog every minute (from cron under Ubuntu), I basically guarantee (programs and environment are VERY stable) that no program will stay offline for more than 59 seconds.
This script checks a list of programs, using the names in an array, to see whether each one is running and, if not, starts it.
#!/bin/bash
#
# watchdog
#
# Run as a cron job to keep an eye on what_to_monitor which should always
# be running. Restart what_to_monitor and send notification as needed.
#
# This needs to be run as root or a user that can start system services.
#
# Revisions: 0.1 (20100506), 0.2 (20100507)
# first prog to check
NAME[0]=soc_gt2
# 2nd
NAME[1]=soc_gt0
# 3rd, etc etc
NAME[2]=soc_gp00
# START=/usr/sbin/$NAME
NOTIFY=you@gmail.com
NOTIFYCC=you2@mail.com
GREP=/bin/grep
PS=/bin/ps
NOP=/bin/true
DATE=/bin/date
MAIL=/bin/mail
RM=/bin/rm
for nameTemp in "${NAME[@]}"; do
$PS -ef|$GREP -v grep|$GREP $nameTemp >/dev/null 2>&1
case "$?" in
0)
# It is running in this case so we do nothing.
echo "$nameTemp is RUNNING OK. Relax."
$NOP
;;
1)
echo "$nameTemp is NOT RUNNING. Starting $nameTemp and sending notices."
START=/usr/sbin/$nameTemp
$START >/dev/null 2>&1 &
NOTICE=/tmp/watchdog.txt
echo "$nameTemp was not running and was started on `$DATE`" > $NOTICE
# $MAIL -n -s "watchdog notice" -c $NOTIFYCC $NOTIFY < $NOTICE
$RM -f $NOTICE
;;
esac
done
exit
I do not use the log verification, though you could easily incorporate that into your own version (just replace the grep with a log check, for example).
If you run it from the command line (or PuTTY, if you are remotely connected), you will see what was working and what wasn't. I have been using it for months now without a hiccup. Just call it whenever you want to see what's working (regardless of it running under cron).
You could also place all your critical programs in one folder, do a directory listing and check whether every file in that folder has a program running under the same name. Or read a text file line by line, with every line corresponding to a program that is supposed to be running, and so on.
A good way is to use the awk command:
tail -f somelog.log | awk '/\[INFO\] Stream closed/ { system("java -jar xyz.jar") }'
This continually monitors the log stream, and when the regular expression matches it fires off whatever system command you have set, which can be anything you would type into a shell. The brackets are escaped because an unescaped [INFO] would match a single character rather than the literal text.
If you really wanna be good you can put that line into a .sh file and run that .sh file from a process monitoring daemon like upstart to ensure that it never dies.
Nice and clean =D

Resources