bash script with background process - bash

I have the following in my script:
#!/bin/bash
[ ! -S ./notify ] && find ./stylesheets/sass/ -maxdepth 1 -type f -regex '.*/[^_][^/]*\.scss$' | entr +notify &
What entr does here is create notify as a named pipe.
[ insert ]
while read F; do
...
#some processing on found files
#(does not matter for this question at all)
...
done < notify
The problem is that the first time I run the script, it sees there is no notify pipe, so it creates one and puts the process into the background.
But then the following while loop complains it cannot find notify to read from.
However, when I run the script again immediately after that, it continues normally through the rest of the program (the while loop part).
How can I fix this so the whole thing runs correctly in one go?
EDIT:
If I put the following into the [ insert ] placeholder above,
sleep 1;
it works, but I would like a better way to check when the notify fifo exists, since sometimes it may need more than one second.

You can always poll for the named pipe to be created:
until [ -p notify ]; do read -t 0.1; done
If you don't specifically need to maintain variables between runs, you could also consider using a script rather than entr's +notify. That would avoid the problem.
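Putting it together, a minimal sketch of the whole script with the polling loop in place of the sleep (this assumes, as described above, that entr creates ./notify as a FIFO, so -p is the right test, and that sleep accepts fractional seconds, as GNU sleep does):
#!/bin/bash
[ ! -p ./notify ] && find ./stylesheets/sass/ -maxdepth 1 -type f -regex '.*/[^_][^/]*\.scss$' | entr +notify &
# wait until the background entr has actually created the FIFO
until [ -p ./notify ]; do
    sleep 0.1
done
while read -r F; do
    echo "changed: $F"    # placeholder for the real processing
done < ./notify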

Related

Run a process only when the previous one is finished | Bash [duplicate]

This question already has answers here:
Quick-and-dirty way to ensure only one instance of a shell script is running at a time
I am using aria2 to download some data with the option --on-download-complete to run a bash script automatically to process the data.
aria2c --http-user='***' --http-passwd='***' --check-certificate=false --max-concurrent-downloads=2 -M products.meta4 --on-download-complete=/my/path/script_gpt.sh
Focusing on my bash script,
#!/bin/bash
oldEnd=.zip
newEnd=_processed.dim
for i in $(ls -d -1 /my/path/S1*.zip)
do
if [ -f ${i%$oldEnd}$newEnd ]; then
echo "Already processed"
else
gpt /my/path/graph.xml -Pinput1=$i -Poutput1=${i%$oldEnd}$newEnd
fi
done
Basically, every time a download is finished, a for loop starts. First it checks whether the downloaded product has already been processed, and if not it runs a specific task.
My issue is that every time a download is completed, the bash script is run. This means that if the analysis from the previous run has not finished yet, both tasks will overlap and eat all my memory resources.
Ideally, I would like to:
Each time the bash script is run, check if there is still an ongoing process.
If so, wait until it is finished and then run.
It's like creating a queue of tasks (as in a for loop where each iteration waits until the previous one is finished).
I have tried to implement a solution with wait or by identifying the PID, but nothing was successful.
Maybe I should change the approach and, instead of using aria2 to trigger processing of the data that has just been downloaded, implement another solution?
You can try to acquire an exclusive file lock and run only once the lock has been acquired (i.e. after any previous holder has released it). Your code could look like this:
#!/bin/bash
oldEnd=.zip
newEnd=_processed.dim
{
flock -e 200
while IFS= read -r -d '' i
do
if [ -f "${i%$oldEnd}$newEnd" ];
then
echo "Already processed"
else
gpt /my/path/graph.xml -Pinput1="$i" -Poutput1="${i%$oldEnd}$newEnd"
fi
done < <(find /my/path -maxdepth 1 -name "S1*.zip" -print0)
} 200> /tmp/aria.lock
This code acquires an exclusive lock on file descriptor 200 (the one we told bash to open for redirecting output to the lock file) and prevents other scripts from executing the code block until the file is closed. The file is closed as soon as the code block finishes, allowing other waiting processes to continue execution.
BTW, you should always quote your variables, and you should avoid parsing ls output. To avoid problems with whitespace and unexpected globbing, output the file list separated by NUL characters and read it back with read -d '', as done above.
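To see the waiting behaviour this relies on, here is a small sketch (same lock file as above; the sleep stands in for the long gpt run). Run it in two terminals: the second invocation blocks at flock until the first one leaves the block.
#!/bin/bash
{
    flock -e 200
    echo "$$ acquired the lock"
    sleep 10                     # stands in for the long-running processing
    echo "$$ releasing the lock"
} 200> /tmp/aria.lock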

Stop command after a given time and return its result in Bash

I need to execute several calls to a C++ program that records frames from a videogame. I have about 1800 test games, and some of them work and some of them don't.
When they don't work, the console returns a Segmentation fault error, but when they do work, the program opens a window and plays the game, and at the same time it records every frame.
The problem is that when it does work, this process does not end until you close the game window.
I need to make a Bash script that will test every game I have and write the names of the ones that work in a text file and the names of the ones that don't work in another file.
For the moment I have tried with this, using the timeout command:
count=0
# Run for every file in the ROMS folder
for filename in ../ROMs/*.bin; do
# Increase the counter
(( count++ ))
# Run the command with a timeout to prevent it from being infinite
timeout 5 ./doc/examples/videoRecordingExample "$filename"
# Check if execution succeeds/fails and print in a text file
if [ $? == 0 ]; then
echo "Game $count named $filename" >> successGames.txt
else
echo "Game $count named $filename" >> failedGames.txt
fi
done
But it doesn't seem to be working, because it writes all the names to the same file. I believe this is because the condition inside the if refers to the timeout and not to the execution of the C++ program itself.
Then I tried without the timeout, and every time a game worked I closed the window manually; then the result was as expected. I tried this with only 10 games, but when I test it with all 1800 I would need it to be completely automatic.
So, is there any way of making this process automatic? Like some command to stop the execution and at the same time know if it was successful or not?
instead of
timeout 5 ./doc/examples/videoRecordingExample "$filename"
you could try this:
./doc/examples/videoRecordingExample "$filename" && sleep 5 && pkill videoRecordingExample
Swap the arguments in the timeout code. It should be:
timeout 5 "$filename" ./doc/examples/videoRecordingExample
Reason: the syntax for timeout is:
timeout [OPTION] DURATION COMMAND [ARG]...
So the COMMAND should be just after the DURATION. In the code above the presumably non-executable file videoRecordingExample would be the COMMAND, which probably returns an error every time.
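A side note, not taken from either answer above: GNU coreutils timeout exits with status 124 when it has to kill the command, so the loop's if can distinguish a game that was still running (and recording) at the 5-second mark from one that crashed on its own. A sketch:
timeout 5 ./doc/examples/videoRecordingExample "$filename"
status=$?
if [ "$status" -eq 124 ]; then
    # timeout had to kill it: the game ran (and recorded) for the full 5 seconds
    echo "Game $count named $filename" >> successGames.txt
else
    # it exited on its own within 5 seconds, e.g. with a segmentation fault
    echo "Game $count named $filename (exit $status)" >> failedGames.txt
fi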

whether a shell script can be executed if another instance of the same script is already running

I have a shell script which usually runs for nearly 10 minutes per run. I need to know, if another request to run the script comes in while an instance is already running, whether the new request has to wait for the existing instance to complete, or whether a new instance will be started.
I need a new instance to be started whenever a request arrives for the same script.
How do I do that?
The shell script is a polling script which looks for a file in a directory and executes it. The execution of the file takes nearly 10 minutes or more, but if a new file arrives during execution, it also has to be executed simultaneously.
The shell script is below; how should I modify it to handle multiple requests?
#!/bin/bash
while [ 1 ]; do
newfiles=`find /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -newer /afs/rch/usr$
touch /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/.my_marker
if [ -n "$newfiles" ]; then
echo "found files $newfiles"
name2=`ls /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -Art |tail -n 2 |head $
echo " $name2 "
mkdir -p -m 0755 /afs/rch/usr8/fsptools/WWW/dumpspace/$name2
name1="/afs/rch/usr8/fsptools/WWW/dumpspace/fipsdumputils/fipsdumputil -e -$
$name1
touch /afs/rch/usr8/fsptools/WWW/dumpspace/tempfiles/$name2
fi
sleep 5
done
When writing scripts like the one you describe, I take one of two approaches.
First, you can use a pid file to indicate that a second copy should not run. For example:
#!/bin/sh
pidfile=/var/run/${0##*/}.pid
# remove pid if we exit normally or are terminated
trap "rm -f $pidfile" 0 1 3 15
# Write the pid as a symlink
if ! ln -s "pid=$$" "$pidfile"; then
echo "Already running. Exiting." >&2
exit 0
fi
# Do your stuff
I like using symlinks to store the pid because writing a symlink is an atomic operation; two processes can't conflict with each other. You don't even need to check for the existence of the pid symlink, because a failure of ln clearly indicates that a pid cannot be set. That's either a permission or path problem, or it's due to the symlink already being there.
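A quick way to see that atomicity in action (made-up pid values and path): the second ln fails because the link already exists, so only one caller ever wins.
ln -s "pid=1234" /tmp/demo.pid && echo "first caller got the lock"
ln -s "pid=5678" /tmp/demo.pid || echo "second caller sees it is already running"
rm -f /tmp/demo.pid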
The second option is to make it possible, nay, preferable, not to block additional instances, and instead configure whatever this script does so that multiple servers can run at the same time on different queue entries. "Single queue, single server" is never as good as "single queue, multiple servers". Since you haven't included code in your question, I have no way to know whether this approach would be useful for you, but here's some explanatory meta-bash:
#!/usr/bin/env bash
workdir=/var/tmp # Set a better $workdir than this.
a=( $(get_list_of_queue_ids) ) # A command? A function? Up to you.
for qid in "${a[@]}"; do
# Set a "lock" for this item .. or don't, and move on.
if ! ln -s "pid=$$" $workdir/$qid.working; then
continue
fi
# Do your stuff with just this $qid.
...
# And finally, clean up after ourselves
remove_qid_from_queue $qid
rm $workdir/$qid.working
done
The effect of this is to transfer the idea of "one at a time" from the handler to the data. If you have a multi-CPU system, you probably have enough capacity to handle multiple queue entries at the same time.
ghoti's answer shows some helpful techniques, if modifying the script is an option.
Generally speaking, for an existing script:
Unless you know with certainty that:
the script has no side effects other than to output to the terminal or to write to files with shell-instance specific names (such as incorporating $$, the current shell's PID, into filenames) or some other instance-specific location,
OR that the script was explicitly designed for parallel execution,
I would assume that you cannot safely run multiple copies of the script simultaneously.
It is not reasonable to expect the average shell script to be designed for concurrent use.
From the viewpoint of the operating system, several processes may of course execute the same program in parallel. No need to worry about this.
However, it is conceivable that a (careless) programmer wrote the program in such a way that it produces incorrect results when two copies are executed in parallel.
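As a small, hypothetical illustration of that kind of problem (paths made up, not from the original post): a script that writes to one fixed scratch file is not safe to run twice at once, while one that derives the file name per instance is.
#!/bin/bash
# NOT safe to run twice at once: both instances clobber the same scratch file
scratch=/tmp/report.tmp
grep ERROR /var/log/syslog > "$scratch"     # hypothetical input file
wc -l < "$scratch"

# safer: each instance gets its own file, e.g. via mktemp (or $$ in the name)
scratch=$(mktemp /tmp/report.XXXXXX)
grep ERROR /var/log/syslog > "$scratch"
wc -l < "$scratch"
rm -f "$scratch"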

How can I have output from one named pipe fed back into another named pipe?

I'm adding some custom logging functionality to a bash script, and can't figure out why it won't take the output from one named pipe and feed it back into another named pipe.
Here is a basic version of the script (http://pastebin.com/RMt1FYPc):
#!/bin/bash
PROGNAME=$(basename $(readlink -f $0))
LOG="$PROGNAME.log"
PIPE_LOG="$PROGNAME-$$-log"
PIPE_ECHO="$PROGNAME-$$-echo"
# program output to log file and optionally echo to screen (if $1 is "-e")
log () {
if [ "$1" = '-e' ]; then
shift
"$@" > $PIPE_ECHO 2>&1
else
"$@" > $PIPE_LOG 2>&1
fi
}
# create named pipes if not exist
if [[ ! -p $PIPE_LOG ]]; then
mkfifo -m 600 $PIPE_LOG
fi
if [[ ! -p $PIPE_ECHO ]]; then
mkfifo -m 600 $PIPE_ECHO
fi
# cat pipe data to log file
while read data; do
echo -e "$PROGNAME: $data" >> $LOG
done < $PIPE_LOG &
# cat pipe data to log file & echo output to screen
while read data; do
echo -e "$PROGNAME: $data"
log echo $data # this doesn't work
echo -e $data > $PIPE_LOG 2>&1 # and neither does this
echo -e "$PROGNAME: $data" >> $LOG # so I have to do this
done < $PIPE_ECHO &
# clean up temp files & pipes
clean_up () {
# remove named pipes
rm -f $PIPE_LOG
rm -f $PIPE_ECHO
}
#execute "clean_up" on exit
trap "clean_up" EXIT
log echo "Log File Only"
log -e echo "Echo & Log File"
I thought the two commands marked "this doesn't work" above would take the $data from $PIPE_ECHO and output it to $PIPE_LOG. But they don't work. Instead I have to send that output directly to the log file, without going through $PIPE_LOG.
Why is this not working as I expect?
EDIT: I changed the shebang to "bash". The problem is the same, though.
SOLUTION: A.H.'s answer helped me understand that I wasn't using named pipes correctly. I have since solved my problem by not even using named pipes. That solution is here: http://pastebin.com/VFLjZpC3
It seems to me you do not understand what a named pipe really is. A named pipe is not one stream like a normal pipe. It is a series of normal pipes, because a named pipe can be closed, and a close on the producer side might be shown as a close on the consumer side.
The "might be" part is this: the consumer will read data until there is no more data. "No more data" means that, at the time of the read call, no producer has the named pipe open. This means that multiple producers can feed one consumer only when there is no point in time without at least one producer. Think of it as a door which closes automatically: if there is a steady stream of people keeping the door open, either by handing the doorknob to the next one or by squeezing multiple people through it at the same time, the door stays open. But once the door is closed, it stays closed.
A little demonstration should make the difference a little clearer:
Open three shells. First shell:
1> mkfifo xxx
1> cat xxx
no output is shown because cat has opened the named pipe and is waiting for data.
Second shell:
2> cat > xxx
no output, because this cat is a producer which keeps the named pipe open until we tell it to close explicitly.
Third shell:
3> echo Hello > xxx
3>
This producer immediately returns.
First shell:
Hello
The consumer received data, wrote it and, since one producer still keeps the door open, continues to wait.
Third shell
3> echo World > xxx
3>
First shell:
World
The consumer received data, wrote it and, since one producer still keeps the door open, continues to wait.
Second Shell: write into the cat > xxx window:
And good bye!
(control-d key)
2>
First shell
And good bye!
1>
The ^D key closed the last producer, the cat > xxx, and hence the consumer exits also.
In your case which means:
Your log function will try to open and close the pipes multiple times. Not a good idea.
Both your while loops exit earlier than you think. (Check this with ( while ... done < $PIPE_X; echo FINISHED ) & )
Depending on the scheduling of your various producers and consumers, the door might be slammed shut sometimes and sometimes not; you have a race condition built in. (For testing you can add a sleep 1 at the end of the log function.)
Your "test cases" only try each possibility once. Try to use them multiple times (you will block, especially with the sleeps), because your producer might not find any consumer.
So I can explain the problems in your code but I cannot tell you a solution because it is unclear what the edges of your requirements are.
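Not part of the original answers, but one common way around the first problem above (opening and closing the pipes again and again) is to hold a write end of the FIFO open for the whole lifetime of the script with exec, so the reader only sees end-of-file when you close it deliberately. A minimal sketch with demo names, not the script's own:
#!/bin/bash
PIPE_LOG=demo-$$-log
LOG=demo.log
mkfifo -m 600 "$PIPE_LOG"
# reader: runs until every write end of the FIFO is closed
while read -r data; do
    echo "demo: $data" >> "$LOG"
done < "$PIPE_LOG" &
# hold one write end open on fd 3 for the whole script
exec 3> "$PIPE_LOG"
echo "Log File Only" >&3
echo "Echo & Log File" >&3
exec 3>&-        # closing fd 3 lets the reader loop finish
wait
rm -f "$PIPE_LOG"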
It seems the problem is in the "cat pipe data to log file" part.
Let's see: you use a "&" to put the loop in the background, I guess you mean it must run in parallel with the second loop.
But the problem is you don't even need the "&", because as soon as no more data is available in the fifo, the while..read stops (still, you've got to have some data at first for the first read to work). The next read doesn't hang if no more data is available (which would pose another problem: how does your program stop?).
I guess the while read checks if more data is available in the file before doing the read and stops if it's not the case.
You can check with this sample:
mkfifo foo
while read data; do echo $data; done < foo
This script will hang, until you write anything from another shell (or bg the first one). But it ends as soon as a read works.
Edit:
I've tested on RHEL 6.2 and it works as you say (i.e., badly!).
The problem is that, after running the script (let's say script "a"), you've got an "a" process remaining. So, yes, in some way the script hangs as I wrote before (not as stupid an answer as I thought then :) ), except if you write only one log (be it log file only or echo; in that case it works).
(It's the read loop from PIPE_ECHO that hangs when writing to PIPE_LOG and leaves a process running each time).
I've added a few debug messages, and here is what I see:
only one line is read from PIPE_LOG and after that, the loop ends
then a second message is sent to the PIPE_LOG (after being received from the PIPE_ECHO), but the process no longer reads from PIPE_LOG => the write hangs.
When you ls -l /proc/[pid]/fd, you can see that the fifo is still open (but deleted).
If fact, the script exits and removes the fifos, but there is still one process using it.
If you don't remove the log fifo at the cleanup and cat it, it will free the hanging process.
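For example (with a made-up pid in the fifo name), a single read from the leftover fifo is enough to unblock and finish the stuck process:
cat ./scriptname-12345-log > /dev/null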
Hope it will help...

How to extend bash shell?

I would like to add new functionality to the bash shell. I need to have a queue for executions.
What is the easy way to add new functionality to the bash shell while keeping all of its native functionality?
I would like to process the command line, then let bash execute it. For users it should be transparent.
Thanks Arman
EDIT
I just discovered prll.sourceforge.net; it does exactly what I need.
It's easier than it seems:
#!/bin/sh
yourfunctiona(){ ...; }
...
yourfunctionz(){ ...; }
. /path/to/file/with/more/functions
while read COMMANDS; do
eval "$COMMANDS"
done
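For instance, a minimal self-contained version of that skeleton (the two functions are just made-up examples):
#!/bin/sh
greet() { echo "hello $*"; }
countfiles() { ls -1 | wc -l; }
while read COMMANDS; do
    eval "$COMMANDS"    # your functions and every native command keep working
done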
You can use read -p if you need a prompt, or -t if you want it to time out. Or, if you wanted, you could even use your favorite dialog program in place of read and pipe the output to a tailbox:
touch /tmp/mycmdline
Xdialog --tailbox /tmp/mycmdline 0 0 &
COMMANDS="echo "
while ([ "$COMMANDS" != "" ]); do
COMMANDS=`Xdialog --stdout --inputbox "Text here" 0 0`
eval "$COMMANDS"
done >>/tmp/mycmdline &
To execute commands in threads you can use the following in place of eval $COMMANDS
# this will need to be set before the loop
NUMCORES=$(awk '/cpu cores/{sum += $4}END{print sum}' /proc/cpuinfo)
for ((i = 1; i <= NUMCORES; i++)); do   # {1..$NUMCORES} would not expand the variable
    # (see the notes below for the case where every core is busy)
    if [ -n "${threadarray[$i]}" ] && [ -d "/proc/${threadarray[$i]}" ]; then
        # this core already has a thread
        # note: each process gets a directory named /proc/<its_pid> - hacky, but works
        continue
    else                                # this core is free
        $COMMAND &
        threadarray[$i]=$!
        break
    fi
done
Then there is the case where you fill up all threads.
You can either put the whole thing in a while loop and add continues and breaks,
or you can pick a core to wait for (probably the last) and wait for it
to wait for a single thread to complete use:
wait "${threadarray[$i]}"
to wait for all threads to complete use:
wait
#I ended up using this to keep my load from getting too high for too long
Another note: you may find that some commands don't like to be threaded; if so, you can put the whole thing in a case statement.
I'll try to do some cleanup on this soon to put all of the little blocks together (sorry, I'm cobbling this together from random notes that I used to implement this exact thing, but can't seem to find).
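A simpler way to cap how many commands run at once, offered only as a sketch (it assumes bash 4.3+ for wait -n and coreutils nproc):
#!/bin/bash
MAXJOBS=$(nproc)
while read -r COMMANDS; do
    # once MAXJOBS background jobs are running, wait for any one to finish
    while [ "$(jobs -rp | wc -l)" -ge "$MAXJOBS" ]; do
        wait -n
    done
    eval "$COMMANDS" &
done
wait    # let the remaining jobs finish before exiting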
