Run a process only when the previous one is finished | Bash [duplicate]

This question already has answers here:
Quick-and-dirty way to ensure only one instance of a shell script is running at a time
(43 answers)
Closed 3 years ago.
I am using aria2 to download some data with the option --on-download-complete to run a bash script automatically to process the data.
aria2c --http-user='***' --http-passwd='***' --check-certificate=false --max-concurrent-downloads=2 -M products.meta4 --on-download-complete=/my/path/script_gpt.sh
Focusing on my bash script,
#!/bin/bash
oldEnd=.zip
newEnd=_processed.dim
for i in $(ls -d -1 /my/path/S1*.zip)
do
    if [ -f ${i%$oldEnd}$newEnd ]; then
        echo "Already processed"
    else
        gpt /my/path/graph.xml -Pinput1=$i -Poutput1=${i%$oldEnd}$newEnd
    fi
done
Basically, every time a download finishes, a for loop starts. First it checks whether the downloaded product has already been processed, and if not, it runs a specific task.
My issue is that every time a download completes, the bash script is run. This means that if the analysis from the previous run has not finished yet, the two tasks overlap and eat all my memory resources.
Ideally, I would like to:
Each time the bash script is run, check if there is still an ongoing process.
If so, wait until it is finished and then run.
It's like creating a queue of tasks (as in a for loop, where each iteration waits until the previous one is finished).
I have tried to implement a solution with wait or by identifying the PID, but nothing successful.
Maybe I should change the approach: instead of having aria2 trigger processing of the data that has just been downloaded, implement some other solution?

You can try to acquire an exclusive file lock and run only when the lock is released. Your code could look like this:
#!/bin/bash
oldEnd=.zip
newEnd=_processed.dim
{
    flock -e 200
    while IFS= read -r -d '' i
    do
        if [ -f "${i%$oldEnd}$newEnd" ]
        then
            echo "Already processed"
        else
            gpt /my/path/graph.xml -Pinput1="$i" -Poutput1="${i%$oldEnd}$newEnd"
        fi
    done < <(find /my/path -maxdepth 1 -name "S1*.zip" -print0)
} 200> /tmp/aria.lock
This code acquires an exclusive lock on file descriptor 200 (the one we told bash to open for redirecting output to the lock file) and prevents other instances of the script from executing the code block until the file is closed. The file is closed as soon as the code block finishes, allowing the next waiting process to continue.
BTW, you should always quote your variables, and you should avoid parsing ls output. Also, to avoid problems with whitespace and unexpected globbing, have find output the file list NUL-separated and read it back with read -d ''; that sidesteps both issues.
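If you would rather have an overlapping run skip the work entirely instead of queueing behind the lock, flock's -n flag makes the lock attempt fail immediately rather than wait. A minimal sketch of that variant, reusing the same lock file as above:
#!/bin/bash
{
    # -n: give up at once if another instance already holds the lock
    if ! flock -n -e 200; then
        echo "Another run is still in progress; exiting." >&2
        exit 0
    fi
    # ... same processing loop as above ...
} 200> /tmp/aria.lock
Skipping is usually safe here because every later --on-download-complete invocation re-scans the directory and picks up any still-unprocessed files.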

Related

Is there a way to know if the `script` command is running in the current bash session?

I'm attempting to add a command to my ~/.zshrc file to record command-line input and output. I have the script command (https://www.tecmint.com/record-and-replay-linux-terminal-session-commands-using-script/) running, but it seems to attempt to record my session infinitely. I believe this is because running script starts a new shell session, which in turn sources ~/.zshrc, which then tries to record the session again, which restarts the session, and so on. This causes an infinite loop of attempts to record the session.
Attempt 1
Relevant piece of my ~/.zshrc
# RECORD COMMAND INPUT AND OUTPUT
LOG_PATH=/var/log/terminal/$(date +'%Y%m%d')
LOG_FILE=${LOG_PATH}/$(date +'%H%M%S').log
mkdir -p ${LOG_PATH}
script ${LOG_FILE}
This results in a session which constantly prints the following:
Script started, output file is /var/log/terminal/20200109/141849.log
Script started, output file is /var/log/terminal/20200109/141850.log
Script started, output file is /var/log/terminal/20200109/141851.log
Script started, output file is /var/log/terminal/20200109/141852.log
Script started, output file is /var/log/terminal/20200109/141853.log
Script started, output file is /var/log/terminal/20200109/141854.log
Script started, output file is /var/log/terminal/20200109/141855.log
... repeat infinitely ...
Attempt 2
One attempt at working around this was to check if a file already exists and only then go on with recording. This works somewhat, but sometimes it creates 2 files, and sometimes the recording doesn't start at all, say if I open 2 command sessions in rapid succession.
Upgraded script
# RECORD COMMAND INPUT AND OUTPUT
# Check if the recording of a file for the given time has already started
# as it seems that once you start the script recording it re-starts the session
# which in turn re-runs this file which attempts to script again running into an infinite loop
LOG_PATH=/var/log/terminal/$(date +'%Y%m%d')
LOG_FILE=${LOG_PATH}/$(date +'%H%M%S').log
if [[ ! -f $LOG_FILE ]]; then
    mkdir -p ${LOG_PATH}
    script ${LOG_FILE}
fi
Again, this either yields no recording (in the case of opening 2 sessions quickly) or results in the recording being started twice:
Script started, output file is /var/log/terminal/20200109/141903.log
Script started, output file is /var/log/terminal/20200109/141904.log
Attempt 3
Another attempt was to check the shell history and see if the last command contained the word script. The problem with this attempt was that the session doesn't seem to have any history when starting, so it produced the following error (and then, like the first attempt, tried infinitely to start the recording session):
# RECORD COMMAND INPUT AND OUTPUT
# Check what the last command was to make sure that it wasn't the script starting
# as it seems that once you start the script recording it re-starts the session
# which in turn re-runs this file which attempts to script again running into an infinite loop
LAST_HISTORY=$(history -1)
if [[ "$LAST_HISTORY" != *"script"* ]]; then
LOG_PATH=/var/log/terminal/$(date +'%Y%m%d')
LOG_FILE=${LOG_PATH}/$(date +'%H%M%S').log
mkdir -p ${LOG_PATH}
script ${LOG_FILE}
fi
Output
Last login: Thu Jan 9 14:27:30 on ttys009
Script started, output file is /var/log/terminal/20200109/142754.log
Script started, output file is /var/log/terminal/20200109/142755.log
omz_history:fc:13: no such event: 0
omz_history:fc:13: no events in that range
Script started, output file is /var/log/terminal/20200109/142755.log
omz_history:fc:13: no such event: 0
omz_history:fc:13: no events in that range
Script started, output file is /var/log/terminal/20200109/142755.log
Script started, output file is /var/log/terminal/20200109/142756.log
^C% danielcarmo@Daniels-MacBook-Pro-2 git %
Any suggestions or thoughts on this would be much appreciated.
Instead of trying to figure out whether or not script is running, simply leave a breadcrumb for yourself:
if [ -z "$BEING_LOGGED" ]
then
    export BEING_LOGGED="yes"
    LOG_PATH="/var/log/terminal/$(date +'%Y%m%d')"
    LOG_FILE="${LOG_PATH}/$(date +'%H%M%S').log"
    mkdir -p "${LOG_PATH}"
    exec script "${LOG_FILE}"
fi
The first time the file is sourced, BEING_LOGGED will be unset. When script is invoked and the file is sourced again, it'll be set, and you can skip logging.
script adds the variable SCRIPT to the environment of the command it runs. You can check for that to decide whether script needs to run.
# RECORD COMMAND INPUT AND OUTPUT
log_path=/var/log/terminal/$(date +'%Y%m%d')
log_file=${log_path}/$(date +'%H%M%S').log
mkdir -p "${log_path}"
[ -z "$SCRIPT" ] && script "${log_file}"
The value of SCRIPT is the name of the file being logged to.
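The two answers combine naturally if you also want the logging session to replace the current shell, as the exec in the first answer does. A minimal sketch, assuming script really does export SCRIPT as described above:
# RECORD COMMAND INPUT AND OUTPUT
# guard on $SCRIPT so the shell started by script skips this block
if [ -z "$SCRIPT" ]; then
    log_path=/var/log/terminal/$(date +'%Y%m%d')
    mkdir -p "${log_path}"
    exec script "${log_path}/$(date +'%H%M%S').log"
fi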

Loop through docker output until I find a string in bash

I am quite new to bash (barely any experience at all) and I need some help with a bash script.
I am using docker-compose to create multiple containers; for this example, let's say 2 containers. The 2nd container will execute a bash command, but before that, I need to check that the 1st container is operational and fully configured. Instead of using a sleep command, I want to create a bash script that will be located in the 2nd container and, once executed, will do the following:
Execute a command and log the console output in a file.
Read that file and check if a string is present. The command executed in the previous step will take a few (5 - 10) seconds to complete, and I need to read the file after it has finished executing. I suppose I can add a sleep to make sure the command has finished, or is there a better way to do this?
If the string is not present, execute the same command again until I find the string I am looking for.
Once I find the string I am looking for, exit the loop and execute a different command.
I found out how to do this in Java, but I need to do it in a bash script.
The docker containers run Alpine as the operating system, but I updated the Dockerfile to install bash.
I tried this solution, but it does not work.
#!/bin/bash
[command to be executed] > allout.txt 2>&1
until
    tail -n 0 -F /path/to/file | \
    while read LINE
    do
        if echo "$LINE" | grep -q $string
        then
            echo -e "$string found in the console output"
        fi
    done
do
    echo "String is not present. Executing command again"
    sleep 5
    [command to be executed] > allout.txt 2>&1
done
echo -e "String is found"
In your docker-compose file, make use of the depends_on option.
depends_on takes care of the startup and shutdown sequence of your multiple containers.
But it does not check whether a container is ready before moving on to start the next one. To handle this scenario, check this out.
As described in this link,
You can use tools such as wait-for-it, dockerize, or sh-compatible wait-for. These are small wrapper scripts which you can include in your application’s image to poll a given host and port until it’s accepting TCP connections.
OR
Alternatively, write your own wrapper script to perform a more application-specific health check.
In case you don't want to make use of the above tools, check this out. There they use a combination of HEALTHCHECK and the service_healthy condition, as shown here. For a complete example, check this.
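Purely for illustration, a hand-rolled wrapper in the spirit of the wait-for scripts mentioned above could look like the sketch below. The host name host, port 5432, and original_command are placeholders, and it assumes nc (netcat) is present in the image:
#!/bin/sh
# poll until the dependency accepts TCP connections, then hand off
until nc -z host 5432; do
    echo "waiting for host:5432 ..."
    sleep 1
done
exec original_command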
Just:
while :; do
    # 1. Execute a command and log the console output in a file
    command > output.log
    # TODO: handle errors, etc.
    # 2. Read that file and check if a String is present.
    if grep -q "searched_string" output.log; then
        # Once I find the string I am looking for I want to exit the loop
        break
    fi
    # 3. If the string is not present, execute the same command again
    # until the string is found; add e.g. "sleep 0.1" here so the loop
    # delays a little and does not use 100% CPU
done
# ...and execute a different command
different_command
You can timeout a command with timeout.
Notes:
colon is a utility that returns a zero exit status, much like true. I prefer while : over while true; they mean the same.
The code presented should work in any POSIX shell.
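If you also want to bound how long the whole loop may run, the timeout utility mentioned above can wrap it. A hedged sketch, using the same placeholders as the answer (command, output.log, searched_string) and an arbitrary 5-minute cap:
# give up if the string has not appeared after 300 seconds
timeout 300 sh -c '
    until grep -q "searched_string" output.log 2>/dev/null; do
        command > output.log
        sleep 1
    done
' || { echo "timed out waiting for searched_string" >&2; exit 1; }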

How can I start a subscript within a perpetually running bash script after a specific string has been printed in the terminal output?

Specifics:
I'm trying to build a bash script which needs to do a couple of things.
Firstly, it needs to run a third party script that I cannot manipulate. This script will build a project and then start a node server which outputs data to the terminal continually. This process needs to continue indefinitely so I can't have any exit codes.
Secondly, I need to wait for a specific line of output from the first script, namely 'Started your app.'.
Once that line has been output to the terminal, I need to launch a separate set of commands, either from another subscript or from an if or while block, which will change a few lines of code in the project that was built by the first script to resolve some dependencies for a later step.
So, how can I capture the output of the first subscript and use it to run another set of commands once a particular line is output to the terminal, all while letting the first script keep running in the terminal, without using timers, and without creating a huge file from the output of subscript1 (since it runs indefinitely)?
Pseudo-code:
#!/usr/bin/env bash
# This script needs to stay running & will output to the terminal (at some point)
# a string that we need to wait/watch for to launch subscript2
sh subscript1
# This can't run until subscript1 has output a particular string to the terminal
# This could be another script, or an if or while block
sh subscript2
I have been beating my head against my desk for hours trying to get this to work. Any help would be appreciated!
I think this is a bad idea (much better to have subscript1 changed to be automation-friendly), but in theory you can write:
sh subscript1 \
| {
    while IFS= read -r line ; do
        printf '%s\n' "$line"
        if [[ "$line" = 'Started your app.' ]] ; then
            sh subscript2 &
            break
        fi
    done
    cat
}
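The trailing cat matters: after the loop breaks, it keeps draining and echoing subscript1's output so the pipe never fills up and blocks the server. One more caveat, hedged because it depends on how subscript1 is written: many programs switch from line buffering to block buffering when stdout is a pipe, so the trigger line may arrive late. If GNU coreutils is available, stdbuf can force line buffering on the producer side:
# force line-buffered stdout on the producer (GNU coreutils stdbuf)
stdbuf -oL sh subscript1 \
| {
    # ... same read loop as above, then the trailing cat ...
    cat
}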

Whether a shell script can be executed if another instance of the same script is already running

I have a shell script which usually takes nearly 10 minutes for a single run. I need to know: if a request to run the script comes in while an instance of the script is already running, does the new request have to wait for the existing instance to complete, or will a new instance be started?
I need a new instance to be started whenever a request for the same script arrives.
How do I do that?
The shell script is a polling script which looks for a file in a directory and executes that file. The execution takes nearly 10 minutes or more, but if a new file arrives during execution, it also has to be executed simultaneously.
The shell script is below; how do I modify it to execute multiple requests?
#!/bin/bash
while [ 1 ]; do
newfiles=`find /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -newer /afs/rch/usr$
touch /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/.my_marker
if [ -n "$newfiles" ]; then
echo "found files $newfiles"
name2=`ls /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -Art |tail -n 2 |head $
echo " $name2 "
mkdir -p -m 0755 /afs/rch/usr8/fsptools/WWW/dumpspace/$name2
name1="/afs/rch/usr8/fsptools/WWW/dumpspace/fipsdumputils/fipsdumputil -e -$
$name1
touch /afs/rch/usr8/fsptools/WWW/dumpspace/tempfiles/$name2
fi
sleep 5
done
When writing scripts like the one you describe, I take one of two approaches.
First, you can use a pid file to indicate that a second copy should not run. For example:
#!/bin/sh
pidfile=/var/run/${0##*/}.pid
# remove pid if we exit normally or are terminated
trap "rm -f $pidfile" 0 1 3 15
# Write the pid as a symlink
if ! ln -s "pid=$$" "$pidfile"; then
    echo "Already running. Exiting." >&2
    exit 0
fi
# Do your stuff
I like using symlinks to store pid because writing a symlink is an atomic operation; two processes can't conflict with each other. You don't even need to check for the existence of the pid symlink, because a failure of ln clearly indicates that a pid cannot be set. That's either a permission or path problem, or it's due to the symlink already being there.
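One gap worth noting: if the script dies without running its trap (for example, kill -9), the symlink lingers and blocks all future runs. A hedged sketch of stale-lock recovery, reusing the pid recorded in the link target:
# on lock failure, check whether the recorded pid is still alive
if ! ln -s "pid=$$" "$pidfile" 2>/dev/null; then
    oldpid=$(readlink "$pidfile")
    oldpid=${oldpid#pid=}
    if kill -0 "$oldpid" 2>/dev/null; then
        echo "Already running as pid $oldpid. Exiting." >&2
        exit 0
    fi
    # stale lock: remove it and try exactly once more
    rm -f "$pidfile"
    ln -s "pid=$$" "$pidfile" || exit 0
fi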
Second option is to make it possible .. nay, preferable .. not to block additional instances, and instead configure whatever it is that this script does to permit multiple servers to run at the same time on different queue entries. "Single-queue-single-server" is never as good as "single-queue-multi-server". Since you haven't included code in your question, I have no way to know whether this approach would be useful for you, but here's some explanatory meta bash:
#!/usr/bin/env bash
workdir=/var/tmp # Set a better $workdir than this.
a=( $(get_list_of_queue_ids) ) # A command? A function? Up to you.
for qid in "${a[@]}"; do
    # Set a "lock" for this item .. or don't, and move on.
    if ! ln -s "pid=$$" "$workdir/$qid.working"; then
        continue
    fi
    # Do your stuff with just this $qid.
    ...
    # And finally, clean up after ourselves
    remove_qid_from_queue "$qid"
    rm "$workdir/$qid.working"
done
The effect of this is to transfer the idea of "one at a time" from the handler to the data. If you have a multi-CPU system, you probably have enough capacity to handle multiple queue entries at the same time.
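Applied to the polling script in the question, the same idea might look roughly like the sketch below: each newly found file gets its own lock and its own background worker, so a long job never blocks the poll loop. process_file is a hypothetical stand-in for the real fipsdumputil handling, and the lock directory is arbitrary:
# one background worker and one lock per incoming file
lockdir=/var/tmp/upload-locks
mkdir -p "$lockdir"
for f in $newfiles; do
    (
        # skip files another worker has already claimed
        ln -s "pid=$$" "$lockdir/$(basename "$f").working" 2>/dev/null || exit 0
        process_file "$f" # hypothetical: the real per-file processing
        rm -f "$lockdir/$(basename "$f").working"
    ) &
done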
ghoti's answer shows some helpful techniques, if modifying the script is an option.
Generally speaking, for an existing script:
Unless you know with certainty that:
the script has no side effects other than to output to the terminal or to write to files with shell-instance specific names (such as incorporating $$, the current shell's PID, into filenames) or some other instance-specific location,
OR that the script was explicitly designed for parallel execution,
I would assume that you cannot safely run multiple copies of the script simultaneously.
It is not reasonable to expect the average shell script to be designed for concurrent use.
From the viewpoint of the operating system, several processes may of course execute the same program in parallel. No need to worry about this.
However, it is conceivable that a (careless) programmer wrote the program in such a way that it produces incorrect results when two copies are executed in parallel.
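A concrete sketch of that hazard: two instances sharing one fixed scratch file clobber each other, which the instance-specific names ($$-based or mktemp) mentioned above avoid. generate_data and consume are placeholders:
# hazardous: every instance writes to the same fixed path
tmp=/tmp/myscript.tmp
generate_data > "$tmp" # a second instance can overwrite this mid-run
consume < "$tmp"

# safer: a per-instance name via mktemp (or embed $$ in the filename)
tmp=$(mktemp) || exit 1
generate_data > "$tmp"
consume < "$tmp"
rm -f "$tmp"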

bash script with background process

I have the following in my script:
#!/bin/bash
[ ! -S ./notify ] && find ./stylesheets/sass/ -maxdepth 1 -type f -regex '.*/[^_][^/]*\.scss$' | entr +notify &
What entr does here is create notify as a named pipe.
[ insert ]
while read F; do
    ...
    # some processing on found files
    # (does not matter for this question at all)
    ...
done < notify
The problem is that the first time I run the script, it sees there is no notify pipe, so it creates one and puts the process into the background.
But then the following while loop complains that it cannot find notify to read from.
However, when I run the script immediately after that, i.e. the second time, it continues normally through the rest of the program (the while loop part).
How would I fix this so that it runs correctly as a whole?
EDIT:
If I put sleep 1; into the [ insert ] placeholder above, it works, but I would like a better solution for checking when that notify fifo exists, as sometimes it may need more than 1 second.
You can always poll for the named pipe to be created:
until [ -p notify ]; do read -t 0.1; done
If you don't specifically need to maintain variables between runs, you could also consider using a script rather than entr's +notify. That would avoid the problem.
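Putting the poll together with the script from the question, a hedged sketch that also caps the wait instead of looping forever (the 10-second bound is an arbitrary choice):
# wait for entr to create the pipe, but give up after roughly 10 seconds
tries=0
until [ -p ./notify ]; do
    sleep 0.1
    tries=$((tries + 1))
    if [ "$tries" -gt 100 ]; then
        echo "notify pipe never appeared" >&2
        exit 1
    fi
done
while read F; do
    ... # the processing from the question
done < ./notify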
