Parallel subshells doing work and reporting status - bash

I am trying to do work in all subfolders in parallel and report a status per folder once it is done, in bash.
Suppose I have a work function which can return a couple of statuses:
# param #1 is the folder
# can return 1 on fail, 2 on success, 3 on nothing happened
work(){
    cd "$1" || return 1
    # ... some update thing ...
    return 1  # or 2, or 3
}
Now I call this in my wrapper function:
do_work(){
    while read -r folder; do
        tput cup "${row}" 20
        echo -n "${folder}"
        (
            ret=$(work "${folder}")
            tput cup "${row}" 0
            [[ $ret -eq 1 ]] && echo " \e[0;31mupdate failed \uf00d\e[0m"
            [[ $ret -eq 2 ]] && echo " \e[0;32mupdated \uf00c\e[0m"
            [[ $ret -eq 3 ]] && echo " \e[0;32malready up to date \uf00c\e[0m"
        ) &>/dev/null
        pids+=("${!}")
        ((++row))
    done < <(find . -maxdepth 1 -mindepth 1 -type d -printf "%f\n" | sort)
    echo "waiting for pids ${pids[*]}"
    wait "${pids[@]}"
}
What I want is for it to print all the folders, one per line, update them in parallel, independently of each other, and write each folder's status into its line once it is done.
However, I am unsure which subshell is writing, which PIDs I need to capture, how, and so on.
My attempt above is currently not writing correctly, and not running in parallel.
If I do get it to run in parallel, I get those [1] <PID> and [1] + 3156389 done ... job-control messages messing up my screen.
If I put the work itself in a subshell, I don't have anything to wait for.
If I instead collect the pids, I don't get the return code needed to print the status text.
I did have a look at GNU Parallel, but I don't think I can get that behaviour. (I think I could hack it so that finished jobs are printed, but I want all 'running' jobs to be printed, with the finished ones amended.)

Assumptions/understandings:
a separate child process is spawned for each folder to be processed
the child process generates messages as work progresses
messages from child processes are to be displayed in the console in real time, with each child's latest message being displayed on a different line
The general idea is to set up a means of interprocess communication (IPC) ... named pipe, normal file, queuing/messaging system, sockets (plenty of ideas available via a web search on bash interprocess communication); the children write to this channel while the parent reads from it and issues the appropriate tput commands.
One very simple example using a normal file:
> status.msgs        # initialize our IPC file
child_func () {
    # Usage: child_func <unique_id> <other> ... <args>
    local i
    for ((i=1; i<=10; i++))
    do
        sleep "$1"
        # each message should include the child's <unique_id> ($1 in this case);
        # the parent/monitoring process uses this <unique_id> to control tput output
        echo "$1:message - $1.$i" >> status.msgs
    done
}
clear
( child_func 3 & )
( child_func 5 & )
( child_func 2 & )
while IFS=: read -r child msg
do
    tput cup "$child" 10
    echo "$msg"
done < <(tail -f status.msgs)
NOTES:
the (child_func 3 &) construct is one way to keep the OS message re: 'background process completed' from showing up in stdout (there may be other ways but I'm drawing a blank at the moment)
when using a file (normal, pipe) OP will want to look at a locking method (flock?) to ensure messages from multiple children don't stomp on each other; a sketch of this follows the sample output below
OP can get creative with the format of the messages printed to status.msgs, in conjunction with parsing logic in the parent's while loop
assuming variable-width messages, OP may want to look at appending a tput el to the end of each printed message in order to 'erase' any characters left over from a previous/longer message
exiting the loop could be as simple as keeping count of the number of child processes that send a message <id>:done, or keeping track of the number of children still running in the background, or ...
Running this at my command line generates 3 separate lines of output that are updated at various times (based on the sleep $1):
# no output on line #1
message - 2.10   # messages change from 2.1 to 2.2 to ... to 2.10
message - 3.10   # messages change from 3.1 to 3.2 to ... to 3.10
# no output on line #4
message - 5.10   # messages change from 5.1 to 5.2 to ... to 5.10
NOTE: comments not actually displayed in console
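As a sketch of the flock idea from the notes above (my addition, assuming the same status.msgs file and the util-linux flock(1) tool):
child_func () {
    local i
    for ((i=1; i<=10; i++))
    do
        sleep "$1"
        {
            # take an exclusive lock on fd 9 (bound to status.msgs) so that
            # concurrent children cannot interleave partial lines
            flock -x 9
            echo "$1:message - $1.$i" >&9
        } 9>> status.msgs
    done
}
The lock is released automatically when fd 9 is closed at the end of the group.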

Based on @markp-fuso's answer:
printer() {
    while IFS=$'\t' read -r child msg
    do
        tput cup "$child" 10
        echo "$child $msg"
    done
}
clear
parallel --lb --tagstring "{%}\t{}" work ::: folder1 folder2 folder3 | printer
echo
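Note that for GNU Parallel to run a shell function such as work, the function must be visible to the shells it spawns; one way to arrange that (an assumption about your setup, not shown above) is to export it first:
export -f work
parallel --lb --tagstring "{%}\t{}" work ::: folder1 folder2 folder3 | printer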

You can't capture return statuses like that: $(...) captures stdout, not the exit code. Try this instead; rework your work function to echo its status:
work(){
    cd "$1" || return
    # some update thing, &> /dev/null so it produces no output
    echo "${1}_$status"   # status = 1, 2, 3
}
And then collect the data from all folders like so:
data=$(
    while read -r folder; do
        work "$folder" &
    done < <(find . -maxdepth 1 -mindepth 1 -type d -printf "%f\n" | sort)
    wait
)
echo "$data"

Related

Cannot understand why our function calls return twice?

We have a 15-year-old (or so) script we are trying to figure out and document. We have found some errors in it, but one specific log file is giving us a real headache, and I would love some help figuring it out.
First, the function in question:
#=========================================================#
# Define function removeOldBackupFile. #
#=========================================================#
removeOldBackupFile()
{
    #set -x
    echo "Removing old backups if they exists." >> "${report}"
    local RCLOC=0
    spaceBefore=$(getAvailableSpace ${backupDirectory})
    timesToWait=60    # Wait a maximum of 10 minutes before bailing
    cat ${oldDbContainer} | while read fileName
    do
        echo "Old file exists. Removing ${fileName}." >> "${report}"
        removeFileIfExist "${fileName}"
        RC=$?
        echo "Resultcode for removing old backup is: RC=$RC." >> "${report}"
        RCLOC=$(($RC+$RCLOC))
        spaceAfter=$(getAvailableSpace ${backupDirectory})
        # Wait for the OS to register that the file is removed
        cnt=0
        while [ $spaceAfter -le $spaceBefore ]; do
            cnt=$((cnt+1))
            if [ $cnt -gt $timesToWait ]; then
                echo "Waited too long for space in ${backupDirectory}" | tee -a "${report}"
                RCLOC=$(($RCLOC+1))
                return $RCLOC
            fi
            sleep 10
            spaceAfter=$(getAvailableSpace ${backupDirectory})
        done
    done
    return $RCLOC
}
The place where this function is run looks as follows:
#=========================================================#
# Remove old backupfiles if any exist. #
#=========================================================#
removeOldBackupFile
RC=$?
RCSUM=$(($RC+$RCSUM))
We have identified that the if condition is a bit wrong and the while loops would not work as intended if there are multiple files.
But what bothers us is the output in a log file:
...
+ cnt=61
+ '[' 61 -gt 60 ']'
+ echo 'Waited too long for space in /<redacted>/backup'
+ tee -a /tmp/maintenanceBackupMessage.70927
Waited too long for space in /<redacted>/backup
+ RCLOC=1
+ return 1
+ return 0
+ RC=0
+ RCSUM=0
...
As seen in the log output, after the inner loop has run 60 times, it returns 1 as expected. BUT it also returns 0 right after!? Why is it returning twice?
We are unable to figure out the double return... Any help appreciated.
The first return executes in the subshell started by the pipe cat ${oldDbContainer} | while .... The second return is from return $RCLOC at the end of the function. Get rid of the useless use of cat:
removeOldBackupFile()
{
    #set -x
    echo "Removing old backups if they exists." >> "${report}"
    local RCLOC=0
    spaceBefore=$(getAvailableSpace ${backupDirectory})
    timesToWait=60    # Wait a maximum of 10 minutes before bailing
    while read fileName
    do
        ...
    done < ${oldDbContainer}
    return $RCLOC
}
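A minimal demo of the subshell behaviour (my sketch, not from the original script): the first return only exits the pipeline's subshell, so the function keeps running and the caller sees the second return's status.
f() {
    echo x | while read -r line; do
        return 1    # exits only the subshell created by the pipe
    done
    echo "still running after the pipeline"
    return 0        # this is the status the caller actually sees
}
f
echo "f returned $?"    # prints: f returned 0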

How to parallel process a function, with loops

So I have this function, and I want it to run everything it contains in parallel. So far it isn't working, and according to other sources this is how you do it. The function itself works if it's not run in parallel.
#!/bin/bash
foo () {
    cd ${HOME}/sh/path/to/script/execute
    for f in *.sh; do           # goes to the "execute" directory and executes all scripts
                                # in it; basically run-parts without cron
        cd ~/sh/path/to/script
        while IFS= read -r l1   # Line 1 in master.txt
              IFS= read -r l2   # Line 2 in master.txt
              IFS= read -r l3   # Line 3 in master.txt
        do
            cd /dev/shm/arb
            echo ${l1} > arg.txt & echo ${l2} > arg2.txt & echo ${l3} > arg3.txt
            cd ${HOME}/sh/path/to/script/execute
            bash -H ${f}        # executes all scripts inside the "execute" folder
            cd ~/sh/path/to/script/here
            ./here.sh &
            cd ~/sh/path/to/script &
        done <master.txt
    done
}
export -f foo
parallel ::: foo
Results in:
# No result at all... it just buffers. htop doesn't acknowledge any
# processes, and when this runs it's pretty taxing on the cores.
master.txt content
In case this is relevant:
apple_fruit
apple_veggie
veggie_fruit
#apple changes
pear_fruit
pear_veggie
veggie_fruit
#pear changes
cucumber_fruit
...
I'm very new to using parallel and don't know how it works in advanced (and basic) situations, so would the loops interfere? And if they do interfere, is there a workaround?
The result is probably going to be something like:
inner() {
    script="$1"
    parallel -N3 "'$script' {}; here.sh {}" :::: master.txt
}
export -f inner
parallel inner ::: ${HOME}/sh/path/to/script/execute/*.sh
This will call each of the scripts in ${HOME}/sh/path/to/script/execute/ (and here.sh) with 3 arguments from master.txt like this:
${HOME}/sh/path/to/script/execute/script1.sh apple_fruit apple_veggie veggie_fruit
You need to change the scripts so that:
they get their arguments from the command line (not from arg.txt, arg2.txt, arg3.txt);
they send their output to stdout.
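To preview the grouping before wiring in the real scripts, a small sketch (my addition; echo stands in for the script) might help:
# prints one line per 3-line group of master.txt, e.g.:
#   args: apple_fruit apple_veggie veggie_fruit
# note: comment lines in master.txt are consumed as arguments too
parallel -N3 echo "args: {1} {2} {3}" :::: master.txt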

Bash - Log all commands and exit codes in a script

I have a long (~2,000 lines) script that I'm trying to log for future debugging. Right now I have:
function log_with_time()
{
    while read a; do
        echo `date +'%H:%M:%S.%4N '` " $a" >> $LOGFILE
    done
}
exec 7> >(log_with_time)
BASH_XTRACEFD=7
PS4=' exit($?)ln:$LINENO: '
set -x
echo "helloWorld 1"
which gives me very nice logging for any and all commands that are run:
15:18:03.6359 exit(0)ln:28: echo 'helloWorld 1'
The issue that I'm running into is that xtrace seems to be asynchronous. With longer scripts, the log times fall behind the actual time the commands are called, and the exit code doesn't match the logged command.
There has to be a better way to do this but I'd be happy if I could just synchronize xtrace.
...
tldr: How can I generally log the time, command and exit code for all commands in a script?
...
(First time posting, feedback appreciated)
UPDATE:
exec {BASH_XTRACEFD}>>$LOGFILE
PS4=' time:$(date +%H:%M:%S.%4N) ln:$LINENO: '
set -x
fail()
{
    echo "fail" >> $LOGFILE
    return 1
}
trap 'echo exit:$? >> $LOGFILE' DEBUG
fail
solves all of my synchronization issues. Exit codes and timestamps are working beautifully. My only issue now is one of formatting: the trap itself gets reported by xtrace.
time:18:30:07.6080 ln:27: fail
time:18:30:07.6089 ln:12: echo fail
fail
time:18:30:07.6126 ln:13: return 1
time:18:30:07.6134 ln:28: echo exit:1
exit:1
I've tried setting +x in the trap but then set +x gets logged. If I could find a way to omit one line from xtrace, this log would be perfect.
The async behavior is coming from the process substitution -- anything in >(...) is running in its own subshell on the other end of a FIFO. Since it's a separate process, it's inherently unsynchronized.
You don't need log_with_time here at all, though, and so you don't need BASH_XTRACEFD redirecting to a process substitution in the first place. Consider:
# aside: $(date ...) has a *huge* amount of performance overhead here. Personally, I'd
# advise against using it, unless you really need all that precision; $SECONDS will
# be orders-of-magnitude cheaper.
PS4=' prior-exit:$? time:$(date +%H:%M:%S.%4N) ln:$LINENO: '
...thereafter:
$ true
prior-exit:0 time:16:01:17.2509 ln:28: true
$ false
prior-exit:0 time:16:01:18.4242 ln:29: false
$ false
prior-exit:1 time:16:01:19.2963 ln:30: false
$ true
prior-exit:1 time:16:01:20.2159 ln:31: true
$ true
prior-exit:0 time:16:01:20.8650 ln:32: true
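If the $(date ...) overhead from the aside above matters, a cheaper variant (my sketch, trading wall-clock timestamps for elapsed whole seconds) could be:
# $SECONDS is maintained by bash itself, so no fork per traced line
PS4=' prior-exit:$? t+${SECONDS}s ln:$LINENO: '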
Per conversation with Charles Duffy in the comments, to whom all credit is given:
Process substitution >(...) is asynchronous, allowing the log writing to fall behind and out of sync with the xtrace.
Instead use:
exec {BASH_XTRACEFD}>>$LOGFILE
PS4=' time:$(date +%H:%M:%S.%4N) ln:$LINENO: '
for synchronously logging the time and line.
Furthermore, xtrace is triggered before running the command, making it a bad candidate for capturing exit codes. Instead use:
trap 'echo exit:$? >> $LOGFILE' DEBUG
to log the exit code of each command. The DEBUG trap fires before each simple command, at which point $? still holds the exit code of the command that just finished. Note that this won't report on every step inside a function call the way xtrace will.
No solution yet for omitting the trap itself from the xtrace output, but it's good enough:
LOGFILE="SomeFile.log"
exec {BASH_XTRACEFD}>>$LOGFILE
PS4=' time:$(date +%H:%M:%S.%4N) ln:$LINENO: '
set -x
fail() # test function that returns 1
{
echo "fail" >> $LOGFILE
return 1
}
success() # test function that returns 0
{
echo "success" >> $LOGFILE
return 0
}
trap 'echo $? >> $LOGFILE' DEBUG
fail
success
echo "complete"
yields:
time:14:10:22.2686 ln:21: trap 'echo $? >> $LOGFILE' DEBUG
time:14:10:22.2693 ln:23: echo 0
0
time:14:10:22.2736 ln:23: fail
time:14:10:22.2741 ln:12: echo fail
fail
time:14:10:22.2775 ln:13: return 1
time:14:10:22.2782 ln:24: echo 1
1
time:14:10:22.2830 ln:24: success
time:14:10:22.2836 ln:17: echo success
success
time:14:10:22.2873 ln:18: return 0
time:14:10:22.2881 ln:26: echo 0
0
time:14:10:22.2912 ln:26: echo complete
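One pragmatic workaround (my suggestion, not from the thread) is to strip the trap's own trace lines from the finished log. The pattern below assumes the PS4 format shown above; note it would also drop any genuine bare "echo <number>" traces:
grep -Ev 'ln:[0-9]+: echo [0-9]+$' SomeFile.log > SomeFile.filtered.log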

What & How does this line of code work: {message:0:1} {message:23:17}

Would love for someone to help educate me on this line of code.
It uses the diff command to compare two files, which somehow allows other expressions such as message:0:1 and message:23:17 to access its results.
How does this work?
message=$(diff previousscan.txt scan.txt | grep 192)

# get first char which indicates if the host came up or went away
iostring="${message:0:1}"

# get first ip-number from the list
computer="${message:23:17}"

# show ip-number in notify if host came up
if [ "$iostring" = \> ]; then
    notify-send "$computer online"
fi

# show ip-number in notify if host went away
if [ "$iostring" = \< ]; then
    notify-send "$computer offline"
fi
$message is not a command; it is a variable which holds the output of the diff command. The later lines reference substrings; ${message:0:1} is the first character (1 character starting at offset 0) of whatever is stored in $message.
A simple example to show the substring mechanism:
$ message="abcdefghijklmnop"
$ echo ${message:0:1}
a
$ echo ${message:7:3}
hij
The construction foo=$(bar) runs the command bar in a subshell, and places the output you would normally see in your terminal if you simply ran the command bar in the variable $foo.
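A similarly small example of that command-substitution step (my sketch; the timestamp shown is illustrative):
$ now=$(date +%H:%M:%S)   # run date in a subshell, capture its stdout
$ echo "captured: $now"
captured: 14:05:09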

Is it possible to run two loops at the same time?

So I have a project in my cyber security class to make a bash game. I'd like to make one of those medieval games where you build farms and mines to get resources, or something like that. To do that I have to have two while loops running, like this:
while [ blah ]; do
blah
done
while [ blah ]; do
blah
done
Is it possible to run two while loops at the same time, and if I am writing it wrong, how should I write it?
If you put a & after each done, like done &, you will create new processes in the background that run the while loops. You will have to be careful to realize what this means, though, since the bash script will continue executing commands after creating those new processes, even if they are not finished. You might use the wait command to prevent this from happening, but I'm not too used to using it so I cannot vouch for it. A minimal sketch of the pattern follows.
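(My sketch; farm and mine are hypothetical functions standing in for the loop bodies.)
farm() { echo "farmed 1 wheat"; }
mine() { echo "mined 1 ore"; }

while true; do farm; sleep 1; done &   # first loop, in the background
while true; do mine; sleep 2; done &   # second loop, in the background

wait    # block until both background loops exit (these never do; stop with Ctrl-C)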
Yes, but you will have to fork a new process for each while loop to execute in. Technically, they won't both run at the same time (unless you consider multiple cores, but even that isn't guaranteed).
Below is a link to how to fork multiple processes using bash.
Forking / Multi-Threaded Processes | Bash
Since you mention this is a school project, I'll stop here lest I help you "not learn".
First things first: wrap the loop in a function and then fork it.
This is done when you want to split a process. For example, if I'm processing a CSV with 160,000+ lines, a single process/"thread" will take hours. If you wrap the loop in a function and simply fork it, you will have x amount of processes running; then add a wait/kill-defunct-processes loop and you are done. Here is what you are looking at.
while loop with nested loop:
function jobA() {
    while read STR;
    do
        touch $1_temp
        key=$(IFS="|"; set -- $STR; echo $1)
        for each in ${blah[@]};
        do
            : # echo "$each"    # a loop body needs at least one command; : is a no-op
        done
    done <$1;
}

for i in ${blah[@]};
do
    echo "$i"
    jobA "$i" &                 # fork the function directly; calling it via $(...) would execute its output
    child_pid=$!
    parent_pid=$$
    PIDS+=($child_pid)
    echo "forked process $child_pid with parent $parent_pid"
done

for pid in ${PIDS[@]};
do
    wait $pid
done
echo "all jobs done"
sleep 1
Now this is wrapped. Above is an example of a FORKED loop: you will have parallel processes running in the background, and wait will wait for ALL of them to complete before proceeding. This is important for some types of scripts.
Also, DO NOT use nested FOR loops written C-style, like this:
for (( i = 1; i <= 5; i++ )) ### Outer for loop ###
This is VERY slow. Use THIS type instead:
for each in ${blah[@]};
do
    #echo "$each"
    if [ "$key" = "$each" ]; then
        # echo "less than $keyValNeed..."
        echo $STR >> $1_temp
    fi
done
You could also use nested for loops:
for (( i = 1; i <= 5; i++ ))        ### Outer for loop ###
do
    for (( j = 1; j <= 5; j++ ))    ### Inner for loop ###
    do
        echo -n "$i "
    done
    echo ""                         ### print the new line ###
done
EDIT: I thought you meant nested loops, but reading again I see you said running both loops 'at the same time'. I will leave my answer here though.
