I'm trying to use the following command to send out 4 Spark jobs in parallel and wait for all of them to finish before starting the next step. However, I notice the $cmd_trainSparkModelx commands are empty inside xargs. How do I pass them into xargs?
eval $cmd_prepare_step
xargs -P 4 -I {} sh -c 'eval "$1"' - {} <<'EOF'
#eval "$cmd_trainSparkModel1"
#eval "$cmd_trainSparkModel2"
#eval "$cmd_trainSparkModel3"
#eval "$cmd_trainSparkModel4"
echo "$cmd_trainSparkModel1"
echo "$cmd_trainSparkModel2"
echo "$cmd_trainSparkModel3"
echo "$cmd_trainSparkModel4"
EOF
echo "finished training"
eval $cmd_postTraining_step
The following command works. But I would still like to see how to make xargs work, as it can also control how many jobs run at the same time.
echo "$cmd_trainSparkModel1" &
p1=$!
echo "$cmd_trainSparkModel2" &
p2=$!
echo "$cmd_trainSparkModel3" &
p3=$!
echo "$cmd_trainSparkModel4" &
p4=$!
wait $p1 $p2 $p3 $p4
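The variables are empty inside xargs because the here-document is quoted (<<'EOF' suppresses expansion) and the child sh never sees the unexported variables. One way to make xargs work is to expand the variables in the parent shell and feed each command string to the child as an argument. A minimal sketch, with placeholder echo commands standing in for the real spark-submit invocations:

```shell
# Placeholder commands stand in for the real Spark invocations.
cmd_trainSparkModel1='echo model1'
cmd_trainSparkModel2='echo model2'
cmd_trainSparkModel3='echo model3'
cmd_trainSparkModel4='echo model4'

# Expand the variables here in the parent shell, pass each command string
# NUL-delimited as $1 to a child sh, and let xargs cap parallelism at 4.
printf '%s\0' "$cmd_trainSparkModel1" "$cmd_trainSparkModel2" \
              "$cmd_trainSparkModel3" "$cmd_trainSparkModel4" |
  xargs -0 -P 4 -n 1 sh -c 'eval "$1"' -
echo "finished training"
```

xargs blocks until all four commands finish, so anything after the pipeline only runs once training is done.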
I have a script like this
for i in `seq 100`
do
echo $i
some-command $i # will run for 1 minutes
done
I would like to run 10 some-command tasks at the same time. How can I do this?
for i in `seq 1 10 100` # step 10
do
echo $i
some-command $i &
some-command $((i+1)) &
some-command $((i+2)) &
some-command $((i+3)) &
some-command $((i+4)) &
some-command $((i+5)) &
some-command $((i+6)) &
some-command $((i+7)) &
some-command $((i+8)) &
some-command $((i+9)) &
wait
done
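The ten hand-unrolled lines can also be written as an inner loop; a minimal sketch, where some-command is the question's placeholder:

```shell
for i in $(seq 1 10 100); do   # outer loop steps by 10
  echo "$i"
  for j in $(seq 0 9); do
    some-command $((i + j)) &  # launch 10 jobs in the background
  done
  wait  # block until all 10 finish before starting the next batch
done
```

Note this still runs in lock-step batches of 10: a fast job's slot stays idle until the slowest job in the batch finishes, which is what the parallel/xargs answers below avoid.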
You could use GNU parallel, which is designed especially for this:
seq 100 | parallel -j10 'some-command {}'
Or GNU make, which is much more than a parallelizing tool but can do it perfectly:
$ cat Makefile
JOBS := $(shell seq 100)
.PHONY: all $(JOBS)
all: $(JOBS)
$(JOBS):
    some-command $@
$ make -j10
Warning: if you copy-paste this into a Makefile, do not forget to replace the 4 leading spaces before some-command $@ with a tab.
IMO, no need for a loop:
seq 100 | xargs -n 1 -P 10 some-command
If you want to run commands in parallel in a controlled manner (i.e. (1) limit the number of parallel commands, (2) track their return statuses and (3) ensure that new commands are started once their predecessors finish, until all commands have run), you can reuse a simple harness, copied from my other answer here.
Just plug in your preferences, replace do_something_and_maybe_fail with the programs you want to run (which you can iterate through by modifying the place where pname is generated (some_program_{a..f}{0..5})), and you're good to go.
The harness is runnable as-is. Its processes randomly sleep and randomly fail and there are 20 execution slots (MAX_PARALLELISM) for 36 “commands” (some_program_{a..f}{0..5}), so, quite obviously, a few commands will need to wait for other ones to finish (so that at most 20 of them run in parallel).
#!/bin/bash

set -euo pipefail

declare -ir MAX_PARALLELISM=20  # pick a limit
declare -i pid
declare -a pids=()

do_something_and_maybe_fail() {
  sleep $((RANDOM % 10))
  return $((RANDOM % 2 * 5))
}

for pname in some_program_{a..f}{0..5}; do  # 36 items
  if ((${#pids[@]} >= MAX_PARALLELISM)); then
    wait -p pid -n \
      && echo "${pids[pid]} succeeded" 1>&2 \
      || echo "${pids[pid]} failed with ${?}" 1>&2
    unset 'pids[pid]'
  fi

  do_something_and_maybe_fail &  # forking here
  pids[$!]="${pname}"
  echo "${#pids[@]} running" 1>&2
done

for pid in "${!pids[@]}"; do
  wait -n "$((pid))" \
    && echo "${pids[pid]} succeeded" 1>&2 \
    || echo "${pids[pid]} failed with ${?}" 1>&2
done
You can run jobs in the background in bash:
for i in `seq 100`; do
  echo $i
  sleep 5 &
  while [ $(jobs | wc -l) -ge 10 ]; do
    sleep 1
  done
done
The & after a command runs it in the background as a separate process. You can put the ampersand after single commands or after constructs like loops. If you have multiple commands, you can surround them with ( ) and put the ampersand after that.
In my example, sleep 5 stands in for your long-running command.
The added while loop prevents the script from spawning all the jobs at once; this example caps it at 10.
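On bash 4.3 and later, the polling sleep 1 can be replaced with wait -n, which blocks until any one background job exits; a sketch of the same 10-job cap:

```shell
max=10
for i in $(seq 100); do
  sleep 5 &  # stands in for the long-running command
  # once $max jobs are running, block until one of them exits
  while [ "$(jobs -rp | wc -l)" -ge "$max" ]; do
    wait -n
  done
done
wait  # wait for the remaining jobs
```

This refills a slot as soon as one frees up, instead of re-checking once per second.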
Here is my code:
count=0
head -n 10 urls.txt | while read LINE; do
curl -o /dev/null -s "$LINE" -w "%{time_total}\n" &
count=$((count+1))
[ 0 -eq $((count % 3)) ] && wait && echo "process wait" # wait for 3 urls
done
echo "before wait"
wait
echo "after wait"
I am expecting the last curl to finish before printing the last echo, but actually it's not the case:
0.595499
0.602349
0.618237
process wait
0.084970
0.084243
0.099969
process wait
0.067999
0.068253
0.081602
process wait
before wait
after wait
➜ Downloads 0.088755 # already exited the script
Does anyone know why it's happening? And how to fix this?
As described in BashFAQ #24, this is caused by your pipeline causing the while loop to be performed in a different shell from the rest of your script.
Consequently, your curls are subprocesses of that subshell, not the outer interpreter; so the outer interpreter cannot wait for them.
This can be resolved by not piping to while read, but instead redirecting its input in a way that doesn't shuffle it into a pipeline element -- as with <(...), a process substitution:
#!/usr/bin/env bash
# ^^^^ - NOT /bin/sh; also, must not start with "sh scriptname"
count=0
while IFS= read -r line; do
  curl -o /dev/null -s "$line" -w "%{time_total}\n" &
  count=$((count+1))
  (( count % 3 == 0 )) && { wait; echo "process wait"; } # wait for 3 urls
done < <(head -n 10 urls.txt)
echo "before wait"
wait
echo "after wait"
why it's happening?
Because you run the processes in the subshell, the parent process can't wait for them.
$ echo | { echo subshell; sleep 100 & }
$ wait # exits immediately
$
Call wait from the same process in which the background processes were spawned:
someotherthing | {
  while someotherthing; do
    something &
  done
  wait # will wait for something
}
And how to fix this?
I recommend not using a crude while read loop and taking a different approach with a dedicated tool. Use GNU xargs with the -P option to run 3 processes concurrently:
head -n 10 urls.txt | xargs -P3 -n1 -d '\n' curl -o /dev/null -w "%{time_total}\n" -s
Alternatively, you could just move wait into the subshell as above, or arrange for the while loop to be executed in the parent shell.
I'm trying to edit my working bash script into an SGE script in order to submit it as a job to the cluster.
Currently I have:
#!/bin/bash
# Perform fastqc on files in a specified directory.
for ((j=1; j <=17; j++))
do
directory=/data4/una/batch"$j"/
files=$""$directory"/*.fastq.gz"
batch=$"batch_"$j""
outfile=$""$batch"_submit_script.sh"
echo "#!/bin/bash">>$outfile;
echo "# Your job name">>$outfile;
echo "# -N $batch">>$outfile;
echo "# The job should be placed into the queue 'all.q'">>$outfile;
echo "#$ -q all.q">>$outfile;
echo "# Running in the current working directory">>$outfile;
echo "#$ -cwd">>$outfile;
echo "">>$outfile;
echo "# Export some necessary environment variables">>$outfile;
echo "#$ -S /bin/bash">>$outfile;
echo "#$ -v PATH">>$outfile;
echo "#$ -v LD_LIBRARY_PATH">>$outfile;
echo "#$ -v PYTHONPATH">>$outfile;
echo "# Finally, put your command here">>$outfile;
echo "">>$outfile;
echo "#$ for i in $files;">>$outfile;
echo "#$ do;">>$outfile;
echo "#$ fastqc -f fastq -o /data4/una/test/fastq/$i;">>$outfile;
echo "#$done">>$outfile;
echo "">>$outfile;
qsub $outfile;
done
But I'm getting an error:
Unable to read script file because of error: ERROR! invalid option argument "-f"
But
fastqc -f fastq -o /data4/una/test/fastq/$i
is a totally valid line in my bash script.
Thoughts?
Thanks!
It actually was poor formatting in my loop that was causing this error. I didn't need to start those lines with #$ at all, so those lines become:
echo "for i in $files;">>$outfile;
echo "do">>$outfile;
echo " fastqc -f fastq -o /data4/una/test/fastqc $i">>$outfile;
echo "done">>$outfile;
echo "">>$outfile;
qsub $outfile;
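As an aside, the long run of echo ... >>$outfile lines can be generated more readably with a single here-document. A sketch using the question's paths (the unquoted EOF delimiter lets $j and $directory expand now, while the escaped \$i survives into the generated script; the qsub call is commented out here):

```shell
for ((j=1; j<=17; j++)); do
  directory=/data4/una/batch$j
  outfile=batch_${j}_submit_script.sh
  # Unquoted EOF delimiter: $j and $directory expand now; \$i stays
  # literal so the generated script expands it at run time.
  cat > "$outfile" <<EOF
#!/bin/bash
#$ -N batch_$j
#$ -q all.q
#$ -cwd
#$ -S /bin/bash
#$ -v PATH
#$ -v LD_LIBRARY_PATH
#$ -v PYTHONPATH

for i in $directory/*.fastq.gz; do
    fastqc -f fastq -o /data4/una/test/fastqc \$i
done
EOF
  # qsub "$outfile"  # submit as before; commented out in this sketch
done
```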
I'm looking for the best way to duplicate the Linux 'watch' command on Mac OS X. I'd like to run a command every few seconds to pattern match on the contents of an output file using 'tail' and 'sed'.
What's my best option on a Mac, and can it be done without downloading software?
With Homebrew installed:
brew install watch
You can emulate the basic functionality with the shell loop:
while :; do clear; your_command; sleep 2; done
That will loop forever, clear the screen, run your command, and wait two seconds - the basic watch your_command implementation.
You can take this a step further and create a watch.sh script that can accept your_command and sleep_duration as parameters:
#!/bin/bash
# usage: watch.sh <your_command> <sleep_duration>
while :; do
  clear
  date
  $1
  sleep $2
done
Use MacPorts:
$ sudo port install watch
The shell loops above will do the trick, and you can even convert them into an alias (you may need to wrap it in a function to handle parameters):
alias myWatch='_() { while :; do clear; $2; sleep $1; done }; _'
Examples:
myWatch 1 ls ## Self-explanatory
myWatch 5 "ls -lF $HOME" ## Every 5 seconds, list out home directory; double-quotes around command to keep its arguments together
Alternately, Homebrew can install the watch from http://procps.sourceforge.net/:
brew install watch
It may be that "watch" is not what you want. You probably want to ask for help in solving your problem, not in implementing your solution! :)
If your real goal is to trigger actions based on what's seen from the tail command, then you can do that as part of the tail itself. Instead of running "periodically", which is what watch does, you can run your code on demand.
#!/bin/sh
tail -F /var/log/somelogfile | while read line; do
  if echo "$line" | grep -q '[Ss]ome.regex'; then
    : # do your stuff here (the no-op : keeps the empty branch valid)
  fi
done
Note that tail -F will continue to follow a log file even if it gets rotated by newsyslog or logrotate. You want to use this instead of the lower-case tail -f. Check man tail for details.
That said, if you really do want to run a command periodically, the other answers provided can be turned into a short shell script:
#!/bin/sh
if [ -z "$2" ]; then
  echo "Usage: $0 SECONDS COMMAND" >&2
  exit 1
fi
INTERVAL=$1  # avoid the name SECONDS, which bash treats specially
shift 1
while sleep $INTERVAL; do
  clear
  "$@"
done
I am going with the answer from here:
bash -c 'while [ 0 ]; do <your command>; sleep 5; done'
But you're really better off installing watch as this isn't very clean...
If watch doesn't want to install via
brew install watch
There is another similar/copy version that installed and worked perfectly for me
brew install visionmedia-watch
https://github.com/tj/watch
Or, in your ~/.bashrc file:
function watch {
  while :; do clear; date; echo; "$@"; sleep 2; done
}
To prevent flickering when your main command takes perceivable time to complete, you can capture the output and only clear screen when it's done.
function watch { while :; do a=$("$@"); clear; printf '%s\n\n%s\n' "$(date)" "$a"; sleep 1; done; }
Then use it by:
watch istats
Try this:
#!/bin/bash
# usage: watch [-n integer] COMMAND
case $# in
0)
  echo "Usage: $0 [-n int] COMMAND"
  exit 1
  ;;
*)
  sleep=2
  ;;
esac
if [ "$1" == "-n" ]; then
  sleep=$2
  shift; shift
fi
while :; do
  clear
  echo "$(date) every ${sleep}s: $*"; echo
  "$@"
  sleep $sleep
done
Here's a slightly changed version of this answer that:
checks for valid args
shows a date and duration title at the top
moves the "duration" argument to be the 1st argument, so complex commands can be easily passed as the remaining arguments.
To use it:
Save this to ~/bin/watch
execute chmod 700 ~/bin/watch in a terminal to make it executable.
try it by running watch 1 echo "hi there"
~/bin/watch
#!/bin/bash

function show_help()
{
  echo ""
  echo "usage: watch [sleep duration in seconds] [command]"
  echo ""
  echo "e.g. To cat a file every second, run the following"
  echo ""
  echo " watch 1 cat /tmp/it.txt"
  exit;
}

function show_help_if_required()
{
  if [ "$1" == "help" ]
  then
    show_help
  fi
  if [ -z "$1" ]
  then
    show_help
  fi
}

function require_numeric_value()
{
  REG_EX='^[0-9]+$'
  if ! [[ $1 =~ $REG_EX ]] ; then
    show_help
  fi
}

show_help_if_required "$1"
require_numeric_value "$1"

DURATION=$1
shift

while :; do
  clear
  echo "Updating every $DURATION seconds. Last updated $(date)"
  bash -c "$*"
  sleep "$DURATION"
done
Use the Nix package manager!
Install Nix, then run nix-env -iA nixpkgs.watch; watch should be available after completing the install instructions (including sourcing . "$HOME/.nix-profile/etc/profile.d/nix.sh" in your shell).
The watch command that's available on Linux does not exist on macOS. If you don't want to use brew you can add this bash function to your shell profile.
# execute commands at a specified interval of seconds
function watch.command {
  # USAGE: watch.command [seconds] [commands...]
  # EXAMPLE: watch.command 5 date
  # EXAMPLE: watch.command 5 date echo 'ls -l' echo 'ps | grep "kubectl\\\|node\\\|npm\\\|puma"'
  # EXAMPLE: watch.command 5 'date; echo; ls -l; echo; ps | grep "kubectl\\\|node\\\|npm\\\|puma"' echo date 'echo; ls -1'
  local cmds=()
  for arg in "${@:2}"; do
    # read from a process substitution, not a pipe, so cmds+=() runs
    # in the current shell and the array survives the loop
    while read -r cmd; do
      cmds+=("$cmd")
    done < <(echo "$arg" | sed 's/; /;/g' | tr \; \\n)
  done
  while true; do
    clear
    for cmd in "${cmds[@]}"; do
      eval "$cmd"
    done
    sleep "$1"
  done
}
https://gist.github.com/Gerst20051/99c1cf570a2d0d59f09339a806732fd3
I have a GNU screen named demo, I want to send commands to it. How do I do this?
screen -S demo -X /home/aa/scripts/outputs.sh
yields No screen session found.
and doing screen -ls shows that it isn't running.
If the Screen session isn't running, you won't be able to send things to it. Start it first.
Once you've got a session, you need to distinguish between Screen commands and keyboard input. screen -X expects a Screen command. The stuff command sends input, and if you want to run that program from a shell prompt, you'll have to pass a newline as well.
screen -S demo -X stuff '/home/aa/scripts/outputs.sh
'
Note that this may be the wrong approach. Are you sure you want to type into whatever is active in that session? To direct the input at a particular window, use
screen -S demo -p 1 -X stuff '/home/aa/scripts/outputs.sh
'
where 1 is the window number (you can use its title instead).
To start a new window in that session, use the screen command instead. (That's the screen Screen command, not the screen shell command.)
screen -S demo -p 1 -X screen '/home/aa/scripts/outputs.sh'
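Putting the pieces together, a minimal end-to-end sketch (the session name demo and the echoed text are illustrative; hardcopy is used only to verify the command actually ran in the window):

```shell
screen -dmS demo                               # start a detached session first
screen -S demo -p 0 -X stuff 'echo hello from screen
'                                              # stuff types the line plus Enter
sleep 1                                        # give the window's shell time to run it
screen -S demo -p 0 -X hardcopy /tmp/demo.txt  # dump window 0's contents to a file
grep 'hello from screen' /tmp/demo.txt
screen -S demo -X quit                         # tear the session down
```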
I put this together to capture the output from the commands. It also handles stdin if you want to pipe some input.
function xscreen {
  # Usage: xscreen <screen-name> command...
  local SCREEN_NAME=$1
  shift
  # Create screen if it doesn't exist
  if ! screen -list | grep $SCREEN_NAME >/dev/null ; then
    screen -dmS $SCREEN_NAME
  fi
  # Create I/O pipes
  local DIR=$( mktemp -d )
  local STDIN=$DIR/stdin
  local STDOUT=$DIR/stdout
  local STDERR=$DIR/stderr
  mkfifo $STDIN $STDOUT $STDERR
  trap 'rm -f $STDIN $STDOUT $STDERR; rmdir $DIR' RETURN
  # Print output and kill stdin when both pipes are closed
  { cat $STDERR >&2 & cat $STDOUT & wait ; fuser -s -PIPE -k -w $STDIN ; } &
  # Start the command (clear the line with ^A^K, enter the command with redirects, run it with Enter ^M)
  screen -S $SCREEN_NAME -p0 -X stuff "$(echo -ne '\001\013') { $* ; } <$STDIN 1> >(tee $STDOUT) 2> >(tee $STDERR >&2)$(echo -ne '\015')"
  # Forward stdin
  cat > $STDIN
  # Just in case stdin is closed
  wait
}
Taking it a step further, it can be useful to call this function over ssh:
ssh user@host -n xscreen somename 'echo hello world'
Maybe combine it with something like ssh user@host "$(typeset -f xscreen); xscreen ..." so you don't have to have the function already defined on the remote host.
A longer version in a bash script that handles the return status and syntax errors:
#!/bin/bash

function usage {
  echo "$(basename $0) [[user@]server:[port]] <screen-name> command..." >&2
  exit 1
}

[[ $# -ge 2 ]] || usage

SERVER=
SERVERPORT="-p 22"
SERVERPAT='^(([a-z]+@)?([A-Za-z0-9.]+)):([0-9]+)?$'
if [[ "$1" =~ $SERVERPAT ]]; then
  SERVER="${BASH_REMATCH[1]}"
  [[ -n "${BASH_REMATCH[4]}" ]] && SERVERPORT="-p ${BASH_REMATCH[4]}"
  shift
fi

function xscreen {
  # Usage: xscreen <screen-name> command...
  local SCREEN_NAME=$1
  shift
  if ! screen -list | grep $SCREEN_NAME >/dev/null ; then
    echo "Screen $SCREEN_NAME not found." >&2
    return 124
    # Create screen if it doesn't exist
    #screen -dmS $SCREEN_NAME
  fi
  # Create I/O pipes
  local DIR=$( mktemp -d )
  mkfifo $DIR/stdin $DIR/stdout $DIR/stderr
  echo 123 > $DIR/status
  trap 'rm -f $DIR/{stdin,stdout,stderr,status}; rmdir $DIR' RETURN
  # Forward ^C to screen
  trap "screen -S $SCREEN_NAME -p0 -X stuff $'\003'" INT
  # Print output and kill stdin when both pipes are closed
  {
    cat $DIR/stderr >&2 &
    cat $DIR/stdout &
    wait
    [[ -e $DIR/stdin ]] && fuser -s -PIPE -k -w $DIR/stdin
  } &
  READER_PID=$!
  # Close all the pipes if the command fails to start (e.g. syntax error)
  {
    # Kill the sleep when this subshell is killed. Ugh.. bash.
    trap 'kill $(jobs -p)' EXIT
    # Try to write nothing to stdin. This will block until something reads.
    echo -n > $DIR/stdin &
    TEST_PID=$!
    sleep 2.0
    # If the write failed and we're not killed, it probably didn't start
    if [[ -e $DIR/stdin ]] && kill $TEST_PID 2>/dev/null; then
      echo 'xscreen timeout' >&2
      wait $TEST_PID 2>/dev/null
      # Send ^C to clear any half-written command (e.g. no closing braces)
      screen -S $SCREEN_NAME -p0 -X stuff $'\003'
      # Write nothing to output, triggers SIGPIPE
      echo -n 1> $DIR/stdout 2> $DIR/stderr
      # Stop stdin by creating a fake reader and sending SIGPIPE
      cat $DIR/stdin >/dev/null &
      fuser -s -PIPE -k -w $DIR/stdin
    fi
  } &
  CHECKER_PID=$!
  # Start the command (clear the line with ^A^K, enter the command with redirects, run it with Enter ^M)
  screen -S $SCREEN_NAME -p0 -X stuff "$(echo -ne '\001\013') { $* ; echo \$? > $DIR/status ; } <$DIR/stdin 1> >(tee $DIR/stdout) 2> >(tee $DIR/stderr >&2)$(echo -ne '\015')"
  # Forward stdin
  cat > $DIR/stdin
  kill $CHECKER_PID 2>/dev/null && wait $CHECKER_PID 2>/dev/null
  # Just in case stdin is closed early, wait for output to finish
  wait $READER_PID 2>/dev/null
  trap - INT
  return $(cat $DIR/status)
}

if [[ -n $SERVER ]]; then
  ssh $SERVER $SERVERPORT "$(typeset -f xscreen); xscreen $@"
  RET=$?
  if [[ $RET == 124 ]]; then
    echo "To start screen: ssh $SERVER $SERVERPORT \"screen -dmS $1\"" >&2
  fi
  exit $RET
else
  xscreen "$1" "${@:2}"
fi