Assuming we have the following GNU parallel command:
parallel --resume-failed --joblog test.log --jobs 2 -m -k sh ::: 1.sh 2.sh 3.sh
the first job slot (CPU) runs: 1.sh 2.sh
the second job slot runs: 3.sh
Right now, if 1.sh fails, a rerun retries 1.sh first. If 1.sh fails again, 2.sh is never executed either. Is there any way to run 2.sh first when 1.sh has failed, and then retry 1.sh at the end? Thanks.
I have the feeling you really do not want -m:
$ parallel --dryrun --resume-failed --joblog test.log --jobs 2 -m -k sh ::: 1.sh 2.sh 3.sh
sh 1.sh 2.sh
sh 3.sh
Is that really what you want to run? I.e. having 2.sh passed as an argument to 1.sh.
I reckon you want to run this instead:
$ parallel --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh
If any of these fails, you can resume the failed jobs with:
$ parallel --resume-failed --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh
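To double-check before running anything, the same command with --dryrun should show each script as its own job (this is my own verification step, not from the original answer):
$ parallel --dryrun --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh
sh 1.sh
sh 2.sh
sh 3.sh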
Related
I run a few scripts one by one:
cat 001.sh
sh /home/mysqldom/da-cron/f_mysqldom_nrd/5_change_nrd_tld.sh
sh /home/mysqldom/da-cron/f_mysqldom_nrd/5_proxy_removed.sh
sh /home/mysqldom/da-cron/f_mysqldom_nrd/6_sync_nrd.sh
The last script won't work... if I run it manually it works very well...
The script is:
cat 6_sync_nrd.sh
source /home/mysqldom/da-cron/var.sh
cd /home/mysqldom/da-cron/f_mysqldom_nrd/
mysql -u mysqldom_fnrd -p$mysqldom_fnrd_password -D mysqldom_fnrd -e "UPDATE \`$yesterday\` SET sync='$yesterday';"
mysql -u mysqldom_fnrd -p$mysqldom_fnrd_password -D mysqldom_fnrd -e "DELETE FROM \`$yesterday\` WHERE domain_name = 'domain_name';"
sed s/change_database/$yesterday/g update.conf > $yesterday.conf
/usr/share/logstash/bin/logstash -f $yesterday.conf --path.data /var/lib/logstash108
rm -rf nohup.out
Script 6 has to run after script 5.
Any idea what's wrong with it?
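One generic way to narrow this down (a standard debugging step, not an answer from the thread) is to rerun the wrapper with xtrace and watch which expanded command misbehaves, in particular what value $yesterday has when the script is run this way:
# -x prints each command with variables expanded as it runs
sh -x 001.sh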
The tini init process, used in Docker, mentions in its documentation that process-group killing is not activated by default, and gives the following example:
docker run krallin/ubuntu-tini sh -c 'sleep 10'
If I run this and press Ctrl-C immediately after, I indeed have to wait 10 seconds until the child process exits.
However, if instead of sh I use bash:
docker run krallin/ubuntu-tini bash -c 'sleep 10'
and press Ctrl-C, the process exits immediately.
Why do sh (which is symlinked to dash) and bash behave differently towards this child process?
And how does Bash kill the child process, I thought Bash does not propagate signals by default?
Answered thanks to chepner and Charles Duffy:
bash -c has an implicit optimization: it uses exec to replace itself with the command when possible. sh (dash) does not have this optimization. See also this observation.
To verify:
Process tree using bash:
❯ docker run --name test --rm --detach krallin/ubuntu-tini bash -c 'sleep 60'
03194d48a4dcc8225251fe1e5de2dcbb901c8a9cfd0853ae910bfe4d3735608d
❯ docker exec test ps axfo pid,ppid,args
PID PPID COMMAND
1 0 /usr/bin/tini -- bash -c sleep 60
7 1 sleep 60
Process tree using sh:
❯ docker run --name test --rm --detach krallin/ubuntu-tini sh -c 'sleep 60'
e56f207509df4b0b57f8e6b2b2760835f6784a147b200d798dffad112bb11d6a
❯ docker exec test ps axfo pid,ppid,args
PID PPID COMMAND
1 0 /usr/bin/tini -- sh -c sleep 60
7 1 sh -c sleep 60
8 7 \_ sleep 60
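As a further check (my own experiment, not part of the original answer): bash only exec-replaces itself when the -c string is a single simple command, so appending a second command should keep bash in the tree, matching the sh case:
❯ docker run --name test2 --rm --detach krallin/ubuntu-tini bash -c 'sleep 60; true'
❯ docker exec test2 ps axfo pid,ppid,args
Here ps should again show sleep 60 as a child of bash rather than a direct child of tini.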
$ ulimit -n
1024
$ /bin/sh -c ulimit -n
unlimited
Even if I specify the shell I am using:
$ echo $SHELL
/bin/bash
$ /bin/bash -c ulimit -n
unlimited
Why is ulimit not giving me the same value?
This happens because you're running ulimit without arguments. The -n is not part of the command being executed; it becomes $0 instead. The thing that's unlimited is therefore the default value ulimit reports: the maximum file size (-f).
Compare the output of:
bash -c 'echo hello' # says hello
bash -c echo hello # blank line
and then run:
bash -c 'ulimit -n'
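With the option quoted so it reaches ulimit as an argument, both shells should report the same value as the interactive shell above:
$ /bin/sh -c 'ulimit -n'
1024
$ /bin/bash -c 'ulimit -n'
1024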
After an upgrade to Debian 8.6 Jessie the GNU parallel script suddenly stopped parallelizing to more than 2 jobs with the --pipe and -L options.
Before the upgrade the command:
cat file_with_1064_lines.txt | parallel -L10 -j5 -k -v --pipe "wc -l"
spawned 5 processes, which output this:
wc -l
10
wc -l
10
...
The same command after the upgrade:
wc -l
1060
wc -l
4
(The two values above change with the -L option value -- the first is L*floor(1064/L) and the second is 1064 mod L -- but there are always only two processes producing output.)
The same is observed independently of the parallel version (tested the latest and one from 2013).
PS.
$ uname -a
Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
$ parallel --version
GNU parallel 20161222
-L is the record size, not the number of lines per job; the bug that made it behave like the latter was fixed around 20130122. What you want is either to read 1 record of 10 lines:
parallel -L10 -N1 -j5 -k -v --pipe wc -l
or 10 records of 1 line:
parallel -L1 -N10 -j5 -k -v --pipe wc -l
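With either invocation, the per-job output should again look like the pre-upgrade run:
$ cat file_with_1064_lines.txt | parallel -L10 -N1 -j5 -k -v --pipe wc -l
wc -l
10
wc -l
10
...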
I have seen GNU parallel with rsync, unfortunately, I cannot see a clear answer for my use case.
As part of my script I have this:
echo "file01.zip
file02.zip
file03.zip
" | ./gnu-parallel --line-buffer --will-cite \
-j 2 -t --verbose --progress --interactive \
rsync -aPz {} user@example.com:/home/user/
So, I run the script, and as part of its output, once it gets to the gnu-parallel step, I get this (because of --interactive, I get prompted to confirm each file):
rsync -aPz file01.zip user@example.com:/home/user/ ?...y
rsync -aPz file02.zip user@example.com:/home/user/ ?...y
Computers / CPU cores / Max jobs to run
1:local / 4 / 2
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:2/0/100%/0.0s
... and then, the process just hangs here and does nothing; no numbers change or anything.
At this point, I can do from another terminal this:
$ ps axf | grep rsync
12754 pts/1 S+ 0:00 | | \_ perl ./gnu-parallel --line-buffer --will-cite -j 2 -t --verbose --progress --interactive rsync -aPz {} user@example.com:/home/user/
12763 pts/1 T 0:00 | | \_ rsync -aPz file01.zip user@example.com:/home/user/
12764 pts/1 R 0:11 | | | \_ ssh -l user example.com rsync --server -logDtprze.iLs --log-format=X --partial . /home/user/
12766 pts/1 T 0:00 | | \_ rsync -aPz file02.zip user@example.com:/home/user/
12769 pts/1 R 0:10 | | \_ ssh -l user example.com rsync --server -logDtprze.iLs --log-format=X --partial . /home/user/
... and so I can indeed confirm that the processes have been started, but they are apparently not doing anything. To confirm that they are not uploading (which is what they should be doing in this case), I ran the monitor sudo iptraf, and it reported 0 Kb/s for all traffic on wlan0, which is the only interface I have here.
The thing is, the server I'm logging in to accepts only SSH authentication with passwords. At first I thought --interactive would let me enter the passwords interactively, but instead it "prompt[s] the user about whether to run each command line and read[s] a line from the terminal. Only run the command line if the response starts with 'y' or 'Y'" (quoting the man page). So OK, above I answered y, but it doesn't prompt me for a password afterwards, and the processes seem to be hanging there waiting for one. My version:
$ ./gnu-parallel --version | head -1
GNU parallel 20160422
So, how can I use GNU parallel, to run multiple rsync tasks with passwords?
Use sshpass:
doit() {
    # Single-quote the remote-shell string so the shell does not expand $$ (its own PID)
    rsync -aPz -e 'sshpass -p MyP4$$w0rd ssh' "$1" user@example.com:/home/user
}
export -f doit
parallel --line-buffer -j 2 --progress doit ::: *.zip
The fundamental problem with running interactive programs in parallel is: which program should get the input if two programs are ready for input? Therefore GNU Parallel's --tty implies -j1.
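If exposing the password in the process list worries you (sshpass -p makes it visible to other local users via ps), the same sketch works with the password read from a file instead; the path here is just an example:
doit() {
    # Password stored in a file with mode 600 instead of on the command line
    rsync -aPz -e 'sshpass -f /home/user/.rsync_pass ssh' "$1" user@example.com:/home/user
}
export -f doit
parallel --line-buffer -j 2 --progress doit ::: *.zip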