zsh history with timestamps slow when piped - macOS

I have set zsh to store 100K lines of history and to save a timestamp for every command, and I noticed odd behaviour when I try to print the history with timestamps.
Running history -i 1 takes around 2 seconds.
Running history -i 1 > out or history -i 1 > /dev/null takes around 0.4 seconds.
But running history -i 1 | cat takes around 30 seconds, and the same goes for piping through grep.
To make sure the lines that format and output the time are responsible, I commented them out: the piped command then runs in a few seconds. Replacing localtime with gmtime makes the piped command take only ~15 seconds.
I also tried creating a binary that calls localtime 100000 times, but it doesn't take 15 seconds to run.
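For reference, here is a rough way to time the conversion from inside zsh itself; it uses zsh's strftime builtin as a stand-in for the separate localtime binary mentioned above, so treat it as an illustrative sketch rather than the exact test from the question:
# load zsh's datetime module for the strftime builtin and $EPOCHSECONDS
zmodload zsh/datetime
# format the current time 100000 times, storing the result in a scalar each time
time ( for i in {1..100000}; do strftime -s ts '%Y-%m-%d %H:%M' $EPOCHSECONDS; done )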
What can be causing such slowness when piping?
Example output of running history -iD 1 (-D to show duration too):
…
100080 2022-03-27 19:33 0:01 history -i 1
100081 2022-03-27 19:33 0:00 history -i 1 > out
100082 2022-03-27 19:33 0:00 rm out
100083 2022-03-27 19:33 0:00 history -i 1 > /dev/null
100084 2022-03-27 19:33 0:30 history -i 1 | cat
100085 2022-03-27 19:34 0:30 history -i 1 | grep history

Related

Running millions of list entries in PBS with GNU Parallel

I have a huge job list (a few million entries) and want to run a Java-based tool to perform a feature comparison for each entry. The tool completes one calculation in
real 0m0.179s
user 0m0.005s
sys 0m0.000s
Running on 5 nodes (each with 72 CPUs) under the PBS/Torque scheduler with GNU Parallel, the tool runs fine and produces results, but although I set 72 jobs per node (so 72 x 5 jobs should run at a time), I can see only 25-35 jobs running!
Checking CPU utilization on each node also shows low utilization.
I want to run 72 x 5 jobs or more at a time and produce the results using all the available resources (72 x 5 CPUs).
As mentioned, I have ~200 million jobs to run and want to complete them faster (in 1-2 hours) by using/increasing the number of nodes/CPUs.
Current code, input and job state:
example.lst (it has ~300 million lines)
ZNF512-xxxx_2_N-THRA-xxtx_2_N
ZNF512-xxxx_2_N-THRA-xxtx_3_N
ZNF512-xxxx_2_N-THRA-xxtx_4_N
.......
cat job_script.sh
#!/bin/bash
#PBS -l nodes=5:ppn=72
#PBS -N job01
#PBS -j oe
# work directory
export WDIR=/shared/data/work_dir
cd $WDIR;
# use all 72 available CPUs on each node
export JOBS_PER_NODE=72
# GNU Parallel command
parallelrun="parallel -j $JOBS_PER_NODE --slf $PBS_NODEFILE --wd $WDIR --joblog process.log --resume"
$parallelrun -a example.lst sh run_script.sh {}
cat run_script.sh
#!/bin/bash
# the list entry passed in by GNU Parallel
i=$1
data=/shared/TF_data
# create a tmp dir and work in it
TMP_DIR=/shared/data/work_dir/$i
mkdir -p $TMP_DIR
cd $TMP_DIR/
# get file name
mk=$(echo "$i" | cut -d- -f1-2)
nk=$(echo "$i" | cut -d- -f3-6)
# run the tool to compare the features of the pair of files
/shared/software/tool_v2.1/tool -s1 $data/inf_tf/$mk -s1cf $data/features/$mk-cf -s1ss $data/features/$mk-ss -s2 $data/inf_tf/$nk.pdb -s2cf $data/features/$nk-cf.pdb -s2ss $data/features/$nk-ss.pdb > $data/$i.out
# move output files
mv matrix.txt $data/glosa_tf/matrix/$mk"_"$nk.txt
mv ali_struct.pdb $data/glosa_tf/aligned/$nk"_"$mk.pdb
# move back and remove tmp dir
cd $TMP_DIR/../
rm -rf $TMP_DIR
exit 0
PBS submission
qsub job_script.sh
Logging in to one of the nodes: ssh ip-172-31-9-208
top - 09:28:03 up 15 min, 1 user, load average: 14.77, 13.44, 8.08
Tasks: 928 total, 1 running, 434 sleeping, 0 stopped, 166 zombie
Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 98.4%id, 1.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 193694612k total, 1811200k used, 191883412k free, 94680k buffers
Swap: 0k total, 0k used, 0k free, 707960k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15348 ec2-user 20 0 16028 2820 1820 R 0.3 0.0 0:00.10 top
15621 ec2-user 20 0 169m 7584 6684 S 0.3 0.0 0:00.01 ssh
15625 ec2-user 20 0 171m 7472 6552 S 0.3 0.0 0:00.01 ssh
15626 ec2-user 20 0 126m 3924 3492 S 0.3 0.0 0:00.01 perl
.....
top on all of the nodes shows a similar state, and only ~26 jobs are running at a time!
I have an aws-parallelcluster with 5 nodes (each with 72 CPUs), the Torque scheduler, and GNU Parallel 2018 (Mar 2018).
Update
Introducing the new function that takes input on stdin and running the script in parallel works great and utilizes all CPUs on the local machine.
However, when it runs on remote machines it produces:
parallel: Error: test.lst is neither a file nor a block device
MCVE:
A simple script that just echoes the list gives the same error when run on remote machines, but it works fine on the local machine:
cat test.lst # contains list
DNMT3L-5yx2B_1_N-DNMT3L-5yx2B_2_N
DNMT3L-5yx2B_1_N-DNMT3L-6brrC_3_N
DNMT3L-5yx2B_1_N-DNMT3L-6f57B_2_N
DNMT3L-5yx2B_1_N-DNMT3L-6f57C_2_N
DNMT3L-5yx2B_1_N-DUX4-6e8cA_4_N
DNMT3L-5yx2B_1_N-E2F8-4yo2A_3_P
DNMT3L-5yx2B_1_N-E2F8-4yo2A_6_N
DNMT3L-5yx2B_1_N-EBF3-3n50A_2_N
DNMT3L-5yx2B_1_N-ELK4-1k6oA_3_N
DNMT3L-5yx2B_1_N-EPAS1-1p97A_1_N
cat test_job.sh # GNU parallel submission script
#!/bin/bash
#PBS -l nodes=1:ppn=72
#PBS -N test
#PBS -k oe
# introduce the new function and run from ~/
dowork() {
    parallel sh test_work.sh {}
}
export -f dowork
parallel -a test.lst --env dowork --pipepart --slf $PBS_NODEFILE --block -10 dowork
cat test_work.sh # run/work script
#!/bin/bash
i=$1
data=$(pwd)
# create a temporary folder in the current dir
TMP_DIR=$data/$i
mkdir -p $TMP_DIR
cd $TMP_DIR/
# split list
mk=$(echo "$i" | cut -d- -f1-2)
nk=$(echo "$i" | cut -d- -f3-6)
# echo list and save in echo_test.out
echo $mk, $nk >> $data/echo_test.out
cd $TMP_DIR/../
rm -rf $TMP_DIR
From your timing:
real 0m0.179s
user 0m0.005s
sys 0m0.000s
it seems the tool uses very little CPU power. When GNU Parallel runs local jobs it has an overhead of 10 ms of CPU time per job. Your jobs take 179 ms of wall-clock time and only 5 ms of CPU time, so GNU Parallel's overhead accounts for quite a bit of the time spent.
The overhead is much worse when running jobs remotely: there we are talking 10 ms plus the cost of running an ssh command, which can easily be on the order of 100 ms.
So how can we minimize the number of ssh commands, and how can we spread the overhead over multiple cores?
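As a rough back-of-the-envelope illustration (my own numbers, not part of the original answer): at ~100 ms of ssh overhead per remote job, the ~300 million lines in example.lst would cost on the order of 3x10^7 seconds of pure overhead, i.e. roughly a day even when spread over 5 x 72 = 360 job slots, which is why the rest of the answer batches many lines into each ssh. You can also measure the local per-job overhead on your own machine with a batch of do-nothing jobs:
# time 1000 do-nothing local jobs; the elapsed time divided by 1000
# approximates GNU Parallel's per-job overhead on this machine
time parallel true ::: {1..1000}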
First let us make a function that can take input on stdin and run the script - one job per CPU thread in parallel:
dowork() {
    [... set variables here; this becomes particularly important when we run remotely ...]
    parallel sh run_script.sh {}
}
export -f dowork
Test that this actually works by running:
head -n 1000 example.lst | dowork
Then let us look at running jobs locally. This can be done similarly to what is described here: https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Running-more-than-250-jobs-workaround
parallel -a example.lst --pipepart --block -10 dowork
This will split example.lst into 10 blocks per CPU thread, so on a machine with 72 CPU threads it will make 720 blocks. It will then start 72 doworks, and when one finishes it will be given another of the 720 blocks. The reason I chose 10 instead of 1 is that if one of the jobs "gets stuck" for a while, you are unlikely to notice it.
This should make sure 100% of the CPUs on the local machine are busy.
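To see what --pipepart with a negative --block size does before pointing it at the real data, a small stand-alone test can help; this sketch uses wc -l as a placeholder for dowork:
# split a 1,000,000-line file into roughly 10 blocks per job slot and feed
# each block on stdin to wc -l, so the output is one line count per block
seq 1000000 > numbers.txt
parallel -a numbers.txt --pipepart --block -10 wc -l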
If that works, we need to distribute this work to remote machines:
parallel -j1 -a example.lst --env dowork --pipepart --slf $PBS_NODEFILE --block -10 dowork
This should start 10 ssh commands per CPU thread in total (i.e. 5*72*10), namely one for each block, with one running per server listed in $PBS_NODEFILE in parallel.
Unfortunately this means that --joblog and --resume will not work. There is currently no way to make that work, but if it is valuable to you, contact me via parallel@gnu.org.
I am not sure what the tool does, but if the copying takes most of the time and the tool only reads the files, then you might just be able to symlink the files into $TMP_DIR instead of copying them.
A good indication of whether you can go faster is to look at top on the 5 machines in the cluster. If they are all using all cores at >90%, then you cannot expect to get it faster.

bash can't capture output from aria2c to variable and stdout

I am trying to use aria2c to download a file. The command looks like this:
aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath
The command works perfectly from the script when run this way. What I'm trying to do is capture the output from the command to a variable and still display it on the screen in real-time.
I have successfully captured the output to a variable by using:
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath)
With this scenario though, there's a long delay on the screen where there's no update while the download is happening. I have an echo command after this line in the script and $VAR has all of the aria2c download data captured.
I have tried using different combinations of 2>&1 and | tee /dev/tty at the end of the command, but nothing shows in the display in real time.
Example:
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath 2>&1)
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath 2>&1 | tee /dev/tty )
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath | tee /dev/tty )
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1)
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1 | tee /dev/tty )
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1 ) | tee /dev/tty )
I've been able to use the "2>&1 | tee" combination before with other commands, but for some reason I can't seem to capture aria2c's output to both simultaneously. Has anyone had any luck doing this from a bash script?
Since aria2c seems to output to stdout, consider teeing that to stderr:
var=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath | tee /dev/fd/2)
The stdout ends up in var while tee duplicates it to stderr, which displays to your screen.
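If you want to convince yourself the redirection behaves as expected before wiring it into the aria2c command, here is a quick stand-in test (seq playing the role of aria2c):
# the numbers are duplicated to stderr (so they appear on the terminal immediately)
# while the original stdout is captured by the command substitution
out=$(seq 3 | tee /dev/fd/2)
echo "captured: $out"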

GNU Parallel timeout for process

I want to use GNU Parallel for this command:
seq -w 30 | parallel -k -j6 java -javaagent:build/libs/pddl4j-3.1.0.jar -server -Xms8048m -Xmx8048m fr.uga.pddl4j.planners.hsp.HSP -o pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/domain.pddl -f pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/p{}.pddl -i 8 '>>' AstarMovie.txt
I have a timeout of 600 seconds set inside the Java program, but it is not being enforced: processes can run for 2, 3, 4 or more hours and never stop.
I tried this command based on the GNU tutorial online, but it doesn't work either:
seq -w 30 | parallel -k --timeout 600000 -j6 java -javaagent:build/libs/pddl4j-3.1.0.jar -server -Xms2048m -Xmx2048m fr.uga.pddl4j.planners.hsp.HSP -o pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/domain.pddl -f pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/p{}.pddl -i 8 '>>' AstarMovie.txt
I saw in the tutorial that GNU Parallel uses milliseconds, so 600000 should be the 10 minutes I need, but after 12 minutes the process was still running. I need 6 processes to run at once for a maximum of 10 minutes each.
Any help would be great. Thanks.
EDIT:
Why do people feel the need to edit posts for small changes like '600seconds' to '600 seconds'? Stop doing it for karma.
The timeout for GNU Parallel is given in seconds, not milliseconds. You can test it with this snippet, which asks sleep to wait for 15 seconds but sets a timeout that cuts it off after 10 seconds:
time parallel --timeout 10 sleep {} ::: 15
real 0m10.961s
user 0m0.071s
sys 0m0.038s
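Applied to the command from the question, that means writing the timeout as 600 (seconds) rather than 600000; the rest of the second command from the question is reused unchanged:
seq -w 30 | parallel -k --timeout 600 -j6 java -javaagent:build/libs/pddl4j-3.1.0.jar -server -Xms2048m -Xmx2048m fr.uga.pddl4j.planners.hsp.HSP -o pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/domain.pddl -f pddl/benchmarks_STRIPS/benchmarks_STRIPS/ipc1/movie/p{}.pddl -i 8 '>>' AstarMovie.txt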

Using GNU Parallel and rsync with passwords?

I have seen GNU Parallel used with rsync; unfortunately, I cannot find a clear answer for my use case.
As part of my script I have this:
echo "file01.zip
file02.zip
file03.zip
" | ./gnu-parallel --line-buffer --will-cite \
-j 2 -t --verbose --progress --interactive \
rsync -aPz {} user@example.com:/home/user/
So I run the script, and as part of its output, once it gets to the gnu-parallel step, I get this (because of --interactive, I am prompted to confirm each file):
rsync -aPz file01.zip user@example.com:/home/user/ ?...y
rsync -aPz file02.zip user@example.com:/home/user/ ?...y
Computers / CPU cores / Max jobs to run
1:local / 4 / 2
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:2/0/100%/0.0s
... and then, the process just hangs here and does nothing; no numbers change or anything.
At this point, I can do from another terminal this:
$ ps axf | grep rsync
12754 pts/1 S+ 0:00 | | \_ perl ./gnu-parallel --line-buffer --will-cite -j 2 -t --verbose --progress --interactive rsync -aPz {} user@example.com:/home/user/
12763 pts/1 T 0:00 | | \_ rsync -aPz file01.zip user@example.com:/home/user/
12764 pts/1 R 0:11 | | | \_ ssh -l user example.com rsync --server -logDtprze.iLs --log-format=X --partial . /home/user/
12766 pts/1 T 0:00 | | \_ rsync -aPz file02.zip user@example.com:/home/user/
12769 pts/1 R 0:10 | | \_ ssh -l user example.com rsync --server -logDtprze.iLs --log-format=X --partial . /home/user/
... and so I can confirm that the processes have been started, but they are apparently not doing anything. To confirm that they are not doing anything (as opposed to uploading, which is what they should be doing), I ran the monitor sudo iptraf, and it reported 0 Kb/s for all traffic on wlan0, which is the only interface I have here.
The thing is, the server I am logging in to accepts only SSH authentication with passwords. At first I thought --interactive would allow me to enter the passwords interactively, but instead it prompts the user about whether to run each command line, reading a line from the terminal and only running the command if the response starts with 'y' or 'Y'. So OK, above I answered y, but it doesn't prompt me for a password afterwards, and the processes seem to be hanging there waiting for one. My version is GNU parallel 20160422.
$ ./gnu-parallel --version | head -1
GNU parallel 20160422
So, how can I use GNU parallel, to run multiple rsync tasks with passwords?
Use sshpass:
doit() {
    rsync -aPz -e 'sshpass -p MyP4$$w0rd ssh' "$1" user@example.com:/home/user
}
export -f doit
parallel --line-buffer -j 2 --progress doit ::: *.zip
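One note on the quoting in the function above: single quotes keep the placeholder password literal (inside double quotes, $$ would expand to the shell's PID). If you would rather not put the password in the function at all, sshpass can also read it from the SSHPASS environment variable via its -e option; here is a variant of the same idea, with a hypothetical password and host:
# the password is exported once and never appears on the rsync/ssh command line
export SSHPASS='MyP4$$w0rd'
doit() {
    rsync -aPz -e 'sshpass -e ssh' "$1" user@example.com:/home/user
}
export -f doit
parallel --line-buffer -j 2 --progress doit ::: *.zip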
The fundamental problem with running interactive programs in parallel is: which program should get the input if two programs are ready for input? Therefore GNU Parallel's --tty implies -j1.

Cron job command starts 2 processes for 1 Python script

My crontab had the command:
50 08 * * 1-5 /home/MY_SCRIPT.py /home/arguments 2> /dev/null
59 23 * * 1-5 killall MY_SCRIPT.py
This worked perfectly fine, but when I ran
ps aux | grep SCRIPT
It showed:
myuser 13898 0.0 0.0 4444 648 ? Ss 08:50 0:00 /bin/sh -c /home/MY_SCRIPT.py /home/arguments 2> /dev/null
myuser 13900 0.0 0.0 25268 7384 ? S 08:50 0:00 /usr/bin/python /home/MY_SCRIPT.py /home/arguments
Why are 2 processes being shown?
And the killall command also used to work fine.
I made a change to my script, and in order to get the new behaviour I had to kill the currently running processes, so I used
kill 13898 13900
After that I used the same command (as in crontab)
/home/MY_SCRIPT.py /home/arguments 2> /dev/null
But now after restarting the script, it showed only 1 process (which makes sense)
Everything looks good up to here, but this time the killall MY_SCRIPT.py in the cron job didn't work; it said it could not find the pid, and the script kept running until I had to kill it manually.
I need to find out the reason for this behaviour:
Why are there 2 processes from the cron job?
Is there something wrong with the way I restarted the script?
How do I make sure that the next time I restart the script, cron will kill it for sure?
OS: Ubuntu Linux
You're seeing two processes because cron uses /bin/sh to launch your Python script. So basically what happens is:
/bin/sh -c '/home/MY_SCRIPT.py /home/arguments 2> /dev/null'
And the process structure becomes
/bin/sh -> /usr/bin/python
Try this format instead:
50 08 * * 1-5 /bin/sh -c 'exec /home/MY_SCRIPT.py /home/arguments 2> /dev/null'
It may also be a good idea to specify the full path to killall. It's probably in /usr/bin. Verify it with which killall.
59 23 * * 1-5 /usr/bin/killall MY_SCRIPT.py
Another more efficient way to do it is to save your process id somewhere:
50 08 * * 1-5 /bin/sh -c 'echo "$$" > /var/run/my_script.pid; exec /home/MY_SCRIPT.py /home/arguments 2> /dev/null'
And use a more efficient killer:
59 23 * * 1-5 /bin/sh -c 'read PID < /var/run/my_script.pid; kill "$PID"'
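A quick way to sanity-check the pidfile/exec idea outside of cron, using sleep and /tmp as hypothetical stand-ins for the real script and /var/run:
# start a throwaway "service": the shell writes its own PID, then exec replaces it with sleep,
# so the saved PID is the PID of the long-running process itself
/bin/sh -c 'echo "$$" > /tmp/my_script.pid; exec sleep 300' &
sleep 1
read PID < /tmp/my_script.pid
ps -p "$PID" -o pid,comm   # should list "sleep", proving the pidfile points at the real process
kill "$PID"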
