Watch with Process Substitution - bash

I often run the command
squeue -u $USER | tee >(wc -l)
where squeue is a Slurm command that shows the jobs you are running. This gives me both the output from squeue and automatically tells me how many lines are in it.
How can I watch this command?
watch -n.1 "squeue -u $USER | tee >(wc -l)" results in
Every 0.1s: squeue -u randoms | tee >(wc -l) Wed May 9 14:46:36 2018
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `squeue -u randoms | tee >(wc -l)'

From the watch man page:
Note that command is given to "sh -c" which means that you may need to use extra quoting to get the desired effect.
sh -c also does not support process substitution, which is the syntax you're using here with >(...).
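You can see the difference directly; assuming your /bin/sh is a POSIX shell without process substitution (as the error above suggests), something like this reproduces it (exact error text varies by shell):
sh -c 'cat <(echo hi)'     # syntax error near unexpected token `('
bash -c 'cat <(echo hi)'   # hi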
Fortunately, that syntax isn't actually needed for what you're doing:
watch -n.1 'out=$(squeue -u "$USER"); echo "$out"; { echo "$out" | wc -l; }'
...or, if you really want to use your original code even at a heavy performance penalty (starting not just one but two new shells every tenth of a second -- first sh, and then bash):
bash_cmd() { squeue -u "$USER" | tee >(wc -l); } # create a function
export -f bash_cmd # export function to the environment
watch -n.1 'bash -c bash_cmd' # call function from bash started from sh started by watch

Related

How to find the number of instances of current script running in bash?

I have the below code to find out the number of instances of the current script running with the same arg1. But it looks like the script creates a subshell and executes this command, which also shows up in the output. What would be a better approach to find the number of instances of the running script?
$cat test.sh
#!/bin/bash
num_inst=`ps -ef | grep $0 | grep $1 | wc -l`
echo $num_inst
$ps aux | grep test.sh | grep arg1 | grep -v grep | wc -l
0
$./test.sh arg1 arg2
3
$
I am looking for a solution that matches all running instances of ./test.sh arg1 arg2, not ones like ./test.sh arg10 arg20.
The reason this creates a subshell is that there's a pipeline inside the command substitution. If you run ps -ef alone in a command substitution, and then separately process the output from that, you can avoid this problem:
#!/bin/bash
all_processes=$(ps -ef)
num_inst=$(echo "$all_processes" | grep "$0" | grep -c "$1")
echo "$num_inst"
I also did a bit of cleanup on the script: I double-quoted all variable references to avoid unwanted word splitting, used $() instead of backticks, and replaced grep ... | wc -l with grep -c.
You might also replace the echo "$all_processes" | ... with ... <<<"$all_processes" and maybe the two greps with a single grep -c "$0 $1":
...
num_inst=$(grep -c "$0 $1" <<<"$all_processes")
...
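Putting those suggestions together, the whole script might look like this (just a sketch of the variant described above):
#!/bin/bash
# capture the process list once, then count lines matching "script arg1"
all_processes=$(ps -ef)
num_inst=$(grep -c "$0 $1" <<<"$all_processes")
echo "$num_inst"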
Modify your script like this:
#!/bin/bash
ps -ef | grep "$0" | wc -l
There is no need to store the value in a variable; the result is printed to standard output anyway.
Now why do you get 3?
When you run a command within backticks (FYI, you should use the syntax num_inst=$( COMMAND ) rather than backticks), it creates a new subshell to run COMMAND and then assigns the stdout text to the variable. So if you remove the command substitution and run the pipeline directly, you will get your expected value of 2.
To convince yourself of that, remove the | wc -l; you will see that num_inst contains 3 processes, not 2. The third one exists only for the execution of COMMAND.
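For example, keeping the original structure but dropping the count makes the matching processes visible (an illustrative sketch, not part of the original script):
#!/bin/bash
# keep the command substitution, but list the matches instead of counting them
num_inst=`ps -ef | grep "$0" | grep "$1"`
echo "$num_inst"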

execute a string in a bash script containing multiple redirects

I am trying to write a bash script which simply acts as an emulator. It takes input from the user and executes the command, while writing the command along with its output to a file. I am unable to handle inputs which contain either a | or a > .
The only option I could find was splitting the command on | into an array and running the pieces individually. However, this does not handle > redirects.
Thanks in advance.
$cmd is a command taken as input from the user
I used the command
$cmd 2>&1 | tee -a $flname
but this does not work if there is a | or a > in $cmd
/bin/bash -c "$cmd 2>&1 | tee -a $flname" does not run/store the command either
Try this:
#!/bin/bash
read -r -p "Insert command to execute"$'\n' cmd
echo "Executing '$cmd'"
/bin/bash -c "$cmd"
# or eval "$cmd"
Example of execution:
$ ./script.sh
Insert command to execute
printf '1\n2\n3\n4\n' | grep '1\|3'
Executing 'printf '1\n2\n3\n4\n' | grep '1\|3''
1
3
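If you also want to append both the command and its output to a log file, as in the question, one way to extend this (a sketch assuming $flname holds the log file name) is:
read -r -p "Insert command to execute"$'\n' cmd
echo "\$ $cmd" >> "$flname"                   # record the command itself
/bin/bash -c "$cmd" 2>&1 | tee -a "$flname"   # run it, showing and appending the output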

xargs output buffering -P parallel

I have a bash function that I call in parallel using xargs -P like so
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction #'
Everything works fine but output is messed up for obvious reasons (no buffering)
I'm trying to figure out a way to buffer the output effectively. I was thinking I could use awk, but I'm not good enough to write such a script and I can't find anything worthwhile on Google. Can someone help me write this "output buffer" in sed or awk? Nothing fancy, just accumulate output and spit it out after the process terminates. I don't care about the order in which the shell functions execute, I just need their output buffered... Something like:
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction # | sed -u ""'
P.S. I tried to use stdbuf as per
https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe but that did not work; I specified line buffering for stdout and stderr but the output is still unbuffered:
echo ${list} | xargs -n 1 -P 24 -I# stdbuf -i0 -oL -eL bash -l -c 'myAwesomeShellFunction #'
Here's my first attempt; this only captures the first line of output:
$ bash -c "echo stuff;sleep 3; echo more stuff" | awk '{while (( getline line) > 0 )print "got ",$line;}'
got  stuff
This isn't quite atomic if your output is longer than a page (4kb typically), but for most cases it'll do:
xargs -P 24 bash -c 'for arg; do printf "%s\n" "$(myAwesomeShellFunction "$arg")"; done' _
The magic here is the command substitution: $(...) creates a subshell (a fork()ed-off copy of your shell), runs the code ... in it, and then reads that in to be substituted into the relevant position in the outer script.
Note that we don't need -n 1 (if you're dealing with a large number of arguments -- for a small number it may improve parallelization), since we're iterating over as many arguments as each of your 24 parallel bash instances is passed.
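The trailing _ matters here, by the way: bash -c uses the first argument after the command string as $0, so passing _ there keeps the real xargs-supplied arguments in $1 onward, which is exactly what the bare for arg; do ...; done loop iterates over.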
If you want to make it truly atomic, you can do that with a lockfile:
# generate a lockfile, arrange for it to be deleted when this shell exits
lockfile=$(mktemp -t lock.XXXXXX); export lockfile
trap 'rm -f "$lockfile"' 0
xargs -P 24 bash -c '
  for arg; do
    {
      output=$(myAwesomeShellFunction "$arg")
      flock -x 99
      printf "%s\n" "$output"
    } 99>"$lockfile"
  done
' _
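A quick note on how the locking works: 99>"$lockfile" opens file descriptor 99 on the lock file for the duration of the braced group, flock -x 99 takes an exclusive lock on that descriptor before anything is printed, and the lock is released when the group ends and the descriptor is closed, so each chunk of output is written out whole.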

How does /bin/bash -c differ from executing a command directly?

I'm curious why the command:
for f in `/bin/ls /mydir | sort | tail -n 10`; do echo $f; done;
Outputs the last ten files in /mydir, but
/bin/bash -c "for f in `/bin/ls /mydir | sort | tail -n 10`; do echo $f; done;"
Outputs "syntax error near unexpected token '[file in /mydir]'"
You are using double-quotes, so the parent shell is interpolating backticks and variables before passing the argument to /bin/bash.
Thus, your /bin/bash is receiving the following arguments:
-c "for f in x
y
z
...
; do echo ; done;"
which is a syntax error.
To avoid this, use single quotes to pass your argument:
/bin/bash -c 'for f in `/bin/ls /mydir | sort | tail -n 10`; do echo $f; done;'
The difference is in how newlines in your subcommand's output are handled. For example, on my machine, using /bin instead of /mydir, I get this output for your first example:
rmdir
sh
sleep
stty
sync
tcsh
test
unlink
wait4path
zsh
But with the latter:
/bin/bash: -c: line 1: syntax error near unexpected token `sh'
/bin/bash: -c: line 1: `sh'
Because the command substitution takes place in your first shell in both cases, your first one works (the newlines are stripped out by word splitting when the command line is built), but the second case doesn't: the newlines remain in the string thanks to your double quotes. Using echo rather than bash -c can showcase this:
$ echo "for f in `/bin/ls /bin | sort | tail -n 10`; do echo \$f; done"
for f in rmdir
sh
sleep
stty
sync
tcsh
test
unlink
wait4path
zsh; do echo $f; done
You can see from that what your bash -c is seeing and why it doesn't work - the sh comes before the do!
You can use single quotes instead, but that will cause the subcommand to run in your new subshell:
$ /bin/bash -c 'for f in `/bin/ls /bin | sort | tail -n 10`; do echo $f; done'
rmdir
sh
sleep
stty
sync
tcsh
test
unlink
wait4path
zsh
If that's not ok, you need to get rid of those newlines:
$ /bin/bash -c "for f in `/bin/ls /bin | sort | tail -n 10 | tr '\n' ' '`; do echo \$f; done"
rmdir
sh
sleep
stty
sync
tcsh
test
unlink
wait4path
zsh

how to use GNU Time with pipeline

I want to measure the running time of some SQL queries in PostgreSQL. Using the Bash built-in time, I could do the following:
$ time (echo "SELECT * FROM sometable" | psql)
I like GNU time, which provides more output formats. However, I don't know how to use it with a pipeline. For simplicity, I use ls | wc in the following examples:
$ /usr/bin/time -f "%es" (ls | wc)
-bash: syntax error near unexpected token `('
$ /usr/bin/time -f "%es" "ls | wc"
/usr/bin/time: cannot run ls | wc: No such file or directory
If I do not group the pipe in any way, it does not complain:
$ /usr/bin/time -f "%es" ls | wc
0.00s
But apparently this only measures the first part of the pipe, as shown in the next example:
$ /usr/bin/time -f "%es" ls | sleep 20
0.00s
So the question is: what is the correct syntax for GNU time with a pipeline?
Call the shell from time:
/usr/bin/time -f "%es" bash -c "ls | wc"
Of course, this will include the shell start-up time as well; it shouldn't be too much, but if you're on a system that has a lightweight shell like dash (and it's sufficient to do what you need), then you could use that to minimize the start-up time overhead:
/usr/bin/time -f "%es" dash -c "ls | wc"
Another option would be to just time the command you are actually interested in, which is the psql command. time will pass its standard input to the program being executed, so you can run it on just one component of the pipeline:
echo "SELECT * FROM sometable" | /usr/bin/time -f "%es" psql
Create a script that calls your pipeline, then time that script:
/usr/bin/time -f '%es' script.sh
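For example, sticking with the ls | wc pipeline from above, script.sh could be as small as this (a sketch; substitute your real pipeline):
#!/bin/sh
# the pipeline to be timed
ls | wc
This still includes a shell start-up, like the bash -c approach, but it keeps the pipeline readable as it grows.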

Resources