Different output of command substitution - bash

Why does adding | wc -l alter the result, as in the following?
pgrep tst | wc -l
echo $(pgrep tst | wc -l)
echo $(pgrep tst) | wc -l
$ ./tst
and even
$ bash -x tst
+ wc -l
+ pgrep tst
++ pgrep tst
++ wc -l
+ echo 0
++ pgrep tst
+ echo

pgrep and subshells can have weird interactions, but in this case that's just a red herring; the actual cause is missing double-quotes around the command substitution:
$ cat tst2
pgrep tst | wc -l
echo "$(pgrep tst | wc -l)"
echo "$(pgrep tst)" | wc -l
$ ./tst2
What's going on in the original script is that in the command
echo $(pgrep tst) | wc -l
pgrep prints two process IDs (the main shell running the script, and a subshell created to handle the echo part of the pipeline). It prints each one as a separate line, something like:
11730
11736
The command substitution captures that, but since it's not in double-quotes the newline between them gets converted to an argument break, so the whole thing becomes equivalent to:
echo 11730 11736 | wc -l
As a result, echo prints both IDs as a single line, and wc -l correctly reports that.
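The effect can be reproduced without pgrep. A minimal sketch, assuming the two IDs from the example above stand in for pgrep's output (the variable names are illustrative only):

```bash
# Stand-in for pgrep's two-line output (IDs taken from the example above).
ids=$(printf '11730\n11736\n')

# Unquoted: the newline becomes an argument break, so echo prints one line.
unquoted_lines=$(echo $ids | wc -l | tr -d ' ')

# Quoted: the newline survives, so echo prints two lines.
quoted_lines=$(echo "$ids" | wc -l | tr -d ' ')

echo "unquoted: $unquoted_lines, quoted: $quoted_lines"
```

The tr -d ' ' is only there because some wc implementations pad their output with spaces.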

The command substitution induces an additional process that has tst in its name, which is included in the input to wc -l.


Prevent grep from exiting in case of nomatch

How can I prevent grep from returning an error when the input doesn't match? I would like the script to keep running instead of exiting with exit code 1.
set -euo pipefail
numbr_match=$(find logs/log_proj | grep "$name" | wc -l);
How could I solve this?
In this individual case, you should probably use
find logs/log_proj -name "*$name*" | wc -l
More generally, you can run grep in a subshell and trap the error.
find logs/log_proj | ( grep "$name" || true) | wc -l
... though of course grep | wc -l is separately an antipattern;
find logs/log_proj | grep -c "$name" || true
I don't know why you are using -e and pipefail when you don't want this behaviour. If your goal is just to treat exit code 2 (from grep) as an error, but exit code 1 as no error, you could write a wrapper script around grep and call it instead of grep:
# This script behaves exactly like grep, only
# that it returns exit code 0 if there are no
# matching lines
grep "$@"
rc=$?
exit $((rc == 1 ? 0 : rc))
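If a separate script is inconvenient, the same idea fits in a shell function. A sketch, assuming bash; the name grep_nofail is made up for illustration:

```bash
# Hypothetical helper: like grep, but "no match" (exit 1) counts as success.
# Real errors (exit 2, e.g. an unreadable file) are still reported.
grep_nofail() {
    local rc=0
    grep "$@" || rc=$?      # the "||" keeps set -e from aborting here
    if [ "$rc" -eq 1 ]; then
        return 0
    fi
    return "$rc"
}

# Usage: no match yields a count of 0 and a zero exit status.
count=$(printf 'alpha\nbeta\n' | grep_nofail -c 'gamma')
echo "matches: $count"
```

Unlike a blanket || true, this still lets set -e stop the script when grep itself fails.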

grep -c kills script when no match using set -e

Basic example:
set -e
set -x
NUM_LINES=$(printf "Hello\nHi" | grep -c "How$")
echo "Number of lines: ${NUM_LINES}" # never prints 0
++ grep -c 'How$'
++ printf 'Hello\nHi'
If there are matches, it prints the correct number of lines. Also grep "How$" | wc -l works instead of using grep -c "How$".
You can suppress grep's exit code by running : when it "fails". : always succeeds.
NUM_LINES=$(printf "Hello\nHi" | grep -c "How$" || :)
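To see both behaviours side by side, the snippet can be run in a child bash so that set -e only affects that child. This is a sketch for illustration; the script_* and *_guard variable names are not from the original post:

```bash
# Same snippet as above, once without and once with the "|| :" guard.
script_without_guard='set -e
NUM_LINES=$(printf "Hello\nHi" | grep -c "How\$")
echo "Number of lines: ${NUM_LINES}"'

script_with_guard='set -e
NUM_LINES=$(printf "Hello\nHi" | grep -c "How\$" || :)
echo "Number of lines: ${NUM_LINES}"'

# Without the guard the child exits at the assignment, so nothing is printed.
without_guard=$(bash -c "$script_without_guard" || true)
with_guard=$(bash -c "$script_with_guard")

echo "without guard: [$without_guard]"
echo "with guard:    [$with_guard]"
```

grep -c still prints 0 when nothing matches; it is only the exit status of 1 that trips set -e.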

system command returns strange value

I have this script
npid=$(pgrep -f procname | wc -l)
echo $npid
if [ "$npid" -lt 3 ]; then
    service procname start
fi
When procname is running, it works fine and shows the correct number for $npid. It fails when no procname is running: it should return zero, but it returns npid=3, exactly 3. I see the same problem with ps auxw | grep procname | grep -v grep | wc -l as well.
I suspect something trivial is wrong, but I can't figure it out. Any suggestions?
# This returns nothing if there isn't a process named poopit running
pgrep -f poopit
# When no process is running, the command below returns zero if typed on a bash command line
pgrep -f poopit | wc -l
# If running,
pgrep -f poopit | wc -l
# If running, the script $npid shows

Different pipeline behavior between sh and ksh

I have isolated the problem to the below code snippet:
Notice below that a null string gets assigned to LATEST_FILE_NAME when the script is run using ksh, but the value is assigned correctly when the script is run using sh. This in turn affects the value of $FILE_LIST_COUNT.
But as the script is in KornShell (ksh), I am not sure what might be causing the issue.
When I comment out the tee command in the below line, the ksh script works fine and correctly assigns the value to variable $LATEST_FILE_NAME.
(cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH
Kindly consider:
1. Source Code: script.sh
set -vx # Enable debugging
# Log file
Timestamp=`date +%Y%m%d%H%M`
## Temporary files
FILE_LIST=FILE_LIST.temp #Will store all extract filenames
FILE_LIST_COUNT=0 # Stores total number of files
# Get list of all files, Sort in reverse order, and store names of the files line-wise. If no files are found, error is muted.
(cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH
if [[ ! -f $SOURCE_FILE_PATH/$FILE_LIST ]]; then
echo "FATAL ERROR - Could not create a temp file for file list.";exit 1;
fi
exit 0;
2. Output when using shell sh script.sh:
+ getFileListDetails
+ rm -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300506.log
+ cd /some/path/Scripts/TEST/shell_issue
+ sort -r
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300506.log
+ ls 1.txt 2.txt 3.txt
+ [[ ! -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp ]]
++ cd /some/path/Scripts/TEST/shell_issue
++ head -1 FILE_LIST.temp
++ cat /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
++ wc -l
exit 0;
+ exit 0
3. Output when using ksh ksh script.sh:
+ getFileListDetails
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300507.log
+ rm -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ 2>& 1
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300507.log
+ sort -r
+ 1> /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ cd /some/path/Scripts/TEST/shell_issue
+ ls 1.txt 2.txt 3.txt
+ 2> /dev/null
+ [[ ! -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp ]]
+ cd /some/path/Scripts/TEST/shell_issue
+ head -1 FILE_LIST.temp
+ wc -l
+ cat /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
exit 0;+ exit 0
OK, here goes...this is a tricky and subtle one. The answer lies in how pipelines are implemented. POSIX states that
If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.
Notice the keyword may. Many shells implement this in a way that all commands need to complete, e.g. see the bash manpage:
The shell waits for all commands in the pipeline to terminate before returning a value.
Notice the wording in the ksh manpage:
Each command, except possibly the last, is run as a separate process; the shell waits for the last command to terminate.
In your example, the last command is tee. Because the preceding command redirects its stdout to ${SOURCE_FILE_PATH}/${FILE_LIST}, tee receives no input and exits immediately. To oversimplify, tee finishes before sort has written its output, so ksh moves on while the file is probably still being written, and you read from it too early. You can test this (this is not a fix!) by adding a sleep after the pipeline:
$ ksh -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; echo "[$(head -n 1 /tmp/foo.txt)]"'
$ ksh -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; sleep 0.1; echo "[$(head -n 1 /tmp/foo.txt)]"'
$ bash -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; echo "[$(head -n 1 /tmp/foo.txt)]"'
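One way to avoid depending on how the shell waits is to let tee be the command that writes the list. This is a sketch under assumptions, not the original script: the temporary directory and the run.log name are placeholders for the real $SOURCE_FILE_PATH and $LOG_FILE_PATH.

```bash
# Placeholder setup standing in for the real source directory and log file.
SOURCE_FILE_PATH=$(mktemp -d)
LOG_FILE_PATH="$SOURCE_FILE_PATH/run.log"
FILE_LIST=FILE_LIST.temp
touch "$SOURCE_FILE_PATH/1.txt" "$SOURCE_FILE_PATH/2.txt" "$SOURCE_FILE_PATH/3.txt"

# sort now feeds tee, and tee is the last command. Every shell waits for the
# last command, so both files are complete when the pipeline returns.
(cd "$SOURCE_FILE_PATH" && ls *.txt 2>/dev/null) | sort -r |
    tee -a "$LOG_FILE_PATH" > "${SOURCE_FILE_PATH}/${FILE_LIST}"

LATEST_FILE_NAME=$(head -n 1 "${SOURCE_FILE_PATH}/${FILE_LIST}")
echo "latest: $LATEST_FILE_NAME"
```

With this arrangement the log actually receives the file list, which the original pipeline never delivered because sort's output had already been redirected away from tee.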
That being said, here are a few other things to consider:
Always quote your variables, especially when dealing with files, to avoid problems with globbing, word splitting (if your path contains spaces) etc.:
do_something "${this_is_my_file}"
head -1 is deprecated; use head -n 1.
If you only have one command on a line, the trailing semicolon ; is superfluous, so just skip it.
No need to cd into the directory first; just pass the whole path as the argument to head, e.g. head -n 1 "${SOURCE_FILE_PATH}/${FILE_LIST}".
Piping cat into wc -l, as the script does, is called a Useless Use Of Cat because the cat is not needed: wc can deal with files. You probably used it because the output of wc -l myfile includes the filename, but you can use e.g. FILE_LIST_COUNT="$(wc -l < "${SOURCE_FILE_PATH}/${FILE_LIST}")" instead.
Furthermore, you will want to read Why you shouldn't parse the output of ls(1) and How can I get the newest (or oldest) file from a directory?.
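Put together, the suggestions above might look like this. A sketch only: the temporary directory and the three file names are placeholders, not taken from the original environment.

```bash
# Placeholder list file standing in for ${SOURCE_FILE_PATH}/${FILE_LIST}.
SOURCE_FILE_PATH=$(mktemp -d)
FILE_LIST=FILE_LIST.temp
printf '3.txt\n2.txt\n1.txt\n' > "${SOURCE_FILE_PATH}/${FILE_LIST}"

# Full path passed straight to head: no cd, and head -n 1 instead of head -1.
LATEST_FILE_NAME="$(head -n 1 "${SOURCE_FILE_PATH}/${FILE_LIST}")"

# Redirecting into wc avoids both cat and the filename in wc's output.
FILE_LIST_COUNT="$(wc -l < "${SOURCE_FILE_PATH}/${FILE_LIST}")"
FILE_LIST_COUNT=$((FILE_LIST_COUNT))   # strip padding that some wc versions add

echo "${LATEST_FILE_NAME} ${FILE_LIST_COUNT}"
```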

set variable in heredoc section

I'm a shell script newbie, so I must be doing something stupid, why won't this work:
while read line
do
ssh $USER@$line <<ENDSSH
ls -d foo* | wc -l
count=`ls -d foo* | wc -l`
echo $count
ENDSSH
done <$myfile
Two lines should be printed, and each should have the same value... but they don't. The first print statement [the result of ls -d foo* | wc -l] has the correct value, the second print statement is incorrect, it always prints blank. Do I need to do something special to assign the value to $count?
What am I doing wrong?
while read line; do
echo Begin $line
ssh $USER@$line << \ENDSSH
ls -d foo* | wc -l
count=`ls -d foo* | wc -l`
echo $count
ENDSSH
done < $1
The only problem with your script was that when the heredoc token is not quoted, the shell does variable expansion, so $count was being expanded by your local shell before the remote commands were shipped off...
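The difference can be seen locally, without ssh. In this sketch a child sh stands in for the remote host, and count is given a local value so the early expansion is visible (both are assumptions made for illustration):

```bash
count=local   # in the question this was unset locally, hence the blank line

# Unquoted delimiter: the local shell expands $count before sh sees the text.
run_unquoted() {
    sh <<ENDSSH
count=remote
echo "$count"
ENDSSH
}

# Quoted delimiter: the text is passed through untouched, so sh expands $count.
run_quoted() {
    sh <<\ENDSSH
count=remote
echo "$count"
ENDSSH
}

echo "unquoted: $(run_unquoted)"   # prints "unquoted: local"
echo "quoted:   $(run_quoted)"     # prints "quoted:   remote"
```

Writing the delimiter as 'ENDSSH' has the same effect as \ENDSSH.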
