Bash: test mutual equality of multiple variables?

What is the right way to test if several variables are all equal?
if [[ $var1 = $var2 = $var3 ]] # syntax error
Is it necessary to write something like the following?
if [[ $var1 = $var2 && $var1 = $var3 && $var2 = $var3 ]] # cumbersome
if [[ $var1 = $var2 && $var2 = $var3 && $var3 = $var4 ]] # somewhat better
Unfortunately, the otherwise excellent Advanced Bash Scripting Guide and other online sources I could find don't provide such an example.
My particular motivation is to test if several directories all have the same number of files, using ls -1 $dir | wc -l to count files.
Note
"var1" etc. are example variables. I'm looking for a solution for arbitrary variable names, not just those with a predictable numeric ending.
Update
I've accepted Richo's answer, as it is the most general. However, I'm actually using Kyle's because it's the simplest and my inputs are guaranteed to avoid the caveat.
Thanks for the suggestions, everyone.

If you want to test equality of an arbitrary number of items (let's call them $item1 through $item5, though they could just as well be an array):
st=0
for i in "$item2" "$item3" "$item4" "$item5"; do
[ "$item1" = "$i" ]
st=$(( $? + st ))
done
if [ $st -eq 0 ]; then
echo "They were all the same"
fi
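If the values already live in an array, a variant that quotes everything and stops at the first mismatch might look like this (a sketch; the array name items is hypothetical):
all_equal=1
items=("$item1" "$item2" "$item3" "$item4" "$item5")
for i in "${items[@]:1}"; do              # compare everything against items[0]
    if [ "${items[0]}" != "$i" ]; then
        all_equal=0
        break                             # first mismatch settles it
    fi
done
if [ "$all_equal" -eq 1 ]; then
    echo "They were all the same"
fi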

If they are single words you can get really cheap about it.
varUniqCount=`echo "${var1} ${var2} ${var3} ${var4}" | tr ' ' '\n' | sort -u | wc -l`
if [ ${varUniqCount} -gt 1 ]; then
echo "Do not match"
fi
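The caveat: because the values are word-split onto separate lines before sorting, two different multi-word values can collapse into the same word list. A quick sketch of the failure mode:
var1="a a"; var2="a"
echo "${var1} ${var2}" | tr ' ' '\n' | sort -u | wc -l   # prints 1: a false match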

Transitive method of inspection: since equality is transitive, comparing adjacent pairs is enough.
#!/bin/bash
var1=10
var2=10
var3=10
if [[ ($var1 == $var2) && ($var2 == $var3) ]]; then
echo "yay"
else
echo "nay"
fi
Output:
[jaypal:~/Temp] ./s.sh
yay
Note:
Since you have stated in your question that your objective is to test whether several directories have the same number of files, I thought of the following solution. I know this isn't something you requested, so please feel free to disregard it.
Step1:
Identify the number of files in a given directory. This command will look inside sub-dirs too, but that can be controlled using the -maxdepth option of find.
[jaypal:~/Temp] find . -type d -exec sh -c "printf {} && ls -1 {} | wc -l " \;
. 9
./Backup 7
./bash 2
./GTP 22
./GTP/ParserDump 11
./GTP/ParserDump/ParserDump 1
./perl 7
./perl/p1 2
./python 1
./ruby 0
./scripts 22
Step2:
This can be combined with Step1 as we are just redirecting the content to a file.
[jaypal:~/Temp] find . -type d -exec sh -c "printf {} && ls -1 {} | wc -l " \; > file.temp
Step3:
The following command reads file.temp twice and prints the directories that share their file count with at least one other directory.
[jaypal:~/Temp] awk 'NR==FNR && a[$2]++ {b[$2];next} ($2 in b)' file.temp file.temp | sort -k2
./GTP/ParserDump/ParserDump 1
./python 1
./bash 2
./perl/p1 2
./Backup 7
./perl 7
./GTP 22
./scripts 22
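For readers puzzling over the one-liner, here is the same awk logic spread out with comments:
awk '
    NR == FNR && a[$2]++ { b[$2]; next }   # pass 1: remember counts seen 2+ times
    $2 in b                                # pass 2: print dirs whose count repeats
' file.temp file.temp | sort -k2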

(edited to include delimiters to fix the problem noted by Keith Thompson)
Treating the variable values as strings, you can concatenate them along with a suitable delimiter and do one comparison:
if [[ "$var1|$var2|$var3" = "$var1|$var1|$var1" ]]
I used = instead of ==, but note that inside [[ ]] both operators do pattern matching when the right-hand side is unquoted; quoting the right-hand side, as above, forces a literal comparison.
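To see why the delimiter matters, pick values whose concatenations collide:
var1=aa var2=a var3=aaa
[[ "$var1$var2$var3" = "$var1$var1$var1" ]] && echo "false match"       # both sides are "aaaaaa"
[[ "$var1|$var2|$var3" = "$var1|$var1|$var1" ]] || echo "caught by delimiter"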

For your specific case, this should work:
distinct_values=$(for dir in this_dir that_dir another_dir ; do ls -1 "$dir" | wc -l ; done | uniq | wc -l)
if [ $distinct_values -eq 1 ] ; then
echo All the same
else
echo Not all the same
fi
Explanation:
ls -1 "$dir" lists the files and subdirectories in the directory, one per line (omitting dot files).
Piping the output through wc -l gives you the number of files in the directory.
Doing that consecutively for each directory in the list gives you a list consisting of the number of files in each directory; if there are 7 in each, this gives 3 lines, each consisting of the number 7.
Piping that through uniq eliminates consecutive duplicate lines.
Piping that through wc -l gives you the number of distinct lines, which will be 1 if and only if all the directories contain the same number of files.
Note that the output of the 4th stage doesn't necessarily give you the number of distinct numbers of files in the directories; uniq only removes adjacent duplicates, so if the inputs are 7 6 7, the two 7s won't be merged. But it will merge all lines into 1 only if they're all the same.
This is the power of the Unix command line: putting small tools together to do interesting and useful things. (Show me a GUI that can do that!)
For values stored in variables, replace the first line by:
distinct_values=$(echo "$this_var" "$that_var" "$another_var" | fmt -1 | uniq | wc -l)
This assumes that the values of the variables don't contain spaces.
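If you'd rather not assume space-free values or rely on uniq's adjacency requirement, a sketch using printf and sort -u (still assuming the values contain no newlines):
distinct_values=$(printf '%s\n' "$this_var" "$that_var" "$another_var" | sort -u | wc -l)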

Related

How to find files and count them (storing the info into a variable)?

I want to have a conditional behavior depending on the number of files found:
found=$(find . -type f -name "$1")
numfiles=$(printf "%s\n" "$found" | wc -l)
if [ $numfiles -eq 0 ]; then
echo "cannot access $1: No such file" > /dev/stderr; exit 2;
elif [ $numfiles -gt 1 ]; then
echo "cannot access $1: Duplicate file found" > /dev/stderr; exit 2;
else
echo "File: $(ls $found)"
head $found
fi
EDITED CODE (to reflect more precisely what I need)
Though, numfiles isn't equal to 2 (or more) when there are duplicate files found...
All the filenames are on one line, separated by a space.
On the other hand, this works correctly:
find . -type f -name "$1" | wc -l
but I don't want to do twice the recursive search in the if/then/else construct...
Adding -print0 doesn't help either.
What would?
PS- Simplifications or improvements are always welcome!
You want to find files with a name "$1" and count them:
find . 2>/dev/null | grep -c "/${1}$"
And store the result in a var. In one command:
numfiles=$(grep -c "/${1}$" <(find . 2>/dev/null))
Using $() to store data in a variable trims trailing whitespace. Since the final newline does not appear in the variable found, wc miscounts by one. You can recover the trailing newline with:
numfiles=$(printf "%s\n" "$found" | wc -l)
This miscounts if found is empty (and if any filenames contain a newline), emphasizing the fact that this entire approach is faulty. If you really want to go this way, you can try:
numfiles=$(test -z "$found" && echo 0 || printf "%s\n" "$found" | wc -l)
or pipe the output of find to a script that counts the output and prints a count along with the first filename:
find . -type f -name "$1" | tr '\n' ' ' |
awk '{c=NF; f=$1 } END {print c, f; exit c!=1}' c=0 |
while read count name; do
case $count in
0) echo no files >&2;;
1) echo 1 file $name;;
*) echo Duplicate files >&2;;
esac;
done
All of these solutions fail miserably if any pathnames contain whitespace. If that matters, you could change the awk to a perl script to make it easier to handle null separators and use -print0, but really I think you should stop worrying about special cases. (find -exec and find | xargs both fail to handle the zero-matches case cleanly. Arguably this awk solution doesn't handle it cleanly either.)
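For completeness, a whitespace-safe count is possible without perl, assuming GNU tools (a sketch): emit one NUL per path and count the NULs.
# -print0 terminates each path with NUL; tr deletes everything except NULs;
# wc -c then counts one byte per matching file.
numfiles=$(find . -type f -name "$1" -print0 | tr -cd '\000' | wc -c)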

Find number of files with prefixes in bash

I've been trying to count all files with a specific prefix, and then, if the number of files with that prefix does not equal 5, print the prefix.
To achieve this, I wrote the following bash script:
#!/bin/bash
for filename in $(ls); do
name=$(echo $filename | cut -f 1 -d '.')
num=$(ls $name* | wc -l)
if [$num != 5]; then
echo $name
fi
done
But I get this error (repeatedly):
./check_uneven_number.sh: line 5: [1: command not found
Thank you!
The if statement takes a command, runs it, and checks its exit status. Left bracket ([) by itself is a command, but you wrote [$num. The shell expands $num to 1, creating the word [1, which is not a command.
if [ $num != 5 ]; then
Your code loops over file names, not prefixes; so if there are three file names with a particular prefix, you will get three warnings, instead of one.
Try this instead:
# Avoid pesky ls
printf '%s\n' * |
# Trim to just prefixes
cut -d . -f 1 |
# Reduce to unique
sort -u |
while IFS='' read -r prefix; do
# Pay attention to quoting
num=$(printf '.%.0s' "$prefix"* | wc -c)
# Pay attention to spaces
if [ "$num" -ne 5 ]; then
printf '%s\n' "$prefix"
fi
done
Personally, I'd prefer case over the clunky if here, but it takes some getting used to.
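For the curious, that case version might look like this (a sketch, same behavior as the if above):
case $num in
    5) ;;                             # expected count: nothing to report
    *) printf '%s\n' "$prefix" ;;     # any other count: print the prefix
esac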

Output a file in two columns in BASH

I'd like to rearrange a file in two columns after the nth line.
For example, say I have a file like this here:
This is a bunch
of text
that I'd like to print
as two
columns starting
at line number 7
and separated by four spaces.
Here are some
more lines so I can
demonstrate
what I'm talking about.
And I'd like to print it out like this:
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
How could I do that with a bash command or function?
Actually, pr can do almost exactly this:
pr --output-tabs=' 1' -2 -t tmp1
↓
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
-2 for two columns; -t to omit page headers; and without the --output-tabs=' 1', it'll insert a tab for every 8 spaces it added. You can also set the page width and length (if your actual files are much longer than 100 lines); check out man pr for some options.
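For instance, since columns are filled per page, a longer file needs a larger page length to keep everything on one page; a sketch with assumed sizes:
pr -2 -t -l 200 -w 120 --output-tabs=' 1' tmp1   # page length 200 lines, page width 120 columns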
If you're fixed upon “four spaces more than the longest line on the left,” then you might need something a bit more complex.
The following works with your test input, but is getting to the point where the correct answer would be “just use Perl, already”:
#!/bin/sh
infile=${1:-tmp1}
longest=$(longest=0;
head -n $(( $( wc -l $infile | cut -d ' ' -f 1 ) / 2 )) $infile | \
while read line
do
current="$( echo $line | wc -c | cut -d ' ' -f 1 )"
if [ $current -gt $longest ]
then
echo $current
longest=$current
fi
done | tail -n 1 )
pr -t -2 -w$(( $longest * 2 + 6 )) --output-tabs=' 1' $infile
↓
This is a bunch and separated by four spa
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
… re-reading your question, I wonder if you meant that you were going to literally specify the nth line to the program, in which case, neither of the above will work unless that line happens to be halfway down.
Thank you chatraed and BRPocock (and your colleague). Your answers helped me think up this solution, which meets my needs.
function make_cols
{
file=$1 # input file
line=$2 # line to break at
pad=$(($3-1)) # spaces between cols - 1
len=$( wc -l < $file )
max=$(( $( wc -L < <(head -$(( line - 1 )) $file ) ) + $pad ))
SAVEIFS=$IFS;IFS=$(echo -en "\n\b")
paste -d" " <( for l in $( cat <(head -$(( line - 1 )) $file ) )
do
printf "%-""$max""s\n" $l
done ) \
<(tail -$(( len - line + 1 )) $file )
IFS=$SAVEIFS
}
make_cols tmp1 7 4
Could be optimized in many ways, but does its job as requested.
Input data (configurable):
file
num of rows borrowed from file for the first column
num of spaces between columns
format.sh:
#!/bin/bash
file=$1
if [[ ! -f $file ]]; then
echo "File not found!"
exit 1
fi
spaces_col1_col2=4
rows_col1=6
rows_col2=$(($(cat $file | wc -l) - $rows_col1))
IFS=$'\n'
ar1=($(head -$rows_col1 $file))
ar2=($(tail -$rows_col2 $file))
maxlen_col1=0
for i in "${ar1[@]}"; do
if [[ $maxlen_col1 -lt ${#i} ]]; then
maxlen_col1=${#i}
fi
done
maxlen_col1=$(($maxlen_col1+$spaces_col1_col2))
if [[ $rows_col1 -lt $rows_col2 ]]; then
rows=$rows_col2
else
rows=$rows_col1
fi
ar=()
for i in $(seq 0 $(($rows-1))); do
line=$(printf "%-${maxlen_col1}s\n" ${ar1[$i]})
line="$line${ar2[$i]}"
ar+=("$line")
done
printf '%s\n' "${ar[@]}"
Output:
$ bash format.sh myfile
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7

How to check that a file has more than 1 line in a BASH conditional?

I need to check if a file has more than 1 line. I tried this:
if [ `wc -l file.txt` -ge "2" ]
then
echo "This has more than 1 line."
fi
if [ `wc -l file.txt` >= 2 ]
then
echo "This has more than 1 line."
fi
These just report errors. How can I check if a file has more than 1 line in a BASH conditional?
The command:
wc -l file.txt
will generate output like:
42 file.txt
with wc helpfully telling you the file name as well. It does this in case you're checking out a lot of files at once and want individual as well as total stats:
pax> wc -l *.txt
973 list_of_people_i_must_kill_if_i_find_out_i_have_cancer.txt
2 major_acheivements_of_my_life.txt
975 total
You can stop wc from doing this by providing its data on standard input, so it doesn't know the file name:
if [[ $(wc -l <file.txt) -ge 2 ]]
The following transcript shows this in action:
pax> wc -l qq.c
26 qq.c
pax> wc -l <qq.c
26
As an aside, you'll notice I've also switched to using [[ ]] and $().
I prefer the former because it has fewer issues with backward compatibility (mostly to do with string splitting) and the latter because it's far easier to nest executables.
A pure bash (≥4) possibility using mapfile:
#!/bin/bash
mapfile -n 2 < file.txt
if ((${#MAPFILE[@]}>1)); then
echo "This file has more than 1 line."
fi
The mapfile builtin stores what it reads from stdin in an array (MAPFILE by default), one line per field. Using -n 2 makes it read at most two lines (for efficiency). After that, you only need to check whether the array MAPFILE has more than one field. This method is very efficient.
As a byproduct, the first line of the file is stored in ${MAPFILE[0]}, in case you need it. You'll find out that the trailing newline character is not trimmed. If you need to remove the trailing newline character, use the -t option:
mapfile -t -n 2 < file.txt
if [ `wc -l file.txt | awk '{print $1}'` -ge "2" ]
...
You should always check what each subcommand returns. Command wc -l file.txt returns output in the following format:
12 file.txt
You need first column - you can extract it with awk or cut or any other utility of your choice.
How about:
if read -r && read -r
then
echo "This has more than 1 line."
fi < file.txt
The -r flag is needed to ensure line continuation characters don't fold two lines into one, which would cause the following file to report one line only:
This is a file with _two_ lines, \
but will be seen as one.
change
if [ `wc -l file.txt` -ge "2" ]
to
if [ `cat file.txt | wc -l` -ge "2" ]
If you're dealing with large files, this awk command is much faster than using wc:
awk 'BEGIN{x=0}{if(NR>1){x=1;exit}}END{if(x>0){print FILENAME,"has more than one line"}else{print FILENAME,"has one or less lines"}}' file.txt
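An alternative that is also cheap on large files, since head reads at most two lines (like wc -l itself, it only counts newline-terminated lines):
if [ "$(head -n 2 file.txt | wc -l)" -ge 2 ]; then
    echo "This has more than 1 line."
fi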

ksh: shell script to search for a string in all files present in a directory at a regular interval

I have a directory (output) in unix (SUN). There are two types of files, created with a timestamp prefix in the file name. These files are created at a regular interval of 10 minutes.
e. g:
1. 20140129_170343_fail.csv (some lines are there)
2. 20140129_170343_success.csv (some lines are there)
Now I have to search for a particular string in all the files present in the output directory. If the string is found in the fail and success files, I have to count the number of lines present in those files and save the counts in the cnt_succ and cnt_fail variables. If the string is not found, I search the same directory again after a sleep of 20 seconds.
here is my code
#!/usr/bin/ksh
for i in 1 2
do
grep -l 0140127_123933_part_hg_log_status.csv /osp/local/var/log/tool2/final_logs/* >log_t.txt; ### log_t.txt will contain all the matching file list
while read line ### reading the log_t.txt
do
echo "$line has following count"
CNT=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT=`expr $CNT - 1`
echo $CNT
done <log_t.txt
if [ $CNT > 0 ]
then
exit
fi
echo "waiitng"
sleep 20
done
The problem I'm facing is that I'm not able to tell the _success and _fail files apart inside the loop and track their counts separately.
I'm not sure about ksh, but in bash, piping into while ... do ... done is notorious for losing whatever variables you set inside the loop; ksh might be similar.
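In bash the culprit is that each pipeline stage runs in a subshell (ksh93 runs the last stage in the current shell, so it's less affected). A quick bash demo:
count=0
printf 'a\nb\n' | while read -r line; do count=$((count+1)); done
echo "$count"   # prints 0: the loop ran in a subshell

count=0
while read -r line; do count=$((count+1)); done < <(printf 'a\nb\n')
echo "$count"   # prints 2: process substitution keeps the loop in this shell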
If I've understood your question right, SunOS has grep, uniq and sort AFAIK, so a possible alternative might be...
First of all:
$ cat fail.txt
W34523TERG
ADFLKJ
W34523TERG
WER
ASDTQ34T
DBVSER6
W34523TERG
ASDTQ34T
DBVSER6
$ cat success.txt
abcde
defgh
234523452
vxczvzxc
jkl
vxczvzxc
asdf
234523452
vxczvzxc
dlkjhgl
jkl
wer
234523452
vxczvzxc
And now:
egrep "W34523TERG|ASDTQ34T" fail.txt | sort | uniq -c
2 ASDTQ34T
3 W34523TERG
egrep "234523452|vxczvzxc|jkl" success.txt | sort | uniq -c
3 234523452
2 jkl
4 vxczvzxc
Depending on the input data, you may want to see what options sort has on your system. Examining uniq's options may prove useful too (it can do more than just count duplicates).
I think you want something like this (it will work in both bash and ksh):
#!/bin/ksh
while read -r file; do
lines=$(wc -l < "$file")
((sum+=$lines))
done < <(grep -Rl --include="[12]*_fail.csv" "somestring")
echo "$sum"
Note this will match files starting with 1 or 2 and ending in _fail.csv, not exactly clear if that's what you want or not.
e.g. Let's say I have two files, one starting with 1 (containing 4 lines) and one starting with 2 (containing 3 lines), both ending in _fail.csv, somewhere under my current working directory:
> abovescript
7
It's important to understand the grep options used here:
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all
symbolic links, unlike -r.
and
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by
POSIX.)
Finally I was able to find the solution. Here is the complete code:
#!/usr/bin/ksh
file_name="0140127_123933.csv"
for i in 1 2
do
grep -l $file_name /osp/local/var/log/tool2/final_logs/* >log_t.txt;
while read line
do
if [ $(echo "$line" |awk '/success/') ] ## will check the success file
then
CNT_SUCC=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_SUCC=`expr $CNT_SUCC - 1`
fi
if [ $(echo "$line" |awk '/fail/') ] ## will check the fail file
then
CNT_FAIL=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_FAIL=`expr $CNT_FAIL - 1`
fi
done <log_t.txt
if [ $CNT_SUCC -gt 0 ] && [ $CNT_FAIL -gt 0 ]
then
echo " Fail count = $CNT_FAIL"
echo " Success count = $CNT_SUCC"
exit
fi
echo "waitng for next search..."
sleep 10
done
Thanks everyone for your help.
I don't think I'm getting it right, but you can't differentiate the files? Maybe try:
#...
CNT=`expr $CNT - 1`
if [ $(echo $line | grep -o "fail") ]
then
#do something with fail count
else
#do something with success count
fi
