read printf format from a bash var - bash

I have a bash script I'm happy with::
$ printf ' Number of xml files: %s\n' `find . -name '*.xml' | wc -l`
42
$
then the message became longer:
$ printf ' Very long message here about number of xml files: %s\n' `find . -name '*.xml' | wc -l`
42
$
So I try to put it in a MSG var to stay at 80cols::
$ MSG=' Number of xml files after zip-zip extraction: %s\n'
$ printf $MSG `find xml_out -name '*.xml' | wc -l`
with no success::
$ printf $MSG `find xml_out -name '*.xml' | wc -l`
Number$
$

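You need to put the variable inside double quotes. Unquoted, $MSG is split on whitespace, so printf takes only the first word (Number) as its format:
printf "$MSG" `ls | wc -l`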

You can do it this way:
msg=' Number of xml files after zip-zip extraction: %s\n'
printf "$msg" "$(find xml_out -name '*.xml' -exec printf '.' \; | wc -c)"
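msg should be quoted in the printf command.
The find command prints one dot per file and wc -c counts the dots. This avoids the pipeline with wc -l, which has issues when a filename contains newlines, spaces or wildcard characters.
Avoid all-uppercase variable names in shell scripts; they can clash with environment and shell variables.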
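The difference between the quoted and unquoted forms can be seen without any xml files at all. This is a minimal sketch; show_args is a hypothetical helper (not part of the original answers) that prints each argument it receives in brackets:

```shell
#!/bin/bash
# show_args is a hypothetical helper: it prints every argument it
# receives inside brackets, so word splitting becomes visible.
show_args() { printf '[%s]' "$@"; printf '\n'; }

msg=' Number of xml files: %s\n'

# Unquoted: the shell splits $msg on whitespace, so a command sees
# "Number" as its first argument and the other words as further arguments.
show_args $msg 42      # prints [Number][of][xml][files:][%s\n][42]

# Quoted: the whole string stays one argument and can serve as the format.
show_args "$msg" 42    # prints [ Number of xml files: %s\n][42]
printf "$msg" 42       # prints " Number of xml files: 42"
```

This matches the output in the question: printf received Number as its format, which contains no %s and no newline.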

Related

find and grep / zgrep / lzgrep progress bar

I would like to add a progress bar to this command line:
find . \( -iname "*.bz" -o -iname "*.zip" -o -iname "*.gz" -o -iname "*.rar" \) -print0 | while read -d '' file; do echo "$file"; lzgrep -a stringtosearch\.anything "$file"; done
The progress should be calculated on the total size of the compressed files (not on a single file).
Of course, it can be a script too.
I would also like to add other progress bars, if possible:
The total number of files processed (example 3 out of 21)
The percentage of progress of the single file
Can anybody help me please?
Here is an example of what it should look like (example from here):
tar cf - /folder-with-big-files -P | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | gzip > big-files.tar.gz
Multiple progress bars (example from here):
pv -cN orig < foo.tar.bz2 | bzcat | pv -cN bzcat | gzip -9 | pv -cN gzip > foo.tar.gz
Thanks,
This is the first time I've ever heard of pv, and it's not on any machine I have access to. Assuming it needs to know a total at startup and then a number on each iteration of a command, you could do something like this to get a progress bar per file processed:
IFS= readarray -d '' files < <(find . -whatever -print0)
printf '%s\n' "${files[@]}" | pv -l -s "${#files[@]}" | command
The first line gives you an array of files, so you can then use "${#files[@]}" to provide pv its initial total value (looks like you use -s value for that?) and then do whatever you normally do to get progress as each file is processed. The -l option makes pv count lines rather than bytes, so the total is compared against the number of file names passing through the pipe.
I don't see any way to tell pv that the pipe it's reading from is NUL-terminated rather than newline-terminated so if your files can have newlines in their names then you'd have to figure out how to solve that problem.
To additionally get progress on a single file you might need something like:
IFS= readarray -d '' files < <(find . -whatever -print0)
printf '%s\n' "${files[@]}" |
pv -l -s "${#files[@]}" |
xargs -n 1 -I {} sh -c 'pv {} | command'
I don't have pv so all of the above is untested so check the syntax, especially since I've never heard of pv :-).
Thanks to Max C., I found a solution for the main question:
find ./ -type f \( -iname '*.gz' -o -iname '*.bz' \) | (tot=0;while read fname; do s=$(stat -c%s "$fname"); if [ ! -z "$s" ] ; then echo "$fname"; tot=$(($tot+$s)); fi; done; echo $tot) | tac | (read size; xargs -i{} cat "{}" | pv -s $size | lzgrep -a something -)
But this works only for gz and bz files; now I have to extend it to use a different tool according to the extension.
I'm going to try Ed's solution too.
Thanks to Ed and Max C., here is version 0.2.
This version works with zgrep, but not with lzgrep. :-\
#!/bin/bash
echo -n "collecting dump... "
IFS= readarray -d '' files < <(find . \( -iname "*.bz" -o -iname "*.gz" \) -print0)
echo done
echo "Calculating archives size..."
tot=0
for line in "${files[@]}"; do
s=$(stat -c\%s "$line")
if [ ! -z "$s" ]
then
tot=$(($tot+$s))
fi
done
(for line in "${files[@]}"; do
s=$(stat -c\%s "$line")
if [ ! -z "$s" ]
then
echo "$line"
fi
done
) | xargs -i{} sh -c 'echo Processing file: "{}" 1>&2 ; cat "{}"' | pv -s $tot | zgrep -a anything -
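For the "files processed (example 3 out of 21)" counter asked for in the question, pv is not strictly needed: once the file list is in an array, the total is known up front. A minimal sketch, assuming bash; process_with_counter is a hypothetical helper name, and the command it runs is whatever you pass before the -- separator:

```shell
#!/bin/bash
# process_with_counter is a hypothetical helper: it runs a command once per
# file and reports "[n/total] file" on stderr before each run.
# Usage: process_with_counter COMMAND [ARGS...] -- FILE...
process_with_counter() {
    local -a cmd=()
    # Everything before "--" is the command and its fixed arguments.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        cmd+=("$1")
        shift
    done
    [ "$#" -gt 0 ] && shift   # drop the "--" separator
    local total=$# n=0 f
    for f in "$@"; do
        n=$((n + 1))
        printf '[%d/%d] %s\n' "$n" "$total" "$f" >&2
        "${cmd[@]}" "$f"
    done
}

# Possible use with the array built in the script above:
#   process_with_counter zgrep -a anything -- "${files[@]}"
```

The counter goes to stderr so it does not mix with the matches written to stdout.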

Find and count compressed files by extension

I have a bash script that counts compressed files by file extension and prints the count.
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$($FIND_COMPRESSED | wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
However, the script works only if there are NO files with .deb .tar .gz .tgz .zip.
If there are some, say test.zip and test.tar in the current folder, I get this error:
./arch.sh: line 5: 1: command not found
Yet, if I copy the contents of the FIND_COMPRESSED variable into the COUNT_LINES, all works fine.
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$'| wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
What am I missing here?
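When you use the variable like that, bash expands it and tries to execute its contents as a command, which is why it fails when it has contents: the first word of the output, the count 1, is taken as the command name. When it's empty, there is nothing to run, wc simply returns 0 and it marches on.
Thus, you need to feed the contents of the variable to a line counter instead. Quote the variable so its newlines survive, and count with grep -c '' so that an empty variable gives 0 (echo would still emit one empty line):
COUNT_LINES=$(printf '%s' "$FIND_COMPRESSED" | grep -c '')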
But, while we're at it, you can also simplify the other line with something like this:
FIND_COMPRESSED=$(find . -type f \( -iname "*deb" -or -iname "*tgz" -or -iname "*tar*" \)) #etc
You can also do:
mapfile -t FIND_COMPRESSED < <(find . -type f -regextype posix-extended -regex ".*(deb|tgz|tar|gz|zip)$" -exec bash -c '[[ "$(file "$1")" =~ compressed ]] && echo "$1"' _ {} \;)
COUNT_LINES=${#FIND_COMPRESSED[@]}
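The per-extension count from the question can also be built without storing pipeline output in a variable and re-reading it. The following is only a sketch, assuming bash 4 or later (associative arrays and ${var,,}); count_archives is a hypothetical helper name, and the extension list is the one from the question:

```shell
#!/bin/bash
# count_archives is a hypothetical helper: it prints "COUNT EXT" for each
# archive extension found under the given directory (default: .).
count_archives() {
    local dir=${1:-.} f ext
    local -A counts=()
    while IFS= read -r -d '' f; do
        ext=${f##*.}      # text after the last dot
        ext=${ext,,}      # lower-case it, so .ZIP and .zip count together
        counts[$ext]=$(( ${counts[$ext]:-0} + 1 ))
    done < <(find "$dir" -type f \( -iname '*.deb' -o -iname '*.tgz' \
                 -o -iname '*.tar' -o -iname '*.gz' -o -iname '*.zip' \) -print0)
    for ext in "${!counts[@]}"; do
        printf '%d %s\n' "${counts[$ext]}" "$ext"
    done | sort -rn
}

found=$(count_archives .)
if [[ -z $found ]]; then
    echo "No archived files found!"
else
    echo "$found"
fi
```

Testing the variable for emptiness with -z replaces the separate COUNT_LINES step.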

Bash new line feed in results [duplicate]

Trying to create a mysql backup script.
However, I am finding that I am getting line feeds in the results:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
do echo "'$i'";
done
And the results show:
'/home/site1/public_html/folders/wp-config.php'
'/home/site2/public_html/New'
'Website/wp-config.php'
'/home/site3/public_html/wp-config.php'
'/home/site4/public_html/old'
'website/wp-config.php'
'/home/site5/public_html/wp-config.php'
Doing an ls from the command line, we see for the folders in question:
New\ website
old\ website
and it seems to be treating the '\' as a newline character.
OK.. Doing some research:
https://stackoverflow.com/a/5928254/175063
${foo/ /.}
Updating for what we may want:
${i/\ /}
The code now becomes:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" |${i/\ /});
do echo "'$i'";
done
Ref. https://tomjn.com/2014/03/01/wordpress-bash-magic/
Ultimately, I really want something like this:
#!/bin/bash
# delete files older than 7 days
## find /home/dummmyacount/backups/ -type f -name '*.7z' -mtime +7 -exec rm {} \;
# set a date variable
DT=$(date +"%m-%d-%Y")
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
do
WPDBNAME=`cat $i | grep DB_NAME | cut -d \' -f 4`
WPDBUSER=`cat $i | grep DB_USER | cut -d \' -f 4`
WPDBPASS=`cat $i | grep DB_PASSWORD | cut -d \' -f 4`
echo "$i";
#echo $File;
#mysqldump...
done
You can do this:
find . -type f -name "wp-config.php" -print0 | while IFS= read -r -d '' f
do
printf '[%s]\n' "$f"
done
This uses the NUL character as the delimiter, so paths are never split on spaces or other special characters.
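Applied to the backup script from the question, the same loop could look roughly like this. It is a sketch only: wp_db_names is a hypothetical helper name, and the extraction assumes a single-quoted define('DB_NAME', '...') line, as the cut command in the question does:

```shell
#!/bin/bash
# wp_db_names is a hypothetical helper: for every wp-config.php under the
# given directory it prints "PATH<TAB>DB_NAME".
wp_db_names() {
    local dir=$1 f name
    while IFS= read -r -d '' f; do
        # Same extraction as in the question: the 4th field when the
        # DB_NAME line is split on single quotes.
        name=$(grep DB_NAME "$f" | cut -d "'" -f 4)
        printf '%s\t%s\n' "$f" "$name"
    done < <(find "$dir" -type f -name 'wp-config.php' -print0)
}

# Possible use: wp_db_names /home
```

Reading from a process substitution instead of a pipe keeps the loop in the current shell, so variables set inside it remain available afterwards.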

List files (recursively) which have no matching pair

I have a set of files in multiple directories. Most of them have a related pair with a different extension and the same base name. The related files are always within the same directory. I need to list only files (and path) without pairs within a directory including all sub directories. How can I do that in bash?
file1.xxx
file1.yyy
file2.xxx
file2.zzz
file3.xxx
file3.aaa
file4.xxx
Any help is much appreciated!
You could use find and pipe to perl to sort the data
find . -type f -print0 |\
perl -0 -l012 -ne 'if(/.*\/(.*)\./){$x{$1}++;$y{$1}=$_}
}{for(keys %x){print $y{$_} if $x{$_}==1}'
This adds the name without its suffix to a hash and increments its count for each match, whilst adding the full line to another hash with the same key.
In the end it just checks which have a single match and prints.
As the filenames are null delimited it should work with all filenames.
You can list all the files under your directory and then, for each one, count how many files in the same directory tree share its path name (excluding the extension).
If that count is one or less, the file has no "companion" files:
for f in $(find -type f); do
c=$(find -wholename "$(echo $f | rev | cut --complement -d . -f 1 | rev).*" | wc -l);
if [ "$c" -le "1" ]; then echo $f; fi;
done
Edit:
It might be more readable if the pattern is composed on a separate line:
for f in $(find -type f); do
compPattern="$(echo $f | rev | cut --complement -d . -f 1 | rev).*"
c=$(find -wholename "$compPattern" | wc -l);
if [ "$c" -le "1" ]; then echo $f; fi;
done
Edit (2)
To avoid parsing the output of the find you can use read:
find -type f | while read f; do
if [ $(find -wholename "$(echo $f | rev | cut --complement -d . -f 1 | rev).*" | wc -l) -le "1" ]; then echo $f; fi;
done
Edit(3)
To handle special chars, spaces etc. you can use the following.
while IFS= read -r -d '' f ; do
c=$(find -wholename "$(echo "$f" | rev | cut --complement -d . -f 1 | rev).*" | wc -l);
if [ "$c" -le "1" ]; then echo "$f"; fi;
done < <(find -type f -print0)
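The loops above run one extra find per file. A single pass is also possible by counting how often each path without its extension occurs. This is a sketch under assumptions: bash 4 or later, list_unpaired is a hypothetical helper name, and files without any extension are ignored:

```shell
#!/bin/bash
# list_unpaired is a hypothetical helper: it prints every file under the
# given directory whose path, minus the last extension, occurs only once.
list_unpaired() {
    local dir=${1:-.} f stem
    local -A count=() last=()
    while IFS= read -r -d '' f; do
        stem=${f%.*}      # path with the last extension removed
        count[$stem]=$(( ${count[$stem]:-0} + 1 ))
        last[$stem]=$f    # remember the full path for printing
    done < <(find "$dir" -type f -name '*.*' -print0)
    for stem in "${!count[@]}"; do
        if [ "${count[$stem]}" -eq 1 ]; then
            printf '%s\n' "${last[$stem]}"
        fi
    done | sort
}

# Possible use: list_unpaired .
```

Because the key includes the directory part, two files with the same base name in different directories are not treated as a pair.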

Find all files that contain both string1 and string2

The following script finds and prints the names of all files that contain either string1 or string2.
However, I could not figure out how to change this code so that it prints only those files that contain both string1 and string2. Kindly suggest the required change.
number=0
for file in `find -name "*.txt"`
do
if [ "`grep "string2\|string1" $file`" != "" ] # change has to be done here
then
echo "`basename $file`"
number=$((number + 1))
fi
done
echo "$number"
Using grep and cut:
grep -H string1 input | grep -E '[^:]*:.*string2' | cut -d: -f1
You can use this with the find command:
find -name '*.txt' -exec grep -H string1 {} \; | grep -E '[^:]*:.*string2' | cut -d: -f1
And if the patterns are not necessarily on the same line:
find -name '*.txt' -exec grep -l string1 {} \; | \
xargs -n 1 -I{} grep -l string2 {}
This solution can handle files with spaces in their names:
number=0
oldIFS=$IFS
IFS=$'\n'
for file in `find -name "*.txt"`
do
if grep -l "string1" "$file" >/dev/null; then
if grep -l "string2" "$file" >/dev/null; then
basename "$file"
number=$((number + 1))
fi
fi
done
echo $number
IFS=$oldIFS
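Another way to avoid looping over find output is to let find run both tests itself: -exec acts as a test that succeeds only when the command exits with status 0, so chaining two grep -q calls prints only files that pass both. A minimal sketch; files_with_both is a hypothetical helper name, and -F (fixed strings instead of regular expressions) is an assumption about what is wanted:

```shell
#!/bin/bash
# files_with_both is a hypothetical helper: it prints the .txt files under
# DIR that contain both strings, possibly on different lines.
# Usage: files_with_both DIR STRING1 STRING2
files_with_both() {
    find "$1" -type f -name '*.txt' \
        -exec grep -qF -e "$2" {} \; \
        -exec grep -qF -e "$3" {} \; \
        -print
}

# Possible use: files_with_both . string1 string2
```

Piping the result to wc -l gives the count that the original script keeps in number, as long as no file name contains a newline.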

Resources