Find all files that contain both string1 and string2 - shell

The following script finds and prints the names of all files that contain either string1 or string2.
However, I could not figure out how to change this code so that it prints only those files that contain both string1 and string2. Kindly suggest the required change.
number=0
for file in `find -name "*.txt"`
do
if [ "`grep "string2\|string1" $file`" != "" ] # change has to be done here
then
echo "`basename $file`"
number=$((number + 1))
fi
done
echo "$number"

Using grep and cut:
grep -H string1 input | grep -E '[^:]*:.*string2' | cut -d: -f1
You can use this with the find command:
find -name '*.txt' -exec grep -H string1 {} \; | grep -E '[^:]*:.*string2'
And if the patterns are not necessarily on the same line:
find -name '*.txt' -exec grep -l string1 {} \; | \
xargs -n 1 -I{} grep -l string2 {}
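A NUL-delimited variant of the two-stage pipeline above may be safer when file names contain spaces or newlines. This is only a sketch: it assumes GNU grep (for -Z) and an xargs that supports -0 and -r, and files_with_both is a hypothetical helper name, not part of the original answer.

```shell
# Print every *.txt file under a directory that contains both patterns,
# not necessarily on the same line.
# Assumes GNU grep (-Z) and an xargs supporting -0 and -r.
# files_with_both is a hypothetical helper name.
files_with_both() {
    local dir=$1 first=$2 second=$3
    # Stage 1: list files containing the first pattern, NUL-terminated.
    # Stage 2: keep only those that also contain the second pattern.
    find "$dir" -type f -name '*.txt' -exec grep -lZ -e "$first" {} + |
        xargs -0 -r grep -l -e "$second"
}
```

It could be called as: files_with_both . string1 string2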

This solution can handle files with spaces in their names:
number=0
oldIFS=$IFS
IFS=$'\n'
for file in `find -name "*.txt"`
do
if grep -l "string1" "$file" >/dev/null; then
if grep -l "string2" "$file" >/dev/null; then
basename "$file"
number=$((number + 1))
fi
fi
done
echo $number
IFS=$oldIFS
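Setting IFS to a newline still breaks on file names that contain newlines. A possible alternative, assuming bash (for read -d '' and process substitution), is to read NUL-delimited names from find -print0. The helper name list_and_count_both is hypothetical.

```shell
# Print the base name of each *.txt file containing both patterns,
# followed by the number of such files. Requires bash (read -d '').
# list_and_count_both is a hypothetical helper name.
list_and_count_both() {
    local dir=$1 first=$2 second=$3 file number=0
    # find -print0 and read -d '' delimit names with NUL, so spaces and
    # newlines in file names survive without changing IFS globally.
    while IFS= read -r -d '' file; do
        if grep -q -e "$first" "$file" && grep -q -e "$second" "$file"; then
            basename "$file"
            number=$((number + 1))
        fi
    done < <(find "$dir" -type f -name '*.txt' -print0)
    echo "$number"
}
```

Because the loop reads from a process substitution rather than a pipe, it runs in the current shell and number keeps its value after the loop ends.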

Related

Find and count compressed files by extension

I have a bash script that counts compressed files by file extension and prints the count.
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$($FIND_COMPRESSED | wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
However, the script works only if there are NO files with the extensions .deb, .tar, .gz, .tgz, or .zip.
If there are some, say test.zip and test.tar in the current folder, I get this error:
./arch.sh: line 5: 1: command not found
Yet if I copy the pipeline from FIND_COMPRESSED into COUNT_LINES, everything works fine:
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$'| wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
What am I missing here?
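Writing $($FIND_COMPRESSED | wc -l) makes the shell expand the variable and try to run its contents as a command. The first word of the output (the count, 1) is taken as the command name, hence "1: command not found". When the variable is empty there is nothing to run, wc simply reports 0, and the script carries on.
Thus, you need to feed the variable's contents to wc as data. Note that echo prints a newline even for an empty variable, which would make the count 1 rather than 0, so printf with grep -c '' is used here:
COUNT_LINES=$(printf '%s' "$FIND_COMPRESSED" | grep -c '')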
While we're at it, you can also simplify the other line with something like this (the parentheses make -type f apply to every pattern, not just the first):
FIND_COMPRESSED=$(find . -type f \( -iname "*deb" -o -iname "*tgz" -o -iname "*tar*" \)) #etc
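Since the script only needs to know whether anything was found, one option is to skip the line count and test the captured text directly. The sketch below reuses the pipeline from the question inside a hypothetical function named summarise_archives; it assumes bash for [[ ]].

```shell
# Summarise archive files by extension, or report that none exist.
# summarise_archives is a hypothetical helper name.
summarise_archives() {
    local dir=${1:-.} summary
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # aborting scripts that run under "set -e".
    summary=$(find "$dir" -type f | sed -e 's/.*\.//' | sort | uniq -c |
        sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$') || true
    # Test the captured text itself instead of executing it or counting lines.
    if [[ -z $summary ]]; then
        echo "No archived files found!"
    else
        echo "$summary"
    fi
}
```

Calling summarise_archives with no argument scans the current directory, as the original script does.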
You can also do:
mapfile FIND_COMPRESSED < <(find . -type f -regextype posix-extended -regex ".*(deb|tgz|tar|gz|zip)$" -exec bash -c '[[ "$(file "$1")" =~ compressed ]] && echo "$1"' _ {} \;)
COUNT_LINES=${#FIND_COMPRESSED[@]}

read printf format from a bash var

I have a bash command I'm happy with:
$ printf ' Number of xml files: %s\n' `find . -name '*.xml' | wc -l`
42
$
Then the message became longer:
$ printf ' Very long message here about number of xml files: %s\n' `find . -name '*.xml' | wc -l`
42
$
So I tried to put it in a MSG variable to stay within 80 columns:
$ MSG=' Number of xml files after zip-zip extraction: %s\n'
$ printf $MSG `find xml_out -name '*.xml' | wc -l`
With no success:
$ printf $MSG `find xml_out -name '*.xml' | wc -l`
Number$
$
You need to put the variable inside double quotes; unquoted, it is split on whitespace and printf receives only the first word as its format:
printf "$MSG" `ls | wc -l`
You can do it this way:
msg=' Number of xml files after zip-zip extraction: %s\n'
printf "$msg" "$(find xml_out -name '*.xml' -exec printf '.' \; | wc -c)"
msg should be quoted in the printf command.
Printing one dot per file and counting with wc -c avoids the pipeline with wc -l, which miscounts when a file name contains newlines.
Avoid all-uppercase variable names in shell scripts, since they can clash with environment and shell variables.
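Putting those three notes together, a minimal sketch might look like the following. The function name report_xml_count is hypothetical, and the arithmetic expansion is only there because some wc implementations pad their output with spaces.

```shell
# Print a count of *.xml files using a format string held in a variable.
# report_xml_count is a hypothetical helper name.
report_xml_count() {
    local dir=$1 count
    local msg=' Number of xml files after zip-zip extraction: %s\n'
    # One dot per file, counted in bytes, so odd file names cannot skew it.
    count=$(find "$dir" -name '*.xml' -exec printf '.' \; | wc -c)
    # Quoting "$msg" keeps the whole format string as a single argument;
    # $((count)) strips the padding some wc implementations add.
    printf "$msg" "$((count))"
}
```

It could be called as: report_xml_count xml_out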

How to find xml files and comment lines containing string 'dark'?

<hello>
<world>dark</world>
</hello>
So far I have tried...
find . -name "*.xml" -print0 | while read -d $'\0' file; do awk '{print "<!--"$0"-->"}' "$file"; done
... which fails.
But somehow awk on a single file...
awk '{print "<!--"$0"-->"}' "$file"
... works just fine.
To cover the condition "to find xml files and comment lines containing string 'dark'" exactly:
find + grep + sed solution:
find . -type f -name "*.xml" -exec sh -c \
'if grep -wq "dark" "$1"; then sed -i "s/.*dark.*/<!--&-->/" "$1"; fi' _ {} \;
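Because sed -i rewrites files in place, it may be worth previewing the substitution on one file first. The sketch below applies the same sed expression without -i, so the result goes to standard output and the file is left untouched; preview_commented is a hypothetical helper name.

```shell
# Print a file with every line containing "dark" wrapped in an XML comment.
# The file itself is not modified. preview_commented is a hypothetical name.
preview_commented() {
    sed 's/.*dark.*/<!--&-->/' "$1"
}
```

For the sample document in the question, the middle line would come out as <!--<world>dark</world>--> and the other two lines would be printed unchanged.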
You'd better not use awk for parsing XML files. Instead use an XML parser.
Here is an example with xmllint:
find -name "*.xml" -exec bash -c 'xmllint --xpath "//*/world/text()" "$1" >/dev/null 2>&1 && echo "$1"' _ {} \;
The xpath expression looks for the tag <world> nested in any other tag.

iterate over lines in file then find in directory

I am having trouble looping and searching. It seems that the loop is not waiting for find to finish. What am I doing wrong?
I made a loop that reads a file line by line. I then want to use each line as a name and search a directory for a folder with that name. If it exists, copy it to a drive.
#!/bin/bash
DIRFIND="$2"
DIRCOPY="$3"
if [ -d $DIRFIND ]; then
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
FILE=`find "$DIRFIND" -type d -name "$line"`
if [ -n "$FILE" ]; then
echo "Found $FILE"
cp -a "$FILE" "$DIRCOPY"
else
echo "$line not found."
fi
done < "$1"
else
echo "No such file or directory"
fi
Have you tried xargs?
Proposed Solution
cat filenamelist | xargs -n1 -I {} find . -type d -name {} -print | xargs -n1 -I {} mv {} .
The above pipes the list of names into find one at a time; when a matching directory is found, find prints its path and passes it to the second xargs, which moves it. Use cp -a instead of mv if you want to copy, as in the question.
Expansion
file = yogo
yogo -> | xargs -n1 -I yogo find . -type d -name yogo -print | xargs -n1 -I {} mv ./<path>/yogo .
I hope the above helps. Note that xargs has the advantage that you do not exceed the command-line length limit.
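For comparison, the same idea can be sketched as a loop that copies rather than moves, which is what the question asked for. This is an illustration under assumptions, not the answerer's method: copy_named_dirs is a hypothetical helper name, the destination is assumed to lie outside the search directory, and cp -a is taken from the question's script.

```shell
# Copy every directory under search_dir whose name appears in a list file.
# copy_named_dirs is a hypothetical helper name.
copy_named_dirs() {
    local list=$1 search_dir=$2 dest=$3 name
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue          # skip blank lines
        # -name treats the value as a glob pattern; -prune stops find from
        # descending into a directory it has just matched.
        find "$search_dir" -type d -name "$name" -prune \
            -exec cp -a {} "$dest" \;
    done < "$list"
}
```

It could be called as: copy_named_dirs names.txt /path/to/search /path/to/drive (all three arguments are placeholders).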

check if variable is set and grep

I have three variables, $a, $b, and $c, and I don't know whether all of them are set. I want each variable included in the grep query only if it is set. How do I do this?
find . -iname "*.txt" -type f | xargs grep -inw "$a" -sl | xargs grep -inw "$b" -sl | xargs grep -inw "$c" -sl
find .* -iname "*.txt" -type f | xargs grep -iw "$a|$b|$c" -sl
You can collect multiple -e arguments in an array:
args=()
for x in "$a" "$b" "$c"; do
[[ -n $x ]] && args+=(-e "$x")
done
[[ ${#args[@]} -gt 0 ]] && find . -iname "*.txt" -type f | xargs grep -iw "${args[@]}" -sl
Note: Passing -e "$a" -e "$b" -e "$c" is practically equivalent to the extended regex "($a|$b|$c)" and is arguably safer. Also, if you don't want "$a", "$b", and "$c" to be interpreted as regular expressions, you can use fgrep or add the -F option, which is not possible with "($a|$b|$c)".
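The array approach can be wrapped in a function that takes the variables as arguments. The sketch below assumes bash arrays and that the values should be matched as fixed strings (hence -F, as the note suggests); it also swaps the xargs pipe for find -exec ... {} + so that file names with spaces are handled. The name search_txt is hypothetical.

```shell
# List *.txt files containing any of the non-empty words given as arguments.
# search_txt is a hypothetical helper name.
search_txt() {
    local dir=$1; shift
    local -a args=()
    local x
    for x in "$@"; do
        # Empty or unset values contribute no -e option at all.
        [[ -n $x ]] && args+=(-e "$x")
    done
    # With nothing to search for, print nothing instead of running grep.
    (( ${#args[@]} > 0 )) || return 0
    find "$dir" -type f -iname '*.txt' -exec grep -F -iwls "${args[@]}" {} +
}
```

It could be called as: search_txt . "$a" "$b" "$c"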
