What is the correct Linux command using find, grep and sort? - shell

I am writing a command using find, grep and sort to display a sorted list of all files that contain 'some-text'.
I was unable to figure out the command.
Here is my attempt:
$ find . -type f | grep -l "some-text" | sort
but it didn't work.

You need to use something like xargs so that the file names coming through the pipe | are handed to grep as arguments; on its own, grep -l treats the piped data as the text to search, not as a list of files.
xargs: converts input from standard input into arguments to a command.
In my case, I have file1, file2 and file3, and they each contain the word test. This will do it:
za:tmp za$ find . -type f | xargs grep -l "test" | sort
./file1.txt
./file2.txt
./file3.txt
or
za:tmp za$ find . -type f | xargs grep -i "test" | sort
./file1.txt:some test string
./file2.txt:some test string
./file3.txt:some test string
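If the file names may contain spaces or other special characters, a NUL-delimited variant is safer (a sketch assuming GNU find and xargs):
find . -type f -print0 | xargs -0 grep -l "test" | sort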

This works on any Unix:
find . -type f -exec sh -c 'grep "some text" {} /dev/null > /dev/null 2>&1' \; -a -print 2> /dev/null | sort
A more optimized solution that works only with GNU grep:
find . -type f -exec grep -Hq "some-text" {} \; -a -print 2> /dev/null | sort

Related

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
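A minimal sketch of that fix, passing the found directory to bash as a positional parameter instead of splicing {} into the script (it still parses ls, so the caveat below applies):
# "$1" receives the directory name safely, whatever characters it contains
find . -type d -name bak -exec bash -c 'cd "$1" && ls -t *.pdf | tail -n 2 | xargs -r rm' _ {} \;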
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing the output of ls, which is inherently fragile.
You are explicitly asking for find -exec. Usually I would just combine find -exec with find -delete, but in your case only two files per directory should be deleted, so the only method is running a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This will sort the files by mtime:
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The above find while read loop as "one-liner"
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle NUL-terminated input, so I improved the other answers and made it work with nontrivial file names (GNU tools + bash only).
Replace realpath with rm to actually delete the files:
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat -c %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done

Find large strings in files and show the files

I would like to find large strings in the files of a directory and report the files that contain them:
awk 'length>50' /home/* -exec ls -l {} ';'
Thanks in advance
You need find for that, e.g.:
find . -type f -exec grep -Eq '.{50}' {} \; \
-exec ls -l {} +
With GNU find, -exec ls -l {} + could be replaced with just -ls.
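For example (assuming GNU find):
find . -type f -exec grep -Eq '.{50}' {} \; -ls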
And if the long ls listing is not necessary (requires GNU grep):
grep -Erl '.{50}' .
If your file names don't contain spaces then with POSIX tools:
grep -El '.{50}' /home/* | xargs ls -l
otherwise with GNU tools:
grep -ElZ '.{50}' /home/* | xargs -0 ls -l

Copying the result of a find operation in shell

I want to find a file, and simultaneously copy it to another directory like this:
cp (find . -name myFile | tail -n 1) dir/to/copy/to
But this says unexpected token `find'
Is there a better way to do this?
You may use a pipeline:
find . -name 'myFile' -print0 | tail -z -n 1 | xargs -0 -I {} cp {} /dir/to/copy/to/
The -print0 option makes find emit NUL-delimited file names, which copes with names containing whitespace or glob characters; tail -z and xargs -0 (GNU extensions) consume that NUL-delimited stream.
Two options are available:
Append the missing $() to evaluate the command (not sure of the purpose of the tail; it is only required if the same file exists in multiple directories):
cp $(find . -name myFile | tail -n 1) dir/to/copy/to
find . -name myFile -type f -exec cp {} dir/to/copy/to \;
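With GNU cp you can also copy every match in one cp invocation by naming the target directory first with -t:
find . -name myFile -type f -exec cp -t dir/to/copy/to {} +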

Is there a way to use an if condition inside a find command with the -exec option?

Scenario: there are multiple files in a folder. I'm trying to find a specific set of files, and if a given file has specific info, I need to grep that information.
Ex:
find /abc/test \( -type f -name 'tst*.txt' -mtime -1 \) -exec grep -Po '(?<type1).*(?=type1|(?<=type2).*(?=type2)' {} \;
I need to include if condition along with find -exec (if grep is true then print the above)
if grep -q 'case=1' <filename>; then
grep -Po '(?<type1).*(?=type1|(?<=type2).*(?=type2)'
fi
Thanks
You can use -exec in find as a condition -- the file matches if the command returns a successful exit code. So you can write:
find /abc/test -type f -name 'tst*.txt' -mtime -1 -exec grep -q 'case=1' {} \; -exec grep -Po '(?<=type1).*(?=type1)|(?<=type2).*(?=type2)' {} \;
Tests in find are evaluated left-to-right, so the second grep will only be executed if the first one was successful.
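The same mechanism in its simplest form, as a sketch: -print fires only for files where the grep test returned success.
find . -name '*.txt' -exec grep -q 'case=1' {} \; -print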
If your conditions are more complicated, you can put the whole shell code into a script, and execute the script with -exec. E.g. put this in myscript.sh:
#!/bin/sh
if grep -q 'case=1' "$1"; then
grep -Po '(?<=type1).*(?=type1)|(?<=type2).*(?=type2)' "$1";
fi
and then do:
find /abc/test -type f -name 'tst*.txt' -mtime -1 -exec ./myscript.sh {} \;
Since you're using grep's PCRE option -P, you can combine both searches into one grep using a lookahead:
find /abc/test -type f -name 'tst*.txt' -mtime -1 -exec grep -Po '(?=.*case=1).*\K((?<=type1).*(?=type1)|(?<=type2).*(?=type2))' {} +
By the way, the regex shown in your question is invalid; I've tried to correct it here.

input file is output file error

I'm trying to run the command
find . -name "*.csv" | xargs -I{} cat '{}' > Everything2.csv
and I get back:
cat: ./Everything2.csv: input file is output file
What does this mean?
You should run:
$ find . -name '*.csv' -exec cat {} + | tee Everything2.csv
The redirection operator (> or >>) is processed by the shell before the command runs, so Everything2.csv is created/truncated before find is even invoked, and find then picks it up as one of the *.csv input files. To avoid that, generate the list first and then pipe it into the file without a redirection operator; tee works fine in this case.
Alternatively, use sponge (from moreutils), which soaks up all of standard input before writing to its output file, so the output file is not created until the input has been read in full:
find . -name "*.csv" -exec cat {} + | sponge Everything2.csv
Tell find to exclude the output file from its results to prevent this loop:
find . -name Everything2.csv -prune -o \
-name '*.csv' -exec cat {} + \
>Everything2.csv
