Using find grep together - bash

I am trying to write a command that finds all files in the current folder and its subtree whose names end with a given suffix and that contain lines starting with a capital letter and ending with !. I spent some time looking for a solution, but I only found how to print the lines that start with a capital letter; I don't know how to add the '!' to the command.
This finds all the files that contain a line starting with a capital letter. How do I also require that the line ends with !?
find . -type f -exec grep -l "^[A-Z]+*" {} +

You can use this regex in grep:
find . -type f -exec grep -El '^[[:blank:]]*[A-Z].*![[:blank:]]*$' {} +

The following syntax may be useful. If I am reading the question right, this is the solution for you:
find . -type f -name '*!.*' -exec grep -l "^[A-Z]+*" {} +

find . -type f -name '*suffix' -print0 | xargs -r0 grep -le '^[A-Z].*!$' should do what you want.
It finds all files (-type f) with a suffixed name (-name '*suffix') and feeds those files to grep using xargs. The regular expression then finds lines that begin with a capital and end with an exclamation mark.
The problem here is mostly quoting. The ! is special in bash (and some other shells) because it triggers history expansion. You need to protect it, either by putting it in single quotes or by prepending a backslash.
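A minimal sketch of the approach above, run against a throwaway directory so it can be tried safely. The file names, the .txt suffix, and the matches variable are assumptions made for illustration. It uses [[:upper:]] instead of [A-Z] because the meaning of a range like A-Z can depend on the locale.

```shell
# Build a scratch tree (hypothetical file names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'Hello world!\n' > "$demo/a.txt"          # capital ... !   -> match
printf 'hello world!\n' > "$demo/sub dir/b.txt"  # lowercase start -> no match
printf 'Hello world\n'  > "$demo/c.txt"          # no trailing !   -> no match
printf 'Stop!\n'        > "$demo/d.log"          # wrong suffix    -> not searched

# -name restricts by suffix; grep -l lists files with at least one matching line.
# Single quotes keep the ! away from history expansion.
matches=$(cd "$demo" && find . -type f -name '*.txt' \
            -exec grep -l '^[[:upper:]].*!$' {} + | LC_ALL=C sort)
printf '%s\n' "$matches"
rm -rf "$demo"
```

Only a.txt satisfies all three conditions (suffix, capital first letter, trailing !), so it is the only name listed.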

Related

using find with variables in bash

I am new to bash scripting and need help:
I need to remove specific files from a directory. My goal is to find, in each subdirectory, a file called "filename.A" and remove all files that start with "filename" and have the extension B,
that is: "filename01.B", "filename02.B", etc.
I tried:
B_folders="$(find /someparentdirectory -type d -name "*.B" | sed 's# (.*\)/.*#\1#'|uniq)"
A_folders="$(find "$B_folders" -type f -name "*.A")"
for FILE in "$A_folders" ; do
A="${file%.A}"
find "$FILE" -name "$A*.B" -exec rm -f {}\;
done
I started to get problems when the directory names contained spaces.
Any suggestions for the right way to do it?
EDIT:
My goal is to find, in each subdirectory (which may have spaces in its name), files of the form "filename.A".
If such a file exists:
check whether "filename*.B" exists and remove it,
that is, remove "filename01.B", "filename02.B", etc.
In bash 4, it's simply
shopt -s globstar nullglob
for f in some_parent_directory/**/filename.A; do
rm -f "${f%.A}"*.B
done
If spaces are the only issue, you can modify the find inside the for loop as follows:
find "$FILE" -name "$A*.B" -print0 | xargs -0 rm
man find shows:
-print0
True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
and the xargs manual:
-0 Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end of file string, which is treated like any other argument. Useful when input items might contain white space, quote marks, or backslashes. The GNU find -print0 option produces input suitable for this mode.
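As another way to sidestep the whitespace problem, here is a hedged sketch that lets find hand each path to a small sh snippet as a positional parameter, so the shell never re-splits the name. The scratch tree and the names root and left are assumptions for illustration, not part of the original question.

```shell
# Scratch tree with spaces in the directory names (hypothetical layout).
root=$(mktemp -d)
mkdir -p "$root/dir one" "$root/dir two"
touch "$root/dir one/filename.A" "$root/dir one/filename01.B" "$root/dir one/filename02.B"
touch "$root/dir two/filename01.B"   # no filename.A beside it, so it must survive

# find passes each *.A path to sh as "$1", so spaces are never re-split.
# "${1%.A}" strips the extension; the unquoted *.B then globs the siblings.
find "$root" -type f -name '*.A' \
    -exec sh -c 'rm -f -- "${1%.A}"*.B' _ {} \;

left=$(cd "$root" && find . -type f | LC_ALL=C sort)
printf '%s\n' "$left"
rm -rf "$root"
```

After the run, only "dir one/filename.A" and "dir two/filename01.B" remain: the .B files are deleted only where a matching .A file exists.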

Find a file and delete the parent level dir

How would it be possible to delete the parent dir (only one level above) of a file that is found with a find command like
find . -type f -name "*.root" -size 1M
which returns
./level1/level1_chunk84/file.root
So I actually want to recursively delete the level1_chunk84 dir, for example.
thanks
You can try something like:
find . -type f -name "*.root" -size 1M -print0 | \
xargs -0 -I'{}' bash -c 'fpath=$1; rm -r -- "${fpath%%"$(basename -- "$fpath")"}"' _ '{}'
The find + xargs combo is very common. Please refer to man find, where you will find a few examples showing how to use them together.
All I did here was add the -print0 flag to your original find statement:
-print0
True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print
uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that
process the find output. This option corresponds to the -0 option of xargs.
Then I piped everything to xargs, which serves as a helper to craft further commands:
- execute everything in a bash subshell
- assign the file path to a variable, fpath
- strip the file's base name from the path, which leaves its directory
${parameter%%word}
Remove matching suffix pattern. The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches a
trailing portion of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the
shortest matching pattern (the ``%'' case) or the longest matching pattern (the ``%%'' case) deleted. If parameter is @ or *, the
pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is
an array variable subscripted with @ or *, the pattern removal operation is applied to each member of the array in turn, and the
expansion is the resultant list.
- and finally remove recursively
There is also a slightly shorter version of it:
find . -type f -name "*.root" -size 1M -print0 | \
xargs -0 -I'{}' bash -c 'fpath=$1; rm -r -- "${fpath%/*}"' _ '{}'
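A cautious variant, sketched here as an assumption rather than the answerer's exact command: record the matches first and delete afterwards, so find is never walking a directory that is being removed under it. The scratch tree, the list file, and the remaining variable are hypothetical, and the -size test is left out to keep the demo small.

```shell
top=$(mktemp -d)
list=$(mktemp)
mkdir -p "$top/level1/level1 chunk84" "$top/level1/level1_chunk85"
touch "$top/level1/level1 chunk84/file.root" "$top/level1/level1_chunk85/keep.txt"

# Pass 1: record the matches, NUL-delimited, and let find finish first.
find "$top" -type f -name '*.root' -print0 > "$list"

# Pass 2: "${f%/*}" strips the file name, leaving the parent directory.
# -f keeps rm quiet if two matches shared a parent that is already gone.
xargs -0 sh -c 'for f; do rm -rf -- "${f%/*}"; done' _ < "$list"

remaining=$(cd "$top" && find . -mindepth 1 | LC_ALL=C sort)
printf '%s\n' "$remaining"
rm -rf "$top" "$list"
```

The directory that held file.root is gone, while level1 itself and its other subdirectory are untouched.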

How to overwrite the contents in the sed, without having backup file

I have a command like this:
sed -i -e '/console.log/ s/^\/*/\/\//' *.js
which comments out all console.log statements. But there are two problems:
It keeps a backup file like test.js-e, which I don't want.
How do I apply the same process recursively to a folder?
You don't need the -e option in this particular case. Dropping it should solve your first problem, as -e seems to be taken as the suffix for the -i option (with BSD/macOS sed, where -i always expects a suffix, pass an explicit empty one instead: -i '').
For the second part, you can try something like this:
find . -type f -name "*.js" -print0 | while IFS= read -r -d '' i; do sed -i '/console.log/ s/^\/*/\/\//' "$i"; done
Use find to recursively find all .js files and do the replacement.
When checking sed's help, -i takes a suffix and uses it as a backup,
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
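and the backup name is the original file name plus -e, the second argument you are passing, so sed is taking -e as the backup suffix. That is how BSD/macOS sed behaves, because its -i always expects a suffix argument; there you can pass an empty suffix explicitly (with GNU sed, use -i alone):
sed -i '' '/console.log/ s/^\/*/\/\//' *.js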
As for the recursion, you could use find with -exec or xargs. Adjust the find command and test it before running it with -exec:
find . -name '*.js' -type f -exec sed -i '/console.log/ s/^\/*/\/\//' {} \;
From your original post I presume you just want to change a leading C-style comment opener like:
/*
to a double-slash one like:
//
right?
Then you can do it with this command:
find . -name "*.js" -type f -exec sed -i '/console.log/ s#^/\*#//#g' '{}' \;
Be aware that:
in sed the delimiter is normally /, but if escaping it becomes annoying because your pattern or replacement contains a /, you can change the delimiter to # or | as you like. I find this a very useful trick.
if what I presumed is what you want, be sure to escape the * character: the unescaped regex /* means "zero or more occurrences of /", so ^/* matches at the start of every line, which is very dangerous!
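Both points can be checked without touching any files. The sketch below feeds a few made-up sample lines through sed on standard input, so no -i is involved; sample, unescaped, and escaped are hypothetical variable names.

```shell
sample='/* console.log("a"); */
console.log("b");
var x = 1;'

# Unescaped: ^/* means "zero or more slashes at the start of the line",
# so it matches every console.log line, even one with no slash at all.
unescaped=$(printf '%s\n' "$sample" | sed '/console.log/ s#^/*#//#')

# Escaped: ^/\* matches only a literal "/*" at the start of the line.
escaped=$(printf '%s\n' "$sample" | sed '/console.log/ s#^/\*#//#')

printf '%s\n---\n%s\n' "$unescaped" "$escaped"
```

With the unescaped pattern, the second sample line gains a leading // even though it never started with a comment opener; with the escaped pattern it is left alone and only the line that really begins with /* is rewritten.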

Recursively find all files that match a certain pattern

I need to find (or more specifically, count) all files that match this pattern:
*/foo/*.doc
Where the first wildcard asterisk includes a variable number of subdirectories.
With GNU find you can use -regex, which (unlike -name) matches against the entire path:
find . -regex '.*/foo/[^/]*\.doc'
To just count the number of files:
find . -regex '.*/foo/[^/]*\.doc' -printf '%i\n' | wc -l
(The %i format code causes find to print the inode number instead of the filename; unlike the filename, the inode number is guaranteed to not have characters like a newline, so counting is more reliable. Thanks to @tripleee for the suggestion.)
I don't know if that will work on OSX, though.
how about:
find BASE_OF_SEARCH/*/foo -name \*.doc -type f | wc -l
What this is doing:
start at directory BASE_OF_SEARCH/
look in all directories that have a directory foo
look for files named like *.doc
count the lines of the result (one per file)
The benefit of this method:
not recursive nor iterative (no loops)
it's easy to read, and if you include it in a script it's fairly easy to decipher (regex sometimes is not).
UPDATE: you want variable depth? ok:
find BASE_OF_SEARCH -name \*.doc -type f | grep foo | wc -l
start at directory BASE_OF_SEARCH
look for files named like *.doc
only show the lines of this result that include "foo"
count the lines of the result (one per file)
Optionally, you could filter out results that have "foo" in the filename, because this will show those too.
Based on the answers on this page and on other pages, I managed to put together the following. It searches the current folder and everything under it for files with the extension pdf, then filters for those that contain test_text in their name.
find . -name "*.pdf" | grep test_text | wc -l
Untested, but try:
find . -type d -name foo -print | while IFS= read -r d; do for f in "$d"/*.doc; do [ -e "$f" ] && printf '%s\n' "$f"; done; done | wc -l
find all the "foo" directories (at varying depths) (this ignores symlinks, if that's part of the problem you can add them); use shell globbing to find all the ".doc" files, then count them.
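Another option, sketched here on the assumption that your find supports -path (it is in POSIX, and both GNU and BSD find have it), is to match the whole path with a shell-style pattern. Note that * in -path also matches /, so this counts .doc files at any depth below a foo directory, not only its direct children. The tree and the base and count variables are made up for the demo.

```shell
base=$(mktemp -d)
mkdir -p "$base/a/foo" "$base/a/b/c/foo" "$base/a/bar"
touch "$base/a/foo/one.doc" "$base/a/b/c/foo/two.doc" \
      "$base/a/foo/notes.txt" "$base/a/bar/three.doc"

# Print one dot per match and count bytes, so odd file names
# (even ones containing newlines) cannot skew the total.
count=$(find "$base" -type f -path '*/foo/*.doc' -exec printf '.' \; | wc -c)
count=$((count))   # strip the padding some wc implementations add
echo "$count"
rm -rf "$base"
```

Only the two .doc files that sit under a foo directory are counted; notes.txt and the .doc file under bar are ignored.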

How to remove first and last folder in 'find' result output?

I want to search for folders by a part of their name that I know and that is common among these kinds of folders. I used the find command in a bash script like this:
find . -type d -name "*.hg"
It just prints the whole path from the current directory to the found folder itself; the folder name contains '.hg'. Then I tried to use sed, but I couldn't address the last part of the path. I decided to get the name of the folder ending in .hg, save it in a variable, and then use sed to remove that last directory from the output. I used these to get the last part and tried to save the result to a variable, with no luck:
find . -type d -name "*.hg"|sed 's/*.hg$/ /'
find . -type d -name "*.hg"|awk -F/ '{print $NF}'
This just prints the names of the matches, here the folders with .hg at the end.
Then I tried a different approach:
for i in $(find . -type d -name '*.hg' );
do
$DIR = $(dirname ${i})
echo $DIR
done
This didn't work either. Can anyone give me a hint to make this work?
And yes, it's homework.
You could use parameter expansion:
d=path/to/my/dir
d="${d#*/}" # remove the first dir
d="${d%/*}" # remove the last dir
echo $d # "to/my"
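Applied to the find output from the question, the same two expansions might look like the sketch below. The directory layout and the work and trimmed variables are assumptions for illustration; find passes the paths to sh as arguments, so names with spaces survive intact.

```shell
work=$(mktemp -d)
mkdir -p "$work/proj/src/repo.hg" "$work/other/x.hg"

trimmed=$(
    cd "$work" || exit 1
    find . -type d -name '*.hg' -exec sh -c '
        for d; do
            d=${d#*/}     # drop the first component (the leading ".")
            d=${d%/*}     # drop the last component (the *.hg directory)
            printf "%s\n" "$d"
        done' _ {} + | LC_ALL=C sort
)
printf '%s\n' "$trimmed"
rm -rf "$work"
```

For ./proj/src/repo.hg this leaves proj/src, and for ./other/x.hg it leaves other.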
One problem is the pattern you are using in your sed script: bash and find use one pattern language, and sed uses a different one.
Bash and find use simple glob patterns, where * means any number of any characters and ? means any single character. The sed command uses a much richer regular expression language, where * means any number of the previous character and . means any character (there's a lot more to it than that).
So to remove the last component of the path delivered by find you will need the following sed command: sed -e 's,/[^/]*.hg,,'
Alternatively you could use the dirname command. Pipe the output of the find command to xargs, which will run a command, passing its standard input as arguments to that command:
xargs -I{} dirname {}
@Pamador - that's strange. It works for me. Just to explain: the sed command needs to be in single quotes to protect it against any unwanted shell expansions. The character following the 's' is a comma; what we're doing here is changing the character that sed uses to separate the parts of the substitute command, which means we can use the slash character without having to escape it with a preceding backslash. The next part matches any sequence of characters apart from a slash, followed by any character and then hg. Honestly, I should have anchored the pattern to the end of the line with a $, but apart from that it's fine.
I tested it with
echo "./abc/xxx.hg" | sed -e 's,/[^/]*\.hg$,,'
And it printed ./abc
Did I misunderstand what you wanted to do?
find . -type d -name "*.hg" | awk -v m=1 -v n=1 'NR<=m{};NR>n+m{print line[NR%n]};{line[NR%n]=$0}'
awk parameters:
m = number of lines to remove from beginning of output
n = number of lines to remove from end of output
Bonus: If you wanted to remove 1 line from the end and you have coreutils installed, you could do this: find . -type d -name "*.hg" | ghead -n -1
