grep: ./Coding/CNL.md: Is a directory

I want to find all the Markdown files that contain the word "desire", using a pipeline:
In [37]: !find -E . -iregex ".*/[^/]+\.md" -print0 -exec grep -i "desire" "{}" \; | grep ".md"
grep: ./Coding/CNL.md: Is a directory
Binary file (standard input) matches
How to solve such a problem?

About the errors:
grep: ./Coding/CNL.md: Is a directory
means a directory was passed as an argument to grep, and grep can't process directories. Adding the -type f option restricts find to regular files.
Binary file (standard input) matches
means that standard input (there is no file name because grep is reading from a pipe) was detected as a binary file, so grep suppresses its output to avoid sending special characters or escape sequences to the terminal. This is likely caused by the -print0 option, which uses the NUL character (\0) as the output delimiter.
It's also not clear why you are combining -print0 with -exec grep ...; that mixes file names and file contents in the same stream.
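Putting both fixes together, a minimal sketch of the corrected pipeline. It swaps the BSD-specific -E/-iregex for the more widely supported -iname, so the extension matching is an approximation of the original regex:

```shell
# Sketch: list Markdown files containing "desire" (case-insensitive).
# -type f skips directories; -exec grep -il ... + prints only the names
# of matching files, so file names and file contents are never mixed.
find . -type f -iname '*.md' -exec grep -il 'desire' '{}' +
```
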

Related

Using cat and grep commands in Bash

I'm having trouble with trying to achieve this bash command:
Concatenate all the text files in the current directory that have at least one occurrence of the word BOB (in any case) within the text of the file.
Is it correct to use the cat command and then grep to find the occurrences of the word BOB?
cat grep -i [BOB] *.txt > catFile.txt
To handle filenames with whitespace characters correctly:
grep --null -l -i "BOB" *.txt | xargs -0 cat > catFile.txt
Your issue was the need to pass grep's file names to cat via command substitution:
cat $(grep --null -l -i "BOB" *.txt ) > catFile.txt
$( ... ) is command substitution: it runs the enclosed command and substitutes its output into the command line
-l makes grep print only the names of the files that matched
You could use find with -exec:
find -maxdepth 1 -name '*.txt' -exec grep -qi 'bob' {} \; \
-exec cat {} + > catFile.txt
-maxdepth 1 makes sure you don't search any deeper than the current directory
-name '*.txt' says to look at all files ending with .txt – for the case that there is also a directory ending in .txt, you could add -type f to only look at files
-exec grep -qi 'bob' {} \; runs grep for each .txt file found. If bob is in the file, the exit status is zero and the next directive is executed. -q makes sure the grep is silent.
-exec cat {} + runs cat on all the files that contain bob
You need to remove the square brackets...
grep -il "BOB" *
You can also use the following command that you must run from the directory containing your BOB files.
grep -il BOB *.in | xargs cat > BOB_concat.out
-i is an option used to set grep in case insensitive mode
-l will be used to output only the filename containing the pattern provided as argument to grep
*.in is used to find all the input files in the dir (should be adapted to your folder content)
Then pipe the result to xargs, which builds the argument list that cat uses to produce your concatenated file.
ASSUMPTION:
Your folder contains only files without unusual characters in their names (e.g. spaces)
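To see the whitespace-safe variant from above in action, here is a small self-contained sketch (the file names and contents are made up for the demo; --null is a GNU/BSD grep extension, not POSIX):

```shell
# Demo: concatenate only the .txt files that mention BOB (any case).
dir=$(mktemp -d) && cd "$dir"
printf 'hello bob\n' > 'a.txt'
printf 'no match\n'  > 'b.txt'
printf 'BOB again\n' > 'with space.txt'
# --null NUL-terminates the file names; xargs -0 reads them back safely,
# so "with space.txt" survives intact.
grep --null -l -i 'BOB' ./*.txt | xargs -0 cat > bob_concat.out
cat bob_concat.out
```

Note the output file is named bob_concat.out rather than catFile.txt so that the ./*.txt glob cannot pick it up.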

Files with quotes, spaces causing bad behavior from xargs

I want to find some files and calculate the shasum by using a pipe command.
find . -type f | xargs shasum
But there are files with quotes and spaces in my directory, for example the file named
file with "special" characters.txt
The pipe output look like this:
user#home ~ $ find . -type f | xargs shasum
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty1.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty2.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./empty3.txt
shasum: ./file:
shasum: with: No such file or directory
shasum: special: No such file or directory
shasum: characters.txt: No such file or directory
25ea78ccd362e1903c4a10201092edeb83912d78 ./file1.txt
25ea78ccd362e1903c4a10201092edeb83912d78 ./file2.txt
The whitespace (and quotes) within the filename causes problems.
How can I tell shasum to process the files correctly?
The short explanation is that xargs is widely considered broken-by-design, unless using extensions to the standard that disable its behavior of trying to parse and honor quote and escaping content in its input. See the xargs section of UsingFind for more details.
Using NUL Delimited Streams
On a system with GNU or modern BSD extensions (including MacOS X), you can (and should) NUL-delimit the output from find:
find . -type f -print0 | xargs -0 shasum --
Using find -exec
That said, you can do even better by getting xargs out of the loop entirely in a way that's fully compliant with modern (~2006) POSIX:
find . -type f -exec shasum -- '{}' +
Note that the -- argument specifies to shasum that all future arguments are filenames. If you'd used find * -type f ..., then you could have a result starting with a dash; using -- ensures that this result isn't interpreted as a set of options.
Using Newline Delimiters (And Security Risks Thereof)
If you have GNU xargs, but don't have the option of a NUL-delimited input stream, then xargs -d $'\n' (in shells such as bash with ksh extensions) will avoid the quoting and escaping behavior:
xargs -d $'\n' shasum -- <files.txt
However, this is suboptimal, because newline literals are actually possible inside filenames, thus making it impossible to distinguish between a newline that separates two names and a newline that is part of an actual name. Consider the following scenario:
mkdir -p ./file.txt$'\n'/etc/passwd$'\n'/
touch ./file.txt$'\n'/etc/passwd$'\n'file.txt file.txt
find . -type f | xargs -d $'\n' shasum --
This will have output akin to the following:
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file.txt
c0c71bac843a3ec7233e99e123888beb6da8fbcf /etc/passwd
da39a3ee5e6b4b0d3255bfef95601890afd80709 file.txt
...thus allowing an attacker who can control filenames to cause a shasum for an arbitrary file outside the intended directory structure to be added to your output.
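A quick way to convince yourself of the difference, using the simpler case of a space in a filename (sha1sum from GNU coreutils stands in for shasum here; the behavior is the same):

```shell
# Two files, one with a space in its name.
dir=$(mktemp -d) && cd "$dir"
touch plain.txt 'has space.txt'
# Naive pipeline: xargs word-splits "has space.txt" into two bogus
# names, so sha1sum reports errors and only hashes plain.txt.
find . -type f | xargs sha1sum
# NUL-delimited pipeline: both names arrive intact.
find . -type f -print0 | xargs -0 sha1sum
```
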

grep cannot read filename after find folders with spaces

After I find the files and enclose their names in double quotes with the following command:
FILES=$(find . -type f -not -path "./.git/*" -exec echo -n '"{}" ' \; | tr '\n' ' ')
I do a for loop to grep a certain word inside each file that matches find:
for f in $FILES; do grep -Eq '(GNU)' $f; done
but grep complains about each entry that it cannot find file or directory:
grep: "./test/test.c": No such file or directory
whereas echo $FILES produces:
"./.DS_Store" "./.gitignore" "./add_license.sh" "./ads.add_lcs.log" "./lcs_gplv2" "./lcs_mit" "./LICENSE" "./new test/test.js" "./README.md" "./sxs.add_lcs.log" "./test/test.c" "./test/test.h" "./test/test.js" "./test/test.m" "./test/test.py" "./test/test.pyc"
EDIT
found the answer here. works perfectly!
The issue is that your array contains filenames surrounded by literal " quotes.
But worse, find's -exec cmd {} \; executes cmd separately for each file, which can be inefficient. As mentioned by @TomFenech in the comments, you can use -exec cmd {} + to search as many files within a single cmd invocation as possible.
A better approach for a recursive search is usually to let find output the filenames and pipe its results to xargs, so that grep searches as many files per invocation as possible. Use -print0 and -0 respectively to correctly support filenames with spaces and other separators: the results are split on a NUL character instead, so you don't need quotes at all, which removes a whole class of bugs.
Something like this:
find . -type f -not -path './.git/*' -print0 | xargs -0 egrep '(GNU)'
However in your question you had grep -q in a loop, so I suspect you may be looking for an error status (found/not found) for each file? If so, you could use -l instead of -q to make grep list matching filenames, and then pipe/send that output to where you need the results.
find . -print0 | xargs -0 egrep -l pattern > matching_filenames
Also note that grep -E (or egrep) uses extended regular expressions, which means parentheses create a regex group. If you want to search for files containing (GNU) (with the parentheses) use grep -F or fgrep instead, which treats the pattern as a string literal.
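The difference between the two matching modes is easy to see in isolation:

```shell
# With -E, the parentheses form a regex group: the pattern matches
# "GNU" anywhere, with or without literal parentheses around it.
printf 'plain GNU here\n' | grep -E '(GNU)'
# With -F, the pattern is a literal string: only "(GNU)" itself matches.
printf 'plain GNU here\n' | grep -F '(GNU)' || echo 'no match'
printf 'literal (GNU) here\n' | grep -F '(GNU)'
```
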

How to remove files using grep and rm?

grep -n magenta * | rm *
grep: a.txt: No such file or directory
grep: b: No such file or directory
The above command removes all files present in the directory (everything except . and ..).
It should remove only those files which contain the word "magenta".
Also, tried grep magenta * -exec rm '{}' \; but no luck.
Any idea?
Use xargs:
grep -l --null magenta ./* | xargs -0 rm
The purpose of xargs is to take input on stdin and place it on the command line of the command it runs.
What the options do:
The -l option tells grep not to print the matching text and instead just print the names of the files that contain matching text.
The --null option tells grep to separate the filenames with NUL characters. This allows all manner of filenames to be handled safely.
The -0 option to xargs to treat its input as NUL-separated.
Here is a safe way:
grep -lrZ magenta . | xargs -0 rm -f --
-l prints the names of the files matching the search pattern.
-r performs a recursive search for the pattern magenta in the given directory .. If your grep doesn't support it, try -R.
-Z (also spelled --null) makes grep separate the file names with NUL characters instead of newlines, so that xargs -0 can read them back safely (i.e., as multiple names instead of one mangled string).
xargs -0 feeds the NUL-separated file names from grep to rm -f.
-- is often forgotten but it is very important to mark the end of options and allow for removal of files whose names begin with -.
If you would like to see which files are about to be deleted, simply remove the | xargs -0 rm -f -- part.
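A throwaway sketch of the whole round trip, with made-up files in a temp directory (-Z is GNU grep's NUL-output flag, needed so that xargs -0 splits the names correctly):

```shell
# Only the file containing "magenta" should be deleted.
dir=$(mktemp -d) && cd "$dir"
printf 'magenta stripe\n' > delete-me.txt
printf 'plain text\n'     > keep-me.txt
# Preview first: which files match?
grep -lr magenta .
# Then delete them, NUL-delimited for safety.
grep -lrZ magenta . | xargs -0 rm -f --
ls   # keep-me.txt remains
```
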

xargs and sed creating unwanted files when replacing strings in files

I have a folder named test which has two files style.css and a hidden file named .DS_Store. My aim is to recursively replace all "changefrom.this" strings in all files under test to "to.this". So I came up with:
folder_root="test"
# change text in files
find $folder_root/ -type f -print0 | xargs -0 -n 1 sed -i -e 's/changefrom.this/to.this/g'
And while the strings do get replaced in the style.css file for instance, the execution outputs an error:
sed: RE error: illegal byte sequence
And I get some new files in the test folder: style.css-e and !2766!.DS_Store. Didn't expect that. What's going on here?
Can you try this simplified command with find -exec sed:
find "$folder_root/" -type f -exec sed -i 's/changefrom\.this/to.this/g' '{}' +
On BSD/macOS sed, -i requires a (possibly empty) backup-suffix argument, so there you should write sed -i '' 's/changefrom\.this/to.this/g'. Your original sed -i -e ... consumed -e as that suffix, which is where the style.css-e backup files came from. Also note the escaped dot in the pattern: an unescaped . matches any character. If changefrom.this is not the actual pattern you're using then let us know what that pattern is, as it might be causing problems.
Try this:
find $folder_root/ -type f -print0 | LC_ALL=en_US.CP437 xargs -0 -n 1 sed -i -e 's/changefrom.this/to.this/g'
If it works, the problem is that your locale's character encoding doesn't match the files' encoding: .DS_Store is a binary file, so it contains byte sequences that are invalid in UTF-8. In CP437 every byte sequence is valid, so this works for patterns written in 7-bit ASCII.
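A minimal sketch of the replacement using GNU sed, with a made-up stylesheet line standing in for the real file contents (on macOS/BSD sed, write sed -i '' ... with an explicit empty backup suffix instead):

```shell
dir=$(mktemp -d) && cd "$dir"
printf 'a { url: changefrom.this; }\n' > style.css
# -exec ... + batches files into one sed invocation; the dot is escaped
# so it matches a literal "." rather than any character.
find . -type f -exec sed -i 's/changefrom\.this/to.this/g' '{}' +
cat style.css   # a { url: to.this; }
```
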
