Why does grep print out "No such file or directory"? - macos

I wanted to find the Markdown files under the working directory that contain some specific text, so I used the command below:
find . -name "*.md" | xargs grep "interpretation"
Although I got the results I needed, the terminal also printed many errors like these:
grep: Problem: No such file or directory
grep: Solving: No such file or directory
grep: with: No such file or directory
grep: Algorithms: No such file or directory
……
etc
I wrote up my solution as an answer below.

Found it!
At first I used the -s option to suppress the errors, as suggested here, but @Kelvin's comment reminded me that the real cause is that many of my file names contain spaces.
So the correct command is:
$ find . -name "*.md" -print0 | xargs -0 grep "some-text-want-to-find" (on os x)
Here is some clearer explanation I found:
Unix-like systems allow embedded spaces (and even newlines!) in filenames. This causes problems for programs like xargs that construct argument lists for other programs. An embedded space will be treated as a delimiter and the resulting command will interpret each space-separated word as a separate argument. To overcome this, find and xargs allow the optional use of a null character as argument separator. A null character is defined in ASCII as the character represented by the number zero (as opposed to, for example, the space character, which is defined in ASCII as the character represented by the number 32). The find command provides the action -print0, which produces null separated output, and the xargs command has the --null option, which accepts null separated input.
—— The Linux Command Line: A Complete Introduction by William E. Shotts
Warning: On OS X, the null-separator option of the xargs command is -0.
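To see the difference on a small scale, here is a minimal sketch. The scratch directory and the file name "Problem Solving.md" are made up for illustration; only the two pipelines come from the discussion above.

```shell
# Scratch directory with one file whose name contains a space (hypothetical name).
demo=$(mktemp -d)
printf 'an interpretation of the data\n' > "$demo/Problem Solving.md"

# Whitespace-delimited input: xargs splits the path at the space, so grep is
# handed two names that do not exist and complains about each of them.
# "|| true" only keeps the non-zero exit status from aborting a strict script.
naive=$(find "$demo" -name '*.md' | xargs grep -l 'interpretation' 2>&1) || true

# NUL-delimited input: the whole path reaches grep as a single argument.
safe=$(find "$demo" -name '*.md' -print0 | xargs -0 grep -l 'interpretation')

printf 'naive:\n%s\nsafe:\n%s\n' "$naive" "$safe"
rm -rf "$demo"
```

The first pipeline should print errors only, while the second prints the path of the matching file.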
Updated 2017-05-27 22:58:48
Thanks to @Sundeep, who suggested using -exec, a feature of find itself, rather than xargs.
So, use this to search the files in the current directory and its subdirectories:
$ find . -type f -name "*.md" -exec grep "some-text-want-to-find" {} +
Note:
What is meaning of {} + in find's -exec command? - Unix & Linux Stack Exchange
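As a rough illustration of what {} + does compared with {} \;, the sketch below counts how many arguments each invoked command receives. The scratch files are invented for the demo, and sh -c 'echo "$#"' stands in for grep so the argument count is visible.

```shell
demo=$(mktemp -d)
printf 'x\n' > "$demo/a b.md"   # the name contains a space on purpose
printf 'x\n' > "$demo/c.md"

# "{} +" appends as many paths as fit to one command line, so a single
# invocation sees both files. Each path stays one argument, spaces included.
plus=$(find "$demo" -type f -name '*.md' -exec sh -c 'echo "$#"' sh {} +)

# "{} \;" runs the command once per file; count the invocations.
semi=$(find "$demo" -type f -name '*.md' -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

printf 'arguments seen with +: %s\ninvocations with \\;: %s\n' "$plus" "$semi"
rm -rf "$demo"
```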

I found that when there is a symbolic link whose destination directory does not exist, grep also prints "No such file or directory".
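A small sketch of the dangling-link case, assuming made-up file names. Adding -type f is one possible way to skip such links, since find does not follow symbolic links by default; it is not the only option (grep -s would merely hide the message).

```shell
demo=$(mktemp -d)
printf 'some text\n' > "$demo/real.md"
ln -s "$demo/missing-target" "$demo/dangling.md"   # link to a path that does not exist

# Without a type test, the dangling link is passed to grep, which cannot open it.
with_links=$(find "$demo" -name '*.md' -exec grep -l 'text' {} + 2>&1) || true

# With -type f, only regular files are passed on, so the link is skipped.
files_only=$(find "$demo" -type f -name '*.md' -exec grep -l 'text' {} + 2>&1)

printf 'without -type f:\n%s\nwith -type f:\n%s\n' "$with_links" "$files_only"
rm -rf "$demo"
```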

Related

Given a text file with file names, how can I find files in subdirectories of the current directory?

I have a bunch of files with different names in different subdirectories. I created a text file with those names, but I cannot get find to work with it. I have seen posts about problems creating the list, and posts advising against using find (though I do not understand why). Any suggestions? It is difficult for me to come up with an example because I do not know how to reproduce the directory structure.
The following are the names of the files (just in case there is a formatting problem)
AO-169
AO-170
AO-171
The best that I came up with is:
cat ExtendedList.txt | xargs -I {} find . -name {}
It obviously dies in the first directory that it finds.
I also tried
ta="AO-169 AO-170 AO-171"
find . -name $ta
but it complains find: AO-170: unknown primary or operator
If you are trying to ask "how can I find files with any of these names in subdirectories of the current directory", the answer to that would look something like
xargs printf -- '-o\0-name\0%s\0' <ExtendedList.txt |
xargs -r0 find . -false
The -false is just a cute way to let the list of actual predicates start with "... or".
If the list of names in ExtendedList.txt is large, this could fail if the second xargs decides to break it up between -o and -name.
The option -0 is not portable, but should work e.g. on Linux or wherever you have GNU xargs.
If you can guarantee that the list of strings in ExtendedList.txt does not contain any characters which are problematic to the shell (like single quotes), you could simply say
sed "s/.*/-o -name '&'/" ExtendedList.txt |
xargs -r find . -false
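If portability matters more than speed, a different technique from the one above is to run one find per name in a read loop. This is only a sketch: it avoids -0 and -r, copes with spaces in names, and cannot be split in the wrong place by xargs, but it walks the tree once per name. The directory layout below is invented for the demo.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/c"
: > "$demo/a/b/AO-169"
: > "$demo/c/AO-171"
: > "$demo/c/unrelated"
printf 'AO-169\nAO-170\nAO-171\n' > "$demo/ExtendedList.txt"

# One find per listed name; "$name" is quoted so it is always a single argument.
found=$(
  while IFS= read -r name; do
    find "$demo" -type f -name "$name"
  done < "$demo/ExtendedList.txt" | sort
)

printf '%s\n' "$found"
rm -rf "$demo"
```

AO-170 does not exist in the demo tree, so only the other two paths are printed.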

find/grep to list found specific file that contains specific string

I have a root directory that I need to run a find and/or grep command on to return a list of files that contain a specific string.
Here's an example of the file and directory set up. In reality, this root directory contains a lot of subdirectories that each have a lot of subdirectories and files, but this example, I hope, gets my point across.
From root, I need to go through each of the children directories, specifically into subdir/ and look through file.html for the string "example:". If a result is found, I'd like it to print out the full path to file.html, such as website_two/subdir/file.html.
I figured limiting the search to subdir/file.html will greatly increase the speed of this operation.
I'm not too knowledgeable with find and grep commands, but I have tried the following with no luck, but I honestly don't know how to troubleshoot it.
find . -name "file.html" -exec grep -HI "example:" {} \;
EDIT: I understand this may be marked as a duplicate, but I think my question is more along the lines of how can I tell the command to only search a specific file in a specific path, looping through all root-> level directories.
find ./ -type f -iname file.html -exec grep -l "example:" {} +
or
grep -Rl "example:" ./ | grep -iE "file.htm(l)*$"
will do the trick.
Quote from GNU Grep 2.25 man page:
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have
been printed. The scanning will stop on the first match.
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression.
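To restrict the search to file.html inside a subdir/ directory, as the question asks, one option is find's -path test. The sketch below uses an invented directory tree; note that * in a -path pattern also matches slashes, so this matches subdir/file.html at any depth, not only one level down.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/website_one/subdir" "$demo/website_two/subdir" "$demo/website_two/other"
printf 'nothing here\n'       > "$demo/website_one/subdir/file.html"
printf 'example: yes\n'       > "$demo/website_two/subdir/file.html"
printf 'example: wrong dir\n' > "$demo/website_two/other/file.html"

# Only paths ending in /subdir/file.html are handed to grep; -l prints the
# names of the files that contain the string.
matches=$(cd "$demo" && find . -path './*/subdir/file.html' -type f -exec grep -l 'example:' {} +)

printf '%s\n' "$matches"
rm -rf "$demo"
```

When the layout really is exactly one level deep, a plain shell glob such as grep -l "example:" */subdir/file.html may be enough.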

Replacing a line from multiple files, limiting to a line number range

I have a large number of files and I want to replace some lines from all of these files. I don't know the exact contents of the lines, all I know is all of them contain two known words - let's say for example 'Programmer' and 'Bob'. So the lines I want to replace could be something like:
Created by Programmer Bob
Programmer extraordinaire Bob, such an awesome guy
Copyright programmer bob, all rights reserved
So far this sounds easy, but the problem is I only want to replace the lines that are contained within a line range - for example in the beginning of the file (where typically one would find comments regarding the file). I can't replace lines found in the later parts of the file, because I don't want to accidentally replace actual code.
So far I have tried:
find . -exec grep -il -E 'Programmer.*Bob' {} \; | xargs sed -i '1,10 /Programmer.*Bob/Ic\LINE REPLACED'
(I'm using find because grep ran into an infinite recursion - I think. Not the point here.)
However it seems that I can't use address ranges with c\ (change line). Feel free to point out any syntax errors, but I think I've tried everything to no avail. This does work without the line numbers.
EDIT:
I got the answer, but I decided to edit my question to include my solution which expands upon the answer I got - maybe someone will find this helpful.
I realised later that I want to retain the possible whitespace and comment characters in the beginning of the line. I accomplished it using this command:
find . -exec grep -ilI '.*Programmer.*Bob.*' {} \; | xargs sed -i -r '1,10 s/([ \t#*]*)(.*Programmer.*Bob.*)/\1LINE REPLACED/I'
\1 keeps the pattern that matches [ \t#*]*. One could change this to ^[ \t#*]* that would anchor the pattern to the beginning of the line, but (I THINK) this current version would change
** Text I don't want to remove ** Programmer Bob
into
** Text I don't want to remove ** LINE REPLACED
Which could actually be better. (I also added the -I (capital i) flag to the grep command, which skips binary files.)
You are mixing addresses and commands. Simple substitution should work:
find . -exec grep -il -E 'Programmer.*Bob' {} \; \
| xargs sed -i '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/'
find . -type f -name "*.cpp" | xargs perl -pi -e 'if(/Programmer/ && /Bob/ && $.<10){$_="line to replace\n"} close ARGV if eof'
sed command:
sed '1,10 {s/programmer\|bob/LINE REPLACED/i;s/programmer\|bob//ig}' file
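A quick way to check that an address range limits the substitution is to run it without -i on a throwaway file. In this sketch the sample file is invented: it has a matching line at line 1 and another at line 12, and only the first should change.

```shell
demo=$(mktemp -d)
{
  printf '# Created by Programmer Bob\n'          # line 1, inside the range
  i=2
  while [ "$i" -le 11 ]; do
    printf 'line %s\n' "$i"
    i=$((i + 1))
  done
  printf 'call(Programmer, Bob)\n'                # line 12, outside the range
} > "$demo/sample.txt"

# The address 1,10 restricts the s command to the first ten lines.
result=$(sed '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/' "$demo/sample.txt")

first=$(printf '%s\n' "$result" | sed -n '1p')
last=$(printf '%s\n' "$result" | sed -n '12p')
printf 'line 1:  %s\nline 12: %s\n' "$first" "$last"
rm -rf "$demo"
```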

How to remove first and last folder in 'find' result output?

I want to search for folders by a part of their name that I know and that is common to this kind of folder. I used the find command in a bash script like this:
find . -type d -name "*.hg"
It just prints the whole path from the current directory to the found folder itself. The folder name ends in '.hg'. Then I tried to use the sed command, but I couldn't address the last part of the path. I decided to get the folder name that ends in .hg, save it in a variable, and then use sed to remove the last directory from the output. I used these to get the last part and tried to save the result in a variable, with no luck:
find . -type d -name "*.hg"|sed 's/*.hg$/ /'
find . -type d -name "*.hg"|awk -F/ '{print $NF}'
This just prints the last path component, here the folder name with .hg at the end.
Then I tried a different approach:
for i in $(find . -type d -name '*.hg' );
do
$DIR = $(dirname ${i})
echo $DIR
done
This didn't work either. Can anyone give me a hint to make this work?
And yes, it's homework.
You could use parameter expansion:
d=path/to/my/dir
d="${d#*/}" # remove the first dir
d="${d%/*}" # remove the last dir
echo $d # "to/my"
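Applied to find output, the same two expansions can be used inside a read loop. This is a sketch with invented directory names; because find prints paths starting with "./", the "first dir" removed here is just the leading dot.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/projects/alpha/repo.hg" "$demo/projects/beta/other.hg"

trimmed=$(
  cd "$demo" &&
  find . -type d -name '*.hg' | sort | while IFS= read -r path; do
    path=${path#*/}   # drop everything up to the first slash (the leading ".")
    path=${path%/*}   # drop the last component (the *.hg folder itself)
    printf '%s\n' "$path"
  done
)

printf '%s\n' "$trimmed"
rm -rf "$demo"
```

Reading line by line breaks on names that contain newlines, which is usually acceptable for a quick script.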
One problem you have is with the pattern in your sed script: bash and the find command both use a different pattern language from sed.
They use a very simple pattern language (globs) where * means any number of any character and ? means any single character. The sed command uses a much richer regular expression language where * means any number of the previous character and . means any character (there's a lot more to it than that).
So to remove the last component of the path delivered by find you will need to use the following sed command: sed -e 's,/[^/]*.hg,,'
Alternatively you could use the dirname command. Pipe the output of the find command to xargs (which runs a command, passing its standard input as arguments to that command):
xargs -I {} dirname {}
@Pamador - that's strange. It works for me. Just to explain: the sed command needs to be quoted in single quotes to protect against any unwanted shell expansions. The character following the 's' is a comma; what we're doing here is changing the character that sed uses to separate the parts of the substitute command, which means that we can use the slash character without having to escape it with a preceding backslash. The next part matches any sequence of characters other than a slash, followed by any character and then hg. Honestly I should have anchored the pattern to the end of the line with a $, but apart from that it's fine.
I tested it with
echo "./abc/xxx.hg" | sed -e 's,/[^/]*\.hg$,,'
And it printed ./abc
Did I misunderstand what you wanted to do?
find . -type d -name "*.hg" | awk -v m=1 -v n=1 'NR<=m{};NR>n+m{print line[NR%n]};{line[NR%n]=$0}'
awk parameters:
m = number of lines to remove from beginning of output
n = number of lines to remove from end of output
Bonus: If you wanted to remove 1 line from the end and you have coreutils installed, you could do this: find . -type d -name "*.hg" | ghead -n -1

To understand xargs better

I want to understand the use of xargs man in Rampion's code:
screen -t man /bin/sh -c 'xargs man || read'
Thanks to Rampion: we do not need cat!
Why do we need xargs in the command?
I understand the xargs -part as follows
cat nothing to xargs
xargs makes a list of man -commands
I have had an idea that xargs makes a list of commands. For instance,
find . -type f -print0 | xargs -0 grep masi
is the same as a list of commands:
find fileA AND grep masi in it
find fileB AND grep masi in it
and so on for fileC, fileD, ...
No, I don't cat nothing. I cat whatever input I get after I run the command. cat is actually extraneous here, so let's ignore it.
xargs man waits on user input. Which is necessary. Since in the script you grabbed that from, I can't paste in the argument for man until after I create the window. So the command that runs in the window needs to wait for me to give it something, before it tries to run man.
If we just ran screen /bin/sh -c 'man || read', it would always complain "What manual page do you want?" since we never told it.
xargs gathers arguments from stdin and executes the command given with those arguments.
so cat is waiting for something to be typed, and then xargs is running man with that input.
xargs is useful if you have a lot of files to process, I often use it with output from find.
xargs will stuff as many arguments as it can onto the command line.
It's great for doing something like
find . -name '*.o' -print | xargs rm
The cat command does not operate on nothing; it operates on standard input, up until it is told that the input is ended. As Rampion notes, the cat command is not necessary here, but it is operating on its implicit input (standard input), not on nothing.
The xargs command reads the output from cat, and groups the information into arguments to the man command specified as its (only) argument. When it reaches a limit (configurable on the command line), it will execute the man command.
The find ... -print0 | xargs -0 ... idiom deals with file names that contain awkward characters such as blanks, tabs and newlines. The find command prints each filename followed by an ASCII NUL ('\0'); this is one of two characters that cannot appear in a simple file name - the other being '/' (which appears in path names, of course, but not in simple file names). It is not directly equivalent to the sequence you provide; xargs groups collections of file names into a single argument list, up to a size limit. If the names are short enough (they usually are), then there will be fewer executions of grep than there are file names.
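The grouping can be made visible by putting echo in front of the command, so xargs prints the command lines it would run instead of running them. The three file names are placeholders; -n 1 is shown only as a contrast, forcing one name per invocation, which is closer to the per-file list imagined in the question.

```shell
# Default behaviour: all names fit on one command line, so grep would run once.
batched=$(printf '%s\n' fileA fileB fileC | xargs echo grep masi)

# With -n 1, xargs starts a new command for every single name.
one_each=$(printf '%s\n' fileA fileB fileC | xargs -n 1 echo grep masi)

printf 'batched:\n%s\none per call:\n%s\n' "$batched" "$one_each"
```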
Note, too, that grep only prints the file name where the material is found if it has more than one file to search -- or if it supports an option so that it always prints the file names and the option is used: '-H' is a GNU extension to grep that does this. The portable way to ensure that the file names always appear is to list /dev/null as the first file (so 'xargs grep something /dev/null'); it doesn't take long to search /dev/null.
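The /dev/null trick is easy to try on a single scratch file (the file name and contents below are invented):

```shell
demo=$(mktemp -d)
printf 'masi was here\n' > "$demo/only.txt"

# One file to search: grep prints the matching line without a file name.
bare=$(grep 'masi' "$demo/only.txt")

# Two files to search (one of them /dev/null): the file name prefix appears.
named=$(grep 'masi' /dev/null "$demo/only.txt")

printf 'bare:  %s\nnamed: %s\n' "$bare" "$named"
rm -rf "$demo"
```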

Resources