terminal unix command find - problem with setting up name patterns - macOS

I have trouble constructing a single find line to do the following:
find all files in the current dir and sub-dirs whose names either end with ~, or start and end with '#'. I think I have made a fundamental mistake, but I'm not so sure after 2 hours of thinking.
This is what I came up with and it does not seem to work:
find -name '[#]' -a -name '[~#]'
macOS Terminal

You could use a combination of ls and grep to find all the files ending with either ~ or #
ls -R | grep -E '(~|#)$'
ls -R will list all files in the current dir and sub-dirs;
grep -E will search for lines matching an extended regular expression;
'(~|#)$' will match all lines ending with either ~ or # (the $ anchors the match to the end of the line; neither character needs escaping in an extended regex).

Related

find/grep to list found specific file that contains specific string

I have a root directory that I need to run a find and/or grep command on to return a list of files that contain a specific string.
Here's an example of the file and directory set up. In reality, this root directory contains a lot of subdirectories that each have a lot of subdirectories and files, but this example, I hope, gets my point across.
From root, I need to go through each of the children directories, specifically into subdir/ and look through file.html for the string "example:". If a result is found, I'd like it to print out the full path to file.html, such as website_two/subdir/file.html.
I figured limiting the search to subdir/file.html will greatly increase the speed of this operation.
I'm not too knowledgeable with find and grep, but I have tried the following with no luck, and I honestly don't know how to troubleshoot it.
find . -name "file.html" -exec grep -HI "example:" {} \;
EDIT: I understand this may be marked as a duplicate, but I think my question is more along the lines of how I can tell the command to only search a specific file in a specific path, looping through all root-level directories.
find ./ -type f -iname file.html -exec grep -l "example:" {} +
or
grep -Rl "example:" ./ | grep -iE "file.htm(l)*$" will do the trick.
Quote from GNU Grep 2.25 man page:
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression.
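To address the edit - searching only subdir/file.html under each top-level directory - find's -path test can restrict matches to that relative location. A sketch, assuming the layout described in the question:
find . -path '*/subdir/file.html' -exec grep -l "example:" {} +
-path matches the entire path (not just the basename) against a glob, so only file.html files sitting directly inside a subdir directory get handed to grep.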

Why does grep print out "No such file or directory"?

I tried to search the Markdown files containing some specific text under the working directory, so I used the command below:
find . -name "*.md" | xargs grep "interpretation"
However, although I got the results I needed, the terminal also printed many errors like the ones below:
grep: Problem: No such file or directory
grep: Solving: No such file or directory
grep: with: No such file or directory
grep: Algorithms: No such file or directory
……
etc
I wrote my solution as an answer below.
Found it!
At first I used the -s option to suppress errors, as suggested here, but @Kelvin's comment reminded me that the real cause is that many of my files' names contain spaces.
So the correct command is:
$ find . -name "*.md" -print0 | xargs -0 grep "some-text-want-to-find" (on OS X)
Here is some clearer explanation I found:
Unix-like systems allow embedded spaces (and even newlines!) in filenames. This causes problems for programs like xargs that construct argument lists for other programs. An embedded space will be treated as a delimiter and the resulting command will interpret each space-separated word as a separate argument. To overcome this, find and xargs allow the optional use of a null character as argument separator. A null character is defined in ASCII as the character represented by the number zero (as opposed to, for example, the space character, which is defined in ASCII as the character represented by the number 32). The find command provides the action -print0, which produces null separated output, and the xargs command has the --null option, which accepts null separated input.
—— The Linux Command Line: A Complete Introduction by William E. Shotts
Warning: when you are on OS X, the null-separator option of the xargs command is -0
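A quick illustration of the failure and the fix - the errors quoted in the question look like fragments of a name such as "Problem Solving with Algorithms", split at the spaces:
touch "Problem Solving.md"        # hypothetical file name containing a space
find . -name "*.md" | xargs grep "interpretation"
# grep: ./Problem: No such file or directory
# grep: Solving.md: No such file or directory
find . -name "*.md" -print0 | xargs -0 grep "interpretation"
# null separators keep each name as one argument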
Updated 2017-05-27 22:58:48
Thanks to @Sundeep, who suggested using -exec, a feature of find itself, rather than xargs.
So, use this to search files in current dir and its sub-dirs:
$ find . -type f -name "*.md" -exec grep "some-text-want-to-find" {} +
Note:
What is meaning of {} + in find's -exec command? - Unix & Linux Stack Exchange
I also found that when there is a symbolic link whose destination directory does not exist, it will print out "No such file or directory".
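Note that the -exec command above already filters with -type f, which skips symbolic links entirely (without -L, find tests the link itself, not its target), so dangling links never reach grep. If you are using the xargs pipeline instead, grep's -s option suppresses those messages:
find . -type f -name "*.md" -print0 | xargs -0 grep -s "some-text-want-to-find"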

Why am I getting some extra, weird characters when making a file from grep output?

I am doing a very basic command that never gave me trouble in the past, but is inexplicably returning undesired characters now.
I am in bash on Linux, and simply want to search through a directory and make a file containing filenames that match a pattern:
ls | grep "*.file_ID" > my_list.txt
...This works fine, and if I cat the data:
cat my_list.txt
seriesA.file_ID
seriesB.file_ID
seriesC.file_ID
However, when I try to feed this file into downstream processes, I keep getting weird errors, as if the file isn't properly formatted as a list of file names. When I open the file in vim to reveal any unnecessary characters, I find the file actually looks like this:
vi my_list.txt
^[[00mseriesA.file_ID^[[00m
^[[00mseriesB.file_ID^[[00m
^[[00mseriesC.file_ID^[[00m
For some reason, every line is started and ended with the characters ^[[00m. If I delete these characters, all of the downstream processes work fine. However, I need to have my scripts automatically make such a file list, so I can't keep going in and manually deleting these chars.
Does anyone know what is producing the ^[[00m characters? I don't have any idea where they are coming from, and need to be able to generate files without them.
Thanks!
Probably your GREP_OPTIONS environment variable contains --color=always, which causes the output to be stuffed with control characters, even when piped to a file.
Use --color=auto instead.
http://www.gnu.org/software/grep/manual/html_node/Environment-Variables.html
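A quick way to check, assuming bash - note the override has to go on the grep stage of the pipeline, not on ls:
echo "$GREP_OPTIONS"                               # look for --color=always
ls | GREP_OPTIONS= grep "file_ID" > my_list.txt    # one-off empty override: no escape codes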
Even better, don't use grep:
ls *.file_ID > my_list.txt
Usually grep is already aliased to grep --color=auto, which only emits color codes when the output goes to a terminal.
If piping into a file (a csv, say) produces output that looks like ^[[00...^[[00m, type this in the terminal:
grep --color=auto "your regex" > example.csv
If you want to make that a permanent situation, where you do not have to type "--color=auto" every time, type this in the terminal:
export GREP_OPTIONS='--color=auto'
more info:
https://linuxcommando.blogspot.com/2007/10/grep-with-color-output.html
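Note that newer releases of GNU grep deprecate GREP_OPTIONS and may refuse to run while it is set; the same default can be had with an alias:
alias grep='grep --color=auto'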
Don't use ls:
printf "%s\n" *.file_ID > my_list.txt
This should take care of it (assuming GNU find and no directory traversing):
find . -maxdepth 1 -type f -name "*.file_ID" -printf "%f\n" > my_list.txt
Example:
~> ls *file_ID*
a.file_ID b.file_ID c.file_ID
~> find . -maxdepth 1 -type f -name "*.file_ID" -printf "%f\n" > my_list.txt
~> cat my_list.txt
a.file_ID
b.file_ID
c.file_ID
As for the "^[[00m" characters, check your ls options:
~> alias -p | grep "ls="
You may get something like:
alias ls='/bin/ls $LS_OPTIONS'
If so, check env for this:
~> env | grep LS_OP
LS_OPTIONS=-N --color=tty -T 0
The character string you're referencing is used to turn off colors, so your shell likely has been set to show colors. Removing and/or changing the ls alias should resolve it.
The weird characters such as ^[[00m are escape characters for colorizing the output. Color output for ls is most likely enabled through an alias in your environment.
To avoid getting these color characters, you can try disabling the ls alias temporarily with a backslash:
\ls *.txt
Or you can use printf command instead.
printf "%s\n" *.txt

Replacing a line from multiple files, limiting to a line number range

I have a large number of files and I want to replace some lines from all of these files. I don't know the exact contents of the lines, all I know is all of them contain two known words - let's say for example 'Programmer' and 'Bob'. So the lines I want to replace could be something like:
Created by Programmer Bob
Programmer extraordinaire Bob, such an awesome guy
Copyright programmer bob, all rights reserved
So far this sounds easy, but the problem is I only want to replace the lines that are contained within a line range - for example in the beginning of the file (where typically one would find comments regarding the file). I can't replace lines found in the later parts of the file, because I don't want to accidentally replace actual code.
So far I have tried:
find . -exec grep -il -E 'Programmer.*Bob' {} \; | xargs sed -i '1,10 /Programmer.*Bob/Ic\LINE REPLACED'
(I'm using find because grep ran into an infinite recursion - I think. Not the point here.)
However it seems that I can't use address ranges with c\ (change line). Feel free to point out any syntax errors, but I think I've tried everything to no avail. This does work without the line numbers.
EDIT:
I got the answer, but I decided to edit my question to include my solution which expands upon the answer I got - maybe someone will find this helpful.
I realised later that I want to retain the possible whitespace and comment characters in the beginning of the line. I accomplished it using this command:
find . -exec grep -ilI '.*Programmer.*Bob.*' {} \; | xargs sed -i -r '1,10 s/([ \t#*]*)(.*Programmer.*Bob.*)/\1LINE REPLACED/I'
\1 keeps whatever matched [ \t#*]*. One could change this to ^[ \t#*]* to make the anchoring to the beginning of the line explicit, although the leftmost match starts there anyway; either way the command changes
** Text I don't want to remove ** Programmer Bob
into
** LINE REPLACED
with only the leading "** " preserved, since [ \t#*]* cannot match past the first ordinary character. (I also added the -I (capital i) flag to grep, which skips binary files.)
You are mixing addresses and commands. Simple substitution should work:
find . -exec grep -il -E 'Programmer.*Bob' {} \; \
| xargs sed -i '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/'
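A quick way to test the substitution without touching any files (hypothetical sample lines):
printf '%s\n' '# Created by Programmer Bob' 'int main() {}' | sed '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/'
# prints: LINE REPLACED
#         int main() {}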
find . -type f -name "*.cpp"|xargs perl -pi -e 'if(/Programmer/ && /Bob/ && $.>=1 && $.<10){$_="line to replace"}'
sed command (GNU sed):
sed '1,10 {s/programmer\|bob/LINE REPLACED/i;s/programmer\|bob//ig}' file

How to remove first and last folder in 'find' result output?

I want to search for folders by part of their name, which I know and which is common among these kinds of folders. I used the 'find' command in a bash script like this:
find . -type d -name "*.hg"
It just prints out the whole path from the current directory to the found folder itself; the folder name ends with '.hg'. Then I tried to use the 'sed' command, but I couldn't address the last part of the path. I decided to get the folder name ending in .hg, save it in a variable, then use 'sed' to remove the last directory from the output. I used this to get the last part and tried to save the result to a variable, with no luck:
find . -type d -name "*.hg"|sed 's/*.hg$/ /'
find . -type d -name "*.hg"|awk -F/ '{print $NF}
this just print out the file names, here the folder with .hg at the end.
Then I used a different approach:
for i in $(find . -type d -name '*.hg' );
do
$DIR = $(dirname ${i})
echo $DIR
done
This didn't work either. Can anyone give me a hint to make this work?
And yes, it's homework.
You could use parameter expansion:
d=path/to/my/dir
d="${d#*/}" # remove the first dir
d="${d%/*}" # remove the last dir
echo $d # "to/my"
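The loop from the question also works once the assignment is fixed - no spaces around = and no $ on the left-hand side - though the unquoted $(find ...) still word-splits on names containing spaces. A sketch:
for i in $(find . -type d -name '*.hg'); do
    DIR=$(dirname "$i")    # strip the last path component
    echo "$DIR"
done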
One problem you have is the pattern you are using in your sed script - bash and the find command share a pattern language that is different from sed's.
They use a very simple glob language where * means any number of any character and ? means any single character. The sed command uses a much richer regular expression language where * means any number of the previous character and . means any single character (there's a lot more to it than that).
So to remove the last component of the path delivered by find you will need to use the following sed command: sed -e 's,/[^/]*.hg,,'
Alternatively you could use the dirname command: pipe the output of the find command to xargs, which will run a command passing standard input as arguments to that command:
xargs -I{} dirname {}
@Pamador - that's strange. It works for me. Just to explain: the sed command needs to be quoted in single quotes just to protect against any unwanted shell expansions. The character following the 's' is a comma; what we're doing here is changing the character that sed uses to separate the two parts of the substitute command, which means we can use the slash character without having to escape it with a preceding backslash. The next part matches a slash, then any sequence of characters apart from a slash, followed by any character and then hg. Honestly I should have anchored the pattern to the end of the line with a $, but apart from that it's fine.
I tested it with
echo "./abc/xxx.hg" | sed -e 's,/[^/]\.hg$'
And it printed ./abc
Did I misunderstand what you wanted to do?
find . -type d -name "*.hg" | awk -v m=1 -v n=1 'NR<=m{};NR>n+m{print line[NR%n]};{line[NR%n]=$0}'
awk parameters:
m = number of lines to remove from beginning of output
n = number of lines to remove from end of output
Bonus: If you wanted to remove 1 line from the end and you have coreutils installed, you could do this: find . -type d -name "*.hg" | ghead -n -1
