From UNIX shell, how to find all files containing a specific string, then print the 4th line of each file?

I want to find all files within the current directory that contain a given string, then print just the 4th line of each file.

grep --null -l "$yourstring" * | # List all the files containing your string
xargs -0 -n 1 sed -n '4{p;q;}' # Print the fourth line of each file (one sed run per file).
Different editions of grep have slightly different incantations of --null, but it's usually there in some form. Read your manpage for details.
Update: I believe one of grep's null-delimited file-list options is a reasonable solution that covers the vast majority of real-world use cases. To be entirely portable, though: if your version of grep does not support any null-delimited output, it is not perfectly safe to use with xargs, so you must resort to find.
find . -maxdepth 1 -type f -exec grep -q "$yourstring" {} \; -exec sed -n '4{p;q;}' {} \;
Because find arguments can almost all be used as predicates, the -exec grep -q… part filters the files that are eventually fed to sed down to only those that contain the required string.
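The predicate behaviour described above can be wrapped in a small function. This is a minimal sketch, not a definitive implementation: the function name fourth_line_of_matches is made up for illustration, the search string is treated as a fixed string (grep -F), and -maxdepth assumes GNU or BSD find.

```shell
# fourth_line_of_matches STRING [DIR]   (hypothetical helper name)
# Print line 4 of every regular file directly inside DIR (default: .)
# that contains the fixed string STRING.
fourth_line_of_matches() {
    string=$1
    dir=${2:-.}
    # The first -exec acts as a test: find only moves on to the second
    # -exec for files where grep -q exits 0 (string found).
    # sed runs once per file, so "line 4" is counted per file.
    find "$dir" -maxdepth 1 -type f \
        -exec grep -qF -e "$string" {} \; \
        -exec sed -n '4{p;q;}' {} \;
}
```

Call it as: fourth_line_of_matches "$yourstring" .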

From other user:
grep -Frl string . | xargs -n 1 sed -n 4p

Try the GNU find command below:
find . -maxdepth 1 -type f -exec grep -l 'yourstring' {} \; | xargs -I {} awk 'NR==4{print; exit}' {}
It finds all the files in the current directory that contain the given string and prints line 4 of each file.

This while loop should work:
while read -d '' -r file; do
echo -n "$file: "
sed '4q;d' "$file"
done < <(grep --null -l "some-text" *.txt)


how to find every file in my repo that has a specific word in the last line?

In other words, how to combine tail and find/grep command in bash.
I want to find all the files (including the files in subdirectories) in my repo that have a specific word in the last line, say FIX. I tried grep -Rl "FIX" to display all the files containing "FIX", but I don't know how to combine the tail command with it. Can anyone help?
Run tail on all the files at once and then grep the output for FIX. When given multiple file names, tail prints a ==> name <== header before each file's output, so asking grep for one line of leading context (-B1) shows which file each matching last line came from.
find -type f -exec tail -n1 {} + | grep -B1 FIX
Or use ** to find all files and subdirectories, then run tail on each of them one at a time:
shopt -s globstar
for file in **; do
[[ -f $file ]] && tail -n1 "$file" | grep -q FIX && echo "$file"
done
Or use find to find all matches and pipe it to a while read loop:
find -type f -print0 | while IFS= read -rd '' file; do
tail -n1 "$file" | grep -q FIX && echo "$file"
done
Or do the same thing but with -exec + and an explicit sub-shell:
find -type f -exec sh -c 'for file; do tail -n1 "$file" | grep -q FIX && echo "$file"; done' sh {} +
If you want to know if the last line matches a pattern, use sed and restrict the match to the last line with $. sed doesn't easily give a return value or do pretty printing of the filename like grep, but it gets the job done.
find . -exec sh -c "sed -n '$ { /FIX/p; }' {} | grep -q . " \; -print
Here, we use -n to suppress printing, and then print (with /p) only when the last line matches the pattern /FIX/. The output is piped to grep to get a return value that find uses to decide whether or not to -print the name.
Or, you can avoid using grep for the return by doing something like:
find . -exec awk 'END{ exit ! match($0, "FIX")}' {} \; -print
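The same last-line test can be sketched as a reusable function. This is only an illustration under stated assumptions: the name last_line_matches is hypothetical, and the pattern is taken to be a basic regular expression. Unlike the one-liner with {} inside the sh -c string, the file name is passed as a positional argument.

```shell
# last_line_matches PATTERN [DIR]   (hypothetical helper name)
# Print the name of every regular file under DIR whose last line
# matches the basic regular expression PATTERN.
last_line_matches() {
    pattern=$1
    dir=${2:-.}
    # sed -n '$p' prints only the last line; grep -q turns "did it match?"
    # into an exit status that find uses to decide whether to -print.
    # The file name arrives as "$2" instead of being spliced into the
    # sh -c string, so unusual names are not re-parsed by the shell.
    find "$dir" -type f \
        -exec sh -c 'sed -n "\$p" "$2" | grep -q -e "$1"' sh "$pattern" {} \; \
        -print
}
```

Call it as: last_line_matches FIX .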

Changing file content using sed in bash [duplicate]

How do I find and replace every occurrence of:
in every text file under the /home/www/ directory tree recursively?
find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path .git, but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
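Before running a destructive sed -i over a tree, it can help to list which files the -prune / -print0 pipeline would actually touch. The following is a minimal sketch; the function name list_matches_outside_git is an assumption made for this example, and it only lists files, it does not edit them.

```shell
# list_matches_outside_git PATTERN [DIR]   (hypothetical helper name)
# List files under DIR that contain PATTERN, never descending into .git.
list_matches_outside_git() {
    pattern=$1
    dir=${2:-.}
    # -prune stops find from entering any directory named .git;
    # -o ("or") applies the right-hand side to everything else.
    # -print0 and xargs -0 keep names with spaces or newlines intact.
    find "$dir" \( -type d -name .git -prune \) -o -type f -print0 |
        xargs -0 grep -l -e "$pattern"
}
```

Once the list looks right, the grep -l stage can be swapped for the sed command.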
For more information, see man find.
The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'
Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.
All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
For me the easiest solution to remember is:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.
cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/'
For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.
This one is compatible with git repositories, and a bit simpler:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
On macOS (BSD sed), -i needs an explicit, possibly empty, backup suffix:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also pass an --exclude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
A straightforward method if you need to exclude directories (--exclude-dir=.folder) and might also have file names with spaces (solved by using a NUL byte as the separator with both grep -Z and xargs -0):
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
One more nice one-liner as an extra, using git grep:
git grep -lz '' | xargs -0 perl -i'' -pE "s/"
Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: To skip hidden files and directories such as .git, use the command above.
If you want to include hidden files, use:
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both cases the string foo will be replaced with the new string bar.
find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.
I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/'
Ack-grep is very efficient on finding relevant files. This command replaced ~145 000 files with a breeze whereas others took so long I couldn't wait until they finish.
or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}
grep -lr '' | while read file; do sed -i "s/" "$file"; done
I guess most people don't know that they can pipe something into a "while read file" loop, which avoids those nasty -print0 args while preserving spaces in filenames.
Further adding an echo before the sed allows you to see what files will change before actually doing it.
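The echo trick can be sketched like this. It is an illustration only: preview_replace is a made-up name, and the sketch assumes OLD and NEW contain no "/" and that file names contain no newlines.

```shell
# preview_replace OLD NEW [DIR]   (hypothetical helper name)
# Print the sed command that would run for each file containing OLD,
# without modifying anything.
preview_replace() {
    old=$1
    new=$2
    dir=${3:-.}
    grep -rl -e "$old" "$dir" |
        while IFS= read -r file; do
            # echo shows the command instead of running it; remove the
            # echo once the list looks right.
            echo sed -i "s/$old/$new/g" "$file"
        done
}
```

Using IFS= and read -r keeps leading or trailing whitespace and backslashes in file names intact.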
Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`
According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'
#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done
If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of each matching file (that would be $(grep 'string' 'pathname' -Rl)).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update
You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/,""); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!
To replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/'
See List files in local git repo? for other options to list all files in a repository. The -z option tells git to separate the file names with a zero byte, which ensures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.
A bit old school but this worked on OS X.
There are a few tricks:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Edit the sed command as needed, once you are ready to make changes:
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!
just to avoid to change also
but still
(maybe not good in the idea behind domain root)
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/\\2/g' {} \;
Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which is only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/'
For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
for file in $( find ${PATH} -P -type f ); do
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Solution excludes error handling
Not Bash as tagged by OP
If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/'
Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done
perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`
to change multiple files (and saving a backup as *.bak):
perl -p -i.bak -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

how to grep large number of files?

I am trying to grep 40k files in the current directory and i am getting this error.
for i in $(cat A01/genes.txt); do grep $i *.kaks; done > A01/A01.result.txt
-bash: /usr/bin/grep: Argument list too long
How does one normally grep thousands of files?
This makes David sad...
Everyone so far is wrong (except for anubhava).
Shell scripting is not like any other programming language because much of the interpretation of lines comes from the power of the shell interpolating them before the command is actually executed.
Let's take something simple:
$ set -x
$ ls
+ ls
bar.txt foo.txt fubar.log
$ echo The text files are *.txt
echo The text files are *.txt
> echo The text files are bar.txt foo.txt
The text files are bar.txt foo.txt
$ set +x
The set -x allows you to see how the shell actually interpolates the glob and then passes that back to the command as input. The > points to the line that is actually being executed by the command.
You can see that the echo command isn't interpreting the *. Instead, the shell grabs the * and replaces it with the names of the matching files. Then and only then does the echo command actually run.
When you have 40K plus files, and you do grep *, you're expanding that * to the names of those 40,000 plus files before grep even has a chance to execute, and that's where the error message /usr/bin/grep: Argument list too long is coming from.
Fortunately, Unix has a way around this dilemma:
$ find . -name "*.kaks" -type f -maxdepth 1 | xargs grep -f A01/genes.txt
The find . -name "*.kaks" -type f -maxdepth 1 will find all of your *.kaks files, and the -maxdepth 1 will only include files in the current directory. The -type f makes sure you only pick up files and not directories.
The find command pipes the names of the files into xargs, and xargs appends the names to the grep -f A01/genes.txt command. However, xargs has a trick up its sleeve. It knows how long the command line buffer is, and will execute the grep when the command line buffer is full, then pass in another series of files to the grep. This way, grep gets executed maybe three or ten times (depending upon the size of the command line buffer), and all of our files are used.
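The batching can be seen on a small scale with a sketch like the one below. Here -n 2 is used only to force tiny batches for demonstration; without it, xargs fills each command line up to the system limit. The function name batch_demo is made up for this example.

```shell
# batch_demo   (hypothetical helper name)
# Feed five names to xargs with at most two arguments per invocation,
# so echo is run three times instead of once.
batch_demo() {
    printf '%s\n' a b c d e | xargs -n 2 echo
}
```

Each output line corresponds to one separate run of echo.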
Unfortunately, xargs uses whitespace as a separator for the file names. If your files contain spaces or tabs, you'll have trouble with xargs. Fortunately, there's another fix:
$ find . -name "*.kaks" -type f -maxdepth 1 -print0 | xargs -0 grep -f A01/genes.txt
The -print0 will cause find to print out the names of the files separated not by newlines, but by the NUL character. The -0 parameter for xargs tells xargs that the file separator isn't whitespace, but the NUL character. This fixes the issue.
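As a sketch of the NUL-separated pipeline in reusable form (the function name search_kaks and its arguments are assumptions for illustration; -maxdepth, -print0, -0 and -H assume GNU or BSD tools):

```shell
# search_kaks GENES_FILE [DIR]   (hypothetical helper name)
# Search every *.kaks file directly inside DIR for any pattern listed
# in GENES_FILE, printing "file:matching line".
search_kaks() {
    genes=$1
    dir=${2:-.}
    # -print0 and -0 pass names as NUL-terminated strings, so a name
    # such as "sample one.kaks" reaches grep as a single argument.
    # -H forces the file name prefix even if a batch holds one file.
    find "$dir" -maxdepth 1 -type f -name '*.kaks' -print0 |
        xargs -0 grep -H -f "$genes"
}
```

Call it as: search_kaks A01/genes.txt . > A01/A01.result.txt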
You could also do this:
$ find . -name "*.kaks" -type f -maxdepth 1 -exec grep -f A01/genes.txt {} \;
This executes grep once for each and every file found, whereas xargs runs grep once per batch of as many files as it can stuff on the command line. The advantage of this is that it avoids shell interference entirely. However, it may or may not be less efficient.
What would be interesting is to experiment and see which one is more efficient. You can use time to see:
$ time find . -name "*.kaks" -type f -maxdepth 1 -exec grep -f A01/genes.txt {} \;
This will execute the command and then tell you how long it took. Try it with the -exec and with xargs and see which is faster. Let us know what you find.
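Besides timing, the difference between the two -exec terminators can be made visible by counting how often the command is started. This is a minimal sketch with a hypothetical function name, count_runs; it runs a trivial sh command in place of grep.

```shell
# count_runs [DIR]   (hypothetical helper name)
# Report how many times find starts the command with "\;" (once per
# file) and with "+" (once per batch of files).
count_runs() {
    dir=${1:-.}
    # Each started command prints one line; wc -l counts them.
    # tr strips the padding some wc implementations add.
    each=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d '[:space:]')
    batch=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d '[:space:]')
    echo "each=$each batch=$batch"
}
```

With only a handful of files, the "+" form starts the command a single time, and file names with spaces are still passed intact in both forms.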
You can combine find with grep like this:
find . -maxdepth 1 -name '*.kaks' -exec grep -H -f A01/genes.txt '{}' \; > A01/A01.result.txt
you can use recursive feature of grep:
for i in $(cat A01/genes.txt); do
grep -r $i .
done > A01/A01.result.txt
though if you want to select only kaks files:
for i in $(cat A01/genes.txt); do
find . -iregex '.*\.kaks$' -exec grep $i {} \;
done > A01/A01.result.txt
Put another for loop inside your outer one:
for f in *.kaks; do
grep -H $i "$f"
done
By the way, are you interested in finding EVERY occurrence in each file, or merely whether the search string exists in there one or more times? If it is "good enough" to know the string occurs in there one or more times you can specify "-m 1" to grep and it will not bother reading/searching the rest of the file after finding the first match, which could potentially save lots of time.
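Stopping at the first match could look like the sketch below. It assumes a grep that supports -m (GNU and BSD grep do; it is not required by POSIX), and first_hit is a made-up function name.

```shell
# first_hit PATTERN FILE...   (hypothetical helper name)
# Print at most one matching line per file, prefixed with the file name.
first_hit() {
    pattern=$1
    shift
    for f in "$@"; do
        # -m 1 makes grep stop reading the file after the first match.
        grep -H -m 1 -e "$pattern" "$f"
    done
}
```

Call it as: first_hit "$i" *.kaks (the glob is expanded for a shell function, not for an external command, so the argument-list limit does not apply).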
The following solution has worked for me:
grep -r "example\.com" *
-bash: /bin/grep: Argument list too long
grep -r "example\.com" .
["In newer versions of grep you can omit the “.“, as the current directory is implied."]
Reinlick, J.

Displaying the result of find / replace over multiple documents on bash

I love to use the following command to do find / replace across multiple files in bash:
find -wholename "*.txt" -print | xargs sed -i 's/foo/bar/g'
However, the above command processes everything silently, and sometimes I would like it to print all the changes it made so I can double-check that I did everything correctly. How should I change the command to print that information? I tried the -v argument to xargs, but it gives me an invalid option error.
You can do something like:
find -wholename "*.txt" | xargs sed -n '/foo/p;s/foo/bar/gp'
What this will do is print the line that you wish to substitute and print the substitution in the next line.
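As a small sketch of that before/after output (show_changes is a hypothetical name; it assumes OLD and NEW contain no "/" and does not modify the file):

```shell
# show_changes OLD NEW FILE   (hypothetical helper name)
# For every line containing OLD, print the original line followed by
# the line as it would look after the substitution.
show_changes() {
    old=$1
    new=$2
    file=$3
    # -n suppresses normal output; /OLD/p prints the untouched line,
    # and the p flag on s///g prints it again after substitution.
    sed -n "/$old/p;s/$old/$new/gp" "$file"
}
```

Lines without a match produce no output at all.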
You can use awk and get filename as well:
find -wholename "*.txt" | xargs awk '/foo/{print FILENAME; gsub(/foo/,"bar");print}'
To print entire file remove print and add 1
find -wholename "*.txt" | xargs awk '/foo/{print FILENAME; gsub(/foo/,"bar")}1'
The regex will have to be modified to suit your requirements. Note that in-place editing is only available in gawk 4.1 and later (via -i inplace).
$ head file*
==> file1 <==
==> file2 <==
$ find . -name "file*" -print | xargs awk '/user1/{print FILENAME; gsub(/user1/,"TESTING");print}'
In order to see the differences you can redirect the output of sed to a new file for every input file and compare it with the original.
for i in `find -wholename "*.txt"`; do
sed 's/foo/bar/g' ${i} > ${i}.new;
diff -u ${i} ${i}.new;
done
If the changes seem ok, move the new files to their original names.
for i in `find -wholename "*.new"` ; do
mv ${i} ${i%.new};
done
All can be done with find and sed. Only a little modification needed:
find -path "*.txt" -exec sed -i.bak 's/foo/bar/g' {} +
This calls sed with as many files as possible per invocation (mind the + at the end of -exec), so xargs is not needed. sed -i.bak edits in place and keeps the original file with a .bak suffix, so you can check the differences later if needed.
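Checking the differences afterwards can be sketched as follows. The function name replace_and_diff is an assumption for this example, the sketch only looks at *.txt files, and it assumes OLD and NEW contain no "/".

```shell
# replace_and_diff OLD NEW [DIR]   (hypothetical helper name)
# Replace OLD with NEW in every *.txt file under DIR, keeping a .bak
# copy of each original, then print a unified diff per file.
replace_and_diff() {
    old=$1
    new=$2
    dir=${3:-.}
    find "$dir" -type f -name '*.txt' -exec sed -i.bak "s/$old/$new/g" {} +
    # Compare every backup with its edited counterpart. diff exits 1
    # when the files differ, which is expected here, hence "|| :".
    find "$dir" -type f -name '*.txt.bak' -exec sh -c '
        for bak; do
            diff -u "$bak" "${bak%.bak}" || :
        done' sh {} +
}
```

Once the diffs look right, the .bak files can be deleted; to undo, move them back over the edited files.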
In man find one can read:
-wholename pattern
See -path. This alternative is less portable than -path.

