How to remove trailing whitespace of all files recursively? - bash

How can you remove all of the trailing whitespace of an entire project? Starting at a root directory, and removing the trailing whitespace from all files in all folders.
Also, I want to to be able to modify the file directly, and not just print everything to stdout.

Here is an OS X >= 10.6 Snow Leopard solution.
It Ignores .git and .svn folders and their contents. Also it won't leave a backup file.
(export LANG=C LC_CTYPE=C
find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | perl -0ne 'print if -T' | xargs -0 sed -Ei 's/[[:blank:]]+$//'
)
The enclosing parenthesis preserves the L* variables of current shell – executing in subshell.

Use:
find . -type f -print0 | xargs -0 perl -pi.bak -e 's/ +$//'
if you don't want the ".bak" files generated:
find . -type f -print0 | xargs -0 perl -pi -e 's/ +$//'
as a zsh user, you can omit the call to find, and instead use:
perl -pi -e 's/ +$//' **/*
Note: To prevent destroying .git directory, try adding: -not -iwholename '*.git*'.

Two alternative approaches which also work with DOS newlines (CR/LF) and do a pretty good job at avoiding binary files:
Generic solution which checks that the MIME type starts with text/:
while IFS= read -r -d '' -u 9
do
if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
then
sed -i 's/[ \t]\+\(\r\?\)$/\1/' -- "$REPLY"
else
echo "Skipping $REPLY" >&2
fi
done 9< <(find . -type f -print0)
Git repository-specific solution by Mat which uses the -I option of git grep to skip files which Git considers to be binary:
git grep -I --name-only -z -e '' | xargs -0 sed -i 's/[ \t]\+\(\r\?\)$/\1/'

In Bash:
find dir -type f -exec sed -i 's/ *$//' '{}' ';'
Note: If you're using .git repository, try adding: -not -iwholename '.git'.

Ack was made for this kind of task.
It works just like grep, but knows not to descend into places like .svn, .git, .cvs, etc.
ack --print0 -l '[ \t]+$' | xargs -0 -n1 perl -pi -e 's/[ \t]+$//'
Much easier than jumping through hoops with find/grep.
Ack is available via most package managers (as either ack or ack-grep).
It's just a Perl program, so it's also available in a single-file version that you can just download and run. See: Ack Install

This worked for me in OSX 10.5 Leopard, which does not use GNU sed or xargs.
find dir -type f -print0 | xargs -0 sed -i.bak -E "s/[[:space:]]*$//"
Just be careful with this if you have files that need to be excluded (I did)!
You can use -prune to ignore certain directories or files. For Python files in a git repository, you could use something like:
find dir -not -path '.git' -iname '*.py'

ex
Try using Ex editor (part of Vim):
$ ex +'bufdo!%s/\s\+$//e' -cxa **/*.*
Note: For recursion (bash4 & zsh), we use a new globbing option (**/*.*). Enable by shopt -s globstar.
You may add the following function into your .bash_profile:
# Strip trailing whitespaces.
# Usage: trim *.*
# See: https://stackoverflow.com/q/10711051/55075
trim() {
ex +'bufdo!%s/\s\+$//e' -cxa $*
}
sed
For using sed, check: How to remove trailing whitespaces with sed?
find
Find the following script (e.g. remove_trail_spaces.sh) for removing trailing whitespaces from the files:
#!/bin/sh
# Script to remove trailing whitespace of all files recursively
# See: https://stackoverflow.com/questions/149057/how-to-remove-trailing-whitespace-of-all-files-recursively
case "$OSTYPE" in
darwin*) # OSX 10.5 Leopard, which does not use GNU sed or xargs.
find . -type f -not -iwholename '*.git*' -print0 | xargs -0 sed -i .bak -E "s/[[:space:]]*$//"
find . -type f -name \*.bak -print0 | xargs -0 rm -v
;;
*)
find . -type f -not -iwholename '*.git*' -print0 | xargs -0 perl -pi -e 's/ +$//'
esac
Run this script from the directory which you want to scan. On OSX at the end, it will remove all the files ending with .bak.
Or just:
find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;
which is recommended way by Spring Framework Code Style.

I ended up not using find and not creating backup files.
sed -i '' 's/[[:space:]]*$//g' **/*.*
Depending on the depth of the file tree, this (shorter version) may be sufficient for your needs.
NOTE this also takes binary files, for instance.

Instead of excluding files, here is a variation of the above the explicitly white lists the files, based on file extension, that you want to strip, feel free to season to taste:
find . \( -name *.rb -or -name *.html -or -name *.js -or -name *.coffee -or \
-name *.css -or -name *.scss -or -name *.erb -or -name *.yml -or -name *.ru \) \
-print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"

I ended up running this, which is a mix between pojo and adams version.
It will clean both trailing whitespace, and also another form of trailing whitespace, the carriage return:
find . -not \( -name .svn -prune -o -name .git -prune \) -type f \
-exec sed -i 's/[:space:]+$//' \{} \; \
-exec sed -i 's/\r\n$/\n/' \{} \;
It won't touch the .git folder if there is one.
Edit: Made it a bit safer after the comment, not allowing to take files with ".git" or ".svn" in it. But beware, it will touch binary files if you've got some. Use -iname "*.py" -or -iname "*.php" after -type f if you only want it to touch e.g. .py and .php-files.
Update 2: It now replaces all kinds of spaces at end of line (which means tabs as well)

This works well.. add/remove --include for specific file types :
egrep -rl ' $' --include *.c * | xargs sed -i 's/\s\+$//g'

Ruby:
irb
Dir['lib/**/*.rb'].each{|f| x = File.read(f); File.write(f, x.gsub(/[ \t]+$/,"")) }

1) Many other answers use -E. I am not sure why, as that's undocumented BSD compatibility option. -r should be used instead.
2) Other answers use -i ''. That should be just -i (or -i'' if preffered), because -i has the suffix right after.
3) Git specific solution:
git config --global alias.check-whitespace \
'git diff-tree --check $(git hash-object -t tree /dev/null) HEAD'
git check-whitespace | grep trailing | cut -d: -f1 | uniq -u -z | xargs -0 sed --in-place -e 's/[ \t]+$//'
The first one registers a git alias check-whitespace which lists the files with trailing whitespaces.
The second one runs sed on them.
I only use \t rather than [:space:] as I don't typically see vertical tabs, form feeds and non-breakable spaces. Your measurement may vary.

I use regular expressions. 4 steps:
Open the root folder in your editor (I use Visual Studio Code).
Tap the Search icon on the left, and enable the regular expression mode.
Enter " +\n" in the Search bar and "\n" in the Replace bar.
Click "Replace All".
This removes all trailing spaces at the end of each line in all files. And you can exclude some files that don't fit with this need.

This is what works for me (Mac OS X 10.8, GNU sed installed by Homebrew):
find . -path ./vendor -prune -o \
\( -name '*.java' -o -name '*.xml' -o -name '*.css' \) \
-exec gsed -i -E 's/\t/ /' \{} \; \
-exec gsed -i -E 's/[[:space:]]*$//' \{} \; \
-exec gsed -i -E 's/\r\n/\n/' \{} \;
Removed trailing spaces, replaces tabs with spaces, replaces Windows CRLF with Unix \n.
What's interesting is that I have to run this 3-4 times before all files get fixed, by all cleaning gsed instructions.

Related

Changing file content using sed in bash [duplicate]

How do I find and replace every occurrence of:
subdomainA.example.com
with
subdomainB.example.com
in every text file under the /home/www/ directory tree recursively?
find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path .git, but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
For more information, see man find.
The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'
Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.
All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
For me the easiest solution to remember is https://stackoverflow.com/a/2113224/565525, i.e.:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.
cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.
This one is compatible with git repositories, and a bit simpler:
Linux:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
Mac:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
(Thanks to http://blog.jasonmeridth.com/posts/use-git-grep-to-replace-strings-in-files-in-your-git-repository/)
To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also define an --exlude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
A straight forward method if you need to exclude directories (--exclude-dir=..folder) and also might have file names with spaces (solved by using 0Byte for both grep -Z and xargs -0)
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
An one nice oneliner as an extra. Using git grep.
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"
Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: Sometimes you might need to ignore some hidden files i.e. .git, you can use above command.
If you want to include hidden files use,
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both case the string foo will be replaced with new string bar
find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.
I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 subdomainA.example.com | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
Ack-grep is very efficient on finding relevant files. This command replaced ~145 000 files with a breeze whereas others took so long I couldn't wait until they finish.
or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}
grep -lr 'subdomainA.example.com' | while read file; do sed -i "s/subdomainA.example.com/subdomainB.example.com/g" "$file"; done
I guess most people don't know that they can pipe something into a "while read file" and it avoids those nasty -print0 args, while presevering spaces in filenames.
Further adding an echo before the sed allows you to see what files will change before actually doing it.
Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`
According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'
#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done
If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of the matching file. (that would be the $(grep 'string' 'pathname' -Rl).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update
You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/subdomainA.example.com/,"subdomainB.example.com"); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!
For replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
See List files in local git repo? for other options to list all files in a repository. The -z options tells git to separate the file names with a zero byte, which assures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.
A bit old school but this worked on OS X.
There are few trickeries:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Edit the sed command as needed, once you are ready to make changes:
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!
just to avoid to change also
NearlysubdomainA.example.com
subdomainA.example.comp.other
but still
subdomainA.example.com.IsIt.good
(maybe not good in the idea behind domain root)
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/\1subdomainB.example.com\2/g' {} \;
Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which are only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
PATH='your/path/here'
SEARCH=\'subdomainA.example.com\'
REPLACE=\'subdomainB.example.com\'
for file in $( find ${PATH} -P -type f ); do
TEMP_FILE=${file}.${RANDOM}.temp_file
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Caveats:
Solution excludes error handling
Not Bash as tagged by OP
If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'
Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done
perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`
to change multiple files (and saving a backup as *.bak):
perl -p -i -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

issue with piping find into sed (find and replace)

Here is my current code, my goal is to find every file in a given directory (recursively) and replace "FIND" with "REPLACEWITH" and overwrite the files.
FIND='ALEX'
REPLACEWITH='<strong>ALEX</strong>'
DIRECTORY='/some/directory/'
find $DIRECTORY -type f -name "*.html" -print0 |
LANG=C xargs -0 sed -i "s|$FIND|$REPLACEWITH|g"
The error I am getting is:
sed: 1: "/some/directory ...": command a expects \ followed by text
As given in BashFAQ #21, you can use perl to perform search-and-replace operations with no potential for data being treated as code:
in="$FIND" out="$REPLACEWITH" find "$DIRECTORY" -type f -name '*.html' \
-exec perl -pi -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' '{}' +
If you want to include only files matching the FIND string, find can be told to only pass files which grep flags on to perl:
in="$FIND" out="$REPLACEWITH" find "$DIRECTORY" -type f -name '*.html' \
-exec grep -F -q -e "$FIND" '{}' ';' \
-exec perl -pi -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' '{}' +
Because grep is being used to evaluate individual files, it's necessary to use one grep call per file so its exit status can be evaluated on a per-file basis; thus, the use of the less efficient -exec ... {} ';' action. For perl, it's possible to put multiple files to process on one command, hence the use of -exec ... {} +.
Note that fgrep is line-oriented; if your FIND string contains multiple lines, then files with any one of those lines will be passed to perl for replacements.
You can have find invoke sed directly although I think all the modification times on your files will be affected (which might matter or not):
find $DIRECTORY -type f -name "*.html" -exec sed -i "s|$FIND|$REPLACEWITH|g" '{}' ';'

Bash script to change file names recursively

I have a script for changing file names of mht files but it does not traverse through dirs and sub dirs. I asked a question on a local forum and I got an answer that this is a solution:
find . -type f -name "*.mhtml" -o -type f -name "*.mht" | xargs -I item sh -c '{ echo item; echo item | sed "s/[:?|]//g"; }' | xargs -n2 mv
But it generates an error. With some of my experimenting it turns out that sh -c breaks file names with space and that this generates an error. How can I fix this?
#!/bin/bash
# renames.sh
# basic file renamer
for i in . *.mht
do
j=`echo $i | sed 's/|/ /g' | sed 's/:/ /g' | sed 's/?//g' | sed 's/"//g'`
mv "$i" "$j"
done
#! /bin/bash
find . -type f \( -name "*.mhtml" -o -name ".mht" \) -print0 |
while IFS= read -r -d '' source; do
target="${source//[:?|]/}"
[ "X$source" != "X$target" ] &&
mv -nv "$source" "$target"
done
Update: Do the rename according to the original question, and added support for .mht.
Use rename. With rename you can specify a renaming pattern:
find . -type f \( -name "*.mhtml" -o -name "*.mht" \) -print0 | xargs -0 -I'{}' rename 's/[:?|]//g' "{}"
This way you can properly handle names with spaces. xargs will replace {} with every names of file provided by the find command. Also note the use of -print0 and -0. This use a \0 as a separator so its avoid problems dealing with filnames containing \n (newline).
The -o was not working the way it was intended to. you must use parenthesis to group conditions.
You may also consider using -iname instead of -name if you deal with file ending with ".mHtml".

How to run a command recursively on all files except for those under .svn directories

Here is how i run dos2unix recursively on all files:
find -exec dos2unix {} \;
What do i need to change to make it skip over files under .svn/ directories?
Actual tested solution:
$ find . -type f \! -path \*/\.svn/\* -exec dos2unix {} \;
Here's a general script on which you can change the last line as required.
I've taken the technique from my findrepo script:
repodirs=".git .svn CVS .hg .bzr _darcs"
for dir in $repodirs; do
repo_ign="$repo_ign${repo_ign+" -o "}-name $dir"
done
find \( -type d -a \( $repo_ign \) \) -prune -o \
\( -type f -print0 \) |
xargs -r0 \
dos2unix
Just offering an additional tip: piping the result through xargs instead of using find's -exec option will increase the performance when going through a large directory structure if the filtering program accepts multiple arguments, as this will reduce the number of fork()'s, so:
find <opts> | xargs dos2unix
One caveat: piping through xargs will fail horribly if any filenames include whitespace.
In bash
for fic in **/*; dos2unix $fic
Or even better in zsh
for fic in **/*(.); dos2unix $fic
find . -path ./.svn -prune -o -print0 | xargs -0 -i echo dos2unix "{}" "{}"
if you have bash 4.0
shopt -s globstar
shopt -s dotglob
for file in /path/**
do
case "$file" in
*/.svn* )continue;;
esac
echo dos2unix $file $file
done

How can I do a recursive find/replace of a string with awk or sed?

How do I find and replace every occurrence of:
subdomainA.example.com
with
subdomainB.example.com
in every text file under the /home/www/ directory tree recursively?
find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path .git, but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
For more information, see man find.
The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'
Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.
All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
For me the easiest solution to remember is https://stackoverflow.com/a/2113224/565525, i.e.:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.
cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.
This one is compatible with git repositories, and a bit simpler:
Linux:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
Mac:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
(Thanks to http://blog.jasonmeridth.com/posts/use-git-grep-to-replace-strings-in-files-in-your-git-repository/)
To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also define an --exlude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
A straight forward method if you need to exclude directories (--exclude-dir=..folder) and also might have file names with spaces (solved by using 0Byte for both grep -Z and xargs -0)
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
An one nice oneliner as an extra. Using git grep.
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"
Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: Sometimes you might need to ignore some hidden files i.e. .git, you can use above command.
If you want to include hidden files use,
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both case the string foo will be replaced with new string bar
find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.
I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 subdomainA.example.com | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
Ack-grep is very efficient on finding relevant files. This command replaced ~145 000 files with a breeze whereas others took so long I couldn't wait until they finish.
or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}
grep -lr 'subdomainA.example.com' | while read file; do sed -i "s/subdomainA.example.com/subdomainB.example.com/g" "$file"; done
I guess most people don't know that they can pipe something into a "while read file" and it avoids those nasty -print0 args, while presevering spaces in filenames.
Further adding an echo before the sed allows you to see what files will change before actually doing it.
Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`
According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'
#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done
If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of the matching file. (that would be the $(grep 'string' 'pathname' -Rl).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update
You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/subdomainA.example.com/,"subdomainB.example.com"); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!
For replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
See List files in local git repo? for other options to list all files in a repository. The -z options tells git to separate the file names with a zero byte, which assures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.
A bit old school but this worked on OS X.
There are few trickeries:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Edit the sed command as needed, once you are ready to make changes:
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!
just to avoid to change also
NearlysubdomainA.example.com
subdomainA.example.comp.other
but still
subdomainA.example.com.IsIt.good
(maybe not good in the idea behind domain root)
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/\1subdomainB.example.com\2/g' {} \;
Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which are only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
PATH='your/path/here'
SEARCH=\'subdomainA.example.com\'
REPLACE=\'subdomainB.example.com\'
for file in $( find ${PATH} -P -type f ); do
TEMP_FILE=${file}.${RANDOM}.temp_file
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Caveats:
Solution excludes error handling
Not Bash as tagged by OP
If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'
Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done
perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`
to change multiple files (and saving a backup as *.bak):
perl -p -i -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

Resources