Os X 10.9 using sed to delete multiple lines in multiple files - text-files

I use terminal in Os X 10.9;
I want to delete the same lines in multiple files in the same directory;
The file in the directory are named 1.txt 2.txt 3.txt etc. (170 files);
The purpose is to delete all the lines except 156 to 698 in every file;
The command look like this:
find . -type f -name "*.txt" | xargs sed -i '156,698!d'
Doesn't work except for the first file ? What is the best way to do this ?
Thanks

You can use the -I option with xargs and do something like this:
find . -type f -name "*.txt" | xargs -I {} sed -i '156,698!d' {}

Related

renaming series of files using xargs

I would like to rename several files picked by find in some directory, then use xargs and mv to rename the files, with parameter expansion. However, it did not work...
example:
mkdir test
touch abc.txt
touch def.txt
find . -type f -print0 | \
xargs -I {} -n 1 -0 mv {} "${{}/.txt/.tx}"
Result:
bad substitution
[1] 134 broken pipe find . -type f -print0
Working Solution:
for i in ./*.txt ; do mv "$i" "${i/.txt/.tx}" ; done
Although I finally got a way to fix the problem, I still want to know why the first find + xargs way doesn't work, since I don't think the second way is very general for similar tasks.
Thanks!
Remember that shell variable substitution happens before your command runs. So when you run:
find . -type f -print0 | \
xargs -I {} -n 1 -0 mv {} "${{}/.txt/.tx}"
The shell tries to expan that ${...} construct before xargs even
runs...and since that contents of that expression aren't a valid shell variable reference, you get an error. A better solution would be to use the rename command:
find . -type f -print0 | \
xargs -I {} -0 rename .txt .tx {}
And since rename can operate on multiple files, you can simplify
that to:
find . -type f -print0 | \
xargs -0 rename .txt .tx

Changing file content using sed in bash [duplicate]

How do I find and replace every occurrence of:
subdomainA.example.com
with
subdomainB.example.com
in every text file under the /home/www/ directory tree recursively?
find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path .git, but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
For more information, see man find.
The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'
Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.
All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
For me the easiest solution to remember is https://stackoverflow.com/a/2113224/565525, i.e.:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.
cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.
This one is compatible with git repositories, and a bit simpler:
Linux:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
Mac:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
(Thanks to http://blog.jasonmeridth.com/posts/use-git-grep-to-replace-strings-in-files-in-your-git-repository/)
To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also define an --exlude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
A straight forward method if you need to exclude directories (--exclude-dir=..folder) and also might have file names with spaces (solved by using 0Byte for both grep -Z and xargs -0)
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
An one nice oneliner as an extra. Using git grep.
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"
Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: Sometimes you might need to ignore some hidden files i.e. .git, you can use above command.
If you want to include hidden files use,
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both case the string foo will be replaced with new string bar
find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.
I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 subdomainA.example.com | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
Ack-grep is very efficient on finding relevant files. This command replaced ~145 000 files with a breeze whereas others took so long I couldn't wait until they finish.
or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}
grep -lr 'subdomainA.example.com' | while read file; do sed -i "s/subdomainA.example.com/subdomainB.example.com/g" "$file"; done
I guess most people don't know that they can pipe something into a "while read file" and it avoids those nasty -print0 args, while presevering spaces in filenames.
Further adding an echo before the sed allows you to see what files will change before actually doing it.
Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`
According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'
#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done
If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of the matching file. (that would be the $(grep 'string' 'pathname' -Rl).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update
You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/subdomainA.example.com/,"subdomainB.example.com"); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!
For replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
See List files in local git repo? for other options to list all files in a repository. The -z options tells git to separate the file names with a zero byte, which assures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.
A bit old school but this worked on OS X.
There are few trickeries:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Edit the sed command as needed, once you are ready to make changes:
for file in $(find . -name *.sls -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!
just to avoid to change also
NearlysubdomainA.example.com
subdomainA.example.comp.other
but still
subdomainA.example.com.IsIt.good
(maybe not good in the idea behind domain root)
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/\1subdomainB.example.com\2/g' {} \;
Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which are only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
PATH='your/path/here'
SEARCH=\'subdomainA.example.com\'
REPLACE=\'subdomainB.example.com\'
for file in $( find ${PATH} -P -type f ); do
TEMP_FILE=${file}.${RANDOM}.temp_file
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Caveats:
Solution excludes error handling
Not Bash as tagged by OP
If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'
Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done
perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`
to change multiple files (and saving a backup as *.bak):
perl -p -i -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

Output of 'cat' to find files with partial filenames

Say I've file1.txt with
ptext1
ptext2
ptext3
ptext4
These are the partial file names (library names) which I'm trying to find from a directory. Something like
cat file1.txt | xargs find . -name "*$0*"
or say,
cat file.txt | awk '{system("find . -name " *$0*)}'
None of them are working.
Please suggest.
I'm sure there is a more elegant way, but you could always loop over and run find on each:
Update to reflect suggestions in comments
while read -r filename; do
find . -type f -name "*$filename*"
done < file1.txt
One way with xargs
xargs -I{} find . -name "*"{}"*" < file

Terminal find, directories last instead of first

I have a makefile that concatenates JavaScript files together and then runs the file through uglify-js to create a .min.js version.
I'm currently using this command to find and concat my files
find src/js -type f -name "*.js" -exec cat {} >> ${jsbuild}$# \;
But it lists files in directories first, this makes heaps of sense but I'd like it to list the .js files in the src/js files above the directories to avoid getting my undefined JS error.
Is there anyway to do this or? I've had a google around and seen the sort command and the -s flag for find but it's a bit above my understanding at this point!
[EDIT]
The final solution is slightly different to the accepted answer but it is marked as accepted as it brought me to the answer. Here is the command I used
cat `find src/js -type f -name "*.js" -print0 | xargs -0 stat -f "%z %N" | sort -n | sed -e "s|[0-9]*\ \ ||"` > public/js/myCleverScript.js
Possible solution:
use find for getting filenames and directory depth, i.e find ... -printf "%d\t%p\n"
sort list by directory depth with sort -n
remove directory depth from output to use filenames only
test:
without sorting:
$ find folder1/ -depth -type f -printf "%d\t%p\n"
2 folder1/f2/f3
1 folder1/file0
with sorting:
$ find folder1/ -type f -printf "%d\t%p\n" | sort -n | sed -e "s|[0-9]*\t||"
folder1/file0
folder1/f2/f3
the command you need looks like
cat $(find src/js -type f -name "*.js" -printf "%d\t%p\n" | sort -n | sed -e "s|[0-9]*\t||")>min.js
Mmmmm...
find src/js -type f
shouldn't find ANY directories at all, and doubly so as your directory names will probably not end in ".js". The brackets around your "-name" parameter are superfluous too, try removing them
find src/js -type f -name "*.js" -exec cat {} >> ${jsbuild}$# \;
find could get the first directory level already expanded on commandline, which enforces the order of directory tree traversal. This solves the problem just for the top directory (unlike the already accepted solution by Sergey Fedorov), but this should answer your question too and more options are always welcome.
Using GNU coreutils ls, you can sort directories before regular files with --group-directories-first option. From reading the Mac OS X ls manpage it seems that directories are grouped always in OS X, you should just drop the option.
ls -A --group-directories-first -r | tac | xargs -I'%' find '%' -type f -name '*.js' -exec cat '{}' + > ${jsbuild}$#
If you do not have the tac command, you could easily implement it using sed. It reverses the order of lines. See info sed tac of GNU sed.
tac(){
sed -n '1!G;$p;h'
}
You could do something like this...
First create a variable holding the name of our output file:
OUT="$(pwd)/theLot.js"
Then, get all "*.js" in top directory into that file:
cat *.js > $OUT
Then have "find" grab all other "*.js" files below current directory:
find . -type d ! -name . -exec sh -c "cd {} ; cat *.js >> $OUT" \;
Just to explain the "find" command, it says:
find
. = starting at current directory
-type d = all directories, not files
-! -name . = except the current one
-exec sh -c = and for each one you find execute the following
"..." = go to that directory and concatenate all "*.js" files there onto end of $OUT
\; = and that's all for today, thank you!
I'd get the list of all the files:
$ find src/js -type f -name "*.js" > list.txt
Sort them by depth, i.e. by the number of '/' in them, using the following ruby script:
sort.rb:
files=[]; while gets; files<<$_; end
files.sort! {|a,b| a.count('/') <=> b.count('/')}
files.each {|f| puts f}
Like so:
$ ruby sort.rb < list.txt > sorted.txt
Concatenate them:
$ cat sorted.txt | while read FILE; do cat "$FILE" >> output.txt; done
(All this assumes that your file names don't contain newline characters.)
EDIT:
I was aiming for clarity. If you want conciseness, you can absolutely condense it to something like:
find src/js -name '*.js'| ruby -ne 'BEGIN{f=[];}; f<<$_; END{f.sort!{|a,b| a.count("/") <=> b.count("/")}; f.each{|e| puts e}}' | xargs cat >> concatenated

Run script against all txt files in directory and sub directories - BASH

What im trying to do is something along the lines of(this is pseudocode):
for txt in $(some fancy command ./*.txt); do
some command here $txt
You can use find:
find /path -type f -name "*.txt" | while read txt; do
echo "$txt"; # Do something else
done
Use the -exec option to find:
find /usr/share/wordlists/*/* -type f -name '*.txt' -exec yourScript {} \;
Try
find . | grep ".txt" | xargs -I script.sh {}
find returns all files in the directory. grep selects only .txt files and xargs sends the file as Parameter to script.sh

Resources