How to Grep and Replace With Exclusions - bash

How can I replace some strings in some files recursively, taking some exclusions into account? For example, I don't want to apply the replacement to binary files or files in .svn directories.

This is the solution I'm currently using, perhaps there is a better way?
grep -irl foobar | grep -v .svn | grep -v Binary | xargs sed -i 's/foobar/baz/g'
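One alternative, assuming GNU grep, sed, and xargs: let find prune the .svn directories and let grep's -I flag skip binary files, so no post-filtering of the output is needed:

```shell
# Prune .svn trees and search only regular files; -I treats binary
# files as non-matching, -l lists matching file names, and -Z/-0
# keep the names NUL-separated so whitespace in paths is safe.
find . -name .svn -prune -o -type f -print0 \
  | xargs -0 grep -IlZ 'foobar' \
  | xargs -0 -r sed -i 's/foobar/baz/g'
```

The -r tells GNU xargs not to run sed at all when no file matched.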

How to get "Find" command to ONLY return files and no directories [duplicate]

I'm developing a little desktop application that lists the files in a given directory.
The table on the left gets compared to the table on the right. If anything on the right is missing, the text of the files on the left will turn red. Simple.
What I'm throwing at these tables is a path that contains multiple subfolders of images.
These are saved as a variable so I cannot explicitly declare "omit a directory with this name".
Using this as an example:
I get this output:
And I'd like to get this output:
How do I get the find command to return ONLY the file names? No directory names at all.
Is this possible?
As of now I have:
"find -x "+ path + " | sed 's!.*/!!' | grep -v CaptureOne | grep -v .DS_Store | grep -v .cop | grep -v .cof | grep -v .cot | grep -v .cos | grep -v Settings131 | grep -v Proxies | grep -v Thumbnails | grep -v Cache | sort"
This does get me only the file names, not the full paths, which is what I want, and it filters out the file extensions and folders that I know will be there.
Like I said - I've never gone down this path and the above code could probably be done in a much easier way. Open to suggestions!
To limit yourself to files, use -type f (to exclude directories, you'd use ! -type d). To get just the basename, pipe the output through basename. All together, that'd be:
find . -type f -print0 | xargs -0 -n1 basename
(The -print0 and -0 options use NUL as the separator so that whitespace in pathnames is preserved. The -n1 runs basename once per file; without it, basename would treat a second argument as a suffix to strip.)
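If your find is GNU find, its -printf directive can emit the basename directly, avoiding the basename round trip entirely (a GNU-only sketch):

```shell
# %f prints each file's basename, one per line (GNU find only).
find . -type f -printf '%f\n'
```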

How to caseless grep listing names of files containing ALL of several strings and NONE of another several strings

Frequently I want to generate a list of files meeting some stated condition.
Suppose I want to find all files with a copyright and a main but
without using fcntl or a namespace.
Here is a clumsy approach:
fgrep -i -r -l copyright *|xargs fgrep -i -l main|xargs fgrep -i -l -v fcntl|xargs fgrep -i -l namespace
Does anyone know how to achieve the same result with a more sophisticated approach using standard utilities?
For fun, I have begun to write my own C++17 program to achieve a speedy result but I would love to find my own work unnecessary. Here is my GitHub repository with that code:
https://github.com/jlettvin/Greased-Grep
With (GNU) grep, I would do this as follows:
grep -Flir 'main' . \
| xargs grep -Fli 'copyright' \
| xargs grep -FLi -e 'fcntl' -e 'namespace'
This is quite similar to what you had. To get files not containing a pattern, I use the -L option (you tried -lv – that returns the files that contain at least one line that doesn't match, i.e., typically all files).
For the last step, excluding files that don't match, I can do with just one grep invocation and multiple patterns specified with -e.
To make this more robust and allow for any characters in filenames, you can have grep separate filenames with a NUL byte (-Z) and have xargs expect that (-0):
grep -FlirZ 'main' . \
| xargs -0 grep -FliZ 'copyright' \
| xargs -0 grep -FLi -e 'fcntl' -e 'namespace'

untar the directory structure only

Is it possible to extract only the directory structure from a tar-archive without extracting the files?
I tried to use
tar tvpf archive.tar | nawk '/.*\/$/{ print $NF }' | tar -xpf archive.tar --no-recursion -T -
All directories are shown with a trailing /, so with nawk I print only the lines that end in /. The problem is that tar -T - does not accept pathnames separated by newlines; any whitespace separates the filenames, so it does not work for paths that contain spaces. Any ideas?
Untarring everything, removing the files, and then copying the directory tree to the target is possible, but it is not an acceptable solution. I want a solution without temporary files; a bash solution would be helpful.
Sure, I came across an answer for something similar a while ago.
tar -tf file.tar | grep -v '/$' | sed 's/ /\\ /g' > excluded_files ; tar xfX file.tar excluded_files ; rm -f excluded_files
The first part lists the tar's contents, piping the output to grep to drop directory entries and keep only file names, which are written to a temporary file. Then the tar is extracted, excluding those files, and finally the temporary file is removed.
Edit: added sed to escape spaces in paths.
Update
As suggested in the comments, this is also achievable with no temporary file, using process substitution. Cheers Walter A. Much cleaner.
tar xfX file.tar <(tar -tf file.tar | grep -v '/$' | sed 's/ /\\ /g')
If you can use a bit of C code, you could take libarchive/examples/untar.c and add a single check.
if (archive_entry_filetype(entry) != AE_IFDIR) continue;
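Another route that avoids extracting anything: since tar -t lists directory entries with a trailing /, you can recreate just the tree with mkdir -p. A sketch that survives spaces in paths (newlines in names would still break it):

```shell
# List archive members, keep only directory entries (trailing /),
# and recreate each one; quoting keeps spaces in paths intact.
tar -tf archive.tar | grep '/$' | while IFS= read -r dir; do
  mkdir -p -- "$dir"
done
```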

Force sort command to ignore folder names

I ran the following from a base folder ./
find . -name *.xvi.txt | sort
Which returns the following sort order:
./LT/filename.2004167.xvi.txt
./LT/filename.2004247.xvi.txt
./pred/2004186/filename.2004186.xvi.txt
./pred/2004202/filename.2004202.xvi.txt
./pred/2004222/filename.2004222.xvi.txt
As you can see, the filenames follow a regular structure, but the files themselves might be located in different parts of the directory structure. Is there a way of ignoring the folder names and/or directory structure so that the sort returns a list of folders/filenames based ONLY on the file names themselves? Like so:
./LT/filename.2004167.xvi.txt
./pred/2004186/filename.2004186.xvi.txt
./pred/2004202/filename.2004202.xvi.txt
./pred/2004222/filename.2004222.xvi.txt
./LT/filename.2004247.xvi.txt
I've tried a few different switches under the find and sort commands, but no luck. I could always copy everything out to a single folder and sort from there, but there are several hundred files, and I'm hoping that a more elegant option exists.
Thanks! Your help is appreciated.
If your find has -printf you can print both the base filename and the full filename. Sort by the first field, then strip it off.
find . -name '*.xvi.txt' -printf '%f %p\n' | sort -k1,1 | cut -f 2- -d ' '
I have chosen a space as a delimiter. If your filenames include spaces, you should choose another delimiter which is a character that's not in your filenames. If any filenames include newlines, you'll have to modify this because it won't work.
Note that the glob in the find command should be quoted.
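Following that advice, a tab is usually a safe delimiter (tabs are rare in filenames); the same decorate-sort-strip pipeline using a tab, in bash, might look like:

```shell
# Decorate each path with its basename, sort on that tab-separated
# field, then strip the decoration; cut's default delimiter is tab.
find . -name '*.xvi.txt' -printf '%f\t%p\n' | sort -t $'\t' -k1,1 | cut -f 2-
```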
If your find doesn't have printf, you could use awk to accomplish the same thing:
find . -name '*.xvi.txt' | awk -F / '{ print $NF, $0 }' | sort | sed 's/.* //'
The same caveats about spaces that Dennis Williamson mentioned apply here. And for variety, I'm using sed to strip off the sort field, instead of cut.
find . -name '*.xvi.txt' | sort -t'.' -k3 -n
will sort it as you want. The only problem is if a filename or directory name includes additional dots.
To avoid that you can use:
find . -name '*.xvi.txt' | sed 's/[0-9]\+\.xvi\.txt$/\\&/' | sort -t'\' -k2 | sed 's/\\//'

To show only file name without the entire directory path

ls /home/user/new/*.txt prints all txt files in that directory. However it prints the output as follows:
[me#comp]$ ls /home/user/new/*.txt
/home/user/new/file1.txt /home/user/new/file2.txt /home/user/new/file3.txt
and so on.
I am not running the ls command from the /home/user/new/ directory, so I have to give the full directory name, yet I want the output to be only:
[me#comp]$ ls /home/user/new/*.txt
file1.txt file2.txt file3.txt
I don't want the entire path; only the filename is needed. This issue has to be solved using the ls command, as its output is meant for another program.
ls whateveryouwant | xargs -n 1 basename
Does that work for you?
Otherwise you can (cd /the/directory && ls) (yes, parentheses intended)
No need for xargs and all that; ls is enough if you run it from within the directory:
ls -1 *.txt
This lists one file per line.
There are several ways you can achieve this. One would be something like:
for filepath in /path/to/dir/*
do
filename=$(basename "$filepath")
... whatever you want to do with the file here
done
Use the basename command (with GNU basename, -a is needed to accept multiple names; without it, a second argument is treated as a suffix to strip):
basename -a /home/user/new/*.txt
(cd dir && ls)
will only output filenames in dir. Use ls -1 if you want one per line.
(Changed ; to && as per Sactiw's comment).
you could add an sed script to your commandline:
ls /home/user/new/*.txt | sed -r 's/^.+\///'
A fancy way to solve it is to use "rev" twice, with "cut" in between:
find ./ -name "*.txt" | rev | cut -d '/' -f1 | rev
The selected answer did not work for me, as I had spaces, quotes and other strange characters in my filenames. To quote the input for basename, you should use:
ls /path/to/my/directory | xargs -n1 -I{} basename "{}"
This is guaranteed to work, regardless of what the files are called.
I prefer the base name which is already answered by fge.
Another way is :
ls /home/user/new/*.txt|awk -F"/" '{print $NF}'
one more ugly way is:
ls /home/user/new/*.txt | perl -pe 's|.*/||'
just hoping to be helpful to someone as old problems seem to come back every now and again and I always find good tips here.
My problem was to list in a text file all the names of the "*.txt" files in a certain directory without path and without extension from a Datastage 7.5 sequence.
The solution we used is:
ls /home/user/new/*.txt | xargs -n 1 basename | cut -d '.' -f1 > name_list.txt
There are lots of ways to do that; simply, you can try the following.
ls /home/user/new | grep '\.txt$'
Another method:
cd /home/user/new && ls *.txt
Here is another way:
ls -1 /home/user/new/*.txt|rev|cut -d'/' -f1|rev
You could also pipe to grep and pull everything after the last forward slash. It looks goofy, but I think a defensive grep should be fine unless (like some kind of maniac) you have forward slashes within your filenames.
ls folderpathwithcriteria | grep -P -o -e "[^/]*$"
When you want to list names in a path but they have different file extensions.
me#server:/var/backups$ ls -1 *.zip && ls -1 *.gz
