How do I grab the filename of the file containing a certain string when there are hundreds of files? - shell

I have a folder with 200 files in it. We can say that the files are named "abc0" to "abc199". Five of these files contain the string "ez123" but I don't know which ones. My current attempt to find the file names of the files that contain the string is:
#!/bin/sh
while read FILES
do
cat $FILES | egrep "ez123"
done
I have a file that contains the filenames of all files in the directory. So I then execute:
./script < filenames
This verifies for me that files containing the string exist, but I still don't have the names of the files. Are there any ideas concerning the best way to accomplish this?
Thanks

you can try
grep -l "ez123" abc*

find /directory -maxdepth 1 -type f -exec fgrep -l 'ez123' \{\} \;
(-maxdepth 1 is only necessary if you want to search just that directory and not recurse into any subdirectories.)
fgrep (the same as grep -F) treats the pattern as a fixed string rather than a regular expression, which can be a bit faster than grep. -l lists the matched filenames only.

Try
find -type f -exec grep -qs "ez123" {} \; -print
This will use find to find all real files in current directory (and subdirectories), execute grep on them ({} will be replaced by file name, -qs tells it to be silent and just set an exit code), -print will print out the names of the files that grep found a matching line in.

What about:
xargs egrep -l ez123
That reads filenames from stdin and prints out the filenames with matches.
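The answers above can be tried safely against a throwaway directory. This is a minimal sketch: the file contents and the choice of which five files hold the string are invented for the demo; only the names abc0 to abc199 and the string ez123 come from the question.

```shell
#!/bin/sh
# Build a scratch directory with 200 files; the contents are made up.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 200 ]; do
    echo "nothing here" > "$dir/abc$i"
    i=$((i + 1))
done
# Plant the string in five files (an arbitrary choice for the demo).
for n in 3 42 77 150 199; do
    echo "line with ez123 inside" >> "$dir/abc$n"
done

# -l prints only the names of matching files; -F treats the pattern as a fixed string.
matches=$(cd "$dir" && grep -lF "ez123" abc* | sort)
echo "$matches"

# Same result when the names come from a list on stdin, as in the question.
ls "$dir" > "$dir.list"
from_list=$(cd "$dir" && xargs grep -lF "ez123" < "$dir.list" | sort)

count=$(printf '%s\n' "$matches" | grep -c .)
rm -rf "$dir" "$dir.list"
```

Both variants print file names rather than matching lines, which is the piece the original loop was missing.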

Related

Recursively go through directory and copy files in unix

I need to recursively go through a directory and all subdirectories and copy all files in them to another, empty directory. "From" is the directory I want to go through, copying all files to "to". Currently it just copies all files from "from" to "to" and doesn't go into the subdirectories. Is there any way this will work, or maybe some other way of doing this?
#!/bin/bash
from=$1
to=$2
list=`ls -lR $from | grep '^-' | awk '{ print $10; }'`
for i in $list
do
cp $from/$i $to/$i
done
You can use https://www.shellcheck.net/ to check your syntax, and learn about indentation. It will suggest some modifications to your code. Excellent suggestions also by @KamilCuk in the comments.
When you need to do something recursively in sub-directories, you can always use find with option -exec. You use {} to specify that the command should be applied to each file (or directory) that matches your find options.
This would do the trick: find "$from" -type f -name "*.txt" -exec cp {} "$to" \; -print
-type f: find all files
-exec cp {} $to \;: for each file that is found ({}), copy it to the $to directory. \; delimits the end of the command for option -exec.
-name "*.txt": if you want to filter by file name (here the extension .txt must be present to match).
-print: if you want to see which files are copied, as it goes through your directories.
Note also that find will work even if you have a huge number of sub-directories, or a very large number of files in them. ls sometimes fails, or takes an enormously long time to output large number of files.
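A minimal sketch of the find -exec cp approach on a scratch tree. The directory layout and file names are invented for the demo; from and to are the variable names used in the question.

```shell
#!/bin/sh
# Scratch source tree and empty destination; names are invented for the demo.
from=$(mktemp -d)
to=$(mktemp -d)
mkdir -p "$from/a/b" "$from/c"
echo one   > "$from/top.txt"
echo two   > "$from/a/mid.txt"
echo three > "$from/a/b/deep.txt"
echo skip  > "$from/c/notes.log"

# Copy every regular .txt file, at any depth, flat into "$to".
# Quoting "$from" and "$to" keeps paths with spaces intact.
find "$from" -type f -name "*.txt" -exec cp {} "$to" \; -print

copied=$(ls "$to" | sort | tr '\n' ' ')
echo "copied: $copied"
rm -rf "$from" "$to"
```

Because everything lands in one flat directory, two files with the same base name in different subdirectories would overwrite each other; that is worth checking before running this on real data.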

How to remove files from a directory if their names are not in a text file? Bash script

I am writing a bash script and want it to tell me if the names of the files in a directory appear in a text file and if not, remove them.
Something like this:
counter = 1
numFiles = ls -1 TestDir/ | wc -l
while [$counter -lt $numFiles]
do
if [file in TestDir/ not in fileNames.txt]
then
rm file
fi
((counter++))
done
So what I need help with is the if statement, which is still pseudo-code.
You can simplify your script logic a lot :
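#!/bin/bash
# Loop over every entry in TestDir
for file in TestDir/*
do
# grep -q exits non-zero when the name is not listed in the text file, so we delete the file.
# -x matches the whole line, -F treats the name as a fixed string.
# Assumes fileNames.txt holds bare names without the TestDir/ prefix.
grep -qxF -- "${file##*/}" fileNames.txt || rm -- "$file"
done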
It looks like you've got a solution that works, but I thought I'd offer this one as well, as it might still be of help to you or someone else.
find /Path/To/TestDir -type f ! -name '.*' -exec basename {} \; | grep -xvF -f /Path/To/filenames.txt
Breakdown
find: This gets file paths in the specified directory (which would be TestDir) that match the given criteria. In this case, I've specified it return only regular files (-type f) whose names don't start with a period (! -name '.*'). It then uses its -exec action to run the next command on each result:
basename: Given a file path (which is what find spits out), it will return the base filename only, or, more specifically, everything after the last /.
|: This is a command pipe, that takes the output of the previous command to use as input in the next command.
grep: This is a regular-expression matching utility that, in this case, is given two lists of files: one fed in through the pipe from find—the files of your TestDir directory; and the files listed in filenames.txt. Ordinarily, the filenames in the text file would be used to match against filenames returned by find, and those that match would be given as the output. However, the -v flag inverts the matching process, so that grep returns those filenames that do not match.
What results is a list of files that exist in the directory TestDir, but do not appear in the filenames.txt file. These are the files you wish to delete, so you can simply use this line of code inside a command substitution $(...) to supply rm with the files it should delete.
The full command chain—after you cd into TestDir—looks like this:
rm $(find . -type f ! -name '.*' -exec basename {} \; | grep -xvF -f filenames.txt)
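The same pipeline can feed a read loop instead of a command substitution, so that names containing spaces reach rm intact. This is a sketch on a scratch directory: the file names and list contents are invented, and -maxdepth 1 is added here as an assumption so that only the top level is considered.

```shell
#!/bin/sh
# Scratch directory and list file; names are invented for the demo.
dir=$(mktemp -d)
list=$(mktemp)
touch "$dir/keep1" "$dir/keep two" "$dir/stale1" "$dir/stale2"
printf '%s\n' "keep1" "keep two" > "$list"

# Print base names, drop those found verbatim in the list
# (-x whole line, -F fixed string, -v invert), and remove the rest one by one.
find "$dir" -maxdepth 1 -type f ! -name '.*' -exec basename {} \; |
    grep -xvF -f "$list" |
    while IFS= read -r name; do
        rm -- "$dir/$name"
    done

ls "$dir"
```

Swapping rm for echo in the loop gives a dry run that shows what would be deleted.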

Get all occurrences of a string within a directory(including subdirectories) in .gz file using bash?

I want to find all the occurrences of "getId" inside a directory which has subdirectories as follows:
*/*/*/*/*/*/myfile.gz
I tried this: find -name *myfile.gz -print0 | xargs -0 zgrep -i "getId" but it didn't work. Can anyone tell me the best and simplest approach to get this?
find ./ -name '*gz' -exec zgrep -aiH 'getSorById' {} \;
find allows you to execute a command on each file using "-exec"; it replaces "{}" with the file name, and you terminate the command with "\;".
I added "-H" to zgrep so it also prints the file path when it has a match, as it's helpful. "-a" treats binary files as text (since you might get tar-ed gzipped files).
Lastly, it's best to quote your strings in case bash starts globbing them.
https://linux.die.net/man/1/grep
https://linux.die.net/man/1/find
Use the following find approach:
find . -name '*myfile.gz' -exec zgrep -ai 'getSORByID' {} \;
This will print all possible lines containing getSORByID substring
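A small sketch of the find plus zgrep combination on a scratch tree. The directory layout and file contents are invented for the demo; the search string getId is the one from the question, and the sketch assumes zgrep and gzip are installed.

```shell
#!/bin/sh
# Scratch tree with gzip-compressed files; paths and contents are invented.
root=$(mktemp -d)
mkdir -p "$root/a/b/c" "$root/x/y"
printf 'int getId() { return id; }\n' | gzip > "$root/a/b/c/myfile.gz"
printf 'nothing relevant\n'           | gzip > "$root/x/y/myfile.gz"

# Quote the glob so find expands it, not the shell.
# -H prefixes each matching line with the path of the file it came from.
hits=$(find "$root" -name '*myfile.gz' -exec zgrep -iH 'getId' {} \;)
echo "$hits"
rm -rf "$root"
```

Only the file under a/b/c should be reported, with its path in front of the matching line.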

Copying list of files to a directory

I want to make a search for all .fits files that contain a certain text in their name and then copy them to a directory.
I can use a command called fetchKeys to list the files that contain say 'foo'
The command looks like this : fetchKeys -t 'foo' -F | grep .fits
This returns a list of .fits files that contain 'foo'. Great! Now I want to copy all of these to a directory /path/to/dir. There are too many files to do individually , I need to copy them all using one command.
I'm thinking something like:
fetchKeys -t 'foo' -F | grep .fits > /path/to/dir
or
cp fetchKeys -t 'foo' -F | grep .fits /path/to/dir
but of course neither of these works. Any other ideas?
If this is on Linux/Unix, can you use the find command? That seems very much like fetchkeys.
$ find . -name "*foo*.fits" -type f -print0 | while IFS= read -r -d $'\0' file
do
# fits_dir must be set to the destination directory beforehand
basename=$(basename "$file")
cp "$file" "$fits_dir/$basename"
done
The find command will find all files that match *foo*.fits in their name. The -type f says they have to be files and not directories. The -print0 means print out the files found, but separate them with the NUL character. Normally, the find command will simply return a file on each line, but what if the file name contains spaces, tabs, new lines, or even other strange characters?
The -print0 will separate out files with nulls (\0), and the read -d $'\0' file means to read in each file separating by these null characters. If your files don't contain whitespace or strange characters, you could do this:
$ find . -name "*foo*.fits" -type f | while read -r file
do
basename=$(basename "$file")
cp "$file" "$fits_dir/$basename"
done
Basically, you read each file found with your find command into the shell variable file. Then, you can use that to copy that file into your $fits_dir or where ever you want.
Again, maybe there's a reason to use fetchKeys, and it is possible to replace that find with fetchKeys, but I don't know that fetchKeys command.
Copy all files with the name containing foo to a certain directory:
find . -name "*foo*.fits" -type f -exec cp {} "/path/to/dir/" \;
Copy all files themselves containing foo to a certain directory (solution without xargs):
for f in `find . -type f -exec grep -l foo {} \;`; do cp "$f" /path/to/dir/; done
The find command has very useful arguments -exec, -print, -delete. They are very robust and eliminate the need to manually process the file names. The syntax for -exec is: -exec (what to do) \;. The name of the file currently processed will be substituted instead of the placeholder {}.
Other commands that are very useful for such tasks are sed and awk.
The xargs tool builds command lines from what it reads on stdin. This time, we execute a cp command:
fetchKeys -t 'foo' -F | grep .fits | xargs -d '\n' -n 500 cp -vfa -t /path/to/dir
xargs is a very useful tool, although its options are not really trivial. This command reads the file names in groups of up to 500 and calls a single cp for every group. The -d '\n' option of xargs and the -t option of cp are GNU extensions. I haven't tested it thoroughly; if it doesn't work, leave a comment.
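The null-separated output of find can also be handed to xargs, which combines the two ideas above. This is a sketch on a scratch tree: the directory and file names (including the ones with spaces) are invented, and fits_dir stands in for /path/to/dir. The -0 option of xargs is not in POSIX but is available in the GNU and BSD versions.

```shell
#!/bin/sh
# Scratch tree; directory and file names are invented for the demo.
src=$(mktemp -d)
fits_dir=$(mktemp -d)
mkdir -p "$src/night1" "$src/night 2"
touch "$src/night1/foo_001.fits" "$src/night 2/my foo.fits" "$src/night1/bar_001.fits"

# -print0 and xargs -0 pair up so that names with spaces survive.
# -I {} runs one cp per file, substituting the name for {}.
find "$src" -name "*foo*.fits" -type f -print0 |
    xargs -0 -I {} cp -- {} "$fits_dir/"

ls "$fits_dir"
```

Only the two files with foo in their names should end up in the destination.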

using output of grep command to find command

I have a problem related to searching a pattern among several files.
I want to search "Logger." pattern in jsp files,so i used the command
grep -ir Logger. * | find . -name *.jsp
Now the problem I am facing is that this command lists all the jsp files; it's not searching for the pattern "Logger." in the jsp files and listing only those.
I just want the jsp files in which "Logger." instance is present.
start like this
you want to search in jsp files.
find . -name "*.jsp"
the above will output all the jsp files recursively from current directory. like below
1/2/ahbd.jsp
befwej/dg/wefwefw/wefwefwe/ijn.jsp
And now you want to find the string in just these files.
grep -ir Logger. (output of find)
so the actual complete command becomes:
find . -name "*.jsp"|xargs grep -ir 'Logger.'
The magic here is done by xargs.
It takes the file names that find prints and passes them to grep as arguments.
If you remove xargs, grep never receives those file names as arguments, so it does not search just the jsp files.
There are several other ways to do this, but I feel more comfortable using this one regularly.
To recursively find all *.jsp files containing the string Logger. you can do:
find . -type f -name '*.jsp' -exec grep -l "Logger\." {} \;
grep -l means to print only the file name if the file contains the string.
The -exec switch of find will execute the given command for each file matching the other criteria (-type f and -name '*.jsp'). The string {} is substituted by the filename. Most versions of find also support + instead of \; as the terminator, which feeds several file names to the command at once (like xargs does) rather than one at a time, e.g.:
find . -type f -name '*.jsp' -exec grep -l "Logger\." {} +
You can just use grep for that, here's a command that should give you the results:
grep -ir "Logger\." * | grep ".jsp"
Problem is, grep will bail when you use "*.jsp" instead of "*" if you don't have at least one .jsp file in your root directory. So we have to tell it to look at every file.
Since you give grep the -r (recursive) argument, it will walk the subdirectories to find the pattern "Logger.", then the second grep will only display the lines from .jsp files. Note that -i tells grep to ignore letter case, which may not be what you want.
edit: following John's answer: we have to escape the . to prevent it from being treated as a regex wildcard.
re-edit: actually, I think that using find is better, since it will filter the jsp files directly instead of grepping all the files:
find . -name "*.jsp" -exec grep -i "Logger\." {} \;
(you don't need the -r anymore since find takes care of recursion.)
If you have bash 4+
shopt -s globstar
shopt -s nullglob
for file in **/*.jsp
do
if grep -q "Logger." "$file" ;then
echo "found in $file"
fi
# or just grep -l "Logger." "$file"
done
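A sketch of the find -exec grep -l approach on a scratch tree, using the {} + form so that one grep receives many file names. The directory layout and file contents are invented for the demo; -F is used here so the dot in Logger. is matched literally without escaping.

```shell
#!/bin/sh
# Scratch web tree; file names and contents are invented for the demo.
root=$(mktemp -d)
mkdir -p "$root/pages/admin"
echo 'Logger.info("start");'  > "$root/pages/index.jsp"
echo 'LoggerX is not a match' > "$root/pages/admin/users.jsp"
echo 'Logger.debug("java");'  > "$root/pages/Helper.java"

# Only *.jsp files are searched; -l prints the names of files that match.
found=$(cd "$root" && find . -type f -name '*.jsp' -exec grep -lF 'Logger.' {} +)
echo "$found"   # prints ./pages/index.jsp
rm -rf "$root"
```

The .java file is never opened because find filters by name first, and users.jsp is skipped because LoggerX does not contain a literal "Logger.".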
