How to find particular files in a directory in a shell script? - shell

I'm trying to find particular files in a directory using a find command pattern in a shell script.
The files are created in the directory "/data/output" in the format below every time.
PO_ABCLOAD0626201807383269.txt
PO_DEF 0626201811383639.txt
So I need to check whether txt files starting with "PO_ABCLOAD" or "PO_DEF" have been created; if none has been created within four hours, I need to write to the logs.
I have written a script, but I am stuck on how to match the "PO_ABCLOAD" and "PO_DEF" format text files in the script below.
Please help with this.
What changes do I need to make to the find command?
My script is:
file_path=/data/output
PO_count='find ${file_path}/PO/*.txt -mtime +4 -exec ls -ltr {} + | wc -l'
if [ $PO_count == 0 ]
then
find ${file_path}/PO/*.xml -mtime +4 -exec ls -ltr {} + >
/logs/test/PO_list.txt
fi
Thanks in advance

Welcome to the forum. To search for files which match the names you are looking for, you could try the -iname or -name predicates. However, there are other issues with your script.
Modification times
Firstly, I think that find's -mtime test works in a different way than you expect. From the manual:
-mtime n
File's data was last modified n*24 hours ago.
So if, for example, you run
find . -mtime +4
you are searching for files which are more than four days old. To search for files that are more than four hours old, I think you need to use the -mmin option instead; this will search for files which were modified a certain number of minutes ago.
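For example, to match files last modified more than four hours (240 minutes) ago you could use something like:
find /data/output -type f -mmin +240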
Command substitution syntax
Secondly, using ' for command substitution in Bash will not work: you need to use backticks instead - as in
PO_COUNT=`find ...`
instead of
PO_COUNT='find ...'
Alternatively - even better (as codeforester pointed out in a comment) - use $(...) - as in
PO_COUNT=$(find ...)
Redundant options
Thirdly, using -exec ls -ltr {} + is redundant in this context - since all you are doing is determining the number of lines in the output.
So the relevant line in your script might become something like
PO_COUNT=$(find $FILE_PATH/PO/ -mmin +240 -a -name 'PO_*' | wc -l)
or
PO_COUNT=$(find $FILE_PATH/PO/PO_* -mmin +240 | wc -l)
If you wanted tighter matching of filenames, try (as per codeforester's suggestion) something like
PO_COUNT=$(find $file_path/PO/PO_* -mmin +240 -a \( -name 'PO_DEF*' -o -name 'PO_ABCLOAD*' \) | wc -l)
Alternative file-name matching in Bash
One last thing ...
If using bash, you can use brace expansion to match filenames, as in
PO_COUNT=$(find $file_path/PO/PO_{ABCLOAD,DEF}* -mmin +240 | wc -l)
Although this is slightly more concise, I don't think it is compatible with all shells.
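Putting all of that together, the relevant part of your script might end up looking something like this (just a sketch: it keeps the -mmin +240 test from above and the /logs/test/PO_list.txt path from your script, so adjust both to taste):
file_path=/data/output
# count PO_ABCLOAD*/PO_DEF* txt files more than four hours old
PO_count=$(find "$file_path"/PO/ -type f -mmin +240 \( -name 'PO_ABCLOAD*.txt' -o -name 'PO_DEF*.txt' \) | wc -l)
if [ "$PO_count" -eq 0 ]
then
    echo "$(date): no matching PO_ABCLOAD/PO_DEF txt files found" >> /logs/test/PO_list.txt
fi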

Related

how does grep only today's files in current directory?

I want to grep files which were created today in the current directory. How many ways are there to do that, and what's the best way?
grep --color 'content' ./directory
This should do the trick for you:
find ./directory -maxdepth 1 -type f -daystart -ctime 0 -print | xargs grep --color 'content'
In the above command, we are using find to find all the files (-type f) in the directory that were made today (-daystart -ctime 0), and then -print the full file paths to standard output. We then send the output to xargs; using xargs we are able to run the grep command on each file in the output. This is much simpler than having to create a for loop and iterate over each line of the output.
If I understand correctly, you want to grep for "content" in all files in ./directory modified today; you can use a combination of find and xargs for that. To find the files in ./directory modified today, you can give the -mtime 0 option, which finds files modified zero 24-hour periods ago (i.e. today). To handle strange filenames, use the -print0 option to have find output nul-terminated filenames. Your find command could be:
find . -maxdepth 1 -type f -mtime 0 -print0
Once the list of files is generated, you can pass the result to xargs -0, which will process the list of filenames as nul-terminated, and using your grep command you would have:
xargs -0 grep --color 'content'
To put it all together, simply pipe the result of find to xargs, e.g.
find . -maxdepth 1 -type f -mtime 0 -print0 |
xargs -0 grep --color 'content'
Give that a go and let me know if it does what you need or if you have further questions.
Edit Per Comment
If you want more exact control of the hour, minute or second from which you want to select your files, you can use the -newermt option for find to find all files newer than the date you give as the option, e.g. -newermt "2021-07-02 02:10:00" would select today's files created after 2:10:00 (all files after 2:10:00 am this morning).
Modifying the test above and replacing -mtime 0 with -newermt "2021-07-02 02:10:00" you would have:
find . -maxdepth 1 -type f -newermt "2021-07-02 02:10:00" -print0 |
xargs -0 grep --color 'content'
(adjust the time to the exact starting time from which you want to begin selecting files)
Give that a go also. It is quite a bit more flexible, as you can specify any time within the day from which to begin selecting files based on the file's modification time.

Deleting oldest files with shell

I have a folder /var/backup where a cronjob saves a backup of a database/filesystem. It contains a latest.gz.zip and lots of older dumps which are named timestamp.gz.zip.
The folder is getting bigger and bigger and I would like to create a bash script that does the following:
Keep latest.gz.zip
Keep the youngest 10 files
Delete all other files
Unfortunately, I'm not a good bash scripter so I have no idea where to start. Thanks for your help.
In zsh you can do most of it with expansion flags:
files=(*(.Om))
rm $files[1,-11]
Be careful with this command, you can check what matches were made with:
print -rl -- $files[1,-11]
You should learn to use the find command, possibly with xargs; that is, something similar to
find /var/backup -type f -name 'foo' -mtime -20 -delete
or if your find doesn't have -delete:
find /var/backup -type f -name 'foo' -mtime -20 -print0 | xargs -0 rm -f
Of course you'll need to adapt this a lot; it is just to give you ideas.
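If you would rather stay in plain bash, something along these lines might do it (only a sketch: it assumes GNU tools and that the dump file names never contain newlines or spaces, which is true for the timestamp.gz.zip pattern):
cd /var/backup || exit 1
# list the *.gz.zip files newest first, protect latest.gz.zip,
# skip the 10 youngest dumps and delete whatever is left
ls -t -- *.gz.zip | grep -v '^latest\.gz\.zip$' | tail -n +11 | xargs -r rm --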

Unix find: list of files from stdin

I'm working in Linux & bash (or Cygwin & bash).
I have a huge--huge--directory structure, and I have to find a few needles in the haystack.
Specifically, I'm looking for these files (20 or so):
foo.c
bar.h
...
quux.txt
I know that they are in a subdirectory somewhere under ..
I know I can find any one of them with
find . -name foo.c -print
This command takes a few minutes to execute.
How can I print the names of these files with their full directory name? I don't want to execute 20 separate finds--it will take too long.
Can I give find the list of files from stdin? From a file? Is there a different command that does what I want?
Do I have to first assemble a command line for find with -o using a loop or something?
If your directory structure is huge but not changing frequently, it is good to run
cd /to/root/of/the/files
find . -type f -print > ../LIST_OF_FILES.txt # and the next one is sometimes handy too
find . -type d -print > ../LIST_OF_DIRS.txt
After that you can find anything really FAST (with grep, sed, etc.) and update the file lists only when the tree changes. (It is a simplified replacement if you don't have locate.)
So,
grep '/foo.c$' LIST_OF_FILES.txt #list all foo.c in the tree..
When you want to find a list of files, you can try the following:
fgrep -f wanted_file_list.txt < LIST_OF_FILES.txt
or directly with the find command
find . -type f -print | fgrep -f wanted_file_list.txt
The -f option for fgrep means "read patterns from the file", so you can easily grep the input for multiple patterns...
You shouldn't need to run find twenty times.
You can construct a single command with multiple filename specifiers:
find . \( -name 'file1' -o -name 'file2' -o -name 'file3' \) -exec echo {} \;
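If you would rather not type twenty -name clauses by hand, you can build that -o chain from a file of wanted names (a sketch, assuming bash and a wanted_file_list.txt with one plain file name per line, no spaces or wildcards):
args=()
while IFS= read -r name; do
    [ "${#args[@]}" -gt 0 ] && args+=(-o)   # separate the -name tests with -o
    args+=(-name "$name")
done < wanted_file_list.txt
find . -type f \( "${args[@]}" \) -print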
Is the locate(1) command an acceptable answer? Nightly it builds an index, and you can query the index quite quickly:
$ time locate id_rsa
/home/sarnold/.ssh/id_rsa
/home/sarnold/.ssh/id_rsa.pub
real 0m0.779s
user 0m0.760s
sys 0m0.010s
I gave up executing a similar find command in my home directory at 36 seconds. :)
If nightly doesn't work, you could run the updatedb(8) program by hand once before running locate(1) queries. /etc/updatedb.conf (updatedb.conf(5)) lets you select specific directories or filesystem types to include or exclude.
Yes, assemble your command line.
Here's a way to process a list of files from stdin and assemble your (FreeBSD) find command to use extended regular expression matching (n1|n2|n3).
For GNU find you may have to use one of the following options to enable extended regular expression matching:
-regextype posix-egrep
-regextype posix-extended
echo '
foo\\.c
bar\\.h
quux\\.txt
' | xargs bash -c '
IFS="|";
find -E "$PWD" -type f -regex "^.*/($*)$" -print
echo find -E "$PWD" -type f -regex "^.*/($*)$" -print
' arg0
# note: "$*" uses the first character of the IFS variable as array item delimiter
(
IFS='|'
set -- 1 2 3 4 5
echo "$*" # 1|2|3|4|5
)

Find, grep, and execute - all in one?

This is the command I've been using for finding matches (queryString) in php files, in the current directory, with grep, case insensitive, and showing matching results in line:
find . -iname "*php" -exec grep -iH queryString {} \;
Is there a way to also pipe just the file name of the matches to another script?
I could probably run the -exec command twice, but that seems inefficient.
What I'd love to do on Mac OS X is then actually to "reveal" that file in the finder. I think I can handle that part. If I had to give up the inline matches and just let grep show the files names, and then pipe that to a third script, that would be fine, too - I would settle.
But I'm actually not even sure how to pipe the output (the matched file names) to somewhere else...
Help! :)
Clarification
I'd like to reveal each of the files in a Finder window - so I'm probably not going to use the -q flag and stop at the first one.
I'm going to run this in the console; ideally I'd like to see the inline matches printed out there, as well as being able to pipe them to another script, like osascript (AppleScript, to reveal them in the Finder). That's why I have been using -H - because I like to see both the file name and the match.
If I had to settle for just using -l so that the file name could more easily be piped to another script, that would be OK, too. But after looking at the reply below from Charlie Martin, I think that xargs could be helpful here in doing both at the same time with a single find and a single grep command.
I did say bash, but I don't really mind if this needs to be run as /bin/sh instead - I don't know too much about the differences yet, but I do know there are some important ones.
Thank you all for the responses, I'm going to try some of them at the command line and see if I can get any of them to work and then I think I can choose the best answer. Leave a comment if you want me to clarify anything more.
Thanks again!
You bet. The usual thing is something like
$ find /path -name pattern -print | xargs command
So you might for example do
$ find . -name '*.[ch]' -print | xargs grep -H 'main'
(Quiz: why -H?)
You can carry on with this further; for example, you might use
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1
to get the vector of file names for files that contain 'main', or
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1 |
xargs growlnotify -
to have each name become a Growl notification.
You could also do
$ grep pattern `find /path -name pattern`
or
$ grep pattern $(find /path -name pattern)
(in bash(1) at least these are equivalent) but you can run into limits on the length of a command line that way.
Update
To answer your questions:
(1) You can do anything in bash you can do in sh. The one thing I've mentioned that would be any different is the use of $(command) in place of using backticks around command, and that works in the version of sh on Macs. The csh, zsh, ash, and fish are different.
(2) I think merely doing $ open $(dirname arg) will open a Finder window on the containing directory.
It sounds like you want to open all *.php files that contain querystring from within a Terminal.app session.
You could do it this way:
find . -name '*.php' -exec grep -li 'querystring' {} \; | xargs open
With my setup, this opens MacVim with each file on a separate tab. YMMV.
Replace -H with -l and you will get a list of those filenames that matched the pattern.
if you have bash 4, simply do
grep pattern /path/**/*.php
the ** operator is like
grep pattern `find -name \*.php -print`
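One caveat: in bash 4 the recursive behaviour of ** is off by default, so you need to enable it first:
shopt -s globstar   # without this, ** behaves like a plain *
grep pattern /path/**/*.php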
find /home/aaronmcdaid/Code/ -name '*.cpp' -exec grep -q -iH boost {} \; -exec echo {} \;
The first change I made is to add -q to your grep command. This is "Exit immediately with zero status if any match is found".
The good news is that this speeds up grep when a file has many matching lines; you don't care how many matches there are. But that means we need another -exec on the end to actually print the filenames when grep has been successful.
The grep result will be sent to stdout, so another -exec predicate is probably the best solution here.
Pipe to another script:
find . -iname "*.php" | myScript
File names will come into the stdin of myScript 1 line at a time.
You can also use xargs to form/execute commands to act on each file:
find . -iname "*.php" | xargs ls -l
act on files you find that match:
find . -iname "*.php" | xargs grep -l pattern | myScript
act on files that don't match the pattern:
find . -iname "*.php" | xargs grep -L pattern | myScript
In general, using multiple -exec's and grep -q will be FAR faster than piping, since find implies a short-circuiting -a between each juxtaposed pair of expressions that is not separated by an explicit operator. The main problem here is that you want something to happen if grep matches something AND for the matches to be printed. If the files are reasonably sized then this should be faster (because grep -q exits after finding a single match):
find . -iname "*php" -exec grep -iq queryString {} \; -exec grep -iH queryString {} \; -exec otherprogram {} \;
If the files are particularly big, encapsulating it in a shell script may be faster than running multiple grep commands:
find . -iname "*php" -exec bash -c \
'out=$(grep -iH queryString "$1"); [[ -n $out ]] && echo "$out" && exit 0 || exit 1' \
bash {} \; -print
Also note, if the matches are not particularly needed, then
find . -iname "*php" -exec grep -iq queryString {} \; -exec otherprogram {} \;
will virtually always be faster than a piped solution like
find . -iname "*php" -print0 | xargs -0 grep -iH | ...
Additionally, you should really have -type f in all cases, unless you want to catch *php directories
Regarding the question of which is faster: if you actually care about the minuscule time difference (which you might, if you are trying to save your processor some work), prefix each command with time and compare the results.
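For example, a quick (and unscientific) comparison might look like this:
# bash's time keyword reports the elapsed time for the whole command or pipeline
time find . -type f -iname "*php" -exec grep -iq queryString {} \; -exec grep -iH queryString {} \;
time find . -type f -iname "*php" -print0 | xargs -0 grep -iH queryString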

grep returns "Too many argument specified on command" [duplicate]

This question already has answers here:
Argument list too long error for rm, cp, mv commands
(31 answers)
Closed 7 years ago.
I am trying to list all files we received in one month
The filename pattern will be
20110101000000.txt
YYYYMMDDHHIISS.txt
The entire directory is having millions of files.
For one month there can be minimum 50000 files.
Idea of sub directory is still pending.
Is there any way to list a huge number of files with almost similar file names?
grep -l 20110101*
I am trying this and it returns an error.
I tried PHP and it took a huge amount of time; that's why I'm using a shell script. I don't understand why the shell is also not giving a result.
Any suggestion please!!
$ find ./ -type f -name '20110101*' -print0 | xargs -0 grep -l "search_pattern"
You can use find and xargs: xargs will run grep for each file found by find. You can use -P to run multiple greps in parallel and -n to pass multiple files to each grep invocation. The -print0 argument to find separates each filename with a null character to avoid confusion caused by any spaces in the file names. If you are sure there will not be any spaces, you can remove the -print0 and -0 arguments.
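For instance (a sketch, assuming GNU xargs), the following would run up to four greps in parallel, each given at most 100 files per invocation:
find ./ -type f -name '20110101*' -print0 | xargs -0 -P 4 -n 100 grep -l "search_pattern"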
This should be the faster way:
find . -name "20110101*" -exec grep -l "search_pattern" {} +
Should you want to avoid the leading dot:
find . -name "20110101*" -exec grep -l "search_pattern" {} + | sed 's/^.\///'
or better thanks to adl:
find . -name "20110101*" -exec grep -l "search_pattern" {} + | cut -c3-
The 20110101* is getting expanded by your shell before getting passed to the command, so you're getting one argument passed for every file in the dir that starts with 20110101.
If you just want a list of matching files you can use find:
find . -name "20110101*"
(note that this will search every subdirectory also)
Some in-depth information is available here, along with another work-around: for FILE in 20110101*; do grep foo ${FILE}; done. Most people will go with xargs, and more seasoned admins with -exec {} +, which accomplishes exactly the same thing but is shorter to type. One would use the inline shell for construct when running more processes matters less than seeing the results: with the for construct you may end up running grep thousands of times, but you see each match in real time, while with find and/or xargs you see batched results and grep is run significantly fewer times.
you need to put in a search term, so
grep -l "search term" 20110101*
if you want to just find the files, use ls 20110101*
Just pipe the output of ls to grep: ls | grep '^20110101'
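If all you really need is the list (or a count) of one month's files, letting find expand the pattern avoids the argument-list limit entirely (a sketch; /tmp/201101_files.txt is just a hypothetical output path):
# find matches the name pattern itself, so millions of files in the directory are fine
find . -maxdepth 1 -type f -name '201101*' > /tmp/201101_files.txt
wc -l < /tmp/201101_files.txt   # how many files were received in January 2011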
