Unix Count Multiple Folders Needed - shell

I have a directory on a Unix server:
cd /home/client/files
It has multiple client folders like these below.
cd /home/client/files/ibm
cd /home/client/files/aol
cd /home/client/files/citi
All of them send us files whose names start with either lower- or upper-case "pre", like below:
pre-ibm-03222017
PRE-aol-170322
Once we receive the files, we process them and convert "pre" to "pro" as below:
pro-ibm-03222017
PRO-aol-170322
I want to count the files processed each day. Here is what I am looking for:
If I can just get the total count per client, that would be perfect. If not, then the total count overall.
Keep in mind it has all files as below:
cd /home/client/files/ibm
pre-ibm-03222017
pro-ibm-03222017
cd /home/client/files/aol
PRE-aol-170322
PRO-aol-170322
And I want to COUNT ONLY the PRO/pro that will either be lower or upper case. One folder can get more than 1 file per day.
I am using the below command:
find /home/client/files -type f -mtime -1 -exec ls -1 {} \;| wc -l
But it is giving me the total count of pre and pro files, and it is also counting files for the last 24 hours rather than the current day only.
For example: it is currently 09:00 PM. The above command includes files received yesterday between 09:00 PM and 12:00 AM as well. I don't want those. In other words, if I run it at 01:00 AM, it should count files for that 1 hour only, not the last 24 hours.
Thanks
---- Update -----
This works great for me.
touch -t 201703230000 first
touch -t 201703232359 last
find /home/client/files/ -newer first ! -newer last | grep -i pro | wc -l
Now, I was just wondering if I can pass the above as a parameter.
For example, instead of using touch -t with a date and an alias, I want to type shortcuts and dates only to get the output. I have made the following aliases:
alias reset='touch -t `date +%m%d0000` /tmp/$$'
alias count='find /home/client/files/ -type f -newer /tmp/$$ -exec ls -1 {} \; | grep -i pro | wc -l'
This way as soon as I logon to the server, I type reset and then I type count and I get my daily number.
I was wondering if I can do something similar for any duration of days by setting date1 and date2 as aliases. If not, then perhaps a short script that would ask for parameters.

What about this?
touch -t `date +%m%d0000` /tmp/$$
find /home/client/files -type f -newer /tmp/$$ -exec ls -1 {} \; | grep -i pro | wc -l
rm /tmp/$$
Other options for finding a file created today can be found in this question:
How do I find all the files that were created today
Actually, a shorter way to do this is:
find /home/client/files -type f -mtime 0 | grep -i pro | wc -l
You can replace -mtime 0 with -mtime 5 to find files modified between 5 and 6 days ago. (Note that -mtime counts 24-hour periods from now, not calendar days.)
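If you'd rather pass the dates as parameters than maintain aliases, a small script can wrap the reference-file trick from the update. This is only a sketch: count_pro and the BASE default are made-up names, and the grep pattern assumes only processed files contain pro/PRO in their path.

```shell
#!/bin/sh
# Count processed (pro/PRO) files in a date range (sketch).
# Usage: count_pro MMDD MMDD   -- start and end day, current year.
BASE="${BASE:-/home/client/files}"

count_pro() {
    start="$1" end="$2"
    first=$(mktemp) last=$(mktemp)
    # Reference files bracketing the range: 00:00 on the start
    # day through 23:59 on the end day.
    touch -t "${start}0000" "$first"
    touch -t "${end}2359" "$last"
    find "$BASE" -type f -newer "$first" ! -newer "$last" |
        grep -ic '/pro'      # case-insensitive count of pro files
    rm -f "$first" "$last"
}
```

Called as count_pro 0323 0323, it reproduces the touch -t 201703230000 / 201703232359 pair from the update, without leaving reference files behind.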

For the same-day issue you can use -daystart (GNU find).
The regex matches any path containing /pro, in either case:
find /home/client/files -regex '.*\/[pP][rR][oO].*' -type f -daystart -mtime -1 -exec ls -1 {} \; | wc -l
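For the per-client totals the asker originally wanted, one approach (a sketch; per_client_counts is a made-up name, and it assumes GNU find for -daystart plus one subdirectory per client, as in the question) is:

```shell
#!/bin/sh
# Print "client count" for today's processed files, one line per client.
BASE="${BASE:-/home/client/files}"

per_client_counts() {
    for dir in "$BASE"/*/; do
        client=$(basename "$dir")
        # -iname matches pro-* and PRO-* alike; -daystart limits
        # -mtime -1 to the current calendar day.
        n=$(find "$dir" -type f -daystart -mtime -1 -iname 'pro-*' |
                wc -l | tr -d ' ')
        printf '%s %s\n' "$client" "$n"
    done
}
```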

Related

find the last created subdirectory in a directory of 500k subdirs

I have a folder with some 500k subfolders - and I would like to find the last directory which was added to this folder. I am having to do this due to a power failure issue :(
I don't exactly know when the power failed, so I am using this:
find . -type d -mmin -360 -print
which I believe covers the last 360 minutes? However, it gives me results which I am not exactly sure of.
In short, I would like to get the last directory created within this folder.
Any pointers would be great.
Suggestion:
find . -type d -printf "%C@ %p\n" |sort -n|tail -n1|awk '{print $2}'
Explanation:
find . -type d -printf "%C@ %p\n"
find . start searching from the current directory, recursively
-type d search only for directories
-printf "%C@ %p\n" for each directory, print its last status-change time in seconds since the Unix epoch (including a fractional part), followed by the file name with path.
For example: 1648051886.4404644000 /tmp/mc-dudi
|sort -n|tail -n1
Sort the result from find as numbers, and print the last row.
awk '{print $2}'
From last row, print only second field
You might try this: it shows your last modification date/time in a sortable manner, and by sorting it, the last entry should be the most recent one:
find ./ -exec ls -dils --time-style=long-iso {} \; | sort -k8,9
Edit: and specific for directories:
find ./ -type d -exec ls -dils --time-style=long-iso {} \; | sort -k8,9
Assuming you're using a file system that tracks file creation ('birth' is the usual terminology) times, and GNU versions of the programs used below:
find . -type d -exec stat --printf '%W\t%n\0' \{\} + | sort -z -k1,1nr | head -1 -z | cut -f 2-
This will find all subdirectories of the current working directory, and for each one, print its birth time (The %W format field for stat(1)) and name (The %n format). Those entries are then sorted based on the timestamp, newest first, and the first line is returned minus the timestamp.
Unfortunately, GNU find's -printf doesn't support birth times, so it calls out to stat(1) to get those, using the multi-argument version of -exec to minimize the number of instances of the program that need to be run. The rest is straightforward sorting of a column, using 0-byte terminators instead of newlines to robustly handle filenames with newlines or other funky characters in them.
Maintaining a symbolic link to the last known subdirectory could avoid listing all of them to find the latest one.
ls -dl $(readlink ~/tmp/last_dir)
drwxr-xr-x 2 lmc users 4096 Jan 13 13:20 /home/lmc/Documents/some_dir
Find newer ones
ls -ldt $(find -L . -newer ~/tmp/last_dir -type d ! -path .)
drwxr-xr-x 2 lmc users 6 Mar 1 00:00 ./dir2
drwxr-xr-x 2 lmc users 6 Feb 1 00:00 ./dir1
Or
ls -ldt $(find -L . -newer ~/tmp/last_dir -type d ! -path .) | head -n 1
drwxr-xr-x 2 lmc users 6 Mar 1 00:00 ./dir2
Don't use the chosen answer if you really want to find the last created sub-directory
According to the question:
Directories should be sorted by creation time instead of modification time.
find -mindepth 1 is necessary because we want to search only sub-directories.
Here are 2 solutions that both fulfill the 2 requirements:
GNU
find . -mindepth 1 -type d -exec stat -c '%W %n' '{}' '+' |
sort -nr | head -n1
BSD
find . -mindepth 1 -type d -exec stat -f '%B %N' '{}' '+' |
sort -nr | head -n1

Check from files in directory which is the most recent in Bash Shell Script

I am making a bash script to run in a directory with files generated every day and copy the most recent file to another directory.
I am using this now
for [FILE in directory]
do
if [ls -Art | tail -n 1]
something...
else
something...
fi
done
I know this is not alright. I would like to compare the modification date of the files with the current date and, if they are equal, copy that file.
How would that work or is there an easier method to do it?
We could use find:
find . -maxdepth 1 -daystart -type f -mtime -1 -exec cp -f {} dest \;
Explanation:
-maxdepth 1 limits the search to the current directory.
-daystart sets the reference time of -mtime to the beginning of today.
-type f limits the search to files.
-mtime -1 limits the search to files that have been modified less than 1 day from reference time.
-exec cp -f {} dest \; copies the found files to directory dest.
Note that -daystart -mtime -1 means anytime after today 00:00 (included), but also tomorrow or any time in the future. So if you have files with last modification time in year 2042 they will be copied too. Use -mtime 0 if you prefer copying files that have been modified between today at 00:00 (excluded) and tomorrow at 00:00 (included).
Note also that all this could be impacted by irregularities like daylight saving time or leap seconds (not tested).
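The future-timestamp caveat above can be checked on a throwaway directory. A sketch assuming GNU find and GNU touch (for the -d date strings):

```shell
#!/bin/sh
# Show that -daystart -mtime -1 matches future files but -mtime 0 does not.
dir=$(mktemp -d)
touch "$dir/today"               # modified right now
touch -d '2 days' "$dir/future"  # mtime two days in the future

# "Less than 1 day old, measured from today 00:00" -- a future
# mtime also satisfies the strict < 1 comparison.
m1=$(find "$dir" -type f -daystart -mtime -1 | wc -l)

# Age must truncate to exactly 0 days -- today only.
m0=$(find "$dir" -type f -daystart -mtime 0 | wc -l)

echo "mtime -1: $m1 file(s); mtime 0: $m0 file(s)"
```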
The newest file is different from file(s) modified today.
Using ls is actually a pretty simple and portable approach. The stdout output format is defined by POSIX (if not printing to a terminal), and ls -A is also in newer POSIX standards.
It should look more like this though:
newest=$(ls -At | head -n 1)
You could add -1, but AFAIK it shouldn’t be required, as it’s not printing to a terminal.
If you don’t want to use ls, you can use this on Linux:
find . -mindepth 1 -maxdepth 1 -type f -exec stat -c '%Y:%n' {} + |
sort -n |
tail -n 1 |
cut -d : -f 2-
Note using 2- not 2 with cut, in case a filename contains :.
Also, the resulting file name will be a relative path (./file), or an empty string if no files exist.
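Tying it back to the original task (copy the newest file to another directory), the pipeline above can be wrapped in a function. A sketch; copy_newest is a made-up name, GNU stat is assumed, and it breaks on filenames containing newlines:

```shell
#!/bin/sh
# Copy the most recently modified regular file in $1 to directory $2.
copy_newest() {
    src="$1" dest="$2"
    newest=$(find "$src" -mindepth 1 -maxdepth 1 -type f \
                 -exec stat -c '%Y:%n' {} + |
             sort -n | tail -n 1 | cut -d : -f 2-)
    # $newest is empty when $src has no regular files.
    [ -n "$newest" ] && cp -f "$newest" "$dest"
}
```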

complicated find on bash

I have the following task: delete old "builds" older than 30 days. And this solution works perfectly:
find $jenkins_jobs -type d -name builds -exec find {} -type d -mtime +30 \; >> $filesToBeDelete
cat $filesToBeDelete | xargs rm -rf
But later some conditions were added: delete only when we have more than 30 builds, and clean the oldest ones. So as a result we should keep the 30 newest builds and delete the rest.
Also I have found that I can use if statement in find like that:
if [ $(find bla-bla | wc -l) -gt 30 ]; then
...
fi
but I am wondering how I can delete those files.
Is it clear? For example, we have 100 builds in the "builds" folder and all of them are older than 30 days. So I want to keep the 30 newest builds and delete the other 70.
Pretty hacky, but it should be pretty robust for weird filenames:
find -type d -name "builds" -mtime +30 -printf "%T# %p\0" |\
awk -vRS="\0" -vORS="\0" '{match($0,/([^ ]* )(.*)/,a);b[a[2]]=a[1];c[a[1]]=a[2]}END{x=asort(b);for(i=x-30;i>0;i--)print c[b[i]]}' |\
xargs -0 -I{} rm -r {}
I tested with echo and it seems to work, but I'd make sure it's showing the right files before using rm -r.
So what it does is passes null terminated strings through so filenames are preserved.
The main limitation is that if two files were created in the same second then it will miss one as it uses an associative array.
Here is a relatively safe answer to list the dirs, if your stat is close enough to mine (cygwin/bash):
now=$(date +%s)
find $jenkins_jobs -type d -name builds -exec find {} -type d \; |
while read f; do stat -c'%Y %n' "$f"; done |
sort -nr |
tail -n +31 |
awk $now'-$1>2592000'|
sed 's/^[0-9]* //'
This is working with epoch time (seconds since 1970) as provided by the %s of date and the %Y of stat. The sort and tail are removing the newest 30, and the awk is removing any 30 days old or newer. (2592000 is the number of seconds in 30 days.) The final sed is just removing what stat added, leaving only the dirname.
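The keep-the-N-newest part of this pipeline can be isolated into a helper that only lists candidates, so the rm -rf stays a separate, reviewable step. A sketch (the function name is made up; GNU stat assumed, and it breaks on filenames containing newlines):

```shell
#!/bin/sh
# Print all entries of directory $1 except the $2 newest ones.
# Review the output before piping it to xargs rm -rf.
list_all_but_newest() {
    dir="$1" keep="$2"
    find "$dir" -mindepth 1 -maxdepth 1 -exec stat -c '%Y %n' {} + |
        sort -nr |                   # newest first
        tail -n +"$((keep + 1))" |   # skip the $keep newest
        sed 's/^[0-9]* //'           # drop the timestamp column
}
```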
This will list all but the 30 newest directories.
find -type d -name builds -exec ls -d -l --time-style="+%s" {} \;|sed "s#[^ ]\+ \w\+ \w\+ \w\+ \w\+ ##"|sort -r |sed "s#[^ ]\+ ##"|tail -n +31
after you are sure you want to remove them, you can use the | xargs rm -rf
It reads this way:
find all build dirs
list them with time from epoch
drop (via sed) the permissions, user, group, etc., leaving only time and name
sort by time from newest
drop those times
tail shows everything from the 31st entry onward (so it skips the 30 newest)

delete file - specific date

How to delete files created on a specific date?
ls -ltr | grep "Nov 22" | rm -- why is this not working?
There are three problems with your code:
rm takes its arguments on its command line, but you're not passing any file name on the command line. You are passing data on the standard input, but rm doesn't read that¹. There are ways around this.
The output of ls -ltr | grep "Nov 22" doesn't just consist of file names, it consists of mangled file names plus a bunch of other information such as the time.
The grep filter won't just catch files modified on November 22; it will also catch files whose name contains Nov 22, amongst others. It also won't catch the files you want in a locale that displays dates differently.
The find command lets you search files according to criteria such as their name matching a certain pattern or their date being in a certain range. For example, the following command will list the files in the current directory and its subdirectories that were modified within the last 48 hours. Replace echo by rm -- once you've checked you have the right files.
find . -type f -mtime -2 -exec echo {} +
With GNU find, such as found on Linux and Cygwin, there are a few options that might do a better job:
-maxdepth 1 (must be specified before other criteria) limits the search to the specified directory (i.e. it doesn't recurse).
-mmin -43 matches files modified at most 42 minutes ago.
-newermt "Nov 22" matches files modified on or after November 22 (local time).
Thus:
find . -maxdepth 1 -type f -newermt "Nov 22" \! -newermt "Nov 23" -exec echo {} +
or, further abbreviated:
find -maxdepth 1 -type f -newermt "Nov 22" \! -newermt "Nov 23" -delete
With zsh, the m glob qualifier limits a pattern to files modified within a certain relative date range. For example, *(m1) expands to the files modified within the last 24 hours; *(m-3) expands to the files modified within the last 48 hours (first the number of days is rounded up to an integer, then - denotes a strict inequality); *(mm-6) expands to the files modified within the last 5 minutes, and so on.
¹ rm -i (and plain rm for read-only files) uses it to read a confirmation y before deletion.
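The -newermt pair can be tried safely with explicit timestamps before any deleting. A sketch using GNU find and GNU touch (it only lists, so nothing is removed):

```shell
#!/bin/sh
# Select files modified on one specific calendar day with -newermt.
d=$(mktemp -d)
touch -d '2010-11-21 12:00' "$d/before"
touch -d '2010-11-22 12:00' "$d/target"
touch -d '2010-11-23 12:00' "$d/after"

# "On Nov 22, 2010" = on/after the 22nd but not on/after the 23rd.
matches=$(find "$d" -type f -newermt '2010-11-22' ! -newermt '2010-11-23')
echo "$matches"
```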
If you need to try find for any given day,
this might help
touch -d "2010-11-21 23:59:59" /tmp/date.start;
touch -d "2010-11-23 00:00:00" /tmp/date.end;
find ./ -type f -newer /tmp/date.start ! -newer /tmp/date.end -exec rm {} \;
If your find supports it, as GNU find does, you can use:
find -type f -newermt "Nov 21" ! -newermt "Nov 22" -delete
which will find files that were modified on November 21.
You would be better suited to use the find command:
find . -type f -mtime 1 -exec echo rm {} +
This will delete all files one day old in the current directory and recursing down into its sub-directories. You can use '0' if you want to delete files created today. Once you are satisfied with the output, remove the echo and the files will truly be deleted.
for i in `ls -ltr | grep "NOV 23" | awk '{print $9}'`
do
rm -rf $i
done
Maybe better:
#!/bin/bash
for i in $(ls -ltr | grep "NOV 23" | awk '{print $9}')
do
rm -rf $i
done
than in the previous answer.

How to delete files older than X hours

I'm writing a bash script that needs to delete old files.
It's currently implemented using :
find $LOCATION -name $REQUIRED_FILES -type f -mtime +1 -delete
This will delete all of the files older than 1 day.
However, what if I need a finer resolution than 1 day, say 6 hours? Is there a nice clean way to do it, like there is with find and -mtime?
Does your find have the -mmin option? That can let you test the number of mins since last modification:
find $LOCATION -name $REQUIRED_FILES -type f -mmin +360 -delete
Or maybe look at using tmpwatch to do the same job. phjr also recommended tmpreaper in the comments.
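Since -mmin takes minutes, an hour-based cutoff is just a multiplication away. A sketch (the function and variable names are made up):

```shell
#!/bin/sh
# Delete regular files under $1 that are older than $2 hours.
delete_older_than_hours() {
    target="$1" hours="$2"
    mins=$((hours * 60))    # -mmin works in minutes
    find "$target" -type f -mmin +"$mins" -delete
}
```

For example, delete_older_than_hours /some/dir 6 is equivalent to the -mmin +360 form above.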
Here is the approach that worked for me (and I don't see it being used above)
$ find /path/to/the/folder -name '*.*' -mmin +59 -delete > /dev/null
deleting all the files modified more than 59 minutes ago, while leaving the folders intact.
You could do this trick: create a file whose timestamp is 1 hour ago, and use the -newer file argument.
(Or use touch -t to create such a file).
-mmin is for minutes.
Try looking at the man page.
man find
for more types.
For SunOS 5.10
Example 6 Selecting a File Using 24-hour Mode
The descriptions of -atime, -ctime, and -mtime use the terminology n ``24-hour periods''. For example, a file accessed at 23:59 is selected by:
example% find . -atime -1 -print
at 00:01 the next day (less than 24 hours later, not more than one day ago). The midnight boundary between days has no effect on the 24-hour calculation.
If you do not have "-mmin" in your version of "find", then "-mtime -0.041667" gets pretty close to "within the last hour", so in your case, use:
-mtime +(X * 0.041667)
so, if X means 6 hours, then:
find . -mtime +0.25 -ls
works because 24 hours * 0.25 = 6 hours
If one's find does not have -mmin and if one also is stuck with a find that accepts only integer values for -mtime, then all is not necessarily lost if one considers that "older than" is similar to "not newer than".
If we were able to create a file that has an mtime of our cut-off time, we can ask find to locate the files that are "not newer than" our reference file.
Creating a file with the correct time stamp is a bit involved, because a system that doesn't have an adequate find probably also has a less-than-capable date command that cannot do things like: date +%Y%m%d%H%M%S -d "6 hours ago".
Fortunately, other old tools can manage this, albeit in a more unwieldy way.
To begin finding a way to delete files that are over six hours old, we first have to find the time that is six hours ago. Consider that six hours is 21600 seconds:
$ date && perl -e '@d=localtime time()-21600; \
printf "%4d%02d%02d%02d%02d.%02d\n", $d[5]+1900,$d[4]+1,$d[3],$d[2],$d[1],$d[0]'
> Thu Apr 16 04:50:57 CDT 2020
202004152250.57
Since the perl statement produces the date/time information we need, use it to create a reference file that is exactly six hours old:
$ date && touch -t `perl -e '@d=localtime time()-21600; \
printf "%4d%02d%02d%02d%02d.%02d\n", \
$d[5]+1900,$d[4]+1,$d[3],$d[2],$d[1],$d[0]'` ref_file && ls -l ref_file
Thu Apr 16 04:53:54 CDT 2020
-rw-rw-rw- 1 root sys 0 Apr 15 22:53 ref_file
Now that we have a reference file exactly six hours old, the "old UNIX" solution for "delete all files older than six hours" becomes something along the lines of:
$ find . -type f ! -newer ref_file -a ! -name ref_file -exec rm -f "{}" \;
It might also be a good idea to clean up our reference file...
$ rm -f ref_file
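Where GNU date/touch are available, the perl step above isn't needed: touch -d can create the six-hour-old reference file directly. A sketch that echoes instead of deleting (the function name is made up; drop the echo once the list looks right):

```shell
#!/bin/sh
# Dry-run deletion of files in $1 older than 6 hours, via a
# reference file (GNU touch required for the -d syntax).
purge_older_than_6h() {
    ref=$(mktemp)
    touch -d '6 hours ago' "$ref"
    # echo makes this a dry run; remove it to actually delete.
    find "$1" -type f ! -newer "$ref" -exec echo rm -f {} \;
    rm -f "$ref"
}
```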
Here is what one can do to go the way @iconoclast was wondering about in their comment on another answer.
use a user crontab, or /etc/crontab, to create the file /tmp/hour:
# m h dom mon dow user command
0 * * * * root /usr/bin/touch /tmp/hour > /dev/null 2>&1
and then use this to run your command:
find /tmp/ -daystart -maxdepth 1 -not -newer /tmp/hour -type f -name "for_one_hour_files*" -exec do_something {} \;
find "$log_dir" -name "${log_prefix}*${log_ext}" -mmin +"$num_mins" -exec rm -f {} \;
(The original used $PATH as the directory variable; avoid that name, since assigning it clobbers the shell's command search path.)
