Remove files older than the start of the current day - ksh

I want to use the find command to find all files older than today's date.
The command below uses a 24-hour window measured from the current time:
find /home/test/ -mtime +1
I am trying to achieve a solution where, no matter what time it executes from cron, it will check all files older than the start of the day at 00:00. I believe this can be achieved using epoch time, but I am struggling to find the best logic for this.

#!/bin/ksh
# Stamp a reference file at 00:00 today, then report everything strictly older.
touch -t "$(date +%Y%m%d0000.00)" fence
find /home/test/ ! -newer fence -exec \
    ksh -c '
        for f in "$@"; do
            [[ $f -ot fence ]] && printf "%s\n" "$f"
        done
    ' ksh {} +
rm fence
Why find(1) has no -older expression. :-(
UNIX find: opposite of -newer option exists?
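If GNU find is available, a rough alternative avoids the fence file by giving -newermt the start of today directly (a sketch; test with -print before adding -delete or rm):
#!/bin/ksh
# GNU find: a bare YYYY-MM-DD date string is taken as midnight of that day,
# so ! -newermt matches files whose mtime is at or before 00:00 today.
find /home/test/ -type f ! -newermt "$(date +%Y-%m-%d)" -print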

Related

How to delete files older than 30 days based on the date in the filename [duplicate]

I have CSV files that get updated every day; we process the files and want to delete the ones older than 30 days based on the date in the filename.
Example filename:
XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
I would like to schedule the job in crontab to delete files older than 30 days daily.
The path could be /mount/store/
XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
if [ $(date -d '-30 days' +%Y%m%d) -gt $D ]; then
rm -rf $D
fi
The above script doesn't seem to help me. Kindly help me with this.
I have been trying this for the last two days.
Using CentOS 7.
Thanks.
For all files:
Extract the date
touch the file with that date
delete files with the -mtime option
Do this in the desired dir for all files:
f=XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
d=$(echo $f | sed -r 's/[^_]+_[^_]+_(20[0-9]{6})\.[0-9]{6}_.\.csv/\1/')
touch -d $d $f
After performing that for the whole dir, delete the older-thans:
find YourDir -type f -mtime +30 -name "*.csv" -delete
GNU find has the -delete option; other versions of find might need -exec rm {} \; instead.
Test first. Another pitfall is the different kinds of timestamps that touch can affect (mtime, ctime, atime).
Test, manipulating the date with touch:
touch XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
f=XXXXXXXXXXX_xx00xx_20171001.000000_0.csv; d=$(echo $f | sed -r 's/[^_]+_[^_]+_(20[0-9]{6})\.[0-9]{6}_.\.csv/\1/'); touch -d $d $f
ls -l $f
-rw-rw-r-- 1 stefan stefan 0 Okt 1 00:00 XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
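Putting the pieces together, a loop over the whole directory could look something like this (a sketch assuming the /mount/store path and the filename pattern above; it prints candidates rather than deleting them until you swap -print for -delete):
#!/bin/bash
cd /mount/store || exit 1
for f in *_*_20*.csv; do
    # extract the embedded YYYYMMDD and stamp it onto the file as its mtime
    d=$(echo "$f" | sed -r 's/[^_]+_[^_]+_(20[0-9]{6})\.[0-9]{6}_.\.csv/\1/')
    touch -d "$d" "$f"
done
find . -maxdepth 1 -type f -name "*.csv" -mtime +30 -print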
An efficient way to extract the date from the filename is to use variable expansions:
f=XXXXXXXXXXX_xx00xx_20171001.000000_0.csv
d=${f%%.*} # removes largest suffix .*
d=${d##*_} # removes largest prefix *_
Or to use a bash-specific regex match:
if [[ $f =~ [0-9]{8} ]]; then echo "$BASH_REMATCH"; fi
Here is a solution if you have dgrep from dateutils.
ls *.csv | dateutils.dgrep -i '%Y%m%d' --le $(date -d "-30 day" +%F) | xargs -d '\n' rm
First we can use either ls or find to obtain a list of filenames. We can then pipe the results to dgrep to filter the filenames that contain a date string matching our condition (in this case, older than 30 days). Finally, we pipe the result to xargs rm to remove all the matched files.
-i '%Y%m%d' input date format as specified in your filename
--le $(date -d "-30 day" +%F) filter dates that are older than 30 days
You can change rm to printf "%s\n" to test the command before actually deleting it.
The following approach does not look at the file's timestamps at all; it assumes the date in the filename is unrelated to the day the file was created.
#!/usr/bin/env bash
d=$(date -d "-30 days" "+%Y%m%d")
for file in /yourdir/*csv; do
    date=${file:$((${#file}-21)):8}   # the 8-digit date starts 21 characters from the end
    (( date < d )) && rm "$file"
done

Boolean check if a file has been opened in the past hour

I am trying to write a cron job that looks inside a specified directory and checks if the files are more than an hour old.
#!/bin/bash
for F in /My/Path/*.txt; do
    if [ ***TEST IF FILE WAS OPENED IN THE PAST HOUR *** ]
    then
        echo "$F"
    fi
done
thanks for any help
This can be done with a simple find:
find /path/to/directory -type f -newermt "1 hours ago"
Any files modified within the past hour will be printed to stdout. No need to loop and print.
#!/bin/bash
RECENT_FILES=$(find /path/to/directory -type f -newermt "1 hours ago")
if [[ -n $RECENT_FILES ]]; then
    echo "$RECENT_FILES"
else
    echo "No recently modified files found in dir"
fi
You can always redirect the results to a log file if you're trying to compile a list as well:
find /path/to/directory -type f -newermt "1 hours ago" >> $yourLogFile
A more rigorous approach uses GNU date, which has the -r option:
-r, --reference=FILE
display the last modification time of FILE
Using the above, incorporated into your script:
#!/bin/bash
for filename in /My/Path/*.txt ; do
    if (( (($(date +%s) - $(date -r "$filename" +%s))/60) <= 60 )); then
        echo "$filename"
    fi
done
The logic is straightforward: we get the file's age in minutes by subtracting the file's modification time (as an epoch timestamp) from the current time and dividing by 60. If the file was modified within the last 60 minutes, its name is printed.

Ksh command to find files modified during last hour

Can you please tell me the most efficient command to find files in a directory modified in the last hour (more precisely, the last 60 minutes)?
Or
If that is not a good approach, please tell me how I can compare the current time with a file's creation/modification timestamp.
thanks
Use find's option -newermt:
find -newermt 'now -1 hour'
Or simply
find -newermt -1hour
Read more about the usage in find's manual.
Another way for ksh:
BEFORE=$(date -d '-1 hour' '+%s'); find -type f -printf '%T# %p\n' | while read -r TS FILE; do TS=${TS%.*}; [[ TS -ge BEFORE ]] && echo "$FILE"; done
Without using -d:
BEFORE=$(( $(date '+%s') - 3600 )); find -type f -printf '%T# %p\n' | while read -r TS FILE; do TS=${TS%.*}; [[ TS -ge BEFORE ]] && echo "$FILE"; done
This will work on most modern POSIX conforming systems, using ksh[nn] or bash.
filetime=$(perl -e'
use POSIX qw(strftime);
$now_string = strftime "%Y%m%d%H%M.%S", localtime(time - 3601);
print $now_string, "\n";' )
touch -t $filetime /tmp/dummy # here for verification
ls -l /tmp/dummy
find . -newer /tmp/dummy -exec ls -l {} \;
The ls -l command is there to show you the "dummy" file; remove it after you have checked what the touch command does.
Touch a file with a timestamp one hour old, and use find's -newer option.
The hour-old timestamp can be calculated with a timezone trick (example for GMT):
echo "$(TZ=GMT+1 date +%Y-%m-%d)"
So you do something like
HOUROLDFILE=/tmp/hourold.tmp
touch -t "$(TZ=GMT+1 date '+%y%m%d%H%M')" ${HOUROLDFILE}
find a_directory -newer ${HOUROLDFILE}
rm ${HOUROLDFILE}

Removing old folders in bash backup script

I have a bash script that rsyncs files onto my NAS to the directory below:
mkdir /backup/folder_`date +%F`
How would I go about writing a cleanup script that removes directories older than 7 days, based upon the date in the directory name?
#!/bin/bash
shopt -s extglob
OLD=$(exec date -d "now - 7 days" '+%s')
cd /backup || exit 1 ## If necessary.
while read DIR; do
    if read DATE < <(exec date -d "${DIR#*folder_}" '+%s') && [[ $DATE == +([[:digit:]]) && DATE -lt OLD ]]; then
        echo "Removing $DIR." ## Just an example message. Or we could just exclude this and add -v option to rm.
        rm -ir "$DIR" ## Change to -fr to skip confirmation.
    fi
done < <(exec find -maxdepth 1 -type d -name 'folder_*')
exit 0
We could use more careful approaches like read -rd $'\0', -print0, and IFS=, but I don't think they are really necessary this time.
Create a list of folders with the pattern you want to remove, remove the folders you want to keep from the list, delete everything else.
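A sketch of that keep-list idea, assuming the folder_YYYY-MM-DD naming from the question and GNU date (the echo is left in so nothing is removed until the output has been checked):
#!/bin/bash
cd /backup || exit 1
keep=""
for i in 0 1 2 3 4 5 6; do                 # folder names for the last 7 days
    keep="$keep folder_$(date -d "-$i day" +%F)"
done
for dir in folder_*; do
    case " $keep " in
        *" $dir "*) ;;                     # on the keep list, leave it alone
        *) echo rm -r "$dir" ;;            # drop the echo once verified
    esac
done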
How about a simple find:
find /backup -name 'folder_*' -type d -ctime +7 -exec rm -rf {} \;

bash script to find old files based off date in file name

I'm developing a bash script that needs to search out files within a single directory that are "old" based on a variable specifying how many days must pass before the threshold is exceeded and the files are marked for action (which could be anything from move to archive to delete, etc.).
The catch is that the modification time of the file is irrelevant in determining how old the files need to be before being acted upon, as the files may be changed infrequently, the execution time of the script can vary, etc.
The time that determines how old the files are is in the actual file name, in the form YYYY-MM-DD (or %F with the date command). Take for instance the filename contents-2011-05-23.txt. What command(s) could be run in this directory to find all files that exceed a certain number of days (I currently have the threshold set to 7 days, but it could change) and print out their file names?
Create a bash script isOld.sh like this:
#!/bin/bash
fileName=$1
numDays=$2
# Extract the YYYY-MM-DD portion of the name (everything between the first "-" and the extension).
fileDt=$(echo "$fileName" | sed 's/^[^-]*-\([^.]*\)\..*$/\1/')
d1=$(date '+%s')
d2=$(date -d "$fileDt" '+%s')
diff=$((d1-d2))
seconds=$((numDays * 24 * 60 * 60))
[[ diff -ge seconds ]] && echo "$fileName"
Then give execute permission to above file by running:
chmod +x ./isOld.sh
And finally run this find command from top of your directory to print files older than 7 days as:
find . -name "contents-*" -exec ./isOld.sh {} 7 \;
On BSD, the -j flag prevents date from setting the system clock, and the -f parameter specifies the format of the input date.
First, you need to find today's date as the number of seconds since January 1, 1970:
today=$(date -j -f "%Y-%m-%d" "$(date +%Y-%m-%d)" +%s)
Now, you can use that to find out the time seven days ago:
((cutoff = $today - 604800))
The number 604800 is the number of seconds in seven days.
Now, for each file in your directory, you need to find the date part of the string. I don't know of a better way. (Maybe someone knows some Bash magic).
find . -type f | while read fileName
do
    fileDate=$(echo "$fileName" | sed 's/.*-\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/')
    yadda, yadda, yadda #Figure this out later
done
Once we have the file date, we can use the date command to figure out whether that date, in seconds, is less than the cutoff (and thus older than the cutoff date):
today=$(date -j -f "%Y-%m-%d" "$(date +%Y-%m-%d)" +%s)
((cutoff = $today - 604800))
find . -type f | while read fileName #Or however you get all the file names
do
    fileDate=$(echo "$fileName" | sed 's/.*-\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/')
    fileDateInSeconds=$(date -j -f "%Y-%m-%d" "$fileDate" +%s)
    if [ $fileDateInSeconds -lt $cutoff ]
    then
        rm "$fileName"
    fi
done
On Linux, you use the -d parameter to specify the date, which must be in YYYY-MM-DD format:
today=$(date +"%Y-%m-%d")
Now, you can take that and find the number of seconds:
todayInSeconds=$(date -d "$today" +%s)
Everything else should be more or less the same as above.
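For completeness, a full Linux version of the loop might look like this (a sketch; it prints matches instead of removing them until you have verified the output):
#!/bin/bash
cutoff=$(( $(date +%s) - 604800 ))          # 7 days ago, in epoch seconds
find . -type f -name '*.txt' | while read fileName
do
    fileDate=$(echo "$fileName" | sed 's/.*-\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\).*/\1/')
    fileDateInSeconds=$(date -d "$fileDate" +%s)
    if [ "$fileDateInSeconds" -lt "$cutoff" ]
    then
        echo "$fileName"                    # replace echo with rm when ready
    fi
done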
If you run the command daily, you could do this:
echo *-`date -d '8 days ago' '+%F'`.txt
Additional wildcards could be added, of course.
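If the cleanup might not run every single day, a loop over a range of past dates would cover missed days (a sketch assuming GNU date and the contents-YYYY-MM-DD.txt naming):
for i in $(seq 8 60); do
    rm -f -- *-"$(date -d "$i days ago" +%F)".txt
done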
find *[0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]*.txt -exec bash -c 'dt=`echo $0 | sed -re "s/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/"`; file_time=`date -d $dt +%s`; cutoff_time=`date -d "31 days ago" +%s` ; test $file_time -lt $cutoff_time ' {} \; -print
That's one of my longest one-liners :-) Here it is again, wrapped:
find *[0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]*.txt \
-exec bash -c ' dt=`echo $0 | \
sed -re "s/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/"`; \
file_time=`date -d $dt +%s`; \
cutoff_time=`date -d "31 days ago" +%s` ;\
test $file_time -lt $cutoff_time \
' {} \; -print
