Identify the files year wise and delete from a dir in unix - shell

I need to list the files that were created in a specific year and then delete them. The year should be an input.
I tried with date and it works for me, but I am not able to convert that date to a year for comparison in a loop to get the list of files.
The code below gives me the files from 05/07, but I want to list the files that were created in 2022, 2021, etc.
for file in /tmp/abc*txt ; do
[ "$(date -I -r "$file")" == "2022-05-07" ] && ls -lstr "$file"
done

If you end up doing ls -l anyway, you might just parse the date information from the output. (However, generally don't use ls in scripts.)
ls -ltr | awk '$8 ~ /^202[01]$/'
date -r is not portable, though if you have it, you could do
for file in /tmp/abc*txt ; do
case $(date -I -r "$file") in
2020-* | 2021-* ) ls -l "$file";;
esac
done
(The -t and -r flags to ls have no meaning when you are listing a single file anyway.)
If you don't, the tool of choice would be stat, but it too has portability issues; the precise options to get the information you want will vary between platforms. On Linux, try
for file in /tmp/abc*txt ; do
case $(LC_ALL=C stat -c %y "$file") in
2020-* | 2021-* ) ls -l "$file";;
esac
done
On BSD (including macOS), try stat -f %Sm -t %Y "$file" to get just the year.
If you need proper portability, perhaps look for a scripting language with wide support, such as Perl or Python. The stat() system call is the fundamental resource for getting metainformation about a file. The find command also has some features for finding files by age, though its default behavior is to traverse subdirectories, too (you can inhibit that with -maxdepth 1; but then the options to select files by age are again not entirely POSIX portable).
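As a rough illustration of the stat variants discussed above, the following bash sketch wraps both in one helper. The function names file_year and list_files_from_year are made up for this example, and the platform check (trying the GNU syntax first and falling back to the BSD syntax) is an assumption about how you might want to detect the flavor of stat, not an established convention.

```shell
#!/usr/bin/env bash
# file_year FILE: print the four-digit year of FILE's last modification time.
# (file_year is a hypothetical helper name, not part of any tool.)
file_year() {
    if stat -c %y -- "$1" >/dev/null 2>&1; then
        # GNU stat: %y prints something like "2021-06-15 12:00:00.000000000 +0200"
        LC_ALL=C stat -c %y -- "$1" | cut -c1-4
    else
        # BSD/macOS stat: format the modification time with strftime's %Y
        stat -f %Sm -t %Y -- "$1"
    fi
}

# list_files_from_year YEAR FILE...: print the FILEs last modified in YEAR.
list_files_from_year() {
    local year=$1 file
    shift
    for file in "$@"; do
        [ -f "$file" ] || continue      # skip unmatched globs and non-files
        if [ "$(file_year "$file")" = "$year" ]; then
            printf '%s\n' "$file"
        fi
    done
}
```

With these defined, list_files_from_year 2021 /tmp/abc*txt would print the matching names; deleting them is then a matter of replacing printf with rm once the list looks right.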

To list out files which were last modified in a specific year and then to delete those files, you could use a combination of the find -newer and touch commands:
# given a year as input
year=2022
stampdir=$(mktemp -d)
touch -t ${year}01010000 "$stampdir"/beginning
touch -t $((year+1))01010000 "$stampdir"/end
find /tmp -name 'abc*txt' -type f -newer "$stampdir/beginning" ! -newer "$stampdir/end" -print -delete
rm -r "$stampdir"
First, create a temporary working directory to store the timestamp files; we don't want the find command to accidentally find them. Be careful here; mktemp will probably create a directory in /tmp; this use-case is safe only because we're naming the timestamp files such that they don't match the "abc*txt" pattern from the question.
Next, create bordering timestamp files with the touch command: one stamped with the first moment of the year, named "beginning", and another stamped with the first moment of the following year, named "end".
Then run the find command; here's the breakdown:
start in /tmp (from the question)
files named with the 'abc*txt' pattern (from the question)
only files (not directories, etc -- from the question)
newer than the beginning timestamp file
not newer than (i.e. the same age as or older than) the end timestamp file
if found, print the filename and then delete it
Finally, clean up the temporary working directory that we created.
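The same steps can be wrapped in a function that only lists by default and deletes on request, which makes a dry run easy. This is a sketch under assumptions: the name files_from_year and the --delete argument are invented here, and -delete is a GNU/BSD find extension rather than POSIX.

```shell
#!/usr/bin/env bash
# files_from_year DIR PATTERN YEAR [--delete]
# Print regular files under DIR matching PATTERN whose mtime falls in YEAR.
# They are deleted only when the fourth argument is --delete.
# (Function name and --delete switch are hypothetical, for illustration.)
files_from_year() {
    local dir=$1 pattern=$2 year=$3 mode=${4:-}
    local stampdir status
    local -a action=(-print)
    if [ "$mode" = "--delete" ]; then
        action+=(-delete)
    fi

    # Boundary files: first minute of YEAR and first minute of YEAR+1.
    stampdir=$(mktemp -d) || return 1
    touch -t "${year}01010000" "$stampdir/beginning"
    touch -t "$((year + 1))01010000" "$stampdir/end"

    find "$dir" -name "$pattern" -type f \
        -newer "$stampdir/beginning" ! -newer "$stampdir/end" "${action[@]}"
    status=$?

    rm -r "$stampdir"
    return "$status"
}
```

Running files_from_year /tmp 'abc*txt' 2022 first and adding --delete only after checking the output keeps the destructive step explicit. The caveat about the stamp files living under /tmp applies here too: a PATTERN that matches "beginning" or "end" could pick them up.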

Try this:
For checking which files are picked up:
echo -e "Give Year :"
read yr
ls -ltr /tmp | grep "^-" |grep -v ":" | grep $yr | awk -F " " '{ print $9;}'
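** To delete the picked files, note that awk has no rm statement of its own; replace { print $9;} with { system("rm /tmp/" $9) } in the above command instead. This breaks on file names that contain spaces or shell metacharacters.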

Related

GREP date from email header and make it the files creation date

I am on Mac Terminal and want to "grep" a string (which is a UNIX timestamp) out of an email header, convert that into a format the OS can work with and make that the creation date of the file. I want to do that recursively for all mails inside a folder (with multiple possible subfolders).
The structure would probably look something like this:
#!/bin/bash
for i in `ls`
do
# Find the date field (X-Delivery-Time) inside an email header and grep the UNIX timestamp
# convert timestamp to a format the OS can work with
# overwrite the existing creation date with the new one
done
The mails header look like this
X-Envelope-From: <some@mail.com>
X-Envelope-To: <my@mail.com>
X-Delivery-Time: 1535436541
...
Some background: Apple Mail uses the date a file was created as the date displayed within Apple Mail. That’s why after moving mails from one server to another all mails now display the same date which makes sorting impossible.
As I am new to Terminal/Bash any help is appreciated. Thanks
On a Mac this should work, but since I have no mac I cannot test it myself. I assume your email files have the .emlx extension.
For a single directory:
for i in ./*.emlx; do
unixTime=$(grep -m1 '^X-Delivery-Time:' "$i" | grep -Eo '[0-9]+') &&
humanTime=$(date -r "$unixTime" +%Y%m%d%H%M.%S) &&
touch -t "$humanTime" "$i"
done
For a whole directory tree:
fixdate() {
unixTime=$(grep -m1 '^X-Delivery-Time:' "$1" | grep -Eo '[0-9]+') &&
humanTime=$(date -r "$unixTime" +%Y%m%d%H%M.%S) &&
touch -t "$humanTime" "$1"
}
export -f fixdate
find . -name '*.emlx' -exec bash -c 'fixdate "$@"' . {} \;
or, if you have bash 4 or higher installed (macOS still uses 3 by default)
shopt -s globstar
for i in ./**/*.emlx; do
unixTime=$(grep -m1 '^X-Delivery-Time:' "$i" | grep -Eo '[0-9]+') &&
humanTime=$(date -r "$unixTime" +%Y%m%d%H%M.%S) &&
touch -t "$humanTime" "$i"
done
What follows assumes you are using the default macOS utilities (touch, date...) As they are completely outdated some adjustments will be needed if you use more recent versions (e.g. macports or brew). It also assumes that you are using bash.
If you have sub-folders ls is not the right tool. And anyway, the output of ls is not for computers, it is for humans. So, the first thing to do is find all email files. Guess what? The utility that does this is named find:
$ find . -type f -name '*.emlx'
./foo/bar.emlx
./baz.emlx
...
This searches for regular files (-type f), starting from the current directory (.), whose name is anything.emlx (-name '*.emlx'). Adapt to your situation. If all files are email files you can skip the -name ... part.
Next we need to loop over all these files and process each of them. This is a bit more complex than for f in ... for several reasons (large number of files, file names with spaces...) A robust way to do this is to redirect the output of a find command to a while loop:
while IFS= read -r -d '' f; do
<process file "$f">
done < <(find . -type f -name '*.emlx' -print0)
The -print0 option of find is used to separate the file names with a null character instead of the default newline character. The < <(find...) part is a way to redirect the output of find to the input of the while loop. The while IFS= read -r -d '' f; do reads each file name produced by find, stores it in shell variable f, preserving the leading and trailing spaces if any (IFS=), the backslashes (-r) and using the null character as separator (-d '').
Now we must code the processing of each file. Let's first retrieve the delivery time, assuming it is always the second word of the last line starting with X-Delivery-Time::
awk '/^X-Delivery-Time:/ {t = $2} END {print t}' "$f"
does that. If you don't know awk already it's time to learn a bit of it. It's one of the very useful Swiss knives of text processing (sed is another). But let's improve it a bit such that it returns the first encountered delivery time instead of the last, stops as soon as it encountered it, and also checks that the timestamp is a real timestamp (digits):
awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f"
The [[:space:]]+ part of the regular expression matches 1 or more spaces, tabs,... and the [[:digit:]]+ matches 1 or more digits. ^ and $ match the beginning and the end of the line, respectively. The result can be assigned to a shell variable:
t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
Note that if there was no match the t variable will store the empty string. We will use this later to skip such files.
Once we have this delivery time, which looks like a UNIX timestamp (seconds since 1970/01/01) in your example, we must use it to change the last modification time of the email file. The command that does this is touch:
$ man touch
...
touch [-A [-][[hh]mm]SS] [-acfhm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]] file ...
...
Unfortunately touch wants a time in the CCYYMMDDhhmm.SS format. No worry, the date utility can be used to convert a UNIX timestamp in any format we like. For instance, with your example timestamp (1535436541):
$ date -r 1535436541 +%Y%m%d%H%M.%S
201808280809.01
We are almost done:
while IFS= read -r -d '' f; do
# uncomment for debugging
# echo "processing $f"
t="$(awk '/^X-Delivery-Time:[[:space:]]+[[:digit:]]+$/ {print $2; exit}' "$f")"
if [ -z "$t" ]; then
echo "no delivery time found in $f"
continue
fi
# uncomment for debugging
# echo touch -t "$(date -r "$t" +%Y%m%d%H%M.%S)" "$f"
touch -t "$(date -r "$t" +%Y%m%d%H%M.%S)" "$f"
done < <(find . -type f -name '*.emlx' -print0)
Note how we test if t is the empty string (if [ -z "$t" ]). If it is, we print a message and jump to the next file (continue). Just put all this in a file with a shebang line and run...
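The answer above targets the stock macOS date, where -r takes a number of seconds. If the same script might also run with GNU date (for example coreutils from Homebrew, or a Linux machine), a small wrapper can pick the right syntax. The sketch below is only one way to do it: the function names are invented for this example, the GNU check via date --version is a common idiom rather than something from the answer, and the awk pattern spells the character classes as [ \t] and [0-9] so that it also works with older awk implementations.

```shell
#!/usr/bin/env bash
# delivery_time FILE: print the first X-Delivery-Time value made only of digits.
delivery_time() {
    awk '/^X-Delivery-Time:[ \t]+[0-9]+$/ {print $2; exit}' "$1"
}

# epoch_to_touch SECONDS: convert a UNIX timestamp to touch's CCYYMMDDhhmm.SS format.
epoch_to_touch() {
    if date --version >/dev/null 2>&1; then
        date -d "@$1" +%Y%m%d%H%M.%S    # GNU date
    else
        date -r "$1" +%Y%m%d%H%M.%S     # BSD/macOS date
    fi
}

# fix_mtime FILE: set FILE's modification time from its X-Delivery-Time header.
fix_mtime() {
    local t
    t=$(delivery_time "$1")
    if [ -z "$t" ]; then
        echo "no delivery time found in $1" >&2
        return 1
    fi
    touch -t "$(epoch_to_touch "$t")" "$1"
}
```

Inside the while loop shown earlier, the body would then reduce to fix_mtime "$f".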
If, instead of the X-Delivery-Time field, you must use a Date field with a more complex and variable format (e.g. Date: Mon, 11 Jun 2018 10:36:14 +0200), the best would be to install a decently recent version of touch with the coreutils package of Mac Ports or Homebrew. Then:
while IFS= read -r -d '' f; do
t="$(awk '/^Date:/ {print gensub(/^Date:[[:space:]]+(.*)$/,"\\1","1"); exit}' "$f")"
if [ -z "$t" ]; then
echo "no delivery time found in $f"
continue
fi
touch -d "$t" "$f"
done < <(find . -type f -name '*.emlx' -print0)
The awk command is slightly more complex. It prints the matching line without the Date: prefix; note that gensub is a GNU awk (gawk) extension. The following sed command (GNU sed syntax) would do the same in a more compact form but would not really be more readable:
t="$(sed -rn 's/^Date:\s*(.*)/\1/p;Ta;q;:a' "$f")"

search returning subfolder not main folder in bash

I have the bash script below that searches a specific directory and returns the earliest folder in that directory. The script works great if there are no subfolders within each folder. If there are, those are returned instead of the main folder. I am not sure why this is happening or how to fix it. Thank you :).
For example,
/home/cmccabe/Desktop/NGS/test is the directory searched and in it there are two folders, R_1 and R_2
output
The earliest folder is: R_1
However, if /home/cmccabe/Desktop/NGS/test has R_1 with testfolder within it, and R_2 with testfolder2 within it:
output
The earliest folder is: testfolder
Bash
cd /home/cmccabe/Desktop/NGS/test
folder=$(ls -u *"R_"* | head -n 1) # earliest folder
echo "The earliest folder is: $folder"
ls is the wrong tool for this job: Its output is built for humans, not scripts, and is often surprising when nonprintable characters are present.
Assuming you have GNU find and sort, the following works with all possible filenames, including those with literal newlines:
dir=/home/cmccabe/Desktop/NGS/test # for my testing, it's "."
{
read -r -d $'\t' time && read -r -d '' filename
} < <(find "$dir" -maxdepth 1 -mindepth 1 -printf '%T+\t%P\0' | sort -z )
...thereafter:
echo "The oldest file is $filename, with an mtime of $time"
For a larger discussion of portably finding the newest or oldest file in a directory, including options that don't require GNU tools, see BashFAQ #99.
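For systems without GNU find, a loop over a glob with bash's -ot ("older than") test gives a similar result without parsing any listing. This is a minimal sketch in the spirit of that approach, not a copy of it; oldest_dir is a made-up name, and it deliberately looks only at immediate subdirectories, which is what the question needs.

```shell
#!/usr/bin/env bash
# oldest_dir PARENT: print the immediate subdirectory of PARENT with the
# oldest modification time. Prints nothing and returns 1 if there is none.
# (oldest_dir is a hypothetical helper name.)
oldest_dir() {
    local parent=$1 d oldest=
    for d in "$parent"/*/; do           # the trailing slash matches directories only
        [ -d "$d" ] || continue         # the glob matched nothing
        if [ -z "$oldest" ] || [ "$d" -ot "$oldest" ]; then
            oldest=$d
        fi
    done
    [ -n "$oldest" ] || return 1
    printf '%s\n' "${oldest%/}"         # strip the trailing slash
}
```

To restrict the search to the R_ folders from the question, the glob could be changed to "$parent"/*R_*/. Because the glob does not descend into R_1 or R_2, nested folders such as testfolder are never candidates.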
You should read about ls; the -u option doesn't do what you think it does. The following are the relevant options:
-u - Tells ls to use last access time instead of last modification time when sorting by time. By itself it does not sort, so it should be combined with -t
-t - Sorts by time (modification time by default), with newest first
-r - Reverses the order of the output
-d - Lists directories themselves instead of their contents
So what you actually need is:
$ ls -trd *"R_"* | head -n 1
Or:
$ ls -utrd *"R_"* | head -n 1

Newest file in directories

Hi i have two directories
Directory 1:
song.mp3
work.txt
Directory 2:
song.mp3
work.txt
These files are the same, but song.mp3 in directory 1 is newer than song.mp3 in directory 2, and work.txt in directory 2 is newer than work.txt in directory 1.
How can I print the results into two files? For example:
file1 should list the files that are newer than their counterparts in directory 2, so it must contain song.mp3,
and file2 should list the files that are newer than their counterparts in directory 1, so it must contain work.txt.
I tried
find $directory1 -type f -newer $directory2
but it always prints the newest file in both directories. Could someone help me?
-newer $directory2 is just using the timestamp on the directory $directory2 as the reference point for all the comparisons. It doesn't look at any of the files inside $directory2.
I don't think there's anything like a "compare each file to its counterpart in another directory" operation built in to find, so you'll probably have to do some of the work yourself. Here's a short script demonstrating one way it can be done:
(cd "$directory1" && find . -print) | while IFS= read -r fn; do
if [ "$directory1/$fn" -nt "$directory2/$fn" ]; then
printf "%s\n" "$directory1/$fn"
else
printf "%s\n" "$directory2/$fn"
fi
done
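The loop above prints the newer copy of each pair. To get the two separate lists the question asks for, the same -nt test can be applied in both directions. The sketch below is one possible arrangement: compare_dirs and its four arguments are invented for this example, it only looks at regular files directly inside the first directory (no recursion), and names that exist in only one directory are skipped.

```shell
#!/usr/bin/env bash
# compare_dirs DIR1 DIR2 OUT1 OUT2
# OUT1 receives the names that are newer in DIR1 than in DIR2,
# OUT2 receives the names that are newer in DIR2 than in DIR1.
# (compare_dirs is a hypothetical helper name.)
compare_dirs() {
    local dir1=$1 dir2=$2 out1=$3 out2=$4 path name
    : > "$out1"                          # start with empty result files
    : > "$out2"
    for path in "$dir1"/*; do
        [ -f "$path" ] || continue       # regular files only
        name=${path##*/}
        [ -f "$dir2/$name" ] || continue # no counterpart, nothing to compare
        if [ "$path" -nt "$dir2/$name" ]; then
            printf '%s\n' "$name" >> "$out1"
        elif [ "$dir2/$name" -nt "$path" ]; then
            printf '%s\n' "$name" >> "$out2"
        fi
    done
}
```

For the layout in the question, compare_dirs "$directory1" "$directory2" file1 file2 should leave song.mp3 in file1 and work.txt in file2. Files with identical modification times end up in neither list.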
# set up the test
mkdir directory1 directory2
touch directory1/song.mp3
touch -t 200101010000 directory1/work.txt
touch -t 200101010000 directory2/song.mp3
touch directory2/work.txt
# find the newest of each filename:
# sort the files in both directories by mtime
# then only output the filename (regardless of directory) the first time seen
stat -c '%Y %n' directory[12]/* |
sort -rn |
cut -d " " -f 2- |
awk -F / '!seen[$2]++'
directory2/work.txt
directory1/song.mp3
If you are on a Linux that supports the following:
Fileage=`date +%s -r filename`
You could run a "find" and print age in seconds followed by filename for each file and then sort that file. This has the benefit that it will work across any number of directories - not just two. Glenn's more widely available "stat -c" could be used in place of my "date" command - and he's done the "sort" and "awk" for you!

Get the latest created directory from a filepath

I am trying to find what is the latest directory created in a given filepath.
ls -t sorts the contents by the timestamp of each file or directory, but I need only directories.
You can use the fact that, in a long listing, the line for a directory begins with a d.
Hence, you can do:
ls -lt /your/dir | grep ^d
This way, the most recently modified directory will appear at the top (ls -t sorts by modification time, not creation time). If you want it the other way round, with the oldest at the top and the newest at the bottom, use -r:
ls -ltr /your/dir | grep ^d
*/ matches directories.
So you could use the following command to get the most recent directory:
ls -td /path/to/dir/*/ | head -1
BUT, I would not recommend this because parsing the output of ls is unsafe.
Instead, you should create a loop and compare timestamps:
dirs=( /path/to/dir/*/ )
newest=${dirs[0]}
for d in "${dirs[@]}"
do
if [[ $d -nt $newest ]]
then
newest=$d
fi
done
echo "Most recent directory is: $newest"

OS X script to send email when new file is created

How can I monitor a directory, and send an email whenever a new file is created?
I currently have a script running daily which uses find to search for all files in a directory with a last modified date newer than an empty timestamp file:
#!/bin/bash
folderToWatch="/Path/to/files"
files=files.$$
find $folderToWatch/* -newer timestamp -print > $files
if [ -s "$files" ]
then
# SEND THE EMAIL
touch timestamp
fi
Unfortunately, this also sends emails when files are modified. I know the creation date is not stored in Unix, but this information is available in Finder, so can I somehow modify my script to use that creation date rather than the last modified date?
Snow Leopard's find command has a -Bnewer primary that compares the file's "birth time" (aka inode creation time) to the timestamp file's modify time, so it should do pretty much what you want. I'm not sure exactly when this feature was added; it's there in 10.6.4, not there in 10.4.11, and I don't have a 10.5 machine handy to look at. If you need this to work on an earlier version, you can use stat to fake it, something like this:
find "$folderToWatch"/* -newer timestamp -print | \
while IFS="" read -r file; do
if [[ $(stat -f %B "$file") -gt $(stat -f %m timestamp) ]]; then
printf "%s\n" "$file"
fi
done >"$files"
You could maintain a manifest:
new_manifest=/tmp/new_manifest.$$
(cd $folderToWatch; find .) > $new_manifest
diff manifest $new_manifest | perl -ne 'print "$1\n" if m{^> \./(.*)}' > $files
mv -f $new_manifest manifest
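A variant of the manifest idea can be written with sort and comm instead of diff and perl, since comm -13 prints exactly the lines that appear only in the second of two sorted files. This is a sketch under assumptions: new_files is an invented name, the manifest path is whatever you pass in, the very first run reports every existing file as new, and file names containing newlines are not handled.

```shell
#!/usr/bin/env bash
# new_files DIR MANIFEST
# Print the files under DIR that are not yet listed in MANIFEST,
# then replace MANIFEST with the current listing.
# (new_files is a hypothetical helper name.)
new_files() {
    local dir=$1 manifest=$2 current
    current=$(mktemp) || return 1
    # comm needs both inputs sorted the same way, so pin the collation.
    (cd "$dir" && find . -type f | LC_ALL=C sort) > "$current"
    [ -f "$manifest" ] || : > "$manifest"   # first run: start from an empty manifest
    LC_ALL=C comm -13 "$manifest" "$current"
    mv -f "$current" "$manifest"
}
```

In the daily script, the output of new_files "$folderToWatch" manifest could be redirected to $files and tested with -s as before. Modified files are never reported, because only names are compared.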
You may be interested in looking at the change time.
if test `find "text.txt" -cmin +120`
then
echo old enough
fi
See: How do i check in bash whether a file was created more than x time ago

Resources