remove old backup files - bash

# find /home/shantanu -name 'my_stops*' | xargs ls -lt | head -2
The command mentioned above will list the latest 2 files having my_stops in its name. I want to keep these 2 files. But I want to delete all other files starting with "my_stops" from the current directory.

If you create backups on a regular basis, it may be useful to use the -atime option of find so only files older than your last two backups can be selected for deletion.
For daily backups you might use
$ find /home/shantanu -atime +2 -name 'my_stops*' -exec rm {} \;
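but a different expression (for example -mtime, which tests modification time rather than access time) may suit you better.
In the example I used +2 to mean more than 2 days. find discards fractional days, so this matches files last accessed at least 3 days ago.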
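A minimal sketch of how the +2 test behaves, assuming GNU find and using -mtime (modification time) instead of -atime. The helper name list_older_than is made up for illustration. It only prints the candidates, which makes a cheap dry run before switching to -exec rm or -delete.

```shell
# Hypothetical helper: list (do not delete) regular files under DIR whose
# name matches PATTERN and whose modification time is more than DAYS whole
# 24-hour periods in the past. Like the find command above, it is recursive.
# Usage: list_older_than DIR PATTERN DAYS
list_older_than() {
    find "$1" -type f -name "$2" -mtime "+$3" -print
}

# Dry run, for example:
#   list_older_than /home/shantanu 'my_stops*' 2
```

With DAYS set to 2, a file modified 60 hours ago is not listed (2 whole days is not more than 2), while a file modified 84 hours ago is.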

Here is a non-recursive solution:
ls -t my_stops* | awk 'NR>2 {system("rm \"" $0 "\"")}'
Explanation:
The ls command lists files with the latest 2 on top
The awk command states that for those lines (NR = number of records, i.e. lines) greater than 2, delete them
The quote characters are needed just in case the file names have embedded spaces
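Quoting inside awk covers spaces, but names containing double quotes, dollar signs, or newlines would still break it. As a rough alternative sketch, assuming GNU find, sort, tail, cut, and xargs (all of which accept NUL-delimited records), something like the following keeps the newest files without parsing ls output. The function name prune_old_files is hypothetical.

```shell
# Hypothetical helper: delete all but the KEEP newest files matching PATTERN
# directly inside DIR (non-recursive). Assumes GNU tools with -z / -0 support.
# Usage: prune_old_files DIR PATTERN KEEP
prune_old_files() {
    find "$1" -maxdepth 1 -type f -name "$2" -printf '%T@\t%p\0' |
        sort -z -rn |                  # newest first, by epoch timestamp
        tail -z -n "+$(($3 + 1))" |    # skip the KEEP newest records
        cut -z -f2- |                  # drop the timestamp column
        xargs -0 -r rm -f --           # -r: do nothing if the list is empty
}

# For example, to keep the two newest my_stops* files in the current directory:
#   prune_old_files . 'my_stops*' 2
```

Replacing the final xargs stage with `xargs -0 -r printf '%s\n'` shows what would be deleted without touching anything.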

See here
(ls -t|head -n 2;ls)|sort|uniq -u|xargs rm

That will show you from the second line forward ;)
find /home/shantanu -name 'my_stops*' | xargs ls -lt | tail -n +2
Just keep in mind that find is recursive ;)

Without recursive approach:
find /home/folder/ -maxdepth 1 -name "*.jpg" -mtime +2

Related

Automator - find and move most recent file in filtered list

I'm trying to use Automator to
get a filtered list of files in a folder that are more than 30 days old (ok)
move the most recent of those (if there are any) to an existing subfolder
trash the others.
Step 1 is easy enough, but I haven't been able to find a way to do step 2 -- select and move the most recent of the filtered set.
The outcome would be: if the workflow is scheduled once a month, then the subfolder will contain one file per month, & the parent folder will contain only files less than 31 days old
Is there a way to do that?
UPDATE
I tried adding a shell script to the automator workflow
fn=$(ls -t | head -n1)
mv -f -- "$fn" ./<subdirectory>/
But am having trouble with the path in the second line.
ls -t | head -n1 is good, but be careful: if the subdirectory is the most recently modified entry, it takes the first slot. That not only results in an attempt to move the directory into itself (not allowed, and possibly your problem "with the path in the second line"), but could also delete the rest of the files, including the one you want to keep. There are many ways to filter out directories; off the top of my head, you could use ls -tp | grep -v '/$' | head -n1. Note that adding a file to a directory updates that directory's mtime (last-modified time) on POSIX systems.
Removing all files is easy: once you have moved out the file you want to keep, just rm *. Note that this will not remove directories (so long as you do not put -r), which I think is what you want because it appears you're moving the file you want to keep to a sub-directory of where it was.
You may want to add some error trapping too, so that if a step fails, later steps don't delete files you don't want deleted. I do not use Automator, but this should work as long as you're using real bash (including with other schedulers, like cron, as long as you get into the correct working directory first):
mv -- "$(ls -tp | grep -v '/$' | head -n1)" subdirectory/ && rm *
&& means do what follows only if what precedes succeeds. Adding ./ to the beginning of the destination does nothing, though keeping the / at the end prevents creating a new file named "subdirectory" if that directory does not already exist. Also, I'm pretty sure the "<>" in the code snippet you sent is to mark it as being different from your actual code, but just in case: note that the subdirectory, whatever it's called, may need special handling if it does actually contain those characters.
Edit:
I just noticed in the question the constraint "get a filtered list of files in a folder that are more than 30 days old". So, a slight change: (use find to compare the time)
mv -- "$(find -maxdepth 1 -type f -mtime +30 -printf '%T@ %f\n' | sort -rn | head -n1 | cut -d\  -f2-)" subdirectory/ && find -maxdepth 1 -type f -mtime +30 -delete
Explanation: find in the current directory (not subdirs, so maxdepth 1) files (not directories, type f) that have an mtime more than 30 days in the past (-mtime +30) and print the time of the modification and the name (%T@ %f); sort as if it were a number (-n) in reverse order (-r); take only the first (head -n1); extract the filename (second+ space-delimited field) and move it to subdirectory. If successful, delete anything that fits the same find criteria as before.
I would not put the files in an environment variable unless the disk is /very/ slow and uncached. The time spent filtering out the filename you moved probably takes more effort than requerying the disk, unless you have an insane number of files, in which case they might not fit in the environment section.
Edit 2: KamilCuk is right. Use null-terminated records, as null is the only character not allowed in filenames:
find -maxdepth 1 -type f -mtime +30 -printf '%T@ %f\0' | sort -z -t' ' -r -n -s -k1 | head -z -n1 | cut -z -d' ' -f2- | xargs -0 -I{} mv {} subdirectory/ && find -maxdepth 1 -type f -mtime +30 -delete
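For a scheduled job it may be easier to read the same idea as a function. The following is only a sketch under assumptions: it needs bash (arrays, process substitution) and a find that supports -maxdepth and -print0, and the name archive_newest_old is invented here. It picks the newest of the old files with the shell's -nt test, moves it, and deletes the remaining old files only if the move succeeded.

```shell
# Hypothetical helper: among regular files directly inside DIR that are more
# than DAYS days old, move the newest into DIR/SUBDIR and delete the rest.
# Usage: archive_newest_old DIR SUBDIR DAYS
archive_newest_old() {
    local dir=$1 sub=$2 days=$3 newest='' f
    local -a old=()
    # Collect the old files NUL-delimited so any file name is handled.
    while IFS= read -r -d '' f; do
        old+=("$f")
    done < <(find "$dir" -maxdepth 1 -type f -mtime "+$days" -print0)
    [ "${#old[@]}" -gt 0 ] || return 0          # nothing old enough: do nothing
    for f in "${old[@]}"; do
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then newest=$f; fi
    done
    mv -- "$newest" "$dir/$sub/" || return 1    # stop before deleting anything
    for f in "${old[@]}"; do
        [ "$f" = "$newest" ] || rm -f -- "$f"
    done
}

# For example, from cron: archive_newest_old /path/to/folder subdirectory 30
```

The trailing slash on the destination makes mv fail if the subdirectory is missing, so nothing is deleted in that case.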

remove files from subfolders without the last three

I have a structure like that:
/usr/local/a/1.txt
/usr/local/a/2.txt
/usr/local/a/3.txt
/usr/local/b/4.txt
/usr/local/b/3.txt
/usr/local/c/1.txt
/usr/local/c/7.txt
/usr/local/c/6.txt
/usr/local/c/12.txt
...
I want to delete all the files *.txt in subfolders except the last three files with the greatest modification date, but here I am in current directory
ls -tr *.txt | head -n-3 |xargs rm -f
I need to combine that with the code:
find /usr/local/**/* -type f
Should I use the maxdepth option?
Thanks for helping,
aola
Added the -maxdepth option to find to limit it to one level, ls -t to sort the files newest first, tail -n +4 to skip the three most recently modified files, and xargs with -r to remove the files only if any are found.
for folder in $(find /usr/local/ -type d)
do
find "$folder" -maxdepth 1 -type f -name "*.txt" | xargs -r ls -1t | tail -n +4 | xargs -r rm -f
done
Run the above command once without rm to ensure that the previous commands pick the proper files for deletion.
You've almost got the solution: use find to get the files, ls to sort them by modification date, and tail to omit the three most recently modified ones:
find /usr/lib -type f | xargs ls -t | tail -n +4 | xargs rm
If you would like to remove only the files at a specified depth add -mindepth 4 -maxdepth 4 to find parameters.
You can use find's -printf option to print the modification time in front of the file name, then sort and strip the timestamp off. This avoids using ls at all.
find /usr/local -type f -name '*.txt' -printf '%T@|%p\n' | sort -n | cut -d '|' -f 2 | head -n-3 | xargs rm -f
The other answers using xargs ls -t can lead to incorrect results when there are more results than xargs can put in a single ls -t command.
but for each subfolder, so when I have
/usr/local/a/1.txt
/usr/local/a/2.txt
/usr/local/a/3.txt
/usr/local/a/4.txt
/usr/local/b/4.txt
/usr/local/b/3.txt
/usr/local/c/1.txt
/usr/local/c/7.txt
/usr/local/c/6.txt
/usr/local/c/12.txt
I want to use the code for each subfolder separately
head -n-3 |xargs rm -f
so I bet if I have it sorted by date then the files to delete:
/usr/local/a/4.txt
/usr/local/c/12.txt
I want to leave in any subfolder three newest files
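One way to get the per-subfolder behaviour described above is to visit each directory in turn and apply the sort-and-trim step inside it. This is a sketch rather than a tested recipe for your system: it assumes bash and GNU find, sort, tail, cut, and xargs, and prune_each_dir is a made-up name.

```shell
# Hypothetical helper: in every directory at or below ROOT, delete all *.txt
# files except the KEEP most recently modified ones in that directory.
# Usage: prune_each_dir ROOT KEEP
prune_each_dir() {
    local root=$1 keep=$2 dir
    while IFS= read -r -d '' dir; do
        # -maxdepth 1 keeps each pass limited to the files of one directory.
        find "$dir" -maxdepth 1 -type f -name '*.txt' -printf '%T@\t%p\0' |
            sort -z -rn |                    # newest first
            tail -z -n "+$((keep + 1))" |    # skip the KEEP newest
            cut -z -f2- |                    # strip the timestamp
            xargs -0 -r rm -f --
    done < <(find "$root" -type d -print0)
}

# For the layout in the question: prune_each_dir /usr/local 3
```

Because every directory is handled on its own, a folder with three or fewer .txt files is left untouched.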

Get the newest directory to a variable in Bash

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d only lists items from this folder
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
Well, I think this solution is the most efficient:
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (that does not have -printf) you may, if you have it, run stat to get file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat a pure shell solution may be used by replacing [[ bash extension with [ as in this answer.
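As a sketch of what that could look like: the loop below uses only the [ builtin. The -nt test is supported by bash, dash, and BusyBox sh, but older POSIX editions did not require it, so treat portability as an assumption. The function name newest_subdir is hypothetical.

```shell
# Hypothetical helper: print the most recently modified subdirectory of DIR.
# Symlinks are skipped, and so are hidden directories, because the * glob
# does not match names starting with a dot.
# Note: plain sh has no "local", so newest and entry are global variables.
# Usage: newest_subdir DIR
newest_subdir() {
    newest=
    for entry in "$1"/*; do
        [ -d "$entry" ] && [ ! -L "$entry" ] || continue
        if [ -z "$newest" ] || [ "$entry" -nt "$newest" ]; then
            newest=$entry
        fi
    done
    # Returns non-zero and prints nothing if no subdirectory was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# For example: BACKUPDIR=$(newest_subdir /backups)
```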
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from GitBash within Windows environment in file called ./something.bash.

unix command to find most recent directory created

I want to copy the files from the most recent directory created. How would I do so in unix?
For example, if I have the directories names as date stamp as such:
/20110311
/20110318
/20110325
This is the answer to the question I think you are asking.
When I deal with many directories that have date/time stamps in the name, I always take the approach that you have, which is YYYYMMDD. The great thing about that is that the date order is then also the alphabetical order. In most shells (certainly in bash, and I am 90% sure of the others), the '*' expansion is done alphabetically, and by default 'ls' returns alphabetical order. Hence
ls | head -1
ls | tail -1
Give you the earliest and the latest dates in the directory.
This can be extended to only keep the last 5 entries etc.
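A possible shape for that extension, as a sketch: it assumes bash (arrays) and YYYYMMDD-style names so that glob order equals date order. The function name all_but_last is made up, and it only prints the directories that fall outside the last few, so the list can be reviewed before anything is removed.

```shell
# Hypothetical helper: print every subdirectory of the current directory
# except the last KEEP in glob (alphabetical) order. With YYYYMMDD names,
# that is everything but the KEEP most recent dates.
# Usage: all_but_last KEEP
all_but_last() {
    local keep=$1 i
    local -a dirs=(*/)                 # each entry keeps its trailing slash
    [ -d "${dirs[0]}" ] || return 0    # the glob matched nothing
    for ((i = 0; i < ${#dirs[@]} - keep; i++)); do
        printf '%s\n' "${dirs[i]}"
    done
}

# For example, run inside the parent directory: all_but_last 5
```

If there are KEEP or fewer directories, nothing is printed.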
lastdir=`ls -tr <parentdir> | tail -1`
I don't know how to make the backticks play nice with the commenting system here. Just replace those apostrophes with backticks.
After some experimenting, I came up with the following:
The unix stat command is useful here. The '-t' option causes stat to print its output in terse mode (all in one line), and the 13th element of that terse output is the unix timestamp (seconds since epoch) for the last-modified time. This command will list all directories (and sub-directories) in order from newest-modified to oldest-modified:
find -type d -exec stat -t {} \; | sort -r -n -k 13,13
Hopefully the "terse" mode of stat will remain consistent in future releases of stat !
Here's some explanation of the command-line options used:
find -type d # only find directories
find -exec [command] {} \; # execute given command against each *found* file.
sort -r # reverse the sort
sort -n # numeric sort (100 should not appear before 2!)
sort -k M,N # only sort the line using elements M through N.
Returning to your original request, to copy files, maybe try the following. To output just a single directory (the most recent), append this to the command (notice the initial pipe), and feed it all into your 'cp' command with backticks.
| head --lines=1 | sed 's/\ .*$//'
The trouble with the ls based solutions is that they are not filtering just for directories. I think this:
cp `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/* /target-directory/.
might do the trick, though note that that will only copy files in the immediate directory. If you want a more general answer for copying anything below your newest directory over to a new directory I think you would be better off using rsync like:
rsync -av `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/ /target-directory/
but it depends a bit which behaviour you want. The explanation of the stuff in the backticks is:
. - the current directory (you may want to specify an absolute path here)
-mindepth/-maxdepth - restrict the find command only to the immediate children of the current directory
-type d - only directories
-exec stat .. - outputs the modified time and the name of the directory from find
sort -n -r |head -1 | awk '{print $2}' - date orders the directory and outputs the name of the most recently modified
If your directories are named YYYYMMDD like your question suggests, take advantage of the alphabetic globbing.
Put all directories in an array, and then pick the first one:
dirs=(*/); first_dir="$dirs";
(This is actually a shortcut for first_dir="${dirs[0]}";.)
Similarly, for the last one:
dirs=(*/); last_dir="${dirs[$((${#dirs[@]} - 1))]}";
Ugly syntax, but this is what it breaks down to:
# Create an array of all directories inside the working directory.
dirs=(*/);
# Get the number of entries in the array.
num_dirs=${#dirs[@]};
# Calculate the index of the last entry.
last_index=$(($num_dirs - 1));
# Get the value at the last index.
last_dir="${dirs[$last_index]}";
I know this is an old question with an accepted answer, but I think this method is preferable as it does everything in Bash. No reason to spawn extra processes, let alone parse the output of ls. (Which, admittedly, should be fine in this particular case of YYYYMMDD names.)
Please try the following command:
ls -1tr | tail -1
find ~ -type d | ls -ltra
This one is simple and useful which I learned recently.
This command will show the results in reverse chronological order.
I wrote a command that can be used to identify the newest file or folder created in a folder. That seems pure :)
#!/bin/sh
path=/var/folder_name
newest=`find $path -maxdepth 1 -exec stat -t {} \; |sed 1d |sort -r -k 14 | head -1 |awk {'print $1'} | sed 's/\.\///g'`
find $path -maxdepth 1| sed 1d |grep -v $newest

How to echo directories containing matching file with Bash?

I want to write a bash script which will use a list of all the directories containing specific files. I can use find to echo the path of each and every matching file. I only want to list the path to the directory containing at least one matching file.
For example, given the following directory structure:
dir1/
matches1
matches2
dir2/
no-match
The command (looking for 'matches*') will only output the path to dir1.
As extra background, I'm using this to find each directory which contains a Java .class file.
find . -name '*.class' -printf '%h\n' | sort -u
From man find:
-printf format
%h Leading directories of file’s name (all but the last element). If the file name contains no slashes (since it is in the current directory) the %h specifier expands to ".".
On OS X and FreeBSD, with a find that lacks the -printf option, this will work:
find . -name '*.class' -print0 | xargs -0 -n1 dirname | sort --unique
The -n1 in xargs sets to 1 the maximum number of arguments taken from standard input for each invocation of dirname
GNU find
find /root_path -type f -iname "*.class" -printf "%h\n" | sort -u
OK, I'm coming way too late, but you could also do it without find, to answer specifically "matching file with Bash" (or at least a POSIX shell).
ls */*.class | while read -r; do
echo "${REPLY%/*}"
done | sort -u
The ${VARNAME%/*} will strip everything after the last / (if you wanted to strip everything after the first, it would have been ${VARNAME%%/*}).
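To make the difference between the two operators concrete, here is a tiny example. The path is invented for illustration.

```shell
# Made-up example path.
path=src/com/example/Foo.class

# % removes the shortest suffix matching /*, i.e. the last path component.
echo "${path%/*}"     # prints src/com/example

# %% removes the longest suffix matching /*, i.e. everything from the
# first slash onward.
echo "${path%%/*}"    # prints src
```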
Regards.
find / -name '*.class' -printf '%h\n' | sort --unique
Far too late, but this might be helpful to future readers:
I personally find it more helpful to have the list of folders printed into a file, rather than to Terminal (on a Mac).
For that, you can simply output the paths to a file, e.g. folders.txt, by using:
find . -name '*.sql' -print0 | xargs -0 -n1 dirname | sort --unique > folders.txt
How about this?
find dirs/ -name '*.class' -exec dirname '{}' \; | awk '!seen[$0]++'
For the awk command, see #43 on this list
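In case the linked list is unavailable, here is a short illustration of what that awk expression does. The wrapper name unique_in_order and the sample directory names are made up.

```shell
# seen[$0]++ evaluates to 0 (false) the first time a line appears and to a
# positive count afterwards, so !seen[$0]++ is true only for the first
# occurrence of each line. Unlike sort -u, input order is preserved.
# Hypothetical wrapper name, for illustration only.
unique_in_order() {
    awk '!seen[$0]++'
}

# Sample input with made-up directory names:
printf '%s\n' dirs/b dirs/a dirs/b dirs/a | unique_in_order
# prints:
# dirs/b
# dirs/a
```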
