How to recursively find files in shell

I am trying the following:
find /dir1/dir2/dir3 -name '*.txt' -type f
What I want to do is search for files recursively under the dir3 folder. That is, dir3 contains the folders dir4 and dir5, and files matching the *.txt extension should be returned from the dir4 and dir5 directories as well. Also, how can I get only the files created on today's date?

Assuming you use bash, I came up with this:
#!/bin/bash
# This depends on your system's locale, I think.
# For me, `date` returns `So 25. Jul 12:32:27 CEST 2021`.
# Therefore I want $2 for DAY and $3 for MONTH; yours might be different.
DAY="$(date | awk '{print $2}')"
MONTH="$(date | awk '{print $3}')"
FINDIN="/dir1/dir2/dir3"
# This prints only `.txt` files which were created today along with all their details.
# If you only want the path, you could pipe it into
# `awk '{print $11}'`
find "$FINDIN" -type f -ls | grep -E "*.txt$" | grep "$MONTH $DAY"
Using bashisms you could technically make this a (lengthy) one-liner, but if you put it in a script, you can substitute your path ($FINDIN) with a dynamic value passed as an argument ($1, $2, …) or the caller's directory (implied/parsed from $0).
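Alternatively, if your find is GNU find, a more locale-independent sketch lets find do the date test itself (-daystart is a GNU extension):
# Sketch assuming GNU find: -daystart anchors time tests to the start of
# today, so -mtime 0 then matches files modified since midnight.
find /dir1/dir2/dir3 -type f -name '*.txt' -daystart -mtime 0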

Related

creating a list of files with file absolute path in linux

I have a large number of files (~50000 files).
ls /home/abc/def/
file1.txt
file2.txt
file3.txt
.........
.........
file50000.txt
I want to create a CSV file with two columns: first column provides the filename and the second provides the absolute file path as:
output.csv
file1.txt,/home/abc/def/file1.txt
file2.txt,/home/abc/def/file2.txt
file3.txt,/home/abc/def/file3.txt
.........................
.........................
file50000.txt,/home/abc/def/file50000.txt
How can I do this with bash commands? I tried with ls and find as
find /home/abc/def/ -type f -exec ls -ld {} \; | awk '{ print $5, $9 }' > output.csv
but this gives me only the absolute paths. How do I get the output as shown in output.csv above?
You can get both just the filename and the full path with GNU find's -printf option:
find /home/abc/def -type f -printf "%f,%p\n"
Pipe through sort if you want sorted results.
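For example, to produce the question's output.csv sorted by filename in one go (a sketch):
# Sort on the first comma-separated field (the bare filename).
find /home/abc/def -type f -printf "%f,%p\n" | sort -t, -k1,1 > output.csv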
How about:
$ find /path/ | awk -F/ -v OFS=, '{print $NF,$0}'
Add proper switches to find where needed.
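For instance, with the question's path and restricted to regular files (a sketch):
find /home/abc/def/ -type f | awk -F/ -v OFS=, '{print $NF,$0}' > output.csv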
If you want to fully canonicalize all existing paths, including collapsing duplicate / and resolving symlinks to their physical targets, why not just
find … -print0 |
or
gls --zero |
or
mawk 8 ORS='\0' filelist.txt |
xargs -0 -P 8 grealpath -ePq
In plain bash:
for file in /home/abc/def/*.txt; do printf '%s,%s\n' "${file##*/}" "$file"; done
or,
dir=/home/abc/def
cd "$dir" && for file in *.txt; do printf '%s,%s\n' "$file" "$dir/$file"; done

Deleting oldest files from a subdirectory in a directory

I have an archive folder containing some subfolders (A, B, C) with archived files. How do I find and delete the oldest created files from a subfolder of my choosing (say B)?
This command does exactly what you described:
find . -mindepth 2 -type f -printf '%T+ %p\n' | sort | awk 'NR==1{print $2}' | xargs rm -v
Brief explanation,
find . -mindepth 2 -type f -printf '%T+ %p\n': limits the minimum depth to 2, which means find only shows files under the subdirectories or deeper. It then prints each file's last modification time and its name.
Piping the output of find ... into sort sorts all found files by modification time.
awk 'NR==1{print $2}': piping into awk extracts the name of the oldest file.
xargs rm -v: removes the oldest file.
Edit
In response to the further request to pass the sub-directory names as variables, here's the modified method. You only need to modify the awk part:
$ a="sub_dir1"
$ b="sub_dir2"
$ find ... | sort | awk -v a=$a -v b=$b '$2 ~ "./" a "/" || $2 ~ "./" b "/"{print $2 ;exit}' | xargs ...
If you are trying to delete the oldest modified file (not created), then you can use this:
rm "$(ls -t | tail -1)"

Listing only directories using ls in Bash? [closed]

This command lists directories in the current path:
ls -d */
What exactly does the pattern */ do?
And how can we give the absolute path in the above command (e.g. ls -d /home/alice/Documents) for listing only directories in that path?
*/ is a pattern that matches all of the subdirectories in the current directory (* would match all files and subdirectories; the / restricts it to directories). Similarly, to list all subdirectories under /home/alice/Documents, use ls -d /home/alice/Documents/*/
Four ways to get this done, each with a different output format
1. Using echo
Example: echo */, echo */*/
Here is what I got:
cs/ draft/ files/ hacks/ masters/ static/
cs/code/ files/images/ static/images/ static/stylesheets/
2. Using ls only
Example: ls -d */
Here is exactly what I got:
cs/ files/ masters/
draft/ hacks/ static/
Or as list (with detail info): ls -dl */
3. Using ls and grep
Example: ls -l | grep "^d"
Here is what I got:
drwxr-xr-x 24 h staff 816 Jun 8 10:55 cs
drwxr-xr-x 6 h staff 204 Jun 8 10:55 draft
drwxr-xr-x 9 h staff 306 Jun 8 10:55 files
drwxr-xr-x 2 h staff 68 Jun 9 13:19 hacks
drwxr-xr-x 6 h staff 204 Jun 8 10:55 masters
drwxr-xr-x 4 h staff 136 Jun 8 10:55 static
4. Bash Script (Not recommended for filenames containing spaces)
Example: for i in $(ls -d */); do echo ${i%%/}; done
Here is what I got:
cs
draft
files
hacks
masters
static
If you'd like to keep the trailing '/', the command is: for i in $(ls -d */); do echo ${i}; done
cs/
draft/
files/
hacks/
masters/
static/
I use:
ls -d */ | cut -f1 -d'/'
This creates a single column without a trailing slash - useful in scripts.
For only the immediate subdirectories (no recursion):
find /home/alice/Documents -maxdepth 1 -type d
For all directories, recursively:
find /home/alice/Documents -type d
Four (more) Reliable Options.
An unquoted asterisk * will be interpreted as a pattern (glob) by the shell. The shell will use it in pathname expansion. It will then generate a list of filenames that match the pattern.
A simple asterisk will match all filenames in the PWD (present working directory). A more complex pattern such as */ will match all filenames that end in / - thus, all directories. That is why the command:
1.- echo.
echo */
echo ./*/ ### Avoid misinterpreting filenames like "-e dir"
will be expanded (by the shell) to echo all directories in the PWD.
To test this: Create a directory (mkdir) named like test-dir, and cd into it:
mkdir test-dir; cd test-dir
Create some directories:
mkdir {cs,files,masters,draft,static} # Safe directories.
mkdir {*,-,--,-v\ var,-h,-n,dir\ with\ spaces} # Some a bit less secure.
touch -- 'file with spaces' '-a' '-l' 'filename' # And some files:
The command echo ./*/ will remain reliable even with odd named files:
./--/ ./-/ ./*/ ./cs/ ./dir with spaces/ ./draft/ ./files/ ./-h/
./masters/ ./-n/ ./static/ ./-v var/
But the spaces in filenames make reading a bit confusing.
If instead of echo we use ls, the shell is still what expands the list of filenames; the shell is the reason we get a list of directories in the PWD. The -d option to ls makes it list the directory entry itself instead of the contents of each directory (as presented by default).
ls -d */
However, this command is (somewhat) less reliable. It will fail with the odd-named files listed above, choking on several of them; you would need to remove them one by one until you find the problematic ones.
2.- ls
The GNU ls will accept the "end of options" (--) key.
ls -d ./*/ ### More reliable BSD ls
ls -d -- */ ### More reliable GNU ls
3.-printf
To list each directory in its own line (in one column, similar to ls -1), use:
$ printf "%s\n" */ ### Correct even with "-", spaces or newlines.
And, even better, we could remove the trailing /:
$ set -- */; printf "%s\n" "${@%/}" ### Correct with spaces and newlines.
An attempt like
$ for i in $(ls -d */); do echo ${i%%/}; done
will fail because it:
breaks on some names (ls -d */), as already shown above.
is affected by the value of IFS.
splits names on spaces and tabs (with the default IFS).
starts a new echo command at each newline in a name.
4.- Function
Finally, using the argument list inside a function will not affect the arguments list of the present running shell. Simply
$ listdirs(){ set -- */; printf "%s\n" "${@%/}"; }
$ listdirs
presents this list:
--
-
*
cs
dir with spaces
draft
files
-h
masters
-n
static
-v var
These options are safe with several types of odd filenames.
The tree command is also pretty useful here. By default it will show all files and directories to a complete depth, with some ASCII characters showing the directory tree.
$ tree
.
├── config.dat
├── data
│   ├── data1.bin
│   ├── data2.inf
│   └── sql
│       └── data3.sql
├── images
│   ├── background.jpg
│   ├── icon.gif
│   └── logo.jpg
├── program.exe
└── readme.txt
But if we wanted to get just the directories, without the ASCII tree, and with the full path from the current directory, you could do:
$ tree -dfi
.
./data
./data/sql
./images
The arguments being:
-d List directories only.
-f Prints the full path prefix for each file.
-i Makes tree not print the indentation lines, useful when used in conjunction with the -f option.
And if you then want the absolute path, you could start by specifying the full path to the current directory:
$ tree -dfi "$(pwd)"
/home/alice/Documents
/home/alice/Documents/data
/home/alice/Documents/data/sql
/home/alice/Documents/images
And to limit the number of subdirectories, you can set the max level of subdirectories with -L level, e.g.:
$ tree -dfi -L 1 "$(pwd)"
/home/alice/Documents
/home/alice/Documents/data
/home/alice/Documents/images
More arguments can be seen with man tree.
In case you're wondering why output from 'ls -d */' gives you two trailing slashes, like:
[prompt]$ ls -d */
app// cgi-bin// lib// pub//
it's probably because somewhere your shell or session configuration files alias the ls command to a version of ls that includes the -F flag. That flag appends a character to each output name (that's not a plain file) indicating the kind of thing it is. So one slash is from matching the pattern '*/', and the other slash is the appended type indicator.
To get rid of this issue, you could of course define a different alias for ls. However, to temporarily not invoke the alias, you can prepend the command with backslash:
\ls -d */
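To confirm an alias is indeed the culprit, a quick sketch using standard shell builtins:
type ls             # shows whether ls is aliased, e.g. to 'ls -F'
command ls -d */    # bypasses alias expansion for a single invocation, like \ls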
Actual ls solution, including symlinks to directories
Many answers here don't actually use ls (or use it only in the trivial sense of ls -d, while using wildcards for the actual subdirectory matching). A true ls solution is useful, since it allows the use of ls options for sorting order, etc.
Excluding symlinks
One solution using ls has been given, but it does something different from the other solutions in that it excludes symlinks to directories:
ls -l | grep '^d'
(possibly piping through sed or awk to isolate the file names)
Including symlinks
In the (probably more common) case that symlinks to directories should be included, we can use the -p option of ls, which makes it append a slash character to names of directories (including symlinked ones):
ls -1p | grep '/$'
or, getting rid of the trailing slashes:
ls -1p | grep '/$' | sed 's/\/$//'
We can add options to ls as needed (if a long listing is used, the -1 is no longer required).
Note: if we want trailing slashes, but don't want them highlighted by grep, we can hackishly remove the highlighting by making the actual matched portion of the line empty:
ls -1p | grep -P '(?=/$)'
A plain list of the current directory, it'd be:
ls -1d */
If you want it sorted and clean:
ls -1d */ | cut -c 1- | rev | cut -c 2- | rev | sort
Remember: uppercase and lowercase characters behave differently in the sort.
I just add this to my .bashrc file (you could also just type it on the command line if you only need/want it for one session):
alias lsd='ls -ld */'
Then lsd will produce the desired result.
Here is what I am using
ls -d1 /Directory/Path/*;
If you don't need hidden directories listed, I offer:
ls -l | grep "^d" | awk -F" " '{print $9}'
And if hidden directories should be listed as well, use:
ls -Al | grep "^d" | awk -F" " '{print $9}'
Or
find -maxdepth 1 -type d | awk -F"./" '{print $2}'
For listing only directories:
ls -l | grep ^d
For listing only files:
ls -l | grep -v ^d
Or also you can do as:
ls -ld */
Try this one. It works on all Linux distributions.
ls -ltr | grep drw
ls and awk (without grep)
No need to use grep, since awk can perform regular-expression checks itself, so it is enough to do this:
ls -l | awk '/^d/ {print $9}'
where ls -l lists files with permissions
awk filters the output
'/^d/' is a regular expression that matches only lines starting with the letter d (as in directory), looking at the first column (the permissions)
{print} would print all columns
{print $9} prints only the 9th column (the name) from the ls -l output
Very simple and clean
To show folder lists without /:
ls -d */|sed 's|[/]||g'
I found this solution the most comfortable, I add to the list:
find * -maxdepth 0 -type d
The difference is that it has no ./ at the beginning, and the folder names are ready to use.
Test whether the item is a directory with test -d:
for i in $(ls); do test -d "$i" && echo "$i" ; done
FYI, if you want to print all the files one per line, you can do ls -1, which prints each file on a separate line.
file1
file2
file3
*/ is a filename matching pattern that matches directories in the current directory.
To list directories only, I like this function:
# Long list only directories
llod () {
ls -l --color=always "$@" | grep --color=never '^d'
}
Put it in your .bashrc file.
Usage examples:
llod # Long listing of all directories in current directory
llod -tr # Same but in chronological order oldest first
llod -d a* # Limit to directories beginning with letter 'a'
llod -d .* # Limit to hidden directories
Note: it will break if you use the -i option. Here is a fix for that:
# Long list only directories
llod () {
ls -l --color=always "$@" | egrep --color=never '^d|^[[:digit:]]+ d'
}
file * | grep directory
Output (on my machine) --
[root@rhel6 ~]# file * | grep directory
mongo-example-master: directory
nostarch: directory
scriptzz: directory
splunk: directory
testdir: directory
The above output can be refined more by using cut:
file * | grep directory | cut -d':' -f1
mongo-example-master
nostarch
scriptzz
splunk
testdir
* could be replaced with any path that's permitted
file - determine file type
grep - searches for the string 'directory'
-d - to specify a field delimiter
-f1 - denotes field 1
One-liner to list directories only from "here", with a file count.
for i in `ls -d */`; do g=`find ./$i -type f -print| wc -l`; echo "Directory $i contains $g files."; done
Using Perl:
ls | perl -nle 'print if -d;'
I partially solved it with:
cd "/path/to/pricipal/folder"
for i in $(ls -d .*/); do sudo ln -s "$PWD"/${i%%/} /home/inukaze/${i%%/}; done
 
ln: «/home/inukaze/./.»: cannot overwrite a directory
ln: «/home/inukaze/../..»: cannot overwrite a directory
ln: accessing «/home/inukaze/.config»: too many levels of symbolic links
ln: accessing «/home/inukaze/.disruptive»: too many levels of symbolic links
ln: accessing «/home/inukaze/innovations»: too many levels of symbolic links
ln: accessing «/home/inukaze/sarl»: too many levels of symbolic links
ln: accessing «/home/inukaze/.e_old»: too many levels of symbolic links
ln: accessing «/home/inukaze/.gnome2_private»: too many levels of symbolic links
ln: accessing «/home/inukaze/.gvfs»: too many levels of symbolic links
ln: accessing «/home/inukaze/.kde»: too many levels of symbolic links
ln: accessing «/home/inukaze/.local»: too many levels of symbolic links
ln: accessing «/home/inukaze/.xVideoServiceThief»: too many levels of symbolic links
Well, this solved the major part for me :)
Here is a variation using tree which outputs directory names only, on separate lines. Yes, it's ugly, but hey, it works.
tree -d | grep -E '^[├|└]' | cut -d ' ' -f2
or with awk
tree -d | grep -E '^[├|└]' | awk '{print $2}'
This is probably better however and will retain the / after directory name.
ls -l | awk '/^d/{print $9}'
If you have spaces in your folder names, printing $9 alone won't work; try the command below:
ls -l yourfolder/alldata/ | grep '^d' | awk '{print $9" " $10}'
output
ls -l yourfolder/alldata/ | grep '^d' | awk '{print $9" " $10}'
testing_Data
Folder 1
To answer the original question, */ has nothing to do with ls per se; it is done by the shell/Bash, in a process known as globbing.
This is why echo */ and ls -d */ output the same elements. (The -d flag makes ls output the directory names and not contents of the directories.)
Adding on to make it full circle: to retrieve the path of every folder, use a combination of Albert's answer and Gordon's. That should be pretty useful.
for i in $(ls -d /pathto/parent/folder/*/); do echo ${i%%/}; done
Output:
/pathto/parent/folder/childfolder1/
/pathto/parent/folder/childfolder2/
/pathto/parent/folder/childfolder3/
/pathto/parent/folder/childfolder4/
/pathto/parent/folder/childfolder5/
/pathto/parent/folder/childfolder6/
/pathto/parent/folder/childfolder7/
/pathto/parent/folder/childfolder8/
Here is what I use for listing only directory names:
ls -1d /some/folder/*/ | awk -F "/" "{print \$(NF-1)}"

unix command to find most recent directory created

I want to copy the files from the most recent directory created. How would I do so in unix?
For example, if I have the directories names as date stamp as such:
/20110311
/20110318
/20110325
This is the answer to the question I think you are asking.
When I deal with many directories that have date/time stamps in the name, I always take the approach that you have, YYYYMMDD - the great thing about that is that date order is then also alphabetical order. In most shells (certainly in bash, and I am 90% sure of the others), the '*' expansion is done alphabetically, and by default ls returns alphabetical order. Hence
ls | head -1
ls | tail -1
Give you the earliest and the latest dates in the directory.
This can be extended to only keep the last 5 entries etc.
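For example, a sketch relying on the YYYYMMDD names sorting chronologically (GNU head assumed for the negative line count):
ls | tail -n 5      # the 5 most recent YYYYMMDD directories
ls | head -n -5     # everything except the 5 most recent (GNU head)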
lastdir=`ls -tr <parentdir> | tail -1`
After some experimenting, I came up with the following:
The unix stat command is useful here. The '-t' option causes stat to print its output in terse mode (all in one line), and the 13th element of that terse output is the unix timestamp (seconds since epoch) for the last-modified time. This command will list all directories (and sub-directories) in order from newest-modified to oldest-modified:
find -type d -exec stat -t {} \; | sort -r -n -k 13,13
Hopefully the "terse" mode of stat will remain consistent in future releases of stat!
Here's some explanation of the command-line options used:
find -type d # only find directories
find -exec [command] {} \; # execute given command against each *found* file.
sort -r # reverse the sort
sort -n # numeric sort (100 should not appear before 2!)
sort -k M,N # only sort the line using elements M through N.
Returning to your original request, to copy files, maybe try the following. To output just a single directory (the most recent), append this to the command (notice the initial pipe), and feed it all into your 'cp' command with backticks.
| head --lines=1 | sed 's/\ .*$//'
The trouble with the ls based solutions is that they are not filtering just for directories. I think this:
cp `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/* /target-directory/.
might do the trick, though note that that will only copy files in the immediate directory. If you want a more general answer for copying anything below your newest directory over to a new directory I think you would be better off using rsync like:
rsync -av `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/ /target-directory/
but it depends a bit which behaviour you want. The explanation of the stuff in the backticks is:
. - the current directory (you may want to specify an absolute path here)
-mindepth/-maxdepth - restrict the find command only to the immediate children of the current directory
-type d - only directories
-exec stat .. - outputs the modified time and the name of the directory from find
sort -n -r |head -1 | awk '{print $2}' - date orders the directory and outputs the name of the most recently modified
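Broken into steps, a sketch of the same idea (GNU stat assumed; /target-directory/ is a placeholder, and it shares the original's limitation with spaces in directory names):
# Capture the most recently modified immediate subdirectory, then copy.
newest=$(find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n -r | head -1 | awk '{print $2}')
cp "$newest"/* /target-directory/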
If your directories are named YYYYMMDD like your question suggests, take advantage of the alphabetic globbing.
Put all directories in an array, and then pick the first one:
dirs=(*/); first_dir="$dirs";
(This is actually a shortcut for first_dir="${dirs[0]}";.)
Similarly, for the last one:
dirs=(*/); last_dir="${dirs[$((${#dirs[@]} - 1))]}";
Ugly syntax, but this is what it breaks down to:
# Create an array of all directories inside the working directory.
dirs=(*/);
# Get the number of entries in the array.
num_dirs=${#dirs[@]};
# Calculate the index of the last entry.
last_index=$(($num_dirs - 1));
# Get the value at the last index.
last_dir="${dirs[$last_index]}";
I know this is an old question with an accepted answer, but I think this method is preferable as it does everything in Bash. No reason to spawn extra processes, let alone parse the output of ls. (Which, admittedly, should be fine in this particular case of YYYYMMDD names.)
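To finish the asker's task with this approach, a usage sketch (/target-directory/ is a placeholder destination):
# last_dir ends in '/', so "$last_dir"* globs the files inside it.
dirs=(*/); last_dir="${dirs[$((${#dirs[@]} - 1))]}";
cp "$last_dir"* /target-directory/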
Please try the following command:
ls -1tr | tail -1
find ~ -type d | ls -ltra
This one is simple and useful, and I learned it recently.
The command above shows the results in reverse chronological order.
I wrote a command that can be used to identify which folder or file in a directory was created most recently. That seems clean :)
#!/bin/sh
path=/var/folder_name
newest=`find $path -maxdepth 1 -exec stat -t {} \; |sed 1d |sort -r -k 14 | head -1 |awk {'print $1'} | sed 's/\.\///g'`
find $path -maxdepth 1| sed 1d |grep -v $newest

Delete all but the most recent X files in bash

Is there a simple way, in a pretty standard UNIX environment with bash, to run a command to delete all but the most recent X files from a directory?
To give a bit more of a concrete example, imagine some cron job writing out a file (say, a log file or a tar-ed up backup) to a directory every hour. I'd like a way to have another cron job running which would remove the oldest files in that directory until there are less than, say, 5.
And just to be clear: if there's only one file present, it should never be deleted.
The problems with the existing answers:
inability to handle filenames with embedded spaces or newlines.
in the case of solutions that invoke rm directly on an unquoted command substitution (rm `...`), there's an added risk of unintended globbing.
inability to distinguish between files and directories (i.e., if directories happened to be among the 5 most recently modified filesystem items, you'd effectively retain fewer than 5 files, and applying rm to directories will fail).
wnoise's answer addresses these issues, but the solution is GNU-specific (and quite complex).
Here's a pragmatic, POSIX-compliant solution that comes with only one caveat: it cannot handle filenames with embedded newlines - but I don't consider that a real-world concern for most people.
For the record, here's the explanation for why it's generally not a good idea to parse ls output: http://mywiki.wooledge.org/ParsingLs
ls -tp | grep -v '/$' | tail -n +6 | xargs -I {} rm -- {}
Note: This command operates in the current directory; to target a directory explicitly, use a subshell ((...)) with cd:
(cd /path/to && ls -tp | grep -v '/$' | tail -n +6 | xargs -I {} rm -- {})
The same applies analogously to the commands below.
The above is inefficient, because xargs has to invoke rm separately for each filename.
However, your platform's specific xargs implementation may allow you to solve this problem:
A solution that works with GNU xargs is to use -d '\n', which makes xargs consider each input line a separate argument, yet passes as many arguments as will fit on a command line at once:
ls -tp | grep -v '/$' | tail -n +6 | xargs -d '\n' -r rm --
Note: Option -r (--no-run-if-empty) ensures that rm is not invoked if there's no input.
A solution that works with both GNU xargs and BSD xargs (including on macOS) - though technically still not POSIX-compliant - is to use -0 to handle NUL-separated input, after first translating newlines to NUL (0x0) chars., which also passes (typically) all filenames at once:
ls -tp | grep -v '/$' | tail -n +6 | tr '\n' '\0' | xargs -0 rm --
Explanation:
ls -tp prints the names of filesystem items sorted by how recently they were modified, in descending order (most recently modified items first) (-t), with directories printed with a trailing / to mark them as such (-p).
Note: It is the fact that ls -tp always outputs file / directory names only, not full paths, that necessitates the subshell approach mentioned above for targeting a directory other than the current one ((cd /path/to && ls -tp ...)).
grep -v '/$' then weeds out directories from the resulting listing, by omitting (-v) lines that have a trailing / (/$).
Caveat: Since a symlink that points to a directory is technically not itself a directory, such symlinks will not be excluded.
tail -n +6 skips the first 5 entries in the listing, in effect returning all but the 5 most recently modified files, if any.
Note that in order to exclude N files, N+1 must be passed to tail -n +.
xargs -I {} rm -- {} (and its variations) then invokes rm on all these files; if there are no matches at all, xargs won't do anything.
xargs -I {} rm -- {} defines placeholder {} that represents each input line as a whole, so rm is then invoked once for each input line, but with filenames with embedded spaces handled correctly.
-- in all cases ensures that any filenames that happen to start with - aren't mistaken for options by rm.
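Building on the N+1 rule above, a parameterized sketch (n is a hypothetical variable holding how many files to keep):
n=5
ls -tp | grep -v '/$' | tail -n +$((n + 1)) | xargs -I {} rm -- {}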
A variation on the original problem, in case the matching files need to be processed individually or collected in a shell array:
# One by one, in a shell loop (POSIX-compliant):
ls -tp | grep -v '/$' | tail -n +6 | while IFS= read -r f; do echo "$f"; done
# One by one, but using a Bash process substitution (<(...),
# so that the variables inside the `while` loop remain in scope:
while IFS= read -r f; do echo "$f"; done < <(ls -tp | grep -v '/$' | tail -n +6)
# Collecting the matches in a Bash *array*:
IFS=$'\n' read -d '' -ra files < <(ls -tp | grep -v '/$' | tail -n +6)
printf '%s\n' "${files[@]}" # print array elements
Remove all but 5 (or whatever number) of the most recent files in a directory.
rm `ls -t | awk 'NR>5'`
(ls -t|head -n 5;ls)|sort|uniq -u|xargs rm
This version supports names with spaces:
(ls -t|head -n 5;ls)|sort|uniq -u|sed -e 's,.*,"&",g'|xargs rm
Simpler variant of thelsdj's answer:
ls -tr | head -n -5 | xargs --no-run-if-empty rm
ls -tr displays all the files, oldest first (-t newest first, -r reverse).
head -n -5 displays all but the 5 last lines (i.e., the 5 newest files).
xargs rm calls rm for each selected file.
find . -maxdepth 1 -type f -printf '%T@ %p\0' | sort -r -z -n | awk 'BEGIN { RS="\0"; ORS="\0"; FS="" } NR > 5 { sub("^[0-9]*(.[0-9]*)? ", ""); print }' | xargs -0 rm -f
Requires GNU find for -printf, and GNU sort for -z, and GNU awk for "\0", and GNU xargs for -0, but handles files with embedded newlines or spaces.
All these answers fail when there are directories in the current directory. Here's something that works:
find . -maxdepth 1 -type f | xargs -x ls -t | awk 'NR>5' | xargs -L1 rm
This:
works when there are directories in the current directory
tries to remove each file even if the previous one couldn't be removed (due to permissions, etc.)
fails safe when the number of files in the current directory is excessive and xargs would normally screw you over (the -x)
doesn't cater for spaces in filenames (perhaps you're using the wrong OS?)
ls -tQ | tail -n+4 | xargs rm
List filenames by modification time, quoting each filename. Exclude first 3 (3 most recent). Remove remaining.
EDIT after helpful comment from mklement0 (thanks!): corrected -n+3 argument, and note this will not work as expected if filenames contain newlines and/or the directory contains subdirectories.
Ignoring newlines is ignoring security and good coding. wnoise had the only good answer. Here is a variation on his that puts the filenames in an array $x
while IFS= read -rd ''; do
x+=("${REPLY#* }");
done < <(find . -maxdepth 1 -printf '%T@ %p\0' | sort -r -z -n )
For Linux (GNU tools), an efficient & robust way to keep the n newest files in the current directory while removing the rest:
n=5
find . -maxdepth 1 -type f -printf '%T@ %p\0' |
sort -z -nrt ' ' -k1,1 |
sed -z -e "1,${n}d" -e 's/[^ ]* //' |
xargs -0r rm -f
For BSD, find doesn't have the -printf predicate, stat can't output NULL bytes, and sed + awk can't handle NULL-delimited records.
Here's a solution that doesn't support newlines in paths but that safeguards against them by filtering them out:
#!/bin/bash
n=5
find . -maxdepth 1 -type f ! -path $'*\n*' -exec stat -f '%.9Fm %N' {} + |
sort -nrt ' ' -k1,1 |
awk -v n="$n" -F'^[^ ]* ' 'NR > n {printf "%s%c", $2, 0}' |
xargs -0 rm -f
note: I'm using bash because of the $'\n' notation. For sh you can define a variable containing a literal newline and use it instead.
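For illustration, a sketch of that sh fallback (nl is a hypothetical variable holding a literal newline):
# POSIX sh: nl holds a literal newline; filter out paths containing one.
nl='
'
find . -maxdepth 1 -type f ! -path "*${nl}*" -exec stat -f '%.9Fm %N' {} +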
Solution for UNIX & Linux (inspired from AIX/HP-UX/SunOS/BSD/Linux ls -b):
Some platforms don't provide find -printf, nor stat, nor support NUL-delimited records with stat/sort/awk/sed/xargs. That's why using perl is probably the most portable way to tackle the problem, because it is available by default in almost every OS.
I could have written the whole thing in perl but I didn't. I only use it for substituting stat and for encoding-decoding-escaping the filenames. The core logic is the same as the previous solutions and is implemented with POSIX tools.
note: perl's default stat has a resolution of a second, but starting from perl-5.8.9 you can get sub-second resolution with the stat function of the module Time::HiRes (when both the OS and the filesystem support it). That's what I'm using here; if your perl doesn't provide it then you can remove the -MTime::HiRes=stat from the command line.
n=5
find . '(' -name '.' -o -prune ')' -type f -exec \
perl -MTime::HiRes=stat -le '
foreach (@ARGV) {
@st = stat($_);
if ( @st > 0 ) {
s/([\\\n])/sprintf( "\\%03o", ord($1) )/ge;
print sprintf( "%.9f %s", $st[9], $_ );
}
else { print STDERR "stat: $_: $!"; }
}
' {} + |
sort -nrt ' ' -k1,1 |
sed -e "1,${n}d" -e 's/[^ ]* //' |
perl -l -ne '
s/\\([0-7]{3})/chr(oct($1))/ge;
s/(["\n])/"\\$1"/g;
print "\"$_\"";
' |
xargs -E '' sh -c '[ "$#" -gt 0 ] && rm -f "$@"' sh
Explanations:
For each file found, the first perl gets the modification time and outputs it along with the encoded filename (each newline and backslash character is replaced with the literals \012 and \134 respectively).
Now each time-and-filename record is guaranteed to be single-line, so POSIX sort and sed can safely work with this stream.
The second perl decodes the filenames and escapes them for POSIX xargs.
Lastly, xargs calls rm for deleting the files. The sh command is a trick that prevents xargs from running rm when there's no files to delete.
I realize this is an old thread, but maybe someone will benefit from this. This command will find files in the current directory :
for F in $(find . -maxdepth 1 -type f -name "*_srv_logs_*.tar.gz" -printf '%T@ %p\n' | sort -r -n | tail -n+5 | awk '{ print $2; }'); do rm $F; done
This is a little more robust than some of the previous answers, as it allows you to limit your search domain to files matching expressions. First, find files matching whatever conditions you want. Print those files with the timestamps next to them.
find . -maxdepth 1 -type f -name "*_srv_logs_*.tar.gz" -printf '%T@ %p\n'
Next, sort them by the timestamps:
sort -r -n
Then, knock off the 4 most recent files from the list:
tail -n+5
Grab the 2nd column (the filename, not the timestamp):
awk '{ print $2; }'
And then wrap that whole thing up into a for statement:
for F in $(); do rm $F; done
This may be a more verbose command, but I had much better luck being able to target conditional files and execute more complex commands against them.
If the filenames don't have spaces, this will work:
ls -C1 -t| awk 'NR>5'|xargs rm
If the filenames do have spaces, something like
ls -C1 -t | awk 'NR>5' | sed -e "s/^/rm '/" -e "s/$/'/" | sh
Basic logic:
get a listing of the files in time order, one column
get all but the first 5 (n=5 for this example)
first version: send those to rm
second version: gen a script that will remove them properly
With zsh
Assuming you don't care about present directories and you will not have more than 999 files (choose a bigger number if you want, or create a while loop).
[ 6 -le `ls *(.)|wc -l` ] && rm *(.om[6,999])
In *(.om[6,999]), the . means plain files, the o means ascending sort order, the m means by date of modification (put a for access time or c for inode change), and [6,999] chooses a range of files, so it doesn't rm the 5 newest.
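For instance, to keep the 5 most recently accessed files instead, swap in the a qualifier mentioned above (a sketch):
[ 6 -le `ls *(.)|wc -l` ] && rm *(.oa[6,999])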
Adaptation of @mklement0's excellent answer with some parameters, and without needing to navigate to the folder containing the files to be deleted...
TARGET_FOLDER="/my/folder/path"
FILES_KEEP=5
ls -tp "$TARGET_FOLDER"**/* | grep -v '/$' | tail -n +$((FILES_KEEP+1)) | xargs -d '\n' -r rm --
[Ref(s).: https://stackoverflow.com/a/3572628/3223785 ]
Thanks! 😉
I found an interesting command in Sed One-Liners (delete last 3 lines) and found it perfect for another way to skin the cat (okay, not really), but it gives the idea:
#!/bin/bash
# change the '2' in the sed command below to the number of files you wish to retain
cd /opt/depot
ls -1 MyMintFiles*.zip > BigList
sed -n -e :a -e '1,2!{P;N;D;};N;ba' BigList > DeList
for i in `cat DeList`
do
echo "Deleted $i"
rm -f $i
#echo "File(s) gonzo "
#read junk
done
exit 0
Removes all but the 10 latest (most recent) files:
ls -t1 | head -n $(echo $(ls -1 | wc -l) - 10 | bc) | xargs rm
If there are fewer than 10 files, no file is removed, but you will get an error:
head: illegal line count -- 0
(ls -1 | wc -l is what counts the files in bash here.)
I needed an elegant solution for BusyBox (on a router); all the xargs and array solutions were useless to me - no such commands are available there. find with -mtime is not the proper answer either, since we are talking about 10 items and not necessarily 10 days. Espo's answer was the shortest and cleanest, and likely the most universal one.
Error with spaces and when no files are to be deleted are both simply solved the standard way:
rm "$(ls -td *.tar | awk 'NR>7')" 2>&-
A bit more educational version: we can do it all if we use awk differently. Normally, I use this method to pass (return) variables from awk to the shell. As we read all the time that this cannot be done, I beg to differ: here is the method.
Example for .tar files with no problem regarding the spaces in the filename. To test, replace "rm" with the "ls".
eval $(ls -td *.tar | awk 'NR>7 { print "rm \"" $0 "\""}')
Explanation:
ls -td *.tar lists all .tar files sorted by the time. To apply to all the files in the current folder, remove the "d *.tar" part
awk 'NR>7... skips the first 7 lines
print "rm \"" $0 "\"" constructs a line: rm "file name"
eval executes it
Since we are using rm, I would not use the above command in a script! Wiser usage is:
(cd /FolderToDeleteWithin && eval $(ls -td *.tar | awk 'NR>7 { print "rm \"" $0 "\""}'))
In this use of ls -t, the command will do no harm with such silly examples as: touch 'foo " bar' and touch 'hello * world'. Not that we ever create files with such names in real life!
Sidenote. If we wanted to pass a variable to the sh this way, we would simply modify the print (simple form, no spaces tolerated):
print "VarName="$1
to set the variable VarName to the value of $1. Multiple variables can be created in one go. This VarName becomes a normal sh variable and can be normally used in a script or shell afterwards. So, to create variables with awk and give them back to the shell:
eval $(ls -td *.tar | awk 'NR>7 { print "VarName=\""$1"\"" }'); echo "$VarName"
leaveCount=5
fileCount=$(ls -1 *.log | wc -l)
tailCount=$((fileCount - leaveCount))
# avoid negative tail argument
[[ $tailCount -lt 0 ]] && tailCount=0
ls -t *.log | tail -$tailCount | xargs rm -f
I made this into a bash shell script. Usage: keep NUM DIR where NUM is the number of files to keep and DIR is the directory to scrub.
#!/bin/bash
# Keep last N files by date.
# Usage: keep NUMBER DIRECTORY
echo ""
if [ $# -lt 2 ]; then
echo "Usage: $0 NUMFILES DIR"
echo "Keep last N newest files."
exit 1
fi
if [ ! -e "$2" ]; then
echo "ERROR: directory '$2' does not exist"
exit 1
fi
if [ ! -d "$2" ]; then
echo "ERROR: '$2' is not a directory"
exit 1
fi
pushd "$2" > /dev/null
ls -tp | grep -v '/' | tail -n +"$(($1 + 1))" | xargs -I {} rm -- {}
popd > /dev/null
echo "Done. Kept $1 most recent files in $2."
ls $2|wc -l
Modified version of @Fabien's answer if you want to specify a path. Useful if you're running the script from elsewhere.
ls -tr /path/foo/ | head -n -5 | xargs -I% --no-run-if-empty rm /path/foo/%
