Remove empty files and save a list of deleted files - bash

I need a script that removes all empty files and writes a list of deleted files to a text file.
Deleting files works. Unfortunately, the listing does not work.
find . -type f -empty -print -delete
I tried something like this:
-print >> test.txt

When I redirect the output of your command to a file in ., find deletes that file before anything is written to it, because it is still empty at that point.
To solve this, make sure the output file is not empty at the beginning, or save it elsewhere:
find . -type f -empty -print -delete > ../log
or
date > log
find . -type f -empty -print -delete >> log
or, adapted from @DanielFarrell's comment:
find . -type f -empty -a -not -wholename ./log -print -delete > log
The added -a -not -wholename ./log excludes ./log from the find operation.
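The behaviour is easy to check in a scratch directory. The following is only a sketch: it assumes a find that supports -empty and -delete (GNU or BSD), uses -path as the portable spelling of -wholename, and the file names are made up for illustration.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch empty1 empty2      # two empty files (illustrative names)
echo data > keep.txt     # one non-empty file

# The shell creates ./log (empty) before find starts,
# so it has to be excluded from the search explicitly.
find . -type f -empty ! -path ./log -print -delete > log

sort log                 # the two deleted files
ls                       # keep.txt and log remain
```

Dropping the ! -path ./log test reproduces the problem from the question: find unlinks the freshly created log and the list is lost.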

You can use the -exec option with the rm command instead of -delete.
find . -type f -empty -exec rm --verbose {} \; >> logfile.txt
logfile.txt:
removed './emptyfile1'
removed './emptyfile0'
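Or you can use pipes and xargs for cleaner output:
find . -type f -empty ! -name logfile.txt | tee -a logfile.txt | xargs -r -d '\n' rm
This will give you only the deleted filenames. With GNU xargs, -r keeps rm from running when find produces no output, and -d '\n' splits the input on newlines so that names containing spaces survive.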
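If the log should contain only the files that were really removed, the deletion and the logging can be done in one small shell snippet run by find itself. This is a hedged sketch, not the answer's own method: the scratch paths are hypothetical, and -empty is assumed to be available (GNU or BSD find).

```shell
#!/bin/sh
workdir=$(mktemp -d)     # tree to clean (illustrative)
logfile=$(mktemp)        # kept outside the searched tree on purpose

touch "$workdir/plain" "$workdir/with space"
echo data > "$workdir/nonempty"

# find hands the paths over as separate arguments, so spaces are harmless.
# A name is logged only when rm succeeded for it.
find "$workdir" -type f -empty -exec sh -c '
    for f in "$@"; do
        rm -- "$f" && printf "%s\n" "$f"
    done' sh {} + >> "$logfile"

cat "$logfile"
```

Because no pipe or xargs is involved, there is nothing to split or quote, and the log never lists a file that rm failed to delete.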

Related

Delete directories with specific files in Bash Script

I would like to delete specific files if they exist, and also the directories that contain them. I know which files I want to wipe, but not the directories. So far, as I'm new to bash scripting, I have come up with this:
find ./ -type f -name '*.r*' -print0 | xargs -0 rm -rf &> log_del.txt
find ./ -type f -name '*.c*' -print0 | xargs -0 rm -rf &>> log_del.txt
At the moment, all files named with the specific extensions *.r* and *.c* are deleted.
But the directories still remain, along with any subdirectories in them.
I also thought of the option -o in find to delete in one line :
find ./ -type f \( -name '*.r*' -o -name '*.c*' \) -print0 | xargs -0 rm -rf &> log_del.txt
How can I do this?
And I also see that my log_del.txt file is empty... :-(
It looks like what you really want is to remove all empty directories, recursively.
find . -type d -delete
-delete processes the directories in child-first order, so that a/b is deleted before a. If a given directory is not empty, find will just display an error and continue.
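If the directories are left empty, let rmdir try to remove all of them. It will fail on any directory that still contains files. The -depth option visits children before their parents, so nested empty directories are removed in a single pass.
find ./ -depth -type d -exec rmdir --ignore-fail-on-non-empty {} \; 2>/dev/null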
See if this serves your requirement:
find ./ -type f -name '*.r*' -delete -printf "%h\0" | xargs -0 rmdir
If the directory contained any other files, rmdir will fail.
So consider below sample file structure:
$ find a
a/
a/a/
a/a/4
a/b/
a/b/5
a/b/4
a/b/3
a/b/2
a/b/1
$ find a -type f -name '4' -delete -printf "%h\0" | xargs -0 -r rmdir
rmdir: failed to remove ‘a/b’: Directory not empty
$ find a
a
a/b
a/b/5
a/b/3
a/b/2
a/b/1
If, in the above example, you want to delete directory b as well, you can simply use:
$ find ./ -type f -name '*.r*' -printf "%h\0" | xargs -0 rm -rf
EDIT: As per the comment, you (OP) wanted that the empty directory tree should also be deleted. These 2 commands should help you then:
$ find ./ -type f -name '*.r*' -delete # Delete matching files
$ find ./ -empty -type d -delete # Delete tree of empty directories
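The two-step approach can be tried on a throwaway tree. The sketch below assumes GNU find; the directory layout and file names are invented for the demonstration. It also uses -print so that the log actually receives something, since rm without -v prints nothing on success, which is why log_del.txt stayed empty.

```shell
#!/bin/sh
root=$(mktemp -d)        # scratch tree (illustrative)
log=$(mktemp)            # log kept outside the tree

mkdir -p "$root/a/b" "$root/c"
touch "$root/a/b/x.rar" "$root/c/y.rar" "$root/c/notes.txt"

# Step 1: delete the matching files and record their names.
find "$root" -type f -name '*.r*' -print -delete > "$log"

# Step 2: remove directories that are now empty, children first.
# -mindepth 1 protects the starting directory itself.
find "$root" -mindepth 1 -type d -empty -delete

cat "$log"
find "$root" | sort
```

Here a/b and then a disappear because they end up empty, while c survives because notes.txt is still in it.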

Using cp in bash to use piped in information about files like modification date

I am trying to copy files from one directory into another from certain modification date ranges. For example, copy all files created after May 10 from dir1 to dir2. I have tried a few things but have been unsuccessful so far.
This made sense to me but cp does not take the filenames piped to it, but just executes ./* and copies all files in the directory:
find . -type f -daystart -mtime 2 | cp ./* /dir/
This almost worked, but did not copy all of the matching files. I also tried xargs -s 50000, but that did not work either:
find . -type f -daystart -mtime 2 | xargs -I {} cp {} /dir/
find . -type f -daystart -mtime 2 | xargs cp -t /dir/
Found this online, does not work:
cp $(find . -type f -daystart -mtime 2) /dir/
Ideas? Thanks.
Given that your actual question is about using filenames from stdin rather than metadata from stdin, this is quite straightforward:
while IFS= read -r -d '' filename; do
cp "$filename" /wherever
done < <(find . -type f -daystart -mtime 2 -print0)
Note the use of IFS= read -r -d '' and -print0 -- as NUL and / are the only two characters which can't be used in UNIX filenames, using any other character, including the newline, to delimit them is unsafe. Think about what would happen if someone (or a software bug) created a file called $'./ \n/etc/passwd'; you want to be damned sure none of your scripts try to delete or overwrite /etc/passwd when they're trying to delete or overwrite that file.
That said, you don't actually need to use a pipe at all:
find . -type f -daystart -mtime -2 -exec cp '{}' /wherever ';'
...or, if you're only trying to support GNU cp, you can use this more efficient variant:
find . -type f -daystart -mtime -2 -exec cp -t /wherever '{}' +
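A quick way to convince yourself that -exec copes with awkward names is to copy a few of them between two scratch directories. This is a sketch under assumptions: the directories and file names are hypothetical, and -daystart is left out so that only POSIX find options are needed.

```shell
#!/bin/sh
src=$(mktemp -d)
dest=$(mktemp -d)

# Names with a space and with an embedded newline (illustrative).
touch "$src/plain.txt" "$src/with space.txt" "$src/two
lines.txt"

# find passes each path to cp as one argument,
# so no delimiter has to be parsed at all.
find "$src" -type f -mtime -2 -exec cp -- {} "$dest" \;

# Count the copies without parsing ls output.
find "$dest" -type f -exec sh -c 'printf "%s\n" "$#"' sh {} +
```

All three files arrive intact, including the one whose name spans two lines, which a newline-delimited pipe would have split into two bogus names.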
You don't specify why the various attempts didn't work, so I can only assume that they are the result of whitespace in the filenames.
Try using find's useful -exec action instead of using xargs:
find . -type f -daystart -mtime 2 -exec cp {} /media/alex/Extra/Music/watchfolder/ \;
Alternatively, pipe the list to cpio in pass-through mode:
find . -type f -daystart -mtime 2 \
| cpio -pdv /media/alex/Extra/Music/watchfolder/

I am getting an error "arg list too long" in unix

I am using the following command and getting the error "arg list too long". Help needed.
find ./* \
-prune \
-name "*.dat" \
-type f \
-cmin +60 \
-exec basename {} \;
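Here is the fix. Start from . instead of ./* so the shell has nothing to expand, and prune everything below the starting directory:
find . ! -name . -prune -name "*.dat" -type f -cmin +60 | xargs -I {} basename {}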
To only find files in the current directory, use -maxdepth 1.
find . -maxdepth 1 -name '*.dat' -type f -cmin +60 -exec basename {} \;
On all *nix systems there is a maximum total length for the arguments that can be passed to a command (the kernel's ARG_MAX limit). It is measured after the shell has expanded any filename patterns on the command line.
The syntax of find is find location_to_find_from arguments..., so when you run this command the shell expands ./* to a list of all files in the current directory, turning your command line into find file1 file2 file3 and so on. This is probably not what you want, as find is recursive anyway. I expect that you are running this command in a large directory and exceeding the command-length limit.
Try running the command as follows
find . -name "*.dat" -type f -cmin +60 -exec basename {} \;
This will prevent the filename expansion that is probably causing your issue.
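The expansion can be made visible without find at all. In this sketch the scratch directory and its three files are made up; set -- simply captures what the shell would pass to find, and getconf ARG_MAX prints the system limit in bytes.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.dat b.dat c.txt      # illustrative files

# The shell expands ./* before find ever runs;
# "set --" stores that same expansion in the positional parameters.
set -- ./*
echo "find ./* would receive $# starting points"

# Upper bound on the combined size of arguments and environment.
getconf ARG_MAX

# With a single starting point the command line stays short,
# however many files the directory holds.
find . -name '*.dat' -type f -exec basename {} \;
```

In a directory with hundreds of thousands of entries the first form grows with every file, while the last one always passes the same handful of arguments.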
Without find, and only checking the current directory
now=$(date +%s)
for file in *.dat; do
if (( $now - $(stat -c %Y "$file") > 3600 )); then
echo "$file"
fi
done
This works on my GNU system. You may need to alter the date and stat formats for other operating systems.
If you only need to show the .dat filenames in the ./ tree, run it without the -prune option and pass just the path:
find ./ -name "*.dat" -type f -cmin +60 -exec basename {} \;
To find all the .dat files which are older than 60 minutes in the present directory only, do as follows:
find . -iregex "\./[^/]+\.dat" -type f -cmin +60 -exec basename {} \;
And if you have a limited version of the find tool (for example, on AIX), do as follows:
find . -name "*.dat" -type f -cmin +60 | grep '^\./[^/]*\.dat$' | sed 's/^\.\///'

recursively delete all files except some specific types

I want to recursively delete all files in some folders except those that have a .gz extension. Normally I use
find /thepath -name "foo" -print0 | xargs -0 rm -rf
to recursively delete all folders named "foo" in /thepath. But now I want to add an exclusion option. How is that possible?
For example, the folder structure looks like
.hiddenfolder
.hiddenfolder/bin.so
arc.tar.gz
note.txt
sample
So I want to delete everything but keep arc.tar.gz
Find and delete all files under /thepath except with name matching *.gz:
# First check with ls -l
find /thepath -type f ! -name '*.gz' -print0 | xargs -0 ls -l
# Ok: delete
find /thepath -type f ! -name '*.gz' -print0 | xargs -0 rm -vf
Oh, and to delete all empty left-over directories:
find /thepath -type d -empty -print0 | xargs -0 rmdir -v
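The whole sequence can be rehearsed on a copy of the layout from the question. This is a sketch with assumptions spelled out: the tree lives in a temporary directory, sample is taken to be an empty directory, and -exec is used in place of xargs so that it also runs where xargs -0 is missing.

```shell
#!/bin/sh
top=$(mktemp -d)             # stand-in for /thepath
mkdir -p "$top/.hiddenfolder" "$top/sample"
touch "$top/.hiddenfolder/bin.so" "$top/arc.tar.gz" "$top/note.txt"

# Delete every regular file whose name does not end in .gz.
find "$top" -type f ! -name '*.gz' -exec rm -f -- {} +

# Remove the directories left empty, deepest first.
# \; (not +) makes each rmdir run before the parent is tested.
find "$top" -mindepth 1 -depth -type d -empty -exec rmdir -- {} \;

find "$top" | sort
```

Only the top directory and arc.tar.gz are left afterwards.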
I think
find /thepath -name "foo" ! -name "*.gz" -print0
should produce the correct list of filenames, but check before piping the output to your xargs command to perform the actual deletions.

how to ignore directories but not the files in them in bash script with find

I want to run a find command but only find the files in directories, not the directories or subdirectories themselves. Also acceptable would be to find the directories but grep them out or something similar, still listing the files in those directories. As of right now, to find all files changed in the last day in the working directory, and grep'ing out DS_Store and replacing spaces with underscores:
find . -mtime -1 -type f -print | grep -v '\.DS_Store' | awk '{gsub(/ /,"_")}; 1'
Any help would be appreciated!
If you have GNU find:
find . -mtime -1 ! -name '.DS_Store' -type f -printf '%f\n'
will print only the basename of the file.
For other versions of find:
find . -mtime -1 ! -name '.DS_Store' -type f -exec basename {} \;
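To also replace the spaces with underscores, you could then do something like this (shown here for files named index.html):
find . -name index.html -exec sh -c 'basename "$1" | tr " " _' _ {} \;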
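Putting the pieces together for the original task (files changed in the last day, no .DS_Store, spaces turned into underscores) might look like the sketch below. The scratch directory and file names are hypothetical; only POSIX find options are used.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir "$dir/sub dir"
touch "$dir/sub dir/my file.txt" "$dir/.DS_Store"   # illustrative files

# One sh per batch of files; each path arrives as its own argument,
# so the space in "sub dir" causes no trouble.
result=$(find "$dir" -mtime -1 ! -name '.DS_Store' -type f -exec sh -c '
    for path in "$@"; do
        basename "$path" | tr " " _
    done' sh {} +)

printf '%s\n' "$result"      # prints my_file.txt
```

Only files are listed because of -type f; the directories are still traversed, they just never reach the output.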
