Show and count all file extensions in directory (with subdirectories) - bash

I'm using the command from this topic to view all file extensions in a directory and all of its subdirectories:
find . -type f -name '*.*' | sed 's|.*\.||' | sort -u
How can I count the number of occurrences of each extension?
Like:
png: 140

Like this, using uniq with the -c (--count) flag:
find . -type f -name '*.*' | sed 's|.*\.||' | sort | uniq -c
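Note that uniq -c puts the count first (e.g. "140 png"). To sort the extensions by frequency and match the "png: 140" format from the question, you can extend the same pipeline, for example:
find . -type f -name '*.*' | sed 's|.*\.||' | sort | uniq -c | sort -rn | awk '{print $2": "$1}'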

Related

Format the output of ls -lRA command using grep

I am trying to print only specific lines from the output, matched by keyword with grep:
ls -RlA | grep foo | sed -n '1 p'
ls -RlA | grep bar | sed -n '1 p'
ls -RlA | grep foo_file
ls -RlA | grep bar_file
Is there a way to simplify these statements into just one command?
P.S.: Order does not matter.
Find all files or directories with the given names:
find . '(' -name foo -o -name bar -o -name foo_file -o -name bar_file ')' -ls
A more compact version using a regex:
find . -regex '.*/\(foo\|bar\|foo_file\|bar_file\)' -ls
Same as above, but check that foo_file and bar_file are files, not directories:
find . '(' -name foo -o -name bar -o -name foo_file -type f -o -name bar_file -type f ')' -ls
Here it is in one command:
ls -la **/{foo,bar,foo_file,bar_file}
You can also use * inside {}, such as {*.txt,foo_*.zip}.
Note that it will not work if one of the patterns inside {} matches nothing.
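Also note that ** only recurses into subdirectories when bash's globstar option is enabled (available in bash 4 and later, off by default):
shopt -s globstar
ls -la **/{foo,bar,foo_file,bar_file}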

Finding duplicate files in Unix by content

How do I find duplicate files recursively by content instead of by file name?
find . -type f -exec basename {} \; | sed 's/\(.*\)\..*/\1/' | sort | uniq -c | grep -v "^[ \t]*1 "
This lists base names (with any extension stripped) that occur more than once in the tree, i.e. duplicates by name rather than by content.
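Since the question asks for duplicates by content rather than by name, a checksum-based sketch (assuming GNU md5sum and uniq from coreutils) would be:
find . -type f -exec md5sum {} + | sort | uniq --check-chars=32 --all-repeated=separate
This hashes every file, sorts by digest, and prints only the groups of lines whose first 32 characters (the MD5 digest) repeat, separating the groups with blank lines.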

Find duplicates of a specific file on macOS

I have a directory that contains files and other directories, and I have one specific file that I know has duplicates somewhere in the given directory tree.
How can I find these duplicates using Bash on macOS?
Basically, I'm looking for something like this (pseudo-code):
$ find-duplicates --of foo.txt --in ~/some/dir --recursive
I have seen that there are tools such as fdupes, but I'm not interested in all duplicate files (only in duplicates of a specific file), nor in duplicates anywhere on disk (only within the given directory or its subdirectories).
How do I do this?
For a solution that uses only macOS's built-in shell utilities, try this:
find DIR -type f -print0 | xargs -0 md5 -r | grep "$(md5 -q FILE)"
where:
DIR is the directory you are interested in;
FILE is the file (path) you are searching for duplicates of.
If you only need the duplicate files' paths, pipe through this as well:
cut -d' ' -f2-
(the trailing - keeps paths containing spaces intact)
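Putting it together with the placeholder names from the question (~/some/dir and foo.txt), the whole pipeline would look like:
find ~/some/dir -type f -print0 | xargs -0 md5 -r | grep "$(md5 -q foo.txt)" | cut -d' ' -f2-
Here md5 -q prints only the digest of foo.txt, md5 -r prints "digest path" for every file, and grep keeps the matching lines.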
If you're looking for a specific filename, you could do:
find ~/some/dir -name foo.txt
which will return a list of all files named foo.txt in the directory. If you want to check whether there are multiple files in the directory with the same name, you could do:
find ~/some/dir -exec basename {} \; | sort | uniq -d
This will give you a list of files with duplicate names (you can then use find again to figure out where those live).
--- EDIT ---
If you're looking for identical files (with the same md5 sum), you could also do:
find . -type f -exec md5sum {} \; | sort | uniq -d --check-chars=32
--- EDIT 2 ---
If your md5sum doesn't output the filename, you can use:
find . -type f -exec echo -n "{} " \; -exec md5sum {} \; | awk '{print $2, $1}' | sort | uniq -d --check-chars=32
--- EDIT 3 ---
If you're looking for a file with a specific md5 sum:
sum=$(md5sum foo.txt | cut -f1 -d' ')
find ~/some/dir -type f -exec md5sum {} \; | grep "$sum"
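On macOS itself, where md5sum may not be installed, the same idea works with the BSD md5 tool used earlier (again with the question's placeholder names):
sum=$(md5 -q foo.txt)
find ~/some/dir -type f -exec md5 -r {} \; | grep "^$sum"
Anchoring the pattern with ^ avoids accidentally matching the digest inside a file path.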

BASH script: list all files including subdirectories and sort them by date

I have a bash script:
for entry in "/home/pictures"/*
do
    echo "ls -larth $entry"
done
I also want to list the files in subfolders and include their paths.
I want to sort the results by date.
It must be a bash script, because some other software (Jenkins) will call it.
Try find.
find /home/pictures -type f -exec ls -l --full-time {} \; | sort -k 6
If there are no newlines in file names, use:
find /home/pictures -type f -printf '%T@ %p\n' | sort -n
If you cannot tolerate timestamps in the output, use:
find /home/pictures -type f -printf '%28T@ %p\n' | sort -n | cut -c30-
If there is a possibility of newlines in file names, and if you can make the program that consumes the output accept null-terminated records, you can use:
find /home/pictures -type f -printf '%T@,%p\0' | sort -nz
For no timestamps in the output, use:
find /home/pictures -type f -printf '%28T@ %p\0' | sort -nz | cut -zc30-
P.S.
I have assumed that you want to sort by last modification time.
I found the solution to my question:
find . -name '*' -exec ls -larth {} +
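Note that this sorts only within each batch of files that find hands to ls: with enough files, the + terminator makes find run ls more than once, and each run is sorted independently, so the -printf variants above are more reliable for a global ordering by date.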

Find Files having multiple links in shell script

I want to find the files which have multiple links.
I am using Ubuntu 10.10.
find -type l
It will show all links to the file, but I want to count the links for a particular file.
Thanks.
With this command, you will get a summary of linked files:
find . -type l -exec readlink -f {} \; | sort | uniq -c | sort -n
or
find . -type l -print0 | xargs -n1 -0 readlink -f | sort | uniq -c | sort -n
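Both commands summarize symbolic links. If by "multiple links" you mean hard links, GNU find (standard on Ubuntu) can report the link count and inode directly; a sketch:
find . -type f -links +1 -printf '%n %i %p\n' | sort -k2
Here -links +1 matches files with more than one hard link, %n prints the link count and %i the inode number, so sorting on the inode groups together the paths that refer to the same file.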
