Counting the contents of a directory - bash

So I know how I would approach counting the number of files in a directory: I would use a for filename in * loop and then test the file names to fit my purpose, but I'm having trouble figuring out how to loop through a directory and count how many (sub)directories are in it.
Could anyone point me in the right direction?

You can test if it's a directory by using -d.
You can use find: find . -mindepth 1 -maxdepth 1 -type d
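If all you need is the count, you can pipe that straight to wc -l; a minimal sketch, assuming none of the directory names contain newlines:
find . -mindepth 1 -maxdepth 1 -type d | wc -l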

n=0
for fn in *
do
    [[ -d "${fn}" ]] && ((n++))
done
Keep a counter and only increment it for directories...
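A variant of the same idea that also counts hidden directories and handles an empty directory cleanly; this sketch assumes bash and leans on the nullglob/dotglob options used in later answers (note that */ also matches symlinks to directories):
shopt -s nullglob dotglob
n=0
for fn in */; do ((n++)); done
echo "$n"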

What are you trying to do? Take a look at the wc command, specifically wc -l, which counts the number of lines in its input. You can use a whole array of commands that generate output and then pipe that to wc -l. Be careful of commands that add header or footer lines to their output (like ls -l, which prints a "total" line).
Here are some examples:
This will count all files and directories that don't start with .:
$ ls | wc -l
It's the same as the for loop you had in your question.
This will count all files and directories including those hidden ones. Note the ls -A instead of ls -a. The first won't list . and .. as files while the second will:
$ ls -A | wc -l
This will count all files and directories in the entire directory tree:
$ find . | wc -l
This will only count the directories in the whole directory tree:
$ find . -type d | wc -l
This will count all the files in the whole directory tree:
$ find . -type f | wc -l
This will limit you to the number of directories in the current directory:
$ find . -mindepth 1 -maxdepth 1 -type d | wc -l
And, you can use this to assign it to a variable:
$ num_of_files=$(find . -type f | wc -l)
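If you're worried about file names containing newlines (which inflate any wc -l count), GNU find lets you print one dot per file and count characters instead, the same trick used further down this page; a sketch:
num_of_files=$(find . -type f -printf '.' | wc -c)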

Here is how to count directories, or do stuff with the directory names.
#!/bin/bash
old_IFS=$IFS
IFS=$'\n'
array=($(ls -F /foo/bar/ | grep '/$')) # this creates an array named "array" that holds
IFS=$old_IFS                           # all the directory names located in /foo/bar/
echo ${#array[@]} # this will give you the number of directories in /foo/bar/
for ((i=0; i<${#array[@]}; i++))
do
    echo ${array[$i]} # this will output a list of all the directories
done
Alternatively you could:
ls -F /foo/bar/ | grep '/$' | cat > directorynames.txt
and then count the number of lines. Or you could get rid of the cat and just put the above in a loop that counts up for every newline character.
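A minimal sketch of that loop idea, reading the pipeline's output line by line (like the other ls-parsing approaches here, it assumes directory names contain no newlines):
n=0
while IFS= read -r line; do
    ((n++))
done < <(ls -F /foo/bar/ | grep '/$')
echo "$n"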

Related

How do I add the numbers from the results?

I want to make a bash script that counts how many files there are in specific folders.
Example:
#!/usr/bin/env bash
sum=0
find /home/user/Downloads | wc -l
find /home/user/Documents | wc -l
find /home/user/videos | wc -l
echo "files are $sum "
Downloads folder has 5 files, Documents has 10 files and videos has 10 files.
I want to add all files from above directories and print the number of files.
echo "files are $sum "
Please note that I would like to use only the find command, because my script deletes some files; my goal is to know how many files I deleted.
Anything that's piping to wc -l isn't counting how many files you have, it's counting how many newlines are present, and so it'll fail if any of your file names contain newlines (a very real possibility, especially given the specific directories you're searching). You could do this instead using GNU tools:
find \
  /home/user/Downloads \
  /home/user/Documents \
  /home/user/videos \
  -print0 |
  awk -v RS='\0' 'END{print NR}'
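To get that result into the sum variable the question asks for, a sketch:
sum=$(find /home/user/Downloads /home/user/Documents /home/user/videos -print0 |
      awk -v RS='\0' 'END{print NR}')
echo "files are $sum"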
I'm not sure about this, but you can try this code.
#!/usr/bin/env bash
sum=0
downloads=$(find /home/user/Downloads | wc -l)
((sum += downloads))
documents=$(find /home/user/Documents | wc -l)
((sum += documents))
videos=$(find /home/user/videos | wc -l)
((sum += videos))
echo "files are $sum "
You can save the individual results and sum them, as per FS-GSW's answer.
That's what I would do in Java or C, but shell scripts have their own patterns. Whenever you find yourself with lots of variables or explicit looping, take a step back. That imperative style is not super idiomatic in shell scripts.
Can you pass multiple file names to the same command?
Can that loop become a pipe?
In this case, I would pass all three directories to a single find command. Then you only need to call wc -l once on the combined listing:
find /home/user/Downloads /home/user/Documents /home/user/videos | wc -l
Or using tilde and brace expansion:
find ~user/{Downloads,Documents,videos} | wc -l
And if the list of files is getting long you could store it in an array:
files=(
  /home/user/Downloads
  /home/user/Documents
  /home/user/videos
)
find "${files[@]}" | wc -l
Just use
find /home/user/ -type f | wc -l
to count the files inside the user directory recursively.
With tree you could also find/count (or whatever else you want to do) hidden files.
e.g. tree -a /home/user/, whose output ends with a summary like "XXX directories, XXX files"; that counts both at once, though, so it doesn't directly answer your question.

Script to count number of files in each directory

I need to count the number of files in a large number of directories. Is there an easy way to do this with a shell script (using find, wc, sed, awk or similar)? Just to avoid having to write a proper script in Python.
The output would be something like this:
$ <magic_command>
dir1 2
dir2 12
dir3 5
The number after the dir name would be the number of files. A plus would be the ability to turn counting of dot/hidden files on and off.
Thanks!
Try this one:
du -a | cut -d/ -f2 | sort | uniq -c | sort -nr
from http://www.linuxquestions.org/questions/linux-newbie-8/how-to-find-the-total-number-of-files-in-a-folder-510009/#post3466477
find <dir> -type f | wc -l
find -type f will list all files in the specified directory, one per line; wc -l counts the number of newlines it sees on stdin.
Also for future reference: answers like this are a Google search away.
More or less what I was looking for:
find . -type d -exec sh -c 'echo "{}" `ls "{}" | wc -l`' \;
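Note that embedding {} inside the sh -c string breaks on directory names containing quotes or backquotes; a safer sketch of the same idea passes each name as a positional argument instead:
find . -type d -exec sh -c 'echo "$1" $(ls "$1" | wc -l)' _ {} \;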
Try ls | wc: it lists the files in your directory and feeds the listing to wc as input.
One way like this:
$ for dir in $(find . -type d )
> do
> echo $dir $(ls -A $dir | wc -l )
> done
Just remove the -A option if you do not want the hidden file count
find . -type d | xargs ls -1 | perl -lne 'if(/^\./ || eof){print $a." ".$count;$a=$_;$count=-1}else{$count++}'
below is the test:
> find . -type d
.
./SunWS_cache
./wicked
./wicked/segvhandler
./test
./test/test2
./test/tempdir.
./signal_handlers
./signal_handlers/part2
> find . -type d | xargs ls -1 | perl -lne 'if(/^\./ || eof){print $a." ".$count;$a=$_;$count=-1}else{$count++}'
.: 79
./SunWS_cache: 4
./signal_handlers: 6
./signal_handlers/part2: 5
./test: 6
./test/tempdir.: 0
./test/test2: 0
./wicked: 4
./wicked/segvhandler: 9
A generic version of Mehdi Karamosly's solution to list the folders of any directory without changing the current directory:
DIR=~/test/ sh -c 'cd $DIR; du -a | cut -d/ -f2 | sort | uniq -c | sort -nr'
Explanation:
Extract the directory into a variable
Start a new shell
Change directory in that shell, so the current shell's directory stays the same
Process
I use these functions:
nf()(for d;do echo $(ls -A -- "$d"|wc -l) "$d";done)
nfr()(for d;do echo $(find "$d" -mindepth 1|wc -l) "$d";done)
Both assume that filenames don't contain newlines.
Here are bash-only versions:
nf()(shopt -s nullglob dotglob;for d;do a=("$d"/*);echo "${#a[@]} $d";done)
nfr()(shopt -s nullglob dotglob globstar;for d;do a=("$d"/**);echo "${#a[@]} $d";done)
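Usage is the same for all four: pass one or more directories (the paths here are just examples), and each output line is a count followed by the directory name:
nf /tmp /var/log
nfr ~/projects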
I liked the output from the du based answer, but when I was looking at a large filesystem it was taking ages, so I put together a small ls based script which gives the same output, but much quicker:
for dir in `ls -1A ~/test/`;
do
echo "$dir `ls -R1Ap ~/test/$dir | grep -Ev "[/:]|^\s*$" | wc -l`"
done
You can try copying the output of the ls command to a text file and then counting the number of lines in that file.
ls $LOCATION > outText.txt; NUM_FILES=$(wc -l < outText.txt); echo $NUM_FILES
find -type f -printf '%h\n' | sort | uniq -c | sort -n
(GNU find's %h prints each file's parent directory, so this gives a per-directory file count.)

How to get the number of files in a folder as a variable?

Using bash, how can one get the number of files in a folder, excluding directories, from a shell script, without the interpreter complaining?
With the help of a friend, I've tried
$files=$(find ../ -maxdepth 1 -type f | sort -n)
$num=$("ls -l" | "grep ^-" | "wc -l")
which returns from the command line:
../1-prefix_blended_fused.jpg: No such file or directory
ls -l : command not found
grep ^-: command not found
wc -l: command not found
respectively. These commands work on the command line, but NOT with a bash script.
Given a folder filled with image files named like 1-pano.jpg, I want to grab all the images in the directory to get the largest numbered file, to tack onto the next image being processed.
Why the discrepancy?
The quotes are causing the error messages.
To get a count of files in the directory:
shopt -s nullglob
numfiles=(*)
numfiles=${#numfiles[@]}
which creates an array and then replaces it with the count of its elements. This will include files and directories, but not dotfiles or . or .. or other dotted directories.
Use nullglob so an empty directory gives a count of 0 instead of 1.
You can instead use find -type f or you can count the directories and subtract:
# continuing from above
numdirs=(*/)
numdirs=${#numdirs[@]}
(( numfiles -= numdirs ))
Also see "How can I find the latest (newest, earliest, oldest) file in a directory?"
You can have as many spaces as you want inside an execution block. They often aid in readability. The only downside is that they make the file a little larger and may slow initial parsing (only) slightly. There are a few places that must have spaces (e.g. around [, [[, ], ]] and = in comparisons) and a few that must not (e.g. around = in an assignment).
ls -l | grep -v ^d | wc -l
One line.
How about:
count=$(find .. -maxdepth 1 -type f|wc -l)
echo $count
let count=count+1 # Increase by one, for the next file number
echo $count
Note that this solution is not efficient: it spawns sub shells for the find and wc commands, but it should work.
file_num=$(ls -1 --file-type | grep -v '/$' | wc -l)
This is a bit more lightweight than a find command, and counts all files in the current directory.
The most straightforward, reliable way I can think of is using the find command to create a reliably countable output.
Counting characters output of find with wc:
find . -maxdepth 1 -type f -printf '.' | wc --chars
or string length of the find output:
a=$(find . -maxdepth 1 -type f -printf '.')
echo ${#a}
or using find output to populate an arithmetic expression:
echo $(($(find . -maxdepth 1 -type f -printf '+1')))
Simple efficient method:
#!/bin/bash
RES=$(find "${SOURCE}" -type f | wc -l)
Get rid of the quotes. The shell is treating them like one file, so it's looking for "ls -l".
Remove the quotes and you will be fine.
Expanding on the accepted answer (by Dennis W): when I tried this approach I got incorrect counts for dirs without subdirs in Bash 4.4.5.
The issue is that by default nullglob is not set in Bash, so numdirs=(*/) sets a 1-element array containing the literal glob pattern */. Likewise I suspect numfiles=(*) would have 1 element for an empty folder.
Setting shopt -s nullglob, so that unmatched globs expand to nothing, resolves the issue for me. For an excellent discussion of why nullglob is not set by default in Bash, see the answer here: Why is nullglob not default?
Note: I would have commented on the answer directly but lack the reputation points.
Here's one way you could do it as a function. Note: you can pass this function "dirs" for a directory count, "files" for a file count, or anything else for a count of everything in the directory. It does not traverse the tree, as we aren't looking to do that.
function get_counts_dir() {
    # -- handle inputs (e.g. get_counts_dir "files" /path/to/folder)
    [[ -z "$1" ]] && type="files" || type="${1,,}"
    [[ -z "$2" ]] && dir="$(pwd)" || dir="$2"   # don't lowercase the path
    shopt -s nullglob
    prev_dir=$(pwd)   # saved in its own variable; cd updates the shell's PWD
    cd "${dir}"
    numfiles=(*)
    numfiles=${#numfiles[@]}
    numdirs=(*/)
    numdirs=${#numdirs[@]}
    # -- handle input types files/dirs/or both
    result=0
    case "${type}" in
        "files")
            result=$((( numfiles -= numdirs )))
            ;;
        "dirs")
            result=${numdirs}
            ;;
        *) # -- returns all files/dirs
            result=${numfiles}
            ;;
    esac
    cd "${prev_dir}"
    shopt -u nullglob
    # -- return result --
    [[ -z ${result} ]] && echo 0 || echo ${result}
}
Examples of using the function:
folder="/home"
get_counts_dir "files" "${folder}"
get_counts_dir "dirs" "${folder}"
get_counts_dir "both" "${folder}"
Will print something like:
2
4
6
Short and sweet method which also ignores symlinked directories.
count=$(ls -l | grep ^- | wc -l)
or if you have a target:
count=$(ls -l /path/to/target | grep ^- | wc -l)

How to list only files and not directories of a directory in Bash?

How can I list all the files in a folder, but not its subdirectories or their contents? In other words: how can I list only the files?
Using find:
find . -maxdepth 1 -type f
Using the -maxdepth 1 option ensures that you only look in the current directory (or, if you replace the . with some path, that directory). If you want a full recursive listing of all files in that and subdirectories, just remove that option.
ls -p | grep -v /
ls -p appends / to folder names, which acts as a tag for grep -v to remove.
carlpett's find-based answer (find . -maxdepth 1 -type f) works in principle, but is not quite the same as using ls: you get a potentially unsorted list of filenames all prefixed with ./, and you lose the ability to apply ls's many options;
also, find invariably finds hidden items too, whereas ls's behavior depends on the presence or absence of the -a or -A options.
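If you do want find to skip hidden entries the way plain ls does, you can filter them by name, the same ! -name '.*' trick that appears later on this page:
find . -maxdepth 1 -type f ! -name '.*'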
An improvement, suggested by Alex Hall in a comment on the question is to combine shell globbing with find:
find * -maxdepth 0 -type f # find -L * ... includes symlinks to files
However, while this addresses the prefix problem and gives you alphabetically sorted output, you still have neither (inline) control over inclusion of hidden items nor access to ls's many other sorting / output-format options.
Hans Roggeman's ls + grep answer is pragmatic, but locks you into using long (-l) output format.
To address these limitations I wrote the fls (filtering ls) utility, which provides the output flexibility of ls while adding type-filtering capability: simply place type-filtering characters such as f for files, d for directories, and l for symlinks before a list of ls arguments (run fls --help or fls --man to learn more).
Examples:
fls f # list all files in current dir.
fls d -tA ~ # list dirs. in home dir., including hidden ones, most recent first
fls f^l /usr/local/bin/c* # List matches that are files, but not (^) symlinks (l)
Installation
Supported platforms
When installing from the npm registry: Linux and macOS
When installing manually: any Unix-like platform with Bash
From the npm registry
Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try
curl -L https://git.io/n-install | bash
With Node.js installed, install as follows:
[sudo] npm install fls -g
Note:
Whether you need sudo depends on how you installed Node.js / io.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
The -g ensures global installation and is needed to put fls in your system's $PATH.
Manual installation
Download this bash script as fls.
Make it executable with chmod +x fls.
Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (macOS) or /usr/bin (Linux).
Listing content of some directory, without subdirectories
I like using ls options, for sample:
-l use a long listing format
-t sort by modification time, newest first
-r reverse order while sorting
-F, --classify append indicator (one of */=>@|) to entries
-h, --human-readable with -l and -s, print sizes like 1K 234M 2G etc...
Sometimes --color, and all the others. (See ls --help.)
Listing everything but folders
This will show files, symlinks, devices, pipes, sockets, etc.:
find /some/path -maxdepth 1 ! -type d
could be sorted by date easily:
find /some/path -maxdepth 1 ! -type d -exec ls -hltrF {} +
Listing files only:
find /some/path -maxdepth 1 -type f
sorted by size:
find /some/path -maxdepth 1 -type f -exec ls -lSF --color {} +
Prevent listing of hidden entries:
To not show hidden entries, where name begin by a dot, you could add ! -name '.*':
find /some/path -maxdepth 1 ! -type d ! -name '.*' -exec ls -hltrF {} +
Then
You could replace /some/path by . to list for current directory or .. for parent directory.
You can also use ls with grep or egrep and put it in your profile as an alias:
ls -l | egrep -v '^d'
ls -l | grep -v '^d'
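For example, as an alias in your profile (the name lsf here is arbitrary):
alias lsf="ls -l | egrep -v '^d'"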
find files: ls -l /home | grep "^-" | tr -s ' ' | cut -d ' ' -f 9
find directories: ls -l /home | grep "^d" | tr -s ' ' | cut -d ' ' -f 9
find links: ls -l /home | grep "^l" | tr -s ' ' | cut -d ' ' -f 9
tr -s ' ' squeezes the repeated spaces, turning the output into space-delimited fields
the cut command says the delimiter is a space, and returns the 9th field (always the filename/directory name/link name).
I use this all the time!
You are welcome!
ls -l | grep '^-'
If you're looking just for the name, pipe to cut or awk.
ls -l | grep '^-' | awk '{print $9}'
ls -l | grep '^-' | cut -d " " -f 13
{ find . -maxdepth 1 -type f | xargs ls -1t | less; }
added xargs to make it work, and used -1 instead of -l to show only filenames without additional ls info
You can use one of these:
echo *.* | cut -d ' ' -f 1- --output-delimiter=$'\n'
echo *.* | tr ' ' '\n'
echo *.* | sed 's/\s\+/\n/g'
ls -Ap | sort | grep -v /
This method does not use external commands.
bash$ res=$( IFS=$'\n'; AA=(`compgen -d`); IFS='|'; eval compgen -f -X '@("${AA[*]}")' )
bash$ echo "$res"
. . .
Just adding on to carlpett's answer.
For a more useful view of the files, you could pipe the output through xargs to ls (piping directly to ls does nothing useful, since ls ignores stdin):
find . -maxdepth 1 -type f | xargs ls -lt | less
Shows the most recently modified files in a list format, quite useful when you have downloaded a lot of files, and want to see a non-cluttered version of the recent ones.
"find '-maxdepth' " does not work with my old version of bash, therefore I use:
for f in $(ls) ; do if [ -f $f ] ; then echo $f ; fi ; done

Get the newest directory to a variable in Bash

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d lists the matched directories themselves, not their contents
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
Well, I think this solution is the most efficient:
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T#\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T#\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (that does not have -printf) you may, if you have it, run stat to get file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat, a pure shell solution may be used, replacing the [[ bash extension with [ as in this answer.
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from GitBash within Windows environment in file called ./something.bash.
