I am trying to write a one-liner to find the number of files in each home directory. I am doing this because the other day I had a situation where I ran out of inodes on /home. It took me a long time to find the offender and I want to shorten this process. This is what I have, but it is not working.
for i in /home/*; do if [ -d "$i" ]; then cd $i find . -xdev -maxdepth 100 -type f |wc -l; fi done
When I run it, it prints a 0 for each home directory, and I remain in root's home directory.
However when I run just this part:
for i in /home/*; do if [ -d "$i" ]; then cd $i; fi done
I wind up in the last home directory, leading me to believe I traversed them all.
And when I run this in each user's home directory:
find . -xdev -maxdepth 100 -type f |wc -l
I get a legit answer.
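You're missing a command terminator (a ; or a newline) after your cd, so find and its arguments are passed to cd as extra arguments. The cd then runs inside the pipeline's subshell and prints nothing, which is why wc -l reports 0 and your working directory never changes. More importantly, using cd in a loop can cause unwanted errors if you're not careful, so try the version below instead (no cd needed).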
for i in /home/*; do [ -d "$i" ] && echo "$i" && find "$i" -xdev -maxdepth 100 -type f | wc -l; done
Since find can take multiple paths, you don't need a loop if a single grand total across all home directories is enough:
find /home/*/ -xdev -maxdepth 100 -type f | wc -l
To avoid any issues with filenames containing newlines (rare, yes), you can take advantage of an additional GNU extension to find (you're using -maxdepth, so I assume you can use -printf as well):
find /home/*/ -xdev -maxdepth 100 -type f -printf "." | wc -c
Since you aren't actually using the name of the file for counting, replace it with a single-character string, then count the length of the resulting string.
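If the goal is a separate count per home directory (to spot which user is consuming the inodes), the loop and the -printf trick can be combined. This is a minimal sketch, assuming GNU find; count_files_per_dir is a name made up for this example, not something from the answers above.

```shell
# count_files_per_dir is a hypothetical helper name.
# Prints "<count> <directory>" for each subdirectory of the given parent.
count_files_per_dir() {
    parent=$1
    for dir in "$parent"/*/; do
        [ -d "$dir" ] || continue    # the pattern stays literal if nothing matches
        # -printf is a GNU find extension: one dot per file, so wc -c counts files
        # even when a file name contains a newline.
        count=$(find "$dir" -xdev -type f -printf '.' | wc -c | tr -d ' ')
        printf '%s %s\n' "$count" "$dir"
    done
}

# Example: list home directories by file count, largest first.
# count_files_per_dir /home | sort -rn
```

Piping the result through sort -rn puts the largest directory on top, which is usually the one worth inspecting first.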
I want to get the file count & file names & folder names in directory:
mkdir -p /tmp/test/folder1
mkdir -p /tmp/test/folder2
touch /tmp/test/file1
touch /tmp/test/file2
file_names=$(find "/tmp/test" -mindepth 1 -maxdepth 1 -type f -print0 | xargs -0 -I {} basename "{}")
echo $file_names
here is the output:
file2 file1
For folder:
folder_names=$(find "/tmp/test" -mindepth 1 -maxdepth 1 -type d -print0 | xargs -0 -I {} basename "{}")
echo $folder_names
here is the output:
folder2 folder1
For count:
file_count=0 && $(find "/tmp/test" -mindepth 1 -maxdepth 1 -type f -print0 | let "file_count=file_count+1")
echo $file_count
folder_count=0 && $(find "/tmp/test" -mindepth 1 -maxdepth 1 -type d -print0 | let "folder_count=folder_count+1")
echo $folder_count
The file_count and folder_count do not work.
Question 1:
How to get the correct file_count and folder_count?
Question 2:
Is it possible to get the names into an array and check the count from the array size?
The answer to the second question is really the answer to the first, too.
mapfile -d '' files < <( find /tmp/test \
-mindepth 1 -maxdepth 1 -type f \
-printf '%f\0')
echo "${#files[@]} files"
printf '%s\n' "${files[@]}"
The use of double quotes and @ in the array expansion are essential for printing file names with whitespace correctly. The use of a null byte terminator between file names ensures that even newlines in file names are disambiguated.
Notice also the use of -printf with a specific format string to avoid having to run basename separately. However, the -printf option and its various format strings, as well as the -print0 option you used, are a GNU find extension, and thus not portable. (Linux typically ships with GNU tools; on other platforms, they are obviously easy to install, but typically not installed out of the box.)
If you have an older version of Bash which doesn't support mapfile -d (the -d option was added in Bash 4.4), try an explicit loop:
files=()
while IFS= read -r -d $'\0' file; do
files+=("$file")
done < <(find ...)
If you don't have GNU find, a common workaround is to print a fixed string for each found file, and then the line or character count reliably reflects the number of found files.
find /tmp/test -type f \
-mindepth 1 -maxdepth 1 \
-exec printf . \; |
wc -c
Though then, how do you collect the file names? If (as in your case) you don't require recursion into subdirectories, simply loop over all items in the directory.
In which case, again, the number of items in the collected array will also tell you how many there are.
files=()
dirs=()
for item in /tmp/test/*; do
if [[ -f "$item" ]]; then
files+=("$item")
elif [[ -d "$item" ]]; then
dirs+=("$item")
fi
done
echo "${#dirs[@]} directories"
printf '- %s\n' "${dirs[@]}"
echo "${#files[@]} files"
printf '- %s\n' "${files[@]}"
For a further discussion, see also https://mywiki.wooledge.org/BashFAQ/020
Needlessly collecting items into an array so you can loop over it once is a common beginner antipattern. If you just want to process each item in isolation and then forget it, there is no need to waste memory on remembering all the other items - just loop over them directly.
As an aside, running find in a subprocess will create a new shell instance with its own set of variables; thus in your attempt, the pipe to let would increment from 0 to 1 each time you ran it (though of course, piping to let also does not do anything useful anyway).
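The subshell effect is easy to reproduce in isolation. The sketch below uses printf as a stand-in for find and two made-up variable names (piped_count, direct_count); it assumes bash in its default configuration, where every part of a pipeline runs in a subshell.

```shell
# Pipeline version: the while loop runs in a subshell, so the increments
# are lost as soon as the pipeline ends.
piped_count=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    piped_count=$((piped_count + 1))
done
echo "piped: $piped_count"

# Process substitution keeps the loop in the current shell, so the
# variable keeps its final value.
direct_count=0
while IFS= read -r line; do
    direct_count=$((direct_count + 1))
done < <(printf 'a\nb\nc\n')
echo "direct: $direct_count"
```

Under those assumptions the first echo should report 0 and the second 3, which is the same reason the original file_count never moved.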
I have a script as below.
mount=/transfer/input
rm src/cb.log
array=( $(find ${mount} -type f \( -name "Turbine_DATA*" -o -name "GT_Data*" -o -name "INSIGHTS_data*" \) -printf '%f\n' ))
for i in "${array[@]}";do
echo $i >>src/cb.log
Filecount=`find ${mount} -maxdepth 1 -type f -name "$i" | wc -l`
echo $Filecount>> src/cb.log;
done
I am facing below issue.
Although the file Turbine_DATA contains 3 lines of data, Filecount still shows 1. If I run wc -l on that file in the /transfer/input directory where it lives, I do get 3.
The point to note here that I am running the script from /NAS/Files directory.
Is this the problem here?
Filecount is the number of files, not the total number of lines in those files.
That's why it's named Filecount, not Linecount.
It's counting the number of files with the given name (I assume there can be multiple files of the same name in different subdirectories).
However, for a correct understanding you should probably consult the author of that script - it's not clear what information it should write to cb.log.
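If the log is meant to hold line counts instead, the counting has to read each file instead of listing it. A minimal sketch, assuming the files sit directly in the directory; line_counts is a hypothetical helper name, and the patterns are the ones from the question's script.

```shell
# line_counts is a hypothetical helper name.
# Usage: line_counts DIR PATTERN...
# Prints "<file name> <line count>" for each regular file in DIR matching a pattern.
line_counts() {
    dir=$1
    shift
    for pattern in "$@"; do
        for f in "$dir"/$pattern; do    # $pattern is unquoted on purpose so it globs
            [ -f "$f" ] || continue     # skips unmatched patterns and non-files
            lines=$(wc -l < "$f" | tr -d ' ')
            printf '%s %s\n' "${f##*/}" "$lines"
        done
    done
}

# Example with the names from the question:
# line_counts /transfer/input 'Turbine_DATA*' 'GT_Data*' 'INSIGHTS_data*' > src/cb.log
```

Because the directory is passed as an absolute path, the result does not depend on which directory the script is started from.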
I'm writing a script to check whether there actually is a directory that has content and a normal size, and whether there is a directory older than 36 hours; if not, it should alert me.
However I'm having trouble using the directories as variable.
When I execute the script it returns: ./test.sh: line 5: 1: No such file or directory.
I tried ALLDIR=$(ls /home/customers/*/) as well, but it returned the same error.
What am I doing wrong? Below is the script.
Thanks a lot in advance!!
#!/bin/bash
ALLDIR=$(find * /home/customers/*/ -maxdepth 2 -mindepth 2)
for DIR in ${ALLDIR}
do
if [[ $(find "$DIR" -maxdepth 1 -type d -name '*' ! -mtime -36 | wc -l = <1 ) ]]; then
mail -s "No back-ups found today at $DIR! Please check the issue!" test@example.com
exit 1
fi
done
for DIR in ${ALLDIR}
do
if [[ $(find "$DIR" -mindepth 1 -maxdepth 1 -type d -exec du -ks {} + | awk '$1 <= 50' | cut -f 2- ) ]]; then
mail -s "Backup directory size is too small for $DIR, please check the issue!" test@example.com
exit 1
fi
done
For a start, to loop through all directories a fixed level deep, use this:
for dir in /home/customers/*/*/*/
A pattern ending in a slash / will only match directories.
Note that dir is a lowercase variable name. Avoid uppercase names for your own variables, as they may clash with shell-internal or environment variables.
Next, your conditions are a bit broken - you don't need to use a [[ test here:
if ! find "$dir" -maxdepth 1 -type d ! -mtime -36 | grep -q .
If anything is found, find will print it and grep will quietly match anything, so the pipeline will exit successfully. The ! at the start negates the condition, so the if branch will only be taken when this doesn't happen, i.e. when nothing is found. -name '*' is redundant.
You can do something similar with the second if, removing the [[ and $() and using grep -q . to test for any output. I guess the cut part is redundant too.
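Put together, the two tests could look like the sketch below. The helper names any_old_dir and any_small_dir are made up for this example, and the echo lines stand in for the mail commands. One detail worth knowing: -mtime counts in days, so -mtime -36 means 36 days; the sketch uses -mmin (an extension available in GNU and BSD find), where 36 hours is 2160 minutes.

```shell
# any_old_dir and any_small_dir are hypothetical helper names.

# Succeeds if DIR contains a subdirectory last modified MINUTES or more ago.
any_old_dir() {
    find "$1" -mindepth 1 -maxdepth 1 -type d ! -mmin "-$2" | grep -q .
}

# Succeeds if DIR contains a subdirectory that uses at most LIMIT KiB.
any_small_dir() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec du -ks {} + |
        awk -v limit="$2" '$1 <= limit' | grep -q .
}

for dir in /home/customers/*/*/*/; do
    [ -d "$dir" ] || continue    # the pattern stays literal if nothing matches
    if ! any_old_dir "$dir" 2160; then
        echo "no directory older than 36 hours in $dir"
    fi
    if any_small_dir "$dir" 50; then
        echo "suspiciously small directory in $dir"
    fi
done
```

Whether the age test should be negated depends on what the alert is supposed to mean, so treat the conditions as placeholders to adjust.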
I think I don't understand very well how the find command in Unix works. I have this code for counting the number of files in each folder, but I also want to count the number of lines of each file found and save the total in a variable.
find "$d_path" -type d -maxdepth 1 -name R -print0 | while IFS= read -r -d '' file; do
nb_fichier_R="$(find "$file" -type f -maxdepth 1 -iname '*.R' | wc -l)"
nb_ligne_fichier_R= "$(find "$file" -type f -maxdepth 1 -iname '*.R' -exec wc -l {} +)"
echo "$nb_ligne_fichier_R"
done
output:
43 .//system d exploi/r-repos/gbm/R/basehaz.gbm.R
90 .//system d exploi/r-repos/gbm/R/calibrate.plot.R
45 .//system d exploi/r-repos/gbm/R/checks.R
178 total: File name too long
Can I save just the total number of lines in my variable? In my example that would be just 178, and I want to do that for the files in my folder "$d_path".
Many Thanks
Maybe I'm missing something, but wouldn't this do what you want?
wc -l R/*.[Rr]
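To keep only the total in a variable, one option is to concatenate the matching files and count the lines of the combined stream, which avoids parsing the "total" line that wc prints. A minimal sketch; total_lines is a hypothetical helper name, and -iname is the same non-POSIX find extension the question already uses.

```shell
# total_lines is a hypothetical helper name.
# Prints the combined line count of all *.R / *.r files directly inside DIR.
total_lines() {
    find "$1" -maxdepth 1 -type f -iname '*.R' -exec cat {} + | wc -l | tr -d ' '
}

# Example, using the variable names from the question:
# nb_ligne_fichier_R=$(total_lines "$d_path/R")
```

Note that there is no space after the = in the assignment; with a space, the shell tries to run the command substitution's output as a command, which is where the "File name too long" error comes from.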
Solution:
find "$d_path" -type d -maxdepth 1 -name R | while IFS= read -r file; do
nb_fichier_R="$(find "$file" -type f -maxdepth 1 -iname '*.R' | wc -l)"
echo "$nb_fichier_R" #here is fine
find "$file" -type f -maxdepth 1 -iname '*.R' | while IFS= read -r fille; do
wc -l "$fille" # quoted so that paths containing spaces are passed to wc as one argument
done
done
Explanation:
With -print0, the first find terminated each name with a null byte instead of a newline, so you had to pass -d '' to read to tell it not to look for a newline. Your subsequent finds output newline-terminated names, so you can use read without a custom delimiter. I removed -print0 and -d '' from all calls so that the code is consistent and idiomatic. Newlines are good in the Unix world.
For the command:
find "$d_path" -type d -maxdepth 1 -name R -print0
there can be at most one directory that matches ("$d_path/R"). For that one directory, you want to print:
The number of files matching *.R
For each such file, the number of lines in it.
Allowing for spaces in $d_path and in the file names is most easily handled, I find, with an auxiliary shell script. The auxiliary script processes the directories named on its command line. You then invoke that script from the main find command.
counter.sh
#!/bin/bash
shopt -s nullglob
for dir in "$@"
do
count=0
for file in "$dir"/*.R; do ((count++)); done
echo "$count"
wc -l "$dir"/*.R </dev/null
done
The shopt -s nullglob option means that if there are no .R files (with names that don't start with a .), then the glob expands to nothing rather than expanding to a string containing *.R at the end. It is convenient in this script. The I/O redirection on wc ensures that if there are no files, it reads from /dev/null, reporting 0 lines (rather than sitting around waiting for you to type something).
On the other hand, the find command will find names that start with a . as well as those that do not, whereas the globbing notation will not. The easiest way around that is to use two globs:
for file in "$dir"/*.R "$dir"/.*.R; do ((count++)); done
or use find (rather carefully):
find . -type f -name '*.R' -exec sh -c 'echo $#' arg0 {} +
Using counter.sh
find "$d_path" -type d -maxdepth 1 -name R -exec bash ./counter.sh {} +
This command allows for the possibility of more than one sub-directory (if you remove -maxdepth 1) and invokes counter.sh with all the directories to be examined as arguments. The script itself carefully handles file names so that whether there are spaces, tabs or newlines (or any other character) in the names, it will work correctly. It is run with bash rather than sh because shopt and ((count++)) are Bash features. The bash ./counter.sh part of the find command assumes that the counter.sh script is in the current directory. If it is executable, starts with a #!/bin/bash line, and can be found on $PATH, then you can drop the bash and the ./.
Discussion
The technique of having find execute a command with the list of file name arguments is powerful. It avoids issues with -print0 and using xargs -0, but gives you the same reliable handling of arbitrary file names, including names with spaces, tabs and newlines. If there isn't already a command that does what you need (but you could write one as a shell script), then do so and use it. If you might need to do the job more than once, you can keep the script. If you're sure you won't, you can delete it after you're done with it. It is generally much easier to handle files with awkward names like this than it is to fiddle with $IFS.
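For a one-off job, the same technique also works with the script body given inline through sh -c, so no separate file is needed. A minimal sketch under that assumption; lines_per_file is a hypothetical helper name, and the extra sh before {} only fills $0 of the inline script.

```shell
# lines_per_file is a hypothetical helper name.
# Prints "<line count> <path>" for every *.R file below the given directory.
lines_per_file() {
    find "$1" -type f -name '*.R' -exec sh -c '
        for f in "$@"; do
            # wc reads from stdin so that it prints only the number
            printf "%s %s\n" "$(wc -l < "$f" | tr -d " ")" "$f"
        done
    ' sh {} +
}

# Example:
# lines_per_file "$d_path/R"
```

The file names arrive as positional parameters, so they never pass through word splitting, and "$@" hands each one to the loop intact.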
Consider this solution:
# If `"$dir"/*.R` doesn't match anything, yield nothing instead of giving the pattern.
shopt -s nullglob
# Allows matching both `*.r` and `*.R` in one expression. Using them separately would
# give double results.
shopt -s nocaseglob
while IFS= read -ru 4 -d '' dir; do
files=("$dir"/*.R)
echo "${#files[@]}"
for file in "${files[@]}"; do
wc -l "$file"
done
# Use process substitution to prevent going to a subshell. This may not be
# necessary for now but it could be useful to future modifications.
# Let's also use a custom fd to keep troubles isolated.
# It works with `-u 4`.
done 4< <(exec find "$d_path" -type d -maxdepth 1 -name R -print0)
Another form is to use readarray, which reads all found directories into an array at once. The only caveat is that it can only read normal newline-terminated paths.
shopt -s nullglob
shopt -s nocaseglob
readarray -t dirs < <(exec find "$d_path" -type d -maxdepth 1 -name R)
for dir in "${dirs[@]}"; do
files=("$dir"/*.R)
echo "${#files[@]}"
for file in "${files[@]}"; do
wc -l "$file"
done
done
I have a bash script, created by someone else, that I need to modify a little.
Since I'm new to Bash, I may need a little help with some common commands.
The script simply loops through a directory (recursively) for a specific file extension.
Here's the current script: (runme.sh)
#! /bin/bash
SRC=/docs/companies/
function report()
{
echo "-----------------------"
find $SRC -iname "*.aws" -type f -print
echo -e "\033[1mSOURCE FILES=\033[0m" `find $SRC -iname "*.aws" -type f -print |wc -l`
echo "-----------------------"
exit 0
}
report
I simply type #./runme.sh and I can see a list of all files with the extension of .aws
My primary goal is to limit the search. (some directories have way too many files)
I would like to run the script, limiting it to just 20 files.
Do I need to place the entire script into a loop method?
That's easy -- as long as you want the first 20 files, just pipe the first find command through head -n 20. But I can't resist a little cleanup while I'm at it: as written, it runs find twice, once to print the filenames and once to count them; if there are a lot of files to search, this is a waste of time. Second, wrapping the actual content of the script in a function (report) doesn't make much sense, and having the function exit (rather than returning) makes even less. Finally, I like to protect filenames with double-quotes and hate backquotes (use $() instead). So I took the liberty of a bit of cleanup:
#! /bin/bash
SRC=/docs/companies/
files="$(find "$SRC" -iname "*.aws" -type f -print)"
if [ -n "$files" ]; then
count="$(echo "$files" | wc -l)"
else # echo would print one line even if there are no files, so special-case the empty list
count=0
fi
echo "-----------------------"
echo "$files" | head -n 20
echo -e "\033[1mSOURCE FILES=\033[0m $count"
echo "-----------------------"
Use head -n 20 (as proposed by Peter). Additional remark: the script is very inefficient, as it runs find twice. You should consider using tee to generate a temporary file when the command runs for the first time, count the lines of this file afterwards and delete the file.
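The temporary-file idea can be sketched as follows. The sketch writes the list with a plain redirection instead of tee, because when tee is piped straight into head, head exits after 20 lines and tee can be terminated by SIGPIPE before the list is complete. report_limited is a hypothetical helper name.

```shell
# report_limited is a hypothetical helper name.
# Usage: report_limited DIR LIMIT
# Shows at most LIMIT matching files, followed by the total number found.
report_limited() {
    src=$1
    limit=$2
    list=$(mktemp) || return 1
    # Run find once and keep the complete list in a temporary file.
    find "$src" -type f -iname '*.aws' -print > "$list"
    echo "-----------------------"
    head -n "$limit" "$list"
    echo "SOURCE FILES=$(wc -l < "$list" | tr -d ' ')"
    echo "-----------------------"
    rm -f "$list"
}

# Example:
# report_limited /docs/companies/ 20
```

Like the other line-based approaches here, it assumes file names without embedded newlines.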
I would personally prefer to do it like this:
files=0
echo "-----------------------"
# -print (not -print0) so that head and read both see one file name per line
while IFS= read -r file ; do
files=$((files + 1))
echo "$file"
done < <(find "$SRC" -iname "*.aws" -type f -print | head -n 20)
echo -e "\033[1mSOURCE FILES=\033[0m $files"
echo "-----------------------"
If you just want their count, you could simply use find "$SRC" -iname "*.aws" -type f -print | head -n 20 | wc -l