How to locate the directory where the sum of the number of lines of regular files is greatest (in bash)

Hi, I'm new to Unix and bash, and I'd like to ask how to do the following:
The directories are given as arguments. Locate the directory
where the sum of the number of lines of regular files is greatest.
Browse all the specified directories and their subdirectories. The sum
counts only files that are directly in the directory.
I tried something, but it's not working properly.
while [ $# -ne 0 ];
do case "$1" in
-h) show_help ;;
-*) echo "Error: Wrong arguments" 1>&2; exit 1 ;;
*) directories=("$@"); break ;;
esac
shift
done
IFS='
'
amount=0
for direct in "${directories[@]}"; do
for subdirect in `find $direct -type d `; do
temp=`find "$subdirect" -type f -exec cat {} \; | wc -l | tr -s " "`
if [ $amount -lt $temp ]; then
amount=$temp
subdirect2=$subdirect
fi
done
echo Output: "'"$subdirect2$amount"'"
done
The problem shows up when I pass this directory as the argument (just an example):
/home/usr/first
and it contains these files:
/home/usr/first/tmp/first.txt (50 lines)
/home/usr/first/tmp/second.txt (30 lines)
/home/usr/first/tmp1/one.txt (20 lines)
The output is /home/usr/first/tmp1 100, which is wrong; it should be /home/usr/first/tmp 80.
I'd like to scan all the directories and all their subdirectories in depth. Also, if multiple directories reach the maximum, all of them should be listed.

Given your sample files, I'm going to assume you only want to look at the immediate subdirectories, not recurse down several levels:
max=-1
# the trailing slash limits the wildcard to directories only
for dir in */; do
count=0
for file in "$dir"/*; do
[[ -f "$file" ]] && (( count += $(wc -l < "$file") ))
done
if (( count > max )); then
max=$count
maxdir="$dir"
fi
done
echo "files in $maxdir have $max lines"
files in tmp/ have 80 lines
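The question also asks to descend into every subdirectory and to list all directories that tie for the maximum. Here is a minimal sketch of that variant, assuming bash and a find that supports -print0. The function name busiest_dirs is hypothetical, and like the loop above it ignores hidden files because of the * glob.

```shell
#!/usr/bin/env bash
# busiest_dirs (hypothetical name): print every directory under the given
# roots whose directly contained regular files have the greatest combined
# line count, followed by that count.
busiest_dirs() {
    local max=-1 dir file count n
    local -a best=()
    # -print0 with read -d '' keeps directory names containing spaces intact
    while IFS= read -r -d '' dir; do
        count=0
        for file in "$dir"/*; do
            # count only regular files directly in $dir, not symlinks
            if [[ -f $file && ! -L $file ]]; then
                n=$(wc -l < "$file")
                (( count += n ))
            fi
        done
        if (( count > max )); then
            max=$count
            best=("$dir")          # new maximum: restart the list
        elif (( count == max )); then
            best+=("$dir")         # tie: remember this directory too
        fi
    done < <(find "${@:-.}" -type d -print0)
    for dir in "${best[@]}"; do
        printf '%s %s\n' "$dir" "$max"
    done
}
```

With the sample files from the question, busiest_dirs /home/usr/first prints /home/usr/first/tmp 80.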

In the spirit of Unix (cough), here's an absolutely disgusting chain of pipes that I personally hate, but it's a lot of fun to construct :):
find . -mindepth 1 -maxdepth 1 -type d -exec sh -c 'find "$1" -maxdepth 1 -type f -print0 | wc -l --files0-from=- | tail -1 | { read a _ && echo "$a $1"; }' _ {} \; | sort -nr | head -1
Of course, don't use this unless you're mentally ill, use glenn jackman's nice answer instead.
You can have great control on find's unlimited filtering possibilities, too. Yay. But use glenn's answer!

Related

Create a new folder named the next count

I have a folder of projects, which labels projects based on a number. It starts at 001 and continues counting. I have a bash script I run through Alfred, however, I currently have to type the name of the folder.
QUERY={query}
mkdir /Users/admin/Documents/projects/"$QUERY"
I would like to have the script automatically name the folder to the next number.
For example, if the newest folder is "019" then I would like it to automatically name it to "020"
This is what I've whipped up so far:
nextNum = $(find ~/documents/projects/* -maxdepth 1 -type d -print| wc -l)
numP = nextNum + 1
mkdir /Users/admin/Documents/projects/00"$numP"
I'm not sure if my variable syntax is correct, or if variables are the best way to do this. I am a complete noob to bash so any help is appreciated.
This might be what you're looking for:
#!/bin/bash
cd /Users/admin/Documents/projects || exit
if ! [[ -d 000 ]]; then mkdir 000; exit; fi
dirs=([0-9][0-9][0-9])
mkdir $(printf %03d $(( 1 + 10#${dirs[${#dirs[*]} - 1]} )) )
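The 10# prefix in that last line is what makes the arithmetic safe: without it, bash treats a number with a leading zero as octal, so 008 or 019 is rejected inside $(( )). A small sketch of just that step, where next_name is a hypothetical helper and not part of the answer above:

```shell
#!/usr/bin/env bash
# next_name (hypothetical helper): given the highest existing directory
# name, print the next zero-padded three-digit name.
next_name() {
    local last=${1:-000}
    # 10# forces base 10; a bare 008 or 019 would be an invalid octal number
    printf '%03d\n' "$(( 10#$last + 1 ))"
}

next_name 019    # prints 020
next_name 008    # prints 009
```

Past 999 the name simply grows to four digits, since %03d sets a minimum width only.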
See comments in code to understand the code.
#!/bin/bash
# Configure your project dir
projects_dir=/Users/admin/Documents/projects
# Find the last project number.
# sort | tail picks the highest existing number, and sed strips its leading
# zeros so that $(( )) does not read it as octal. Counting entries with wc
# gives the same result only if the numbering has no gaps.
last_project=$(\ls "$projects_dir" | sort -n | tail -n1 | sed 's/^0*//')
# Use printf to add the leading zeros.
# ${last_project:-0} substitutes 0 when $last_project is unset or empty,
# so this works even if the project directory is empty.
# $(( expr )) evaluates an arithmetic expression. The ${var:-default} form
# needs a variable name, which is why the pipeline result is stored in
# $last_project first.
mkdir "$projects_dir/$(printf "%03d" $((${last_project:-0} + 1)))"
Although unlikely, the issue with the above script is that a project count over 999 will break the 3-digit directory name convention. A simple fix is to use a 4-digit convention. A directory with a very large number of files or subdirectories is not recommended anyway, so if you do reach 9999 projects, it's best to continue in a second project directory.
Give a try to this funny one:
projects_dir="/Users/admin/Documents/projects/"
[[ -n "${HOME}" ]] && projects_dir="${projects_dir/#~/$HOME}"
mkdir "${projects_dir}"/$(printf "%03d" $(find "${projects_dir}" -maxdepth 1 -type d -printf . | wc -c))
(EDIT)
Expanded version with bash trace output:
set -x
projects_dir="/Users/admin/Documents/projects/"
[[ -n "${HOME}" ]] && projects_dir="${projects_dir/#~/$HOME}"
cd "${projects_dir}" || exit 1
printf "each dot is a directory: "
find "${projects_dir}" -maxdepth 1 -type d -printf .
printf "\n"
let next_directory_id=$(find "${projects_dir}" -maxdepth 1 -type d -printf . | wc -c)
next_directory_name=$(printf "%03d" "${next_directory_id}")
mkdir "${projects_dir}"/"${next_directory_name}"
set +x
targ_number=$(find ~/documents/projects/ -maxdepth 1 -type d -print| wc -l)
mkdir /Users/admin/Documents/projects/$(printf "%03d" $targ_number)

Find and count the results, delete if it is less than x

I would like to search a directory and all its subdirectories for files that are structured like this: ABC.001.XYZ, ABC.001.DEF, ABC.002.XYZ and so forth.
It should search for all files beginning with ABC.001, count the results, and if it is less than x, delete all files beginning with that. Then move on to ABC.002 and so on.
dir = X
counter=1
while [ $counter -le 500 ]
do
if [find ${dir} -type f -name 'ABC*' | wc -l -eq 5]
then
for file in $(find ${dir} -type f -name 'ABC*')
do
/bin/rm -i ${file}
fi
((counter++))
done
My question is
I. how do I plug in the variable counter for -name 'ABC*' so it increments up. (Like a string placeholder)
II. How would I make it so if the counter is less than 10 or 100, I place 00 or 0 before the counter, so it would actually search for ABC001*, instead of ABC1*
You can use printf to print formatted numbers as in most languages:
printf "ABC%03d" "$counter"
Simple command substitution can put this into the arguments to find. It is also worth mentioning that find can delete files directly. As a matter of personal preference, a for loop is probably neater than the while loop.
#!/bin/bash
dir=X
for counter in $(seq 1 500); do
if [[ $(find "$dir" -type f -name "$(printf "ABC%03d*" "$counter")" | wc -l) -eq 5 ]]; then
find "$dir" -type f -name "$(printf "ABC%03d*" "$counter")" -delete
fi
done
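The loop above runs find up to a thousand times. As an alternative, here is a sketch that walks the tree once, tallies files per prefix, and then deletes the groups below a threshold. It assumes names of the form ABC.NNN.anything as in the question, bash 4 or later for associative arrays, and a find with -print0; prune_small_groups is a hypothetical name.

```shell
#!/usr/bin/env bash
# prune_small_groups DIR MIN (hypothetical helper): delete every file whose
# ABC.NNN group contains fewer than MIN files anywhere under DIR.
prune_small_groups() {
    local dir=$1 min=$2 path name prefix
    local -A count=()
    local -a files=()
    # First pass: a single find call, tallying files per ABC.NNN prefix
    while IFS= read -r -d '' path; do
        name=${path##*/}              # file name without directories
        prefix=${name:0:7}            # "ABC.001" is 7 characters long
        files+=("$path")
        count[$prefix]=$(( ${count[$prefix]:-0} + 1 ))
    done < <(find "$dir" -type f -name 'ABC.[0-9][0-9][0-9].*' -print0)
    # Second pass: remove members of groups smaller than the threshold
    for path in "${files[@]}"; do
        name=${path##*/}
        if (( ${count[${name:0:7}]} < min )); then
            rm -- "$path"
        fi
    done
}
```

For a dry run, replace rm with echo rm before pointing it at real data.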

rename files in a folder using find shell

I have n files in different folders, like abc.mp3 acc.mp3 bbb.mp3, and I want to rename them to 01-abc.mp3, 02-acc.mp3, 03-bbb.mp3... I tried this:
#!/bin/bash
IFS='
'
COUNT=1
for file in ./uff/*;
do mv "$file" "${COUNT}-$file" let COUNT++ done
but I keep getting errors such as a syntax error near 'do' and sometimes "for not found"... Can someone provide a single-line solution to this using "find" from the terminal? I'm looking for a solution using find only, due to certain constraints. Thanks in advance.
I'd probably use:
#!/bin/bash
cd ./uff || exit 1
COUNT=1
for file in *.mp3;
do
mv "$file" "$(printf "%.2d-%s" "$COUNT" "$file")"
((COUNT++))
done
This avoids a number of issues and also includes a 2-digit number for the first 9 files (the next 90 get 2-digit numbers anyway, and after that you get 3-digit numbers, etc).
you can try this;
#!/bin/bash
COUNT=1
for file in ./uff/*;
do
path=$(dirname "$file")
filename=$(basename "$file")
if [ $COUNT -lt 10 ]; then
mv "$file" "$path"/0"${COUNT}-$filename";
else
mv "$file" "$path"/"${COUNT}-$filename";
fi
COUNT=$(($COUNT+1));
done
Eg:
user@host:/tmp/test$ ls uff/
abc.mp3 acc.mp3 bbb.mp3
user@host:/tmp/test$ ./test.sh
user@host:/tmp/test$ ls uff/
01-abc.mp3 02-acc.mp3 03-bbb.mp3
Ok, here's the version without loops:
paste -d'\n' <(printf "%s\n" *) <(printf "%s\n" * | nl -w1 -s-) | xargs -d'\n' -n2 mv -v
You can also use find if you want:
paste -d'\n' <(find -mindepth 1 -maxdepth 1 -printf "%f\n") <(find -mindepth 1 -maxdepth 1 -printf "%f\n" | nl -w1 -s-) | xargs -d'\n' -n2 mv -v
Replace mv with echo mv for the "dry run":
paste -d'\n' <(printf "%s\n" *) <(printf "%s\n" * | nl -w1 -s-) | xargs -d'\n' -n2 echo mv -v
Here's a solution.
i=1
for f in $(find ./uff -mindepth 1 -maxdepth 1 -type f | sort)
do
n=$i
[ $i -lt 10 ] && n="0$i"
echo "$f" "$n-$(basename "$f")"
((i++))
done
And here it is as a one-liner (but in real life if you ever tried anything remotely like what's below in a coding or ops interview you'd not only fail to get the job, you'd probably give the interviewer PTSD. They'd wake up in cold sweats thinking about how terrible your solution was).
i=1; for f in $(find ./uff -mindepth 1 -maxdepth 1 -type f | sort); do n=$i; [ $i -lt 10 ] && n="0$i"; echo "$f" "$n-$(basename "$f")" ; ((i++)); done
Alternatively, you could just cd ./uff if you want to rename them in the same directory, and then use find . (along with the other find arguments) to tidy everything up. I'm assuming you only want files renamed, not directories, and that you don't want to rename files or directories recursively.
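The for f in $(find ...) loops above split on whitespace, so a name like "my song.mp3" would be broken apart. Here is a sketch of a find-based variant that passes names null-delimited and actually performs the rename. It assumes a find with -print0 and a sort with -z (GNU, for example); number_files is a hypothetical name.

```shell
#!/usr/bin/env bash
# number_files DIR (hypothetical helper): prefix every regular file directly
# inside DIR with a two-digit counter, in sorted order.
number_files() {
    local dir=$1 i=1 path
    while IFS= read -r -d '' path; do
        # ${path##*/} is the file name without its directory part
        mv -- "$path" "$dir/$(printf '%02d' "$i")-${path##*/}"
        (( i++ ))
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -type f -print0 | sort -z)
}
```

Because sort has to read all of its input before printing anything, find has finished listing the directory before the first mv runs, so renamed files are not picked up a second time.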

How to find files and count them (storing the info into a variable)?

I want to have a conditional behavior depending on the number of files found:
found=$(find . -type f -name "$1")
numfiles=$(printf "%s\n" "$found" | wc -l)
if [ $numfiles -eq 0 ]; then
echo "cannot access $1: No such file" > /dev/stderr; exit 2;
elif [ $numfiles -gt 1 ]; then
echo "cannot access $1: Duplicate file found" > /dev/stderr; exit 2;
else
echo "File: $(ls $found)"
head $found
fi
EDITED CODE (to reflect more precisely what I need)
Though, numfiles isn't equal to 2(or more) when there are duplicate files found...
All the filenames are on one line, separated by a space.
On the other hand, this works correctly:
find . -type f -name "$1" | wc -l
but I don't want to do twice the recursive search in the if/then/else construct...
Adding -print0 doesn't help either.
What would?
PS- Simplifications or improvements are always welcome!
You want to find files and count the files with a name "$1":
find . 2>/dev/null | grep -c "/${1}$"
And store the result in a var. In one command:
numfiles=$(grep -c "/${1}$" <(find . 2>/dev/null))
Using $() to store data in a variable trims trailing newlines. Since the final newline does not appear in the variable found, wc miscounts by one. You can restore the trailing newline with:
numfiles=$(printf "%s\n" "$found" | wc -l)
This miscounts if found is empty (it reports 1 instead of 0) and if any filenames contain a newline, emphasizing the fact that this entire approach is faulty. If you really want to go this way, you can try:
numfiles=$(test -z "$found" && echo 0 || printf "%s\n" "$found" | wc -l)
or pipe the output of find to a script that counts the output and prints a count along with the first filename:
find . -type f -name "$1" | tr '\n' ' ' |
awk '{c=NF; f=$1 } END {print c, f; exit c!=1}' c=0 |
while read count name; do
case $count in
0) echo no files >&2;;
1) echo 1 file $name;;
*) echo Duplicate files >&2;;
esac;
done
All of these solutions fail miserably if any pathnames contain whitespace. If that matters, you could change the awk to a perl script to make it easier to handle null separators and use -print0, but really I think you should stop worrying about special cases. (find -exec and find | xargs both fail to handle the case of 0 matching files cleanly. Arguably this awk solution also doesn't handle it cleanly.)
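If bash 4.4 or later is available, one way to run the search only once and still cope with whitespace is to read find's null-delimited output into an array and branch on its length. This is a sketch under that assumption, with find_unique as a hypothetical name; it is not taken from the answers above.

```shell
#!/usr/bin/env bash
# find_unique NAME (hypothetical helper): search once, keep the matches in an
# array, and branch on how many there are. Needs bash 4.4+ for mapfile -d.
find_unique() {
    local -a found=()
    # -print0 and -d '' keep names with spaces or newlines as single entries
    mapfile -d '' -t found < <(find . -type f -name "$1" -print0)
    case ${#found[@]} in
        0) echo "cannot access $1: No such file" >&2; return 2 ;;
        1) printf 'File: %s\n' "${found[0]}" ;;
        *) echo "cannot access $1: Duplicate file found" >&2; return 2 ;;
    esac
}
```

The single match is available as "${found[0]}", so a command such as head can be run on it without a second search.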

Bash script to list files not found

I have been looking for a way to list the files that do not exist from a list of files that are required to exist. The files can exist in more than one location. What I have now:
#!/bin/bash
fileslist="$1"
while read fn
do
if [ ! -f `find . -type f -name $fn ` ];
then
echo $fn
fi
done < $fileslist
If a file does not exist the find command will not print anything and the test does not work. Removing the not and creating an if then else condition does not resolve the problem.
How can i print the filenames that are not found from a list of file names?
New script:
#!/bin/bash
fileslist="$1"
foundfiles="$HOME/tmp/tmp$(date +%Y%m%d%H%M%S).txt"
touch "$foundfiles"
while read -r fn
do
find . -type f -name "$fn" | sed 's:.*/::' >> "$foundfiles"
done < $fileslist
cat $fileslist $foundfiles | sort | uniq -u
rm $foundfiles
#!/bin/bash
fileslist="$1"
while read fn
do
FPATH=`find . -type f -name $fn`
if [ "$FPATH." = "." ]
then
echo $fn
fi
done < $fileslist
You were close!
Here is test.bash:
#!/bin/bash
fn=test.bash
exists=`find . -type f -name $fn`
if [ -n "$exists" ]
then
echo Found it
fi
It sets $exists to the output of find. The -n test in the if checks that the result is not empty.
Try replacing body with [[ -z "$(find . -type f -name $fn)" ]] && echo $fn. (note that this code is bound to have problems with filenames containing spaces).
More efficient bashism:
diff <(sort $fileslist|uniq) <(find . -type f -printf %f\\n|sort|uniq)
I think you can handle diff output.
Give this a try:
find -type f -print0 | grep -Fzxvf - requiredfiles.txt
The -print0 and -z protect against filenames which contain newlines. If your utilities don't have these options and your filenames don't contain newlines, you should be OK.
The repeated find to filter one file at a time is very expensive. If your file list is directly compatible with the output from find, run a single find and remove any matches from your list:
find . -type f |
fgrep -vxf - "$1"
If not, maybe you can massage the output from find in the pipeline before the fgrep so that it matches the format in your file; or, conversely, massage the data in your file into find-compatible.
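As an illustration of massaging find's output, here is a sketch for the case where the list holds bare file names, one per line. The sed step strips the directory part so both sides have the same shape. list_missing is a hypothetical name, and the sketch assumes no file name contains a newline.

```shell
#!/usr/bin/env bash
# list_missing LIST [ROOT] (hypothetical helper): print the names from LIST
# that do not occur as a regular file anywhere under ROOT (default: .).
list_missing() {
    local list=$1 root=${2:-.}
    # One find for the whole tree; sed reduces each path to its file name;
    # grep -F (fixed strings) -x (whole line) -v (invert) -f - (patterns
    # from stdin) then prints the list lines that match none of them.
    find "$root" -type f | sed 's:.*/::' | grep -Fvxf - "$list"
}
```

The exit status comes from grep, so it is 1 when nothing is missing.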
I use this script and it works for me
#!/bin/bash
fileslist="$1"
found="Found:"
notfound="Not found:"
len=`cat $1 | wc -l`
n=0;
while read fn
do
# don't worry about this, i use it to display the file list progress
n=$((n + 1))
echo -en "\rLooking $(echo "scale=0; $n * 100 / $len" | bc)% "
if [ $(find / -name $fn | wc -l) -gt 0 ]
then
found=$(printf "$found\n\t$fn")
else
notfound=$(printf "$notfound\n\t$fn")
fi
done < $fileslist
printf "\n$found\n$notfound\n"
The following test counts the lines printed by find; if the count is greater than 0, the file was found. It searches the whole disk; you could replace / with . to search only the current directory.
$(find / -name $fn | wc -l) -gt 0
Then I simply run it with a file list that has one file name per line:
./search.sh files.list
