How to list files with words exceeding n characters in all subdirectories - bash

I have to write a shell script that creates a file containing the names of all text files in a folder (given as a parameter) and its subfolders that contain words longer than n characters (n is read from the keyboard).
I have written the following code so far:
#!/bin/bash
Verifies if the first given parameter is a folder:
if [ ! -d $1 ]
then echo $1 is not a directory\!
exit 1
fi
Reading n
echo -n "Give the number n: "
read n
echo "You entered: $n"
Destination where to write the name of the files:
destinatie="destinatie"
The part that I think is causing the problems:
nr=0;
#while read line;
#do
for fisier in `find $1 -type f`
do
counter=0
for word in $(<$fisier);
do
file=`basename "$fisier"`
length=`expr length $word`
echo "$length"
if [ $length -gt $n ];
then counter=$(($counter+1))
fi
done
if [ $counter -gt $nr ];
then echo "$file" >> $destinatie
fi
done
break
done
exit
The script works, but it does a few more steps than I need. It seems to read some files more than once. Can anyone help me, please?

Does this help?
egrep -lr "\w{$n,}" "$1"/* >"$destinatie"
Some explanation:
\w means: a character that words consist of
{$n,} means: number of consecutive characters is at least $n
Option -l lists files and does not print the grepped text and -r performs a recursive scan on your directory in $1
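As a rough illustration of the explanation above, here is a minimal sketch that wraps the search in a function. The name list_long_word_files and the demo files are hypothetical, and [[:alnum:]_] is used as a portable stand-in for the GNU \w extension. Keep in mind that {n,} matches words of at least n characters; for words strictly longer than n you would pass n+1.

```shell
# Hypothetical helper: print the files under directory $1 that contain
# a word of at least $2 characters.
list_long_word_files() {
    dir=$1
    min_len=$2
    # -r recurses, -l prints only file names, -E enables {n,} without backslashes
    grep -rlE "[[:alnum:]_]{${min_len},}" "$dir"
}

# Demo data (assumed for illustration only)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'short words only\n' > "$demo_dir/a.txt"
printf 'an extraordinarily long word\n' > "$demo_dir/sub/b.txt"

result=$(list_long_word_files "$demo_dir" 10)
echo "$result"
```

Only sub/b.txt is listed here, because a.txt has no word of 10 or more characters.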
Edit:
a bit more complete version around the egrep command:
#!/bin/bash
die() { echo "$@" 1>&2 ; exit 1; }
[ -z "$1" ] && die "which directory to scan?"
dir="$1"
[ -d "$dir" ] || die "$dir isn't a directory"
echo -n "Give the number n: "
read n
echo "You entered: $n"
[ "$n" -le 0 ] && die "the number should be > 0"
destinatie="destinatie"
egrep -lr "\w{$n,}" "$dir"/* | while IFS= read -r f; do basename "$f"; done >"$destinatie"

This code has syntax errors, probably leftovers from your commented-out while loop. It would be best to remove the last 3 lines: done causes the error, and break and exit are unnecessary because there is nothing to break out of and the program terminates at its end anyway.
The program appears to output files multiple times because you just append to $destinatie. You could simply delete that file when you start:
rm -f "$destinatie"
You echo the numbers to stdout (echo "$length") and the file names to $destinatie (echo "$file" >> $destinatie). I do not know if that is intentional.
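An alternative to deleting the file first is to redirect the whole loop once with >, which truncates the file on every run. The following is only a sketch: write_names and the temporary file stand in for the real loop and for $destinatie.

```shell
out=$(mktemp)   # stands in for $destinatie (assumption for the demo)

write_names() {
    # The redirection applies to the whole loop, so the file is
    # truncated once per run instead of being appended to.
    for name in "$@"; do
        printf '%s\n' "$name"
    done > "$out"
}

# Running it twice does not duplicate the entries
write_names a.txt b.txt
write_names a.txt b.txt
line_count=$(wc -l < "$out" | tr -d ' ')
echo "$line_count lines in output file"
```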

I found the problem: it was the directory I was searching. Because I had worked on the files in that directory and modified them, some files remained that were not displayed in the file explorer but were still found by the script. I created another directory, passed it as the parameter, and it works. Thank you for your answers.

Related

how to continue with the loop even though we use exit for a condition in shell script

I have the following list.txt file:
cat list.txt
one
two
zero
three
four
I have a shell script (check.sh) like below,
for i in $(cat list.txt)
do
if [ $i != zero ]; then
echo "the number is $i"
else
exit 1
fi
done
it gives output like below,
./check.sh
the number is one
the number is two
I want the script to skip zero and continue with the remaining items in list.txt,
e.g.
the number is one
the number is two
the number is three
the number is four
I tried using "return", but it did not work and gave an error:
./check.sh: line 6: return: can only `return' from a function or sourced script
About exit (and return)
The exit command terminates the running script; there is no way to continue after it.
Likewise, the return command leaves the current function (or sourced script); nothing after it in that function runs.
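To illustrate where return does work, here is a small sketch that moves the check into a function. The function name check_item is hypothetical; the loop skips an item whenever the function returns a non-zero status.

```shell
# Hypothetical helper: returns 1 for the unwanted entry, otherwise prints it
check_item() {
    if [ "$1" = "zero" ]; then
        return 1        # leaves only the function, not the script
    fi
    echo "the number is $1"
}

output=$(
    for i in one two zero three four; do
        check_item "$i" || continue
    done
)
echo "$output"
```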
About reading input file
To process a line-based input file, it is better to use while read than for i in $(cat ...):
Simply try:
while read -r i;do
if [ "$i" != "zero" ] ;then
echo number $i
fi
done <list.txt
Alternatively, you could drop unwanted entries before loop:
while read -r i;do
echo number $i
done < <( grep -v ^zero$ <list.txt)
Note: In this specific case, ^zero$ doesn't need to be quoted. Consider quoting if your pattern contains special characters or spaces.
If you have more than one entry to drop, you could use
while read -r i;do echo number $i ;done < <(grep -v '^\(zero\|null\)$' <list.txt)
Alternatively, once the input file is filtered, use xargs:
If your processing is a single command, you can avoid the bash loop by using xargs:
xargs -n 1 echo number < <(grep -v '^\(zero\|null\)$' <list.txt)
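If the entries to drop are plain strings rather than patterns, a possible variation is grep -F (fixed strings) combined with -x (whole-line match), which avoids regex escaping altogether. This is a sketch with a temporary file standing in for list.txt:

```shell
list_file=$(mktemp)    # stands in for list.txt (assumption for the demo)
printf '%s\n' one two zero three null four > "$list_file"

# -v inverts the match, -x matches whole lines only, -F treats the
# arguments as literal strings, -e gives one string per option
filtered=$(grep -vxF -e zero -e null "$list_file")
echo "$filtered"
```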
How to use continue in bash script
Maybe you are thinking about something like:
while read -r i;do
if [ "$i" = "zero" ] ;then
continue
fi
echo number $i
done <list.txt
The argument of continue is a number giving how many enclosing loop levels it applies to (the default is 1).
Try this:
for i in {1..5};do
for l in {a..d};do
if [ "$i" -eq 3 ] && [ "$l" = "b" ] ;then
continue 2
fi
echo $i.$l
done
done
(This prints 3.a and then stops the 3 series at 3.b: continue 2 jumps to the next iteration of the outer loop, so 3.b, 3.c and 3.d are not printed.)
Then compare with
for i in {1..5};do
for l in {a..d};do
if [ "$i" -eq 3 ] && [ "$l" = "b" ] ;then
continue 1
fi
echo $i.$l
done
done
(This prints 3.a, 3.c and 3.d. Only 3.b is skipped, because continue 1 affects only the inner loop.)

Shell script with absolute path and control errors

I was doing this little script in which the first argument must be a path to an existing directory and the second any other thing.
Each object in the directory given as the first argument must be renamed so that its new
name is the original name prefixed with the string passed as the second argument. For example, for the string "hello", the object OBJECT1 is renamed hello.OBJECT1, and so on.
Additionally, if an object with the new name is already present, a message is printed to standard error and the operation is skipped, continuing with the next object.
I have the following done:
#! /bin/bash
if [ "$#" != 2 ]; then
exit 1
else
echo "$2"
if [ -d "$1" ]; then
echo "directory"
for i in $(ls "$1")
do
for j in $(ls "$1")
do
echo "$i"
if [ "$j" = "$2"."$i" ]; then
exit 1
else
mv -v "$i" "$2"."$i"
echo "$2"."$i"
fi
done
done
else
echo "no"
fi
fi
I have problems if I run the script from a directory other than the one I want to change. For example, if I am in /home/pp and want the changes made in /home/pp/rr, it does not work; it only works on the current directory.
I tried changing the ls to capture the whole path with
ls -R | sed "s;^;pwd;" but the path comes out wrong.
I can't use find because it puts the path in front of each name instead of leaving just the file name.
Another question: to check that the new name does not already exist, I use two for loops, but I get bash errors for all files, not just for the matches.
I'm just starting with scripting, so the solution needs to be very simple.
An obvious answer to your question would be to put a cd "$1" in the script to make it work. However, there are some opportunities for improvement in this script.
#! /bin/bash
if [ "$#" != 2 ]; then
You might put an error message here, for example, echo "Usage: $0 dir prefix" or even a more elaborate help text.
exit 1
else
echo $2
Please quote, as in echo "$2".
if [ -d $1 ]; then
Here, the quotes are important. Suppose that your directory name has a space in it; then this if would fail with bash: [: a: binary operator expected. So, put quotes around the $1: if [ -d "$1" ]; then
echo "directory"
This is where you could insert the cd "$1".
for i in $(ls $1)
do
It is almost always a bad idea to parse the output of ls. Once again, this for-loop will fail if a file name has a space in it. A possible improvement would be for i in "$1"/* ; do.
for j in $(ls $1)
do
echo $i
if [ $j = $2.$i ]; then
exit 1
else
The logic of this section seems to be: if a file with the prefix exists, then exit instead of overwriting. It is always a good idea to tell why the script fails; an echo before the exit 1 will be helpful.
The question is why you use the second loop. A simple if [ -f "$2.$i" ] ; then would do the same without the second loop, and would therefore be faster.
mv -v $i $2.$i
echo $2.$i
Once again: use quotes!
fi
done
done
else
echo "no"
fi
fi
So, with all the remarks, you should be able to improve your script. As tripleee said in his comment, running shellcheck would have provided you with most of the comments above. But he also mentioned basename, which would be useful here.
With all that, this is how I would do it. Some changes you will probably only appreciate in a few months' time, when you need to change the script and try to remember what the logic was.
#!/bin/bash
if [ "$#" != 2 ]; then
echo "Usage: $0 directory prefix" >&2
echo "Put a prefix to all the files in a directory." >&2
exit 1
else
directory="$1"
prefix="$2"
if [ -d "$directory" ]; then
for f in "$directory"/* ; do
base=$(basename "$f")
if [ -f "$directory/$prefix.$base" ] ; then
echo "This would overwrite $prefix.$base; exiting" >&2
exit 1
else
mv -v "$directory/$base" "$directory/$prefix.$base"
fi
done
else
echo "$directory is not a directory" >&2
fi
fi
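The script above exits at the first conflict, whereas the question asks for a message on standard error and then to carry on with the next object. A minimal sketch of that variant follows; the function name add_prefix and the demo files are assumptions, and -e is used instead of -f so that existing directories also count as conflicts.

```shell
# Hypothetical helper: prefix every object in directory $1 with "$2."
add_prefix() {
    directory=$1
    prefix=$2
    for f in "$directory"/*; do
        [ -e "$f" ] || continue      # unmatched glob: nothing to rename
        base=${f##*/}                # strip the leading path, like basename
        target="$directory/$prefix.$base"
        if [ -e "$target" ]; then
            echo "Skipping $base: $prefix.$base already exists" >&2
            continue                 # go on with the next object
        fi
        mv -- "$f" "$target"
    done
}

# Demo: "a" conflicts with the existing "hello.a", "b" does not
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/hello.a"
add_prefix "$demo" hello
ls "$demo"
```

Note that objects which already carry the prefix are renamed again (hello.a becomes hello.hello.a here), since the task as stated applies to every object in the directory.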

Why is "ls -1 $fl | wc -l" not returning value 0 in my for loop?

I am trying to add a condition in a for loop to check the existence of a file as well as check for file size > 0 KB.
Period file contains monthly data:
20180101
20180201
20180301
20180401
20180501
There are individual files created for each month. Suppose a file is not created for one month, (20180201), then the loop below should terminate.
For example:
xxx_20180101.txt
xxx_20180301.txt
xxx_20180401.txt
xxx_20180501.txt
if [[ $STATUS -eq 0 ]]; then
for per in `cat ${PATH}/${PERIOD}.txt | cut -f 1 -d";"`
do
for fl in `ls -1 ${PATH}/${FILE} | grep ${per}`
do
if [[ `ls -1 $fl | wc -l` -eq 0 ]]; then
echo "File not found"
STATUS=1
else
if [[ -s "$fl" ]]; then
echo "$fl contain data.">>/dev/null
else
echo "$fl File size is 0KB"
STATUS=1
fi
fi
done
done
fi
However, ls -1 $fl | wc -l does not return 0 when the if condition is executed.
The following is a demonstration of what a best-practices rewrite might look like.
Note:
We do not (indeed, must not) use a variable named PATH to store a directory under which we look for data files; doing this overwrites the PATH environment variable used to find programs to execute.
ls is not used anywhere; it is a tool intended to generate output for human consumption, not machines.
Reading through input is accomplished with a while read loop; see BashFAQ #1 for more details. Note that the input source for the loop is established at the very end; see the redirection after the done.
Finding file sizes is done with stat -c here; for more options, portable to platforms where stat -c is not supported, see BashFAQ #87.
Because your filename format is well-formed (with an underscore before the substring from your input file, and a .txt after that substring), we're refining the glob to look only for names matching that restriction. This prevents a search for 001 from also finding xxx_0015.txt, xxx_5001.txt, etc.
#!/usr/bin/env bash
# ^^^^ -- NOT /bin/sh; this lets us use bash-only syntax
path=/provided/by/your/code # replacing buggy use of PATH in original code
period=likewise # replacing use of PERIOD in original code
shopt -s nullglob # generate a zero-length list for unmatched globs
while IFS=';' read -r per _; do
# populate an array with a list of files with names containing $per
files=( "$path/$period/"*"_${per}.txt" )
# if there aren't any, log a message and proceed
if (( ${#files[@]} == 0 )); then
echo "No files with $per found in $path/$period" >&2
continue
fi
# if they *do* exist, loop over them.
for file in "${files[@]}"; do
if [[ -s "$file" ]]; then
echo "$file contains data" >&2
if (( $(stat -c %s -- "$file") >= 1024 )); then
echo "$file contains 1kb of data or more" >&2
else
echo "$file is not empty, but is smaller than 1kb" >&2
fi
else
echo "$file is empty" >&2
fi
done
done < "$path/$period.txt"
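On platforms where stat -c is not available, one possible fallback for the size check is wc -c, which is specified by POSIX. The sketch below is only an illustration: file_size is a hypothetical helper name, and the sample file is generated for the demo.

```shell
# Hypothetical helper: print the size of file $1 in bytes
file_size() {
    # wc -c reads the whole file, so it is slower than stat on large files;
    # tr removes the padding some wc implementations add
    wc -c < "$1" | tr -d ' '
}

sample=$(mktemp)
printf '%2000s' '' > "$sample"    # 2000 bytes of padding as test data

size=$(file_size "$sample")
if [ "$size" -ge 1024 ]; then
    echo "$sample contains 1kb of data or more"
else
    echo "$sample is smaller than 1kb"
fi
```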
Here's a refactoring of Mikhail's answer with the standard http://shellcheck.net/ warnings ironed out. I have not been able to understand the actual question well enough to guess whether this actually solves the OP's problem.
while IFS='' read -r per; do
if [ -e "xxx_$per.txt" ]; then
echo "xxx_$per.txt found" >&2
else
echo "xxx_$per.txt not found" >&2
fi
done <periods.txt
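Since the question asks for the check to stop at the first month whose file is missing or empty, a sketch along those lines could use [ -s ], which is true only if the file exists and has a size greater than zero. The function name check_periods, its arguments, and the demo files are assumptions for illustration.

```shell
# Hypothetical helper: $1 is the period file, $2 the directory with data files.
# Returns 1 at the first period whose file is missing or empty.
check_periods() {
    period_file=$1
    data_dir=$2
    while IFS=';' read -r per rest; do
        if [ ! -s "$data_dir/xxx_$per.txt" ]; then
            echo "xxx_$per.txt is missing or empty" >&2
            return 1
        fi
    done < "$period_file"
}

# Demo: the file for 20180201 is missing at first
demo=$(mktemp -d)
printf '20180101\n20180201\n20180301\n' > "$demo/periods.txt"
printf 'data\n' > "$demo/xxx_20180101.txt"
printf 'data\n' > "$demo/xxx_20180301.txt"

if check_periods "$demo/periods.txt" "$demo"; then first=ok; else first=failed; fi

printf 'data\n' > "$demo/xxx_20180201.txt"
if check_periods "$demo/periods.txt" "$demo"; then second=ok; else second=failed; fi
echo "first run: $first, second run: $second"
```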
You are over-engineering here. Just iterate over the contents of the file with periods and search for each period in the list of files. Like this:
for per in `cat periods.txt`
do
if ls | grep -q "$per"; then
echo "$per found";
else
echo "$per not found"
fi
done

I want to compare a variable with files in a directory and output the matches

I am making a bash script where I want to find files whose names are equal to a variable. The matches will then be used.
I want to use "mogrify" to shrink a couple of image files that have the same name as the ones I gather from a list (similar to "dpkg -l"). It is not "dpkg -l" I am using, but it is similar. My problem is that it prints all the files, not just the matches. I am pretty sure this could be done with awk instead of a for loop, but I do not know how.
prog="`dpkg -l | awk '{print $1}'`"
for file in $dirone* $dirtwo*
do
if [ "basename ${file}" = "${prog}" ]; then
echo ${file} are equal
else
echo ${file} are not equal
fi
done
Could you please help me get this working?
First, I think there's a small typo. if [ "basename ${file}" =... should have backticks inside the double quotes, just like the prog=... line at the top does.
Second, if $prog is a multi-line string (like dpkg -l) you can't really compare a filename to the entire list. Instead you have to compare one item at a time to the filename.
Here's an example using dpkg and /usr/bin
#!/bin/bash
progs="`dpkg -l | awk '{print $2}'`"
for file in /usr/bin/*
do
base=`basename ${file}`
for prog in ${progs}
do
if [ "${base}" = "${prog}" ]; then
echo "${file}" matches "${prog}"
fi
done
done
The condition "$file = $prog" is a single string. You should try "$file" = "$prog" instead.
The following transcript shows the fix:
pax> ls -1 qq*
qq
qq.c
qq.cpp
pax> export xx=qq.cpp
pax> for file in qq* ; do
if [[ "${file} = ${xx}" ]] ; then
echo .....${file} equal
else
echo .....${file} not equal
fi
done
.....qq equal
.....qq.c equal
.....qq.cpp equal
pax> for file in qq* ; do
if [[ "${file}" = "${xx}" ]] ; then
echo .....${file} equal
else
echo .....${file} not equal
fi
done
.....qq not equal
.....qq.c not equal
.....qq.cpp equal
You can see in the last bit of output that only qq.cpp is shown as equal since it's the only one that matches ${xx}.
The reason you're getting true is because that's what non-empty strings will give you:
pax> if [[ "" ]] ; then
echo .....equal
fi
pax> if [[ "x" ]] ; then
echo .....equal
fi
.....equal
That's because that form is the string length checking variation. From the bash manpage under CONDITIONAL EXPRESSIONS:
string
-n string
True if the length of string is non-zero.
Update:
The new code in your question won't quite work as expected. You need:
if [[ "$(basename "${file}")" = "${prog}" ]]; then
to actually execute basename and use its output as the first part of the equality check.
You can also use case/esac:
case "$file" in
"$prog" ) echo "same";;
esac
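When the list of names has many lines, the nested loop can be replaced by one grep per file: -F treats the name as a literal string, -x requires the whole line to match, and -q suppresses output. This is a sketch only; the directory, the file names, and the contents of progs are made up for the demo.

```shell
# Demo data (assumed): a directory with three files and a list of names
img_dir=$(mktemp -d)
touch "$img_dir/bash" "$img_dir/git" "$img_dir/vim"
progs=$(printf '%s\n' bash curl git)

matches=$(
    for file in "$img_dir"/*; do
        base=${file##*/}    # same result as basename, without a subprocess
        # Succeeds only if some line of $progs is exactly $base
        if printf '%s\n' "$progs" | grep -Fxq -- "$base"; then
            echo "$base"
        fi
    done
)
echo "$matches"
```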

How to test filename expansion result in bash?

I want to check whether a directory has files or not in bash.
My code is here.
for d in {,/usr/local}/etc/bash_completion.d ~/.bash/completion.d
do
[ -d "$d" ] && [ -n "${d}/*" ] &&
for f in $d/*; do
[ -f "$f" ] && echo "$f" && . "$f"
done
done
The problem is that "~/.bash/completion.d" has no files.
So $d/* is treated as the literal string "~/.bash/completion.d/*" rather than the empty result of filename expansion.
As a result of that code, bash tries to run
. "~/.bash/completion.d/*"
and of course, it generates error message.
Can anybody help me?
If you set the nullglob bash option, through
shopt -s nullglob
then globbing will drop patterns that don't match any file.
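If changing shell options is not desirable, another possibility is to guard inside the loop: an unmatched glob stays as the literal pattern, which fails a -f test. A minimal sketch, with count_files as a hypothetical helper name:

```shell
# Hypothetical helper: count the regular files directly inside directory $1
count_files() {
    n=0
    for f in "$1"/*; do
        # With no match, $f is the literal pattern ".../*", not a regular file
        [ -f "$f" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}

demo=$(mktemp -d)
empty_count=$(count_files "$demo")
touch "$demo/one"
filled_count=$(count_files "$demo")
echo "empty: $empty_count, after adding a file: $filled_count"
```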
# NOTE: using only bash builtins
# Assuming $d contains directory path
shopt -s nullglob
# Assign matching files to array
files=( "$d"/* )
if [ ${#files[@]} -eq 0 ]; then
echo 'No files found.'
else
# Whatever
fi
Assignment to an array has other benefits, including desirable (correct!) handling of filenames/paths containing white-space, and simple iteration without using a sub-shell, as the following code does:
find "$d" -type f |
while read; do
# Process $REPLY
done
Instead, you can use:
for file in "${files[@]}"; do
# Process $file
done
with the benefit that the loop is run by the main shell, meaning that side effects (such as variable assignments) made within the loop are visible for the remainder of the script. Of course, it's also much faster, if performance is an issue.
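The subshell effect can be seen with a small experiment. In this sketch (file and variable names are made up), the counter changed inside a pipeline is lost in bash with default settings, while the counter in a loop fed by redirection keeps its value.

```shell
list=$(mktemp)
printf 'a\nb\nc\n' > "$list"

# Loop on the right-hand side of a pipe: bash runs it in a subshell by
# default, so the increments are not visible afterwards
piped_count=0
cat "$list" | while read -r line; do
    piped_count=$((piped_count + 1))
done

# Loop fed by redirection: runs in the main shell, the variable survives
redirected_count=0
while read -r line; do
    redirected_count=$((redirected_count + 1))
done < "$list"

echo "piped: $piped_count, redirected: $redirected_count"
```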
Finally, an array can also be inserted in command line arguments (without splitting arguments containing white-space):
$ md5sum fileA "${files[@]}" fileZ
You should always attempt to correctly handle files/paths containing white-space, because one day, they will happen!
You could use find directly in the following way:
for f in $(find {,/usr/local}/etc/bash_completion.d ~/.bash/completion.d -maxdepth 1 -type f);
do echo $f; . $f;
done
But find will print a warning if any of the directories isn't found; you can either append 2> /dev/null or run find only after testing that the directories exist (as in your code).
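# Named recurse (not find) so that the recursive call below resolves
# and the real find command is not shadowed
recurse() {
for files in "$1"/*;do
if [ -d "$files" ];then
numfile=$(ls "$files"|wc -l)
if [ "$numfile" -eq 0 ];then
echo "dir: $files has no files"
continue
fi
recurse "$files"
elif [ -f "$files" ];then
echo "file: $files";
:
fi
done
}
recurse /path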
Another approach
# prelim stuff to set up d
files=`/bin/ls "$d"`
if [ ${#files} -eq 0 ]
then
echo "No files were found"
else
# do processing
fi
