Shell Script- Traversing sub directories using shell scripts [duplicate] - bash

I am trying to traverse the subdirectories under the current directory. There are certain files that I want to access and process inside each subdirectory. Can anyone explain how I can access the files inside the subdirectories?
for dir in /home/ayushi/perfios/fraud_stmt/*; do
    echo "$dir"
done
The script above echoes all the subdirectories, but instead of echoing them I want to go inside each directory and access the files present there.

find /home/ayushi/perfios/fraud_stmt/ -type f | while IFS= read -r fname; do
    : do something with "$fname" here
done
This will search for all files (i.e. not actual directories) from the specified directory downwards. Note that you should enclose "$fname" in double quotes, in case it contains spaces or other "odd" characters.
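A line-based read still breaks on file names that contain newlines. The following is a minimal sketch of a more defensive variant, assuming bash and a find that supports -print0 (GNU and BSD find do). The temporary directory and the file names in it are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Build a throwaway demo tree (hypothetical names, including a space).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub one" "$demo_dir/sub2"
touch "$demo_dir/sub one/a file.txt" "$demo_dir/sub2/b.txt"

count=0
# -print0 separates names with NUL bytes; read -d '' splits on them.
# IFS= and -r keep leading whitespace and backslashes intact.
# Process substitution (instead of a pipe) keeps the loop in the current
# shell, so $count is still set after the loop ends.
while IFS= read -r -d '' fname; do
    printf 'processing: %s\n' "$fname"
    count=$((count + 1))
done < <(find "$demo_dir" -type f -print0)

echo "files seen: $count"
rm -rf "$demo_dir"
```

Replace the printf line with whatever processing each file needs.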

An example using a recursive function
process_file() {
echo "$1"
}
rec_traverse() {
local file_or_dir
for file_or_dir in "$1"/*; do
[[ -d $file_or_dir ]] && rec_traverse "$file_or_dir"
[[ -f $file_or_dir ]] && process_file "$file_or_dir"
done
}
rec_traverse /home/ayushi/perfios/fraud_stmt
process_file can be changed to do something on file.
"$1"/* may be changed to "$1"/* "$1"/.* to match hidden files and directories too, but in that case the special entries . and .. must be filtered out to avoid infinite recursion.
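One way to include hidden entries without filtering . and .. by hand is bash's dotglob option: with it set, * also matches names that start with a dot, but never . or .. themselves. This is only a sketch under that assumption. The symlink check is an extra precaution that the original answer does not have, and the demo tree is invented.

```bash
#!/usr/bin/env bash
# dotglob: * also matches hidden names (but never . or ..).
# nullglob: an empty directory yields zero loop iterations.
shopt -s dotglob nullglob

visited=()
process_file() {
    visited+=("$1")          # stand-in for real per-file work
}

rec_traverse() {
    local entry
    for entry in "$1"/*; do
        if [[ -d $entry && ! -L $entry ]]; then
            # Descend into real directories only; skipping symlinked
            # directories avoids loops through links (an added assumption).
            rec_traverse "$entry"
        elif [[ -f $entry ]]; then
            process_file "$entry"
        fi
    done
}

# Hypothetical demo tree with a hidden directory and a hidden file.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/.hidden/deep"
touch "$demo_dir/visible.txt" "$demo_dir/.hidden/deep/.secret"

rec_traverse "$demo_dir"
printf '%s\n' "${visited[@]}"
rm -rf "$demo_dir"
```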

Related

How to find a file without knowing its extension? [duplicate]

Currently I'm writing a script and I need to check if a certain file exists in a directory without knowing the extension of the file. I tried this:
if [[ -f "$2"* ]]
but that didn't work. Does anyone know how I could do this?
-f expects a single argument, and AFAIK you don't get filename expansion in this context anyway.
Although cumbersome, the best I can think of, is to produce an array of all matching filenames, i.e.
shopt -s nullglob
files=( "$2".* )
and test the size of the array. If it is larger than 1, you have more than one candidate, i.e.
if (( ${#files[*]} > 1 ))
then
....
fi
If the size is 1, ${files[0]} gives you the desired one. If the size is 0 (which can only happen if you turn on nullglob), no files are matching.
Don't forget to reset nullglob afterwards, if you don't need it anymore.
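The save-and-restore step can be wrapped in a function so that nullglob never leaks into the rest of the script. This is a rough sketch. count_matches and the demo file names are hypothetical, while shopt -p is the bash builtin form that prints the command needed to recreate the current setting.

```bash
#!/usr/bin/env bash
# count_matches PREFIX: fills the array `matches` with every name of the
# form PREFIX.* and leaves the nullglob setting as it found it.
count_matches() {
    local restore
    restore=$(shopt -p nullglob)   # "shopt -s nullglob" or "shopt -u nullglob"
    shopt -s nullglob
    matches=( "$1".* )
    eval "$restore"
}

# Hypothetical demo files.
demo_dir=$(mktemp -d)
touch "$demo_dir/report.txt" "$demo_dir/report.pdf"

count_matches "$demo_dir/report";  found_report=${#matches[@]}
count_matches "$demo_dir/missing"; found_missing=${#matches[@]}
echo "report: $found_report, missing: $found_missing"

rm -rf "$demo_dir"
```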
In shell, you have to iterate each file globbing pattern's match and test each one individually.
Here is how you could do it with standard POSIX shell syntax:
#!/usr/bin/env sh
# Boolean flag to check if a file match was found
found_flag=0
# Iterate all matches
for match in "$2."*; do
# In shell, when no match is found, the pattern itself is returned.
# To exclude the pattern, check a file of this name actually exists
# and is an actual file
if [ -f "$match" ]; then
found_flag=1
printf 'Found file: %s\n' "$match"
fi
done
if [ $found_flag -eq 0 ]; then
printf 'No file matching: %s.*\n' "$2" >&2
exit 1
fi
You can use find:
find ./ -name "<filename>.*" -exec <do_something> {} \;
<filename> is the filename without extension, do_something the command you want to launch, {} is the placeholder of the filename.

if folders exist - by using wildcard [duplicate]

or "How to handle prefixed folder names?"
Inside a folder I have two (or more) foo_* folders
foo_0
foo_1
What I'm trying to achieve is to
perform an action if there's 1 or more foo_* folders
Use a wildcard *
Currently I'm doing it this way (going directly to check if directory foo_0 exists):
prefix=foo_
if [ -d "./${prefix}0/" ]; then
printf "foo_0 folder found!"
# delete all foo_* folders
fi
My directories are numbered 0 to N, so the above works, but I'm not sure I'll always have a foo_0 folder...
I'd like to use a wildcard:
prefix=foo_
if [ -d "./${prefix}*/" ]; then # By using wildcard...
printf "One or more foo_* folders found!" # this never prints
# delete all foo_* folders
fi
I've read that a wildcard * inside quotes loses its powers, but placing it outside quotes throws :
if [ -d "./${prefix}"* ] <<< ERROR: binary operator expected
Or is it possible to use some sort of regex like? ./foo_\d+ ?
The only solution I don't (arguably) like, is by using set
set -- foo_*
if [ -d $1 ]; then
printf "foo_* found!"
fi
but it wipes program arguments.
Is there any other nice solution to this I'm missing?
I think a nice solution for pretty much all such cases is to use ls in the test, which often works quite simply:
if [ -n "$(ls -d foo_* 2>/dev/null)" ]; then ...
If you want more regexp-like matching, you can shopt -s extglob and then match with ls -d foo_+([0-9]).
There's also an all-bash solution using several shell options, but it's not as easy to remember, so I'll leave that to another poster ;-)
EDIT: As @PesaThe pointed out, plain ls foo_* would fail if the only match is a single empty directory, since ls would then list that directory's (empty) contents; hence the -d. Note that foo_* also matches things that are not directories.
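For comparison, here is a sketch that uses neither ls nor set, so the positional parameters stay untouched. A glob ending in / only matches directories (or symlinks to them), and the loop stops at the first hit. has_dir_with_prefix and the demo names are hypothetical.

```bash
#!/usr/bin/env bash
# has_dir_with_prefix PREFIX: succeeds if at least one directory PREFIX* exists.
has_dir_with_prefix() {
    local candidate
    for candidate in "$1"*/; do
        # With no match the pattern stays literal (or, under nullglob, the
        # loop body never runs), so -d fails and we fall through to return 1.
        [ -d "$candidate" ] && return 0
    done
    return 1
}

# Hypothetical demo layout: two foo_ directories and one bar_ regular file.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/foo_3" "$demo_dir/foo_7"
touch "$demo_dir/bar_file"

if has_dir_with_prefix "$demo_dir/foo_"; then foo_found=yes; else foo_found=no; fi
if has_dir_with_prefix "$demo_dir/bar_"; then bar_found=yes; else bar_found=no; fi
echo "foo_: $foo_found, bar_: $bar_found"

rm -rf "$demo_dir"
```

Note that there is no foo_0 in the demo, which is exactly the case the question worries about.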

Iterate through several files in bash [duplicate]

I have a folder with several files that are named like this:
file.001.txt.gz, file.002.txt.gz, ... , file.150.txt.gz
What I want to do is use a loop to run a program on each file. I was thinking of something like this (just a sketch):
for i in {1:150}
gunzip file.$i.txt.gz
./my_program file.$i.txt output.$1.txt
gzip file.$1.txt
First of all, I don't know if something like this is going to work, and second, I can't figure out how to keep the three-digit numbering the files have ('001' instead of just '1').
Thanks a lot
The syntax for ranges in bash is
{1..150}
not {1:150}.
Moreover, if your bash is recent enough, you can add the leading zeroes:
{001..150}
The correct syntax of the for loop needs do and done.
for i in {001..150} ; do
# ...
done
It's unclear what $1 contains in your script.
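If the installed bash does not pad brace ranges, the padding can be done explicitly with printf. The sketch below only builds the file names, so it can be run without the actual files or my_program being present; the names array is there just for the demonstration.

```bash
#!/usr/bin/env bash
names=()
for ((i = 1; i <= 150; i++)); do
    # %03d pads to three digits: 1 -> 001, 42 -> 042, 150 -> 150.
    printf -v padded '%03d' "$i"
    names+=("file.$padded.txt.gz")
done

echo "${names[0]} ... ${names[149]}"
```

Inside the loop, "$padded" takes the place of $i wherever the file name is built.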
To iterate over files I believe the simpler way is:
(assuming there are no files named 'file.*.txt' already in the directory and that your output file can have a different name)
for i in file.*.txt.gz; do
    gunzip "$i"
    f="${i%.gz}"    # gunzip drops the .gz suffix
    ./my_program "$f" "$f-output.txt"
    gzip "$f"       # re-compress only the file just processed
done
Using find command:
# Path to the source directory
dir="./"
while IFS= read -r file
do
output="$(basename "$file")"
output="$(dirname "$file")/"${output/#file/output}
echo "$file ==> $output"
done < <(find "$dir" \
-regextype 'posix-egrep' \
-regex '.*file\.[0-9]{3}\.txt\.gz$')
The same via pipe:
find "$dir" \
-regextype 'posix-egrep' \
-regex '.*file\.[0-9]{3}\.txt\.gz$' | \
while IFS= read -r file
do
output="$(basename "$file")"
output="$(dirname "$file")/"${output/#file/output}
echo "$file ==> $output"
done
Sample output
/home/ruslan/tmp/file.001.txt.gz ==> /home/ruslan/tmp/output.001.txt.gz
/home/ruslan/tmp/file.002.txt.gz ==> /home/ruslan/tmp/output.002.txt.gz
(for $dir=/home/ruslan/tmp/).
Description
The scripts iterate the files in $dir directory. The $file variable is filled with the next line read from the find command.
The find command returns a list of paths corresponding to the regular expression '.*file\.[0-9]{3}\.txt\.gz$'.
The $output variable is built from two parts: basename (path without directories) and dirname (path to file's directory).
${output/#file/output} expression replaces file with output at the front end of $output variable (see Manipulating Strings)
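The same renaming can be sketched with parameter expansion alone, without calling basename or dirname. This assumes the path contains at least one slash, because ${file%/*} does not behave like dirname for a bare file name. The sample path is taken from the output shown above.

```bash
#!/usr/bin/env bash
file=/home/ruslan/tmp/file.001.txt.gz

name=${file##*/}     # strip everything up to the last slash
dir=${file%/*}       # strip the last slash and what follows it
# ${name/#file/output} replaces "file" only when it is at the start of $name.
output=$dir/${name/#file/output}

echo "$file ==> $output"
```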
Try-
for i in $(seq -w 1 150) #-w adds the leading zeroes
do
gunzip file."$i".txt.gz
./my_program file."$i".txt output."$i".txt
gzip file."$i".txt
done
The syntax for ranges is as choroba said, but when iterating over files you usually want to use a glob. If you know all the files have three digits in their names you can match on digits:
shopt -s nullglob
for f in file.0[0-9][0-9].txt.gz file.1[0-4][0-9].txt.gz file.15[0].txt.gz; do
    gunzip "$f"
    base=${f%.gz}    # name after decompression, e.g. file.001.txt
    ./my_program "$base" "output.${base#file.}"
    gzip "$base"
done
This will only iterate through files that exist. If you use the range expression, you have to take extra care not to try to operate on files that don't exist.
for i in file.{000..150}.txt.gz; do
[[ -e "$i" ]] || continue
...otherstuff
done

Comparing files in the same directory with same name different extension [duplicate]

I have a bash script that looks through a directory and creates a .pdf from each .ppt, but I want to check whether a .pdf already exists for the .ppt: if it does, I don't want to create a new one, and if the .pdf is older than the .ppt, I want to update it. I know I can get a timestamp with (date -r bar +%s), but I can't figure out how to compare files that have the same name and sit in the same folder.
This is what I have:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf
else
#reads the files that are PPT in the directory and copies them and changes the extension to .pdf
ls *.ppt|while read FILE
do
NEWFILE=$(echo $FILE|cut -d"." -f1)
echo $FILE": " $FILE " "$NEWFILE: " " $NEWFILE.pdf
cp $FILE $NEWFILE.pdf
done
fi
EDITS:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] &&
nf="${f%.*}"
echo $f": " $f " "$nf: " " $nf.pdf
cp $f $nf.pdf
done
To loop through all ppt files in the current directory and test to see if they are newer than the corresponding pdf and then do_something if they are:
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] && do_something
done
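-nt is the bash test for one file being newer than another. In bash it is also true when the first file exists and the second does not, so a missing pdf counts as out of date.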
Notes:
Do not parse ls. The output from ls often contains a "displayable" form of the filename, not the actual filename.
The construct for f in *.ppt will work reliably with all file names, even ones with tabs or newlines in their names.
Avoid using all caps for shell variables. The system uses all caps for its variables and you do not want to accidentally overwrite one. Thus, use lower case or mixed case.
The shell has built-in capabilities for suffix removal. So, for example, newfile=$(echo $file |cut -d"." -f1) can be replaced with the much more efficient and more reliable form newfile="${file%%.*}". This is particularly important in the odd case that the file's name ends with a newline: command substitution removes all trailing newlines but the bash variable expansions don't.
Further, note that cut -d"." -f1 removes everything after the first period. If a file name has more than one period, this is likely not what you want. The form, ${file%.*}, with just one %, removes everything after the last period in the name. This is more likely what you want when you are trying to remove standard extensions like ppt.
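A quick illustration of the difference between the two forms, using a made-up file name with several periods:

```bash
#!/usr/bin/env bash
file="lecture.v2.final.ppt"   # hypothetical name with more than one period

stem_short=${file%.*}     # shortest match removed from the end
stem_long=${file%%.*}     # longest match removed from the end
extension=${file##*.}     # longest match removed from the front
pdf_name=${file%ppt}pdf   # swap the suffix, as in the -nt test

echo "$stem_short"
echo "$stem_long"
echo "$extension"
echo "$pdf_name"
```

For this name, the single % form keeps lecture.v2.final while the %% form keeps only lecture.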
Putting it all together
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f ./*pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in ./*.ppt
do
if [ "$f" -nt "${f%ppt}pdf" ]; then
nf="${f%.*}"
echo "$f: $f $nf: $nf.pdf"
cp "$f" "$nf.pdf"
fi
done
fi

move files based on filename length

I want to move some files in a directory, using their filename length as the criteria.
For example, I want to move any files whose names are longer than 10 characters.
I assume I need a loop with an if test in a bash script, but I'm not sure how to proceed.
Use this template:
for f in *; do if [ "${#f}" -gt 10 ]; then echo "$f"; fi; done
Replace echo with your mv command.
Note that directories will be in the list too.
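A slightly fuller sketch that skips directories and actually moves the files. The source and destination directories, and the file names in them, are temporary and invented for the demonstration; in real use src and dest would be your own paths.

```bash
#!/usr/bin/env bash
# Hypothetical demo layout in throwaway directories.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/short.txt" "$src/a_rather_long_name.txt"
mkdir "$src/directory_with_long_name"

moved=0
moved_names=()
for path in "$src"/*; do
    [ -f "$path" ] || continue        # regular files only; skips directories
    name=${path##*/}                  # file name without the directory part
    if [ "${#name}" -gt 10 ]; then
        mv -- "$path" "$dest/"        # -- guards against names starting with -
        moved=$((moved + 1))
        moved_names+=("$name")
    fi
done

echo "moved $moved file(s): ${moved_names[*]}"
rm -rf "$src" "$dest"
```

The length test counts the whole name including the extension, so short.txt (9 characters) stays where it is.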

Resources