Recursive Shell Script and file extensions issue

I have a problem with this script. It is supposed to go through all the files and subdirectories recursively. If a file ends with the extension .txt, I need to replace a char/word in its text with a new char/word and then copy the file into an existing directory. The first argument is the directory where the search starts, the second is the old char/word, the third is the new char/word, and the fourth is the directory to copy the files to. The script goes through the files, but it only does the replacement and the copy for files in the original directory, not for those in subdirectories. Here is the script:
#!/bin/bash
funk(){
for file in `ls $1`
do
if [ -f $file ]
then
ext=${file##*.}
if [ "$ext" = "txt" ]
then
sed -i "s/$2/$3/g" $file
cp $file $4
fi
elif [ -d $file ]
then
funk $file $2 $3 $4
fi
done
}
if [ $# -lt 4 ]
then
echo "Need more arg"
exit 2;
fi
cw=$1
a=$2
b=$3
od=$4
funk $cw $a $b $od

You're using a lot of bad practices here: missing quotes, parsing the output of ls... All of this will break as soon as a filename contains a space or another funny symbol.
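To make the word-splitting problem concrete, here is a small sketch that counts loop iterations for a single file. The temporary directory and the file name my notes.txt are made up for illustration, and the default IFS is assumed.

```shell
#!/bin/bash
# Throwaway directory holding one file whose name contains a space.
tmpdir=$(mktemp -d)
: > "$tmpdir/my notes.txt"

# Parsing ls: the unquoted command substitution is split on whitespace,
# so the single file name arrives as two separate words.
ls_count=0
for file in $(ls "$tmpdir"); do
    ls_count=$((ls_count + 1))
done

# Globbing: each matching file is delivered as exactly one word.
glob_count=0
for file in "$tmpdir"/*; do
    glob_count=$((glob_count + 1))
done

echo "ls loop iterations:   $ls_count"
echo "glob loop iterations: $glob_count"
rm -rf -- "$tmpdir"
```

The two counters differ even though the directory contains only one file, which is why the glob form is the safer default.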
You don't need recursion if you either use bash's globstar optional behavior, or find.
Here's a possibility with the former, that will hopefully show you better practices:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
funk() {
    # Escape slashes so they cannot clash with the sed separator
    local search=${2//\//\\/}
    local replace=${3//\//\\/}
    # With globstar, **/*.txt matches .txt files at any depth below "$1"
    for f in "$1"/**/*.txt; do
        sed -i "s/$search/$replace/g" -- "$f"
        cp -nvt "$4" -- "$f"
    done
}
if (($#!=4)); then
    echo >&2 "Need 4 arguments"
    exit 1
fi
funk "$@"
The same function funk using find:
#!/bin/bash
funk() {
    local search=${2//\//\\/}
    local replace=${3//\//\\/}
    find "$1" -name '*.txt' -type f -exec sed -i "s/$search/$replace/g" -- {} \; -exec cp -nvt "$4" -- {} \;
}
if (($#!=4)); then
    echo >&2 "Need 4 arguments"
    exit 1
fi
funk "$@"
In cp I'm using:
the -n switch: no clobber, so as not to overwrite an existing file. Use it if your version of cp supports it, unless you actually want to overwrite files.
the -v switch: verbose, shows the copied files (optional).
the -t switch: -t followed by a directory tells cp to copy into this directory. It's a very good thing to use cp this way: imagine that instead of an existing directory you give an existing file. Without -t, that file would get overwritten several times (if you omit the -n option). With -t, cp reports an error instead and the existing file remains safe.
Also notice the use of --. If your cp and sed support it (GNU sed and cp do), always use it! It means "end of options". Without it, a filename that starts with a hyphen would be interpreted as an option and confuse the command. With --, a filename that starts with a hyphen is safe.
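As a small illustration of the -- convention, the sketch below copies a file whose name starts with a hyphen. The file name -n.txt and the dest directory are made up for the example.

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
mkdir "$tmpdir/dest"
(
    cd "$tmpdir" || exit 1
    # Create a file whose name looks like an option.
    : > ./-n.txt
    # The -- marks the end of options, so -n.txt is taken as a file name.
    cp -- -n.txt dest/
)
copied=$(ls "$tmpdir/dest")
echo "copied: $copied"
rm -rf -- "$tmpdir"
```

Writing the name as ./-n.txt is another common way to get the same protection when a command does not understand --.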
Notice that in the search and replace patterns I replaced all slashes / by their escaped form \/ so as not to clash with the separator in sed if a slash happens to appear in search or replace.
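The slash escaping can be tried in isolation. In the sketch below, escape_slashes is a hypothetical helper name and the sample strings are invented. Note that only slashes are escaped: other characters that are special to sed (such as & in the replacement, or regex metacharacters in the search) are still interpreted.

```shell
#!/bin/bash
# escape_slashes is a hypothetical helper: it prints $1 with every / turned into \/
escape_slashes() {
    printf '%s' "${1//\//\\/}"
}

search=$(escape_slashes 'usr/bin')
replace=$(escape_slashes 'opt/bin')

# The escaped strings can now be embedded in an s/.../.../ command.
result=$(printf 'path=/usr/bin/env\n' | sed "s/$search/$replace/g")
echo "$result"
```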
Enjoy!

As pointed out, looping over find output is not a good idea. It also doesn't support slashes in search&replace.
Check gniourf_gniourf's answer.
How about using find for that?
#!/bin/bash
funk () {
    local dir=$1; shift
    local search=$1; shift
    local replace=$1; shift
    local dest=$1; shift
    mkdir -p "$dest"
    # Caveat: the unquoted command substitution splits file names on whitespace.
    for file in `find "$dir" -name '*.txt'`; do
        sed -i "s/$search/$replace/g" "$file"
        cp "$file" "$dest"
    done
}
if [[ $# -lt 4 ]] ; then
    echo "Need 4 arguments"
    exit 2;
fi
funk "$@"
Note that if files in different subdirectories share the same name, later copies will overwrite earlier ones. Is that an issue in your case?
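If name collisions are a concern, one way to check for them before copying is sketched below. duplicate_basenames is a hypothetical helper, the demo tree is invented, and the sketch assumes that file names contain no newline characters.

```shell
#!/bin/bash
# duplicate_basenames is a hypothetical helper: it prints every .txt file name
# that occurs more than once below directory $1.
# Assumption: file names contain no newline characters.
duplicate_basenames() {
    find "$1" -type f -name '*.txt' | sed 's|.*/||' | sort | uniq -d
}

# Demo tree (made-up names): x.txt exists in two subdirectories.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/a" "$tmpdir/b"
: > "$tmpdir/a/x.txt"
: > "$tmpdir/b/x.txt"
: > "$tmpdir/a/y.txt"

dupes=$(duplicate_basenames "$tmpdir")
echo "colliding names: $dupes"
rm -rf -- "$tmpdir"
```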

Related

Testing for existing file with an extension fails with globbing [duplicate]

I'm trying to check if a file exists, but with a wildcard. Here is my example:
if [ -f "xorg-x11-fonts*" ]; then
printf "BLAH"
fi
I have also tried it without the double quotes.

For Bash scripts, the most direct and performant approach is:
if compgen -G "${PROJECT_DIR}/*.png" > /dev/null; then
echo "pattern exists!"
fi
This will work very speedily even in directories with millions of files and does not involve a new subshell.
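A minimal sketch of wrapping this test in a function follows. glob_exists is a hypothetical name and the files are created only for the demonstration. The pattern has to be passed quoted so that compgen, rather than the calling shell, expands it.

```shell
#!/bin/bash
# glob_exists is a hypothetical wrapper: succeeds if the pattern in $1
# matches at least one existing path.
glob_exists() {
    compgen -G "$1" > /dev/null
}

tmpdir=$(mktemp -d)
: > "$tmpdir/a.png"

if glob_exists "$tmpdir/*.png"; then png_state=found; else png_state=missing; fi
if glob_exists "$tmpdir/*.jpg"; then jpg_state=found; else jpg_state=missing; fi

echo "png: $png_state, jpg: $jpg_state"
rm -rf -- "$tmpdir"
```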
The simplest should be to rely on ls return value (it returns non-zero when the files do not exist):
if ls /path/to/your/files* 1> /dev/null 2>&1; then
echo "files do exist"
else
echo "files do not exist"
fi
I redirected the ls output to make it completely silent.
Here is an optimization that also relies on glob expansion, but avoids the use of ls:
for f in /path/to/your/files*; do
## Check if the glob gets expanded to existing files.
## If not, f here will be exactly the pattern above
## and the exists test will evaluate to false.
[ -e "$f" ] && echo "files do exist" || echo "files do not exist"
## This is all we needed to know, so we can break after the first iteration
break
done
This is very similar to grok12's answer, but it avoids the unnecessary iteration through the whole list.

If your shell has a nullglob option and it's turned on, a wildcard pattern that matches no files will be removed from the command line altogether. This will make ls see no pathname arguments, list the contents of the current directory and succeed, which is wrong. GNU stat, which always fails if given no arguments or an argument naming a nonexistent file, would be more robust. Also, the &> redirection operator is a bashism.
if stat --printf='' /path/to/your/files* 2>/dev/null
then
echo found
else
echo not found
fi
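The nullglob pitfall can be reproduced with the sketch below, which assumes Bash and GNU stat. The directory is a freshly created, empty temporary one, so the pattern is guaranteed to match nothing.

```shell
#!/bin/bash
tmpdir=$(mktemp -d)      # empty directory, so the pattern below matches nothing
shopt -s nullglob

# With nullglob the pattern vanishes: ls gets no arguments, lists the
# current directory instead, and exits successfully.
if ls "$tmpdir"/files* > /dev/null 2>&1; then
    ls_result="found"
else
    ls_result="not found"
fi

# stat also gets no arguments, but it treats that as an error.
if stat --printf='' "$tmpdir"/files* 2> /dev/null; then
    stat_result="found"
else
    stat_result="not found"
fi

shopt -u nullglob
echo "ls says:   $ls_result"
echo "stat says: $stat_result"
rmdir -- "$tmpdir"
```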
Better still is GNU find, which can handle a wildcard search internally and exit as soon as it finds one matching file, rather than waste time processing a potentially huge list of them expanded by the shell; this also avoids the risk that the shell might overflow its command line buffer.
if test -n "$(find /dir/to/search -maxdepth 1 -name 'files*' -print -quit)"
then
echo found
else
echo not found
fi
Non-GNU versions of find might not have the -maxdepth option used here to make find search only the /dir/to/search instead of the entire directory tree rooted there.
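The find-based test can be packaged as a reusable function, sketched here under the assumption that GNU find is available. dir_has_match is a hypothetical name. The extra -mindepth 1 is an addition that keeps the starting directory itself from counting as a match.

```shell
#!/bin/bash
# dir_has_match is a hypothetical helper: succeeds if at least one entry
# directly inside directory $1 has a name matching the pattern $2.
dir_has_match() {
    [ -n "$(find "$1" -mindepth 1 -maxdepth 1 -name "$2" -print -quit 2> /dev/null)" ]
}

tmpdir=$(mktemp -d)
: > "$tmpdir/files-01.log"

if dir_has_match "$tmpdir" 'files*'; then first=found; else first="not found"; fi
if dir_has_match "$tmpdir" 'other*'; then second=found; else second="not found"; fi

echo "files*: $first"
echo "other*: $second"
rm -rf -- "$tmpdir"
```

The pattern is quoted at the call site so that find, not the shell, interprets it.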

Use:
files=(xorg-x11-fonts*)
if [ -e "${files[0]}" ];
then
printf "BLAH"
fi

You can do the following:
set -- xorg-x11-fonts*
if [ -f "$1" ]; then
printf "BLAH"
fi
This works with sh and derivatives: KornShell and Bash. It doesn't create any subshell. The $(..) and `...` command substitutions used in other solutions create a subshell: they fork a process, which is inefficient. Of course it works with several files, and this solution is likely to be the fastest, or second to the fastest.
It also works when there aren't any matches. There is no need to use nullglob, as one of the commenters says: $1 will contain the original pattern, and therefore the test -f "$1" won't succeed, because no file with that name exists.

for i in xorg-x11-fonts*; do
if [ -f "$i" ]; then printf "BLAH"; fi
done
This will work with multiple files and with white space in file names.

The solution:
files=$(ls xorg-x11-fonts* 2> /dev/null | wc -l)
if [ "$files" != "0" ]
then
echo "Exists"
else
echo "None found."
fi
> Exists

Use:
if [ "`echo xorg-x11-fonts*`" != "xorg-x11-fonts*" ]; then
printf "BLAH"
fi

The PowerShell way, which treats wildcards differently: you put the pattern in quotes, like so:
If (Test-Path "./output/test-pdf-docx/Text-Book-Part-I*"){
Remove-Item -force -v -path ./output/test-pdf-docx/*.pdf
Remove-Item -force -v -path ./output/test-pdf-docx/*.docx
}
I think this is helpful because the concept of the original question covers "shells" in general not just Bash or Linux, and would apply to PowerShell users with the same question too.

The Bash code I use:
if ls /syslog/*.log > /dev/null 2>&1; then
    echo "Log files are present in /syslog/"
fi

Strictly speaking, if you only want to print "Blah", here is the solution:
find . -maxdepth 1 -name 'xorg-x11-fonts*' -printf 'BLAH' -quit
Here is another way:
doesFirstFileExist(){
test -e "$1"
}
if doesFirstFileExist xorg-x11-fonts*
then printf "BLAH"
fi
But I think the most optimal is as follows, because it won't try to sort file names:
if [ -n "$(find . -maxdepth 1 -name 'xorg-x11-fonts*' -printf 1 -quit)" ]
then
    printf "BLAH"
fi

Here's a solution for your specific problem that doesn't require for loops or external commands like ls, find and the like.
if [ "$(echo xorg-x11-fonts*)" != "xorg-x11-fonts*" ]; then
printf "BLAH"
fi
As you can see, it's just a tad more complicated than what you were hoping for, and relies on the fact that if the shell is not able to expand the glob, it means no files with that glob exist and echo will output the glob as is, which allows us to do a mere string comparison to check whether any of those files exist at all.
If we were to generalize the procedure, though, we should take into account the fact that files might contain spaces within their names and/or paths and that the glob char could rightfully expand to nothing (in your example, that would be the case of a file whose name is exactly xorg-x11-fonts).
This could be achieved by the following function, in bash.
function doesAnyFileExist {
    local arg="$*"
    local files=($arg)
    [ ${#files[@]} -gt 1 ] || [ ${#files[@]} -eq 1 ] && [ -e "${files[0]}" ]
}
Going back to your example, it could be invoked like this.
if doesAnyFileExist "xorg-x11-fonts*"; then
printf "BLAH"
fi
Glob expansion should happen within the function itself for it to work properly, that's why I put the argument in quotes and that's what the first line in the function body is there for: so that any multiple arguments (which could be the result of a glob expansion outside the function, as well as a spurious parameter) would be coalesced into one. Another approach could be to raise an error if there's more than one argument, yet another could be to ignore all but the 1st argument.
The second line in the function body sets the files var to an array constituted by all the file names that the glob expanded to, one for each array element. It's fine if the file names contain spaces, each array element will contain the names as is, including the spaces.
The third line in the function body does two things:
1. It first checks whether there's more than one element in the array. If so, it means the glob surely got expanded to something (due to what we did on the 1st line), which in turn implies that at least one file matching the glob exists, which is all we wanted to know.
2. If at step 1 we discovered that we got fewer than 2 elements in the array, then we check whether we got one and, if so, we check whether that one exists, the usual way. We need this extra check in order to account for function arguments without glob chars, in which case the array contains only one, unexpanded, element.

I found a couple of neat solutions worth sharing. The first still suffers from "this will break if there are too many matches" problem:
pat="yourpattern*" matches=($pat) ; [[ "$matches" != "$pat" ]] && echo "found"
(Recall that if you use an array without the [ ] syntax, you get the first element of the array.)
If you have "shopt -s nullglob" in your script, you could simply do:
matches=(yourpattern*) ; [[ "$matches" ]] && echo "found"
Now, if it's possible to have a ton of files in a directory, you're pretty much stuck with using find:
find /path/to/dir -maxdepth 1 -type f -name 'yourpattern*' | grep -q '.' && echo 'found'

I use this:
filescount=`ls xorg-x11-fonts* | awk 'END { print NR }'`
if [ $filescount -gt 0 ]; then
blah
fi

Using new fancy shmancy features in KornShell, Bash, and Z shell shells (this example doesn't handle spaces in filenames):
# Declare a regular array (-A will declare an associative array. Kewl!)
declare -a myarray=( /mydir/tmp*.txt )
array_length=${#myarray[@]}
# Not found if the first element of the array is the unexpanded string
# (ie, if it contains a "*")
if [[ ${myarray[0]} =~ [*] ]] ; then
    echo "No files found"
elif [ $array_length -eq 1 ] ; then
    echo "File was found"
else
    echo "Files were found"
fi
for myfile in ${myarray[@]}
do
    echo "$myfile"
done
Yes, this does smell like Perl. I am glad I didn't step in it ;)

IMHO it's better to always use find when testing for files, globs or directories. The stumbling block in doing so is find's exit status: 0 if all paths were traversed successfully, >0 otherwise. Whether the expression you passed to find matched anything is not reflected in its exit code.
The following example tests if a directory has entries:
$ mkdir A
$ touch A/b
$ find A -maxdepth 0 -not -empty -print | head -n1 | grep -q . && echo 'not empty'
not empty
When A has no files grep fails:
$ rm A/b
$ find A -maxdepth 0 -not -empty -print | head -n1 | grep -q . || echo 'empty'
empty
When A does not exist grep fails again because find only prints to stderr:
$ rmdir A
$ find A -maxdepth 0 -not -empty -print | head -n1 | grep -q . && echo 'not empty' || echo 'empty'
find: 'A': No such file or directory
empty
Replace -not -empty by any other find expression, but be careful if you -exec a command that prints to stdout. You may want to grep for a more specific expression in such cases.
This approach works nicely in shell scripts. The original question was to look for the glob xorg-x11-fonts*:
if find . -maxdepth 1 -name 'xorg-x11-fonts*' -print | head -n1 | grep -q .
then
: the glob matched
else
: ...not
fi
Note that the else branch is reached if xorg-x11-fonts* did not match or if find encountered an error. To distinguish the two cases, check find's own exit status; $? after the pipeline only reports the status of grep, the last command.

If there is a huge amount of files on a network folder using the wildcard is questionable (speed, or command line arguments overflow).
I ended up with:
if [ -n "$(find somedir/that_may_not_exist_yet -maxdepth 1 -name \*.ext -print -quit)" ] ; then
echo Such file exists
fi

if [ `ls path1/* path2/* 2> /dev/null | wc -l` -ne 0 ]; then echo ok; else echo no; fi

Try this
fileTarget="xorg-x11-fonts*"
filesFound=$(ls $fileTarget)
case ${filesFound} in
"" ) printf "NO files found for target=${fileTarget}\n" ;;
* ) printf "FileTarget Files found=${filesFound}\n" ;;
esac
Test
fileTarget="*.html" # Where I have some HTML documents in the current directory
FileTarget Files found=Baby21.html
baby22.html
charlie 22.html
charlie21.html
charlie22.html
charlie23.html
fileTarget="xorg-x11-fonts*"
NO files found for target=xorg-x11-fonts*
Note that this only works in the current directory, or where the variable fileTarget includes the path you want to inspect.

You can also cut other files out
if [ -e $( echo $1 | cut -d" " -f1 ) ] ; then
...
fi

Use:
if ls -l | grep -q 'xorg-x11-fonts.*' # grep needs a regex, not a shell glob
then
# do something
else
# do something else
fi

man test.
if [ -e file ]; then
...
fi
will work for directory and file.

How to iterate over a directory and display only filename

I want to iterate over the contents of a directory and list only ordinary files.
The path of the directory is given as user input. The script works if the input is the current directory, but not with other directories.
I am aware that this can be done using ls, but I need to use a for ... in control structure.
#!/bin/bash
echo "Enter the path:"
read path
contents=$(ls $path)
for content in $contents
do
if [ -f $content ];
then
echo $content
fi
done

ls is only returning the file names, not including the path. You need to either:
Change your working directory to the path in question, or
Combine the path with the names for your -f test
Option #2 would just change:
if [ -f $content ];
to:
if [ -f "$path/$content" ];
Note that there are other issues here; ls may make changes to the output that break this, depending on wrapping. If you insist on using ls, you can at least make it (somewhat) safer with:
contents="$(command ls -1F "$path")"

You have two ways of doing this properly:
Either loop through the * pattern and test file type:
#!/usr/bin/env bash
echo "Enter the path:"
read -r path
for file in "$path/"*; do
if [ -f "$file" ]; then
echo "$file"
fi
done
Or using find to iterate a null delimited list of file-names:
#!/usr/bin/env bash
echo "Enter the path:"
read -r path
while IFS= read -r -d '' file; do
echo "$file"
done < <(
find "$path" -maxdepth 1 -type f -print0
)
The second way is preferred since it will properly handle files with special characters and offload the file-type check to the find command.

Use find, set to search for files (-type f) from the $path directory:
find "$path" -type f

Here is what you could write:
#!/usr/bin/env bash
path=
while [[ ! $path ]]; do
read -p "Enter path: " path
done
for file in "$path"/*; do
[[ -f $file ]] && printf '%s\n' "$file"
done
If you want to traverse all the subdirectories recursively looking for files, you can use globstar:
shopt -s globstar
for file in "$path"/**; do
printf '%s\n' "$file"
done
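Keep in mind that ** also matches directories, so a loop that should see only regular files still needs a file-type test. The sketch below assumes Bash 4 or later (for globstar) and uses an invented directory tree.

```shell
#!/bin/bash
shopt -s globstar nullglob

# Demo tree (made-up names): two regular files, two directories.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub/deeper"
: > "$tmpdir/top.txt"
: > "$tmpdir/sub/deeper/leaf.txt"

regular_files=()
for file in "$tmpdir"/**; do
    # ** expands to directories as well, so keep only regular files.
    if [[ -f $file ]]; then
        regular_files+=( "$file" )
    fi
done

printf '%s\n' "${regular_files[@]}"
file_count=${#regular_files[@]}
rm -rf -- "$tmpdir"
```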
In case you are looking for specific files based on one or more patterns or some other condition, you could use the find command to pick those files. See this post:
How to loop through file names returned by find?
Related
When to wrap quotes around a shell variable?
Why you shouldn't parse the output of ls
Is double square brackets [[ ]] preferable over single square brackets [ ] in Bash?

How can I creates array that contains the names of all the files in a folder?

Given a folder (my script gets the path of this folder as an argument), how can I create an array that contains the names of all the files in this folder, including the files in any of its subfolders, recursively?
I tried to do it like that :
#!/bin/bash
function get_all_the_files {
for i in "${1}"/*; do
if [ -d "$i" ]; then
get_all_the_files ${1}
else
if [ -f "${i}" ]; then
arrayNamesOfAllTheFiles=(${arrayNamesOfAllTheFiles[@]} "${i}")
fi
fi
done
}
arrayNamesOfAllTheFiles=()
get_all_the_files folder
declare -p arrayNamesOfAllTheFiles
But it's not working. What is the problem and how can I fix it?

To stick with your design (looping on the files and inserting only the regular files), populating the array at each step, but have Bash perform the recursion via the glob, you can use the following:
# the globstar shell option enables the ** glob pattern for recursion
shopt -s globstar
# the nullglob shell option makes non-matching globs expand to nothing (recommended)
shopt -s nullglob
array=()
for file in /path/to/folder/**; do
if [[ ! -h $file && -f $file ]]; then
array+=( "$file" )
fi
done
With the test [[ ! -h $file && -f $file ]] we test that the file is not a symbolic link and a regular file (without testing that the file is not a symbolic link, you would also have the symbolic links that resolve to a regular file).
You also learned about the array+=( "stuff" ) pattern to append to an array, instead of array=( "${array[@]}" "stuff" ).
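The effect of the symbolic-link test can be seen in a small sketch. is_plain_file is a hypothetical helper name, and real.txt and link.txt are invented for the demonstration.

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
trap 'rm -rf -- "$tmpdir"' EXIT      # clean up when the script ends

: > "$tmpdir/real.txt"
ln -s "$tmpdir/real.txt" "$tmpdir/link.txt"

# is_plain_file is a hypothetical helper: true only for a regular file
# that is not reached through a symbolic link.
is_plain_file() {
    [[ ! -h $1 && -f $1 ]]
}

for file in "$tmpdir"/*; do
    if is_plain_file "$file"; then
        echo "kept:    ${file##*/}"
    elif [[ -f $file ]]; then
        echo "skipped: ${file##*/} (symlink to a regular file)"
    fi
done
```

On its own, -f follows the link and reports link.txt as a regular file, which is why the additional -h test is needed.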
Another possibility (with Bash ≥ 4.4 where the -d option of mapfile is implemented) and with GNU find (that supports the -print0 predicate):
mapfile -d '' array < <(find /path/to/folder -type f -print0)

You almost had it right. There is a small typo in the recursive call:
if [ -d "$i" ]; then
get_all_the_files ${1}
else
should be
if [ -d "$i" ]; then
get_all_the_files ${i}
else
I will add that using arrays like this in bash is very unidiomatic. If you are trying to work with recursive trees of files, it's more usual to use tools like find and xargs.
find . -type f -print0 | xargs -0 command-or-script-to-run-on-each-file

Add .old to files without .old in them, having trouble with which variable to use?

#!/bin/bash
for filenames in $( ls $1 )
do
echo $filenames | grep "\.old$"
if [ ! $filenames = 0 ]
then
$( mv "$1/$filenames" "$1/$filenames.old" )
fi
done
So I think most of the script works. It is intended to take the output of ls for the directory given as the first parameter and search for any files with .old at the end. Any files that do not end in .old will then be renamed.
The script successfully renames the files, but it will add .old to a file already containing the extension. I am assuming that the if variable is wrong, but I cannot figure out which variable to use in this case.
Answer is in the key but if anyone needs to do this here is an even easier way:
#!/bin/bash
for filenames in $( ls $1 | grep -v "\.old$" )
do
$( mv "$1/$filenames" "$1/$filenames.old" )
done

Use find for this:
find /directory/here -type f ! -iname "*.old" -exec mv {} {}.old \;
Problems with the original approach
for filenames in $( ls $1 ): never parse ls output.
Variables are not double quoted, say in if [ ! $filenames = 0 ]. This results in word-splitting. Use "$filenames" unless you expect word splitting.
So the final script would be
#!/bin/bash
if [ -d "$1" ]
then
find "$1" -type f ! -iname "*.old" -exec mv {} {}.old \;
# use -maxdepth 1 with find if you don't wish to recursively check subdirectories
else
echo "Directory : $1 doesn't exist !"
fi
Usage
./script '/path/to/directory'

Don't use ls in scripts.
#!/bin/bash
for filename in "$1"/*
do
case $filename in *.old) continue;; esac
mv "$filename" "$filename.old"
done
I prefer case over if because it supports wildcard matching naturally and portably. (You could run this with /bin/sh just as well.) If you wanted to use if instead, that'd be
if echo "$filename" | grep -q '\.old$'; then
or more idiomatically, but recent shells only,
if [[ "$filename" == *.old ]]; then

You want to avoid calling additional utilities if simple shell builtins will do. Why? Each additional utility you call (grep, etc.) is spawned and run in a separate process of its own. If you spawn a process for every iteration of your loop, things will really slow down. If the shell doesn't provide a feature, then sure, calling a utility is the right thing to do.
As mentioned above, shell globbing along with parameter expansion with substring removal provides a simple test for determining if a file has an .old extension. All you need is:
for i in "$1"/*; do
[ "${i##*.}" = "old" ] || mv "$i" "${i}.old"
done
(Note: for a name that contains no dot at all, ${i##*.} leaves the string unchanged, so such files are simply renamed like any other. Additionally, the solution with find is a fine approach as well.)
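The two substring-removal expansions used in this thread can be explored on a few sample names. ext_of and stem_of are hypothetical helper names and the file names are invented.

```shell
#!/bin/bash
# ext_of and stem_of are hypothetical helper names.
ext_of()  { printf '%s\n' "${1##*.}"; }   # remove the longest prefix ending in "."
stem_of() { printf '%s\n' "${1%.*}"; }    # remove the shortest suffix starting with "."

for name in report.txt archive.tar.gz notes.old README; do
    printf '%-16s ext=%-8s stem=%s\n' "$name" "$(ext_of "$name")" "$(stem_of "$name")"
done
```

When a name has no dot, neither pattern matches and both expansions return the name unchanged, so a comparison against a bare extension such as old is false for it.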

I solved the problem; I had been misled by my instructor!
$? is the variable that holds the exit status of the most recently executed foreground pipeline (here, the one ending in grep). The new code is unchanged except for
if [ ! $? = 0 ]
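A minimal sketch of the $? technique in isolation follows. lacks_old_suffix is a hypothetical helper name and the two file names are invented. The status is captured right after the pipeline, because any later command overwrites $?.

```shell
#!/bin/bash
# lacks_old_suffix is a hypothetical helper: succeeds when $1 does NOT end in .old
lacks_old_suffix() {
    printf '%s\n' "$1" | grep -q '\.old$'
    local status=$?     # read immediately; the next command would overwrite $?
    [ "$status" -ne 0 ]
}

for filename in notes.txt backup.old; do
    if lacks_old_suffix "$filename"; then
        echo "$filename: would be renamed to $filename.old"
    else
        echo "$filename: already ends in .old, skipped"
    fi
done
```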

How to use grep in a for loop

Could someone please help with this script. I need to use grep to loop to through the filenames that need to be changed.
#!/bin/bash
file=
for file in $(ls $1)
do
grep "^.old" | mv "$1/$file" "$1/$file.old"
done

bash can handle regular expressions without using grep.
for f in "$1"/*; do
[[ $f =~ \.old ]] && continue
# Or a pattern instead
# [[ $f == *.old* ]] && continue
mv "$f" "$f.old"
done
You can also move the name checking into the pattern itself:
shopt -s extglob
for f in "$1/"!(*.old*); do
mv "$f" "$f.old"
done
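One detail worth checking is that the unanchored regex \.old matches anywhere in the name, not only at the end. The sketch below contrasts it with an anchored variant. Both function names and all file names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helpers contrasting the two tests.
matches_anywhere() { [[ $1 =~ \.old ]]; }     # .old may appear anywhere in the name
ends_with_old()    { [[ $1 =~ \.old$ ]]; }    # .old must be the final suffix

for name in report.old my.older.txt plain.txt; do
    if matches_anywhere "$name"; then anywhere=yes; else anywhere=no; fi
    if ends_with_old "$name"; then at_end=yes; else at_end=no; fi
    printf '%-14s anywhere=%-4s at_end=%s\n' "$name" "$anywhere" "$at_end"
done
```

Which variant is appropriate depends on whether names such as my.older.txt should be left alone or renamed.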

If I understand your question correctly, you want to rename a file (i.e. dir/file.txt ==> dir/file.old) only if it has not been renamed before. The solution is as follows.
#!/bin/bash
for file in "$1/"*
do
backup_file="${file%.*}.old"
if [ ! -e "$backup_file" ]
then
echo mv "$file" "$backup_file"
fi
done
Discussion
The script currently does not actually make the backup; it only displays the action. Run the script once and examine the output. If this is what you want, remove the echo from the script and run it again.
Update
Here is the no if solution:
ls "$1/"* | grep -v ".old" | while read file
do
echo mv "$file" "${file}.old"
done
Discussion
The ls command lists all files.
The grep command filters out the files that have the .old extension so they won't be passed on.
The while loop reads the file names that do not have the .old extension one by one and renames them.
