bash - forcing globstar asterisk expansion when passed to loop

I am trying to write a script that uses globstar expressions to execute a command (for example ls):
#!/usr/bin/env bash
shopt -s globstar nullglob
DISCOVERED_EXTENSIONS=$(find . -type f -name '*.*' | sed 's|.*\.||' | sort -u | tr '\n' ' ' | sed "s| | ./\**/*.|g" | rev | cut -c9- | rev | echo "./**/*.$(</dev/stdin)")
IFS=$'\n'; set -f
for f in $(echo $DISCOVERED_EXTENSIONS | tr ' ' '\n'); do
ls $f;
done
unset IFS; set +f
shopt -u globstar nullglob
The script output is:
ls: ./**/*.jpg: No such file or directory
ls: ./**/*.mp4: No such file or directory
It is passing ls "./**/*.avi" instead of ls ./**/*.avi (the pattern is passed literally and never expanded). I tried eval, envsubst and even a custom expand function, to no avail.
The result of echo "$DISCOVERED_EXTENSIONS" is:
./**/*.jpg ./**/*.mp4
What changes can be recommended so that the value of $f is the result of glob expansion and not the expression itself?
EDIT: I'm keeping the question up. I resolved my immediate problem by not using globstar at all, but that doesn't answer the question.
As pynexj points out, set -f undoes the effect of shopt -s globstar nullglob, so the script as written cannot work, and simply removing set -f breaks it in a different way.

$f is the result of glob expansion
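The result of glob expansion is a list of arguments, which can be saved in an array. One way to get it is to run a child shell that expands the pattern and transfers the data back, NUL-delimited. The child shell needs globstar enabled as well, hence -O globstar:
mapfile -t -d '' arr < <(bash -O globstar -c 'printf "%s\0" '"$f")
ls "${arr[@]}"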
Notes:
Do not do for i in $(....). Use a while IFS= read -r loop. See the BashFAQ entry on how to read a stream line by line.
I have no idea what is going on in that long DISCOVERED_EXTENSIONS line, but I would use find . -maxdepth 1 -type f -name '*.*' -exec bash -c 'printf "%s\n" "${0##*.}"' {} \; | sort -u instead.
I usually recommend using find instead of globbing, and working on pipelines/streams. I guess I would write it as: find . -maxdepth 1 -type f -name '*.*' -exec bash -c 'printf "%s\n" "${0##*.}"' {} \; | sort -u | while IFS= read -r ext; do find . -type f -name "*.$ext" | xargs -d '\n' ls; done
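If staying with globstar is preferred, the pattern can also be expanded in the current shell rather than in a child process. The following is a minimal sketch, not the asker's script: the helper name expand_pattern, the array name matches and the sample files are made up for illustration, and it assumes bash 4 or later with globbing left enabled (no set -f).

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob

# Throwaway directory with sample files (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/x.jpg" "$workdir/a/y.jpg" "$workdir/a/b/z.mp4"
cd "$workdir" || exit 1

# expand_pattern PATTERN: store the files matching PATTERN in the array "matches".
# Hypothetical helper, not part of bash.
expand_pattern() {
  local IFS=
  # IFS is empty here, so the unquoted expansion is not word-split,
  # but pathname expansion (including **) still applies to it.
  matches=( $1 )
}

for pattern in './**/*.jpg' './**/*.mp4' './**/*.avi'; do
  expand_pattern "$pattern"
  printf '%s: %d match(es)\n' "$pattern" "${#matches[@]}"
  if (( ${#matches[@]} > 0 )); then
    ls -- "${matches[@]}"
  fi
done
```

With nullglob set, a pattern that matches nothing yields an empty array, so ls is never handed the literal pattern.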


Bash Cutting a Filename as a String in a Find Loop?

I'm trying to use cut to parse filenames, but I'm having difficulty doing so in a find loop. The intention is to convert my music library from ARTIST - TITLE.EXT to TITLE.EXT.
So if I had the file X - Y.EXT, it should yield Y.EXT as output.
The current function is something like this:
find . -iname "*.mp3" -exec cut -d "-" -f 2 <<< "`echo {}`" \;
The syntax above looks a bit strange (why not just use <<< {} \; instead of the echo {}?), but cut seems to parse the file's contents rather than its name if it isn't given a string.
Another attempt I had looked something like:
find . -iname "*.mp3" -exec TRACKTITLE=`echo {} | cut -d '-' -f2` \; -exec echo "$TRACKTITLE" \;
But this fails with find: ‘TRACKTITLE=./DAN TERMINUS - Underwater Cities.mp3’: No such file or directory.
This (cut -d "-" -f 2 <<< FILENAME) command works wonderfully for a single instance (although keeps the space after the "-" character frustratingly).
How can I perform this operation in a find loop?
First, try to extract what you want from the file name with parameter expansion.
file="ARTIST - TITLE.EXT"
echo "${file#* - }"
Output
TITLE.EXT
Using find and invoking a shell with a for loop.
find . -type f -iname "*.mp3" -exec sh -c 'for music; do echo mv -v "$music" "${music#* - }"; done' sh {} +
If there are .mp3 files in subdirectories, just replace -exec with -execdir, if your find supports it.
If for whatever reason -execdir is not available:
find . -type f -iname "*.mp3" -exec sh -c '
for music; do
pathname="${music%/*}"
filename="${music##*/}"
new_music="${filename#* - }"
echo mv -v "$music" "$pathname/$new_music"
done' sh {} +
Remove the echo if you're satisfied with the output.
See Understanding -exec option to Find
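The parameter expansions used above can be tried on a single string before touching any files. This is a small sketch using one of the filenames from the question; the variable names shortest and longest and the string 'A - B - C.mp3' are made up for illustration.

```shell
music='./DAN TERMINUS - Underwater Cities.mp3'

pathname=${music%/*}         # strip the shortest trailing "/..." -> directory part
filename=${music##*/}        # strip the longest leading ".../"   -> file name
new_music=${filename#* - }   # strip the shortest leading "... - " -> title
printf '%s -> %s\n' "$music" "$pathname/$new_music"

# "#" removes the shortest matching prefix, "##" the longest.
multi='A - B - C.mp3'
shortest=${multi#* - }       # B - C.mp3
longest=${multi##* - }       # C.mp3
printf '%s | %s\n' "$shortest" "$longest"
```

The difference between # and ## only matters when the artist or title itself contains " - ".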
The command below prints what it would do; remove the echo to actually run mv:
find . -iname "*.mp3" -exec sh -c 'echo mv "$1" "$(echo "$1" | cut -d - -f2)"' sh {} \;
Example output:
$ find . -iname "*.mp3" -exec sh -c 'echo mv "$1" "$(echo "$1" | cut -d - -f2)"' sh {} \;
mv ./X - Y.mp3 Y.mp3
mv ./ARTIST - TITLE.mp3 TITLE.mp3
Also notice that your cut command will leave a whitespace character at the beginning of the new filename:
$ echo ARTIST\ -\ TITLE.mp3 | cut -d - -f2-
 TITLE.mp3
You don't need the find nor the cut for this task.
for f in *' - '*.mp3; do mv -i "$f" "${f##* - }"; done
will do the job for the current directory.
If you want to descend through directories, then:
shopt -s globstar
for f in ./**/*' - '*.mp3; do
mv -i "$f" "${f%/*}/${f##* - }"
done
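To check the recursive loop safely, it can be run against a scratch directory first. The sketch below assumes bash 4 or later; the directory layout and file names are invented for the test.

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob

# Scratch directory with two sample tracks (illustrative names).
workdir=$(mktemp -d)
mkdir -p "$workdir/album"
touch "$workdir/X - Y.mp3" "$workdir/album/ARTIST - TITLE.mp3"
cd "$workdir" || exit 1

for f in ./**/*' - '*.mp3; do
  # ${f%/*} is the directory part, ${f##* - } everything after the last " - ".
  mv -i "$f" "${f%/*}/${f##* - }"
done

# List what is left after the rename.
find . -type f -name '*.mp3'
```

Each file stays in its own directory because the directory part is put back in front of the new name.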

sed to replace string in file only displayed but not executed

I want to find all files with certain name (Myfile.txt) that do not contain certain string (my-wished-string) and then do a sed in order to do a replace in the found files. I tried with:
find . -type f -name "Myfile.txt" -exec grep -H -E -L "my-wished-string" {} + | sed 's/similar-to-my-wished-string/my-wished-string/'
But this only displays all files with the wished name that lack "my-wished-string"; it does not perform the replacement. Am I missing something here?
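With a for loop and invoking a shell:
find . -type f -name "Myfile.txt" -exec sh -c '
for f; do
grep -q "my-wished-string" "$f" ||
sed -i "s/similar-to-my-wished-string/my-wished-string/" "$f"
done' sh {} +
grep -q prints nothing and only sets its exit status, so sed runs only on files that lack the string. This avoids relying on the exit status of grep -L, which has differed between GNU grep releases. Do not add -n to sed here: combined with -i and no p command, it would write out an empty file.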
You can do this by constructing two stacks; the first containing the files to search, and the second containing negative hits, which will then be iterated over to perform the replacement.
find . -type f -name "Myfile.txt" > stack1
while read -r line;
do
[ -z "$(sed -n '/my-wished-string/p' "${line}")" ] && echo "${line}" >> stack2
done < stack1
while read -r line;
do
sed -i "s/similar-to-my-wished-string/my-wished-string/" "${line}"
done < stack2
With some versions of sed, you can use -i to edit the file. But don't pipe the list of names to sed, just execute sed in the find:
find . -type f -name Myfile.txt -not -exec grep -q "my-wished-string" {} \; -exec sed -i 's/similar-to-my-wished-string/my-wished-string/g' {} \;
Note that any file which contains similar-to-my-wished-string also contains the string my-wished-string as a substring, so with these exact strings the command is a no-op, but I suppose your actual strings are different than these.
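The "replace only where the wished string is missing" logic can be checked on throwaway files. In this sketch the strings old-token and new-token are placeholders chosen so that neither is a substring of the other, and a temporary file plus mv is used instead of sed -i, since the -i option is not handled the same way by every sed.

```shell
# Two sample files: only the first one lacks "new-token".
workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b"
printf 'value=old-token\n' > "$workdir/a/Myfile.txt"
printf 'value=new-token\nother=old-token\n' > "$workdir/b/Myfile.txt"

find "$workdir" -type f -name 'Myfile.txt' -exec sh -c '
for f; do
  # Edit only the files that do not already contain the wished string.
  if ! grep -q "new-token" "$f"; then
    sed "s/old-token/new-token/" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  fi
done' sh {} +

first=$(cat "$workdir/a/Myfile.txt")
second=$(cat "$workdir/b/Myfile.txt")
printf '%s\n---\n%s\n' "$first" "$second"
```

The second file keeps its old-token line because it already contained new-token and was therefore skipped.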

xargs argument not interpreted

I have a directory $dir that contains .txt.xy files and subdirectories with .txt.xy files. I try to iterate over each file and pass the whole path as well as the path without $dir as argument to a program like this:
dir="/path/to/"
suffix=".xy"
find "$dir" -name "*.txt.xy" -print0 | xargs -0 -I {} sh -c 'program "$1" |
subprogram filename="$2"' _ {} "$(echo {} | sed -e "s#^${dir}##" -e "s#${suffix}\$##")"
$1 should be the full path (e.g. /path/to/subdir/file.txt.xy)
$2 should be the full path without $dir and $suffix (e.g. subdir/file.txt)
$1 is the proper full path, but $2 is also the full path, as if the pipe in $(...) is never executed. What am I missing here?
Your attempt seems rather roundabout. It sounds like you are looking for
find /path/to -name "*.txt.xy" -exec sh -c '
for f; do
g=${f##*/}
program "$f" | subprogram filename="${g%.xy}"
done' _ {} +
If you really need your parameters to be in variables, maybe pass in the suffix as $0 which isn't used for anything useful here anyway. It's a bit obscure, but helps avoid the mess you had with double quotes.
find /path/to -name "*.txt.xy" -exec sh -c '
for f; do
g=${f##*/}
program "$f" | subprogram filename="${g%"$0"}"
done' ".xy" {} +
The above simply trims g to the basename which I guess on closer reading you don't want. So pass /path/to in $0 instead, and hardcode .xy inside:
find /path/to -name "*.txt.xy" -exec sh -c '
for f; do
g=${f#"$0"/}
program "$f" | subprogram filename="${g%.xy}"
done' "/path/to" {} +
or if you really need both to be command-line parameters,
dir="/path/to"
suffix=".xy"
find "$dir" -name "*.txt$suffix" -exec sh -c '
suffix=$1
shift
for f; do
g=${f#"$0"/}
program "$f" | subprogram filename="${g%"$suffix"}"
done' "$dir" "$suffix" {} +
One reason for the failure is that the command substitution in double quotes is expanded by the calling shell before xargs even runs. At that point sed sees the literal string {}, leaves it unchanged, and xargs later replaces that {} with the full path. One way to avoid this is to do the whole substitution inside the shell started by sh -c, passing the file name, $dir and $suffix as arguments:
find "$dir" -name "*.txt.xy" -print0 |
xargs -0 -I {} sh -c '
f=$1 dir=$2 suffix=$3
g=${f#"$dir"}
g=${g%"$suffix"}
program "$f" | subprogram filename="$g"
' _ {} "$dir" "$suffix"
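The early expansion can be observed without xargs or the real programs. The sketch below reuses dir and suffix from the question; the variable name early and the sample path are made up for illustration.

```shell
dir="/path/to/"
suffix=".xy"

# This is what the calling shell computes before xargs starts:
# sed receives the literal text "{}" and has nothing to strip.
early=$(echo {} | sed -e "s#^${dir}##" -e "s#${suffix}\$##")
printf 'argument handed to xargs: %s\n' "$early"

# Doing the trimming per file, once the real name is known:
f="/path/to/subdir/file.txt.xy"
g=${f#"$dir"}       # drop the leading directory
g=${g%"$suffix"}    # drop the trailing suffix
printf 'trimmed: %s\n' "$g"
```

Since the argument is still {}, xargs -I {} substitutes the full path into it just as it does for the first argument.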

Bash new line feed in results [duplicate]

Trying to create a mysql backup script.
However, I am finding that I am getting line feeds in the results:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
do echo "'$i'";
done
And the results show:
'/home/site1/public_html/folders/wp-config.php'
'/home/site2/public_html/New'
'Website/wp-config.php'
'/home/site3/public_html/wp-config.php'
'/home/site4/public_html/old'
'website/wp-config.php'
'/home/site5/public_html/wp-config.php'
Running ls from the command line, we see for the folders in question:
New\ website
old\ website
so the loop seems to be treating the '\' as a newline character.
OK.. Doing some research:
https://stackoverflow.com/a/5928254/175063
${foo/ /.}
Updating for what we may want:
${i/\ /}
The code now becomes:
#!/bin/bash
cd /home
for i in $(find $PWD -type f -name "wp-config.php" |${i/\ /});
do echo "'$i'";
done
Ref. https://tomjn.com/2014/03/01/wordpress-bash-magic/
Ultimately, I really want something like this:
#!/bin/bash
# delete files older than 7 days
## find /home/dummmyacount/backups/ -type f -name '*.7z' -mtime +7 -exec rm {} \;
# set a date variable
DT=$(date +"%m-%d-%Y")
cd /home
for i in $(find $PWD -type f -name "wp-config.php" );
WPDBNAME=`cat $i | grep DB_NAME | cut -d \' -f 4`
WPDBUSER=`cat $i | grep DB_USER | cut -d \' -f 4`
WPDBPASS=`cat $i | grep DB_PASSWORD | cut -d \' -f 4`
do echo "$i";
#do echo $File;
#mysqldump...
done
You can do this:
find . -type f -name "wp-config.php" -print0 | while IFS= read -r -d '' f
do
printf '[%s]\n' "$f"
done
This uses the NUL character as the delimiter, so spaces, newlines and other special characters in file names are preserved.
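The same NUL-delimited loop can collect the paths into an array for later use, for example for the backup step. This is a sketch against a scratch directory, assuming bash; the site directory names and the array name configs are invented.

```shell
#!/usr/bin/env bash

# Scratch tree with one directory name containing a space (illustrative).
workdir=$(mktemp -d)
mkdir -p "$workdir/site1" "$workdir/site2/New Website"
touch "$workdir/site1/wp-config.php" "$workdir/site2/New Website/wp-config.php"

configs=()
# Process substitution keeps the loop in the current shell,
# so the array is still filled after the loop ends.
while IFS= read -r -d '' f; do
  configs+=("$f")
done < <(find "$workdir" -type f -name 'wp-config.php' -print0)

printf '[%s]\n' "${configs[@]}"
```

Each path stays in one piece, including the one under "New Website".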

Looping over filtered find and performing an operation

I have a garbage dump of a bunch of Wordpress files and I'm trying to convert them all to Markdown.
The script I wrote is:
htmlDocs=($(find . -print | grep -i '.*[.]html'))
for html in "${htmlDocs[@]}"
do
P_MD=${html}.markdown
echo "${html} \> ${P_MD}"
pandoc --ignore-args -r html -w markdown < "${html}" | awk 'NR > 130' | sed '/<div class="site-info">/,$d' > "${P_MD}"
done
As far as I understand, the first line should be making an array of all html files in all subdirectories, then the for loop has a line to create a variable with the Markdown name (followed by a debugging echo), then the actual pandoc command to do the conversion.
One at a time, this command works.
However, when I try to execute it, OSX gives me:
$ ./pandoc_convert.command
./pandoc_convert.command: line 1: : No such file or directory
./pandoc_convert.command: line 1: : No such file or directory
o_0
Help?
The script can fail in several ways, because the way you create the array is incorrect:
htmlDocs=($(find . -print | grep -i '.*[.]html'))
Arrays are assigned in the form NAME=(VALUE1 VALUE2 ... ), where NAME is the name of the variable and VALUE1, VALUE2, and the rest are fields separated by the characters in the $IFS (internal field separator) variable. If find returns a file name containing spaces, the expression splits it into separate array items.
Another issue is that the expression doesn't handle globbing, i.e. file name generation based on the shell expansion of special characters such as *:
mkdir dir.html
touch \ *.html
touch a\ b\ c.html
a=($(find . -print | grep -i '.*[.]html'))
for html in "${a[@]}"; do echo ">>>${html}<<<"; done
Output
>>>./a<<<
>>>b<<<
>>>c.html<<<
>>>./<<<
>>>a b c.html<<<
>>>dir.html<<<
>>> *.html<<<
>>>./dir.html<<<
I know two ways to fix this behavior: 1) temporarily disable globbing, and 2) use the mapfile command.
Disabling Globbing
# Disable globbing, remember current -f flag value
[[ "$-" == *f* ]] || globbing_disabled=1
set -f
IFS=$'\n' a=($(find . -print | grep -i '.*[.]html'))
for html in "${a[@]}"; do echo ">>>${html}<<<"; done
# Restore globbing
test -n "$globbing_disabled" && set +f
Output
>>>./ .html<<<
>>>./a b c.html<<<
>>>./ *.html<<<
>>>./dir.html<<<
Using mapfile
The mapfile builtin was introduced in Bash 4. It reads lines from standard input into an indexed array:
mapfile -t a < <(find . -print | grep -i '.*[.]html')
for html in "${a[@]}"; do echo ">>>${html}<<<"; done
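Reading lines still breaks on file names that contain a newline. A possible variant, assuming bash 4.4 or later (where mapfile accepts -d) and a find that supports -print0, uses NUL as the delimiter. The sample files and the array name html_files are made up for illustration.

```shell
#!/usr/bin/env bash

# Scratch directory with awkward names (illustrative only).
workdir=$(mktemp -d)
touch "$workdir/a b c.html" "$workdir/ *.html" "$workdir/plain.HTML"
mkdir "$workdir/dir.html"

# -d '' makes mapfile split on NUL; -t strips the delimiter from each element.
mapfile -d '' -t html_files < <(find "$workdir" -type f -iname '*.html' -print0)

for html in "${html_files[@]}"; do
  printf '>>>%s<<<\n' "$html"
done
```

The directory dir.html is left out by -type f, and the names containing spaces or * arrive as single, unexpanded elements.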
The find Options
The find command selects all types of nodes, including directories. You should use the -type option, e.g. -type f for files.
If you want to filter the result set with a regular expression, use the -regex option, or -iregex for case-insensitive matching. Quote the expression so that the shell neither expands it as a glob nor removes the backslash:
mapfile -t a < <(find . -type f -iregex '.*\.html$')
for html in "${a[@]}"; do echo ">>>${html}<<<"; done
Output
>>>./ .html<<<
>>>./a b c.html<<<
>>>./ *.html<<<
echo vs. printf
Finally, prefer printf to echo in new scripts, since echo's handling of options and backslash escapes varies between shells:
mapfile -t a < <(find . -type f -iregex '.*\.html$')
for html in "${a[@]}"; do printf '>>>%s<<<\n' "$html"; done
Alternative Approach
However, I would rather pipe into a loop with read:
find . -type f -iregex '.*\.html$' | while IFS= read -r line
do
printf '>>>%s<<<\n' "$line"
done
In this example, the read command reads one line at a time from standard input and stores it in the line variable.
Although I like the mapfile feature, I find the code with the pipe clearer.
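One caveat with the piped form: in bash with default settings, the loop on the right-hand side of a pipe runs in a subshell, so variables set inside it are lost afterwards. The sketch below illustrates this with two counters; the names piped_count and direct_count and the sample files are invented.

```shell
#!/usr/bin/env bash

workdir=$(mktemp -d)
touch "$workdir/one.html" "$workdir/two.html"

# Loop fed by a pipe: the increments happen in a subshell.
piped_count=0
find "$workdir" -type f -name '*.html' | while IFS= read -r line; do
  piped_count=$((piped_count + 1))
done

# Loop fed by process substitution: the increments happen in this shell.
direct_count=0
while IFS= read -r line; do
  direct_count=$((direct_count + 1))
done < <(find "$workdir" -type f -name '*.html')

printf 'piped=%s direct=%s\n' "$piped_count" "$direct_count"
# prints: piped=0 direct=2
```

If the loop only prints or converts files, as in the pandoc case, the piped form is fine; the difference matters only when results are needed after the loop.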
Try adding the bash shebang and setting IFS to handle spaces in folder and file names:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
htmlDocs=($(find . -print | grep -i '.*[.]html'))
for html in "${htmlDocs[@]}"
do
P_MD=${html}.markdown
echo "${html} \> ${P_MD}"
pandoc --ignore-args -r html -w markdown < "${html}" | awk 'NR > 130' | sed '/<div class="site-info">/,$d' > "${P_MD}"
done
IFS=$SAVEIFS
