Sed for loop GNU Linux on Synology NAS - bash

I am working on a short script to search a large number of folders on a NAS for this odd character  and delete the character. I am on a Synology NAS running Linux. This is what I have so far.
#!/bin/bash
for file in "$(find "/volume1/PLNAS/" -depth -type d -name '**')";
do
echo "$file";
mv "$file" "$(echo $file | sed s/// )";
done
Current problem is that the kernel does not appear to be passing each mv command separately. I get a long error message that appears to list every file in one command; a truncated error message is below. There are spaces in my file path, and that is why I have tried to quote every variable.
mv: failed to access '/volume1/PLNAS/... UT Thickness Review ': File name too long

Several issues. The most important is probably that for file in "$(find...)" iterates only once, with file set to the full result of your search. That is what the double quotes do: they prevent word splitting.
But for file in $(find...) is not safe either: if some file names contain spaces, they will be split...
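Both failure modes are easy to see in a scratch directory. The following is only a minimal sketch: the directory names and the demo, quoted, and unquoted variables are made up for illustration, and it assumes the path returned by mktemp contains no spaces.

```bash
#!/usr/bin/env bash
# Scratch directory with two sub-directories whose names contain a space.
demo=$(mktemp -d)
mkdir "$demo/one dir" "$demo/two dir"

# Quoted command substitution: the whole find output is a single word.
quoted=0
for d in "$(find "$demo" -mindepth 1 -type d)"; do
  quoted=$((quoted + 1))
done

# Unquoted: the output is split on every space and newline.
unquoted=0
for d in $(find "$demo" -mindepth 1 -type d); do
  unquoted=$((unquoted + 1))
done

printf 'quoted: %d iteration(s), unquoted: %d iteration(s)\n' "$quoted" "$unquoted"
```

With two directories this should report 1 iteration for the quoted form and 4 for the unquoted one, and neither matches the real number of directories.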
Assuming the character is unicode 0xf028 (  ) try the following:
while IFS= read -r -d '' file; do
new_file="${file//$'\uf028'}"
printf 'mv %s %s\n' "$file" "$new_file"
# mv "$file" "$new_file"
done < <(find "/volume1/PLNAS/" -depth -type d -name $'*\uf028*' -print0)
Uncomment the mv line if things look correct.
As your file names are unusual we use the -d '' read option and the -print0 find option. These make both commands use the NUL character (ASCII code zero) as the separator between file names instead of the default newline character. NUL is the only character that cannot appear in a full file name.
We also use the bash $'...' expansion to represent the unwanted character by its unicode hexadecimal code (the \u escape needs bash 4.2 or later); this is safer than copy-pasting the glyph. The new name is computed with the bash pattern substitution ${var//pattern}, which deletes every match when no replacement string is given.
Note: do not use echo with unusual strings, especially without quoting the strings (e.g. your echo $file | ...). Prefer printf or quoted here strings (sed ... <<< "$file").
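One thing to watch with the loop above: ${file//...} removes the character from the entire path. If a matching directory sits inside another matching directory, the computed target points into a parent that has not been renamed yet (because of -depth), and mv fails. A possible variation, sketched here under the assumption that only the last path component should change, is below. strip_char_from_dirs is a hypothetical helper name, and the demo uses the byte sequence $'\xef\x80\xa8' (the UTF-8 encoding of U+F028) so that it behaves the same in any locale.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove the character $2 from the names of all
# directories below $1, changing only the last path component each time.
strip_char_from_dirs() {
  local root=$1 bad=$2 dir parent name
  while IFS= read -r -d '' dir; do
    parent=${dir%/*}    # everything before the last slash
    name=${dir##*/}     # last path component only
    # -n: do not overwrite an existing target
    mv -n -- "$dir" "$parent/${name//"$bad"/}"
  done < <(find "$root" -depth -type d -name "*${bad}*" -print0)
}

# Demo in a scratch directory with nested names, spaces, and a trailing space.
bad=$'\xef\x80\xa8'
demo=$(mktemp -d)
mkdir -p "$demo/top${bad} dir/inner${bad} dir "
strip_char_from_dirs "$demo" "$bad"
find "$demo" -type d
```

On the NAS the call would presumably be strip_char_from_dirs /volume1/PLNAS/ $'\uf028'; trying it on a copy of the tree first seems wise.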

Related

Why bash ignored the quotation in ls output?

Below is a script and its output describing the problem I found today. Even though ls output is quoted, bash still breaks at the whitespaces. I changed to use for file in *.txt, just want to know why bash behaves this way.
[chau@archlinux example]$ cat a.sh
#!/bin/bash
FILES=$(ls --quote-name *.txt)
echo "Value of \$FILES:"
echo $FILES
echo
echo "Loop output:"
for file in $FILES
do
echo $file
done
[chau@archlinux example]$ ./a.sh
Value of $FILES:
"b.txt" "File with space in name.txt"
Loop output:
"b.txt"
"File
with
space
in
name.txt"
Why bash ignored the quotation in ls output?
Because word splitting happens on the result of variable expansion.
When evaluating a statement the shell goes through different phases, called shell expansions. One of these phases is "word splitting". Word splitting literally does split your variables into separate words, quoting from the bash manual:
The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then sequences of <space>, <tab>, and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words. ...
When the shell sees $FILES outside double quotes, it first performs parameter expansion: $FILES expands to the string "b.txt" "File with space in name.txt". Then word splitting occurs, so with the default IFS the resulting string is split on spaces, tabs, and newlines. The quote characters inside the value are ordinary data at this point; quote removal applies only to quotes typed in the command itself, not to quotes produced by an expansion.
To prevent word splitting, $FILES itself has to be inside double quotes; quotes inside the value of $FILES do not count.
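A small experiment makes this visible without involving ls at all. This is just an illustrative sketch; the array names unquoted and quoted are made up here.

```bash
#!/usr/bin/env bash
# The same string that ls --quote-name produced in the question.
files='"b.txt" "File with space in name.txt"'

unquoted=( $files )     # word splitting applies; the quote characters are plain data
quoted=( "$files" )     # no word splitting: the whole value stays one word

printf 'unquoted: %d words, quoted: %d word\n' "${#unquoted[@]}" "${#quoted[@]}"
printf '<%s>\n' "${unquoted[@]}"
```

The unquoted expansion yields 6 words, the first of which still carries its literal double quotes, while the quoted expansion yields 1.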
Well, you could do this (unsafe):
ls -1 --quote-name *.txt |
while IFS= read -r file; do
eval file="$file"
ls -l "$file"
done
tell ls to output newline separated list -1
read the list line by line
re-evaluate the variable to remove the quotes with evil. I mean eval
I use ls -l "$file" inside the loop to check if "$file" is a valid filename.
This will still not work on all filenames, because of ls. Filenames with unreadable characters are just ignored by my ls, like touch "c.txt"$'\x01'. And filenames with embedded newlines will have problems like ls $'\n'"c.txt".
That's why it's advisable to forget ls in scripts - ls is only for nice-pretty-printing in your terminal. In scripts use find.
If your filenames have no newlines embedded in them, you can:
find . -mindepth 1 -maxdepth 1 -name '*.txt' |
while IFS= read -r file; do
ls -l "$file"
done
If your filenames are just anything, use a null-terminated stream:
find . -mindepth 1 -maxdepth 1 -name '*.txt' -print0 |
while IFS= read -r -d '' file; do
ls -l "$file"
done
Many, many unix utilities (grep -z, xargs -0, cut -z, sort -z) come with support for handling zero-terminated strings/streams just for handling all the strange filenames you can have.
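As a rough illustration of how these NUL-aware tools chain together, here is a sketch that sorts a find result with sort -z and collects the names in an array. The file names and the demo and names variables are invented for the example; one name deliberately contains a newline.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/b.txt" "$demo/File with space in name.txt" "$demo/two"$'\n'"lines.txt"

names=()
# Every stage of the pipeline keeps NUL as the record separator.
while IFS= read -r -d '' f; do
  names+=( "${f##*/}" )    # keep only the file name part
done < <(find "$demo" -mindepth 1 -maxdepth 1 -name '*.txt' -print0 | sort -z)

printf 'found %d files\n' "${#names[@]}"
printf '<%s>\n' "${names[@]}"
```

The loop reads from a process substitution rather than a pipe so that the names array is still set after the loop ends.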
You can try the following snippet:
#!/bin/bash
while read -r file; do
echo "$file"
done < <(ls --quote-name *.txt)

How to surround find's -name parameter with wildcards before and after a variable?

I have a list of newline-separated strings. I need to iterate through each line, and use the argument surrounded with wildcards. The end result will append the found files to another text file. Here's some of what I've tried so far:
cat < ${INPUT} | while read -r line; do find ${SEARCH_DIR} -name $(eval *"$line"*); done >> ${OUTPUT}
I've tried many variations of eval/$() etc, but I haven't found a way to get both of the asterisks to remain. Mostly, I get things that resemble *$itemFromList, but it's missing the second asterisk, resulting in the file not being found. I think this may have something to do with bash expansion, but I haven't had any luck with the resources I've found so far.
Basically, need to supply the -name parameter with something that looks like *$itemFromList*, because the file has words both before and after the value I'm searching for.
Any ideas?
Use double quotes to prevent the asterisk from being interpreted as an instruction to the shell rather than find.
-name "*$line*"
Thus:
while read -r line; do
line=${line%$'\r'} # strip trailing CRs if input file is in DOS format
find "$SEARCH_DIR" -name "*$line*"
done <"$INPUT" >>"$OUTPUT"
...or, better:
#!/usr/bin/env bash
## use lower-case variable names
input=$1
output=$2
args=( -false ) # for our future find command line, start with -false
while read -r line; do
line=${line%$'\r'} # strip trailing CR if present
[[ $line ]] || continue # skip empty lines
args+=( -o -name "*$line*" ) # add an OR clause matching if this line's substring exists
done <"$input"
# since our last command is find, use "exec" to let it replace the shell in memory
exec find "$SEARCH_DIR" '(' "${args[@]}" ')' -print >"$output"
Note:
The shebang specifying bash ensures that extended syntax, such as arrays, is available.
See BashFAQ #50 for a discussion of why an array is the correct structure to use to collect a list of command-line arguments.
See the fourth paragraph of http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html for the relevant POSIX specification on environment and shell variable naming conventions: All-caps names are used for variables with meaning to the shell itself, or to POSIX-specified tools; lowercase names are reserved for application use. That script you're writing? For purposes of the spec, it's an application.
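To check what the array-building approach hands to find, it can help to print the argument list instead of running the command. The sketch below feeds three made-up input lines (one with a DOS line ending, one empty) through the same kind of loop and prints each resulting argument in brackets.

```bash
#!/usr/bin/env bash
args=( -false )
while IFS= read -r line; do
  line=${line%$'\r'}              # strip a trailing CR if present
  [[ $line ]] || continue         # skip empty lines
  args+=( -o -name "*$line*" )    # one OR clause per search term
done < <(printf 'report\r\ntwo words\n\n')

# Show each argument of the would-be find command in brackets.
printf '[%s] ' find . '(' "${args[@]}" ')' -print
echo
```

Each pattern, including the one containing a space, shows up as a single bracketed argument, which is the point of collecting the arguments in an array.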

Unexpected sequence in bash recursive loop

This is how I expect a bash loop to sequence the output:
for i in $(seq 2); do
echo $i
echo $(expr $i + 10)
done
1
11
2
12
This is how it sequences for a recursive folder file operation:
for file in "$(find . -name '*.txt')"; do
echo "$file";
newfile="${file//\.txt/.csv}"
echo "$newfile";
mv '$file' '$newfile'
done
./dir1/a.txt
./dir2/b.txt
./dir2/dir3/c.txt
./dir1/a.csv
./dir2/b.csv
./dir2/dir3/c.csv
mv: rename $file to $newfile: No such file or directory
I've tried the mv call with name variables wrapped in " and no quotes, which return different errors.
Grateful for a pointer where I'm going wrong.
There should not be quotes around the $(find) command: the quotes cause all of the file names to be concatenated into one large string. The quotes in the mv command should be double quotes: variables aren't expanded inside single quotes.
for file in $(find . -name '*.txt'); do
echo "$file"
newfile="${file//\.txt/.csv}"
echo "$newfile"
mv "$file" "$newfile"
done
This isn't the best way to loop through a list of files. It'll trip up on any file names with spaces. A better way is to pipe find to a read loop.
find . -name '*.txt' | while read file; do
...
done
This will handle most file names fine. It'll still have trouble with files with leading spaces, with backslashes, or with embedded newlines (which, technically, are legal). To handle those:
find . -name '*.txt' -print0 | while IFS= read -r -d $'\0' file; do
...
done
-print0 and -d $'\0' take care of newlines. IFS= keeps read from dropping leading whitespace. -r tells it not to interpret backslashes specially.
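The effect of IFS= and -r can be checked in isolation. A small sketch, using an invented name with two leading spaces and a backslash:

```bash
#!/usr/bin/env bash
name='  back\slash.txt'

read plain <<< "$name"            # leading spaces dropped, backslash consumed
IFS= read no_ifs <<< "$name"      # spaces kept, backslash still consumed
IFS= read -r exact <<< "$name"    # line kept exactly as it was

printf '[%s]\n' "$plain" "$no_ifs" "$exact"
```

Only the last form hands the name back unchanged.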
For what it's worth, the . in .txt doesn't need to be escaped. . isn't a special character here. And /% would be better than // since the replacement should only be done at the end of the string.
newfile=${file/%.txt/.csv}
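The difference between // and /% only shows up when .txt also occurs somewhere else in the path. A quick sketch with a made-up path:

```bash
#!/usr/bin/env bash
file='./notes.txt.d/a.txt'

all=${file//.txt/.csv}       # replaces every occurrence
suffix=${file/%.txt/.csv}    # replaces only a match at the end of the string
trimmed=${file%.txt}.csv     # equivalent here: strip the suffix, then append

printf '%s\n' "$all" "$suffix" "$trimmed"
```

The first form also rewrites the directory name (./notes.csv.d/a.csv), which would make mv fail; the other two both give ./notes.txt.d/a.csv.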

Why does this script not find directories with names ending in spaces

I have the following script to recursively clean up directories when they no longer contain (any directories with) any .mp3 or .ogg files:
set -u
find -L $1 -depth -type d | while read dir
do
songList=`find -L "$dir" -type f \( -iname '*.ogg' -o -iname '*.mp3' \)` && {
if [[ -z "$songList" ]]
then
echo removing "$dir"
rm -rf "$dir"
fi
}
done
This works great, except that it fails in the case of directories that have a space as the last character of their name, in which case the second find fails, with the following feedback, if the script is invoked with . as its only argument, and a directory with the path './FOO/BAR BAZ ' (note the space at the end) exists:
find: `./FOO/BAR BAZ': No such file or directory
(Note the space that is now missing at the end, though other spaces are left intact.)
I'm pretty sure it's a quoting thing, but every other way of quoting I've tried makes the behavior worse (i.e. more directories failing).
read strips leading and trailing whitespace (the characters in $IFS) from each line, which is why the space at the end of the name disappears. Quoting help read:
Read a line from the standard input and split it into fields.
Reads a single line from the standard input, or from file descriptor FD
if the -u option is supplied. The line is split into fields as with word
splitting, and the first word is assigned to the first NAME, the second
word to the second NAME, and so on, with any leftover words assigned to
the last NAME. Only the characters found in $IFS are recognized as word
delimiters.
You could set IFS to the empty string for the read command to avoid this, and add -r so that backslashes are kept literally. Say:
find -L "$1" -depth -type d | while IFS= read -r dir
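The trailing-space behaviour can be reproduced without find. A minimal sketch using the path from the question:

```bash
#!/usr/bin/env bash
path='./FOO/BAR BAZ '            # note the trailing space

read stripped <<< "$path"        # default IFS: trailing space is removed
IFS= read -r kept <<< "$path"    # empty IFS: the line is kept as is

printf '[%s]\n' "$stripped" "$kept"
```

The brackets make the missing space in the first result visible.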

Replace underscores to whitespaces using bash script

How can I replace all underscore chars with a whitespace in multiple file names using Bash Script? Using this code we can replace underscore with dash. But how it works with whitespace?
for i in *.mp3;
do x=$(echo $i | grep '_' | sed 's/_/\-/g');
if [ -n "$x" ];
then mv $i $x;
fi;
done;
Thank you!
This should do:
for i in *.mp3; do
[[ "$i" = *_* ]] && mv -nv -- "$i" "${i//_/ }"
done
The test [[ "$i" = *_* ]] tests if file name contains any underscore and if it does, will mv the file, where "${i//_/ }" expands to i where all the underscores have been replaced with a space (see shell parameter expansions).
The option -n to mv means no clobber: will not overwrite any existent file (quite safe). Optional.
The option -v to mv is for verbose: will say what it's doing (if you want to see what's happening). Very optional.
The -- is here to tell mv that the arguments will start right here. This is always good practice, as if a file name starts with a -, mv will try to interpret it as an option, and your script will fail. Very good practice.
Another comment: When using globs (i.e., for i in *.mp3), it's always very good to either set shopt -s nullglob or shopt -s failglob. The former will make *.mp3 expand to nothing if no files match the pattern (so the loop will not be executed), the latter will explicitly raise an error. Without these options, if no files matching *.mp3 are present, the code inside loop will be executed with i having the verbatim value *.mp3 which can cause problems. (well, there won't be any problems here because of the guard [[ "$i" = *_* ]], but it's a good habit to always use either option).
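For what it's worth, the nullglob difference is easy to demonstrate in an empty scratch directory. This is only a sketch; the demo directory and the counter names are made up.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)    # empty directory, so nothing matches *.mp3

shopt -u nullglob
default_runs=0
for i in "$demo"/*.mp3; do       # the unmatched pattern is kept literally
  default_runs=$((default_runs + 1))
done

shopt -s nullglob
nullglob_runs=0
for i in "$demo"/*.mp3; do       # the unmatched pattern expands to nothing
  nullglob_runs=$((nullglob_runs + 1))
done
shopt -u nullglob

printf 'default: %d iteration(s), nullglob: %d\n' "$default_runs" "$nullglob_runs"
```

Without nullglob the loop body runs once with the literal pattern; with it the body does not run at all.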
Hope this helps!
The reason your script is failing with spaces is that the filename gets treated as multiple arguments when passed to mv. You'll need to quote the filenames so that each filename is treated as a single argument. Update the relevant line in your script with:
mv "$i" "$x"
# where $i is your original filename, and $x is the new name
As an aside, if you have the perl version of the rename command installed, you can skip the script and achieve the same thing using:
rename 's/_/ /g' *.mp3
Or if you have the more classic rename command:
rename "_" " " *.mp3
Using tr (note that this translates the contents of file1 into file2; it does not rename files):
tr '_' ' ' <file1 >file2
