Shell: rsync parsing spaces incorrectly in file name/path

I'm trying to pull a list of files over ssh with rsync, but I can't get it to work with filenames that have spaces in them! One example file is this:
/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2. Fundamentos da linguagem/estruturas-de-controle-if-else-if-e-else-v1.mp4
and I'm trying to transfer it using this shell code.
cat $file_name | while read LINE
do
echo $LINE
rsync -aP "$user@$server:$LINE" $local_folder
done
and the error I'm getting is this:
receiving incremental file list
rsync: link_stat "/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2." failed: No such file or directory (2)
rsync: link_stat "/home/pi/Fundamentos" failed: No such file or directory (2)
rsync: link_stat "/home/pi/da" failed: No such file or directory (2)
rsync: change_dir "/home/pi//linguagem" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]
I don't get why it prints the path fine on the screen but parses the file name/path incorrectly! I know spaces normally have to be escaped with backslashes, but I don't know how to solve this. sed (find/replace) didn't help either, and I also tried this code without success:
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
rsync -aP "$user@$server:$line" $local_folder
done < $file_name
What should I do to fix this, and why is this happening?
I read the list of files from a .txt file (each file and path on its own line), and I'm using Ubuntu 14.04. Thanks!

rsync does space splitting by default.
You can disable this using the -s (or --protect-args) flag, or you can escape the spaces within the filename.
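For example, here is a minimal sketch of the loop from the question with --protect-args added (variable names as in the question; a sketch, not a tested drop-in):
while IFS='' read -r line; do
    # --protect-args (-s): rsync sends the path verbatim, so the remote shell does not split it on spaces
    rsync -aP --protect-args "$user@$server:$line" "$local_folder"
done < "$file_name"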

The shell is correctly passing the filename to rsync, but rsync interprets spaces as separating multiple paths on the same server. So in addition to double-quoting the variable expansion to make sure rsync sees the string as a single argument, you also need to quote the spaces within the filename.
If your filenames don't have apostrophes in them, you can do that with single quotes inside the double quotes:
rsync -aP "$user@$server:'$LINE'" "$local_folder"
If your filenames might have apostrophes in them, then you need to quote those (whether or not the filenames also have spaces). You can use bash's built-in parameter substitution to do that (as long as you're on bash 4; older versions, such as the /bin/bash that ships on OS X, have issues with backslashes and apostrophes in such expressions). Here's what it looks like:
rsync -aP "$user@$server:'${LINE//\'/\'\\\'\'}'" "$local_folder"
Ugly, I know, but effective. Explanation follows after the other options.
If you're using an older bash or a different shell, you can use sed instead:
rsync -aP "$user@$server:'$(sed "s/'/'\\\\''/g" <<<"$LINE")'" "$local_folder"
... or if your shell also doesn't support <<< here-strings:
rsync -aP "$user@$server:'$(echo "$LINE" | sed "s/'/'\\\\''/g")'" "$local_folder"
Explanation: we want to replace all apostrophes with... something that becomes a literal apostrophe in the middle of a single-quoted string. Since there's no way to escape anything inside single quotes, we have to first close the quotes, then add a literal apostrophe, and then re-open the quotes for the rest of the string. Effectively, that means we want to replace all occurrences of an apostrophe (') with the sequence (apostrophe, backslash, apostrophe, apostrophe): '\''. We can do that with either bash parameter expansion or sed.
In bash, ${varname/old/new} expands to the value of the variable $varname with the first occurrence of the string old replaced by the string new. Doubling the first slash ( ${varname//old/new} ) replaces all occurrences instead of just the first one. That's what we want here. But since both apostrophe and backslash are special to the shell, we have to put a(nother) backslash in front of every one of those characters in both expressions. That turns our old value into \', and our new one into \'\\\'\'.
The sed version is a little simpler, since apostrophes aren't special. Backslashes still are, so we have to put a \\ in the string to get a \ back. Since we want apostrophes in the string, it's easier to use a double-quoted string instead of a single-quoted one, but that means we need to double all the backslashes again to make sure the shell passes them on to sed unmolested. That's why the shell command has \\\\: that gets handed to sed as \\, which it outputs as \.
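A quick way to sanity-check the transformation (a hedged demo, not part of the original answer; the file name here is made up): feed a name containing a space and an apostrophe through the sed version and print what rsync would receive as its remote argument.
LINE="2. Fundamentos/it's a test.mp4"   # hypothetical name with a space and an apostrophe
printf '%s\n' "'$(sed "s/'/'\\\\''/g" <<<"$LINE")'"
# prints: '2. Fundamentos/it'\''s a test.mp4'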

Related

Iterate and change content of a file

I have written a script which takes first and second parameter strings; the other parameters are files. The idea of the script is to replace the first parameter with the second in every line of every file. Here is my implementation; however, it does not change the content of the files, but it prints the correct information.
first=$1
second=$2
shift 2
for i in $*; do
if [ -f $i ]; then
sed -i -e 's/$first/$second/g' $i
fi
done
You used a single quote to enclose the sed command. Thus, the special meaning of the dollar sign (parameter expansion) is ignored and it is treated as a simple character.
Check out bash manual:
Enclosing characters in single quotes preserves the literal value of each character within the quotes.
... Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !.
You should replace them with double quotes:
sed -i -e "s/$first/$second/g" $i
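Putting it together, here is a hedged sketch of the corrected script, which additionally quotes $i and uses "$@" instead of $* so that file names with spaces survive word splitting:
first=$1
second=$2
shift 2
for i in "$@"; do
    if [ -f "$i" ]; then
        # double quotes let $first and $second expand; quoting "$i" keeps spaces intact
        sed -i -e "s/$first/$second/g" "$i"
    fi
done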
Your script doesn't change the files because you are simply printing to stdout, not to the file. The way you did it, you would need a new variable to store the new content word by word and then echo it to the original file with redirection (>).
But you can do this simply with sed, like this:
sed -i '' 's/original/new/g' file(s)
Explanation:
sed is a stream editor
-i '' means it will edit the current file and won't create any backup
s/original/new/g means substitute the original word or regexp with the new word or regexp. The g means global: substitute all occurrences, not just the first one on each line
file(s) are all the files in which to perform the substitution. Can be * for all files in the working directory.
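Note that the two-argument -i '' form above is BSD/macOS sed syntax; GNU sed (the default on Ubuntu) takes a bare -i, or a backup suffix attached to it. A hedged comparison with a throwaway file name:
sed -i 's/original/new/g' notes.txt       # GNU sed: in-place, no backup
sed -i.bak 's/original/new/g' notes.txt   # GNU sed: keep a notes.txt.bak backup
sed -i '' 's/original/new/g' notes.txt    # BSD/macOS sed: in-place, no backup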

Quoting a bash script

I am trying to run the following command
echo `grep -o "<\/div><div class\=\".*" $1` |
grep -o "title=\\"\(.*\?\)\\" aria-describedby" -> title.txt
from script test.sh.
However, every time I check my file title.txt, it is empty.
I tested the first part of the command,
echo `grep -o "<\/div><div class\=\".*" $1`
and it works fine.
The second part is the one with the problem:
grep -o "title=\\\"\(.*\?\)\\\" aria-describedby" -> title.txt
Just to keep in mind, this is not being run from the terminal itself, but from a bash script file being called from the terminal.
I believe my problem lies in how I am quoting or escaping the quotes.
I do not know if your expressions do what you want them to do, but there is an issue with this one:
"title=\\"\(.*\?\)\\"
When the shell sees two consecutive backslashes (basically an escaped backslash), it will read them as one literal backslash. The first twin backslashes in your expression are read like this, and the double quote that follows ends the string. In other words, the following is a string :
"title=\\"
And the rest of the line :
\(.*\?\)\\"
ends with a double quote (not escaped once again due to the twin backslashes that become one literal backslash), but has no initial double quote.
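A quick way to see what the shell actually hands to grep (a hedged demo, not part of the original answer) is to print the word instead of executing it:
printf '%s\n' "title=\\"\(.*\?\)\\" aria-describedby"
# prints (with default glob settings): title=\(.*?)\ aria-describedby
# i.e. one single word, with the quotes consumed in places you did not intend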

How to make bash script take file names with spaces?

I have a bash script like this:
myfiles=("file\ with\ spaces.csv")
for file_name in "${myfiles[@]}"
do
echo "removing first line of file $file_name"
echo "first line is `head -1 $file_name`"
echo "\n"
done
but it does not recognize the spaces for some reason, even though I enclosed it in double quotes "":
head: cannot open ‘file\\’ for reading: No such file or directory
How do I fix this?
You need double quotes inside the backticks. The outer set isn't sufficient.
echo "first line is `head -1 "$file_name"`"
Also, do not put backslashes in the file name, since it's already quoted. Quotes or backslashes, but not both.
myfiles=("file with spaces.csv")
myfiles=(file\ with\ spaces.csv)
To expand on @JohnKugelman's answer:
Quoting takes a bit of getting used to in Bash. As a simple rule use single quotes for static strings with no special characters, double quotes for strings with variables, and $'' quoting for strings with special characters.
There's a separate quoting context inside every command substitution.
$() is a clearer way to write a command substitution, because it can be nested much more easily.
Consequently you'd typically write myfiles=('file with spaces.csv') and echo "first line is $(head -1 "$file_name")".
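Putting both suggestions together, here is a hedged rewrite of the loop from the question:
myfiles=('file with spaces.csv')
for file_name in "${myfiles[@]}"; do
    echo "removing first line of file $file_name"
    echo "first line is $(head -1 "$file_name")"
    echo    # a bare echo prints the blank line that the original echo "\n" was meant to produce
done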

How to escape & in scp

Yes, I do realize it has been asked a thousand times how to escape spaces in scp, but I fail to do that with the & sign, that is, when that sign is part of the directory name.
[sorunome@sorunome-desktop tmp]$ scp test.txt "bpi:/home/sorunome/test & stuff"
zsh:1: command not found: stuff
lost connection
The & sign seems to be messing things quite a bit up, using \& won't solve the issue as then the remote directory is not found:
[sorunome@sorunome-desktop tmp]$ scp test.txt "bpi:/home/sorunome/test \& stuff"
scp: ambiguous target
Not even omitting the quotes and adding \ all over the place makes this work:
[sorunome@sorunome-desktop tmp]$ scp test.txt bpi:/home/sorunome/test\ \&\ stuff
zsh:1: command not found: stuff
lost connection
So, any idea?
Escaping both the spaces and the ampersand did the trick for me:
scp source_file "user@host:/dir\ with\ spaces\ \&\ ampersand"
The quotes are still needed for some reason.
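An alternative in the same spirit as the rsync answer above is to let the remote shell see a single-quoted path (a hedged sketch, assuming the path contains no apostrophes):
scp test.txt "bpi:'/home/sorunome/test & stuff'"
The local shell strips the double quotes; the remote shell then strips the single quotes and treats the & as a literal character inside them.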
When using scp or cp, special characters can break the file path. You get around this by escaping the special character.
Using cp you can use the normal method of escaping special characters, which is preceding it with a backslash. For example, a path with a space could be copied using:
cp photo.jpg My\ Pictures/photo.jpg
Escaping the remote path in scp doesn't work with this method. You need to escape the special characters using a double backslash. Using the same example, the My Photos folder would have its space escaped using:
scp photo.jpg "user@remotehost:/home/user/My\\ Photos/photo.jpg"
The double quotes are also important: the whole path with the special characters must be enclosed in double quotes.
Source : https://dominichosler.wordpress.com/2011/08/27/using-scp-with-special-characters/
If you need to escape % use %%
Surround the file name in an additional pair of \" like this:
scp "test.txt" "bpi:/home/sorunome/\"test & stuff\""
Since nothing needs to change inside the file name, this can be directly applied to variables:
scp "$local" "bpi:/home/sorunome/\"$remote\""
The outer quotes (") are interpreted by the local shell. The inner quotes (\") are interpreted on the remote server. Thanks to @chepner for pointing out how the arguments are processed twice.
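To make the two parsing passes concrete, here is a hedged illustration using the same hypothetical path:
remote='test & stuff'
scp "test.txt" "bpi:/home/sorunome/\"$remote\""
# after the local shell:  bpi:/home/sorunome/"test & stuff"   (one argument handed to scp)
# after the remote shell: /home/sorunome/test & stuff         (one path, & no longer special)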
This is what worked for me:
function escape_file() {
local ESCAPED=$(echo "$1" | sed -E 's:([ ()[!&<>"$*,;=?#\^`{}|]|]):\\\1:g' | sed -E "s/([':])/\\\\\1/g")
echo "$ESCAPED"
}
REMOTE_FILE="/tmp/Filename with & symbol's! (xxx) [1.8, _aaa].gz"
scp "server:$(escape_file "$REMOTE_FILE")" /tmp/

Allowing punctuation characters in directory and file names in bash

What techniques or principles should I use in a bash script to handle directories and filenames that are allowed to contain as many as possible of
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
and space?
I guess / is not a valid filename or directory name character in most linux/unix systems?
So far I have had problems with !, ;, |, (a space character) and ' in filenames.
You are right: / is not valid, and neither is the null byte \0. There is no way around that limitation (besides file system hacking).
All other characters can be used in file names, including such surprising characters as a newline \n or a tab \t. There are many ways to enter them so that the shell does not understand them as special characters. I will give just a pragmatic approach.
You can enter most of the printable characters by using the singlequote ' to quote them:
date > 'foo!bar["#$%&()*+,-.:;<=>?@[\]^_`{|}~'
Of course, you cannot enter a singlequote this way, but for this you can use the doublequote ":
date > "foo'bar"
If you need to have both, you can end one quotation and start another:
date > "foo'bar"'"bloh'
Alternatively you also can use the backslash \ to escape the special character directly:
date > foo\"bar
The backslash also works as an escape character within doublequotes; it does not work that way within singlequotes (there it is a simple character without special meaning).
If you need to enter non-printable characters like a newline, you can use the dollar-singlequote notation:
date > $'foo\nbar'
This is valid in bash, but not necessarily in all other shells. So take care!
Finally, it can make sense to use a variable to keep your strange name (in order not to have to spell it out directly):
strangeName=$(xxd -r <<< "00 41 42 43 ff 45 46")
date > "$strangeName"
This way you can keep the shell code readable.
BUT in general it is not a good idea to have such characters in file names because a lot of scripts cannot handle such files properly.
To write scripts fool-proof is not easy. The most basic rule is to quote variable usage in doublequotes:
for i in *
do
cat "$i" | wc -l
done
This will solve 99% of the issues you are likely to encounter.
If you are using find to find directory entries which can contain special characters, you should use -print0 to separate the output not by spaces but by null bytes. Other programs like xargs (with -0) can understand a list of null-byte-separated file names.
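For instance, a hedged sketch pairing find -print0 with null-byte-aware consumers:
find . -type f -print0 | xargs -0 wc -l     # xargs -0 consumes null-separated names
find . -type f -print0 |
while IFS= read -r -d '' f; do              # bash: read one null-terminated name at a time
    wc -l < "$f"
done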
If your file name can start with a dash -, it can often be mistaken for an option. Some programs allow giving the special option -- to state that all following arguments are not options. The more general approach is to use a name which does not start with a dash:
for i in *
do
cat ./"$i" | wc -l
done
This way, a file named -n will not run cat -n but cat ./-n which will not be understood as the option -n given to cat (which would mean "number lines").
Always quote your variable substitutions. I.e. not cp $source $target, but cp "$source" "$target". This way they won't be subject to word splitting and pathname expansion.
Specify "--" before positional arguments to file operation commands. I.e. not cp "$source" "$target", but cp -- "$source" "$target". This prevents interpreting file names starting with dash as options.
And yes, "/" is not a valid character for file/directory names.
