rsync: filenames with spaces, quotes and Japanese letters - bash

I'm writing a script for syncing completed torrents from a remote server and remembering already synced files/folders. But filenames with quotes or Japanese letters are causing trouble, so rsync currently has to run twice.
rsync -aP ovh:"$src/$(printf "%q" "$1")" "$dst/"
working:
brackets
whitespaces
quotes
not working:
Japanese letters
Every single Japanese letter gets converted to something like ?\#211, \#215島 or \#227?\. Since I'm using OS X, I also tried --iconv=utf-8-mac,utf-8 - without success.
rsync -aP ovh:"'$src/${1/\'/\\'}'" "$dst/"
working:
brackets
whitespaces
Japanese letters
not working:
quotes
For example, if $1 is test's file, the string sent to the server becomes "'/data/rtorrent/complete/test\'s file'"
Error message: zsh:1: unmatched '
It seems escaping within single-quoted text doesn't work. But if the outer single quotes are removed, rsync interprets a whitespace as a separator for another file.
I thought maybe it would help to convert every single character to unicode (like \u1337) and send this string to the server, but didn't find a way to do that. Just endless scripts for the other way around.
sed wasn't helpful either: way too much work to escape everything manually. This script should work reliably, and I don't want to keep checking whether I've covered every character that might need escaping.
Any idea how to merge the two commands into one?
Edit: my temporary solution was this:
sync() {
# 1. escape quotes / 2. escape kana
rsync -aP ovh:"$src/$(printf "%q" "$1")" "$dst/" >& /dev/null && success "$1" || \
(rsync -aP ovh:"'$src/${1/\'/\\'}'" "$dst/" >& /dev/null && success "$1" || unlock "$1")
}

Finally got it working. Now it can handle strings like test"!'試みる.ext. I'm using ssh and tar now, but it should also work with rsync.
sync() {
item=${1//\"/\\\"}
ssh -n -c arcfour $server tar -C \"$remotedir\" -c -- \"$item\" | tar x 2>/dev/null
}
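For reference, here is a minimal local sketch (no server needed, filename made up) of why the printf %q approach is sound: the remote shell re-parses the argument, so quoting once with %q round-trips cleanly. The mangled Japanese characters were likely an artifact of the old bash 3.2 shipped with OS X, whose %q handles multibyte characters poorly; with a modern bash the round-trip works:

```shell
#!/usr/bin/env bash
# %q-quote a hostile filename, then let a second shell (standing in for
# the remote side) parse it back.
name="test\"! '試みる.ext"
quoted=$(printf '%q' "$name")
# Simulate the remote shell re-parsing the argument rsync would send:
roundtrip=$(bash -c "printf '%s' $quoted")
[ "$roundtrip" = "$name" ] && echo "round-trip OK"
```

In the real script, the `bash -c` stand-in is played by the remote login shell that rsync or ssh invokes.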

Related

Downloading a list of files with WGET - rename files up to .jpg ie. get rid of extraneous text

My problem is pretty straightforward to understand.
I have images.txt, a list of URLs (one per line) pointing to .jpg files, formatted as follows:
https://region.URL.com/files/2/2f/dir/2533x1946_IMG.jpg?Tag=2&Policy=BLAH__&Signature=BLAH7-BLAH-BLAH__&Key-Pair-Id=BLAH
I'm able to successfully download with wget -i but they are formatted like 2533x1946_IMG.jpg?BLAH_BLAH_BLAH_BLAH when I need them named like this instead: 2533x1946_IMG.jpg
Note that I've already tried the popular solutions to no avail (see below), so I'm thinking more along the lines of a solution that would involve sed, grep and awk
wget --content-disposition -i images.txt
wget --trust-server-names -i images.txt
wget --metalink-over-http --trust-server-names --content-disposition -i images.txt
wget --trust-server-names --content-disposition -i images.txt
and more iterations like this based on those three flags....
I'd ideally like to do it with one command, but downloading the files as-is and then running a command that renames them to the 2533x1946_IMG.jpg format afterwards is acceptable too.
1) You can use rename to rename all the files in a one-liner:
rename -n 's/[?].*//' *_BLAH
rename uses the syntax 's/selectedString/replacement/'.
rename uses a regex to select part of each filename and rewrites it in a loop. Because the unwanted part is very specific, it's easy to select: you want to match the character ?, and since ? has a special meaning in regex, you put it in brackets [ ]. End result: [?].
The -n flag shows what would change without actually changing anything; remove -n and the changes will be applied.
.* selects everything after the ?, i.e. BLAH_BLAH_BLAH_BLAH.
// replaces the selection with nothing, i.e. deletes it.
*_BLAH selects all files ending with _BLAH. You could use * instead, but there may be other files or folders in the same place, so this is safer.
2) Alternatively, let find collect the files and strip the suffix with a parameter expansion:
find . \
-name '*[?]*' \
-exec bash -c $'for f; do mv -- "$f" "${f%%\'?\'*}"; done' _ {} +
Why *[?]*? That prevents the ? from being treated as a single-character wildcard, and instead ensures that it only matches itself.
Why $'...\'?\'...'? The $'...' ANSI-C-style string quoting form allows backslash escapes to be able to specify literal ' characters even inside a single-quoted string.
Why bash -c '...' _ {} +? Unlike approaches that substitute the filenames that were found into code to be executed, this keeps those names out-of-band from the code, preventing shell injection attacks via hostile filenames. The _ placeholder fills in $0, so subsequent arguments become $1 and onward; and the for loop iterates over them (for f; do is the same as for f in "$@"; do).
What does ${f%%'?'*} do? This parameter expansion expands $f with the longest possible string matching the glob-style/fnmatch pattern '?'* removed from the end.
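The expansion at the heart of the answer can be tried standalone; the URL below is a shortened stand-in for the question's:

```shell
# ${f%%'?'*} removes the longest suffix matching '?'* , i.e. everything
# from the first literal ? to the end of the string.
f='2533x1946_IMG.jpg?Tag=2&Policy=BLAH'
echo "${f%%'?'*}"   # → 2533x1946_IMG.jpg
```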

Shell: rsync parsing spaces incorrectly in file name/path

I'm trying to pull a list of files over ssh with rsync, but I can't get it to work with filenames that have spaces in them! One example file is this:
/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2. Fundamentos da linguagem/estruturas-de-controle-if-else-if-e-else-v1.mp4
and I'm trying to transfer it using this shell code.
cat $file_name | while read LINE
do
echo $LINE
rsync -aP "$user@$server:$LINE" $local_folder
done
and the error I'm getting is this:
receiving incremental file list
rsync: link_stat "/home/pi/Transmission_Downloads/FUNDAMENTOS_JAVA_E_ORIENTAÇÃO_A_OBJETOS/2." failed: No such file or directory (2)
rsync: link_stat "/home/pi/Fundamentos" failed: No such file or directory (2)
rsync: link_stat "/home/pi/da" failed: No such file or directory (2)
rsync: change_dir "/home/pi//linguagem" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]
I don't get why it prints the path correctly on screen but rsync parses the file name/path incorrectly. I know spaces need to be escaped with backslashes, but I don't know how to solve this. Sed (find/replace) didn't help either, and I also tried this code without success
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
rsync -aP "$user@$server:$line" $local_folder
done < $file_name
What should I do to fix this, and why this is happening?
I read the list of files from a .txt file (each file and path on one line), and I'm using ubuntu 14.04. Thanks!
rsync does space splitting by default.
You can disable this using the -s (or --protect-args) flag, or you can escape the spaces within the filename
The shell is correctly passing the filename to rsync, but rsync interprets spaces as separating multiple paths on the same server. So in addition to double-quoting the variable expansion to make sure rsync sees the string as a single argument, you also need to quote the spaces within the filename.
If your filenames don't have apostrophes in them, you can do that with single quotes inside the double quotes:
rsync -aP "$user@$server:'$LINE'" "$local_folder"
If your filenames might have apostrophes in them, then you need to quote those (whether or not the filenames also have spaces). You can use bash's built-in parameter substitution to do that (as long as you're on bash 4; older versions, such as the /bin/bash that ships on OS X, have issues with backslashes and apostrophes in such expressions). Here's what it looks like:
rsync -aP "$user@$server:'${LINE//\'/\'\\\'\'}'" "$local_folder"
Ugly, I know, but effective. Explanation follows after the other options.
If you're using an older bash or a different shell, you can use sed instead:
rsync -aP "$user@$server:'$(sed "s/'/'\\\\''/g" <<<"$LINE")'" "$local_folder"
... or if your shell also doesn't support <<< here-strings:
rsync -aP "$user@$server:'$(echo "$LINE" | sed "s/'/'\\\\''/g")'" "$local_folder"
Explanation: we want to replace all apostrophes with.. something that becomes a literal apostrophe in the middle of a single-quoted string. Since there's no way to escape anything inside single quotes, we have to first close the quotes, then add a literal apostrophe, and then re-open the quotes for the rest of the string. Effectively, that means we want to replace all occurrences of an apostrophe (') with the sequence (apostrophe, backslash, apostrophe, apostrophe): '\''. We can do that with either bash parameter expansion or sed.
In bash, ${varname/old/new} expands to the value of the variable $varname with the first occurrence of the string old replaced by the string new. Doubling the first slash ( ${varname//old/new} ) replaces all occurrences instead of just the first one. That's what we want here. But since both apostrophe and backslash are special to the shell, we have to put a(nother) backslash in front of every one of those characters in both expressions. That turns our old value into \', and our new one into \'\\\'\'.
The sed version is a little simpler, since apostrophes aren't special. Backslashes still are, so we have to put a \\ in the string to get a \ back. Since we want apostrophes in the string, it's easier to use a double-quoted string instead of a single-quoted one, but that means we need to double all the backslashes again to make sure the shell passes them on to sed unmolested. That's why the shell command has \\\\: that gets handed to sed as \\, which it outputs as \.
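A quick local sanity check of that substitution, with no server required: wrap the result in single quotes and let a second shell, standing in for the remote one, parse it back (assumes bash 4+, per the caveat above):

```shell
#!/usr/bin/env bash
LINE="test's file"
# Replace every ' with '\'' and wrap the whole thing in single quotes:
quoted="'${LINE//\'/\'\\\'\'}'"
echo "$quoted"
# A second shell (as on the remote end) recovers the original string:
bash -c "printf '%s\n' $quoted"   # → test's file
```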

How to escape & in scp

Yes, I do realize it has been asked a thousand of times how to escape spaces in scp, but I fail to do that with the &-sign, so if that sign is part of the directory name.
[sorunome@sorunome-desktop tmp]$ scp test.txt "bpi:/home/sorunome/test & stuff"
zsh:1: command not found: stuff
lost connection
The & sign seems to be messing things quite a bit up, using \& won't solve the issue as then the remote directory is not found:
[sorunome@sorunome-desktop tmp]$ scp test.txt "bpi:/home/sorunome/test \& stuff"
scp: ambiguous target
Not even by omitting the quotes and adding \ all over the place this is working:
[sorunome@sorunome-desktop tmp]$ scp test.txt bpi:/home/sorunome/test\ \&\ stuff
zsh:1: command not found: stuff
lost connection
So, any idea?
Escaping both the spaces and the ampersand did the trick for me:
scp source_file "user@host:/dir\ with\ spaces\ \&\ ampersand"
The quotes are still needed for some reason.
When using scp or cp, special characters can break the file path. You get around this by escaping the special character.
Using cp you can use the normal method of escaping special characters, which is preceding it with a backslash. For example, a path with a space could be copied using:
cp photo.jpg My\ Pictures/photo.jpg
The remote path in scp doesn’t work escaping using this method. You need to escape the special characters using a double backslash. Using the same example, the My Photos folder would have its space escaped using:
scp photo.jpg "user@remotehost:/home/user/My\\ Photos/photo.jpg"
The double quotes are also important, the whole path with the special characters must be enclosed with double quotes.
Source : https://dominichosler.wordpress.com/2011/08/27/using-scp-with-special-characters/
If you need to escape % use %%
Surround the file name in an additional pair of \" like this:
scp "test.txt" "bpi:/home/sorunome/\"test & stuff\""
Since nothing needs to change inside the file name, this can be directly applied to variables:
scp "$local" "bpi:/home/sorunome/\"$remote\""
The outer quotes (") are interpreted by the local shell. The inner quotes (\") are interpreted on the remote server. Thanks to @chepner for pointing out how the arguments are processed twice.
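To see why the double layer is needed, expand the argument locally: the outer quotes are consumed by the local shell, and the single argument handed to scp still carries the inner pair for the remote shell to interpret.

```shell
remote='test & stuff'
arg="bpi:/home/sorunome/\"$remote\""
# This is the one argument the local shell passes to scp; the remote
# shell then sees and interprets the surviving inner quotes:
echo "$arg"   # → bpi:/home/sorunome/"test & stuff"
```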
This is what worked for me:
function escape_file() {
local ESCAPED=$(echo "$1" | sed -E 's:([ ()[!&<>"$*,;=?#\^`{}|]|]):\\\1:g' | sed -E "s/([':])/\\\\\1/g")
echo "$ESCAPED"
}
REMOTE_FILE="/tmp/Filename with & symbol's! (xxx) [1.8, _aaa].gz"
scp "server:$(escape_file "$REMOTE_FILE")" /tmp/

Bash syntax error: unexpected end of file with for loop

#!/bin/bash
cp ./Source/* ./Working/ 2> /dev/null
echo "Done"
for filename in *.zip;do unzip “$filename”;
done
In the above script, I am trying to copy all the files from Source to Working and unzip the files in the Working folder, but I am getting an error: unexpected end of file
It looks like you have different kinds of double quotes in "$filename", make sure both are ASCII double quotes (decimal 34, hex 22). Try analyzing your script with
od -c scriptname
You have up to two problems here. First, your quotes are not standard. You probably copy-pasted from MS Word or something else that automatically converts quotes.
The second problem you may have is that your filenames may have spaces in them. This can cause all sorts of problems in scripts if you do not expect it. There are a few workarounds, but the easiest is probably to change the IFS:
OLDIFS=$IFS
IFS=$'\t\n'  # drop the space from IFS; $'...' keeps the trailing newline,
             # which $(echo -e '\t\n') would strip
... do stuff ...
IFS=$OLDIFS
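Note that for the glob loop in the question, changing IFS isn't actually necessary: pathname expansion doesn't word-split its results, so quoting "$filename" is enough. A self-contained sketch (files created in a temp directory just for the demo):

```shell
#!/usr/bin/env bash
dir=$(mktemp -d) && cd "$dir"
touch "a b.zip" "c.zip"            # one name contains a space
for filename in *.zip; do
  [ -e "$filename" ] || continue   # skip the literal *.zip when no matches
  echo "would unzip: $filename"    # name arrives intact, space and all
done
```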

Backticks returns filename with spaces and surrounding command fails

I am trying to do something like "copy the newest file in a directory." I have come up with the following simple command using backticks, which works fine for filenames without embedded white space:
cp -rp `ls -1d searchstring | head -1` destination
As you can see this should work fine when the returned file has no space within it. However, this will obviously not work when there is such a space.
I need either a way to handle the output of the backticks, or some alternate approach.
You can treat the result of the command substitution as a single word by adding double quotes around it:
cp -rp "`ls -t searchstring | head -n 1`" destination
The double quotes are not needed when assigning to a variable. a=`uptime` is equivalent to a="`uptime`".
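With $(...) command substitution (easier to quote and nest than backticks), the same idea can be tested end to end. The paths below are scratch files made up for the demo:

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
mkdir "$dir/src" "$dir/dest"
touch -t 202001010000 "$dir/src/old file"   # backdated mtime
touch "$dir/src/new file"                   # current mtime, so newest
# ls -t sorts newest first; quotes keep the space-containing name whole:
newest=$(cd "$dir/src" && ls -t | head -n 1)
cp -rp "$dir/src/$newest" "$dir/dest/"
ls "$dir/dest"   # → new file
```

Caveat: parsing ls output still breaks on filenames containing newlines; for full robustness a find-based approach is safer.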
