rsync --exclude issue when using bash variable [duplicate]

This question already has answers here:
Why do bash parameter expansions cause an rsync command to operate differently?
(2 answers)
Closed 4 years ago.
I need to copy a source directory into a destination directory with rsync in bash, excluding files with a specific extension (.qcow2). It works when I type the command manually, but fails when the arguments come from a bash variable.
I set a bash variable with the following content:
# echo $line
/mnt/source --exclude='*.qcow2'
Although there is the exclude parameter, rsync is copying the ".qcow2" file:
# rsync -av $line destination/
sending incremental file list
source/Atlas/
source/Atlas/atlas.sh
source/Atlas/atlas.qcow2
sent 2143238309 bytes received 56 bytes 115850722.43 bytes/sec
total size is 2143164594 speedup is 1.00
While rsync is running, I can see the processes below:
# ps -ef | grep rsync
root 39058 11032 62 14:56 pts/22 00:00:01 rsync -av /mnt/source --exclude='*.qcow2' destination/
root 39059 39058 0 14:56 pts/22 00:00:00 rsync -av /mnt/source --exclude='*.qcow2' destination/
root 39060 39059 71 14:56 pts/22 00:00:02 rsync -av /mnt/source --exclude='*.qcow2' destination/
root 39066 14866 0 14:56 pts/24 00:00:00 grep rsync
".qcow2" file is copied above, this is what I want to avoid.
When I run the same command without the variable, exactly as it appears in the ps output (after removing the files from the destination directory), it works properly and the ".qcow2" file is not transferred:
# rm -f destination/Atlas/*
# rsync -av /mnt/source --exclude='*.qcow2' destination/
sending incremental file list
source/Atlas/
source/Atlas/atlas.sh
sent 14956 bytes received 37 bytes 29986.00 bytes/sec
total size is 202930 speedup is 13.53
How can I avoid transferring the ".qcow2" file when the arguments come from a bash variable?
Thanks in advance

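The quoting in your variable is off. Quotes stored inside a variable are not parsed again when the variable is expanded, so rsync receives the pattern '*.qcow2' with the quote characters included, and that matches no file. Drop the inner quotes:
$ line='/mnt/source --exclude=*.qcow2'
$ echo $line
/mnt/source --exclude=*.qcow2
$ rsync -av $line destination/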
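A more robust variant, sketched here as an assumption rather than as part of the original answer, is to keep the arguments in a bash array. Each element reaches rsync as exactly one argument, so there is no reliance on word splitting and the pattern cannot be glob-expanded by accident. The array name rsync_args is illustrative.

```shell
#!/bin/bash
# Each array element becomes one argument. The quotes below are consumed by
# the shell at assignment time and never reach rsync.
rsync_args=(/mnt/source --exclude='*.qcow2')

# Show exactly what rsync would receive, one argument per line.
printf '<%s>\n' "${rsync_args[@]}"

# Intended invocation (paths taken from the question):
# rsync -av "${rsync_args[@]}" destination/
```

The printf line is only there to make the argument boundaries visible before running the real command.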

Related

tar & split remote files saving output locally remove "tar: Removing leading `/' from member names" message from output

This is a 2 part question.
I've made a bash script that logs into a remote server, builds a file list, and saves it locally as list.txt.
#!/bin/bash
sshpass -p "xxxx" ssh user@pass ls /path/to/files/ | grep "^.*iso" > list.txt
It then starts a for loop using the list.txt
for f in $(cat list.txt); do
The next command splits the target file and saves it locally
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf - /path/to/files/$f | split -b 10M - "$f.tar.bz2.part"
Question 1
I need help understanding the above command. Why does it save the *part files locally? That is what I intend, but I would like to understand it better. How would I do it the other way round: tar and split the files, saving the output to a remote directory (the reverse of the command above, using the same tools; sshpass is a requirement)?
Question 2
When running the above command even though I have made it not verbose it still prints this message
tar: Removing leading `/' from member names
How do I get rid of it? I have my own echo output as part of the script. I have tried the following after searching online, but I think piping a few commands together confuses tar and breaks the operation.
I have tried these with no luck
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czfP - /path/to/files/$f | split -b 10M - "$f.tar.bz2.part
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf -C /path/to/files/$f | split -b 10M - "$f.tar.bz2.part
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf - /path/to/files/$f | split -b 10M - "$f.tar.bz2.part > /dev/null 2>&1
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf - /path/to/files/$f > /dev/null 2>&1 | split -b 10M - "$f.tar.bz2.part
All of the above break the operation, and I would like it to not display any messages at all. I suspect it has something to do with how the pipe passes arguments through. Any input is appreciated.
Anyway, this is just part of the script. The other part uploads the processed file after tarring and splitting it, but I've had to break it up into a few commands: a 'tar | split' locally, then an upload via rclone. It would be far more efficient if I could pipe the output of split and save it remotely via ssh.

First and foremost, you must consider the security vulnerabilities when using sshpass.
About question 1:
Using tar with the -f - option creates the archive on the fly and writes it to stdout.
The | separates the two commands.
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf - /path/to/files/$f - Runs remotely
split -b 10M - "$f.tar.bz2.part" - Runs in local shell
The second command reads the output of the first (the tar stream) on its stdin and creates the part files locally.
If you want to perform all the operations on the remote machine, enclose the whole pipeline in quotes so that it is passed to ssh as a single command. Use double quotes so that $f is still expanded by the local shell, where it is set (read other sources about quoting).
sshpass -p "xxxx" ssh user@pass "tar --no-same-owner -czf - /path/to/files/$f | split -b 10M - '$f.tar.bz2.part'"
About question 2:
tar: Removing leading '/' from member names is a warning that tar writes to STDERR, and in a terminal STDERR goes to the user's screen by default.
So you can suppress tar's messages (real errors included) by adding 2>/dev/null, written without a space between the 2 and the >:
sshpass -p "xxxx" ssh user@pass tar --no-same-owner -czf - /path/to/files/$f 2>/dev/null | split -b 10M - "$f.tar.bz2.part"
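An alternative to discarding stderr, sketched below under the assumption that GNU tar is in use, is to avoid the warning in the first place: -C makes tar change into the directory, so the member name is relative and there is no leading / to strip. The temporary directories and the file demo.iso are hypothetical stand-ins so the pipeline can be tried locally without ssh.

```shell
#!/bin/bash
# Local stand-ins for the remote source directory and the local output directory.
src=$(mktemp -d); out=$(mktemp -d)
printf 'demo data\n' > "$src/demo.iso"
f=demo.iso

# -C "$src" switches directory before archiving, so "$f" is stored as a
# relative name. Anything tar writes to stderr is captured in tar.err.
tar -C "$src" -czf - "$f" 2>"$out/tar.err" | split -b 10M - "$out/$f.tar.gz.part"

ls "$out"
```

Over ssh the left-hand side would become something like ssh user@pass tar -C /path/to/files -czf - "$f", with the split still running locally.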

Cannot get rsync to accept `mkdir -p` (--parents) in rsync-path argument?

I've seen in posts like rsync - create all missing parent directories? :
rsync -aq --rsync-path='mkdir -p /tmp/imaginary/ && rsync' file user@remote:/tmp/imaginary/
I thought - great, let me try that:
$ rsync -aP --remove-source-files --rsync-path="mkdir -p /home/pi/ARCHIVE/2020/01/24 && rsync" a1.test a1.json a1.pdf /home/pi/ARCHIVE/2020/01/24/
sending incremental file list
rsync: mkdir "/home/pi/ARCHIVE/2020/01/24" failed: No such file or directory (2)
rsync error: error in file IO (code 11) at main.c(675) [Receiver=3.1.2]
Well, sure, mkdir "/home/pi/ARCHIVE/2020/01/24" failed, but I did NOT issue mkdir, I issued mkdir -p!
So why did rsync ignore this? Is there any other setting I should set for it? Or can rsync-path be used STRICTLY for ssh connections (which is not the case here)?

--rsync-path only applies to remote machines:
--rsync-path=PROGRAM specify the rsync to run on remote machine
This is because you don't need to invoke a second instance of rsync when you are doing a local copy. The copy will simply be done by the process you ran.
Since it's technically you who are invoking rsync on the target machine, it's you who should be adding mkdir .. && in front of rsync:
mkdir -p /home/pi/ARCHIVE/2020/01/24 &&
rsync -aP --remove-source-files a1.test a1.json a1.pdf /home/pi/ARCHIVE/2020/01/24/

FTP not working UNIX

Hi, I have a script that uses sudo to switch user, changes to a particular directory, and renames files there as required. After getting the required file name I want to FTP the files to a Windows machine, but when the script reaches the FTP commands it says:
-bash: line 19: quote: command not found
-bash: line 20: quote: command not found
-bash: line 21: put: command not found
-bash: line 22: quit: command not found
FTP works if I run it manually, so it must be some other problem. The script is below:
#!/usr/bin/
path=/global/u70/glob
echo password | sudo -S -l
sudo /usr/bin/su - glob << 'EOF'
#ls -lrt
cd "$path"
pwd
for entry in $(ls -r)
do
if [ "$entry" = "ADM" ];then
cd "$entry"
FileName=$(ls -t | head -n1)
echo "$FileName"
FileNameIniKey=$(ls -t | head -n1 | cut -c 12-20)
echo "$FileNameIniKey"
echo "$xmlFileName" >> "$xmlFileNameIniKey.ini"
chmod 755 "$FileName"
chmod 755 "$FileNameIniKey.ini"
ftp -n hostname
quote USER ftp
quote PASS
put "$FileName"
quit
rm "$FileNameIniKey.ini"
fi
done
EOF

You can improve your questions and make them easier to answer and more useful for future readers by including a minimal, self-contained example. Here's an example:
#!/bin/bash
ftp -n mirrors.rit.edu
quote user anonymous
quote pass mypass
ls
When executed, you get a manual FTP session instead of a file listing:
$ ./myscript
Trying 2620:8d:8000:15:225:90ff:fefd:344c...
Connected to smoke.rc.rit.edu.
220 Welcome to mirrors.rit.edu.
ftp>
The problem is that you're assuming that a script is a series of strings that are automatically typed into a terminal. This is not true. It's a series of commands that are executed one after another.
Nothing happens with quote user anonymous until AFTER ftp has exited, and then it's run as a shell command instead of being written to the ftp command.
Instead, specify login credentials on the command line and then include commands in a here document:
ftp -n "ftp://anonymous:passwd@mirrors.rit.edu" << end
ls
end
This works as expected:
$ ./myscript
Trying 2620:8d:8000:15:225:90ff:fefd:344c...
Connected to smoke.rc.rit.edu.
220 Welcome to mirrors.rit.edu.
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
229 Entering Extended Passive Mode (|||19986|).
150 Here comes the directory listing.
drwxrwxr-x 12 3002 1000 4096 Jul 11 20:00 CPAN
drwxrwsr-x 10 0 1001 4096 Jul 11 21:08 CRAN
drwxr-xr-x 18 1003 1000 4096 Jul 11 18:02 CTAN
drwxrwxr-x 5 89987 546 4096 Jul 10 10:00 FreeBSD
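The same mechanism can be tried without an FTP server. In this minimal sketch, cat -n is a stand-in for ftp (an assumption made purely for illustration): like ftp, it reads from standard input, and the here document is what supplies that input.

```shell
#!/bin/bash
# cat -n stands in for ftp: it numbers whatever arrives on stdin.
# The lines between "<< end" and "end" are fed to the command, not run by the shell.
output=$(cat -n << end
quote user anonymous
ls
end
)
printf '%s\n' "$output"
```

Both lines show up numbered in the output, which shows they were delivered to the command's input rather than executed as shell commands.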

ftp -n "ftp://anonymous:passwd@mirrors.rit.edu" << end
Name or service not known

Speed up rsync with Simultaneous/Concurrent File Transfers?

We need to transfer 15TB of data from one server to another as fast as we can. We're currently using rsync but we're only getting speeds of around 150Mb/s, when our network is capable of 900+Mb/s (tested with iperf). I've done tests of the disks, network, etc and figured it's just that rsync is only transferring one file at a time which is causing the slowdown.
I found a script to run a different rsync for each folder in a directory tree (allowing you to limit to x number), but I can't get it working, it still just runs one rsync at a time.
I found the script here (copied below).
Our directory tree is like this:
/main
  - /files
    - /1
      - 343
        - 123.wav
        - 76.wav
      - 772
        - 122.wav
      - 55
        - 555.wav
        - 324.wav
        - 1209.wav
      - 43
        - 999.wav
        - 111.wav
        - 222.wav
    - /2
      - 346
        - 9993.wav
      - 4242
        - 827.wav
    - /3
      - 2545
        - 76.wav
        - 199.wav
        - 183.wav
      - 23
        - 33.wav
        - 876.wav
      - 4256
        - 998.wav
        - 1665.wav
        - 332.wav
        - 112.wav
        - 5584.wav
So what I'd like to happen is to create an rsync for each of the directories in /main/files, up to a maximum of, say, 5 at a time. So in this case, 3 rsyncs would run, for /main/files/1, /main/files/2 and /main/files/3.
I tried with it like this, but it just runs 1 rsync at a time for the /main/files/2 folder:
#!/bin/bash
# Define source, target, maxdepth and cd to source
source="/main/files"
target="/main/filesTest"
depth=1
cd "${source}"
# Set the maximum number of concurrent rsync threads
maxthreads=5
# How long to wait before checking the number of rsync threads again
sleeptime=5
# Find all folders in the source directory within the maxdepth level
find . -maxdepth ${depth} -type d | while read dir
do
    # Make sure to ignore the parent folder
    if [ `echo "${dir}" | awk -F'/' '{print NF}'` -gt ${depth} ]
    then
        # Strip leading dot slash
        subfolder=$(echo "${dir}" | sed 's#^\./##g')
        if [ ! -d "${target}/${subfolder}" ]
        then
            # Create destination folder and set ownership and permissions to match source
            mkdir -p "${target}/${subfolder}"
            chown --reference="${source}/${subfolder}" "${target}/${subfolder}"
            chmod --reference="${source}/${subfolder}" "${target}/${subfolder}"
        fi
        # Make sure the number of rsync threads running is below the threshold
        while [ `ps -ef | grep -c [r]sync` -gt ${maxthreads} ]
        do
            echo "Sleeping ${sleeptime} seconds"
            sleep ${sleeptime}
        done
        # Run rsync in background for the current subfolder and move on to the next one
        nohup rsync -a "${source}/${subfolder}/" "${target}/${subfolder}/" </dev/null >/dev/null 2>&1 &
    fi
done
# Find all files above the maxdepth level and rsync them as well
find . -maxdepth ${depth} -type f -print0 | rsync -a --files-from=- --from0 ./ "${target}/"

Updated answer (Jan 2020)
xargs is now the recommended tool to achieve parallel execution. It's pre-installed almost everywhere. For running multiple rsync tasks the command would be:
ls /srv/mail | xargs -n1 -P4 -I% rsync -Pa % myserver.com:/srv/mail/
This will list all folders in /srv/mail and pipe them to xargs, which will read them one by one and run up to 4 rsync processes at a time. The % character is replaced by the input argument in each command call. Because ls prints names relative to /srv/mail, run the command from that directory.
Original answer using parallel:
ls /srv/mail | parallel -v -j8 rsync -raz --progress {} myserver.com:/srv/mail/{}
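Parsing ls output breaks on names that contain spaces or newlines. A hedged variant of the xargs approach uses find -print0 with xargs -0. In the sketch below, echo stands in for the real rsync call so it can be run anywhere, and the demo tree under a temporary directory is hypothetical.

```shell
#!/bin/bash
# Hypothetical demo tree standing in for /srv/mail.
root=$(mktemp -d)
mkdir -p "$root/alice" "$root/bob smith" "$root/carol"

# -print0 / -0 keep names with spaces intact; -P4 runs up to four jobs at once.
# "echo rsync" only prints each command; remove the echo to run rsync for real.
listing=$(cd "$root" && find . -mindepth 1 -maxdepth 1 -type d -print0 |
  xargs -0 -P4 -I{} echo rsync -Pa {} myserver.com:/srv/mail/ | sort)
printf '%s\n' "$listing"
```

The sort is only there to make the printed order stable, since parallel jobs finish in no particular order.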

Have you tried using rclone.org?
With rclone you could do something like
rclone copy "${source}/${subfolder}/" "${target}/${subfolder}/" --progress --multi-thread-streams=N
where --multi-thread-streams=N represents the number of threads you wish to spawn.

rsync transfers files as fast as it can over the network. For example, try using it to copy one large file that doesn't exist at all on the destination. That speed is the maximum speed rsync can transfer data. Compare it with the speed of scp (for example). rsync is even slower at raw transfer when the destination file exists, because both sides have to have a two-way chat about what parts of the file are changed, but pays for itself by identifying data that doesn't need to be transferred.
A simpler way to run rsync in parallel would be to use parallel. The command below would run up to 5 rsyncs in parallel, each one copying one directory. Be aware that the bottleneck might not be your network, but the speed of your CPUs and disks, and running things in parallel just makes them all slower, not faster.
run_rsync() {
# e.g. copies /main/files/blah to /main/filesTest/blah
rsync -av "$1" "/main/filesTest/${1#/main/files/}"
}
export -f run_rsync
parallel -j5 run_rsync ::: /main/files/*

You can use xargs which supports running many processes at a time. For your case it will be:
ls -1 /main/files | xargs -I {} -P 5 -n 1 rsync -avh /main/files/{} /main/filesTest/

There are a number of alternative tools and approaches for doing this listed around the web. For example:
The NCSA Blog has a description of using xargs and find to parallelize rsync without having to install any new software for most *nix systems.
And parsync provides a feature rich Perl wrapper for parallel rsync.

I've developed a python package called: parallel_sync
https://pythonhosted.org/parallel_sync/pages/examples.html
Here is sample code showing how to use it:
from parallel_sync import rsync
creds = {'user': 'myusername', 'key':'~/.ssh/id_rsa', 'host':'192.168.16.31'}
rsync.upload('/tmp/local_dir', '/tmp/remote_dir', creds=creds)
Parallelism is 10 by default; you can increase it:
from parallel_sync import rsync
creds = {'user': 'myusername', 'key':'~/.ssh/id_rsa', 'host':'192.168.16.31'}
rsync.upload('/tmp/local_dir', '/tmp/remote_dir', creds=creds, parallelism=20)
Note, however, that sshd typically has MaxSessions set to 10 by default, so to go beyond 10 you'll have to modify your ssh settings.

The simplest I've found is using background jobs in the shell:
for d in /main/files/*; do
rsync -a "$d" remote:/main/files/ &
done
Beware that this doesn't limit the number of jobs! If you're network-bound this is not really a problem, but if you're waiting for spinning rust it will thrash the disk.
You could add
while [ $(jobs | wc -l | xargs) -gt 10 ]; do sleep 1; done
inside the loop for a primitive form of job control.
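Put together, the loop with the throttle might look like the sketch below. It is an illustration under assumptions: sleep stands in for the rsync call so the script can be run anywhere, the directory names and the max_jobs value are hypothetical, and jobs -rp is used so that only running jobs are counted.

```shell
#!/bin/bash
max_jobs=3      # assumed limit; tune it for your disks and network
started=0

for d in dir1 dir2 dir3 dir4 dir5 dir6; do   # hypothetical names; /main/files/* in practice
  # Wait while the number of running background jobs is at the limit.
  while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
    sleep 0.1
  done
  sleep 0.3 &   # stand-in for: rsync -a "$d" remote:/main/files/ &
  started=$((started + 1))
done

wait            # let the remaining background jobs finish
echo "started $started jobs"
```

The final wait matters: without it the script can exit while transfers are still running.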

3 tricks for speeding up rsync on local net.
1. Copying from/to local network: don't use ssh!
If you're copying from one server to another on a local network, there is no need to encrypt data during transfer!
By default, rsync uses ssh to transfer data over the network. To avoid this, you have to run an rsync server on the target host. You could run the daemon temporarily with something like:
rsync --daemon --no-detach --config filename.conf
where a minimal configuration file could look like this (see man rsyncd.conf):
filename.conf
port = 12345
[data]
path = /some/path
use chroot = false
Then
rsync -ax rsync://remotehost:12345/data/. /path/to/target/.
rsync -ax /path/to/source/. rsync://remotehost:12345/data/.
2. Using zstandard zstd for high speed compression
Zstandard can be up to 8x faster than the common gzip, so using this newer compression algorithm can significantly improve your transfer. The --zc (--compress-choice) option requires rsync 3.2.0 or later on both ends.
rsync -axz --zc=zstd rsync://remotehost:12345/data/. /path/to/target/.
rsync -axz --zc=zstd /path/to/source/. rsync://remotehost:12345/data/.
3. Multiplexing rsync to reduce inactivity due to browse time
This kind of optimisation is about disk access and filesystem structure. It has nothing to do with the number of CPUs, so it can improve the transfer even if your host has a single-core CPU.
As the goal is to keep the bandwidth busy with data while other tasks browse the filesystem, the best number of simultaneous processes depends on the number of small files present.
Here is a sample bash script using wait -n -p PID (the -p option needs bash 5.1 or later):
#!/bin/bash
maxProc=3
source=''
destination='rsync://remotehost:12345/data/'
declare -ai start elap results order
wait4oneTask() {
    local _i
    wait -np epid
    results[epid]=$?
    elap[epid]=" ${EPOCHREALTIME/.} - ${start[epid]} "
    unset "running[$epid]"
    while [ -v elap[${order[0]}] ];do
        _i=${order[0]}
        printf " - %(%a %d %T)T.%06.0f %-36s %4d %12d\n" "${start[_i]:0:-6}" \
               "${start[_i]: -6}" "${paths[_i]}" "${results[_i]}" "${elap[_i]}"
        order=(${order[@]:1})
    done
}
printf " %-22s %-36s %4s %12s\n" Started Path Rslt 'microseconds'
for path; do
    rsync -axz --zc zstd "$source$path/." "$destination$path/." &
    lpid=$!
    paths[lpid]="$path"
    start[lpid]=${EPOCHREALTIME/.}
    running[lpid]=''
    order+=($lpid)
    ((${#running[@]}>=maxProc)) && wait4oneTask
done
while ((${#running[@]})); do
    wait4oneTask
done
Output could look like:
myRsyncP.sh files/*/*
Started Path Rslt microseconds
- Fri 03 09:20:44.673637 files/1/343 0 1186903
- Fri 03 09:20:44.673914 files/1/43 0 2276767
- Fri 03 09:20:44.674147 files/1/55 0 2172830
- Fri 03 09:20:45.861041 files/1/772 0 1279463
- Fri 03 09:20:46.847241 files/2/346 0 2363101
- Fri 03 09:20:46.951192 files/2/4242 0 2180573
- Fri 03 09:20:47.140953 files/3/23 0 1789049
- Fri 03 09:20:48.930306 files/3/2545 0 3259273
- Fri 03 09:20:49.132076 files/3/4256 0 2263019
Quick check:
printf "%'d\n" $(( 49132076 + 2263019 - 44673637)) \
$((1186903+2276767+2172830+1279463+2363101+2180573+1789049+3259273+2263019))
6’721’458
18’770’978
So 6.72 seconds elapsed to process 18.77 seconds of work with up to three subprocesses.
Note: you could use musec2str to improve the output, by replacing the first long printf line with:
musec2str -v elapsed "${elap[_i]}"
printf " - %(%a %d %T)T.%06.0f %-36s %4d %12s\n" "${start[_i]:0:-6}" \
       "${start[_i]: -6}" "${paths[_i]}" "${results[_i]}" "$elapsed"
myRsyncP.sh files/*/*
Started Path Rslt Elapsed
- Fri 03 09:27:33.463009 files/1/343 0 18.249400"
- Fri 03 09:27:33.463264 files/1/43 0 18.153972"
- Fri 03 09:27:33.463502 files/1/55 93 10.104106"
- Fri 03 09:27:43.567882 files/1/772 122 14.748798"
- Fri 03 09:27:51.617515 files/2/346 0 19.286811"
- Fri 03 09:27:51.715848 files/2/4242 0 3.292849"
- Fri 03 09:27:55.008983 files/3/23 0 5.325229"
- Fri 03 09:27:58.317356 files/3/2545 0 10.141078"
- Fri 03 09:28:00.334848 files/3/4256 0 15.306145"
Going further: you could add an overall stat line with a few edits to this script:
#!/bin/bash
maxProc=3 source='' destination='rsync://remotehost:12345/data/'
. musec2str.bash # See https://stackoverflow.com/a/72316403/1765658
declare -ai start elap results order
declare -i sumElap totElap
wait4oneTask() {
    wait -np epid
    results[epid]=$?
    local -i _i crtelap=" ${EPOCHREALTIME/.} - ${start[epid]} "
    elap[epid]=crtelap sumElap+=crtelap
    unset "running[$epid]"
    while [ -v elap[${order[0]}] ];do # Print status lines in command order.
        _i=${order[0]}
        musec2str -v helap ${elap[_i]}
        printf " - %(%a %d %T)T.%06.f %-36s %4d %12s\n" "${start[_i]:0:-6}" \
               "${start[_i]: -6}" "${paths[_i]}" "${results[_i]}" "${helap}"
        order=(${order[@]:1})
    done
}
printf " %-22s %-36s %4s %12s\n" Started Path Rslt 'microseconds'
for path;do
    rsync -axz --zc zstd "$source$path/." "$destination$path/." &
    lpid=$! paths[lpid]="$path" start[lpid]=${EPOCHREALTIME/.}
    running[lpid]='' order+=($lpid)
    ((${#running[@]}>=maxProc)) &&
        wait4oneTask
done
while ((${#running[@]})) ;do
    wait4oneTask
done
totElap=${EPOCHREALTIME/.}
for i in ${!start[@]};do sortstart[${start[i]}]=$i;done
sortstartstr=${!sortstart[*]}
fstarted=${sortstartstr%% *}
totElap+=-fstarted
musec2str -v hTotElap $totElap
musec2str -v hSumElap $sumElap
printf " = %(%a %d %T)T.%06.0f %-41s %12s\n" "${fstarted:0:-6}" \
       "${fstarted: -6}" "Real: $hTotElap, Total:" "$hSumElap"
Could produce:
$ ./parallelRsync Data\ dirs-{1..4}/Sub\ dir{A..D}
Started Path Rslt microseconds
- Sat 10 16:57:46.188195 Data dirs-1/Sub dirA 0 1.69131"
- Sat 10 16:57:46.188337 Data dirs-1/Sub dirB 116 2.256086"
- Sat 10 16:57:46.188473 Data dirs-1/Sub dirC 0 1.1722"
- Sat 10 16:57:47.361047 Data dirs-1/Sub dirD 0 2.222638"
- Sat 10 16:57:47.880674 Data dirs-2/Sub dirA 0 2.193557"
- Sat 10 16:57:48.446484 Data dirs-2/Sub dirB 0 1.615003"
- Sat 10 16:57:49.584670 Data dirs-2/Sub dirC 0 2.201602"
- Sat 10 16:57:50.061832 Data dirs-2/Sub dirD 0 2.176913"
- Sat 10 16:57:50.075178 Data dirs-3/Sub dirA 0 1.952396"
- Sat 10 16:57:51.786967 Data dirs-3/Sub dirB 0 1.123764"
- Sat 10 16:57:52.028138 Data dirs-3/Sub dirC 0 2.531878"
- Sat 10 16:57:52.239866 Data dirs-3/Sub dirD 0 2.297417"
- Sat 10 16:57:52.911924 Data dirs-4/Sub dirA 14 1.290787"
- Sat 10 16:57:54.203172 Data dirs-4/Sub dirB 0 2.236149"
- Sat 10 16:57:54.537597 Data dirs-4/Sub dirC 14 2.125793"
- Sat 10 16:57:54.561454 Data dirs-4/Sub dirD 0 2.49632"
= Sat 10 16:57:46.188195 Real: 10.870221", Total: 31.583813"
Fake rsync for testing this script
Note: For testing this, I've used a fake rsync:
## Fake rsync wait 1.0 - 2.99 seconds and return 0-255 ~ 1x/10
rsync() { sleep $((RANDOM%2+1)).$RANDOM;exit $(( RANDOM%10==3?RANDOM%128:0));}
export -f rsync

The shortest version I found is to use the --cat option of parallel like below. This version avoids using xargs, only relying on features of parallel:
cat files.txt | \
parallel -n 500 --lb --pipe --cat rsync --files-from={} user@remote:/dir /dir -avPi
#### Arg explainer
# -n 500 :: split input into chunks of 500 entries
#
# --cat :: create a tmp file referenced by {} containing the 500
# entry content for each process
#
# user@remote:/dir :: the root relative to which entries in files.txt are considered
#
# /dir :: local root relative to which files are copied
Sample content from files.txt:
/dir/file-1
/dir/subdir/file-2
....
Note that this doesn't use -j 50 for job count, that didn't work on my end here. Instead I've used -n 500 for record count per job, calculated as a reasonable number given the total number of records.

I've found UDR/UDT to be an amazing tool. TL;DR: it's a UDT wrapper for rsync, utilizing multiple UDP connections rather than a single TCP connection.
References: https://udt.sourceforge.io/ & https://github.com/jaystevens/UDR#udr
If you use any RHEL distros, they've pre-compiled it for you... http://hgdownload.soe.ucsc.edu/admin/udr
The ONLY downside I've encountered is that you can't specify a different SSH port, so your remote server must use 22.
Anyway, after installing the rpm, it's literally as simple as:
udr rsync -aP user@IpOrFqdn:/source/files/* /dest/folder/
and your transfer speeds will increase drastically in most cases. Depending on the server, I've easily seen a 10x increase in transfer speed.
Side note: if you choose to gzip everything first, then make sure to use --rsyncable arg so that it only updates what has changed.

Using parallel rsync on a regular (spinning) disk would only cause the processes to compete for I/O, turning what should be a sequential read into an inefficient random read. You could instead tar the directory into a stream, pull that stream over ssh from the destination server, and pipe it into tar to extract.
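A minimal sketch of that tar stream, run locally so it can be tried without a second machine. The temporary directories are hypothetical stand-ins for the two servers, and the ssh form shown in the comment is an assumption about how the same pipe would be wrapped across hosts.

```shell
#!/bin/bash
# Hypothetical local directories standing in for the source and destination servers.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/files/1/343"
printf 'audio\n' > "$src/files/1/343/123.wav"

# One sequential stream: the first tar reads the tree once and writes it to
# stdout, the second unpacks from stdin. Across machines, run on the destination:
#   ssh user@source 'tar -C /main -cf - files' | tar -C /main -xf -
tar -C "$src" -cf - files | tar -C "$dst" -xf -

find "$dst" -type f
```

Unlike rsync, this copies everything every time, so it suits an initial bulk transfer more than repeated incremental syncs.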

Error building SCP command syntax in read loop

I'm trying to copy a list of files by SCP from one server to another, but the command does not seem to be built correctly in the read loop.
I have a file called diff_tapes.txt which contains a list of files to be copied as follows:
/VAULT14/TEST_V14/634001
/VAULT14/TEST_V14/634002
/VAULT14/TEST_V14/634003
/VAULT14/TEST_V14/634004
etc etc...
The bash command line I'm using is as follows:
while read line; do scp -p bill@lgrdcpvtsa:$line $line;done < /home/bill/diff_tapes.txt
When I execute that from the command line (I'm running on CentOS so basically Red Hat) I get:
/VAULT14/TEST_V14/634001: No such file or directory
... for every single file.
If I run again adding the -v switch to get more info, I see the following:
debug1: Sending command: scp -v -p -f /VAULT14/TEST_V14/634001
The remote server (lgrdcpvtsa) definitely has the files in question:
[bill@LGRDCPVTSA TEST_V14]$ pwd
/VAULT14/TEST_V14
[bill@LGRDCPVTSA TEST_V14]$ ls -ll
total 207200
-rw------- 1 bill bill 27263700 Apr 26 11:16 634001
-rw------- 1 bill bill 27263700 Apr 26 11:16 634002
-rw------- 1 bill bill 27263700 Apr 26 11:16 634003
-rw------- 1 bill bill 27263700 Apr 26 11:16 634004
It's as though the second time I have $line in the scp command, it's ignored.
Any idea what's wrong with the syntax?
EDIT:
For clarity, the list of files is more likely to be like this:
/VAULT14/634100_V14/634001
/VAULT11/601100_V11/601011
/VAULT12/510200_V12/510192
/VAULT10 through /VAULT14 exist on both servers; it's just the next folder level that might not.
These files have been flagged as differing between the local and the remote machine. The remote machine is the correct data source, hence copying from it, so a recursive copy won't work here (I think the -r switch was a hangover from an earlier test, so I've removed it from the code above).

The error is probably because the local directory /VAULT14/TEST_V14/ does not exist.
You can use the dirname command to get the directory name from the path, create the directory, and then execute the scp command. Example:
while read line; do mkdir -p "$(dirname "$line")"; scp -rp bill@lgrdcpvtsa:"$line" "$line";done < /home/bill/diff_tapes.txt
The -p option tells mkdir to create any missing parent directories and not to complain if the directory already exists.
EDIT:
This was copying all the files to /, so I have changed it to the following, which is working perfectly:
while read line; do mkdir -p "$(dirname "$line")"; scp -p bill@lgrdcpvtsa:"$line" "$line";done < /home/bill/diff_tapes.txt
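The dirname plus mkdir -p step can be tried in isolation. In this sketch cp stands in for scp, and the two temporary roots are hypothetical stand-ins for the remote and local machines; IFS= and read -r are added (an assumption beyond the original answer) so that each path is read exactly as written in the list.

```shell
#!/bin/bash
# Hypothetical sandbox: two temp roots stand in for the remote and local servers.
remote=$(mktemp -d); local_root=$(mktemp -d)
mkdir -p "$remote/VAULT14/TEST_V14"
printf 'tape\n' > "$remote/VAULT14/TEST_V14/634001"
printf '%s\n' /VAULT14/TEST_V14/634001 > "$remote/diff_tapes.txt"

# IFS= and -r keep leading spaces and backslashes in each path untouched.
while IFS= read -r line; do
  mkdir -p "$local_root$(dirname "$line")"   # create missing parent directories first
  cp -p "$remote$line" "$local_root$line"    # stand-in for: scp -p bill@lgrdcpvtsa:"$line" "$line"
done < "$remote/diff_tapes.txt"

find "$local_root" -type f
```

Without the mkdir -p line the copy fails with the same "No such file or directory" error seen in the question.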

/VAULT14/TEST_V14/634001: No such file or directory
This is likely because the folder /VAULT14/TEST_V14/ does not exist on the local machine.
Create it first, then run the loop:
mkdir -p /VAULT14/TEST_V14
while read line; do
    scp -p bill@lgrdcpvtsa:"$line" "$line"
done < /home/bill/diff_tapes.txt
