Non-sequential FTP script - ftp

Scenario:
I have to transfer approximately 3000 files, 30 to 35 MB each, from one server to another (both servers are IBM AIX servers).
These files are in .gz format. They are unzipped at the destination with the gunzip command to be of use.
The way I am doing it now:
I have made .sh files containing ftp scripts of 500 files each. When run, these .sh files transfer the files to the destination. At the destination I keep checking how many files have arrived; as soon as 100 files are there, I run gunzip on those 100 files, then do the same for the next 100, and so on. I run gunzip on batches of 100 just to save time.
What I have in mind:
I am looking for a command, or any other way, that will FTP my files to the destination and start unzipping each batch of 100 files as soon as it has been transferred, BUT the unzipping should not pause the transfer of the remaining files.
Script that I tried:
ftp -n 192.168.0.22 << EOF
quote user username
quote pass password
cd /gzip_files/files
lcd /unzip_files/files
prompt n
bin
mget file_00028910*gz
! gunzip file_00028910*gz
mget file_00028911*gz
! gunzip file_00028911*gz
mget file_00028912*gz
! gunzip file_00028912*gz
mget file_00028913*gz
! gunzip file_00028913*gz
mget file_00028914*gz
! gunzip file_00028914*gz
bye
The drawback of the above code is that while the
! gunzip file_00028910*gz
line is executing, the FTP transfer of the next batch (i.e. file_00028911*gz) is paused, which wastes a lot of time and bandwidth.
The ! mark is used to run operating system commands from within the ftp prompt.
I hope I have explained my scenario properly. I will update the post if I find a solution; if anyone already has one, please reply.
Regards
Yash.

Since you seem to be doing this on a UNIX system, you probably have Perl installed. You might try the following Perl code:
use strict;
use warnings;
use Net::FTP;

my @files = @ARGV;       # get files from command line
my $server = '192.168.0.22';
my $user = 'username';
my $pass = 'password';
my $gunzip_after = 100;  # collect up to 100 files

my $ftp = Net::FTP->new($server) or die "failed to connect to the server: $!";
$ftp->login($user,$pass) or die "login failed";

my $pid_gunzip;
while (1) {
    my @collect4gunzip;
    GET_FILES:
    while (my $file = shift @files) {
        my $local_file = $ftp->get($file);
        if ( ! $local_file ) {
            warn "failed to get $file: ".$ftp->message;
            next;
        }
        push @collect4gunzip, $local_file;
        last if @collect4gunzip == $gunzip_after;
    }
    @collect4gunzip or last; # no more files ?

    while ( $pid_gunzip && kill(0,$pid_gunzip) ) {
        # gunzip is still running, wait because we don't want to run multiple
        # gunzip instances at the same time
        warn "wait for last gunzip to return...\n";
        wait();
        # instead of waiting for gunzip to return we could go back to retrieve
        # more files and add them to @collect4gunzip
        # goto GET_FILES;
    }

    # last gunzip is done, start to gunzip the collected files
    defined( $pid_gunzip = fork() ) or die "fork failed: $!";
    if ( ! $pid_gunzip ) {
        # child process runs gunzip
        # one may need to split this into multiple gunzip calls to make
        # sure that the command line does not get too long
        system( 'gunzip', @collect4gunzip );
        # child will exit once done
        exit(0);
    }
    # parent continues with getting more files
}
It's not tested, but at least it passes the syntax check.
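As a usage sketch (the script name and the explicit file names are assumptions, not part of the original answer), you could save the code above as fetch_and_gunzip.pl and pass the remote file names on the command line; note that Net::FTP's get() does not expand wildcards, so the names have to be listed explicitly:
# hypothetical invocation with explicit remote file names
perl fetch_and_gunzip.pl file_00028910_001.gz file_00028910_002.gz
# or expand a prepared list of names into arguments
xargs perl fetch_and_gunzip.pl < filelist.txt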

One of two solutions. Don't call gunzip directly; call "blah", where "blah" is a script:
#!/bin/sh
gunzip "$@" &
This puts the gunzip into the background, the script returns immediately, and you continue with the FTP. The other thought is to just add the & to the shell command -- I bet that would work just as well, i.e. within the ftp script, do:
! gunzip file_00028914*gz &
But... I believe you are somewhat leading yourself astray. rsync and other solutions are the way to go for many reasons.
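As a sketch of the rsync route (untested; it assumes rsync over ssh is available on both AIX servers and reuses the host and paths from the question), the transfer and the unzipping can be decoupled completely:
# pull the .gz files from the source server (host and paths taken from the question)
rsync -av username@192.168.0.22:/gzip_files/files/ /unzip_files/files/
# unzip locally, independently of the transfer
gunzip /unzip_files/files/*.gz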

Related

Bash FTP upload - events to log

I have a bash script that uploads files over FTP. The log file contains all events. How can I have only certain events in the log file, like: "Connected to $HOST" or "WARNING - failed ftp connection"; "Ok to send data"; "Transfer complete"; "10000 bytes sent in 0.00 secs (73.9282 MB/s)"?
#!/bin/sh
echo "####################" >> $TESTLOG
echo "$(date +%Y%m%d_%H%M%S)" >> $TESTLOG
ftp -i -n -v <<SCRIPT >> ${TESTLOG} 2>&1
open $HOST
user $USER $PASSWD
bin
cd $DPATH
lcd $TFILE
mput *.txt
quit
SCRIPT
exit 0
####################
20210304_111125
Connected to $HOST.
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
250 Directory successfully changed.
Local directory now /home/pi/Data/data_files/tmp
local: 20210304_111125_ftp_10k.txt remote: 20210304_111125_ftp_10k.txt
200 PORT command successful. Consider using PASV.
150 Ok to send data.
226 Transfer complete.
10000 bytes sent in 0.00 secs (73.9282 MB/s)
221 Goodbye.
First, I'd like to say that while this is a very common idiom, it has no error checking and I despise it.
Now that I've got that off my chest...
ftp 2>&1 -inv << SCRIPT | grep -f patfile >> ${TESTLOG}
patfile is a list of patterns to keep. c.f. the grep manual page.
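For illustration (the exact patterns are an assumption based on the messages the question wants to keep), patfile could contain:
Connected to
failed
Ok to send data
Transfer complete
bytes sent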
...to continue my rant, though...
What if someone changes the permissions on your $DPATH? The cd fails; ftp reports the failure to the console (your grep ignores it so it doesn't even get logged...), the script continues and puts the files in the wrong place. Full disk prevents files from being placed? Same cycle. A thousand things could go wrong, but ftp just blithely goes on, and doesn't even return an error on exit for most of them. Don't do this.
Just use scp. Yes, you have to set up something like ssh keys, but then the command is simple and checkable.
scp $TFILE/*.txt $HOST:$DPATH/ || echo "copy failed"
For a more elaborate response -
if scp $TFILE/*.txt $HOST:$DPATH/ # test the scp
then ssh "$HOST" "ls -ltr $DPATH/*.txt" # show result if success
else echo "copy failed" # code here for if it fails...
exit 1 # as much code as you feel you need
fi
Otherwise, use a different language with an ftp module like Perl so you can check steps and handle failures.
(I wrote a ksh script that handled an ftp coprocess, fed it a command at a time, checked the results... it's doable, but I don't recommend it.)

Bash, shell - multiple LFTP commands in one script

I've been trying to solve a relatively small problem with moving some files across FTP servers, but no luck so far.
In a nutshell, this is what I'm doing. I have three servers:
SourceSFTP
TargetSFTP
Target_2_SFTP
The script is supposed to do the following:
1. Connect to SourceSFTP
2. Grab all files
3. Loop through the files
4. Call a function that takes a file as a parameter and does stuff to it; let's call it postfunc()
5. Drop the files to TargetSFTP
The problem occurs when, inside postfunc(), I put another call to lftp to transfer the file to Target_2_SFTP. The command is executed properly (I can see the file moved), but then step 5 never happens.
This is the script I have:
function postfunc() {
the_file=$1
lftp<<END_SCRIPT2
open sftp://$Target2SFTP
user $USERNAME $PASSWORD
cd /root
put $the_file
bye
END_SCRIPT2
}
echo "Downloading files from $SOURCE_SFTP"
lftp -e "echo 'testing connection';exit" -u $SOURCE_USERNAME,$SOURCE_PASSWORD $SOURCE_SFTP
lftp -e "set xfer:clobber true;mget $SOURCE_DIR*.csv;exit" -u $SOURCE_USERNAME,$SOURCE_PASSWORD $SOURCE_SFTP || exit 0
files=(*.csv)
batch=10
for ((i=0; i < ${#files[@]}; i+=batch)); do
commands=""
# Do some stuff
for((j=0; j < batch; j+=1)); do
commands=$commands"mv source_dir/${files[i+j]} archivedir/${files[i+j]};"
postfunc ${files[i]}
done
echo "Archiving batch..."
lftp -e "$commands;exit" -u $SOURCE_USERNAME,$SOURCE_PASSWORD $SOURCE_SFTP
lftp<<END_SCRIPT
open sftp://$TARGET_SFTP
user $TARGET_USERNAME $TARGET_PASSWORD
cd $TARGET_DIR
mput dirr/*
bye
END_SCRIPT
done
Hopefully I'm missing something obvious... At the moment, even if I move just one file, "Archiving batch..." never shows up; if I remove the contents of postfunc(), everything executes correctly.

Copying SCP multiple files from remote server with wildcard argument in UNIX BASH

This Expect script is part of my UNIX bash script:
expect -c "
spawn scp yoko@sd.lindeneau.com:\"encryptor *.enc\" .
expect password: { send \"$PASS\r\" }
expect 100%
sleep 1
exit
"
I am trying to copy both 'encryptor' and '*.enc' with this one scp command. The console tells me it cannot find '*.enc'.
The syntax for multiple files:
$ scp your_username#remotehost.edu:~/\{foo.txt,bar.txt\} .
I would guess in your case (untested):
scp yoko@sd.lindeneau.com:\{encryptor,\*.enc\} .
Not sure it helps, but I was looking for a similar objective: copying selected files based on their filenames/extensions, but located in different subfolders (at the same level of subfolders).
This works, e.g. for copying .csv files from Server:/CommonPath/SpecificPath/:
scp -r Server:/CommonPath/\*/\*csv YourRecordingLocation
I even tested more complex "Perl-like" regular expressions.
I'm not sure the -r option is still needed.
#!/bin/bash
expect -c "
set timeout 60; # 1 min
spawn scp yoko@sd.lindeneau.com:{encryptor *.enc} .
expect \"password: $\"
send \"mypassword\r\"
expect eof
"
You can increase the timeout if your file copy takes more time to complete. I have used expect eof, which waits until the scp command closes, i.e. we wait for the end of file (EOF) of the scp after sending the password.

how to get the return code of scp command failing with connection lost error in shell script

I have a shell script in which I am pulling a remote server's *.gz file using the command below, and after the scp I am executing the gunzip command.
The issue is that during the scp the connection gets lost, so an incomplete *.gz file is saved in my local server directory; when I then gunzip it on the next line below the scp command, it gunzips the file successfully, but when I open the file it contains garbage values.
scp ${HostUser}@${HostServer}:$4/*$no*.gz
gunzip
When I debugged, I found the reasons below:
1. Incomplete file transfer due to the lost connection.
2. When executing the gunzip command manually, it gave an "end of file not found" error. So it was creating garbage values in the file, and my script still went to success, which is not correct.
So my queries are:
Can I store the return code of the scp command, so that I can tell whether the complete file transfer was done? If the complete transfer was not done, then fail the script so that the gunzip command doesn't try to open the incomplete file and store garbage values.
I am also facing a strange issue: when running the scp command for the first time, the connection is lost, but when I run the scp command again from the same session, it succeeds. In production it will be a new session every time, so we can't afford regular job failures.
Also, for one particular file the connection is lost within 5 seconds. I checked with the remote server; the session timeout is 5 minutes.
Please suggest.
'$?' will hold the exit code of the last command, so you can assign it to a variable right after the scp command. Then use an if statement to check whether the exit code is 0 (successful completion) and, only then, run the gunzip command.
scp ${HostUser}@${HostServer}:$4/*$no*.gz
EXIT_STATUS=$?
if [ $EXIT_STATUS -eq 0 ]; then
    gunzip
else
    ...some error handling
fi
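A further check one could add here (my suggestion, not part of the original answer) is gzip -t, which tests the integrity of the archive and catches a truncated transfer before gunzip produces garbage; $local_gz_file below is a hypothetical name for the fetched file:
if gzip -t "$local_gz_file" 2>/dev/null; then   # integrity test of the downloaded archive
    gunzip "$local_gz_file"
else
    echo "transfer incomplete or corrupt: $local_gz_file" >&2
    exit 1
fi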

How to get a full copy of a dump file while it is still being created?

Every hour the main machine takes a minute to produce a dump file of
about 100MB.
A backup machine copies that, using scp, also hourly.
Both actions are triggered by cron to start at the same minutes past the hour.
The copy often contains only part of the dump file.
Although I could change the cron on the backup machine to happen 5 minutes later, that rather smells of cheating.
What is the correct way around this problem?
Leverage fuser to determine if the file is in-use; something like:
#!/bin/sh
MYFILE=myfile       #...change as necessary
while true
do
    if [ -z "$(/sbin/fuser ${MYFILE} 2> /dev/null)" ]; then
        break
    fi
    echo "...waiting..."
    sleep 10        #...adjust as necessary
done
echo "Process the file..."
Assuming you can modify the source code of the programs, you can have the "dumper" output a second file indicating that it is done dumping.
1. Dumper deletes signal file
2. Dumper dumps file
3. Dumper creates signal file
4. Backup program polls periodically until the signal file is created
5. Backup program deletes signal file
6. Backup program has complete file
Not the best form of IPC, but as it's on another machine... I suppose you could also open a network connection to do the polling.
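As a minimal sketch of the backup machine's side of this scheme (the host name, paths, and signal file name are assumptions for illustration):
#!/bin/sh
# poll until the dumper has recreated the signal file, then copy the dump
MAIN=main-machine              # hypothetical host name of the main machine
SIGNAL=/var/dumps/dump.done    # hypothetical signal file written by the dumper
DUMP=/var/dumps/hourly.dump    # hypothetical dump file
until ssh "$MAIN" "test -f $SIGNAL"
do
    sleep 10
done
scp "$MAIN:$DUMP" /backups/ && ssh "$MAIN" "rm -f $SIGNAL"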
Very quick and very dirty:
my_file=/my/filename
while [[ -n "$(find $my_file)" && -n "$(find $my_file -mmin -1)" ]] ; do
    echo "File '$my_file' exists, but has been recently modified. Waiting..."
    sleep 10
done
scp ... # if the file does not exist at all, we don't wait for it, but let scp fail
Try to copy the file from the main node to the backup node in one script:
1. dumpfile
2. scp
Then it's guaranteed that the file is finished before the scp call.
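As a sketch (the dump command, file path, and backup host are assumptions, not from the original answer), the hourly cron job on the main machine would then be a single script:
#!/bin/sh
# run hourly from cron on the main machine; copy only after the dump has finished
produce_dump /var/dumps/hourly.dump \
    && scp /var/dumps/hourly.dump backup-machine:/backups/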
