Getting continue behavior (not redownloading files already present) using lftp. - bash

So, I have a script which downloads stuff from a seedbox. It works great for new files which are in the remote server and then mirrored on my local server. The problem is that when I want, for example, to remove unnecessary files, running the script again re-downloads the same file(s) again. I tried going into the man pages of mirror but it wasn't helpful. Here is the script which mirrors the files:
#!/bin/bash
login=XXXX
pass=XXXXXX
host=XXXXX
remote_dir=/files/
local_dir=/home/XXX/XXX
trap "rm -f /tmp/seedroots.lock" SIGINT SIGTERM
if [ -e /tmp/seedroots.lock ]; then
echo "Synctorrent is running already."
exit 1
else
touch /tmp/seedroots.lock
lftp -p 21 -u $login,$pass $host << EOF
set ftp:ssl-allow no
set mirror:use-pget-n 5
mirror -c -P5 --log=synctorrents.log $remote_dir $local_dir
EOF
rm -f /tmp/seedroots.lock
exit 0
fi
Is there an option for mirror which I am missing that doesn't re-download the locally deleted file(s) again?

The mirror command in lftp has a --continue flag which will result in the behavior you want.

You should give a try to my version of your script (not tested) :
#!/bin/bash
login=XXXX
pass=XXXXXX
host=XXXXX
remote_dir=/files/
local_dir=/home/XXX/XXX
files=$local_dir/*
trap "rmdir /tmp/seedroots.lock" 0 1 2 3 15
if [[ -d /tmp/seedroots.lock ]]; then
echo "Synctorrent is running already."
exit 1
else
mkdir /tmp/seedroots.lock
lftp -p 21 -u $login,$pass $host << EOF
set ftp:ssl-allow no
set mirror:use-pget-n 5
mget $files
EOF
fi
What it does :
I build a local list of files, and, subsequently, mget all these files on the ftp server with the variable $files.
I replaced the lock file with a dir : search web about atomicity.
Files are not atomic whereas directories are.
The trap runs on normal exit and other signals
If you are using bash, [[ ]] tests are more powerfull.
Indentation is not just an option ;)

If you are just leeching files (not seeding), you can use lftp mirror with --Remove-source-files option to remove files at source after transfer (so no duplicate, re-downloads).

Related

Bash execute a for loop while time is outside working hours

I am trying to make the below script to execute a Restore binary between hours 17:00 - 07:00 for each folders which name starts with EAR_* in /backup_local/ARCHIVES/ but for some reason it is not working as expected, meaning that the for loop is not breaking if the time condition gets invalid.
Should I add the while loop inside the for loop?
#! /usr/bin/bash
#set -x
while :; do
currenttime=$(date +%H:%M)
if [[ "$currenttime" > "17:00" ]] || [[ "$currenttime" < "07:00" ]]; then
for path in /backup_local/ARCHIVES/EAR_*; do
[ -d "${path}" ] || continue # if not a directory, skip
dirname="$(basename "${path}")"
nohup /Restore -a /backup_local/ARCHIVES -c -I 0 -force -v > /backup_local/$dirname.txt &
wait $!
if [ $? -eq 0 ]; then
rm -rf $path
rm /backup_local/$dirname.txt
echo $dirname >> /backup_local/completed.txt
fi
done &
else
echo "Restore can be ran only outside working hours!"
break
fi
done &
your script looks like this in pseudo-code:
START
IF outside workinghours
EXIT
ELSE
RUN /Restore FOR EACH backupdir
GOTO START
The script only checks the time once, before starting a restore run (which will call /Restore for each directory to restore in a for loop)
It will continue to start the for loop, until the working hours start. Then it will exit.
E.g. if you have restore 3 folders to restore, each taking 2 hours; and you start the script at midnight; then the script will check whether it's outside working hours (it is), and will start the restore for the first folder (at 0:00), after two hours of work it will start the restore the 2nd folder (at 2:00), after another two hours it will start the restore of the 3rd folder (at 4:00). Once the 3rd folder has been restored, it will check the working hours again. Since it's now only 6:00, that is: outside the working hours, it will start the restore for the first folder (at 6:00), after two hours of work it will start the restore the 2nd folder (at 8:00), after another two hours it will start the restore of the 3rd folder (at 10:00).
It's noon when it does the next check against the working hours; since 12:00 falls within 7:00..17:00, the script will now stop. With an error message.
You probably only want the restore to run once for each folder, and stop proceeding to the next folder if the working hours start.
#!/bin/bash
for path in /backup_local/ARCHIVES/EAR_*/; do
currenttime=$(date +%H:%M)
if [[ "$currenttime" > "7:00" ]] && [[ "$currenttime" < "17:00" ]]; then
echo "Not restoring inside working hours!" 1>&2
break
fi
dirname="$(basename "${path}")"
/Restore -a /backup_local/ARCHIVES -c -I 0 -force -v > /backup_local/$dirname.txt
# handle exit code
done
update
I've just noticed your liberal spread of & for backgrounding jobs.
This is presumably to allow running the script from a remote shell. don't
What this will really do is:
it will run all the iterations over the restore-directories in parallel. This might create a bottleneck on your storage (if the directories to restore to/from share the same hardware)
it will background the entire loop-to-restore and immediately return to the out-of-hours check. if the check succeeds, it will spawn another loop-to-restore (and background it). then it will return to the out-of-hours check and spawn another backgrounded loop-to-restore.
Before dawn you probably have a few thousands background threads to restore directories. More likely you've exceeded your ressources and the process get's killed.
My example script above has omitted all the backgrounding (and the nohup).
If you want to run the script from a remote shell (and exit the shell after launching it), just run it with
nohup restore-script.sh &
Alternatively you could use
echo "restore-script.sh" | at now
or use a cron-job (if applicable)
The shebang contains an unwanted space. On my ubuntu the bash is found at /bin/bash.
Yours, is located there :
type bash
The while loop breaks in my test, replace the #!/bin/bash path with the result of the previous command:
#!/bin/bash --
#set -x
while : ; do
currenttime=$(date +%H:%M)
if [[ "$currenttime" > "17:00" ]] || [[ "$currenttime" < "07:00" ]]; then
for path in /backup_local/ARCHIVES/EAR_*; do
[ -d "${path}" ] || continue # if not a directory, skip
dirname="$(basename "${path}")"
nohup /Restore -a /backup_local/ARCHIVES -c -I 0 -force -v > /backup_local/$dirname.txt &
wait $!
if [ $? -eq 0 ]; then
rm -rf $path
rm /backup_local/$dirname.txt
echo $dirname >> /backup_local/completed.txt
fi
done &
else
echo "Restore can be ran only outside working hours!"
break
fi
done &

Bash: Check if remote directory exists using FTP

I'm writing a bash script to send files from a linux server to a remote Windows FTP server.
I would like to check using FTP if the folder where the file will be stored exists before attempting to create it.
Please note that I cannot use SSH nor SCP and I cannot install new scripts on the linux server. Also, for performance issues, I would prefer if checking and creating the folders is done using only one FTP connection.
Here's the function to send the file:
sendFile() {
ftp -n $FTP_HOST <<! >> ${LOCAL_LOG}
quote USER ${FTP_USER}
quote PASS ${FTP_PASS}
binary
$(ftp_mkdir_loop "$FTP_PATH")
put ${FILE_PATH} ${FTP_PATH}/${FILENAME}
bye
!
}
And here's what ftp_mkdir_loop looks like:
ftp_mkdir_loop() {
local r
local a
r="$#"
while [[ "$r" != "$a" ]]; do
a=${r%%/*}
echo "mkdir $a"
echo "cd $a"
r=${r#*/}
done
}
The ftp_mkdir_loop function helps in creating all the folders in $FTP_PATH (Since I cannot do mkdir -p $FTP_PATH through FTP).
Overall my script works but is not "clean"; this is what I'm getting in my log file after the execution of the script (yes, $FTP_PATH is composed of 5 existing directories):
(directory-name) Cannot create a file when that file already exists.
Cannot create a file when that file already exists.
Cannot create a file when that file already exists.
Cannot create a file when that file already exists.
Cannot create a file when that file already exists.
To solve this, do as follows:
To ensure that you only use one FTP connection, you create the input (FTP commands) as an output of a shell script
E.g.
$ cat a.sh
cd /home/test1
mkdir /home/test1/test2
$ ./a.sh | ftp $Your_login_and_server > /your/log 2>&1
To allow the FTP to test if a directory exists, you use the fact that "DIR" command has an option to write to file
# ...continuing a.sh
# In a loop, $CURRENT_DIR is the next subdirectory to check-or-create
echo "DIR $CURRENT_DIR $local_output_file"
sleep 5 # to leave time for the file to be created
if (! -s $local_output_file)
then
echo "mkdir $CURRENT_DIR"
endif
Please note that "-s" test is not necessarily correct - I don't have acccess to ftp now and don't know what the exact output of running DIR on non-existing directory will be - cold be empty file, could be a specific error. If error, you can grep the error text in $local_output_file
Now, wrap the step #2 into a loop over your individual subdirectories in a.sh
#!/bin/bash
FTP_HOST=prep.ai.mit.edu
FTP_USER=anonymous
FTP_PASS=foobar#example.com
DIRECTORY=/foo # /foo does not exist, /pub exists
LOCAL_LOG=/tmp/foo.log
ERROR="Failed to change directory"
ftp -n $FTP_HOST << EOF | tee -a ${LOCAL_LOG} | grep -q "${ERROR}"
quote USER ${FTP_USER}
quote pass ${FTP_PASS}
cd ${DIRECTORY}
EOF
if [[ "${PIPESTATUS[2]}" -eq 1 ]]; then
echo ${DIRECTORY} exists
else
echo ${DIRECTORY} does not exist
fi
Output:
/foo does not exist
If you want to suppress only the messages in ${LOCAL_LOG}:
ftp -n $FTP_HOST <<! | grep -v "Cannot create a file" >> ${LOCAL_LOG}

Run wget and other commands in shell script

I'm trying to create a shell script that I will download the latest Atomic gotroot rules to my server, unpack them, copy them to the correct folder, etc.,
I've been reading shell tutorials and forum posts for most of the day and the syntax escapes me for some of these. I have run all these commands and I know they work if I manually run them.
I know I need to develop some error checking, but I'm just trying to get the commands to run correctly. The main problem at the moment is the syntax of the wget commands, i've got errors about missing semi-colons, divide by zero, unsupported schemes - I've tried various quoting (single and double) and escaping - / " characters in various combinations.
Thanks for any help.
The raw wget command is
wget --user="jim" --password="xxx-yyy-zzz" "http://updates.atomicorp.com/channels/rules/subscription/VERSION"
#!/bin/sh
update_modsec_rules(){
wget=/usr/bin/wget
tar=/bin/tar
apachectl=/usr/bin/apache2ctl
TXT="Script Run Finished"
WORKING_DIR="/var/asl/updates"
TARGET_DIR="/usr/local/apache/conf/modsec_rules/"
EXISTING_FILES="/var/asl/updates/modsec/*"
EXISTING_ARCH="/var/asl/updates/modsec-*"
WGET_OPTS='--user=jim --password=xxx-yyy-zzz'
URL_BASE="http://updates.atomicorp.com/channels/rules/subscription"
# change to working directory and cleanup any downloaded files and extracted rules in modsec/ directory
cd $WORKING_DIR
rm -f $EXISTING_ARCH
rm -f $EXISTING_FILES
rm -f VERSION*
# wget to download VERSION file
$wget ${WGET_OPTS} "${URL_BASE}/VERSION"
# get current MODSEC_VERSION from VERSION file and save as variable
source VERSION
TARGET_DATE=$MODSEC_VERSION
echo $TARGET_DATE
# wget to download current archive
$wget ${WGET_OPTS} "${URL_BASE}/modsec-${TARGET_DATE}.tar.gz"
# extract archive
echo "extracting files . . . "
tar zxvf $WORKING_DIR/modsec-${TARGET_DATE}.tar.gz
echo "copying files . . . "
cp -uv $EXISTING_FILES $TARGET_DIR
echo $TXT
}
update_modsec_rules $# 2>&1 | tee -a /var/asl/modsec_update.log
RESTART_APACHE="/usr/local/cpanel/scripts/restartsrv httpd"
$RESTART_APACHE
Here are some guidelines to use when writing shell scripts.
Always quote variables when you use them. This helps avoid the possibility of misinterpretation. (What if a filename contains a space?)
Don't trust fileglobbing on commands like rm. Use for loops instead. (What if a filename starts with a hyphen?)
Avoid subshells when possible. Your lines with backquotes make me itchy.
Don't exec if you can help it. And especially don't expect any parts of your script after your exec to actually get run.
I should point out that while your shell may be bash, you've specified /bin/sh for execution of this script, so it is NOT a bash script.
Here's a rewrite with some error checking. Add salt to taste.
#!/bin/sh
# Linux
wget=/usr/bin/wget
tar=/bin/tar
apachectl=/usr/sbin/apache2ctl
# FreeBSD
#wget=/usr/local/bin/wget
#tar=/usr/bin/tar
#apachectl=/usr/local/sbin/apachectl
TXT="GOT TO THE END, YEAH"
WORKING_DIR="/var/asl/updates"
TARGET_DIR="/usr/local/apache/conf/modsec_rules/"
EXISTING_FILES_DIR="/var/asl/updates/modsec/"
EXISTING_ARCH="/var/asl/updates/"
URL_BASE="http://updates.atomicorp.com/channels/rules/subscription"
WGET_OPTS='--user="jim" --password="xxx-yyy-zzz"'
if [ ! -x "$wget" ]; then
echo "ERROR: No wget." >&2
exit 1
elif [ ! -x "$apachectl" ]; then
echo "ERROR: No apachectl." >&2
exit 1
elif [ ! -x "$tar" ]; then
echo "ERROR: Not in Kansas anymore, Toto." >&2
exit 1
fi
# change to working directory and cleanup any downloaded files
# and extracted rules in modsec/ directory
if ! cd "$WORKING_DIR"; then
echo "ERROR: can't access working directory ($WORKING_DIR)" >&2
exit 1
fi
# Delete each file in a loop.
for file in "$EXISTING_FILES_DIR"/* "$EXISTING_ARCH_DIR"/modsec-*; do
rm -f "$file"
done
# Move old VERSION out of the way.
mv VERSION VERSION-$$
# wget1 to download VERSION file (replaces WGET1)
if ! $wget $WGET_OPTS $URL_BASE}/VERSION; then
echo "ERROR: can't get VERSION" >&2
mv VERSION-$$ VERSION
exit 1
fi
# get current MODSEC_VERSION from VERSION file and save as variable,
# but DON'T blindly trust and run scripts from an external source.
if grep -q '^MODSEC_VERSION=' VERSION; then
TARGET_DATE="`sed -ne '/^MODSEC_VERSION=/{s/^[^=]*=//p;q;}' VERSION`"
echo "Target date: $TARGET_DATE"
fi
# Download current archive (replaces WGET2)
if ! $wget ${WGET_OPTS} "${URL_BASE}/modsec-$TARGET_DATE.tar.gz"; then
echo "ERROR: can't get archive" >&2
mv VERSION-$$ VERSION # Do this, don't do this, I don't know your needs.
exit 1
fi
# extract archive
if [ ! -f "$WORKING_DIR/modsec-${TARGET_DATE}.tar.gz" ]; then
echo "ERROR: I'm confused, where's my archive?" >&2
mv VERSION-$$ VERSION # Do this, don't do this, I don't know your needs.
exit 1
fi
tar zxvf "$WORKING_DIR/modsec-${TARGET_DATE}.tar.gz"
for file in "$EXISTING_FILES_DIR"/*; do
cp "$file" "$TARGET_DIR/"
done
# So far so good, so let's restart apache.
if $apachectl configtest; then
if $apachectl restart; then
# Success!
rm -f VERSION-$$
echo "$TXT"
else
echo "ERROR: PANIC! Apache didn't restart. Notify the authorities!" >&2
exit 3
fi
else
echo "ERROR: Apache configs are broken. We're still running, but you'd better fix this ASAP." >&2
exit 2
fi
Note that while I've rewritten this to be more sensible, there is certainly still a lot of room for improvement.
You have two options:
1- changing this to
WGET1=' --user="jim" --password="xxx-yyy-zzz" "http://updates.atomicorp.com/channels/rules/subscription/VERSION"'
then run
wget $WGET1 same to WGET2
Or
2- encapsulating $WGET1 with backquotes ``.
e.g.:
`$WGET`
This applies to any command your executing out of a variable.
Suggested changes:
#!/bin/sh
TXT="GOT TO THE END, YEAH"
WORKING_DIR="/var/asl/updates"
TARGET_DIR="/usr/local/apache/conf/modsec_rules/"
EXISTING_FILES="/var/asl/updates/modsec/*"
EXISTING_ARCH="/var/asl/updates/modsec-*"
WGET1='wget --user="jim" --password="xxx-yyy-zzz" "http://updates.atomicorp.com/channels/rules/subscription/VERSION"'
WGET2='wget --user="jim" --password="xxx-yyy-zzz" "http://updates.atomicorp.com/channels/rules/subscription/modsec-$TARGET_DATE.tar.gz"'
## change to working directory and cleanup any downloaded files and extracted rules in modsec/ directory
cd $WORKING_DIR
rm -f $EXISTING_ARCH
rm -f $EXISTING_FILES
## wget1 to download VERSION file
`$WGET1`
## get current MODSEC_VERSION from VERSION file and save as variable
source VERSION
TARGET_DATE=`echo $MODSEC_VERSION`
## WGET2 command to download current archive
`$WGET2`
## extract archive
tar zxvf $WORKING_DIR/modsec-$TARGET_DATE.tar.gz
cp $EXISTING_FILES $TARGET_DIR
## restart server
exec '/usr/local/cpanel/scripts/restartsrv_httpd' $*;
Pro Tip: If you need string substitution, using ${VAR} is much better to eliminate ambiguity, e.g.:
tar zxvf $WORKING_DIR/modsec-${TARGET_DATE}.tar.gz

Bash curl returns 0 whenever the copy has finished or not

I'm calling curl on bash to copy a file from a mounted SD card with the option to resume the copy later if the device gets unmounted. I receive the same status exit code 0 when I interrupt the copy by unmounting the volume and when the file gets actually copied. Any suggestions how to catch the case where the file has not been copied?
I'm copying only one file at a time.
This is the command:
curl -C - -O file:///mnt/sdcard/DCIM/100/0044.MP4
I came to a solution which is not as clear as I want, but still working. I'm executing the command 2 times one after another, so when the first command returns 0 upon unmount, the second now tries to copy the file and return error code 37 because of the unreachable source. If the second command returns 0 I consider the file copied.
Following your concept you could have a script like this:
#!/bin/bash
# Copies files persistently.
#
# Usage: pc <filepath> [<filepath2>] ...
#
function pc {
local FILE
for FILE; do
echo "Copying $FILE."
until curl -C - -O "file://${FILE}" && curl -C - -O "file://${FILE}"; do
if [[ -e $FILE ]]; then
echo "File $FILE can't be copied."
break
else
echo "Waiting for $FILE."
until
sleep 5
[[ -e $FILE ]]
do
continue
done
fi
done
done
}
pc "$#"
You could also just embed the function to a bash startup script if you like.

how to find a file exists in particular dir through SSH

how to find a file exists in particular dir through SSH
for example :
host1 and dir /home/tree/TEST
Host2:- ssh host1 - find the TEST file exists or not using bash
ssh will return the exit code of the command you ask it to execute:
if ssh host1 stat /home/tree/TEST \> /dev/null 2\>\&1
then
echo File exists
else
echo Not found
fi
You'll need to have key authentication setup of course, so you avoid the password prompt.
This is what I ended up doing after reading and trying out the stuff here:
FileExists=`ssh host "test -e /home/tree/TEST && echo 1 || echo 0"`
if [ ${FileExists} = 0 ]
#do something because the file doesn't exist
fi
More info about test: http://linux.die.net/man/1/test
An extension to Erik's accepted answer.
Here is my bash script for waiting on an external process to upload a file. This will block current script execution indefinitely until the file exists.
Requires key-based SSH access although this could be easily modified to a curl version for checks over HTTP.
This is useful for uploads via external systems that use temporary file names:
rsync
transmission (torrent)
Script below:
#!/bin/bash
set -vx
#AUTH="user#server"
AUTH="${1}"
#FILE="/tmp/test.txt"
FILE="${2}"
while (sleep 60); do
if ssh ${AUTH} stat "${FILE}" > /dev/null 2>&1; then
echo "File found";
exit 0;
fi;
done;
No need for echo. Can't get much simpler than this :)
ssh host "test -e /path/to/file"
if [ $? -eq 0 ]; then
# your file exists
fi

Resources