I have a simple inotifywait script that watches for FTP file uploads to be closed and then moves them to AWS S3. It seems to be working, except that the logs indicate the file was not found (although the file was indeed uploaded to S3).
The s3 move command moves the file to the cloud and deletes it locally. Could this be because inotifywait detects the deletion of the file as a close_write event?
Why does inotify seem to be executing the commands twice?
TARGET=/home/*/ftp/files
inotifywait -m -r -e close_write $TARGET |
while read directory action file
do
if [[ "$file" =~ .*mp4$ ]]
then
echo COPY PATH IS "$directory$file"
aws s3 mv "$directory$file" s3://bucket
fi
done
example logs:
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
COPY PATH IS /home/user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4
COPY PATH IS /home/user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4
COPY PATH IS /home/user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4
move: ../user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4 to s3://bucket/user-cam-1_00_20220516114055.mp4
upload: ../user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4 to s3://bucket/user-cam-1_00_20220516114055.mp4
move failed: ../user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4 to s3://bucket/user-cam-1_00_20220516114055.mp4 [Errno 2] No such file or directory: '/home/user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4'
rm: cannot remove '/home/user/ftp/files/2022/05/16/user-cam-1_00_20220516114055.mp4': No such file or directory
I cleaned up your script and added some safety: quoting throughout, and a check for an already-processed file in case the filesystem triggers duplicate events for the same file.
#!/usr/bin/env bash
# nullglob: an unmatched pattern expands to nothing instead of itself
shopt -s nullglob
# Expands pattern into an array
target=(/home/*/ftp/files/)
# Creates temporary directory and cleanup trap
declare -- tmpdir=
if tmpdir=$(mktemp -d); then
trap 'rm -fr -- "$tmpdir"' EXIT INT
else
# or exit error if it fails
exit 1
fi
# In case no target matches, exit with an error
[ "${#target[@]}" -gt 0 ] || exit 1
s3move() {
local -- p=$1
# Keep only the file name under the temporary dir
# (the watched path's subdirectories do not exist there)
local -- tmp="$tmpdir/${p##*/}"
printf 'Copy path is: %s\n' "$p"
# Move the file to the temporary dir
# so it is away from the inotify watch dir ASAP
mv -- "$p" "$tmp"
# Then perform the slow remote move to the s3 bucket
# Remove the echo once it is ok
echo aws s3 mv "$tmp" s3://bucket
# aws s3 mv removes the local copy itself; this is just a safety net
rm -f -- "$tmp"
}
while read -r -d '' p; do
# Skip if file does not exist, as it has already been moved away
# case of a duplicate event for already processed file
[ -e "$p" ] || continue
s3move "$p"
done < <(
# Good practice to spell out long option names in a script
# --format prints the null-delimited full file path
inotifywait \
--monitor \
--recursive \
--event close_write \
--includei '.*\.mp4$' \
--format '%w%f%0' \
"${target[#]}" 2>/dev/null
)
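To sanity-check the event handling before pointing it at the real bucket (while the echo is still in place), you can simulate an upload into one of the watched directories; "testuser" and sample.mp4 below are placeholders:
# With the script above running, copy a sample file into a watched directory
cp sample.mp4 /home/testuser/ftp/files/sample.mp4   # CLOSE_WRITE fires when cp closes the file
# Expected output from the script:
#   Copy path is: /home/testuser/ftp/files/sample.mp4
#   aws s3 mv /tmp/tmp.XXXXXXXXXX/sample.mp4 s3://bucket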
I would like to break out of a loop when it gets to a blank line in a file. The issue is that the regexes I use to condition my data create a line with characters, so I need a check at the start of each iteration to tell whether the line is empty so I can break out. What am I missing?
#!/bin/bash
#NOTES: chmod this script with chmod 755 to run as regular local user
#This line allows for passing in a source file as an argument to the script (i.e: ./script.sh source_file.txt)
input_file="$1"
#This creates the folder structure used to mount the SMB Share and copy the assets over to the local machines
SOURCE_FILES_ROOT_DIR="${HOME}/operations/source"
DESTINATION_FILES_ROOT_DIR="${HOME}/operations/copied_files"
#This creates the fileshare mount point and place to copy files over to on the local machine.
echo "Creating initial folders..."
mkdir -p "${SOURCE_FILES_ROOT_DIR}"
mkdir -p "${DESTINATION_FILES_ROOT_DIR}"
echo "Folders Created! Destination files will be copied to ${DESTINATION_FILES_ROOT_DIR}/SHARE_NAME"
while read -r line;
do
if [ -n "$line" ]; then
continue
fi
line=${line/\\\\///}
line=${line//\\//}
line=${line%%\"*\"}
SERVER_NAME=$(echo "$line" | cut -d / -f 4);
SHARE_NAME=$(echo "$line" | cut -d / -f 5);
ASSET_LOC=$(echo "$line" | cut -d / -f 6-);
SMB_MOUNT_PATH="//$(whoami)@${SERVER_NAME}/${SHARE_NAME}";
if df -h | grep -q "${SMB_MOUNT_PATH}"; then
echo "${SHARE_NAME} is already mounted. Copying files..."
else
echo "Mounting it"
mount_smbfs "${SMB_MOUNT_PATH}" "${SOURCE_FILES_ROOT_DIR}"
fi
cp -a ${SOURCE_FILES_ROOT_DIR}/${ASSET_LOC} ${DESTINATION_FILES_ROOT_DIR}
done < $input_file
# cleanup
hdiutil unmount ${SOURCE_FILES_ROOT_DIR}
exit 0
The expected result was for the script to realize when it gets to a blank line and then stop. The script works when I remove the
if [ -n "$line" ]; then
continue
fi
The script runs and pulls assets but just keeps on going and never breaks out. When I run it as it is now I get:
Creating initial folders...
Folders Created! Destination files will be copied to /Users/baguiar/operations/copied_files
Mounting it
mount_smbfs: server connection failed: No route to host
hdiutil: unmount: "/Users/baguiar/operations/source" failed to unmount due to error 16.
hdiutil: unmount failed - Resource busy
cat test.txt
This is some file
There are lines in it
And empty lines
etc
while read -r line; do
if [[ -n "$line" ]]; then
continue
fi
echo "$line"
done < "test.txt"
will print out only the empty lines, i.e. effectively nothing. That's because -n matches strings that are not null, i.e., non-empty.
It sounds like you have a misunderstanding of what continue means. It does not mean "continue on in this step of the loop", it means "continue to the next step of the loop", i.e., go to the top of the while loop and run it starting with the next line in the file.
Right now, your script says "go line by line, and if the line is not empty, skip the rest of the processing". I think your goal is actually "go line by line, and if the line is empty, skip the rest of the processing". This would be achieved by if [[ -z "$line" ]]; then continue; fi
TL;DR You are skipping all the non-empty lines. Use -z to check if your variable is empty instead of -n.
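Since the original goal was to stop at the first blank line rather than skip it, here is a minimal illustration using the sample file above; swap break for continue if you only want to skip blank lines:
while read -r line; do
    [[ -z "$line" ]] && break   # stop at the first blank line
    echo "$line"
done < test.txt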
The OS is CentOS 7. I have a small application that implements the functionality below:
1. Read information from config.ini like this:
# Configuration file for ftpxml service
# Remote FTP server informations
ftpadress=1.2.3.4
username=test
password=test
# Local folders configuration
# folderA: folder for incomming files
folderA=/u02/dump
# folderB: Successfuly transfered files are copied here
folderB=/u02/dump_bak
# retrydir: when ftp upload fails, store failed files in this
# directory
retrydir=/u02/dump_retry
2. Monitor folder A. If there are any newly added files in A, do step 3.
3. FTP these new files to the remote FTP server in the order of their creation time. When an upload finishes, copy the uploaded file to folder B and delete it from folder A.
4. If an FTP upload fails, store the affected files in retrydir and try to FTP them again later.
5. Record every operation in a log file.
Detailed setup instructions for the application:
Install the ncftp package: yum install ncftp -y. It is neither a service nor a daemon, just a client tool that the bash script invokes for the FTP transfers.
Customize these files to suit your setup using vi: config.ini, ftpmon.path and ftpmon.service.
Copy ftpmon.path and ftpmon.service to /etc/systemd/system/, copy config.ini and ftpxml.sh to /u02/ftpxml/, and run: chmod +x ftpxml.sh
Start the monitoring tool:
sudo systemctl start ftpmon.path
If you want to enable it at boot time, just enter: sudo systemctl enable ftpmon.path
Set up a cron task to purge queued files (the -p option):
*/5 * * * * /u02/ftpxml/ftpxml.sh -p
Now the application seems to work well, except in one special situation:
When we put several files into folder A in quick succession, for instance 1.txt, 2.txt and 3.txt one after another within a short time, we usually find that 1.txt is FTPed fine, but the following files fail to FTP and stay in folder A.
Now I am trying to fix this problem. I suppose the error may be because, while the FTP of the first file is in progress, the second file has already been created under folder A, so the code never gets to handle it.
Below is the code of ftpxml.sh:
#!/bin/bash
# ! Please read the README.txt file !
# Copy files from upload dir to remote FTP then save them
# to folderB
# look our location
SCRIPT=$(readlink -f $0)
# Absolute path to this script
SCRIPTPATH=`dirname $SCRIPT`
PIDFILE=${SCRIPTPATH}/ftpmon_prog.lock
# load config.ini
if [ -f $SCRIPTPATH/config.ini ]; then
source $SCRIPTPATH/config.ini
else
echo "No config found. Exiting"
fi
# Lock to avoid multiple instances
if [ -f $PIDFILE ]; then
kill -0 $(cat $PIDFILE) 2> /dev/null
if [ $? == 0 ]; then
exit
fi
fi
# Store PID in lock file
echo $$ > $PIDFILE
# Parse cmdline arguments
while getopts ":ph" opt; do
case $opt in
p)
#we set the purge mode (cron mode)
purge_only=1
;;
\?|h)
echo "Help text"
exit 1
;;
esac
done
# declare useful functions
# common logging function
function logmsg() {
LOGFILE=ftp_upload_`date +%Y%m%d`.log
echo $(date +%m-%d-%Y\ %H:%M:%S) $* >> $SCRIPTPATH/log/${LOGFILE}
}
# Upload to remote FTP
# we use ncftpput to batch silently
# $1 file to upload $2 return value placeholder
function upload() {
ncftpput -V -u $username -p $password $ftpadress /prog/ $1
return $?
}
function purge_retry() {
failed_files=$(ls -1 -rt ${retrydir}/*)
if [ -z $failed_files ]; then
return
fi
while read line
do
#upload ${retrydir}/$line
upload $line
if [ $? != 0 ]; then
# upload failed we exit
exit
fi
logmsg File $line Uploaded from ${retrydir}
mv $line $folderB
logmsg File $line Copied from ${retrydir}
done <<< "$failed_files"
}
# first check out 'queue' directory
purge_retry
# if called from a cron task we are done
if [ $purge_only ]; then
rm -f $PIDFILE
exit 0
fi
# look in incoming dir
new_files=$(ls -1 -rt ${folderA}/*)
while read line
do
# launch upload
if [ Z$line == 'Z' ]; then
break
fi
#upload ${folderA}/$line
upload $line
if [ $? == 0 ]; then
logmsg File $line Uploaded from ${folderA}
else
# upload failed we cp to retry folder
echo upload failed
cp $line $retrydir
fi
# whether the upload succeeded or failed, we ALWAYS move the file to folderB
mv $line $folderB
logmsg File $line Copied from ${folderA}
done <<< "$new_files"
# clean exit
rm -f $PIDFILE
exit 0
below is content of ftpmon.path:
[Unit]
Description= Triggers the service that logs changes.
Documentation= man:systemd.path
[Path]
# Enter the path to monitor (/u02/dump)
PathModified=/u02/dump/
[Install]
WantedBy=multi-user.target
below is content of ftpmon.service:
[Unit]
Description= Starts File Upload monitoring
Documentation= man:systemd.service
[Service]
Type=oneshot
#Set here the user that ftpxml.sh will run as
User=root
#Set the exact path to the script
ExecStart=/u02/ftpxml/ftpxml.sh
[Install]
WantedBy=default.target
Thanks in advance; I hope some experts can give me suggestions.
Since you remove successfully transferred files from A, you can leave files with transfer errors in A. So I am dealing only with files in one folder.
List your files by modification time (the closest stand-in for creation time) with
find . -maxdepth 1 -type f -print0 | xargs -r0 stat -c '%y %n' | sort
if you want hidden files to be included or - if not -
find . -maxdepth 1 -type f ! -name '.*' -print0 | xargs -r0 stat -c '%y %n' | sort
You'll get something like
2016-02-19 18:53:41.000000000 ./.dockerenv
2016-02-19 18:53:41.000000000 ./.dockerinit
2016-02-19 18:56:09.000000000 ./versions.txt
2016-02-19 19:01:44.000000000 ./test.sh
Now cut off the timestamps to keep only the filenames (or use xargs -r0 stat -c %n if it does not matter that the files are then ordered by name instead of by timestamp) and, as sketched after this list:
do the transfer
check the success
move successfully transferred files to B
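A minimal sketch of that loop, reusing ncftpput and the variables from config.ini as in the question (it assumes filenames without embedded newlines):
# Oldest first by modification time; $folderA, $folderB, $retrydir,
# $username, $password and $ftpadress come from config.ini
find "$folderA" -maxdepth 1 -type f -printf '%T@ %p\n' | sort -n | cut -d' ' -f2- |
while IFS= read -r f; do
    if ncftpput -V -u "$username" -p "$password" "$ftpadress" /prog/ "$f"; then
        mv -- "$f" "$folderB"/      # success: archive the local copy
    else
        cp -- "$f" "$retrydir"/     # failure: queue it for a later retry
    fi
done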
As you stated above, there are situations where newly stored files are not successfully transferred. This can happen if the file is still being written to after you started the transfer. So filter on the timestamp so the file is at least some time old; add -mmin +1 to the find statement for "more than one minute old":
find . -maxdepth 1 -type f -mmin +1 -print0 | xargs -r0 stat -c %n | sort
If you don't want to rely on a one-minute file age, you'll have to check whether the file is still open, e.g. lsof | grep ./testfile, but this may have issues if you have tmpfs in your file system:
lsof | grep ./testfile
lsof: WARNING: can't stat() tmpfs file system /var/lib/docker/containers/8596cd310292a54652c7f50d7315c8390703b4816442146b340946779a72a40c/shm
Output information may be incomplete.
lsof: WARNING: can't stat() proc file system /run/docker/netns/fb9323486c44
Output information may be incomplete.
Alternatively, add %s to the stat format and check the file size twice within a few seconds; if it stays constant, the file has probably been written completely. Only probably, because the writing process may just be stalled.
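A small helper along those lines (the 5-second pause and the function name are arbitrary choices, not part of the original setup):
# Succeeds if the file size did not change over a short interval
is_stable() {
    local f=$1 size1 size2
    size1=$(stat -c %s -- "$f") || return 1
    sleep 5
    size2=$(stat -c %s -- "$f") || return 1
    [ "$size1" = "$size2" ]
}
# Example: only upload files that look finished
# is_stable "$folderA/1.txt" && upload "$folderA/1.txt"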
I'm calling curl from bash to copy a file from a mounted SD card, with the option to resume the copy later if the device gets unmounted. I receive the same exit status 0 both when I interrupt the copy by unmounting the volume and when the file actually gets copied. Any suggestions on how to catch the case where the file has not been copied?
I'm copying only one file at a time.
This is the command:
curl -C - -O file:///mnt/sdcard/DCIM/100/0044.MP4
I came to a solution which is not as clean as I would like, but it works: I execute the command twice, one run right after the other. When the first command returns 0 because of an unmount, the second one then tries to copy the file and returns error code 37 because the source is unreachable. If the second command also returns 0, I consider the file copied.
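A minimal illustration of that double invocation, using the path from the question:
if curl -C - -O file:///mnt/sdcard/DCIM/100/0044.MP4 &&
   curl -C - -O file:///mnt/sdcard/DCIM/100/0044.MP4; then
    echo "copy complete"
else
    echo "copy failed or was interrupted"
fi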
Following your concept you could have a script like this:
#!/bin/bash
# Copies files persistently.
#
# Usage: pc <filepath> [<filepath2>] ...
#
function pc {
local FILE
for FILE; do
echo "Copying $FILE."
until curl -C - -O "file://${FILE}" && curl -C - -O "file://${FILE}"; do
if [[ -e $FILE ]]; then
echo "File $FILE can't be copied."
break
else
echo "Waiting for $FILE."
until
sleep 5
[[ -e $FILE ]]
do
continue
done
fi
done
done
}
pc "$#"
You could also just embed the function in a bash startup script if you like.
I have this bash script:
#!/bin/bash
inotifywait -m -e close_write --exclude '\*.sw??$' . |
#adding --format %f does not work for some reason
while read dir ev file; do
cp ./"$file" zinot/"$file"
done
Now, how would I have it do the same thing but also handle deletes by writing the filenames to a log file?
Something like?
#!/bin/bash
inotifywait -m -e close_write --exclude '\*.sw??$' . |
#adding --format %f does not work for some reason
while read dir ev file; do
# if DELETE, append $file to /inotify.log
# else
cp ./"$file" zinot/"$file"
done
EDIT:
By looking at the messages generated, I found that inotifywait generates CLOSE_WRITE,CLOSE whenever a file is closed. So that is what I'm now checking in my code.
I also tried checking for DELETE, but at first that section of the code was not working. Check it out:
#!/bin/bash
fromdir=/path/to/directory/
inotifywait -m -e close_write,delete --exclude '\*.sw??$' "$fromdir" |
while read dir ev file; do
if [ "$ev" == 'CLOSE_WRITE,CLOSE' ]
then
# copy entire file to /root/zinot/ - WORKS!
cp "$fromdir""$file" /root/zinot/"$file"
elif [ "$ev" == 'DELETE' ]
then
# trying this without echo does not work, but with echo it does!
echo "$file" >> /root/zinot.txt
else
# never saw this error message pop up, which makes sense.
echo Could not perform action on "$ev"
fi
done
In the dir, I do touch zzzhey.txt. File is copied. I do vim zzzhey.txt and file changes are copied. I do rm zzzhey.txt and the filename is added to my log file zinot.txt. Awesome!
You need to add -e delete to your monitor, otherwise DELETE events won't be passed to the loop. Then add a conditional to the loop that handles the events. Something like this should do:
#!/bin/bash
inotifywait -m -e delete -e close_write --exclude '\*.sw??$' . |
while read dir ev file; do
if [ "$ev" = "DELETE" ]; then
echo "$file" >> /inotify.log
else
cp ./"$file" zinot/"$file"
fi
done
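A quick way to exercise both branches, mirroring the test described in the edit above (run this inside the watched directory, with the zinot/ subdirectory already created):
touch demo.txt            # CLOSE_WRITE,CLOSE -> copied to zinot/demo.txt
rm demo.txt               # DELETE            -> "demo.txt" appended to /inotify.log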
I'm having some difficulties with this. Basically, for work I need a bash script that backs up a variable number of directories that are stored in a config file.
I'm sure I need to import the list from the config file and just use a loop to copy all the directories across. I have it working for a single directory. My code is below. I've cut it down to a minimum.
#!/bin/sh
if [ ! -f ./backup.conf ]
then
echo "Configuration file not found. Exiting!!"
exit
fi
. ./backup.conf
unset PATH
# make sure we're running as root
if (( `$ID -u` != 0 )) ; then { $ECHO "Sorry, must be root. Exiting..."; exit; } fi ;
# attempt to remount the RW mount point as RW; else abort
$MOUNT -o remount,rw $SOURCEFILE $DESTINATIONFOLDER ;
if (( $? )); then
{
$ECHO "snapshot: could not remount $DESTINATIONFOLDER readwrite";
exit;
}
fi ;
# step 2: create new backup folder:
$MKDIR $FULLPATH
**Loop should go here**
#copy source directories to backup folder
$RSYNC \
-va --delete --delete-excluded \
--exclude-from="$EXCLUDES" \
$SOURCEFILE $FULLPATH;
The config file is as follows
SOURCE=path
DESTINATION=path2
BACKUPFOLDERNAME=/laptopBackup
My question is: what is the best approach to this task? I.e., how should I format the config file to import a variable number of paths into an array? Or is there a better way of doing this?
I'd personally do it slightly differently and make my configuration file more of a "control file". For example:
/path /path2 /laptopBackup
/tmp /test /bigmachine
etc.: one line per mount, three fields per line (source, destination, backupfoldername).
Then use something like:
while read SOURCE DESTINATION BACKUPFOLDERNAME
do
<stuff>
done < ${configfile}
(removed the cat so as not to shame myself further :( )
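For instance, plugging the rsync step from the question into that loop might look like the sketch below; $MKDIR, $RSYNC and $EXCLUDES are assumed to come from backup.conf as in the original script:
configfile=./backup.conf
while read -r SOURCE DESTINATION BACKUPFOLDERNAME
do
    # skip blank and comment lines
    [ -z "${SOURCE}" ] && continue
    case ${SOURCE} in \#*) continue ;; esac
    # e.g. /path2 + /laptopBackup -> /path2/laptopBackup
    FULLPATH="${DESTINATION}${BACKUPFOLDERNAME}"
    $MKDIR "${FULLPATH}"
    $RSYNC -va --delete --delete-excluded \
        --exclude-from="${EXCLUDES}" \
        "${SOURCE}" "${FULLPATH}"
done < "${configfile}"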