Bulk copy files over sftp with a delay between each file - bash

I want to move my files from one directory to SFTP, and also to another directory, in sequence rather than all together.
Let's say my directories are A and B.
Here is my code:
#!/bin/bash
cp -R "/usr/sap/tmp/Dir A/." "/usr/sap/tmp/Dir B/"
lftp <<_EOF_
open sftp://User:Password@Host -p Port
lcd "/usr/sap/tmp/Dir A"
cd /
pwd
mput -E "/usr/sap/tmp/Dir A/"*.dat
exit
_EOF_
This works fine. The only problem is that it moves all the files from dir A to SFTP together, at the same time. How can I get it to move the files one by one (in sequence, with at least a one-second gap between the files sent to SFTP)?

First create a file with commands for all files.
cat <<EOF > inputfile
open sftp://User:Password@Host -p Port
lcd "/usr/sap/tmp/Dir A"
cd /
pwd
EOF
find . -type f -name sa\*.txt -print0 |
xargs -n1 --null -I'{}' printf "%s\n" "mput '{}'" '!'"sleep 1" >> inputfile
echo "exit" >> inputfile
Next, feed that file to lftp:
lftp -f inputfile
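For illustration only: assuming Dir A contains just sa1.txt and sa2.txt, the generated inputfile would look roughly like this:
open sftp://User:Password@Host -p Port
lcd "/usr/sap/tmp/Dir A"
cd /
pwd
mput './sa1.txt'
!sleep 1
mput './sa2.txt'
!sleep 1
exit
Each ! line runs a local shell command, so every upload is followed by a one-second pause, which gives the requested per-file delay.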

Related

Run a script on all recently modified files in bash

I would like to:
Find latest modified file in a folder
Change some files in the folder
Find all files modified after file of step 1
Run a script on these files from step 2
This is where I've ended up:
#!/bin/bash
var=$(find /home -type f -exec stat \{} --printf="%y\n" \; |
sort -n -r |
head -n 1)
echo "$var"
sudo touch -d "$var" /home/foo
find /home/ -newer /home/foo
Can anybody help me achieve these actions?
Use inotifywait instead to monitor files and check for changes
inotifywait -m -q -e modify --format "%f" {Path_To__Monitored_Directory}
Also, you can make it output to a file, loop over its contents and run your script on every entry.
inotifywait -m -q -e modify --format "%f" -o {Output_File} {Path_To_Monitored_Directory}
sample output:
file1
file2
Example
We are monitoring a directory named /tmp/dir which contains file1 and file2.
The following script monitors the whole directory and echoes the name of any modified file:
#!/bin/bash
while read ch
do
echo "File modified= $ch"
done < <(inotifywait -m -q -e modify --format "%f" /tmp/dir)
Run this script and modify file1 (echo "123" > /tmp/dir/file1); the script will output the following:
File modified= file1
Also, you can look at this Stack Overflow answer.
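If you want to run your own script on every entry, as suggested above, a minimal sketch (the script name your_script.sh is a placeholder, not part of the original answer):
#!/bin/bash
# watch /tmp/dir and run a script on every file reported as modified
inotifywait -m -q -e modify --format "%f" /tmp/dir | while read -r file
do
    ./your_script.sh "/tmp/dir/$file"
done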

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files from a big folder with many files, where the files contain a specific string. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problem, but when I want to do it with a loop over the files in the following script it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the String I want to find inside the files, INPUT_DIR is the folder where the files are located, OUTPUT_DIR is the folder where the found files should be copied to. Is there something wrong with the while do?
EDIT:
Thanks for the suggestions! I took this one now, because it also looks for files in subfolders and saves a list with all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
echo "${file##*/}"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
Better to implement it like below with a find command:
find "${INPUT_DIR}" -name "*.*" | xargs grep -l "${SEARCH_QUERY}" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option:
grep -l "${SEARCH_QUERY}" "${INPUT_DIR}/*.*" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
if you do not mind doing it in just one line, then
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
guide:
-l just print file name and nothing else
-r search recursively the CWD and all sub-directories
match any of these words: 'ONE' or 'TWO' or 'THREE'
| pipe the output of grep to xargs
-I xxx the name of each file is stored in xxx; it is just a placeholder
-P 0 run the commands (= cp) in parallel (= as fast as possible)
cp each file xxx to the dist directory
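If the file names can contain spaces, a null-delimited variant (a sketch, not part of the original answer) is safer:
grep -lrZ 'ONE\|TWO\|THREE' . | xargs -0 -I xxx -P 0 cp xxx dist/
Here -Z makes grep terminate each file name with a NUL byte, and -0 tells xargs to expect that.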
If I understand the behavior of ag correctly, then to solve the problem in your loop you have to either
adjust the read delimiter to '\n' or
use ag -0 -l to force delimiting by '\0'.
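For instance, a minimal sketch of the second option (assuming ag's -0/--print0 flag; ${file##*/} strips the directory part, as in your edit above):
ag -0 -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while IFS= read -r -d '' file; do
    echo "$file"
    cp "$file" "${OUTPUT_DIR}/${file##*/}"
done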
Alternatively, you can use the following script, which is based on find instead of ag.
while read file; do
echo "$file"
cp "$file" "$OUTPUT_DIR/$file"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)

Bash script delete file inside another folder if not present in both

The goal of the script is to check to see if a filename exists inside a folder. If the file name does NOT exist, then delete the file.
This is the script I have so far:
#!/bin/bash
echo "What's the folder name?"
read folderName
fileLocation="/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"
for files in "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName"/*
do
fileNameWithFormatFiles=${files##*$folderName/}
fileNameFiles=${fileNameWithFormatFiles%%.png*}
for entry in "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"/*
do
fileNameWithFormat=${entry##*$folderName/}
fileName=${fileNameWithFormat%%.png*}
if [ -f "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$fileNameFiles.jpg" ]
then
echo "Found File"
else
echo "$files"
rm -f "$files"
fi
done
done
read
I have two folders, alignedImages and trainingImages.
All of the images in alignedImages will be inside trainingImages, but not the other way around. So, I'm trying to make it so that if trainingImages does not contain a file with the same name as the file in alignedImages, then I want it to delete the file in trainingImages.
Also, the pictures are not the same, so I can't just compare md5's or hashes or whatever. Just the file names would be the same, except they are .jpg instead of .png
echo "What's the folder name?"
read folderName
fileLocation="/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"
rsync -r --delete --ignore-existing "$fileLocation/" "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName/"
The rsync command is what you are looking for; when given the --delete option it will delete from the destination dir any file that doesn't exist in the source dir, and --ignore-existing will make rsync skip copying files from the source if a file with the same name already exists in the destination dir.
The side effect of this is that it would copy any file that is in the source dir but not in the destination. You say all files in the source are in the destination, so I guess that's OK.
There is a better way: file lists, not for loops!
#!/bin/bash
echo "What's the folder name?"
read folderName
cd "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"
find . -type f -name "*.png" | sed 's/\.png//' > /tmp/align.list
cd "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName"
find . -type f -name "*.jpg" | sed 's/\.jpg//' > /tmp/train.list
here's how to find files in both lists:
fgrep -f /tmp/align.list /tmp/train.list | sed 's/.*/&.jpg/' > /tmp/train_and_align.list
fgrep -v finds non-matches instead of matches: files in train but not in align:
fgrep -v -f /tmp/align.list /tmp/train.list | sed 's/.*/&.jpg/' > /tmp/train_not_align.list
test delete of all files in train_not_align.list:
cd "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName"
cat /tmp/train_not_align.list | tr '\n' '\0' | xargs -0 echo rm -f
(if this produces good output, remove the echo statement to actually delete those files.)
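As a variant (a sketch, not from the original answer), comm on sorted lists gives the exact set difference, whereas fgrep -f also matches substrings:
# names present in train.list but not in align.list
comm -13 <(sort /tmp/align.list) <(sort /tmp/train.list) | sed 's/.*/&.jpg/' > /tmp/train_not_align.list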

Please help. I need to add a log file to this multi-threaded rsync script

# Define source, target, maxdepth and cd to source
source="/media"
target="/tmp"
depth=20
cd "${source}"
# Set the maximum number of concurrent rsync threads
maxthreads=5
# How long to wait before checking the number of rsync threads again
sleeptime=5
# Find all folders in the source directory within the maxdepth level
find . -maxdepth ${depth} -type d | while read dir
do
# Make sure to ignore the parent folder
if [ `echo "${dir}" | awk -F'/' '{print NF}'` -gt ${depth} ]
then
# Strip leading dot slash
subfolder=$(echo "${dir}" | sed 's#^\./##g')
if [ ! -d "${target}/${subfolder}" ]
then
# Create destination folder and set ownership and permissions to match source
mkdir -p "${target}/${subfolder}"
chown --reference="${source}/${subfolder}" "${target}/${subfolder}"
chmod --reference="${source}/${subfolder}" "${target}/${subfolder}"
fi
# Make sure the number of rsync threads running is below the threshold
while [ `ps -ef | grep -c [r]sync` -gt ${maxthreads} ]
do
echo "Sleeping ${sleeptime} seconds"
sleep ${sleeptime}
done
# Run rsync in background for the current subfolder and move on to the next one
nohup rsync -au "${source}/${subfolder}/" "${target}/${subfolder}/" \
    </dev/null >/dev/null 2>&1 &
fi
done
# Find all files above the maxdepth level and rsync them as well
find . -maxdepth ${depth} -type f -print0 | rsync -au --files-from=- --from0 ./ "${target}/"
Thank you for all your help. By adding the -v switch to rsync, I solved the problem.
Not sure if this is what you are after (I don't know what rsync is), but can you not just run the script as,
./myscript > logfile.log
or
./myscript | tee logfile.log
(ie: pipe to tee if you want to see the output as it goes along)?
Alternatively... not sure this is what real coders do, but you could append the output of each command in the script to a logfile, eg:
#at the beginning define the logfile name:
logfile="logfile"
#remove the file if it exists
if [ -a ${logfile}.log ]; then rm -i ${logfile}.log; fi
#for each command that you want to capture the output of, use >> $logfile
#eg:
mkdir -p "${target}/${subfolder}" >> ${logfile}.log
If rsync has several threads with names, I imagine you could store to separate logfiles as >> ${logfile}${thread}.log and concatenate the files at the end into 1 logfile.
Hope that is helpful? (am new to answering things - so I apologise if what I post is basic/bad, or if you already considered these ideas!)
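Putting these ideas together with the -v switch mentioned above, a hedged sketch of how the background rsync call in the script could write to its own logfile (the log directory /tmp/rsync-logs is an assumption, not from the original post):
# inside the loop, replacing the silent redirection with a per-subfolder log
logdir="/tmp/rsync-logs"    # assumed location for the logs
mkdir -p "${logdir}"
sublog="${logdir}/$(echo "${subfolder}" | tr '/' '_').log"
nohup rsync -auv "${source}/${subfolder}/" "${target}/${subfolder}/" \
    </dev/null >>"${sublog}" 2>&1 &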

shell script for 24 hrs data files upload to ftp ; unable to identify the issue

file_list=$( find . -type f -name * -mtime -1 )ftp -n << EOF
open ftpip
user uname pwd
cd directory
prompt
hash
bin
mput $file_list
bye
EOF
Unable to upload with the above script... it throws an "invalid command" error.
Aside from the problem with quoting the asterisk, and the fact that your "ftp" statement needs to start on a new line, I suspect your $file_list variable could get far too long to be handled well. I have made you a little script that uses "tar" to collect up the files you want into a single archive named after today's date. Then you can FTP that instead of 8 million files ;-)
Here you go:
#!/bin/bash
#
# Make dated tar file of everything from last 24 hrs, filename like "Backup2013-12-14.tgz"
#
FILENAME=`date +"Backup%Y-%m-%d.tgz"`
find . -type f -mtime -1 | tar -cvz -T - -f "$FILENAME"
ftp -n << EOF
open somehost
user joe bloggs
prompt
hash
bin
mput "$FILENAME"
bye
EOF
You either need to put * in quotes so it doesn't immediately expand, or remove
-name * altogether since matching every name is the default behaviour.
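For illustration, a minimal sketch of the corrected opening of the original script (the -name test dropped as suggested, and ftp started on its own line):
file_list=$( find . -type f -mtime -1 )
ftp -n << EOF
The rest of the heredoc can stay as in the question; the concern above about $file_list growing too long still applies.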
