What's the best way to store files on a remote host as they are created and remove the originals? - bash

Basically I want to move files to another server on creation, preserving the directory structure. I have a solution, but it lacks elegance. Also, I feel like I'm missing the obvious answer, so thanks in advance for your help, and I totally understand if this bores you.
The situation
I have a server with limited disk space (let's call it 'Tiny') and a storage server. Tiny creates files every once in a while. I want to store them automatically on the storage server and remove the originals when it's safe. I have to retain Tiny's directory structure, and I don't know in advance what it looks like. That is, all files are created in the directory /some/dir/, but subdirectories of it are created on the fly. They should be stored in /other/fold/ on the storage server, preserving the substructure under /some/dir. E.g.:
/some/dir/bla/foo/bar/snap_001a on Tiny ---> becomes /other/fold/bla/foo/bar/snap_001a on the storage server. The files are all called snap_xxxx, where xxxx is a four-character alphanumeric string.
My old solution
My idea was to loop over the files and scp them. Once scp finishes and returns without error, the files on Tiny are removed with rm.
#!/bin/bash
# This is invoked by a cron job every once in a while.
files=$(find /some/dir/ -name 'snap_*')
IFS='
'
for current in $files; do
    name=$(basename "$current")          # get the base name (i.e. strip the directory)
    dir=$(dirname "$current")            # directory of the current file on Tiny
    dir=${dir/\/some\/dir/\/other\/fold} # replace the root on Tiny with the root on the storage server
    ssh -i keyfile myuser@storage.server.net \
        mkdir -p "$dir"                  # create the directory (and any parents) on the storage server
    scp -i keyfile "$current" myuser@storage.server.net:"$dir/$name" \
        && rm "$current"                 # remove the original on success
done
This, however, strikes me as unnecessarily complicated and maybe error-prone. I thought of rsync, but when copying single files there is no option to create a directory and its parents if they don't exist. Does anyone have an idea better than mine?
What I ended up using after this thread
rsync -av --remove-sent-files --prune-empty-dirs \
-e 'ssh -i /full/path/to/keyfile' \
--include="*/" --include="snap_*" --exclude="*" \
/some/dir/ myuser@storage.server.com:/other/fold/
More recent versions than the one I was using take --remove-source-files instead of --remove-sent-files. The newer name is more descriptive, in that it's clearer which files are deleted. Also, --dry-run is a good option to test your parameters BEFORE actually using rsync.
Thanks to Alex Howansky for the solution and to Douglas Leeder for caring!

How do I tell rsync just to copy the snap_xxxx files?
See the --include option.
How do I change the directory root?
Just specify it on the command line.
rsync [options] source_dir dest_host:dest_dir
How do I delete the originals on Tiny after transfer to the storage server?
See the --remove-source-files option.

Maybe something like:
touch /tmp/start
rsync -va /some/dir/ /other/fold
find /some/dir -type f -not -newer /tmp/start -print0 | xargs -0 rm

Related

Operating on multiple specific folders at once with cp and rm commands

I'm new to linux (using bash) and I wanted to ask about something that I do often while I work, I'll give two examples.
Deleting multiple specific folders inside a certain directory.
Copying multiple specific folders into a certain directory.
I successfully did this with files, using find with some regex and then using -exec and -delete. But for folders I found it more problematic, because I had problems piping the list of folders to the cp/rm command successfully, each time getting a "No such file or directory" error.
Looking online I found the following command (in my case for copying all folders starting with a Z):
cp -r $(ls -A | grep "Z*") destination
But when I execute it, it says nothing and the prompt won't come back until I hit Ctrl+C, and nothing is copied.
How can I achieve what I'm looking for? For both cp and rm.
Thanks in advance!
First of all, you are trying to grep "Z*", but as a regular expression that means "zero or more Z's", so it matches every name; you probably want "^Z" to match names starting with Z.
Also, try executing ls -A in a terminal: you will get multiple columns. I think you need at least ls -1A to print the results one per line.
So for your command, try something like:
cp -r $(ls -1A | grep "^Z") destination
or
cp -r $(ls -1A | grep "^Z") -t destination
But all of the above just corrects the syntax of your example.
It is much better to use find. With -exec, the {} placeholder is passed as a single argument, so it does not need extra quoting:
find <PATH_FROM> -type d -exec cp -r {} -t target \;
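As a sanity check, here is the find-based copy run against a throwaway tree (the names Za, Zb, Other and dest are invented for this demo, and the -t flag is the GNU coreutils form):

```shell
#!/bin/bash
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p Za Zb Other dest
touch Za/f1 Zb/f2 Other/f3

# Copy only the top-level directories whose names start with Z into dest/,
# recursively, leaving everything else alone.
find . -maxdepth 1 -type d -name 'Z*' -exec cp -r {} -t dest \;
```

dest/ ends up with Za and Zb (and their files), while Other is skipped.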

How to `scp` directory preserving structure but only pick certain files?

I need to secure copy (scp) a directory to a remote host from the UNIX command line, with its sub-structure preserved. The subdirectories have identically named files that I WANT and a bunch of other stuff that I don't. Here is what the structure looks like.
directorytocopy
    subdir1
        1.wanted
        2.wanted
        ...
        1.unwanted
        2.notwanted
    subdir2
        1.wanted
        2.wanted
        ...
        1.unwanted
        2.notwanted
    ..
I just want the .wanted files preserving the directory structure. I realize that it is possible to write a shell (I am using bash) script to do this. Is it possible to do this in a less brute force way? I cannot copy the whole thing and delete the unwanted files because I do not have enough space.
Adrian has the best idea to use rsync. You can also use tar to bundle the wanted files:
cd directorytocopy
shopt -s nullglob globstar
tar -cf - **/*.wanted | ssh destination 'cd dirToPaste && tar -xvf -'
Here, using tar's -f option with the filename - to use stdin/stdout as the archive file.
This is untested, and may fail because the archive may not contain the actual subdirectories that hold the "wanted" files.
Assuming GNU tar on the source machine, and assuming that filenames of the wanted files won't contain newlines and they are short enough to fit the tar headers:
find /some/directory -type f -name '*.wanted' | \
tar cf - --files-from - | \
ssh user@host 'cd /some/other/dir && tar xvpf -'
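The same pipeline can be tried locally by replacing the ssh hop with a plain subshell. Temp directories stand in for the two machines here, and this assumes a tar that understands --files-from (e.g. GNU tar):

```shell
#!/bin/bash
set -e
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/subdir1" "$src/subdir2"
touch "$src/subdir1/1.wanted" "$src/subdir1/1.unwanted" "$src/subdir2/2.wanted"

# find feeds the name list to tar's stdin; tar writes the archive to stdout,
# and the receiving tar recreates the subdirectories it needs.
( cd "$src" && find . -type f -name '*.wanted' | tar cf - --files-from - ) \
    | ( cd "$dst" && tar xf - )
```

Only the .wanted files arrive, inside their original subdirectories.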
rsync with an --exclude/--include list, following @Adrian Frühwirth's suggestion, would be a way to do this.

BASH: Copy all files and directories into another directory in the same parent directory

I'm trying to make a simple script that copies all of my $HOME into another folder in $HOME called Backup/. This includes all hidden files and folders, and excludes Backup/ itself. What I have right now for the copying part is the following:
shopt -s dotglob
for file in $HOME/*
do
cp -r $file $HOME/Backup/
done
Bash tells me that it cannot copy Backup/ into itself. However, when I check the contents of $HOME/Backup/ I see that $HOME/Backup/Backup/ exists.
The copy of Backup/ inside itself is useless. How can I get bash to copy over all the folders except Backup/? I tried using extglob with cp -r $HOME/!(Backup)/ but it didn't copy over the hidden files that I need.
Try rsync. You can exclude files/directories.
This is a good reference:
http://www.maclife.com/article/columns/terminal_101_using_rsync_locally
Hugo,
A script like this is good, but you could try this:
cp -r * Backup/;
cp -r .* Backup/;
Another tool used with backups is tar. This compresses your backup to save disk space.
Also note that the * glob does not match hidden (dot) files.
I agree that using rsync would be a better solution, but there is an easy way to skip a directory in bash:
for file in "$HOME/"*
do
[[ $file = $HOME/Backup ]] && continue
cp -r "$file" "$HOME/Backup/"
done
This doesn't answer your question directly (the other answers already did that), but try cp -ua when you want to use cp to make a backup. This recurses directories, copies rather than follows links, preserves permissions and only copies a file if it is newer than the copy at the destination.
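A tiny illustration of that behaviour with throwaway files (the sleep is only there to make the second write visibly newer than the first copy):

```shell
#!/bin/bash
set -e
src=$(mktemp -d); dst=$(mktemp -d)
echo v1 > "$src/a.txt"
cp -ua "$src/." "$dst/"    # first run: copies everything
sleep 1
echo v2 > "$src/a.txt"
cp -ua "$src/." "$dst/"    # second run: re-copies a.txt because the source is newer
```

Files that were not touched between the two runs are skipped entirely on the second pass, which is what makes -u useful for repeated backups.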

rsync : Recursively sync all files while ignoring the directory structure

I am trying to create a bash script for syncing music from my desktop to a mobile device. The desktop is the source.
Is there a way to make rsync recursively sync files but ignore the directory structure? If a file was deleted from the desktop, I want it to be deleted on the device as well.
The directory structure on my desktop is something like this.
Artist1/
Artist1/art1_track1.mp3
Artist1/art1_track2.mp3
Artist1/art1_track3.mp3
Artist2/
Artist2/art2_track1.mp3
Artist2/art2_track2.mp3
Artist2/art2_track3.mp3
...
The directory structure that I want on the device is:
Music/
art1_track1.mp3
art1_track2.mp3
art1_track3.mp3
art2_track1.mp3
art2_track2.mp3
art2_track3.mp3
...
Simply:
rsync -a --delete --include='*.mp3' --exclude='*' \
    pathToSongs/Theme*/Artist*/. destuser@desthost:Music/.
would do the job if your path hierarchy has a fixed number of levels.
WARNING: if two song files have exactly the same name, they will land in the same destination directory and your backup will be missing one of them!
Otherwise, and to answer your question strictly (ignoring the directory structure), you could use bash's shopt -s globstar feature:
shopt -s globstar
rsync -a --delete --include='*.mp3' --exclude='*' \
    pathToSongsRoot/**/. destuser@desthost:Music/.
Either way, there is no need to fork a find command.
Recursively sync all files while ignoring the directory structure
To answer the question strictly, it must not be limited to one extension:
shopt -s globstar
rsync -d --delete sourceRoot/**/. destuser@desthost:destRoot/.
With this, directories will be copied too, but without their content. All files and directories end up stored at the same level under destRoot/.
WARNING: if files with the same name exist in different directories, they will simply overwrite each other at the destination during the rsync, and only one of them (effectively a random one) is kept.
Maybe this is a recent option, but I see --no-relative mentioned in the documentation for --files-from, and it worked great.
find SourceDir -name \*.mp3 | rsync -av --files-from - --no-relative . DestinationDir/
The answer to your question: No, rsync cannot do this alone. But with some help of other tools, we can get there... After a few tries I came up with this:
rsync -d --delete $(find . -type d | while read d; do echo "$d/"; done) /targetDirectory && rmdir /targetDirectory/* 2>&-
The difficulty is this: To enable deletion of files at the target position, you need to:
specify directories as sources for rsync (it doesn't delete if the source is a list of files).
give it the complete list of sources at once (rsync within a loop will give you the contents of the last directory only at the target).
end the directory names with a slash (otherwise it creates the directories at the target directory)
So the command substitution (the stuff enclosed with the $( )) does this: It finds all directories and adds a slash (/) at the end of the directory names. Now rsync sees a list of source directories, all terminated with a slash and so copies their contents to the target directory. The option -d tells it, not to copy recursively.
The second trick is the rmdir /targetDirectory/* which removes the empty directories which rsync created (although we didn't ask it to do that).
I tested that here, and deletion of files removed in the source tree worked just fine.
If you can make a list of files, you've already solved the problem.
Try:
find /path/to/src/ -name \*.mp3 > list.txt
rsync -avi --no-relative --progress --files-from=list.txt / user@server:/path/to/dest
If you run the script again for new files, it will only copy the missing files.
If you don't like the list, then try a single command (but the logic is different):
find /path/to/src/ -name \*.mp3 -type f \
-exec rsync -avi --progress {} user@server:/path/to/dest/ \;
In this case, rsync is invoked once per file, every time, since with this form you cannot build the file list beforehand.

How to emulate cp and mv --parents on OS X

OS X's mv and cp do not have the --parents option, so how does one emulate it?
I.e. mv x/y/a.txt s/x/y/a.txt when s is empty gives a "no such directory" error unless one does a mkdir first, which is rather cumbersome when trying to do this for thousands of files.
The solution (which works on all platforms that has an rsync) is:
Use find or some other tool to create a file listing the files you want moved/copied, i.e.
find . -name '*.mp3' > files.txt
Then use rsync's --files-from to point at the list. With --remove-source-files it behaves like mv --parents, and without it, like cp --parents:
rsync --files-from=files.txt --remove-source-files src dest
Slower than a native mv/cp, but it's resumable, and rsync has a lot more options that can otherwise help with cleaning up your files.
