Rsync only two directories - bash

I want to work locally on a program that I test and run on a remote server. The only files I edit are the *.hpp and *.cpp files in the src and include directories.
To upload only the necessary source files, I tried this rsync command:
rsync --dry-run -av --exclude '*' --include 'src/*.cpp' --include 'include/*.hpp' Programm/ user@remote:/home/user/Programm
But for some reason no files are transferred to the server after I make local changes.
Any hints appreciated!
Thank you

Here's an excerpt from the rsync man page that addresses your exact problem.
Note that, when using the --recursive (-r) option (which is implied by
-a), every subdir component of every path is visited left to right, with each directory having a chance for exclusion before its content.
In this way include/exclude patterns are applied recursively to the
pathname of each node in the filesystem's tree (those inside the
transfer). The exclude patterns short-circuit the directory traversal
stage as rsync finds the files to send.
For instance, to include "/foo/bar/baz", the directories "/foo" and
"/foo/bar" must not be excluded. Excluding one of those parent
directories prevents the examination of its content, cutting off
rsync's recursion into those paths and rendering the include for
"/foo/bar/baz" ineffectual (since rsync can't match something it never
sees in the cut-off section of the directory hierarchy).
The concept path exclusion is particularly important when using a
trailing '*' rule. For instance, this won't work:
+ /some/path/this-file-will-not-be-found
+ /file-is-included
- *
This fails because the parent directory "some" is excluded by the '*' rule, so rsync never visits any of the files in the "some" or
"some/path" directories. One solution is to ask for all directories in
the hierarchy to be included by using a single rule: "+ */" (put it
somewhere before the "- *" rule), and perhaps use the
--prune-empty-dirs option. Another solution is to add specific include rules for all the parent dirs that need to be visited. For instance,
this set of rules works fine:
+ /some/
+ /some/path/
+ /some/path/this-file-is-found
+ /file-also-included
- *
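Applied to the command in the question, the second solution from the man page might look like the sketch below. The function name sync_sources is made up for illustration, and the patterns assume that src and include sit directly inside the project directory.

```shell
# Hypothetical helper (not part of rsync): upload only src/*.cpp and include/*.hpp.
# $1 = local project directory, $2 = destination (local path or user@host:path)
sync_sources() {
    # Filter rules are checked in order and the first match wins, so the
    # includes for the parent directories and their files have to come
    # before the catch-all exclude.
    rsync -av \
        --include='/src/' --include='/src/*.cpp' \
        --include='/include/' --include='/include/*.hpp' \
        --exclude='*' \
        "${1%/}/" "$2"
}
```

For the setup in the question this would be called as sync_sources Programm user@remote:/home/user/Programm. Adding --dry-run to the rsync call first shows what would be sent without changing anything.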

Related

LFTP wildcard source folder

I'm trying to use LFTP in my GitLab continuous integration setup so I can mirror JSON files to my destination. However, I'd like to mirror only a set of folders selected with a wildcard, but I cannot seem to get this working.
I tried this mirror command configuration in LFTP, but this results in a "No such file or directory" error. I assume I'm parsing the wildcard wrong somehow.
What I tried: lftp -c "set sftp:auto-confirm true; open sftp://$DEVELOPMENT_DEPLOY_USER:$DEVELOPMENT_DEPLOY_PASSWORD@$DEVELOPMENT_DEPLOY_HOST:$DEVELOPMENT_DEPLOY_PORT; mirror -Rev ./somefolder_* $DEVELOPMENT_DESTINATION_FOLDER --ignore-time --parallel=10 --exclude .* --exclude .*/ --include ./*.json"
Results in:
/home/gitlab-runner/builds/82ffc821/0/somegroup/someproject/somefolder_*: No such file or directory
I'm probably missing something obvious. Would appreciate any help.
This may be an old question, but I found this in the man page:
Include and exclude options can be specified multiple times. It means that a file or directory would be mirrored if it matches an include and does not match to excludes after the include, or does not match anything and the first check is exclude. Directories are matched with a slash appended.
So the "include" argument is applied first and the excludes (" --exclude .* --exclude .*/") last. After those patterns are applied, no file is left to mirror. Use "--verbose" to check which files lftp touches.
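Separately from the include/exclude order, the error message in the question shows that the pattern somefolder_* reached lftp unexpanded, since it sits inside a double-quoted string. One possible workaround, assuming mirror takes a single source directory per call, is to let the shell expand the wildcard and emit one mirror command per directory. The helper name mirror_commands is hypothetical, the target layout (one remote subfolder per local folder) is an assumption, and the include/exclude options are left out on purpose.

```shell
# Hypothetical helper: print one lftp "mirror" command per local directory.
# $1 = remote destination folder, remaining arguments = candidate local paths.
# Assumes directory names contain no spaces or lftp metacharacters.
mirror_commands() {
    dest=$1
    shift
    for dir in "$@"; do
        # Skip plain files and patterns that matched nothing.
        [ -d "$dir" ] || continue
        printf 'mirror -Rev --ignore-time --parallel=10 %s %s/%s\n' \
            "$dir" "${dest%/}" "$(basename "$dir")"
    done
}

# Usage sketch (not run here); the unquoted wildcard is expanded by the shell:
#   mirror_commands "$DEVELOPMENT_DESTINATION_FOLDER" ./somefolder_* > mirror.lftp
# The generated lines would then follow the "open" command in the lftp script.
```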

Update using rsync and remove from the source folder

I want to rsync contents from /local/path to server:/remote/path.
The file names end with extensions made up of 4 digits.
If a file does not exist in the remote path, copy it to the remote and remove it from local.
If a file exists in the remote path and its size is no less than the local one, do not copy it to the remote, and remove it from local.
I tried
rsync -avmhP --include='*.[0-9][0-9][0-9][0-9]' --include='*/' --exclude='*' --size-only --remove-source-files /local/path server:/remote/path
However, some files existing in the remote path remain in local path.
Another question: why do we need --include='*/' --exclude='*'? Why doesn't --include='*.[0-9][0-9][0-9][0-9]' alone work for the file filtering?
Do you mean --remove-sent-files instead of --remove-source-files?
According to the rsync man page:
--remove-sent-files
This tells rsync to remove from the sending side the files and/or symlinks that are newly created or whose content is updated on the receiving side. Directories and devices are not removed, nor are files/symlinks whose attributes are merely changed.
That means only transferred files (the ones whose size changed) are deleted from the source. For the include to have any effect, you first need to exclude everything except the include pattern. The three arguments you used mean "exclude all files (--exclude='*') except directories (--include='*/') and the files matching my pattern (--include='*.[0-9][0-9][0-9][0-9]')".
From the man page:
--include=PATTERN
don’t exclude files matching PATTERN
--exclude=PATTERN
exclude files matching PATTERN
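The interplay of the three filter options can be tried on local directories with a sketch like the one below. The function name copy_numbered is made up, and --size-only and --remove-source-files are left out so that only the filtering is shown.

```shell
# Hypothetical helper: copy only files whose extension is exactly four digits.
# $1 = source directory, $2 = destination
copy_numbered() {
    # An include by itself drops nothing, because files that match no rule
    # are transferred anyway. The final exclude removes everything else,
    # and '*/' keeps directories visible so rsync can still recurse.
    # -m prunes directories that end up empty on the receiving side.
    rsync -am \
        --include='*.[0-9][0-9][0-9][0-9]' \
        --include='*/' \
        --exclude='*' \
        "${1%/}/" "$2"
}
```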

wget - prevent creating empty directories

Is there a way to stop wget from creating empty directories? Most of the files I need are found at one level of depth, i.e. in folder 2 of /1/2/, but I need to use infinite recursion because sometimes the file I need is at 1/2/3/ or deeper. Or at least, I need infinite recursion for the time being, until I figure out the maximum depth of where the files of interest are located.
Right now I'm using
wget -nH --cut-dirs=3 -rl 0 -A "*assembly*.txt" ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria
Which gets all the files I need, but I am left with a bunch of empty directories. I would prefer the directory structure /bacteria/organism/*assembly*.txt, but if creating multiple subdirectories cannot be avoided, I want to at least stop wget from creating empty directories. I can, of course, remove the empty directories after running wget, but I want to stop wget from creating them in the first place if possible
Short answer: you can't prevent the directories from being created.
You can do post-processing on the directories though:
find bacteria/ -type d -empty -exec rmdir {} \;
Looking at a bunch of these directories (including the very busy one for E. coli) it appears, as you said, that the only files matching *assembly*.txt are stored in the first directory below bacteria. Unless there's some variation to this rule, you could just do this:
wget -nH --cut-dirs=2 -rl 2 -A "*assembly*.txt" ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria
BTW if you want your directory structure to start at bacteria/ you'll need to change --cut-dirs to 2 instead of 3.
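One caveat about the find post-processing step: a directory that holds nothing but empty subdirectories is not yet empty when find reaches it, so it survives a single pass. A variant that handles nesting, sketched here with a made-up function name and assuming a find that supports -empty and -mindepth (GNU and BSD find do), is:

```shell
# Hypothetical helper: remove all empty directories below $1, including
# directories that only become empty once their children are removed.
prune_empty_dirs() {
    # -depth processes a directory's contents before the directory itself;
    # -mindepth 1 keeps the top-level directory even if it ends up empty.
    find "$1" -mindepth 1 -depth -type d -empty -exec rmdir {} \;
}
```

After the wget run this would be called as prune_empty_dirs bacteria.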

rsync subset of directories

I am trying to use include and exclude options in rsync to copy a directory structure, excluding most but not all of the subdirectories, based on a pattern in the directory names. But, it isn't working. It is trying to copy everything over instead of just the subfolders I want. Is my syntax wrong?
I have tried:
rsync -am --include='*/*/*MPRAGE*/' --exclude='*' /parent_directory/ /destination
Also:
rsync -am --include='*/' --include='*/*/*MPRAGE*/' --exclude='*' /parent/ /dest
MPRAGE is the pattern that is in the name of each folder I want copied. But these folders are three levels deep in the structure, and I want to keep the well-organized directory structure intact for these folders I want copied.
Thanks in advance for any tips.

rsync with folder and file name pattern matching to copy files

Right now I'm successfully running:
rsync -uvma --include="*/" --include="*.css" --exclude="*" $spec_dir $css_spec_dir
In a shell script which copies all of the files in the source directory, that are .css files, into a target directory.
I want to do the same for HTML files, but only where they are in a subfolder with the name 'template'.
So I'm in directory ~/foo, and I want to rsync where the --include="*/" only matches on subfolders with the name 'template'. So ~/foo/bar/template/baz/somefile.html would match, and so would ~/foo/bar/baz/qux/template/someotherfile.html, but NOT ~/foo/bar/thirdfile.html
Although it looks a little bit strange, this works for me:
rsync -uvma --include="*/" --include="*/template/*/*.html" --include="*/template/*.html" --include="template/*.html" --include="template/*/*.html" --exclude="*" $spec_dir $html_spec_dir
This one works for me:
rsync -umva --include="**/templates/**/*.html" --exclude="*.html" source/ target
Were you looking for **? Here you have to be careful about choosing your exclude pattern, * won't work as it matches directories on the way. If rsync finds foo/templates/some.html, it will first copy foo, then foo/templates and then foo/templates/some.html, but before it gets there * already matched foo and nothing gets copied.
Here's what worked:
rsync -uvma --include="*/" --include="templates/**.html" --exclude="*" $html_all_dir $html_dir
My guess is, your format and mine probably accomplish the same thing. I know I tried about 20 different patterns before this one, and this is the only one that worked properly. I don't think I tried your format though :)
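The last pattern can be checked against the layout from the question with a small local sketch. The function name sync_template_html is made up, and the folder name template follows the question (the answer above uses templates).

```shell
# Hypothetical helper: copy only .html files that live somewhere below a
# directory named "template", keeping the directory structure.
# $1 = source directory, $2 = destination
sync_template_html() {
    # '*/' lets rsync descend into every directory, 'template/**.html'
    # matches .html files at any depth below a "template" directory,
    # and the final exclude drops everything else. -m prunes directories
    # that would otherwise arrive empty.
    rsync -am \
        --include='*/' \
        --include='template/**.html' \
        --exclude='*' \
        "${1%/}/" "$2"
}
```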
