How to copy broken symlinks in a Ruby script

I want to copy all the contents from one directory to another (including broken symlinks) in my Ruby script. I am using FileUtils.cp_r 'src/.', 'dest' but it is complaining about the broken symlinks. Can someone please help me with this? It is a show-stopper for me right now.

FileUtils.cp_r internally copies the src folder recursively to dest. When it finds a symlink, it creates a symlink using the File#symlink method (see line 1369 of fileutils.rb).
The documentation of File#symlink states:
Creates a symbolic link called new_name for the existing file old_name. Raises a NotImplementedError exception on platforms that do not support symbolic links.
So it seems that FileUtils.cp_r cannot be relied on to copy a directory when one of the symlinks inside it is broken, i.e. points to a non-existent file.
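For reference, a minimal reproduction (assuming a POSIX platform; the exact exception raised depends on the Ruby version):

require 'fileutils'

# Build a source tree containing one dangling symlink.
Dir.mkdir('src') unless Dir.exist?('src')
File.symlink('no-such-file', 'src/dangling') unless File.symlink?('src/dangling')

FileUtils.cp_r 'src/.', 'dest'   # fails on affected Rubies, e.g. with Errno::ENOENT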
Workaround
You can execute the shell command cp -r from your Ruby script. This is not platform-independent and may be harder to debug, but it will get you past the scenario you consider a show-stopper.
src = "/path/to/src/dir"
dest = "/path/to/dest/dir"
`cp -r #{src} #{dest}`
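If you'd rather stay in Ruby, here is a minimal sketch of a manual tree walk that recreates each symlink by hand with File.readlink and File.symlink, both of which are perfectly happy with dangling targets (copy_with_symlinks is just an illustrative name; there is no error handling or permission preservation):

require 'fileutils'
require 'find'

# Copy src into dest, recreating symlinks (broken ones included)
# instead of following them.
def copy_with_symlinks(src, dest)
  Find.find(src) do |path|
    target = File.join(dest, path.delete_prefix(src))
    if File.symlink?(path)          # must be checked before directory?
      File.symlink(File.readlink(path), target)
    elsif File.directory?(path)
      FileUtils.mkdir_p(target)
    else
      FileUtils.cp(path, target)
    end
  end
end

copy_with_symlinks('src', 'dest')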

Related

Why is Snakemake not seeing symbolic link files?

I have a rule whose output files are symbolic link files. Even though the link files are being made, Snakemake exits with a MissingOutputException and lists the output files as being missing. If instead of making a symlink with "ln -s" I copy the files with "cp -p" it works. I tried increasing the --latency-wait but it made no difference.
Sounds like you are using a relative path for the source file when symlinking; use an absolute path. A relative target is resolved against the directory containing the link, not the directory you created it from, so a link made from the wrong working directory ends up dangling, and Snakemake treats broken symlinks as missing output.
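To illustrate the mechanics in Ruby (ln -s stores the target the same way): a relative target is recorded verbatim and resolved against the link's own directory, regardless of where you ran the command from.

require 'fileutils'

FileUtils.mkdir_p('out')

# 'data/file.txt' is stored verbatim and later resolved as out/data/file.txt,
# no matter what the working directory was at creation time.
File.symlink('data/file.txt', 'out/link.txt')

File.symlink?('out/link.txt')  #=> true: the link itself exists
File.exist?('out/link.txt')    #=> false unless out/data/file.txt also exists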

How to change SYMLINK to SYMLINKD in batch script

We're sharing SYMLINKD files on our git project. It almost works, except git modifies our SYMLINKD files to SYMLINK files when pulled on another machine.
To be clear, on the original machine, symlink is created using the command:
mklink /D Annotations ..\..\submodules\Annotations\Assets
On the original machine, the dir cmd displays:
25/04/2018 09:52 <SYMLINKD> Annotations [..\..\submodules\Annotations\Assets]
After cloning, on the receiving machine, we get
27/04/2018 10:52 <SYMLINK> Annotations [..\..\submodules\Annotations\Assets]
As you might guess, a symlink of type file pointing at a directory [..\..\submodules\Annotations\Assets] does not work correctly.
To fix this problem we either need to:
1. Prevent git from modifying our symlink types.
2. Fix our symlinks with a batch script triggered by a githook.
We're going with 2, since we do not want to require all users to use a modified version of git.
My limited knowledge of batch scripting is impeding me. So far, I have looked into simply modifying the attrib of the file, using the info here:
How to get attributes of a file using batch file and https://superuser.com/questions/653951/how-to-remove-read-only-attribute-recursively-on-windows-7.
Can anyone suggest what attrib commands I need to modify the symlink?
Alternatively, I realise I can delete and recreate the symlink, but how do I get the target directory for the existing symlink short of using the dir command and parsing the path from the output?
I think it's https://github.com/git-for-windows/git/issues/1646.
To be clearer: your question appears to be a manifestation of the XY problem. The Git instance used to clone/fetch the project appears to mishandle symbolic links to directories, creating symbolic links pointing to files instead. That looks like a bug in Git for Windows, but instead of digging into it you've invented a workaround and are asking how to make the workaround work.
So I'd first try to help the GfW maintainers and whoever reported #1646 to fix the actual problem. If you need a stop-gap solution, a proper way would be to go another route and script several calls to git ls-tree to figure out which entries are directory symlinks (they'd have a special set of permission bits; you may start here).
You would then traverse all the tree objects of the HEAD commit, recursively, figure out which symlinks point at directories, and fix up the matching entries in the work tree by deleting them and recreating them with mklink /D (or whatever creates the correct sort of symlink).
Unfortunately, I'm afraid trying to script this with the limited scripting facilities of cmd.exe would be an exercise in futility. I'd pick a more capable programming language (PowerShell, for example, or, since you're probably a Windows shop, anything .NET).
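For illustration only, a rough Ruby sketch of that approach (the answer suggests PowerShell or .NET; any real language works). It assumes it runs from the repository root on Windows, and that creating symlinks via mklink is permitted (elevated prompt or Developer Mode):

# Find symlink entries (mode 120000) in HEAD and recreate the ones
# that point at directories as directory symlinks.
entries = `git ls-tree -r HEAD`.lines.filter_map do |line|
  mode, _type, sha, path = line.chomp.split(/\s+/, 4)
  [sha, path] if mode == '120000'
end

entries.each do |sha, path|
  target = `git cat-file blob #{sha}`.strip   # link target as stored in the repo
  next unless File.directory?(File.expand_path(target, File.dirname(path)))
  File.delete(path)                           # drop the wrong, file-type symlink
  # mklink is a cmd.exe built-in, so it must be invoked through cmd /c.
  system('cmd', '/c', 'mklink', '/D',
         path.tr('/', '\\'), target.tr('/', '\\'))
end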

Can lftp follow symbolic directories?

lftp can get the files which symbolic links point to, but can it get these files if they are in directories represented by symbolic links? For example, I am looking to get files at
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Acinetobacter_nosocomialis/all_assembly_versions/GCA_000162375.2_Acin_sp_RUH2624_V1/
where /GCA_000162375.2_Acin_sp_RUH2624_V1/ is a symbolic link to a directory.
I tried adding set ftp:list-options "-La" to ~/.lftprc, ~/.lftp/rc, and /etc/lftp.conf.
This is the command I am using:
lftp -c 'open -e "mirror -c -p --no-empty-dirs -I *.gz /genomes/genbank/bacteria/Acinetobacter_nosocomialis/ ~/ncbi_bacteria_mirror" ftp.ncbi.nlm.nih.gov'
This command DOES work on ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Pseudomonas_sp._URMO17WK12_I11/all_assembly_versions/
where /all_assembly_versions/ is not a symbolic link. It does not, however, recursively follow the symlinked directories contained within and fetch the files from them, which I would like it to do if possible.
Resolving a file name to the file on disk, and thereby resolving symbolic links, is done on the server. So there is no need for the FTP client to resolve these links, and it will not even attempt to, e.g. by trying to guess the format of the directory listing.
Unless I'm misreading the question, the answer is hidden in it: the -L flag is what you want.
Quoting lftp's man page:
-L, --dereference download symbolic links as files
That's really it!
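So, adding -L to the mirror invocation from the question:

lftp -c 'open -e "mirror -c -p -L --no-empty-dirs -I *.gz /genomes/genbank/bacteria/Acinetobacter_nosocomialis/ ~/ncbi_bacteria_mirror" ftp.ncbi.nlm.nih.gov'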

Rsync create symbolic links only

I currently have rsync working well. It copies all my files from one directory to another directory. The only thing is it is physically copying the files.
I have a lot of large files and I don't want duplicates of all of them. I just want to create symbolic links in the new directory so that I can serve the data on a webpage. The source directory has some scripts and files I don't want the public to see, so I'm moving the safe data to the web root (destination).
What I would like is for rsync to create links in the destination for any new files in the source directory, so I'm not using up disk space the way I currently am. What I have works perfectly except for the symbolic-link aspect. Is there a way to have rsync track and create symbolic links?
rsync -aP --exclude="file.sql" --exclude="*~" --exclude=".*" --exclude="*.sh" . ${destination}
It's not a symlink, but you might be able to work with --link-dest=DIR. It creates hard links, which are simply additional names for the same files (see the example after this list). This will behave similarly to a softlink as long as:
Both files are on the same filesystem
You don't plan to delete the original and not the copy (the symlink would break but a hard-link won't)
You don't have anything explicitly checking to see if it's a softlink
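For example, adapting the command from the question (note that --link-dest should be given an absolute path, since a relative one is taken relative to the destination):

rsync -aP --link-dest="$PWD" --exclude="file.sql" --exclude="*~" --exclude=".*" --exclude="*.sh" . ${destination}

Because the link-dest directory here is the source itself, every file rsync would otherwise copy unchanged arrives in ${destination} as a hard link instead of a copy.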
You could use cp -aR -s (Linux or FreeBSD) or cp --archive --recursive --symbolic-link (Linux) to create symbolic links to the source files in the destination directory instead of copies. Note that -s is non-standard.
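For example (with GNU cp the source names must generally be absolute, otherwise the created links would be broken):

cp -aRs "$PWD/." ${destination}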
lndir may be useful to you. According to its manual, it creates a shadow directory of symbolic links to another directory tree.
I think master_delivery is probably the best tool for this. With the already-mentioned --link-dest option of rsync, files which are not identical will still be copied. If you don't mind copies and hardlinks being mixed, you can use rsync, but if you want to eliminate duplicates completely, use master_delivery.
Usage is:
gem install master_delivery
master_delivery -m <path_to_master> -d <path_to_delivery_root>

Copying directories recursively using shell script

Should be an easy question for the gurus here, though it's hard to explain it in text so hopefully this is clear. I've got two directories on a box with some flavor of unix on it. I've got a script that I want to use to move all the files and directories from one location to another.
First, an example of how the directories look:
Directory A: final/results/2012/2012-02/2012-02-25/name/files
Directory B: test/results/2012/2012-02/2012-02-24/name/files
So you see they're very similar. What I want to do is move everything from the Directory B 2012 directory, recursively, to the same level of Directory A. So you'd end up with:
someproject/results/2012/2012-02/2012-02-25/name/files
someproject/results/2012/2012-02/2012-02-24/name/files
etc.
I want this script to be future proof though, meaning I don't want the 2012 hardcoded. Also, towards the end of a month you will potentially have data from two different months and both need to be copied into the 2012 directory. So here is the command I used in the shell script file:
CONS="/someproject";
ROOT="/test";
/bin/cp -r ${ROOT}/results/* ${CONS}/results/*
but this resulted in:
/final/results/2012/2012-02/2012-02-25/name/files
and
/final/results/2012/2012/2012-02/2012-02-24/name/files
So as I hope is clear, it started a level below where I wanted it to. Can anyone fill me in on what I'm doing wrong, if you can follow what I'm trying to explain? My apologies if it's not clear. I'm sure this is a fairly simple fix, but shell scripting is not a strong point of mine.
One poster suggests rsync, which is overkill.
cp -rp will work fine. If you want to move the files, just mv the directory; it and everything under it will move too.
The only real problem here is the trailing *'s in the original command. You don't need them: you're just trying to pass directories to cp, not the names of all the files already in the source (and, more importantly, the destination). The destination glob ${CONS}/results/* expands to the existing ${CONS}/results/2012 directory, which is why everything landed one level deeper than intended.
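With the globs dropped, the command from the script becomes:

/bin/cp -r ${ROOT}/results ${CONS}

Since ${CONS}/results already exists, cp merges the copied tree into it, producing the layout given above as the desired result.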
You could also use a tool like rsync to make sure your source and target are synchronized.
rsync -av ${ROOT}/results/ ${CONS}/results/
You said you want to "move" the files, though, which means deleting the originals after they're copied:
rsync -av --remove-source-files ${ROOT}/results/ ${CONS}/results/
If you start playing around with rsync, be sure to read the man page about how it treats trailing slashes.
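The short version of the trailing-slash rule:

rsync -av ${ROOT}/results ${CONS}     # copies the directory itself, giving ${CONS}/results/...
rsync -av ${ROOT}/results/ ${CONS}    # copies only its contents, directly into ${CONS}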
