Ruby: copy directories recursively with link dereferencing

It's very strange: I cannot find any standard way in Ruby to copy a directory recursively while dereferencing symbolic links. The best I could find is FileUtils.cp_r, but it only supports dereferencing the root src directory.
copy_entry is the same, although the documentation falsely shows that it has a dereference option. In the source it is dereference_root, and it does only that.
Also I can't find a standard way to recurse into directories. If nothing good exists, I can write something myself but wanted something simple and tested to be portable across Windows and Unix.

The standard way to recurse into directories is to use the Find module, but I think you're going to have to write something. The built-in FileUtils methods are building blocks for normal operations, but your need is not normal.
I'd recommend looking at the Pathname class which comes with Ruby. It makes it easy to walk directories using find, look at the type of the file and dereference it if necessary. In particular symlink? will tell you if a file is a soft-link and realpath will resolve the link and return the path to the real file.
For instance I have a soft-link in my home directory from .vim to vim:
vim = Pathname.new ENV['HOME'] + '/.vim'
=> #<Pathname:/Users/ttm/.vim>
vim.realpath
=> #<Pathname:/Users/ttm/vim>
Pathname is quite powerful, and I found it very nice when having to do some major directory traversals and working with soft-links. The docs say:
The goal of this class is to manipulate file path information in a neater way than standard Ruby provides. [...]
All functionality from File, FileTest, and some from Dir and FileUtils is included, in an unsurprising way. It is essentially a facade for all of these, and more.
If you use find, you'll probably want to implement the prune method which is used to skip entries you don't want to recurse into. I couldn't find it in Pathname when I was writing code so I added it using something like:
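require 'find'
require 'pathname'

class Pathname
  # Skip the current directory when called inside a Pathname#find block
  def prune
    Find.prune
  end
end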
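For what it's worth, here is a minimal sketch of how the Pathname pieces mentioned above (realpath, directory?, children) could be combined into a dereferencing recursive copy. copy_dereferenced is a hypothetical helper name, not something FileUtils provides. It only handles plain files and directories, and it raises on dangling links and on link loops rather than trying to be clever:

```ruby
require 'fileutils'
require 'pathname'

# Hypothetical helper (not part of FileUtils): copy src to dst recursively,
# replacing every symlink with a copy of whatever it points to.
def copy_dereferenced(src, dst, seen = [])
  dst  = Pathname.new(dst)
  # realpath resolves all symlinks; it raises Errno::ENOENT for a dangling link
  real = Pathname.new(src).realpath

  if real.directory?
    # Refuse to descend into a directory we are already inside (link loop)
    raise ArgumentError, "symlink loop at #{src}" if seen.include?(real)

    dst.mkpath
    real.children.each do |child|
      copy_dereferenced(child, dst + child.basename, seen + [real])
    end
  else
    dst.dirname.mkpath
    FileUtils.cp(real.to_s, dst.to_s)   # copies the file contents, never the link
  end
end
```

Usage would be something like copy_dereferenced('project', 'backup/project'). Creating symlinks on Windows needs extra privileges, so I would test it there before relying on it.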

Here's my implementation of find -follow in ruby:
https://gist.github.com/akostadinov/05c2a976dc16ffee9cac
I could have isolated it into a class or monkey patch Find but I decided to do it as a self-contained method. There might be room for improvement because it doesn't work with jruby. If anybody has an idea, it will be welcome.
Update: found out why it was not working with JRuby - https://github.com/jruby/jruby/issues/1895
I have since implemented a workaround.
Update 2: a cp_r_dereference method is now ready - https://gist.github.com/akostadinov/fc688feba7669a4eb784

Related

Can gdb set break at every function inside a directory?

I have a large source tree with a directory that has several files in it. I'd like gdb to break every time any of those functions are called, but don't want to have to specify every file. I've tried setting break /path/to/dir/:*, break /path/to/dir/*:*, rbreak /path/to/dir/.*:* but none of them catch any of the functions in that directory. How can I get gdb to do what I want?
There seems to be no direct way to do it:
rbreak file:. does not seem to accept directories, only files. Also note that you would want a dot ., not asterisk *
there seems to be no way to loop over symbols in the Python API, see https://stackoverflow.com/a/30032690/895245
The best workaround I've found is to loop over the files with the Python API, and then call rbreak with those files:
import os

import gdb  # provided by gdb's embedded Python interpreter

class RbreakDir(gdb.Command):
    """Call rbreak on every file found under the given directory."""
    def __init__(self):
        super().__init__(
            'rbreak-dir',
            gdb.COMMAND_BREAKPOINTS,
            gdb.COMPLETE_NONE,
            False
        )

    def invoke(self, arg, from_tty):
        # Walk the directory tree and run rbreak once per file
        for root, dirs, files in os.walk(arg):
            for basename in files:
                path = os.path.abspath(os.path.join(root, basename))
                gdb.execute('rbreak {}:.'.format(path), to_string=True)

RbreakDir()
Sample usage:
source a.py
rbreak-dir directory
This is ugly because of the gdb.execute call, but seems to work.
It is however too slow if you have a lot of files under the directory.
My test code is in my GitHub repo.
You could probably do this using the Python scripting that comes with modern versions of gdb. One option is to list all the symbols and, for each one whose source path contains the required directory, create an instance of the Breakpoint class at the appropriate place to set the breakpoint. (Sorry, I can't recall offhand how to get a list of all the symbols, but I think you can do this.)
You haven't said why exactly you need to do this, but depending on your use-case an alternative may be to use reversible debugging - i.e. let it crash, and then step backwards. You can use gdb's inbuilt reversible debugging, or for radically improved performance, see UndoDB (http://undo-software.com/)

How to build a portable absolute path in Ruby?

Let's assume a script needs to access a directory, say /some/where/abc, on an "arbitrary" OS. There are a couple of options to build the path in Ruby:
File.join('', 'some', 'where', 'abc')
File.absolute_path("some#{File::SEPARATOR}where#{File::SEPARATOR}abc", File::SEPARATOR)
Pathname in the standard API
I believe the first solution is clear enough, but idiomatic. In my experience, some code reviews ask for a comment to explain what it does...
The Question
Is there a better way to build an absolute path in Ruby, where better means "does the job and speaks for itself"?
What I would pick up on if I were doing a code review is that on Windows /tmp is not necessarily the best place to create a temporary directory, and also that it is perhaps not obvious to the casual reviewer that the initial '' argument creates <nothing>/tmp/abc. Therefore, I would recommend this code:
File.join(Dir.tmpdir(), 'abc')
See Ruby-doc for an explanation.
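Note that Dir.tmpdir lives in the tmpdir standard library, so it needs a require. A minimal sketch, where 'abc' is the directory name from the question and example.txt is made up for illustration:

```ruby
require 'tmpdir'

# Build the path under the platform's temporary directory
scratch = File.join(Dir.tmpdir, 'abc')
puts scratch

# If a throwaway directory is all that is needed, Dir.mktmpdir creates a
# uniquely named one and removes it again when the block returns
Dir.mktmpdir('abc') do |dir|
  File.write(File.join(dir, 'example.txt'), 'hello')
end
```

The printed path depends on the platform and on environment variables such as TMPDIR, which is the point of not hard-coding /tmp.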
UPDATE
If we expand the problem to a more generic solution that does not involve using tmpdir(), I cannot see a way round using the initial '' idiom (hack?). On Linux this is not too much of a problem, perhaps, but on Windows with multiple drive letters it will be. Furthermore, there does not appear to be a Ruby API or gem for iterating the mount points.
Therefore, my recommendation would be to delegate the mount point definition to a configuration option that might be '/' for Linux, 'z:/' for Windows, and smb://domain;user@my.file.server.com/mountpoint for a Samba share, then use File.join(ProjectConfig::MOUNT_POINT, 'some', 'where', 'abc').
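A rough sketch of what that configuration could look like. ProjectConfig::MOUNT_POINT is the name used above; reading it from an environment variable called PROJECT_MOUNT_POINT is purely my assumption, and any config mechanism would do:

```ruby
module ProjectConfig
  # Assumed convention: take the mount point from the environment,
  # falling back to the filesystem root on Unix-like systems
  MOUNT_POINT = ENV.fetch('PROJECT_MOUNT_POINT', '/')
end

path = File.join(ProjectConfig::MOUNT_POINT, 'some', 'where', 'abc')
puts path
```

File.join collapses the duplicate separator at the boundary, so a mount point of '/' gives /some/where/abc and 'z:/' gives z:/some/where/abc.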
File.join is THE canonical way to build a portable path in Ruby. I'm wondering who is doing the review. Perhaps Ruby is new to your organization.
I agree with @ChrisHeald that referring to the documentation is the best way to explain the code to a reviewer.

Ruby - Naming Convention - letter case for acronyms in class/module names?

I need to create a class that represents "SVN" inside a module called "SCM". But I don't know what the convention is when dealing with acronyms in Ruby, and could not find anything relevant on Google, except "Camel case is preferred".
Should I call it SCM::SVN or Scm::Svn? Is there a convention for this?
Add the following to config/initializers/inflections.rb.
ActiveSupport::Inflector.inflections(:en) do |inflect|
  inflect.acronym 'SVN'
end
Now running $ rails g model SVN… will create a class named SVN in a file named svn.rb and an associated table svns.
SCM::SVN looks best to me. Rails is full of classes like ERB, ORM and OMFGIMATEAPOT. And that's not to mention things like JSONSerializer. Ruby's source has a bunch of acronyms, too. The most obvious example to me is YAML. The standard as I've seen it is to upcase letters for CamelCase but generally not to downcase them (although Rails has opinions on model names).
If you have grep and the source code you can see plenty of examples with something like
grep -r 'class [A-Z]\{3,\}' <path/to/source>
# or, if you only want acronyms and nothing like YAMLColumn:
grep -rw 'class [A-Z]\{3,\}' <path/to/source>
I think that SCM::SVN looks better (aesthetically), and I've seen libraries that use the same convention. It's really just a matter of what you think reads better.
(However, note that if you are building a Rails project, and want this module to be autoloaded from the /lib directory, you may have to use Scm::Svn.)

Using "should" with class methods?

I'm used to making calls such as:
new_count.should eql(10)
on variables, but how can I do something similar with a class method such as File.directory?(my_path)?
Every combination of File.should be_directory(my_path) that I've tried leads to a method missing, as Ruby tries to find "be_directory" on my current object, rather than matching it against File.
I know I can turn it around and write
File.directory?(my_path).should == true
but that gives a really poor message when it fails.
Any ideas?
Hmm, maybe I have an idea.
File is part of Ruby proper, so it may have elements written in C. Some of Ruby's meta-programming tools break down when dealing with classes implemented in C, which could explain RSpec's failure to make .should behave as expected.
If that's true, there is no real solution here. I'd suggest using the MockFS library:
http://mockfs.rubyforge.org/
The downside to MockFS is that you have to use it everywhere you'd normally use File, Dir and FileUtils:
require 'mockfs'

def move_log
  MockFS.file_utils.mv( '/var/log/httpd/access_log', '/home/francis/logs/' )
end
The upside, especially if your code is file-intensive, is the ability to spec really complex scenarios out, and have them run without actually touching the slow filesystem. Everything happens in memory. Faster, more complete specs.
Hope this helps, Good luck!
I'm not sure why be_directory wouldn't work for you. What version of rspec are you using? You can also use rspec's predicate_matchers method, when a predicate exists, but it doesn't read nicely as be_predicate.
Here's what I tried:
describe File, "looking for a directory" do
  it "should be directory" do
    File.should be_directory("foo")
  end

  predicate_matchers[:find_the_directory_named] = :directory?
  it "should find directory" do
    File.should find_the_directory_named("foo")
  end
end
And that gave me the following output (run with spec -fs spec.rb):
File looking for a directory
- should be directory
- should find directory
Finished in 0.004895 seconds
2 examples, 0 failures

Adding a directory to $LOAD_PATH (Ruby)

I have seen two commonly used techniques for adding the directory of the file currently being executed to the $LOAD_PATH (or $:). I see the advantages of doing this in case you're not working with a gem. One seems more verbose than the other, obviously, but is there a reason to go with one over the other?
The first, verbose method (could be overkill):
$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless $LOAD_PATH.include?(File.expand_path(File.dirname(__FILE__)))
and the more straightforward, quick-and-dirty:
$:.unshift File.dirname(__FILE__)
Any reason to go with one over the other?
The Ruby load path is very commonly seen written as $: , but just because it is short, does not make it better. If you prefer clarity to cleverness, or if brevity for its own sake makes you itchy, you needn't do it just because everyone else is.
Say hello to ...
$LOAD_PATH
... and say goodbye to ...
# I don't quite understand what this is doing...
$:
I would say go with $:.unshift File.dirname(__FILE__) over the other one, simply because I've seen much more usage of it in code than the $LOAD_PATH one, and it's shorter too!
I'm not too fond of the 'quick-and-dirty' way.
Anyone new to Ruby will be pondering what $: is.
I find this more obvious.
libdir = File.dirname(__FILE__)
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
Or if I care about having the full path...
libdir = File.expand_path(File.dirname(__FILE__))
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
UPDATE 2009/09/10
As of late I've been doing the following:
$:.unshift(File.expand_path(File.dirname(__FILE__))) unless
  $:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
I've seen it in a whole bunch of different ruby projects while browsing GitHub.
Seems to be the convention?
If you type script/console in your Rails project and enter $:, you'll get an array of all the directories Ruby searches when loading files. The take-away from this little exercise is that $: is an array. That being so, you can call array methods on it, such as prepending directories with unshift or appending them with the << operator. As you implied in your question, $: and $LOAD_PATH are the same.
The disadvantage with doing it the quick and dirty way as you mentioned is this: if you already have the directory in your boot path, it will repeat itself.
Example:
I have a plugin I created called todo. My directory is structured like so:
/---vendor
  |
  |---/plugins
        |
        |---/todo
              |
              |---/lib
              |     |
              |     |---/app
              |           |
              |           |---/models
              |           |---/controllers
              |
              |---/rails
                    |
                    |---init.rb
In the init.rb file I entered the following code:
## In vendor/plugins/todo/rails/init.rb
%w{ models controllers models }.each do |dir|
  path = File.expand_path(File.join(File.dirname(__FILE__), '../lib', 'app', dir))
  $LOAD_PATH << path
  ActiveSupport::Dependencies.load_paths << path
  ActiveSupport::Dependencies.load_once_paths.delete(path)
end
Note how I tell the code block to perform the actions inside the block to the strings 'models', 'controllers', and 'models', where I repeat 'models'. (FYI, %w{ ... } is just another way to tell Ruby to hold an array of strings). When I run script/console, I type the following:
>> puts $:
I use puts so that each entry of the array is printed on its own line, which makes the contents easier to read. The output I get is:
...
...
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/controllers
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
As you can see, though this is as simple an example as I could create while using a project I'm currently working on, if you're not careful the quick and dirty way will lead to repeated paths. The longer way will check for repeated paths and make sure they don't occur.
If you're an experienced Rails programmer, you probably have a very good idea of what you're doing and are unlikely to make the mistake of repeating paths. If you're a newbie, I would go with the longer way until you really understand what you're doing.
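To make the difference concrete, the longer way can be wrapped in a small helper. This is only a sketch, and add_to_load_path is a name I made up, not something Ruby or Rails provides:

```ruby
# Hypothetical helper: prepend a directory to $LOAD_PATH only if it is
# not already there, comparing on the expanded (absolute) path
def add_to_load_path(dir)
  full = File.expand_path(dir)
  $LOAD_PATH.unshift(full) unless $LOAD_PATH.include?(full)
  full
end

libdir = add_to_load_path(File.dirname(__FILE__))
add_to_load_path(File.dirname(__FILE__))   # second call changes nothing

puts $LOAD_PATH.count(libdir)
```

Because the check is done on the expanded path, calling it with '.' and with the absolute path of the same directory still only adds one entry.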
This is the best I have come across for adding a directory via a relative path when using RSpec. I find it verbose enough, but still a nice one-liner.
$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
There is a gem which will let you set up your load path with nicer and cleaner code. Check this out: https://github.com/nayyara-samuel/load-path.
It also has good documentation.
My 2¢: I like $LOAD_PATH rather than $:. I'm getting old... I've studied 92,000 languages. I find it hard to keep track of all the customs and idioms.
I've come to abhor namespace pollution.
Last, when I deal with paths, I always delete and then either append or prepend -- depending upon how I want the search to proceed. Thus, I do:
1.times do
  models_dir = "#{File.expand_path(File.dirname(__FILE__))}/models"
  $LOAD_PATH.delete(models_dir)
  $LOAD_PATH.unshift(models_dir)
end
I know it's been a long time since this question was first asked, but I have an additional answer that I want to share.
I have several Ruby applications that were developed by another programmer over several years, and they re-use the same classes in the different applications although they might access the same database. Since this violates the DRY rule, I decided to create a class library to be shared by all of the Ruby applications. I could have put it in the main Ruby library, but that would hide custom code in the common codebase which I didn't want to do.
I had a problem where I had a name conflict between an already defined name "profile.rb", and a class I was using. This conflict wasn't a problem until I tried to create the common code library. Normally, Ruby searches application locations first, then goes to the $LOAD_PATH locations.
The application_controller.rb could not find the class I created, and threw an error on the original definition because it is not a class. Since I removed the class definition from the app/models section of the application, Ruby could not find it there and went looking for it in the Ruby paths.
So, I modified the $LOAD_PATH variable to include a path to the library directory I was using. This can be done in the environment.rb file at initialization time.
Even with the new directory added to the search path, Ruby was throwing an error because it was preferentially taking the system-defined file first. The search path in the $LOAD_PATH variable preferentially searches the Ruby paths first.
So, I needed to change the search order so that Ruby found the class in my common library before it searched the built-in libraries.
This code did it in the environment.rb file:
Rails::Initializer.run do |config|
  * * * * *
  path = []
  path.concat($LOAD_PATH)
  $LOAD_PATH.clear
  $LOAD_PATH << 'C:\web\common\lib'
  $LOAD_PATH << 'C:\web\common'
  $LOAD_PATH.concat(path)
  * * * * *
end
I don't think you can use any of the advanced coding constructs given before at this level, but it works just fine if you want to set something up at initialization time in your app. You must maintain the original order of the original $LOAD_PATH variable when it is added back to the new variable; otherwise some of the main Ruby classes get lost.
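As far as I can tell, a single unshift with several arguments has the same effect as the clear-and-concat sequence: the new directories end up at the front, in the order given, and the existing entries keep their order behind them. A sketch using the directory names from the code above (I have not tried it inside a Rails::Initializer block):

```ruby
# Directories to search before anything already on the load path
common_dirs = ['C:\web\common\lib', 'C:\web\common']

original = $LOAD_PATH.dup          # kept only to show the order is preserved
$LOAD_PATH.unshift(*common_dirs)   # prepends both, in the order given

puts $LOAD_PATH.first(2)
```

After this, $LOAD_PATH.drop(2) is the same list as original, so nothing from the standard search order is lost.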
In the application_controller.rb file, I simply use a
require 'profile'
require 'etc' #etc
and this loads the custom library files for the entire application, i.e., I don't have to use require commands in every controller.
For me, this was the solution I was looking for, and I thought I would add it to this answer to pass the information along.
