Guard Ruby classes with class-level metaprogramming against multiple loads - ruby

A quirk of Ruby's require is that, while in general, it will only load a file once, if that file is accessible via multiple paths (e.g. symlinks), it can be required multiple times. This causes problems when there are things like class-level metaprogramming, or in general a code that should only be executed once on file loading, getting executed multiple times.
Is there any way, from inside a Ruby class definition, to tell whether the class has been defined before? I thought defined? or Object.const_get might tell me, but from those it looks like the class is defined as soon as it's opened.

This is not an answer to your question in the second paragraph, but a solution to the issue in your first paragraph. Actually, you cannot avoid multiple file loads by checking whether a class was defined already.
Instead of doing:
require some_file_name
do:
require File.realpath(some_file_name)
By doing so, different symbolic links pointing to the same real file would be normalized to the same real file name, and hence multiple loading of them would be correctly filtered by require.
Cf. this question and the answer given there.

The real solution is that requiring that a piece of code is only executed once is bad design and you should fix that.
However, what you could do is simply set some flag that the code has already been executed and check that flag. E.g.:
class Foo
unless #__executed__
def bla; end
puts 'Test'
end
#__executed__ = true
end

Related

Ruby require multiple files without knowing their class names

Let's say I have an Application that requires different files. Each file contains a single purpose "thing". I call it "thing" because I'm unsure if it will be a class or a module.
The First Thing
# things/one.rb
class One
def self.do_it
"I'm the one"
end
end
And the second
# things/two.rb
class Two
def self.do_it
"I'm not the only one!"
end
end
Now I wanna require all files in the thingsdirectory and execute the do_it method. But I wanna do this dynamically - for every thing in things folder without even knowing their Classname.
# app.rb
Dir["things/*.rb"].each {|file| require file }
Keep in Mind: I simply wanna execute the code in the things files - it's not necessary that they are Classes or Modules
You can't reliably do this for a number of reasons, but the most important is that you're under no obligations to declare a single class in any given file, or a class at all. You've got the right idea with loading in all files in a particular directory, but if you need to operate on those there's two ways to crack that nut.
The easiest way is to declare these classes as a subclass of some already known parent. ActiveSupport has the descendents method to reflect on a particular class, that's part of Rails, but you can also do it the hard way if you have to.
The second easiest way is to correlate the names with the classes in them, and then convert filenames to class-names algorithmically. This is how the Rails autoloader works. cat.rb contains Cat, bad_dog.rb contains BadDog and so on.
If you look more closely at how things like Test::Unit work they take the first approach, any declared test subclasses are executed before the Ruby process terminates. It's a fairly simple system that puts no constraints on what the files are called or where and how the classes are declared.
Rails leans towards the second approach where the path communicates intent. This makes finding files more predictable but requires more discipline on the part of the developer to conform to that standard.
They both have merit. Pick the one that works for your use case.

if __FILE__ == $0 break program when it's placed before any methods

I'm new to Ruby and am trying to understand a behavior that I've run into. I'm writing a class that needs some constants initialized before it can run, but when its run from another class, as it sometimes will be, I get warnings about constants already being defined. So I placed the following at the end of my file:
if __FILE__ == $0
constant_initialiation
ReviewScraper.new.getReviews($testing, $getWeekendReviews, $clearWorksheet, $getAll)
end
The constant_initialization is just a bunch of constants being set, nothing interesting. Anyway, this works great for me -- so long as its at the end of the file. If I move this up to the top and try running, I get an error: unitialized constant ReviewScraper (NameError). Almost as if it compiles sequentially for just this part of the file and isn't finding the ReviewScraper class definition when its run.
Can any Ruby geniuses explain this behavior to me? It's no big issue other than for styling purposes (I like having my list of constants up top), but it would be nice to understand what's going on here.
First off, I'll expand on my comments above as to what is going on. Secondly, I'll suggest perhaps a better way of setting your constants in code that avoids setting them multiple times.
As I mentioned, a Ruby script reads from top to bottom. So, if you try to instantiate a class before you define it, they the Ruby script won't know what it is.
cat = Cat.new # NameError
class Cat
# Code
end
cat = Cat.new # Works fine
The script first reads the line where you make a new Cat object. However, it doesn't know what a Cat is yet. Once it is done processing the code for what a Cat is, then it can create one. I used the example of talking to someone about Darth Vader but it is probably more akin to asking a construction company to build your building before you've ever handed them a blueprint. It is only after they have the blueprint that they can build a your building.
Now, in regards to initializing constants, there are a couple different things you could do. One is, you could put the initializations in a if block much like you did at the top but leave out the instantiating of the class until the end of the script. (Two if statements.) Another would be to put the constants in a module in its own file.
module Names
Dog = "Spot"
Cat = "Sparkles"
end
Now you just require that file wherever you need it. In file_one.rb you put
require_relative './modules/names_module.rb' # Or wherever it is
include Names
You put the same thing in your Review Scraper file. Here's the cool part: if you require the Names module once, it'll be brought into the code. However, if you require it a second time nothing will happen. You won't get warnings. It'll just quietly not require it a second time. One top of that, all your constants are in their own namespace.
Just a thought.

Ruby require - execute multiple times

A general question about require in Ruby.
My understanding of Ruby's require is that the file specified is only loaded once but can be executed multiple times. Is this correct?
I have a set of Rspec tests in different files which all require the same file logger.rb. It doesn't seem like the methods I call in the required file are executed in every spec.
Here is some of the code I wrote in logger.rb, it first cleans up the temp dir an then creates a logger.
tmpdir = Dir.tmpdir
diagnostics_directory = File.join(tmpdir, LibertyBuildpack::Diagnostics::DIAGNOSTICS_DIRECTORY)
FileUtils.rm_rf diagnostics_directory
raise 'Failed to create logger' if LibertyBuildpack::Diagnostics::LoggerFactory.create_logger(tmpdir).nil?
I'd like this to happen in every spec.
Is that because the tests are within the same module or did I misunderstand how require works.
There is still a great many ifs, as you do not show the code requiring your file, but I think I have understood some of the misunderstandings you had :-)
Your statement 'The file specified is only loaded once but can be executed multiple times.' is basically the opposite of what is true. If a file is to have any effect on a ruby program it will have to be 'executed', it may just happen sometimes that one of the executed methods defines other methods or classes. All statements in a file will be executed once when loaded, but you may load a file multiple times. If you require a file it will only be loaded if it has not been done so already, so as with method definitions your 'static' method calls will only be executed once.
If you want to execute things multiple times you should either load the file (which is usually inefficient since all the compiling will have to be done again) or you require a file with a method definition in it (def ... end) and you call the method multiple times (with possibly varying parameters). The latter is the usual way, as compiling will only have to take place once this way.

How can I determine the size of methods and classes in Ruby?

I'm working on a code visualization tool and I'd like to be able to display the size(in lines) of each Class, Method, and Module in a project. It seems like existing parsers(such as Ripper) could make this info easy to get. Is there a preferred way to do this? Is there a method of assessing size for classes that have been re-opened in separate locations? How about for dynamically (Class.new {}, Module.new {}) defined structures?
I think what you're asking for is not possible in general without actually running the whole Ruby program the classes are part of (and then you run into the halting problem). Ruby is extremely dynamic, so lines could be added to a class' definition anywhere, at any time, without necessarily referring to the particular class by name (e.g. using class_eval on a class passed into a method as an argument). Not that the source code of a class' definition is saved anyway... I think the closest you could get to that is the source_locations of the methods of the class.
You could take the difference of the maximum and minimum line numbers of those source_locations for each file. Then you'd have to assume that the class is opened only once per file, and that the size of the last method in a file is negligible (as well as any non-method parts of the class definition that happen before the first method definition or after the last one).
If you want something more accurate maybe you could run the program, get method source_locations, and try to correlate those with a separate parse of the source file(s), looking for enclosing class blocks etc.
But anything you do will most likely involve assumptions about how classes are generally defined, and thus not always be correct.
EDIT: Just saw that you were asking about methods and modules too, not just classes, but I think similar arguments apply for those.
I've created a gem that handles this problem in the fashion suggested by wdebaum. class_source. It certainly doesn't cover all cases but is a nice 80% solution for folks that need this type of thing. Patches welcome!

Adding a directory to $LOAD_PATH (Ruby)

I have seen two commonly used techniques for adding the directory of the file currently being executed to the $LOAD_PATH (or $:). I see the advantages of doing this in case you're not working with a gem. One seems more verbose than the other, obviously, but is there a reason to go with one over the other?
The first, verbose method (could be overkill):
$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless $LOAD_PATH.include?(File.expand_path(File.dirname(__FILE__)))
and the more straightforward, quick-and-dirty:
$:.unshift File.dirname(__FILE__)
Any reason to go with one over the other?
The Ruby load path is very commonly seen written as $: , but just because it is short, does not make it better. If you prefer clarity to cleverness, or if brevity for its own sake makes you itchy, you needn't do it just because everyone else is.
Say hello to ...
$LOAD_PATH
... and say goodbye to ...
# I don't quite understand what this is doing...
$:
I would say go with $:.unshift File.dirname(__FILE__) over the other one, simply because I've seen much more usage of it in code than the $LOAD_PATH one, and it's shorter too!
I'm not too fond on the 'quick-and-dirty' way.
Anyone new to Ruby will be pondering what $:. is.
I find this more obvious.
libdir = File.dirname(__FILE__)
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
Or if I care about having the full path...
libdir = File.expand_path(File.dirname(__FILE__))
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
UPDATE 2009/09/10
As of late I've been doing the following:
$:.unshift(File.expand_path(File.dirname(__FILE__))) unless
$:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
I've seen it in a whole bunch of different ruby projects while browsing GitHub.
Seems to be the convention?
If you type script/console in your Rails project and enter $:, you'll get an array that includes all the directories needed to load Ruby. The take-away from this little exercise is that $: is an array. That being so, you can perform functions on it like prepending other directories with the unshift method or the << operator. As you implied in your statement $: and $LOAD_PATH are the same.
The disadvantage with doing it the quick and dirty way as you mentioned is this: if you already have the directory in your boot path, it will repeat itself.
Example:
I have a plugin I created called todo. My directory is structured like so:
/---vendor
|
|---/plugins
|
|---/todo
|
|---/lib
|
|---/app
|
|---/models
|---/controllers
|
|---/rails
|
|---init.rb
In the init.rb file I entered the following code:
## In vendor/plugins/todo/rails/init.rb
%w{ models controllers models }.each do |dir|
path = File.expand_path(File.join(File.dirname(__FILE__), '../lib', 'app', dir))
$LOAD_PATH << path
ActiveSupport::Dependencies.load_paths << path
ActiveSupport::Dependencies.load_once_paths.delete(path)
end
Note how I tell the code block to perform the actions inside the block to the strings 'models', 'controllers', and 'models', where I repeat 'models'. (FYI, %w{ ... } is just another way to tell Ruby to hold an array of strings). When I run script/console, I type the following:
>> puts $:
And I type this so that it is easier to read the contents in the string. The output I get is:
...
...
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/controllers
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
As you can see, though this is as simple an example I could create while using a project I'm currently working on, if you're not careful the quick and dirty way will lead to repeated paths. The longer way will check for repeated paths and make sure they don't occur.
If you're an experienced Rails programmer, you probably have a very good idea of what you're doing and likely not make the mistake of repeating paths. If you're a newbie, I would go with the longer way until you understand really what you're doing.
Best I have come across for adding a dir via relative path when using Rspec. I find it verbose enough but also still a nice one liner.
$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
There is a gem which will let you setup your load path with nicer and cleaner code. Check this out: https://github.com/nayyara-samuel/load-path.
It also has good documentation
My 2ยข: I like $LOAD_PATH rather than $:. I'm getting old... I've studied 92,000 languages. I find it hard to keep track of all the customs and idioms.
I've come to abhor namespace pollution.
Last, when I deal with paths, I always delete and then either append or prepend -- depending upon how I want the search to proceed. Thus, I do:
1.times do
models_dir = "#{File.expand_path(File.dirname(__FILE__))}/models"
$LOAD_PATH.delete(models_dir)
$LOAD_PATH.unshift(models_dir)
end
I know it's been a long time since this question was first asked, but I have an additional answer that I want to share.
I have several Ruby applications that were developed by another programmer over several years, and they re-use the same classes in the different applications although they might access the same database. Since this violates the DRY rule, I decided to create a class library to be shared by all of the Ruby applications. I could have put it in the main Ruby library, but that would hide custom code in the common codebase which I didn't want to do.
I had a problem where I had a name conflict between an already defined name "profile.rb", and a class I was using. This conflict wasn't a problem until I tried to create the common code library. Normally, Ruby searches application locations first, then goes to the $LOAD_PATH locations.
The application_controller.rb could not find the class I created, and threw an error on the original definition because it is not a class. Since I removed the class definition from the app/models section of the application, Ruby could not find it there and went looking for it in the Ruby paths.
So, I modified the $LOAD_PATH variable to include a path to the library directory I was using. This can be done in the environment.rb file at initialization time.
Even with the new directory added to the search path, Ruby was throwing an error because it was preferentially taking the system-defined file first. The search path in the $LOAD_PATH variable preferentially searches the Ruby paths first.
So, I needed to change the search order so that Ruby found the class in my common library before it searched the built-in libraries.
This code did it in the environment.rb file:
Rails::Initializer.run do |config|
* * * * *
path = []
path.concat($LOAD_PATH)
$LOAD_PATH.clear
$LOAD_PATH << 'C:\web\common\lib'
$LOAD_PATH << 'C:\web\common'
$LOAD_PATH.concat(path)
* * * * *
end
I don't think you can use any of the advanced coding constructs given before at this level, but it works just fine if you want to setup something at initialization time in your app. You must maintain the original order of the original $LOAD_PATH variable when it is added back to the new variable otherwise some of the main Ruby classes get lost.
In the application_controller.rb file, I simply use a
require 'profile'
require 'etc' #etc
and this loads the custom library files for the entire application, i.e., I don't have to use require commands in every controller.
For me, this was the solution I was looking for, and I thought I would add it to this answer to pass the information along.

Resources