Typically when I am testing small snippets of code for Ruby, I many times put code chunks in separate files in the same directory, run irb, and then run the following command:
Dir[Dir.pwd + "/*.rb"].each { |file| require file }
Which loads all the files into irb. Which brings me to my question: when I require a file, how does irb process that request? Does it take all requires and put them in one overall 'file' ? I am looking for the mechanics of how irb works.
If anyone has the answer or can point me in the right direction, I would appreciate it.
Cheers
The short answer is:
require loads a file into the Ruby interpreter. The source code is analyzed, its by-products incorporated into the Ruby runtime (classes loaded, etc.), and then the source code is not saved anywhere and is eventually garbage collected (the memory it occupied is freed).
Related
I'm writing a Ruby gem with lots of nested classes and such. I'd like to keep a huge list of require statements out of my main Ruby file in the /lib directory, so instead I used the following:
Dir[ File.join( File.dirname(__FILE__), "**", "*.rb" ) ].each {|f| require f}
which totally worked fine until this morning when I added a new file (helper module) and now the library is acting like that file isn't loaded, even though it is. I checked with
puts "loaded" if defined?(RealtimeArgHelpers)
I also duplicated the require statement to check to see if my new file is getting returned
Dir[ File.join( File.dirname(__FILE__), "**", "*.rb" ) ].each {|f| puts f}
and it is. I have to manually require this one file. Out of nowhere. I have 101 other files currently being gathered by this statement and everything works perfectly fine. But not this one file. I don't have any name conflicts besides
/path/arg_helpers.rb
/path/realtime/agents.rb
/path/realtime/queues.rb
/path/realtime.rb
/path/realtime_arg_helpers.rb
which still shouldn't be a 'conflict.' I'm completely baffled by the seemingly random behavior, unless I'm doing something illegal in the language. I tried renaming the module, renaming the file, no dice. Why is this one file not getting loaded?
The problem comes from the way require works in Ruby. Whenever a file is required (let's only consider the case of a .rb source file to keep it simple), Ruby parses the file and immediately executes the instructions that are not in methods. So if you use classes that are defined in files yet to be loaded, Ruby won't find them.
I don't really know if there is a recommended way used by the gem authors to deal with that, but coming from a C/C++ background I tend to favour explicitly requiring the dependencies in each source file, rather than trying to maintain a giant sorted list in the main.rb (or equivalent).
With this minimal ruby code:
require 'debug'
puts
in a file called, e.g. script.rb
if I launch it like so: ruby -rdebug script.rb
and then press l on the debug prompt, I get the listing, as expected
if I instead run it normally as ruby script.rb
when pressing l I get:
(rdb:1) l
[-3, 6] in script.rb
No sourcefile available for script.rb
The error message seems misleading at best: the working directory hasn't changed, and the file is definitely still there!
I'm unable to find documentation on this behavior (I tried it on both jruby and mri, and the behavior is the same)
I know about 'debugger' and 'pry', but they serve a different use case:
I'm used to other scripting languages with a builtin debug module, that can let me put a statement anywhere in the code to drop me in a debug shell, inspect code, variables and such... the advantage of having it builtin it's that it is available everywhere, without having to set up an environment for it, or even when I'm on a machine that's not my own
I could obviously workaround this by always calling the interpreter with -rdebug and manually setting the breakpoint, but I find this more work than the alternative
After looking into the debug source code, I found a workaround and a fix:
the workaround can be:
trace on
next
trace off
list
this will let you get the listing without restarting the interpreter with -rdebug, with the disadvantage that you'll get some otherwise unwanted output from the tracing, and you'll be able to see it only after moving by one statement
for the fix: the problem is that the SCRIPT_LINES__ Hash lacks a value for the current file... apparently it's only set inside tracer.rb
I've changed line 161, and changed the Hash with a subclass that tracks where []= has been called from, and I wasn't able to dig up the actual code that does the work when stepping into a function that comes from a different file
Also: I haven't found a single person yet who actively uses this module (I asked both on #ruby, #jruby and #pry on freenode), and together with the fact that it uses a function that is now obsolete it leads me to be a bit pessimistic about the maintenance state of this module
nonetheless, I submitted a pull request for the fix (it's actually quite dumb and simple, but to do otherwise I'd need a deeper understanding of this module, and possibly to refactor some parts of it... but if this module isn't actively maintaned, I'm not sure that's a good thing to put effort on)
I suspect the Ruby interpreter doesn't have the ability to load the sourcefile without the components in the debug module. So, no -rdebug, no access to the sourcefile. And I agree it is a misleading error. "Debugging features not loaded" might be better.
I am confused about the difference between load 'file.rb' and require 'Module'. At Learn Ruby the Hard Way, the example of how to use a module is set up with two files (mystuff.rb and apple.rb):
mystuff.rb
module MyStuff
def MyStuff.apple()
puts "I AM APPLES!"
end
end
apple.rb
require 'mystuff'
MyStuff.apple()
However, when I run apple.rb, either in the Sublime Text console, or with ruby apple.rb, I get a Load Error. I have also tried require 'MyStuff', and require 'mystuff.rb', but I still get the Load Error.
So, I switched the first line of apple.rb to load 'mystuff.rb', which allows it to run. However, if I edit 'mystuff.rb' to be a definition of class MyStuff as opposed to a module MyStuff, there is no difference.
For reference, the Load Error is:
/Users/David/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/site_ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in require': cannot load such file -- mystuff (LoadError)`
I've peeked into kernel_require.rb and looked at the require definition, but since I'm a Ruby Nuby (indeed, a programming newbie), it was a little overwhelming. Since Learn Ruby the Hard Way hasn't been updated since 2012-10-05, there've probably been some syntax changes for modules. Yes?
require searches a pre-defined list of directories, as discussed in What are the paths that "require" looks up by default?. It's failing because it can't find the mystuff.rb in any of those directories.
load, on the other hand, will look for files in the current directory.
As for:
However, if I edit 'mystuff.rb' to be a definition of class MyStuff as
opposed to a module MyStuff, there is no difference.
I'm not sure I understand what you mean by "no difference". If you mean that the require and load continue to fail and succeed, respectively, that makes sense, as the require failure is independent of the content of the file contents and the code you're testing behaves the same independent of whether Mystuff is a class or a vanilla module.
You can solve this easily by changing
require 'mystuff'
to
require_relative './mystuff'
I'm reading through "Programming Ruby 1.9". On page 208 (in a "Where to Put Tests" section), the book has the code organized as
roman
lib/
roman.rb
other files...
test/
test_roman.rb
other_tests...
other stuff
and asks how we get our test_roman.rb file to know about the roman.rb file.
It says that one option that doesn't work is to build the path into require statements in the test code:
# in test_roman.rb
require 'test/unit'
require '../lib/roman'
Instead, it says a better solution is for all other components of the application to assume that the top-level directory of the application is in Ruby's load path, so that the test code would have
# in test_roman.rb
require 'test/unit'
require '/lib/roman'
and we'd run the tests by calling ruby -I path/to/app path/to/app/test/test_roman.rb.
My question is: is this realy the best way? It seems like
If we simply replaced require '../lib/roman' in the first option with require_relative '../lib/roman', everything would work fine.
The assumption in the second option (that all components have the top-level directory in Ruby's load path) only works because we pass the -I path/to/app argument, which seems a little messy.
Am I correct that replacing require with require_relative fixes all the problems? Is there any reason to prefer the second option anyways?
Further on, that same book makes use of require_relative (Chapter 16, Organizing your source code) in the context of testing, so yes, I would say that using it is a Good Thing, since it "always loads files from a path relative to the directory of the file that invokes it".
Of course, like #christiangeek noticed, require_relative is new in the 1.9 series, but there's a gem that provides you with the same functionality.
It might be worth pointing out that the Pickaxe too provides a little method you can stick in your code in the same chapter I mentioned before.
require_relative does make the code cleaner but is only available natively on Ruby > 1.9.2. Which means if you want want your code to be portable to versions of Ruby < 1.9.2 you need to use a extension or the regular require AFAIK. The book is most likely a) written before 1.9.2 became widespread or b) providing a example for lowest common denominator.
I have seen two commonly used techniques for adding the directory of the file currently being executed to the $LOAD_PATH (or $:). I see the advantages of doing this in case you're not working with a gem. One seems more verbose than the other, obviously, but is there a reason to go with one over the other?
The first, verbose method (could be overkill):
$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless $LOAD_PATH.include?(File.expand_path(File.dirname(__FILE__)))
and the more straightforward, quick-and-dirty:
$:.unshift File.dirname(__FILE__)
Any reason to go with one over the other?
The Ruby load path is very commonly seen written as $: , but just because it is short, does not make it better. If you prefer clarity to cleverness, or if brevity for its own sake makes you itchy, you needn't do it just because everyone else is.
Say hello to ...
$LOAD_PATH
... and say goodbye to ...
# I don't quite understand what this is doing...
$:
I would say go with $:.unshift File.dirname(__FILE__) over the other one, simply because I've seen much more usage of it in code than the $LOAD_PATH one, and it's shorter too!
I'm not too fond on the 'quick-and-dirty' way.
Anyone new to Ruby will be pondering what $:. is.
I find this more obvious.
libdir = File.dirname(__FILE__)
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
Or if I care about having the full path...
libdir = File.expand_path(File.dirname(__FILE__))
$LOAD_PATH.unshift(libdir) unless $LOAD_PATH.include?(libdir)
UPDATE 2009/09/10
As of late I've been doing the following:
$:.unshift(File.expand_path(File.dirname(__FILE__))) unless
$:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
I've seen it in a whole bunch of different ruby projects while browsing GitHub.
Seems to be the convention?
If you type script/console in your Rails project and enter $:, you'll get an array that includes all the directories needed to load Ruby. The take-away from this little exercise is that $: is an array. That being so, you can perform functions on it like prepending other directories with the unshift method or the << operator. As you implied in your statement $: and $LOAD_PATH are the same.
The disadvantage with doing it the quick and dirty way as you mentioned is this: if you already have the directory in your boot path, it will repeat itself.
Example:
I have a plugin I created called todo. My directory is structured like so:
/---vendor
|
|---/plugins
|
|---/todo
|
|---/lib
|
|---/app
|
|---/models
|---/controllers
|
|---/rails
|
|---init.rb
In the init.rb file I entered the following code:
## In vendor/plugins/todo/rails/init.rb
%w{ models controllers models }.each do |dir|
path = File.expand_path(File.join(File.dirname(__FILE__), '../lib', 'app', dir))
$LOAD_PATH << path
ActiveSupport::Dependencies.load_paths << path
ActiveSupport::Dependencies.load_once_paths.delete(path)
end
Note how I tell the code block to perform the actions inside the block to the strings 'models', 'controllers', and 'models', where I repeat 'models'. (FYI, %w{ ... } is just another way to tell Ruby to hold an array of strings). When I run script/console, I type the following:
>> puts $:
And I type this so that it is easier to read the contents in the string. The output I get is:
...
...
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/controllers
./Users/Me/mySites/myRailsApp/vendor/plugins/todo/lib/app/models
As you can see, though this is as simple an example I could create while using a project I'm currently working on, if you're not careful the quick and dirty way will lead to repeated paths. The longer way will check for repeated paths and make sure they don't occur.
If you're an experienced Rails programmer, you probably have a very good idea of what you're doing and likely not make the mistake of repeating paths. If you're a newbie, I would go with the longer way until you understand really what you're doing.
Best I have come across for adding a dir via relative path when using Rspec. I find it verbose enough but also still a nice one liner.
$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
There is a gem which will let you setup your load path with nicer and cleaner code. Check this out: https://github.com/nayyara-samuel/load-path.
It also has good documentation
My 2ยข: I like $LOAD_PATH rather than $:. I'm getting old... I've studied 92,000 languages. I find it hard to keep track of all the customs and idioms.
I've come to abhor namespace pollution.
Last, when I deal with paths, I always delete and then either append or prepend -- depending upon how I want the search to proceed. Thus, I do:
1.times do
models_dir = "#{File.expand_path(File.dirname(__FILE__))}/models"
$LOAD_PATH.delete(models_dir)
$LOAD_PATH.unshift(models_dir)
end
I know it's been a long time since this question was first asked, but I have an additional answer that I want to share.
I have several Ruby applications that were developed by another programmer over several years, and they re-use the same classes in the different applications although they might access the same database. Since this violates the DRY rule, I decided to create a class library to be shared by all of the Ruby applications. I could have put it in the main Ruby library, but that would hide custom code in the common codebase which I didn't want to do.
I had a problem where I had a name conflict between an already defined name "profile.rb", and a class I was using. This conflict wasn't a problem until I tried to create the common code library. Normally, Ruby searches application locations first, then goes to the $LOAD_PATH locations.
The application_controller.rb could not find the class I created, and threw an error on the original definition because it is not a class. Since I removed the class definition from the app/models section of the application, Ruby could not find it there and went looking for it in the Ruby paths.
So, I modified the $LOAD_PATH variable to include a path to the library directory I was using. This can be done in the environment.rb file at initialization time.
Even with the new directory added to the search path, Ruby was throwing an error because it was preferentially taking the system-defined file first. The search path in the $LOAD_PATH variable preferentially searches the Ruby paths first.
So, I needed to change the search order so that Ruby found the class in my common library before it searched the built-in libraries.
This code did it in the environment.rb file:
Rails::Initializer.run do |config|
* * * * *
path = []
path.concat($LOAD_PATH)
$LOAD_PATH.clear
$LOAD_PATH << 'C:\web\common\lib'
$LOAD_PATH << 'C:\web\common'
$LOAD_PATH.concat(path)
* * * * *
end
I don't think you can use any of the advanced coding constructs given before at this level, but it works just fine if you want to setup something at initialization time in your app. You must maintain the original order of the original $LOAD_PATH variable when it is added back to the new variable otherwise some of the main Ruby classes get lost.
In the application_controller.rb file, I simply use a
require 'profile'
require 'etc' #etc
and this loads the custom library files for the entire application, i.e., I don't have to use require commands in every controller.
For me, this was the solution I was looking for, and I thought I would add it to this answer to pass the information along.