Does the Ruby interpreter compile to byte-code in a lazy way? How? - ruby

For MRI 1.9+ and Rubinius, Ruby source code is compiled into byte-code, and that byte-code is then interpreted by the VM. I want to know the details of this mechanism when running a Ruby script from the command line using the interpreter.
Does the interpreter first compile all the source files required by the script and then run everything? Or does it execute some code and then compile other files lazily, as they are needed?
If it's the latter (which I suspect), is this done per file or per block of code?
At which point does it stop executing byte-code and run the compilation process again?
Does this process differ between MRI and Rubinius?
For example, if I run "ruby my_main_script.rb", which requires 3 other rb source files (and these files themselves do not require anything), the possibilities I imagine are:
A: The interpreter parses my_main_script.rb and the 3 files. After parsing them, it compiles all the ASTs to byte-code. It then proceeds to run the byte-code using the VM.
B: Ruby parses my_main_script.rb and compiles it into byte-code. It then runs the byte-code. When it encounters a call to a method in another file, it first parses and compiles that file and then continues with the execution. If this is the case, I would like to understand it in detail.
C: Ruby parses and compiles some piece of code from my_main_script.rb according to some criterion (unknown to me), runs that byte-code, and then parses and compiles another piece when needed. This process, and how that "when needed" condition is detected, is what I would like to understand.
Update 30/03/16
I wrote this little experiment script to try to check if B is the right answer:
class RubyVM
  class InstructionSequence
    class << self
      alias :old_compile_file :compile_file
      def compile_file(code, opt)
        puts "Injecting code..."
        old_compile_file(code, opt)
      end
      alias :old_compile :compile
      def compile(code)
        puts "Injecting code..."
        old_compile(code)
      end
    end
  end
end
require_relative 'say_hi'
'say_hi.rb' only contains the line "puts 'hello'".
If B is the right answer, shouldn't the output be the following?
Injecting code...
hello
It just outputs "hello"...

For me B is the right answer.
Ruby lets us load code dynamically via autoload and execute strings as code (eval), so it must be able to parse and compile code at any time.
Therefore each file is transformed into YARV instructions as a whole at the moment the require that names it is executed, and code reached through autoload or eval is transformed later, when it is first needed. Your experiment prints only "hello" because require compiles the file internally, without going through the Ruby-level RubyVM::InstructionSequence.compile_file method that you patched.
A very good book about this process is Ruby Under a Microscope.
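One way to observe this behaviour is to require a file that cannot be compiled. The sketch below is only an illustration for MRI: the file name broken_lib.rb and the events array are made up for the example. If every required file were compiled before execution started, the script could not run any of its own code before failing. On MRI the first event should be recorded, and the SyntaxError should surface only when the require line is reached.

```ruby
require "tmpdir"

# Records what happened, in order.
events = []

Dir.mktmpdir do |dir|
  # Hypothetical required file containing deliberately invalid Ruby.
  broken_path = File.join(dir, "broken_lib.rb")
  File.write(broken_path, "def oops(\n")

  # The main script is already executing at this point...
  events << :main_script_running

  begin
    # ...and the broken file is only parsed and compiled here.
    require broken_path
  rescue SyntaxError
    # SyntaxError is not a StandardError, so it must be rescued explicitly.
    events << :syntax_error_at_require
  end
end

p events
```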

Related

How to write Rspec test for running file from command line?

I have a Ruby project with a UNIX executable file called parse located in a bin subfolder in my project root directory.
At the moment it's just this:
#!/usr/bin/env ruby
# frozen_string_literal: true
puts 'hello world'
The file can be executed on the command line when this command is run from the project root directory: bin/parse
It works fine, but I also want to write a passing Rspec test for it.
I have this spec file:
RSpec.describe "end-to-end application behaviour" do
  subject { system('bin/parse') }
  it 'prints the expected message to stdout' do
    expect { subject }.to output(
      'hello world'
    ).to_stdout
  end
end
When I run it I get the test failure:
expected block to output "hello world" to stdout, but output nothing
This is the location of my spec file relative to my project root: spec/integration/parse_spec.rb
I tried placing require and require_relative statements in that spec file with the paths to the parse executable, in case that would help, but I just kept getting:
LoadError: cannot load such file
Does anyone know how I can write a test in that file that will pass and prove the parse executable behaviour works?
Don't Use the RSpec Output Matcher
RSpec has a built-in output matcher that can test both where output goes and what it contains. However, it checks what your Ruby process writes to $stdout or $stderr, not what an external application launched by system writes to its own standard output or standard error. You're going to have to make some different assumptions about your code.
You can avoid driving yourself nuts by comparing strings rather than testing the underlying shell or your output streams. For example, consider:
RSpec.describe "parse utility output" do
  it "prints the right string on standard output" do
    expect(`echo hello world`).to start_with("hello world")
  end
  it "shows nothing on standard output when it prints to stderr" do
    expect(`echo foo >&2 > /dev/null`).to be_empty
  end
end
Just replace the echo statements with the correct invocation of parse for your system, perhaps by setting PATH directly in your shell, using a utility like direnv, or by modifying ENV["PATH"] in your spec or spec_helper.
As a rule of thumb, RSpec isn't really meant for testing command-line applications. If you want to do that, consider using the Aruba framework to exercise your command-line applications. It's best to use RSpec to test the results of methods or the output of commands, rather than trying to test basic functionality. Of course, your mileage may vary.
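If you want stdout, stderr, and the exit status separately, the standard library's Open3.capture3 is another way to do the string comparison described above. This is a minimal sketch, not the answerer's method: run_command is a hypothetical helper, and a one-line Ruby command stands in for bin/parse so that the example runs anywhere.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: runs a command and returns its stdout, stderr,
# and whether it exited successfully.
def run_command(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  { stdout: stdout, stderr: stderr, success: status.success? }
end

# Stand-in for bin/parse: a child Ruby process that prints the same line.
result = run_command(RbConfig.ruby, "-e", "puts 'hello world'")

puts result[:stdout]
```

Inside a spec, the same hash could be checked with ordinary matchers, for example comparing result[:stdout] against "hello world\n" (puts appends the newline).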
Use to_stdout_from_any_process instead of to_stdout, and include the trailing newline that puts appends:
expect { subject }.to output("hello world\n").to_stdout_from_any_process

Stub require statement in rspec?

I have to maintain a Ruby script, which requires some libs I don't have locally and which won't work in my environment. Nevertheless I want to spec some methods in this script so that I can change them easily.
Is there an option to stub some of the require statements in the script I want to test so that it can be loaded by rspec and the spec can be executed within my environment?
Example (old_script.rb):
require "incompatible_lib"
class Script
  def some_other_stuff
    ...
  end
  def add(a,b)
    a+b
  end
end
How can I write a test to check the add method without splitting the old_script.rb file and without providing the incompatible_lib I don't have?
Instead of stubbing require, which every object inherits from Kernel, you could do this:
Create a dummy incompatible_lib.rb file in a directory that is not already in your $LOAD_PATH. I.e., if this is a plain Ruby application (not Rails), don't put it in lib/ or spec/.
There are a number of ways to do the next step, but here is one: in the spec file that tests Script, modify $LOAD_PATH to include the directory containing your dummy incompatible_lib.rb.
Ordering is very important: only after that should you require old_script.rb (the file which defines Script).
This will get you around the issue and allow you to test the add method.
Once you've successfully tested Script, I would highly recommend refactoring it so that you no longer need this technique, which is a hack, IMHO.
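The steps above can be sketched as follows. Everything here is an assumption made for illustration: the temporary directory layout, the stubs directory name, and the trimmed-down copy of old_script.rb written by the example itself. In a real project the stub file and the script would already exist on disk.

```ruby
require "tmpdir"

Dir.mktmpdir do |root|
  # 1. A dummy incompatible_lib.rb in a directory that is not on $LOAD_PATH yet.
  stub_dir = File.join(root, "stubs")
  Dir.mkdir(stub_dir)
  File.write(File.join(stub_dir, "incompatible_lib.rb"), "# empty stand-in\n")

  # Trimmed-down stand-in for the legacy script from the question.
  script_path = File.join(root, "old_script.rb")
  File.write(script_path, <<~RUBY)
    require "incompatible_lib"

    class Script
      def add(a, b)
        a + b
      end
    end
  RUBY

  # 2. Put the stub directory at the front of the load path...
  $LOAD_PATH.unshift(stub_dir)

  # 3. ...and only then load the script, so its require finds the dummy.
  require script_path
end

sum = Script.new.add(2, 3)
puts sum
```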
Thanks, I also thought about the option of adding the files, but finally hacked the require itself within the test case:
module Kernel
  alias :old_require :require
  # LIBS_TO_SKIP is assumed to be an array of library names defined by the test.
  def require(path)
    old_require(path) unless LIBS_TO_SKIP.include?(path)
  end
end
I know that this is an ugly hack but as this is legacy code executed on a modified ruby compiler I can't easily get these libs running and it's sufficient to let me test my modifications...

Is there a "main" method in Ruby like in C?

I'm new to Ruby, so apologies if this sounds really silly.
I can't seem to figure out how to write "main" code and have methods in the same file (similar to C). I end up with a "main" file which loads a separate file that has all the methods. I appreciate any guidance on this.
I spotted the following SO post but I don't understand it:
Should I define a main method in my ruby scripts?
While it's not a big deal, it's just easier being able to see all the relevant code in the same file. Thank you.
[-EDIT-]
Thanks to everyone who responded - turns out you just need to define all the methods above the code. An example is below:
def callTest1
  puts "in test 1"
end
def callTest2
  puts "in test 2"
end
callTest1
callTest2
I think this makes sense, as Ruby needs to have seen a method definition before the method is called. This is unlike C, where a header file clearly lists the available functions, so they can be defined beneath the main() function.
Again, thanks to everyone who responded.
@Hauleth's answer is correct: there is no main method or structure in Ruby. I just want to provide a slightly different view here along with some explanation.
When you execute ruby somefile.rb, Ruby executes all of the code in somefile.rb. So if you have a very small project and want it to be self-contained in a single file, there's absolutely nothing wrong with doing something like this:
# somefile.rb
class MyClass
  def say_hello
    puts "Hello World"
  end
end
def another_hello
  puts "Hello World (from a method)"
end
c = MyClass.new
c.say_hello
another_hello
It's not that the first two blocks aren't executed, it's just that you don't see the effects until you actually use the corresponding class/method.
The if __FILE__ == $0 bit is just a way to block off code that you only want to run if this file is being run directly from the command line. __FILE__ is the name of the current file, $0 is the command that was executed by the shell (though it's smart enough to drop the ruby), so comparing the two tells you precisely that: is this the file that was executed from the command line? This is sometimes done by coders who want to define a class/module in a file and also provide a command-line utility that uses it. IMHO that's not very good project structure, but just like anything there are use cases where doing it makes perfect sense.
If you want to be able to execute your code directly, you can add a shebang line
#!/usr/bin/env ruby
# rest of somefile.rb
and make it executable with chmod +x somefile.rb (optionally rename it without the .rb extension). This doesn't really change your situation. The if __FILE__ == $0 still works and still probably isn't necessary.
Edit
As @steenslag correctly points out, the top-level scope in Ruby is an Object called main. It has slightly funky behavior, though:
irb
>> self
=> main
>> self.class
=> Object
>> main
NameError: undefined local variable or method `main' for main:Object
from (irb):8
Don't worry about this until you start to dig much deeper into the language. If you do want to learn lots more about this kind of stuff, Metaprogramming Ruby is a great read :)
No, there isn't such a structure. Of course you can define a main function, but it won't be called until you call it yourself. Ruby executes line by line, so if you want to print 'Hello World' you simply write:
puts 'Hello World'
The question that you mentioned is about using one file as both a module and an executable, so if you write
if __FILE__ == $0
  # your code
end
it will be called only if you run this file directly. If you only require it from another file, this code will never run. But IMHO it's a bad idea; a better option is to use RubyGems and declare the executables there.
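For a single-file script, the guard described above can be combined with an explicit entry-point method. This is a minimal sketch: greeting and main are hypothetical names chosen for the example, and $PROGRAM_NAME is simply the built-in alias for $0.

```ruby
# Hypothetical helper that the rest of the file (or another file) can reuse.
def greeting(name)
  "Hello, #{name}!"
end

# Hypothetical entry point; Ruby does not call this automatically.
def main(argv)
  puts greeting(argv.first || "World")
end

# Runs main only when this file is executed directly, not when it is required.
main(ARGV) if __FILE__ == $PROGRAM_NAME
```

Requiring this file from another script would define greeting and main without printing anything.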
Actually there is a main, but it is not a method; it's the top-level object that is the initial execution context of a Ruby program.
class Foo
  p self
end
#=> Foo
p self
#=> main
def foo
  p self
end
foo
#=> main
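The same object can also be reached from anywhere through the built-in TOPLEVEL_BINDING constant, which makes it easier to inspect. The sketch below only assumes the variable names and the constant SELF_IN_CLASS_BODY, which are made up for the example.

```ruby
# The top-level object, fetched explicitly instead of relying on the current self.
top = TOPLEVEL_BINDING.receiver

top_name  = top.inspect  # how the object describes itself
top_class = top.class    # it is a plain instance of Object

class Foo
  # Inside a class body, self is the class itself.
  SELF_IN_CLASS_BODY = self
end

puts top_name
puts top_class
puts Foo::SELF_IN_CLASS_BODY
```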
There is no magic main function in Ruby. See http://en.wikipedia.org/wiki/Main_function#Ruby
If you wish to run Ruby scripts like C compiled files, do the following:
#!/usr/bin/env ruby
puts "Hello"
and then chmod a+x file_name.rb. Everything below the first line will be run, as if it were the contents of main in C. Of course, class and function definitions won't give you any results until they are instantiated/invoked (although the code inside class definitions is actually evaluated, so you could get some output, but this is not expected in normal circumstances).
Another way to write a main() method is:
class HelloWorld
  def initialize(name)
    @name = name
  end
  def sayHello()
    print "Hello #{@name}!"
  end
end
def main()
  helloWorld = HelloWorld.new("Alice")
  helloWorld.sayHello
end
main

How do I replace an executable with a mock executable in a test?

Can I replace an executable (accessed via a system call from ruby) with an executable that expects certain input and supplies the expected output in a consistent amount of time? I'm mainly operating on Mac OSX 10.6 (Snow Leopard), but I also have access to Linux and Windows. I'm using MRI ruby 1.8.7.
Background: I'm looking at doing several DNA sequence alignments, one in each thread. When I try using BioRuby for this, either BioRuby or the tempfile library from Ruby's standard library sometimes raises exceptions (which is better than failing silently!).
I set up a test that reproduces the problem, but only some of the time. I assume the main sources of variability between tests are the threading, the tempfile system, and the executable used for alignment (ClustalW). Since ClustalW probably isn't malfunctioning, but can be a source of variability, I'm thinking that eliminating it may aid reproducibility.
For those thinking select isn't broken - that's what I'm wondering too. However, according to the changelog, there was concern about tempfile's thread safety in August 2009. Also, I've checked on the BioRuby mailing list whether I'm calling the BioRuby code correctly, and that seems to be the case.
I really don't understand what the problem is or what exactly you are after. Can't you just write something like
#!/bin/sh
# Test for the expected input
if [ "$*" != "expected input" ]; then
  echo "expected output for failure"
  exit 1
fi
# Have it work in a consistent amount of time
CONSISTENT_AMOUNT_OF_TIME=20
sleep "$CONSISTENT_AMOUNT_OF_TIME"
echo "expected output"
You can. In cases where I'm writing a functional test for program A, I may need to "mock" a program, B, that A runs via system. What I do then is to make program B's pathname configurable, with a default:
class ProgramA
  def initialize(argv)
    @args = ParseArgs(argv)
    @config = Config.new(@args.config_path || default_config_path)
  end
  def run
    command = [
      program_b_path,
      '--verbose',
      '--do_something_wonderful',
    ].join(' ')
    system(command)
    ...
  end
  def program_b_path
    @config.fetch('program_b_path', default_program_b_path)
  end
end
Program A takes a switch, "--config PATH", which can override the default config file path. The test sets up a configuration file in /tmp:
program_b_path: /home/wayne/project/tests/mock_program_b.rb
And passes to program A that configuration file:
program_a.rb --config /tmp/config.yaml
Now program A will run not the real program B, but the mock one.
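A compact way to try the same idea is to inject the command itself rather than read it from a config file. The sketch below is an illustration under stated assumptions, not the setup described above: AlignmentRunner and mock_aligner.rb are hypothetical names, and the mock is launched through the current Ruby interpreter so that it does not depend on file permissions.

```ruby
require "open3"
require "rbconfig"
require "tmpdir"

# Hypothetical wrapper: the external program's command line is injected,
# so a test can substitute a mock for the real executable.
class AlignmentRunner
  def initialize(command)
    @command = command # array of command parts, e.g. ["clustalw"]
  end

  def run(argument)
    stdout, status = Open3.capture2(*@command, argument)
    raise "aligner failed" unless status.success?
    stdout
  end
end

mock_output = Dir.mktmpdir do |dir|
  # The mock checks its input and always answers with the same output.
  mock_path = File.join(dir, "mock_aligner.rb")
  File.write(mock_path, <<~RUBY)
    abort "unexpected input" unless ARGV == ["expected input"]
    puts "expected output"
  RUBY

  AlignmentRunner.new([RbConfig.ruby, mock_path]).run("expected input")
end

puts mock_output
```

Production code would construct the runner with the real executable's command, while tests pass the mock's.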
Have you tried the Mocha gem? It's used a lot for testing, and it fits what you describe. It "fakes" a method call on an object (which includes just about anything in Ruby) and returns the result you want without actually running the method. Take this example file:
# test.rb
require 'rubygems'
require 'mocha'
self.stubs(:system).with('ls').returns('monkey')
puts system('ls')
Running this script outputs "monkey" because I stubbed out the system call. You can use this to bypass parts of an application you don't want to test, factoring out the irrelevant parts.

Extraordinarily Simple Ruby Question: Where's My Class?

[I'm just starting with Ruby, but "no question is ever too newbie," so I trudge onwards...]
Every tutorial and book I see goes from Ruby with the interactive shell to Ruby on Rails. I'm not doing Rails (yet), but I don't want to use the interactive shell. I have a class file (first_class.rb) and a Main (main.rb). If I run the main.rb, I of course get the uninitialized constant FirstClass. How do I tell ruby about the first_class.rb?
The easiest way is to put them both in the same file.
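However, you can also load the file explicitly. On Ruby 1.9.2 and later the current directory is no longer on the load path, so a bare require 'first_class' will not find it; use require_relative (or require './first_class') instead, e.g.:
require_relative 'first_class'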
You can also use autoload as follows:
autoload :FirstClass, 'first_class'
This code will automatically load first_class.rb as soon as FirstClass is used. Note, however, that the current implementations of autoload are not thread safe (see http://www.ruby-forum.com/topic/174036).
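The lazy behaviour of autoload can be observed with Module#autoload?, which returns the registered path while the constant is still pending and nil once the file has been loaded. This is a minimal sketch: the temporary first_class.rb and the states hash are created only for the example, and the file is registered by absolute path so the load path does not matter.

```ruby
require "tmpdir"

states = {}

Dir.mktmpdir do |dir|
  # Stand-in for first_class.rb from the question.
  path = File.join(dir, "first_class.rb")
  File.write(path, <<~RUBY)
    class FirstClass
      def greet
        "hello from FirstClass"
      end
    end
  RUBY

  # Register the file; nothing is loaded yet.
  Object.autoload(:FirstClass, path)
  states[:pending_before] = !Object.autoload?(:FirstClass).nil?

  # The first reference to the constant triggers the load.
  states[:greeting] = FirstClass.new.greet
  states[:pending_after] = !Object.autoload?(:FirstClass).nil?
end

p states
```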
There's another point worth noting: you wouldn't typically use a main file in ruby. If you're writing a command line tool, standard practice would be to place the tool in a bin subdirectory. For normal one-off scripts the main idiom is:
if __FILE__ == $0
  # main goes here
  # `__FILE__` contains the name of the file the statement is contained in
  # `$0` contains the name of the script called by the interpreter
  #
  # if the file was `required`, i.e. is being used as a library
  # the code isn't executed.
  # if the file is being passed as an argument to the interpreter, it is.
end
