Capturing the spec filename in a ruby DSL implementation? - ruby

I am writing a ruby DSL that will be used to code-generate a number of Objective-C++ functions. I would like the name of each function to be derived from the name of its ruby DSL source file.
For example, given this source file clusterOptions.rb:
require './vMATCodeMonkey'
VMATCodeMonkey.new(:print).options_processor <<EOS
-cutoff: flag: set('useCutoff', true), arg: vector('double')
-depth: flag: set('useInconsistent', true), arg: scalar('double', default: 2.0)
-maxclust: flag: set('useCutoff', false), arg: vector('index')
EOS
When the VMATCodeMonkey.new(:print) expression is evaluated I would ideally somehow like the new object to capture the clusterOptions.rb source filename. Is that possible?
And if (as I suspect) it is not, is there a good idiom for accomplishing this functionality [e.g. making the source file name effectively part of the specification captured by a DSL] in ruby?
[While I suspect it's not possible to do exactly as I've described, I ask anyway, because I've been surprised by ruby's obscure capabilities more than once.]
EDIT: I'm aware of __FILE__; what I'm looking for is some DSL-centric way of capturing the name of a DSL source file without explicitly mentioning __FILE__ in the DSL source. Hmm, and now that I'm trying to explain it, maybe crawling up a stack trace from the class initialize method?
Solution
With thanks to tadman, here is my VMATCodeMonkey#initialize method:
def initialize(out_opt = :print)
  @caller_file = caller(1)[0].split(':')[0]
  case out_opt
  when :pbcopy
    @out = IO.popen('pbcopy', 'w')
  when :print
    @out = $stdout
  else
    raise ArgumentError, "#{out_opt} is not an option!"
  end
  @out.puts "// vMATCodeMonkey's work; do not edit by hand!\n\n"
  initialize_options_processor
end
And here's what it captures:
@caller_file = "/Users/Shared/Source/vMAT/ruby/clusterOptions.rb"

The full path to the source file being evaluated is stored in __FILE__. If you want just the filename, you'd use:
File.basename(__FILE__)
A __FILE__ construct also exists in C, C++ and Perl, among others; Python's counterpart is the lowercase module attribute __file__.
If you need to know what file made the call to the currently running routine, this could work:
caller(1)[0].split(':')[0]
This presumes your filenames do not have : in them, but in most cases that should be a fairly safe assumption. You'll also need to call this at the entry point into your library. If it's a method deeper in the stack, test caller(2) and so on.
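If the colon assumption is a concern (Windows drive letters, for instance), caller_locations (Ruby 2.0 and later) returns location objects with a path method, so no string splitting is needed. A minimal sketch; SpecRecorder is a hypothetical stand-in for the DSL's entry-point class, not part of the original code:

```ruby
# Hypothetical stand-in for a DSL entry-point class; the name is illustrative.
class SpecRecorder
  attr_reader :caller_file

  def initialize
    # Frame 1 belongs to the code that invoked SpecRecorder.new, so its
    # path is the file that contains the DSL source.
    location = caller_locations(1, 1).first
    @caller_file = File.expand_path(location.path)
  end
end

recorder = SpecRecorder.new
puts File.basename(recorder.caller_file)
```

As with caller, this has to run at the entry point of the library; from a method deeper in the stack the frame index would need adjusting.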

Related

Handle ARGV in Ruby without if...else block

In a blog post about unconditional programming Michael Feathers shows how limiting if statements can be used as a tool for reducing code complexity.
He uses a specific example to illustrate his point. Now, I've been thinking about other specific examples that could help me learn more about unconditional/ifless/forless programming.
For example in this cat clone there is an if..else block:
#!/usr/bin/env ruby
if ARGV.length > 0
ARGV.each do |f|
puts File.read(f)
end
else
puts STDIN.read
end
It turns out ruby has ARGF which makes this program much simpler:
#!/usr/bin/env ruby
puts ARGF.read
I'm wondering if ARGF didn't exist how could the above example be refactored so there is no if..else block?
Also interested in links to other illustrative specific examples.
Technically you can,
inputs = { ARGV => ARGV.map { |f| File.open(f) }, [] => [STDIN] }[ARGV]
inputs.map(&:read).map(&method(:puts))
Though that's code golf and too clever for its own good.
Still, how does it work?
It uses a hash to store two alternatives.
Map ARGV to an array of open files
Map [] to an array with STDIN, effectively overwriting the ARGV entry if it is empty
Access ARGV in the hash, which returns [STDIN] if it is empty
Read all open inputs and print them
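The key collision is easier to see with the I/O taken out. This is only an illustrative sketch: pick_inputs is a made-up helper, and strings and a symbol stand in for the open files and STDIN.

```ruby
# Stand-in for the ARGV trick so it can run without command-line arguments.
def pick_inputs(args)
  # When args is empty, both keys are equal ([] == []), so the later
  # entry replaces the earlier one and the lookup returns [:stdin].
  { args => args.map { |name| "file:#{name}" }, [] => [:stdin] }[args]
end

p pick_inputs([])               # => [:stdin]
p pick_inputs(%w[a.txt b.txt])  # => ["file:a.txt", "file:b.txt"]
```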
Don't write that code though.
As mentioned in my answer to your other question, unconditional programming is not about avoiding if expressions at all costs but about striving for readable and intention revealing code. And sometimes that just means using an if expression.
You can't always get rid of a conditional (maybe with an insane number of classes) and Michael Feathers isn't advocating that. Instead it's sort of a backlash against overuse of conditionals. We've all seen nightmare code that's endless chains of nested if/elsif/else and so has he.
Moreover, people do routinely nest conditionals inside of conditionals. Some of the worst code I've ever seen is a cavernous nightmare of nested conditions with odd bits of work interspersed within them. I suppose that the real problem with control structures is that they are often mixed with the work. I'm sure there's some way that we can see this as a form of single responsibility violation.
Rather than slavishly try to eliminate the condition, you could simplify your code by first creating an array of IO objects from ARGV, and use STDIN if that list is empty.
io = ARGV.map { |f| File.new(f) }
# 0 is truthy in Ruby, so test for emptiness rather than negating the length
io = [STDIN] if io.empty?
Then your code can do what it likes with io.
While this has strictly the same number of conditionals, it eliminates the if/else block and thus a branch: the code is linear. More importantly, since it separates gathering data from using it, you can put it in a function and reuse it further reducing complexity. Once it's in a function, we can take advantage of early return.
# I don't have a really good name for this, but it's a
# common enough idiom. Perl provides the same feature as <>
def arg_files
  # 0 is truthy in Ruby, so check empty? instead of length
  return ARGV.map { |f| File.new(f) } unless ARGV.empty?
  [STDIN]
end
Now that it's in a function, your code to cat all the files or stdin becomes very simple.
arg_files.each { |f| puts f.read }
First, although the principle is good, you have to consider other things that are more important, such as readability and perhaps speed of execution.
That said, you could monkeypatch the String class to add a read method, put the arguments and STDIN in one array, and read a slice of it from index 0 to ARGV.length - 1. With arguments, the slice stops just before STDIN; without arguments, the end index is -1, so the slice runs to the end of the array and contains only STDIN.
class String
def read
File.read self if File.exist? self
end
end
puts [*ARGV, STDIN][0..ARGV.length-1].map{|a| a.read}
Before someone points out that I still use an if to check whether the file exists: your example should have made the same check (which would take two ifs), or else used a rescue to inform the user properly.
EDIT: if you do use the patch, read about the possible problems at these links:
http://blog.jayfields.com/2008/04/alternatives-for-redefining-methods.html
http://www.justinweiss.com/articles/3-ways-to-monkey-patch-without-making-a-mess/
Since the read method isn't part of String, the solutions using alias and super are not necessary. If you plan to use a Module, here is how to do that:
module ReadString
def read
File.read self if File.exist? self
end
end
class String
include ReadString
end
EDIT: just read about a safe way to monkey patch, for your documentation see https://solidfoundationwebdev.com/blog/posts/writing-clean-monkey-patches-fixing-kaminari-1-0-0-argumenterror-comparison-of-fixnum-with-string-failed?utm_source=rubyweekly&utm_medium=email

Ruby code blocks and Chef

I am an extremely new person to Ruby and Chef. I have been trying to wrap my head around the syntax and do some research, but I am sure as you all know unless one knows the terminology, it is hard to find what you are looking for.
I have read up on Ruby code blocks, but the Chef code blocks still confuse me. I see something like this for example:
log "a debug string" do
level :debug
end
Which adds "a debug string" to the log. From what I have seen though, it seems to me like it should be represented as:
log do |message|
#some logic
end
Chef refers to these as resources. Can someone please help explain the syntax difference and give me some terminology from which I can start to educate myself with?
If you come from another language (not Ruby), this syntax might seem very strange. Let's break down things.
When calling a method with parameters, in most cases the parentheses are optional:
foo(bar) is equivalent to foo bar
foo(bar, baz) is equivalent to foo bar, baz
A Ruby block of code can be wrapped in curly braces ({}) or in do..end, and can be passed to a method as its last parameter (but note that there's no comma, and if you're using parentheses the block goes after them). Some examples:
foo(bar) {
# code here
}
foo(bar) do
# code here
end
foo bar do
# code here
end
foo do
# code here
end
In some cases, code blocks can receive parameters, but in Chef the resources' blocks never do. Just for reference, the syntax for that is:
foo(bar) do |baz, qux|
baz + qux
end
Specifically about Chef resources, their syntax is usually:
resource_type(name) do
attribute1 value1
attribute2 value2
end
This means that, when you say:
log "a debug string" do
level :debug
end
you're actually creating a log resource whose name attribute is set to "a debug string". It can later be referred to (in other resources, for example) using log[a debug string].
AFAIK, the name attribute is mandatory for every Chef resource type as it's what makes it unique, and allows you to, among other things, call actions on it after it has been declared.
Side note: The ruby block is usually optional for a Chef resource. If you do something like:
directory "/some/path"
Chef will compile that resource using its default attributes (among which is action :create), and try to create the named directory using those.
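To see how plain Ruby can produce this kind of syntax, here is a toy version of a resource. It is a sketch built on instance_eval, not Chef's actual implementation; LogResource and its :info default are assumptions made up for the illustration.

```ruby
# Hypothetical miniature of a resource DSL; not Chef's actual code.
class LogResource
  attr_reader :name

  def initialize(name)
    @name = name
    @level = :info # assumed default for this sketch
  end

  # Called with an argument it sets the attribute; without one it reads it.
  def level(value = nil)
    @level = value unless value.nil?
    @level
  end
end

# The "resource type" is an ordinary method taking a name and an optional block.
def log(name, &block)
  resource = LogResource.new(name)
  # instance_eval runs the block with self set to the resource, so a bare
  # `level :debug` inside the block calls resource.level(:debug).
  resource.instance_eval(&block) if block
  resource
end

res = log "a debug string" do
  level :debug
end
puts "#{res.name} => #{res.level}"
```

This is why the block takes no parameters: the attribute calls inside it are resolved against the resource object itself.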
Syntactically the do ... end here is an ordinary Ruby block, but it is not used like one.
It's part of an implementation of a DSL (Domain Specific Language).
Here's a nice explanation [1]:
there is the concept of an internal DSL, which uses the syntax of an
existing language, a host language, such as Ruby. The means of the
language are used to build constructs resembling a distinct language.
The, already mentioned, Rake uses this to make code like this
possible:
task :codeGen do
# do the code generation
end
Hope that answers your question.
[1] : http://www.infoq.com/news/2007/06/dsl-or-not

From an included file, how can I get the filename of the file doing the including?

Apologies for the poorly worded question title - no idea how to put it better!
In the following code, when I execute ruby bar.rb, how can I make it output bar.rb, rather than foo.rb?
In foo.rb:
module Foo
def filename
__FILE__
end
end
In bar.rb:
require_relative 'foo'
include Foo
puts filename # outputs 'foo.rb'
This is for a library function that, each time some code is executed, records the location (and git ref) of that code.
Your question stimulated me to crack open the Ruby interpreter source and see how __FILE__ actually works. The answer is pretty interesting: it's implemented right inside the parser. The lexer has a special token type for __FILE__. When the parser sees that token, it converts it to a string constant, which contains the name of the file the parser is working on.
From line 14948 of ext/ripper/ripper.c:
case keyword__FILE__:
return NEW_STR(rb_external_str_new_with_enc(ruby_sourcefile, strlen(ruby_sourcefile),
rb_filesystem_encoding()));
I think this should make it clear that trying to make __FILE__ return the name of the including file is completely impossible, unless you hack the Ruby interpreter source, or write your own preprocessor which transforms __FILE__ to something else before passing the Ruby source to the interpreter!
There is a trick you might be able to use. If you pass a block to the method, you can use the block to determine its source file. Something like:
def filename(&blk)
  # source_location is [path, line] for the place the block was defined
  blk.source_location.first
end
But again, that means you have to pass a block.
Honestly, I wonder what you are trying to accomplish, because outside of making some common core extension method, this is probably something you really don't want to do.
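While __FILE__ itself is fixed at parse time, the call stack is not, so the library method can ask who called it instead. A minimal sketch using caller_locations (Ruby 2.0 and later), assuming filename is called directly from the file you want reported:

```ruby
module Foo
  # Returns the path of the file that called this method,
  # rather than the file in which the method is defined.
  def filename
    caller_locations(1, 1).first.path
  end
end

include Foo
puts filename
```

If the call goes through extra wrapper methods, the frame index would have to be increased accordingly.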

Is there something like a null-stream in Ruby?

I could use:
File.open('/dev/null', 'w')
on Unix systems, but if there is a Ruby way to achieve this, I'd like to use it. I am just looking for an I/O stream, that immediately "trashes" all writes, kind of like a null-object.
If you want the full behavior of streams, the best is probably to use:
File.open(File::NULL, "w")
Note that File::NULL is new to Ruby 1.9.3; you can use my backports gem:
require 'backports/1.9.3/file/null' # => Won't do anything in 1.9.3+
File.open(File::NULL, "w") # => works even in Ruby 1.8.6
You could also copy the relevant code if you prefer.
There's StringIO, which I find useful when I want to introduce a dummy filestream:
require "stringio"
f = StringIO.new
f.gets # => nil
And here's some code from heckle that finds the bit bucket for both Unix and Windows, slightly modified:
# Is this platform MS Windows-like?
# Actually, I suspect the following line is not very reliable.
WINDOWS = RUBY_PLATFORM =~ /mswin/
# Path to the bit bucket.
NULL_PATH = WINDOWS ? 'NUL:' : '/dev/null'
No, I don't believe there is anything like a null stream in Ruby, at least in earlier versions. In that case, you must make one yourself. Depending on the methods that will be called on it, you will need to write stub methods on the null stream class, like this:
class NullStream
def <<(o); self; end
end
The above example is by no means complete. For example, some streams may require the write, puts or other methods. Moreover, some methods, like <<, should return self; others should not.
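A slightly fuller sketch along the same lines follows. The choice of methods is an assumption about what typical callers need, not a complete IO interface; the return values mirror what the corresponding IO methods return.

```ruby
# A sketch of a fuller null stream; the method list is an assumption,
# not an exhaustive IO interface.
class NullStream
  # IO#write returns the number of bytes written, so mirror that.
  def write(*chunks)
    chunks.sum { |chunk| chunk.to_s.bytesize }
  end

  # Returning self allows chaining: stream << "a" << "b"
  def <<(_obj)
    self
  end

  def puts(*_args)
    nil
  end

  def print(*_args)
    nil
  end

  def flush
    self
  end

  def close
    nil
  end
end

out = NullStream.new
out << "discarded" << "also discarded"
out.puts "nothing is written anywhere"
```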
Logger.new("/dev/null") does the trick
There's a gem called devnull
Ruby implementation of null file (like /dev/null on Un*x, NUL on
Windows)
It doesn't interact with the null file, but instead has dummy methods for all the methods that IO objects implement.

When is `eval` in Ruby justified?

"Is 'eval' supposed to be nasty?" inspired this one:
Almost everybody agrees that eval is bad, and that in most cases there is a more elegant/safer replacement.
So I wanted to ask: if eval is misused that often, is it really needed as a language feature? Is it doing more evil than good?
Personally, the only place I find it useful is to interpolate strings provided in config file.
Edit: The intention of this question is to get as many real-life cases as possible when eval is the only or the best solution. So please, don't go into "should a language limit a programmer's creativity" direction.
Edit2: And when I say eval, of course I refer to evaling string, not passing ruby block to instance_eval or class_eval.
The only case I know of (other than "I have this string and I want to execute it") is dynamically dealing with local and global variables. Ruby has methods to get the names of local and global variables, but it lacks methods to get or set their values based on these names. The only way to do that, AFAIK, is with eval.
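For local variables, newer Rubies (2.1 and later) do provide such methods on Binding: local_variable_get, local_variable_set and local_variable_defined?. A small sketch; bump is a made-up helper used only for the illustration:

```ruby
# Reads and updates a local variable whose name is only known at run time,
# without string eval.
def bump(var_name)
  counter = 10
  scope = binding
  scope.local_variable_set(var_name, scope.local_variable_get(var_name) + 1)
  counter
end

puts bump(:counter) # => 11
```

Global variables have no equivalent accessor that I know of, so that half of the argument still stands.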
Any other use is almost certainly wrong. I'm no guru and can't state categorically that there are no others, but every other use case I've ever seen where somebody said "You need eval for this," I've found a solution that didn't.
Note that I'm talking about string eval here, by the way. Ruby also has instance_eval, which can take either a string or a block to execute in the context of the receiver. The block form of this method is fast, safe and very useful.
When is it justified? I'd say when there's no reasonable alternative. I was able to think of one use where I can't think of an alternative: irb, which, if you dig deep enough (to workspace.rb, around line 80 in my copy if you're interested) uses eval to execute your input:
def evaluate(context, statements, file = __FILE__, line = __LINE__)
  eval(statements, @binding, file, line)
end
That seems pretty reasonable to me - a situation where you specifically don't know what code you're going to have to execute until the very moment that you're asked to do so. Something dynamic and interactive seems to fit the bill.
The reason eval is there is because when you need it, when you really need it, there are no substitutes. There's only so much you can do with creative method dispatching, after all, and at some point you need to execute arbitrary code.
Just because a language has a feature that might be dangerous doesn't mean it's inherently a bad thing. When a language presumes to know more than its user, that's when there's trouble.
I'd argue that when you find a programming language devoid of danger, you've found one that's not very useful.
When is eval justified? In pragmatic terms, when you say it is. If it's your program and you're the programmer, you set the parameters.
There is one very important use-case for eval() which cannot (AFAIK) be achieved using anything else, and that is to find the corresponding object reference for a binding.
Say you have been passed a block but (for some reason) you need access to object context of the binding, you would do the following:
obj = eval('self', block.binding)
It is also useful to define the following:
class Proc
def __context__
eval('self', self.binding)
end
end
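On Ruby 2.2 and later, Binding#receiver returns the same object without going through eval. A short sketch; Widget is a hypothetical class used only to create a block with a known self:

```ruby
# Hypothetical class whose method hands back a block bound to the instance.
class Widget
  def make_block
    proc { :unused }
  end
end

widget = Widget.new
block = widget.make_block

# Both expressions yield the object that was self where the block was created.
puts block.binding.receiver.equal?(widget)      # => true
puts eval('self', block.binding).equal?(widget) # => true
```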
IMO mostly for Domain Specific Languages.
"Evaluation Options in Ruby" is an article by Jay Fields about it on InfoQ.
eval is a tool, it is neither inherently good nor evil. It is justified whenever you are certain it is the right tool for what you are trying to accomplish.
A tool like eval is about evaluating code at runtime vs. "compile" time. Do you know what the code is when you launch Ruby? Then you probably don't need eval. Is your code generating code during runtime? Then you probably need to eval it.
For example, the methods/functions needed in a recursive descent parser depend on the language being parsed. If your application builds such a parser on-the-fly, then it might make sense to use eval. You could write a generalized parser, but it might not be as elegant a solution.
"Programatically filling in a letrec in Scheme. Macros or eval?" is a question I posted about eval in Scheme, where its use is mostly unavoidable.
In general eval is a useful language feature when you want to run arbitrary code. This should be a rare thing but maybe you are making your own REPL or you want to expose the ruby run-time to the end user for some reason. It could happen and that is why the feature exists. If you are using it to work around some part of the language (e.g. global variables) then either the language is flawed or your understanding of the language is flawed. The solution is typically not to use eval but to either better understand the language or pick a different language.
It's worth noting that in Ruby particularly, instance_eval and class_eval have other uses.
You very likely use eval on a regular basis without even realizing it; it's how rubygems loads the contents of a Gemspec. Via rubygems/lib/specification.rb:
# Note: I've removed some lines from that listing to illustrate the core concept
def self.load(file)
code = File.read(file)
begin
_spec = eval code, binding, file # <-------- EVAL HAPPENS HERE
if Gem::Specification === _spec
return _spec
end
warn "[#{file}] isn't a Gem::Specification (#{_spec.class} instead)."
rescue SignalException, SystemExit
raise
rescue SyntaxError, Exception => e
warn "Invalid gemspec in [#{file}]: #{e}"
end
nil
end
Typically, a gem specification would look like this:
Gem::Specification.new do |s|
s.name = 'example'
s.version = '0.1.0'
s.licenses = ['MIT']
s.summary = "This is an example!"
s.description = "Much longer explanation of the example!"
s.authors = ["Ruby Coder"]
s.email = 'rubycoder@example.com'
s.files = ["lib/example.rb"]
s.homepage = 'https://rubygems.org/gems/example'
s.metadata = { "source_code_uri" => "https://github.com/example/example" }
end
Note that the gemspec file simply creates a new object but does not assign it nor send it anywhere.
Trying to load or require this file (or even executing it with Ruby) will not return the Gem::Specification value. eval is the only way to extract the value defined by an external ruby file.
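The difference can be shown with a throwaway file. In this sketch Spec is a hypothetical stand-in for Gem::Specification and load_spec is a made-up helper; only eval, load and Tempfile are real library calls.

```ruby
require "tempfile"

# Hypothetical stand-in for Gem::Specification.
Spec = Struct.new(:name, :version)

def load_spec(path)
  # Passing the path as the third argument makes __FILE__ and backtraces
  # inside the evaluated code refer to the spec file.
  eval(File.read(path), binding, path)
end

Tempfile.create(["example", ".rb"]) do |file|
  file.write(%(Spec.new("example", "0.1.0")\n))
  file.flush

  spec = load_spec(file.path)
  puts "#{spec.name} #{spec.version}" # value of the file's last expression
  puts load(file.path)                # load only reports success: true
end
```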
One use of eval is compiling another language to ruby:
ruby_code = "(def foo (f a b) (mapv f (cons a b)))".compile_to_ruby
# "foo_proc = ->(f, a, b) { mapv_proc.call(f, cons_proc.call(a, b)) }"
eval ruby_code
I use a 3D modeling software that implemented Ruby for writing custom text macros. In that software we are given access to model data in the form of name:value pairs accessed using the following format:
owner.name
#=> value
So for a 36 inch tall cabinet, I could access the height and convert its value to feet like so:
owner.height.to_f / 12
The main problem is that objects in that software have no unique identifiers aside from something called their schedule_number. If I want to name a variable using the schedule_number in the variable name so that I can call and use that value elsewhere, the only possible way I know to do that is by using eval:
eval "#{owner.schedule_number} = owner.height"

Resources