Ruby: nested regular expressions and string replace - ruby

I'm using CodeRay for syntax highlighting, but I'm having trouble with this regular expression. The text will look like this:
<pre><code>:::ruby
def say_hello
puts 'hello!'
end
</code></pre>
This part: :::ruby will tell CodeRay which language the code block should be interpreted as (but it needs to be optional). So here's what I have so far:
def coderay(text)
  text.gsub(/\<pre\>\<code\>(.+?)\<\/code\>\<\/pre\>/m) do
    CodeRay.scan($2, $3).div()
  end
end
$2 contains the code that I'm formatting (including the line that says which language to format it in), but I need to extract that first line so I can pass it as the second parameter to scan() or pass it a default parameter if that language line wasn't found. How can I do this?

In Ruby 1.9, using named groups:
DEFAULT_LANG = :ruby
def coderay(text)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    if $~[:lang].nil?
      lang = DEFAULT_LANG
    else
      lang = $~[:lang].intern
    end
    CodeRay.scan($~[:code], lang).div()
  end
end
The default is a constant here because a def body cannot see local variables from the enclosing scope. It could also be a class or instance variable, depending on the context of coderay.
Same, but using an inline expression to handle the optional language:
DEFAULT_LANG = :ruby
def coderay(text)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    CodeRay.scan($~[:code], $~[:lang].nil? ? DEFAULT_LANG : $~[:lang].intern).div()
  end
end
The second option is a little messier, hence you might want to avoid it.
It turns out named groups in a non-matching optional group are still counted in Ruby, so handling unmatched numbered groups isn't any different from unmatched named groups, unlike what I first thought. You can thus replace the named group references with positional references in the above and it should work the same.
DEFAULT_LANG = :ruby
def coderay(text)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    CodeRay.scan($2, $1.nil? ? DEFAULT_LANG : $1.intern).div()
  end
end
Or, with the conditional written out:
def coderay(text)
  text.gsub(%r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m) do
    if $1.nil?
      lang = DEFAULT_LANG
    else
      lang = $1.intern
    end
    CodeRay.scan($2, lang).div()
  end
end
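To check the optional-language pattern without CodeRay installed, here is a minimal sketch that only extracts the language and the code. The names CODE_BLOCK, FALLBACK_LANG and code_blocks are made up for this illustration; the regular expression is the one from the answer above.

```ruby
# Same pattern as above: an optional ":::lang" line, then the code itself.
CODE_BLOCK = %r!<pre><code>(?::{3}(?<lang>\w+)\s+)?(?<code>.+?)</code></pre>!m
FALLBACK_LANG = :ruby # hypothetical default, used when no ":::lang" line is present

# Hypothetical helper: returns [language, code] pairs instead of calling CodeRay.
def code_blocks(text)
  text.scan(CODE_BLOCK).map do |lang, code|
    [lang ? lang.to_sym : FALLBACK_LANG, code]
  end
end

with_lang    = "<pre><code>:::python\nprint('hi')\n</code></pre>"
without_lang = "<pre><code>puts 'hi'\n</code></pre>"

p code_blocks(with_lang)    # => [[:python, "print('hi')\n"]]
p code_blocks(without_lang) # => [[:ruby, "puts 'hi'\n"]]
```

When the ":::lang" line is missing, the optional group simply does not participate in the match, so its capture is nil and the fallback is used.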

Related

How to create and use variables dynamically named by string values in Ruby?

I'm using SitePrism to create some POM tests. One of my page classes looks like this:
class HomePage < SitePrism::Page
  set_url '/index.html'
  element :red_colour_cell, "div[id='colour-cell-red']"
  element :green_colour_cell, "div[id='colour-cell-green']"
  element :blue_colour_cell, "div[id='colour-cell-blue']"
  def click_colour_cell(colour)
    case colour
    when 'red'
      has_red_colour_cell?
      red_colour_cell.click
    when 'green'
      has_green_colour_cell?
      green_colour_cell.click
    when 'blue'
      has_blue_colour_cell?
      blue_colour_cell.click
    end
  end
end
The method click_colour_cell() gets its string value from a Capybara test step that calls it.
If I need to create additional similar methods in the future, it can become rather tedious and unwieldy having so many case switches to determine the code flow.
Is there some way I can create a variable that is dynamically named by the string value of another variable? For example, I would like to do something for click_colour_cell() that resembles the following:
def click_colour_cell(colour)
  has_@colour_colour_cell?
  @colour_colour_cell.click
end
where @colour represents the value of the passed argument, colour, and would be interpreted by Ruby as:
def click_colour_cell('blue')
  has_blue_colour_cell?
  blue_colour_cell.click
end
Isn't this what instance variables are used for? I've tried the above proposal as a solution, but I receive the ambiguous error:
syntax error, unexpected end, expecting ':'
end
^~~ (SyntaxError)
If it is an instance variable that I need to use, then I'm not sure I'm using it correctly. If it's something else I need to use, please advise.
Instance variables are used to define the properties of an object, so they are not what you need here.
Instead, you can achieve this with the send method and string interpolation.
Try the following:
def click_colour_cell(colour)
  send("has_#{colour}_colour_cell?")
  send("#{colour}_colour_cell").click
end
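If the colour string can come from anywhere, a slightly more defensive variant may be worth considering. The sketch below uses plain Ruby stand-ins (FakeCell and FakePage are hypothetical and not part of SitePrism) and assumes you want an error for unknown colours and only want to reach public methods, hence respond_to? and public_send.

```ruby
# Hypothetical stand-in for a page element.
class FakeCell
  attr_reader :clicked

  def click
    @clicked = true
  end
end

# Hypothetical stand-in for the SitePrism page.
class FakePage
  def initialize
    @cells = { 'red' => FakeCell.new, 'blue' => FakeCell.new }
  end

  def red_colour_cell
    @cells['red']
  end

  def blue_colour_cell
    @cells['blue']
  end

  def click_colour_cell(colour)
    name = "#{colour}_colour_cell"
    # Fail loudly instead of sending an arbitrary method name.
    raise ArgumentError, "unknown colour: #{colour}" unless respond_to?(name)

    public_send(name).click
  end
end

page = FakePage.new
page.click_colour_cell('blue')
p page.blue_colour_cell.clicked # => true
```

public_send refuses to call private methods, which narrows what a stray string can trigger.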
About Send:
send is the method defined in the Object class (parent class for all the classes).
As the documentation says, it invokes the method identified by the given String or Symbol. You can also pass arguments to the methods you are trying to invoke.
In the snippet below, send will search for a method named testing and invoke it.
class SendTest
  def testing
    puts 'Hey there!'
  end
end
obj = SendTest.new
obj.send("testing")
obj.send(:testing)
OUTPUT
Hey there!
Hey there!
In your case, suppose the argument passed for colour is blue.
"has_#{colour}_colour_cell?" will evaluate to the string "has_blue_colour_cell?", and send will dynamically invoke the method named has_blue_colour_cell?. The same applies to the method blue_colour_cell.
Direct answer to your question
You can dynamically get/set instance vars with:
instance_variable_get("@build_string_as_you_see_fit")
instance_variable_set("@build_string_as_you_see_fit", value_for_ivar)
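For completeness, a small sketch of those two calls in action. The Palette class and its remember/recall methods are invented for this illustration; the only real API used is instance_variable_set and instance_variable_get, both of which expect the name to start with "@".

```ruby
# Hypothetical class that stores one instance variable per colour name.
class Palette
  def remember(colour, value)
    instance_variable_set("@#{colour}_cell", value)
  end

  def recall(colour)
    instance_variable_get("@#{colour}_cell")
  end
end

palette = Palette.new
palette.remember('blue', 'div#colour-cell-blue')
p palette.recall('blue')              # => "div#colour-cell-blue"
p palette.instance_variables          # => [:@blue_cell]
```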
But...
A Warning!
I think dynamically creating variables here and/or using things like string-building method names to send are a bad idea that will greatly hinder future maintainability.
Think of it this way: any time you see method names like this:
click_blue_button
click_red_button
click_green_button
it's the same thing as doing:
add_one_to(1)   # instead of 1 + 1, i.e. 1.+(1)
add_two_to(1)   # instead of 1 + 2, i.e. 1.+(2)
add_three_to(1) # instead of 1 + 3, i.e. 1.+(3)
Instead of passing a meaningful argument into a method, you've ended up hard-coding values into the method name! Continue this and eventually your whole codebase will have to deal with "values" that have been hard-coded into the names of methods.
A Better Way
Here's what you should do instead:
class HomePage < SitePrism::Page
  set_url '/index.html'
  elements :color_cells, "div[id^='colour-cell-']"
  def click_cell(color)
    cell = color_cells.find_by(id: "colour-cell-#{color}") # just an example, I don't know how to do element queries in site-prism
    cell.click
  end
end
Or if you must have them as individual elements:
class HomePage < SitePrism::Page
  set_url '/index.html'
  COLORS = %i[red green blue]
  COLORS.each do |color|
    element :"#{color}_colour_cell", "div[id='colour-cell-#{color}']"
  end
  def cell(color:) # every other usage should call this method instead
    # index_with comes from ActiveSupport, not core Ruby
    @cells ||= COLORS.index_with do |color|
      send("#{color}_colour_cell") # do the dynamic `send` in only ONE place
    end
    @cells.fetch(color)
  end
end
home_page.cell(color: :red).click
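If ActiveSupport is not loaded, the same lookup table can be built with Array#to_h in plain Ruby (2.6 or later). This is only a sketch: CellLookup is a hypothetical stand-in for the page object, and its generated methods return strings rather than real elements.

```ruby
COLORS = %i[red green blue].freeze

# Hypothetical stand-in for the page class; define_method plays the role of `element`.
class CellLookup
  COLORS.each do |color|
    define_method("#{color}_colour_cell") { "cell-#{color}" }
  end

  def cell(color:)
    # Build the colour => element table once, with the dynamic call in one place.
    @cells ||= COLORS.to_h { |c| [c, public_send("#{c}_colour_cell")] }
    @cells.fetch(color) # raises KeyError for a colour that was never declared
  end
end

lookup = CellLookup.new
p lookup.cell(color: :red) # => "cell-red"
```

Using fetch rather than [] means a typo in the colour fails immediately instead of returning nil.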

Custom Methods for Treetop Syntax Nodes

I have a Treetop PEG grammar that matches some keys. I want to look up the values associated with those keys in a hash I give the parser. How can I make it so that the syntax nodes have access to methods or variables from the parser?
For example, here's a simple grammar that finds a single word and tries to look up its value:
# var.treetop
grammar VarResolver
  include VarLookup
  rule variable
    [a-zA-Z] [a-zA-Z0-9_]*
    {
      def value
        p found:text_value
        find_variable(text_value)
      end
    }
  end
end
Here's a test file using it:
# test.rb
require 'treetop'
module VarLookup
  def set_variables(variable_hash)
    @vars = variable_hash
  end
  def find_variable(str)
    @vars[str.to_sym]
  end
end
Treetop.load('var.treetop')
@p = VarResolverParser.new
@p.set_variables name:'Phrogz'
p @p.parse('name').value
Running this test, I get the output:
{:found=>"name"}
(eval):16:in `value': undefined method `find_variable'
for #<Treetop::Runtime::SyntaxNode:0x00007f88e091b340> (NoMethodError)
How can I make find_variable accessible inside the value method? (In the real parser, these rules are deeply nested, and need to resolve the value without returning the actual name to the top of the parse tree. I cannot just return the text_value and look it up outside.)
This is a significant weakness in the design of Treetop.
I (as maintainer) didn't want to slow it down further by
passing yet another argument to every SyntaxNode,
and break any custom SyntaxNode classes folk have
written. These constructors get the "input" object, a Range
that selects part of that input, and optionally an array
of child SyntaxNodes. They should have received the
Parser itself instead of the input as a member.
So instead, for my own use (some years back), I made
a custom proxy for the "input" and attached my Context
to it. You might get away with doing something similar:
https://github.com/cjheath/activefacts-cql/blob/master/lib/activefacts/cql/parser.rb#L203-L249
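The idea of attaching a context to the input can be illustrated in plain Ruby. This is not Treetop code: InputWithContext and FakeNode are hypothetical stand-ins, and whether Treetop hands your input object to its nodes unchanged is something to verify against the linked parser before relying on it.

```ruby
# Hypothetical input wrapper: a String that also carries a lookup context.
class InputWithContext < String
  attr_accessor :context
end

# Hypothetical stand-in for a syntax node, which only knows its input and a range.
class FakeNode
  def initialize(input, interval)
    @input = input
    @interval = interval
  end

  def text_value
    @input[@interval]
  end

  # The node reaches the variables through the input it was given.
  def value
    @input.context[text_value.to_sym]
  end
end

input = InputWithContext.new('name')
input.context = { name: 'Phrogz' }
node = FakeNode.new(input, 0...4)
puts node.value # prints Phrogz
```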

How does [ ] work on a class in Ruby

I see that I can get a list of files in a directory using
Dir["*"]
How am I supposed to read that syntax exactly? As far as I know, [ ] is used to fetch a value from an array or a hash.
How does [ ] work on a call?
[] is simply a method, like #to_s, #object_id, etc.
You can define it on any object:
class CoolClass
  def [](v)
    puts "hello #{v}"
  end
end
CoolClass.new["John"] # prints "hello John"
In your case it's defined as a singleton method, like this:
class Dir
  def self.[](v)
    ...
  end
end
From the Ruby Docs, Dir["*"] is equivalent to Dir.glob(["*"]). (As pointed out, it's syntactic sugar)
Dir isn't a call; it's a class. Instances of Dir are directory streams, but Dir["*"] doesn't involve an instance at all: it calls the class method Dir.[], which returns an ordinary array.
In your specific case, Dir["*"] will return an array of filenames that are found from the pattern passed as Dir[patternString]. "*" as a pattern will match zero or more characters, in other words, it will match everything, and thus will return an array of all of the filenames in that directory.
For your second question, you can just define it as any other method like so:
class YourClass
  def self.[](v)
    #your code here
  end
end
The method Dir::glob takes a pattern (and optional flags) and returns an array of the files and directories that match it. (From there, you can index into the array, e.g. with [0].) The pattern syntax is similar to, but not the same as, regular expressions.
From the docs, including a couple of patterns/flags that may be of interest to you:
Note that this pattern is not a regexp, it's closer to a shell glob. See File.fnmatch for the meaning of the flags parameter. Note that case sensitivity depends on your system (so File::FNM_CASEFOLD is ignored), as does the order in which the results are returned.
* - Matches any file. Can be restricted by other values in the glob. Equivalent to / .* /x in regexp.
[set] - Matches any one character in set. Behaves exactly like character sets in Regexp, including set negation ([^a-z]).
The shorthand of Dir::glob() is Dir[], although I prefer the long form. As you saw above, using brackets denotes a special pattern/flag for the argument. Here are some examples (from the docs) that may better explain this:
Dir["config.?"] #=> ["config.h"]
Dir.glob("config.?") #=> ["config.h"]
Dir.glob("*.[a-z][a-z]") #=> ["main.rb"]
Dir.glob("*") #=> ["config.h", "main.rb"]
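To try these patterns without depending on the contents of your current directory, here is a small sketch that creates two empty files in a temporary directory and globs inside it. The helper name glob_demo and the file names are chosen for this illustration only.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: builds a throwaway directory and globs inside it.
def glob_demo
  Dir.mktmpdir do |dir|
    FileUtils.touch(%w[config.h main.rb].map { |f| File.join(dir, f) })
    Dir.chdir(dir) do
      {
        all:        Dir['*'].sort,                          # sorted, since order is system-dependent
        rb_only:    Dir.glob('*.rb'),
        same_thing: Dir['config.?'] == Dir.glob('config.?') # shorthand vs. long form
      }
    end
  end
end

p glob_demo
```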
It is possible for you to redefine the [] method for Dir, but I will not show how: many people (myself included) do not recommend monkey-patching core Ruby classes and modules. However, you can create the method in a class of your own. See the following:
class User
  # Class method => User[arg]
  def self.[](arg)
  end
  # Instance method => @user[arg]
  def [](arg)
  end
end
Dir is an object just like any other object (it just happens to be an instance of class Class), and [] is a method just like any other method (it just happens to have a funny name, and special syntactic conveniences that allow it to be called using a different syntax in addition to the normal one).
So, you define it just like any other method:
class MyClass
  def self.[](*) end
end
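A short sketch showing that the bracket form is only sugar for an ordinary method call. Registry and its ITEMS table are hypothetical names used for this example.

```ruby
# Hypothetical class with a class-level [] method.
class Registry
  ITEMS = { ruby: 'Ruby', py: 'Python' }.freeze

  def self.[](key)
    ITEMS.fetch(key)
  end
end

p Registry[:ruby]                  # => "Ruby"  (sugar)
p Registry.[](:ruby)               # => "Ruby"  (explicit method call)
p Registry.public_send(:[], :py)   # => "Python" (the method really is named :[])
```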

How does Cucumber DSL work?

Let's take:
When /^(?:|I )fill in the following:$/ do |fields|
  fields.rows_hash.each do |name, value|
    When %{I fill in "#{name}" with "#{value}"}
  end
end
With my rudimentary Ruby knowledge, I was thinking that When is a method call that takes a regular expression and a block.
But then, I am also thinking that this is a definition, and not a method call, but then how is it achieved? How can When define something?
The code is the following (code taken from here):
def register_rb_step_definition(regexp, symbol = nil, options = {}, &proc)
  proc_or_sym = symbol || proc
  RbDsl.register_rb_step_definition(regexp, proc_or_sym, options)
end
When, Given and Then are aliases of register_rb_step_definition.
So When really is a method call: you pass it a regular expression as an argument, plus a block.
Each step definition is registered with the regular expression and the block. When the test is executed, cucumber looks in the previously registered steps and if any regular expression matches it executes the block associated to that regular expression.
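The register-then-match mechanism can be sketched in a few lines of plain Ruby. This is a toy, not Cucumber's implementation: STEPS, this When method and run_step are hypothetical names used only to show how a method call can act as a definition.

```ruby
# Toy registry of [regexp, block] pairs.
STEPS = []

# "Defining" a step is just a method call that stores the block for later.
def When(regexp, &block)
  STEPS << [regexp, block]
end

# Running a step: find the first registered regexp that matches and call its block
# with the captured groups as arguments.
def run_step(text)
  STEPS.each do |regexp, block|
    match = regexp.match(text)
    return block.call(*match.captures) if match
  end
  raise "undefined step: #{text}"
end

When(/^I fill in "(.+)" with "(.+)"$/) do |name, value|
  "#{name}=#{value}"
end

puts run_step('I fill in "email" with "a@b.c"') # prints email=a@b.c
```

The block is not executed when When is called; it only runs later, when a step's text matches the stored regular expression.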

What's the Ruby syntax for calling a method with multiple parameters and a block?

Ruby doesn't like this:
item (:name, :text) {
  label('Name')
}
And I don't know why. I'm attempting to create a DSL. The 'item' method looks like this:
def item(name, type, &block)
  i = QbeItemBuilder.new(@ds, name, QbeType.gettype(type))
  i.instance_exec &block
end
Take a name for the item, a type for the item, and a block. Construct an item builder, and execute the block in its context.
Regardless of whether or not I need to use instance_exec (I'm thinking that I don't - it can be stuffed in the initialiser), I get this:
SyntaxError (ds_name.ds:5: syntax error, unexpected ',', expecting ')'
item (:name, :text) {
^
How do I invoke method with multiple arguments and a block? What does ruby think I'm trying to do?
The space before the parentheses causes Ruby to evaluate (:name, :text) as a single argument before calling the method, which results in a syntax error. These examples illustrate the difference:
puts 1 # equivalent to puts(1) - valid
puts (1) # equivalent to puts((1)) - valid
puts (1..2) # equivalent to puts((1..2)) - valid
puts (1, 2) # equivalent to puts((1, 2)) - syntax error
puts(1, 2) # valid
Your way of providing the block is syntactically valid, however when the block is not in the same line as the method call it is usually better to use do ... end syntax.
So to answer your question you can use:
item(:name, :text) { label('Name') }
or:
item(:name, :text) do
  label('Name')
end
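A runnable sketch of both call styles, using a stand-in builder. ItemBuilder here is hypothetical and much simpler than the QbeItemBuilder in the question; it only records what the block sets.

```ruby
# Hypothetical stand-in for QbeItemBuilder.
class ItemBuilder
  attr_reader :name, :type, :label_text

  def initialize(name, type)
    @name = name
    @type = type
  end

  def label(text)
    @label_text = text
  end
end

def item(name, type, &block)
  builder = ItemBuilder.new(name, type)
  builder.instance_exec(&block) if block # run the block with the builder as self
  builder
end

# Parentheses directly after the method name, braces for the block.
a = item(:name, :text) { label('Name') }

# No parentheses at all: use do ... end so the block binds to item, not to :text.
b = item :name, :text do
  label('Name')
end

p a.label_text # => "Name"
p b.label_text # => "Name"
```

Without parentheses, a brace block would attach to the last argument rather than to item, which is why the second form uses do ... end.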
Remove the space before the ( in item (:name, :text) {
