Ruby Regexp Interpolation / Character Class / Global Variable Syntax Clash? - ruby

Why does this error occur?
Regexp.new("[#$]")
# => SyntaxError: (irb):1: syntax error, unexpected $undefined
# => Regexp.new("[#$]")
# ^
# (irb):1: unterminated string meets end of file
# from ~/.rvm/rubies/ruby-1.9.3-p194/bin/irb:1:in `<main>'
This should describe the subset of strings consisting of either a single $ or #, literally. And, AFAIU Ruby's Regexp engine, # and $ don't need to be escaped inside a character class even though they're usually metacharacters.
I would guess from the error message that Ruby is trying to interpolate $ when it's hitting # within double-quotes, but...why? Ordering is important. The $ and # characters have multiple overloaded behaviors, so I'm at a loss about what's triggering this.
PS, FYI:
/[#$]/
# => SyntaxError: (irb):1: syntax error, unexpected $undefined
/[$#]/
# => /[$#]/
Regexp.new '[$#]'
# => /[$#]/
Regexp.new '[#$]'
# => /[#$]/
Regexp.new "[#$]"
# => SyntaxError: (irb):1: syntax error, unexpected $undefined

The problem is not $ but #: in a double-quoted string, # normally introduces interpolation, as in "#{x}".
But the thing is you can also expand global variables directly using #$global, and that explains your problem:
$global = "hello"
"#$global"
=> "hello"
So the solution is to escape either # or $, as this will break the string interpolation state machine out of its effort to interpret the construct as an interpolation:
puts "\#$global"
=> #$global
puts "#\$global"
=> #$global
EDIT
And just to make it really clear :) The problem is not the Regexp, but you are trying to expand a global variable named $] when you type "#$]":
puts "#$]"
SyntaxError: (irb):22: syntax error, unexpected $undefined
To fix it you need to escape something:
puts "\#$]"
=> #$]
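As a rough sketch of the options discussed here (the constant name SAFE_PATTERNS is only illustrative), each of the following builds the intended character class without letting "#$" start an interpolation:

```ruby
# Each pattern is meant to match a single "#" or "$".
SAFE_PATTERNS = [
  Regexp.new('[#$]'),  # single-quoted string: never interpolated
  /[\#$]/,             # escaped "#": "#$" no longer starts an interpolation
  /[$#]/               # reordered: "$#" is not an interpolation trigger
].freeze

SAFE_PATTERNS.each do |re|
  puts "#{re.inspect} finds #{'a#b$c'.scan(re).inspect}"
end
```

All three should report the same two matches, the "#" and the "$".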

Related

Ruby: How can I kill "warning: `*' interpreted as argument prefix"? [duplicate]

How can I remove the "warning: `*' interpreted as argument prefix" from the following code?
hash = {"a" => 1,
"b" => 2,
"s" => 3,}
if "string".start_with? *hash.keys then
puts "ok"
else
puts "ng"
end
When I run the code above, I get:
$ ruby -w /tmp/a.rb
/tmp/a.rb:5: warning: `*' interpreted as argument prefix
ok
What is the best way to fix this warning?
I've tried to put parentheses around hash like this:
hash = {"a" => 1,
"b" => 2,
"s" => 3,}
if "string".start_with? (*hash.keys) then
puts "ok"
else
puts "ng"
end
then you get:
$ ruby -w /tmp/a.rb
/tmp/a.rb:5: syntax error, unexpected *
if "string".start_with? (*hash.keys) then
^
/tmp/a.rb:5: syntax error, unexpected ')', expecting '='
if "string".start_with? (*hash.keys) then
^
/tmp/a.rb:7: syntax error, unexpected keyword_else, expecting end-of-input
And this is the problem described in Why does white-space affect ruby function calls?, and clearly not the way to fix the warning I'm trying to fix.
My ruby version is:
$ ruby --version
ruby 2.3.3p222 (2016-11-21) [x86_64-linux-gnu]
If you're going to use method-calling-parentheses then you must avoid putting a space between the method name and the opening parenthesis:
if "string".start_with?(*hash.keys)
puts "ok"
else
puts "ng"
end
Also, then is rather archaic so we'll pretend that was never there. If there is a space between the method name and the opening parenthesis then your parentheses are interpreted as expression-grouping-parentheses and that's where your syntax error comes from.
Once you add the method-calling-parentheses you remove any possible hint of ambiguity as to what your * is supposed to mean and the warning should go away.
BTW, the warning isn't as silly as it first looks, because Ruby can be whitespace sensitive in surprising ways. This:
o.m *x
can be interpreted as:
o.m(*x)
or as:
o.m() * x
but these:
o.m * x
o.m*x
o.m* x
might seem open to the same two readings. In practice, all three of those are interpreted as o.m() * x and only o.m *x is seen as o.m(*x). Sane whitespace usage would suggest that o.m *x is obviously a splat whereas o.m * x is obviously a multiplication, but a couple of days on SO should convince you that whitespace usage is hardly sane or consistent.
That said, -w's output in the Real World tends to be so voluminous and noisy that -w is nearly useless.
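To see the unambiguous spellings side by side, here is a small sketch. The Receiver class and its m method are made up for illustration: m returns 10 when called without arguments and otherwise returns its arguments.

```ruby
# Hypothetical receiver: m returns 10 with no arguments, else the arguments.
class Receiver
  def m(*args)
    args.empty? ? 10 : args
  end
end

o = Receiver.new
x = [1, 2]

splat_call = o.m(*x)   # explicit parentheses: always a splat
product    = o.m * 3   # spaces on both sides: o.m() * 3
tight      = o.m*3     # no spaces: also o.m() * 3

p splat_call  # [1, 2]
p product     # 30
p tight       # 30
```

The ambiguous spelling, o.m *x, is deliberately left out, since that is the one that triggers the warning under -w.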

Passing hash to p causes syntax error

Why do I get this?
p {a:3}
# => syntax error, unexpected tINTEGER, expecting tSTRING_CONTENT or tSTRING_DBEG or tSTRING_DVAR or tSTRING_END
# => p {a:3}
^
Ruby has a few oddities in its parsing engine. One is that certain things require parentheses around them.
For instance, this should work.
p({a:3})
Or this
hash = { a: 3 }
p hash
As the other answer pointed out, the reason is that the interpreter reads the braces as a block:
# Input
p { a: 3 }
# What the interpreter sees
p do
a: 3
end
Braces directly after a method name are parsed as a block, and a: 3 is not a valid expression inside one, so you must use the parentheses.
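The block-versus-hash distinction can be observed with a small helper. how_called is a hypothetical method written for this sketch, not part of Ruby:

```ruby
# Hypothetical helper that reports how it was called.
def how_called(arg = nil)
  block_given? ? :block : [:argument, arg]
end

with_braces = how_called { :ignored }   # braces right after the name: a block
with_parens = how_called({ a: 3 })      # parentheses: the braces are a Hash

puts with_braces.inspect   # :block
puts with_parens.inspect
```

The second call receives the Hash as an ordinary argument, which is what p({a:3}) relies on.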

Symbol literal or a method

Are :"foo" and :'foo' notations with quotations a symbol literal, or is : a unary operator on a string?
: is really just part of the literal you enter yourself or create through a method. Although : can take a name or a "string" to create a literal, unlike an operator it does not provoke any action or modify a value.
In each case an instance of Symbol is returned. Writing : with string notation is sometimes important. If you want to represent, for instance, a string containing whitespace as a symbol you need to use the string notation.
> :foo
=> :foo
> :foo bar
SyntaxError: (irb):2: syntax error, unexpected tIDENTIFIER, expecting end-of-input
> :"foo bar"
=> :"foo bar"
Furthermore, it is interesting to explore this with the equality operator (==)
> :"foo" == :foo
=> true
> :"foo " == :foo
=> false
My advice, do not think of it as passing a string or name to create a symbol, but of different ways to express the same symbol. In the end what you enter is interpreted to an object. This can be achieved in different ways.
> :"foo"
=> :foo
After all, %w(foo bar) is also an alternative way of writing ['foo', 'bar'].
Ruby's documentation on Symbol literals says this:
You may reference a symbol using a colon: :my_symbol.
You may also create symbols by interpolation:
:"my_symbol1"
:"my_symbol#{1 + 1}"
Basically :"foo" and :'foo' are symbol literals, but they are useful when you want to create symbols using interpolation.
You also need quotes if your symbol has spaces:
hash = {
:"a b c" => 10,
:"x y z" => 20,
}
puts hash[:"a b c"]
--output:--
10
So the first one. From the docs:
[Symbols] are generated using the :name and :"name" literals syntax, and by
the various to_sym methods.
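A short sketch pulling the answers above together. The variable names are illustrative only; the point is that every notation yields an ordinary Symbol:

```ruby
plain  = :foo
quoted = :"foo"

puts plain == quoted        # true
puts plain.equal?(quoted)   # true: the very same Symbol object

n = 1 + 1
interpolated = :"my_symbol#{n}"
puts interpolated == :my_symbol2   # true

spaced = :"foo bar"
puts spaced == "foo bar".to_sym    # true
puts spaced.inspect                # :"foo bar"
```

The quoted form only matters when the name needs interpolation or characters, such as spaces, that a bare :name cannot contain.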

Why does capturing named groups in Ruby result in "undefined local variable or method" errors?

I am having trouble with named captures in regular expressions in Ruby 2.0. I have a string variable and an interpolated regular expression:
str = "hello world"
re = /\w+/
/(?<greeting>#{re})/ =~ str
greeting
It raises the following exception:
prova.rb:4:in `<main>': undefined local variable or method `greeting' for main:Object (NameError)
shell returned 1
However, the interpolated expression works without named captures. For example:
/(#{re})/ =~ str
$1
# => "hello"
Named Captures Must Use Literals
You are encountering some limitations of Ruby's regular expression library. The Regexp#=~ method limits named captures as follows:
The assignment does not occur if the regexp is not a literal.
A regexp interpolation, #{}, also disables the assignment.
The assignment does not occur if the regexp is placed on the right hand side.
You'll need to decide whether you want =~ to assign named captures to local variables or to use interpolation in your regular expressions. You currently cannot have both.
Assign the result of #match; this will be accessible as a hash that allows you to look up your named capture groups:
> matches = "hello world".match(/(?<greeting>\w+)/)
=> #<MatchData "hello" greeting:"hello">
> matches[:greeting]
=> "hello"
Alternately, give #match a block, which will receive the match results:
> "hello world".match(/(?<greeting>\w+)/) {|matches| matches[:greeting] }
=> "hello"
As an addendum to both answers in order to make it crystal clear:
str = "hello world"
# => "hello world"
re = /\w+/
# => /\w+/
re2 = /(?<greeting>#{re})/
# => /(?<greeting>(?-mix:\w+))/
md = re2.match str
# => #<MatchData "hello" greeting:"hello">
md[:greeting]
# => "hello"
Interpolation is fine with named captures, just use the MatchData object, most easily returned via match.
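For comparison, a minimal sketch of both situations. The two method names are hypothetical: the first relies on a literal regexp so that =~ assigns the local variable, the second interpolates and reads the capture from the last MatchData instead.

```ruby
# Literal regexp on the left-hand side: =~ assigns the local variable greeting.
def greeting_via_literal(str)
  if /(?<greeting>\w+)/ =~ str
    greeting
  end
end

# Interpolated regexp: no local variable is assigned, but the match data
# from =~ is still available through Regexp.last_match.
def greeting_via_interpolation(str, word = /\w+/)
  return unless /(?<greeting>#{word})/ =~ str
  Regexp.last_match[:greeting]
end

puts greeting_via_literal("hello world")        # hello
puts greeting_via_interpolation("hello world")  # hello
```

Both return nil when nothing matches, so the caller can treat them the same way.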

Stumped by a simple regex

I am trying to see if the string s contains any of the symbols in a regex. The regex below works fine on rubular.
s = "asd#d"
s =~ /[~!@#$%^&*()]+/
But in Ruby 1.9.2, it gives this error message:
syntax error, unexpected ']', expecting tCOLON2 or '[' or '.'
s = "asd#d"; s =~ /[~!@#$%^&*()]/
What is wrong?
This is actually a special case of string interpolation with global and instance variables that most seem not to know about. Since string interpolation also occurs within regex in Ruby, I'll illustrate below with strings (since they provide for an easier example):
@foo = "instancefoo"
$foo = "globalfoo"
"#@foo" # => "instancefoo"
"#$foo" # => "globalfoo"
Thus you need to escape the # to prevent it from being interpolated:
/[~!@\#$%^&*()]+/
The only way that I know of to create a non-interpolated regex in Ruby is from a string (note single quotes):
Regexp.new('[~!@#$%^&*()]+')
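Another way to sidestep interpolation entirely, sketched here as an alternative rather than as what the answer above used, is to keep the symbols in a single-quoted string and let Regexp.union escape them. SYMBOLS, ANY_SYMBOL and symbol_index are names chosen for this example:

```ruby
SYMBOLS = '~!@#$%^&*()'  # single quotes: nothing here is interpolated

# Regexp.union escapes each string and joins the results with "|".
ANY_SYMBOL = Regexp.union(SYMBOLS.chars)

# Returns the index of the first symbol in text, or nil if there is none.
def symbol_index(text)
  text =~ ANY_SYMBOL
end

p symbol_index("asd#d")  # 3
p symbol_index("asdd")   # nil
```

Because the pattern is assembled from plain strings, no character in it needs to be escaped by hand.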
I was able to replicate this behavior in 1.9.3p0. Apparently there is a problem with the '#$' combination. If you escape either it works. If you reverse them it works:
s =~ /[~!@$#%^&*()]+/
Edit: in Ruby 1.9 #$ invokes variable interpolation, even when followed by a % which is not a valid variable name.
I disagree, you need to escape the $, it's the end of string character.
s =~ /[~!@#\$%^&*()]/ => 3
That is correct.
