Ruby: Accessing characters in a string with brackets and string-key

Can someone explain what happens here? Please see the commented line.
> test = "This is a test"
=> "This is a test"
> test["is"] # What happens here?
=> "is"
I know that one can access characters in a string via integer-index. But what kind of language-feature is used in the shown snippet?

You're invoking [] on a string and passing it an argument, and it returns the occurrence of that argument in the string. As the docs say:
If a match_str is given, that string is returned if it occurs in the string.
It works something like this:
p "This is a test"["is"] # "is"
p "This is a test"["is not"] # nil
If the given argument is a string, it returns the first occurrence of it in the receiver; if there is no occurrence, it returns nil.
Whenever you find yourself in doubt like this, first identify the elements of the expression: you have a receiver (a string), you're invoking a method on it ([]), and you're passing an argument (also a string):
"This is a test"["is"]
        |       | |
        |       | --- # argument
        |       ------ # method
        -------------------- # receiver
so, the first thing to do is to check what kind of object you're dealing with; then you can check where the method is defined and then the argument you're using.
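As a rough illustration of how the argument type changes what String#[] returns, here is a minimal sketch. The variable names are made up for the example; the calls themselves are plain String#[].

```ruby
test = "This is a test"

by_index  = test[1]        # Integer: the character at that index
by_range  = test[0, 4]     # start index plus length: a slice
by_string = test["is"]     # String: the substring itself, or nil if absent
by_regexp = test[/t\w+$/]  # Regexp: the first match, or nil
missing   = test["is not"] # no such substring, so nil

p by_index, by_range, by_string, by_regexp, missing
```

So the same method name covers several lookups, and the class of the argument decides which one you get.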

Check the documentation for String#[]:
If a match_str is given, that string is returned if it occurs in the string.
Returns nil if the regular expression does not match or the match string cannot be found.
2.5.1 :004 > test = "This is a test"
=> "This is a test"
2.5.1 :005 > test['is'] # Check for the substring i.e. 'is' here
=> "is"
2.5.1 :006 > test['iss'] # Check for the substring i.e. 'iss' here
=> nil
2.5.1 :008 > test[1] # Returns the char at given index i.e. 1
=> "h"
2.5.1 :009 > another_test = "This is a test 1"
=> "This is a test 1"
2.5.1 :010 > another_test[1] # Returns the char at given index i.e. 1
=> "h"
2.5.1 :011 > another_test['1'] # Check for the substring i.e. '1' here
=> "1"
2.5.1 :011 > another_test['2'] # Check for the substring i.e. '2' here
=> nil

But what kind of language-feature is used in the shown snippet?
The ISO/IEC 30170:2012 Ruby Language Specification calls it an indexing method invocation, see section 11.3 Method invocation expressions, subsection 11.3.1 General description, clause b) for details. The spec text is somewhat convoluted and opaque, as specs tend to be, but in the end, it is rather simple:
foo[bar, baz]
is just syntactic sugar for
foo.[](bar, baz)
So, the code in your question is equivalent to
test.[]("is")
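Because [] is an ordinary method, any class can define it. The following sketch assumes a hypothetical Shelf class (not part of Ruby or of the question) purely to show that the bracket form and the explicit method call are the same invocation.

```ruby
# Hypothetical class used only to illustrate the [] method.
class Shelf
  def initialize(books)
    @books = books
  end

  # Called for shelf[title] as well as shelf.[](title).
  def [](title)
    @books.find { |book| book == title }
  end
end

shelf = Shelf.new(["Dune", "Emma"])

sugar    = shelf["Dune"]                  # bracket syntax
explicit = shelf.[]("Dune")               # explicit method call
via_send = shelf.public_send(:[], "Dune") # dynamic dispatch by method name
absent   = shelf["Ulysses"]               # find returns nil when nothing matches

p sugar, explicit, via_send, absent
```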

Related

Get last character in string

I want to get the last character in a string MY WAY - 1) Get last index 2) Get character at last index, as a STRING. After that I will compare the string with another, but I won't include that part of code here. I tried the code below and I get a strange number instead. I am using ruby 1.8.7.
Why is this happening, and how do I do it?
line = "abc;"
last_index = line.length-1
puts "last index = #{last_index}"
last_char = line[last_index]
puts last_char
Output-
last index = 3
59
Ruby docs told me that array slicing works this way -
a = "hello there"
a[1] #=> "e"
But, in my code it does not.
UPDATE:
I keep getting constant up votes on this, hence the edit. Using [-1, 1] is correct, however a better looking solution would be using just [-1]. Check Oleg Pischicov's answer.
line[-1]
# => ";"
Original Answer
In ruby you can use [-1, 1] to get last char of a string. Here:
line = "abc;"
# => "abc;"
line[-1, 1]
# => ";"
teststr = "some text"
# => "some text"
teststr[-1, 1]
# => "t"
Explanation:
Strings can take a negative index, which counts backwards from the end
of the String, and a length saying how many characters you want (one in
this example).
Using String#slice as in OP's example: (will work only on ruby 1.9 onwards as explained in Yu Hau's answer)
line.slice(line.length - 1)
# => ";"
teststr.slice(teststr.length - 1)
# => "t"
Let's go nuts!!!
teststr.split('').last
# => "t"
teststr.split(//)[-1]
# => "t"
teststr.chars.last
# => "t"
teststr.scan(/.$/)[0]
# => "t"
teststr[/.$/]
# => "t"
teststr[teststr.length-1]
# => "t"
Just use "-1" index:
a = "hello there"
a[-1] #=> "e"
It's the simplest solution.
If you are using Rails, then apply the method #last to your string, like this:
"abc".last
# => "c"
You can use a[-1, 1] to get the last character.
You get an unexpected result because the return value of String#[] changed. You are using Ruby 1.8.7 while referring to the documentation for Ruby 2.0.
Prior to Ruby 1.9, it returns an integer character code. Since Ruby 1.9, it returns the character itself.
String#[] in Ruby 1.8.7:
str[fixnum] => fixnum or nil
String#[] in Ruby 2.0:
str[index] → new_str or nil
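A small sketch of a form that should not depend on that change: the two-argument slice returns a string, and the character code can be obtained or reversed explicitly. The variable names are illustrative only.

```ruby
line = "abc;"

last_char  = line[-1, 1]                   # start at the last position, take one character
char_code  = last_char.unpack("C").first   # byte value of that character
round_trip = char_code.chr                 # Integer#chr turns the code back into a string

puts last_char   # prints ;
puts char_code   # prints 59
puts round_trip  # prints ;
```

The 59 in the question's output is this same character code, which is why calling chr on it gives back the semicolon.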
In ruby you can use something like this:
ending = str[-n..-1] || str
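This returns the last n characters, or the whole string when it is shorter than n (the slice is nil in that case, hence the || str).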
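To make the role of the fallback concrete, here is a minimal sketch wrapped in a helper. The name last_chars is hypothetical, chosen only for this example.

```ruby
# Hypothetical helper: the last n characters of str, or all of str if it is shorter.
def last_chars(str, n)
  str[-n..-1] || str
end

p last_chars("hello there", 3)  # "ere"
p last_chars("hi", 5)           # "hi", because the slice below is nil
p "hi"[-5..-1]                  # nil: the start index lies before the beginning
```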
With Rails (ActiveSupport), I would call the method #last on the string, as if it were an array, mostly because it is more explicit.
To get the last character.
"hello there".last() #=> "e"
To get the last 3 characters you can pass a number to #last.
"hello there".last(3) #=> "ere"
The slice method will do this for you.
For example:
"hello".slice(-1)
# => "o"
Thanks
Your code kind of works: the 'strange number' you are seeing is the ASCII code of ;. Every character has a corresponding ASCII code (https://www.asciitable.com/). To convert it back, you can use puts last_char.chr, which should output ;.

Substring syntaxes in Ruby

Python has the following elegant syntax for checking whether one string is a substring of another one:
'ab' in 'abc' # True
Is there an equivalent elegant syntax in Ruby?
I'm aware of the "abc".include? "ab" Ruby syntax, but I'm wondering whether the inverse syntax exists too (where the first parameter is the substring and the second is the string).
There is no such method in the Ruby standard library, but Rails ActiveSupport provides an #in? method:
1.9.3-p484 :004 > "ab".in? "abc"
=> true
Here is the source code: https://github.com/rails/rails/blob/e20dd73df42d63b206d221e2258cc6dc7b1e6068/activesupport/lib/active_support/core_ext/object/inclusion.rb
Define "elegant".
This does a sub-string search and returns the "hit" if found:
'abc'['ab'] # => "ab"
Using !! converts the value returned to a true/false, so "ab" becomes true:
!!'abc'['ab'] # => true
Knowing that, it's trivial to add it in if you want something closer:
class String
def in?(other)
!!other[self]
end
end
'ab'.in?('abc') # => true
'ab'.in? 'abc' # => true
Or, use require 'active_support/core_ext/object/inclusion' to cherry-pick the Active Support definition that extends all objects to allow in?. See http://edgeguides.rubyonrails.org/active_support_core_extensions.html#in-questionmark. The upside, and also the downside, is that it modifies all objects.
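If reopening String or Object feels too invasive, a plain helper keeps the substring-first argument order without patching anything. This is only a sketch; substring? is a hypothetical name, and it simply delegates to String#include?.

```ruby
# Hypothetical helper: true when needle occurs somewhere inside haystack.
def substring?(needle, haystack)
  haystack.include?(needle)
end

p substring?("ab", "abc")  # true
p substring?("zz", "abc")  # false
p !!"abc"["ab"]            # true, the bracket form coerced to a boolean
```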

Why does capturing named groups in Ruby result in "undefined local variable or method" errors?

I am having trouble with named captures in regular expressions in Ruby 2.0. I have a string variable and an interpolated regular expression:
str = "hello world"
re = /\w+/
/(?<greeting>#{re})/ =~ str
greeting
It raises the following exception:
prova.rb:4:in `<main>': undefined local variable or method `greeting' for main:Object (NameError)
shell returned 1
However, the interpolated expression works without named captures. For example:
/(#{re})/ =~ str
$1
# => "hello"
Named Captures Must Use Literals
You are encountering some limitations of Ruby's regular expression library. The Regexp#=~ method limits named captures as follows:
The assignment does not occur if the regexp is not a literal.
A regexp interpolation, #{}, also disables the assignment.
The assignment does not occur if the regexp is placed on the right hand side.
You'll need to decide whether you want the automatic local-variable assignment from named captures or interpolation in your regular expressions. With =~ you currently cannot have both.
Assign the result of #match; this returns a MatchData object that you can index like a hash to look up your named capture groups:
> matches = "hello world".match(/(?<greeting>\w+)/)
=> #<MatchData "hello" greeting:"hello">
> matches[:greeting]
=> "hello"
Alternately, give #match a block, which will receive the match results:
> "hello world".match(/(?<greeting>\w+)/) {|matches| matches[:greeting] }
=> "hello"
As an addendum to both answers in order to make it crystal clear:
str = "hello world"
# => "hello world"
re = /\w+/
# => /\w+/
re2 = /(?<greeting>#{re})/
# => /(?<greeting>(?-mix:\w+))/
md = re2.match str
# => #<MatchData "hello" greeting:"hello">
md[:greeting]
# => "hello"
Interpolation is fine with named captures, just use the MatchData object, most easily returned via match.
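Putting the answers side by side, the sketch below contrasts the literal case, where =~ assigns a local variable, with the interpolated case, where the capture is read from MatchData instead. The variable names on the left of each assignment are illustrative.

```ruby
str = "hello world"
re  = /\w+/

# Literal regexp on the left-hand side: =~ assigns the local variable greeting.
/(?<greeting>\w+)/ =~ str
from_literal = greeting

# Interpolated regexp: no local variable is created, so read the MatchData.
md = /(?<word>#{re})/.match(str)
from_match_data = md[:word]

# After =~ with an interpolated regexp, the last match is still available as $~.
/(?<word>#{re})/ =~ str
from_last_match = $~[:word]

p from_literal, from_match_data, from_last_match
```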

Couldn't understand why the Regexp option i got disabled in my code

I have just started playing with Ruby and I'm stuck on something. Is
there some trick to modify the casefold attribute of a Regexp object after
it's been instantiated?
The best idea I have tried is the following:
irb(main):001:0> a = Regexp.new('a')
=> /a/
irb(main):002:0> aA = Regexp.new(a.to_s, Regexp::IGNORECASE)
=> /(?-mix:a)/i
But none of the below seems to work:
irb(main):003:0> a =~ 'a'
=> 0
irb(main):004:0> a =~ 'A'
=> nil
irb(main):005:0> aA =~ 'a'
=> 0
irb(main):006:0> aA =~ 'A'
=> nil
Something I don't understand is happening here. Where did the 'i' go on line
8?
irb(main):07:0> aA = Regexp.new(a.to_s, Regexp::IGNORECASE)
=> /(?-mix:a)/i
irb(main):08:0> aA.to_s
=> "(?-mix:a)"
irb(main):09:0>
I am using Ruby 1.9.3.
I am also unable to understand why the code below returns false:
/(?i:a)/.casefold? #=> false
As your console output shows, a.to_s includes the case sensitiveness as an option for your subexpression, so aA is being defined as
/(?-mix:a)/i
So you're asking Ruby for a regular expression that is case insensitive, but the only thing in that case-insensitive regexp is a group in which case sensitivity has been turned back on, so the net effect is that 'a' is matched case sensitively.
Since the result of to_s is just the regular expression string itself - no delimiters or external flags - the flags are translated into the (?i:...) syntax that sets or clears them temporarily inside the expression itself. This lets you get a Regexp object back out via a simple Regexp.new(s) call that will match the same strings.
The wrapping, unfortunately, includes explicitly clearing the flags that are not set on the object. So your first regex gets stringified into something wrapped in (?-mix:...); that is, the casefold option is explicitly turned off between the parentheses. Turning it back on for the object doesn't have any effect.
You can use a.source instead of a.to_s to get just the original expression, without the flag settings:
irb(main):001:0> a=/a/
=> /a/
irb(main):002:0> aA = Regexp.new(a.source, Regexp::IGNORECASE)
=> /a/i
irb(main):003:0> a =~ 'a'
=> 0
irb(main):004:0> a =~ 'A'
=> nil
irb(main):005:0> aA =~ 'a'
=> 0
irb(main):006:0> aA =~ 'A'
=> 0
As Frederick already explains, calling to_s on a regex will add modifiers around it that ensure that its properties like case-sensitiveness are preserved. So if you insert a case-sensitive regex into a case-insensitive regex, the inserted part will still be case-sensitive. Likewise the modifiers given to Regexp.new will have no effect if the first argument is a regex or the result of calling to_s on one.
To solve this issue, call source on the regex instead of to_s. Unlike to_s, source simply returns the source of regex without adding anything:
aA = Regexp.new(a.source, Regexp::IGNORECASE)
I am also unable to understand why the code below returns false:
/(?i:a)/.casefold?
Because (?i:...) sets the i flag locally, not globally. It only applies to the part of the regex within the parentheses, not the whole regex. Of course in this case the whole regex is within the parentheses, but that doesn't matter as far as methods like casefold? are concerned.
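The points from these answers can be checked in one short script. This is a sketch using only Regexp#to_s, Regexp#source, Regexp#casefold? and match?; the variable names are illustrative.

```ruby
a = /a/

wrapped = Regexp.new(a.to_s,   Regexp::IGNORECASE)  # built from "(?-mix:a)"
rebuilt = Regexp.new(a.source, Regexp::IGNORECASE)  # built from "a"

p a.to_s               # "(?-mix:a)": flags are cleared explicitly inside the group
p a.source             # "a": just the pattern text

p wrapped.casefold?    # true: the object-level flag is set...
p wrapped.match?("A")  # false: ...but the inner group switches it off again
p rebuilt.match?("A")  # true

inline = /(?i:a)/
p inline.casefold?     # false: the flag is local to the group
p inline.match?("A")   # true: matching is still case insensitive inside it
```

So casefold? reports only the flag on the Regexp object, not what any inline group does.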

What's the difference between scan and match on Ruby string

I am new to Ruby and have always used String.scan to search for the first occurrence of a number. It is kind of strange that the returned value is in a nested array, but I just use [0][0] to get the value I want. (I am sure it has its purpose, just that I haven't used it yet.)
I just found out that there is a String.match method. And it seems to be more convenient because the returned array is not nested.
Here is an example of the two, first is scan:
>> 'a 1-night stay'.scan(/(a )?(\d*)[- ]night/i).to_a
=> [["a ", "1"]]
then is match
>> 'a 1-night stay'.match(/(a )?(\d*)[- ]night/i).to_a
=> ["a 1-night", "a ", "1"]
I have checked the API, but I can't really tell the difference, as both are described as 'matching the pattern'.
This question is, simply out of curiosity, about what scan can do that match can't, and vice versa. Is there any specific scenario that only one can accomplish? Is match inferior to scan?
Short answer: scan will return all matches. This doesn't make it superior, because if you only want the first match, str.match[2] reads much nicer than str.scan[0][1].
ruby-1.9.2-p290 :002 > 'a 1-night stay, a 2-night stay'.scan(/(a )?(\d*)[- ]night/i).to_a
=> [["a ", "1"], ["a ", "2"]]
ruby-1.9.2-p290 :004 > 'a 1-night stay, a 2-night stay'.match(/(a )?(\d*)[- ]night/i).to_a
=> ["a 1-night", "a ", "1"]
#scan returns everything that the Regex matches.
#match returns the first match as a MatchData object, which contains data held by special variables like $& (what was matched by the Regex; that's what's mapping to index 0), $1 (match 1), $2, et al.
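As a small side-by-side sketch of those return shapes, using the pattern from the question (the variable names are illustrative):

```ruby
text    = "a 1-night stay, a 2-night stay"
pattern = /(a )?(\d*)[- ]night/i

all_matches = text.scan(pattern)   # one inner array of groups per match
first_match = text.match(pattern)  # a MatchData for the first match only

p all_matches      # [["a ", "1"], ["a ", "2"]]
p first_match[0]   # "a 1-night", the whole match (same as $&)
p first_match[2]   # "1", the second group (same as $2)

# Without capture groups, scan returns a flat array of matched strings.
p text.scan(/\d+/) # ["1", "2"]
```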
Previous answers state that scan will return every match from the string the method is called on but this is incorrect.
Scan keeps track of an index and continues looking for subsequent matches after the last character of the previous match.
string = 'xoxoxo'
p string.scan('xo') # => ["xo", "xo", "xo"]
# so far so good but...
p string.scan('xox') # => ["xox"]
# if this returned EVERY instance of 'xox' it would include substrings
# starting at indices 0 and 2, but only one match is found
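If overlapping occurrences are what you need, two common workarounds are a lookahead with a capture group, or repeated String#index calls that advance by one position. This is a sketch of both; the variable names are illustrative.

```ruby
string = "xoxoxo"

non_overlapping = string.scan("xox")  # the scan resumes after each match

# A lookahead consumes no characters, so the scan can find overlapping hits.
overlapping = string.scan(/(?=(xox))/).flatten

# Alternatively, collect the start index of every occurrence by hand.
starts = []
pos = 0
while (found = string.index("xox", pos))
  starts << found
  pos = found + 1   # advance by one character, not past the whole match
end

p non_overlapping  # ["xox"]
p overlapping      # ["xox", "xox"]
p starts           # [0, 2]
```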
