How does the string method `[]` behave in Ruby?

2.0.0p247 :069 > str[str.length].class
=> NilClass
2.0.0p247 :071 > str[str.length, 1].class
=> String
2.0.0p247 :072 > str[str.length, 2].class
=> String
2.0.0p247 :073 > str[str.length+ 1, 2].class
=> NilClass
The first line returns NilClass, while the second line returns String. String#[n] returns a single-character string, and String#[m, n] returns a substring of the string. Does that mean a single-character string is different from a substring?

Does that mean a single-character string is different from a substring?
No. It means that String#[] behaves differently depending on the arguments passed to it.
You are trying to access past the last character of the string.
str[str.length]
returns nil because there is no character there.
The documentation states:
Returns nil if the initial index falls outside the string or the length is negative.
str[-1]
returns the last character, and...
str[-1].class
returns String.
Similarly...
str[str.length, 1]
returns the empty string "".
Again, the documentation states (emphasis mine):
If passed a start index and a length, returns a substring containing length characters starting at the index.
Since there are no more characters past the end of str, this substring is empty.

Consider the code below:
s = "abc"
s[s.size] # => nil
s[s.size,1] # => ""
s.size # => 3
Documentation of String#[]:
Element Reference: (a) If passed a single index, returns a substring of one character at that index. (b) If passed a start index and a length, returns a substring containing length characters starting at the index.
In both of the cases above, if an index is negative, it is counted from the end of the string. For the start and range cases the starting index is just before a character and an index matching the string's size. (c) Additionally, an empty string is returned when the starting index for a character range is at the end of the string.
(d) Returns nil if the initial index falls outside the string or the length is negative.
Why s[s.size] # => nil ?
Because there is no character at index 3, so the single-index form (rule a) has nothing to return and gives nil.
Why s[s.size,1] # => "" ?
Because this falls directly under rule (c).
Why s[s.size+1,1] # => nil ?
Because rule (d) says so: the start index falls outside the string.
Note that nil is an instance of NilClass and the empty string '' is an instance of String, so every result you got is valid.
s = "abc"
s[s.size].class # => NilClass
s[s.size,1].class # => String
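As a quick check of rules (a) through (d), here is a minimal sketch that exercises each case on the same three-character string. The comments are only labels for the rules quoted above, not part of the documentation.

```ruby
s = "abc"

p s[2]       # => "c"   rule (a): index inside the string
p s[-1]      # => "c"   a negative index counts from the end
p s[3]       # => nil   rule (a): no character at index 3
p s[3, 1]    # => ""    rule (c): start index equals the string's size
p s[4, 1]    # => nil   rule (d): start index is past the end
p s[1, -1]   # => nil   rule (d): negative length
p s[1, 10]   # => "bc"  only the available characters are returned
```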

Related

Is there a function to capitalize an object accessed with str[i] in Ruby?

print str[i].upcase is not working, and I have to capitalize specific letters determined by an index. Can someone help me with this?
def mumble_letters
  str = nil
  print "Please write a string : "
  str = gets.to_str
  # puts str.length
  while str.length == 1
    print "Please write a string : "
    str = gets.to_str
  end
  for i in 0..str.length
    print str[i].upcase!
    i.times{ print str[i].capitalize}
    if i != str.length - 1
      print"-"
    end
  end
end
mumble_letters
the error I get is : undefined method `upcase' for nil:NilClass (NoMethodError)
Did you mean? case
Problem
str[i].upcase! upcases the one-character String that str[i] returns. That String is a new object, not a view into str, so (at least on Ruby 2.7.1) the contents of your original String won't change until you assign the result back to the index you want modified. For example:
str[i] = str[i].upcase
However, the approach above won't work with frozen strings, which are fairly common in certain core methods, libraries, and frameworks. As a result, you may encounter the FrozenError exception with the index-assignment approach.
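A minimal sketch of both points, assuming Ruby 2.5 or later (where FrozenError exists). The variable names are only illustrative.

```ruby
str = "foobar".dup        # dup guarantees a mutable String
str[3].upcase!            # upcases a temporary one-character String
p str                     # => "foobar" (unchanged)

str[3] = str[3].upcase    # index assignment writes the result back
p str                     # => "fooBar"

frozen = "foobar".freeze
begin
  frozen[3] = "B"         # index assignment on a frozen String raises
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end

copy = frozen.dup         # dup returns an unfrozen copy
copy[3] = "B"
p copy                    # => "fooBar"
```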
Solution
There's more than one way to solve this, but one way is to:
split your String object into an Array of characters,
modify the letter at the desired indexes,
rejoin the characters into a single String, and then
re-assign the modified String to your original variable.
For example, showing some intermediate steps:
# convert String to Array of characters
str = "foobar"
chars = str.chars
# contents of your chars Array
chars
#=> ["f", "o", "o", "b", "a", "r"]
# - convert char in place at given index in Array
# - don't rely on the return value of the bang method
# to be a letter
# - safe navigation handles various nil-related errors
chars[3]&.upcase!
#=> "B"
# re-join Array of chars into String
chars.join
#=> "fooBar"
# re-assign to original variable
str = chars.join
str
#=> "fooBar"
If you want, you can perform the same operation on multiple indexes of your chars Array before re-joining the elements. That should yield the results you're looking for.
More concisely:
str = "foobar"
chars = str.chars
chars[3]&.upcase!
p str = chars.join
#=> "fooBar"
Personally, I find operating on an Array of characters more intuitive and easier to troubleshoot than making in-place changes through repeated assignments to indexes within the original String. Furthermore, it avoids exceptions raised when trying to modify indexes within a frozen String. However, your design choices may vary.
str[i].upcase returns the upcased letter, but does not modify it in place. Assign it back to the string for it to work.
str = 'abcd'
str[2] = str[2].upcase #=> "C"
str #=> "abCd"
I can see two problems with your code...
First, an empty string has a length of 0 so what you wanted to write is
while str.length == 0
Secondly, when you do...
for i in 0..str.length
You are iterating up to the string length INCLUDING the string length. If the string has five characters, it actually only has valid indexes 0 through 4 but you are iterating 0 through 5. And str[5] doesn't exist so returns nil and you cannot do upcase! on a nil.
To handle that common situation, Ruby has the triple-dot (exclusive) range operator
for i in 0...str.length
...which will stop at the integer before the length, which is what you want.
It's also more Ruby-esque to do
(0...str.length).each do |i|
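Putting the fixes together, here is one possible sketch. It assumes the intended output is each letter upcased once and then repeated in lowercase i times, with dashes in between ("abcd" becomes "A-Bb-Ccc-Dddd"). That goal is a guess from the question's code, and the method name mumble is hypothetical.

```ruby
# Hypothetical helper: assumes the goal is "abcd" -> "A-Bb-Ccc-Dddd".
def mumble(str)
  # each_with_index only yields valid indexes, so str[i] is never nil here.
  str.chars.each_with_index.map { |ch, i| ch.upcase + ch.downcase * i }.join("-")
end

puts mumble("abcd")   # prints A-Bb-Ccc-Dddd
```

When the string comes from gets, calling chomp on it first removes the trailing newline, so the newline is not treated as a letter.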

Ruby : get value between parenthesis (without those parenthesis)

I have strings like "untitled", "untitled(1)" or "untitled(2)".
I want to get the last integer value between parentheses, if there is one. So far I have tried a lot of regexes; the ones that make sense to me (I am new to regex) look like this:
number= string[/\([0-9]+\)/]
number= string[/\(([0-9]+)\)/]
but it still returns the value with the parentheses.
If there is no left-then-right pair of parentheses, getting an empty string (or nil) would be nice. In a case such as "untitled($)", getting the '$' char, nil, or an empty string would do the trick. For "untitled(3) animal(4)", I want to get 4.
I have looked at a lot of topics about how to do that, but it never seems to work. What am I missing here?
/(?<=\()\w+(?=\)$)/ matches one or more word characters (letters, digits, underscore) inside parentheses, immediately before the end of the line:
words = %w[
untitled
untitled(1)
untitled(2)
untitled(foo)
unti(tle)d
]
p words.map { |word| word[/(?<=\()\w+(?=\)$)/] }
# => [nil, "1", "2", "foo", nil]
When you pass a Regexp to String#[], you can optionally pass the index of a capture group as a second argument to extract just that group. The documentation states:
If a Regexp is supplied, the matching portion of the string is returned. If a capture follows the regular expression, which may be a capture group index or name, follows the regular expression that component of the MatchData is returned instead.
string = "untitled(1)"
number = string[/\(([0-9]+)\)/, 1]
puts number
#=> 1
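For the "untitled(3) animal(4)" case from the question, one option is to collect every parenthesized integer with scan and keep the last one. This is only a sketch, and the helper name is hypothetical.

```ruby
# Hypothetical helper: returns the digits inside the last "(...)" pair, or nil.
def last_parenthesized_number(string)
  # scan returns one array per match, holding that match's capture groups.
  string.scan(/\((\d+)\)/).last&.first
end

p last_parenthesized_number("untitled")               # => nil
p last_parenthesized_number("untitled(2)")            # => "2"
p last_parenthesized_number("untitled(3) animal(4)")  # => "4"
p last_parenthesized_number("untitled($)")            # => nil

# String#[] also accepts the name of a named group as its second argument.
p "untitled(7)"[/\((?<n>\d+)\)/, "n"]                 # => "7"
```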

How to split a string without getting an empty string inserted in the array

I'm having trouble splitting a character from a string using a regular expression, assuming there is a match.
I want to split off either an "m" or an "f" character from the first part of a string assuming the next character is one or more numbers followed by optional space characters, followed by a string from an array I have.
I tried:
2.4.0 :006 > MY_SEPARATOR_TOKENS = ["-", " to "]
=> ["-", " to "]
2.4.0 :008 > str = "M14-19"
=> "M14-19"
2.4.0 :011 > str.split(/^(m|f)\d+[[:space:]]*#{Regexp.union(MY_SEPARATOR_TOKENS)}/i)
=> ["", "M", "19"]
Notice the extraneous "" element at the beginning of my array and also notice that the last expression is just "19" whereas I would want everything else in the string ("14-19").
How do I adjust my regular expression so that only the parts of the expression that get split end up in the array?
I find match to be a bit more elegant when extracting characters from regular expressions in Ruby:
string = "M14-19"
string.match(/\A(?<m>[MF])(?<digits>\d{2}(-| to )\d{2})/)[1, 2]
=> ["M", "14-19"]
# the named groups can also be read from the MatchData by symbol
extract_string = string.match(/\A(?<m>[MF])(?<digits>\d{2}(-| to )\d{2})/)
[extract_string[:m], extract_string[:digits]]
=> ["M", "14-19"]
string = 'M14 to 14'
extract_string = string.match(/\A(?<m>[MF])(?<digits>\d{2}(-| to )\d{2})/)[1, 2]
=> ["M", "14 to 14"]
TOKENS = ["-", " to "]
r = /
(?<=\A[mMfF]) # match the beginning of the string and then one
# of the 4 characters in a positive lookbehind
(?= # begin positive lookahead
\d+ # match one or more digits
[[:space:]]* # match zero or more spaces
(?:#{TOKENS.join('|')}) # match one of the tokens
) # close the positive lookahead
/x # free-spacing regex definition mode
(?:#{TOKENS.join('|')}) is replaced by (?:-| to ).
This can of course be written in the usual way.
r = /(?<=\A[mMfF])(?=\d+[[:space:]]*(?:#{TOKENS.join('|')}))/
When splitting on r you are splitting between two characters (between a positive lookbehind and a positive lookahead) so no characters are consumed.
"M14-19".split r
#=> ["M", "14-19"]
"M14 to 19".split r
#=> ["M", "14 to 19"]
"M14 To 19".split r
#=> ["M14 To 19"]
If it is desired that ["M", "14 To 19"] be returned in the last example, change [mMfF] to [mf] and /x to /xi.
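One detail worth checking with the free-spacing (/x) version: literal whitespace in the pattern is ignored, and that includes the spaces interpolated from " to ". The sketch below compares the plain join with Regexp.union(...).source, which escapes each token. The names joined and escaped are only illustrative.

```ruby
TOKENS = ["-", " to "]

joined  = /\A(?:#{TOKENS.join('|')})\z/x                # becomes (?:-|to) under /x
escaped = /\A(?:#{Regexp.union(TOKENS).source})\z/x     # spaces are escaped

p joined.match?("to")      # => true   the spaces around "to" were dropped
p joined.match?(" to ")    # => false
p escaped.match?(" to ")   # => true   escaped spaces survive /x
p escaped.match?("to")     # => false
```

The split examples above still work with the plain join because [[:space:]]* consumes the space before "to".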
You have a bug brewing in your code. Don't get in the habit of doing this:
#{Regexp.union(MY_SEPARATOR_TOKENS)}
You're setting yourself up with a very hard to debug problem.
Here's what's happening:
regex = Regexp.union(%w(a b)) # => /a|b/
/#{regex}/ # => /(?-mix:a|b)/
/#{regex.source}/ # => /a|b/
/(?-mix:a|b)/ is an embedded sub-pattern with its own set of the regex flags m, i and x, which are independent of the surrounding pattern's settings.
Consider this situation:
'CAT'[/#{regex}/i] # => nil
We'd expect the regular expression's i flag to make this match because it ignores case, but the sub-expression still allows only lowercase, causing the match to fail.
Using the bare (a|b) or adding source succeeds because the inner expression gets the main expression's i:
'CAT'[/(a|b)/i] # => "A"
'CAT'[/#{regex.source}/i] # => "A"
See "How to embed regular expressions in other regular expressions in Ruby" for additional discussion of this.
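Applied to the separator tokens from the question, the difference looks like this. It is a sketch: the names isolated and shared are only illustrative, and the (?: ) wrapper is added so the alternation from source does not spill into the surrounding pattern.

```ruby
MY_SEPARATOR_TOKENS = ["-", " to "]
union = Regexp.union(MY_SEPARATOR_TOKENS)   # escapes each token, joins with |

# Interpolating the Regexp object keeps its own flags (i is off inside it).
isolated = /\d+(?:#{union})\d+/i
# Interpolating the source lets the outer /i apply to the tokens too.
shared   = /\d+(?:#{union.source})\d+/i

p "14 TO 19"[isolated]   # => nil
p "14 TO 19"[shared]     # => "14 TO 19"
p "14-19"[isolated]      # => "14-19"
```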
The empty element will always be there if you get a match, because the captured part appears at the beginning of the string and the string between the start of the string and the match is added to the resulting array, be it an empty or non-empty string. Either shift/drop it once you get a match, or just remove all empty array elements with .reject { |c| c.empty? } (see How do I remove blank elements from an array?).
Then, 14- is eaten up (consumed) by the \d+[[:space:]]... pattern part - put it into a (?=...) lookahead that will just check for the pattern match, but won't consume the characters.
Use something like
MY_SEPARATOR_TOKENS = ["-", " to "]
s = "M14-19"
p s.split(/^(m|f)(?=\d+[[:space:]]*#{Regexp.union(MY_SEPARATOR_TOKENS)})/i).drop(1)
#=> ["M", "14-19"]

Ruby regexp returning "" vs nil

What is the reason behind different results between the following regexp statements:
"abbcccddddeeee"[/z*/] # => ""
And these that return nil:
"some matching content"[/missing/] # => nil
"start end"[/\Aend/] # => nil
What's happening is that /z*/ matches zero or more occurrences of z, so it succeeds with an empty match.
If you use /z+/, which requires one or more, you'll see it returns nil as expected.
The regular expression /z*/ matches 0 or more z characters, so it also matches an empty string at the beginning of your string. Consider this:
"abbcccddddeeee" =~ /z*/
# => 0
Thus String#[] returns the matched empty string.
In your second example the expressions /missing/ and /\Aend/ don't match anything so nil is returned.
The * quantifier means zero or more matches, so even if z is not present you get an empty-string match. You can use + for one or more matches and ? for zero or one.
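A short sketch comparing the three quantifiers on the string from the question. The b examples are added here to show that the first successful match wins, even when it is empty.

```ruby
s = "abbcccddddeeee"

p s[/z*/]    # => ""    zero z's match at position 0
p s[/z+/]    # => nil   at least one z is required
p s[/z?/]    # => ""    zero or one z; zero matches at position 0
p s[/b*/]    # => ""    the empty match at position 0 wins, "bb" is never reached
p s[/b+/]    # => "bb"
p s[/ab?/]   # => "ab"  ? takes at most one b
```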

understanding regexp with array notation

I've met this code snippet:
erb = "#coding:UTF-8 _erbout = ''; _erbout.concat ..." # string is cut
erb[/\A(#coding[:=].*\r?\n)/, 1]
I know how regular expression works, but I am confused with the array notation. What does it mean to place a regexp in [], what does the second argument 1 mean?
str[regexp] is actually a method of class String, you can find it here http://www.ruby-doc.org/core/classes/String.html#M001128
The second argument 1 returns the text matched by the first subpattern, (#coding[:=].*\r?\n). Another example:
"ab123baab"[/(\d+)(ba+).*/, 0] # returns "123baab", the complete matched text; the 0 can be omitted
"ab123baab"[/(\d+)(ba+).*/, 1] # returns "123", since the first subpattern is (\d+)
"ab123baab"[/(\d+)(ba+).*/, 2] # returns "baa", since the second subpattern is (ba+)
The brackets are a method of String. See http://www.ruby-doc.org/core/classes/String.html:
If a Regexp is supplied, the matching portion of str is returned. If a numeric or name parameter follows the regular expression, that component of the MatchData is returned instead. If a String is given, that string is returned if it occurs in str. In both cases, nil is returned if there is no match.
The 1 means to return what's matched by the first pair of parentheses in the pattern.
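Applied to the snippet from the question, a sketch might look like the following. The sample string is an assumption (the original was cut), with the magic comment on its own line followed by template code.

```ruby
# Assumed sample input: a magic comment line, then the template code.
erb = "#coding:UTF-8\n_erbout = ''"

p erb[/\A(#coding[:=].*\r?\n)/]        # => "#coding:UTF-8\n"  whole match
p erb[/\A(#coding[:=].*\r?\n)/, 1]     # => "#coding:UTF-8\n"  first group
p erb[/\A#(coding)[:=](.*)\r?\n/, 2]   # => "UTF-8"            second group
p erb[/\A#(?<key>coding)[:=]/, "key"]  # => "coding"           named group

# Without a leading magic comment there is no match, so the result is nil.
p "_erbout = ''"[/\A(#coding[:=].*\r?\n)/, 1]   # => nil
```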
