ruby regex scan vs .split method - ruby

I was trying to build a method that takes the first letter of every word and capitalizes it. I wrote it as
def titleize(name)
  name.scan(/\w+/) { |x| x.capitalize! }
end
and it just wouldn't work properly. It wouldn't capitalize any letters. I did some searching and eventually found the answer here: Capitalizing titles. It was written as
def titleize(name)
  name.split(" ").each { |x| x.capitalize! }.join(" ")
end
How come my code didn't capitalize at all, though? If I added a puts statement and wrote it as
def titleize(name)
  name.scan(/\w+/) { |x| puts x.capitalize! }
end
it would print the words with capitals, but the return value (the =>) would still be just "hi there". What did I miss?

Corrected code:
def titleize(name)
  name.scan(/\w+/).each { |x| x.capitalize! }.join(' ')
end
p titleize("ayan roy") #=>"Ayan Roy"
Let's see why yours did not work:
def titleize(name)
  name.scan(/\w+/)
end
p titleize("ayan roy") #=>["ayan", "roy"]
In your line name.scan(/\w+/) { |x| x.capitalize! }, x is passed "ayan" and then "roy". Now look at the following:
def titleize(name)
  name.scan(/\w+/) { |x| p x.capitalize! }
end
p titleize("ayan roy")
Output:
"Ayan"
"Roy"
"ayan roy"
As the String#scan documentation says:
scan(pattern) {|match, ...| block } → str: if a block is given, scan returns the receiver on which it was called. Both forms iterate through str, matching the pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block.
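The two forms of scan described above can be compared side by side. This is a minimal sketch; the variable names are chosen only for illustration.

```ruby
name = "ayan roy"

# Without a block, scan returns an array of the matches.
words = name.scan(/\w+/)

# With a block, each match is yielded to the block,
# and scan returns the receiver instead of the matches.
seen = []
result = name.scan(/\w+/) { |x| seen << x.capitalize }

p words   # ["ayan", "roy"]
p seen    # ["Ayan", "Roy"]
p result  # "ayan roy"
```

Whatever the block does with each match, the return value of the block form is the original string, which is why the method in the question returned the uncapitalized name.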

scan returns or yields new strings and never modifies the source string. Perhaps you want gsub.
def titleize(name)
  name.gsub(/\w+/) { |x| x.capitalize }
end
Or, perhaps better, use the likely more correct implementation from the titleize gem.

Your code doesn't work because #scan creates new String objects from the Regexp matches and passes them to the block. In your method you mutated these new objects by calling #capitalize!, but never used them anywhere afterwards.
You should do this instead:
def titleize(name)
  name.scan(/\w+/).each { |x| x.capitalize! }.join(' ')
end
But this seems more readable to me:
def titleize2(name)
  name.split(' ').each { |w| w.capitalize! }.join(' ')
end
Note, however, that these methods do not mutate the original argument passed in.

The block form of scan returns the original string, regardless of what you do in the block. (I think you may be able to alter the original string in the block by referring directly to it, but it's not recommended to alter the thing you're iterating over.) Instead, do your split variation, but instead of each, do collect followed by join:
name.split(" ").collect { |x| x.capitalize }.join(" ")
This works for titles containing numerals and punctuation, as well.
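To make the difference with punctuation concrete, here is a small comparison. The method names titleize_scan and titleize_split and the sample title are made up for this sketch.

```ruby
# scan(/\w+/) keeps only runs of word characters, so punctuation is dropped.
def titleize_scan(name)
  name.scan(/\w+/).map(&:capitalize).join(" ")
end

# split(" ") cuts on whitespace only, so punctuation stays attached to its word.
def titleize_split(name)
  name.split(" ").collect { |x| x.capitalize }.join(" ")
end

title = "catch-22, a novel: 2nd edition"

p titleize_scan(title)   # "Catch 22 A Novel 2nd Edition"
p titleize_split(title)  # "Catch-22, A Novel: 2nd Edition"
p title                  # "catch-22, a novel: 2nd edition" (input is untouched)
```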

Related

Syntax error, unexpected tIDENTIFIER, expecting ')' Ruby

I get the following error when running a simple method that takes in a proper noun string and returns the string properly capitalized.
def format_name(str)
  parts = str.split
  arr = []
  parts.map do |part|
    if part[0].upcase
    else part[1..-1].downcase
      arr << part
    end
  end
  return arr.join(" ")
end
Test cases:
puts format_name("chase WILSON") # => "Chase Wilson"
puts format_name("brian CrAwFoRd scoTT") # => "Brian Crawford Scott"
The only way the above code can return blank output is if arr is empty. And arr is empty in your case because of this line of code:
if part[0].upcase
This condition is always truthy: part[0].upcase returns a new string, and in Ruby every value other than nil and false counts as true.
Hence, your else block never gets executed. Even if it were executed, it would give back the same string as the input, because you are pushing the plain part into arr without any formatting.
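The behavior of that condition can be checked in isolation. This is a minimal sketch with a made-up sample word.

```ruby
part = "wILSON"

# upcase returns a new String. Only nil and false are falsy in Ruby,
# so the if branch is taken for every word that split produces.
branch = if part[0].upcase
  :if_branch
else
  :else_branch
end

p part[0].upcase  # "W"
p branch          # :if_branch
p part            # "wILSON" (the upcased letter was never stored anywhere)
```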
There are several ways to get the above code working. Here are two:
# one where your map method could work
def format_name(str)
  parts = str.split
  arr = parts.map do |part|
    part.capitalize
  end
  return arr.join(" ")
end
# one where your loop code logic works
def format_name(str)
  parts = str.split
  arr = []
  parts.each do |part|
    arr << "#{part[0].upcase}#{part[1..-1].downcase}"
  end
  return arr.join(" ")
end
There are numerous other ways this could work. Here is the one I prefer when using plain Ruby:
def format_name(str)
  str.split(' ').map(&:capitalize).join(' ')
end
You could also read more about the Open Classes concept to put this into Ruby's String class.
Also, check out the titleize method if you're using Rails.
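As a rough illustration of the Open Classes idea, the method could be added to String directly. This is only a sketch: reopening a core class affects every string in the program, and the method name format_name is a choice made for this example.

```ruby
# Reopen the built-in String class and add a method to it.
# format_name is a hypothetical name used for this sketch.
class String
  def format_name
    split.map(&:capitalize).join(" ")
  end
end

p "chase WILSON".format_name          # "Chase Wilson"
p "brian CrAwFoRd scoTT".format_name  # "Brian Crawford Scott"
```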

Self enumerating function

I've got some code:
def my_each_with_index
  return enum_for(:my_each_with_index) unless block_given?
  i = 0
  self.my_each do |x|
    yield x, i
    i += 1
  end
  self
end
It is my own code, but the line:
return enum_for(:my_each_with_index) unless block_given?
is taken from other people's solutions. I don't understand why they pass the method name to enum_for as a parameter. When I invoke my method without a block, it doesn't seem to return anything, with or without enum_for. I could have left something like:
return unless block_given?
and it would have the same result. Or am I wrong?
When called without a block, it returns an enumerator:
▶ def my_each_with_index
▷ return enum_for(:my_each_with_index) unless block_given?
▷ end
#⇒ :my_each_with_index
▶ e = my_each_with_index
#⇒ #<Enumerator: main:my_each_with_index>
Later on you can iterate over this enumerator:
▶ e.each { |elem| ... }
This behavior is especially useful in some cases, such as lazy iteration or passing a block to the enumerator later.
Just returning nil cuts this ability off.
Thank you for the very precise answer. I also received a very good example that may help other new developers understand this issue:
def iterator
  yield 1
  yield 2
  yield 3
  puts "koniec" # Polish for "the end"
end
iterator { |v| puts v }
it = enum_for(:iterator)
puts it.next
puts it.next
puts it.next
puts it.next # prints "koniec", then raises StopIteration
Just run and analyze this code.
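For readers who want to see what happens on the fourth call to next without the script aborting, here is a hedged variation of that example. The return value :done and the variable names are additions made for this sketch; the rescue relies on StopIteration#result, which holds the return value of the enumerated method.

```ruby
def iterator
  yield 1
  yield 2
  yield 3
  :done   # return value of the method once all yields are used up
end

en = enum_for(:iterator)

# Each call to next runs the method up to its next yield.
values = [en.next, en.next, en.next]

# A fourth call finds no further yield: the method finishes
# and the enumerator raises StopIteration.
finished = nil
begin
  en.next
rescue StopIteration => e
  finished = e.result
end

p values    # [1, 2, 3]
p finished  # :done
```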
For any method that accepts a block, a good implementation should have well-defined behavior when the block is not given.
In the example you shared, each_with_index is being re-implemented by the author, maybe to provide additional semantics or maybe just as an exercise, given that its behavior is the same as Ruby's Enumerable#each_with_index.
The documentation says the following about Enumerable#each_with_index:
Calls block with two arguments, the item and its index, for each item
in enum. Given arguments are passed through to each().
If no block is given, an enumerator is returned instead.
To stay consistent with the last line of that description, which states what should happen when no block is given, one has to use something like
return enum_for(:my_each_with_index) unless block_given?
enum_for is an interesting method. Its documentation says:
enum_for creates a new Enumerator which will enumerate by calling method on obj.
Below is an example reproduced from the documentation:
str = "xyz"
enum = str.enum_for(:each_byte)
enum.each { |b| puts b }
# => 120
# => 121
# => 122
So, if one does not pass a block to my_each_with_index, they can still pass it later, just as one would with each_with_index.
e = obj.my_each_with_index
...
e.each do |x, i|
  # do something
end # `my_each_with_index` is executed here, later
In summary, my_each_with_index tries to be consistent with each_with_index and tries to be a well-behaved API.
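Putting the pieces of this thread together, the following is a minimal runnable sketch. NumberBox and its my_each are hypothetical stand-ins for whatever collection the asker's method lives in; only my_each_with_index follows the code from the question.

```ruby
# NumberBox and my_each are assumptions made for this sketch.
class NumberBox
  def initialize(*items)
    @items = items
  end

  def my_each
    return enum_for(:my_each) unless block_given?
    @items.each { |x| yield x }
    self
  end

  def my_each_with_index
    return enum_for(:my_each_with_index) unless block_given?
    i = 0
    my_each do |x|
      yield x, i
      i += 1
    end
    self
  end
end

box = NumberBox.new(10, 20, 30)

# No block given: an Enumerator comes back instead of nil.
e = box.my_each_with_index

p e.class                 # Enumerator
p e.to_a                  # [[10, 0], [20, 1], [30, 2]]
p e.map { |x, i| x * i }  # [0, 20, 60]
```

With a bare `return unless block_given?` the call would give back nil, and none of the Enumerable methods used in the last lines would be available.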

Capitalizing Vowels in a String with Ruby

I have a method that takes in a string as an argument, replaces each letter with the next letter in the alphabet and then capitalizes every vowel. I have gotten both of those to work individually (the replacing and capitalization), but at this point, I just don't know how to make them work together.
def LetterChanges(str)
  new_str = str.downcase.split("")
  new_str.each do |x|
    x.next!
  end
  new_str.to_s.tr!('aeiou','AEIOU')
  return new_str.join("")
end
LetterChanges("abcdef")
The result of new_str.to_s is not stored anywhere, so calling tr! on it doesn't affect the original array. Instead, use:
return new_str.join("").tr('aeiou', 'AEIOU')
This converts the array back to a string, applies tr to that string, and returns the result.
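A short sketch of why the original line had no visible effect; the variable names letters and temp are chosen for illustration.

```ruby
# Shift every letter, as in the question.
letters = "abcdef".split("").each { |x| x.next! }

# to_s builds a brand-new String describing the array.
temp = letters.to_s
temp.tr!('aeiou', 'AEIOU')   # changes temp only, not the array elements

p letters                                # ["b", "c", "d", "e", "f", "g"]
p letters.join("")                       # "bcdefg"
p letters.join("").tr('aeiou', 'AEIOU')  # "bcdEfg"
```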
This could also be solved with gsub:
"abcdef".gsub(/./){|char| char.next}.gsub(/[aeiou]/){|vowel| vowel.upcase}
#=> "bcdEfg"
So the method could be
def letter_changes_gsub(str)
  str.gsub(/./) { |char| char.next }.gsub(/[aeiou]/) { |vowel| vowel.upcase }
end
That is faster and simpler than working with arrays.
Other answers already showed you how to combine both parts of your code. But there's another issue: String#next continues with "aa" after "z":
"z".next #=> "aa"
You could add an if statement to handle this case:
str.chars.map do |char|
  if char == 'z'
    'a'
  else
    char.next
  end
end.join
or:
str.chars.map { |char| char == 'z' ? 'a' : char.next }.join
But there's a much simpler way: let String#tr perform the entire substitution:
str.downcase.tr('a-z', 'bcdEfghIjklmnOpqrstUvwxyzA')
Or slightly shorter:
str.downcase.tr('a-z', 'bcdEfghIjk-nOp-tUv-zA')
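As a quick check that the two replacement strings are equivalent, both can be applied to the whole alphabet. This is a small sketch; the sample sentence is made up.

```ruby
alphabet = ('a'..'z').to_a.join

# Spelled-out replacement list versus the version with ranges.
long  = alphabet.tr('a-z', 'bcdEfghIjklmnOpqrstUvwxyzA')
short = alphabet.tr('a-z', 'bcdEfghIjk-nOp-tUv-zA')

p long           # "bcdEfghIjklmnOpqrstUvwxyzA"
p long == short  # true

# Characters outside a-z (spaces, punctuation) are left alone,
# and z wraps around to A.
p "Lazy dog!".downcase.tr('a-z', 'bcdEfghIjk-nOp-tUv-zA')  # "mbAz Eph!"
```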
2.1.0 :012 > 'abcdef'.split('').map(&:next).join.tr('aeiou', 'AEIOU')
=> "bcdEfg"
I would not recommend doing this in one line, of course. But to get at your confusion of how these methods might string together, here is one solution that works. When in doubt, use IRB to call each method and watch how Ruby responds. That will help you figure out where your code is breaking down.
In practice, I would break this into multiple methods. It's too many things for one method to do, and it also makes bugs a lot harder to find (and test for), as you found out.
def rotate(string)
  string.split('').map(&:next).join
end
def capitalize_vowels(string)
  string.tr('aeiou', 'AEIOU')
end
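One way the two small methods might be combined is sketched below. The wrapper name letter_changes is an assumption for this example, and the inputs contain letters only, since next also changes spaces and punctuation.

```ruby
def rotate(string)
  string.split('').map(&:next).join
end

def capitalize_vowels(string)
  string.tr('aeiou', 'AEIOU')
end

# Hypothetical wrapper: downcase, shift each letter, then upcase the vowels.
def letter_changes(string)
  capitalize_vowels(rotate(string.downcase))
end

p letter_changes("abcdef")  # "bcdEfg"
p letter_changes("Hello")   # "Ifmmp"
```

Each step can now be tested on its own, which is the point of splitting the method.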
How about:
def string_thing(string)
  string.downcase.tr('abcdefghijklmnopqrstuvwxyz','bcdEfghIjklmnOpqrstUvwxyzA')
end
#tr simply replaces each character in the first parameter with the corresponding one in the second parameter.
This can be achieved with a combination of gsub and tr:
"abcdef".gsub(/[A-Za-z]/) { |char| char.next }.tr('aeiou', 'AEIOU')
#=> "bcdEfg"
"Fun times!".gsub(/[A-Za-z]/) { |char| char.next }.tr('aeiou', 'AEIOU')
#=> "GvO Ujnft!"
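One detail worth checking when matching letters with a character class: a range written as A-z also covers the six ASCII characters that sit between Z and a, whereas A-Za-z matches letters only. A minimal sketch:

```ruby
# The six characters between "Z" and "a" in ASCII: [ \ ] ^ _ `
between = "[\\]^_`"

p between.scan(/[A-z]/).length     # 6
p between.scan(/[A-Za-z]/).length  # 0
```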

Why does `join` un-capitalize strings in an array?

I'm writing a method that capitalizes each word in a string. Without using the join method, I can obtain a correct array (e.g. "david copperfield" gives ["David", "Copperfield"]):
def titleize(words)
  single_words = words.split(/ /)
  single_words.map {|i| i.capitalize}
  single_words.join(" ")
end
When I join the elements, they revert to lowercase. I have no idea why. Any help would be appreciated.
You must use map! instead of map, because map returns a new array and does not change the original one.
def titleize(words)
  single_words = words.split(/ /)
  single_words.map! {|i| i.capitalize}
  single_words.join(" ")
end
Use the destructive version of map, map!:
single_words.map! { |i| i.capitalize }
join does not un-capitalize it. You threw out the result of capitalizing, and passed the original uncapitalized array to join.
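The difference can be seen by keeping both arrays around. This is a minimal sketch with illustrative variable names.

```ruby
single_words = "david copperfield".split(/ /)

# map builds a new array and leaves single_words as it was.
capitalized = single_words.map { |w| w.capitalize }

p capitalized             # ["David", "Copperfield"]
p single_words            # ["david", "copperfield"]
p single_words.join(" ")  # "david copperfield"

# map! replaces the elements of single_words in place.
single_words.map! { |w| w.capitalize }
p single_words.join(" ")  # "David Copperfield"
```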
If your intent is to capitalize each substring separated by / /, then the more usual and better way is:
def titleize(words)
  words.gsub(/[^ ]+/, &:capitalize)
end
I am not quite sure why you are using an array.
If you go to the Array#map documentation page, you'll find that it
Invokes the given block once for each element of self.
Creates a new array containing the values returned by the block.
So .map returns a new array and doesn't modify the one you call it on. That's why your method worked when .map was the last instruction.
To fix your code, you could either replace .map with .map!, which modifies the array it is called on (single_words):
def titleize(words)
  single_words = words.split(/ /)
  single_words.map! {|i| i.capitalize}
  single_words.join(" ")
end
or replace .capitalize with .capitalize!, which modifies the strings in place:
def titleize(words)
  single_words = words.split(/ /)
  single_words.map {|i| i.capitalize!}
  single_words.join(" ")
end
or perform .join right after .map:
def titleize(words)
  single_words = words.split(/ /)
  single_words.map {|i| i.capitalize}.join(' ')
end
In fact, your method is simple enough to be a one-liner:
def titleize(string)
  string.split.map(&:capitalize).join(' ')
end

odd usage of "end" in Sample code

Looking through this, I noticed something I have never seen before on line 83: end.map(&:chomp). So end is an object? (I realize that might be 100% wrong.) Can someone explain what that does and how it works there? What exactly is the advantage?
No, end is not an object, but object.some_method do ... end is an expression that evaluates to an object, namely the object returned by the some_method method.
So if you do object.some_method do ... end.some_other_method, you're calling some_other_method on the object returned by some_method.
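A small sketch of the same idea with numbers, first in two steps and then chained on end. The data and variable names are made up for illustration.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# Two steps: store the result of select, then call map on it.
evens = numbers.select do |n|
  n.even?
end
doubled = evens.map { |n| n * 2 }

# One step: map is called on whatever select ... end evaluates to.
chained = numbers.select do |n|
  n.even?
end.map { |n| n * 2 }

p doubled  # [4, 8, 12]
p chained  # [4, 8, 12]
```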
The full code snippet you're referring to is below:
def initialize(dict_file)
  @dict_arr = File.readlines(dict_file).select do |word|
    !word.include?("-") && !word.include?("'")
  end.map(&:chomp)
end
Notice that the end you're talking about is the end of the block that starts on the 2nd line (it matches the do on line 2).
Perhaps if you see it parenthesized, and rewritten with curly braces, it will make more sense:
def initialize(dict_file)
  @dict_arr = (File.readlines(dict_file).select { |word|
    !word.include?("-") && !word.include?("'")
  }).map(&:chomp)
end
It's often helpful to examine what Ruby is doing, step by step. Let's see what's going on with the method ComputerPlayer#initialize:
def initialize(dict_file)
  @dict_arr = File.readlines(dict_file).select do |word|
    !word.include?("-") && !word.include?("'")
  end.map(&:chomp)
end
First, create a file:
File.write("my_file", "cat\ndog's\n")
When we execute:
ComputerPlayer.new("my_file")
the class method IO.readlines is called on File (which inherits it from IO) and returns an array a:
a = File.readlines("my_file")
#=> ["cat\n", "dog's\n"]
Array#select is called on the array a without a block, which creates an enumerator:
b = a.select
#=> #<Enumerator: ["cat\n", "dog's\n"]:select>
We can convert this enumerator to an array to see what it will pass to its block:
b.to_a
#=> ["cat\n", "dog's\n"]
The enumerator is invoked by sending it the method each with a block, and it returns an array c:
c = b.each { |word| !word.include?("-") && !word.include?("'") }
#=> ["cat\n"]
Lastly, we call Array#map with the argument &:chomp (the method String#chomp converted to a proc) on the array c:
c.map(&:chomp)
#=> ["cat"]
A final point: you can improve clarity by minimizing the use of !. For example, instead of
...select do |word|
  !word.include?("-") && !word.include?("'")
consider
...reject do |word|
  word.include?("-") || word.include?("'")
You might also use a regex.
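The regex variant might look like the sketch below. It assumes Ruby 2.4 or later for String#match?, and the sample lines are made up.

```ruby
lines = ["cat\n", "dog's\n", "well-read\n", "bird\n"]

# Drop every word that contains a hyphen or an apostrophe,
# then strip the trailing newlines.
words = lines.reject { |word| word.match?(/[-']/) }.map(&:chomp)

p words  # ["cat", "bird"]
```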
