Multiple in-place substitutions in one line - ruby

What is the most idiomatic way to do multiple in-place substitutions on a string when none of the patterns being replaced are guaranteed to have a match?
For instance, say I have an array of strings and I want to replace "sad" with "happy" and "goodbye" with "hello" in each one:
a = ["I am sad", "goodbye for now"]
# This will work:
a.map! do |s|
  s = s.gsub(/sad/,"happy").gsub(/goodbye/,"hello")
end
# So will this:
a.each do |s|
  s.gsub!(/sad/,"happy")
  s.gsub!(/goodbye/,"hello")
end
# This will fail when s does not match /sad/:
a.each do |s|
  s.gsub!(/sad/,"happy").gsub!(/goodbye/,"hello")
end
The first option seems a little bit silly, since logically I'm trying to do an in-place substitution rather than a re-assignment. The second option is okay, but my aesthetic sense tells me that it seems wrong to be required to turn the substitution into two sequential statements, particularly in cases when only one or the other substitution is expected to be successful (which, ironically, is exactly the case that causes the third version, which looks "right" to me, to fail). Also, it's probably wrong to use each destructively as I'm doing here, but if I used map! instead, I'd need to add s as the final line in the block to ensure I don't accidentally make nil entries when substitution fails, which seems almost sillier than the first option.
I'm guessing the reason the (g)sub! methods return nil when no substitution is done is because that makes them convenient for use in logical constructs, which is admittedly a very good reason (especially since the non-destructive versions obviously must return "true" values no matter what).
So...I know this is really very little more than a minor aesthetic quibble, but is there a better way than the two (working) ways I've shown? If not, is there any reason to prefer one over the other (beyond my intuitive aesthetic preference for the second version)?

First, create a replacement hash that maps each pattern to its substitute:
replacement_hash = { "sad" => "happy", "goodbye" => "hello"}
a = ["I am sad", "goodbye for now"]
Regexp.union(replacement_hash.keys) # => /sad|goodbye/
a.map { |s| s.gsub(Regexp.union(replacement_hash.keys), replacement_hash) }
# => ["I am happy", "hello for now"]
If you need in-place replacement, use gsub! instead:
a.each { |s| s.gsub!(Regexp.union(replacement_hash.keys), replacement_hash) }
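To make the nil-return behaviour concrete, here is a minimal sketch. The third string and the names replacements, pattern and results are illustrative additions, not part of the question. A single gsub! with the combined pattern mutates each string in place, and its nil return on a non-match is harmless because nothing is chained onto it.

```ruby
replacements = { "sad" => "happy", "goodbye" => "hello" }
pattern = Regexp.union(replacements.keys) # /sad|goodbye/

# dup keeps the strings mutable even if string literals are frozen
a = ["I am sad", "goodbye for now", "nothing to change"].map(&:dup)

# gsub! returns the receiver when it substituted something, nil otherwise
results = a.map { |s| s.gsub!(pattern, replacements) }

p a       # the strings after in-place substitution
p results # the return values of gsub!, including nil for the non-match
```

Building the pattern once outside the block also avoids calling Regexp.union for every element.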

Related

Detecting a missing sub!, sort!, map!, etc

After returning to Ruby from a long stint coding in another language, I regularly assume that foo.sort, foo.map {...}, foo.sub /bar/, 'zip' will change foo. Of course I meant foo.sort!, etc. But that usually takes 3 or 4 debugging potshots before I notice. Meanwhile, the sort is calculated, but then isn't assigned to anything. Can I make ruby warn about that missing lvalue, like a C compiler warns of a function's ignored return value?
You mean like Perl's somewhat infamous "using map in void context"? I don't know of Ruby having such a thing. Sounds like you need more unit testing to catch mistakes like this before they can worm into your code deeply enough to be considered bugs.
Keep in mind Ruby's a lot more flexible than languages like Perl. For example, the following code might be useful:
def rewrite(list)
  list.map do |row|
    row += '!'
  end
end
Now technically that's a map in a void context, but because it's used as a return value it might be captured elsewhere. It's the responsibility of the caller to make use of it. Flagging the method itself for some sort of warning is a level removed from what most linting type tools can do.
Here's a very basic parser:
@forgetful_methods = %w(sort map sub)
Dir['*.rb'].each do |script|
  File.readlines(script).each.with_index(1) do |line, i|
    @forgetful_methods.each do |method|
      if line =~ /\.#{method}(?!!)/ && $` !~ /(=|\b(puts|print|return)\b|^#)/
        puts format('%-25s (%3d) : %s', script, i, line.strip)
      end
    end
  end
end
# =>
# brace_globbing.rb ( 13) : subpatterns.map{|subpattern| explode_extglob(match.pre_match+subpattern+match.post_match)}.flatten
# delegate.rb ( 11) : @targets.map { |t| t.send(m, *args) }
It checks every ruby script in the current directory for sort, map or sub without ! that aren't preceded by =, puts, print or return.
It's just a start, but maybe it could help you find some of the low-hanging fruit.
There are many false positives, though.
A more complex version could use abstract syntax trees, for example with Ripper.
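As a rough illustration of the Ripper idea, the sketch below parses a source string with Ripper.sexp and reports top-level statements that are bare calls to one of the listed methods. The names FORGETFUL_METHODS, called_method and discarded_calls are hypothetical, and the sketch only looks at top-level statements. It does not descend into method bodies or blocks, and it cannot tell when a call is legitimately used as an implicit return value.

```ruby
require 'ripper'

# Method names whose non-bang form returns a new object (same list as the regex version).
FORGETFUL_METHODS = %w[sort map sub].freeze

# Returns the [:@ident, name, [line, column]] node of the outermost method
# call in a statement, or nil when the statement is not a plain method call.
def called_method(node)
  return nil unless node.is_a?(Array)

  case node.first
  when :method_add_block, :method_add_arg
    called_method(node[1]) # unwrap foo.map { ... } and foo.sub(...)
  when :call, :command_call
    node[3]                # [:call, receiver, operator, method_ident]
  end
end

# Returns [line, method_name] pairs for top-level statements whose value is discarded.
def discarded_calls(source)
  sexp = Ripper.sexp(source)
  return [] unless sexp # nil means the source did not parse

  sexp[1].map do |statement|
    ident = called_method(statement)
    next unless ident.is_a?(Array) && ident[0] == :@ident
    next unless FORGETFUL_METHODS.include?(ident[1])

    [ident[2][0], ident[1]]
  end.compact
end

sample = <<~RUBY
  foo = [3, 1, 2]
  foo.sort
  sorted = foo.sort
  foo.map { |x| x * 2 }
  foo.sort!
RUBY

p discarded_calls(sample)
```

Assignments are skipped because their top-level node is an assignment rather than a call, and foo.sort! is skipped because its identifier is sort!, which is not in the list.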

Is there a { |x| x } shorthand in ruby?

I often use .group_by{ |x| x } and .find{ |x| x }
The latter is to find the first item in an array which is true.
Currently I'm just using .compact.first but I feel like there must be an elegant way to use find here, like find(&:to_bool) or .find(true) that I'm missing.
Using .find(&:nil?) works but is the opposite of what I want, and I couldn't find a method that was the opposite of #find or #detect, or a method like #true?
So is there a more elegant way to write .find{ |x| x }? If not, I'll stick with .compact.first
(I know compact won't remove false but that's not a problem for me, also please avoid rails methods for this)
Edit: For my exact case it is used on arrays of only strings and nils e.g.
[nil, "x", nil, nil, nil, nil, "y", nil, nil, nil, nil] => "x"
If you do not care about what is returned you can sometimes use the hash method.
The feature you are asking for is not available in Ruby yet; however, it is on the Ruby roadmap:
https://bugs.ruby-lang.org/issues/6373
Expected to be implemented before 2035-12-25, can you wait?
That being said, how much typing is group_by{|x|x} ?
Edit:
As Stefan pointed out, my answer is no longer valid for Ruby 2.2 and above since the introduction of Object#itself.
There’s not.
If tap worked without a block you could do:
array.detect(&:tap)
But it doesn’t. Either way, I think what you have is extremely concise, idiomatic, and happens to be the same number of characters as the non-working above alternative, and thus you should stick with that:
array.compact.first
You could monkey-patch your way to getting a shorter version, but then it becomes unclear to anyone otherwise familiar with Ruby, which probably isn’t worth the minor “savings”.
As a curiosity, if you happened to want array.detect { |x| !x } (the opposite) you could do:
array.detect(&:!)
This works because !x is actually shorthand for x.!. Of course this would only ever give you nil or false, which is probably not very useful.
No, there is not. I personally have a utility library I include in all my projects which has something like
IDENTITY = -> x { x }
Then you would have
.group_by(&IDENTITY)
There is also Object#itself that simply returns self:
.group_by(&:itself)
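A short sketch of Object#itself (Ruby 2.2 and later) applied to both cases from the question. The sample arrays and variable names are illustrative only.

```ruby
values = [nil, "x", nil, nil, nil, nil, "y", nil]

# nil.itself is nil (falsy), so find returns the first truthy element
first_present = values.find(&:itself)

# group equal elements together, keyed by the element itself
grouped = %w[a b a c b a].group_by(&:itself)

p first_present
p grouped
```

Unlike compact.first, find(&:itself) also skips false, and it stops at the first truthy element instead of building an intermediate array.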
Although the tag is for ruby - with Rails (more specifically ActiveSupport) you are given a method presence which will work for anything that responds positively to present? (that would exclude blank strings, arrays, hashes, etc):
array.find(&:presence)
It's not quite equivalent to the preferred result, but it will work for most cases I've come across.
I frequently use group_by, map, select, sort_by, and other various hash methods. I discovered this useful little extension yesterday by fiddling around with another answer on a similar question:
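class Hash
  # Treat a missing zero-argument method as a key lookup, e.g. h.name for h[:name].
  def method_missing(name, *args, &block)
    return self[name] if args.empty? && key?(name)
    super # fall back to the usual NoMethodError
  end

  def respond_to_missing?(name, include_private = false)
    key?(name) || super
  end
end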
For any hash created by ruby, or any data that has been jsonified by as_json, this addition allows me to write code which is a little shorter. Example:
# make yellow cells
yellow = red = false
tube_steps_status.group_by(&:step_ordinal).each do |type|
  group = type.last.select(&:completed).sort_by(&:completed)
  red = true if group.last.step_status == 'red' if group.any?
  yellow = true if group.map(&:step_status).include?('red')
end
tube_summary_status = 'yellow' if yellow unless red

Ruby while syntax

Does anybody know why I can write this:
ruby-1.8.7-p302 > a = %w( a b c)
=> ["a", "b", "c"]
ruby-1.8.7-p302 > while (i = a.shift) do; puts i ; end
a
b
c
=> nil
Which looks like passing a block to while.
And not:
while(i = a.shift) { puts i; }
Is it because the "do" of the while syntax is just syntactic sugar and has nothing to do with the "do" of a block?
Is it because the do of the while syntax is just syntactic sugar and has nothing to do with the do of a block?
More or less, yes. It's not syntactic sugar, it's simply a built-in language construct, like def or class, as @meagar already wrote.
It has nothing to do with the do of a block, except that keywords are expensive and so reusing keywords makes sense. (By "expensive" I mean that they limit the programmer in his expressiveness.)
In a while loop, there are two ways to separate the block from the condition:
the do keyword and
an expression separator.
There are, in turn, two different expression separators in Ruby:
the semicolon ; and
a newline
So, all three of the following are valid:
while i = a.shift do puts i end # do
while i = a.shift; puts i end # semicolon
while i = a.shift
puts i end # newline
[Obviously, that last one wouldn't be written that way, you would put the end on a new line, dedented to match the while. I just wanted to demonstrate what is the minimum needed to separate the parts of the while loop.]
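The three separators can be checked side by side with a small sketch. The arrays and the out accumulator are illustrative names, and the elements are collected instead of printed so the result is easy to inspect.

```ruby
out = []

a = %w[a b c]
while el = a.shift do out << el end   # separated by do

b = %w[a b c]
while el = b.shift; out << el end     # separated by a semicolon

c = %w[a b c]
while el = c.shift                    # separated by a newline
  out << el
end

p out
p a.empty?
```

One caveat with this loop shape: the condition is the shifted element itself, so an array containing nil or false would end the loop early.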
By the way: it is highly un-idiomatic to put the condition in parentheses. There are also a lot of superfluous semicolons in your code. And the variable name i is usually reserved for an index, not an element. (I normally use el for generic elements, but I much prefer more semantic names.)
It is also highly un-idiomatic to iterate a collection manually. Your code would be much better written as
a.each(&method(:puts)).clear
Not only is it much easier to understand what this does (print all elements of the array and delete all items from it), it is also much easier to write (there is no way to get the termination condition wrong, or screw up any assignments). It also happens to be more efficient: your version is Θ(n²), this one is Θ(n).
And actually, that's not really how you would write it, either, because Kernel#puts already implements that behavior, anyway. So, what you would really write is this
puts a
a.clear
or maybe this
a.tap(&method(:puts)).clear
[Note: this very last one is not 100% equivalent. It prints a newline for an empty array, all the other ones print nothing.]
Simple. Clear. Concise. Expressive. Fast.
Compare that to:
while (i = a.shift) do; puts i ; end
I actually had to run that multiple times to be 100% clear what it does.
while doesn't take a block, it's a language construct. The do is optional:
while (i = a.shift)
  puts i
end

is this a valid ruby syntax?

if step.include? "apples" or "banana" or "cheese"
  say "yay"
end
Several issues with your code.
step.include? "apples" or "banana" or "cheese"
This expression evaluates to:
step.include?("apples") or ("banana") or ("cheese")
Because Ruby treats all values other than false and nil as true, this expression will always be true. (In this case, the value "banana" will short-circuit the expression and cause it to evaluate as true, even if the value of step does not contain any of these three.)
Your intent was:
step.include? "apples" or step.include? "banana" or step.include? "cheese"
However, this is inefficient. Also it uses or instead of ||, which has a different operator precedence, and usually shouldn't be used in if conditionals.
Normal or usage:
do_something or raise "Something went wrong."
A better way of writing this would have been:
step =~ /apples|banana|cheese/
This uses a regular expression, which you're going to use a lot in Ruby.
And finally, there is no say method in Ruby unless you define one. Normally you would print something by calling puts.
So the final code looks like:
if step =~ /apples|banana|cheese/
  puts "yay"
end
The last two terms appear to Ruby as true, rather than having anything to do with the include? phrase.
Assuming that step is a string...
step = "some long string with cheese in the middle"
you could write something like this.
puts "yay" if step.match(/apples|banana|cheese/)
Here's a way to call step.include? on each of the arguments until one of them returns true:
if ["apples", "banana", "cheese"].any? {|x| step.include? x}
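A runnable sketch contrasting the original expression with the any? version. The sample strings and variable names are made up for illustration.

```ruby
words = ["apples", "banana", "cheese"]
plain = "plain toast"
step  = "some long string with cheese in the middle"

# Parsed as (plain.include?("apples")) or ("banana") or ("cheese"),
# so the result is the truthy string "banana" even though nothing matched.
buggy = (plain.include? "apples" or "banana" or "cheese")

# any? calls include? once per word and stops at the first match
found   = words.any? { |word| step.include?(word) }
missing = words.any? { |word| plain.include?(word) }

puts "yay" if found
```

Keeping the candidate words in an array means adding a fourth word does not require touching the condition.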
It's definitely not what you appear to be wanting. The include? method takes in a String, which is not what "apples" or "banana" or "cheese" produces. Try this instead:
puts "yay" if ["apples", "banana", "cheese"].include?(step)
But it's unclear from the context what step is supposed to be. If it's just the single word, then this is fine. If it can be a whole sentence, try joel.neely's answer.
The closest thing to that syntax that would do what you appear to want would be something like:
if ["apples", "banana", "cheese"].include?(step)
  puts "yay"
end
But one of the other suggestions using a regex would be more concise and readable.
Assuming step is an Array or a Set or something else that supports set intersection with the & operator, I think the following code is the most idiomatic:
unless (step & ["apples","banana","cheese"]).empty?
  puts 'yay'
end
I'll add some parentheses for you:
if (step.include? "apples") or ("banana") or ("cheese")
  say "yay"
end
(That would be why it's always saying "yay" -- because the expression will always be true.)
Just to add another side to this...
If step is an Array (as calling include? seems to suggest) then maybe the code should be:
if (step - %w{apples banana cheese}) != step
  puts 'yay'
end

What is the "right" way to iterate through an array in Ruby?

PHP, for all its warts, is pretty good on this count. There's no difference between an array and a hash (maybe I'm naive, but this seems obviously right to me), and to iterate through either you just do
foreach (array/hash as $key => $value)
In Ruby there are a bunch of ways to do this sort of thing:
array.length.times do |i|
end
array.each
array.each_index
for i in array
Hashes make more sense, since I just always use
hash.each do |key, value|
Why can't I do this for arrays? If I want to remember just one method, I guess I can use each_index (since it makes both the index and value available), but it's annoying to have to do array[index] instead of just value.
Oh right, I forgot about array.each_with_index. However, this one sucks because it goes |value, key| and hash.each goes |key, value|! Is this not insane?
This will iterate through all the elements:
array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }
# Output:
1
2
3
4
5
6
This will iterate through all the elements giving you the value and the index:
array = ["A", "B", "C"]
array.each_with_index {|val, index| puts "#{val} => #{index}" }
# Output:
A => 0
B => 1
C => 2
I'm not quite sure from your question which one you are looking for.
I think there is no one right way. There are a lot of different ways to iterate, and each has its own niche.
each is sufficient for many usages, since I don't often care about the indexes.
each_with_index acts like Hash#each - you get the value and the index.
each_index - just the indexes. I don't use this one often. Equivalent to "length.times".
map is another way to iterate, useful when you want to transform one array into another.
select is the iterator to use when you want to choose a subset.
inject is useful for generating sums or products, or collecting a single result.
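The iterators in this list can be compared on one small array. This is only a sketch, and the variable names are illustrative.

```ruby
array = [1, 2, 3, 4, 5, 6]

# each_with_index yields the value first, then the index
pairs = []
array.each_with_index { |value, index| pairs << [index, value] }

# each_index yields only the indexes
indexes = []
array.each_index { |index| indexes << index }

doubled = array.map { |x| x * 2 }                # transform into a new array
evens   = array.select(&:even?)                  # choose a subset
sum     = array.inject(0) { |total, x| total + x } # collect a single result

p pairs.first
p indexes
p doubled
p evens
p sum
```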
It may seem like a lot to remember, but don't worry, you can get by without knowing all of them. But as you start to learn and use the different methods, your code will become cleaner and clearer, and you'll be on your way to Ruby mastery.
I'm not saying that Array -> |value,index| and Hash -> |key,value| is not insane (see Horace Loeb's comment), but I am saying that there is a sane way to expect this arrangement.
When I am dealing with arrays, I am focused on the elements in the array (not the index because the index is transitory). The method is each with index, i.e. each+index, or |each,index|, or |value,index|. This is also consistent with the index being viewed as an optional argument, e.g. |value| is equivalent to |value,index=nil| which is consistent with |value,index|.
When I am dealing with hashes, I am often more focused on the keys than the values, and I am usually dealing with keys and values in that order, either key => value or hash[key] = value.
If you want duck-typing, then either explicitly use a defined method as Brent Longborough showed, or an implicit method as maxhawkins showed.
Ruby is all about accommodating the language to suit the programmer, not about the programmer accommodating to suit the language. This is why there are so many ways. There are so many ways to think about something. In Ruby, you choose the closest and the rest of the code usually falls out extremely neatly and concisely.
As for the original question, "What is the “right” way to iterate through an array in Ruby?", well, I think the core way (i.e. without powerful syntactic sugar or object oriented power) is to do:
for index in 0 ... array.size
  puts "array[#{index}] = #{array[index].inspect}"
end
Ruby is all about powerful syntactic sugar and object-oriented power, but anyway, here is the equivalent for hashes; the keys can be ordered or not:
for key in hash.keys.sort
  puts "hash[#{key.inspect}] = #{hash[key].inspect}"
end
So, my answer is, "The “right” way to iterate through an array in Ruby depends on you (i.e. the programmer or the programming team) and the project.". The better Ruby programmer makes the better choice (of which syntactic power and/or which object oriented approach). The better Ruby programmer continues to look for more ways.
Now, I want to ask another question, "What is the “right” way to iterate through a Range in Ruby backwards?"! (This question is how I came to this page.)
It is nice to do (for the forwards):
(1..10).each{|i| puts "i=#{i}" }
but I don't like to do (for the backwards):
(1..10).to_a.reverse.each{|i| puts "i=#{i}" }
Well, I don't actually mind doing that too much, but when I am teaching going backwards, I want to show my students a nice symmetry (i.e. with minimal difference, e.g. only adding a reverse, or a step -1, but without modifying anything else).
You can do (for symmetry):
(a=*1..10).each{|i| puts "i=#{i}" }
and
(a=*1..10).reverse.each{|i| puts "i=#{i}" }
which I don't like much, but you can't do
(*1..10).each{|i| puts "i=#{i}" }
(*1..10).reverse.each{|i| puts "i=#{i}" }
#
(1..10).step(1){|i| puts "i=#{i}" }
(1..10).step(-1){|i| puts "i=#{i}" }
#
(1..10).each{|i| puts "i=#{i}" }
(10..1).each{|i| puts "i=#{i}" } # I don't want this though. It's dangerous
You could ultimately do
class Range
  def each_reverse(&block)
    self.to_a.reverse.each(&block)
  end
end
but I want to teach pure Ruby rather than object oriented approaches (just yet). I would like to iterate backwards:
without creating an array (consider 0..1000000000)
working for any Range (e.g. Strings, not just Integers)
without using any extra object oriented power (i.e. no class modification)
I believe this is impossible without defining a pred method, which means modifying the Range class to use it. If you can do this please let me know, otherwise confirmation of impossibility would be appreciated though it would be disappointing. Perhaps Ruby 1.9 addresses this.
(Thanks for your time in reading this.)
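For Integer endpoints specifically, Integer#upto and Integer#downto give a symmetric pair without building an array and without modifying any class. This is only a partial sketch: it does not cover the "any Range" requirement (for example Strings), which has no built-in predecessor method. The variable names are illustrative.

```ruby
forwards = []
1.upto(10) { |i| forwards << i }

backwards = []
10.downto(1) { |i| backwards << i }

# Without a block, downto returns an enumerator, so a huge countdown
# can be consumed lazily instead of being materialised as an array.
top_three = 1_000_000_000.downto(0).first(3)

p forwards
p backwards
p top_three
```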
Use each_with_index when you need both.
ary.each_with_index { |val, idx| # ...
The other answers are just fine, but I wanted to point out one other peripheral thing: Arrays are ordered, whereas Hashes are not in 1.8. (In Ruby 1.9, Hashes are ordered by insertion order of keys.) So it wouldn't make sense prior to 1.9 to iterate over a Hash in the same way/sequence as Arrays, which have always had a definite ordering. I don't know what the default order is for PHP associative arrays (apparently my google fu isn't strong enough to figure that out, either), but I don't know how you can consider regular PHP arrays and PHP associative arrays to be "the same" in this context, since the order for associative arrays seems undefined.
As such, the Ruby way seems more clear and intuitive to me. :)
Here are the four options listed in your question, arranged by freedom of control. You might want to use a different one depending on what you need.
Simply go through values:
array.each
Simply go through indices:
array.each_index
Go through values with a loop variable (which stays in scope after the loop):
for i in array
Control loop count + index variable:
array.length.times do | i |
Trying to do the same thing consistently with arrays and hashes might just be a code smell, but, at the risk of my being branded as a codorous half-monkey-patcher, if you're looking for consistent behaviour, would this do the trick?:
class Hash
  def each_pairwise
    self.each { | x, y |
      yield [x, y]
    }
  end
end
class Array
  def each_pairwise
    self.each_with_index { | x, y |
      yield [y, x]
    }
  end
end
["a","b","c"].each_pairwise { |x,y|
  puts "#{x} => #{y}"
}
{"a" => "Aardvark","b" => "Bogle","c" => "Catastrophe"}.each_pairwise { |x,y|
  puts "#{x} => #{y}"
}
I'd been trying to build a menu (in Camping and Markaby) using a hash.
Each item has 2 elements: a menu label and a URL, so a hash seemed right, but the '/' URL for 'Home' always appeared last (as you'd expect for a hash), so menu items appeared in the wrong order.
Using an array with each_slice does the job:
['Home', '/', 'Page two', 'two', 'Test', 'test'].each_slice(2) do |label, link|
  li {a label, :href => link}
end
Adding extra values for each menu item (e.g. like a CSS ID name) just means increasing the slice value. So, like a hash but with groups consisting of any number of items. Perfect.
So this is just to say thanks for inadvertently hinting at a solution!
Obvious, but worth stating: I suggest checking if the length of the array is divisible by the slice value.
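The divisibility check could look like the sketch below. It builds plain hashes rather than Markaby tags so that it runs on its own. The names menu, slice_size and items are illustrative.

```ruby
menu = ['Home', '/', 'Page two', 'two', 'Test', 'test']
slice_size = 2

# Guard against a trailing, incomplete group before slicing
unless (menu.length % slice_size).zero?
  raise ArgumentError, "menu has #{menu.length} entries, not a multiple of #{slice_size}"
end

items = menu.each_slice(slice_size).map do |label, link|
  { label: label, href: link }
end

p items
```

Without the check, a missing URL would silently produce a final item whose link is nil.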
If you use the enumerable mixin (as Rails does) you can do something similar to the php snippet listed. Just use the each_slice method and flatten the hash.
require 'enumerator'
['a',1,'b',2].to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
# is equivalent to...
{'a'=>1,'b'=>2}.to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
Less monkey-patching required.
However, this does cause problems when you have a recursive array or a hash with array values. In ruby 1.9 this problem is solved with a parameter to the flatten method that specifies how deep to recurse.
# Ruby 1.8
[1,2,[1,2,3]].flatten
=> [1,2,1,2,3]
# Ruby 1.9
[1,2,[1,2,3]].flatten(0)
=> [1,2,[1,2,3]]
As for the question of whether this is a code smell, I'm not sure. Usually when I have to bend over backwards to iterate over something I step back and realize I'm attacking the problem wrong.
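The depth parameter to flatten matters as soon as a hash has array values. A minimal sketch, with an illustrative hash:

```ruby
hash = { 'a' => [1, 2], 'b' => [3] }

# Full flatten merges the array values into the key/value stream
fully_flat = hash.to_a.flatten

# flatten(1) removes only the pair nesting, keeping array values intact
one_level = hash.to_a.flatten(1)

pairs = one_level.each_slice(2).to_a

p fully_flat
p one_level
p pairs
```

With flatten(1) the sliced pairs match hash.to_a again, whereas the fully flattened list can no longer be split back into key/value pairs.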
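each_with_index has not been removed: Array gets it from the Enumerable mixin, so it is not listed among Array's own methods.
If you only need the indices, you can use each_index instead.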
Example:
a = [ "a", "b", "c" ]
a.each_index {|x| print x, " -- " }
produces:
0 -- 1 -- 2 --
The right way is the one you feel most comfortable with and which does what you want it to do. In programming there is rarely one 'correct' way to do things, more often there are multiple ways to choose.
If you are comfortable with a certain way of doing things, just do it, unless it doesn't work; then it is time to find a better way.
Using the same method for iterating through both arrays and hashes makes sense, for example to process the nested hash-and-array structures that often result from parsers, from reading JSON files, etc.
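As a sketch of that use case, the hypothetical helper walk below visits every leaf of a nested hash-and-array structure, treating array indexes and hash keys uniformly as path segments. It uses only built-in methods and does not patch any class. The helper name and the sample data are assumptions for illustration.

```ruby
# Yields [path, leaf] for every non-container value in a nested structure.
def walk(node, path = [], &block)
  case node
  when Hash
    node.each_pair { |key, value| walk(value, path + [key], &block) }
  when Array
    node.each_with_index { |value, index| walk(value, path + [index], &block) }
  else
    block.call(path, node)
  end
end

data = { 'greeting' => ['Hello', 'World'], 'meta' => { 'punctuation' => '!' } }

leaves = []
walk(data) { |path, value| leaves << [path, value] }

p leaves
```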
One clever way that has not yet been mentioned is how it's done in the Ruby Facets library of standard library extensions. From here:
class Array
  # Iterate over index and value. The intention of this
  # method is to provide polymorphism with Hash.
  #
  def each_pair #:yield:
    each_with_index {|e, i| yield(i,e) }
  end
end
There is already Hash#each_pair, an alias of Hash#each. So after this patch, we also have Array#each_pair and can use it interchangeably to iterate through both Hashes and Arrays. This fixes the OP's observed insanity that Array#each_with_index has the block arguments reversed compared to Hash#each. Example usage:
my_array = ['Hello', 'World', '!']
my_array.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"
my_hash = { '0' => 'Hello', '1' => 'World', '2' => '!' }
my_hash.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"
