Ruby Regex: Get Index of Capture - ruby

I've seen this question asked and answered for JavaScript regex, and the answer was long and very ugly. I'm curious whether anyone has a cleaner way to implement it in Ruby.
Here's what I'm trying to achieve:
Test String: "foo bar baz"
Regex: /.*(foo).*(bar).*/
Expected Return: [[0,2],[4,6]]
So my goal is to be able to run a method, passing in the test string and regex, that will return the indices where each capture group matched. I have included both the starting and ending indices of the capture groups in the expected return. I'll be working on this and adding my own potential solutions here along the way too. And of course, if there's a way other than regex that would be cleaner/easier to achieve this, that's a good answer too.

Something like this should work for any number of capture groups.
def match_indexes(string, regex)
  matches = string.match(regex)
  (1...matches.length).map do |index|
    [matches.begin(index), matches.end(index) - 1]
  end
end
string = "foo bar baz"
match_indexes(string, /.*(foo).*/)
match_indexes(string, /.*(foo).*(bar).*/)
match_indexes(string, /.*(foo).*(bar).*(baz).*/)
# => [[0, 2]]
# => [[0, 2], [4, 6]]
# => [[0, 2], [4, 6], [8, 10]]
You can have a look at the (kind of strange) MatchData class for how this works. http://www.ruby-doc.org/core-1.9.3/MatchData.html
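A slightly more defensive variant, offered only as a sketch: MatchData#offset returns the start and the (exclusive) end of a group in one call. The version below also covers two cases the method above does not handle, namely a regex that does not match at all and an optional group that did not take part in the match. The name capture_ranges is made up for this example.

```ruby
# Hypothetical helper (the name is not from the answers above).
# Returns [start, inclusive_end] for each capture group, nil for a group
# that did not participate, and [] when the regex does not match at all.
def capture_ranges(string, regex)
  m = regex.match(string)
  return [] if m.nil?

  (1...m.size).map do |i|
    start, stop = m.offset(i)            # stop is exclusive, like MatchData#end
    start.nil? ? nil : [start, stop - 1]
  end
end

capture_ranges("foo bar baz", /.*(foo).*(bar).*/)  # => [[0, 2], [4, 6]]
capture_ranges("foo bar baz", /(foo)(qux)?/)       # => [[0, 2], nil]
capture_ranges("foo bar baz", /(nope)/)            # => []
```

Returning nil for a skipped group keeps the positions in the result aligned with the group numbers.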

m = "foo bar baz".match(/.*(foo).*(bar).*/)
[1, 2].map{|i| [m.begin(i), m.end(i) - 1]}
# => [[0, 2], [4, 6]]

Related

Why does an element of an array assigned to a variable change when the variable is changed?

I have a two-dimensional array
test_array = [[4,3],[4,5],[6,7]]
I want to assign the last element of it to a variable
test = test_array.last #=> [6, 7]
And I want to change it
test[0] += 1
test #=> [7, 7]
Why has the last element of the array changed as well?
test_array #=> [[4, 3], [4, 5], [7, 7]]
How can I avoid this?
The variable test holds a reference to the array test_array.last. If you modify the value of test, the value of test_array.last is modified as well.
Your multi-dimensional array (test_array) has a reference to the last element, so any changes you make to test will be seen in the test_array. If you don't desire this behavior, duplicate the last element before modifying it:
test_array = [[4,3],[4,5],[6,7]]
test = test_array.last.dup
# => [6, 7]
test[0] += 1
test
# => [7, 7]
test_array
# => [[4, 3], [4, 5], [6, 7]]
Here's how you can check to see what's happening; it's a technique I use periodically when I'm helping debug someone else's code and need to 'splain what is happening:
foo = [[1,2]]
bar = foo.first
bar contains a pointer AKA reference to the sub-array in foo. That means that the array at foo.first and bar are pointing to the same variable space in memory:
foo.first.object_id # => 70357558266700
bar.object_id # => 70357558266700
Because they're the same variable space, changing the array that bar points to, or changing the one that foo.first points to, will change the other one.
That can be useful if you understand what's going on, because if you have a big array and want to temporarily point to a deeply-nested element in it, you can assign a variable to point to it rather than use a long array accessor.
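As a small sketch of the same idea, equal? tests object identity directly, so it can stand in for comparing object_id values by hand. The variable names copy, same_object and is_copy are only for illustration.

```ruby
foo  = [[1, 2]]
bar  = foo.first       # another name for the same inner Array
copy = foo.first.dup   # a new Array with the same elements (shallow copy)

same_object = bar.equal?(foo.first)    # true: identical object
is_copy     = copy.equal?(foo.first)   # false: equal contents, different object

bar  << 3    # also visible through foo
copy << 99   # foo is unaffected

foo   # => [[1, 2, 3]]
copy  # => [1, 2, 99]
```

Note that dup copies only the outer array; any arrays nested inside the copy would still be shared with the original.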
Because that is the very array that you changed. There is no way to avoid that. You cannot change an array without changing it.

Differences between these 2 Ruby enumerators: [1,2,3].map vs. [1,2,3].group_by

In Ruby, is there a functional difference between these two Enumerators?
irb> enum_map = [1,2,3].map
=> #<Enumerator: [1, 2, 3]:map> # ends with "map>"
irb> enum_group_by = [1,2,3].group_by
=> #<Enumerator: [1, 2, 3]:group_by> # ends with "group_by>"
irb> enum_map.methods == enum_group_by.methods
=> true # they have the same methods
What can #<Enumerator: [1, 2, 3]:map> do that #<Enumerator: [1, 2, 3]:group_by> can't do, and vice versa?
Thanks!
From the documentation of group_by:
Groups the collection by result of the block. Returns a hash where the
keys are the evaluated result from the block and the values are arrays
of elements in the collection that correspond to the key.
If no block is given an enumerator is returned.
(1..6).group_by { |i| i%3 } #=> {0=>[3, 6], 1=>[1, 4], 2=>[2, 5]}
From the documentation of map:
Returns a new array with the results of running block once for every
element in enum.
If no block is given, an enumerator is returned instead.
(1..4).map { |i| i*i } #=> [1, 4, 9, 16]
(1..4).collect { "cat" } #=> ["cat", "cat", "cat", "cat"]
As you can see, each does something different, which serves a different purpose. Concluding that two APIs are the same because they expose the same interface seems to miss the entire purpose of Object Oriented Programming - different services are supposed to expose the same interface to enable polymorphism.
There's a difference in what they do, but fundamentally they are both of the same class: Enumerator.
When they're used the values emitted by the enumerator will be different, yet the interface to them is identical.
Two objects of the same class generally have the same methods. It is possible to augment an instance with additional methods, but this is not normally done.
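To make the difference concrete, here is a minimal sketch: both objects are Enumerators and accept the same calls, but each one remembers the method it was created from, so the same call produces a different kind of result. The variable names are illustrative only.

```ruby
enum_map      = [1, 2, 3].map
enum_group_by = [1, 2, 3].group_by

same_class = enum_map.class == enum_group_by.class   # both are Enumerator

# The same call on each enumerator is routed to the method it was created from.
mapped  = enum_map.with_index      { |x, i| x * i }      # => [0, 2, 6]
grouped = enum_group_by.with_index { |_x, i| i.even? }   # => {true=>[1, 3], false=>[2]}
```

So map's enumerator still builds an Array from the block results, while group_by's enumerator still builds a Hash keyed by them.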

Is there a way to split an array of objects in Rails by two different delimiters?

I would like to do something like this:
@residenciais, @comerciais = TipoImovel.all.split { |t| t.residencial? }
The problem is that @comerciais is always empty because it never returns the object, since the condition is false.
Is there a better way of doing this?
You're looking for the standard method Enumerable#partition, rather than the Rails split add-on.
@residenciais, @comerciais = TipoImovel.all.partition { |t| t.residencial? }
Which can also be written like this, since the condition is a single method call:
@residenciais, @comerciais = TipoImovel.all.partition(&:residencial?)
Some more explanation:
The Rails Array#split method is used to separate an array into ordered groups delimited by elements which return true for a given block. It's a generalization of the standard String method. For example:
[1,2,3,4,5,6].split(&:odd?) #=> [[], [2], [4], [6]]
Any odd number is a delimiter, so it returns the portions of the array between the odd numbers, in order.
Whereas this is closer to what you're doing:
odds, evens = [1,2,3,4,5,6].partition(&:odd?) #=> [[1, 3, 5], [2, 4, 6]]
If the partition condition is not simply Boolean, or if you want to key off the values regardless, then you can use Enumerable#group_by, which returns a Hash of Arrays instead of a pair:
[1,2,3,4,5,6].group_by(&:odd?) #=> {true=>[1, 3, 5], false=>[2, 4, 6]}
You can use group_by:
@residenciais, @comerciais = TipoImovel.all.group_by { |t| t.residencial }.values
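One caveat when choosing between the two, shown here as a plain-Ruby sketch: partition always returns the matching group first, while the order of group_by(...).values depends on which key is seen first in the data. The Struct below is only a stand-in for the Rails model, and the sample names are invented.

```ruby
# Stand-in for the Rails model: a plain Struct with a boolean attribute.
tipo = Struct.new(:nome, :residencial) do
  def residencial?
    residencial
  end
end

tipos = [tipo.new("Loja", false), tipo.new("Casa", true), tipo.new("Sala", false)]

# partition always returns [matching, non_matching], whatever the input order.
residenciais, comerciais = tipos.partition(&:residencial?)
residenciais.map(&:nome)  # => ["Casa"]
comerciais.map(&:nome)    # => ["Loja", "Sala"]

# group_by keys appear in first-seen order, so .values starts with the
# false group here; looking the keys up explicitly avoids relying on order.
grouped = tipos.group_by(&:residencial?)
grouped.keys                          # => [false, true]
grouped.fetch(true, []).map(&:nome)   # => ["Casa"]
grouped.fetch(false, []).map(&:nome)  # => ["Loja", "Sala"]
```

Using fetch with a default also covers the case where one of the two groups is empty and its key is missing from the hash.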

ruby: how to convert hash into array

I have a hash that contains numbers as such:
{0=>0.07394653730860076, 1=>0.0739598476853163, 2=>0.07398647083461522}
It needs to be converted into an array like:
[[0, 0.07394653730860076], [1, 0.0739598476853163], [2, 0.07398647083461522]]
I tried hash.values, which gets me:
[0.07398921877505593, 0.07400253683443543, 0.07402917535044515]
I have tried multiple ways, but I just started learning Ruby.
Try this:
{0=>0.07394653730860076, 1=>0.0739598476853163, 2=>0.07398647083461522}.to_a
#=> [[0, 0.07394653730860076], [1, 0.0739598476853163], [2, 0.07398647083461522]]
Definitely use the Hash#to_a method, which will produce exactly what you are looking for.
{0=>0.07394653730860076, 1=>0.0739598476853163, 2=>0.07398647083461522}.to_a
=> [[0, 0.07394653730860076], [1, 0.0739598476853163], [2, 0.07398647083461522]]
Hash#values will give you only the values of each element in the hash, while Hash#keys will give you just the keys. Fortunately, the default behavior of to_a is what you are looking for.
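For comparison, a short sketch of the three calls side by side, using made-up values rather than the ones from the question. The last line assumes Ruby 2.1 or later, where Array#to_h is available.

```ruby
h = { 0 => 0.5, 1 => 1.5, 2 => 2.5 }   # made-up values for illustration

pairs = h.to_a    # => [[0, 0.5], [1, 1.5], [2, 2.5]]
h.keys            # => [0, 1, 2]
h.values          # => [0.5, 1.5, 2.5]
pairs.to_h == h   # => true (Array#to_h reverses the conversion)
```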

Elegantly implementing 'map (+1) list' in ruby

The short code in the title is Haskell; it does the same thing as
list.map {|x| x + 1}
in Ruby.
I know that form, but what I want to know is whether there is a more elegant way to write the same thing in Ruby, as there is in Haskell.
I really love the to_proc shortcut in ruby, like this form:
[1,2,3,4].map(&:to_s)
[1,2,3,4].inject(&:+)
But this only works when the number of arguments the Proc receives exactly matches what the method expects.
I'm looking for a way to pass one or more extra arguments into the Proc without a useless temporary block or variable like the one in the first example.
I want to do something like this:
[1,2,3,4].map(&:+(1))
Does Ruby have a similar way to do this?
If you just want to add one then you can use the succ method:
>> [1,2,3,4].map(&:succ)
=> [2, 3, 4, 5]
If you wanted to add two, you could use a lambda:
>> add_2 = ->(i) { i + 2 }
>> [1,2,3,4].map(&add_2)
=> [3, 4, 5, 6]
For arbitrary values, you could use a lambda that builds lambdas:
>> add_n = ->(n) { ->(i) { i + n } }
>> [1,2,3,4].map(&add_n[3])
=> [4, 5, 6, 7]
You could also use a lambda generating method:
>> def add_n(n) ->(i) { i + n } end
>> [1,2,3,4].map(&add_n(3))
=> [4, 5, 6, 7]
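A related option, sketched here as an alternative to nesting lambdas by hand, is Proc#curry from core Ruby: a two-argument lambda can be partially applied with its first argument and the result passed to map. The names add and sub are illustrative only.

```ruby
# add and sub are illustrative names; Proc#curry is core Ruby.
add = ->(n, i) { i + n }.curry
sub = ->(n, i) { i - n }.curry

plus_three = [1, 2, 3, 4].map(&add[3])  # => [4, 5, 6, 7]
minus_one  = [1, 2, 3, 4].map(&sub[1])  # => [0, 1, 2, 3]
```

Because the element is the second parameter, the operand order stays explicit, which matters for operators like -.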
Use the ampex gem, which lets you use methods of X to build up any proc of one variable. Here’s an example from its spec:
["a", "b", "c"].map(&X * 2).should == ["aa", "bb", "cc"]
You can't do it directly with the default map. However it's quite easy to implement a version that supports this type of functionality. As an example Ruby Facets includes just such a method:
require 'facets/enumerable'
[1, 2, 3, 4].map_send(:+, 10)
=> [11, 12, 13, 14]
The implementation looks like this:
def map_send(meth, *args, &block)
  map { |e| e.send(meth, *args, &block) }
end
In this particular case, you can use the following:
[1, 2, 3, 4].map(&1.method(:+))
However, this only works because + is commutative: 1.method(:+) computes 1 + x, which equals x + 1. It wouldn't work for -, for example.
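The operand order is easy to see in a quick sketch: the bound method object has 1 as its receiver, so each element becomes the right-hand operand. That gives the intended result only when the order of the operands does not matter. The variable names are illustrative.

```ruby
via_plus  = [1, 2, 3, 4].map(&1.method(:+))   # computes 1 + x => [2, 3, 4, 5]
via_minus = [1, 2, 3, 4].map(&1.method(:-))   # computes 1 - x => [0, -1, -2, -3]
wanted    = [1, 2, 3, 4].map { |x| x - 1 }    # computes x - 1 => [0, 1, 2, 3]
```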
Ruby has no built-in support for this feature, but you can create your own extension or use the small gem 'ampex'. It defines a global variable X with extended 'to_proc' functionality.
It lets you do this:
[1,2,3].map(&X.+(1))
Or even that:
"alpha\nbeta\ngamma\n".lines.map(&X.strip.upcase)
If you just want to add 1, you can use next or succ:
[1,2,3,4].map(&:next)
[1,2,3,4].map(&:succ)
