Ruby block taking array or multiple parameters - ruby

Today I was surprised to find that Ruby automatically unpacks the values of an array given as a block parameter.
For example:
foo = "foo"
bar = "bar"
p foo.chars.zip(bar.chars).map { |pair| pair }.first #=> ["f", "b"]
p foo.chars.zip(bar.chars).map { |a, b| "#{a},#{b}" }.first #=> "f,b"
p foo.chars.zip(bar.chars).map { |a, b,c| "#{a},#{b},#{c}" }.first #=> "f,b,"
I would have expected the last two examples to give some sort of error.
Is this an example of a more general concept in Ruby?
I don't think the wording at the start of my question is correct. What do I call what is happening here?

Ruby blocks are quirky like that.
The rule is this: if a block takes more than one argument and it is yielded a single object that responds to to_ary, then that object is expanded. This makes yielding an array and yielding a tuple seem to behave the same way for blocks that take two or more arguments.
yield [a,b] and yield a,b do differ, though, when the block takes only one argument or a variable number of arguments.
Let me demonstrate both cases:
def yield_tuple
  yield 1, 2, 3
end
yield_tuple { |*a| p a }
yield_tuple { |a| p [a] }
yield_tuple { |a, b| p [a, b] }
yield_tuple { |a, b, c| p [a, b, c] }
yield_tuple { |a, b, c, d| p [a, b, c, d] }
prints
[1, 2, 3]
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, nil]
Whereas
def yield_array
  yield [1,2,3]
end
yield_array { |*a| p a }
yield_array { |a| p [a] }
yield_array { |a, b| p [a, b] }
yield_array { |a, b, c| p [a, b, c] }
yield_array { |a, b, c, d| p [a, b, c, d] }
prints
[[1, 2, 3]]
[[1, 2, 3]]
[1, 2] # array expansion makes it look like a tuple
[1, 2, 3] # array expansion makes it look like a tuple
[1, 2, 3, nil] # array expansion makes it look like a tuple
And finally, to show that this expansion is duck-typed on to_ary rather than tied to Array:
class A
  def to_ary
    [1,2,3]
  end
end
def yield_arrayish
  yield A.new
end
yield_arrayish { |*a| p a }
yield_arrayish { |a| p [a] }
yield_arrayish { |a, b| p [a, b] }
yield_arrayish { |a, b, c| p [a, b, c] }
yield_arrayish { |a, b, c, d| p [a, b, c, d] }
prints
[#<A:0x007fc3c2969190>]
[#<A:0x007fc3c2969050>]
[1, 2] # array expansion makes it look like a tuple
[1, 2, 3] # array expansion makes it look like a tuple
[1, 2, 3, nil] # array expansion makes it look like a tuple
PS: the same array expansion behavior applies to procs, which behave like blocks, whereas lambdas behave like methods.
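A minimal sketch of that last point, assuming a proc and a lambda with the same two parameters (the variable names are only illustrative). Calling each with a single array shows the proc unpacking it and the lambda rejecting it:

```ruby
# A proc unpacks a single array argument across its parameters, like a block.
pr = proc { |a, b| [a, b] }
proc_result = pr.call([1, 2])   # a = 1, b = 2

# A lambda checks its arity strictly, like a method, so one array
# passed where two arguments are expected raises ArgumentError.
la = lambda { |a, b| [a, b] }
lambda_error = begin
  la.call([1, 2])
  nil
rescue ArgumentError => e
  e.class
end

p proc_result    # the unpacked pair
p lambda_error   # the error class that was rescued
```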

Ruby's block mechanics have a quirk: if you're iterating over something that contains arrays, you can expand them out into different variables:
[ %w[ a b ], %w[ c d ] ].each do |a, b|
  puts 'a=%s b=%s' % [ a, b ]
end
This pattern is very useful when using Hash#each and you want to break out the key and value parts of the pair: each { |k,v| ... } is very common in Ruby code.
If your block takes more than one argument and the element being iterated is an array then it switches how the arguments are interpreted. You can always force-expand:
[ %w[ a b ], %w[ c d ] ].each do |(a, b)|
  puts 'a=%s b=%s' % [ a, b ]
end
That's useful for cases where things are more complex:
[ %w[ a b ], %w[ c d ] ].each_with_index do |(a, b), i|
  puts 'a=%s b=%s # %d' % [ a, b, i ]
end
In this case it's iterating over an array with another element tacked on, so each item is internally a tuple of the form %w[ a b ], 0, which will be converted to an array if your block accepts only one argument.
This is much the same principle you can use when defining variables:
a, b = %w[ a b ]
a
# => 'a'
b
# => 'b'
That actually assigns independent values to a and b. Contrast with:
a, b = [ %w[ a b ] ]
a
# => [ 'a', 'b' ]
b
# => nil
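The parenthesized form from the each_with_index example also works in plain assignment. A small sketch, with variable names chosen only for illustration:

```ruby
# Parentheses on the left-hand side unpack a nested array,
# mirroring the |(a, b), i| block parameter list.
(left, right), index = %w[a b], 0

# A splat collects whatever is left over.
head, *tail = [1, 2, 3]

p [left, right, index]
p [head, tail]
```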

I would have expected the last two examples to give some sort of error.
It does in fact work that way if you pass a proc created from a method. Yielding to such a proc is much stricter: it checks its arity and doesn't attempt to convert an array argument to an argument list:
def m(a, b)
  "#{a}-#{b}"
end
['a', 'b', 'c'].zip([0, 1, 2]).map(&method(:m))
#=> wrong number of arguments (given 1, expected 2) (ArgumentError)
This is because zip creates an array (of arrays) and map just yields each element, i.e.
yield ['a', 0]
yield ['b', 1]
yield ['c', 2]
each_with_index on the other hand works:
['a', 'b', 'c'].each_with_index.map(&method(:m))
#=> ["a-0", "b-1", "c-2"]
because it yields two separate values, the element and its index, i.e.
yield 'a', 0
yield 'b', 1
yield 'c', 2
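If you do want to feed the zip pairs to such a method, one option is to unpack each pair yourself. A sketch reusing the m method from above:

```ruby
def m(a, b)
  "#{a}-#{b}"
end

pairs = ['a', 'b', 'c'].zip([0, 1, 2])

# Splat each yielded pair explicitly into the method's two parameters.
joined_splat = pairs.map { |pair| m(*pair) }

# Or let an ordinary block do the unpacking before calling the method.
joined_block = pairs.map { |a, b| m(a, b) }

p joined_splat
p joined_block
```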

Related

Do block-local variables exist just to enhance readability?

Block-local variables are to prevent a block from tampering with variables outside of its scope.
Using a block-local variable
x = 10
3.times do |y; x|
  x = y
end
x # => 10
But this is easily done by declaring a regular block parameter. A new local scope is created for that parameter, which takes precedence over previous variables/scopes.
Without using a block-local variable
x = 10
3.times do |y, x|
  x = y
end
x # => 10
The variable x outside the block doesn't get changed in either case. Is there any need for block-local variables other than for enhancing readability?
The block parameter is a real parameter, while a block local variable is not.
If you give yield two parameters like this:
def foo
  yield("hello", "world")
end
Calling
x = 10
foo do |y; x|
  puts x
end
x is nil inside the block because only the first argument is assigned to y; the second argument is discarded.
Calling
x = 10
foo do |y, x|
  puts x
end
#=>world
x gets the parameter correctly as "world".
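For contrast, here is a sketch of what the block-local declaration protects against: when the block neither declares x as a parameter nor as a block-local variable, assigning to it changes the outer x. The result variable names are only illustrative.

```ruby
# No declaration: the block closes over the outer x and overwrites it.
x = 10
3.times { |y| x = y }
without_block_local = x   # last value of y

# Block-local declaration: the block gets its own x, the outer one is untouched.
x = 10
3.times { |y; x| x = y }
with_block_local = x

p without_block_local
p with_block_local
```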
To expand on Yu Hao's answer, the difference between block parameters and block-local variables is not obvious when calling a method that yields only one value, but consider a method that yields multiple values:
def frob
  yield 1, 2, 3
end
If you pass this a block with a single argument, you get the first value:
frob { |a| a.inspect }
# => "1"
But if you pass a block with multiple arguments, you get multiple values, even if the block declares too few arguments, or too many:
frob { |a, b, c| [a, b, c].inspect }
# => "[1, 2, 3]"
frob { |a, b| [a, b].inspect }
# => "[1, 2]"
frob { |a, b, c, d| [a, b, c, d].inspect }
# => "[1, 2, 3, nil]"
If you pass block-scoped variables, however, those are independent of the yielded value(s):
frob { |a; b, c| [a, b, c].inspect }
# => "[1, nil, nil]"
Something similar happens with methods that yield an array, except that when you pass a block with a single argument, it gets the whole array:
def frobble
  yield [1, 2, 3]
end
frobble {|a| a.inspect }
# => "[1, 2, 3]"
Multiple arguments, however, destructure the array --
frobble {|a, b| [a, b].inspect }
# => "[1, 2]"
-- while a block-scoped variable doesn't:
frobble {|a; b| [a, b].inspect }
# => "[[1, 2, 3], nil]"
(Even with a block-scoped variable present, though, multiple arguments will still destructure the array: frobble {|a, b; c| [a, b, c].inspect } will get you "[1, 2, nil]".)
For more discussion and examples, see also this answer.

Ruby: Collect index from Array/String Matchdata

I'm new to Ruby, and here's my problem: I would like to iterate through either an Array or a String to obtain the indices of the characters that match a Regex.
Sample Array/String
a = %q(A B A A C C B D A D)
b = %w(A B A A C C B D A D)
What I need, for either variable a or b, is something like:
#index of A returns
[0, 2, 3, 8]
#index of B returns
[1, 6]
#index of C returns
[4, 5]
#etc
I've tried to be a little sly with
z = %w()
a =~ /\w/.each_with_index do |x, y|
puts z < y
end
but that didn't work out so well.
Any solutions?
For an array, you could use
b.each_index.select { |i| b[i] == 'A' }
For a string, you could split it into an array first (a.split(/\s/)).
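Putting the two steps together for the string case might look like the sketch below. It assumes the characters are whitespace-separated as in the sample, and it matches with a regex since that is what the question asks for; tokens and indices_of_a are illustrative names.

```ruby
a = %q(A B A A C C B D A D)

# Split the string into tokens, then keep the indices whose token matches.
tokens = a.split
indices_of_a = tokens.each_index.select { |i| tokens[i] =~ /A/ }

p indices_of_a
```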
If you want to get each character's index as a hash, this would work:
b = %w(A B A A C C B D A D)
h = {}
b.each_with_index { |e, i|
  h[e] ||= []
  h[e] << i
}
h
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
Or as a "one-liner":
b.each_with_object({}).with_index { |(e, h), i| (h[e] ||= []) << i }
#=> {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
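Another way to build the same hash, offered as an alternative sketch rather than as part of the answer above, is to group the indices by the element found at each index:

```ruby
b = %w(A B A A C C B D A D)

# each_index without a block returns an enumerator over 0...b.size;
# group_by buckets those indices under the element stored at each one.
positions = b.each_index.group_by { |i| b[i] }

p positions
```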
If you want to collect the indices at which each letter occurs, you can define a helper method:
def occurrences(collection)
  collection = collection.split(/\s/) if collection.is_a? String
  collection.uniq.inject({}) do |result, letter|
    result[letter] = collection.each_index.select { |index| collection[index] == letter }
    result
  end
end
# And use it like this. This will return you a hash something like this:
# {"A"=>[0, 2, 3, 8], "B"=>[1, 6], "C"=>[4, 5], "D"=>[7, 9]}
occurrences(a)
occurrences(b)
This should work for either a String or an Array.

Complex subsorts on an array of arrays

I wrote a quick method that confirms that data coming from a webpage is sorted correctly:
def subsort_columns(*columns)
  columns.transpose.sort
end
Which worked for basic tests. Now, complex subsorts have been introduced, and I'm pretty certain I'll need to still use an array, since hashes can't be guaranteed to return in a specific order. The order of the input in this case represents subsort priority.
# `columns_sort_preferences` is an Array in the form of:
# [[sort_ascending_bool, column_data]]
# i.e.
# subsort_columns([true, column_name], [false, column_urgency], [true, column_date])
# Will sort on name ascending, then urgency descending, and finally date ascending.
def subsort_columns(*columns_sort_preferences)
end
This is where I'm stuck. I want to do this cleanly, but can't come up with anything but rolling out a loop for each subsort that occurs on any parent sort...but it sounds wrong.
Feel free to offer better suggestions, as I'm not tied to this implementation.
Here's some test data:
a = [1,1,1,2,2,3,3,3,3]
b = %w(a b c c b b a b c)
c = %w(x z z y x z z y z)
subsort_columns([true, a], [false, b], [false, c])
=> [[1, 'c', 'z'],
[1, 'b', 'z'],
[1, 'a', 'x'],
[2, 'c', 'y'],
[2, 'b', 'x'],
[3, 'c', 'z'],
[3, 'b', 'z'],
[3, 'b', 'y'],
[3, 'a', 'z']]
Use sort {|a, b| block} → new_ary:
a = [1,1,1,2,2,3,3,3,3]
b = %w(a b c c b b a b c)
c = %w(x z z y x z z y z)
sorted = [a, b, c].transpose.sort do |el1, el2|
  [el1[0], el2[1], el2[2]] <=> [el2[0], el1[1], el1[2]]
end
Result:
[[1, "c", "z"],
[1, "b", "z"],
[1, "a", "x"],
[2, "c", "y"],
[2, "b", "x"],
[3, "c", "z"],
[3, "b", "z"],
[3, "b", "y"],
[3, "a", "z"]]
For a descending column, swap that column's left and right elements across the spaceship operator, as done for columns 1 and 2 above.
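The swap can be driven by a list of direction flags instead of being written out by hand. A sketch under that assumption; sort_rows and its parameters are hypothetical names, not part of the original answer:

```ruby
# directions[i] is true for ascending, false for descending on column i.
def sort_rows(rows, directions)
  rows.sort do |x, y|
    # For a descending column, take the value from the opposite row,
    # which reverses that column's contribution to the comparison.
    left  = directions.each_with_index.map { |asc, i| asc ? x[i] : y[i] }
    right = directions.each_with_index.map { |asc, i| asc ? y[i] : x[i] }
    left <=> right
  end
end

a = [1, 1, 1, 2, 2, 3, 3, 3, 3]
b = %w(a b c c b b a b c)
c = %w(x z z y x z z y z)

sorted_rows = sort_rows([a, b, c].transpose, [true, false, false])
sorted_rows.each { |row| p row }
```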
One way to do this is to do a series of 'stable sorts' in reverse order. Start with the inner sort and work out to the outer. The stability property means that the inner sort order remains intact.
Unfortunately, Ruby's sort is not stable. But see this question for a workaround.
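One common way to get a stable sort is to add each row's original index to the sort key as a tie-breaker. The sketch below applies that idea to the series-of-sorts approach; stable_sort_by_column and multi_sort are hypothetical helper names, and this is not necessarily the workaround the linked question describes.

```ruby
# Stable sort of rows on one column. Ties keep their original relative order.
def stable_sort_by_column(rows, col, ascending)
  indexed = rows.each_with_index
  if ascending
    indexed.sort_by { |row, i| [row[col], i] }.map(&:first)
  else
    # Sort ascending with the index negated, then reverse: keys end up
    # descending while tied rows return to their original order.
    indexed.sort_by { |row, i| [row[col], -i] }.reverse.map(&:first)
  end
end

# prefs is a list of [column, ascending] pairs in priority order.
# Sorting on the lowest-priority column first and the highest last
# leaves the rows ordered by the full set of preferences.
def multi_sort(rows, prefs)
  prefs.reverse.reduce(rows) do |acc, (col, asc)|
    stable_sort_by_column(acc, col, asc)
  end
end

a = [1, 1, 1, 2, 2, 3, 3, 3, 3]
b = %w(a b c c b b a b c)
c = %w(x z z y x z z y z)

stable_sorted = multi_sort([a, b, c].transpose, [[0, true], [1, false], [2, false]])
stable_sorted.each { |row| p row }
```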
# Sort on each entry in `ticket_columns`, starting with the first column, then second, etc.
# Complex sorts are supported. If the first element in each `ticket_columns` is a true/false
# boolean (specifying if an ascending sort should be used), then it is sorted that way.
# If omitted, it will sort all ascending.
def _subsort_columns(*ticket_columns)
  # Is the first element of every `ticket_column` a boolean?
  complex_sort = ticket_columns.all? { |e| [TrueClass, FalseClass].include? e[0].class }
  if complex_sort
    data = ticket_columns.transpose
    sort_directions = data.first
    column_data = data[1..-1].flatten 1
    sorted = column_data.transpose.sort do |cmp_first, cmp_last|
      cmp_which = sort_directions.map { |b| b ? cmp_first : cmp_last }
      cmp_these = sort_directions.map { |b| b ? cmp_last : cmp_first }
      cmp_left, cmp_right = [], []
      cmp_which.each_with_index { |e, i| cmp_left << e[i] }
      cmp_these.each_with_index { |e, i| cmp_right << e[i] }
      cmp_left <=> cmp_right
    end
    sorted
  else
    ticket_columns.transpose.sort
  end
end

Using uniq! on collection with two parameters

At this point I am using uniq! to get the unique elements in a collection. Is it possible to get the unique elements based on two parameters? In other words, I would like to use uniq! to get "unique" elements based on both t.info and t.name.
collection.uniq! {|t| t.info }
Compare an array of those parameters:
T = Struct.new :info, :name
collection = [
  T.new('a', 'b'),
  T.new('a', 'b'),
  T.new('a', 'a'),
]
collection.uniq! { |t| [t.info, t.name] }
#=> [#<struct T info="a", name="b">, #<struct T info="a", name="a">]
require 'pp'
require 'ostruct'
a = OpenStruct.new(a: 1, b: 2, c: 3)
b = OpenStruct.new(a: 2, b: 2, c: 3)
c = OpenStruct.new(a: 1, b: 2, c: 4)
pp [a, b, c].uniq # all different
pp [a, b, c].uniq { |t| [t.a, t.b] } # a and c are same
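One thing to watch for with the bang version: uniq! returns nil when it removes nothing, so chaining on its return value can fail. A small sketch, using a hypothetical Item struct in place of the real collection elements:

```ruby
# Item is a stand-in for whatever the collection actually holds.
Item = Struct.new(:info, :name)

items = [Item.new('a', 'b'), Item.new('a', 'a')]

# No two items share both info and name, so nothing is removed
# and uniq! returns nil instead of the array.
bang_result = items.uniq! { |t| [t.info, t.name] }

# The non-bang form always returns an array.
safe_result = items.uniq { |t| [t.info, t.name] }

p bang_result
p safe_result.size
```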

Ruby method similar to Haskell's cycle

Is there a Ruby method similar to Haskell's cycle? Haskell's cycle takes a list and returns that list infinitely appended to itself. It's commonly used with take which grabs a certain number of elements off the top of an array. Is there a Ruby method that takes an array and returns the array appended to itself some n number of times?
Yes, it's called cycle. From the documentation:
Array#cycle
(from ruby core)
------------------------------------------------------------------------------
ary.cycle(n=nil) {|obj| block } -> nil
ary.cycle(n=nil) -> an_enumerator
------------------------------------------------------------------------------
Calls block for each element repeatedly n times or forever if none
or nil is given. If a non-positive number is given or the array is empty, does
nothing. Returns nil if the loop has finished without getting interrupted.
If no block is given, an enumerator is returned instead.
a = ["a", "b", "c"]
a.cycle {|x| puts x } # print, a, b, c, a, b, c,.. forever.
a.cycle(2) {|x| puts x } # print, a, b, c, a, b, c.
Edit:
It seems like what's inside the block is basically a "lambda", and as far as I know, I can't make a lambda concat each element onto an existing array.
b = [1, 2, 3]
z = []
b.cycle(2) { |i| z << i }
z # => [1, 2, 3, 1, 2, 3]
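Without a block, cycle returns an enumerator, which gets closer to the Haskell take/cycle idiom from the question. A brief sketch; taken and repeated are illustrative names:

```ruby
b = [1, 2, 3]

# The enumerator is infinite, but take stops after the requested count.
taken = b.cycle.take(7)

# With a count, the enumerator is finite and can be turned into an array.
repeated = b.cycle(2).to_a

p taken
p repeated
```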
You can multiply an array by an integer using Array#*:
ary * int → new_ary
[...] Otherwise, returns a new array built by concatenating the int copies of self.
So you can do things like this:
>> [1, 2] * 3
=> [1, 2, 1, 2, 1, 2]

Resources