Drop value in parallel assignment - ruby

Since parallel assignment like this can be done,
x, y, z = [1, 2, 3]
I wanted to know if one could drop for example the y value. Is there a way of doing parallel assignment without polluting the namespace of the current scope?
I tried assigning to nil:
x, nil, z = [1, 2, 3]
but that does not work.

The idiomatic way to do that is to assign it to a variable named underscore:
x, _, z = [1, 2, 3]
If there are multiple values that you want to drop, you can use splat:
x, *_, z = [1, 2, 3, 4, 5]
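A minimal sketch of the splat form, wrapped in a helper so it is easy to check. The method name first_and_last is hypothetical and not part of the answer above.

```ruby
# Hypothetical helper: keep only the first and last elements of a list.
def first_and_last(list)
  first, *_, last = list   # *_ swallows any number of middle elements
  [first, last]
end

p first_and_last([1, 2, 3, 4, 5])  # prints [1, 5]
p first_and_last([1, 2])           # prints [1, 2]
```

With a two-element array the splat simply collects nothing, so the same line works for any length of two or more.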

As mentioned by @ndn, _ is often used.
It's okay to ignore it and to reassign over it, but _ is still a variable in the namespace of the current scope:
l = [1, 2, 3]
x, _, z = l
puts _
#=> 2
If it bothers you, you could use:
x, z = l.values_at(0,-1)
It might be less readable though. _ is also the proposed syntax if you launch rubocop on the script.
Talking about readability, if you want to explain what the variable is, but still want to show that it won't be used afterwards, you could use:
x, _y, z = [1, 2, 3]
# Do something with x and z. Ignore _y
It's also proposed by rubocop:
test.rb:2:4: W: Useless assignment to variable - y. Use _ or _y as a variable name to indicate that it won't be used.
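To make the namespace point concrete, here is a small sketch that compares the two styles using Kernel#local_variables. Both method names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers reporting which locals each style leaves behind.
def locals_with_underscore
  x, _, z = [1, 2, 3]
  local_variables                     # _ is an ordinary local variable
end

def locals_with_values_at
  x, z = [1, 2, 3].values_at(0, -1)   # pick positions, no placeholder needed
  local_variables
end

p locals_with_underscore.include?(:_)  # prints true
p locals_with_values_at.include?(:_)   # prints false
```

values_at also accepts any number of indices, so it can pick non-adjacent positions such as values_at(0, 2, 4) without a placeholder per skipped element.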

meschi has asked whether it is possible, using parallel assignment, to assign some, but not all, elements of an array to variables.
Firstly, it doesn't help to write
a, _, c = [1, true, 2]
since _ is a variable:
_ #=> true
(You won't get this result using IRB as IRB uses _ for its own purposes.)
If, in the above array, you only want to assign 1 and true to variables you can write
a, b, * = [1, true, 2]
a #=> 1
b #=> true
If you only want to assign 1 and 2 to variables, you can write
a, *, b = [1, true, 2]
a #=> 1
b #=> 2
If, for the array [1, true, 2, 3], you wish to assign only 1 and 2 to variables, well, that's a problem. Evidently, as @Eric pointed out in a comment, you'd need to write
a, b, b, * = [1, true, 2, 3]
a #=> 1
b #=> 2
This idea can of course be generalized:
a, b, b, b, c, c = [1, true, 2, 3, 4, 5]
a #=> 1
b #=> 3
c #=> 5
Is this a good idea? That was not the question.
The main concern may be creating a not-to-be-used variable and then inadvertently using it, as here:
a, _, b = [1, true, 2]
...
launch_missiles_now?(_)
If so, there are a few options. One is to simply wrap the expression in a method, thereby keeping the superfluous variables tightly confined:
def get_a_and_b(arr)
a, _, b, _ = arr
[a, b]
end
a, b = get_a_and_b [1, true, 2, 3]
#=> [1, 2]
Another is to name unwanted variables to make it unlikely that they will be referenced accidentally, for example:
a, _reference_this_variable_and_you_will_die, b = [1, true, 2]

I would prefer to improve on @ndn's answer and use only the * symbol to drop values.
_ is usually used to hold the last evaluated value (in IRB, for example), so it can be confusing to use it here.
So you can use
x, *, z = [1, 2, 3]
as well as
x, *, y, z = [1, 2, 3, 4]
and Ruby will still assign the proper values to all the variables.
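The two assignments above can be checked with a short sketch. The helper names outer_pair and first_and_last_two are hypothetical, chosen only for this example.

```ruby
# Hypothetical helpers showing what the anonymous splat leaves behind.
def outer_pair(list)
  x, *, z = list        # anonymous splat: middle elements are dropped
  [x, z]
end

def first_and_last_two(list)
  x, *, y, z = list     # y and z bind to the last two elements
  [x, y, z]
end

p outer_pair([1, 2, 3])               # prints [1, 3]
p first_and_last_two([1, 2, 3, 4])    # prints [1, 3, 4]
p first_and_last_two([1, 2, 3, 4, 5]) # prints [1, 4, 5]
```

Because the splat has no name, nothing is bound for the dropped elements.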

Related

How to return a splat from a method in Ruby

I wanted to create a method for arrays that returns a splat of the array. Is this possible to do in Ruby?
For example here's my current code:
Array.module_eval do
def to_args
return *self
end
end
I expect [1,2,3].to_args to return 1,2,3 but it ends up returning [1,2,3]
You cannot return a "splat" from Ruby. But you can return an array and then splat it yourself:
def args
[1, 2, 3]
end
x, y, z = args
# x == 1
# y == 2
# z == 3
x, *y = args
# x == 1
# y == [2, 3]
Of course, this works on any array, so really there is no need for monkey patching a to_args method into Array - it's all about how the calling concern is using the splat operator:
arr = [1, 2, 3]
x, y, z = arr
x, *y = arr
*x, y = arr
Same mechanism works with block arguments:
arr = [1, 2, 3]
arr.tap {|x, *y| y == [2, 3]}
Even more advanced usage:
arr = [1, [2, 3]]
x, (y, z) = arr
The concept that clarifies this for me is that although you can simulate the return of multiple values in Ruby, a method can really return only 1 object, so that simulation bundles up the multiple values in an Array.
The array is returned, and you can then deconstruct it, as you can any array.
def foo
[1, 2]
end
one, two = foo
Not exactly. What it looks like you're trying to do (the question doesn't give usage examples) is to force multiple return values. However, returning the splatted array self may do exactly what you need, as long as you're properly handling multiple return values on the calling side of the equation.
Consider these examples:
first, *rest = [1, 2, 3] # first = 1, rest = [2, 3]
*rest, last = [1, 2, 3] # rest = [1, 2], last = 3
first, *rest, last = [1, 2, 3] # first = 1, rest = [2], last = 3
Other than this, I can't actually see any way to capture or pass along multiple values like you're suggesting. I think the answer for your question, if I understand it correctly, is all in the caller's usage.
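As a sketch of "it is all in the caller's usage": the method returns a plain array, and the caller decides whether to destructure it or splat it into another call. The names coordinates and sum3 are hypothetical.

```ruby
# Hypothetical method that "returns multiple values" as one array.
def coordinates
  [1, 2, 3]
end

# Hypothetical method that expects three separate arguments.
def sum3(a, b, c)
  a + b + c
end

x, y, z = coordinates        # destructure at the call site
p [x, y, z]                  # prints [1, 2, 3]

p sum3(*coordinates)         # splat into an argument list, prints 6
```

No to_args method on Array is needed for either use, since the splat is applied by the caller.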

What does a comma followed by an equals sign mean in Ruby?

Just saw something like this in some Ruby code:
def getis;gets.split.map(&:to_i);end
k,=getis # What is this line doing?
di=Array::new(k){Array::new(k)}
It assigns the array's first element using Ruby's multiple assignment:
a, = [1, 2, 3]
a #=> 1
Or:
a, b = [1, 2, 3]
a #=> 1
b #=> 2
You can use * to fetch the remaining elements:
a, *b = [1, 2, 3]
a #=> 1
b #=> [2, 3]
Or:
*a, b = [1, 2, 3]
a #=> [1, 2]
b #=> 3
It works like this. If the lhs has a single element and the rhs has multiple values, the lhs is assigned an array of those values:
a = 1,2,3 #=> a = [1,2,3]
Whereas if the lhs has more elements than the rhs, the excess elements in the lhs are set to nil:
a,b,c = 1,2 #=> a = 1, b = 2, c = nil
Conversely, if the rhs has more values than the lhs, the excess values are discarded. Therefore
a, = 1,2,3 #=> a = 1. The rest, i.e. [2,3], are discarded
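The cases above, gathered into one runnable sketch. The method names are hypothetical and only exist so each case can be checked in isolation.

```ruby
# Hypothetical helpers, one per case of multiple assignment.
def trailing_comma_first(values)
  first, = values          # trailing comma: take the first element only
  first
end

def single_target
  a = 1, 2, 3              # one target, many values: a becomes an array
  a
end

def more_targets_than_values
  a, b, c = 1, 2           # c has no matching value and becomes nil
  [a, b, c]
end

p trailing_comma_first([1, 2, 3])  # prints 1
p single_target                    # prints [1, 2, 3]
p more_targets_than_values         # prints [1, 2, nil]
```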

Splat operator in Ruby (a quicksort example)

Hello I'm studying some Ruby code. Implement Quicksort in Ruby:
1 def qsort(lst)
2 return [] if lst.empty?
3 x, *xs = *lst
4 less, more = xs.partition{|y| y < x}
5 qsort(less) + [x] + qsort(more)
6 end
Given:
lst = [1, 2, 3, 4, 5]
x, *xs = *lst
I do not know if I understand what line 3 is doing correctly:
From my observation and experiment, this will assign 1 from lst to x, and the rest of lst to xs.
Also I found these two are doing the same thing:
x, *xs = *lst
is equivalent to
x, *xs = lst
My question is, what's the name of this nice feature (I will edit the title afterwards to adapt)? Then I could study more about this Ruby feature myself. Sorry if it's a duplicate problem, because I don't know the keyword to search on this problem.
This feature is called the splat operator in Ruby.
The splat operator in Ruby, Groovy and Perl allows you to switch between parameters and arrays: it splits a list in a series of parameters, or collects a series of parameters to fill an array.
From 4 Lines of Code.
This statement
x, *xs = *lst
doesn't make much sense to me, but these do:
x, *xs = [1, 2, 3] # x -> 1, xs -> [2, 3]
x = 1, *[2, 3, 4] # x -> [1, 2, 3, 4]
This usage IMO has nothing to do with parameters, but as others said, splat can be (and usually is) used with parameters:
def foo(a, b, c)
end
foo(*[1,2,3]) # a -> 1, b -> 2, c -> 3
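For reference, a runnable version of the quicksort from the question, written without the right-hand splat since the question observes that both forms behave the same for an array. The variable names pivot and rest are my own renaming of x and xs.

```ruby
# Quicksort from the question, with x/xs renamed to pivot/rest.
def qsort(lst)
  return [] if lst.empty?
  pivot, *rest = lst                            # head and tail of the list
  less, more = rest.partition { |y| y < pivot } # split the tail around pivot
  qsort(less) + [pivot] + qsort(more)
end

p qsort([3, 1, 4, 1, 5, 9, 2, 6])  # prints [1, 1, 2, 3, 4, 5, 6, 9]
```

partition returns a two-element array, so the fourth line is itself another use of parallel assignment.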

Confusion with splat operator and Range in Ruby

I was trying to see how the splat operator works with ranges in Ruby. To do so, I ran the code below in IRB:
*a = (1..8)
#=> 1..8
The above is fine, but what happens below? Why does a give []?
*a,b = (1..8)
#=> 1..8
b
#=> 1..8
a
#=> []
And why does b give [] here?
a,*b = (1..8)
#=> 1..8
a
#=> 1..8
b
#=> []
What precedence applies to the rvalues below?
a,*b = *(2..8),*3,*5
# => [2, 3, 4, 5, 6, 7, 8, 3, 5]
b
# => [3, 4, 5, 6, 7, 8, 3, 5]
a
# => 2
Here is another experiment with the splat operator (*).
I know that parallel assignment cannot have multiple splatted variables on the left-hand side, but why does the same restriction not apply when splat is used on rvalues?
*a,*b = [1,2,3,4,5]
SyntaxError: (irb):1: syntax error, unexpected tSTAR
*a,*b = [1,2,3,4,5]
^
from /usr/bin/irb:12:in `<main>'
The above is as expected.
a = *2,*3,*5
#=> [2, 3, 5]
But I couldn't understand the above.
I think of parallel assignment as setting an array of variables equal to another array with pattern matching.
One point is that a range is a single value until you convert it to an array or splat it. For instance, [1..5] is a one-element array containing the range 1..5, not [1,2,3,4,5]. To get the array of ints you need to do (1..5).to_a or [*(1..5)].
The first one, I think, is the trickiest. If the splatted var is assigned a single element, the var itself must be a one-element array:
*a = 5
a
# => [ 5 ]
For the next two, splat collects zero or more not-yet-assigned values into an array. So the following makes sense:
*a, b = (1..8)
is like
*a, b = "hey"
which is like
*a, b = [ "hey" ]
so *a matches nothing and b is "hey"; by the same logic, if *a matches nothing, a must be an empty array. The same idea applies to
a, *b = (1..5)
For the next one, the range is splatted, so the assignment makes a lot of sense again:
[*(2..4), 9, 5]
# => [2, 3, 4, 9, 5]
And parallel assignment with a splat again. Next one is similar:
[*3, *4, *5]
# => [3, 4, 5]
So that's like
a = 3, 4, 5
which is like
a = [3, 4, 5]
splat has a very low precedence, almost anything will be executed earlier than the splat.
The code is splatting but the result is thrown away:
b = *a = (1..8); p b #=> [1, 2, 3, 4, 5, 6, 7, 8]
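A short sketch contrasting an unsplatted range with a splatted one on the right-hand side. The helper names are hypothetical.

```ruby
# Hypothetical helpers: same left-hand side, different right-hand side.
def split_unsplatted(range)
  head, *tail = range      # the range is one value: head gets it all
  [head, tail]
end

def split_splatted(range)
  head, *tail = *range     # the splat expands the range into its elements
  [head, tail]
end

p split_unsplatted(1..4)   # prints [1..4, []]
p split_splatted(1..4)     # prints [1, [2, 3, 4]]
```

The difference comes down to Range not being an array: without the explicit splat it is treated as a single right-hand value.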

Marking an unused block variable

When there is a block or local variable that is not to be used, sometimes people mark it with *, and sometimes with _.
{[1, 2] => 3, [4, 5] => 6}.each{|(x, *), *| p x}
{[1, 2] => 3, [4, 5] => 6}.each{|(x, _), _| p x}
[[1, 2, 3], [4, 5, 6]].each{|*, x, *| p x}
[[1, 2, 3], [4, 5, 6]].each{|_, x, _| p x}
def (x, *), *; p x; end
def (x, _), _; p x; end
def *, x, *; p x; end
def _, x, _; p x; end
What are the differences between them, and when should I use which? When there is need to mark multiple variables as unused as in the above examples, is either better?
A * means "all remaining parameters". An _ is just another variable name, although it is a bit special. So they are different, for example the following does not make sense:
[[1, 2, 3], [4, 5, 6]].each{|*, x, *| p x} # Syntax error
Indeed, how is Ruby supposed to know if the first star should get 0, 1 or 2 of the values (and the reverse)?
There are very few cases where you want to use a star to ignore parameters. An example would be if you only want to use the last of a variable number of parameters:
[[1], [2, 3], [4, 5, 6]].each{|*, last| p last} # => prints 1, 3 and 6
Ruby allows you to not give a name to the "rest" of the parameters, but you can use _:
[[1], [2, 3], [4, 5, 6]].each{|*_, last| p last} # => prints 1, 3 and 6
Typically, the number of parameters is known and your best choice is to use a _:
[[1, 2, 3], [4, 5, 6]].each{|_, mid, _| p mid} # prints 2 and 5
Note that you could leave the last parameter unnamed too (like you can when using a *), although it is less obvious:
[[1, 2, 3], [4, 5, 6]].each{|_, mid, | p mid} # prints 2 and 5
Now _ is the designated variable name to use when you don't want to use a value. It is a special variable name for two reasons:
Ruby won't complain if you don't use it (if warnings are on)
Ruby will allow you to repeat it in the argument list.
Example of point 1:
> ruby -w -e "def foo; x = 42; end; foo"
-e:1: warning: assigned but unused variable - x
> ruby -w -e "def foo; _ = 42; end; foo"
no warning
Example of point 2:
[[1, 2, 3], [4, 5, 6]].each{|unused, mid, unused| p mid}
# => SyntaxError: (irb):23: duplicated argument name
[[1, 2, 3], [4, 5, 6]].each{|_, mid, _| p mid}
# => prints 2 and 5
Finally, as @DigitalRoss notes, _ holds the last result in irb.
Update: In Ruby 2.0, you can use any variable starting with _ to signify it is unused. This way the variable name can be more explicit about what is being ignored:
_scheme, _domain, port, _url = parse_some_url
# ... do something with port
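A minimal sketch of both conventions in block parameters. The sample rows and the method names ports and middles are hypothetical, made up for this illustration.

```ruby
# Hypothetical data: each row is [scheme, host, port].
ROWS = [["http", "example.com", 80], ["https", "example.org", 443]]

# Descriptive, underscore-prefixed names document what is being ignored.
def ports(rows)
  rows.map { |_scheme, _host, port| port }
end

# A bare _ may be repeated in the same parameter list.
def middles(rows)
  rows.map { |_, mid, _| mid }
end

p ports(ROWS)                          # prints [80, 443]
p middles([[1, 2, 3], [4, 5, 6]])      # prints [2, 5]
```

The prefixed names cost a few more characters but keep the shape of each row visible to the next reader.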
I think it's mostly stylistic and programmer's choice. Using * makes more sense to me in Ruby because its purpose is to accumulate all parameters passed from that position onward. _ is a vestigial variable that rarely sees use in Ruby, and I've heard comments that it needs to go away. So, if I was to use either, I'd use *.
SOME companies might define it in their programming style document, if they have one, but I doubt it's worth most of their time because it is a throw-away variable. I've been developing professionally for over 20 years, and have never seen anything defining the naming of a throw-away.
Personally, I don't worry about this and I'd be more concerned with the use of single-letter variables. Instead of either, I would use unused or void or blackhole for this purpose.
IMO the practice makes code less readable, and less obvious.
Particularly in API methods taking blocks it may not be clear what the block actually expects. This deliberately removes information from the source, making maintenance and modification more difficult.
I'd rather the variables were named appropriately; in a short block it will be obvious it's not being used. In longer blocks, if the non-use is remarkable, a comment may elaborate on the reason.
What are the differences between them?
In the _ case a local variable _ is being created. It's just like using x but named differently.
In the * case the assignment of an expression to * creates [expression]. I'm not quite sure what it's useful for, as it doesn't seem to do anything that simply surrounding the expression with brackets wouldn't do.
When should I use which?
In the second case you don't end up with an extra symbol being created but it looks like slightly more work for the interpreter. Also, it's obvious that you will never use that result, whereas with _ one would have to read the loop to know if it's used.
But I predict that the quality of your code will depend on other things than which trick you use to get rid of unused block parameters. The * does have a certain obscure cool-factor that I kind of like.
Note: when experimenting with this, be aware that in irb, _ holds the value of the last expression evaluated.

Resources