When a block parameter or local variable is not going to be used, some people mark it with *, and others with _.
{[1, 2] => 3, [4, 5] => 6}.each{|(x, *), *| p x}
{[1, 2] => 3, [4, 5] => 6}.each{|(x, _), _| p x}
[[1, 2, 3], [4, 5, 6]].each{|*, x, *| p x}
[[1, 2, 3], [4, 5, 6]].each{|_, x, _| p x}
def (x, *), *; p x; end
def (x, _), _; p x; end
def *, x, *; p x; end
def _, x, _; p x; end
What are the differences between them, and when should I use which? When there is need to mark multiple variables as unused as in the above examples, is either better?
A * means "all remaining parameters". An _ is just another variable name, although it is a bit special. So they are different; for example, the following does not make sense:
[[1, 2, 3], [4, 5, 6]].each{|*, x, *| p x} # Syntax error
Indeed, how is Ruby supposed to know if the first star should get 0, 1 or 2 of the values (and the reverse)?
There are very few cases where you want to use a star to ignore parameters. An example would be if you only want to use the last of a variable number of parameters:
[[1], [2, 3], [4, 5, 6]].each{|*, last| p last} # => prints 1, 3 and 6
Ruby allows you to leave the "rest" of the parameters unnamed, but you can also name it _:
[[1], [2, 3], [4, 5, 6]].each{|*_, last| p last} # => prints 1, 3 and 6
Typically, the number of parameters is known and your best choice is to use a _:
[[1, 2, 3], [4, 5, 6]].each{|_, mid, _| p mid} # prints 2 and 5
Note that you could leave the last parameter unnamed too (as you can when using a *), although it is less obvious:
[[1, 2, 3], [4, 5, 6]].each{|_, mid, | p mid} # prints 2 and 5
Now _ is the designated variable name to use when you don't want to use a value. It is a special variable name for two reasons:
1. Ruby won't complain if you don't use it (when warnings are on).
2. Ruby will allow you to repeat it in the argument list.
Example of point 1:
> ruby -w -e "def foo; x = 42; end; foo"
-e:1: warning: assigned but unused variable - x
> ruby -w -e "def foo; _ = 42; end; foo"
no warning
Example of point 2:
[[1, 2, 3], [4, 5, 6]].each{|unused, mid, unused| p mid}
# => SyntaxError: (irb):23: duplicated argument name
[[1, 2, 3], [4, 5, 6]].each{|_, mid, _| p mid}
# => prints 2 and 5
Finally, as @DigitalRoss notes, _ holds the last result in irb.
Update: In Ruby 2.0, you can use any variable starting with _ to signify it is unused. This way the variable name can be more explicit about what is being ignored:
_scheme, _domain, port, _url = parse_some_url
# ... do something with port
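A minimal sketch of the convention above. parse_some_url is not defined in the answer, so the version here is a hypothetical stand-in built on the standard library's URI.parse, and the default URL is made up for illustration; only the destructuring line mirrors the original.

```ruby
require "uri"

# Hypothetical stand-in for parse_some_url: returns [scheme, host, port, path].
def parse_some_url(url = "http://example.com:8080/index.html")
  uri = URI.parse(url)
  [uri.scheme, uri.host, uri.port, uri.path]
end

# The leading underscore marks values that are deliberately ignored,
# while the rest of each name still documents what is being skipped.
_scheme, _domain, port, _url = parse_some_url
puts port
```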
I think it's mostly stylistic and programmer's choice. Using * makes more sense to me in Ruby because its purpose is to accumulate all parameters passed from that position onward. _ is a vestigial variable that rarely sees use in Ruby, and I've heard comments that it needs to go away. So, if I was to use either, I'd use *.
SOME companies might define it in their programming style document, if they have one, but I doubt it's worth most of their time because it is a throw-away variable. I've been developing professionally for over 20 years, and have never seen anything defining the naming of a throw-away.
Personally, I don't worry about this and I'd be more concerned with the use of single-letter variables. Instead of either, I would use unused or void or blackhole for this purpose.
IMO the practice makes code less readable, and less obvious.
Particularly in API methods taking blocks it may not be clear what the block actually expects. This deliberately removes information from the source, making maintenance and modification more difficult.
I'd rather the variables were named appropriately; in a short block it will be obvious it's not being used. In longer blocks, if the non-use is remarkable, a comment may elaborate on the reason.
What are the differences between them?
In the _ case a local variable _ is being created. It's just like using x but named differently.
In the * case the assignment of an expression to * creates [expression]. I'm not quite sure what it's useful for, as it doesn't seem to do anything that simply surrounding the expression with brackets doesn't.
When should I use which?
In the second case you don't end up with an extra symbol being created but it looks like slightly more work for the interpreter. Also, it's obvious that you will never use that result, whereas with _ one would have to read the loop to know if it's used.
But I predict that the quality of your code will depend on other things than which trick you use to get rid of unused block parameters. The * does have a certain obscure cool-factor that I kind of like.
Note: when experimenting with this, be aware that in irb, _ holds the value of the last expression evaluated.
I'm pretty good at getting answers from Google, but I just don't get this. In the following code, why does variable 'b' get changed after calling 'addup'? I think I understand why 'a' gets changed (although it's a bit fuzzy), but I want to save the original array 'a' into 'b' and run the method on 'a', so that I have two arrays with different content. What am I doing wrong?
Thanks in advance
def addup(arr)
  i=0
  while i< arr.length
    if arr[i]>3
      arr.delete_at(i)
    end
    i += 1
  end
  return arr
end
a = [1,2,3,4]
b = a
puts "a=#{a}" # => [1,2,3,4]
puts "b=#{b}" # => [1,2,3,4]
puts "addup=#{addup(a)}" # => [1,2,3]
puts "a=#{a}" # => [1,2,3]
puts "b=#{b}" # => [1,2,3]
Both a and b hold a reference to the same array object in memory. In order to save the original array in b, you'd need to copy the array.
a = [1,2,3,4] # => [1, 2, 3, 4]
b = a # => [1, 2, 3, 4]
c = a.dup # => [1, 2, 3, 4]
a.push 5 # => [1, 2, 3, 4, 5]
a # => [1, 2, 3, 4, 5]
b # => [1, 2, 3, 4, 5]
c # => [1, 2, 3, 4]
For more information on why this is happening, read Is Ruby pass by reference or by value?
but I want to save the original array 'a' into 'b'
You are not saving the original array into b. Value of a is a reference to an array. You are copying a reference, which still points to the same array. No matter which reference you use to mutate the array, the changes will be visible through both references, because, again, they point to the same array.
To get a copy of the array, you have to make one explicitly. For flat arrays of primitive values, a simple a.dup will suffice. For structures which are nested or contain references to complex objects, you likely need a deep copy. Something like this:
b = Marshal.load(Marshal.dump(a))
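A small sketch of the difference between the two kinds of copy, using a nested array invented for illustration. dup copies only the outer array, so the inner arrays stay shared; the Marshal round trip copies them too.

```ruby
nested  = [[1, 2], [3, 4]]
shallow = nested.dup                          # copies only the outer array
deep    = Marshal.load(Marshal.dump(nested))  # copies the inner arrays as well

nested[0] << 99   # mutate an inner array through the original

p shallow[0]   # the inner array is shared with nested, so it changed too
p deep[0]      # the deep copy is unaffected
```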
In the following code, why does variable 'b' get changed after calling 'addup'?
The variable doesn't get changed. It still references the exact same array it did before.
There are only two ways to change a variable in Ruby:
Assignment (foo = :bar)
Reflection (Binding#local_variable_set, Object#instance_variable_set, Module#class_variable_set, Module#const_set)
Neither of those is used here.
I think I understand why 'a' gets changed (although it's a bit fuzzy)
a doesn't get changed either. It also still references the exact same array it did before. (Which, incidentally, is the same array that b references.)
The only thing which does change is the internal state of the array that is referenced by both a and b. So, if you really understand why the array referenced by a changes, then you also understand why the array referenced by b changes, since it is the same array. There is only one array in your code.
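The "only one array" point can be checked directly with equal?, which compares object identity rather than contents. This is just an illustrative sketch; the variable c is added here for contrast.

```ruby
a = [1, 2, 3, 4]
b = a        # copies the reference, not the array
c = a.dup    # a new array with the same elements

same_object  = a.equal?(b)   # true: one array, two names
copy_is_same = a.equal?(c)   # false: c is a different object
copy_is_eq   = (a == c)      # true: equal contents

a.delete_at(3)   # mutate the array through a
p b              # b sees the change, because it is the same array
p c              # c does not
```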
The immediate problem with your code is that, if you want a copy of the array, then you need to actually make a copy of the array. That's what Object#dup and Object#clone are for:
b = a.clone
Will fix your code.
BUT!
There are some other problems in your code. The main problem is mutation. If at all possible, you should avoid mutation (and side-effects in general, of which mutation is only one example) as much as possible and only use it when you really, REALLY have to. In particular, you should never mutate objects you don't own, and this means you should never mutate objects that were passed to you as arguments.
However, in your addup method, you mutate the array that is passed to you as arr. Mutation is the source of your problem: if you didn't mutate arr but instead returned a new array with the modifications you want, then you wouldn't have the problem in the first place. One way of not mutating the argument would be to move the cloning into the method, but there is an even better way.
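As a rough sketch of the "move the cloning into the method" option: the name addup_with_copy is hypothetical, and delete_if is swapped in for the original index loop, so only the private copy is ever mutated.

```ruby
# Variant of addup that copies its argument before changing anything.
def addup_with_copy(arr)
  copy = arr.dup                  # work on a private copy
  copy.delete_if { |el| el > 3 }  # mutates the copy, never the argument
  copy
end

a = [1, 2, 3, 4]
b = a
result = addup_with_copy(a)

p result   # the filtered copy
p a        # the caller's array is unchanged
```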
Another problem with your code is that you are using a loop. In Ruby, there is almost never a situation where a loop is the best solution. In fact, I would go so far as to argue that if you are using a loop, you are doing it wrong.
Loops are error-prone, hard to understand, hard to get right, and they depend on side-effects. A loop cannot work without side-effects, yet, we just said we want to avoid side-effects!
Case in point: your loop contains a serious bug. If I pass [1, 2, 3, 4, 5], the result will be [1, 2, 3, 5]. Why? Because of mutation and manual looping:
In the fourth iteration of the loop, at the beginning, the array looks like this:
[1, 2, 3, 4, 5]
#         ↑
#         i
After the call to delete_at(i), the array looks like this:
[1, 2, 3, 5]
#         ↑
#         i
Now, you increment i, so the situation looks like this:
[1, 2, 3, 5]
#            ↑
#            i
i is now equal to the length of the array, so the condition i < arr.length is false, the loop ends, and the 5 never gets removed.
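The trace above can be reproduced with the question's method. It is copied here under the hypothetical name addup_original so that it does not clash with any other definition.

```ruby
# The question's loop, unchanged in behaviour.
def addup_original(arr)
  i = 0
  while i < arr.length
    arr.delete_at(i) if arr[i] > 3
    i += 1   # runs even after a deletion, so the shifted-in element is skipped
  end
  arr
end

buggy_result = addup_original([1, 2, 3, 4, 5])
p buggy_result   # => [1, 2, 3, 5]
```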
What you really want, is this:
def addup(arr)
  arr.reject {|el| el > 3 }
end
a = [1, 2, 3, 4, 5]
b = a
puts "a=#{a}" # => [1, 2, 3, 4, 5]
puts "b=#{b}" # => [1, 2, 3, 4, 5]
puts "addup=#{addup(a)}" # => [1, 2, 3]
puts "a=#{a}" # => [1, 2, 3, 4, 5]
puts "b=#{b}" # => [1, 2, 3, 4, 5]
As you can see, nothing was mutated. addup simply returns the new array with the modifications you want. If you want to refer to that array later, you can assign it to a variable:
c = addup(a)
There is no need to manually fiddle with loop indices. There is no need to copy or clone anything. There is no "spooky action at a distance", as Albert Einstein called it. We fixed two bugs and removed 7 lines of code, simply by
avoiding mutation
avoiding loops
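One way to check that a method really leaves its argument alone is to hand it a frozen array. This is only a sketch: the frozen array is an addition for illustration, and the rescue clause assumes that the error raised for modifying a frozen object is a RuntimeError (FrozenError is a subclass of it on Ruby 2.5 and later).

```ruby
def addup(arr)
  arr.reject { |el| el > 3 }
end

frozen = [1, 2, 3, 4, 5].freeze
kept = addup(frozen)   # works: reject builds a new array

mutation_error =
  begin
    frozen.delete_at(3)   # any in-place change to a frozen array raises
    nil
  rescue RuntimeError => e
    e.class
  end

p kept
p mutation_error
```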
Since parallel assignment like this can be done,
x, y, z = [1, 2, 3]
I wanted to know if one could drop for example the y value. Is there a way of doing parallel assignment without polluting the namespace of the current scope?
I tried assigning to nil:
x, nil, z = [1, 2, 3]
but that does not work.
The idiomatic way to do that is to assign it to a variable named underscore:
x, _, z = [1, 2, 3]
If there are multiple values that you want to drop, you can use splat:
x, *_, z = [1, 2, 3, 4, 5]
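For illustration, this is what each name ends up holding with the splat form, including the case where there is nothing left over for the splat. The second assignment uses made-up names.

```ruby
x, *_, z = [1, 2, 3, 4, 5]
# x takes the first element, z the last; _ collects everything in between.
p x   # => 1
p z   # => 5
p _   # => [2, 3, 4]

first, *skipped, last = [1, 2]
# With only two elements the splat receives an empty array.
p skipped   # => []
```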
As mentioned by @ndn, _ is often used.
It's okay to ignore it and to reassign over it, but _ is still a variable in the namespace of the current scope:
l = [1, 2, 3]
x, _, z = l
puts _
#=> 2
If it bothers you, you could use:
x, z = l.values_at(0,-1)
It might be less readable though. _ is also the name RuboCop proposes if you run it on the script.
Talking about readability, if you want to explain what the variable is, but still want to show that it won't be used afterwards, you could use:
x, _y, z = [1, 2, 3]
# Do something with x and z. Ignore _y
It's also proposed by RuboCop:
test.rb:2:4: W: Useless assignment to variable - y. Use _ or _y as a variable name to indicate that it won't be used.
meschi has asked whether it is possible, using parallel assignment, to assign some, but not all, elements of an array to variables.
Firstly, it doesn't help to write
a, _, c = [1, true, 2]
since _ is a variable:
_ #=> true
(You won't get this result using IRB as IRB uses _ for its own purposes.)
If, in the above array, you only want to assign 1 and true to variables you can write
a, b, * = [1, true, 2]
a #=> 1
b #=> true
If you only want to assign 1 and 2 to variables, you can write
a, *, b = [1, true, 2]
a #=> 1
b #=> 2
If, for the array [1, true, 2, 3], you wish to assign only 1 and 2 to variables, well, that's a problem. Evidently, as @Eric pointed out in a comment, you'd need to write
a, b, b, * = [1, true, 2, 3]
a #=> 1
b #=> 2
This idea can of course be generalized:
a, b, b, b, c, c = [1, true, 2, 3, 4, 5]
a #=> 1
b #=> 3
c #=> 5
Is this a good idea? That was not the question.
The main concern may be creating a not-to-be-used variable and then inadvertently using it, as here:
a, _, b = [1, true, 2]
...
launch_missiles_now?(_)
If so, there are a few options. One is to simply wrap the expression in a method, thereby keeping the superfluous variables tightly confined:
def get_a_and_b(arr)
  a, _, b, _ = arr
  [a, b]
end
a, b = get_a_and_b [1, true, 2, 3]
#=> [1, 2]
Another is to name unwanted variables so that it is unlikely they will be referenced accidentally, for example:
a, _reference_this_variable_and_you_will_die, b = [1, true, 2]
I would improve on @ndn's answer by using only the * symbol to drop values.
_ is usually used to hold the last returned value (as in irb), so it can be confusing to use it here.
So you can use
x, *, z = [1, 2, 3]
as well as
x, *, y, z = [1, 2, 3, 4]
and it will still understand you and return the proper values for all the variables.
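A quick sketch of what the bare * does: the values it swallows are simply dropped, and no extra local variable appears. The method first_and_last is a hypothetical wrapper, used here only so that local_variables reports a clean scope.

```ruby
x, *, y, z = [1, 2, 3, 4]
p [x, y, z]   # => [1, 3, 4]  (the 2 is dropped)

def first_and_last(values)
  first, *, last = values
  # local_variables lists the names visible here; the bare * adds none.
  [first, last, local_variables]
end

head, tail, names = first_and_last([1, 2, 3, 4])
p names
```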
I found this code by user Hirolau:
def sum_to_n?(a, n)
  a.combination(2).find{|x, y| x + y == n}
end
a = [1, 2, 3, 4, 5]
sum_to_n?(a, 9) # => [4, 5]
sum_to_n?(a, 11) # => nil
How can I know when I can send two parameters to a predefined method like find? It's not clear to me because sometimes it doesn't work. Is this something that has been redefined?
If you look at the documentation of Enumerable#find, you see that it passes only one argument to the block. The reason why you can declare two is that Ruby conveniently lets you do this with blocks, based on its "parallel assignment" structure:
[[1,2,3], [4,5,6]].each {|x,y,z| puts "#{x}#{y}#{z}"}
# 123
# 456
So basically, each yields an array element to the block, and because Ruby block syntax allows "expanding" array elements to their components by providing a list of arguments, it works.
You can find more tricks with block arguments here.
a.combination(2) produces sub-arrays of two elements each (without a block it returns an enumerator, so to_a is used here to show them). So:
a = [1,2,3,4]
a.combination(2).to_a
# => [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
As a result, you are sending one array like [1,2] to find's block, and Ruby performs the parallel assignment to assign 1 to x and 2 to y.
Also see this SO question, which brings other powerful examples of parallel assignment, such as this statement:
a,(b,(c,d)) = [1,[2,[3,4]]]
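The same parenthesised pattern can be used in block parameters. The sketch below uses a made-up hash with each_with_index, which passes a [key, value] pair plus an index to the block.

```ruby
a, (b, (c, d)) = [1, [2, [3, 4]]]
p [a, b, c, d]   # => [1, 2, 3, 4]

totals = {}
{ apples: [1, 2], pears: [3, 4] }.each_with_index do |(name, (low, high)), index|
  # (name, (low, high)) unpacks the [key, [low, high]] pair; index stays separate.
  totals[name] = [index, low + high]
end
p totals
```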
find does not take two parameters, it takes one. The reason the block in your example takes two parameters is that it is using destructuring. The preceding code a.combination(2) gives a sequence of two-element arrays, and find iterates over it. Each element (an array of two elements) is passed to the block, one at a time, as its single argument. However, when you write more parameters than there are arguments, Ruby tries to adjust by destructuring the array. The part:
find{|x, y| x + y == n}
is a shorthand for writing:
find{|(x, y)| x + y == n}
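To make the equivalence concrete, here are the three ways of writing the block side by side. The pairs array and the variable names are invented for this sketch.

```ruby
pairs = [[1, 2], [2, 5], [4, 5]]
n = 9

whole    = pairs.find { |pair| pair[0] + pair[1] == n }  # one parameter: the array itself
implicit = pairs.find { |x, y| x + y == n }              # array spread over two parameters
explicit = pairs.find { |(x, y)| x + y == n }            # the same thing, spelled out

p whole, implicit, explicit   # each is [4, 5]
```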
The find function iterates over elements, it takes a single argument, in this case a block (which does take two arguments for a hash):
h = {foo: 5, bar: 6}
result = h.find {|k, v| k == :foo && v == 5}
puts result.inspect #=> [:foo, 5]
The block takes only one argument for arrays though unless you use destructuring.
Update: It seems that it is destructuring in this case.
I have a string which contains a 2-D array.
b= "[[1, 2, 3], [4, 5, 6]]"
c = b.gsub(/(\[\[)/,"[").gsub(/(\]\])/,"]")
The above is how I decide to flatten it to:
"[1, 2, 3], [4, 5, 6]"
Is there a way to replace the leftmost and rightmost brackets without doing a double gsub call? I'm doing a deeper dive into regular expressions and would like to see different alternatives.
Sometimes, the string may be in the correct format as comma delimited 1-D arrays.
The gsub method accepts a hash, and anything that matches your regular expression will be replaced using the keys/values in that hash, like so:
b = "[[1, 2, 3], [4, 5, 6]]"
c = b.gsub(/\[\[|\]\]/, '[[' => '[', ']]' => ']')
That may look a little jumbled, and in practice I'd probably define the list of swaps on a different line. But this does what you were looking for with one gsub, in a more intuitive way.
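A sketch of the "define the swaps on a different line" variant mentioned above. The name swaps is an arbitrary choice; the second call shows that a string already in the flat format passes through unchanged.

```ruby
# Replacement table kept on its own line; keys are the exact matched text.
swaps = { "[[" => "[", "]]" => "]" }

b = "[[1, 2, 3], [4, 5, 6]]"
c = b.gsub(/\[\[|\]\]/, swaps)

already_flat = "[1, 2, 3], [4, 5, 6]"
unchanged = already_flat.gsub(/\[\[|\]\]/, swaps)

puts c
puts unchanged
```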
Another option is to take advantage of the fact that gsub also accepts a block:
c = b.gsub(/\[\[|\]\]/){|matched_value| matched_value[0]}
Here we match any double opening/closing square brackets, and just take the first character of each match. We can clean up the regex:
c = b.gsub(/\[{2}|\]{2}/){|matched_value| matched_value[0]}
This is a more succinct way to specify that we want to match exactly two opening brackets, or exactly two closing brackets. We can also refine the block:
c = b.gsub(/\[{2}|\]{2}/, &:chr)
Here we're using some Ruby shorthand. If you only need to call a simple method on the object passed into a block, you can use the &: notation to do this. I think I've gotten it about as short and sweet as I can. Happy coding!
\[(?=\[)|(?<=\])\]
You can try this. Replace with an empty string. See demo.
http://regex101.com/r/hQ1rP0/25
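In Ruby, the pattern above could be applied with a single gsub call, roughly as follows. The constant-like name outer_brackets is just for readability.

```ruby
b = "[[1, 2, 3], [4, 5, 6]]"

# \[(?=\[)   matches a "[" only when another "[" follows it
# (?<=\])\]  matches a "]" only when another "]" precedes it
outer_brackets = /\[(?=\[)|(?<=\])\]/

c = b.gsub(outer_brackets, "")
puts c   # => [1, 2, 3], [4, 5, 6]
```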
Don't even bother with a regular expression, just do a simple string slice:
b= "[[1, 2, 3], [4, 5, 6]]"
b[1 .. -2] # => "[1, 2, 3], [4, 5, 6]"
the string may be in the correct format as comma delimited 1D arrays
Then sense whether it is and conditionally modify it:
b= "[[1, 2, 3], [4, 5, 6]]"
b = b[1 .. -2] if b[0, 2] == '[[' # => "[1, 2, 3], [4, 5, 6]"
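The conditional slice can be wrapped in a small method so that both input formats go through the same call. strip_outer_brackets is a hypothetical name, and checking the closing brackets as well is an extra assumption of this sketch.

```ruby
# Strips one outer pair of brackets only when the string is wrapped
# in a doubled pair; otherwise returns the string as it is.
def strip_outer_brackets(str)
  str.start_with?("[[") && str.end_with?("]]") ? str[1..-2] : str
end

nested = strip_outer_brackets("[[1, 2, 3], [4, 5, 6]]")
flat   = strip_outer_brackets("[1, 2, 3], [4, 5, 6]")

puts nested
puts flat
```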
Regular expressions aren't universal hammers, and not everything is a nail to be hit with one.
To "squeeze" consecutive occurrences of a specific character set, you can use tr_s:
"[[1,2],[3,4]]".tr_s('[]','[]')
=> "[1,2],[3,4]"
You're saying "translate all runs of square bracket characters to one of that character". To do the same thing with regular expressions and gsub, you can do:
"[[1,2],[3,4]]".gsub(/(\[|\])+/,'\1')
This is just a thought exercise and I'd be interested in any opinions. Although if it works I can think of a few ways I'd use it.
Traditionally, if you wanted to perform a function on the results of a nested loop formed from arrays or ranges etc, you would write something like this:
def my_func(x, y)
  # Processing with x, y
end
iterable_one.each do |x|
  iterable_two.each do |y|
    my_func(x, y)
  end
end
However, what if I had to add another level of nesting? Yes, I could just add an additional level of looping. At this point, let's make my_func take a variable number of arguments.
def my_func(*inputs)
  # Processing with variable inputs
end
iterable_one.each do |x|
  iterable_two.each do |y|
    iterable_three.each do |z|
      my_func(x, y, z)
    end
  end
end
Now, assume I need to add another level of nesting. At this point, it's getting pretty gnarly.
My question, therefore is this: Is it possible to write something like the below?
[iterable_one, iterable_two, iterable_three].nested_each(my_func)
or perhaps
[iterable_one, iterable_two, iterable_three].nested_each { |args| my_func(args) }
Perhaps passing the arguments as actual arguments isn't feasible, could you maybe pass an array to my_func, containing parameters from combinations of the enumerables?
I'd be curious to know if this is possible, it's probably not that likely a scenario but after it occurred to me I wanted to know.
Array#product yields combinations of enums as if they were in nested loops. It takes multiple arguments. Demo:
a = [1,2,3]
b = %w(a b c)
c = [true, false]
all_enums = [a,b,c]
all_enums.shift.product(*all_enums) do |combi|
  p combi
end
#[1, "a", true]
#[1, "a", false]
#[1, "b", true]
#...
You can use product:
[1,4].product([5,6],[3,5]) #=> [[1, 5, 3], [1, 5, 5], [1, 6, 3], [1, 6, 5], [4, 5, 3], [4, 5, 5], [4, 6, 3], [4, 6, 5]]
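Building on product, something close to the nested_each from the question could be sketched like this. nested_each is a hypothetical helper, not part of Ruby; it is written as a plain method rather than added to Array, and it assumes at least one iterable is given.

```ruby
# Calls the block once per combination, in the same order that
# nested each loops would visit them.
def nested_each(*iterables)
  first, *rest = iterables.map(&:to_a)   # ranges and other enumerables become arrays
  first.product(*rest) { |combination| yield(*combination) }
end

visited = []
nested_each(1..2, %w[a b], [true, false]) { |x, y, z| visited << [x, y, z] }

p visited.first   # => [1, "a", true]
p visited.length  # => 8
```

Because each combination is splatted into the block, the block can name its parameters individually, which is the "actual arguments" form the question asks about.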