Passing a hash to a function ( *args ) and its meaning - ruby

When using an idiom such as:
def func(*args)
# some code
end
What is the meaning of *args? Googling this specific question was pretty hard, and I couldn't find anything.
It seems all the arguments actually appear in args[0] so I find myself writing defensive code such as:
my_var = args[0].delete(:var_name) if args[0]
But I'm sure there's a better way I'm missing out on.

The * is the splat (or asterisk) operator. In the context of a method, it specifies a variable length argument list. In your case, all arguments passed to func will be putting into an array called args. You could also specify specific arguments before a variable-length argument like so:
def func2(arg1, arg2, *other_args)
# ...
end
Let's say we call this method:
func2(1, 2, 3, 4, 5)
If you inspect arg1, arg2 and other_args within func2 now, you will get the following results:
def func2(arg1, arg2, *other_args)
p arg1.inspect # => 1
p arg2.inspect # => 2
p other_args.inspect # => [3, 4, 5]
end
In your case, you seem to be passing a hash as an argument to your func, in which case, args[0] will contain the hash, as you are observing.
Resources:
Variable Length Argument List, Asterisk Operator
What is the * operator doing
Update based on OP's comments
If you want to pass a Hash as an argument, you should not use the splat operator. Ruby lets you omit brackets, including those that specify a Hash (with a caveat, keep reading), in your method calls. Therefore:
my_func arg1, arg2, :html_arg => value, :html_arg2 => value2
is equivalent to
my_func(arg1, arg2, {:html_arg => value, :html_arg2 => value2})
When Ruby sees the => operator in your argument list, it knows to take the argument as a Hash, even without the explicit {...} notation (note that this only applies if the hash argument is the last one!).
If you want to collect this hash, you don't have to do anything special (though you probably will want to specify an empty hash as the default value in your method definition):
def my_func(arg1, arg2, html_args = {})
# ...
end

Related

Why is `to_ary` called from a double-splatted parameter in a code block?

It seems that a double-splatted block parameter calls to_ary on an object that is passed, which does not happen with lambda parameters and method parameters. This was confirmed as follows.
First, I prepared an object obj on which a method to_ary is defined, which returns something other than an array (i.e., a string).
obj = Object.new
def obj.to_ary; "baz" end
Then, I passed this obj to various constructions that have a double splatted parameter:
instance_exec(obj){|**foo|}
# >> TypeError: can't convert Object to Array (Object#to_ary gives String)
->(**foo){}.call(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
def bar(**foo); end; bar(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
As can be observed above, only code block tries to convert obj to an array by calling a (potential) to_ary method.
Why does a double-splatted parameter for a code block behave differently from those for a lambda expression or a method definition?
I don't have full answers to your questions, but I'll share what I've found out.
Short version
Procs allow to be called with number of arguments different than defined in the signature. If the argument list doesn't match the definition, #to_ary is called to make implicit conversion. Lambdas and methods require number of args matching their signature. No conversions are performed and that's why #to_ary is not called.
Long version
What you describe is a difference between handling params by lambdas (and methods) and procs (and blocks). Take a look at this example:
obj = Object.new
def obj.to_ary; "baz" end
lambda{|**foo| print foo}.call(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
proc{|**foo| print foo}.call(obj)
# >> TypeError: can't convert Object to Array (Object#to_ary gives String)
Proc doesn't require the same number of args as it defines, and #to_ary is called (as you probably know):
For procs created using lambda or ->(), an error is generated if wrong number of parameters are passed to the proc. For procs created using Proc.new or Kernel.proc, extra parameters are silently discarded and missing parameters are set to nil. (Docs)
What is more, Proc adjusts passed arguments to fit the signature:
proc{|head, *tail| print head; print tail}.call([1,2,3])
# >> 1[2, 3]=> nil
Sources: makandra, SO question.
#to_ary is used for this adjustment (and it's reasonable, as #to_ary is for implicit conversions):
obj2 = Class.new{def to_ary; [1,2,3]; end}.new
proc{|head, *tail| print head; print tail}.call(obj2)
# >> 1[2, 3]=> nil
It's described in detail in a ruby tracker.
You can see that [1,2,3] was split to head=1 and tail=[2,3]. It's the same behaviour as in multi assignment:
head, *tail = [1, 2, 3]
# => [1, 2, 3]
tail
# => [2, 3]
As you have noticed, #to_ary is also called when when a proc has double-splatted keyword args:
proc{|head, **tail| print head; print tail}.call(obj2)
# >> 1{}=> nil
proc{|**tail| print tail}.call(obj2)
# >> {}=> nil
In the first case, an array of [1, 2, 3] returned by obj2.to_ary was split to head=1 and empty tail, as **tail wasn't able to match an array of[2, 3].
Lambdas and methods don't have this behaviour. They require strict number of params. There is no implicit conversion, so #to_ary is not called.
I think that this difference is implemented in these two lines of the Ruby soruce:
opt_pc = vm_yield_setup_args(ec, iseq, argc, sp, passed_block_handler,
(is_lambda ? arg_setup_method : arg_setup_block));
and in this function. I guess #to_ary is called somewhere in vm_callee_setup_block_arg_arg0_splat, most probably in RARRAY_AREF. I would love to read a commentary of this code to understand what happens inside.

Mixing keyword argument and arguments with default values duplicates the hash?

So i discovered this ruby behaviour, which kept me going crazy for over an hour. When I pass a hash to a function which has a default value for hash AND a keyword argument, it seems like the reference doesn't get passed correctly. As soon as I take away the default value OR the keyword argument, the function behaves as expected. Am I missing some obvious ruby rule here?
def change_hash(h={}, rand: om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {}
It works fine as soon as I take out the default or the keyword arg.
def change_hash(h, rand: om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
def change_hash(h={})
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
EDIT
Thanks for your answers. Most of you pointed out that ruby parses the hash as a keyword argument in some cases. However, I am talking about the case when a hash has string keys. When I pass the hash, it seems like the value that gets passed is correct. But modifying the hash inside the function doesn't modify the original hash.
def change_hash(hash={}, another_arg: 300)
puts "another_arg: #{another_arg}"
puts "hash: #{hash}"
hash['hey'] = 3
end
my_hash = {"o" => 3}
change_hash(my_hash)
puts my_hash
Prints out
another_arg: 300
hash: {"o"=>3}
{"o"=>3}
TL;DR ruby allows passing hash as a keyword argument as well as “expanded inplace hash.” Since change_hash(rand: :om) must be routed to keyword argument, so should change_hash({rand: :om}) and, hence, change_hash({}).
Since ruby allows default arguments in any position, the parser takes care of default arguments in the first place. That means, that the default arguments are greedy and the most amount of defaults will take a place.
On the other hand, since ruby lacks pattern-matching feature for function clauses, parsing the given argument to decide whether it should be passed as double-splat or not would lead to huge performance penalties. Since the call with an explicit keyword argument (change_hash(rand: :om)) should definitely pass :om to keyword argument, and we are allowed to pass an explicit hash {rand: :om} as a keyword argument, Ruby has nothing to do but to accept any hash as a keyword argument.
Ruby will split the single hash argument between hash and rand:
k = {"a" => 42, rand: 42}
def change_hash(h={}, rand: :om)
h[:foo] = 42
puts h.inspect
end
change_hash(k);
puts k.inspect
#⇒ {"a"=>42, :foo=>42}
#⇒ {"a"=>42, :rand=>42}
That split feature requires the argument being cloned before passing. That is why the original hash is not being modified.
This is particularly tricky case in Ruby indeed.
In your example you have optional argument which is a hash and you have an optional keyword argument at the same time. In this situation if you pass only one hash, Ruby interprets it as a hash which contains keyword arguments. Here is the code to clarify:
change_hash({rand1: 'om'})
# ArgumentError: unknown keyword: rand1
To work around this you can pass two separate hashes into the method with second one (the one for keyword arguments) being empty:
def change_hash(h={}, rand: 'om')
h['hey'] = true
end
k = {}
change_hash(k, {})
k
#=> {'hey' => true}
From the practical point of view it is better to avoid metdhod signature like that in production code, because it is very easy to make an error while using the method.

Why do you have to specify 2 arguments explicitly to curry :>

Consider this, which works fine:
:>.to_proc.curry(2)[9][8] #=> true, because 9 > 8
However, even though > is a binary operator, the above won't work without the arity specified:
:>.to_proc.curry[9][8] #=> ArgumentError: wrong number of arguments (0 for 1)
Why aren't the two equivalent?
Note: I specifically want to create the intermediate curried function with one arg supplied, and then call then call that with the 2nd arg.
curry has to know the arity of the proc passed in, right?
:<.to_proc.arity # => -1
Negative values from arity are confusing, but basically mean 'variable number of arguments' one way or another.
Compare to:
less_than = lambda {|a, b| a < b}
less_than.arity # => 2
When you create a lambda saying it takes two arguments, it knows it takes two arguments, and will work fine with that style of calling #curry.
less_than.curry[9][8] # => false, no problem!
But when you use the symbol #to_proc trick, it's just got a symbol to go on, it has no idea how many arguments it takes. While I don't think < is actually an ordinary method in ruby, I think you're right it neccessarily takes two args, the Symbol#to_proc thing is a general purpose method that works on any method name, it has no idea how many args the method should take, so defines the proc with variable arguments.
I don't read C well enough to follow the MRI implementation, but I assume Symbol#to_proc defines a proc with variable arguments. The more typical use of Symbol#to_proc, of course, is for a no-argument methods. You can for instance do this with it if you want:
hello_proc = :hello.to_proc
class SomeClass
def hello(name = nil)
puts "Hello, #{name}!"
end
end
obj = SomeClass.new
obj.hello #=> "Hello, !"
obj.hello("jrochkind") #=> "Hello, jrochkind!"
obj.hello("jrochkind", "another")
# => ArgumentError: wrong number of arguments calling `hello` (2 for 1)
hello_proc.call(obj) # => "Hello, !"
hello_proc.call(obj, "jrochkind") # => "Hello, jrochkind!"
hello_proc.call(obj, "jrochkind", "another")
# => ArgumentError: wrong number of arguments calling `hello` (2 for 1)
hello_proc.call("Some string")
# => NoMethodError: undefined method `hello' for "Some string":String
Note I did hello_proc = :hello.to_proc before I even defined SomeClass. The Symbol#to_proc mechanism creates a variable arity proc, that knows nothing about how or where or on what class it will be called, it creates a proc that can be called on any class at all, and can be used with any number of arguments.
If it were defined in ruby instead of C, it would look something like this:
class Symbol
def to_proc
method_name = self
proc {|receiver, *other_args| receiver.send(method_name, *other_args) }
end
end
I think it is because Symbol#to_proc creates a proc with one argument. When turned into a proc, :> does not look like:
->x, y{...}
but it looks like:
->x{...}
with the requirement of the original single argument of > somehow tucked inside the proc body (notice that > is not a method that takes two arguments, it is a method called on one receiver with one argument). In fact,
:>.to_proc.arity # => -1
->x, y{}.arity # => 2
which means that applying curry to it without argument would only have a trivial effect; it takes a proc with one parameter, and returns itself. By explicitly specifying 2, it does something non-trivial. For comparison, consider join:
:join.to_proc.arity # => -1
:join.to_proc.call(["x", "y"]) # => "xy"
:join.to_proc.curry.call(["x", "y"]) # => "xy"
Notice that providing a single argument after Currying :join already evaluates the whole method.
#jrochkind's answer does a great job of explaining why :>.to_proc.curry doesn't have the behavior you want. I wanted to mention, though, that there's a solution to this part of your question:
I specifically want to create the intermediate curried function with one arg supplied, and then call then call that with the 2nd arg.
The solution is Object#method. Instead of this:
nine_is_greater_than = :>.to_proc.curry[9]
nine_is_greater_than[8]
#=> ArgumentError: wrong number of arguments (0 for 1)
...do this:
nine_is_greater_than = 9.method(:>)
nine_is_greater_than[8]
# => true
Object#method returns a Method object, which acts just like a Proc: it responds to call, [], and even (as of Ruby 2.2) curry. However, if you need a real proc (or want to use curry with Ruby < 2.2) you can also call to_proc on it (or use &, the to_proc operator):
[ 1, 4, 8, 10, 20, 30 ].map(&nine_is_greater_than)
# => [ true, true, true, false, false, false ]

ruby - omitting default parameter before splat

I have the following method signature
def invalidate_cache(suffix = '', *args)
# blah
end
I don't know if this is possible but I want to call invalidate_cache and omit the first argument sometimes, for example:
middleware.invalidate_cache("test:1", "test")
This will of course bind the first argument to suffix and the second argument to args.
I would like both arguments to be bound to args without calling like this:
middleware.invalidate_cache("", "test:1", "test")
Is there a way round this?
Use keyword arguments (this works in Ruby 2.0+):
def invalidate_cache(suffix: '', **args) # note the double asterisk
[suffix, args]
end
> invalidate_cache(foo: "any", bar: 4242)
=> ["", {:foo=>"any", :bar=>4242}]
> invalidate_cache(foo: "any", bar: 4242, suffix: "aaaaa")
=> ["aaaaa", {:foo=>"any", :bar=>4242}]
Note that you will have the varargs in a Hash instead of an Array and keys are limited to valid Symbols.
If you need to reference the arguments by position, create an Array from the Hash with Hash#values.
How about you create a wrapper method for invalidate_cache that just calls invalidate_cache with the standard argument for suffix.
In order to do this, your code has to have some way of telling the difference between a suffix, and just another occurrence of args. E.g. In your first example, how is your program supposed to know that you didn't mean for "test:1" to actually be the suffix?
If you can answer that question, you can write some code to make the method determine at run time whether or not you provided a suffix. For example, say you specify that all suffixes have to start with a period (and no other arguments will). Then you could do something like this:
def invalidate_cache(*args)
suffix = (args.first =~ /^\./) ? args.shift : ''
[suffix, args]
end
invalidate_cache("test:1", "test") #=> ["", ["test:1", "test"]]
invalidate_cache(".jpeg", "test:1", "test") #=> [".jpeg", ["test:1", "test"]]
If, however, there actually is no way of telling the difference between an argument meant as a suffix and one meant to be lumped in with args, then you're kind of stuck. You'll either have to keep passing suffix explicitly, change the method signature to use keyword arguments (as detailed in karatedog's answer), or take an options hash.

Join doesn't convert the splat argument into a string in Ruby

Let's say we have this code:
def something(*someargs)
return *someargs.join(",")
end
Now, I found you can reference *someargs just like any other variable anywhere in the method definition. But I tried this...returning *someargs as a string, separated with a comma. Yet, when I call this method:
a = something(4, 5)
p a.class # => Array
p a #> ["4,5"]
why does something(4,5) still returns an array? If I do something like this:
[4, 5].join(",")
the result will be a string not in an array. So my question would be, how do I make the "something" method return an actual string which contains all the arguments as a string. And it's weird because if I do *someargs.class, the result is "Array", yet it doesn't behave like a typical array...
Try below :
def something(*someargs)
return someargs.join(",")
end
a = something(4, 5)
p a.class # => String
p a # => "4,5"
One example to explain your case:
a = *"12,11"
p a # => ["12,11"]
So when you did return *someargs.join(","), someargs.join(",") created the string as "4,5".But now you are using splat operator(*) again on the evaluated string "4,5" with the assignment operation like a = *"4,5". So finally you are getting ["4,5"].
Read more scenarios with splat operators here - Splat Operator in Ruby
Hope that helps.
An object prepended with a splat *... is not an object. You cannot reference such thing, nor can you pass it as an argument to a method because there is no such thing. However, if you have a method that can take multiple arguments, for example puts, then you can do something like:
puts *["foo", "bar"]
In this case, there is no such thing as *["foo", "bar"]. The splat operator is expanding it into multiple arguments. It is equivalent to:
puts "foo", "bar"
Regarding why someargs remains to be an array after someargs.join(","). That is because join is not a destructive method. It does not do anything to the receiver. Furthermore, an object cannot change its class by a destructive method. The only way to change the reference of someargs from an array to a string is to reassign it.

Resources