What does Hash[x] << "string" do? - ruby

What does Hash[x] << "string" do?
What is the symbol << for and how does it work?

The real question is, what does Hash[x] evaluate to?
Because it's that object (an Array, perhaps?) on which the << operator (really a method) is being invoked. That is, Hash[x] << "string" is, excluding the temporary variable, equivalent to t = Hash[x]; t << "string".
Like all overridable Ruby operators [1], << is really just a method call. It is commonly seen as Array#<<, but it may be different for the object in question (see above):
[On an Array object, the << operator] Append—Pushes the given object on to the end of this array. This expression returns the array itself, so several appends may be chained together.
Once the actual object is known, then the operator can be trivially looked up in the appropriate documentation.
[1] See "list of Ruby operators that can be overridden/implemented" for a list; "pure" operators like = (non-indexed assignment) and , cannot be overridden and do not work in the same way.
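To make the point above concrete, here is a minimal sketch. The hash `inventory` and its keys are invented for the example; what matters is that the receiver of << is whatever object the lookup returns.

```ruby
# Hypothetical data: one value is an Array, the other a String.
inventory = { "fruits" => ["apple"], "note" => String.new("buy") }

# The hash lookup runs first; << is then sent to the object it returns.
inventory["fruits"] << "pear"   # Array#<< appends an element
inventory["note"] << " more"    # String#<< appends text in place

# The same thing with the temporary variable spelled out.
t = inventory["fruits"]
t << "plum"

p inventory["fruits"]  # ["apple", "pear", "plum"]
p inventory["note"]    # "buy more"
```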

<< is a method, conventionally used to mean "append" (for arrays, push and its alias append do the same job). In Ruby, you can call operator methods in the same way as any other method: an_obj.<<(an_arg) is perfectly valid syntax.
In general, << adds the argument to the receiver. If the receiver is an array, it adds the argument to the end of the array; if it's a string, it adds the argument to the end of the string.
The side effects and return value of calling the << method simply depend on the method's implementation in the receiver object's class.
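As a rough illustration of that last point, the sketch below defines << on a made-up class. `Playlist` and its `tracks` reader are hypothetical names, not part of any library.

```ruby
# Hypothetical class used only for illustration.
class Playlist
  attr_reader :tracks

  def initialize
    @tracks = []
  end

  # Defining << is what makes `playlist << track` work.
  # Returning self allows chaining, as Array#<< does.
  def <<(track)
    @tracks.push(track)
    self
  end
end

playlist = Playlist.new
playlist << "Intro" << "Outro"  # operator form, chained
playlist.<<("Encore")           # explicit method-call form
p playlist.tracks               # ["Intro", "Outro", "Encore"]
```

Had `<<` returned something other than `self`, the chained call would have been sent to that other object instead.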

Related

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
  def my_reduce(memo = nil, &blk)
    # if a starting memo is not given, it defaults to the first element
    # in the list and that element is skipped for iteration
    # (drop and first work on any Enumerable, not just arrays)
    elements = memo.nil? ? drop(1) : self
    memo = first if memo.nil?
    elements.each { |element| memo = blk.call(memo, element) }
    memo
  end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example (this example uses yield, which is another way to call the block passed to a method):
def foo
  yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
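A small sketch of this leniency, contrasted with a lambda, which does check its argument count. The method name `call_with_one` is made up for the example.

```ruby
# Hypothetical helper that always yields exactly one value.
def call_with_one
  yield 1
end

# Extra block parameters are filled with nil.
padded = call_with_one { |a, b| [a, b] }

# A block may also ignore the yielded value entirely.
ignored = call_with_one { :no_params }

# A lambda, by contrast, is strict about how many arguments it gets.
strict = lambda { |a, b| [a, b] }
error = begin
  strict.call(1)
rescue ArgumentError => e
  e.class
end

p padded   # [1, nil]
p ignored  # :no_params
p error    # ArgumentError
```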
You can also use destructuring in the block's parameter list to define variables at the time the block runs. For example, if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works: a hash enumerates its entries as [key, value] pairs, the same pairs you get from to_a (e.g. {a: 1}.to_a # => [[:a, 1]]). reduce passes each pair as the second argument to the block. In the block's parameter list, the parenthesized (key, val) destructures the pair into separate key and value variables.
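Here is a short sketch of that destructuring, using an invented `prices` hash. It first shows the raw pairs and then splits them in the parameter list.

```ruby
prices = { apple: 3, pear: 5 }  # hypothetical data

# With a single block parameter, each element arrives as a [key, value] array.
pairs = prices.map { |pair| pair }

# Parentheses in the parameter list split that array into two variables.
total = prices.reduce(0) { |sum, (_name, price)| sum + price }

p pairs  # [[:apple, 3], [:pear, 5]]
p total  # 8
```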
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
  a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
  def inject(memo)
    each do |el|
      memo = yield memo, el
    end
    memo
  end
end
And memo is just memo
what do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass accumulator/memo as first argument and current collection element as second argument.
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
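To see that it is the method, not the block, that decides what each parameter receives, here is a minimal sketch. `each_conflict` is a hypothetical method written for this example; it only imitates the way merge calls its block on conflicting keys.

```ruby
# Hypothetical method: yields the key, the old value and the new value
# for every key present in both hashes.
def each_conflict(old_hash, new_hash)
  old_hash.each do |key, old_value|
    yield key, old_value, new_hash[key] if new_hash.key?(key)
  end
end

hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }

# The block receives values by position, whatever the names are.
seen = []
each_conflict(hash1, hash2) { |x, y, z| seen << [x, y, z] }
p seen  # [["b", 222, 333]]

# Renaming the parameters of a merge block changes nothing.
first  = hash1.merge(hash2) { |key, old, new| old }
second = hash1.merge(hash2) { |foo, bar, baz| bar }
p first == second  # true
```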
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive, temporary method that you create and execute in place; the names between the pipe characters (such as |memo|) are simply its parameter list.
There is no special concept behind the parameters; they are simply there for the method you are invoking to pass values to, like calling any other method with an argument. As with a method, the parameters are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it whatever arguments it was designed to pass. This is just like calling a method normally, with some slight differences, such as there being no "required" arguments. If the block declares no parameters, any values passed to it are silently discarded instead of raising an ArgumentError. Likewise, if the block declares more parameters than it is given, the extra ones will simply be nil within the block, with no error raised.

Understanding `array.map(&:method)` [duplicate]

Why does:
[1,2,3,4,5].map(&:to_s) #=> ["1", "2", "3", "4", "5"]
work but:
[1,2,3,4,5].map(&:*(2))
throws an unexpected syntax error?
& is called the to_proc operator. It calls the to_proc method on the expression that follows it and then passes the resulting Proc to the method as a block.
In the case of &:to_s, :to_s is a Symbol, so the operator calls Symbol#to_proc. The docs are a little garbled, but suffice it to say that these two expressions are more-or-less equivalent:
my_proc = :to_s.to_proc
my_proc = Proc.new {|obj| obj.to_s }
So the answer to the question "Why doesn't &:*(2) work?" is that the expression that follows the & operator, :*(2), isn't a valid Ruby expression. It makes about as much sense to the Ruby parser as "hello"(2).
There is, by the way, a way to do what you're trying to do:
[1,2,3,4,5].map(&2.method(:*))
# => [2, 4, 6, 8, 10]
In the above code, 2.method(:*) returns a reference to the * method of the object 2 as a Method object. Method objects behave a lot like Proc objects, and they respond to to_proc. However, the above isn't exactly equivalent—it does 2 * n rather than n * 2 (a distinction that doesn't matter if n is also a Numeric)—and it's not any more succinct or readable than {|n| n * 2 }, and so rarely worth the trouble.
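Since & simply calls to_proc on whatever follows it, any object that defines to_proc can be passed this way. The sketch below uses a made-up `Doubler` class to show this next to the Method-object and plain-block forms.

```ruby
# Hypothetical class: anything that responds to to_proc can follow &.
class Doubler
  def to_proc
    proc { |n| n * 2 }
  end
end

numbers = [1, 2, 3, 4, 5]

with_object = numbers.map(&Doubler.new)   # & calls Doubler#to_proc
with_method = numbers.map(&2.method(:*))  # & calls Method#to_proc
with_block  = numbers.map { |n| n * 2 }   # plain block

p with_object  # [2, 4, 6, 8, 10]
p with_object == with_method && with_method == with_block  # true
```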
Ampersand and object (&:method)
The & operator can also be used to pass an object as a block to a method, as in the following example:
arr = [ 1, 2, 3, 4, 5 ]
arr.map { |n| n.to_s }
arr.map &:to_s
Both the examples above have the same result. In both, the map method takes the arr array and a block, then it runs the block on each element of the array. The code inside the block runs to_s on each element, converting it from integers to strings. Then, the map method returns a new array containing the converted items.
The first example is common and widely used. The second example may look a bit cryptic at first glance. Let's see what's happening:
In Ruby, items prefixed with a colon (:) are symbols. If you are not familiar with the Symbol class/data type, I suggest you Google it and read a couple of articles before continuing. All method names in Ruby are internally stored as symbols. By prefixing a method name with a colon, we are neither converting the method into a symbol nor calling the method; we are just passing the name of the method around. In the example above, we are passing :to_s, which names the to_s method, to the ampersand (&) operator, which will create a proc (by calling to_proc under the hood). The proc takes a value as an argument, calls to_s on it and returns the value converted into a string.
Although the :to_s symbol is always the same, when running the map loop, it will refer to the to_s method of the class corresponding to each array item. If we passed an array such as [ 21, 4.453, :foobar, ] to the map method, the to_s method of the Integer class (Fixnum before Ruby 2.4) would be called on the first item, the to_s method of the Float class on the second item and the to_s method of the Symbol class on the third item. This makes sense because we are not passing the actual to_s method to the ampersand operator, just its name.
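A quick sketch of the mixed-array case just described. The class names in the comment assume Ruby 2.4 or later, where integers report Integer as their class.

```ruby
mixed = [21, 4.453, :foobar]

# The same symbol dispatches to a different to_s for each element.
strings = mixed.map(&:to_s)

# The class of each element decides which to_s implementation runs.
classes = mixed.map(&:class)

p strings  # ["21", "4.453", "foobar"]
p classes  # [Integer, Float, Symbol] on Ruby 2.4+
```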
Below is an example of creating a proc that takes an argument, calls a method on it and returns the result of the method.
p = :upcase.to_proc
p.call("foo bar")
Output:
=> "FOO BAR"
Let's review what is going on in arr.map &:to_s:
The :to_s symbol (which names the to_s method) is passed to the & operator, which creates a proc that will take an argument (an array item), call to_s on the argument and return the value converted into a string.
At each iteration of map, one item of the array (an integer) is passed to that proc.
The map method returns a new array containing the strings "1", "2", "3", "4" and "5".

undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)

I'm trying to return a list of values based on user defined arguments, from hashes defined in the local environment.
def my_method *args
  #initialize accumulator
  accumulator = Hash.new(0)
  #define hashes in local environment
  foo=Hash["key1"=>["var1","var2"],"key2"=>["var3","var4","var5"]]
  bar=Hash["key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"]]
  baz=Hash["key6"=>["var13","var14","var15","var16"]]
  #iterate over args and build accumulator
  args.each do |x|
    if foo.has_key?(x)
      accumulator=foo.assoc(x)
    elsif bar.has_key?(x)
      accumulator=bar.assoc(x)
    elsif baz.has_key?(x)
      accumulator=baz.assoc(x)
    else
      puts "invalid input"
    end
  end
  #convert accumulator to list, and return value
  return accumulator = accumulator.to_a {|k,v| [k].product(v).flatten}
end
The user is to call the method with arguments that are keywords, and the function should return a list of the values associated with each keyword received.
For instance
> my_method(key5,key6,key1)
=> ["var10","var11","var12","var13","var14","var15","var16","var1","var2"]
The output can be in any order. I received the following error when I tried to run the code:
undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)
Please would you point me how to troubleshoot this? In Terminal assoc performs exactly how I expect it to:
> foo.assoc("key1")
=> ["var1","var2"]
I'm guessing you're coming to Ruby from some other language, as there is a lot of unnecessary cruft in this method. Furthermore, it won't return what you expect for a variety of reasons.
accumulator = Hash.new(0): this is unnecessary, as (1) you're expecting an array to be returned, and (2) you don't need to pre-initialize variables in Ruby.
The Hash[...] syntax is unconventional in this context, and is typically used to convert some other enumerable (usually an array) into a hash, as in Hash[1,2,3,4] #=> { 1 => 2, 3 => 4}. When you're defining a hash, you can just use the curly brackets { ... }.
For every iteration of args, you're assigning accumulator to the result of the hash lookup instead of accumulating values (which, based on your example output, is what you need to do). Instead, you should be looking at various array concatenation methods like push, +=, <<, etc.
As it looks like you don't need the keys in the result, assoc is probably overkill. You would be better served with fetch or simple bracket lookup (hash[key]).
Finally, while you can call any method in Ruby with a block, as you've done with to_a, unless the method specifically yields a value to the block, Ruby will ignore it, so [k].product(v).flatten isn't actually doing anything.
I don't mean to be too critical - Ruby's syntax is extremely flexible but also relatively compact compared to other languages, which means it's easy to take it too far and end up with hard to understand and hard to maintain methods.
There is another side effect of how your method is constructed wherein the accumulator will only collect the values from the first hash that has a particular key, even if more than one hash has that key. Since I don't know if that's intentional or not, I'll preserve this functionality.
Here is a version of your method that returns what you expect:
def my_method(*args)
  foo = { "key1"=>["var1","var2"],"key2"=>["var3","var4","var5"] }
  bar = { "key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"] }
  baz = { "key6"=>["var13","var14","var15","var16"] }
  # reversed so that, for a duplicate key, the first hash wins
  merged = [foo, bar, baz].reverse.inject({}, :merge)
  args.inject([]) do |array, key|
    array + Array(merged[key])
  end
end
In general, I wouldn't define a method with built-in data, but I'm going to leave it in to be closer to your original method. Hash#merge combines two hashes and overwrites any duplicate keys in the original hash with those in the argument hash. Array(nil) returns an empty array, so a key that is not present simply contributes nothing and you don't need to handle that case explicitly.
I would encourage you to look up the inject method - it's quite versatile and is useful in many situations. inject uses its own accumulator variable (optionally defined as an argument) which is yielded to the block as the first block parameter.
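For orientation, here are a few common shapes of inject in one place. This is only a sketch; the `lookup` hash and its keys are invented and merely echo the data in the question.

```ruby
# With an explicit starting value, the accumulator begins at 0.
sum = (1..10).inject(0) { |memo, n| memo + n }

# Without one, the first element becomes the initial accumulator.
product = (1..5).inject { |memo, n| memo * n }

# A symbol names the method used to combine accumulator and element.
sum_again = (1..10).inject(:+)

# Accumulating into an array; Array(nil) is [], so unknown keys add nothing.
lookup = { "key1" => ["var1", "var2"], "key6" => ["var13"] }  # hypothetical data
values = ["key6", "key1", "missing"].inject([]) do |list, key|
  list + Array(lookup[key])
end

p sum      # 55
p product  # 120
p values   # ["var13", "var1", "var2"]
```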

What does << mean in Ruby?

I have code:
def make_all_thumbs(source)
  sizes = ['1000','1100','1200','800','600']
  threads = []
  sizes.each do |s|
    threads << Thread.new(s) {
      create_thumbnail(source+'.png', source+'-'+s+'.png', s)
    }
  end
end
what does << mean?
It can have 3 distinct meanings:
'<<' as an ordinary method
In most cases '<<' is a method defined like the rest of them; in your case it means "add to the end of this array".
That's in your particular case, but there are also a lot of other occasions where you'll encounter the "<<" method. I won't call it 'operator' since it's really a method that is defined on some object that can be overridden by you or implemented for your own objects. Other cases of '<<'
String concatenation: "a" << "b"
Writing output to an IO: io << "A line of text\n"
Writing data to a message digest, HMAC or cipher: sha << "Text to be hashed"
left-shifting of an OpenSSL::BN: bn << 2
...
Singleton class definition
Then there is the mysterious shift of the current scope (=change of self) within the program flow:
class A
  class << self
    puts self # self is the singleton class of A
  end
end
a = A.new
class << a
  puts self # now it's the singleton class of object a
end
The mystery class << self made me wonder and investigate about the internals there. Whereas in all the examples I mentioned << is really a method defined in a class, i.e.
obj << stuff
is equivalent to
obj.<<(stuff)
the class << self (or any object in place of self) construct is truly different. It is really a builtin feature of the language itself, in CRuby it's defined in parse.y as
k_class tLSHFT expr
k_class is the 'class' keyword, where tLSHFT is a '<<' token and expr is an arbitrary expression. That is, you can actually write
class << <any expression>
and will get shifted into the singleton class of the result of the expression. The tLSHFT sequence will be parsed as a 'NODE_SCLASS' expression, which is called a Singleton Class definition (cf. node.c)
case NODE_SCLASS:
ANN("singleton class definition");
ANN("format: class << [nd_recv]; [nd_body]; end");
ANN("example: class << obj; ..; end");
F_NODE(nd_recv, "receiver");
LAST_NODE;
F_NODE(nd_body, "singleton class definition");
break;
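In everyday code the singleton class syntax is mostly used to define methods on a single object. A minimal sketch follows; `Greeter` and its method names are made up for the example.

```ruby
# Hypothetical class used only for illustration.
class Greeter
  class << self
    # Defined on Greeter's singleton class, so this is a class method.
    def default
      "hello"
    end
  end
end

greeter = Greeter.new
class << greeter
  # Defined on this one object's singleton class only.
  def shout
    "HELLO"
  end
end

p Greeter.default                  # "hello"
p greeter.shout                    # "HELLO"
p Greeter.new.respond_to?(:shout)  # false
```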
Here Documents
Here Documents use '<<' in a way that is again totally different. You can define a string that spans over multiple lines conveniently by declaring
here_doc = <<_EOS_
The quick brown fox jumps over the lazy dog.
...
_EOS_
To distinguish the 'here doc operator' an arbitrary String delimiter has to immediately follow the '<<'. Everything in between that initial delimiter and the second occurrence of that same delimiter will be part of the final string. It is also possible to use '<<-', the difference being that the closing delimiter may then be indented; the text itself keeps its leading whitespace. To strip the common leading indentation from the text as well, use '<<~' (available since Ruby 2.3).
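A small sketch comparing two heredoc variants side by side. The delimiter name EOS is arbitrary, and `<<~` needs Ruby 2.3 or later.

```ruby
# <<- lets the closing delimiter be indented; the text keeps its indentation.
dash = <<-EOS
    kept as written
    EOS

# <<~ also strips the common leading indentation from the text.
squiggly = <<~EOS
    indentation removed
EOS

p dash      # "    kept as written\n"
p squiggly  # "indentation removed\n"
```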
Mostly used in arrays to append the value to the end of the array.
a = ["orange"]
a << "apple"
p a
prints ["orange", "apple"].
'a << b' means append b to the end of a
It's the operator which allows you to feed existing arrays, by appending new items.
In the example above you are just populating the empty array threads with 5 new threads.
In Ruby there is usually more than one way to do things, and Ruby has some nice shortcuts for common method names. This one is a shortcut for .push: instead of typing out the method name, you can simply use <<. For arrays, .push and << both append in place; for strings, << appends in place, while += builds a new string and reassigns the variable.
Like you can see in this example:
alphabet = ["a", "b", "c"]
alphabet << "d" # Update me!
alphabet.push("e") # Update me!
print alphabet
caption = "the boy is surrounded by "
caption << "weezards!" # Me, too!
caption += " and more. " # Me, too!
# .push cannot be used to concatenate strings
print caption
so you see the result is:
["a", "b", "c", "d", "e"]
the boy is surrounded by weezards! and more.
You can use the << operator to push an element into an array or to concatenate one string onto another.
So what this is doing is creating a new Thread object and pushing it into the array:
threads << Thread.new(s) {
create_thumbnail(source+'.png', source+'-'+s+'.png', s)
}
In Ruby, the '<<' operator is basically used for:
Appending a value to the array (at the last position)
[2, 4, 6] << 8
It will give [2, 4, 6, 8]
It is also used for some Active Record operations in Rails. For example, say we have Cart and LineItem models associated as cart has_many line_items. Cart.find(A).line_items will return an association collection with the line items that belong to cart 'A'.
Now, to add (that is, to associate) another line_item (X) to cart (A):
Cart.find(A).line_items << LineItem.find(X)
Now to add another LineItem to the same cart 'A', this time without creating and saving the record ourselves beforehand:
Cart.find(A).line_items << LineItem.new
In the above code, << saves the object and appends it to the association collection on its left-hand side.
And many others which are already covered in above answers.
Also, since Ruby 2.6, the << method is defined on Proc as well.
Proc#<< lets you compose two or more procs.
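A minimal sketch of proc composition, assuming Ruby 2.6 or later. The procs `double` and `increment` are made up for the example.

```ruby
double    = proc { |n| n * 2 }
increment = proc { |n| n + 1 }

# f << g calls g first, then f with g's result.
double_after_increment = double << increment

# f >> g goes the other way: f first, then g.
increment_after_double = double >> increment

p double_after_increment.call(3)  # 8, i.e. double(increment(3))
p increment_after_double.call(3)  # 7, i.e. increment(double(3))
```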
It means add to the end (append).
a = [1,2,3]
a << 4
# a is now [1, 2, 3, 4]

Elegant way of duck-typing strings, symbols and arrays?

This is for an already existing public API that I cannot break, but I do wish to extend.
Currently the method takes a string or a symbol or anything else that makes sense when passed as the first parameter to send
I'd like to add the ability to send a list of strings, symbols, et cetera. I could just use is_a? Array, but there are other ways of sending lists, and that's not very ruby-ish.
I'll be calling map on the list, so the first inclination is to use respond_to? :map. But a string also responds to :map, so that won't work.
How about treating them all as Arrays? The behavior you want for Strings is the same as for an Array containing only that String:
def foo(obj, arg)
  [*arg].each { |method| obj.send(method) }
end
The [*arg] trick works because the splat operator (*) turns a single element into itself or an Array into an inline list of its elements.
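The splat behaviour can be checked directly. In the sketch below, `call_each` is a hypothetical variant of foo that uses map so the results are visible.

```ruby
single = :upcase
list   = [:upcase, :length]

p [*single]  # [:upcase]
p [*list]    # [:upcase, :length]

# Hypothetical variant of foo that collects each call's result.
def call_each(obj, arg)
  [*arg].map { |name| obj.send(name) }
end

p call_each("abc", single)  # ["ABC"]
p call_each("abc", list)    # ["ABC", 3]
```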
Later
This is basically just a syntactically sweetened version of Arnaud's answer, though there are subtle differences if you pass an Array containing other Arrays.
Later still
There's an additional difference having to do with foo's return value. If you call foo(bar, :baz), you might be surprised to get [:baz] back. To solve this, you can add a Kestrel (returning comes from older versions of ActiveSupport; plain Ruby's Object#tap serves the same purpose):
def foo(obj, arg)
  returning(arg) do |args|
    [*args].each { |method| obj.send(method) }
  end
end
which will always return arg as passed. Or you could do returning(obj) so you could chain calls to foo. It's up to you what sort of return-value behavior you want.
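Outside of ActiveSupport, the same return-value behaviour can be sketched with Object#tap, which yields its receiver to the block and then returns the receiver. The array `items` below is made-up test data.

```ruby
# Same idea as above, using tap so that arg is returned unchanged.
def foo(obj, arg)
  arg.tap { |names| [*names].each { |name| obj.send(name) } }
end

items = [3, 1, 2]  # hypothetical receiver with mutating methods

p foo(items, [:sort!, :reverse!])  # [:sort!, :reverse!]
p items                            # [3, 2, 1]

p foo(items, :clear)               # :clear
p items                            # []
```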
A critical detail that was overlooked in all of the answers: in Ruby 1.9 and later, strings do not respond to :map, so the simplest answer is in the original question: just use respond_to? :map.
In Ruby 1.8, Array and String are both Enumerable, so there's no elegant way to say "a thing that's an Enumerable, but not a String," at least not in the way being discussed.
What I would do is duck-type for Enumerable (respond_to? :[]) and then use a case statement, like so:
def foo(obj, arg)
  if arg.respond_to?(:[])
    case arg
    when String then obj.send(arg)
    else arg.each { |method_name| obj.send(method_name) }
    end
  end
end
or even cleaner:
def foo(obj, arg)
  case arg
  when String then obj.send(arg)
  when Enumerable then arg.each { |method| obj.send(method) }
  else nil
  end
end
Perhaps the question wasn't clear enough, but a night's sleep showed me two clean ways to answer this question.
1: to_sym is available on String and Symbol and should be available on anything that quacks like a string.
if arg.respond_to? :to_sym
  obj.send(arg, ...)
else
  # do array stuff
end
2: send throws TypeError when passed an array.
begin
  obj.send(arg, ...)
rescue TypeError
  # do array stuff
end
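A runnable sketch of the second approach follows. `call_one_or_many` is a hypothetical name, and the extra arguments elided above are left out. One caveat of this design: a TypeError raised inside the called method itself would also be rescued and retried as a list.

```ruby
# Hypothetical wrapper: try arg as a single method name first,
# and fall back to treating it as a list if send rejects it.
def call_one_or_many(obj, arg)
  [obj.send(arg)]
rescue TypeError
  # send refused arg as a method name, so treat it as a list of names.
  arg.map { |name| obj.send(name) }
end

p call_one_or_many("abc", :upcase)             # ["ABC"]
p call_one_or_many("abc", "length")            # [3]
p call_one_or_many("abc", [:upcase, :length])  # ["ABC", 3]
```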
I particularly like #2. I severely doubt any of the users of the old API are expecting TypeError to be raised by this method...
Let's say your function is named func
I would make an array from the parameters with
def func(param)
  a = Array.new
  a << param
  a.flatten!
  func_array(a)
end
You end up with implementing your function func_array for arrays only
with func("hello world") you'll get a.flatten! => [ "hello world" ]
with func(["hello", "world"] ) you'll get a.flatten! => [ "hello", "world" ]
Can you just switch behavior based on parameter.class.name? It's ugly, but if I understand correctly, you have a single method that you'll be passing multiple types to - you'll have to differentiate somehow.
Alternatively, just add a method that handles an array type parameter. It's slightly different behavior so an extra method might make sense.
Use Marshal to serialize your objects before sending these.
If you don't want to monkeypatch, just massage the list to an appropriate string before the send. If you don't mind monkeypatching or inheriting, but want to keep the same method signature:
class ToBePatched
  alias_method :__old_takes_a_string, :takes_a_string
  # since the old method wanted only a string, check for a string and call the old method
  # otherwise do your business with the map on things that respond to map.
  def takes_a_string( string_or_mappable )
    return __old_takes_a_string( string_or_mappable ) if String === string_or_mappable
    raise ArgumentError unless string_or_mappable.respond_to?( :map )
    # do whatever you wish to do
  end
end
Between those 3 types I'd do this
is_array = var.respond_to?(:to_h)
is_string = var.respond_to?(:each_char)
is_symbol = var.respond_to?(:to_proc)
Should give a unique answer for [], :sym, 'str'
