Ruby: tap writes on a read? - ruby

So if I understand correctly Object#tap uses yield to produce a temporary object to work with during the execution of a process or method. From what I think I know about yield, it does something like, yield takes (thing) and gives (thing).dup to the block attached to the method it's being used in?
But when I do this:
class Klass
attr_accessor :hash
def initialize
#hash={'key' => 'value'}
end
end
instance=Klass.new
instance.instance_variable_get('#hash')[key] # => 'value', as it should
instance.instance_variable_get('#hash').tap {|pipe| pipe['key']=newvalue}
instance.instance_variable_get('#hash')[key] # => new value... wut?
I was under the impression that yield -> new_obj. I don't know how correct this is though, I tried to look it up on ruby-doc, but Enumerator::yielder is empty, yield(proc) isn't there, and the fiber version... I don't have any fibers, in fact, doesn't Ruby actually explicitly require include 'fiber' to use them?
So what ought have been a read method on the instance variable and a write on the temp is instead a read/write on the instance variable... which is cool, because that's what I was trying to do and accidentally found when I was looking up a way to deal with hashes as instance variables (for some larger-than-I'm-used-to tables for named arrays of variables), but now I'm slightly confused, and I can't find a description of the mechanism that's making this happen.

Object#tap couldn't be simpler:
VALUE
rb_obj_tap(VALUE obj)
{
rb_yield(obj);
return obj;
}
(from the documentation). It just yields and then returns the receiver. A quick check in IRB shows that yield yields the object itself rather than a new object.
def foo
x = {}
yield x
x
end
foo { |y| y['key'] = :new_value }
# => {"key" => :new_value }
So the behavior of tap is consistent with yield, as we would hope.

tap does not duplicate the receiver. The block variable is assigned the very receiver itself. Then, tap returns the receiver. So when you do tap{|pipe| pipe['key']=newvalue}, the receiver of tap is modified. To my understanding,
x.tap{|x| foo(x)}
is equivalent to:
foo(x); x
and
y.tap{|y| y.bar}
is equivalent to:
y.bar; y

Related

What is the purpose of the 'proc' parameter in Marshal::load?

I have looked for resources which explain its purpose. I couldn't find any real world implementations either.
Below is the extract from Ruby's documentation:
load( source [, proc] ) → obj Returns the result of converting the
serialized data in source into a Ruby object (possibly with associated
subordinate objects). source may be either an instance of IO or an
object that responds to to_str. If proc is specified, each object will
be passed to the proc, as the object is being deserialized.
I would appreciate an example of its usage, or at least direct me to some resources.
You can see how the proc is invoked by doing something like:
irb(main):030:0> Marshal.load(Marshal.dump(a:1), lambda { |x| p [self,x]; x })
[main, :a]
[main, 1]
[main, {:a=>1}]
=> {:a=>1}
When used with a marshalled string, the proc is for some reason invoked twice.
irb(main):031:0> Marshal.load(Marshal.dump('a'), lambda { |x| p [self,x]; x })
[main, "a"]
[main, true]
=> "a"
Transformation of Deserialized Objects
A general use case is that you want to perform some action or transformation on the object you're deserializing. For example, using some Ruby 2.7.1 shortcuts:
Marshal.load Marshal.dump("abc"), ->{ _1.to_s.upcase }
#=> "ABC"
This doesn't add much value when deserializing a single object, but could be very useful if you're handling dumped objects in bulk. I can't think of a pragmatic use case where you couldn't transform after deserializing, but Ruby is full of useful tools for handling things without intermediate steps. This seems to be one of them.
Possible Bug: Procs Appear to Run Twice but Return Once
In the example above, I coerce the first positional argument to a string because otherwise I get a NoMethodError on one of the two passes this is making through the lambda. You can sort of unpack what's going on (but perhaps not why) as follows:
prc = proc { |obj| pp obj }
Marshal.load Marshal.dump("abc"), prc
"abc"
true
#=> "abc"
For whatever reason, the body of a Proc or lambda is called twice, but only returns the first pass. The problem occurs when the second invocation raises a NoMethodError exception when called on TrueClass, so the call never returns a value.
Another way to handle this is to explicitly handle the exception, e.g.:
prc = proc { |obj| obj.upcase rescue NoMethodError }
or to avoid invoking methods on true:
prc = proc { |obj| obj.upcase unless obj == true }
While I can explain what is going on, and how to work around it, I can't tell you why invoking with a proc-like object behaves this way. That's a question for the Ruby Core Team, or fodder for the Ruby bug tracker.

Can `merge!` change a CONST in Ruby?

I have a situation where merge! seems to modify the value of a CONST. Can this occur? How?
I'm doing some API ingestion and mapping, like you do...
module Placement
FEATURE_DEFAULTS = {
"thingone" => "false",
"thingtwo" => "false"
}
def extract_features!(feat)
feat['norm_features'] ||= FEATURE_DEFAULTS
feat['norm_features'].merge!(
Array(feat['attributes']['feature']).reduce({}) do |h,f|
h[f] = "true"
h
end
)
end
def get_placement(_opts)
data_source["things"]["thing"].map do |thing|
product = {}
thing.each do |key, value|
new_key = RENAME_FIELDS[key] || key
new_value = REPLACE_FIELDS[key] || value
product[new_key] = new_value
end
binding.pry # 1
extract_features!(product)
binding.pry # 2
product
end
end
end
Later, I include this in a class for the API client, and then I call the get_placement method.
Dilemma
step 1
For the first run, in pry binding 1 & 2, the value of FEATURE_DEFAULTS is as seen above. For 2 the value of FEATURE_DEFAULTS is the same, and the value of product['norm_features'] is the same (plus the results of the meshing operation of extract_features!
step 2
The output (to the caller of get_placement), for every thing/product, is
FEATURE_DEFAULTS = {
"thingone" => true,
"thingtwo" => true
}
step 3
When I run this the second time (after starting up the service / app), the value of FEATURE_DEFAULTS, and pry binding 1 & 2, is
FEATURE_DEFAULTS = {
"thingone" => true,
"thingtwo" => true
}
What is happening here?
This seems to confirm that, after running the extract_features! method, the FEATURE_DEFAULTS CONST is changed. If I do not use merge! in extract_features!, and use merge instead, then the CONST value does not change.
I can post more code, if needed, or most to a Gist.
Ruby MRI 2.2.2
I am doing this inside a rails app, but I don't see that it matters.
I have a situation where merge! seems to modify the value of a CONST. Can this occur? How?
No, in general, methods cannot change variable bindings, regardless of whether those variables are local variables, instance variables, class variables, global variables, or constants. Variables aren't objects in Ruby, you can't call methods on them, you can't pass them as arguments to methods, ergo, you can't tell them to change themselves.
The exception to this are meta-programming methods like Binding#local_variable_set, Object#instance_variable_set, Module#class_variable_set, or Module#const_set.
What you can do, however, and what you are doing in this case, is tell the object the variable points to to change itself. The documentation for Hash#merge! is unfortunately a bit unclear in that it does not explicitly mention that Hash#merge! mutates its receiver.
This seems to confirm that, after running the extract_features! method, the FEATURE_DEFAULTS CONST is changed. If I do not use merge! in extract_features!, and use merge instead, then the CONST value does not change.
No, it only confirms that the state of the object FEATURE_DEFAULTS points to changed, it doesn't say anything about whether the binding of FEATURE_DEFAULTS, i.e. which object it points to, changed. You can confirm that the constant still points to the same object by looking at its object_id.
Of course, constants can be re-assigned anyway, albeit triggering a warning.

undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)

I'm trying to return a list of values based on user defined arguments, from hashes defined in the local environment.
def my_method *args
#initialize accumulator
accumulator = Hash.new(0)
#define hashes in local environment
foo=Hash["key1"=>["var1","var2"],"key2"=>["var3","var4","var5"]]
bar=Hash["key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"]]
baz=Hash["key6"=>["var13","var14","var15","var16"]]
#iterate over args and build accumulator
args.each do |x|
if foo.has_key?(x)
accumulator=foo.assoc(x)
elsif bar.has_key?(x)
accumulator=bar.assoc(x)
elsif baz.has_key?(x)
accumulator=baz.assoc(x)
else
puts "invalid input"
end
end
#convert accumulator to list, and return value
return accumulator = accumulator.to_a {|k,v| [k].product(v).flatten}
end
The user is to call the method with arguments that are keywords, and the function to return a list of values associated with each keyword received.
For instance
> my_method(key5,key6,key1)
=> ["var10","var11","var12","var13","var14","var15","var16","var1","var2"]
The output can be in any order. I received the following error when I tried to run the code:
undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)
Please would you point me how to troubleshoot this? In Terminal assoc performs exactly how I expect it to:
> foo.assoc("key1")
=> ["var1","var2"]
I'm guessing you're coming to Ruby from some other language, as there is a lot of unnecessary cruft in this method. Furthermore, it won't return what you expect for a variety of reasons.
`accumulator = Hash.new(0)`
This is unnecessary, as (1), you're expecting an array to be returned, and (2), you don't need to pre-initialize variables in ruby.
The Hash[...] syntax is unconventional in this context, and is typically used to convert some other enumerable (usually an array) into a hash, as in Hash[1,2,3,4] #=> { 1 => 2, 3 => 4}. When you're defining a hash, you can just use the curly brackets { ... }.
For every iteration of args, you're assigning accumulator to the result of the hash lookup instead of accumulating values (which, based on your example output, is what you need to do). Instead, you should be looking at various array concatenation methods like push, +=, <<, etc.
As it looks like you don't need the keys in the result, assoc is probably overkill. You would be better served with fetch or simple bracket lookup (hash[key]).
Finally, while you can call any method in Ruby with a block, as you've done with to_a, unless the method specifically yields a value to the block, Ruby will ignore it, so [k].product(v).flatten isn't actually doing anything.
I don't mean to be too critical - Ruby's syntax is extremely flexible but also relatively compact compared to other languages, which means it's easy to take it too far and end up with hard to understand and hard to maintain methods.
There is another side effect of how your method is constructed wherein the accumulator will only collect the values from the first hash that has a particular key, even if more than one hash has that key. Since I don't know if that's intentional or not, I'll preserve this functionality.
Here is a version of your method that returns what you expect:
def my_method(*args)
foo = { "key1"=>["var1","var2"],"key2"=>["var3","var4","var5"] }
bar = { "key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"] }
baz = { "key6"=>["var13","var14","var15","var16"] }
merged = [foo, bar, baz].reverse.inject({}, :merge)
args.inject([]) do |array, key|
array += Array(merged[key])
end
end
In general, I wouldn't define a method with built-in data, but I'm going to leave it in to be closer to your original method. Hash#merge combines two hashes and overwrites any duplicate keys in the original hash with those in the argument hash. The Array() call coerces an array even when the key is not present, so you don't need to explicitly handle that error.
I would encourage you to look up the inject method - it's quite versatile and is useful in many situations. inject uses its own accumulator variable (optionally defined as an argument) which is yielded to the block as the first block parameter.

Defining method in Ruby with equals

Being new to Ruby, I'm having trouble explaining to myself the behavior around method definitions within Ruby.
The example is noted below...
class Foo
def do_something(action)
action.inspect
end
def do_something_else=action
action.inspect
end
end
?> f.do_something("drive")
=> "\"drive\""
?> f.do_something_else=("drive")
=> "drive"
The first example is self explanatory. What Im trying to understand is the behavior of the second example. Other than what looks to be one producing a string literal and the other is not, what is actually happening? Why would I use one over the other?
Generally, do_something is a getter, and do_something= is a setter.
class Foo
attr_accessor :bar
end
is equivalent to
class Foo
def bar
#bar
end
def bar=(value)
#bar = value
end
end
To answer your question about the difference in behavior, methods that end in = always return the right hand side of the expression. In this case returning action, not action.inspect.
class Foo
def do_something=(action)
"stop"
end
end
?> f = Foo.new
?> f.do_something=("drive")
=> "drive"
Both of your methods are actually being defined and called as methods. Quite a lot of things in Ruby can be defined as methods, even the operators such as +, -, * and /. Ruby allows methods to have three special notational suffixes. I made that phrase up all by myself. What I mean by notational suffixes is that the thing on the end of the method will indicate how that method is supposed to work.
Bang!
The first notational suffix is !. This indicates that the method is supposed to be destructive, meaning that it modifies the object that it's called on. Compare the output of these two scripts:
a = [1, 2, 3]
a.map { |x| x * x }
a
And:
a = [1, 2, 3]
a.map! { |x| x * x }
a
There's a one character difference between the two scripts, but they operate differently! The first one will still go through each element in the array and perform the operation inside the block, but the object in a will still be the same [1,2,3] that you started with.
In the second example, however, the a at the end will instead be [1, 4, 9] because map! modified the object in place!
Query
The second notational suffix is ?, and that indicates that a method is used to query an object about something, and means that the method is supposed to return true, false or in some extreme circumstances, nil.
Now, note that the method doesn't have to return true or false... it's just that it'd be very nice if it did that!
Proof:
def a?
true
end
def b?
"moo"
end
Calling a? will return true, and calling b? will return "moo". So there, that's query methods. The methods that should return true or false but sometimes can return other things because some developers don't like other developers.
Setters!
NOW we get to the meat of your (paraphrased) question: what does = mean on the end of a method?
That usually indicates that a method is going to set a particular value, as Erik already outlined before I finished typing this essay of an answer.
However, it may not set one, just like the query methods may not return true or false. It's just convention.
You can call that setter method like this also:
foo.something_else="value"
Or (my favourite):
foo.something_else = "value"
In theory, you can actually ignore the passed in value, just like you can completely ignore any arguments passed into any method:
def foo?(*args)
"moo"
end
>> foo?(:please, :oh, :please, :why, :"won't", :you, :use, :these, :arguments, :i, :got, :just, :for, :you, :question_mark?)
=> "moo"
Ruby supports all three syntaxes for setter methods, although it's very rare to see the one you used!
Well, I hope this answer's been roughly educational and that you understand more things about Ruby now. Enjoy!
You cannot define a return value for assignment methods. The return value is always the same as the value passed in, so that assignment chains (x = y = z = 3) will always work.
Typically, you would omit the brackets when you invoke the method, so that it behaves like a property:
my_value = f.do_something= "drive"
def do_something_else=action
action.inspect
end
This defines a setter method, so do_something_else appears as though we are initializing a attribute. So the value initialized is directly passed,

Access variables programmatically by name in Ruby

I'm not entirely sure if this is possible in Ruby, but hopefully there's an easy way to do this. I want to declare a variable and later find out the name of the variable. That is, for this simple snippet:
foo = ["goo", "baz"]
How can I get the name of the array (here, "foo") back? If it is indeed possible, does this work on any variable (e.g., scalars, hashes, etc.)?
Edit: Here's what I'm basically trying to do. I'm writing a SOAP server that wraps around a class with three important variables, and the validation code is essentially this:
[foo, goo, bar].each { |param|
if param.class != Array
puts "param_name wasn't an Array. It was a/an #{param.class}"
return "Error: param_name wasn't an Array"
end
}
My question is then: Can I replace the instances of 'param_name' with foo, goo, or bar? These objects are all Arrays, so the answers I've received so far don't seem to work (with the exception of re-engineering the whole thing ala dbr's answer)
What if you turn your problem around? Instead of trying to get names from variables, get the variables from the names:
["foo", "goo", "bar"].each { |param_name|
param = eval(param_name)
if param.class != Array
puts "#{param_name} wasn't an Array. It was a/an #{param.class}"
return "Error: #{param_name} wasn't an Array"
end
}
If there were a chance of one the variables not being defined at all (as opposed to not being an array), you would want to add "rescue nil" to the end of the "param = ..." line to keep the eval from throwing an exception...
You need to re-architect your solution. Even if you could do it (you can't), the question simply doesn't have a reasonable answer.
Imagine a get_name method.
a = 1
get_name(a)
Everyone could probably agree this should return 'a'
b = a
get_name(b)
Should it return 'b', or 'a', or an array containing both?
[b,a].each do |arg|
get_name(arg)
end
Should it return 'arg', 'b', or 'a' ?
def do_stuff( arg )
get_name(arg)
do
do_stuff(b)
Should it return 'arg', 'b', or 'a', or maybe the array of all of them? Even if it did return an array, what would the order be and how would I know how to interpret the results?
The answer to all of the questions above is "It depends on the particular thing I want at the time." I'm not sure how you could solve that problem for Ruby.
It seems you are trying to solve a problem that has a far easier solution..
Why not just store the data in a hash? If you do..
data_container = {'foo' => ['goo', 'baz']}
..it is then utterly trivial to get the 'foo' name.
That said, you've not given any context to the problem, so there may be a reason you can't do this..
[edit] After clarification, I see the issue, but I don't think this is the problem.. With [foo, bar, bla], it's equivalent like saying ['content 1', 'content 2', 'etc']. The actual variables name is (or rather, should be) utterly irrelevant. If the name of the variable is important, that is exactly why hashes exist.
The problem isn't with iterating over [foo, bar] etc, it's the fundamental problem with how the SOAP server is returing the data, and/or how you're trying to use it.
The solution, I would say, is to either make the SOAP server return hashes, or, since you know there is always going to be three elements, can you not do something like..
{"foo" => foo, "goo" => goo, "bar"=>bar}.each do |param_name, param|
if param.class != Array
puts "#{param_name} wasn't an Array. It was a/an #{param.class}"
puts "Error: #{param_name} wasn't an Array"
end
end
OK, it DOES work in instance methods, too, and, based on your specific requirement (the one you put in the comment), you could do this:
local_variables.each do |var|
puts var if (eval(var).class != Fixnum)
end
Just replace Fixnum with your specific type checking.
I do not know of any way to get a local variable name. But, you can use the instance_variables method, this will return an array of all the instance variable names in the object.
Simple call:
object.instance_variables
or
self.instance_variables
to get an array of all instance variable names.
Building on joshmsmoore, something like this would probably do it:
# Returns the first instance variable whose value == x
# Returns nil if no name maps to the given value
def instance_variable_name_for(x)
self.instance_variables.find do |var|
x == self.instance_variable_get(var)
end
end
There's Kernel::local_variables, but I'm not sure that this will work for a method's local vars, and I don't know that you can manipulate it in such a way as to do what you wish to acheive.
Great question. I fully understand your motivation. Let me start by noting, that there are certain kinds of special objects, that, under certain circumstances, have knowledge of the variable, to which they have been assigned. These special objects are eg. Module instances, Class instances and Struct instances:
Dog = Class.new
Dog.name # Dog
The catch is, that this works only when the variable, to which the assignment is performed, is a constant. (We all know that Ruby constants are nothing more than emotionally sensitive variables.) Thus:
x = Module.new # creating an anonymous module
x.name #=> nil # the module does not know that it has been assigned to x
Animal = x # but will notice once we assign it to a constant
x.name #=> "Animal"
This behavior of objects being aware to which variables they have been assigned, is commonly called constant magic (because it is limited to constants). But this highly desirable constant magic only works for certain objects:
Rover = Dog.new
Rover.name #=> raises NoMethodError
Fortunately, I have written a gem y_support/name_magic, that takes care of this for you:
# first, gem install y_support
require 'y_support/name_magic'
class Cat
include NameMagic
end
The fact, that this only works with constants (ie. variables starting with a capital letter) is not such a big limitation. In fact, it gives you freedom to name or not to name your objects at will:
tmp = Cat.new # nameless kitty
tmp.name #=> nil
Josie = tmp # by assigning to a constant, we name the kitty Josie
tmp.name #=> :Josie
Unfortunately, this will not work with array literals, because they are internally constructed without using #new method, on which NameMagic relies. Therefore, to achieve what you want to, you will have to subclass Array:
require 'y_support/name_magic'
class MyArr < Array
include NameMagic
end
foo = MyArr.new ["goo", "baz"] # not named yet
foo.name #=> nil
Foo = foo # but assignment to a constant is noticed
foo.name #=> :Foo
# You can even list the instances
MyArr.instances #=> [["goo", "baz"]]
MyArr.instance_names #=> [:Foo]
# Get an instance by name:
MyArr.instance "Foo" #=> ["goo", "baz"]
MyArr.instance :Foo #=> ["goo", "baz"]
# Rename it:
Foo.name = "Quux"
Foo.name #=> :Quux
# Or forget the name again:
MyArr.forget :Quux
Foo.name #=> nil
# In addition, you can name the object upon creation even without assignment
u = MyArr.new [1, 2], name: :Pair
u.name #=> :Pair
v = MyArr.new [1, 2, 3], ɴ: :Trinity
v.name #=> :Trinity
I achieved the constant magic-imitating behavior by searching all the constants in all the namespaces of the current Ruby object space. This wastes a fraction of second, but since the search is performed only once, there is no performance penalty once the object figures out its name. In the future, Ruby core team has promised const_assigned hook.
You can't, you need to go back to the drawing board and re-engineer your solution.
Foo is only a location to hold a pointer to the data. The data has no knowledge of what points at it. In Smalltalk systems you could ask the VM for all pointers to an object, but that would only get you the object that contained the foo variable, not foo itself. There is no real way to reference a vaiable in Ruby. As mentioned by one answer you can stil place a tag in the data that references where it came from or such, but generally that is not a good apporach to most problems. You can use a hash to receive the values in the first place, or use a hash to pass to your loop so you know the argument name for validation purposes as in DBR's answer.
The closest thing to a real answer to you question is to use the Enumerable method each_with_index instead of each, thusly:
my_array = [foo, baz, bar]
my_array.each_with_index do |item, index|
if item.class != Array
puts "#{my_array[index]} wasn't an Array. It was a/an #{item.class}"
end
end
I removed the return statement from the block you were passing to each/each_with_index because it didn't do/mean anything. Each and each_with_index both return the array on which they were operating.
There's also something about scope in blocks worth noting here: if you've defined a variable outside of the block, it will be available within it. In other words, you could refer to foo, bar, and baz directly inside the block. The converse is not true: variables that you create for the first time inside the block will not be available outside of it.
Finally, the do/end syntax is preferred for multi-line blocks, but that's simply a matter of style, though it is universal in ruby code of any recent vintage.

Resources