Strange behavior: Hash's keys cancel dynamic method definition - ruby

Let's say I want some instances of String to behave differently from other, "normal" instances - for example, to cancel the effect of the "upcase" method. I do the following:
class String
  def foo
    def self.upcase
      self
    end
    self
  end
end
It seems to work fine, and the way I need it:
puts "bar".upcase #=> "BAR"
puts "bar".foo.upcase #=> "bar"
However, as soon as I use the tricked instance of the String as a key for a Hash, the behavior starts looking weird to me:
puts ({"bar".foo => "code"}).keys.first.upcase #=> "BAR", not "bar"!
... which is as if the foo method is ignored, and the original instance of String is used as the key.
Anyone can see what's going on here? Thanks a bunch!

Ruby's Hash has a special case for strings used as hash keys: unless the string is already frozen, it stores a frozen copy of the string instead of the object you passed in. The copy does not carry the singleton methods defined on the original, so upcase works normally on it.
Basically it's to protect you from using a string object as a key and then altering that string later in the code, which could lead to some confusing situations. Mutable keys get tricky.
Rather than hacking a method onto String that returns an altered instance, I would create a subclass of String that overrides upcase and build the key from that.
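A minimal sketch of the copy described above, assuming the documented Hash#[]= behavior that an unfrozen String key is duplicated and frozen. QuietString is a hypothetical name for the suggested subclass; the variable names are illustrative only.

```ruby
# A singleton method is attached to one object only.
original = +"bar"
def original.upcase
  self
end

hash = {}
hash[original] = "code"   # Hash stores a frozen copy of an unfrozen String key
key = hash.keys.first

same_object = key.equal?(original)  # false: the key is a different object
key_frozen  = key.frozen?           # true
key_upcase  = key.upcase            # "BAR": the copy has no singleton method

# Hypothetical subclass: the override lives in the class, so the copy keeps it.
class QuietString < String
  def upcase
    self
  end
end

quiet_hash = {}
quiet_hash[QuietString.new("bar")] = "code"
quiet_upcase = quiet_hash.keys.first.upcase  # "bar"
```

The subclass works here because the copy made by Hash keeps the key's class, while singleton methods belong to the one original object.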

Just because in Ruby you can reopen core classes and redefine virtually everything, it doesn't mean you should.
With great power comes great responsibility, and your responsibility is not to redefine a core library method just because a couple of objects might need it.
If your instance doesn't behave like a String, declare your own class that extends String.

The usual way to extend a single object in Ruby is this:
s = "bar"
class << s
  def upcase
    self
  end
end
...but that doesn't solve your problem. It seems that Ruby has special rules for hash keys that are strings or instances of String subclasses. Maybe instead of a string, you can use an object with a meaningful definition of to_s?

Related

Ruby: understanding data structure

Most of the Factorybot factories are like:
FactoryBot.define do
  factory :product do
    association :shop
    title { 'Green t-shirt' }
    price { 10.10 }
  end
end
It seems that inside the ":product" block we are building a data structure, but it's not the typical hashmap, the "keys" are not declared through symbols and commas aren't used.
So my question is: what kind of data structure is this, and how does it work?
How does calling "association" inside the block not trigger a:
NameError: undefined local variable or method `association'
when this would happen in many other situations? Is there a subject in computer science related to this?
The block is not a data structure, it's code. association and friends are all method calls, probably being intercepted by method_missing. Here's an example using that same technique to build a regular hash:
class BlockHash < Hash
  def method_missing(key, value=nil)
    if value.nil?
      return self[key]
    else
      self[key] = value
    end
  end
  def initialize(&block)
    self.instance_eval(&block)
  end
end
With which you can do this:
h = BlockHash.new do
  foo 'bar'
  baz :zoo
end
h
#=> {:foo=>"bar", :baz=>:zoo}
h.foo
#=> "bar"
h.baz
#=> :zoo
I have not worked with FactoryBot, so I'm going to make some assumptions based on other libraries I've worked with. Mileage may vary.
The basics:
FactoryBot is a module (a class would work the same way for this purpose).
define is a module-level ("static") method on FactoryBot (I'm going to assume I still haven't lost you ;) ).
define takes a block, which is pretty standard stuff in Ruby.
But here's where things get interesting.
Typically a block is a closure: it runs with the self and the local variables of the place where it was written. Ruby makes it super easy to change that self: instance_eval(&block) runs the block with another object as self. That means the block can call methods that weren't available where the block was written.
factory on line 2 is just such a method. You didn't declare it, but the block it's running in isn't being executed with the standard scope. Instead, your block is immediately passed to FactoryBot, which passes it to an inner class named DSL, which instance_evals the block so that its own factory method will be run.
Lines 3-5 don't work that way, since you can have an arbitrary name there.
Ruby has several ways to handle missing methods, but the most straightforward is method_missing. method_missing is an overridable hook that any class can define to tell Ruby what to do when somebody calls a method that doesn't exist.
Here it's checking to see if it can parse the name as an attribute name and use the parameters or block to define an attribute or declare an association. It sounds more complicated than it is. Typically in this situation I would use define_method, define_singleton_method, instance_variable_set etc... to dynamically create and control the underlying classes.
I hope that helps. You don't need to know any of this to use the library; the developers made a domain-specific language so people wouldn't have to think about this stuff. But stay curious and keep growing.
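The two mechanisms described above can be sketched together in a few lines. This is a hypothetical miniature, not FactoryBot's actual code: MiniFactory, DSL, AttributeCollector and build are names made up for illustration. The outer block is instance_eval'd against an object that has a real factory method, and the inner block is instance_eval'd against an object whose method_missing records arbitrary attribute names.

```ruby
# Hypothetical miniature of the pattern; not FactoryBot's real implementation.
module MiniFactory
  # Receives the inner block: any unknown method called with a block
  # is recorded as an attribute declaration, e.g. `title { "..." }`.
  class AttributeCollector
    attr_reader :attributes

    def initialize
      @attributes = {}
    end

    def method_missing(name, *args, &block)
      return super unless block && args.empty?

      @attributes[name] = block
    end

    # Any name may be used as an attribute.
    def respond_to_missing?(_name, _include_private = false)
      true
    end
  end

  # Receives the outer block: `factory` is an ordinary method that is
  # visible inside the block only because of instance_eval.
  class DSL
    attr_reader :factories

    def initialize
      @factories = {}
    end

    def factory(name, &block)
      collector = AttributeCollector.new
      collector.instance_eval(&block)
      @factories[name] = collector.attributes
    end
  end

  def self.define(&block)
    dsl = DSL.new
    dsl.instance_eval(&block)
    dsl.factories
  end

  # Calls each stored block to produce a plain Hash of values.
  def self.build(factories, name)
    factories.fetch(name).transform_values(&:call)
  end
end

factories = MiniFactory.define do
  factory :product do
    title { "Green t-shirt" }
    price { 10.10 }
  end
end

product = MiniFactory.build(factories, :product)
```

Here product ends up as a Hash with the keys :title and :price; a real library would build a model object instead.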

Ruby nested send

Say I have an object with a method that accesses an object:
def foo
  @foo
end
I know I can use send to access that method:
obj.send("foo") # Returns @foo
Is there a straightforward way to do a recursive send to get a parameter on the @foo object, like:
obj.send("foo.bar") # Returns @foo.bar
You can use instance_eval:
obj.instance_eval("foo.bar")
You can even access the instance variable directly:
obj.instance_eval("@foo.bar")
While OP has already accepted an answer using instance_eval(string), I would strongly urge OP to avoid string forms of eval unless absolutely necessary. Eval invokes the ruby compiler -- it's expensive to compute and dangerous to use as it opens a vector for code injection attacks.
As stated there's no need for send at all:
obj.foo.bar
If indeed the names of foo and bar are coming from some non-static calculation, then
obj.send(foo_method).send(bar_method)
is simple and all one needs for this.
If the methods are coming in the form of a dotted string, one can use split and inject to chain the methods:
'foo.bar'.split('.').inject(obj, :send)
Clarifying in response to comments: String eval is one of the riskiest things one can do from a security perspective. If there's any way the string is constructed from user supplied input without incredibly diligent inspection and validation of that input, you should just consider your system owned.
send(method) where method is obtained from user input has risks too, but the attack vector is more limited. Your user input can cause you to execute any zero-argument method dispatchable through the receiver. Good practice here would be to always whitelist the methods before dispatching:
VALID_USER_METHODS = %w{foo bar baz}
def safe_send(method)
  raise ArgumentError, "#{method} not allowed" unless VALID_USER_METHODS.include?(method.to_s)
  send(method)
end
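The whitelist and the split/inject chain can be combined. The following is a sketch under assumptions: SafeChain, Inner and Outer are hypothetical names, and public_send is used instead of send so that private methods stay out of reach as well.

```ruby
# Hypothetical helper: whitelist check plus dotted-path dispatch.
module SafeChain
  ALLOWED_METHODS = %w[foo bar baz].freeze

  # Walks a dotted path such as "foo.bar", checking every segment
  # against the whitelist before dispatching it.
  def self.call(receiver, dotted_path)
    dotted_path.split(".").inject(receiver) do |object, method_name|
      unless ALLOWED_METHODS.include?(method_name)
        raise ArgumentError, "#{method_name} not allowed"
      end

      object.public_send(method_name)
    end
  end
end

# Stand-in objects so the example is self-contained.
Inner = Struct.new(:bar)
Outer = Struct.new(:foo)
obj = Outer.new(Inner.new(42))

value = SafeChain.call(obj, "foo.bar")   # 42

rejected = begin
  SafeChain.call(obj, "foo.class")       # "class" is not whitelisted
rescue ArgumentError => e
  e.message
end
```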
A bit late to the party, but I had to do something similar that had to combine both 'sending' and accessing data from a hash/array in a single call. Basically this allows you to do something like the following
value = obj.send_nested("data.foo['bar'].id")
and under the hood this will do something akin to
obj.send(:data).send(:foo)['bar'].send(:id)
This also works with symbols in the attribute string
value = obj.send_nested('data.foo[:bar][0].id')
which will do something akin to
obj.send(:data).send(:foo)[:bar][0].send(:id)
In the event that you want to use indifferent access you can add that as a parameter as well. E.g.
value = obj.send_nested('data.foo[:bar][0].id', with_indifferent_access: true)
Since it's a bit more involved, here is the link to the gist that you can use to add that method to the base Ruby Object. (It also includes the tests so that you can see how it works)

Generic way to replace an object in its own method

With strings one can do this:
a = "hello"
a.upcase!
p a #=> "HELLO"
But how would I write my own method like that?
Something like (although that doesn't work obviously):
class MyClass
  def positify!
    self = [0, self].max
  end
end
I know there are some tricks one can use on String but what if I'm trying to do something like this for Object?
Many classes are immutable (e.g. Numeric, Symbol, ...), so have no method allowing you to change their value.
On the other hand, any Object can have instance variables and these can be modified.
There is an easy way to delegate the behavior to a known object (say 42) and be able to change, later on, to another object, using SimpleDelegator. In the example below, quacks_like_an_int behaves like an Integer:
require 'delegate'
quacks_like_an_int = SimpleDelegator.new(42)
quacks_like_an_int.round(-1) # => 40
quacks_like_an_int.__setobj__(666)
quacks_like_an_int.round(-1) # => 670
You can use it to design a class too, for example:
require 'delegate'
class MutableInteger < SimpleDelegator
  def plus_plus!
    __setobj__(self + 1)
    self
  end
  def positify!
    __setobj__(0) if self < 0
    self
  end
end
i = MutableInteger.new(-42)
i.plus_plus! # => -41
i.positify! # => 0
Well, the upcase! method doesn't change the object identity, it only changes its internal structure (s.object_id == s.upcase!.object_id).
On the other hand, numbers are immutable objects and therefore, you can't change their value without changing their identity. AFAIK, there's no way for an object to self-change its identity, but, of course, you may implement positify! method that changes properties of its object - and this would be an analogue of what upcase! does for strings.
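As a sketch of such a property-changing positify!, assume a hypothetical Balance class that keeps its number in an instance variable. The method replaces that state and returns self, so the object keeps its identity, just as a string does after upcase!.

```ruby
# Hypothetical class: the value lives in an instance variable, so a bang
# method can change it while the object itself stays the same.
class Balance
  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  # Clamp negative amounts to zero in place and return self.
  def positify!
    @amount = [0, @amount].max
    self
  end
end

balance = Balance.new(-5)
id_before = balance.object_id
balance.positify!
amount_after  = balance.amount                  # 0
same_identity = balance.object_id == id_before  # true
```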
Assignment, or binding of local variables (using the = operator) is built-in to the core language and there is no way to override or customize it. You could run a preprocessor over your Ruby code which would convert your own, custom syntax to valid Ruby, though. You could also pass a Binding in to a custom method, which could redefine variables dynamically. This wouldn't achieve the effect you are looking for, though.
Understand that self = could never work, because when you say a = "string"; a = "another string" you are not modifying any objects; you are rebinding a local variable to a different object. Inside your custom method, you are in a different scope, and any local variables which you bind will only exist in that scope; it won't have any effect on the scope which you called the method from.
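The difference between rebinding and mutating can be seen in a small sketch; rebind and mutate are hypothetical method names used only for illustration.

```ruby
# Rebinding: `text` is a local name inside the method. Assigning to it
# points that name at a new object and leaves the caller's variable alone.
def rebind(text)
  text = "another string"
  text
end

# Mutation: String#replace changes the contents of the very object
# that the caller's variable also refers to.
def mutate(text)
  text.replace("another string")
end

a = +"string"
rebind(a)
after_rebind = a.dup   # "string": the caller's object is untouched
mutate(a)
after_mutate = a       # "another string": same object, new contents
```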
You cannot change self to point to anything other than its current object. You can change the object's internal state, as String#upcase! does when it replaces the underlying characters with their upper-case versions.
As pointed out in this answer:
Ruby and modifying self for a Float instance
There is a trick mentioned there that works around this: write your class as a wrapper around the other object. Your wrapper class can then replace the wrapped object at will. I'm hesitant to say this is a good idea, though.

Dot syntax vs param passing syntax

Are only the core Ruby methods callable using object.functionName syntax? Is it possible to create methods on my own that are callable in the dot syntax fashion?
For this method:
def namechanger(name)
  nametochange = name
  puts "This is the name to change: #{nametochange}"
end
First one below works, the second does not.
namechanger("Steve")
"Steve".namechanger
I get an error on "Steve".namechanger
The error is:
rb:21:in `<main>': private method `namechanger' called for "Steve":String (NoMethodError)
Yes, you can add methods to the String class to achieve your desired effect; the variable "self" refers to the object which receives the method call.
class String
  def namechanger
    "This is the name to change: #{self}"
  end
end
"Steve".namechanger # => This is the name to change: Steve
This practice is known as monkey patching and should be used carefully.
Instead of monkeypatching, you could always subclass and thus:
Be precise about what you think the object really is.
Gain all the methods of String
For example:
class PersonName < String
  def namechanger
    puts "This is the name to change: #{self}"
  end
end
s = PersonName.new( "Iain" )
s.namechanger
This is the name to change: Iain
=> nil
What you have in the first form is a method that takes a parameter, which is very different from a method that takes no parameters and is called on a receiver. Let me illustrate:
namechanger("Steve")
This looks for a method named namechanger and passes a string argument to it. There is no explicit receiver, so Ruby looks the method up on the current self. A method defined with def at the top level of a script becomes a private method of Object, which is why it can be called this way from anywhere but not with an explicit receiver.
"Steve".namechanger
This calls a method that takes no arguments and has to be callable on a String with an explicit receiver. Such methods typically use the implicit self to get at the data they operate on.
If you want to be able to call "Steve".namechanger, you have to make namechanger a method of the String class like this:
class String
  def namechanger
    puts "This is the name to change: #{self}"
  end
end
This is generally referred to as "monkey patching" and you might want to improve your general Ruby proficiency a bit before you get into the related considerations and discussions.
You could do
class String
  def namechanger
    puts "This is the name to change: #{self}"
  end
end
The difference is that your first example is a method that's (basically) globally defined, which takes a string and operates on it. This code above however defines a method called "namechanger" which takes no parameters, and defines it directly on the String class. So any and all strings in your application will then have this method.
But as pst said, you should probably not dive into that style of programming until you get a little more familiar with Ruby, so that you can more easily see the upsides and downsides of doing monkeypatching like this. One of the considerations is that you probably have many strings that don't represent names, and it doesn't make a lot of sense for those strings to have a method called namechanger.
That said, if your goal is just to have a little fun with Ruby, to see what you can do, go for it, but remember to be more careful in projects that will have a longer lifespan.

Ruby - how to handle problem of subclass accidentally overriding superclass's private fields?

Suppose you write a class Sup and I decide to extend it to Sub < Sup. Not only do I need to understand your published interface, but I also need to understand your private fields. Witness this failure:
class Sup
  def initialize
    @privateField = "from sup"
  end
  def getX
    return @privateField
  end
end
class Sub < Sup
  def initialize
    super()
    @privateField = "i really hope Sup does not use this field"
  end
end
obj = Sub.new
print obj.getX # prints "i really hope Sup does not use this field"
The question is, what is the right way to tackle this problem? It seems a subclass should be able to use whatever fields it wants without messing up the superclass.
EDIT: The equivalent example in Java returns "from Sup", which is the answer this should produce as well.
Instance variables have nothing to do with inheritance. They are created on first assignment rather than by a declaration, so the language has no access control for them and they cannot be shadowed.
Not only do I need to understand your
published interface, but I also need
to understand your private fields.
Actually this is an "official" position. Excerpt from "The Ruby Programming Language" book (where Matz is one of the authors):
... this is another reason why it is only safe to extend Ruby
classes when you are familiar with
(and in control of) the implementation
of the superclass.
If you don't know it inside and out you're on your own. Sad but true.
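The point about instance variables being created on first assignment can be checked with a small sketch. Parent and Child are hypothetical stand-ins for the classes in the question.

```ruby
class Parent
  def initialize
    @value = "from parent"
  end

  def parent_value
    @value
  end
end

class Child < Parent
  def initialize
    super
    @value = "from child"   # same slot on the same object, not a new field
  end
end

fresh = Parent.allocate                # object created without running initialize
fresh_ivars = fresh.instance_variables # []: nothing has been assigned yet

child = Child.new
child_ivars = child.instance_variables # [:@value]: one variable, shared by both classes
seen_by_parent = child.parent_value    # "from child"
```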
Don't subclass it!
Use composition instead of inheritance.
Edit: Rather than MyObject subclassing ExistingObject, see if my_object having an instance variable referring to existing_object would be more appropriate.
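A minimal sketch of the composition approach, assuming a hypothetical Wrapper class that holds a Sup instead of inheriting from it. Each object has its own instance variables, so the two @privateField variables cannot collide. Forwardable from the standard library is used to expose getX.

```ruby
require "forwardable"

class Sup
  def initialize
    @privateField = "from sup"
  end

  def getX
    @privateField
  end
end

# Hypothetical wrapper: delegates to a Sup rather than subclassing it.
class Wrapper
  extend Forwardable
  def_delegators :@sup, :getX   # expose only the parts of Sup we want

  def initialize
    @sup = Sup.new
    @privateField = "belongs to Wrapper only"
  end

  def own_field
    @privateField
  end
end

obj = Wrapper.new
from_sup = obj.getX        # "from sup"
from_own = obj.own_field   # "belongs to Wrapper only"
```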
Instance variables belong to instances (ie objects). They're not determined by the classes themselves.
Unlike Java or C#, in Ruby instance variables are always visible to inheriting classes. There is no way to hide them.
Ruby and Java don't treat 'private' the same way. In Ruby, marking a method as private only means that it can't be called with an explicit receiver, i.e.:
class Sub
  private
  def foo; end
end
sub = Sub.new
sub.foo # => NoMethodError (private method `foo' called)
but you can always access it if you change who is self like:
sub.instance_eval { foo } #instance_eval changes self to receiver, 'sub' in this example
Conclusion: don't rely on being able to hide or protect something from the outside world! Or: with great power comes great responsibility!
EDIT:
Yes, I know the question was about fields, but it's the same thing. You can always do:
sub.instance_eval { @my_private_field = 'something else' }
puts sub.instance_eval { @my_private_field }
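The ways around private described above can be put side by side in one runnable sketch. Vault and its members are hypothetical names; send, instance_eval, instance_variable_get and instance_variable_set are standard Ruby methods.

```ruby
class Vault
  def initialize
    @secret = "original"
  end

  private

  def combination
    "1234"
  end
end

vault = Vault.new

# An explicit receiver is rejected for a private method...
blocked = begin
  vault.combination
rescue NoMethodError
  :no_method_error
end

# ...but send and instance_eval both get past the restriction.
via_send          = vault.send(:combination)
via_instance_eval = vault.instance_eval { combination }

# Instance variables can be read and overwritten from outside as well.
read_before = vault.instance_variable_get(:@secret)
vault.instance_variable_set(:@secret, "something else")
read_after = vault.instance_eval { @secret }
```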

Resources