How to Document Classmethod Properties in Python 3.9 / 3.10? - docstring

A small addition in Python 3.9 was the following:
Changed in version 3.9: Class methods can now wrap other descriptors such as property().
This is useful in several contexts, for example as a solution to "Using property() on classmethods", or to create what is essentially a lazily evaluated class attribute via the recipe
from functools import cache

class A:
    @classmethod
    @property
    @cache
    def lazy_class_attribute(cls):
        """Method docstring."""
        return expensive_computation(cls)
This works well within Python: importing the class, or instantiating it or its subclasses, does not cause expensive_computation to run. However, it seems that both pydoc and Sphinx will not only execute expensive_computation when they try to obtain the docstring, but also fail to display any docstring for this @classmethod whatsoever.
Question: Is it possible - from within python (*) - to have lazily evaluated class-attributes/properties that do not get executed when building documentation?
(*) One workaround presented in Stop Sphinx from Executing a cached classmethod property, thanks to /u/jsbueno, consists of modifying the function body based on an environment variable:
def lazy_class_attribute(cls):
    """Method docstring."""
    if os.environ.get("GENERATING_DOCS", False):
        return
    return expensive_computation(cls)
I like this workaround a lot, since in particular, it allows one to present a different output when documenting. (For example, my classes have attributes which are paths based on the class-name. If a different user executes the same script, the path will be different since their home folder is different.)
There are two problems, however:
1. This approach depends on doing things outside of Python (setting an environment variable) that, in an ideal world, should not be necessary.
2. To get the docstring into the documentation, it seems we end up having to do something unintuitive like
class MetaClass(type):
    @property
    @cache
    def _expensive_function(cls):
        """some expensive function"""

class BaseClass(metaclass=MetaClass):
    lazy_attribute: type = classmethod(MetaClass._expensive_function)
    """lazy_attribute docstring"""
PS: By the way, is there any functional difference between a @classmethod @property and an attribute? They seem very similar. The only difference I can make out at the moment is that if the attribute needs access to other @classmethods, we need to move everything into the metaclass as above.

Related

Ruby: instantiate objects from files

Overview:
main.rb
items/
  one.rb
  two.rb
  three.rb
Every file in items/ should have a human readable description (serialization is out), like so (but maybe a DSL would be better?):
class One < BaseItem
  name "Item one"
  def meth
    "something"
  end
main.rb should be able to instantiate all objects from the items/ directory. How could this be accomplished? Not familiar with Ruby, I see the object model allows for some pretty cool things (those class hooks, etc), but I'm having trouble finding a way to solve this.
Any input appreciated.
EDIT:
Shoot, I may have missed the gist of it - what I didn't mention was the stuff in the items/ dir would be dynamic — treat items as plugins, I'd want main.rb to autodetect everything in that dir at runtime (possibly force a reload during execution). main.rb has no prior knowledge of the objects in there, it just knows what methods to expect from them.
I've looked at building DSLs, considering defining (in main.rb) a spawn function that takes a block. A sample file in items/ would look something like:
spawn do
  name "Item name"
  def foo
    "!"
  end
end
And the innards of spawn would create a new object of the base type and pass the block to instance_eval. That meant I'd need a method called name to set the value, but I also wanted the value to be readable under name, so I had to work around it by renaming the attribute.
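The spawn idea described above can be sketched roughly as follows. This is only a minimal illustration under assumptions: `BaseItem`, `ITEMS` and `spawn_item` are hypothetical names, and the helper is called `spawn_item` rather than `spawn` because a top-level `spawn` would shadow Ruby's built-in `Kernel#spawn`. Letting `name` act as both setter and getter is one possible way around the attribute-renaming problem.

```ruby
# Hypothetical base type. `name` acts as a setter when given an argument
# and as a getter when called without one, so no attr renaming is needed.
class BaseItem
  def name(value = nil)
    @name = value unless value.nil?
    @name
  end
end

# Registry of everything the item files have spawned so far (an assumption).
ITEMS = []

# Builds a BaseItem and evaluates the block with that item as self.
def spawn_item(&block)
  item = BaseItem.new
  item.instance_eval(&block)
  ITEMS << item
  item
end

# What a file in items/ could contain:
item = spawn_item do
  name "Item name"

  # Defined inside instance_eval, so it becomes a method of this one item.
  def foo
    "!"
  end
end

puts item.name # prints "Item name"
puts item.foo  # prints "!"
```

Because every call appends to `ITEMS`, the main script would not need to know the items in advance; it only needs to load the files and read the registry.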
I've also tried the inherit route: make every item file contain a class that inherits from a BaseItem of sorts, and hook into it via inherited ... but that didn't work (the hook never fired, I've lost the code now).
EDIT2:
You could look at what homebrew does with its formulas, that's very close to what I'd want - I just didn't have the ruby prowess to reverse engineer how it handles a formula.
It all boils down to requiring those files and making sure that you have implemented the functionality you want in them.
If you want a more specific response, you need to ask a more specific question.
I am no expert on object persistence, but the answer to your specific question is that you have two good choices: one is YAML, and the other is Ruby itself, that is, a DSL written by you or someone else and specific to your business logic.
But I think a more general answer would require reviewing object persistence in Ruby more systematically. For example, ActiveRecord::Base descendants persist as database tables. There are other ways; I found e.g. http://stone.rubyforge.org/ by googling. This is my problem as well: I'm facing the same question in my work.
What you are asking for looks and smells a lot like a normal Ruby script.
class One < BaseItem
  name "Item one"
  def meth
    "something"
  end
We'd close the class definition with another end statement. name "Item one" would probably be done inside the initialize method, by setting an instance variable:
attr_reader :name
def initialize(name)
  @name = name
end
Typically we wouldn't call the folder "items", but instead it would be "lib", but otherwise what you are talking about is very normal and expected.
Instantiating all items in a folder is easily done by iterating over the folder's contents, requiring the files, and calling the new method for that item. You can figure out the name by mapping the filename to the class name, or by initializing an instance at the end of the file:
one = One.new("item one")
You could keep track of the items loaded in an array or hash, or just hardwire them in. It's up to you, since this is your code.
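As a rough sketch of the load-and-instantiate loop described above, assuming a hypothetical `BaseItem` that records its subclasses through the `inherited` hook: a temporary directory stands in for `items/` so the example is self-contained, and the names `registry` and `items` are illustrative only.

```ruby
require "tmpdir"

class BaseItem
  # Subclasses loaded so far, in definition order.
  def self.registry
    @registry ||= []
  end

  # Ruby calls this hook each time a class inherits from BaseItem.
  def self.inherited(subclass)
    super
    BaseItem.registry << subclass
  end
end

Dir.mktmpdir do |dir|
  # Stand-in for a plugin file such as items/one.rb.
  File.write(File.join(dir, "one.rb"), <<~RUBY)
    class One < BaseItem
      def meth
        "something"
      end
    end
  RUBY

  # Load every .rb file in the directory; sorting keeps the order stable.
  Dir.glob(File.join(dir, "*.rb")).sort.each { |path| load path }
end

# One instance per discovered class.
items = BaseItem.registry.map(&:new)
items.each { |i| puts "#{i.class}: #{i.meth}" } # prints "One: something"
```

`load` is used instead of `require` here because it re-reads a file every time it is called, which would suit forcing a reload during execution.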
It sounds like you haven't tried writing any Ruby scripts, otherwise you would have found this out already. Normal Ruby programming books/documentation would have covered this. As is, the question is akin to premature optimization, and working with the language would have given you the answer.

ruby module_function vs including module

In ruby, I understand that module functions can be made available without mixing in the module by using module_function as shown here. I can see how this is useful so you can use the function without mixing in the module.
module MyModule
  def do_something
    puts "hello world"
  end
  module_function :do_something
end
My question is though why you might want to have the function defined both of these ways.
Why not just have
def MyModule.do_something
OR
def do_something
In what kind of cases would it be useful to have the function available to be mixed in, or to be used as a static method?
Think of Enumerable.
This is the perfect example of when you need to include a module. If your class defines #each, you get a lot of goodness just by including a module (#map, #select, etc.). This is the only case when I use modules as mixins: when the module provides functionality in terms of a few methods defined in the class you include the module in. I would argue that this should be the only case in general.
As for defining "static" methods, a better approach would be:
module MyModule
  def self.do_something
  end
end
You don't really need to call #module_function. I think it is just weird legacy stuff.
You can even do this:
module MyModule
  extend self
  def do_something
  end
end
...but it won't work well if you also want to include the module somewhere. I suggest avoiding it until you learn the subtleties of Ruby metaprogramming.
Finally, if you just do:
def do_something
end
...it will not end up as a global function, but as a private method on Object (there are no functions in Ruby, just methods). There are a few downsides. First, you don't have namespacing: if you define another function with the same name, the one that gets evaluated later is the one you get. Second, if you have functionality implemented in terms of #method_missing, having a private method in Object will shadow it. And finally, monkey patching Object is just evil business :)
EDIT:
module_function can be used in a way similar to private:
module Something
  def foo
    puts 'foo'
  end
  module_function
  def bar
    puts 'bar'
  end
end
That way, you can call Something.bar, but not Something.foo. If you define any other methods after this call to module_function, they would also be available without mixing in.
I don't like it for two reasons, though. First, modules that are both mixed in and have "static" methods sound a bit dodgy. There might be valid cases, but it won't be that often. As I said, I prefer either to use a module as a namespace or mix it in, but not both.
Second, in this example, bar would also be available to classes/modules that mix in Something. I'm not sure when this is desirable, since either the method uses self and it has to be mixed in, or doesn't and then it does not need to be mixed in.
I think calling module_function without a method name is considerably more common than calling it with one. The same goes for private and protected.
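The behaviour discussed in this answer can be checked with a small sketch. The `Host` class and its `call_bar` method are hypothetical additions for illustration; the methods return strings instead of printing so the results are easy to inspect.

```ruby
module Something
  def foo
    "foo"
  end

  module_function # applies to every method defined below this line

  def bar
    "bar"
  end
end

class Host
  include Something

  def call_bar
    bar # the mixed-in copy is private, so it is only reachable from inside
  end
end

puts Something.bar               # prints "bar"
puts Something.respond_to?(:foo) # prints "false"

host = Host.new
puts host.foo                    # prints "foo"
puts host.call_bar               # prints "bar"
puts host.respond_to?(:bar)      # prints "false"
```

So `module_function` gives a public copy on the module itself and a private instance method for anything that mixes the module in.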
It's a good way for a Ruby library to offer functionality that does not use (much) internal state. So if you (e.g.) want to offer a sin function and don't want to pollute the "global" (Object) namespace, you can define it as class method under a constant (Math).
However, an app developer, who wants to write a mathematical application, might need sin every two lines. If the method is also an instance method, she can just include the Math (or My::Awesome::Nested::Library) module and can now directly call sin (stdlib example).
It's really about making a library more comfortable for its users. They can choose for themselves whether they want the functionality of your library at the top level.
By the way, you can achieve functionality similar to module_function by using extend self (in the first line of the module). To my mind, it looks better and makes things a bit clearer.
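A minimal sketch of both styles with the standard `Math` module. The `Triangle` class is a hypothetical example, and `sqrt` is used instead of `sin` only because its results are exact for these inputs.

```ruby
# Style 1: call the function on the module; nothing is mixed in anywhere.
puts Math.sqrt(16) # prints "4.0"

# Style 2: include the module and call the function without a receiver.
class Triangle
  include Math

  def initialize(a, b)
    @a = a
    @b = b
  end

  def hypotenuse
    sqrt(@a**2 + @b**2)
  end
end

puts Triangle.new(3, 4).hypotenuse # prints "5.0"
```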
Update: More background info in this blog article.
If you want to look at a working example, check out the chronic gem:
https://github.com/mojombo/chronic/blob/master/lib/chronic/handlers.rb
and Handlers is being included in the Parser class here:
https://github.com/mojombo/chronic/blob/master/lib/chronic/parser.rb
He's using module_function to send the methods from Handlers to specific instances of Handler using that instance's invoke method.

How can I detect which modules depend on which modules in Ruby?

What tools can determine which modules have methods that are calling methods from other modules in Ruby?
Background: I'm partway through breaking an 808-line module into smaller modules, having created 12 submodules. However, some of the methods in one of the modules call methods in another submodule. This may or may not be OK, depending on whether the module of the called method is meant to be common functionality.
module DisplayStatistics1
  def display_statistics_1_foo
    calculate_statistics_foo # call a method that's in CalculateStatistics - this is ok
    display_statistics_2_bar # call a method that's in DisplayStatistics2 - this is bad
  end
  # other methods omitted
end
# modules DisplayStatistics2 and CalculateStatistics omitted
class ExampleClass
  include DisplayStatistics1
  include DisplayStatistics2
  include CalculateStatistics
end
Ideally the analysis tool would show that DisplayStatistics1 has dependencies on DisplayStatistics2 as well as on CalculateStatistics.
Update: Maybe I shouldn't have done it this way - maybe I should have split them up into classes instead. That way, I'd have known for sure what depended on what!
While I'm not aware of a static analysis tool for Ruby, the one that seems closest to what you want is ruby-prof. It can generate a call graph in many formats, including an HTML tree and even a GraphViz box-and-line plot.
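For a quick look without an extra gem, Ruby's built-in TracePoint can record which module each called method was defined in. The following is only a sketch of that runtime approach, not a static analysis: it sees only the code paths that actually run. The bodies of the omitted modules and the `stack` and `deps` variables are assumptions made for the example.

```ruby
module CalculateStatistics
  def calculate_statistics_foo
    42
  end
end

module DisplayStatistics2
  def display_statistics_2_bar
    "bar"
  end
end

module DisplayStatistics1
  def display_statistics_1_foo
    calculate_statistics_foo
    display_statistics_2_bar
  end
end

class ExampleClass
  include DisplayStatistics1
  include DisplayStatistics2
  include CalculateStatistics
end

stack = []                           # defining modules of the methods currently running
deps = Hash.new { |h, k| h[k] = [] } # caller module => modules it called into

trace = TracePoint.new(:call, :return) do |tp|
  if tp.event == :call
    from = stack.last
    to = tp.defined_class # the module or class where the method is defined
    deps[from] |= [to] if from && from != to
    stack.push(to)
  else
    stack.pop
  end
end

# Only calls made inside the block are traced.
trace.enable { ExampleClass.new.display_statistics_1_foo }

deps.each { |from, to| puts "#{from} depends on #{to.join(', ')}" }
```

Running a test suite inside `trace.enable` instead of a single call would give broader coverage of the dependencies.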

How do I deserialize classes in Psych?

How do I deserialize in Psych to return an existing object, such as a class object?
To do serialization of a class, I can do
require "psych"
class Class
  yaml_tag 'class'
  def encode_with coder
    coder.represent_scalar 'class', name
  end
end
yaml_string = Psych.dump(String) # => "--- !<class> String\n...\n"
but if I try doing Psych.load on that, I get an anonymous class, rather than the String class.
The normal deserialization method is Object#init_with(coder), but that only changes the state of the existing anonymous class, whereas I'm wanting the String class.
Psych::Visitors::ToRuby#visit_Psych_Nodes_Scalar(o) has cases where rather than modifying existing objects with init_with, they make sure the right object is created in the first place (for example, calling Complex(o.value) to deserialize a complex number), but I don't think I should be monkeypatching that method.
Am I doomed to working with low level or medium level emitting, or am I missing something?
Background
I'll describe the project, why it needs classes, and why it needs
(de)serialization.
Project
The Small Eigen Collider aims to create random tasks for Ruby to run.
The initial aim was to see if the different implementations of Ruby
(for example, Rubinius and JRuby) returned the same results when given
the same random tasks, but I've found that it's also good for
detecting ways to segfault Rubinius and YARV.
Each task is composed of the following:
receiver.send(method_name, *parameters, &block)
where receiver is a randomly chosen object, and method_name is the
name of a randomly chosen method, and *parameters is an array of
randomly chosen objects. &block is not very random - it's basically
equivalent to {|o| o.inspect}.
For example, if receiver were "a", method_name was :casecmp, and
parameters was ["b"], then you'd be calling
"a".send(:casecmp, "b") {|x| x.inspect}
which is equivalent to (since the block is irrelevant)
"a".casecmp("b")
The Small Eigen Collider runs this code, and logs these inputs and
also the return value. In this example, most implementations of Ruby
return -1, but at one stage, Rubinius returned +1. (I filed this as a
bug https://github.com/evanphx/rubinius/issues/518 and the Rubinius
maintainers fixed the bug)
Why it needs classes
I want to be able to use class objects in my Small Eigen Collider.
Typically, they would be the receiver, but they could also be one of
the parameters.
For example, I found that one way to segfault YARV is to do
Thread.kill(nil)
In this case, receiver is the class object Thread, and parameters is
[nil]. (Bug report: http://redmine.ruby-lang.org/issues/show/4367 )
Why it needs (de)serialization
The Small Eigen Collider needs serialization for a couple of reasons.
One is that using a random number generator to generate a series of
random tasks every time isn't practical. JRuby has a different builtin
random number generator, so even when given the same PRNG seed it'd
give different tasks to YARV. Instead, what I do is I create a list of
random tasks once (the first running of ruby
bin/small_eigen_collider), have the initial running serialize the list
of tasks to tasks.yml, and then have subsequent runnings of the
program (using different Ruby implementations) read in that tasks.yml
file to get the list of tasks.
Another reason I need serialization is that I want to be able to edit
the list of tasks. If I have a long list of tasks that leads to a
segmentation fault, I want to reduce the list to the minimum required
to cause a segmentation fault. For example, with the following bug
https://github.com/evanphx/rubinius/issues/643 ,
ObjectSpace.undefine_finalizer(:symbol)
by itself doesn't cause a segmentation fault, and nor does
Symbol.all_symbols.inspect
but if you put the two together, it did. But I started out with
thousands of tasks, and needed to pare it back to just those two
tasks.
Does deserialization returning existing class objects make sense in
this context, or do you think there's a better way?
Status of my current research:
To get your desired behavior working, you can use my workaround mentioned above.
Here is the nicely formatted code example:
string_yaml = Psych.dump(Marshal.dump(String))
# => "--- ! \"\\x04\\bc\\vString\"\n"
string_class = Marshal.load(Psych.load(string_yaml))
# => String
Your hack of modifying Class may never work, because real class handling isn't implemented in psych/yaml.
You can take this repo tenderlove/psych, which is the standalone lib.
(Gem: psych - to load it, use: gem 'psych'; require 'psych' and do a check with Psych::VERSION)
As you can see in lines 249-251, handling of objects of the class Class isn't implemented.
Instead of monkeypatching the class Class, I recommend contributing to the Psych lib by extending this class handling.
So in my mind the final yaml result should be something like: "--- !ruby/class String"
After one night thinking about that I can say, this feature would be really nice!
Update
Found a tiny solution which seems to work in the intended way:
code gist: gist.github.com/1012130 (with descriptive comments)
The Psych maintainer has implemented the serialization and deserialization of classes and modules. It's now in Ruby!
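Where relying on class tags is not an option, one alternative (a different technique from the Marshal workaround above) is to store the class by name and resolve it with `Object.const_get` after loading. This is a minimal sketch; the task layout with the keys "receiver", "method" and "parameters" is a hypothetical format, not the Small Eigen Collider's own.

```ruby
require "yaml"

# Hypothetical task record: the receiver class is written out by name.
task = { "receiver" => "String", "method" => "new", "parameters" => ["abc"] }

yaml = YAML.dump(task)
loaded = YAML.load(yaml) # plain hash, strings and arrays only

# Look the constant up again, which returns the existing class object.
receiver = Object.const_get(loaded["receiver"])
result = receiver.public_send(loaded["method"], *loaded["parameters"])

puts receiver.equal?(String) # prints "true"
puts result                  # prints "abc"
```

The YAML stays human-editable, which matters when a long task list has to be pared down by hand.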

What are some good examples of Mixins and or Traits?

I was reading up on Ruby and learned about its mixin pattern, but couldn't think of much useful mixin functionality (most likely because I'm not used to thinking that way). So I was wondering: what would be good examples of useful mixin functionality?
Thanks
Edit: A bit of background. I'm coming from C++ and other object-oriented languages, but my doubt here is that Ruby says mixins are not inheritance, yet I keep seeing mixins as multiple inheritance, so I fear I'm trying to categorize them too soon into my comfort zone and don't really grok what a mixin is.
They are usually used to add some form of standard functionality to a class, without having to redefine it all. You can probably think of them a bit like interfaces in Java, but instead of just defining a list of methods that need to be implemented, many of them will actually be implemented by including the module.
There are a few examples in the standard library:
Singleton - A module that can be mixed into any class to make it a singleton. The initialize method is made private, and an instance method added, which ensures that there is only ever one instance of that class in your application.
Comparable - If you include this module in a class, defining the <=> method, which compares the current instance with another object and says which is greater, is enough to provide <, <=, ==, >=, >, and between? methods.
Enumerable - By mixing in this module, and defining an each method, you get support for all the other related methods such as collect, inject, select, and reject. If it's also got the <=> method, then it will also support sort, min, and max.
DataMapper is also an interesting example of what can be done with a simple include statement, taking a standard class, and adding the ability to persist it to a data store.
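The Comparable and Enumerable entries from the standard library list can be illustrated with a short sketch. `Version` and `ReleaseList` are hypothetical classes invented for this example; each defines only the one method its mixin requires.

```ruby
class Version
  include Comparable
  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # The only method Comparable needs.
  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end

  def to_s
    "#{major}.#{minor}"
  end
end

class ReleaseList
  include Enumerable

  def initialize(*versions)
    @versions = versions
  end

  # The only method Enumerable needs.
  def each(&block)
    @versions.each(&block)
    self
  end
end

releases = ReleaseList.new(Version.new(1, 2), Version.new(0, 9), Version.new(1, 0))

puts Version.new(1, 0) < Version.new(1, 2)      # prints "true"
puts releases.sort.join(", ")                   # prints "0.9, 1.0, 1.2"
puts releases.max                               # prints "1.2"
puts releases.select { |v| v.major == 1 }.size  # prints "2"
```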
Well the usual example I think is Persistence
module Persistence
  def load sFileName
    puts "load code to read #{sFileName} contents into my_data"
  end
  def save sFileName
    puts "Uber code to persist #{@my_data} to #{sFileName}"
  end
end
class BrandNewClass
  include Persistence
  attr :my_data
  def data=(someData)
    @my_data = someData
  end
end
b = BrandNewClass.new
b.data = "My pwd"
b.save "MyFile.secret"
b.load "MyFile.secret"
Imagine the module is written by a Ruby ninja, which persists the state of your class to a file.
Now suppose I write a brand new class, I can reuse the functionality of persistence by mixing it in by saying include ModuleILike. You can even include modules at runtime. I get load and save methods for free by just mixing it in. These methods are just like the ones that you wrote yourself for your class. Code/Behavior/Functionality-reuse without inheritance!
So what you're doing is including methods to the method table for your class (not literally correct but close).
In ruby, the reason that Mixins aren't multiple-inheritance is that combining mixin methods is a one time thing. This wouldn't be such a big issue, except that Ruby's modules and classes are open to modification. This means that if you mixin a module to your class, then add a method to the module, the method will not be available to your class; where if you did it in the opposite order, it would.
It's like ordering an ice-cream cone. If you get chocolate sprinkles and toffee bits as your mixins, and walk away with your cone, what kind of ice cream cone you have won't change if someone adds multicolored sprinkles to the chocolate sprinkles bin back at the ice-cream shop. Your class, the ice cream cone, isn't modified when the mixin module, the bin of sprinkles is. The next person to use that mixin module will see the changes.
When you include a module in ruby, it calls Module#append_features on that module, which add a copy of that module's methods to the includer one time.
Multiple inheritance, as I understand it, is more like delegation. If your class doesn't know how to do something, it asks its parents. In an open-class environment, a class's parents may have been modified after the class was created.
It's like a RL parent-child relationship. Your mother might have learned how to juggle after you were born, but if someone asks you to juggle and you ask her to either: show you how (copy it when you need it) or do it for you (pure delegation), then she'll be able at that point, even though you were created before her ability to juggle was.
It's possible that you could modify a Ruby module 'include' to act more like multiple inheritance by modifying Module#append_features to keep a list of includers, and then to update them using the method_added callback, but this would be a big shift from standard Ruby, and could cause major issues when working with other people's code. You might be better off creating a Module#inherit method that calls include and handles delegation as well.
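Whether a method added to a module after it was included reaches the includer is easy to check directly. A minimal sketch with hypothetical names (`Greeter`, `Person`): on current MRI, `include` places the module in `Person.ancestors` rather than copying its methods, so the late method is found by ordinary method lookup.

```ruby
module Greeter
  def hello
    "hello"
  end
end

class Person
  include Greeter
end

# Reopen the module after it has already been mixed in.
module Greeter
  def late
    "added later"
  end
end

person = Person.new
puts Person.ancestors.include?(Greeter) # prints "true"
puts person.late                        # prints "added later"
```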
As for a real world example, Enumerable is awesome. If you define #each and include Enumerable in your class, then that gives you access to a whole host of iterators, without you having to code each and every one.
It is largely used as one might use multiple inheritance in C++ or implementing interfaces in Java/C#. I'm not sure where your experience lies, but if you have done those things before, mixins are how you would do them in Ruby. It's a systematic way of injecting functionality into classes.

Resources