I am experimenting with adding sorbet type information to my gem, pdf-reader. I don't want sorbet to be a runtime dependency for the gem, so all type annotations are in an external file in the rbi/ directory. I also can't extend T::Sig in my classes.
I'd like to enable typed: strict in some files, but doing so flags that I'm using some instance variables without type annotations:
./lib/pdf/reader/rectangle.rb:94: Use of undeclared variable @bottom_left https://srb.help/6002
94 | @bottom_left = PDF::Reader::Point.new(
^^^^^^^^^^^^
./lib/pdf/reader/rectangle.rb:98: Use of undeclared variable @bottom_right https://srb.help/6002
98 | @bottom_right = PDF::Reader::Point.new(
^^^^^^^^^^^^^
./lib/pdf/reader/rectangle.rb:102: Use of undeclared variable @top_left https://srb.help/6002
102 | @top_left = PDF::Reader::Point.new(
^^^^^^^^^
./lib/pdf/reader/rectangle.rb:106: Use of undeclared variable @top_right https://srb.help/6002
106 | @top_right = PDF::Reader::Point.new(
The proposed fix is to use T.let():
@top_right = T.let(PDF::Reader::Point.new(0,0), PDF::Reader::Point)
However I can't do that because it requires a runtime dependency on sorbet.
Is it possible to record annotations for instance variables in an rbi file?
According to the documentation "The syntax of RBI files is the same as normal Ruby files, except that method definitions do not need implementations." So, the syntax for declaring the type of an instance variable in an RBI file is the same as in a Ruby file:
sig do
  params(
    x1: Numeric,
    y1: Numeric,
    x2: Numeric,
    y2: Numeric
  ).void
end
def initialize(x1, y1, x2, y2)
  @top_right = T.let(PDF::Reader::Point.new(0,0), PDF::Reader::Point)
  # …
end
An alternative would be to use RBS syntax instead of RBI syntax, which supports type annotations for instance variables natively. However, I have found conflicting information on RBS support in Sorbet. There are claims online that Sorbet supports RBS. OTOH, the Sorbet FAQ talks about RBS support in the future tense. On the other other hand, the "future" the FAQ talks about is the release of Ruby 3, which is actually one year in the past.
In RBS, it would look something like this:
module PDF
  class Reader
    class Rectangle
      @top_left: PDF::Reader::Point
      @top_right: PDF::Reader::Point
      @bottom_left: PDF::Reader::Point
      @bottom_right: PDF::Reader::Point
      def initialize: (
        Numeric x1,
        Numeric y1,
        Numeric x2,
        Numeric y2
      ) -> void
    end
  end
end
Or, since they are also attr_readers, it might be enough to just do
module PDF
  class Reader
    class Rectangle
      attr_reader top_left: PDF::Reader::Point
      attr_reader top_right: PDF::Reader::Point
      attr_reader bottom_left: PDF::Reader::Point
      attr_reader bottom_right: PDF::Reader::Point
      def initialize: (
        Numeric x1,
        Numeric y1,
        Numeric x2,
        Numeric y2
      ) -> void
    end
  end
end
I believe this will also implicitly type the corresponding instance variables, but I have not tested this.
Motivation
Say we have a number of randomly sized numbers, generated with something like:
values = (0...10).map { |_| rand(1_000_000) }
We can determine the width required to display each of these numbers comfortably with the following:
width = values
  .map { |x| x.to_s.length } # digit count; unlike Math.log10, this also works when x is 0
  .max
and use this width in Kernel#printf like so:
values.each { |x| printf("%*d\n", width, x) }
taking advantage of the * flag to:
Use the next argument as the field width.
as per Kernel#format.
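As a quick illustration of the * flag, here is a minimal sketch. The sample values are made up and fixed so the result is reproducible, and format is used in place of printf so the padded strings can be inspected; both accept the same format string.

```ruby
# Hypothetical sample values, fixed instead of random for reproducibility.
values = [7, 42, 123_456]

# The widest value decides the field width.
width = values.map { |x| x.to_s.length }.max

# The * flag consumes the next argument (width) as the field width,
# then the following argument (x) is formatted by d.
lines = values.map { |x| format("%*d", width, x) }

puts lines
```

Each entry in lines is right-aligned to the width of the largest value.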
Issue
If we wanted to make use of both named references and the * flag inside the format string, we then start to run into issues. Attempting the following:
printf("%<value>*d\n", width, value: x)
raises the following error:
main.rb:7:in `printf': one hash required (ArgumentError)
and flipping the order of the arguments to:
printf("%<value>*d\n", value: x, width)
raises the following error:
main.rb:7: syntax error, unexpected ')', expecting =>
...%<value>*d\n", value: x, width) }
As a test, omitting width entirely (with no expectation of success):
printf("%<value>*d\n", value: x)
gives the following error:
main.rb:7:in `printf': unnumbered(1) mixed with named (ArgumentError)
Is it categorically currently not possible to use named references with the * flag in Ruby string formatting?
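One possible workaround, offered as a sketch rather than a definitive answer: if the width is known before formatting, it can be interpolated into the format string itself, so the * flag is not needed and only named references remain. The sample values and the name template are assumptions made for illustration.

```ruby
# Hypothetical sample values, fixed instead of random for reproducibility.
values = [7, 42, 123_456]
width = values.map { |x| x.to_s.length }.max

# Interpolate the width into the format string: "%<value>6d" for width 6.
# No * flag is involved, so only the named reference needs an argument.
template = "%<value>#{width}d"

padded = values.map { |x| format(template, value: x) }

puts padded
```

This sidesteps the restriction instead of lifting it: the width becomes part of the format string rather than an argument.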
Is there a way to overwrite a variable without repeating its name? For example, when I want to change my_var = '3' to an integer, I must do something like this:
my_var = my_var.to_i
Is there a way to do this without repeating the variable's name? I want to do something like this:
my_var = something_const.to_i
For numbers there are +=, -=, etc., but is there a universal way to do this for all methods?
There is no way to convert a string to an integer like that without repeating the variable name. Methods such as String#upcase! and Array#flatten! work by mutating the object; however, it is not possible to define a method like String#to_i! because that would mean converting the object to an instance of a different class.
For example, here is a (failed) attempt to define such a method:
# What I want to be able to do:
# my_var = "123"
# my_var.to_i! # => my_var == 123
class String
  def to_i!
    replace(Integer(self))
  end
end
my_var = "123"
my_var.to_i! # TypeError: no implicit conversion of Fixnum into String
...And even if this code were valid, it would still offer no performance gain since a new object is still being created.
As for your examples of += and -=, these are in fact simply shorthand for:
x += 1
# Is equivalent to:
x = x + 1
So again, there is no performance gain here either; just slightly nicer syntax. A good question to ask is, why doesn't Ruby support a ++ operator? If such an operator existed then it would offer a performance gain... But I'll let you research for yourself why this is missing from the language.
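To make the expansion of += observable, here is a small sketch. Counter is a hypothetical class invented for this illustration; its + returns a new object, the way Integer#+ does.

```ruby
# Hypothetical class, used only to make the call to + visible.
class Counter
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Returns a new Counter; the receiver is left untouched.
  def +(other)
    Counter.new(value + other)
  end
end

original = Counter.new(1)
x = original
x += 1 # expands to x = x + 1, so Counter#+ is called and x is rebound

puts x.value             # value held by the newly created object
puts original.value      # the first object was never mutated
puts x.equal?(original)  # whether x still refers to the first object
```

After the assignment, x points at a different object than original, which is why += offers nicer syntax but no saving in allocations.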
So to summarise,
is there a universal way to do this for all methods?
No. The special operators like +=, -=, |= and &= are all predefined; there is no "generalised" version such as method_name=.
You can also define methods that mutate the object, but only when appropriate. Such methods are usually named with a !, are called "bang-methods", and have a "non-bang" counterpart. On String objects, for example, there is String#capitalize! (and String#capitalize), String#delete! (and String#delete), String#encode! (and String#encode), ... but no String#to_i! for the reasons discussed above.
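The difference between a bang method and a conversion can be checked with object_id. This is a minimal sketch; the variable names are chosen here for illustration only.

```ruby
text = +"hello"            # unary + ensures the String is not frozen
id_before = text.object_id

text.capitalize!           # bang method: mutates the receiver in place
same_object = text.object_id == id_before

number = "123".to_i        # conversion: has to return a different object

puts text
puts same_object
puts number.class
```

The variable text still refers to the same String after capitalize!, whereas to_i cannot turn the receiver into an Integer and so has to return a new object that must be assigned somewhere.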
Assume I have implemented a Vector class. In C++ it is possible to do "scaling" in natural math expressions by overloading operator* at global scope:
template <typename T> // T can be int, double, complex<>, etc.
Vector operator*(const T& t, const Vector& v);
template <typename T> // T can be int, double, complex<>, etc.
Vector operator*(const Vector& v, const T& t);
However, in Ruby, since parameters are not typed, one can write:
class Vector
  def *(another)
    case another
    when Vector then ...
    when Numeric then ...
    end
  end
end
This allows Vector * Numeric, but not Numeric * Vector. Is there a way to solve this?
[Using Numeric rather than Numerical in my reply.]
The most general way to do this is to add a coerce method to Vector. When Ruby encounters 5 * your_vector, the call to 5.*(your_vector) cannot handle the Vector directly, so your_vector.coerce(5) is called instead. Your coerce method passes back two items, and the * method is retried on those items.
Conceptually, something like this happens after the 5.*(your_vector) failure:
first, second = your_vector.coerce(5)
first.*(second)
The simplest approach is to pass back your_vector as the first item and 5 as the second.
def coerce(other)
  case other
  when Numeric
    return self, other
  else
    raise TypeError, "#{self.class} can't be coerced into #{other.class}"
  end
end
That works for commutative operations, but not so well for non-commutative operations. If you have a simple, self-contained program that only needs * to work, you could get away with it. If you're developing a library or need something more generic, and it makes sense to transform 5 into a Vector, you can do that in coerce:
def coerce(other)
  case other
  when Numeric
    return Vector.new(other), self
  else
    raise TypeError, "#{self.class} can't be coerced into #{other.class}"
  end
end
This is a much more robust solution, if it makes semantic sense. If it doesn't make semantic sense, you can create an intermediate type that you can transform Numeric into that does know how to multiply with Vector. This is the approach that Matrix takes.
As a last resort, you can pull out the big guns and use alias_method to redefine * on Numeric to handle Vector. I'm not going to add the code for this approach, since doing it wrong will lead to disaster, and I haven't thought through any edge cases involved.
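Putting the first, simplest coerce approach together, a minimal self-contained sketch might look like the following. The Vector class body, its elements reader, and the constructor taking a list of numbers are assumptions made for illustration, not part of the question's code.

```ruby
# Hypothetical minimal Vector; only scaling by a number is implemented.
class Vector
  attr_reader :elements

  def initialize(*elements)
    @elements = elements
  end

  # Vector * Numeric scales every element.
  def *(other)
    case other
    when Numeric then Vector.new(*elements.map { |e| e * other })
    else raise TypeError, "can't multiply Vector by #{other.class}"
    end
  end

  # Called by Integer#* and Float#* when their argument is a Vector.
  # Swapping the operands is only safe because scaling is commutative.
  def coerce(other)
    case other
    when Numeric then [self, other]
    else raise TypeError, "#{self.class} can't be coerced into #{other.class}"
    end
  end
end

v = Vector.new(1, 2, 3)
left = (v * 2).elements     # Vector * Numeric, handled by Vector#*
right = (2 * v).elements    # Numeric * Vector, routed through coerce
halved = (0.5 * v).elements # Float receivers use the same mechanism

puts left.inspect
puts right.inspect
puts halved.inspect
```

Because coerce swaps the operands, the same trick would give the wrong answer for a non-commutative operator such as -, which is the limitation described above.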
I am writing an application which needs to find out the schema of a database, across engines. To that end, I am writing a small database adapter using Python. I decided to first write a base class that outlines the functionality I need, and then implement it using classes that inherit from this base. Along the way, I need to implement some constants which need to be accessible across all these classes. Some of these constants need to be combined using C-style bitwise OR.
My question is,
what is the standard way of sharing such constants?
what is the right way to create constants that can be combined? I am referring to MAP_FIXED | MAP_FILE | MAP_SHARED style code that C allows.
For the former, I came across threads where all the constants were put into a module first. For the latter, I briefly thought of using a dict of booleans. Both of these seemed too unwieldy. I imagine that this is a fairly common requirement, and think some good way must indeed exist!
what is the standard way of sharing such constants?
Throughout the standard library, the most common way is to define constants as module-level variables using UPPER_CASE_WITH_UNDERSCORES names.
what is the right way to create constants that can be combined? I am referring to MAP_FIXED | MAP_FILE | MAP_SHARED style code that C allows.
The same rules as in C apply. You have to make sure that each constant value corresponds to a single, unique bit, i.e. a power of 2 (1, 2, 4, 8, 16, ...).
Most of the time, people use hex numbers for this:
OPTION_A = 0x01
OPTION_B = 0x02
OPTION_C = 0x04
OPTION_D = 0x08
OPTION_E = 0x10
# ...
Some prefer a more human-readable style, computing the constant values dynamically using shift operators:
OPTION_A = 1 << 0
OPTION_B = 1 << 1
OPTION_C = 1 << 2
# ...
In Python, you could also use binary notation to make this even more obvious:
OPTION_A = 0b00000001
OPTION_B = 0b00000010
OPTION_C = 0b00000100
OPTION_D = 0b00001000
But since this notation is lengthy and hard to read, using hex or bit-shift notation is probably preferable.
Constants generally go at the module level. From PEP 8:
Constants
Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.
If you want constants at class level, define them as class attributes.
The stdlib is a great source of knowledge; an example of what you want can be found in the doctest code:
OPTIONS = {}
# A function to add (register) an option.
def register_option(name):
    return OPTIONS.setdefault(name, 1 << len(OPTIONS))
# A function to test whether an option is set.
def has_option(options, name):
    return bool(options & name)
# All my options are defined here.
FOO = register_option('FOO')
BAR = register_option('BAR')
FOOBAR = register_option('FOOBAR')
# Test which options are set in `ARG`.
ARG = FOO | BAR
print(has_option(ARG, FOO))
# True
print(has_option(ARG, BAR))
# True
print(has_option(ARG, FOOBAR))
# False
N.B.: The re module also uses bitwise flags for its arguments, if you want another example.
You often find constants at global level, and they are one of the few variables that exist up there. There are also people who write Constant namespaces using dicts or objects like this
class Const:
    x = 33
Const.x
There are some people who put them in modules and others who attach them as class variables that instances access. Most of the time it's personal taste, but just a few global variables can't really hurt that much.
Naming is usually UPPERCASE_WITH_UNDERSCORE, and they are usually module level but occasionally they live in their own class. One good reason to be in a class is when the values are special -- such as needing to be powers of two:
class PowTwoConstants(object):
    def __init__(self, items):
        self.names = items
        enum = 1
        for name in items:
            setattr(self, name, enum)
            enum <<= 1
constants = PowTwoConstants('ignore_case multiline newline'.split())
print(constants.newline)  # prints 4
If you want to be able to export those constants to module level (or any other namespace) you can add the following to the class:
    def export(self, namespace):
        for name in self.names:
            setattr(namespace, name, getattr(self, name))
and then
import sys
constants.export(sys.modules[__name__])
In the case of e.g. ddddd, d is the native format for the system, so I can't know exactly how big it will be.
In Python I can do:
import struct
print(struct.calcsize('ddddd'))
Which will return 40.
How do I get this in Ruby?
I haven't found a built-in way to do this, but I've had success with this small function when I know I'm dealing with only numeric formats:
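def calculate_size(format)
  # Only for numeric formats; String formats will raise a TypeError.
  # Each directive is a letter, optional modifiers, and an optional
  # repeat count (e.g. "d5" or "d10"); a missing count means one element.
  elements = format.scan(/[a-zA-Z][_!<>]*(\d*)/).sum do |(count)|
    count.empty? ? 1 : count.to_i
  end
  ([0] * elements).pack(format).bytesize
end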
This constructs an array of the proper number of zeros, calls pack() with your format, and returns the length (in bytes). Zeros work in this case because they're convertible to each of the numeric formats (integer, double, float, etc).
I don't know of a shortcut but you can just pack one and ask how long it is:
length_of_five_packed_doubles = 5 * [1.0].pack('d').length
By the way, a Ruby array combined with the pack method appears to be functionally equivalent to Python's struct module. Ruby pretty much copied Perl's pack and unpack and exposed them as Array#pack and String#unpack.
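As a rough illustration of that equivalence, the sketch below packs five doubles, measures the result, and unpacks it again. The sample values are made up, and the size in the comment assumes a platform where a native double is 8 bytes.

```ruby
# Hypothetical sample data: five doubles.
values = [1.5, 2.5, 3.5, 4.5, 5.5]

packed = values.pack("d5")        # five doubles in native format
size = packed.bytesize            # plays the role of struct.calcsize("ddddd")
round_trip = packed.unpack("d5")  # plays the role of struct.unpack

puts size                # 40 where a native double is 8 bytes
puts round_trip.inspect
```

Using bytesize rather than length makes it explicit that the result is counted in bytes; for the binary string returned by pack, both give the same number.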