I was reading an answer to stackoverflow question and tried to modify the function history to take IntoIter where item can be anything that can be transformed into reference and has some traits Debug in this case.
If I will remove V: ?Sized from the function definition rust compiler would complain that it doesn't know the size of str at compile time.
use std::fmt::Debug;
pub fn history<I: IntoIterator, V: ?Sized>(i: I) where I::Item: AsRef<V>, V: Debug {
for s in i {
println!("{:?}", s.as_ref());
}
}
fn main() {
history::<_, str>(&["st", "t", "u"]);
}
I don't understand why compiler shows error in the first place and not sure why the program is working properly if I kind of cheat with V: ?Sized.
I kind of cheat with V: ?Sized
It isn't cheating. All generic arguments are assumed to be Sized by default. This default is there because it's the most common case - without it, nearly every type parameter would have to be annotated with : Sized.
In your case, V is only ever accessed by reference, so it doesn't need to be Sized. Relaxing the Sized constraint makes your function as general as possible, allowing it to be used with the most possible types.
The type str is unsized, so this is not just about generalisation, you actually need to relax the default Sized constraint to be able to use your function with str.
Related
Code I'm exploring:
type Stack struct {
length int
values []int
}
func (s *Stack) Push(value int) {
// ...
}
func (s *Stack) Pop() int {
// ...
}
func (s *Stack) Length() int {
return s.length
}
Methods Push and Pop change the length field in Stack struct. And I wanted to hide this field from other files to prevent code like stack.length = ... (Manual length change). But I was need to have ability to read this field, so I added getter method - Length.
And my question is:
Shouldn't stack.Length() become slower than stack.length, because it is a function call? I have learnt assembler a bit and I know how many operations program should do to call a function. Have I understand right: By adding getter method stack.Length() I protected those who use my lib from bad usage but the cost of it - program's performance? This actually concerns not only Go.
Shouldn't stack.Length() become slower than stack.length, because it is a function call?
Objection! Assumes facts not in evidence.
Specifically:
Why do you think it is a function call? It looks like one, but actual Go compilers will often expand the code in line.
Why do you think a function call is slower than inline code? When measuring actual programs on actual computers, sometimes function calls are faster than inline code. It turns out the crucial part is usually whether the instructions being executed, and their operands, are already in the appropriate CPU caches. Sometimes, expanding functions inline makes the program run more slowly.
The compiler should do the inline expansion unless it makes the program run more slowly. How good the compiler is at pre- or post-detecting such slowdowns, if present, is a separate issue. In this particular case, given the function definition, the compiler is almost certain to just expand the function in line, as accessing stack.length will likely be one instruction, and calling a function will be one instruction, and deciding the tradeoff here will be easy.
This question already has an answer here:
"overflow while adding drop-check rules" while implementing a fingertree
(1 answer)
Closed 4 years ago.
Here is a data structure I can write down and which is accepted by the Rust compiler:
pub struct Pair<S, T>(S, T);
pub enum List<T> {
Nil,
Cons(T, Box<List<Pair<i32, T>>>),
}
However, I cannot write
let x: List<i32> = List::Nil;
playground
as Rust will complain about an "overflow while adding drop-check rules".
Why shouldn't it be possible to instantiate List::Nil?
It should be noted that the following works:
pub struct Pair<S, T>(S, T);
pub enum List<T> {
Nil,
Cons(T, Box<List<T>>),
}
fn main() {
let x: List<i32> = List::Nil;
}
playground
When the type hasn't been instantiated, the compiler is mostly worried about the size of the type being constant and known at compile-time so it can be placed on the stack. The Rust compiler will complain if the type is infinite, and quite often a Box will fix that by creating a level of indirection to a child node, which is also of known size because it boxes its own child too.
This won't work for your type though.
When you instantiate List<T>, the type of the second argument of the Cons variant is:
Box<List<Pair<i32, T>>>
Notice that the inner List has a type argument Pair<i32, T>, not T.
That inner list also has a Cons, whose second argument has type:
Box<List<Pair<i32, Pair<i32, T>>>>
Which has a Cons, whose second argument has type:
Box<List<Pair<i32, Pair<i32, Pair<i32, T>>>>>
And so on.
Now this doesn't exactly explain why you can't use this type. The size of the type will increase linearly with how deeply it is within the List structure. When the list is short (or empty) it doesn't reference any complex types.
Based on the error text, the reason for the overflow is related to drop-checking. The compiler is checking that the type is dropped correctly, and if it comes across another type in the process it will check if that type is dropped correctly too. The problem is that each successive Cons contains a completely new type, which gets bigger the deeper you go, and the compiler has to check if each of these will be dropped correctly. The process will never end.
I noticed that Rust's Vec::len method just accesses the vector's len property. Why isn't len just a public property, rather than wrapping a method around it?
I assume this is so that in case the implementation changes in the future, nothing will break because Vec::len can change the way it gets the length without any users of Vec knowing, but I don't know if there are any other reasons.
The second part of my question is about when I'm designing an API. If I am building my own API, and I have a struct with a len property, should I make len private and create a public len() method? Is it bad practice to make fields public in Rust? I wouldn't think so, but I don't notice this being done often in Rust. For example, I have the following struct:
pub struct Segment {
pub dol_offset: u64,
pub len: usize,
pub loading_address: u64,
pub seg_type: SegmentType,
pub seg_num: u64,
}
Should any of those fields be private and instead have a wrapper function like Vec does? If so, then why? Is there a good guideline to follow for this in Rust?
One reason is to provide the same interface for all containers that implement some idea of length. (Such as std::iter::ExactSizeIterator.)
In the case of Vec, len() is acting like a getter:
impl<T> Vec<T> {
pub fn len(&self) -> usize {
self.len
}
}
While this ensures consistency across the standard library, there is another reason underlying this design choice...
This getter protects from external modification of len. If the condition Vec::len <= Vec::buf::cap is not ever satisfied, Vec's methods may try to access memory illegally. For instance, the implementation of Vec::push:
pub fn push(&mut self, value: T) {
if self.len == self.buf.cap() {
self.buf.double();
}
unsafe {
let end = self.as_mut_ptr().offset(self.len as isize);
ptr::write(end, value);
self.len += 1;
}
}
will attempt to write to memory past the actual end of the memory owned by the container. Because of this critical requirement, modification to len is forbidden.
Philosophy
It's definitely good to use a getter like this in library code (crazy people out there might try to modify it!).
However, one should design their code in a manner that minimizes the requirement of getters/setters. A class should act on its own members as much as possible. These actions should be made available to the public through methods. And here I mean methods that do useful things -- not just a plain ol' getter/setter that returns/sets a variable. Setters in particular can be made redundant through the use of constructors or methods. Vec shows us some of these "setters":
push
insert
pop
reserve
...
Thus, Vec implements algorithms that provide access to the outside world. But it manages its innards by itself.
The Vec struct looks something like this[1]:
pub struct Vec<T> {
ptr: *mut T,
capacity: usize,
len: usize,
}
The idea is that ptr points at a block of allocated memory of size capacity. If the size of the Vec needs to be bigger than the capacity then new memory is allocated. The unused portion of the allocated memory is uninitialised and could contain arbitrary data.
When you call mutating methods on Vec like push or pop, they carefully manage the Vec's internal state, increase capacity when necessary, and ensure that items that are removed are properly dropped.
If len was a public field, any code with an owned Vec, or a mutable reference to one, could set len to any value. Set it higher than it should be and you'll be able to read from uninitialised memory, causing Undefined Behaviour. Set it lower and you'll be effectively removing elements without properly dropping them.
In some other programming languages (e.g. JavaScript) the API for arrays or vectors specifically lets you change the size by setting a length property. It's not unreasonable to think that a programmer who is used to that approach could do this accidentally in Rust.
Keeping all the fields private and using a getter method for len() allows Vec to protect the mutability of its internals, make strong memory guarantees and prevent users from accidentally doing bad things to themselves.
[1] In practice, there are abstraction layers built over this data structure, so it looks a little different.
The Rust documentation gives this example where we have an instance of Result<T, E> named some_value:
match some_value {
Ok(value) => println!("got a value: {}", value),
Err(_) => println!("an error occurred"),
}
Is there any way to read from some_value without pattern matching? What about without even checking the type of the contents at runtime? Perhaps we somehow know with absolute certainty what type is contained or perhaps we're just being a bad programmer. In either case, I'm just curious to know if it's at all possible, not if it's a good idea.
It strikes me as a really interesting language feature that this branch is so difficult (or impossible?) to avoid.
At the lowest level, no, you can't read enum fields without a match1.
Methods on an enum can provide more convenient access to data within the enum (e.g. Result::unwrap), but under the hood, they're always implemented with a match.
If you know that a particular case in a match is unreachable, a common practice is to write unreachable!() on that branch (unreachable!() simply expands to a panic!() with a specific message).
1 If you have an enum with only one variant, you could also write a simple let statement to deconstruct the enum. Patterns in let and match statements must be exhaustive, and pattern matching the single variant from an enum is exhaustive. But enums with only one variant are pretty much never used; a struct would do the job just fine. And if you intend to add variants later, you're better off writing a match right away.
enum Single {
S(i32),
}
fn main() {
let a = Single::S(1);
let Single::S(b) = a;
println!("{}", b);
}
On the other hand, if you have an enum with more than one variant, you can also use if let and while let if you're interested in the data from a single variant only. While let and match require exhaustive patterns, if let and while let accept non-exhaustive patterns. You'll often see them used with Option:
fn main() {
if let Some(x) = std::env::args().len().checked_add(1) {
println!("{}", x);
} else {
println!("too many args :(");
}
}
If you look at the image package here http://golang.org/src/pkg/image/image.go you can see that the implementation of Opaque() for every image does the same thing, differing only in the pixel-specific logic.
Is there a reason for this? Would any general solution be less efficient? Is it just an oversight? Is there some limitation (I cannot see one) to the type system that would make a polymorphic [was: generic] approach difficult?
[edit] The kind of solution I was thinking of (which does not need generics in the Java sense) would be like:
type ColorPredicate func(c image.Color) bool;
func AllPixels (p *image.Image, q ColorPredicate) bool {
var r = p.Bounds()
if r.Empty() {
return true
}
for y := r.Min.Y; y < r.Max.Y; y++ {
for x := r.Min.X; x < r.Max.X; x++ {
if ! q(p.At(x,y)) {
return false
}
}
}
return true
}
but I am having trouble getting that to compile (still very new to Go - it will compile with an image, but not with an image pointer!).
Is that too hard to optimise? (you would need to have function inlining, but then wouldn't any type checking be pulled out of the loop?). Also, I now realise I shouldn't have used the word "generic" earlier - I meant it only in a generic (ha) way.
There is a limitation to the type system which prevents a general solution (or at least makes it very inefficient).
For example, the bodies of RGBA.Opaque and NRGBA.Opaque are identical, so you'd think that they could be factored out into a third function with a signature something like this:
func opaque(pix []Color, stride int, rect Rectangle) bool
You'd like to call that function this way:
func (p *RGBA) Opaque() bool {
return opaque([]Color(p.Pix), p.Stride, p.Rect)
}
But you can't. p.Pix can't be converted to []Color because those types have different in-memory representations and the spec forbids it. We could allocate a new slice, convert each individual element of p.Pix, and pass that, but that would be very inefficient.
Observe that RGBAColor and NRGBAColor have the exact same structure. Maybe we could factor out the function for just those two types, since the in-memory representation of the pixel slices is exactly the same:
func opaque(pix []RGBAColor, stride int, rect Rectangle) bool
func (p *NRGBA) Opaque() bool {
return opaque([]RGBAColor(p.Pix), p.Stride, p.Rect)
}
Alas, again this isn't allowed. This seems to be more of a spec/language issue than a technical one. I'm sure this has come up on the mailing list before, but I can't find a good discussion of it.
This seems like an area where generics would come in handy, but there's no solution for generics in Go yet.
Why does Go not have generic
types?
Generics may well be added at some
point. We don't feel an urgency for
them, although we understand some
programmers do.
Generics are convenient but they come
at a cost in complexity in the type
system and run-time. We haven't yet
found a design that gives value
proportionate to the complexity,
although we continue to think about
it. Meanwhile, Go's built-in maps and
slices, plus the ability to use the
empty interface to construct
containers (with explicit unboxing)
mean in many cases it is possible to
write code that does what generics
would enable, if less smoothly.
This remains an open issue.