Consider this silly enum:
enum Number {
    Rational {
        numerator: i32,
        denominator: std::num::NonZeroU32,
    },
    FixedPoint {
        whole: i16,
        fractional: u16,
    },
}
The data in the Rational variant takes up 8 bytes, and the data in the FixedPoint variant takes up 4 bytes. The Rational variant has a field which must be nonzero, so I would hope that the enum layout rules would use that as a discriminator, with zero indicating the presence of the FixedPoint variant.
However, this:
fn main() {
    println!("Number = {}", std::mem::size_of::<Number>());
}
Prints:
Number = 12
So, the enum gets space for an explicit discriminator, rather than exploiting the presence of the nonzero field.
Why isn't the compiler able to make this enum smaller?
Although simple cases like Option<&T> can be handled without reserving space for the tag, the layout calculator in rustc is still not clever enough to optimize the size of enums with multiple non-empty variants.
This is issue #46213 on GitHub.
The case you ask about is pretty clear-cut, but there are similar cases where an enum looks like it should be optimized, but in fact can't be because the optimization would preclude taking internal references; for example, see Why does Rust use two bytes to represent this enum when only one is necessary?
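For a quick way to see both behaviours side by side, here is a small standalone sketch (standard library only); the exact sizes printed are just what the current compiler happens to produce and are not guaranteed by the layout rules:
use std::num::NonZeroU32;

enum Number {
    Rational {
        numerator: i32,
        denominator: NonZeroU32,
    },
    FixedPoint {
        whole: i16,
        fractional: u16,
    },
}

fn main() {
    // Single-variant niches are exploited: no space is reserved for a tag here.
    assert_eq!(std::mem::size_of::<Option<&u8>>(), std::mem::size_of::<&u8>());
    assert_eq!(std::mem::size_of::<Option<NonZeroU32>>(), std::mem::size_of::<NonZeroU32>());
    // ...but the enum from the question still gets an explicit discriminator.
    println!("Number = {}", std::mem::size_of::<Number>()); // prints 12 today
}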
I'm using bitvec_simd = "0.20" for bitvector operations in rust.
I have two instances of a struct, call them clique_into and clique_from. The relevant fields of the struct are two bitvectors members_bv and neighbors_bv, as well as a vector of integers which is called members. The members_bv and members vector represent the same data.
After profiling my code, I find that this is my bottleneck (41% of the time is spent here): checking whether the members (typically 1) of clique_from are all neighbors of clique_into.
My current approach is to loop through the members of clique_from (typically 1) and check each one in turn to see if it's a neighbor of clique_into.
Here's my code:
use bitvec_simd::BitVec;
use smallvec::{smallvec, SmallVec};

struct Clique {
    members_bv: BitVec,
    members: SmallVec<[usize; 256]>,
    neighbors_bv: BitVec,
}

fn are_cliques_mergable(clique_into: &Clique, clique_from: &Clique) -> bool {
    for i in 0..clique_from.members.len() {
        if !clique_into.neighbors_bv.get_unchecked(clique_from.members[i]) {
            return false;
        }
    }
    return true;
}
That code works fine and it's fast, but is there a way to make it faster? We can assume that clique_from almost always has a single member so the inner for loop is almost always executed once.
It likely comes down to this:
if !clique_into.neighbors_bv.get_unchecked(clique_from.members[i])
Is get_unchecked() the fastest way to do this? While I have written this so it will never panic, the compiler doesn't know that. Does this force Rust to waste time checking if it should panic?
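For comparison, here is a minimal iterator-based sketch of the same check, assuming the bitvec_simd 0.20 API used above (get_unchecked(index) -> bool). It avoids the bounds check on members[i] by iterating over members directly, but whether that is measurably faster would need to be confirmed by profiling:
// Logically equivalent to the loop above; `m` is each member index of clique_from.
fn are_cliques_mergable(clique_into: &Clique, clique_from: &Clique) -> bool {
    clique_from
        .members
        .iter()
        .all(|&m| clique_into.neighbors_bv.get_unchecked(m))
}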
I have an enum type in my Rust program of which some variants may contain inner data.
enum MyEnum {
    A,
    B(u64),
    C(SmallStruct),
    D(Box<LargeStruct>),
}
This enum is going to be stored tens of thousands of times and memory usage is an issue. I would like to avoid accidentally adding a very large variant for the enum. Is there a way that I can tell the compiler to limit the size of an enum instance in memory?
As of Rust 1.57 you can use assert! in a const context, so this kind of check will work:
// assert that MyEnum is no larger than 16 bytes
const _ASSERT_SMALL: () = assert!(mem::size_of::<MyEnum>() <= 16);
Original answer follows for historical reference.
As noted in the other answer, you can use the const_assert! macro, but it will require an external crate, static_assertions. If you're looking for a std-only solution and can live with the uglier error message when the assertion fails, you can use this:
#[deny(const_err)]
const fn const_assert(ok: bool) {
    0 - !ok as usize;
}

// assert that MyEnum is no larger than 16 bytes
const _ASSERT_SMALL: () = const_assert(mem::size_of::<MyEnum>() <= 16);
You can read about this technique, along with ways to improve it, in the article written by the author of the static_assertions crate.
EDIT: The link to the original article is no longer functional; see the web archive version.
You could use const_assert! and mem::size_of to assert that your enum is less than or equal to a certain size.
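For completeness, a minimal sketch of that approach with the static_assertions crate as a dependency (the enum here is a simplified stand-in, since SmallStruct and LargeStruct are not defined in the question):
use static_assertions::const_assert;
use std::mem;

enum MyEnum {
    A,
    B(u64),
}

// assert that MyEnum is no larger than 16 bytes
const_assert!(mem::size_of::<MyEnum>() <= 16);

fn main() {}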
If I define the following enums, Nil does not increase the size of the enum:
use std::mem::size_of;

enum Foo {
    Cons(~char)
}

enum Bar {
    Cons(~char),
    Nil
}

println!("{}", size_of::<Foo>());
println!("{}", size_of::<Bar>());
// -> 4
// -> 4
On the other hand:
enum Foo {
    Cons(char)
}

enum Bar {
    Cons(char),
    Nil
}
Yields:
// -> 4
// -> 8
What is happening when I define an enum? How is memory being allocated for these structures?
A naive approach to enums is to allocate enough space for the contents of the largest variant, plus a discriminant. This is a standard tagged union.
Rust is a little cleverer than this. (It could be a lot cleverer, but it is not at present.) It knows that given a ~T, there is at least one value that that memory location cannot be: zero. And so in a case like your enum { Cons(~T), Nil }, it is able to optimise it down to one word, with any non-zero value in memory meaning Cons(~T) and a zero value in memory meaning Nil.
When you deal with char, that optimisation cannot occur: zero is a valid code point. As it happens, char is defined as being a Unicode code point, so it would actually be possible to optimise the discriminant into that space, there being plenty of spare bits at the end (a Unicode code point only needs 21 bits, so in a 32-bit space we have eleven spare bits). This is a demonstration of the fact that Rust's enum discriminant optimisation is not especially clever at present.
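(~char is the pre-1.0 spelling of today's Box<char>.) Here is a small sketch of the same experiment in current Rust; note that the remarks about char above describe an older compiler, and recent versions do exploit char's invalid code points as a niche:
use std::mem::size_of;

enum Foo {
    Cons(Box<char>),
}

enum Bar {
    Cons(Box<char>),
    Nil,
}

fn main() {
    // Box<char> is a non-null pointer, so Nil can reuse the zero bit pattern
    // and no separate discriminant is stored.
    println!("{}", size_of::<Foo>()); // 8 on a 64-bit target
    println!("{}", size_of::<Bar>()); // 8 on a 64-bit target
    // char values above 0x10FFFF are invalid, and newer compilers use that
    // gap as a niche, so this is 4 rather than 8.
    println!("{}", size_of::<Option<char>>()); // 4
}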
As I mentioned in the subject of this post, I found out the hard way that OOP is slower than structural programming (spaghetti code).
I wrote a simulated annealing program with OOP, then removed one class and wrote it structurally in the main form. Suddenly it got much faster. I was calling my removed class in every iteration in the OOP program.
I also checked it with Tabu Search. Same result.
Can anyone tell me why this is happening, and how I can fix it in other OOP programs?
Are there any tricks? For example, caching my classes or something like that?
(The programs have been written in C#.)
If you have a high-frequency loop, and inside that loop you create new objects and don't call other functions very much, then, yes, you will see that if you can avoid those calls to new (say, by re-using one copy of the object), you can save a large fraction of total time.
Between new, constructors, destructors, and garbage collection, a very little code can waste a whole lot of time.
Use them sparingly.
Memory access is often overlooked. The way o.o. tends to lay out data in memory is not conducive to efficient memory access in practice in loops. Consider the following pseudocode:
adult_clients = 0
for client in list_of_all_clients:
    if client.age >= AGE_OF_MAJORITY:
        adult_clients++
It so happens that the way this is accessed from memory is quite inefficient on modern architectures, because they like reading large contiguous runs of memory; but we only care about client.age for each of the clients we have, and those fields will not be laid out in contiguous memory.
Focusing on objects that have fields results in data being laid out in memory such that fields holding the same type of information are not placed in consecutive memory. Performance-heavy code tends to involve loops that repeatedly look at data with the same conceptual meaning, and it is conducive to performance that such data be laid out in contiguous memory.
Consider these two examples in Rust:
// struct that contains an id, and an optional flag for whether the id is divisible by three
struct Foo {
    id: u32,
    divbythree: Option<bool>,
}

fn main() {
    // create a pretty big vector of these structs with increasing ids, and divbythree initialized as None
    let mut vec_of_foos: Vec<Foo> = (0..100000000).map(|i| Foo { id: i, divbythree: None }).collect();

    // loop over all these structs, determine if the id is divisible by three,
    // and set divbythree accordingly
    let mut divbythrees = 0;
    for foo in vec_of_foos.iter_mut() {
        if foo.id % 3 == 0 {
            foo.divbythree = Some(true);
            divbythrees += 1;
        } else {
            foo.divbythree = Some(false);
        }
    }

    // print the number of times it was divisible by three
    println!("{}", divbythrees);
}
On my system, the real time with rustc -O is 0m0.436s; now let us consider this example:
fn main() {
    // this time we create two vectors rather than a vector of structs
    let vec_of_ids: Vec<u32> = (0..100000000).collect();
    let mut vec_of_divbythrees: Vec<Option<bool>> = vec![None; vec_of_ids.len()];

    // but we basically do the same thing
    let mut divbythrees = 0;
    for i in 0..vec_of_ids.len() {
        if vec_of_ids[i] % 3 == 0 {
            vec_of_divbythrees[i] = Some(true);
            divbythrees += 1;
        } else {
            vec_of_divbythrees[i] = Some(false);
        }
    }
    println!("{}", divbythrees);
}
This runs in 0m0.254s at the same optimization level, close to half the time needed.
Despite having to allocate two vectors instead of one, storing similar values in contiguous memory has almost halved the execution time. Though obviously the o.o. approach provides for much nicer and more maintainable code.
P.S.: it occurs to me that I should probably explain why this matters so much, given that the code in both cases still accesses memory one field at a time rather than, say, putting a large swath on the stack. The reason is c.p.u. caches: when the program asks for the memory at a certain address, it actually obtains, and caches, a significant chunk of memory around that address, and if memory next to it is requested again soon, it can be served from the cache rather than from actual physical working memory. Of course, compilers will also vectorize the second version more efficiently as a consequence.
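If you want to see the cache effect in isolation, here is a rough sketch (an illustration, not a rigorous benchmark): it touches either every byte of a large buffer or only one byte per 64-byte cache line, and the strided pass typically runs nowhere near 64 times faster, because whole cache lines are fetched from memory either way.
use std::time::Instant;

fn main() {
    // 256 MiB of ones, large enough not to fit in any CPU cache.
    let data = vec![1u8; 256 * 1024 * 1024];

    let t = Instant::now();
    let every_byte: u64 = data.iter().map(|&b| b as u64).sum();
    println!("every byte:        sum = {}, took {:?}", every_byte, t.elapsed());

    let t = Instant::now();
    let one_per_line: u64 = data.iter().step_by(64).map(|&b| b as u64).sum();
    println!("one byte per line: sum = {}, took {:?}", one_per_line, t.elapsed());
}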
There are a few points in the tutorial that sort of leave you on your own, without a clue or a link, if you're not already in the know. So I'm sorry about the length of these:
http://tour.golang.org/#15
Try printing needInt(Big) too
I'm guessing ints are allowed less bits than constants?
http://tour.golang.org/#21
the { } are required.
(Sound familiar?)
Which language is alluded to?
http://tour.golang.org/#25
(And a type declaration does what you'd expect.)
Why do we need the word type and the word struct? What was I supposed to expect?
http://tour.golang.org/#28
Why implicit zeroes in the constructor? This sounds like a dangerous design choice by Go. Is there a PEP or anything beyond http://golang.org/doc/go_faq.html on this?
http://tour.golang.org/#30
Make? Are there constructors? What's the difference between new and make?
http://tour.golang.org/#33
Where did delete come from? I didn't import it.
http://tour.golang.org/#36
What's the %v formatter stand for? Value?
http://tour.golang.org/#47
panic: runtime error: index out of range
goroutine 1 [running]:
tour/pic.Show(0x400c00, 0x40ca61)
go/src/pkg/tour/pic/pic.go:24 +0xd4
main.main()
/tmpfs/gosandbox-15c0e483_5433f2dc_ff6f028f_248fd0a7_d7c2d35b/prog.go:14 +0x25
I guess I broke go somehow....
package main

import "tour/pic"

func Pic(dx, dy int) [][]uint8 {
    image := make([][]uint8, 10)
    for i := range image {
        image[i] = make([]uint8, 10)
    }
    return image
}

func main() {
    pic.Show(Pic)
}
http://tour.golang.org/#59
I return error values when a function fails? I have to qualify every single function call with an error check? The flow of the program is uninterrupted when I write crazy code? E.g. Copy(only_backup, elsewhere);Delete(only_backup) and Copy fails....
Why would they design it like that?
#15:
I'm guessing int's are allowed less bits than constants?
Yes, exactly. According to the spec, "numeric constants represent values of arbitrary precision and do not overflow", whereas type int has either 32 or 64 bits.
#21:
Which language is alluded to?
None; it's alluding to #16, which says the same thing, in the same words, about for-loops.
#25:
The phrase a type declaration does what you'd expect is a little unfortunate, I agree (it assumes too much about what a reader would expect), but it means you're defining a struct (with the struct keyword) and binding the type name "Vertex" to it (with the type Vertex part); see http://golang.org/ref/spec#Type_declarations
#28:
The fact that uninitialized structs are zeroed is really useful in many cases (many standard structs, like buffers, rely on it too).
It's not implicit in the constructor only. Look at this:
var i int; fmt.Println(i)
This prints out 0. This is similar to something like Java, where primitive types have an implicit default value: booleans are false, integers are zero, etc. The spec on zero values.
#30:
new allocates memory and returns a pointer to it, while make is a special function used only for slices, maps, and channels.
See http://golang.org/doc/effective_go.html#allocation_new for a more in-depth explanation of make vs new
#33:
delete, like append or copy, is one of the built-in functions of the language. See the full list of predeclared identifiers at: http://golang.org/ref/spec#Predeclared_identifiers
#36:
Yes, %v stands for "value". See http://golang.org/pkg/fmt/
#47:
try with this:
func Pic(dx, dy int) [][]uint8 {
    image := make([][]uint8, dy) // dy, not 10
    for x := range image {
        image[x] = make([]uint8, dx) // dx, not 10
        for y := range image[x] {
            image[x][y] = uint8(x * y) // let's try one of the mentioned
            // "interesting functions"
        }
    }
    return image
}
#59:
The language's design and conventions encourage you to explicitly
check for errors where they occur (as distinct from the convention in
other languages of throwing exceptions and sometimes catching them).
In some cases this makes Go code verbose, but fortunately there are
some techniques you can use to minimize repetitive error handling.
(quoted from Error handling and Go )
I'm guessing int's are allowed less bits than constants?
Yes. Numeric constants in Go are high-precision values; an int in any language doesn't have anywhere near that precision.
Which language is alluded to?
No clue but it is backwards from C and Java where ( ) is required and { } is optional.
Why do we need the word type and the word struct? What was I supposed to expect?
If you're familiar with C, then it does what you'd expect.
Why implicit zeroes in the constructor?
It's not implicit in the constructor only. Look at this:
var i int
fmt.Println(i)
This prints out 0. This is similar to something like Java, where primitive types have an implicit default value: booleans are false, integers are zero, etc.
Make? Are there constructors? What's the difference between new and make?
make accepts additional parameters for initializing the size of a slice, map, or channel. new, on the other hand, just allocates a zeroed value of the type and returns a pointer to it.
type Data struct {}
// both d1 and d2 are pointers
d1 := new(Data)
d2 := &Data{}
As for "are there constructors?": only if you define them yourself. This is how one normally implements a constructor in Go:
type Data struct {}

func NewData() *Data {
    return new(Data)
}
What's the %v formatter stand for? Value?
Yep
I return error values when a function fails? ... Why would they design it like that?
I felt the same way at first. My opinion has changed, though. You can ignore errors from the standard library if you like and not bother with them yourself, but once I got a handle on it, I found I have better (and more readable) error checking.
What I can say is that when I was doing it wrong, it felt like repetitive, unnecessary error handling. When I finally started doing it right... well, see what I just said above.