Can I limit the size of a Rust enum?

I have an enum type in my Rust program of which some variants may contain inner data.
enum MyEnum {
    A,
    B(u64),
    C(SmallStruct),
    D(Box<LargeStruct>),
}
This enum is going to be stored tens of thousands of times and memory usage is an issue. I would like to avoid accidentally adding a very large variant for the enum. Is there a way that I can tell the compiler to limit the size of an enum instance in memory?

As of Rust 1.57 you can use asserts in a const context, so this kind of check will work:
// assert that MyEnum is no larger than 16 bytes
const _: () = assert!(std::mem::size_of::<MyEnum>() <= 16);
The original answer follows for historical reference.
As noted in the other answer, you can use the const_assert! macro, but it will require an external crate, static_assertions. If you're looking for a std-only solution and can live with the uglier error message when the assertion fails, you can use this:
use std::mem;

#[deny(const_err)]
const fn const_assert(ok: bool) {
    0 - !ok as usize; // underflows (a const error) when `ok` is false
}

// assert that MyEnum is no larger than 16 bytes
const _ASSERT_SMALL: () = const_assert(mem::size_of::<MyEnum>() <= 16);
You can read about this technique, along with ways to improve it, in the article written by the author of the static_assertions crate.
EDIT: The link to the original article is no longer functional; a web archive version is available.

You could use const_assert! and mem::size_of to assert that your enum is less than or equal to a certain size.
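For example, a minimal sketch assuming static_assertions is listed in Cargo.toml and MyEnum is in scope:
use static_assertions::const_assert;
use std::mem;

// Compilation fails if MyEnum ever grows beyond 16 bytes.
const_assert!(mem::size_of::<MyEnum>() <= 16);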

Related

What are the side effects of using structs to group constants

I know we can't use a struct as a constant in Go.
But I would like to group my constants for cleaner code, instead of having many constants with a common prefix, so I am doing this:
var SomeStatus = struct {
    Active   int
    Inactive int
    Others   int
}{
    Active:   1,
    Inactive: 2,
    Others:   3,
}
// usage example
status := SomeStatus.Active
// example with some prefix
const StatusActive = 1
const StatusInactive = 2
const StatusOthers = 3
const OtherConstantVariable = 1
...
I know that, since these are variables rather than constants, their values can be overwritten.
What is the other side effect of this trick?
What is the other side effect of this trick?
These are the ones I can think of; there may be others:
It's less efficient, as variables allocate memory at runtime.
Any values that could have been pre-computed at compile time as constants will now be calculated at runtime.
For exported symbols, it opens you up to modification at runtime by anyone who imports your package.
It's not idiomatic, so it will potentially confuse anyone who reads your code.
The flexibility that comes with untyped constants is lost.
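For comparison, here is a sketch of the idiomatic alternative: a named type with typed constants and iota.
package main

import "fmt"

// Status gives the constants a shared type without resorting to
// a struct of variables.
type Status int

const (
    StatusActive   Status = iota + 1 // 1
    StatusInactive                   // 2
    StatusOthers                     // 3
)

func main() {
    s := StatusActive
    fmt.Println(s) // 1
}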

How to change a boost::multiprecision::cpp_int from big endian to little endian

I have a boost::multiprecision::cpp_int in big endian and have to change it to little endian. How can I do that? I tried with boost::endian::conversion but that did not work.
boost::multiprecision::cpp_int bigEndianInt("0xe35fa931a0000");
boost::multiprecision::cpp_int littleEndianInt;
littleEndianInt = boost::endian::endian_reverse(bigEndianInt);
The memory layout of Boost Multiprecision types is an implementation detail, so you cannot assume much about it anyway (they're not supposed to be bitwise serializable).
Just read a random section of the docs:
MinBits
Determines the number of Bits to store directly within the object before resorting to dynamic memory allocation. When zero, this field is determined automatically based on how many bits can be stored in union with the dynamic storage header: setting a larger value may improve performance as larger integer values will be stored internally before memory allocation is required.
It's not immediately clear that you have any chance at some level of "normal int behaviour" in memory layout. The only exception would be when MinBits==MaxBits.
Indeed, we can static_assert that the size of cpp_int with such backend configs match the corresponding byte-sizes.
It turns out that there's even a tag in the backend base class to indicate "triviality" (this is truly promising): trivial_tag, so let's use it:
#include <boost/multiprecision/cpp_int.hpp>

namespace mp = boost::multiprecision;

template <int bits> using simple_be =
    mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>;

template <int bits> using my_int = mp::number<simple_be<bits>, mp::et_off>;

using my_int8_t   = my_int<8>;
using my_int16_t  = my_int<16>;
using my_int32_t  = my_int<32>;
using my_int64_t  = my_int<64>;
using my_int128_t = my_int<128>;
using my_int192_t = my_int<192>;
using my_int256_t = my_int<256>;

template <typename Num>
constexpr bool is_trivial_v = Num::backend_type::trivial_tag::value;

int main() {
    static_assert(sizeof(my_int8_t) == 1);
    static_assert(sizeof(my_int16_t) == 2);
    static_assert(sizeof(my_int32_t) == 4);
    static_assert(sizeof(my_int64_t) == 8);
    static_assert(sizeof(my_int128_t) == 16);

    static_assert(is_trivial_v<my_int8_t>);
    static_assert(is_trivial_v<my_int16_t>);
    static_assert(is_trivial_v<my_int32_t>);
    static_assert(is_trivial_v<my_int64_t>);
    static_assert(is_trivial_v<my_int128_t>);

    // however it doesn't scale
    static_assert(sizeof(my_int192_t) != 24);
    static_assert(sizeof(my_int256_t) != 32);
    static_assert(not is_trivial_v<my_int192_t>);
    static_assert(not is_trivial_v<my_int256_t>);
}
Concluding: you can have a trivial int representation up to a certain point, after which you get the allocator-based dynamic-limb implementation no matter what.
Note that using unsigned_packed instead of unsigned_magnitude representation never leads to a trivial backend implementation.
Note that triviality might depend on compiler/platform choices (it's likely that cpp_128_t uses some builtin compiler/standard library support on GCC, e.g.)
Given this, you MIGHT be able to pull off what you wanted to do with hacks IF your backend configuration supports triviality. Sadly, I think it requires you to manually overload endian_reverse for the 128-bit case, because the GCC builtins do not include __builtin_bswap128, nor does Boost Endian define one.
I'd suggest working off the information here: How to make GCC generate bswap instruction for big endian store without builtins?
Final Demo (not complete)
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/endian/buffers.hpp>
#include <iostream>

namespace mp = boost::multiprecision;
namespace be = boost::endian;

template <int bits> void check() {
    using T = mp::number<mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>, mp::et_off>;
    static_assert(sizeof(T) == bits / 8);
    static_assert(T::backend_type::trivial_tag::value);

    be::endian_buffer<be::order::big, T, bits, be::align::no> buf;
    buf = T("0x0102030405060708090a0b0c0d0e0f00");
    std::cout << std::hex << buf.value() << "\n";
}

int main() { check<128>(); }
(Changing be::order::big to be::order::native obviously makes it compile. The other way to complete it would be to have an ADL accessible overload for endian_reverse for your int type.)
This is both trivial and, in the general case, unanswerable; let me explain:
For a general N-bit integer, where N is a large number, there is unlikely to be any well defined byte order, indeed even for 64 and 128 bit integers there are more than 2 possible orders in use: https://en.wikipedia.org/wiki/Endianness#Middle-endian.
On any platform, with any native endianness you can always extract the bytes of a cpp_int, the first example here: https://www.boost.org/doc/libs/1_73_0/libs/multiprecision/doc/html/boost_multiprecision/tut/import_export.html#boost_multiprecision.tut.import_export.examples shows you how. When exporting bytes like this, they are always most significant byte first, so you can subsequently rearrange them how you wish. You should not however, rearrange them and load them back into a cpp_int as the class won't know what to do with the result!
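A minimal sketch of that export-then-rearrange approach (the variable names are mine, not from the docs):
#include <boost/multiprecision/cpp_int.hpp>
#include <algorithm>
#include <iterator>
#include <vector>

int main() {
    boost::multiprecision::cpp_int x("0xe35fa931a0000");

    // export_bits emits 8-bit chunks, most significant byte first
    std::vector<unsigned char> bytes;
    export_bits(x, std::back_inserter(bytes), 8);

    // now rearrange as you wish, e.g. least significant byte first
    std::reverse(bytes.begin(), bytes.end());
}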
If you know that the value is small enough to fit into a native integer type, then you can simply cast to the native integer and use a system API on the result. As in endian_reverse(static_cast<int64_t>(my_cpp_int)). Again, don't assign the result back into a cpp_int as it requires native byte order.
If you wish to check whether a value is small enough to fit in an N-bit integer for the approach above, you can use the msb function, which returns the index of the most significant bit in the cpp_int. Add one to that to obtain the number of bits used, filter out the zero case, and the code looks like:
unsigned bits_used = my_cpp_int.is_zero() ? 0 : msb(my_cpp_int) + 1;
Note that all of the above use completely portable code - no hacking of the underlying implementation is required.

Why is this Rust enum not smaller?

Consider this silly enum:
enum Number {
    Rational {
        numerator: i32,
        denominator: std::num::NonZeroU32,
    },
    FixedPoint {
        whole: i16,
        fractional: u16,
    },
}
The data in the Rational variant takes up 8 bytes, and the data in the FixedPoint variant takes up 4 bytes. The Rational variant has a field which must be nonzero, so I would hope that the enum layout rules would use that as a discriminator, with zero indicating the presence of the FixedPoint variant.
However, this:
fn main() {
    println!("Number = {}", std::mem::size_of::<Number>());
}
Prints:
Number = 12
So, the enum gets space for an explicit discriminator, rather than exploiting the presence of the nonzero field.
Why isn't the compiler able to make this enum smaller?
Although simple cases like Option<&T> can be handled without reserving space for the tag, the layout calculator in rustc is still not clever enough to optimize the size of enums with multiple non-empty variants.
This is issue #46213 on GitHub.
The case you ask about is pretty clear-cut, but there are similar cases where an enum looks like it should be optimized, but in fact can't be because the optimization would preclude taking internal references; for example, see Why does Rust use two bytes to represent this enum when only one is necessary?
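For contrast, the niche optimizations that are performed today are easy to verify:
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // The null pointer serves as the niche for `None`, so no tag byte is added.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());

    // Likewise, zero is the niche for Option<NonZeroU32>.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
}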

Crash when casting the result of arc4random() to Int

I've written a simple Bag class. A Bag is filled with a fixed ratio of Temperature enums. It allows you to grab one at random and automatically refills itself when empty. It looks like this:
class Bag {
    var items = Temperature[]()

    init() {
        refill()
    }

    func grab() -> Temperature {
        if items.isEmpty {
            refill()
        }
        var i = Int(arc4random()) % items.count
        return items.removeAtIndex(i)
    }

    func refill() {
        items.append(.Normal)
        items.append(.Hot)
        items.append(.Hot)
        items.append(.Cold)
        items.append(.Cold)
    }
}
The Temperature enum looks like this:
enum Temperature: Int {
    case Normal, Hot, Cold
}
My GameScene:SKScene has a constant instance property bag:Bag. (I've tried with a variable as well.) When I need a new temperature I call bag.grab(), once in didMoveToView and when appropriate in touchesEnded.
Randomly this call crashes on the if items.isEmpty line in Bag.grab(). The error is EXC_BAD_INSTRUCTION. Checking the debugger shows items is size=1 and [0] = (AppName.Temperature) <invalid> (0x10).
Edit Looks like I don't understand the debugger info. Even valid arrays show size=1 and unrelated values for [0] =. So no help there.
I can't get it to crash isolated in a Playground. It's probably something obvious but I'm stumped.
The arc4random function returns a UInt32. If you get a value higher than Int.max, the Int(...) cast will crash.
Using
Int(arc4random_uniform(UInt32(items.count)))
should be a better solution.
(Blame the strange crash messages in the Alpha version...)
I found that the best way to solve this is by using rand() instead of arc4random().
The code, in your case, could be:
var i = Int(rand()) % items.count
This method will generate a random Int value between the given minimum and maximum
func randomInt(min: Int, max: Int) -> Int {
    return min + Int(arc4random_uniform(UInt32(max - min + 1)))
}
The crash that you were experiencing is due to the fact that Swift detected a type inconsistency at runtime.
Since Int != UInt32 you will have to first type cast the input argument of arc4random_uniform before you can compute the random number.
Swift doesn't allow casting from one integer type to another if the result of the cast doesn't fit. E.g., the following code will work okay:
let x = 32
let y = UInt8(x)
Why? Because 32 is a possible value for an int of type UInt8. But the following code will fail:
let x = 332
let y = UInt8(x)
That's because you cannot assign 332 to an unsigned 8-bit int type; it can only take values 0 through 255 and nothing else.
When you do casts in C, the int is simply truncated, which may be unexpected or undesired, as the programmer may not be aware that truncation may take place. So Swift handles things a bit differently here. It will allow such casts as long as no truncation takes place, but if there is truncation, you get a runtime exception. If you think truncation is okay, then you must do the truncation yourself to let Swift know that this is intended behavior; otherwise Swift must assume that it is accidental.
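For instance (UInt8(truncatingIfNeeded:) is the modern spelling; early Swift called this init(truncatingBitPattern:)):
let x = 332
let y = UInt8(truncatingIfNeeded: x) // 76: keeps only the low 8 bits, no trap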
This is even documented (documentation of UnsignedInteger):
Convert from Swift's widest unsigned integer type,
trapping on overflow.
And what you see is the "overflow trapping", which is poorly done as, of course, one could have made that trap actually explain what's going on.
Assuming that items never has more than 2^32 elements (a bit more than 4 billion), the following code is safe:
var i = Int(arc4random() % UInt32(items.count))
If it can have more than 2^32 elements, you get another problem anyway as then you need a different random number function that produces random numbers beyond 2^32.
This crash is only possible on 32-bit systems. Int changes between 32-bits (Int32) and 64-bits (Int64) depending on the device architecture (see the docs).
UInt32's max is 2^32 − 1. Int64's max is 2^63 − 1, so Int64 can easily handle UInt32.max. However, Int32's max is 2^31 − 1, which means UInt32 can handle numbers greater than Int32 can, and trying to create an Int32 from a number greater than 2^31-1 will create an overflow.
I confirmed this by trying to compile the line Int(UInt32.max). On the simulators and newer devices, this compiles just fine. But I connected my old iPod Touch (32-bit device) and got this compiler error:
Integer overflows when converted from UInt32 to Int
Xcode won't even compile this line for 32-bit devices, which is likely the crash that is happening at runtime. Many of the other answers in this post are good solutions, so I won't add or copy those. I just felt that this question was missing a detailed explanation of what was going on.
This will automatically create a random Int for you:
var i = random() % items.count
i is of Int type, so no conversion necessary!
You can use
Int(rand())
To prevent getting the same sequence of random numbers each time the app starts, seed the generator first by calling srand():
srand(UInt32(NSDate().timeIntervalSinceReferenceDate))
let randomNumber: Int = Int(rand()) % items.count

What is "A Tour of Go" trying to say? [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
There are a few points in the tutorial that sort of leave you on your own without a clue or a link if you're not in the know, I guess. So I'm sorry about the length of these:
http://tour.golang.org/#15
Try printing needInt(Big) too
I'm guessing ints are allowed less bits than constants?
http://tour.golang.org/#21
the { } are required.
(Sound familiar?)
Which language is alluded to?
http://tour.golang.org/#25
(And a type declaration does what you'd expect.)
Why do we need the word type and the word struct? What was I supposed to expect?
http://tour.golang.org/#28
Why implicit zeroes in the constructor? This sounds like a dangerous design choice by Go. Is there a PEP or anything beyond http://golang.org/doc/go_faq.html on this?
http://tour.golang.org/#30
Make? Are there constructors? What's the difference between new and make?
http://tour.golang.org/#33
Where did delete come from? I didn't import it.
http://tour.golang.org/#36
What's the %v formatter stand for? Value?
http://tour.golang.org/#47
panic: runtime error: index out of range
goroutine 1 [running]:
tour/pic.Show(0x400c00, 0x40ca61)
go/src/pkg/tour/pic/pic.go:24 +0xd4
main.main()
/tmpfs/gosandbox-15c0e483_5433f2dc_ff6f028f_248fd0a7_d7c2d35b/prog.go:14 +0x25
I guess I broke go somehow....
package main

import "tour/pic"

func Pic(dx, dy int) [][]uint8 {
    image := make([][]uint8, 10)
    for i := range image {
        image[i] = make([]uint8, 10)
    }
    return image
}

func main() {
    pic.Show(Pic)
}
http://tour.golang.org/#59
I return error values when a function fails? I have to qualify every single function call with an error check? The flow of the program is uninterrupted when I write crazy code? E.g. Copy(only_backup, elsewhere); Delete(only_backup) and Copy fails....
Why would they design it like that?
#15:
I'm guessing ints are allowed less bits than constants?
Yes, exactly. According to the spec, "numeric constants represent values of arbitrary precision and do not overflow", whereas type int has either 32 or 64 bits.
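A sketch along the lines of the tour's own example:
package main

import "fmt"

const (
    Big   = 1 << 100  // fine: constants have arbitrary precision
    Small = Big >> 99 // still a constant, equal to 2
)

func needInt(x int) int { return x*10 + 1 }

func main() {
    fmt.Println(needInt(Small)) // 21
    // fmt.Println(needInt(Big)) // compile error: constant overflows int
}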
#21:
Which language is alluded to?
None; it's alluding to #16, which says the same thing, in the same words, about for-loops.
#25:
a type declaration does what you'd expect is a little unfortunate, I agree (as it assumes too much about what a reader might expect), but it means you're defining a struct (with the struct keyword) and binding the type name "Vertex" to it with the type Vertex part (see http://golang.org/ref/spec#Type_declarations)
#28:
the fact that uninitialized structs are zeroed is really useful in many cases (many standard structs, like buffers, rely on it as well)
It's not implicit in the constructor only. Look at this:
var i int; fmt.Println(i)
This prints out 0. This is similar to something like Java, where primitive types have an implicit default value: booleans are false, integers are zero, etc. The spec on zero values.
#30:
new allocates memory and returns a pointer to it, while make is a special function used only for slices, maps, and channels.
See http://golang.org/doc/effective_go.html#allocation_new for a more in-depth explanation of make vs new
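A quick sketch of the difference:
package main

import "fmt"

func main() {
    p := new([]int)      // new: zeroes the value, returns *[]int; the slice itself is still nil
    s := make([]int, 10) // make: an initialized, ready-to-use slice of length 10

    fmt.Println(*p == nil, len(s)) // true 10
}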
#33:
delete, like append or copy, is one of the built-in functions of the language. See the full list of them at: http://golang.org/ref/spec#Predeclared_identifiers
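For example:
package main

import "fmt"

func main() {
    m := map[string]int{"a": 1, "b": 2}
    delete(m, "a") // predeclared: no import required
    fmt.Println(m) // map[b:2]
}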
#36:
Yes, %v stands for "value". See http://golang.org/pkg/fmt/
#47:
try with this:
func Pic(dx, dy int) [][]uint8 {
    image := make([][]uint8, dy) // dy, not 10
    for x := range image {
        image[x] = make([]uint8, dx) // dx, not 10
        for y := range image[x] {
            // let's try one of the mentioned "interesting functions"
            image[x][y] = uint8(x * y)
        }
    }
    return image
}
#59:
The language's design and conventions encourage you to explicitly
check for errors where they occur (as distinct from the convention in
other languages of throwing exceptions and sometimes catching them).
In some cases this makes Go code verbose, but fortunately there are
some techniques you can use to minimize repetitive error handling.
(quoted from Error handling and Go )
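A sketch of that explicit style applied to the Copy/Delete scenario from the question (Copy and Delete are hypothetical stand-ins):
package main

import (
    "errors"
    "log"
)

// Hypothetical stand-ins for the question's example.
func Copy(src, dst string) error { return errors.New("disk full") }
func Delete(path string) error   { return nil }

func main() {
    if err := Copy("only_backup", "elsewhere"); err != nil {
        log.Fatalf("copy failed, not deleting the backup: %v", err)
    }
    if err := Delete("only_backup"); err != nil {
        log.Fatalf("delete failed: %v", err)
    }
}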
I'm guessing ints are allowed less bits than constants?
Yes, numeric constants are high-precision values; an int in any language doesn't have anywhere near the precision of these arbitrary-precision constants.
Which language is alluded to?
No clue but it is backwards from C and Java where ( ) is required and { } is optional.
Why do we need the word type and the word struct? What was I supposed to expect?
If you're familiar with C, then it does what you'd expect.
Why implicit zeroes in the constructor?
It's not implicit in the constructor only. Look at this:
var i int
fmt.Println(i)
This prints out 0. This is similar to something like Java, where primitive types have an implicit default value: booleans are false, integers are zero, etc.
Make? Are there constructors? What's the difference between new and make?
make accepts additional parameters for initializing the size of a slice, map, or channel. new, on the other hand, just returns a pointer to a zeroed value of the type.
type Data struct {}
// both d1 and d2 are pointers
d1 := new(Data)
d2 := &Data{}
As for are there constructors?, only if you define and reference them. This is how one normally implements a constructor in Go:
type Data struct{}

func NewData() *Data {
    return new(Data)
}
What's the %v formatter stand for? Value?
Yep
I return error values when a function fails? ... Why would they design it like that?
I felt the same way at first. My opinion has changed though. You can ignore errors from the std library if you like and not bother with it yourself, but once I had a handle on it, I personally find I have better (and more readable) error checking.
What I can say is that when I was doing it wrong, the error handling felt repetitive and unnecessary. When I finally started doing it right... well, see what I just said above.

Resources