How to write/read 8 true/false values in a single byte in Rust - memory-management

Everyone knows that a bool uses 1 byte, not 1 bit.
If I want to store 8 (or more) true/false values in a single byte,
how can I do that in Rust?
Something like this:
fn main() {
    let x: ByteOnBits;
    x[3] = true;
    if x[4] {
        println!("4th bit is true");
    }
}
An array of 8 bools would be 8 bytes, not the 1 byte I am expecting.

You can use the bitflags crate to treat the bits of a single integer as named booleans stored in different integer sizes. Although it is intended for flag sets rather than indexed bit access, you can still leverage its functionality for this.
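A minimal sketch, assuming the bitflags crate as a dependency (the flag names here are made up for illustration):
use bitflags::bitflags;

bitflags! {
    // One named flag per bit of the underlying u8.
    struct Bits: u8 {
        const BIT3 = 0b0000_1000;
        const BIT4 = 0b0001_0000;
    }
}

fn main() {
    let mut x = Bits::empty();
    x.insert(Bits::BIT3);       // roughly x[3] = true
    if x.contains(Bits::BIT4) { // roughly if x[4]
        println!("4th bit is true");
    }
    assert_eq!(x.bits(), 8);
}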

The bitvec crate provides facilities of the kind you're asking for:
use bitvec::prelude::*;

fn main() {
    let mut data = 0u8;
    let bits = data.view_bits_mut::<Lsb0>();
    bits.set(3, true);
    if bits[4] {
        println!("4th bit is true");
    }
    assert_eq!(data, 8); // only bit 3 is set: 0b0000_1000
}
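If you'd rather not pull in a crate at all, plain shifts and masks on a u8 do the same job; a minimal sketch:
fn main() {
    let mut x: u8 = 0;
    x |= 1 << 3;                // set bit 3
    if x & (1 << 4) != 0 {      // test bit 4
        println!("4th bit is true");
    }
    assert_eq!(x, 0b0000_1000); // only bit 3 is set
}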

Related

Is reading register-sized data in `byteorder` efficient?

The crate in the title is byteorder.
Here is how we can read binary data from a std::io::BufReader. BufReader implements the std::io::Read trait, and there is a blanket implementation of byteorder::ReadBytesExt for any type implementing Read. ReadBytesExt contains read_u16 and other methods that read binary data. This is the implementation:
fn read_u16<T: ByteOrder>(&mut self) -> Result<u16> {
    let mut buf = [0; 2];
    self.read_exact(&mut buf)?;
    Ok(T::read_u16(&buf))
}
It passes a reference to buf to BufReader; I suppose it passes the address of buf on the stack. Hence the resulting u16 is transferred from the internal buffer of BufReader (memory) to buf (memory), probably using memcpy or something similar. Wouldn't it be more efficient if BufReader implemented ReadBytesExt by reading data from its internal buffer directly? Or does the compiler optimize buf away?
TL;DR: It's all up to the Optimization Gods, but it should be efficient.
The key optimization here is inlining, as usual, and the probabilities are on our side, but who knows...
As long as the call to read_exact is inlined, it should just work.
Firstly, it can be inlined. In Rust, "inner" calls are always statically dispatched -- there's no inheritance -- and therefore the type of the receiver (self) in self.read_exact is known at compile-time. As a result, the exact read_exact function being called is known at compile-time.
Of course, there's no telling whether it'll be inlined. The implementation is fairly short, so chances are good, but that's out of our hands.
Secondly, what happens if it's inlined? Magic!
You can see the implementation here:
fn read_exact(&mut self, buf: &mut [u8]) -> io::Result<()> {
    if self.buffer().len() >= buf.len() {
        buf.copy_from_slice(&self.buffer()[..buf.len()]);
        self.consume(buf.len());
        return Ok(());
    }
    crate::io::default_read_exact(self, buf)
}
Once inlined, we therefore have:
fn read_u16<T: ByteOrder>(&mut self) -> Result<u16> {
    let mut buf = [0; 2];
    // self.read_exact(&mut buf)?;
    if self.buffer().len() >= buf.len() {
        buf.copy_from_slice(&self.buffer()[..buf.len()]);
        self.consume(buf.len());
        Ok(())
    } else {
        crate::io::default_read_exact(self, &mut buf)
    }?;
    Ok(T::read_u16(&buf))
}
Needless to say, all those buf.len() calls are then constant-folded to 2:
fn read_u16<T: ByteOrder>(&mut self) -> Result<u16> {
    let mut buf = [0; 2];
    // self.read_exact(&mut buf)?;
    if self.buffer().len() >= 2 {
        buf.copy_from_slice(&self.buffer()[..2]);
        self.consume(2);
        Ok(())
    } else {
        crate::io::default_read_exact(self, &mut buf)
    }?;
    Ok(T::read_u16(&buf))
}
So we're left with copy_from_slice, a memcpy invoked with a constant size (2).
The trick is that memcpy is so special that it's a builtin in most compilers, and it certainly is in LLVM. And it's a builtin specifically so that in special cases -- such as a constant size that happens to be a register size -- its codegen can be specialized to... a single mov instruction on x86/x64.
So, as long as read_exact is inlined, then buf should live in a register from beginning to end... in the happy case.
In the cold path, when default_read_exact is called, then the compiler will need to use the stack and pass a slice. That's fine. It should not happen often.
If you find yourself repeatedly doing sequences of u16 reads, however... you may find yourself better served by reading larger arrays, to avoid the repeated if self.buffer().len() >= 2 checks.
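byteorder offers bulk-read methods such as read_u16_into for exactly that, amortizing the length check across a whole slice; a sketch (the file name is a placeholder):
use byteorder::{LittleEndian, ReadBytesExt};
use std::fs::File;
use std::io::BufReader;

fn main() -> std::io::Result<()> {
    // "data.bin" is a placeholder path for this sketch.
    let mut reader = BufReader::new(File::open("data.bin")?);
    // One bulk read replaces 1024 separate 2-byte reads,
    // and with them 1024 buffer-length checks.
    let mut values = [0u16; 1024];
    reader.read_u16_into::<LittleEndian>(&mut values)?;
    println!("first value: {}", values[0]);
    Ok(())
}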

Serialization of large struct to disk with Serde and Bincode is slow [duplicate]

This question already has an answer here:
Rust file I/O is very slow compared with C. Is something wrong?
(1 answer)
Closed 4 years ago.
I have a struct that contains a vector of 2³¹ u32 values (total size about 8GB). I followed the bincode example to write it to disk:
#[macro_use]
extern crate serde_derive;
extern crate bincode;

use std::fs::File;
use bincode::serialize_into;

#[derive(Serialize, Deserialize, PartialEq, Debug)]
pub struct MyStruct {
    counter: Vec<u32>,
    offset: usize,
}

impl MyStruct {
    // omitted for conciseness
}

fn main() {
    let m = MyStruct::new();
    // fill entries in the counter vector
    let mut f = File::create("/tmp/foo.bar").unwrap();
    serialize_into(&mut f, &m).unwrap();
}
To avoid allocating the memory twice, I used serialize_into to directly write into the file. However, the writing process is really slow (about half an hour). Is there a way to speed this up?
This is not an issue with serde and/or bincode. Unlike some other languages, Rust does not use buffered I/O by default (see this question for details). Hence, the performance of this code can be increased significantly by using a buffered writer:
#[macro_use]
extern crate serde_derive;
extern crate bincode;

use std::fs::File;
use std::io::BufWriter;
use bincode::serialize_into;

#[derive(Serialize, Deserialize, PartialEq, Debug)]
pub struct MyStruct {
    counter: Vec<u32>,
    offset: usize,
}

impl MyStruct {
    // omitted for conciseness
}

fn main() {
    let m = MyStruct::new();
    // fill entries in the counter vector
    let mut f = BufWriter::new(File::create("/tmp/foo.bar").unwrap());
    serialize_into(&mut f, &m).unwrap();
}
For me, this sped up the writing process from about half an hour to 40 seconds (50x speedup).
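One caveat worth knowing: BufWriter flushes its remaining buffer when it is dropped, but any error that occurs during that implicit flush is silently discarded. If you want to surface every write error, call f.flush() (from the std::io::Write trait) after serialize_into and handle the result.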

Is `String::with_capacity()` equal to `malloc`?

I read this article a few days ago and wondered what the best way to implement such a thing in Rust would be. The article suggests using a buffer instead of printing the string after each iteration.
Is it correct to say that String::with_capacity() (or Vec) is equal to malloc in C?
Example from the code:
String::with_capacity(size * 4096)
equal to:
char *buf = malloc(size * 4096);
It is not "equal": Rust's String is a composite object. String::with_capacity creates a String which is not only a buffer; it is a wrapper around a Vec<u8>:
pub struct String {
    vec: Vec<u8>,
}
And a Vec is not just a section in memory - it also contains a RawVec and its length:
pub struct Vec<T> {
    buf: RawVec<T>,
    len: usize,
}
And a RawVec is not a primitive either:
pub struct RawVec<T> {
    ptr: Unique<T>,
    cap: usize,
}
So when you call String::with_capacity:
pub fn with_capacity(capacity: usize) -> String {
    String { vec: Vec::with_capacity(capacity) }
}
You are doing much more than just reserving a section of memory.
That isn't quite accurate. It'd make more sense to say String::with_capacity is similar to C++'s std::string::reserve. From the documentation:
Creates a new empty String with a particular capacity.
Strings have an internal buffer to hold their data. The capacity is
the length of that buffer, and can be queried with the capacity
method. This method creates an empty String, but one with an initial
buffer that can hold capacity bytes. This is useful when you may be
appending a bunch of data to the String, reducing the number of
reallocations it needs to do.
If the given capacity is 0, no allocation will occur, and this method
is identical to the new method.
Whether or not it uses something similar to malloc for managing the internal buffer is an implementation detail.
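To see what the reserved capacity buys you, here is a minimal sketch; comparing the buffer pointer before and after is just one way to observe that no reallocation occurred:
fn main() {
    let mut s = String::with_capacity(4096);
    let ptr_before = s.as_ptr();
    for _ in 0..256 {
        s.push_str("0123456789abcdef"); // 256 * 16 = 4096 bytes total
    }
    // The buffer was never reallocated: same pointer, capacity untouched.
    assert_eq!(ptr_before, s.as_ptr());
    assert!(s.capacity() >= 4096);
    assert_eq!(s.len(), 4096);
}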
In response to your edit:
You are explicitly allocating memory, whereas in C++ a memory allocation for std::string::reserve only occurs if the argument passed to reserve is greater than the existing capacity. Note that Rust's String does have a reserve method, but C++'s string does not have a with_capacity equivalent.
Two things:
If you link to a C allocator, you can just call malloc directly.
The hook into the default global allocator is still unstable, but if you're on nightly, you can call it directly.
On stable Rust today, the closest thing you can get is Vec if you want to use the global allocator, but it's not equivalent, for reasons spelled out in the other answers.
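For reference: the global-allocator hook mentioned above has since been stabilized as std::alloc, so on current stable Rust the closest analogue to a raw malloc looks like this (a sketch, not a recommendation over Vec):
use std::alloc::{alloc, dealloc, Layout};

fn main() {
    // Roughly the spirit of `char *buf = malloc(4096);` -- raw, uninitialized bytes.
    let layout = Layout::array::<u8>(4096).unwrap();
    unsafe {
        let ptr = alloc(layout);
        assert!(!ptr.is_null(), "allocation failed");
        ptr.write(42);        // use the buffer
        dealloc(ptr, layout); // the `free` counterpart
    }
}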

Optimising data structure/word alignment padding in golang

Similar to what I've learned in C++, I believe it's the padding that causes a difference in the size of instances of both structs.
type Foo struct {
    w byte   // 1 byte
    x byte   // 1 byte
    y uint64 // 8 bytes
}

type Bar struct {
    x byte   // 1 byte
    y uint64 // 8 bytes
    w byte   // 1 byte
}

func main() {
    fmt.Println(runtime.GOARCH)
    newFoo := new(Foo)
    fmt.Println(unsafe.Sizeof(*newFoo))
    newBar := new(Bar)
    fmt.Println(unsafe.Sizeof(*newBar))
}
Output:
amd64
16
24
Is there a rule of thumb to follow when defining struct members? (like ascending/descending order of size of types)
Is there a compile time optimisation which we can pass, that can automatically take care of this?
Or shouldn't I be worried about this at all?
Currently there's no compile-time optimisation; on x64 the values are padded to 8-byte boundaries.
You can manually arrange structs to use space optimally, typically by ordering fields from larger types to smaller. Eight consecutive byte fields, for example, will use only 8 bytes, while a single byte field would be padded to an 8-byte alignment. Consider this: https://play.golang.org/p/0qsgpuAHHp
package main

import (
    "fmt"
    "unsafe"
)

type Compact struct {
    a, b                   uint64
    c, d, e, f, g, h, i, j byte
}

// Larger memory footprint than "Compact" - but fewer fields!
type Inefficient struct {
    a uint64
    b byte
    c uint64
    d byte
}

func main() {
    newCompact := new(Compact)
    fmt.Println(unsafe.Sizeof(*newCompact))
    newInefficient := new(Inefficient)
    fmt.Println(unsafe.Sizeof(*newInefficient))
}
If you take this into consideration; you can optimise the memory footprint of your structs.
Or shouldn't I be worried about this at all?
Yes you should.
This is also called mechanical sympathy (see this Go Time podcast episode), so it also depends on the hardware architecture you are compiling for.
See as illustration:
"The day byte alignment came back to bite me" (January 2014)
"On the memory alignment of Go slice values" (July 2016)
From the latter: Go slice values are 16-byte aligned (not 32-byte aligned), and Go pointers are byte-aligned.
It depends on the type of application you are developing and on how those structures are used. If the application needs to meet memory or performance criteria, you should definitely care about memory alignment and padding, but not only that: this article on optimal CPU cache usage, https://www.usenix.org/legacy/publications/library/proceedings/als00/2000papers/papers/full_papers/sears/sears_html/index.html, highlights the correlation between struct layout and performance, covering cache-line alignment, false sharing, etc.
Also, there is a nice Go tool, https://github.com/1pkg/gopium, that helps automate these optimizations; check it out!
Some guidelines:
To minimize the number of padding bytes, lay out the fields from the largest allocation to the smallest.
One exception is an empty structure. As we know, the size of an empty struct is zero:
type empty struct {
    a struct{}
}
Following the common rule above, we might arrange the fields of a structure like this:
type E struct {
    a int64
    b int64
    c int64
}

Wait, that drops the zero-size field; with it at the end:

type E struct {
    a int64
    b int64
    c struct{}
}

However, the size of E is 24: a zero-size field at the end of a struct is padded out, so that a pointer to it cannot point past the end of the allocation.
When the fields are arranged as

type D struct {
    b struct{}
    a int64
    c int64
}

the size of D is 16; see https://go.dev/play/p/ID_hN1zwIwJ
IMO, it is better to use tools that help automate structure-alignment optimizations:
aligncheck — https://gitlab.com/opennota/check
maligned — https://github.com/mdempsky/maligned (the original maligned is deprecated; use https://pkg.go.dev/golang.org/x/tools/go/analysis/passes/fieldalignment instead)
golangci-lint — just enable 'maligned' in the golangci-lint settings.
Example, from the configuration file .golangci.example.yml:
linters-settings:
  maligned:
    # print struct with more effective memory layout or not, false by default
    suggest-new: true

How to convert shell output to string in Rust [duplicate]

I am trying to write a simple TCP/IP client in Rust and I need to print out the buffer I got from the server.
How do I convert a Vec<u8> (or a &[u8]) to a String?
To convert a slice of bytes to a string slice (assuming a UTF-8 encoding):
use std::str;

// pub fn from_utf8(v: &[u8]) -> Result<&str, Utf8Error>
fn main() {
    // Assuming buf: &[u8]
    let buf = &[0x41u8, 0x41u8, 0x42u8];
    let s = match str::from_utf8(buf) {
        Ok(v) => v,
        Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
    };
    println!("result: {}", s);
}
The conversion is in-place, and does not require an allocation. You can create a String from the string slice if necessary by calling .to_owned() on the string slice (other options are available).
If you are sure that the byte slice is valid UTF-8, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_utf8_unchecked, which has the same behavior but skips the check.
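For completeness, the unchecked variant looks like this (a sketch; the unsafe block is sound only because the bytes are known-valid UTF-8):
use std::str;

fn main() {
    let buf = &[0x41u8, 0x42u8];
    // Safety: `buf` is known to be valid UTF-8 ("AB"),
    // so skipping the validity check is sound here.
    let s = unsafe { str::from_utf8_unchecked(buf) };
    println!("result: {}", s);
}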
If you need a String instead of a &str, you may also consider String::from_utf8 instead.
The library references for the conversion function:
std::str::from_utf8
std::str::from_utf8_unchecked
std::string::String::from_utf8
I prefer String::from_utf8_lossy:
fn main() {
    let buf = &[0x41u8, 0x41u8, 0x42u8];
    let s = String::from_utf8_lossy(buf);
    println!("result: {}", s);
}
It turns invalid UTF-8 bytes into � so no error handling is required. It's good for when you don't need strict error handling, and I hardly ever do. Note that it actually gives you a Cow<str> rather than a String, borrowing the input when it is already valid UTF-8; since it is clone-on-write, you may sometimes need the into_owned() method to get a String. It should make printing out what you're getting from the server a little easier.
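A small sketch of that clone-on-write behaviour: the lossy conversion borrows when the bytes are already valid UTF-8 and allocates only when it has to substitute replacement characters.
use std::borrow::Cow;

fn main() {
    let valid: &[u8] = &[0x41, 0x42];
    let invalid: &[u8] = &[0x41, 0xFF];
    // Already valid UTF-8: the Cow borrows the input, no allocation.
    assert!(matches!(String::from_utf8_lossy(valid), Cow::Borrowed(_)));
    // Invalid UTF-8: the Cow owns a new String with U+FFFD substituted.
    assert!(matches!(String::from_utf8_lossy(invalid), Cow::Owned(_)));
}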
If you actually have a vector of bytes (Vec<u8>) and want to convert it to a String, the most efficient way is to reuse the allocation with String::from_utf8:
fn main() {
    let bytes = vec![0x41, 0x42, 0x43];
    let s = String::from_utf8(bytes).expect("Found invalid UTF-8");
    println!("{}", s);
}
In my case I just needed to turn the numbers into a string, not to map the bytes to characters according to some encoding, so I did:
fn main() {
    let bytes = vec![0x41, 0x42, 0x43];
    let s = format!("{:?}", &bytes); // "[65, 66, 67]"
    println!("{}", s);
}
To optimally convert a Vec<u8> that may contain non-UTF-8 byte sequences into a UTF-8 String without any unneeded allocations, optimistically try String::from_utf8() and resort to String::from_utf8_lossy() only if that fails.
let buffer: Vec<u8> = ...;
let utf8_string = String::from_utf8(buffer)
    .unwrap_or_else(|non_utf8| String::from_utf8_lossy(non_utf8.as_bytes()).into_owned());
The approach suggested in the other answers results in two owned buffers in memory even in the happy case (valid UTF-8 data in the vector): one with the original u8 bytes and another in the form of a String owning its characters. This approach instead consumes the Vec<u8> and marshals it into a Unicode String directly; only if that fails does it allocate room for a new string containing the lossily decoded output.
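Wrapped up as a self-contained, runnable sketch (the function name is mine):
fn into_utf8_string(buffer: Vec<u8>) -> String {
    // Happy path: reuse the Vec's allocation; failure path: lossy re-decode.
    String::from_utf8(buffer)
        .unwrap_or_else(|non_utf8| String::from_utf8_lossy(non_utf8.as_bytes()).into_owned())
}

fn main() {
    assert_eq!(into_utf8_string(vec![0x41, 0x42]), "AB");
    assert_eq!(into_utf8_string(vec![0x41, 0xFF]), "A\u{FFFD}");
}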
