Transmuting u8 buffer to struct in Rust

I have a byte buffer of unknown size, and I want to create a local struct variable pointing to the memory of the beginning of the buffer. Following what I'd do in C, I tried a lot of different things in Rust and kept getting errors. This is my latest attempt:
use std::mem::{size_of, transmute};
#[repr(C, packed)]
struct MyStruct {
foo: u16,
bar: u8,
}
fn main() {
let v: Vec<u8> = vec![1, 2, 3];
let buffer = v.as_slice();
let s: MyStruct = unsafe { transmute(buffer[..size_of::<MyStruct>()]) };
}
I get an error:
error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
--> src/main.rs:12:42
|
12 | let s: MyStruct = unsafe { transmute(buffer[..size_of::<MyStruct>()]) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `[u8]`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>

If you don't want to copy the data to the struct but instead leave it in place, you can use slice::align_to. This creates a &MyStruct instead:
#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}
fn main() {
    let v = vec![1u8, 2, 3];
    // I copied this code from Stack Overflow
    // without understanding why this case is safe.
    let (head, body, _tail) = unsafe { v.align_to::<MyStruct>() };
    assert!(head.is_empty(), "Data was not aligned");
    let my_struct = &body[0];
    println!("{:?}", my_struct);
}
Here, it's safe to use align_to to transmute some bytes to MyStruct because we've used repr(C, packed) and all of the types in MyStruct can be any arbitrary bytes.
See also:
How to read a struct from a file in Rust?
Can I take a byte array and deserialize it into a struct?

You can use methods on raw pointers and functions in std::ptr to directly read/write objects in place.
std::ptr::read
std::ptr::read_unaligned
std::ptr::write
std::ptr::write_unaligned
In your case:
fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let s: MyStruct = unsafe { std::ptr::read(v.as_ptr() as *const _) };
    println!("here is the struct: {:?}", s);
}
I would encourage you to wrap this in a reusable method and perform a length check on the source buffer before attempting the read.
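One possible shape for such a helper is sketched below. The name read_my_struct is made up for illustration. The sketch uses read_unaligned so that it would stay sound even if the packed attribute were removed, and it assumes the struct only contains fields for which every bit pattern is valid (as is the case for u16 and u8).

```rust
use std::mem::size_of;
use std::ptr;

#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

/// Hypothetical helper: copies a `MyStruct` out of the front of `buffer`,
/// or returns `None` when the buffer is too short.
fn read_my_struct(buffer: &[u8]) -> Option<MyStruct> {
    if buffer.len() < size_of::<MyStruct>() {
        return None;
    }
    // SAFETY: the length check above guarantees that enough bytes are
    // readable, every bit pattern is a valid u16 or u8, and
    // read_unaligned places no alignment requirement on the pointer.
    Some(unsafe { ptr::read_unaligned(buffer.as_ptr() as *const MyStruct) })
}
```

Calling read_my_struct(&v) on the three-byte vector from the question yields Some(..), while a two-byte buffer yields None instead of reading out of bounds. The value of foo still depends on the byte order of the machine.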

I gave up on transmute. Raw pointers (*const T and *mut T) in Rust are pretty similar to C pointers, so this was easy:
#[repr(C, packed)] // necessary
#[derive(Debug, Copy, Clone)] // Debug is used by the {:?} output below
struct MyStruct {
    foo: u16,
    bar: u8,
}
fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let buffer = v.as_slice();
    // Refuse to reinterpret a buffer that is too short.
    assert!(buffer.len() >= std::mem::size_of::<MyStruct>());
    // The buffer is only read, so a *const pointer is enough.
    let s = buffer.as_ptr() as *const MyStruct;
    // SAFETY: the length was checked above, MyStruct is packed (alignment 1),
    // and any byte pattern is a valid u16 or u8.
    let s_ref: &MyStruct = unsafe { &*s };
    println!("here is the struct: {:?}", s_ref);
}
The unsafe tag there is no joke, but the way I'm using this, I know my buffer is filled and take the proper precautions involving endianness later on.
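If the byte order of the buffer is known, the endianness handling can also be done up front without any unsafe code. The following is only a sketch under the assumption that the buffer is little-endian and laid out as foo followed by bar; the function name parse_le is hypothetical, and the struct needs no repr attribute here because no memory is reinterpreted.

```rust
#[derive(Debug, PartialEq)]
struct MyStruct {
    foo: u16,
    bar: u8,
}

/// Hypothetical parser: decodes the first three bytes of `buffer`,
/// assuming `foo` is stored in little-endian byte order.
fn parse_le(buffer: &[u8]) -> Option<MyStruct> {
    // `get` returns None instead of panicking when the buffer is too short.
    let bytes = buffer.get(..3)?;
    Some(MyStruct {
        foo: u16::from_le_bytes([bytes[0], bytes[1]]),
        bar: bytes[2],
    })
}
```

This copies the data instead of pointing into the buffer, which is the trade-off for staying in safe Rust and getting the same result on every platform.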

Related

Access the methods of primitive Rust types

How can I access the methods of primitive types in Rust?
Concretely, I want to pass either one of the two slice methods split_first_mut and split_last_mut to a function operating on slices. I know you can wrap them in closures as a workaround, but I’d like to know if direct access is possible.

You can access the methods on primitives just like regular types:
u8::to_le();
str::from_utf8();
<[_]>::split_first_mut();
You can create a function that accepts a slice ops function:
fn do_thing<T>(f: impl Fn(&mut [u8]) -> Option<(&mut T, &mut [T])>) {
    // ...
}
And pass in both split_first_mut and split_last_mut:
fn main() {
    do_thing(<[_]>::split_first_mut);
    do_thing(<[_]>::split_last_mut);
}

You have to refer to the method using fully-qualified syntax. In a nutshell: <T>::{method_name}, where T is the type and {method_name} is the name of the method. For example, if you're modifying a [i32], then you'd prefix the method name with <[i32]>:: like this:
fn apply_fn<T, U>(t: T, t_fn: fn(T) -> U) -> U {
    t_fn(t)
}
fn main() {
    let mut items: Vec<i32> = vec![1, 2, 3];
    let slice: &mut [i32] = items.as_mut_slice();
    let first_split = apply_fn(slice, <[i32]>::split_first_mut);
    let slice: &mut [i32] = items.as_mut_slice();
    let last_split = apply_fn(slice, <[i32]>::split_last_mut);
}
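To show the passed-in method actually being called, here is a small sketch. The names peel and demo are invented for this example; the function accepts either split method and hands back the element that was split off.

```rust
/// Hypothetical helper: applies the given split method to `items`
/// and returns the single element that was split off.
fn peel<'s, T>(
    items: &'s mut [T],
    split: impl Fn(&mut [T]) -> Option<(&mut T, &mut [T])>,
) -> Option<&'s mut T> {
    split(items).map(|(element, _rest)| element)
}

fn demo() -> [i32; 3] {
    let mut data = [1, 2, 3];
    // split_first_mut splits off the first element...
    if let Some(first) = peel(&mut data[..], <[_]>::split_first_mut) {
        *first = 10;
    }
    // ...and split_last_mut splits off the last one.
    if let Some(last) = peel(&mut data[..], <[_]>::split_last_mut) {
        *last = 30;
    }
    data
}
```

Both methods have the same signature, which is why one parameter type covers them without a wrapping closure.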

Wrapping RefCell and Rc in a struct type

I would like to have a struct which has a writable field, but explicitly borrowable:
struct App<W: Clone<BorrowMut<Write>>> {
stdout: W,
}
... so it can internally use it:
impl<W: Clone<BorrowMut<Write>>> App<W> {
fn hello(&mut self) -> Result<()> {
Rc::clone(&self.stdout).borrow_mut().write(b"world\n")?;
Ok(())
}
}
I tried to pass it a cursor and then use it:
let mut cursor = Rc::new(RefCell::new(Cursor::new(vec![0])));
let mut app = App { stdout: cursor };
app.hello().expect("failed to write");
let mut line = String::new();
Rc::clone(&cursor).borrow_mut().read_line(&mut line).unwrap();
Rust barks:
error[E0107]: wrong number of type arguments: expected 0, found 1
--> src/bin/play.rs:6:21
|
6 | struct App<W: Clone<BorrowMut<Write>>> {
| ^^^^^^^^^^^^^^^^ unexpected type argument
My end goal: pass stdin, stdout and stderr to an App struct. In fn main, these would be the real stdin/stdout/stderr. In tests, these could be cursors. Since I need to access these outside of App (e.g. in tests), I need multiple owners (thus Rc) and runtime-checked mutable borrows (thus RefCell).
How can I implement this?

This isn't how you apply multiple constraints to a type parameter. Instead you use the + operator, like this: <W: Clone + Write + BorrowMut>
But, if you want BorrowMut to be an abstraction for RefCell, it won't work. The borrow_mut method of RefCell is not part of any trait so you will need to depend on RefCell directly in your data structure:
struct App<W: Clone + Write> {
stdout: Rc<RefCell<W>>,
}
Having said that, it's considered best practice not to put unneeded constraints on a struct. You can actually leave them off here, and just mention them on the impl later.
struct App<W> {
stdout: Rc<RefCell<W>>,
}
In order to access the contents of an Rc, you need to dereference it with *. This can be a bit tricky in your case because there is a blanket impl of BorrowMut, which means that Rc has a different borrow_mut, which you definitely don't want.
impl<W: Clone + Write> App<W> {
    fn hello(&mut self) -> Result<()> {
        (*self.stdout).borrow_mut().write(b"world\n")?;
        Ok(())
    }
}
Again, when you use this, you'll need to dereference the Rc:
let cursor = Rc::new(RefCell::new(Cursor::new(vec![0])));
let mut app = App { stdout: cursor.clone() };
app.hello().expect("failed to write");
let mut line = String::new();
let mut cursor = (&*cursor).borrow_mut();
// move to the beginning or else there's nothing to read
cursor.set_position(0);
cursor.read_line(&mut line).unwrap();
println!("result = {:?}", line);
Also, notice that the Rc was cloned when constructing the App. Otherwise it would be moved into the App and you couldn't use cursor again later.
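Put together, the pieces above might look like the following self-contained sketch. The function name run_hello is made up, write_all is used in place of write so that a short write cannot go unnoticed, and only the Write bound is kept because nothing in this sketch needs Clone.

```rust
use std::cell::RefCell;
use std::io::{self, BufRead, Cursor, Write};
use std::rc::Rc;

struct App<W> {
    stdout: Rc<RefCell<W>>,
}

impl<W: Write> App<W> {
    fn hello(&mut self) -> io::Result<()> {
        // Dereference the Rc explicitly so that RefCell::borrow_mut is called.
        (*self.stdout).borrow_mut().write_all(b"world\n")?;
        Ok(())
    }
}

/// Hypothetical test-style driver: writes through the App, then reads
/// the output back through the second Rc handle.
fn run_hello() -> String {
    let cursor = Rc::new(RefCell::new(Cursor::new(Vec::<u8>::new())));
    let mut app = App { stdout: Rc::clone(&cursor) };
    app.hello().expect("failed to write");

    let mut line = String::new();
    let mut guard = (*cursor).borrow_mut();
    // Move to the beginning, or else there is nothing to read.
    guard.set_position(0);
    guard.read_line(&mut line).expect("failed to read");
    line
}
```

In fn main, the same App could be built with Rc::new(RefCell::new(std::io::stdout())) instead of a cursor.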

Serialization of large struct to disk with Serde and Bincode is slow [duplicate]

This question already has an answer here:
Rust file I/O is very slow compared with C. Is something wrong?
I have a struct that contains a vector of 2³¹ u32 values (total size about 8GB). I followed the bincode example to write it to disk:
#[macro_use]
extern crate serde_derive;
extern crate bincode;
use std::fs::File;
use bincode::serialize_into;
#[derive(Serialize, Deserialize, PartialEq, Debug)]
pub struct MyStruct {
counter: Vec<u32>,
offset: usize,
}
impl MyStruct {
// omitted for conciseness
}
fn main() {
let m = MyStruct::new();
// fill entries in the counter vector
let mut f = File::create("/tmp/foo.bar").unwrap();
serialize_into(&mut f, &m).unwrap();
}
To avoid allocating the memory twice, I used serialize_into to directly write into the file. However, the writing process is really slow (about half an hour). Is there a way to speed this up?

This is not an issue with serde and/or bincode. Unlike some other languages, Rust does not use buffered I/O by default (see this question for details). Hence, the performance of this code can be significantly increased by using a buffered writer:
#[macro_use]
extern crate serde_derive;
extern crate bincode;
use std::fs::File;
use bincode::serialize_into;
use std::io::BufWriter;
#[derive(Serialize, Deserialize, PartialEq, Debug)]
pub struct MyStruct {
    counter: Vec<u32>,
    offset: usize,
}
impl MyStruct {
    // omitted for conciseness
}
fn main() {
    let m = MyStruct::new();
    // fill entries in the counter vector
    let mut f = BufWriter::new(File::create("/tmp/foo.bar").unwrap());
    serialize_into(&mut f, &m).unwrap();
}
For me, this sped up the writing process from about half an hour to 40 seconds (roughly a 45x speedup).
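The effect of the buffered writer does not depend on bincode. As a rough illustration that uses only the standard library, the sketch below writes each u32 as four little-endian bytes through a BufWriter; the function name write_counters and the byte layout are assumptions for this example, not bincode's format.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: writes every counter as 4 little-endian bytes.
/// The BufWriter collects the many tiny writes in memory and passes
/// them on to `sink` in large chunks.
fn write_counters<W: Write>(sink: W, counters: &[u32]) -> io::Result<()> {
    let mut writer = BufWriter::new(sink);
    for value in counters {
        writer.write_all(&value.to_le_bytes())?;
    }
    // Flush explicitly so that a write error is reported here
    // rather than being silently dropped when the writer goes away.
    writer.flush()
}
```

Passing a File as sink gives the same pattern as in the answer; without the BufWriter, every 4-byte write_all on a File would be a separate write call to the operating system.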

Why can I use Ok and Err directly without the Result:: prefix?

For example:
enum Foobar {
Foo(i32),
Bar(i32),
}
fn main() {
let a: Result<i32, i32> = Result::Ok(1);
let b: Result<i32, i32> = Ok(1);
let c: Foobar = Foobar::Foo(1);
let d: Foobar = Foo(1); // Error!
}
I have to write Foobar::Foo() instead of just Foo(), but I can just write Ok() without Result::. Why is that? I have the same question for Some and None.

A use item can add enum variants to a namespace, so that you don't have to prefix them by the enum's name.
use Foobar::*;
enum Foobar {
    Foo(i32),
    Bar(i32)
}
fn main() {
    let a: Result<i32, i32> = Result::Ok(1);
    let b: Result<i32, i32> = Ok(1);
    let c: Foobar = Foobar::Foo(1);
    let d: Foobar = Foo(1); // Not an error anymore!
}
The reason why Ok, Err, Some and None are available without qualification is that the prelude contains use items that re-export these variants (in addition to the enums themselves):
pub use option::Option::{self, Some, None};
pub use result::Result::{self, Ok, Err};
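A use item does not have to sit at the top of the file. As a small sketch (the function name signed is invented here), the variants can be imported only inside the function that matches on them, which keeps the short names out of the rest of the module:

```rust
enum Foobar {
    Foo(i32),
    Bar(i32),
}

/// Hypothetical example function: returns the payload of Foo as is
/// and the payload of Bar negated.
fn signed(value: &Foobar) -> i32 {
    // The variants are in scope only within this function body.
    use Foobar::{Bar, Foo};
    match value {
        Foo(n) => *n,
        Bar(n) => -*n,
    }
}
```

Outside of signed, the variants still have to be written as Foobar::Foo and Foobar::Bar.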

ReadConsoleInputW wrapper and ownership

I want to wrap the ReadConsoleInputW Windows console function in the Read trait so that I can use the chars() method, but I also need to know which key modifiers are applied (control, alt/meta).
One solution (like the one used by the Unix console) is to encode key events into control characters or ANSI escape codes.
Another solution would be to keep the key modifiers around, but I can't make it work because the chars() method consumes (moves) the input:
struct InputBuffer {
handle: winapi::HANDLE,
ctrl: bool,
meta: bool,
}
impl Read for InputBuffer {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
let mut rec: winapi::INPUT_RECORD = unsafe { mem::zeroed() };
// kernel32::ReadConsoleInputW(self.0, &mut rec, 1 as winapi::DWORD, &mut count);
// ...
if rec.EventType != winapi::KEY_EVENT {
continue;
}
let key_event = unsafe { rec.KeyEvent() };
// ...
self.ctrl = key_event.dwControlKeyState &
(winapi::LEFT_CTRL_PRESSED | winapi::RIGHT_CTRL_PRESSED) ==
(winapi::LEFT_CTRL_PRESSED | winapi::RIGHT_CTRL_PRESSED);
self.meta = ...;
let utf16 = key_event.UnicodeChar;
// ...
let (bytes, len) = try!(InputBuffer::wide_char_to_multi_byte(utf16));
return (&bytes[..len]).read(buf);
}
}
fn main() {
let handle = try!(get_std_handle(STDIN_FILENO));
let mut stdin = InputBuffer(handle);
let mut chars = stdin.chars(); // stdin moved here
loop {
let c = chars.next().unwrap();
let mut ch = try!(c);
if stdin.ctrl { // use of moved value
//...
}
// ...
}
}
How to do this in Rust?

You could put these flags into an Rc<RefCell<somestruct>> and clone the Rc before stdin is consumed.
This is a common pattern that gives you access to the same data from two places. Rc takes care of shared ownership, and RefCell checks at runtime that mutable and shared accesses don't overlap.
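A platform-independent sketch of that pattern follows. Everything in it is a stand-in: the reader serves bytes from memory instead of calling ReadConsoleInputW, it pretends that Ctrl is held whenever the byte is an ASCII control character, and it uses Read::bytes() because chars() is not available on Read in current stable Rust. The names Modifiers and collect_with_ctrl are hypothetical.

```rust
use std::cell::RefCell;
use std::io::{self, Read};
use std::rc::Rc;

/// Hypothetical shared state for the key modifiers.
#[derive(Debug, Default)]
struct Modifiers {
    ctrl: bool,
}

/// Stand-in for the console reader from the question.
struct InputBuffer {
    data: Vec<u8>,
    pos: usize,
    modifiers: Rc<RefCell<Modifiers>>,
}

impl Read for InputBuffer {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos >= self.data.len() || buf.is_empty() {
            return Ok(0);
        }
        let byte = self.data[self.pos];
        self.pos += 1;
        // Record the modifier state; the mutable borrow ends with this statement.
        self.modifiers.borrow_mut().ctrl = byte < 0x20;
        buf[0] = byte;
        Ok(1)
    }
}

/// Reads all bytes and pairs each one with the Ctrl flag seen for it.
fn collect_with_ctrl(data: &[u8]) -> Vec<(u8, bool)> {
    let modifiers = Rc::new(RefCell::new(Modifiers::default()));
    let input = InputBuffer {
        data: data.to_vec(),
        pos: 0,
        // Clone the Rc before `input` is consumed below.
        modifiers: Rc::clone(&modifiers),
    };
    let mut out = Vec::new();
    // bytes() takes `input` by value, but `modifiers` stays usable.
    for byte in input.bytes() {
        let byte = byte.expect("in-memory read cannot fail");
        out.push((byte, modifiers.borrow().ctrl));
    }
    out
}
```

The iterator owns the reader, while the loop body still reads the flags through its own Rc handle, so no moved value is used.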
