How can I guarantee that a type that doesn't implement Sync can actually be safely shared between threads? - thread-safety

I have code that creates a RefCell and then wants to pass a reference to that RefCell to a single thread:
use crossbeam; // 0.7.3
use std::cell::RefCell;

fn main() {
    let val = RefCell::new(1);

    crossbeam::scope(|scope| {
        scope.spawn(|_| *val.borrow());
    })
    .unwrap();
}
In the complete code, I'm using a type that has a RefCell embedded in it (a typed_arena::Arena). I'm using crossbeam to ensure that the thread does not outlive the reference it takes.
This produces the error:
error[E0277]: `std::cell::RefCell<i32>` cannot be shared between threads safely
--> src/main.rs:8:15
|
8 | scope.spawn(|_| *val.borrow());
| ^^^^^ `std::cell::RefCell<i32>` cannot be shared between threads safely
|
= help: the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<i32>`
= note: required because of the requirements on the impl of `std::marker::Send` for `&std::cell::RefCell<i32>`
= note: required because it appears within the type `[closure#src/main.rs:8:21: 8:38 val:&std::cell::RefCell<i32>]`
I believe I understand why this error happens: RefCell is not designed to be called concurrently from multiple threads, and since it uses internal mutability, the normal mechanism of requiring a single mutable borrow won't prevent multiple concurrent actions. This is even documented on Sync:
Types that are not Sync are those that have "interior mutability" in a non-thread-safe form, such as cell::Cell and cell::RefCell.
This is all well and good, but in this case, I know that only one thread is able to access the RefCell. How can I affirm to the compiler that I understand what I am doing and I ensure this is the case? Of course, if my reasoning that this is actually safe is incorrect, I'd be more than happy to be told why.

Another solution for this case is to move a mutable reference to the item into the thread, even though mutability isn't required. Since there can be only one mutable reference, the compiler knows that it's safe to be used in another thread.
use crossbeam; // 0.7.3
use std::cell::RefCell;

fn main() {
    let mut val = RefCell::new(1);
    let val2 = &mut val;

    crossbeam::scope(|scope| {
        scope.spawn(move |_| *val2.borrow());
    })
    .unwrap();
}
As bluss points out:
This is allowed because RefCell<i32> implements Send.

One way would be to use a wrapper with an unsafe impl Sync:
use crossbeam; // 0.7.3
use std::cell::RefCell;

fn main() {
    struct Wrap(RefCell<i32>);
    unsafe impl Sync for Wrap {}

    let val = Wrap(RefCell::new(1));

    crossbeam::scope(|scope| {
        scope.spawn(|_| *val.0.borrow());
    })
    .unwrap();
}
As usual with unsafe, it is now up to you to guarantee that the inner RefCell is indeed never accessed from multiple threads simultaneously. As far as I understand, this should be enough for it not to cause a data race.

Related

generic callback with data

There is already a very popular question about this topic, but I don't fully understand the answer.
The goal is:
I need a list (read a Vec) of "function pointers" that modify data stored elsewhere in a program. The simplest example I can come up with are callbacks to be called when a key is pressed. So when any key is pressed, all functions passed to the object will be called in some order.
Reading the answer, it is not clear to me how I would be able to make such a list. It sounds like I would need to restrict the type of the callback to something known, else I don't know how you would be able to make an array of it.
It's also not clear to me how to store the data pointers/references.
Say I have
struct Processor<CB>
where
    CB: FnMut(),
{
    callback: CB,
}
Like the answer suggests, I can't make an array of processors, can I? Each Processor is technically a different type depending on the generic instantiation.
Indeed, you can't make a vector of processors. Closures generally all have different, unnameable types. What you want instead are trait objects, which give you dynamic dispatch of the callback calls. Since trait objects are not Sized, you'd probably want to put them in a Box. The final type is Vec<Box<dyn FnMut()>>.
fn add_callback(list: &mut Vec<Box<dyn FnMut()>>, cb: impl FnMut() + 'static) {
    list.push(Box::new(cb))
}

fn run_callback(list: &mut [Box<dyn FnMut()>]) {
    for cb in list {
        cb()
    }
}
see the playground
If you do it like that, however, you might have issues with lifetimes: you're either forced to move everything into the closures, or can only modify values that live for 'static, which isn't very convenient. Instead, the following might be better:
#[derive(Default)]
struct Producer<'a> {
    list: Vec<Box<dyn FnMut() + 'a>>,
}

impl<'a> Producer<'a> {
    fn add_callback(&mut self, cb: impl FnMut() + 'a) {
        self.list.push(Box::new(cb))
    }
    fn run_callbacks(&mut self) {
        for cb in &mut self.list {
            cb()
        }
    }
}

fn callback_1() {
    println!("Hello!");
}

fn main() {
    let mut modified = 0;
    let mut prod = Producer::default();
    prod.add_callback(callback_1);
    prod.add_callback(|| {
        modified += 1;
        println!("World!");
    });
    prod.run_callbacks();
    drop(prod);
    println!("{}", modified);
}
see the playground
Just a few things to note:
You have to drop the producer manually; otherwise Rust will complain that it is dropped at the end of the scope while still holding (through the closure) an exclusive reference to modified, which is not allowed since modified is read afterwards.
Currently, run_callbacks takes &mut self, because we only require FnMut. If you wanted it to take only &self, you'd need to replace FnMut with Fn, which means the callbacks could still mutate external state through shared references (for example via a Cell), but not their own captured environment.
Yes, all closures are different types, so if you want a vec of different closures you will need to make them trait objects. This can be achieved with Box<dyn Trait> (or any other smart pointer). Box<dyn FnMut()> implements FnMut(), so you can have Processor<Box<dyn FnMut()>>, make a vec of them, and call the callbacks on them: playground

How can NonNull be made thread safe in Rust?

I cannot send a NonNull type between threads in Rust. I need to call a method on a NonNull pointer for the Windows Rust API.
I have tried Arc<Mutex<NonNull>> and Arc<Mutex<RefCell<Box<NonNull>>>> but cannot find a way to get Send and Sync for NonNull.
I would like the thread to halt and await the mutex, so calling a method or even mutating the NonNull type shouldn't be a threading problem, but even with runtime borrow checking I get the error: 'NonNull<c_void> cannot be sent between threads safely'
and then a list of:
required because of the requirements on the impl of 'Send'
..etc.
I am about to try passing in the method as dyn but this should be possible right?
You need to provide an unsafe implementation of Send to inform the compiler that you've taken into account the thread-safety of the objects behind the pointer (which you did, since you want to use a mutex for synchronization). For example:
use std::ptr::NonNull;
use std::sync::Mutex;

// Wrapper around `NonNull<RawType>` that just implements `Send`
struct WrappedPointer(NonNull<RawType>);
unsafe impl Send for WrappedPointer {}

// Safe wrapper around `WrappedPointer` that gives access to the pointer
// only with the mutex locked.
struct SafeType {
    inner: Mutex<WrappedPointer>,
}

impl SafeType {
    fn some_method(&self) {
        let locked = self.inner.lock().unwrap();
        // use ptr in `locked` to implement functionality, but don't return it
    }
}

Why can method taking immutable `&self` modify data in field with mutex?

Consider the following code (also on the playground):
use std::{collections::HashMap, sync::Mutex};

struct MyStruct {
    dummy_map: Mutex<HashMap<i64, i64>>,
}

impl MyStruct {
    pub fn new() -> Self {
        MyStruct {
            dummy_map: Mutex::new(Default::default()),
        }
    }

    pub fn insert(&self, key: i64, val: i64) { // <-- immutable &self
        self.dummy_map.lock().unwrap().insert(key, val); // <-- insert in dummy_map
    }
}

fn main() {
    let s = MyStruct::new();
    let key = 1;
    s.insert(key, 1);
    assert!(s.dummy_map.lock().unwrap().get(&key).is_some());
}
The code runs without panic, meaning insert takes an immutable &self, but it can still insert into the map (which is wrapped in a Mutex).
How come this is possible?
Would it be better for insert to take &mut self? To indicate that a field is modified...
If dummy_map is not wrapped in a Mutex, the code does not compile (as I'd expect). See this playground.
mut is kind of a misnomer in Rust: it actually means "exclusive access" (which you need to be able to mutate a value, but is slightly more general). In this case, you obviously can't get exclusive access to the Mutex itself (because the whole point is to share it between threads), and therefore you can't get exclusive access to self either. However, you can get temporary exclusive access to the data inside the Mutex, at which point it becomes possible to mutate it. This is similar to what happens with Cell and RefCell.
A Mutex is one of the Rust types that implements "interior mutability" (see the docs for Cell for discussion on that).
In short, types that implement "interior mutability" circumvent the compile-time ownership checks in favor of runtime checks. In this case, Mutex enforces the mutability rules at runtime by ensuring only one thread can access the data using its lock() and try_lock() methods.
Both locking methods return an owned MutexGuard which can provide read-only and mutable access to the data through the Deref and DerefMut traits, respectively.
In the end, this means that the variable that needs to be mut is the returned MutexGuard, not the Mutex itself.

Is there an idiomatic way to drop early in Rust? [duplicate]

From what I understand, the compiler automatically generates code to call the destructor to delete an object when it's no longer needed, at the end of scope.
In some situations, it is beneficial to delete an object as soon as it's no longer needed, instead of waiting for it to go out of scope. Is it possible to call the destructor of an object explicitly in Rust?
Is it possible in Rust to delete an object before the end of scope?
Yes.
Is it possible to call the destructor of an object explicitly in Rust?
No.
To clarify, you can use std::mem::drop to transfer ownership of a variable, which causes it to go out of scope:
struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("Dropping Noisy!");
    }
}

fn main() {
    let a = Noisy;
    let b = Noisy;

    println!("1");
    drop(b);
    println!("2");
}
1
Dropping Noisy!
2
Dropping Noisy!
However, you are forbidden from calling the destructor (the implementation of the Drop trait) yourself. Doing so would lead to double-free situations, as the compiler will still insert the automatic call to the Drop trait.
Amusing side note — the implementation of drop is quite elegant:
pub fn drop<T>(_x: T) { }
The official answer is to call mem::drop:
fn do_the_thing() {
    let s = "Hello, World".to_string();

    println!("{}", s);

    drop(s);

    println!("{}", 3);
}
However, note that mem::drop is nothing special. Here is the definition in full:
pub fn drop<T>(_x: T) { }
That's all.
Any function taking ownership of a parameter will cause this parameter to be dropped at the end of said function. From the point of view of the caller, it's an early drop :)

Caching externally-loaded data in a static variable

I'd like to load data from a file, then cache this data (including quite large arrays) in a static variable. This obviously is not the preferred way of doing this, but:
I'm writing a Rust library invoked by a C(++) program, and don't currently have any objects which out-live invocation of the Rust functions. Using a static avoids me having to hack up the C code.
The program doesn't do anything concurrently internally, so synchronisation is not an issue.
How can this be done in Rust?
I have found lazy-static which solves a similar problem, but only for code not requiring external resources (i.e. items which could in theory be evaluated at compile time).
You cannot do the initialization at program start, but you can do it at the first method call. All further calls will access the cached value instead of recomputing your value.
Since Rust forbids things with destructors inside static variables, you need to do your own cleanup management. Logically this means you need unsafe code to step outside Rust's safety system. The following example uses a static mut variable to cache a heap-allocated object (an i32 in this case).
The cacheload function works like a Singleton.
Just remember to call cachefree() from c after you are done.
use std::ptr;

static mut CACHE: *const i32 = ptr::null();

unsafe fn cacheload() -> i32 {
    if CACHE.is_null() {
        // do an expensive operation here
        CACHE = Box::into_raw(Box::new(42));
    }
    *CACHE
}

unsafe fn cachefree() {
    if !CACHE.is_null() {
        let temp: Box<i32> = Box::from_raw(CACHE as *mut i32);
        CACHE = ptr::null();
        drop(temp);
    }
}

fn main() {
    let x;
    unsafe {
        x = cacheload();
        cachefree();
    }
    println!("{}", x);
}
