I am trying to create a function in Rust that samples from M normal distributions N times. I have the sequential version below, which runs fine. I am trying to parallelize it using Rayon, but am encountering the error:
Rc<UnsafeCell<ReseedingRng<rand_chacha::chacha::ChaCha12Core, OsRng>>> cannot be sent between threads safely
It seems my rand::thread_rng does not implement the traits Send and Sync. I tried using StdRng and OsRng which both do, to no avail because then I get errors that the variables pred and rng cannot be borrowed as mutable because they are captured in a Fn closure.
This is the working code below. It errors when I change the first into_iter() to into_par_iter().
use rand_distr::{Normal, Distribution};
use std::time::Instant;
use rayon::prelude::*;
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
let mut rng = rand::thread_rng();
let mut preds = vec![vec![0.0; n as usize]; means.len()];
(0..means.len()).into_iter().for_each(|i| {
(0..n).into_iter().for_each(|j| {
let normal = Normal::new(means[i], sds[i]).unwrap();
preds[i][j as usize] = normal.sample(&mut rng);
})
});
preds
}
fn main() {
let means = vec![0.0; 67000];
let sds = vec![1.0; 67000];
let start = Instant::now();
let preds = rprednorm(100, means, sds);
let duration = start.elapsed();
println!("{:?}", duration);
}
Any advice on how to make these two iterators parallel?
Thanks.
It seems my rand::thread_rng does not implement the traits Send and Sync.
Why are you trying to send a thread_rng? The entire point of thread_rng is that it's a per-thread RNG.
then I get errors that the variables pred and rng cannot be borrowed as mutable because they are captured in a Fn closure.
Well yes, you need to Clone the StdRng (or Copy the OsRng) into each closure. As for preds, that can't work for a similar reason: once you parallelise the loop, the compiler does not know that every i is distinct, so as far as it's concerned the writes to preds[i] could overlap (two iterations running in parallel could try to write to the same place at the same time), which is illegal.
The solution is to use rayon to iterate in parallel over the destination vector:
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
    let mut preds = vec![vec![0.0; n as usize]; means.len()];
    preds.par_iter_mut().enumerate().for_each(|(i, e)| {
        let mut rng = rand::thread_rng();
        (0..n).into_iter().for_each(|j| {
            let normal = Normal::new(means[i], sds[i]).unwrap();
            e[j as usize] = normal.sample(&mut rng);
        })
    });
    preds
}
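The same "iterate over the destination" idea can be sketched with only the standard library, which may help show why it satisfies the borrow checker: each worker receives a disjoint mutable chunk of rows, so no two threads can ever write to the same place. This is a rough sketch and not a description of what rayon does internally. The names fill_parallel, workers and sample are hypothetical; sample(i, j) stands in for drawing sample j of distribution i, and std::thread::scope requires Rust 1.63 or later.

```rust
use std::thread;

/// Fills a `rows x cols` table in parallel (hypothetical helper).
/// `sample(i, j)` is a stand-in for "draw sample j from distribution i".
fn fill_parallel<F>(rows: usize, cols: usize, workers: usize, sample: F) -> Vec<Vec<f64>>
where
    F: Fn(usize, usize) -> f64 + Sync,
{
    let mut preds = vec![vec![0.0; cols]; rows];
    let workers = workers.max(1);
    // Rows per worker, rounded up; chunks_mut panics on a chunk size of 0.
    let chunk = ((rows + workers - 1) / workers).max(1);
    let sample = &sample; // shared, read-only access from every thread

    thread::scope(|s| {
        // chunks_mut hands out non-overlapping &mut slices of the outer vector.
        for (c, block) in preds.chunks_mut(chunk).enumerate() {
            s.spawn(move || {
                for (k, row) in block.iter_mut().enumerate() {
                    let i = c * chunk + k; // index of this row in `preds`
                    for (j, x) in row.iter_mut().enumerate() {
                        *x = sample(i, j);
                    }
                }
            });
        }
    });
    preds
}
```

Calling fill_parallel(5, 3, 2, |i, j| (i * 10 + j) as f64) gives the same table regardless of the worker count, because every cell depends only on its own indices.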
Alternatively with OsRng, it's just a marker ZST, so you can refer to it as a value:
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
    let mut preds = vec![vec![0.0; n as usize]; means.len()];
    preds.par_iter_mut().enumerate().for_each(|(i, e)| {
        (0..n).into_iter().for_each(|j| {
            let normal = Normal::new(means[i], sds[i]).unwrap();
            e[j as usize] = normal.sample(&mut rand::rngs::OsRng);
        })
    });
    preds
}
StdRng doesn't seem very suitable for this use case: you'll either have to create one per top-level iteration to get different samplings, or initialise a base RNG and clone it once per task, in which case every clone produces the same sequence (as they all share a seed).
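The seeding problem can be illustrated without any external crate. The sketch below uses a toy xorshift generator (ToyRng, a hypothetical stand-in, not part of rand) to contrast cloning one base generator with deriving a separate seed per task. Adding the task index to the base seed is only an assumption made for illustration, not a statistically vetted scheme; with the rand crate the analogous step would be to build each generator with SeedableRng::seed_from_u64.

```rust
/// Toy xorshift64 generator, used here only as a stand-in for a seedable RNG.
#[derive(Clone)]
struct ToyRng(u64);

impl ToyRng {
    fn from_seed(seed: u64) -> Self {
        // One splitmix64 round so that neighbouring seeds give unrelated states.
        let mut z = seed.wrapping_add(0x9E37_79B9_7F4A_7C15);
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        ToyRng(z | 1) // the xorshift state must never be zero
    }

    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Every task clones the same base generator, so all tasks see one sequence.
fn cloned_streams(base_seed: u64, tasks: usize, n: usize) -> Vec<Vec<u64>> {
    let base = ToyRng::from_seed(base_seed);
    (0..tasks)
        .map(|_| {
            let mut rng = base.clone();
            (0..n).map(|_| rng.next_u64()).collect()
        })
        .collect()
}

/// Every task derives its own seed from the base seed and its index.
fn derived_streams(base_seed: u64, tasks: usize, n: usize) -> Vec<Vec<u64>> {
    (0..tasks)
        .map(|i| {
            let mut rng = ToyRng::from_seed(base_seed.wrapping_add(i as u64));
            (0..n).map(|_| rng.next_u64()).collect()
        })
        .collect()
}
```

With cloned_streams every inner vector is identical, which is the duplicated-samples problem described above; derived_streams gives each task its own sequence while staying reproducible for a fixed base seed.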
This is a simplistic performance test based on https://www.youtube.com/watch?v=QlMLB2-G25c, which compares the performance of Rust vs wasm vs Python vs Go.
The original Rust program (from https://github.com/masmullin2000/random-sort-examples) is:
use rand::prelude::*;
fn main() {
let vec = make_random_vec(1_000_000, 100);
for _ in 0..250 {
let mut v = vec.clone();
// v.sort_unstable();
v.sort(); // using stable sort as f# sort is a stable sort
}
}
pub fn make_random_vec(sz: usize, modulus: i64) -> Vec<i64> {
let mut v: Vec<i64> = Vec::with_capacity(sz);
for _ in 0..sz {
let x: i64 = random();
v.push(x % modulus);
}
v
}
So I created the following F# program to compare against Rust:
open System
let rec cls (arr:int64 array) count =
    if count > 0 then
        let v1 = Array.copy arr
        let v2 = Array.sort v1
        cls arr (count-1)
    else
        ()
let rnd = Random()
let rndArray = Array.init 1000000 (fun _ -> int64 (rnd.Next(100)))
cls rndArray 250 |> ignore
I was expecting F# to be slower (both running on WSL2), but got the following times on my 8th-gen Core i7 laptop:
Rust - around 17 seconds
Rust (unstable sort) - around 2.7 seconds
F# - around 11 seconds
My questions:
Is the dotnet compiler doing some sort of optimisation that throws away some of the processing because the return values are not being used, resulting in the F# code running faster, or am I doing something wrong?
Does F# have an unstable sort function that I can use to compare against the Rust unstable sort?
I want to modify a collection in place before returning it:
fn main() {
println!("{:?}", compute()); // should print [[2, 1, 0], [5, 4, 3]]
}
// u8 is just a placeholder, so impl Copy is considered cheating :)
fn compute() -> Vec<Vec<u8>> {
let a = vec![0, 1, 2];
let b = vec![3, 4, 5];
let mut result = Vec::new();
result.push(a);
result.push(b);
// avoids allocations from:
//
// result.iter()
// .map(|r| {
// r.reverse()
// r
// })
// .collect::<Vec<_>>()
result.into_iter().for_each(|mut r| r.reverse());
// errors out: the collection was consumed on the line above
result
}
A collection was already allocated with Vec::new(), so allocating a second collection here seems like a waste. I am assuming that's what .collect() does.
How do I avoid the excess allocation?
Is there any easy way to know how many allocations are happening? In golang it was as easy as go test -bench=., but I can't find anything similar when it comes to Rust.
You need a &mut to each of the inner vectors. For that you can use iter_mut, which takes &mut self on the outer vector rather than consuming it the way into_iter does.
// u8 is just a placeholder, so impl Copy is considered cheating :)
fn compute() -> Vec<Vec<u8>> {
let a = vec![0, 1, 2];
let b = vec![3, 4, 5];
let mut result = Vec::new();
result.push(a);
result.push(b);
result.iter_mut().for_each(|r| r.reverse());
result
}
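One informal way to convince yourself that nothing is reallocated is to compare buffer addresses before and after the call. The sketch below assumes two hypothetical helpers, reverse_rows and buffer_addresses. It is not an allocation counter; it only shows that the outer vector and each inner vector keep the same heap buffer.

```rust
/// Reverses every inner vector in place, borrowing the outer slice mutably.
fn reverse_rows(rows: &mut [Vec<u8>]) {
    rows.iter_mut().for_each(|r| r.reverse());
}

/// Returns the address of the outer buffer and of each inner buffer.
fn buffer_addresses(rows: &[Vec<u8>]) -> (usize, Vec<usize>) {
    (
        rows.as_ptr() as usize,
        rows.iter().map(|r| r.as_ptr() as usize).collect(),
    )
}
```

Recording buffer_addresses(&result) before and after reverse_rows(&mut result) should yield equal values, since reverse only swaps elements within each existing buffer.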
I have a vector of a vector and need to concatenate the second one to the first (it's ok if the second one is dropped), i.e.
f([[1,2,3], [4,5,6]]) => [[1,2,3,4,5,6], []]
or
f([[1,2,3], [4,5,6]]) => [[1,2,3,4,5,6], [4,5,6]]
Both are okay.
My initial solution is:
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
items[0].append(&mut items[1]);
}
But it has a compile time error due to 2 mutable borrows:
  |     items[0].append(&mut items[1]);
  |     -----    ------      ^^^^^ second mutable borrow occurs here
  |     |        |
  |     |        first borrow later used by call
  |     first mutable borrow occurs here
I could solve it with Box / Option, but I wonder whether there are better ways to solve this?
My solution with Box:
fn solution_with_box() {
let mut items = Vec::new();
items.push(Box::new(vec![1,2,3]));
items.push(Box::new(vec![4,5,6]));
let mut second = items[1].clone();
items[0].as_mut().append(second.as_mut());
}
My solution with Option:
fn solution_with_option() {
let mut items = Vec::new();
items.push(Some(vec![1,2,3]));
items.push(Some(vec![4,5,6]));
let mut second = items[1].take();
items[0].as_mut().unwrap().append(second.as_mut().unwrap());
}
You can clone the data of items[1] as follows:
fn main() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let mut a: Vec<i32> = items[1].clone();
items[0].append(&mut a);
}
If you don't want to clone the data, you can use mem::take, as suggested by @trentcl:
fn main() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let second = std::mem::take(&mut items[1]);
items[0].extend(second);
println!("{:?}", items);
}
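The mem::take approach can be wrapped in a small helper. This is a minimal sketch: the function name merge_second_into_first is hypothetical, and it uses append instead of extend, which moves the elements in the same way.

```rust
use std::mem;

/// Moves the contents of `items[1]` onto the end of `items[0]`,
/// leaving an empty vector behind in `items[1]`.
fn merge_second_into_first<T>(items: &mut [Vec<T>]) {
    if items.len() < 2 {
        return; // nothing to merge
    }
    // `mem::take` swaps in `Vec::default()` (an empty Vec, which does not
    // allocate) and hands back the old value, so the mutable borrow of
    // `items[1]` has ended before `items[0]` is borrowed.
    let mut second = mem::take(&mut items[1]);
    items[0].append(&mut second);
}
```

Because the two borrows of items happen one after the other instead of at the same time, the borrow checker accepts this without any cloning.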
This is not the fastest way of doing it, but it solves your problem.
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
// we cannot take two mutable borrows of items at the same time
// items[0].append(&mut items[1]);
// instead you can flatten the vector
let first = items.into_iter().flatten().collect(); // this consumes items, so it's no longer available
let items = vec![first, vec![]];
println!("{:?}", items); // [[1,2,3,4,5,6], []]
}
You can use split_at_mut on a slice or vector to get mutable references to non-overlapping parts that can't interfere with each other, so that you can mutate the first inner vector and the second inner vector at the same time.
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let (contains_first, contains_second) = items.split_at_mut(1);
contains_first[0].append(&mut contains_second[0]);
dbg!(items);
No copying or cloning occurs. Note that contains_second[0] corresponds to items[1] because the second slice split_at_mut returns starts indexing at wherever the split point is (here, 1).
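The same technique extends to any pair of distinct rows. The sketch below is one possible generalisation; append_row is a hypothetical name, and it splits at whichever index is larger so that the two rows always land in different halves.

```rust
/// Appends `items[src]` onto `items[dst]`, leaving `items[src]` empty.
/// Panics if the indices are equal or out of bounds.
fn append_row<T>(items: &mut [Vec<T>], dst: usize, src: usize) {
    assert_ne!(dst, src, "dst and src must be different rows");
    if dst < src {
        // left = items[..src], right = items[src..], so right[0] is items[src].
        let (left, right) = items.split_at_mut(src);
        left[dst].append(&mut right[0]);
    } else {
        // left = items[..dst], right = items[dst..], so right[0] is items[dst].
        let (left, right) = items.split_at_mut(dst);
        right[0].append(&mut left[src]);
    }
}
```

For the example in the question, append_row(&mut items, 0, 1) leaves items as [[1, 2, 3, 4, 5, 6], []].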
You can solve the problem in two steps:
append an empty vector at the end
remove items[1] and append its elements to items[0]
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
items.push(vec![0;0]);
let v = items.remove(1);
items[0].extend(v);
}
According to The Rust Programming Language, ch15-03, std::mem::drop takes an object, receives its ownership, and calls its drop function.
That's what this code does:
fn my_drop<T>(x: T) {}
fn main() {
let x = 5;
let y = &x;
let mut z = 4;
let v = vec![3, 4, 2, 5, 3, 5];
my_drop(v);
}
Is this what std::mem::drop does? Does it perform any other cleanup tasks other than these?
Let's take a look at the source:
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn drop<T>(_x: T) { }
#[inline] gives a hint to the compiler that the function should be inlined. #[stable] is used by the standard library to mark APIs that are available on the stable channel. Otherwise, it's really just an empty function! When _x goes out of scope as drop returns, its destructor is run; there is no other way to perform cleanup tasks implicitly in Rust.
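To see this in action, here is a small sketch with a type whose destructor records when it runs. Noisy and observe_drops are hypothetical names introduced for the illustration; my_drop is the empty function from the question, and the bare drop call is std::mem::drop from the prelude.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A value that writes to a shared log when its destructor runs.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Same shape as std::mem::drop: take ownership, do nothing.
fn my_drop<T>(_x: T) {}

/// Records the order in which destructors and the surrounding code run.
fn observe_drops() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let a = Noisy { name: "a", log: Rc::clone(&log) };
    let b = Noisy { name: "b", log: Rc::clone(&log) };

    my_drop(a); // `a` is moved in and destroyed when my_drop returns
    log.borrow_mut().push("after my_drop(a)".to_string());

    drop(b); // std::mem::drop behaves the same way
    log.borrow_mut().push("after drop(b)".to_string());

    let out = log.borrow().clone();
    out
}
```

The log comes back as "drop a", "after my_drop(a)", "drop b", "after drop(b)", so the hand-written empty function and std::mem::drop both run the destructor before the next statement executes.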
I have been looking at examples of FFTs in Swift, and they all seem to have ConstUnsafePointer when using vDSP_ctozD as in the example below:
import Foundation
import Accelerate
internal func spectrumForValues(signal: [Double]) -> [Double] {
// Find the largest power of two in our samples
let log2N = vDSP_Length(log2(Double(signal.count)))
let n = 1 << log2N
let fftLength = n / 2
// This is expensive; factor it out if you need to call this function a lot
let fftsetup = vDSP_create_fftsetupD(log2N, FFTRadix(kFFTRadix2))
var fft = [Double](count:Int(n), repeatedValue:0.0)
// Generate a split complex vector from the real data
var realp = [Double](count:Int(fftLength), repeatedValue:0.0)
var imagp = realp
withExtendedLifetimes(realp, imagp) {
var splitComplex = DSPDoubleSplitComplex(realp:&realp, imagp:&imagp)
// Take the fft
vDSP_fft_zripD(fftsetup, &splitComplex, 1, log2N, FFTDirection(kFFTDirection_Forward))
// Normalize
var normFactor = 1.0 / Double(2 * n)
vDSP_vsmulD(splitComplex.realp, 1, &normFactor, splitComplex.realp, 1, fftLength)
vDSP_vsmulD(splitComplex.imagp, 1, &normFactor, splitComplex.imagp, 1, fftLength)
// Zero out Nyquist
splitComplex.imagp[0] = 0.0
// Convert complex FFT to magnitude
vDSP_zvmagsD(&splitComplex, 1, &fft, 1, fftLength)
}
// Cleanup
vDSP_destroy_fftsetupD(fftsetup)
return fft
}
// To get rid of the `() -> () in` casting
func withExtendedLifetime<T>(x: T, f: () -> ()) {
return Swift.withExtendedLifetime(x, f)
}
// In the spirit of withUnsafePointers
func withExtendedLifetimes<A0, A1>(arg0: A0, arg1: A1, f: () -> ()) {
return withExtendedLifetime(arg0) { withExtendedLifetime(arg1, f) }
}
However, when I try to use it in my project, ConstUnsafePointer is reported as an unresolved identifier. Any clue how to fix this? Thanks in advance.
The name ConstUnsafePointer was used in early Swift betas last summer (at that time, UnsafePointer meant mutable). Now, constant pointers are just UnsafePointer and mutable pointers are UnsafeMutablePointer.