I want to generate a UUID from a custom random number generator:
use uuid::{Builder, Uuid, Variant, Version};
use rand::{Rng, SeedableRng, rngs::StdRng, RngCore};
fn main() {
let seed = [5u8; 32];
let mut rng: StdRng = SeedableRng::from_seed(seed);
let bytes = ???
let uuid = Builder::from_bytes(bytes)
.set_variant(Variant::RFC4122)
.set_version(Version::Random)
.build();
println!("{:?}", uuid);
}
How do I get the bytes?
I think I have done it.
use rand::{rngs::StdRng, RngCore, SeedableRng};
use uuid::{Builder, Variant, Version};
fn main() {
let seed = [0u8; 32];
let mut rng: StdRng = SeedableRng::from_seed(seed);
let mut bytes = [0u8; 16];
rng.fill_bytes(&mut bytes);
let uuid = Builder::from_bytes(bytes)
.set_variant(Variant::RFC4122)
.set_version(Version::Random)
.build();
println!("{:?}", uuid);
}
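For readers curious what the variant and version steps amount to at the byte level, here is a std-only sketch based on the RFC 4122 layout. The helper names `mark_as_random_uuid` and `to_hyphenated` are hypothetical and are not part of the uuid crate; the 16 input bytes are assumed to come from whatever generator you use (for example the `fill_bytes` call above).

```rust
/// Marks 16 raw bytes as an RFC 4122, version 4 (random) UUID.
/// Hypothetical helper for illustration, not the uuid crate's API.
fn mark_as_random_uuid(mut bytes: [u8; 16]) -> [u8; 16] {
    // Version: the high nibble of byte 6 becomes 0100 (version 4).
    bytes[6] = (bytes[6] & 0x0f) | 0x40;
    // Variant: the top two bits of byte 8 become 10 (RFC 4122).
    bytes[8] = (bytes[8] & 0x3f) | 0x80;
    bytes
}

/// Formats the bytes in the usual 8-4-4-4-12 hexadecimal layout.
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}
```

Only six bits are overwritten, so the remaining 122 bits stay exactly as the generator produced them.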
I need to get the top N items from a Vec which is quite large in production. Currently I do it in this inefficient way:
let mut v = vec![6, 4, 3, 7, 2, 1, 5];
v.sort_unstable();
v = v[0..3].to_vec();
In C++, I'd use std::partial_sort, but I can't find an equivalent in the Rust docs.
Am I just overlooking it, or does it not exist (yet)?
The standard library doesn't contain this functionality, but it looks like the lazysort crate is exactly what you need:
So what's the point of lazy sorting? As per the linked blog post, they're useful when you do not need or intend to need every value; for example you may only need the first 1,000 ordered values from a larger set.
#![feature(test)]
extern crate lazysort;
extern crate rand;
extern crate test;
use std::cmp::Ordering;
trait SortLazy<T> {
fn sort_lazy<F>(&mut self, cmp: F, n: usize)
where
F: Fn(&T, &T) -> Ordering;
unsafe fn sort_lazy_fast<F>(&mut self, cmp: F, n: usize)
where
F: Fn(&T, &T) -> Ordering;
}
impl<T> SortLazy<T> for [T] {
fn sort_lazy<F>(&mut self, cmp: F, n: usize)
where
F: Fn(&T, &T) -> Ordering,
{
fn sort_lazy<F, T>(data: &mut [T], accu: &mut usize, cmp: &F, n: usize)
where
F: Fn(&T, &T) -> Ordering,
{
if !data.is_empty() && *accu < n {
let mut pivot = 1;
let mut lower = 0;
let mut upper = data.len();
while pivot < upper {
match cmp(&data[pivot], &data[lower]) {
Ordering::Less => {
data.swap(pivot, lower);
lower += 1;
pivot += 1;
}
Ordering::Greater => {
upper -= 1;
data.swap(pivot, upper);
}
Ordering::Equal => pivot += 1,
}
}
sort_lazy(&mut data[..lower], accu, cmp, n);
sort_lazy(&mut data[upper..], accu, cmp, n);
} else {
*accu += 1;
}
}
sort_lazy(self, &mut 0, &cmp, n);
}
unsafe fn sort_lazy_fast<F>(&mut self, cmp: F, n: usize)
where
F: Fn(&T, &T) -> Ordering,
{
fn sort_lazy<F, T>(data: &mut [T], accu: &mut usize, cmp: &F, n: usize)
where
F: Fn(&T, &T) -> Ordering,
{
if !data.is_empty() && *accu < n {
unsafe {
use std::mem::swap;
let mut pivot = 1;
let mut lower = 0;
let mut upper = data.len();
while pivot < upper {
match cmp(data.get_unchecked(pivot), data.get_unchecked(lower)) {
Ordering::Less => {
swap(
&mut *(data.get_unchecked_mut(pivot) as *mut T),
&mut *(data.get_unchecked_mut(lower) as *mut T),
);
lower += 1;
pivot += 1;
}
Ordering::Greater => {
upper -= 1;
swap(
&mut *(data.get_unchecked_mut(pivot) as *mut T),
&mut *(data.get_unchecked_mut(upper) as *mut T),
);
}
Ordering::Equal => pivot += 1,
}
}
sort_lazy(&mut data[..lower], accu, cmp, n);
sort_lazy(&mut data[upper..], accu, cmp, n);
}
} else {
*accu += 1;
}
}
sort_lazy(self, &mut 0, &cmp, n);
}
}
#[cfg(test)]
mod tests {
use test::Bencher;
use lazysort::Sorted;
use std::collections::BinaryHeap;
use SortLazy;
use rand::{thread_rng, Rng};
const SIZE_VEC: usize = 100_000;
const N: usize = 42;
#[bench]
fn sort(b: &mut Bencher) {
b.iter(|| {
let mut rng = thread_rng();
let mut v: Vec<i32> = std::iter::repeat_with(|| rng.gen())
.take(SIZE_VEC)
.collect();
v.sort_unstable();
})
}
#[bench]
fn lazysort(b: &mut Bencher) {
b.iter(|| {
let mut rng = thread_rng();
let v: Vec<i32> = std::iter::repeat_with(|| rng.gen())
.take(SIZE_VEC)
.collect();
let _: Vec<_> = v.iter().sorted().take(N).collect();
})
}
#[bench]
fn lazysort_in_place(b: &mut Bencher) {
b.iter(|| {
let mut rng = thread_rng();
let mut v: Vec<i32> = std::iter::repeat_with(|| rng.gen())
.take(SIZE_VEC)
.collect();
v.sort_lazy(i32::cmp, N);
})
}
#[bench]
fn lazysort_in_place_fast(b: &mut Bencher) {
b.iter(|| {
let mut rng = thread_rng();
let mut v: Vec<i32> = std::iter::repeat_with(|| rng.gen())
.take(SIZE_VEC)
.collect();
unsafe { v.sort_lazy_fast(i32::cmp, N) };
})
}
#[bench]
fn binaryheap(b: &mut Bencher) {
b.iter(|| {
let mut rng = thread_rng();
let v: Vec<i32> = std::iter::repeat_with(|| rng.gen())
.take(SIZE_VEC)
.collect();
let mut iter = v.iter();
let mut heap: BinaryHeap<_> = iter.by_ref().take(N).collect();
for i in iter {
heap.push(i);
heap.pop();
}
let _ = heap.into_sorted_vec();
})
}
}
running 5 tests
test tests::binaryheap ... bench: 3,283,938 ns/iter (+/- 413,805)
test tests::lazysort ... bench: 1,669,229 ns/iter (+/- 505,528)
test tests::lazysort_in_place ... bench: 1,781,007 ns/iter (+/- 443,472)
test tests::lazysort_in_place_fast ... bench: 1,652,103 ns/iter (+/- 691,847)
test tests::sort ... bench: 5,600,513 ns/iter (+/- 711,927)
test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out
This benchmark shows that lazysort is faster than the solution with BinaryHeap. The BinaryHeap solution also gets worse as N increases.
The problem with lazysort is that it creates a second Vec<_>. A "better" solution would be to implement the partial sort in-place. I provided an example of such an implementation.
Keep in mind that all these solutions come with overhead. When N is about SIZE_VEC / 3, the classic sort wins.
You could submit an RFC/issue to ask about adding this feature to the standard library.
There is a select_nth_unstable, the equivalent of std::nth_element. The result of this can then be sorted to achieve what you want.
Example:
let mut v = vec![6, 4, 3, 7, 2, 1, 5];
let top_three = v.select_nth_unstable(3).0;
top_three.sort();
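Here 3 is the index of the "nth" element, so we are actually selecting the 4th smallest element. That is because select_nth_unstable returns a tuple of:
a slice to the left of the nth element
a reference to the nth element
a slice to the right of the nth element
The left slice (.0) therefore holds the three smallest elements in arbitrary order, which is why it is sorted afterwards.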
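The same idea can be wrapped in a small helper. This is a minimal sketch; the function name `top_n_smallest` is made up for illustration, and it takes the vector by value for simplicity. The bounds check is needed because `select_nth_unstable` panics when the index is not smaller than the slice length.

```rust
/// Returns the `n` smallest elements of `v` in ascending order.
/// Hypothetical helper built on `select_nth_unstable`.
fn top_n_smallest<T: Ord>(mut v: Vec<T>, n: usize) -> Vec<T> {
    if n < v.len() {
        // Partition so that v[..n] holds the n smallest elements (unordered).
        let _ = v.select_nth_unstable(n);
        // Drop everything from the nth element onwards.
        v.truncate(n);
    }
    // Only the (at most n) surviving elements are fully sorted.
    v.sort_unstable();
    v
}
```

Calling `top_n_smallest(vec![6, 4, 3, 7, 2, 1, 5], 3)` yields `[1, 2, 3]`. When `n` is at least the length of the vector, the whole vector is simply sorted.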
I would like to be able to obtain references (both immutable and mutable) to the usize wrapped in Bar in the Foo enum:
use Foo::*;
#[derive(Debug, PartialEq, Clone)]
pub enum Foo {
Bar(usize)
}
impl Foo {
/* this works */
fn get_bar_ref(&self) -> &usize {
match *self {
Bar(ref n) => &n
}
}
/* this doesn't */
fn get_bar_ref_mut(&mut self) -> &mut usize {
match *self {
Bar(ref mut n) => &mut n
}
}
}
But I can't obtain the mutable reference because:
n does not live long enough
I was able to provide both variants of similar functions accessing other contents of Foo that are Boxed - why does the mutable borrow (and why only it) fail with an unboxed primitive?
You need to replace Bar(ref mut n) => &mut n with Bar(ref mut n) => n.
When you use ref mut n in Bar(ref mut n), it creates a mutable
reference to the data in Bar, so the type of n is &mut usize.
Then you try to return &mut n, which has type &mut &mut usize.
This part is most likely incorrect.
Now deref coercion kicks in
and converts &mut n into &mut *n, creating a temporary value *n
of type usize, which doesn't live long enough.
These examples show the same problem:
fn implicit_reborrow<T>(x: &mut T) -> &mut T {
x
}
fn explicit_reborrow<T>(x: &mut T) -> &mut T {
&mut *x
}
fn implicit_reborrow_bad<T>(x: &mut T) -> &mut T {
&mut x
}
fn explicit_reborrow_bad<T>(x: &mut T) -> &mut T {
&mut **&mut x
}
The explicit_ versions show what the compiler deduces through deref coercions.
The _bad versions both error in the exact same way, while the other two compile.
This is either a bug, or a limitation in how lifetimes are currently implemented in the compiler. The invariance of &mut T over T might have something to do with it, because it results in &mut &'a mut T being invariant over 'a and thus more restrictive during inference than the shared reference (&&'a T) case, even though in this situation the strictness is unnecessary.
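For reference, a sketch of the enum from the question with the suggested fix applied, so that both accessors compile. The paths are written as `Foo::Bar` here only to keep the snippet independent of the `use Foo::*;` line.

```rust
#[derive(Debug, PartialEq, Clone)]
pub enum Foo {
    Bar(usize),
}

impl Foo {
    pub fn get_bar_ref(&self) -> &usize {
        match *self {
            // `n` is already a `&usize`, so it can be returned directly.
            Foo::Bar(ref n) => n,
        }
    }

    pub fn get_bar_ref_mut(&mut self) -> &mut usize {
        match *self {
            // `n` is already a `&mut usize`; returning it as is avoids
            // borrowing the local binding itself.
            Foo::Bar(ref mut n) => n,
        }
    }
}
```

With this change, `*foo.get_bar_ref_mut() += 1;` updates the value stored inside `Bar`.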
I'm trying to solve a Rust algorithm question on hackerrank. My answer times out on some of the larger test cases. There are about 5 people who've completed it, so I believe it is possible and I assume they compile in release mode. Are there any speed-ups I'm missing?
The gist of the game is that a counter (inp in main) is conditionally reduced, and the winner is chosen based on who can no longer reduce it.
use std::io;
fn main() {
let n: usize = read_one_line().
trim().parse().unwrap();
for _i in 0..n{
let inp: u64 = read_one_line().
trim().parse().unwrap();
println!("{:?}", find_winner(inp));
}
return;
}
fn find_winner(mut n: u64) -> String{
let mut win = 0;
while n>1{
if n.is_power_of_two(){
n /= 2;
}
else{
n -= n.next_power_of_two()/2;
}
win += 1;
}
let winner =
if win % 2 == 0{
String::from("Richard")
} else{
String::from("Louise")
};
winner
}
fn read_one_line() -> String{
let mut input = String::new();
io::stdin().read_line(&mut input).expect("Failed to read");
input
}
Your inner loop can be replaced by a combination of builtin functions:
let win = if n > 0 {
n.count_ones() + n.trailing_zeros() - 1
} else {
0
};
Also, instead of allocating a string every time find_winner is called,
a string slice may be returned:
fn find_winner(n: u64) -> &'static str {
let win = if n > 0 {
n.count_ones() + n.trailing_zeros() - 1
} else {
0
};
if win % 2 == 0 {
"Richard"
} else{
"Louise"
}
}
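Why the closed form matches the loop: every step on a non-power-of-two clears the highest set bit, which takes `count_ones() - 1` steps until a single bit remains, and halving that power of two down to 1 then takes `trailing_zeros()` steps. The sketch below places both versions side by side so they can be compared; the function names `count_moves_loop` and `count_moves_bits` are made up for illustration.

```rust
/// Counts the moves by simulating the game, as in the question.
fn count_moves_loop(mut n: u64) -> u32 {
    let mut moves = 0;
    while n > 1 {
        if n.is_power_of_two() {
            n /= 2;
        } else {
            // Subtract the largest power of two below n (clears the top bit).
            n -= n.next_power_of_two() / 2;
        }
        moves += 1;
    }
    moves
}

/// Counts the moves with bit operations only.
fn count_moves_bits(n: u64) -> u32 {
    if n > 0 {
        n.count_ones() + n.trailing_zeros() - 1
    } else {
        0
    }
}
```

For example, 132 is 10000100 in binary: one subtraction leaves 4, and two halvings reach 1, so both functions return 3.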
Avoiding memory allocation can help speed up the application.
At the moment, the read_one_line function is doing one memory allocation per call, which can be avoided if you supply the String as a &mut parameter:
fn read_one_line(input: &mut String) -> &str {
// read_line appends to the buffer, so empty it before reusing it
input.clear();
io::stdin().read_line(input).expect("Failed to read");
input
}
Note how I also alter the return type to return a slice (which borrows input): further uses here do not need to modify the original string.
Another improvement is I/O. io::stdin() returns a handle to a shared, mutex-protected input buffer, so each call to read_line on it has to acquire that lock again.
You can (and should) instead build a single reader that implements std::io::BufRead, for example a std::io::BufReader. Build it once, then pass it as an argument:
fn read_one_line<'a, R>(reader: &mut R, input: &'a mut String) -> &'a str
where R: io::BufRead
{
// read_line appends to the buffer, so empty it before reusing it
input.clear();
reader.read_line(input).expect("Failed to read");
input
}
Note:
it's easier to make it generic (R) than to specify the exact type of BufReader :)
annotating the lifetime is mandatory because the return type could borrow either parameter
Putting it all together:
fn read_one_line<'a, R>(reader: &mut R, input: &'a mut String) -> &'a str
where R: io::BufRead
{
// read_line appends to the buffer, so empty it before reusing it
input.clear();
reader.read_line(input).expect("Failed to read");
input
}
fn main() {
let mut reader = io::BufReader::new(io::stdin());
let mut input = String::new();
let n: usize = read_one_line(&mut reader, &mut input).
trim().parse().unwrap();
for _i in 0..n{
let inp: u64 = read_one_line(&mut reader, &mut input).
trim().parse().unwrap();
println!("{:?}", find_winner(inp));
}
return;
}
The bigger win is probably the I/O change, which might even be sufficient in itself.
Don't forget to also apply @John's advice; that way your main loop will be allocation-free!
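Combining both answers, a self-contained sketch could look like the following. The `solve` function and its generic reader and writer parameters are assumptions made for illustration, so that the logic can be exercised with an in-memory reader as well as with stdin. Note that `read_line` appends to the string it is given, so the buffer is cleared before each read.

```rust
use std::io::{self, BufRead, Write};

fn find_winner(n: u64) -> &'static str {
    let win = if n > 0 {
        n.count_ones() + n.trailing_zeros() - 1
    } else {
        0
    };
    if win % 2 == 0 {
        "Richard"
    } else {
        "Louise"
    }
}

/// Reads the test cases from `reader` and writes one winner per line to `out`.
/// Hypothetical wrapper; one String buffer is reused for every line.
fn solve<R: BufRead, W: Write>(mut reader: R, mut out: W) -> io::Result<()> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    let cases: usize = line.trim().parse().expect("expected a test-case count");
    for _ in 0..cases {
        // read_line appends, so reset the buffer before each read.
        line.clear();
        reader.read_line(&mut line)?;
        let n: u64 = line.trim().parse().expect("expected a number");
        writeln!(out, "{}", find_winner(n))?;
    }
    Ok(())
}
```

In a real program, `main` could call something like `solve(io::stdin().lock(), io::BufWriter::new(io::stdout()))`, which also buffers the output instead of flushing it line by line.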
It is possible to make the following binding in Rust:
let &mut a = &mut 5;
But what does it mean exactly? For example, let a = &mut 5 creates an immutable binding of type &mut i32, let mut a = &mut 5 creates a mutable binding of type &mut i32. What about let &mut?
An easy way to test the type of something is to assign it to the wrong type:
let _: () = a;
In this case the value is an "integral variable", or a by-value integer. It is not mutable (as testing with a += 1 shows).
This is because you are using destructuring syntax. You are pattern matching your &mut 5 against an &mut _, much like if you wrote
match &mut 5 { &mut a => {
// rest of code
} };
Thus you are adding a mutable reference and immediately dereferencing it.
To bind a mutable reference to a value instead, you can do
let ref mut a = 5;
This is useful in destructuring to take references to multiple inner values.
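The three cases discussed above can be compared in one small sketch. The function name `binding_demo` and its return shape are made up purely so the results can be inspected.

```rust
/// Returns the values produced by the three binding forms.
fn binding_demo() -> (i32, i32, (i32, i32)) {
    // `&mut a` is a pattern: it matches the reference and binds `a`
    // to the integer behind it, so `a` is a plain, immutable i32.
    let &mut a = &mut 5;
    let copied: i32 = a;

    // `ref mut b` binds a mutable reference to the value instead.
    let ref mut b = 5;
    *b += 1;
    let bumped: i32 = *b;

    // In destructuring, `ref mut` borrows several inner values at once.
    let mut pair = (1, 2);
    {
        let (ref mut x, ref mut y) = pair;
        *x += 10;
        *y += 20;
    }
    (copied, bumped, pair)
}
```

The last case modifies `pair` in place because `x` and `y` are references into it rather than copies of its fields.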