Initialize a vector of struct with zero values in Rust - performance

I have a vector of structs and want to initialize it with all zeros.
struct MyStruct {
v1: u32,
v2: u64,
}
type MyVector = Vec<MyStruct>;
Because the size of the vector is already known, I can specify the capacity.
My first approach is as follows:
impl Default for MyStruct {
fn default() -> Self {
Self {
v1: 0,
v2: 0,
}
}
}
fn init_my_vec() {
let size = 1000;
let mut my_vec: MyVector = Vec::with_capacity(size);
(0..size).for_each(|_| my_vec.push(MyStruct::default()))
}
As far as I know, initializing a vector with zeros through the vec! macro is faster than pushing elements in a loop, like this:
let usize_vec: Vec<usize> = vec![0; 1000];
// is faster than
let mut usize_vec: Vec<usize> = Vec::with_capacity(1000);
for _ in 0..1000 {
usize_vec.push(0);
}
Question
1. Am I right about vector initialization speed? My understanding is that filling with 0 is a special instruction, so using an iterator is slower than using the macro.
2. Is there any method that can initialize a vector of structs with zero values safely and quickly?
3. Or should I use unsafe code, such as allocating zeroed bytes and casting them to a vector?
Speed measurement for Question 1
use std::time::Instant;
const VEC_SIZE: usize = 10_000;
fn init_with_iter() -> u128 {
let start = Instant::now();
let mut usize_vec: Vec<usize> = Vec::with_capacity(VEC_SIZE);
for _ in 0..VEC_SIZE {
usize_vec.push(0);
}
start.elapsed().as_micros()
}
fn init_with_macro() -> u128 {
let start = Instant::now();
let _: Vec<usize> = vec![0; VEC_SIZE];
start.elapsed().as_micros()
}
Average time taken to generate the vector 10,000 times, on my machine:
using iter (init_with_iter): 514.6805 ms
using macro (init_with_macro): 2.0361 ms
Speed measurement for Question 3
I think using the unsafe function mem::zeroed is slightly faster than the others.
const VEC_SIZE: usize = 10_000;
fn init_with_iter() -> u128 {
let start = Instant::now();
let mut my_vec: MyVector = Vec::with_capacity(VEC_SIZE);
for _ in 0..VEC_SIZE {
my_vec.push(MyStruct::default());
}
start.elapsed().as_micros()
}
fn init_with_macro() -> u128 {
let start = Instant::now();
let _: MyVector = vec![MyStruct::default(); VEC_SIZE];
start.elapsed().as_micros()
}
fn init_with_zeroed() -> u128 {
let start = Instant::now();
let _: MyVector = unsafe { vec![std::mem::zeroed(); VEC_SIZE] };
start.elapsed().as_micros()
}
Average time taken to generate the vector 1,000 times, on my machine:
using iter (init_with_iter): 575.572 ms
using macro (init_with_macro): 486.958 ms
using unsafe function (init_with_zeroed): 468.885 ms

Here is a criterion benchmark of your three approaches:
use criterion::{black_box, criterion_group, criterion_main, Criterion};
criterion_group!(
benches,
init_structs_with_iter,
init_structs_with_macro,
init_structs_with_unsafe
);
criterion_main!(benches);
const N_ITEMS: usize = 1000;
#[allow(unused)]
#[derive(Debug, Clone)]
struct MyStruct {
v1: u32,
v2: u64,
}
impl Default for MyStruct {
fn default() -> Self {
Self { v1: 0, v2: 0 }
}
}
fn init_structs_with_iter(c: &mut Criterion) {
c.bench_function("structs: with_iter", |b| {
b.iter(|| {
let mut my_vec = Vec::with_capacity(N_ITEMS);
(0..my_vec.capacity()).for_each(|_| my_vec.push(MyStruct::default()));
black_box(my_vec);
})
});
}
fn init_structs_with_macro(c: &mut Criterion) {
c.bench_function("structs: with_macro", |b| {
b.iter(|| {
let my_vec = vec![MyStruct::default(); N_ITEMS];
black_box(my_vec);
})
});
}
fn init_structs_with_unsafe(c: &mut Criterion) {
c.bench_function("structs: with_unsafe", |b| {
b.iter(|| {
let my_vec: Vec<MyStruct> = vec![unsafe { std::mem::zeroed() }; N_ITEMS];
black_box(my_vec);
})
});
}
And the results:
structs: with_iter time: [1.3857 us 1.3960 us 1.4073 us]
structs: with_macro time: [563.30 ns 565.30 ns 567.32 ns]
structs: with_unsafe time: [568.84 ns 570.09 ns 571.49 ns]
The vec![] macro seems to be the fastest (and also the cleanest and easiest to read).
As you can see, the time is measured in nanoseconds, so although the iterator version is 2-3x slower, it won't matter in practice. Optimizing the zero-initialization of a struct is the least important thing you can do - you can save at most 1 microsecond ;)
PS: these times include memory allocation and deallocation.
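As a minimal sketch of the safe options discussed above: assuming the struct derives Default and Clone (the question implements Default by hand), both vec! and Vec::resize_with produce a zero-filled vector without unsafe. The helper names zeroed_with_macro and zeroed_with_resize are hypothetical and only serve the illustration.

```rust
// Deriving Default yields v1 = 0 and v2 = 0; Clone is required by vec![value; n].
#[derive(Debug, Default, Clone, PartialEq)]
struct MyStruct {
    v1: u32,
    v2: u64,
}

// Hypothetical helper: clones one default value n times.
fn zeroed_with_macro(n: usize) -> Vec<MyStruct> {
    vec![MyStruct::default(); n]
}

// Hypothetical helper: calls the constructor once per element,
// so this variant would also work for types that are not Clone.
fn zeroed_with_resize(n: usize) -> Vec<MyStruct> {
    let mut v = Vec::with_capacity(n);
    v.resize_with(n, MyStruct::default);
    v
}
```

Both helpers return vectors of length n whose fields are all zero; which one is faster on a given machine is best checked with a benchmark like the one above.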

Related

How can I speed up the process of parsing a 40Gb txt file and inserting a row, per line, into an sqlite db

Background
I'm currently trying to parse a .txt file containing the .com zone file. It is structured like so:
blahblah.com xx xx examplens.com
Currently my code is structured like so
extern crate regex;
use regex::Regex;
use rusqlite::{params, Connection, Result};
#[derive(Debug)]
struct Domain {
id: i32,
domain: String,
}
use std::io::stdin;
fn main() -> std::io::Result<()> {
let mut reader = my_reader::BufReader::open("data/com_practice.txt")?;
let mut buffer = String::new();
let conn: Connection = Connection::open("/Users/alex/Projects/domain_randomizer/data/domains.db").unwrap();
while let Some(line) = reader.read_line(&mut buffer) {
let regexed_i_think = rexeginton_the_domain_only(line?.trim());
println!("{:?}", regexed_i_think);
sqliting(regexed_i_think, &conn).unwrap();
}
let mut stmt = conn.prepare("SELECT id, domain FROM domains").unwrap();
let domain_iter = stmt.query_map(params![], |row| {
Ok(Domain {
id: row.get(0)?,
domain: row.get(1)?,
})
}).unwrap();
for domain in domain_iter {
println!("Found domain {:?}", domain.unwrap());
}
Ok(())
}
pub fn sqliting(domain_name: &str, conn: &Connection) -> Result<()> {
let yeah = Domain {
id: 0,
domain: domain_name.to_string()
};
conn.execute(
"INSERT INTO domains (domain) VALUES (?1)",
params![yeah.domain],
)?;
Ok(())
}
mod my_reader {
use std::{
fs::File,
io::{self, prelude::*},
};
pub struct BufReader {
reader: io::BufReader<File>,
}
impl BufReader {
pub fn open(path: impl AsRef<std::path::Path>) -> io::Result<Self> {
let file = File::open(path)?;
let reader = io::BufReader::new(file);
Ok(Self { reader })
}
pub fn read_line<'buf>(
&mut self,
buffer: &'buf mut String,
) -> Option<io::Result<&'buf mut String>> {
buffer.clear();
self.reader
.read_line(buffer)
.map(|u| if u == 0 { None } else { Some(buffer) })
.transpose()
}
}
}
pub fn rexeginton_the_domain_only(full_line: &str) -> &str {
let regex_str = Regex::new(r"(?m).*?\.com").unwrap();
let final_str = regex_str.captures(full_line).unwrap().get(0).unwrap().as_str();
return final_str;
}
Issue
So I am parsing a single domain at a time, making an insert each time. As I've gathered, inserting would be far more efficient if I made thousands of inserts in a single transaction. However, I'm not quite sure what an efficient approach to refactoring my parsing around this would be.
Question
How should I reorient my parsing process around my insertion process? Also, how can I measure the speed and efficiency of the process in the first place, so that I can compare approaches in an articulate manner?
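One way to picture the batching idea is the sketch below, which uses only the standard library. It groups parsed lines into fixed-size batches and hands each batch to a callback; the callback is where one transaction containing many inserts would go. The names for_each_batch and timed are hypothetical, and the database call is deliberately left out, so this illustrates the structure and is not a tested SQLite integration.

```rust
use std::io::{self, BufRead};
use std::time::{Duration, Instant};

/// Reads `reader` line by line and hands the trimmed, non-empty lines to
/// `flush` in groups of at most `batch_size`. Returns the number of lines kept.
fn for_each_batch<R, F>(reader: R, batch_size: usize, mut flush: F) -> io::Result<usize>
where
    R: BufRead,
    F: FnMut(&[String]),
{
    let mut batch: Vec<String> = Vec::with_capacity(batch_size);
    let mut total = 0;
    for line in reader.lines() {
        let line = line?;
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        batch.push(line.to_string());
        total += 1;
        if batch.len() == batch_size {
            // Assumed usage: open a transaction, insert every row, commit.
            flush(&batch);
            batch.clear();
        }
    }
    if !batch.is_empty() {
        // Final partial batch.
        flush(&batch);
    }
    Ok(total)
}

/// Runs `work` once and returns its result together with the elapsed wall time.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}
```

Wrapping the whole import in timed, once with a batch size of 1 and once with a batch size in the thousands, gives two wall-clock numbers that can be compared directly on the same input file.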

Iterative filter function that modifies tree like structure

I have a structure like
struct Node {
pub id: String,
pub dis: String,
pub parent: Option<NodeRefNodeRefWeak>,
pub children: Vec<NodeRef>,
}
pub type NodeRef = Rc<RefCell<Node>>;
pub type NodeRefNodeRefWeak = Weak<RefCell<Node>>;
I also have the start of a function that can iterate this structure to pull out a match on a node id, but it has issues.
What I would like is for this function to return the parent node of the whole tree with ONLY the branches that have a match somewhere on the branch.
Children past the search node can be removed.
I.e., a function that filters all other nodes out of the tree.
For example with my rust playground link I would like it to return
level0_node_#1 (level0_node_#1)
  level1_node_4 (level1_node_4)
    level2_node_4_3 (level2_node_4_3)
      level3_node_4_3_2 (level3_node_4_3_2)
However, using the recursive approach below causes real issues with "already borrowed" errors when trying to remove branches, etc.
Is there a way to achieve this filter function?
I have a test in the Rust playground.
fn tree_filter_node_objects<F>(node: &NodeRef, f: F) -> Vec<NodeRef>
where F: Fn(&str) -> bool + Copy {
let mut filtered_nodes: Vec<NodeRef> = vec![];
let mut borrow = node.borrow_mut();
if f(&borrow.id) {
filtered_nodes.push(node.clone());
}
for n in borrow.children.iter() {
let children_filtered = tree_filter_node_objects(n, f);
for c in children_filtered.iter() {
filtered_nodes.push(c.clone());
}
}
filtered_nodes
}
In the end I used this iterative approach.
pub fn tree_filter_node_dfs<F>(root: &NodeRef, f: F) -> Vec<NodeRef>
where F: Fn(&BmosHaystackObject) -> bool + Copy {
let mut filtered_nodes: Vec<NodeRef> = vec![];
let mut cur_node: Option<NodeRef> = Some(root.clone());
let mut last_visited_child: Option<NodeRef> = None;
let mut next_child: Option<NodeRef>;
let mut run_visit: bool = true;
while cur_node.is_some() {
if run_visit {
let n = cur_node.as_ref().unwrap();
if f(&n.borrow().object) {
// Collect nodes that satisfy the predicate.
filtered_nodes.push(n.clone());
}
}
if last_visited_child.is_none() {
let children = cur_node.as_ref().unwrap().borrow().children.clone();
if children.len() > 0 {
next_child = Some(children[0].clone());
}
else {
next_child = None;
}
}
else {
next_child = tree_filter_node_get_next_sibling(last_visited_child.as_ref().unwrap());
}
if next_child.is_some() {
last_visited_child = None;
cur_node = next_child;
run_visit = true;
}
else {
last_visited_child = cur_node;
cur_node = tree_node_parent_node(&last_visited_child.clone().unwrap().clone());
run_visit = false;
}
}
filtered_nodes
}
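For the pruning behaviour described in the question (keep only the branches that contain a match, and drop children past the matching node), a recursive version can avoid the "already borrowed" errors by never holding a borrow across the recursive call. The sketch below assumes a simplified Node without the dis field; prune and add_child are hypothetical names, and the tree is modified in place instead of returning a Vec.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub type NodeRef = Rc<RefCell<Node>>;

pub struct Node {
    pub id: String,
    pub parent: Option<Weak<RefCell<Node>>>,
    pub children: Vec<NodeRef>,
}

/// Hypothetical helper for building a tree: appends a new child to `parent`.
pub fn add_child(parent: &NodeRef, id: &str) -> NodeRef {
    let child = Rc::new(RefCell::new(Node {
        id: id.to_string(),
        parent: Some(Rc::downgrade(parent)),
        children: Vec::new(),
    }));
    parent.borrow_mut().children.push(child.clone());
    child
}

/// Prunes the subtree under `node` in place. Returns true if `node` or one of
/// its descendants satisfies `f`, i.e. if the caller should keep this branch.
pub fn prune<F>(node: &NodeRef, f: F) -> bool
where
    F: Fn(&str) -> bool + Copy,
{
    // Move the children out first, so no borrow of `node` is alive while recursing.
    let children = std::mem::take(&mut node.borrow_mut().children);
    let is_match = f(node.borrow().id.as_str());
    if is_match {
        // Children past a matching node are dropped here.
        return true;
    }
    let kept: Vec<NodeRef> = children.into_iter().filter(|c| prune(c, f)).collect();
    let keep = !kept.is_empty();
    node.borrow_mut().children = kept;
    keep
}
```

Calling prune on the root leaves only the paths from the root to each matching node. Because the search stops at the first match on a path, a second match deeper on the same path is removed along with the other children; whether that is acceptable depends on the use case.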

How do I create a BinaryHeap that pops the smallest value, not the largest?

I can use the std::collections::BinaryHeap to iterate over a collection of a struct in the greatest to least order with pop, but my goal is to iterate over the collection from least to greatest.
I have succeeded by reversing the Ord implementation:
impl Ord for Item {
fn cmp(&self, other: &Self) -> Ordering {
match self.offset {
b if b > other.offset => Ordering::Less,
b if b < other.offset => Ordering::Greater,
b if b == other.offset => Ordering::Equal,
_ => Ordering::Equal, // ?not sure why compiler needs this
}
}
}
Now the BinaryHeap returns the Items in least to greatest. Seeing as how this is not the intended API, is this an incorrect or error prone pattern?
I realize that a LinkedList would give me the pop_front method, but I would need to sort the list on insert. Is that the better solution?
Reversing the order of a type inside the heap is fine. However, you don't need to implement your own order reversal. Instead, use std::cmp::Reverse or Ordering::reverse as appropriate.
If it makes sense for your type to actually be less than another value when some field is greater, implement your own Ord:
impl Ord for Item {
fn cmp(&self, other: &Self) -> Ordering {
self.offset.cmp(&other.offset).reverse()
}
}
If you do not wish to change the ordering of your type, flip the ordering when you put it in the BinaryHeap:
use std::{cmp::Reverse, collections::BinaryHeap};
fn main() {
let mut a: BinaryHeap<_> = vec![1, 2, 3].into_iter().collect();
if let Some(v) = a.pop() {
println!("Next is {}", v);
}
let mut b: BinaryHeap<_> = vec![1, 2, 3].into_iter().map(Reverse).collect();
if let Some(Reverse(v)) = b.pop() {
println!("Next is {}", v);
}
}
Next is 3
Next is 1
See also:
How can I implement a min-heap of f64 with Rust's BinaryHeap?
How do I select different std::cmp::Ord (or other trait) implementations for a given type?
Is [a LinkedList] the better solution?
99.9% of the time, a linked list is not a better solution.
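Applied to a struct like the question's Item, the Reverse approach might look like the following sketch. It assumes Item can derive Ord with offset as its first field; the name field and the drain_ascending helper are hypothetical and only there to make the example complete.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Field order matters for the derived Ord: `offset` is compared first.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Item {
    offset: u64,
    name: String, // hypothetical extra field
}

/// Pushes all items into a min-heap and pops them in ascending `offset` order.
fn drain_ascending(items: Vec<Item>) -> Vec<Item> {
    let mut heap: BinaryHeap<Reverse<Item>> = items.into_iter().map(Reverse).collect();
    let mut out = Vec::with_capacity(heap.len());
    while let Some(Reverse(item)) = heap.pop() {
        out.push(item);
    }
    out
}
```

The natural ordering of Item stays untouched, so the same type can still be sorted ascending elsewhere; only the heap sees the reversed order.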
Using std::cmp::Reverse:
use std::cmp::Reverse;
use std::collections::BinaryHeap;
fn main() {
let mut heap = BinaryHeap::new();
(0..10)
.map(|i| {
if heap.len() >= 3 {
println!("Popped: {:?}.", heap.pop());
}
heap.push(Reverse(i));
})
.for_each(drop);
println!("{:?}", heap);
}
Popped: Some(Reverse(0)).
Popped: Some(Reverse(1)).
Popped: Some(Reverse(2)).
Popped: Some(Reverse(3)).
Popped: Some(Reverse(4)).
Popped: Some(Reverse(5)).
Popped: Some(Reverse(6)).
[Reverse(7), Reverse(8), Reverse(9)]
For custom impl types:
use std::cmp::Ordering;
use std::collections::BinaryHeap;
#[derive(Debug, PartialEq, Eq)]
struct MyU64Min(u64);
impl From<u64> for MyU64Min {
fn from(i: u64) -> Self {
Self(i)
}
}
impl PartialOrd for MyU64Min {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
other.0.partial_cmp(&self.0)
}
}
impl Ord for MyU64Min {
fn cmp(&self, other: &MyU64Min) -> Ordering {
self.partial_cmp(other).unwrap()
}
}
fn main() {
let mut heap = BinaryHeap::new();
(0..10)
.map(|i| {
if heap.len() >= 3 {
println!("Popped: {:?}.", heap.pop());
}
heap.push(MyU64Min::from(i));
})
.for_each(drop);
println!("{:?}", heap);
}
Popped: Some(MyU64Min(0)).
Popped: Some(MyU64Min(1)).
Popped: Some(MyU64Min(2)).
Popped: Some(MyU64Min(3)).
Popped: Some(MyU64Min(4)).
Popped: Some(MyU64Min(5)).
Popped: Some(MyU64Min(6)).
[MyU64Min(7), MyU64Min(8), MyU64Min(9)]

Rust traits: The bounds might not be implemented, and the traits I've implemented does not exist

So I've been trying to implement a library for vector and matrix maths, and I created some functions that worked alright but wanted to generalize for all number primitives and add the functionality into the normal operators.
My thought was that I'd create a container for a Vec<T> that can contain either number types (like i32) or another container for Vec, so that matrices were possible. Ergo:
#[derive(Clone, Debug)]
struct Mat<T>(Vec<T>);
Then, to add together two vecs of any number I implement Add as:
impl<'a, T> Add for &'a Mat<T>
where T: PartialEq + PartialOrd + Add<T> + Sub<T> + Mul<T> + Div<T> + Rem<T> + Clone {
type Output = Option<Mat<<T as std::ops::Add>::Output>>;
fn add(self, other: &Mat<T>) -> Self::Output {
let a: &Vec<T> = self.pop();
let b: &Vec<T> = other.pop();
match a.len() == b.len() {
true => {
let mut retvec: Vec<<T as std::ops::Add>::Output> = Vec::new();
for i in 0..a.len() {
retvec.push(a[i].clone() + b[i].clone());
}
Some(Mat(retvec))
},
false => None
}
}
}
Edit: To further clarify, Mat::pop() is just the unwrap function, though probably poorly named.
The basic scenario of adding together two vectors of any number seems to work.
#[test]
fn add_override_vectors() {
let vec: Mat<i32> = Mat(vec![2, 2, 2]);
let newvec = &vec + &vec;
assert_eq!(*newvec.unwrap().pop(), vec![4,4,4]);
}
But matrices are giving me a headache. For them, the add function looks very similar, except for the let Some(x) statement:
impl<'a, T> Add for &'a Mat<Mat<T>>
where T: Add<&'a Mat<T>>{
type Output = Option<Mat<T>>;
fn add(self, other: &Mat<Mat<T>>) -> Self::Output {
let a: &Vec<Mat<T>> = self.pop();
let b: &Vec<Mat<T>> = other.pop();
match a.len() == b.len() {
true => {
let mut retvec: Vec<T> = Vec::new();
for i in 0..a.len() {
if let Some(x) = &a[i] + &b[i] {
retvec.push(x);
}
}
Some(Mat(retvec))
},
false => None
}
}
}
The error message I get is:
error[E0369]: binary operation `+` cannot be applied to type `&Mat<T>`
--> src\main.rs:46:38
|
46 | if let Some(x) = &a[i] + &b[i] {
| ^^^^^^^^^^^^^
|
= note: an implementation of `std::ops::Add` might be missing for `&Mat<T>`
So the compiler says that Add might not be implemented for &Mat<T>, but I thought that I had specified the bound so that it has that requirement in where T: Add<&'a Mat<T>>. To me it seems that whatever is in &a[i] should have the Add trait implemented. What am I doing wrong here?
Just as extra clarification, my idea is that Add for &'a Mat<Mat<T>> should be able to be called recursively until it boils down to the Vec with an actual number type in it. Then the Add for &'a Mat<T> should be called.
There are two problems: the associated Output type is wrong, and so is the type of retvec.
Something like this should work:
impl<'a, T> Add for &'a Mat<Mat<T>>
where
T: PartialEq + PartialOrd + Add<T> + Clone,
{
type Output = Option<Mat<Mat<<T as std::ops::Add>::Output>>>;
fn add(self, other: &Mat<Mat<T>>) -> Self::Output {
let a: &Vec<Mat<T>> = self.pop();
let b: &Vec<Mat<T>> = other.pop();
match a.len() == b.len() {
true => {
let mut retvec: Vec<Mat<<T as std::ops::Add>::Output>> = Vec::new();
for i in 0..a.len() {
if let Some(x) = &a[i] + &b[i] {
retvec.push(x);
}
}
Some(Mat(retvec))
}
false => None,
}
}
}
Apart from the compilation issue, I think it is not correct to implement a trait for a "recursive" struct like Mat<Mat<T>>. If you think of X as type X = Mat<T>, then the impl for Mat<T> suffices:
impl<'a, T> Add for &'a Mat<T>
where
T: PartialEq + PartialOrd + Add<T> + Clone
with the additional impl for Mat<T> values:
impl<T> Add for Mat<T>
where
T: PartialEq + PartialOrd + Add<T> + Clone
Below is full working code. Note that the Output type is no longer an Option<Mat<T>> but a plain Mat<T>: this avoids a lot of headaches, and an Option output is probably conceptually wrong if you want to implement some kind of algebra.
use std::ops::*;
use std::vec::Vec;
#[derive(Clone, Debug, PartialEq, PartialOrd)]
struct Mat<T>(Vec<T>);
impl<T> Mat<T> {
fn pop(&self) -> &Vec<T> {
&self.0
}
}
impl<T> Add for Mat<T>
where
T: PartialEq + PartialOrd + Add<T> + Clone,
{
type Output = Mat<<T as std::ops::Add>::Output>;
fn add(self, other: Mat<T>) -> Self::Output {
let a: &Vec<T> = self.pop();
let b: &Vec<T> = other.pop();
match a.len() == b.len() {
true => {
let mut retvec: Vec<<T as std::ops::Add>::Output> = Vec::new();
for i in 0..a.len() {
retvec.push(a[i].clone() + b[i].clone());
}
Mat(retvec)
}
false => Mat(Vec::new()),
}
}
}
impl<'a, T> Add for &'a Mat<T>
where
T: PartialEq + PartialOrd + Add<T> + Clone,
{
type Output = Mat<<T as std::ops::Add>::Output>;
fn add(self, other: &Mat<T>) -> Self::Output {
let a: &Vec<T> = self.pop();
let b: &Vec<T> = other.pop();
match a.len() == b.len() {
true => {
let mut retvec: Vec<<T as std::ops::Add>::Output> = Vec::new();
for i in 0..a.len() {
retvec.push(a[i].clone() + b[i].clone());
}
Mat(retvec)
}
false => Mat(Vec::new()),
}
}
}
#[test]
fn add_override_vectors() {
let vec: Mat<Mat<i32>> = Mat(vec![Mat(vec![2, 2, 2]), Mat(vec![3, 3, 3])]);
let newvec = &vec + &vec;
assert_eq!(*newvec.pop(), vec![Mat(vec![4, 4, 4]), Mat(vec![6, 6, 6])]);
}
#[test]
fn add_wrong_vectors() {
let vec1: Mat<Mat<i32>> = Mat(vec![Mat(vec![2, 2, 2]), Mat(vec![4, 4, 4])]);
let vec2: Mat<Mat<i32>> = Mat(vec![Mat(vec![3, 3, 3]), Mat(vec![3, 3])]);
let newvec = &vec1 + &vec2;
assert_eq!(*newvec.pop(), vec![Mat(vec![5, 5, 5]), Mat(vec![])]);
}
fn main() {
let vec: Mat<Mat<i32>> = Mat(vec![Mat(vec![1, 2, 2]), Mat(vec![3, 3, 3])]);
let newvec = &vec + &vec;
println!("Hello, world!: {:?}", newvec);
}
PS: Your Mat<T> type is not a matrix in the classical sense; perhaps another name would be more appropriate to avoid confusion.
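As an alternative sketch of the "one impl suffices" idea, the bound can be placed on references to the element type, which removes the need for Clone and lets the same impl apply to nested Mat values. This is not the code from the answer above: it panics on a length mismatch, which is an assumption made here for brevity, whereas the answer returns an empty Mat.

```rust
use std::ops::Add;

#[derive(Clone, Debug, PartialEq)]
struct Mat<T>(Vec<T>);

// One impl covers Mat<i32>, Mat<Mat<i32>>, and deeper nesting, because
// `&Mat<T>: Add<&Mat<T>>` holds whenever `&T: Add<&T>` holds.
impl<'a, T> Add for &'a Mat<T>
where
    &'a T: Add<&'a T>,
{
    type Output = Mat<<&'a T as Add<&'a T>>::Output>;

    fn add(self, other: &'a Mat<T>) -> Self::Output {
        // Assumption for this sketch: mismatched lengths are a programming error.
        assert_eq!(self.0.len(), other.0.len(), "length mismatch");
        Mat(self.0.iter().zip(other.0.iter()).map(|(a, b)| a + b).collect())
    }
}
```

With this impl, adding two `&Mat<Mat<i32>>` values resolves element-wise to the same impl at `T = i32`, and finally to the standard library's reference addition for i32.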

What is the idiomatic way to implement caching on a function that is not a struct method?

I have an expensive function like this:
pub fn get_expensive_value(n: u64) -> u64 {
let mut ret = 0;
for _ in 0..n {
// expensive stuff
}
ret
}
And it gets called very frequently with the same argument. It's pure, so that means it will return the same result and can make use of a cache.
If this was a struct method, I would add a member to the struct that acts as a cache, but it isn't. So my option seems to be to use a static:
static mut LAST_VAL: Option<(u64, u64)> = None;
pub fn cached_expensive(n: u64) -> u64 {
unsafe {
LAST_VAL = LAST_VAL.and_then(|(k, v)| {
if k == n {
Some((n,v))
} else {
None
}
}).or_else(|| {
Some((n, get_expensive_value(n)))
});
let (_, v) = LAST_VAL.unwrap();
v
}
}
Now, I've had to use unsafe. Instead of the static mut, I could put a RefCell in a const. But I'm not convinced that is any safer - it just avoids having to use the unsafe block. I thought about a Mutex, but I don't think that will get me thread safety either.
Redesigning the code to use a struct for storage is not really an option.
I think the best alternative is to use a global variable with a mutex. Using lazy_static makes it easy and allows the "global" declaration inside the function:
pub fn cached_expensive(n: u64) -> u64 {
use std::sync::Mutex;
lazy_static! {
static ref LAST_VAL: Mutex<Option<(u64, u64)>> = Mutex::new(None);
}
let mut last = LAST_VAL.lock().unwrap();
let r = last.and_then(|(k, v)| {
if k == n {
Some((n, v))
} else {
None
}
}).or_else(|| Some((n, get_expensive_value(n))));
let (_, v) = r.unwrap();
*last = r;
v
}
You can also check out the cached project / crate. It memoizes the function with a simple macro.
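On newer toolchains the same single-entry cache can be sketched without any external crate, because Mutex::new has been a const fn since Rust 1.63 and can therefore initialize a static directly. In the sketch below, get_expensive_value is a stand-in for the question's function, and the CALLS counter is a hypothetical addition that only exists to show when the expensive path runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Counts how often the expensive function really runs (demonstration only).
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for the question's expensive, pure function.
pub fn get_expensive_value(n: u64) -> u64 {
    CALLS.fetch_add(1, Ordering::SeqCst);
    (0..n).sum()
}

pub fn cached_expensive(n: u64) -> u64 {
    // A function-local static; no lazy initialization is needed for this type.
    static LAST_VAL: Mutex<Option<(u64, u64)>> = Mutex::new(None);

    let mut last = LAST_VAL.lock().unwrap();
    let cached = *last; // Option<(u64, u64)> is Copy
    match cached {
        Some((k, v)) if k == n => v,
        _ => {
            let v = get_expensive_value(n);
            *last = Some((n, v));
            v
        }
    }
}
```

The lock is held while the value is computed, so concurrent callers are serialized; that keeps the sketch simple but may not suit workloads where different arguments arrive in parallel.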
