I have this code that does a brute-force search to find a match for a string:
fn main() {
let strings: Vec<String> = ["a", "b", "c","d","e","f","g","h","i","j","K","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"].map(String::from).to_vec();
for i in strings.iter(){
for j in strings.iter(){
for k in strings.iter(){
for l in strings.iter(){
let result = format!("{i}{j}{k}{l}");
println!("{}",result);
if result == "Kaio"{
println!("Found it!!");
return;
}
}
}
}
}
}
Is there a better way to be doing this? Can I do it dynamically? In this example I use four loops, assuming that the final string has a length of four. But what if there's a dynamically-sized string that I don't know the size of?
The itertools crate provides the iproduct! macro, which builds the Cartesian product of a fixed number of iterators:
use itertools::iproduct;
fn main() {
let pool: Vec<String> = "abcdefghijklmnopqrstuvwxyz".chars().map(String::from).collect();
let possibilities: Vec<String> = iproduct!(&pool, &pool, &pool, &pool) // for four-letters
.map(|(a, b, c, d)| format!("{}{}{}{}", a, b, c, d))
.collect();
for password in possibilities {
if password == "eggs" {
println!("We found it!");
}
}
}
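Neither snippet above covers a target whose length is not known in advance. One possible sketch, using only the standard library, keeps one alphabet index per position and advances them like an odometer, trying each length up to a limit. The names `brute_force`, `advance` and `max_len` are illustrative and do not come from any crate.

```rust
/// Advances `idx` to the next combination in base `base`.
/// Returns false once every combination of this length has been produced.
fn advance(idx: &mut [usize], base: usize) -> bool {
    for slot in idx.iter_mut().rev() {
        *slot += 1;
        if *slot < base {
            return true;
        }
        *slot = 0; // carry into the next position to the left
    }
    false
}

/// Tries every string over `alphabet` of length 1..=max_len, shortest first,
/// and returns the first candidate for which `is_match` returns true.
fn brute_force<F>(alphabet: &[char], max_len: usize, mut is_match: F) -> Option<String>
where
    F: FnMut(&str) -> bool,
{
    if alphabet.is_empty() {
        return None;
    }
    for len in 1..=max_len {
        // All zeros is the first candidate of this length.
        let mut idx = vec![0usize; len];
        loop {
            let candidate: String = idx.iter().map(|&i| alphabet[i]).collect();
            if is_match(&candidate) {
                return Some(candidate);
            }
            if !advance(&mut idx, alphabet.len()) {
                break; // this length is exhausted, try the next one
            }
        }
    }
    None
}
```

Calling `brute_force(&alphabet, 4, |s| s == "eggs")` with a lowercase alphabet visits candidates lazily, so nothing the size of the full `possibilities` vector is ever held in memory.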
I have a set of data that alternates between A and B. These are all valid choices:
A -> B -> A
A -> B -> A -> B
B -> A -> B
B -> A -> B -> A
I want to leverage the type system to make sure the alternating property is checked at compile time while maintaining good performance.
Solution 1: linked list
struct A {
// data
next: Option<B>,
}
struct B {
// data
next: Option<Box<A>>,
}
The problem is that the performance of this data structure will be poor at best. Linked lists have frequent cache misses, and for iterating the data structure this is quite bad.
Solution 2: Vec + enum
enum Types {
A(DataA),
B(DataB),
}
type Data = Vec<Types>;
With this solution, cache locality is much better, so yay for performance. However, this does not prevent putting 2 As side-by-side. It also requires checking the variant on every iteration, even though the alternation rule already tells us what the next element must be.
Solution 3: Combination
struct A {
// data, default in first link = empty
b: Option<B>,
}
struct B {
// data
}
type Data = Vec<A>;
This combines the cache locality of the Vec with the type verification of the linked list. It is quite ugly, and one needs to check the first value to verify if it really is an A, or an empty container for the next B.
The question
Is there a data structure that allows compile-time type verification, while maintaining cache locality and avoiding extra allocation?
To store alternating types in a way that the type system enforces and has reasonable efficiency, you can use a tuple: Vec<(X, Y)>.
Your situation also requires
Storing an extra leading value in an Option to handle starting with Y
Storing an extra trailing value in an Option to handle ending with X
use either::Either; // 1.5.2
use std::iter;
#[derive(Debug, Default)]
struct Data<X, Y> {
head: Option<Y>,
pairs: Vec<(X, Y)>,
tail: Option<X>,
}
impl<X, Y> Data<X, Y> {
fn iter(&self) -> impl Iterator<Item = Either<&X, &Y>> {
let head = self.head.iter().map(Either::Right);
let pairs = self.pairs.iter().flat_map(|(a, b)| {
let a = iter::once(Either::Left(a));
let b = iter::once(Either::Right(b));
a.chain(b)
});
let tail = self.tail.iter().map(Either::Left);
head.chain(pairs).chain(tail)
}
}
That being said, you are going to have ergonomic issues somewhere. For example, you can't just push an Either<X, Y> because the previously pushed value might be of the same type. Creating the entire structure at once might be the simplest direction:
#[derive(Debug)]
struct A;
#[derive(Debug)]
struct B;
fn main() {
let data = Data {
head: Some(B),
pairs: vec![(A, B)],
tail: None,
};
println!("{:?}", data.iter().collect::<Vec<_>>());
}
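If incremental construction is needed anyway, the alternation check has to move to runtime. The following is only a sketch of that idea: `Item` is a local stand-in for `either::Either` so that the example has no dependencies, and `try_push` is a hypothetical method that is not part of the answer's code.

```rust
/// Stand-in for `either::Either`, so the sketch needs no external crate.
#[derive(Debug, PartialEq)]
enum Item<X, Y> {
    Left(X),
    Right(Y),
}

#[derive(Debug, Default)]
struct Data<X, Y> {
    head: Option<Y>,
    pairs: Vec<(X, Y)>,
    tail: Option<X>,
}

impl<X, Y> Data<X, Y> {
    /// Appends `item` if it keeps the sequence alternating;
    /// otherwise hands the rejected item back in `Err`.
    fn try_push(&mut self, item: Item<X, Y>) -> Result<(), Item<X, Y>> {
        match (self.tail.take(), item) {
            // An unpaired X is waiting, so a Y completes the pair.
            (Some(x), Item::Right(y)) => {
                self.pairs.push((x, y));
                Ok(())
            }
            // X after X: restore the tail and reject the new item.
            (Some(x), rejected @ Item::Left(_)) => {
                self.tail = Some(x);
                Err(rejected)
            }
            // The sequence is empty or ends in a Y, so an X may follow.
            (None, Item::Left(x)) => {
                self.tail = Some(x);
                Ok(())
            }
            // A Y is only accepted here as the very first element.
            (None, Item::Right(y)) => {
                if self.head.is_none() && self.pairs.is_empty() {
                    self.head = Some(y);
                    Ok(())
                } else {
                    Err(Item::Right(y))
                }
            }
        }
    }
}
```

The trade-off is that a misordered push is reported as an `Err` at runtime instead of being rejected by the compiler.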
Is there a data structure that allows compile-time type verification, while maintaining cache locality and avoiding extra allocation?
You can use Rust's type system to enforce that items of each type are added in alternating order. The general strategy is to capture the type of the first item and also the previous item in the type of the whole structure and make different methods available according to the "current" type. When the previous item was an X, only the methods for adding a Y will be available, and vice versa.
I'm using two Vecs rather than a Vec of tuples. Depending on your data types, this could result in better memory adjacency, but that really depends on how you end up iterating.
use std::marker::PhantomData;
use std::fmt;
struct Left;
struct Right;
struct Empty;
struct AlternatingVec<L, R, P = Empty, S = Empty> {
lefts: Vec<L>,
rights: Vec<R>,
prev: PhantomData<P>,
start: PhantomData<S>,
}
impl<L, R> AlternatingVec<L, R, Empty, Empty> {
pub fn new() -> Self {
AlternatingVec {
lefts: Vec::new(),
rights: Vec::new(),
prev: PhantomData,
start: PhantomData,
}
}
}
The types Left, Right and Empty are for "tagging" if the previous and start values correspond to values in the left or right collection, or if that collection is empty. Initially both collections are empty, so both P (the previously added value) and S (the start value) are Empty.
Next, a utility method for changing the types. It doesn't look like it does much, but we'll use it in combination with type inference to produce copies of the data structure, but with the phantom types changed.
impl<L, R, P, S> AlternatingVec<L, R, P, S> {
fn change_type<P2, S2>(self) -> AlternatingVec<L, R, P2, S2> {
AlternatingVec {
lefts: self.lefts,
rights: self.rights,
prev: PhantomData,
start: PhantomData,
}
}
}
In practice, the compiler is smart enough that this method does nothing at runtime.
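Part of the reason is that PhantomData is a zero-sized type, so the tag parameters contribute no data to the struct. A small check of that claim, re-declaring the struct so the snippet stands alone (`sizes` is a hypothetical helper added for illustration):

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Left;
struct Right;
struct Empty;

struct AlternatingVec<L, R, P = Empty, S = Empty> {
    lefts: Vec<L>,
    rights: Vec<R>,
    prev: PhantomData<P>,
    start: PhantomData<S>,
}

/// Returns (size with the default `Empty` tags,
///          size with `Left`/`Right` tags,
///          size of a single phantom tag).
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<AlternatingVec<bool, i32>>(),
        size_of::<AlternatingVec<bool, i32, Left, Right>>(),
        size_of::<PhantomData<Left>>(),
    )
}
```

On the toolchains I would expect this to run on, the first two values are equal and the third is zero, so changing the tags only changes the type the compiler sees.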
These two traits define operations on the left and right collections respectively:
trait LeftMethods<L, R, S> {
fn push_left(self, val: L) -> AlternatingVec<L, R, Left, S>;
}
trait RightMethods<L, R, S> {
fn push_right(self, val: R) -> AlternatingVec<L, R, Right, S>;
}
We will implement those only for the states in which they should be callable: RightMethods should only be available if the previous item was a "left" or if there are no items added so far. LeftMethods should be implemented if the previous item was a "right" or if there are no items added so far.
impl<L, R> LeftMethods<L, R, Left> for AlternatingVec<L, R, Empty, Empty> {
fn push_left(mut self, val: L) -> AlternatingVec<L, R, Left, Left> {
self.lefts.push(val);
self.change_type()
}
}
impl<L, R, S> LeftMethods<L, R, S> for AlternatingVec<L, R, Right, S> {
fn push_left(mut self, val: L) -> AlternatingVec<L, R, Left, S> {
self.lefts.push(val);
self.change_type()
}
}
impl<L, R> RightMethods<L, R, Right> for AlternatingVec<L, R, Empty, Empty> {
fn push_right(mut self, val: R) -> AlternatingVec<L, R, Right, Right> {
self.rights.push(val);
self.change_type()
}
}
impl<L, R, S> RightMethods<L, R, S> for AlternatingVec<L, R, Left, S> {
fn push_right(mut self, val: R) -> AlternatingVec<L, R, Right, S> {
self.rights.push(val);
self.change_type()
}
}
These methods don't do much except call push on the correct inner Vec, and then use change_type to make the type reflect the signature.
The compiler forces you to call push_left and push_right alternately:
fn main() {
let v = AlternatingVec::new()
.push_left(true)
.push_right(7)
.push_left(false)
.push_right(0)
.push_left(false);
}
This complex structure leads to a lot more work in general. For example, Debug is fiddly to implement in a nice way. I made a version with a Debug impl, but it's getting a bit too long for Stack Overflow. You can see it here:
https://gist.github.com/peterjoel/2ffe8b7f5ad7c649f61c580ac7dabc67
It's quite common to compare data with precedence, for a struct which has multiple members which can be compared, or for a sort_by callback.
// Example of sorting a: Vec<[f64; 2]>, sort first by y, then x,
xy_coords.sort_by(
|co_a, co_b| {
let ord = co_a[1].cmp(&co_b[1]);
if ord != std::cmp::Ordering::Equal {
ord
} else {
co_a[0].cmp(&co_b[0])
}
}
);
Is there a more straightforward way to perform multiple cmp functions, where only the first non-equal result is returned?
perform multiple cmp functions, where only the first non-equal result is returned
That's basically how Ord is defined for tuples. Create a function that converts your type into a tuple and compare those:
fn main() {
let mut xy_coords = vec![[1, 0], [-1, -1], [0, 1]];
fn sort_key(coord: &[i32; 2]) -> (i32, i32) {
(coord[1], coord[0])
}
xy_coords.sort_by(|a, b| {
sort_key(a).cmp(&sort_key(b))
});
}
Since that's common, there's a method just for it:
xy_coords.sort_by_key(sort_key);
That won't help in your case, though, because floating-point types implement only PartialOrd, not Ord.
One of many possibilities is to kill the program on NaN:
xy_coords.sort_by(|a, b| {
sort_key(a).partial_cmp(&sort_key(b)).expect("Don't know how to handle NaN")
});
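Another possibility, assuming a toolchain of Rust 1.62 or later, is `f64::total_cmp`, which gives every float (NaN included) a fixed position, combined with `Ordering::then_with` to return the first non-equal result. The helper name `cmp_yx` is made up for this sketch.

```rust
use std::cmp::Ordering;

/// Orders coordinates by y, then by x, using the IEEE 754 total order,
/// so NaN values are given a fixed position instead of causing a panic.
fn cmp_yx(a: &[f64; 2], b: &[f64; 2]) -> Ordering {
    a[1].total_cmp(&b[1])
        // Only consulted when the y values compare as equal.
        .then_with(|| a[0].total_cmp(&b[0]))
}
```

It can be passed straight to the sort, as in `xy_coords.sort_by(cmp_yx)`. Note that the total order treats -0.0 as smaller than +0.0, which differs from the `==` comparison on floats.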
See also
Using max_by_key on a vector of floats
How to do a binary search on a Vec of floats?
There are times when you may not want to create a large tuple to compare values which will be ignored because higher priority values will early-exit the comparison.
Stealing a page from Guava's ComparisonChain, we can make a small builder that allows us to use closures to avoid extra work:
use std::cmp::Ordering;
struct OrdBuilder<T> {
a: T,
b: T,
ordering: Ordering,
}
impl<T> OrdBuilder<T> {
fn new(a: T, b: T) -> OrdBuilder<T> {
OrdBuilder {
a: a,
b: b,
ordering: Ordering::Equal,
}
}
fn compare_with<F, V>(mut self, mut f: F) -> OrdBuilder<T>
where F: for <'a> FnMut(&'a T) -> V,
V: Ord,
{
if self.ordering == Ordering::Equal {
self.ordering = f(&self.a).cmp(&f(&self.b));
}
self
}
fn finish(self) -> Ordering {
self.ordering
}
}
This can be used like
struct Thing {
a: u8,
}
impl Thing {
fn b(&self) -> u8 {
println!("I'm slow!");
42
}
}
fn main() {
let a = Thing { a: 0 };
let b = Thing { a: 1 };
let res = OrdBuilder::new(&a, &b)
.compare_with(|x| x.a)
.compare_with(|x| x.b())
.finish();
println!("{:?}", res);
}
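The standard library's `Ordering::then_with` offers similar lazy chaining without a builder type. In this sketch, `slow_key` and the `slow_calls` counter are hypothetical; they exist only to show that the second comparison is skipped when the first one already decides the result.

```rust
use std::cell::Cell;
use std::cmp::Ordering;

struct Thing {
    a: u8,
    /// Counts how often the expensive key was computed.
    slow_calls: Cell<u32>,
}

impl Thing {
    fn new(a: u8) -> Self {
        Thing { a, slow_calls: Cell::new(0) }
    }

    /// Hypothetical expensive accessor.
    fn slow_key(&self) -> u8 {
        self.slow_calls.set(self.slow_calls.get() + 1);
        42
    }
}

fn compare(x: &Thing, y: &Thing) -> Ordering {
    x.a.cmp(&y.a)
        // The closure only runs when the first comparison is `Equal`.
        .then_with(|| x.slow_key().cmp(&y.slow_key()))
}
```

Comparing two values with different `a` fields leaves both counters at zero; the slow key is computed only to break ties.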
I would like to walk through a Vec and combine some elements of it. How do I do that in idiomatic Rust?
Example:
#[derive(PartialEq, Debug)]
enum Thing { A, B, AandB }
fn combine(v: Vec<Thing>) -> Vec<Thing> {
// idiomatic code here
}
fn main() {
let v = vec![Thing::A, Thing::B];
assert_eq!(vec![Thing::AandB], combine(v));
}
How I would do it:
Traverse the Vec with Iterator::scan and replace all occurrences of Thing::B with Thing::AandB if Thing::A was the element before. Then I would traverse it again and remove all Thing::As before Thing::AandB.
This seems super complicated and inelegant.
I merged swizard's answer and Shepmaster's answer and ended up with an in-place solution that runs through the vector recursively, has only the vector as a mutable and never moves anything twice. No guarantees on runtime or idiomaticity ;)
use Thing::*;
use std::cmp::min;
#[derive(Copy,Clone,PartialEq,Debug)]
enum Thing { A, B, AandB}
fn combine(mut v: Vec<Thing>) -> Vec<Thing> {
fn inner(res: &mut Vec<Thing>, i: usize, backshift: usize) {
match &res[i..min(i+2, res.len())] {
[A, B] => {
res[i - backshift] = AandB;
inner(res, i + 2, backshift + 1);
},
[a, ..] => {
res[i - backshift] = *a;
inner(res, i + 1, backshift);
},
[] => res.truncate(i - backshift),
}
}
inner(&mut v, 0, 0);
v
}
fn main() {
let v = vec![A, A, B, AandB, B, A, B, A, B];
assert_eq!(vec![A, AandB, AandB, B, AandB, AandB], combine(v));
let v = vec![A, A, B, AandB, B, A, B, A, A];
assert_eq!(vec![A, AandB, AandB, B, AandB, A, A], combine(v));
}
Not sure if this counts as idiomatic, but the itertools library has the batching() function for all iterators. Combined with peek() from the standard library, you get your result in one iteration instead of two.
extern crate itertools;
use itertools::Itertools;
use Thing::*;
#[derive(PartialEq, Debug)]
enum Thing { A, B, AandB }
fn combine(v: Vec<Thing>) -> Vec<Thing> {
v.into_iter().peekable().batching(|it| {
match it.next() {
Some(A) => {
if Some(&B) == it.peek() {
it.next();
Some(AandB)
} else {
Some(A)
}
}
x => x,
}
}).collect()
}
fn main() {
let v = vec![A, B, A, A, A, B, B, A];
assert_eq!(
vec![AandB, A, A, AandB, B, A],
combine(v)
);
}
Note that collect() will allocate a new buffer.
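The same single-pass idea can be written without itertools, assuming Rust 1.51 or later for `Peekable::next_if_eq`. This is a sketch, not a claim about what is most idiomatic:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Thing { A, B, AandB }

/// Merges every `A` that is immediately followed by a `B` into one `AandB`.
fn combine(v: Vec<Thing>) -> Vec<Thing> {
    let mut out = Vec::with_capacity(v.len());
    let mut it = v.into_iter().peekable();
    while let Some(item) = it.next() {
        // `next_if_eq` consumes the following element only when it is a `B`.
        if item == Thing::A && it.next_if_eq(&Thing::B).is_some() {
            out.push(Thing::AandB);
        } else {
            out.push(item);
        }
    }
    out
}
```

Like the version above, it builds a new output vector instead of editing the input in place.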
Here's a solution that uses recursion and pattern-matching. I'm pretty sure the recursion is tail-recursion, and so could be turned into iteration.
use Thing::*;
#[derive(Copy,Clone,PartialEq,Debug)]
enum Thing { A, B, AandB }
fn combine(v: Vec<Thing>) -> Vec<Thing> {
fn inner(mut res: Vec<Thing>, s: &[Thing]) -> Vec<Thing> {
match s {
[A, B, tail @ ..] => {
res.push(AandB);
inner(res, tail)
},
[a, tail @ ..] => {
res.push(*a);
inner(res, tail)
},
[] => res,
}
}
inner(Vec::new(), &v)
}
fn main() {
let v = vec![A, A, B, AandB, B, A];
assert_eq!(vec![A, AandB, AandB, B, A], combine(v));
let v = vec![A, A, B, AandB, B, A, B, A, B];
assert_eq!(vec![A, AandB, AandB, B, AandB, AandB], combine(v));
let v = vec![A, A, B, AandB, B, A, B, A, A];
assert_eq!(vec![A, AandB, AandB, B, AandB, A, A], combine(v));
}
I suspect there is no easy way to do that with iterators, but nothing stops us from using a plain old C-style loop:
#[derive(PartialEq, Debug, Clone, Copy)]
enum Thing { A, B, AandB }
fn combine(mut v: Vec<Thing>) -> Vec<Thing> {
let mut prev: Option<Thing> = None;
let mut end = 0;
for i in 0 .. v.len() {
let el = v[i];
match (el, prev) {
(Thing::B, Some(Thing::A)) => {
end = end - 1;
v[end] = Thing::AandB
},
_ =>
v[end] = el
};
prev = Some(el);
end = end + 1;
}
v.truncate(end);
v
}
fn main() {
let v = vec![Thing::A, Thing::A, Thing::B, Thing::AandB, Thing::B, Thing::A];
assert_eq!(vec![Thing::A, Thing::AandB, Thing::AandB, Thing::B, Thing::A], combine(v));
}
Here is one pass with direct transformations.
Okay, here is an idiomatic version then without explicit for-loops and recursion :)
#[derive(PartialEq, Debug, Clone, Copy)]
enum Thing { A, B, AandB }
fn combine(mut v: Vec<Thing>) -> Vec<Thing> {
let (_, total) = (0 .. v.len()).fold((None, 0usize), |(prev, end), i| {
let el = v[i];
let (next, item) = match (el, prev) {
(Thing::B, Some(Thing::A)) => (end, Thing::AandB),
_ => (end + 1, el),
};
v[next - 1] = item;
(Some(el), next)
});
v.truncate(total);
v
}
fn main() {
let v = vec![Thing::A, Thing::A, Thing::B, Thing::AandB, Thing::B, Thing::A];
assert_eq!(vec![Thing::A, Thing::AandB, Thing::AandB, Thing::B, Thing::A], combine(v));
}
A singly linked list is easy to build from the tail, but I can't build one from the head. I have tried many times; my code is here: https://gist.github.com/tioover/8d7585105c06e01678a8.
What I actually want is to search for a node in a linked list and then delete it, but I can't traverse the list with a mutable borrowed pointer: https://gist.github.com/tioover/526715ed05342ef5b4f1. I have tried that many times too.
Here's some code from an answer to a similar question. It shows a way of having a list that you can add to the beginning and the end, and the middle for good measure.
#[derive(Debug)]
struct Node<T> {
v: T,
next: Option<Box<Node<T>>>,
}
impl<T> Node<T> {
fn new(v: T) -> Node<T> { Node { v: v, next: None } }
fn push_front(self, head: T) -> Node<T> {
Node {
v: head,
next: Some(Box::new(self)),
}
}
fn push_back(&mut self, tail: T) {
match self.next {
Some(ref mut next) => next.push_back(tail),
None => self.next = Some(Box::new(Node::new(tail))),
}
}
fn push_after(&mut self, v: T) {
let old_next = self.next.take();
let new_next = Node {
v: v,
next: old_next,
};
self.next = Some(Box::new(new_next));
}
}
fn main() {
let mut n = Node::new(2u8);
n.push_back(3u8);
let mut n = n.push_front(0u8);
n.push_after(1u8);
println!("{:?}", n);
}
The important thing is that when we add to the head, we consume the old head by taking it as self. That allows us to move it into a Box which will be the follower of the new head. Removing an item is a straightforward extension of this example, but you'll need to look forward a bit and handle more edge cases (like what to do if there isn't a second successor).
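For the search-then-delete case, one way to sidestep most of those edge cases is to walk a mutable reference to the link itself, not to the node. This is a sketch under one assumption that differs from the code above: the list is held as an `Option<Box<Node<T>>>`, so that the first node can be removed like any other. `remove_first` is a hypothetical name.

```rust
#[derive(Debug)]
struct Node<T> {
    v: T,
    next: Option<Box<Node<T>>>,
}

/// Removes the first node whose value equals `target` from the list starting
/// at `head`, returning the removed value, or `None` if nothing matched.
fn remove_first<T: PartialEq>(head: &mut Option<Box<Node<T>>>, target: &T) -> Option<T> {
    // `cursor` points at a link (the head or some node's `next` field).
    let mut cursor = head;
    // Step forward while the link leads to a node that does not match.
    while cursor.as_ref().map_or(false, |node| node.v != *target) {
        cursor = &mut cursor.as_mut().unwrap().next;
    }
    // Either the link is empty (not found) or it leads to the match.
    let node = *cursor.take()?;
    // Splice the remainder of the list into the link we just emptied.
    *cursor = node.next;
    Some(node.v)
}
```

Because the cursor always refers to a link, unlinking is a `take` followed by one assignment, whether the match is at the front, in the middle, or at the end.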