How to walk a mutually recursive graph in safe Rust?

The question "How do I express mutually recursive structures in Rust?" explains how to represent graph-like structures, but it does not explain how to walk the graph (only how to add more children). The Rc<RefCell<T>> solution I tried to adapt did not compile.
I am looking for a way to design and walk a graph-like structure in safe Rust, utilizing Rc<T> and/or RefCell<T> for interior mutability. My current Node does not compile because of &mut T aliasing rules:
struct Node {
    parent: Option<&mut Node>, // a mutable reference to the Node's owner
    children: Vec<Node>,       // each Node owns its children
}

impl Node {
    fn add_child(&mut self, x: Node) {
        self.children.push(x);
    }
    fn get_child(&mut self, i: usize) -> &mut Node {
        &mut self.children[i]
    }
    fn get_parent(&mut self) -> &mut Node {
        self.parent.expect("No parent!")
    }
}
Example functionality:
let mut top_node = Node::new(None);
let mut ptr = &mut top_node;
ptr.add_child(Node::new(&mut ptr)); // add a child to top_node
ptr = ptr.get_child(0); // walk down to top_node's 0th child.
ptr = ptr.get_parent(); // walk back up to top_node
I repeatedly rewrote this implementation replacing &mut T with combinations of Rc, Weak, RefCell, and RefMut to no avail. I don't understand enough about the underlying memory management.
Could someone with more experience using interior mutability explain how to correctly design and walk this graph?

The key was to use a hybrid Arena/Cell solution. The parents and children are now just references to Nodes, but mutability is enabled using Cell and RefCell:
struct Node<'a> {
    arena: &'a Arena<Node<'a>>,
    parent: Cell<Option<&'a Node<'a>>>,
    children: RefCell<Vec<&'a Node<'a>>>,
}

impl<'a> Node<'a> {
    fn add_child(&'a self) {
        let child = new_node(self.arena);
        child.parent.set(Some(self));
        self.children.borrow_mut().push(child);
    }
    fn get_child(&'a self, i: usize) -> &'a Node<'a> {
        self.children.borrow()[i]
    }
    fn get_parent(&'a self) -> &'a Node<'a> {
        self.parent.get().expect("No Parent!")
    }
}
Now the Arena owns every node, instead of parents owning their children (this is only an implementation detail and does not hinder any functionality). As a result, there is no need to pass the parent to add_child:
let arena: Arena<Node> = Arena::new();
let top_node = new_node(&arena);
top_node.add_child();
let mut ptr = top_node.get_child(0);
ptr.add_child();
ptr = ptr.get_child(0);
ptr = ptr.get_parent();
The solution uses the following helper function to create nodes, so that ownership lies with the Arena from the start:
fn new_node<'a>(arena: &'a Arena<Node<'a>>) -> &'a mut Node<'a> {
    arena.alloc(Node {
        arena: arena,
        parent: Cell::new(None),
        children: RefCell::new(vec![]),
    })
}
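For reference, the pieces fit together as sketched below. This assumes the Arena comes from the typed-arena crate (the answer doesn't name the crate, but arena.alloc with this signature matches typed_arena::Arena); the Node and new_node definitions are the ones shown above.
use std::cell::{Cell, RefCell}; // used by the Node definition above
use typed_arena::Arena;         // assumed dependency: the `typed-arena` crate

// `Node<'a>` and `new_node` as defined above.

fn main() {
    let arena: Arena<Node> = Arena::new();
    let top_node = new_node(&arena);

    top_node.add_child();                    // give top_node one child
    let child = top_node.get_child(0);       // walk down
    let back = child.get_parent();           // and back up again
    assert!(std::ptr::eq(back, &*top_node)); // we are back where we started
}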

Related

Is there any way to not link the lifetime of the iterator to the struct?

I am trying to implement a filter function which receives an iterator over a vector and returns a filtered iterator. Is there any way to avoid linking the lifetime of the returned iterator to the struct? I can make it work by tying the iterator's lifetime to the struct, but that is not what I intend to do.
Here is a simplified code:
struct Structure {
    low: i32,
}

impl Structure {
    pub fn find_low<'a>(
        &mut self,
        packets: impl Iterator<Item = &'a i32>,
    ) -> impl Iterator<Item = &'a i32> {
        packets.filter(|packet| **packet < self.low)
    }

    pub fn new() -> Self {
        Structure { low: 10 }
    }
}

fn main() {
    let strct = Structure::new();
    let vec = [1, 2, 3, 11, 12, 13];
    let mut it = strct.find_low(vec.iter());
    assert_eq!(it.next().unwrap(), &vec[0]);
}
By doing so, I get an error
error[E0700]: hidden type for `impl Trait` captures lifetime that does not appear in bounds
--> src/main.rs:9:10
|
7 | &mut self,
| --------- hidden type `Filter<impl Iterator<Item = &'a i32>, [closure#src/main.rs:10:24: 10:52]>` captures the anonymous lifetime defined here
8 | packets: impl Iterator<Item = &'a i32>,
9 | ) -> impl Iterator<Item = &'a i32> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
help: to declare that the `impl Trait` captures `'_`, you can add an explicit `'_` lifetime bound
|
9 | ) -> impl Iterator<Item = &'a i32> + '_ {
| ++++
Playground
Yes, there is a way to avoid linking the lifetime of the returned iterator to that of the reference to self: move the data the filtering closure needs into the closure, so the closure doesn't borrow self at all. To accomplish this, copy self.low into a local variable and move that variable into the closure.
Once you fix this, you will uncover a second problem: your main() as written can't invoke strct.find_low(), because strct is not declared mut. But it shouldn't need to be: find_low needlessly takes self by mutable reference when an immutable reference will do. Replace &mut self with &self.
pub fn find_low<'a>(
    &self,
    packets: impl Iterator<Item = &'a i32>,
) -> impl Iterator<Item = &'a i32> {
    let low = self.low;
    packets.filter(move |packet| **packet < low)
}
(Playground)
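With those two changes in place, the main from the question works as-is apart from dropping the mut (a quick check; it relies on the Structure definition above):
fn main() {
    let strct = Structure::new(); // no longer needs to be `mut`
    let vec = [1, 2, 3, 11, 12, 13];
    let mut it = strct.find_low(vec.iter());
    assert_eq!(it.next().unwrap(), &vec[0]);
}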
An alternative to moving a copy of self.low into the closure is to restrict the lifetime of &self to include 'a as well, though this does link the lifetime of the returned iterator to that of Structure.
pub fn find_low<'a>(
    &'a self,
    packets: impl Iterator<Item = &'a i32>,
) -> impl Iterator<Item = &'a i32> {
    packets.filter(move |packet| **packet < self.low)
}
(Playground)

How can I ensure that a Rust vector only contains alternating types?

I have a set of data that alternates between A and B. These are all valid choices:
A -> B -> A
A -> B -> A -> B
B -> A -> B
B -> A -> B -> A
I want to leverage the type system to make sure the alternating property is checked at compile time while maintaining good performance.
Solution 1: linked list
struct A {
    // data
    next: Option<B>,
}

struct B {
    // data
    next: Option<Box<A>>,
}
The problem is that the performance of this data structure will be poor at best: linked lists cause frequent cache misses, which makes iterating over the structure quite slow.
Solution 2: Vec + enum
enum Types {
    A(DataA),
    B(DataB),
}

type Data = Vec<Types>;
With this solution, cache locality is much better, so yay for performance. However, it does not prevent putting two As side by side. There is also the fact that each iteration has to check which variant it is looking at, even though the alternation property means the answer is already known.
Solution 3: Combination
struct A {
    // data, default in first link = empty
    b: Option<B>,
}

struct B {
    // data
}

type Data = Vec<A>;
This combines the cache locality of the Vec with the type verification of the linked list. It is quite ugly, though, and one needs to check the first value to see whether it really is an A or just an empty container for the next B.
The question
Is there a data structure that allows compile-time type verification, while maintaining cache locality and avoiding extra allocation?
To store alternating types in a way that the type system enforces, with reasonable efficiency, you can store the elements as a Vec of pairs: Vec<(X, Y)>.
Your situation also requires
Storing an extra leading value in an Option to handle starting with Y
Storing an extra trailing value in an Option to handle ending with X
use either::Either; // 1.5.2
use std::iter;

#[derive(Debug, Default)]
struct Data<X, Y> {
    head: Option<Y>,
    pairs: Vec<(X, Y)>,
    tail: Option<X>,
}

impl<X, Y> Data<X, Y> {
    fn iter(&self) -> impl Iterator<Item = Either<&X, &Y>> {
        let head = self.head.iter().map(Either::Right);
        let pairs = self.pairs.iter().flat_map(|(a, b)| {
            let a = iter::once(Either::Left(a));
            let b = iter::once(Either::Right(b));
            a.chain(b)
        });
        let tail = self.tail.iter().map(Either::Left);

        head.chain(pairs).chain(tail)
    }
}
That being said, you are going to have ergonomic issues somewhere. For example, you can't just push an Either<X, Y> because the previously pushed value might be of the same type. Creating the entire structure at once might be the simplest direction:
#[derive(Debug)]
struct A;

#[derive(Debug)]
struct B;

fn main() {
    let data = Data {
        head: Some(B),
        pairs: vec![(A, B)],
        tail: None,
    };

    println!("{:?}", data.iter().collect::<Vec<_>>());
}
Is there a data structure that allows compile-time type verification, while maintaining cache locality and avoiding extra allocation?
You can use Rust's type system to enforce that items of each type are added in alternating order. The general strategy is to capture the type of the first item and also the previous item in the type of the whole structure and make different methods available according to the "current" type. When the previous item was an X, only the methods for adding a Y will be available, and vice versa.
I'm using two Vecs rather than a Vec of tuples. Depending on your data types, this could result in better memory adjacency, but that really depends on how you end up iterating.
use std::marker::PhantomData;
use std::fmt;

struct Left;
struct Right;
struct Empty;

struct AlternatingVec<L, R, P = Empty, S = Empty> {
    lefts: Vec<L>,
    rights: Vec<R>,
    prev: PhantomData<P>,
    start: PhantomData<S>,
}

impl<L, R> AlternatingVec<L, R, Empty, Empty> {
    pub fn new() -> Self {
        AlternatingVec {
            lefts: Vec::new(),
            rights: Vec::new(),
            prev: PhantomData,
            start: PhantomData,
        }
    }
}
The types Left, Right, and Empty are for "tagging" whether the previous and start values correspond to values in the left or right collection, or whether that collection is empty. Initially both collections are empty, so both P (the previously added value) and S (the start value) are Empty.
Next, a utility method for changing the types. It doesn't look like it does much, but we'll use it in combination with type inference to rebuild the data structure with its phantom types changed.
impl<L, R, P, S> AlternatingVec<L, R, P, S> {
    fn change_type<P2, S2>(self) -> AlternatingVec<L, R, P2, S2> {
        AlternatingVec {
            lefts: self.lefts,
            rights: self.rights,
            prev: PhantomData,
            start: PhantomData,
        }
    }
}
In practice, the compiler is smart enough that this method does nothing at runtime.
These two traits define operations on the left and right collections respectively:
trait LeftMethods<L, R, S> {
    fn push_left(self, val: L) -> AlternatingVec<L, R, Left, S>;
}

trait RightMethods<L, R, S> {
    fn push_right(self, val: R) -> AlternatingVec<L, R, Right, S>;
}
We will implement them only for the states in which we want them to be callable: RightMethods should only be available if the previous item was a "left" or if no items have been added so far, and LeftMethods should only be available if the previous item was a "right" or if no items have been added so far.
impl<L, R> LeftMethods<L, R, Left> for AlternatingVec<L, R, Empty, Empty> {
    fn push_left(mut self, val: L) -> AlternatingVec<L, R, Left, Left> {
        self.lefts.push(val);
        self.change_type()
    }
}

impl<L, R, S> LeftMethods<L, R, S> for AlternatingVec<L, R, Right, S> {
    fn push_left(mut self, val: L) -> AlternatingVec<L, R, Left, S> {
        self.lefts.push(val);
        self.change_type()
    }
}

impl<L, R> RightMethods<L, R, Right> for AlternatingVec<L, R, Empty, Empty> {
    fn push_right(mut self, val: R) -> AlternatingVec<L, R, Right, Right> {
        self.rights.push(val);
        self.change_type()
    }
}

impl<L, R, S> RightMethods<L, R, S> for AlternatingVec<L, R, Left, S> {
    fn push_right(mut self, val: R) -> AlternatingVec<L, R, Right, S> {
        self.rights.push(val);
        self.change_type()
    }
}
These methods don't do much except call push on the correct inner Vec, and then use change_type to make the type reflect the signature.
The compiler forces you to call push_left and push_right alternately:
fn main() {
    let v = AlternatingVec::new()
        .push_left(true)
        .push_right(7)
        .push_left(false)
        .push_right(0)
        .push_left(false);
}
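As a sketch of the enforcement in action: break the alternation and the call simply doesn't type-check, because the type produced by push_right only implements LeftMethods (the comment below paraphrases the failure; it is not copied compiler output).
fn main() {
    let v = AlternatingVec::new()
        .push_left(true)
        .push_right(7);

    // This would not compile: the value now has type
    // `AlternatingVec<bool, i32, Right, Left>`, which implements
    // `LeftMethods` but not `RightMethods`, so there is no
    // `push_right` method to call.
    // let v = v.push_right(8);

    let _ = v;
}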
This complex structure leads to a lot more work in general. For example, Debug is fiddly to implement in a nice way. I made a version with a Debug impl, but it's getting a bit too long for Stack Overflow. You can see it here:
https://gist.github.com/peterjoel/2ffe8b7f5ad7c649f61c580ac7dabc67

Why can't I mutably borrow a primitive from an enum?

I would like to be able to obtain references (both immutable and mutable) to the usize wrapped in Bar in the Foo enum:
use Foo::*;

#[derive(Debug, PartialEq, Clone)]
pub enum Foo {
    Bar(usize)
}

impl Foo {
    /* this works */
    fn get_bar_ref(&self) -> &usize {
        match *self {
            Bar(ref n) => &n
        }
    }

    /* this doesn't */
    fn get_bar_ref_mut(&mut self) -> &mut usize {
        match *self {
            Bar(ref mut n) => &mut n
        }
    }
}
But I can't obtain the mutable reference because:
n does not live long enough
I was able to provide both variants of similar functions accessing other contents of Foo that are Boxed. Why does the mutable borrow (and only the mutable one) fail with an unboxed primitive?
You need to replace Bar(ref mut n) => &mut n with Bar(ref mut n) => n.
When you use ref mut n in Bar(ref mut n), it creates a mutable
reference to the data in Bar, so the type of n is &mut usize.
Then you try to return &mut n, which has type &mut &mut usize. This is most likely the incorrect part: deref coercion kicks in and converts &mut n into &mut *n, creating a temporary value *n of type usize, which doesn't live long enough.
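For concreteness, the fixed method looks like this (only the match arm changes; the enum is the one from the question):
fn get_bar_ref_mut(&mut self) -> &mut usize {
    match *self {
        // `n` already has type `&mut usize`, so return it directly
        Bar(ref mut n) => n
    }
}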
These examples show the same problem:
fn implicit_reborrow<T>(x: &mut T) -> &mut T {
    x
}

fn explicit_reborrow<T>(x: &mut T) -> &mut T {
    &mut *x
}

fn implicit_reborrow_bad<T>(x: &mut T) -> &mut T {
    &mut x
}

fn explicit_reborrow_bad<T>(x: &mut T) -> &mut T {
    &mut **&mut x
}
The explicit_ versions show what the compiler deduces through deref coercions.
The _bad versions both error in the exact same way, while the other two compile.
This is either a bug, or a limitation in how lifetimes are currently implemented in the compiler. The invariance of &mut T over T might have something to do with it, because it results in &mut &'a mut T being invariant over 'a and thus more restrictive during inference than the shared reference (&&'a T) case, even though in this situation the strictness is unnecessary.

Speedup counter game

I'm trying to solve a Rust algorithm question on HackerRank. My answer times out on some of the larger test cases. About 5 people have completed it, so I believe it is possible, and I assume they compile in release mode. Are there any speed-ups I'm missing?
The gist of the game: a counter (inp in main) is conditionally reduced, and the winner is chosen based on who can't reduce it any more.
use std::io;

fn main() {
    let n: usize = read_one_line().trim().parse().unwrap();
    for _i in 0..n {
        let inp: u64 = read_one_line().trim().parse().unwrap();
        println!("{:?}", find_winner(inp));
    }
    return;
}

fn find_winner(mut n: u64) -> String {
    let mut win = 0;
    while n > 1 {
        if n.is_power_of_two() {
            n /= 2;
        } else {
            n -= n.next_power_of_two() / 2;
        }
        win += 1;
    }
    let winner = if win % 2 == 0 {
        String::from("Richard")
    } else {
        String::from("Louise")
    };
    winner
}

fn read_one_line() -> String {
    let mut input = String::new();
    io::stdin().read_line(&mut input).expect("Failed to read");
    input
}
Your inner loop can be replaced by a combination of built-in functions: each subtraction clears the highest set bit, which happens count_ones() - 1 times, and the remaining halvings walk the last set bit down to 1, which happens trailing_zeros() times:
let win = if n > 0 {
    n.count_ones() + n.trailing_zeros() - 1
} else {
    0
};
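If you want to convince yourself the closed form agrees with the loop, a brute-force comparison is easy to write (the helper names below are made up for this check; they are not part of the answer):
fn steps_by_loop(mut n: u64) -> u32 {
    // The original while-loop from the question, returning the step count.
    let mut win = 0;
    while n > 1 {
        if n.is_power_of_two() {
            n /= 2;
        } else {
            n -= n.next_power_of_two() / 2;
        }
        win += 1;
    }
    win
}

fn steps_closed_form(n: u64) -> u32 {
    if n > 0 {
        n.count_ones() + n.trailing_zeros() - 1
    } else {
        0
    }
}

fn main() {
    // Exhaustively compare the two formulations for small inputs.
    for n in 1..100_000u64 {
        assert_eq!(steps_by_loop(n), steps_closed_form(n));
    }
    println!("closed form matches the loop for 1..100_000");
}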
Also, instead of allocating a string every time find_winner is called,
a string slice may be returned:
fn find_winner(n: u64) -> &'static str {
    let win = if n > 0 {
        n.count_ones() + n.trailing_zeros() - 1
    } else {
        0
    };

    if win % 2 == 0 {
        "Richard"
    } else {
        "Louise"
    }
}
Avoiding memory allocation can help speed up the application. At the moment, the read_one_line function performs one memory allocation per call, which can be avoided if you supply the String as a &mut parameter:
fn read_one_line(input: &mut String) -> &str {
    input.clear(); // read_line appends, so drop whatever the previous call left behind
    io::stdin().read_line(input).expect("Failed to read");
    input
}
Note how I also alter the return type to return a slice (which borrows input): further uses here do not need to modify the original string.
Another improvement is I/O. io::stdin() hands you a handle to a shared, mutex-guarded buffer, so each call inside read_one_line re-acquires that lock. You can (and should) instead use one reader for the whole program, for example std::io::BufReader. Build it once, then pass it as an argument:
fn read_one_line<'a, R>(reader: &mut R, input: &'a mut String) -> &'a str
where
    R: io::BufRead,
{
    input.clear(); // read_line appends; start each call from an empty buffer
    reader.read_line(input).expect("Failed to read");
    input
}
Note:
it's easier to make it generic (R) than to specify the exact type of BufReader :)
annotating the lifetime is mandatory because the return type could borrow either parameter
Putting it altogether:
fn read_one_line<'a, R>(reader: &mut R, input: &'a mut String) -> &'a str
where
    R: io::BufRead,
{
    input.clear(); // read_line appends; start each call from an empty buffer
    reader.read_line(input).expect("Failed to read");
    input
}

fn main() {
    let mut reader = io::BufReader::new(io::stdin());
    let mut input = String::new();

    let n: usize = read_one_line(&mut reader, &mut input).trim().parse().unwrap();
    for _i in 0..n {
        let inp: u64 = read_one_line(&mut reader, &mut input).trim().parse().unwrap();
        println!("{:?}", find_winner(inp));
    }
}
The bigger win is probably the I/O change; it might even be sufficient on its own. Don't forget to also apply @John's advice, so that your main loop is completely allocation-free!
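For reference, here is a sketch of the whole program with both answers applied. This is my own assembly of the snippets above; I also switched the println! from {:?} to {} so the winner's name prints without surrounding quotes.
use std::io;

fn find_winner(n: u64) -> &'static str {
    // Closed-form step count instead of the original loop.
    let win = if n > 0 {
        n.count_ones() + n.trailing_zeros() - 1
    } else {
        0
    };
    if win % 2 == 0 { "Richard" } else { "Louise" }
}

fn read_one_line<'a, R>(reader: &mut R, input: &'a mut String) -> &'a str
where
    R: io::BufRead,
{
    input.clear(); // read_line appends; start from an empty buffer each time
    reader.read_line(input).expect("Failed to read");
    input
}

fn main() {
    let mut reader = io::BufReader::new(io::stdin());
    let mut input = String::new();

    let n: usize = read_one_line(&mut reader, &mut input).trim().parse().unwrap();
    for _ in 0..n {
        let inp: u64 = read_one_line(&mut reader, &mut input).trim().parse().unwrap();
        println!("{}", find_winner(inp));
    }
}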

Can I traverse a singly linked list without owner move or unsafe?

A singly linked list can easily be created from the tail, but I could not manage to build one from the head; I tried many times, code here: https://gist.github.com/tioover/8d7585105c06e01678a8.
In fact, I want to search for and then delete a node in a linked list, but I can't traverse the list while holding a mutable borrow: https://gist.github.com/tioover/526715ed05342ef5b4f1. I tried many times too.
Here's some code from an answer to a similar question. It shows a way of having a list that you can add to at the beginning, at the end, and in the middle for good measure.
#[derive(Debug)]
struct Node<T> {
    v: T,
    next: Option<Box<Node<T>>>,
}

impl<T> Node<T> {
    fn new(v: T) -> Node<T> { Node { v: v, next: None } }

    fn push_front(self, head: T) -> Node<T> {
        Node {
            v: head,
            next: Some(Box::new(self)),
        }
    }

    fn push_back(&mut self, tail: T) {
        match self.next {
            Some(ref mut next) => next.push_back(tail),
            None => self.next = Some(Box::new(Node::new(tail))),
        }
    }

    fn push_after(&mut self, v: T) {
        let old_next = self.next.take();
        let new_next = Node {
            v: v,
            next: old_next,
        };
        self.next = Some(Box::new(new_next));
    }
}

fn main() {
    let mut n = Node::new(2u8);
    n.push_back(3u8);

    let mut n = n.push_front(0u8);
    n.push_after(1u8);

    println!("{:?}", n);
}
The important thing is that when we add to the head, we consume the old head by taking it as self. That allows us to move it into a Box, which will be the follower of the new head. Removing an item is a straightforward extension of this example, but you'll need to look forward a bit and handle more edge cases (like what to do if there isn't a second successor).
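Since the question is ultimately about searching for and then deleting a node, here is one possible sketch of such an operation on the Node type above. remove_after is my own name and design: it removes the first matching node after self, leaving the "what if the head itself matches" case to the caller, since the head owns the rest of the list.
impl<T: PartialEq> Node<T> {
    // Remove the first node *after* `self` whose value equals `target`,
    // returning the removed value if one was found.
    fn remove_after(&mut self, target: &T) -> Option<T> {
        let next_matches = match self.next {
            Some(ref next) => next.v == *target,
            None => return None,
        };
        if next_matches {
            // Splice the matching node out by re-linking around it.
            let mut removed = self.next.take().expect("checked above");
            self.next = removed.next.take();
            Some(removed.v)
        } else {
            // Recurse, mirroring the style of `push_back` above.
            self.next.as_mut().expect("checked above").remove_after(target)
        }
    }
}

fn main() {
    let mut n = Node::new(2u8);
    n.push_back(3u8);
    let mut n = n.push_front(0u8);
    n.push_after(1u8);

    // The list now holds 0, 1, 2, 3; remove the node holding 2.
    assert_eq!(n.remove_after(&2u8), Some(2u8));
    println!("{:?}", n); // now holds 0, 1, 3
}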
