Class set method in Haskell using the State monad

I've recently been looking at Haskell's State monad. I've been able to write functions that use it, but I'm trying to encapsulate the behavior into a class. Basically, I'm trying to replicate in Haskell something like this:
class A {
public:
    int x;
    void setX(int newX) {
        x = newX;
    }
    int getX() {
        return x;
    }
};
I would be very grateful if anyone can help with this. Thanks!

I would start off by noting that Haskell, to say the least, does not encourage traditional OO-style development; instead, it has features and characteristics that lend themselves well to the sort of 'pure functional' manipulation that you won't really find in many other languages. The short version is that trying to 'bring over' concepts from other (traditional) languages can often be a very bad idea.
but I'm trying to encapsulate the behavior into a class
Hence, the first major question that comes to mind is: why? Surely you want to do something with this (traditional OO concept of a) class?
If an approximate answer to this question is: "I'd like to model some sort of data construct", then you'd be better off working with something like
data A = A { xval :: Int }
> let obj1 = A 5
> xval obj1
5
> let obj2 = obj1 { xval = 10 }
> xval obj2
10
This demonstrates pure, immutable data structures, along with 'getter' functions and non-destructive updates (utilizing record syntax). This way, you'd do whatever work you need to do as some combination of functions mapping these 'data constructs' to new data constructs, as appropriate.
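For instance, a pure 'setter' for A is just a function from an old A to a new one (a small illustrative sketch, not part of the original code):
setX :: Int -> A -> A
setX newX a = a { xval = newX }  -- builds a new A; the old one is untouched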
Now, if you absolutely needed some sort of model of State (and indeed, answering this question requires a bit of experience in knowing exactly what local versus global state is), only then would you delve into using the State Monad, with something like:
module StateGame where

import Control.Monad.State

-- Example use of the State monad.
-- Passes a string over the alphabet {a,b,c}.
-- The game is to produce a number from the string.
-- By default the game is off; a 'c' toggles the
-- game on and off. An 'a' gives +1 and a 'b' gives -1.
-- E.g.
-- 'ab'    = 0
-- 'ca'    = 1
-- 'cabca' = 0
-- State = game is on or off & current score
--       = (Bool, Int)

type GameValue = Int
type GameState = (Bool, Int)

playGame :: String -> State GameState GameValue
playGame [] = do
    (_, score) <- get
    return score
playGame (x:xs) = do
    (on, score) <- get
    case x of
        'a' | on -> put (on, score + 1)
        'b' | on -> put (on, score - 1)
        'c'      -> put (not on, score)
        _        -> put (on, score)
    playGame xs

startState = (False, 0)

main = print $ evalState (playGame "abcaaacbbcabbab") startState
(shamelessly lifted from this tutorial). Note the use of the analogous 'pure immutable data structures' within the context of a state monad, in addition to 'put' and 'get' monadic functions, which facilitate access to the state contained within the State Monad.
Ultimately, I'd suggest you ask yourself: what is it that you really want to accomplish with this model of an (OO) class? Haskell is not your typical OO-language, and trying to map concepts over 1-to-1 will only frustrate you in the short (and possibly long) term. This should be a standard mantra, but I'd highly recommend learning from the book Real World Haskell, where the authors are able to delve into far more detailed 'motivation' for picking any one tool or abstraction over another. If you were adamant, you could model traditional OO constructs in Haskell, but I wouldn't suggest going about doing this - unless you have a really good reason for doing so.
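For completeness, if you really were adamant, one classic encoding is a record of functions that returns a fresh 'object' on every update. A hypothetical sketch:
-- an immutable 'object': objGetX reads the field, objSetX returns a new object
data Obj = Obj { objGetX :: Int, objSetX :: Int -> Obj }

-- each 'set' builds a brand-new object; nothing is mutated
mkObj :: Int -> Obj
mkObj n = Obj { objGetX = n, objSetX = mkObj }

-- > objGetX (objSetX (mkObj 5) 10)  ==>  10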

It takes a bit of reworking to transform imperative code into a purely functional setting.
A setter mutates an object. We're not allowed to do that directly in Haskell because of laziness and purity.
Perhaps it will be more apparent if we transcribe the State monad into another language. Your code is in C++, but since I at least want garbage collection, I'll use Java here.
Since Java never got around to defining anonymous functions (at the time of writing), we'll first define an interface for pure functions.
public interface Function<A,B> {
    B apply(A a);
}
Then we can make up a pure immutable pair type.
public final class Pair<A,B> {
    private final A a;
    private final B b;

    public Pair(A a, B b) {
        this.a = a;
        this.b = b;
    }

    public A getFst() { return a; }
    public B getSnd() { return b; }

    public static <A,B> Pair<A,B> make(A a, B b) {
        return new Pair<A,B>(a, b);
    }

    public String toString() {
        return "(" + a + ", " + b + ")";
    }
}
With those in hand, we can actually define the State monad:
public abstract class State<S,A> implements Function<S, Pair<A, S>> {
    // pure takes a value a and yields a state action that takes a state s,
    // leaves it untouched, and returns your a paired with it.
    public static <S,A> State<S,A> pure(final A a) {
        return new State<S,A>() {
            public Pair<A,S> apply(S s) {
                return new Pair<A,S>(a, s);
            }
        };
    }

    // we can also read the state as a state action.
    public static <S> State<S,S> get() {
        return new State<S,S>() {
            public Pair<S,S> apply(S s) {
                return new Pair<S,S>(s, s);
            }
        };
    }

    // we can compose state actions
    public <B> State<S,B> bind(final Function<A, State<S,B>> f) {
        return new State<S,B>() {
            public Pair<B,S> apply(S s0) {
                Pair<A,S> p = State.this.apply(s0);
                return f.apply(p.getFst()).apply(p.getSnd());
            }
        };
    }

    // we can also replace the state as a state action,
    // returning the old state as the result.
    public static <S> State<S,S> put(final S newS) {
        return new State<S,S>() {
            public Pair<S,S> apply(S s) {
                return new Pair<S,S>(s, newS);
            }
        };
    }
}
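To see how these pieces compose, here is a hypothetical usage example (my own sketch, not from the original answer): the Java rendering of Haskell's do { s <- get; put (s + 1); return s }:
public class StateDemo {
    public static void main(String[] args) {
        // increment an Integer state, returning the old value
        State<Integer, Integer> postIncrement =
            State.<Integer>get().bind(new Function<Integer, State<Integer, Integer>>() {
                public State<Integer, Integer> apply(final Integer s) {
                    return State.<Integer>put(s + 1).bind(new Function<Integer, State<Integer, Integer>>() {
                        public State<Integer, Integer> apply(Integer oldState) {
                            return State.<Integer, Integer>pure(s);
                        }
                    });
                }
            });
        // run the state action with an initial state of 41
        System.out.println(postIncrement.apply(41)); // prints (41, 42)
    }
}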
Now, there does exist a notion of a getter and a setter inside of the state monad. These are called lenses. The basic presentation in Java would look like:
public abstract class Lens<A,B> {
    public abstract B get(A a);
    public abstract A set(B b, A a);
    // .. followed by a whole bunch of concrete methods.
}
The idea is that a lens provides access to a getter that knows how to extract a B from an A, and a setter that knows how to take a B and some old A, and replace part of the A, yielding a new A. It can't mutate the old one, but it can construct a new one with one of the fields replaced.
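As a concrete sketch (hypothetical, not from the talk), a lens focused on the first component of the Pair type above might look like:
public final class FstLens<X,Y> extends Lens<Pair<X,Y>, X> {
    // getter: extract the first component
    public X get(Pair<X,Y> p) { return p.getFst(); }
    // setter: build a new pair with the first component replaced;
    // the old pair is not mutated
    public Pair<X,Y> set(X x, Pair<X,Y> p) { return Pair.make(x, p.getSnd()); }
}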
I gave a talk on these at a recent Boston Area Scala Enthusiasts meeting. You can watch the presentation here.
To come back to Haskell, rather than talking about things in an imperative setting, we can import
import Data.Lens.Lazy
from comonad-transformers or one of the other lens libraries mentioned here. That link provides the laws that must be satisfied to be a valid lens.
And then what you are looking for is some data type like:
data A = A { x_ :: Int }
with a lens
x :: Lens A Int
x = lens x_ (\b a -> a { x_ = b })
Now you can write code that looks like
postIncrement :: State A Int
postIncrement = do
    old_x <- access x
    x ~= (old_x + 1)
    return old_x
using the combinators from Data.Lens.Lazy.
The other lens libraries mentioned above provide similar combinators.

First of all, I agree with Raeez that this is probably the wrong way to go, unless you really know why! If you want to increase some value by 42 (say), why not write a function that does that for you?
It's quite a change from the traditional OO mindset where you have boxes with values in them and you take them out, manipulate them and put them back in. I would say that until you start noticing the pattern "Hey! I always take some value as an argument, and at the end return it slightly modified, tupled with some other value and all the plumbing is getting messy!" you don't really need the State monad. Part of the fun of (learning) Haskell is finding new ways to get around the stateful OO thinking!
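To make that pattern concrete, here is a small sketch (hypothetical names): the same action written with hand-threaded state, and then with State doing the plumbing:
import Control.Monad.State

-- by hand: the result comes back tupled with the updated state
labelOne :: Int -> (String, Int)
labelOne n = ("item" ++ show n, n + 1)

-- with State: get and put hide the tupling and threading
labelOne' :: State Int String
labelOne' = do
    n <- get
    put (n + 1)
    return ("item" ++ show n)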
That said, if you absolutely want a box with an x of type Int in it, you could try making your own get and put variants, something like this:
import Control.Monad.State

data Point = Point { x :: Int, y :: Int } deriving Show

getX :: State Point Int
getX = get >>= return . x

putX :: Int -> State Point ()
putX newVal = do
    pt <- get
    put (pt { x = newVal })

increaseX :: State Point ()
increaseX = do
    x <- getX
    putX (x + 1)

test :: State Point Int
test = do
    x1 <- getX
    increaseX
    x2 <- getX
    putX 7
    x3 <- getX
    return $ x1 + x2 + x3
Now, if you evaluate runState test (Point 2 9) in ghci, you'll get back (12,Point {x = 7, y = 9}) (since 12 = 2 + (2+1) + 7 and the x in the state gets set to 7 at the end). If you don't care about the returned point, you can use evalState and you'll get just the 12.
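For reference, here are the three standard runners on this example at the GHCi prompt (values as derived above):
> runState test (Point 2 9)
(12,Point {x = 7, y = 9})
> evalState test (Point 2 9)
12
> execState test (Point 2 9)
Point {x = 7, y = 9}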
There are also more advanced things to do here, like abstracting out Point with a typeclass in case you have multiple data structures with something that behaves like x, but that's better left for another question, in my opinion!

Related

F# record: ref vs mutable field

While refactoring my F# code, I found a record with a field of type bool ref:
type MyType =
    {
        Enabled : bool ref
        // other, irrelevant fields here
    }
I decided to try changing it to a mutable field instead
// Refactored version
type MyType =
    {
        mutable Enabled : bool
        // other fields unchanged
    }
Also, I applied all the changes required to make the code compile (i.e. changing := to <-, removing incr and decr functions, etc).
I noticed that after the changes some of the unit tests started to fail.
As the code is pretty large, I can't really see what exactly changed.
Is there a significant difference in implementation of the two that could change the behavior of my program?
Yes, there is a difference. Refs are first-class values, while mutable variables are a language construct.
Or, from a different perspective, you might say that ref cells are passed by reference, while mutable variables are passed by value.
Consider this:
type T = { mutable x : int }
type U = { y : int ref }
let t = { x = 5 }
let u = { y = ref 5 }
let mutable xx = t.x
xx <- 10
printfn "%d" t.x // Prints 5
let mutable yy = u.y
yy := 10
printfn "%d" !u.y // Prints 10
This happens because xx is a completely new mutable variable, unrelated to t.x, so that mutating xx has no effect on t.x.
But yy is a reference to the exact same ref cell as u.y, so that pushing a new value into that cell while referring to it via yy has the same effect as if referring to it via u.y.
If you "copy" a ref, the copy ends up pointing to the same ref, but if you copy a mutable variable, only its value gets copied.
The difference is not really about first-class values or pass-by-reference versus pass-by-value; it's because a ref is just a container (a class) in its own right.
The difference is more obvious when you implement a ref by yourself. You could do it like this:
type Reference<'a> = {
    mutable Value: 'a
}
Now look at both definitions.
type MyTypeA = {
    mutable Enabled: bool
}

type MyTypeB = {
    Enabled: Reference<bool>
}
MyTypeA has an Enabled field that can be changed directly; in other words, it is mutable.
On the other side you have MyTypeB, which is theoretically immutable but has an Enabled field that references a mutable class.
The Enabled field of MyTypeB just references an object that is mutable, like millions of other classes in .NET. From the above type definitions, you can create objects like these:
let t = { MyTypeA.Enabled = true }
let u = { MyTypeB.Enabled = { Value = true }}
Creating the values makes it more obvious that the first is a mutable field, and the second contains an object with a mutable field.
You can find the implementation of ref in FSharp.Core/prim-types.fs; it looks like this:
[<DebuggerDisplay("{contents}")>]
[<StructuralEquality; StructuralComparison>]
[<CompiledName("FSharpRef`1")>]
type Ref<'T> =
    {
        [<DebuggerBrowsable(DebuggerBrowsableState.Never)>]
        mutable contents: 'T
    }
    member x.Value
        with get() = x.contents
        and set v = x.contents <- v

and 'T ref = Ref<'T>
The ref keyword in F# is just the built-in way to create such a pre-defined mutable Reference object, rather than you having to create your own type for it. It also has the benefit of working well whenever you need to pass byref, in or out values in .NET. So you should usually use ref, but you can also use a mutable for this. For example, both of the following code examples do the same thing.
With a reference
let parsed =
    let result = ref 0
    match System.Int32.TryParse("1234", result) with
    | true -> result.Value
    | false -> result.Value
With a mutable
let parsed =
    let mutable result = 0
    match System.Int32.TryParse("1234", &result) with
    | true -> result
    | false -> result
In both examples you get 1234 parsed as an int. But the first example will create an FSharpRef and pass it to Int32.TryParse, while the second example creates a local mutable variable and passes it as an out parameter to Int32.TryParse.

Can functions return user-defined types in Processing?

I wrote a class to represent complex numbers in Processing (which I call Complex). I want to implement basic arithmetic functions on complex numbers. However, if I declare the return type of a method to be Complex and try to return a new object, I get an error that says
"Void methods cannot return a value". I also get errors for the parentheses after the function name, as well as the comma separating the x and y parameters.
However, I have noticed that if I change the return type to something built-in (such as int or String) and return some arbitrary value of the correct type, all of these errors disappear. I have also not seen any examples of functions returning non-built-in types either. These two facts lead me to believe that I may not be able to return objects of a class I defined. So my question is whether it is possible to return an object of a class I have defined in Processing. If not, is there any way around this?
Here is my code:
class Complex {
    int re, im; // real and imaginary components

    Complex(int re, int im) {
        this.re = re;
        this.im = im;
    }
}
Complex add(Complex x, Complex y) {
    int re_new = x.re + y.re;
    int im_new = x.im + y.im;
    return new Complex(re_new, im_new);
}
With Processing, just as with Java, you can return any object type from a function. Here's a short proof of concept for you to copy, paste and try:
void setup() {
    Complex a = new Complex(1, 5);
    Complex b = new Complex(2, 6);
    Complex c = add(a, b);
    println("[" + c.re + ", " + c.im + "]");
}

void draw() {
}

class Complex {
    int re, im; // real and imaginary components

    Complex(int re, int im) {
        this.re = re;
        this.im = im;
    }
}

Complex add(Complex x, Complex y) {
    int re_new = x.re + y.re;
    int im_new = x.im + y.im;
    return new Complex(re_new, im_new);
}
I noticed that the add method lights up as if it were shadowing another method, but that didn't prevent it from running as intended.
If the situation persists, you may want to post more code, as the problem may be somewhere unexpected. Good luck!

Referencing / dereferencing a vector element in a for loop

In the code below, I want to retain number_list after iterating over it, since the .into_iter() that for uses by default would consume it. Thus, I am assuming that n: &i32 and that I can get the value of n by dereferencing.
fn main() {
    let number_list = vec![24, 34, 100, 65];
    let mut largest = number_list[0];

    for n in &number_list {
        if *n > largest {
            largest = *n;
        }
    }

    println!("{}", largest);
}
It was revealed to me that instead of this, we can use &n as a 'pattern':
fn main() {
    let number_list = vec![24, 34, 100, 65];
    let mut largest = number_list[0];

    for &n in &number_list {
        if n > largest {
            largest = n;
        }
    }

    println!("{}", largest);
    number_list;
}
My confusion (and bear in mind I haven't covered patterns) is that I would expect that since n: &i32, then &n: &&i32 rather than it resolving to the value (if a double ref is even possible). Why does this happen, and does the meaning of & differ depending on context?
It can help to think of a reference as a kind of container. For comparison, consider Option, where we can "unwrap" the value using pattern-matching, for example in an if let statement:
let n = 100;
let opt = Some(n);
if let Some(p) = opt {
    // do something with p
}
We call Some and None constructors for Option, because they each produce a value of type Option. In the same way, you can think of & as a constructor for a reference. And the syntax is symmetric:
let n = 100;
let reference = &n;
if let &p = reference {
    // do something with p
}
You can use this feature in any place where you are binding a value to a variable, which happens all over the place. For example:
if let, as above
match expressions:
match opt {
    Some(1) => { ... },
    Some(p) => { ... },
    None => { ... },
}

match reference {
    &1 => { ... },
    &p => { ... },
}
In function arguments:
fn foo(&p: &i32) { ... }
Loops:
for &p in iter_of_i32_refs {
    ...
}
And probably more.
Note that the last two won't work for Option because they would panic if a None was found instead of a Some, but that can't happen with references because they only have one constructor, &.
does the meaning of & differ depending on context?
Hopefully, if you can interpret & as a constructor instead of an operator, then you'll see that its meaning doesn't change. It's a pretty cool feature of Rust that you can use constructors on the right hand side of an expression for creating values and on the left hand side for taking them apart (destructuring).
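A tiny sketch of that symmetry, where the same syntax constructs on the right-hand side and destructures on the left:
fn main() {
    let pair = (1, (2, 3));   // construct with tuple syntax
    let (a, (b, c)) = pair;   // destructure with the same syntax
    let r = &a;               // construct a reference with &
    let &d = r;               // destructure it with &; d: i32 (i32 is Copy)
    println!("{} {} {} {}", a, b + c, r, d);
}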
As opposed to other languages (e.g. C++), &n in this case isn't taking a reference, but pattern matching: it says that the matched value is expected to be a reference.
The opposite of this would be ref n, which would give you &&i32 as the type.
This is also the case for closures, e.g.
(0..).filter(|&idx| idx < 10)...
Please note that this destructuring copies the value out of the reference, so you cannot do this with types that don't implement the Copy trait.
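Both points in one small sketch (hypothetical values): ref adds a level of indirection, while an &-pattern copies the value out of the reference and therefore needs Copy:
fn main() {
    let nums = vec![10, 20, 30];
    for ref n in &nums {
        // item type is &i32, so the `ref` pattern binds n: &&i32
        let _: &&i32 = n;
    }
    // `&idx` copies the i32 out of the &i32 the closure receives;
    // this only works because i32 implements Copy
    let small: Vec<i32> = (0..).filter(|&idx| idx < 10).take(3).collect();
    println!("{:?} {:?}", nums, small);
}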
My confusion (and bear in mind I haven't covered patterns) is that I would expect that since n: &i32, then &n: &&i32 rather than it resolving to the value (if a double ref is even possible). Why does this happen, and does the meaning of & differ depending on context?
When you do pattern matching (for example when you write for &n in &number_list), you're not saying that n is an &i32, instead you are saying that &n (the pattern) is an &i32 (the expression) from which the compiler infers that n is an i32.
Similar things happen for all kinds of pattern, for example when pattern-matching in if let Some (x) = Some (42) { /* … */ } we are saying that Some (x) is Some (42), therefore x is 42.

Swift Dictionary slow even with optimizations: doing unnecessary retain/release?

The following code, which maps simple value holders to booleans, runs over 20x faster in Java than Swift 2 - XCode 7 beta3, "Fastest, Aggressive Optimizations [-Ofast]", and "Fast, Whole Module Optimizations" turned on. I can get over 280M lookups/sec in Java but only about 10M in Swift.
When I look at it in Instruments I see that most of the time is going into a pair of retain/release calls associated with the map lookup. Any suggestions on why this is happening or a workaround would be appreciated.
The structure of the code is a simplified version of my real code, which has a more complex key class and also stores other types (though Boolean is an actual case for me). Also, note that I am using a single mutable key instance for the retrieval to avoid allocating objects inside the loop and according to my tests this is faster in Swift than an immutable key.
EDIT: I have also tried switching to NSMutableDictionary but when used with Swift objects as keys it seems to be terribly slow.
EDIT2: I have tried implementing the test in objc (which wouldn't have the Optional unwrapping overhead) and it is faster, but still over an order of magnitude slower than Java... I'm going to post that example as another question to see if anyone has ideas.
EDIT3 - Answer. I have posted my conclusions and my workaround in an answer below.
public final class MyKey : Hashable {
    var xi : Int = 0
    init( _ xi : Int ) { set( xi ) }
    final func set( xi : Int) { self.xi = xi }
    public final var hashValue: Int { return xi }
}

public func == (lhs: MyKey, rhs: MyKey) -> Bool {
    if ( lhs === rhs ) { return true }
    return lhs.xi == rhs.xi
}
...
var map = Dictionary<MyKey,Bool>()
let range = 2500
for x in 0...range { map[ MyKey(x) ] = true }

let runs = 10
for _ in 0...runs
{
    let time = Time()
    let reps = 10000
    let key = MyKey(0)
    for _ in 0...reps {
        for x in 0...range {
            key.set(x)
            if ( map[ key ] == nil ) { XCTAssertTrue(false) }
        }
    }
    print("rate=\(time.rate( reps*range )) lookups/s")
}
and here is the corresponding Java code:
public class MyKey {
    public int xi;
    public MyKey( int xi ) { set( xi ); }
    public void set( int xi ) { this.xi = xi; }

    @Override public int hashCode() { return xi; }

    @Override
    public boolean equals( Object o ) {
        if ( o == this ) { return true; }
        MyKey mk = (MyKey)o;
        return mk.xi == this.xi;
    }
}
...
Map<MyKey,Boolean> map = new HashMap<>();
int range = 2500;
for(int x=0; x<range; x++) { map.put( new MyKey(x), true ); }

int runs = 10;
for(int run=0; run<runs; run++)
{
    Time time = new Time();
    int reps = 10000;
    MyKey buffer = new MyKey( 0 );
    for (int it = 0; it < reps; it++) {
        for (int x = 0; x < range; x++) {
            buffer.set( x );
            if ( map.get( buffer ) == null ) { Assert.assertTrue( false ); }
        }
    }
    float rate = reps*range/time.s();
    System.out.println( "rate = " + rate );
}
After much experimentation I have come to some conclusions and found a workaround (albeit somewhat extreme).
First let me say that I recognize that this kind of very fine grained data structure access within a tight loop is not representative of general performance, but it does affect my application and I'm imagining others like games and heavily numeric applications. Also let me say that I know that Swift is a moving target and I'm sure it will improve - perhaps my workaround (hacks) below will not be necessary by the time you read this. But if you are trying to do something like this today and you are looking at Instruments and seeing the majority of your application time spent in retain/release and you don't want to rewrite your entire app in objc please read on.
What I have found is that almost anything that one does in Swift that touches an object reference incurs an ARC retain/release penalty. Additionally Optional values - even optional primitives - also incur this cost. This pretty much rules out using Dictionary or NSDictionary.
Here are some things that are fast that you can include in a workaround:
a) Arrays of primitive types.
b) Arrays of final objects, as long as the array is on the stack and not on the heap. E.g. declare an array within the method body (but outside of your loop, of course) and iteratively copy the values to it. Do not Array(array) copy it.
Putting this together you can construct a data structure based on arrays that stores e.g. Ints and then store array indexes to your objects in that data structure. Within your loop you can look up the objects by their index in the fast local array. Before you ask "couldn't the data structure store the array for me" - no, because that would incur two of the penalties I mentioned above :(
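Here is a minimal sketch of that shape (hypothetical names, Swift 2-era syntax), where the hot loop touches only a primitive Int array and indexes into a local array of final objects:
final class Entity {
    let name: String
    init(name: String) { self.name = name }
}

func countNamed(entities: [Entity]) -> Int {
    // slot x holds the index of the entity for key x (-1 means "no entry"):
    // no object references, no Optionals, in the data structure itself
    var slots = [Int](count: entities.count, repeatedValue: -1)
    for x in 0..<entities.count { slots[x] = x }

    var hits = 0
    for x in 0..<entities.count {
        let idx = slots[x]                      // primitive array lookup
        if idx >= 0 && !entities[idx].name.isEmpty {
            hits += 1                           // look up the object by index
        }
    }
    return hits
}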
All things considered this workaround is not too bad - If you can enumerate the entities that you want to store in the Dictionary / data structure you should be able to host them in an array as described. Using the technique above I was able to exceed the Java performance by a factor of 2x in Swift in my case.
If anyone is still reading and interested at this point I will consider updating my example code and posting.
EDIT: I'd add an option: c) It is also possible to use UnsafeMutablePointer<> or Unmanaged<> in Swift to create a reference that will not be retained when passed around. I was not aware of this when I started and I would hesitate to recommend it in general because it's a hack, but I've used it in a few cases to wrap a heavily used array that was incurring a retain/release every time it was referenced.

Hand-over-hand locking with Rust

I'm trying to write an implementation of union-find in Rust. This is famously very simple to implement in languages like C, while still having a complex run time analysis.
I'm having trouble getting Rust's mutex semantics to allow iterative hand-over-hand locking.
Here's how I got where I am now.
First, this is a very simple implementation of part of the structure I want in C:
#include <stdlib.h>

struct node {
    struct node * parent;
};

struct node * create(struct node * parent) {
    struct node * ans = malloc(sizeof(struct node));
    ans->parent = parent;
    return ans;
}

struct node * find_root(struct node * x) {
    while (x->parent) {
        x = x->parent;
    }
    return x;
}

int main() {
    struct node * foo = create(NULL);
    struct node * bar = create(foo);
    struct node * baz = create(bar);
    baz->parent = find_root(bar);
}
Note that the structure of the pointers is that of an inverted tree; multiple pointers may point at a single location, and there are no cycles.
At this point, there is no path compression.
Here is a Rust translation. I chose to use Rust's reference-counted pointer type to support the inverted tree type I referenced above.
Note that this implementation is much more verbose, possibly due to the increased safety that Rust offers, but possibly due to my inexperience with Rust.
use std::rc::Rc;

struct Node {
    parent: Option<Rc<Node>>
}

fn create(parent: Option<Rc<Node>>) -> Node {
    Node {parent: parent.clone()}
}

fn find_root(x: Rc<Node>) -> Rc<Node> {
    let mut ans = x.clone();
    while ans.parent.is_some() {
        ans = ans.parent.clone().unwrap();
    }
    ans
}

fn main() {
    let foo = Rc::new(create(None));
    let bar = Rc::new(create(Some(foo.clone())));
    let mut prebaz = create(Some(bar.clone()));
    prebaz.parent = Some(find_root(bar.clone()));
}
Path compression re-parents each node along a path to the root every time find_root is called. To add this feature to the C code, only two new small functions are needed:
void change_root(struct node * x, struct node * root) {
    while (x) {
        struct node * tmp = x->parent;
        x->parent = root;
        x = tmp;
    }
}

struct node * root(struct node * x) {
    struct node * ans = find_root(x);
    change_root(x, ans);
    return ans;
}
The function change_root does all the re-parenting, while the function root is just a wrapper to use the results of find_root to re-parent the nodes on the path to the root.
In order to do this in Rust, I decided I would have to use a Mutex rather than just a reference counted pointer, since the Rc interface only allows mutable access by copy-on-write when more than one pointer to the item is live. As a result, all of the code would have to change. Before even getting to the path compression part, I got hung up on find_root:
use std::sync::{Mutex,Arc};

struct Node {
    parent: Option<Arc<Mutex<Node>>>
}

fn create(parent: Option<Arc<Mutex<Node>>>) -> Node {
    Node {parent: parent.clone()}
}

fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut ans = x.clone();
    let mut inner = ans.lock();
    while inner.parent.is_some() {
        ans = inner.parent.clone().unwrap();
        inner = ans.lock();
    }
    ans.clone()
}
This produces the error (with 0.12.0)
error: cannot assign to `ans` because it is borrowed
    ans = inner.parent.clone().unwrap();
note: borrow of `ans` occurs here
    let mut inner = ans.lock();
What I think I need here is hand-over-hand locking. For the path A -> B -> C -> ..., I need to lock A, lock B, unlock A, lock C, unlock B, ... Of course, I could keep all of the locks open: lock A, lock B, lock C, ... unlock C, unlock B, unlock A, but this seems inefficient.
However, Mutex does not offer unlock, and uses RAII instead. How can I achieve hand-over-hand locking in Rust without being able to directly call unlock?
EDIT: As the comments noted, I could use Rc<RefCell<Node>> rather than Arc<Mutex<Node>>. Doing so leads to the same compiler error.
For clarity about what I'm trying to avoid by using hand-over-hand locking, here is a RefCell version that compiles but used space linear in the length of the path.
fn find_root(x: Rc<RefCell<Node>>) -> Rc<RefCell<Node>> {
    let mut inner : RefMut<Node> = x.borrow_mut();
    if inner.parent.is_some() {
        find_root(inner.parent.clone().unwrap())
    } else {
        x.clone()
    }
}
We can pretty easily do full hand-over-hand locking as we traverse this list using just a bit of unsafe, which is necessary to tell the borrow checker a small bit of insight that we are aware of, but that it can't know.
But first, let's clearly formulate the problem:
We want to traverse a linked list whose nodes are stored as Arc<Mutex<Node>> to get the last node in the list
We need to lock each node in the list as we go along the way such that another concurrent traversal has to follow strictly behind us and cannot muck with our progress.
Before we get into the nitty-gritty details, let's try to write the signature for this function:
fn find_root(node: Arc<Mutex<Node>>) -> Arc<Mutex<Node>>;
Now that we know our goal, we can start to get into the implementation - here's a first attempt:
fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    // We have to separate this from incoming since the lock must
    // be borrowed from incoming, not this local node.
    let mut node = incoming.clone();
    let mut lock = incoming.lock();

    // Could use while let but that leads to borrowing issues.
    while lock.parent.is_some() {
        node = lock.parent.as_ref().unwrap().clone(); // !! uh-oh !!
        lock = node.lock();
    }

    node
}
If we try to compile this, rustc will error on the line marked !! uh-oh !!, telling us that we can't move out of node while lock still exists, since lock is borrowing node. This is not a spurious error! The data in lock might go away as soon as node does - it's only because we know that we can keep the data lock is pointing to valid and in the same memory location even if we move node that we can fix this.
The key insight here is that the lifetime of data contained within an Arc is dynamic, and it is hard for the borrow checker to make the inferences we can about exactly how long data inside an Arc is valid.
This happens every once in a while when writing rust; you have more knowledge about the lifetime and organization of your data than rustc, and you want to be able to express that knowledge to the compiler, effectively saying "trust me". Enter: unsafe - our way of telling the compiler that we know more than it, and it should allow us to inform it of the guarantees that we know but it doesn't.
In this case, the guarantee is pretty simple - we are going to replace node while lock still exists, but we are not going to ensure that the data inside lock continues to be valid even though node goes away. To express this guarantee we can use mem::transmute, a function which allows us to reinterpret the type of any variable, by just using it to change the lifetime of the lock returned by node to be slightly longer than it actually is.
To make sure we keep our promise, we are going to use another handoff variable to hold node while we reassign lock - even though this moves node (changing its address) and the borrow checker will be angry at us, we know it's ok since lock doesn't point at node, it points at data inside of node, whose address (in this case, since it's behind an Arc) will not change.
Before we get to the solution, it's important to note that the trick we are using here is only valid because we are using an Arc. The borrow checker is warning us of a possibly serious error - if the Mutex was held inline and not in an Arc, this error would be a correct prevention of a use-after-free, where the MutexGuard held in lock would attempt to unlock a Mutex which has already been dropped, or at least moved to another memory location.
use std::mem;
use std::sync::{Arc, Mutex};

fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut node = incoming.clone();
    let mut handoff_node;
    let mut lock = incoming.lock().unwrap();

    // Could use while let but that leads to borrowing issues.
    while lock.parent.is_some() {
        // Keep the data in node around by holding on to this `Arc`.
        handoff_node = node;
        node = lock.parent.as_ref().unwrap().clone();

        // We are going to move out of node while this lock is still around,
        // but since we kept the data around it's ok.
        lock = unsafe { mem::transmute(node.lock().unwrap()) };
    }

    node
}
And, just like that, rustc is happy, and we have hand-over-hand locking, since the last lock is released only after we have acquired the new lock!
There is one unanswered question in this implementation, to which I have not yet received an answer: whether the drop of the old value and the assignment of a new value to a variable are guaranteed to be atomic. If not, there is a race condition where the old lock is released before the new lock is acquired in the assignment of lock. It's pretty trivial to work around this by just having another holdover_lock variable and moving the old lock into it before reassigning, then dropping it after reassigning lock, as sketched below.
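A sketch of that variant, building on the function above (my untested adaptation):
use std::mem;
use std::sync::{Arc, Mutex};

fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut node = incoming.clone();
    let mut handoff_node;
    let mut lock = incoming.lock().unwrap();

    while lock.parent.is_some() {
        // keep the Arc (and thus the locked data) alive across the swap
        handoff_node = node;
        node = lock.parent.as_ref().unwrap().clone();

        // Move the old guard aside so it stays alive while we take the new
        // lock, then release it explicitly afterwards: the old lock is now
        // provably released only after the new one is acquired.
        let holdover_lock = lock;
        lock = unsafe { mem::transmute(node.lock().unwrap()) };
        drop(holdover_lock);
    }

    node
}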
Hopefully this fully addresses your question and shows how unsafe can be used to work around "deficiencies" in the borrow checker when you really do know more. I would still like to stress that the cases where you know more than the borrow checker are rare, and transmuting lifetimes is not "usual" behavior.
Using Mutex in this way, as you can see, is pretty complex, and you have to deal with many, many possible sources of a race condition, and I may not even have caught all of them! Unless you really need this structure to be accessible from many threads, it would probably be best to just use Rc and RefCell, if you need it, as this makes things much easier.
I believe this to fit the criteria of hand-over-hand locking.
use std::sync::Mutex;

fn main() {
    // Create a set of mutexes to lock hand-over-hand
    let mutexes = Vec::from_fn(4, |_| Mutex::new(false));

    // Lock the first one
    let val_0 = mutexes[0].lock();

    if !*val_0 {
        // Lock the second one
        let mut val_1 = mutexes[1].lock();
        // Unlock the first one
        drop(val_0);
        // Do logic
        *val_1 = true;
    }

    for mutex in mutexes.iter() {
        println!("{}", *mutex.lock());
    }
}
Edit #1
Does it work when access to lock n+1 is guarded by lock n?
If you mean something that could be shaped like the following, then I think the answer is no.
struct Level {
    data: bool,
    child: Option<Mutex<Box<Level>>>,
}
However, it is sensible that this should not work. When you wrap an object in a mutex, you are saying "the entire object is safe". You can't say both "the entire pie is safe" and "I'm eating the stuff below the crust" at the same time. Perhaps you could jettison the safety by creating a Mutex<()> and locking that?
This is still not the answer to your literal question of how to do hand-over-hand locking, which should only be important in a concurrent setting (or if someone else forced you to use Mutex references to nodes). It is instead how to do this with Rc and RefCell, which you seem to be interested in.
RefCell only allows mutable writes when one mutable reference is held. Importantly, the Rc<RefCell<Node>> objects are not mutable references. The mutable references it is talking about are the results of calling borrow_mut() on the Rc<RefCell<Node>> object, and as long as you do that in a limited scope (e.g. the body of the while loop), you'll be fine.
The important thing happening in path compression is that the next Rc object will keep the rest of the chain alive while you swing the parent pointer for node to point at root. However, it is not a reference in the Rust sense of the word.
struct Node
{
    parent: Option<Rc<RefCell<Node>>>
}

fn find_root(mut node: Rc<RefCell<Node>>) -> Rc<RefCell<Node>>
{
    while let Some(parent) = node.borrow().parent.clone()
    {
        node = parent;
    }

    return node;
}

fn path_compress(mut node: Rc<RefCell<Node>>, root: Rc<RefCell<Node>>)
{
    while node.borrow().parent.is_some()
    {
        let next = node.borrow().parent.clone().unwrap();
        node.borrow_mut().parent = Some(root.clone());
        node = next;
    }
}
This runs fine for me with the test harness I used, though there may still be bugs. It certainly compiles and runs without a panic! due to trying to borrow_mut() something that is already borrowed. It may actually produce the right answer, that's up to you.
On IRC, Jonathan Reem pointed out that inner is borrowing until the end of its lexical scope, which is too far for what I was asking. Inlining it produces the following, which compiles without error:
fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut ans = x.clone();
    while ans.lock().parent.is_some() {
        ans = ans.lock().parent.clone().unwrap();
    }
    ans
}
EDIT: As Francis Gagné points out, this has a race condition, since the lock doesn't extend long enough. Here's a modified version that only has one lock() call; perhaps it is not vulnerable to the same problem.
fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut ans = x.clone();
    loop {
        ans = {
            let tmp = ans.lock();
            match tmp.parent.clone() {
                None => break,
                Some(z) => z
            }
        }
    }
    ans
}
EDIT 2: This only holds one lock at a time, and so is racey. I still don't know how to do hand-over-hand locking.
As pointed out by Frank Sherry and others, you shouldn't use Arc/Mutex when single threaded. But his code was outdated, so here is the new one (for version 1.0.0alpha2).
This does not take linear space either (like the recursive code given in the question).
struct Node {
    parent: Option<Rc<RefCell<Node>>>
}

fn find_root(node: Rc<RefCell<Node>>) -> Rc<RefCell<Node>> {
    let mut ans = node.clone(); // Rc<RefCell<Node>>
    loop {
        ans = {
            let ans_ref = ans.borrow(); // std::cell::Ref<Node>
            match ans_ref.parent.clone() {
                None => break,
                Some(z) => z
            }
        } // ans_ref goes out of scope, and ans becomes mutable
    }
    ans
}

fn path_compress(mut node: Rc<RefCell<Node>>, root: Rc<RefCell<Node>>) {
    while node.borrow().parent.is_some() {
        let next = {
            let node_ref = node.borrow();
            node_ref.parent.clone().unwrap()
        };
        node.borrow_mut().parent = Some(root.clone());
        // RefMut<Node> from borrow_mut() is out of scope here...
        node = next; // therefore we can mutate node
    }
}
Note for beginners: Pointers are automatically dereferenced by dot operator. ans.borrow() actually means (*ans).borrow(). I intentionally used different styles for the two functions.
Although not the answer to your literal question (hand-over locking), union-find with weighted-union and path-compression can be very simple in Rust:
fn unionfind<I: Iterator<(uint, uint)>>(mut iterator: I, nodes: uint) -> Vec<uint>
{
    let mut root = Vec::from_fn(nodes, |x| x);
    let mut rank = Vec::from_elem(nodes, 0u8);

    for (mut x, mut y) in iterator
    {
        // find roots for x and y; do path compression on look-ups
        while (x != root[x]) { root[x] = root[root[x]]; x = root[x]; }
        while (y != root[y]) { root[y] = root[root[y]]; y = root[y]; }

        if x != y
        {
            // weighted union swings roots
            match rank[x].cmp(&rank[y])
            {
                Less => root[x] = y,
                Greater => root[y] = x,
                Equal =>
                {
                    root[y] = x;
                    rank[x] += 1
                },
            }
        }
    }

    // return the final root vector (missing in the original snippet)
    root
}
Maybe the meta-point is that the union-find algorithm may not be the best place to handle node ownership, and by using references to existing memory (in this case, by just using uint identifiers for the nodes) without affecting the lifecycle of the nodes makes for a much simpler implementation, if you can get away with it of course.
