I'm writing a parser generator as a project to learn Rust, and I'm running into something I can't figure out with lifetimes and closures. Here's my simplified case (sorry it's as complex as it is, but I need the custom iterator in the real version, and it seems to make a difference in the compiler's behavior):
Playpen link: http://is.gd/rRm2aa
struct MyIter<'stat, T:Iterator<&'stat str>>{
source: T
}
impl<'stat, T:Iterator<&'stat str>> Iterator<&'stat str> for MyIter<'stat, T>{
fn next(&mut self) -> Option<&'stat str>{
self.source.next()
}
}
struct Scanner<'stat,T:Iterator<&'stat str>>{
input: T
}
impl<'main> Scanner<'main, MyIter<'main,::std::str::Graphemes<'main>>>{
fn scan_literal(&'main mut self) -> Option<String>{
let mut token = String::from_str("");
fn get_chunk<'scan_literal,'main>(result:&'scan_literal mut String,
input: &'main mut MyIter<'main,::std::str::Graphemes<'main>>)
-> Option<&'scan_literal mut String>{
Some(input.take_while(|&chr| chr != "\"")
.fold(result, |&mut acc, chr|{
acc.push_str(chr);
&mut acc
}))
}
get_chunk(&mut token,&mut self.input);
println!("token is {}", token);
Some(token)
}
}
fn main(){
let mut scanner = Scanner{input:MyIter{source:"\"foo\"".graphemes(true)}};
scanner.scan_literal();
}
There are two problems I know of here. First, I have to shadow the 'main lifetime in the get_chunk function (I tried using the one in the impl, but the compiler complains that 'main is undefined inside get_chunk). I think it will still work out because the call to get_chunk later will match the 'main from the impl with the 'main from get_chunk, but I'm not sure that's right.
The second problem is that the &mut acc inside the closure needs to have a lifetime of 'scan_literal in order to work like I want it to (accumulating characters until the first " is encountered for this example). I can't add an explicit lifetime to &mut acc though, and the compiler says its lifetime is limited to the closure itself, and thus I can't return the reference to use in the next iteration of fold. I've gotten the function to compile and run in various other ways, but I don't understand what the problem is here.
My main question is: Is there any way to explicitly specify the lifetime of an argument to a closure? If not, is there a better way to accumulate the string using fold without doing multiple copies?
First, about lifetimes. Functions defined inside other functions are static: they are not connected with the enclosing code in any way. Consequently, their lifetime parameters are completely independent. You don't want to use 'main as a lifetime parameter for get_chunk() because it will shadow the outer 'main lifetime and cause nothing but confusion.
Next, about closures. This expression:
|&mut acc, chr| ...
very likely does not do what you think it does. Closure and function arguments accept irrefutable patterns, and & has a special meaning in patterns: it dereferences the value it is matched against and binds the identifier to the dereferenced value:
let x: int = 10i;
let p: &int = &x;
match p {
&y => println!("{}", y) // prints 10
}
You can think of & in a pattern as an opposite to & in an expression: in an expression it means "take a reference", in a pattern it means "remove the reference".
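For readers on current Rust, where int and the 10i suffix no longer exist, here is a minimal sketch of the same idea. The function name sum_by_pattern is made up for illustration; it shows & in a closure argument pattern stripping the reference from each item.

```rust
// Hypothetical helper: sums a slice by destructuring each `&i32` item.
fn sum_by_pattern(values: &[i32]) -> i32 {
    // `&v` matches a `&i32` and binds `v` to the `i32` behind the reference.
    values.iter().fold(0, |acc, &v| acc + v)
}

fn main() {
    let x: i32 = 10;
    let p: &i32 = &x;
    // The same pattern works in `let`, because it is irrefutable.
    let &y = p;
    println!("{}", y); // prints 10
    println!("{}", sum_by_pattern(&[1, 2, 3])); // prints 6
}
```

This only works without cloning because i32 is Copy; for a non-Copy item type the & pattern would try to move out of a borrow and be rejected.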
mut, however, does not belong to & in patterns; it belongs to the identifier and means that the variable with this identifier is mutable, i.e. you should write not
|&mut acc, chr| ...
but
|& mut acc, chr| ...
You may be interested in this RFC which is exactly about this quirk in the language syntax.
It looks like you are trying to do something quite unusual, and I'm not sure I understand what you're getting at. It is very likely that you are confusing different string kinds. First of all, you should read the official guide, which explains ownership and borrowing and when to use them (you may also want to read the unfinished ownership guide; it will soon get into the main documentation tree), and then you should read the strings guide.
Anyway, your problem can be solved in a much simpler and more generic way:
#[deriving(Clone)]
struct MyIter<'s, T: Iterator<&'s str>> {
source: T
}
impl<'s, T: Iterator<&'s str>> Iterator<&'s str> for MyIter<'s, T>{
fn next(&mut self) -> Option<&'s str>{
self.source.next()
}
}
#[deriving(Clone)]
struct Scanner<'s, T: Iterator<&'s str>> {
input: T
}
impl<'m, T: Iterator<&'m str>> Scanner<'m, T> {
fn scan_literal(&mut self) -> Option<String>{
fn get_chunk<'a, T: Iterator<&'a str>>(input: T) -> Option<String> {
Some(
input.take_while(|&chr| chr != "\"")
.fold(String::new(), |mut acc, chr| {
acc.push_str(chr);
acc
})
)
}
let token = get_chunk(self.input.by_ref());
println!("token is {}", token);
token
}
}
fn main(){
let mut scanner = Scanner{
input: MyIter {
source: "\"foo\"".graphemes(true)
}
};
scanner.scan_literal();
}
You don't need to pass external references into the closure; you can build the String directly in the fold() operation. I also made your code more generic and more idiomatic.
Note that the impl for Scanner now also works with arbitrary iterators returning &str. It is very likely that you want to write this instead of specializing Scanner to work only with MyIter with Graphemes inside it. The by_ref() operation turns &mut I, where I is an Iterator<T>, into J, where J is an Iterator<T>. It allows further chaining of iterators even if you only have a mutable reference to the original iterator.
By the way, your code is also incomplete: it will only return Some("") because take_while() stops at the first quote and won't scan further. You should rewrite it to take the initial quote into account.
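The code in this answer predates Rust 1.0 and no longer compiles as written. Below is a rough sketch of the same approach in current Rust, with the initial quote handled. It is an adaptation, not the original code: it iterates over chars instead of graphemes (which now live in an external crate), drops MyIter, and the scan helper is a hypothetical name added for illustration.

```rust
// Scanner over any iterator of chars (an assumption; the original used graphemes).
struct Scanner<I: Iterator<Item = char>> {
    input: I,
}

impl<I: Iterator<Item = char>> Scanner<I> {
    // Reads a double-quoted literal and returns its contents without the quotes.
    fn scan_literal(&mut self) -> Option<String> {
        // Consume the opening quote; give up if the input does not start with one.
        if self.input.next()? != '"' {
            return None;
        }
        // `by_ref()` lets us chain adaptors while only borrowing the iterator.
        let token = self
            .input
            .by_ref()
            .take_while(|&c| c != '"')
            // The accumulator is moved in and out of the closure: no copies.
            .fold(String::new(), |mut acc, c| {
                acc.push(c);
                acc
            });
        Some(token)
    }
}

// Hypothetical convenience wrapper for the example.
fn scan(text: &str) -> Option<String> {
    Scanner { input: text.chars() }.scan_literal()
}

fn main() {
    println!("token is {:?}", scan("\"foo\" rest"));
}
```

Note that this sketch does not detect an unterminated literal: if the closing quote is missing, it returns everything up to the end of the input.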
I have a Rust trait for effects that manipulate the game state when they are applied:
pub struct GameState {some_number : usize}
impl GameState {
pub fn apply(self, e: &dyn Effect) -> GameState {
// <do some accounting>
return e.apply(self);
}
}
pub trait Effect {
fn apply(self, state: GameState) -> GameState;
}
In one of the Effects I want to apply another one:
pub struct MyFirstEffect;
impl Effect for MyFirstEffect {
fn apply(self, mut state: GameState) -> GameState{
for state.some_number > 0 {
state = state.apply(&MyOtherEffect::new());
}
return state;
}
}
pub struct MyOtherEffect;
impl Effect for MyOtherEffect {
fn apply(self, mut state: GameState) -> GameState{
return GameState{some_number: state.some_number + 1};
}
}
It is important for me that Effect::apply(..) is only called in GameState::apply(..), so that I can do some bookkeeping and maybe apply other effects that hook into certain effects, but that is not the problem here.
My problem is that the borrow checker hates this. I first thought about passing a mutable reference around (instead of the "-> GameState"), but it does not let me do that: in the loop it complains that the state was already borrowed.
So I came up with this variant where I hand out an immutable GameState every time, which is crazy inefficient, but I thought it was the dumbest and safest way. The borrow checker doesn't let me do that either, because the lifetime of the GameState is not known.
So I tried switching around the & and &mut, but nothing pleases the borrow checker.
I realize I should post the error messages I get. But instead I just want to ask: how do I do such a thing in general? Someone who is good at working with Rust should, imho, know that off the top of their head.
At a glance, the main problem you are having is probably something like: "use of moved value: 'state' value moved here, in previous iteration of loop". I can't know for certain because you haven't posted the error the borrow checker is giving you. Regardless, if that is the case, the cause of the problem is that Copy isn't implemented for GameState and you are moving it into GameState::apply.
This problem can be solved in three ways:
1. Implement Copy for GameState. (This fixes the borrow checker issue but wouldn't achieve the behaviour you desire, because state would remain unchanged by the end of MyFirstEffect::apply.)
2. Follow @cdhowie's suggestion, which takes back the mutated state so that it can be mutated again in the next loop iteration. (This is the answer to your question, but it's not the solution I would go with.)
3. Change the apply functions to accept mutable references. (I would recommend this option because it avoids moving more and more data as the contents of GameState grow. All you need to move is a reference, which has a fixed size regardless of how big GameState is.)
Option three would look something like this:
pub struct GameState {some_condition: usize}
impl GameState {
pub fn apply(&mut self, e: &dyn Effect) {
// NOTE: for state.someCondition > 0 is incorrect for two reasons and
// confusing for one. Incorrect because:
// 1. Using `for` here is not consistent with rust's syntax. A while
// loop should be used instead.
// 2. Rust variable names and struct fields are generally written in
// snake_case rather than camelCase.
//
// And it's a little confusing because the `some_condition` variable is
// not a condition, it's some value.
while self.some_condition > 0 {
e.apply_to(self);
}
}
}
pub trait Effect {
// Consider changing `apply` -> `apply_to` because you are applying the
// effect onto the state rather than the state onto the effect, and it keeps
// the function distinct from `GameState::apply`.
fn apply_to(&self, state: &mut GameState);
}
struct ZeroCondition;
impl Effect for ZeroCondition {
fn apply_to(&self, state: &mut GameState) {
state.some_condition -= 1;
}
}
fn main() {
let mut state = GameState {some_condition: 3};
state.apply(&ZeroCondition);
}
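For comparison, the second option (handing the state over by value and taking it back each iteration) might look roughly like the sketch below. The effect names Decrement and Drain and the run helper are invented for the example, and the effects take &self so that they can be called through &dyn Effect.

```rust
pub struct GameState {
    some_number: usize,
}

pub trait Effect {
    // Takes ownership of the state and hands the (possibly changed) state back.
    fn apply(&self, state: GameState) -> GameState;
}

impl GameState {
    pub fn apply(self, e: &dyn Effect) -> GameState {
        // Bookkeeping would go here, before delegating to the effect.
        e.apply(self)
    }
}

// Hypothetical effect: lowers the counter by one.
struct Decrement;

impl Effect for Decrement {
    fn apply(&self, mut state: GameState) -> GameState {
        state.some_number -= 1;
        state
    }
}

// Hypothetical effect that applies another effect in a loop.
struct Drain;

impl Effect for Drain {
    fn apply(&self, mut state: GameState) -> GameState {
        while state.some_number > 0 {
            // `state` is moved into `apply` and immediately re-assigned from
            // its return value, so it is valid again on the next iteration.
            state = state.apply(&Decrement);
        }
        state
    }
}

// Hypothetical helper: runs Drain on a fresh state and reports the result.
fn run(start: usize) -> usize {
    GameState { some_number: start }.apply(&Drain).some_number
}

fn main() {
    println!("{}", run(3)); // prints 0
}
```

The re-assignment is what keeps the borrow checker quiet here: the moved value is always replaced before it is used again.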
There is already a very popular question about this topic, but I don't fully understand the answer.
The goal is:
I need a list (read a Vec) of "function pointers" that modify data stored elsewhere in a program. The simplest example I can come up with are callbacks to be called when a key is pressed. So when any key is pressed, all functions passed to the object will be called in some order.
Reading the answer, it is not clear to me how I would be able to make such a list. It sounds like I would need to restrict the type of the callback to something known, else I don't know how you would be able to make an array of it.
It's also not clear to me how to store the data pointers/references.
Say I have
struct Processor<CB>
where
CB: FnMut(),
{
callback: CB,
}
As the answer suggests, I can't make an array of processors, can I? Each Processor is technically a different type depending on the generic instantiation.
Indeed, you can't make a vector of processors. Closures generally all have different, unnameable types. What you want instead are trait objects, which give you dynamic dispatch of the callback calls. Since those are not Sized, you'd probably want to put them in a Box. The final type is Vec<Box<dyn FnMut()>>.
fn add_callback(list: &mut Vec<Box<dyn FnMut()>>, cb: impl FnMut() + 'static) {
list.push(Box::new(cb))
}
fn run_callback(list: &mut [Box<dyn FnMut()>]) {
for cb in list {
cb()
}
}
see the playground
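One way to use these two functions with callbacks that modify data stored elsewhere is to share that data through Rc<Cell<_>>, so that each closure owns a handle and satisfies the 'static bound. This is only a sketch; count_key_presses and the idea of counting key presses are assumptions made up for the example.

```rust
use std::cell::Cell;
use std::rc::Rc;

fn add_callback(list: &mut Vec<Box<dyn FnMut()>>, cb: impl FnMut() + 'static) {
    list.push(Box::new(cb));
}

fn run_callbacks(list: &mut [Box<dyn FnMut()>]) {
    for cb in list {
        cb();
    }
}

// Hypothetical driver: simulates `presses` key presses and returns the count
// recorded by one of the callbacks.
fn count_key_presses(presses: usize) -> i32 {
    let counter = Rc::new(Cell::new(0));
    let mut list: Vec<Box<dyn FnMut()>> = Vec::new();

    // The closure owns its own Rc handle, so it borrows nothing from this scope.
    let shared = Rc::clone(&counter);
    add_callback(&mut list, move || shared.set(shared.get() + 1));
    add_callback(&mut list, || println!("key pressed"));

    for _ in 0..presses {
        run_callbacks(&mut list);
    }
    counter.get()
}

fn main() {
    println!("{}", count_key_presses(3));
}
```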
If you do it like that, however, you might run into lifetime issues, because you're either forced to move everything into the closures or can only modify values that live for 'static, which isn't very convenient. Instead, the following might be better:
#[derive(Default)]
struct Producer<'a> {
list: Vec<Box<dyn FnMut() + 'a>>,
}
impl<'a> Producer<'a> {
fn add_callback(&mut self, cb: impl FnMut() + 'a) {
self.list.push(Box::new(cb))
}
fn run_callbacks(&mut self) {
for cb in &mut self.list {
cb()
}
}
}
fn callback_1() {
println!("Hello!");
}
fn main() {
let mut modified = 0;
let mut prod = Producer::default();
prod.add_callback(callback_1);
prod.add_callback(
|| {
modified += 1;
println!("World!");
}
);
prod.run_callbacks();
drop(prod);
println!("{}", modified);
}
see the playground
Just a few things to note:
You have to drop the producer manually; otherwise Rust will complain that it is dropped at the end of the scope while it still contains (through the closure) an exclusive reference to modified, which is not OK since I try to read modified before that.
Currently, run_callbacks takes &mut self because we only require FnMut. If you wanted it to take only &self, you'd need to replace FnMut with Fn, which means the callbacks can no longer mutate their own captured state directly; they can still modify things outside of them, for example through interior mutability.
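As a sketch of that Fn variant, assuming the counter is switched to a Cell so that it can be updated through a shared reference (count_runs is a made-up helper name):

```rust
use std::cell::Cell;

#[derive(Default)]
struct Producer<'a> {
    list: Vec<Box<dyn Fn() + 'a>>,
}

impl<'a> Producer<'a> {
    fn add_callback(&mut self, cb: impl Fn() + 'a) {
        self.list.push(Box::new(cb));
    }

    // Only `&self` is needed, because calling an `Fn` needs no exclusive access.
    fn run_callbacks(&self) {
        for cb in &self.list {
            cb();
        }
    }
}

// Hypothetical helper: runs the callbacks `runs` times and returns the counter.
fn count_runs(runs: usize) -> i32 {
    // Declared before `prod`, so it outlives the closures that borrow it.
    let modified = Cell::new(0);
    let mut prod = Producer::default();
    // The closure only captures `&modified`; Cell allows mutation through it.
    prod.add_callback(|| modified.set(modified.get() + 1));
    for _ in 0..runs {
        prod.run_callbacks();
    }
    // No manual drop needed: reading through a shared reference is fine.
    modified.get()
}

fn main() {
    println!("{}", count_runs(2));
}
```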
Yes, all closures have different types, so if you want a Vec of different closures you will need to make them trait objects. This can be achieved with Box<dyn Trait> (or any smart pointer). Box<dyn FnMut()> implements FnMut(), so you can have Processor<Box<dyn FnMut()>>, make a Vec of them, and call the callbacks on them: playground
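A small sketch of what that could look like follows. The process method and the run_processors helper are assumptions added for the example, and Rc<Cell<_>> is used only to make the effect of the callbacks observable.

```rust
use std::cell::Cell;
use std::rc::Rc;

struct Processor<CB: FnMut()> {
    callback: CB,
}

impl<CB: FnMut()> Processor<CB> {
    // Hypothetical method that fires the stored callback.
    fn process(&mut self) {
        (self.callback)();
    }
}

// Hypothetical helper: builds two processors with different closures,
// runs each once, and returns the accumulated total.
fn run_processors() -> i32 {
    let total = Rc::new(Cell::new(0));
    let (a, b) = (Rc::clone(&total), Rc::clone(&total));

    // Every element has the same type, Processor<Box<dyn FnMut()>>,
    // even though the two closures have different concrete types.
    let mut processors: Vec<Processor<Box<dyn FnMut()>>> = Vec::new();
    processors.push(Processor { callback: Box::new(move || a.set(a.get() + 1)) });
    processors.push(Processor { callback: Box::new(move || b.set(b.get() + 10)) });

    for p in &mut processors {
        p.process();
    }
    total.get()
}

fn main() {
    println!("{}", run_processors()); // prints 11
}
```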
I am currently reading chapter 8 of The Rust Programming Language, which contains this code:
let v = vec![100, 32, 57];
for i in &v {
println!("{}", i);
}
I'm not sure why the & is necessary in for i in &v. I tried removing it and the code still works. Does for i in v do the same thing, or is it different somehow?
It does something slightly different (or it does the same thing on a different structure, if you prefer). The problem is at chapter 8 you're a bit early for those concepts, especially without knowing your experience with other programming languages.
I would strongly recommend that you keep on reading, and either you'll eventually find out (or understand on your own, I don't remember if it's made explicit) or you can come back to it later.
If it really bothers you, here's my attempt at an explanation:
As in various other programming languages, Rust's for loop works on arbitrary things which can be iterated (sometimes called "iterables"). In Rust the concept is represented by IntoIterator, aka "things which can be converted into an iterator". Now the "Into" prefix is important because it generally means a consuming conversion (after the Into trait). That is,
let a = thing();
let b = a.into_iter();
// println!("{:?}", a); // probably an error because the previous line "consumed" `a` unless it's trivial and copy
So far so good. In one case Rust calls Vec::into_iter, and in another case it calls <&Vec>::into_iter. With many methods this would make no difference, because only one such method exists. However, it does matter for IntoIterator, because there exist both
impl<T> IntoIterator for Vec<T>
and
impl<'a, T> IntoIterator for &'a Vec<T>
Why and what's the difference? Well, above I noted that IntoIterator consumes its subject, so the first consumes the Vec while the second just... consumes a reference to a Vec, which isn't even consuming because references can be copied.
As a result, in the original case you can keep using the Vec after having iterated it, but after your modification you can not, because the vector has been consumed and destroyed by the for loop. You can see that by adding
println!("{:?}", v);
after the loop: the second version will refuse to build.
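To make the two calls visible, here is roughly what the two loops do when the into_iter call is written out by hand. This is a simplified picture of the for loop, not the exact compiler expansion, and the function names are invented for the example.

```rust
// Roughly what `for i in &v` does: iterate over a borrowed vector.
fn sum_borrowed(v: &Vec<i32>) -> i32 {
    // Resolves to the impl for `&'a Vec<T>`; items are `&i32`.
    let mut iter = IntoIterator::into_iter(v);
    let mut sum = 0;
    while let Some(i) = iter.next() {
        sum += *i;
    }
    sum
}

// Roughly what `for i in v` does: consume the vector.
fn sum_owned(v: Vec<i32>) -> i32 {
    // Resolves to the impl for `Vec<T>`; items are `i32`, and `v` is moved.
    let mut iter = IntoIterator::into_iter(v);
    let mut sum = 0;
    while let Some(i) = iter.next() {
        sum += i;
    }
    sum
}

fn main() {
    let v = vec![100, 32, 57];
    println!("{}", sum_borrowed(&v)); // `v` is still usable afterwards
    println!("{}", sum_owned(v)); // `v` is gone after this call
}
```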
There's another consequence to this, which isn't too relevant here but is in other situations: if we look at the declaration of the two implementations more broadly we get this:
impl<T> IntoIterator for Vec<T> {
type Item = T;
[...]
}
impl<'a, T> IntoIterator for &'a Vec<T> {
type Item = &'a T;
[...]
}
The Item associated type is different: in the first case, because the vector is consumed the iterator gets to move (and provide) the actual items contained in the collection. In the second case however, since the vector is not consumed, the loop can not own the items: that'd be stealing and in Rust you can only steal from corpses. So it can only hand out references to the item.
This is mostly relevant for non-Copy types (not the case here as integers are Copy) as it influences the specific operations you can perform on the items (that is, in your loop).
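A short sketch with a non-Copy item type shows the difference in what the loop body is allowed to do (demo is a made-up function name):

```rust
// Hypothetical demo: returns the total length seen while borrowing, and the
// strings modified and moved out while consuming.
fn demo() -> (usize, Vec<String>) {
    let v: Vec<String> = vec!["a".to_string(), "b".to_string(), "c".to_string()];

    // Borrowing loop: `s` is a `&String`, so we can only read through it.
    let mut total_len = 0;
    for s in &v {
        total_len += s.len();
    }

    // Consuming loop: `s` is an owned `String`, so we can mutate it and
    // move it into another collection without cloning.
    let mut shouted = Vec::new();
    for mut s in v {
        s.push('!');
        shouted.push(s);
    }

    (total_len, shouted)
}

fn main() {
    println!("{:?}", demo());
}
```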
(Please refer to the other comprehensive answer for details.)
Here's a simpler explanation.
As you learned before in the section Ways Variables and Data Interact: Move, assigning a non-Copy variable results in a move rather than a copy. For example:
let a = String::from("hello");
let b = a; // move
println!("{} {}", a, b); // error: borrow of moved value
Similarly, giving a Vec to a for loop transfers ownership to the loop. For Vec, each of the elements is moved into the loop variable, and the Vec is no longer accessible after the loop:
fn main() {
// creates a Vec<String>
let v: Vec<_> = ["a", "b", "c"].iter().copied().map(String::from).collect();
// elements are moved into s, no clone required
for s in v {
println!("{}", s);
}
// error[E0382]: borrow of moved value: `v`
// println!("{:?}", v);
}
(playground)
Passing a &Vec to a for loop, on the other hand, gives the loop variable references to the elements, so the variable v continues to own the Vec:
fn main() {
// creates a Vec<String>
let v: Vec<_> = ["a", "b", "c"].iter().copied().map(String::from).collect();
// s is a reference to individual elements
for s in &v {
println!("{}", s);
}
// v can still be accessed
println!("{:?}", v);
}
(playground)
Note that even though integers are Copy, a Vec containing them is not.
I want to solve a leetcode question in Rust (Remove Nth Node From End of List). My solution uses two pointers to find the Node to remove:
#[derive(PartialEq, Eq, Debug)]
pub struct ListNode {
pub val: i32,
pub next: Option<Box<ListNode>>,
}
impl ListNode {
#[inline]
fn new(val: i32) -> Self {
ListNode { next: None, val }
}
}
// two-pointer sliding window
impl Solution {
pub fn remove_nth_from_end(head: Option<Box<ListNode>>, n: i32) -> Option<Box<ListNode>> {
let mut dummy_head = Some(Box::new(ListNode { val: 0, next: head }));
let mut start = dummy_head.as_ref();
let mut end = dummy_head.as_ref();
for _ in 0..n {
end = end.unwrap().next.as_ref();
}
while end.as_ref().unwrap().next.is_some() {
end = end.unwrap().next.as_ref();
start = start.unwrap().next.as_ref();
}
// TODO: fix the borrow problem
// ERROR!
// start.unwrap().next = start.unwrap().next.unwrap().next.take();
dummy_head.unwrap().next
}
}
I borrow two immutable references of the linked-list. After I find the target node to remove, I want to drop one and make the other mutable. Each of the following code examples leads to a compiler error:
// ERROR
drop(end);
let next = start.as_mut().unwrap.next.take();
// ERROR
let mut node = *start.unwrap()
I don't know whether this solution can be written in Rust. If I can make an immutable reference mutable, how do I do it? If not, is there any way to implement the same logic while keeping the borrow checker happy?
The correct answer is that you should not be doing this. This is undefined behavior, and breaks many assumptions made by the compiler when compiling your program.
However, it is possible to do this. Other people have also mentioned why this is not a good idea, but they haven't actually shown what the code to do something like this would look like. Even though you should not do this, this is what it would look like:
unsafe fn very_bad_function<T>(reference: &T) -> &mut T {
let const_ptr = reference as *const T;
let mut_ptr = const_ptr as *mut T;
&mut *mut_ptr
}
Essentially, you convert a constant pointer into a mutable one, and then make the mutable pointer into a reference.
Here's one example why this is very unsafe and unpredictable:
fn main() {
static THIS_IS_IMMUTABLE: i32 = 0;
unsafe {
let mut bad_reference = very_bad_function(&THIS_IS_IMMUTABLE);
*bad_reference = 5;
}
}
If you run this... you get a segfault. What happened? You violated the memory rules by trying to write to an area of memory that had been marked as immutable. When you use a function like this, you break the trust the compiler has placed in you not to mess with constant memory.
That is why you should never use this, especially in a public API: if someone passes an innocent immutable reference to your function, your function mutates it, and the reference points to an area of memory not meant to be written to, you'll get a segfault.
In short: don't try to cheat the borrow checker. It's there for a reason.
EDIT: In addition to the reasons I just mentioned on why this is undefined behavior, another reason is breaking reference aliasing rules. That is, since you can have both a mutable and immutable reference to a variable at the same time with this, it causes loads of problems when you pass them in separately to the same function, which assumes the immutable and mutable references are unique. Read this page from the Rust docs for more information about this.
Is there a way to make an immutable reference mutable?
No.
You could write unsafe Rust code to force the types to line up, but the code would actually be unsafe and lead to undefined behavior. You do not want this.
For your specific problem, see:
How to remove the Nth node from the end of a linked list?
How to use two pointers to iterate a linked list in Rust?
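Without reproducing those answers, one safe way to get the same result is to avoid holding two references at once: count the nodes with a shared borrow first, then walk a single mutable cursor to the node before the target. This is a sketch of that two-pass idea, not the sliding-window technique from the question. The helpers from_vec and to_vec are invented for the example, and n is taken as usize rather than i32 for simplicity.

```rust
#[derive(PartialEq, Eq, Debug)]
pub struct ListNode {
    pub val: i32,
    pub next: Option<Box<ListNode>>,
}

pub fn remove_nth_from_end(head: Option<Box<ListNode>>, n: usize) -> Option<Box<ListNode>> {
    // Pass 1: count the nodes using shared borrows only.
    let mut len = 0;
    let mut cur = head.as_ref();
    while let Some(node) = cur {
        len += 1;
        cur = node.next.as_ref();
    }
    if n == 0 || n > len {
        return head; // nothing to remove
    }

    // Pass 2: a single mutable cursor, starting at a dummy head.
    let mut dummy = Box::new(ListNode { val: 0, next: head });
    let mut prev = &mut dummy;
    for _ in 0..(len - n) {
        prev = prev.next.as_mut().unwrap();
    }
    // Unlink the target and splice its tail back in.
    let removed = prev.next.take();
    prev.next = removed.and_then(|node| node.next);
    dummy.next
}

// Hypothetical helper: builds a list from a slice.
fn from_vec(vals: &[i32]) -> Option<Box<ListNode>> {
    let mut head = None;
    for &v in vals.iter().rev() {
        head = Some(Box::new(ListNode { val: v, next: head }));
    }
    head
}

// Hypothetical helper: collects the values of a list.
fn to_vec(mut list: &Option<Box<ListNode>>) -> Vec<i32> {
    let mut out = Vec::new();
    while let Some(node) = list {
        out.push(node.val);
        list = &node.next;
    }
    out
}

fn main() {
    let list = remove_nth_from_end(from_vec(&[1, 2, 3, 4, 5]), 2);
    println!("{:?}", to_vec(&list)); // prints [1, 2, 3, 5]
}
```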
I have the following enum:
pub enum Game {
Match(GameWorker),
#[cfg(feature = "cups")]
Cup(CupWorker),
}
So, this enum consists of one item if the cups feature is disabled. The code below with match compiles fine, but where I use if let to match on this enum there is an error:
Working match:
fn clear(&mut self, silent: bool) {
match *self {
Game::Match(ref mut gm) => gm.clear(silent),
#[cfg(feature = "cups")]
Game::Cup(ref mut c) => c.clear(silent),
}
}
if let which leads to a compile error:
let m: &mut Game = Game::Match(...);
if let Game::Match(ref mut gamematch) = *m {
// ...
}
Error:
error[E0162]: irrefutable if-let pattern
--> src/game.rs:436:32
|
436 | if let Game::Match(ref mut gamematch) = *m {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ irrefutable pattern
Minimal example
Is there a way to allow such if lets? I like this construction, but for some reason it is not allowed, and I don't understand why. As shown above, the match construction works fine in the same case. In my personal opinion, this should be a silenceable warning instead of an error.
if let expects a refutable pattern, similar to how if expects a bool. You can't write if () { something }, even though () is "valid" in some sense. If you had if () {} else { something_else } it would be statically known that the else cannot occur.
Arguably if true { something } is also statically known, but there's a difference: The condition is a bool, which has two values, so even if you statically know the value, the type still offers multiple variants.
With if let it's the same, but you can use user defined types instead of just bool. If your enum has multiple variants, you can't statically decide that the if let is always taken. If the enum has a single variant, you know for a fact that the if condition is always true, so even if you had an else branch, it would not make any sense at all to exist.
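When the enum really does have a single variant, the pattern is irrefutable and a plain let can destructure it. Here is a minimal sketch with stand-in types (the rounds field and the bump function are invented for the example). Note that this let stops compiling as soon as a second variant, such as the feature-gated Cup, is enabled, which is why a match is the form that works in both configurations.

```rust
// Stand-in for the real GameWorker; the `rounds` field is an assumption.
struct GameWorker {
    rounds: u32,
}

// Single-variant enum, as when the `cups` feature is disabled.
enum Game {
    Match(GameWorker),
}

// Hypothetical function: bumps the round counter and returns the new value.
fn bump(game: &mut Game) -> u32 {
    // Irrefutable pattern: there is no other variant, so plain `let` works.
    // Matching through `&mut Game` binds `worker` as `&mut GameWorker`.
    let Game::Match(worker) = game;
    worker.rounds += 1;
    worker.rounds
}

fn main() {
    let mut game = Game::Match(GameWorker { rounds: 1 });
    println!("{}", bump(&mut game)); // prints 2
}
```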