Does adding a semicolon at the end of `return` make a difference?

The Rust Guide states that:
The semicolon turns any expression into a statement by throwing away its value and returning unit instead.
I thought I got this concept down until I ran an experiment:
fn print_number(x: i32, y: i32) -> i32 {
if x + y > 20 { return x }
x + y
}
Which compiles fine. Then, I added a semicolon at the end of the return line (return x;). From what I understand, this turns the line into a statement, returning the unit data type ().
Nonetheless, the end result is the same.

Normally, every branch in the if expression should have the same type. If the type for some branch is underspecified, the compiler tries to find the single common type:
fn print_number(x: int, y: int) {
let v = if x + y > 20 {
3 // this can be either 3u, 3i, 3u8 etc.
} else {
x + y // this is always int
};
println!("{}", v);
}
In this code, 3 is underspecified but the else branch forces it to have the type of int.
This sounds simple: There is a function that "unifies" two or more types into the common type, or it will give you an error when that's not possible. But what if there were a fail! in the branch?
fn print_number(x: int, y: int) {
let v = if x + y > 20 {
fail!("x + y too large") // ???
} else {
x + y // this is always int
};
println!("{}", v); // uh wait, what's the type of `v`?
}
I'd want fail! not to affect the other branches; it is an exceptional case, after all. Since this pattern is quite common in Rust, the concept of a diverging type has been introduced. There is no value whose type is diverging. (It is also called an "uninhabited type" or "void type" depending on the context. Not to be confused with the "unit type", which has the single value ().) Since the diverging type is naturally a subset of every other type, the compiler concludes that v's type is just that of the else branch, int.
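For readers on a current compiler: fail! has since been renamed panic!, int no longer exists, and the diverging type is written `!`. A minimal sketch of the same idea in today's syntax might look like the following (checked_sum is a name made up for this illustration):

```rust
// Hypothetical helper, not from the original answer.
fn checked_sum(x: i32, y: i32) -> i32 {
    let v = if x + y > 20 {
        // `panic!` has the diverging type `!`, which coerces to any type,
        // so this branch does not constrain the type of `v`.
        panic!("x + y too large")
    } else {
        x + y // this branch alone decides that `v` is an i32
    };
    v
}
```

Calling checked_sum(1, 2) should give 3, while checked_sum(20, 5) panics instead of producing a value.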
A return expression is no different from fail! for the purpose of type checking. It abruptly escapes from the current flow of execution just like fail! (but does not terminate the task, thankfully). Still, the diverging type does not propagate to the next statement:
fn print_number(x: int, y: int) {
let v = if x + y > 20 {
return; // this is diverging
() // this is implied, even when you omit it
} else {
x + y // this is always int
};
println!("{}", v); // again, what's the type of `v`?
}
Note that the sole semicoloned statement x; is equivalent to the expression x; (). Normally a; b has the same type as b, so it would be quite strange that x; () has a type of () only when x is not diverging, and it diverges when x does diverge. That's why your original code didn't work.
It is tempting to add a special case like that:
Why don't you make x; () diverging when x diverges?
Why don't you assume uint for every underspecified integer literal when its type cannot be inferred? (Note: this was the case in the past.)
Why don't you automatically find the common supertrait when unifying multiple trait objects?
The truth is that designing a type system is not very hard, but verifying it is much harder, and we want to ensure that Rust's type system is future-proof and long-lasting. Some of these may happen if they turn out to be really useful and are proved "correct" for our purposes, but not immediately.

I'm not 100% sure of what I'm saying but it kinda makes sense.
There's another concept coming into play: reachability analysis. The compiler knows that whatever follows a return expression statement is unreachable. For example, if we compile this function:
fn test() -> i32 {
return 1;
2
}
We get the following warning:
warning: unreachable expression
--> src/main.rs:3:5
|
3 | 2
| ^
|
The compiler can ignore the "true" branch of the if expression if it ends with a return expression and only consider the "false" branch when determining the type of the if expression.
You can also see this behavior with diverging functions. Diverging functions are functions that don't return normally (e.g. they always fail). Try replacing the return expression with the fail! macro (which expands to a call to a diverging function). In fact, return expressions are also considered to be diverging; this is the basis of the aforementioned reachability analysis.
However, if there's an actual () expression after the return statement, you'll get an error. This function:
fn print_number(x: i32, y: i32) -> i32 {
if x + y > 20 {
return x;
()
} else {
x + y
}
}
gives the following error:
error[E0308]: mismatched types
--> src/main.rs:4:9
|
4 | ()
| ^^ expected i32, found ()
|
= note: expected type `i32`
found type `()`
In the end, it seems diverging expressions (which include return expressions) are handled differently by the compiler when they are followed by a semicolon: the statement is still diverging.
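To make that conclusion concrete, here is a small sketch (the function name clamp_sum is made up for this example) in which the true branch ends with return x; including the semicolon, and the if is still used as an i32 expression. As far as I can tell, current compilers accept this, which is consistent with the statement still being treated as diverging:

```rust
// Hypothetical example, not taken from the question.
fn clamp_sum(x: i32, y: i32) -> i32 {
    let v = if x + y > 20 {
        return x; // semicolon present, yet the block still diverges
    } else {
        x + y // the only branch that contributes a type
    };
    v
}
```

Here clamp_sum(1, 2) goes through the else branch and yields 3, while clamp_sum(15, 10) returns early with 15.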


How is a reference counter implemented at compile time?

Here is a made up set of function calls (I tried to make it complicated but perhaps it is easy).
function main(arg1, arg2) {
do_foo(arg1, arg2)
}
function do_foo(a, b) {
let x = a + b
let y = x * a
let z = x * b
let p = y + z
let q = x + z
let r = do_bar(&p)
let s = do_bar(&q)
}
function do_bar(&p, &q) {
*p += 1
*q += 3
let r = &p * &q
let s = &p + &q
let v = do_baz(&r, &s)
return &v
}
function do_baz(&a, &b) {
return *a + *b
}
How do you generally go about figuring out the liveness of variables and where you can insert instructions for reference counting?
Here is my attempt...
Start at the top function main. It starts with 2 arguments. Assume there is no copying that occurs. It passes the actual mutable values to do_foo.
Then we have x. x owns a and b. Then we see y. y is computed from x, so link the previous x to this x. By r, we don't see x anymore, so perhaps it can be freed... Looking at do_bar by itself, we know basically that p and q can't be garbage collected within this scope.
Basically, I have no idea how to start implementing an algorithm to implement ARC (ideally compile time reference counting, but runtime would be okay for now too to get started).
function main(arg1, arg2) {
let x = do_foo(arg1, arg2)
free(arg1)
free(arg2)
free(x)
}
function do_foo(a, b) {
let x = a + b
let y = x * a
let z = x * b
let p = y + z
free(y)
let q = x + z
free(x)
free(z)
let r = do_bar(&p)
let s = do_bar(&q)
return r + s
}
function do_bar(&p, &q) {
*p += 1
*q += 3
let r = &p * &q
let s = &p + &q
let v = do_baz(&r, &s)
free(r)
free(s)
return &v
}
function do_baz(&a, &b) {
return *a + *b
}
How do I start implementing such an algorithm? I have searched for papers on the topic but found no algorithms.
The following rules should do the job for your language.
When a variable is declared, increment its refcount
When a variable goes out of scope, decrement its refcount
When a reference-to-variable is assigned to a variable, adjust the reference counts for the variable(s):
increment the refcount for the variable whose reference is being assigned
decrement the refcount for the variable whose reference was previously in the variable being assigned to (if it was not null)
When a variable containing a non-null reference-to-variable goes out of scope, decrement the refcount for the variable it referred to.
Note:
If your language allows reference-to-variable types to be used in data structures, "static" variables, etcetera, the rules above need to be extended ... in the obvious fashion.
An optimizing compiler may be able to eliminate some refcount increments and decrements.
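To see these rules play out at runtime, here is a rough sketch that uses Rust's Rc as a stand-in for the counted variables. This is only an analogy to the made-up language in the question, and refcount_trace is a name invented for the illustration; Rc::clone and Rc::strong_count are the real standard library calls.

```rust
use std::rc::Rc;

// Hypothetical helper: records (count of a, count of b) after each step.
fn refcount_trace() -> Vec<(usize, usize)> {
    let mut trace = Vec::new();

    // Declaring a value starts its count at 1.
    let a = Rc::new(String::from("a"));
    let b = Rc::new(String::from("b"));
    trace.push((Rc::strong_count(&a), Rc::strong_count(&b)));

    // Storing a reference in a variable increments the target's count.
    // `r` points at `a` here, so counting through `r` counts `a`.
    let mut r = Rc::clone(&a);
    trace.push((Rc::strong_count(&r), Rc::strong_count(&b)));

    // Reassignment: increment the new target, decrement the old one.
    r = Rc::clone(&b);
    trace.push((Rc::strong_count(&a), Rc::strong_count(&b)));

    {
        // A reference held in an inner scope...
        let _inner = Rc::clone(&r);
        trace.push((Rc::strong_count(&a), Rc::strong_count(&b)));
    } // ...is released when that scope ends.
    trace.push((Rc::strong_count(&a), Rc::strong_count(&b)));

    // Dropping the last extra reference brings both counts back to 1.
    drop(r);
    trace.push((Rc::strong_count(&a), Rc::strong_count(&b)));

    trace
}
```

The trace should read (1, 1), (2, 1), (1, 2), (1, 3), (1, 2), (1, 1), matching the increment and decrement rules listed above.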
Compile time reference counting:
There isn't really any such thing. Reference counting is done at runtime. It doesn't make sense to do it at compile time.
You are probably talking about analyzing the code to determine if runtime reference counting can be optimized or entirely eliminated.
I alluded to the former above. It is really a kind of peephole optimization.
The latter entails checking whether a reference-to-variable can ever escape; i.e. whether it could be used after the variable goes out of scope. (Try Googling for "escape analysis". This is kind of analogous to the "escape analysis" that a compiler could do to decide if an object could be allocated on the stack rather than in the heap.)

Tuple assignment to mutable struct parameters [duplicate]

I’m getting an error I’m not sure how to handle, related to a tuple assignment I’m trying. The incr function below gets a left-hand of expression not valid error. What am I misunderstanding?
struct Fib {
i: u64,
fa: u64,
fb: u64,
}
impl Fib {
fn incr(&mut self) {
self.i += 1;
(self.fa, self.fb) = (self.fa + self.fb, self.fa);
}
}
As the helpful error explanation says†, you try to assign to a non-place expression. A place expression represents a memory location, and thus it can be a variable, a dereference, an indexing expression or a field reference, but a tuple is not one of these.
If you used a binding, such as:
let (x, y) = (1, 2);
that would be a whole different story because let statements have different rules than assignment: the left hand side of a let statement is a pattern, not an expression, and (x, y) is a legal pattern.
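A note for readers on newer toolchains: destructuring assignment was stabilized later (in Rust 1.59, if I remember the version correctly), so on a recent compiler the tuple assignment from the question should be accepted as written. A minimal sketch, reusing the struct from the question:

```rust
struct Fib {
    i: u64,
    fa: u64,
    fb: u64,
}

impl Fib {
    fn incr(&mut self) {
        self.i += 1;
        // The right-hand tuple is fully evaluated first, then its parts
        // are assigned to the two fields. Assumes a compiler with
        // destructuring assignment support.
        (self.fa, self.fb) = (self.fa + self.fb, self.fa);
    }
}
```

On older compilers this still fails with the error from the question, so the temporary-variable approach remains the portable choice.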
To solve your problem, you may want to do the following and introduce a temporary variable, and then update the values of the members:
(The following is also fixing your fibonacci sequence, i.e. correcting the values of the members since they are naturally ordered as 'a' and 'b')
impl Fib {
fn incr(&mut self) {
self.i += 1;
let fa = self.fa;
self.fa = self.fb;
self.fb += fa;
}
}
Note: Although it was not your question, I would strongly advise implementing Iterator for your Fib type, in which case you wouldn't have to keep track of the index (i) because that would be available through the enumerate method.
E.g.
impl Iterator for Fib {
type Item = u64;
fn next(&mut self) -> Option<Self::Item> {
let fa = self.fa;
self.fa = self.fb;
self.fb += fa;
Some(fa)
}
}
And then you could use it as:
for (i, x) in my_fib.enumerate() { ... }
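Putting the pieces together, a self-contained sketch could look like this. The starting values 0 and 1 and the helper name first_fibs are assumptions made for the example; the index field is dropped because enumerate supplies it.

```rust
struct Fib {
    fa: u64,
    fb: u64,
}

impl Iterator for Fib {
    type Item = u64;

    fn next(&mut self) -> Option<Self::Item> {
        let fa = self.fa;
        self.fa = self.fb;
        self.fb += fa;
        Some(fa) // never returns None, so callers must bound the iteration
    }
}

// Hypothetical helper: collect the first `n` (index, value) pairs.
fn first_fibs(n: usize) -> Vec<(usize, u64)> {
    Fib { fa: 0, fb: 1 }.enumerate().take(n).collect()
}
```

With these starting values, first_fibs(6) should give (0, 0), (1, 1), (2, 1), (3, 2), (4, 3), (5, 5). Note that the iterator is infinite, so it needs something like take to stop it, and the u64 addition will eventually overflow.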
† rustc --explain E0070

How to add or subtract two enum values in swift

So I have this enum that defines different view positions on a view controller when a sidebar menu is presented. I need to add, subtract, multiply, or divide the different values depending on the situation. How exactly do I write a method that lets me use the -, +, *, or / operators on the values in the enum? I can find plenty of examples that use the comparison operator ==, but I haven't been able to find any that use >=, which I also need.
Here is the enum
enum FrontViewPosition: Int {
case None
case LeftSideMostRemoved
case LeftSideMost
case LeftSide
case Left
case Right
case RightMost
case RightMostRemoved
}
Now I'm trying to use these operators in functions like so.
func getAdjustedFrontViewPosition(_ frontViewPosition: FrontViewPosition, forSymetry symetry: Int) {
var frontViewPosition = frontViewPosition
if symetry < 0 {
frontViewPosition = .Left + symetry * (frontViewPosition - .Left)
}
}
Also in another function like so.
func rightRevealToggle(animated: Bool) {
var toggledFrontViewPosition: FrontViewPosition = .Left
if self.frontViewPosition >= .Left {
toggledFrontViewPosition = .LeftSide
}
self.setFrontViewPosition(toggledFrontViewPosition, animated: animated)
}
I know that I need to define the functions myself to be able to use these operators. I just don't understand how to go about doing it. A little help would be greatly appreciated.
The type you are trying to define has a similar algebra to pointers in that you can add an offset to a pointer to get a pointer and subtract two pointers to get a difference. Define these two operators on your enum and your other functions will work.
Any operators over your type should produce results in your type. There are different ways to achieve this, depending on your requirements. Here we shall treat your type as a wrap-around ("modulo") one - add 1 to the last literal and you get the first. To do this we use raw values from 0 to n for your types literals and use modulo arithmetic.
First we need a modulo operator which always returns a non-negative result; Swift's % can return a negative one, which is not what modulo arithmetic requires.
infix operator %% : MultiplicationPrecedence
func %%(_ a: Int, _ n: Int) -> Int
{
precondition(n > 0, "modulus must be positive")
let r = a % n
return r >= 0 ? r : r + n
}
Now your enum assigning suitable raw values:
enum FrontViewPosition: Int
{
case None = 0
case LeftSideMostRemoved = 1
case LeftSideMost = 2
case LeftSide = 3
case Left = 4
case Right = 5
case RightMost = 6
case RightMostRemoved = 7
Now we define the appropriate operators.
For addition we can add an integer to a FrontViewPosition and get a FrontViewPosition back. To do this we convert to raw values, add, and then reduce modulo 8 to wrap-around. Note the need for a ! to return a non-optional FrontViewPosition - this will always succeed due to the modulo math:
static func +(_ x : FrontViewPosition, _ y : Int) -> FrontViewPosition
{
return FrontViewPosition(rawValue: (x.rawValue + y) %% 8)!
}
For subtraction we return the integer difference between two FrontViewPosition values:
static func -(_ x : FrontViewPosition, _ y : FrontViewPosition) -> Int
{
return x.rawValue - y.rawValue
}
}
You can define further operators as needed, say a subtraction operator which takes a FrontViewPosition and an Int and returns a FrontViewPosition.
HTH
Enums can have functions too:
enum Tst:Int {
case A = 10
case B = 20
case C = 30
static func + (t1:Tst,t2:Tst) -> Tst {
return Tst.init(rawValue: t1.rawValue+t2.rawValue)! // the force unwrap crashes if the sum is not a valid raw value
}
}
var a = Tst.A
var b = Tst.B
var c = a+b

Why does a range that starts at a negative number not iterate?

I have just started to learn Rust. During my first steps with this language, I found some strange behaviour when an iteration is performed inside main or in another function, as in the following example:
fn myfunc(x: &Vec<f64>) {
let n = x.len();
println!(" n: {:?}", n);
for i in -1 .. n {
println!(" i: {}", i);
}
}
fn main() {
for j in -1 .. 6 {
println!("j: {}", j);
}
let field = vec![1.; 6];
myfunc(&field);
}
While the loop in main is correctly displayed, nothing is printed for the loop inside myfunc and I get the following output:
j: -1
j: 0
j: 1
j: 2
j: 3
j: 4
j: 5
n: 6
What is the cause of this behaviour?
Type inference is causing both of the numbers in your range to be usize, which cannot represent negative numbers. Thus, the range is from usize::MAX to n, which never has any members.
To find this out, I used a trick to print out the types of things:
let () = -1 .. x.len();
Which has this error:
error: mismatched types:
expected `core::ops::Range<usize>`,
found `()`
(expected struct `core::ops::Range`,
found ()) [E0308]
let () = -1 .. x.len();
^~
Diving into the details, slice::len returns a usize. Your -1 is an untyped integral value, which will conform to fit whatever context it needs (if there's nothing for it to conform to, it will fall back to an i32).
In this case, it's as if you actually typed (-1 as usize)..x.len().
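Here is a small sketch of that effect, with the cast written out explicitly so it compiles on current toolchains. The function names are invented for the illustration. It also shows one way to get a range that really starts at -1, by converting the length to a signed type instead:

```rust
// What -1 turns into once it is forced to be a usize (wrapping cast).
fn wrapped_start() -> usize {
    -1i64 as usize
}

// Number of iterations of the "broken" range: it starts at usize::MAX,
// which is already past `n`, so the range is empty.
fn unsigned_iterations(n: usize) -> usize {
    ((-1i64 as usize)..n).count()
}

// Making the range signed instead. `as` binds tighter than `..`,
// so this is -1..(n as i64). Assumes `n` fits in an i64.
fn signed_indices(n: usize) -> Vec<i64> {
    (-1..n as i64).collect()
}
```

For a length of 6, unsigned_iterations returns 0, while signed_indices yields -1 through 5. The signed values would still need converting back before they can be used to index a slice.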
The good news is that you probably don't want to start at -1 anyway. Slices are zero-indexed:
fn myfunc(x: &[f64]) {
let n = x.len();
println!(" n: {:?}", n);
for i in 0..n {
println!(" i: {}", i);
}
}
Extra good news is that this annoyance was fixed in the newest versions of Rust. It will cause a warning and then eventually an error:
warning: unary negation of unsigned integers will be feature gated in the future
for i in -1 .. n {
^~
Also note that you should never accept a &Vec<T> as a parameter. Always use a &[T] as it's more flexible and you lose nothing.
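A quick sketch of why the slice parameter is more flexible (the function name total is just for illustration): the same function accepts a borrowed Vec, a borrowed array, and a sub-slice, whereas a &Vec<f64> parameter would only accept the first.

```rust
// Hypothetical helper taking a slice instead of &Vec<f64>.
fn total(xs: &[f64]) -> f64 {
    xs.iter().sum()
}

// Exercises the three kinds of argument described above.
fn demo() -> (f64, f64, f64) {
    let field = vec![1.0; 6];
    let array = [1.0, 2.0];
    (
        total(&field),       // &Vec<f64> coerces to &[f64]
        total(&array),       // &[f64; 2] coerces to &[f64]
        total(&field[1..3]), // a sub-slice works too
    )
}
```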

Why are these ASCII methods inconsistent?

When I look at the Rust ASCII operations it feels like there is a consistency issue between
is_lowercase/is_uppercase:
pub fn is_uppercase(&self) -> bool {
(self.chr - b'A') < 26
}
is_alphabetic:
pub fn is_alphabetic(&self) -> bool {
(self.chr >= 0x41 && self.chr <= 0x5A) || (self.chr >= 0x61 && self.chr <= 0x7A)
}
Is there a good reason? Are the two methods totally equivalent or am I missing something?
All these functions are marked as stable so I'm confused.
EDIT:
To make it clearer, what I would expect is to decide on the best (in terms of performance/readability/common practice) implementation for lower/upper then have
pub fn is_alphabetic(&self) -> bool {
self.is_lowercase() || self.is_uppercase()
}
Since the question changed to be about performance, I'll add a second answer.
To start, I created a clone of the Ascii module (playpen):
pub struct Alpha(u8);
impl Alpha {
#[inline(never)]
pub fn is_uppercase_sub(&self) -> bool {
(self.0 - b'A') < 26
}
#[inline(never)]
pub fn is_uppercase_range(&self) -> bool {
self.0 >= 0x41 && self.0 <= 0x5A
}
}
fn main() {
let yes = Alpha(b'A');
let no = Alpha(b'a');
println!("{}, {}", yes.is_uppercase_sub(), yes.is_uppercase_range());
}
In the playpen, make sure that the optimization is set to -O2 and then click IR. This shows the LLVM Intermediate Representation. It's like a higher-level assembly, if you'd like.
There's lots of output, but look for the sections with fastcc. I've removed various bits to make this code clearer, but you can see that the exact same function is called, even though our code calls two different implementations, one with a subtraction and one with a range:
%3 = call fastcc zeroext i1 @_ZN5Alpha16is_uppercase_sub20h63aa0b11479803f4laaE
%5 = call fastcc zeroext i1 @_ZN5Alpha16is_uppercase_sub20h63aa0b11479803f4laaE
The LLVM optimizer can tell that these implementations are the same, so really it's up to the developer's preference. You might be able to get a commit into Rust to make them consistent, if you'd like! ^_^
Asking about is_alphabetic is harder; inlining will come into play here. If LLVM inlines is_upper and is_lower into is_alphabetic, then your suggested change would be better. If it doesn't, then potentially what was 1 function call is now 3! That could be really bad.
These types of questions are a lot harder to answer at this level; one would have to do some looking (edit and profiling!) at real Rust code in the large to understand the optimizer with regards to inlining.
They're equivalent. is_alphabetic could be written with byte literals instead of hex codes, making it more readable and matching the other functions:
pub fn is_alphabetic(&self) -> bool {
(self.chr >= b'A' && self.chr <= b'Z') ||
(self.chr >= b'a' && self.chr <= b'z')
}
The values in is_alphabetic certainly correspond to the appropriate ASCII values for the letters. You can validate this with:
println!("0x{:x} 0x{:x}", b'A', b'Z');
println!("0x{:x} 0x{:x}", b'a', b'z');
is_alphabetic relies on the fact that the ASCII lower and uppercase letters are sequential (not with each other, unfortunately). It could have been written:
pub fn is_alphabetic(&self) -> bool {
(self.chr >= b'A' && self.chr <= b'Z') || (self.chr >= b'a' && self.chr <= b'z')
}
// Or
pub fn is_alphabetic(&self) -> bool {
self.is_upper() || self.is_lower()
}
is_lower and is_upper both rely on unsigned subtraction wrapping around on underflow to be correct. If a is 0x61 and z is 0x7A, and we subtract a from both, we get 0 and 25. However, if the byte is one less than a, the subtraction wraps around to 0xFF. 0xFF is not < 26, so it fails that check.
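One caveat for anyone trying this today: in current Rust a plain `-` on u8 panics on underflow in debug builds, so the wrap-around has to be requested explicitly with wrapping_sub. A minimal sketch comparing the two styles over every possible byte (the free functions here are illustrative, not the actual library code):

```rust
// Subtraction trick: bytes below b'A' wrap around to large values,
// so a single comparison covers both ends of the range.
fn is_upper_sub(c: u8) -> bool {
    c.wrapping_sub(b'A') < 26
}

// Explicit range check, equivalent to c >= b'A' && c <= b'Z'.
fn is_upper_range(c: u8) -> bool {
    (b'A'..=b'Z').contains(&c)
}

// Exhaustively checks that both versions agree on all 256 byte values.
fn implementations_agree() -> bool {
    (0..=u8::MAX).all(|c| is_upper_sub(c) == is_upper_range(c))
}
```

Since there are only 256 inputs, the exhaustive check is cheap, and it is a simple way to convince yourself that the two forms are equivalent.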
