Is the big integer implementation in the num crate slow? - performance

I implemented the Miller-Rabin strong pseudoprime test in Rust using BigUint to support arbitrarily large primes. Running it over the numbers between 5 and 10^6 took about 40 s with cargo run --release.
I implemented the same algorithm with Java's BigInteger, and the same test took 10 s to finish. Rust appears to be 4 times slower. I assume this is caused by the implementation of num::bigint.
Is this just the current state of num::bigint, or can anyone spot an obvious improvement in my code? (Mainly in how I used the language. Regardless of whether my implementation of the algorithm is good or bad, it is implemented almost exactly the same way in both languages, so it should not cause the difference in performance.)
I did notice that a lot of clone() calls are required because of Rust's ownership model, and that could well affect the speed to some degree. But I guess there is no way around that, am I right?
Here is the code:
extern crate rand;
extern crate num;
extern crate core;
extern crate time;
use std::time::{Duration};
use time::{now, Tm};
use rand::Rng;
use num::{Zero, One};
use num::bigint::{RandBigInt, BigUint, ToBigUint};
use num::traits::{ToPrimitive};
use num::integer::Integer;
use core::ops::{Add, Sub, Mul, Div, Rem, Shr};
fn find_r_and_d(i: BigUint) -> (u64, BigUint) {
let mut d = i;
let mut r = 0;
loop {
if d.clone().rem(&2u64.to_biguint().unwrap()) == Zero::zero() {
d = d.shr(1usize);
r = r + 1;
} else {
break;
}
}
return (r, d);
}
fn might_be_prime(n: &BigUint) -> bool {
let nsub1 = n.sub(1u64.to_biguint().unwrap());
let two = 2u64.to_biguint().unwrap();
let (r, d) = find_r_and_d(nsub1.clone());
'WitnessLoop: for kk in 0..6u64 {
let a = rand::thread_rng().gen_biguint_range(&two, &nsub1);
let mut x = mod_exp(&a, &d, &n);
if x == 1u64.to_biguint().unwrap() || x == nsub1 {
continue;
}
for rr in 1..r {
x = x.clone().mul(x.clone()).rem(n);
if x == 1u64.to_biguint().unwrap() {
return false;
} else if x == nsub1 {
continue 'WitnessLoop;
}
}
return false;
}
return true;
}
fn mod_exp(base: &BigUint, exponent: &BigUint, modulus: &BigUint) -> BigUint {
let one = 1u64.to_biguint().unwrap();
let mut result = one.clone();
let mut base_clone = base.clone();
let mut exponent_clone = exponent.clone();
while exponent_clone > 0u64.to_biguint().unwrap() {
if exponent_clone.clone() & one.clone() == one {
result = result.mul(&base_clone).rem(modulus);
}
base_clone = base_clone.clone().mul(base_clone).rem(modulus);
exponent_clone = exponent_clone.shr(1usize);
}
return result;
}
fn main() {
let now1 = now();
for n in 5u64..1_000_000u64 {
let b = n.to_biguint().unwrap();
if might_be_prime(&b) {
println!("{}", n);
}
}
let now2 = now();
println!("{}", now2.to_timespec().sec - now1.to_timespec().sec);
}

You can remove most of the clones pretty easily. BigUint implements the ops traits for &BigUint operands as well, not only for owned values. With that change it becomes faster, but it is still about half as fast as Java...
Also (not related to performance, just readability), you don't need to call add, sub, mul and shr explicitly; they overload the regular +, -, * and >> operators.
For instance, you could rewrite might_be_prime and mod_exp like this, which already gives a good speedup on my machine (from 40 to 24 s on average):
fn might_be_prime(n: &BigUint) -> bool {
let one = BigUint::one();
let nsub1 = n - &one;
let two = BigUint::new(vec![2]);
let mut rng = rand::thread_rng();
let (r, mut d) = find_r_and_d(nsub1.clone());
let mut x;
let mut a: BigUint;
'WitnessLoop: for kk in 0..6u64 {
a = rng.gen_biguint_range(&two, &nsub1);
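// mod_exp shifts its exponent down to zero, so pass a copy to keep d intact for the next witness
x = mod_exp(&mut a, &mut d.clone(), &n);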
if &x == &one || x == nsub1 {
continue;
}
for rr in 1..r {
x = (&x * &x) % n;
if &x == &one {
return false;
} else if x == nsub1 {
continue 'WitnessLoop;
}
}
return false;
}
true
}
fn mod_exp(base: &mut BigUint, exponent: &mut BigUint, modulus: &BigUint) -> BigUint {
let one = BigUint::one();
let zero = BigUint::zero();
let mut result = BigUint::one();
while &*exponent > &zero {
if &*exponent & &one == one {
result = (result * &*base) % modulus;
}
*base = (&*base * &*base) % modulus;
*exponent = &*exponent >> 1usize;
}
result
}
Note that I've moved the println! out of the timing, so that we're not benchmarking IO.
fn main() {
let now1 = now();
let v = (5u64..1_000_000u64)
.filter_map(|n| n.to_biguint())
.filter(|n| might_be_prime(&n))
.collect::<Vec<BigUint>>();
let now2 = now();
for n in v {
println!("{}", n);
}
println!("time spent seconds: {}", now2.to_timespec().sec - now1.to_timespec().sec);
}
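As a point of comparison for inputs that fit in machine words (the benchmark above only goes up to 10^6), the same square-and-multiply loop can be sketched on u64 with u128 intermediates, which avoids heap allocation entirely. This is a minimal sketch, not a replacement for BigUint when arbitrarily large values are needed; the u64-based signature is an assumption made for illustration.

```rust
/// Square-and-multiply modular exponentiation on machine integers.
/// Intermediate products are widened to u128 so they cannot overflow.
/// (Hypothetical fixed-width variant of the BigUint `mod_exp` above.)
fn mod_exp(base: u64, mut exponent: u64, modulus: u64) -> u64 {
    assert!(modulus > 0, "modulus must be non-zero");
    let m = u128::from(modulus);
    let mut base = u128::from(base) % m;
    let mut result = 1u128 % m; // yields 0 when modulus == 1
    while exponent > 0 {
        // Multiply the result in whenever the current low bit is set.
        if exponent & 1 == 1 {
            result = result * base % m;
        }
        // Square the base and move on to the next bit.
        base = base * base % m;
        exponent >>= 1;
    }
    result as u64
}

fn main() {
    println!("{}", mod_exp(4, 13, 497));
}
```

Nothing is cloned here because u64 and u128 are Copy, so the ownership concerns from the question do not arise for this restricted case.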

Related

My prime number sieve is extremely slow even with --release

I have looked at multiple answers online to the same question but I cannot figure out why my program is so slow. I think it is the for loops but I am unsure.
P.S. I am quite new to Rust and am not very proficient in it yet. Any tips or tricks, or any good coding practices that I am not using are more than welcome :)
math.rs
pub fn number_to_vector(number: i32) -> Vec<i32> {
let mut numbers: Vec<i32> = Vec::new();
for i in 1..number + 1 {
numbers.push(i);
}
return numbers;
}
user_input.rs
use std::io;
pub fn get_user_input(prompt: &str) -> i32 {
println!("{}", prompt);
let mut user_input: String = String::new();
io::stdin().read_line(&mut user_input).expect("Failed to read line");
let number: i32 = user_input.trim().parse().expect("Please enter an integer!");
return number;
}
main.rs
mod math;
mod user_input;
fn main() {
let user_input: i32 = user_input::get_user_input("Enter a positive integer: ");
let mut numbers: Vec<i32> = math::number_to_vector(user_input);
numbers.remove(numbers.iter().position(|x| *x == 1).unwrap());
let mut numbers_to_remove: Vec<i32> = Vec::new();
let ceiling_root: i32 = (user_input as f64).sqrt().ceil() as i32;
for i in 2..ceiling_root + 1 {
for j in i..user_input + 1 {
numbers_to_remove.push(i * j);
}
}
numbers_to_remove.sort_unstable();
numbers_to_remove.dedup();
numbers_to_remove.retain(|x| *x <= user_input);
for number in numbers_to_remove {
if numbers.iter().any(|&i| i == number) {
numbers.remove(numbers.iter().position(|x| *x == number).unwrap());
}
}
println!("Prime numbers up to {}: {:?}", user_input, numbers);
}
There are two main problems in your code: the i * j loop has the wrong upper limit for j, and the composites removal loop uses an O(n) operation for each entry, making it quadratic overall.
The corrected code:
fn main() {
let user_input: i32 = get_user_input("Enter a positive integer: ");
let mut numbers: Vec<i32> = number_to_vector(user_input);
numbers.remove(numbers.iter().position(|x| *x == 1).unwrap());
let mut numbers_to_remove: Vec<i32> = Vec::new();
let mut primes: Vec<i32> = Vec::new(); // new code
let mut i = 0; // new code
let ceiling_root: i32 = (user_input as f64).sqrt().ceil() as i32;
for i in 2..ceiling_root + 1 {
for j in i..(user_input/i) + 1 { // FIX #1: user_input/i
numbers_to_remove.push(i * j);
}
}
numbers_to_remove.sort_unstable();
numbers_to_remove.dedup();
//numbers_to_remove.retain(|x| *x <= user_input); // not needed now
for n in numbers { // FIX #2:
// two linear enumerations; once the composites are exhausted, every remaining n is prime
if i >= numbers_to_remove.len() || n < numbers_to_remove[i] {
primes.push(n);
}
else {
i += 1; // in unison
}
}
println!("Last prime number up to {}: {:?}", user_input, primes.last());
println!("Total prime numbers up to {}: {:?}", user_input,
primes.iter().count());
}
Your i * j loop was actually O(N^1.5), whereas your numbers removal loop was actually quadratic -- remove is O(n) because it needs to move all the elements past the removed one back, so there is no gap.
The mended code now runs at ~ N^1.05 empirically in the 10^6...2*10^6 range, and orders of magnitude faster in absolute terms as well.
Oh, and that's a sieve, but not of Eratosthenes. To qualify as such, the i's should range over primes, not just all numbers.
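To make the last remark concrete, here is a minimal sketch of a sieve of Eratosthenes in which only primes are used to cross off multiples. The function name and the Vec<bool> representation are illustrative choices, not part of the code above.

```rust
/// Returns all primes <= limit using a boolean sieve.
/// (Hypothetical helper; the name and signature are assumptions.)
fn sieve_of_eratosthenes(limit: usize) -> Vec<usize> {
    if limit < 2 {
        return Vec::new();
    }
    // is_composite[n] becomes true once n is known to have a smaller prime factor.
    let mut is_composite = vec![false; limit + 1];
    let mut i = 2;
    while i * i <= limit {
        if !is_composite[i] {
            // i is prime: cross off its multiples, starting at i * i,
            // because smaller multiples were handled by smaller primes.
            for multiple in (i * i..=limit).step_by(i) {
                is_composite[multiple] = true;
            }
        }
        i += 1;
    }
    (2..=limit).filter(|&n| !is_composite[n]).collect()
}

fn main() {
    println!("{:?}", sieve_of_eratosthenes(30));
}
```

Marking a flag in place replaces both the sort/dedup pass and the element removal, which is where the original code spent most of its time.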
As AKX commented, your function's big O is O(m * n), which is why it's slow.
For this kind of "expensive" calculation, you can use multithreading to make it run faster.
This part of the answer is not about choosing the right algorithm, but about code style (tips/tricks).
I think the idiomatic way to do this is with iterators (which are lazy). It makes the code more readable and simpler, and it runs about 2 times faster in this case.
fn primes_up_to() {
let num = get_user_input("Enter a positive integer greater than 2: ");
let primes = (2..=num).filter(is_prime).collect::<Vec<i32>>();
println!("{:?}", primes);
}
fn is_prime(num: &i32) -> bool {
let bound = (*num as f64).sqrt() as i32; // f64 represents every i32 exactly; f32 does not
*num == 2 || !(2..=bound).any(|n| num % n == 0)
}
Edit: This style also makes it easy to switch to parallel iterators for "expensive" calculations with rayon.
Edit 2: Algorithm fix. Previously, this used a quadratic algorithm. Thanks to @WillNess.

VirtualFree results in ERROR_INVALID_PARAMETER in Rust

Here's a situation. I'm allocating memory using the following function
let addr = windows::Win32::System::Memory::VirtualAlloc(
ptr::null_mut(),
size,
windows::Win32::System::Memory::MEM_RESERVE | windows::Win32::System::Memory::MEM_COMMIT,
windows::Win32::System::Memory::PAGE_READWRITE,
);
Upon successful allocation, the resulting memory is cast to *mut u8 and everyone's happy until it's time to deallocate this same space. Here's how I approach it:
let result = System::Memory::VirtualFree(
ptr as *mut c_void,
size,
windows::Win32::System::Memory::MEM_DECOMMIT).0;
The Win32 API docs state that VirtualFree returns a non-zero value when the memory is successfully reclaimed, but in my case the return value is zero. While investigating, I found that calling GetLastError gives a more detailed explanation of what went wrong. The value it returned was 0x57, i.e. ERROR_INVALID_PARAMETER.
This issue has been bothering me for quite a while, so I've had plenty of time to experiment with the inputs to these functions. And here's the thing: the setup I described above works perfectly when I run the tests in release mode, but fails in debug mode. When I pass 0 as the second argument to VirtualFree and MEM_RELEASE as the third, it crashes in both modes.
So, how do I resolve the issue?
UPD
I apologize for the lack of context. So, the problem occurs when I'm running the following test
#[test]
fn stress() {
let mut rng = rand::thread_rng();
let seed: u64 = rng.gen();
let seed = seed % 10000;
run_stress(seed);
}
fn run_stress(seed: u64) {
let mut a = Dlmalloc::new();
println!("++++++++++++++++++++++ seed = {}\n", seed);
let mut rng = StdRng::seed_from_u64(seed);
let mut ptrs = Vec::new();
let max = if cfg!(test_lots) { 1_000_000 } else { 10_000 };
unsafe {
for _k in 0..max {
let free = !ptrs.is_empty()
&& ((ptrs.len() < 10_000 && rng.gen_bool(1f64 / 3f64)) || rng.gen());
if free {
let idx = rng.gen_range(0, ptrs.len());
let (ptr, size, align) = ptrs.swap_remove(idx);
println!("ptr: {:p}, size = {}", ptr, size);
a.free(ptr, size, align); // crashes right after the call to this function
continue;
}
if !ptrs.is_empty() && rng.gen_bool(1f64 / 100f64) {
let idx = rng.gen_range(0, ptrs.len());
let (ptr, size, align) = ptrs.swap_remove(idx);
let new_size = if rng.gen() {
rng.gen_range(size, size * 2)
} else if size > 10 {
rng.gen_range(size / 2, size)
} else {
continue;
};
let mut tmp = Vec::new();
for i in 0..cmp::min(size, new_size) {
tmp.push(*ptr.add(i));
}
let ptr = a.realloc(ptr, size, align, new_size);
assert!(!ptr.is_null());
for (i, byte) in tmp.iter().enumerate() {
assert_eq!(*byte, *ptr.add(i));
}
ptrs.push((ptr, new_size, align));
}
let size = if rng.gen() {
rng.gen_range(1, 128)
} else {
rng.gen_range(1, 128 * 1024)
};
let align = if rng.gen_bool(1f64 / 10f64) {
1 << rng.gen_range(3, 8)
} else {
8
};
let zero = rng.gen_bool(1f64 / 50f64);
let ptr = if zero {
a.calloc(size, align)
} else {
a.malloc(size, align)
};
for i in 0..size {
if zero {
assert_eq!(*ptr.add(i), 0);
}
*ptr.add(i) = 0xce;
}
ptrs.push((ptr, size, align));
}
}
}
I should point out that it doesn't crash on a particular iteration -- this number always changes.
This is an excerpt from the dlmalloc-rust crate.
The crate I'm using for interacting with WinAPI is windows-rs.
Here's the implementation of free:
pub unsafe fn free(ptr: *mut u8, size: usize) -> bool {
let result = System::Memory::VirtualFree(
ptr as *mut c_void,
0,
windows::Win32::System::Memory::MEM_RELEASE).0;
if result == 0 {
let cause = windows::Win32::Foundation::GetLastError().0;
dlverbose!("{}", cause);
}
result != 0
}

How can I make my Rust code run faster in parallel?

#![feature(map_first_last)]
use num_cpus;
use std::collections::BTreeMap;
use ordered_float::OrderedFloat;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Instant;
const MONO_FREQ: [f64; 26] = [
8.55, 1.60, 3.16, 3.87, 12.1, 2.18, 2.09, 4.96, 7.33, 0.22, 0.81, 4.21, 2.53, 7.17, 7.47, 2.07,
0.10, 6.33, 6.73, 8.94, 2.68, 1.06, 1.83, 0.19, 1.72, 0.11,
];
fn main() {
let ciphertext : String = "helloworldthisisatest".to_string();
concurrent( &ciphertext);
parallel( &ciphertext);
}
fn concurrent(ciphertext : &String) {
let start = Instant::now();
for _ in 0..50000 {
let mut best_fit : f64 = chi_squared(&ciphertext);
let mut best_key : u8 = 0;
for i in 1..26 {
let test_fit = chi_squared(&decrypt(&ciphertext, i));
if test_fit < best_fit {
best_key = i;
best_fit = test_fit;
}
}
}
let elapsed = start.elapsed();
println!("Concurrent : {} ms", elapsed.as_millis());
}
fn parallel(ciphertext : &String) {
let cpus = num_cpus::get() as u8;
let start = Instant::now();
for _ in 0..50000 {
let mut best_result : f64 = chi_squared(&ciphertext);
for i in (0..26).step_by(cpus.into()) {
let results = Arc::new(Mutex::new(BTreeMap::new()));
let mut threads = vec![];
for ii in i..i+cpus {
threads.push(thread::spawn({
let clone = Arc::clone(&results);
let test = OrderedFloat(chi_squared(&decrypt(&ciphertext, ii)));
move || {
let mut v = clone.lock().unwrap();
v.insert(test, ii);
}
}));
}
for t in threads {
t.join().unwrap();
}
let lock = Arc::try_unwrap(results).expect("Lock still has multiple owners");
let hold = lock.into_inner().expect("Mutex cannot be locked");
if hold.last_key_value().unwrap().0.into_inner() > best_result {
best_result = hold.last_key_value().unwrap().0.into_inner();
}
}
}
let elapsed = start.elapsed();
println!("Parallel : {} ms", elapsed.as_millis());
}
fn decrypt(ciphertext : &String, shift : u8) -> String {
ciphertext.chars().map(|x| ((x as u8 + shift - 97) % 26 + 97) as char).collect()
}
pub fn chi_squared(text: &str) -> f64 {
let mut result: f64 = 0.0;
for (pos, i) in get_letter_counts(text).iter().enumerate() {
let expected = MONO_FREQ[pos] * text.len() as f64 / 100.0;
result += (*i as f64 - expected).powf(2.0) / expected;
}
return result;
}
fn get_letter_counts(text: &str) -> [u64; 26] {
let mut results: [u64; 26] = [0; 26];
for i in text.chars() {
results[((i as u64) - 97) as usize] += 1;
}
return results;
}
Sorry to dump so much code, but I have no idea where the problem is. No matter what I try, the parallel code seems to be around 100x slower.
I think the problem may be in the chi_squared function, as I don't know whether it is running in parallel.
I have tried Arc<Mutex>, rayon and message passing, and all of them slow it down when they should speed it up. What could I do to make this faster?
Your code calculates the chi_squared function on the main thread, because the call sits outside the closure passed to thread::spawn. Here is the corrected version:
for ii in i..i + cpus {
let cp = ciphertext.clone();
let clone = Arc::clone(&results);
threads.push(thread::spawn(move || {
let test = OrderedFloat(chi_squared(&decrypt(&cp, ii)));
let mut v = clone.lock().unwrap();
v.insert(test, ii);
}));
}
Note that it hardly matters whether it is calculated in parallel or not, because spawning 50000*26 threads and the synchronization overhead between them are what make up the 100x difference in the first place. Using a thread pool would reduce the overhead, but the result would still be much slower than the single-threaded version. The only thing you can do is distribute the work of the outer loop (0..50000); however, I am guessing you are trying to parallelize inside the main loop.
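Distributing the outer loop could look roughly like the sketch below, which uses only the standard library (std::thread::scope, available since Rust 1.63). The helper names best_key and run_parallel are made up for illustration, decrypt and chi_squared are condensed rewrites of the functions in the question, and no speedup figure is claimed: whether this beats the sequential loop depends on how much work each chunk carries.

```rust
use std::thread;

const MONO_FREQ: [f64; 26] = [
    8.55, 1.60, 3.16, 3.87, 12.1, 2.18, 2.09, 4.96, 7.33, 0.22, 0.81, 4.21, 2.53,
    7.17, 7.47, 2.07, 0.10, 6.33, 6.73, 8.94, 2.68, 1.06, 1.83, 0.19, 1.72, 0.11,
];

// Condensed version of the question's decrypt; assumes lowercase ASCII input.
fn decrypt(text: &str, shift: u8) -> String {
    text.bytes().map(|b| ((b - b'a' + shift) % 26 + b'a') as char).collect()
}

// Condensed version of the question's chi_squared; assumes lowercase ASCII input.
fn chi_squared(text: &str) -> f64 {
    let mut counts = [0u64; 26];
    for b in text.bytes() {
        counts[(b - b'a') as usize] += 1;
    }
    counts.iter().zip(MONO_FREQ.iter()).map(|(&count, &freq)| {
        let expected = freq * text.len() as f64 / 100.0;
        (count as f64 - expected).powi(2) / expected
    }).sum()
}

/// One unit of work (hypothetical helper): try every shift, return the best one.
fn best_key(ciphertext: &str) -> u8 {
    let mut best_fit = chi_squared(ciphertext);
    let mut best = 0u8;
    for key in 1..26 {
        let fit = chi_squared(&decrypt(ciphertext, key));
        if fit < best_fit {
            best_fit = fit;
            best = key;
        }
    }
    best
}

/// Splits `iterations` units of work into one contiguous chunk per worker thread.
fn run_parallel(ciphertext: &str, iterations: usize, workers: usize) -> Vec<u8> {
    let workers = workers.max(1);
    let per_worker = (iterations + workers - 1) / workers; // ceiling division
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                let start = w * per_worker;
                let end = ((w + 1) * per_worker).min(iterations);
                // Each thread is spawned once and processes its whole chunk.
                s.spawn(move || (start..end).map(|_| best_key(ciphertext)).collect::<Vec<u8>>())
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let keys = run_parallel("helloworldthisisatest", 50_000, workers);
    println!("{} results, best key {}", keys.len(), keys[0]);
}
```

Only as many threads as workers are spawned in total, and no Mutex is needed because each thread returns its own Vec through join.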

A vector is created inside an if block in Rust. How to use it after the scope of if ends?

I am learning Rust. I am trying to calculate a list of prime numbers up to some number. For that I need to create a vector (vec1) inside an if block and use it outside the scope of the if.
I tried a code with the same logic in MATLAB and it works.
A simplified version of the actual code looks like this:
fn main() {
let mut initiate = 1;
let mut whilechecker = 2;
while whilechecker > 0 {
whilechecker = whilechecker - 1;
if initiate == 1 {
let mut vec1 = vec![2];
}
for i in &vec1 {
if *i == 2 {
break;
}
} //for
initiate = 2;
vec1.push(5);
} //while
} //main
It is supposed to put a list of prime numbers in vec1. Since this is simplified code, it is enough for it to compile and produce a vector (vec1).
But the compiler says:
cannot find value vec1 in this scope
at for i in &vec1{ and at vec1.push(5);.
Can you make it compile?
There's no reason to have the complicated if initiate == 1 check. Just move the initialization of the vector outside the while loop, so it gets done only once:
fn main() {
let mut whilechecker = 2;
let mut vec1 = vec![2];
while whilechecker > 0 {
whilechecker = whilechecker - 1;
for i in &vec1 {
if *i == 2 {
break;
}
} //for
vec1.push(5);
} //while
} //main
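I'm not sure what you actually want, but here is an example that declares the variable in the outer scope, which may help. Note that the let inside the if block declares a second, shadowing vec1 that is dropped at the end of that block, so the outer vector never receives the 2.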
fn main() {
let mut initiate = 1;
let mut whilechecker = 2;
let mut vec1 = Vec::new();
while whilechecker > 0 {
if initiate == 1 {
let mut vec1 = vec![2];
}
for i in &vec1 {
if *i == 2 {
break;
}
}
initiate = 2;
vec1.push(5);
whilechecker = whilechecker - 1;
}
println!("{:?}", vec1);
}
The output of the given code is:
[5, 5]
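If the intent is for the initial element to end up in the outer vector, a variant that assigns to the existing variable instead of declaring a new one with let might look like this. It is a sketch that keeps the control flow of the question; the wrapper function build is added only so the result can be returned and inspected.

```rust
/// Builds the vector by assigning to the outer variable inside the `if`.
/// (Hypothetical wrapper around the loop from the question.)
fn build() -> Vec<i32> {
    let mut initiate = 1;
    let mut whilechecker = 2;
    let mut vec1 = Vec::new(); // declared once, in the outer scope
    while whilechecker > 0 {
        if initiate == 1 {
            vec1 = vec![2]; // assignment, not `let`: updates the outer vec1
        }
        for i in &vec1 {
            if *i == 2 {
                break;
            }
        }
        initiate = 2;
        vec1.push(5);
        whilechecker -= 1;
    }
    vec1
}

fn main() {
    println!("{:?}", build());
}
```

With the assignment, the vector holds [2, 5, 5] after the loop: the 2 from the first pass plus one 5 per iteration.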

How to do a 'for' loop with boundary values and step as floating point values?

I need to implement a for loop that goes from one floating point number to another with the step as another floating point number.
I know how to implement that in C-like languages:
for (float i = -1.0; i < 1.0; i += 0.01) { /* ... */ }
I also know that in Rust I can specify the loop step using step_by, and that gives me what I want if I have the boundary values and step as integers:
#![feature(iterator_step_by)]
fn main() {
for i in (0..30).step_by(3) {
println!("Index {}", i);
}
}
When I do that with floating point numbers, it results in a compilation error:
#![feature(iterator_step_by)]
fn main() {
for i in (-1.0..1.0).step_by(0.01) {
println!("Index {}", i);
}
}
And here is the compilation output:
error[E0599]: no method named `step_by` found for type `std::ops::Range<{float}>` in the current scope
--> src/main.rs:4:26
|
4 | for i in (-1.0..1.0).step_by(0.01) {
| ^^^^^^^
|
= note: the method `step_by` exists but the following trait bounds were not satisfied:
`std::ops::Range<{float}> : std::iter::Iterator`
`&mut std::ops::Range<{float}> : std::iter::Iterator`
How can I implement this loop in Rust?
If you haven't yet, I invite you to read Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The problem with floating-point numbers is that your code may do 200 or 201 iterations, depending on whether the last step of the loop ends up being i = 0.99 or i = 0.999999 (which is still < 1 even if really close).
To avoid this footgun, Rust does not allow iterating over a range of f32 or f64. Instead, it forces you to use integral steps:
for i in -100i8..100 {
let i = f32::from(i) * 0.01;
// ...
}
See also:
How do I convert between numeric types safely and idiomatically?
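The difference between accumulating a float and deriving it from an integer counter can be seen in a small sketch. The helper names accumulate, from_index and stepped_values are made up for illustration.

```rust
/// Adds `step` repeatedly, the way `i += step` does in a C-style loop.
fn accumulate(step: f64, times: u32) -> f64 {
    let mut x = 0.0;
    for _ in 0..times {
        x += step;
    }
    x
}

/// Derives the value from an integer index, so rounding error does not build up
/// from one iteration to the next.
fn from_index(step: f64, index: u32) -> f64 {
    f64::from(index) * step
}

/// The loop from the question, driven by an integer counter.
fn stepped_values() -> Vec<f64> {
    (-100i32..100).map(|i| f64::from(i) * 0.01).collect()
}

fn main() {
    println!("accumulated: {}", accumulate(0.1, 10));
    println!("from index:  {}", from_index(0.1, 10));
    println!("iterations:  {}", stepped_values().len());
}
```

Adding 0.1 ten times does not land exactly on 1.0, whereas 10 * 0.1 does, and the integer-driven loop always runs exactly 200 times.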
As a real iterator:
/// produces: [ linear_interpol(start, end, i/steps) | i <- 0..steps ]
/// (does NOT include "end")
///
/// linear_interpol(a, b, p) = (1 - p) * a + p * b
pub struct FloatIterator {
current: u64,
current_back: u64,
steps: u64,
start: f64,
end: f64,
}
impl FloatIterator {
pub fn new(start: f64, end: f64, steps: u64) -> Self {
FloatIterator {
current: 0,
current_back: steps,
steps: steps,
start: start,
end: end,
}
}
/// calculates number of steps from (end - start) / step
pub fn new_with_step(start: f64, end: f64, step: f64) -> Self {
let steps = ((end - start) / step).abs().round() as u64;
Self::new(start, end, steps)
}
pub fn length(&self) -> u64 {
self.current_back - self.current
}
fn at(&self, pos: u64) -> f64 {
let f_pos = pos as f64 / self.steps as f64;
(1. - f_pos) * self.start + f_pos * self.end
}
/// panics (in debug) when len doesn't fit in usize
fn usize_len(&self) -> usize {
let l = self.length();
debug_assert!(l <= ::std::usize::MAX as u64);
l as usize
}
}
impl Iterator for FloatIterator {
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
if self.current >= self.current_back {
return None;
}
let result = self.at(self.current);
self.current += 1;
Some(result)
}
fn size_hint(&self) -> (usize, Option<usize>) {
let l = self.usize_len();
(l, Some(l))
}
fn count(self) -> usize {
self.usize_len()
}
}
impl DoubleEndedIterator for FloatIterator {
fn next_back(&mut self) -> Option<Self::Item> {
if self.current >= self.current_back {
return None;
}
self.current_back -= 1;
let result = self.at(self.current_back);
Some(result)
}
}
impl ExactSizeIterator for FloatIterator {
fn len(&self) -> usize {
self.usize_len()
}
//fn is_empty(&self) -> bool {
// self.length() == 0u64
//}
}
pub fn main() {
println!(
"count: {}",
FloatIterator::new_with_step(-1.0, 1.0, 0.01).count()
);
for f in FloatIterator::new_with_step(-1.0, 1.0, 0.01) {
println!("{}", f);
}
}
This is basically doing the same as in the accepted answer, but you might prefer to write something like:
for i in (-100..100).map(|x| x as f64 * 0.01) {
println!("Index {}", i);
}
Another answer using iterators, but in a slightly different way:
extern crate num;
use num::{Float, FromPrimitive};
fn linspace<T>(start: T, stop: T, nstep: u32) -> Vec<T>
where
T: Float + FromPrimitive,
{
let delta: T = (stop - start) / T::from_u32(nstep - 1).expect("out of range");
return (0..(nstep))
.map(|i| start + T::from_u32(i).expect("out of range") * delta)
.collect();
}
fn main() {
for f in linspace(-1f32, 1f32, 3) {
println!("{}", f);
}
}
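Under nightly (at the time of writing) you can use the conservative impl trait feature to avoid the Vec allocation. Returning impl Trait has been stable since Rust 1.26, so the feature attribute is no longer needed on current compilers: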
#![feature(conservative_impl_trait)]
extern crate num;
use num::{Float, FromPrimitive};
fn linspace<T>(start: T, stop: T, nstep: u32) -> impl Iterator<Item = T>
where
T: Float + FromPrimitive,
{
let delta: T = (stop - start) / T::from_u32(nstep - 1).expect("out of range");
return (0..(nstep))
.map(move |i| start + T::from_u32(i).expect("out of range") * delta);
}
fn main() {
for f in linspace(-1f32, 1f32, 3) {
println!("{}", f);
}
}
For the reasons mentioned by others, one shouldn't be looping using floats under most circumstances.
For those cases where it is appropriate, it can be done (although not as ergonomically, which is probably good design--Rust should make it more difficult to juggle running chainsaws).
Since Rust 1.34, std::iter::successors() enables looping directly with a floating point index:
use std::iter;
const START: f64 = -1.0;
const END: f64 = 1.0;
// Increment by 0.1 (instead of 0.01 per the question) for output brevity
const INCREMENT: f64 = 0.1;
fn main() {
iter::successors(Some(START), |i| {
let next = i + INCREMENT;
(next < END).then_some(next)
})
.for_each(|i| println!("{i}"));
}
Note there are 21 lines of output, although only 20 were probably expected given the condition of i < 1.0 (as opposed to i <= 1.0) in the sample code of your question.
This is due to the precision and/or cumulative rounding errors present in the output, even though the source code specifies iterating from -1.0 to 1.0 in increments of exactly 0.1. (Feel free to switch the START value to 0.0 or 0.3 to see different series output, also with precision/cumulative rounding errors).

Resources