How to shuffle an array deterministically with a seed?

I'm finding it difficult to shuffle an array deterministically in Rust, i.e. with a fixed random seed. What I'm trying to achieve (in pseudocode):
let v = vec![0, 1, 2, 3];
pseudo_shuffle(v, randomSeed1) // always produces e.g. [3,1,2,0]
pseudo_shuffle(v, randomSeed2) // always produces e.g. [0,2,3,1]
In another Stack Overflow answer I learnt how to use rand::Rng::shuffle() to shuffle a vector non-deterministically, but it doesn't seem to provide an API for applying a random seed to the generation function, and I'm having a difficult time coming up with a solution myself that doesn't employ some ridiculous n! complexity algorithm.

Use a random number generator that implements the trait SeedableRng and call from_seed with the desired seed.
Example:
use rand::{seq::SliceRandom, SeedableRng}; // 0.6.5
use rand_chacha::ChaChaRng; // 0.1.1

fn main() {
    let seed = [0; 32];
    let mut rng = ChaChaRng::from_seed(seed);

    let mut v1 = vec![1, 2, 3, 4, 5];
    v1.shuffle(&mut rng);
    assert_eq!(v1, [3, 5, 2, 4, 1]);
}
Clone the RNG before using it or create a new one from scratch with the same seed to reset back to the original state.
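For example, a minimal sketch of the cloning approach (same crate versions as above; the seed value is arbitrary):

use rand::{seq::SliceRandom, SeedableRng}; // 0.6.5
use rand_chacha::ChaChaRng; // 0.1.1

fn main() {
    let rng = ChaChaRng::from_seed([0; 32]);

    // Each clone starts from the same internal state, so both
    // shuffles consume identical random streams and agree.
    let mut rng_a = rng.clone();
    let mut rng_b = rng.clone();

    let mut v1 = vec![1, 2, 3, 4, 5];
    let mut v2 = vec![1, 2, 3, 4, 5];
    v1.shuffle(&mut rng_a);
    v2.shuffle(&mut rng_b);
    assert_eq!(v1, v2);
}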
You may also be interested in ReseedingRng.


Rust nalgebra inverse matrix

Does anyone know a simple way to get the inverse of a matrix using Rust's nalgebra::Matrix?
I'm trying to do this the same way as with the C++ Eigen library, but it's clearly not working.
# Cargo.toml
[dependencies]
nalgebra = "0.30"

# main.rs
let m = Matrix3::new(11, 12, 13,
                     21, 22, 23,
                     31, 32, 33);
println!("{}", m);
println!("{}", m.transpose());
println!("{}", m.inverse()); // This blows up
Nalgebra's Matrix does not have a straight-up inverse method, mainly because matrix inversion is a fallible operation. In fact, the example matrix doesn't even have an inverse. However, you can use the try_inverse method:
let inverse = m.try_inverse().unwrap();
You can also use the pseudo_inverse method if that's better for your use case. It takes an epsilon; singular values below it are treated as zero:
let pseudo_inverse = m.pseudo_inverse(1e-6).unwrap();
Note that computing the pseudoinverse is unlikely to fail; see nalgebra::linalg::SVD if you want more fine-grained control of the process.
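For instance, a rough sketch going through the SVD directly (the matrix and the 1e-6 cutoff for small singular values are arbitrary choices for illustration):

use nalgebra::{linalg::SVD, Matrix3};

let m = Matrix3::new(2.0, 0.0, 1.0,
                     0.0, 3.0, 0.0,
                     0.0, 0.0, 4.0);
// Request both sets of singular vectors; they are needed to
// reassemble the pseudoinverse from the decomposition.
let svd = SVD::new(m, true, true);
// Singular values below the epsilon are treated as zero.
let pseudo_inverse = svd.pseudo_inverse(1e-6).unwrap();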
Another note: you have a matrix of integers, but you need a matrix of floats. That's an easy fix; just add some decimal points:
let m = Matrix3::new(11.0, 12.0, 13.0,
                     21.0, 22.0, 23.0,
                     31.0, 32.0, 33.0);
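Putting it together, a minimal complete program; note that the example matrix is singular even with floats, so a second, invertible matrix (picked arbitrarily here) is used to show a successful inversion:

use nalgebra::Matrix3;

fn main() {
    // The original example matrix is singular even as floats,
    // so try_inverse returns None for it.
    let singular = Matrix3::new(11.0, 12.0, 13.0,
                                21.0, 22.0, 23.0,
                                31.0, 32.0, 33.0);
    assert!(singular.try_inverse().is_none());

    // An invertible matrix succeeds.
    let m = Matrix3::new(2.0, 0.0, 1.0,
                         0.0, 3.0, 0.0,
                         0.0, 0.0, 4.0);
    match m.try_inverse() {
        Some(inverse) => println!("{}", inverse),
        None => println!("not invertible"),
    }
}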

Efficient way to map numbers 1 to 1 in a pseudorandom manner?

I don't want to return raw user IDs to the frontend. A lot of people solve this by generating random IDs and checking if they're already in the DB. I want to find a way to map numbers in a known range 1 to 1. This way, I can still use auto-incremented IDs internally, but return the pseudorandomly mapped IDs to the frontend.
I could just shuffle all numbers from 1 to N in a deterministic way, but I'm wondering if there's a more efficient way.
You can pick a random number and use bitwise XOR to switch between frontend and database IDs: XOR with a fixed key is its own inverse (id ^ seed ^ seed == id), so the same operation maps in both directions.
It would probably be easy to reverse engineer, but it's very easy to use.
Here's an example in JavaScript:
const seed = 123;
const internalIds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
console.log(internalIds.join(','));
const frontIds = internalIds.map(id => id ^ seed);
console.log(frontIds.join(','));
const recoveredIds = frontIds.map(id => id ^ seed);
console.log(recoveredIds.join(','));
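The same round trip sketched in Rust (names are illustrative):

fn main() {
    // XOR with a fixed key is an involution: (id ^ key) ^ key == id,
    // so the same operation maps in both directions.
    let seed: u64 = 123;
    let internal_ids: Vec<u64> = (0..10).collect();

    let front_ids: Vec<u64> = internal_ids.iter().map(|id| id ^ seed).collect();
    let recovered_ids: Vec<u64> = front_ids.iter().map(|id| id ^ seed).collect();

    println!("{:?}", front_ids);
    assert_eq!(internal_ids, recovered_ids);
}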

How to generate a random Rust integer in a range without introducing bias?

How do I generate a random dice roll in Rust?
I know I can use rand::random, but that generates a value spanning an entire integer type. Using rand::random::<u8>() % 6 introduces a bias.
Use Rng::gen_range for a one-off value:
use rand::Rng; // 0.8.0

fn main() {
    let mut rng = rand::thread_rng();
    let die = rng.gen_range(1..=6);
    println!("The die was: {}", die);
}
Under the hood, this creates a Uniform struct. Create this struct yourself if you will be getting multiple random numbers:
use rand::distributions::{Distribution, Uniform}; // 0.8.0

fn main() {
    let mut rng = rand::thread_rng();
    let die_range = Uniform::new_inclusive(1, 6);
    let die = die_range.sample(&mut rng);
    println!("{}", die);
}
Uniform does some precomputation to figure out how to map the complete range of random values to your desired range without introducing bias. It translates and resizes your original range to most closely match the range of the random number generator, discards any random numbers that fall outside this new range, then resizes and translates back to the original range.
See also:
Why do people say there is modulo bias when using a random number generator?
You're correct that a bias is introduced: whenever you map from a set A onto a smaller set B, there will be bias unless the cardinality of B divides the cardinality of A.
In your case, 42 * 6 = 252 is the largest multiple of 6 that fits in a u8. So you can just throw away any u8 values of 252 or greater (and call random again).
The output can then be safely mapped with the modulo operator. Finally, add 1 to achieve the standard [1, 6] dice output.
It might seem unclean to call random again but there is no way of mapping a set of 256 values to a set of 6 without introducing bias.
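A minimal sketch of that rejection loop (the helper function is mine, assuming rand 0.8):

use rand::Rng; // 0.8

/// Roll a fair die via rejection sampling on raw u8 values.
fn roll<R: Rng>(rng: &mut R) -> u8 {
    loop {
        let x: u8 = rng.gen();
        // 252 = 42 * 6, so 0..252 maps onto 1..=6 without bias;
        // the four values 252..=255 are thrown away and redrawn.
        if x < 252 {
            return x % 6 + 1;
        }
    }
}

fn main() {
    let mut rng = rand::thread_rng();
    println!("The die was: {}", roll(&mut rng));
}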
Edit: looks like the rand crate has something which takes bias into account: https://docs.rs/rand/latest/rand/distributions/uniform/struct.Uniform.html

How can I modify my Akka streams Prime sieve to exclude modulo checks for known primes?

I wrote a sieve using Akka Streams to find prime members of an arbitrary source of Int:
object Sieve extends App {
  implicit val system = ActorSystem()
  implicit val mat = ActorMaterializer(ActorMaterializerSettings(system))
  implicit val ctx = implicitly[ExecutionContext](system.dispatcher)

  val NaturalNumbers = Source.fromIterator(() => Iterator.from(2))

  val IsPrimeByEurithmethes: Flow[Int, Int, _] = Flow[Int].filter {
    case n: Int =>
      (2 to Math.floor(Math.sqrt(n)).toInt).par.forall(n % _ != 0)
  }

  NaturalNumbers
    .via(IsPrimeByEurithmethes)
    .throttle(100000, 1 second, 100000, ThrottleMode.Shaping)
    .to(Sink.foreach(println))
    .run()
}
Ok, so this appears to work decently well. However, there are at least a few potential areas of concern:
The modulo checks are run using par.forall, i.e. they are totally hidden within the Flow that filters, but I can see how it would be useful to have a Map from each candidate n to another Map of each n % _. Maybe.
I am checking way too many of the candidates needlessly - both in terms of checking n that I will already know are NOT prime based on previous results, and by checking n % _ that are redundant. In fact, even if I think the n is prime, it suffices to check only the known primes up until that point.
The second point is my more immediate concern.
I think I can prove rather easily that there is a more efficient way - by filtering out the source given each NEW prime.
So then....
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ... => (after finding p=2)
2, 3, 5, 7, 9, 11, ...              => (after finding p=3)
2, 3, 5, 7, 11, ...                 => ...
Now, after finding a p and filtering the source, we need to know whether the next candidate is a p. Well, we can say for sure it is prime if the largest known prime is greater than its root, which I believe will always happen, so it suffices to just pick the next element...
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ... => (after finding p=2) PICK n(2) = 3
2, 3, 5, 7, 9, 11, ...              => (after finding p=3) PICK n(3) = 5
2, 3, 5, 7, 11, ...                 => (after finding p=5) PICK n(5) = 7
This seems to me like a rewriting of the originally-provided sieve to do far fewer checks at the cost of introducing a strict sequential dependency.
Another idea - I could remove the constraint by working things out in terms of symbols, like the minimum set of modulo checks that necessitate primality, etc.
Am I barking up the wrong tree? If not, how can I go about messing with my source in this manner?
I just started fiddling around with Akka Streams recently, so there might be better solutions than this (especially since the code feels kind of clumsy to me), but your second point seemed to be just the right challenge for me to try out building a feedback loop within Akka Streams.
Find my full solution here: https://gist.github.com/MartinHH/de62b3b081ccfee4ae7320298edd81ee
The main idea was to accumulate the primes that are already found and merge them with the stream of incoming natural numbers, so the primality check for each candidate could be done against the primes found so far, like this:
def isPrime(n: Int, primesSoFar: SortedSet[Int]): Boolean =
  !primesSoFar.exists(n % _ == 0) &&
    !(primesSoFar.lastOption.getOrElse(2) to Math.floor(Math.sqrt(n)).toInt).par.exists(n % _ == 0)

Scala - sort based on Future result predicate

I have an array of objects I want to sort, where the predicate for sorting is asynchronous. Does Scala have either a standard or 3rd party library function for sorting based on a predicate with type signature of (T, T) -> Future[Bool] rather than just (T, T) -> Bool?
Alternatively, is there some other way I could structure this code? I've considered finding all the 2-pair permutations of list elements, running the predicate over each pair and storing the result in a Map((T, T), Bool) or some structure to that effect, and then sorting on it - but I suspect that will have many more comparisons executed than even a naive sorting algorithm would.
If your predicate is async, you may prefer to get an async result too and avoid blocking threads with Await.
If you want to sort a List[(T,T)] according to a future boolean predicate, the easiest way is to sort a List[(T,T,Boolean)] instead.
So, given a List[(T,T)] and a predicate (T, T) -> Future[Bool], how can you get a List[(T,T,Boolean)]? Or rather a Future[List[(T,T,Boolean)]], as you want to keep the async behavior.
val list: List[(T, T)] = ...
val predicate: ((T, T)) => Future[Boolean] = ...

val listOfFutures: List[Future[(T, T, Boolean)]] = list.map { tuple2 =>
  predicate(tuple2).map(bool => (tuple2._1, tuple2._2, bool))
}

val futureList: Future[List[(T, T, Boolean)]] = Future.sequence(listOfFutures)

val futureSortedResult: Future[List[(T, T)]] = futureList.map { list =>
  list.sortBy(_._3).map(tuple3 => (tuple3._1, tuple3._2))
}
This is pseudo-code; I didn't compile it and it may not compile as-is, but you get the idea.
The key is Future.sequence, a very useful function that permits transforming Monad1[Monad2[X]] into Monad2[Monad1[X]], here List[Future[X]] into Future[List[X]]. Notice that if any of your predicate futures fails, the global sort operation will also be a failure.
If you want better performance, it may be a better solution to "batch" the calls to the service returning the Future[Boolean].
For example, instead of (T, T) -> Future[Bool], maybe you can design a service (if you own it, obviously) like List[(T, T)] -> Future[List[(T, T, Bool)]] so that you can get everything you need in a single async call.
A not so satisfactory alternative would be to block each comparison until the future is evaluated. If evaluating your sorting predicate is expensive, sorting will take a long time. In fact, this just translates a possibly concurrent program into a sequential one; all benefits of using futures will be lost.
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

implicit val executionContext = ExecutionContext.Implicits.global

val sortingPredicate: (Int, Int) => Future[Boolean] = (a, b) => Future {
  Thread.sleep(20) // Assume this is a costly comparison
  a < b
}

val unsorted = List(4, 2, 1, 5, 7, 3, 6, 8, 3, 12, 1, 3, 2, 1)

val sorted = unsorted.sortWith((a, b) =>
  Await.result(sortingPredicate(a, b), 5000.millis) // careful: may throw an exception
)
println(sorted) // List(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 12)
I don't know if there is an out-of-the-box solution that utilizes asynchronous comparison. However, you could try to implement your own sorting algorithm. If we consider Quicksort, which runs in O(n log(n)) on average, then we can actually utilize asynchronous comparison quite easily.
If you're not familiar with Quicksort, the algorithm basically does the following
Choose an element from the collection (called the Pivot)
Compare the pivot with all remaining elements. Create a collection with elements that are less than the pivot and one with elements that are greater than the pivot.
Sort the two new collections and concatenate them, putting the pivot in the middle.
Since step 2 performs a lot of independent comparisons we can evaluate the comparisons concurrently.
Here's an unoptimized implementation:
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration

object ParallelSort {
  val timeout = Duration.Inf

  implicit class QuickSort[U](elements: Seq[U]) {
    private def choosePivot: (U, Seq[U]) = elements.head -> elements.tail

    def sortParallelWith(predicate: (U, U) => Future[Boolean]): Seq[U] =
      if (elements.isEmpty || elements.size == 1) elements
      else if (elements.size == 2) {
        if (Await.result(predicate(elements.head, elements.tail.head), timeout)) elements
        else elements.reverse
      }
      else {
        val (pivot, other) = choosePivot
        // Kick off all comparisons against the pivot at once...
        val ordering: Seq[(Future[Boolean], U)] = other map { element => predicate(element, pivot) -> element }
        // ...this is where we utilize asynchronous evaluation of the sorting predicate
        val (left, right) = ordering.partition { case (lessThanPivot, _) => Await.result(lessThanPivot, timeout) }
        val leftSorted = left.map(_._2).sortParallelWith(predicate)
        val rightSorted = right.map(_._2).sortParallelWith(predicate)
        leftSorted ++ (pivot +: rightSorted)
      }
  }
}
which can be used (same example as above) as follows:
import ParallelSort.QuickSort
val sorted2 = unsorted.sortParallelWith(sortingPredicate)
println(sorted2) // List(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 12)
Note that whether this implementation of Quicksort is faster or slower than the completely sequential built-in sorting algorithm highly depends on the cost of a comparison: the longer a comparison has to block, the worse the alternative solution mentioned above becomes. On my machine, given a costly comparison (20 milliseconds) and the above list, the built-in sorting algorithm runs in ~1200 ms while this custom Quicksort runs in ~200 ms. If you're worried about performance, you'd probably want to come up with something smarter.

Edit: I just checked how many comparisons both the built-in sorting algorithm and the custom Quicksort perform: apparently, for the given list (and some other lists I randomly typed in), the built-in algorithm uses more comparisons, so the performance improvements thanks to parallel execution might not be that great. I don't know about bigger lists, but you'd have to profile them on your specific data anyway.
