Is there a way to prevent rust compiler from rearranging my code? - performance

After inspecting a disassembly of my binary, I realised that the compiler turned
expensive_function();
let t1 = Instant::now();
let ptr = get_mut_ptr::<AtomicUsize>();
(ptr).store(some_value, Ordering::Relaxed);
let t2 = Instant::now();
into
let t1 = Instant::now();
expensive_function();
let ptr = get_mut_ptr::<AtomicUsize>();
(ptr).store(some_value, Ordering::Relaxed);
let t2 = Instant::now();
And completely screwed my performance measurements.
Is there a way to prevent this?

Related

Which Rust RNG should be used for multithreaded sampling?

I am trying to create a function in Rust which will sample from M normal distributions N times. I have the sequential version below, which runs fine. I am trying to parallelize it using Rayon, but am encountering the error
Rc<UnsafeCell<ReseedingRng<rand_chacha::chacha::ChaCha12Core, OsRng>>> cannot be sent between threads safely
It seems my rand::thread_rng does not implement the traits Send and Sync. I tried using StdRng and OsRng which both do, to no avail because then I get errors that the variables pred and rng cannot be borrowed as mutable because they are captured in a Fn closure.
This is the working code below. It errors when I change the first into_iter() to into_par_iter().
use rand_distr::{Normal, Distribution};
use std::time::Instant;
use rayon::prelude::*;
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
let mut rng = rand::thread_rng();
let mut preds = vec![vec![0.0; n as usize]; means.len()];
(0..means.len()).into_iter().for_each(|i| {
(0..n).into_iter().for_each(|j| {
let normal = Normal::new(means[i], sds[i]).unwrap();
preds[i][j as usize] = normal.sample(&mut rng);
})
});
preds
}
fn main() {
let means = vec![0.0; 67000];
let sds = vec![1.0; 67000];
let start = Instant::now();
let preds = rprednorm(100, means, sds);
let duration = start.elapsed();
println!("{:?}", duration);
}
Any advice on how to make these two iterators parallel?
Thanks.
"It seems my rand::thread_rng does not implement the traits Send and Sync."
Why are you trying to send a thread_rng? The entire point of thread_rng is that it's a per-thread RNG.
"then I get errors that the variables pred and rng cannot be borrowed as mutable because they are captured in a Fn closure."
Well yes, you need to Clone the StdRng (or Copy the OsRng) into each closure. As for pred, that can't work for a similar reason: once you parallelise the loop the compiler does not know that every i is distinct, so as far as it's concerned the write access to i could overlap (you could have two iterations running in parallel which try to write to the same place at the same time) which is illegal.
The solution is to use rayon to iterate in parallel over the destination vector:
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
let mut preds = vec![vec![0.0; n as usize]; means.len()];
preds.par_iter_mut().enumerate().for_each(|(i, e)| {
let mut rng = rand::thread_rng();
(0..n).into_iter().for_each(|j| {
let normal = Normal::new(means[i], sds[i]).unwrap();
e[j as usize] = normal.sample(&mut rng);
})
});
preds
}
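The same idea can be seen without rayon. Below is a minimal sketch using only the standard library: std::thread::scope stands in for rayon's par_iter_mut, and fill_rows is a hypothetical function that writes a deterministic placeholder value instead of a random sample. Each thread is handed an exclusive &mut to one row, so the borrow checker can see that the writes never overlap.

```rust
use std::thread;

// Hypothetical helper: fills row i with means[i] + j, a deterministic
// stand-in for drawing a random sample.
fn fill_rows(n: usize, means: &[f64]) -> Vec<Vec<f64>> {
    let mut preds = vec![vec![0.0; n]; means.len()];
    thread::scope(|s| {
        // iter_mut yields one exclusive &mut Vec<f64> per row, so no two
        // threads can ever write to the same row.
        for (i, row) in preds.iter_mut().enumerate() {
            s.spawn(move || {
                for (j, slot) in row.iter_mut().enumerate() {
                    *slot = means[i] + j as f64;
                }
            });
        }
    });
    // All spawned threads have been joined once the scope ends.
    preds
}

fn main() {
    let preds = fill_rows(3, &[0.0, 10.0]);
    println!("{:?}", preds); // [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]
}
```

This sketch spawns one thread per row, which is only reasonable for a handful of rows. With 67000 rows you would want a thread pool, which is what rayon provides.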
Alternatively with OsRng, it's just a marker ZST, so you can refer to it as a value:
fn rprednorm(n: i32, means: Vec<f64>, sds: Vec<f64>) -> Vec<Vec<f64>> {
let mut preds = vec![vec![0.0; n as usize]; means.len()];
preds.par_iter_mut().enumerate().for_each(|(i, e)| {
(0..n).into_iter().for_each(|j| {
let normal = Normal::new(means[i], sds[i]).unwrap();
e[j as usize] = normal.sample(&mut rand::rngs::OsRng);
})
});
preds
}
StdRng doesn't seem very suitable for this use case, as you'll either have to create one per top-level iteration to get different samplings, or you'll have to initialise a base rng then clone it once per task, and they'll all have the same sequence (as they'll share a seed).
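To illustrate the cloning problem without depending on the rand crate, here is a small sketch with a toy generator. ToyRng and first_draws are hypothetical names, and the generator (a SplitMix64-style mixer) is only a stand-in for StdRng, not rand's API. It shows that two clones of the same base generator produce identical draws, while generators created from different seeds do not.

```rust
// Toy deterministic generator (SplitMix64-style), standing in for StdRng.
#[derive(Clone)]
struct ToyRng {
    state: u64,
}

impl ToyRng {
    fn new(seed: u64) -> Self {
        ToyRng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

// Hypothetical helper: collects the first k draws from a generator.
fn first_draws(rng: &mut ToyRng, k: usize) -> Vec<u64> {
    (0..k).map(|_| rng.next_u64()).collect()
}

fn main() {
    let base = ToyRng::new(42);

    // Cloning the base generator copies its state, so both "tasks"
    // see exactly the same sequence.
    let mut task_a = base.clone();
    let mut task_b = base.clone();
    assert_eq!(first_draws(&mut task_a, 4), first_draws(&mut task_b, 4));

    // Creating one generator per task from distinct seeds gives
    // distinct sequences in this toy.
    let mut task_c = ToyRng::new(42);
    let mut task_d = ToyRng::new(43);
    assert_ne!(first_draws(&mut task_c, 4), first_draws(&mut task_d, 4));

    println!("clones repeat; separately seeded generators differ");
}
```

Deriving the seed from a task index is just the simplest way to get distinct streams in this toy. Whether a given seeding scheme is adequate for a real generator is a separate question.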

How to concat vector elements and store it back to vector?

I have a vector of a vector and need to concatenate the second one to the first (it's ok if the second one is dropped), i.e.
f([[1,2,3], [4,5,6]]) => [[1,2,3,4,5,6], []]
or
f([[1,2,3], [4,5,6]]) => [[1,2,3,4,5,6], [4,5,6]]
Both are okay.
My initial solution is:
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
items[0].append(&mut items[1]);
}
But it has a compile time error due to 2 mutable borrows:
| items[0].append(&mut items[1]);
| ----- ------ ^^^^^ second mutable borrow occurs here
| | |
| | first borrow later used by call
| first mutable borrow occurs here
I could solve it with Box / Option, but I wonder whether there are better ways to solve this?
My solution with Box:
fn solution_with_box() {
let mut items = Vec::new();
items.push(Box::new(vec![1,2,3]));
items.push(Box::new(vec![4,5,6]));
let mut second = items[1].clone();
items[0].as_mut().append(second.as_mut());
}
My solution with Option:
fn solution_with_option() {
let mut items = Vec::new();
items.push(Some(vec![1,2,3]));
items.push(Some(vec![4,5,6]));
let mut second = items[1].take();
items[0].as_mut().unwrap().append(second.as_mut().unwrap());
}
You can clone the data of items[1] as follows:
fn main() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let mut a: Vec<i32> = items[1].clone();
items[0].append(&mut a);
}
If you don't want to clone the data, you can use mem::take as suggested by @trentcl:
fn main() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let second = std::mem::take(&mut items[1]);
items[0].extend(second);
println!("{:?}", items);
}
This is not the fastest way of doing it, but it solves your problem.
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
//we can not have two mutable references in the same scope
// items[0].append(&mut items[1]);
// instead you can flatten vector
let first = items.into_iter().flatten().collect(); // we consume items so its no longer available
let items = vec![first, vec![]];
println!("{:?}", items); // [[1,2,3,4,5,6], []]
}
You can use split_at_mut on a slice or vector to get mutable references to non-overlapping parts that can't interfere with each other, so that you can mutate the first inner vector and the second inner vector at the same time.
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
let (contains_first, contains_second) = items.split_at_mut(1);
contains_first[0].append(&mut contains_second[0]);
dbg!(items);
No copying or cloning occurs. Note that contains_second[0] corresponds to items[1] because the second slice split_at_mut returns starts indexing at wherever the split point is (here, 1).
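The same trick generalises to any two distinct indices. As a sketch, the hypothetical helper pair_mut below (not a standard library function) splits the slice at the larger index and hands back mutable references to both elements in the order they were requested:

```rust
// Hypothetical helper: returns mutable references to two distinct elements.
// Panics if i == j or if either index is out of bounds.
fn pair_mut<T>(slice: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "indices must differ");
    if i < j {
        // left covers [0, j), right starts at j, so right[0] is slice[j].
        let (left, right) = slice.split_at_mut(j);
        (&mut left[i], &mut right[0])
    } else {
        // left covers [0, i), right starts at i, so right[0] is slice[i].
        let (left, right) = slice.split_at_mut(i);
        (&mut right[0], &mut left[j])
    }
}

fn main() {
    let mut items = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8]];
    let (dst, src) = pair_mut(&mut items, 0, 2);
    dst.append(src); // moves the elements, leaving items[2] empty
    println!("{:?}", items); // [[1, 2, 3, 7, 8], [4, 5, 6], []]
}
```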
You can solve the problem in two steps:
1. Append an empty vector at the end.
2. Remove items[1] and append its elements to items[0].
fn problem() {
let mut items = Vec::new();
items.push(vec![1,2,3]);
items.push(vec![4,5,6]);
items.push(vec![0;0]);
let v = items.remove(1);
items[0].extend(v);
}

Getting a random number in a function in OCAML OR Telling compiler to evaluate function each time

I'm new to OCaml and was playing around with putting a marker on a random 5x5 square. I've written the example program below. "silly_method1" works, but notice that it takes an argument. I don't really have an argument to pass in for what I want. I'm just asking for a random number to create my robot on a particular square:
let create = {x = ( Random.int 4); y=3; face = North}
However, I get the same location each time. This makes sense to me... sort of. I'm assuming that the way I've set it up, "create" is basically a constant. It's evaluated once and that's it! I've fixed it below in silly_method2 but look how ugly it is!
let silly_method2 _ = (Random.int 10)
Every time I have to call it, I have to pass in an argument even though I'm not really using it.
What is the correct way to do this? There must be some way to have a function that takes no arguments and passes back a random number (or random tuple, etc.)
And possibly related... Is there a way to tell OCaml not to evaluate the function once and save the result but rather recalculate the answer each time?
Thank you for your patience with me!
Dave
let _ = Random.self_init()
let silly_method1 x = x + (Random.int 10)
let silly_method2 _ = (Random.int 10)
let report1 x = (print_newline(); print_string("report1 begin: "); print_int (silly_method1 x); print_string("report1 end"); print_newline(); )
let report2 y = (print_newline(); print_string("report2 begin: "); print_int(silly_method2 y ); print_string("report2 end"); print_newline(); )
let _ = report1 3
let _ = report1 3
let _ = report1 3
let _ = report2 3
let _ = report2 3
let _ = report2 3
The idiomatic way to define a function in OCaml that doesn't take an argument is to have the argument be (), which is a value (the only value) of type unit:
# let f () = Random.int 10;;
val f : unit -> int = <fun>
# f ();;
- : int = 5
# f ();;
- : int = 2
OCaml doesn't save function results for later re-use. If you want this behavior you have to ask for it explicitly using lazy.

Is this a valid implementation of `std::mem::drop`?

According to The Rust Programming Language, ch15-03, std::mem::drop takes an object, receives its ownership, and calls its drop function.
That's what this code does:
fn my_drop<T>(x: T) {}
fn main() {
let x = 5;
let y = &x;
let mut z = 4;
let v = vec![3, 4, 2, 5, 3, 5];
my_drop(v);
}
Is this what std::mem::drop does? Does it perform any other cleanup tasks other than these?
Let's take a look at the source:
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn drop<T>(_x: T) { }
#[inline] gives a hint to the compiler that the function should be inlined. #[stable] is used by the standard library to mark APIs that are available on the stable channel. Otherwise, it's really just an empty function! When _x goes out of scope as drop returns, its destructor is run; there is no other way to perform cleanup tasks implicitly in Rust.
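One way to watch this happen is to give a type a Drop impl that records when it runs. The sketch below uses a hypothetical Noisy type and a shared log (neither is part of the standard library), with my_drop written the same way as std::mem::drop:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records when its destructor runs.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Same shape as std::mem::drop: take ownership, do nothing.
fn my_drop<T>(_x: T) {}

fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let a = Noisy { name: "a", log: Rc::clone(&log) };
        let _b = Noisy { name: "b", log: Rc::clone(&log) };
        // `a` is moved into my_drop and destroyed when my_drop returns.
        my_drop(a);
        log.borrow_mut().push("after my_drop".to_string());
    } // `_b` goes out of scope here and is destroyed.
    let entries = log.borrow().clone();
    entries
}

fn main() {
    // Prints ["dropped a", "after my_drop", "dropped b"]
    println!("{:?}", run());
}
```

The value passed to my_drop is destroyed before the next statement runs, while the other value lives until the end of its scope.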

Is there any syntax in F# which allows me to repeat the command lines I've already wrote? [closed]

Closed 10 years ago.
Say I have a parameter x and several lines that use x to calculate y. There are 10 values of x, and I need to calculate the corresponding y for each one. I don't want to change x by hand and rerun my commands 10 times. Is there any syntax in F# that lets me repeat the lines I've already written, so that a single execution works out all 10 values of y?
Thanks in advance
EDITED: I pasted my code below. Basically, what I want is to get the alphas for different parameter combinations. My parameters are "shreshold", "WeeksBfReport" and "DaysBfExecution". I have 30 parameter combinations, so I don't want to change the parameters and run the commands 30 times. Is there any way to avoid this?
let shreshold= 2.0
let ReportDate = "2008/12/15"
let ExeDate = "2009/01/05"
let WeeksBfReport = 1
let DaysBfExecution = 3
let Rf=0.01
let DateIn=ReportDate.ToDateTimeExact("yyyy/MM/dd").AddWeeks(-WeeksBfReport)
let DateOut=ExeDate.ToDateTimeExact("yyyy/MM/dd").AddWorkDays(-DaysBfExecution)
let DateInString=DateIn.ToString("yyyy/MM/dd")
let DateOutString=DateOut.ToString("yyyy/MM/dd")
let mutable FundMV=0.
let FundTicker=csvTable.AsEnumerable().Select(fun (x:DataRow) -> x.Field<string>("Ticker")).ToArray()
for i in 0..csvTable.Rows.Count-1 do
let FundUnitPrice= float(csvTable.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = FundTicker.[i]).First().Field<string>(DateInString))
let FundShares= float(csvTable1.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = FundTicker.[i]).First().Field<string>(DateInString))
FundMV<-FundMV + FundUnitPrice*FundShares
printfn "%e" FundMV
//use TMV to calculate weights of CSI300 constitutes
let mutable csiTMV=0.
let CSITMV : float array = Array.zeroCreate 300
let DictionaryCSI = Dictionary<String,float>()
for i in 0..299 do
let TMV=float(csvTable3.Rows.[i].Field<string>(DateInString))
csiTMV<-csiTMV + TMV
CSITMV.[i] <- TMV
for i in 0..299 do
let Weight=CSITMV.[i]/csiTMV
DictionaryCSI.[csvTable3.Rows.[i].Field<string>("Stock")]<-Weight
let DictionaryOldOut = Dictionary<String,float>()
let array=csvTable2.AsEnumerable().Select(fun (x:DataRow) -> x.Field<string>("Stock")).ToArray()
let OldOutTMV=ResizeArray<float>()
let DictionaryOldOutWeight = Dictionary<string,float>()
let OldOutWeight : float array = Array.zeroCreate (csvTable2.Rows.Count/2)
for i in 0..(csvTable2.Rows.Count/2)-1 do
let Weight=DictionaryCSI.Item(array.[i+(csvTable2.Rows.Count/2)])
DictionaryOldOutWeight.[csvTable2.Rows.[i+csvTable2.Rows.Count/2].Field<string>("Stock")]<-Weight
OldOutWeight.[i]<- Weight
DictionaryOldOut.[csvTable2.Rows.[i+csvTable2.Rows.Count/2].Field<string>("Stock")]<- Weight*FundMV //OldOut Moving Value
OldOutTMV.Add(Weight)
let OldOutTMVarray=OldOutTMV.ToArray() //create an array of OldOut weights and then sum up
let SumOldOutTMV=Array.fold (+) 0. OldOutTMVarray
let mutable NewInTMV=0.
let NewInWeight : float array = Array.zeroCreate (csvTable2.Rows.Count/2)
let DictionaryNewIn = Dictionary<string,float>()
let DictionaryNewInWeight = Dictionary<string,float>()
for i in 0..csvTable3.Rows.Count-300-1 do
let TMV=float(csvTable3.Rows.[i+300].Field<string>(DateInString))
NewInTMV<-NewInTMV + TMV
let Weight=TMV/(csiTMV+NewInTMV-SumOldOutTMV)
NewInWeight.[i]<-Weight
DictionaryNewInWeight.[csvTable3.Rows.[i+300].Field<string>("Stock")]<-Weight
let MovingValue=Weight*FundMV
DictionaryNewIn.[csvTable3.Rows.[i+300].Field<string>("Stock")]<-MovingValue //NewIn Moving Value
let table2array=csvTable2.AsEnumerable().Select(fun (x:DataRow) -> x.Field<string>("Stock")).ToArray()
let NewInturnoverArray : float array = Array.zeroCreate (csvTable2.Rows.Count/2)
for i in 0..(csvTable2.Rows.Count/2)-1 do
let lastday= float(csvTable2.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = table2array.[i]).First().Field<string>(DateInString))
let turnover = csvTable2.Rows.[i].ItemArray.Skip(3)|>Seq.map(fun (x:obj)-> System.Double.Parse(x.ToString()))|>Seq.toArray
let lastdayindex : (int) =
if lastday= 0. then
let lastdayfake=float(csvTable2.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = table2array.[i+2]).First().Field<string>(DateInString))
let turnoverfake = csvTable2.Rows.[i+2].ItemArray.Skip(3)|>Seq.map(fun (x:obj)-> System.Double.Parse(x.ToString()))|>Seq.toArray
Array.findIndex(fun elem -> elem=lastdayfake) turnoverfake
else
let lastdayfake=lastday
let turnoverfake=turnover
Array.findIndex(fun elem -> elem=lastdayfake) turnoverfake
printfn "%A" lastdayindex
let TurnoverNeed : float array = Array.zeroCreate 21
for t in 0..20 do
TurnoverNeed.[t] <- turnover.[lastdayindex - 20 + t]
let zerotwo : float array = Array.zeroCreate TurnoverNeed.Length
if TurnoverNeed=zerotwo then
let ave_daily_turnover = 0.
NewInturnoverArray.[i] <- ave_daily_turnover
else
let ave_daily_turnover = Seq.average(TurnoverNeed|>Seq.filter(fun x-> x > 0.))
NewInturnoverArray.[i] <- ave_daily_turnover
type totalinfo = {Name:String;Shock:float}
let NewIn=ResizeArray<totalinfo>()
for i in 0..(csvTable2.Rows.Count/2)-1 do
let MovingValue=DictionaryNewIn.Item(array.[i])
let Shock=MovingValue/NewInturnoverArray.[i]
let V= {Name=string(array.[i]); Shock=Shock}
NewIn.Add(V)
let NewInShock=NewIn.ToArray()
let OldOutturnoverArray : float array = Array.zeroCreate (csvTable2.Rows.Count/2)
for i in 0..(csvTable2.Rows.Count/2)-1 do
let turnover = csvTable2.Rows.[i+csvTable2.Rows.Count/2].ItemArray.Skip(3)|>Seq.map(fun (x:obj)-> System.Double.Parse(x.ToString()))
let zero : float array = Array.zeroCreate (turnover|>Seq.toArray).Length
if turnover|>Seq.toArray=zero then
let ave_daily_turnover = 0.
OldOutturnoverArray.[i] <- ave_daily_turnover
else
let ave_daily_turnover = Seq.average(turnover|>Seq.filter(fun x-> x > 0.))
OldOutturnoverArray.[i] <- ave_daily_turnover
let OldOut=ResizeArray<totalinfo>()
for i in 0..(csvTable2.Rows.Count/2)-1 do
let MovingValue=DictionaryOldOut.Item(array.[i+csvTable2.Rows.Count/2])
let Shock=MovingValue/OldOutturnoverArray.[i]
let V= {Name=string(array.[i+csvTable2.Rows.Count/2]); Shock=Shock}
OldOut.Add(V)
let OldOutShock=OldOut.ToArray()
let DoIn=NewInShock |> Array.filter (fun t -> t.Shock >= shreshold)
let DoOut=OldOutShock |> Array.filter (fun t -> t.Shock >= shreshold)
let DoInTicker= Array.map (fun e -> e.Name) DoIn
let DoOutTicker= Array.map (fun e -> e.Name) DoOut
let DoInWeight : float array = Array.zeroCreate DoInTicker.Length
for i in 0..DoInTicker.Length-1 do
DoInWeight.[i] <- DictionaryNewInWeight.Item(DoInTicker.[i])
let TotalDoInWeight= Array.fold (+) 0. DoInWeight
let DoInRatioX : float array = Array.zeroCreate DoInTicker.Length
for i in 0..(DoInTicker.Length)-1 do
DoInRatioX.[i] <- DoInWeight.[i]/TotalDoInWeight
let Beta=csvTable2.AsEnumerable().Select(fun (x:DataRow) -> x.Field<string>("Beta")).ToArray()
//let NewInBeta : float array = Array.zeroCreate (csvTable2.Rows.Count/2)
let DictionaryNewInBeta = Dictionary<string,float>()
for i in 0..(csvTable2.Rows.Count/2)-1 do
// NewInBeta.[i] <- float(Beta.[i])
DictionaryNewInBeta.[csvTable3.Rows.[i+300].Field<string>("Stock")]<-float(Beta.[i])
let DoInBeta : float array = Array.zeroCreate DoInTicker.Length
for i in 0..DoInTicker.Length-1 do
DoInBeta.[i] <- DictionaryNewInBeta.Item(DoInTicker.[i])
let mutable PortfolioBeta=0.
for i in 0..(DoInTicker.Length)-1 do
PortfolioBeta <- PortfolioBeta + DoInRatioX.[i] * DoInBeta.[i]
let mutable PortfolioReturn= 0.
for i in 0..DoInTicker.Length-1 do
let PriceIn= float(csvTable4.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = DoInTicker.[i]).First().Field<string>(DateInString))
let PriceOut= float(csvTable4.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = DoInTicker.[i]).First().Field<string>(DateOutString))
PortfolioReturn <- PortfolioReturn + (1./float(DoInTicker.Length))*(PriceOut - PriceIn)/PriceIn
let IndexIn= float(csvTable4.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = "000300.SH").First().Field<string>(DateInString))
let IndexOut= float(csvTable4.AsEnumerable().Where(fun (x:DataRow) -> x.Field<string>(0) = "000300.SH").First().Field<string>(DateOutString))
let MarketReturn= (IndexOut-IndexIn)/IndexIn
let Alpha= PortfolioReturn-Rf-PortfolioBeta*(MarketReturn-Rf)
Like John said, put it all into a function accepting the changing values as parameters. You can use records to store the parameter combinations in a list, like so:
type ReportParameters = {
shreshold: float;
ReportDate: string;
ExeDate: string;
WeeksBfReport: int;
DaysBfExecution: int;
Rf: float;
}
type Report = {
NewInShock: totalinfo array; // NewIn.ToArray() yields an array, not a single record
IndexIn: float;
// etc
}
let createReport (reportParams:ReportParameters) : Report =
let shreshold = reportParams.shreshold
let ReportDate = reportParams.ReportDate
let ExeDate = reportParams.ExeDate
let WeeksBfReport = reportParams.WeeksBfReport
let DaysBfExecution = reportParams.DaysBfExecution
let Rf = reportParams.Rf
// Your function code HERE
// Remember to move all type definitions out of this scope.
{ // Report data to return.
NewInShock = NewInShock;
IndexIn = IndexIn;
// etc
}
Using the code is as simple as this:
let reportsToBeGenerated = [
{ shreshold = 2.0; ReportDate = "2008/12/15"; ExeDate = "2009/01/05"; WeeksBfReport = 1; DaysBfExecution = 3; Rf = 0.01 };
{ shreshold = 1.5; ReportDate = "2009/12/15"; ExeDate = "2010/01/05"; WeeksBfReport = 2; DaysBfExecution = 2; Rf = 0.01 };
]
let reports = reportsToBeGenerated |> List.map createReport
I'm also not entirely sure what you need (what are other constraints and the motivation), but if you have some interactive code that makes a single calculation, say:
let x = 10
let y = x * x
You can turn it into code that does the same calculation on multiple inputs using e.g. lists:
let xs = [1; 10; 100]
let ys = [ for x in xs -> x * x ]
But as mentioned earlier, it depends on what you actually want to achieve - if you can add a realistic example of what you're trying to do, that would be useful.
Looking at your code, you want to do something like this:
let run shreshold ReportDate ExeDate WeeksBfReport DaysBfExecution Rf =
//The entire rest of the code indented - you may want to return alpha etc
Then you can just plug in the parameter values.
