I have a module with functions where I need to ensure that all functions use the same random values. I currently have two solutions, resetting the seed at each call:
using Distributions
function random_values(n)
Random.seed!(1)
rand(Normal(), n)
end
or similarly, instantiating it directly:
using Distributions
function random_values(n)
rand(MersenneTwister(1), Normal(), n)
end
This works but I have several functions and the code becomes a bit verbose. I would rather set a seed at the module level so that all functions use the same. How can I best achieve this?
I believe that what you want is to have two or more functions (f1, f2, ...) that each call rand(). You want them in "sync" so that with a series of function calls each function gets the same series of values for each sequential call of each function (so that the third call to f2 uses the same rand() value as the third call to f1).
For this, you are best off explicitly passing a duplicated RNG object to each of your functions after initializing each to the same seed:
init = 1234
rngvector(n) = [MersenneTwister(init) for _ in 1:n]
RNG = rngvector(3)
function f1(rng, ...)
x = rand(rng, ...)
...
end
function f2(rng, ...)
x = rand(rng, ...)
...
end
f1(RNG[1], ...)
will then be in sync with
f2(RNG[2], ...).
This does mean more coding since you add one more argument to each function, but the flexibility and reproducibility that initial extra coding gives you with your Monte Carlo methods when coding the rest of what you do may make it very worthwhile.
Related
Suppose if I have the following in Julia:
mutable struct emptys
begin_time::Dict{Float64,Float64}; finish_time::Dict{Float64,Float64}; Revenue::Float64
end
population = [emptys(Dict(),Dict(),-Inf) for i in 1:n_pop] #n_pop is a large positive integer value.
for ind in 1:n_pop
r = rand()
append!(population[ind].Revenue, r)
append!(population[ind].begin_time, Dict(r=>cld(r^2,rand())))
append!(population[ind].finish_time, Dict(r=>r^3/rand()))
end
Now I want to sort this population based on the Revenue value. Is there any way for Julia to achieve this? If I were to do it in Python it would be something like this:
sorted(population, key = lambda x: x.Revenue) # The population in Python can be prepared using https://pypi.org/project/ypstruct/ library.
Please help.
There is a whole range of sorting functions in Julia. The key functions are sort (corresponding to Python's sorted) and sort! (corresponding to Python's list.sort).
And as in Python, they have a couple of keyword arguments, one of which is by, corresponding to key.
Hence the translation of
sorted(population, key = lambda x: x.Revenue)
would be
getrevenue(e::emptys) = e.Revenue
sort(population, by=getrevenue)
Or e -> e.Revenue, but having a getter function is good style anyway.
I wrote a function that acts on each combination of columns in an input matrix. It uses multiple for loops and is very slow, so I am trying to parallelize it to use the maximum number of threads on my computer.
I am having difficulty finding the correct syntax to set this up. I'm using the Parallel package in octave, and have tried several ways to set up the calls. Here are two of them, in a simplified form, as well as a non-parallel version that I believe works:
function A = parallelExample(M)
pkg load parallel;
# Get total count of columns
ct = columns(M);
# Generate column pairs
I = nchoosek([1:ct],2);
ops = rows(I);
slice = ones(1, ops);
Ic = mat2cell(I, slice, 2);
## # Non-parallel
## A = zeros(1, ops);
## for i = 1:ops
## A(i) = cmbtest(Ic{i}, M);
## endfor
# Parallelized call v1
A = parcellfun(nproc, #cmbtest, Ic, {M});
## # Parallelized call v2
## afun = #(x) cmbtest(x, M);
## A = parcellfun(nproc, afun, Ic);
endfunction
# function to apply
function P = cmbtest(indices, matrix)
colset = matrix(:,indices);
product = colset(:,1) .* colset(:,2);
P = sum(product);
endfunction
For both of these examples I generate every combination of two columns and convert those pairs into a cell array that the parcellfun function should split up. In the first, I attempt to convert the input matrix M into a 1x1 cell array so it goes to each parallel instance in the same form. I get the error 'C must be a cell array' but this must be internal to the parcellfun function. In the second, I attempt to define an anonymous function that includes the matrix. The error I get here specifies that 'cmbtest' is undefined.
(Naturally, the actual function I'm trying to apply is far more complex than cmbtest here)
Other things I have tried:
Put M into a global variable so it doesn't need to be passed. Seemed to be impossible to put a global variable in a function file, though I may just be having syntax issues.
Make cmbtest a nested function so it can access M (parcellfun doesn't support that)
I'm out of ideas at this point and could use help figuring out how to get this to work.
Converting my comments above to an answer.
When performing parallel operations, it is useful to think of each parallel worker that will result as separate and independent octave instances, which need to have appropriate access to all functions and variables they will require in order to do their independent work.
Therefore, do not rely on subfunctions when calling parcellfun from a main function, since this might lead to errors if the worker is unable to access the subfunction directly under the hood.
In this case, separating the subfunction into its own file fixed the problem.
The matrix-qr function in Racket's math library outputs two values. I know about call-with-values to put both output values into the next function you want.
However, how can I take each individual output and put define some constant with that value? The QR function outputs a Q matrix and an R matrix. I need something like:
(define Q ...)
(define R ...)
Also, how could I just use one of the outputs from a function that outputs two values?
The usual way to create definitions for multiple values is to use define-values, which pretty much works like you’d expect.
(define-values (Q R) ; Q and R are defined
(matrix-qr (matrix [[12 -51 4]
[ 6 167 -68]
[-4 24 -41]])))
There is also a let equivalent for multiple values, called let-values (as well as let*-values and letrec-values).
Ignoring values is harder. There is no function like (first-value ...), for example, because ordinary function application does not produce a continuation that can accept multiple values. However, you can use something like match-define-values along with the _ “hole marker” to ignore values and simply not bind them.
(match-define-values (Q _) ; only Q is defined
(matrix-qr (matrix [[12 -51 4]
[ 6 167 -68]
[-4 24 -41]])))
It is theoretically possible to create a macro that could either convert multiple values to a list or simply only use a particular value, but in general this is avoided. Returning multiple values should not be done lightly, which is why, for almost all functions that return them, it wouldn’t usually make much sense to use one of the values but ignore the other.
I have following problem with my function which should return a random numer. When I want to generate a couple of numbers by calling that function, they are exactly the same. How can I fix the problem of returning the same number all the time when I call the function? I need that random to keep in function.
Here is the code:
with Ada.Numerics.discrete_Random
function generate_random_number ( n: in Positive) return Integer is
subtype Rand_Range is Integer range 0 .. n;
package Rand_Int is new Ada.Numerics.Discrete_Random(Rand_Range);
use Rand_Int;
gen : Rand_Int.Generator;
ret_val: Rand_Range;
begin
Rand_Int.Reset(gen);
ret_val := Random(gen);
return ret_val;
end;
The random generator gen should not be local to the function. Currently you are creating it anew on each generate_random_number call, and initialising it, so it's not surprising you always get the same result.
If you make gen - for instance - a global variable, initialised once, then each time you use it you'll get a new random number. (Yes, global variables are bad : but see below)
Unfortunately this does not play particularly well with the semantics of function generate_random_number ( n: in Positive) which can restrict the range of the random number differently each time. The simplest solution is to make gen return any valid integer and use modular arithmetic to return a number in the correct range for each call. This will work but you should be aware that it potentially introduces cryptographic weaknesses, beyond my skills to analyze.
If that is the case you will need a different approach, for example creating a different random generator for each range you need; taking care to seed (reset) them differently otherwise there may again be crypto weaknesses such as correlations between different generators.
Now global variables are poor structure for all the usual reasons in any language. So a much better way is to make it a resource by wrapping it in a package.
package RandGen is
function generate_random_number ( n: in Positive) return Positive;
end RandGen;
And that is all the client needs to see. The package is implemented as follows:
with Ada.Numerics.discrete_Random;
package body RandGen is
subtype Rand_Range is Positive;
package Rand_Int is new Ada.Numerics.Discrete_Random(Rand_Range);
gen : Rand_Int.Generator;
function generate_random_number ( n: in Positive) return Integer is
begin
return Rand_Int.Random(gen) mod n; -- or mod n+1 to include the end value
end generate_random_number;
-- package initialisation part
begin
Rand_Int.Reset(gen);
end RandGen;
EDIT : Jacob's suggestion of rejecting out-of-range values is better but inefficient if n is much smaller than the generator range. One solution might be to create several generators and let the generate_random_number function choose the one which covers 0 .. N with least waste.
The Random procedure with only one parameter is supposed to initiate the random number generator with a "time-dependent state". However, if it is called more than once rapidly in succession, it is very likely that the "time" that the program uses will be the same each time, since the program is likely to run faster than the clock resolution. (Try your test with something like delay 0.5 in between the two uses of generate_random_number; most likely you will get two different results.)
In any event, the intended use of a random number generator is to use Reset on it once to set the seed, and then generate a sequence of random numbers starting with the seed. By doing this, you can also use the same seed every time in order to get a duplicatable sequence of random numbers, for testing; or you can use a time-dependent seed. However, resetting the random number generator every time you want a random number is not the normal or recommended way of using RNG's. Therefore, you need to move the Reset call out of generate_random_number. Unfortunately, this also means that you have to use the same generator even if the n parameter will be different on every call, which means you may be forced to use Float_Random instead of Discrete_Random.
P.S. Looking into it further, I found this in G.2.5: "Two different calls to the time-dependent Reset procedure shall reset the generator to different states, provided that the calls are separated in time by at least one second and not more than fifty years." [Emphasis mine.] I'm sure that when you tried it, the calls were less than one second apart.
Assuming that the upper limit can be determined when you start the program:
Package Random (specification):
with Generic_Random;
with Upper_Limit_Function;
package Random is new Generic_Random (Upper_Limit => Upper_Limit_Function);
Generic package Generic_Random (specification):
generic
Upper_Limit : Positive;
package Generic_Random is
subtype Values is Natural range 0 .. Upper_Limit;
function Value return Values;
end Generic_Random;
Generic package Generic_Random (body):
with Ada.Numerics.Discrete_Random;
package body Generic_Random is
package Random is new Ada.Numerics.Discrete_Random (Values);
Source : Random.Generator;
function Value return Values is
begin
return Random.Random (Source);
end Value;
begin
Random.Reset (Source);
end Generic_Random;
I'm trying to make a function that defines a vector that varies based on the function's input, and set! works great for this in Scheme. Is there a functional equivalent for this in OCaml?
I agree with sepp2k that you should expand your question, and give more detailed examples.
Maybe what you need are references.
As a rough approximation, you can see them as variables to which you can assign:
let a = ref 5;;
!a;; (* This evaluates to 5 *)
a := 42;;
!a;; (* This evaluates to 42 *)
Here is a more detailed explanation from http://caml.inria.fr/pub/docs/u3-ocaml/ocaml-core.html:
The language we have described so far is purely functional. That is, several evaluations of the same expression will always produce the same answer. This prevents, for instance, the implementation of a counter whose interface is a single function next : unit -> int that increments the counter and returns its new value. Repeated invocation of this function should return a sequence of consecutive integers — a different answer each time.
Indeed, the counter needs to memorize its state in some particular location, with read/write accesses, but before all, some information must be shared between two calls to next. The solution is to use mutable storage and interact with the store by so-called side effects.
In OCaml, the counter could be defined as follows:
let new_count =
let r = ref 0 in
let next () = r := !r+1; !r in
next;;
Another, maybe more concrete, example of mutable storage is a bank account. In OCaml, record fields can be declared mutable, so that new values can be assigned to them later. Hence, a bank account could be a two-field record, its number, and its balance, where the balance is mutable.
type account = { number : int; mutable balance : float }
let retrieve account requested =
let s = min account.balance requested in
account.balance <- account.balance -. s; s;;
In fact, in OCaml, references are not primitive: they are special cases of mutable records. For instance, one could define:
type 'a ref = { mutable content : 'a }
let ref x = { content = x }
let deref r = r.content
let assign r x = r.content <- x; x
set! in Scheme assigns to a variable. You cannot assign to a variable in OCaml, at all. (So "variables" are not really "variable".) So there is no equivalent.
But OCaml is not a pure functional language. It has mutable data structures. The following things can be assigned to:
Array elements
String elements
Mutable fields of records
Mutable fields of objects
In these situations, the <- syntax is used for assignment.
The ref type mentioned by #jrouquie is a simple, built-in mutable record type that acts as a mutable container of one thing. OCaml also provides ! and := operators for working with refs.