What is a thread-safe random number generator for Perl on Windows?

The core Perl function rand() is not thread-safe, and I need random numbers in a threaded Monte Carlo simulation.
I'm having trouble finding any notes on CPAN about which (if any) of the various random-number generators there are thread-safe, and every Google search I do keeps getting cluttered with C/C++/Python/anything but Perl. Any suggestions?

Do not use the built-in rand for Monte Carlo work on Windows. At the very least, run this test first:
my %r = map { rand() => undef } 1 .. 1_000_000;
print scalar keys %r, "\n";
If nothing has changed, it should print 32768, which is utterly unsuitable for any kind of serious work. And even if it does print a larger number, you're better off sticking with a PRNG with known good properties for simulation.
You can use Math::Random::MT.
You can instantiate a new Math::Random::MT object in each thread with its own seed or array of seeds; the Mersenne Twister has good properties for simulation.
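A minimal per-thread sketch (assuming the stock Math::Random::MT interface; the additive seed derivation here is purely illustrative, so substitute a proper seeding scheme for real work):

use strict;
use warnings;
use threads;
use Math::Random::MT;

my $master_seed = 42;
my @workers = map {
    my $seed = $master_seed + $_;    # illustrative per-thread seed
    threads->create(sub {
        # each thread owns its generator, so no state is shared
        my $mt  = Math::Random::MT->new($seed);
        my $sum = 0;
        $sum += $mt->rand() for 1 .. 1_000;    # rand() yields a value in [0, 1)
        return $sum;
    });
} 1 .. 4;

print $_->join, "\n" for @workers;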

Do you have /dev/urandom on your system?
BEGIN {
    open URANDOM, '<', '/dev/urandom' or die "Cannot open /dev/urandom: $!";
}

sub urand {    # drop-in replacement for rand
    my $expr = shift || 1;
    my $x;
    read URANDOM, $x, 4;
    return $expr * unpack("I", $x) / (2**32);
}

rand is thread-safe, and I think you have the wrong definition of what "thread safe" means. A function is not thread-safe when it modifies a shared data structure in a way that makes its execution in threaded mode unsafe.
Check the rand function documentation: notice that it takes EXPR as an argument, and in every thread you can provide a different EXPR.
http://perldoc.perl.org/functions/rand.html

Related

How to make a simulation in Verilog have different results every time if it has random values?

I want the same code to generate different output every time I run it, as it has random values assigned to some variables. Is there a way to do that, for example by seeding with the time as in C?
Sample code that has the randomization in it:
class ABC;
    rand bit [4 : 0] arr []; // dynamic array
    constraint arr_size {
        arr.size() >= 2;
        arr.size() <= 6;
    }
endclass

module constraint_array_randomization();
    ABC test_class;
    initial begin
        test_class = new();
        test_class.randomize();
        $display("The array has the value = %p ", test_class.arr);
    end
endmodule
I think this is probably dependent on the tool that is being used. For example, Xcelium from Cadence supports xrun -seed some_seed (Questa has -sv_seed some_seed, I think). I am certain all tools support something similar; look in your simulation tool's reference/manual/guide/help, as it may support a random seed for every simulation run.
Not sure if this is possible from inside the simulation.
As mentioned in the comments, for Questa, -sv_seed random should do the trick.
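For example (hypothetical invocations with a made-up top-level name tb_top; check your tool's manual for the exact flags):

xrun -seed 12345 tb_top.sv      # Xcelium: fixed seed, reproducible run
vsim -sv_seed random tb_top     # Questa: fresh random seed on every run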
Usually, having uncontrolled random seeding in a simulation creates repeatability issues. In other words, it would be very difficult to debug a failing case if you do not know the seed. But if you insist, then read on.
You can mimic the C way of randomizing with time. However, there is no good way in Verilog to access the system time; therefore, there is no good way to do time-based seeding from within the program.
However, as always, there is a work-around available. For example, one can use the $system call to get the system time (this is system-dependent). Then the srandom function can be used to set the seed. The following (Linux-based) example might work for you (or you can tune it for your system).
Here the time is obtained as Unix time from the date +'%s' command. The code writes it into a file and then reads it back as an int using $fopen/$fscanf.
module constraint_array_randomization();
    ABC test_class;
    int fh, today;
    initial begin
        // get system time
        $system("date +'%s' > date_file");   // write date into a file
        fh = $fopen("date_file", "r");
        void'($fscanf(fh, "%d", today));     // cast to void to avoid warnings
        $fclose(fh);
        $system("rm -f date_file");          // remove the file
        $display("time = %d", today);
        test_class = new();
        test_class.srandom(today);           // seed it
        test_class.randomize();
        $display("The array has the value = %p ", test_class.arr);
    end
endmodule

Performance of local variable vs. array access

I was doing some benchmarking of Perl performance and ran into a case that I thought was somewhat odd. Suppose you have a function which uses a value from an array multiple times. In this case, you often see code like:
sub foo {
    my $value = $array[17];
    do_something_with($value);
    do_something_else_with($value);
}
The alternative is not to create a local variable at all:
sub foo {
    do_something_with($array[17]);
    do_something_else_with($array[17]);
}
For readability, the first is clearer. I assumed that performance would be at least equal (or better) for the first case too - array lookup requires a multiply-and-add, after all.
Imagine my surprise when this test program showed the opposite. On my machine, re-doing the array lookup is actually faster than storing the result, until I increase ITERATIONS to 7; in other words, for me, creating a local variable is only worthwhile if it's used at least 7 times!
use Benchmark qw(:all);
use constant { ITERATIONS => 4, TIME => -5 };

# sample array
my @array = (1 .. 100);

cmpthese(TIME, {
    # local variable version
    'local_variable' => sub {
        my $index = int(rand(scalar @array));
        my $val = $array[$index];
        my $ret = '';
        for (my $i = 0; $i < ITERATIONS; $i++) {
            $ret .= $val;
        }
        return $ret;
    },
    # multiple array access version
    'multi_access' => sub {
        my $index = int(rand(scalar @array));
        my $ret = '';
        for (my $i = 0; $i < ITERATIONS; $i++) {
            $ret .= $array[$index];
        }
        return $ret;
    }
});
Result:
                   Rate local_variable multi_access
local_variable 245647/s             --          -5%
multi_access   257907/s             5%           --
It's not a HUGE difference, but it brings up my question: why is it slower to create a local variable and cache the array lookup, than to do the lookup again? Reading other S.O. posts, I've seen that other languages / compilers do have the expected outcome, and sometimes even transform these into the same code. What is Perl doing?
I've done more poking around at this today, and what I've determined is that scalar assignment of any sort is an expensive operation relative to the overhead of a one-deep array lookup.
This seems like it's just restating the initial question, but I feel I have found more clarity. If, for example, I modify my local_variable subroutine to do another assignment like so:
my $index = int(rand(scalar @array));
my $val = 0; # <- this is new
$val = $array[$index];
my $ret = '';
...the code suffers an additional 5% speed penalty beyond the single-assignment version - even though it does nothing but a dummy assignment to the variable.
I also tested to see if scope caused setup/teardown of $val to impede performance, by switching it to a global instead of a locally scoped variable. The difference is negligible (see comments to @zdim above), pointing away from construct/destruct as the performance bottleneck.
In the end, my confusion was based on the faulty assumption that scalar assignment should be fast. I am used to working in C, where copying a value to a local variable is an extremely quick operation (1-2 asm instructions).
As it turns out, this is not the case in Perl (though I don't know exactly why, and that's OK). Scalar assignment is a relatively slow operation, while whatever the Perl internals do to get at the nth element of an array is actually quite fast by comparison. The "multiply and add" I mentioned in the initial post is still far less work than the code behind scalar assignment.
That is why it takes so many lookups to match the performance of caching the result: simply assigning to the "cache" variable is ~7 times slower than one lookup (for my setup).
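A rough way to see this directly is an isolated micro-benchmark (illustrative only, not from the original test program; note that both subs also pay the subroutine-call overhead):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my @array = (1 .. 100);
my $index = 42;

cmpthese(-3, {
    # scalar creation + assignment, no array access
    assign => sub { my $v = 1; return $v },
    # one-deep array lookup, no new lexical scalar
    lookup => sub { return $array[$index] },
});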
Let's first turn the statement around: caching the lookup is expected to be faster because it avoids repeated lookups, even though it costs something itself, and it starts paying off once more than 7 lookups are done. Put that way, it's not so shocking, I think.
As to why it's slower for fewer than seven iterations... I'd guess that the cost of the scalar creation is still greater than those few lookups. It is surely greater than one lookup, yes? How about two, then? I'd say that "a few" may well be a good measure.

MT19937 does NOT reproduce the same pseudo-random sequence when the seed value is held constant

I'm writing a checkpoint function for my Monte Carlo simulation in Fortran 90/95; the compiler I'm using is ifort 18.0.2. Before going into detail, let me clarify which version of the pseudo-random generator I'm using:
A C-program for MT19937, with initialization, improved 2002/1/26.
Coded by Takuji Nishimura and Makoto Matsumoto.
Code converted to Fortran 95 by José Rui Faustino de Sousa
Date: 2002-02-01
See mt19937 for the source code.
The general structure of my Monte Carlo simulation code is given below:
program montecarlo
    call read_iseed(...)
    call mc_subroutine(...)
end
Within read_iseed:
subroutine read_iseed(...)
    use mt19937
    if (Restart == 'n') then
        call system('od -vAn -N4 -td4 < /dev/urandom > '//trim(IN_ISEED))
        open(unit=7,file=trim(IN_ISEED),status='old')
        read(7,*) i
        close(7)
        !This is only used to initialise the PRNG sequence
        iseed = abs(i)
    else if (Restart == 'y') then
        !Taking seed value from the latest iteration of previous simulation
        iseed = RestartSeed
    endif
    call init_genrand(iseed)
    print *, 'first pseudo-random value ',genrand_real3(), 'iseed ',iseed
    return
end subroutine
Based on my understanding, if the seed value is held constant, the PRNG should reproduce the same pseudo-random sequence every time, right?
To prove this is the case, I ran two individual simulations using the same seed value, and they reproduced the exact same sequence. So far so good!
Based on the previous test, I further assumed that regardless of how many times init_genrand() is called within one individual simulation, the PRNG should reproduce the same pseudo-random sequence after each call. So I made a small modification to my read_iseed() subroutine:
subroutine read_iseed(...)
    use mt19937
    if (Restart == 'n') then
        call system('od -vAn -N4 -td4 < /dev/urandom > '//trim(IN_ISEED))
        open(unit=7,file=trim(IN_ISEED),status='old')
        read(7,*) i
        close(7)
        !This is only used to initialise the PRNG sequence
        iseed = abs(i)
    else if (Restart == 'y') then
        !Taking seed value from the latest iteration of the previous simulation
        iseed = RestartSeed
    endif
    call init_genrand(iseed)
    print *, 'first time initialisation ',genrand_real3(), 'iseed ',iseed
    call init_genrand(iseed)
    print *, 'second time initialisation ',genrand_real3(), 'iseed ',iseed
    return
end subroutine
The output is surprisingly not what I expected: the iseed outputs are identical between the two initializations, but the genrand_real3() outputs are not.
Because of this unexpected result, I am struggling to resume the simulation from an arbitrary state of the system, since the simulation does not reproduce the latest configuration state of the system I'm simulating.
I'm not sure if I've provided enough information; please let me know if any part of this question needs to be more specific.
From the source code you've provided (see http://web.mst.edu/~vojtat/class_5403/mt19937/mt19937ar.f90), init_genrand does not clear the whole state.
There are 3 critical state variables:
integer( kind = wi ) :: mt(n) ! the array for the state vector
logical( kind = wi ) :: mtinit = .false._wi ! means mt[N] is not initialized
integer( kind = wi ) :: mti = n + 1_wi ! mti==N+1 means mt[N] is not initialized
The first one is the "array for the state vector", second one is a flag that ensures we don't start with uninitialized array, and the third one is some position marker, as I guess from the condition stated in the comment.
Looking at subroutine init_genrand( s ), it sets the mtinit flag and fills the mt() array from 1 up to n. Alright.
Looking at genrand_real3 it's based on genrand_int32.
Looking at genrand_int32, it starts up with
if ( mti > n ) then ! generate N words at one time
    ! if init_genrand() has not been called, a default initial seed is used
    if ( .not. mtinit ) call init_genrand( seed_d )
and does its arithmetic magic and then starts getting the result:
y = mt(mti)
mti = mti + 1_wi
so... mti is a positional index into the state array, and it is incremented by 1 after each integer read from the generator.
Back to init_genrand (remember?): it has been resetting the array mt(), but it has not reset mti back to its starting value of n + 1_wi.
I bet this is the cause of the phenomenon you've observed: after re-initializing with the same seed, the array is filled with the same set of values, but the int32 generator then reads from a different starting point. I doubt this was intended, so it's probably a tiny bug that is easy to overlook.
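A minimal fix sketch (untested, assuming the module variables quoted above): reset the position marker at the end of init_genrand, so the next call to genrand_int32 regenerates its block from the freshly seeded state.

subroutine init_genrand( s )
    integer( kind = wi ), intent( in ) :: s
    ! ... existing body: set mtinit and fill mt(1) to mt(n) from the seed s ...
    mti = n + 1_wi  ! reset the read position so the state array is regenerated
end subroutine init_genrand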

Perl fast matrix multiply

I have implemented the following statistical computation in perl http://en.wikipedia.org/wiki/Fisher_information.
The results are correct; I know this because I have hundreds of test cases that match input to output. The problem is that I need to compute this many times every single time I run the script; the average number of calls to this function is around 530. I used Devel::NYTProf to find this out, as well as where the slow parts are. I have optimized the algorithm to traverse only the top half of the matrix and reflect it onto the bottom half, as they are the same. I'm not a Perl expert, but I need to know if there is anything I can try to speed up the Perl. This script is distributed to clients, so compiling a C file is not an option. Is there another Perl library I can try? This needs to be sub-second in speed if possible.
More information: $MatrixRef is a matrix of floating-point numbers that is $rows by $variables. Here is the NYTProf dump for the function:
#-----------------------------------------------
#
#-----------------------------------------------
sub ComputeXpX
# spent 4.27s within ComputeXpX which was called 526 times, avg 8.13ms/call:
# 526 times (4.27s+0s) by ComputeEfficiency at line 7121, avg 8.13ms/call
{
526     0s          my ($MatrixRef, $rows, $variables) = @_;
526     0s          my $r = 0;
526     0s          my $c = 0;
526     0s          my $k = 0;
526     0s          my $sum = 0;
526     0s          my @xpx = ();
526     11.0ms      for ($r = 0; $r < $variables; $r++)
                    {
14202   19.0ms          my @temp = (0) x $variables;
14202   6.01ms          push(@xpx, \@temp);
526     0s          }
526     7.01ms      for ($r = 0; $r < $variables; $r++)
                    {
14202   144ms           for ($c = $r; $c < $variables; $c++)
                        {
198828  43.0ms              $sum = 0;
                            #for ($k = 0; $k < $rows; $k++)
198828  101ms               foreach my $RowRef (@{$MatrixRef})
                            {
                                #$sum += $MatrixRef->[$k]->[$r]*$MatrixRef->[$k]->[$c];
6362496 3.77s                   $sum += $RowRef->[$r]*$RowRef->[$c];
                            }
198828  80.1ms              $xpx[$r]->[$c] = $sum;
                            #reflect on other side of matrix
198828  82.1ms              $xpx[$c]->[$r] = $sum if ($r != $c);
14202   1.00ms          }
526     2.00ms      }
526     2.00ms      return \@xpx;
}
Since each element of the result matrix can be calculated independently, it should be possible to calculate some/all of them in parallel. In other words, none of the instances of the innermost loop depend on the results of any other, so they could run simultaneously on their own threads.
There really isn't much you can do here without rewriting parts in C, or moving to a better framework for mathematical operations than bare-bones Perl (→ PDL! See the sketch after the optimization ideas below.)
Some minor optimization ideas:
You initialize @xpx with arrayrefs containing zeros. This is unnecessary, as you assign a value to every position either way. If you want to pre-allocate array space, assign to the $#array value:
my @array;
$#array = 100; # preallocate space for 101 scalars
This isn't generally useful, but you can benchmark with and without.
Iterate over ranges; don't use C-style for loops:
for my $c ($r .. $variables - 1) { ... }
Perl scalars aren't very fast for math operations, so offloading the range iteration to lower levels will gain a speedup.
Experiment with changing the order of the loops, and toy around with caching a level of array accesses. Keeping my $xpx_r = $xpx[$r] around in a scalar will reduce the number of array accesses. If your input is large enough, this translates into a speed gain. Note that this only works when the cached value is a reference.
Remember that perl does very few “big” optimizations, and that the opcode tree produced by compilation closely resembles your source code.
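To make the PDL suggestion above concrete, here is a sketch (assuming PDL can be shipped to the clients; ComputeXpX_pdl is a hypothetical drop-in name, and the conversions at the boundaries are not free):

use strict;
use warnings;
use PDL;

sub ComputeXpX_pdl {
    my ($MatrixRef) = @_;            # array-of-arrays, $rows x $variables
    my $X = pdl($MatrixRef);         # copy into a PDL matrix
    my $xpx = $X->transpose x $X;    # X'X as one vectorized matrix product
    return $xpx->unpdl;              # back to a Perl array-of-arrays
}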
Edit: On threading
Perl threads are heavyweight beasts that literally clone the current interpreter. It is very much like forking.
Sharing data structures across thread boundaries is possible (use threads::shared; my $variable :shared = "foo") but there are various pitfalls. It is cleaner to pass data around in a Thread::Queue.
Splitting the calculation of one product over multiple threads could end up with your threads doing more communication than calculation. You could benchmark a solution that divides responsibility for certain rows between the threads. But I think recombining the solutions efficiently would be difficult here.
More likely to be useful is to have a bunch of worker threads running from the beginning. All threads listen to a queue which contains a pair of a matrix and a return queue. The worker would then dequeue a problem, and send back the solution. Multiple calculations could be run in parallel, but a single matrix multiplication will be slower. Your other code would have to be refactored significantly to take advantage of the parallelism.
Untested code:
use strict; use warnings; use threads; use Thread::Queue;

# spawn worker threads:
my $problem_queue = Thread::Queue->new;
my @threads = map threads->new(\&worker, $problem_queue), 1..3; # make 3 workers

# automatically close threads when program exits
END {
    $problem_queue->enqueue((undef) x @threads);
    $_->join for @threads;
}

# This is the wrapper around the threading,
# and can be called exactly as ComputeXpX
sub async_XpX {
    my $return_queue = Thread::Queue->new();
    $problem_queue->enqueue([$return_queue, @_]);
    return sub { $return_queue->dequeue };
}

# The main loop of worker threads
sub worker {
    my ($queue) = @_;
    while (defined(my $problem = $queue->dequeue)) {
        my ($return, @args) = @$problem;
        $return->enqueue(ComputeXpX(@args));
    }
}

sub ComputeXpX { ... } # as before
The async_XpX returns a coderef that will eventually collect the result of the computation. This allows us to carry on with other stuff until we need the result.
# start two calculations
my $future1 = async_XpX(...);
my $future2 = async_XpX(...);
...; # do something else
# collect the results
my $result1 = $future1->();
my $result2 = $future2->();
I benchmarked the bare-bones threading code without doing actual calculations, and the communication is about as expensive as the calculations. I.e. with a bit of luck, you may start to get a benefit on a machine with at least four processors/kernel threads.
A note on profiling threaded code: I know of no way to do that elegantly. Benchmarking threaded code, but profiling with single-threaded test cases may be preferable.

How can I generate random integers in a range in Smalltalk?

A class I am taking currently requires us to do all of our coding in Smalltalk (it's a design class). On one of our projects, I am looking to do some things and am having a tough time finding out how to do them. It seems that what most people do is modify their own version of Smalltalk to do what they need it to do. I am not at liberty to do this, as it would cause an error on my prof's computer, since he doesn't have the same built-in methods I do.
Here's what I'm looking to do:
Random numbers. I need to create a random number between 1 and 1000. Right now I'm faking it by doing:
rand := Random new.
rand := (rand nextValue) * 1000.
rand := rand asInteger.
This gives me a number between 0 and 1000. Is there a way to do this in one command? Something similar to:
Random between: 0 and: 1000
And/Or statements. This one bugs the living daylights out of me. I have tried several different configurations of
(statement) and: (statement) ifTrue...
(statement) and (statement) ifTrue...
So I'm faking it with nested ifTrue: statements:
(statement) ifTrue: [
    (statement) ifTrue: [ ...
What is the correct way to do and/or and Random in Smalltalk?
The problem is that
(expr) and: (expr) ifTrue: aBlock
is parsed as the single message and:ifTrue:. If you look at the Boolean class (and either True or False in particular), you will notice that ifTrue: is just a regular method, and that no method and:ifTrue: exists; however, plain and: does. So to make it clear that these are two messages, write:
((expr) and: (expr)) ifTrue: aBlock
For longer boolean combinations, notice that the methods and:and: and and:and:and: are also implemented.
(1 to: 1000) atRandom
If you're using VisualWorks, and: takes a block as an argument, so you'd write:
(aBoolean and: [anotherBoolean]) ifTrue: [doSomething].
There's also &, which does not take a block as an argument:
aBoolean & anotherBoolean ifTrue: [doSomething].
The difference is that and: only evaluates what's in the block if the first boolean is true (similar to && in Java), while & always evaluates both.
Thus and: comes in handy if the second condition is computationally expensive, or if it includes state alterations which should only happen when the first condition is true (that's usually a bad design, though).
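For example, a small sketch of the short-circuit behaviour (the block is never evaluated here, because the receiver is false):

| list |
list := OrderedCollection new.
(list notEmpty and: [ list first > 2 ])
    ifTrue: [ Transcript show: 'both conditions hold'; cr ].
"list first would raise an error on an empty collection,
but the and: block is skipped because notEmpty answers false."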
As for the Random issue: as long as you deliver your custom method Random >> between:and: along with the rest of your code, it will run fine on your professor's computer. How to do that specifically depends on the format in which you are supposed to deliver the assignment.
As for the Random issue: it depends on which ST version you use. In Squeak 3.9, there is Random>>#nextInt:, which is documented as "Answer a random integer in the interval [1, anInteger].". Its implementation reads:
(self next * anInteger) truncated + 1
So I have two comments here:
You should really learn to use the class browser. This can answer the (frequent) questions "what messages can I send to objects of class X"
It is common, in ST, to add new methods to existing classes. So if you want Random to have between:and:, just add it, e.g. as:
between: low and: high
    ^(self next * (high - low + 1)) truncated + low
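With that method installed, a hypothetical call site looks like:

| rand |
rand := Random new.
rand between: 1 and: 1000.    "answers an integer in the interval [1, 1000]"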
To put it simply: without knowing the Smalltalk dialect, I can only give a general answer. The way you stated the random question, yes, that's the only way to do it if your professor needs a generic answer.
As for the and/or statements question,
And/Or statements. This one bugs the living daylights out of me. I have tried several different configurations of
(statement) and: (statement) ifTrue...
(statement) and (statement) ifTrue...
What you want to try is:
((statement) and: [statement]) ifTrue: [ ... ]
Note the brackets: the and: method takes a block as an argument, and the parentheses keep and: and ifTrue: from being parsed as one message (as explained above).
To create several random integers between 1 and 1000:
First create a random number series; do this just once.
Then create a new random number by taking the next number from the series; repeat as necessary.
aRandomSeries := Random new.
"Seed a new series of random numbers"
aRandomInt := aRandomSeries nextInt: 1000.
"generate a random integer between 1 and 1000"
anotherRandomInt := aRandomSeries nextInt: 1000.
"generate another random integer between 1 and 1000"
Logical operations
aBoolean will respond to and: and or:. They both take block arguments.
Here is how they work.
and: alternativeBlock
If the receiver is true, answer the value of alternativeBlock; otherwise answer false without evaluating alternativeBlock.
or: alternativeBlock
If the receiver is false, answer the value of alternativeBlock; otherwise answer true without evaluating alternativeBlock.
e.g.
(( 3 > 2 ) or: [ 3 < 4 ]) ifTrue: [ ... ]
(aBoolean and: [ anotherBoolean ]) ifFalse: [ ... ]
However, Squeak and Pharo Smalltalk will both accept an argument in parentheses ( );
Dolphin Smalltalk will not, and strictly requires the standard Smalltalk syntax of a block argument.
Other related methods:
& an AND that does not require a square-bracketed (i.e. block) argument
| an OR that does not require a square-bracketed (i.e. block) argument
& and | work in Amber, Cuis, GNU, Pharo, Squeak, VisualAge and VisualWorks Smalltalks.
Squeak Smalltalk also provides:
and:and:         }
and:and:and:     } These take multiple block arguments
and:and:and:and: }
or:or:           }
or:or:or:        } These take multiple block arguments
or:or:or:or:     }
