Efficiently process all possible 2D array combinations in Perl - performance

I have a 2D array of numbers. I am attempting to compute the product of one number taken from each sub-array, multiplied together, and I need to do this for every possible combination.
The aim is that I input a file of frequencies of individual events and get as output the probability of a particular series of those events happening, one event from each set.
I have fudged together some code with the help of a previous question:
for my $aref ( getCartesian(@freq) ) {
    my $p = 1;
    foreach my $n (@$aref) {
        $p = $p * $n;
    }
    print "$p\n";
}

sub getCartesian {
    my @input = @_;
    my @ret = map [$_], @{ shift @input };
    for my $a2 (@input) {
        @ret = map {
            my $v = $_;
            map [@$v, $_], @$a2;
        } @ret;
    }
    return @ret;
}
where @freq is an array of arrays, such as:
@freq = (
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    # ... and ~20 more sub-arrays
);
This works fine for a small test file, but with my required input of 24 sub-arrays of 3 items each, generating the combinations is clearly far too intensive: there are 3^24 (about 2.8 x 10^11) of them.
I have run it on a machine with 22 GB of RAM, and it maxed out after 4 minutes without producing any output.
My question is: how could I modify the code to print $p for each combination without holding the whole set of combinations in memory, which is what kills it? I presume time would then be the only limiting factor, not resources.
Edit: a method in base Perl without packages would be great; sadly, I don't have admin rights on the HPC facility.

Set::CrossProduct lets you iterate through the Cartesian product so you don't have to store everything in memory:
use feature qw(say);    # say() is used below
use List::Util qw(reduce);
use Set::CrossProduct;

my @array = (
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
);

my $iterator = Set::CrossProduct->new(\@array);

while (my $tuple = $iterator->get) {
    say '(', join(', ', @$tuple), '): ', reduce { $a * $b } @$tuple;
}
Outputs:
(0.1, 0.4, 0.7): 0.028
(0.1, 0.4, 0.8): 0.032
(0.1, 0.4, 0.9): 0.036
(0.1, 0.5, 0.7): 0.035
(0.1, 0.5, 0.8): 0.04
(0.1, 0.5, 0.9): 0.045
(0.1, 0.6, 0.7): 0.042
(0.1, 0.6, 0.8): 0.048
(0.1, 0.6, 0.9): 0.054
(0.2, 0.4, 0.7): 0.056
(0.2, 0.4, 0.8): 0.064
(0.2, 0.4, 0.9): 0.072
(0.2, 0.5, 0.7): 0.07
(0.2, 0.5, 0.8): 0.08
(0.2, 0.5, 0.9): 0.09
(0.2, 0.6, 0.7): 0.084
(0.2, 0.6, 0.8): 0.096
(0.2, 0.6, 0.9): 0.108
(0.3, 0.4, 0.7): 0.084
(0.3, 0.4, 0.8): 0.096
(0.3, 0.4, 0.9): 0.108
(0.3, 0.5, 0.7): 0.105
(0.3, 0.5, 0.8): 0.12
(0.3, 0.5, 0.9): 0.135
(0.3, 0.6, 0.7): 0.126
(0.3, 0.6, 0.8): 0.144
(0.3, 0.6, 0.9): 0.162
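
Since the edit asks for core Perl without packages: here is a minimal, untested sketch of an iterative "odometer" approach (not from the original answer), assuming @freq is laid out as above. It keeps one index per sub-array and advances them like digits on a counter, so memory stays proportional to the number of sub-arrays rather than the number of combinations.
my @idx = (0) x @freq;              # one cursor per sub-array
OUTER: while (1) {
    my $p = 1;
    $p *= $freq[$_][ $idx[$_] ] for 0 .. $#freq;
    print "$p\n";
    # advance the odometer: bump the rightmost cursor, carrying leftwards
    my $i = $#idx;
    while ($idx[$i] == $#{ $freq[$i] }) {
        $idx[$i--] = 0;
        last OUTER if $i < 0;       # every cursor has wrapped: done
    }
    $idx[$i]++;
}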

Related

Why is heap space smaller than the sum of young and survivor spaces after a young GC?

My JVM options:
-verbose:gc -Xmx200M -Xmn40M -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseG1GC -XX:NewSize=40m -XX:MaxTenuringThreshold=1 -XX:-UseAdaptiveSizePolicy
The following is the GC log:
Desired survivor size 2621440 bytes, new threshold 1 (max 1)
- age 1: 792 bytes, 792 total
, 0.0012861 secs]
[Parallel Time: 0.4 ms, GC Workers: 8]
[GC Worker Start (ms): Min: 120214.2, Avg: 120214.3, Max: 120214.3, Diff: 0.2]
[Ext Root Scanning (ms): Min: 0.1, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.3]
[Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
[Processed Buffers: Min: 0, Avg: 0.6, Max: 1, Diff: 1, Sum: 5]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
[Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 0.2, Avg: 0.3, Max: 0.4, Diff: 0.2, Sum: 2.2]
[GC Worker End (ms): Min: 120214.5, Avg: 120214.5, Max: 120214.6, Diff: 0.0]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.1 ms]
[Other: 0.7 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.4 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.2 ms]
[Humongous Register: 0.0 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.0 ms]
[Eden: 39.0M(39.0M)->0.0B(39.0M) Survivors: 1024.0K->1024.0K Heap: 39.9M(200.0M)->670.6K(200.0M)]
[Times: user=0.00 sys=0.01, real=0.00 secs]
My question is: why is the heap usage after GC (670.6K) smaller than the sum of eden and survivors (0 B + 1024.0K = 1024.0K) from the GC log line [Eden: 39.0M(39.0M)->0.0B(39.0M) Survivors: 1024.0K->1024.0K Heap: 39.9M(200.0M)->670.6K(200.0M)]?

Image filter kernel to expand 16-235 limited color range

Is it possible to write a 5x5 kernel to process the limited color range into the full range?
This is my sample kernel (a grayscale matrix), and I don't know what values to use, or where to put them, to achieve this color expansion:
Grayscale
{ 0.3, 0.3, 0.3, 0.0, 0.0 }
{ 0.6, 0.6, 0.6, 0.0, 0.0 }
{ 0.1, 0.1, 0.1, 0.0, 0.0 }
{ 0.0, 0.0, 0.0, 1.0, 0.0 }
{ 0.0, 0.0, 0.0, 0.0, 1.0 }
I would like an RGB color expansion: 16-235 => 0-255.
However, I need this as a kernel matrix because I am not processing the image myself; I am passing the matrix to a Windows API function (undocumented: SetMagnificationDesktopColorEffect).
I cannot do a simple subtract/divide/multiply on the pixels. I do not have them.
You can basically do it without a kernel by subtracting 16 from your image and then dividing it by 219. That gives you an image normalized to [0, 1], which you then multiply by 255 to get the full 0-255 intensity range.
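For reference, the same affine map folded into the 5x5 color-matrix form the question uses (row-vector convention, with the offsets in the bottom row) would look roughly like this: 255/219 ≈ 1.1644 on the RGB diagonal and -16/219 ≈ -0.0731 as the offset. This is a sketch derived from the arithmetic above; it has not been verified against SetMagnificationDesktopColorEffect.
{  1.1644,  0.0,     0.0,    0.0, 0.0 }
{  0.0,     1.1644,  0.0,    0.0, 0.0 }
{  0.0,     0.0,     1.1644, 0.0, 0.0 }
{  0.0,     0.0,     0.0,    1.0, 0.0 }
{ -0.0731, -0.0731, -0.0731, 0.0, 1.0 }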

std::partial_sum and Kahan algorithm

In C++11 I want to calculate the partial sum of a vector using std::partial_sum.
#include <numeric>   // std::partial_sum
#include <vector>

std::vector<double> vec = {-1.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1};
std::partial_sum(vec.begin(), vec.end(), vec.begin());
Unfortunately, the last entry of the resulting vector is 1.38778E-16, due to rounding errors of doubles and the fact that 0.1 has no exact representation as a double.
vec = {-1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 1.38778E-16};
Is there any way to use the Kahan algorithm with std::partial_sum to reduce the rounding error and get a smaller result, something like
std::partial_sum(vec.begin(), vec.end(), vec.begin(), KahanSum);
You can implement Kahan summation on top of std::partial_sum (based on Wikipedia pseudocode):
double c = 0.0;   // running compensation for lost low-order bits
std::partial_sum(vec.begin(), vec.end(), vec.begin(),
    [c](double sum, double elem) mutable -> double {
        double y = elem - c;    // subtract the compensation first
        double t = sum + y;     // low-order bits of y are lost here
        c = (t - sum) - y;      // recover what was lost for the next step
        return t;
    });
This still won't get you zero though, since (double)0.1 is exactly equal to
0.1000000000000000055511151231257827021181583404541015625 and so the exact sum of your array is about 5.5511151231E-17 (assuming standard double).
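To see both claims concretely, here is a minimal check (assumes IEEE 754 doubles; the exact digits printed may vary slightly by C library):
#include <cstdio>

int main() {
    // (double)0.1 printed with enough digits to expose the representation error
    std::printf("%.55f\n", 0.1);

    // naive left-to-right summation, as std::partial_sum does it
    double s = -1.0;
    for (int i = 0; i < 10; ++i) s += 0.1;
    std::printf("%g\n", s);   // ~1.38778e-16, matching the question
}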

Why does it matter how I execute this code?

I have built the following simple synth structure which creates a synth and routes its output through an effects unit:
b = Bus.audio(numChannels: 2);
SynthDef(
    "mySynth",
    {
        |freq, amp, gate = 1|
        var vol = 0.5;
        var audio = Pulse.ar(freq, 0.5);
        var env = EnvGen.kr(Env.perc, doneAction: 2);
        audio = Pan2.ar(audio, MouseX.kr(-1, 1));
        Out.ar(b, audio * env);
    }
).add;
SynthDef(
    "effects",
    {
        var audio = In.ar(b, 2);
        audio = LPF.ar(audio, MouseY.kr(200, 1000));
        //TODO: Implement some crazy, revolutionary effects
        Out.ar(0, audio);
    }
).add;
// **** Dividing line for executing the code ****
e = Synth(\effects);
p = Pbind(*[
    instrument: \mySynth,
    scale: #[0, 2, 4, 5, 7, 9, 11],
    degree: Pseq([3, 3, 9, 9, 2, 9, 9, 3, 5, 7], inf),
    dur: Pseq([0.2, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1], inf),
    amp: Pseq([1, 0.6, 0.9, 0.3, 0.4, 0.9, 0.6, 0.85, 0.3, 0.4], inf),
]);
p.play;
This only produces audible output when I execute the code in a particular way:
- If I execute each block individually, in order, I get audible output.
- If I execute the first blocks up to the 'dividing line' comment, then the following blocks, I get audible output.
- If I execute all the code together, I get no audible output.
I'm guessing there has to be some delay between declaring a SynthDef and instantiating it with Synth(), while the server does its setup. Can anyone shed any light?
I usually get around this with the Server.sync() method. It pauses execution of the enclosing thread (e.g. a Routine) until all asynchronous server commands have been completed. This includes sending SynthDefs and allocating Buffers. You can pass a Condition argument to Server.sync() for more explicit control.
So, for example, you can execute this block in one go:
s = Server.local;
s.boot;
s.doWhenBooted({
    Routine {
        SynthDef.new(\sine, {
            arg out = 0, hz = 220, dur = 4.0;
            var snd, amp;
            snd = SinOsc.ar(hz);
            amp = EnvGen.ar(Env.linen(0.1, dur, 0.1), doneAction: 2);
            Out.ar(out, (amp * snd).dup);
        }).send(s);
        s.sync; // waits here
        x = Synth.new(\sine);
    }.play;
});
This is because you can't just "add" SynthDefs to the server and create an instance of that synth in the same execution. If you "play" the SynthDefs as they are executed, an instance of each gets added to the server, so that when you later call Synth it is already loaded. Working code is included below.
(
b = Bus.audio(numChannels: 2);
SynthDef(
    "mySynth",
    {
        |freq, amp, gate = 1|
        var vol = 0.5;
        var audio = Pulse.ar(freq, 0.5);
        var env = EnvGen.kr(Env.perc, doneAction: 2);
        audio = Pan2.ar(audio, MouseX.kr(-1, 1));
        Out.ar(b, audio * env);
    }
).play;
SynthDef(
    "effects",
    {
        var audio = In.ar(b, 2);
        audio = LPF.ar(audio, MouseY.kr(200, 1000));
        //TODO: Implement some crazy, revolutionary effects
        Out.ar(0, audio);
    }
).play;
// **** Dividing line for executing the code ****
e = Synth(\effects);
p = Pbind(*[
    instrument: \mySynth,
    scale: #[0, 2, 4, 5, 7, 9, 11],
    degree: Pseq([3, 3, 9, 9, 2, 9, 9, 3, 5, 7], inf),
    dur: Pseq([0.2, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1], inf),
    amp: Pseq([1, 0.6, 0.9, 0.3, 0.4, 0.9, 0.6, 0.85, 0.3, 0.4], inf),
]);
p.play;
)
I'm sure you're right that it's to do with the delay between declaring the synthdef and it being ready.
I'm not really experienced enough with sclang to immediately tell you exactly how you should change your code (I generally use scsynth via OSC, only using sclang to write SynthDefs), but you should be able to do something with the optional completionMsg argument to SynthDef.add.
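Putting the two answers together, here is a condensed, untested sketch of the original code wrapped in a Routine with s.sync, so that Synth() and the Pbind only run once the server has both defs:
(
Routine {
    b = Bus.audio(numChannels: 2);
    SynthDef("mySynth", {
        |freq, amp, gate = 1|
        var audio = Pulse.ar(freq, 0.5);
        var env = EnvGen.kr(Env.perc, doneAction: 2);
        Out.ar(b, Pan2.ar(audio, MouseX.kr(-1, 1)) * env);
    }).add;
    SynthDef("effects", {
        Out.ar(0, LPF.ar(In.ar(b, 2), MouseY.kr(200, 1000)));
    }).add;
    s.sync; // block this routine until both SynthDefs reach the server
    e = Synth(\effects);
    p = Pbind(*[
        instrument: \mySynth,
        scale: #[0, 2, 4, 5, 7, 9, 11],
        degree: Pseq([3, 3, 9, 9, 2, 9, 9, 3, 5, 7], inf),
        dur: Pseq([0.2, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1], inf),
        amp: Pseq([1, 0.6, 0.9, 0.3, 0.4, 0.9, 0.6, 0.85, 0.3, 0.4], inf),
    ]);
    p.play;
}.play;
)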

Ruby: reduce duplication while initializing a hash

array = [0, 0.3, 0.4, 0.2, 0.6]
hash = {
  "key1" => array[0..2],
  "key2" => array[0..3],
  "key3" => array,
  "key4" => array,
  "key5" => array,
  "key6" => array,
  "key7" => array
}
Is there a way I can remove the duplication by doing something like
hash = {
  "key1" => array[0..2],
  "key2" => array[0..3],
  %(key3, key4, key5, key6, key7).each { |ele| ele => array }
}
Try
array = [0, 0.3, 0.4, 0.2, 0.6]
hash = {
  "key1" => array[0..2],
  "key2" => array[0..3]
}
%w(key3 key4 key5 key6 key7).each { |ele| hash[ele] = array }
array = [0, 0.3, 0.4, 0.2, 0.6]
h = Hash[*Array.new(7) { |x| ["key#{x + 1}", array[0..(x < 2 ? x + 2 : -1)]] }.flatten(1)]
h # => {"key1"=>[0, 0.3, 0.4], "key2"=>[0, 0.3, 0.4, 0.2], ...}
Here are a couple of variations on a theme. They work with 1.8.7 or 1.9.2. Insertion order is maintained in 1.9.2 because that's what its hashes do:
require 'pp'

array = [0, 0.3, 0.4, 0.2, 0.6]

hash = ('key3'..'key7').entries.inject({}) { |m, e| m[e] = array; m }
hash.merge!('key1' => array[0..2], 'key2' => array[0..3])
pp hash

puts '-' * 40

hash = {
  'key1' => array[0..2],
  'key2' => array[0..3]
}.merge(('key3'..'key7').entries.inject({}) { |m, e| m[e] = array; m })
pp hash

puts '-' * 40

# I think this is the most readable/maintainable
hash = {
  'key1' => array[0..2],
  'key2' => array[0..3]
}
('key3'..'key7').entries.inject(hash) { |m, e| m[e] = array; m }
pp hash
Which outputs:
# >> {"key3"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key4"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key5"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key6"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key7"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key1"=>[0, 0.3, 0.4],
# >> "key2"=>[0, 0.3, 0.4, 0.2]}
# >> ----------------------------------------
# >> {"key1"=>[0, 0.3, 0.4],
# >> "key2"=>[0, 0.3, 0.4, 0.2],
# >> "key3"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key4"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key5"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key6"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key7"=>[0, 0.3, 0.4, 0.2, 0.6]}
# >> ----------------------------------------
# >> {"key1"=>[0, 0.3, 0.4],
# >> "key2"=>[0, 0.3, 0.4, 0.2],
# >> "key3"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key4"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key5"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key6"=>[0, 0.3, 0.4, 0.2, 0.6],
# >> "key7"=>[0, 0.3, 0.4, 0.2, 0.6]}
Here is another version:
hash = {
  "key1" => array[0..2],
  "key2" => array[0..3]
}.tap { |h| ("key3".."key7").each { |k| h[k] = array } }
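For what it's worth, on newer Rubies (2.1+, so after these answers were written) the repeated entries can also be built with to_h:
array = [0, 0.3, 0.4, 0.2, 0.6]
hash = {
  "key1" => array[0..2],
  "key2" => array[0..3]
}.merge(("key3".."key7").map { |k| [k, array] }.to_h)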
