In my shader I have a variable b and need to determine which range it lies in, and from that assign the right value to variable a. I ended up with a lot of if statements:
float a = const1;
if (b >= 2.0 && b < 4.0) {
    a = const2;
} else if (b >= 4.0 && b < 6.0) {
    a = const3;
} else if (b >= 6.0 && b < 8.0) {
    a = const4;
} else if (b >= 8.0) {
    a = const5;
}
My question is: could this lead to performance issues (branching), and how can I optimize it? I've looked at the step and smoothstep functions, but haven't figured out a good way to accomplish this.
To solve the problem depicted and avoid branching, the usual technique is to find a series of math functions, one for each condition, that evaluate to 0 for every condition except the one the variable satisfies. We can use these functions as gains to build a sum that evaluates to the right value each time.
In this case the conditions are simple intervals, so using the step functions we could write:
x in [a,b] as step(a,x)*step(x,b) (notice the inversion of x and b to get x<=b)
Or
x in [a,b[ as step(a,x)-step(b,x) as explained in this other post: GLSL point inside box test
Using this technique we obtain:
float a = (1.0-step(2.0,x))*const1 +
          (step(2.0,x)-step(4.0,x))*const2 +
          (step(4.0,x)-step(6.0,x))*const3 +
          (step(6.0,x)-step(8.0,x))*const4 +
          step(8.0,x)*const5;
This works for general disjoint intervals, but in the case of a step or staircase function as in this question, we can simplify it as:
float a = const1 + step(2.0,x)*(const2-const1) +
                   step(4.0,x)*(const3-const2) +
                   step(6.0,x)*(const4-const3) +
                   step(8.0,x)*(const5-const4);
We could also use a 'bool conversion to float' as a means to express our conditions; for example, step(8.0,x)*(const5-const4) is equivalent to float(x>=8.0)*(const5-const4)
You can avoid branching by creating a kind of lookup table:
float table[5] = float[5](const1, const2, const3, const4, const5);
float a = table[int(clamp(b, 0.0, 8.0) / 2.0)];
But the performance will depend on whether the lookup table has to be rebuilt on every shader invocation or can be passed in as some kind of uniform... As always, measure first...
It turned out Jaa-c's answer wasn't viable for me, as I'm targeting WebGL, which doesn't allow variables as array indexes (unless it's a loop index). His solution might work great for other OpenGL implementations though.
I came up with this solution using mix and step functions:
//Outside of main function:
uniform vec3 constArray[5]; // Values are sent in to shader
//Inside main function:
float a = constArray[0];
a = mix(a, constArray[1], step(2.0, b));
a = mix(a, constArray[2], step(4.0, b));
a = mix(a, constArray[3], step(6.0, b));
a = mix(a, constArray[4], step(8.0, b));
But after some testing it didn't give any visible performance boost. I finally ended up with this solution:
float a = constArray[0];
if (b >= 2.0)
    a = constArray[1];
if (b >= 4.0)
    a = constArray[2];
if (b >= 6.0)
    a = constArray[3];
if (b >= 8.0)
    a = constArray[4];
Which is both compact and easily readable. In my case both these alternatives and my original code performed equally, but at least here are some options to try out.
OSX v10.10.5 and Gnuplot v5.0
I have a data file with three columns of numbers and I read the values that are stored inside to do some calculations. But it is time consuming!
Here is what I have so far:
#user defined function to read data in a file
#see stackoverflow: "Reading dataset value into a gnuplot variable (start of X series)"
at(file, row, col) = system( sprintf("awk -v row=%d -v col=%d 'NR == row {print $col}' %s", row, col, file) )
file="myFile"
do for [k=1:10] {  # we read line by line and want the ratios column2/column1 and column3/column1
    f(k) = at(file,k,2)/at(file,k,1)
    g(k) = at(file,k,3)/at(file,k,1)
    # example of calculation: least squares to find the best "i"
    do for [i=1:10] {
        f1(i) = (a*i**2 + b*i + c)  # function for the least squares. a,b,c: floats
        g1(i) = (d*i**2 + e*i + f)  # d,e,f: floats
        h(i) = sqrt( (f1(i)-f(k))**2 + (g1(i)-g(k))**2 )
        if (h(i)<hMin) {
            hMin = h(i)
        }
    }  # end loop i
    print i," ",hMin
}  # end loop k
It works, but as I said it takes time (around 2 min for each k). When I do no calculation and only ask for print f(k),g(k), it takes << 1 sec. I suspected that the division could produce too many digits and make the calculation inefficient, so I used a round2 function to keep only the first n=4 decimals:
#see stackoverflow: How to use floor function in gnuplot
round(x) = x - floor(x) < 0.5 ? floor(x) : ceil(x)
round2(x, n) = round(x*10**n)*10.0**(-n)
f(k) = round2((at(file,k,2)/at(file,k,1)),4)
g(k) = round2((at(file,k,3)/at(file,k,1)),4)
but it did not change the required time.
Any idea about what's going on?
You did not post the full code (the definitions for a, b, ..., f are missing), but within the part you have posted I think you can avoid calling awk that often. You can replace the functions f(k) and g(k) with simple variables fk and gk, because they are in fact constant within each k-iteration; there is no need to recalculate them in every i-iteration.
#user defined function to read data in a file
#see stackoverflow: "Reading dataset value into a gnuplot variable (start of X series)"
at(file, row, col) = system( sprintf("awk -v row=%d -v col=%d 'NR == row {print $col}' %s", row, col, file) )
file="myFile"
do for [k=1:10] {  # we read line by line and want the ratios column2/column1 and column3/column1
    at1 = at(file,k,1)
    fk = at(file,k,2)/at1
    gk = at(file,k,3)/at1
    # example of calculation: least squares to find the best "i"
    do for [i=1:10] {
        f1i = (a*i**2 + b*i + c)  # function for the least squares. a,b,c: floats
        g1i = (d*i**2 + e*i + f)  # d,e,f: floats
        hi = sqrt( (f1i-fk)**2 + (g1i-gk)**2 )
        if (hi<hMin) {
            hMin = hi
        }
    }  # end loop i
    print i," ",hMin
}  # end loop k
But there might be more interesting details in the missing code that inhibit this solution.
The script below calculates the time difference as a decimal number, but I get a rounding error, which I guess is caused by my very bad conversion at the end.
Question
Something tells me that the expression can be simplified, but what are the math rules in this particular situation, where I have int, modulus, and division?
#!/usr/bin/perl
print f("9:00", "16:45");
sub f {
    my $t1 = shift @_;
    my $t2 = shift @_;
    my $m1 = $1*60+$2 if $t1 =~ m/(\d\d?):(\d\d?)/;
    my $m2 = $1*60+$2 if $t2 =~ m/(\d\d?):(\d\d?)/;
    my $h = int(($m2-$m1)/60);
    return $h + ($h*60+$m2-$m1)%60/60;
}
You have already correctly calculated $m1 and $m2 as minutes elapsed since midnight. So why not return the time difference in fractional hours as simply:
return ($m2 - $m1) / 60.0;
As far as the "math rules," maybe it will help to look at your return versus mine and see why they are equivalent (ignoring rounding):
$h + ($h * 60 + $m2 - $m1) % 60 / 60
Notice that ($h * 60) % 60 is zero, so this term essentially drops out, leaving:
$h + ($m2 - $m1) % 60 / 60
Now think about how $h was calculated: it is the quotient of ($m2 - $m1) divided by 60, dropping the remainder (because of int()). On the other hand, ($m2 - $m1) % 60 is exactly the remainder of this division. So the expression above is essentially just adding the remainder fraction, which you dropped from $h, back in. Therefore it has the same result as
($m2 - $m1) / 60
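To see the equivalence concretely, here is a quick brute-force check (a Python sketch of my own, not part of the original answer) over every minute difference in a day:
# Check that h + ((h*60 + m2 - m1) % 60) / 60 == (m2 - m1) / 60,
# where h = (m2 - m1) // 60 is the integer quotient (Perl's int()).
for diff in range(0, 24 * 60):                 # every non-negative difference within a day
    h = diff // 60
    original = h + (h * 60 + diff) % 60 / 60   # the question's expression
    simplified = diff / 60                     # the proposed replacement
    assert abs(original - simplified) < 1e-9
print("expressions agree for all tested differences")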
You can use Time::Piece to do simple date/time calculations:
#!/usr/bin/perl
use strict;
use warnings;
use Time::Piece;
my $pattern = '%H:%M';
my $start = Time::Piece->strptime('09:00', $pattern);
my $end = Time::Piece->strptime('16:45', $pattern);
my $diff = $end - $start;
print $diff->hours;
Output:
7.75
Note that $diff is actually a Time::Seconds object.
I would say to use Time::Piece, but I assume this is some sort of exercise.
In Unix, all dates are converted to seconds since the epoch; that makes it easy to do arithmetic when everything is in the same units. Therefore, I would convert the times to minutes, do your operations, then convert the result back to hours and minutes. That means you need two subroutines: one to convert a time from hours and minutes into minutes, and one to convert minutes back into hours and minutes.
#! /usr/bin/env perl
#
use warnings;
use strict;
use feature qw(say);
my $time1 = "9:50";
my $time2 = "16:45";
my $time1_in_minutes = convert_to_minutes( $time1 );
my $time2_in_minutes = convert_to_minutes( $time2 );
my $diff_in_minutes = $time2_in_minutes - $time1_in_minutes;
say convert_to_hours( $diff_in_minutes );
sub convert_to_minutes {
    my $time = shift;
    my ($hours, $minutes) = split /:/, $time;
    my $total_minutes = ( $hours * 60 ) + $minutes;
    return $total_minutes;
}
sub convert_to_hours {
    my $minutes = shift;
    my $hours = int $minutes / 60;
    $minutes = $minutes % 60;
    return sprintf "%d:%02d", $hours, $minutes;
}
By breaking down the work into two subroutines, you can easily see what is going on. Plus, you have a lot more flexibility. What if I gave you the times it took me to run a marathon and wanted the average time? You could use the same two routines. What if I gave you the time I spent at work each day and wanted a total? Again, you could use the same subroutines.
And, because it's much easier to see what's going on, it's easier to correct errors that crop up as you program. Even better, because the operations are clean, I can start adding features. For example, it would be nice to check my input. In my convert_to_minutes subroutine, I might want to make sure that the hours and minutes are valid: that there's a colon in the middle, and that the minutes aren't greater than 59.
Suppose we have N numbers (integers, floats, whatever you want) and want to find their arithmetic mean. The simplest method is to sum all values and divide by the number of values:
def simple_mean(array[N]): # pseudocode
    sum = 0
    for i = 1 to N
        sum += array[i]
    return sum / N
It works fine, but requires big integers.
If we don't want big integers and are fine with rounding errors, and N is a power of two, we can use 'divide and conquer': ((a+b)/2 + (c+d)/2)/2 = (a+b+c+d)/4, ((a+b+c+d)/4 + (e+f+g+h)/4)/2 = (a+b+c+d+e+f+g+h)/8, and so on.
def bisection_average(array[N]):
    if N == 1: return array[1]
    return (bisection_average(array[:N/2]) + bisection_average(array[N/2:])) / 2
Any other ways?
PS. playground for lazy
Knuth lists the following method for calculating the mean and standard deviation of floating-point data (original on p. 232 of Vol. 2 of The Art of Computer Programming, 1998 edition; my adaptation below avoids special-casing the first iteration):
double M = 0, S = 0;
for (int i = 0; i < N; ++i)
{
    double Mprev = M;
    M += (x[i] - M) / (i+1);
    S += (x[i] - M) * (x[i] - Mprev);
}
// mean = M
// std dev = sqrt(S/N) or sqrt(S/(N-1)),
// depending on whether you want the population or sample std dev
Here's a way to calculate the mean using just integers without rounding errors and avoiding big intermediate values:
sum = 0
rest = 0
for num in numbers:
    sum += num // N     # integer division
    rest += num % N
sum += rest // N
rest = rest % N
return sum, rest
If the big integers are the problem... is it OK to compute
a/N + b/N + ... + n/N
I mean, are you looking just for other ways, or for the optimal way?
If the array is floating-point data, even the "simple" algorithm suffers from rounding error. Interestingly, in that case, blocking the computation into sqrt(N) sums of length sqrt(N) actually reduces the error in the average case (even though the same number of floating-point roundings are performed).
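As an illustration, here is a minimal sketch of that blocking idea (a Python example of my own, with made-up names; not from the original answer): sum the data in roughly sqrt(N) blocks of roughly sqrt(N) elements each, then sum the block totals.
import math

def blocked_mean(data):
    # Sum in ~sqrt(N) blocks. Python's built-in sum() adds left to right,
    # so each partial sum stays small relative to its addends, which
    # reduces the expected rounding error versus one long running sum.
    n = len(data)
    block = max(1, math.isqrt(n))
    totals = [sum(data[i:i + block]) for i in range(0, n, block)]
    return sum(totals) / n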
For integer data, note that you don't need general "big integers"; if you have fewer than 4 billion elements in your array (likely), you only need an integer type 32 bits larger than the type of the array data. Performing addition on this slightly larger type will pretty much always be faster than doing division or modulus on the original type. For example, on most 32-bit systems, 64-bit addition is faster than 32-bit division/modulus, and this effect only becomes more pronounced as the types get larger.
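As a sketch of that point (my addition, using NumPy for its fixed-width integers, since plain Python integers are arbitrary-precision):
import numpy as np

data = np.arange(1_000_000, dtype=np.int32)   # 32-bit array data
total = data.sum(dtype=np.int64)              # accumulate in a 64-bit integer to avoid overflow
mean = total / data.size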
If you use float you might avoid big integers:
def simple_mean(array[N]):
    sum = 0.0 # <---
    for i = 1 to N
        sum += array[i]
    return sum / N
The Kahan summation algorithm runs in O(n), like pairwise summation, but (according to Wikipedia) has only O(1) error growth, better than pairwise summation:
function KahanSum(input)
    var sum = 0.0
    var c = 0.0                // A running compensation for lost low-order bits.
    for i = 1 to input.length do
        var y = input[i] - c   // So far, so good: c is zero.
        var t = sum + y        // Alas, sum is big, y small, so low-order digits of y are lost.
        c = (t - sum) - y      // (t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
        sum = t                // Algebraically, c should always be zero. Beware overly aggressive optimizing compilers!
                               // Next time around, the lost low part will be added to y in a fresh attempt.
    return sum
Its idea is that the low-order bits of the floating-point numbers are summed and corrected independently of the main summation.
Building on Jason S's solution to find a weighted average and rein back in S's growth.
Using the M-finding algorithm given before, along with the aggregate formulas for the weighted average and population standard deviation:
Avg = Avg(W*X) / Avg(W)
StDev = sqrt(Avg(W*X*X) / Avg(W) - Avg*Avg)
rewrite the code to find the three running averages, then do the aggregate calculations at the end:
function GetPopulationStats {
<#
.SYNOPSIS
Calculate the average, variance, and standard deviation of a weighted data set
.DESCRIPTION
Uses the Knuth method for finding means, adapted by Jason S, to calculate the
three averages required by the aggregate statistical formulas
.LINK
https://stackoverflow.com/a/1346890/4496560
#>
    param(
        [decimal[]]$x # Data points
        ,[decimal[]]$w # Weights
    )
    $N = $x.Length
    [decimal]$AvgW = 0
    [decimal]$AvgWX = 0
    [decimal]$AvgWXX = 0
    for($i=0; $i -lt $N; $i++){
        $AvgW += ($w[$i] - $AvgW) / ($i+1)
        $AvgWX += ($w[$i]*$x[$i] - $AvgWX) / ($i+1)
        $AvgWXX += ($w[$i]*$x[$i]*$x[$i] - $AvgWXX) / ($i+1)
    }
    [ordered]@{
        N = $N
        Avg = ($avg = $AvgWX / $AvgW)
        Var = ($var = ($AvgWXX / $AvgW) - ($Avg * $Avg))
        StDev = SquareRoot $var
    }
}
and then, if your language is like Windows PowerShell, you'll likely need a higher-precision [math]::sqrt() function:
function SquareRoot([decimal]$a){
<#
.SYNOPSIS
Find the square-root of $a
.DESCRIPTION
Uses the McDougall-Wotherspoon variant of the Newton-Raphson method to
find the positive zero of:
    f(x) = (x * x) - a
    f'(x) = 2 * x
.NOTES
ToDo: using a fitted polynomial would likely run faster
#>
    $BiCycleX = $PrevX = 0;
    $x = $a/2 # guess
    $y = ($x * $x) - $a
    $xx = $x
    $m = $x + $xx
    $del = $x - $PrevX
    if($del -lt 0){ $del = -$del }
    $i = 0
    while($del -gt 0 -and $x -ne $BiCycleX){
        $BiCycleX = $PrevX;
        $PrevX = $x;
        $x = $x - ($y / $m)
        $y = ($x * $x) - $a
        $xx = $x - ($y / $m)
        $m = $x + $xx
        $del = $x - $PrevX
        if($del -lt 0){ $del = -$del }
        if(++$i -ge 50){
            throw ("invariant sanity fail on step {0}:`r`n x_(n-1) = {1}`r`n x_n = {2}`r`n delta = {3:0.#e0}" -f $i,$PrevX,$x,$del)
        }
    }
    ($x + $PrevX) / 2
}
However, if you don't need a weighted solution, it should be easy enough to just let w[i]=1 for all i.
Finally, it doesn't hurt to do a quick sanity check on the code:
describe 'tool spot-check' {
    context 'testing root calcs' {
        $TestCases = @(
            @{Value = 0; Expected = 0}
            @{Value = 1; Expected = 1}
            @{Value = 4; Expected = 2}
            @{Value = 9; Expected = 3}
            @{Value = 2; Expected = [decimal]'1.4142135623730950488016887242'}
            @{Value = (1e14-1); Expected = [decimal]'9999999.99999995'}
        )
        It 'finds the square root of: <Value>' -TestCases $TestCases {
            param($Value,$Expected)
            SquareRoot $Value | should be $Expected
        }
    }
    context 'testing stat calcs' {
        It 'calculates the values for 1 to 1000' {
            $x = 1..1000
            $w = @(1) * 1000
            $results = GetPopulationStats $x $w
            $results.N | should be 1000
            $results.Avg | should be 500.5
            $results.Var | should be 83333.25
            $results.StDev | should be ([decimal]'288.67499025720950043826670416')
        }
        It 'calculates the values for a test data set' {
            $x = @(33,119,37,90,50,94,32,147,86,28,50,80,145,131,121,90,140,170,214,70,124)
            $w = @(207,139,25,144,72,162,93,91,109,151,125,87,49,99,210,105,99,169,50,59,22)
            $results = GetPopulationStats $x $w
            $results.N | should be 21
            $results.Avg | should be ([decimal]'94.54433171592412880458756066')
            $results.Var | should be ([decimal]'2202.659150711314347179152603')
            $results.StDev | should be ([decimal]'46.93249567955356821948311637')
        }
    }
}
The first number needs to be rounded to the nearest hundred, giving the second number. There are many ways of doing this, but what's the best and shortest algorithm? Anyone up for a challenge :-)
1244->1200
1254->1300
123->100
178->200
1576->1600
1449->1400
123456->123500
654321->654300
23->00
83->100
Ruby's round method can consume negative precisions:
n.round(-2)
In this case -2 gets you rounding to the nearest hundred.
For input n:
(n + 50) / 100 * 100
using integer division.
Note that many languages/libraries already have functions to do this.
100 * round(n/100.0)
I know it's late in the game, but here's something I generally set up when I need to round things to the nearest nth:
Number.prototype.roundTo = function(nTo) {
    nTo = nTo || 10;
    return Math.round(this * (1 / nTo)) * nTo;
}
console.log("roundTo ", (925.50).roundTo(100));
Number.prototype.ceilTo = function(nTo) {
    nTo = nTo || 10;
    return Math.ceil(this * (1 / nTo)) * nTo;
}
console.log("ceilTo ", (925.50).ceilTo(100));
Number.prototype.floorTo = function(nTo) {
    nTo = nTo || 10;
    return Math.floor(this * (1 / nTo)) * nTo;
}
console.log("floorTo ", (925.50).floorTo(100));
I find myself using Number.ceilTo(..) because I'm working with Canvas and trying to determine how far out to scale.
This will do it, given you're using integer math:
n = (n + 50) / 100 * 100
Of course, you didn't specify the behavior of, e.g., 1350 and 1450, so I've elected to round up. If you need round-to-even, that won't work.
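For comparison, here is what round-to-even gives for those two inputs (a Python sketch, my addition; Python 3's round() implements banker's rounding):
print(round(1350 / 100) * 100)   # 1400: the tie 13.5 rounds to the even 14
print(round(1450 / 100) * 100)   # 1400: the tie 14.5 also rounds to the even 14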
As per Pawan Pillai's comment above, rounding to nearest 100th in Javascript:
100 * Math.floor((foo + 50) / 100);
Is this homework?
Generally: take the value mod 100; if the remainder is over 50, round up by adding the difference to the next hundred, otherwise round down by subtracting the remainder.
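Spelled out in code, that rule might look like this (a Python sketch of the idea, my addition; exact halves round up here, to match the (n + 50) / 100 answers above):
def round_to_hundred(n):
    r = n % 100                 # distance above the lower hundred
    if r >= 50:
        return n + (100 - r)    # closer to (or halfway to) the upper hundred
    return n - r                # closer to the lower hundred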