UDFs in Redshift: is it possible to reference a UDF within another?

Is it possible to nest UDFs within each other?
The following code computes confidence intervals for A/B tests. Of course, I could write one huge function that does everything, but I wonder whether there is a better way to achieve this goal.
set search_path to public;
create function cumnormdist(x float)
returns float
IMMUTABLE AS $$
import math
b1 = 0.319381530
b2 = -0.356563782
b3 = 1.781477937
b4 = -1.821255978
b5 = 1.330274429
p = 0.2316419
c = 0.39894228
h=math.exp(-x * x / 2.0)
if x >= 0.0:
    t = 1.0 / ( 1.0 + p * x )
    return (1.0 - c * h * t *( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ))
else:
    t = 1.0 / ( 1.0 - p * x )
    return ( c * h * t *( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ))
$$ language plpythonu;
set search_path to public;
create or replace function conversion(experience_total float,experience_conversions float)
returns float
IMMUTABLE AS $$
return experience_conversions*1.0/experience_total
$$ language plpythonu;
create or replace function zscore(total_c float,conversions_c float,total_t float,conversions_t float )
returns float
IMMUTABLE AS $$
import math
z = conversion(total_t,conversions_t )-conversion(total_c,conversions_c) # Difference in means
s =(conversion(total_t,conversions_t)*(1-conversion(total_t,conversions_t)))/total_t+(conversion(total_c,conversions_c)*(1-conversion(total_c,conversions_c)))/total_c
return float(z)/float(math.sqrt(s))
$$ language plpythonu;
create or replace function confidence(total_c float,conversions_c float,total_t float,conversions_t float )
returns float
IMMUTABLE AS $$
import math
return (1 - round(cumnormdist(zscore(total_c, conversions_c, total_t, conversions_t)), 4)) * 100.00
$$ language plpythonu;
The individual calls work fine, e.g. select cumnormdist(-3.1641397476); but when I call one function from inside another function's definition, the call fails. This happens, for example, with zscore, which calls the conversion function, and with confidence, which calls zscore:
ERROR: NameError: global name 'zscore' is not defined. Please look at svl_udf_log for more information
DETAIL:
-----------------------------------------------
error: NameError: global name 'zscore' is not defined. Please look at svl_udf_log for more information
code: 10000
context: UDF
query: 0
location: udf_client.cpp:298
process: padbmaster [pid=3585]
-----------------------------------------------
If I could nest function definitions inside one another (instead of having separate UDFs that call each other, as above), that would be an acceptable workaround.
End goal: publish these computations in Tableau.

Here's how I solved it. A UDF cannot reference another UDF from inside its own body, so instead you can put the shared functions in a custom Python library and upload it to Redshift using CREATE LIBRARY. Each UDF can then import what it needs from that library.
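As a minimal sketch of what such a library module could contain, the question's four functions can live together in one plain Python file, where they are free to call each other. The module name `abstats` is hypothetical, and the code assumes the same formulas as in the question; it is an illustration, not the exact library that was uploaded.

```python
# abstats.py -- hypothetical module name; zip it and upload it with CREATE LIBRARY
import math


def cumnormdist(x):
    """Polynomial approximation of the standard normal CDF (constants from the question)."""
    b1, b2, b3, b4, b5 = 0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429
    p = 0.2316419
    c = 0.39894228
    h = math.exp(-x * x / 2.0)
    t = 1.0 / (1.0 + p * abs(x))
    tail = c * h * t * (t * (t * (t * (t * b5 + b4) + b3) + b2) + b1)
    return 1.0 - tail if x >= 0.0 else tail


def conversion(total, conversions):
    """Conversion rate of one experience."""
    return conversions * 1.0 / total


def zscore(total_c, conversions_c, total_t, conversions_t):
    """Difference in conversion rates divided by its standard error."""
    rate_c = conversion(total_c, conversions_c)
    rate_t = conversion(total_t, conversions_t)
    variance = rate_t * (1 - rate_t) / total_t + rate_c * (1 - rate_c) / total_c
    return (rate_t - rate_c) / math.sqrt(variance)


def confidence(total_c, conversions_c, total_t, conversions_t):
    """Same expression as the question's confidence UDF, in percent."""
    z = zscore(total_c, conversions_c, total_t, conversions_t)
    return (1 - round(cumnormdist(z), 4)) * 100.0
```

With the library installed, the body of a UDF such as confidence would shrink to an import of the module followed by a single call, so no UDF has to reference another UDF.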

Related

Finite difference method for solving the Klein-Gordon equation in Matlab

I am trying to numerically solve the Klein-Gordon equation that can be found here. To make sure I solved it correctly, I am comparing it with an analytical solution that can be found on the same link. I am using the finite difference method and Matlab. The initial spatial conditions are known, not the initial time conditions.
I start off by initializing the constants and the space-time coordinate system:
close all
clear
clc
%% Constant parameters
A = 2;
B = 3;
lambda = 2;
mu = 3;
a = 4;
b = - (lambda^2 / a^2) + mu^2;
%% Coordinate system
number_of_discrete_time_steps = 300;
t = linspace(0, 2, number_of_discrete_time_steps);
dt = t(2) - t(1);
number_of_discrete_space_steps = 100;
x = transpose( linspace(0, 1, number_of_discrete_space_steps) );
dx = x(2) - x(1);
Next, I define and plot the analytical solution:
%% Analytical solution
Wa = cos(lambda * x) * ( A * cos(mu * t) + B * sin(mu * t) );
figure('Name', 'Analytical solution');
surface(t, x, Wa, 'edgecolor', 'none');
colormap(jet(256));
colorbar;
xlabel('t');
ylabel('x');
title('Wa(x, t) - analytical solution');
The plot of the analytical solution is shown here.
In the end, I define the initial spatial conditions, execute the finite difference method algorithm and plot the solution:
%% Numerical solution
Wn = zeros(number_of_discrete_space_steps, number_of_discrete_time_steps);
Wn(1, :) = Wa(1, :);
Wn(2, :) = Wa(2, :);
for j = 2 : (number_of_discrete_time_steps - 1)
for i = 2 : (number_of_discrete_space_steps - 1)
Wn(i + 1, j) = dx^2 / a^2 ...
* ( ( Wn(i, j + 1) - 2 * Wn(i, j) + Wn(i, j - 1) ) / dt^2 + b * Wn(i - 1, j - 1) ) ...
+ 2 * Wn(i, j) - Wn(i - 1, j);
end
end
figure('Name', 'Numerical solution');
surface(t, x, Wn, 'edgecolor', 'none');
colormap(jet(256));
colorbar;
xlabel('t');
ylabel('x');
title('Wn(x, t) - numerical solution');
The plot of the numerical solution is shown here.
The two plotted graphs are not the same, which is proof that I did something wrong in the algorithm. The problem is, I can't find the errors. Please help me find them.
To summarize, please help me change the code so that the two plotted graphs become approximately the same. Thank you for your time.
The finite difference discretization of w_tt = a^2 * w_xx - b*w is
( w(i,j+1) - 2*w(i,j) + w(i,j-1) ) / dt^2
= a^2 * ( w(i+1,j) - 2*w(i,j) + w(i-1,j) ) / dx^2 - b*w(i,j)
In your order this gives the recursion equation
w(i,j+1) = dt^2 * ( (a/dx)^2 * ( w(i+1,j) - 2*w(i,j) + w(i-1,j) ) - b*w(i,j) )
+2*w(i,j) - w(i,j-1)
The stability condition requires at least a*dt/dx < 1. The present parameters do not satisfy it: they give a ratio of about 2.6. Increasing the time discretization to 1000 points is sufficient.
Next come the boundary conditions. Besides the two leading columns for times 0 and dt, one also needs to set the values at the boundaries x=0 and x=1. Copy these from the exact solution as well.
Wn(:,1:2) = Wa(:,1:2);
Wn(1,:)=Wa(1,:);
Wn(end,:)=Wa(end,:);
Then also correct the definition (and use) of b to that in the source
b = - (lambda^2 * a^2) + mu^2;
and the resulting numerical image looks identical to the analytical image in the color plot. A plot of the difference confirms the closeness.
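The corrections above can be tried out with the following rough sketch. It is written in plain Python rather than Matlab, uses 0-based indices, and stores the solution as w[j][i] (time index first), all of which are choices made for this illustration only. The parameter values are the ones from the question, with 1000 time points and the corrected b.

```python
import math

# Parameters from the question, with b corrected as in the answer
A, B, lam, mu, a = 2.0, 3.0, 2.0, 3.0, 4.0
b = mu**2 - (lam * a)**2

nt, nx = 1000, 100            # 1000 time points keep a*dt/dx below 1
dt = 2.0 / (nt - 1)
dx = 1.0 / (nx - 1)
courant = a * dt / dx


def exact(i, j):
    """Analytical solution at space index i and time index j."""
    x, t = i * dx, j * dt
    return math.cos(lam * x) * (A * math.cos(mu * t) + B * math.sin(mu * t))


# w[j][i]: time level j, space point i
w = [[0.0] * nx for _ in range(nt)]
for j in range(nt):           # boundary values at x = 0 and x = 1
    w[j][0] = exact(0, j)
    w[j][nx - 1] = exact(nx - 1, j)
for i in range(nx):           # first two time levels
    w[0][i] = exact(i, 0)
    w[1][i] = exact(i, 1)

# Explicit recursion: march forward in time over the interior points
for j in range(1, nt - 1):
    for i in range(1, nx - 1):
        lap = w[j][i + 1] - 2.0 * w[j][i] + w[j][i - 1]
        w[j + 1][i] = (dt**2 * ((a / dx)**2 * lap - b * w[j][i])
                       + 2.0 * w[j][i] - w[j - 1][i])

max_err = max(abs(w[j][i] - exact(i, j)) for j in range(nt) for i in range(nx))
print("a*dt/dx =", courant, " max abs error =", max_err)
```

The key structural difference from the question's loop is that the unknown on the left-hand side is the next time level w(i, j+1), not the next space point w(i+1, j).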

subscript indices must be either positive integers less than 2^31 or logicals

I keep getting errors in the loop when solving by the finite difference method.
I either get the following error when I start with i = 2 : N :
diffusion: A(I,J): row index out of bounds; value 2 out of bound 1
error: called from
diffusion at line 37 column 10 % note line change due to edit!
or I get the following error when I do i = 2 : N :
subscript indices must be either positive integers less than 2^31 or logicals
error: called from
diffusion at line 37 column 10 % note line change due to edit!
Please help
clear all; close all;
% mesh in space
dx = 0.1;
x = 0 : dx : 1;
% mesh in time
dt = 1 / 50;
t0 = 0;
tf = 10;
t = t0 : dt : tf;
% diffusivity
D = 0.5;
% number of nodes
N = 11;
% number of iterations
M = 10;
% initial conditions
if x <= .5 && x >= 0 % note, in octave, you don't need parentheses around the test expression
u0 = x;
elseif
u0 = 1-x;
endif
u = u0;
alpha = D * dt / (dx^2);
for j = 1 : M
for i = 1 : N
u(i, j+1) = u(i, j ) ...
+ alpha ...
* ( u(i-1, j) ...
+ u(i+1, j) ...
- 2 ...
* u(i, j) ...
) ;
end
u(N+1, j+1) = u(N+1, j) ...
+ alpha ...
* ( ...
u(N, j) ...
- 2 ...
* u(N+1, j) ...
+ u(N, j) ...
) ;
% boundary conditions
u(0, :) = u0;
u(1, :) = u1;
u1 = u0;
u0 = 0;
end
% exact solution with 14 terms
%k=14 % COMMENTED OUT
v = (4 / ((k * pi) .^ 2)) ...
* sin( (k * pi) / 2 ) ...
* sin( k * pi * x ) ...
* exp .^ (D * ((k * pi) ^ 2) * t) ;
exact = symsum( v, k, 1, 14 );
error = exact - u;
% plot stuff
plot( t, error );
xlabel( 'time' );
ylabel( 'error' );
legend( 't = 1 / 50' );
Have a look at the edited code I cleaned up for you above and study it.
Don't underestimate the importance of clean, readable code when hunting for bugs.
It will save you more time than it will cost. Especially a week from now when you will need to revisit this code and you will not remember at all what you were trying to do.
Now regarding your errors. (all line references are with respect to the cleaned up code above)
Scenario 1:
In line 29 you initialise u as a single value.
If you start your loop in line 35 starting with i = 2, then as soon as you try to do u(i, j+1), i.e. u(2,2) in the next line, octave will complain that you're trying to index the second row, in an array that so far only contains one row. (in fact, the same will apply for j at this point, since at this point you only have one column as well)
Scenario 2:
I assume the second scenario was a typo and you meant to say i = 1 : N.
If you start with i=1 in the loop, then have a look at line 38: you are trying to get element u(i-1, j), i.e. u(0,1). Octave will therefore complain that you're trying to get element zero, but in Octave array indices start from one, so index zero is not defined. Attempting to index any array with zero will result in the error you see (try it in a terminal!).
UPDATE
Also, now that the code is clean, you can spot another bug, which octave helpfully warns you about if you try to run the code.
Look at line 26. There is NO condition in the elseif leg, so octave looks for the next statement as the test condition.
This means that the elseif condition will always succeed as long as the result of u0 = 1-x is non-zero.
This is clearly a bug. Either you forgot to put the condition for the elseif, or more likely, you probably just meant to say else, rather than elseif.
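To make the indexing point concrete, here is a small sketch of the same explicit update in Python (0-based indices, so the interior points are 1 to len(u) - 2). The helper names `initial_profile` and `ftcs_step` are made up for this illustration. The value alpha = 0.25 is also an assumption: the question's settings give alpha = D * dt / dx^2 = 1, and this explicit scheme is only stable for alpha <= 0.5.

```python
def initial_profile(x):
    """Triangular initial condition, evaluated point by point with a plain else branch."""
    return [xi if xi <= 0.5 else 1.0 - xi for xi in x]


def ftcs_step(u, alpha):
    """One explicit time step; only interior points are updated."""
    new = u[:]                                # boundary values are carried over
    for i in range(1, len(u) - 1):            # never touches u[i-1] or u[i+1] out of range
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


dx = 0.1
x = [i * dx for i in range(11)]               # 11 nodes on [0, 1]
u = initial_profile(x)
for _ in range(10):                           # 10 iterations, as in the question
    u = ftcs_step(u, 0.25)
print(u)
```

Allocating the whole profile up front and looping over interior points only avoids both errors from the question: the array never grows by accident, and no index falls outside it.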

How to fit response plot using square wave

I'm looking for a way to fit my response data (image shown below). I want to use f(t) = (square(2*pi*f*t)+1) to filter my raw data, but cftool does not recognize this kind of function. Any help is appreciated, thanks!
The function below might allow fitting the data. It is continuous, but not differentiable everywhere. The steps tend to fall off to the right, while the OP's data does not; this might require some extra work. Moreover, the steps have to be equidistant, which seems to be the case here.
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np

def f( x, a, b ): # test function (that would be the one to fit, actually + a shift of edge position)
    return a + b * x**3

def f_step( x, l, func, args=() ):
    # position inside the current step, rescaled to the range [-1, 1)
    y = ( x - l / 2. ) % l - l / 2.
    y = y / l * 2.
    # index and centre of the step that contains x
    p = np.floor( ( x-l/2.) / (l) ) + 1
    centre = p * l
    left = centre - l / 2.
    right = centre + l / 2.
    fL = func( left, *args )
    fR = func( right, *args )
    fC = func( centre, *args )
    out = fC + sharp( y , fL - fC, fR - fC , 5 )
    return out

def sharp( x, a, b , p, epsilon=1e-1 ):
    # sharp rise towards a at x = -1 and towards b at x = +1, zero at the opposite edge
    out = a * ( 1. / abs( x + 1 + epsilon )**p - ( 2 + epsilon)**( -p ) ) / ( epsilon**( -p ) - ( 2 + epsilon )**( -p ) )
    out += b * ( 1. /abs( x - 1 - epsilon )**p - ( 2 + epsilon)**( -p ) ) / ( epsilon**( -p ) - ( 2 + epsilon )**( -p ) )
    return out

l=0.57
xList = np.linspace( -1, 1.75, 500 )
yList = [ f_step( x, l, f, args=(2, -.3 ) ) for x in xList ]
fig1 = plt.figure( 1 )
ax = fig1.add_subplot( 1, 1, 1 )
ax.plot( xList, yList )
ax.plot( xList, f(xList, 2,-.3) )
plt.show()

Mutual recursion analysis

I'm trying to analyze these functions, but I am getting a bit lost. For function f, t(n) = c if n < 1e-5.
If n >= 1e-5, I get t(n) = c2 + t(n / 2) + t2(n / 2), where t2 is the running time of function g, but I'm confused about how to expand it. Should it be something like
t(n) = ( t(n / 2) + t2( n / 2) ) * c2 + c
or should I be expanding t2 inside of that?
Here is the code I am trying to analyze.
float f( float x) {
    if ( abs( x ) < 1e-5 ) {
        return x + ( ( x * x * x ) / 2 );
    }
    float y = f( x / 2 );
    float z = g( x / 2 );
    return 2 * y * z;
}
float g( float x ) {
    if ( abs( x ) < 1e-5 ) {
        return 1 + ( ( x * x ) / 2 );
    }
    float y = f( x / 2 );
    float z = g( x / 2 );
    return ( z * z ) + ( y * y );
}
T1(n) = T1(n / 2) + T2(n / 2) + c1
T2(n) = T1(n / 2)+T2(n / 2) + c2
so we have
T1(n) = O(T2(n))
T1(n) = 2T1(n / 2) + c1
Since c1 = O(n^(log_2(2) - e)) for some e > 0, the master theorem implies that
T(n) = O(n)
Even though we are calling two different functions in this code, there is a thing about them that makes finding the complexity of this recursion easy.
What's happening is that at the top level, if you are entering f(), you are evaluating x and then calling two different functions - itself and g(). Even if you enter the function g() first, same thing happens, i.e. g() calls itself and f().
Since the value of x halves at every level down the tree, the number of levels in this tree is log2(n). Also, every node has 2 children, namely f(x/2) and g(x/2).
This is a complete binary tree of height log2(n).
The work done at each node is constant: if the node represents a call to f(), you compute 2 * y * z, which is constant. If the node represents a call to g(), you compute y*y + z*z, which is also constant.
Hence, all we need to do is find the total number of nodes in a complete binary tree of height log2(n), and we have our complexity.
A perfect binary tree of height h has a total of 2^(h+1) - 1 nodes.
In this case that is 2^(log2(n) + 1) - 1 nodes.
Also, a^(log_a(b)) = b (by a property of logarithms) [1].
Hence, the complexity is O(2^(log2(n))) = O(n).
[1] See the first property in the "Cancelling Exponentials" section.
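The node count can be checked empirically with a short sketch. The code below is a Python port of the question's two functions; the `calls` counter and the `count_calls` helper are additions made for this illustration and are not part of the original code.

```python
calls = {"f": 0, "g": 0}   # hypothetical instrumentation: number of calls per function


def f(x):
    calls["f"] += 1
    if abs(x) < 1e-5:
        return x + (x * x * x) / 2
    y = f(x / 2)
    z = g(x / 2)
    return 2 * y * z


def g(x):
    calls["g"] += 1
    if abs(x) < 1e-5:
        return 1 + (x * x) / 2
    y = f(x / 2)
    z = g(x / 2)
    return (z * z) + (y * y)


def count_calls(x):
    """Total number of calls to f and g triggered by a single call f(x)."""
    calls["f"] = calls["g"] = 0
    f(x)
    return calls["f"] + calls["g"]


# x = 1.0 must be halved 17 times before it drops below 1e-5,
# so the call tree has height 17 and 2**18 - 1 nodes.
print(count_calls(1.0))   # 262143
print(count_calls(2.0))   # 524287, one more level, about twice as many calls
```

Doubling the input adds one level to the tree and roughly doubles the number of calls, which is the linear growth both answers arrive at.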

Random number generator in PGI Fortran not so random

The following code just generates a simple triple of random numbers:
program testrand
integer, parameter :: nz = 160, nf = 160, nlt = 90
real :: tmpidx(3)
integer :: idxarr(3), idx1, idx2, idx3, seed_size, ticks
integer, allocatable :: seed(:)
call random_seed(size=seed_size)
allocate(seed(seed_size))
call system_clock(count=ticks)
seed = ticks+37*(/(i-1, i=1,seed_size)/)
call random_seed(put=seed)
deallocate(seed)
call random_number(tmpidx)
idxarr = tmpidx * (/nz, nf, nlt/)
idx1 = max(1,idxarr(1))
idx2 = max(1,idxarr(2))
idx3 = max(1,idxarr(3))
print *,idx1, idx2, idx3
end program
I compile this with gfortran and run a few times and I get:
> gfortran testrand.f90
> ./a.out
74 98 86
> ./a.out
113 3 10
> ./a.out
44 104 27
Looks pretty random. Now I compile with PGI Fortran and run a few times:
> pgf90 testrand.f90
> ./a.out
1 1 1
> ./a.out
1 1 1
> ./a.out
1 1 1
Of course, there's no way to be completely sure, but I suspect this is not random. :) Anyone know what is going on here? Anyone know the right way to get random numbers with PGI Fortran?
For some reason, PGI does not implement system_clock the way the GNU compilers do. I do not know why; I found this out recently while doing something similar.
To see what I am talking about, just print ticks after calling system_clock. Chances are that you get 0 every time with PGI and varying numbers with the GNU compilers. To solve your problem, you can adapt the code below. It is a slightly modified version of code that you can get from the GNU Fortran web site.
program testrand
use iso_fortran_env, only: int64
integer, parameter :: nz = 160, nf = 160, nlt = 90
real :: tmpidx(3)
integer :: idxarr(3), idx1, idx2, idx3, seed_size, ticks
integer, allocatable :: seed(:)
call random_seed(size=seed_size)
allocate(seed(seed_size))
! call system_clock(count=ticks)
! seed = ticks+37*(/(i-1, i=1,seed_size)/)
! call random_seed(put=seed)
!
! deallocate(seed)
call init_random_seed()
call random_number(tmpidx)
idxarr = tmpidx * (/nz, nf, nlt/)
idx1 = max(1,idxarr(1))
idx2 = max(1,idxarr(2))
idx3 = max(1,idxarr(3))
print *,idx1, idx2, idx3
contains
!
subroutine init_random_seed()
implicit none
integer, allocatable :: seed(:)
integer :: i, n, istat, dt(8), pid
integer(int64) :: t
integer, parameter :: un=703
call random_seed(size = n)
allocate(seed(n))
! First try if the OS provides a random number generator
open(unit=un, file="/dev/urandom", access="stream", &
form="unformatted", action="read", status="old", iostat=istat)
if (istat == 0) then
read(un) seed
close(un)
else
! Fallback to XOR:ing the current time and pid. The PID is
! useful in case one launches multiple instances of the same
! program in parallel.
call system_clock(t)
if (t == 0) then
call date_and_time(values=dt)
t = (dt(1) - 1970) * 365_int64 * 24 * 60 * 60 * 1000 &
+ dt(2) * 31_int64 * 24 * 60 * 60 * 1000 &
+ dt(3) * 24_int64 * 60 * 60 * 1000 &
+ dt(5) * 60 * 60 * 1000 &
+ dt(6) * 60 * 1000 + dt(7) * 1000 &
+ dt(8)
end if
pid = getpid()
t = ieor( t, int(pid, kind(t)) )
do i = 1, n
seed(i) = lcg(t)
end do
end if
call random_seed(put=seed)
!print*, "optimal seed = ", seed
end subroutine init_random_seed
!
function lcg(s)
integer :: lcg
integer(int64), intent(in out) :: s
if (s == 0) then
s = 104729
else
s = mod(s, 4294967296_int64)
end if
s = mod(s * 279470273_int64, 4294967291_int64)
lcg = int(mod(s, int(huge(0), 8)), kind(0))
end function lcg
!
!this option is especially used for pgf90 to provide a getpid() function
!> @brief Returns the process ID of the current process
!! @todo write the actual code, for now returns a fixed value
!<
function getpid()result(pid)
integer pid
pid = 53 !just a prime number, no special meaning
end function getpid
end program
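The seeding strategy in the routine above (try the operating system's entropy source first, otherwise mix the clock with the process ID and expand the result with a small linear congruential generator) can be summarised in a short Python sketch. The names `make_seed` and `use_os_entropy` are made up for this illustration; the LCG constants are copied from the Fortran code.

```python
import os
import struct
import time


def lcg(state):
    """One LCG step with the constants from the Fortran routine.

    Returns the new state and a non-negative value that fits a default integer.
    """
    if state == 0:
        state = 104729
    else:
        state %= 4294967296
    state = (state * 279470273) % 4294967291
    return state, state % 2147483647


def make_seed(n, use_os_entropy=True):
    """Build n seed integers: OS entropy if available, else clock XOR pid through the LCG."""
    if use_os_entropy:
        try:
            raw = os.urandom(4 * n)                      # counterpart of reading /dev/urandom
            return list(struct.unpack("<%di" % n, raw))  # n signed 32-bit integers
        except NotImplementedError:
            pass                                         # no entropy source, use the fallback
    t = int(time.time() * 1000) ^ os.getpid()            # milliseconds XOR process ID
    seed = []
    for _ in range(n):
        t, value = lcg(t)
        seed.append(value)
    return seed


print(make_seed(4))
print(make_seed(4, use_os_entropy=False))
```

Unlike the Fortran version, which falls back to a fixed stand-in for getpid() under pgf90, this sketch uses the real process ID, so parallel runs started in the same millisecond still get different seeds.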
