How to do an inverse orderNorm transformation (bestNormalize package) from a GAMLSS object?

My y variable (n=30,000) is distributed with very heavy tails (both positive and negative), for which the fitDist GAMLSS function selects the ST4 family.
I tried to fit a GAMLSS regression with an explanatory variable x (pb smoothing), but the tails of y are so heavy that convergence is not reached after 50 cycles, even after refitting (very time consuming).
Therefore, I normalized y using the orderNorm transformation (bestNormalize package), which let the model converge quickly and easily, and then predicted the fitted values from the GAMLSS object.
However, these fitted "orderNormalized" values come from a GAMLSS object, and thus cannot be inverted with the predict function from bestNormalize (the latter does not seem to recognize a GAMLSS object).
My question: is it possible, by whatever means, to apply an inverse orderNorm transformation to fitted values from a GAMLSS object?

It is easy to get confused about what to call the predict function on, so I list the steps here without full code (as there is no reproducible example in the question):
1) transposeObj = orderNorm(data$outputvariable)
2) fitObj = gamlss(transposeObj$x.t ~., data)
3) pred = predict(fitObj, type = 'response')
4) inversedpredictions = predict(transposeObj, newdata = pred, inverse = TRUE)
In plain text, you normalize your data, fit a model, make predictions with the fit, and then predict on the predictions with the normalization object obtained from orderNorm.
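Since the question has no reproducible example, here is only a schematic R sketch of those four steps (variable names are placeholders; I'm assuming a pb() smoother on x, as in the question):
library(bestNormalize)
library(gamlss)
transposeObj <- orderNorm(data$outputvariable)           # 1) normalize the response
data$outputvariable_t <- transposeObj$x.t                 #    keep the transformed values in the data frame
fitObj <- gamlss(outputvariable_t ~ pb(x), data = data)   # 2) fit on the transformed scale
pred <- predict(fitObj, type = "response")                # 3) fitted values, still on the orderNorm scale
inversedpredictions <- predict(transposeObj, newdata = pred, inverse = TRUE)  # 4) back to the original scale of y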

The vignette for bestNormalize has a similar example, using lm instead of GAMLSS. See the application section of the vignette. Once you have run the normalization procedure, you should be able to repeat and invert the transformation with the predict function.
The key is storing a transformation object as an R object that can then be fed into the predict (or rather, the predict.bestNormalize) function.

Related

Why does Perlin noise use a hash function rather than computing random values?

I'm reading through this explanation of Perlin noise, which describes a hash function that calculates pseudorandom values for all x, y coordinates.
If the x, y coordinate hashes are generated randomly and are eventually used for computing the gradients and so on, why couldn't I just generate random numbers on the fly?
Is it simply a question of optimization that we use a permutation table to look up our random values? The only reason I can think of is that running values through the permutation table somehow produces a smoothing effect, but I fail to see how.
Just for clarification, I'm referring to this section in the code:
private static readonly int[] p = { 151,160,137,91,90,15, // Hash lookup table as defined by Ken Perlin. This is a randomly
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23, // arranged array of all numbers from 0-255 inclusive.
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180
};
int aaa, aba, aab, abb, baa, bba, bab, bbb;
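// Hash the coordinates of the eight corners of the surrounding unit cube through
// the permutation table; nesting p[...] keeps the result deterministic for a given (xi, yi, zi).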
aaa = p[p[p[ xi ]+ yi ]+ zi ];
aba = p[p[p[ xi ]+inc(yi)]+ zi ];
aab = p[p[p[ xi ]+ yi ]+inc(zi)];
abb = p[p[p[ xi ]+inc(yi)]+inc(zi)];
baa = p[p[p[inc(xi)]+ yi ]+ zi ];
bba = p[p[p[inc(xi)]+inc(yi)]+ zi ];
bab = p[p[p[inc(xi)]+ yi ]+inc(zi)];
bbb = p[p[p[inc(xi)]+inc(yi)]+inc(zi)];
Why don't we just initialize the values as follows?
aaa = random(255)
aab = random(255)
// ...
The key idea behind Perlin noise generation is to create a grid of points, each of which is assigned some vector value, and then to interpolate between those points in a specific way.
I checked out Ken Perlin's original paper on Perlin noise, and it seems that, as far back as the original paper, he recommends using a hash function to do this:
Associate with each point in the integer lattice a pseudorandom value and x, y, and z gradient values. More precisely, map each ordered sequence of three integers into an uncorrelated ordered sequence of four real numbers [a,b,c,d] = H([x,y,z]), where [a,b,c,d] define a linear equation with gradient [a,b,c] and value d at [x,y,z]. H is best implemented as a hash function.
(Emphasis mine).
I suspect that the reason for this has to do with memory concerns. Perlin noise generation requires that the gradient function at different points in space be reevaluated multiple times over the course of the run of the algorithm. Accordingly, you could either
have some formula that, given a point in space, evaluates to the gradient, or
explicitly create a table and store all of the random values that you need.
Option (1) is what Ken Perlin is proposing. The advantage of this approach is that the memory usage required to store the gradients is minimal; you just need to use a hash function.
Option (2) is what you're proposing. This works just fine, but it uses a ton of memory (you need multiple values stored for each point in the integer lattice you're working with). Remember that Perlin's paper was written back in 1985 (!) when memory was much, much scarcer than it is today.
My suspicion is that you can get away with either approach, but given that you don't need true randomness, the pseudorandomness afforded by a good hash function should be sufficient.
I can't explain why the author of that article you read chose to use the particular hash function that they did, though. My guess is that it's "random enough" and sufficiently fast that it doesn't end up being the bottleneck in the computation; remember that the hash function gets called a lot of times in the noise generation code. This seems to be the standard approach to implementing Perlin noise; even Ken Perlin mentions using this hash function on his site.
What you can't do is the approach you're proposing of just letting the variables aaa, aab, aba, etc. be random. The reason why is that the Perlin noise algorithm requires you to reevaluate the noise term at a given point multiple times and expects that it will give back the same values every time. If you wanted to compute truly random values, you could do so, but you'd need to cache your results so that you give back consistent answers of the noise terms at each point.
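To make that last point concrete, here is a hypothetical Java-style sketch (class and method names are made up) of what "random values plus caching" would look like; the permutation table gives you the same property with a fixed 256-entry array instead of a growing map:
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class CachedLatticeValues {
    private final Map<Long, Double> cache = new HashMap<>();
    private final Random rng = new Random(42);

    // Returns the same value every time the same (xi, yi, zi) is queried,
    // which is exactly what the nested p[...] lookups guarantee for free.
    double valueAt(int xi, int yi, int zi) {
        long key = (((long) xi & 0x1FFFFF) << 42)
                 | (((long) yi & 0x1FFFFF) << 21)
                 |  ((long) zi & 0x1FFFFF);
        return cache.computeIfAbsent(key, k -> rng.nextDouble());
    }
}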

How to perform operation for all matrix elements in Scilab?

I'm trying to simulate the heat distribution on an infinite plate over time. For this purpose, I've written a Scilab script. Now, the crucial point of it is the calculation of the temperature for all plate points, and it has to be done for every time instance I want to observe:
for j=2:S-1
for i=2:S-1
heat(i, j) = tcoeff*10000*(plate(i-1,j) + plate(i+1,j) - 4*plate(i,j) + plate(i, j-1) + plate(i, j+1)) + plate(i,j);
end;
end
The problem is that, if I'd like to do it for a 100x100-point plate, then here (and this is only for the inner part, without boundary conditions) I would have to loop 98x98 = 9604 times, at every turn calculating the heat at a given i,j point. If I'd like to observe that for, say, 100 seconds with a 1 s step, I have to repeat it 100 times, giving 960,400 iterations in total, which takes quite a long time, and I'd like to avoid that. Up to a 50x50 plate, it all happens in a reasonable 4-5 second time frame.
Now my question is - is it necessary to do all this using for loops? Is there any built-in aggregate function in Scilab that will let me do this for all elements of a matrix? The reason I haven't found a way yet is that the result for every point depends on the values of other matrix points, and that made me do it with nested loops. Any ideas on how to make it faster are appreciated.
It seems to me that you want to compute a 2D intercorrelation of your heat field and a certain diffusion pattern. This pattern can be thought of as a "filter" kernel, which is a common way to modify images with a linear filter matrix. Your "filter" is:
F=[0,1,0;1,-4,1;0,1,0];
If you install the Image Processing Toolbox (IPD) you will have a MaskFilter function to do this 2D intercorrelation.
S=500;
plate=rand(S,S);
tcoeff=1;
//your solution with nested for loops
t0=getdate();
for j=2:S-1
for i=2:S-1
heat(i, j) = tcoeff*10000*(plate(i-1,j)+plate(i+1,j)-..
4*plate(i,j)+plate(i,j-1)+plate(i, j+1))+plate(i,j);
end
end
t1=getdate();
T0=etime(t1,t0);
mprintf("\nNested for loops: %f s (100 %%)",T0);
//optimised nested for loop
F=[0,1,0;1,-4,1;0,1,0]; //"filter" matrix
F=tcoeff*10000*F;
heat2=zeros(plate);
t0=getdate();
for j=2:S-1
for i=2:S-1
heat2(i,j)=sum(F.*plate(i-1:i+1,j-1:j+1));
end
end
heat2=heat2+plate;
t1=getdate();
T2=etime(t1,t0);
mprintf("\nNested for loops optimised: %f s (%.2f %%)",T2,T2/T0*100);
//MaskFilter from IPD toolbox
t0=getdate();
heat3=MaskFilter(plate,F);
heat3=heat3+plate;
t1=getdate();
T3=etime(t1,t0);
mprintf("\nWith MaskFilter: %f s (%.2f %%)",T3,T3/T0*100);
disp(heat3(1:10,1:10)-heat(1:10,1:10),"Difference of the results (heat3-heat):");
Please note, that MaskFilter pads the image (the original matrix) before applying the filter, and as far as I know it uses a "mirror" array across the border. You should check whether this behaviour is appropriate for you or not.
The speed increase is about 320x (the execution time is 0.32% of your original code). Is that fast enough?
In theory it could be done with two 2-D Fourier transforms (with the Scilab builtin mfft, maybe), but it might not be faster than this. See here: http://mailinglists.scilab.org/Image-processing-filter-td2618144.html#a2618168
Please consider that there is a big difference between vectorizing an operation and parallel computation, as I have explained here. Although vectorizing might improve performance a little bit, that's not comparable to what you can achieve through GPU computing for example (e.g. OpenCL). I will try to explain a vectorized form of your code without going too much into the details. Consider these as given:
S = ...;
tcoeff = ...;
function Plate = plate(i, j)
...;
endfunction
function Heat = heat(i, j)
...;
endfunction
Now you could define a meshgrid:
x = 2 : S - 1;
y = 2 : S - 1;
[M, N] = meshgrid(x,y);
Result = feval(M, N, heat);
feval is the key here; it will broadcast the heat function over the M and N matrices.
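For comparison, the stencil in the question can also be vectorized directly with shifted submatrices, without feval; this sketch should be equivalent to the original double loop (assuming plate, S and tcoeff as defined in the question):
// One explicit step over the whole interior with no loops: each term is the
// plate shifted by one row or column, which reproduces the 5-point stencil.
heat = plate;   // boundary entries simply keep their plate values here
heat(2:S-1, 2:S-1) = tcoeff*10000*( plate(1:S-2, 2:S-1) + plate(3:S, 2:S-1) ..
    - 4*plate(2:S-1, 2:S-1) + plate(2:S-1, 1:S-2) + plate(2:S-1, 3:S) ) ..
    + plate(2:S-1, 2:S-1);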
Your scheme is a finite differences scheme of the Laplacian operator applied to a rectangular grid. If you choose a row-wise or column-wise numbering of your degrees of freedom (here the plate(i,j)) in order to treat them as vectors, then applying your "discrete" Laplacian can be done by multiplying by a sparse matrix on the left (which is very fast). This is particularly well explained in the following document:
https://www.math.uci.edu/~chenlong/226/FDMcode.pdf.
The implementation is described in Matlab but is easily translated in Scilab.

Genetic/Evolutionary algorithm - Painter

My task:
Create a program to copy a picture (given as input) using primitives only (like triangles or something). The program should use an evolutionary algorithm to create the output picture.
My question:
I need to invent an algorithm to create populations and check them (how much - in % - they match the input picture).
I have an idea; you can find it below.
So what I want from you: advice (if you find my idea not so bad) or inspiration (maybe you have a better idea?)
My idea:
Let's say that I'll use only triangles to build the output picture.
My first population is P pictures (each generated using T randomly generated triangles, called Elements).
I check every picture in the population with my fitness function, choose E of them as the elite, and simply remove the rest of the population:
To compare 2 pictures we check every pixel in picture A and compare its R,G,B with
the same pixel (the same coordinates) in picture B.
I use this:
SingleDif = sqrt[ (Ar - Br)^2 + (Ag - Bg)^2 + (Ab - Bb)^2]
then I sum all the differences (from all pixels) - let's call it SumDif
and use:
PictureDif = (DifMax - SumDif)/DifMax
where
DifMax = pictureHeight * pictureWidth * 255*3
The best are used to create the next population in this way:
picture MakeChild(picture Mother, picture Father)
{
picture child;
for( int i = 0; i < T; ++i )
{
double j = random01(); // a random number from 0 to 1, drawn now (random01() is a placeholder for your RNG)
if( j < 0.5 ) child.element(i) = Mother.element(i);
else child.element(i) = Father.element(i);
if( random01() < mutationRate ) mutate( child.element(i) ); // fresh draw, so mutation can hit genes from either parent
}
return child;
}
So it's quite simple. Only the mutation needs a comment: there is always some small probability that element X in the child will be different from X in its parent. To do this we make random changes to the element in the child (change its colour by a random number, or add a random number to its (x,y) coordinates, or to one of its nodes).
So this is my idea... I didn't test it, didn't code it.
Please check my idea - what do you think about it?
I would make the number of patches of each child dynamic and get the mutation operation to insert/delete patches with some (low) probability. Of course this could result in a lot of redundancy and bloat in the child's genome. In these situations, it is usually a good idea to use the length of an individual's genome as a parameter of the fitness function so that individuals get rewarded (with a higher fitness value) for using fewer patches. So for example if the PictureDif of individuals A and B are the same but A has fewer patches than B, then A has a higher fitness.
Another issue is the reproductive operator that you proposed (namely, the crossover operation). In order for the evolutionary process to work efficiently, you need to achieve a reasonable exploration and exploitation balance. One way of doing this is by having a set of reproductive operators that exhibit a good fitness correlation [1] which means the fitness of a child must be close to the fitness of its parent(s).
In the case of single parent reproduction you only need to find the right mutation parameters. However, when it comes to multi-parent reproduction (crossover) one of the frequently used techniques is to produce 2 children (instead of 1) from the same 2 parents. For the first child, each gene comes from the mother with the probability of 0.2 and from the father with the probability of 0.8, and for the second child the other way around. Of course after the crossover, you can do the mutation.
Oh and one more thing, for the mutation operators, when you say
... make random changes to the element in the child (change its colour by a random number, or add a random number to its (x,y) coordinates, or to one of its nodes)
it's a good idea to use a Gaussian distribution to change the colour, coordinate etc.
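Putting the crossover and mutation suggestions together, a minimal Python sketch (treating each gene as a single number for simplicity; in your case a gene would be a whole triangle, so you would perturb its colour and vertex coordinates instead) could look like this:
import random

P_MOTHER = 0.2      # child 1 inherits each gene from the mother with this probability
P_MUTATION = 0.01   # small, independent per-gene mutation probability

def mutate(genome):
    # Gaussian perturbation of numeric genes, as suggested above.
    return [g + random.gauss(0, 1.0) if random.random() < P_MUTATION else g
            for g in genome]

def make_children(mother, father):
    child1, child2 = [], []
    for m_gene, f_gene in zip(mother, father):
        if random.random() < P_MOTHER:
            child1.append(m_gene)
            child2.append(f_gene)
        else:
            child1.append(f_gene)
            child2.append(m_gene)
    return mutate(child1), mutate(child2)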
[1] Evolutionary Computation: A unified approach by Kenneth A. De Jong, page 69

matlab curve fitting: restrictions on parameters

I have 5 non-parametric models, all with 5 to 8 parameters. These models are used to fit longitudinal data y(t), with t being time. Every datafile is fitted by all 5 models for comparison. The models themselves cannot be altered.
For fitting, starting values are used, and these are passed to lsqcurvefit using a Levenberg-Marquardt algorithm. So I've written a script for the several models and one function for the curve fitting.
If I perform the curve fitting, a lot of the parameters wander off to extreme values. This is what I want to avoid, since these parameters should stay in the proximity of their starting values and should only change within a well-defined range, or so that only curve fits within a standard deviation are included. Important to note here is that these restrictions should be imposed during the curve fitting (the iterative numerical technique) and not afterwards.
The function I've written to fit the models to height data:
% Fit a specific model for all valid persons
try
opts = optimoptions(@lsqcurvefit, 'Algorithm', 'levenberg-marquardt');
[personalParams,personalRes,personalResidual] = lsqcurvefit(heightModel,initialValues,personalData(:,1),personalData(:,2),[],[],opts);
catch
x=1;
end
The function I've written for one of my models
elseif strcmpi(model,'jpss')
% y = h_1(1-(1/(1+((t+0.75)^c_1/d_1)+((t+0.75)^c_2/d_2)+((t+0.75)^c_3/d_3)))
% heightModel = @(params,ages) params(1).*(1-1./(1+((ages+0.75).^params(2))./params(3) + ((ages+0.75).^params(4))./params(5) + ((ages+0.75).^params(6))./params(7)));
heightModel = @(params,ages) params(1).*(1-1./(1+(((ages+0.75)./params(3)).^params(2)) + (((ages+0.75)./params(5)).^params(4)) + ((ages+0.75)./params(7)).^params(6))); % Adapted 25/07
modelStrings = {'h1','c1','d1','c2','d2','c3','d3'};
% Define initial values
if strcmpi('male',gender)
initialValues = [174.8 0.6109 2.9743 3.614 9.88 22.393 13.59];
else
initialValues = [162.7 0.6546 2.43 4.011 8.579 18.394 11.846];
end
What I would like to do:
Is it possible to place restrictions on every starting value in initialValues? Putting restrictions on lsqcurvefit wouldn't be a good idea, I think, since there are different models with different starting values and different ranges that are allowed.
I had 2 things in mind:
1. using a range placed around each initial value, e.g. for
initialValues = [162.7 0.6546 2.43 4.011 8.579 18.394 11.846]
something like range a1 = [150,180]; range a2 = [0.3,0.8] and so on;
2. placing lb and ub restrictions separately on all my initial values in lsqcurvefit, e.g.
if heightModel = 'model name'
ub = initial value * 1.2 and lb = initial value * 0.8
Can someone give me some hints or pointers because I can't make it work.
You state: there are different models with different starting values and different ranges that are allowed. This is where you can use ub and lb. How to do this is outlined in the lsqcurvefit documentation:
X=LSQCURVEFIT(FUN,X0,XDATA,YDATA,LB,UB) defines a set of lower and
upper bounds on the design variables, X, so that the solution is in the
range LB <= X <= UB. Use empty matrices for LB and UB if no bounds
exist. Set LB(i) = -Inf if X(i) is unbounded below; set UB(i) = Inf if
X(i) is unbounded above.
For instance in the following example the parameters are constrained within limits during the fit. The lower bound (lb) and upper bound (ub) are set to 20% below and above the starting values, respectively.
heightModel = @(params,ages) abs(params(1).*(1-1./(1+(params(2).* (ages+params(8) )).^params(5) +(params(3).* (ages+params(8) )).^params(6) +(params(4) .*(ages+params(8) )).^params(7) )));
initialValues = [161.92 0.4173 0.1354 0.090 0.540 2.87 14.281 0.3701];
lb = 0.8*initialValues; % <-- lower bound is 20% smaller than initial par values
ub = 1.2*initialValues;
[parsout,resnorm,residual] = lsqcurvefit(heightModel,initialValues,t,ht,lb,ub);
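One caveat: lsqcurvefit's 'levenberg-marquardt' algorithm does not support bound constraints, so when lb and ub are supplied as above you have to let lsqcurvefit use its default trust-region-reflective algorithm rather than the 'levenberg-marquardt' option set in the question.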

How to apply element-wise operations without using for loops and without influencing the speed

Suppose that I have an RGB image matrix and I want to apply some spatial filters on it.
In general I want to apply element-wise operations (note that it's a college assignment and I'm not permitted to use any built-in functions available in the Image Processing toolbox). I decided to write the filters as functions and then apply bsxfun to these functions on the image.
A simple example would be this:
I want to add 50 to all gray levels of an image and then replace all gray levels above 200 with 200. Here's my code:
a='C:\Users\sepideh\Desktop\IP_abadpour\S45C-113050518040.jpg';
b=imread(a);
b(:,:,1)=b(:,:,1)+50;
b(:,:,2)=b(:,:,2)+50;
b(:,:,3)=b(:,:,3)+50;
c=reshape(b,[],1);
d=bsxfun(@test,c,200);
test is a function in this form:
function Out = test(in,a)
if in>a
in=200;
end
Out = in;
end
This code won't work because in the second line "in > a" is a matrix of 0's and 1's (not all of its elements are 1, nor should they be), so the debugger won't branch into the if statement.
Could you guide me on how to write this function and how to apply spatial and Fourier analyses to the image, without hurting performance and run-time speed?
Here's a couple of suggestions:
First of all, you don't need to add 50 to each layer of the RGB matrix individually. You can just do:
b = b + 50;
Why do you reshape b before passing it to bsxfun? The size of the output of bsxfun is the same as your image's; there's really no need to reshape anything here.
Regarding your test function, note what the official documentation of bsxfun states:
A binary element-wise function of the form C = fun(A,B) accepts arrays A and B of arbitrary but equal size and returns output of the same size. Each element in the output array C is the result of an operation on the corresponding elements of A and B only. fun must also support scalar expansion, such that if A or B is a scalar, C is the result of applying the scalar to every element in the other input array.
So bsxfun performs singleton expansion and "inflates" its two input arrays to the same size, and then applies the specified function to the inflated arrays. The element-wise function fun operates, in fact, on the arrays, not scalars. I don't see any actual gain in employing bsxfun here.
That said, you can simplify your code as shown in Dan's suggestion, or implement it as a function:
function out = test(in, a)
out = in;
out(in > a) = a;
I assume that if you were using the value 210 instead of 200, you'd like to cap all gray levels at 210 as well, so you should really be using a instead of the hard-coded value 200. You could also write your function like so:
function out = test(in, a)
out = min(in, a);
and then invoke it with:
d = test(b, 200);
instead of the more complicated d = bsxfun(@test, b, 200).
Another alternative is to use arrayfun:
d = arrayfun(@(x)test(x, 200), b);
or
d = arrayfun(@test, b, 200 * ones(size(b)));
in which arrayfun will apply test element-wise, and the test function would need to operate only on scalars. However, arrayfun usually runs slower than loops, let alone vectorized operations.
For spatial analysis, check out conv2 just like Dan suggested (or implement your own 2-D convolution, for the sake of practice). For Fourier analysis, consider using the fft2 and ifft2 functions in the frequency domain.
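If you do go the Fourier route, here is a minimal sketch for one colour channel (assuming b is the image read with imread, as in your question):
% Sketch only: a 5x5 averaging (low-pass) filter applied in the frequency domain.
% Up to boundary handling this matches spatial convolution (here it is circular,
% because of the DFT).
lpf = ones(5) / 25;                      % low-pass kernel
g   = double(b(:,:,1));                  % one colour channel, as double
H   = fft2(lpf, size(g,1), size(g,2));   % kernel zero-padded to the image size
G   = fft2(g);                           % transform of the channel
out = real(ifft2(G .* H));               % multiply, then transform back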
Hope this helps!
So for the example you posted you can just take advantage of the fact that most operators in matlab work on matrices natively:
b=imread(a);
c = b + 50;
c(c > 200) = 200;
It's as simple as that.
For the filtering, if you are allowed, I would have a look at the conv2 function. You can do spatial filtering this way without transforming to the frequency domain (remember, multiplication in the frequency domain is the same as convolution in the spatial domain). So for example a basic low pass filter:
lpf = ones(5)./25;
c(:,:,1) = conv2(b(:,:,1), lpf);
c(:,:,2) = conv2(b(:,:,2), lpf);
c(:,:,3) = conv2(b(:,:,3), lpf);
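One small caveat on the snippet above: conv2 returns the 'full' convolution by default, so each filtered channel comes out slightly larger than the input; passing 'same' as a third argument, e.g. conv2(b(:,:,1), lpf, 'same'), keeps the original size, and depending on your MATLAB version you may also need to convert the channel to double first.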
