Is it possible to do a poisson distribution with the probabilities based on integers? - probability

Working within Solidity and the Ethereum EVM and Decimals don't exist. Is there a way I could mathematically still create a Poisson distribution using integers ? it doesnt have to be perfect, i.e rounding or losing some digits may be acceptable.

Let me preface by stating that what follows is not going to be (directly) helpful to you with etherium/solidity. However, it produces probability tables that you might be able to use for your work.
I ended up intrigued by the question of how accurate you could be in expressing the Poisson probabilities as rationals, so I put together the following script in Ruby to try things out:
def rational_poisson(lmbda)
Hash.new.tap do |h| # create a hash and pass it to this block as 'h'.
# Make all components of the calculations rational to allow
# cancellations to occur wherever possible when dividing
e_to_minus_lambda = Math.exp(-lmbda).to_r
factorial = 1r
lmbda = lmbda.to_r
power = 1r
(0...).each do |x|
unless x == 0
power *= lmbda
factorial *= x
end
value = (e_to_minus_lambda / factorial) * power
# the following double inversion/conversion bounds the result
# by the significant bits in the mantissa of a float
approx = Rational(1, (1 / value).to_f)
h[x] = approx
break if x > lmbda && approx.numerator <= 1
end
end
end
if __FILE__ == $PROGRAM_NAME
lmbda = (ARGV.shift || 2.0).to_f # read in a lambda (defaults to 2.0)
pmf = rational_poisson(lmbda) # create the pmf for a Poisson with that lambda
pmf.each { |key, value| puts "p(#{key}) = #{value} = #{value.to_f}" }
puts "cumulative error = #{1.0 - pmf.values.inject(&:+)}" # does it sum to 1?
end
Things to know as you glance through the code. Appending .to_r to a value or expression converts it to a rational, i.e., a ratio of two integers; values with an r suffix are rational constants; and (0...).each is an open-ended iterator which will loop until the break condition is met.
That little script produces results such as:
localhost:pjs$ ruby poisson_rational.rb 1.0
p(0) = 2251799813685248/6121026514868073 = 0.36787944117144233
p(1) = 2251799813685248/6121026514868073 = 0.36787944117144233
p(2) = 1125899906842624/6121026514868073 = 0.18393972058572117
p(3) = 281474976710656/4590769886151055 = 0.061313240195240384
p(4) = 70368744177664/4590769886151055 = 0.015328310048810096
p(5) = 17592186044416/5738462357688819 = 0.003065662009762019
p(6) = 1099511627776/2151923384133307 = 0.0005109436682936699
p(7) = 274877906944/3765865922233287 = 7.299195261338141e-05
p(8) = 34359738368/3765865922233287 = 9.123994076672677e-06
p(9) = 67108864/66196861914257 = 1.0137771196302974e-06
p(10) = 33554432/330984309571285 = 1.0137771196302975e-07
p(11) = 33554432/3640827405284135 = 9.216155633002704e-09
p(12) = 4194304/5461241107926203 = 7.68012969416892e-10
p(13) = 524288/8874516800380079 = 5.907792072437631e-11
p(14) = 32768/7765202200332569 = 4.2198514803125934e-12
p(15) = 256/909984632851473 = 2.8132343202083955e-13
p(16) = 16/909984632851473 = 1.7582714501302472e-14
p(17) = 1/966858672404690 = 1.0342773236060278e-15
cumulative error = 0.0

Related

speed up loop in matlab

I'm very new in MATLAB (this is my first script).
I wonder how may I speed up this loop, I don't know any toolbox or 'tricks' as I'm a newbie on it. I tried to code it with instinct, it works, but it is really long.
All are variables get with fread or integer manually entered, so this is basically simple math, but I have no clue on why is it so long (maybe nested loops ?) and how to improve, as I am more familiar with Python and for example multiprocess.
Thanks a lot
X = 0;
Points = [0,0,0];
for i=1:nbLines
for j=1:nbPositions-1
if lDate(i)>posDate(j) && lDate(i)<=posDate(j+1)
weight = (lDate(i) - posDate(j)) / (posDate(j+1)- posDate(j));
X = posX(j)*(1-weight) + posX(j+1) * weight;
end
end
if X ~= 0
for j=1:nbScans
Y = - distance(i,j) / tan(angle(i,j));
Points = [Points;X, Y, distance(i,j)];
end
end
end
X = 0;
Points = cell([],1) ;
Points{1} = [0,0,0];
count = 1 ;
for i=1:nbLines
id = find(lDate(i)>posDate & lDate(i)<=posDate) ;
if length(id) > 1
weight = (lDate(i) - posDate(id(1))) / (posDate(id(end))- posDate(id(1)));
X = posX(id(1))*(1-weight) + posX(id(end)) * weight;
end
if X ~= 0
j=1:nbScans ;
count = count+1 ;
Y = - distance(i,j)./tan(angle(i,j));
Points{count} = [repelem(X,size(Y,2),size(Y),1), Y, distance(i,j)'];
end
end
You have one issue with the given code. The blow line:
Points = [Points; X, Y, distance(i,j)];
This will definitely slow up your code. You need to initialize this array to store the numbers. If you initialize it, you will find good difference in speed.
X = 0;
Points = zeros([],3) ;
Points(1,:) = [0,0,0];
count = 1 ;
for i=1:nbLines
for j=1:nbPositions-1
if lDate(i)>posDate(j) && lDate(i)<=posDate(j+1)
weight = (lDate(i) - posDate(j)) / (posDate(j+1)- posDate(j));
X = posX(j)*(1-weight) + posX(j+1) * weight;
end
end
if X ~= 0
for j=1:nbScans
count = count+1 ;
Y = - distance(i,j) / tan(angle(i,j));
Points(count,:) = [X, Y, distance(i,j)];
end
end
end
Note that, you code only saves the last value of X, is this what you want?
try using parallelization- "parfor" instead of "for" that uses all available processors.
parfor i=1:nbLines
rest of code here
end

Can anyone explain how different is this hybrid PSOGA from normal GA?

Does this code have mutation, selection, and crossover, just like the original genetic algorithm.
Since this, a hybrid algorithm (i.e PSO with GA) does it use all steps of original GA or skips some
of them.Please do tell me.
I am just new to this and still trying to understand. Thank you.
%%% Hybrid GA and PSO code
function [gbest, gBestScore, all_scores] = QAP_PSO_GA(CreatePopFcn, FitnessFcn, UpdatePosition, ...
nCity, nPlant, nPopSize, nIters)
% Set algorithm parameters
constant = 0.95;
c1 = 1.5; %1.4944; %2;
c2 = 1.5; %1.4944; %2;
w = 0.792 * constant;
% Allocate memory and initialize
gBestScore = inf;
all_scores = inf * ones(nPopSize, nIters);
x = CreatePopFcn(nPopSize, nCity);
v = zeros(nPopSize, nCity);
pbest = x;
% update lbest
cost_p = inf * ones(1, nPopSize); %feval(FUN, pbest');
for i=1:nPopSize
cost_p(i) = FitnessFcn(pbest(i, 1:nPlant));
end
lbest = update_lbest(cost_p, pbest, nPopSize);
for iter = 1 : nIters
if mod(iter,1000) == 0
parents = randperm(nPopSize);
for i = 1:nPopSize
x(i,:) = (pbest(i,:) + pbest(parents(i),:))/2;
% v(i,:) = pbest(parents(i),:) - x(i,:);
% v(i,:) = (v(i,:) + v(parents(i),:))/2;
end
else
% Update velocity
v = w*v + c1*rand(nPopSize,nCity).*(pbest-x) + c2*rand(nPopSize,nCity).*(lbest-x);
% Update position
x = x + v;
x = UpdatePosition(x);
end
% Update pbest
cost_x = inf * ones(1, nPopSize);
for i=1:nPopSize
cost_x(i) = FitnessFcn(x(i, 1:nPlant));
end
s = cost_x<cost_p;
cost_p = (1-s).*cost_p + s.*cost_x;
s = repmat(s',1,nCity);
pbest = (1-s).*pbest + s.*x;
% update lbest
lbest = update_lbest(cost_p, pbest, nPopSize);
% update global best
all_scores(:, iter) = cost_x;
[cost,index] = min(cost_p);
if (cost < gBestScore)
gbest = pbest(index, :);
gBestScore = cost;
end
% draw current fitness
figure(1);
plot(iter,min(cost_x),'cp','MarkerEdgeColor','k','MarkerFaceColor','g','MarkerSize',8)
hold on
str=strcat('Best fitness: ', num2str(min(cost_x)));
disp(str);
end
end
% Function to update lbest
function lbest = update_lbest(cost_p, x, nPopSize)
sm(1, 1)= cost_p(1, nPopSize);
sm(1, 2:3)= cost_p(1, 1:2);
[cost, index] = min(sm);
if index==1
lbest(1, :) = x(nPopSize, :);
else
lbest(1, :) = x(index-1, :);
end
for i = 2:nPopSize-1
sm(1, 1:3)= cost_p(1, i-1:i+1);
[cost, index] = min(sm);
lbest(i, :) = x(i+index-2, :);
end
sm(1, 1:2)= cost_p(1, nPopSize-1:nPopSize);
sm(1, 3)= cost_p(1, 1);
[cost, index] = min(sm);
if index==3
lbest(nPopSize, :) = x(1, :);
else
lbest(nPopSize, :) = x(nPopSize-2+index, :);
end
end
If you are new to Optimization, I recommend you first to study each algorithm separately, then you may study how GA and PSO maybe combined, Although you must have basic mathematical skills in order to understand the operators of the two algorithms and in order to test the efficiency of these algorithm (this is what really matter).
This code chunk is responsible for parent selection and crossover:
parents = randperm(nPopSize);
for i = 1:nPopSize
x(i,:) = (pbest(i,:) + pbest(parents(i),:))/2;
% v(i,:) = pbest(parents(i),:) - x(i,:);
% v(i,:) = (v(i,:) + v(parents(i),:))/2;
end
Is not really obvious how selection randperm is done (I have no experience about Matlab).
And this is the code that is responsible for updating the velocity and position of each particle:
% Update velocity
v = w*v + c1*rand(nPopSize,nCity).*(pbest-x) + c2*rand(nPopSize,nCity).*(lbest-x);
% Update position
x = x + v;
x = UpdatePosition(x);
This version of velocity updating strategy is utilizing what is called Interia-Weight W, which basically mean we are preserving the velocity history of each particle (not completely recomputing it).
It worth mentioning that velocity updating is done more often than crossover (each 1000 iteration).

The levenberg-marquardt method for solving non-linear equations

I tried implement the levenberg-marquardt method for solving non-linear equations on Julia based on Numerical Optimization using the
Levenberg-Marquardt Algorithm presentation. This my code:
function get_J(ArrOfFunc,X,delta)
N = length(ArrOfFunc)
J = zeros(Float64,N,N)
for i = 1:N
for j=1:N
Temp = copy(X);
Temp[j]=Temp[j]+delta;
J[i,j] = (ArrOfFunc[i](Temp)-ArrOfFunc[i](X))/delta;
end
end
return J
end
function get_resudial(ArrOfFunc,Arg)
return map((x)->x(Arg),ArrOfFunc)
end
function lm_solve(Funcs,Init)
X = copy(Init)
delta = 0.01;
Lambda = 0.01;
Factor = 2;
J = get_J(Funcs,X,delta)
R = get_resudial(Funcs,X)
N = 5
for t = 1:N
G = J'*J+Lambda.*eye(length(X))
dC = J'*R
C = sum(R.*R)/2;
Xnew = X-(inv(G)\dC);
Rnew = get_resudial(Funcs,Xnew)
Cnew = sum(Rnew.*Rnew)/2;
if ( Cnew < C)
X = Xnew;
R = Rnew;
Lambda = Lambda/Factor;
J = get_J(Funcs,X,delta)
else
Lambda = Lambda*Factor;
end
if(maximum(abs(Rnew)) < 0.001)
return X
end
end
return X
end
function test()
ArrOfFunc = [
(X)->X[1]+X[2]-2;
(X)->X[1]-X[2]
];
X = lm_solve(ArrOfFunc,Float64[3;3])
println(X)
return X
end
But from any starting point the step not accepted. What's I doing wrong?
Any help would be appreciated.
I have at the moment no way to test this, but one line does not make sense mathematically:
In the computation of Xnew it should be either inv(G)*dC or G\dC, but not a mix of both. Preferably the second, since the solution of a linear system does not require the computation of the inverse matrix.
With this one wrong calculation at the center of the iteration, the trajectory of the computation is almost surely going astray.

K-means for color quantization - Code not vectorized

I'm doing this exercise by Andrew NG about using k-means to reduce the number of colors in an image. It worked correctly but I'm afraid it's a little slow because of all the for loops in the code, so I'd like to vectorize them. But there are those loops that I just can't seem to vectorize effectively. Please help me, thank you very much!
Also if possible please give some feedback on my coding style :)
Here is the link of the exercise, and here is the dataset.
The correct result is given in the link of the exercise.
And here is my code:
function [] = KMeans()
Image = double(imread('bird_small.tiff'));
[rows,cols, RGB] = size(Image);
Points = reshape(Image,rows * cols, RGB);
K = 16;
Centroids = zeros(K,RGB);
s = RandStream('mt19937ar','Seed',0);
% Initialization :
% Pick out K random colours and make sure they are all different
% from each other! This prevents the situation where two of the means
% are assigned to the exact same colour, therefore we don't have to
% worry about division by zero in the E-step
% However, if K = 16 for example, and there are only 15 colours in the
% image, then this while loop will never exit!!! This needs to be
% addressed in the future :(
% TODO : Vectorize this part!
done = false;
while done == false
RowIndex = randperm(s,rows);
ColIndex = randperm(s,cols);
RowIndex = RowIndex(1:K);
ColIndex = ColIndex(1:K);
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
Centroids = sort(Centroids,2);
Centroids = unique(Centroids,'rows');
if size(Centroids,1) == K
done = true;
end
end;
% imshow(imread('bird_small.tiff'))
%
% for i = 1 : K
% hold on;
% plot(RowIndex(i),ColIndex(i),'r+','MarkerSize',50)
% end
eps = 0.01; % Epsilon
IterNum = 0;
while 1
% E-step: Estimate membership given parameters
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
Dist = pdist2(Points,Centroids,'euclidean');
[~, WhichCentroid] = min(Dist,[],2);
% M-step: Estimate parameters given membership
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
% TODO: Vectorize this part!
OldCentroids = Centroids;
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
% Check for convergence: Here we use the L2 distance
IterNum = IterNum + 1;
Margins = sqrt(sum((Centroids - OldCentroids).^2, 2));
if sum(Margins > eps) == 0
break;
end
end
IterNum;
Centroids ;
% Load the larger image
[LargerImage,ColorMap] = imread('bird_large.tiff');
LargerImage = double(LargerImage);
[largeRows,largeCols,NewRGB] = size(LargerImage); % RGB is always 3
% TODO: Vectorize this part!
largeRows
largeCols
NewRGB
% Replace each of the pixel with the nearest centroid
NewPoints = reshape(LargerImage,largeRows * largeCols, NewRGB);
Dist = pdist2(NewPoints,Centroids,'euclidean');
[~,WhichCentroid] = min(Dist,[],2);
NewPoints = Centroids(WhichCentroid,:);
LargerImage = reshape(NewPoints,largeRows,largeCols,NewRGB);
% for i = 1 : largeRows
% for j = 1 : largeCols
% Dist = pdist2(Centroids,reshape(LargerImage(i,j,:),1,RGB),'euclidean');
% [~,WhichCentroid] = min(Dist);
% LargerImage(i,j,:) = Centroids(WhichCentroid,:);
% end
% end
% Display new image
imshow(uint8(round(LargerImage)),ColorMap)
UPDATE: Replaced
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
with
for i = 1 : K
Centroids(i,:) = Image(RowIndex(i),ColIndex(i),:);
end
I think this may be vectorized further by using linear indexing, but for now I should just focus on the while loop since it takes most of the time.
Also when I tried #Dev-iL's suggestion and replaced
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
with
E = sparse(1:size(WhichCentroid), WhichCentroid' , 1, Num, K, Num);
Centroids = (E * spdiags(1./sum(E,1)',0,K,K))' * Points ;
the results were always worse: With K = 16, the first takes 2,414s , the second takes 2,455s ; K = 32, the first takes 4,529s , the second takes 5,022s. Seems like vectorization does not help, but maybe there's something wrong with my code :( .
Replaced
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
with
for i = 1 : K
Centroids(i,:) = Image(RowIndex(i),ColIndex(i),:);
end
I think this may be vectorized further by using linear indexing, but for now I should just focus on the while loop since it takes most of the time.
Also when I tried #Dev-iL's suggestion and replaced
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
with
E = sparse(1:size(WhichCentroid), WhichCentroid' , 1, Num, K, Num);
Centroids = (E * spdiags(1./sum(E,1)',0,K,K))' * Points ;
the results were always worse: With K = 16, the first takes 2,414s , the second takes 2,455s ; K = 32, the first took 4,529s , the second took 5,022s. Seems like vectorization did not help in this case.
However, when I replaced
Dist = pdist2(Points,Centroids,'euclidean');
[~, WhichCentroid] = min(Dist,[],2);
(in the while loop) with
Dist = bsxfun(#minus,dot(Centroids',Centroids',1)' / 2 , Centroids * Points' );
[~, WhichCentroid] = min(Dist,[],1);
WhichCentroid = WhichCentroid';
the code ran much faster, especially when K is large (K=32)
Thank you everyone!

Solve wrong type argument 'cell'

I write in variable 'O' some values using
for i = 1:size(I,1)
for j = 1:size(1,I)
h = i * j;
O{h} = I(i, j) * theta(h);
end
end
I - double, theta - double.
I need to sum()all 'O' values, but when I do it its give me error: sum: wrong type argument 'cell'.
How can I sum() it?
P.s. when I want to see O(), its give me
O =
{
[1,1] = 0.0079764
[1,2] = 0.0035291
[1,3] = 0.0027539
[1,4] = 0.0034392
[1,5] = 0.017066
[1,6] = 0.0082958
[1,7] = 1.4764e-04
[1,8] = 0.0024597
[1,9] = 1.1155e-04
[1,10] = 0.0010342
[1,11] = 0.0039654
[1,12] = 0.0047713
[1,13] = 0.0054305
[1,14] = 3.3794e-04
[1,15] = 0.014323
[1,16] = 0.0026826
[1,17] = 0.013864
[1,18] = 0.0097778
[1,19] = 0.0058029
[1,20] = 0.0020726
[1,21] = 0.0016430
etc...
The exact answer to your question is to use cell2mat
sum (cell2mat (your_cell_o))
However, this is the very wrong way to solve your problem. The thing is that you should not have created a cell array in first place. You should have created a numeric array:
O = zeros (size (I), class (I));
for i = 1:rows (I)
for j = 1:columns (I)
h = i * j;
O(h) = I(i, j) * theta(h);
endfor
endfor
but even this is just really bad and slow. Octave is a language to vectorize operations. Instead, you should have:
h = (1:rows (I))' .* (1:columns (I)); # automatic broadcasting
O = I .* theta (h);
which assumes your function theta behaves properly and if givena matrix will compute the value for each of the element of h and return something of the same size.
If you get an error about wrong sizes, I will guess you have an old version of Octave that does not perform automatic broadcasting. If so, update Octave. If you really can't, then:
h = bsxfun (#times, (1:rows (I))', 1:columns (I));

Resources