Create a random distribution programmatically in AnyLogic with a double array of probabilities and plot it

Is it possible to create a CustomDistribution in AnyLogic from a double array interval_start and a double array probability that is one element shorter? It should look like this:
CustomDistribution c = new CustomDistribution(interval_start, probability);
I could not find any constructor for this case - more specifically for probabilities. The only similar constructor I could find was this:
CustomDistribution(double[] intervalStarts, double[] numberOfObservations)
My second question is how can I plot this distribution in AnyLogic?

Custom distributions just use the relative numbers of observations as probabilities (i.e., the ratios between them determine the probabilities). You can actually use probabilities (0-1 values) as "number of observed values"; "observed values" is just the way AnyLogic chose to name it.
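To illustrate why probabilities can stand in for observation counts, here is a minimal Python sketch of a piecewise-uniform distribution built from interval starts and weights. It is not AnyLogic code and not AnyLogic's actual implementation; the function names are hypothetical, and it assumes (as in the question) that the starts array has one more element than the weights array, so the last start closes the final interval.

```python
import bisect
import random


def normalize(weights):
    """Turn relative weights (counts or probabilities) into probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


def sample(interval_starts, weights, rng=random):
    """Draw one value: pick an interval by weight, then a uniform point in it."""
    probs = normalize(weights)
    cumulative = []
    running = 0.0
    for p in probs:
        running += p
        cumulative.append(running)
    u = rng.random()
    # Guard against rounding leaving the last cumulative value slightly below 1.
    i = min(bisect.bisect_right(cumulative, u), len(probs) - 1)
    lo, hi = interval_starts[i], interval_starts[i + 1]
    return lo + (hi - lo) * rng.random()


starts = [0.0, 1.0, 2.0, 4.0]        # three intervals: [0,1), [1,2), [2,4)
counts = [20, 50, 30]                # "number of observations"
probabilities = [0.2, 0.5, 0.3]      # the same ratios expressed as 0-1 values
```

Because only the ratios matter, normalize(counts) and normalize(probabilities) describe the same distribution.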

Failed to convert structure to matrix with regionprops in MATLAB

I am working with particle tracking in images in MATLAB and using regionprops function. On the provided resource there is an example with circles:
stats = regionprops('table',bw,'Centroid',...
'MajorAxisLength','MinorAxisLength')
centers = stats.Centroid;
diameters = mean([stats.MajorAxisLength stats.MinorAxisLength],2);
radii = diameters/2;
In my MATLAB R2014b, the line centers = stats.Centroid; produces an undesired result: my stats structure array has 20 elements, each with a Centroid of two numbers (the coordinates of the center of the region). However, after that command, my variable centers is only a 1x2 matrix instead of the desired 20x2.
Screenshot attached.
I tried to go around this with different methods. The only solution I found is to do:
t=zeros(20,2);
for i=1:20
t(i,:)=stats(i).Centroid;
end
However, as we all know loops are slow in MATLAB. Is there another method that takes advantage of MATLAB matrix operations?
Doing stats.Centroid would in fact give you a comma-separated list of centroids, so MATLAB would only give you the first centre of that matrix if you did centers = stats.Centroid. What you must do is encapsulate the centres in an array (i.e. [stats.Centroid]), then reshape when you're done.
Something like this should work for you:
centers = reshape([stats.Centroid], 2, []).';
What this will do is read in the centroids as a 1 x 2*M array, where M is the total number of blobs. Because MATLAB reshapes in column-major order, you must specify the number of rows as 2 and let MATLAB figure out the number of columns by itself. You then transpose the result to get the M x 2 matrix you want.
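For readers who want to see the column-major step spelled out, here is a minimal sketch in plain Python rather than MATLAB. It assumes the centroids arrive as one flat sequence x1, y1, x2, y2, ... (which is what [stats.Centroid] produces); the helper names are hypothetical.

```python
def reshape_column_major(flat, n_rows):
    """Fill an n_rows x n_cols matrix column by column, as MATLAB's reshape does."""
    n_cols = len(flat) // n_rows
    return [[flat[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


flat = [10.5, 20.0, 33.25, 7.0, 1.0, 2.0]    # x1, y1, x2, y2, x3, y3
two_by_m = reshape_column_major(flat, 2)      # row 1 holds all x, row 2 all y
centers = transpose(two_by_m)                 # one [x, y] pair per blob
```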
Minor Note
If you look at the regionprops documentation page in their Tips section - http://www.mathworks.com/help/images/ref/regionprops.html#buorh6l-1, you will see that they surround stats.Area, which is the area of each blob, with [] brackets to ensure that the comma-separated list of values is encapsulated in an array. This is not an accident: the brackets are there for exactly the reason described above.

Sort labels of segmented image in kmeans based on cluster mean

I have a simple but interesting question. As you know, k-means can give a different result on each run because the initial cluster centers are chosen randomly. However, assume I know that cluster 1 has a smaller mean value than cluster 2, cluster 2 has a smaller mean value than cluster 3, and so on. I want an algorithm that assigns the smallest cluster index to the cluster with the smallest mean value, and so on in ascending order.
This is my MATLAB code. If you have a shorter or clearer way, please suggest it.
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
cluster_label = reshape(cluster_idx,nrows,ncols);
%% Sort based on mu
[mu_sort id_sort]=sort(mu);
idx=cell(1,num_cluster);
%% Save index of order of mu
for i=1:num_cluster
idx{i}=find(cluster_label==id_sort(i));
end
%% Sort cluster label based on mu
for i=1:num_cluster
cluster_label(idx{i})=i;
end
It's unclear to me as to why you'd want to relabel the clusters based on the ordering of each centroid. You can simply use the labelling vector that is output from k-means to reference which cluster / centroid each point belongs to.
Nevertheless, the initial idea that you had to sort the centroids is a good one. The last part of your code seems rather inefficient because you're looping over each label and doing the reassignment. One thing I could perhaps suggest is to have a lookup table where the input is the original label and the output is the reordered labels based on the sorted centroids.
If you want to pursue this route, you can use a containers.Map where the keys are the labels given by the sort order that is output from sort, and the values are the reordered labels, namely a vector that goes from 1 up to the number of classes you have. You need to do this because the second output of sort tells you, for each position in the sorted result, which index of the original array that value came from, so you must use this ordering to perform the relabelling properly. In addition, I would use the sortrows function in MATLAB, not raw sort. With how you're doing it, you are sorting each column / variable independently, and that will give the wrong centroids once there is more than one column. Raw sort works for grayscale images, where you only have one feature to consider (the gray level), but if you go beyond grayscale into RGB or whatever colour space you desire, it will give you incorrect results. You need to consider each row as a single point, then sort the rows jointly.
Given your code, you'd do something like this:
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
%% Sort based on mu
[mu_sort id_sort]=sortrows(mu);
%// New - Create lookup
lookup = containers.Map(id_sort, 1:size(mu_sort,1));
%// Relabel the vector
cluster_idx_sort = lookup.values(num2cell(cluster_idx));
cluster_idx_sort = [cluster_idx_sort{:}];
%// Reshape back to original image dimensions
cluster_label = reshape(cluster_idx_sort,nrows,ncols);
This should hopefully give you some speedup in your code.
To double check, I tried this on the cameraman.tif image, that's part of the image processing toolbox. Running the code gives me these cluster centres:
>> mu
mu =
153.3484
23.7291
Once I sort the clusters in ascending order, this is what I get for the ordering and for the centroids:
>> mu_sort
mu_sort =
23.7291
153.3484
>> id_sort
id_sort =
2
1
So that works as we expected... now if we display the original cluster label map before sorting on the centroids with:
cluster_label = reshape(cluster_idx, nrows, ncols);
imshow(cluster_label,[]);
... we get this image:
Now, if we run through the sorting logic and display the relabelled cluster map:
imshow(cluster_label, []);
... we get this image:
This works as I expected. Because the centroids flipped, so should the colouring.
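The lookup-table idea can also be sketched outside MATLAB. The following is a minimal Python illustration, not a translation of the code above: the function name and the small label vector are made up for the example, while the two centroid values are the ones reported for cameraman.tif. Labels are 1-based to match the k-means output.

```python
def relabel_by_centroid(labels, centroids):
    """Renumber 1-based cluster labels so label 1 has the smallest centroid."""
    # order[k] is the original label of the (k+1)-th smallest centroid.
    order = sorted(range(1, len(centroids) + 1),
                   key=lambda label: centroids[label - 1])
    # Lookup table: original label -> new label (its rank in sorted order).
    lookup = {old: new for new, old in enumerate(order, start=1)}
    return [lookup[label] for label in labels], lookup


centroids = [153.3484, 23.7291]       # cluster 1 is bright, cluster 2 is dark
labels = [1, 1, 2, 1, 2, 2]           # hypothetical per-pixel labels
new_labels, lookup = relabel_by_centroid(labels, centroids)
```

If each centroid is a tuple of several features, Python compares the tuples element by element, which orders whole rows jointly in the same spirit as sortrows.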

Get equation for 3d shape

I have 2 arrays say X and Y. Each have 5 elements. Now for each possible combination of (X,Y) I have a Z value, so Z is a 5x5 matrix.
I am looking to find a formula e.g. z=f(x,y). Any idea about how that can be done.
I tried MS Excel surface chart, but it doesn't give any equation or curve fitting on surface charts.
In general I would suggest using other software such as SciLab or MATLAB for this task. These products are better suited to computational mathematics than Excel.
But Excel has some built-in features that maybe will help you.
First note:
You will need to use the add-in called "Solver". This add-in comes with Excel, but it may not be installed by default.
One description of how to install the add-in (there are thousands available on the web) can be found here:
Solver Add-in
If you are done with this, the next step is to create a sheet with the data.
I tried to generate an example shown in the picture below.
The range C5:G9 holds the matrix you want to approximate by a function.
So it's the z=f(x,y) Matrix.
The Chart beside is just the 3D-Plot of your (in this case my) original data.
Now it will become a little bit mathematical....
You need a general type of function which will be used to do the approximation.
The quality of the result depends on how well this function can approximate your data.
In the example I used an approach with a 2nd order approximation (maximum quadratic terms).
My example function is z=a*x^2 + b*y^2 + c *x*y + d*x + e*y +f.
If you need more, try it with a third order term (including also x^3, y^3 , ...).
I didn't want to do this in the example, because I dislike typing long formulas in Excel.
Typing a long formula is the next step:
Now we have to fill the range C15:G19 with the values of the calculated formula. But before this, we have to define the polynomial coefficients in the range J14:J19. As a starting value, you can use 1 for all coefficients (the picture shows the solution after running the solver).
The formula in Cell C15 is =$J$14*C$14^2+$J$15*$B15^2+$J$16*C$14*$B15+$J$17*C$14+$J$18*$B15+$J$19
It should be easy to copy it to the other cells of the Matrix.
The plot beside this is showing the result of our approximation function.
Now we have to prepare the solver. The solver needs to optimize somehow.
Therefore we need to define a function which indicates the quality of our approximation.
I used the least-squares value; search the web for explanations.
In the range C24:G28 I calculated the squares of the differences between our approximation function and the original data. Cell C24 has the formula =(C15-C5)^2
Now we are nearly finished. Just copy this formula to the rest of the range and then add one very important cell:
Put the sum of the range C24:G28 in cell H29.
This value is the sum of the squared errors, that is, the total squared difference between our approximation function and the original data points.
Now the most important step:
Select Cell H29 and start the solver add-in:
This window will pop-up (sorry I have a German Excel installation on my PC)
Just fill in the target cell $H$29, the target value 0, and (important) the variable cells $J$14:$J$19.
Press "Solve" and ... tada, the polynomial coefficients have changed to fit your data with the function.
Is this what you have been searching for?
Kindly Regards
Axel
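For comparison, the same least-squares fit of z=a*x^2 + b*y^2 + c*x*y + d*x + e*y + f can be sketched in a few lines of plain Python. This is a swapped-in technique, not what the Excel Solver does internally: instead of iterating from starting values, it solves the normal equations directly with Gaussian elimination. The function names are hypothetical, and z is assumed to be laid out as in the sheet, with one row per y value and one column per x value.

```python
def design_row(x, y):
    """Terms of the quadratic model, in the order a, b, c, d, e, f."""
    return [x * x, y * y, x * y, x, y, 1.0]


def solve_linear(matrix, rhs):
    """Solve matrix * v = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_quadratic_surface(xs, ys, z):
    """Return [a, b, c, d, e, f] minimising the sum of squared errors."""
    rows, targets = [], []
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            rows.append(design_row(x, y))
            targets.append(z[i][j])
    k = 6
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    atz = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(k)]
    return solve_linear(ata, atz)
```

Calling fit_quadratic_surface(X, Y, Z) with the two 5-element arrays and the 5x5 matrix returns the six coefficients in the same order as cells J14:J19.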
You may google for and try ThreeDify Excel Grapher v4.5, an Excel add-in that includes a 3D equation fitter with an auto-equation finder.

How to generate Bad Random Numbers

I'm sure the opposite has been asked many times but I couldn't find any answers on how to generate bad random numbers.
I want to write a small program for cluster analysis and want to generate some random Points for testing. If I would just insert 1000 Points with random coordinates they would be scattered all over the field which would make a cluster analysis worthless.
Is there a simple way to generate Random Numbers which build clusters?
I already thought about either not using random() but random()*random() which generates normally distributed numbers (I think I read this somewhere here on Stack Overflow).
Second approach would be picking a few areas at random and run the point generation again in this area which would of course produce a cluster in this area.
Do you have a better idea?
If you are deliberately producing well formed clusters (rather than completely random clusters), you could combine the two to find a cluster center, and then put lots of points around it in a normal distribution.
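A minimal Python sketch of that combination, assuming a square field and the same spread for every cluster (the function name and all parameters are hypothetical):

```python
import random


def gaussian_clusters(n_clusters, points_per_cluster, spread, size=100.0, seed=None):
    """Pick random cluster centres, then scatter normally distributed points around each."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_clusters):
        # Cluster centre: uniformly random inside the field.
        cx, cy = rng.uniform(0.0, size), rng.uniform(0.0, size)
        for _ in range(points_per_cluster):
            # Each coordinate is normal around the centre; spread is the std deviation.
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


points = gaussian_clusters(n_clusters=4, points_per_cluster=250, spread=3.0, seed=42)
```

Passing a seed makes the test data reproducible, which helps when comparing clustering runs. Note that points near the edge of the field can fall slightly outside it.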
As well as working in Cartesian coordinates (x,y), you could use a radial method to distribute points for a particular cluster. Choose a random angle (0-2PI radians), then choose a radius.
Note that as circumference is proportional to radius, a uniformly chosen radius gives a point density per unit area that is higher close to the centre, even though the number of points per radius is the same. Modify the radial distribution to produce a more tightly packed cluster.
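A minimal sketch of the radial method in Python. The power parameter is an assumption introduced here to show one way of modifying the radial distribution: 1.0 draws the radius uniformly (denser near the centre), 0.5 gives uniform density over the disc area, and values above 1.0 pack the points more tightly around the centre.

```python
import math
import random


def radial_cluster(cx, cy, max_radius, n_points, power=1.0, seed=None):
    """Scatter points around (cx, cy) using a random angle and a random radius."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # rng.random() is in [0, 1); raising it to a power reshapes the radius distribution.
        radius = max_radius * rng.random() ** power
        points.append((cx + radius * math.cos(angle),
                       cy + radius * math.sin(angle)))
    return points


tight = radial_cluster(50.0, 50.0, 10.0, 500, power=2.0, seed=7)
```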
OR you could use real world derived data for semi-random point distributions with natural clustering. Recently I've been doing quite a bit of geospatial cluster analysis. For this I have used real world data - zipcode centroids (which form natural clusters around cities); and restaurant locations. Another suggestion: you could use a stellar catalogue or galactic catalogue.
Generate a few anchors using true random numbers. Then generate noise around them:
anchor + dist * (random() - 0.5)
This will generate clustered numbers that are uniformly distributed within dist/2 on either side of each anchor.
Add an additional dimension to your model.
Draw an irregular (i.e. not flat) surface.
Generate numbers in the extended space.
Discard all numbers which are on one side of the surface.
From every number left, drop the additional dimension.
Maybe I have misunderstood, but the GNU Scientific Library (written in C) has many distributions built in. Could you not pick coordinates from the Gaussian, Poisson, etc. distributions in that library?
http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Distributions.html
They provide a simple example with the Poisson distribution from the link, too.
If you need your distribution to be bounded (for example y-coordinate not less than -1) then you can achieve that by rejection sampling from the uniform distribution in the gsl.
Blessings, Tom
My first thought was that you could implement your own using a linear congruential generator and experiment with the coefficients until you get a low enough period to suit your needs. A really low m coefficient should do the trick.
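A minimal sketch of that first idea in Python, using the standard recurrence state = (a * state + c) mod m. The coefficient values are illustrative assumptions, not recommendations; the point is only that the period can never exceed m, so a small m forces the sequence to repeat quickly.

```python
def lcg(seed, a, c, m):
    """Yield an endless stream from a linear congruential generator."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state


def period(seed, a, c, m):
    """Count how many steps the generator takes before it starts repeating."""
    seen = {}
    for step, value in enumerate(lcg(seed, a, c, m)):
        if value in seen:
            return step - seen[value]
        seen[value] = step


gen = lcg(seed=1, a=5, c=3, m=16)
first_values = [next(gen) for _ in range(4)]   # 8, 11, 10, 5
cycle_length = period(seed=1, a=5, c=3, m=16)  # 16, the largest possible for m=16
```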
I also like your second idea of running a good RNG around a few pre-selected points to create clusters. You could either target specific areas for the clusters with this method, or generate those randomly as well.

How can I choose an image with higher contrast in PHP?

For a thumbnail-engine I would like to develop an algorithm that takes x random thumbnails (crop, no resize) from an image, analyzes them for contrast and chooses the one with the highest contrast. I'm working with PHP and Imagick but I would be glad for some general tips about how to compute contrast of imagery.
It seems that many things are easier than computing contrast, for example counting colors, computing luminosity, etc.
What are your experiences with the analysis of picture material?
I'd do it that way (pseudocode):
L[0..255] = 0                        // histogram of luminance values
for each pixel:
    luminance = avg(R, G, B)
    L[luminance] = L[luminance] + 1
for i = 0 to 255:
    if L[i] < C: L[i] = 0            // C = threshold of your choice
first = index of the first non-zero value of L[]
last = index of the last non-zero value of L[]
contrast = last - first
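The pseudocode translates fairly directly into runnable code. Here is a minimal Python sketch (not PHP or Imagick), assuming each crop is already available as a list of (R, G, B) tuples with 0-255 integer channels; the function name is hypothetical.

```python
def histogram_span_contrast(pixels, threshold):
    """Contrast = distance between the darkest and brightest well-populated luminance bins."""
    hist = [0] * 256
    for r, g, b in pixels:
        hist[(r + g + b) // 3] += 1            # integer average as luminance
    # Ignore empty bins and bins with fewer than `threshold` pixels (treated as noise).
    kept = [i for i, count in enumerate(hist) if count > 0 and count >= threshold]
    if not kept:
        return 0
    return kept[-1] - kept[0]


crop = [(0, 0, 0)] * 5 + [(255, 255, 255)] * 5 + [(100, 100, 100)]
contrast = histogram_span_contrast(crop, threshold=2)
```

The threshold keeps a handful of stray dark or bright pixels from making a flat image look like a high-contrast one.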
In looking for the image "with the highest contrast," you will need to be very careful in how you define contrast for the image. In the simplest way, contrast is the difference between the lowest intensity and the highest intensity in the image. That is not going to be very useful in your case.
I suggest you use a histogram approach to describe the contrast of a given image and then compare the properties of the histograms to determine the image with the highest contrast as you define it. You could use a variety of well known containers to represent the histogram in code, or construct a class to meet your specific needs. (I am not implying that you need to create a histogram in the form of a chart – just a statistical representation of the intensity values.) You could use the variance of each histogram directly as a measure of contrast, or use the standard deviation if that is easier to work with.
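As a minimal sketch of the standard-deviation variant, in Python rather than PHP and assuming each candidate crop is a list of (R, G, B) tuples (all names here are hypothetical):

```python
import statistics


def luminance(pixel):
    """Simple luminance estimate: the mean of the three channels."""
    r, g, b = pixel
    return (r + g + b) / 3.0


def stddev_contrast(pixels):
    """Population standard deviation of luminance, used as a contrast score."""
    return statistics.pstdev([luminance(p) for p in pixels])


def pick_highest_contrast(candidates):
    """Return the index of the candidate crop with the largest contrast score."""
    return max(range(len(candidates)), key=lambda i: stddev_contrast(candidates[i]))


flat_crop = [(120, 120, 120)] * 4
mixed_crop = [(0, 0, 0)] * 2 + [(255, 255, 255)] * 2
best = pick_highest_contrast([flat_crop, mixed_crop])
```

A score of 0 means every pixel has the same luminance; the half-black, half-white crop reaches 127.5, the maximum for 8-bit values.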
The key really lies in how you define the contrast of the image. In general, I would define a high contrast image as one with values present for all, or nearly all, the possible values. And I would further add that in this definition of a high contrast image, the intensity values of the image will tend to be distributed across the range of possible values in a uniform way.
Using this approach, a low contrast image would tend to have relatively few discrete intensity values and they would tend to be closely grouped together rather than uniformly distributed. (As a general rule, they will also tend to be grouped toward the center of the range.)
