Correlation of an image - image

I tried to using this post How to find Correlation of an image
to find Correlation for my image but i have questions.when i use this: cov(x,y) / ( sqrt(D(x)* D(y) )) my result is [1.0025 -0.0358 ;-0.0358 0.9975](for 5000 pixel). -0.0358 is Correlation for my image?what is 0.9975?i run my code tow times. the second result is [0.9830 0.0243;0.0243 1.0173].-0.0358 or 0.0243 which one is Correlation ? I know because of using randperm In each run different number is create but Which one is a best?negative number or positive number?

Functions like corrcoef return values like [1.0025 -0.0358; -0.0358 0.9975] where the coefficient is -0.0358, and the other values relate to the certainty of the coefficient.
Here's what you should expect from the covariance matrix cov:
http://www.mathworks.com/help/matlab/ref/cov.html
Here's what you should expect from the correlation coefficients corrcoef:
http://www.mathworks.com/help/matlab/ref/corrcoef.html
What I suspect is going on is there's some variable whose value is affected by running. A useful debugging practice (when going through your code with a fine-toothed comb) is having your code output to the screen some of the values as they are being generated. In Matlab, this is as simple as removing a few ; at the end of a few lines.
Hope this helps!

Related

store if loop results in a matrix (Octave)

I have a matrix with many random values ranging from -600 to +600. These values are intermixed with each other inside of the matrix.
What I want to do is separate the negative values and positive values into their own matrices. Maybe even separating the values that are greater than 400 into its own matrix as well.
I'm fairly new with coding, so the first thing that popped in my head was an if statement. I am using Octave. I don't know if theres a better way to go about it, but I would appreciate all help I can get. Thanks
The best way to do this is to use logical indexing. You can create a logical matrix the size of your data based on the criteria you want.
So for example to get only the values that are negative:
negatives = your_data(your_data < 0);
And positive values
positives = your_data(your_data >= 0);
You can alter the expression used to generate the logical matrix to suit your needs.
Also if you're simply using Octave, you can stop tagging C++ since they are different languages.

Negative value of Information Gain

I'm implementing C4.5 and in my calculations im getting (for some examples) negative values for information gain. I read Why am I getting a negative information gain, but my issiue seeams to be diffrent. I putt my calculation to excel and i get the same results as below:
My calculations
What am i doing wrong?
I tried calculate it again, and also i get negative value as is on image below:
Newest calculations with data set
80 is split value, so i get 11 <=80 and 3objects > 80
Are you multiplying your result for entropy by -1?
$$ H(X) = -\sum_{i=1}^n {\mathrm{P}(x_i) \log_b \mathrm{P}(x_i)} $$
Ugh... having trouble with mathjax, go here for definition

Random number generation requires too many iterations

I am running simulations in Anylogic and I'm trying to calibrate the following distribution:
Jump = normal(coef1, coef2, -1, 1);
However, I keep getting the following message as soon as I start the calibration (experimentation):
Random number generation requires too many iterations (> 10000)
I tried to replace -1 and 1 by other values and keep getting the same thing.
I also tried to change the bounds of coef1 and coef2 and put things like [0,1], but I still get the same error.
I don't get it.
Any ideas?
The four parameter normal method is not deprecated and is not a "calibration where coef1 and coef2 are the coefficicents to be solved for". Where did you get that understanding from? Or are you saying that you're using your AnyLogic Experiment (possibly a multi-run or optimisation experiment) to 'calibrate' that distribution, in which case you need to explain what you mean by 'calibrate' here---what is your desired outcome?
If you look in the API reference (AnyLogic classes and functions --> API Reference --> com.xj.anylogic.engine --> Utilities), you'll see that it's a method to use a truncated normal distribution.
public double normal(double min,
double max,
double shift,
double stretch)
The first 2 parameters are the min and max (where it will sample repeatedly and ignore values outside the [min,max] range); the second two are effectively the mean and standard deviation. So you will get the error you mentioned if min or max means it will sample too many times to get a value in range.
API reference details below:
Generates a sample of truncated Normal distribution. Distribution
normal(1, 0) is stretched by stretch coefficient, then shifted to the
right by shift, after that it is truncated to fit in [min, max]
interval. Truncation is performed by discarding every sample outside
this interval and taking subsequent try. For more details see
normal(double, double)
Parameters:
min - the minimum value that this function will return. The distribution is truncated to return values above this. If the sample
(stretched and shifted) is below this value it will be discarded and
another sample will be drawn. Use -infinity for "No limit".
max - the maximum value that this function will return. The distribution is truncated to return values below this. If the sample
(stretched and shifted) is bigger than this value it will be discarded
and another sample will be drawn. Use +infinity for "No limit".
shift - the shift parameter that indicates how much the (stretched) distribution will shifted to the right = mean value
stretch - the stretch parameter that indicates how much the distribution will be stretched = standard deviation Returns:
the generated sample
According to AnyLogic's documentation, there is no version of normal which takes 4 arguments. Also note that if you specify a mean and standard deviation, the order is unusual (to probabilists/statisticians) by putting the standard deviation before the mean.

How to find the maximum value of a function in Sympy?

These days I am trying to redo shock spectrum of single degree of freedom system using Sympy. The problem can reduce to find maximum value of a function. Following are two cases I cannot figure out how to do.
The first one is
tau,t,t_r,omega,p0=symbols('tau,t,t_r,omega,p0',positive=True)
h=expand(sin(omega*(t-tau)))
f=simplify(integrate(p0*tau/t_r*h,(tau,0,t_r))+integrate(p0*h,(tau,t_r,t)))
The final goal is to obtain maximum absolute value of f (The variable is t). The direct way is
df=diff(f,t)
sln=solve(simplify(df),t)
simplify(f.subs(t,sln[1]))
Here is the result, I tried many ways, but I can not simplify any further.
Therefore, I tried another way. Because I need the maximum absolute value and the location where abs(f) is maximum happens at the same location of square of f, we can calculate square of f first.
df=expand_trig(diff(expand(f)**2,t))
sln=solve(df,t)
simplify(f.subs(t,sln[2]))
It seems the answer is almost the same, just in another form.
The expected answer is a sinc function plus a constant as following:
Therefore, the question is how to get the final presentation.
The second one may be a little harder. The question can be reduced to find the maximum value of f=sin(pi*t/t_r)-T/2/t_r*sin(2*pi/T*t), in which t_r and T are two parameters. The maximum located at different peak when the ratio of t_r and T changes. And I do not find a way to solve it in Sympy. Any suggestion? The answer can be represented in following figure.
The problem is the log(exp(I*omega*t_r/2)) term. SymPy is not reducing this to I*omega*t_r/2. SymPy doesn't simplify this because in general, log(exp(x)) != x, but rather log(exp(x)) = x + 2*pi*I*n for some integer n. But in this case, if you replace log(exp(I*omega*t_r/2)) with omega*t_r/2 or omega*t_r/2 + 2*pi*I*n, it will be the same, because it will just add a 2*pi*I*n inside the sin.
I couldn't figure out any functions that force this simplification, but the easiest way is to just do a substitution:
In [18]: print(simplify(f.subs(t,sln[1]).subs(log(exp(I*omega*t_r/2)), I*omega*t_r/2)))
p0*(omega*t_r - 2*sin(omega*t_r/2))/(omega**2*t_r)
That looks like the answer you are looking for, except for the absolute value (I'm not sure where they should come from).

How to implement a part of histogram equalization in matlab without using for loops and influencing speed and performance

Suppose that I have these Three variables in matlab Variables
I want to extract diverse values in NewGrayLevels and sum rows of OldHistogram that are in the same rows as one diverse value is.
For example you see in NewGrayLevels that the six first rows are equal to zero. It means that 0 in the NewGrayLevels has taken its value from (0 1 2 3 4 5) of OldGrayLevels. So the corresponding rows in OldHistogram should be summed.
So 0+2+12+38+113+163=328 would be the frequency of the gray level 0 in the equalized histogram and so on.
Those who are familiar with image processing know that it's part of the histogram equalization algorithm.
Note that I don't want to use built-in function "histeq" available in image processing toolbox and I want to implement it myself.
I know how to write the algorithm with for loops. I'm seeking if there is a faster way without using for loops.
The code using for loops:
for k=0:255
Condition = NewGrayLevels==k;
ConditionMultiplied = Condition.*OldHistogram;
NewHistogram(k+1,1) = sum(ConditionMultiplied);
end
I'm afraid if this code gets slow for high resolution big images.Because the variables that I have uploaded are for a small image downloaded from the internet but my code may be used for sattellite images.
I know you say you don't want to use histeq, but it might be worth your time to look at the MATLAB source file to see how the developers wrote it and copy the parts of their code that you would like to implement. Just do edit('histeq') or edit('histeq.m'), I forget which.
Usually the MATLAB code is vectorized where possible and runs pretty quick. This could save you from having to reinvent the entire wheel, just the parts you want to change.
I can't think a way to implement this without a for loop somewhere, but one optimisation you could make would be using indexing instead of multiplication:
for k=0:255
Condition = NewGrayLevels==k; % These act as logical indices to OldHistogram
NewHistogram(k+1,1) = sum(OldHistogram(Condition)); % Removes a vector multiplication, some additions, and an index-to-double conversion
end
Edit:
On rereading your initial post, I think that the way to do this without a for loop is to use accumarray (I find this a difficult function to understand, so read the documentation and search online and on here for examples to do so):
NewHistogram = accumarray(1+NewGrayLevels,OldHistogram);
This should work so long as your maximum value in NewGrayLevels (+1 because you are starting at zero) is equal to the length of OldHistogram.
Well I understood that there's no need to write the code that #Hugh Nolan suggested. See the explanation here:
%The green lines are because after writing the code, I understood that
%there's no need to calculate the equalized histogram in
%"HistogramEqualization" function and after gaining the equalized image
%matrix you can pass it to the "ExtractHistogram" function
% (which there's no loops in it) to acquire the
%equalized histogram.
%But I didn't delete those lines of code because I had tried a lot to
%understand the algorithm and write them.
For more information and studying the code, please see my next question.

Resources