The plots of co-variance functions should start from 0-shift - random

The following task was given to me by my teacher:
Generate a sequence of N = 1000 independent observations of a random variable with distribution: (c) Exponential with parameter λ = 1, by the inversion method.
Present the obtained sequences graphically (except for those generated in point e), e.g. (a):
i. plot in the coordinates (obs. no., value of the obs.)
ii. plot in the coordinates (obs. no. n, obs. no. n + i) for i = 1, 2, 3
iii. plot the so-called covariance function for some values, i.e. and averages:
I have written the following code,
(*****************************************************************)
(*Task 01(c) and 02(a)*)
(*****************************************************************)
n = 1000;
taskC = Table[-Log[RandomReal[]], {n}];
ListPlot[taskC, AxesLabel->{"No. obs", "value of the obs"}]
i = 1;
ListPlot[Table[
{taskC[[k]], taskC[[k+i]]},
{k, 1, n-i,1}],
AxesLabel->{"obs.no.n", "obs.no.n+1"}]
i++;
ListPlot[Table[
{taskC[[k]], taskC[[k+i]]},
{k, 1, n-i,1}],
AxesLabel-> {"obs.no.n", "obs.no.n+2"}]
i++;
ListPlot[Table[
{taskC[[k]], taskC[[k+i]]},
{k,1,n-i,1}],
AxesLabel->{"obs.no.n", "obs.no.n+3"}]
avg = (1/n)*Sum[taskC[[i]], {i,n}];
ListPlot[Table[1/(n-tau) * Sum[(taskC[[i]]-avg)*(taskC[[i+tau]] - avg), n], {tau, 1,100}],
Joined->True,
AxesLabel->"Covariance Function"]
He has commented,
The plots of co-variance functions should start from 0-shift. Note
that for larger than 0 shifts you are estimating co-variance between
independent observations which is zero, while for 0 shift you are
estimating variance of observation which is large. Thus the contrast
between these two cases is a clear indication that the observations
are uncorrelated.
What did I do wrong?
How can I correct my code?

Zero-shift means calculating the covariance for tau = 0, which is simply the variance.
Labeled[ListPlot[Table[{tau,
1/(n - tau)*Sum[(taskC[[i]] - avg)*(taskC[[i + tau]] - avg), {i, n - tau}]},
{tau, 0, 5}], Filling -> Axis, FillingStyle -> Thick, PlotRange -> All,
Frame -> True, PlotRangePadding -> 0.2, AspectRatio -> 1],
{"Covariance Function K(n)", "n"}, {{Top, Left}, Bottom}]
Variance[taskC]
0.93484
Covariance[taskC, taskC]
0.93484
(* n = 1 *)
Covariance[Most[taskC], Rest[taskC]]
0.00926913
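As a minimal sketch of the teacher's point, the whole covariance function can be plotted with the shift starting at 0. The helper name acov is hypothetical, and the data are regenerated here so the block is self-contained. Note that the sum runs over n - tau terms, which the Sum in the question did not specify.

```mathematica
n = 1000;
taskC = -Log[RandomReal[1, n]];  (* inversion method for Exponential(1) *)
avg = Mean[taskC];

(* hypothetical helper: sample covariance between the series and itself shifted by tau *)
acov[tau_] :=
  Total[(taskC[[;; n - tau]] - avg) (taskC[[1 + tau ;;]] - avg)]/(n - tau);

(* tau = 0 gives the (biased) variance; larger shifts should stay near zero *)
ListPlot[Table[{tau, acov[tau]}, {tau, 0, 100}],
 Joined -> True, PlotRange -> All,
 AxesLabel -> {"shift", "covariance"}]
```

The spike at shift 0 followed by values near zero is the contrast the teacher describes.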

Related

Trouble with sparse matrices in Mathematica

The following code gives me the first k eigenvalues of a certain big matrix. Because of the symmetries of the matrix, the eigenvalues are in pairs, one positive and the other negative, with the same absolute value. This is indeed the case if I run the code with the exact matrices, without using the sparse version. However when I make them sparse, the resulting eigenvalues appear to lose the sign information, as now the pairs can be both negative, or both positive, depending on the number I put on "nspins" (which controls the size of the matrix). The variable "sparse" controls whether I use sparse matrices or not.
This issue gives me considerable trouble. Can anybody tell me why the sparse version of the computation gives wrong signs, and how to fix it?
sparse = 1; (*Controls whether sparse matrices are used: 0 means not \
sparse, 1 means sparse*)
(*Base matrices of my big matrix*)
ox = N[{{0, 1}, {1, 0}}];
oz = N[{{1, 0}, {0, -1}}];
id = N[{{1, 0}, {0, 1}}];
(*Conversion to sparse form, if desired*)
If[sparse == 1,
ox = SparseArray[ox];
oz = SparseArray[oz];
id = SparseArray[id];
]
(*Dimension of the big matrix, must be even*)
nspins = 8;
(*Number of eigenvalues computed*)
neigenv = 4;
(*Algorithm to create big matrices*)
Do[
Do[
If[j == i, mata = ox; matc = oz;, mata = id; matc = id;];
If[j == 1,
o[1, i] = mata;
o[3, i] = matc;
,
o[1, i] = KroneckerProduct[o[1, i], mata];
o[3, i] = KroneckerProduct[o[3, i], matc];
];
, {j, 1, nspins}];
, {i, 1, nspins}];
(*Sum of big matrices*)
ham = Sum[o[1, i].o[1, i + 1], {i, 1, nspins - 1}] +
o[1, nspins].o[1, 1] + 0.5*Sum[o[3, i], {i, 1, nspins}];
(*Print the desired eigenvalues*)
Do[Print [Eigenvalues[ham, k][[k]]], {k, 1, neigenv}];

Solving systems of second order differential equations

I'm working on a script in Mathematica that will simulate a string held at either end and plucked, by solving the wave equation via numerical methods. (http://en.wikipedia.org/wiki/Wave_equation#Investigation_by_numerical_methods)
n = 5; (*The number of discrete elements to be used*)
L = 1.0; (*The length of the string that is vibrating*)
a = 1.0/3.0; (*The distance from the left side that the string is \
plucked at*)
T = 1; (*The tension in the string*)
\[Rho] = 1; (*The length density of the string*)
y0 = 0.1; (*The vertical distance of the string pluck*)
\[CapitalDelta]x = L/n; (*The length of each discrete element*)
m = (\[Rho]*L)/n;(*The mass of each individual node*)
c = Sqrt[T/\[Rho]];(*The speed at which waves in the string propagate*)
I set all my variables
Y[t] = Array[f[t], {n - 1, 1}];
MatrixForm(*Creates a vector size n-1 by 1 of functions \
representing each node*)
I define my Vector of nodal position functions
K = MatrixForm[
SparseArray[{Band[{1, 1}] -> -2, Band[{2, 1}] -> 1,
Band[{1, 2}] -> 1}, {n - 1,
n - 1}]](*Creates a matrix size n by n governing the coupling \
between each node*)
I create the stiffness matrix relating all the nodal functions to one another
Y0 = MatrixForm[
Table[Piecewise[{{(((i*L)/n)*y0)/a,
0 < ((i*L)/n) < a}, {(-((i*L)/n)*y0)/(L - a) + (y0*L)/(L - a),
a < ((i*L)/n) < L}}], {i, 1, n - 1}]]
I define the initial positions of each node using a piecewise function
NDSolve[{Y''[t] == (c/\[CapitalDelta]x)^2 Y[t].K, Y[0] == Y0,
Y'[0] == 0},
Y, {t, 0, 10}];(*Numerically solves the system of second order DE's*)
Finally, this should solve for the values of the individual nodes, but it returns an error:
"NDSolve::ndinnt : Initial condition [Y0 table] is not a number or a rectangular array"
So, it would seem that I don't have a firm grasp on how matrices work in Mathematica. I would greatly appreciate it if anyone could help me get this last line of code to run properly.
Thank you,
Brad
I don't think you should use MatrixForm when defining the matrices. MatrixForm only formats a list of lists as a matrix for display; the wrapped result is no longer a plain array that functions such as NDSolve can compute with. Try removing it and see if it works.
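A minimal sketch of the same setup with every MatrixForm removed might look like this. It is one possible formulation, not necessarily what the asker ended up with: the ASCII names rho and dx stand in for the Greek symbols, and the nodes are written as separate indexed functions f[i] so that NDSolve receives an explicit list of scalar equations.

```mathematica
n = 5; L = 1.0; a = 1.0/3.0; T = 1; rho = 1; y0 = 0.1;
dx = L/n;           (* length of each element *)
c = Sqrt[T/rho];    (* wave speed *)

(* coupling matrix as a plain nested list, no MatrixForm wrapper *)
K = Normal[SparseArray[{Band[{1, 1}] -> -2, Band[{2, 1}] -> 1,
     Band[{1, 2}] -> 1}, {n - 1, n - 1}]];

(* triangular initial shape sampled at the interior nodes *)
Y0 = Table[If[i dx <= a, y0 (i dx)/a, y0 (L - i dx)/(L - a)], {i, 1, n - 1}];

vars = Table[f[i][t], {i, n - 1}];
eqs = Thread[D[vars, {t, 2}] == (c/dx)^2 K.vars];
ics = Join[Thread[(vars /. t -> 0) == Y0],
   Thread[(D[vars, t] /. t -> 0) == 0]];

sol = First@NDSolve[Join[eqs, ics], Table[f[i], {i, n - 1}], {t, 0, 10}];

(* displacement of every node over time *)
Plot[Evaluate[vars /. sol], {t, 0, 10}]
```

MatrixForm can still be applied at the very end, for example MatrixForm[K], purely to display a result.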

difference in mathematica between Inverse[_] and (_)^(-1) for WishartDistribution

Does anyone know why the following random distributions of matrices generate different plots? (This is code to generate a plot of the PDFs for first cells from a set of 10x10 matrices sampled using an inverse Wishart distribution; amazingly, the plots are different depending on the way one performs the matrix inverse - and it seems the right plots are obtained by Inverse[_], why?)
base code:
<< MultivariateStatistics`;
Module[{dist, p, k, data, samples, scale, graphics, distribution},
p = 10;
k = 13;
samples = 500;
dist = WishartDistribution[IdentityMatrix[p], k];
(* a samples x p x p array *)
data = Inverse[#] & /@ RandomVariate[dist, samples];
(* distribution graphics *)
distribution[i_, j_] := Module[{fiber, f, mean, rangeAll, colorHue},
fiber = data[[All, i, j]];
dist = SmoothKernelDistribution[fiber];
f = PDF[dist];
Plot[f[z], {z, -2, 2},
PlotLabel -> ("Mean=" <> ToString[Mean[fiber]]),
PlotRange -> All]
];
Grid@Table[distribution[i, j], {i, 1, 3}, {j, 1, 5}]
]
code variant: above, change line
data = Inverse[#] & /@ RandomVariate[dist, samples];
by this
data = #^(-1) & /@ RandomVariate[dist, samples];
and you will see the plotted distributions are different.
Inverse computes a matrix inverse, i.e. if a is a square matrix, then Inverse[a].a will be the identity matrix.
a^(-1) is the same as 1/a, i.e. it gives you the reciprocal of each matrix element. The ^ operator gives powers element-wise. If you want a matrix power, use MatrixPower.
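A small worked example makes the difference concrete. The 2x2 matrix a below is made up for illustration.

```mathematica
a = {{2, 1}, {1, 1}};

Inverse[a]          (* {{1, -1}, {-1, 2}}  : true matrix inverse *)
a^(-1)              (* {{1/2, 1}, {1, 1}}  : reciprocal of each element *)
MatrixPower[a, -1]  (* {{1, -1}, {-1, 2}}  : same as Inverse[a] *)

Inverse[a].a == IdentityMatrix[2]  (* True *)
```

Only Inverse (or MatrixPower with exponent -1) turns a Wishart sample into an inverse-Wishart sample, which is why the two variants of the code produce different plots.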

Mathematica fast 2D binning algorithm

I am having some trouble developing a suitably fast binning algorithm in Mathematica. I have a large (~100k elements) data set of the form
T={{x1,y1,z1},{x2,y2,z2},....}
and I want to bin it into a 2D array of around 100x100 bins, with the bin value being given by the sum of the Z values that fall into each bin.
Currently I am iterating through each element of the table, using Select to pick out which bin it is supposed to be in based on lists of bin boundaries, and adding the z value to a list of values occupying that bin. At the end I map Total onto the list of bins, summing their contents (I do this because I sometimes want to do other things, like maximize).
I have tried using Gather and other such functions to do this but the above method was ridiculously faster, though perhaps I am using Gather poorly. Anyway It still takes a few minutes to do the sorting by my method and I feel like Mathematica can do better. Does anyone have a nice efficient algorithm handy?
Here is a method based on Szabolcs's post that is about an order of magnitude faster.
data = RandomReal[5, {500000, 3}];
(*500k values*)
zvalues = data[[All, 3]];
epsilon = 1*^-10;(*prevent 101 index*)
(*rescale and round (x,y) coordinates to index pairs in the 1..100 range*)
indexes = 1 + Floor[(1 - epsilon) 100 Rescale[data[[All, {1, 2}]]]];
res2 = Module[{gb = GatherBy[Transpose[{indexes, zvalues}], First]},
SparseArray[
gb[[All, 1, 1]] ->
Total[gb[[All, All, 2]], {2}]]]; // AbsoluteTiming
Gives about {2.012217, Null}
AbsoluteTiming[
System`SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> 1}];
res3 = SparseArray[indexes -> zvalues];
System`SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> 0}];
]
Gives about {0.195228, Null}
res3 == res2
True
"TreatRepeatedEntries" -> 1 adds duplicate positions up.
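A tiny illustration of that option, using made-up positions and values, could look like the sketch below. Position {1, 1} appears twice, so its two values are summed while the option is set to 1.

```mathematica
SetSystemOptions["SparseArrayOptions" -> {"TreatRepeatedEntries" -> 1}];
summed = Normal[SparseArray[{{1, 1}, {1, 1}, {2, 2}} -> {3, 4, 5}]];
(* restore the default behaviour afterwards *)
SetSystemOptions["SparseArrayOptions" -> {"TreatRepeatedEntries" -> 0}];

summed  (* {{7, 0}, {0, 5}} *)
```

Because this is a global system option, it is worth resetting it right after use, as the timing code above does.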
I intend to do a rewrite of the code below because of Szabolcs' readability concerns. Until then, know that if your bins are regular, and you can use Round, Floor, or Ceiling (with a second argument) in place of Nearest, the code below will be much faster. On my system, it tests faster than the GatherBy solution also posted.
Assuming I understand your requirements, I propose:
data = RandomReal[100, {75, 3}];
bins = {0, 20, 40, 60, 80, 100};
Reap[
Sow[{#3, #2}, bins ~Nearest~ #] & @@@ data,
bins,
Reap[Sow[#, bins ~Nearest~ #2] & @@@ #2, bins, Tr@#2 &][[2]] &
][[2]] ~Flatten~ 1 ~Total~ {3} // MatrixForm
Refactored:
f[bins_] := Reap[Sow[{##2}, bins ~Nearest~ #]& @@@ #, bins, #2][[2]] &
bin2D[data_, X_, Y_] := f[X][data, f[Y][#2, #2~Total~2 &] &] ~Flatten~ 1 ~Total~ {3}
Use:
bin2D[data, xbins, ybins]
Here's my approach:
data = RandomReal[5, {500000, 3}]; (* 500k values *)
zvalues = data[[All, 3]];
epsilon = 1*^-10; (* prevent 101 index *)
(* rescale and round (x,y) coordinates to index pairs in the 1..100 range *)
indexes = 1 + Floor[(1 - epsilon) 100 Rescale[data[[All, {1, 2}]]]];
(* approach 1: create bin-matrix first, then fill up elements by adding zvalues *)
res1 = Module[
{result = ConstantArray[0, {100, 100}]},
Do[
AddTo[result[[##]], zvalues[[i]]] & @@ indexes[[i]],
{i, Length[indexes]}
];
result
]; // Timing
(* approach 2: gather zvalues by indexes, add them up, convert them to a matrix *)
res2 = Module[{gb = GatherBy[Transpose[{indexes, zvalues}], First]},
SparseArray[gb[[All, 1, 1]] -> (Total /@ gb[[All, All, 2]])]
]; // Timing
res1 == res2
These two approaches (res1 & res2) can handle 100k and 200k elements per second, respectively, on this machine. Is this sufficiently fast, or do you need to run this whole program in a loop?
Here's my approach using the function SelectEquivalents defined in What is in your Mathematica tool bag? which is perfect for a problem like this one.
data = RandomReal[100, {75, 3}];
bins = Range[0, 100, 20];
binMiddles = (Most@bins + Rest@bins)/2;
nearest = Nearest[binMiddles];
SelectEquivalents[
data
,
TagElement -> ({First@nearest[#[[1]]], First@nearest[#[[2]]]} &)
,
TransformElement -> (#[[3]] &)
,
TransformResults -> (Total[#2] &)
,
TagPattern -> Flatten[Outer[List, binMiddles, binMiddles], 1]
,
FinalFunction -> (Partition[Flatten[# /. {} -> 0], Length[binMiddles]] &)
]
If you want to group according to more than two dimensions, you could use the following function in FinalFunction to give the resulting list the desired dimensions (I don't remember where I found it).
InverseFlatten[l_,dimensions_]:= Fold[Partition[#, #2] &, l, Most[Reverse[dimensions]]];
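As a quick usage sketch, reshaping a flat list of 24 made-up elements into a 2x3x4 array with this helper gives:

```mathematica
InverseFlatten[l_, dimensions_] :=
  Fold[Partition[#, #2] &, l, Most[Reverse[dimensions]]];

arr = InverseFlatten[Range[24], {2, 3, 4}];
Dimensions[arr]  (* {2, 3, 4} *)
Flatten[arr] == Range[24]  (* True *)
```

The fold partitions by the innermost dimension first (4, then 3), which is why the dimension list is reversed and its last entry dropped.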

How to find optimal overlap of noisy bivalent matrices

I'm dealing with an image processing problem that I've simplified as follows. I have three 10x10 matrices, each with the values 1 or -1 in each cell. Each matrix has an irregular object located somewhere, and there is some noise in the matrix. I'd like to figure out how to find the optimal alignment of the matrices that would let me line up the objects so I can get their average.
With the 1/-1 coding, I know that the product of two matrices (using element-wise multiplication, not matrix multiplication) will yield 1 if there is a match between two multiplied cells and -1 if there is a mismatch, thus the sum of the products yields a measure of overlap. With this, I know I can try out all possible alignments of two matrices to find that which yields the optimal overlap, but I'm not sure how to do this with 3 matrices (or more - I really have 20+ in my actual data set).
To help clarify the problem, here is some code, written in R, that sets up the sort of matrices I'm dealing with:
#set up the 3 matrices
m1 = c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
m1 = matrix(m1,10)
m2 = c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
m2 = matrix(m2,10)
m3 = c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
m3 = matrix(m3,10)
#show the matrices
image(m1)
image(m2)
image(m3)
#notice there's a "+" shaped object in each
#create noise
set.seed(1)
n1 = sample(c(1,-1),100,replace=T,prob=c(.95,.05))
n1 = matrix(n1,10)
n2 = sample(c(1,-1),100,replace=T,prob=c(.95,.05))
n2 = matrix(n2,10)
n3 = sample(c(1,-1),100,replace=T,prob=c(.95,.05))
n3 = matrix(n3,10)
#add noise to the matrices
mn1 = m1*n1
mn2 = m2*n2
mn3 = m3*n3
#show the noisy matrices
image(mn1)
image(mn2)
image(mn3)
Here is a program in Mathematica that does what you want (I think).
I may explain it in more detail, if you need.
(*define temp tables*)
r = m = Table[{}, {100}];
(*define noise function*)
noise := Partition[RandomVariate[BinomialDistribution[1, .05], 100],
10];
For[i = 1, i <= 100, i++,
(*generate 100 10x10 matrices with the random cross and noise added*)
w = RandomInteger[6]; h = RandomInteger[6];
m[[i]] = (ArrayPad[CrossMatrix[4, 4], {{w, 6 - w}, {h, 6 - h}}] +
noise) /. 2 -> 1;
(*Select connected components in each matrix and keep only the biggest*)
id = Last@
Commonest[
Flatten@(mf =
MorphologicalComponents[m[[i]], CornerNeighbors -> False]), 2];
d = mf /. {id -> x, x_Integer -> 0} /. {x -> 1};
{minX, maxX, minY, maxY} =
{Min@Thread[g[#]] /. g -> First,
Max@Thread[g[#]] /. g -> First,
Min@Thread[g[#]] /. g -> Last,
Max@Thread[g[#]] /. g -> Last} &@Position[d, 1];
(*Trim the image of the biggest component *)
r[[i]] = d[[minX ;; maxX, minY ;; maxY]];
]
(*As the noise is low, the more repeated component is the image*)
MatrixPlot @@ Commonest@r
Result:
