I would like to understand what is going on with my program in Julia. The issue is the following: I have a symmetric non-negative matrix, which I diagonalize using
egvals, egvecs = eig(H_mat)
By a theorem, my matrix should have a maximum eigenvalue that is associated with a non-negative eigenvector. H_mat has one further peculiarity: its first row and first column are filled with zeros.
Diagonalization yields a positive maximum eigenvalue E_max (in fact it is the last eigenvalue, because Julia sorts the eigenvalues in ascending order), but the eigenvector associated with E_max does not have all its entries zero or positive (i.e. it has negative entries):
egvecs[:,end] # Some or several components ii, egvecs[ii,end]<0
For instance, this is a matrix for which I don't get proper results:
[0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 2.0 0.0 0.0 1.414213562373095 0.0 0.0 0.0 0.0 0.0 1.414213562373095 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 3.0 0.0 0.0 1.7320508075688774 0.0 0.0 0.0 0.0 0.0 1.7320508075688774 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 1.414213562373095 0.0 0.0 2.0 0.0 1.414213562373095 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.7320508075688774 0.0 0.0 3.0 0.0 2.0 0.0 0.0 0.0 1.0 0.0 1.414213562373095 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 1.414213562373095 0.0 2.0 0.0 0.0 0.0 0.0 0.0 1.414213562373095 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 2.0 0.0 3.0 1.7320508075688774 0.0 0.0 0.0 0.0 1.414213562373095 1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.7320508075688774 3.0 0.0 0.0 0.0 0.0 0.0 1.7320508075688774 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 1.414213562373095 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 2.0 0.0 1.0 0.0 0.0 1.414213562373095 0.0 0.0 0.0
0.0 0.0 0.0 1.7320508075688774 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 3.0 0.0 1.414213562373095 0.0 0.0 2.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.414213562373095 0.0 0.0 0.0 1.0 0.0 2.0 0.0 0.0 1.414213562373095 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 1.414213562373095 0.0 1.414213562373095 0.0 0.0 0.0 1.414213562373095 0.0 3.0 1.414213562373095 0.0 1.414213562373095 1.414213562373095 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.7320508075688774 0.0 0.0 0.0 0.0 1.414213562373095 3.0 0.0 0.0 2.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.414213562373095 0.0 1.414213562373095 0.0 0.0 2.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.0 0.0 1.414213562373095 0.0 0.0 3.0 1.0 1.7320508075688774
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.414213562373095 2.0 0.0 1.0 3.0 1.7320508075688774
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.7320508075688774 1.7320508075688774 3.0]
(Should be a 20X20 matrix)
Eigenvectors are only determined up to a scalar factor, because the requirement is that they solve (A - lambda*I)v = 0, and if v1 solves the equation, then so does v2 = -v1. In the symmetric case, it is standard to normalize all the vectors to unit length, but that still leaves the sign undetermined. Therefore the theorem you are using must say something like: the eigenvector corresponding to the largest eigenvalue can be chosen such that it has non-negative elements. Indeed, I get that the eigenvector corresponding to the largest eigenvalue of your matrix has only non-positive elements, so multiplying it by -1 gives a non-negative eigenvector:
julia> eig(A) |> t -> all(t[2][:,indmax(t[1])] .<= 0)
true
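To make the sign convention concrete, here is a minimal sketch in Python (not the Julia code from the answer). The helper names normalize_sign and matvec and the 2x2 example matrix are assumptions made for illustration only. The sketch flips a vector whose largest-magnitude entry is negative and checks that the flipped vector still satisfies A*w = lambda*w.

```python
def normalize_sign(v):
    """Return v or -v, whichever has its largest-magnitude entry positive."""
    pivot = max(v, key=abs)
    return [-x for x in v] if pivot < 0 else list(v)


def matvec(A, v):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Small symmetric non-negative example (hypothetical, not the 20x20 matrix).
A = [[2.0, 1.0], [1.0, 2.0]]
lam = 3.0                                        # largest eigenvalue of A
v = [-0.7071067811865476, -0.7071067811865476]   # unit eigenvector with "wrong" sign
w = normalize_sign(v)                            # same eigenvector, entries >= 0
```

Both v and w are valid unit eigenvectors for the same eigenvalue; only the convention differs, which is why a solver is free to return either one.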
How do you convert a sparse matrix to a dense matrix in Julia? According to this I should be able to use full or Matrix, however full is evidently not standard in the SparseArrays module, and when I try to use Matrix:
using SparseArrays

I = []
J = []
A = []
for i in 1:3
    push!(I, i)
    push!(J, i^2)
    push!(A, sqrt(i))
end
sarr = sparse(I, J, A, 10, 10)
arr = Matrix(sarr)
I get this error:
Exception has occurred: MethodError
MethodError: no method matching zero(::Type{Any})
It is enough to do collect(sarr) or Matrix(sarr).
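Note, however, that your code uses untyped containers, which is not recommended and is what the error message points at: A = [] is a Vector{Any}, so the sparse matrix gets element type Any, and zero is not defined for Any. Indices in arrays are Ints and the values here are Float64s, so it should be:
using SparseArrays

I = Int[]
J = Int[]
A = Float64[]
for i in 1:3
    push!(I, i)
    push!(J, i^2)
    push!(A, sqrt(i))
end
sarr = sparse(I, J, A, 10, 10)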
Now you can do:
julia> collect(sarr)
10×10 Matrix{Float64}:
1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.41421 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.73205 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
I'm trying to run this code, but why am I getting these two rows of zeros in the middle of the result? Can someone please help me fix that?
using Distributed                     # load the library for parallel programming
addprocs(2)
@everywhere using LinearAlgebra       # load the LinearAlgebra library
@everywhere using DistributedArrays   # load DistributedArrays
@everywhere T = (zeros(n,n))
T[:,1] .= 10                          # boundary condition T_left = 10
T[:,end] .= 10                        # boundary condition T_right = 10
T = distribute(T; dist=(2,1))
@everywhere maxit = 100               # maximum number of iterations
@everywhere function Poissons_2D(T)
    for w in 1:maxit
        @sync @distributed for p in 1:nworkers()
            for i in 2:length(localindices(T)[1])-1
                for j in 2:length(localindices(T)[2])-1
                    localpart(T)[i,j] = (1/4 * (localpart(T)[i-1,j] + localpart(T)[i+1,j] + localpart(T)[i,j-1] + localpart(T)[i,j+1]))
                end
            end
        end
    end
    return T
end
Poissons_2D(T)
10×10 DArray{Float64,2,Array{Float64,2}}:
10.0 0.0 0.0 0.0 … 0.0 0.0 0.0 10.0
10.0 4.33779 2.00971 1.01077 1.01077 2.00971 4.33779 10.0
10.0 5.34146 2.69026 1.40017 1.40017 2.69026 5.34146 10.0
10.0 4.33779 2.00971 1.01077 1.01077 2.00971 4.33779 10.0
10.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0
10.0 0.0 0.0 0.0 … 0.0 0.0 0.0 10.0
10.0 4.33779 2.00971 1.01077 1.01077 2.00971 4.33779 10.0
10.0 5.34146 2.69026 1.40017 1.40017 2.69026 5.34146 10.0
10.0 4.33779 2.00971 1.01077 1.01077 2.00971 4.33779 10.0
10.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0
The first cleanup could look like this:
a = zeros(10,10)
a[:,[1,end]] .= 10
a = distribute(a; dist=(nworkers(),1))
function Poissons_2D(a::DArray, maxit::Int=100)
    for w in 1:maxit
        @sync @distributed for p in 1:nworkers()
            local_a = localpart(a)
            local_ind = localindices(a)
            for iix in 1:length(local_ind[1])
                i = local_ind[1][iix]               # global row index of local row iix
                (i==1 || i==size(a,1)) && continue  # skip the top and bottom boundary rows
                for j in local_ind[2][2:end-1]
                    local_a[iix,j] = (1/4 * (a[i-1,j] + a[i+1,j] + a[i,j-1] + a[i,j+1]))
                end
            end
        end
    end
    a
end
Some remarks:
Do not use @everywhere in front of T: you do not want to define it on all workers.
In Julia, T by convention denotes a type parameter, so use a, or some T-like LaTeX symbol, instead.
However, your function takes values from all adjacent cells to calculate new values.
I do not know how you plan to handle the situation when a value does not exist yet.
In particular, if each row requires values from the previous row and previous column, it is not possible to parallelize this computation at all (because you need to wait for the previous value to get the next one).
julia> Poissons_2D(a)
10×10 DArray{Float64,2,Array{Float64,2}}:
10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0
10.0 4.99998 3.05213 2.20861 1.87565 1.87565 2.20862 3.05214 4.99999 10.0
10.0 6.9478 4.99994 3.90669 3.41834 3.41834 3.9067 4.99995 6.94781 10.0
10.0 7.7913 6.09315 4.99989 4.47269 4.4727 4.99991 6.09317 7.79131 10.0
10.0 8.12425 6.58148 5.52707 4.99987 4.99988 5.52709 6.58151 8.12427 10.0
10.0 8.12425 6.58148 5.52707 4.99987 4.99988 5.52709 6.58151 8.12427 10.0
10.0 7.7913 6.09316 4.99991 4.47271 4.47271 4.99992 6.09317 7.79131 10.0
10.0 6.94781 4.99995 3.90671 3.41835 3.41836 3.90672 4.99996 6.94782 10.0
10.0 4.99999 3.05214 2.20862 1.87566 1.87566 2.20863 3.05215 4.99999 10.0
10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0
I think the problem is the ranges of the for loops over i and j. Your range runs from 2 to N-1, avoiding the extremes. That is right in the sense that each process is missing the information needed to calculate its first and last local rows, because that information is stored in a different process. However, you need to transfer this boundary information between processes. In MPI, for instance, you could send redundant information to avoid that, but in Distributed I am not sure how. I see the cause, but the solution is not easy. At least I hope to have helped a little.
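To illustrate the point about boundary information, here is a small sequential sketch in Python, assuming a Jacobi-style sweep in which every new value is computed from a copy of the old grid. That differs from the in-place update in the code above, and the function names jacobi_step, jacobi_step_blocks and make_grid are hypothetical. Each block of rows is updated together with one extra "halo" row copied from each neighbouring block, which stands in for the data a worker would have to receive from other processes.

```python
def jacobi_step(grid):
    """One Jacobi sweep: every interior cell is averaged from the OLD grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary cells are copied unchanged
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def jacobi_step_blocks(grid, n_blocks=2):
    """Same sweep, computed block by block on row slices with halo rows.

    Assumes len(grid) is divisible by n_blocks.
    """
    n_rows = len(grid)
    size = n_rows // n_blocks
    new = []
    for b in range(n_blocks):
        lo, hi = b * size, (b + 1) * size
        # Copy the block plus one neighbouring row on each side (the halo).
        h_lo, h_hi = max(lo - 1, 0), min(hi + 1, n_rows)
        local = [row[:] for row in grid[h_lo:h_hi]]
        updated = jacobi_step(local)
        # Keep only the rows this block owns; halo rows are discarded.
        new.extend(updated[lo - h_lo:hi - h_lo])
    return new


def make_grid(n=10, edge=10.0):
    """n x n grid of zeros with the left and right columns fixed at `edge`."""
    return [[edge if j in (0, n - 1) else 0.0 for j in range(n)] for _ in range(n)]


whole = make_grid()
blocks = make_grid()
for _ in range(100):
    whole = jacobi_step(whole)
    blocks = jacobi_step_blocks(blocks)
```

Because every block reads only old values, the blocks could in principle be processed in any order, and the rows at the seam between the two blocks are updated like any other interior row instead of staying at zero.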
I have a file containing the string "proto", which is repeated 384 times. Each "proto" should be replaced according to 384 different labels listed in another text file. For instance, if the content of the second text file is a, sp, ... (one label per line), then in my text file the first "proto" should be changed to "a", the second to "sp", and so on. How should I do that?
The original file consists of 384 repetitions of:
~o <VecSize> 39 <MFCC_0_D_A>
~h "proto"
<BeginHMM>
<NumStates> 5
<State> 2
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 3
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 4
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<TransP> 5
0.0 1.0 0.0 0.0 0.0
0.0 0.6 0.4 0.0 0.0
0.0 0.0 0.6 0.4 0.0
0.0 0.0 0.0 0.7 0.3
0.0 0.0 0.0 0.0 0.0
<EndHMM>
and the second file contains 384 labels, such as: a, ap, k12, sp, ... So as you see, I have just one form of "proto". I hope it is now clear.
Assuming the text file with the replacement labels has exactly 384 lines, you can use this awk command:
awk 'FNR==NR{a[++i]=$0; next} /proto/{sub(/proto/, a[++j])} 1' replacement.txt file.txt
This might work for you (GNU sed):
sed 's|.*|s/proto/&/|' replacementFile | sed -e ':a;$!{N;ba}' -f - textFile
This generates one substitution command for every replacement in replacementFile and feeds that set of commands as a script to a second sed, which has slurped the whole textFile into memory.
N.B. This assumes that there is a replacement for every pattern in the textFile.
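The idea behind both commands, replacing the n-th occurrence of the placeholder with the n-th label, can also be sketched in Python. This is only an illustration, not a translation of either command: the function name relabel and the short sample strings are assumptions. If the labels run out, the remaining placeholders are left unchanged.

```python
import re


def relabel(text, labels, placeholder="proto"):
    """Replace the n-th occurrence of `placeholder` with the n-th label."""
    remaining = iter(labels)
    # re.sub calls the function once per match, scanning left to right;
    # when no labels are left, the matched text is put back unchanged.
    return re.sub(re.escape(placeholder),
                  lambda m: next(remaining, m.group(0)),
                  text)


sample = '~h "proto"\n<BeginHMM>\n~h "proto"\n<BeginHMM>\n~h "proto"\n'
result = relabel(sample, ["a", "sp", "k12"])
```

For real files, the text and the labels would be read from disk first, for example with open(...).read() and open(...).read().splitlines().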
I have a text file with RGB data in the form of:
[Pixel 0,0] [Pixel 1,0] [Pixel 2,0]...
[Pixel 0,1] [Pixel 1,1] [Pixel 2,1]...
...
With an input of:
0.0 0.0 0.0 <-- this would be Pixel 0,0
1.0 0.0 0.0
1.0 0.9 0.0
I can create the flag of Germany in size 3x1 with:
%load the data to myData
Germany = reshape(myData,3,1,3);
image(Germany)
The 1px-wide pattern works well, as shown in the picture. However, the goal is to be able to create multiple patterns, e.g. the German flag in 3x3 followed by the Romanian flag in 3x3, or any other pattern of any length, and that is where I cannot find the proper way to reshape the matrix.
The input that should create the second example shown in picture is this:
|========= Germany Flag ==========| [ Blue ] [ Yellow ] [ Red ]
Black -> 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
Red -> 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
Yellow-> 1.0 0.9 0.0 1.0 0.9 0.0 1.0 0.9 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
Any help is appreciated
Update: as Marcin asked, the input files are literally as I described above.
This is the content of the GermanyRomania.txt file:
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
1.0 0.9 0.0 1.0 0.9 0.0 1.0 0.9 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
With that file I must create the 2nd pattern in picture (German+Romania Flag), there is ALL the RGB info required to do it.
I don't think you can achieve what you want by simply using the reshape function.
We must take into account that Matlab stores matrices in column-major order (you can read more about it here).
Therefore, before we can use the reshape function, we must have the data matrix in the following format:
[Pixel 0,0]
[Pixel 0,1]
...
[Pixel 1,0]
[Pixel 1,1]
...
[Pixel n,n]
Here's a possible solution:
% data stores the input
height = size(data, 1);
width = size(data, 2);
vertical_data_cell = mat2cell(data, height, 3 * ones(1, width / 3))';
vertical_data = cell2mat(vertical_data_cell);
flags = reshape(vertical_data, height, width / 3, 3);
image(flags)
Note that we make the matrix transformation on lines 4 and 5.
And here is the result for the input you provided:
It also works with different heights.
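The regrouping that the mat2cell, cell2mat and reshape calls perform can also be written out explicitly. The following Python sketch is an illustration under the assumption that every input row holds consecutive R G B triples; the name rows_to_image is hypothetical. It builds image[y][x] = (r, g, b) from the GermanyRomania.txt content quoted in the question.

```python
def rows_to_image(rows):
    """Turn rows of flattened R G B values into image[y][x] = (r, g, b)."""
    image = []
    for row in rows:
        if len(row) % 3:
            raise ValueError("each row must hold a multiple of 3 values")
        # Cut the row into consecutive triples, one per pixel.
        image.append([tuple(row[k:k + 3]) for k in range(0, len(row), 3)])
    return image


text = """\
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
1.0 0.9 0.0 1.0 0.9 0.0 1.0 0.9 0.0 0.0 0.0 1.0 1.0 0.9 0.0 1.0 0.0 0.0
"""
rows = [[float(t) for t in line.split()] for line in text.splitlines()]
image = rows_to_image(rows)  # 3 rows of 6 pixels each
```

Here the pixel grid is indexed row first, so no column-major reordering is needed; the reordering in the MATLAB version exists only because reshape fills columns first.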
Here's the input for the flags of Germany, Argentina and Portugal.
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.46 0.66 0.85 0.46 0.66 0.85 0.46 0.66 0.85
1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 1.0 1.0 0.98 0.75 0.29 1.0 1.0 1.0
1.0 0.9 0.0 1.0 0.9 0.0 1.0 0.9 0.0 0.46 0.66 0.85 0.46 0.66 0.85 0.46 0.66 0.85
0.0 1.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 1.0 0.9 0.0 1.0 0.9 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0
And this is the result:
using bash script.
example text file is:
<Mixture> 1 1.0000
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<TransP> 3
Now I want to replace the sequence of zeros with another array that is stored in a MATLAB variable, i.e. I want the file to look as follows:
<Mixture> 1 1.0000
<Mean> 39
-3.928404e+00 -2.833521e+00 1.418979e+00 -8.560805e+00 -4.987627e+00 -5.422057e+00 -2.148949e+00 -4.035418e+00 1.111476e+00 -4.360466e+00 -2.430894e+00 -3.562756e+00 2.149250e+01 4.148085e-03 3.280356e-01 1.993892e-01 4.226578e-01 3.070029e-01 3.061973e-01 2.200327e-01 2.279716e-01 1.851751e-01 2.630977e-01 2.530313e-01 1.584220e-01 -7.420680e-02 -6.129631e-02 6.845896e-03 1.414505e-02 5.905741e-02 7.828339e-02 6.020883e-02 7.195320e-02 4.838301e-02 4.507983e-02 6.380487e-02 7.884157e-02 6.597416e-02 -1.690722e-02
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<TransP> 3
Assuming $matlabvar contains the data you received from Matlab:
sed "s/ 0\.0\( 0\.0\)*/ $matlabvar/" input.txt > output.txt
This is a regular expression match and substitution.
s/old/new/ - replace text that's matched by the old regular expression with the new text
0 - matches 0
\. - matches .
\(...\) - this groups the contained elements
* - this matches 0 or more of the preceding expression (the grouped construct).
So it matches one 0.0 followed by any number of additional 0.0s. This is then replaced with $matlabvar.
You can learn all about regular expressions at regular-expressions.info.
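The behaviour of this pattern can be tried out in Python's re module, which writes the group as ( ... ) where sed's basic syntax needs \( ... \). The sample lines and the shortened replacement string below are made up for illustration. count=1 mimics sed's s/// without the g flag, which replaces only the first match on a line.

```python
import re

# Same pattern as in the sed command: one " 0.0" followed by any number more.
pattern = re.compile(r" 0\.0( 0\.0)*")

new_values = " -3.9 -2.8 1.4 -8.5"   # stand-in for the contents of $matlabvar

zeros_line = " 0.0 0.0 0.0 0.0"
replaced = pattern.sub(new_values, zeros_line, count=1)

# The pattern is not anchored, so it also matches the start of other numbers.
other_line = " 0.000e+0 1.000e+0"
also_changed = pattern.sub(" X", other_line, count=1)
```

The second case shows that any line containing " 0.0" is affected, not only a line that consists entirely of zeros.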
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 3
0.000e+0 1.000e+0 0.000e+0 0.000e+0 0.000e+0
0.000e+0 5.000e-1 5.000e-1 0.000e+0 0.000e+0
0.000e+0 0.000e+0 5.000e-1 5.000e-1 0.000e+0
0.000e+0 0.000e+0 0.000e+0 5.000e-1 5.000e-1
0.000e+0 0.000e+0 0.000e+0 0.000e+0 0.000e+0
That is my complete text file. Unfortunately, the suggested command also replaces the zeros in the matrix above, which I do not want. Another question is how to replace a general sequence (whose pattern we do not know) with another general sequence. I think it would have to be done by line numbers, but then another problem is that we do not know how many lines the sequence spans. Is there any way to do this?
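One possible direction, sketched here in Python under the assumption that the values to be replaced always sit on the single line directly after a header such as <Mean>, is to key the replacement on that header instead of on the zeros themselves. The function name replace_after_header and the shortened sample lines are hypothetical; a sequence that spans several lines would need a different stopping rule.

```python
def replace_after_header(lines, header, new_values):
    """Replace the one line that follows each line starting with `header`."""
    out = []
    replace_next = False
    for line in lines:
        if replace_next:
            out.append(new_values)   # swap in the new data line
            replace_next = False
        else:
            out.append(line)
            replace_next = line.strip().startswith(header)
    return out


sample = [
    "<Mean> 3",
    "0.0 0.0 0.0",
    "<Variance> 3",
    "1.0 1.0 1.0",
    "<State> 3",
    "0.000e+0 1.000e+0 0.000e+0",
]
result = replace_after_header(sample, "<Mean>", "-3.9 -2.8 1.4")
```

Because the match is on the header line, the zeros in the matrix rows are never looked at, whatever their format.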