Constructing a vector from a sequence in MacAulay2 - matrix

I am in the following situation:
S=QQ[x_0..x_n];
for i from 0 to n do for j from i to n do d_{i,j} = x_i*x_j;
Now I would like to construct a vector whose elements are
d_{0,0}=x_0^2,d_{0,1}=x_0*x_1,...,d_{0,n}=x_0*x_n,d_{1,1}=x_1^2,d_{1,2}=x_1*x_2,...,d_{n,n}=x_n^2
How can I do this in MacAulay2? Thank you very much.

This may be what you are looking for.
m=ideal(S_*)
m^2_*
The _* operator gets the generators of an ideal. So, m is the maximal ideal, and you are looking for the generators of m^2.
Alternatively
flatten entries basis(2,S)
which simply gives you the vector basis of the ring S in degree 2.

In Macaulay2, vector refers to a column vector, and if we have vector elements, we can construct the following vector:
SQ= for i from 0 to n list d_{i}
vector(SQ)
But since the vector you want is not a column vector, it's best to make a matrix:
d=mutableMatrix genericMatrix(S,n,n)
for i from 0 to n do for j from 0 to n do d_(i,j)=x_i*x_j

Related

Generating a perfect hash function given known list of strings?

Suppose I have a list of N strings, known at compile-time.
I want to generate (at compile-time) a function that will map each string to a distinct integer between 1 and N inclusive. The function should take very little time or space to execute.
For example, suppose my strings are:
{"apple", "orange", "banana"}
Such a function may return:
f("apple") -> 2
f("orange") -> 1
f("banana") -> 3
What's a strategy to generate this function?
I was thinking to analyze the strings at compile time and look for a couple of constants I could mod or add by or something?
The compile-time generation time/space can be quite expensive (but obviously not ridiculously so).
Say you have m distinct strings, and let ai, j be the jth character of the ith string. In the following, I'll assume that they all have the same length. This can be easily translated into any reasonable programming language by treating ai, j as the null character if j ≥ |ai|.
The idea I suggest is composed of two parts:
Find (at most) m - 1 positions differentiating the strings, and store these positions.
Create a perfect hash function by considering the strings as length-m vectors, and storing the parameters of the perfect hash function.
Obviously, in general, the hash function must check at least m - 1 positions. It's easy to see this by induction. For 2 strings, at least 1 character must be checked. Assume it's true for i strings: i - 1 positions must be checked. Create a new set of strings by appending 0 to the end of each of the i strings, and add a new string that is identical to one of the strings, except it has a 1 at the end.
Conversely, it's obvious that it's possible to find at most m - 1 positions sufficient for differentiating the strings (for some sets the number of course might be lower, as low as log to the base of the alphabet size of m). Again, it's easy to see so by induction. Two distinct strings must differ at some position. Placing the strings in a matrix with m rows, there must be some column where not all characters are the same. Partitioning the matrix into two or more parts, and applying the argument recursively to each part with more than 2 rows, shows this.
Say the m - 1 positions are p1, ..., pm - 1. In the following, recall the meaning above for ai, pj for pj ≥ |ai|: it is the null character.
let us define h(ai) = ∑j = 1m - 1[qj ai, pj % n], for random qj and some n. Then h is known to be a universal hash function: the probability of pair-collision P(x ≠ y ∧ h(x) = h(y)) ≤ 1/n.
Given a universal hash function, there are known constructions for creating a perfect hash function from it. Perhaps the simplest is creating a vector of size m2 and successively trying the above h with n = m2 with randomized coefficients, until there are no collisions. The number of attempts needed until this is achieved, is expected 2 and the probability that more attempts are needed, decreases exponentially.
It is simple. Make a dictionary and assign 1 to the first word, 2 to the second, ... No need to make things complicated, just number your words.
To make the lookup effective, use trie or binary search or whatever tool your language provides.

Pairwise matching of tiles

Recently in a coding competition I came across this question.
We have a 1000 tiles where each tile is a 3x3 matrix. Each cell in the
matrix has an integer value from 0 to 9 which signifies the elevation
of the cell. The problem was to find the maximum pairs of tiles such
that they fit in perfectly. The tiles may be rotated to fit in. By fit
in it means that for tile A and tile B
A[i]+B[i]=const for i=0 to 8
The approach I thought for this problem was that I could maintain a hash value corresponding to each tile. Then I would find the possible combinations of tiles that would be
a possible fit and look it up in the hashtable.
Ex. For the tile below
5 3 2 4 6 7 5 7 8
4 8 9 matches with 5 1 0 for const = 9 & with 6 2 1 for const=10
1 4 5 8 5 4 9 6 5
for this tile the 'const' would range from 9(adding 0 to the maximum element) to 10(adding 9 to the minimum element).
So I would get two possible combinations for tiles which i would look up in the table.
But this method is greedy and does not give the desired answer and also I was unable to think of a proper hash function which would consider of all possible rotations.
So what would be a good approach for solving this problem?
I am sure there is a brute force way to solve this problem but I was actually wondering whether a viable solution to the problem exists on the lines of "pairwise equal to k" problem.
For n=1000 I would stick with the O(n^2) brute force solution. However an O(n log n) algorithm is described below.
The lexicographicalish ordering is defined by the following less-than operator:
Given two matrices M1, M2, define M1' as M1 if M1[1] is positive and -M1 if M1[1] is negative, and likewise or M2'. We say that M1<M2 if M1'[1]<M2'[1], or if M1'[1] == M2'[1] and M1'[2] < M2'[2], or if M1'[1] == M2'[1] and M1'[2] == M2'[2] and M1'[3] < M2'[3] etc.
Subtract the middle element of each matrix from the rest of the elements of the matrix i.e. A'[5] = A[5] and A'[i] = A[i] - A[5]. Then A' fits with B' if A'[i] +B'[i] = 0 for i!=5, and the elevation is A'[5] + B'[5].
Create an array of matrices and a dictionary. Rotate each matrix so that the top left corner has minimal absolute value before adding it to the array. If there are multiple corners with the same absolute value then duplicate the matrix and store both rotations in the array.
If some rotation of a matrix fits with itself and i,j are indices of rotations of this matrix, add the key-value pairs (i,j) and (j, i) to the dictionary.
Create an array S of indices 1,2... and sort S using the lexicographicalish ordering.
Instead of needing O(n^2) operations to check all possible pairs of matrices, it is only necessary to check all pairs of matrices with indices are S_i and S_(i+1). If a pair of matrices fits, use the dictionary to check that the two matrices are not rotations of the same original matrix before calculating the elevation of the pair.
Not sure if this is the most efficient way for doing this, but it sure works.
What I would do is:
Go over all tiles and check the maximum and minimum value of each tile and save it in a different array.
Check all possible pairs.
If min(A) + max(B) == min(B) + max(A) then check if some rotation of B fits perfectly on A. If it does, add 1 to your count.
Else, it does not fit so you can skip the checking for this pair.
Note: The reason for saving both maximum and minimum for each tile is that it might save us unnecessary calculations and checking rotations as in O(1) we can check if it doesn't fit.

Generating a random matrix with non-static constraints

I would like to generate a random matrix with constraints on both rows and columns in MATLAB. But the problem is I have two parameters for this constraints which are not fix for each element. For explanation, consider the mxn matrix P = [P1 ; P2; ...; Pm], and 2 other vectors lambda and Mu with m and n elements, respectively.
Consider lambda as [lambda(1), lambda(2), ..., lambda(m)] and Mu as [Mu(1), Mu2, ..., Mu(n)]
lamda and Mu should have this constraints:
sum of lambda(s) < sum of Mu(s).
,Now for the random matrix P:
each element of the matrix(P[j,i]) should be equal or greater than zero.
sum of the elements of each row is equal to one (i.e. for the row of j: sigma_i(P[j,i] = 1)
for each column j, sum of the production of each element with the correspond lambda(j) is less than the correspond element in the Mu vector (i.e.Mu(i)). i.e. for the column of i: sigma_j(P[j,i]*lambda(j)) < Mu(i)
I have tried coding all these constraints but because the existence of lambda and Mu vectors, just one of the constraints of 3 or 4 can be feasible. May you please help me for coding this matrix.
Thanks in advance
There could be values of Mu and Lambda that does not allow any value of P[i,j].
For each row-vector v:
Constraint 3 means the values are constrained to the hyper-plane v.1 = 1 (A)
Constraint 4 means the values are constrained to the half-space v.Lambda < m (H), where m is the element of Mu corresponding to the current row.
Constraint 1 does not guarantee that these two constraint generates a non-empty solution space.
To verify that the solution-space is non-empty, the easiest method is by checking each corner of hyper-plane A (<1,0,0,...>, <0,1,0,...>, ...). If at least one of the corners qualify for constraint 4, the solution-space is non-empty.
Having said that; Assuming the solution-space is non-empty, you could generate values matching those constraints by:
Generate random vector with elements 0 ≤ vi ≤ 1.
Scale by dividing by the sum of the elements.
If this vector does not qualify for constraint 4, repeat from step 1.
Once you have n such vectors, combine them as rows into a matrix.
The speed of this algorithm depends on how large volume of hyper-plane A is contained inside the half-space H. If only 1% is contained, it would expected to require 100 iterations for that row.

Construct a full rank matrix by adding vectors from the standard basis

I have a nxn singular matrix. I want to add k rows (which must be from the standard basis e1, e2, ..., en) to this matrix such that the new (n+k)xn matrix is full column rank. The number of added rows k must be minimum and they can be added in any order (not just e1, e2 ,..., it can be e4, e10, e1, ...) as long as k is minimum.
Does anybody know a simple way to do this? Any help is appreciated.
You can achieve this by doing a QR decomposition with column pivoting, then taking the transpose of the last n-rank(A) columns of the permutation matrix.
In matlab, this is achieved by the qr function(See the matlab documentation here):
r=rank(A);
[Q,R,E]=qr(A);
newA=[A;transpose(E(:,end-r+1:end))];
Each row of transpose(E(:,end-r+1:end)) will be a member of standard basis, rank of newA will be n, and this is also the minimal number of standard basis you will need to do so.
Here is how this works:
QR decomposition with column pivoting is a standard procedure to decompose a matrix A into products:
A*E==Q*R
where Q is an orthogonal matrix if A is real, or an unitary matrix if A is complex; R is upper triangular matrix, and E is a permutation matrix.
In short, the permutations are chosen so that the diagonal elements are larger than the off-diagonals in the same row, and that size of the diagonal elements are non-increasing. More detailed description can be found on the netlib QR factorization page.
Since Q and E are both orthogonal (or unitary) matrices, the rank of R is the same as the rank of A. To bring up the rank of A, we just need to find ways to increase the rank of R; and this is much more straight forward thanks to the structure of R as the result of pivoting and the fact that it is upper-triangular.
Now, with the requirement placed on pivoting procedure, if any diagonal element of R is 0, the entire row has to be 0. The n-rank(A) rows of 0s in the bottom if R is responsible for the nullity. If we replace the lower right corner with an identity matrix, the that new matrix would be full rank. Well, we cannot really do the replacement, but we can append the rows matrix to the bottom of R and form a new matrix that has the same rank:
B==[ 0 I ] => newR=[ R ; B ]
Here the dimensionality of I is the nullity of A and that of R.
It is readily seen that rank(newR)=n. Then we can also define a new unitary Q matrix by expanding its dimensionality in a trivial manner:
newQ=[Q 0 ; 0 I]
With that, our new rank n matrix can be obtained as
newA=newQ*newR.transpose(E)=[Q*R ; B ]*transpose(E) =[A ; B*transpose(E)]
Note that B is [0 I] and E is a permutation matrix, so B*transpose(E) is simply the transpose
of the last n-rank(A) columns of E, and thus a set of rows made of standard basis, and that's just what you wanted!
Is n very large? The simplest solution without using any math would be to try adding e_i and seeing if the rank increases. If it does, keep e_i. proceed until finished.
I like #Xiaolei Zhu's solution because it's elegant, but another way to go (that's even more computationally efficient is):
Determine if any rows, indexed by i, of your matrix A are all zero. If so, then the corresponding e_i must be concatenated.
After that process, you can simply concatenate any subset of the n - rank(A) columns of the identity matrix that you didn't add in step 1.
rows/cols from Identity matrix can be added in any order. it does not need to be added in usual order as e1,e2,... in general situation for making matrix full rank.

Compare two arrays of points [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Improve this question
I'm trying to find a way to find similarities in two arrays of different points. I drew circles around points that have similar patterns and I would like to do some kind of auto comparison in intervals of let's say 100 points and tell what coefficient of similarity is for that interval. As you can see it might not be perfectly aligned also so point-to-point comparison would not be a good solution also (I suppose). Patterns that are slightly misaligned could also mean that they are matching the pattern (but obviously with a smaller coefficient)
What similarity could mean (1 coefficient is a perfect match, 0 or less - is not a match at all):
Points 640 to 660 - Very similar (coefficient is ~0.8)
Points 670 to 690 - Quite similar (coefficient is ~0.5-~0.6)
Points 720 to 780 - Let's say quite similar (coefficient is ~0.5-~0.6)
Points 790 to 810 - Perfectly similar (coefficient is 1)
Coefficient is just my thoughts of how a final calculated result of comparing function could look like with given data.
I read many posts on SO but it didn't seem to solve my problem. I would appreciate your help a lot. Thank you
P.S. Perfect answer would be the one that provides pseudo code for function which could accept two data arrays as arguments (intervals of data) and return coefficient of similarity.
Click here to see original size of image
I also think High Performance Mark has basically given you the answer (cross-correlation). In my opinion, most of the other answers are only giving you half of what you need (i.e., dot product plus compare against some threshold). However, this won't consider a signal to be similar to a shifted version of itself. You'll want to compute this dot product N + M - 1 times, where N, M are the sizes of the arrays. For each iteration, compute the dot product between array 1 and a shifted version of array 2. The amount you shift array 2 increases by one each iteration. You can think of array 2 as a window you are passing over array 1. You'll want to start the loop with the last element of array 2 only overlapping the first element in array 1.
This loop will generate numbers for different amounts of shift, and what you do with that number is up to you. Maybe you compare it (or the absolute value of it) against a threshold that you define to consider two signals "similar".
Lastly, in many contexts, a signal is considered similar to a scaled (in the amplitude sense, not time-scaling) version of itself, so there must be a normalization step prior to computing the cross-correlation. This is usually done by scaling the elements of the array so that the dot product with itself equals 1. Just be careful to ensure this makes sense for your application numerically, i.e., integers don't scale very well to values between 0 and 1 :-)
i think HighPerformanceMarks's suggestion is the standard way of doing the job.
a computationally lightweight alternative measure might be a dot product.
split both arrays into the same predefined index intervals.
consider the array elements in each intervals as vector coordinates in high-dimensional space.
compute the dot product of both vectors.
the dot product will not be negative. if the two vectors are perpendicular in their vector space, the dot product will be 0 (in fact that's how 'perpendicular' is usually defined in higher dimensions), and it will attain its maximum for identical vectors.
if you accept the geometric notion of perpendicularity as a (dis)similarity measure, here you go.
caveat:
this is an ad hoc heuristic chosen for computational efficiency. i cannot tell you about mathematical/statistical properties of the process and separation properties - if you need rigorous analysis, however, you'll probably fare better with correlation theory anyway and should perhaps forward your question to math.stackexchange.com.
My Attempt:
Total_sum=0
1. For each index i in the range (m,n)
2. sum=0
3. k=Array1[i]*Array2[i]; t1=magnitude(Array1[i]); t2=magnitude(Array2[i]);
4. k=k/(t1*t2)
5. sum=sum+k
6. Total_sum=Total_sum+sum
Coefficient=Total_sum/(m-n)
If all values are equal, then sum would return 1 in each case and total_sum would return (m-n)*(1). Hence, when the same is divided by (m-n) we get the value as 1. If the graphs are exact opposites, we get -1 and for other variations a value between -1 and 1 is returned.
This is not so efficient when the y range or the x range is huge. But, I just wanted to give you an idea.
Another option would be to perform an extensive xnor.
1. For each index i in the range (m,n)
2. sum=1
3. k=Array1[i] xnor Array2[i];
4. k=k/((pow(2,number_of_bits))-1) //This will scale k down to a value between 0 and 1
5. sum=(sum+k)/2
Coefficient=sum
Is this helpful ?
You can define a distance metric for two vectors A and B of length N containing numbers in the interval [-1, 1] e.g. as
sum = 0
for i in 0 to 99:
d = (A[i] - B[i])^2 // this is in range 0 .. 4
sum = (sum / 4) / N // now in range 0 .. 1
This now returns distance 1 for vectors that are completely opposite (one is all 1, another all -1), and 0 for identical vectors.
You can translate this into your coefficient by
coeff = 1 - sum
However, this is a crude approach because it does not take into account the fact that there could be horizontal distortion or shift between the signals you want to compare, so let's look at some approaches for coping with that.
You can sort both your arrays (e.g. in ascending order) and then calculate the distance / coefficient. This returns more similarity than the original metric, and is agnostic towards permutations / shifts of the signal.
You can also calculate the differentials and calculate distance / coefficient for those, and then you can do that sorted also. Using differentials has the benefit that it eliminates vertical shifts. Sorted differentials eliminate horizontal shift but still recognize different shapes better than sorted original data points.
You can then e.g. average the different coefficients. Here more complete code. The routine below calculates coefficient for arrays A and B of given size, and takes d many differentials (recursively) first. If sorted is true, the final (differentiated) array is sorted.
procedure calc(A, B, size, d, sorted):
if (d > 0):
A' = new array[size - 1]
B' = new array[size - 1]
for i in 0 to size - 2:
A'[i] = (A[i + 1] - A[i]) / 2 // keep in range -1..1 by dividing by 2
B'[i] = (B[i + 1] - B[i]) / 2
return calc(A', B', size - 1, d - 1, sorted)
else:
if (sorted):
A = sort(A)
B = sort(B)
sum = 0
for i in 0 to size - 1:
sum = sum + (A[i] - B[i]) * (A[i] - B[i])
sum = (sum / 4) / size
return 1 - sum // return the coefficient
procedure similarity(A, B, size):
sum a = 0
a = a + calc(A, B, size, 0, false)
a = a + calc(A, B, size, 0, true)
a = a + calc(A, B, size, 1, false)
a = a + calc(A, B, size, 1, true)
return a / 4 // take average
For something completely different, you could also run Fourier transform using FFT and then take a distance metric on the returning spectra.

Resources