Fast matrix Exponentiation computation in Galois field - algorithm

I am looking for a fast way to compute a power of a matrix A in the Galois field GF(2). A is stored as a double matrix, and its x-th power is denoted by
A^x = A * A * A * ... * A (x times)
The simple way is to convert A to GF(2) (because the given matrix A is a double matrix) and then perform the exponentiation.
Matlab code
A1 = gf(A, 2) % // convert to galois field
A_pow_x_first = A1^x; % // Perform A^x
However, this way takes a long time to convert the matrix A to GF(2). I am looking for a fast way that avoids the GF(2) conversion, that is, one that uses the mod operation:
A_pow_x_second = mod(A^x, 2)
However, the results of the first and second ways are not the same. The cause is numeric overflow. Some members suggested converting the matrix A to int64, but I do not think that is a good way to handle my problem. Could you suggest a fast way to do this in MATLAB? Thanks in advance.
Here is a simple example:
>> A = [1 0 1
0 1 1
1 1 1]
First way,
>> A_pow_x_first = gf(A, 2)^50
Result:
0 1 0
1 0 0
0 0 1
Second way
>> A_pow_x_second = mod(A^50, 2)
A_pow_x_second =
0 0 0
0 0 0
0 0 0
How can I quickly compute A^x without converting to GF(2), while still getting the same result as the first way?
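One possible approach, sketched here in Python rather than MATLAB: reduce modulo 2 after every multiplication and use repeated squaring, so no intermediate entry ever grows beyond the matrix dimension. The all-zero result of mod(A^50, 2) is consistent with the entries of A^50 exceeding 2^53, beyond which a double can no longer represent odd integers. The function names mat_mul_gf2 and mat_pow_gf2 are made up for this illustration and are not part of any library.

```python
def mat_mul_gf2(a, b):
    """Multiply two 0/1 matrices, reducing every entry modulo 2."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] & b[k][j] for k in range(inner)) % 2 for j in range(cols)]
        for row in a
    ]


def mat_pow_gf2(a, x):
    """Compute a^x over GF(2) by repeated squaring (x >= 0, a square)."""
    n = len(a)
    # Start from the identity matrix.
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    base = [[int(v) % 2 for v in row] for row in a]
    while x > 0:
        if x & 1:
            result = mat_mul_gf2(result, base)
        base = mat_mul_gf2(base, base)
        x >>= 1
    return result


A = [[1, 0, 1], [0, 1, 1], [1, 1, 1]]
print(mat_pow_gf2(A, 50))  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

This needs only about log2(x) squarings, and the printed matrix agrees with the gf(A, 2)^50 result shown above. The same idea carries over to MATLAB by wrapping each product in mod(..., 2).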

Related

Can I always assume that an mvp matrix with corner value !=1 is performing scaling?

Assume I have a modelview projection matrix, mvp and I know that mvp[3][3] !=1 and mvp[3][3] > 0
Can I assume that the model matrix performed scaling? Or, since the projection matrix itself performs scaling, is this number not useful without the original matrices?
No, this value alone does not tell you much. Consider a diagonal matrix like the following:
d 0 0 0
0 d 0 0
0 0 d 0
0 0 0 d
d is an arbitrary number.
This matrix is essentially the homogeneous equivalent of the identity matrix and does not perform any transformation at all. The uniform scaling part in the upper left 3x3 block is cancelled out by the perspective divide. You can always multiply the matrix by the inverse of the m33 entry to somewhat normalize it (this will preserve the transformation). For the above matrix, you would then get:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
And in this form, you can easily see that it is the identity. Moreover, you can examine the upper left 3x3 block to find out if there is a scaling (depending on your definition of scaling, calculating the determinant of the 3x3 block and checking for 1 is one option as Robert mentioned in the comments).
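As a rough illustration of the normalization and the determinant check described above, here is a small Python sketch. The names normalize_by_m33 and upper_left_det are hypothetical, the matrix is assumed to be a 4x4 list of lists indexed as m[row][col], and whether a determinant of 1 means "no scaling" depends on your definition of scaling.

```python
def normalize_by_m33(m):
    """Divide every entry of a 4x4 matrix by its bottom-right entry m[3][3]."""
    w = m[3][3]
    if w == 0:
        raise ValueError("m[3][3] is zero, cannot normalize")
    return [[value / w for value in row] for row in m]


def upper_left_det(m):
    """Determinant of the upper-left 3x3 block of a 4x4 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = (row[:3] for row in m[:3])
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


d = 2.0
scaled_identity = [[d if r == c else 0.0 for c in range(4)] for r in range(4)]
normalized = normalize_by_m33(scaled_identity)
print(normalized[0])               # first row after normalization
print(upper_left_det(normalized))  # determinant of the normalized 3x3 block
```

For the diagonal matrix from the answer, the determinant of the raw 3x3 block is d^3, but after dividing by m33 it becomes 1, which is why the corner value alone says little.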

Decomposing uint32s into uint8s in a large matrix

For a project I am working on, I am loading in large image files, which Matlab inputs as LxWx3 arrays of uint8s. I have a function which concatenates these component-wise into a LxWx1 array of uint32s, but I can't find a way to do the reverse without using nested for loops, which is far too slow for the matrices I am working with.
Could anyone recommend a way to accomplish this efficiently? Basically, given an LxW matrix of uint32s, I want to return an LxWx3 matrix of uint8s, where the (x, y, 1:3) components are the three most significant bytes in the corresponding uint32.
You can do that with typecast:
A = uint32([2^16 2^30; 256 513]);
B = permute(reshape(typecast(A(:), 'uint8'), [], size(A,1), size(A,2)), [2 3 1]);
B = flip(B, 3); %// flip 3rd dim to bring MSB first, if needed (depends on computer)
B = B(:,:,2:4);
Example: for A = uint32([2^16 2^30; 256 513]);
A =
65536 1073741824
256 513
the result is
B(:,:,1) =
1 0
0 0
B(:,:,2) =
0 0
1 2
B(:,:,3) =
0 0
0 1
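For comparison, the same decomposition can be written with shifts and masks, which does not depend on the byte order of the machine. This is only a sketch in Python with made-up names (low_three_bytes, split_packed). It reproduces the example output above, meaning it keeps bytes 2 to 4 in most-significant-first order, which are the three low-order bytes of each uint32.

```python
def low_three_bytes(value):
    """Return the three low-order bytes of a 32-bit value, highest byte first."""
    return [(value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF]


def split_packed(packed):
    """Turn an L x W nested list of uint32 values into an L x W x 3 list of bytes."""
    return [[low_three_bytes(v) for v in row] for row in packed]


A = [[2**16, 2**30], [256, 513]]
B = split_packed(A)
print(B[0][0])  # [1, 0, 0]
print(B[1][1])  # [0, 2, 1]
```

Here B[x][y][k] plays the role of B(x+1, y+1, k+1) in the MATLAB result.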

Optimal Square Covering in 2D Matrix (Minimize Coverage Cost)

I came across the following programming challenge recently:
Statement
Consider a 2D square matrix of size NxN containing 0s and 1s. You have to cover all the 1s in the matrix using squares of size 1, 2 or 3. The coverage cost using square of size 1 is 2, using square of size 2 is 4 and using square of size 3 is 7. The objective is to find the minimum coverage cost to cover all the 1s in matrix.
Constraints
1 <= N <= 100
General Comments
Overlapping covering squares are allowed.
It is not necessary that the covering square should cover only 1s - they may cover cells containing 0s as well.
Example
Consider the following matrix as an example:
0 0 0 0 0 0 0 0
0 1 1 0 0 0 0 0
0 1 1 1 0 0 0 0
0 0 1 1 1 0 0 0
0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 1 0
0 0 0 0 0 0 1 0
In above example, minimum coverage cost is 7x1 + 4x2 + 2x1 = 17. Another covering is possible with minimum coverage cost of 7x1 + 4x1 + 2x3 = 17.
My Approach
I tried to approach the problem in the following manner:
1. Use square of size 3 to cover 1s where number of 1s in any 3x3 area is >= 5. Remove those 1s from the matrix.
2. Next, use square of size 2 to cover 1s where number of 1s in any 2x2 area is >= 2. Remove those 1s from the matrix.
3. Cover remaining 1s with square of size 1.
This approach is greedy and is not optimal. For the example above, my approach gives answer 7x1 + 4x2 + 2x2 = 19 which is not optimal.
Any pointers about how to approach this problem or references to known problems which can be used to solve this one are appreciated. Thanks.
Update
Taking a cue from @bvdb's answer, I updated the approach to select the coverage squares based on the number of 1s they are covering. However, the approach is still non-optimal. Consider a scenario where we have the following arrangement:
1 0 1
0 0 0
1 0 1
This arrangement will be covered using 4 coverage squares of size 1 whereas they must be covered using 1 square of size 3. In general, 5 1s in 3x3 area must be covered using different strategies based on how they are spread in the area. I can hardcode it for all types of cases, but I am looking for an elegant solution, if it exists.
Your problem is a typical Packing problem.
Your approach of fitting the biggest box first makes perfect sense.
A simple way to make your algorithm better is to give preference to the 3x3 squares with the most content.
Example:
Use square of size 3 to cover 1s where number of 1s in any 3x3 area is = 9. Remove those 1s from the matrix.
Idem, but where area is = 8.
Idem, but where area is = 7.
Idem, but where area is = 6.
Next, use square of size 2 to cover 1s where number of 1s in any 2x2 area is = 4. Remove those 1s from the matrix.
etc ...
Monte carlo method
But if you want to add overlap, then it gets more tricky. I am sure you could work it out mathematically. However, when logic becomes tricky, then the Monte Carlo method always comes to mind:
Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other mathematical methods.
Monte Carlo trades coded logic for speed and randomness:
STEP 1: repeat 4 times:
    cor = randomCoordinate()
    if ( hasContent(cor, 3) ) then putSquare(cor, 3)
STEP 2: repeat 16 times:
    cor = randomCoordinate()
    if ( hasContent(cor, 2) ) then putSquare(cor, 2)
STEP 3: corList = getFreeSquaresWithContent()
    putSquare(corList, 1)
calculateScore()
store used squares and score.
This code should be simple but really fast.
Then run this 100,000 times, and keep the top 10 scores.
Which 3x3 squares did the winners use most often?
Use this information as a "starting position".
Now, run it again from STEP2 using this starting position.
This means that the 100,000 iterations don't have to focus on the 3x3 squares any more; they immediately start adding 2x2 squares.
PS: The number of iterations you do (e.g. 100,000) is really a matter of the required response time and the required accuracy. You should test this to find out what is acceptable.
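To make the random-sampling idea concrete, here is a simplified Python variant. It is not the exact three-step procedure above: each trial repeatedly picks a random uncovered 1 and drops a randomly sized, randomly offset square on it until everything is covered, and the cheapest trial is kept. All names (random_cover, monte_carlo_cover) and the default trial count are illustrative assumptions.

```python
import random

COSTS = {1: 2, 2: 4, 3: 7}  # square size -> cost, from the problem statement


def random_cover(ones, rng):
    """Build one random covering of the given set of (row, col) cells."""
    remaining = set(ones)
    squares, cost = [], 0
    while remaining:
        r, c = rng.choice(sorted(remaining))
        size = rng.choice([1, 2, 3])
        # Random top-left corner such that the square still contains (r, c).
        r0 = r - rng.randrange(size)
        c0 = c - rng.randrange(size)
        squares.append((r0, c0, size))
        cost += COSTS[size]
        remaining -= {(i, j) for i in range(r0, r0 + size)
                      for j in range(c0, c0 + size)}
    return cost, squares


def monte_carlo_cover(grid, trials=2000, seed=0):
    """Return the cheapest (cost, squares) found over a number of random trials."""
    ones = {(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v}
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        candidate = random_cover(ones, rng)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


grid = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]
cost, squares = monte_carlo_cover(grid)
print(cost, squares)
```

The result is only as good as the best sample, so it gives an upper bound on the minimum cost rather than a guarantee.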
If you are looking for a deterministic approach, I think the best thing to do is to sort all possible patterns in an optimal order. There are only 394 relevant patterns. There is no need to hardcode them; you can generate them on the fly.
First our definitions (rules of the game). Each square has a size and a cost.
class Square
{
    // package-private so the code below can read square.size and square.cost
    final int size;
    final int cost;

    Square(int pSize, int pCost)
    {
        size = pSize;
        cost = pCost;
    }
}
And there are only 3 types of squares. squareOne keeps the cost of a 1x1 matrix, squareTwo for a 2x2 and squareThree for a 3x3 matrix.
Square squareOne = new Square(1, 2);
Square squareTwo = new Square(2, 4);
Square squareThree = new Square(3, 7);
List<Square> definitions = Arrays.asList(squareOne, squareTwo, squareThree);
We are going to have to store each pattern with its cost, number of hits, and its cost per hit (efficiency). So here follows the class that I am using to store it. Note that this class contains methods that help to perform the sorting as well as conversions to a matrix of booleans (1/0 values).
class ValuedPattern implements Comparable<ValuedPattern>
{
    // package-private so the demo code below can read costPerHit
    final long pattern;
    final int size;
    final int cost;
    final double costPerHit;
    final int hits;

    ValuedPattern(long pPattern, int pSize, int pCost)
    {
        pattern = pPattern;
        cost = pCost;
        size = pSize;
        // calculate the efficiency: count the cells that contain a 1
        int highCount = 0;
        BitSet set = BitSet.valueOf(new long[]{pattern});
        for (int i = 0; i < set.size(); i++)
        {
            if (set.get(i)) highCount++;
        }
        hits = highCount;
        costPerHit = (double) cost / (double) hits;
    }

    public boolean[][] toArray()
    {
        boolean[][] patternMatrix = new boolean[size][size];
        BitSet set = BitSet.valueOf(new long[]{pattern});
        for (int i = 0; i < size; i++)
        {
            for (int j = 0; j < size; j++)
            {
                // bit (i * size + j) holds the cell at row i, column j
                patternMatrix[i][j] = set.get(i * size + j);
            }
        }
        return patternMatrix;
    }

    /**
     * Sort by efficiency
     * Next prefer big matrixes instead of small ones.
     */
    @Override
    public int compareTo(ValuedPattern p)
    {
        if (p == null) return 1;
        if (costPerHit < p.costPerHit) return -1;
        if (costPerHit > p.costPerHit) return 1;
        if (hits > p.hits) return -1;
        if (hits < p.hits) return 1;
        if (size > p.size) return -1;
        if (size < p.size) return 1;
        return Long.compare(pattern, p.pattern);
    }

    @Override
    public boolean equals(Object obj)
    {
        if (! (obj instanceof ValuedPattern)) return false;
        return (((ValuedPattern) obj).pattern == pattern) &&
               (((ValuedPattern) obj).size == size);
    }
}
Next we are going to store all possible patterns in a sorted collection (i.e. a TreeSet sorts its content automatically using the compareTo method of the object).
Since your patterns are just 0 and 1 values, you can think of them as numeric values (long is a 64bit integer which is more than enough) which can be converted later to a boolean matrix. The size of the pattern is the same as the number of bits of that numeric value. Or in other words there are 2^x possible values, with x being the number of cells in your pattern.
// create a giant list of all possible patterns :)
Collection<ValuedPattern> valuedPatternSet = new TreeSet<ValuedPattern>();
for (Square square : definitions)
{
    int size = square.size;
    int bits = size * size;
    long maxValue = 1L << bits; // 2^bits possible patterns
    for (long i = 1; i < maxValue; i++)
    {
        ValuedPattern valuedPattern = new ValuedPattern(i, size, square.cost);
        // filter patterns with a ridiculously high cost per hit.
        if (valuedPattern.costPerHit > squareOne.cost) continue;
        // and store the result for later
        valuedPatternSet.add(valuedPattern);
    }
}
After composing the list, the patterns are already sorted according to efficiency. So now you can just apply the logic that you already have.
// use the list in that order
for (ValuedPattern valuedPattern : valuedPatternSet)
{
    boolean[][] matrix = valuedPattern.toArray();
    System.out.println("Pattern " + Arrays.deepToString(matrix) + " has cost/hit: " + valuedPattern.costPerHit);
    // todo : do your thing :)
}
The demo code above outputs all patterns with their efficiency. Note that smaller patterns sometimes have a better efficiency than the bigger ones.
Pattern [[true, true, true], [true, true, true], [true, true, true]] has cost/hit: 0.7777777777777778
Pattern [[true, true, true], [true, true, true], [true, true, false]] has cost/hit: 0.875
Pattern [[true, true, true], [true, true, true], [true, false, true]] has cost/hit: 0.875
Pattern [[true, true, true], [true, true, true], [false, true, true]] has cost/hit: 0.875
...
The entire thing runs in just a couple of ms.
EDIT:
I added some more code, which I am not going to drop here (but don't hesitate to ask, then I'll e-mail it to you). But I just wanted to show the result it came up with:
EDIT2:
I am sorry to tell you that you are correct to question my solution. It turns out there is a case where my solution fails:
0 0 0 0 0 0
0 1 1 1 1 0
0 1 1 1 1 0
0 1 1 1 1 0
0 1 1 1 1 0
0 0 0 0 0 0
My solution is greedy, in the sense that it immediately tries to apply the most efficient pattern:
1 1 1
1 1 1
1 1 1
Next only the following remains:
0 0 0 0 0 0
0 _ _ _ 1 0
0 _ _ _ 1 0
0 _ _ _ 1 0
0 1 1 1 1 0
0 0 0 0 0 0
Next it will use three 2x2 squares to cover the remains.
So the total cost = 7 + 3*4 = 19
The best way of course would have been to use four 2x2 squares.
With a total cost of 4*4 = 16
Conclusion: So, even though the first 3x3 was very efficient, the next 2x2 patterns are less efficient. Now that you know this exception you could add it to the list of patterns. E.g. a square with size 4 has cost 16. However, that wouldn't solve it, a 3x3 would still have a lower cost/hit and would always be considered first. So, my solution is broken.
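When experimenting with heuristics like the ones above, it can help to have an exact reference for small inputs. The following Python sketch is an exhaustive search with memoization, not one of the approaches from this thread: it always picks the first uncovered 1 and tries every square of size 1, 2 or 3 that contains it. It is exponential in the number of 1s, so it is only meant for tiny test matrices, not for N = 100. The name min_cover_cost is made up, and squares are allowed to stick out of the matrix, which is an assumption that does not change the cost.

```python
from functools import lru_cache

COSTS = {1: 2, 2: 4, 3: 7}  # square size -> cost, from the problem statement


def min_cover_cost(grid):
    """Exact minimum cost to cover all 1s (only feasible for few 1s)."""
    ones = frozenset((r, c) for r, row in enumerate(grid)
                     for c, v in enumerate(row) if v)

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0
        r, c = min(remaining)  # first uncovered 1 in row-major order
        best = float("inf")
        for size, cost in COSTS.items():
            # Every placement of this square that contains (r, c).
            for r0 in range(r - size + 1, r + 1):
                for c0 in range(c - size + 1, c + 1):
                    covered = frozenset((i, j)
                                        for i in range(r0, r0 + size)
                                        for j in range(c0, c0 + size))
                    best = min(best, cost + solve(remaining - covered))
        return best

    return solve(ones)


block = [[0] * 6] + [[0, 1, 1, 1, 1, 0] for _ in range(4)] + [[0] * 6]
print(min_cover_cost(block))  # 16
```

On the 4x4 block from EDIT2 this returns 16, matching the four 2x2 squares, whereas the greedy pattern order ends up at 19.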

Error correcting codes and minimum distances

I was looking at a challenge online (at King's website) and although I understand the general idea behind it I'm slightly lost - maybe the wording is a little off? Here is the problem and I'll state what I don't understand below:
Error correcting codes are used in a wide variety of applications ranging from satellite communication to music CDs. The idea is to encode a binary string of length k as a binary string of length n>k, called a codeword such that even if some bit(s) of the encoding are corrupted (if you scratch on your CD for instance), the original k-bit string can still be recovered.
There are three important parameters associated with an error correcting code: the length of codewords (n), the dimension (k) which is the length of the unencoded strings, and finally the minimum distance (d) of the code. Distance between two codewords is measured as hamming distance, i.e., the number of positions in which the codewords differ: 0010 and 0100 are at distance 2. The minimum distance of the code is the distance between the two different codewords that are closest to each other.
Linear codes are a simple type of error correcting codes with several nice properties. One of them being that the minimum distance is the smallest distance any non-zero codeword has to the zero codeword (the codeword consisting of n zeros always belongs to a linear code of length n). Another nice property of linear codes of length n and dimension k is that they can be described by an n×k generator matrix of zeros and ones. Encoding a k-bit string is done by viewing it as a column vector and multiplying it by the generator matrix. The example below shows a generator matrix and how the string 1001 is encoded.
[Figure: graph.png]
Matrix multiplication is done as usual except that addition is done modulo 2 (i.e., 0+1=1+0=1 and 0+0=1+1=0). The set of codewords of this code is then simply all vectors that can be obtained by encoding all k-bit strings in this way.
Write a program to calculate the minimum distance for several linear error correcting codes of length at most 30 and dimension at most 15. Each code will be given as a generator matrix.
Input
You will be given several generator matrices as input. The first line contains an integer T indicating the number of test cases. The first line of each test case gives the parameters n and k where 1≤n≤30, 1≤k≤15 and n > k, as two integers separated by a single space. The following n lines describe a generator matrix. Each line is a row of the matrix and has k space separated entries that are 0 or 1.
Output
For each generator matrix output a single line with the minimum distance of the corresponding linear code.
Sample Input 1
2
7 4
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0 1 1 1
1 0 1 1
1 1 0 1
3 2
1 1
0 0
1 1
Sample Output 1
3
0
Now my assumption is that the question is asking: "Write a program that takes a linear code in matrix form and reports the minimum distance from the all-zero codeword." I just don't understand why the output is 3 for the first input and 0 for the second.
Very confused.
Any ideas?
For first example:
Input binary string: 1000
Resulting code: 1000011
Hamming distance to zero codeword 0000000: 3
For second example:
Input binary string: 11
Resulting code: 000
Hamming distance to zero codeword 000: 0
Your goal is to find the codeword, produced from some non-zero k-bit input string, that has the minimal Hamming distance to the zero codeword (in other words, the fewest ones in its binary representation) and return that distance.
Hope that helps, the problem description is indeed a little bit hard to understand.
EDIT. I made a typo in the first example: the actual input should be 1000, not 0001. It may also be unclear what exactly the input string is and how the codeword is calculated, so let's look at the first sample.
Input binary string: 1000
This binary string in general is not part of generator matrix. It is just one of all possible non-zero 4-bit strings. Let's multiply it by generator matrix:
(1 0 0 0) * (1 0 0 0) = 1
(0 1 0 0) * (1 0 0 0) = 0
(0 0 1 0) * (1 0 0 0) = 0
(0 0 0 1) * (1 0 0 0) = 0
(0 1 1 1) * (1 0 0 0) = 0
(1 0 1 1) * (1 0 0 0) = 1
(1 1 0 1) * (1 0 0 0) = 1
One way to find an input that produces a "minimal" codeword is to iterate over all 2^k-1 non-zero k-bit strings and calculate the codeword for each of them. This is a feasible solution for k <= 15.
Another example for the first test case is 0011 (multiple inputs can produce a "minimal" output):
(1 0 0 0) * (0 0 1 1) = 0
(0 1 0 0) * (0 0 1 1) = 0
(0 0 1 0) * (0 0 1 1) = 1
(0 0 0 1) * (0 0 1 1) = 1
(0 1 1 1) * (0 0 1 1) = 2 = 0 (mod 2)
(1 0 1 1) * (0 0 1 1) = 2 = 0 (mod 2)
(1 1 0 1) * (0 0 1 1) = 1
The resulting code 0011001 also has Hamming distance 3 to the zero codeword. There is no non-zero 4-bit string whose code has fewer than 3 ones in its binary representation. That's why the answer for the first test case is 3.
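The brute-force search described above can be sketched in a few lines of Python. This is only an illustration with a made-up function name (min_distance). It assumes the generator matrix is given as a list of n rows with k entries each, packs every row into an integer, and computes each codeword bit as the parity of the bitwise AND of a row with the message.

```python
def min_distance(generator):
    """Smallest codeword weight over all non-zero k-bit messages."""
    k = len(generator[0])
    # Pack each row into an integer, first column as the highest bit.
    rows = [int("".join(str(bit) for bit in row), 2) for row in generator]
    best = len(generator)  # a codeword has at most n ones
    for message in range(1, 1 << k):
        # Codeword bit = dot product of row and message, modulo 2.
        weight = sum(bin(row & message).count("1") % 2 for row in rows)
        best = min(best, weight)
    return best


first = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
]
second = [[1, 1], [0, 0], [1, 1]]
print(min_distance(first))   # 3
print(min_distance(second))  # 0
```

With k at most 15 and n at most 30 this is fewer than a million row evaluations per test case, so the exhaustive loop is cheap.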

Form a Matrix From a Large Text File Quickly

Hi, I am struggling to read data from a file quickly enough (my current attempt ran for 4 hours and then crashed). There must be a simpler way.
The text file looks similar to this:
From To
1 5
3 2
2 1
4 3
From this I want to form a matrix that has a 1 at the corresponding [m,n] position.
The current code is:
function [z] = reed (A)
    [m,n]=size(A);
    i=1;
    while (i <= n)
        z(A(1,i),A(2,i))=1;
        i=i+1;
    end
which outputs the following matrix, z:
z =
0 0 0 0 1
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
My actual file has 280,000,000 links to and from, and this code is too slow for a file of this size. Does anybody know a much faster way to do this in MATLAB?
thanks
You can do something along the lines of the following:
>> A = zeros(4,5);
>> B = importdata('testcase.txt');
>> A(sub2ind(size(A),B.data(:,1),B.data(:,2))) = 1;
My test case, 'testcase.txt' contains your sample data:
From To
1 5
3 2
2 1
4 3
The result would be:
>> A
A =
0 0 0 0 1
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
EDIT - 1
After taking a look at your data, it seems that even if you modify this code appropriately, you may not have enough memory to execute it as the matrix A would become too large.
As such, you can use sparse matrices to achieve the same as given below:
>> B = importdata('web-Stanford.txt');
>> A = sparse(B.data(:,1),B.data(:,2),1,max(max(B.data)),max(max(B.data)));
This would be the approach I'd recommend as your A matrix will have a size of [281903,281903] which would usually be too large to handle due to memory constraints. A sparse matrix on the other hand, maintains only those matrix entries which are non-zero, thus saving on a lot of space. In most cases, you can use sparse matrices more-or-less as you use normal matrices.
More information about the sparse command is given here.
EDIT - 2
I'm not sure why it isn't working for you. Here's a screenshot of how I did it in case that helps:
EDIT - 3
It seems that you're getting a double matrix in B while I'm getting a struct. I'm not sure why this is happening; I can only speculate that you deleted the header lines from the input file before you used importdata.
Basically it's just that my B.data is the same as your B. As such, you should be able to use the following instead:
>> A = sparse(B(:,1),B(:,2),1,max(max(B)),max(max(B)));
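To show what "maintains only the non-zero entries" means in practice, here is a small Python sketch of the same idea with no MATLAB involved. The names read_links and to_dense are made up for this example. It assumes each data line holds two whitespace-separated positive integers and that any other line (such as the From/To header) can be skipped.

```python
from collections import defaultdict


def read_links(lines):
    """Store only the non-zero positions: row index -> set of column indices."""
    adjacency = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Skip headers and anything that is not a pair of integers.
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue
        adjacency[int(parts[0])].add(int(parts[1]))
    return adjacency


def to_dense(adjacency, rows, cols):
    """Expand to a full 0/1 matrix; only sensible for small sizes."""
    return [[1 if c in adjacency.get(r, ()) else 0
             for c in range(1, cols + 1)]
            for r in range(1, rows + 1)]


sample = ["From To", "1 5", "3 2", "2 1", "4 3"]
links = read_links(sample)
for row in to_dense(links, 4, 5):
    print(row)
```

The memory used grows with the number of links rather than with rows times columns, which is the same reason the sparse call above fits in memory when the full matrix does not.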

Resources