What should happen with the final exclusive scan value in a stream compaction algorithm?
This is an example to pick out all the 'A' characters.
Sequence A:
Input: A B B A A B B A
Selection: 1 0 0 1 1 0 0 1
Scan: 0 1 1 1 2 3 3 3
0 - A
1 - A
2 - A
3 - A
Sequence B (same except the last value):
Input: A B B A A B B B
Selection: 1 0 0 1 1 0 0 0
Scan: 0 1 1 1 2 3 3 3
0 - A
1 - A
2 - A
3 - B
Clearly the second example gives the wrong final result based on doing a naive loop through the scan values writing into these addresses.
What am I missing here?
Update:
As I understand the scan algorithm, I would do the equivalent of the following:
for (int i = 0; i < scan.length(); i++)
{
    result[scan[i]] = input[i];
}
In parallel this would involve a scatter instruction.
After an A, you are assuming that there will be at least one more A. In other words, you assume that the sequence ends with an A. If it doesn't, you pick the wrong final letter.
You just need to count the As. Don't start with 1. Start with 0. Only increase this count when you find an A.
Or... Update:
Input:     A B B A A B B A
Selection: 1 0 0 1 1 0 0 1
Scan:      0 1 1 1 2 3 3 3 4
                           ^
0 - A                      |
1 - A                 Four elements
2 - A
3 - A
Input:     A B B A A B B B
Selection: 1 0 0 1 1 0 0 0
Scan:      0 1 1 1 2 3 3 3 3
                           ^
0 - A                      |
1 - A                 Three elements
2 - A
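The idea above can be sketched in Python (chosen only for illustration). The function names `exclusive_scan` and `compact` are hypothetical, and guarding the write with the selection flag is an assumption added here so that unselected elements are never scattered. The extra final scan value is the number of selected elements, so it sizes the output.

```python
def exclusive_scan(flags):
    """Exclusive prefix sum of flags, with the grand total appended as a final element."""
    out = [0]
    for f in flags:
        out.append(out[-1] + f)
    return out  # len(flags) + 1 values; out[-1] is the number of selected items


def compact(items, predicate):
    flags = [1 if predicate(x) else 0 for x in items]
    scan = exclusive_scan(flags)
    result = [None] * scan[-1]      # the final scan value gives the output size
    for i, x in enumerate(items):
        if flags[i]:                # assumption: scatter only the selected elements
            result[scan[i]] = x
    return result


print(compact(list("ABBAABBA"), lambda c: c == "A"))  # ['A', 'A', 'A', 'A']
print(compact(list("ABBAABBB"), lambda c: c == "A"))  # ['A', 'A', 'A']
```

In a parallel version the loop body would be the scatter step, executed once per input element.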
I'm looking for a reordering technique to group connected components of an adjacency matrix together.
For example, I've made an illustration with two groups, blue and green. Initially the '1' entries are distributed across the rows and columns of the matrix. By reordering the rows and columns, all the '1's can be located in two contiguous sections of the matrix, revealing the blue and green components more clearly.
I can't remember what this reordering technique is called. I've searched for many combinations of adjacency matrix, clique, sorting, and reordering.
The closest hits I've found are:
symrcm, which moves the elements closer to the diagonal but does not make groups.
"Is there a way to reorder the rows and columns of matrix to create a dense corner, in R?", which focuses on removing completely empty rows and columns.
Please either provide the common name for this technique so that I can google more effectively, or point me in the direction of a Matlab function.
I don't know whether there is a better alternative which should give you direct results, but here is one approach which may serve your purpose.
Your input:
>> A
A =
0 1 1 0 1
1 0 0 1 0
0 1 1 0 1
1 0 0 1 0
0 1 1 0 1
Method 1
Take the first column as the row mask (maskRow) and the first row as the column mask (maskCol).
maskRow marks the rows that have a one in the first column; maskCol marks the columns that have a zero in the first row:
maskRow = A(:,1)==1;
maskCol = A(1,:)~=1;
Rearrange the Rows (according to the Row-mask)
out = [A(maskRow,:);A(~maskRow,:)];
Gives something like this:
out =
1 0 0 1 0
1 0 0 1 0
0 1 1 0 1
0 1 1 0 1
0 1 1 0 1
Rearrange columns (according to the column-mask)
out = [out(:,maskCol),out(:,~maskCol)]
Gives the desired results:
out =
1 1 0 0 0
1 1 0 0 0
0 0 1 1 1
0 0 1 1 1
0 0 1 1 1
As a check that the entries ended up where they are supposed to be (or if you want the corresponding rearranged indices), apply the same reordering to an index matrix ;)
Before Re-arranging:
idx = reshape(1:25,5,[])
idx =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
After re-arranging (same process we did before)
outidx = [idx(maskRow,:);idx(~maskRow,:)];
outidx = [outidx(:,maskCol),outidx(:,~maskCol)]
Output:
outidx =
2 17 7 12 22
4 19 9 14 24
1 16 6 11 21
3 18 8 13 23
5 20 10 15 25
Method 2
For the generic case, where you don't know the matrix beforehand, here is a procedure to find maskRow and maskCol.
Logic used:
Take the first row and use it as the initial column mask (maskCol).
For each row from the 2nd to the last, repeat the following:
Compare the current row with maskCol.
If any entry is one in both the row and maskCol, replace maskCol with the element-wise logical OR of the two.
After the last row, negate maskCol, as the code below does.
Find maskRow with the same process, iterating over the columns instead of the rows (no negation).
Code:
%// If you have a square matrix, you can combine both these loops into a single loop.
%// logical() keeps the masks usable as logical indices even if nothing gets merged.
maskCol = logical(A(1,:));
for ii = 2:size(A,1)
    if any(A(ii,:) & maskCol)
        maskCol = maskCol | A(ii,:);
    end
end
maskCol = ~maskCol;
maskRow = logical(A(:,1));
for ii = 2:size(A,2)
    if any(A(:,ii) & maskRow)
        maskRow = maskRow | A(:,ii);
    end
end
Here is an example to try that:
%// Here I removed some 'ones' from first, last rows and columns.
%// Compare it with the original example.
A = [0 0 1 0 1
0 0 0 1 0
0 1 1 0 0
1 0 0 1 0
0 1 0 0 1];
Then, repeat the procedure you followed before:
out = [A(maskRow,:);A(~maskRow,:)]; %// same code used
out = [out(:,maskCol),out(:,~maskCol)]; %// same code used
Here is the result:
>> out
out =
0 1 0 0 0
1 1 0 0 0
0 0 0 1 1
0 0 1 1 0
0 0 1 0 1
Note: This approach works in most cases but can still fail in some rare ones.
Here is an example:
%// this works well.
A = [0 0 1 0 1 0
1 0 0 1 0 0
0 1 0 0 0 1
1 0 0 1 0 0
0 0 1 0 1 0
0 1 0 0 1 1];
%// This may not
%// Second col, last row changed to zero from one
A = [0 0 1 0 1 0
1 0 0 1 0 0
0 1 0 0 0 1
1 0 0 1 0 0
0 0 1 0 1 0
0 0 0 0 1 1];
Why does it fail?
As we loop through the rows (to find the column mask), a row is merged only if it overlaps the current maskCol. For example, when we reach the 3rd row, none of its ones line up with the first row (the current maskCol), so the only information the 3rd row carries (the one in its 2nd element) is lost.
This case is rare because some other row usually carries the same information. In the first example, the third row does not overlap the 1st row either, but the last row also has a one in the 2nd element and does overlap the mask, so the result is still correct. It is still good to know about this weakness.
Method 3
This one is a brute-force alternative. It can be applied if you think the previous method might fail. Here, we use a while loop to run the previous code (finding the row and column masks) several times with the updated maskCol, so that it finds the correct mask.
Procedure:
maskCol = A(1,:);
count = 1;
while(count<3)
    for ii = 2:size(A,1)
        if sum(A(ii,:) & maskCol)>0
            maskCol = maskCol | A(ii,:);
        end
    end
    count = count+1;
end
The previous example (where the previous method fails) is run with and without the while loop.
Without Brute force:
>> out
out =
1 0 1 0 0 0
1 0 1 0 0 0
0 0 0 1 1 0
0 1 0 0 0 1
0 0 0 1 1 0
0 0 0 0 1 1
With Brute-Forcing while loop:
>> out
out =
1 1 0 0 0 0
1 1 0 0 0 0
0 0 0 1 1 0
0 0 1 0 0 1
0 0 0 1 1 0
0 0 0 0 1 1
The number of iterations required to get the correct result may vary, so it is safer to allow a generous number of passes.
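One way to avoid guessing the number of passes is to repeat them until the mask stops changing. The following Python sketch (Python is used only for illustration, and `grow_mask` and `reorder` are hypothetical names) applies that stopping rule and then reorders rows and columns the same way as the code above. Like the methods above, it assumes exactly two groups.

```python
def grow_mask(rows, seed):
    """Grow a 0/1 mask from seed by OR-ing in every row that overlaps it,
    repeating full passes until a pass changes nothing."""
    mask = list(seed)
    changed = True
    while changed:
        changed = False
        for row in rows:
            overlaps = any(a and b for a, b in zip(row, mask))
            merged = [int(bool(a or b)) for a, b in zip(row, mask)]
            if overlaps and merged != mask:
                mask = merged
                changed = True
    return mask


def reorder(A):
    cols = [list(c) for c in zip(*A)]
    col_mask = grow_mask(A, A[0])        # columns linked to the first row
    row_mask = grow_mask(cols, cols[0])  # rows linked to the first column
    # Same ordering as the MATLAB code: masked rows first, unmasked columns first.
    row_order = [i for i, m in enumerate(row_mask) if m] + \
                [i for i, m in enumerate(row_mask) if not m]
    col_order = [j for j, m in enumerate(col_mask) if not m] + \
                [j for j, m in enumerate(col_mask) if m]
    return [[A[i][j] for j in col_order] for i in row_order]


A = [[0, 0, 1, 0, 1, 0],
     [1, 0, 0, 1, 0, 0],
     [0, 1, 0, 0, 0, 1],
     [1, 0, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 0],
     [0, 0, 0, 0, 1, 1]]
for row in reorder(A):
    print(*row)
```

On the failing example this needs two passes that change the column mask, plus a final pass that confirms nothing else changes.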
Good Luck!
I am trying to write code to detect whether a matrix is a permutation of a Hankel matrix, but I can't think of an efficient solution other than a very slow brute force. Here is the spec.
Input: An n by n matrix M whose entries are 1 or 0.
Input format: Space separated rows. One row per line. For example
0 1 1 1
0 1 0 1
0 1 0 0
1 0 1 1
Output: A permutation of the rows and columns of M so that M is a Hankel matrix if that is possible. A Hankel matrix has constant skew-diagonals (positive sloping diagonals).
When I say a permutation, I mean we can apply one permutation to the order of the rows and a possibly different one to the columns.
I would be very grateful for any ideas.
Without Loss of Generality, we will assume that there are fewer 0's than 1's. We can then find the possible diagonals in a Hankel Matrix that could be 0's to give us the appropriate number of 0's in the entire matrix. And, this will give us the possible Hankel matrices. From there, you can count the number of 0's in each column, and compare it to the number of 0's in the columns of the original matrix. Once you have done this, you have a much smaller space in which to perform a brute force search: permuting on columns and rows that have the right number of 0's.
Example: The OP suggested a 4x4 matrix with 7 0's. We need to partition this using the set {4,3,3,2,2,1,1}. So, our partitions would be:
{4,3}
{4,2,1} (2 of these matrices)
{3,3,1}
{3,2,2}
{3,2,1,1} (2 of these matrices)
And this gives us the Hankel Matrices (excluding symmetries)
1 1 0 0    1 1 1 0    0 1 1 0    1 1 0 1
1 0 0 1    1 1 0 1    1 1 0 1    1 0 1 0
0 0 1 1    1 0 1 0    1 0 1 0    0 1 0 1
0 1 1 1    0 1 0 0    0 1 0 1    1 0 1 0

1 0 0 1    0 1 1 1    0 1 0 1
0 0 1 1    1 1 1 0    1 0 1 1
0 1 1 0    1 1 0 0    0 1 1 0
1 1 0 1    1 0 0 0    1 1 0 0
The original matrix had columns with 3, 1, 2, and 1 0's in its four columns. Comparing this to the 7 possible Hankel matrices gives us 2 possibilities
1 1 1 0    0 1 1 1
1 1 0 1    1 1 1 0
1 0 1 0    1 1 0 0
0 1 0 0    1 0 0 0
Now, there are only 4 possible permutations that could map the original matrix to each of these: we have only 1 choice based on the columns with 2 and 3 0's, but 2 choices for the columns with 1 0's, and also 2 choices for the rows with 1 0's. Checking those permutations, we see that the following Hankel matrix is a permutation of the original
0 1 1 1
1 1 1 0
1 1 0 0
1 0 0 0
The one thing which the first answer to this question got right is that permuting the rows and columns doesn't change the row sums or column sums.
Another easy observation is that in a Hankel matrix, the difference in row sum between two consecutive rows is -1, 0, or 1, and each case gives us a constraint on the rows. If the difference is 0 then the entering variable is equal to the exiting variable; otherwise we know which is 0 and which is 1.
0 1 1 1
0 1 0 1
0 1 0 0
1 0 1 1
has row sums 3, 2, 1, 3. The orders which respect the difference requirement are 1 2 3 3 and 3 3 2 1, and wlog we can discard reversals because reversing the row and column permutations just rotates the matrix by 180 degrees. Therefore we reduce to considering four permuted matrices (two possible orderings of the 3s in the row sums, and two in the column sums):
0 0 1 0    0 0 1 0    0 0 0 1    0 0 0 1
0 0 1 1    0 0 1 1    0 0 1 1    0 0 1 1
0 1 1 1    1 1 0 1    0 1 1 1    1 1 1 0
1 1 0 1    0 1 1 1    1 1 1 0    0 1 1 1
We could actually have taken the analysis further by observing that by forcing the initial rows to have sums 1 and 2 we constrain the order of the columns with sum 3, since
0 0 1 0
0 0 1 1
is not a valid initial two rows of a Hankel matrix. Whether or not this kind of reasoning is easy to implement depends on your programming paradigm.
Note that in the worst case this kind of reasoning still doesn't leave a polynomial number of cases to brute force through.
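Whatever pruning is applied, each surviving candidate still has to be checked for the Hankel property. Here is a minimal Python sketch of that check, together with the unpruned brute-force baseline. The names `is_hankel` and `find_hankel_permutation` are hypothetical, and the search is only practical for small n.

```python
from itertools import permutations


def is_hankel(M):
    """True if every skew-diagonal of the square matrix M is constant."""
    n = len(M)
    return all(M[i][j] == M[i + 1][j - 1]
               for i in range(n - 1) for j in range(1, n))


def find_hankel_permutation(M):
    """Try every row order and every column order; return the first Hankel result, else None."""
    n = len(M)
    for rows in permutations(range(n)):
        for cols in permutations(range(n)):
            P = [[M[r][c] for c in cols] for r in rows]
            if is_hankel(P):
                return P
    return None


M = [[0, 1, 1, 1],
     [0, 1, 0, 1],
     [0, 1, 0, 0],
     [1, 0, 1, 1]]
print(find_hankel_permutation(M))
```

A pruned search would replace the two `permutations` loops with only the orderings that respect the row-sum and column-sum constraints.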
Here are some ideas.
1)
Row and Column permutations preserve the row and column sums:
1 0 1 0 - 2
0 0 0 1 - 1 row sums
1 0 0 0 - 1
1 1 1 0 - 3
| | | |
3 1 2 1
column sums
Whichever way you permute the rows, the row sums will still be {2, 1, 1, 3} in some permutation; the column sums will be unchanged. And vice versa. Hankel matrices and their permutations will always have the same set of row sums as column sums. This gives you a quick test to rule out a set of non-viable matrices.
2)
I posit that Hankel matrices can always be permuted in such a way that their row and column sums are in ascending order, and the result is still a Hankel matrix:
0 1 1 0 - 2        0 0 0 1 - 1
1 1 0 0 - 2        0 0 1 1 - 2
1 0 1 1 - 3  -->   0 1 1 0 - 2
0 0 1 0 - 1        1 1 0 1 - 3
| | | |            | | | |
2 2 3 1            1 2 2 3
Therefore if a matrix can be permuted into a Hankel matrix, then it can also be permuted into a Hankel matrix of ascending row and column sum. That is, we can reduce the number of permutations needed to test by only testing permutations where the row and column sums are in ascending order.
3)
I posit further that for any Hankel matrix where two or more rows have the same sum, every permutation of columns has a matching permutation of rows that also produces a Hankel matrix. That is, if a Hankel matrix exists for one permutation of columns, then it exists for every permutation of columns - since we can simply apply that same permutation to the corresponding rows and achieve a symmetrical result.
The upshot is that we only need to test permutations of rows or columns, not rows and columns.
Applied to the original example:
1 0 1 0 - 2       0 0 0 1       0 1 0 0 - 1       0 0 0 1
0 0 0 1 - 1       1 0 0 0       0 0 0 1 - 1       0 1 0 0
1 0 0 0 - 1  -->  1 0 1 0  -->  0 0 1 1 - 2  -->  0 0 1 1  = Hankel!
1 1 1 0 - 3       1 1 1 0       1 0 1 1 - 3       1 0 1 1
| | | |
3 1 2 1        permute rows into | ditto       | try swapping
               ascending order   | for columns | top 2 rows
4)
I posit, finally, that every Hankel matrix where there are multiple rows and columns with the same sum can be permuted into another Hankel matrix with the property that those rows and columns are in increasing order when read as binary numbers - reading left-to-right for rows and top-to-bottom for columns. That is:
0 1 1 0       0 1 0 1       0 0 1 1
1 0 0 1       0 1 1 0       0 1 0 1    New
1 0 1 0  -->  1 0 0 1  -->  1 0 1 0    Hankel
0 1 0 1       1 0 1 0       1 1 0 0
Original      rows          columns
Hankel        ascending     ascending
If this is true (and I'm still undecided), then we only ever need to create and test one permutation of any given input matrix. That permutation puts both the rows and columns in order of ascending sum, and in the case of equal sums, orders them by their binary number interpretations. If this resultant matrix is not Hankel, then there is no permutation that will make it Hankel.
Hope that gets you on the way to an algorithm!
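The quick test from idea 1 can be sketched in Python as follows. The function name `may_be_hankel_permutation` is hypothetical. The test is only a necessary condition: it relies on a square Hankel matrix being symmetric, so its row sums and column sums coincide, and a matrix that passes may still have no Hankel permutation.

```python
def may_be_hankel_permutation(M):
    """Necessary condition only: the multiset of row sums must equal the multiset of column sums."""
    row_sums = sorted(sum(row) for row in M)
    col_sums = sorted(sum(col) for col in zip(*M))
    return row_sums == col_sums


M = [[1, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0],
     [1, 1, 1, 0]]
print(may_be_hankel_permutation(M))                 # True: both are {1, 1, 2, 3}
print(may_be_hankel_permutation([[1, 1], [0, 0]]))  # False: rows {0, 2}, columns {1, 1}
```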
Addendum: Counterexamples?
Trying @orlp's example:
0 0 1 0       0 0 1 0       0 0 0 1
0 1 0 1       0 1 0 1       0 1 1 0
1 0 1 1  -->  0 1 1 1  -->  0 1 1 1
0 1 1 1       1 0 1 1       1 0 1 1
  (A)           (B)           (C)
A: Original Hankel. Row sums are 1, 2, 3, 3; Rows 3 and 4 are not in binary order.
B: Swap rows 3 and 4. Columns 3 and 4 are not in binary order.
C: Swap columns 3 and 4. Result is Hankel and satisfies all the properties.
Trying @Degustaf's example:
1 1 0 1       0 1 0 0       0 0 1 0
1 0 1 0       1 0 0 1       0 1 0 1
0 1 0 0  -->  1 0 1 0  -->  1 0 0 1
1 0 0 1       1 1 0 1       0 1 1 1
  (A)           (B)           (C)
A: Original Hankel matrix. Row sums are 3, 2, 1, 2.
B: Rearrange so that the row sums are 1, 2, 2, 3, and the rows of sum 2 are in ascending binary order (i.e. 1001, 1010)
C: Rearrange column sums to 1, 2, 2, 3, with the two columns of sum 2 in order (0101, 1001). Result is Hankel and satisfies all the properties. Note also that the permutation on the columns matches the permutation on the rows: the new column order from the old one is {3, 4, 2, 1}, the same operation to get from A to B.
Note: I suggest the binary order (#4) only for tiebreak situations on the row or column sum, not as a replacement for the sort in (#2).
I have this piece of code
((⍳3)∘.+(⍳2))
which generates the following matrix
2 3
3 4
4 5
I want to count the occurrences of each unique element in the result, i.e. how often 2, 3, 4 and 5 occur.
I tried using "∘.=" with the matrix itself and then reshaping such that elements of each sub matrix is transformed into a row
using
6 6⍴ ((⍳3)∘.+(⍳2))∘.=((⍳3)∘.+(⍳2))
which gives the following result
1 0 0 0 0 0 for 2
0 1 1 0 0 0 for 3
0 1 1 0 0 0 for 3
0 0 0 1 1 0 for 4
0 0 0 1 1 0 for 4
0 0 0 0 0 1 for 5
as you can see it still contains the sum for duplicate items, and I'm lost as of now.
Any help will be appreciated.
You should do ∘.= between the unique elements in the matrix and a flat vector of all elements, like:
m ← ((⍳3)∘.+(⍳2))
(∪,m) ∘.= ,m
1 0 0 0 0 0
0 1 1 0 0 0
0 0 0 1 1 0
0 0 0 0 0 1
Then just do +/ on it to get the frequencies of ∪,m
+/ (∪,m) ∘.= ,m
1 2 2 1
∪,m
2 3 4 5
(Tested on GNU APL.)
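For readers less familiar with APL, the same steps can be sketched in Python. This is an illustrative translation, not part of the APL solution, and `frequencies` is a hypothetical name: ravel the matrix, take the unique values in order of first appearance, build the equality table, and sum each row.

```python
def frequencies(matrix):
    flat = [x for row in matrix for x in row]               # ravel, like ,m
    unique = list(dict.fromkeys(flat))                      # unique values, like ∪,m
    table = [[int(u == x) for x in flat] for u in unique]   # equality table, like ∘.=
    counts = [sum(row) for row in table]                    # row sums, like +/
    return unique, counts


m = [[i + j for j in (1, 2)] for i in (1, 2, 3)]  # same values as (⍳3)∘.+(⍳2)
print(frequencies(m))  # ([2, 3, 4, 5], [1, 2, 2, 1])
```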
Dyalog APL version 14.0 has the ⌸ Key operator exactly for this, you just need to ravel your data:
{≢⍵}⌸ ,((⍳3)∘.+(⍳2))
1 2 2 1
You can even use the left argument of ⌸'s operand function to create a table:
{⍺,≢⍵}⌸ ,((⍳3)∘.+(⍳2))
2 1
3 2
4 2
5 1
I need something like the Position function in Mathematica (http://reference.wolfram.com/mathematica/ref/Position.html), but in Q. My solution for a rectangular matrix is the following:
q) colrow:{sz:count x; $[(count first x) = 1; enlist y; (floor y % sz; y mod sz)]}
q) position:{flip colrow[x;(where raze x = y)]}
It works in a straightforward way for rectangular matrices and lists:
q) t:(1 -1 1; / matrix test
-1 3 4;
1 -1 1);
q) pos1:position[t;-1] / try to find all positions of -1
q) pos1
0 1
1 0
2 1
q) t ./: pos1 / here get items
-1 -1 -1
q) l:1 0 3 0 2 3 4 1 0 / list test
q) pos2:position[l;0] / try to find all positions of 0
q) pos2
1
3
8
q) l ./: pos2 / get items
0 0 0
This works, but it would be good to have a more general solution for arbitrary lists, not only rectangular matrices. For instance, the code above won't work correctly for arguments like:
position[(1 2 3; 1 2; 1 2 1 4); 1]
Maybe someone has a generic solution for that?
How's this look? I think it should work for all two-dimensional lists, ragged or rectangular, and also for vectors. (I haven't worked out a version for arbitrary dimensions yet.)
q)position:{{$[type x;enlist each where x;raze flip each flip(til count x;raze each .z.s each x)]}x=y}
q)t
1 -1 1
-1 3 4
1 -1 1
q)l
1 0 3 0 2 3 4 1 0
q)r
1 2 3
1 2
1 2 1 4
q)pos1:position[t;-1]
q)pos2:position[l;0]
q)pos3:position[r;1]
q)pos1
0 1
1 0
2 1
q)pos2
1
3
8
q)pos3
0 0
1 0
2 0
2 2
q)t ./:pos1
-1 -1 -1
q)l ./:pos2
0 0 0
q)r ./:pos3
1 1 1 1
q)
EDIT:
Here's a version that works for all dimensions except 1 (and 0 of course):
q)position2:{{$[type x;where x;raze each raze flip each flip(til count x;.z.s each x)]}x=y}
q)r2:(r;r)
q)0N!r2;
((1 2 3;1 2;1 2 1 4);(1 2 3;1 2;1 2 1 4))
q)pos4:position2[r2;1]
q)0N!pos4;
(0 0 0;0 1 0;0 2 0;0 2 2;1 0 0;1 1 0;1 2 0;1 2 2)
q)r2 ./:pos4
1 1 1 1 1 1 1 1
q)r ./:position2[r;1]
1 1 1 1
q)t ./:position2[t;-1]
-1 -1 -1
q)
On vectors, though, it returns an address vector, not an address matrix, so it has to be used with @, not .:
q)0N!position2[l;0];
1 3 8
q)l ./:position2[l;0]
'type
q)l position2[l;0]
0 0 0
q)
If you really need it to work the same way on vectors as on higher-dimensional structures, the simplest solution is probably just to special-case them directly:
q)position3:{$[type x;enlist each where@;{$[type x;where x;raze each raze flip each flip(til count x;.z.s each x)]}]x=y}
q)position3[l;0]
1
3
8
q)l ./:position3[l;0]
0 0 0
q)r2 ./:position3[r2;1]
1 1 1 1 1 1 1 1
q)r ./:position3[r;1]
1 1 1 1
q)t ./:position3[t;-1]
-1 -1 -1
q)
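The recursive idea behind these q functions (descend into each sublist and prefix the child's index to every address found below it) can also be sketched in Python, purely as an illustration. The name `position` is reused here for a hypothetical Python function. It returns one index path per match and handles vectors, rectangular matrices, and ragged or deeper nestings uniformly.

```python
def position(data, target):
    """Return the index path of every occurrence of target in a (possibly ragged) nested list."""
    paths = []

    def walk(node, path):
        if isinstance(node, list):
            for i, child in enumerate(node):
                walk(child, path + (i,))   # extend the address by this level's index
        elif node == target:
            paths.append(path)

    walk(data, ())
    return paths


r = [[1, 2, 3], [1, 2], [1, 2, 1, 4]]
print(position(r, 1))                            # [(0, 0), (1, 0), (2, 0), (2, 2)]
print(position([1, 0, 3, 0, 2, 3, 4, 1, 0], 0))  # [(1,), (3,), (8,)]
```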
Below should also work.
Not the exact solution but workable.
pos:{$[type x;where x=y;where each x=y]}
val:{raze ($[0h=type x;x@';x@])pos[x;y]}
q)t:(1 -1 1;-1 3 4;1 -1 1)
q)pos[t;-1]
1
0
1
q)val[t;-1]
-1
-1
-1
q)l:1 0 3 0 2 3 4 1 0
q)pos[l;0]
1 3 8
q)val[l;0]
0 0 0
q)r:(1 2 3; 1 2; 1 2 1 4)
q)pos[r;1]
,0
,0
0 2
q)val[r;1]
1 1 1 1
I'm creating a word search and am trying to calculate quality of the generated puzzles by verifying the word set is "distributed evenly" throughout the grid. For example placing each word consecutively, filling them up row-wise is not particularly interesting because there will be clusters and the user will quickly notice a pattern.
How can I measure how 'evenly distributed' the words are?
What I'd like to do is write a program that takes in a word search as input and output a score that evaluates the 'quality' of the puzzle. I'm wondering if anyone has seen a similar problem and could refer me to some resources. Perhaps there is some concept in statistics that might help? Thanks.
The basic problem is the distribution of lines in a square or rectangle. You can either do this geometrically or using integer arrays. I will try the integer arrays here.
Let M be a matrix of your puzzle,
A B C D
E F G H
I J K L
M N O P
Let "EFGH" be a word in the puzzle, as well as "CGKO". Then create a matrix that holds, for each cell, the number of words the cell belongs to:
0 0 1 0
1 1 2 1
0 0 1 0
0 0 1 0
Apply a rule: the new value of a cell is the sum of its 4-way neighbours, multiplied by the cell's original value if that original value is 2 or higher.
0 0 1 0      1 2 2 2
1 1 2 1  -\  1 3 8 2
0 0 1 0  -/  1 2 3 2
0 0 1 0      0 1 1 1
Then sum up all values in the rows and columns of the matrix:
1 2 2 2 = 7
1 3 8 2 = 14
1 2 3 2 = 8
0 1 1 1 = 3
| | | |
3 8 | 7
    14
Then calculate the average of both result sets:
(7 + 14 + 8 + 3) / 4 = 32 / 4 = 8
(3 + 8 + 14 + 7) / 4 = 32 / 4 = 8
And calculate the average difference to the average of each result set:
 3 <-> 8 = 5     7 <-> 8 = 1
 8 <-> 8 = 0    14 <-> 8 = 6
14 <-> 8 = 6     8 <-> 8 = 0
 7 <-> 8 = 1     3 <-> 8 = 5
      ___avg          ___avg
      3               3
And multiply them together:
3 * 3 = 9
You treat this as the distribution score. You might need to tweak it a little to make it work better, but this should calculate distribution scores quite nicely.
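The steps above can be sketched in Python. This is a minimal illustration under the rule exactly as stated; the function name `distribution_score` and the input format (a grid of per-cell word counts) are assumptions made for the sketch.

```python
def distribution_score(counts):
    """Score a grid of per-cell word-membership counts using the neighbour-sum rule."""
    h, w = len(counts), len(counts[0])

    def neighbour_sum(r, c):
        total = 0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-way neighbours
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                total += counts[rr][cc]
        return total

    # Multiply by the original value only when it is 2 or higher.
    spread = [[neighbour_sum(r, c) * (counts[r][c] if counts[r][c] >= 2 else 1)
               for c in range(w)] for r in range(h)]
    row_sums = [sum(row) for row in spread]
    col_sums = [sum(col) for col in zip(*spread)]

    def mean_abs_dev(xs):
        mean = sum(xs) / len(xs)
        return sum(abs(x - mean) for x in xs) / len(xs)

    return mean_abs_dev(row_sums) * mean_abs_dev(col_sums)


grid = [[0, 0, 1, 0],
        [1, 1, 2, 1],
        [0, 0, 1, 0],
        [0, 0, 1, 0]]
print(distribution_score(grid))
```

Building `grid` from the puzzle and its word placements is left out; each cell simply counts how many words pass through it.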
Here is an example of a bad distribution:
1 0 0 0      1 1 0 0 = 2
1 0 0 0  -\  2 1 0 0 = 3
1 0 0 0  -/  2 1 0 0 = 3
1 0 0 0      1 1 0 0 = 2
             | | | |
             6 4 0 0
Row sums average: 2.5, column sums average: 2.5
Average difference to the average: 0.5 for the row sums, 2.5 for the column sums
0.5 * 2.5 = 1.25 < score
Edit: calc. errors fixed.