Why 3 and 1 evaluates to 1 and 4 and 1 to 0 in nim-lang - logic

Hi, I'm new to Nim and don't understand how these evaluations work:
3 and 1 # Outputs 1
and
4 and 1 # Outputs 0
What logic is going on here?
Thanks

I think the point is that and on integers is a bitwise AND. In binary: 3 = 0b11 and 1 = 0b01, so 3 and 1 = 0b01 = 1, while 4 = 0b100 and 1 = 0b001, so 4 and 1 = 0b000 = 0.
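For illustration, here is the same bitwise AND written out in Python, which spells the integer operator & where Nim overloads and (just a sketch for comparison):

# bitwise AND: an output bit is 1 only where both input bits are 1
for a in (3, 4):
    print(f"{a:03b} & 001 = {a & 1:03b}  (decimal {a & 1})")
# 011 & 001 = 001  (decimal 1)
# 100 & 001 = 000  (decimal 0)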

Related

Pattern finder algorithm

I would like to write an algorithm that is capable of detecting any pattern and determining the next element. Some examples would be 11 22 11 22 1 where the next element is 1.
But it could also be 3 1 1 1 1 3 3 3 1 1 1 3 3 3 1 1 1 1 (1x3 , 4x1, 3x3, 3x1, 3x3, 4x1) where the next element would be 3.
So it always goes like: 3 1 1 1 1 3 3 3 1 1 1 3 3 3 1 1 1 1 3 1 1 1 1 3 3 3 1 1 1 3 3 3 1 1 1 1
We do not know how long the pattern is but we know that it appears at least once in the array. We also know that there is only one pattern and 100% of it is inside the array.
It might be useful to imagine the numbers as colors, and the array as a wall with a repeating wallpaper on it. Our task is then to determine what color the wall would be if it continued.
How should I get started?
You could search for the longest suffix that you can also find somewhere else in the sequence. Then simply add the character that follows the suffix at that other location.
Example
3 1 1 1 1 3 3 3 1 1 1 3 3 3 1 1 1 1
We can find the following suffixes at other places in the sequence (other than at the end):
1 (found at a lot of positions)
1 1 (found at a lot of positions)
1 1 1 (found at a lot of positions)
1 1 1 1 (found at position 1)
3 1 1 1 1 (found at position 0)
3 3 1 1 1 1 (does not appear anywhere in the sequence except at the end)
So the longest such suffix is 3 1 1 1 1, and at its other occurrence it is followed by 3.
3 1 1 1 1 3 3 3 1 1 1 3 3 3 1 1 1 1
xxxxxxxxx ^
So the estimated next character is 3.
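A minimal Python sketch of this suffix-matching rule (my own illustration, assuming the sequence is given as a list):

def predict_next(seq):
    # find the longest suffix that also occurs earlier in the sequence,
    # then return the element that follows that earlier occurrence
    n = len(seq)
    for length in range(n - 1, 0, -1):            # try the longest suffix first
        suffix = seq[n - length:]
        for start in range(n - length):           # earlier occurrences only
            if seq[start:start + length] == suffix:
                return seq[start + length]        # element right after the match
    return None                                   # no repeated suffix found

print(predict_next([3,1,1,1,1,3,3,3,1,1,1,3,3,3,1,1,1,1]))  # -> 3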
As long as the pattern always repeats throughout the array, you can simply test candidate patterns of ascending length, taken from the start of the array, against the rest of the array, moving to the next length upon a failure, until the whole array has been checked, and return the shortest successful pattern.
Something like the following (untested) JavaScript code:
var findShortestRepeatingPattern = function(array){
    var pattern = [];
    for(var patternLength = 1; patternLength <= array.length; patternLength++){
        pattern = [];
        var arrayCursor = 0;
        //Build the candidate pattern from the start of the array
        while(arrayCursor < patternLength){
            pattern.push(array[arrayCursor++]);
        }
        //Test the pattern against the remainder of the array
        var patternMatches = true;
        while(arrayCursor < array.length){
            if(pattern[arrayCursor % patternLength] != array[arrayCursor++]){
                patternMatches = false;
                break;
            }
        }
        //Exit if the pattern matches
        if(patternMatches){
            break;
        }
    }
    return pattern;
};
Then the next expected value is the pattern element at index array.length % pattern.length, i.e. the insertion index modulo the pattern length.

Understanding Spark MLlib LDA input format

I am trying to implement LDA using Spark MLlib.
But I am having difficulty understanding the input format. I was able to run the sample implementation, which takes its input from a file containing only numbers, as shown:
1 2 6 0 2 3 1 1 0 0 3
1 3 0 1 3 0 0 2 0 0 1
1 4 1 0 0 4 9 0 1 2 0
2 1 0 3 0 0 5 0 2 3 9
3 1 1 9 3 0 2 0 0 1 3
4 2 0 3 4 5 1 1 1 4 0
2 1 0 3 0 0 5 0 2 2 9
1 1 1 9 2 1 2 0 0 1 3
4 4 0 3 4 2 1 3 0 0 0
2 8 2 0 3 0 2 0 2 7 2
1 1 1 9 0 2 2 0 0 3 3
4 1 0 0 4 5 1 3 0 1 0
I followed
http://spark.apache.org/docs/latest/mllib-clustering.html#latent-dirichlet-allocation-lda
I understand the output format of this as explained here.
My use case is very simple, I have one data file with some sentences.
I want to convert this file into a corpus so that I can pass it to org.apache.spark.mllib.clustering.LDA.run().
My doubt is about what the numbers in the input represent, which are then zipWithIndex'd and passed to LDA. Does the number 1 appearing everywhere represent the same word, or is it some kind of count?
In that sample file each line is one document, and each number is the count of the corresponding vocabulary word in that document. If you start from raw sentences instead, you first need to convert them into vectors.
import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

val documents: RDD[Seq[String]] = sc.textFile("yourfile").map(_.split(" ").toSeq)
val hashingTF = new HashingTF()
val tf: RDD[Vector] = hashingTF.transform(documents)
val idf = new IDF().fit(tf)
val tfidf: RDD[Vector] = idf.transform(tf)
// pair each document vector with a document id, since LDA expects (id, vector) tuples
val corpus = tfidf.zipWithIndex.map(_.swap).cache()
// Cluster the documents into three topics using LDA
val ldaModel = new LDA().setK(3).run(corpus)
Read more about TF-IDF vectorization here.
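For comparison, the counts in the linked sample file can be fed to LDA directly; here is a PySpark sketch along the lines of the Spark docs example (the file path and k are illustrative):

from pyspark import SparkContext
from pyspark.mllib.linalg import Vectors
from pyspark.mllib.clustering import LDA

sc = SparkContext(appName="lda-input-demo")

# each line is one document; the i-th number is the count of vocabulary word i
data = sc.textFile("sample_lda_data.txt")
parsed = data.map(lambda line: Vectors.dense([float(x) for x in line.strip().split(' ')]))
# LDA expects an RDD of (document id, word-count vector) pairs
corpus = parsed.zipWithIndex().map(lambda x: [x[1], x[0]]).cache()
ldaModel = LDA.train(corpus, k=3)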

Java algorithm to generate all the possible permutations of a matrix

I need a Java algorithm to generate all the possible permutations of a given matrix.
For example,
    1 2
A = 3 4
The algorithm should return:
    1 2       1 2       2 1       2 1       3 4       4 3       3 4       4 3
A = 3 4   B = 4 3   C = 3 4   D = 4 3   E = 1 2   F = 1 2   G = 2 1   H = 2 1
Any idea?
Thank you
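Reading the example, the eight matrices appear to be every combination of reordering the rows and reordering the elements within each row. Here is a Python sketch under that interpretation (the function name is my own, and the question asked for Java, so treat this only as an outline of the idea):

from itertools import permutations, product

def matrix_arrangements(matrix):
    # every ordering of the rows, combined with every ordering of the
    # elements inside each row (2! * 2! * 2! = 8 results for a 2x2 matrix)
    results = []
    for rows in permutations(matrix):
        for shuffled in product(*(permutations(row) for row in rows)):
            results.append([list(row) for row in shuffled])
    return results

for m in matrix_arrangements([[1, 2], [3, 4]]):
    print(m)   # prints the 8 arrangements A..H from the example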

Adding zeros between every 2 elements of a matrix in matlab/octave

I am interested in how I can add rows and columns of zeros to a matrix so that it looks like this:
            1 0 2 0 3
1 2 3       0 0 0 0 0
2 3 4  =>   2 0 3 0 4
5 4 3       0 0 0 0 0
            5 0 4 0 3
Actually, I am interested in how I can do this efficiently, because walking the matrix and adding the zeros takes a lot of time if you work with a big matrix.
Update:
Thank you very much.
Now I'm trying to replace the zeroes with the sum of their neighbors:
            1 0 2 0 3        1 3 2 5  3
1 2 3       0 0 0 0 0        3 8 5 12 ... and so on
2 3 4  =>   2 0 3 0 4   =>
5 4 3       0 0 0 0 0
            5 0 4 0 3
As you can see, I'm considering all 8 neighbours of an element, but again using for loops and walking the matrix slows me down quite a bit. Is there a faster way?
Let your little matrix be called m1. Then:
m2 = zeros(5)
m2(1:2:end,1:2:end) = m1(:,:)
Obviously this is hard-wired to your example, I'll leave it to you to generalise.
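For reference, a generalised version of the same indexing idea, sketched here in Python/NumPy rather than Octave (purely for illustration):

import numpy as np

def interleave_zeros(m1):
    # place the original values on the even indices of a zero matrix
    rows, cols = m1.shape
    m2 = np.zeros((2 * rows - 1, 2 * cols - 1), dtype=m1.dtype)
    m2[::2, ::2] = m1
    return m2

print(interleave_zeros(np.array([[1, 2, 3], [2, 3, 4], [5, 4, 3]])))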
Here are two ways to do part 2 of the question. The first does the shifts explicitly, and the second uses conv2. The second way should be faster.
M=[1 2 3; 2 3 4 ; 5 4 3];
% this matrix (M expanded) has zeros inserted, but also an extra row and column of zeros
Mex = kron(M,[1 0 ; 0 0 ]);
% The sum matrix is built from shifts of the original matrix
Msum = Mex + circshift(Mex,[1 0]) + ...
circshift(Mex,[-1 0]) +...
circshift(Mex,[0 -1]) + ...
circshift(Mex,[0 1]) + ...
circshift(Mex,[1 1]) + ...
circshift(Mex,[-1 1]) + ...
circshift(Mex,[1 -1]) + ...
circshift(Mex,[-1 -1]);
% trim the extra row and column
Msum = Msum(1:end-1,1:end-1)
% another version, a bit more fancy:
MexTrimmed = Mex(1:end-1,1:end-1);
MsumV2 = conv2(MexTrimmed,ones(3),'same')
Output:
Msum =
1 3 2 5 3
3 8 5 12 7
2 5 3 7 4
7 14 7 14 7
5 9 4 7 3
MsumV2 =
1 3 2 5 3
3 8 5 12 7
2 5 3 7 4
7 14 7 14 7
5 9 4 7 3
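The same neighbour-sum can also be written as a 2-D convolution in Python/SciPy; a sketch for comparison, not part of the original answer:

import numpy as np
from scipy.signal import convolve2d

M = np.array([[1, 2, 3], [2, 3, 4], [5, 4, 3]])
# insert zeros and trim the extra row and column, as with kron above
Mex = np.kron(M, np.array([[1, 0], [0, 0]]))[:-1, :-1]
# a 3x3 kernel of ones sums each cell plus its 8 neighbours; the inserted
# cells are zero, so for them this is exactly the sum of their neighbours
Msum = convolve2d(Mex, np.ones((3, 3)), mode='same')
print(Msum)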

Evaluating the distribution of words in a grid

I'm creating a word search and am trying to calculate the quality of the generated puzzles by verifying that the word set is "distributed evenly" throughout the grid. For example, placing each word consecutively, filling the grid row-wise, is not particularly interesting because there will be clusters and the user will quickly notice a pattern.
How can I measure how 'evenly distributed' the words are?
What I'd like to do is write a program that takes a word search as input and outputs a score that evaluates the 'quality' of the puzzle. I'm wondering if anyone has seen a similar problem and could refer me to some resources. Perhaps there is some concept in statistics that might help? Thanks.
The basic problem is the distribution of lines in a square or rectangle. You can either do this geometrically or using integer arrays. I will try the integer arrays here.
Let M be a matrix of your puzzle,
A B C D
E F G H
I J K L
M N O P
Let the word "EFGH" be an existing word, as well as "CGKO". Then create a matrix which contains, in each cell, the count of words that cell belongs to:
0 0 1 0
1 1 2 1
0 0 1 0
0 0 1 0
Apply a rule: the new cell value is the sum of all its (4-way) neighbours, multiplied by the cell's original value if that original value is 2 or higher.
0 0 1 0        1 2 2 2
1 1 2 1   -\   1 3 8 2
0 0 1 0   -/   1 2 3 2
0 0 1 0        0 1 1 1
And sum up all values in the rows and the columns of the matrix:
1 2 2 2 = 7
1 3 8 2 = 14
1 2 3 2 = 8
0 1 1 1 = 3
| | |  |
3 8 14 7
Then calculate the average of both result sets:
(7 + 14 + 8 + 3) / 4 = 32 / 4 = 8
(3 + 8 + 14 + 7) / 4 = 32 / 4 = 8
And calculate the average deviation from that average within each result set:
3  <-> 8 = 5       7  <-> 8 = 1
8  <-> 8 = 0       14 <-> 8 = 6
14 <-> 8 = 6       8  <-> 8 = 0
7  <-> 8 = 1       3  <-> 8 = 5
        ___avg             ___avg
             3                  3
And multiply them together:
3 * 3 = 9
Which you treat as a distribution score. You might need to tweak it a little to make it work better, but this should calculate distribution scores quite nicely.
Here is an example of a bad distribution:
1 0 0 0        1 1 0 0        2
1 0 0 0   -\   2 1 0 0   -\   3   -\   C avg 2.5   -\   C avg-2-avg 0.5
1 0 0 0   -/   2 1 0 0   -/   3   -/   R avg 2.5   -/   R avg-2-avg 2.5
1 0 0 0        1 1 0 0        2                              _____ *
                              6 4 0 0                         1.25  < score
Edit: calc. errors fixed.
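A compact Python/NumPy sketch of the whole scoring procedure described above (the function name and the explicit zero padding at the border are my own; the rule and the two example grids are the ones from the answer):

import numpy as np

def distribution_score(member_counts):
    m = np.asarray(member_counts, dtype=float)
    # pad with zeros so every cell has four in-grid neighbours
    padded = np.pad(m, 1)
    neighbour_sum = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                     padded[1:-1, :-2] + padded[1:-1, 2:])
    # multiply by the original value where that value is 2 or higher
    transformed = np.where(m >= 2, neighbour_sum * m, neighbour_sum)
    row_sums, col_sums = transformed.sum(axis=1), transformed.sum(axis=0)
    # average absolute deviation of the row sums and of the column sums
    row_dev = np.abs(row_sums - row_sums.mean()).mean()
    col_dev = np.abs(col_sums - col_sums.mean()).mean()
    return row_dev * col_dev

good = [[0, 0, 1, 0], [1, 1, 2, 1], [0, 0, 1, 0], [0, 0, 1, 0]]
bad  = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(distribution_score(good))  # 9.0   (the example grid above)
print(distribution_score(bad))   # 1.25  (the "bad distribution" example)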

Resources