Hashing binary numbers with don't care bits - data-structures

How can I find out if a binary number is contained in a set, where it is possible that an element of the set has don’t care bits?
I thought about using a hash table, but then every number with don’t care bits has to be duplicated in the table to cover all the possibilities.
For example:
The set of numbers is:
0 00x1
1 10xx
2 110x
3 1010
4 11x1
5 0010
and the number is 0011, the result should be 0.

If the number of binary digits is limited, you can expand the don't care bits into every concrete value, convert the resulting binary numbers to integers, and use those integers as the keys of a map, with the set index as the value.
Example
0 00x1
1 10xx
can be converted to
0001 0
0011 0
1000 1
1001 1
1010 1
1011 1
and saved as
i j
1 0
3 0
8 1
9 1
10 1
11 1
where i is the key and j is the value
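The expansion above can be sketched in Python roughly as follows. The helper names expand and build_table are made up for this illustration, and keeping the first index when two patterns cover the same number (for example 10xx and 1010) is an assumption, since the question does not say which one should win.

```python
from itertools import product

def expand(pattern):
    """Yield every integer matched by a pattern such as '10xx'."""
    # Each don't care position may be 0 or 1; fixed positions keep their digit.
    slots = ["01" if ch == "x" else ch for ch in pattern]
    for bits in product(*slots):
        yield int("".join(bits), 2)

def build_table(patterns):
    """Map each concrete number to the index of the first pattern covering it."""
    table = {}
    for index, pattern in enumerate(patterns):
        for value in expand(pattern):
            table.setdefault(value, index)  # assumption: first pattern wins
    return table

patterns = ["00x1", "10xx", "110x", "1010", "11x1", "0010"]
table = build_table(patterns)
print(table.get(0b0011))  # 0
print(table.get(0b0111))  # None (no pattern covers 0111)
```

A lookup is then a single dictionary access, at the cost of one entry per concrete value.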

Suppose you have the pattern 1xxx: it matches 8 numbers, and every additional don't care bit doubles that count. So do not duplicate an entry for each possibility.
You have to keep the "do not care" bits somewhere. Use a second number for this, with the "do not care" bits set to 1. If we go over your example:
i x y
0 00x1 0010
1 10xx 0011
2 110x 0001
3 1010 0000
4 11x1 0010
5 0010 0000
You also need to decide what to store in x at the don't care positions, 0 or 1. Either works: since the second number records which bits to ignore, the choice does not matter.
Now use bitwise operations:
if ((n ^ x[i]) | y[i]) == y[i] then match
This solution checks whether any bit other than a do-not-care bit fails to match. (n xor x[i]) gives the non-matching bits; or'ing that with y[i] must leave y[i] unchanged.
If we go over your example, and assuming you choose 0 for x, the check becomes
i:0 -->> ((0011 ^ 0001) | 0010) == 0010 -->> match!
i:1 -->> ((0011 ^ 1000) | 0011) != 0011 -->> no match!
i:2 -->> ((0011 ^ 1100) | 0001) != 0001 -->> no match!
i:3 -->> ((0011 ^ 1010) | 0000) != 0000 -->> no match!
i:4 -->> ((0011 ^ 1101) | 0010) != 0010 -->> no match!
i:5 -->> ((0011 ^ 0010) | 0000) != 0000 -->> no match!
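As a rough Python sketch of the same check: parse and find_match are hypothetical helper names, and the don't care positions of the value are filled with 0, as in the walk-through above.

```python
def parse(pattern):
    """Split a pattern like '11x1' into (value, mask); mask has 1s at don't care bits."""
    value = int(pattern.replace("x", "0"), 2)
    mask = int("".join("1" if ch == "x" else "0" for ch in pattern), 2)
    return value, mask

def find_match(n, entries):
    """Return the index of the first entry matching n, or None."""
    for i, (value, mask) in enumerate(entries):
        # n ^ value marks the differing bits; they must all be don't care bits.
        if ((n ^ value) | mask) == mask:
            return i
    return None

entries = [parse(p) for p in ["00x1", "10xx", "110x", "1010", "11x1", "0010"]]
print(find_match(0b0011, entries))  # 0
print(find_match(0b1111, entries))  # 4
print(find_match(0b0111, entries))  # None
```

This still scans the set linearly, but each entry costs only two integers and each comparison a few bitwise operations.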

Related

Why does ^1 equal -2?

fmt.Println(^1)
Why does this print -2?
The ^ operator is the bitwise complement operator. Spec: Arithmetic operators:
For integer operands, the unary operators +, -, and ^ are defined as follows:
+x is 0 + x
-x negation is 0 - x
^x bitwise complement is m ^ x with m = "all bits set to 1" for unsigned x
and m = -1 for signed x
So 1 in binary is a single 1 bit preceded by zeros:
0000000000000000000000000000000000000000000000000000000000000001
So the bitwise complement is a single 0 bit preceded by ones:
1111111111111111111111111111111111111111111111111111111111111110
The ^1 is an untyped constant expression. When it is passed to a function, it has to be converted to a type. Since 1 is an untyped integer constant, its default type int will be used. Go represents int in two's complement, where negative numbers start with a 1 bit. The pattern of all ones is -1, the pattern one less than that (in binary) is -2, and so on.
The bit pattern above is the 2's complement representation of -2.
To print the bit patterns and type, use this code:
fmt.Println(^1)
fmt.Printf("%T\n", ^1)
fmt.Printf("%064b\n", 1)
i := ^1
fmt.Printf("%064b\n", uint(i))
It outputs (try it on the Go Playground):
-2
int
0000000000000000000000000000000000000000000000000000000000000001
1111111111111111111111111111111111111111111111111111111111111110
Okay, this has to do with the way signed numbers are represented in computation.
For a 4-bit number, you get:
Decimal  Binary
-8       1000
-7       1001
-6       1010
-5       1011
-4       1100
-3       1101
-2       1110
-1       1111
0        0000
1        0001
2        0010
3        0011
4        0100
5        0101
6        0110
7        0111
You can see here that 1 is 0001 (nothing changes), but -1 is 1111. The unary ^ operator XORs its operand with -1, that is, with all ones. Therefore:
0001
1111 xor
-------
1110 -> That is actually -2.
All this is because of the two's complement convention that we use to do calculations with negative numbers. Of course, this extends to longer binary numbers.
You can test this by doing a bitwise XOR in Windows Calculator.
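The same check can be scripted. The sketch below uses Python purely for illustration (there the complement operator is written ~ rather than ^); BITS, ALL_ONES and to_signed are names invented for this example, with the 4-bit width taken from the table above.

```python
BITS = 4
ALL_ONES = (1 << BITS) - 1  # 0b1111, the bit pattern of -1 in 4 bits

def to_signed(pattern, bits=BITS):
    """Interpret an unsigned bit pattern as a two's complement value."""
    if pattern & (1 << (bits - 1)):      # top bit set means negative
        return pattern - (1 << bits)
    return pattern

flipped = 0b0001 ^ ALL_ONES              # complement of 1 within 4 bits
print(format(flipped, "04b"), to_signed(flipped))  # 1110 -2

# Python integers are not fixed-width, but XOR with -1 behaves the same way.
print(~1, -1 ^ 1)  # -2 -2
```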

How to bitmask a number (in hex) using the AND operator?

I know that you can bitmask by ANDing a value with 0. However, how can I mask certain nibbles while keeping others? In other words, if I have 0x000f0b7c and I want to mask everything but the b (so that my result is 0x00000b00), how would I use AND to do this? Would it require multiple steps?
You can better understand boolean operations if you represent values in binary form.
The AND operation between two binary digits returns 1 if both the binary digits have a value of 1, otherwise it returns 0.
Suppose you have two binary digits a and b, you can build the following "truth table":
a | b | a AND b
---+---+---------
0 | 0 | 0
1 | 0 | 0
0 | 1 | 0
1 | 1 | 1
The masking operation consists of ANDing a given value with a "mask" where every bit that needs to be preserved is set to 1, while every bit to discard is set to 0.
This is done by ANDing each bit of the given value with the corresponding bit of the mask.
The given value, 0xf0b7c, can be converted as follows:
f 0 b 7 c (hex)
1111 0000 1011 0111 1100 (bin)
If you want to preserve only the bits corresponding to the "b" value (bits 8..11) you can mask it this way:
f 0 b 7 c
1111 0000 1011 0111 1100
0000 0000 1111 0000 0000
The value 0000 0000 1111 0000 0000 can be converted to hex and has a value of 0xf00.
So if you calculate "0xf0b7c AND 0xf00" you obtain 0xb00.
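In code this is a single AND, so no multiple steps are needed. A minimal Python sketch, where nibble_mask is a hypothetical helper added only to show how such a mask can be built:

```python
value = 0x000F0B7C
mask = 0x00000F00  # 1s only in bits 8..11, the nibble holding "b"

result = value & mask
print(hex(result))  # 0xb00

def nibble_mask(index):
    """Mask selecting nibble `index`, counting from 0 at the least significant end."""
    return 0xF << (4 * index)

print(hex(value & nibble_mask(2)))         # 0xb00
print(hex((value & nibble_mask(2)) >> 8))  # 0xb, the nibble shifted down
```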

Understanding XOR logical operator

I don't understand this
2.0.0p247 :616 > 5 ^ 2
=> 7
2.0.0p247 :617 > 5 ^ 1
=> 4
What do 7 and 4 mean in those cases?
I tried reading http://en.wikipedia.org/wiki/Exclusive_disjunction but cannot figure out from the diagrams what is being subtracted here. Sorry if this is a simple math question.
It has to do with the binary representation of the values.
5 = 0101
2 = 0010
1 = 0001
Now the XOR works like this:
0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0
so to compute 5 ^ 2, let's apply the ^ operation to each column:
0101 (this is 5)
0010 (this is 2)
----
0111 ==> which is the binary representation of 7
How did this work? In the leftmost column, we computed 0^0=0. In the second column, 1^0=1. In the third column 0^1=1, and so on.
and 5 ^ 1
0101 (this is 5)
0001 (this is 1)
----
0100 ==> which is the binary representation of 4
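The column-by-column procedure can be reproduced with a short sketch. It is written in Python rather than Ruby, only for illustration; the ^ operator gives the same results for these values, and xor_columns is a made-up helper name.

```python
def xor_columns(a, b, width=4):
    """XOR two numbers one bit column at a time, left to right."""
    a_bits = format(a, f"0{width}b")
    b_bits = format(b, f"0{width}b")
    # A column yields 1 exactly when the two bits differ.
    return "".join("1" if x != y else "0" for x, y in zip(a_bits, b_bits))

print(xor_columns(5, 2), int(xor_columns(5, 2), 2))  # 0111 7
print(xor_columns(5, 1), int(xor_columns(5, 1), 2))  # 0100 4
print(5 ^ 2, 5 ^ 1)                                  # 7 4
```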

decoding HID data

I am using an rs232 HID reader.
Its manual says that its output is
CCDDDDDDDDDDXX
where CC is reserved for HID
DDDDDDDDDD is the transponder (the card) data
XX is a checksum
The checksum is well explained and irrelevant here. About DDDDDDDDDD, the manual only says that valid values are 0000000000 to 1FFFFFFFFF, with no indication of how that converts to what is printed on the front face of the card.
I have 3 sample cards, sadly from a short range (edit: plus an extra one). Here they are:
read from rs232    shown on card
00000602031C27     00398
00000602031F2A     00399
0000060203202B     00400
00000601B535F1     55962 (new)
I also have a DB with 1000 cards loaded (what is printed on the front), so I need the decoding path from what I read on rs232 to what is printed on the front.
Some values from the DB (I have seen the cards, but I have no physical access to them now):
55503
60237
00833
Thanks a lot to every one.
Googling for the string "CCDDDDDDDDDDXX" returns http://www.rfideas.com/downloads/SerialAppNote8.pdf which seems to describe how to decode the numbers. I can't guarantee that it is accurate.
Decoding the Standard 26-bit Format
Message sent by the reader:
C C D D D D D D D D D D X X
---------------------------
0 0 0 0 0 6 0 2 0 3 1 C 2 7
0 0 0 0 0 6 0 2 0 3 1 F 2 A
0 0 0 0 0 6 0 2 0 3 2 0 2 B
0 0 0 0 0 6 0 1 B 5 3 5 F 1
Stripping off the checksum, X, and reducing the data to binary gives:
C C D D D D D D D D D D
cccc cccc zzzz zzzz zzzz zspf ffff fffn nnnn nnnn nnnn nnnp
-----------------------------------------------------------
0000 0000 0000 0000 0000 0110 0000 0010 0000 0011 0001 1100
0000 0000 0000 0000 0000 0110 0000 0010 0000 0011 0001 1111
0000 0000 0000 0000 0000 0110 0000 0010 0000 0011 0010 0000
0000 0000 0000 0000 0000 0110 0000 0001 1011 0101 0011 0101
All the Card Data Characters to the left of the 7th can be ignored.
c = HID Specific Code.
z = leading zeros
s = start sentinel (it is always a 1)
p = parity odd and even (12 bits each).
f = Facility Code 8 bits
n = Card Number 16 bits
From this we can see that
00000602031C27 → n = 0b0000000110001110 = 398
00000602031F2A → n = 0b0000000110001111 = 399
0000060203202B → n = 0b0000000110010000 = 400
00000601B535F1 → n = 0b1101101010011010 = 55962
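So, for your example value from the DB, assuming facility code 1 as in your first three samples, we would probably get:
55503
(f, n) = 0b0000_0001__1101_1000_1100_1111
leading parity bit (even parity over the first 12 bits) = 0
trailing parity bit (odd parity over the last 12 bits) = 0
result = 00000403b19e56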
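The bit layout above can be turned into a small decoder. This is only a sketch under the assumption that every card uses the 26-bit layout shown here; decode_26bit is a hypothetical name, and the checksum digits are dropped without being verified.

```python
def decode_26bit(message):
    """Decode a 14-hex-digit reader message laid out as CCDDDDDDDDDDXX."""
    bits = int(message[:12], 16)        # drop the two checksum digits XX
    card_number = (bits >> 1) & 0xFFFF  # 16 bits just above the trailing parity bit
    facility = (bits >> 17) & 0xFF      # 8 bits just above the card number
    return facility, card_number

samples = ["00000602031C27", "00000602031F2A", "0000060203202B", "00000601B535F1"]
for message in samples:
    print(message, decode_26bit(message))
# 00000602031C27 (1, 398)
# 00000602031F2A (1, 399)
# 0000060203202B (1, 400)
# 00000601B535F1 (0, 55962)
```

Note that the fourth sample decodes to facility code 0, unlike the first three.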

Length values from deflate algorithm

I compressed the text "TestingTesting" and the hex result was: 0B 49 2D 2E C9 CC 4B 0F 81 50 00. I can't figure out the length and distance codes. The binary below is reversed because the RFC says to read the bits from right to left (thanks Matthew Slattery for the help). Here is what was parsed so far:
1 BFINAL (last block)
01 BTYPE (static)
1000 0100 132-48= 84 T
1001 0101 149-48= 101 e
1010 0011 163-48= 115 s
1010 0100 164-48= 116 t
1001 1001 153-48= 105 i
1001 1110 158-48= 110 n
1001 0111 151-48= 103 g
These are the remaining bits that I don't know how to parse:
1000 0100 0000 1000 0101 0000 0000 0
The final 10 bits (the end-of-block value is 0x100) are the only part I can parse. I think the length and distance values should both be 7 (binary 0111), since "Testing" is 7 letters long and it gets copied 7 characters after the current position, but I can't figure out how that is represented in the remaining bits. What am I doing wrong?
The distance code is 5, but a distance code of 5 is followed by one "extra bit" to indicate an actual distance of either 7 or 8. (See the second table in paragraph 3.2.5 of the RFC.)
The complete decoding of the data is:
1 BFINAL
01 BTYPE=static
10000100 'T'
10010101 'e'
10100011 's'
10100100 't'
10011001 'i'
10011110 'n'
10010111 'g'
10000100 another 'T'
0000100 literal/length code 260 = length 6
00101 distance code 5
0 extra bit => the distance is 7
0000000 literal/length code 256 = end of block
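To check the decoding above mechanically, here is a deliberately incomplete Python sketch of a fixed-Huffman decoder. BitReader, decode_symbol and inflate_fixed_subset are names invented for this example; only length codes 257..264 and distance codes 0..5 are handled (enough for this stream), and anything else raises an error. The result is compared with zlib's own raw-deflate decoder.

```python
import zlib

DATA = bytes.fromhex("0B492D2EC9CC4B0F815000")

class BitReader:
    """Reads bits starting at the least significant bit of each byte."""
    def __init__(self, data):
        self.data, self.pos = data, 0

    def bit(self):
        b = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
        self.pos += 1
        return b

    def lsb_first(self, n):
        """Header fields and extra bits: first bit read is the least significant."""
        return sum(self.bit() << i for i in range(n))

    def msb_first(self, n):
        """Huffman codes: first bit read is the most significant."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bit()
        return value

def decode_symbol(r):
    """Decode one literal/length symbol using the fixed Huffman code lengths."""
    code = r.msb_first(7)
    if code <= 0b0010111:                    # 7-bit codes: symbols 256..279
        return 256 + code
    code = (code << 1) | r.bit()
    if 0b00110000 <= code <= 0b10111111:     # 8-bit codes: literals 0..143
        return code - 0b00110000
    if 0b11000000 <= code <= 0b11000111:     # 8-bit codes: symbols 280..287
        return 280 + code - 0b11000000
    code = (code << 1) | r.bit()             # 9-bit codes: literals 144..255
    return 144 + code - 0b110010000

def inflate_fixed_subset(data):
    r = BitReader(data)
    r.lsb_first(1)                           # BFINAL, ignored in this sketch
    if r.lsb_first(2) != 1:
        raise ValueError("only BTYPE=01 (fixed Huffman) is handled")
    out = bytearray()
    while True:
        sym = decode_symbol(r)
        if sym < 256:
            out.append(sym)
        elif sym == 256:                     # end of block
            return bytes(out)
        elif sym <= 264:                     # lengths 3..10, no extra bits
            length = sym - 254
            dcode = r.msb_first(5)           # fixed distance codes are 5 bits
            if dcode <= 3:
                dist = dcode + 1
            elif dcode <= 5:                 # codes 4 and 5 carry one extra bit
                dist = 5 + 2 * (dcode - 4) + r.lsb_first(1)
            else:
                raise ValueError("distance code not handled in this sketch")
            for _ in range(length):
                out.append(out[-dist])
        else:
            raise ValueError("length code not handled in this sketch")

print(inflate_fixed_subset(DATA))            # b'TestingTesting'
print(zlib.decompress(DATA, -15))            # b'TestingTesting'
```

Passing -15 as wbits tells zlib to expect a raw deflate stream without a zlib header or trailer.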
