Why are the 14 most significant bits of 17612864 equal to 67?

At the bottom of Page 264 of CLRS, the authors say after obtaining r0 = 17612864, the 14 most significant bits of r0 yield the hash value h(k) = 67. I do not understand why it gives 67 since 67 in binary is 1000011 which is 7 bits.
EDIT
In the textbook:
As an example, suppose we have k = 123456, p = 14, m = 2^14 = 16384, and w = 32. Adapting Knuth's suggestion, we choose A to be the fraction of the form s/2^32 that is closest to (√5 - 1)/2, so that A = 2654435769/2^32. Then k*s = 327706022297664 = (76300 * 2^32) + 17612864, and so r1 = 76300 and r0 = 17612864. The 14 most significant bits of r0 yield the value h(k) = 67.

17612864 = 0x010CC040 =
0000 0001 0000 1100 1100 0000 0100 0000
The most significant 14 bits of that are
0000 0001 0000 11
which is 0x43, i.e. 67.
Also:
int32 input = 17612864;
int32 output = input >> (32-14); //67

In a 32-bit world
17612864 = 00000001 00001100 11000000 01000000 (binary)
top fourteen bits = 00000001 000011 = 67
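
Putting the whole computation together, here is a minimal, self-contained C sketch of the textbook example; 2654435769 is the s from the quoted passage (the fraction s/2^32 closest to (√5 - 1)/2):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t k = 123456;
    uint32_t s = 2654435769u;                 // A = s / 2^32, Knuth's suggestion
    int      p = 14;                          // m = 2^14 = 16384 slots

    uint64_t product = (uint64_t)k * s;       // 327706022297664
    uint32_t r1 = (uint32_t)(product >> 32);  // high word: 76300
    uint32_t r0 = (uint32_t)product;          // low word: 17612864
    uint32_t h  = r0 >> (32 - p);             // top 14 bits of r0: 67

    printf("r1 = %u, r0 = %u, h(k) = %u\n", r1, r0, h);
    return 0;
}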

How to explain embedded message binary wire format of protocol buffer?

I'm trying to understand the protocol buffer encoding method. When translating a message to binary (or hexadecimal) format, I can't understand how the embedded message is encoded.
I guess it may be related to the memory address, but I can't find the exact relationship.
Here is what I've done.
Step 1: I defined two messages in a test.proto file,
syntax = "proto3";
package proto_test;

message Education {
    string college = 1;
}

message Person {
    int32 age = 1;
    string name = 2;
    Education edu = 3;
}
Step 2: Then I generated some Go code,
protoc --go_out=. test.proto
Step 3: Then I checked the encoded format of the message,
p := proto_test.Person{
    Age:  666,
    Name: "Tom",
    Edu: &proto_test.Education{
        College: "SOMEWHERE",
    },
}
var b []byte
out, err := p.XXX_Marshal(b, true)
if err != nil {
    log.Fatalln("fail to marshal with error: ", err)
}
fmt.Printf("hexadecimal format:% x \n", out)
fmt.Printf("binary format:% b \n", out)
which outputs,
hexadecimal format:08 9a 05 12 03 54 6f 6d 1a fd 96 d1 08 0a 09 53 4f 4d 45 57 48 45 52 45
binary format:[ 1000 10011010 101 10010 11 1010100 1101111 1101101 11010 11111101 10010110 11010001 1000 1010 1001 1010011 1001111 1001101 1000101 1010111 1001000 1000101 1010010 1000101]
What I understand is:
08 - int32 wire type with tag number 1
9a 05 - Varints for 666
12 - string wire type with tag number 2
03 - length delimited, which is 3 bytes
54 6f 6d - ASCII for "Tom"
1a - embedded message wire type with tag number 3
fd 96 d1 08 - ? (here is what I don't understand)
0a - string wire type with tag number 1
09 - length delimited, which is 9 bytes
53 4f 4d 45 57 48 45 52 45 - ASCII for "SOMEWHERE"
What does fd 96 d1 08 stand for?
It seems that d1 08 is always there, but fd 96 sometimes changes; I don't know why. Thanks for answering :)
Added
I debugged the marshal process and reported a bug here.
At that location one would expect the number of bytes in the embedded message.
I have repeated your experiment in Python.
msg = Person()
msg.age = 666
msg.name = "Tom"
msg.edu.college = "SOMEWHERE"
I got a different result, the one I would expect: a varint stating the size of the embedded message.
0x08
0x9A, 0x05
0x12
0x03
0x54 0x6F 0x6D
0x1A
0x0B <- equals 11 decimal.
0x0A
0x09
0x53 0x4F 0x4D 0x45 0x57 0x48 0x45 0x52 0x45
Next I deserialized your bytes:
msg2 = Person()
str = bytearray(b'\x08\x9a\x05\x12\x03\x54\x6f\x6d\x1a\xfd\x96\xd1\x08\x0a\x09\x53\x4f\x4d\x45\x57\x48\x45\x52\x45')
msg2.ParseFromString(str)
print(msg2)
The result of this is perfect:
age: 666
name: "Tom"
edu {
college: "SOMEWHERE"
}
The conclusion I come to is that there are different ways of encoding in Protobuf. I do not know what is done in this case, but I do know the example of a negative 32-bit varint: a positive value is encoded in up to five bytes, while a negative value is sign-extended to 64 bits and encoded in ten bytes.
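
For reference, here is a minimal C sketch of base-128 varint decoding, the encoding used for every integer and length prefix in the dumps above; it confirms that 9a 05 decodes to 666 and that the expected embedded-message length prefix 0b decodes to 11:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

// Decode one base-128 varint starting at buf[*pos]; advance *pos past it.
static uint64_t decode_varint(const uint8_t *buf, size_t *pos)
{
    uint64_t value = 0;
    int shift = 0;
    for (;;) {
        uint8_t byte = buf[(*pos)++];
        value |= (uint64_t)(byte & 0x7F) << shift;   // low 7 bits are payload
        if ((byte & 0x80) == 0)                      // high bit clear: last byte
            break;
        shift += 7;
    }
    return value;
}

int main(void)
{
    const uint8_t age[] = { 0x9a, 0x05 };   // varint encoding of 666
    const uint8_t len[] = { 0x0b };         // varint encoding of 11
    size_t pos = 0;
    printf("%llu\n", (unsigned long long)decode_varint(age, &pos));  // 666
    pos = 0;
    printf("%llu\n", (unsigned long long)decode_varint(len, &pos));  // 11
    return 0;
}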

What's wrong with my PNG IDAT chunk?

I'm trying to manually construct a simple, 4x1, uncompressed PNG.
So far, I have:
89504E47 // PNG Header
0D0A1A0A
0000000D // byte length of IHDR chunk contents, 4 bytes, value 13
49484452 // IHDR start - 4 bytes
00000004 // Width 4 bytes }
00000001 // Height 4 bytes }
08 // bit depth 8 = 24/32 bit 1 byte }
06 // color type, 6 - RGBa 1 byte }
00 // compression, 0 = Deflate 1 byte }
00 // filter, 0 = no filter 1 byte }
00 // interlace, 0 = no interlace 1 byte } Total, 13 Bytes
F93C0FCD // CRC of IHDR chunk, 4 bytes
00000013 // byte length of IDAT chunk contents, 4 bytes, value 19
49444154 // IDAT start - 4 bytes
0000 // ZLib 0 compression, 2 bytes }
00 // Filter = 0, 1 bytes }
CC0000FF // Pixel 1, Red-ish, 4 bytes }
00CC00FF // Pixel 2, Green-ish, 4 bytes }
0000CCFF // Pixel 3, Blue-ish, 4 bytes }
CCCCCCCC // Pixel 4, translucent grey, 4 bytes } Total, 19 Bytes
6464C2B0 // CRC of IHDR chunk, 4 bytes
00000000 // byte length of IEND chunk, 4 bytes (value: 0)
49454E44 // IEND start - 4 bytes
AE426082 // CRC of IEND chunk, 4 bytes
Update
I think the issue I'm having is down to the ZLib/Deflate ordering.
I think that I have to include the "Non-compressed blocks format" details from RFC 1951, sec. 3.2.4, but I'm a little unsure as to the interactions. The only examples I can find are for Compressed blocks (understandably!)
So I've now tried:
49444154 // IDAT start - 4 bytes
01 // BFINAL = 1, BTYPE = 00 1 byte }
11EE // LEN & NLEN of data 2 bytes }
00 // Filter = 0, 1 byte }
CC0000FF // Pixel 1, Red-ish, 4 bytes }
00CC00FF // Pixel 2, Green-ish, 4 bytes }
0000CCFF // Pixel 3, Blue-ish, 4 bytes }
CCCCCCCC // Pixel 4, translucent grey, 4 bytes } Total, 19 Bytes
6464C2B0 // CRC of IHDR chunk, 4 bytes
So the whole PNG file is:
89504E47 // PNG Block
0d0a1a0A
0000000D // IHDR Block
49484452
00000004
00000001
08060000
00
F93C0FCD
00000014 // IDAT Block
49444154
0111EE00
CC0000FF
00CC00FF
0000CCFF
CCCCCCCC
6464C2B0
00000000 // IEND Block
49454E44
AE426082
I'd be really grateful for some pointers as to where the issue lies... or even the PNG data for a working file so that I can reverse-engineer it?
Update 2
Thanks to Mark Adler, I've corrected my newbie errors and now have functional code that reproduces the result shown in his answer below, i.e. the 4x1 pixel image. From this I can now happily produce a 100x1 image!
However, as a last step, I'd hoped to extend this to, say, a 4x2 image by tweaking the height field in the IHDR and adding additional non-terminal IDATs. Unfortunately this doesn't appear to work the way I'd expected.
I now have something like...
89504E47 // PNG Header
0D0A1A0A
0000000D // re calc'ed IHDR with 2 rows
49484452
00000004
00000002 // changed - now 2 rows
08
06
00
00
00
7FA87D63 // CRC of IHDR updated
0000001C // row 1 IDAT, non-terminal
49444154
7801
00 // BFINAL = 0, BTYPE = 00
1100EEFF
00
CC0000FF
00CC00FF
0000CCFF
CCCCCCCC
3D3A0892
5D19A623
0000001C // row 2, terminal IDAT, as in Mark Adler's answer
49444154
7801
01 // BFINAL = 1, BTYPE = 00
1100EEFF
00
CC0000FF
00CC00FF
0000CCFF
CCCCCCCC
3D3A0892
BA0400B4
00000000
49454E44
AE426082
This:
11EE // LEN & NLEN of data 2 bytes }
is wrong. LEN and NLEN are both 16 bits, not 8 bits. So that needs to be:
1100EEFF // LEN & NLEN of data 4 bytes }
You also need a zlib wrapper around the deflate data. See RFC 1950.
Lastly you will need to update the CRC of the chunk. (Which has the wrong comment by the way -- it should say CRC of IDAT chunk.)
Thusly repaired:
89504E47 // PNG Header
0D0A1A0A
0000000D // byte length of IHDR chunk contents, 4 bytes, value 13
49484452 // IHDR start - 4 bytes
00000004 // Width 4 bytes }
00000001 // Height 4 bytes }
08 // bit depth 8 = 24/32 bit 1 byte }
06 // color type, 6 - RGBa 1 byte }
00 // compression, 0 = Deflate 1 byte }
00 // filter, 0 = no filter 1 byte }
00 // interlace, 0 = no interlace 1 byte } Total, 13 Bytes
F93C0FCD // CRC of IHDR chunk, 4 bytes
0000001C // byte length of IDAT chunk contents, 4 bytes, value 28
49444154 // IDAT start - 4 bytes
7801 // zlib Header 2 bytes }
01 // BFINAL = 1, BTYPE = 00 1 byte }
1100EEFF // LEN & NLEN of data 4 bytes }
00 // Filter = 0, 1 byte }
CC0000FF // Pixel 1, Red-ish, 4 bytes }
00CC00FF // Pixel 2, Green-ish, 4 bytes }
0000CCFF // Pixel 3, Blue-ish, 4 bytes }
CCCCCCCC // Pixel 4, translucent grey, 4 bytes }
3d3a0892 // Adler-32 check 4 bytes }
ba0400b4 // CRC of IDAT chunk, 4 bytes
00000000 // byte length of IEND chunk, 4 bytes (value: 0)
49454E44 // IEND start - 4 bytes
AE426082 // CRC of IEND chunk, 4 bytes
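
As a cross-check of the repaired file, here is a small C sketch (the helper names are mine) that assembles the IDAT chunk's zlib content for a single stored-deflate block: the 78 01 zlib header, the BFINAL/BTYPE byte, LEN and its one's-complement NLEN in little-endian order, the raw filter-plus-pixel bytes, and the Adler-32 of those raw bytes. It should print the 28-byte sequence 78 01 01 11 00 EE FF ... 3D 3A 08 92 shown above; the chunk's CRC-32 is then computed separately over the ASCII bytes "IDAT" followed by this content.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

// Adler-32 as defined in RFC 1950.
static uint32_t adler32(const uint8_t *data, size_t len)
{
    uint32_t a = 1, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (a + data[i]) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

int main(void)
{
    // One scanline: filter byte 0 followed by four RGBA pixels.
    const uint8_t raw[] = {
        0x00,
        0xCC, 0x00, 0x00, 0xFF,
        0x00, 0xCC, 0x00, 0xFF,
        0x00, 0x00, 0xCC, 0xFF,
        0xCC, 0xCC, 0xCC, 0xCC
    };
    uint16_t len  = sizeof raw;              // 17 = 0x0011
    uint16_t nlen = (uint16_t)~len;          // one's complement = 0xFFEE

    uint8_t idat[2 + 5 + sizeof raw + 4];    // 28 bytes of chunk data
    size_t n = 0;
    idat[n++] = 0x78; idat[n++] = 0x01;      // zlib header
    idat[n++] = 0x01;                        // BFINAL = 1, BTYPE = 00 (stored)
    idat[n++] = (uint8_t)(len & 0xFF);  idat[n++] = (uint8_t)(len >> 8);    // LEN, little-endian
    idat[n++] = (uint8_t)(nlen & 0xFF); idat[n++] = (uint8_t)(nlen >> 8);   // NLEN, little-endian
    memcpy(idat + n, raw, sizeof raw); n += sizeof raw;

    uint32_t ad = adler32(raw, sizeof raw);  // 0x3D3A0892
    idat[n++] = (uint8_t)(ad >> 24); idat[n++] = (uint8_t)(ad >> 16);       // Adler-32, big-endian
    idat[n++] = (uint8_t)(ad >> 8);  idat[n++] = (uint8_t)ad;

    for (size_t i = 0; i < n; i++)
        printf("%02X ", idat[i]);
    printf("\n");
    return 0;
}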

LC3, Store a Value of a Register to a Memory Location

I'm attempting to write a short LC-3 program that initializes R1 = 5 and R2 = 16, computes the sum of R1 and R2, and puts the result in memory location x4000. The program is supposed to start at x3000. Unfortunately, I have to write it in binary form.
This is what I have so far...
.orig x3000          ; Program starts at x3000
0101 001 001 1 00000 ;R1 <- R1 AND x0000
0001 001 001 1 00101 ;R1 <- R1 + x0005
0101 010 010 1 00000 ;R2 <- R2 AND x0000
0001 010 010 1 01000 ;R2 <- R2 + x0008
0001 010 010 1 01000 ;R2 <- R2 + x0008
0001 011 010 0 00 001 ;R3 <- R2 + R1
//This last step is where I'm struggling...
I was thinking of using ST, and I figured PCOFFSET9 to be 994, but I can't represent that using 8 bits... so how else would I do this? Is my code inefficient?
0011 011
The ST instruction's PCoffset9 field is only 9 bits, so it can only reach addresses within -256 to +255 words of the current location in memory; x4000 is far out of range from x3000. For something like this you will need to use the STI instruction (Store Indirect). The sample code below will help explain how to use STI.
.orig x3000
AND R1, R1, #0 ; Clear R1
ADD R1, R1, #5 ; Store 5 into R1
AND R2, R2, #0 ; Clear R2
ADD R2, R2, #8 ; Add 8 to R2
ADD R2, R2, #8 ; Add 8 again so R2 = 16 (the immediate field is only 5 bits)
ADD R3, R2, R1 ; R3 = R2 + R1
STI R3, STORE_x4000 ; Store the value of R3 into mem[x4000]
HALT ; TRAP x25 end the program
; Variables
STORE_x4000 .FILL x4000
.END
You will need to make the appropriate conversions to binary, but if you plug the code into the LC-3 simulator it will give you the binary representation.
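
For reference, here is one possible hand-assembly of the added lines, assuming the listing above (with the second ADD) is placed at x3000; STORE_x4000 then sits one word past the STI's incremented PC, so its PCoffset9 is +1. The simulator's output remains the authoritative check:

1011 011 000000001   ; STI R3, STORE_x4000 (opcode 1011, SR = R3, PCoffset9 = +1)
1111 0000 0010 0101  ; HALT (TRAP x25)
0100 0000 0000 0000  ; STORE_x4000 .FILL x4000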

Arithmetic shift using intel intrinsics

I have a set of bits, for example 1000 0000 0000 0000, which is 16 bits and therefore a short. I would like to use an arithmetic shift so that the MSB is copied into the rest of the bits:
1111 1111 1111 1111
and if I started off with 0000 0000 0000 0000:
after arithmetically shifting I would still have: 0000 0000 0000 0000
How can I do this without resorting to assembly? I looked at the Intel intrinsics guide and it looks like I need to do this using AVX extensions, but they seem to operate on data types wider than my short.
As mattnewport states in his answer, plain C code can do the job efficiently, though with the possibility of implementation-defined behavior. This answer shows how to avoid the implementation-defined behavior while preserving efficient code generation.
Because the question is about shifting a 16-bit operand, the concern over the implementation-defined choice between sign extension and zero fill can be avoided by first sign extending the operand to 32 bits. The 32-bit value can then be right shifted as unsigned and finally truncated back to 16 bits.
Mattnewport's code actually sign extends the 16-bit operand to an int (32 bits on common compilers) before shifting. This is because the language specification (C99 6.5.7, bitwise shift operators) requires as a first step that "the integer promotions are performed on each of the operands." Likewise, the result of mattnewport's code is an int because "the type of the result is that of the promoted left operand." Because of this, the variation that avoids implementation-defined behavior generates the same number of instructions as mattnewport's original code.
To avoid implementation-defined behavior, the implicit promotion to signed int is replaced with an explicit promotion to unsigned int. This eliminates any possibility of implementation-defined behavior while maintaining the same code efficiency.
The idea can be extended to cover 32-bit operands and works efficiently when 64-bit native integer support is present. Here is an example:
// use standard 'll' for long long print format
#define __USE_MINGW_ANSI_STDIO 1
#include <stdio.h>
#include <stdint.h>

// Code provided by mattnewport
int16_t aShiftRight16x (int16_t val, int count)
{
    return val >> count;
}

// This variation avoids implementation defined behavior
int16_t aShiftRight16y (int16_t val, int count)
{
    uint32_t uintVal = val;
    uint32_t uintResult = uintVal >> count;
    return (int16_t) uintResult;
}

// A 32-bit arithmetic right shift without implementation defined behavior
int32_t aShiftRight32 (int32_t val, int count)
{
    uint64_t uint64Val = val;
    uint64_t uint64Result = uint64Val >> count;
    return (int32_t) uint64Result;
}

int main (void)
{
    int16_t val16 = 0x8000;
    int32_t val32 = 0x80000000;
    int count;

    for (count = 0; count <= 15; count++)
        printf ("%04hX %04hX %08X\n", aShiftRight16x (val16, count),
                                      aShiftRight16y (val16, count),
                                      aShiftRight32 (val32, count));
    return 0;
}
Here is gcc 4.8.1 x64 code generation:
0000000000000030 <aShiftRight16x>:
30: 0f bf c1 movsx eax,cx
33: 89 d1 mov ecx,edx
35: d3 f8 sar eax,cl
37: c3 ret
0000000000000040 <aShiftRight16y>:
40: 0f bf c1 movsx eax,cx
43: 89 d1 mov ecx,edx
45: d3 e8 shr eax,cl
47: c3 ret
0000000000000050 <aShiftRight32>:
50: 48 63 c1 movsxd rax,ecx
53: 89 d1 mov ecx,edx
55: 48 d3 e8 shr rax,cl
58: c3 ret
Here is MS visual studio x64 code generation:
aShiftRight16x:
00: 0F BF C1 movsx eax,cx
03: 8B CA mov ecx,edx
05: D3 F8 sar eax,cl
07: C3 ret
aShiftRight16y:
10: 0F BF C1 movsx eax,cx
13: 8B CA mov ecx,edx
15: D3 E8 shr eax,cl
17: C3 ret
aShiftRight32:
20: 48 63 C1 movsxd rax,ecx
23: 8B CA mov ecx,edx
25: 48 D3 E8 shr rax,cl
28: C3 ret
program output:
8000 8000 80000000
C000 C000 C0000000
E000 E000 E0000000
F000 F000 F0000000
F800 F800 F8000000
FC00 FC00 FC000000
FE00 FE00 FE000000
FF00 FF00 FF000000
FF80 FF80 FF800000
FFC0 FFC0 FFC00000
FFE0 FFE0 FFE00000
FFF0 FFF0 FFF00000
FFF8 FFF8 FFF80000
FFFC FFFC FFFC0000
FFFE FFFE FFFE0000
FFFF FFFF FFFF0000
I'm not sure why you are looking for intrinsics for this. Why not just use ordinary C++ right shift? This behavior is implementation defined but AFAIK on Intel platforms it will always sign extend.
int16_t val = 1 << 15; // 1000 0000 0000 0000
int16_t shiftVal = val >> 15; // 1111 1111 1111 1111
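
That said, SSE2 does provide an arithmetic right shift for packed 16-bit lanes (the psraw instruction, exposed as the _mm_srai_epi16 intrinsic), so no AVX is needed for a short. A minimal sketch:

#include <stdio.h>
#include <emmintrin.h>   // SSE2: _mm_set1_epi16, _mm_srai_epi16, _mm_storeu_si128

int main(void)
{
    __m128i v = _mm_set1_epi16((short)0x8000);   // eight copies of 1000 0000 0000 0000
    __m128i r = _mm_srai_epi16(v, 15);           // arithmetic shift of each 16-bit lane

    short out[8];
    _mm_storeu_si128((__m128i *)out, r);
    printf("%04hX\n", (unsigned short)out[0]);   // prints FFFF
    return 0;
}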

XMM register storing

I am trying to store an XMM register to a certain memory location, such as address 4534342.
Example:
What I have done
I know that the XMM registers contain 128-bit values, so my program has already allocated 16 bytes of memory for the store. In addition, since the store requires aligned data, I have allocated 31 bytes and then found an aligned address within that block. That should prevent any exceptions from being thrown.
What I am trying to do, visually
Mem Adr | Contents (binary)
4534342 | 0000 0000 0000 0000 ; I want to pass in address 4534342 and the
4534346 | 0000 0000 0000 0000 ; program's inline assembly will store the
4534348 | 0000 0000 0000 0000 ; register's contents straight down to
4534350 | 0000 0000 0000 0000 ; address 4534358
4534352 | 0000 0000 0000 0000
4534354 | 0000 0000 0000 0000
4534356 | 0000 0000 0000 0000
4534358 | 0000 0000 0000 0000
Setup
cyg_uint8 *start_value; //pointer to the first value of the allocated block
cyg_uint32 end_address; //aligned address location value
cyg_uint32 *aligned_value; //pointer to the value at the end_address
start_value = xmm_container; //get the pointer to the allocated block
end_address = (((unsigned int) start_value) + 0x0000001f) & 0xFFFFFFF8; //find aligned memory
aligned_value = ((cyg_uint32*)end_address); //create a pointer to get the first value of the block
Debug statements BEFORE the assembly call, to verify behavior
printf("aligned_value: %d\n", (cyg_uint32) aligned_value);
printf("*aligned_value: %d\n", *aligned_value);
Assembly Call
__asm__("movdqa %%xmm0, %0\n" : "=m"(*aligned_value)); //assembly call
Debug statements AFTER the assembly call, to verify behavior
printf("aligned_value: %d\n", (cyg_uint32) aligned_value);
printf("*aligned_value: %d\n", *aligned_value);
The output from printf [FAILURE]
aligned_value: 1661836 //Looks good!
*aligned_value: 0 //Looks good!
aligned_value: -1 //Looks wrong :(
//then program gets stuck
Basically, am I doing this process correctly? Why do you think it is getting stuck?
Thank you for your time and effort.
I don't think your alignment logic is correct if you want a 16-byte aligned address.
Just do the math, it's easy!:
(0 + 0x1f) & 0xFFFFFFF8 = 0x18 ; 0x18-0=0x18 unused bytes, 0x1F-0x18=7 bytes left
(1 + 0x1f) & 0xFFFFFFF8 = 0x20 ; 0x20-1=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(8 + 0x1f) & 0xFFFFFFF8 = 0x20 ; 0x20-8=0x18 unused bytes, 0x1F-0x18=7 bytes left
(9 + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-9=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0xF + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-0xF=0x19 unused bytes, 0x1F-0x19=6 bytes left
(0x10 + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-0x10=0x18 unused bytes, 0x1F-0x18=7 bytes left
(0x11 + 0x1f) & 0xFFFFFFF8 = 0x30 ; 0x30-0x11=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0x18 + 0x1f) & 0xFFFFFFF8 = 0x30 ; 0x30-0x18=0x18 unused bytes, 0x1F-0x18=7 bytes left
(0x19 + 0x1f) & 0xFFFFFFF8 = 0x38 ; 0x38-0x19=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0x1F + 0x1f) & 0xFFFFFFF8 = 0x38 ; 0x38-0x1F=0x19 unused bytes, 0x1F-0x19=6 bytes left
First, to get all zeroes in the 4 least significant bits the mask should be 0xFFFFFFF0.
Next, you're overflowing the 31-byte buffer if you calculate the aligned address in this way. Your math leaves you with 0 to 7 bytes of space, which isn't sufficient to store 16 bytes.
For correct 16-byte alignment you should write this:
end_address = (((unsigned int)start_value) + 0xF) & 0xFFFFFFF0;
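
Putting the fix together, here is a minimal C sketch (variable names are mine): the corrected round-up-to-16 arithmetic, written with ~0xF so it also works with 64-bit pointers, followed by an aligned 128-bit store through the SSE2 intrinsic instead of inline assembly.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>   // SSE2: _mm_set1_epi32, _mm_store_si128

int main(void)
{
    unsigned char xmm_container[31];   // 31 bytes guarantee a 16-byte aligned, 16-byte window inside

    uintptr_t start   = (uintptr_t)xmm_container;
    uintptr_t aligned = (start + 0xF) & ~(uintptr_t)0xF;   // round up to a multiple of 16

    __m128i value = _mm_set1_epi32(0x11223344);
    _mm_store_si128((__m128i *)aligned, value);            // movdqa-style store, needs 16-byte alignment

    uint32_t first;
    memcpy(&first, (const void *)aligned, sizeof first);   // read back the first dword
    printf("start = %p, aligned = %p, first dword = %08X\n",
           (void *)start, (void *)aligned, (unsigned)first);
    return 0;
}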
