How to encode struct into binary with bit packing in Golang - go

I am trying to encode large data structs into binary. I have specified number of bits for each struct element. So I need to encode struct into binary according to bit length. Standard Golang library Encoding/binary packs each item minimum as one byte. Therefore I need another solution. How can I encode struct elements as specified bit number in Go?
For example; Item1 = 00001101 Item2 = 00000110 Result will be as 01101110
type Elements struct{
Item1 uint8 // number of bits = 5
Item2 uint8 // number of bits = 3
Item3 uint8 // number of bits = 2
Item4 uint64 // number of bits = 60
Item5 uint16 // number of bits = 11
Item6 []byte // bit length = 8
Item7 Others
}
type Others struct{
Other1 uint8 // number of bits = 4
Other2 uint32 // number of bits = 21
Other3 uint16 // number of bits = 9
}

Related

Does go use something like space padding for structs? [duplicate]

This question already has answers here:
Sizeof struct in Go
(6 answers)
Closed 4 months ago.
I was playing around in go, and was trying to calculate and get the size of struct objects. And found something interesting, if you take a look at the following structs:
type Something struct {
anInteger int16 // 2 bytes
anotherInt int16 // 2 bytes
yetAnother int16 // 2 bytes
someBool bool // 1 byte
} // I expected 7 bytes total
type SomethingBetter struct {
anInteger int16 // 2 bytes
anotherInt int16 // 2 bytes
yetAnother int16 // 2 bytes
someBool bool // 1 byte
anotherBool bool // 1 byte
} // I expected 8 bytes total
type Nested struct {
Something // 7 bytes expected at first
completingByte bool // 1 byte
} // 8 bytes expected at first sight
But the result I got using unsafe.Sizeof(...) was as following:
Something -> 8 bytes
SomethingBetter -> 8 bytes
Nested -> 12 bytes, still, after finding out that "Something" used 8 bytes, though this might use 9 bytes
I suspect that go does something kind of like padding, but I don't know how and why it does that, is there some formula? Or logics? If it uses space padding, is it done randomly? Or based on some rules?
Yes, we have padding! if your system architecture is 32-bit the word size is 4 bytes and if it is 64-bit, the word size is 8 bytes. Now, what is the word size? "Word size" refers to the number of bits processed by a computer's CPU in one go (these days, typically 32 bits or 64 bits). Data bus size, instruction size, address size are usually multiples of the word size.
For example, suppose this struct:
type data struct {
a bool // 1 byte
b int64 // 8 byte
}
This struct it's not 9 bytes because, when our word size is 8, for first cycle, cpu reads 1 byte of bool and padding 7 bytes for others.
Imagine:
p: padding
+-----------------------------------------+----------------+
| 1-byte bool | p | p | p | p | p | p | p | int-64 |
+-----------------------------------------+----------------+
first 8 bytes second 8 bytes
For better performance, sort your struct items from bigger to small.
This is not good performance:
type data struct {
a string // 16 bytes size 16
b int32 // 4 bytes size 20
// 4 bytes padding size 24
c string // 16 bytes size 40
d int32 // 4 bytes size 44
// 4 bytes padding size 48 - Aligned on 8 bytes
}
Now It's better:
type data struct {
a string // 16 bytes size 16
c string // 16 bytes size 32
d int32 // 4 bytes size 36
b int32 // 4 bytes size 40
// no padding size 40 - Aligned on 5 bytes
}
See here for more examples.

int64(math.Pow(2, 63) - 1) results in -9223372036854775808 rather than 9223372036854775807

I am trying to store max and min signed ints of different bits. The code works just fine for ints other than int64
package main
import (
"fmt"
"math"
)
func main() {
var minInt8 int8 = -128
var maxInt8 int8 = 127
fmt.Println("int8\t->", minInt8, "to", maxInt8)
fmt.Println("int8\t->", math.MinInt8, "to", math.MaxInt8)
var minInt16 int16 = int16(math.Pow(-2, 15))
var maxInt16 int16 = int16(math.Pow(2, 15) - 1)
fmt.Println("int16\t->", minInt16, "to", maxInt16)
fmt.Println("int16\t->", math.MinInt16, "to", math.MaxInt16)
var minInt32 int32 = int32(math.Pow(-2, 31))
var maxInt32 int32 = int32(math.Pow(2, 31) - 1)
fmt.Println("int32\t->", minInt32, "to", maxInt32)
fmt.Println("int32\t->", math.MinInt32, "to", math.MaxInt32)
var minInt64 int64 = int64(math.Pow(-2, 63))
var maxInt64 int64 = int64(math.Pow(2, 63) - 1) // gives me the wrong output
fmt.Println("int64\t->", minInt64, "to", maxInt64)
fmt.Println("int64\t->", math.MinInt64, "to", math.MaxInt64)
}
Output:
int8 -> -128 to 127
int8 -> -128 to 127
int16 -> -32768 to 32767
int16 -> -32768 to 32767
int32 -> -2147483648 to 2147483647
int32 -> -2147483648 to 2147483647
int64 -> -9223372036854775808 to -9223372036854775808
int64 -> -9223372036854775808 to 9223372036854775807
I have no idea about the cause of this behavior, any help would be appreciated.
There are multiple problems here:
math.Pow returns a float64. This type cannot be used to represent a 64 bit signed integer with full precision as required for the attempted computation here. To cite from Double-precision floating-point format
Precision limitations on integer values
Integers from −2^53 to 2^53 (−9,007,199,254,740,992 to 9,007,199,254,740,992) can be exactly
represented
Integers between 2^53 and 2^54 = 18,014,398,509,481,984
round to a multiple of 2 (even number)
Integers between 2^54 and 2^55 =
36,028,797,018,963,968 round to a multiple of 4
Even if the precision would be sufficient (which is true in the special case of 2^63) then the precision of float64 is not sufficient to substract 1 from 2^63. Just try the following (uint64 is used here since signed int64 is not sufficient):
uint64(math.Pow(2, 63)) // -> 9223372036854775808
uint64(math.Pow(2, 63)-1) // -> 9223372036854775808
Converting the value first to uint64 and then subtracting works instead, but only because 2^63 can be represented with full prevision in float64 even though other values with this size can not:
uint64(math.Pow(2, 63))-1 // -> 9223372036854775807

Split uint32 into two uint16

How does one go about splitting a single uint32 var in Go into two uint16 vars, representing the 16 MSB and 16 LSB respectively?
Here is a representation of what I am trying to do:
var number uint32
var a uint16
var b uint16
number = 4206942069
Now how would one go about assigning the 16 MSB in number into a and the 16 LSB into b ?
Use the following code to assign the 16 most significant bits in number to a and the 16 least significant bits to b:
a, b := uint16(number>>16), uint16(number)
Run it on the playground.

Convert uint64 to int64 without loss of information

The problem with the following code:
var x uint64 = 18446744073709551615
var y int64 = int64(x)
is that y is -1. Without loss of information, is the only way to convert between these two number types to use an encoder and decoder?
buff bytes.Buffer
Encoder(buff).encode(x)
Decoder(buff).decode(y)
Note, I am not attempting a straight numeric conversion in your typical case. I am more concerned with maintaining the statistical properties of a random number generator.
Your conversion does not lose any information in the conversion. All the bits will be untouched. It is just that:
uint64(18446744073709551615) = 0xFFFFFFFFFFFFFFFF
int64(-1) = 0xFFFFFFFFFFFFFFFF
Try:
var x uint64 = 18446744073709551615 - 3
and you will have y = -4.
For instance: playground
var x uint64 = 18446744073709551615 - 3
var y int64 = int64(x)
fmt.Printf("%b\n", x)
fmt.Printf("%b or %d\n", y, y)
Output:
1111111111111111111111111111111111111111111111111111111111111100
-100 or -4
Seeing -1 would be consistent with a process running as 32bits.
See for instance the Go1.1 release notes (which introduced uint64)
x := ^uint32(0) // x is 0xffffffff
i := int(x) // i is -1 on 32-bit systems, 0xffffffff on 64-bit
fmt.Println(i)
Using fmt.Printf("%b\n", y) can help to see what is going on (see ANisus' answer)
As it turned out, the OP wheaties confirms (in the comments) it was run initially in 32 bits (hence this answer), but then realize 18446744073709551615 is 0xffffffffffffffff (-1) anyway: see ANisusanswer;
The types uint64 and int64 can both represent 2^64 discrete integer values.
The difference between the two is that uint64 holds only positive integers (0 thru 2^64-1), where as int64 holds both negative and positive integers using 1 bit to hold the sign (-2^63 thru 2^63-1).
As others have said, if your generator is producing 0xffffffffffffffff, uint64 will represent this as the raw integer (18,446,744,073,709,551,615) whereas int64 will interpret the two's complement value and return -1.

How to calculate g values from LIS3DH sensor?

I am using LIS3DH sensor with ATmega128 to get the acceleration values to get motion. I went through the datasheet but it seemed inadequate so I decided to post it here. From other posts I am convinced that the sensor resolution is 12 bit instead of 16 bit. I need to know that when finding g value from the x-axis output register, do we calculate the two'2 complement of the register values only when the sign bit MSB of OUT_X_H (High bit register) is 1 or every time even when this bit is 0.
From my calculations I think that we calculate two's complement only when MSB of OUT_X_H register is 1.
But the datasheet says that we need to calculate two's complement of both OUT_X_L and OUT_X_H every time.
Could anyone enlighten me on this ?
Sample code
int main(void)
{
stdout = &uart_str;
UCSRB=0x18; // RXEN=1, TXEN=1
UCSRC=0x06; // no parit, 1-bit stop, 8-bit data
UBRRH=0;
UBRRL=71; // baud 9600
timer_init();
TWBR=216; // 400HZ
TWSR=0x03;
TWCR |= (1<<TWINT)|(1<<TWSTA)|(0<<TWSTO)|(1<<TWEN);//TWCR=0x04;
printf("\r\nLIS3D address: %x\r\n",twi_master_getchar(0x0F));
twi_master_putchar(0x23, 0b000100000);
printf("\r\nControl 4 register 0x23: %x", twi_master_getchar(0x23));
printf("\r\nStatus register %x", twi_master_getchar(0x27));
twi_master_putchar(0x20, 0x77);
DDRB=0xFF;
PORTB=0xFD;
SREG=0x80; //sei();
while(1)
{
process();
}
}
void process(void){
x_l = twi_master_getchar(0x28);
x_h = twi_master_getchar(0x29);
y_l = twi_master_getchar(0x2a);
y_h = twi_master_getchar(0x2b);
z_l = twi_master_getchar(0x2c);
z_h = twi_master_getchar(0x2d);
xvalue = (short int)(x_l+(x_h<<8));
yvalue = (short int)(y_l+(y_h<<8));
zvalue = (short int)(z_l+(z_h<<8));
printf("\r\nx_val: %ldg", x_val);
printf("\r\ny_val: %ldg", y_val);
printf("\r\nz_val: %ldg", z_val);
}
I wrote the CTRL_REG4 as 0x10(4g) but when I read them I got 0x20(8g). This seems bit bizarre.
Do not compute the 2s complement. That has the effect of making the result the negative of what it was.
Instead, the datasheet tells us the result is already a signed value. That is, 0 is not the lowest value; it is in the middle of the scale. (0xffff is just a little less than zero, not the highest value.)
Also, the result is always 16-bit, but the result is not meant to be taken to be that accurate. You can set a control register value to to generate more accurate values at the expense of current consumption, but it is still not guaranteed to be accurate to the last bit.
the datasheet does not say (at least the register description in chapter 8.2) you have to calculate the 2' complement but stated that the contents of the 2 registers is in 2's complement.
so all you have to do is receive the two bytes and cast it to an int16_t to get the signed raw value.
uint8_t xl = 0x00;
uint8_t xh = 0xFC;
int16_t x = (int16_t)((((uint16)xh) << 8) | xl);
or
uint8_t xa[2] {0x00, 0xFC}; // little endian: lower byte to lower address
int16_t x = *((int16*)xa);
(hope i did not mixed something up with this)
I have another approach, which may be easier to implement as the compiler will do all of the work for you. The compiler will probably do it most efficiently and with no bugs too.
Read the raw data into the raw field in:
typedef union
{
struct
{
// in low power - 8 significant bits, left justified
int16 reserved : 8;
int16 value : 8;
} lowPower;
struct
{
// in normal power - 10 significant bits, left justified
int16 reserved : 6;
int16 value : 10;
} normalPower;
struct
{
// in high resolution - 12 significant bits, left justified
int16 reserved : 4;
int16 value : 12;
} highPower;
// the raw data as read from registers H and L
uint16 raw;
} LIS3DH_RAW_CONVERTER_T;
than use the value needed according to the power mode you are using.
Note: In this example, bit fields structs are BIG ENDIANS.
Check if you need to reverse the order of 'value' and 'reserved'.
The LISxDH sensors are 2's complement, left-justified. They can be set to 12-bit, 10-bit, or 8-bit resolution. This is read from the sensor as two 8-bit values (LSB, MSB) that need to be assembled together.
If you set the resolution to 8-bit, just can just cast LSB to int8, which is the likely your processor's representation of 2's complement (8bit). Likewise, if it were possible to set the sensor to 16-bit resolution, you could just cast that to an int16.
However, if the value is 10-bit left justified, the sign bit is in the wrong place for an int16. Here is how you convert it to int16 (16-bit 2's complement).
1.Read LSB, MSB from the sensor:
[MMMM MMMM] [LL00 0000]
[1001 0101] [1100 0000] //example = [0x95] [0xC0] (note that the LSB comes before MSB on the sensor)
2.Assemble the bytes, keeping in mind the LSB is left-justified.
//---As an example....
uint8_t byteMSB = 0x95; //[1001 0101]
uint8_t byteLSB = 0xC0; //[1100 0000]
//---Cast to U16 to make room, then combine the bytes---
assembledValue = ( (uint16_t)(byteMSB) << UINT8_LEN ) | (uint16_t)byteLSB;
/*[MMMM MMMM LL00 0000]
[1001 0101 1100 0000] = 0x95C0 */
//---Shift to right justify---
assembledValue >>= (INT16_LEN-numBits);
/*[0000 00MM MMMM MMLL]
[0000 0010 0101 0111] = 0x0257 */
3.Convert from 10-bit 2's complement (now right-justified) to an int16 (which is just 16-bit 2's complement on most platforms).
Approach #1: If the sign bit (in our example, the tenth bit) = 0, then just cast it to int16 (since positive numbers are represented the same in 10-bit 2's complement and 16-bit 2's complement).
If the sign bit = 1, then invert the bits (keeping just the 10bits), add 1 to the result, then multiply by -1 (as per the definition of 2's complement).
convertedValueI16 = ~assembledValue; //invert bits
convertedValueI16 &= ( 0xFFFF>>(16-numBits) ); //but keep just the 10-bits
convertedValueI16 += 1; //add 1
convertedValueI16 *=-1; //multiply by -1
/*Note that the last two lines could be replaced by convertedValueI16 = ~convertedValueI16;*/
//result = -425 = 0xFE57 = [1111 1110 0101 0111]
Approach#2: Zero the sign bit (10th bit) and subtract out half the range 1<<9
//----Zero the sign bit (tenth bit)----
convertedValueI16 = (int16_t)( assembledValue^( 0x0001<<(numBits-1) ) );
/*Result = 87 = 0x57 [0000 0000 0101 0111]*/
//----Subtract out half the range----
convertedValueI16 -= ( (int16_t)(1)<<(numBits-1) );
[0000 0000 0101 0111]
-[0000 0010 0000 0000]
= [1111 1110 0101 0111];
/*Result = 87 - 512 = -425 = 0xFE57
Link to script to try out (not optimized): http://tpcg.io/NHmBRR

Resources