Is the difference between the data structures used for custom message headers in mqttv5 paho and amqp due to the protocol, or simply an implementation decision?
Background: I noticed that the most recent paho library for MQTT v5 replaced the type of Publish.PublishProperties.User from map[string]string with []UserProperty. I'm assuming the reason for the change is the prevalence of use cases that require multiple values sharing a single key, but wouldn't it be more efficient to use the AMQP-style custom message header? amqp uses amqp.Table, where Table is map[string]interface{}, which seems to serve the same purpose (just use []string as your interface implementation).
Retrieving a slice of values that share a custom message header in amqp requires one line of code:
slice := message.Headers["key"]
Retrieving a slice of values that share a custom message header in paho requires a loop of string comparisons:
// GetAll returns a slice of all entries in the UserProperties
// that match key, or a nil slice if none were found.
func (u UserProperties) GetAll(key string) []string {
	var ret []string
	for _, v := range u {
		if v.Key == key {
			ret = append(ret, v.Value)
		}
	}
	return ret
}
Is there a reason for this choice of implementation?
As per the contributors, returning the user properties as a map performs much worse than the []UserProperty slice.
See the github discussion here: https://github.com/eclipse/paho.golang/issues/47
From that discussion, with the current []UserProperty implementation being the second run:
With 10 key/value pairs in the user properties, 5 of which share the same key:
pkg: github.com/eclipse/paho.golang/paho
BenchmarkUserProperties-8 703861 1476 ns/op 784 B/op 15 allocs/op
BenchmarkUserProperty-8 6933013 158 ns/op 320 B/op 1 allocs/op
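To illustrate where the extra allocations in the first run can come from (this sketch is mine, not the benchmark code from the issue): building a multi-value map from decoded pairs allocates the map itself plus a []string per distinct key, while the []UserProperty form is a single flat slice of pairs.

type UserProperty struct {
	Key   string
	Value string
}

// toMap is illustrative only: converting decoded key/value pairs into a
// multi-value map costs one allocation for the map plus one []string per
// distinct key, which is consistent with the higher allocation count above.
func toMap(pairs []UserProperty) map[string][]string {
	m := make(map[string][]string, len(pairs))
	for _, p := range pairs {
		m[p.Key] = append(m[p.Key], p.Value)
	}
	return m
}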
I have to handle multiple versions of a communication protocol. I have packages with all the packets and their IDs, which look like
package v1_8

// Clientbound packet IDs
const (
	KeepAliveClientbound byte = iota
	JoinGame
	ChatMessageClientbound
	TimeUpdate
	EntityEquipment
	SpawnPosition
	UpdateHealth
	Respawn
	PlayerPositionAndLookClientbound
	HeldItemChangeClientbound
	UseBed
	AnimationClientbound
	SpawnPlayer
	CollectItem
	...
and
package v1_14_3

// Clientbound packet IDs
const (
	SpawnObject byte = iota // 0x00
	SpawnExperienceOrb
	SpawnGlobalEntity
	SpawnMob
	SpawnPainting
	SpawnPlayer
	AnimationClientbound
	Statistics
	BlockBreakAnimation
	UpdateBlockEntity
	BlockAction
	BlockChange
	BossBar
	ServerDifficulty
	ChatMessageClientbound
	MultiBlockChange
	...
There are slight differences between versions (different IDs for the same packets, or changed data types), so how do I go about handling them? For now I am just using a switch statement to check which packet is being processed, but that works for only one version. I have the client's protocol version, so I just need to change the package that is used in the switch statement.
for {
	pkt, err := src.ReadPacket()
	if err != nil {
		// ...handle error
	}
	switch pkt.ID {
	case v1_14_3.<Packet Const>:
		// ...handle packet - there might be a different data type assigned for different versions of the protocol
	}
	err = dst.WritePacket(pkt)
	if err != nil {
		// ...handle error
	}
}
Edit: providing more information
Packet struct returned by ReadPacket():
type Packet struct {
	ID   byte
	Data []byte
}
And some real-life data
{0 [234 3 9 108 111 99 97 108 104 111 115 116 99 156 1]}
This is one of the packets defined here as Handshake
I then use the Scan method of the Packet struct like this:
switch pkt.ID {
case v1_14_3.Handshake:
	var (
		protocolVersion packet.VarInt
		serverAddress   packet.String
		serverPort      packet.UnsignedShort
		nextState       packet.VarInt
	)
	err = pkt.Scan(&protocolVersion, &serverAddress, &serverPort, &nextState)
	if err != nil {
		// ...handle error
	}
	// ...process packet
}
As I said before, the types passed to the Scan method can differ slightly between versions, as can the packet IDs. So a Handshake packet may have a different pkt.ID or a different payload between versions (and therefore the variables passed to Scan would differ). Scan is just a variadic function that populates the passed variables from pkt.Data.
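One pattern that fits this setup is a dispatch table keyed by protocol version, so each version package installs its own packet-ID-to-handler mapping and the handler set is selected once per connection. This is only a sketch under assumptions: proto, handler, Register, and Dispatch are hypothetical names, not from the question.

package proto

import "fmt"

// Packet mirrors the struct from the edit above.
type Packet struct {
	ID   byte
	Data []byte
}

// handler processes one decoded packet.
type handler func(pkt Packet) error

// handlersByVersion maps protocol version -> packet ID -> handler.
var handlersByVersion = map[int32]map[byte]handler{}

// Register lets each version package (v1_8, v1_14_3, ...) install its own
// table, e.g. from an init function.
func Register(version int32, table map[byte]handler) {
	handlersByVersion[version] = table
}

// Dispatch picks the handler matching the client's protocol version.
func Dispatch(version int32, pkt Packet) error {
	h, ok := handlersByVersion[version][pkt.ID]
	if !ok {
		return fmt.Errorf("version %d: unknown packet ID 0x%02x", version, pkt.ID)
	}
	return h(pkt)
}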
func Encode(i interface{}) ([]byte, error) {
	buffer := bytes.NewBuffer(make([]byte, 0, 1024))
	// size := unsafe.Sizeof(i)
	size := reflect.TypeOf(i).Size()
	fmt.Println(size)
	ptr := unsafe.Pointer(&i)
	startAddr := uintptr(ptr)
	endAddr := startAddr + size
	for i := startAddr; i < endAddr; i++ {
		bytePtr := unsafe.Pointer(i)
		b := *(*byte)(bytePtr)
		buffer.WriteByte(b)
	}
	return buffer.Bytes(), nil
}
func TestEncode(t *testing.T) {
	test := Test{10, "hello world"}
	b, _ := Encode(test)
	ptr := unsafe.Pointer(&b)
	newTest := *(*Test)(ptr)
	fmt.Println(newTest.X)
}
I am learning how to use Go's unsafe package and wrote this function for encoding any object. I ran into two problems. First, does unsafe.Sizeof(obj) always return the size of obj's pointer? Why is it different from reflect.TypeOf(obj).Size()? Second, I want to iterate over the underlying bytes of obj and convert them back to the object in the TestEncode function via unsafe.Pointer, but the object's values are all corrupted. Why?
First, unsafe.Sizeof returns the number of bytes needed to store a value of the given type. It is a little bit tricky, but it does not mean the number of bytes needed to store the underlying data.
For example, a slice, as is well known, stores a header of three 4-byte words on a 32-bit machine: one uintptr holding the address of the underlying array, and two int32s for len and cap. So no matter how long a slice is, or what its element type is, the slice itself always takes 12 bytes on a 32-bit machine. Similarly, a string uses 8 bytes: one uintptr for the address and one int32 for the length.
As for the difference from reflect.TypeOf().Size(): it is about interfaces. reflect.TypeOf looks into the interface, gets the concrete type, and reports the number of bytes needed for that concrete type, while unsafe.Sizeof just returns 8 for an interface type: two uintptrs, one pointing to the data and one pointing to the type and method information.
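To make the difference concrete, here is a small self-contained example (mine, not from the original answer; the exact numbers depend on the architecture, see Note 3 below):

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

type Test struct {
	X int32
	S string
}

func main() {
	var i interface{} = Test{10, "hello world"}

	// unsafe.Sizeof sees only the interface value itself: two words
	// (a data pointer and a type pointer), no matter what is inside.
	// That is 8 bytes on a 32-bit machine, 16 on a 64-bit one.
	fmt.Println(unsafe.Sizeof(i))

	// reflect.TypeOf looks through the interface at the concrete Test
	// type and reports the size of that struct instead.
	fmt.Println(reflect.TypeOf(i).Size())
}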
The second part is quite clear now. For one, unsafe.Pointer takes the address of the interface value, not of the concrete data. For another, in TestEncode, unsafe.Pointer takes the address of the 12-byte slice header, not of the bytes it refers to. There may be other errors, but given these two, there is little point in hunting for them.
Note: I avoid talking about the order of the uintptr and int32 fields, not only because I don't know it, but also because it is undocumented, unsafe, and implementation-dependent.
Note 2: Conclusion: don't try to dump the memory of Go data.
Note 3: I put everything in 32-bit terms because the playground uses a 32-bit architecture, which makes it easier to check there.
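For completeness, here is the kind of safe alternative that advice points toward, using the standard library's encoding/gob (this example is mine, not part of the original answer):

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// gob encodes only exported fields, hence the capitalized names.
type Test struct {
	X int32
	S string
}

func main() {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(Test{10, "hello world"}); err != nil {
		panic(err)
	}

	var decoded Test
	if err := gob.NewDecoder(&buf).Decode(&decoded); err != nil {
		panic(err)
	}
	fmt.Println(decoded.X, decoded.S) // 10 hello world
}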
Assuming you had 80 bytes of data and only the last 4 bytes were constantly changing, how would you efficiently hash the total 80 bytes using Go? In essence, the first 76 bytes are the same, while the last 4 bytes keep changing. Ideally, you would keep a copy of the hash digest for the first 76 bytes and just keep changing the last 4.
You can try the following examples on the Go Playground. Benchmark results are at the end.
Note: the implementations below are not safe for concurrent use; I intentionally made them like this to be simpler and faster.
Fastest when using only public API (always hashes all input)
The general concept and interface of Go's hash algorithms is the hash.Hash interface. It does not allow you to save the state of the hasher, or to return or rewind to a saved state. So using the public hash APIs of the Go standard lib, you always have to calculate the hash from the start.
What the public API does offer is reusing an already constructed hasher to calculate the hash of a new input, via the Hash.Reset() method. This is nice, as no (memory) allocations are needed to calculate multiple hash values. You may also take advantage of the optional slice that can be passed to Hash.Sum(), to which the current hash is appended. This is nice, as no allocations are needed to receive the hash results either.
Here's an example that takes advantage of these:
type Cached1 struct {
	hasher hash.Hash
	result [sha256.Size]byte
}

func NewCached1() *Cached1 {
	return &Cached1{hasher: sha256.New()}
}

func (c *Cached1) Sum(data []byte) []byte {
	c.hasher.Reset()
	c.hasher.Write(data)
	return c.hasher.Sum(c.result[:0])
}
Test data
We'll use the following test data:
var fixed = bytes.Repeat([]byte{1}, 76)
var variantA = []byte{1, 1, 1, 1}
var variantB = []byte{2, 2, 2, 2}
var data = append(append([]byte{}, fixed...), variantA...)
var data2 = append(append([]byte{}, fixed...), variantB...)
var c1 = NewCached1()
First let's get authentic results (to verify if our hasher works correctly):
fmt.Printf("%x\n", sha256.Sum256(data))
fmt.Printf("%x\n", sha256.Sum256(data2))
Output:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Now let's check our Cached1 hasher:
fmt.Printf("%x\n", c1.Sum(data))
fmt.Printf("%x\n", c1.Sum(data2))
Output is the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Even faster but may break (in future Go releases): hashes only the last 4 bytes
Now let's look at a less flexible solution that truly calculates the hash of the fixed first 76 bytes only once.
The hasher of the crypto/sha256 package is the unexported sha256.digest type (more precisely a pointer to this type):
// digest represents the partial evaluation of a checksum.
type digest struct {
	h     [8]uint32
	x     [chunk]byte
	nx    int
	len   uint64
	is224 bool // mark if this digest is SHA-224
}
A value of the digest struct type basically holds the current state of the hasher.
What we may do is feed the hasher the fixed first 76 bytes, and then save this struct value. When we need to calculate the hash of some 80 bytes of data whose first 76 bytes are the same, we use the saved value as a starting point and then feed the varying last 4 bytes.
Note that it's enough to simply save this struct value as it contains no pointers and no descriptor types like slices and maps. Else we would also have to make a copy of those, but we're "lucky". So this solution would need adjustment if a future implementation of crypto/sha256 would add a pointer or slice field for example.
Since sha256.digest is unexported, we can only use reflection (reflect package) to achieve our goals, which inherently will add some delays to computation.
Example implementation that does this:
type Cached2 struct {
	origv   reflect.Value
	hasherv reflect.Value
	hasher  hash.Hash
	result  [sha256.Size]byte
}

func NewCached2(fixed []byte) *Cached2 {
	h := sha256.New()
	h.Write(fixed)
	c := &Cached2{origv: reflect.ValueOf(h).Elem()}
	hasherv := reflect.New(c.origv.Type())
	c.hasher = hasherv.Interface().(hash.Hash)
	c.hasherv = hasherv.Elem()
	return c
}

func (c *Cached2) Sum(data []byte) []byte {
	// Set state of the fixed hash:
	c.hasherv.Set(c.origv)
	c.hasher.Write(data)
	return c.hasher.Sum(c.result[:0])
}
Testing it:
var c2 = NewCached2(fixed)
fmt.Printf("%x\n", c2.Sum(variantA))
fmt.Printf("%x\n", c2.Sum(variantB))
Output is again the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
So it works.
The "ultimate", fastest solution
Cached2 could be faster if reflection were not involved. If we want an even faster solution, we can simply copy the sha256.digest type and its methods into our package, so we can use it directly without resorting to reflection.
If we do this, we will have access to the digest struct value, and we can simply make a copy of it like:
var d digest
// init d
saved := d
And restoring it is like:
d = saved
I simply "cloned" the crypto/sha256 package to my workspace, and changed / exported the digest type as Digest just for demonstration purposes. Then using this mysha256.Digest type I implemented Cached3 like this:
type Cached3 struct {
	orig   mysha256.Digest
	result [sha256.Size]byte
}

func NewCached3(fixed []byte) *Cached3 {
	var d mysha256.Digest
	d.Reset()
	d.Write(fixed)
	return &Cached3{orig: d}
}

func (c *Cached3) Sum(data []byte) []byte {
	// Make a copy of the fixed hash:
	d := c.orig
	d.Write(data)
	return d.Sum(c.result[:0])
}
Testing it:
var c3 = NewCached3(fixed)
fmt.Printf("%x\n", c3.Sum(variantA))
fmt.Printf("%x\n", c3.Sum(variantB))
Output again is the same. So this works too.
Benchmarks
We can benchmark performance with this code:
func BenchmarkCached1(b *testing.B) {
	for i := 0; i < b.N; i++ {
		c1.Sum(data)
		c1.Sum(data2)
	}
}

func BenchmarkCached2(b *testing.B) {
	for i := 0; i < b.N; i++ {
		c2.Sum(variantA)
		c2.Sum(variantB)
	}
}

func BenchmarkCached3(b *testing.B) {
	for i := 0; i < b.N; i++ {
		c3.Sum(variantA)
		c3.Sum(variantB)
	}
}
Benchmark results (go test -bench . -benchmem):
BenchmarkCached1-4 1000000 1569 ns/op 0 B/op 0 allocs/op
BenchmarkCached2-4 2000000 926 ns/op 0 B/op 0 allocs/op
BenchmarkCached3-4 2000000 872 ns/op 0 B/op 0 allocs/op
Cached2 is approximately 41% faster than Cached1, which is quite noticeable and nice. Cached3 gives only a "little" performance boost over Cached2, another 6%. Cached3 is 44% faster than Cached1.
Also note that none of the solutions use any allocations which is also nice.
Conclusion
For that extra 40% or 44%, I would probably not go for the Cached2 or Cached3 solutions. Of course, it really depends on how important performance is to you. If it is important, I think the Cached2 solution presents a fine compromise between minimal added complexity and the noticeable performance gain. It does carry the risk that a future Go implementation may break it; if that is a problem, Cached3 solves it by copying the current implementation (and also improves performance a little).
I am trying to create a method that appends non-cached data pulled from the DB to data that is already cached. Ideally, a single method performing something like the code below would be ideal. So, is there a way to append two interface{} values that are both slices?
package main

import "fmt"

type foo struct {
	Name string
}

func main() {
	a := []*foo{
		&foo{"bar"},
		&foo{"boom"},
	}
	b := []*foo{
		&foo{"blam"},
		&foo{"pow"},
	}
	fmt.Println(add(a, b))
}

func add(a, b interface{}) interface{} {
	return append([]interface{}{a}, ([]interface{}{b})...)
}
Results https://play.golang.org/p/9cWxPg6daq
[[0x1040a128 0x1040a130] [0x1040a140 0x1040a148]]
Desired Results
[0x1040a128 0x1040a130 0x1040a140 0x1040a148]
Update: Benchmarks
https://play.golang.org/p/9a8aZckQAF
C:\Users\chris\Go\src\bitbucket.org\chrisolsen\test>go test -bench=.
PASS
BenchmarkWithReflection-4 300000 4580 ns/op
BenchmarkNoReflection-4 300000 4194 ns/op
ok bitbucket.org/chrisolsen/test 2.911s
Only with reflect:
func add(a, b interface{}) interface{} {
	return reflect.AppendSlice(reflect.ValueOf(a), reflect.ValueOf(b)).Interface()
}
Playground: https://play.golang.org/p/FjS73G2_G5.
Notice though that this will panic if a and b are not compatible slices.
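If the panic is a concern, a guard can be added first (my sketch, not from the answer; it assumes fmt and reflect are imported):

// addChecked returns an error instead of panicking when a and b are not
// slices of the same type.
func addChecked(a, b interface{}) (interface{}, error) {
	va, vb := reflect.ValueOf(a), reflect.ValueOf(b)
	if va.Kind() != reflect.Slice || vb.Kind() != reflect.Slice || va.Type() != vb.Type() {
		return nil, fmt.Errorf("add: incompatible arguments %T and %T", a, b)
	}
	return reflect.AppendSlice(va, vb).Interface(), nil
}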
In general, in Go there is no way to convert a slice of one type to a slice of another type. You have to do it in a type-specific way, or run a for loop to convert each element. This is a result of how slices are implemented. See this answer:
Type converting slices of interfaces in go
The way to get the result you want with a type-specific method looks like this:
https://play.golang.org/p/RqKWvQqE_g
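For reference, a minimal sketch of the general shape such a type-specific method takes (the linked playground has the author's actual code; addFoo is my name for it):

// addFoo appends two []*foo slices without reflection or interface{}.
func addFoo(a, b []*foo) []*foo {
	out := make([]*foo, 0, len(a)+len(b))
	out = append(out, a...)
	return append(out, b...)
}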
Go does not yet have a "generics" mechanism that would make this possible with a generic annotation, although some people have written code generators that work like generics do in Java.
I've been trying to benchmark a Radix Tree implementation I wrote for the sake of practicing Go.
But I ran into the question of how I should benchmark it. The code below shows two cases, or let's say two different ways, in which I would like to benchmark the LookUp func.
Case 1: Use one single byte slice that exists in the tree, meaning the LookUp will succeed, walking down through the child nodes, etc.
Case 2: Use a func to generate a random slice from the data already in the tree, meaning the LookUp will succeed as well.
I know the time spent will depend on the tree depth... I think Case 2 is closer to a real-world usage pattern, or is it not?
QUESTION: Which case is more efficient or useful to benchmark?
Benchmark:
func BenchmarkLookUp(b *testing.B) {
	radix := New()
	insertData(radix, sampleData2)
	textToLookUp := randomBytes()

	for i := 0; i < b.N; i++ {
		radix.LookUp(textToLookUp) // Case 1
		// radix.LookUp(randomBytes()) // Case 2
	}
}

func randomBytes() []byte {
	strings := sampleData2()
	return []byte(strings[random(0, len(strings))])
}
func sampleData2() []string {
	return []string{
		"romane",
		"romanus",
		"romulus",
		...
	}
}
Result Case 1:
PASS
BenchmarkLookUp-4 10000000 146 ns/op
ok github.com/falmar/goradix 2.068s
PASS
BenchmarkLookUp-4 10000000 149 ns/op
ok github.com/falmar/goradix 2.244s
Result Case 2:
PASS
BenchmarkLookUp-4 3000000 546 ns/op
ok github.com/falmar/goradix 3.094s
PASS
BenchmarkLookUp-4 3000000 538 ns/op
ok github.com/falmar/goradix 4.481s
Results when there is no match:
PASS
BenchmarkLookUp-4 10000000 194 ns/op
ok github.com/falmar/goradix 3.189s
PASS
BenchmarkLookUp-4 10000000 191 ns/op
ok github.com/falmar/goradix 3.243s
If your benchmark input is random, it becomes very difficult to compare the performance of different implementations from one run to the next.
Instead, statically implement a few different benchmark cases that stress different areas of your algorithm. The cases should represent different scenarios: the case where there are no matches (as you already have), the case where many items in the source data will be returned by a lookup, the case where there are many items and only one will be returned, and so on.
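A sketch of what such fixed, named cases could look like using sub-benchmarks (New, insertData, and sampleData2 are the helpers from the question; the case names and keys are illustrative):

func BenchmarkLookUpCases(b *testing.B) {
	radix := New()
	insertData(radix, sampleData2)

	cases := []struct {
		name string
		key  []byte
	}{
		{"match-short", []byte("romane")},
		{"match-long", []byte("romulus")},
		{"no-match", []byte("xyzzy")},
	}
	for _, c := range cases {
		c := c // avoid capturing the loop variable (pre-Go 1.22)
		b.Run(c.name, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				radix.LookUp(c.key)
			}
		})
	}
}

Each case then reports its own ns/op, so results stay comparable from run to run.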