FlatBuffers serialization performance is slow compared to protobuf - Go

With the following IDL file, my intention is to measure the serialization speed of FlatBuffers. I am using Go for my analysis:
namespace MyFlat;

struct Vertices {
    x : double;
    y : double;
}

table Polygon {
    polygons : [Vertices];
}

table Layer {
    polygons : [Polygon];
}

root_type Layer;
Here is the code I have written for the measurement:
package main

import (
    "MyFlat"
    "fmt"
    "io/ioutil"
    "log"
    "strconv"
    "time"

    flatbuffers "github.com/google/flatbuffers/go"
)

func calculation(size int, vertices int) {
    b := flatbuffers.NewBuilder(0)
    var polyoffset []flatbuffers.UOffsetT
    rawSize := ((16 * vertices) * size) / 1024
    var vec1 flatbuffers.UOffsetT
    var StartedAtMarshal time.Time
    var EndedAtMarshal time.Time

    StartedAtMarshal = time.Now()
    for k := 0; k < size; k++ {
        MyFlat.PolygonStartPolygonsVector(b, vertices)
        for i := 0; i < vertices; i++ {
            MyFlat.CreateVertices(b, 2.0, 2.4)
        }
        vec1 = b.EndVector(vertices)
        MyFlat.PolygonStart(b)
        MyFlat.PolygonAddPolygons(b, vec1)
        polyoffset = append(polyoffset, MyFlat.PolygonEnd(b))
    }
    MyFlat.LayerStartPolygonsVector(b, size)
    for _, offset := range polyoffset {
        b.PrependUOffsetT(offset)
    }
    vec := b.EndVector(size)
    MyFlat.LayerStart(b)
    MyFlat.LayerAddPolygons(b, vec)
    finalOffset := MyFlat.LayerEnd(b)
    b.Finish(finalOffset)
    EndedAtMarshal = time.Now()
    SeElaprseTime := EndedAtMarshal.Sub(StartedAtMarshal).String()

    mybyte := b.FinishedBytes()
    file := "/tmp/myflat_" + strconv.Itoa(size) + ".txt"
    if err := ioutil.WriteFile(file, mybyte, 0644); err != nil {
        log.Fatalln("Failed to write address book:", err)
    }

    StartedAt := time.Now()
    layer := MyFlat.GetRootAsLayer(mybyte, 0)
    size = layer.PolygonsLength()
    obj := &MyFlat.Polygon{}
    layer.Polygons(obj, 1)
    for i := 0; i < obj.PolygonsLength(); i++ {
        objVertices := &MyFlat.Vertices{}
        obj.Polygons(objVertices, i)
        fmt.Println(objVertices.X(), objVertices.Y())
    }
    EndedAt := time.Now()
    DeElapseTime := EndedAt.Sub(StartedAt).String()
    fmt.Println(size, ",", vertices, ", ", SeElaprseTime, ",", DeElapseTime, ",", (len(mybyte) / 1024), ",", rawSize)
}

func main() {
    data := []int{500000, 1000000, 1500000, 3000000, 8000000}
    for _, size := range data {
        //calculation(size, 5)
        //calculation(size, 10)
        calculation(size, 20)
    }
}
func main() {
data := []int{500000, 1000000, 1500000, 3000000, 8000000}
for _, size := range data {
//calculation(size, 5)
//calculation(size, 10)
calculation(size, 20)
}
}
The problem is that I find serialization quite slow compared to protobuf with a similar IDL.
For 3M polygons, serialization takes almost 4.1167037s, where protobuf takes about half that. Deserialization time for FlatBuffers is very low (microseconds); in protobuf it is quite high. But even if I add both together, FlatBuffers performance is lower.
Do you see any optimized way to serialize it? FlatBuffers has a CreateByteVector method for byte vectors, but there is no direct way to serialize a vector of polygons from an existing user-defined type vector.
I am adding the protobuf code also:
syntax = 'proto3';
package myproto;

message Polygon {
    repeated double v_x = 1;
    repeated double v_y = 2;
}

message CADData {
    repeated Polygon polygon = 1;
    string layer_name = 2;
}
Go code with protobuf:
package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "math/rand"
    "myproto"
    "strconv"
    "time"

    "github.com/golang/protobuf/proto"
)

func calculation(size int, vertices int) {
    var comp []*myproto.Polygon
    var vx []float64
    var vy []float64
    for i := 0; i < vertices; i++ {
        r := 0 + rand.Float64()*(10-0)
        vx = append(vx, r)
        vy = append(vy, r/2)
    }
    rawSize := ((16 * vertices) * size) / 1024

    StartedAtMarshal := time.Now()
    for i := 0; i < size; i++ {
        comp = append(comp, &myproto.Polygon{
            VX: vx,
            VY: vy,
        })
    }
    pfs := &myproto.CADData{
        LayerName: "Layer",
        Polygon:   comp,
    }
    data, err := proto.Marshal(pfs)
    if err != nil {
        log.Fatal("marshaling error: ", err)
    }
    EndedAtMarshal := time.Now()
    SeElaprseTime := EndedAtMarshal.Sub(StartedAtMarshal).String()

    file := "/tmp/myproto_" + strconv.Itoa(size) + ".txt"
    if err := ioutil.WriteFile(file, data, 0644); err != nil {
        log.Fatalln("Failed to write address book:", err)
    }

    StartedAt := time.Now()
    serialized := &myproto.CADData{}
    proto.Unmarshal(data, serialized)
    EndedAt := time.Now()
    DeElapseTime := EndedAt.Sub(StartedAt).String()
    fmt.Println(size, ",", vertices, ", ", SeElaprseTime, ",", DeElapseTime, ",", (len(data) / 1024), ",", rawSize)
}

func main() {
    data := []int{500000, 1000000, 1500000, 3000000, 8000000}
    for _, size := range data {
        // calculation(size, 5)
        // calculation(size, 10)
        calculation(size, 20)
    }
}

The time you give, is that for serialization, de-serialization, or both?
Your de-serialization code is likely entirely dominated by fmt.Println. Why don't you instead do sum += objVertices.X() + objVertices.Y() and print sum after timing is done? Can you pull objVertices := &MyFlat.Vertices{} outside of the loop?
You didn't post your protobuf code. Are you including in the timing the time to create the tree of objects being serialized (which is required for protobuf but not for FlatBuffers)? Similarly, are you running the timed (de-)serialization at least 1000x or so, so that the cost of GC (protobuf allocates a LOT of objects, FlatBuffers allocates few or none) is included in your comparison?
If after you do the above, it is still slower, post on the FlatBuffers github issues, the authors of the Go port may be able to help further. Make sure you post full code for both systems, and full timings.
Note generally: the design of FlatBuffers is such that it will create the biggest performance gap with Protobuf in C/C++. That said, it should still be a lot faster in Go also. There are unfortunate things about Go however that prevent it from maximizing the performance potential.
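To illustrate the first point: a minimal, self-contained sketch (deliberately not using the MyFlat generated code) of timing a hot loop with an accumulator instead of fmt.Println, printing only after the clock stops:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical stand-in for the deserialized vertices.
	data := make([]float64, 1_000_000)
	for i := range data {
		data[i] = float64(i)
	}

	start := time.Now()
	var sum float64
	for _, v := range data {
		sum += v // no I/O inside the timed region
	}
	elapsed := time.Since(start)

	// Printing the sum keeps the compiler from eliminating the loop.
	fmt.Println("sum:", sum, "elapsed:", elapsed)
}
```

With fmt.Println inside the loop, the measurement is dominated by terminal I/O rather than by FlatBuffers accessor calls.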

b := flatbuffers.NewBuilder(0)
I'm not sure what the "grows automatically" behavior is in Go for flatbuffers, but I'm pretty sure requiring the buffer to grow automatically is not the preferred pattern. Could you try doing your same timing comparison after initializing the buffer with flatbuffers.NewBuilder(moreBytesThanTheMessageNeeds)?
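The effect is easy to see with a plain byte slice, which grows the same way the builder's internal buffer does. This sketch uses only the standard library (not FlatBuffers itself) to compare growth-by-append against a single up-front allocation:

```go
package main

import (
	"fmt"
	"time"
)

// fill appends n bytes; when buf lacks capacity, append must repeatedly
// reallocate and copy, which is what NewBuilder(0) forces on the builder.
func fill(buf []byte, n int) []byte {
	for i := 0; i < n; i++ {
		buf = append(buf, byte(i))
	}
	return buf
}

func main() {
	const n = 10_000_000

	start := time.Now()
	grown := fill(make([]byte, 0), n) // grows repeatedly
	growTime := time.Since(start)

	start = time.Now()
	pre := fill(make([]byte, 0, n), n) // pre-sized once
	preTime := time.Since(start)

	fmt.Println(len(grown), len(pre), "grow:", growTime, "prealloc:", preTime)
}
```

By the same reasoning, flatbuffers.NewBuilder(estimatedMessageSize) should avoid the repeated copy-on-grow during serialization.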


What Did I Miss in Input Process?

I am solving a problem on HackerEarth. I passed all the test cases except one, which shows "Time limit exceeded". What did I miss in my code?
package main

import (
    "fmt"
    "strings"
)

func rotateRight(numbers []int, size int, k int) []int {
    new_numbers := make([]int, size)
    for index, value := range numbers {
        new_numbers[(index+k)%size] = value
    }
    return new_numbers
}

func main() {
    var test_case, size, k int
    fmt.Scanf("%v", &test_case)
    fmt.Scanln()
    for i := 0; i < test_case; i++ {
        fmt.Scanf("%v %v", &size, &k)
        fmt.Scanln()
        numbers := make([]int, size)
        for i := 0; i < size; i++ {
            fmt.Scanf("%v", &numbers[i])
        }
        result := rotateRight(numbers, size, k)
        fmt.Println(strings.Trim(fmt.Sprint(result), "[]"))
    }
}
Maybe the reason is the way you read the data; fmt is really slow. Try replacing it with:
package main

import (
    "bufio"
    "os"
)

func main() {
    sc := bufio.NewScanner(os.Stdin)
    sc.Scan()
    sc.Text() // here you have your data
}
This change will cut down the time spent reading input.

golang calculate full precision float number

Using a precision of 200 decimal places, I need to convert a number from atto (10^-18) to its decimal representation, similar to the screenshot.
To get values at nano and atto precision you can use %.9f and %.18f in fmt.Printf() respectively. I created a small program to get your value of 0.000000000000099707 as follows:
package main

import (
    "fmt"
    "math"
)

func main() {
    powr := math.Pow(10, -18)
    numb := 99707 * powr
    fmt.Println("number", numb)
    fmt.Printf("\nthe value in atto %.18f\n", numb)
}
Output:
number 9.970700000000001e-14
the value in atto 0.000000000000099707
You can use the github.com/shopspring/decimal package for this as well. This library can represent numbers with up to 2^31 (2147483648) digits. Here is a simple code to do the calculation:
d := decimal.NewFromInt(99707)
d10 := decimal.NewFromInt(10)
dpow := decimal.NewFromInt(-18)
d10pow := d10.Pow(dpow)
dmul := d.Mul(d10pow)
fmt.Println(dmul)
This can be simplified to:
d := decimal.NewFromInt(99707).Mul(decimal.NewFromInt(10).Pow(decimal.NewFromInt(-18)))
fmt.Println(d)
Output: 0.000000000000099707
See playground
I was interested in how to do this, so I found the apd package from CockroachDB, which handles arbitrary-precision calculations. You can use it like this:
package main

import (
    "fmt"

    "github.com/cockroachdb/apd"
)

func main() {
    // 99707 * 10^(-18)
    n1 := apd.New(99707, 0)
    n2 := apd.New(10, 0)
    n3 := apd.New(-18, 0)
    c := apd.BaseContext.WithPrecision(200)
    res := apd.New(0, 0)
    ctx, err := c.Pow(res, n2, n3)
    if err != nil {
        panic(err)
    }
    ctx, err = c.Mul(res, res, n1)
    if err != nil {
        panic(err)
    }
    fmt.Println(ctx.Inexact(), res.Text('f'))
}
And it will output:
false 0.000000000000099707
You will have to be careful with the loss of precision that may happen and look at the inexact field.

Threading Decasteljau algorithm in golang

I'm trying to write a threaded De Casteljau algorithm for control polygons with any set of points in Go, but I can't get the goroutines to work right: they run in a random order, and I can't manage to get all of the goroutines to finish.
Here's my code for the decasteljau.go file:
package main

import (
    "fmt"
)

type ControlPolygon struct {
    Vertices []Vertex
}

type Spline struct {
    Vertices map[int]Vertex
}

type splinePoint struct {
    index  int
    vertex Vertex
}

func (controlPolygon ControlPolygon) Decasteljau(levelOfDetail int) {
    // LevelOfDetail is the number of points in the spline
    spline := Spline{make(map[int]Vertex)}
    splinePointsChannel := make(chan splinePoint)
    for index := 1; index < levelOfDetail; index++ {
        splinePoint := splinePoint{}
        splinePoint.index = index
        pointPosition := float64(index) / float64(levelOfDetail)
        go func() {
            fmt.Println("goroutine number:", index)
            splinePoint.findSplinePoint(controlPolygon.Vertices, pointPosition)
            splinePointsChannel <- splinePoint
        }()
    }
    point := <-splinePointsChannel
    spline.Vertices[point.index] = point.vertex
    fmt.Println(spline)
}

func (point *splinePoint) findSplinePoint(vertices []Vertex, pointPosition float64) {
    var interpolationPoints []Vertex
    if len(vertices) == 1 {
        fmt.Println("vertices : ", vertices)
        point.vertex = vertices[0]
    }
    if len(vertices) > 1 {
        for i := 0; i < len(vertices)-1; i++ {
            interpolationPoint := vertices[i].GetInterpolationPoint(vertices[i+1], pointPosition)
            interpolationPoints = append(interpolationPoints, interpolationPoint)
        }
        point.findSplinePoint(interpolationPoints, pointPosition)
        fmt.Println()
    } else {
        fmt.Println("Done Detailing ..")
        return
    }
}

func main() {
    v1 := Vertex{0, 0, 0}
    v2 := Vertex{0, 0, 1}
    v3 := Vertex{0, 1, 1}
    v4 := Vertex{0, 1, 0}
    vectices := []Vertex{v1, v2, v3, v4}
    controlPolygon := ControlPolygon{vectices}
    controlPolygon.Decasteljau(10)
}
I'm also new to Go concurrency, and after a lot of research I'm still wondering whether I need buffered or unbuffered channels for my case.
I also found that goroutines are mostly discussed in the context of networking rather than optimizing 3D work, so I would love to know if I'm using a good stack for writing concurrent 3D algorithms.
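Two likely culprits in the code above: Decasteljau starts levelOfDetail-1 goroutines but receives from the channel only once, so the remaining sends block forever; and the closure captures the loop variable index, which (before Go 1.22) all goroutines share. A stripped-down sketch of both fixes, with placeholder float64 values standing in for Vertex and findSplinePoint:

```go
package main

import "fmt"

type result struct {
	index int
	value float64
}

func main() {
	const levelOfDetail = 10
	ch := make(chan result)
	for index := 1; index < levelOfDetail; index++ {
		index := index // capture a per-iteration copy (needed before Go 1.22)
		go func() {
			// stand-in for findSplinePoint
			ch <- result{index, float64(index) / levelOfDetail}
		}()
	}
	// Receive exactly once per goroutine started, not just once.
	spline := make(map[int]float64)
	for i := 1; i < levelOfDetail; i++ {
		p := <-ch
		spline[p.index] = p.value
	}
	fmt.Println(len(spline), spline[5])
}
```

An unbuffered channel is fine here; buffering it (capacity levelOfDetail-1) would only let the producers exit before the receives begin, not change the result.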

Go algorithm for looping through servers in predefined ratio

I am trying to make an algorithm that can loop through things, backend servers in my case, by a predefined ratio.
For example, I have 2 backend servers:
type server struct {
    addr    string
    ratio   float64
    counter int64
}

// s2 is a beast and may handle 3 times the requests of s1
s1 := &server{":3000", 0.25, 0}
s2 := &server{":3001", 0.75, 0}
func nextServer() {
    server := next() // simple goroutine that provides the next server between s1 and s2
    N := server.counter / i
    if float64(N) > server.ratio {
        // repeat this function
        return nextServer()
    }
    server.counter += 1
}

for i := 0; i < 1000; i++ {
    nextServer()
}
s1 ends with 250 as counter (requests handled).
s2 is huge, so it ends with 750 as counter (requests handled).
This is a very simple version of what I have, but when i is around 10000 it keeps looping inside nextServer(), because N is always > server.ratio.
As long as i is around 5000 it works perfectly, but I think there are better algorithms for looping through ratios.
How can I make this simple and solid?
Something like this?
package main

import (
    "fmt"
    "math/rand"
)

type server struct {
    addr  string
    ratio float64
}

var servers []server

func nextServer() *server {
    rndFloat := rand.Float64() // pick a random number between 0.0-1.0
    ratioSum := 0.0
    for _, srv := range servers {
        ratioSum += srv.ratio  // sum ratios of all previous servers in list
        if ratioSum >= rndFloat { // if the ratio sum rises above the random number
            return &srv // return that server
        }
    }
    return nil // should not come here
}

func main() {
    servers = []server{server{"0.25", 0.25}, server{"0.50", 0.50},
        server{"0.10", 0.10}, server{"0.15", 0.15}}
    counts := make(map[string]int, len(servers))
    for i := 0; i < 100; i++ {
        srv := nextServer()
        counts[srv.addr] += 1
    }
    fmt.Println(counts)
}
Yields for example:
map[0.50:56 0.15:15 0.25:24 0.10:5]
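If you want the 1:3 split to be exact and deterministic rather than probabilistic, the smooth weighted round-robin scheme (the algorithm nginx uses for upstream balancing) avoids both randomness and the rejection loop. A sketch, assuming integer weights in place of the float ratios:

```go
package main

import "fmt"

type server struct {
	addr            string
	weight, current int
}

// nextServer implements smooth weighted round-robin: every call bumps each
// server's current score by its weight, picks the highest score, then
// subtracts the total weight from the winner. Deterministic, no retries.
func nextServer(servers []*server) *server {
	total := 0
	var best *server
	for _, s := range servers {
		s.current += s.weight
		total += s.weight
		if best == nil || s.current > best.current {
			best = s
		}
	}
	best.current -= total
	return best
}

func main() {
	servers := []*server{
		{addr: ":3000", weight: 1}, // handles 1 share
		{addr: ":3001", weight: 3}, // the beast: 3 shares
	}
	counts := map[string]int{}
	for i := 0; i < 1000; i++ {
		counts[nextServer(servers).addr]++
	}
	fmt.Println(counts) // map[:3000:250 :3001:750]
}
```

Because the selection is stateful rather than random, the counts land on exactly 250/750 for any multiple of the weight total, no matter how large i grows.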

Looking for Go equivalent of scanf

I'm looking for the Go equivalent of scanf().
I tried with following code:
1 package main
2
3 import (
4 "scanner"
5 "os"
6 "fmt"
7 )
8
9 func main() {
10 var s scanner.Scanner
11 s.Init(os.Stdin)
12 s.Mode = scanner.ScanInts
13 tok := s.Scan()
14 for tok != scanner.EOF {
15 fmt.Printf("%d ", tok)
16 tok = s.Scan()
17 }
18 fmt.Println()
19 }
I ran it with input from a text file with a line of integers.
But it always outputs -3 -3 ...
And how do I scan a line composed of a string and some integers?
By changing the mode whenever a new data type is encountered?
The Package documentation:
Package scanner
A general-purpose scanner for UTF-8
encoded text.
But it seems that the scanner is not for general use.
Updated code:
func main() {
    n := scanf()
    fmt.Println(n)
    fmt.Println(len(n))
}

func scanf() []int {
    nums := new(vector.IntVector)
    reader := bufio.NewReader(os.Stdin)
    str, err := reader.ReadString('\n')
    for err != os.EOF {
        fields := strings.Fields(str)
        for _, f := range fields {
            i, _ := strconv.Atoi(f)
            nums.Push(i)
        }
        str, err = reader.ReadString('\n')
    }
    r := make([]int, nums.Len())
    for i := 0; i < nums.Len(); i++ {
        r[i] = nums.At(i)
    }
    return r
}
Improved version:
package main

import (
    "bufio"
    "os"
    "io"
    "fmt"
    "strings"
    "strconv"
    "container/vector"
)

func main() {
    n := fscanf(os.Stdin)
    fmt.Println(len(n), n)
}

func fscanf(in io.Reader) []int {
    var nums vector.IntVector
    reader := bufio.NewReader(in)
    str, err := reader.ReadString('\n')
    for err != os.EOF {
        fields := strings.Fields(str)
        for _, f := range fields {
            if i, err := strconv.Atoi(f); err == nil {
                nums.Push(i)
            }
        }
        str, err = reader.ReadString('\n')
    }
    return nums
}
Your updated code was much easier to compile without the line numbers, but it was missing the package and import statements.
Looking at your code, I noticed a few things. Here's my revised version of your code.
package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strconv"
    "strings"
    "container/vector"
)

func main() {
    n := scanf(os.Stdin)
    fmt.Println()
    fmt.Println(len(n), n)
}

func scanf(in io.Reader) []int {
    var nums vector.IntVector
    rd := bufio.NewReader(in)
    str, err := rd.ReadString('\n')
    for err != os.EOF {
        fields := strings.Fields(str)
        for _, f := range fields {
            if i, err := strconv.Atoi(f); err == nil {
                nums.Push(i)
            }
        }
        str, err = rd.ReadString('\n')
    }
    return nums
}
I might want to use any input file for scanf(), not just Stdin; scanf() takes an io.Reader as a parameter.
You wrote: nums := new(vector.IntVector), where type IntVector []int. The new() function allocates a zeroed IntVector (an integer slice header) and assigns a pointer to it to nums. I wrote: var nums vector.IntVector, which avoids the extra pointer indirection by simply declaring an integer slice named nums initialized to zero.
You didn't check the err value for strconv.Atoi(), which meant invalid input was converted to a zero value; I skip it.
To copy from the vector to a new slice and return the slice, you wrote:
r := make([]int, nums.Len())
for i := 0; i < nums.Len(); i++ {
r[i] = nums.At(i)
}
return r
First, I simply replaced that with an equivalent, the IntVector.Data() method: return nums.Data(). Then, I took advantage of the fact that type IntVector []int and avoided the allocation and copy by replacing that by: return nums.
Although it can be used for other things, the scanner package is designed to scan Go program text. Ints (-123), Chars('c'), Strings("str"), etc. are Go language token types.
package main

import (
    "fmt"
    "os"
    "scanner"
    "strconv"
)

func main() {
    var s scanner.Scanner
    s.Init(os.Stdin)
    s.Error = func(s *scanner.Scanner, msg string) { fmt.Println("scan error", msg) }
    s.Mode = scanner.ScanInts | scanner.ScanStrings | scanner.ScanRawStrings
    for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
        txt := s.TokenText()
        fmt.Print("token:", tok, "text:", txt)
        switch tok {
        case scanner.Int:
            si, err := strconv.Atoi64(txt)
            if err == nil {
                fmt.Print(" integer: ", si)
            }
        case scanner.String, scanner.RawString:
            fmt.Print(" string: ", txt)
        default:
            if tok >= 0 {
                fmt.Print(" unicode: ", "rune = ", tok)
            } else {
                fmt.Print(" ERROR")
            }
        }
        fmt.Println()
    }
}
This example always reads a line at a time and returns the entire line as a string. If you want to parse specific values out of it, you can.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    value := Input("Please enter a value: ")
    trimmed := strings.TrimSpace(value)
    fmt.Printf("Hello %s!\n", trimmed)
}

func Input(str string) string {
    print(str)
    reader := bufio.NewReader(os.Stdin)
    input, _ := reader.ReadString('\n')
    return input
}
In a comment to one of my answers, you said:
From the Language Specification: "When
memory is allocated to store a value,
either through a declaration or make()
or new() call, and no explicit
initialization is provided, the memory
is given a default initialization".
Then what's the point of new()?
If we run:
package main

import "fmt"

func main() {
    var i int
    var j *int
    fmt.Println("i (a value) = ", i, "; j (a pointer) = ", j)
    j = new(int)
    fmt.Println("i (a value) = ", i, "; j (a pointer) = ", j, "; *j (a value) = ", *j)
}
The declaration var i int allocates memory to store an integer value and initializes the value to zero. The declaration var j *int allocates memory to store a pointer to an integer value and initializes the pointer to zero (a nil pointer); no memory is allocated to store an integer value. We see program output similar to:
i (a value) = 0 ; j (a pointer) = <nil>
The built-in function new takes a type T and returns a value of type *T. The memory is initialized to zero values. The statement j = new(int) allocates memory to store an integer value and initializes the value to zero, then it stores a pointer to this integer value in j. We see program output similar to:
i (a value) = 0 ; j (a pointer) = 0x7fcf913a90f0 ; *j (a value) = 0
The 2010-05-27 release of Go added scanning functions to the fmt package, including Scan(), Scanln(), and the reader-based Fscan(). They don't take a format string like in C, but check the types of the arguments instead.
package main

import (
    "fmt"
    "os"
    "container/vector"
)

func main() {
    numbers := new(vector.IntVector)
    var number int
    n, err := fmt.Fscan(os.Stdin, &number)
    for n == 1 && err == nil {
        numbers.Push(number)
        n, err = fmt.Fscan(os.Stdin, &number)
    }
    fmt.Printf("%v\n", numbers.Data())
}
