Pooling Maps in Golang - go

Has anyone tried pooling maps in Go? I've read about pooling buffers, and I was wondering whether, by similar reasoning, it makes sense to pool maps when they have to be created and destroyed frequently, or whether there is any reason why, a priori, it might not be efficient. When a map is returned to the pool, one would have to iterate through it and delete all elements. However, a popular recommendation seems to be to create a new map rather than delete the entries of an already allocated map and reuse it, which makes me think that pooling maps may not be as beneficial.

If your maps change in size a lot as entries are added or deleted, this will cause new allocations and pooling them will bring no benefit.
If your maps will not change in size but only the values of the keys will change then pooling will be a successful optimization.
This will work well when you read table-like structures, for instance CSV files or database tables. Each row will contain exactly the same columns, so you don't need to clear any entry.
When run with go test -benchmem -bench . the benchmark below shows no allocations:
package mappool
import "testing"
const SIZE = 1000000
func BenchmarkMap(b *testing.B) {
m := make(map[int]int)
for i := 0; i < SIZE; i++ {
m[i] = i
}
b.ResetTimer()
for n := 0; n < b.N; n++ {
// the inner loop uses its own index so it no longer shadows the outer one
for i := 0; i < SIZE; i++ {
m[i] = m[i] + 1
}
}
}
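As a minimal sketch of the table-reading case described above: one map is allocated up front and overwritten for every row, so no entries need to be deleted between rows. The column names, the sample data, and the fillRow and collectColumn helpers are hypothetical and only serve as illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// fillRow overwrites the values in row with one CSV record.
// Every record has the same columns, so the key set never changes.
func fillRow(row map[string]string, header, record []string) {
	for i, col := range header {
		row[col] = record[i]
	}
}

// collectColumn reads CSV text and returns the values of one column,
// reusing a single map for every row instead of allocating one per row.
func collectColumn(data, column string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	header, err := r.Read()
	if err != nil {
		return nil, err
	}
	row := make(map[string]string, len(header)) // allocated once
	var out []string
	for {
		record, err := r.Read()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		fillRow(row, header, record)
		out = append(out, row[column])
	}
}

func main() {
	names, err := collectColumn("id,name\n1,ann\n2,bob\n", "name")
	fmt.Println(names, err) // [ann bob] <nil>
}
```

The csv.Reader rejects records whose field count differs from the first record by default, which is what makes indexing record[i] safe here.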

As @Grzegorz Żur says, if your maps don't change in size very much, then pooling is helpful. To test this, I made a benchmark where pooling wins out. The output on my machine is:
Pool time: 115.977µs
No-pool time: 160.828µs
Benchmark code:
package main
import (
"fmt"
"math/rand"
"time"
)
const BenchIters = 1000
func main() {
pool := map[int]int{}
poolTime := benchmark(func() {
useMapForSomething(pool)
// Return to pool by clearing the map.
for key := range pool {
delete(pool, key)
}
})
nopoolTime := benchmark(func() {
useMapForSomething(map[int]int{})
})
fmt.Println("Pool time:", poolTime)
fmt.Println("No-pool time:", nopoolTime)
}
func useMapForSomething(m map[int]int) {
for i := 0; i < 1000; i++ {
m[rand.Intn(300)] += 5
}
}
// benchmark measures how long f takes, on average.
func benchmark(f func()) time.Duration {
start := time.Now()
for i := 0; i < BenchIters; i++ {
f()
}
return time.Since(start) / BenchIters
}
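The benchmark above reuses one map directly. If several goroutines need pooled maps, one possible approach is sync.Pool from the standard library. The sketch below is an assumption rather than something measured in this thread: the getMap, putMap and distinctCount names are made up for illustration, and sync.Pool is free to drop pooled maps during garbage collection, so it only reduces allocations on a best-effort basis.

```go
package main

import (
	"fmt"
	"sync"
)

// mapPool hands out map[int]int values and takes them back for reuse.
var mapPool = sync.Pool{
	New: func() interface{} { return make(map[int]int) },
}

// getMap returns an empty map, either recycled or newly allocated.
func getMap() map[int]int {
	return mapPool.Get().(map[int]int)
}

// putMap deletes all entries of m and returns it to the pool.
func putMap(m map[int]int) {
	for k := range m {
		delete(m, k)
	}
	mapPool.Put(m)
}

// distinctCount uses a pooled map as scratch space.
func distinctCount(nums []int) int {
	m := getMap()
	defer putMap(m) // runs after len(m) has been evaluated
	for _, n := range nums {
		m[n]++
	}
	return len(m)
}

func main() {
	fmt.Println(distinctCount([]int{1, 2, 2, 3})) // 3
	fmt.Println(distinctCount([]int{5, 5}))       // 1: the recycled map was cleared
}
```

Whether this beats plain allocation depends on how much the map sizes vary, as discussed above, so it is worth benchmarking with your own workload.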

Related

Prometheus counters: How to get current value with golang client?

I am using counters to count the number of requests. Is there any way to get current value of a prometheus counter?
My aim is to reuse existing counter without allocating another variable.
Golang prometheus client version is 1.1.0.
It's easy: write a function that fetches the Prometheus counter value:
import (
"github.com/prometheus/client_golang/prometheus"
dto "github.com/prometheus/client_model/go"
"github.com/prometheus/common/log"
)
func GetCounterValue(metric *prometheus.CounterVec) float64 {
var m = &dto.Metric{}
if err := metric.WithLabelValues("label1", "label2").Write(m); err != nil {
log.Error(err)
return 0
}
return m.Counter.GetValue()
}
Currently there is no way to get the value of a counter in the official Golang implementation.
You can also avoid double counting by incrementing your own counter and using a CounterFunc to collect it.
Note: use an integral type and atomic operations to avoid concurrent access issues.
// declare the counter as unsigned int
var requestsCounter uint64 = 0
// register counter in Prometheus collector
prometheus.MustRegister(prometheus.NewCounterFunc(
prometheus.CounterOpts{
Name: "requests_total",
Help: "Counts number of requests",
},
func() float64 {
return float64(atomic.LoadUint64(&requestsCounter))
}))
// somewhere in your code
atomic.AddUint64(&requestsCounter, 1)
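To show the same pattern in a runnable form, here is a sketch that uses only the standard library and leaves the Prometheus registration out. The handleRequest and currentRequests names are hypothetical; currentRequests has the func() float64 shape that the CounterFunc callback above uses, and it can also be called directly whenever the application needs the current value.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// requestsCounter is owned by the application, not by the metrics library.
var requestsCounter uint64

// handleRequest stands in for whatever code handles a request.
func handleRequest() {
	atomic.AddUint64(&requestsCounter, 1)
}

// currentRequests reads the counter atomically and converts it to float64.
func currentRequests() float64 {
	return float64(atomic.LoadUint64(&requestsCounter))
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			handleRequest()
		}()
	}
	wg.Wait()
	fmt.Println(currentRequests()) // 100
}
```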
It is possible to read the value of a counter (or any metric) in the official Golang implementation. I'm not sure when it was added.
This works for me for a simple metric with no vector:
func getMetricValue(col prometheus.Collector) float64 {
c := make(chan prometheus.Metric, 1) // 1 for metric with no vector
col.Collect(c) // collect current metric value into the channel
m := dto.Metric{}
_ = (<-c).Write(&m) // read metric value from the channel
return *m.Counter.Value
}
Update: here's a more general version that works with vectors and on histograms...
// GetMetricValue returns the sum of the Counter metrics associated with the Collector
// e.g. the metric for a non-vector, or the sum of the metrics for vector labels.
// If the metric is a Histogram then number of samples is used.
func GetMetricValue(col prometheus.Collector) float64 {
var total float64
collect(col, func(m dto.Metric) {
if h := m.GetHistogram(); h != nil {
total += float64(h.GetSampleCount())
} else {
total += m.GetCounter().GetValue()
}
})
return total
}
// collect calls the function for each metric associated with the Collector
func collect(col prometheus.Collector, do func(dto.Metric)) {
c := make(chan prometheus.Metric)
go func(c chan prometheus.Metric) {
col.Collect(c)
close(c)
}(c)
for x := range c { // eg range across distinct label vector values
m := dto.Metric{}
_ = x.Write(&m)
do(m)
}
}
While it is possible to obtain counter values in github.com/prometheus/client_golang, as pointed out in this answer, it looks too complicated. This can be greatly simplified by using an alternative library for exporting Prometheus metrics, github.com/VictoriaMetrics/metrics:
import (
"github.com/VictoriaMetrics/metrics"
)
var requestsTotal = metrics.NewCounter(`http_requests_total`)
//...
func getRequestsTotal() uint64 {
return requestsTotal.Get()
}
I.e., just call the Get() method on the needed counter.

golang client side load balancer when not serving http

As a Go n00b, I have a Go program that reads messages from Kafka, modifies them, then posts them to one of the HTTP endpoints in a list.
As of now we do some really basic load balancing by picking an endpoint at random:
cur := rand.Int() % len(httpEndpointList)
I'd like to improve that and add weight to the endpoints based on their response time or something similar.
I've looked into libraries, but all I seem to find are written to be used as middleware with http.Handle. For example, see the oxy lib roundrobin.
In my case I do not serve HTTP requests per se.
Any ideas how I could accomplish that sort of more advanced client-side load balancing in my Go program?
I'd like to avoid adding yet another HAProxy or similar to my environment.
There is a very simple algorithm for weighted random selection:
package main
import (
"fmt"
"math/rand"
)
type Endpoint struct {
URL string
Weight int
}
func RandomWeightedSelector(endpoints []Endpoint) Endpoint {
// this first loop should be optimised so it only gets computed once
max := 0
for _, endpoint := range endpoints {
max = max + endpoint.Weight
}
r := rand.Intn(max)
for _, endpoint := range endpoints {
if r < endpoint.Weight {
return endpoint
} else {
r = r - endpoint.Weight
}
}
// should never get to this point because r is smaller than max
return Endpoint{}
}
func main() {
endpoints := []Endpoint{
{Weight: 1, URL: "https://web1.example.com"},
{Weight: 2, URL: "https://web2.example.com"},
}
count1 := 0
count2 := 0
for i := 0; i < 100; i++ {
switch RandomWeightedSelector(endpoints).URL {
case "https://web1.example.com":
count1++
case "https://web2.example.com":
count2++
}
}
fmt.Println("Times web1: ", count1)
fmt.Println("Times web2: ", count2)
}
It can be optimized; this is the most naive version. For production you should definitely not calculate max every time, but apart from that, this is basically the solution.
Here is a more professional, object-oriented version that does not recompute max every time:
package main
import (
"fmt"
"math/rand"
)
type Endpoint struct {
URL string
Weight int
}
type RandomWeightedSelector struct {
max int
endpoints []Endpoint
}
func (rws *RandomWeightedSelector) AddEndpoint(endpoint Endpoint) {
rws.endpoints = append(rws.endpoints, endpoint)
rws.max += endpoint.Weight
}
func (rws *RandomWeightedSelector) Select() Endpoint {
r := rand.Intn(rws.max)
for _, endpoint := range rws.endpoints {
if r < endpoint.Weight {
return endpoint
} else {
r = r - endpoint.Weight
}
}
// should never get to this point because r is smaller than max
return Endpoint{}
}
func main() {
var rws RandomWeightedSelector
rws.AddEndpoint(Endpoint{Weight: 1, URL: "https://web1.example.com"})
rws.AddEndpoint(Endpoint{Weight: 2, URL: "https://web2.example.com"})
count1 := 0
count2 := 0
for i := 0; i < 100; i++ {
switch rws.Select().URL {
case "https://web1.example.com":
count1++
case "https://web2.example.com":
count2++
}
}
fmt.Println("Times web1: ", count1)
fmt.Println("Times web2: ", count2)
}
For updating the weights based on a metric like endpoint latency, I would create a separate object that uses these metrics to update the weights in the RandomWeightedSelector object. Implementing it all in one type would, I think, go against the single responsibility principle.

Go Profiling - Wrong file

I'm profiling in Go using github.com/pkg/profile. It creates the profile file when I run my code, but the output seems to come from the example page code rather than from mine. How can I make it profile my own code?
Thanks in advance.
Code:
package main
import (
"fmt"
"github.com/pkg/profile"
"time"
)
func main() {
defer profile.Start(profile.MemProfile).Stop()
var inicio = time.Now().UnixNano()
var text = "Olá Mundo!"
fmt.Println(text)
var fim = time.Now().UnixNano()
fmt.Println(fim - inicio)
}
Return:
You can change your profile output path to your current working directory:
profile.ProfilePath(path)
If you are unable to retrieve any samples, it probably means your MemProfileRate is not small enough to capture small allocations.
If you are allocating small amounts of memory, set MemProfileRate to a lower value. If you are allocating large amounts of memory, keep the default. If you think you are capturing too many minor allocations, increase MemProfileRate.
profile.MemProfileRate(100)
One thing you shouldn't forget when using the profile package is that the Stop call should be deferred:
defer profile.Start(xxx).Stop()
Here is the complete program.
package main
import (
"os"
"github.com/pkg/profile"
)
func main() {
dir, _ := os.Getwd()
defer profile.Start(profile.MemProfile, profile.MemProfileRate(100), profile.ProfilePath(dir)).Stop()
//decrease mem profile rate for capturing more samples
for i := 0; i < 10000; i++ {
tmp := make([]byte, 100000)
tmp[0] = tmp[1] << 0 //fake workload
}
}
You can also set the profile path to have the profile output written to your current working directory.

initializing a struct containing a slice of structs in golang

I have a struct that I want to initialize with a slice of structs in golang, but I'm trying to figure out if there is a more efficient version of appending every newly generated struct to the slice:
package main
import (
"fmt"
"math/rand"
)
type LuckyNumber struct {
number int
}
type Person struct {
lucky_numbers []LuckyNumber
}
func main() {
count_of_lucky_nums := 10
// START OF SECTION I WANT TO OPTIMIZE
var tmp []LuckyNumber
for i := 0; i < count_of_lucky_nums; i++ {
tmp = append(tmp, LuckyNumber{rand.Intn(100)})
}
a := Person{tmp}
// END OF SECTION I WANT TO OPTIMIZE
fmt.Println(a)
}
You can use make() to allocate the slice in "full-size", and then use a for range to iterate over it and fill the numbers:
tmp := make([]LuckyNumber, 10)
for i := range tmp {
tmp[i].number = rand.Intn(100)
}
a := Person{tmp}
fmt.Println(a)
Try it on the Go Playground.
Note that inside the for loop I did not create new "instances" of the LuckyNumber struct, because the slice already contains them: it is a slice of struct values, not a slice of pointers. So inside the loop all we need to do is use the struct value designated by the index expression tmp[i].
You can use make() the way icza proposes, but you can also use it this way:
tmp := make([]LuckyNumber, 0, countOfLuckyNums)
for i := 0; i < countOfLuckyNums; i++ {
tmp = append(tmp, LuckyNumber{rand.Intn(100)})
}
a := Person{tmp}
fmt.Println(a)
This way, you don't have to allocate memory for tmp several times: you do it just once, when calling make. But unlike make([]LuckyNumber, countOfLuckyNums), tmp here starts with length 0, so it only ever contains the values you appended, never zero-valued placeholders. Depending on your code, that might make a difference or not.
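The difference between the two make() forms shows up in len and cap, and in what append does afterwards. A small sketch, reusing the LuckyNumber type from the question (the values 3 and 7 are arbitrary):

```go
package main

import "fmt"

type LuckyNumber struct {
	number int
}

func main() {
	zeroed := make([]LuckyNumber, 3)   // length 3: three zero-valued elements
	empty := make([]LuckyNumber, 0, 3) // length 0: only capacity is reserved

	fmt.Println(len(zeroed), cap(zeroed)) // 3 3
	fmt.Println(len(empty), cap(empty))   // 0 3

	// append always adds after the current length, so appending to the
	// zero-filled slice keeps the three placeholders in front.
	zeroed = append(zeroed, LuckyNumber{7})
	empty = append(empty, LuckyNumber{7})

	fmt.Println(zeroed) // [{0} {0} {0} {7}]
	fmt.Println(empty)  // [{7}]
}
```

So with make([]LuckyNumber, n) you fill by index, and with make([]LuckyNumber, 0, n) you fill by append. Mixing the two is a common source of unexpected zero values.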

Go how to properly use the for ... range loop

At the moment I have a go program that contains the following code.
package main
import "time"
import "minions/minion"
func main() {
// creating the slice
ms := make([]*minion.Minion, 2)
//populating the slice and make the elements start doing something
for i := range ms {
m := &ms[i]
*m = minion.NewMinion()
(*m).Start()
}
// wait while the minions do all the work
time.Sleep(time.Millisecond * 500)
// make the elements of the slice stop with what they were doing
for i := range ms {
m := &ms[i]
(*m).Stop()
}
}
Here NewMinion() is a constructor that returns a *minion.Minion
The code works perfectly, but having to write m := &ms[i] every time I use a for ... range loop seems to me like there should be a code writer friendlier way to tackle this problem.
Ideally I'd like something like the following to be possible (using the made up &range tag):
package main
import "time"
import "minions/minion"
func main() {
// creating the slice
ms := make([]*minion.Minion, 2)
//populating the slice and make the elements start doing something
for _, m := &range ms {
*m = minion.NewMinion()
(*m).Start()
}
// wait while the minions do all the work
time.Sleep(time.Millisecond * 500)
// make the elements of the slice stop with what they were doing
for _, m := &range ms {
(*m).Stop()
}
}
Unfortunately, this is not a language feature as of yet. Any suggestions on the nicest way to remove the m := &ms[i] from the code? Or is there no way yet that takes less effort to write than this?
Your first example is a slice of pointers, so you don't need to take the address of the pointers in the slice and then dereference them each time. More idiomatic Go would look like this (edited slightly to run in the playground without the "minion" package):
http://play.golang.org/p/88WsCVonaL
// creating the slice
ms := make([]*Minion, 2)
//populating the slice and make the elements start doing something
for i := range ms {
ms[i] = NewMinion(i)
ms[i].Start()
// (or equivalently)
// m := NewMinion(i)
// m.Start()
// ms[i] = m
}
// wait while the minions do all the work
time.Sleep(time.Millisecond * 500)
// make the elements of the slice stop with what they were doing
for _, m := range ms {
m.Stop()
}
This is all wrong.
There is absolutely no need to take the address of a pointer in your code. ms is a slice of pointers and your constructor returns a pointer, so just assign it directly:
for i := range ms {
ms[i] = minion.NewMinion()
ms[i].Start()
}
Dead simple.
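To illustrate why the plain range loop is enough for a slice of pointers, and when indexing is still needed, here is a small self-contained sketch. The Minion type below is a stand-in for the asker's minion.Minion, and the three helper functions are hypothetical.

```go
package main

import "fmt"

// Minion is a stand-in for the asker's minion.Minion type.
type Minion struct {
	running bool
}

func (m *Minion) Start() { m.running = true }
func (m *Minion) Stop()  { m.running = false }

// startPointers ranges over a slice of pointers. The loop variable is a copy
// of the pointer, which still points at the same Minion as the slice element.
func startPointers(ms []*Minion) {
	for _, m := range ms {
		m.Start()
	}
}

// startCopies ranges over a slice of values. The loop variable is a copy of
// the element, so the Minions stored in the slice are left unchanged.
func startCopies(ms []Minion) {
	for _, m := range ms {
		m.Start()
	}
}

// startIndexed modifies the elements of a slice of values in place.
func startIndexed(ms []Minion) {
	for i := range ms {
		ms[i].Start()
	}
}

func main() {
	ptrs := []*Minion{&Minion{}, &Minion{}}
	startPointers(ptrs)
	fmt.Println(ptrs[0].running, ptrs[1].running) // true true

	vals := make([]Minion, 2)
	startCopies(vals)
	fmt.Println(vals[0].running) // false
	startIndexed(vals)
	fmt.Println(vals[0].running) // true
}
```

Assigning a new pointer to a slot, as in the population loop of the question, still has to go through ms[i], because the range variable is only a copy of the slot.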
