As a golang n00b, I have a Go program that reads messages from Kafka, modifies them, then posts them to one of the HTTP endpoints in a list.
As of now we do some really basic "round robin" by picking a random endpoint:
cur := rand.Int() % len(httpEndpointList)
I'd like to improve that and add weight to the endpoints based on their response time or something similar.
I've looked into libraries, but all I seem to find are written to be used as middleware with http.Handle. For example, see the oxy lib's roundrobin.
In my case I do not serve HTTP requests, per se.
Any ideas how I could accomplish this sort of more advanced client-side load balancing in my Go program?
I'd like to avoid adding yet another HAProxy or similar to my environment.
There is a very simple algorithm for weighted random selection:
package main

import (
    "fmt"
    "math/rand"
)

type Endpoint struct {
    URL    string
    Weight int
}

func RandomWeightedSelector(endpoints []Endpoint) Endpoint {
    // Sum the weights. This should be computed once, not on every call;
    // see the optimized version below.
    max := 0
    for _, endpoint := range endpoints {
        max = max + endpoint.Weight
    }
    // Pick r in [0, max) and walk the endpoints until r falls inside
    // one of their weight ranges.
    r := rand.Intn(max)
    for _, endpoint := range endpoints {
        if r < endpoint.Weight {
            return endpoint
        }
        r = r - endpoint.Weight
    }
    // Unreachable: r is always smaller than max.
    return Endpoint{}
}

func main() {
    endpoints := []Endpoint{
        {Weight: 1, URL: "https://web1.example.com"},
        {Weight: 2, URL: "https://web2.example.com"},
    }
    count1 := 0
    count2 := 0
    for i := 0; i < 100; i++ {
        switch RandomWeightedSelector(endpoints).URL {
        case "https://web1.example.com":
            count1++
        case "https://web2.example.com":
            count2++
        }
    }
    fmt.Println("Times web1: ", count1)
    fmt.Println("Times web2: ", count2)
}
It can be optimized; this is the most naive version. For production you definitely should not calculate max every time, but apart from that, this is basically the solution.
Here is a more professional, object-oriented version that does not recompute max every time:
package main

import (
    "fmt"
    "math/rand"
)

type Endpoint struct {
    URL    string
    Weight int
}

type RandomWeightedSelector struct {
    max       int
    endpoints []Endpoint
}

func (rws *RandomWeightedSelector) AddEndpoint(endpoint Endpoint) {
    rws.endpoints = append(rws.endpoints, endpoint)
    rws.max += endpoint.Weight
}

func (rws *RandomWeightedSelector) Select() Endpoint {
    r := rand.Intn(rws.max)
    for _, endpoint := range rws.endpoints {
        if r < endpoint.Weight {
            return endpoint
        }
        r = r - endpoint.Weight
    }
    // Unreachable: r is always smaller than rws.max.
    return Endpoint{}
}

func main() {
    var rws RandomWeightedSelector
    rws.AddEndpoint(Endpoint{Weight: 1, URL: "https://web1.example.com"})
    rws.AddEndpoint(Endpoint{Weight: 2, URL: "https://web2.example.com"})
    count1 := 0
    count2 := 0
    for i := 0; i < 100; i++ {
        switch rws.Select().URL {
        case "https://web1.example.com":
            count1++
        case "https://web2.example.com":
            count2++
        }
    }
    fmt.Println("Times web1: ", count1)
    fmt.Println("Times web2: ", count2)
}
For the part about updating weights based on a metric like endpoint latency, I would create a separate object that uses those metrics to update the weights in the RandomWeightedSelector. Implementing it all in one place would go against the single responsibility principle.
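For illustration, here is a minimal sketch of what such an updater could look like. Everything here is hypothetical (LatencyTracker, Observe, and Rebuild are made-up names); it assumes the Endpoint and RandomWeightedSelector types from above plus the sync and time imports:

type LatencyTracker struct {
    mu        sync.Mutex
    latencies map[string]time.Duration // URL -> moving average of response time
}

func NewLatencyTracker() *LatencyTracker {
    return &LatencyTracker{latencies: make(map[string]time.Duration)}
}

func (lt *LatencyTracker) Observe(url string, d time.Duration) {
    lt.mu.Lock()
    defer lt.mu.Unlock()
    // crude moving average: new = (old + sample) / 2
    lt.latencies[url] = (lt.latencies[url] + d) / 2
}

// Rebuild returns a fresh selector whose weights are roughly inversely
// proportional to latency, so faster endpoints receive more traffic.
func (lt *LatencyTracker) Rebuild() *RandomWeightedSelector {
    lt.mu.Lock()
    defer lt.mu.Unlock()
    var rws RandomWeightedSelector
    for url, lat := range lt.latencies {
        w := int(time.Second / (lat + time.Millisecond)) // +1ms avoids division by zero
        if w < 1 {
            w = 1
        }
        rws.AddEndpoint(Endpoint{URL: url, Weight: w})
    }
    return &rws
}

The caller would Observe each response time and periodically swap in the result of Rebuild, keeping measurement and selection in separate objects.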
I would like to loop through a slice of structs and populate a struct field (which is a map) by passing each struct to a function.
I have the struct below:
type thing struct {
    topicThing map[string]int
}
and I have the functions below:
func main() {
    ths := make([]thing, 0)
    for i := 0; i < 10; i++ {
        var th thing
        ths = append(ths, th)
    }
    for _, th := range ths {
        dothing(&th)
    }
    for _, th := range ths {
        fmt.Println(th.topicThing)
    }
}

func dothing(th *thing) {
    tc := make(map[string]int)
    tc["Hello"] = 1
    tc["Bye"] = 2
    th.topicThing = tc
}
The main function creates a slice of things (referred to as ths) and passes each thing to the dothing() function by iterating over them.
Within dothing(), I create a new map, populate it with data, and assign it to the passed-in thing's attribute. However, by the time we iterate over ths in the main function to print each thing's topicThing, the map is empty.
Since make() allocates objects on the heap, I was hoping the map would be accessible even outside of the function scope. Can anyone tell me why this is happening?
P.S.
If I change the dothing() function as below:
func dothing(th *thing) {
    th.topicThing["Hello"] = 1
    th.topicThing["Bye"] = 2
}
The code works as expected, meaning the map is populated with data when accessed in the main function.
The range clause copies your object.
So when you do this:
for _, th := range ths {
    dothing(&th)
}
you are actually calling dothing on a copy.
For example, with this main:
func main() {
    ths := make([]thing, 0)
    for i := 0; i < 10; i++ {
        var th thing
        ths = append(ths, th)
    }
    for _, th := range ths {
        dothing(&th)
        fmt.Println(th.topicThing)
    }
}
it will print the right thing, since within the loop body we are still working on the same copy that dothing() modified.
In order to avoid the copy, range over the index:
for idx := range ths {
    dothing(&ths[idx])
}
I'm using compute.NewRegionInstanceGroupManagersService's ListManagedInstances call, which returns ManagedInstances.
ManagedInstance has a field Instance which is an instance URL, like https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-b/instances/instance-group-z0hf
Now I would like to get more details about this particular instance, so I use InstancesService's Get call, whose signature looks like this:
func (r *InstancesService) Get(project string, zone string, instance string) *InstancesGetCall
What's the best way to parse the instance URL (see above) into its project, zone, and instance parts? Or is there another method that accepts the instance URL directly?
You could do something like this: parse the URL to get its path, split the path on slashes, then iterate over the parts, locating the static strings and assigning the value that follows each one appropriately.
package main

import (
    "fmt"
    "net/url"
    "strings"
)

func main() {
    s := "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-b/instances/instance-group-z0hf"
    u, err := url.Parse(s)
    if err != nil {
        panic(err)
    }
    parts := strings.Split(u.Path, "/")
    var project string
    var zone string
    var inst string
    for i := 0; i < len(parts); i++ {
        if parts[i] == "projects" && i+1 < len(parts) {
            project = parts[i+1]
            i++
        } else if parts[i] == "zones" && i+1 < len(parts) {
            zone = parts[i+1]
            i++
        } else if parts[i] == "instances" && i+1 < len(parts) {
            inst = parts[i+1]
            i++
        }
    }
    fmt.Println(project, zone, inst)
}
// Output:
// my-project us-central1-b instance-group-z0hf
Alternatively, use the route engine from gorilla to define a pattern, apply the route to the URL path, and collect the output. But that is more complex and probably not justified here.
URLs are complex animals; the best way is to parse them with url.Parse from the standard library. Then you can use a regex or a split to extract the data you need from the path part.
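For example, a regex-based sketch (assuming instance URLs always follow the projects/<p>/zones/<z>/instances/<i> layout shown in the question):

package main

import (
    "fmt"
    "regexp"
)

var instanceRe = regexp.MustCompile(`projects/([^/]+)/zones/([^/]+)/instances/([^/]+)`)

func main() {
    s := "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-b/instances/instance-group-z0hf"
    if m := instanceRe.FindStringSubmatch(s); m != nil {
        project, zone, inst := m[1], m[2], m[3]
        fmt.Println(project, zone, inst) // my-project us-central1-b instance-group-z0hf
    }
}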
I am using counters to count the number of requests. Is there any way to get current value of a prometheus counter?
My aim is to reuse the existing counter without allocating another variable.
The Golang Prometheus client version is 1.1.0.
It's easy: write a function to fetch the Prometheus counter value.
import (
    "github.com/prometheus/client_golang/prometheus"
    dto "github.com/prometheus/client_model/go"
    "github.com/prometheus/common/log"
)

func GetCounterValue(metric *prometheus.CounterVec) float64 {
    var m = &dto.Metric{}
    // Write serializes the counter for the given label values into m.
    if err := metric.WithLabelValues("label1", "label2").Write(m); err != nil {
        log.Error(err)
        return 0
    }
    return m.Counter.GetValue()
}
Currently there is no way to get the value of a counter in the official Golang implementation.
You can also avoid double counting by incrementing your own counter and using a CounterFunc to collect it.
Note: use an integral type and the sync/atomic package to avoid concurrent access issues.
// assumes "github.com/prometheus/client_golang/prometheus" and "sync/atomic" are imported

// declare the counter as an unsigned int
var requestsCounter uint64 = 0

// register the counter in the Prometheus collector
prometheus.MustRegister(prometheus.NewCounterFunc(
    prometheus.CounterOpts{
        Name: "requests_total",
        Help: "Counts number of requests",
    },
    func() float64 {
        return float64(atomic.LoadUint64(&requestsCounter))
    }))

// somewhere in your code
atomic.AddUint64(&requestsCounter, 1)
It is possible to read the value of a counter (or any metric) in the official Golang implementation. I'm not sure when it was added.
This works for me for a simple metric with no vector:
func getMetricValue(col prometheus.Collector) float64 {
    c := make(chan prometheus.Metric, 1) // 1 for metric with no vector
    col.Collect(c)                       // collect current metric value into the channel
    m := dto.Metric{}
    _ = (<-c).Write(&m) // read metric value from the channel
    return *m.Counter.Value
}
Update: here's a more general version that works with vectors and histograms...
// GetMetricValue returns the sum of the Counter metrics associated with the Collector,
// e.g. the metric for a non-vector, or the sum of the metrics for vector labels.
// If the metric is a Histogram then the number of samples is used.
func GetMetricValue(col prometheus.Collector) float64 {
    var total float64
    collect(col, func(m dto.Metric) {
        if h := m.GetHistogram(); h != nil {
            total += float64(h.GetSampleCount())
        } else {
            total += m.GetCounter().GetValue()
        }
    })
    return total
}

// collect calls the function for each metric associated with the Collector
func collect(col prometheus.Collector, do func(dto.Metric)) {
    c := make(chan prometheus.Metric)
    go func(c chan prometheus.Metric) {
        col.Collect(c)
        close(c)
    }(c)
    for x := range c { // e.g. range across distinct label vector values
        m := dto.Metric{}
        _ = x.Write(&m)
        do(m)
    }
}
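A quick usage sketch (assuming a plain Counter; a vector or Histogram goes through the same function):

c := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "jobs_done_total",
    Help: "Completed jobs.",
})
c.Add(3)
fmt.Println(GetMetricValue(c)) // 3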
While it is possible to obtain counter values in github.com/prometheus/client_golang as shown in this answer, it looks too complicated. This can be greatly simplified by using an alternative library for exporting Prometheus metrics - github.com/VictoriaMetrics/metrics:
import (
    "github.com/VictoriaMetrics/metrics"
)

var requestsTotal = metrics.NewCounter(`http_requests_total`)

//...

func getRequestsTotal() uint64 {
    return requestsTotal.Get()
}
That is, just call the Get() method on the counter you need.
I'm trying to figure out whether there's a way to get the current character position of a tag using the golang.org/x/net/html tokenizer library.
Simplified, the code looks like this:
func LookForForm(body string) {
    reader := strings.NewReader(body)
    tokenizer := html.NewTokenizer(reader)
    idx := 0
    lastIdx := 0
    for {
        token := tokenizer.Next()
        lastIdx = idx
        idx = int(reader.Size()) - int(reader.Len())
        switch token {
        case html.ErrorToken:
            return
        case html.StartTagToken:
            t := tokenizer.Token()
            tagName := strings.ToLower(t.Data)
            if tagName == "form" {
                fmt.Printf("found form at %d\n", lastIdx)
                return
            }
        }
    }
}
This doesn't work (I think) because the reader is not read character by character but in chunks, so my Size - Len calculation is invalid. The tokenizer maintains two private span structs (https://github.com/golang/net/blob/master/html/token.go, line 147), but I am unaware of how to access them.
One possible solution that just occurred to me is to make a reader that only reads a single character at a time, so my Size and Len calculations are always correct. But that seems like a hack, and any suggestions would be appreciated.
You might be able to accomplish what you are trying to do (not what you want) with careful arithmetic using the Tokenizer's Buffered method, which returns the slice of bytes currently in the buffer that have not yet been tokenized. But I don't think you will get what you want, as <div><form></form></div> would probably buffer the whole string before giving you the first div token. In that case the size of the buffered content does not help in calculating the position.
Tokenizing a markup language with nested structure will almost always require buffering the input. The private span attribute would be of little use anyway, since it is only an offset into the tokenizer's buffer, not an absolute position in the reader.
Since the html Tokenizer does not provide an API for the raw position of a tag in the original data, to get what you want I would probably just do a strings.Index or bytes.Index on the raw bytes of the token to find its position:
strings.Index(body, string(tokenizer.Raw()))
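One caveat: if the same raw bytes appear more than once, Index always returns the first occurrence. A sketch that tracks a search offset to disambiguate, assuming body is the exact string fed to the tokenizer:

offset := 0
for {
    tt := tokenizer.Next()
    if tt == html.ErrorToken {
        break
    }
    raw := string(tokenizer.Raw())
    // search only past the previous match so repeated tokens
    // resolve to successive positions
    if i := strings.Index(body[offset:], raw); i >= 0 {
        fmt.Printf("%v token at %d\n", tt, offset+i)
        offset += i + len(raw)
    }
}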
A non-buffering reader ended up working OK for me. The implementation of the reader looks something like this:
package rule

import (
    "io"
)

type Reader struct {
    s        string
    i        int64
    z        int64
    prevRune int64 // index of the previously read rune or -1
}

// NewReader wraps s in a Reader positioned at the start of the string.
func NewReader(s string) *Reader {
    return &Reader{s: s, z: int64(len(s))}
}

func (r *Reader) String() string {
    return r.s
}

func (r *Reader) Len() int {
    if r.i >= r.z {
        return 0
    }
    return int(r.z - r.i)
}

func (r *Reader) Size() int64 {
    return r.z
}

func (r *Reader) Pos() int64 {
    return r.i
}

// Read returns at most one byte per call, so the tokenizer can never
// buffer ahead of the position we report.
func (r *Reader) Read(b []byte) (int, error) {
    if r.i >= r.z {
        return 0, io.EOF
    }
    if len(b) == 0 {
        return 0, nil
    }
    r.prevRune = -1
    b[0] = r.s[r.i]
    r.i += 1
    return 1, nil
}
Then the position calculation in the tokenizer loop becomes fairly easy:
reader := NewReader(body)
tokenizer := html.NewTokenizer(reader)
lastIdx := 0 // offset at which the current token starts
tokenLoop:
for {
    token := tokenizer.Next()
    switch token {
    case html.ErrorToken:
        break tokenLoop
    case html.EndTagToken, html.TextToken, html.CommentToken, html.SelfClosingTagToken:
        // the next token starts where this one ended
        lastIdx = int(reader.Pos())
    case html.StartTagToken:
        t := tokenizer.Token()
        tagName := strings.ToLower(t.Data)
        if tagName == "form" {
            fmt.Printf("found form at %d\n", lastIdx)
            return
        }
        lastIdx = int(reader.Pos())
    }
}
I was curious if anyone has tried to pool maps in Go before. I've read about pooling buffers, and I was wondering whether, by similar reasoning, it could make sense to pool maps if one has to create and destroy them frequently, or whether there is any reason why, a priori, it might not be efficient. When a map is returned to the pool, one would have to iterate through it and delete all elements; but a popular recommendation seems to be to create a new map instead of deleting the entries of an already-allocated map and reusing it, which makes me think that pooling maps may not be that beneficial.
If your maps change (a lot) in size by deleting or adding entries, this will cause new allocations and there will be no benefit from pooling them.
If your maps will not change in size, but only the values of the keys will change, then pooling will be a successful optimization.
This works well when you read table-like structures, for instance CSV files or database tables. Each row will contain exactly the same columns, so you don't need to clear any entry.
The benchmark below shows no allocations when run with go test -benchmem -bench .:
package mappool

import "testing"

const SIZE = 1000000

func BenchmarkMap(b *testing.B) {
    // populate the map once; the benchmark loop only overwrites values
    m := make(map[int]int)
    for i := 0; i < SIZE; i++ {
        m[i] = i
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for i := 0; i < SIZE; i++ {
            m[i] = m[i] + 1
        }
    }
}
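In practice, reuse like this would typically go through sync.Pool rather than a hand-rolled free list. A minimal sketch under the same assumption (every user overwrites the same fixed column set, so no clearing is needed):

package main

import (
    "fmt"
    "sync"
)

var rowPool = sync.Pool{
    New: func() interface{} { return make(map[string]int, 3) },
}

func main() {
    m := rowPool.Get().(map[string]int)
    // every "row" writes the same keys, so stale values are overwritten
    m["id"], m["count"], m["flags"] = 1, 42, 0
    fmt.Println(m)
    rowPool.Put(m) // hand the map back for reuse instead of discarding it
}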
As @Grzegorz Żur says, if your maps don't change in size very much, then pooling is helpful. To test this, I made a benchmark where pooling wins out. The output on my machine is:
Pool time: 115.977µs
No-pool time: 160.828µs
Benchmark code:
package main

import (
    "fmt"
    "math/rand"
    "time"
)

const BenchIters = 1000

func main() {
    pool := map[int]int{}
    poolTime := benchmark(func() {
        useMapForSomething(pool)
        // Return to pool by clearing the map.
        for key := range pool {
            delete(pool, key)
        }
    })
    nopoolTime := benchmark(func() {
        useMapForSomething(map[int]int{})
    })
    fmt.Println("Pool time:", poolTime)
    fmt.Println("No-pool time:", nopoolTime)
}

func useMapForSomething(m map[int]int) {
    for i := 0; i < 1000; i++ {
        m[rand.Intn(300)] += 5
    }
}

// benchmark measures how long f takes, on average.
func benchmark(f func()) time.Duration {
    start := time.Now().UnixNano()
    for i := 0; i < BenchIters; i++ {
        f()
    }
    return time.Nanosecond * time.Duration((time.Now().UnixNano()-start)/BenchIters)
}
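The same comparison could also be written as standard testing benchmarks (a sketch reusing useMapForSomething from above), which lets go test handle iteration counts and timing:

func BenchmarkPool(b *testing.B) {
    pool := map[int]int{}
    for i := 0; i < b.N; i++ {
        useMapForSomething(pool)
        for k := range pool { // return to the pool by clearing
            delete(pool, k)
        }
    }
}

func BenchmarkNoPool(b *testing.B) {
    for i := 0; i < b.N; i++ {
        useMapForSomething(map[int]int{})
    }
}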