How can I scale a Sarama consumer group in a Kubernetes deployment? - go

I am trying to have some consumers process messages from Kafka, and I would like to use Kubernetes deployment scaling for elastic message-processing capacity.
I found this code in the official Sarama documentation https://pkg.go.dev/github.com/Shopify/sarama#NewConsumerGroup:
package main

import (
	"context"
	"fmt"

	"github.com/Shopify/sarama"
)

type exampleConsumerGroupHandler struct{}

func (exampleConsumerGroupHandler) Setup(_ sarama.ConsumerGroupSession) error   { return nil }
func (exampleConsumerGroupHandler) Cleanup(_ sarama.ConsumerGroupSession) error { return nil }
func (h exampleConsumerGroupHandler) ConsumeClaim(sess sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	for msg := range claim.Messages() {
		fmt.Printf("Message topic:%q partition:%d offset:%d\n", msg.Topic, msg.Partition, msg.Offset)
		sess.MarkMessage(msg, "")
	}
	return nil
}

func main() {
	// The godoc example uses NewTestConfig because it lives inside the sarama
	// package; NewConfig is the standard constructor for applications.
	config := sarama.NewConfig()
	config.Version = sarama.V2_0_0_0 // specify appropriate version
	config.Consumer.Return.Errors = true

	group, err := sarama.NewConsumerGroup([]string{"localhost:9092"}, "my-group", config)
	if err != nil {
		panic(err)
	}
	defer func() { _ = group.Close() }()

	// Track errors
	go func() {
		for err := range group.Errors() {
			fmt.Println("ERROR", err)
		}
	}()

	// Iterate over consumer sessions.
	ctx := context.Background()
	for {
		topics := []string{"my-topic"}
		handler := exampleConsumerGroupHandler{}

		// `Consume` should be called inside an infinite loop; when a
		// server-side rebalance happens, the consumer session will need to be
		// recreated to get the new claims.
		err := group.Consume(ctx, topics, handler)
		if err != nil {
			panic(err)
		}
	}
}
I have some questions:
How do I set the number of consumers in a consumer group?
If I deploy this program in a Pod, can I scale it safely? I mean, assume one instance is running, and I scale the replicas from 1 to 2; will another NewConsumerGroup call with the same group ID work without conflict?
Thank you in advance.
NOTE: I am using Kafka 2.8 and I heard that the sarama-cluster package is DEPRECATED.

Reminder that consumer groups cannot scale beyond the topic's partition count; any consumers beyond that number will sit idle.
Scaling the pods is the correct way to use consumer groups, and reusing the same group name is correct. However, I'd recommend extracting the group name and the broker address into environment variables so they can easily be changed at deploy time.
As-is, the containerized code would be unable to use localhost as its Kafka connection string, as that would resolve to the pod itself.
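For example, here is a minimal sketch of that extraction; the variable names KAFKA_BROKERS and KAFKA_GROUP_ID are illustrative, not anything Sarama or Kubernetes defines:
import (
	"os"
	"strings"
)

// kafkaSettings reads the broker list and group ID from the environment,
// falling back to development defaults when they are unset.
func kafkaSettings() (brokers []string, groupID string) {
	b := os.Getenv("KAFKA_BROKERS") // e.g. "kafka-0.kafka:9092,kafka-1.kafka:9092"
	if b == "" {
		b = "localhost:9092"
	}
	groupID = os.Getenv("KAFKA_GROUP_ID")
	if groupID == "" {
		groupID = "my-group"
	}
	return strings.Split(b, ","), groupID
}
Then in main you'd call brokers, groupID := kafkaSettings() and pass both to sarama.NewConsumerGroup, with the corresponding env entries set in the Deployment's pod spec.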

Related

Manually extracting OpenTelemetry context from golang into a string?

I'm building a simple client server app which I want to trace across the client execution to a server microservice that calls a second server microservice.
Simply speaking, it's not more complicated than CLI -> ServiceA -> ServiceB.
The challenge I'm having is how to serialize the context - most of the docs I've looked at appear to do some form of automated HTTP header injection (e.g. https://opentelemetry.lightstep.com/core-concepts/context-propagation/), but I do not have access to that. I need to serialize (I think) the context of the trace/span in the client and push it to the server, where I'll rehydrate it. (Mind you, I'd love this to be simpler, but I cannot figure it out.)
So the object looks like this (called "job"):
args := &types.SubmitArgs{
	SerializedOtelContext: serializedOtelContext,
}
job := &types.Job{}
tracer := otel.GetTracerProvider().Tracer("myservice.org")
_, span := tracer.Start(ctx, "Submitting Job to RPC")
err := system.JsonRpcMethod(rpcHost, rpcPort, "Submit", args, job)
The function to submit to JsonRpcMethod is here:
func JsonRpcMethod(
	host string,
	port int,
	method string,
	req, res interface{},
) error {
	client, err := rpc.DialHTTP("tcp", fmt.Sprintf("%s:%d", host, port))
	if err != nil {
		return fmt.Errorf("Error in dialing. %s", err)
	}
	return client.Call(fmt.Sprintf("JobServer.%s", method), req, res)
}
And the function that receives it is here:
func (server *JobServer) Submit(args *types.SubmitArgs, reply *types.Job) error {
	//nolint
	job, err := server.RequesterNode.Scheduler.SubmitJob(args.Spec, args.Deal)
	if err != nil {
		return err
	}
	*reply = *job
	return nil
}
My question is how do I, in the receiving function ("Submit" above) extract the trace/span from the sender?
Here is a small program to illustrate the usage. Hope this makes it clear.
package main

import (
	"context"
	"fmt"

	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	// common init
	// You may also want to set them as globals
	exp, _ := stdouttrace.New(stdouttrace.WithPrettyPrint())
	bsp := sdktrace.NewSimpleSpanProcessor(exp) // You should use a batch span processor in prod
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithSpanProcessor(bsp),
	)
	propagator := propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{})

	ctx, span := tp.Tracer("foo").Start(context.Background(), "parent-span-name")
	defer span.End()

	// Serialize the context into a carrier
	carrier := propagation.MapCarrier{}
	propagator.Inject(ctx, carrier)
	// This carrier is sent across the process boundary
	fmt.Println(carrier)

	// Extract the context and start a new span as a child
	// (this part goes in your receiving function)
	parentCtx := propagator.Extract(context.Background(), carrier)
	_, childSpan := tp.Tracer("foo").Start(parentCtx, "child-span-name")
	childSpan.AddEvent("some-dummy-event")
	childSpan.End()
}
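To get the string that the question's SerializedOtelContext field needs, one option is to JSON-encode the carrier, since propagation.MapCarrier is just a map[string]string. A hedged sketch of that round trip (the helper names are mine, and it assumes a real propagator has been set globally):
package main

import (
	"context"
	"encoding/json"
	"fmt"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

// injectToString serializes the current trace context into a JSON string,
// suitable for a plain struct field like SerializedOtelContext.
func injectToString(ctx context.Context) (string, error) {
	carrier := propagation.MapCarrier{}
	otel.GetTextMapPropagator().Inject(ctx, carrier)
	b, err := json.Marshal(carrier) // MapCarrier is just a map[string]string
	return string(b), err
}

// extractFromString rehydrates the context on the receiving side.
func extractFromString(serialized string) (context.Context, error) {
	carrier := propagation.MapCarrier{}
	if err := json.Unmarshal([]byte(serialized), &carrier); err != nil {
		return nil, err
	}
	return otel.GetTextMapPropagator().Extract(context.Background(), carrier), nil
}

func main() {
	// Round-trip demo. The default propagator is a no-op (this would print "{}"),
	// so install a real one first.
	otel.SetTextMapPropagator(propagation.TraceContext{})
	s, _ := injectToString(context.Background())
	fmt.Println(s)
	_, _ = extractFromString(s)
}
On the server side you'd call extractFromString(args.SerializedOtelContext) inside Submit and start the child span from the returned context.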

Prometheus Exporter - Direct Instrumentation vs Custom Collector

I'm currently writing a Prometheus exporter for a telemetry network application.
I've read the doc here Writing Exporters, and while I understand the use case for implementing a custom collector to avoid race conditions, I'm not sure whether my use case fits direct instrumentation.
Basically, the network metrics are streamed via gRPC by the network devices so my exporter just receives them and doesn't have to effectively scrape them.
I've used direct instrumentation with the code below.
I declare my metrics using the promauto package to keep the code compact:
package metrics

import (
	"github.com/lucabrasi83/prom-high-obs/proto/telemetry"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	cpu5Sec = promauto.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "cisco_iosxe_iosd_cpu_busy_5_sec_percentage",
			Help: "The IOSd daemon CPU busy percentage over the last 5 seconds",
		},
		[]string{"node"},
	)
) // the original snippet was missing this closing paren for the var block
Below is how I set the metric value from the decoded gRPC protocol buffer message:
cpu5Sec.WithLabelValues(msg.GetNodeIdStr()).Set(float64(val))
Finally, here is my main loop which basically handles the telemetry gRPC streams for metrics I'm interested in:
for {
	req, err := stream.Recv()
	if err == io.EOF {
		return nil
	}
	if err != nil {
		logging.PeppaMonLog(
			"error",
			fmt.Sprintf("Error while reading client %v stream: %v", clientIPSocket, err))
		return err
	}

	data := req.GetData()
	msg := &telemetry.Telemetry{}
	err = proto.Unmarshal(data, msg)
	if err != nil {
		log.Fatalln(err)
	}

	if !logFlag {
		logging.PeppaMonLog(
			"info",
			fmt.Sprintf(
				"Telemetry Subscription Request Received - Client %v - Node %v - YANG Model Path %v",
				clientIPSocket, msg.GetNodeIdStr(), msg.GetEncodingPath(),
			),
		)
	}
	logFlag = true

	// Flag to determine whether the telemetry device streams an accepted YANG node path
	yangPathSupported := false
	for _, m := range metrics.CiscoMetricRegistrar {
		if msg.EncodingPath == m.EncodingPath {
			yangPathSupported = true
			go m.RecordMetricFunc(msg)
		}
	}
}
For each metric I'm interested in, I register it with a record-metric function (m.RecordMetricFunc) that takes the protocol buffer message as an argument, as per below.
package metrics

import "github.com/lucabrasi83/prom-high-obs/proto/telemetry"

var CiscoMetricRegistrar []CiscoTelemetryMetric

type CiscoTelemetryMetric struct {
	EncodingPath     string
	RecordMetricFunc func(msg *telemetry.Telemetry)
}
I then use an init function for the actual registration:
func init() {
	CiscoMetricRegistrar = append(CiscoMetricRegistrar, CiscoTelemetryMetric{
		EncodingPath:     CpuYANGEncodingPath,
		RecordMetricFunc: ParsePBMsgCpuBusyPercent,
	})
}
I'm using Grafana as the frontend and so far haven't seen any particular discrepancy when correlating the Prometheus-exposed metrics with the metrics checked directly on the device.
So I would like to understand whether this follows Prometheus best practices or whether I should still go the custom collector route.
Thanks in advance.
You are not following best practices because you are using the global metrics that the article you linked to cautions against. With your current implementation your dashboard will forever show some arbitrary and constant value for the CPU metric after a device disconnects (or, more precisely, until your exporter is restarted).
Instead, the RPC method should maintain a set of local metrics and remove them once the method returns. That way the device's metrics vanish from the scrape output when it disconnects.
Here is one approach to do this. It uses a map that contains currently active metrics. Each map element is the set of metrics for one particular stream (which I understand corresponds to one device). Once the stream ends, that entry is removed.
package main

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// Exporter is a prometheus.Collector implementation.
type Exporter struct {
	// We need some way to map gRPC streams to their metrics. Using the stream
	// itself as a map key is simple enough, but anything works as long as we
	// can remove metrics once the stream ends.
	sync.Mutex
	Metrics map[StreamServer]*DeviceMetrics
}

type DeviceMetrics struct {
	sync.Mutex
	CPU prometheus.Metric
}

// Globally defined descriptions are fine.
var cpu5SecDesc = prometheus.NewDesc(
	"cisco_iosxe_iosd_cpu_busy_5_sec_percentage",
	"The IOSd daemon CPU busy percentage over the last 5 seconds",
	[]string{"node"},
	nil, // constant labels
)

// Collect implements prometheus.Collector.
func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
	// Copy current metrics so we don't lock for very long if ch's consumer is
	// slow.
	var metrics []prometheus.Metric

	e.Lock()
	for _, deviceMetrics := range e.Metrics {
		deviceMetrics.Lock()
		metrics = append(metrics,
			deviceMetrics.CPU,
		)
		deviceMetrics.Unlock()
	}
	e.Unlock()

	for _, m := range metrics {
		if m != nil {
			ch <- m
		}
	}
}

// Describe implements prometheus.Collector.
func (e *Exporter) Describe(ch chan<- *prometheus.Desc) {
	ch <- cpu5SecDesc
}

// Service is the gRPC service implementation.
type Service struct {
	exp *Exporter
}

func (s *Service) RPCMethod(stream StreamServer) (*Response, error) {
	deviceMetrics := new(DeviceMetrics)

	s.exp.Lock()
	s.exp.Metrics[stream] = deviceMetrics
	s.exp.Unlock()

	defer func() {
		// Stop emitting metrics for this stream.
		s.exp.Lock()
		delete(s.exp.Metrics, stream)
		s.exp.Unlock()
	}()

	for {
		req, err := stream.Recv()
		if err != nil {
			return nil, err // TODO: handle io.EOF separately if needed
		}

		msg := parseRequest(req) // Your existing code that unmarshals the nested message.

		var (
			metricField *prometheus.Metric
			metric      prometheus.Metric
		)

		switch msg.GetEncodingPath() {
		case CpuYANGEncodingPath:
			metricField = &deviceMetrics.CPU
			metric = prometheus.MustNewConstMetric(
				cpu5SecDesc,
				prometheus.GaugeValue,
				ParsePBMsgCpuBusyPercent(msg), // func(*Telemetry) float64
				msg.GetNodeIdStr(),            // value for the "node" label (values only, no label names)
			)
		default:
			continue
		}

		deviceMetrics.Lock()
		*metricField = metric
		deviceMetrics.Unlock()
	}

	return nil, &Response{}
}
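To actually expose this collector you'd register it with a registry and serve a scrape endpoint alongside the gRPC server; a minimal sketch (the port and endpoint are arbitrary choices):
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	exp := &Exporter{Metrics: map[StreamServer]*DeviceMetrics{}}

	// A dedicated registry avoids the global default registry that the linked
	// article cautions against.
	reg := prometheus.NewRegistry()
	reg.MustRegister(exp) // Exporter implements prometheus.Collector

	// The gRPC service would be started alongside, sharing exp via
	// Service{exp: exp} so streams can add and remove their metrics.
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	_ = http.ListenAndServe(":9100", nil)
}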

Get current resource usage of a pod in Kubernetes with Go client

The Kubernetes Go client has tons of methods, and I can't find how to get the current CPU & RAM usage of a specific pod (or of all pods).
Can someone tell me which methods I need to call to get the current usage for pods & nodes?
My NodeList:
nodes, err := clientset.CoreV1().Nodes().List(metav1.ListOptions{})
Kubernetes Go Client: https://github.com/kubernetes/client-go
Metrics package: https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/metrics
As far as I can tell, the metrics server implements the Kubernetes metrics package in order to fetch the resource usage from pods and nodes, but I couldn't figure out where & how it does it: https://github.com/kubernetes-incubator/metrics-server
It is correct that client-go does not have support for the metrics types, but the metrics package provides a pregenerated client that can be used to fetch metrics objects and assign them right away to the appropriate structures. The only thing you need to do first is to build a config and pass it to the metrics client. So a simple client for metrics would look like this:
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	metrics "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	var kubeconfig, master string // empty, assuming inClusterConfig
	config, err := clientcmd.BuildConfigFromFlags(master, kubeconfig)
	if err != nil {
		panic(err)
	}

	mc, err := metrics.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	mc.MetricsV1beta1().NodeMetricses().Get("your node name", metav1.GetOptions{})
	mc.MetricsV1beta1().NodeMetricses().List(metav1.ListOptions{})
	mc.MetricsV1beta1().PodMetricses(metav1.NamespaceAll).List(metav1.ListOptions{})
	mc.MetricsV1beta1().PodMetricses(metav1.NamespaceAll).Get("your pod name", metav1.GetOptions{})
}
Each of the above methods from the metrics client returns an appropriate structure (you can check those here) and an error (if any), which you should process according to your requirements.
Here is an example:
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	metrics "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	var kubeconfig, master string // empty, assuming inClusterConfig
	config, err := clientcmd.BuildConfigFromFlags(master, kubeconfig)
	if err != nil {
		panic(err)
	}

	mc, err := metrics.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	podMetrics, err := mc.MetricsV1beta1().PodMetricses(metav1.NamespaceAll).List(metav1.ListOptions{})
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	for _, podMetric := range podMetrics.Items {
		podContainers := podMetric.Containers
		for _, container := range podContainers {
			// Check each conversion separately; the original snippet shadowed
			// ok and only ever checked the memory conversion.
			cpuQuantity, okCPU := container.Usage.Cpu().AsInt64()
			memQuantity, okMem := container.Usage.Memory().AsInt64()
			if !okCPU || !okMem {
				return
			}
			msg := fmt.Sprintf("Container Name: %s \n CPU usage: %d \n Memory usage: %d", container.Name, cpuQuantity, memQuantity)
			fmt.Println(msg)
		}
	}
}
The API you're looking for in new versions of Kubernetes (tested on mine as of 1.10.7) is the metrics.k8s.io/v1beta1 API route.
You can see it locally if you run a kubectl proxy and check http://localhost:8001/apis/metrics.k8s.io/v1beta1/pods and /nodes on your localhost.
I see where your confusion is though. At the time of writing, it does not look like the metrics/v1beta1 has a generated typed package (https://godoc.org/k8s.io/client-go/kubernetes/typed), and doesn't appear in the kubernetes.ClientSet object.
You can hit all available endpoints directly through the rest.RESTClient object, and just specify metrics/v1beta1 as the versionedAPIPath. This is more work and less convenient than the nicely wrapped ClientSet, but I'm not sure how long it'll take before that API shows up in that interface.
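Until then, a hedged workaround is to request the raw JSON through a typed group's REST client, overriding the request path with AbsPath; note that DoRaw takes a context argument in newer client-go releases:
package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	var kubeconfig, master string // empty, assuming inClusterConfig
	config, err := clientcmd.BuildConfigFromFlags(master, kubeconfig)
	if err != nil {
		panic(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// AbsPath replaces the client's base path, so any typed client works here.
	data, err := clientset.CoreV1().RESTClient().Get().
		AbsPath("/apis/metrics.k8s.io/v1beta1/pods").
		DoRaw() // DoRaw(ctx) in newer client-go versions
	if err != nil {
		panic(err)
	}

	fmt.Println(string(data))
}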

Golang listenUDP multiple ports blocking with BigTable connection

I'm creating a simple udp client that listens on multiple ports and saves the request to bigtable.
It's essential to listen on different ports before you ask.
Everything was working nicely until I included bigtable. After doing so, the listeners block completely.
My stripped-down code, without Bigtable, looks like this:
func flow(port string) {
	protocol := "udp"
	udpAddr, err := net.ResolveUDPAddr(protocol, "0.0.0.0:"+port)
	if err != nil {
		fmt.Println("Wrong Address")
		return
	}

	udpConn, err := net.ListenUDP(protocol, udpAddr)
	if err != nil {
		fmt.Println(err)
		return // without this, udpConn would be nil below
	}
	defer udpConn.Close()

	for {
		Publish(udpConn, port)
	}
}

func main() {
	fmt.Print("Starting server.........")
	for i := *Start; i <= *End; i++ {
		x := strconv.Itoa(i)
		go flow(x)
	}
}
This works fine; however, as soon as I add the following Bigtable code, the whole thing blocks. If I remove the goroutine that creates the listener (which means I can't listen on multiple ports), it works.
func createBigTable() {
	ctx := context.Background()
	client, err := bigtable.NewClient(ctx, *ProjectID, *Instance)
	if err != nil {
		log.Fatal("Bigtable NewClient:", err)
	}

	Table = client.Open("x")
}
I managed to get it working by adding a query in the createBigTable func, but the program still blocks later on.
I have no idea whether this is an issue with Bigtable, gRPC, or just the way I'm doing it.
Would really appreciate some advice on how to fix it.
--- UPDATE ---
I've discovered the issue isn't just with BigTable - I also have the same issue when I call gcloud pubsub.
--- UPDATE 2 ---
createBigTable is called in the init function (BEFORE THE MAIN FUNCTION):
func init() {
	createBigTable()
}
--- Update 3 ---
Output from sigquit can be found here:
https://pastebin.com/fzixqmiA
In your playground example, you're using for {} to keep the server running forever.
This busy loop seems to starve the goroutines of any chance to run.
Try using e.g. a sync.WaitGroup to yield control from the main() routine and let the flow() routines handle the incoming UDP packets.
import (
	...
	"sync"
	...
)

...

func main() {
	fmt.Print("Starting server.")
	for i := *Start; i <= *End; i++ {
		x := strconv.Itoa(i)
		go flow(x)
	}

	var wg sync.WaitGroup
	wg.Add(1)
	wg.Wait()
}
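Alternatively, if main() has nothing else to do, an empty select also blocks forever with no WaitGroup bookkeeping; unlike for {}, it parks the goroutine instead of spinning:
func main() {
	fmt.Print("Starting server.")
	for i := *Start; i <= *End; i++ {
		go flow(strconv.Itoa(i))
	}

	select {} // blocks forever without burning CPU
}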

How to get consumer group offsets for partition in Golang Kafka 10

Now that the Golang Kafka library (Sarama) provides consumer group capability without any external library (for Kafka 10), how can I get the current message offset being processed by a consumer group at any given time?
Previously I used kazoo-go (https://github.com/wvanbergen/kazoo-go) to get my consumer group message offset, as it is stored in Zookeeper. Now I use sarama-cluster (https://github.com/bsm/sarama-cluster), and I am not sure which API to use to get my consumer group message offset.
Under the hood, the consumerGroupSession struct uses PartitionOffsetManager to get the next offset:
if pom := s.offsets.findPOM(topic, partition); pom != nil {
	offset, _ = pom.NextOffset()
}
Here is the documentation of pom.NextOffset().
When a consumerGroupSession constructs a consumerGroupClaim struct via newConsumerGroupClaim() method, it passes offset, returned by pom.NextOffset(), as offset argument. You can access it later via claim.InitialOffset(). After you started consuming messages, you can use message.Offset of the currently processed message.
Unfortunately, consumerGroupSession.offsets.findPOM() can't be accessed from ConsumerGroupHandler.ConsumeClaim(session sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) method, because it receives session as a ConsumerGroupSession interface, not as consumerGroupSession struct. So the offsets variable is private and not accessible.
Thus we can't really access the NextOffset() method, which does precisely what the OP wants.
I am also working with Sarama and Kafka to get the offset of a topic.
You can get the offset with the following code:
package main

import (
	"fmt"

	"github.com/Shopify/sarama"
)

func main() {
	// NewClient, not Client: sarama.Client is an interface. A nil config
	// makes Sarama fall back to its defaults.
	client, err := sarama.NewClient([]string{"localhost:9092"}, nil)
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// Note: this is the newest offset in the partition (the log-end offset),
	// not the consumer group's committed offset.
	lastOffset, err := client.GetOffset("topic-test", 0, sarama.OffsetNewest)
	if err != nil {
		panic(err)
	}

	fmt.Println("Last offset in the partition:", lastOffset)
}
Let me know if this is the answer you are looking for and if it is helpful.
Here's some sample code to get the consumer group offset (i.e. the offset at which the consumer group will start):
package main

import (
	"context"
	"log"
	"strings"

	"github.com/Shopify/sarama"
)

func main() {
	groupName := "testgrp"
	topic := "topic_name"

	offset, e := GetCGOffset(context.Background(), "localhost:9092", groupName, topic)
	if e != nil {
		log.Fatal(e)
	}

	log.Printf("Consumer group %s offset for topic %s is: %d", groupName, topic, offset)
}

type gcInfo struct {
	offset int64
}

func (g *gcInfo) Setup(sarama.ConsumerGroupSession) error {
	return nil
}

func (g *gcInfo) Cleanup(sarama.ConsumerGroupSession) error {
	return nil
}

func (g *gcInfo) ConsumeClaim(_ sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	g.offset = claim.InitialOffset()
	return nil
}

func GetCGOffset(ctx context.Context, brokers, groupName, topic string) (int64, error) {
	config := sarama.NewConfig()
	config.Consumer.Offsets.AutoCommit.Enable = false // we're not going to update the consumer group offsets

	client, err := sarama.NewConsumerGroup(strings.Split(brokers, ","), groupName, config)
	if err != nil {
		return 0, err
	}

	info := gcInfo{}
	if err := client.Consume(ctx, []string{topic}, &info); err != nil {
		return 0, err
	}

	return info.offset, nil
}
I've just been doing work on this myself. As @boris-burkov mentioned, you don't have access to the findPOM method; however, you can create a POM yourself and call NextOffset() to get the current consumer's actual offset:
offsetManager, _ := sarama.NewOffsetManagerFromClient(groupName, client) // the first argument is the consumer group name
offsetPartitionManager, _ := offsetManager.ManagePartition("test-topic", 0)
offsetPartitionManager.NextOffset()
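Combining the two answers above, here's a hedged sketch of computing consumer lag for a single partition; the broker address, topic, and group name are placeholders:
package main

import (
	"fmt"

	"github.com/Shopify/sarama"
)

func main() {
	client, err := sarama.NewClient([]string{"localhost:9092"}, nil)
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// Log-end offset: where the next produced message would be written.
	logEnd, err := client.GetOffset("test-topic", 0, sarama.OffsetNewest)
	if err != nil {
		panic(err)
	}

	// The consumer group's position on the same partition.
	om, err := sarama.NewOffsetManagerFromClient("my-group", client)
	if err != nil {
		panic(err)
	}
	defer om.Close()

	pom, err := om.ManagePartition("test-topic", 0)
	if err != nil {
		panic(err)
	}
	defer pom.Close()

	next, _ := pom.NextOffset() // falls back to the configured initial offset if the group has never committed

	fmt.Printf("log end: %d, group position: %d, lag: %d\n", logEnd, next, logEnd-next)
}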
