Upgrading consumers from ZooKeeper-based offset storage to Kafka-based storage - go

I am using Go with the Sarama client. The Kafka version is 0.9, which I plan to upgrade.
I am planning to upgrade the Sarama clients to the latest version and use sarama-cluster instead of wvanbergen/kafka. I see that offsets will then be committed to Kafka.
The Apache Kafka documentation says that to migrate from ZooKeeper-based offset storage to Kafka you need to do the following:
Set offsets.storage=kafka and dual.commit.enabled=true in your consumer config.
There is no such property in the wvanbergen/kafka library, and the maintainers have no plans to add it.
Has anyone performed a similar upgrade from wvanbergen/kafka to sarama-cluster without the dual.commit.enabled setting on a production system? How did you migrate offsets from ZooKeeper to Kafka?
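Without dual commit, one option to consider is a one-off copy while the old consumers are stopped: read each partition's offset from ZooKeeper, commit it to Kafka under the same group name, then start the new consumers. The following is only a sketch of locating the offsets, assuming the library uses the conventional /consumers/<group>/offsets/<topic>/<partition> znode layout. The group and topic names are hypothetical, and reading the znode and committing to Kafka are left out because they depend on the client libraries in use.

```go
package main

import "fmt"

// zkOffsetPath returns the znode that holds a consumer group's offset for
// one partition, assuming the conventional ZooKeeper consumer layout.
func zkOffsetPath(group, topic string, partition int32) string {
	return fmt.Sprintf("/consumers/%s/offsets/%s/%d", group, topic, partition)
}

func main() {
	// Hypothetical group and topic names, for illustration only.
	for p := int32(0); p < 3; p++ {
		fmt.Println(zkOffsetPath("my-group", "events", p))
	}
}
```

Before copying values across, it is worth checking whether the stored number is the last processed offset or the next offset to read, since committing the wrong one skips or replays a message.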

Related

Spring Batch and Kafka

I am a junior programmer in banking. I want to build a microservice system that gets data from Kafka and processes it, then saves it to a database and sends the final data to a client app. What technology can I use? I plan to use Spring Batch and Kafka. Can this technology be used in my project, or is there a better alternative?
To process data from a Kafka topic, I recommend using the Kafka Streams API, especially Spring Kafka Streams.
Kafka Streams and Spring
And to store the data in a database, you should use a Kafka Sink Connector.
Kafka Connect
This approach is very common and easy if your company has a Kafka ecosystem.
In terms of alternatives, here you will find an interesting comparison:
https://scramjet.org/blog/welcome-to-the-family
3 in 1 serverless
Scramjet takes a slightly different approach - 3 platforms in one.
Both the free product https://hub.scramjet.org/ for installation on your server and the cloud platform are available - currently also free in the beta version https://scramjet.org/#join-beta

Apache ActiveMQ - retrieving JMS metrics

In my corporate project I am using Spring Boot and the Apache ActiveMQ 5.x Spring Boot starter. I am a total beginner with this.
My goal is to expose Prometheus endpoint with some JMS queue metrics:
number of messages in queue
number of messages in error queue
What are dedicated tools for retrieving such metrics? Up to now I have found two possible ways. Can anyone confirm which of these two tools can solve my problem?
https://docs.spring.io/spring-integration/docs/5.1.7.RELEASE/reference/html/#system-management-chapter
https://activemq.apache.org/components/artemis/documentation/latest/metrics.html (here the example is not very helpful)
I don't think the Spring stuff will work because that will provide Spring-related metrics from the application itself, not the ActiveMQ broker.
Also, the documentation for ActiveMQ you cited is for ActiveMQ Artemis. However, the dependency you're using is for ActiveMQ 5.x. Therefore, the documentation is not applicable. However, if you choose to use ActiveMQ Artemis it is very simple to expose a Prometheus endpoint using this Prometheus metrics plugin implementation. It's worth noting that Artemis is ActiveMQ's next generation message broker. If you're starting a new project I would recommend you use it rather than 5.x. Artemis is planned to replace 5.x and become ActiveMQ 6.0 in the future.
I think your best bet would be to configure the Prometheus JMX exporter. It even has a sample configuration for ActiveMQ 5.x.
ActiveMQ comes with Jolokia bundled by default, which exposes JMX beans for the JVM, queues and a number of other metrics over HTTP. That makes it easy to export them with a tool like Telegraf, which comes with a simple input plugin for ActiveMQ and a simple output plugin for Prometheus.
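To illustrate the Jolokia route, here is a minimal sketch that turns a Jolokia read response for a queue's QueueSize attribute into a line of Prometheus text format. It assumes the response carries the usual "value" and "status" fields. The metric name activemq_queue_size and the queue name are hypothetical, and the HTTP call to the broker is omitted (on ActiveMQ 5.x the Jolokia endpoint is typically under /api/jolokia, but check your installation).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jolokiaRead holds the fields of a Jolokia read response used here.
type jolokiaRead struct {
	Value  float64 `json:"value"`
	Status int     `json:"status"`
}

// queueSizeMetric parses a Jolokia read response and renders the value as
// one Prometheus sample. The metric name is a hypothetical choice.
func queueSizeMetric(queue string, body []byte) (string, error) {
	var r jolokiaRead
	if err := json.Unmarshal(body, &r); err != nil {
		return "", err
	}
	if r.Status != 200 {
		return "", fmt.Errorf("jolokia returned status %d", r.Status)
	}
	return fmt.Sprintf("activemq_queue_size{queue=%q} %g", queue, r.Value), nil
}

func main() {
	// Sample body shaped like a Jolokia read response; the values are made up.
	body := []byte(`{"value": 42, "status": 200}`)
	line, err := queueSizeMetric("orders", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

The same function could be called once for the main queue and once for the error queue to cover both metrics from the question.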

Export data from Kafka to Oracle

I am trying to export data from Kafka to an Oracle database. I've searched related questions and the web, but I could not work out whether we need a platform (Confluent etc.) or not. I have read the link below, but it's not clear enough.
https://docs.confluent.io/3.2.2/connect/connect-jdbc/docs/sink_connector.html
So, what do we actually need in order to export data without a third-party platform? Thanks in advance.
It's not clear what you mean by "third-party" here.
What you linked to is Kafka Connect, which is Apache 2.0 licensed and open source.
Kafka Connect is a plugin ecosystem: you install connectors individually, written by anyone, or write your own, just like any other Java dependency (i.e. a third party).
The JDBC connector just happens to be maintained by Confluent, and you can use the Confluent Hub CLI to install it into any Kafka Connect distribution (or use the Kafka Connect Docker images from Confluent).
Alternatively, you can use Apache Spark, Flink, NiFi, or one of the many Kafka consumer libraries to read the data and then start an Oracle transaction per record batch.
Or you can explore non-JVM Kafka libraries and use a language in which you're more comfortable doing Oracle operations.
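For the Kafka Connect route, a connector is usually registered by sending a JSON definition to the Connect worker's REST API (POST /connectors). The sketch below only builds and prints such a definition for the JDBC sink connector. The connector name, topic, host, service name and credentials are placeholders, and the settings shown are a small subset that would need adjusting for a real deployment (for example, the Oracle JDBC driver must be available to the Connect worker).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connectorRequest is the JSON body sent to POST /connectors on a Kafka
// Connect worker.
type connectorRequest struct {
	Name   string            `json:"name"`
	Config map[string]string `json:"config"`
}

// oracleSink builds a JDBC sink definition. All connection details below
// are placeholders, not values from a real environment.
func oracleSink(name, topic string) connectorRequest {
	return connectorRequest{
		Name: name,
		Config: map[string]string{
			"connector.class":     "io.confluent.connect.jdbc.JdbcSinkConnector",
			"tasks.max":           "1",
			"topics":              topic,
			"connection.url":      "jdbc:oracle:thin:@//db.example.com:1521/ORCLPDB1",
			"connection.user":     "kafka_sink",
			"connection.password": "change-me",
			"insert.mode":         "insert",
			"auto.create":         "true",
		},
	}
}

func main() {
	body, err := json.MarshalIndent(oracleSink("oracle-sink", "orders"), "", "  ")
	if err != nil {
		panic(err)
	}
	// The printed JSON is what would be posted to the Connect REST API.
	fmt.Println(string(body))
}
```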

How to implement kafka-connect using apache-kafka instead of confluent

I would like to use an open source version of Kafka Connect instead of the Confluent one, as it appears that the Confluent CLI is only for development and not for production. I would like to be able to listen to changes on a MySQL database on AWS EC2. Can someone point me in the right direction?
Kafka Connect is part of Apache Kafka. Period. If you want to use Kafka Connect you can do so with any modern distribution of Apache Kafka.
You then need a connector plugin to use with Kafka Connect, specific to your source technology. For integrating with a database there are various considerations, and for MySQL specifically the available options include:
Kafka Connect JDBC - see it in action here
Debezium - see it in action here
The Confluent CLI is just a tool for helping manage and deploy Confluent Platform on developer machines. Confluent Platform itself is widely used in production.

For a Spring enterprise web application with multiple instances, what is the way to retrieve the offset value from Kafka and store it?

I'm working on an enterprise web application that has a requirement to read from a Kafka system and then trigger events. Can anyone suggest a way to get the offset, and also an ideal way to store it (one that can handle access by multiple instances of the application)?
Note: I'm using spring-kafka and am open to any further suggestions.
Thanks in advance.
With recent versions of Kafka, the offset is stored in a Kafka topic. Kafka keeps track of the consumer offset for each partition in the topic __consumer_offsets, which is a compacted topic. In other words, Kafka itself keeps track of the offset for each consumer group.
Spring for Apache Kafka provides several options for when the offset is committed.
In earlier versions of Kafka, offsets were often stored externally; it's now a lot simpler.
There may still be use cases for external storage, but such scenarios are all supported by Spring Kafka, especially with the upcoming 2.0 release.
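As a small illustration of how Kafka keeps offsets per consumer group, the broker picks the __consumer_offsets partition for a group from the Java hash code of the group id, masked to be non-negative, modulo the number of partitions of that topic. The sketch below reproduces that calculation; the group name is hypothetical and 50 is the default value of offsets.topic.num.partitions, which a cluster may override.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// javaHashCode reproduces Java's String.hashCode, which works on UTF-16
// code units with 32-bit wrapping arithmetic.
func javaHashCode(s string) int32 {
	var h int32
	for _, u := range utf16.Encode([]rune(s)) {
		h = 31*h + int32(u)
	}
	return h
}

// offsetsPartition returns the __consumer_offsets partition that stores
// the committed offsets of the given consumer group.
func offsetsPartition(group string, numPartitions int32) int32 {
	return (javaHashCode(group) & 0x7fffffff) % numPartitions
}

func main() {
	// Hypothetical group name; 50 is the default partition count.
	fmt.Println(offsetsPartition("my-app", 50))
}
```

Because every instance of the application uses the same group id, all of them commit to, and read from, the same partition of that topic, which is why no external store is needed to share offsets between instances.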
