Is Kafka JDBC Connect compatible with the Spring-Kafka library?
I did follow https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/ and still have some confusion.
Let's say you want to consume from a Kafka topic and write to a JDBC database. Some of your options are:
Use a plain Kafka consumer to consume from the topic and use the JDBC API to write the consumed records to the database.
Use Spring Kafka to consume from the Kafka topic and Spring JdbcTemplate or Spring Data to write it to the database.
Use Kafka Connect with the JDBC connector as a sink to read from the topic and write to a table.
So as you can see
The Kafka JDBC connector is a specialised component that can only do one job.
The Kafka consumer is a very generic component which can do a lot of jobs, and you will be writing a lot of code. In fact, it is the foundational API on which the other frameworks build and specialise.
Spring Kafka simplifies this and lets you deal with Kafka records as Java objects, but doesn't tell you how to write those objects to your database.
So they are alternative solutions for the same task. Having said that, you may have a flow where different segments are controlled by different teams; for each segment any of them can be used, with a Kafka topic acting as the joining channel.
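For illustration, here is a minimal sketch of option 2 (Spring Kafka plus Spring JdbcTemplate), assuming standard Spring Boot auto-configuration of the consumer and the DataSource; the topic, group id, table and column names are hypothetical examples:
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

// Option 2 sketch: Spring Kafka hands us each record, Spring JdbcTemplate writes it.
@Component
public class EventPersister {

    private final JdbcTemplate jdbcTemplate;

    public EventPersister(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Topic, group id, table and columns are hypothetical examples.
    @KafkaListener(topics = "events", groupId = "event-persister")
    public void onMessage(ConsumerRecord<String, String> record) {
        jdbcTemplate.update(
            "INSERT INTO events (record_key, payload) VALUES (?, ?)",
            record.key(), record.value());
    }
}
As the answer says, this gives you full control over the mapping to SQL, at the cost of writing and operating that code yourself, whereas the JDBC sink connector only needs configuration.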
I'm trying to use the Confluent InfluxDB Sink Connector to get data from a Kafka topic into my InfluxDB.
Firstly, I transmit data to the Kafka topic from a log file by using NiFi, and it works well. The Kafka topic gets the data, like below:
{
  "topic": "testDB5",
  "key": null,
  "value": {
    "timestamp": "2019-03-20 01:24:29,461",
    "measurement": "INFO",
    "thread": "NiFi Web Server-795",
    "class": "org.apache.nifi.web.filter.RequestLogger",
    "message": "Attempting request for (anonymous)"
  },
  "partition": 0,
  "offset": 0
}
Then, I create the InfluxDB sink connector through the Kafka Connect UI, and I get the following exception:
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:587)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:323)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:226)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:194)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at io.confluent.influxdb.InfluxDBSinkTask.put(InfluxDBSinkTask.java:140)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:565)
... 10 more
But if I manually input data to another topic testDB1 by using
./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic testDB1 --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"measurement","type":"string"},{"name":"timestamp","type":"string"}]}'
It works; my InfluxDB can get the data.
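(For reference, each line typed into that console producer is a JSON record matching the declared schema, for example:
{"measurement": "INFO", "timestamp": "2019-03-20 01:24:29,461"}
)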
Here is the connect configuration:
connector.class=io.confluent.influxdb.InfluxDBSinkConnector
influxdb.url=http://myurl
tasks.max=1
topics=testDB5
The configuration for topic testDB1 is the same except for the topics name.
Is there any problem in NiFi? It can transmit data to the topic well.
When you use Avro with Kafka Connect, the Avro deserialiser expects the data to have been serialised using the Avro serialiser. This is what the kafka-avro-console-producer uses, which is why your pipeline works when you use that.
This article gives a good background to Avro and the Schema Registry. See also Kafka Connect Deep Dive – Converters and Serialization Explained.
I'm not familiar with NiFi, but looking at the documentation it seems that AvroRecordSetWriter has the option to use Confluent Schema Registry. At a guess you'll also want to set Schema Write Strategy to Confluent Schema Registry Reference.
Once you can consume data from your topic with kafka-avro-console-consumer then you know that it is correctly serialised and will work with your Kafka Connect sink.
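As a sketch, assuming a Schema Registry running at http://localhost:8081, the Avro converter settings for the sink (in the worker config or per connector) would look something like:
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
and you can check the topic contents themselves with something like:
./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic testDB5 --from-beginning --property schema.registry.url=http://localhost:8081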
I found the reason. It's because in NiFi, I used PublishKafka_0_10 to publish the data to the Kafka topic, but its version is too low!
When I make a query in KSQL, it says:
Input record ConsumerRecord(..data..) has invalid (negative) timestamp.
Possibly because a pre-0.10 producer client was used to write this record to Kafka without embedding a timestamp,
or because the input topic was created before upgrading the Kafka cluster to 0.10+. Use a different TimestampExtractor to process this data.
So, I changed it to PublishKafka_1_0, started again, and it works! My InfluxDB can get the data. I'm speechless.
And thanks Robin Moffatt for the reply, it's very helpful to me.
Is there a bridge which can send data from a Kafka topic to AMQP (RabbitMQ) or MQTT?
I can only find bridges from MQTT/AMQP to Kafka, but not the other way round...
BR,
I believe Strimzi supplies a Kafka AMQP 1.0 bridge that could be used for such a task. The project source is located on GitHub here, so I'd suggest starting there and reading the documentation.
I want to bring data from Solace to Hadoop using Flume. Can someone let me know how to write an interceptor to convert Protobuf to Avro?
There's a very detailed integration guide available describing how to use the JMS Flume Source to receive messages from a Solace message bus.
Is this the interface you are using?
If so the blog post by Ken Barr (https://solace.com/blog/devops/solace-as-flume-channel-technical-look) gives an implementation of both Flume Source and Sink. The complete source code is at http://dev.solace.com/wp-content/uploads/solace-flume-channel.tgz
The FlumeEventToSolaceMessageConverter.solaceToFlume() method is the one you'd need to modify to support your Protobuf to Avro use case. Out of the box it just assumes that the body of the JMS message is an Avro message.
On GitHub we found a protobuf to avro converter (vpon/protobuf-to-avro) which generates a POJO converter using a .proto schema file.
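As a very rough sketch of the conversion step itself (independent of the Flume/Solace plumbing), assuming a generated Protobuf class LogEvent and an Avro schema with matching fields, the copy could look like this; the message type and field names are hypothetical:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class ProtoToAvro {

    // Avro schema matching the (hypothetical) LogEvent Protobuf message.
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"LogEvent\",\"fields\":["
      + "{\"name\":\"measurement\",\"type\":\"string\"},"
      + "{\"name\":\"timestamp\",\"type\":\"string\"}]}");

    // Parse the Protobuf bytes, copy the fields into an Avro record,
    // and return the Avro binary encoding.
    public static byte[] convert(byte[] protobufBytes) throws IOException {
        LogEvent proto = LogEvent.parseFrom(protobufBytes); // generated Protobuf class (hypothetical)

        GenericRecord avro = new GenericData.Record(SCHEMA);
        avro.put("measurement", proto.getMeasurement());
        avro.put("timestamp", proto.getTimestamp());

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(avro, encoder);
        encoder.flush();
        return out.toByteArray();
    }
}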
Does Confluent provide a JMS source connector for Kafka by default?
Or do we need to write a custom connector for this?
I don't see any documentation on the Confluent page about this.
Currently Confluent doesn't provide a source connector for JMS. Please find below the link listing the connectors available for Kafka Connect.
http://www.confluent.io/product/connectors/
But developers can write custom connectors for Kafka Connect. Please see the URL below for more information on how to write them.
http://docs.confluent.io/3.0.1/connect/devguide.html
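As a rough idea of what that dev guide walks you through, a custom source connector is a pair of classes extending SourceConnector and SourceTask; the JMS-specific parts below are left as placeholders and the class names are hypothetical:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

// Skeleton of a custom source connector; the actual JMS consumption is omitted.
public class JmsSourceConnector extends SourceConnector {
    private Map<String, String> config;

    @Override public void start(Map<String, String> props) { this.config = props; }
    @Override public Class<? extends Task> taskClass() { return JmsSourceTask.class; }
    @Override public List<Map<String, String>> taskConfigs(int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) configs.add(config);
        return configs;
    }
    @Override public void stop() { }
    @Override public ConfigDef config() { return new ConfigDef(); }
    @Override public String version() { return "0.1"; }
}

class JmsSourceTask extends SourceTask {
    @Override public void start(Map<String, String> props) { /* connect to the JMS broker here */ }
    @Override public List<SourceRecord> poll() throws InterruptedException {
        // Receive JMS messages and map each one to a SourceRecord; omitted in this sketch.
        return null; // null tells the framework there is nothing to publish right now
    }
    @Override public void stop() { }
    @Override public String version() { return "0.1"; }
}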