How to catch a duplicate entry error using a JDBC sink connector with Kafka?

I set up a JDBC sink connector to use "insert" as the "insert.mode". Naturally, this throws an exception and kills the JDBC sink connector when a duplicate entry is inserted. Is there a way to have it just log the error and keep running?
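For reference, here is a minimal sketch of the kind of sink configuration being described, assuming the Confluent JDBC sink connector; the connector name, topic, connection URL, and key column are placeholders, not from the question. Kafka Connect's errors.* properties let the framework tolerate and log bad records, but they primarily cover converter/transform failures; whether a SQL constraint violation raised by the sink itself is routed through them depends on the connector version, so verify that before relying on it. Switching to insert.mode=upsert with a primary key configuration is the commonly used way to make duplicate keys harmless.

    # Sketch of a standalone-mode sink config, assuming the Confluent JDBC sink connector.
    name=jdbc-sink-example
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    topics=my_topic
    connection.url=jdbc:postgresql://localhost:5432/mydb
    connection.user=user
    connection.password=secret
    insert.mode=insert
    # Built-in Kafka Connect error handling: keep running and log failures.
    errors.tolerance=all
    errors.log.enable=true
    errors.log.include.messages=true
    # Alternative that makes duplicates idempotent (uncomment and adjust):
    # insert.mode=upsert
    # pk.mode=record_key
    # pk.fields=id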

Related

Flink JDBC Sink and connection pool

I cannot find any info in the docs about connection reuse in JDBCAppendTableSink in Flink. Should I use my own connection pool, or does Flink reuse the connection for me?
Is this really a gap in the documentation, or am I missing something?
Each instance of the sink creates a connection when the sink is created, and that connection (and a prepared statement) is then automatically reused for you.
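For comparison, here is a minimal sketch of the same open-once, reuse-everywhere pattern written as a custom Flink sink, to make concrete what "the connection (and a prepared statement) is reused" means; the JDBC URL, credentials, and table are placeholders, and this is not the JDBCAppendTableSink implementation itself.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    // Illustrative custom sink: one connection and one prepared statement per
    // parallel sink instance, created in open() and reused for every record.
    public class JdbcInsertSink extends RichSinkFunction<Tuple2<Long, String>> {

        private transient Connection connection;
        private transient PreparedStatement statement;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Called once per sink instance; the connection lives for its lifetime.
            connection = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "secret");
            statement = connection.prepareStatement(
                    "INSERT INTO my_table (id, payload) VALUES (?, ?)");
        }

        @Override
        public void invoke(Tuple2<Long, String> value, Context context) throws Exception {
            // The prepared statement is reused for each incoming record.
            statement.setLong(1, value.f0);
            statement.setString(2, value.f1);
            statement.executeUpdate();
        }

        @Override
        public void close() throws Exception {
            if (statement != null) statement.close();
            if (connection != null) connection.close();
        }
    }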

How to handle SQL exceptions in Spring batch JDBC batch write

I have a Spring Batch job that reads from ActiveMQ Artemis and writes to an Oracle database. Now I want to implement a mechanism to handle any SQLException that occurs during the JDBC batch write, such as the following:
org.springframework.jdbc.UncategorizedSQLException: PreparedStatementCallback; uncategorized SQLException for SQL
I tried SkipPolicy and CompositeStepExecutionListener, but it seems the methods are not called on error. My goal is to write the failing message to an error queue or to the logs. Is there a way to do this without extending JdbcBatchItemWriter? I would also like your opinion on whether my error handling is efficient.
My goal is to write the failing message to an error queue or to the logs
ItemWriteListener#onWriteError should be called on write failures. It gives you access to the items of the failed chunk as well as the exception that caused the failure. You can implement this listener and send failed items to your error queue or log them as needed.
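A minimal sketch of such a listener, assuming Spring Batch 4.x signatures (in Spring Batch 5 the callbacks take a Chunk instead of a List) and leaving the error-queue send as a placeholder comment:

    import java.util.List;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.batch.core.ItemWriteListener;

    // Illustrative listener: logs the failed chunk and is the place to forward
    // failed items to an error queue.
    public class WriteErrorListener<T> implements ItemWriteListener<T> {

        private static final Logger log = LoggerFactory.getLogger(WriteErrorListener.class);

        @Override
        public void beforeWrite(List<? extends T> items) {
            // no-op
        }

        @Override
        public void afterWrite(List<? extends T> items) {
            // no-op
        }

        @Override
        public void onWriteError(Exception exception, List<? extends T> items) {
            log.error("JDBC batch write failed for {} item(s)", items.size(), exception);
            for (T item : items) {
                // Placeholder: e.g. send the item to an error queue via JmsTemplate.
                log.error("Failed item: {}", item);
            }
        }
    }

Register the listener on the chunk-oriented step (for example via the step builder's listener(...) method) so onWriteError is invoked when the writer throws.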

Kafka Connect: Can source and sink use the same topic?

I am using Kafka Connect with MongoDB, where:
the source is the Debezium connector,
the sink is the Kafka-MongoDB-Connector by MongoDB.
Can I use the same topic and MongoDB collection for both the source and the sink connector? Will it cause messages to go into an infinite loop?

Storm-Kafka Hortonworks tutorial for real-time data streaming

Any ideas are welcome after reading the problem statement.
Background:
Publish message using Apache Kafka:
The Kafka broker is running. Kafka producers are the applications that create messages and publish them to the Kafka broker for further consumption. Therefore, in order for the Kafka consumer to consume data, the Kafka topic needs to be created before the producer and consumer start publishing and consuming messages.
Kafka was tested successfully: the Kafka consumer was able to consume data from the Kafka topic and display the result.
Before starting the Storm topology, the Kafka consumer is stopped so that the Storm spout can work on the source of data streams from the Kafka topics.
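For context, a minimal sketch of the kind of publisher described above, using the Java producer API with the truckevent topic from the logs below; the broker address and payload are placeholders.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TruckEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker address; the topic must exist before
            // producing and consuming (or auto-creation must be enabled).
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Placeholder event payload.
                producer.send(new ProducerRecord<>("truckevent", "truck-1", "sample event"));
            }
        }
    }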
Real time processing of the data using Apache Storm:
With the Storm topology created, the Storm spout works on the source of data streams, meaning the spout reads data from the Kafka topics. At the other end, the spout passes the streams of data to the Storm bolts, which process the data and write it into HDFS (as files) and HBase (as tables) for storage.
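A rough sketch of how such a topology is typically wired with the storm-kafka KafkaSpout, assuming the pre-1.0 package names that match the backtype.storm entries in the logs below; the ZooKeeper address is a placeholder, and the tutorial's actual bolts (TruckHBaseBolt, HdfsBolt) are only referenced in comments.

    import java.util.UUID;

    import backtype.storm.generated.StormTopology;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.TopologyBuilder;
    import storm.kafka.BrokerHosts;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;

    public class TruckEventTopology {
        public static void main(String[] args) {
            // Placeholder ZooKeeper address; the spout reads the "truckevent" topic
            // and keeps its offsets under the /truckevent zkRoot.
            BrokerHosts hosts = new ZkHosts("localhost:2181");
            SpoutConfig spoutConfig = new SpoutConfig(
                    hosts, "truckevent", "/truckevent", UUID.randomUUID().toString());
            spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
            // The tutorial's bolts would be attached here, e.g.:
            // builder.setBolt("hbase-bolt", new TruckHBaseBolt(), 1).shuffleGrouping("kafka-spout");
            // builder.setBolt("hdfs-bolt", hdfsBolt, 1).shuffleGrouping("kafka-spout");
            StormTopology topology = builder.createTopology();
            // The topology would then be submitted with StormSubmitter.submitTopology(...).
        }
    }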
Problem 1: The ZooKeeper znode is missing the last child znode.
From log file,
2015-05-20 04:22:43 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/truckevent/partitions
ZooKeeper is the coordination service for distributed applications. From the ZooKeeper client, we can always see /brokers/topics/truckevent, but the last child znode is always missing when running Storm. I managed to solve this issue once by creating the znode manually, but the same method no longer works for subsequent testing.
Problem 2: Storm (TruckHBaseBolt is the Java class) failed to get a connection to the HBase tables.
From log file,
2015-05-20 04:22:51 c.h.t.t.TruckHBaseBolt [ERROR] Error retrievinging connection and access to HBase Tables
I had manually created the HBase table with the data format required in HBase. However, retrieving the connection to HBase still failed.
Problem 3: Storm (HdfsBolt is the Java class) reported permission denied when the storm user writes data into HDFS.
From log file,
2015-05-20 04:22:43 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: Error preparing HdfsBolt: Permission denied: user=storm, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x
Can anyone help with this?
Suggestion for problem 1:
Stop the Storm topology, manually delete the znodes related to the topics in the ZooKeeper instance that Storm is using, and restart the Storm topology.
This will create new znodes.
Suggestion for problem 2:
First check, using standalone Java code, whether you are able to connect to HBase (see the sketch below). Then test the same logic in the Storm topology.
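A minimal standalone check along those lines, assuming the HBase 1.x+ client API; the ZooKeeper quorum, client port, and table name are placeholders and must match the cluster's configuration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class HBaseConnectionCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Placeholder values; use the same quorum and port the Storm workers see.
            conf.set("hbase.zookeeper.quorum", "localhost");
            conf.set("hbase.zookeeper.property.clientPort", "2181");

            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Admin admin = connection.getAdmin()) {
                // Placeholder table name standing in for the tutorial's HBase table.
                boolean exists = admin.tableExists(TableName.valueOf("truck_events"));
                System.out.println("Connected to HBase; table exists: " + exists);
            }
        }
    }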
Answer for problem 3:
As per your logs, user=storm, but the directory you are writing to is owned by hdfs. Change the ownership of that directory to make storm the owner, using the chown command.

Quarkus native with Kafka Streams and Schema Registry

Quarkus (1.5.0.Final) as a native executable works fine with Kafka Streams and the Avro Schema Registry.
But when a Kafka Streams application consumes a topic with Avro Serdes and a new event is added, there is an exception.
The kafka-streams-avro-serde library tries to reach the Schema Registry (via its REST API) to register the added schema.
The exception below occurs (this works fine in Quarkus + JVM mode):
Caused by: org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: {"type":"record","name":"siteconf","namespace":"test","fields":[{"name":"id","type":["null","string"],"default":null},{"name":"site","type":["null","string"],"default":null},{"name":"configuration","type":["null","string"],"default":null}],"connect.name":"siteconf"}
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Internal Server Error; error code: 500
I don't know how to work around this problem.
It's very annoying, because I think it's the only problem I've detected in Kafka Streams with the Schema Registry,
and I was interested in adopting Quarkus instead of Spring Boot/Cloud.
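For reference, a minimal sketch of the Serde configuration that drives that registration call, using Confluent's GenericAvroSerde; the registry URL is a placeholder. Setting auto.register.schemas to false avoids the register call itself (the schema then has to be registered in the registry beforehand); whether that sidesteps the native-image failure is an assumption to verify, not a confirmed fix.

    import java.util.HashMap;
    import java.util.Map;

    import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig;
    import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;

    public class SiteconfSerdes {

        // Builds the value Serde used by the Streams topology; the Serde talks to
        // the Schema Registry over REST using the URL configured here.
        public static GenericAvroSerde siteconfValueSerde() {
            Map<String, Object> config = new HashMap<>();
            // Placeholder registry URL.
            config.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
                    "http://localhost:8081");
            // Optional: do not auto-register schemas from the application; the
            // "siteconf" schema must then already exist in the registry.
            config.put(AbstractKafkaAvroSerDeConfig.AUTO_REGISTER_SCHEMAS, false);

            GenericAvroSerde serde = new GenericAvroSerde();
            serde.configure(config, false); // false = value Serde, not key
            return serde;
        }
    }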
