I want to use Apache NiFi for writing some data encoded in Avro to Apache Kafka.
I therefore use the ConvertRecord processor to convert from JSON to Avro. For Avro, the AvroRecordSetWriter with a ConfluentSchemaRegistry is used. The schema URL is set to http://<hostname>:<port>/apis/ccompat/v6 (hostname and port are not important for this question). As a free alternative to the Confluent Schema Registry, I deployed an Apicurio Schema Registry; its ccompat API should be compatible with the Confluent API.
But when I run the NiFi pipeline, I get the following error saying that the schema with the given name was not found:
Could not retrieve schema with the given name [...] from the configured Schema Registry
But I definitely created the Avro schema with this name in the web UI of the Apicurio Registry.
Can someone please help me? Is anybody using NiFi for Avro encoding to Kafka together with the Apicurio Schema Registry?
Update:
Here are some screenshots of my pipeline and its configuration.
Set schema name via UpdateAttribute
Use ConvertRecord with JsonTreeReader
and ConfluentSchemaRegistry
and AvroRecordSetWriter
Update 2:
This artifact id has to be set:
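As a hedged aside (hostname, port, and schema name are placeholders, and this assumes the ccompat API mirrors the Confluent Schema Registry endpoints): NiFi's ConfluentSchemaRegistry service resolves the schema by subject name, so the Apicurio artifact ID has to match the schema name set in the flow exactly. The Confluent-compatible endpoints can be used to check which subjects the registry actually exposes:

curl http://<hostname>:<port>/apis/ccompat/v6/subjects
curl http://<hostname>:<port>/apis/ccompat/v6/subjects/<schema-name>/versions/latest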
Related
I am trying to produce messages using kafka-console-producer from the Apache Kafka binaries and consume them with a consumer set up in Spring Boot. The consumer uses an Avro schema.
When a message is produced in JSON format, my consumer throws a "not able to serialize" exception.
One solution I found is to use "Confluent Platform 7.1", which has kafka-avro-console-producer. It supports Avro, but it is an enterprise edition.
Is there a way to produce/consume messages with an Avro schema using Apache Kafka itself with kafka-console-producer?
kafka-console-producer only accepts UTF-8 strings by default; internally, it defaults to StringSerializer.
The kafka-avro-console-producer wraps kafka-console-producer, and it is source-available, not Enterprise. You would need to download at least the Confluent kafka-avro-serializer JAR file(s) and their dependencies to even produce Avro data with it, and this will also require you to use the Schema Registry.
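For reference, a hedged sketch of a typical invocation (broker address, topic, registry URL, and the schema itself are placeholders; newer Confluent versions take --bootstrap-server instead of --broker-list). Each input line after that is a JSON-encoded record such as {"field1":"1"}:

kafka-avro-console-producer \
  --broker-list localhost:9092 \
  --topic my-topic \
  --property schema.registry.url=http://localhost:8081 \
  --property value.schema='{"type":"record","name":"Example","fields":[{"name":"field1","type":"string"}]}'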
If you simply want to produce Avro binary data with no Registry, you can use the Avro BinaryEncoder class; however, this will also require you to write your own deserializer in any consumer, rather than using the ones provided by Confluent (again, free, not Enterprise).
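A minimal sketch of that approach, assuming the Avro Java library is on the classpath (the schema and field name here are made up for illustration):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class AvroBinaryExample {
    public static void main(String[] args) throws IOException {
        // Hypothetical schema, for illustration only
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Example\","
          + "\"fields\":[{\"name\":\"field1\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("field1", "1");

        // Serialize the record to raw Avro binary (no Schema Registry framing)
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();

        byte[] payload = out.toByteArray();
        // 'payload' can be sent with a plain producer using ByteArraySerializer;
        // any consumer must know the writer schema to decode it.
    }
}

The resulting bytes can then be published with a regular KafkaProducer configured with ByteArraySerializer.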
I am using Kafka Connect to sink data into Elasticsearch. Usually, we ignore empty fields when persisting to Elasticsearch. Can we do the same using Kafka Connect?
Sample input
{"field1":"1","field2":""}
Expected in the Elasticsearch index
{"field1":"1"}
In Kafka Connect there is a concept called SMT (Single Message Transform). Several transform functions are supported out of the box, but none of them does what you wish for. You can, however, write your own SMT to do that (see the sketch after these steps):
● Create your JAR file.
● Install the JAR file: copy your custom SMT JAR file (and any non-Kafka JAR files required by the transformation) into a directory that is under one of the directories listed in the plugin.path property in the Connect worker configuration.
For further instructions, refer to https://docs.confluent.io/platform/current/connect/transforms/custom.html
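For illustration, a minimal sketch of such a custom SMT (the package and class names are made up; it only handles schemaless map values like the JSON above, and a real implementation would also need to handle Struct values and nested fields):

package com.example.smt; // hypothetical package

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Drops fields whose value is an empty string from schemaless record values.
public class DropEmptyFields<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public R apply(R record) {
        if (!(record.value() instanceof Map)) {
            return record; // only schemaless map values are handled in this sketch
        }
        @SuppressWarnings("unchecked")
        Map<String, Object> value = (Map<String, Object>) record.value();

        Map<String, Object> cleaned = new HashMap<>();
        for (Map.Entry<String, Object> entry : value.entrySet()) {
            if (!"".equals(entry.getValue())) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        // Re-create the record with the filtered value, keeping everything else as-is
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(),
                record.valueSchema(), cleaned, record.timestamp());
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef(); // no configuration options in this sketch
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // nothing to configure
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}

It would then be referenced from the connector configuration roughly like this (the class name is the hypothetical one from the sketch):

transforms=dropEmpty
transforms.dropEmpty.type=com.example.smt.DropEmptyFields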
I am having an issue integrating the AWS Glue Schema Registry with Quarkus reactive messaging. I have a property defined as:
mp.messaging.outgoing.eligibility.schemaName=<some schema name>
Notice the camel case in schemaName. The Glue Schema Registry is looking for a value for schemaName, but from the log output Quarkus seems to be emitting that property in all lower case as schemaname, so the default approach for adding additional Kafka properties doesn't work.
Is there a way to maintain the camel casing in the properties file, or another approach to adding Kafka properties to the application?
Thanks
You need to put the property segment into quotes to preserve the original case:
mp.messaging.outgoing.eligibility."schemaName"=<some schema name>
In Apache NiFi, I am unable to configure the database schema. I'm trying to find a record in PostgreSQL using LookupRecord, but I can't specify the schema name as there is no such option; by default it looks in the public schema. Please help me with this.
You can specify the particular schema as part of the JDBC URL in newer versions of the Postgres driver; see the example below.
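For example (host, database, and schema name are placeholders), the PostgreSQL driver's currentSchema connection parameter can be appended to the JDBC URL configured in the connection pool used by the lookup service:

jdbc:postgresql://<host>:5432/<database>?currentSchema=<your_schema>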
I have a scenario where I have a set of Avro files in HDFS, and I need to generate Avro schema files for those Avro data files in HDFS. I tried researching a Spark-based approach (https://github.com/databricks/spark-avro/blob/master/src/main/scala/com/databricks/spark/avro/SchemaConverters.scala).
Is there any other way than bringing the Avro data file to local and doing an HDFS PUT?
Any suggestions are welcome. Thanks!
Every Avro file embeds the Avro schema it was written with. You can extract this schema using avro-tools.jar (download it from Maven). You can download only one part file (assuming all other files were written with the same schema) and use Avro tools (java -jar ~/workspace/avro-tools-1.7.7.jar getschema xxx.avro) to extract it; a sketch of the full round trip is below.
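A hedged sketch of that round trip, with placeholder paths and the avro-tools version from above:

# Pull a single part file locally (all parts are assumed to share the same schema)
hdfs dfs -get /data/events/part-00000.avro .
# Extract the embedded writer schema
java -jar avro-tools-1.7.7.jar getschema part-00000.avro > events.avsc
# Put the generated schema file back into HDFS if it is needed there
hdfs dfs -put events.avsc /schemas/events.avsc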