Local verification of avro schema validity and compatibility - gradle

We're using Avro for (de)serialization of messages that flow through a message broker. The Avro schema files are stored in a schema registry (Apicurio). This provides two benefits: schema validation and compatibility validation. However, I'm wondering if there is a way to bypass the schema registry and achieve the same locally, using a script/plugin. Validating whether an Avro file is syntactically/semantically valid should be possible locally. The same applies to compatibility validation, as checking whether a new schema version is backward/forward compatible against a list of other schemas (the previous versions) also sounds doable locally.
Is there a library that does that? Ideally a Gradle plugin, but a Java/Python library would do as well, as it can easily be called from a Gradle task.

I believe this is Confluent's Java class for checking schema compatibility within its schema registry:
https://github.com/confluentinc/schema-registry/blob/master/core/src/test/java/io/confluent/kafka/schemaregistry/avro/AvroCompatibilityTest.java
You can use it to validate schemas locally.
Expedia has used it as a basis to create their own compatibility tool:
https://github.com/ExpediaGroup/avro-compatibility

I could not find a plugin that does just what you are asking for. Existing plugins seem to focus on generating classes from the schema files (e.g. https://github.com/davidmc24/gradle-avro-plugin). Without getting into why you want to do this, I think you could use this simple approach (How do I call a static Java method from Gradle) to hook your custom code into Gradle and check for schema validity and compatibility.
Refer to the following Avro Java API (a minimal usage sketch follows below):
https://avro.apache.org/docs/current/api/java/org/apache/avro/SchemaCompatibility.html
https://avro.apache.org/docs/current/api/java/org/apache/avro/SchemaValidatorBuilder.html
This particular class can also be helpful for executing validation against the schema:
https://github.com/apache/avro/blob/master/lang/java/tools/src/main/java/org/apache/avro/tool/DataFileReadTool.java
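For reference, a minimal sketch of how those two Avro classes could be used from a small main class (which a Gradle task could then invoke, e.g. via JavaExec or a static method call); the .avsc file names are made up for illustration:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaValidationException;
import org.apache.avro.SchemaValidator;
import org.apache.avro.SchemaValidatorBuilder;

import java.io.File;
import java.util.Arrays;
import java.util.List;

public class AvroSchemaChecker {

    public static void main(String[] args) throws Exception {
        // Parsing already covers the "is this schema valid?" part:
        // Schema.Parser throws if the .avsc file is not valid Avro.
        Schema newSchema = new Schema.Parser().parse(new File("src/main/avro/user-v3.avsc"));
        Schema v1 = new Schema.Parser().parse(new File("src/main/avro/user-v1.avsc"));
        Schema v2 = new Schema.Parser().parse(new File("src/main/avro/user-v2.avsc"));
        List<Schema> previousVersions = Arrays.asList(v1, v2);

        // Pairwise check: can a reader using the new schema read data written with v2?
        SchemaCompatibility.SchemaPairCompatibility pair =
                SchemaCompatibility.checkReaderWriterCompatibility(newSchema, v2);
        System.out.println("New schema can read v2 data: " + pair.getType());

        // Registry-style check of the new schema against the previous versions.
        // canReadStrategy() ~ backward, canBeReadStrategy() ~ forward, mutualReadStrategy() ~ full.
        SchemaValidator validator = new SchemaValidatorBuilder()
                .canReadStrategy()
                .validateAll();
        try {
            validator.validate(newSchema, previousVersions);
            System.out.println("New schema is backward compatible with all previous versions.");
        } catch (SchemaValidationException e) {
            System.err.println("Incompatible schema change: " + e.getMessage());
            System.exit(1);
        }
    }
}
```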

Related

What is the best way to maintain queries in Spring boot application?

In my application I am using the below technologies:
Spring Boot 2.7.x
Cassandra
Spring Batch 5.x
Java 11
As part of this, I need to extract data from the Cassandra database and write it out to a file, so I need queries to fetch that data.
I just want to know the best way to maintain all the queries in one place, so that if any query changes in the future I don't have to rebuild the app but only modify the query.
Using a repository class is necessary. If you are using JPA, I recommend using a repository for each entity class. With JDBC it is possible to create a single repository which contains all the queries. To access the query methods I would use a service class. That way your code is well structured and maintainable for future changes.
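As a rough sketch of the single-repository variant (class names and SQL are illustrative; for Cassandra the analogous CqlTemplate/CassandraTemplate would take the place of JdbcTemplate):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.Map;

// Single repository that keeps every statement in one place,
// so a future query change only touches this class.
@Repository
class ReportRepository {

    static final String FIND_ACTIVE_USERS =
            "SELECT id, name, email FROM users WHERE status = 'ACTIVE'";

    private final JdbcTemplate jdbcTemplate;

    ReportRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    List<Map<String, Object>> findActiveUsers() {
        return jdbcTemplate.queryForList(FIND_ACTIVE_USERS);
    }
}

// Service layer that batch steps (or anything else) call,
// instead of embedding queries themselves.
@Service
class ReportService {

    private final ReportRepository repository;

    ReportService(ReportRepository repository) {
        this.repository = repository;
    }

    List<Map<String, Object>> loadActiveUsers() {
        return repository.findActiveUsers();
    }
}
```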

Spring Boot JPA entity table name from property file

We are working on a Spring Boot library to generate and validate OTPs. It uses a database to store the OTP.
We are using Spring Data JPA for database operations, as it makes it easy to handle different database systems depending on the project.
Now we have run into a problem: most of our projects use Oracle with a single database.
When using the same lib in multiple projects there is a name conflict.
So we want the name of the OTP table to be configurable using a property file.
We tried @Table(name = "${otp-table-name}"), but it's not working.
We did a lot of research and found out that the Hibernate naming strategy configuration can help.
But we don't want to require lots of configuration in our library, as we need the library to be easily usable in the projects.
Can someone help us with this?
Thanks in advance.
To dynamically determine the actual DataSource based on the current context, use Spring's AbstractRoutingDataSource class. You could write your own version of this class and configure it to use a different data source based on the property file.
This allows you to switch between databases or schemas without having to change the code in your library.
See: https://www.baeldung.com/spring-abstract-routing-data-source
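A minimal sketch of that idea (bean and key names are illustrative, and the two target data sources are assumed to be defined elsewhere, e.g. built from your property file):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

import javax.sql.DataSource;
import java.util.HashMap;
import java.util.Map;

// Holds the key of the database/schema to use for the current call.
class DatabaseContextHolder {
    private static final ThreadLocal<String> KEY = ThreadLocal.withInitial(() -> "primary");

    static void setCurrentKey(String key) { KEY.set(key); }
    static String getCurrentKey() { return KEY.get(); }
}

// Picks the target DataSource at runtime based on the current context key.
class PropertyDrivenRoutingDataSource extends AbstractRoutingDataSource {
    @Override
    protected Object determineCurrentLookupKey() {
        return DatabaseContextHolder.getCurrentKey();
    }
}

@Configuration
class RoutingDataSourceConfig {

    // primaryDataSource and secondaryDataSource are assumed to be beans defined
    // elsewhere (e.g. with @Qualifier), configured from the property file.
    @Bean
    DataSource routingDataSource(DataSource primaryDataSource, DataSource secondaryDataSource) {
        PropertyDrivenRoutingDataSource routing = new PropertyDrivenRoutingDataSource();
        Map<Object, Object> targets = new HashMap<>();
        targets.put("primary", primaryDataSource);
        targets.put("secondary", secondaryDataSource);
        routing.setTargetDataSources(targets);
        routing.setDefaultTargetDataSource(primaryDataSource);
        return routing;
    }
}
```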
Using a NamingStrategy is a good approach.
You could let it delegate to an existing NamingStrategy and add a prefix.
Use a library-specific default for the prefix, but also allow users of your library to specify an alternative prefix.
This way your library can be used without extra configuration, but it can also handle the case of multiple applications using it in the same database schema.
Of course this involves the risk of someone using the default prefix without realizing that it is already in use.
It is not clear what the consequences of that scenario are.
If the consequences are really bad, you should drop the default value and require that a project-specific prefix is used.
When no prefix is specified, throw an exception with an instructive error message telling the user, i.e. the developer, how to pick a prefix and where to put it. A sketch of such a prefixing strategy is shown below.
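A rough sketch, written as a Hibernate PhysicalNamingStrategy that delegates to the standard implementation; the property name, default prefix and package are made up for illustration:

```java
import org.hibernate.boot.model.naming.Identifier;
import org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl;
import org.hibernate.engine.jdbc.env.spi.JdbcEnvironment;

// Delegates to the default strategy and prepends a configurable prefix to table names.
public class PrefixingNamingStrategy extends PhysicalNamingStrategyStandardImpl {

    private final String prefix;

    public PrefixingNamingStrategy() {
        // Library-specific default; consumers can override it, here via a system property
        // for simplicity (an application property could be wired in instead).
        this.prefix = System.getProperty("otp.table-prefix", "otp_");
    }

    @Override
    public Identifier toPhysicalTableName(Identifier name, JdbcEnvironment context) {
        Identifier resolved = super.toPhysicalTableName(name, context);
        return Identifier.toIdentifier(prefix + resolved.getText(), resolved.isQuoted());
    }
}
```

In a Spring Boot project the strategy would then be registered by pointing spring.jpa.hibernate.naming.physical-strategy at the fully qualified class name (or by setting the corresponding Hibernate property programmatically).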

How should I control order of execution between Liquibase and Spring Boot data.sql?

We have an (internal to the company, external to the project) library (jar) that includes some Liquibase scripts to add tables to a schema to support the functionality of that library.
Using Spring Boot and Maven to run integration tests with H2, we have been using SQL files, listed in property files, to initialise the DB.
We want to be able to add data to the tables created by the external (to the project) library for the ITs, but we are finding that the tables haven't been created by Liquibase yet when Spring Boot/Spring Data attempts to run the insert statements in our SQL files.
Given the errors we're seeing (tables not existing when Spring attempts to run the insert.sql files), it looks like Spring is executing those files before Liquibase has done its thing.
How can I ensure the Liquibase config run by the library to create tables has completed before Spring runs the SQL files specified by the spring.datasource.data property?
We don't really want to include the test data in the library (which was working, but introduced other issues we are trying to work around, such as Liquibase inserting test data into the production DB).
What about using a different context for your tests?
You would have an application.properties in your test folder, and there you would define another changelog that includes all the changelogs that are needed (even the ones from the library) and also the .sql file that you are currently running, probably via JPA. A rough sketch of that setup is shown below. Try to look here if that helps.
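Roughly, what that could look like (all file names and paths are made up for illustration): src/test/resources/application.properties points Spring at a test changelog, e.g. spring.liquibase.change-log=classpath:/db/changelog/db.changelog-test.xml, and the insert files are removed from spring.datasource.data so they no longer race with Liquibase. The test changelog then orders things explicitly:

```xml
<!-- src/test/resources/db/changelog/db.changelog-test.xml -->
<databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.3.xsd">

    <!-- 1. Run the library's changelog first so its tables exist. -->
    <include file="db/changelog/library-changelog.xml"/>

    <!-- 2. Then load the test data that previously lived in spring.datasource.data. -->
    <changeSet id="insert-test-data" author="it">
        <sqlFile path="test-data/insert.sql"/>
    </changeSet>
</databaseChangeLog>
```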

How to generate Kotlin and or Java data model from GraphQL schema

My current project has switched to GraphQL APIs and I wish to automate the generation of model objects that match both query/mutation requests and responses.
All I require is the model classes, I do not want to use tools such as Apollo at runtime in my Application.
I require model classes to be either Java or Kotlin.
I found this https://www.graphql-java-kickstart.com/tools/schema-definition/
however this appears to require me to create the model classes myself...
based on this statement "GraphQL Java Tools will expect to be given three classes that map to the GraphQL types: Query, Book, and Author. The Data classes for Book and Author are simple:"
What am I missing?
When I attempt to use apollo-cli to download my schema I get this error:
~ - $ npx apollo-cli download-schema $https://my.graphql.end.point/graphql --output schema.json
Error while fetching introspection query result: only absolute urls are supported
Surely this is a basic requirement when employing GraphQL.
So if I understand you correctly, what you are trying to do is a) download and locally store the schema from an existing GraphQL endpoint and b) create Java model objects from this schema.
To download the schema you can use the graphql-cli. First install it via npm install -g graphql-cli, then run graphql init to set up your .graphqlconfig. Finally run graphql get-schema to download the schema from the defined endpoint.
Next you want to leverage a Java code generator that takes the GraphQL schema and creates:
Interfaces for GraphQL queries, mutations and subscriptions
Interfaces for GraphQL unions
POJO classes for GraphQL types
Enum classes for each GraphQL enum
There are various options depending on your setup / preferences (e.g. Gradle vs Maven):
https://graphql-maven-plugin-project.graphql-java-generator.com/index.html
https://github.com/kobylynskyi/graphql-java-codegen-gradle-plugin
https://github.com/kobylynskyi/graphql-java-codegen-maven-plugin
I recommend checking out the first option, since it looks very well documented and also provides full flexibility after generating the desired helpers:
graphql-java-generator generates the boilerplate code, and lets you concentrate on what's specific to your use case. Then, the running code doesn't depend on any dependencies from graphql-java-generator. So you can get rid of graphql-java-generator at any time: just put the generated code in your SCM, and that's it.
When in client mode, you can query the server with just one line of code.
For instance:
Human human = queryType.human("{id name appearsIn homePlanet friends{name}}", "180");
In this mode, the plugin generates: one Java class for the Query object, one Java class for the Mutation object (if any), and one POJO for each standard object of the GraphQL schema. All the necessary runtime is actually attached as source code into your project: the generated code is stand-alone. So your project, when it runs, doesn't depend on any external dependency from graphql-java-generator.
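If you are on Gradle, the second option above (the graphql-java-codegen Gradle plugin) could be wired in roughly like this. This is only a sketch: the version placeholder, paths and package name are illustrative, and the exact property names should be checked against the plugin documentation for your version:

```groovy
// build.gradle
plugins {
    id "io.github.kobylynskyi.graphql.codegen" version "<latest>"
}

graphqlCodegen {
    // Where the downloaded/committed schema lives (illustrative path).
    graphqlSchemaPaths = ["$projectDir/src/main/resources/schema.graphqls"]
    outputDir = new File("$buildDir/generated/graphql")
    packageName = "com.example.graphql.model"
    // Only the data model is wanted, so skip generating resolver/API interfaces.
    generateApis = false
}

// Compile the generated sources together with the rest of the project.
sourceSets.main.java.srcDir "$buildDir/generated/graphql"
compileJava.dependsOn "graphqlCodegen"
```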

How to import data into Apache Solr index from LDAP

I have an LDAP server that contains a large set of user data and would like to import this into an Apache Solr index. The question is not about whether this is a good idea or not (as discussed here). I need this kind of architecture as one of our production systems depends on a Solr index of our LDAP data.
I'm considering different options to do so, but I'm not sure which one should be preferred:
Option 1: Use the Apache Solr DataImportHandler:
This seems to be the most straightforward Solr way of doing it. Unfortunately there does not seem to be a DataSource available that works with LDAP.
I tried to combine the JdbcDataSource with the JDBC-LDAP-Bridge. In theory that might work, but the driver looks quite dated (latest version from 2007).
Another option might be to write a custom LdapDataSource using one of the LDAP libraries for Java (probably Spring LDAP, directly via JNDI, or something similar?).
Option 2: Build a custom Feeder:
Another option might be to write a standalone service/script that bridges between the two systems. However, that feels a bit like reinventing the wheel.
Option 3: Something I haven't thought of yet:
Maybe there are additional options here that I simply haven't discovered yet.
Solved it by writing a custom LDAP DataSource for the Solr DataImportHandler.
It's not as hard as it sounds. The JdbcDataSource can be used as a template for writing your custom DataSource, so basically you just have to rewrite that one Java class for the LDAP protocol.
For accessing LDAP there are numerous client options, such as plain JNDI, the UnboundID LDAP SDK, the Apache LDAP API, the OpenDJ LDAP SDK or OpenLDAP JLDAP (there are probably more, but I only had a look at those).
I went for UnboundID LDAP due to its well-documented API and full support for LDAPv3.
Afterwards it is just a matter of referencing the data source from the data-config.xml.
A nice side effect of this setup is that you can use all the goodies that the Solr DataImportHandler provides while indexing the LDAP server (entity processors and transformers). This makes it easy to map the data structure between LDAP and the Solr index. A rough sketch of such a DataSource is shown below.
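For illustration, a rough sketch of what such a custom DataSource could look like with the UnboundID SDK. This is not the actual implementation: the class name, connection properties and attribute handling are simplified assumptions.

```java
import com.unboundid.ldap.sdk.Attribute;
import com.unboundid.ldap.sdk.LDAPConnection;
import com.unboundid.ldap.sdk.LDAPException;
import com.unboundid.ldap.sdk.SearchResult;
import com.unboundid.ldap.sdk.SearchResultEntry;
import com.unboundid.ldap.sdk.SearchScope;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataSource;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Custom DataImportHandler DataSource that turns LDAP entries into DIH "rows".
public class LdapDataSource extends DataSource<Iterator<Map<String, Object>>> {

    private LDAPConnection connection;
    private String baseDn;

    @Override
    public void init(Context context, Properties initProps) {
        // Connection settings come from the <dataSource> element in data-config.xml.
        String host = initProps.getProperty("host", "localhost");
        int port = Integer.parseInt(initProps.getProperty("port", "389"));
        String bindDn = initProps.getProperty("bindDn");
        String password = initProps.getProperty("password");
        baseDn = initProps.getProperty("baseDn");
        try {
            connection = new LDAPConnection(host, port, bindDn, password);
        } catch (LDAPException e) {
            throw new RuntimeException("Could not connect to LDAP", e);
        }
    }

    @Override
    public Iterator<Map<String, Object>> getData(String query) {
        // 'query' is the LDAP filter configured on the entity, e.g. (objectClass=person).
        try {
            SearchResult result = connection.search(baseDn, SearchScope.SUB, query);
            List<Map<String, Object>> rows = new ArrayList<>();
            for (SearchResultEntry entry : result.getSearchEntries()) {
                Map<String, Object> row = new HashMap<>();
                row.put("dn", entry.getDN());
                // Single-valued attributes only; multi-valued attributes would need extra handling.
                for (Attribute attribute : entry.getAttributes()) {
                    row.put(attribute.getName(), attribute.getValue());
                }
                rows.add(row);
            }
            return rows.iterator();
        } catch (LDAPException e) {
            throw new RuntimeException("LDAP search failed", e);
        }
    }

    @Override
    public void close() {
        if (connection != null) {
            connection.close();
        }
    }
}
```

The class would then be referenced from data-config.xml in the same way as the built-in data sources, by pointing the dataSource element's type at the fully qualified class name.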
