How to deploy an assembly jar and use it as a provided dependency? - maven

I run Spark over HBase and Hadoop using YARN.
An assembly library, among other libraries, is provided server side
(named like spark-looongVersion-hadoop-looongVersion.jar);
it bundles numerous libraries.
When the Spark jar is sent to the server as a job for execution, conflicts may arise between the libraries included in the job and the server libraries (the assembly jar and possibly others).
I need to include this assembly jar as a "provided" Maven dependency to avoid conflicts between client dependencies and the server classpath.
How can I deploy and use this assembly jar as a provided dependency?

An assembly jar is a regular jar file, so like any other jar it can be a library dependency as long as it is available in an artifact repository to download from, e.g. Nexus, Artifactory or similar.
The quickest way to do it is to "install" it in your local Maven repository (see Maven's Guide to installing 3rd party JARs). That, however, ties you to what is available locally, so it will quickly get out of sync with what other teams are using.
The recommended way is to deploy the dependency using the Apache Maven Deploy Plugin.
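One possible way to script that deployment is to bind the plugin's deploy-file goal in a small helper pom, roughly as sketched here. The file path, coordinates, repository id and URL are all placeholders and not values from the question; the repository id is assumed to match a <server> entry in settings.xml that holds the credentials.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-deploy-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical execution id -->
      <id>deploy-spark-assembly</id>
      <phase>deploy</phase>
      <goals>
        <goal>deploy-file</goal>
      </goals>
      <configuration>
        <!-- Placeholder path to the assembly jar copied from the server -->
        <file>${project.basedir}/lib/spark-assembly.jar</file>
        <!-- Placeholder coordinates: pick the ones your team agrees on -->
        <groupId>com.example.cluster</groupId>
        <artifactId>spark-assembly</artifactId>
        <version>1.0.0</version>
        <packaging>jar</packaging>
        <!-- Placeholder repository id and URL of your Nexus/Artifactory -->
        <repositoryId>internal-releases</repositoryId>
        <url>https://repo.example.com/repository/releases</url>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The same parameters can also be passed to mvn deploy:deploy-file on the command line as -Dfile, -DgroupId, -DartifactId, -Dversion, -Dpackaging, -DrepositoryId and -Durl.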
Once it's deployed, declaring it as a dependency is no different from declaring any other dependency.
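As a minimal sketch, the declaration in the job's pom.xml could look like the following. The coordinates are hypothetical; use whatever groupId, artifactId and version the assembly jar was actually deployed under.

```xml
<dependency>
  <!-- Hypothetical coordinates of the deployed assembly jar -->
  <groupId>com.example.cluster</groupId>
  <artifactId>spark-assembly</artifactId>
  <version>1.0.0</version>
  <!-- provided: on the compile and test classpath, but not packaged
       with the job, because the server supplies it at runtime -->
  <scope>provided</scope>
</dependency>
```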

Provided dependencies scope
Spark dependencies must be excluded from the assembled JAR. If they are not, you should expect weird errors from the Java classloader during application startup. An additional benefit of an assembly without Spark dependencies is faster deployment. Please remember that the application assembly must be copied over the network to a location accessible by all cluster nodes (e.g. HDFS or S3).
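If the published Spark artifacts are used instead of the server's assembly jar, the exclusion can be sketched like this. The Scala suffix and the Spark version are assumptions and should match what runs on the cluster; com.example:my-job-utils is a made-up library that stands for anything the cluster does not supply.

```xml
<dependencies>
  <!-- Spark itself is supplied by the cluster, so keep it out of the assembly -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <!-- Assumed version: match the Spark version running on the cluster -->
    <version>2.4.8</version>
    <scope>provided</scope>
  </dependency>
  <!-- Hypothetical library the cluster does not supply: default (compile)
       scope, so it is packaged into the application assembly -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>my-job-utils</artifactId>
    <version>1.0.0</version>
  </dependency>
</dependencies>
```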

Related

Flink: Why is Hive dependency flink-sql-connector-hive not available on Maven Central?

According to Flink SQL Hive: Using bundled hive jar:
The following tables list all available bundled hive jars. You can pick one to the /lib/ directory in Flink distribution.
flink-sql-connector-hive-1.2.2 (download link)
flink-sql-connector-hive-2.2.0 (download link)
...
However, these dependencies are not available from Maven Central. As a workaround, I use user-defined dependencies, but this is not recommended:
the recommended way to add dependency is to use a bundled jar. Separate jars should be used only if bundled jars don’t meet your needs.
I wonder why the bundled jars are not available in Maven Central?
Follow-up: Since they are not available from Maven Central, how can I include them in pom.xml in order to run mvn package if I don't want to use user-defined dependencies?
Thanks!
The answer is that I was wrong. flink-sql-connector-hive does exist in Maven Central, see https://search.maven.org/artifact/org.apache.flink/flink-sql-connector-hive-2.2.0_2.11/1.12.0/jar.

Is the distribution package (.tgz) of Apache Kafka available as a Maven dependency?

I would like to add Apache Kafka .tgz archive contents to my Maven project's distribution package. I was not able to find the archive on Maven Central. Any reason why it is not there?
Maven Central generally has jars and POMs, not tarballs.
The kafka module (not kafka-clients) includes everything needed to run a KafkaServer class programmatically, though you'd want KafkaServerStartable to initialize it.

Why is a third-party dependency required exclusively from the OSGi container even if I have it in my Maven dependencies?

I want to know why OSGi does not respect the Maven dependencies.
I want to create an app in OSGi (AEM) that communicates (CRUD) with the database using JPA (EclipseLink).
I created a Maven project with the aem-archetype.
I added all required JPA dependencies to my Maven project's pom file.
There were no errors in Eclipse. I built the project via mvn clean install and installed it into AEM (CQ5) via mvn sling:install. All good so far, no errors.
But when I look at my bundle in the Felix console, I see that it is not Active but in the Installed state. The error reported is that it could not resolve the javax.persistence package.
I was puzzled, so I searched and read about it here:
You have to make sure that you place the same version in another bundle and deploy first. https://forums.adobe.com/thread/2325007
I converted the JPA jar to an OSGi bundle and installed it in my OSGi container, and the error was gone. Great!
But why does OSGi not pick up the dependencies I declared in the pom.xml of my Maven project? Why does it need JPA strictly from an OSGi bundle?
Maybe this has some architectural benefit, but could anyone please explain this behaviour of OSGi? And why/how is this feature useful?
The <dependency> section of your Maven POM only covers your compile time dependencies. That means when you run Maven to build your project those dependencies are used to compile the source code and build your bundle. Maven itself is not aware of AEM or OSGi or any other platform or framework (e.g. Spring).
Maven just compiles your code.
You, as a developer, are responsible for making sure that all those required compile time dependencies are also available at runtime.
What we usually do is to create an AEM content package Maven module and put all of our required third party dependencies (e.g. JPA bundles) into it. This content package is then deployed by Maven so that those dependencies are also available at runtime.
The reason is that what you add as a dependency is added to the build path of your project and made available to your classes. When you run mvn install, it checks that all dependencies are present and creates a bundle/jar for you. By default this bundle contains only your project's classes, not the other dependencies.
You need to check in depfinder whether the external dependencies are already present in the OSGi container. If not, you have to load them into the container, either by embedding them in your bundle with the help of the maven-bundle-plugin in your pom.xml, or by making a bundle out of the jar file (which I won't recommend), which is what you have done.
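A minimal sketch of the embedding option, assuming the project is built with the Apache Felix maven-bundle-plugin and that the JPA API jar has the artifactId javax.persistence (an assumption; Embed-Dependency matches on artifactId, so adjust it to the dependency actually declared in your pom):

```xml
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- Copy the matching compile/runtime dependency jar into the bundle
           and add it to the Bundle-ClassPath. The artifactId is assumed. -->
      <Embed-Dependency>javax.persistence;scope=compile|runtime</Embed-Dependency>
      <!-- Embed only the dependencies listed above, not their transitives -->
      <Embed-Transitive>false</Embed-Transitive>
    </instructions>
  </configuration>
</plugin>
```

Embedded classes are private to the bundle, so this fits best when no other bundle needs to share the same javax.persistence types.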
I hope this helps!

Can Gradle read transitive dependencies from pom.xml contained in local JAR files?

Unlike external dependencies (from Maven, Ivy, etc.), local JAR files usually do not provide a list of transitive dependencies for Gradle. In theory they can, though, in the form of the files pom.xml and pom.properties in the directory META-INF/maven/<groupId>/<artifactId>. As far as I understand, these are the same files Maven uses to provide transitive dependencies for an artifact.
So I wonder whether Gradle is somehow able to read these transitive dependencies from a local JAR file as if the local JAR were an external dependency. Adding just the local JAR as a dependency seems to ignore the embedded pom.xml.
Use case: I am writing a plugin API JAR for an internal product, which our developers should use to develop plugins. The API JAR has some external dependencies (Hibernate Annotations in domain classes, dom4j, stuff like that) and it would be great if the developers didn't have to define these dependencies themselves (they could change with a newer API version). I also don't want to create a fat JAR containing all dependencies because a) the size! and b) it would not contain the sources of the external dependencies.

Add multiple JARs and Javadoc to local Maven repository

I have a number of JAR files that comprise two different Java SDKs for BOXI R3.1: BusinessObjects Enterprise Java SDK and the Web Services Consumer Java SDK.
The BusinessObjects Enterprise Java SDK has a number of 'core' JARs:
biarengine.jar
biplugins.jar
cecore.jar
celib.jar
ceplugins_client.jar
ceplugins_core.jar
ceplugins_cr.jar
cereports.jar
cesession.jar
ceutils.jar
corbaidl.jar
ebus405.jar
flash.jar
SL_plugins.jar
logging.jar
pluginhelper.jar
xcelsius.jar
and a number of dependencies:
asn1.jar
backport-util-concurrent-2.2.jar
certj.jar
commons-logging.jar
derby.jar
freessl201.jar
jsafe.jar
log4j.jar
rascore.jar
sslj.jar
The Javadocs are available as a ZIP file.
The situation is similar for the web-services SDK, so I will omit the details.
Goal: package each SDK and its Javadoc as a local Maven repository (it doesn't appear that SAP is providing a remote one).
Questions:
Can one Maven repository contain multiple JAR files? The mvn deploy:deploy-file goal seems to work on a single file only: How to add a jar, source and Javadoc to the local Maven repository?
Should Javadocs be kept in ZIP format in a Maven repository?
If I choose to make two repos for a given SDK (i.e. core and dependencies), is specifying the linkage as easy as editing the core repo's configuration file?
Rather than creating a repo for the dependencies, I'm assuming that it would be better to identify and reference existing Maven repos (e.g. for log4j.jar). Will this lead me to JAR hell?
Yes, a Maven repo can contain multiple files; you can run mvn deploy:deploy-file on each one (using -Djavadoc and -Dsources as needed).
To specify dependencies for a jar, create a pom file for it (with dependencies) and use -DpomFile (and omit -DgeneratePom) in mvn deploy:deploy-file.
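A pom file passed via -DpomFile might look roughly like this sketch for cecore.jar. The com.example.boxi groupId, the 3.1 version and the choice of which jars cecore depends on are assumptions for illustration only, as is the log4j version; take the real versions from the jars themselves.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Hypothetical coordinates for the vendor jar being deployed -->
  <groupId>com.example.boxi</groupId>
  <artifactId>cecore</artifactId>
  <version>3.1</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- Already published in Maven Central; the version is an assumption -->
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <!-- Vendor jar deployed to the same repository under assumed coordinates -->
    <dependency>
      <groupId>com.example.boxi</groupId>
      <artifactId>certj</artifactId>
      <version>3.1</version>
    </dependency>
  </dependencies>
</project>
```

Projects that depend on cecore then pick up these entries as transitive dependencies.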
Yes, you should not reinvent the wheel by deploying artifacts that are already in Central to your own repository. You can use tools like http://mvnrepository.com to search for your jars (look at META-INF/MANIFEST.MF in your jars to find the version).
For more info see: http://maven.apache.org/plugins/maven-deploy-plugin/deploy-file-mojo.html.
