When to use "optional" dependencies and when to use "provided" scope? - maven

Dependencies marked with <optional>true</optional> or <scope>provided</scope> are ignored when the project that declares them is consumed as a dependency, i.e. they are not propagated transitively. I have read this, and my understanding is that the difference is like that between @Component and @Service in Spring: they only vary semantically.
Is it right?

In addition to the comment, there is a more important semantic difference: "provided" dependencies are expected to be supplied by the container. So if your container gives you Hibernate, you should mark Hibernate as provided.
Optional dependencies are mainly used to reduce the transitive burden of some libraries. For example: if a library can be used with 5 different database types, but you usually only require one, you can mark the database-specific dependencies as optional, so that users supply the one they actually use. If you don't, you might run into two types of problems:
The library pulls in a huge load of transitive dependencies of which you actually need very few, so you bloat your project for no reason.
More dangerously: You might pull two libraries with overlapping classes, so that the class loader cannot load both of them. This might lead to unexpected behaviour of your library.
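As a sketch of that optional pattern (the coordinates and version here are illustrative, not taken from the question): the library declares its database integrations as optional, and each consumer re-declares only the driver it actually uses.

```xml
<!-- library's pom.xml: compiled against, but not forced on consumers -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.7.3</version>
    <optional>true</optional>
</dependency>

<!-- consumer's pom.xml: opt back in to the single driver actually used -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.7.3</version>
</dependency>
```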

A minor difference I'd like to point out is the treatment of optional vs. provided by various plugins that create packages.
Apparently the war plugin will not package optional dependencies, but there is an open bug about it: https://issues.apache.org/jira/browse/MWAR-351
The assembly plugin doesn't seem to provide any way to filter based on optional status, while it allows you to filter based on scope.
It seems the same is true for the shade plugin.
TL;DR: if you are developing not a library but a top-level application, the provided scope will give you more flexibility.
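To illustrate the provided scope in that top-level-application case (the servlet API coordinates and version are an assumed, typical example): the dependency is available for compilation and tests, but is neither packaged nor propagated, because the container supplies it at runtime.

```xml
<dependency>
    <groupId>jakarta.servlet</groupId>
    <artifactId>jakarta.servlet-api</artifactId>
    <version>6.0.0</version>
    <!-- on the compile and test classpaths, but not packaged: the container supplies it -->
    <scope>provided</scope>
</dependency>
```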

AspectJ, how to use ajc in a modular way

I am trying to use the AspectJ compiler ajc in a modular (OSGi) setting. The standard way ajc seems to be used is to take aspects & Java code and turn them into one JAR with all classes and resources from the -inpath, -aspectpath, and -sourceroots.
I am trying to weave aspects into an OSGi executable JAR from bnd. This executable JAR contains a set of bundles that need to be woven. However, in a modular system, the boundary is quite important. For one, the manifest often contains information highly relevant to that bundle or to one of the many extenders. Flattening all the classes into a big blob won't work.
I am therefore weaving each bundle separately. However, the output is then cluttered with the aspects. I'd like to import these instead, to keep the aspect modules proper modules. However, using the annotation programming model, I notice that ajc modifies the aspect modules, so I need to rewrite those as well. This is fine, but since I weave each bundle separately, my question is whether the weaving of an aspect can depend on the modules it gets woven into. That is,
does the modification of the annotated aspect depend on the classes that it is woven into?
The other issue is what happens to resources with the same name? Since my -inpath is only one JAR (the bundle), I notice I end up with the correct manifest (META-INF/MANIFEST.MF) in the output. However, if the -inpath consists of many bundles, what will the manifest be? Or any other resource that has the same path and thus overlaps?
The last issue is external dependencies. I understand ajc wants to see the whole world and include this whole world in the output JAR. However, I must exclude the external dependencies of a bundle. Is there a way to mark JARs as: use, but do not include? A bit like the Maven 'provided' scope.
Summary:
Does the modification of an @Aspect annotated class depend on the targets it is applied to?
Can I compile the @Aspect annotated classes into separate JARs?
How do I handle external dependencies that will be provided at runtime and thus must be excluded from the output JAR?
What are the rules around overlapping resource paths in the -inpath and -sourceroots?
UPDATE: In the meantime I've made an implementation in Bndtools.
Does the modification of an @Aspect annotated class depend on the targets it is applied to?
If you want to be 100% sure you have to read the AspectJ source code, but I would assume that an aspect's byte code is independent of its target classes, because otherwise you could not compile aspects separately and also not build aspect libraries.
Can I compile the @Aspect annotated classes into separate JARs?
Absolutely, see above.
How do I handle external dependencies that will be provided at runtime and thus must be excluded from the output JAR?
If I understand the question correctly, you probably want to put them on the class path during compilation, not on the inpath.
What are the rules around overlapping resource paths in the -inpath and -sourceroots?
Again, you probably have to look at the source code. If I were you, I would simply assume that the selection order is undefined and make sure not to have duplicates in the first place. There should be Maven plugins helping you to filter the result the way you want it.
Bndtools seems to have close ties to Eclipse, as does AspectJ as an Eclipse project. Maybe you can connect with Andy Clement, the AspectJ maintainer. He is so swamped with his day-time job, though, that he hardly ever has any free cycles. I am trying to unburden him as much as I can, but OSGi is one of my blind spots and I hardly know the AspectJ source code. I am rather an advanced user.

What does it really mean that the api configuration exposes dependencies whereas implementation does not, in Gradle?

I have gone through the official doc and many StackOverflow questions about the difference between api and implementation configurations. I think I understand the basics, but I really want to understand what it means for dependencies to be exposed or not exposed.
This is what I have got so far. When I publish my Java library (written in Kotlin, but that's not relevant), the dependency scope in the published POM file is either compile when api is used or runtime when implementation is used, i.e.
dependencies {
    api "..."
}

<dependency>
    <groupId>...</groupId>
    <artifactId>...</artifactId>
    <version>...</version>
    <scope>compile</scope>
</dependency>

dependencies {
    implementation "..."
}

<dependency>
    <groupId>...</groupId>
    <artifactId>...</artifactId>
    <version>...</version>
    <scope>runtime</scope>
</dependency>
So does exposing dependencies in this case really just mean adding them to classpath (compile scope)?
One of the many answers about api vs implementation says it is merely about build optimization; it makes sense that build time would be reduced if we don't add everything to the classpath, maybe?
And a bonus question: the Gradle doc says the api configuration comes with the java-library plugin, but apparently I can use it without applying the plugin. How is this possible?
// Gradle 6.1.1
plugins {
    id 'org.jetbrains.kotlin.jvm' version 'XXX'
}

dependencies {
    api "myLibrary"
}
So does exposing dependencies in this case really just mean adding them to classpath (compile scope)?
Yes, it's pretty much just a matter of having them on the consumer's compile classpath or not.
One of the many answers about api vs implementation says it is merely about build optimization; it makes sense that build time would be reduced if we don't add everything to the classpath, maybe?
Well, good software design advocates not exposing internal implementation details. This is why you have public and private class members in the code. You could argue that this principle is sound when it comes to dependencies as well. I see the following benefits:
A consumer does not implicitly start relying on "internal" transitive dependencies. If they did, it would mean that you can't remove them from the library without breaking the consumers.
A reduced classpath may make compilation slightly faster. I don't think it matters a whole lot for normal projects, though. Maybe it is more impactful if you rely on Java or Kotlin annotation processors or Groovy AST transformations that feel like they scan the entire classpath each time.
Not having unnecessary modules on the compilation classpath means a library will not have to be recompiled when those modules change.
The last one is the biggest benefit in my opinion. Let's say you have a big multi-project where a shared sub-project internally relies on Apache Commons Lang. If you have declared Lang as an api dependency and update it, then all other projects relying on this shared project need to be recompiled. If you declare it as an implementation dependency instead, this will not happen. All those projects will still need to be re-tested, of course, as the runtime behaviour might have changed (this is handled correctly by default in Gradle).
And a bonus question: the Gradle doc says the api configuration comes with the java-library plugin, but apparently I can use it without applying the plugin. How is this possible?
This is because the Kotlin plugin also declares an api configuration. It has the same semantics as configured by the java-library plugin.
If your project is a multi-project, you can still add the java-library plugin even if it is using the Kotlin plugin. An additional change that this will cause is that consumers will see the output directory for the compiled classes instead of the final jar file. This removes the need to construct the jar during normal development, which should reduce build time. On the other hand, there is apparently a potential performance problem on Windows if you have a lot of classes in a single project, so the usual your mileage may vary disclaimer applies here as well (I don't know how many "a lot" is though).

Dealing with other dependencies in your own Maven dependency

I want to reuse and centralize the utils I created for my Spring REST API for my future projects. That's why I thought I'd outsource them to my own project and make them available as a Maven dependency.
These util files (e.g. a basic service and basic controllers) also contain Spring annotations, i.e. I need some Spring dependencies in my util dependency. Now I'm a bit unsure whether I'm making a mistake or not.
First of all, I'm not sure if I should even use Spring dependencies in a utility dependency or try to remove everything. Otherwise, I'll have to specify a Spring version, but it might differ from the version I want to use later in the project it's included in. How am I supposed to solve this?
It is perfectly reasonable to have dependencies for your dependencies (these are called transitive dependencies). Of course, you should keep the number as low as possible, but on the other hand, you do not want to reinvent the wheel.
When somebody uses your dependency, they will automatically pull in the transitive dependency on Spring. Now, several cases can occur:
If this is the only reference to spring, the version is just used as you stated it.
If at some other point a different version of Spring is given, Maven dependency mediation kicks in. It decides which version to take by a "nearest wins" rule.
But: you can always set the Spring version in <dependencyManagement> and then override all transitively given version numbers.
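A minimal sketch of that override in the consuming project's pom.xml (the Spring artifact and version are illustrative):

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-context</artifactId>
            <!-- this version wins over whatever the utility library pulls in transitively -->
            <version>5.3.30</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```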
That is a core concept of Maven. Your utility module must be shipped together with its Spring dependencies; these are called transitive dependencies.
Try to imagine the situation where all dependencies had been excluded. In that case nobody would know what kind and which version of Spring dependencies are needed.
Maven has very good dependency conflict resolution, based on a "nearest wins" principle, so you can easily override those Spring versions and your application will use only one of them.
Take a look at these:
[1] Dependency Mechanism
[2] Dependency Mediation and Conflict Resolution

Should I rely on transitive dependencies in Maven if they come from other sub-module of my parent?

Suppose we are working on the mortgage sub-module and are directly using Google Guava classes in the module code, but the dependency on Guava is defined in another sub-module under the same parent, and we have access to the Guava classes only through a transitive dependency on the "investment" module:
banking-system (parent pom.xml)
|
|-- investment (pom.xml defines <dependency>guava</dependency>)
|
|-- mortgage (pom.xml defines <dependency>investment</dependency>)
Should we still put a <dependency> to Guava in the mortgage pom.xml?
The con looks like duplication in our pom.xml; the pro is: if someone developing "investment" drops Guava, it will not stop our mortgage sub-module from being successfully built.
If yes, then what <version> should we specify? (None + <dependencyManagement> in the parent pom?)
If yes, should we use the provided scope in some module then?
Note: keep in mind that I am asking about the specific situation when the modules have a common parent pom (e.g. being an application as a whole).
Maybe this structure was not the best example, imagine:
banking-app
banking-core (dep.on: guava, commons, spring)
investment (dep.on: banking-core)
mortgage (dep.on: banking-core)
Should investment still explicitly declare Spring when it uses @Component, and declare Guava if it uses Guava's LoadingCache?
we are directly using the Google Guava classes in module code, but the dependency for Guava is defined in another sub-module under the same parent and we have access to Guava classes only by transitive dependency on the "investment" module [...] Should we still put a <dependency> to Guava in the mortgage pom.xml?
Yes, you should declare Google Guava dependency in your module and not expect it to be available as transitive-dependency. Even if it works with the current version, it may not be the case anymore in later versions of direct dependencies.
If your code depends on a module, it should depend only directly on classes of this module, not on a transitive dependency of this module. As you mentioned, there is no guarantee that the investment module will continue to depend on Guava in the future. You need to specify this dependency either in the parent's pom.xml or in the module itself to ensure it will be available without relying on transitive dependencies. It's not duplication as such; how else can you tell Maven your module depends on Guava?
I do not see any situation in which minimal best practices are respected where you would need to do otherwise.
If yes, then what <version> should we specify? (None + <dependencyManagement> in the parent pom?)
Yes, using <dependencyManagement> in the parent and a <dependency> without a version in your child module is best: you make sure all your modules use the same version of the dependency. As your modules form an application as a whole, this is probably better, as it avoids various issues such as different versions of the same dependency being present on the classpath and causing havoc.
Even if for some reason one of your modules using the same parent requires a different version of the dependency, it will still be possible to override the version for this specific module using <version>.
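A minimal sketch of that setup (the Guava version is illustrative): the parent pins the version, and each module that uses Guava declares the dependency without one.

```xml
<!-- parent pom.xml -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>33.0.0-jre</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<!-- mortgage/pom.xml: version inherited from the parent -->
<dependencies>
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
    </dependency>
</dependencies>
```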
If yes, should we use the provided scope in some module then?
Probably not; having the dependency with the compile scope is the best way to go with most packaging methods.
However you may have situations where you need or prefer to do this, for example if said modules requires to use a runtime environment specific version, or if your deployment or packaging model is designed in a way that demands it. Given the situation you expose, both are possible, though most of the time it should not be necessary.
Yes, declare the dependency. It's not duplication! That compile dependencies are transitive was not intended by the Maven developers; it's forced by the Java language, because features like class inheritance force this behavior. The "pro" you already mentioned is the important fact.
See the (*) note in the transitive-scope-table
Yes, always declare the needed third-party lib versions in your reactor parent with <dependencyManagement>. It's a pain to find errors caused by different lib versions at runtime. Avoid declaring versions of third-party libs in sub-modules of large reactors; always use <dependencyManagement> in the parent.
No, I would use "provided" only for dependencies provided by the runtime, in your example Tomcat/JBoss/WildFly/... for things like servlet-api/cdi-api, but not for third-party libraries.
Declare the "provided" scope as late as possible (i.e. your deployment(s) module war/ear) not in your business modules. This makes it easier to write tests.
For example:
investment (depends on Guava, scope provided)
mortgage (depends on investment, but doesn't need Guava itself)
--> mortgage's classpath doesn't contain Guava.
If you write a unit test for mortgage in which classes from investment are involved, it will not work --> you need to declare at least Guava with scope test/runtime to run these tests...
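That fallback could look like this in mortgage's pom.xml (the version is illustrative, and runtime scope may fit instead, depending on the tests):

```xml
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>33.0.0-jre</version>
    <!-- only needed because investment declares Guava as provided -->
    <scope>test</scope>
</dependency>
```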
When a module uses a third-party library, the module should explicitly depend on that library in its pom.xml too. Imagine another project using the 'mortgage' module that doesn't already depend on Guava: it will fail, e.g. when a unit test hits a code path that involves Guava. An explicit dependency also covers you against the scenario where you refactor the 'investment' module so that it doesn't use Guava anymore. Your 'mortgage' module should be agnostic to such changes in investment's dependencies.
It's always correct to explicitly list your direct dependencies. When it comes to version, it's best to keep that in the dependencyManagement section of your parent pom so all child projects inherit that (same) version.

What is the final dependency scope when different scopes are specified for one JAR?

I am studying some JARs in the Maven Repository and discovered this:
Hibernate Validator Engine 5.4.0.FINAL lists jboss-logging as a compile dependency, and jboss-logging-processor as a provided dependency
jboss-logging-processor lists jboss-logging as a provided dependency
In general, when a JAR is mentioned multiple times along the way under different scopes, what is the final, actual scope? Is there an order of precedence of sorts?
It depends on the context rather than on inheritance.
However, some implications are present:
If something is marked as compile, it is implicitly also a runtime dependency.
If something is marked as runtime, it is implicitly also a test dependency.
Something marked as provided is available at compile and test time, but is not packaged, because the container is expected to supply it at runtime.
system behaves like provided, except that you must point to the JAR explicitly via a path.
