Why should we define a scope of a dependency in Maven?

Maven provides a <scope> tag for dependencies which can "limit the transitivity of a dependency". I understand that by defining, for instance, a test scope for a given dependency, this dependency will not be available in other phases. But I don't get what the advantage of doing so is.

Scopes have three main purposes:
Prevent you from using something in your application that you did not want to use (if you declare the implementation as runtime, you cannot accidentally use it in your code).
Reduce the number of transitive dependencies. In particular, test dependencies will not become dependencies of the users of your library.
Reduce the size of a WAR/EAR: if your container already provides the dependencies, you declare them as provided so that they are not packaged into your application.
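As a rough sketch, the three purposes above could look like this in a pom.xml. The libraries and version numbers are only illustrative choices, not something the question specifies:

```xml
<dependencies>
  <!-- compile (default scope): the API your code is written against -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.36</version>
  </dependency>

  <!-- runtime: the implementation; not on the compile classpath,
       so your code cannot reference its classes by accident -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.36</version>
    <scope>runtime</scope>
  </dependency>

  <!-- test: only for compiling and running tests; not passed on
       to users of your library -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.13.2</version>
    <scope>test</scope>
  </dependency>

  <!-- provided: supplied by the servlet container, so it is not
       packaged into the WAR -->
  <dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>4.0.1</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```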

https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
You don't need a hoe and a gun for digging; a hoe alone is enough.
Likewise, you don't need the JUnit dependency to run on a web server (scope runtime); you need JUnit only when you test (scope test), and you don't need it when you package for production.
Another benefit is avoiding version conflicts and unnecessary, redundant dependencies.

Related

Dealing with other dependencies in your own Maven dependency

I want to reuse and centralize the utils I created for my Spring REST API for my future projects. That's why I thought I'd outsource them to my own project and make them available as a Maven dependency.
These util files (e.g. a basic service, basic controllers) also contain Spring annotations, i.e. I need some Spring dependencies in my util dependency. Now I'm a bit unsure whether I'm making a mistake or not.
First of all, I'm not sure if I should even use Spring dependencies in a utility dependency or try to remove everything. Otherwise, I'll have to specify a Spring version, but it might differ from the version I want to use later in the project it's included in. How am I supposed to solve this?
It is perfectly reasonable to have dependencies for your dependencies (these are called transitive dependencies). Of course, you should keep the number as low as possible, but on the other hand, you do not want to reinvent the wheel.
When somebody uses your dependency, they will automatically pull in the transitive dependency on Spring. Now, several cases can occur:
If this is the only reference to Spring, the version is used just as you stated it.
If a different version of Spring is given at some other point, Maven dependency mediation kicks in. It decides which version to take by a "nearest is best" rule.
But: you can always set the Spring version in <dependencyManagement> and thereby override all transitively given version numbers.
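A minimal sketch of that last case, as it might appear in the pom.xml of the project that consumes the utility library. The coordinates com.example:rest-utils and both version numbers are hypothetical placeholders:

```xml
<!-- Pins the Spring version for the whole dependency tree,
     including versions that arrive transitively -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework</groupId>
      <artifactId>spring-context</artifactId>
      <version>5.3.30</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <!-- Hypothetical utility library that itself depends on Spring -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>rest-utils</artifactId>
    <version>1.0.0</version>
  </dependency>
</dependencies>
```

Running mvn dependency:tree in the consuming project shows which version was actually selected.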
That is the main concept of Maven. Your utility module must be shipped together with its Spring dependencies. These are called transitive dependencies.
Imagine the situation where all dependencies were excluded: nobody would know which Spring dependencies, and which versions of them, are needed.
Maven has very good dependency conflict resolution. It's based on the "nearest definition" principle: the version declared closest to your project in the dependency tree wins. So you can easily override those Spring versions, and your application will use only one of them.
Take a look at these:
[1] Dependency Mechanism
[2] Dependency Mediation and Conflict Resolution

Should I rely on transitive dependencies in Maven if they come from other sub-module of my parent?

Suppose we are working on the mortgage sub-module, and we are directly using the Google Guava classes in module code, but the dependency for Guava is defined in another sub-module under the same parent and we have access to Guava classes only by transitive dependency on the "investment" module:
banking-system (parent pom.xml)
|
|-- investment (pom.xml defines <dependency>guava</dependency>)
|
|-- mortgage (pom.xml defines <dependency>investment</dependency>)
Should we still put a <dependency> to Guava in the mortgage pom.xml?
The con looks like duplication in our pom.xml; the pro is: if someone developing "investment" drops Guava, it will not stop our mortgage sub-module from being built successfully.
If yes, then what <version> should we specify? (none + <dependencyManagement> in parent pom?)
If yes, should we use a <provided> scope in some module then?
Note: Keep in mind that I am asking about the specific situation where modules have a common parent pom (e.g. being an application as a whole).
Maybe this structure was not the best example, imagine:
banking-app
banking-core (dep.on: guava, commons, spring)
investment (dep.on: banking-core)
mortgage (dep.on: banking-core)
Should Investment still explicitly declare Spring when it uses @Component, and declare Guava if it uses Guava's LoadingCache?
we are directly using the Google Guava classes in module code, but the dependency for Guava is defined in another sub-module under the same parent and we have access to Guava classes only by transitive dependency on the "investment" module [...] Should we still put a <dependency> to Guava in the mortgage pom.xml?
Yes, you should declare the Google Guava dependency in your module and not expect it to be available as a transitive dependency. Even if it works with the current version, that may no longer be the case in later versions of your direct dependencies.
If your code depends on a module, your code should depend directly only on classes of this module, not on a transitive dependency of this module. As you mentioned, there is no guarantee that the investment module will continue to depend on Guava in the future. You need to specify this dependency either in the parent's pom.xml or in the module itself to ensure it will be available without relying on transitive dependencies. It's not duplication as such: how else can you tell Maven that your module depends on Guava?
I do not see any situation in which minimal best practices are respected where you would need to do otherwise.
If yes, then what <version> should we specify? (none + <dependencyManagement> in parent pom?)
Yes, using <dependencyManagement> in the parent and a <dependency> without a version in your child module is best: it makes sure all your modules use the same version of the dependency. As your modules form an application as a whole, this is probably better, as it avoids issues such as different versions of the same dependency being present on the classpath and causing havoc.
Even if for some reason one of the modules using the same parent requires a different version of the dependency, it will still be possible to override the version for that specific module using <version>.
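A sketch of that setup, assuming the module layout from the question. The Guava version shown is only an example value:

```xml
<!-- banking-system/pom.xml (parent): declares the version once -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>31.1-jre</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- mortgage/pom.xml (child): declares the direct dependency,
     with the version inherited from the parent -->
<dependencies>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
  </dependency>
</dependencies>
```

Note that <dependencyManagement> on its own adds nothing to any classpath; each module that uses Guava still has to list it under <dependencies>.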
If yes, should we use a <provided> scope in some module then?
Probably not; having the dependency with compile scope is the best way to go with most packaging methods.
However, you may have situations where you need or prefer to do this, for example if said modules need a version specific to the runtime environment, or if your deployment or packaging model is designed in a way that demands it. Given the situation you describe, both are possible, though most of the time it should not be necessary.
Yes, declare the dependency. It's not duplication! That compile dependencies are transitive is not something the Maven developers intended; it is forced by the Java language, because features like class inheritance require this behavior. The "pro" you already mentioned is the important point.
See the (*) note in the transitive-scope-table
Yes, always declare the versions of the third-party libraries you need in your reactor parent with dependencyManagement. It's a pain to track down errors caused by different library versions at runtime. Avoid declaring versions of third-party libraries in sub-modules of large reactors; always use dependencyManagement in the parent.
No, I would use "provided" only for dependencies provided by the runtime, in your example Tomcat/JBoss/WildFly/..., for things like servlet-api/cdi-api/... But not for third-party libraries.
Declare the "provided" scope as late as possible (i.e. in your deployment module(s), war/ear), not in your business modules. This makes it easier to write tests.
For example:
investment (depends on guava, scope := provided)
mortgage (depends on investment, but doesn't need guava itself)
--> mortgage classpath doesn't contain guava.
If you write a unit test for mortgage that involves classes from investment, it will not work -> you need to declare at least guava with scope=test/runtime to run these tests...
When a module uses a third-party library, the module should explicitly depend on that library in its pom.xml too. Imagine that another project wants to use the 'mortgage' module and doesn't already depend on Guava: it will fail, e.g. when a unit test hits a code path that involves Guava. An explicit dependency also covers you against the scenario where you refactor the 'investment' module so that it doesn't use Guava anymore. Your 'mortgage' module should be agnostic to such changes in its dependencies.
It's always correct to explicitly list your direct dependencies. When it comes to version, it's best to keep that in the dependencyManagement section of your parent pom so all child projects inherit that (same) version.

What is the final dependency scope when different scopes are specified for one JAR?

I am studying some JARs in the Maven Repository and discovered this:
Hibernate Validator Engine 5.4.0.FINAL lists jboss-logging as a compile dependency, and jboss-logging-processor as a provided dependency
jboss-logging-processor lists jboss-logging as a provided dependency
In general, when a JAR is mentioned multiple times along the way under different scopes, what is the final, actual scope? Is there an order of precedence of sorts?
It depends on the context rather than inheritance.
However, some implications hold:
If something is marked as compile, it is implicitly also a runtime dependency.
If something is marked as runtime, it is implicitly also a test dependency.
provided is available on the compile and test classpaths, but it is not packaged, because the JDK or container is expected to supply it at runtime.
system behaves like provided, except that you have to supply the JAR explicitly.

When to use "optional" dependencies and when to use "provided" scope?

Dependencies marked with <optional>true</optional> or <scope>provided</scope> will be ignored when they are depended on transitively. I have read this, and my understanding is that it is like the difference between @Component and @Service in Spring: they only vary semantically.
Is it right?
In addition to the comment, there is a more important semantic difference: "provided" dependencies are expected to be supplied by the container, so if your container gives you Hibernate, you should mark Hibernate as provided.
Optional dependencies are mainly used to reduce the transitive burden of some libraries. For example: if your library can be used with 5 different database types, but users usually require only one, you can mark the database-specific dependencies as optional, so that the user can supply the one they actually use. If you don't, you might get two types of problems:
The library pulls a huge load of transitive dependencies of which you actually need very few so that you blow up your project without reason.
More dangerously: You might pull two libraries with overlapping classes, so that the class loader cannot load both of them. This might lead to unexpected behaviour of your library.
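The database example above might be sketched like this in the library's pom.xml. The two drivers and their versions are illustrative assumptions:

```xml
<dependencies>
  <!-- The library compiles against both drivers, but neither is
       passed on transitively to projects that use the library -->
  <dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.7.3</version>
    <optional>true</optional>
  </dependency>
  <dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>2.2.224</version>
    <optional>true</optional>
  </dependency>
</dependencies>
```

A project that uses the library then declares only the driver it actually needs as a regular dependency in its own pom.xml.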
A minor difference I'd like to point out is the treatment of optional vs. provided by various plugins that create packages.
Apparently the WAR plugin will not package optional dependencies, but there is an open bug about it: https://issues.apache.org/jira/browse/MWAR-351
The assembly plugin doesn't seem to provide any way to filter based on optional status, while it allows you to filter based on scope.
It seems the same is true for the shade plugin.
TL;DR: if you are developing not a library but a top-level application, provided scope will give you more flexibility.

Understanding Maven scoping better

I have been struggling to figure out what's the use of scoping that is provided by Maven
as mentioned here.
Why should you not always have compile time scoping? Real life examples would be really appreciated.
Compile-scoped dependencies are the default: they are available on all classpaths (compile, test and runtime) and are propagated to dependent projects.
The test scoped ones -- only during tests. Say you have tests using junit, or easymock. You obviously do not want your final artifact to have a dependency on them, but would like to be able to just depend on these libraries while running your tests.
Those dependencies which are marked provided are expected to be on your classpath when you're running the produced artifact. For example: you have a webapp and you have a dependency on the servlet library. Obviously, you should not package it inside your WAR file, as the webapp container will already have it and a conflict may occur.
One of the reasons to have different scopes for dependencies is that different parts of the build can depend on different dependencies. For example, if you are only compiling your code and not executing any tests, then there is no point in having Maven downloading your test dependencies (if they're not already present in your local repository, of course). The other reason is that not all dependencies need to be placed in your final artifact (whether it's an assembly, or WAR file), as some of the dependencies are only used during the build and testing phases.
compile
These JAR files will be copied into the WAR file that is built.
Ex: hibernate-core.jar needs to be in our WAR.
provided
These JARs are considered only at compile time and test time.
Ex: servlet.jar will be provided by the server we deploy to, so there is no need to include it in our WAR file.
test
These JARs are required only for running test classes.
Ex: junit.jar is required only for running JUnit test classes; there is no need to deploy it.
Scopes are explained quite well here:
https://maven.apache.org/pom.html#Dependencies
As a reference, I copied the paragraph:
scope: This element refers to the classpath of the task at hand (compiling and runtime, testing, etc.) as well as how to limit the transitivity of a dependency. There are five scopes available:
compile - this is the default scope, used if none is specified. Compile dependencies are available in all classpaths. Furthermore, those dependencies are propagated to dependent projects.
provided - this is much like compile, but indicates you expect the JDK or a container to provide it at runtime. It is only available on the compilation and test classpath, and is not transitive.
runtime - this scope indicates that the dependency is not required for compilation, but is for execution. It is in the runtime and test classpaths, but not the compile classpath.
test - this scope indicates that the dependency is not required for normal use of the application, and is only available for the test compilation and execution phases.
system - this scope is similar to provided except that you have to provide the JAR which contains it explicitly. The artifact is always available and is not looked up in a repository.
There are a couple of reasons why you might not want all dependencies to have the default compile scope:
Reduce the size of the final artifact (JAR, WAR, ...) by choosing a different scope.
When you have a multi-module project, you can let each module have its own version of a dependency.
Avoid class version collisions with the provided scope. For instance, if you are going to deploy a WAR file to a WebLogic server, you need to get rid of some javax JARs, like javax.servlet, javax.xml.parsers, JPA JARs, etc.; otherwise you might end up with class collision errors.

Resources