Clean up and maintain locally cached Maven artifacts in .m2

I have an .m2 repository on my Jenkins slave that grows every day; it is currently nearly 40 GB.
Since multiple jobs run and pick up dependencies from .m2, I cannot remove everything, but I can see that each artifact directory in .m2 contains older, useless versions of the artifact.
Is there any way to make Maven keep only the latest version of each artifact in the .m2 repository when a job runs mvn install (for example, with incremental versions of the form x.y.z.w)?

If you don't care that external dependencies are pulled in every build, you could use a private Maven repository per job (Maven -> Advanced -> Check 'use private Maven repository') and clean the workspace at the start of your build. The private repository creates a .repository in your workspace, so cleaning your workspace will ensure you start with an empty repository.
If you have many shared external dependencies, this may use even more disk space, since they are stored once per private repository. In that case you could write a script that periodically (using a task scheduler like cron) removes unused files from the shared repository; see for example this Stack Overflow answer.
However, be cautious with a shared Maven repository! By default Maven's local repository is not safe for concurrent access, so concurrent jobs downloading the same artifact might pick up incomplete downloads. Consider using the Takari extensions to make your Maven repository thread-safe.
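The periodic cleanup of a shared repository mentioned above could look roughly like the following Python sketch. It is not the script from the linked answer; the function name remove_stale_files and its parameters are made up for illustration. It assumes "unused" means "neither read nor modified for a given number of days", and it falls back to the modification time because access times are not updated on file systems mounted with noatime.

```python
import os
import time
from pathlib import Path


def remove_stale_files(repo_root, max_age_days, now=None, dry_run=True):
    """Delete files under repo_root not used for more than max_age_days.

    Hypothetical helper: returns the paths that were removed (or, in
    dry-run mode, the paths that would be removed).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Walk bottom-up so directories emptied by the cleanup can be pruned.
    for dirpath, _dirnames, filenames in os.walk(repo_root, topdown=False):
        for name in filenames:
            path = Path(dirpath) / name
            stat = path.lstat()
            # Assumption: the later of atime and mtime approximates "last used".
            last_used = max(stat.st_atime, stat.st_mtime)
            if last_used < cutoff:
                removed.append(path)
                if not dry_run:
                    path.unlink()
        is_root = Path(dirpath) == Path(repo_root)
        if not dry_run and not is_root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return removed
```

A cron job could call remove_stale_files(Path.home() / ".m2" / "repository", 90) first to print the candidates, and pass dry_run=False once the list looks right. Running it while builds are active has the same concurrency risk as described above, so it is best scheduled for a quiet period.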

Having been through a similar problem, I came up with a solution and open-sourced it, as it might help others. The application is available on GitHub; it can clean up old dependencies and retain just the latest version.
https://github.com/techpavan/mvn-repo-cleaner
Apart from cleaning old dependencies, it has other features such as date-based cleanup (by download date or last-accessed date), removal of snapshots, sources and javadocs, and ignoring or enforcing deletion of specific groups or artifacts.
Additionally, it is cross-platform and runs on Windows as well as Unix / Linux environments.
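To illustrate the "retain just the latest" idea independently of that tool, here is a minimal Python sketch. It is not how mvn-repo-cleaner is implemented, and all function names are hypothetical. It assumes the standard local repository layout (a version directory contains a .pom file and sits directly under the artifact directory) and purely numeric dotted versions such as x.y.z.w; Maven's real version ordering, with qualifiers like -SNAPSHOT or -RC1, is more involved.

```python
import re
import shutil
from pathlib import Path


def version_key(version):
    """Sort key for numeric dotted versions such as 1.2.3.4.

    Assumption: qualifiers (-SNAPSHOT, -RC1, ...) are ignored, which is
    much simpler than Maven's actual version comparison.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def find_old_versions(repo_root):
    """Return version directories that have a newer sibling version."""
    by_artifact = {}
    for pom in Path(repo_root).rglob("*.pom"):
        version_dir = pom.parent
        by_artifact.setdefault(version_dir.parent, set()).add(version_dir)
    old = []
    for versions in by_artifact.values():
        ordered = sorted(versions, key=lambda d: version_key(d.name))
        old.extend(ordered[:-1])  # everything except the highest version
    return sorted(old)


def remove_old_versions(repo_root, dry_run=True):
    """Delete all but the highest version of each artifact."""
    old = find_old_versions(repo_root)
    if not dry_run:
        for version_dir in old:
            shutil.rmtree(version_dir)
    return old
```

Note that jobs pinned to an older version will simply download it again on their next build, so this trades disk space for download time.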

Related

Maven Multi Module build

Our project has a requirement to build only the modules that have changed; the remaining modules should be resolved from the local or remote Maven repository. Is there any way to do this?
The ideal solution would be for Maven to detect changes in modules from an SCM such as SVN, build only those modules, and pick up the rest from the repository.
We want to do this because we have many modules and compiling them all takes a long time, so this would save us a lot of time.

What if a Maven artifact is no longer available?

As we just migrated all projects to maven projects, one question came up and I couldn't find any suitable solutions yet.
What if Maven shuts down forever, or a single artifact is no longer available? Or what if some versions are no longer available and the software cannot run with the newer ones?
Since we're not hosting a copy of the artifacts locally, should we host copies of every jar somewhere for such a scenario?
Thanks
Unlikely, but if you build your artifacts with Maven, you already have a copy of each relevant artifact in your local repository. If you back it up from time to time, you have the necessary level of security.
Alternatively, use a company repository manager (Nexus/Artifactory) to proxy Maven Central. It will also keep copies of the artifacts you use.

Maven different local repositories for SNAPSHOTs and RELEASE artifacts

Is it possible in Maven to configure different local repositories for SNAPSHOT and RELEASE artifacts?
The reason I am asking is that we use Jenkins for continuous builds of our project. To ensure consistency (if the same artifact is built by different Jenkins jobs, a race condition can cause chaotic behavior), we create a fresh local repository for Jenkins before each build starts.
Now the problem is that our project is huge, so for every build we have to download lots of dependencies from our Nexus, but when you think about it, there is no reason to download the RELEASE artifacts anew every time. The RELEASE artifacts don't change from build to build; for example, Spring 4.5, httpclient 4.0 and aspectj 1.8.1 are the same from one build to another.
So to ensure consistency, we only need to make sure the SNAPSHOT dependencies are not in the repository. If we could have two local repositories, one for RELEASE artifacts and the other for SNAPSHOTs, then before every build we could delete the SNAPSHOT repository but reuse the local RELEASE repository, which would save us gigabytes of downloads from Nexus.
I know we can configure RELEASE and SNAPSHOT handling for remote repositories; is it possible to do the same sort of configuration for local repositories?
If this is not possible, how would you solve this problem?
There is currently no way to achieve this, and yes, I agree with the sentiment.
Reasonably recent versions of Jenkins' Maven plugin allow you to specify a custom local repository without having to edit a settings.xml file; the option is right there on the job definition screen (in the Advanced section, select Use private Maven repository).
So, what I would do is use this option, and precede the Maven build step with a script that deletes all directories in the local private repository whose names end in -SNAPSHOT.
It's repulsive, but I can't think of any other way.
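A minimal Python sketch of such a pre-build step follows. The function name purge_snapshots is made up for illustration, and the repository path is an assumption: with the private repository option, Jenkins places it in a .repository directory inside the job workspace.

```python
import shutil
from pathlib import Path


def purge_snapshots(repo_root, dry_run=False):
    """Remove every directory under repo_root whose name ends in -SNAPSHOT.

    Hypothetical helper: returns the directories that were (or, in
    dry-run mode, would be) removed.
    """
    root = Path(repo_root)
    if not root.is_dir():
        return []  # first build of the job: nothing to clean yet
    # Collect the matches before deleting, so the tree is not modified
    # while it is still being traversed.
    snapshot_dirs = sorted(p for p in root.rglob("*-SNAPSHOT") if p.is_dir())
    for snapshot_dir in snapshot_dirs:
        if not dry_run and snapshot_dir.exists():
            shutil.rmtree(snapshot_dir)
    return snapshot_dirs
```

In a Jenkins job this could be invoked before the Maven step as purge_snapshots(Path(os.environ["WORKSPACE"]) / ".repository"), assuming os is imported and the job does not clean its workspace. Release artifacts are left in place, so only SNAPSHOT dependencies are fetched again from Nexus.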

How to change updatePolicy for my local Maven repository?

I know how to do it for an external repository but not for my local repository, since I don't have a <repository> for my local repository in my settings.xml.
I use snapshot versions for my sub-projects, so when I rebuild the parent project I want Maven to get all the sub-projects' snapshot versions from my local repository not only once a day (which seems to be what happens by default) but always.
If I'm understanding your comment, I think @FrVaBe may have the correct answer. When you change code for a child project on your development machine, it's up to you to rebuild the snapshot and get it into your local artifact repo (via mvn install) so it's available for the parent project to use.
If, however, you want your parent project build to pull in changes made by your teammates and published to the corporate remote repository more often than once per day, read on.
Here is a summary of how Maven Central (and kin), remote repositories (e.g. a company instance of Nexus or Artifactory) and your local repository work together. If you always want the latest version of snapshots to be downloaded on every build, go into your settings.xml file, find the <snapshots> element of the repository containing the snapshot you want, and change its <updatePolicy> value to "always". Personally I rarely do this; I simply add the '-U' option to my mvn command line when I want to ensure I have the latest version of a snapshot from my remote repo.
There is no update policy for the local repository!
The local repository is just a bunch of files. When you install to your local repository, your local projects already reference the artifacts directly. There is no update that needs to be performed, except that your IDE may need to be refreshed to pick up the newer files.
In this manner you can build local snapshots all day long with no versioning headaches, no updates required and no old artifacts left hanging around afterwards. Nice and clean but not so obvious if you're new to Maven and still getting to grips with all these repositories and their fancy update mechanisms.
I think you misunderstood something. Maven will always take the newest SNAPSHOT from your local repository. But in your project setup (Project Inheritance) you need to build the sub-projects on their own if you changed something.
An automatic build of the sub-projects only happens in a Project Aggregation layout.
The difference is explained in the Project Inheritance vs Project Aggregation section of the documentation.

Does Artifactory support a concept of SNAPSHOT artifact expiration?

I am using Artifactory to support an enterprise multi-module project. Often, we change the names of modules and the associated dependencies in POM files are not updated to use the new module name. Because SNAPSHOT dependencies are not automatically cleaned up on a regular interval, these old module references can stay there for months. I discovered a few when I migrated Artifactory to another server and the old module dependencies resulted in build errors. I am building these SNAPSHOT artifacts nightly using Jenkins so I would like some way to automate cleaning up the SNAPSHOT artifacts.
Does Artifactory (or another artifact server such as Nexus) support a concept where if a SNAPSHOT artifact is older than X days, the artifact is deleted? Is there another way to automate artifact server cleanup to accomplish what I want to do? The only thing I can think of is to create a cron job to clear out libs-snapshot-local on a regular interval before the nightly build starts. Has someone already built this capability?
As far as I know, Artifactory doesn't have an automated way to delete modules that are older than a certain value. At my shop we've written a Groovy client that uses Artifactory's REST API to do exactly this.
Note that, if your artifacts are shared libraries, you need to be careful that nothing depends on them before you delete them. Our script takes this into account, too.
If you're interested in following up, post a comment and I'll see if it's OK to share our script with you.
Another solution might be a user plugin. You can write a simple Groovy script that runs in Artifactory itself (as opposed to the remote invocation via REST that Gareth proposed) on a scheduled basis, searching for artifacts that have not been downloaded for a long time and deleting them.
I've made a Ruby script to delete artifacts that haven't been downloaded for X days. It works just like what JBaruch described in his answer.
It isn't a plugin, so it works with Artifactory Open Source. User plugins are supported only by Artifactory Pro.
The source code: https://gist.github.com/aleung/5203736

Resources