Download artifacts from Maven repository without using Maven - shell

In various shell scripts, we need to download artifacts from a Maven repository (Nexus 2.x at the moment, but may change in the future).
The servers that run the scripts usually do not have Maven installed, so I am looking for something HTTP-based.
On the one hand, there is a REST interface which can be used like
wget "http://local:8081/service/local/artifact/maven/redirect?g=com.mycompany&a=some-app&v=1.2.3"
On the other hand, you can construct a "standard" URL that seems to work for different Maven repositories. It consists of a prefix, then the groupId with slashes instead of dots, then the artifactId, then the version, and then a file name of the form artifactId-version(-classifier).type.
What is the recommended practice?

The Maven coordinates section of the POM reference describes the second scenario you mentioned. In general I've found that pattern easiest to explain to folks learning Maven, i.e. whether local or remote, an artifact is located at
$REPO/groupId/as/path/artifactId/version/artifactId-version[-classifier].type
where $REPO can be $USER_HOME/.m2/repository or https://remote.repo:port/....
I would also prefer the second, as I suspect it will make it easier for this app to work with another repository some day if needed. Even if that turns out not to be quite true, it's more self-documenting, so it seems like it would be easier to adjust.
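To make the layout above concrete, here is a minimal sketch of a helper that builds such a URL in plain POSIX shell. The function name maven_artifact_url and the repository prefix in the usage comment are made up for illustration; only the path pattern comes from the description above.

```shell
#!/bin/sh
# Build the standard repository-layout URL for an artifact.
# Usage: maven_artifact_url REPO GROUP_ID ARTIFACT_ID VERSION [TYPE] [CLASSIFIER]
# (hypothetical helper, not part of Maven or Nexus)
maven_artifact_url() {
    repo=${1%/}                                   # drop one trailing slash, if any
    group_path=$(printf '%s' "$2" | tr '.' '/')   # com.mycompany -> com/mycompany
    artifact=$3
    version=$4
    type=${5:-jar}                                # default to jar when no type is given
    classifier=$6
    # artifactId-version[-classifier].type
    file="$artifact-$version${classifier:+-$classifier}.$type"
    printf '%s/%s/%s/%s/%s\n' "$repo" "$group_path" "$artifact" "$version" "$file"
}

# Example download (the repository prefix is an assumption; adjust to your server):
# curl -fL -O "$(maven_artifact_url http://local:8081/nexus/content/repositories/releases com.mycompany some-app 1.2.3)"
```

Note that snapshot versions deployed to a remote repository usually carry timestamped file names, so this sketch is only meant for release versions.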

Related

Maven scm and svn

I am having trouble accessing the svn revision number through Maven. The only real help I've received from the SCM usage page is the following:
<scm>
<connection>scm:svn:http://somerepository.com/svn_repo/trunk</connection>
<developerConnection>scm:svn:https://somerepository.com/svn_repo/trunk</developerConnection>
<tag>HEAD</tag>
<url>http://somerepository.com/view.cvs</url>
</scm>
This means nothing to me as I can't figure out what connection, developerConnection, and url mean. I simply plugged in the url to my repo for all 3 elements. I also don't know why Maven does not ask me for the username and password for the repository.
I am very new to Maven and might be asking a very basic question but would appreciate a full explanation as to how I am to access the svn repo.
First I would begin by clarifying the usage of Maven which seems to cause the confusion in your case:
Apache Maven is a software project management tool... that can manage the project's build.
Apache Maven has nothing to do with your revisions being pushed to your source code management system (SVN in your case).
Typically, you'll push your changes to your SVN repository through an IDE (Eclipse, IntelliJ IDEA and the like) or through the command line. You won't push those changes through Maven, which would go against its purpose.
Now comes the question: why might you need those SCM-related properties then?
The answer is simple: since Maven is a project build tool, it must handle your project's release cycle, which is the final piece of the build cycle. It cannot do that coherently without updating your remote project information, since you are using a remote SCM repository.
Now back to those SCM-related properties and what they mean:
connection: a URL endpoint to your SCM repository that will only be used for read access.
developerConnection: a URL endpoint to your SCM repository that will be used for write access. (That's what a developer role is intended to do after all: push changes to the repository.)
tag: specifies the tag under which the project lives. I've only seen HEAD used there, and I assume it is the default.
url: specifies a browsable repository, such as one served through ViewVC. (In most cases you can replace the /svn/ path in your connection URL with /viewvc/.)
SCM (Software Configuration Management, also called Source Code/Control Management or, succinctly, version control) is an integral part of any healthy project. If your Maven project uses an SCM system (it does, doesn't it?) then here is where you would place that information into the POM.
connection, developerConnection: The two connection elements convey to how one is to connect to the version control system through Maven. Where connection requires read access for Maven to be able to find the source code (for example, an update), developerConnection requires a connection that will give write access. The Maven project has spawned another project named Maven SCM, which creates a common API for any SCMs that wish to implement it. The most popular are CVS and Subversion, however, there is a growing list of other supported SCMs. All SCM connections are made through a common URL structure.
scm:[provider]:[provider_specific]
Where provider is the type of SCM system. For example, connecting to a Subversion repository may look like this:
scm:svn:https://somerepository.com/svn_repo/trunk
tag: Specifies the tag that this project lives under. HEAD (meaning, the SCM root) should be the default.
url: A publicly browsable repository. For example, via ViewCVS.
Source
Analogy : https://www.youtube.com/watch?v=9In7ysQJGBs
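As a small illustration of the scm:[provider]:[provider_specific] structure, here is a sketch that splits such a string in plain shell. The function names are hypothetical, and it assumes the colon-delimited form shown above.

```shell
#!/bin/sh
# Split an SCM URL of the form scm:[provider]:[provider_specific].
# (hypothetical helpers for illustration; not part of Maven SCM)

# Print the provider, e.g. "svn".
scm_provider() {
    rest=${1#scm:}                 # strip the leading "scm:"
    printf '%s\n' "${rest%%:*}"    # keep the text up to the next colon
}

# Print the provider-specific part, e.g. the repository URL.
scm_location() {
    rest=${1#scm:}                 # strip the leading "scm:"
    printf '%s\n' "${rest#*:}"     # drop the provider and its colon
}
```

If the svn client is installed, something like svn info "$(scm_location 'scm:svn:https://somerepository.com/svn_repo/trunk')" should then report the revision for that URL, independently of Maven.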

Guidelines when splitting artifact repositories

I am looking for an article which describes a set of guidelines to follow when creating repositories in an artifact repository manager.
I know that:
You need to keep snapshots in snapshot repositories.
You need to keep releases in release repositories.
Third-party artifacts should be in a separate repository (the same goes for forked/patched versions of third-party libraries).
It's generally a good idea to prefix the names with int-* and ext-*.
Usually different product lines end up having their own repositories as sometimes their artifacts don't depend on each other.
I've been trying to find an article on this to illustrate to a client how this artifact separation abstraction is done by other companies and organizations using repositories.
Many thanks in advance!
I am not aware of the existence of such an article, but as @tieTYT mentioned, you can look at the Artifactory default repositories. They reflect years of experience in binaries management, continuous integration and delivery.
Those practices still apply even if you use Nexus (and you can observe them even without installing Artifactory, by looking at JFrog's public Artifactory instance at http://repo.jfrog.org).
For your convenience, here are the defaults:
Local Repositories:
libs-snapshot-local: Deploy here your local snapshots
libs-release-local: Deploy here your local releases
ext-snapshot-local: Deploy here 3rd-party snapshots which aren't available in remote repos
ext-release-local: Deploy here 3rd-party releases which aren't available in remote repos
plugins-snapshot-local: Deploy here your plugin (usually, maven) snapshots
plugins-release-local: Deploy here your plugin (usually, maven) releases
Remote Repositories:
jcenter: proxy of http://jcenter.bintray.com. Normally, that's the only remote repo you'll need. It includes whatever exists in maven central plus all other major maven repositories
Virtual Repositories:
remote-repos: aggregation of all the remote repositories
libs-release: this is the resolution repository for release builds. It includes remote-repos, libs-release-local and ext-release-local
libs-snapshot: this is the resolution repository for snapshot builds. It includes remote-repos, libs-snapshot-local and ext-snapshot-local
repo: this is a special virtual repository that aggregates everything. Generally, do not use it if you ever plan to build a release pipeline on top of a binary repository.
I'll be glad to advise on specific questions.
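In a deployment script, the snapshot/release split usually comes down to looking at the version string. A minimal sketch, assuming the Artifactory default repository names listed above and a hypothetical function name:

```shell
#!/bin/sh
# Pick the deployment repository from the version string.
# Repository names are the Artifactory defaults; adjust them to your setup.
# (deploy_repo_for is a hypothetical helper for illustration)
deploy_repo_for() {
    case $1 in
        *-SNAPSHOT) echo libs-snapshot-local ;;   # versions ending in -SNAPSHOT
        *)          echo libs-release-local ;;    # everything else is a release
    esac
}
```

For third-party artifacts the same idea would apply with ext-snapshot-local and ext-release-local.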
As is the case with many questions about best practices, the answer is: It depends.
Technically there are only two distinctions that are required:
Snapshot vs release repo
Hosted vs proxy repository
The snapshot vs release distinction is required because the Maven repository format, and therefore Maven and other build tools, differentiate how they work with the meta data and what they do during upload.
For proxy repositories you will just have to add as many as you need to proxy. This will depend on what components you require, and snapshot and release repos are proxied separately.
For hosted repositories you also have to have separate snapshot and release repos. Beyond that it is all up for grabs. Having a separate third-party repo, as preconfigured in Nexus (and Artifactory) and other setups, is certainly useful, but not really necessary. You can have all those distinctions sorted out by internal meta data where required.
Along the same lines you can have one release repo for everyone or one for each team or whatever. You can still apply access rights within those repositories to separate access and so on in Nexus with repository targets. I assume Artifactory and Archiva can do something similar. The question here mostly boils down to ease of administration, backups, security setup and access for users.
Naming conventions like you mentioned can help if you want to have separate repositories, but technically none of this is necessary.
Other things I have seen are e.g. migration repos that are used to migrate legacy project libraries into a repo but become frozen after the migration is done, separate repos per team, separate repos per project and so on. Another aspect is separate repos for different levels of approval (e.g. check out the problems with that at http://blog.sonatype.com/people/2013/10/golden-repository/).
In the end, however, all of this hinges on usability and meta data and is not required. Ultimately these repositories will in most cases be grouped together and accessed via one group, which flattens out the whole separation. Access rights still carry through into the group, so everything can still be controlled as you like. So it turns out to be a matter of taste how you want to slice and dice and manage it.
PS: I am referring to the Maven repositories and format. Once you add a whole bunch of other formats into the mix and wrappers around them exposing them in other formats, everything gets more complicated, but the ideas behind things stay similar.

List available artifacts from repo with gradle, ivy or other

I'm looking for a way to list all available artifacts programmatically for given repo url, group and artifact. The repo is maven-based.
I know about maven-metadata.xml but the repo that is in use doesn't provide classifier details which are crucial for me.
The solution may be based on ivy, gradle or other compatible tools. If anybody has an idea, please let me know :)
I hope to find a code sample that will allow me to browse repo in an easy and friendly way.
Use the search features of your Maven repository manager.
If you're using Nexus, it supports searches of its Lucene index. For example, the following URL returns all the artifacts matching the string "log4j":
https://repository.sonatype.org/service/local/lucene/search?q=log4j
The response is verbose but includes information like classifiers (which is what you're looking for).
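For scripting, the search URL can be assembled from the coordinates you already know. This is only a sketch: the function name is made up, and the g and a query parameters are my assumption about the Nexus 2 Lucene search resource, so check them against your server before relying on them.

```shell
#!/bin/sh
# Build a Nexus 2 Lucene search URL for a groupId/artifactId pair.
# Usage: nexus_search_url BASE_URL GROUP_ID ARTIFACT_ID
# (hypothetical helper; the g= and a= parameters are an assumption)
nexus_search_url() {
    base=${1%/}    # drop one trailing slash, if any
    printf '%s/service/local/lucene/search?g=%s&a=%s\n' "$base" "$2" "$3"
}

# Example (needs network access; the response is XML by default):
# curl -s "$(nexus_search_url https://repository.sonatype.org log4j log4j)"
```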
maven-metadata.xml only has module information, and the classifier belongs to the artifact (not the module). Gradle is probably not a good fit here. I'd consider a low-level approach with some GET requests and HTML parsing. In case the repository is backed by a repository manager such as Artifactory or Nexus, their REST API might also be an option.
Thanks guys for all the hints. Yesterday I managed to solve the problem using the Artifactory REST search API and parsing the incoming JSON responses. Thanks once again.

mvn opposite of dependency:get? How to set up authorization?

I've been tasked with writing scripts to interact with Nexus/Maven. The files I'm working with in Maven are XML files placed there with the specific idea that they would be used by shell scripts. Essentially, the files are configurations for another application.
I've already completed the scripts to pull the files from the repositories, but I'm having problems with putting files into the repositories. To pull the files, I'm using the plugin dependency:get.
What I need is more or less the opposite of that plugin. One that will update the repository with new versions of a file. I think that "mvn deploy:deploy-file" is what I need to use. Will that work?
If so, then the next problem I have is that I can't seem to figure out how to set up the authorization. I have a settings file with a server defined that has the correct authorization information in it, but the link between the server and the repository (or URL?) is missing and the authorization isn't being performed correctly.
How do I connect the repository URL to the server info in the settings.xml file so that mvn will be authorized to perform the correct actions? (I don't know where the .pom file is for Maven, and may not have permissions to alter it.)
Thanks,
Sean.
deploy:deploy-file is correct. Use it with -Durl=http://repo:port/path, -DrepositoryId=server-whatever. Your settings.xml needs to contain
<servers>
<server>
<id>server-whatever</id>
<username>demo</username>
<password>demo</password>
</server>
</servers>
where the server ID server-whatever matches the repositoryId parameter.
Having said that, I'd question the appropriateness of Maven for this. It's designed for binary artifacts rather than configuration.
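Putting the pieces together, a full command line might be assembled like this. It is a sketch that only prints the command (a dry run) so it can be reviewed first; the function name and the example coordinates are hypothetical, and it assumes none of the values contain spaces.

```shell
#!/bin/sh
# Print the mvn deploy:deploy-file command line for one file (dry run).
# Usage: deploy_file_cmd FILE GROUP_ID ARTIFACT_ID VERSION REPO_URL REPO_ID
# (hypothetical helper; REPO_ID must match a <server><id> in settings.xml)
deploy_file_cmd() {
    packaging=${1##*.}    # use the file extension as packaging, e.g. xml
    printf 'mvn deploy:deploy-file -Dfile=%s -DgroupId=%s -DartifactId=%s -Dversion=%s -Dpackaging=%s -Durl=%s -DrepositoryId=%s\n' \
        "$1" "$2" "$3" "$4" "$packaging" "$5" "$6"
}

# Example:
# deploy_file_cmd config.xml com.mycompany app-config 1.0.0 http://repo:port/path server-whatever
```

Once the printed line looks right, it can be pasted into the shell. If the settings file is not in the default location, mvn's -s option can point to it.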
The problem turned out to be in the -Durl option.
When using the dependency:get plugin, the URL was something like:
-Durl=http://companymavenrepo
And that worked fine for the dependency plugin.
However, that's not sufficient when trying to put things into the repository using the deploy plugin. The URL has to contain the maven server and the exact repository of where to place the artifact. (My terminology might be off.) I went to our Nexus/Sonatype webpage, looked at the exact repository where the artifact was stored, then used something like this:
-Durl=http://companymavenrepo/nexus/content/repositories/this_maven_repo
That solved the authorization problem, and I was able to add the file into the repository without issue.
(This might have been easier for others to see had I posted both mvn command lines I was trying to use. On the other hand, it also seems reasonable to expect that when a specific -Durl value works in one command line, it will work unchanged in another.)
Sean.

Get available Dependencies using pom.xml in Command line, like eclipse dependency search in m2eclipse plugin

The Maven Eclipse plugin can search available dependencies from the default repositories and any additional repositories configured, given that I know a partial group ID or partial artifact ID. This is really useful for finding the available dependencies. Is there a similar mechanism available using Maven on the command line?
Example: suppose I know only "mybatis", and I intend to find the proper group ID, artifact ID and version, and whether type jar is available or not. I can easily do this using the Eclipse dependency search. But without Eclipse, do I really need to use the browser and go to repo2.maven.org (where I now find that directory browsing has been disabled)?
First, you can search the sonatype repository, which covers a lot of ground. (I'm not sure how many other repos are mirrored through this. I guess that's a separate question.)
Second, nexus itself has an API that you can use to script queries against the repository. For example, you can use Ruby or Groovy and do something like (assuming groovy is installed; I'm on linux):
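$ cat foo.groovy
#!/usr/bin/env groovy
// Fewer than three arguments: keyword search on the first one.
// Three arguments: exact groupId, artifactId and version (GAV) search.
def xml = args.length < 3 ?
    "http://repository.sonatype.org/service/local/data_index?q=" + args[0] :
    "http://repository.sonatype.org/service/local/data_index?g=${args[0]}&a=${args[1]}&v=${args[2]}"
println "Searching: " + xml
// Fetch the XML response and parse it
def root = new XmlParser().parseText( xml.toURL().text )
// Print each hit as groupId:artifactId:version
root.data.artifact.each {
    println "${it.groupId.text()}:${it.artifactId.text()}:${it.version.text()}"
}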
Then,
$ ./foo.groovy org.mybatis mybatis 3.0.4
Searching: http://repository.sonatype.org/service/local/data_index?g=org.mybatis&a=mybatis&v=3.0.4
org.mybatis:mybatis:3.0.4
org.mybatis:mybatis:3.0.4
org.mybatis:mybatis:3.0.4
Or, closer to your question (output truncated),
$ ./foo.groovy mybatis
Searching: http://repository.sonatype.org/service/local/data_index?q=mybatis
org.mybatis:mybatis:3.0.1
org.mybatis:mybatis:3.0.1
...
org.mybatis.caches:mybatis-caches-parent:1.0.0-RC1
org.mybatis.caches:mybatis-ehcache:1.0.0-RC1
org.mybatis.caches:mybatis-ehcache:1.0.0-RC1
...
org.apache.camel:camel-mybatis:2.7.0
org.apache.servicemix.bundles:org.apache.servicemix.bundles.mybatis:3.0.2_1
Note that this assumes you're querying an existing nexus maven repo, and in addition this is just searching that single repo. (So it's not exactly what you asked.)
But, actually, this is the way I want it to be: my only repository used by my maven projects is a single, internal (intranet) nexus server, and it functions as a mirror (and cache) of all the 3rd party repositories that I currently need. If I decide I need to pull in other jars from another repo (e.g., googlecode or company XYZ...), then I add that repo's url to my internal nexus configuration. Everyone on my team -- netbeans/eclipse/mvn users -- always point to the single internal maven repo, & everyone automatically picks up the newly available artifacts.
Then you can still use the above script to search for an artifact. (Note: it lets you do a generic search, or a GAV (group/artifact/version) search.)
If you're not sure which repository a given artifact is in, I guess there's always http://mvnrepository.com/

Resources