Maven SCM and SVN

I am having trouble accessing the svn revision number through Maven. The only real help I've received from the SCM usage page is the following:
<scm>
  <connection>scm:svn:http://somerepository.com/svn_repo/trunk</connection>
  <developerConnection>scm:svn:https://somerepository.com/svn_repo/trunk</developerConnection>
  <tag>HEAD</tag>
  <url>http://somerepository.com/view.cvs</url>
</scm>
This means nothing to me as I can't figure out what connection, developerConnection, and url mean. I simply plugged in the url to my repo for all 3 elements. I also don't know why Maven does not ask me for the username and password for the repository.
I am very new to Maven and might be asking a very basic question but would appreciate a full explanation as to how I am to access the svn repo.

First, I would begin by clarifying the usage of Maven, which seems to be the source of the confusion in your case:
Apache Maven is a software project management tool... that can manage the project's build.
Apache Maven has nothing to do with your revisions being pushed to your source code management system (SVN in your case).
Typically, you'll push your changes to your SVN repository through an IDE (Eclipse, IntelliJ IDEA, and the like) or through the command line; you won't push those changes through Maven in any way, as that would misuse the tool.
So why, then, might you need those SCM-related properties?
The answer is simple: since Maven is a project build tool, it must handle your project's release cycle, which is the final piece of the build cycle, and it can't do that coherently without updating your remote project information, given that you are using a remote SCM repository.
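In practice it is the release machinery that consumes those properties. As a sketch (assuming the standard maven-release-plugin), these are the commands that read the scm section in order to tag your repository and check out the tagged code:
mvn release:prepare release:perform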
Now back to those SCM-related properties and what they mean:
connection: a URL connection endpoint to your SCM repository, which will be used for read access only.
developerConnection: a URL connection endpoint to your SCM repository, which will be used for write access. (That's what a developer role is intended to do, after all: push changes to the repository.)
tag: the tag under which the project lives. I've only ever seen HEAD used here, and I assume it is the default.
url: a browsable repository, such as one served through ViewVC. (In most cases you can replace the /svn/ path in your connection URL with /viewvc/.)
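As for actually getting the SVN revision number into your build (the original question), one common approach is the buildnumber-maven-plugin, which reads the connection URL from the scm section and exposes the revision as ${buildNumber}. A minimal sketch (the version number is illustrative; use the latest):
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>buildnumber-maven-plugin</artifactId>
      <!-- illustrative version; check for the latest release -->
      <version>1.4</version>
      <executions>
        <execution>
          <phase>validate</phase>
          <goals>
            <goal>create</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
After the validate phase runs, ${buildNumber} holds the revision and can be referenced elsewhere in the POM, e.g. in the manifest. And on credentials: Maven doesn't prompt for them; SCM credentials are normally taken from your cached SVN authentication or from a <server> entry in settings.xml.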

SCM (Software Configuration Management, also called Source Code/Control Management or, succinctly, version control) is an integral part of any healthy project. If your Maven project uses an SCM system (it does, doesn't it?) then here is where you would place that information into the POM.
connection, developerConnection: The two connection elements convey how one is to connect to the version control system through Maven. Where connection requires read access so that Maven can find the source code (for example, for an update), developerConnection requires a connection that gives write access. The Maven project has spawned another project named Maven SCM, which creates a common API for any SCMs that wish to implement it. The most popular are CVS and Subversion; however, there is a growing list of other supported SCMs. All SCM connections are made through a common URL structure:
scm:[provider]:[provider_specific]
Where provider is the type of SCM system. For example, connecting to a Subversion repository may look like this:
scm:svn:https://somerepository.com/svn_repo/trunk
tag: Specifies the tag that this project lives under. HEAD (meaning, the SCM root) should be the default.
url: A publicly browsable repository. For example, via ViewCVS.
Source
Analogy: https://www.youtube.com/watch?v=9In7ysQJGBs

Related

Common Local repository for Maven

We want to maintain a common Maven repository for all the systems within our local network, i.e., there should not be a .m2 directory on every system, just one on a common server (say, with some local IP 172.<>).
Can this be achieved via a file transfer protocol or some other service?
Operating system: Windows
While this is actually possible (you can give Maven a settings.xml on the command line, so you can always point to the one in the network), I would strongly recommend against this:
The Maven local repository is not thread-safe. When two people build against it at the same time, anything might break, especially SNAPSHOT versions. I speak from experience: we tried to have only one local repository on our build server, and we got wrong results in different builds.
If you want a repository for your team, you need Nexus or Artifactory.
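For completeness, the command-line mechanism mentioned above looks like this sketch (the share paths are hypothetical):
<!-- shared settings.xml kept on the network share -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <localRepository>//shared-server/maven/m2repo</localRepository>
</settings>
mvn -s //shared-server/maven/settings.xml clean install
But again: two concurrent builds writing to that shared directory can corrupt each other, so prefer a repository manager.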

Start to use artifactory

At the company where I work, we are starting to use Artifactory as our repository management tool, and I am reading its user guide. During configuration we created a virtual repository, a few local repositories, and some remote repositories. In the user guide I found the following:
Prevent disclosing sensitive business information derived from your artifact queries to whomever can intercept the queries, including the owners of the remote repository itself.
I saw that this can be avoided through the exclude pattern functionality on the virtual repository. Can you give us some suggestions about this? What kinds of requests should we avoid sending?
You should avoid requests for internal artifacts being sent to remote repositories (directly or via virtual repositories). This can happen when projects depend on internal libraries, or within multi-module projects where modules depend on each other. When working with virtual repositories, Artifactory will always search for such artifacts in local repositories first. However, if someone asks for a wrong version or has a typo in the artifact name, the artifact will not be found in a local repository, and Artifactory will try to look for it in the remote repositories configured in that virtual repository.
To avoid exposing sensitive business information as described above, we strongly recommend the following best practices:
The list of remote repositories used in an organization should be managed under a single virtual repository to which all requests are directed
All internal artifacts should be specified in the Excludes Pattern field of the virtual repository (or alternatively, of each remote repository) using wildcard characters to encapsulate the widest possible specification of internal artifacts.
Assuming all of your projects/modules are using some kind of namespace, for example com.mycompany, you can configure an exclusion pattern for artifacts under this namespace: com/mycompany/**.
For more information take a look at avoiding security risks with an excludes pattern

Guidelines when splitting artifact repositories

I am looking for an article which describes a set of guidelines to follow when creating repositories in an artifact repository manager.
I know that:
You need to keep snapshots in snapshot repositories.
You need to keep releases in release repositories.
Third-party artifacts should be in a separate repository (the same goes for forked/patched versions of third-party libraries).
It's generally a good idea to prefix the names with int-* and ext-*.
Usually different product lines end up having their own repositories as sometimes their artifacts don't depend on each other.
I've been trying to find an article on this to illustrate to a client how this artifact separation abstraction is done by other companies and organizations using repositories.
Many thanks in advance!
I am not aware of the existence of such an article, but as @tieTYT mentioned, you can look at Artifactory's default repositories. They reflect years of experience in binary management, continuous integration, and delivery.
Those practices still apply even if you use Nexus (and you can observe them without installing Artifactory by looking at JFrog's public Artifactory instance, http://repo.jfrog.org).
For your convenience, here are the defaults (important usage emphasised):
Local Repositories:
libs-snapshot-local: deploy your own snapshots here
libs-release-local: deploy your own releases here
ext-snapshot-local: deploy 3rd-party snapshots which aren't available in remote repos here
ext-release-local: deploy 3rd-party releases which aren't available in remote repos here
plugins-snapshot-local: deploy your plugin (usually Maven) snapshots here
plugins-release-local: deploy your plugin (usually Maven) releases here
Remote Repositories:
jcenter: proxy of http://jcenter.bintray.com. Normally, that's the only remote repo you'll need. It includes whatever exists in maven central plus all other major maven repositories
Virtual Repositories:
remote-repos: aggregation of all the remote repositories
libs-release: this is the resolution repository for release builds. It includes remote-repos, libs-release-local and ext-release-local
libs-snapshot: this is the resolution repository for snapshot builds. It includes remote-repos, libs-snapshot-local and ext-snapshot-local
repo: this is a special virtual repository that aggregates everything. Generally, do not use it if you ever plan on building a release pipeline on top of a binary repository.
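To make those defaults concrete, here is a sketch of how a project would typically deploy into them (the host is hypothetical; 8081 is Artifactory's default port):
<distributionManagement>
  <repository>
    <id>artifactory-releases</id>
    <url>http://yourserver:8081/artifactory/libs-release-local</url>
  </repository>
  <snapshotRepository>
    <id>artifactory-snapshots</id>
    <url>http://yourserver:8081/artifactory/libs-snapshot-local</url>
  </snapshotRepository>
</distributionManagement>
Resolution, in turn, is pointed at the libs-release and libs-snapshot virtual repositories, typically via a mirror or profile in settings.xml.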
I'll be glad to advise on specific questions.
As is the case with many questions about best practices, the answer is: It depends.
Technically there are only two distinctions that are required:
Snapshot vs release repo
Hosted vs proxy repository
Snapshot vs release repositories are a required distinction, since the Maven repository format, and therefore Maven and other build tools, differ in how they work with the metadata and what they do during upload.
For proxy repositories, you will just have to add as many as you need to proxy. What these are will depend on which components you require, and they will be separate for proxying snapshot and release repos.
For hosted repositories, you also have to have separate snapshot and release repos. Beyond that, it is all up for grabs. Having a separate third-party repo, as preconfigured in Nexus (and Artifactory) and other setups, is certainly useful, but not really necessary. You can have all those distinctions sorted out by internal metadata where required.
Along the same lines you can have one release repo for everyone or one for each team or whatever. You can still apply access rights within those repositories to separate access and so on in Nexus with repository targets. I assume Artifactory and Archiva can do something similar. The question here mostly boils down to ease of administration, backups, security setup and access for users.
Naming conventions like you mentioned can help if you want to have separate repositories, but technically none of this is necessary.
Other things I have seen include migration repos, which are used to migrate legacy project libraries into a repo but become frozen after the migration is done; separate repos per team; separate repos per project; and so on. Another aspect is separate repos for different levels of approval (for problems with that approach, see http://blog.sonatype.com/people/2013/10/golden-repository/).
In the end, however, this all hinges on usability and metadata and is not required. Ultimately these repositories will in most cases be grouped together and accessed via one group, which flattens out the whole separation. And access rights still carry through into the group, so everything can still be controlled as you like. So it turns out to be a matter of taste in how you want to slice, dice, and manage it.
PS: I am referring to the Maven repositories and format. Once you add a whole bunch of other formats into the mix and wrappers around them exposing them in other formats, everything gets more complicated, but the ideas behind things stay similar.

When using maven-release-plugin, why not just detect scm info from local repo?

As far as I am aware, in order to use the maven-release-plugin, you have to drop an scm section into your POM file. E.g.:
<scm>
  <connection>scm:hg:ssh://hg@bitbucket.org/my_account/my_project</connection>
  <developerConnection>scm:hg:ssh://hg@bitbucket.org/my_account/my_project</developerConnection>
  <url>ssh://hg@bitbucket.org/my_account/my_project</url>
  <tag>HEAD</tag>
</scm>
I understand this data is used to determine what to tag and where to push changes. But isn't this information already available if you have the code cloned/checked out? I'm struggling a bit with the concept that I need to tell Maven what code it needs to tag when it could, at least in theory, just ask Git/Hg/SVN/CVS what code it's dealing with. I suspect I'm missing something in the details, but I'm not sure what. Could the maven-release-plugin code be changed to remove this as a requirement, or at least make auto-detection the default? If not, could someone provide some context on why that wouldn't work?
For one thing, GIT and Subversion can have different SCM URIs for read-write and read-only access.
This is what the different <connection> and <developerConnection> URIs are supposed to capture. The first is a URI that is guaranteed read access. The second is a URI that is guaranteed write access.
Very often from a checked out repository, it is not possible to infer the canonical URIs.
For example, I might check out the Subversion repository in-house via the svn: protocol and the IP address of the server, but external contributors would need to use https:// with the hostname.
Or even with GIT repositories, on Github you have different URIs for different access mechanisms, e.g.
https://github.com/stephenc/eaio-uuid.git (read-write using Username / Password or OAuth)
git@github.com:stephenc/eaio-uuid.git (read-write using SSH private key identification)
git://github.com/stephenc/eaio-uuid.git (anonymous read only)
Never mind that you may have checked out git://github.com/zznate/eaio-uuid.git or cloned a local checkout; in other words, your local git repository may think that "upstream" is ../eaio-uuid-from-nate and not git@github.com:stephenc/eaio-uuid.git.
I agree that for some SCM tools you could auto-detect... for example, if you know the source is checked out from, say, AccuRev, you should be OK assuming its details... until you hit the Subversion or Git or CVS (etc.) code module checked out into the AccuRev workspace (true story) so that the tag being pulled in could be updated.
So in short, the detection code would have to be damn sure that you were not using two SCM systems at the same time in order to be sure which is the master SCM... and the other SCM may not even be leaving marker files on disk to sniff out (AccuRev, for example, doesn't; hence why I've picked on it).
The only safe way is to require the POM to define at least the SCM system, and for those SCM systems where the URI cannot be reliably inferred (think CVS, Subversion, Git, Hg; in fact, most of them), require the URI to be specified.
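To illustrate with the GitHub URIs above, the POM for that project might declare (a sketch built from the URIs listed earlier; the scm:git: prefix selects the Git provider):
<scm>
  <connection>scm:git:git://github.com/stephenc/eaio-uuid.git</connection>
  <developerConnection>scm:git:git@github.com:stephenc/eaio-uuid.git</developerConnection>
  <url>https://github.com/stephenc/eaio-uuid</url>
  <tag>HEAD</tag>
</scm>
Anonymous read-only access for connection, SSH read-write access for developerConnection: precisely the distinction that cannot be reliably inferred from a checkout.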

Maven verify signatures of downloaded pom/jar files

I was trying to find out if there is an SSL-enabled central repository, but there probably isn't. I noticed that there are signatures for every jar and pom file in the Maven central repository. So at least I'd like to check the signatures of all Maven-downloaded files (pom/jar).
The example from http://repo1.maven.org/maven2/org/apache/ant/ant/1.8.2/:
ant-1.8.2.jar
ant-1.8.2.jar.asc
ant-1.8.2.jar.asc.md5
ant-1.8.2.jar.asc.sha1
ant-1.8.2.jar.md5
ant-1.8.2.jar.sha1
ant-1.8.2.pom
ant-1.8.2.pom.asc
ant-1.8.2.pom.asc.md5
ant-1.8.2.pom.asc.sha1
ant-1.8.2.pom.md5
ant-1.8.2.pom.sha1
I realize that I'll have to import public keys for every repository and I'm fine with that. I guess that public keys for maven central are here https://svn.apache.org/repos/asf/maven/project/KEYS.
There are PLENTY of tutorials on the web on how to sign with Maven. However, I didn't find any information on how to force Maven (2 or 3) to verify the signatures of downloaded jar/pom files. Is it possible?
(Nexus Professional is not an option)
Thank you for help.
Now that people seem to realize this is a real security problem (as described in this blog post; the blog seems down, but here is an archived version), there is a plugin for verifying PGP signatures. You can verify the signatures of all dependencies of your project with the following command:
mvn org.simplify4u.plugins:pgpverify-maven-plugin:check
Of course, to be 100% sure the plugin is not malicious by itself, you would have to download and verify the source for the plugin from maven central, build it with maven, and execute it. (And this should also be done with all the dependencies and plugins that are needed for the build, recursively.)
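If you want the verification to run on every build rather than on demand, you could also bind the plugin in your POM; a minimal sketch (the version shown is illustrative; check for the latest release):
<plugin>
  <groupId>org.simplify4u.plugins</groupId>
  <artifactId>pgpverify-maven-plugin</artifactId>
  <!-- illustrative version; use the latest release -->
  <version>1.18.2</version>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>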
Or you use Maven 3.2.3 or above (with a clean repository), which uses TLS for downloading all artefacts. Thus man-in-the-middle attacks are impossible, and you at least get the artefacts as they are on Maven central.
See also:
related Question and Answer
Sonatype's Blog to this topic
Could you write a bash shell script using GnuPG to verify each sig?
Something like:
for x in *.jar; do gpg --verify "${x}.asc" "${x}"; done
Obviously you would need the public keys for all the sigs before you started.
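For example, importing the Apache Maven KEYS file mentioned in the question before running the loop:
curl -s https://svn.apache.org/repos/asf/maven/project/KEYS | gpg --import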
SSL access to Central is now available for a token payment. From https://blog.sonatype.com/people/2012/10/now-available-ssl-connectivity-to-central/ :
We’re making SSL connectivity to Central available to anyone that downloads open source components regardless of the repository manager.
...
In order to ensure the highest level of performance for those who count on SSL, we are securing the service with a token. You can get a token for your organization simply by providing a $10 donation that will be donated to open source causes.
Assuming you only want to download artifacts with valid checksums, one option would be to run the OSS version of Nexus and configure it to have a proxy of Central. Then configure your settings.xml to resolve only from your repo (the mirror tag in settings.xml). You can then configure Nexus to only allow artifacts that have a valid checksum.
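A minimal sketch of that settings.xml (the Nexus host is hypothetical; the path is the default public group in Nexus 2.x):
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <mirrors>
    <mirror>
      <id>nexus</id>
      <mirrorOf>*</mirrorOf>
      <url>http://yourserver:8081/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
Setting the proxy repository's checksum policy to Strict then makes Nexus refuse artifacts whose checksums don't match.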
