Not found :org.apache.hadoop.security.authentication.util.KerberosUtil - hadoop

I am running a Storm jar on a cluster where I have configured Hadoop, Kafka, and Storm.
When I run the jar in local mode it works fine, but when I run it on the Storm cluster I see the following error in the Storm UI:
java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosTicket(Ljavax/security/auth/Subject;)Z
at org.apache.hadoop.security.UserGroupInformation.<init>(UserGroupInformation.java:666)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:861)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
pom.xml
Click here to view POM file
After some searching I found that we have to add the hadoop-auth jar. Even after adding it, I am getting the same error.

I think you're packaging an old Hadoop jar.
Take a look at the storm-hdfs POM https://github.com/apache/storm/blob/v1.0.6/external/storm-hdfs/pom.xml. When you use the Shade plugin, the jar you end up with will contain all your dependencies, including transitive ones brought in through direct dependencies. Storm-hdfs declares a dependency on a list of Hadoop jars. You need to make sure that you're declaring the same list of Hadoop jars in your POM if you want to use a different version of Hadoop from the default.
Specifically, what's happening is that you haven't declared hadoop-auth in your POM, so your jar gets packaged with the default version of hadoop-auth (2.6.1). Since that version is incompatible with the other Hadoop jars (which are 2.9.1), you get a NoSuchMethodError at runtime.
You should either exclude all Hadoop jars from your import of storm-hdfs and then put the jars you want to use in Storm's lib directory, or add the right versions of the Hadoop jars to your dependency list in your POM.
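As a rough sketch of the first option, the storm-hdfs dependency could exclude its Hadoop jars like this. The `storm.version` property and the exact list of excluded artifacts are assumptions made for illustration; check the storm-hdfs POM linked above for the Hadoop artifacts it actually declares.

```xml
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-hdfs</artifactId>
  <!-- storm.version is a hypothetical property; define it in <properties> -->
  <version>${storm.version}</version>
  <exclusions>
    <!-- Assumed list: keeps these Hadoop jars out of the shaded jar -->
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-auth</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With the exclusions in place, the Hadoop jars have to come from somewhere else at runtime, for example Storm's lib directory as described above.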
Edit:
I think I found your issue. You haven't set the scope of storm-core to provided. Since storm-core depends on hadoop-auth, and you haven't declared it explicitly, Maven will try to guess which version of hadoop-auth you need based on where the dependency appears in the tree. Since hadoop-auth appears as 2.9.1 through some of your Hadoop dependencies, but 2.6.1 through storm-core, you happen to get 2.6.1 put in your jar.
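A minimal sketch of a storm-core declaration with the provided scope might look like the following. The `storm.version` property is a hypothetical name, not something taken from the original POM.

```xml
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-core</artifactId>
  <!-- storm.version is a hypothetical property; set it to your cluster's Storm version -->
  <version>${storm.version}</version>
  <!-- The Storm cluster supplies storm-core at runtime, so it should not be packaged -->
  <scope>provided</scope>
</dependency>
```

The provided scope keeps storm-core itself out of the shaded jar. It does not by itself guarantee which hadoop-auth version Maven selects, so pinning the Hadoop versions explicitly is still worthwhile.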
If you want to avoid this kind of thing in the future, you should use Maven's dependencyManagement block https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management.
i.e. you should add something like the following to your pom, and then remove the exclusions of hadoop jars.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
</dependencies>
</dependencyManagement>
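The block above relies on a `hadoop.version` property. If the POM does not already define one, it could be declared roughly as follows; the value 2.9.1 is taken from the Hadoop version mentioned earlier in this answer and is an assumption about what the cluster runs.

```xml
<properties>
  <!-- Assumed value: should match the Hadoop version running on the cluster -->
  <hadoop.version>2.9.1</hadoop.version>
</properties>
```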

Related

How does maven resolve the dependencies of the main dependencies on which our application is built?

I am trying to understand maven a little more. How is maven able to download the dependencies of the main dependency of the application? For example assuming my application has main dependency like this:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.7.0</version>
<scope>provided</scope>
</dependency>
Now, when Maven downloads this jar, it downloads the dependencies for this jar as well. For example, see the screenshot below:
As can be seen, Maven has downloaded not only hadoop-hdfs-2.7.0.jar but also all its dependencies.
Now, my question is: how does Maven know what the dependencies of a "top-level" dependency are? In this case the top-level dependency is hadoop-hdfs, so which jars does it have to download for it?
I see this as well in the .m2/repository for hadoop-hdfs:
I opened the .pom file, the contents are (partly):
<project>
....
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.7.0</version>
<description>Apache Hadoop HDFS</description>
<name>Apache Hadoop HDFS</name>
<packaging>jar</packaging>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-annotations</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<scope>provided</scope>
</dependency>
</dependencies>
...
</project>
What is this hadoop-hdfs-2.7.0.pom? Does this file tell Maven which dependencies to download for hadoop-hdfs-2.7.0.jar?
Can anyone help me clear these things up?
First of all, you are right: hadoop-hdfs-2.7.0.pom tells Maven about the libraries that hadoop-hdfs depends upon. But when you use hadoop-hdfs as a dependency in your project, Maven applies the following rules, in order of precedence, to finalize the list of dependencies in addition to reading hadoop-hdfs-2.7.0.pom.
- A dependency specified with groupId, artifactId and version under the dependencies tag of the current project takes first precedence. This is how hadoop-hdfs got added to your project.
- Dependency management takes the next precedence. When a dependency is specified with only groupId and artifactId under the dependencies tag, and its version is defined both under the dependencyManagement tag and transitively inside Hadoop's pom.xml, the version under the dependencyManagement tag is preferred.
- Dependency mediation takes the last precedence. In your case, the dependencies listed inside hadoop-hdfs-2.7.0.pom are transitive dependencies of your project (you depend on them indirectly because your dependency hadoop-hdfs requires them), and this process continues recursively until all child dependencies are resolved.
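The precedence rules above can be sketched with a small POM. The project coordinates (`com.example:demo-app`) are hypothetical, and commons-io is used only as an illustration of a transitive dependency; whether it actually appears in the hadoop-hdfs dependency tree, and at which version, is an assumption.

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <!-- Hypothetical coordinates for illustration -->
  <groupId>com.example</groupId>
  <artifactId>demo-app</artifactId>
  <version>1.0.0</version>

  <dependencyManagement>
    <dependencies>
      <!-- Second precedence: if commons-io shows up transitively,
           this managed version is used instead of the one the
           transitive POM asks for -->
      <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.6</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <!-- First precedence: declared directly with an explicit version -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.7.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
```

Everything not covered by a direct declaration or a dependencyManagement entry falls through to dependency mediation.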
Note: There are other features, such as excluding dependencies, marking a dependency optional, and importing a list of dependencies, but they are used less often. More information with examples can be found at https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management

Adding a dependency existing internally as a dependency

My project is a fairly large project consisting of many Maven modules (but not microservices). I was trying to follow "Moving from spring to spring-bom on WAS", but there seem to be a lot of version clashes. For example, one of my modules is using commons-collections version 2.6.0 and my current project is using 3.2.2. I want the same jar to be used across all modules. Since it is more of a migration project, I cannot make container or repository changes at this time. I should only make sure that all the versions are compatible with each other. My plan:
I want to include a dependency that exists within some other dependency in the current pom as a direct dependency.
I also want the other jars in this pom (which exist as dependencies) to include that dependency.
Is there any way to do it?
I didn't completely understand your question, but the <dependencyManagement> tag can help you define a cross-module dependency version, as long as you place it in the parent pom file.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.group</groupId>
<artifactId>project-1</artifactId>
<version>1.0.0</version>
</dependency>
</dependencies>
</dependencyManagement>
and then define the dependency in the relevant module without giving it a version (it will be inherited from the parent pom's <dependencyManagement> tag):
<dependencies>
<dependency>
<groupId>com.group</groupId>
<artifactId>project-1</artifactId>
</dependency>
</dependencies>
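Applied to the commons-collections clash from the question, a parent pom entry might look like the sketch below. The 3.2.2 version comes from the question; that this is the version every module should converge on is an assumption.

```xml
<dependencyManagement>
  <dependencies>
    <!-- Assumed target version: every module inheriting from this parent
         resolves commons-collections to 3.2.2, including transitive uses -->
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

A managed version also applies when the artifact is only pulled in transitively, which is what makes this approach useful for aligning versions across modules.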

Do I need to explicitly specify in Maven the dependencies that Spring/Hibernate depend on?

I'm new to Maven and am trying to use it with Spring and Hibernate in my project. After going through the Spring and Hibernate references, I found that "there is no need to explicitly specify the dependent libraries in the pom.xml file for such Apache Commons libraries".
My question is: if other parts of my project refer to an Apache Commons library, such as commons-io, should I explicitly specify this dependency in the pom.xml file?
You should define in Maven those dependencies that your project itself uses. For example, even if some library already depends on commons-io, if your own code needs it you should declare commons-io directly in your pom.xml.
You should not worry about the dependencies of the libraries you have defined in your pom.xml. Maven will do that for you.
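As a small sketch, declaring commons-io directly could look like this. The version shown is an assumption; use whichever version your code is written against.

```xml
<dependencies>
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <!-- Assumed version for illustration -->
    <version>2.6</version>
  </dependency>
</dependencies>
```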
Maven is used to avoid the issue of having to run down JAR files that are dependent on other JAR files. Of course you do not have to use Maven to do this, but you should. Maven will automatically download the dependent JAR files of the JAR file you require. The hibernate-entitymanager JAR file, for example, has over 100 dependencies, and Maven does the work for you.
Anyway, even if you do add the commons-io file to the build path/classpath of the Maven project and then update the project configuration, Maven will kick it out.
You can provide a lib name on a site like mvnrepository.com to see what it depends on (e.g. take a look at a section called "This artifact depends on ..." in case of spring-webmvc library). Those dependencies (which your artifact depends on) are called transitive dependencies. You don't have to specify these in your pom.xml as maven will resolve them for you.
For the sake of readability you should only state those dependencies in your module that you rely on directly. You want JUnit to test your software, only declare JUnit; you need hibernate to use ORM, declare hibernate, and so on. Leave the rest to Maven.
And most of the time you should state what you intend to use in the very module you want to use it in. So if you want to use a dependency in more than one module, consider moving it into a dependencyManagement block in a parent pom and referencing it from there in the module you want it in.
parent pom.xml
<dependencyManagement>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
</dependencies>
</dependencyManagement>
child pom.xml
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</dependency>
</dependencies>
This guarantees you version stability and still allows you to find out what a module uses by looking only in its pom (and not all over the place).

How to add all dependencies to my project pom file?

I have added around 100 jars to my local Apache Archiva. Now I want to add all of these jars as dependencies in my project's pom.xml file.
Is it possible to add all these dependencies with a single copy-paste? Right now I have to copy each dependency individually from Apache Archiva and paste it into my project's pom.xml file. I have to copy and paste these lines for each jar, which is very tedious.
<dependency>
<groupId>org.csdc</groupId>
<artifactId>dom4j</artifactId>
<version>1.6.1</version>
</dependency>
It's very unlikely that you need all 100 jars as direct dependencies. In maven, you have to list your direct dependencies - one by one, yes. However, you don't need to list your transitive dependencies because maven will manage that for you. This is one of the most fundamental improvements over older manual classpath management java building.
No, you do not need to list every dependency of every jar. Some of those jars are themselves dependencies of others, so declaring one dependency in pom.xml also fetches the jars it depends on. For example:
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-annotations</artifactId>
<version>3.4.0.GA</version>
</dependency>
The dependency above fetches all the jars related to hibernate-annotations:
- hibernate-annotations
- hibernate-commons-annotations
- hibernate-core

Can security JARS be loaded with Maven?

We have some unit tests that will fail unless you have two jars, local_policy.jar, and US_export_policy.jar in your $JAVA_HOME/jre/lib/security folder. I'm supposed to see if we can just put them in a project folder, then tell Maven to use them when it does a build("mvn install"). Maybe with something like the dependency tag? Yes, I know everyone should just install these in their $JAVA_HOME, but this is the task I've been asked to look into.
You are talking about Maven dependency scope. Documentation here. You can tell Maven to use some libraries only for testing by giving them the "test" scope.
You can add them as Maven systemPath dependencies.
systemPath
is used only if the dependency scope is system. Otherwise, the build will fail if this element is set. The path must be absolute, so it is recommended to use a property to specify the machine-specific path (more on properties below), such as ${java.home}/lib. Since it is assumed that system scope dependencies are installed a priori, Maven will not check the repositories for the project, but instead checks to ensure that the file exists. If not, Maven will fail the build and suggest that you download and install it manually.
<dependencies>
<dependency>
<!-- The groupId can be anything. Use your own groupId for example -->
<groupId>anything</groupId>
<artifactId>local_policy</artifactId>
<!-- The version can be anything. Use the version of Java for example -->
<version>7.0</version>
<systemPath>${java.home}/lib/security/local_policy.jar</systemPath>
<scope>system</scope>
</dependency>
<dependency>
<!-- The groupId can be anything. Use your own groupId for example -->
<groupId>anything</groupId>
<artifactId>US_export_policy</artifactId>
<!-- The version can be anything. Use the version of Java for example -->
<version>7.0</version>
<systemPath>${java.home}/lib/security/US_export_policy.jar</systemPath>
<scope>system</scope>
</dependency>
</dependencies>
