Issue with Flume log4j Appender - hadoop

I'm trying to configure flume to write hadoop service logs to a common sink.
Here's what I have added to hdfs log4j.properties
# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, flume
#Flume Appender
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = localhost
log4j.appender.flume.Port = 41414
and when I run sample pi job I get this error
$ hadoop jar hadoop-mapreduce-examples.jar pi 10 10
log4j:ERROR Could not find value for key log4j.appender.flume.layout 15/11/25 07:23:26 WARN api.NettyAvroRpcClient: Using default maxIOWorkers log4j:ERROR RPC client creation failed! NettyAvroRpcClient { host: localhost, port: 41414 }: RPC connection error Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hadoop.util.RunJar.run(RunJar.java:200)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: org.apache.commons.logging.LogConfigurationException: User-specified log class 'org.apache.commons.logging.impl.Log4JLogger' cannot be found or is not useable.
at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:804)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
at org.apache.hadoop.util.ShutdownHookManager.<clinit>(ShutdownHookManager.java:44)
... 2 more
I have added these jars to hadoop-hdfs lib
avro-ipc-1.7.3.jar, flume-ng-log4jappender-1.5.2.2.2.7.1-33.jar, flume-ng-sdk-1.5.2.2.2.7.1-33.jar
and I do have the commons-logging( commons-logging-1.1.3.jar) and log4j(1.2.17) jars present in the hdfs lib. Any pointers to debug this issue?

Related

Unable to start Hive CLI Hadoop(MapR)

I am trying to access hive CLI. However, it is failing to start with the following AccessControl issue.
Strangly enough, I am able to query hive data from Hue without the AccessControl issue. However, hive CLI is not working.
I am on a MapR cluster.
Any help is much appreciated.
[<user_name>#<edge_node> ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/hive/hive-2.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/opt/mapr/hive/hive-2.1/conf/hive-log4j2.properties Async: true
2017-09-23 23:52:08,988 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-api-jdo-4.2.4.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-api-jdo-4.2.1.jar."
2017-09-23 23:52:08,993 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-core-4.1.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-core-4.1.6.jar."
2017-09-23 23:52:09,004 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-rdbms-4.1.19.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-rdbms-4.1.7.jar."
2017-09-23 23:52:09,038 INFO [main] DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
2017-09-23 23:52:09,039 INFO [main] DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-09-23 23:52:14,2251 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:2172 Thread: 20235 mkdirs failed for /user/<user_name>, error 13
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:617)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:531)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:646)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1256)
at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1276)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1913)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getDefaultDestDir(DagUtils.java:823)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getHiveJarDirectory(DagUtils.java:917)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createJarLocalResource(TezSessionState.java:616)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:256)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.beginOpen(TezSessionState.java:220)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:614)
... 10 more
The error is saying you're defined access to create a directory in the file system. This is likely /user/<user name>, which will need to be added by the HDFS / MapR FS super user.
I am able to query hive data from Hue without the AccessControl
Hue communicates via Thrift and HiveServer2.
Hive CLI bypasses HiveServer2 and is deprecated.
You should use Beeline instead.
beeline -n $(whoami) -u jdbc:hive2://hiveserver:10000/default
And if you're in a kerberized cluster, then you'll need some extra options there.

Java - Failed to specify server's Kerberos principal name

I have some integration tests that use HDFS with Kerberos authentication. When I run them, I get this exception:
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "Serbans-MacBook-Pro.local/1.2.3.4"; destination host is: "10.0.3.33":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
I believe that everything is configured correctly:
System.setProperty("java.security.krb5.realm", "...");
System.setProperty("java.security.krb5.kdc", "...");
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://10.0.3.33:8020");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
conf.set("hadoop.security.authentication", "Kerberos");
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("user#...", "/Users/user/user.keytab");
What do you believe the problem is? On my host (10.0.3.33) I have core-site.xml and hdfs-site.xml configured correctly. But I am not running from that host, as the exception suggests.
Any ideas what to do, in order to be able to run the tests from any host?
Thanks,
Serban
If you are using Hadoop's older version less than 2.6.2, default pattern property is not available in hdfs-site.xml file then you need to specify pattern property manually.
config.set("dfs.namenode.kerberos.principal.pattern", "hdfs/*#BDBIZVIZ.COM");

storm hdfs bolt not working in HDP 2.2

I have tried executing storm hdfs sample program. I have created a jar file and copied the jar to the edge node of the cluster.
Then I executed the below command
java -cp storm-.1.jar:lib/*:lib/log4j.properties test.storm.sample.StormSampleTopology
But, I am getting the below exception
17434 [Thread-11-hdfs_bolt] WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
17448 [Thread-9-hdfs_bolt] WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
17457 [Thread-11-hdfs_bolt] ERROR backtype.storm.util - Async loop died!
java.lang.RuntimeException: Error preparing HdfsBolt: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "xxxx2002.xx.xxxxx.com/11.11.111.221"; destination host is: "x-xxxxx.xxxx.com":8020;
at org.apache.storm.hdfs.bolt.AbstractHdfsBolt.prepare(AbstractHdfsBolt.java:96) ~[storm-hdfs-0.1.2.jar:na]
at backtype.storm.daemon.executor$fn__3441$fn__3453.invoke(executor.clj:692) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.util$async_loop$fn__464.invoke(util.clj:461) ~[storm-core-0.9.3.jar:0.9.3]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "xxx2002.xx.xxx.com/11.11.111.221"; destination host is: "x-xxxx.xxx.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.Client.call(Client.java:1472) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.Client.call(Client.java:1399) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at com.sun.proxy.$Proxy14.create(Unknown Source) ~[na:na]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[hadoop-hdfs-2.6.0.2.2.0.0-2041.jar:na]
My key-tab file is located in the edge node itself. I have specified the location also in the property file
Please help me to resolve this issue.

Job via Oozie HDP 2.1 not creating job.splitmetainfo

When trying to execute a sqoop job which has my Hadoop program passed as a jar file in -jarFiles parameter, the execution blows off with below error. Any resolution seems to be not available. Other jobs with same Hadoop user is getting executed successfully.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/user/root/.staging/job_1423050964699_0003/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1541)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1396)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1363)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:976)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:135)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1241)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1041)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1452)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1448)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1381)
So here is the way I solved it. We are using CDH5 to run Camus to pull data from kafka. We run CamusJob which is responsible for getting data from kafka using comman line:
hadoop jar...
The problem is that new hosts didn't get so-called "yarn-gateway". Cloudera names pack of configs related to service and copied to /etc/hadoop/conf
as "gateway". So I just clicked "deploy client configuration" in CM UI. YARN client conf has been copied to each YARN NodeManager node and it solved problem.

The "Spring XD" xd-shell can't run the hadoop fs ls command, the command returns a java exception

I compiled the latest spring-xd as I needed CDH support. I am able to start the server however when I connect to the server via the xd-shell I try to change a "configuration". Also this is a kerberized cluster, I am not sure how xd will/can handle that.
1st scenario:
admin config server --uri http://testdomain:10111
hadoop config fs --namenode hdfs://nameservice1:8020
hadoop config props set hadoop.security.group.mapping=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
hadoop config props load hadoop.security.group.mapping
hadoop fs ls
Error message:
xd:>hadoop fs ls
-ls: Fatal internal error
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
at org.apache.hadoop.security.Groups.<init>(Groups.java:55)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:277)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:668)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:573)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2428)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2420)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2288)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:270)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:224)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:207)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
at org.springframework.xd.shell.hadoop.FsShellCommands.run(FsShellCommands.java:412)
at org.springframework.xd.shell.hadoop.FsShellCommands.runCommand(FsShellCommands.java:407)
at org.springframework.xd.shell.hadoop.FsShellCommands.ls(FsShellCommands.java:110)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:191)
at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:48)
at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127)
at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:483)
at org.springframework.shell.core.JLineShell.run(JLineShell.java:157)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:126)
... 35 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.JniBasedUnixGroupsMapping
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:38)
... 40 more
2nd scenario
alternatively I remove some java opts
run steps 1, 2 from previous scenario
then
hadoop config props set hadoop.security.authorization=true
hadoop config props set hadoop.security.authentication=kerberos
error below
16:50:29,682 WARN Spring Shell util.NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Authorization (hadoop.security.authorization) is enabled but authentication (hadoop.security.authentication) is configured as simple. Please configure another method like kerberos or digest.
Thanks for you assistance - can't wait to get this working!
Thanks for raising this - we haven't tested with authorization/authentication in the shell for a while - though it is tested as part of the project https://github.com/vmware-serengeti/serengeti-ws
Are you able to perform operations using the standard hadoop file system shell. e.g.
hdfs dfs -ls /user/hadoop/file1
There is currently no specific support in XD for running against a secured Hadoop cluster.
Feel free to open a JIRA ticket at https://jira.springsource.org/browse/XD -- this is something we know we will have to address soon.

Resources