Why do the Spark examples fail to spark-submit on EC2 with spark-ec2 scripts?

I downloaded Spark 1.5.2 and set up a cluster on EC2 using the spark-ec2 doc here.
After that I went to examples/, ran mvn package, and packaged the examples into a jar.
Finally, I ran the submit with:
bin/spark-submit --class org.apache.spark.examples.JavaTC --master spark://url_here.eu-west-1.compute.amazonaws.com:7077 --deploy-mode cluster /home/aki/Projects/spark-1.5.2/examples/target/spark-examples_2.10-1.5.2.jar
Instead of running, it fails with this error:
WARN RestSubmissionClient: Unable to connect to server spark://url_here.eu-west-1.compute.amazonaws.com:7077.
Warning: Master endpoint spark://url_here.eu-west-1.compute.amazonaws.com:7077 was not a REST server. Falling back to legacy submission gateway instead.
15/12/22 17:36:07 WARN Utils: Your hostname, aki-linux resolves to a loopback address: 127.0.1.1; using 192.168.10.63 instead (on interface wlp4s0)
15/12/22 17:36:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/22 17:36:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:116)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.Client$.main(Client.scala:233)
at org.apache.spark.deploy.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 21 more

Are you sure the master URL really contains "url_here"?
spark://url_here.eu-west-1.compute.amazonaws.com:7077
Or maybe you are just obfuscating it for this post.
If you can connect to the Spark UI at
http://url_here.eu-west-1.compute.amazonaws.com:4040 or, depending on your Spark version, http://url_here.eu-west-1.compute.amazonaws.com:8080, make sure you are using the exact URL shown on the Spark UI for your spark://...:7077 command-line argument.
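As a quick sanity check, here is a minimal sketch (assuming the default standalone ports: 7077 for submission, 8080 for the master web UI) to confirm from the submitting machine that the master is reachable and to read back the exact spark:// URL it advertises; if neither responds, check that the security group created by spark-ec2 allows your IP on those ports:
# is the master's submission port reachable from the machine running spark-submit?
nc -zv url_here.eu-west-1.compute.amazonaws.com 7077
# the standalone master UI also serves its state as JSON; the "url" field is
# the exact spark:// address to pass to --master
curl http://url_here.eu-west-1.compute.amazonaws.com:8080/json/ | grep '"url"'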

Related

Apache Phoenix - How can I start the query server and thin client on a Kerberos cluster?

I have recently spent several days trying to run the Phoenix thin client (queryserver.py and sqlline-thin.py) and the thick client via ZooKeeper against a secure cluster, but I was not able to start or connect the Phoenix service on the secure cluster.
I faced the issues below with both the thin and thick clients:
17/09/27 08:41:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error: java.lang.RuntimeException: java.lang.NullPointerException (state=,code=0)
java.sql.SQLException: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2465)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:149)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
at sqlline.Commands.connect(Commands.java:1064)
at sqlline.Commands.connect(Commands.java:996)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
at sqlline.SqlLine.dispatch(SqlLine.java:809)
at sqlline.SqlLine.initArgs(SqlLine.java:588)
at sqlline.SqlLine.begin(SqlLine.java:661)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:291)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:218)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:326)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:301)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:166)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:161)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:797)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:406)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2410)
... 20 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.getMetaReplicaNodes(ZooKeeperWatcher.java:489)
at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:558)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1211)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1178)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
... 29 more
sqlline version 1.2.0
0: jdbc:phoenix:Namenode1>
Followed the steps below:
1. Removed the hbase-site.xml file on Phoenix and set the HBase package environment.
2. Provided the Kerberos principal and keytab for the Phoenix Query Server in the $HBASE_CONF_DIR/hbase-site.xml file:
<property><name>phoenix.queryserver.kerberos.principal</name><value>HTTP/webuser@Domain.COM</value></property>
<property><name>phoenix.queryserver.kerberos.keytab</name><value>C:\KeyTab\webuser.keytab</value></property>
3. Executed the thin server and client:
a. Started the query server with /bin> python queryserver.py, which started properly.
b. Started the thin client with /bin> python sqlline-thin.py http://namenode1:8765;authentication=SPNEGO;principal=HTTP/webuser@Domain.COM;keytab=C:\\KeyTab\\webuser.keytab
4. Executed the thick client with:
/bin> python sqlline.py Namenode1:2181:/hbase-secure:HTTP/webuser@Domain.COM:C:\\KeyTab\\webuser.keytab
Let me know if any other configuration is required for Apache Phoenix on a secure cluster.
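One note on step 3b: the thin-client URL contains semicolons, which both Unix shells and cmd.exe treat as delimiters, so unless the whole URL is quoted everything after the first ; may never reach sqlline-thin.py. A minimal sketch (host, principal, and keytab taken from the steps above):
python sqlline-thin.py "http://namenode1:8765;authentication=SPNEGO;principal=HTTP/webuser@Domain.COM;keytab=C:\KeyTab\webuser.keytab"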

Nutch 2.3.1 on OSX does not connect to MongoDB

I configured a local Nutch 2.3.1 instance on MacOS 10.11.5 (El Capitan) running in Eclipse as described here: https://wiki.apache.org/nutch/RunNutchInEclipse
As the data store I configured MongoDB 2.6.12, which is also running on my local MacOS machine. I took the Gora config from here: http://www.aossama.com/search-engine-with-apache-nutch-mongodb-and-elasticsearch/
ivy.xml
<dependency org="org.apache.gora" name="gora-mongodb" rev="0.6.1" conf="*->default" />
gora.properties
gora.datastore.default=org.apache.gora.mongodb.store.MongoStore
gora.mongodb.override_hadoop_configuration=false
gora.mongodb.mapping.file=/gora-mongodb-mapping.xml
gora.mongodb.servers=localhost:27017
# I tried several server settings like localhost, 127.0.0.1, 127.0.0.1:27017, ...
gora.mongodb.db=nutch
I did not change gora-mongodb-mapping.xml.
nutch-site.xml
<property>
<name>storage.data.store.class</name>
<value>org.apache.gora.mongodb.store.MongoStore</value>
<description>Default class for storing data</description>
</property>
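For reference, the inject step referred to below is presumably the stock Nutch 2.x command (a sketch; the urlDir path is taken from the log below, and bin/nutch is assumed to be on the runtime path):
bin/nutch inject /Users/myaccount/Documents/Nutch/urls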
If I run the inject command, hadoop.log shows this confusing result:
2016-07-12 23:23:16,818 INFO crawl.InjectorJob - InjectorJob: starting at 2016-07-12 23:23:16
2016-07-12 23:23:16,819 INFO crawl.InjectorJob - InjectorJob: Injecting urlDir: /Users/myaccount/Documents/Nutch/urls
2016-07-12 23:23:17,054 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-07-12 23:23:17,416 ERROR store.MongoStore -
2016-07-12 23:23:17,417 ERROR store.MongoStore - [Ljava.lang.StackTraceElement;@4b5189ac
2016-07-12 23:23:17,418 ERROR store.MongoStore - Error while initializing MongoDB store: java.lang.NullPointerException
2016-07-12 23:23:17,419 ERROR crawl.InjectorJob - InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.io.IOException: java.lang.NullPointerException
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:267)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:290)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:299)
Caused by: java.lang.RuntimeException: java.io.IOException: java.lang.NullPointerException
at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:131)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 7 more
Caused by: java.io.IOException: java.lang.NullPointerException
at org.apache.gora.mongodb.store.MongoMappingBuilder.fromFile(MongoMappingBuilder.java:123)
at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:118)
... 9 more
Caused by: java.lang.NullPointerException
at org.apache.gora.mongodb.store.MongoMapping.newDocumentField(MongoMapping.java:109)
at org.apache.gora.mongodb.store.MongoMapping.addClassField(MongoMapping.java:169)
at org.apache.gora.mongodb.store.MongoMappingBuilder.loadPersistentClass(MongoMappingBuilder.java:169)
at org.apache.gora.mongodb.store.MongoMappingBuilder.fromFile(MongoMappingBuilder.java:112)
... 10 more
After two days I've run out of ideas.
Within the log file I can't identify any valuable hint. The MongoDB logs don't show any connection attempts (not to mention an active connection). Using mongo I'm able to connect to the database and requesting http://localhost:27017 shows the expected message ("It looks like you are trying to access MongoDB over HTTP on the native driver port.") and corresponding log file entries. If I switch the data store to Cassandra, injecting works as expected, so Nutch itself also seems to work.
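For what it's worth, that connectivity check can be reproduced non-interactively; a minimal sketch using the mongo shell against the server configured above:
# should print { "ok" : 1 } if mongod is reachable on the native driver port
mongo --host 127.0.0.1 --port 27017 --eval 'db.runCommand({ping: 1})'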
Does anybody know what I'm missing or understand what the hadoop.log is trying to tell me?
Any help would be appreciated! Thx.
Update: I also tried to use this configuration on an Ubuntu 14.04 server - works as expected. So I suppose my issue is related to the connection between Nutch & MongoDB running on a Mac. (If somebody wants to know: I'm trying to get the configuration working on my Mac because I want to do some local development with no need of a server connection.)

Spark 2.0 Thrift server not starting in YARN mode

I started the Spark 2.0 Thrift server in a local environment and it works fine; when I tried it in the cluster environment, the following exception was thrown.
16/06/02 10:21:06 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:148)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:502)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2246)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:749)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:57)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:81)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:724)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/06/02 10:21:06 INFO util.ShutdownHookManager: Shutdown hook called
When checking the application master logs, I see:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
Spark defaults configuration:
spark.executor.memory 2g
spark.driver.memory 4g
spark.executor.cores 1
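For context, the launch being described presumably uses the stock start script pointed at YARN; a minimal sketch (the exact invocation is not shown, so the script path and master value are assumptions), plus the usual way to pull the application master log that contains the ExecutorLauncher error:
# start the Thrift server against YARN (Spark 2.0 syntax)
sbin/start-thriftserver.sh --master yarn
# fetch the application master's container logs; <appId> is the id YARN assigned
yarn logs -applicationId <appId>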

Giraph ZooKeeper port problems

I am trying to run the SimpleShortestPathsVertex (aka SimpleShortestPathComputation) example described in the Giraph Quick Start. I am running this on a Hortonworks Sandbox instance (HDP 2.1) using VirtualBox, and I packaged giraph.jar using profile hadoop_2.0.0.
When I try to run the example using
hadoop jar giraph.jar org.apache.giraph.GiraphRunner
org.apache.giraph.examples.SimpleShortestPathsVertex -vif
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip
/user/hue/tinygraph.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat
-op /user/hue/output/shortestpaths -w 1
I get the following exception
2014-04-30 07:22:15,390 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to sandbox.hortonworks.com:22181 with poll msecs = 3000
2014-04-30 07:22:15,396 WARN [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357)
at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
I have found a workaround - it seems that Giraph expects ZooKeeper to be running on port 22181, while it is actually running on 2181. I simply used the Ambari interface to set ZooKeeper to run on 22181 (go to http://127.0.0.1:8080/, log in with admin/admin, open the Services tab, select ZooKeeper, change the port to 22181, save, and Service Actions -> Restart All).
Does anyone have a better solution for this problem? Is there a config via which the port should be specified, or is this port in the Giraph source code a typo?
Yes, you can specify it each time you run a Giraph job by using the option -Dgiraph.zkList=localhost:2181
You can also set it in the Hadoop configs, and then you don't have to pass this option each time you submit a Giraph job. For that, add the following line to the conf/core-site.xml file:
<property><name>giraph.zkList</name><value>localhost:2181</value></property>
[Please check the syntax; I don't recall it off the top of my head, and currently I don't have access to a cluster to verify it.]
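Putting the first suggestion together with the command from the question, the full invocation would look roughly like this sketch (the -D option must come after GiraphRunner and before the computation class; ZooKeeper stays on its stock port 2181, and the host name is taken from the log above):
hadoop jar giraph.jar org.apache.giraph.GiraphRunner \
-Dgiraph.zkList=sandbox.hortonworks.com:2181 \
org.apache.giraph.examples.SimpleShortestPathsVertex \
-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
-vip /user/hue/tinygraph.txt \
-of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
-op /user/hue/output/shortestpaths -w 1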

The "Spring XD" xd-shell can't run the hadoop fs ls command, the command returns a java exception

I compiled the latest Spring XD as I needed CDH support. I am able to start the server; however, when I connect to it via the xd-shell and try to change a "configuration", I run into the errors below. Also, this is a Kerberized cluster, and I am not sure how XD will/can handle that.
1st scenario:
admin config server --uri http://testdomain:10111
hadoop config fs --namenode hdfs://nameservice1:8020
hadoop config props set hadoop.security.group.mapping=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
hadoop config props load hadoop.security.group.mapping
hadoop fs ls
Error message:
xd:>hadoop fs ls
-ls: Fatal internal error
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
at org.apache.hadoop.security.Groups.<init>(Groups.java:55)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:277)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:668)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:573)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2428)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2420)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2288)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:270)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:224)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:207)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
at org.springframework.xd.shell.hadoop.FsShellCommands.run(FsShellCommands.java:412)
at org.springframework.xd.shell.hadoop.FsShellCommands.runCommand(FsShellCommands.java:407)
at org.springframework.xd.shell.hadoop.FsShellCommands.ls(FsShellCommands.java:110)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:191)
at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:48)
at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127)
at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:483)
at org.springframework.shell.core.JLineShell.run(JLineShell.java:157)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:126)
... 35 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.JniBasedUnixGroupsMapping
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:38)
... 40 more
2nd scenario:
Alternatively, I remove some Java opts and run steps 1 and 2 from the previous scenario, then:
hadoop config props set hadoop.security.authorization=true
hadoop config props set hadoop.security.authentication=kerberos
The error is below:
16:50:29,682 WARN Spring Shell util.NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Authorization (hadoop.security.authorization) is enabled but authentication (hadoop.security.authentication) is configured as simple. Please configure another method like kerberos or digest.
Thanks for your assistance - can't wait to get this working!
Thanks for raising this - we haven't tested with authorization/authentication in the shell for a while, though it is tested as part of the project https://github.com/vmware-serengeti/serengeti-ws
Are you able to perform operations using the standard Hadoop file system shell? E.g.:
hdfs dfs -ls /user/hadoop/file1
There is currently no specific support in XD for running against a secured Hadoop cluster.
Feel free to open a JIRA ticket at https://jira.springsource.org/browse/XD -- this is something we know we will have to address soon.
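Since the cluster is Kerberized, a reasonable first check before testing the plain HDFS shell is to confirm that a valid ticket exists (a sketch; the principal is a placeholder for your real one):
kinit user@DOMAIN.COM   # obtain a Kerberos ticket
klist                   # confirm the ticket cache is populated
hdfs dfs -ls /user/hadoop/file1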
