Not able to run Oozie workflow in AWS EMR - hadoop

We are implementing a POC that runs Oozie on AWS EMR. For security reasons I cannot post the workflow, but it is a simple example with a single rename action that renames a file. The rest of the nodes are the standard ones: start, end, fatal error, error handler, etc.
The same workflow worked fine on an EC2 instance, but when we run it on EMR we get the following error:
2019-09-12 19:34:41,300 WARN ActionStartXCommand:523 - SERVER[<hostname>] USER[hadoop] GROUP[-] TOKEN[] APP[<WorkflowName>] JOB[0000006-190911195656052-oozie-oozi-W] ACTION[0000006-190911195656052-oozie-oozi-W#ErrorHandler] Error starting action [ErrorHandler]. ErrorType [ERROR], ErrorCode [EM007], Message [EM007: Encountered an error while sending the email message over SMTP.]
org.apache.oozie.action.ActionExecutorException: EM007: Encountered an error while sending the email message over SMTP.
at org.apache.oozie.action.email.EmailActionExecutor.email(EmailActionExecutor.java:304)
at org.apache.oozie.action.email.EmailActionExecutor.validateAndMail(EmailActionExecutor.java:173)
at org.apache.oozie.action.email.EmailActionExecutor.start(EmailActionExecutor.java:112)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:459)
at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:82)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:283)
at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:62)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:244)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.mail.MessagingException: Could not connect to SMTP host: <hostname>, port: 25;
nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1961)
at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:654)
When we check the application logs, we see the error below:
Launcher AM execution failed
java.lang.UnsupportedOperationException: Not implemented by the S3FileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.oozie.action.hadoop.FSLauncherURIHandler.create(FSLauncherURIHandler.java:36)
at org.apache.oozie.action.hadoop.PrepareActionsHandler.execute(PrepareActionsHandler.java:86)
at org.apache.oozie.action.hadoop.PrepareActionsHandler.prepareAction(PrepareActionsHandler.java:73)
at org.apache.oozie.action.hadoop.LauncherAM.executePrepare(LauncherAM.java:371)
at org.apache.oozie.action.hadoop.LauncherAM.access$000(LauncherAM.java:55)
at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:220)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the S3FileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1060)
at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1371)
Hadoop distribution: Amazon 2.8.5
Oozie version: Oozie 5.1.0
EMR version: emr-5.26.0
Appreciate any guidance here.

The issue was resolved after we switched to an older version of Oozie, i.e. 4.3. No other changes were made, and the workflow works fine. I had read in one of the AWS links that some people were not able to run Oozie with 5.x versions. I will update this answer once we get a concrete reply from AWS.
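For reference, the "Not implemented by the S3FileSystem FileSystem implementation" message in the launcher log is raised from Hadoop's base FileSystem class while Oozie resolves the prepare path, as the stack trace shows. A paraphrased sketch of the Hadoop 2.x behaviour (an illustration, not EMR's actual code):

// Paraphrased sketch (not EMR's actual source) of why the launcher log shows this
// message: in Hadoop 2.x, FileSystem.getScheme() has a default implementation that
// throws, and FileSystem.loadFileSystems() calls getScheme() on every implementation
// it discovers via ServiceLoader, which is the call visible in the stack trace above.
public abstract class FileSystemSketch {
  public String getScheme() {
    throw new UnsupportedOperationException("Not implemented by the "
        + getClass().getSimpleName() + " FileSystem implementation");
  }
}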

Related

What is causing "org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: null"?

I have an Elastic MapReduce job which uses elasticsearch-hadoop via scalding-taps to transfer data from Amazon S3 to Amazon Elasticsearch Service. For a long time this job ran successfully. However, it has recently started failing with the following stack trace:
2016-03-02 07:28:34,003 FATAL [IPC Server handler 0 on 41019] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1456902751849_0012_m_000000_0 - exited : cascading.tuple.TupleException: unable to sink into output identifier: myindex/mytable
at cascading.tuple.TupleEntrySchemeCollector.collect(TupleEntrySchemeCollector.java:160)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:95)
at cascading.tuple.TupleEntrySchemeCollector.add(TupleEntrySchemeCollector.java:134)
at cascading.flow.stream.SinkStage.receive(SinkStage.java:90)
at cascading.flow.stream.SinkStage.receive(SinkStage.java:37)
at cascading.flow.stream.FunctionEachStage$1.collect(FunctionEachStage.java:80)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.MapFunction.operate(Operations.scala:59)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: null
We have enabled the "es.nodes.wan.only" setting.
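For completeness, the setting is supplied through the job's Hadoop configuration; a rough sketch of the kind of configuration involved (the endpoint, port, and index below are placeholders, not our actual values):

import org.apache.hadoop.conf.Configuration;

public class EsHadoopConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // elasticsearch-hadoop connection settings; the endpoint and index below are
    // placeholders standing in for the Amazon Elasticsearch Service domain.
    conf.set("es.nodes", "search-mydomain.us-east-1.es.amazonaws.com");
    conf.set("es.port", "443");
    conf.set("es.nodes.wan.only", "true");   // talk only to es.nodes, do not discover data nodes
    conf.set("es.resource", "myindex/mytable");
    System.out.println("es.nodes.wan.only = " + conf.get("es.nodes.wan.only"));
  }
}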
What could be causing this failure?

Why do the Spark examples fail to spark-submit on EC2 with spark-ec2 scripts?

I downloaded spark-1.5.2 and set up a cluster on EC2 using the spark-ec2 doc here.
After that I went to examples/, ran mvn package, and packaged the examples into a jar.
Finally, I ran the submit with:
bin/spark-submit --class org.apache.spark.examples.JavaTC --master spark://url_here.eu-west-1.compute.amazonaws.com:7077 --deploy-mode cluster /home/aki/Projects/spark-1.5.2/examples/target/spark-examples_2.10-1.5.2.jar
Instead of it running, I get the error:
WARN RestSubmissionClient: Unable to connect to server spark://url_here.eu-west-1.compute.amazonaws.com:7077.
Warning: Master endpoint spark://url_here.eu-west-1.compute.amazonaws.com:7077 was not a REST server. Falling back to legacy submission gateway instead.
15/12/22 17:36:07 WARN Utils: Your hostname, aki-linux resolves to a loopback address: 127.0.1.1; using 192.168.10.63 instead (on interface wlp4s0)
15/12/22 17:36:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/22 17:36:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:116)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.Client$.main(Client.scala:233)
at org.apache.spark.deploy.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 21 more
Are you sure the URL to the master really contains "url_here"?
spark://url_here.eu-west-1.compute.amazonaws.com:7077
Or maybe you are just obfuscating it for this post.
If you can connect to the Spark UI at http://url_here.eu-west-1.compute.amazonaws.com:4040 (or, depending on your Spark version, http://url_here.eu-west-1.compute.amazonaws.com:8080), make sure you use the URL variable shown in the Spark UI for your spark://...:7077 command-line argument.
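If the URL from the UI matches and submission still times out, it is also worth confirming plain TCP connectivity from the submitting machine to port 7077. A quick hypothetical check (the hostname is the placeholder from the question):

import java.net.InetSocketAddress;
import java.net.Socket;

public class MasterPortCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder hostname copied from the question; replace with the master's real
    // public DNS name. A ConnectException or timeout here usually points at a
    // security group / firewall problem rather than a Spark problem.
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress("url_here.eu-west-1.compute.amazonaws.com", 7077), 5000);
      System.out.println("TCP connection to the master on 7077 succeeded");
    }
  }
}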

storm hdfs bolt not working in HDP 2.2

I tried executing the storm-hdfs sample program. I created a jar file, copied it to the edge node of the cluster, and then executed the command below:
java -cp storm-.1.jar:lib/*:lib/log4j.properties test.storm.sample.StormSampleTopology
But I am getting the exception below:
17434 [Thread-11-hdfs_bolt] WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
17448 [Thread-9-hdfs_bolt] WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
17457 [Thread-11-hdfs_bolt] ERROR backtype.storm.util - Async loop died!
java.lang.RuntimeException: Error preparing HdfsBolt: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "xxxx2002.xx.xxxxx.com/11.11.111.221"; destination host is: "x-xxxxx.xxxx.com":8020;
at org.apache.storm.hdfs.bolt.AbstractHdfsBolt.prepare(AbstractHdfsBolt.java:96) ~[storm-hdfs-0.1.2.jar:na]
at backtype.storm.daemon.executor$fn__3441$fn__3453.invoke(executor.clj:692) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.util$async_loop$fn__464.invoke(util.clj:461) ~[storm-core-0.9.3.jar:0.9.3]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "xxx2002.xx.xxx.com/11.11.111.221"; destination host is: "x-xxxx.xxx.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.Client.call(Client.java:1472) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.Client.call(Client.java:1399) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[hadoop-common-2.6.0.2.2.0.0-2041.jar:na]
at com.sun.proxy.$Proxy14.create(Unknown Source) ~[na:na]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[hadoop-hdfs-2.6.0.2.2.0.0-2041.jar:na]
My keytab file is located on the edge node itself, and I have specified its location in the property file.
Please help me resolve this issue.
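For reference, a rough illustration of the client-side settings that the "Failed to specify server's Kerberos principal name" message refers to; the principal names, realm, and keytab path are placeholders, not values from my cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureHdfsLoginSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    // The "Failed to specify server's Kerberos principal name" message points at
    // this property being missing on the client side; the value is a placeholder.
    conf.set("dfs.namenode.kerberos.principal", "nn/_HOST@EXAMPLE.COM");

    UserGroupInformation.setConfiguration(conf);
    // Placeholder principal and keytab path, not the real ones from the edge node.
    UserGroupInformation.loginUserFromKeytab("storm@EXAMPLE.COM",
        "/etc/security/keytabs/storm.headless.keytab");
    System.out.println("Logged in as " + UserGroupInformation.getCurrentUser());
  }
}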

Server returns 403 during secondary namenode doCheckpoint with namenode

I am configuring Hadoop on a cluster.
All nodes started successfully, but the secondary namenode fails during doCheckpoint with the following log:
2011-10-25 11:09:07,207 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2011-10-25 11:09:07,208 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.IOException: Server returned HTTP response code: 403 for URL: https://name.node.http:50470/getimage?getimage=1
at sun.reflect.GeneratedConstructorAccessor24.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)
at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:183)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:364)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:353)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:353)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:438)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:329)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:288)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:337)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1110)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:285)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Server returned HTTP response code: 403 for URL: https://name.node.http:50470/getimage?getimage=1
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2308)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getHeaderField(HttpsURLConnectionImpl.java:271)
at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:175)
... 14 more
It seems the namenode rejects the secondary namenode's request with HTTP error code 403.
Kerberos is configured for Hadoop, and the namenode successfully authenticates and authorizes the secondary namenode's request:
2011-10-25 11:27:40,033 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successfull for hadoop/secondarynamenode#MY.DOMAIN.COM
2011-10-25 11:27:40,100 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successfull for hadoop/secondarynamenode#MY.DOMAIN.COM for protocol=interface org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol
2011-10-25 11:27:40,101 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 123.58.169.92
Does anyone know how this could happen? How can I fix it?
Thanks very much.
I think it's more appropriate to move my comment above here as an answer.
This error is caused by the _HOST macro in the secondary namenode principal in hdfs-site.xml. If dfs.secondary.http.address is not set in hdfs-site.xml, _HOST is expanded by whichever process uses it.
In this case the code runs on the namenode, so _HOST resolves to the namenode's address. Since a Kerberos principal is composed of a name, a hostname, and a realm, that produces a different principal, which is why authentication fails.
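To make the _HOST behaviour concrete, here is a small sketch using Hadoop's SecurityUtil (hostnames and realm are placeholders): the same principal template expands to two different principals depending on which host performs the substitution.

import org.apache.hadoop.security.SecurityUtil;

public class HostMacroSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder hostnames; EXAMPLE.COM stands in for the real realm.
    String onNameNode  = SecurityUtil.getServerPrincipal("hadoop/_HOST@EXAMPLE.COM", "namenode.example.com");
    String onSecondary = SecurityUtil.getServerPrincipal("hadoop/_HOST@EXAMPLE.COM", "secondarynn.example.com");
    System.out.println(onNameNode);   // hadoop/namenode.example.com@EXAMPLE.COM
    System.out.println(onSecondary);  // hadoop/secondarynn.example.com@EXAMPLE.COM
  }
}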

Cassandra upgrade 0.8.2 -> 0.8.4 gives error "failed connecting to all endpoints"

After upgrading Cassandra from 0.8.2 to 0.8.4, I got this error.
I have restarted Cassandra, removed data, etc.; nothing helps.
I have 6 identical machines in the cloud, and before the upgrade it was working fine.
If I run netstat, it shows port 9160 listening.
nodetool ... ring responds with all 6 machines UP.
What could be the problem? :(
Exception in thread "main" java.io.IOException: Could not get input splits
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:157)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at WordCount.run(Unknown Source)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at WordCount.main(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints slave1/98.188.69.242
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:153)
... 7 more
Caused by: java.io.IOException: failed connecting to all endpoints slave1/98.188.69.242
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:234)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:70)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:190)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:175)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
I don't know whether you have solved the problem. I ran into the same problem (with the same configuration) and tried to work it out.
Problem location:
public List call() throws Exception { ... List tokens = getSubSplits(keyspace, cfName, range, conf); ... }
In the method getSubSplits, when createConnection(host, ConfigHelper.getRpcPort(conf), true) is called, the format of host is not right: it is sometimes hostname/10.197.34.111 (hostname plus IP address), so createConnection fails. We need to extract the IP address and then call createConnection.
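A minimal sketch of that workaround (the endpoint string is the one from the stack trace; the exact splitting logic is an assumption about the "hostname/ip" format):

public class EndpointFix {
  public static void main(String[] args) {
    // Endpoint strings can come back as "hostname/ip"; keep only the ip part
    // before opening the connection. The value below is from the stack trace.
    String endpoint = "slave1/98.188.69.242";
    String host = endpoint.contains("/")
        ? endpoint.substring(endpoint.indexOf('/') + 1)
        : endpoint;
    System.out.println(host); // 98.188.69.242
    // then: createConnection(host, ConfigHelper.getRpcPort(conf), true)
  }
}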
You can try changing the code along those lines and running the Hadoop job again.
Good luck!
