distcp is not executing - hadoop

I am trying to copy data from one HDFS cluster to another using the distcp command. Following is the command I submitted:
hadoop distcp hdfs://sourcenamenodehostname:50070/var/lib/hadoop-hdfs/distcptest.txt hdfs://destinationnamenodehostname:50070/var/lib/hadoop-hdfs
While submitting this I got the error message below. Please go through it and guide me to the correct way.
19/02/27 04:28:19 INFO tools.OptionsParser: parseChunkSize: blocksperchunk false
19/02/27 04:28:20 ERROR tools.DistCp: Invalid arguments:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1835)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1515)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4448)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:912)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:533)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:862)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2168)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1266)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1262)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1262)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1418)
at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:208)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:133)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:493)
Invalid arguments: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1835)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1515)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4448)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:912)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:533)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:862)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

It seems to me that one of the NameNodes you are addressing is in standby state and therefore refuses client operations (judging by the failure in DistCp.setTargetPathExists in the trace, it is likely the one you gave for the destination path). In an HA pair only the active NameNode serves requests. See this -
Invalid arguments: Operation category READ is not supported in state standby.
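You can confirm which NameNode is active with hdfs haadmin and then run distcp against the active one (or against the HA nameservice ID so the client fails over automatically). A rough sketch, where nn1/nn2 are placeholder service IDs - use the ones defined in your hdfs-site.xml - and note that hdfs:// URIs normally use the NameNode RPC port (commonly 8020), not the web UI port:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hadoop distcp hdfs://active-source-namenode:8020/var/lib/hadoop-hdfs/distcptest.txt hdfs://active-destination-namenode:8020/var/lib/hadoop-hdfs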

Related

Spark Socket Timeout issue while writing to hdfs

I am trying to process a Hive query and write the result to HDFS in ORC format, but I am getting a timeout issue. I checked spark-defaults.conf but there are no timeout settings; should I add one? Please let me know what changes to make.
temp = sqlContext.sql(""" query """)
temp.write.format("orc").option("header", "true").save("hdfs://app/Quality/spark_test/")
Attached is the log of the error
: org.apache.hadoop.net.ConnectTimeoutException: Call From .. to app:8020 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=app/64.100.51.136:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
at org.apache.hadoop.ipc.Client.call(Client.java:1482)
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy46.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy47.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2113)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1460)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:73)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:256)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=app/64.100.51.136:8020]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:708)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:374)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1531)
at org.apache.hadoop.ipc.Client.call(Client.java:1448)
... 41 more
The problem comes from a badly formed HDFS location. When you write hdfs://app/, Spark assumes that the NameNode hostname is app.
To solve this, when you provide the HDFS location in your code you should do one of the following:
Either provide the NameNode's hostname or IP, so your code becomes:
temp.write.format("orc").option("header", "true").save("hdfs://NAMENODE_HOST:8020/app/Quality/spark_test/")
Or, if you have already configured Spark against HDFS/YARN by pointing spark-env.sh at the Hadoop configuration files, provide the HDFS location without the hdfs:// scheme, so the code becomes:
temp.write.format("orc").option("header", "true").save("/app/Quality/spark_test/")

Transfer data between hadoop clusters

I would like to transfer data between 2 hadoop clusters located on different servers.
Source:
hadoop version
Hadoop 2.0.0-cdh4.3.1
Destination:
hadoop version
Hadoop 2.0.0-cdh4.3.1
I have a directory in source as:
hadoop fs -ls /bid_data/foo
I have a directory in destination as:
hadoop fs -ls /bid_data
I want to copy the /bid_data/foo directory and its contents into /bid_data on the other cluster.
Thank you,
Rio
=
Updated:
Below is the error message I get when I use the command below. Is it a firewall issue? The destination cluster is not open to the world, which means I need a VPN to access it from outside the local area network.
hadoop distcp hdfs://nn1:8020/bid_data/foo hdfs://nn2:8020/bid_data
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
=
yFilesMapper.map(DistCp.java:314)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=rio, access=EXECUTE, inode="/user/rio/test":rio:warehouse:-rw-rw-rw-
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:177)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:142)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4716)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4698)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:4677)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2934)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:673)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:646)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:45002)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
at org.apache.hadoop.ipc.Client.call(Client.java:1231)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:629)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1545)
Hadoop provides a tool for transferring data from one cluster to another: DistCp.
The command for your case is:
hadoop distcp hdfs://nn1:8020/bid_data/foo hdfs://nn2:8020/bid_data
For more details, refer to:
http://hadoop.apache.org/docs/r0.19.0/distcp.html
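As a rough sketch of a fuller invocation (these are standard DistCp flags, but check them against your version's documentation): -update copies only files that are missing or differ at the destination, and -p preserves replication, ownership and permissions, which requires matching users/groups on both clusters:
hadoop distcp -update -p hdfs://nn1:8020/bid_data/foo hdfs://nn2:8020/bid_data
Regarding the firewall question: the map tasks that perform the copy must be able to reach the destination NameNode and its DataNodes, not just your client machine, so a destination that is only reachable over VPN can indeed make distcp fail.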

I tried to run the MapReduce sample program for hadoop-1.1.2, but I get the following error

Warning: $HADOOP_HOME is deprecated.
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: localhost:54311
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:149)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130)
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2312)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:490) at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:473)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:513)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:511)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:499)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at WordCount.main(WordCount.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Please, can anyone help me?
Check that every process is up by running "jps". The JobTracker must be up and must appear in the list on your localhost machine:
Namenode
Jobtracker
Tasktracker
Datanode
SecondaryNamenode
If it is not up, it might be that another process is using the same port (54311), or simply a firewall problem.
I hope it helps
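As a quick sketch, you can also compare what the client uses with what is actually configured (the paths below assume a standard Hadoop 1.x layout under $HADOOP_HOME/conf):
jps
grep -A1 mapred.job.tracker $HADOOP_HOME/conf/mapred-site.xml
A stray space or line break inside the <value> element of mapred.job.tracker can also produce the "Does not contain a valid host:port authority" message.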

Hive MapReduce Job Submission fails "Target is a directory"

I've been playing around with Hadoop and its sister projects, and I've had a few problems along the way, but I've finally hit one that I cannot find an answer to:
I have a Hive table stored on HDFS as a tab-delimited text file. I can do a basic select on the table, but as soon as I make the query a little more complicated, Hive turns it into a MapReduce job, which fails with the following stack trace:
13/11/29 08:31:00 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.IOException(Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory)'
13/11/29 08:31:00 ERROR exec.Task: Job Submission failed with exception 'java.io.IOException(Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory)'
java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The folder in question does exist on the DFS, at least the "/tmp/hadoop-yarn/staging" part, and no matter what I set its permissions to, Hive or Hadoop resets them on job submission. The really concerning part is that the full path appears to be a generated folder name, so why does the software have a problem with something it generated on its own? And why is it a problem that the path is a directory? What should it be instead?
Edit:
Here is the table I'm working with and the query I'm trying to run:
Query:
select * from hive_flow_details where node_id = 100 limit 10;
Table:
col_name data_type comment
id bigint None
flow_versions_id int None
node_id int None
node_name string None
Bear in mind, this happens with any query I attempt that has any kind of WHERE clause, as Hive translates that into an MR job.
I eventually got to the bottom of this issue: I found conflicting JARs in my classpath, which I cleaned up, and since then I have had no problems.
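If anyone hits the same thing, a rough way to spot such conflicts is to dump the classpath and look for the same artifact appearing with different versions (the paths and variables below are assumptions about a typical installation):
hadoop classpath | tr ':' '\n' | sort
ls $HIVE_HOME/lib/*.jar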

LeaseExpiredException in Hive

Hi all. I ran a Hive query that gets to 97% and then fails with org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on <file>.
Can anyone kindly explain why this error occurs?
And this is a single user Hive cluster environment.
Thank you in advance.
2013-01-02 22:16:17,833 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /tmp/hive-hadoop/hive_2013-01-01_21-21-32_067_6367259756570557828/_task_tmp.-ext-10002/_tmp.000004_1 : org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-hadoop/hive_2013-01-01_21-21-32_067_6367259756570557828/_task_tmp.-ext-10002/_tmp.000004_1 File does not exist. Holder DFSClient_attempt_201301012114_0002_m_000004_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1631)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1622)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1677)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1665)
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:718)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-hadoop/hive_2013-01-01_21-21-32_067_6367259756570557828/_task_tmp.-ext-10002/_tmp.000004_1 File does not exist. Holder DFSClient_attempt_201301012114_0002_m_000004_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1631)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1622)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1677)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1665)
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:718)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy2.complete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy2.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3897)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3812)
at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1345)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:275)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:328)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:277)
at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:260)
Does your Hive query create parallel MR jobs?
I had the same problem, and found the cause of the "LeaseExpiredException: No lease" error on HDFS:
When a job ends, it deletes the /data/work/ folder. If several jobs are running in parallel, that deletion also removes files belonging to another job (in my case the job does need to delete /data/work/).
In other words, this exception is thrown when a job tries to access files that no longer exist.
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;
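If parallel stage execution turns out to be the trigger in your case, one possible mitigation (my assumption, not part of the original fix) is to disable it for the session so that independent stages do not race on shared scratch directories:
SET hive.exec.parallel=false;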
