LeaseExpiredException: No lease error on HDFS

LeaseExpiredException: No lease error on HDFS - hadoop

I am trying to load large data to HDFS and I sometimes get the error below. any idea why?
The error:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /data/work/20110926-134514/_temporary/_attempt_201109110407_0167_r_000026_0/hbase/site=3815120/day=20110925/107-107-3815120-20110926-134514-r-00026 File does not exist. Holder DFSClient_attempt_201109110407_0167_r_000026_0 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1557)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1548)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1603)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1591)
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:675)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at $Proxy1.complete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3566)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3481)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:966)
at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1297)
at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:78)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs$RecordWriterWithCounter.close(MultipleOutputs.java:303)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:456)
at com.my.hadoop.platform.sortmerger.MergeSortHBaseReducer.cleanup(MergeSortHBaseReducer.java:145)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:178)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:572)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)

I managed to fix the problem:
When the job ends he deletes /data/work/ folder. If few jobs are running in parallel the deletion will also delete the files of the another job. actually I need to delete /data/work/.
In other words this exception is thrown when the job try to access to files which are not existed anymore

I meet the same problem when i use spark streaming to saveAsHadoopFile to Hadoop(2.6.0-cdh5.7.1), of course i use MultipleTextOutputFormat to write different data to different path. Sometimes the exception what Zohar said would happen. The reason is as Matiji66 say:
another program read,write and delete this tmp file cause this error.
but the root reason he didn't talk about is the hadoop speculative:
Hadoop doesn’t try to diagnose and fix slow running tasks, instead, it tries to detect them and runs backup tasks for them.
So the really reason is that, your task execute slow, then hadoop run another task to do the same thing(in my case is to save data to a file on hadoop), when one task of the two task finished, it will delete the temp file, and the other after finished, it will delete the same file, then it does not exists, so the exception
does not have any open files
happened
you can fix it by close the speculative of spark and hadoop:
sparkConf.set("spark.speculation", "false");
sparkConf.set("spark.hadoop.mapreduce.map.speculative", "false");
sparkConf.set("spark.hadoop.mapreduce.reduce.speculative", "false")

For my case, another program read,write and delete this tmp file cause this error.
Try to avoid this.

ROOT CAUSE
Storage policy was set on staging directory and hence MAPREDUCE job failed.
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
RESOLUTION
Setup staging directory for which storage policy is not setup. I.e. modify yarn.app.mapreduce.am.staging-dir in yarn-site.xml
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp</value>
</property>

I use Sqoop to import into HDFS and have same error. By the help of previous answers I have realized that I needed to remove last "/" from --target-dir /dw/data/
I used --target-dir /dw/data
works fine

I encountered this problem when I changed my program to use saveAsHadoopFile method to improve performance, in which scenario I can't make use of DataFrame API directly. see the problem
The reason why this would happen is basically what Zohar said, the saveAsHadoopFile method with MultipleTextOutputFormat actually doesn't allow multiple programs concurrently running to save files to the same directory. Once a program finished, it would delete the common _temporary directory the others still need, I am not sure if it's a bug in M/R API. (2.6.0-cdh5.12.1)
You can try this solution below if you can't redesign your program:
This is the source code of FileOutputCommitter in M/R API: (you must download an according version)
package org.apache.hadoop.mapreduce.lib.output;
public class FileOutputCommitter extends OutputCommitter {
private static final Log LOG = LogFactory.getLog(FileOutputCommitter.class);
/**
* Name of directory where pending data is placed. Data that has not been
* committed yet.
*/
public static final String PENDING_DIR_NAME = "_temporary";
Changes:
"_temporary"
To:
System.getProperty("[the property name you like]")
Compiles the single Class with all required dependencies, then creates a jar with the three output class files and places the jar to you classpath. (make it before the original jar)
Or, you can simply put the source file to your project.
Now, you can config the temp directory for each program by setting a different system property.
Hope it can help you.

Related

unable to save rdd on local filesystem on windows 10

I have a scala/spark program that is used to validate xmls file in an input directory and then writes the report to another input parameter (local filesystem path to write report to).
As per the requirements from stakeholders this program is to run on local machines hence I am using spark in local mode.
Till now things were fine, i was using the code below to save my report to a file
dataframe.repartition(1)
.write
.option("header", "true")
.mode("overwrite")
.csv(reportPath)
However this required winutils to be installed/configured on the machines running my program.
Given we use cloudera updates very often, there was an overhead of changing winutils after evry update as we would be updating the jars to the latest version in our pom file. Hence, I have been asked to remove the dependency on winutils
On a quick google search and after coming across How to save Spark RDD to local filesystem
I decided to change the above pice of code to
val outputRdd = dataframe.rdd
val count = outputRdd.count()
println("\nCount is: " + count + "\n")
println("\nOutput path is: " + reportPath + "\n")
outputRdd.coalesce(1).saveAsTextFile(reportPath)
However, on running the code I am now getting this error
Count is: 15
Output path is: C:\\codingdir\\test\\report
Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.hadoop.mapred.JobContextImpl.<init>(Lorg/apache/hadoop/mapred/JobConf;Lorg/apache/hadoop/mapreduce/JobID;)V from class org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil
at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.createJobContext(SparkHadoopWriter.scala:178)
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:67)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1096)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:1067)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply$mcV$sp(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:957)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply$mcV$sp(RDD.scala:1499)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1478)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1478)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1478)
at com.optus.dcoe.hawk.XmlParser$.delayedEndpoint$com$optus$dcoe$hawk$XmlParser$1(XmlParser.scala:120)
at com.optus.dcoe.hawk.XmlParser$delayedInit$body.apply(XmlParser.scala:16)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.optus.dcoe.hawk.XmlParser$.main(XmlParser.scala:16)
at com.optus.dcoe.hawk.XmlParser.main(XmlParser.scala)
I have tried changing the value of reportPath varible to
C:\codingdir\test\report
file://C:/codingdir/test/report
file://C:/codingdir/test/report
and other values as suggested on
Write RDD as textfile using Apache Spark
How to save Spark RDD to local filesystem
How to access local files in Spark on Windows?
and other links but I am still getting same error
I have found these articles about java.lang.IllegalAccessError but not sure how do i get around this error:
https://examples.javacodegeeks.com/java-basics/exceptions/java-lang-illegalaccesserror-how-to-resolve-illegal-access-error/
java.lang.IllegalAccessError: tried to access method
Can someone please help me in resolving this?
Env variable HADOOP_HOME pertaining to winutls has been removed.
winutils entry has been removed from PATH variable
I am using java 8 on windows 10 (all the users of the program would be on similar laptops)
Spark version is 2.4.0-cdh6.2.1

Finally found out the issue,
It was caused by some unwanted mapreduce related dependencies which have now been removed and I have moved to another error now

Unable to run basic example for hadoop

Hadoop version 2.9.0, Java - 1.8.0_162
When trying to run the example given here: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html, under standalone operation, I get the following error:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar grep input output 'dfs[a-z.]+'
java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.run([Ljava/lang/String;)I
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I am new to hadoop and not sure how to fix this. I have set the JAVA_HOME in hadoop-env.sh. I am pretty sure that I am using the correct compatible versions of java and hadoop.
Any help will be useful.

Nish,
The error message means that while the runtime was able to find the class ProgramDriver, the function run() is not present.
The most likely reason for this is that you're running an old version of Hadoop that exposed a difference interface in ProgramDriver. About a year ago this method was renamed to run() after being called driver().
The fix for that would be making sure you're running a recent version of Hadoop.
For your reference please check following links they have asked same question.
Error while executing hadoop-mapreduce-examples-2.2.0.jar
Can the hadoop programm which write under the hadoop-2.2.0 run in hadoop-1.2.1?

NativeIO chmod "ENOTDIR" exception in mapreduce

I'm finding that a mapreduce job cant start due to some funniness in the RawLocalFileSystem, it appears.
How can I debug this error ? It appears that there is no trace of the directory or command which is associated with the NativeIO chmod exception.
One option will be of course to bundle a jar into my classpath with a custom RawLocalFileSystem implementation, but that seems like overkill.
13/07/11 18:39:43 ERROR security.UserGroupInformation: PriviledgedActionException as:root cause:ENOTDIR: Not a directory
ENOTDIR: Not a directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)

This was an interesting bug: It turned out that i indeed had a FILE created which existed, already, in a place where a directory was needed to be created.
That is , some how, I had a file called "tmp" in my file system implementation's root directory !
In any case, the confusion was due to scarce error reporting by the hadoop NativeIO class.
I think it is a failure which should be reported and logged better by the underlying NativeIO class .

my pig UDF runs in local mode but fails with "Deserialization error: could not instantiate" on my cluster

I have a pig UDF which runs perfectly in local mode, but fails with: could not instantiate 'com.bla.myFunc' with arguments 'null' when I try it on the cluster.

my mistake was not digging hard enough in the task logs.
when you dig there thru the jobTracker UI, you could see that the root cause was:
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.Maps
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
so, besides the usual:
pigServer.registerFunction("myFunc", new FuncSpec("com.bla.myFunc"));
we should add:
registerJar(pigServer, Maps.class);
and so on for any jar used by the UDF.
Another option is to use build-jar-with-dependencies, but then you have to put the pig.jar before yours in the classpath, or else you'll tackle this one: embedded hadoop-pig: what's the correct way to use the automatic addContainingJar for UDFs?

Configuring Teamcity's logging behaviour

I'm using Teamcity 5 for our CI environment. It's a great tool, but I've been struggling with one thing: the stdout_yyyyMMdd.log file in the \TeamCity\logs folder grows to ridiculous sizes. Is there a way to turn it off?
Places I've looked so far:
Jetbrains: Nothing on stdout;
Google for "tomcat stdout logs": the first few links don't really address the issue.
Edit:
At KIR's suggestion, I actually looked to see what's in stdout. It's the same exception message repeated over and over again:
[2010-12-01 08:57:21,268] WARN - jetbrains.buildServer.SERVER - java.io.FileNotFoundException: <...Path...>\.BuildServer\system\caches\search\_8p.prx (The system cannot find the file specified)
[2010-12-01 08:57:21,315] ERROR - erverSide.search.SearchService - SearchService.enqueueHistory
java.io.FileNotFoundException: <...Path...>\.BuildServer\system\caches\search\_8p.prx (The system cannot find the file specified)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(Unknown Source)
at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:78)
at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:108)
at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:65)
at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:132)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:638)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:599)
at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:104)
at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:74)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:704)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:476)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:314)
at jetbrains.buildServer.serverSide.search.SearchService.getIndexSearcher(SearchService.java:451)
at jetbrains.buildServer.serverSide.search.SearchService.enqueueHistory(SearchService.java:515)
at jetbrains.buildServer.serverSide.search.BackgroundIndexer.run(BackgroundIndexer.java:32)
at java.lang.Thread.run(Unknown Source)
Any idea what this file is?

If you're running TC on unix you could use logrotate: http://linuxcommand.org/man_pages/logrotate8.html (Obviously, this is a workaround but it should be effective.)
This guy has a windows equivalent that may do the trick too: http://www.datori.org/?p=7

Remove .BuildServer\system\caches\search directory and restart TeamCity. May be this would help.

The problem is caused by someone or something deleting the Lucene Index in Team city.
Everytime you hit a page after this it will log in stdout that it couldnt find the file.
If you clear out the whole folder which should be
%USERPROFILE%.BuildServer\system\caches\search\
See http://confluence.jetbrains.net/display/TCD5/TeamCity+Data+Directory for more information on where to find the folder.
And restart Teamcity it will recreate the index on start up and stop logging the error message.
Oh and search should start working again too.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

LeaseExpiredException: No lease error on HDFS - hadoop

For my case, another program read,write and delete this tmp file cause this error. Try to avoid this.

I use Sqoop to import into HDFS and have same error. By the help of previous answers I have realized that I needed to remove last "/" from --target-dir /dw/data/ I used --target-dir /dw/data works fine

Related

unable to save rdd on local filesystem on windows 10

Unable to run basic example for hadoop

NativeIO chmod "ENOTDIR" exception in mapreduce

my pig UDF runs in local mode but fails with "Deserialization error: could not instantiate" on my cluster

Configuring Teamcity's logging behaviour

Categories

Resources