Failed to set permissions of path: \tmp - hadoop

Failed to set permissions of path: \tmp\hadoop-MayPayne\mapred\staging\MayPayne2016979439\.staging to 0700
I'm getting this error when the MapReduce job executes. I was using Hadoop 1.0.4; I then learned it's a known issue and tried 1.2.0, but the issue still exists. Can anyone tell me a Hadoop version in which this issue has been resolved?
Thank you all in advance

I was getting the same exception while running nutch-1.7 on Windows 7.
bin/nutch crawl urls -dir crawl11 -depth 1 -topN 5
The following steps worked for me:
Download the pre-built JAR, patch-hadoop_7682-1.0.x-win.jar, from the Download section; the steps for plain Hadoop can be found there as well.
Copy patch-hadoop_7682-1.0.x-win.jar to the ${NUTCH_HOME}/lib directory
Modify ${NUTCH_HOME}/conf/nutch-site.xml to enable the overridden implementation as shown below:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.file.impl</name>
<value>com.conga.services.hadoop.patch.HADOOP_7682.WinLocalFileSystem</value>
<description>Enables patch for issue HADOOP-7682 on Windows</description>
</property>
</configuration>
Run your job as usual (using Cygwin).
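For completeness, a rough sketch of that sequence from a Cygwin shell could look like the following; the NUTCH_HOME path and the download location are assumptions, so adjust them to your setup.
# Assumes the patch jar has already been downloaded to the current directory
# and that NUTCH_HOME points at the Nutch 1.7 installation (hypothetical path).
export NUTCH_HOME=/home/me/apache-nutch-1.7
cp patch-hadoop_7682-1.0.x-win.jar "$NUTCH_HOME/lib/"
# After adding the fs.file.impl property to conf/nutch-site.xml (see above),
# run the crawl as usual:
cd "$NUTCH_HOME"
bin/nutch crawl urls -dir crawl11 -depth 1 -topN 5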

Downloading hadoop-core-0.20.2.jar and putting it in Nutch's lib directory resolved the problem for me.
(On Windows) If it's still not solved for you, try using this Hadoop patch.

Set the VM argument below to override the default /tmp directory:
-Dhadoop.tmp.dir=<a directory location with write permission>
Also, using hadoop-core-0.20.2.jar (http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/0.20.2) will solve the reported issue.
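For example, when launching the job driver directly with java (say, from Eclipse or a script), the property can be passed as a JVM argument; the path, classpath, and driver class below are placeholders, not from the original answer.
# Hypothetical example: point hadoop.tmp.dir at a writable directory
# instead of the default /tmp location.
java -Dhadoop.tmp.dir=/c/hadoop/tmp \
     -cp "myjob.jar:lib/*" \
     com.example.MyJobDriver input output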

I managed to solve this by changing the hadoop-core jar file a little. I changed the error-causing method in FileUtil.java in the hadoop-core jar, recompiled it, and included it in my Eclipse project. Now the error is gone. I suggest you do the same.

Related

no mapred-site.xml.template file in 3.0.0

I am in the process of installing a pseudo-distributed, single-node Hadoop cluster on my Windows laptop using Oracle VirtualBox 5.1 and Ubuntu. I have already downloaded version 3.0.0 from the mirror site. I am trying to create the mapred-site.xml file by typing the command
sudo cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
The mapred-site.xml.template file is not in the directory /usr/share/hadoop/etc/hadoop
Is the mapred-site.xml.template file not included in this release?
I have already searched Stack Overflow and Googled this issue with no success.
There is no .template file in Hadoop 3.0.0; there should already be a mapred-site.xml file in Hadoop 3.0.0. If the file is not there for some reason, you can create an XML file that references configuration.xsl:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
Then you fill out your configuration element.
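If you do need to create it by hand, a minimal sketch (assuming $HADOOP_HOME points at your installation) is to write the XML declaration, the stylesheet reference, and an empty configuration element, then add properties as needed:
# Create a bare mapred-site.xml; add <property> entries inside
# <configuration> afterwards as your setup requires.
cat > "$HADOOP_HOME/etc/hadoop/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
</configuration>
EOF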
Try Hadoop 2.7.5 if you want to cp that file.

Hadoop 2.4 installation for mac: file configuration

I am new to Hadoop. I am trying to set up Hadoop 2.4 on a MacBook Pro using Homebrew. I have been following the instructions on this website (http://shayanmasood.com/blog/how-to-setup-hadoop-on-mac-os-x-10-9-mavericks/). I have installed Hadoop on my machine, and now I am trying to configure it.
One needs to configure the following files according to the website.
mapred-site.xml
hdfs-site.xml
core-site.xml
hadoop-env.sh
But it seems that this information is a bit old. In Terminal, I see the following:
In Hadoop's config file:
/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/hadoop-env.sh,
/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/mapred-env.sh and
/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/yarn-env.sh
$JAVA_HOME has been set to be the output of:
/usr/libexec/java_home
It seems that I have three files to configure here. Am I on the right track? There is information for configuring hadoop-env.sh and mapred-env.sh, but I have not seen any for yarn-env.sh. What do I have to do with this file?
The other question is how I access these files for modification. I receive the following message in Terminal right now:
-bash: /usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/hadoop-env.sh: Permission denied
If you have any suggestions, please let me know. Thank you very much for taking your time.
You can find the configuration files under:
/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop
As for the permissions on the scripts suggested by brew, you also need to change their mode.
In the scripts directory (/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/), run:
sudo chmod +x *.sh
You should be checking in the hadoop/conf/ folder to amend the files below:
mapred-site.xml, hdfs-site.xml, core-site.xml
And you can change the permissions of hadoop-env.sh to make changes to that file.
Make sure that your session is in SSH. Then use the start-all.sh command to start Hadoop.
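As a rough sketch of those steps on the Homebrew layout quoted above (the 2.4.0 paths come from the question; the JAVA_HOME line and the start script are assumptions for illustration):
cd /usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop
sudo chmod +x *.sh
# Set JAVA_HOME inside hadoop-env.sh to the JDK reported by java_home;
# sudo tee is used in case the file is not writable by your user.
echo "export JAVA_HOME=\$(/usr/libexec/java_home)" | sudo tee -a hadoop-env.sh
# Then start the daemons from the sbin directory.
/usr/local/Cellar/hadoop/2.4.0/libexec/sbin/start-all.sh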

Hadoop release missing /conf directory

I am trying to install a single node setup of Hadoop on Ubuntu.
I started following the instructions on the Hadoop 2.3 docs.
But I seem to be missing something very simple.
First, it says to
To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.
Then,
Unpack the downloaded Hadoop distribution. In the distribution, edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your Java installation.
However, I can't seem to find the conf directory.
I downloaded a 2.3 release from one of the mirrors and unpacked the tarball; an ls of the contents returns:
$ ls
bin etc include lib libexec LICENSE.txt NOTICE.txt README.txt sbin share
I was able to find the file they were referencing, just not in a conf directory:
$ find . -name hadoop-env.sh
./etc/hadoop/hadoop-env.sh
Am I missing something, or am I grabbing the wrong package? Or are the docs just outdated?
If so, does anyone know where some more up-to-date docs are?
I am trying to install Hadoop in pseudo-distributed mode and am running into the same issue.
The book Hadoop: The Definitive Guide (Third Edition) says on page 618:
In Hadoop 2.0 and later, MapReduce runs on YARN and there is an additional configuration file called yarn-site.xml. All the configuration files should go in the etc/hadoop subdirectory.
Hope this confirms that etc/hadoop is the correct place.
I think the docs need to be updated. Although the directory structure has changed, file names for important files like hadoop-env.sh, core-site.xml and hdfs-site.xml have not changed. You may find the following link useful for getting started.
http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html
In Hadoop 1:
${HADOOP_HOME}/conf/
In Hadoop 2:
${HADOOP_HOME}/etc/hadoop
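So for a Hadoop 2.x binary distribution, the edits the docs describe for conf/ are done under etc/hadoop instead; a small sketch (the OpenJDK path is just an illustrative assumption):
# Locate and edit the config files in the Hadoop 2.x layout.
cd "$HADOOP_HOME/etc/hadoop"
ls hadoop-env.sh core-site.xml hdfs-site.xml
# Define at least JAVA_HOME, as the docs require.
echo 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64' >> hadoop-env.sh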
In the Hadoop 2.7.3 source tree, the file is in hadoop-common/src/main/conf/:
$ sudo find . -name hadoop-env.sh
./hadoop-2.7.3-src/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
Just adding a note on the blog post http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html. The blog post is fantastic and very useful; that's how I got started. One aspect that took me a little time to figure out is that this blog seems to use a simplified way of providing configuration in the Hadoop conf files, such as conf/core-site.xml, hdfs-site.xml, etc., as follows:
<!--fs.default.name is the name node URI -->
<configuration>
fs.default.name
hdfs://localhost:9000
</configuration>
As per the official docs, there is a more rigorous way (useful when you have more than one property), which is to add it as follows (please note, the description is optional :-) ):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>the name node URI</description>
</property>
<!--Add more configuration properties here -->
</configuration>
The conf directory for Hadoop's (2022) version 3.3.1 is located in the src/main directory of the source tree:
$HOME/hadoop/hadoop3.3/hadoop-common-project/hadoop-common/src/main/

Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses - submitting job to remote cluster

I recently upgraded my cluster from Apache Hadoop 1.0 to CDH 4.4.0. I have a WebLogic server on another machine from which I submit jobs to this remote cluster via the MapReduce client. I still want to use MR1 and not YARN. I have compiled my client code against the client jars in the CDH installation (/usr/lib/hadoop/client/*).
I am getting the error below when creating a JobClient instance. There are many posts related to the same issue, but all the solutions refer to the scenario of submitting the job to a local cluster rather than a remote one, and specifically not, as in my case, from a WLS container.
JobClient jc = new JobClient(conf);
Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
But running from the command prompt on the cluster works perfectly fine.
Appreciate your timely help!
I had a similar error; adding the following jars to the classpath worked for me:
hadoop-mapreduce-client-jobclient-2.2.0.2.0.6.0-76:hadoop-mapreduce-client-shuffle-2.3.0.jar:hadoop-mapreduce-client-common-2.3.0.jar
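If you launch the client with plain java, that roughly amounts to putting those jars on the classpath alongside your own code; the jar locations and main class below are placeholders, not from the original answer.
# Hypothetical sketch: exact jar versions and paths depend on your install.
CP="myclient.jar"
CP="$CP:hadoop-mapreduce-client-jobclient-2.2.0.2.0.6.0-76.jar"
CP="$CP:hadoop-mapreduce-client-shuffle-2.3.0.jar"
CP="$CP:hadoop-mapreduce-client-common-2.3.0.jar"
java -cp "$CP" com.example.SubmitJob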
It's likely that your app is looking at your old Hadoop 1.x configuration files. Maybe your app hard-codes some config? This error tends to indicate you are using the new client libraries but that they are not seeing new-style configuration.
It must exist, since the command-line tools see them fine. Check your HADOOP_HOME or HADOOP_CONF_DIR env variables too, although those are what the command-line tools tend to pick up, and they work.
Note that you need to install the 'mapreduce' service and not 'yarn' in CDH 4.4 to make it compatible with MR1 clients. See also the '...-mr1-...' artifacts in Maven.
In my case, this error was due to the version of the jars; make sure that you are using the same version as on the server.
export HADOOP_MAPRED_HOME=/cloudera/parcels/CDH-4.1.3-1.cdh4.1.3.p0.23/lib/hadoop-0.20-mapreduce
In my case, I was running Sqoop 1.4.5 and pointing it to the latest hadoop 2.0.0-cdh4.4.0, which had the YARN stuff as well; that's why it was complaining.
When I pointed Sqoop to hadoop-0.20/2.0.0-cdh4.4.0 (MR1, I think) it worked.
As with Akshay (comment by Setob_b), all I needed to fix was to get hadoop-mapreduce-client-shuffle-.jar on my classpath.
As follows for Maven:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-shuffle</artifactId>
<version>${hadoop.version}</version>
</dependency>
In my case, strangely, this error was because in my core-site.xml file I had used the IP address rather than the hostname.
The moment I used the hostname in place of the IP address in core-site.xml and mapred.xml and re-installed the MapReduce lib files, the error was resolved.
In my case, I resolved this by using hadoop jar instead of java -jar.
It's useful: hadoop provides the configuration context from hdfs-site.xml, core-site.xml, etc.
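In other words, the difference is roughly the following; the jar and class names are placeholders.
# Plain java: only what you put on the classpath is visible, so the
# cluster configuration files are easily missed.
java -jar myjob.jar
# The hadoop wrapper: it puts the Hadoop client jars and the files under
# $HADOOP_CONF_DIR (core-site.xml, hdfs-site.xml, ...) on the classpath
# before invoking your main class.
hadoop jar myjob.jar com.example.MyJobDriver input output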

Mkdirs failed to create hadoop.tmp.dir

I have upgraded from Apache Hadoop 0.20.2 to the newest stable release, 0.20.203. While doing that, I also updated all the configuration files properly. However, I am getting the following error while trying to run a job via a JAR file:
$ hadoop jar myjar.jar
Mkdirs failed to create /mnt/mydisk/hadoop/tmp
where /mnt/mydisk/hadoop/tmp is the location of hadoop.tmp.dir as stated in the core-site.xml:
..
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/mydisk/hadoop/tmp</value>
</property>
..
I've already checked that the directory exists and that the permissions for the user hadoop are set correctly. I've also tried deleting the directory so that Hadoop itself can create it, but that didn't help.
Executing a Hadoop job with Hadoop version 0.20.2 worked out of the box, but something broke after the update. Can someone help me track down the problem?
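For reference, a quick sanity check of that directory from the shell might look like this (assuming the job runs as the user hadoop; adjust the path and owner to your setup):
# Verify that the directory exists, is owned by the hadoop user,
# and is writable by it.
ls -ld /mnt/mydisk/hadoop/tmp
sudo -u hadoop touch /mnt/mydisk/hadoop/tmp/.write-test && echo writable
# If it is missing, recreate it with the expected owner and mode.
sudo mkdir -p /mnt/mydisk/hadoop/tmp
sudo chown -R hadoop:hadoop /mnt/mydisk/hadoop/tmp
sudo chmod 755 /mnt/mydisk/hadoop/tmp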
