(Solved) I want to connect to a Hadoop cluster and retrieve some job/task information.
In Hadoop 1, I was able to use JobClient (local pseudo-distributed mode, run from Eclipse):
JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",9001),new JobConf(config));
JobID job_id = JobID.forName("job_xxxxxx");
RunningJob job = jobClient.getJob(job_id);
.....
Today I set up a pseudo-distributed Hadoop 2 YARN cluster, but the above code no longer works. I use the ResourceManager port (8032):
JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",8032),new JobConf(config));
This line throws an exception:
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
I searched for this exception, but none of the solutions I found worked. I use Eclipse and have added all the Hadoop jars, including hadoop-mapreduce-client-xxx. I can also run the example programs on my cluster successfully.
Any suggestions on how to use JobClient on Hadoop 2 YARN?
Update: I solved this by compiling against the same Hadoop libraries as the ResourceManager server. Inside Eclipse the exception still occurs, but once I compiled and deployed my project it worked fine (I am not sure why, since in Hadoop 1 it worked from Eclipse). There is no need to change the API; JobClient still works in Hadoop 2.
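Since the fix above came down to which Hadoop jars were visible at run time, a small classpath probe can help compare the Eclipse launch with the deployed build. This is only a sketch: the two provider class names are an assumption about what the Hadoop 2.x client jars contain, so verify them against the jars of your own distribution. The working theory (also an assumption) is that "Cannot initialize Cluster" appears when no client protocol provider on the classpath accepts the configured mapreduce.framework.name.

```java
// Diagnostic sketch: report whether the MapReduce client provider classes are loadable.
final class ProviderClasspathCheck {
    // Assumed class names from the Hadoop 2.x client jars; check your own distribution.
    static final String YARN_PROVIDER = "org.apache.hadoop.mapred.YarnClientProtocolProvider";
    static final String LOCAL_PROVIDER = "org.apache.hadoop.mapred.LocalClientProtocolProvider";

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ProviderClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("YARN provider present:  " + isOnClasspath(YARN_PROVIDER));
        System.out.println("Local provider present: " + isOnClasspath(LOCAL_PROVIDER));
    }
}
```

Running it once from Eclipse and once with the deployed classpath should show whether the two environments actually see the same client jars.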
Have you configured the mapred-site.xml file as follows? In Hadoop 2.x it is located in $HADOOP_HOME/etc/hadoop/.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit: Also make sure that your yarn-site.xml (same location) contains the following property:
<property>
<name>yarn.resourcemanager.address</name>
<value>host:port</value>
</property>
One last thing: I strongly advise you to work with hostnames instead of IPs. There are known cases of failure with hadoop when IPs are set in the configuration files.
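To confirm that the files a client actually reads contain the two properties above, something like the following sketch could be used. It relies only on the JDK's XML parser, not on Hadoop's own Configuration class; the class and method names are made up for illustration, and the check on the address is a loose sanity test (it only looks for a colon), not real validation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reads <property><name>/<value> pairs from a *-site.xml file.
final class SiteXmlCheck {
    static Map<String, String> readProperties(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new HashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    result.put(name, childText(property, "value"));
                }
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a readable site file", e);
        }
    }

    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Loose check: framework is "yarn" and the ResourceManager address looks like host:port. */
    static boolean looksLikeYarnClientConfig(Map<String, String> mapred, Map<String, String> yarn) {
        String address = yarn.get("yarn.resourcemanager.address");
        return "yarn".equals(mapred.get("mapreduce.framework.name"))
                && address != null && address.contains(":");
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SiteXmlCheck <path to mapred-site.xml> <path to yarn-site.xml>
        try (InputStream mapred = Files.newInputStream(Paths.get(args[0]));
             InputStream yarn = Files.newInputStream(Paths.get(args[1]))) {
            System.out.println(looksLikeYarnClientConfig(readProperties(mapred), readProperties(yarn)));
        }
    }
}
```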
Node1: Hadoop 2.5.2, Red Hat Linux el6, 64-bit
I built the 64-bit native library and it works.
Node2: Hadoop 2.5.2, Red Hat Linux el5, 32-bit
I built the 32-bit native library and it works.
Running a MapReduce job on a single node works (with compression).
Running it on multiple nodes also works (without compression),
but running it on multiple nodes with compression does not work.
The map tasks finish on only one of the nodes (sometimes node1, sometimes node2); on the other node they fail with the error below, and the job fails.
Error: java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1535)
    at . .
Caused by: java.lang.RuntimeException: native lz4 library not available
    at org.apache.hadoop.io.compress.Lz4Codec.getCompressorType(Lz4Codec.java:124)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
    at
I tried setting
<name>mapreduce.admin.user.env</name>
<value>LD_LIBRARY_PATH=$HADOOP_HOME/lib/native</value>
in mapred-site.xml,
but it still does not work.
Please suggest a solution.
Adding these properties to mapred-site.xml on the Hadoop node from which the job is submitted solved the problem.
<property>
<name>yarn.app.mapreduce.am.admin.user.env</name>
<value>LD_LIBRARY_PATH={{HADOOP_COMMON_HOME}}/lib/native</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>LD_LIBRARY_PATH={{HADOOP_COMMON_HOME}}/lib/native</value>
</property>
<property>
<name>mapreduce.admin.user.env</name>
<value>LD_LIBRARY_PATH={{HADOOP_COMMON_HOME}}/lib/native</value>
</property>
Enable debug logging for Hadoop on the machine where the exception is thrown.
Restart the Hadoop processes; after that, the NativeCodeLoader log messages should tell you why the native library is not being loaded.
You can use the command below to verify whether the native libraries are loaded:
hadoop checknative -a
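As a complement to the command above, the sketch below lists which directories on a library search path actually contain the native Hadoop library file. It assumes the LZ4 support is provided by that library, which is the case for the Hadoop 2.x builds I am aware of but worth confirming for yours. It only checks that the file is present; it says nothing about whether a 32-bit or 64-bit build matches the JVM on that node, which matters for the mixed setup described in the question.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: find search-path entries that contain a given library file.
final class NativeLibLocator {
    /** Returns every directory in a path list (':' or ';' separated) that contains fileName. */
    static List<Path> directoriesContaining(String pathList, String fileName) {
        List<Path> hits = new ArrayList<>();
        if (pathList == null) {
            return hits;
        }
        for (String dir : pathList.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            if (Files.isRegularFile(Paths.get(dir, fileName))) {
                hits.add(Paths.get(dir));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // mapLibraryName turns "hadoop" into the platform file name, e.g. libhadoop.so on Linux.
        String fileName = System.mapLibraryName("hadoop");
        System.out.println("java.library.path: "
                + directoriesContaining(System.getProperty("java.library.path"), fileName));
        System.out.println("LD_LIBRARY_PATH:   "
                + directoriesContaining(System.getenv("LD_LIBRARY_PATH"), fileName));
    }
}
```

Running this on each node with the same environment the task containers get may show which node is missing the library on its search path.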
There is an HA solution for the Hadoop + HBase stack on Hadoop 1, but I can't find any mention of such a solution for Hadoop 2.
Hadoop 2 has NameNode high availability, but you still need to set a hostname in the Hadoop setup, so if the active NameNode goes down, HBase is left blind.
What solutions can you suggest for making HBase resilient to NameNode failures?
You need to configure a nameservice and refer to it instead of a specific IP address.
For example, here "mycluster" is the nameservice name:
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
Then configure the NameNodes for HA:
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
In hbase-site.xml you can likewise use the "mycluster" nameservice to refer to the cluster.
For more details, please refer here.
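To make the naming scheme concrete, here is a sketch that derives the per-NameNode RPC address keys from the two properties above and builds an HBase root directory URI that names the nameservice instead of a host. The helper names are made up for illustration, and the /hbase path is an assumption; the key pattern dfs.namenode.rpc-address.<nameservice>.<namenode id> is the one used by HDFS HA.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating how nameservice-based keys and URIs are composed.
final class NameserviceKeys {
    /** Lists the RPC-address keys expected for each NameNode id of one nameservice. */
    static List<String> rpcAddressKeys(Map<String, String> conf, String nameservice) {
        List<String> keys = new ArrayList<>();
        String ids = conf.get("dfs.ha.namenodes." + nameservice);
        if (ids == null) {
            return keys;
        }
        for (String id : ids.split(",")) {
            String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                keys.add("dfs.namenode.rpc-address." + nameservice + "." + trimmed);
            }
        }
        return keys;
    }

    /** Builds an HDFS URI that names the nameservice rather than a single NameNode host. */
    static String hbaseRootDir(String nameservice, String path) {
        return "hdfs://" + nameservice + path;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("dfs.nameservices", "mycluster");
        conf.put("dfs.ha.namenodes.mycluster", "nn1,nn2");
        System.out.println(rpcAddressKeys(conf, "mycluster"));
        System.out.println("hbase.rootdir = " + hbaseRootDir("mycluster", "/hbase")); // path is an assumption
    }
}
```

Each listed key still needs a host:port value in hdfs-site.xml, and the HBase nodes need to see that HDFS client configuration so that the nameservice name can be resolved.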
I'm trying to use Oozie from Java to start a job on a Hadoop cluster. I have very limited experience with Oozie on Hadoop 1 and now I'm struggling trying out the same thing on YARN.
I'm given a machine that doesn't belong to the cluster, so when I try to start my job I get the following exception:
E0501 : E0501: Could not perform authorization operation, User: oozie is not allowed to impersonate hadoop
Why is that, and what should I do?
I read a bit about core-site.xml properties that need to be set:
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>users</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>master</value>
</property>
Does it seem that this is the problem? Should I contact people responsible for cluster to fix that?
Could there be problems because I'm using same code for YARN as I did for Hadoop 1? Should something be changed? For example, I'm setting nameNode and jobTracker in workflow.xml, should jobTracker exist, since there is now ResourceManager? I have set the address of ResourceManager, but left the property name as jobTracker, could that be the error?
Maybe I should also mention that Ambari is used...
Please update core-site.xml:
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
Using the ResourceManager address as the jobTracker value is not the problem. Once you update the core-site.xml file, it will work.
Reason:
This type of error occurs when you run the Oozie server as the hadoop user but define oozie as the proxy user in core-site.xml.
Solution:
Change the ownership of the Oozie installation directory to the oozie user and run the Oozie server as the oozie user; that will solve the problem.
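To reason about which proxyuser entries matter, a deliberately simplified model of the check may help. The assumption here is that for "User: X is not allowed to impersonate Y", Hadoop consults hadoop.proxyuser.X.hosts and hadoop.proxyuser.X.groups, requiring the requesting host to be listed (or "*") and Y to belong to a listed group (or "*"). Real Hadoop also resolves host names to addresses and looks up group membership on the server side, which this sketch does not do; all names in it are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of the proxy-user check; not Hadoop's actual implementation.
final class ProxyUserCheck {
    /** True if a comma-separated config value is "*" or names one of the actual values. */
    static boolean allows(String configured, Collection<String> actual) {
        if (configured == null) {
            return false; // no entry means no permission in this model
        }
        for (String item : configured.split(",")) {
            String trimmed = item.trim();
            if (trimmed.equals("*") || actual.contains(trimmed)) {
                return true;
            }
        }
        return false;
    }

    /** Both the host rule and the group rule of the proxy user must allow the request. */
    static boolean mayImpersonate(Map<String, String> coreSite, String proxyUser,
                                  String requestHost, Set<String> targetUserGroups) {
        String prefix = "hadoop.proxyuser." + proxyUser;
        return allows(coreSite.get(prefix + ".hosts"), Collections.singleton(requestHost))
                && allows(coreSite.get(prefix + ".groups"), targetUserGroups);
    }

    public static void main(String[] args) {
        Map<String, String> coreSite = new HashMap<>();
        coreSite.put("hadoop.proxyuser.oozie.hosts", "master");
        coreSite.put("hadoop.proxyuser.oozie.groups", "users");
        // Hypothetical: the groups the "hadoop" user belongs to on the cluster.
        Set<String> groupsOfTargetUser = new HashSet<>(Arrays.asList("hadoop"));
        System.out.println(mayImpersonate(coreSite, "oozie", "master", groupsOfTargetUser));
    }
}
```

Under this model, the values from the question would only pass if the request comes from the host named master and the impersonated user is a member of the users group; which account the Oozie server runs as decides which proxyuser prefix is consulted.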
I am trying to implement Kerberos authentication. I am using Hadoop 2.3 on CDH 5.0.1. I have made the following changes:
Added the following properties to core-site.xml:
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
After restarting the daemons, when I issue the hadoop fs -ls / command, I get the following error:
ls: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "cldx-xxxx-xxxx/xxx.xx.xx.xx"; destination host is: "cldx-xxxx-xxxx":8020;
Please help me out.
Thanks in advance,
Ankita Singla
There is a lot more to configuring a secure HDFS cluster than just setting hadoop.security.authentication to kerberos. See Configuring Hadoop Security in CDH 5 for the required configuration settings. You'll need to create the appropriate keytab files. Only after you have configured everything and confirmed that none of the Hadoop services report errors in their respective logs (NameNode, DataNode on all hosts, ResourceManager, NodeManager on all nodes, etc.) can you attempt to connect.
I am new to Hadoop and am running it on a single server node. I have set up the Hadoop environment and configured the core-site.xml file in Hadoop's conf folder as follows:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/oracle/Hadoop/hadoop_temp_files</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
After applying this configuration I formatted the NameNode and started the daemons.
All the daemons started as expected, but no hadoop_temp_files directory was created inside the Hadoop directory. What could be the problem?
I am logged in to the server remotely.
The oracle user group that I am logged in under is not in sudoers and has no admin rights. Could this be the reason the hadoop_temp_files directory is not being created?
Also, all the daemons appeared to start, but when I stopped them with stop-all.sh the output said:
There's no tasktracker to stop and
there's no secondarynode to stop
Please help me clarify my problem.
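One thing worth ruling out is the permission question raised above: whether the account that runs the daemons can create the configured hadoop.tmp.dir without sudo. The following sketch walks up from the target path to the nearest directory that exists and checks whether the current user can write there. The class is hypothetical, and it should be run as the same user that starts Hadoop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: can the current user create the given directory?
final class TmpDirCheck {
    /** Walks up from dir until an existing path is found; null if none exists. */
    static Path nearestExistingAncestor(Path dir) {
        Path current = dir.toAbsolutePath().normalize();
        while (current != null && !Files.exists(current)) {
            current = current.getParent();
        }
        return current;
    }

    /** True if the nearest existing ancestor is a directory the current user can write to. */
    static boolean canBeCreated(Path dir) {
        Path ancestor = nearestExistingAncestor(dir);
        return ancestor != null && Files.isDirectory(ancestor) && Files.isWritable(ancestor);
    }

    public static void main(String[] args) {
        // Default is the hadoop.tmp.dir value from the question; pass another path as the first argument.
        Path tmpDir = Paths.get(args.length > 0 ? args[0] : "/home/oracle/Hadoop/hadoop_temp_files");
        System.out.println("nearest existing ancestor: " + nearestExistingAncestor(tmpDir));
        System.out.println("creatable by current user: " + canBeCreated(tmpDir));
    }
}
```

If the directory turns out to be creatable, the daemon logs are the next place to look, since stop-all.sh reporting nothing to stop suggests those daemons exited after start-up.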