Unable to start node manager on hadoop 0.23.0 - hadoop

while installing Hadoop 0.23.0 on the cluster, node manager is unable to start and getting the following error.
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:8080
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:303)
at org.apache.hadoop.mapred.ShuffleHandler.start(ShuffleHandler.java:255)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.start(AuxServices.java:123)
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
... 4 more
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)

It means 8080 is already in use.
so
sudo netstat -nvvpa |grep 8080 and see the o/p.
if it listens to any java, then if possible stop the process. and then try to start the nodemanager again.
This made my problem solved. thankyou.

Related

failed to launch apache.spark.master

Whenever i run start-master.sh command on my local machine i am getting following error please someone help me to fix this issue
Terminal Error
Error which i get in terminal
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark-2.0.1-bin-hadoop2.6/logs/spark-andani-org.apache.spark.deploy.master.Master-1-andani.sakha.com.out
failed to launch org.apache.spark.deploy.master.Master:
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
Log Error
If i check the spark log file following is the error
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'sparkMaster' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkMaster' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
The error is due to your sparkMaster was not able to contact to your internal IP
Once check your /etc/hosts file weather it pointing to proper host name or your previous IP address might have changed.
Reconfigure it and run the command once again.

How install alluxio1.2 on openstack

I try to install alluxio1.2 on a VM centos on openstack with spark and hdfs but the installation doesn't works. Spark and hdfs are already install and work
ERROR logger.type (AlluxioMaster.java:main) - Uncaught exception while running Alluxio master, stopping it and exiting.
java.lang.RuntimeException: java.net.BindException: Address already in use
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at alluxio.web.UIWebServer.startWebServer(UIWebServer.java:164)
at alluxio.master.AlluxioMaster.startServingWebServer(AlluxioMaster.java:467)
at alluxio.master.AlluxioMaster.startServing(AlluxioMaster.java:452)
at alluxio.master.AlluxioMaster.startServing(AlluxioMaster.java:447)
at alluxio.master.AlluxioMaster.start(AlluxioMaster.java:389)
at alluxio.master.AlluxioMaster.main(AlluxioMaster.java:86)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
at alluxio.web.UIWebServer.startWebServer(UIWebServer.java:154)
... 5 more
Are there a special installation to install alluxio on one openstack machine ?
It looks like the Alluxio master's web UI cannot start because the address is already in use. This happens if the port is taken by another process. Alluxio web UI uses the port 19999 for the web UI by default. If you expect another process to be using that port, you can change the Alluxio master web UI port by changing the configuration parameter (http://www.alluxio.org/docs/master/en/Configuration-Settings.html#master-configuration), alluxio.master.web.port, to another port number.

Mapreduce returns error when accessing to datanode on master machine

I set up a Hadoop 2.4.0 cluster with three machines. One master machine is deployed with namenode, resource manager, datanode and node manager. The other two worker machines are deployed with datanode and node manager. When I run Hive query, the work fails and the error is
2014-06-11 13:40:13,364 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From master/127.0.0.1 to
master:43607 failed on connection exception: java.net.ConnectException: Connection >refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:5>7)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImp>l.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
at org.apache.hadoop.ipc.Client.call(Client.java:1381)
... 4 more
if I disable the datanode on master machine, everything works well. I'm wondering if it's allowed to deployed datanode on the master machine. Thank you for your kindly help in advance.
BTW, my /etc/hosts on the three machines are the same:
127.0.0.1 localhost
10.1.154.231 master
10.1.153.220 slave1
10.1.153.133 slave2
Please set up passwordless ssh on your master to itself.
You can achieve this by
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys2
Make sure the permissions are correct
chmod 0600 ~/.ssh/authorized_keys2
In this case you may check if the namenode is started correctly on the master by checking logs at your yourhadoopfolder/logs/hadoop-[hadoop-user]-namenode-master.log
It is often caused by the hdfs is not formatted before. Run
hadoop namenode -format
Of course you will need to put your data to the cluster again.

Hadoop ResourceManager Can Not Start

I got the following error, but netstat shows 8088 is not in use.
This is a 3 node cluster, Namenode, Jobtracker, Datanode running on different EC2 instance
2014-02-04 02:49:43,519 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:262)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:623)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:655)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:872)
Caused by: java.net.BindException: Port in use: jobtracker.hdp-dev.XYZ.com:8088
at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:742)
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:686)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:257)
... 4 more
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:738)
... 6 more
2014-02-04 02:49:43,522 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at name01.hdp-dev.XYZ.com/10.xxx.xxx.xxx
************************************************************/
On a Debian-based system, you can run something like
apt-cache policy zookeeper at the terminal. That command will list all the repositories where the package zookeeper is available.
If the package zookeeper is available from two repositories or more: eg :Ubuntu’s Raring Universe Repository and the CDH repository. So, you have a problem.
Specially understanding that this can be a package mix/match issue
Solution is : Create a file at /etc/apt/preferences.d/cloudera.pref with the following contents:
Package: *
Pin: release o=Cloudera, l=Cloudera
Pin-Priority: 501
No apt-get update is required after creating this file.
Here, the default priority of packages is 500. By creating the file above, you provide a higher priority of 501 to any package that has origin specified as “Cloudera” (o=Cloudera) and is coming from Cloudera’s repo (l=Cloudera), which does the trick..
Hope this helps..

Cloudera Manager failed to start Activity Monitor

I downloaded CDH 4.5 quick start vm from here. Each service looks good except for below error was seen after I opened Activities tab to view mapreduce activities:
The Activity Monitor server (activitymonitor (localhost)) is unavailable or not responding to connections.
The problem remains after I tried to restart Activity Monitor service, then I found following error message in the log. Can anybody help take a look?
11:24:35.862 PM WARN org.mortbay.log
failed SelectChannelConnector#localhost.localdomain:9999: java.net.BindException: Address already in use
11:24:35.864 PM WARN org.mortbay.log
failed Server#59cc2f42: java.net.BindException: Address already in use
11:24:35.869 PM ERROR com.cloudera.cmon.firehose.Main
Failed to start Firehose
com.cloudera.enterprise.EnterpriseServiceException: java.net.BindException: Address already in use
at com.cloudera.cmon.firehose.AgentMessageService.startService(AgentMessageService.java:144)
at com.cloudera.enterprise.EnterpriseService.start(EnterpriseService.java:71)
at com.cloudera.enterprise.EnterpriseService.start(EnterpriseService.java:68)
at com.cloudera.cmon.firehose.Main.main(Main.java:371)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.mortbay.jetty.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:315)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.Server.doStart(Server.java:235)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at com.cloudera.cmon.firehose.AgentMessageService.startService(AgentMessageService.java:142)
... 3 more
"java.net.BindException: Address already in use" clearly states that port 9999 is already occupied by some other service. You have to check the PID of the service and stop it:
lsof -P | grep LISTEN | grep 9999
use the PID to kill or if you know the service then stop gracefully

Resources