How to connect the HBase RegionServers to the master - hadoop

Please tell me how to connect the HBase RegionServers to the master.
I configured 5 region servers, but only 2 of them are working properly.
hbase(main):001:0> status
2 servers, 0 dead, 1.5000 average load
The hostnames of these two servers are sm3-10 and sm3-12, according to http://hbase-master:60010.
But the other servers, such as sm3-8, do not work.
I'd like to know the troubleshooting steps and possible resolutions.
sm3-10: slave, working well
[root@sm3-10 ~]# jps
2581 QuorumPeerMain
2761 SecondaryNameNode
2678 DataNode
19913 Jps
2551 HRegionServer
[root@sm3-10 ~]# lsof -i:54310
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2678 hdfs 52r IPv6 27608 TCP sm3-10:33316->sm3-12:54310 (ESTABLISHED)
[root@sm3-10 ~]# lsof -i:3888
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2581 zookeeper 19u IPv6 7239 TCP *:ciphire-serv (LISTEN)
java 2581 zookeeper 20u IPv6 7242 TCP sm3-10:ciphire-serv->sm3-11:53593 (ESTABLISHED)
java 2581 zookeeper 25u IPv6 27011 TCP sm3-10:ciphire-serv->sm3-12:40352 (ESTABLISHED)
java 2581 zookeeper 29u IPv6 25573 TCP sm3-10:ciphire-serv->sm3-8:44271 (ESTABLISHED)
sm3-8: slave, not working properly; however, its status looks good
[root@sm3-8 ~]# jps
3489 Jps
2249 HRegionServer
2463 DataNode
2297 QuorumPeerMain
2686 SecondaryNameNode
[root@sm3-8 ~]# lsof -i:54310
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2463 hdfs 51u IPv6 9919 TCP sm3-8.nos-seamicro.local:40776->sm3-12:54310 (ESTABLISHED)
[root@sm3-8 ~]# lsof -i:3888
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2297 zookeeper 18u IPv6 5951 TCP *:ciphire-serv (LISTEN)
java 2297 zookeeper 19u IPv6 9839 TCP sm3-8.nos-seamicro.local:52886->sm3-12:ciphire-serv (ESTABLISHED)
java 2297 zookeeper 20u IPv6 5956 TCP sm3-8.nos-seamicro.local:44271->sm3-10:ciphire-serv (ESTABLISHED)
java 2297 zookeeper 24u IPv6 5959 TCP sm3-8.nos-seamicro.local:47922->sm3-11:ciphire-serv (ESTABLISHED)
Master: sm3-12
[root@sm3-12 ~]# jps
2760 QuorumPeerMain
3035 NameNode
3096 SecondaryNameNode
2612 HRegionServer
4330 Jps
2872 DataNode
3723 HMaster
[root@sm3-12 ~]# lsof -i:54310
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 2872 hdfs 51u IPv6 7824 TCP sm3-12:45482->sm3-12:54310 (ESTABLISHED)
java 3035 hdfs 54u IPv6 7783 TCP sm3-12:54310 (LISTEN)
java 3035 hdfs 70u IPv6 7873 TCP sm3-12:54310->sm3-8:40776 (ESTABLISHED)
java 3035 hdfs 71u IPv6 7874 TCP sm3-12:54310->sm3-11:54990 (ESTABLISHED)
java 3035 hdfs 72u IPv6 7875 TCP sm3-12:54310->sm3-10:33316 (ESTABLISHED)
java 3035 hdfs 74u IPv6 7877 TCP sm3-12:54310->sm3-12:45482 (ESTABLISHED)
[root@sm3-12 ~]#
[root@sm3-12 ~]# cat /etc/hbase/conf/hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://sm3-12:54310/hbase</value>
  <final>true</final>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>sm3-8,sm3-10,sm3-11,sm3-12,sm3-13</value>
  <final>true</final>
</property>
--- snip ---
[root@sm3-12 ~]# cat /etc/zookeeper/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
server.1=sm3-10:2888:3888
server.2=sm3-11:2888:3888
server.3=sm3-12:2888:3888
server.4=sm3-8:2888:3888
[root@sm3-12 ~]#
Thanks in advance,
Hiromi

Check that DNS is configured properly on all of the hosts, and that each server can do a reverse lookup.
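As a quick check, something like the following (a sketch; adjust the host list to match your cluster) can be run on each node to verify both forward and reverse resolution through the same files-plus-DNS path the daemons use:
for h in sm3-8 sm3-10 sm3-11 sm3-12 sm3-13; do
  ip=$(getent hosts "$h" | awk 'NR==1{print $1}')    # forward lookup (nsswitch: files + DNS)
  back=""
  [ -n "$ip" ] && back=$(getent hosts "$ip" | awk 'NR==1{print $2}')   # reverse lookup
  echo "$h -> ${ip:-UNRESOLVED} -> ${back:-UNRESOLVED}"
done
Every line should print the hostname, its LAN IP, and the same hostname back; any UNRESOLVED or mismatched entry points at the host whose RegionServer fails to register.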

Related

HDFS NFS startup error: "ERROR mount.MountdBase: Failed to start the TCP server...ChannelException: Failed to bind..."

I am attempting to start HDFS NFS following the docs (ignoring the instructions to stop the rpcbind service, and not starting the hadoop portmap service, given that the OS is neither SLES 11 nor RHEL 6.2), but I am running into an error when starting the NFS service with the hdfs nfs3 command:
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs status
Redirecting to /bin/systemctl status nfs.service
Unit nfs.service could not be found.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs stop
Redirecting to /bin/systemctl stop nfs.service
Failed to stop nfs.service: Unit nfs.service not loaded.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-07-23 13:48:54 HST; 28s ago
Process: 27337 ExecStart=/sbin/rpcbind -w $RPCBIND_ARGS (code=exited, status=0/SUCCESS)
Main PID: 27338 (rpcbind)
CGroup: /system.slice/rpcbind.service
└─27338 /sbin/rpcbind -w
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Starting RPC bind service...
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Started RPC bind service.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# hdfs nfs3
19/07/23 13:49:33 INFO nfs3.Nfs3Base: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting Nfs3
STARTUP_MSG: host = HW02.ucera.local/172.18.4.47
STARTUP_MSG: args = []
STARTUP_MSG: version = 3.1.1.3.1.0.0-78
STARTUP_MSG: classpath = /usr/hdp/3.1.0.0-78/hadoop/conf:/usr/hdp/3.1.0.0-78/hadoop/lib/jersey-server-1.19.jar:/usr/hdp/3.1.0.0-78/hadoop/lib/ranger-hdfs-plugin-shim-1.2.0.3.1.0.0-78.jar:
...
<a bunch of other jars>
...
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r e4f82af51faec922b4804d0232a637422ec29e64; compiled by 'jenkins' on 2018-12-06T12:26Z
STARTUP_MSG: java = 1.8.0_112
************************************************************/
19/07/23 13:49:33 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT]
19/07/23 13:49:33 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Nfs3 metrics system started
19/07/23 13:49:33 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:33 INFO security.ShellBasedIdMapping: Not doing static UID/GID mapping because '/etc/nfs.map' does not exist.
19/07/23 13:49:33 INFO nfs3.WriteManager: Stream timeout is 600000ms.
19/07/23 13:49:33 INFO nfs3.WriteManager: Maximum open streams is 256
19/07/23 13:49:33 INFO nfs3.OpenFileCtxCache: Maximum open streams is 256
19/07/23 13:49:34 INFO nfs3.DFSClientCache: Added export: / FileSystem URI: / with namenodeId: -1408097406
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Configured HDFS superuser is
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Delete current dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Create new dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.Nfs3Base: NFS server port set to: 2049
19/07/23 13:49:34 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:34 INFO mount.RpcProgramMountd: FS:hdfs adding export Path:/ with URI: hdfs://hw01.ucera.local:8020/
19/07/23 13:49:34 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
19/07/23 13:49:34 ERROR mount.MountdBase: Failed to start the TCP server.
org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at org.apache.hadoop.oncrpc.SimpleTcpServer.run(SimpleTcpServer.java:89)
at org.apache.hadoop.mount.MountdBase.startTCPServer(MountdBase.java:83)
at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:98)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:79)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
...
...
19/07/23 13:49:34 INFO util.ExitUtil: Exiting with status 1: org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
19/07/23 13:49:34 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at HW02.ucera.local/172.18.4.47
************************************************************/
I am not sure how to interpret any of the errors seen here (and I have not installed any packages like nfs-utils, assuming that Ambari would have installed all needed packages when the cluster was initially installed).
Any debugging suggestions or solutions for what to do about this?
** UPDATE:
After looking at the error, I can see
Caused by: java.net.BindException: Address already in use
and looking into what is already using it, we see...
[root@HW02 ~]# netstat -ltnp | grep 4242
tcp 0 0 0.0.0.0:4242 0.0.0.0:* LISTEN 98067/jsvc.exec
The process jsvc.exec appears to be related to running Java applications. Given that Hadoop runs on Java, I assume it would be bad to just kill the process. Is it not supposed to be on this port (since it interferes with the NFS gateway)? Not sure what to do about this.
TLDR: The NFS gateway service was already running (by default, apparently), and the process that I thought was blocking the hadoop nfs3 service from starting (jsvc.exec) was (I'm assuming) part of that already-running service.
What made me suspect this was that the service also stopped when I shut down the cluster, plus the fact that it was using the port I needed for NFS. I confirmed it by following the verification steps in the docs and seeing that my output was similar to what should be expected.
[root@HW02 ~]# rpcinfo -p hw02
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 4242 mountd
100005 2 udp 4242 mountd
100005 3 udp 4242 mountd
100005 1 tcp 4242 mountd
100005 2 tcp 4242 mountd
100005 3 tcp 4242 mountd
100003 3 tcp 2049 nfs
[root@HW02 ~]# showmount -e hw02
Export list for hw02:
/ *
Another thing that could have told me that the jsvc process was part of an already-running HDFS NFS service would have been checking the process info...
[root@HW02 ~]# ps -feww | grep jsvc
root 61106 59083 0 14:27 pts/2 00:00:00 grep --color=auto jsvc
root 163179 1 0 12:14 ? 00:00:00 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...
...
hdfs 163193 163179 0 12:14 ? 00:00:17 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...
and seeing jsvc.exec -Dproc_nfs3 ..., which gives the hint that jsvc (which is apparently used for running Java apps on Linux) was running the very nfs3 service I was trying to start.
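Putting those checks together, a compact way to trace a busy port back to its owner (a sketch built from the outputs above; the PID and pid-file path are the ones shown on this host) might look like:
netstat -ltnp | grep 4242                             # find the listener's PID (98067 here)
ps -p 98067 -o pid,user,args                          # full command line; -Dproc_nfs3 marks the HDFS NFS gateway
cat /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid    # pid file named in the jsvc args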
And for anyone else with this problem, note that I did not stop all the services that the docs want you to stop (since I am using CentOS 7):
[root@HW01 /]# service nfs status
Redirecting to /bin/systemctl status nfs.service
● nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
Active: inactive (dead)
[root@HW01 /]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2019-07-19 15:17:02 HST; 6 days ago
Main PID: 2155 (rpcbind)
CGroup: /system.slice/rpcbind.service
└─2155 /sbin/rpcbind -w
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
Also note that I did not follow any of the config file settings recommended in the docs, and some of the properties instructed in the docs could not even be found in the Ambari-managed HDFS configs (so if anyone can explain why this is still working for me despite that, please do).
** Update:
After talking with some people more experienced with using HDP (v3.1) than me, it seems the docs that I linked to for setting up NFS for HDFS may not be totally up to date (when setting up NFS via Ambari management, in any case)...
You can have a cluster node act as an NFS gateway by checking it off as an NFS node in the Ambari host management UI.
The needed configs can be set in the HDFS management UI.
You can confirm that the HDFS NFS gateway is running by looking at the Host > Summary > Components section in Ambari.
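For a command-line sanity check as well, re-running the verification used earlier in this thread (a sketch; it assumes the default mountd and nfs ports) should show the gateway's registrations:
rpcinfo -p "$(hostname)"       # a healthy gateway registers mountd on 4242 and nfs on 2049
showmount -e "$(hostname)"     # and exports the HDFS root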

Hazelcast successfully discovers nodes but unable to connect (OrientDB)

I'm using Hazelcast in EC2 discovery mode to allow OrientDB to run in distributed mode. The nodes are running in the same security group under the same EC2 role. The plugin successfully discovers other nodes but fails to connect over ports 5701-5703. Here is the error message:
2019-05-31 19:44:45:303 INFO [10.4.31.181]:5701 [orientdb] [3.8.4] Could not connect to: /10.4.26.235:5703. Reason: SocketException[Connection timed out to address /10.4.26.235:5703] [InitConnectionTask]
I checked whether any process is listening on those ports on the other nodes (lsof -i -P -n) and discovered these entries:
java 10023 root 91u IPv6 357165 0t0 TCP *:2424 (LISTEN)
java 10023 root 92u IPv6 357166 0t0 TCP *:2480 (LISTEN)
java 10023 root 135u IPv6 357826 0t0 TCP *:5701 (LISTEN)
It seems that all OrientDB listeners are using IPv6, although I never enabled it anywhere (there are no IPv4 listeners). How do I make it listen on IPv4? Here is the only thing I changed in hazelcast.xml after I installed it:
<network>
  <join>
    <multicast enabled="false"/>
    <aws enabled="true">
      <tag-key>Name</tag-key>
      <tag-value>orientdb-test</tag-value>
    </aws>
  </join>
</network>
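One background note that may help frame this (hedged, since it goes beyond what the thread shows): on Linux the JVM opens dual-stack IPv6 sockets by default, which is why lsof reports the listeners as IPv6 even though they normally still accept IPv4 connections. The standard JVM property to force plain IPv4 sockets, assuming your launch script lets you pass Java options, is:
# generic JVM flag, not OrientDB-specific; add it wherever the
# launcher sets Java options
-Djava.net.preferIPv4Stack=true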

hadoop slave cannot connect to master:8031

I'm a total Hadoop newbie. I set up Hadoop on two machines, master and slave, following this tutorial (I obtained the same error following this other tutorial).
Problem: After starting dfs and yarn, the only node appearing on localhost:50070 is the master, even though the right processes are running on the master (NameNode, DataNode, SecondaryNameNode, ResourceManager) and on the slave (DataNode).
The nodemanager log of the slave reports: INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.0.14:8031. Already tried 10 times.
Note that:
I have already edited yarn-site.xml following this thread.
I have disabled the firewall on the master.
I ran netstat -anp | grep 8031 on the master and confirmed that there are a couple of processes listening on port 8031 using tcp.
I suffered from the same problem, and these steps may help.
Make sure your namenodes can negotiate with your datanodes.
hadoop@namenode-01:~$ jps
8678 NameNode
9530 WebAppProxyServer
9115 ResourceManager
8940 SecondaryNameNode
9581 Jps
hadoop@datanode-01:~$ jps
8592 NodeManager
8715 Jps
8415 DataNode
Check your nodemanager-datanode-01.log in $HADOOP_HOME/logs; it reports:
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: namenode-01/192.168.4.2:8031. Already tried 8 time(s);
Check whether your yarn.resourcemanager.resource-tracker.address listens on IPv6 instead of IPv4.
hadoop@namenode-01:~$ netstat -natp
tcp6 0 0 127.0.2.1:8088 :::* LISTEN 9115/java
tcp6 0 0 127.0.2.1:8030 :::* LISTEN 9115/java
tcp6 0 0 127.0.2.1:8031 :::* LISTEN 9115/java
tcp6 0 0 127.0.2.1:8032 :::* LISTEN 9115/java
tcp6 0 0 127.0.2.1:8033 :::* LISTEN 9115/java
If your YARN addresses listen on IPv6, you may need to disable IPv6 first:
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
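Note that sysctl -w only changes the running kernel. A sketch of making the change survive a reboot (the drop-in path is the usual default on modern distros):
# persist the settings across reboots
cat <<'EOF' >> /etc/sysctl.d/99-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
sysctl --system    # reload all sysctl configuration files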
Finally, the default YARN addresses look like this:
yarn.resourcemanager.resource-tracker.address ${yarn.resourcemanager.hostname}:8031
Check your /etc/hosts to avoid misconfigurations.
hadoop@namenode-01:~$ cat /etc/hosts
127.0.2.1 namenode-01 namenode-01 (for example, this line maps the hostname to 127.0.2.1; delete this line)
192.168.4.2 namenode-01
192.168.4.3 datanode-01
192.168.4.4 datanode-02
192.168.4.5 datanode-03
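After editing, a quick check (a sketch; hostnames follow the example above) confirms the name now resolves to the routable address rather than 127.0.2.1:
getent hosts namenode-01       # should print 192.168.4.2, not 127.0.2.1
netstat -natp | grep 8031      # the resource tracker should now bind the routable address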

Find which process is listening on port 8001 on Mac OS X

How can I see which process is listening on port 8001 on Mac OS X?
I have tried several commands:
lsof -i | grep LISTEN
Output:
qbittorre 321 user 26u IPv4 0xc8e6037f28270c31 0t0 TCP *:6881 (LISTEN)
qbittorre 321 user 27u IPv6 0xc8e6037f216348e1 0t0 TCP *:6881 (LISTEN)
mysqld 14131 user 10u IPv4 0xc8e6037f3218da91 0t0 TCP *:mysql (LISTEN)
httpd 14133 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14135 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14136 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14137 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14138 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14139 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14148 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14149 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
httpd 14150 user 16u IPv6 0xc8e6037f216352e1 0t0 TCP *:http (LISTEN)
Skype 14543 user 57u IPv4 0xc8e6037f324f9a91 0t0 TCP *:18666 (LISTEN)
java 24640 user 68u IPv6 0xc8e6037f3295a3e1 0t0 TCP *:http-alt (LISTEN)
java 24640 user 73u IPv6 0xc8e6037f32958fe1 0t0 TCP *:8009 (LISTEN)
java 24640 user 101u IPv6 0xc8e6037f32959ee1 0t0 TCP localhost:8005 (LISTEN)
lsof:
sudo lsof -nPi -sTCP:LISTEN | grep 8001
Nothing found
netstat:
netstat -a | grep 8001
Nothing found
I know that the port is in use by something, because I am trying to change the Emacs simple-httpd httpd-port from the default 8080 to 8001, and it fails:
Warning (initialization): An error occurred while loading `/Users/user/.emacs':
File error: Cannot bind server socket, address already in use
To ensure normal operation, you should investigate and remove the
cause of the error in your initialization file. Start Emacs with
the `--debug-init' option to view a complete error backtrace.
How can I resolve it? I also tried setting the port to 8002 with the same result, and couldn't find which process is listening on port 8002 either.
What can be the source of the problem?
Using nmap, I discovered that port 8001 is reported as the vcom-tunnel service and is closed, and that port 8002 is reported as teradataordbms and is also closed.
What are these services used for? Can I disable them and use their occupied ports?
You can use lsof to detect who is using a port as long as there is active traffic on the connection.
Here is a demonstration:
starting a server on a given port fails with the error Address already in use
lsof doesn't report any listener for that port
Here is the shell log demonstrating this:
python -m SimpleHTTPServer 3333 2>&1 | fgrep error
Output:
socket.error: [Errno 48] Address already in use
sudo lsof -i TCP:3333
echo $?
Output:
1
[1] : starting a web server on port 3333 fails with the error Address already in use
[2] : lsof doesn't report port 3333 being used by anyone
Let's generate traffic in order to force lsof to detect the usage of the port: in another terminal open a telnet connection:
telnet localhost 3333
Now back on your previous terminal, you will see that lsof finds your port:
sudo lsof -n -P -i :3333
Output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telnet 78142 loic 5u IPv4 0x3fa2e8474ece6129 0t0 TCP 127.0.0.1:51855->127.0.0.1:3333 (ESTABLISHED)
There is traffic going on, but according to the OS only one end of the connection is there: the initiator. There still isn't any LISTENER!
Note: in my case the OS is macOS v10.13.3 (High Sierra), but I saw this with previous versions of macOS/OS X too.
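One more detail worth knowing when the netstat check above comes up empty (hedged; this describes stock macOS netstat behavior): without -n, netstat resolves port numbers to service names from /etc/services, so port 8001 is printed as vcom-tunnel (which matches the nmap output in the question) and a grep for "8001" never matches. Numeric output avoids that:
# -a: all sockets; -n: numeric ports (no /etc/services name lookup)
netstat -an | grep 8001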

Hadoop slave cannot connect to master, even when service is running and ports are open

I'm running Hadoop 2.5.1 and I'm having a problem with slaves connecting to the master. My goal is to set up a Hadoop cluster. I hope someone can help; I've been struggling with this for too long already! :)
This is what shows up in the slave's log file:
2014-10-18 22:14:07,368 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: master/192.168.0.104:8020
This is my core-site.xml file (same on master and slave):
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master/</value>
  </property>
</configuration>
This is my hosts file ((almost) the same on master and slave). I have hard-coded the addresses there without any success:
127.0.0.1 localhost
192.168.0.104 xubuntu: xubuntu
192.168.0.104 master
192.168.0.194 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Netstat output from the master:
xubuntu@xubuntu:/usr/local/hadoop/logs$ netstat -atnp | grep 8020
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 192.168.0.104:8020 0.0.0.0:* LISTEN 26917/java
tcp 0 0 192.168.0.104:52114 192.168.0.104:8020 ESTABLISHED 27046/java
tcp 0 0 192.168.0.104:8020 192.168.0.104:52114 ESTABLISHED 26917/java
Nmap from master to master:
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:36 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.000072s latency).
rDNS record for 192.168.0.104: xubuntu:
PORT STATE SERVICE
8020/tcp open unknown
...and nmap from the slave to the master (even though the port is open, the slave doesn't connect to it):
ubuntu@ubuntu:/usr/local/hadoop/logs$ nmap master -p 8020
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:35 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.14s latency).
PORT STATE SERVICE
8020/tcp open unknown
What is this all about? The problem is not the firewall. I have also read every thread there is on this, without any success. I'm getting frustrated. :(
At least one of your problems is that you are using an old configuration name for HDFS. For version 2.5.1 the configuration name should be fs.defaultFS instead of fs.default.name. I also suggest defining the port in the value, so the value would be hdfs://master:8020.
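A sketch of the corrected core-site.xml under that suggestion (mirroring the file shown in the question; apply it on both master and slave, then restart dfs):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
</configuration>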
Sorry, I'm not a Linux guru, so I don't know about nmap, but does telnetting from the slave to the master on that port work?
