Running HBase in standalone mode but getting Hadoop "retrying connect to server" messages?

I'm trying to run HBase in standalone mode following this tutorial:
http://hbase.apache.org/book.html#quickstart
I get the following exception when I try to run
create 'test', 'cf'
in the HBase shell
ERROR: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
I've seen questions here regarding this error, but the solutions haven't worked for me.
What is perhaps more troubling, and what may be at the heart of the matter, is that when I stop HBase, I get the following message over and over in the log:
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 192.168.200.1/192.168.200.1:54310. Already tried <n> time(s)
I don't know what server it's trying to connect to (that's not my computer's IP address), and as I said, I'm trying to run HBase in standalone mode.
I would really appreciate it if someone could help me understand this log output.
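One way to track down where that address comes from is to search the HBase and Hadoop configuration directories for it. This is only a sketch; it assumes HBASE_HOME and HADOOP_HOME point at the actual installs, and the exact config subdirectories vary by version:
# Look for the stray NameNode address/port in any config file on the machine
grep -R -e "192.168.200.1" -e "54310" "$HBASE_HOME/conf" "$HADOOP_HOME/conf" "$HADOOP_HOME/etc/hadoop" 2>/dev/null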
My /etc/hosts file:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 j.gloves
ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
options=3<RXCSUM,TXCSUM>
inet6 ::1 prefixlen 128
inet 127.0.0.1 netmask 0xff000000
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
nd6 options=1<PERFORMNUD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=10b<RXCSUM,TXCSUM,VLAN_HWTAGGING,AV>
ether 10:9a:dd:60:de:3d
nd6 options=1<PERFORMNUD>
media: autoselect (none)
status: inactive
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
lladdr 70:cd:60:ff:fe:4c:07:7a
nd6 options=1<PERFORMNUD>
media: autoselect <full-duplex>
status: inactive
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 10:9a:dd:b6:b4:7d
inet6 fe80::129a:ddff:feb6:b47d%en1 prefixlen 64 scopeid 0x6
inet 192.168.1.161 netmask 0xffffff00 broadcast 192.168.1.255
nd6 options=1<PERFORMNUD>
media: autoselect
status: active
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
ether 02:9a:dd:b6:b4:7d
media: autoselect
status: inactive
My hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///Users/j.gloves/trynutch/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/j.gloves/trynutch/zookeeper</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>false</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
</configuration>

Thank you to everyone who offered help in the comments.
My boss was able to fix the problem. It turned out there was an older version of Hadoop on my machine that was referencing an old IP address. Once it was removed from my PATH and from the machine, HBase worked as expected.
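For anyone hitting the same symptom, a quick way to check whether a stale Hadoop install is shadowing a standalone HBase setup is to look at what the shell actually resolves. A sketch; jps ships with the JDK:
# Which hadoop binaries are on the PATH, and which version wins?
which -a hadoop
hadoop version
echo $HADOOP_HOME
# In standalone mode only HMaster (plus jps itself) should show up here;
# a stray NameNode/DataNode points at a leftover Hadoop installation.
jps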

Related

Cannot determine ethernet address for proxy ARP (CentOS PPTP VPN)

I've installed pptpd on CentOS 7 on AWS EC2, and I can connect to the VPN with a Windows client, but I have no internet access, while the server has full internet access. In the pptpd log I noticed the error "Cannot determine ethernet address for proxy ARP".
I've changed the DNS in /etc/ppp/options.pptpd as below:
ms-dns 8.8.8.8
ms-dns 8.8.4.4
I've also created users in /etc/ppp/chap-secrets, and clients can connect without problems (but with no internet access).
I've also enabled IP forwarding in /etc/sysctl.conf
net.ipv4.ip_forward = 1
and executed this command:
sudo sysctl -p
I changed local and remote IPs in /etc/pptpd.conf as below:
localip 192.168.10.1
remoteip 192.168.20.10-100
I configured the firewall for IP masquerading:
sudo iptables -t nat -A POSTROUTING -o ens5 -j MASQUERADE
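As a sanity check, the forwarding and NAT settings above can be verified on the server before digging further; a minimal sketch, using the same interface (ens5) as above:
# Confirm IP forwarding actually took effect
sysctl net.ipv4.ip_forward
# Confirm the MASQUERADE rule is installed and its packet counters move while a client is connected
sudo iptables -t nat -L POSTROUTING -n -v
# Note: the "Cannot determine ethernet address for proxy ARP" warning is usually harmless
# when NAT is used, because the remoteip range is not part of any local ethernet subnet.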
This is the ifconfig result:
ens5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9001
inet 172.31.28.246 netmask 255.255.240.0 broadcast 172.31.31.255
inet6 fe80::4e6:11ff:fed8:bb4a prefixlen 64 scopeid 0x20<link>
ether 06:e6:11:d8:bb:4a txqueuelen 1000 (Ethernet)
RX packets 3668 bytes 347939 (339.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3111 bytes 385009 (375.9 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 6 bytes 416 (416.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6 bytes 416 (416.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ppp0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1396
inet 192.168.10.1 netmask 255.255.255.255 destination 192.168.20.10
ppp txqueuelen 3 (Point-to-Point Protocol)
RX packets 40 bytes 3158 (3.0 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8 bytes 104 (104.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
And this is the pptpd status (I could connect to the VPN successfully but could not access the internet):
[root@ip-172-31-28-246 ~]# systemctl status pptpd
● pptpd.service - PoPToP Point to Point Tunneling Server
Loaded: loaded (/usr/lib/systemd/system/pptpd.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2021-08-22 09:24:41 UTC; 2min 9s ago
Main PID: 1476 (pptpd)
CGroup: /system.slice/pptpd.service
├─1476 /usr/sbin/pptpd -f
├─1505 pptpd [171.213.14.133:ED5A - 0000]
└─1506 /usr/sbin/pppd local file /etc/ppp/options.pptpd 115200 192.168.10.1:192.168.20.10 ipparam 171.213.14.133 plugin /usr/lib64/pptpd/pptpd-logwtmp.so pptpd-original-ip 171.213.14.133 remote...
Aug 22 09:25:28 ip-172-31-28-246.ap-east-1.compute.internal pptpd[1505]: CTRL: Starting call (launching pppd, opening GRE)
Aug 22 09:25:28 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: Plugin /usr/lib64/pptpd/pptpd-logwtmp.so loaded.
Aug 22 09:25:28 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: pppd 2.4.5 started by root, uid 0
Aug 22 09:25:28 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: Using interface ppp0
Aug 22 09:25:28 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: Connect: ppp0 <--> /dev/pts/1
Aug 22 09:25:32 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: peer from calling number 171.213.14.133 authorized
Aug 22 09:25:32 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: MPPE 128-bit stateless compression enabled
Aug 22 09:25:34 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: Cannot determine ethernet address for proxy ARP
Aug 22 09:25:34 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: local IP address 192.168.10.1
Aug 22 09:25:34 ip-172-31-28-246.ap-east-1.compute.internal pppd[1506]: remote IP address 192.168.20.10

PhpStorm not receiving Xdebug from Vagrant machine

I have a Vagrant VM for development with Xdebug installed, and I want to connect it to PhpStorm.
In my xdebug.ini I have this:
zend_extension=xdebug.so
xdebug.remote_connect_back = 0
xdebug.idekey = "vagrant"
xdebug.remote_enable=on
xdebug.remote_host=192.168.56.1
xdebug.remote_port=9001
xdebug.remote_log=/tmp/xdebug.log
But it doesn't work. I did some debugging and checked whether port 9001 was open on my host machine, and it is:
phpstorm 311 jose 41u IPv4 0x2b62c0b107d1be65 0t0 TCP *:9001 (LISTEN)
phpstorm 311 jose 42u IPv4 0x2b62c0b1056e5245 0t0 TCP *:10137 (LISTEN)
phpstorm 311 jose 143u IPv4 0x2b62c0b10ae27e65 0t0 TCP 127.0.0.1:6942 (LISTEN)
phpstorm 311 jose 168u IPv4 0x2b62c0b107cd875d 0t0 TCP *:20080 (LISTEN)
phpstorm 311 jose 342u IPv4 0x2b62c0b110c70245 0t0 TCP 127.0.0.1:63342 (LISTEN)
JuniperSe 497 jose 10u IPv4 0x2b62c0b10696675d 0t0 TCP 127.0.0.1:3333 (LISTEN)
VBoxHeadl 726 jose 24u IPv4 0x2b62c0b1086e6b3d 0t0 TCP 127.0.0.1:2222 (LISTEN)
VBoxHeadl 726 jose 25u IPv4 0x2b62c0b10fe1d435 0t0 TCP *:33060 (LISTEN)
VBoxHeadl 726 jose 26u IPv4 0x2b62c0b108709435 0t0 TCP *:8088 (LISTEN)
But from the Vagrant VM, port 9001 is not accessible:
nc -z -v -w5 192.168.56.1 9001
nc: connect to 192.168.56.1 port 9001 (tcp) timed out: Operation now in progress
And it's the same for all PhpStorm ports, but I can access 8088 or 33060:
nc -z -v -w5 192.168.56.1 8088
Connection to 192.168.56.1 8088 port [tcp/omniorb] succeeded!
I've checked the option to accept external connections for Xdebug in PhpStorm. I'm using macOS.
OK, very silly problem: external connections were blocked for PhpStorm. I changed that in System Preferences > Security > Firewall; in the list of apps I searched for PhpStorm and allowed external connections.
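If the same change needs to be made from the command line, the macOS application firewall can be driven with socketfilterfw; a sketch, assuming PhpStorm is installed under /Applications:
# Add PhpStorm to the application firewall and allow incoming connections
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /Applications/PhpStorm.app
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblockapp /Applications/PhpStorm.app
# List the per-app rules to confirm
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --listapps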

Unable to start mesos-slave on a different VM. Constant Deactivated status

I'm trying to set up a simple Mesos cluster on 2 virtual machines. The IPs are:
10.10.0.102 (with 1 master and 1 slave) - FQDN mesos1.mydomain
10.10.0.103 (with 1 slave) - FQDN mesos2.mydomain
I'm using Mesos 0.27.1 (RPMs downloaded from Mesosphere) and CentOS Linux release 7.1.1503 (Core).
I was able to deploy a single-node cluster (10.10.0.102): the master and slave work, and I can deploy and scale a simple application via Marathon.
The problem comes when I try to start the second slave on 10.10.0.103. Whenever I start that slave, its state is deactivated.
Logs from slave on 10.10.0.103:
I0226 13:49:58.428019 14937 slave.cpp:463] Slave resources: cpus(*):1; mem(*):2768; disk(*):3409; ports(*):[31000-32000]
I0226 13:49:58.428019 14937 slave.cpp:471] Slave attributes: [ ]
I0226 13:49:58.428019 14937 slave.cpp:476] Slave hostname: mesos2
I0226 13:49:58.430469 14946 state.cpp:58] Recovering state from '/tmp/mesos/meta'
I0226 13:49:58.430922 14947 status_update_manager.cpp:200] Recovering status update manager
I0226 13:49:58.430954 14947 containerizer.cpp:390] Recovering containerizer
I0226 13:49:58.432219 14947 provisioner.cpp:245] Provisioner recovery complete
I0226 13:49:58.432273 14947 slave.cpp:4495] Finished recovery
I0226 13:49:58.448940 14948 group.cpp:349] Group process (group(1)@10.10.0.103:5051) connected to ZooKeeper
I0226 13:49:58.449050 14948 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0226 13:49:58.449064 14948 group.cpp:427] Trying to create path '/mesos' in ZooKeeper
I0226 13:49:58.451846 14948 detector.cpp:154] Detected a new leader: (id='3')
I0226 13:49:58.451937 14948 group.cpp:700] Trying to get '/mesos/json.info_0000000003' in ZooKeeper
I0226 13:49:58.453397 14948 detector.cpp:479] A new leading master (UPID=master@10.10.0.102:5050) is detected
I0226 13:49:58.453459 14948 slave.cpp:795] New master detected at master@10.10.0.102:5050
I0226 13:49:58.453698 14948 slave.cpp:820] No credentials provided. Attempting to register without authentication
I0226 13:49:58.453724 14948 slave.cpp:831] Detecting new master
I0226 13:49:58.453743 14948 status_update_manager.cpp:174] Pausing sending status updates
I0226 13:50:58.445101 14948 slave.cpp:4304] Current disk usage 22.11%. Max allowed age: 4.752451232032847days
I0226 13:51:58.460233 14948 slave.cpp:4304] Current disk usage 22.11%. Max allowed age: 4.752451232032847days
Logs from master on 10.10.0.102:
I0226 22:55:14.240464 2021 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 682
I0226 22:55:14.240542 2021 hierarchical.cpp:473] Added slave a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-S167 (mesos2) with cpus(*):1; mem(*):2768; disk(*):3409; ports(*):[31000-32000] (allocated: )
I0226 22:55:14.240671 2021 master.cpp:5350] Sending 1 offers to framework c5a5818d-16fa-42bf-8e73-697a2d12fe97-0001 (marathon) at scheduler-91034353-1820-4020-aad1-10e11d567136@10.10.0.102:45698
I0226 22:55:14.240767 2021 replica.cpp:537] Replica received write request for position 682 from (1259)@10.10.0.102:5050
E0226 22:55:14.241082 2027 process.cpp:1966] Failed to shutdown socket with fd 32: Transport endpoint is not connected
I0226 22:55:14.241143 2019 master.cpp:1172] Slave a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-S167 at slave(1)@10.10.0.103:5051 (mesos2) disconnected
I0226 22:55:14.241153 2019 master.cpp:2633] Disconnecting slave a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-S167 at slave(1)@10.10.0.103:5051 (mesos2)
I0226 22:55:14.241161 2019 master.cpp:2652] Deactivating slave a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-S167 at slave(1)@10.10.0.103:5051 (mesos2)
I0226 22:55:14.241230 2019 hierarchical.cpp:560] Slave a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-S167 deactivated
I0226 22:55:14.245923 2019 master.cpp:3673] Processing DECLINE call for offers: [ a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-O1251 ] for framework c5a5818d-16fa-42bf-8e73-697a2d12fe97-0001 (marathon) at scheduler-91034353-1820-4020-aad1-10e11d567136@10.10.0.102:45698
W0226 22:55:14.245923 2019 master.cpp:3720] Ignoring decline of offer a61e9d9f-f85b-4c72-9780-166a7ffc0ac3-O1251 since it is no longer valid
I0226 22:55:14.249065 2021 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 8.264893ms
I0226 22:55:14.249107 2021 replica.cpp:712] Persisted action at 682
I0226 22:55:14.249220 2021 replica.cpp:691] Replica received learned notice for position 682 from @0.0.0.0:0
I've tried to start the slave using two approaches (on 10.10.0.103):
sudo service mesos-slave start
mesos-slave --master=10.10.0.102:5050 --ip=10.10.0.103
Both give me the same results.
Additionally, in mesos-slave.WARNING I also see:
Running on machine: mesos2.mydomain
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0226 13:49:58.415089 14937 systemd.cpp:244] Required functionality `Delegate` was introduced in Version `218`. Your system may not function properly; however since some distributions have patched systemd packages, your system may still be functional. This is why we keep running. See MESOS-3352 for more information
Based on similar topics I see that this can be related to network configuration, so below is some info about it:
hosts file on 10.10.0.102
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.10.0.103 mesos2 mesos2.mydomain
10.10.0.102 mesos1 mesos1.mydomain
hosts file on 10.10.0.103
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.10.0.102 mesos1 mesos1.mydomain
10.10.0.103 mesos2 mesos2.mydomain
Both VMs have 2 network interfaces (not counting loopback). The output below comes from 10.10.0.103; it is similar on 10.10.0.102:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:49:76:48 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
valid_lft 75232sec preferred_lft 75232sec
inet6 fe80::a00:27ff:fe49:7648/64 scope link
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:d9:24:2a brd ff:ff:ff:ff:ff:ff
inet 10.10.0.103/24 brd 10.10.0.255 scope global enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fed9:242a/64 scope link
valid_lft forever preferred_lft forever
Both VMs have network connectivity.
From 10.10.0.102 to 10.10.0.103:
[root@mesos1 ~]# ping mesos2.mydomain
PING mesos2 (10.10.0.103) 56(84) bytes of data.
64 bytes from mesos2 (10.10.0.103): icmp_seq=1 ttl=64 time=0.578 ms
64 bytes from mesos2 (10.10.0.103): icmp_seq=2 ttl=64 time=0.616 ms
From 10.10.0.103 to 10.10.0.102:
[root@mesos2 ~]# ping mesos1.mydomain
PING mesos1 (10.10.0.102) 56(84) bytes of data.
64 bytes from mesos1 (10.10.0.102): icmp_seq=1 ttl=64 time=0.441 ms
64 bytes from mesos1 (10.10.0.102): icmp_seq=2 ttl=64 time=0.972 ms
Any help would be highly appreciated.
Regards,
Andrzej
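Since ping works in both directions but the master log shows the new slave being disconnected right after it is added, a TCP-level check of the Mesos ports can narrow things down; a sketch (5050 and 5051 are the master and slave ports visible in the logs above):
# From the slave (10.10.0.103): can we reach the master?
nc -z -v -w5 10.10.0.102 5050
# From the master (10.10.0.102): can we reach back to the slave?
nc -z -v -w5 10.10.0.103 5051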
As always, the simplest answers are the best. It turned out that I had iptables running on the slave node. Disabling it resolved my problem:
systemctl disable firewalld
systemctl stop firewalld
Thanks everyone for the help!
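Disabling firewalld entirely works, but a narrower alternative is to open just the Mesos ports on each node; a sketch, assuming firewalld stays enabled (5050/5051 are the ports from the logs above):
sudo firewall-cmd --permanent --add-port=5050/tcp
sudo firewall-cmd --permanent --add-port=5051/tcp
sudo firewall-cmd --reload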

ElasticSearch java.net.NoRouteToHostException in Docker

[2015-10-11 13:08:26,587][WARN ][transport.netty ] [Joseph] exception caught on transport layer [[id: 0x7e9f652b]], closing connection
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
I get this exception when launching Elasticsearch in Docker (actually, I only have this problem on a CentOS 7 Docker host).
First, my Dockerfile exposes the UDP ports.
EXPOSE 9200 9300/udp 9301/udp 9302/udp 9303/udp 9304/udp 9305/udp
When I start the Docker container, I open these ports via -p 9200:9200 -p 9300:9300/udp -p 9301:9301/udp -p 9302:9302/udp -p 9303:9303/udp -p 9304:9304/udp -p 9305:9305/udp
In docker ps, I do see these ports opened as 0.0.0.0:9300-9305->9300-9305/udp
And here are some lines from my elasticsearch.yml:
cluster.name: changsha
discovery.zen.ping.unicast.hosts: [ "10.0.5.241" ]
network.publish_host: 10.0.5.241
10.0.5.241 is my Docker host's IP address. What is wrong here? It succeeded on a CentOS 6 host but fails on this CentOS 7 host.
UPDATE
Following this answer, I get the following result from tcpdump -p -nn icmp.
09:26:53.277117 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.277494 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.277822 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.278043 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:54.277753 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:27:04.280703 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
First, find out the Docker interface IP address:
# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
ether 56:84:7a:fe:97:99 txqueuelen 0 (Ethernet)
RX packets 115761 bytes 12605533 (12.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 55687 bytes 22647938 (21.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Then add all of the Docker IP addresses to the whitelist:
firewall-cmd --permanent --zone=trusted --add-source=172.17.0.0/16
firewall-cmd --reload
Problem solved.
If someone comes across this issue on CentOS 7.4, it's because of a conflict between the docker service and the firewalld service.
You can solve it by disabling firewalld and then restarting the docker service.
Please refer to https://sanenthusiast.com/docker-and-firewalld-mess-in-centos-7/
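A minimal sketch of the commands that last suggestion describes, assuming systemd manages both services:
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo systemctl restart docker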

Hadoop slave cannot connect to master, even when service is running and ports are open

I'm running Hadoop 2.5.1 and I'm having a problem with slaves connecting to the master. My goal is to set up a Hadoop cluster. I hope someone can help; I've been pondering this for too long already! :)
This is what shows up in the slave's log file:
2014-10-18 22:14:07,368 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: master/192.168.0.104:8020
This is my core-site.xml file (same on master and slave):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master/</value>
</property>
</configuration>
This is my hosts file ((almost) the same on master and slave). I have hard-coded the addresses there without any success:
127.0.0.1 localhost
192.168.0.104 xubuntu: xubuntu
192.168.0.104 master
192.168.0.194 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
netstat from master:
xubuntu@xubuntu:/usr/local/hadoop/logs$ netstat -atnp | grep 8020
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 192.168.0.104:8020 0.0.0.0:* LISTEN 26917/java
tcp 0 0 192.168.0.104:52114 192.168.0.104:8020 ESTABLISHED 27046/java
tcp 0 0 192.168.0.104:8020 192.168.0.104:52114 ESTABLISHED 26917/java
Nmap from master to master:
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:36 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.000072s latency).
rDNS record for 192.168.0.104: xubuntu:
PORT STATE SERVICE
8020/tcp open unknown
...and nmap from slave to master (even though the port is open, the slave doesn't connect to it):
ubuntu@ubuntu:/usr/local/hadoop/logs$ nmap master -p 8020
Starting Nmap 6.40 ( http://nmap.org ) at 2014-10-18 22:35 EEST
Nmap scan report for master (192.168.0.104)
Host is up (0.14s latency).
PORT STATE SERVICE
8020/tcp open unknown
What is this all about? The problem is not the firewall. I have also read every thread there is on this without any success. I'm getting frustrated. :(
At least one of your problems is that you are using the old configuration name for HDFS. For version 2.5.1 the configuration name should be fs.defaultFS instead of fs.default.name. I also suggest defining the port in the value, so the value would be hdfs://master:8020.
Sorry, I'm not a Linux guru, so I don't know about nmap, but does telnetting from the slave to the master on that port work?
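A quick way to apply and verify that suggestion (a sketch; the Hadoop install path is taken from the log paths shown above and may differ):
# Check which property name is currently configured (run on both master and slave)
grep -n "fs.default" /usr/local/hadoop/etc/hadoop/core-site.xml
# After switching it to fs.defaultFS with an explicit port, i.e.
#   <name>fs.defaultFS</name>
#   <value>hdfs://master:8020</value>
# restart HDFS and test the port from the slave:
nc -z -v -w5 master 8020    # or: telnet master 8020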
