I have an issue with DFS in Hadoop. Does somebody know how to solve my problem?
[hduser@evghost ~]$ start-dfs.sh
Starting namenodes on [evghost]
Error: Please specify one of --hosts or --hostnames options and not both.
evghost: starting datanode, logging to /usr/lib/hadoop-2.7.1/logs/hadoop-hduser-datanode-evghost.out
Starting secondary namenodes [0.0.0.0]
Error: Please specify one of --hosts or --hostnames options and not both.
As you can see, it's something to do with hosts and hostnames. I've been stuck on this for about two days and haven't found any solution to this problem on the internet. Please help me.
It's an issue with DNS. If your hostname is anything other than 'localhost', you won't be able to deploy DFS in pseudo-distributed mode, because DNS won't return an IP address for that name. Here my hostname was evghost; let's look:
[main@evghost ~]$ host evghost
Host evghost not found: 3(NXDOMAIN)
DNS gave no answer. There's no easy way around that short of setting up your own DNS server on your PC. Much pain, but I think it can work.
The solution is to put
localhost
in /etc/hostname and NOT anything else!
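For reference, this is roughly what that looks like from the shell (a sketch: hostnamectl assumes a systemd-based distro, and the backup step is just a precaution):
# back up the current hostname, then switch it to localhost
sudo cp /etc/hostname /etc/hostname.bak
echo localhost | sudo tee /etc/hostname
sudo hostnamectl set-hostname localhost
# verify that the name resolves through the system resolver (expect a 127.0.0.1 or ::1 line)
getent hosts localhost
# restart DFS afterwards
stop-dfs.sh; start-dfs.sh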
I spent two days figuring that out; I hate this technology and love it at the same time.
I've successfully set up a single node in pseudo-distributed mode as described in https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation, under Windows' WSL2 environment.
After that, I tried to repeat it on a MacBook Pro, but somehow start-dfs.sh fails. The terminal throws this error:
Stopping namenodes on [localhost]
Stopping datanodes
Stopping secondary namenodes [kakaoui-MacBookPro.local]
kakaoui-MacBookPro.local: ssh: connect to host kakaoui-macbookpro.local port 22: Connection refused
2021-06-26 23:01:23,377 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Okay. There are answers saying I should enable SSH connections in the system preferences, but that is already enabled, and ssh localhost also works fine.
And then things get worse; sometimes the secondary namenode fails like this:
Starting secondary namenodes [kakaoui-MacBookPro.local]
kakaoui-MacBookPro.local: ssh: connect to host kakaoui-macbookpro.local port 22: Operation timed out
Then, when I leave the Mac for a while and run start-dfs.sh again, once in a while it succeeds. But when I run stop-dfs.sh and then start-dfs.sh again to check, it fails.
Even when start-dfs.sh succeeds, a lot of problems follow, like not being able to start the datanode, resourcemanager, nodemanager, etc. I couldn't get the Hadoop environment running even once.
It feels like everything is mixed up and nothing is stable at all. I've already tried reinstalling this and that several times. Unfortunately, most of the startup failures are not even recorded in the /logs folder.
Currently I'm using:
macOS: Catalina 10.15.6
java: 1.8.0_291
hadoop: 3.3.1
I've spent two whole days just trying. Please help!
Okay, I found a solution that I don't understand: I turned off the Wi-Fi connection during the startup process and all the processes came up. I can't understand how the Wi-Fi connection interferes with ssh localhost, though.
Provide passwordless SSH key access to all your worker nodes listed in the hosts file, including localhost as well as kakaoui-macbookpro.local. Read the instructions in "Creating a SSH Public Key on OSX".
Finally, test passwordless access with ssh localhost and ssh [yourworkernode] (probably ssh kakaoui-macbookpro.local).
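A minimal sketch of that on macOS, assuming Remote Login is already enabled under System Preferences > Sharing and the default key path is fine:
# generate a key pair if you don't have one yet (empty passphrase for convenience)
ssh-keygen -t rsa -b 4096
# authorize your own public key for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# both of these should now connect without asking for a password
ssh localhost exit
ssh kakaoui-macbookpro.local exit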
I'm currently running a Hadoop setup with a NameNode (master-node, 10.0.1.86) and a DataNode (node1, 10.0.1.85) using two CentOS VMs.
When I run a Hive query that starts a MapReduce job, I get the following error:
"Application application_1515705541639_0001 failed 2 times due to
Error launching appattempt_1515705541639_0001_000002. Got exception:
java.net.NoRouteToHostException: No Route to Host from
localhost.localdomain/127.0.0.1 to 10.0.2.62:48955 failed on socket
timeout exception: java.net.NoRouteToHostException: No route to host;
For more details see: http://wiki.apache.org/hadoop/NoRouteToHost"
Where on earth is this 10.0.2.62 IP coming from? Here is an example of what I am seeing.
This IP does not exist on my network. You cannot reach it through ping or telnet.
I have gone through all my config files on both master-node and node1 and I cannot find where it is picking up this IP. I've stopped and started both HDFS and YARN and rebooted both VMs. Both /etc/hosts files are how they should be. Any general direction on where to look next would be appreciated; I am stumped!
I didn't have any luck discovering where this rogue IP was coming from. I ended up assigning the VM the IP address that the master node was looking for, and sure enough everything works fine.
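For anyone else hunting a rogue address like this, one way to sweep the usual suspects (a sketch; the Hadoop config path is an assumption, adjust it to your install):
# search every Hadoop config file plus the common system files for the stray address
grep -rn "10.0.2.62" /usr/local/hadoop/etc/hadoop /etc/hosts /etc/sysconfig/network 2>/dev/null
# check what each VM reports as its own name and address
hostname -f
hostname -i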
I have installed Spark and Hadoop in standalone mode on an Ubuntu VirtualBox VM for learning. I can do normal Hadoop MapReduce operations on HDFS without using Spark, but when I use the code below in spark-shell,
scala> val file = sc.textFile("hdfs://localhost:9000/in/file")
scala> file.count()
I get "input path does not exist." error. The core-site.xml has fs.defaultFS with value hdfs://localhost:9000. If I give localhost without the port number, I get "Connection refused" error as it is listening on default port 8020. Hostname and localhost are set to loopback addresses 127.0.0.1 and 127.0.1.1 in etc/hosts.
Kindly let me know how to resolve this issue.
Thanks in advance!
I am able to read from and write to HDFS using
"hdfs://localhost:9000/user/<user-name>/..."
Thank you for your help.
Probably your configuration is all right, but the file is missing or in an unexpected location...
1) try:
sc.textFile("hdfs://in/file")
sc.textFile("hdfs:///user/<USERNAME>/in/file")
with USERNAME=hadoop, or your own username
2) try, on the command line (outside spark-shell), to access that directory/file:
hdfs dfs -ls /in/file
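If both of those look fine, it can also help to confirm what the tools actually resolve (a sketch; /in/file is just the path from the question):
# which NameNode URI core-site.xml resolves to
hdfs getconf -confKey fs.defaultFS
# list the directory and the exact file through that URI
hdfs dfs -ls /in
hdfs dfs -ls hdfs://localhost:9000/in/file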
I've started a Hadoop cluster composed of one master and 4 slave nodes.
Configuration seems ok:
hduser#ubuntu-amd64:/usr/local/hadoop$ ./bin/hdfs dfsadmin -report
When I open the NameNode UI (http://10.20.0.140:50070/), the Overview card looks OK; for example, the total capacity of all nodes adds up.
The problem is that in the Datanodes card I see only one datanode.
I came across the same problem and, fortunately, I solved it. I guess it is caused by 'localhost'.
Configure a different name for each of these IPs in /etc/hosts.
Remember to restart all the machines; then things will go well.
It's because of the same hostname on both datanodes.
In your case both datanodes are registering with the namenode under the same hostname, i.e. 'localhost'. Try different hostnames; that will fix your problem.
In the UI it will show only one entry per hostname.
In the "hdfs dfsadmin -report" output you can see both.
The following tips may help you; a quick command sketch for running through them follows below.
Check the core-site.xml and ensure that the namenode hostname is correct
Check the firewall rules in namenode and datanodes and ensure that the required ports are open
Check the logs of datanodes
Ensure that all the datanodes are up and running
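A rough way to run through these checks from the shell (a sketch: the log path assumes a default tarball install, and <datanode-host> is a placeholder for one of your datanodes; the datanode transfer port is 50010 on Hadoop 2.x and 9866 on 3.x):
# which namenode address the nodes are configured to use
hdfs getconf -confKey fs.defaultFS
# run on each datanode: a DataNode process should appear here
jps
# from the namenode: is the datanode transfer port reachable?
nc -zv <datanode-host> 50010
# last lines of the datanode log
tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log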
As @Rahul said, the problem is because of the same hostname.
Change the hostname in the /etc/hostname file and give a different hostname to each host,
and map each hostname to its IP address in the /etc/hosts file.
Then restart your cluster and you will see all the datanodes in the Datanode information tab in the browser.
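For example, something along these lines on each machine (a sketch: the hostnames and addresses below are made-up examples, so substitute your own):
# on each slave, give the machine its own name (here: slave1)
sudo hostnamectl set-hostname slave1
# on every node, append the same name-to-IP mappings to /etc/hosts
sudo tee -a /etc/hosts >/dev/null <<'EOF'
10.20.0.140  master
10.20.0.141  slave1
10.20.0.142  slave2
EOF
# then restart the cluster from the master
stop-dfs.sh
start-dfs.sh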
I had the same trouble because I used IP addresses instead of hostnames; [hdfs dfsadmin -report] was correct even though only one [localhost] entry showed up in the UI. Finally, I solved it like this:
<property>
<name>dfs.datanode.hostname</name>
<value>the name you want to show</value>
</property>
You can hardly find this in any documentation...
Sorry, it feels like it's been a while, but I'd still like to share my answer:
The root cause is in hadoop/etc/hadoop/hdfs-site.xml:
That XML file has a property named dfs.datanode.data.dir. If you set all the datanodes to the same name, then Hadoop assumes the cluster has only one datanode. So the proper way of doing it is to give every datanode a unique name:
Regards,
YUN HANXUAN
Your admin report looks absolutely fine. Please run the command below to check the HDFS disk space details.
"hdfs dfs -df /"
If the size still looks right, it's just a UI glitch.
My problem: I have 1 master node and 3 slave nodes. When I start all the nodes with start-all.sh and access the master node's dashboard, I can see only one datanode on the web UI.
My Solution:
Try stopping the firewall temporarily with sudo systemctl stop firewalld. If you do not want to stop your firewalld service, then allow the datanode's ports with:
sudo firewall-cmd --permanent --add-port={PORT_NUMBER/tcp,PORT_NUMBER2/tcp} ; sudo firewall-cmd --reload
If you are using a separate user for Hadoop (in my case a hadoop user manages the Hadoop daemons), then change the owner of your datanode and namenode directories with: sudo chown hadoop:hadoop /opt/data -R
My hdfs-site.xml config is as given in the image.
Check the daemons on the datanode with the jps command. It should show output as given in the image below.
jps Output
My manager has provided me with an Amazon EC2 instance along with a .ppk file. I am able to log in and am trying to install Hadoop. I made the needed config changes, such as editing the masters and slaves files from localhost to the EC2 instance name, adding the needed properties to the mapred-site.xml/hdfs-site.xml/core-site.xml files, and formatting the namenode for HDFS.
Now, when I run the start-dfs.sh script, I get the following errors:
starting namenode, logging to /home/ubuntu/hadoop/libexec/../logs/hadoop-ubuntu-namenode-domU-12-31-39-07-60-A9.out
The authenticity of host 'XXX.amazonaws.com (some IP)' can't be established.
Are you sure you want to continue connecting (yes/no)? yes
XXX.amazonaws.com: Warning: Permanently added 'XXX.amazonaws.com,' (ECDSA) to the list of known hosts.
XXX.amazonaws.com: Permission denied (publickey).
XXX.amazonaws.com: Permission denied (publickey).
As of now, the master and slave nodes are the same machine.
XXX is the instance name and "some IP" is its IP address; I'm masking them for security reasons.
I have absolutely no idea about using an EC2 instance, SSH, etc.; I only need to run a simple MapReduce program on it.
Kindly suggest.
Hadoop uses SSH to transfer information from the master to the slaves. It looks like your nodes are trying to talk to each other via SSH but haven't been configured to do so. In order to communicate, the Hadoop master node needs passwordless SSH access to the slave nodes. Passwordless access is useful so that every time you try to run a job you don't have to enter your password again for each of the slave nodes; that would be quite tedious. It looks like you'll have to set this up between the nodes before you can continue.
I would suggest you check this guide and find the section called "Configuring SSH". It lays out how to accomplish this.
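For the single-instance case described here (master and slave on the same machine), a minimal sketch looks like this; the key path is just the default, and it should be run as the user that starts Hadoop:
# generate a key with no passphrase and authorize it for logins to this same machine
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
# these should now succeed without a password or a "Permission denied (publickey)" error
ssh localhost exit
ssh XXX.amazonaws.com exit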