Hadoop on macOS: starting secondary namenode fails due to ssh connection refused

I've successfully gone through setting up a single node in pseudo-distributed mode, as described in https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation, under Windows' WSL2 environment.
After that, I tried to repeat it on my MacBook Pro, but somehow start-dfs.sh fails. The terminal throws this error:
Stopping namenodes on [localhost]
Stopping datanodes
Stopping secondary namenodes [kakaoui-MacBookPro.local]
kakaoui-MacBookPro.local: ssh: connect to host kakaoui-macbookpro.local port 22: Connection refused
2021-06-26 23:01:23,377 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Okay. There are answers saying I should enable SSH in System Preferences, but Remote Login is already turned on, and ssh localhost also works fine.
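In case it's relevant, this is roughly how I confirmed that (using the stock macOS tools; exact output may vary):
# should print "Remote Login: On" if the built-in SSH server is enabled
sudo systemsetup -getremotelogin
# should log in and exit without any password prompt
ssh localhost exit && echo "ssh localhost OK"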
And then things get worse; sometimes the secondary namenode fails like this instead:
Starting secondary namenodes [kakaoui-MacBookPro.local]
kakaoui-MacBookPro.local: ssh: connect to host kakaoui-macbookpro.local port 22: Operation timed out
Then when I leave the Mac alone for a while and run start-dfs.sh again, it occasionally succeeds. But as soon as I do stop-dfs.sh and start-dfs.sh again to check, it fails.
Even when start-dfs.sh does succeed, plenty of problems follow: the datanode, resourcemanager, nodemanager, etc. won't start. I haven't been able to get the Hadoop environment running even once.
It feels like everything is mixed up and nothing is stable at all. I've already tried reinstalling this and that several times. Unfortunately, most of the startup failures aren't even recorded in the /logs folder.
Currently I'm using:
macOS: Catalina 10.15.6
java: 1.8.0_291
hadoop: 3.3.1
I've spent two whole days just trying. Please help!

Okay, I found a solution that I don't understand: I turned off the Wi-Fi connection during the startup process and all processes came up. I can't understand how the Wi-Fi connection interferes with ssh localhost, though.
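My only guess (unverified) is that the .local hostname resolves differently depending on which network interface is up. Comparing what it points to with Wi-Fi on versus off might confirm that; these are standard macOS lookups:
# what does the mDNS/Bonjour name currently resolve to?
dscacheutil -q host -a name kakaoui-MacBookPro.local
# the local hostname macOS advertises for this machine
scutil --get LocalHostName
# verbose SSH attempt against that name, to see which address it actually tries
ssh -v kakaoui-MacBookPro.local exit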

Provide passwordless (key-based) SSH access to all the worker nodes listed in your hosts file, including localhost as well as kakaoui-macbookpro.local. Read the instructions in Creating a SSH Public Key on OSX.
Finally, test access without a password via ssh localhost and ssh [yourworkernode] (in your case, ssh kakaoui-macbookpro.local), as sketched below.
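For reference, a minimal sketch of the key setup (this follows the standard steps from the Hadoop single-node guide and assumes the default key path):
# generate a passphrase-less key pair if you don't already have one
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# authorize it for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
# both of these should now succeed without a password prompt
ssh localhost exit
ssh kakaoui-macbookpro.local exit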

Related

Secondary namenode connection timed out

I'm trying to set up Hadoop on my Mac running Mojave 10.14.6. The Hadoop version I'm using is 3.0.3.
I followed this tutorial to set up the config: https://dbmstutorials.com/hive/hdfs-setup-on-mac.html
While running hdfs namenode -format I get the following error for the secondary namenode:
Starting secondary namenodes [xp]
xp: ssh: connect to host xp port 22: Operation timed out
2019-12-09 09:26:03,796 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
I allowed remote login, created SSH keys without a password, and deactivated the firewall to check whether it could help, but the problem remains. Any help would be greatly appreciated :)
Yes, I tried to ssh xp and it didn't work. After investigating a bit more I managed to make it work...
I changed the IP in /etc/hosts from 127.0.1.1 to another one that responded to ping. I don't know why the IP 127.0.1.1 didn't work, but at least the problem seems to be fixed for now.
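For anyone hitting the same thing, the change amounts to something like the following in /etc/hosts (xp is the hostname from the error above; 127.0.0.1 is only an illustration, use whichever address actually answers ping and has sshd listening):
# /etc/hosts -- point the hostname Hadoop resolves at an address sshd really listens on
127.0.0.1   localhost
127.0.0.1   xp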

Hadoop: Unable to connect to Web GUI

Introduction: I'm using Ubuntu 18.04.2 LTS on which I'm trying to set up a Hadoop 3.2 Single Node Cluster. The installation goes perfectly fine, and I have Java installed. JPS is working as well.
Issue: I'm trying to connect to the Web GUI at localhost:50070, but I'm unable to. I'm attaching a snippet of my console when I execute ./start-all.sh:
root@it-research:/usr/local/hadoop/sbin# ./start-all.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [it-research]
Starting resourcemanager
Starting nodemanagers
pdsh@it-research: localhost: ssh exited with exit code 1
root@it-research:/usr/local/hadoop/sbin# jps
6032 Jps
3154 SecondaryNameNode
2596 NameNode
I'm unable to resolve localhost: ssh exited with exit code 1
Solutions I've tried:
Set up password-less SSH
Set up NameNode User
Set up PDSH to work with SSH
I've also added master [myIPAddressv4Here] to the /etc/hosts file and tried connecting to master:50070, but I'm still facing the same issue.
Expected Behaviour: I should be able to connect to the Web GUI when I go to localhost:50070, but I can't.
Please let me know if there's some more information I should provide.
The port number for Hadoop 3.x is 9870, so localhost:9870 should work.
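If in doubt, you can check which address the NameNode web UI is bound to and whether it answers (assuming the hadoop binaries are on your PATH):
# prints the configured HTTP address; the Hadoop 3.x default is 0.0.0.0:9870
hdfs getconf -confKey dfs.namenode.http-address
# quick reachability check against the UI
curl -I http://localhost:9870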

Hadoop standalone mode not starting on the local machine due to permission issues

I am not able to figure out what the problem is. I have checked all the links available for this problem and tried their suggestions, but I still have the same issue.
Please help; the available sandbox needs a higher configuration, such as more RAM, than I have.
hstart
WARNING: Attempting to start all Apache Hadoop daemons as adityaverma in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
localhost: adityaverma@localhost: Permission denied (publickey,password,keyboard-interactive).
Starting datanodes
localhost: adityaverma@localhost: Permission denied (publickey,password,keyboard-interactive).
Starting secondary namenodes [Adityas-MacBook-Pro.local]
Adityas-MacBook-Pro.local: adityaverma@adityas-macbook-pro.local: Permission denied (publickey,password,keyboard-interactive).
2018-05-30 11:07:03,084 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
localhost: adityaverma@localhost: Permission denied (publickey,password,keyboard-interactive).
This error typically means you have not set up passwordless SSH. For example, the same error should happen with ssh localhost; once set up correctly, it should not prompt for a password.
Check the Hadoop documentation again on SSH key generation and add the key to your authorized_keys file.
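That same "Permission denied (publickey,password,keyboard-interactive)" message also shows up when sshd rejects the key because the .ssh directory is too permissive, so it's worth checking permissions too (a rough checklist, assuming the default key location):
# sshd ignores keys when these are group/world writable
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_rsa
# this must log in without prompting before start-dfs.sh / hstart will work
ssh localhost exit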
I might suggest setting up a virtual machine anyway (for example, using Vagrant) if the sandbox requires too many resources. The Hortonworks and Cloudera installation docs are fairly detailed on installing a cluster from scratch.
This way, Hadoop isn't cluttering your Mac's hard drive, and a Linux server will more closely match the Hadoop installations running in production environments.

Hadoop Data node IP isn't a real VM

I'm currently running a Hadoop setup with a NameNode (master-node, 10.0.1.86) and a DataNode (node1, 10.0.1.85) using two CentOS VMs.
When I run a Hive query that starts a MapReduce job, I get the following error:
"Application application_1515705541639_0001 failed 2 times due to
Error launching appattempt_1515705541639_0001_000002. Got exception:
java.net.NoRouteToHostException: No Route to Host from
localhost.localdomain/127.0.0.1 to 10.0.2.62:48955 failed on socket
timeout exception: java.net.NoRouteToHostException: No route to host;
For more details see: http://wiki.apache.org/hadoop/NoRouteToHost"
Where on earth is this IP of 10.0.2.62 coming from? Here is an example of what I am seeing.
This IP does not exist on my network. You cannot reach it through ping or telnet.
I have gone through all my config files on both master-node and node1 and I cannot find where it is picking up this IP. I've stopped/started both HDFS and YARN and rebooted both VMs. Both /etc/hosts files are how they should be. Any general direction on where to look next would be appreciated; I am stumped!
I didn't have any luck discovering where this rogue IP was coming from. I ended up assigning the VM the IP address that the master node was looking for, and sure enough everything works fine.
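For anyone tracking down a similar rogue address, a rough way to search for where it might be configured (paths here assume a default layout; $HADOOP_HOME is wherever Hadoop is installed):
# look for the unexpected address in Hadoop config and CentOS network files
grep -R "10.0.2.62" $HADOOP_HOME/etc/hadoop /etc/hosts /etc/sysconfig/network-scripts 2>/dev/null
# confirm which addresses each VM actually has
hostname -I
ip addr show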

java.net.ConnectException: Connection refused error when running Hive

I'm trying to work through a Hive tutorial in which I enter the following:
load data local inpath '/usr/local/Cellar/hive/0.11.0/libexec/examples/files/kv1.txt' overwrite into table pokes;
This results in the following error:
FAILED: RuntimeException java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
I see that there are some answers on here having to do with configuring my IP address and localhost, but I'm not familiar with the concepts in those answers. I'd appreciate anything you can tell me about the fundamentals of what causes this kind of error and how to fix it. Thanks!
This is because Hive is not able to contact your NameNode.
Check whether your Hadoop services have started properly.
Run the command jps to see which services are running.
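On a healthy pseudo-distributed setup you would expect something roughly like this (the PIDs are illustrative):
$ jps
4531 NameNode
4712 DataNode
4923 SecondaryNameNode
5210 ResourceManager
5399 NodeManager
6001 Jps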
The reason you get this error is that Hive needs Hadoop as its base, so you need to start Hadoop first.
Here are some steps.
Step 1: download Hadoop and unzip it
Step 2: cd #your_hadoop_path
Step 3: ./bin/hadoop namenode -format
Step 4: ./sbin/start-all.sh
Then go back to #your_hive_path and start Hive again.
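Once Hadoop is up, a quick sanity check that the NameNode RPC port from the error above (localhost:9000) is actually listening, before retrying Hive (assumes the hadoop binaries are on your PATH and nc is installed):
# which filesystem URI will Hive/Hadoop use?
hdfs getconf -confKey fs.defaultFS
# is anything listening on the NameNode RPC port?
nc -z localhost 9000 && echo "namenode RPC reachable"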
An easy way I found is to edit the /etc/hosts file. By default it looks like:
127.0.0.1 localhost
127.0.1.1 user_user_name
Just edit it so that 127.0.1.1 becomes 127.0.0.1; that's it. Restart your shell and restart your cluster with start-all.sh.
I had the same problem when setting up Hive.
I solved it by changing my /etc/hostname.
Formerly it contained my user_machine_name; after I changed it to localhost, everything went well.
I guess this is because Hadoop wants to resolve your hostname using this /etc/hostname file, but it was directing it to your user_machine_name while the Hadoop service was running on localhost.
I was able to resolve the issue by executing the command below:
start-all.sh
This ensures that the Hadoop services Hive depends on have started.
Starting Hive after that was straightforward.
I had a similar problem with a connection timeout:
WARN DFSClient: Failed to connect to /10.165.0.27:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
The DFSClient was resolving datanodes by their internal IPs. Here's the solution for this:
.config("spark.hadoop.dfs.client.use.datanode.hostname", "true")
