Hortonworks Sandbox HDP 2.6.5 on Mac with VirtualBox - hadoop

I am new to the Hortonworks Sandbox HDP 2.6.5. I have it successfully installed on macOS Catalina running VirtualBox. All is good: I can access the Ambari dashboard and SSH from my Mac into the sandbox.
However, I am confused about what is where, and therefore how to access things.
I can SSH using this line:
ssh maria_dev@127.0.0.1 -p 2222
.... and I arrive here: maria_dev@sandbox-hdp
This looks a lot like the Hadoop file system.
In Ambari, I use the Files View to navigate in the GUI to /user/maria_dev.
This looks to me like I am navigating the Linux host.
Assuming this is correct (is it?), how do I SSH to here (/user/maria_dev) from a terminal on my Mac?
Thanks in advance
Simon

The Ambari Files View shows HDFS.
You don't see HDFS files from an SSH session without using hdfs dfs -ls (or hadoop fs -ls) commands, and this is different from a plain ls/cd, which browses the local Linux filesystem.
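For example, from an SSH session on the sandbox (a minimal sketch; the paths assume the stock maria_dev account):
$ ls /home/maria_dev             # the local Linux filesystem of the VM
$ hdfs dfs -ls /user/maria_dev   # HDFS - what the Ambari Files View shows
The two paths look similar, but they live in entirely different filesystems.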
FWIW, HDP 2.6 has been deprecated for a few years
how do I log into the Linux system that is supporting the Hadoop instance
That is what SSH does

Related

localhost:8088 does not work on hadoop 3

I want to install Hadoop 3 on Mint. At the end, localhost:9870 works fine and shows the NameNode, but although the ResourceManager starts in the terminal, localhost:8088 does not work!
https://imgur.com/0QCqHkG
With Ubuntu 18.04 and Hadoop 3.1.1 I had the same problem.
I worked around it by using Java 8 instead of Java 11, i.e. I replaced:
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
— with:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
in etc/hadoop/hadoop-env.sh.
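After changing JAVA_HOME, restart the daemons so the new setting takes effect; a minimal sketch, assuming a tarball install with $HADOOP_HOME/sbin on your PATH:
$ stop-yarn.sh && stop-dfs.sh
$ start-dfs.sh && start-yarn.sh
$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088
A 200 response means the ResourceManager UI is back up.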

How to get HDFS and YARN version programmatically?

I'm writing a Spark program that downloads different jars from Maven based on the environment it runs on, one for each Hadoop distribution (e.g. CDH, HDP, MapR).
This is necessary because some low-level APIs of HDFS and YARN are not shared between these distributions. However, I cannot find any public API of HDFS and YARN that reports their version.
Is it possible to do this in Java alone, or do I have to run an external shell command to find out?
In Java, org.apache.hadoop.util.VersionInfo.getVersion() should work.
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/util/VersionInfo.html
For the CLIs, you can use:
$ hadoop version
$ hdfs version
$ yarn version
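If you do end up shelling out instead, the version number is on the first line of the output; a sketch (the awk field assumes the usual "Hadoop <version>" first line, which may vary between distributions):
$ hadoop version | head -n1 | awk '{print $2}'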

setting up 3 node hadoop cluster

I want to set up a cluster of 3 nodes in my office innovation lab. All three machines have Windows 7 installed, so I thought of creating the cluster using Ubuntu on all three machines. So far I have followed the steps below:
Installed VMware on all 3 machines
Installed Ubuntu on the 3 machines
Installed Java 1.8 on all the machines
Please guide me on what steps I need to follow to set up the cluster.
I have seen a few videos where they created a local repository and did some setup for httpd as well.
Thanks,
Brijesh
First, install the Hadoop package with this command:
rpm -ivh hadoop-<version>.rpm
Then go to the Hadoop configuration directory:
cd /etc/hadoop
and open the hdfs-site.xml and core-site.xml files there and edit the properties, as in these screenshots:
[1]: http://i.stack.imgur.com/WkTIy.png
[2]: http://i.stack.imgur.com/uf89i.png
Each machine configured this way comes up as a DataNode; repeat the steps on the remaining nodes.
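For reference, a minimal sketch of the two property files that answer refers to; the host name "master" and port 9000 are assumptions, so substitute your own NameNode host.
In core-site.xml, point every node at the NameNode:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>
In hdfs-site.xml, set the replication factor (3 suits a 3-node cluster):
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
Then format the NameNode once on the master and start HDFS:
$ hdfs namenode -format
$ start-dfs.sh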

how to install apache phoenix to ambari 1.7 with hbase?

I'm new to Hadoop. I want to install Phoenix with HBase, but I installed my Hadoop cluster using Ambari 1.7 on Ubuntu, and I'm not able to find any tutorial for doing that.
If you built your own Hadoop stack:
https://phoenix.apache.org/download.html
https://phoenix.apache.org/installation.html
If you use e.g. IBM Open Platform (which is free, btw):
https://developer.ibm.com/hadoop/blog/2015/10/21/installing-apache-phoenix-ibm-open-platform-apache-hadoop-4-1/
HBase should be available as a service under the Add Service button on the home page.
For installing Phoenix I used this link:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_installing_manually_book/content/upgrade-22-7-a.html
Basically: yum install phoenix on each node, and then create soft links to the Phoenix server jar file.
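A sketch of those two steps; the paths below are assumptions that vary between HDP releases, so check where your distribution keeps the Phoenix and HBase lib directories:
$ yum install phoenix
$ ln -sf /usr/hdp/current/phoenix-client/phoenix-server.jar /usr/hdp/current/hbase-regionserver/lib/phoenix-server.jar
Restart HBase afterwards so the RegionServers pick up the jar.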
hth

Cloudera installation error: can we use Cloudera Manager for a Hadoop single-node cluster on Ubuntu?

I am using Ubuntu 12.04 64-bit, and I installed and ran sample Hadoop programs on a single node successfully.
I am getting the following error while installing Cloudera Manager on my Ubuntu:
Refreshing repository metadata failed. See
/var/log/cloudera-manager-installer/2.refresh-repo.log for details.
Click OK to revert this installation.
I want to know whether we can install Cloudera Manager for a single-node Hadoop cluster on Ubuntu. Please tell me whether it is possible on a single node, or whether I need to create multiple nodes to use Cloudera with my Hadoop.
Yes, CM can run on a single node.
This error occurs because CM cannot use apt-get install to fetch the packages. Which tutorial are you following?
However, you can manually add the Cloudera repo. See this thread.
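A sketch of adding the repo by hand on Ubuntu 12.04 (precise); the archive URL and package names are from the CM5 era and are assumptions to verify against the Cloudera documentation for your version:
$ sudo wget https://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/cloudera.list -O /etc/apt/sources.list.d/cloudera.list
$ wget -qO - https://archive.cloudera.com/cm5/ubuntu/precise/amd64/cm/archive.key | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install cloudera-manager-daemons cloudera-manager-server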
