How to install cloudera manager without cdh installation? - hadoop

I have a Hadoop environment installed from the tarball I downloaded at http://hadoop.apache.org/releases.html#Download.
Now I need to use Cloudera Manager to monitor my MapReduce application.
Is it possible to use Cloudera Manager without a CDH installation?

Cloudera Manager is useless without CDH. Any reason why you would not just use that? Usually the question is the other way around ("I have CDH installed, do I need to use Cloudera Manager?")

Related

Are there different ways to install Cloudera Hadoop packages?

Can I only install packages via RPM (Red Hat Package Manager)?
I'm using Cloudera and I have heard a couple of times about CDH parcels, but I'm not sure whether I can use them for this too. Or is there another mechanism?
best regards
If you're using Debian or Ubuntu, then you'd use DEB packages, not RPM.
Parcels should work.
So would compiling the code from source.
Just because you're running Cloudera doesn't make the system any less of a regular Linux machine.
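To make the DEB-versus-RPM point concrete, here is a minimal Python sketch that guesses which package format a machine uses by looking for the `dpkg` or `rpm` tool on the PATH. The function name `native_package_format` is made up for this illustration, and the check is only a heuristic: a host can have both tools installed, which is why `dpkg` is tested first.

```python
import shutil


def native_package_format(which=shutil.which):
    """Guess the native package format: 'deb', 'rpm', or None.

    `which` is injectable so the logic can be exercised without
    depending on the tools installed on the current machine.
    """
    # Debian/Ubuntu hosts sometimes also have the rpm tool installed,
    # so dpkg is checked first (an assumption of this sketch).
    if which("dpkg"):
        return "deb"
    if which("rpm"):
        return "rpm"
    return None


# Usage: print(native_package_format())
```

On an Ubuntu host this would normally report `deb`, and on a RHEL or CentOS host `rpm`; parcels sidestep the question because Cloudera Manager distributes them itself.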
Parcels are Cloudera's way of installing its distribution. You install Cloudera Manager, and it installs all of the components using parcels (although I think you can choose packages instead). This is done through a GUI (or an API). This is probably the easiest way to go about it.
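As a rough illustration of the API route, the Python sketch below builds Cloudera Manager REST URLs and fetches them with HTTP basic authentication. The helper names `cm_api_url` and `cm_get` are invented for this example, and the default port 7180 and the `/api/version` and `/api/<version>/clusters` paths reflect my understanding of CM 5; check them against the API documentation for your CM release before relying on them.

```python
import base64
import urllib.request


def cm_api_url(host, path, port=7180, scheme="http"):
    """Build a Cloudera Manager API URL (port 7180 is assumed as the default)."""
    return f"{scheme}://{host}:{port}/api/{path.lstrip('/')}"


def cm_get(url, user, password, timeout=10):
    """Fetch a URL with HTTP basic auth and return the response body as text."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()


# Usage (needs a reachable CM server; host and credentials are placeholders):
#   version = cm_get(cm_api_url("cm.example.com", "version"), "admin", "admin").strip()
#   clusters = cm_get(cm_api_url("cm.example.com", f"{version}/clusters"), "admin", "admin")
```

Asking the server for its supported API version first avoids hard-coding a version number that may not match the installed release.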
If you are just learning, the Quick Start VM is not a bad way to get started.
I have installed Cloudera using parcels, and it is easier than a package installation. Cloudera Manager picks up parcels conveniently for installation. Almost everything is ready once the parcel installation is done.

Suitable Hadoop framework for Ubuntu

I want to start working with Hadoop and big data. I need an easy graphical interface to start with. I tried Hue, but I couldn't get it configured.
Please help me choose a suitable Hadoop distribution.
I use Ubuntu 14.04.
I think Cloudera or the Sandbox (by Hortonworks) is the easy way. The hard way is installing directly on Ubuntu. I also use Ubuntu 14.04, with Hadoop (Hive, Pig) and Apache Spark installed on it, so I don't need to open a virtual machine.
There are 3 major Hadoop distributions that you can start with.
Cloudera
Hortonworks
MapR
Each of them has a UI installer and manager. I think the best option for you, though, would be to use the virtual environment that these vendors provide.
The Hortonworks Developer Sandbox is an image including Hue as UI to get started. However, the downloadable sandbox image is based on CentOS.
If you want to install a Hortonworks distribution on Ubuntu, you need to run an Ambari installation (Downloads - Hortonworks Hadoop). Be aware that Hue is not included in the default Ambari installation, but it can easily be installed separately. To run properly, Hue on Hortonworks still needs Python 2.6.x.
There are distributions like Cloudera or Hortonworks, but their packages need a high-spec machine, for example 16 GB of RAM or more, which is not always possible for the user. In addition, they include some Hadoop-related projects that the user doesn't need at all. If you want to enter this field seriously, I strongly recommend installing Hadoop on your own. That way you do the configuration yourself and get familiar with many Hadoop concepts.
You can start using this install tutorial.

How do I install Cloudera CDH on a 100-node cluster without using Cloudera Manager?

How do I install Cloudera CDH on a 100-node cluster without using Cloudera Manager? Installing and configuring CDH manually on each node in a cluster is a difficult task. What tools and technologies are used to automate this in production?
CDH supports both parcel-based and package-based installation. If you wish, you can do the package-based install with configuration management tools such as Puppet or Chef. However, the recommended way is to use Cloudera Manager to do a parcel-based installation. Cloudera Manager provides many features out of the box, including monitoring, configuration versioning, wizard-based security configuration, and rolling upgrades. If your reason for not using Cloudera Manager is that it is not open source, please note:
There is a free version of CM (some enterprise features are not available).
CM is just a management tool. Your data is still stored on HDFS, and your big data applications (Hive scripts, Spark/MapReduce applications, etc.) still run on the standard open source Hadoop platform, so there is no vendor lock-in.
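To give a feel for what automating a package-based install across many nodes involves, here is a minimal Python sketch that fans one shell command out to a list of hosts over SSH in parallel. It is a plain SSH loop, not Puppet or Chef, and it lacks the idempotence, templating, and reporting those tools provide. The helper names `build_ssh_command` and `run_on_hosts`, the `root` user, and the host names are all assumptions made for the example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_ssh_command(host, remote_command, user="root"):
    """Return the argv list for running one command on one host via ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_command]


def run_on_hosts(hosts, remote_command, max_workers=10, runner=subprocess.run):
    """Run the same command on every host; return {host: exit code}.

    `runner` is injectable so the fan-out logic can be tested without SSH.
    """
    def run_one(host):
        result = runner(
            build_ssh_command(host, remote_command),
            capture_output=True,
            text=True,
        )
        return host, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, hosts))


# Usage (assumes key-based SSH access and a configured CDH package repository;
# the package name is an example, so check it against your CDH release):
#   hosts = [f"node{i:03d}.example.com" for i in range(1, 101)]
#   codes = run_on_hosts(hosts, "yum install -y hadoop-hdfs-datanode")
#   failed = [h for h, code in codes.items() if code != 0]
```

Collecting exit codes per host matters at this scale: with 100 nodes, a few failures are likely, and you need to know which nodes to retry.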

Cloudera support for Docker containers, or Docker support for a CM 5 image

Recently my org has been considering Docker. Our group is using Cloudera CDH 5.1.2.
1) Is Cloudera compatible with Docker containers?
2) Are there any known issues with the combination of Docker and Cloudera?
I could not find any topic on docker in this forum.
Any pointer would be helpful.
Thanks,
Amit
An official answer from Cloudera has been posted here:
I read through what docker is, yesterday. I do not think this has
been tested, there are a number of platform virtualization projects in
progress, but I did not see this on the list.
Looking at its intent, it might work, but you would definitely want to
test. The thing I'm concerned about is the level of effort to
normalize between distribution types as there are a large volume of
subcomponents that are brought directly into the CDH "Parcel" that are
platform specific.
You might be able to get a CM server and agents deployed in a generic
way, but then you would want CM to manage the deployment of CDH parcel
across the target "cluster" once it was online, rather than
abstracting that install as well.
The bottom line is that installing Cloudera Manager inside a Docker container does not seem to be an easy route, because CM needs to manage the installation of the other Hadoop components.
Other options include:
Using Vagrant to create a CDH VM with Cloudera Manager (Cloudera Documentation Link)
Managing CDH components manually without Cloudera Manager (Cloudera Documentation Link)

How to Start working with Hadoop

Hi, I want to learn Hadoop. I have a basic idea of how Hadoop works with the MapReduce framework.
Now I want to practice on my local PC, so I want to know how to install Hadoop on a single node.
I installed VMware Workstation 10 and tried to install a Linux operating system to run Hadoop on, but I am not able to load Ubuntu into VMware Workstation. I get an error message like "Exiting intel ..., Operating System not found".
Can anyone please give me the steps to get started with a Hadoop installation?
Should I go for one of the distributions (Cloudera, Hortonworks, MapR)? If that is simpler, please tell me how to install them. (I even tried importing the Cloudera VMware file into VMware Workstation, but it did not work for me.)
You can use the VM given by Udacity for its course on Hadoop. I found it really easy to set up.
