Can anyone tell me that how can i install apache storm on windows 7 ?
I am new to big data so need a little help.
Please explain what does not work. Storm is Java + Python so setting up on Windows should not be a problem. Zookeeper run on Windows just fine. There are many Vagrant / Docker implementations that will work as well. So what problem are you trying to resolve?
BTW, if you are trying to set it up for development, you dont need a cluster. You can run it with local cluster settings. (check storm documentation)
The general steps are:
Download Zookeeper
Untar and configure single node cluster
Download Storm
Follow Storm documentation and configure Nimbus/Supervisor settings
Follow Storm documentation to start Nimbus, Supervisor, Storm UI and Log Viewer.
Make sure you read documentation of 0.10.0 and 1.0.x These releases are not compatible and some of the libs you may want to use will not work with the new Storm release.
Related
I want to start working with Hadoop and BigData. I need an easy graphical interface to start. I try Hue but I couldn't get it configured.
Please help me to choose my suitable Hadoop.
I use Ubuntu 14.04.
I think Cloudera,sandbox(by hortonworks) is a easy way.Hard way is installation to Ubuntu.Also i have ubuntu 14.04 and Hadoop(hive,pig),Apache spark exist and i dont need open virtual machine.
There are 3 major Hadoop distributions that you can start with.
Cloudera
Hortonworks
MapR
Each one of them has a UI installer and manager. I think the best for you would be though, to use the virtual environment that these vendors provide.
The Hortonworks Developer Sandbox is an image including Hue as UI to get started. However, the downloadable sandbox image is based on CentOS.
If you want to install a Hortonworks Distribution on Ubuntu, you need to run an Ambari installation (Downloads - Hortonworks Hadoop). Be aware that Hue is not included into the default Ambari installation, but Hue can be installed easily separately. To run properly, Hue on Hortonworks still needs Python 2.6.x.
There are some distributions like Cloudera or Hortonworks but their package needs high machine configuration. For example RAM + 16GB and sometimes it's not possible for the user. In addition, they include some Hadoop related project that user doesn't need at all. If you want to enter this field seriously I strongly recommend installing Hadoop on your own. Doing that you do some configuration and will get familiar with many Hadoop concepts.
You can start using this install tutorial.
Recently my org is considering Docker. Our group is using cloudera CDH 5.1.2.
1) Does cloudera compatable with Docker container?
2) Is there any known issue related to docker and cloudera combination?
I could not find any topic on docker in this forum.
Any pointer would be helpful.
Thanks,
Amit
An official answer from Cloudera has been posted here :
I read through what docker is, yesterday. I do not think this has
been tested, there are a number of platform virtualization projects in
progress, but I did not see this on the list.
Lookt at its intent, it might work but you would definately want to
test. The thing I'm concearned about is the level of effort to
normalize between distribution types as there are a large volume of
subcomponents that are brought directly into the CDH "Parcel" that are
platform specific.
You might be able to get a CM server and agents deployed in a generic
way, but then you would want CM to manage the deployment of CDH parcel
across the target "cluster" once it was online, rather than
abstracting that install as well.
Bottom line is installing Cloudera Manager inside a Docker container does not seem to be an easy route, because CM needs to manage the installation of the other Hadoop components.
Other options include:
Using Vagrant to create a CDH VM with Cloudera Manager (Cloudera Documentation Link)
Managing CDH components manually without cloudera Manager (Cloudera Documentation Link)
I want to setup a cluster of Hadoop. Is there any problem if i use Centos v7 for operating system?
I already use Centos 6.5 and in this setup i want to use the newest version.
I test it my self and the result was this version is not stable i got a lot of crashes and randomly reboot in cluster so i back to v6.5 .
Hi I want to learn Hadoop.I have basic idea on how hadoop works with MapReduce framework.
Now i want to practice on my local PC so i want to know how to install hadoop on single Node.
I installed VM Workstation 10 and i tried to install any Linux flavour Operating system to install Hadoop , but iam not able to load Ubuntu into VM ware Workstation ,iam getting error as Exiting intel ...,Operating Not found message.
Can any one please provide me steps on how to start with Hadoop installation.
Should i go for any Distributions(Cloudera,Hortonworks,MapR).If that is simple then tell me how to install those distributions.(I tried even with Cloudera importing vmware file into VMWare workstation it did not worked for me)
You can use the VM given by Udacity for its course on Hadoop. I found it really easy to set up.
After reading this article...
http://blog.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/
If I were to make a brand new installation of hadoop to work with... is it still 0.23 today that has all the features? Or is there a better version that is out there now that has everything and captures all features and performance? There are so many guides out there that use 0.20... makes it seem as if 1.0 is not to be trusted...
Here is a guide I have followed at least three times to install and run on single node and two-node clusters and Michael does a pretty good job of keeping it current:
Running Hadoop on Ubuntu Linux (Single-Node Cluster)
Running Hadoop on Ubuntu Linux (Multi-Node Cluster)
This uses version Hadoop version 1.0.3 released in May 2012; The latest stable as of this writing is 1.1.2, but if you want to do a first install to test and become familiar a guide like the one above may help you familiarize with the system and then upgrade to the latest-one once you have a reference point.
Check the Hadoop documentation for the status of the different releases. As of now 1.0.4 is the stable release.
I came across this tutorial for setting up a single node cluster in ubuntu 12.04.
http://preciselyconcise.com/apis_and_installations/hadoop_installation.php. I followed the tutorial and i successfully installed hadoop 1.1.2 on my linux system.