I'm looking for a resource on setting these up to work together with a more recent version. All of the tutorials I see are for older versions of Hadoop. Does anyone know a good resource for 1.x?
I would use the Cloudera CDH3 or CDH4 guides to help you, though you may have to reinstall your version of Hadoop. I recommend Cloudera because they keep a repository that makes getting updates a breeze.
https://ccp.cloudera.com/display/CDH4DOC/HBase+Installation
I have run through this install a dozen times in the last week, so if you run into any particular issues, hit me up.
I haven't touched ES for a while. This morning I was trying to set up a cluster to play with and realised the good old head plugin is not maintained anymore. After a bit of googling I found ElasticHQ, and then realised it's not maintained anymore either.
So what open-source tool can I use to manage the ES cluster?
Welcome back!! I'm heavily using the following two very good head-like tools:
https://github.com/lmenezes/cerebro
https://elasticvue.com/
I am trying to run my model on the Philly cluster, which uses CNTK v2beta15 with py34. Could someone point me to the documentation for that particular version? Many commands and examples from CNTK v2.0 (stable) do not work on Philly. Also, I am running into issues installing v2beta15 locally: I downloaded the binaries and tried to run install.bat, but the machine crashes.
Any solution would be very helpful!
Python documentation for 2.0b15 is at
https://cntk.ai/pythondocs-ver/2_0-beta15/
Also, consider rephrasing your question to benefit other SO users. Is the Philly cluster something that the rest of the community should care about? If you have issues with internal infrastructure in your organization it might be best to contact the team responsible for that through non-public means.
I am a complete beginner to Hadoop and I saw various posts on the internet which describe installing the Cloudera VM using VMware. Recently I saw a YouTube video which shows how to install Hadoop on Ubuntu by downloading the Hadoop tar file from Apache, without installing the Cloudera VM. My question is:
What is the difference between the two approaches? Is there any benefit to using one over the other?
I want to learn Hadoop by myself and am looking for the best or most widely adopted way to learn it.
Cloudera is "yet another distribution of Hadoop". You can think of basic Hadoop as stock Android on Nexus phones and Cloudera Hadoop as Android on non-Nexus phones. It's basically a custom-built version.
Cloudera is more of a plug-and-play version, meaning you can download the VM and start playing with Hadoop.
On the other hand, Hadoop on Ubuntu is a get-your-hands-dirty mode where you work on building your own Hadoop.
Personal opinion: I suggest setting up your own Hadoop, since that helps you better understand Hadoop's internals and the Hadoop learning activities that follow.
Hope it helps. Happy Hadooping!
I spent a lot of time playing with the Cloudera software, and their Quickstart VM is good until you start trying to, e.g., add nodes. It was not designed to do that, but once you have invested time using it, it would be nice to use it as the basis for a real system.
So the next step would be to use CDH (Cloudera's 'proper' Hadoop), Hortonworks' version, HDP, or maybe even MapR (I've not used it).
CDH and HDP have nice GUI features over basic Hadoop and are seemingly easier to set up. HOWEVER, I spent a lot of time trying to get both CDH and HDP to work, unsuccessfully.
They give red lights and cryptic messages when things go wrong, and they add a layer of obfuscation when you try to fix things. For example, in plain Hadoop you can easily change the configuration files, but in CDH you can't access them directly; you have to discover where Cloudera hides its various options.
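To illustrate what "easily change the configuration files" can look like in plain Hadoop, here is a minimal sketch in Python. It assumes the usual `<configuration>`/`<property>` XML layout of files such as core-site.xml. The helper names `set_property` and `get_property` are made up for this example, and the host names and paths are placeholders, not values from a real cluster.

```python
import xml.etree.ElementTree as ET

# Sample in the style of a Hadoop 1.x core-site.xml (values are placeholders).
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of property `name`, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def set_property(xml_text, name, value):
    """Return the XML text with property `name` set to `value` (added if absent)."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                # Property exists but has no <value> child yet.
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No matching property found: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical host name and temp directory, for illustration only.
    updated = set_property(SAMPLE, "fs.default.name", "hdfs://namenode.example:9000")
    updated = set_property(updated, "hadoop.tmp.dir", "/tmp/hadoop-sketch")
    print(updated)
```

In practice you would read the file from your Hadoop conf directory, write the result back, and restart the affected daemons. Editing the file by hand in a text editor works just as well; the point is only that the settings live in plain XML you control.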
I would recommend using plain Hadoop unless you have a big organisation with lots of people and machines.
UPDATE: I have finally got HDP to work and it is really nice. It has a good Ambari GUI, and you can use Zeppelin notebooks to do fancy graphics.
I am very new to Hadoop and big data. Using Hortonworks I got an idea of Pig, Hive and the different types of analysis available with Hadoop, but I am still unclear about the development stage.
Please give me some examples of getting started building an application with Hadoop analysis support.
You should check out this link.
It will introduce you to Hadoop and its installation.
What would be the best way of installing Hadoop 1.0 (whether it is Apache Hadoop or CDH)? CDH seems to have some kind of installation manager, but somehow I can't find good information on the Web after a couple of hours of searching. I only found documentation about pseudo mode installation.
Just visit the Cloudera site. They have both the free Cloudera Manager, which is a very good starting point, and the standalone CDH package. They also have a complete set of documentation, such as installation guides for every version of these products.
Of course, I'd also recommend the Cloudera blog and the official Apache Hadoop site documentation for a better understanding.
I am using Apache Hadoop without many issues, except that I have to resolve any compatibility issues myself when using Hadoop ecosystem components such as Hive, Pig, Sqoop, etc.
Cloudera Manager, on the other hand, takes care of most of these compatibility issues and more or less provides you with a complete package with support.
Hope this helps!