which version of hadoop to be used with nutch 1.15 - hadoop

I'm planning to build a web crawler using nutch and solr. I want to know which version of hadoop should I install to work with nutch 1.15.

Nutch 1.15 is built with Hadoop 2.2.0 but it runs also on Hadoop installations using higher versions of Hadoop 2.x and 3.x.

Related

Install specific version of hadoop using Ambari

I am new to HDP installation using Ambari. I want to install Hadoop 2.9.0 using Ambari web installation. My Ambari version is 2.7.0.0 and I am using HDP 3.0 which has Hadoop 3.1.0. But I need to install Hadoop 2.9.0. Can someone please let me know if this can be done? And how can this be achieved?
I have not started the cluster installation yet and I'm done with Ambari installation.
Ambari uses pre-defined software stacks.
HDP does not offer any stack with Hadoop 2.9.0
You would therefore need to manually install that version of Hadoop yourself, although you can still manage the servers (but not the Hadoop configuration) using Ambari
In any case, there's little benefit to installing a lower version of the software, plus you won't get Hortonworks support if you do that

CDH components version numbers

I have an installed CDH cluster and used hadoop version, but it returns only with Hadoop version. Is there any way to get maybe all installed components version number on a graphical interface? Which command can get for example Spark version number?
Open CM (hostname:portnumber) -> Hosts tab -> Host Inspector to find what version of
CM and CDH is installed across all hosts in the cluster, as well as installed cdh components list with version details
Spark version can checked in using
spark-submit --version
Spark was developed separately from Hadoop-hdfs and Hadoop-mapreduce as a standalone tool which can be be used along with Hadoop, as such most of its interfaces are are different from hadoop.

How to build hadoop 2.7.1 plugins for eclipse

I want to build hadoop eclipse plugin for hadoop 2.7.1 version. So how to build this plugin?
AFAIK the hadoop eclipse plugin is very very old and unmaintained. Some time ago I tried to use it with hadoop 2.3 and even I was able to install it, it never worked. I don't recommend you waste your time with old plugins.

which version of flume should i use?

I'm using CDH 4.7.0 and will be installing Flume to feed HDFS data. I also downloaded Flume v1.4.0 from Apache (the same version that CDH comes with. There seem to be 2 flume-ng-core files between the one that comes with CDH and the one from Apache. There versions are 1.4.0 and 1.4.0-cdh4.7.0. Should I be using 1.4.0-cdh4.7.0 or can I safely use 1.4.0?
Flume 1.4.0 and Flume version 1.4.0-cdh4.7.0 are same but 1.4.0-cdh4.7.0 is compiled and tested with CDH4.7.0 therefore 1.4.0-cdh4.7.0 is risk free to use with CDH4.7.0.
Hence I recommend to use the cdh4.7.0 version of flume along with your CDH4.7.0 version.

Which version of CDH using Cloudera Manager automatically Installs JDK1.7?

I am using Cloudera Manager with CDH4.2.2 for my 3+1 cluster. On starting the installation with cloudera manager, it automatically downloads and installs JDK1.6. I want to use JDK1.7 with CDH for my convinience. Is it possible or is there any version of CDH which while installating Hadoop in the cluster automatically downloads and installs and successfully runs Hadoop with JDK1.7?
If yes, may I know which version of CDH is it and where do i get to download it from?
I want to work with JDK1.7 instead of 1.6 because i want to install Apache Giraph on CDH but it seems Giraph does not fit fine with JDK1.6 and needs the JDK1.7.
With Regards,
JDK 1.7 is supported for all CDH applications as of CDH 4.4 and Cloudera Manager 4.7.
That being said, no version of Cloudera Manager 4.x installs JDK 1.7 during the installation (latest version is 4.8.2). The only version of Cloudera Manager that installs JDK 1.7 automatically is 5.0.0.
To summarize: If you want an automated installation of JDK 1.7 via Cloudera Manager, you need to upgrade to CDH 5, and CM 5.0.0. Alternatively, you could upgrade to CDH4.4, and then perform a manual installation of JDK 1.7.

Resources