Which version of Apache Storm is StormCrawler currently compatible with?

What version of Apache Storm should I have installed in order to use the latest version of StormCrawler?
Apache Storm 1.2.3 or 2.0.0?
Thanks

At the time of writing, the master branch of StormCrawler is built against Storm 1.2.3, as shown in its pom.xml, but there is also a 2.x branch available.
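For reference, here is a minimal pom.xml sketch of what a topology project built on that branch depends on; the Storm coordinates are standard, while the StormCrawler coordinates and version are as I recall them from Maven Central, so verify them against the pom.xml of the branch you use:

    <dependencies>
      <!-- Storm itself is provided by the cluster at runtime -->
      <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>1.2.3</version>
        <scope>provided</scope>
      </dependency>
      <!-- StormCrawler core, from the 1.x line that targets Storm 1.2.3 -->
      <dependency>
        <groupId>com.digitalpebble.stormcrawler</groupId>
        <artifactId>storm-crawler-core</artifactId>
        <version>1.16</version> <!-- or whichever 1.x release is current -->
      </dependency>
    </dependencies>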

Related

Which version of Hadoop should be used with Nutch 1.15?

I'm planning to build a web crawler using Nutch and Solr. I want to know which version of Hadoop I should install to work with Nutch 1.15.
Nutch 1.15 is built with Hadoop 2.2.0, but it also runs on Hadoop installations using higher 2.x and 3.x versions.
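Nutch 1.x is built with Ant/Ivy rather than Maven, so the Hadoop version it compiles against is declared in ivy/ivy.xml; the sketch below is only illustrative of what such an entry looks like (check the exact artifact list, rev and conf values in the Nutch 1.15 source tree):

    <!-- ivy/ivy.xml (illustrative): the Hadoop artifacts Nutch compiles against -->
    <dependency org="org.apache.hadoop" name="hadoop-common" rev="2.2.0" conf="*->default"/>
    <dependency org="org.apache.hadoop" name="hadoop-mapreduce-client-core" rev="2.2.0" conf="*->default"/>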

Why was Hadoop 2.6.4 released after 2.7.2?

Apache Hadoop Release
I downloaded Hadoop 2.6.4, but I see that 2.7.2 was released earlier than 2.6.4, and also that 2.7.2 is marked stable. How can 2.7.2 be more stable than 2.6.4, and how could 2.7.2 have been released earlier than 2.6.4?
I guess 2.7.x are the stable versions and 2.6.x are the unstable versions?
The Hadoop community decided to drop the JDK 6 runtime and support JDK 7+ only.
In the 2.x line, this started with release 2.7.0.
To extend support for Hadoop users still on JDK 6, maintenance releases continued on the 2.6.x branch.
Summary
2.7.x versions carry the roadmap features and require JDK 7.
2.6.x versions are point releases containing critical fixes and keep JDK 6 support.
So, to answer the question: 2.6.x releases published after 2.7.x are point releases on the 2.6.x branch containing critical fixes.
As far as stability is concerned, releases 2.7.1 and 2.7.2 are stable.
Sources
http://hadoop.apache.org/releases.html
Check the POM configuration of Hadoop's 2.7.x and 2.6.x branches on the GitHub mirrors for JDK compatibility.
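As an example of that check, the target compiler level is set as a property in the Hadoop project POM; the property name and values below are from memory and may differ between branches, so treat them as illustrative only:

    <!-- hadoop-project/pom.xml, 2.7.x line (illustrative) -->
    <properties>
      <javac.version>1.7</javac.version>
    </properties>

    <!-- hadoop-project/pom.xml, 2.6.x line (illustrative) -->
    <properties>
      <javac.version>1.6</javac.version>
    </properties>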

Which version of Flume should I use?

I'm using CDH 4.7.0 and will be installing Flume to feed data into HDFS. I also downloaded Flume v1.4.0 from Apache (the same version that CDH ships with). There seem to be two different flume-ng-core files, the one that comes with CDH and the one from Apache; their versions are 1.4.0 and 1.4.0-cdh4.7.0. Should I be using 1.4.0-cdh4.7.0, or can I safely use 1.4.0?
Flume 1.4.0 and Flume 1.4.0-cdh4.7.0 are essentially the same, but 1.4.0-cdh4.7.0 is compiled and tested against CDH 4.7.0, which makes it the lower-risk choice on that platform.
Hence I recommend using the cdh4.7.0 build of Flume with your CDH 4.7.0 installation.
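If you also compile your own components against Flume, the same distinction shows up in the Maven coordinates: the CDH-aligned build carries the -cdh4.7.0 version suffix and is published in Cloudera's repository. A minimal pom.xml sketch, assuming the standard Cloudera repository URL:

    <repositories>
      <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      </repository>
    </repositories>

    <dependencies>
      <dependency>
        <groupId>org.apache.flume</groupId>
        <artifactId>flume-ng-core</artifactId>
        <!-- the CDH-aligned build, rather than the upstream 1.4.0 -->
        <version>1.4.0-cdh4.7.0</version>
      </dependency>
    </dependencies>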

Region server fails in Cloudera after phoenix-4.3.0-server.jar is added

I have added phoenix-4.3.0-server.jar to /opt/cloudera/parcels/CDH/lib/hbase/lib in the Cloudera installation.
When I try to start the servers (region and master), only the master server starts. The region server sometimes comes up momentarily and then goes down immediately.
This worked fine with the previous version of Phoenix (4.0.0-incubating).
Kindly help me with a workaround for this problem.
The failure when upgrading from 4.0.0 to 4.3.0 was a compatibility issue: for some reason, 4.3.0 could not be upgraded to directly from such an old version of Phoenix.
So, upgrading from 4.0.0 to 4.1.0, restarting the HBase server, and then upgrading from 4.1.0 to 4.3.0 with another HBase restart resolved the problem.

How to build the eclipse-plugin for CDH4 (4.2.0)

Does anyone know how to build the eclipse-plugin for CDH4 (4.2.0)?
I googled all morning and only found tips for 4.1.2 or older versions.
My Hadoop cluster was built with Cloudera Manager, and the latest version of CDH is 4.2.0. I don't know how to build the eclipse-plugin; should I just roll back my CDH version to 4.1.2? If so, how do I do that?
