Compatibility of HBase and Hadoop - hadoop

Are Hadoop 2.2.0 and HBase 0.98.0-hadoop2 compatible?
EDIT: I have already read http://hbase.apache.org/book/configuration.html.

No, they are not compatible.
See the release notes for the versions in question.
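As a quick sanity check (a sketch; $HBASE_HOME is assumed to point at your HBase install), you can list the Hadoop client jars bundled with an HBase distribution to see which Hadoop line it was built against:
$ ls $HBASE_HOME/lib | grep '^hadoop-'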

Related

What is the difference between the two downloadable versions of Giraph 1.2: giraph-dist-1.2.0-hadoop2-bin.tar.gz and giraph-dist-1.2.0-bin.tar.gz

What is the difference between
giraph-dist-1.2.0-hadoop2-bin.tar.gz and giraph-dist-1.2.0-bin.tar.gz?
Is there any documentation about that?
The only documentation that I found is the following:
Apache Hadoop 2 (latest version: 2.5.1)
This is the latest version of Hadoop 2 (supporting YARN in addition
to MapReduce) Giraph could use. You may tell Maven to use this version
with "mvn -Phadoop_2".

Parquet version used to write a file

Is there a way to find out which Parquet version was used to write a Parquet file in HDFS?
I'm trying to see whether various files were written using the same Parquet version or different versions.
$ hadoop jar parquet-tools-1.9.0.jar meta my-parquet-file.parquet | grep "parquet-mr version"
creator: parquet-mr version 1.8.1 (build 4aba4dae7bb0d4edbcf7923ae1339f28fd3f7fcf)
Another option, using parquet-tools:
parquet-tools meta --debug file.parquet
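To compare several files at once, a sketch along these lines should work (the /data path and the parquet-tools jar name are assumptions; adjust them to your environment):
# print the writer version recorded in each file's footer
for f in $(hdfs dfs -ls '/data/*.parquet' | awk '{print $NF}' | grep '\.parquet$'); do
  echo "== $f"
  hadoop jar parquet-tools-1.9.0.jar meta "$f" | grep "parquet-mr version"
done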

Is it possible to build Apache Spark against Hadoop 2.5.1?

After compiling Hadoop 2.5.1 with Maven (hadoop version now reports Hadoop 2.5.1), I tried to compile Apache Spark using the following command:
mvn -Pyarn -Phadoop-2.5 -Dhadoop.version=2.5.1 -Pdeb -DskipTests clean package
But apparently there is no hadoop-2.5 profile.
My question is: what should I do?
Rebuild against Hadoop 2.4?
Compile Spark with the hadoop-2.4 profile?
Or is there some other solution?
It looks like this was answered on the Spark user list after the poster asked there:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-1-0-with-Hadoop-2-5-0-td15827.html
"The hadoop-2.4 profile is really intended to be "Hadoop 2.4+". It
should compile and run fine with Hadoop 2.5 as far as I know. CDH 5.2
is Hadoop 2.5 + Spark 1.1, so there is evidence it works."
Just changing the profile name worked for me.
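That is, the adjusted command presumably looked like this (a sketch; only the profile changes from the command above, the other flags are the poster's):
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.1 -Pdeb -DskipTests clean package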
Thanks for the answers.

Sqoop installation with Hadoop 2.2.0?

I am trying to install all Apache Hadoop components on my system. I have installed hadoop-2.2.0, hive-0.11.0, pig-0.12.0, and hbase-0.96.0; now it's time to install Sqoop. Please suggest installation steps for a Sqoop version that is compatible with hadoop-2.2.0 and HBase.
Hoping for a reply soon.
Thanks in advance.
#Naveen: The link that you have provided is for Sqoop2. It is not specific to the Hadoop 2.0 branch. Basically, it tries to improve and extend Sqoop by changing the design to a client-server model (its major promises include ease of use, ease of extension, and security). For more details, see this video about Sqoop2: https://www.youtube.com/watch?v=hg683-GOWP4.
We can use the latest Sqoop (version 1.4.4, as a library compiled for Hadoop 2.0, or 1.4.5) from the ASF. Just download the build of Sqoop for the Hadoop 2.0 branch. For example, sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz can be downloaded and used without any issue with Hadoop 2.0+ versions.
If you cannot find a Sqoop build for Hadoop 2.0+ on the ASF site (I assume you are using a version earlier than 1.4.4), you would have to recompile the Sqoop source code for the Hadoop 2.0 branch. But that is not required, since you can just use the latest Sqoop version, which supports Hadoop 2.0. (Hopefully you are not looking for a production-ready Sqoop for Hadoop 2.0, since the recent Sqoop builds for Hadoop 2 are still in the alpha phase!)
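A minimal sketch of such an installation (the mirror URL and install prefix are assumptions; adjust them to your environment):
# download the Hadoop 2.0 build of Sqoop 1.4.5 and unpack it (may need sudo)
wget https://archive.apache.org/dist/sqoop/1.4.5/sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
tar -xzf sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz -C /usr/local
# put Sqoop on the PATH
export SQOOP_HOME=/usr/local/sqoop-1.4.5.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
sqoop version   # should report Sqoop 1.4.5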
I haven't tried Sqoop2 yet. It should also help, with its new enhancements, across Hadoop versions 1.0 and 2.0.
Thank you
Try these steps for installing Sqoop with Hadoop 2.2.0:
https://sqoop.apache.org/docs/1.99.1/Installation.html

Installing Pig on a single node

I installed Hadoop (1.0.2) for a single node on Windows 7 with Cygwin, and it is working. However, I cannot get Pig (0.10.0) to see Hadoop.
1) "Error: JAVA_HOME is not set."
I added this line to pig (under bin): export JAVA_HOME=/cygdrive/c/PROGRA~1/Java/jdk1.7.0_05
2) which: no hadoop in (/usr/local/b.....)
cygpath: cannot create short name of C:\pig-0.10.0\logs
Cannot locate pig.jar. do 'ant jar', and try again
I tried adding the lines below to pig, and it is still not finding Hadoop. What should I do?
export PIG_HOME="/cygdrive/c/pig-0.10.0"
export PATH=$PATH:$PIG_HOME/bin
export PIG_CLASSPATH=/cygdrive/hadoop/hadoop-1.0.2/conf
You might need to add your Hadoop install to your path as well, e.g.:
export HADOOP_INSTALL=/Users/yourname/dev/hadoop-0.20.203.0
export PATH=$PATH:$HADOOP_INSTALL/bin
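After exporting, a quick check (assuming the exports above are in effect in the current shell) confirms that the pig script can now find Hadoop:
$ which hadoop && hadoop version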
I had the same issue with pig-0.11. It seems this is a Cygwin-specific issue.
Copying pig-0.11.1-withouthadoop.jar to pig-withouthadoop.jar under PIG_HOME fixed the issue for me.
I was trying to set up Pig on my gateway machine, which has Windows 7 installed on it.
This issue is very specific to Cygwin.
After breaking my head for a couple of hours, I found the solution, and it is very simple:
just rename the jar file "pig-0.10.1-withouthadoop.jar" to "pig-withouthadoop.jar".
It's documented here.
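A sketch of that rename (the version in the file name depends on your install):
$ cd $PIG_HOME
$ mv pig-0.10.1-withouthadoop.jar pig-withouthadoop.jar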
Also, you can manually add the path (hadoop directory)\hadoop-v.v.v\bin to the environment variables in Windows 7. This will solve the problem.
which: no hadoop in (/usr/local/b.....)
You should visit this link for installing Pig 0.12 on Hadoop 2.2.0 without any errors, as it recompiles the Pig library for the specified Hadoop version:
http://javatute.com/javatute/faces/post/hadoop/2014/installing-pig-11-for-hadoop-2-on-ubuntu-12-lts.xhtml
After following the steps, you will get Pig running without any errors at the grunt shell:
% pig
Just enjoy.
I had a similar problem with Pig 0.12.0 (and Hadoop 1.0.3) installed on Fedora 19.
When trying any Pig command, e.g.
pig -help
I was getting the error:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop.jar', and try again
Hadoop and Pig installation /bin folders were properly included in my PATH.
Simply copying pig-0.12.0-withouthadoop.jar into the PIG_HOME folder fixed the issue for me.
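A sketch of that fix (the jar's original location is an assumption; adjust the source path to wherever the jar sits in your install):
$ cp /path/to/pig-0.12.0-withouthadoop.jar $PIG_HOME/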
