We have two clusters with hbase 0.94, hadoop 1.04 and hbase 0.98, hadoop 2.4
I've created a snapshot from table on 0.94 snapshot and want to migrate it to cluster with hbase 0.98.
After run this command on 0.98 cluster:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snapshot-name -copy-from webhdfs://hadoops-master:9000/hbase -copy-to hdfs://solr1:8020/hbase
I see:
Exception in thread "main" org.apache.hadoop.hbase.snapshot.ExportSnapshotException: Failed to copy the snapshot directory: from=webhdfs://hadoops-master:9000/hbase/.hbase-snapshot/snapshot-name to=hdfs://solr1:8020/hbase/.hbase-snapshot/.tmp/snapshot-name
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:916)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:1000)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:1004)
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:772)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:596)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:530)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:417)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:630)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:641)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:914)
... 3 more
Lars Hofhansl himself (a principal HBase committer and the maintainer of 0.94) has stated that exports are not supported from 0.94 and 0.98. So you are likely going nowehere with this.
Here is an excerpt from his thread just this afternoon:
It's tricky from multiple angles:- replication between 0.94 and 0.98 does not work (there's a gateway process that supposedly does that, but it's not known to be very reliable)- snapshots cannot be exported from 0.94 and 0.98
source: hbase user's mailing list today 12/15/14
UPDATE On the HBase mailing list there was a report that one user was able to find a way to do the export. A piece of that info is:
If exporting from an HBase 0.94 cluster to an HBase 0.98 cluster, you will need to use the webhdfs protocol (or possibly hftp, though I couldn’t get that to work). You also need to manually move some files around because snapshot layouts have changed. Based on the example above, on the 0.98 cluster do the following:
check whether any imports already exist for the table:
hadoop fs -ls /apps/hbase/data/archive/data/default
It would not be correct for me to reproduce the entire discussion here: a link to nabble that contains the entire gory details is:
http://apache-hbase.679495.n3.nabble.com/0-94-going-forward-td4066883.html
I am digging into this issue and it has more to do with the underlying HDFS.
Once the streams (in my case for distcp) have been written, close is called:
public void close() throws IOException {
try {
super.close();
} finally {
try {
validateResponse(op, conn, true);
} finally {
conn.disconnect();
}
}
}
Wherein it fails in the validate response call (probably the other end of connection has been closed).
This could be due to incompatibility between HDFS 1.0 and 2.4!
I've repacked the hadoop and hbase libs with jarjar https://code.google.com/p/jarjar/
It needs to fix some class name issues.
Then I've write a mapreduce copyTable job. It reads rows from 94 culster and writes to 98 cluster.
Here is the code:
https://github.com/fiserro/copy-table-94to98
Thanks github.com/falsecz for the idea and help!
Related
I'd like to know if it's possible to have an external table pointing to a DynamoDB table on AWS using Hive.
I'm not using AWS EMR, what I'm using is a Hadoop Stack configured through Apache Ambari.
Hive version: Hive 3.1.0.3.1.4.0-315
What I did was:
Downloaded the EMR Dynamo-Hive connector JARS directly from the maven repository: https://mvnrepository.com/artifact/com.amazon.emr
I loaded all the JARS in hive.aux.jars.path:
emr-dynamodb-hadoop-4.12.0.jar
emr-dynamodb-hive-4.12.0.jar
emr-dynamodb-tools-4.12.0.jar
hive1.2-shims-4.12.0.jar
hive1-shims-4.12.0.jar
hive2-shims-4.12.0.jar
hive2-shims-4.15.0.jar
shims-common-4.12.0.jar
shims-loader-4.12.0.jar
But when I try to create the table with:
CREATE EXTERNAL TABLE dynamo_LabDynamoHive
(id double, nome string)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
"dynamodb.table.name" = "LabDynamoHive",
"dynamodb.column.mapping" = "id:id,nome:nome"
);
I get the following error:
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Shim class for Hive version 3.1.1000 does not exist
INFO : Completed executing command(queryId=hive_20200422142624_6ebabdc8-8942-4025-84a8-411505d20895); Time taken: 0.203 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Shim class for Hive version 3.1.1000 does not exist (state=08S01,code=1)
I know I'm not loading a Shims JAR for Hive 3, but I'd like to know if any of you have tried and succeded in using an external table with DynamoDB using Hive 3 outside of EMR.
Any help or directions would be greatly appreciated!
problem is apparently the source code of this EMR connector is somewhat outdated and lacks Hive 3.x support recently introduced by AWS for EMR 6.0.
However, you can find a working 3.1 implementation here, forked from the official EMR connector: https://github.com/ramsesrm/emr-dynamodb-connector
Installation steps as follow:
1- compile the mentioned code (mvn clean package)
2- install the 3 JARs in your hive.aux.jars.path, along with aws-java-sdk-core and aws-java-sdk-dynamodb JARs from AWS (shim JARs are not required), 5 in total.
Tha's it. Don't forget to specify the region as a TBLPROPERTIES if you're not using the default US one.
I am running HBase on an EMR cluster (emr-5.7.0) enabled on S3.
We are using 'ImportTsv' and 'CompleteBulkLoad' utilities for importing the data into HBase.
During our process, we have observed that intermittently there were failures stating that there were HFile corruption for some of the imported files. This happens sporadically and there is no pattern that we could deduce for the errors.
After lot of research and going through many suggestions in blogs, I have tried the below fixes but of no avail and we are still facing the discrepancy.
Tech Stack :
AWS EMR Cluster (emr-5.7.0 | r3.8xlarge | 15 nodes)
AWS S3
HBase 1.3.1
Data Volume:
~ 960000 lines (To be upserted) | ~ 7GB TSV file
Commands used in sequence:
1) hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="|" -Dimporttsv.columns="<Column Names (472 Columns)>" -Dimporttsv.bulk.output="<HFiles Path on HDFS>" <Table Name> <TSV file path on HDFS>
2) hadoop fs -chmod 777 <HFiles Path on HDFS>
3) hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <HFiles Path on HDFS> <Table Name>
Fixes Tried:
Increasing S3 Max Connections:
We increased the below property but it did not seem to resolve the issue. fs.s3.maxConnections : Values tried -- 10000, 20000, 50000, 100000.
HBase Repair:
Another approach was to execute the HBase repair command but it didn't seem to help either.
Command : hbase hbase hbck -repair
Error Trace is as below:
[LoadIncrementalHFiles-17] mapreduce.LoadIncrementalHFiles: Received a
CorruptHFileException from region server: row '00218333246' on table
'WB_MASTER' at
region=WB_MASTER,00218333246,1506304894610.f108f470c00356217d63396aa11cf0bc.,
hostname=ip-10-244-8-74.ec2.internal,16020,1507907710216, seqNum=198
org.apache.hadoop.hbase.io.hfile.CorruptHFileException:
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem
reading HFile Trailer from file
s3://wbpoc-landingzone/emrfs_test/wb_hbase_compressed/data/default/WB_MASTER/f108f470c00356217d63396aa11cf0bc/cf/2a9ecdc5c3aa4ad8aca535f56c35a32d_SeqId_200_
at
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:497)
at
org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:525)
at
org.apache.hadoop.hbase.regionserver.StoreFile$Reader.(StoreFile.java:1170)
at
org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:259)
at
org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:427)
at
org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:528)
at
org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:518)
at
org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:667)
at
org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:659)
at
org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:799)
at
org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5574)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:2034)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34952)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.io.FileNotFoundException: File not present on S3 at
com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem$NativeS3FsInputStream.read(S3NativeFileSystem.java:203)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at
java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at
java.io.BufferedInputStream.read(BufferedInputStream.java:345) at
java.io.DataInputStream.readFully(DataInputStream.java:195) at
org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
at
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
Any suggestions in figuring out the root cause for this discrepancy would be really helpful.
Appreciate your help! Thank you!
After much research and trial & errors, I was finally able to find a resolution for this issue, thanks to AWS support folks. It seems the issue is an occurrence as a result of S3's eventual consistency. The AWS team suggested to use the below property and it worked like a charm, so far we haven't hit the HFile corruption issue. Hope this helps if someone is facing the same issue!
Property (hbase-site.xml):
hbase.bulkload.retries.retryOnIOException : true
I am trying to run spark-terasort with spark-1.6.1-bin-hadoop1 (pre-built package for hadoop 1.X).
When I try to run spark:
./bin/spark-submit --class com.github.ehiggs.spark.terasort.TeraGen ~/spark-terasort/target/spark-terasort-1.0-jar-with-dependencies.jar 100G hdfs:///input_terasort
I get the error:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.JobContext, but interface was expected
This may have to do with different Hadoop versions (between spark and spark-terasort). I have tried playing around with pom.xml (used to compile spark-terasort) but without much success.
How can I use spark-terasort with spark-1.6.1-bin-hadoop1?
The spark-terasort is old:
<scala.binary.version>2.10</scala.binary.version>
<spark.version>1.2.1</spark.version>
I am looking into patching it. Will get back..
Update I tried with 1.6.0-SNAPSHOT and TeraGen worked fine.
Input size: 1000KB
Total number of records: 10000
Number of output partitions: 2
Number of records/output partition: 5000
===========================================================================
===========================================================================
Number of records written: 10000
This was when running against local filesystem. I will look at real hdfs in about 12 hours from now.
has anyone had successful experience loading data to hbase-0.98.0 from pig-0.12.0 on hadoop-2.2.0 in an environment of hadoop-2.20+hbase-0.98.0+pig-0.12.0 combination without encountering this error:
ERROR 2998: Unhandled internal error.
org/apache/hadoop/hbase/filter/WritableByteArrayComparable
with a line of log trace:
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArra
I searched the web and found a handful of problems and solutions but all of them refer to pre-hadoop2 and base-0.94-x which were not applicable to my situation.
I have a 5 node hadoop-2.2.0 cluster and a 3 node hbase-0.98.0 cluster and a client machine installed with hadoop-2.2.0, base-0.98.0, pig-0.12.0. Each of them functioned fine separately and I got hdfs, map reduce, region servers , pig all worked fine. To complete an "loading data to base from pig" example, i have the following export:
export PIG_CLASSPATH=$HADOOP_INSTALL/etc/hadoop:$HBASE_PREFIX/lib/*.jar
:$HBASE_PREFIX/lib/protobuf-java-2.5.0.jar:$HBASE_PREFIX/lib/zookeeper-3.4.5.jar
and when i tried to run : pig -x local -f loaddata.pig
and boom, the following error:ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable (this should be the 100+ times i got it dying countless tries to figure out a working setting).
the trace log shows:lava.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
the following is my pig script:
REGISTER /usr/local/hbase/lib/hbase-*.jar;
REGISTER /usr/local/hbase/lib/hadoop-*.jar;
REGISTER /usr/local/hbase/lib/protobuf-java-2.5.0.jar;
REGISTER /usr/local/hbase/lib/zookeeper-3.4.5.jar;
raw_data = LOAD '/home/hdadmin/200408hourly.txt' USING PigStorage(',');
weather_data = FOREACH raw_data GENERATE $1, $10;
ranked_data = RANK weather_data;
final_data = FILTER ranked_data BY $0 IS NOT NULL;
STORE final_data INTO 'hbase://weather' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('info:date info:temp');
I have successfully created a base table 'weather'.
Has anyone had successful experience and be generous to share with us?
ant clean jar-withouthadoop -Dhadoopversion=23 -Dhbaseversion=95
By default it builds against hbase 0.94. 94 and 95 are the only options.
If you know which jar file contains the missing class, e.g. org/apache/hadoop/hbase/filter/WritableByteArray, then you can use the pig.additional.jars property when running the pig command to ensure that the jar file is available to all the mapper tasks.
pig -D pig.additional.jars=FullPathToJarFile.jar bulkload.pig
Example:
pig -D pig.additional.jars=/usr/lib/hbase/lib/hbase-protocol.jar bulkload.pig
I'm trying to user Solr with Nutch on a Windows Machine and I'm getting the following error:
Exception in thread "main" java.io.IOException: Failed to set permissions of path: c:\temp\mapred\staging\admin-1654213299\.staging to 0700
From a lot of threads I learned, that hadoop which seems to be used by nutch does some chmod magic that will work on Unix machines, but not on Windows.
This problem exists for more than a year now. I found one thread, where the code line is shown and a fix proposed. Am I really them only one who has this problem? Are all others creating a custom build in order to run nutch on windows? Or is there some option to disable the hadoop stuff or another solution? Maybe another crawler than nutch?
Here's the stack trace of what I'm doing:
admin#WIN-G1BPD00JH42 /cygdrive/c/solr/apache-nutch-1.6
$ bin/nutch crawl urls -dir crawl -depth 3 -topN 5 -solr http://localhost:8080/solr-4.1.0
cygpath: can't convert empty path
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
solrUrl=http://localhost:8080/solr-4.1.0
topN = 5
Injector: starting at 2013-03-03 17:43:15
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Failed to set permissions of path: c:\temp\mapred\staging\admin-1654213299\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
It took me a while to get this working but here's the solution which works on nutch 1.7.
Download Hadoop Core 0.20.2 from the maven repository
Replace $NUTCH_HOME/lib/hadoop-core-1.2.0.jar with the downloaded file renaming it with the same name.
That should be it.
Explanation
This issue is caused by hadoop since it assumes you're running on unix and abides by the file permission rules. The issue was resolved in 2011 actually but nutch didn't update the hadoop version they use. The relevant fixes are here and here
We are using Nutch too, but it is not supported for running on Windows, on Cygwin our 1.4 version had similar problems as you had, something like mapreduce too.
We solved it by using a vm (Virtual box) with Ubuntu and a shared directory between Windows and Linux, so we can develop and built on Windows and run Nutch (crawling) on Linux.
I have Nutch running on windows, no custom build. It's a long time since I haven't used it though. But one thing that took me a while to catch, is that you need to run cygwin as a windows admin to get the necessary rights.
I suggest a different approach. Check this link out. It explains how to swallow the error on Windows, and does not require you to downgrade Hadoop or rebuild Nutch. I tested on Nutch 2.1, but it applies to other versions as well.
I also made a simple .bat for starting the crawler and indexer, but it is meant for Nutch 2.x, might not be applicable for Nutch 1.x.
For the sake of posterity, the approach entails:
Making a custom LocalFileSystem implementation:
public class WinLocalFileSystem extends LocalFileSystem {
public WinLocalFileSystem() {
super();
System.err.println("Patch for HADOOP-7682: "+
"Instantiating workaround file system");
}
/**
* Delegates to <code>super.mkdirs(Path)</code> and separately calls
* <code>this.setPermssion(Path,FsPermission)</code>
*/
#Override
public boolean mkdirs(Path path, FsPermission permission)
throws IOException {
boolean result=super.mkdirs(path);
this.setPermission(path,permission);
return result;
}
/**
* Ignores IOException when attempting to set the permission
*/
#Override
public void setPermission(Path path, FsPermission permission)
throws IOException {
try {
super.setPermission(path,permission);
}
catch (IOException e) {
System.err.println("Patch for HADOOP-7682: "+
"Ignoring IOException setting persmission for path \""+path+
"\": "+e.getMessage());
}
}
}
Compiling it and placing the JAR under ${HADOOP_HOME}/lib
And then registering it by modifying ${HADOOP_HOME}/conf/core-site.xml:
fs.file.impl
com.conga.services.hadoop.patch.HADOOP_7682.WinLocalFileSystem
Enables patch for issue HADOOP-7682 on Windows
You have to change the project dependences hadoop-core and hadoop-tools. I'm using 0.20.2 version and works fine.