Hadoop remote file creation fails

I am trying to create a file in HDFS from outside the cluster using the HDFS APIs, like this:
Configuration conf = new Configuration();
conf.set("mapred.job.tracker", "192.168.56.101:54310");
conf.set("fs.default.name", "hdfs://192.168.56.101:54311");
FileSystem fs = FileSystem.get(conf);
fs.createNewFile(new Path("/app/hadoop/tmp/data/tools.txt"));
Getting error:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: Unknown protocol to job tracker: org.apache.hadoop.hdfs.protocol.ClientProtocol
at org.apache.hadoop.mapred.JobTracker.getProtocolVersion(JobTracker.java:370)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy1.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
at LineCounter.main(LineCounter.java:107)

The NameNode exposes two ports: the IPC port (8020 by default) and the web UI (50070 by default). To access the HDFS filesystem from a remote machine there is no need to set the property mapred.job.tracker; fs.default.name alone is sufficient. Also make sure that you are using HDFS client library versions that match the cluster, and that the NameNode is reachable from the remote machine.
Before connecting to the remote HDFS, find the value of fs.default.name (port 8020 by default) by checking core-site.xml on any node or edge node in the cluster. You can verify that the IPC port is correct by executing the following command on any of the nodes:
hadoop fs -ls hdfs://192.168.56.101:54310/
If you are able to access HDFS that way, you can pass the same HDFS URI to the conf.set method.
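For example, a minimal sketch along those lines (class name is mine; it assumes 54310 is the IPC port that core-site.xml reports for fs.default.name, so substitute your own value):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HdfsWriteTest {
    public static void main(String[] args) throws Exception {
        // Only the NameNode URI is needed for HDFS access; mapred.job.tracker is not.
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://192.168.56.101:54310");
        FileSystem fs = FileSystem.get(conf);
        fs.createNewFile(new Path("/app/hadoop/tmp/data/tools.txt"));
        fs.close();
    }
}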

Can you please check that the NameNode and JobTracker daemons are running and that the ports you have mentioned are correct?
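One quick way to check the daemons (a sketch; jps ships with the JDK) is to run this on the cluster node and confirm that both processes are listed:
jps | grep -E 'NameNode|JobTracker'
If either one is missing, start it before retrying the client code.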

I use the following and it works for me on a single node cluster on Linux VM:
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9000");
I would suggest you try 9000 as the port number; otherwise, find the correct port number in your Hadoop configuration file "core-site.xml". My file looks something like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>~/hacking/hd-data/tmp</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>~/hacking/hd-data/snn</value>
</property>
</configuration>
Also, as sachinjose pointed out, I would suggest that you remove the following from your code:
conf.set("mapred.job.tracker", "192.168.56.101:54310");

Related

Error launching Hadoop Streaming job

I am launching a Hadoop streaming job that fails. The line that launches it is:
hadoop jar $HADOOP_HOME/hadoop-streaming-2.6.0.2.2.5.3-1.jar -conf ~/HADOOP/conf/hadoop-cluster.xml -files aggregation_jobs -input /epcot -output /crowd_analytics/event_count/ -mapper "aggregation_jobs/streaming/event_count_map.py" -reducer "aggregation_jobs/streaming/event_count_reduce.py" -verbose >> output
The contents of the hadoop-cluster.xml file are these:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://fsdala12080.test.domain.com/</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>fsdala12081.test.domain.com:8032</value>
</property>
</configuration>
And I get the following error:
java.net.UnknownHostException: ww-am04035180-010082030080.test.domain.com: ww-am04035180-010082030080.test.domain.com: unknown error
at java.net.InetAddress.getLocalHost(InetAddress.java:1484)
at org.apache.hadoop.streaming.Environment.getHost(Environment.java:121)
at org.apache.hadoop.streaming.StreamUtil.<clinit>(StreamUtil.java:176)
at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:822)
at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:128)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.UnknownHostException: ww-am04035180-010082030080.test.domain.com: unknown error
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:907)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
at java.net.InetAddress.getLocalHost(InetAddress.java:1479)
... 13 more
The first part of the unknown host string is my local computer, which is not in the cluster that I am talking to. By the way, I can talk from my computer to the cluster and copy files from my local hard drive to HDFS on that cluster. Any ideas on why this might not be working? Why is it picking my local host? Is there an error in my submission of the job? Any pointers would be helpful.
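As the trace shows, the failure happens in org.apache.hadoop.streaming.StreamUtil when it resolves the local hostname, before the job ever reaches the cluster. A minimal sketch (class name is mine) that reproduces just that lookup outside Hadoop:
import java.net.InetAddress;
public class HostCheck {
    public static void main(String[] args) throws Exception {
        // Mirrors the InetAddress.getLocalHost() call in the stack trace; if it
        // throws UnknownHostException here, the submitting machine cannot resolve
        // its own hostname.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress());
    }
}
If this throws the same exception, adding the machine's hostname to /etc/hosts (or fixing its DNS entry) on the submitting box is the usual remedy.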

Hbase Hadoop integration issue

I am trying to configure HBase in pseudo-distributed mode, integrated with Hadoop which is already running in pseudo-distributed mode. The HBase master fails to start.
My hbase-site.xml looks like this:
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8030/hbase</value>
</property>
<!-- <property>
<name>hbase.rootdir</name>
<value>file:/home/hadoop/HBase/HFiles</value>
</property> -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
</configuration>
The HBase master fails to start and the following error is written to hbase-root-master-bdhost.log:
2016-01-08 17:48:38,333 FATAL [bdhost:16000.activeMasterManager] master.HMaster: Failed to become active master
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN]
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy16.setSafeMode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:602)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy17.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2264)
at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:986)
at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:970)
at org.apache.hadoop.hbase.util.FSUtils.isInSafeMode(FSUtils.java:524)
at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:970)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:417)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:126)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:649)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
at java.lang.Thread.run(Thread.java:745)
2016-01-08 17:48:38,334 FATAL [bdhost:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN]
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy16.setSafeMode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
....
I'm using hadoop 2.6.3 and hbase 1.1.2 on Fedora Linux release 21.
I tried disabling SELinux and IPv6, but that did not help.
Any pointer is much appreciated.
Thanks.
In the following property, try giving the name as rootDir (with an uppercase D) and let me know if it works. Of course, make sure HDFS is actually running on the port mentioned in the property.
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8030/hbase</value>
</property>
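A quick way to check the port (a sketch; run on any cluster node) is to ask the Hadoop client configuration for the NameNode URI and then try listing the HBase root directory against the URI from hbase-site.xml:
hdfs getconf -confKey fs.defaultFS
hadoop fs -ls hdfs://localhost:8030/hbase
If the second command cannot connect, the port in hbase.rootdir does not match the NameNode and should be changed to whatever the first command reports.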
Ravi

Hadoop Security configuration related with hive and sqoop

I'm using sqoop-1.4.6 to import data from MSSQL into hadoop-2.7.1.
Using Sqoop itself I can successfully list the tables in MSSQL, which means it works fine. But when I try to import into Hadoop, the following error is raised:
ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/libjars/opencsv-2.3.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
So I checked the DataNode log file, which gave the following information:
org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client. Perhaps the client is running an older version of Hadoop which does not support encryption.
Any idea how to change the configuration or how to deal with this problem?
Update:
It turns out that the problem began after I changed some configuration files. And the problem is not limited to Sqoop; Hive has the same problem.
Configuration that I changed:
core-site.xml
<property>
<name>hadoop.rpc.protection</name>
<value>privacy</value>
</property>
hdfs-site.xml
<property>
<name>dfs.encrypt.data.transfer</name>
<value>true</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.suites</name>
<value>AES/CTR/NoPadding</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
<value>256</value>
</property>
Thanks
Try setting dfs.block.access.token.enable to true.
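For reference, a minimal sketch of that change in hdfs-site.xml (restart the NameNode and DataNodes afterwards); encrypted data transfer derives its keys from block access tokens, so the two settings go together:
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>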

Using Oozie to create a hive table on hbase causes an error with libthrift?

I'm using an oozie hive action on cloudera (cdh 4) to create an hbase hive table. Running the create table command on my local dev util box executes without error. When I execute the same command via an oozie hive action in the cluster, I get this error:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], main() threw exception, org.apache.thrift.EncodingUtils.setBit(BIZ)B
java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B
at org.apache.hadoop.hive.ql.plan.api.Query.setStartedIsSet(Query.java:487)
at org.apache.hadoop.hive.ql.plan.api.Query.setStarted(Query.java:474)
at org.apache.hadoop.hive.ql.QueryPlan.updateCountersInQueryPlan(QueryPlan.java:309)
at org.apache.hadoop.hive.ql.QueryPlan.getQueryPlan(QueryPlan.java:450)
at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:622)
at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:504)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1106)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:713)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:302)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:260)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Googling around, most answers said that this was due to different versions of thrift on hive, hbase, or hadoop; but as far as I can tell (using find -name in a shell action) they all have version 0.9.0:
Stdoutput ./lib/flume-ng/lib/libthrift-0.9.0.jar
Stdoutput ./lib/hcatalog/share/webhcat/svr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.5.0.jar
Stdoutput ./lib/hive/lib/libthrift-0.9.0-cdh4-1.jar
Stdoutput ./lib/oozie/libserver/libthrift-0.9.0.jar
Stdoutput ./lib/oozie/libtools/libthrift-0.9.0.jar
Stdoutput ./lib/hbase/lib/libthrift-0.9.0.jar
Stdoutput ./lib/mahout/lib/libthrift-0.9.0.jar
These same versions are on my dev util box, and the hive command works fine. Any ideas what could be causing this issue?
Thanks in advance!
The issue was with a jar included in the workflow's lib directory. That jar had transitive dependencies that pulled in an older version of Thrift.
I was able to work around this by making the hive action happen in a sub-workflow, then setting
<global>
<configuration>
<property>
<name>oozie.use.system.libpath</name>
<value>false</value>
</property>
<property>
<name>oozie.libpath</name>
<value>${wf:appPath()}/lib</value>
</property>
</configuration>
</global>
on the workflow. This essentially told it to use the lib in my subworkflow's directory, not the main workflow's lib (which included the bad jar).
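If you would rather find the offending jar than isolate it, a rough sketch (paths are illustrative; pull the workflow's lib directory down from HDFS first, e.g. hadoop fs -get <app-path>/lib lib) is to look for bundled Thrift classes:
for j in lib/*.jar; do
  unzip -l "$j" | grep -q 'org/apache/thrift/' && echo "$j bundles Thrift classes"
done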

Oozie Hive workflow intermittently failing

I currently have a problem where, intermittently, my Oozie workflow is unable to connect to my Hive metastore. It seems like it is running out of connections to the hive-metastore?
Caused by: MetaException(message:Could not connect to meta store using
any of the URIs provided. Most recent failure:
org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:277)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:163)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:1013)
at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:1000)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:8732)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8097)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:347)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:908)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:261)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:238)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 44 more
)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:323)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:163)
... 42 more
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
I can run Hive from the command line and use Hue to communicate with the hive-metastore and execute queries. This only seems to happen halfway through my Oozie workflows. Each and every hive action I submit has the hive-site.xml config set as oozie.hive.defaults.
hive-site.xml
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
hive-metastore, hive-server2, mysql-server and oozie all run on the same host for the moment, so localhost works. Any ideas? I have the Oozie share lib enabled by default and the sharelibs have been created.
CDH 4.2.1 packages
-oozie: oozie-3.3.0
-hive-metastore: hive-metastore-0.10.0
-hive-server2: hive-server2-0.10.0
-mysql-server: mysql-server-5.1.69-1
Any help would be greatly appreciated
oozie.hive.defaults is deprecated in the Hive action. Can you try having Job XML point to a hive-site.xml uploaded to HDFS?
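Roughly, a sketch of what that looks like in the workflow's hive action (action name, script name, and schema version here are placeholders; hive-site.xml is assumed to sit next to workflow.xml in the application directory on HDFS):
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>hive-site.xml</job-xml>
<script>script.q</script>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>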
