I am using talend 5.4 /5.5 to connect to cdh 5.1. A three node cluster
N1: CM, HIVE(all the services ),Datanode, Zookeeper.... etc
N2:RM, Datanode
N3: Datanode
when I am trying to load data from hdfs to hive table is failing where as same command from cli works just fine.
hive> LOAD DATA INPATH '/user/thor/test/rev_sub.txt' INTO TABLE revenue_subs;
when I am running the talend job with tHiveLoad component I am getting following exception
[INFO ]: hive.metastore - Trying to connect to metastore with URI thrift://txwlcloud1:9083
[WARN ]: org.apache.hadoop.security.UserGroupInformation - No groups available for user thor
[INFO ]: hive.metastore - Waiting 1 seconds before next connection attempt.
[INFO ]: hive.metastore - Connected to metastore.
[ERROR]: org.apache.hadoop.hive.ql.Driver - FAILED: SemanticException Line 1:17 Invalid path ''/user/thor/test/rev_sub.txt''
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:17 Invalid path ''/user/thor/test/rev_sub.txt''
at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:148)
at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:229)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:355)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:82)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:129)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:209)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:191)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:197)
at big_data.hivejob_0_1.HIVEJob.tHiveLoad_1Process(HIVEJob.java:375)
at big_data.hivejob_0_1.HIVEJob.runJobInTOS(HIVEJob.java:645)
at big_data.hivejob_0_1.HIVEJob.main(HIVEJob.java:504)
Caused by: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: "TXWLHPW295/10.215.206.241"; destination host is: "txwlcloud2":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:763)
at org.apache.hadoop.ipc.Client.call(Client.java:1241)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
I am struggling with this issue for a while.
The possible reason could be
1) jdbc driver issue. Di I have to put the jdbc driver jar somewhere in the cluster? Or it is already there?
2) some thing to do with remote metastore
It will be great help if you guys could point out why the load is failing
When I did beeline> !connect jdbc:hive2://10.215.204.xyz:10000 thor org.apache.hive.jdbc.HiveDriver it is returning a correct connection.
Thanks,
Amit
Related
I have weird situation. When I'm running pig script as test1 user, script executes successfully:
pig -param_file /tmp/pig_parameters.param -param DBNAME=default -param TABLENAME=test_pig_table_orc -param FPATH=/data/170622164344.csv /tmp/test.pig
2017-10-31 14:40:40,968 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2017-10-31 14:40:41,057 [Thread-7] INFO hive.metastore - Closed a connection to metastore, current connections: 1
2017-10-31 14:40:41,058 [Thread-7] INFO hive.metastore - Closed a connection to metastore, current connections: 0
Scripts simple load data from csv and stores data into hive table
But when I connect to the server as another user - test2, and run the same script, got this exception :
Pig Stack Trace
---------------
ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
org.apache.pig.impl.plan.VisitorException: ERROR 1115:
<line 27, column 0> Output Location Validation Failed for: 'default.test_pig_table_orc More info to follow:
org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:311)
at org.apache.pig.PigServer.compilePp(PigServer.java:1392)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1317)
at org.apache.pig.PigServer.execute(PigServer.java:1309)
at org.apache.pig.PigServer.executeBatch(PigServer.java:387)
at org.apache.pig.PigServer.executeBatch(PigServer.java:365)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.processScript(GruntParser.java:504)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Script(PigScriptParser.java:1014)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:550)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:547)
at org.apache.pig.Main.main(Main.java:158)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.PigException: ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:196)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:68)
... 30 more
Caused by: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:220)
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:191)
... 31 more
Caused by: org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1254)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1240)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1263)
at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:180)
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:91)
... 33 more
Both users are members of supergroup and have equal permissions.
Script runs from the same server.
Tried to place script .pig file localy and on hdfs as well - the same error
Also important point, that it runs successfully from each worker, except master node. Cluster has kerberos authentication
Got stuck with this issue, pls suggest what I could try to fix it?
Solved, by removing hive-site.xml from test2 user home folder. Or just simply run script being in another directory
In my case there was an old hive-site.xml without kerberos configuration parameters in test2 user home folder. When this user ran pig script, by default it applied file conf parameters from home folder (not only hive), if they are located there.
I have recently spent several days trying to run the phoenix thin (queryserver.py and sqlline-thin.py) and thick via zookeeper to secure cluster.But, I could not able to start or connect the phoenix service on secure cluster.
Faced Below issues on phoenix thin and thick clients
17/09/27 08:41:47 WARN util.NativeCodeLoader: Unable to load native-hadoop libra
ry for your platform... using builtin-java classes where applicable
Error: java.lang.RuntimeException: java.lang.NullPointerException (state=,code=0
)
java.sql.SQLException: java.lang.RuntimeException: java.lang.NullPointerExceptio
n
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(Connecti
onQueryServicesImpl.java:2465)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(Connecti
onQueryServicesImpl.java:2382)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExe
cutor.java:76)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQ
ueryServicesImpl.java:2382)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(Phoe
nixDriver.java:255)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(Phoeni
xEmbeddedDriver.java:149)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
at sqlline.Commands.connect(Commands.java:1064)
at sqlline.Commands.connect(Commands.java:996)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.jav
a:38)
at sqlline.SqlLine.dispatch(SqlLine.java:809)
at sqlline.SqlLine.initArgs(SqlLine.java:588)
at sqlline.SqlLine.begin(SqlLine.java:661)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:291)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(R
pcRetryingCaller.java:218)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:
326)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanne
r.java:301)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConst
ruction(ClientScanner.java:166)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.jav
a:161)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:797)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.
java:602)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccess
or.java:366)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java
:406)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(Connecti
onQueryServicesImpl.java:2410)
... 20 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.getMetaReplicaNode
s(ZooKeeperWatcher.java:489)
at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailabl
e(MetaTableLocator.java:558)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocatio
n(ZooKeeperRegistry.java:61)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplement
ation.locateMeta(ConnectionManager.java:1211)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplement
ation.locateRegion(ConnectionManager.java:1178)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getR
egionLocations(RpcRetryingCallerWithReadReplicas.java:305)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(Scann
erCallableWithReplicas.java:156)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(Scann
erCallableWithReplicas.java:60)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(R
pcRetryingCaller.java:210)
... 29 more
sqlline version 1.2.0
0: jdbc:phoenix:Namenode1>
Followed below steps
Removed the hbase-site.xml file on phoenix and set the HBase package environment.
Provided the Kerberos principal and keytab for the Phoenix Query Server in the $HBASE_CONF_DIR/hbase-site.xml file.
phoenix.queryserver.kerberos.principal
HTTP/webuser#Domain.COM
phoenix.queryserver.kerberos.keytab
C:\KeyTab\webuser.keytab
Execute thin server and Client by
a. Start Query server - /bin>python queryserver.ph which started properly.
b. Start thin client - /bin>python sqlline.py http://namenode1:8765;authentication=SPENGO;principal=HTTP/webuser#Domain.COM;keytab=C:\\KeyTab\\webuser.keytab
4.Execute on Thick client by below commands
/bin> python sqlline.py Namenode1:2181:/hbase-secure:HTTP/webuser#Domain.COM:C:\\KeyTab\\webuser.keytab
Let me know, Any other configurations are required an Apache phoenix with secure cluster.
I have added ojdbc.jar file in /usr/lib/sqoop/lib and I am trying to connect oracle to hadoop using sqoop but facing error.
I am using following command:
sqoop list-tables --connect jdbc:oracle:thin://#192.162.2.8:1521:orcl --username hr --password abc
But the i get following error:
15/05/05 09:21:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to rollback transaction
java.lang.NullPointerException
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:596)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to list tables
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at com.cloudera.sqoop.manager.OracleManager.makeConnection(OracleManager.java:275)
at com.cloudera.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:51)
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:585)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:439)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:454)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:693)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:251)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
... 16 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:149)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:405)
is there anyhthing wrong with the sqoop command.?
The error "network adaptor could not establish connection" is coming because of incorrect jdbc url. Jdbc url in your sqoop command should be in this format: jdbc:oracle:thin:#192.162.2.8:1521:orcl
The connection refused error may occur by scenarios as far as I know.
The Oracle service might not be running on the specified host on the
given port number.
The firewall in between might restrict the client access to the
oracle server through the given port number.
So I suggest you to first confirm the oracle host, port and the firewall restriction in between.
you can easily check the access by using telnet as below,
telnet 192.162.2.8 1521
See if the listener and the database are initiated. I just started the listener (lsnrctl start) and the database (sqlplus / as sysdba and startup) and it worked.
I setup a Hadoop cluster with security by Kerberos, Hive has been enable Sentry. And I have problem with Hue - Hive (Beeswax) Editor. Hue can't load data, information from hive, in hive-server2 log :
2014-04-03 11:36:39,814 WARN thrift.ThriftCLIService (ThriftCLIService.java:GetSchemas(364)) - Error getting catalogs:
org.apache.hive.service.cli.HiveSQLException: Invalid SessionHandle: SessionHandle [de47ccb1-0bf0-44f0-b15b-c07fd62b1134]
at org.apache.hive.service.cli.session.SessionManager.getSession(SessionManager.java:156)
at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:222)
at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:359)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:603)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2014-04-03 11:36:39,815 INFO thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(203)) - Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V5
2014-04-03 11:36:39,816 WARN thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(212)) - Error opening session:
org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilage of hue for admin
at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(ThriftCLIService.java:556)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getUserName(ThriftCLIService.java:236)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:242)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:206)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:603)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: Unauthorized connection for super-user: hue from IP /10.199.91.97
at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:165)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.authorizeProxyAccess(HadoopShimsSecure.java:585)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getProxyUser(ThriftCLIService.java:552)
... 12 more
Can anyone help me?
Thank you
Is Hive impersonation turned on? When using Sentry it should be off that way the Hive user can access the data according to Sentry privileges. This Hive with Sentry post details it more.
I am using cdh3 update 4 tarball for development purpose. I have hadoop up and running. Now, I also downloaded equivalent flume tarball from cloudera viz 1.1.0 and tried writing a tail of log file into hdfs using hdfs-sink. When I run the flume agent, it starts okay but ends up in error when it attempts writing the new event data into hdfs. I couldn't find better group to post this question than stackoverflow.
here is flume configuration I am using
agent.sources=exec-source
agent.sinks=hdfs-sink
agent.channels=ch1
agent.sources.exec-source.type=exec
agent.sources.exec-source.command=tail -F /locationoffile
agent.sinks.hdfs-sink.type=hdfs
agent.sinks.hdfs-sink.hdfs.path=hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess
agent.channels.ch1.type=memory
agent.channels.ch1.capacity=1000
agent.sources.exec-source.channels=ch1
agent.sinks.hdfs-sink.channel=ch1
Also, this is a small snippet of error that gets displayed in console when it receives new event data and tries writing it into hdfs.
13/03/16 17:59:21 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060424.tmp
13/03/16 17:59:22 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1164)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.create(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.create(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)
at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1215)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1173)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:272)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:261)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:78)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:805)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1060)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:369)
at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:65)
at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:49)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:190)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:50)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:157)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:154)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:127)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:154)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:316)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:718)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:715)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
at sun.nio.ch.IOUtil.write(IOUtil.java:71)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:62)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:153)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:114)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:861)
at org.apache.hadoop.ipc.Client.call(Client.java:1141)
... 37 more
13/03/16 17:59:27 INFO hdfs.BucketWriter: Creating hdfs://localhost:8020/user/hdfs-user/flume/apacheaccess.1363436060425.tmp
13/03/16 17:59:27 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "sumit-HP-Pavilion-dv3-Notebook-PC/127.0.0.1"; destination host is: "localhost":8020;
As people in cloudera mail list suggest, there are probable reasons of this error:
The HDFS safemode is turned on. Try to run hadoop fs -safemode leave and see if the error goes away.
Flume and Hadoop versions are mismatched. To check this replace the hadoop-core.jar in flume/lib directory with the one found in hadoop's installation folder.