WebHCat error in getting Hive table metadata: "Command was terminated due to timeout(10000ms). See templeton.exec.timeout property", exitcode 143

If I issue this WebHCat REST call in my Cloudera 5.4.1 environment
curl -s 'http://mywebhcat:50111/templeton/v1/ddl/database/default/table/person?user.name=admin&format=extended'; echo; echo;
everything works fine and I see the metadata for the table Person.
But there is another table called foo_bar. If I change the REST call above to
curl -s 'http://mywebhcat:50111/templeton/v1/ddl/database/default/table/foo_bar?user.name=admin&format=extended'; echo; echo;
then I get an error:
{"statement":"use default; show table extended like
foo_bar;","error":"unable to show table:
foo_bar","exec":{"stdout":"","stderr":"which: no /opt/cloudera/parcels/
CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/bin/hadoop in ((null))\ndirname: missing operand\nTry
`dirname --help' for more information.\nlog4j:ERROR setFile(null,true) call failed.\njava.io
.FileNotFoundException: /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hive/logs/hcat.
log (No such file or directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat
java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat java.io.FileOutputStream.<
init>(FileOutputStream.java:213)\n\tat java.io.FileOutputStream.<init>(FileOutputStream.
java:133)\n\tat org.apache.log4j.FileAppender.setFile(FileAppender.java:294)\n\tat org.
apache.log4j.FileAppender.activateOptions(FileAppender.java:165)\n\tat org.apache.log4j.
DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)\n\tat org.apache
.log4j.config.PropertySetter.activate(PropertySetter.java:307)\n\tat org.apache.log4j.config
.PropertySetter.setProperties(PropertySetter.java:172)\n\tat org.apache.log4j.config.
PropertySetter.setProperties(PropertySetter.java:104)\n\tat org.apache.log4j.
PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)\n\tat org.apache.log4j.
PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)\n\tat org.apache.log4j.
PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)\n\tat org.apache.
log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)\n\tat org.apache.log4j
.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)\n\tat org.apache.log4j.
PropertyConfigurator.configure(PropertyConfigurator.java:415)\n\tat org.apache.hadoop.hive.
common.LogUtils.initHiveLog4jDefault(LogUtils.java:127)\n\tat org.apache.hadoop.hive.common.
LogUtils.initHiveLog4jCommon(LogUtils.java:77)\n\tat org.apache.hadoop.hive.common.LogUtils.
initHiveLog4j(LogUtils.java:58)\n\tat org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.
java:65)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect
.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.
DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.
reflect.Method.invoke(Method.java:497)\n\tat org.apache.hadoop.util.RunJar.run(RunJar.
java:221)\n\tat org.apache.hadoop.util.RunJar.main(RunJar.java:136)\nlog4j:ERROR Either
File or DatePattern options are not set for appender [DRFA].\nOK\nTime taken: 1.035
seconds\n Command was terminated due to timeout(10000ms). See templeton.exec.timeout
property","exitcode":143}}
I don't know why it complains about a missing log directory only for the foo_bar table, yet successfully returns the metadata for Person.
BTW, I can go into the Hive console and run a select count(*) query on both Person and foo_bar.
EDIT:
Upon reading the error message again, it seems the core problem is
Command was terminated due to timeout(10000ms). See templeton.exec.timeout property","exitcode":143
But Cloudera Manager is not aware of this "templeton.exec.timeout" property. What do I do? I don't want to edit files manually, as there are many, many nodes in the cluster.
EDIT 2:
I went onto each Hadoop node and ran
sudo vi /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/etc/hive-webhcat/conf.dist/webhcat-default.xml
I found the timeout value and increased it to 1000000. I did this on every node, then restarted Hive and the WebHCat server using Cloudera Manager, but I got exactly the same error message.
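One thing worth noting here: webhcat-default.xml normally carries only the shipped defaults, and per-cluster overrides belong in webhcat-site.xml; under Cloudera Manager such overrides are usually injected through the WebHCat server's advanced configuration snippet (safety valve) rather than by editing parcel files, which can be silently overwritten. A hedged sketch of the property to add (the 60000 value is an arbitrary example; the error message itself shows the default is 10000 ms):
<property>
  <name>templeton.exec.timeout</name>
  <!-- milliseconds before WebHCat kills the spawned hcat command -->
  <value>60000</value>
</property>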

Related

Can't use put() to add data to HBase with happybase

My Python version is 3.7. After I ran pip3 install happybase, I started the Thrift service with hbase thrift start and tried to write a brief .py file as follows:
import happybase
connection = happybase.Connection('master')
table = connection.table('jmlr')  # 'jmlr' is a table in hbase
for i in table.scan():
    print(i)
table.put('001', {'title': 'dasds'})  # error here
connection.close()
When it reached table.put(), it reported this error:
thriftpy2.transport.base.TTransportException: TTransportException(type=4, message='TSocket read 0 bytes')
And at the same time, the thrift reported an error:
ERROR [thrift-worker-1] thrift.TBoundedThreadPoolServer: Error occurred during processing of message. java.lang.IllegalArgumentException: Invalid famAndQf provided.
But when I ran this Python file again just now, it gave me a different error in Thrift:
thrift.TBoundedThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin
I have tried adding parameters like protocol='compact' and transport='framed', but they didn't work; even table.scan() failed then.
Everything in the hbase shell is OK, so I can't figure out what went wrong. I'm about to collapse.
I ran into the same issue and found this solution. You need to include a Column Qualifier, even an empty one (':' is the delimiter between Column Family and Column Qualifier), in the column key you pass to the put() method:
table.put('001', {'title:': 'dasds'})
Also, you got a different error message on the second run of the script because the Thrift server had already failed.
I hope this helps you.
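For reference, a minimal sketch of both key forms, assuming 'title' is a column family in the 'jmlr' table and a Thrift server is reachable at 'master':
import happybase

connection = happybase.Connection('master')
table = connection.table('jmlr')

# happybase passes the dict keys to HBase as-is, and HBase addresses cells
# as 'family:qualifier'; a bare family name like 'title' is what triggers
# "Invalid famAndQf provided" on the Thrift side.
table.put('001', {'title:': 'dasds'})       # empty qualifier
table.put('002', {'title:name': 'other'})   # explicit qualifier

connection.close()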

Cannot delete a Kafka topic in Windows

I have set the ZooKeeper property "delete.topic.enable" to true, but I still cannot delete the topic. When I run mvn install or mvn test, I get the following problems:
WARN Error processing kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\Users\extznq\AppData\Local\Temp\EH4Test7267133751803693562 (com.yammer.metrics.reporting.JmxReporter:397)
javax.management.MalformedObjectNameException: Invalid character ':' in value part of property
ERROR Error while deleting healthchecktopic1516638375589-0 in dir C:\Users\extznq\AppData\Local\Temp\EH4Test9083449671042580730. (kafka.server.LogDirFailureChannel:107)
java.io.IOException: Failed to rename log directory from C:\Users\{My-topic-path} to C:\Users\{My-topic-path}-0.0a40ae7410c2401aba0816891789c334-delete
at kafka.log.LogManager.asyncDelete(LogManager.scala:671)
at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:178)
at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:173)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217)
at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:225)
at kafka.cluster.Partition.delete(Partition.scala:173)
at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:341)
at kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:373)
at kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:371)
at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:190)
at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:745)
ERROR [Broker id=0] Ignoring stop replica (delete=true) for partition healthchecktopic1516728986980-0 due to storage exception (state.change.logger:107)
org.apache.kafka.common.errors.KafkaStorageException: Error while deleting healthchecktopic1516728986980-0 in dir C:\Users\AppData\Local\Temp\EH4Test7267133751803693562.
Caused by: java.io.IOException: Failed to rename log directory from C:\Users\AppData\Local\Temp\EH4Test7267133751803693562\healthchecktopic1516728986980-0 to C:\Users\AppData\Local\Temp\EH4Test7267133751803693562\healthchecktopic1516728986980-0.65fc6c32c44940e58c1a45bd2972523a-delete
at kafka.log.LogManager.asyncDelete(LogManager.scala:671)
I think the warning is all right, but I do not know why I get the errors, even though I run Eclipse as administrator. Especially the KafkaStorageException: I still have 50 GB free on my computer.
Environment:
Windows 10
Zookeeper 3.5.3-beta
Kafka 1.0.0
It looks like you have an invalid character in the properties (Invalid character ':' in value part of property); also, in the line below that, it seems the placeholder you are using is not replaced properly: C:\Users\{My-topic-path}
Furthermore, there is an open issue related to file operations such as moving/deleting in a Windows environment; I got the same problem upgrading from 0.10.2.1 to 1.1.0.
This is the link to the issue: KAFKA-1194
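As a side note, delete.topic.enable is a Kafka broker setting read from server.properties, not a ZooKeeper property, so it is worth confirming it is set there and that the brokers were restarted afterwards. A sketch of the relevant line (the file path is the usual default and an assumption here):
# config/server.properties (broker configuration)
delete.topic.enable=true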

Hive Browser Throwing Error

I am trying to run a basic query in the Hive editor in the Hue browser, but it returns the following error, whereas my Hive CLI works fine and is able to execute queries. Could someone help me?
Fetching results ran into the following error(s):
Bad status for request TFetchResultsReq(fetchType=1, operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='r\t\x80\xac\x1a\xa0K\xf8\xa4\xa0\x85?\x03!\x88\xa9', guid='\x852\x0c\x87b\x7fJ\xe2\x9f\xee\x00\xc9\xeeo\x06\xbc')), orientation=4, maxRows=-1):
TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]", sqlState=None, infoMessages=["*org.apache.hive.service.cli.HiveSQLException:Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]:24:23",
'org.apache.hive.service.cli.operation.OperationManager:getOperationLogRowSet:OperationManager.java:229',
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:687',
'sun.reflect.GeneratedMethodAccessor14:invoke::-1',
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
'java.lang.reflect.Method:invoke:Method.java:606',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
'java.security.AccessController:doPrivileged:AccessController.java:-2',
'javax.security.auth.Subject:doAs:Subject.java:415',
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1657',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
'com.sun.proxy.$Proxy19:fetchResults::-1',
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454',
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
'java.lang.Thread:run:Thread.java:745'], statusCode=3), results=None, hasMoreRows=None)
This error could be due either to HiveServer2 not running or to Hue not having access to hive_conf_dir.
Check whether HiveServer2 has been started and is running. It uses port 10000 by default.
netstat -ntpl | grep 10000
If it is not running, start the HiveServer2
$HIVE_HOME/bin/hiveserver2
Also check the Hue configuration file hue.ini. The hive_conf_dir property must be set under the [beeswax] section. If it is not set, add this property under [beeswax]:
hive_conf_dir=$HIVE_HOME/conf
Restart the Hue supervisor after making these changes.
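For reference, a sketch of the relevant hue.ini section; the host, port, and conf path here are assumptions for a typical local setup:
[beeswax]
  # Host and port where HiveServer2 listens (10000 is the default port)
  hive_server_host=localhost
  hive_server_port=10000
  # Directory containing hive-site.xml so Hue picks up Hive's settings
  hive_conf_dir=/usr/local/hive/conf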

Hive issue using yarn

I am running Hive SQL on YARN.
It throws an error with a join condition: I am able to create an external as well as an internal table, but it fails to create a table when I use the command
create table <new_table> as select name from student;
When I run the same query through the Hive CLI it works fine, but with a Spring job it throws the error:
2016-03-28 04:26:50,692 [Thread-17] WARN org.apache.hadoop.hive.shims.HadoopShimsSecure - Can't fetch tasklog: TaskLogServlet is not supported in MR2 mode.
Task with the most failures(4):
-----
Task ID:
task_1458863269455_90083_m_000638
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1458863269455_90083_m_000638_3 Timed out after 1 secs
2016-03-28 04:26:50,842 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Killed application application_1458863269455_90083
2016-03-28 04:26:50,849 [main] ERROR com.mapr.fs.MapRFileSystem - Failed to delete path maprfs:/home/pro/amit/warehouse/scratdir/hive_2016-03-28_04-24-32_038_8553676376881087939-1/_task_tmp.-mr-10003, error: No such file or directory (2)
2016-03-28 04:26:50,852 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 2 from
As per my findings, I think there is some issue with the scratch dir.
Kindly suggest if anyone has faced the same issue.
This issue occurs if the recursive directory does not exist; Hive does not automatically create directories recursively.
Please check the existence of the directories from the root down to the child/table level.
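For example, a sketch of checking for and creating the missing levels (the path is taken from the error above):
# List the parent to see which level is missing
hadoop fs -ls maprfs:/home/pro/amit/warehouse
# -p creates all missing intermediate directories in one go
hadoop fs -mkdir -p maprfs:/home/pro/amit/warehouse/scratdir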
I faced a similar issue while running the Hive query below:
select * from <db_name>.<internal_tbl_name> where <field_name_of_double_type> in (<list_of_double_values>) order by <list_of_order_fields> limit 10;
I performed an EXPLAIN on the above statement, and below was the result:
fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2700001591]: it still exists.
Time taken: 0.886 seconds, Fetched: 24 row(s)
And I checked the logs through
yarn logs -applicationId application_1458863269455_90083
The error happened after a MapR upgrade by the admin team. It is probably due to some upgrade or installation issue together with the Tez configuration (as suggested by the log line below), or possibly the Hive query syntactically does not support the Tez optimization. I say so because another Hive query on an external table runs fine in my case. I have to check a bit deeper, though.
Though I am not sure, the error line in the logs that looks most relevant is as follows:
2017-05-08 00:01:47,873 [ERROR] [main] |web.WebUIService|: Tez UI History URL is not set
Solution:
It is probably happening due to some open files or applications that are using some resources. Please check https://unix.stackexchange.com/questions/11238/how-to-get-over-device-or-resource-busy
1. Run explain <your_Hive_statement>.
2. In the resulting execution plan, you can come across the filenames/dirs that the Hive execution engine fails to delete, e.g.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
3. Go to the path given in step 2, e.g. /hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/
4. In the path from step 3, running ls -a or lsof +D /path will show the open process IDs blocking the files from being deleted (a condensed sketch of these steps follows this list).
5. If you run ps -ef | grep <pid>, you get
hive_username <pid> 19463 1 05:19 pts/8 00:00:35 /opt/mapr/tools/jdk1.7.0_51/jre/bin/java -Xmx256m -Dhiveserver2.auth=PAM -Dhiveserver2.authentication.pam.services=login -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/mapr/hadoop/hadoop-2.7.0 -Dhadoop.id.str=hive_username -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/mapr/hadoop/hadoop-2.7.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/mapr/hive/hive-2.1/bin/../conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Djavax.net.ssl.trustStore=/opt/mapr/conf/ssl_truststore org.apache.hadoop.util.RunJar /opt/mapr/hive/hive-2.1//lib/hive-cli-2.1.1-mapr-1703.jar org.apache.hadoop.hive.cli.CliDriver
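A condensed sketch of steps 2 through 5 above; the session path is the one from the WARN line, and <pid> is whatever lsof reports:
# Scratch dir that Hive failed to delete (from the EXPLAIN / WARN output)
SCRATCH="/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1"
ls -a "$SCRATCH"      # the lingering .nfs* files show up here
lsof +D "$SCRATCH"    # lists the PIDs still holding files open
ps -ef | grep <pid>   # identify the blocking process before acting on it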
CONCLUSION:
The HiveCliDriver clearly shows that running "Hive on Spark" (or managed) tables through the Hive CLI is not supported any more from Hive 2.0 onwards, and it is going to be deprecated going forward. You have to use HiveContext in Spark for running Hive queries. But you can still run queries on Hive external tables through the Hive CLI.

Hive failed: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

I have one query. It executes fine on the Hive CLI and returns the result. But when I execute it via Hive JDBC, I get the error below:
java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:192)
What is the problem? Also, I am starting the Hive Thrift Server through a shell script (I have written a shell script which has the command to start the Hive Thrift Server). Later I decided to start the Hive Thrift Server manually by typing the command:
hadoop@ubuntu:~/hive-0.7.1$ bin/hive --service hiveserver
Starting Hive Thrift Server
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:99)
at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:80)
at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:73)
at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:384)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
hadoop@ubuntu:~/hive-0.7.1$
Please help me out with this.
Thanks
For this error:
java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery
go to this link:
http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_Hive.html
and add
hadoop-0.20-core.jar
hive/lib/hive-exec-0.7.1.jar
hive/lib/hive-jdbc-0.7.1.jar
hive/lib/hive-metastore-0.7.1.jar
hive/lib/hive-service-0.7.1.jar
hive/lib/libfb303.jar
lib/commons-logging-1.0.4.jar
slf4j-api-1.6.1.jar
slf4j-log4j12-1.6.1.jar
to the classpath of your project. Add these jars from the lib directories of Hadoop and Hive and try the code. Also add the lib folder paths of Hadoop, Hive, and HBase (if you are using it) to the project classpath, just as you added the jars.
And for the second error you got, type
netstat -nl | grep 10000
If it shows something, the Hive server is already running. The second error comes only when the port you are specifying is already acquired by some other process; by default the server port is 10000, so verify with the netstat command above.
Note: if you have connected using code, exit from bin/hive first; if you are connected through bin/hive, then the code will not connect, because I think (though I am not sure) only one client can connect to the Hive server.
Doing the above steps will hopefully solve your problem.
NOTE: exit from the CLI when you are going to execute the code, and don't start the CLI while the code is executing.
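For context, a minimal sketch of the pre-HiveServer2 JDBC connection this question is using; the driver class and jdbc:hive:// URL go with the hive-jdbc-0.7.1 jar listed above, while the host, port, and query are assumptions:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcSketch {
    public static void main(String[] args) throws Exception {
        // Old Hive (0.7.x) Thrift server driver, not the HiveServer2 one
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
                "jdbc:hive://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery("select count(*) from person");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        con.close();
    }
}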
It might be some issue with permissions. Just try some query like SELECT * FROM <table>, which won't start MR jobs.
Try setting these properties before your queries:
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
set hive.auto.convert.join = false;
set hive.exec.max.dynamic.partitions=100000;
set hive.exec.max.dynamic.partitions.pernode=10000;
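If you are going through JDBC, these session-level settings can be issued on the same connection before the query; a sketch, reusing the stmt from the connection sketch above:
// Each SET is a session-level statement executed before the real query
stmt.execute("set hive.exec.dynamic.partition = true");
stmt.execute("set hive.exec.dynamic.partition.mode = nonstrict");
stmt.execute("set hive.auto.convert.join = false");
stmt.execute("set hive.exec.max.dynamic.partitions=100000");
stmt.execute("set hive.exec.max.dynamic.partitions.pernode=10000");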
