Hive issue using yarn - hadoop

I am running Hive SQL on YARN and it throws an error on the join condition. I am able to create external as well as internal tables, but it fails when I create a table with a CTAS statement:
create table <new_table> AS SELECT name from student;
When I run the same query through the Hive CLI it works fine, but through the Spring job it throws this error:
2016-03-28 04:26:50,692 [Thread-17] WARN org.apache.hadoop.hive.shims.HadoopShimsSecure - Can't fetch tasklog: TaskLogServlet is not supported in MR2 mode.
Task with the most failures(4):
-----
Task ID:
task_1458863269455_90083_m_000638
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1458863269455_90083_m_000638_3 Timed out after 1 secs
2016-03-28 04:26:50,842 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Killed application application_1458863269455_90083
2016-03-28 04:26:50,849 [main] ERROR com.mapr.fs.MapRFileSystem - Failed to delete path maprfs:/home/pro/amit/warehouse/scratdir/hive_2016-03-28_04-24-32_038_8553676376881087939-1/_task_tmp.-mr-10003, error: No such file or directory (2)
2016-03-28 04:26:50,852 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 2 from
As per my findings, I think there is some issue with the scratch dir (the scratdir path above).
Kindly suggest if anyone has faced the same issue.

This issue occurs if the nested directory does not exist; Hive does not create directories recursively on its own.
Please check the existence of the directories from the root down to the child/table level.
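A minimal sketch of that check and fix (the path is the scratch dir from the question's log; substitute your own warehouse/scratch location):
# Does the full path exist down to the table level?
hadoop fs -ls /home/pro/amit/warehouse/scratdir
# If not, create it recursively (-p creates any missing parent directories), then re-run the query:
hadoop fs -mkdir -p /home/pro/amit/warehouse/scratdir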

I faced a similar issue while running the below Hive query
select * from <db_name>.<internal_tbl_name> where <field_name_of_double_type> in (<list_of_double_values>) order by <list_of_order_fields> limit 10;
I performed an explain on the above statement and below was the result.
fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2700001591]: it still exists.
Time taken: 0.886 seconds, Fetched: 24 row(s)
And I checked the logs with:
yarn logs -applicationId application_1458863269455_90083
The error happened after a MapR upgrade done by the admin team. It is probably due to some upgrade or installation issue combined with the Tez configuration (as suggested by line 873 in the log below). Or possibly the Hive query is syntactically not supported by the Tez optimization; I say so because another Hive query on an external table runs fine in my case. I still have to check a bit deeper, though.
Though I am not sure, the error line in the logs that looks most relevant is the following:
2017-05-08 00:01:47,873 [ERROR] [main] |web.WebUIService|: Tez UI History URL is not set
Solution:
It is probably happening due to some open files or applications that are still using some resources. Please check https://unix.stackexchange.com/questions/11238/how-to-get-over-device-or-resource-busy
1. Run explain <your_Hive_statement>.
2. In the resulting execution plan, you can come across the filenames/dirs that the Hive execution engine fails to delete, e.g.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
3. Go to the path given in step 2, e.g. /hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/
4. In the path from step 3, ls -a or lsof +D /path will show the process IDs that hold the files open and block the delete.
5. If you run ps -ef | grep <pid>, you get
hive_username <pid> 19463 1 05:19 pts/8 00:00:35 /opt/mapr/tools/jdk1.7.0_51/jre/bin/java -Xmx256m -Dhiveserver2.auth=PAM -Dhiveserver2.authentication.pam.services=login -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/mapr/hadoop/hadoop-2.7.0 -Dhadoop.id.str=hive_username -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/mapr/hadoop/hadoop-2.7.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/mapr/hive/hive-2.1/bin/../conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Djavax.net.ssl.trustStore=/opt/mapr/conf/ssl_truststore org.apache.hadoop.util.RunJar /opt/mapr/hive/hive-2.1//lib/hive-cli-2.1.1-mapr-1703.jar org.apache.hadoop.hive.cli.CliDriver
CONCLUSION:
The CliDriver in the process listing above makes it clear that running Hive queries on "Hive on Spark" (i.e. managed) tables through the Hive CLI is no longer supported from Hive 2.0 onwards and is going to be deprecated going forward. You have to use HiveContext in Spark for running such Hive queries. You can, however, still run queries on Hive external tables through the Hive CLI.
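A minimal sketch of running the same query through Spark instead of the Hive CLI (this assumes Spark is installed on the cluster with Hive support, i.e. hive-site.xml is on Spark's classpath; the table placeholders are from the query above):
# spark-sql talks to the Hive metastore directly; HiveContext / SparkSession with enableHiveSupport() is the programmatic equivalent.
spark-sql -e "select * from <db_name>.<internal_tbl_name> limit 10"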

Related

Unable to create database because it already exists, but it doesn't

I want to create a Hive database using the command:
create database sbx_products_diff
But it fails with the following error:
jdbc:hive2://myhost> create database sbx_products_diff;
INFO : Compiling command(queryId=hive_20220414115555_9db94201-d1b7-40ca-8cee-d1d4147c080e): create database sbx_products_diff
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220414115555_9db94201-d1b7-40ca-8cee-d1d4147c080e); Time taken: 0.003 seconds
INFO : Executing command(queryId=hive_20220414115555_9db94201-d1b7-40ca-8cee-d1d4147c080e): create database sbx_products_diff
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database sbx_products_diff already exists
INFO : Completed executing command(queryId=hive_20220414115555_9db94201-d1b7-40ca-8cee-d1d4147c080e); Time taken: 0.014 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database sbx_products_diff already exists (state=42000,code=1)
But the problem is that this database doesn't actually exist:
jdbc:hive2://myhost> drop database sbx_products_diff;
Error: Error while compiling statement: FAILED: SemanticException [Error 10072]: Database does not exist: sbx_products_diff (state=42000,code=10072)
It seems this problem is related to this database in particular, because if I run exactly the same command with another name, it works perfectly, as expected. Could someone point me toward the root of my troubles? Or what should I share so you can help me?
UPD1:
If you look in the hive warehouse folder, do you see a folder called "sbx_products_diff"?
No, the directory doesn't exist. But I should mention another interesting thing: the _diff part of the name is a suffix. I mean there are several existing databases - sbx_products, sbx_products_aux, sbx_products_bkp, ... - and all of those databases are located in HDFS with the following structure:
hdfs://bla-bla/sbx_products/hive/pa
hdfs://bla-bla/sbx_products/hive/aux
hdfs://bla-bla/sbx_products/hive/bkp
...
Also, I have already searched for my database with commands like:
hdfs dfs -ls -R / | grep diff
hdfs dfs -ls -R / | grep sbx_products
But nothing related to my ghost database was found.

Connect to Teradata Using Airflow JDBC Connection

I'm trying to execute a SqlSensor task in Airflow using a connection to a Teradata database. The connection is configured as follows:
In particular, I have provided two driver paths separated by ", ", but I am not sure whether that is the proper way to do it:
/home/airflow/java_sample/tdgssconfig.jar
/home/airflow/java_sample/terajdbc4.jar
When the DAG executes, it triggers the error message
[2017-08-02 02:32:45,162] {models.py:1342} INFO - Executing <Task(SqlSensor): check_running_batch> on 2017-08-02 02:32:12
[2017-08-02 02:32:45,179] {base_hook.py:67} INFO - Using connection to: jdbc:teradata://myservername.mycompanyname.org/database=MYDBNAME,TMODE=ANSI,CHARSET=UTF8
[2017-08-02 02:32:45,313] {sensors.py:109} INFO - Poking: SELECT BATCH_KEY FROM MYDBNAME.AUDIT_BATCH WHERE BATCH_OWNER='ARO_TEST' AND AUDIT_STATUS_KEY=1;
[2017-08-02 02:32:45,316] {base_hook.py:67} INFO - Using connection to: jdbc:teradata://myservername.mycompanyname.org/database=MYDBNAME,TMODE=ANSI,CHARSET=UTF8
[2017-08-02 02:32:45,497] {models.py:1417} ERROR - java.lang.RuntimeException: Class com.teradata.jdbc.TeraDriver not found
What am I doing wrong?
The appropriate way to input multiple jars on the connections page is to separate the fully qualified paths with a comma, which you did above.
I can confirm this is the approach I took, and it worked (Airflow 1.10.1 and 1.10.2).
See: https://github.com/apache/airflow/blob/master/airflow/hooks/jdbc_hook.py#L51
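For instance, using the two jars from the question, the driver path field would contain a single comma-separated value along these lines:
/home/airflow/java_sample/tdgssconfig.jar,/home/airflow/java_sample/terajdbc4.jar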
Bonus: If you use Ad Hoc Query in Data Profiling to test it out, you'll notice that you'll get an error when you send a SELECT statement because Airflow wraps it in a LIMIT clause which TD doesn't support.
The solution provided by my team member was to merge the two jars into a single jar file. After doing that and pointing the driver path at the new jar file, it worked as expected.
Here is the link to the JAR file: https://github.com/alexisrolland/linux-setup/blob/master/teradataDriverJdbc.jar
Here is a code snippet example that uses the connection in a SqlSensor task:
# The SqlSensor import path below is for Airflow 1.x and may differ in your version;
# `dag` is assumed to be a DAG object defined elsewhere in the file.
from airflow.sensors.sql_sensor import SqlSensor

CheckRunningBatch = SqlSensor(
    task_id='check_running_batch',
    conn_id='ed_data_quality_edw_dev',
    sql="SELECT CASE WHEN MAX(BATCH_KEY) IS NOT NULL THEN 0 ELSE 1 END FROM DATABASE.AUDIT_BATCH WHERE STATUS_KEY=1;",
    poke_interval=300,
    dag=dag)

Hive Browser Throwing Error

I am trying to run some basic queries in the Hive editor in the Hue browser, but it returns the following error, whereas my Hive CLI works fine and is able to execute queries. Could someone help me?
Fetching results ran into the following error(s):
Bad status for request TFetchResultsReq(fetchType=1,
operationHandle=TOperationHandle(hasResultSet=True,
modifiedRowCount=None, operationType=0,
operationId=THandleIdentifier(secret='r\t\x80\xac\x1a\xa0K\xf8\xa4\xa0\x85?\x03!\x88\xa9',
guid='\x852\x0c\x87b\x7fJ\xe2\x9f\xee\x00\xc9\xeeo\x06\xbc')),
orientation=4, maxRows=-1):
TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]",
sqlState=None,
infoMessages=["*org.apache.hive.service.cli.HiveSQLException:Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]:24:23",
'org.apache.hive.service.cli.operation.OperationManager:getOperationLogRowSet:OperationManager.java:229',
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:687',
'sun.reflect.GeneratedMethodAccessor14:invoke::-1',
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
'java.lang.reflect.Method:invoke:Method.java:606',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
'java.security.AccessController:doPrivileged:AccessController.java:-2',
'javax.security.auth.Subject:doAs:Subject.java:415',
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1657',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
'com.sun.proxy.$Proxy19:fetchResults::-1',
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454',
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
'java.lang.Thread:run:Thread.java:745'], statusCode=3), results=None,
hasMoreRows=None)
This error could be due to either HiveServer2 not running or Hue not having access to hive_conf_dir.
Check whether HiveServer2 has been started and is running. It uses port 10000 by default.
netstat -ntpl | grep 10000
If it is not running, start HiveServer2:
$HIVE_HOME/bin/hiveserver2
Also check the Hue configuration file hue.ini. The hive_conf_dir property must be set under the [beeswax] section. If it is not set, add this property under [beeswax]:
hive_conf_dir=$HIVE_HOME/conf
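For reference, a minimal [beeswax] section might look like the sketch below (the host, port, and conf path are examples; point hive_conf_dir at the directory that actually contains your hive-site.xml):
[beeswax]
  # Host and port where HiveServer2 is listening (10000 is the default port)
  hive_server_host=localhost
  hive_server_port=10000
  # Directory containing hive-site.xml
  hive_conf_dir=/usr/local/hive/conf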
Restart supervisor after making these changes.

WebHCAT Error in getting Hive Table Metadata. Command was terminated due to timeout(10000ms). See templeton.exec.timeout property","exitcode":143

If I issue this WebHCat REST call in my Cloudera 5.4.1 environment
curl -s 'http://mywebhcat:50111/templeton/v1/ddl/database/default/table/person?user.name=admin&format=extended'; echo; echo;
everything works fine and I see the metadata for the table Person.
But there is another table called foo_bar; if I change the REST call above to
curl -s 'http://mywebhcat:50111/templeton/v1/ddl/database/default/table/foo_bar?user.name=admin&format=extended'; echo; echo;
Then I get an error
{"statement":"use default; show table extended like
foo_bar;","error":"unable to show table:
foo_bar","exec":{"stdout":"","stderr":"which: no /opt/cloudera/parcels/
CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hadoop/bin/hadoop in ((null))\ndirname: missing operand\nTry
`dirname --help' for more information.\nlog4j:ERROR setFile(null,true) call failed.\njava.io
.FileNotFoundException: /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hive/logs/hcat.
log (No such file or directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat
java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat java.io.FileOutputStream.<
init>(FileOutputStream.java:213)\n\tat java.io.FileOutputStream.<init>(FileOutputStream.
java:133)\n\tat org.apache.log4j.FileAppender.setFile(FileAppender.java:294)\n\tat org.
apache.log4j.FileAppender.activateOptions(FileAppender.java:165)\n\tat org.apache.log4j.
DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)\n\tat org.apache
.log4j.config.PropertySetter.activate(PropertySetter.java:307)\n\tat org.apache.log4j.config
.PropertySetter.setProperties(PropertySetter.java:172)\n\tat org.apache.log4j.config.
PropertySetter.setProperties(PropertySetter.java:104)\n\tat org.apache.log4j.
PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)\n\tat org.apache.log4j.
PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)\n\tat org.apache.log4j.
PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)\n\tat org.apache.
log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)\n\tat org.apache.log4j
.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)\n\tat org.apache.log4j.
PropertyConfigurator.configure(PropertyConfigurator.java:415)\n\tat org.apache.hadoop.hive.
common.LogUtils.initHiveLog4jDefault(LogUtils.java:127)\n\tat org.apache.hadoop.hive.common.
LogUtils.initHiveLog4jCommon(LogUtils.java:77)\n\tat org.apache.hadoop.hive.common.LogUtils.
initHiveLog4j(LogUtils.java:58)\n\tat org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.
java:65)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect
.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.
DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.
reflect.Method.invoke(Method.java:497)\n\tat org.apache.hadoop.util.RunJar.run(RunJar.
java:221)\n\tat org.apache.hadoop.util.RunJar.main(RunJar.java:136)\nlog4j:ERROR Either
File or DatePattern options are not set for appender [DRFA].\nOK\nTime taken: 1.035
seconds\n Command was terminated due to timeout(10000ms). See templeton.exec.timeout
property","exitcode":143}}
I don't know why it complains about a missing log directory only for the foo_bar table yet successfully returns the metadata for Person.
BTW, I can go into the hive console and do a select count(*) query on both Person and foo_bar.
EDIT::
Upon reading the error message again, it seems the core problem is
Command was terminated due to timeout(10000ms). See templeton.exec.timeout property","exitcode":143
But Cloudera Manager is not aware of this "templeton.exec.timeout" property... What do I do? I don't want to edit files manually, as there are many, many nodes in the cluster.
Edit2::
I went onto each Hadoop node and ran
sudo vi /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/etc/hive-webhcat/conf.dist/webhcat-default.xml
I found the timeout value and increased it to 1000000. I did this on each node and then restarted Hive and the WebHCat server using Cloudera Manager, but I got exactly the same error message.
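For reference, the setting I changed looks like this in the WebHCat configuration (the value shown is the increased one I tried; the shipped default is the 10000 ms mentioned in the error):
<property>
  <name>templeton.exec.timeout</name>
  <!-- timeout in milliseconds for the commands WebHCat executes; default 10000 -->
  <value>1000000</value>
</property>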

my hadoop job 252 hours later died(tasks then killed)

I had 81,068 tasks complete, but then 11,799 failed and only 12 were killed. They all SEEM to have failed from
2013-09-10 03:07:36,316 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201308301539_0002_m_083001_0: Error initializing attempt_201308301539_0002_m_083001_0:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201308301539_0002/work in any of the configured local directories
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:1817)
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.launchTask(TaskTracker.java:1933)
at org.apache.hadoop.mapred.TaskTracker.launchTaskForJob(TaskTracker.java:830)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:824)
at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)
At this point, I am just looking for guidance on how I can debug this before I re-run it. For some reason, out in the cluster it looks like all the files were deleted, though I thought Hadoop M/R only deleted the logs of successful tasks?
Does anyone have advice/ideas on how to debug this further?
It looks like all the default directories for map/reduce are being used: /tmp/hadoop-hduser for my hduser user.
I have seen posts about /etc/hosts, but then I don't get why 81,000 tasks succeeded before they finally started failing.
I am getting some of this information from the web interface, of course, and some from the logs under the Hadoop installation's logs directory.
thanks,
Dean
