After a capacity expansion, Sentry (sentry-1.5.1+cdh5.15.2+470) goes down every day at 9 p.m., with the following error:
Caused by: javax.jdo.JDODataStoreException: Iteration request failed : SELECT 'org.apache.sentry.provider.db.service.model.MPath' AS NUCLEUS_TYPE,`A0`.`PATH_NAME`,`A0`.`PATH_ID`,`A0`.`AUTHZ_OBJ_ID` FROM `AUTHZ_PATH` `A0` WHERE EXISTS (SELECT 'org.apache.sentry.provider.db.service.model.MAuthzPathsMapping' AS NUCLEUS_TYPE,`A0_SUB`.`AUTHZ_OBJ_ID` AS `DN_DATASTOREID` FROM `AUTHZ_PATHS_MAPPING` `A0_SUB` WHERE `A0_SUB`.`AUTHZ_SNAPSHOT_ID` = ? AND `A0`.`AUTHZ_OBJ_ID` = `A0_SUB`.`AUTHZ_OBJ_ID`)
NestedThrowables:
java.sql.SQLException: No value specified for parameter 1
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:252)
at org.apache.sentry.provider.db.service.persistent.SentryStore.retrieveFullPathsImageCore(SentryStore.java:2542)
at org.apache.sentry.provider.db.service.persistent.SentryStore.access$2800(SentryStore.java:131)
at org.apache.sentry.provider.db.service.persistent.SentryStore$39.execute(SentryStore.java:2518)
at org.apache.sentry.provider.db.service.persistent.SentryStore$39.execute(SentryStore.java:2509)
at org.apache.sentry.provider.db.service.persistent.TransactionManager.executeTransaction(TransactionManager.java:123)
at org.apache.sentry.provider.db.service.persistent.SentryStore.retrieveFullPathsImageUpdate(SentryStore.java:2508)
at org.apache.sentry.hdfs.PathImageRetriever.retrieveFullImage(PathImageRetriever.java:52)
at org.apache.sentry.hdfs.PathImageRetriever.retrieveFullImage(PathImageRetriever.java:32)
at org.apache.sentry.hdfs.DBUpdateForwarder.retrieveFullImage(DBUpdateForwarder.java:141)
at org.apache.sentry.hdfs.DBUpdateForwarder.getAllUpdatesFrom(DBUpdateForwarder.java:92)
at org.apache.sentry.hdfs.SentryPlugin.getAllPathsUpdatesFrom(SentryPlugin.java:163)
at org.apache.sentry.hdfs.SentryHDFSServiceProcessor.getPathsUpdatesFrom(SentryHDFSServiceProcessor.java:130)
at org.apache.sentry.hdfs.SentryHDFSServiceProcessor.get_authz_updates(SentryHDFSServiceProcessor.java:68)
... 10 more
Caused by: java.sql.SQLException: No value specified for parameter 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861)
at com.mysql.jdbc.PreparedStatement.checkAllParametersSet(PreparedStatement.java:2211)
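For context, the nested "No value specified for parameter 1" is the standard MySQL Connector/J error for executing a PreparedStatement whose '?' placeholder was never bound; here DataNucleus generated the EXISTS subquery with a parameter for AUTHZ_SNAPSHOT_ID but apparently never set it. Below is a minimal, hypothetical JDBC sketch of that failure mode (connection details, table, and the commented-out bind are illustrative only):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class UnboundParamDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection to the Sentry backend database.
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/sentry", "user", "pass");
        PreparedStatement ps = conn.prepareStatement(
                "SELECT PATH_NAME FROM AUTHZ_PATH WHERE AUTHZ_OBJ_ID = ?");
        // ps.setLong(1, 42L);  // without this bind, the driver throws:
        ps.executeQuery();      // java.sql.SQLException: No value specified for parameter 1
    }
}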
I am trying to execute the code below in Hive:
create table xyz (name string,div int) ;
It shows an error. Can't we use a column named div in Hive? I have a large table that has a column div, and executing that HQL threw me the error below. So I tried with a smaller HQL like the one above, and it shows the same error. I am using Hive 0.13.
NoViableAltException(14#[])
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:11627)
at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:40134)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:34747)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:32979)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:4544)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2144)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:408)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:976)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1041)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:912)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 2:15 cannot recognize input near 'div' 'int' ')' in column specification
Well, the answer is found!
create table xyz (name string, `div` int);
This works! Surround div with backticks (`) and it parses fine.
I suppose div must be a keyword in Hive (though I did not find it in any documentation).
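If it helps, div appears to act as Hive's integer-division operator, which would explain why the parser reserves the bare name. A quick sketch to check (a FROM-less SELECT needs Hive 0.13+):

SELECT 10 div 3;        -- integer division: returns 3
SELECT `div` FROM xyz;  -- backquoted, div is treated as the column name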
Thanks,
Neethu
I am trying to compress RC and ORC files using LZ4. I have installed Hadoop 2.7.1 and Hive 1.2.1. With LZ4 I can compress RC files without any problem, but when I try to load data into an ORC table using LZ4, it does not work. I created the ORC table like this:
CREATE TABLE FINANCE_orc(
PERMNO STRING,
DATE STRING,
CUSIP STRING,
NCUSIP STRING,
COMNAM STRING,
TICKET STRING,
PERMCO STRING,
SHRCD STRING,
EXCHCD STRING,
HEXCD STRING,
SICCD STRING,
HSLCCD STRING,
PRC STRING,
VOL STRING,
RET STRING,
SHROUT STRING,
DLRET STRING,
VWRETD STRING,
EWRETD STRING,
SPRTRN STRING)
STORED AS ORC tblproperties ("orc.compress"="Lz4");
set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.type = BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
INSERT OVERWRITE table finance_orc select * from finance;
But when loading the data it gives the following error:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"permno":"PERMNO","ndate":"DATE","cusip":"CUSIP","ncusip":"NCUSIP","comnam":"COMNAM","ticket":"TICKER","permco":"PERMCO","shrcd":"SHRCD","exchcd":"EXCHCD","hexcd":"HEXCD","siccd":"SICCD","hslccd":"HSICCD","prc":"PRC","vol":"VOL","ret":"RET","shrout":"SHROUT","dlret":"DLRET","vwretd":"VWRETD","ewretd":"EWRETD","sprtrn":"SPRTRN"}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"permno":"PERMNO","ndate":"DATE","cusip":"CUSIP","ncusip":"NCUSIP","comnam":"COMNAM","ticket":"TICKER","permco":"PERMCO","shrcd":"SHRCD","exchcd":"EXCHCD","hexcd":"HEXCD","siccd":"SICCD","hslccd":"HSICCD","prc":"PRC","vol":"VOL","ret":"RET","shrout":"SHROUT","dlret":"DLRET","vwretd":"VWRETD","ewretd":"EWRETD","sprtrn":"SPRTRN"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:622)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:566)
... 16 more
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
at java.lang.Enum.valueOf(Enum.java:236)
at org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getOptions(OrcOutputFormat.java:143)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getHiveRecordWriter(OrcOutputFormat.java:203)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getHiveRecordWriter(OrcOutputFormat.java:52)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:261)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:246)
... 18 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
I have used Snappy and Zlib with the same commands and they work fine; the problem is only with LZ4. What could be the reason?
The compression codecs you can use for ORC columnar compression in this Hive version are NONE, ZLIB, and SNAPPY.
The default codec is ZLIB.
Codecs other than these are not accepted.
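For example, the same DDL pattern with a supported codec works; a minimal sketch with a trimmed column list (note the property value is matched against the CompressionKind enum, so use the uppercase name):

CREATE TABLE finance_orc_snappy (
  permno STRING,
  prc STRING,
  vol STRING)
STORED AS ORC TBLPROPERTIES ("orc.compress"="SNAPPY");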
In general, to get an idea of what went wrong, read the error log completely; it usually narrows down the issue. Here the log says:
"org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4"
That is, "Lz4" is simply not a value of the CompressionKind enum in Hive 1.2.1. If you really need LZ4, you would have to manually substitute a jar that adds it (for example from the official Spark repo); it is not available out of the box in this release.
I'm trying to use Pig to filter a relation based on the current day of the week using a ternary condition, but it's giving me an error I haven't seen before.
This is what I'm trying to do:
C = filter B by (DaysBetween(CurrentTime(),ToDate(0L)) % 7) == (long)0 ? B.interval == 'daily' : B.interval == 'weekly';
and the error it returns is:
ERROR 1200: Pig script failed to parse: NoViableAltException(84#[])
Failed to parse: Pig script failed to parse: NoViableAltException(84#[])
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1684)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1657)
at org.apache.pig.PigServer.registerQuery(PigServer.java:600)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1069)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:542)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: NoViableAltException(84#[])
at org.apache.pig.parser.AstValidator.expr(AstValidator.java:8637)
at org.apache.pig.parser.AstValidator.expr(AstValidator.java:9115)
at org.apache.pig.parser.AstValidator.bin_expr(AstValidator.java:10531)
at org.apache.pig.parser.AstValidator.projectable_expr(AstValidator.java:9790)
at org.apache.pig.parser.AstValidator.var_expr(AstValidator.java:9582)
at org.apache.pig.parser.AstValidator.expr(AstValidator.java:8985)
at org.apache.pig.parser.AstValidator.cond(AstValidator.java:7820)
at org.apache.pig.parser.AstValidator.filter_clause(AstValidator.java:7328)
at org.apache.pig.parser.AstValidator.op_clause(AstValidator.java:1683)
at org.apache.pig.parser.AstValidator.general_statement(AstValidator.java:1035)
at org.apache.pig.parser.AstValidator.statement(AstValidator.java:499)
at org.apache.pig.parser.AstValidator.query(AstValidator.java:373)
at org.apache.pig.parser.QueryParserDriver.validateAst(QueryParserDriver.java:255)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:183)
... 15 more
Does anyone have any idea why it's not working? I'm using Pig 0.11 on my Mac.
Thanks.
EDIT: I was trying to avoid using a UDF, and I have used ternary conditions in other relations before.
I figured it out. It was just badly constructed.
This is the correct way:
C = filter B by (((DaysBetween(CurrentTime(),ToDate(0L)) % 7) == 4) ? 'daily' : 'weekly') == interval;
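For reference, here is a self-contained sketch of the working pattern (hypothetical input file and schema; DaysBetween, CurrentTime, and ToDate are Pig built-ins):

-- events.txt, tab-separated: id, interval ('daily' or 'weekly')
B = LOAD 'events.txt' USING PigStorage('\t') AS (id:int, interval:chararray);
-- The bincond must yield a value: compute 'daily'/'weekly' first, then
-- compare that value to the field (referenced as interval, not B.interval).
C = FILTER B BY (((DaysBetween(CurrentTime(), ToDate(0L)) % 7) == 4) ? 'daily' : 'weekly') == interval;
DUMP C;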
I have data
store trn_date dept_id sale_amt
1 2014-12-15 101 10007655
1 2014-12-15 101 10007654
1 2014-12-15 101 10007544
6 2014-12-15 104 100086544
8 2014-12-14 101 1000000
8 2014-12-15 101 100865761
I'm trying to aggregate the data using the code below.
Loading the data (tried both ways: using HCatLoader() and using PigStorage()):
data = LOAD 'data' USING org.apache.hcatalog.pig.HCatLoader();
group_table = GROUP data BY (store, trn_date, dept_id);
group_gen = FOREACH group_table GENERATE
            FLATTEN(group) AS (store, trn_date, dept_id),
            SUM(data.sale_amt) AS total_sale_amt;
Below is the error stack trace I'm getting while running the job:
================================================================================
Pig Stack Trace
---------------
ERROR 2103: Problem doing work on Longs
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: grouped_all: Local Rearrange[tuple]{tuple}(false) - scope-1317 Operator Key: scope-1317): org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:183)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:161)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1645)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1611)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:84)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:108)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:102)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:369)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:333)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:77)
================================================================================
As I was looking for a solution, many said it is due to loading the data using HCatLoader(), so I tried loading the data using PigStorage().
I'm still getting the same error.
This may be because of the way you are storing the data in Hive: if any aggregation is going to happen on a column, declare it with a numeric data type (int, bigint, and so on).
Also, each aggregation function returns data with its own default data type,
like:
AVG returns DOUBLE
SUM returns LONG for integer inputs and DOUBLE for floating-point inputs
COUNT returns LONG
That said, I don't think the problem is how the data is stored in Hive, since you already tried PigStorage(); it's a data type issue as the column reaches the aggregation. The trace confirms it: "java.lang.String cannot be cast to java.lang.Number" means sale_amt is arriving as a chararray.
Cast the column to a numeric type before passing it to the aggregation, as in the sketch below.
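A minimal sketch of that fix (PigStorage load shown so it is self-contained; file name and field types assumed from the sample data):

data = LOAD 'data' USING PigStorage('\t')
       AS (store:int, trn_date:chararray, dept_id:int, sale_amt:chararray);
-- Cast the amount to long so SUM operates on numbers instead of chararrays
typed = FOREACH data GENERATE store, trn_date, dept_id, (long)sale_amt AS sale_amt;
group_table = GROUP typed BY (store, trn_date, dept_id);
group_gen = FOREACH group_table GENERATE
            FLATTEN(group) AS (store, trn_date, dept_id),
            SUM(typed.sale_amt) AS total_sale_amt;
DUMP group_gen;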