Intro:
We have a class annotated with #Document that has date fields.
E.g.
#Document(indexName = "notif-index", type = "notif-type", shards = 1, replicas = 0, refreshInterval = "-1")
public class NotifEntry {
...
#Field(type = FieldType.Date)// ? (type=FieldType.Date, format = DateFormat.custom, pattern = "dd.MM.yyyy")
private Date pubDate;
When trying to insert data for years we are getting the following error:
org.springframework.data.elasticsearch.ElasticsearchException: Bulk indexing has failures. Use ElasticsearchException.getFailedDocuments() for detailed messages [{104921=MapperParsingException[failed to parse [pubDate]]; nested: IllegalArgumentException[Invalid format: "-24303888000000" is malformed at "3888000000"];}]
at org.springframework.data.elasticsearch.core.ElasticsearchTemplate.bulkIndex(ElasticsearchTemplate.java:588)
at org.springframework.data.elasticsearch.repository.support.AbstractElasticsearchRepository.save(AbstractElasticsearchRepository.java:176)
at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:504)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:489)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:461)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:56)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
The specific date in question is: 1199-10-28 00:00:00
Elastic Search Repositories:
public interface NotifiEntryRepository extends ElasticsearchRepository<NotifEntry, String> { }
Questions:
IF we wanted to actually insert data like this how would we?
What is the max and min allowed dates in elastic search?
Update:
I tried converting the specific date to unix time format and am getting the following:
This is not the exact same number as in the exception, but it becomes obvious that elastic search is converting the Date into unix time (in MS) and the exception is taking place after that:
Invalid format: "-24303888000000" is malformed at "3888000000"]
-24303888000000 vs
-24304492800 * 1000ms
Update 2
It seems underneath this error we have a JODA time parsing error:
Example:
Caused by: java.lang.IllegalArgumentException: Invalid format: "32503680000000" is malformed at "3680000000"
at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
at org.elasticsearch.index.mapper.core.DateFieldMapper$DateFieldType.parseStringValue(DateFieldMapper.java:366)
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:534)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:241)
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:321)
... 21 more
Update 3
I changed the specific dates from java.util.date but I am now getting the following error:
MapperParsingException[failed to parse [legalReference]]; nested: IllegalArgumentException[unknown property [year]];
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:329)
at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:311)
at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:328)
at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:254)
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:124)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:309)
at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:584)
at org.elasticsearch.index.shard.IndexShard.prepareIndexOnPrimary(IndexShard.java:563)
at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:211)
at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:223)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:157)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:66)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:657)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:287)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:77)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:378)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: unknown property [year]
at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:520)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:241)
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:321)
Update 3
It seems that changing the type from #Field(type = FieldType.Date) to #Field(type = FieldType.String) solves the problem for now. Though it would be interesting to know exactly what effect this has.
Related
I am working on functionality where I need to fetch the data from DB based on the date. The date format in DB is yyyy-MM-dd and in my application, I am passing the date in the same form.
The day column in TempTable entity class is Date(java.util.Date). I used Temporal as well but no success.
Below is the java code:
EntityManagerFactory emf= Persistence.createEntityManagerFactory("punit");
EntityManager em = emf.createEntityManager();
Query query;
query = em.createQuery("select r from TempTable r where r.uId = :uid and r.day = :day group by r.mId, r.rId");
query.setParameter("uid", 5l);
SimpleDateFormat simpleDateFormat= new SimpleDateFormat("yyyy-MM-dd");
Date d = simpleDateFormat.parse("2018-10-12");
query.setParameter("day", d);
query.getResultList();
After execution of the code, found the exception as below:
Jan 13, 2020 3:11:20 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
ERROR: Unsupported conversion from TIMESTAMP to java.lang.Long
Exception in thread "main" javax.persistence.PersistenceException: org.hibernate.exception.DataException: could not execute query
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1535)
at org.hibernate.query.Query.getResultList(Query.java:165)
at com.abc.XXXXX.main(XXXXX.java:30)
Caused by: org.hibernate.exception.DataException: could not execute query
at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:52)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
at org.hibernate.loader.Loader.doList(Loader.java:2818)
at org.hibernate.loader.Loader.doList(Loader.java:2797)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2629)
at org.hibernate.loader.Loader.list(Loader.java:2624)
at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:506)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:396)
at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:219)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1396)
at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1558)
at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1526)
... 2 more
Caused by: java.sql.SQLDataException: Unsupported conversion from TIMESTAMP to java.lang.Long
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:114)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:96)
at com.mysql.cj.jdbc.result.ResultSetImpl.getObject(ResultSetImpl.java:1382)
at com.mysql.cj.jdbc.result.ResultSetImpl.getLong(ResultSetImpl.java:812)
at com.mysql.cj.jdbc.result.ResultSetImpl.getLong(ResultSetImpl.java:818)
at org.hibernate.type.descriptor.sql.BigIntTypeDescriptor$2.doExtract(BigIntTypeDescriptor.java:63)
at org.hibernate.type.descriptor.sql.BasicExtractor.extract(BasicExtractor.java:47)
at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:257)
at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:253)
at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:243)
at org.hibernate.type.AbstractStandardBasicType.hydrate(AbstractStandardBasicType.java:329)
at org.hibernate.persister.entity.AbstractEntityPersister.hydrate(AbstractEntityPersister.java:3041)
at org.hibernate.loader.Loader.loadFromResultSet(Loader.java:1866)
at org.hibernate.loader.Loader.hydrateEntityState(Loader.java:1794)
at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1767)
at org.hibernate.loader.Loader.getRow(Loader.java:1615)
at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:745)
at org.hibernate.loader.Loader.processResultSet(Loader.java:1008)
at org.hibernate.loader.Loader.doQuery(Loader.java:964)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354)
at org.hibernate.loader.Loader.doList(Loader.java:2815)
... 11 more
Caused by: com.mysql.cj.exceptions.DataConversionException: Unsupported conversion from TIMESTAMP to java.lang.Long
at com.mysql.cj.result.DefaultValueFactory.unsupported(DefaultValueFactory.java:70)
at com.mysql.cj.result.DefaultValueFactory.createFromTimestamp(DefaultValueFactory.java:82)
at com.mysql.cj.protocol.a.MysqlTextValueDecoder.decodeTimestamp(MysqlTextValueDecoder.java:79)
at com.mysql.cj.protocol.result.AbstractResultsetRow.decodeAndCreateReturnValue(AbstractResultsetRow.java:87)
at com.mysql.cj.protocol.result.AbstractResultsetRow.getValueFromBytes(AbstractResultsetRow.java:241)
at com.mysql.cj.protocol.a.result.ByteArrayRow.getValue(ByteArrayRow.java:91)
at com.mysql.cj.jdbc.result.ResultSetImpl.getObject(ResultSetImpl.java:1290)
... 29 more
The configuration is:
Java8
Mysql8
JPA2.1
Spring5
Please check the TempTable entity class and db table structure. If you have anything miss match related to it then you have to map it accordingly. If there is anything the things will not work.
E.g.- java.util.Date column type must be match with Date type column in database.
In NiFi first I'm converting JSON to Avro and then from Avro to JSON. But while converting from Avro to JSON I'm getting exception.
while converting from Avro to JSON I'm getting the below exception:
2019-07-23 12:48:04,043 ERROR [Timer-Driven Process Thread-9] o.a.n.processors.avro.ConvertAvroToJSON ConvertAvroToJSON[id=1db0939d-016c-1000-caa3-80d0993c3468] ConvertAvroToJSON[id=1db0939d-016c-1000-caa3-80d0993c3468] failed to process session due to org.apache.avro.AvroRuntimeException: Malformed data. Length is negative: -40; Processor Administratively Yielded for 1 sec: org.apache.avro.AvroRuntimeException: Malformed data. Length is negative: -40
org.apache.avro.AvroRuntimeException: Malformed data. Length is negative: -40
at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:336)
at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:263)
at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:430)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:422)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:240)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:230)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:174)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
at org.apache.nifi.processors.avro.ConvertAvroToJSON$1.process(ConvertAvroToJSON.java:161)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2887)
at org.apache.nifi.processors.avro.ConvertAvroToJSON.onTrigger(ConvertAvroToJSON.java:148)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1162)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:209)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Below is the template file:
https://community.hortonworks.com/storage/attachments/109978-avro-to-json-and-json-to-avro.xml
The flow that I have drawn is:
The input json is:
{
"name":"test",
"company":{
"exp":"1.5"
}
}
The converted avro data is:
Objavro.schema {"type":"record","name":"MyClass","namespace":"com.acme.avro","fields":[{"name":"name","type":"string"},{"name":"company","type":{"type":"record","name":"company","fields":[{"name":"exp","type":"string"}]}}]}avro.codecdeflate�s™ÍRól&D³DV`•ÔÃ6ã(I-.a3Ô3�s™ÍRól&D³DV`•ÔÃ6
For avro data file schema will be embedded and you want to write all
the fields in CSV format so we don't need to setup the registry. if
you are writing only specific columns not all and for other formats
JSON.. then schema registry is required - Shu
NIFI Malformed Data.Length is negative
More about internals
I am using kafka + spark streaming to stream messages and do analytics, then saving to phoenix. Some spark job fail several times per day with the following error message:
org.apache.phoenix.schema.IllegalDataException:
java.lang.IllegalArgumentException: Invalid format: ""
at org.apache.phoenix.util.DateUtil$ISODateFormatParser.parseDateTime(DateUtil.java:297)
at org.apache.phoenix.util.DateUtil.parseDateTime(DateUtil.java:163)
at org.apache.phoenix.util.DateUtil.parseTimestamp(DateUtil.java:175)
at org.apache.phoenix.schema.types.PTimestamp.toObject(PTimestamp.java:95)
at org.apache.phoenix.expression.LiteralExpression.newConstant(LiteralExpression.java:194)
at org.apache.phoenix.expression.LiteralExpression.newConstant(LiteralExpression.java:172)
at org.apache.phoenix.expression.LiteralExpression.newConstant(LiteralExpression.java:159)
at org.apache.phoenix.compile.UpsertCompiler$UpsertValuesCompiler.visit(UpsertCompiler.java:979)
at org.apache.phoenix.compile.UpsertCompiler$UpsertValuesCompiler.visit(UpsertCompiler.java:963)
at org.apache.phoenix.parse.BindParseNode.accept(BindParseNode.java:47)
at org.apache.phoenix.compile.UpsertCompiler.compile(UpsertCompiler.java:832)
at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:578)
at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableUpsertStatement.compilePlan(PhoenixStatement.java:566)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:245)
at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172)
at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177)
at org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:79)
at org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:39)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1113)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1251)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1091)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Invalid format: ""
at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:673)
at org.apache.phoenix.util.DateUtil$ISODateFormatParser.parseDateTime(DateUtil.java:295)
My code:
val myDF = sqlContext.createDataFrame(myRows, myStruct)
myDF.write
.format(sourcePhoenixSpark)
.mode("overwrite")
.options(Map("table" -> (myPhoenixNamespace + myTable), "zkUrl" -> myPhoenixZKUrl))
.save()
I am using phoenix-spark version 4.7.0-HBase-1.1. Any suggestion to solve the problem would be appreciated. Thanks
You are trying to process dirty data.
That error comes from here:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/util/DateUtil.java#L301
Where it's trying to parse some string that is expected to be a Date in ISO format and the provided String is empty ("").
You need to prepare+clean your data before attempting to write it to storage.
NULL selection as column in a union/sub-queries failed with internal error
Failing HIVE query:
select clientid from hivesampletable limit 1 union all select null as clientid;
java.lang.RuntimeException: Hive internal error: conversion of string to void not supported yet.
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:132)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:152)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.typeCast(ConstantPropagateProcFactory.java:178)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.evaluateColumn(ConstantPropagateProcFactory.java:525)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.foldExprFull(ConstantPropagateProcFactory.java:328)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.foldExpr(ConstantPropagateProcFactory.java:222)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.foldExprFull(ConstantPropagateProcFactory.java:296)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.foldExpr(ConstantPropagateProcFactory.java:222)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.access$000(ConstantPropagateProcFactory.java:93)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory$ConstantPropagateSelectProc.process(ConstantPropagateProcFactory.java:796)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagate$ConstantPropagateWalker.walk(ConstantPropagate.java:155)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
at org.apache.hadoop.hive.ql.optimizer.ConstantPropagate.transform(ConstantPropagate.java:125)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:178)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10146)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:443)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:459)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
CAST NULL explicitly to the required type to work around it.
Ex: select cast(null as string)
Im trying to read data from cassandra using Hive with CqlStorageHandler.
The versions:
Hive 0.11.0
Hadoop 1.2.1
Cassandra 1.2.6
Im able to create EXTERNAL table with the following HIVE Query
CREATE EXTERNAL TABLE input(number string,name string,address string) STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES ("cassandra.columns.mapping" = ":key, name, address", "cassandra.ks.name" ="cassandradb", "cassandra.host" = "localhost" ,"cassandra.port" = "9160") TBLPROPERTIES ("cassandra.input.split.size" = "64000","cassandra.range.size" = "1000","cassandra.slice.predicate.size" = "1000");
(The table "input" is already existing and containing some data in cassandra created with CQL3)
However, When I try to read data with the following query
select * from input where number="1";
Im facing the folowing issue:
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: Could not get input splits
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:189)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:213)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:169)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:297)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:185)
... 31 more
Caused by: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:444)
at java.lang.Long.valueOf(Long.java:540)
at org.apache.cassandra.dht.Murmur3Partitioner$1.fromString(Murmur3Partitioner.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:239)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:207)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Job Submission failed with exception 'java.io.IOException(Could not get input splits)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Am I missing anything? Kindly advise.