Invalid Data in Hive-Created Table? - hadoop

I'm using Hive version 3.1.3 on Hadoop 3.3.4 with Tez 0.9.2. I'm trying to run a SELECT statement on a table that Hive created and manages. The query fails before it completes. The full error message is below, but this appears to be the relevant portion:
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
It looks like the error is a long-to-decimal conversion issue. However, this table was created by Hive itself, which loaded and transformed the data in a previous step. Wouldn't Hive have thrown an error earlier if it were inserting an invalid value into a decimal column?
I used the exact same codebase and the exact same data on AWS EMR and didn't get this error, so I don't think there's an invalid value. But I'm stuck on where to go from here.
Here's the table definition:
claimid varchar(50)
claimlineid int
dos date
dosto date
member varchar(50)
provider varchar(50)
setname varchar(255)
code varchar(50)
system varchar(255)
primary int
positivenegative int
result decimal(10,2)
supply int
size decimal(10,2)
quantity decimal(10,2)
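(As a diagnostic, it may be worth confirming that the on-disk schema matches the declared one and that the decimal columns scan cleanly on their own. A minimal sketch, assuming a plain scan doesn't hit the same vectorized map-join path; the table name claims is a placeholder since the real name isn't shown:)
DESCRIBE FORMATTED claims;
SELECT claimid, result, `size`, quantity FROM claims LIMIT 100;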
And here's the full error message:
Vertex failed, vertexName=Map 1, vertexId=vertex_1667735849290_0030_32_15, diagnostics=[Task failed, taskId=task_1667735849290_0030_32_15_000009, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1667735849290_0030_32_15_000009_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:488)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:284)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException:
DeserializeRead detail: Reading byte[] of length 4096 at start offset 4 for length 100 to read 14 fields with types [varchar(50), int, date, date, varchar(50), varchar(50), varchar(255), varchar(50), varchar(255), int, decimal(10,2), int, decimal(10,2), decimal(10,2)]. Read field #14 at field start position 0 current read offset 104
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:611)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.closeOp(VectorMapJoinGenerateResultOperator.java:681)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:757)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
... 17 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException:
DeserializeRead detail: Reading byte[] of length 4096 at start offset 4 for length 100 to read 14 fields with types [varchar(50), int, date, date, varchar(50), varchar(50), varchar(255), varchar(50), varchar(255), int, decimal(10,2), int, decimal(10,2), decimal(10,2)]. Read field #14 at field start position 0 current read offset 104
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:609)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:671)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:604)
... 21 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
DeserializeRead detail: Reading byte[] of length 4096 at start offset 4 for length 100 to read 14 fields with types [varchar(50), int, date, date, varchar(50), varchar(50), varchar(255), varchar(50), varchar(255), int, decimal(10,2), int, decimal(10,2), decimal(10,2)]. Read field #14 at field start position 0 current read offset 104
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:589)
... 23 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:687)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:934)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:585)
... 23 more

Unfortunately, this is a problem with the cost-based optimizer (CBO). You can disable it, re-run the query, and get the result:
set hive.cbo.enable=false;
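If that alone doesn't help, the stack trace points at the vectorized map-join path, so as an untested alternative you could also try disabling vectorization and/or automatic map-join conversion for the session:
set hive.vectorized.execution.enabled=false;
set hive.auto.convert.join=false;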

Related

Can't create HIVE table with 'ROW FORMAT SERDE'

I am trying to create a HIVE table with a SerDe, but it always fails.
My table creation command -
CREATE TABLE products_info_raw(
id STRING,
name STRING,
reseller STRING,
category STRING,
price BIGINT,
discount FLOAT,
profit_percent FLOAT
)
PARTITIONED BY (
rptg_dt STRING
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.JsonSerde';
I added the jar -
ADD jar /Users/<user>/Development/Hadoop/projects/e-commerce/hive-json-serde.jar;
that contains the necessary JsonSerde class -
META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/hadoop/
org/apache/hadoop/hive/
org/apache/hadoop/hive/contrib/
org/apache/hadoop/hive/contrib/serde2/
org/json/
org/apache/hadoop/hive/contrib/serde2/JsonSerde.class
org/apache/hadoop/hive/contrib/serde2/NewJson.class
org/json/CDL.class
org/json/Cookie.class
org/json/CookieList.class
org/json/HTTP.class
org/json/HTTPTokener.class
org/json/JSONArray.class
org/json/JSONException.class
org/json/JSONML.class
org/json/JSONObject$1.class
org/json/JSONObject$Null.class
org/json/JSONObject.class
org/json/JSONString.class
org/json/JSONStringer.class
org/json/JSONTokener.class
org/json/JSONWriter.class
org/json/Test$1Obj.class
org/json/Test.class
org/json/XML.class
org/json/XMLTokener.class
But I always keep getting the below error -
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.SerDe
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
I am using HIVE 3.1.0.
Please help.
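A possible direction, though not verified here: the missing class org.apache.hadoop.hive.serde2.SerDe is the old SerDe interface that the contrib hive-json-serde.jar appears to be compiled against, and that interface no longer exists in recent Hive releases. On Hive 3.1.0 the HCatalog JSON SerDe that ships with Hive may be a simpler option; a minimal sketch, assuming hive-hcatalog-core.jar is available (the path below is a placeholder):
ADD JAR /path/to/hive-hcatalog-core.jar;
CREATE TABLE products_info_raw(
id STRING,
name STRING,
reseller STRING,
category STRING,
price BIGINT,
discount FLOAT,
profit_percent FLOAT
)
PARTITIONED BY (rptg_dt STRING)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';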

How to use sqoop2 to import a mysql table whose id field is string type?

I am trying to use sqoop2 (1.99.7) to import a mysql table. When creating the sqoop job, the default partition column is id. But in the mysql table, id is a char(36) field, and I don't have any numeric field in this table.
The DDL of the imported table is as follows:
CREATE TABLE `project` (
`id` char(36) NOT NULL,
`created_by` varchar(50) DEFAULT NULL,
`date_created` datetime NOT NULL,
`date_last_modified` datetime DEFAULT NULL,
`last_modified_by` varchar(50) DEFAULT NULL,
`name` varchar(100) NOT NULL,
`status` varchar(10) DEFAULT NULL,
`create_user_id` char(36) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UK_3k75vvu7mevyvvb5may5lj8k7` (`name`),
KEY `FKklsjfrbvub6jy50shboiiqjsm` (`create_user_id`),
CONSTRAINT `FKklsjfrbvub6jy50shboiiqjsm` FOREIGN KEY (`create_user_id`) REFERENCES `platform_user` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The job will take the 'id' field as the default partition column.
The sqoop job is as follows:
1 job(s) to show:
Job with name job1 (Enabled: true, Created by liupeng at 2/2/18 12:57 PM, Updated by xxxx at 2/2/18 2:48 PM)
Throttling resources
Extractors:
Loaders:
Classpath configuration
Extra mapper jars:
From link: jdbclink
Database source
Schema name: sailor
Table name: project
SQL statement:
Column names:
Partition column:
Partition column nullable:
Boundary query:
Incremental read
Check column:
Last value:
To link: hdfslink
Target configuration
Override null value:
Null value:
File format: TEXT_FILE
Compression codec: NONE
Custom codec:
Output directory: hdfs://localhost:9000/user/xxxx
Append mode:
The job error is as follows:
sqoop:000> start job -n job1
2018-02-02 14:48:52 CST: FAILURE_ON_SUBMIT
Exception: org.apache.sqoop.common.SqoopException: CORE_0000:An unknown error has occurred
Stack trace: org.apache.sqoop.common.SqoopException: CORE_0000:An unknown error has occurred
at org.apache.sqoop.utils.ClassUtils.executeWithClassLoader(ClassUtils.java:286)
at org.apache.sqoop.job.mr.SqoopInputFormat.getSplits(SqoopInputFormat.java:72)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:313)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:330)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:203)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.sqoop.submission.mapreduce.MapreduceSubmissionEngine.submitToCluster(MapreduceSubmissionEngine.java:279)
at org.apache.sqoop.submission.mapreduce.MapreduceSubmissionEngine.submit(MapreduceSubmissionEngine.java:260)
at org.apache.sqoop.driver.JobManager.start(JobManager.java:329)
at org.apache.sqoop.handler.JobRequestHandler.startJob(JobRequestHandler.java:353)
at org.apache.sqoop.handler.JobRequestHandler.handleEvent(JobRequestHandler.java:114)
at org.apache.sqoop.server.v1.JobServlet.handlePutRequest(JobServlet.java:84)
at org.apache.sqoop.server.SqoopProtocolServlet.doPut(SqoopProtocolServlet.java:81)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.math.BigDecimal
at org.apache.sqoop.connector.jdbc.GenericJdbcPartitioner.constructTextConditions(GenericJdbcPartitioner.java:490)
at org.apache.sqoop.connector.jdbc.GenericJdbcPartitioner.partitionTextColumn(GenericJdbcPartitioner.java:221)
at org.apache.sqoop.connector.jdbc.GenericJdbcPartitioner.getPartitions(GenericJdbcPartitioner.java:123)
at org.apache.sqoop.connector.jdbc.GenericJdbcPartitioner.getPartitions(GenericJdbcPartitioner.java:38)
at org.apache.sqoop.job.mr.SqoopInputFormat.getSplitsInternal(SqoopInputFormat.java:97)
at org.apache.sqoop.job.mr.SqoopInputFormat.access$000(SqoopInputFormat.java:49)
at org.apache.sqoop.job.mr.SqoopInputFormat$1.call(SqoopInputFormat.java:76)
at org.apache.sqoop.job.mr.SqoopInputFormat$1.call(SqoopInputFormat.java:73)
at org.apache.sqoop.utils.ClassUtils.executeWithClassLoader(ClassUtils.java:281)
... 38 more
How can I achieve this import?
Thanks
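The cast failure happens inside the generic JDBC partitioner when it tries to split on the text id column. One workaround sketch, not tested against this exact setup: expose a numeric surrogate column on the MySQL side (for example via a view) and point the job at it. CRC32 is just one convenient numeric function; any stable numeric derivation of id would do:
CREATE VIEW project_import AS
SELECT CRC32(id) AS split_key, p.*
FROM project p;
Then set 'Table name' to project_import and 'Partition column' to split_key in the job configuration.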

Negative Array Size Exception while inserting into Hive Bucketed Table

I am trying to insert into a Hive bucketed, sorted table and am stuck with a NegativeArraySizeException thrown by the reducer. Please find the stack trace below.
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.NegativeArraySizeException
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:305)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:295)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:514)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
And my table DDL is (Only showing a subset of columns for readability. Actual DDL has 100 columns)
CREATE TABLE clustered_sorted_orc( conv_type string,
multi_dim_id int,
multi_key_id int,
advertiser_id bigint,
buy_id bigint,
day timestamp)
PARTITIONED BY(job_instance_id int)
CLUSTERED BY(conv_type) SORTED BY (day) INTO 8 BUCKETS
STORED AS ORC;
Insert statement is
FROM not_clustered_orc
INSERT OVERWRITE TABLE clustered_sorted_orc PARTITION(job_instance_id)
SELECT conv_type, multi_dim_id, multi_key_id, advertiser_id, buy_id, day, job_instance_id;
The following Hive properties are set:
set hive.enforce.bucketing = true;
set hive.exec.dynamic.partition.mode=nonstrict;
This is a log snippet from MergeManagerImpl which specifies ioSortFactor, mergeThreshold etc., if it helps.
2016-06-30 05:57:20,518 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=12828540928, maxSingleShuffleLimit=3207135232, mergeThreshold=8466837504, ioSortFactor=64, memToMemMergeOutputsThreshold=64
I am using CDH 5.7.1, Hive 1.1.0, Hadoop 2.6.0. Has anyone faced a similar issue before? Any help is really appreciated.
I got it working after setting:
set hive.optimize.sort.dynamic.partition=true;
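For reference, the complete session setup that worked (combining the properties already mentioned in the question with the fix) would look roughly like this before running the insert:
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.optimize.sort.dynamic.partition=true;
FROM not_clustered_orc
INSERT OVERWRITE TABLE clustered_sorted_orc PARTITION(job_instance_id)
SELECT conv_type, multi_dim_id, multi_key_id, advertiser_id, buy_id, day, job_instance_id;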

Data reload from one table to another in Hive

I am loading data from one table into another in Hive, where the properties of the new table differ from the original.
While loading I am facing the issue below. Any help to fix this?
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"mdse_item_i":671841,"co_loc_i":146,"persh_expr_d":"2014-05-01","greg_d":"2013-06-17","persh_oh_q":16.0,"crte_btch_i":765,"updt_btch_i":765,"range_n":"ITEM_LOC_DAY_PERSH_OH_INV_2013-04-01_2013-07-31"}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"mdse_item_i":671841,"
My old table definition:
hive> describe nonclickstream.ITEM_LOC_DAY_PERSH_OH_INV;
OK
mdse_item_i int
co_loc_i int
persh_expr_d string
greg_d string
persh_oh_q double
crte_btch_i int
updt_btch_i int
range_n string
Time taken: 0.058 seconds
My new table definition is below:
hive> describe ITEM_LOC_DAY_PERSH_OH_INV;
OK
mdse_item_i int from deserializer
co_loc_i int from deserializer
persh_expr_d string from deserializer
greg_d string from deserializer
persh_oh_q string from deserializer
crte_btch_i int from deserializer
updt_btch_i int from deserializer
greg_date string
Time taken: 0.241 seconds
The new one is created with an Avro schema.
CREATE external TABLE ITEM_LOC_DAY_PERSH_OH_INV
partitioned by (greg_date string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
Location '/common/TD/INV_new/ITEM_LOC_DAY_PERSH_OH_INV/'
TBLPROPERTIES (
'avro.schema.url'='hdfs:///common/TD/INV_new/ITEM_LOC_DAY_PERSH_OH_INV/ITEM_LOC_DAY_PERSH_OH_INV.avs');
Load command we are using:
INSERT INTO TABLE ITEM_LOC_DAY_PERSH_OH_INV PARTITION (greg_date)
SELECT
mdse_item_i,
co_loc_i,
persh_expr_d,
greg_d,
persh_oh_q,
crte_btch_i,
updt_btch_i,
greg_d FROM nonclickstream.ITEM_LOC_DAY_PERSH_OH_INV where range_n='ITEM_LOC_DAY_PERSH_OH_INV_2013-04-01_2013-07-31';
We are using dynamic partitioning while loading.
What we are actually trying to do is re-partition the table on a different column, while also modifying the schema.
The same approach worked for other tables, but for this table alone we are facing this issue.
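One mismatch that stands out, though it is only a guess from the two definitions above: persh_oh_q is a double in the source table but a string in the Avro-backed target. An explicit cast in the SELECT would at least rule that out:
INSERT INTO TABLE ITEM_LOC_DAY_PERSH_OH_INV PARTITION (greg_date)
SELECT
mdse_item_i,
co_loc_i,
persh_expr_d,
greg_d,
CAST(persh_oh_q AS STRING),
crte_btch_i,
updt_btch_i,
greg_d
FROM nonclickstream.ITEM_LOC_DAY_PERSH_OH_INV
WHERE range_n='ITEM_LOC_DAY_PERSH_OH_INV_2013-04-01_2013-07-31';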

SqlExceptionHelper - Duplicate entry for Unique key Using Spring Data JPA

Here is the situation:
I have 3 tables - Dummy, A and B - where A and B have a One-To-Many relationship, and Dummy is standalone.
These tables have corresponding JPA entities in the data layer. I am using the Repository design pattern, so I access these entities via their corresponding service implementations.
I am making the exact same sequence of calls against these entities:
Get an entity for ID = xxx
Display the entity ID and name (or whatever)
Update a field using entity.setField(YYY)
push it back to DB using: entityService.updateEntity(entity)
The above sequence from #4 to #7 works like a charm for the Dummy table, but fails to execute for A and B. The exception is:
ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Duplicate entry <The name here> for key 'name_UNIQUE'
Exception in thread "main" org.springframework.orm.jpa.JpaSystemException: org.hibernate.exception.ConstraintViolationException: could not execute statement; nested exception is javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute statement
at org.springframework.orm.jpa.EntityManagerFactoryUtils.convertJpaAccessExceptionIfPossible(EntityManagerFactoryUtils.java:321)
at org.springframework.orm.jpa.AbstractEntityManagerFactoryBean.translateExceptionIfPossible(AbstractEntityManagerFactoryBean.java:403)
at org.springframework.dao.support.ChainedPersistenceExceptionTranslator.translateExceptionIfPossible(ChainedPersistenceExceptionTranslator.java:58)
at org.springframework.dao.support.DataAccessUtils.translateIfNecessary(DataAccessUtils.java:213)
The field that I am updating is not unique. The entities have exactly the same structure, with some additional columns here and there.
Here is the code for Dummy Entity:
DummyEntity dummyEntity = dummyService.findDummyEntity(16L);
System.out.println(">>> Name is: " + dummyEntity.getName() + " with ID: " + dummyEntity.getId());
dummyEntity.setName("New Name");
dummyEntity.setRank(3333333);
dummyService.updateDummyEntity(dummyEntity);
I am repeating the exact same steps for the remaining entities A and B.
So what am I doing wrong? Any pointers will be greatly appreciated.
UPDATE:
@erencan - yes, I double-checked that. Here is what I observed after posting the question. Tables A and B (the troublesome ones) have this issue:
When I ask the repository service to return an instance for a given ID from Table A or B, it returns it fine.
When a change is made to that instance using setXXX(), and updateEntity() or saveEntity() is called (as shown in the demo entity code above), the save/update inserts a new row into the table with the exact same attribute values as the old one, but with the new changes incorporated (this was observed by removing the unique key constraint on Tables A and B).
Later, when I query these newly created entities by their ID in the JPA/Java code and perform the exact same steps (change some attribute and call save/update on the repository), these newly created entities (rows in the db table) get updated exactly as expected.
So it seems that the original entity (row) is somehow 'locked' and updates are prevented. Therefore, the JPA save/update call simply tries to create a fresh one instead; and since the fresh entity still has the same set of attribute values, any UNIQUE key constraint will start complaining (of course).
I did some tests on the existing tables (ETL'd into the DB) and found this behavior consistent: if the entity was NOT created by JPA, then JPA can read the data fine but cannot update it; if the entity WAS created by JPA, then JPA can read AND update it fine.
Not sure how this happens though (yet). Here is the schema of Tables A and B:
CREATE TABLE IF NOT EXISTS `mydbschema`.`table-B` (
`id` INT NOT NULL AUTO_INCREMENT COMMENT 'This is PK',
`name` VARCHAR(100) NULL,
`city` VARCHAR(50) NOT NULL,
`state` VARCHAR(30) NOT NULL,
`zip` VARCHAR(5) NOT NULL,
`country` VARCHAR(50) NOT NULL,
`overall_rank` INT NULL,
`inserted` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`insert_src_ver_id` INT NULL,
`updated` TIMESTAMP NULL ON UPDATE CURRENT_TIMESTAMP,
`update_src_ver_id` INT NULL,
`version` INT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `name_UNIQUE` (`name` ASC))
ENGINE = InnoDB;
Another one:
CREATE TABLE IF NOT EXISTS `mydbschema`.`table-A` (
`id` INT NOT NULL AUTO_INCREMENT COMMENT 'This is PK',
`full_name` VARCHAR(200) NULL,
`gender` VARCHAR(1) NULL,
`year_of_birth` VARCHAR(4) NULL,
`title_code` VARCHAR(6) NULL,
`business_role` VARCHAR(30) NULL,
`graduation_year` VARCHAR(4) NULL,
`residency` VARCHAR(500) NULL,
`table-B_id` INT NULL,
`npi_num` VARCHAR(10) NULL,
`upin` VARCHAR(20) NULL,
`dea_num` VARCHAR(20) NULL,
`dea_expire_date` VARCHAR(10) NULL,
`year_started_practicing` VARCHAR(4) NULL,
`high_prescriber` VARCHAR(1) NULL,
`board_action` VARCHAR(1) NULL,
`mdi_qscore` INT NOT NULL DEFAULT 0,
`mdi_cscore` INT NOT NULL DEFAULT 0,
`aco_id` INT NULL,
`npp` INT NULL,
`medicaid_id` VARCHAR(50) NULL,
`medicaid_state` VARCHAR(2) NULL,
`medicare_id` VARCHAR(50) NULL,
`medicare_state` VARCHAR(2) NULL,
`medicare_provider_flag` VARCHAR(1) NULL,
`inserted` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`insert_src_ver_id` INT NULL,
`updated` TIMESTAMP NULL ON UPDATE CURRENT_TIMESTAMP,
`update_src_ver_id` INT NULL,
`version` INT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `hdsphy_id_UNIQUE` (`id` ASC),
UNIQUE INDEX `npi_num_UNIQUE` (`npi_num` ASC),
UNIQUE INDEX `dea_num_UNIQUE` (`dea_num` ASC),
CONSTRAINT `fk_table-A_table-B`
FOREIGN KEY (`table-B_id`)
REFERENCES `mydbschema`.`table-B` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
Here is the FULL Stack trace:
2013-09-30 10:20:49,705 [main] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Duplicate entry '1568673648' for key 'npi_num_UNIQUE'
Exception in thread "main" org.springframework.orm.jpa.JpaSystemException: org.hibernate.exception.ConstraintViolationException: could not execute statement; nested exception is javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute statement
at org.springframework.orm.jpa.EntityManagerFactoryUtils.convertJpaAccessExceptionIfPossible(EntityManagerFactoryUtils.java:321)
at org.springframework.orm.jpa.AbstractEntityManagerFactoryBean.translateExceptionIfPossible(AbstractEntityManagerFactoryBean.java:403)
at org.springframework.dao.support.ChainedPersistenceExceptionTranslator.translateExceptionIfPossible(ChainedPersistenceExceptionTranslator.java:58)
at org.springframework.dao.support.DataAccessUtils.translateIfNecessary(DataAccessUtils.java:213)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:163)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.data.jpa.repository.support.LockModeRepositoryPostProcessor$LockModePopulatingMethodIntercceptor.invoke(LockModeRepositoryPostProcessor.java:84)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at com.sun.proxy.$Proxy60.save(Unknown Source)
at com.mdinsider.platform.domain.PhysicianServiceImpl_Roo_Service.ajc$interMethod$com_mdinsider_platform_domain_PhysicianServiceImpl_Roo_Service$com_mdinsider_platform_domain_PhysicianServiceImpl$updatePhysician(PhysicianServiceImpl_Roo_Service.aj:48)
at com.mdinsider.platform.domain.PhysicianServiceImpl.updatePhysician(PhysicianServiceImpl.java:1)
at com.mdinsider.platform.domain.PhysicianService_Roo_Service.ajc$interMethodDispatch1$com_mdinsider_platform_domain_PhysicianService_Roo_Service$com_mdinsider_platform_domain_PhysicianService$updatePhysician(PhysicianService_Roo_Service.aj)
at com.mdinsider.platform.mediblip.engine.TestDBSave.saveMDIQualityScore(TestDBSave.java:94)
at com.mdinsider.platform.mediblip.engine.TestDBSave.main(TestDBSave.java:142)
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute statement
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1387)
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310)
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1316)
at org.hibernate.ejb.AbstractEntityManagerImpl.merge(AbstractEntityManagerImpl.java:898)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:241)
at com.sun.proxy.$Proxy31.merge(Unknown Source)
at org.springframework.data.jpa.repository.support.SimpleJpaRepository.save(SimpleJpaRepository.java:345)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:334)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:319)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:96)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:260)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:94)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:155)
... 12 more
Caused by: org.hibernate.exception.ConstraintViolationException: could not execute statement
at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:74)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:136)
at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAndExtract(IdentityGenerator.java:96)
at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:58)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2975)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3487)
at org.hibernate.action.internal.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:81)
at org.hibernate.engine.spi.ActionQueue.execute(ActionQueue.java:377)
at org.hibernate.engine.spi.ActionQueue.addResolvedEntityInsertAction(ActionQueue.java:214)
at org.hibernate.engine.spi.ActionQueue.addInsertAction(ActionQueue.java:194)
at org.hibernate.engine.spi.ActionQueue.addAction(ActionQueue.java:178)
at org.hibernate.event.internal.AbstractSaveEventListener.addInsertAction(AbstractSaveEventListener.java:321)
at org.hibernate.event.internal.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:286)
at org.hibernate.event.internal.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:192)
at org.hibernate.event.internal.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:125)
at org.hibernate.ejb.event.EJB3MergeEventListener.saveWithGeneratedId(EJB3MergeEventListener.java:71)
at org.hibernate.event.internal.DefaultMergeEventListener.saveTransientEntity(DefaultMergeEventListener.java:236)
at org.hibernate.event.internal.DefaultMergeEventListener.entityIsTransient(DefaultMergeEventListener.java:216)
at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:154)
at org.hibernate.event.internal.DefaultMergeEventListener.onMerge(DefaultMergeEventListener.java:76)
at org.hibernate.internal.SessionImpl.fireMerge(SessionImpl.java:914)
at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:898)
at org.hibernate.internal.SessionImpl.merge(SessionImpl.java:902)
at org.hibernate.ejb.AbstractEntityManagerImpl.merge(AbstractEntityManagerImpl.java:889)
... 31 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '1568673648' for key 'npi_num_UNIQUE'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1039)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3609)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3541)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2002)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2624)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2127)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2427)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2345)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2330)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:133)
Alright, problem solved. After trying out various ways to isolate the problem, I saw that any rows added manually or by the ETL consistently hit this exception, while any rows added by the Spring Data JPA/Java code work fine.
Hence, the issue was with the manually inserted rows. Then I realized that the VERSION field was sitting at NULL for those manually inserted rows. When I set the values to 0, the manually inserted rows became acceptable to JPA.
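For example, assuming the column in question is the version INT column shown in the DDL above, a one-off fix for the existing rows would be something like:
UPDATE `table-B` SET version = 0 WHERE version IS NULL;
UPDATE `table-A` SET version = 0 WHERE version IS NULL;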
Another alternative is to not have a version field at all in your tables.
Hope this helps folks who come across the same problem.
