Hibernate Search does not remove old value from Lucene index when the object is deleted via a @NoRepositoryBean JPA method

I have a @NoRepositoryBean JPA interface with one custom JPA method, deleteAllByIdIn(...), which is inherited by some concrete JpaRepositories. For some reason this custom delete method is ignored by Hibernate Search: whenever an entity is deleted through it, its value is not removed from the Lucene index after the delete is done. I explain the problem in more detail further down, but first here's the code:
@NoRepositoryBean
public interface NameTranslationDao<T extends NameTranslation> extends JpaRepository<T, Long> {
    @Modifying
    @Transactional
    @Query(value = "DELETE FROM #{#entityName} c WHERE c.id IN :translationsToDelete")
    public void deleteAllByIdIn(@Param("translationsToDelete") Set<Long> translationsToDelete);
}
Here's a JpaRepository subinterface that extends this interface:
@Repository
@Transactional(readOnly = true)
public interface LifeStageCommonNameTranslationDao extends CommonNameTranslationDao<LifeStageCommonNameTranslation> {
}
There's another @NoRepositoryBean interface in between the concrete JpaRepository and the NameTranslationDao @NoRepositoryBean. It is called CommonNameTranslationDao, but it doesn't override the custom method in any way, so it is unlikely to be the cause of the problem. Nevertheless, here's the code of that repository:
@NoRepositoryBean
public interface CommonNameTranslationDao<T extends NameTranslation> extends NameTranslationDao<T> {
    @Deprecated
    @Transactional(readOnly = true)
    @Query("SELECT new DTOs.AutoCompleteSuggestion(u.parent.id, u.autoCompleteSuggestion) FROM #{#entityName} u WHERE u.autoCompleteSuggestion LIKE :searchString% AND deleted = false AND (u.language.id = :preferredLanguage OR u.language.id = :defaultLanguage)")
    List<AutoCompleteSuggestion> findAllBySearchStringAndDeletedIsFalse(@Param("searchString") String searchString, @Param("preferredLanguage") Long preferredLanguage, @Param("defaultLanguage") Long defaultLanguage);
    @Transactional(readOnly = true)
    @Query(nativeQuery = true, value = "SELECT s.translatedName FROM #{#entityName} s WHERE s.language_id = :preferredLanguage AND s.parent_id = :parentId LIMIT 1")
    public String findTranslatedNameByParentAndLanguage(@Param("preferredLanguage") Long languageId, @Param("parentId") Long parentId);
    @Modifying
    @Transactional
    @Query(nativeQuery = true, value = "DELETE FROM #{#entityName} WHERE id = :id")
    void hardDeleteById(@Param("id") Long id);
    @Modifying
    @Transactional
    @Query(nativeQuery = true, value = "UPDATE #{#entityName} c SET c.deleted = TRUE WHERE c.id = :id")
    void softDeleteById(@Param("id") Long id);
}
Also, here's the code of the LifeStageCommonNameTranslation entity class:
@Entity
@Indexed
@Table(
    uniqueConstraints = {
        @UniqueConstraint(name = "UC_life_cycle_type_language_id_translatedName", columnNames = {"translatedName", "parent_id", "language_id"})
    },
    indexes = {
        @Index(name = "IDX_lifestage", columnList = "parent_id"),
        @Index(name = "IDX_translator", columnList = "user_id"),
        @Index(name = "IDX_species_language", columnList = "language_id, parent_id, deleted"),
        @Index(name = "IDX_autoCompleteSuggestion_language", columnList = "autoCompleteSuggestion, language_id, deleted")})
public class LifeStageCommonNameTranslation extends NameTranslation<LifeStage> implements AuthorizationSubject {
    @Id @DocumentId
    @GenericGenerator(
        name = "sequenceGeneratorLifeStageCommonNameTranslation",
        strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
        parameters = {
            @org.hibernate.annotations.Parameter(name = "sequence_name", value = "_lifestagecommonnametranslation_hibernate_sequence"),
            @org.hibernate.annotations.Parameter(name = "optimizer", value = "pooled"),
            @org.hibernate.annotations.Parameter(name = "initial_value", value = "1"),
            @org.hibernate.annotations.Parameter(name = "increment_size", value = "25"),
            @org.hibernate.annotations.Parameter(name = "prefer_sequence_per_entity", value = "true")
        }
    )
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE,
        generator = "sequenceGeneratorLifeStageCommonNameTranslation"
    )
    @Field(analyze = Analyze.NO, store = Store.YES, name = "parentId")
    private Long id;
    @IndexedEmbedded(includeEmbeddedObjectId = true)
    @ManyToOne(fetch = FetchType.LAZY)
    private LifeStage parent;
    @Field(index = NO, store = Store.YES)
    private String autoCompleteSuggestion;
    // Getters and setters omitted
The problem is the following: whenever I use the inherited deleteAllByIdIn() method on LifeStageCommonNameTranslationDao, Hibernate Search does not remove the autoCompleteSuggestion field value from the Lucene index after the entity has been deleted. If, however, I use the standard deleteById() JpaRepository method to delete the entity, the field value is removed from the Lucene index.
Both the custom and the standard delete method were called within a @Transactional annotated method, and I also called the flush() JpaRepository method right afterwards, because I had read that this can sometimes help to update the Lucene index. In the case of deleteAllByIdIn(), however, calling flush() afterwards did not help at all.
I already ruled out the possibility that the problem was caused by the SpEL expression in the query. I tested this by replacing #{#entityName} with a concrete entity name like LifeStageCommonNameTranslation and then calling deleteAllByIdIn(). The problem persisted: the Lucene index still did not remove the autoCompleteSuggestion field value after the delete.
I can easily solve this problem by simply using the standard JPA method deleteById(), but I want to know why the custom JPA method deleteAllByIdIn() does not cause Hibernate Search to update the Lucene index.

Hibernate Search detects entity change events happening in your Hibernate ORM Session/EntityManager. This excludes insert/update/delete statements that you wrote yourself in JPQL or native SQL queries.
The limitation is documented here: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#limitations-changes-in-session
The workaround is documented there too:
One workaround is to reindex explicitly after you run JPQL/SQL queries, either using the MassIndexer or manually.
EDIT: And of course your workaround might be valid as well, if deleteById loads the entity in the session before deleting it (I'm not that familiar with the internals of Spring Data JPA):
I can easily solve this problem by simply using the standard JPA method deleteById(), but I want to know why the custom JPA method deleteAllByIdIn() does not cause Hibernate Search to update the Lucene index.
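To make the limitation concrete, here is a toy model in plain Java. It is not Hibernate ORM or Hibernate Search code: ToySession, remove, bulkDelete and purgeStaleDocuments are hypothetical names invented for this sketch. It only assumes what the documentation states, namely that the index is updated from per-entity events seen by the session, while a bulk JPQL/SQL statement goes straight to the table.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: imitates the behaviour described in the answer,
// it is NOT part of Hibernate ORM or Hibernate Search.
class ToySession {
    // Stand-in for the database table: id -> autoCompleteSuggestion.
    final Map<Long, String> table = new HashMap<>();
    // Stand-in for the Lucene index: ids of the indexed documents.
    final Set<Long> index = new HashSet<>();

    void persist(Long id, String suggestion) {
        table.put(id, suggestion);
        index.add(id); // insert event seen by the session -> document indexed
    }

    // Comparable to removing a loaded entity: the session sees the entity,
    // so a delete event is raised and the index is updated.
    void remove(Long id) {
        if (table.remove(id) != null) {
            index.remove(id);
        }
    }

    // Comparable to "DELETE ... WHERE id IN (...)": the rows disappear,
    // but no per-entity event is raised, so the index is left untouched.
    void bulkDelete(Set<Long> ids) {
        table.keySet().removeAll(ids);
    }

    // Comparable to the documented workaround of reindexing explicitly:
    // drop every document whose row no longer exists.
    void purgeStaleDocuments() {
        index.retainAll(table.keySet());
    }
}
```

In this model, calling bulkDelete leaves the deleted id in index until purgeStaleDocuments runs, which mirrors the stale autoCompleteSuggestion value described in the question.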

Related

Spring JPA - insert list with batch_size using native query in JpaRepository or CrudRepository

UPDATE: Thank you to @M.Deinum for informing me how to deal with the @ManyToOne cascade issue that I was previously stuck on, by using the EntityManager getReference or JpaRepository getOne function. I am now able to batch save with basic JpaRepository methods as follows:
@Transactional
public void insertCommands(List<CommandDto> dtos) {
    final List<Command> commands = new ArrayList<>();
    for (CommandDto dto : dtos) {
        ZonedDateTime now = ZonedDateTime.now();
        final Request request = requestRepository.getOne(dto.getRequestId());
        String commandId = UUID.randomUUID().toString();
        final Command command = new Command();
        command.setId(commandId);
        command.setCreatedBy(SYSTEM);
        command.setCreatedTimestamp(now);
        command.setStatus(dto.getStatus());
        command.setContent(dto.getContent());
        command.setSendOnDate(dto.getSendOnDate());
        command.setRequest(request);
        commands.add(command);
    }
    commandRepository.saveAll(commands);
}
Original post content as seen below:
I need to insert multiple rows to my application's database using the batch_size property set in my properties:
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
spring.jpa.properties.hibernate.jdbc.batch_size=10
I am looking for a way to insert them using a native query in JpaRepository or CrudRepository, with a syntax along the lines of the following, but allowing for multiple rows:
@Modifying
@Transactional
@Query(nativeQuery = true, value = "INSERT into command (id, created_by, created_timestamp, " +
    " updated_by, updated_timestamp, status, content, send_on_date, request_id) " +
    "VALUES (:id, :createdBy, :createdTimestamp, :updatedBy, :updatedTimestamp, " +
    " :status, :content, :sendOnDate, :requestId) ")
int batchInsertCommandDto(@Param("commands") List<CommandDto> commandDtos);
How can I perform this sort of query with a list?
NOTE: Before you bring up EntityManager, I must note that I have not had any luck with the saveAll functionality because:
I am translating the data from its JSON-friendly class (CommandDto) to its data entity class (Command).
The class for the entity (Command) has a @ManyToOne annotation for one or more Object fields, whereas the JSON object (CommandDto) simply has the id of these fields. For example, the Java class for the command entity has a "request" field:
@ManyToOne(optional = false, fetch = FetchType.LAZY, cascade = CascadeType.MERGE)
@JoinColumn(name = "request_id", foreignKey = @ForeignKey(name="fk_rcommand_request_id"))
private Request request;
Whereas the CommandDto object simply has the field "requestId". That means that if I simply try creating a Request object with only the requestId, the EntityManager will fail to save because the Request object is not fully formed and therefore not recognized. It would be grossly inefficient to retrieve the Request object for each command being saved, so I am looking to do the mapping as seen in the native query above.
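As a rough illustration of what a batch size of 10 means for a list of rows, the plain-Java sketch below splits a list into consecutive chunks of at most batchSize elements. BatchPartitioner and partition are hypothetical names made up for this example; Hibernate does its own grouping internally when hibernate.jdbc.batch_size is set, so this only models the grouping and is not something you need to call yourself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring Data or Hibernate.
class BatchPartitioner {
    // Splits items into consecutive chunks of at most batchSize elements,
    // preserving the original order.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With a batch size of 10, a list of 25 commands would be grouped into chunks of 10, 10 and 5 rows.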

Axon - State Stored Aggregates exception in test

Environment setup : Axon 4.4, H2Database( we are doing component testing as part of the CI)
Code looks something like this.
@Aggregate(repository = "ARepository")
@Entity
@DynamicUpdate
@Table(name = "A")
@Getter
@Setter
@NoArgsConstructor
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false)
@Log4j2
class A implements Serializable {
    @CommandHandler
    public void handle(final Command1 c1) {
        apply(EventBuilder.buildEvent(c1));
    }
    @EventSourcingHandler
    public void on(final Event1 e1) {
        //some updates to the model
        apply(new Event2());
    }
    @Id
    @AggregateIdentifier
    @EntityId
    @Column(name = "id", length = 40, nullable = false)
    private String id;
    @OneToMany(
        cascade = CascadeType.ALL,
        fetch = FetchType.LAZY,
        orphanRemoval = true,
        targetEntity = B.class,
        mappedBy = "id")
    @AggregateMember(eventForwardingMode = ForwardMatchingInstances.class)
    @JsonIgnoreProperties("id")
    private List<C> transactions = new ArrayList<>();
}
@Entity
@Table(name = "B")
@DynamicUpdate
@Getter
@Setter
@NoArgsConstructor
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false)
@Log4j2
class B implements Serializable {
    @Id
    @EntityId
    @Column(name = "id", nullable = false)
    @AggregateIdentifier
    private String id;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumns({@JoinColumn(name = "id", referencedColumnName = "id")})
    @JsonIgnoreProperties("transactions")
    private A a;
    @EventSourcingHandler
    public void on(final Event2 e2) {
        //some updates to the model
    }
}
I'm using a state-stored aggregate, but I keep getting the error below at random during Spring tests with embedded H2. The same issue does not occur with a PostgreSQL DB in non-embedded mode, but then we are not able to run it in the pipeline.
Error: "java.lang.IllegalStateException: The aggregate identifier has not been set. It must be set at the latest when applying the creation event"
I stepped through AnnotatedAggregate:
protected <P> EventMessage<P> createMessage(P payload, MetaData metaData) {
if (lastKnownSequence != null) {
String type = inspector.declaredType(rootType())
.orElse(rootType().getSimpleName());
long seq = lastKnownSequence + 1;
String id = identifierAsString();
if (id == null) {
Assert.state(seq == 0,
() -> "The aggregate identifier has not been set. It must be set at the latest when applying the creation event");
return new LazyIdentifierDomainEventMessage<>(type, seq, payload, metaData);
}
return new GenericDomainEventMessage<>(type, identifierAsString(), seq, payload, metaData);
}
return new GenericEventMessage<>(payload, metaData);
}
The sequence for this gets set to 2, and hence it throws the exception instead of lazily initializing the aggregate.
What's the fix for this? Am I missing some configuration, or does this need a fix in the Axon code?
I believe the exception you are getting points to what you are missing, @Rohitdev. When an aggregate is being created in Axon, the framework at the very least assumes you will set the aggregate identifier, that is, that you will fill in the @AggregateIdentifier annotated field present in your aggregate.
This is a mandatory validation: without an aggregate identifier, you are essentially missing the external reference to the aggregate. You would then simply not be able to dispatch subsequent commands to this aggregate, as there is no means to route them.
From the code snippets you've shared, nothing indicates that the @AggregateIdentifier annotated String id fields in aggregates A or B are ever set. Not doing this, in combination with using Axon's test fixtures, will lead to the exception you are getting.
When using a state-stored aggregate, know that you change the state of the aggregate inside the command handler. This means that, next to invoking the AggregateLifecycle#apply(Object) method in your command handler, you also set the id to the desired aggregate identifier there.
There are two other main pointers to share based on the question.
There is no command handler inside your aggregate which creates the aggregate itself. You should either have a @CommandHandler annotated constructor in your aggregates, or use the @CreationPolicy annotation to define a regular method as the creation point of the aggregate (as mentioned here in the reference guide).
Lastly, your sample still uses @EventSourcingHandler annotated functions, which should be used when you have an event-sourced aggregate. It sounds like you have made a conscious decision against event sourcing, hence I wouldn't use those annotations in your model either. Right now they will likely only confuse developers into thinking that a mix of state-stored and event-sourced aggregate logic is being used.
Finally, after debugging, we found out that in class B we were not setting the id for the update event:
@EventSourcingHandler
public void on(final Event2 e2) {
    this.id = e2.getId();
}
Once we did that, the issue went away.
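The check that produced the exception can be modelled in isolation. The sketch below is a simplified stand-alone imitation of the branch quoted from AnnotatedAggregate in the question, not Axon code: IdentifierCheck and describe are hypothetical names, and the returned strings merely stand in for the message objects the real method creates.

```java
// Simplified imitation of the identifier check quoted in the question;
// NOT the actual Axon implementation.
class IdentifierCheck {
    // id: the aggregate identifier as a string, or null if it was never set.
    // lastKnownSequence: sequence number of the last applied event.
    static String describe(String id, long lastKnownSequence) {
        long seq = lastKnownSequence + 1;
        if (id == null) {
            // A missing identifier is only tolerated for the very first event.
            if (seq != 0) {
                throw new IllegalStateException(
                    "The aggregate identifier has not been set. "
                    + "It must be set at the latest when applying the creation event");
            }
            return "lazy-identifier event, seq=" + seq;
        }
        return "event for aggregate " + id + ", seq=" + seq;
    }
}
```

In this model, describe(null, 1) throws because the next sequence number is 2, which matches the situation described in the question, while the same call with a non-null id succeeds. That is consistent with the fix of assigning this.id in the handler.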

Spring entity dynamically calculated field with parameter from native query

I have a somewhat complex entity like following (Notice the super-class with many more fields):
public class Question extends Entry {
    @OneToMany(orphanRemoval = true, mappedBy = "question")
    @JsonManagedReference
    private List<Answer> answers = new ArrayList<>();
    private Long viewCount = 0L;
    private Category category;
    @OneToMany(mappedBy = "question", fetch = FetchType.LAZY,
        cascade = CascadeType.ALL, orphanRemoval = true)
    private List<QuestionTranslation> translations = new ArrayList<>();
    @Transient
    private double distance;
}
distance should be calculated from the DB when retrieving the result set from a native query.
E.g.
SELECT q.*, ST_Distance_Sphere(cast(q.location as geometry), ST_MakePoint(cast(?1 as double precision), cast(?2 as double precision))) as distance from question q
I cannot use @Formula to annotate my field distance since the query has to take parameters.
How can I map the field distance from the SQL query result to my entity field distance while leaving all the other mappings to be done by Hibernate?
Edit
Based on @gmotux's suggestion, I created a wrapper entity.
@Entity
@SqlResultSetMapping(
    name="MappingQ",
    entities={
        @EntityResult(
            entityClass = QuestionWithDistance.class,
            fields={
                @FieldResult(name="distance",column="distance"),
                @FieldResult(name="question",column="question")})})
public class QuestionWithDistance{
    @Id
    @GeneratedValue
    private String id;
    @OneToOne
    private Question question;
    private double distance;
}
Query
Query query = entityManager.createNativeQuery("SELECT q.*, 222.22 as distance from question q", "MappingQ");
But it always fails with
org.postgresql.util.PSQLException: The column name id1_15_0_ was not found in this ResultSet.
Since you need extra parameters to calculate your field, you indeed cannot use @Formula, or even a getter, to calculate the field.
Unfortunately, for your case the only thing that comes to mind, assuming you are using an EntityManager-based configuration for Hibernate, is leveraging its @PostLoad event listener, which you can use to calculate field values upon entity loading, like:
public class Question extends Entry {
    @PostLoad
    private void postLoad() {
        // param1 and param2 are placeholders for the two query parameters
        this.distance = DistanceCalculator.calculateDistance(param1, param2);
        // other calculations
    }
}
That, of course, is only a workaround, and it means that you must have a static method somewhere that executes native queries.
I would suggest detaching the "distance" notion from your Question entity, if your requirements allow it, and calculating it when required, with either a native SQL function call or a service method.
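If the distance is computed in a service method instead of in SQL, the haversine formula is one common option. The sketch below is an assumption-laden illustration: the four-argument signature is hypothetical (the answer does not specify one), the Earth is treated as a sphere with a mean radius of 6,371,008.8 m, and the database function may use a slightly different radius, so results can differ marginally from ST_Distance_Sphere.

```java
// Hypothetical service-side helper; the signature is an assumption.
class DistanceCalculator {
    // Assumed mean Earth radius in metres (spherical model).
    static final double EARTH_RADIUS_M = 6_371_008.8;

    // Great-circle distance in metres between two points given in
    // decimal degrees, using the haversine formula.
    static double calculateDistance(double lat1, double lon1,
                                    double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_M * c;
    }
}
```

A service could call this for each Question after loading it and set the transient distance field, which keeps the entity mapping free of query parameters.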

Why is JPA query so slow?

I am implementing queries in my web application with JPA repositories. The two main tables I am querying from are FmReportTb and SpecimenTb.
Here are the two entity classes (only important attributes are listed).
//FmReportTb.java
@Entity
@Table(name="FM_REPORT_TB")
public class FmReportTb implements Serializable {
    @Column(name="ROW_ID")
    private long rowId;
    @Column(name="FR_BLOCK_ID")
    private String frBlockId;
    @Column(name="FR_FULL_NAME")
    private String frFullName;
    @OneToOne
    @JoinColumn(name="SPECIMEN_ID")
    private SpecimenTb specimenTb;
FmReportTb has a OneToOne relationship with SpecimenTb.
@Entity
@Table(name="SPECIMEN_TB")
public class SpecimenTb implements Serializable {
    private String mrn;
    @OneToOne(mappedBy="specimenTb", cascade=CascadeType.ALL)
    private FmReportTb fmReportTb;
The query I am working on is to find all records in FmReportTb and show a few attributes from FmReportTb plus mrn from SpecimenTb.
Here is my JPA repository for FmReportTb:
@Repository
public interface FmReportRepository extends JpaRepository<FmReportTb, Long> {
    @Query("select f from FmReportTb f where f.deleteTs is not null")
    public List<FmReportTb> findAllFmReports();
Since I am only showing some of the attributes from FmReportTb and one attribute from SpecimenTb, I decided to create a value object (VO) for FmReportTb. The constructor of the VO class assigns attributes from FmReportTb and grabs the mrn attribute from SpecimenTb based on the OneToOne relationship. Another reason for using a VO is that the table FmReportTb has a lot of OneToMany child entities, and for this particular query I don't need any of them.
public class FmReportVO {
private String frBlockId;
private Date frCollectionDate;
private String frCopiedPhysician;
private String frDiagnosis;
private String frFacilityName;
private String frFullName;
private String frReportId;
private String filepath;
private String mrn;
public FmReportVO(FmReportTb fmReport) {
this.frBlockId = fmReport.getFrBlockId();
this.frCollectionDate = fmReport.getFrCollectionDate();
this.frCopiedPhysician = fmReport.getFrCopiedPhysician();
this.frDiagnosis = fmReport.getFrDiagnosis();
this.frFacilityName = fmReport.getFrFacilityName();
this.frFullName = fmReport.getFrFullName();
this.frReportId = fmReport.getFrReportId();
this.mrn = fmReport.getSpecimenTb().getMrn();
}
I implemented a find-all method in the service bean class to return a list of FmReportTb VOs.
//FmReportServiceBean.java
@Override
public List<FmReportVO> findAllFmReports() {
List<FmReportTb> reports = fmReportRepository.findAllFmReports();
if (reports == null) {
return null;
}
List<FmReportVO> fmReports = new ArrayList<FmReportVO>();
for (FmReportTb report : reports) {
FmReportVO reportVo = new FmReportVO(report);
String filepath = fileLoadRepository.findUriByFileLoadId(report.getFileLoadId().longValue());
reportVo.setFilepath(filepath);
fmReports.add(reportVo);
}
return fmReports;
}
Lastly, my controller looks like this:
@RequestMapping(
value = "/ristore/foundation/",
method = RequestMethod.GET,
produces = "application/json")
public ResponseEntity<List<FmReportVO>> getAllFmReports() {
List<FmReportVO> reports = ristoreService.findAllFmReports();
if (reports == null) {
return new ResponseEntity<List<FmReportVO>>(HttpStatus.NOT_FOUND);
}
return new ResponseEntity<List<FmReportVO>>(reports, HttpStatus.OK);
}
There are about 200 records in the database. Surprisingly, it took almost 2 full seconds to retrieve all the records as JSON. Even though I did not index all the tables, this is way too slow. A similar query probably takes a few ms when run directly on the database. Is it because I am using value objects, or do JPA queries tend to be this slow?
EDIT 1
This may have to do with the fact that FmReportTb has almost 20 OneToMany entities. Although the fetch mode of these child entities is set to LAZY, the JPA Data repository tends to ignore the fetch mode. So I ended up using a NamedEntityGraph to specify the attributes to fetch EAGERly. This next section is added to the head of my FmReportTb entity class.
@Entity
@NamedEntityGraph(
    name = "FmReportGraph",
    attributeNodes = {
        @NamedAttributeNode("fileLoadId"),
        @NamedAttributeNode("frBlockId"),
        @NamedAttributeNode("frCollectionDate"),
        @NamedAttributeNode("frDiagnosis"),
        @NamedAttributeNode("frFullName"),
        @NamedAttributeNode("frReportId"),
        @NamedAttributeNode("specimenTb")})
@Table(name="FM_REPORT_TB")
And then @EntityGraph("FmReportGraph") was added before the JPA repository query to find all records. After doing that, the performance improved a little. Now fetching 1500 records takes only about 10 seconds. However, it still seems too slow given that each JSON object is fairly small.
Answering for the benefit of others with slow JPA queries...
As @Ken Bekov hints in the comments, foreign keys can help a lot with JPA.
I had a couple of tables with a many-to-one relationship, and a query of 100,000 records was taking hours to perform. Without any code changes I reduced this to seconds just by adding a foreign key.
In phpMyAdmin you do this by creating a Relationship from the "many" table to the "one" table. For a detailed explanation see this question: Setting up foreign keys in phpMyAdmin?
and the answer by @Devsi Odedra

NamedEntityGraph Returns All Columns and Objects

I am trying to utilize a NamedEntityGraph to limit the return data for specific queries. Mainly I do not want to return full object details when listing the object. A very simple class example is below.
@Entity
@Table(name="playerreport",schema="dbo")
@NamedEntityGraphs({
    @NamedEntityGraph(name = "report.simple",
        attributeNodes =
            {@NamedAttributeNode(value="intId")
            }
    )
})
public class PlayerReportEntity {
    @Id
    @Column(name="intid",columnDefinition="uniqueidentifier")
    private String intId;
    @Column(name="plyid",columnDefinition="uniqueidentifier")
    @Basic(fetch=FetchType.LAZY)
    private String plyId;
    @ManyToOne(fetch=FetchType.LAZY)
    @JoinColumn(name = "plyid", insertable=false,updatable=false)
    private PlayerEntity player;
No matter what I do, plyId and player are always returned. Is there any way to return only the requested column (intId)?
As for the association, Hibernate does not do the join for the player object, but it still returns player as null. So that part is working to an extent.
I am using the JpaRepository below to generate CRUD statements for me:
public interface PlayerReportRepository extends JpaRepository<PlayerReportEntity, String> {
    @EntityGraph(value="report.simple")
    List<PlayerIntelEntity> findByPlyId(@Param(value = "playerId") String playerId);
    @Override
    @EntityGraph(value="report.simple")
    public PlayerIntelEntity findOne(String id);
}
A chunk of text from here: "Hence it seems that the @NamedEntityGraph only affects fields that are Collections, but fields that are not a Collection are always loaded." (from JIRA)
Please use Example 47 on this page and use repositories accordingly.
In essence, Hibernate currently loads all the fields in the class; for collections, the entity graph will work if you follow the example stated above.
Thanks.

Resources