Indexing problem with Spring Data Elastic migration from 3.x to 4.x - spring-boot

In our monolith application that used JHipster 6.10.5, we were using Spring Data Elasticsearch 3.3.1 with Elasticsearch 6.8.8. We have multiple @ManyToOne and @OneToMany relationships across more than 100 entities.
In some cases up to 7 entities reference each other (I mean interlinked, not just one to the other).
For Elasticsearch indexing, we have been using:
To ignore indexing: @JsonIgnoreProperties(value = { "unwanted fields" }, allowSetters = true), and @JsonIgnore where not needed.
To map the relations: on the @ManyToOne side we use @JsonBackReference, with a corresponding @JsonManagedReference on the respective @OneToMany side.
Now we are in the process of migrating to JHipster 7.0.1 and have started seeing the problems below:
New Spring Data Elasticsearch version: 4.1.6 with Elasticsearch 7.9.3.
Since the Jackson-based mapper is no longer available in Spring Data Elasticsearch, we are seeing multiple StackOverflowErrors. Below is the migration change we made to the annotations:
On the relationships we added @Field(type = FieldType.Nested, ignoreMalformed = true, ignoreFields = {"unwanted fields"}). This stopped the StackOverflowErrors at the Spring Data level, but they are still thrown internally at the Elasticsearch rest-client level. So we are forced to use @Transient to exclude all the @OneToMany relations.
Even on @ManyToOne relations with the above-mentioned @Field annotation present, we get an ElasticsearchException: "Limit of total fields [1000] in index [] has been exceeded".
I have tried to follow the Spring Data documentation, but could not resolve it.
We have also kept the Jackson annotations generated by JHipster, but they have no effect.
We are stalled at the moment, as we are not sure how to resolve these issues. Personally, I found the Jackson annotations convenient and well documented. Being new to both Elasticsearch and Spring Data Elasticsearch (we only started using them about 8 months ago), we are not able to figure out how to fix these errors.
Please ask if I missed any information you need; I will share as much as doesn't violate our org policies.
Sample code Repository as requested on gitter: https://gitlab.com/thelearner214/spring-data-es-sample
Thank you in advance

Had a look at the repository you linked on gitter (you might consider adding the link here).
First: the @Field annotation is used to write the index mapping, and its ignoreFields property is needed to break circular references when the mapping is built. It is not used when an entity is written to Elasticsearch.
What happens, for example, with the Address and Customer entities when writing to Elasticsearch: the Customer document has Addresses, so these addresses are converted to subdocuments embedded in the Customer document. But an Address has a Customer, so on writing the address, the Customer is embedded into this Address element, which is itself already a subdocument of the Customer.
I suppose Customers should not be stored in Addresses, and the other way round. So you need to mark these embedded documents with @org.springframework.data.annotation.Transient; you don't need the @Field annotation on them, as you do not want to store them as properties in the index.
Jackson annotations are no longer used by Spring Data Elasticsearch.
The basic problem with the approach used here is that a model from the relational world - different tables linked and joined with (one|many)-to-(one|many) relationships, manifested in a Java object graph by an ORM mapper - is applied to a document-based data store that does not use these concepts.
It used to work in your previous version because older versions of Spring Data Elasticsearch used Jackson as well, so these fields were skipped on writing; now you have to add the @Transient annotation, which is a Spring Data annotation.
But I don't know how @Transient might interfere with Spring Data JPA - another point showing that it's not a good idea to use the same Java class for different stores.
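To illustrate why the mutual embedding described above never terminates, here is a minimal stdlib-only sketch. The Customer/Address names mirror the entities mentioned above; the toMap serializer is a hypothetical stand-in for the entity writer, not Spring Data Elasticsearch's actual code:

```java
import java.util.*;

// Two entities referencing each other, as a one-to-many/many-to-one pair would.
class Customer {
    String name;
    List<Address> addresses = new ArrayList<>();
}

class Address {
    String street;
    Customer customer; // back-reference: this is what makes naive embedding recurse
}

public class CycleDemo {
    // Naive embedding would follow every reference, so Customer -> Address -> Customer
    // recurses forever (a real mapper fails with StackOverflowError).
    // The guard below skips objects already on the current path - the effect that
    // @Transient (or, formerly, Jackson's @JsonBackReference) achieves declaratively.
    static Map<String, Object> toMap(Object entity, Set<Object> onPath) {
        Map<String, Object> doc = new LinkedHashMap<>();
        if (!onPath.add(entity)) {
            return doc; // cycle detected: emit an empty subdocument instead of recursing
        }
        if (entity instanceof Customer) {
            Customer c = (Customer) entity;
            doc.put("name", c.name);
            List<Object> subDocs = new ArrayList<>();
            for (Address a : c.addresses) subDocs.add(toMap(a, onPath));
            doc.put("addresses", subDocs);
        } else if (entity instanceof Address) {
            Address a = (Address) entity;
            doc.put("street", a.street);
            doc.put("customer", toMap(a.customer, onPath)); // empty map when cyclic
        }
        onPath.remove(entity);
        return doc;
    }

    public static void main(String[] args) {
        Customer c = new Customer();
        c.name = "ACME";
        Address a = new Address();
        a.street = "Main St";
        a.customer = c;
        c.addresses.add(a);
        System.out.println(toMap(c, new HashSet<>()));
    }
}
```

Marking the back-reference @Transient simply removes the `customer` branch from the written document, which is why the recursion stops.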

Here is an approach we are using as a stop-gap arrangement until we rewrite or find a better solution. We can't use separate classes for ES, as @P.J.Meisch advised, because we have a large number of entities to maintain and a "microservice migration program" is already in progress.
Posting here as it might be useful for someone else with a similar issue.
We created a utility that serializes and deserializes the entity, to get the benefit of the Jackson annotations on the class (@JsonIgnoreProperties, @JsonIgnore etc.).
This way, we are able to reduce usage of the @Transient annotation and still get the ID(s) of the related object(s).
package com.sample.shop.service.util;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.datatype.hibernate5.Hibernate5Module;
import org.jetbrains.annotations.NotNull;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ESUtils {

    private static final Logger log = LoggerFactory.getLogger(ESUtils.class);

    public static <T> Optional<T> mapForES(Class<T> type, T input) {
        ObjectMapper mapper = getObjectMapper();
        try {
            // Round-trip through JSON so the Jackson annotations are applied
            return Optional.ofNullable(mapper.readValue(mapper.writeValueAsString(input), type));
        } catch (JsonProcessingException e) {
            log.error("Parsing exception {}", e.getMessage());
            return Optional.empty();
        }
    }

    public static <T> List<T> mapListForES(Class<T> type, List<T> input) {
        ObjectMapper mapper = getObjectMapper();
        try {
            JavaType javaType = mapper.getTypeFactory().constructCollectionType(List.class, type);
            String serialText = mapper.writeValueAsString(input);
            return mapper.readValue(serialText, javaType);
        } catch (JsonProcessingException e) {
            log.error("Parsing exception {}", e.getMessage());
            return Collections.emptyList();
        }
    }

    @NotNull
    private static ObjectMapper getObjectMapper() {
        ObjectMapper mapper = new ObjectMapper();
        mapper.configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false);
        mapper.configure(SerializationFeature.WRITE_SELF_REFERENCES_AS_NULL, true);
        mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        Hibernate5Module module = new Hibernate5Module();
        module.disable(Hibernate5Module.Feature.FORCE_LAZY_LOADING);
        module.enable(Hibernate5Module.Feature.SERIALIZE_IDENTIFIER_FOR_LAZY_NOT_LOADED_OBJECTS);
        module.enable(Hibernate5Module.Feature.USE_TRANSIENT_ANNOTATION);
        module.enable(Hibernate5Module.Feature.REPLACE_PERSISTENT_COLLECTIONS);
        mapper.registerModule(module); // the module has no effect unless it is registered
        return mapper;
    }
}
Then, to save a single entry, we adjusted the save logic to use the utility:
// instead of the JHipster-generated categorySearchRepository.save(result), use ESUtils:
ESUtils.mapForES(Category.class, category).map(res -> categorySearchRepository.save(res));
And to save a list for bulk re-indexing, using the second utility:
Page<Category> categoryPage = jpaRepository.findAll(page);
List<Category> categoryList = ESUtils.mapListForES(Category.class, categoryPage.getContent());
elasticsearchRepository.saveAll(categoryList);
Might not be the best solution, but it got the work done for our migration.

@Lina Basuni You can use java.util.Collections.emptyList().

Related

JPA issue mapping Cassandra Java Entity to table name with snake case

I am using the drivers below:
implementation 'com.datastax.astra:astra-spring-boot-starter:0.3.0'
implementation 'com.datastax.oss:java-driver-core:4.14.1'
implementation 'com.datastax.oss:java-driver-query-builder:4.14.1'
implementation 'com.datastax.oss:java-driver-mapper-runtime:4.14.1'
implementation 'org.springframework.boot:spring-boot-starter-data-cassandra'
Here are my entities:
@NamingStrategy(convention = NamingConvention.SNAKE_CASE_INSENSITIVE)
@CqlName("engine_torque_by_last_miles")
@Entity
public class EngineTorqueByLastMiles {

    private UUID id;

    @PartitionKey(1)
    private String vinNumber;
}
Here is my repository:
public interface EngineTorqueByLastMilesRepository extends CassandraRepository<EngineTorqueByLastMiles, String> {
    List<EngineTorqueByLastMiles> findAllByVinNumberAndOrganizationId(String vinNumber, Integer organizationId);
}
The problem I am facing is that spring-data-cassandra does not map the entity name or the attributes to snake_case, even after using the NamingStrategy or CqlName annotations from the DataStax drivers.
Does DataStax provide any driver that supports JPA, so that I can write my entities and their attributes in the typical Java naming convention, and have the Cassandra tables and attributes in snake_case?
DataStax does indeed provide a way to map objects to your Cassandra tables; it is called the Cassandra object mapper and is documented here: https://docs.datastax.com/en/developer/java-driver/4.13/manual/mapper/ BUT YOU DO NOT NEED IT HERE.
Looking at your code, it seems you want to use Spring Data Cassandra. This is totally fine. You are simply not using the proper set of annotations; you should use the Spring Data annotations.
Your bean becomes:
@Table("engine_torque_by_last_miles")
public class EngineTorqueByLastMiles {

    @PrimaryKeyColumn(name = "vin_number", ordinal = 0, type = PrimaryKeyType.PARTITIONED)
    private String vinNumber;

    @Column("id")
    @CassandraType(type = Name.UUID)
    private UUID id;

    // default constructor
    // getters
    // setters
}
Given the table name, it seems your partition key should be last_miles, but it was not provided in your question.
You provided an id, but it was not annotated; I assumed it was not part of the primary key. If you have a composite primary key with a partition key and clustering columns, you need to create ANOTHER internal bean for the PK and annotate it with @PrimaryKey (sample).
You can find a full-fledged working application with multiple entities here: https://github.com/datastaxdevs/workshop-betterreads/tree/master/better-reads-webapp
If you edit or complete your question, we could propose the exact beans needed.
Try setting the property:
spring.jpa.hibernate.naming.physical-strategy=org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
Since Hibernate 5 it's the default, and you would get your snake-case naming.
For more reference, see the documentation here.

How to use generic annotations like #Transient in an entity shared between Mongo and Elastic Search in Spring?

I am using Spring Boot and sharing the same entity between an Elasticsearch database and a MongoDB database. The entity is declared this way:
@Document
@org.springframework.data.elasticsearch.annotations.Document(indexName = "...", type = "...", createIndex = true)
public class ProcedureStep {
    ...
}
Where @Document is from the package org.springframework.data.mongodb.core.mapping.Document.
This works without any issue, but I am not able to use generic annotations to target Elasticsearch only. For example:
@Transient
private List<Point3d> c1s, c2s, c3s, c4s;
This excludes the field from both databases, Mongo and Elastic, whereas my intent was to apply it to Elasticsearch only.
I have no issue using Elastic-specific annotations like this:
@Field(type = FieldType.Keyword)
private String studyDescription;
My question is: what annotation can I use to exclude a field from Elasticsearch only and keep it in Mongo?
I don't want to rewrite the class, as I don't have a "flat" structure to store (the main class is composed of fields from other classes, which themselves have fields I want to exclude from Elastic).
Many thanks
Assumption: ObjectMapper is used for Serialization/Deserialization
My question is: what annotation can I use to exclude a field from
Elastic Search only and keep it in Mongo? I don't want to rewrite the
class as I don't have a "flat" structure to store (the main class is
composed with fields from other classes, which themselves have fields
I want to exclude from Elastic)
Please understand that this is a problem of selective serialization.
It can be achieved using JsonViews.
Example:
Step 1: Define two views, one ES-specific and one Mongo-specific:
class Views {
    public static class MONGO {}
    public static class ES {}
}
Step 2: Annotate the fields as below (descriptions as comments):
@Data
class Product {
    private int id;               // serialized for both the Mongo & ES contexts
    @JsonView(Views.ES.class)     // serialized for the ES context only
    private float price;
    @JsonView(Views.MONGO.class)  // serialized for the MONGO context only
    private String desc;
}
Step 3: Configure different ObjectMappers for Spring Data Elasticsearch and Mongo:
// Set the view for MONGO
ObjectMapper mongoMapper = new ObjectMapper();
mongoMapper.setConfig(mongoMapper.getSerializationConfig().withView(Views.MONGO.class));
// Set the view for ES
ObjectMapper esMapper = new ObjectMapper();
esMapper.setConfig(esMapper.getSerializationConfig().withView(Views.ES.class));
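To make the mechanism concrete without pulling in Jackson, here is a stdlib-only sketch of the same idea behind @JsonView: each field carries a set of views it belongs to, and serialization emits only the fields whose set contains the active view. The ViewDemo class and its field map are hypothetical illustrations, not Jackson's API:

```java
import java.util.*;

public class ViewDemo {
    enum View { MONGO, ES }

    // field name -> (value, views the field participates in);
    // emit only the fields tagged with the active view, as @JsonView does declaratively.
    static Map<String, Object> serialize(Map<String, Map.Entry<Object, Set<View>>> fields, View active) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Map.Entry<Object, Set<View>>> f : fields.entrySet()) {
            if (f.getValue().getValue().contains(active)) {
                out.put(f.getKey(), f.getValue().getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Map.Entry<Object, Set<View>>> product = new LinkedHashMap<>();
        // id belongs to both views, price to ES only, desc to MONGO only
        product.put("id", Map.entry((Object) 1, EnumSet.allOf(View.class)));
        product.put("price", Map.entry((Object) 9.99, EnumSet.of(View.ES)));
        product.put("desc", Map.entry((Object) "a product", EnumSet.of(View.MONGO)));

        System.out.println(serialize(product, View.ES));    // id and price only
        System.out.println(serialize(product, View.MONGO)); // id and desc only
    }
}
```

A field with no view tag (like `id` above, tagged with all views here for simplicity) is exactly the untagged-field case in Jackson: it is written in every context.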

Spring Hibernate - Does it support nested objects?

I recently asked this question: Spring Mongodb - Insert Nested document?
And found out that Spring Data MongoDB does not support such behavior - so now I need a working alternative.
Now, to avoid having you look at the code on another page, I am going to paste it here from the other question. Here are my two POJOs:
@Document
public class PersonWrapper {

    @Id
    private ObjectId _Id;

    @DBRef
    private Person leader;

    @DBRef
    List<Person> delegates;

    // Getters and setters removed for brevity.
}

public class Person {

    @Id
    private ObjectId _Id;

    private String name;

    // Getters and setters removed for brevity.
}
Now, what I want to be able to do here - is send up a JSON object in my POST request as follows :
{
"personWrapper":
{
"_Id":"<ID HERE (MIGHT WANT SQL TO GENERATE THIS DURING CREATE>",
"leader":{
"_Id":"<ID HERE (MIGHT WANT SQL TO GENERATE THIS DURING CREATE>",
"name":"Leader McLeaderFace"
},
delegates:[{...},{...},{...}]
}
}
At this point - I would like the SQL side of this to create the individual records needed - and then insert the PersonWrapper record, with all of the right foreign keys to the desired records, in the most efficient way possible.
To be honest, if one of you thinks I am wrong about the Spring-Data-MongoDB approach to this, I would still be interested in the answer - because it would save me the hassle of migrating my database setup. So I will still tag the spring-data-mongodb community here, too.
If I understand well, you want to cascade the save of your objects?
E.g. you save a PersonWrapper with some Person objects in the delegates property, and Spring Data saves the PersonWrapper in one collection and also saves the list of Person in another collection.
It is possible to do that with Spring Data JPA if you annotate your POJO with the JPA annotation @OneToMany and set up the cascade property of this annotation. See this post.
However, the cascade feature is not available for Spring Data MongoDB (see the documentation). First you have to save the list of Person, and then you save the PersonWrapper.
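The manual two-step save for MongoDB described above can be sketched like this (stdlib-only; the in-memory maps stand in for the Mongo collections, and savePerson/saveWrapper are hypothetical repository stand-ins):

```java
import java.util.*;

// Stdlib sketch of the manual cascade: persist each referenced Person first,
// so it has an id, then persist the PersonWrapper that refers to them.
public class CascadeDemo {
    static class Person {
        String id; // assigned on save, like an ObjectId would be
        final String name;
        Person(String name) { this.name = name; }
    }

    static class PersonWrapper {
        String id;
        Person leader;
        List<Person> delegates = new ArrayList<>();
    }

    static final Map<String, Person> personCollection = new HashMap<>();
    static final Map<String, PersonWrapper> wrapperCollection = new HashMap<>();
    static int sequence = 0;

    static Person savePerson(Person p) {
        if (p.id == null) p.id = "p" + (++sequence);
        personCollection.put(p.id, p);
        return p;
    }

    // Manual cascade: save the referenced documents first, then the wrapper.
    static PersonWrapper saveWrapper(PersonWrapper w) {
        savePerson(w.leader);
        w.delegates.forEach(CascadeDemo::savePerson);
        if (w.id == null) w.id = "w" + (++sequence);
        wrapperCollection.put(w.id, w);
        return w;
    }

    public static void main(String[] args) {
        PersonWrapper w = new PersonWrapper();
        w.leader = new Person("Leader McLeaderFace");
        w.delegates.add(new Person("Delegate One"));
        saveWrapper(w);
        System.out.println(personCollection.size() + " people, " + wrapperCollection.size() + " wrapper");
    }
}
```

The ordering matters because the @DBRef fields in the wrapper need ids that only exist once the Person documents have been saved.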

Spring Data JPA querying with transitive sorting

I got a problem with a simple Spring Data issue. Let's assume we have two entities:
public class Request {
    // all normal stuff

    @ManyToOne
    private Document doc;
}

public class Document {
    private Long id;
    private String name;
}
A simple relation. My question is: is it possible to retrieve Request entities using the Spring Data method DSL, sorted by Document? What I want to achieve is a repository method like:
public List<Request> findAllOrderByDoc()
or similar:
public List<Request> findAllOrderByDocId()
Unfortunately, when I try that, I get an error message saying that there is no Doc field or that it cannot be mapped to long. I assume it could be done using QueryDSL and predicates, but I am wondering if this pretty obvious and simple thing can be done with plain Spring Data.
Yes, sure.
You need to provide the direction (and the By keyword to separate the subject from the predicate):
public List<Request> findAllByOrderByDocIdAsc()
public List<Request> findAllByOrderByDocIdDesc()
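For intuition, such a derived query is equivalent to the following in-memory ordering by the nested property (stdlib-only sketch; the Request/Document classes are simplified from the question, and findAllOrderByDocId is a hypothetical helper, not a repository method):

```java
import java.util.*;

// Sorting requests by a property of the referenced Document,
// which is what "OrderByDocId" expresses in the method-name DSL.
public class SortDemo {
    static class Document {
        final long id;
        Document(long id) { this.id = id; }
    }

    static class Request {
        final Document doc;
        Request(Document doc) { this.doc = doc; }
    }

    static List<Request> findAllOrderByDocId(List<Request> requests) {
        List<Request> sorted = new ArrayList<>(requests);
        // Comparator walks the nested property, like the generated ORDER BY clause
        sorted.sort(Comparator.comparingLong(r -> r.doc.id));
        return sorted;
    }

    public static void main(String[] args) {
        List<Request> in = List.of(
                new Request(new Document(3)),
                new Request(new Document(1)),
                new Request(new Document(2)));
        StringBuilder ids = new StringBuilder();
        for (Request r : findAllOrderByDocId(in)) ids.append(r.doc.id);
        System.out.println(ids); // prints 123
    }
}
```

The database does the sorting in the real repository method, of course; this only shows what ordering by a nested property means.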

Performance issues using neo4j #QueryResult

I'm using neo4j with Spring Data.
When I use queries that return multiple fields, I generally try to return an interface (annotated with @QueryResult), so I won't need to convert the results afterwards.
For some reason I experience very bad performance as the number of results grows.
Does anyone have a solution?
I'm using neo4j 2.0.1 through REST, with Spring Data Neo4j 3.0.0.
The dataset is very small, fewer than 100 nodes, and the result set is at most ~10 records.
Spring-Data-Neo4j-Rest is quite slow because it was built for the embedded database.
Here are fixes that can help improve the performance of your application while still leveraging SDN's power to deserialize the objects for you.
Write your own Cypher queries, and use @QueryResult more to retrieve connected nodes. DON'T USE THE @Fetch ANNOTATION. SDN will make extra REST calls in an attempt to serialize the entire object graph when it could be done in a single query.
For example,
public class Person {

    @RelatedTo(type = "MARRIED", direction = BOTH)
    private Person partner;
}

@QueryResult
public class PersonResult {

    @ResultColumn("person")
    private Person person;

    @ResultColumn("partner")
    private Person partner;
}
Then, in your repository, fetch your results:
public interface PersonRepository extends GraphRepository<Person> {

    @Query("MATCH person-[m:MARRIED]-partner RETURN person, partner")
    List<PersonResult> findAllMarried();
}
AGAIN, REMOVE THOSE DEADLY @Fetch ANNOTATIONS.
Your objects are serialized in batch this way. This is very efficient, pending a better solution from the SDN team.
