Recreating Elastic Index with different Field Type - spring-boot

I'm new to ES and am currently attempting to use spring-data-elasticsearch 3.2.1.RELEASE in my service.
The design is still in an early phase, so I've had to change fields in the Elasticsearch document class, which we annotate with @Document.
It looks somewhat like:
@Document(...)
public class MyDocument {
    @Id
    private String id;
    ...
    @Field(type = FieldType.Text, name = "myField")
    private String myField;
}
I had to change this field to an object, so I simply changed the Java type and the FieldType attribute to Object.
@Document(...)
public class MyDocument {
    @Id
    private String id;
    ...
    @Field(type = FieldType.Object, name = "myField")
    private Object myField;
}
I deleted all documents from my index on the cluster and attempted to save this document with the new field type, but it still fails because the previous type was Text.
org.springframework.data.elasticsearch.ElasticsearchException: Bulk indexing has failures.
Use ElasticsearchException.getFailedDocuments() for detailed messages
[
{
XYZ=ElasticsearchException[
Elasticsearch exception [
type=mapper_parsing_exception,
reason=failed to parse field [myField] of type [text] in document with id 'XYZ']
];
nested: ElasticsearchException[
Elasticsearch exception [
type=illegal_state_exception,
reason=Can't get text on a START_OBJECT at 1:296 ]
];
}
]
I'm fairly sure that changing field types like this is not best practice, but I tried the same change with a different indexName and that worked.
In another attempt, I deleted this particular index manually and let Spring Data Elasticsearch recreate it during bulk indexing. That did not help; I see the same error.
Could it be because I have more instances (non-local) which are connected to elastic though not doing any operations on this index at this moment?

Related

Not able to search data in redis cache using spring crud repository by passing list of values for a property of the model saved in cache

We have a model class saved in Redis, as shown below:
@Data
@NoArgsConstructor
@AllArgsConstructor
@RedisHash("book")
public class Book implements Serializable {
    private static final long serialVersionUID = 2208852329346517265L;
    @Id
    private Integer bookID;
    @Indexed
    private String title;
    @Indexed
    private String authors;
    private String averageRating;
    private String isbn;
    private String languageCode;
    private String ratingsCount;
    private BigDecimal price;
}
title and authors are our indexed properties.
We want to fetch all matching records from Redis by passing a title and a list of authors to a Spring CrudRepository, as shown below.
public interface BookSpringRepository extends CrudRepository<Book, String> {
    List<Book> findAllByTitleAndAuthors(String title, List<String> authors);
}
Service layer:
@Override
public Optional<List<Book>> searchBooksByTitleAndAuthorNames(String title, List<String> authorNames) {
    return Optional.ofNullable(bookSpringRepository.findAllByTitleAndAuthors(title, authorNames));
}
Here we get the exception below:
Unable to fetch data from Spring Data Redis cache using List of Integer or String.
Getting error while fetching - "Resolved [org.springframework.core.convert.ConversionFailedException: Failed to convert from type [java.lang.String] to type [byte] for value 'Ronak'; nested exception is java.lang.NumberFormatException: For input string: "Ronak"]."
We do not want to convert the list of strings/integers to bytes: when we tried it, the conversion took a long time, and the retrieved results would have to be converted back to normal integer or string values.
The other option is to loop through the list and pass one value at a time to the Redis CrudRepository. The repository accepts this, but it means one Redis call per value, each paying network latency.
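The loop-and-merge workaround described above could be sketched as follows. This is only an illustration under assumptions: AuthorLoopSketch and findByTitleAndAnyAuthor are hypothetical names, the finder argument stands in for a repository method that takes a single author, and duplicates are dropped using equals/hashCode (which Lombok's @Data generates for Book).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical helper, not part of Spring Data Redis.
class AuthorLoopSketch {

    // Calls the single-author finder once per author and merges the results,
    // dropping duplicates while keeping first-seen order.
    // Note: this still issues one lookup per author.
    static <T> List<T> findByTitleAndAnyAuthor(
            String title,
            List<String> authors,
            BiFunction<String, String, List<T>> singleAuthorFinder) {
        Set<T> merged = new LinkedHashSet<>();
        for (String author : authors) {
            merged.addAll(singleAuthorFinder.apply(title, author));
        }
        return new ArrayList<>(merged);
    }
}
```

The finder would be whatever single-value repository method already works, passed as a method reference or lambda. The sketch does not remove the per-author round trip; it only keeps the merging logic in one place.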
We cannot put the ID annotation on the authors property, because its values can be duplicated across records.
Does the Spring CrudRepository support a LIKE query? If so, we could build a unique id that contains the authors' names, put the ID annotation on that derived property, and search the records with a LIKE or contains style of query.
Any suggestions here are highly appreciated!!
Try adding serialization to your Redis key and value. This might help:
https://medium.com/@betul5634/redis-serialization-with-spring-redis-data-lettuce-codec-1a1d2bc73d26

Elastic search with Spring Data for Reactive Repository - Deleting based on nested attributes

I am using Spring data for Elastic Search and am using the ReactiveCrudRepository for stuff like finding and deleting. I noticed that with attributes that are in root and are simple objects, the deletion works (deleteByAttributeName). However if I have nested objects then it does not work.
Here are my entities:
Book
@Data
@TypeAlias("book")
@Document(indexName = "book")
public class EsBook {
    @Field(type = FieldType.Long)
    private Long id;
    @Field(type = FieldType.Nested)
    private EsStats stats;
    @Field(type = FieldType.Date, format = DateFormat.date)
    private LocalDate publishDate;
}
Stats
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class EsStats {
    @Field(type = FieldType.Double)
    private Double averageRating;
    @Field(type = FieldType.Integer)
    private Double totalRatings;
    @Field(type = FieldType.Keyword)
    private String category; // this can be null
}
Here is what I have tried, what works, and what does not:
I used ReactiveCrudRepository to delete documents in the index. For regular fields at the book level, such as id alone or id together with publishDate, deletion works perfectly. As soon as I use an embedded object like stats, it stops working. The stored documents and the stats I send match, at least visually, but the repository never finds or deletes them.
I tried adding @EqualsAndHashCode to the stats class, assuming that the objects are somehow not considered equal internally. I also tried changing the double data type to int: in the Elasticsearch document, a whole-number average rating such as 3 is stored as 3, but in the debugger the value sent from Java shows as 3.0, so I suspected a mismatch. That does not seem to be the cause, because deletion still fails after changing the data type to int.
public interface ReactiveBookRepository extends ReactiveCrudRepository<EsBook, String> {
    Mono<Void> deleteById(long id); // working
    Mono<Void> deleteByIdAndPublishDate(long id, LocalDate publishDate); // working
    Mono<Void> deleteByIdAndStats(long id, EsStats stats); // not working
}
Any help will be appreciated
Have you verified that your Elasticsearch index mapping matches your Spring Data annotations?
Verify that the index mapping defines the stats field as a nested field type.
If not, then try changing your Spring annotation to:
@Field(type = FieldType.Object)
private EsStats stats;

How to use generic annotations like @Transient in an entity shared between Mongo and Elastic Search in Spring?

I am using Spring Boot and sharing the same entity between an Elastic Search database and a MongoDB database. The entity is declared this way:
@Document
@org.springframework.data.elasticsearch.annotations.Document(indexName = "...", type = "...", createIndex = true)
public class ProcedureStep {
    ...
}
Where @Document is from this package: org.springframework.data.mongodb.core.mapping.Document
This works without any issue, but I am not able to use generic annotations to target Elastic Search only. For example:
@Transient
private List<Point3d> c1s, c2s, c3s, c4s;
This will exclude these fields from both databases, Mongo and Elastic, whereas my intent was to apply it to Elastic Search only.
I have no issue using Elastic-specific annotations like this:
@Field(type = FieldType.Keyword)
private String studyDescription;
My question is:
what annotation can I use to exclude a field from Elastic Search only and keep it in Mongo?
I don't want to rewrite the class as I don't have a "flat" structure to store (the main class is composed with fields from other classes, which themselves have fields I want to exclude from Elastic)
Many thanks
Assumption: ObjectMapper is used for Serialization/Deserialization
My question is: what annotation can I use to exclude a field from
Elastic Search only and keep it in Mongo? I don't want to rewrite the
class as I don't have a "flat" structure to store (the main class is
composed with fields from other classes, which themselves have fields
I want to exclude from Elastic)
Please understand this is a problem of selective serialization.
It can be achieved using JsonViews.
Example:
Step 1: Define two views, one ES-specific and one Mongo-specific.
class Views {
    public static class MONGO {}
    public static class ES {}
}
Step 2: Annotate the fields as below. Descriptions are given as comments:
@Data
class Product {
    private int id; // <======= Serialized for both DB & ES Context
    @JsonView(Views.ES.class) // <======= Serialized for ES Context only
    private float price;
    @JsonView(Views.MONGO.class) // <======= Serialized for MONGO Context only
    private String desc;
}
Step 3: Configure different ObjectMappers for Spring Data ES and Mongo.
// Set view for Mongo
ObjectMapper mongoMapper = new ObjectMapper();
mongoMapper.setConfig(mongoMapper.getSerializationConfig().withView(Views.MONGO.class));
// Set view for ES
ObjectMapper esMapper = new ObjectMapper();
esMapper.setConfig(esMapper.getSerializationConfig().withView(Views.ES.class));

Nested document & Parent/Child setup using Spring Boot + Spring Data Elasticsearch

According to the official Elasticsearch documentation, I understand that with Nested, any add/delete/update of a child requires reindexing the parent together with all its children, so it is expensive when many modifications are needed.
Example using Nested:
@Document(indexName = "test-index-person-multiple-level-nested", type = "user", shards = 1, replicas = 0, refreshInterval = "-1")
public class PersonMultipleLevelNested {
    @Id
    private String id;
    private String name;
    @Field(type = FieldType.Nested)
    private List<GirlFriend> girlFriends;
    // Getter, setter & constructor
}
Parent/Child is better suited to this kind of situation, but how can I set it up using Spring Data Elasticsearch? Is it not supported yet? I can't seem to find related documentation.
Not sure about documentation, but there is a unit test for this feature: https://github.com/spring-projects/spring-data-elasticsearch/blob/master/src/test/java/org/springframework/data/elasticsearch/entities/ParentEntity.java (see @Parent in particular).

Id field handling in Spring Data Mongo for child objects

I have been working in Spring Boot with the Spring Data MongoDB project and I am seeing behavior I am not clear on. I understand that the id field will go to _id in the Mongo repository per http://docs.spring.io/spring-data/mongodb/docs/current/reference/html/#mapping.conventions.id-field. My problem is that it also seems to be happening for child entities which does not seem correct.
For example I have these classes (leaving out setters and getters for brevity) :
public class MessageBuild {
    @Id
    private String id;
    private String name;
    private TopLevelMessage.MessageType messageType;
    private TopLevelMessage message;
}
public interface TopLevelMessage {
    public enum MessageType {
        MapData
    }
}
public class MapData implements TopLevelMessage {
    private String layerType;
    private Vector<Intersection> intersections;
    private Vector<RoadSegment> roadSegments;
}
public class RoadSegment {
    private int id;
    private String name;
    private Double laneWidth;
}
When I create an object graph from these classes and save it with the appropriate MongoRepository, I end up with a document like this (with _class left out of the nested objects):
{
    "_id" : ObjectId("57c0c05568a6c4941830a626"),
    "_class" : "com.etranssystems.coreobjects.persistable.MessageBuild",
    "name" : "TestMessage",
    "messageType" : "MapData",
    "message" : {
        "layerType" : "IntersectionData",
        "roadSegments" : [
            {
                "_id" : 2001,
                "name" : "Road Segment 1",
                "laneWidth" : 3.3
            }
        ]
    }
}
In this case a child object with a field named id has its mapping converted to _id in the MongoDB repository. Not the end of the world although not expected. The biggest problem is now that this is exposed by REST MVC the _id fields are not returned from a query. I have tried to set the exposeIdsFor in my RepositoryRestConfigurerAdapter for this class and it exposes the id for the top level document but not the child ones.
To sum up, the two questions I have are:
Why are child object fields mapped to _id? My understanding is that this should only happen on the top level since things underneath are not really documents in their own right.
Shouldn't the configuration to expose id fields work for child objects in a document if it is mapping the field names?
Am I wrong to think that RoadSegment does not contain a getId()? From Spring's documentation:
A property or field without an annotation but named id will be mapped
to the _id field.
I believe Spring Data does this even for nested classes when it finds an id field. You may either add a getId(), so that the field is named id, or annotate it with @Field:
public class RoadSegment {
    @Field("id")
    private int id;
    private String name;
    private Double laneWidth;
}
I agree that, in my opinion, this automatic conversion of id/_id should only be done at the top level.
However, the way the Spring Data Mongo conversion is coded, all Java objects go through exactly the same code to be converted into JSON (both top-level and nested objects):
public class MappingMongoConverter {
    ...
    protected void writeInternal(Object obj, final DBObject dbo, MongoPersistentEntity<?> entity) {
        ...
        if (!dbo.containsField("_id") && null != idProperty) {
            try {
                Object id = accessor.getProperty(idProperty);
                dbo.put("_id", idMapper.convertId(id));
            } catch (ConversionException ignored) {}
        }
        ...
        if (!conversions.isSimpleType(propertyObj.getClass())) {
            // The following line recursively calls writeInternal with the nested object
            writePropertyInternal(propertyObj, dbo, prop);
        } else {
            writeSimpleInternal(propertyObj, dbo, prop);
        }
    }
}
writeInternal is called on the top-level object and then called again recursively for each sub-object (that is, each property that is not a simple type). Both therefore go through the same logic that adds _id.
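To make the effect of that recursion concrete, here is a minimal sketch that models objects as plain maps. It is not Spring Data's code: IdMappingSketch, write and writeValue are hypothetical names, and the real converter works on persistent entities rather than maps. The only point it illustrates is that when nested values pass through the same write method, the id to _id rename applies at every depth.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a recursive document writer (hypothetical, not Spring's code).
class IdMappingSketch {

    // Converts a property map into a "document" map, renaming "id" to "_id".
    static Map<String, Object> write(Map<String, Object> source) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            String key = e.getKey().equals("id") ? "_id" : e.getKey();
            doc.put(key, writeValue(e.getValue()));
        }
        return doc;
    }

    // Nested maps and list elements go back through write(), so the same
    // rename rule is applied to them; simple values are copied unchanged.
    @SuppressWarnings("unchecked")
    static Object writeValue(Object value) {
        if (value instanceof Map) {
            return write((Map<String, Object>) value);
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(writeValue(item));
            }
            return out;
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> segment = new LinkedHashMap<>();
        segment.put("id", 2001);
        segment.put("name", "Road Segment 1");

        Map<String, Object> message = new LinkedHashMap<>();
        message.put("roadSegments", Collections.singletonList(segment));

        Map<String, Object> build = new LinkedHashMap<>();
        build.put("id", "57c0c05568a6c4941830a626");
        build.put("message", message);

        // Both the top-level map and the nested segment end up with "_id".
        System.out.println(write(build));
    }
}
```

In this model, limiting the rename to the top level would require passing a depth or "is root" flag into write, which is the kind of distinction the shared code path does not make.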
Perhaps this is how we should read Spring's documentation:
Mongo's restrictions on Mongo Documents:
MongoDB requires that you have an _id field for all documents. If you
don’t provide one the driver will assign a ObjectId with a generated
value.
Spring Data's restrictions on java classes:
If no field or property specified above is present in the Java class
then an implicit _id file will be generated by the driver but not
mapped to a property or field of the Java class.

Resources