How to merge documents from two different indexes using Spring elasticsearch - spring-boot

I am new to Elasticsearch and I am trying to use Spring Data Elasticsearch in my application. I have a requirement where there are two separate indexes, and I want to fetch documents from both indexes in one query based on some condition.
I will try to explain it with a sample that matches the scenario.
There are two different classes for the individual indexes:
#Document(indexName = "Book", type = "Book")
public class Book {
#Id
private String id;
#Field(type = FieldType.String)
private String bookName;
#Field(type = FieldType.Integer)
private int price;
#Field(type = FieldType.String)
private String authorName;
//Getters and Setters
}
There is one more class, Author:
#Document(indexName = "Author", type = "Author")
public Class Author{
#Id
private String id;
#Field(type = FieldType.String)
private String authorName;
//Getters and setters
}
So there are two indexes, one for Book and the other for Author.
I want to fetch all documents where the authorName in the Book index is equal to the authorName in the Author index.
Can I get the details from both indexes as a single, merged document?
It would be very helpful if anyone could suggest a solution for this use case.
Thanks a lot for your answer.

Your question is not clear. The document in the Author index is not the same document as the document in the Book index; they are two different documents. The closest thing I can think of is querying multiple indices with the same query:
just add multiple indices to the IndexCoordinates.of method to search for the same field/value in both indices.
For example:
elasticsearchTemplate.search(query, returnedObjectClass, IndexCoordinates.of("Author", "Book"))
It will return all the search hits from both indices that fulfill the query conditions, whatever they are.
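For illustration, a minimal sketch of such a multi-index search, assuming Spring Data Elasticsearch 4.x and an injected ElasticsearchOperations bean (the author value is made up; field names come from the entities above, and note that real Elasticsearch index names must be lowercase):
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.data.elasticsearch.core.query.Criteria;
import org.springframework.data.elasticsearch.core.query.CriteriaQuery;
import org.springframework.data.elasticsearch.core.query.Query;

public SearchHits<Book> findInBothIndices(ElasticsearchOperations operations, String authorName) {
    // One query, run against both indices; every hit is mapped onto Book, so hits coming
    // from the Author index only populate the fields it shares with Book (authorName).
    Query query = new CriteriaQuery(new Criteria("authorName").is(authorName));
    return operations.search(query, Book.class, IndexCoordinates.of("author", "book"));
}
The result is still one search hit per document; the two document types are returned side by side, not merged into a single combined document.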

Related

How to search Mongo for element in a list of string using spring query

So, here's my Mongo entity:
public class OfferEntity {
    @Id
    private String id;
    public List<String> categories;
    private Type type;
    private String title;
}
I use spring org.springframework.data.mongodb.core.query.Criteria and org.springframework.data.mongodb.core.query.Query.
The goal is to find Offers by categories.
Something like:
Query query = new Query();
ArrayList<String> strings = new ArrayList<>();
strings.add("new");
strings.add("old");
strings.add("blue");
strings.add("stolen");
query.addCriteria(Criteria.where("categories").contains(strings));
or
query.addCriteria(Criteria.where("categories").contains("new", "old", "blue"));
I know there's no such thing as a "contains" method here, but you get the idea.
I've found some things about "elemMatch", but it does not seem to match my need, as it's meant to find an element in an array with a matching key/value pair.
Any thoughts? I'm sure it's possible, but I'm a bit lost here.
You need to use Criteria.in (see the reference documentation):
query.addCriteria(Criteria.where("categories").in(Arrays.asList("new", "old", "blue")));

Elastic search with Spring Data for Reactive Repository - Deleting based on nested attributes

I am using Spring Data for Elasticsearch with a ReactiveCrudRepository for things like finding and deleting. I noticed that deletion works for attributes that are at the root and are simple values (deleteByAttributeName). However, if I use nested objects, it does not work.
Here are my entities:
Book
@Data
@TypeAlias("book")
@Document(indexName = "book")
public class EsBook {
    @Field(type = FieldType.Long)
    private Long id;
    @Field(type = FieldType.Nested)
    private EsStats stats;
    @Field(type = FieldType.Date, format = DateFormat.date)
    private LocalDate publishDate;
}
Stats
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class EsStats {
    @Field(type = FieldType.Double)
    private Double averageRating;
    @Field(type = FieldType.Integer)
    private Double totalRatings;
    @Field(type = FieldType.Keyword)
    private String category; // this can be null
}
Here is what I have tried, and what is working and not working.
I used ReactiveCrudRepository to delete documents from the index. For the regular fields at the Book level, such as id alone or id together with publishDate, deletion works perfectly. As soon as I use an embedded object like Stats, it stops working. The documents and the stats I am sending match, at least visually, but it never finds or deletes them.
I tried adding EqualsAndHashCode to Stats, assuming that maybe internally they were not considered equal for some reason. I also tried changing the double data type to int: looking at the Elasticsearch document, a whole-number average rating like 3 is saved as 3, but when sent from Java I see it as 3.0 in the debugger, so I suspected that might be the cause. It does not seem to be; even after changing the data type to int, deletion does not work.
public interface ReactiveBookRepository extends ReactiveCrudRepository<EsBook, String> {
    Mono<Void> deleteById(long id); // working
    Mono<Void> deleteByIdAndPublishDate(long id, LocalDate publishDate); // not working
    Mono<Void> deleteByIdAndStats(long id, EsStats stats); // not working
}
Any help will be appreciated
Have you verified that your Elasticsearch index mapping matches your Spring Data annotations?
Verify that the index mapping defines the stats field as a nested field type.
If not, then try changing your Spring annotation to:
@Field(type = FieldType.Object)
private EsStats stats;
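As a quick check, here is a hedged sketch that reads the live mapping through Spring Data Elasticsearch 4.x index operations (the injected ElasticsearchOperations bean is an assumption); look at whether "stats" is mapped as "nested" or as "object" and keep the annotation in line with what the index actually stores:
import java.util.Map;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;

public void printBookMapping(ElasticsearchOperations operations) {
    // Fetches the mapping actually stored in the "book" index so it can be compared
    // with the @Field annotations; a mismatch would explain why derived deletes never match.
    Map<String, Object> mapping = operations.indexOps(IndexCoordinates.of("book")).getMapping();
    System.out.println(mapping);
}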

Spring Mongodb aggregation doesn't work with DBRef

I have an aggregation which doesn't work with MongoDB and Spring Boot. I would be grateful if anyone could help me.
Here is my ExplainDoc class:
#Document(collection = "ExplainDoc")
public class ExplainDoc{
#Id
private String id;
#TextIndexed(weight=3)
private String product_in_brief;
private Product product;
#TextScore
private Float textScore; }
And here is my other class:
#Document(collection = "product")
public class Product{
#Id
private String id;
private String category;
}
What I want to do is to make a text search and find all ExplainDocs which have the given text in their product_in_brief PROVIDED THAT their product has a specific category.
In my search repository, I have an aggregation like the following:
public List<MyAggrResults> searchBriefExplanations(String text, String category) {
    MatchOperation matchRegion = Aggregation.match(Criteria.where("product.category").is(category));
    TextCriteria criteria = TextCriteria.forDefaultLanguage().matchingAny(text);
    MatchOperation match = Aggregation.match(criteria);
    GroupOperation group = Aggregation.group("product.category").push("$$ROOT").as("myresults").sum("textScore").as("score");
    ProjectionOperation project = Aggregation.project("product_in_brief", "product").andExpression("{$meta: \"textScore\"}").as("textScore");
    // ... the stages are combined with Aggregation.newAggregation(...) and executed with MongoTemplate
}
The code works now. However, it is expensive to always embed the product object as a nested document. How should I change the code if I want to reference the product object with @DBRef? When I add @DBRef, the code no longer works. I think the reason is that product.category is no longer recognized.
I hope somebody can help me.
There is nothing special about DBRef that magically makes it efficient. It is simply a label for the combination of a collection name and an id. You still need to use the aggregation pipeline to query data that uses DBRef, in the same way you would query if you didn't use DBRef.
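For illustration, a hedged sketch of one way to express the same filter once product is a @DBRef (assuming an injected MongoTemplate; the method name is made up). Since only the reference is stored on ExplainDoc, the category is resolved against the product collection first, and the matching products are then used in the criteria; Spring Data converts the entities back into DBRefs when building the query:
import java.util.List;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.TextCriteria;

public List<ExplainDoc> searchWithDbRef(MongoTemplate mongoTemplate, String text, String category) {
    // Step 1: find the referenced products with the wanted category.
    List<Product> products = mongoTemplate.find(
            Query.query(Criteria.where("category").is(category)), Product.class);

    // Step 2: text-search ExplainDoc and restrict it to those product references.
    Query query = Query.query(Criteria.where("product").in(products))
            .addCriteria(TextCriteria.forDefaultLanguage().matchingAny(text));
    return mongoTemplate.find(query, ExplainDoc.class);
}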

Spring data elasticSearch : Update entity using alias

I'm currently fighting with the spring-data-elasticsearch API. I need it to work on an alias with several indexes pointing to it.
Each index stores the same types, but they are just day-to-day storage (the first index holds Monday's results, the second Tuesday's results, and so on).
Some of the ElasticsearchRepository methods don't work because of the alias. I managed to do a search (the findOne() equivalent), but I am not able to update an entity.
I don't know how to achieve that; I looked at the documentation and samples, but I'm stuck.
My repository
public interface EsUserRepository extends ElasticsearchRepository<User, String>
{
    @Query("{\"bool\" : {\"must\" : {\"term\" : {\"id_str\" : \"?0\"}}}}")
    User findByIdStr(String idStr);
}
My Entity
#Document(indexName = "osintlab", type = "users")
public class User
{
// Elasticsearch internal id
#Id
private String id;
// Just a test to get the real object index (_index field), in order to save it
#Field(index = FieldIndex.analyzed, type = FieldType.String)
private String indexName;
// Real id, saved under the "id_str" field
#Field(type = FieldType.String)
private String id_str;
#Field(type = FieldType.String)
private List<String> tag_user;
}
What I tested
final IndexQuery indexQuery = new IndexQuery();
indexQuery.setId(user.getId());
indexQuery.setObject(user);
esTemplate.index(indexQuery);
userRepository.index(user);
userRepository.save(user);

Morphia. How to get a part of information from big datastore

I have a problem with Morphia.
Could someone help me?
I am writing a web project with Spring + MongoDB about movies and celebrities.
I have an entity class Genre:
#Entity(value="genres")
public class Genre implements IGenre {
#Id
#Indexed
private ObjectId id;
#Indexed
private String name;
private String description;
private long quantity;
private Set <IMovie> movies;
//getters and setters
}
And an entity class Movie:
#Entity(value="movies")
public class Movie implements IMovie {
#Id
#Indexed
private ObjectId id;
#Indexed
private String originalTitle;
private String year;
private Set <IGenre> genres;
// getters and setters
}
I have 30 genres, and one of them, for example, is Comedy.
I also have 250,000 comedies.
Now I want to paginate movies by genre = comedy.
How can I get only 20 records from all the comedies?
If I use the @Embedded or @Reference annotation, I will still get the entire list at once, and it is too big to use in controllers.
You should change your data schema to support such a query. The schema you use has a circular dependency: your Genre entity holds Movie entities, and Movie holds Genres. Holding all of the movies per genre is also not easy to query. If I were you, I would use a schema like this:
@Entity(noClassnameStored = true) // the class name is not stored, so renaming or moving the class later causes no problems
public class Movie implements IMovie {
    @Id
    @Indexed
    private ObjectId id;
    @Indexed
    private String originalTitle;
    private String year;
    private Set<String> genres; // unique identifiers of the genres instead of embedding the whole genre entity
    // getters and setters
}
With such a schema, you can retrieve the movies that have a particular genre by writing a simple $in query on the genres field. An example query for your case:
datastore.find(Movie.class).field("genres").in(Lists.newArrayList("comedy")).limit(20).asList();
On the MongoDB page below you can find suggestions on how to design your schema for different scenarios.
http://docs.mongodb.org/manual/core/data-modeling/#data-modeling-patterns-and-examples
I'm not too familiar with Mongo, but it looks like you would need to implement a custom query here. Whatever you do, you need to pass a start offset for your page as well as a page size (20 in your case).
You can do pagination in Morphia by combining .offset(page_start) and .limit(page_size) on a query, as in the sketch below. So first you would create a query to get movies that belong to a certain genre, and then apply the pagination.
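A minimal sketch of that, against the Morphia 1.x fluent API (offset/limit as described above; newer Morphia versions move paging into FindOptions). It assumes the genres field holds plain strings, as in the earlier answer:
import java.util.Collections;
import java.util.List;
import org.mongodb.morphia.Datastore;

public List<Movie> findComedies(Datastore datastore, int page, int pageSize) {
    return datastore.find(Movie.class)
            .field("genres").in(Collections.singletonList("comedy"))
            .offset(page * pageSize) // e.g. 0 for the first page
            .limit(pageSize)         // e.g. 20
            .asList();
}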
It looks like this is handled using repositories in Spring: http://static.springsource.org/spring-data/data-mongodb/docs/1.0.0.RELEASE/reference/html/#repositories.special-parameters
You'd use a Pageable implementation to pass the paging data, without worrying about doing the offset and limit calls yourself. There's an example of "Web pagination" further in the doc.
Hope this helps!
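A hedged sketch of that repository approach (the repository and method names are illustrative; it assumes the genres field holds plain genre identifiers, as suggested in the earlier answer, and uses the PageRequest constructor from the Spring Data 1.x line linked above):
import org.bson.types.ObjectId;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.mongodb.repository.MongoRepository;

public interface MovieRepository extends MongoRepository<Movie, ObjectId> {
    // Spring Data derives the query from the method name and applies the Pageable for paging
    Page<Movie> findByGenres(String genre, Pageable pageable);
}

// Usage: page index 0, page size 20 -- no manual offset/limit calls needed.
Page<Movie> firstPage = movieRepository.findByGenres("comedy", new PageRequest(0, 20));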
