Spring Data MongoDB: How to describe aggregation $merge with Spring Aggregation?

Code that I want to execute by MongoTemplate:
{
  $merge: {
    into: 'someCollection',
    on: "_id",
    whenMatched: 'merge',
    whenNotMatched: 'discard'
  }
}
I did not find any suitable methods that allow me to describe the $merge stage, and I have doubts whether Spring Data MongoDB even supports this stage.

Yes, Spring Data MongoDB has support for the $merge stage.
Your code can be executed with MongoTemplate in the following way.
MergeOperation mergeOperation = Aggregation.merge()
        .intoCollection("someCollection")
        .on("_id")
        .whenMatched(MergeOperation.WhenDocumentsMatch.mergeDocuments())
        .whenNotMatched(MergeOperation.WhenDocumentsDontMatch.discardDocument())
        .build();
Use this mergeOperation with mongoTemplate.
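For example, the operation can be added as the last stage of a pipeline and run with mongoTemplate.aggregate (a minimal sketch; the source collection name and the preceding match stage are assumptions, not from the question):

// any earlier stages first, $merge as the last stage of the pipeline
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.match(Criteria.where("active").is(true)),  // hypothetical earlier stage
        mergeOperation);

// $merge writes the results into 'someCollection'; the returned cursor itself is empty
mongoTemplate.aggregate(aggregation, "sourceCollection", Document.class);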

Related

Multiple swagger documents with Spring Boot - for different http operations

I figured that there is a way to group endpoints into different Swagger documents, but I wanted to know if there is a way to separate endpoints based on GET/POST/PATCH operations.
For example, I have 2 endpoints:
GET: /app/employee
POST: /app/employee
How can I segregate them into 2 different Swagger documents?
Edit 1: I am referring to the article below to segregate Swagger endpoints in Spring Boot:
https://dev.to/s2agrahari/grouping-apis-in-swagger-55kk
Maybe your question can be answered by using tags?
Here is the official example.
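For example, tags can be attached per handler method, so GET and POST operations end up under different tags (a sketch; the Employee type, the service, and the paths are just illustrations, not from the question):

@RestController
@RequestMapping("/app")
public class EmployeeController {

    private final EmployeeService employeeService;   // hypothetical service

    EmployeeController(EmployeeService employeeService) {
        this.employeeService = employeeService;
    }

    @Tag(name = "employee-read")    // GET operations grouped under this tag
    @GetMapping("/employee")
    public List<Employee> getEmployees() {
        return employeeService.findAll();
    }

    @Tag(name = "employee-write")   // POST operations grouped under this tag
    @PostMapping("/employee")
    public Employee createEmployee(@RequestBody Employee employee) {
        return employeeService.save(employee);
    }
}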
Regards.
In your application.yaml, you can configure like this:
springdoc:
  swagger-ui:
    operationsSorter: method
    enabled: true
    tags-sorter: alpha
Reference: https://swagger.io/docs/open-source-tools/swagger-ui/usage/configuration/
Also, there is a question with more details for setting it programmatically: https://swagger.io/docs/open-source-tools/swagger-ui/usage/configuration/
You gave part of the answer yourself; you can just do:
@Bean
GroupedOpenApi getApis() {
    return GroupedOpenApi.builder().group("My Get Apis").addOpenApiCustomiser(customizer()).build();
}
And then some logic (maybe you have to double-check it, but something like so):
public OpenApiCustomiser customizer() {
    return (OpenAPI openApi) -> {
        // collect every path that has no GET operation, then drop it from this group
        final List<String> keysToDelete = new ArrayList<>();
        for (Map.Entry<String, PathItem> entry : openApi.getPaths().entrySet()) {
            if (entry.getValue().getGet() == null) {
                keysToDelete.add(entry.getKey());
            }
        }
        keysToDelete.forEach(key -> openApi.getPaths().remove(key));
    };
}
And the same for POST or any other REST operation.
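A second group for POST could look like this, along the same lines (a sketch; the group name is arbitrary):

@Bean
GroupedOpenApi postApis() {
    // keep only the paths that declare a POST operation
    return GroupedOpenApi.builder()
            .group("My Post Apis")
            .addOpenApiCustomiser(openApi -> {
                final List<String> keysToDelete = new ArrayList<>();
                for (Map.Entry<String, PathItem> entry : openApi.getPaths().entrySet()) {
                    if (entry.getValue().getPost() == null) {
                        keysToDelete.add(entry.getKey());
                    }
                }
                keysToDelete.forEach(key -> openApi.getPaths().remove(key));
            })
            .build();
}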

Mock elasticsearch nested buckets

I'm trying to mock an Elasticsearch nested aggregation object with the following structure. I did something as described at the attached link, but couldn't do it for a nested object:
Mock Elastic Search response in .NET
Here is the real Elasticsearch object I've been trying to mock.
var obj = new AggregationDictionary
{
    {
        "key1", new TermsAggregation("key1")
        {
            Field = "1234",
            Aggregations = new AggregationDictionary
            {
                {
                    "top", new TopHitsAggregation("top")
                    {
                        Size = 10
                    }
                }
            }
        }
    }
};
After struggling with my attempts to mock the Elasticsearch object, I came up with the following solutions:
Using an in-memory connection
The Elasticsearch (NEST client) documentation will teach you how to "inject" your desired response in advance. Now you can map the response to your type-safe object without creating a connection to Elasticsearch. I recommend using this technique for integration tests rather than unit tests.
For unit tests, the best option is to abstract your services layer with an interface and totally ignore the Elasticsearch engine.
Just override the service method with your mock object. I use Moq and respond with whatever you want, for example:
var mockObj = new Mock<IClientRepository>();
mockObj.Setup(x => x.GetData()).Returns("put your response here");

Project New Array Field with Spring Data

As part of an aggregate operation, I need to unwind an array. I am wondering how I can put the object back into an array as part of the $project stage. Here is the MongoDB aggregate operation that works:
db.users.aggregate([ { "$match" : {...} }, { "$unwind" : "$profiles" }, { "$project" : { "profiles": ["$profiles"] } }, ... ])
And more specifically, how can I implement this using the Spring Data MongoDB ProjectionOperation:
{$project: {'profiles': ['$profiles']}}
This feature has been available since MongoDB 3.2.
Edit 1:
I looked through some of the posts and an answer by Christoph Strobl, and based on that answer I came up with something that works, which is as follows:
AggregationOperation project = aggregationOperationContext -> {
    Document projection = new Document();
    projection.put("profiles", Arrays.<Object>asList("$profiles"));
    projection.put("_id", "$id");
    return new Document("$project", projection);
};
I am wondering if there is a better way of doing it though.
Any help/suggestion is very much appreciated. Thanks.
Unfortunately there is not.
You can replace $project by project() with an AggregationExpression to shorten it a bit.
// ...
unwind("profiles"),
project().and(ctx -> new Document("profiles", asList("$profiles"))).as("profiles")
I created DATAMONGO-2312 to provide support for new array field projections in one of the next versions.
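Put together with the rest of the pipeline, it could be used like this (a minimal sketch using the same static imports; the users collection name, the match criteria, and the Document output type are assumptions):

Aggregation aggregation = newAggregation(
        match(Criteria.where("active").is(true)),   // stand-in for your $match criteria
        unwind("profiles"),
        project().and(ctx -> new Document("profiles", asList("$profiles"))).as("profiles"));

// run it and read the projected documents back
AggregationResults<Document> results = mongoTemplate.aggregate(aggregation, "users", Document.class);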

Is there any way to implement pagination in spring webflux and spring data reactive

I'm trying to understand the reactive part of Spring 5. I have created a simple REST endpoint for finding all entities using Spring WebFlux and Spring Data Reactive (Mongo), but I don't see any way to implement pagination.
Here is my simple example in Kotlin:
@GetMapping("/posts/")
fun getAllPosts() = postRepository.findAll()
Does it mean that a reactive endpoint does not require pagination? Is there some way to implement pagination from the server side using this stack?
The reactive support in Spring Data does not provide a Page return type. Still, the Pageable parameter is supported in method signatures, passing the limit and offset on to the driver and therefore the store itself, and returning a Flux<T> that emits the requested range.
Flux<Person> findByFirstname(String firstname, Pageable pageable);
For more information please have a look at the current Reference Documentation for 2.0.RC2 and the Spring Data Examples.
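Such a derived query method can then be driven from a controller, for example (a sketch; the Person entity, the repository, and the request parameters are placeholders, not from the question):

@GetMapping("/persons")
public Flux<Person> persons(@RequestParam String firstname,
                            @RequestParam(defaultValue = "0") int page,
                            @RequestParam(defaultValue = "10") int size) {
    // the Pageable carries limit and offset down to the reactive driver
    return personRepository.findByFirstname(firstname, PageRequest.of(page, size));
}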
Flux provides skip and take methods to get pagination support, and you can also use filter and sort to filter and sort the result. The filter and sort below are not a good example, but using skip or passing a Pageable as the second parameter makes no difference.
The following codes work for me.
@GetMapping("")
public Flux<Post> all(
        //@RequestParam(value = "q", required = false) String q,
        @RequestParam(value = "page", defaultValue = "0") long page,
        @RequestParam(value = "size", defaultValue = "10") long size) {
    return this.postRepository.findAll()
            //.filter(p -> Optional.ofNullable(q).map(key -> p.getTitle().contains(key) || p.getContent().contains(key)).orElse(true)) // (replace this with query parameters)
            .sort(comparing(Post::getCreatedDate).reversed())
            .skip(page * size).take(size);
}
Update: The underlying drivers should be responsible for handling the result in the Reactive Streams way.
And as you can see in the answer from Christoph, if you use a findByXXX method, Spring Data MongoDB Reactive provides a variant that accepts a Pageable argument, but the reactive findAll does not include such a variant; you have to do the skip in later operators if you really need the pagination feature. When switching to Flux instead of List, imagine the data in the Flux as living water in a river, oil in a pipe, or tweets on twitter.com.
I have tried to compare the queries with and without a Pageable in the following case.
this.postRepository.findByTitleContains("title")
        .skip(0)
        .limitRequest(10)
        .sort((o1, o2) -> o1.getTitle().compareTo(o2.getTitle()));

this.postRepository.findByTitleContains("title", PageRequest.of(0, 10, Sort.by(Sort.Direction.ASC, "title")));
When enabling logging with logging.level.org.springframework.data.mongodb.core.ReactiveMongoTemplate=DEBUG, I found that they print the same query log.
find using query: { "title" : { "$regularExpression" : { "pattern" : ".*title.*", "options" : ""}}} fields: Document{{title=1}} for class: class com.example.demo.Post in collection: post
//other logging...
find using query: { "title" : { "$regularExpression" : { "pattern" : ".*title.*", "options" : ""}}} fields: Document{{title=1}} for class: class com.example.demo.Post in collection: post
Keep in mind that all these operations should be delegated to the underlying reactive driver, which implements the Reactive Streams spec, and performed on the DB side, not in your application's memory.
Check the example codes.
The early sample code I provided above is maybe not a good example of filter and sort operations (MongoDB itself provides great regular-expression operators for this). But pagination in the reactive variant is not a good match with the concepts in the Reactive Streams spec. When embracing the Spring reactive stack, most of the time we just move our work to a new collection of APIs. In my opinion, realtime updates and elastic response scenarios are a better match for Reactive, e.g. using it with SSE, WebSocket, RSocket, application/stream+json (missing in the new Spring docs) protocols, etc.
This is not efficient, but it works for me while I look for another solution.
Service
public Page<Level> getPage(int page, int size, Sort.Direction direction, String properties) {
    var pageRequest = PageRequest.of(page, size, direction, properties);
    var count = levelRepository.count().block();
    var levels = levelRepository.findAllLevelsPaged(pageRequest).collectList().block();
    return new PageImpl<>(Objects.requireNonNull(levels), pageRequest, Objects.requireNonNull(count));
}
Repo
@Repository
public interface LevelRepository extends ReactiveMongoRepository<Level, String> {

    @Query("{ id: { $exists: true }}")
    Flux<Level> findAllLevelsPaged(final Pageable page);
}
Ref example

Spring Batch multiple readers for different DB's

I have an existing Spring Batch project which reads data from MySQL or ArangoDB (a NoSQL database) based on a feature-toggle decision during startup, does some processing, and writes back to MySQL/ArangoDB.
Now the reader configuration for MySQL is something like the one below:
@Bean
@Primary
@StepScope
public HibernatePagingItemReader reader(
        @Value("#{jobParameters[oldMetadataDefinitionId]}") Long oldMetadataDefinitionId) {
    Map<String, Object> queryParameters = new HashMap<>();
    queryParameters.put(Constants.OLD_METADATA_DEFINITION_ID, oldMetadataDefinitionId);
    HibernatePagingItemReader<Long> reader = new HibernatePagingItemReader<>();
    reader.setUseStatelessSession(false);
    reader.setPageSize(250);
    reader.setParameterValues(queryParameters);
    reader.setSessionFactory(((HibernateEntityManagerFactory) entityManagerFactory.getObject()).getSessionFactory());
    return reader;
}
and I have another Arango reader like below:
@Bean
@StepScope
public ListItemReader arangoReader(
        @Value("#{jobParameters[oldMetadataDefinitionId]}") Long oldMetadataDefinitionId) {
    List<InstanceDTO> instanceList = new ArrayList<InstanceDTO>();
    PersistenceService arangoPersistence = arangoConfiguration.getPersistenceService();
    List<Long> instanceIds = arangoPersistence.getDefinitionInstanceIds(oldMetadataDefinitionId);
    instanceIds.forEach((instanceId) -> {
        InstanceDTO instanceDto = new InstanceDTO();
        instanceDto.setDefinitionID(oldMetadataDefinitionId);
        instanceDto.setInstanceID(instanceId);
        instanceList.add(instanceDto);
    });
    return new ListItemReader(instanceList);
}
and my step configuration is below,
@Bean
@SuppressWarnings("unchecked")
public Step InstanceMergeStep(ListItemReader arangoReader, ItemWriter<MetadataInstanceDTO> arangoWriter,
        ItemReader<Long> mysqlReader, ItemWriter<Long> mysqlWriter) {
    Step step = null;
    if (arangoUsage) {
        step = steps.get("arangoInstanceMergeStep")
                .<Long, Long>chunk(1)
                .reader(arangoReader)
                .writer(arangoWriter)
                .faultTolerant()
                .skip(Exception.class)
                .skipLimit(10)
                .taskExecutor(stepTaskExecutor())
                .build();
        ((TaskletStep) step).registerChunkListener(chunkListener);
    } else {
        step = steps.get("mysqlInstanceMergeStep")
                .<Long, Long>chunk(1)
                .reader(mysqlReader)
                .writer(mysqlWriter)
                .faultTolerant()
                .skip(Exception.class)
                .skipLimit(failedSkipLimit)
                .taskExecutor(stepTaskExecutor())
                .build();
        ((TaskletStep) step).registerChunkListener(chunkListener);
    }
    return step;
}
The MySQL reader has pagination support through HibernatePagingItemReader so that it will handle millions of items without any memory issue.
I want to implement the same pagination support for the Arango reader to fetch only 250 documents per iteration. How can I modify the Arango reader code to achieve this?
First of all, the documentation of ListItemReader says that it is useful for testing, so don't use it in production. Also, return ItemReader from all your reader beans instead of the actual concrete types.
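For example, the Arango reader bean could expose only the interface type (a sketch; buildInstanceList stands in for the list-building code already shown in the question):

@Bean
@StepScope
public ItemReader<InstanceDTO> arangoReader(
        @Value("#{jobParameters[oldMetadataDefinitionId]}") Long oldMetadataDefinitionId) {
    // same list building as before, but callers only see the ItemReader contract
    return new ListItemReader<>(buildInstanceList(oldMetadataDefinitionId));
}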
Having said that, neither the Spring Batch API nor Spring Data seems to support ArangoDB. The closest that I could find is this (I have not worked with ArangoDB before).
So in my opinion, you have to write your own custom Arango reader that implements paging, possibly by extending the abstract class org.springframework.batch.item.database.AbstractPagingItemReader.
If it's not doable by extending the above class, you might have to implement everything from scratch. All of the paging readers in the Spring Batch API extend this abstract class, including HibernatePagingItemReader.
Also, remember that the Arango record set should have some kind of ordering to implement pagination, so we can distinguish between page 0 and page 1, etc. (similar to the ORDER BY clause, the BETWEEN operator, and the less-than/greater-than operators in SQL; something like a FETCH FIRST n ROWS or LIMIT clause would be needed too).
Implementing it on your own is not a very tough task: you have to calculate the total number of items, order them, divide them into pages, and fetch only one page at a time.
Look at the API for implementations like HibernatePagingItemReader to get ideas.
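A possible shape for such a reader, assuming the Spring Batch 4.x AbstractPagingItemReader API and a hypothetical paged lookup method on your PersistenceService (the method name and its AQL behaviour are assumptions):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.springframework.batch.item.database.AbstractPagingItemReader;

public class ArangoPagingItemReader extends AbstractPagingItemReader<InstanceDTO> {

    private final PersistenceService arangoPersistence;
    private final Long definitionId;

    public ArangoPagingItemReader(PersistenceService arangoPersistence, Long definitionId) {
        this.arangoPersistence = arangoPersistence;
        this.definitionId = definitionId;
        setPageSize(250);                      // fetch 250 documents per page
        setName("arangoPagingItemReader");     // needed for the execution context
    }

    @Override
    protected void doReadPage() {
        if (results == null) {
            results = new CopyOnWriteArrayList<>();
        } else {
            results.clear();
        }
        // Hypothetical method: it should run an AQL query with SORT plus
        // LIMIT offset, count so that pages stay stable between calls.
        List<InstanceDTO> page = arangoPersistence.getDefinitionInstances(
                definitionId, getPage() * getPageSize(), getPageSize());
        results.addAll(page);
    }

    @Override
    protected void doJumpToPage(int itemIndex) {
        // nothing to do: doReadPage() always queries by the current page/offset
    }
}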
Hope it helps !!
