I have a query in my Spring Boot project that pulls data with limit/offset, 100 rows at a time. After pulling the data, I want to write all filtered rows to a CSV file. I am not sure how to proceed from here.
Repository:
public List<Response> filterData(FilterOptions filterOptions, int limit, int offset)
{
    String sql = "select name, date, sum ... "
        + "from metadata join report "
        + "where <filters> "
        + "limit :limit offset :offset";
    // native query mapped to a list of Response entities
    return responseList;
}
Service:
public void exportData(FilterOptions filterOptions, int totalRows)
{
    for (int offset = 0; offset < totalRows; offset += 100)
    {
        List<Response> data = repository.filterData(filterOptions, 100, offset);
        writeToCsv(data);
    }
}
In the service part, how should I handle the writeToCsv step? Would Spring Batch be useful for my case, or should I use OpenCSV? Can I introduce parallelism here?
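For reference, a minimal sketch of what writeToCsv could look like with OpenCSV's CSVWriter, appending one page of rows per call (the Response getters and the output path are assumptions, not taken from the original code):

import com.opencsv.CSVWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

// Sketch only: assumes Response exposes getName(), getDate() and getSum().
private void writeToCsv(List<Response> data) throws IOException
{
    // Open in append mode so each page of 100 rows lands in the same file.
    try (CSVWriter writer = new CSVWriter(new FileWriter("export.csv", true)))
    {
        for (Response r : data)
        {
            writer.writeNext(new String[] {
                r.getName(),
                String.valueOf(r.getDate()),
                String.valueOf(r.getSum())
            });
        }
    }
}

Keeping a single writer open across all pages (and writing the header row once, before the loop) would avoid reopening the file for every batch of 100 rows. Parallelising the page fetches is possible, but the writes would then need to be serialized or merged afterwards.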
The goal of my Spring Boot WebFlux R2DBC application is: the Controller accepts a Request containing a list of DB UPDATE or INSERT details, and sends back a Response with a result summary.
I can write a ReactiveCrudRepository-based repository to implement each DB operation, but I don't know how to write the Service that groups the execution of the list of DB operations and composes a result summary response.
I am new to Java reactive programming. Thanks for any suggestions and help.
Chen
I got the hint from here: https://www.vinsguru.com/spring-webflux-aggregation/ . The ideas are:
From the request, create 3 Monos:
Mono<List<Long>> monoEndDateSet -- DB row ids of the update operations;
Mono<List<Long>> monoCreateList -- DB row ids of the newly inserted rows;
Mono<ChangeSupplyResponse> monoRespFilled -- with some known fields partly filled;
then use Mono.zip to aggregate the 3 Monos, and map the resulting Tuple3 into the Mono<ChangeSupplyResponse> to return.
Below is the key part of the code:
public Mono<ChangeSupplyResponse> changeSupplies(ChangeSupplyRequest csr) {
    // Pre-fill the response with the fields already known from the request.
    ChangeSupplyResponse resp = ChangeSupplyResponse.builder().build();
    resp.setEventType(csr.getEventType());
    resp.setSupplyOperationId(csr.getSupplyOperationId());
    resp.setTeamMemberId(csr.getTeamMemberId());
    resp.setRequestTimeStamp(csr.getTimestamp());
    resp.setProcessStart(OffsetDateTime.now());
    resp.setUserId(csr.getUserId());
    Mono<List<Long>> monoEndDateSet = getEndDateIdList(csr);
    Mono<List<Long>> monoCreateList = getNewSupplyEntityList(csr);
    Mono<ChangeSupplyResponse> monoRespFilled = Mono.just(resp);
    // Zip the three Monos, combine their results, and wrap the whole thing in one transaction.
    return Mono.zip(monoRespFilled, monoEndDateSet, monoCreateList).map(this::combine).as(operator::transactional);
}
private ChangeSupplyResponse combine(Tuple3<ChangeSupplyResponse, List<Long>, List<Long>> tuple) {
    // Copy the pre-filled response, then attach the ids produced by the two DB operations.
    ChangeSupplyResponse resp = tuple.getT1().toBuilder().build();
    resp.setEndDatedDemandStreamSupplyIds(tuple.getT2());
    resp.setNewCreatedDemandStreamSupplyIds(tuple.getT3());
    resp.setSuccess(true);
    Duration span = Duration.between(resp.getProcessStart(), OffsetDateTime.now());
    resp.setProcessDurationMillis(span.toMillis());
    return resp;
}
private Mono<List<Long>> getNewSupplyEntityList(ChangeSupplyRequest csr) {
    // Merge the reactive save of every createSupply operation into one Flux,
    // then collect the generated ids into a single list.
    Flux<DemandStreamSupplyEntity> fluxNewCreated = Flux.empty();
    for (SrmOperation so : csr.getOperations()) {
        if (so.getType() == SrmOperationType.createSupply) {
            DemandStreamSupplyEntity e = buildEntity(so, csr);
            fluxNewCreated = fluxNewCreated.mergeWith(this.demandStreamSupplyRepository.save(e));
        }
    }
    return fluxNewCreated.map(DemandStreamSupplyEntity::getDemandStreamSupplyId).collectList();
}
...
I am trying to see if there is a way to improve how data is inserted and updated.
I am using an Oracle DB with JDBC.
Currently I update, e.g., customer records with a FOR loop after checking whether toUpdate is true, as in the sample code below, calling an existing DAO update() for each record. But this does not allow me to UPSERT multiple records together.
Is there a better way to UPSERT multiple records together?
if (toUpdate) {
    for (Customer customerRec : customerRecList) {
        customerRecDAO.update(customerRec);
    }
}
Yes, you can use batching:
public <T> int saveInBatch(List<T> records, String sql, Function<T, MapSqlParameterSource> paramFn) {
    try {
        MapSqlParameterSource[] params = records.stream().map(paramFn).toArray(MapSqlParameterSource[]::new);
        // batchUpdate returns one update count per statement; sum them for the total.
        // Note: MapSqlParameterSource requires a NamedParameterJdbcTemplate, not a plain JdbcTemplate.
        int[] rowCounts = namedParameterJdbcTemplate.batchUpdate(sql, params);
        return Arrays.stream(rowCounts).sum();
    } catch (Exception e) {
        // exception handling
        return 0;
    }
}
paramFn is a function (lambda) that maps a record to its parameter values. An example could be:
record -> new MapSqlParameterSource("username", record.getUsername()) // just an example
We use MapSqlParameterSource because it binds the values to the named parameters (such as :username) in the SQL.
You can call saveInBatch in such a way that you pass smaller, customized batches of records. Suppose you have a million records; you may want to update only 200-400 records at a time, so you can do something like this:
private <T> int saveRecords(List<T> records, String sql, Function<T, MapSqlParameterSource> paramFn) throws Exception {
    // Lists.partition (from Guava) splits the list into chunks of 300 records each.
    return Lists.partition(records, 300).stream().map(batch -> saveInBatch(batch, sql, paramFn)).mapToInt(Integer::intValue).sum();
}
Note: the above is not perfectly optimized and the streams are not used to their best, but it is working code I tried ages back :).
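Since the question is specifically about UPSERT, the sql handed to saveInBatch could be an Oracle MERGE statement. A hypothetical sketch for a customer table (table, column and getter names are assumptions):

// Hypothetical Oracle upsert; adjust table and column names to your schema.
String sql =
    "MERGE INTO customer c " +
    "USING (SELECT :id AS id, :name AS name FROM dual) src " +
    "ON (c.id = src.id) " +
    "WHEN MATCHED THEN UPDATE SET c.name = src.name " +
    "WHEN NOT MATCHED THEN INSERT (id, name) VALUES (src.id, src.name)";

int affected = saveInBatch(customerRecList, sql,
        c -> new MapSqlParameterSource("id", c.getId()).addValue("name", c.getName()));

This way each batched statement inserts the row if it is new and updates it otherwise, which is exactly the UPSERT the question asks about.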
I have a document like this:
'subject' : {
    'name' : "....",
    'facebookPosts' : [
        {
            date: "14/02/2017 20:20:03", // it is a string
            text: "facebook post text here",
            // other stuff here
        }
    ]
}
and I want to count the facebookPosts within a specific subject whose date field contains e.g. "23/07/2016".
Right now I do that by fetching all the documents and counting on the client side (Spring), but I think that's not efficient.
You need to aggregate your results.
final Aggregation aggregation = Aggregation.newAggregation(
        // Narrow down to documents that contain at least one matching post.
        Aggregation.match(Criteria.where("facebookPosts.date").regex(REGEX)),
        Aggregation.unwind("facebookPosts"),
        // Match again after the unwind, so only the matching posts themselves are counted.
        Aggregation.match(Criteria.where("facebookPosts.date").regex(REGEX)),
        Aggregation.group().count().as("count"));
A regex might not be the best solution; it is just an example.
unwind splits the array into separate elements, which you can then match and count.
Create a class that will hold the count, something like:
public class PostCount {
private Long count;
// getters, setters
}
And then execute it like this:
AggregationResults<PostCount> postCount = mongoTemplate.aggregate(aggregation, Subject.class, PostCount.class);
// getMappedResults() is empty when nothing matched, so guard before reading the first element.
long count = postCount.getMappedResults().isEmpty() ? 0 : postCount.getMappedResults().get(0).getCount();
Is it possible to fetch data in a user-defined range [int first record - int last record]?
In my case the user will define, in the query string, the range in which he wants to fetch data.
I have tried something like this:
Pageable pageable = new PageRequest(0, 10);
Page<Project> list = projectRepository.findAll(spec, pageable);
Where spec is my defined Specification, but unfortunately this does not help.
Maybe I am doing something wrong here.
I have looked at the other methods Spring Data JPA provides, but none were of much help.
The user can enter something like this: localhost:8080/Section/employee? range{"columnName":name,"from":6,"to":20}
This says to fetch employee data: it will fetch the 15 records from 6 to 20, sorted by columnName (the sorting does not matter as of now).
If you can suggest something better, that would be great. If you think I have not provided enough information, please let me know and I will provide it.
Update: I do not want to use native queries or createQuery statements (until I have no other option).
Maybe something like this:
Pageable pageable = new PageRequest(0, 10);
Page<Project> list = projectRepository.findAll(spec, new Pageable(int startIndex, int endIndex) {
    // here my logic.
});
If you have better options, you can suggest those as well.
Thanks.
Your approach didn't work because new PageRequest(0, 10) doesn't do what you think. As stated in the docs, the input arguments are page and size, not limit and offset.
As far as I know (and somebody correct me if I'm wrong), there is no out-of-the-box support for what you need in the default Spring Data repositories. But you can create a custom implementation of Pageable that takes limit/offset parameters. Here is a basic example - Spring data Pageable and LIMIT/OFFSET
We can do this with pagination, by setting the database table column name, the value and the row range as below:
@Transactional(readOnly = true)
public List<String> queryEmployeeDetails(String columnName, String columnData, int startRecord, int endRecord) {
    // The column name cannot be a bind parameter, so it is concatenated into the HQL;
    // validate it against a whitelist to avoid injection.
    Query query = sessionFactory.getCurrentSession()
            .createQuery("from Employee emp where emp." + columnName + " = :columnValue");
    query.setParameter("columnValue", columnData);
    query.setFirstResult(startRecord);            // offset: index of the first row
    query.setMaxResults(endRecord - startRecord); // limit: a row count, not the index of the last row
    List<String> list = (List<String>) query.list();
    return list;
}
If I am understanding your problem correctly, you want your repository to allow the user to
Provide criteria for the query (through a Specification)
Provide a column to sort on
Provide the range of results to retrieve.
If my understanding is correct, then:
In order to achieve 1., you can make use of JpaSpecificationExecutor from Spring Data JPA, which allows you to pass in a Specification for the query.
Both 2. and 3. are achievable in JpaSpecificationExecutor by use of Pageable, as sketched below. Pageable allows you to provide the starting index, the number of records, and the sorting columns for your query. You will need to implement your own range-based Pageable. PageRequest is a good reference for what you can implement (or you can extend it, I believe).
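For illustration, a minimal sketch of such a repository (the Employee entity is an assumption, and Range is a custom Pageable like the one shown in the next answer):

import org.springframework.data.domain.Page;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor;

// JpaSpecificationExecutor contributes findAll(Specification, Pageable) to the repository.
public interface EmployeeRepository
        extends JpaRepository<Employee, Long>, JpaSpecificationExecutor<Employee> {
}

// Usage: criteria come from the Specification, range and sort from the custom Pageable.
Page<Employee> page = employeeRepository.findAll(spec, new Range(6, 20, "name"));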
So I got this working as one of the answers suggested: I implemented my own Pageable and overrode getPageSize(), getOffset() and getSort(); that's it (in my case I did not need more).
public class Range implements Pageable {

    private final int startIndex;
    private final int endIndex;
    private final String sortBy;

    public Range(int startIndex, int endIndex, String sortBy) {
        this.startIndex = startIndex;
        this.endIndex = endIndex;
        this.sortBy = sortBy;
    }

    @Override
    public int getPageSize() {
        if (endIndex == 0)
            return 0;
        return endIndex - startIndex;
    }

    @Override
    public int getOffset() {
        return startIndex;
    }

    @Override
    public Sort getSort() {
        if (sortBy != null && !sortBy.equalsIgnoreCase(""))
            return new Sort(Direction.ASC, sortBy);
        return new Sort(Direction.ASC, "id");
    }

    // remaining Pageable methods omitted here, as in the original post
}
where startIndex and endIndex are the first and last index of the records.
To access it:
repository.findAll(spec, new Range(0, 20, "id"));
There is no offset parameter you can simply pass. However, there is a very simple conversion as long as the offset is a multiple of the limit:
int pageNumber = offset / limit; // integer division: offset 40 with limit 20 is page 2
PageRequest pReq = PageRequest.of(pageNumber, limit);
The client just has to keep track of the offset instead of the page number. By this I mean your controller would receive the offset instead of the page number.
Hope this helps!
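A hypothetical controller under that convention (the mapping, entity and repository names are illustrative, not from the original posts):

import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;

// Sketch: the client sends offset and limit; the controller converts them to a page number.
@GetMapping("/projects")
public Page<Project> listProjects(@RequestParam int offset, @RequestParam int limit) {
    int pageNumber = offset / limit; // assumes offset is a multiple of limit
    return projectRepository.findAll(PageRequest.of(pageNumber, limit));
}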
I am using Entity Framework with the repository pattern and unit of work objects.
I have an entity Request with the properties "RequestId" and "OldRequestId", which can be accessed using a requestRepository object,
e.g. requestRepository.GetAll(), requestRepository.GetFiltered(r => r.Requestid == 10).
If I pass a RequestId, it should retrieve the specific record.
If OldRequestId is not null in the retrieved record, it should bring the old request data as well.
It should go on until OldRequestId is null.
A simple way would be something like this:
public static IEnumerable<Request> GetRecursive(int id)
{
    while (true)
    {
        // GetFiltered returns a collection, so take the single matching record.
        var tmp = GetFiltered(x => x.Requestid == id).FirstOrDefault();
        if (tmp == null)
            yield break;
        yield return tmp;
        if (tmp.OldRequestId.HasValue)
            id = tmp.OldRequestId.Value;
        else
            yield break;
    }
}
Please note that this code would make multiple queries against the database. Performance won't be the best, but it might work for your scenario.