Spring batch - Using mybatis pagination on union operation - spring

Is there a better way to handle MyBatis pagination when using a union query?
This is not a strong use case; it just illustrates my actual problem.
I need a union in order to get records from 2 different tables.
The problem: if I set pageSize to 100, for example, and 10 students have 20 records each, I get only 100 records even though there are 200. As a result, when I print the number of records per student in the example class below, I do not see all of them.
For example:
<code>
with student AS (
    select * from (
        select studentId, name, class,
               ROW_NUMBER() over (order by studentId) as paginationRank
        from Student
    ) std
    where paginationRank > #{_skipRows}
      and paginationRank <= (#{_pageSize} * (#{_page} + 1))
)
select student.studentId, attendanceRegfields......Creditsfields....
from student
left outer join (
    select ..... from attendanceReg
    union all
    select .... from Credits
) all_records
on all_records.studentId = student.studentId
</code>
In my item writer, I group the records I receive by student and process each group:
<code>
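import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.batch.item.ItemWriter;

// Spring Batch 4.x signature; in 5.x write() takes a Chunk<? extends Student> instead.
class MyItemWriter implements ItemWriter<Student> {
    @Override
    public void write(List<? extends Student> studentRecords) {
        // Group the flat rows by student so each student's records are processed together.
        Map<String, List<Student>> studentRecordsMap = studentRecords.stream()
                .collect(Collectors.groupingBy(Student::getStudentId));
        studentRecordsMap.forEach((studentId, records) -> process(records));
    }

    private void process(List<Student> studentRecords) {
        // here I am processing all records
    }
}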
</code>

Each record returned by your query should represent one item (a Student in your case). Your item reader should return a complete student item with all its child records (apparently Credit records from your query). Otherwise pagination will not return correct results, because the page size then counts rows rather than complete students.
What you need in your case is the Driving Query Pattern: your reader reads only students (without child records), and a processor then completes each student with its child records (basically the result of the union query for the current item). With this approach, pagination works on students only, regardless of how many child records each item has.
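The driving query pattern described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Spring Batch or MyBatis code: `readStudentPage` stands in for a paged reader that selects students only, `loadChildRecords` stands in for the union query executed per student inside a processor, and every class and method name here is hypothetical. The data is generated in memory to mirror the example of 10 students with 20 records each.

```java
import java.util.ArrayList;
import java.util.List;

class DrivingQuerySketch {

    // Hypothetical item: one student plus all of its child records.
    static final class StudentItem {
        final int studentId;
        final List<String> records = new ArrayList<>();

        StudentItem(int studentId) {
            this.studentId = studentId;
        }
    }

    // Stand-in for the paged driving query: it returns student ids only,
    // so the page size counts students, not joined rows.
    static List<Integer> readStudentPage(List<Integer> allIds, int page, int pageSize) {
        int from = Math.min(page * pageSize, allIds.size());
        int to = Math.min(from + pageSize, allIds.size());
        return allIds.subList(from, to);
    }

    // Stand-in for the union query (attendanceReg UNION ALL Credits),
    // executed for a single student. Rows are generated, not read from a database.
    static List<String> loadChildRecords(int studentId, int perTable) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < perTable; i++) {
            rows.add("attendance-" + studentId + "-" + i);
        }
        for (int i = 0; i < perTable; i++) {
            rows.add("credit-" + studentId + "-" + i);
        }
        return rows;
    }

    // Processor step: complete one student with its child records.
    static StudentItem process(int studentId, int perTable) {
        StudentItem item = new StudentItem(studentId);
        item.records.addAll(loadChildRecords(studentId, perTable));
        return item;
    }

    // Read one page of students, then enrich each of them.
    static List<StudentItem> runPage(List<Integer> allIds, int page, int pageSize, int perTable) {
        List<StudentItem> items = new ArrayList<>();
        for (int id : readStudentPage(allIds, page, pageSize)) {
            items.add(process(id, perTable));
        }
        return items;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            ids.add(i);
        }
        List<StudentItem> page0 = runPage(ids, 0, 100, 10);
        int total = 0;
        for (StudentItem s : page0) {
            total += s.records.size();
        }
        // prints "10 students, 200 child records"
        System.out.println(page0.size() + " students, " + total + " child records");
    }
}
```

With a page size of 100, the single page holds all 10 students and each one carries its 20 child records, because the page size now counts students rather than joined rows. The trade-off is one extra query per item.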
Hope this helps.

Related

Oracle SQL query with CASE WHEN EXISTS subquery optimization

I'm using the following query to create a view in Oracle 11g (11.2.0.3.0).
CREATE OR REPLACE FORCE VIEW V_DOCUMENTS_LIST
(
ID_DOC,
ATTACHMENTS_COUNT,
TOTAL_DIMENSION,
INSERT_DATE,
ID_STATE,
STATE,
ID_INSTITUTE,
INSTITUTE,
HASJOB
)
AS
SELECT D.ID_DOC,
COUNT (F.ID_FILE) AS ATTACHMENTS_COUNT,
CASE
WHEN SUM (F.DIMENSION) IS NULL THEN 0
ELSE SUM (F.DIMENSION)
END
AS TOTAL_DIMENSION,
D.INSERT_DATE,
D.ID_STATE,
S.STATE_DESC AS STATE,
D.ID_INSTITUTE,
E.NAME AS INSTITUTE,
CASE
WHEN EXISTS (SELECT D.ID_DOC FROM JOB) THEN 'true'
ELSE 'false'
END
AS HASJOB
FROM DOCUMENTS D
LEFT JOIN FILES F ON D.ID_DOC = F.ID_DOC
JOIN STATES S ON D.ID_STATE = S.ID_STATE
JOIN INSTITUTES E ON D.ID_INSTITUTE = E.ID_INSTITUTE
GROUP BY D.ID_DOC,
D.INSERT_DATE,
D.ID_STATE,
S.STATE_DESC,
D.ID_INSTITUTE,
E.NAME;
Then I query that view to get the values for a DataGridView in an ASPX page.
SELECT *
FROM V_DOCUMENTS_LIST
ORDER BY ID_STATE DESC, INSTITUTE, INSERT_DATE DESC;
Relevant tables and relations
DOCUMENTS; FILES; JOBS;
DOCUMENTS (1-1) <----> (0-N) FILES
JOBS (0-1) <----> (0-N) DOCUMENTS
Querying the view I get the complete list of documents with all their associated information (ID, description, dates, state, etc.) and also for each one:
total count of attached files;
total dimension in bytes of attached files;
boolean value indicating whether there's at least one JOB associated to the DOCUMENT or not.
Everything worked fine while the view contained a few thousand records. Now the number of records is growing, and SELECT * FROM the view takes about 2:30 minutes with 15,000-20,000 records.
I know that a really time consuming part of my view is the nested SELECT:
CASE
WHEN EXISTS (SELECT D.ID_DOC FROM JOB) THEN 'true'
ELSE 'false'
END
AS HASJOB
How can I optimize my view?
To avoid the EXISTS subquery, you can add a join:
LEFT JOIN (select distinct id_doc from JOB) J
ON d.id_doc = J.id_doc
The HASJOB column would then be:
CASE
WHEN j.id_doc is not null THEN 'true'
ELSE 'false'
END AS HASJOB
PS: Your current implementation has a problem: SELECT D.ID_DOC FROM JOB always returns rows if the JOB table has any rows. It is equivalent to SELECT * FROM JOB, because EXISTS only tests whether rows exist. A logically correct implementation would be: SELECT 1 FROM JOB j WHERE j.id_doc = D.ID_DOC.
You are scanning the whole JOB table; put a WHERE clause in the subquery:
SELECT D.ID_DOC FROM JOB

LINQ Join and performing aggregate functions

I am having trouble writing a LINQ query that joins three tables and then applies aggregate functions to the rows. Any help would be appreciated.
I have three tables
Table 1: Students (Id, Name)
Table 2: Subject (SubID, Title, Id)
Table 3: Grade (Id, SubID, marks)
I have to write a LINQ query that returns the following:
Count of Students table rows
Count of Grade table rows
Sum of marks of all rows in Grade table
I have written the query below, but I feel it is not correct.
var _Count = from student in _context.Students
join subject in _context.Subject on student.Id equals subject.Id
join grade in _context.Grade on subject.SubID equals grade.SubID
// How to group them?
select new { //How to take and return the counts?};

Need help on oracle query

I have two Oracle tables: the first contains student info and the second contains student transaction details. I want a SQL query that reports the transaction details for each student, e.g. student ID, name, amount, transaction date, etc.
Note that a student can have many transactions. If the student with ID 1 bought 3 items, the query result should show student ID 1 once, with the sum of the 3 items bought.
I don't want the student ID repeated 3 times, once for each item bought.
Thanks
EDIT:
Here's the query I have so far:
select
distinct(s.spriden_id),
s.spriden_last_name,
s.spriden_first_name,
t.tbraccd_detail_code,
t.sum(tbraccd_amount),
t.tbraccd_term_code,
t.tbraccd_user,
t.TBRACCD_DATE
from SPRIDEN s, TBRACCD t
where s.spriden_pidm = t.tbraccd_pidm
and t.tbraccd_term_code = 201320
and t.tbraccd_desc = 'Misc Book Store Charges';
(The first table is SPRIDEN while the second table is TBRACCD)
You can use GROUP BY to group students, as below:
select
s.spriden_id,
sum(t.tbraccd_amount)
from SPRIDEN s, TBRACCD t
where s.spriden_pidm = t.tbraccd_pidm
and t.tbraccd_term_code = 201320
and t.tbraccd_desc = 'Misc Book Store Charges'
GROUP BY s.spriden_id;
MODIFIED VERSION to select all columns:
select
s.spriden_id,
t.tbraccd_entry_date,
t.tbraccd_term_code,
t.tbraccd_user,
sum(t.tbraccd_amount)
from SPRIDEN s, TBRACCD t
where s.spriden_pidm = t.tbraccd_pidm
and t.tbraccd_term_code = 201320
and t.tbraccd_desc = 'Misc Book Store Charges'
GROUP BY
s.spriden_id,
t.tbraccd_entry_date,
t.tbraccd_term_code,
t.tbraccd_user;

Need to select column from subquery into main query

I have a query like below - table names etc. changed for keeping the actual data private
SELECT inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
I am selecting records from Invoice based on Order. i.e. all records from Invoice which are common with order records for today, based on those 3 columns...
Now I want to select Order_Num from Order in my select query as well, so that I can insert the whole result into a totally separate table, let's say orderedInvoices.
insert into orderedInvoices(seq_no,..same columns as Inv...,Cr_dt)
(
SELECT **Order.Order_Num**, inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
)
How do I select that Order_Num in the main query for each record matched by the subquery?
P.S. I understand that TRUNC(cr_dt) will not use an index on cr_dt (if there is one), but I couldn't select the records without dropping the time part.
If the table ORDER [1] is unique on CARRIER, PRO and N_DT, you can use a JOIN instead of IN to restrict your records; it also lets you select whatever data you want from either table:
select ord.order_num, inv.*, trunc(sysdate)
from Invoice inv
join order ord
on inv.carrier = ord.carrier
and inv.pro = ord.pro
and inv.ndate = ord.n_dt
where trunc(ord.cr_dt) = trunc(sysdate)
If it's not unique then you have to use DISTINCT to deduplicate your record set.
Although a predicate on TRUNC(cr_dt) cannot use a plain index on that column, you can create a function-based index on the expression if you do need one.
create index i_order_trunc_cr_dt on order (trunc(cr_dt));
1. ORDER is a really bad name for a table, as it's a reserved word; consider using ORDERS instead.

Linq to sum on groups based on two columns

I have a class as below:
class Financial
{
string Debit;
string Credit;
decimal Amount;
}
And I have a list of objects of this class, with multiple records. All I need is to perform a grouped sum, something like this in SQL:
Select debit, Credit, sum(amount) group by Debit, Credit
I tried with a statement as below:
from operation in m_listOperations
orderby operation.Debit, operation.Credit ascending
group operation by operation.Debit, operation.Credit into groupedOperation
select new Financial(Debit, Credit,
groupedOperation.Sum(operation => operation.Amount))
But this doesn't work since I cannot group on two columns.
Any suggestions?
...
group operation by new { operation.Debit, operation.Credit } into groupedOperation
...
