Is it possible to update the database in JdbcCursorItemReader - Spring

I am using Spring Batch's JdbcCursorItemReader to read a set of data from a table. Once the data is read, Spring Batch processes each row in a chunk (reader, processor, writer). Now I want to update/delete the records my reader fetched, to avoid reprocessing by another instance of the same job. Can someone please tell me how I can do this in the reader?
Thanks

As has been pointed out, this may be a bad design idea. However, if you're sure this is what you want to do,
create a two-step job (a sketch follows below):
Step A, with a commit interval of 1:
Read the record
Write the record back, stamped with the current job execution id
Step B:
Read the records whose job execution id is the current job execution id
Process and update as needed
Notes
I do not recommend this approach, for the reasons stated in the comments
A commit interval of 1 will hurt you badly performance-wise, so this approach, if ever used, should be reserved for low-volume jobs only.
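A minimal sketch of the claim-then-process idea under these assumptions: a PENDING_RECORDS table with a nullable JOB_EXECUTION_ID column, and items keyed by a numeric ID. All table, column, and bean names here are illustrative, not part of the original answer.

```java
import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class ClaimJobConfig {

    // Step A writer (used with a commit interval of 1): stamp each record with
    // the current job execution id so other instances of the job skip it.
    @Bean
    @StepScope
    public ItemWriter<Long> claimWriter(JdbcTemplate jdbcTemplate,
            @Value("#{stepExecution.jobExecutionId}") Long jobExecutionId) {
        return ids -> {
            for (Long id : ids) {
                jdbcTemplate.update(
                        "UPDATE PENDING_RECORDS SET JOB_EXECUTION_ID = ? WHERE ID = ?",
                        jobExecutionId, id);
            }
        };
    }

    // Step B reader: only pick up the records claimed by this execution.
    @Bean
    @StepScope
    public JdbcCursorItemReader<Long> claimedReader(DataSource dataSource,
            @Value("#{stepExecution.jobExecutionId}") Long jobExecutionId) {
        return new JdbcCursorItemReaderBuilder<Long>()
                .name("claimedReader")
                .dataSource(dataSource)
                .sql("SELECT ID FROM PENDING_RECORDS WHERE JOB_EXECUTION_ID = ?")
                .preparedStatementSetter(ps -> ps.setLong(1, jobExecutionId))
                .rowMapper((rs, rowNum) -> rs.getLong("ID"))
                .build();
    }
}
```

Step A would use this writer in a chunk of size 1 so each claim is committed immediately; Step B then processes only the rows this execution claimed.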

Related

Spring Batch Transaction using Chunk based processing

I am using chunk-based processing with Spring Batch to read data in chunks from the DB using JdbcPagingItemReader.
Now, I kill the task in the middle of the write stage of a chunk. Ideally the previous records in the chunk should have been rolled back, but that did not happen.
The DB used here is DB2.
The approach I used was to set autocommit to false for the connection and then, after the write steps were complete, issue the commit. This approach worked fine for a small set of data, but in real life there would be millions of records.
So, is this the right approach, and if not, what are the other solutions?
Thanks!
If I'm understanding you correctly: you want to process thousands of records without committing, and commit only when everything is done?
Don't do that. You will have serious problems, both in the application and in the database.
For cases like that you are better off with another strategy, for example:
A temporary table. The job keeps committing into it, and when it finishes, it analyzes the result; if everything is right, it runs an update from the temporary table to the final table.
You have to divide and conquer:
Run the process that generates the thousands of rows into a temporary table.
Analyze the result; this can be an analysis that deletes the unsatisfactory rows, or even one that rejects the whole run.
Perform what should be done based on that analysis.
I would create a Spring Batch step for each of the stages described above, i.e. three or more steps (a sketch follows below).
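For illustration, the three-step layout could be wired like this in Spring Batch (the step beans are hypothetical placeholders for the load/analyze/move stages; the JobBuilder constructor shown is the Spring Batch 5 style):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DivideAndConquerJobConfig {

    @Bean
    public Job divideAndConquerJob(JobRepository jobRepository,
            Step loadIntoTempTable, Step analyzeTempTable, Step moveToFinalTable) {
        return new JobBuilder("divideAndConquerJob", jobRepository)
                .start(loadIntoTempTable) // chunk-oriented step, commits per chunk into the temp table
                .next(analyzeTempTable)   // tasklet: delete unsatisfactory rows, or fail the whole run
                .next(moveToFinalTable)   // tasklet: move the surviving rows to the final table
                .build();
    }
}
```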

Spring batch to read CSV and update data in bulk to MySQL

I have the requirement below for a Spring Batch job, and I would like to know the best approach to achieve it.
Input: A relatively large file with report data (for today)
Processing:
Update the daily table and the monthly table based on today's report data:
Daily table: just update the counts based on ID
Monthly table: add today's count to the existing value
My concerns are:
1. Since the data is huge, I may end up with multiple DB transactions. How can I do this operation in bulk?
2. To add to the existing counts in the monthly table, I must have the existing counts at hand. I may have to maintain a map beforehand. But is this a good way to process it?
Please suggest the approach I should follow, with an example if there is one.
Thanks.
You can design a chunk-oriented step that first inserts the daily data from the file into the table. When this step is finished, you can use a step execution listener: in the afterStep method you have a handle to the StepExecution, from which you can get the write count with StepExecution#getWriteCount. You can then write this count to the monthly table (see the sketch below).
since data is huge I may end up having multiple DB transactions. How can I do this operation in bulk?
With a chunk-oriented step, data is already written in bulk (one transaction per chunk). This model works very well even if your input file is huge.
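For illustration, a JdbcBatchItemWriter sends all items of a chunk as a single JDBC batch inside one transaction; the table and column names below are assumptions:

```java
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;

public class DailyWriterFactory {

    // Each chunk handed to this writer becomes one JDBC batch, committed as one transaction.
    public static JdbcBatchItemWriter<Map<String, Object>> dailyWriter(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Map<String, Object>>()
                .dataSource(dataSource)
                .sql("UPDATE DAILY_COUNTS SET CNT = CNT + :count WHERE ID = :id")
                .columnMapped() // items are Maps whose keys fill the named parameters
                .build();
    }
}
```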
To add to the existing counts of the monthly table, I must have the existing counts with me. I may have to maintain a map beforehand. But is this a good way to process in this way?
No need to store the info in a map; you can get the write count from the step execution after the step, as explained above.
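A sketch of that listener, assuming a MONTHLY_COUNTS table keyed by month (the table layout is an assumption):

```java
import java.time.YearMonth;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.jdbc.core.JdbcTemplate;

public class MonthlyCountListener implements StepExecutionListener {

    private final JdbcTemplate jdbcTemplate;

    public MonthlyCountListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // nothing to do before the step
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        // number of items actually written by the daily step
        long writeCount = stepExecution.getWriteCount();
        // add today's count to the existing monthly value
        jdbcTemplate.update(
                "UPDATE MONTHLY_COUNTS SET CNT = CNT + ? WHERE MONTH = ?",
                writeCount, YearMonth.now().toString());
        return stepExecution.getExitStatus();
    }
}
```

It would be registered on the daily step with .listener(...) so that it runs once, after the step completes.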
Hope this helps.

best practices with Oracle DBMS_Scheduler

What is good practice with Oracle DBMS_Scheduler:
Keeping a job scheduled (disabled) all the time, and enabling and running it when needed, or
Creating the job, running it, and dropping it?
I have a table X, and whenever a record gets submitted to that table, I should have a job process that record.
Record insertions may or may not happen all the time.
Keeping this in mind, what's better?
Processing rows as they appear in a table in an asynchronous process can be done in a number of different ways; choose the one that suits you:
Add a trigger to the table which creates a one-off job to process the row using DBMS_JOB. This is suitable if the volume of data being inserted to the table is quite low, and you don't want your job running all the time. The advantage of DBMS_JOB is that the job will not start until the insert is committed; if it is rolled back, the job is also rolled back so doesn't run. The disadvantage is that if there is a sustained spike of activity, all the jobs created will crowd out any other jobs that are running.
Create a single job using DBMS_SCHEDULER which runs regularly, polls the table for new records and processes them. This method would need a column on the table that it can update to mark each record as "processed". For example, add a VARCHAR2(1) flag which is set to 'Y' on insert and set to NULL by the job after processing. You could add an index to that flag which will only store entries for unprocessed rows (so it will be small and fast). This method is much more efficient, especially for large data volumes, because each run of the job can effectively process large chunks of data in bulk at a time.
Use Oracle Advanced Queueing. http://docs.oracle.com/cd/E11882_01/server.112/e11013/aq_intro.htm#ADQUE0100
For (1), a separate job is created for each record in the table. You don't need to create the jobs. You do need to monitor them, however; if one fails, you would need to investigate and re-run manually.
For (2), you just create one job and let it run regularly. If one record fails, it can be picked up by the next iteration of the job. I would process each record in a separate transaction so the failure of one record doesn't affect the processing of the other records still in the queue.
For (3), you still create a job like (2) but instead of reading the table it pulls requests off a queue.
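The poll-and-flag pattern in option (2) is independent of the language the job body is written in; the actual DBMS_SCHEDULER job would be PL/SQL, but here is the same loop sketched in Java/JDBC purely as an illustration. It assumes table X with a FLAG column as described above, and a driver (like Oracle's) that keeps the cursor open across commits:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PollAndFlag {

    // One polling pass: pick up unprocessed rows (FLAG = 'Y'), handle each in
    // its own transaction, and clear the flag so the row is not seen again.
    public static void pollOnce(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement select = conn.prepareStatement(
                     "SELECT ID FROM X WHERE FLAG = 'Y'");
             PreparedStatement clear = conn.prepareStatement(
                     "UPDATE X SET FLAG = NULL WHERE ID = ?")) {
            try (ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    long id = rs.getLong("ID");
                    try {
                        processRecord(conn, id); // application-specific work
                        clear.setLong(1, id);
                        clear.executeUpdate();
                        conn.commit();   // one transaction per record, so one
                                         // failure doesn't block the rest
                    } catch (SQLException e) {
                        conn.rollback(); // leave the flag set; the next run retries
                    }
                }
            }
        }
    }

    private static void processRecord(Connection conn, long id) throws SQLException {
        // hypothetical per-record processing
    }
}
```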

Quartz scheduler: how do I set up dynamic job arguments

I'm setting up a Quartz-driven job in a Spring application. The job needs a single argument: the id of a database record it can use to locate the data it needs to process.
The sequence is:
Job starts,
locates next available record id,
processes data.
Because the record id is unknown until the job starts, I cannot set it up when I create the job. I also need to account for restarts if things go bad. From reading the Quartz documentation, it appears that if I store the record id in the trigger's JobDataMap, then when the server restarts, the job will automatically restart with the same record id it was originally started with.
This is where things get tricky: I'm trying to figure out where and when to get the record id so I can store it in the trigger's JobDataMap. I'm thinking I need to implement a TriggerListener and use it to set the record id in the JobDataMap when the triggerFired() callback is called. This will involve a call to the database to get the record id (see the skeleton below).
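For reference, a skeleton of the listener being described, based on Quartz's TriggerListenerSupport; the data-map key and the DB lookup are placeholders, and whether the modified trigger JobDataMap is persisted across restarts depends on the configured job store:

```java
import org.quartz.JobExecutionContext;
import org.quartz.Trigger;
import org.quartz.listeners.TriggerListenerSupport;

public class RecordIdTriggerListener extends TriggerListenerSupport {

    @Override
    public String getName() {
        return "recordIdTriggerListener";
    }

    @Override
    public void triggerFired(Trigger trigger, JobExecutionContext context) {
        // Only look up a record id if the trigger doesn't already carry one,
        // so a restart keeps the id the job was originally started with.
        if (!trigger.getJobDataMap().containsKey("recordId")) {
            long recordId = findNextAvailableRecordId(); // placeholder DB call
            trigger.getJobDataMap().put("recordId", recordId);
        }
    }

    private long findNextAvailableRecordId() {
        // query the database for the next record to process
        return 0L;
    }
}
```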
I'm not really sure if this approach is the correct one, or whether I'm barking up the wrong tree. Can someone with Quartz experience tell me if this is correct, or if there is a better way to configure a job's arguments so that they can be set dynamically and preserved across a restart?
Thanks
Derek

.NET MVC3 BeginTransaction() locks the table

How can I read from the table when another transaction is processing?
I'm using BeginTransaction(), so when another process tries to read from that particular table, it gets a timeout because the previous transaction is holding a lock on the table.
How can I make the table readable while a transaction is ongoing?
Thanks in advance.
You can specify the isolationLevel parameter when beginning the transaction, as shown in the BeginTransaction documentation.
The isolation level you want depends on what you're trying to do, because the value you read will depend on whether the write has finished or not.
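The question is about .NET, but isolation levels work the same way conceptually everywhere; as a language-neutral illustration, here is the JDBC equivalent. READ UNCOMMITTED lets the reader see in-flight (dirty) data instead of blocking on the writer's lock; whether and how that is honored depends on the database:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ReadDuringWrite {

    public static void main(String[] args) throws SQLException {
        // connection details are placeholders
        try (Connection conn = DriverManager.getConnection("jdbc:your-db-url", "user", "pass")) {
            // Read without waiting for the other transaction's locks; the values
            // seen may still be rolled back later (dirty reads).
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT * FROM SOME_TABLE")) {
                while (rs.next()) {
                    // consume rows
                }
            }
        }
    }
}
```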
