Perform Spring Batch StepExecutionListener and writer in a single transaction - spring

I have a file with 100 records. I want to first delete all the existing records in the beforeStep method of a StepExecutionListener and then perform the insertion in the writer.
How can I handle the rollback scenario? That is, if I delete the records and the insertion fails, then I have lost the data.
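For reference, a minimal sketch of the setup being described, assuming a JdbcTemplate-backed listener and a hypothetical person table (note that beforeStep runs outside the chunk transactions that wrap the writer):

    import org.springframework.batch.core.ExitStatus;
    import org.springframework.batch.core.StepExecution;
    import org.springframework.batch.core.StepExecutionListener;
    import org.springframework.jdbc.core.JdbcTemplate;

    public class DeleteAllBeforeStepListener implements StepExecutionListener {

        private final JdbcTemplate jdbcTemplate;

        public DeleteAllBeforeStepListener(JdbcTemplate jdbcTemplate) {
            this.jdbcTemplate = jdbcTemplate;
        }

        @Override
        public void beforeStep(StepExecution stepExecution) {
            // Runs once before the step starts, not inside the chunk
            // transactions that wrap the ItemWriter
            jdbcTemplate.update("DELETE FROM person"); // table name is an assumption
        }

        @Override
        public ExitStatus afterStep(StepExecution stepExecution) {
            return stepExecution.getExitStatus();
        }
    }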

Related

Spring JPA performance the smart way

I have a service that listens to multiple queues and saves the data to a database.
One queue gives me a person.
Now if I code it really simply, I just get one message from the queue at a time and do the following:
Start transaction
Select from person table to check if it exists.
Either update existing or create a new entity
repository.save(entity)
End transaction
The above is clean and robust, but I get a lot of messages and it's not fast enough.
To improve performance I have done this:
Fetch 100 messages from queue
then
Start transaction
Select all persons where id in (...) in one query, using the ids from the incoming persons
Iterate over the messages and for each one check whether it was selected above. If yes, update it; if not, create a new one
Save all changes with batch update/create
End transaction
If it's a simple message, the above is really good and it performs. But if the message, or the logic I have to run when I receive it, is complicated, then the above is not so good, since there is a chance that some of the messages will result in a rollback, and the code becomes hard to read.
Any ideas on how to make it run fast in a smarter way?
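A rough sketch of the batched approach described above, assuming a Spring Data JpaRepository and hypothetical Person/PersonMessage types:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Function;
    import java.util.stream.Collectors;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Transactional;

    @Service
    public class PersonBatchService {

        private final PersonRepository personRepository; // hypothetical JpaRepository<Person, Long>

        public PersonBatchService(PersonRepository personRepository) {
            this.personRepository = personRepository;
        }

        @Transactional
        public void processBatch(List<PersonMessage> messages) {
            List<Long> ids = messages.stream().map(PersonMessage::getId).collect(Collectors.toList());

            // One SELECT ... WHERE id IN (...) instead of one query per message
            Map<Long, Person> existing = personRepository.findAllById(ids).stream()
                    .collect(Collectors.toMap(Person::getId, Function.identity()));

            List<Person> toSave = new ArrayList<>();
            for (PersonMessage msg : messages) {
                // Update the entity if it was selected above, otherwise create a new one
                Person person = existing.getOrDefault(msg.getId(), new Person(msg.getId()));
                person.setName(msg.getName());
                toSave.add(person);
            }

            // Flushed as JDBC batches at commit time when Hibernate JDBC batching is enabled
            personRepository.saveAll(toSave);
        }
    }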
Why do you need to rollback? Can't you just not execute whatever it is that then has to be rolled back?
IMO the smartest solution would be to code this with a single "upsert" statement. Not sure which database you use, but PostgreSQL for example has the ON CONFLICT clause for inserts that can be used to do updates if the row already exists. You could even configure Hibernate to use that on insert by using the @SQLInsert annotation.
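As a hedged illustration of that idea, assuming a PostgreSQL person table; the column order in @SQLInsert has to match Hibernate's parameter binding order (non-id properties first, id last), so verify it against the SQL Hibernate actually generates:

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import org.hibernate.annotations.SQLInsert;

    @Entity
    @SQLInsert(sql = "INSERT INTO person (name, id) VALUES (?, ?) "
            + "ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name")
    public class Person {

        @Id
        private Long id;       // bound last by Hibernate
        private String name;   // non-id properties are bound before the id

        // getters and setters omitted for brevity
    }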

JdbcBatchItemWriterBuilder vs org.springframework.jdbc.core.JdbcTemplate.batchUpdate

I understand JdbcTemplate.batchUpdate is used for sending several records to the database in one communication.
Let's say I have 1000 records to be updated: instead of 1000 communications from the application to the database, the application will send all 1000 records in one request.
Coming to JdbcBatchItemWriterBuilder, it is part of a combination of tasks (steps) in a Spring Batch job.
My question is: if there are 1000 records to be processed (INSERT statements) via JdbcBatchItemWriterBuilder, are all the INSERTs executed in one go, or one after another?
If one after another, wouldn't connecting to the database 1000 times via JdbcBatchItemWriterBuilder cause performance issues? How is that handled?
I would like to understand whether Spring Batch performs better than running 1000 INSERT statements using JdbcTemplate.update.
The JdbcBatchItemWriter uses java.sql.PreparedStatement#addBatch and java.sql.Statement#executeBatch internally (See https://github.com/spring-projects/spring-batch/blob/c4010fbffa6b71cbcfe79d523023251ce73666a4/spring-batch-infrastructure/src/main/java/org/springframework/batch/item/database/JdbcBatchItemWriter.java#L189-L195), so there will be a single batch insert for all items of the chunk.
Moreover, this will be executed in a single transaction as described in the Chunk-oriented Processing section of the reference documentation.
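For illustration, a minimal writer built with JdbcBatchItemWriterBuilder (the person table, its columns, and the Person type are assumptions); all items of a chunk are added to the same PreparedStatement batch and flushed with a single executeBatch call:

    import javax.sql.DataSource;
    import org.springframework.batch.item.database.JdbcBatchItemWriter;
    import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
    import org.springframework.context.annotation.Bean;

    @Bean
    public JdbcBatchItemWriter<Person> personWriter(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Person>()
                .dataSource(dataSource)
                .sql("INSERT INTO person (id, name) VALUES (:id, :name)")
                .beanMapped() // resolve the named parameters from Person getters
                .build();
    }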

Spring data jpa save huge entity list async

I have a huge entity list, say 10000 items. I want to use the CRUD repository to save the list asynchronously, and the API should return without waiting for the save result (because the save can take a long time). Is it possible to use the @Async annotation to do so?
Yes, you can use @Async for committing to the database. However, keep in mind that the commit will then run in a separate thread with its own transaction. You will not see any intermediate results until that transaction finishes.
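A minimal sketch, assuming @EnableAsync is present on a configuration class and PersonRepository is a hypothetical Spring Data repository:

    import java.util.List;
    import org.springframework.scheduling.annotation.Async;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Transactional;

    @Service
    public class AsyncPersonSaver {

        private final PersonRepository repository; // hypothetical Spring Data repository

        public AsyncPersonSaver(PersonRepository repository) {
            this.repository = repository;
        }

        @Async
        @Transactional
        public void saveAllAsync(List<Person> entities) {
            // Runs on a separate thread in its own transaction; the caller returns
            // immediately and won't see the rows until this transaction commits
            repository.saveAll(entities);
        }
    }

Note that the @Async method has to be invoked from another bean; a call from within the same class bypasses Spring's proxy and runs synchronously.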

Customize Spring's JdbcBatchItemWriter to use different SQL for every record

I have a requirement where I will receive a flat file from a vendor and I need to read the records and insert/update/delete them in my DB table. The vendor supplies an action flag indicating whether I need to insert, update, or delete that particular record. The flat file will contain a huge number of records, and I do not want to do manual steps like checking the action flag for every record [by overriding the write() method of ItemWriter and looping over the items list in the chunk], constructing SQL manually, and using JdbcTemplate to do the DB operation for every record.
Can I achieve this using JdbcBatchItemWriter? Is there a way to set the SQL for every record in the chunk so that Spring Batch will do a batch update? How can the ItemPreparedStatementSetter be invoked in that case?
Since your choice is at the record level, take a look at the ClassifierCompositeItemWriter (http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/support/ClassifierCompositeItemWriter.html). That ItemWriter implementation takes a Classifier implementation that it uses to determine which ItemWriter to use. From there, you can configure one ItemWriter that does inserts, one for updates, and one for deletes. Each record will be funneled through to the correct instance and assuming your delegates are JdbcBatchItemWriters, you'll get the same batching you normally do (one batch for inserts, one for updates, and one for deletes).
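For example, a rough configuration sketch, assuming a hypothetical VendorRecord type with an action-flag getter and three pre-configured JdbcBatchItemWriter delegates:

    import org.springframework.batch.item.database.JdbcBatchItemWriter;
    import org.springframework.batch.item.support.ClassifierCompositeItemWriter;
    import org.springframework.context.annotation.Bean;

    @Bean
    public ClassifierCompositeItemWriter<VendorRecord> classifierWriter(
            JdbcBatchItemWriter<VendorRecord> insertWriter,
            JdbcBatchItemWriter<VendorRecord> updateWriter,
            JdbcBatchItemWriter<VendorRecord> deleteWriter) {

        ClassifierCompositeItemWriter<VendorRecord> writer = new ClassifierCompositeItemWriter<>();
        // Route each record to the delegate matching its action flag
        writer.setClassifier(record -> {
            switch (record.getActionFlag()) {
                case "I": return insertWriter;
                case "U": return updateWriter;
                default:  return deleteWriter; // assume anything else is a delete
            }
        });
        return writer;
    }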

Need to commit after each 500 rows deleted

I am invoking delete queries in a loop using JDBC. The number of records deleted by a particular delete query is not consistent; it can be 40, 80, 100, etc. My scenario is that I need to commit after every 500 records deleted.
The way I implemented it is: I accumulate the count of records that will be deleted by each delete query until the count is equal to or greater than 500. As soon as the count reaches 500 or more, I pause the accumulation, invoke the delete queries, and commit, deleting those roughly 500 records in one shot.
Is there any other better approach or JDBC standard way to do this?
You can use two JDBC APIs for deleting: You can run a query like delete ... where PK in (...) and pass an array of 500 primary keys. Or you can use JDBC batch to batch a series of SQL queries and run them as one.
Batch is just a way to package several SQL queries in a single "conversation" with the database. The performance gain is mostly network: Instead of sending hundreds of small packages, you send a few big ones. Parsing and processing on the DB side is the same.
But if you run more complex queries and need to track the number of changed rows, there is no special JDBC support for that: you have to accumulate the count yourself and commit manually when it reaches your threshold.
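A minimal sketch of that accumulation with plain JDBC, assuming a hypothetical person table keyed by a group_id column:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;
    import javax.sql.DataSource;

    public void deleteInChunks(DataSource dataSource, List<Long> groupIds) throws SQLException {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement("DELETE FROM person WHERE group_id = ?")) {
            con.setAutoCommit(false);
            int deletedSinceLastCommit = 0;
            for (long groupId : groupIds) {
                ps.setLong(1, groupId);
                // executeUpdate returns the number of rows removed by this statement
                deletedSinceLastCommit += ps.executeUpdate();
                if (deletedSinceLastCommit >= 500) {
                    con.commit();
                    deletedSinceLastCommit = 0;
                }
            }
            con.commit(); // commit whatever is left under the threshold
        }
    }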
