Need thoughts on where to implement unique number generation logic in our distributed environment

We have a requirement to create a fixed 12-digit unique number for every transaction we process successfully in our current application. The application is a set of RESTful services and uses an Oracle DB as its data store.
We already have the logic for producing a unique 12-digit number, but we are trying to understand where this logic should live so that every transaction executed in this environment gets a reference to its unique ID.
We figured that keeping part of the 12 digits in a DB sequence could be an option, but that will not work in the near future because we will have multiple databases.

How about a Sequencer service that is responsible for generating these unique numbers? When a new transaction is created, the entity that manages the transaction can request a unique number from this service and associate it with the transaction.
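As a rough illustration of what such a Sequencer could look like, here is a minimal in-memory sketch. It does not reproduce the poster's own 12-digit logic, which is not shown; the layout (a 2-digit node ID followed by a 10-digit counter), the class name `Sequencer`, and the `lastIssued` parameter are all assumptions made for the example. A real service would also have to persist the counter so that it survives restarts.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sequencer: each ID is <2-digit node id><10-digit counter>.
// The layout is an assumption for illustration, not the poster's scheme.
final class Sequencer {
    private static final long COUNTER_LIMIT = 10_000_000_000L; // counter must fit in 10 digits

    private final int nodeId;         // assumed to be 10..99 so the ID never starts with 0
    private final AtomicLong counter; // last value handed out by this node

    // lastIssued would normally be loaded from durable storage on startup.
    Sequencer(int nodeId, long lastIssued) {
        if (nodeId < 10 || nodeId > 99) {
            throw new IllegalArgumentException("nodeId must be between 10 and 99");
        }
        if (lastIssued < 0 || lastIssued >= COUNTER_LIMIT - 1) {
            throw new IllegalArgumentException("lastIssued out of range");
        }
        this.nodeId = nodeId;
        this.counter = new AtomicLong(lastIssued);
    }

    // Thread-safe: incrementAndGet gives every caller a distinct value.
    String next() {
        long value = counter.incrementAndGet();
        if (value >= COUNTER_LIMIT) {
            throw new IllegalStateException("counter exhausted for node " + nodeId);
        }
        return String.format(Locale.ROOT, "%02d%010d", nodeId, value);
    }
}
```

Because each node owns a distinct prefix, several Sequencer instances can run side by side without coordinating on every request, which is one way around the multiple-database concern raised in the question.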

Related

Spring batch fetch huge amount of data from DB-A and store them in DB-B

I have the following scenario. In database A I have a table with a huge number of records (several million); these records grow very rapidly day by day (as many as 100,000 records per day).
I need to fetch these records, check whether they are valid, and import them into my own database. On the first run I should take all the stored records; after that I can take only the newly saved records. I have a timestamp column I can use for this filter, but I can't figure out how to create a JpaPagingItemReader or a JdbcPagingItemReader and pass it a dynamic filter based on the date (e.g. select all records where the timestamp is greater than the job's last execution date).
I'm using Spring Boot, Spring Data JPA and Spring Batch. I'm configuring the Job instance with chunks of size 1000. I can also use a paging query (is that useful if I use chunks?).
I have a microservice (let's call it MSA) with all the business logic needed to check whether records are valid and to insert the valid records.
I have another service on a separate server. This service contains all the batch operations (let's call it MSB).
I'm wondering what the best approach to the batch is. I was thinking of these solutions:
1. In MSB I duplicate all the entities, repositories and services I use in MSA. Then in MSB I can make all the needed queries.
2. In MSA I create all the REST APIs needed. The ItemProcessor of MSB will call these REST APIs to perform checks on the items to be processed, and finally in the ItemWriter I'll call the REST API for saving data.
The first solution would avoid the HTTP calls, but it forces me to duplicate all repositories and services between the two microservices. Sadly I can't use a common project in which to place all the common objects.
The second solution, on the other hand, would avoid the code duplication, but it would imply a lot of HTTP calls (above all in the ItemProcessor, to check whether an item is valid or not).
Do you have any other suggestion? Is there a better approach?
Thank you
Angelo

scaled microservices instances needs to update 1

I have a unique problem and am trying to see what the best implementation for it is.
I have a table with half a million rows. Each row represents a business entity. I need to fetch information about this entity from the internet and update the table asynchronously (this process takes about 2 to 3 minutes).
I cannot get all these rows updated efficiently with one microservice instance, so I am planning to scale up to multiple instances. Each microservice instance is an async daemon that fetches one business entity at a time, processes the data, and finally writes the data back to the table.
Here is my problem: with multiple instances, how do I ensure that no two microservice instances work on the same business entity (same row) during the update process? I want to implement an optimal solution, preferably without having to maintain any state in the application layer.
You have to use an external system (database or cache) to save information about each instance.
Example: ShedLock. It creates a table or document in the database where it stores information about the current locks.
I would suggest using a worker queue, which looks like a perfect fit for your problem. Just load the whole data, or the IDs of the data, into the queue once. Then let the consumers consume them.
You can see a clear explanation here:
https://www.rabbitmq.com/tutorials/tutorial-two-python.html
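The worker-queue idea can be sketched in a single JVM, with a `BlockingQueue` standing in for the external broker and threads standing in for the microservice instances. This is only an illustration of the hand-off semantics, not RabbitMQ client code; the class name `WorkQueueDemo` and the `run` method are made up for the example, and the actual "fetch from the internet and update the row" step is replaced by a counter.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-process stand-in for a broker-backed work queue (hypothetical names).
final class WorkQueueDemo {

    // Loads all entity IDs into the queue once, lets the workers drain it,
    // and returns how many times each ID was processed.
    static Map<Long, Integer> run(List<Long> entityIds, int workerCount) {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>(entityIds);
        Map<Long, Integer> timesProcessed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);

        for (int i = 0; i < workerCount; i++) {
            pool.submit(() -> {
                Long id;
                // poll() removes the head atomically, so an ID goes to exactly one worker.
                while ((id = queue.poll()) != null) {
                    // Placeholder for the real work on this entity.
                    timesProcessed.merge(id, 1, Integer::sum);
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return timesProcessed;
    }
}
```

With a real broker the same property comes from the queue delivering each message to one consumer at a time, so the application instances themselves do not need to track who owns which row.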

Implementing static shared counter in microservice architecture

I have a use case where I want to record data in rows and display it to the user.
Multiple users can add these records, and they have to be displayed in order of insertion AND - MOST IMPORTANTLY - with a sequence number starting from 1.
I have a Spring Boot microservice architecture at the backend, which obviously means I cannot hold state in my Boot application, as I'm going to have multiple running instances.
Another method was to fetch all existing records in the DB, count them, increment the count by 1 and use that as my sequence. I would need to do this every time I do an insert.
But the problem with the second approach is parallel requests, which could result in the same sequence number being given to two records.
A third approach is to configure the counter in the DB, but since I am using Cosmos DB, apparently that is also not an option.
Any suggestions as to how I can implement a static, shared counter?

Parallel processing of records from database table

I have a relational table that is being populated by an application. There is a column named o_number which can be used to group the records.
I have another application that basically runs a Spring scheduler. This application is deployed on multiple servers. I want to understand whether there is a way to make sure that each of the scheduler instances processes a unique group of records in parallel. If a set of records is being processed by one server, it should not be picked up by another one. Also, in order to scale, we would want to increase the number of instances of the scheduler application.
Thanks
Anup
This is a general question, so here are my general 2 cents on the matter.
You create a new layer that manages the requests from your application instances to the database. So you will probably be building a new project running on the same server as the database (or on some other server). The application instances will talk to that managing layer instead of to the database directly.
The manager will keep track of which records have already been requested, and hence fetch only records that are yet to be processed upon each new request.
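One way to picture that manager is as a component that hands out each `o_number` group at most once. The sketch below keeps the bookkeeping in memory purely for illustration; the class name `GroupManager` and the method `claimNext` are hypothetical, and a real manager would sit in front of the database and persist its claims.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical manager that gives each scheduler instance a group nobody else has.
final class GroupManager {
    private final List<String> groups;                                  // distinct o_number values to process
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();  // groups already handed out

    GroupManager(List<String> groups) {
        this.groups = new ArrayList<>(groups);
    }

    // Returns the next unclaimed group, or empty when everything has been handed out.
    // Set.add is atomic here and returns true only for the first caller, so two
    // instances asking at the same time can never receive the same group.
    Optional<String> claimNext() {
        for (String group : groups) {
            if (claimed.add(group)) {
                return Optional.of(group);
            }
        }
        return Optional.empty();
    }
}
```

Adding more scheduler instances then only means more callers of `claimNext`; the linear scan is fine for a sketch but would be replaced by a queue or an indexed query at scale.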

Spring Batch Framework

I am not able to decide whether the Spring Batch framework is applicable to the requirement below. I need expert input on this.
Following is my requirement:
Read multiple Oracle tables (at least 10 tables, including both transaction and master tables), do complex calculations based on the business rules, and insert / update / delete records in transaction tables.
I have identified the following two designs:
Design # 1:
ItemReader: Select eligible records from Key transaction table.
ItemProcessor: Fetch additional details from the DB using the key available in the record retrieved by the ItemReader. (It would require multiple DB transactions.)
Do the validation and computation and add the details to be written to DB as objects in a list.
ItemWriter: Write the details available in objects using CustomItemWriter(insert / update / delete operation)
With this design, we can achieve parallel processing but increase the number of DB transactions.
Design # 2:
Step # 1
ItemReader: Use Composite Item Reader (Group of ItemReaders) to read all the required tables.
ItemWriter: Save the result sets as lists of Objects (One list per table) in execution context
Step # 2
ItemReader: Retrieve lists of Objects available in execution context and group them into one list of objects based on the business processing so that processor can process them.
ItemProcessor:
Process the chunk of Objects returned by ItemReader.
Do the validation and computation and add the details to be written to DB as objects in a list.
ItemWriter: Write the details available in objects using CustomItemWriter(insert / update / delete operation)
With this design, we can REDUCE the number of DB transactions, but we are delaying the processing until all table records are retrieved and stored in the execution context, i.e. we are not using the parallel processing provided by Spring Batch.
Please advise whether the above is feasible using Spring Batch or whether we need to use a conventional Java program.
The good news is that your problem description matches a very common use case for Spring Batch. The bad news is that the problem description is too generic to allow much meaningful input about the specific design beyond the comments already provided.
Spring Batch brings facilities similar to JCL and ISPF from the mainframe world into the Java context.
Spring Batch provides a framework for organizing and managing the boundaries of your process. It is a natural fit for a lot of ETL and big data operations, but it is not the only way to write these processes.
If your process can be broken down into discrete steps, then Spring Batch is a good choice for you.
The ItemReader should (logically) be an iterator returning a single object representing the start of one logical unit of work (LUW). The framework collects the LUW objects into chunks of the size you configure, runs each one through the processor, and passes the processed results to the writer. In the context of an RDBMS-centric process, the commit happens at the end of the writer's operation.
What happens in each of those pieces of the step is 100% whatever you need (plain old Java). The point of the framework is to free you from the complexity and enable you to solve the problem.
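The read, process, write cycle described here can be sketched in plain Java to make the chunk boundary concrete. This is not Spring Batch's API or its actual implementation, only a simplified model of the flow; `ChunkLoop` and `run` are names invented for the example, and each call to the writer stands in for one commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Simplified model of a chunk-oriented step (hypothetical, not the framework's code).
final class ChunkLoop {

    // Reads items one at a time, processes each, and hands the writer one list per chunk.
    // Returns the number of chunks written, i.e. the number of simulated commits.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunksWritten = 0;
        List<O> buffer = new ArrayList<>(chunkSize);

        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // full chunk: write, then "commit"
                buffer.clear();
                chunksWritten++;
            }
        }
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer));     // final, partially filled chunk
            chunksWritten++;
        }
        return chunksWritten;
    }
}
```

For instance, five input items with a chunk size of 2 reach the writer as three lists of sizes 2, 2 and 1. The real framework adds restartability, skip and retry handling, and transaction management around this basic loop.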
From my understanding, Spring Batch has nothing to do with database batch operations (or at least the word 'batch' has a different meaning in these two contexts). Spring Batch is used to create processes with multiple steps, and gives you the chance to restart a process if one of the steps fails (without repeating the steps that have already finished).

Resources