I'm a little confused about how the server.connection-timeout property works in a Spring Boot REST API project.
I have a Spring Boot REST API project with a delete REST API that basically performs a couple of delete operations on database tables; for example, this delete API deletes rows from 3 tables as follows.
The delete API gets a "customer Id" as input and executes the following:
Delete all records matching the customer Id in Table A (delete call to an external DB)
Delete all records matching the customer Id in Table B (delete call to an external DB)
Delete all records matching the customer Id in Table C (delete call to an external DB)
My question is: if I set "server.connection-timeout" to 5 seconds, what does that actually mean?
I have two assumptions:
The delete REST API will time out after 5 seconds, meaning all 3 external DB calls have to complete within 5 seconds, otherwise the REST API times out.
Each external DB call gets its own 5-second timeout, so 15 seconds in total.
In the worst case, if each of the 3 external DB calls takes 4 seconds, the delete API will take 12 seconds to respond - is this valid?
I think you are confused. server.connection-timeout is the time that connectors wait for another HTTP request before closing the connection.
It has nothing to do with how long it takes to complete a request.
In your case, if server.connection-timeout is 5 seconds, it will not affect the #1, #2 or #3 deletes you mentioned.
In simple terms, connection-timeout does not apply to long-running requests. It applies to the initial connection, when the server waits for the client to request something.
Default: the connector’s container-specific default is used. Use a value of -1 to indicate infinite timeout.
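For reference, here is a minimal sketch of where that value actually ends up, assuming an embedded Tomcat on Spring Boot 2.x: the property simply configures the connector's connection timeout, which could equally be set programmatically (the 5000 ms below mirrors server.connection-timeout=5s; none of this bounds how long your three delete calls may run).

import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ConnectorTimeoutConfig {

    // How long the connector waits for the client to send a request on an open
    // connection - not how long a request is allowed to execute.
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> connectionTimeoutCustomizer() {
        return factory -> factory.addConnectorCustomizers(
                connector -> connector.setProperty("connectionTimeout", "5000"));
    }
}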
We use ABP with an MSSQL database hosted on Azure.
As part of cost optimization we need to make as few requests to the DB as possible.
During investigation I found out that ABP makes around 50 million requests per month to the table AbpUserOrganizationUnits, and we don't use this table at all.
I would like to disable all calls to this table.
Making 50 million requests to the database just to receive an empty set is not what a normal product should do.
I would like to know if there is a way to stop these requests, redirect them to a Redis cache, or even stop them inside the API.
I have the following scenario. In a database A I have a table with a huge number of records (several million); these records increase very rapidly day by day (as many as 100,000 records per day).
I need to fetch these records, check whether they are valid, and import them into my own database. On the first run I should take all the stored records; afterwards I can take only the newly saved records. I have a timestamp column I can use for this filter, but I can't figure out how to create a JpaPagingItemReader or a JdbcPagingItemReader and pass it a dynamic filter based on the date (e.g. select all records where the timestamp is greater than the job's last execution date).
I'm using Spring Boot, Spring Data JPA and Spring Batch. I'm configuring the Job instance with chunks of size 1000. I can also use a paging query (is it useful if I use chunks?). A sketch of what I'm aiming for is below.
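Something along these lines is what I have in mind, as a minimal sketch only: a step-scoped reader that binds the last execution date as a job parameter. SourceRecord, the table and the column names are placeholders, and I'm assuming the launcher passes lastExecutionDate as a Date job parameter.

import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.BeanPropertyRowMapper;

@Configuration
public class IncrementalReaderConfig {

    // Step-scoped so the lastExecutionDate job parameter is resolved at runtime.
    @Bean
    @StepScope
    public JdbcPagingItemReader<SourceRecord> sourceReader(
            DataSource dataSource,
            @Value("#{jobParameters['lastExecutionDate']}") Date lastExecutionDate) {

        Map<String, Object> params = new HashMap<>();
        params.put("lastDate", lastExecutionDate);

        return new JdbcPagingItemReaderBuilder<SourceRecord>()
                .name("sourceReader")
                .dataSource(dataSource)
                .selectClause("SELECT id, payload, created_at")   // placeholder columns
                .fromClause("FROM source_table")                  // placeholder table
                .whereClause("WHERE created_at > :lastDate")      // only records newer than the last run
                .sortKeys(Collections.singletonMap("id", Order.ASCENDING))
                .parameterValues(params)
                .pageSize(1000)                                   // matches the chunk size
                .rowMapper(new BeanPropertyRowMapper<>(SourceRecord.class)) // SourceRecord is a placeholder POJO
                .build();
    }
}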
I have a microservice (let's call it MSA) with all the business logic needed to check whether records are valid and to insert the valid records.
I have another service on a separate server; this service contains all the batch operations (let's call it MSB).
I'm wondering what the best approach to the batch is. I was thinking of these solutions:
In MSB I duplicate all the entities, repositories and services I use in MSA. Then in MSB I can run all the needed queries.
In MSA I create all the REST APIs needed. The ItemProcessor of MSB will call these REST APIs to perform checks on the items to be processed, and finally in the ItemWriter I'll call the REST API to save the data.
The first solution would avoid the HTTP calls, but it forces me to duplicate all repositories and services between the 2 microservices. Sadly, I can't use a common project in which to place all the shared objects.
The second solution, on the other hand, would avoid the code duplication, but it would imply a lot of HTTP calls (above all in the ItemProcessor, to check whether an item is valid or not); a rough sketch of that processor is below.
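This is roughly how I imagine the MSB processor for the second solution; it is only a sketch, and the MSA endpoint URL and the SourceRecord type are placeholders of mine, not an existing API.

import org.springframework.batch.item.ItemProcessor;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class RemoteValidationProcessor implements ItemProcessor<SourceRecord, SourceRecord> {

    private final RestTemplate restTemplate;

    public RemoteValidationProcessor(RestTemplateBuilder builder) {
        this.restTemplate = builder.build();
    }

    @Override
    public SourceRecord process(SourceRecord item) {
        // Hypothetical MSA endpoint; returning null tells Spring Batch to filter
        // the item out of the chunk, so it never reaches the ItemWriter.
        Boolean valid = restTemplate.postForObject(
                "http://msa-host/api/records/validate", item, Boolean.class);
        return Boolean.TRUE.equals(valid) ? item : null;
    }
}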
Do you have any other suggestion? Is there a better approach?
Thank you
Angelo
We are implementing a scenario of rate limiting the total number of JSON records requested in a month to 10,000 for an API.
We are storing the total count of records in a table against client_id and a timestamp (which is the primary key).
Per request, we fetch the record from the table for that client with a timestamp within the current month.
From this record we get the current count, increment it by the number of records in the current request, and update the DB.
Using Spring transactions, the pseudocode is as below:
@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.REPEATABLE_READ)
public void updateLimitData(String clientId, long currentRecordCount) {
    // step 1: compute the month boundaries
    Timestamp startOfMonthTimestamp = getStartOfMonth();
    Timestamp endOfMonthTimestamp = getEndOfMonth();

    // step 2: read the current count for this client and month from the DB
    LimitDetails latestLimitDetails =
            fetchFromDB(startOfMonthTimestamp, endOfMonthTimestamp, clientId);
    latestLimitDetails.setCount(latestLimitDetails.getCount() + currentRecordCount);

    // step 3: save the incremented count back to the DB
    saveToDB(latestLimitDetails);
}
We want to make sure that when multiple threads access the updateLimitData() method, each thread gets the up-to-date count for a clientId for the month and does not overwrite the count incorrectly.
In the above scenario, if multiple threads access updateLimitData() and reach step 3, the first thread will update the count in the DB, and then the second thread will update the count in the DB without necessarily having the latest value.
I understand that with Isolation.REPEATABLE_READ a write lock is placed on the rows only when the update is issued at step 3 (by which time the other thread already has stale data). How can I ensure that threads always get the latest count from the table in a multithreaded scenario?
One solution that came to my mind is synchronizing this block, but that will not work well in a multi-server scenario.
Please provide a solution.
A transaction would not help you unless you lock the table/row whilst doing this operation (don't do that as it will affect performance).
You can migrate this to the database, doing this increment within the database using a stored procedure or function call. This will ensure ACID and transactional safety as this is built into the database.
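For example, if you stay with Spring Data JPA, one way to push the increment into the database is a single atomic UPDATE instead of the read-modify-write above. This is only a sketch: the LimitDetails entity and the clientId, periodStart and recordCount fields are placeholder names, not your actual schema.

import java.sql.Timestamp;

import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.transaction.annotation.Transactional;

public interface LimitDetailsRepository extends CrudRepository<LimitDetails, Long> {

    // The increment happens inside the database in one statement, so concurrent
    // callers cannot overwrite each other's counts; the database serializes the updates.
    @Modifying
    @Transactional
    @Query("UPDATE LimitDetails l SET l.recordCount = l.recordCount + :delta "
         + "WHERE l.clientId = :clientId AND l.periodStart = :periodStart")
    int incrementCount(@Param("clientId") String clientId,
                       @Param("periodStart") Timestamp periodStart,
                       @Param("delta") long delta);
}

The return value (rows updated) tells you whether a row for that client and month already existed; if it is 0, you would insert the initial row for the month.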
I recommend doing this using standard Spring Actuator to produce a count of API calls; however, this will mean rewriting your service to use the Actuator endpoint and not the database. You can link this to your gateway/firewall/load balancer to deny access to the API once the quota is reached. This means that your API endpoint stays pure and this logic is removed from your API call. All new APIs you develop will automatically get this functionality.
I'm working on an app where I have some entities in the database with a column representing the date until which that particular entity is available for certain actions. When that date passes, I need to change the entity's state, meaning I update a column representing its state.
What I'm doing so far: whenever I ask the database for those entities in order to do something with them, I first check whether they are expired and, if they are, I update them. I don't particularly like this approach, since it means I will have a bunch of records in the database in the wrong state just because I haven't queried them yet. Another approach would be to have a periodic task that runs over those records and updates them as necessary. I don't like that either, since again I would have records in an inconsistent state between runs, and in that case the first approach seems more reasonable.
Is there another way of doing this? Am I missing something? I should mention that I use Spring Boot + Hibernate for my application, and the underlying DB is PostgreSQL. Is there any technology-specific trick I can use to achieve what I want?
In the database there is no trigger type for "expired". If you have something that expires and you need to do something with it, there are two solutions (you have written about them): do some extra work on the expired rows before you use the data, or run some cron/task (it can be at the DB level or on the server side).
I recommend the cron approach. Here is the explanation:
Do something with the expired rows before you get the data
(update before select)
+: You update expired data before you need it. But questions arise: do you update only the rows you requested, or everything that has expired? Updating everything may be time-consuming when you need just 2 records but end up updating 2000 records unrelated to your working dataset.
-: It takes a long time to update all records; if the database is shared and accessed by more than just your application, the expiration logic is not executed for the other clients; you need to control every entry point and decide where you should handle expired rows and where you shouldn't; if expiry is measured in minutes or seconds, then even after you run the logic for expired rows, new records may expire a second later; also, if you need to change the workflow logic for handling expired data, you want to keep it in one place (the cron job), whereas with update-before-select you have to update the duplicated logic everywhere.
CRON/TASK
-: You have to spend time configuring it, but only once, 30-60 minutes max :)
+: It is executed in the background; if your DB is used by more than just your application, the expired-data logic is still applied; you don't have to check (or remember, or explain to a new employee...) whether there is stale data in your Java code before selecting something; you cleanly split the logic that cares about stale data from the normal queries to the DB.
You can execute 'select for update' in the cron job, and even if you run a select from the server side during the update, your query will wait until the stale-data logic completes, so the select returns up-to-date data.
For Spring: the Spring scheduling documentation, and a simple example: spring-quartz-schedule.
For the DB level: the PostgreSQL job scheduler.
A scheduler/cron is best practice for such things.
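As a minimal Spring sketch of such a task (the table and column names my_entity, state and available_until are assumptions, and @EnableScheduling must be present on a configuration class):

import java.time.Instant;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class ExpirationJob {

    private final JdbcTemplate jdbcTemplate;

    public ExpirationJob(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Runs every minute in the background and flips the state of anything whose
    // availability date has passed, regardless of whether it was ever queried.
    @Scheduled(fixedDelay = 60_000)
    @Transactional
    public void expireOverdueEntities() {
        jdbcTemplate.update(
            "UPDATE my_entity SET state = 'EXPIRED' "
          + "WHERE state = 'ACTIVE' AND available_until < ?",
            java.sql.Timestamp.from(Instant.now()));
    }
}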
I have 10 large files in production, and we need to read each line from each file, convert the comma-separated values into a value object, send it to a JMS queue, and also insert it into 3 different tables in the database.
Across the 10 files we have 33 million lines. We are using Spring Batch (MultiResourceItemReader) to read each line and a writer to write it to the DB and also send it to JMS. It roughly takes 25 hours to complete everything.
Even though we have 10 systems in production, presently we use only one system to run this job (I am new to Spring Batch and not aware of how Spring supports load balancing).
Since we have only one system, we configured the data source to connect to the DB with the maximum number of connections specified as 25.
To improve the performance we thought of using Spring's multi-threading support and started with 5 threads. We could see the performance improvement: everything completed in 10 hours.
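The step configuration is roughly along these lines; it is only a sketch, the LineItem type, chunk size and writer bean are placeholders, and I understand the reader may need to be wrapped to be thread-safe for this to be correct.

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.support.CompositeItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

@Configuration
public class MultiThreadedStepConfig {

    // Chunk-oriented step that processes lines with 5 concurrent threads.
    @Bean
    public Step importStep(StepBuilderFactory stepBuilderFactory,
                           MultiResourceItemReader<LineItem> reader,
                           CompositeItemWriter<LineItem> dbAndJmsWriter) {
        SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor("import-");
        executor.setConcurrencyLimit(5);

        return stepBuilderFactory.get("importStep")
                .<LineItem, LineItem>chunk(1000)
                .reader(reader)
                .writer(dbAndJmsWriter)      // writes to the DB and publishes to JMS
                .taskExecutor(executor)
                .throttleLimit(5)            // cap at 5 concurrent chunk workers
                .build();
    }
}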
Here I have the questions below:
1) If I process using 5 threads, we will publish a huge amount of data onto the JMS queue. Will the queue support that volume of data? Note that we have 10 systems in production to read JMS messages from the queue.
2) Is using 5 threads and 1 production system a good approach? Or, instead of Spring Batch inserting the data into the DB, I could create a REST service: Spring Batch calls the REST API to insert the data into the DB, and that service also puts the data onto the JMS queue (again, if Spring Batch processes the file and uses REST to insert the data into the DB, I will read 4 or 5 lines per second and call the REST API; note that we have 10 production systems). If I use the REST API approach, will my system cope (REST can handle a huge number of requests behind a load balancer, and JMS can handle a huge number of messages), or is using threads in the Spring Batch app on 1 production system the better approach?
Different JMS providers are going to have different limits, but in general messaging can easily handle millions of rows in a small period of time.
Messaging is going to be faster than inserting directly into the database because a message has very little data to manage (other than JMS properties); without the overhead of a complete RDBMS or NoSQL database or whatever, messaging outperforms them all.
Assuming the individual lines can be processed in any order, sending all the data to the same queue and having n consumers working the back end is a sound solution.
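For example, a rough sketch of that pattern with Spring's JMS support (the queue name import.lines and the LineItem payload type are placeholders, and the payload would need to be serializable or converted to JSON):

import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Component;

@Component
public class LineMessaging {

    private final JmsTemplate jmsTemplate;

    public LineMessaging(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
    }

    // Producer side, called from the batch writer: every parsed line goes to one queue.
    public void publish(LineItem line) {
        jmsTemplate.convertAndSend("import.lines", line);
    }

    // Consumer side, deployed on each of the back-end systems; "5-10" lets the
    // listener container scale between 5 and 10 concurrent consumers per instance.
    @JmsListener(destination = "import.lines", concurrency = "5-10")
    public void consume(LineItem line) {
        // validate the line and insert it into the 3 tables here
    }
}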
Your big bottleneck, however, is getting the data into the database. If the destination table(s) have any/many keys or indices on them, there is going to be serious contention, because each insert/update/delete needs to maintain the indices; so even though you have n different consumers trying to update the database, they're going to trample on each other as the transactions are completed.
One solution I've seen is disabling all database constraints before you start and re-enabling them at the end; hopefully, if things worked, the data is consistent and usable. Of course, the risk is that there was bad data you didn't catch, and now you need to clean up or reattempt the load.
A better solution might be to transform the files into a single file that can be bulk loaded into the database using a platform-specific tool. These tools often disable indexes, constraint checking, and anything else that's going to slow things down - oftentimes bypassing SQL itself - to get performance.