Fetch 200k records in a single JPA select query within 5 seconds - performance

I want to fetch 200k records in a single JPA select query within 5 seconds. I am selecting one column which is already indexed. Currently it is taking more than 5 minutes. Is it possible to select over 100k records in 5 seconds?

This is not possible with Hibernate or even a plain native query, since hundreds of thousands of objects have to be created on the Java side and the results need to be sent over the network (serialization and deserialization).
You could try the following steps for fine-tuning:
On the DB side, reconsider the index method: the default is a binary tree (B-tree); for pure equality lookups you could set it to the "HASH" method instead.
Use parallel threads to retrieve the results in paginated mode (use native SQL); see the sketch below.
Hope this gives some input for further fine-tuning.
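A minimal sketch of that parallel, paginated retrieval, assuming a JPA EntityManagerFactory and a hypothetical my_table/indexed_col with a numeric id column to page on (the OFFSET ... FETCH syntax is supported by Oracle 12c+, SQL Server 2012+ and PostgreSQL; adapt the SQL to your database):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public class ParallelPagedFetch {

    // Hypothetical table/column names; adjust to your schema.
    private static final String PAGE_SQL =
        "SELECT indexed_col FROM my_table ORDER BY id "
      + "OFFSET ?1 ROWS FETCH NEXT ?2 ROWS ONLY";

    @SuppressWarnings("unchecked")
    public static List<Object> fetchAll(EntityManagerFactory emf,
                                        int totalRows, int pageSize, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Object>>> pages = new ArrayList<>();
            for (int offset = 0; offset < totalRows; offset += pageSize) {
                final int off = offset;
                pages.add(pool.submit(() -> {
                    // EntityManagers are not thread-safe: one per worker task.
                    EntityManager em = emf.createEntityManager();
                    try {
                        return (List<Object>) em.createNativeQuery(PAGE_SQL)
                                .setParameter(1, off)
                                .setParameter(2, pageSize)
                                .getResultList();
                    } finally {
                        em.close();
                    }
                }));
            }
            List<Object> all = new ArrayList<>(totalRows);
            for (Future<List<Object>> page : pages) {
                all.addAll(page.get()); // propagates any worker failure
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }
}

For deep pages, keyset pagination (WHERE id > :lastId ORDER BY id) usually scales better than OFFSET, because the database does not have to skip over all the preceding rows.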

Use this query hint to retrieve lakhs (hundreds of thousands) of records; it sets the JDBC fetch size, so rows are streamed in larger chunks:
query.setHint("org.hibernate.fetchSize", 5000);
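If you do not need all the rows in memory at once, Hibernate can also stream them with ScrollableResults. A sketch, assuming a Hibernate 5-style API and the same hypothetical table/column names as above:

import javax.persistence.EntityManager;

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

public class StreamingFetch {

    // Processes one indexed column without materializing 200k entities at once.
    public static long streamColumn(EntityManager em) {
        Session session = em.unwrap(Session.class); // drop down to the Hibernate API
        ScrollableResults rows = session
                .createNativeQuery("SELECT indexed_col FROM my_table") // hypothetical names
                .setFetchSize(5000)                  // rows per JDBC round trip
                .scroll(ScrollMode.FORWARD_ONLY);
        long count = 0;
        try {
            while (rows.next()) {
                Object value = rows.get(0);          // the single selected column
                count++;                             // replace with real processing
            }
        } finally {
            rows.close();
        }
        return count;
    }
}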

Related

Spring JPA - Update - SET - Huge Columns - Performance

I came across this link while searching: Update single field using spring data jpa
In my application, one table is displayed in the front-end. It has 100 columns, of which the user changes approximately 5 to 10 at most.
However, the front-end sends all the values, and the back-end update query has all 100 columns in the SET.
Is this a best practice? Some say a SET with all the columns has no impact because JPA (or the DB) does a delete and insert internally. Is this true?
What should the best practice be, and does having all columns in the SET affect performance in general?
Thanks
If the user has changed just a few columns and only one row is updated, then no, the performance would not be affected much. It would be affected, but in most cases optimizing it is not necessary unless you're handling a huge number of updates. And since you're using JPA, I would guess you do not actually build the update yourself but use an entity on which you update the affected fields? Then JPA chooses how to actually do the update (most probably sending all fields of the entity in the UPDATE).
If it were 100 rows and the user changed data in 5-10 of them, then it would be better to pass only those 5-10 rows to the database update.
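If you are on Hibernate and want the generated UPDATE to contain only the columns that actually changed, the @DynamicUpdate annotation does exactly that, at the cost of Hibernate building the SQL per update instead of reusing one cached statement. A sketch with a hypothetical entity:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.DynamicUpdate;

@Entity
@DynamicUpdate // UPDATE statements list only the dirty columns
public class ClientRecord { // hypothetical entity standing in for the 100-column table

    @Id
    private Long id;

    @Column(length = 100)
    private String name;

    // ... the remaining columns, plus getters and setters ...
}

Note this only trims the SQL. If the front-end sends stale values for fields the user did not touch, loading the entity first and copying over only the changed fields is still the safer pattern.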

What effect does the number of records in a particular table (in SQL Server) have on LINQ query response time in C# MVC?

I did some googling about my subject title but didn't find a useful answer. Most questions were about the effect of the number of a table's columns on query performance, while I need to know the effect of the number of a table's rows on LINQ query response time in a C# MVC project. I have a web MVC project in which I try to get alarms from the server using ajax calls and show them in a web grid on the client side. Each ajax call is performed every 60 seconds in a loop created with the setTimeout method.

The number of rows in the alarm table (in a SQL Server database) gradually increases, and after a week it reaches thousands of rows. At first, when launching the project, I can see in the browser's DevTools (Chrome) that each ajax call takes about 1 second or so. But this time gradually increases every day, and after a week each successful ajax call takes more than 2 minutes. This causes about 5 ajax calls to always be in the pending queue. I am sure there is no memory leak in either the client (jQuery) or server (C#) side code, so the only culprit I suspect is the response time of the SELECT query performed on the alarm table. I appreciate any advice.
Check the query that is executed in the database. There are two options:
1. Your LINQ query fetches all the data from the database and processes it locally. In this case the number of rows in the table matters a great deal. You need to fix your LINQ query so that it fetches only the relevant rows from the database; then you may still hit option 2.
2. Your LINQ query fetches only the relevant rows from the database, but there are no relevant indexes on your table, so each query scans all the data in it.
However, with only a few thousand rows in your table, I doubt a scan would take 2 minutes, so option 1 is the more likely reason for the slowdown.

Querying from Azure DB using LINQ takes a lot of time

I am using an Azure DB and my table has over 120000 records.
With paging of 10 records per page I am fetching data into an IQueryable, but fetching just 10 records takes around 2 minutes. The query has no joins and just 2 filters. With Azure Search I can get all the records within 3 seconds.
Please suggest how to speed up my LINQ query, as Azure Search is costly.
Based on the query in the comment to your question, it looks like you are reading the entire DB table into the memory of your app. The data is potentially transferred from one side of the data centre to the other, which causes the performance issue.
unitofwork.Repository<EntityName>().GetQueriable().ToList().Skip(1).Take(10);
Without seeing the rest of the code, I'm just guessing your LINQ query should be something like this:
unitofwork.Repository<EntityName>().GetQueriable().Skip(1).Take(10).ToList();
Skip and Take are then executed on the DB server, while .ToList() at the end materializes the entities.

JPA getResultList much slower than SQL query

I have an Oracle table with about 5 million records and a quite complex query which returns about 5000 of them in less than 5 seconds with a database tool like Toad.
However, when I run the same query via the EntityManager (EclipseLink), it runs for minutes...
I'm probably being too naive in the implementation.
I do:
Query query = em.createNativeQuery(complexQueryString, Myspecific.class);
... setParameter...
List result = query.getResultList();
The complexQueryString starts with a "SELECT *".
What kinds of optimization do I have?
Maybe one is to select only the fields I really need later. Some explanation would be great.
I had a similar problem (I tried to read 800000 records with 8 columns in less than one second) and the best solution was to fall back to JDBC. The ResultSet was created and read a good 10 times faster than with JPA, even when using a native query.
How to use JDBC: normally, in Java EE servers a JDBC DataSource can be injected with @Resource.
An explanation: I think the OR mappers try to create and cache objects so that changes can easily be detected later. This is very substantial overhead that you won't notice as long as you are only working with single entities.
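A minimal sketch of that JDBC fallback; the JNDI name and the table/column names are assumptions to adapt:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Resource;
import javax.sql.DataSource;

public class FastReadDao {

    // Injected by the container; the JNDI name is a placeholder.
    @Resource(lookup = "java:comp/env/jdbc/myDataSource")
    private DataSource dataSource;

    public List<String> readColumn() throws SQLException {
        List<String> values = new ArrayList<>();
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT some_col FROM my_table")) { // hypothetical names
            ps.setFetchSize(5000); // rows fetched per network round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    values.add(rs.getString(1)); // plain values, no entity bookkeeping
                }
            }
        }
        return values;
    }
}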
Setting the JDBC fetch size may help a bit. It tells the JDBC driver how many rows to return in one chunk. Note that the JPA Query interface has no setFetchSize method; with EclipseLink you set it through a query hint before calling getResultList():
query.setHint("eclipselink.jdbc.fetch-size", 5000);
query.getResultList();

Order by making the application very slow using Oracle

In my application I need to generate a report of the transaction history of all clients. I am using Oracle 12c. I have 300k clients, and the client details and transaction history tables are related to the client table. I have written a query to show the transaction history per month. It returns nearly 20 million records.
SELECT C.CLIENT_ID, CD.CLIENT_NAME, ......
FROM CLIENT C, CLIENT_DETAILS CD, TRANSACTION_HISTORY TH
--Condition part
ORDER BY C.CLIENT_ID
These three tables have the right indexes, which work fine. But when the data is fetched with the ORDER BY to show the customers in order, the batch process takes 8 hours to execute.
I have analysed the cost of the query. The cost is 80085, but when I remove the ORDER BY it drops to 200, so I have removed the ORDER BY for now. Still, I need to show the customers in order, and I cannot use a LIMIT. Is there any way to overcome this?
You can try indexing CLIENT_ID in the table; if Oracle can read the rows through an index that matches the ORDER BY, it can return them already sorted and skip the expensive sort step.
You can use this link for reference: link
Hope this helps.
