ORDER BY making the application very slow using Oracle - oracle

In my application I need to generate a report of the transaction history for all clients. I am using Oracle 12c. I have 300k clients. The client table is related to the client details and transaction history tables. I have written a query to show the transaction history per month. It returns nearly 20 million records.
SELECT C.CLIENT_ID, CD.CLIENT_NAME, ...... FROM CLIENT C, CLIENT_DETAILS CD,
TRANSACTION_HISTORY TH
--Condition part
ORDER BY C.CLIENT_ID
These three tables have the right indexes, and they work fine. But when the data is fetched with ORDER BY to show the customers in order, this query makes the batch process take 8 hours to execute.
I have analysed the cost of the query: the cost is 80085. But when I remove the ORDER BY from the query, the cost drops to 200. So I have removed the ORDER BY for now, but I need to show the customers in order and I cannot use a limit. Is there any way to overcome this?

You can try indexing the client ID in the table; an index on the ORDER BY column helps the database fetch the data already in that order instead of sorting it afterwards. You can use this link for reference: link
Hope this helps.
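A minimal sketch of that suggestion, assuming CLIENT_ID is not already covered by an index (the table and column names are taken from the question; the real query's joins and filters decide whether Oracle actually uses it):
CREATE INDEX idx_client_client_id ON CLIENT (CLIENT_ID);

-- Check whether the plan now walks the index and drops the SORT ORDER BY step
EXPLAIN PLAN FOR
SELECT C.CLIENT_ID FROM CLIENT C ORDER BY C.CLIENT_ID;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);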

Related

ORACLE DB 24HRS RECORD COUNT

Is there a way in Oracle DB to see how many records in a table got inserted, updated, and deleted in the schema? Right now I am using USER_TAB_MODIFICATIONS, and the issue is that if I want a daily count I have to gather stats on the table daily, which I want to avoid because a lot of my tables have millions of records and gathering stats takes a long time to run. Can someone point me in the right direction? I will really appreciate all the help. Thanks
I have a few suggestions:
SYS.DBA_TAB_MODIFICATIONS gives you cumulative totals (see the sketch below) ...
If you want daily counts, you can alternatively alter your tables to store insert/update/delete dates (soft delete, maybe even soft update ...).
But I think you could also use a tool to monitor it. My tool is: Real Time Oracle Monitoring and Performance Analytics Tool.
You can give it a chance.
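For the first suggestion, a rough sketch of reading those counters (MY_SCHEMA is a placeholder; the counters are cumulative since the last stats gathering, so a daily snapshot into your own table is needed to turn them into per-day deltas):
-- Flush the in-memory DML monitoring counters so the view is current
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

SELECT table_owner, table_name, inserts, updates, deletes, timestamp
FROM   sys.dba_tab_modifications
WHERE  table_owner = 'MY_SCHEMA';   -- hypothetical schema name
A nightly job could insert that result set (plus SYSDATE) into a small history table, and the daily count is then the difference between consecutive snapshots.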

What effect does the number of records in a particular table (in SQL Server) have on LINQ query response time in C# MVC

I did some googling on my subject title but didn't find a useful answer. Most questions were about the effect of the number of a table's columns on query performance, while I need to know the effect of the number of a table's rows on LINQ query response time in a C# MVC project. Actually, I have a web MVC project in which I try to get alarms via an ajax call to the server and show them in a web grid on the client side. Each ajax call is performed every 60 seconds in a loop created with the setTimeout method. The number of rows in the alarm table (in a SQL Server database) is gradually increasing, and after a week it reaches thousands of rows. At first, when launching the project, I can see in the browser's DevTools (Chrome) that each ajax call takes about 1 second or so. But this time gradually increases every day, and after a week each successful ajax call takes more than 2 minutes. This causes about 5 ajax calls to always be in the pending queue. I am sure there is no memory leak in either the client (jQuery) or server (C#) side code, so the only culprit I suspect is the response time of the SELECT query performed on the alarm table. I appreciate any advice.
Check the query that is executed in the database. There are two options:
1. Your LINQ query fetches all the data from the database and processes it locally. In this case the number of rows in the table matters a lot. You need to fix your LINQ query and make sure it fetches only the relevant rows from the database; then you are in option 2.
2. Your LINQ query fetches only the relevant rows from the database, but there are no relevant indexes on your table and each query scans all the data in it.
However, with only a few thousand rows in your table, I doubt a table scan would take 2 minutes, so option 1 is the more likely reason for this slowdown.
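If option 2 does turn out to apply, an index that matches the filter your grid uses would stop the per-poll query from scanning the whole table. A rough T-SQL sketch with a hypothetical Alarm table and columns (adjust to the real schema):
-- Hypothetical: index the column the 60-second poll filters and sorts on
CREATE NONCLUSTERED INDEX IX_Alarm_CreatedAt
    ON dbo.Alarm (CreatedAt DESC)
    INCLUDE (AlarmText, Severity);       -- hypothetical grid columns

-- The SQL generated by a server-side LINQ Where/OrderBy/Take should then
-- look roughly like this and seek instead of scan:
DECLARE @since datetime2 = DATEADD(MINUTE, -1, SYSUTCDATETIME());
SELECT TOP (100) AlarmId, AlarmText, Severity, CreatedAt
FROM   dbo.Alarm
WHERE  CreatedAt >= @since
ORDER BY CreatedAt DESC;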

SQL Server Full Table Scan and Load

For the purpose of this question, let's pretend I have the following table:
Transaction:
Id
ProductId
ProductName
City
State
Country
UnitCost
SellAmount
NumberOfTimesPurchased
Profit (NumberOfTimesPurchased * (SellAmount - UnitCost))
Basically, a single de-normalized table with a million plus rows in it. It is important to note that only two columns will ever be updated: Profit and NumberOfTimesPurchased. When a sale is made, NumberOfTimesPurchased is updated and the new profit amount is re-calculated.
Now, I need to do some minimal reporting on this table, which consists of queries that aggregate and group. As an example:
SELECT
City, AVG(UnitCost), AVG(SellAmount),
SUM(NumberOfTimesPurchased), AVG(Profit)
FROM
Transaction
GROUP BY
City
SELECT
State, AVG(UnitCost), AVG(SellAmount), SUM(NumberOfTimesPurchased),
AVG(Profit)
FROM
Transaction
GROUP BY
State
SELECT
Country, AVG(UnitCost), AVG(SellAmount), SUM(NumberOfTimesPurchased),
AVG(Profit)
FROM
Transaction
GROUP BY
Country
SELECT
ProductId, ProductName, AVG(UnitCost), AVG(SellAmount),
SUM(NumberOfTimesPurchased), AVG(Profit)
FROM
Transaction
GROUP BY
ProductId, ProductName
These queries are quick: ~1 second. However, I've noticed that under load, performance significantly drops (from 1 second up to a minute when there are 20+ concurrent requests), and I'm guessing the reason is that each query performs a full table scan.
I've attempted to use indexed views for each query, however my update statement performance takes a beating since each view needs to be rebuilt. On the same note, I've attempted to create covering indexes for each query, but again my update statement performance is not acceptable.
Assuming full table scans are the culprit, do I have any realistic options to get the query time down while keeping update performance at acceptable levels?
Note that I cannot use column store indexes (I'm using the cheaper version of Azure SQL Database). I'd also like to stay away from any sort of roll-up implementation, as I need the data available immediately.
Finally - the example above is not a completely accurate representation of my table. I have 20 or so different columns that can be 'grouped', and 6 columns that can be updated. No inserts or deletes.
Because there are no WHERE clauses in your queries, the database engine can do nothing but a table scan (or a clustered index scan, which is really the same thing). If there were covering indexes containing all the columns from your query, then the engine would prefer those. If your real queries have WHERE clauses, then appropriate indexing with those columns as the leading columns of the index might help.
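For illustration, a covering index for the City query might look like the sketch below, using the question's column names (with the caveat already raised in the question that every extra index makes the updates to Profit and NumberOfTimesPurchased more expensive):
-- [Transaction] is bracketed because TRANSACTION is a reserved word in T-SQL
CREATE NONCLUSTERED INDEX IX_Transaction_City_Covering
    ON dbo.[Transaction] (City)
    INCLUDE (UnitCost, SellAmount, NumberOfTimesPurchased, Profit);
The GROUP BY City query can then scan this narrower index and aggregate it, rather than reading every column of every row in the clustered index.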
But I think your problem lies elsewhere. As far as concurrency goes you haven't put enough money in the meter. According to the main service tiers doc, the Basic tier for Azure SQL Database is for:
... supporting typically one single active operation at a given time. Examples include databases used for development or testing, or small-scale infrequently used applications.
Therefore you might want to think about splashing out for the Premium edition to support both your concurrency requirement and columnstore indexes, which are perfectly suited to this type of query. Just for fun, I created a test rig based on AdventureWorksDW2012 to try and recreate your problem, which is here. Query performance was atrocious (> 20 secs). I'd be surprised if you weren't getting DTU warnings on your portal.
An upgrade to Standard (S0-S2) did boost performance so you should experiment. You could look at scaling up for busy query times and down when not required.
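Scaling up and down can be scripted; a hedged sketch of the T-SQL form (hypothetical database name, and my understanding is that the statement is run while connected to the logical server's master database):
-- Scale up before the busy reporting window
ALTER DATABASE [MyReportingDb]
    MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S2');

-- Scale back down afterwards
ALTER DATABASE [MyReportingDb]
    MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S0');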
This table also looks a bit like a fact table, so you might want to consider refactoring this as a fact / dimensional model then use Azure Analysis Services on top to bring that sub-second performance.
Coincidentally there is a feedback item you can vote for to bring columnstore to Standard tier:
https://feedback.azure.com/forums/217321-sql-database/suggestions/6878001-make-sql-column-store-feature-available-for-standa
Recent comments suggest it is "in the work queue" as of May 2017.

select & update in both live & archive tables in the same schema

The application that I am working on currently has an archive logic where all the records older than 6 months will be moved to history tables in the same schema, but on a different table space. This is achieved using a stored procedure which is being executed daily.
For ex. TABLE_A (live, latest 6 months) ==> TABLE_A_H (archive, older than 6 months, up to 8 years).
So far no issues. Now the business has come up with a new requirement where the archived data should also be available for selects & updates. Updates can happen even for year-old data.
Selects could be direct, like
select * from TABLE_A where id = 'something'
Or it could be an open-ended query, like
select * from TABLE_A where created_date < 'XYZ'
Updates are usually for specific records.
These queries are exposed as REST services to the clients. Junk/null values are possible (there is no way the application can sanitize the input).
The current snapshot of the DB is
PARENT_TABLE (10M records, 10-15K for each record)
CHILD_TABLE_ONE (28M records, less than 1K for each record)
CHILD_TABLE_TWO (25M records, less than 1K for each record)
CHILD_TABLE_THREE (46M records, less than 1K for each record)
CHILD_TABLE_FOUR (57M records, less than 1K for each record)
Memory is not a constraint - I can procure an additional 2 TB of space if needed.
The problem is: how do I keep the response time low when queries hit the archive tables?
What are all the aspects that I should consider when building a solution?
Solution 1: For direct selects/updates, check whether the records are available in the live tables. If present, perform the operation on the live tables; if not, perform it on the archive tables.
For open-ended queries, use UNION ??? (see the sketch after this list)
Solution 2: Use month-wise partitions and keep all 8 years of data in a single set of tables? Does Oracle handle 150+ million records in a single table efficiently for selects/updates?
Solution 3: Use a NoSQL store like Couchbase? Not a feasible solution at the moment because of the infra/cost involved.
Solution 4: ???
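For the open-ended case in Solution 1, a rough sketch of the UNION approach (UNION ALL assumes a record never exists in both tables at the same time; otherwise use UNION):
SELECT * FROM TABLE_A   WHERE created_date < :cutoff_date
UNION ALL
SELECT * FROM TABLE_A_H WHERE created_date < :cutoff_date;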
Tech Stack: Oracle 11G, J2EE Application using Spring/Hibernate (Java 1.6) hosted on JBoss.
Your response will be very much appreciated.
If I were you, I'd go with Solution 2, and ensure that you have the relevant indexes available for the types of queries you expect to be run.
Partitioning by month means that you can take advantage of partition pruning, assuming that the queries involve the column that's being partitioned on.
It also means that your existing code does not need to be amended in order to select or update the archived data.
You'll have to set up housekeeping to add new partitions though - unless you go for interval partitioning, but that has its own set of gotchas.
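A rough Oracle sketch of that suggestion, using interval partitioning so new monthly partitions are created automatically (hypothetical column list; the real DDL would mirror TABLE_A's definition):
CREATE TABLE TABLE_A_PART (
    id            NUMBER PRIMARY KEY,
    created_date  DATE NOT NULL,
    payload       VARCHAR2(4000)
)
PARTITION BY RANGE (created_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
    PARTITION p_initial VALUES LESS THAN (DATE '2015-01-01')
);

-- Queries that filter on created_date are pruned to the relevant partitions
SELECT * FROM TABLE_A_PART WHERE created_date < DATE '2016-06-01';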

Inserting data from and to the same table - IN or INNER JOIN?

In an application I have 2000 Accounts. The first Account contains 10000 Clients, which is the maximum limit for an Account. Users can select Clients from the first Account and then select some Accounts to copy the selected Clients to the selected Accounts. So the possible maximums are 1999 Accounts and 10000 Clients.
Currently I'm looping over the Account list and calling a stored procedure in each iteration from the client application. With each iteration, an Account Id and a table-valued parameter that contains the list of Client Ids are passed to the SP. While testing with 500 Accounts and 10000 Clients, it takes 25 minutes, 34 seconds and 543 milliseconds. At some point within the SP I'm using the following code –
INSERT INTO Client
SELECT AccountId, CId, Code, Name, Email FROM Client
WHERE Client.Id IN(SELECT Id FROM #clientIdList)
where #clientIdList is the table-type variable that contains the 10000 Client's Id.
The thing is, after each iteration 10000 new Client rows are added to the Client table. So, with each iteration, this INSERT operation is going to take longer in the next iteration. Googling for some SP performance tips, I came to know that the IN clause is considered somewhat evil, and most people suggest using INNER JOIN instead. So I changed the above code to –
INSERT INTO Client
SELECT c.AccountId, c.CId, c.Code, c.Name, c.Email FROM Client AS c
INNER JOIN #clientIdList AS cil
ON c.Id = cil.Id
Now it takes 25 minutes, 17 seconds and 407 milliseconds. Nothing exciting, really!
I'm new to the stored procedure arena. So, with this amount of data, is it supposed to take this long? And which one should I use for the given scenario, IN or INNER JOIN? Suggestions and performance tips are welcome. Thanks.
It's hard to tell exactly what is going on without knowing more about your stored procedure.
What I would recommend is checking the Execution plan. To do this, open up SQL Server Management studio. In a new query window make a call to your stored procedure passing in any relevant parameters.
Before you execute this, go up to the Query menu and choose the Include Client Statistics and Include Actual Execution Plan menu items.
Now run your query.
Come back in 25 minutes when it's all done and there should be 3 or 4 tabs at the bottom (depending on whether it returns data or not): one tab for results, one for messages, one for the client statistics and one for the execution plan.
The client stats tab is useful for seeing if the changes you make affect performance (it keeps several of your last runs to show you how it performed over those.)
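If you prefer to capture the same kind of timing information in the script itself, here is a hedged sketch (the table type, procedure name and parameters are hypothetical; the point is the SET STATISTICS wrapper, which prints elapsed time and IO per statement in the Messages tab):
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @clientIdList dbo.ClientIdList;              -- hypothetical table type
INSERT INTO @clientIdList (Id)
SELECT Id FROM Client WHERE AccountId = 1;           -- hypothetical source account

EXEC dbo.CopyClientsToAccount                        -- hypothetical procedure
     @AccountId = 2,
     @ClientIdList = @clientIdList;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;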
The more interesting tab is the execution plan tab. Look at this one, it should look something like this:
Here it tells me that my query was able to use the primary key index on all my tables. You want to look out for whole table scans (because that means it's not using an index). Also, if my query hadn't been so simple, had taken a long time, and had not used an index, then below "Query 1" there would be green text stating "Missing Index" or something along those lines. It will tell you the index you need to create to improve performance.
Also notice it tells you how much each query took, in percentage. I had one query so obviously it took 100% of the time. But if you had 5 queries in your sproc and one took 80% of the time, you might want to check that one out first.
It also tells you how much time was spent on each part of the query, again in percentages. That can be helpful for trying to understand what it is that your query is doing.
Run through this and I'd guess you have other problems with your sproc, and you can ask follow up questions from that.
