Easy performance metrics for SQL Server 2000 - performance

The reports that I use (and update) are taking a long time (some take hours). I felt this was far too long and asked about it previously. After taking a long look at various web sites that discuss SQL performance, they all write from the perspective of a DBA. However, I'm not one, and neither are my colleagues (I guess if we had a DBA then we wouldn't have this problem).
What I want is a simple way of returning the top 10 or so most-run and worst-performing scripts. I would have hoped there was a nice SET METRICS ON switch, but I guess if that were the case the sites wouldn't go on about recording profiles.
The last thing I want to do is cause performance to drop even further, and recording a profile sounds like a performance killer.

You have at least the following options:
look at the plan of a badly performing query in SQL Analyzer and try to optimize it, query by query, from there
or use a script (see below) that analyzes SQL Server's statistics and advises which indexes you could create
or use the Database Engine Tuning Advisor to suggest and/or create indexes for you to speed up your queries
or use a tool like Red Gate's SQL Response to give you more information than you can digest
In the end, automated tools will get you a long way. That may even be enough in your case, but keep in mind that no automated tool can outperform a skilled DBA, for the mere fact that automated tools cannot rewrite your queries.
SET CONCAT_NULL_YIELDS_NULL OFF  -- lets the optional CASE parts below concatenate as empty instead of turning the whole string to NULL

-- Joining the missing-index views gives a nice picture of what indexes
-- would help and how much they would help
SELECT
      'CREATE INDEX IX_' + UPPER(REPLACE(REPLACE(COALESCE(equality_columns, inequality_columns), '[', ''), ']', ''))
    + ' ON ' + d.statement + '(' + COALESCE(equality_columns, inequality_columns)
    + CASE WHEN equality_columns IS NOT NULL THEN
          CASE WHEN inequality_columns IS NOT NULL THEN ', ' + inequality_columns END
      END
    + ')' + CASE WHEN included_columns IS NOT NULL THEN ' INCLUDE (' + included_columns + ')' END
    , OBJECT_NAME(d.object_id)
    , d.*
    , s.*
FROM sys.dm_db_missing_index_details d
LEFT OUTER JOIN sys.dm_db_missing_index_groups g ON d.index_handle = g.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_group_stats s ON g.index_group_handle = s.group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.avg_total_user_cost DESC

You should be able to go through the sys.dm_exec_query_stats DMV, which keeps execution statistics for the query plans currently held in the plan cache.
SELECT creation_time
     , last_execution_time
     , total_physical_reads
     , total_logical_reads
     , total_logical_writes
     , execution_count
     , total_worker_time
     , total_elapsed_time
     , total_elapsed_time / execution_count AS avg_elapsed_time
     , SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
           ((CASE qs.statement_end_offset
                 WHEN -1 THEN DATALENGTH(st.text)
                 ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY last_execution_time, total_elapsed_time / execution_count DESC;
This gives you basic timing information about how long queries have historically taken.
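If you want something closer to the "top 10 worst performing" list you asked about, a variation on the same DMVs might look like the sketch below. Note that these DMVs exist in SQL Server 2005 and later, and only cover plans still in the cache:
-- Top 10 cached statements by average elapsed time
SELECT TOP 10
       qs.execution_count
     , qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time
     , qs.total_worker_time / qs.execution_count AS avg_cpu_time
     , SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
           ((CASE qs.statement_end_offset
                 WHEN -1 THEN DATALENGTH(st.text)
                 ELSE qs.statement_end_offset
             END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_elapsed_time DESC;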

Related

How to make a view use indexes

The view has several joins, but no WHERE clauses. It helped our developers to have all the data they needed in one Appian object that could easily be used in the "low code" later on. In most cases, Appian adds conditions when querying the view, in a subsequent WHERE clause like below:
query: [Report on Record Type], order by: [[Sort[histoDateAction desc], Sort[id asc]]],
filters:[((histoDateAction >= TypedValue[it=9,v=2022-10-08 22:00:00.0])
AND (histoDateAction < TypedValue[it=9,v=2022-10-12 22:00:00.0])
AND (histoUtilisateur = TypedValue[it=3,v=miwem6]))
]) (APNX-1-4198-000) (APNX-1-4205-031)
Now that we are starting to have data in the database, performance is getting low. From the execution plan, the reason seems to be that the query does not use indexes when the data is queried.
Here is how the query for view VIEW_A looks:
SELECT
    <columns> (not much transformation here)
FROM A
LEFT JOIN R r1 ON r1.id = A.id_type1
LEFT JOIN R r2 ON r2.id = A.id_type2
LEFT JOIN R r3 ON r3.id = A.id_type3
LEFT JOIN U ON U.id = A.id_user        -- <500>
LEFT JOIN C ON C.id = A.id_customer    -- <50000>
LEFT JOIN P ON P.id = A.id_prestati    -- <100000>
and in the current case, Appian added the clauses below:
where A.DATE_ACTION < to_date('2022-10-12 22:00:00', 'YYYY-MM-DD HH24:MI:SS')
and A.DATE_ACTION >= to_date('2022-10-08 22:00:00', 'YYYY-MM-DD HH24:MI:SS')
and A.USER_ACTION = 'miwem6'
Typically, when I show the explain plan for VIEW_A WHERE <conditions>, I get a cost around 6'000, and when I show the explain plan for the <code of the view> WHERE <clause>, the cost is 30.
Is it possible to use some Oracle hint to tell it: "Some day, someone will query this with an added WHERE clause on some columns, so don't be a stupid engine and use the indexes when the time comes"?
First, this isn't a great architecture. I can't tell you how many times folks have pulled me in to diagnose performance problems due to unpredictably dynamic queries where they are adding an assortment of unforeseeable WHERE predicates.
But if you have to do this, you can increase your likelihood of using indexes by lowering their cost. Like this:
SELECT /*+ opt_param('optimizer_index_cost_adj',1) */
<columns> (not much transformation here)
FROM A . . .
If you know for sure that nested loops + index use is the way you want to access everything, you can even disable the CBO entirely:
SELECT /*+ rule */
<columns> (not much transformation here)
FROM A . . .
But of course it's on you to ensure that there's an index on every high-cardinality column your system may use to significantly filter the desired rows. That's not every column, but it sounds like it may be quite a few.
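For example (a sketch only, not a definitive list; which columns actually matter depends on your workload), the predicates Appian generated above would suggest indexes along these lines:
-- Hypothetical indexes on the columns Appian filters on in the example above
CREATE INDEX ix_a_user_action ON A (USER_ACTION);
CREATE INDEX ix_a_date_action ON A (DATE_ACTION);
-- or a single composite index if both predicates usually appear together
CREATE INDEX ix_a_user_date ON A (USER_ACTION, DATE_ACTION);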
Oh, and one more thing... please ignore COST. By definition, Oracle always chooses what it computes as the lowest-cost plan. When it makes wrong choices, it's because its computation of cost is incorrect. Therefore, by definition, if you are having problems, the COST numbers you see are wrong. Ignore them.

How to tell my @Query in a Spring JPA repository NOT to use a prepared statement (leads to very slow queries)?

In my Spring repository class, I have the following query (a kind of analytics query) running against a PostgreSQL 9.6 server:
#Query("SELECT d.id as departement_id, COUNT(m.id) as nbMateriel FROM Departement d LEFT JOIN d.sites s LEFT JOIN s.materiels m WHERE "
+ "(s.metier.id IN (:metier_id) OR :metier_id IS NULL) AND (s.entite.id IN (:entite_id) OR :entite_id IS NULL) "
+ "AND (m.materielType.id IN (:materielType_id) OR :materielType_id IS NULL) AND "
+ "(d.id= :departement_id OR :departement_id IS NULL) "
+ "AND m.dateLivraison is not null and (EXTRACT(YEAR FROM m.dateLivraison) < :date_id OR :date_id IS NULL) "
+ "AND ( m.estHISM =:estHISM OR :estHISM IS NULL OR m.estHISM IS NULL) "
+ "GROUP BY d.id")
List<Map<Long, Long>> countByDepartementWithFilter(#Param("metier_id") List<Long> metier_id,#Param("entite_id") List<Long> entite_id,#Param("materielType_id") List<Long> materielType_id,
#Param("departement_id") Long departement_id, #Param("date_id") Integer date_id,
#Param("estHISM") Boolean estHISM);
The problem is: this query is called several times with different combinations of parameters, and after 5-6 calls the execution time goes from 20 ms to 10,000 ms.
From what I have read, what causes this is the use of prepared statements, which are not suited to analytics queries where there are a number of parameters whose values can change a lot. And indeed, running the above query directly is always fast (20 ms).
Question 1: How can I tell Spring JPA not to use prepared statements for this specific query?
Question 2: If Question 1 is not possible, what workaround do I have?
There are some general tips to enhance query performance, both from the JPA and the DB point of view:
1- Use @NamedQuery instead of @Query
2- For reporting queries, don't run them inside a transaction
3- You can set the flush mode to COMMIT if you don't need to flush the persistence context before the query runs
4- Check the generated query, run it in SQL Developer or TOAD, and check its cost and execution strategy; you can also consult your DBA about enhancing it with native DB functions / procedures, and hence use a native query instead of a JPQL query
5- If the returned data is large, consider making this query a DB view or materialized view and querying it directly
6- Make use of query hints, for example to activate a certain index; note that indexes may be ignored in the case of JPQL
7- You can use a native query if the query hint didn't work with JPQL
8- While comparing the query in SQL Developer with the one from the code, make sure you are comparing like with like: the query might run very quickly at first directly on the DB but take a long time to fetch all the data, and you might be comparing that initial short time with the application's full data fetch time
9- Use the fetch size hint appropriate for your provider
10- As far as I know, you can avoid the prepared statement if you use a native, non-parameterized query (building the statement and substituting the values manually), but generally this should be used with care and avoided as much as possible because of SQL injection vulnerabilities, and it also prevents both the DB query engine and the Hibernate engine from precompiling the queries (see the sketch below)
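Regarding point 10 and the original symptom (fast for the first 5-6 calls, then slow): PostgreSQL caches the plan of a prepared statement and, after roughly five executions, may switch from custom plans built for the actual parameter values to a single generic plan, which can be very poor for queries full of (:param IS NULL OR ...) predicates. A minimal sketch that reproduces this in psql, using a hypothetical materiel table rather than the poster's real schema:
-- Sketch: observe PostgreSQL switching to a generic plan for a prepared statement.
PREPARE count_by_dept (bigint) AS
    SELECT COUNT(*)
    FROM materiel m
    WHERE (m.departement_id = $1 OR $1 IS NULL);

-- The first five executions are planned with the actual parameter value:
EXPLAIN EXECUTE count_by_dept(42);
EXPLAIN EXECUTE count_by_dept(42);
EXPLAIN EXECUTE count_by_dept(42);
EXPLAIN EXECUTE count_by_dept(42);
EXPLAIN EXECUTE count_by_dept(42);
-- From about the sixth execution the planner may reuse a generic plan instead:
EXPLAIN EXECUTE count_by_dept(42);

-- On PostgreSQL 12+ (not the 9.6 in the question) custom plans can be forced:
-- SET plan_cache_mode = force_custom_plan;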

Performance tuning tips - PL/SQL / SQL - Database

We are facing a performance issue in production. The MV refresh program runs for a long time, almost 13 to 14 hours.
The MV refresh program tries to refresh 5 MVs. Among these, one MV runs for a long time.
Below is the MV script that runs for a long time.
SELECT rcvt.transaction_id,
rsh.shipment_num,
rsh.shipped_date,
rsh.expected_receipt_date,
(select rcvt1.transaction_date from rcv_transactions rcvt1
where rcvt1.po_line_id = rcvt.po_line_id
AND rcvt1.transaction_type = 'RETURN TO VENDOR'
and rcvt1.parent_transaction_id=rcvt.transaction_id
)transaction_date
FROM rcv_transactions rcvt,
rcv_shipment_headers rsh,
rcv_shipment_lines rsl
WHERE 1 =1
AND rcvt.shipment_header_id =rsl.shipment_header_id
AND rcvt.shipment_line_id =rsl.shipment_line_id
AND rsl.shipment_header_id =rsh.shipment_header_id
AND rcvt.transaction_type = 'RECEIVE';
The shipment table contains millions of records, and the above query extracts almost 60 to 70% of the data. We suspect the data volume is the reason.
We are trying to improve the performance of the above script, so we added a date filter to restrict the data.
SELECT rcvt.transaction_id,
rsh.shipment_num,
rsh.shipped_date,
rsh.expected_receipt_date,
(select rcvt1.transaction_date from rcv_transactions rcvt1
where rcvt1.po_line_id = rcvt.po_line_id
AND rcvt1.transaction_type = 'RETURN TO VENDOR'
and rcvt1.parent_transaction_id=rcvt.transaction_id
)transaction_date
FROM rcv_transactions rcvt,
rcv_shipment_headers rsh,
rcv_shipment_lines rsl
WHERE 1 =1
AND rcvt.shipment_header_id =rsl.shipment_header_id
AND rcvt.shipment_line_id =rsl.shipment_line_id
AND rsl.shipment_header_id =rsh.shipment_header_id
AND rcvt.transaction_type = 'RECEIVE'
AND TRUNC(rsh.creation_date) >= NVL(TRUNC((sysdate - profile_value),'MM'),TRUNC(rsh.creation_date) );
For a 1-year profile it shows some improvement, but for a 2-year range it is even worse than the previous query.
Any suggestions to improve the performance?
Please help.
I'd pull out that scalar subquery into a regular outer join.
Costing for scalar subqueries can be poor, and you are forcing it to do a lot of single-record lookups (presumably via an index) rather than giving it other options; a sketch of the rewrite follows the quote below.
"The main query then has a scalar subquery in the select list.
Oracle therefore shows two independent plans in the plan table. One for the driving query – which has a cost of two, and one for the scalar subquery, which has a cost of 2083 each time it executes.
But Oracle does not “know” how many times the scalar subquery will run (even though in many cases it could predict a worst-case scenario), and does not make any cost allowance whatsoever for its execution in the total cost of the query."
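To illustrate, here is a sketch of that rewrite as an outer join (assuming at most one RETURN TO VENDOR transaction per parent transaction, so the join does not multiply rows):
SELECT rcvt.transaction_id,
       rsh.shipment_num,
       rsh.shipped_date,
       rsh.expected_receipt_date,
       rcvt1.transaction_date
FROM rcv_transactions rcvt
JOIN rcv_shipment_lines rsl
  ON  rcvt.shipment_header_id = rsl.shipment_header_id
  AND rcvt.shipment_line_id   = rsl.shipment_line_id
JOIN rcv_shipment_headers rsh
  ON  rsl.shipment_header_id = rsh.shipment_header_id
LEFT JOIN rcv_transactions rcvt1
  ON  rcvt1.po_line_id            = rcvt.po_line_id
  AND rcvt1.parent_transaction_id = rcvt.transaction_id
  AND rcvt1.transaction_type      = 'RETURN TO VENDOR'
WHERE rcvt.transaction_type = 'RECEIVE';
Written this way, the optimizer can choose a hash or merge join for the RETURN TO VENDOR lookup instead of being forced into one indexed lookup per row.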

Optimizing DB2 query - why does bash "time" output stay the same?

I have a DB2 query on the TPC-H schema, which I'm trying to optimize with indexes. I can tell from db2expln output that the estimated costs are significantly lower (300%) when indexes are available.
However I'm also trying to measure execution time and it doesn't change significantly when the query is executed using indexes.
I'm working in an SSH terminal, executing my queries and writing output to a file like
(time db2 "SELECT Q.name FROM
(SELECT custkey, name FROM customer WHERE nationkey = 22) Q WHERE Q.custkey IN
(SELECT custkey FROM orders B WHERE B.orderkey IN
(SELECT orderkey FROM lineitem WHERE receiptdate BETWEEN '1992-06-11' AND '1992-07-11'))") &> output.txt
I did 10 measurements each: 1) without indexes, 2) with index on lineitem.receiptdate, 3) with indexes on lineitem.receiptdate and customer.nationkey,
calculated average time and standard deviation, all are within the same range.
I executed RUNSTATS ON TABLE schemaname.tablename AND DETAILED INDEXES ALL after index creation.
I read this post about the output of the time command; from what I understand, sys + user time should be relevant for my measurement. There is no change in the summed sys + user time, nor in real.
sys + user is around 44 ms, real is around 1s.
Any hints why I cannot see a change in time? Am I interpreting time output wrong? Are the optimizer estimations in db2expln misleading?
Disclaimer: I'm supposed to give a presentation about this at university, so it's technically homework, but as it's more of a comprehension question and not "please make my code work" I hope it's appropriate to post it here. Also, I know the query could be simplified but my question is not about this.
The optimizer estimates timerons (a measurement of internal cost), and these timerons cannot be translated one-to-one into query execution time. So a difference of 300% in timerons does not mean you will see a 300% difference in runtime.
To measure the time of one or more SQL statements, I recommend using db2batch with the option
-i complete
The query itself can also be written with a join and EXISTS:
SELECT f1.name
FROM customer f1
WHERE f1.nationkey = 22
  AND EXISTS (
      SELECT *
      FROM orders f2
      INNER JOIN lineitem f3 ON f2.orderkey = f3.orderkey
      WHERE f1.custkey = f2.custkey
        AND f3.receiptdate BETWEEN '1992-06-11' AND '1992-07-11'
  )

SonarQube doesn't store all results in the database?

I am aware I should not use the database directly from SonarQube, but this is a one-shot complex thing, and it saves me days if I can do it directly from the database.
I need to know the number of classes per project where the number of lines is less than 200. So far, no problem creating this in SQL.
The only problem I have: for 2 projects, this information isn't stored in the database at all! In the SonarQube GUI I can see this measure for each file, but in the database these files have only 1 measure stored (technical debt).
My guess is that for some strange reason these measures are calculated on the fly for these projects? Could that be? And is there a way to force SonarQube to store project measures for each file in the database? I tried the sonar.analysis.mode=analysis parameter, but that didn't work.
Thanks a lot and regards,
Pieter
It was due to the query. This is the right query:
SELECT metrics.description, root.name, COUNT(project_measures.metric_id) AS AmountOfFiles,
SUM(CASE WHEN project_measures.value < 200 THEN 1 ELSE 0 END) Less200,
SUM(CASE WHEN ((project_measures.value >= 200) AND (project_measures.value <= 1000)) THEN 1 ELSE 0 END) Between200And1000,
SUM(CASE WHEN project_measures.value > 1000 THEN 1 ELSE 0 END) More1000
FROM project_measures
INNER JOIN metrics on project_measures.metric_id = metrics.id
INNER JOIN snapshots on project_measures.snapshot_id = snapshots.id
INNER JOIN projects root on snapshots.root_project_id = root.id
WHERE metrics.id = '1' /* Lines: code lines + comments + blank lines */
and root.scope = 'PRJ' /*projects*/
and snapshots.scope = 'FIL'
and root.name like '%' /* '%' to show all projects*/
GROUP BY root.name
I am a little bit confused by your question. Are you saying that, when using your SQL queries, you get correct results for all but 2 projects, whose metrics do appear in the SonarQube dashboard?
I suppose the problem cannot come from your SQL queries?
Have you been running these 2 projects recently?
Try clearing your browser cache?
Regards.
