How to profile sqlite query execution against the query plan? - performance

I'm familiar with the EXPLAIN and EXPLAIN QUERY PLAN commands. However, these only show how the query will be executed. I would like to compare the query plan with data from the actual execution. In particular, I'd like to see the number of rows accessed and returned at every step of the query plan, as well as the wall/CPU time each step took.
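SQLite's C API does expose per-loop row counts via sqlite3_stmt_scanstatus(), but only when the library is compiled with SQLITE_ENABLE_STMT_SCANSTATUS, and most language bindings don't surface it. As a rough sketch of what is available from Python's built-in sqlite3 module (the table and query here are made up for illustration): you can capture the estimated plan and the overall wall time, though not per-step counts.

```python
import sqlite3
import time

# Hypothetical in-memory example; substitute your own database and query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b TEXT)")
conn.executemany("INSERT INTO t(b) VALUES (?)", [("x",)] * 1000)

query = "SELECT * FROM t WHERE a > 500"

# The estimated plan: one row per step (id, parent, notused, detail).
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)

# Actual execution: total wall time and rows returned, but not per-step stats.
start = time.perf_counter()
rows = conn.execute(query).fetchall()
elapsed = time.perf_counter() - start
print(f"{len(rows)} rows in {elapsed:.6f}s")
```

Comparing the plan's detail text against the measured totals is as close as the stock Python bindings get to profiling against the plan.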

Related

Snowflake queries with CTEs seem not to cache results

When I execute a query containing a CTE (a common table expression, defined by a WITH clause) in Snowflake, the result is not cached.
The question is: is this Snowflake working as designed, or do I need to do something to force result caching?
Snowflake does use the result set cache for CTEs. You can confirm that by running this simple one twice. The history table should show that the second run did not use a warehouse, and drilling down into the query profile should show that the second run's execution plan is a single node, query result reuse.
with
my_cte(L_ORDERKEY) as
(select L_ORDERKEY from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1"."LINEITEM")
select * from MY_CTE limit 10000;
There are certain conditions that prevent Snowflake from using the result set cache. One of the more common ones is the use of a function that can produce different results across runs. For example, if a query includes current_timestamp(), its result changes each time it runs.
Here is the complete list of criteria that must all be met in order to use the result set cache. Even then, there's a note that meeting all of those criteria does not guarantee use of the result set cache.
https://docs.snowflake.com/en/user-guide/querying-persisted-results.html#retrieval-optimization

MonetDB Query Plan

I have a few queries that I am running and I would like to view some sort of query plan for a given query. When I add "explain" before the query, I get a long (~4,000-line) result that is impossible to interpret.
The MAL plan exposes all parallel activity needed to solve the query. Each line is a relational algebra operator or catalog action.
You might also use PLAN to get an idea of the output of the SQL optimizer.
Each part in the physical execution plan that'll be executed in parallel is repeated the same number of times as the number of cores you have in the result of EXPLAIN. That's why EXPLAIN can sometimes produce a huge MAL plan.
If you just want an idea of how a query is handled, you can force MonetDB to generate a sequential MAL plan; then, at least, you get rid of the repetitions. To do this, change the default optimizer pipeline to, e.g., 'sequential_pipe'. This can be done in a client (it then applies only to that client session) or in the server (it then applies to the whole server session). For more information: https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/OptimizerPipelines

Analyze the runtime characteristics of a HiveQL query without actual execution

How can I determine an approximate runtime of a HiveQL query without (a) executing the query or (b) fetching the results?
Hive's EXPLAIN command gives the execution plan of the query. Just add the keyword EXPLAIN before the query and execute it.
Otherwise, instead of returning the result, you could return the count of the records from the query. That might provide some insight into the execution time.
As mentioned by #visakh, "explain" gives an execution plan. However, it is cryptic and does NOT give execution time. You will have to do a fair amount of analysis on the (potentially copious) output of explain to derive the information you are looking for.
Running "analyze" on the Hive tables helps, but still does not make the explain output user-friendly. Improving "explain" is a feature that my team at a former major employer requested from Hortonworks.
However, I disagree with the "count" suggestion: the "count" typically takes as much time as running the query itself. After all, the data has to be fetched and the various filtering and aggregation operations performed in order to return the count. Unfortunately, Hive is not intelligent enough to discard the sorting/ordering steps when doing the count, so you end up paying essentially the entire "price" of the query.

Different execution paths for the same query through Explain plan and Monitor SQL

I am getting different execution PATHS for a query execution through:
SQL Developer > Explain Plan
SQL Developer > Tools > Monitor SQL > Monitored SQL Execution Details (a feature of OEM)
The first option shows indexes being used. However, the second option shows that those indexes are not used during actual execution.
Note: I cannot run these queries in the tool myself, since the product I'm using creates and executes them on the fly (I know the queries are exactly the same because I can view them in the execution monitor). That's why I specifically need to know which result is correct. Or is there a way I can track the specific index usage?
Explain plans are the theoretical plan. Real-Time SQL Monitoring, which is what you're referring to with 'Monitor SQL', shows the actual plan as it executes.
You can also ask SQL Developer to show you the cached plan that was most likely used to execute the statement last. In version 4.0 and higher, use the drop-down control on the Explain button to see those.
I discuss this here
http://www.thatjeffsmith.com/archive/2013/07/explain-plan-and-autotrace-enhancements-in-oracle-sql-developer-4/

The query time of a view increases after having fetched the last page from a view in Oracle PL/SQL

I'm using Oracle PL/SQL Developer on an Oracle Database 11g. I have recently written a view with some weird behaviour. When I run the simple query below without fetching the last page of the results, the query time is about 0.5 s (0.2 s when cached).
select * from covenant.v_status_covenant_tuning where bankkode = '4210';
However, if I fetch the last page in PL/SQL Developer, or if I run the query from Java code (i.e. a query that retrieves all the rows), something happens to the view and the query time increases to about 20-30 seconds.
The view does not start working properly again until I recompile it. The explain plan is exactly the same before and after. All indexes and tables are analyzed. I don't know if it's relevant, but the view uses a few analytic expressions like rank() over (partition by ...), lag(), lead() and so on.
As I'm new here I can't post a picture of the explain plan (need a reputation of 10) but in general the optimizer uses indexes efficiently and it does a few sorts because of the analytic functions.
If the plan involves a full scan of some sort, the query will not complete until the very last block in the table has been read.
Imagine a table that has lots of matching rows in the very first few blocks in the table, and no matching rows in the rest of it. If there is a large volume of blocks to check, the query might return the first few pages of results very quickly, as it finds them all in the first few blocks of the table. But before it can return the final "no more results" to the client, it must check every last block of the table - it doesn't know if there might be one more result in the very last block of the table, so it has to wait until it has read that last block.
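The effect can be illustrated with a small SQLite sketch (the table and numbers here are made up; the same principle applies to an Oracle full table scan): all matching rows sit at the start of the table, so the first row comes back almost immediately, but exhausting the cursor still requires scanning every remaining row.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big(id INTEGER PRIMARY KEY, val INTEGER)")
# Matching rows (val = 1) only at the very start; the rest never match.
conn.executemany("INSERT INTO big(val) VALUES (?)",
                 [(1,)] * 10 + [(0,)] * 100000)

# No index on val, so this is a full table scan.
cur = conn.execute("SELECT id FROM big WHERE val = 1")

t0 = time.perf_counter()
first = cur.fetchone()           # found in the first few rows
t_first = time.perf_counter() - t0

rest = cur.fetchall()            # must still visit the other 100,000 rows
t_rest = time.perf_counter() - t0

print(f"first row after {t_first:.6f}s, cursor exhausted after {t_rest:.6f}s")
```

Fetching the "last page" corresponds to the fetchall() call: the scan cannot report "no more rows" until it has read to the end of the table.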
If you'd like more help, please post your query plan.
