Index selection in CockroachDB

How do I know which index CockroachDB will select for my query? How do I make sure I’m not performing a full table scan?

This is a pretty lengthy topic; there's an entire blog post devoted to the subject, which might be the best source for understanding how it works in CockroachDB.
To see which indexes CockroachDB is using for a given query, you can use the EXPLAIN statement, which will print out the query plan, including any indexes that are being used:
EXPLAIN SELECT col1 FROM tbl1;
If you'd like to tell the query planner which index to use, you can do so via some special syntax for index hints:
SELECT col1 FROM tbl1@idx1;
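To get a feel for reading a plan without a CockroachDB cluster at hand, here is a minimal sketch that uses SQLite (via Python's sqlite3 module) as a stand-in. This is an assumption made for illustration: SQLite's EXPLAIN QUERY PLAN and INDEXED BY are only rough counterparts of CockroachDB's EXPLAIN and index hints, and the column col2 is invented.

```python
import sqlite3

# SQLite stands in for CockroachDB here; tbl1, col1 and idx1 come from the
# examples above, col2 is a hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (col1 INTEGER, col2 TEXT);
    CREATE INDEX idx1 ON tbl1 (col1);
""")

def plan(sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # last column is the readable detail

# No index covers this predicate, so the planner scans the whole table.
scan_plan = plan("SELECT col2 FROM tbl1 WHERE col2 = 'x'")

# A predicate on the indexed column lets the planner search idx1.
search_plan = plan("SELECT col2 FROM tbl1 WHERE col1 = 42")

# INDEXED BY is SQLite's counterpart to an index hint: the statement
# fails to prepare if the named index cannot be used.
hinted_plan = plan("SELECT col2 FROM tbl1 INDEXED BY idx1 WHERE col1 = 42")

for steps in (scan_plan, search_plan, hinted_plan):
    print(steps)
```

A plan step beginning with SCAN is the full table scan to watch out for; a step beginning with SEARCH names the index being used.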

Related

PL/SQL: Looping through a list string

Please forgive me if I open a new thread about looping in PL/SQL but after reading dozens of existing ones I'm still not able to perform what I'd like to.
I need to run a complex query on a view of a table, and the only way to shorten the running time is to filter with a where clause on a column the table is indexed on (otherwise the system ends up doing a full scan of the table, which runs endlessly).
The column the table is indexed on is store_id (string).
I can retrieve all the store_id values I want to query from a separate table:
e.g. select distinct store_id from store_anagraphy
Then I'd like to write a loop that runs the query once for each store_id identified above:
e.g. select *complex query from view_of_sales where store_id = 'xxxxxx'
and appends (union) all the results returned by each of these queries.
Thank you very much in advance.
Gianluca
In theory, you could write a pipelined table function that ran multiple queries in a loop and made a series of pipe row calls to return the results. That would be pretty unusual but it could be done.
It would be far, far more common, however, to simply combine the two queries and run a single query that returns all the rows you want:
select something
from your_view
where store_id in (select distinct store_id
from store_anagraphy)
If you are saying that you have tried this query and Oracle is choosing to do a table scan rather than using the index, then what you really have is a tuning problem. Most likely, statistics on one or more objects are inaccurate, which leads Oracle to expect that this query will return more rows than it really does, thus favoring the table scan. You should be able to fix that by fixing the statistics on the objects. In a pinch, you could also use hints to force an index to be used.
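As a rough check that the single statement returns the same rows as the loop-and-append approach, here is a small sketch. It assumes SQLite (through Python's sqlite3) as a stand-in for Oracle, a plain table in place of the view, and an invented amount column and sample data.

```python
import sqlite3

# Hypothetical miniature of the tables in the question; SQLite stands in
# for Oracle, and "amount" plus all the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_anagraphy (store_id TEXT);
    CREATE TABLE view_of_sales (store_id TEXT, amount INTEGER);
    CREATE INDEX sales_store_idx ON view_of_sales (store_id);
    INSERT INTO store_anagraphy VALUES ('A1'), ('B2'), ('A1');
    INSERT INTO view_of_sales VALUES ('A1', 10), ('B2', 20), ('C3', 30), ('A1', 5);
""")

# Approach from the question: one query per store_id, results appended.
looped = []
store_ids = conn.execute("SELECT DISTINCT store_id FROM store_anagraphy").fetchall()
for (store_id,) in store_ids:
    looped += conn.execute(
        "SELECT store_id, amount FROM view_of_sales WHERE store_id = ?",
        (store_id,),
    ).fetchall()

# Approach from the answer: a single statement with an IN subquery.
combined = conn.execute("""
    SELECT store_id, amount
    FROM view_of_sales
    WHERE store_id IN (SELECT DISTINCT store_id FROM store_anagraphy)
""").fetchall()

# Row order is not guaranteed, so compare the sorted results.
print(sorted(looped) == sorted(combined))
```

The single statement also leaves the optimizer free to pick the access path itself, which is why a table scan there points to a tuning problem rather than a need for a loop.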

How does Hive partitioning work?

Let's assume a table with the following schema:
ID, NAME, Country, where my partition key is Country.
If my query is like:
select * from table where id between 155555756 and 10000000000;
Partitioning will not help in that case, right?
Put simply: what if I do not use the partition key in my query? There will be a full table scan, right?
The answer to your first question is yes: this query will not do partition pruning.
You can use the following statement to check whether a query does partition pruning:
explain dependency <your query>
The answer to your second question: it depends!
If hive.mapred.mode is set to strict, Hive will not allow full table scans or a few other "risky" operations such as cross joins.
Depending on the version of Hive you are using, one of these settings also affects the number of partitions that can be scanned by a single query:
hive.metastore.limit.partition.request (or)
hive.limit.query.max.table.partition
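To make the idea of partition pruning concrete, here is a toy model in plain Python. It is only a conceptual sketch, not how Hive is implemented: the partitions dictionary, the scan helper, and the sample rows are all invented for illustration.

```python
# Toy model of a table partitioned by country: each partition is a
# separate bucket of rows, much like a separate directory in Hive.
partitions = {
    "IT": [{"id": 1, "name": "Anna"}, {"id": 4, "name": "Luca"}],
    "US": [{"id": 2, "name": "Bob"}],
    "IN": [{"id": 3, "name": "Ravi"}, {"id": 5, "name": "Asha"}],
}

def scan(country=None, id_range=None):
    """Return (matching rows, number of partitions read)."""
    # Pruning is only possible when the filter names the partition key.
    wanted = [country] if country is not None else list(partitions)
    rows = []
    for key in wanted:
        for row in partitions.get(key, []):
            if id_range is None or id_range[0] <= row["id"] <= id_range[1]:
                rows.append({**row, "country": key})
    return rows, len(wanted)

# Filter on id only: every partition has to be read.
rows_by_id, read_by_id = scan(id_range=(2, 4))

# Filter on the partition key: only one partition is read.
rows_by_country, read_by_country = scan(country="IT")

print(read_by_id, read_by_country)
```

A filter on id alone gives the engine nothing to prune with, so all partitions are read, which is the full-scan case from the question.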

Speed Improving Index

I have a fairly basic query such as the one below that I need to execute very often and as fast as possible:
Select
B.ID, B.FirstName, B.LastName
From
TableA as A
Join
TableB as B on A.ID = B.ID
Where
A.OtherID = @Input
So my thought was to create a stored procedure with a parameter @Input. I figured that since the execution plan is saved on the server side, this would increase the speed.
However, I want to speed it up further and thought that maybe an index might help. But I have not dealt with indexes much, just read a little.
What information would you need to help me build an index that can help?
Would an index help?
Also, this stored procedure is going to be called from Excel 2013, if that makes a difference to anything else we can do to speed it up.
We are using SQL Server 2012.
SQL Server Management Studio is remarkably good at recommending indexes. Run your query and see what it says.
Without knowing more about your schema or the OLAP patterns, I can only make a suggestion...
Is "ID" the key field in TableA and/or TableB? If so, they're already indexed.
I'd say you're looking at two indexes:
An index on TableA on OtherID that includes ID. This will help SQL Server find values of OtherID and return the IDs associated with them.
An index on TableB on ID that includes FirstName and LastName. This will help SQL Server with the join and save a trip back to the rows for FirstName and LastName.
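Here is a minimal sketch of those two indexes, run against SQLite through Python's sqlite3 as a stand-in for SQL Server. Two assumptions to note: the index names are invented, and because SQLite has no INCLUDE clause the "included" columns are simply appended to the index key, which gives a similar covering effect.

```python
import sqlite3

# SQLite stands in for SQL Server. SQLite has no INCLUDE clause, so the
# "included" columns are appended to the index key instead.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, OtherID INTEGER);
    CREATE TABLE TableB (ID INTEGER, FirstName TEXT, LastName TEXT);
    -- Hypothetical index names.
    CREATE INDEX ix_tablea_otherid ON TableA (OtherID, ID);
    CREATE INDEX ix_tableb_id ON TableB (ID, FirstName, LastName);
""")

query = """
    SELECT B.ID, B.FirstName, B.LastName
    FROM TableA AS A
    JOIN TableB AS B ON A.ID = B.ID
    WHERE A.OtherID = ?
"""

# Ask the planner how it would run the query; the last column of each
# row is the human-readable plan step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))]
for step in plan:
    print(step)
```

If both plan steps mention one of the new indexes, neither table needs to be read in full for this query.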

Is there any use in creating an index on all of a table's columns in Oracle?

In one of our production databases, we have a four-column table with no PK or UK constraints, only a NOT NULL constraint on one column. Inserts are slow on this table, and when I checked the indexes, I found one index built on all columns.
It is a normal table and not an IOT. I really don't see a need for an all-column index, but I'm wondering why the developers created it.
I'd appreciate your thoughts.
It might be useful, e.g. if you (mainly) query all columns, Oracle doesn't have to access the table at all but can get all the data from the index. Inserts take longer, though, because a larger index has to be maintained by the DBMS every time.
One case where it could be useful:
Say, for example, you are trying to check for the existence of records in this table, and for that you have to join on all four columns. In such a case you might have written a correlated query like the one below:
SELECT <something>
FROM table_1 t1
WHERE EXISTS
(SELECT 1 FROM table_t2 t2 where t1.c1=t2.c1 and t1.c2=t2.c2 and t1.c3=t2.c3 and t1.c4=t2.c4)
Apart from the above case, it looks like a mistake on the developers' part to me.
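The existence check can be tried out with a small sketch. It assumes SQLite (via Python's sqlite3) in place of Oracle, an invented index name, and made-up sample rows; the table names follow the query above.

```python
import sqlite3

# SQLite stands in for Oracle; the index name and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (c1 INTEGER, c2 INTEGER, c3 INTEGER, c4 INTEGER);
    CREATE TABLE table_t2 (c1 INTEGER, c2 INTEGER, c3 INTEGER, c4 INTEGER);
    CREATE INDEX table_t2_all_cols ON table_t2 (c1, c2, c3, c4);
    INSERT INTO table_1 VALUES (1, 1, 1, 1), (2, 2, 2, 2);
    INSERT INTO table_t2 VALUES (1, 1, 1, 1), (3, 3, 3, 3);
""")

query = """
    SELECT t1.c1
    FROM table_1 t1
    WHERE EXISTS
      (SELECT 1 FROM table_t2 t2
       WHERE t1.c1 = t2.c1 AND t1.c2 = t2.c2
         AND t1.c3 = t2.c3 AND t1.c4 = t2.c4)
"""

# Rows of table_1 that have an exact four-column match in table_t2.
matches = conn.execute(query).fetchall()

# The plan shows whether the all-column index answers the lookup.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)
```

Because every column the subquery touches is in the index, the lookup can be answered from the index alone.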
Indexes help query performance but slow down updates and inserts, because the indexes need to be updated on each modification.
If these tables are mainly used for querying and inserts happen only in specific periods, such as a batch at the beginning or the end of the day, then you can drop the indexes before loading the tables and recreate them afterwards.
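A minimal sketch of that drop, load, recreate pattern, again assuming SQLite through Python's sqlite3 rather than Oracle, with an invented table t, index name, and batch:

```python
import sqlite3

# SQLite stands in for Oracle; table, index name and data are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (c1 INTEGER, c2 INTEGER, c3 INTEGER, c4 INTEGER);
    CREATE INDEX t_all_cols ON t (c1, c2, c3, c4);
""")

batch = [(i, i + 1, i + 2, i + 3) for i in range(1000)]

with conn:  # commits once at the end if nothing raises
    # Drop the index so the inserts do not have to maintain it.
    conn.execute("DROP INDEX t_all_cols")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", batch)
    # Rebuild the index in one pass after the load.
    conn.execute("CREATE INDEX t_all_cols ON t (c1, c2, c3, c4)")

row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
index_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
print(row_count, index_names)
```

While the index is missing, queries that relied on it will be slow, so this only fits when the load window has no concurrent readers.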
In addition, all the queries on these tables need to be analysed to see which indexes are useful and which are not.
Anyway, you need to ask the developers before removing these indexes.

How do you create an index on a subquery factored temporary table?

I've got a query which has a WITH statement for a subquery at the top, and I'm then running a couple of CONNECT BYs on the subquery. The subquery can contain tens of thousands of rows, and there's no limit to the depth of the CONNECT BY hierarchy. Currently, this query takes upwards of 30 seconds; is it possible to specify indexes to put on the temporary table created for the factored subquery to speed up the CONNECT BYs, or speed it up another way?
There is no way to do it right in the query: Oracle does not support Eager Spool.
You can temporarily store your resultset in an indexed temporary table and issue the CONNECT BY query against it.
However, for the unsargable equality conditions in the query, the CONNECT BY usually builds a hash table which is in most cases even better than an index.
Could you please post your query here?
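The temporary-table approach might look roughly like the sketch below. It assumes SQLite through Python's sqlite3 instead of Oracle, so a recursive WITH query stands in for CONNECT BY; the nodes table, its columns, and the data are all invented.

```python
import sqlite3

# SQLite stands in for Oracle; a recursive CTE plays the role of CONNECT BY.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER, parent_id INTEGER, active INTEGER);
    INSERT INTO nodes VALUES
        (1, NULL, 1), (2, 1, 1), (3, 2, 1), (4, 1, 0), (5, 4, 1);

    -- Step 1: store the would-be WITH subquery in a temporary table.
    CREATE TEMP TABLE active_nodes AS
        SELECT id, parent_id FROM nodes WHERE active = 1;

    -- Step 2: index the column the hierarchy walk joins on.
    CREATE INDEX active_nodes_parent ON active_nodes (parent_id);
""")

# Step 3: walk the hierarchy against the indexed temporary table.
descendants = conn.execute("""
    WITH RECURSIVE tree(id, depth) AS (
        SELECT id, 0 FROM active_nodes WHERE parent_id IS NULL
        UNION ALL
        SELECT n.id, tree.depth + 1
        FROM active_nodes AS n
        JOIN tree ON n.parent_id = tree.id
    )
    SELECT id, depth FROM tree ORDER BY id
""").fetchall()

print(descendants)
```

Node 5 is left out because its parent (node 4) was filtered out of the temporary table, so the walk never reaches it.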
You might be able to use the MATERIALIZE hint with subquery factoring so that the subquery isn't being rerun iteratively. While it's undocumented, it seems to reliably flush the results of a WITH clause into a temporary table.
Jonathan Lewis' blog has several examples of how it can be used. There is some risk, however, due to the hint's undocumented nature.
