How to use getAll with orderBy in RethinkDB - rethinkdb

I want to list records where id=1 between two timestamps and finally order them by timestamp.
The MySQL query would be something like:
Select * from test
where (timestamp between 100099323 AND 1423699323) AND id=1
order by timestamp
There are more than 5 million documents in the RethinkDB database.
I tried using an index for the simpler MySQL query:
Select * from test where id=1 order by timestamp
and the RethinkDB query is:
r.table('test').getAll(1, {index: 'id'}).orderBy({index: 'timestamp'})
but I am getting this error:
RqlRuntimeError: Indexed order_by can only be performed on a TABLE or
TABLE_SLICE in:
r.table("test").getAll(1, {index: "id"}).orderBy({index: "timestamp"})
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Any suggestions?

RethinkDB doesn't support efficient index intersection (the GitHub issue to add this is #809), but you could implement this query efficiently by adding a compound index covering the 'id' and 'timestamp' fields.
If your result set is small enough, though, the orderBy could just be done completely in-memory by dropping the 'index' optarg:
r.table("test").getAll(1, {index: "id"}).orderBy("timestamp")
To do this efficiently for large result sets, you would need an index. Assuming your 'id' and 'timestamp' indexes correspond directly to fields in your rows, adding the index would look like:
r.table("test").indexCreate("id_time",
function(row) {
return [row("id"), row("timestamp")];
})
To get all the rows with id=1 and sort by the timestamp, you would then run:
r.table("test").between([1], [2], {"index": "id_time"})
.orderBy({"index": "id_time"})
In addition, going back to the original query you posted, you could query between two timestamps for id=1 by running:
r.table("test").between([1, <time 1>], [1, <time 2>], {"index": "id_time"})
.orderBy({"index": "id_time"})

Related

Oracle query with "in clause" - how to speed up using an index?

I have an Oracle query that uses an IN clause with eight given values, like:
select * from mytable a
where a.wf_type in ('value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8');
The table is not really big (about 3 million rows) and the query does a full table scan.
Therefore I added an index for the wf_type attribute.
But the index is not used by the query with the IN clause. If I change the query to one specific value, like
select * from mytable a where a.wf_type = 'value1';
the index is used and the query runs fast.
How do I speed up the query with the IN clause? Is it possible by using an index, or are there other ways?

Cassandra: long latency for a query with many rows in the result

Example table schema:
CREATE TABLE tbl (
  key int,
  seq int,
  name text,
  PRIMARY KEY (key, seq)
);
For each key, there are many rows (suppose 1000K).
Suppose I want to query the content for a specific key. My query is:
select * from tbl where key = 'key1'
(Actually I use the C++ driver in my program, with the paging interface.)
The result contains 1000K rows, and this query takes about 10 seconds.
I think the data for each key is stored together on disk, so it should be very fast to return.
Why does it take so long?
Is there any way to optimize it?
Why does it take so long?
There are almost 1000K = 1,000,000 = 1M rows returned by your query. That's why it takes so long.
Is there any way to optimize it?
Yes, there is: try using a limit and pivoting/pagination in the query.
From the table definition, it seems you have a clustering key seq; you can easily use this seq value to optimize your query. Assuming the clustering key (seq) has the default ascending order, change your query to:
select * from tbl where key = 'key1' and seq > [pivot] limit 100
Replace [pivot] with the last seq value of your previous result set; for the first query, use Integer.MIN_VALUE as [pivot].
For example:
select * from tbl where key = 'key1' and seq > -100 limit 100
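For instance, if the last row of that first page came back with seq = 99 (a hypothetical value), the next page would be fetched with:
select * from tbl where key = 'key1' and seq > 99 limit 100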

SQLite SELECT with max() performance

I have a table with about 1.5 million rows and three columns. Column 'timestamp' is of type REAL and indexed. I am accessing the SQLite database via PHP PDO.
The following three selects run in less than a millisecond:
select timestamp from trades
select timestamp + 1 from trades
select max(timestamp) from trades
The following select needs almost half a second:
select max(timestamp) + 1 from trades
Why is that?
EDIT:
Lasse asked for an "explain query plan". I ran it within a PHP PDO query since I have no direct SQLite3 command-line access at the moment. I guess it does not matter; here are the results:
explain query plan select max(timestamp) + 1 from trades:
[selectid] => 0
[order] => 0
[from] => 0
[detail] => SCAN TABLE trades (~1000000 rows)
explain query plan select max(timestamp) from trades:
[selectid] => 0
[order] => 0
[from] => 0
[detail] => SEARCH TABLE trades USING COVERING INDEX tradesTimestampIdx (~1 rows)
The reason this query
select max(timestamp) + 1 from trades
takes so long is that the + 1 stops SQLite from recognizing it as a bare MAX() query, so the engine computes the maximum by scanning the entire table and only then adds one to the result; that is the SCAN TABLE trades (~1000000 rows) step in the query plan above.
In the query
select timestamp + 1 from trades
you are doing a calculation for each record, but the engine still only needs to scan the table once. And in this query
select max(timestamp) from trades
the engine can answer the MAX() with a single lookup in the timestamp index, which is why its plan shows SEARCH TABLE trades USING COVERING INDEX.
From the SQLite documentation:
Queries that contain a single MIN() or MAX() aggregate function whose argument is the left-most column of an index might be satisfied by doing a single index lookup rather than by scanning the entire table.
I emphasized might from the documentation, because it appears the optimization no longer applies once the MAX() is wrapped in a larger expression such as SELECT MAX(x)+1 FROM table, or when column x is not the left-most column of an index.
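A common workaround, sketched here on the assumption that the tradesTimestampIdx index from the plan above stays in place, is to compute the MAX() in a scalar subquery so the single-index-lookup optimization can still apply, and add one outside it:
-- the inner query matches the single MIN()/MAX() pattern, so it can be
-- answered with one index lookup; the +1 is applied to that single value
select (select max(timestamp) from trades) + 1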

Searching first condition first and only if not available then second condition

I am writing an SQL query that should first search for the first value, and only if this value is missing should it search for the second value.
I have two tables. One of these tables contains the modification date (this is not always filled and can be null) and a creation date, which is always filled.
Now what I want is for the query to look first at the table with the modification date and, only if that is null, at the table with the creation date.
Example of the query:
Select *
from all_articles
where to_char(modification_date, 'YYYYMMDD') = to_char(sysdate, 'YYYYMMDD')-1
-- if this is an empty record then
to_char(creation_date, 'YYYYMMDD') = to_char(sysdate, 'YYYYMMDD')-1
Can anyone help me with this query?
Almost all the major RDBMSs have built-in functions to handle such a situation.
Oracle has the NVL function, which works as follows:
NVL(Modified_dt, Create_dt);
The above returns the Modified_dt column data by default; however, if that is null, it returns Create_dt.
See here for details:
http://www.w3schools.com/sql/sql_isnull.asp
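Applied to the query from the question, a sketch (assuming both date columns are reachable through all_articles, as the example query suggests; the ANSI-standard COALESCE would behave the same way here) might look like:
Select *
from all_articles
-- fall back to creation_date whenever modification_date is null
where to_char(NVL(modification_date, creation_date), 'YYYYMMDD') = to_char(sysdate - 1, 'YYYYMMDD')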

Oracle Text multi column index based query does not return any rows

I have a MAH_KERESES_MV table with 3 columns: OBJEKTUM_NEV, KERESES_SZOVEG_1, KERESES_SZOVEG_2. I created the following multi-column Oracle Text index:
exec ctx_ddl.create_preference( 'MAH_SEARCH', 'MULTI_COLUMN_DATASTORE');
exec ctx_ddl.set_attribute('MAH_SEARCH', 'COLUMNS', 'OBJEKTUM_NEV, KERESES_SZOVEG_1, KERESES_SZOVEG_2');
create index MAX_KERES_CTX on MAH_KERESES_MV(OBJEKTUM_NEV)
indextype is ctxsys.context
parameters ('DATASTORE MAH_SEARCH');
But the following query does not return any rows, although if I formulate it with the LIKE operator I get the results I expect:
SELECT id, OBJEKTUM_NEV
FROM MAH_KERESES_MV
WHERE CONTAINS(OBJEKTUM_NEV, 'C')>0;
Can somebody please help? TIA,
Tamas
Just in case anybody is interested later on: the solution was that the above CONTAINS clause filters for the C character as a standalone entity (i.e. a word). The correct WHERE clause would have been:
WHERE CONTAINS(OBJEKTUM_NEV, 'C%')>0;
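Put back into the original query, that gives (a sketch using only the columns shown above):
SELECT id, OBJEKTUM_NEV
FROM MAH_KERESES_MV
-- 'C%' matches indexed tokens that start with C, not only the standalone word C
WHERE CONTAINS(OBJEKTUM_NEV, 'C%') > 0;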
