How to use RethinkDB indices in the following scenario? - rethinkdb

I'd like to use an index to select all documents that don't have a particular nested field set.
In my situation with the JS-api this works out to this:
r.table('sometable').filter(r.row('_state').hasFields("modifiedMakeRefs").not())
How would I use an index on the above? I.e.: filter doesn't support defining indices afaik?

You would write:
r.table('sometable').indexCreate('idx_name', function(row) {
return row('_state').hasFields("modifiedMakeRefs");
})
And then:
r.table('sometable').getAll(false, {index: 'idx_name'})

Related

Filtering a list of values by a field value in GraphQL

So I'm doing some tests with GraphQL, and I'm failing in doing something that I believe is fairly simple.
When going to the GraphQL demo site (https://graphql.org/swapi-graphql) I'm presented with a default query which goes like this:
{
allFilms {
films {
title,
director,
releaseDate
}
}
}
This works as expected and returns a list of films.
Now - I would like to modify this query to return only the films where the director is George Lucas, and for the life of me - I can't figure out how to do that.
I've tried using the where and filter expressions, and also changing the second line to films: (director: "George Lucas"), but I keep getting error messages.
What's the correct syntax for doing that?
Thanks!
If you check the docs of the provided GraphQL schema, you'll see that this is not possible. Following is the definition of the allFilms field:
allFilms(
after: String
first: Int
before: String
last: Int
): FilmsConnection
As per the doc, it has 4 input arguments, which are after, first, before, and last. There is no way to filter this out using the director's name.
GraphQL is not SQL. You cannot use expressions like WHERE or FILTER in GraphQL. The schema is already defined and the filters are pre-defined too. If the schema does not allow you to filter values using a certain field, you just can't do it.
You can see the GraphQL schema here: https://github.com/graphql/swapi-graphql/blob/master/schema.graphql
The allFilms query does not accept a filter on the director field, and I can't find any other query that does.
Most likely you need to filter the result of the query yourself.
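Filtering on the client could look like the following minimal Python sketch. It assumes the response has already been fetched and parsed into a dict with the same shape as the query above; the response sample and the films_by_director helper are hypothetical names used only for illustration.

```python
# Hand-written sample in the shape returned by the allFilms query above
# (an assumption for illustration; a real response contains more films).
response = {
    "data": {
        "allFilms": {
            "films": [
                {"title": "A New Hope", "director": "George Lucas", "releaseDate": "1977-05-25"},
                {"title": "The Empire Strikes Back", "director": "Irvin Kershner", "releaseDate": "1980-05-17"},
                {"title": "The Phantom Menace", "director": "George Lucas", "releaseDate": "1999-05-19"},
            ]
        }
    }
}


def films_by_director(resp, director):
    """Return only the films whose director matches, filtering client-side."""
    films = resp["data"]["allFilms"]["films"]
    return [film for film in films if film["director"] == director]


lucas_films = films_by_director(response, "George Lucas")
titles = [film["title"] for film in lucas_films]
```

The server still returns every film; the list comprehension only narrows the result after it arrives.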

Subqueries to filter out in rethinkdb

How do I write an equivalent statement in RethinkDB using the Python client driver?
SELECT id fields FROM tasks WHERE id NOT IN (SELECT id FROM finished_tasks)
This is what I tried:
r.table('tasks').filter(lambda row: r.not(row['id'] in r.table('finished_tasks').pluck("id").coerce_to('array').run()
In JavaScript:
r.table("tasks").filter(function(task){
return r.expr(r.table("finished_tasks").pluck("id")).map(function(i){
return i("id");
}).coerceTo('array')
.contains(task("id"))
.not();
})
In Python it should look much the same.
I don't have a Python example, so I'm giving a JavaScript one; you can compare it against the API docs to write the Python equivalent.
Assuming that id is also the primary key of the finished_tasks table:
r.table('tasks').filter(function(task) {
return r.table('finished_tasks').get(task('id')).eq(null)
})
If id isn't the primary key of finished_tasks, create a secondary index on it and use that index in getAll:
// Create index
r.table('finished_tasks').indexCreate('finished_task', r.row('id'))
// Using index for efficient query
r.table('tasks').filter(function(task) {
return r.table('finished_tasks').getAll(task('id'), {index: 'finished_task'}).count().eq(0)
})

Removing a field and updating another field in a document

Is it possible to remove a field from a document and update another field in the same document in one query?
Afaik, to remove a field, you have to use a replace query, like so:
r.db("db").table("table").get("some-id").replace(r.row.without("field-to-remove"))
And to update:
r.db("db").table("table").get("some-id").update({ "field-to-update": "new-value" })
But chaining these two together doesn't work. I get a "RqlRuntimeError: Expected type SELECTION but found DATUM" error when running the following query (the order of the replace/update doesn't matter):
r.db("db").table("table").get("some-id").replace(r.row.without("field-to-remove")).update({ "field-to-update": "new-value" })
Try:
r.db('db').table('table').get('id').update({
"field-to-remove": r.literal(),
"field-to-update": "new-value"
})
You don't need to use replace here since you don't care about explicitly setting the other fields.
You can use replace with without and merge inside of your replace function:
r.table('30514947').get("492a41d2-d7dc-4440-8394-3633ae8ac337")
.replace(function (row) {
return row
.without("remove_field")
.merge({
"field-to-update": "hello"
})
})

RethinkDB filter and retrieve value from nested array

Using the following query:
r.db('somedb').table('sometable')('users')
I get the following data from the result:
[
   [
      {
         "fn": "dpw",
         "u": "usertwo"
      },
      {
         "fn": "dwd",
         "u": "userone"
      }
   ]
]
I would like to take the field "u", specify, let's say, "usertwo", and get the value of "fn" for that "u". I want the result filtered using ReQL, rather than parsing the JSON result in Node.js, as the result will eventually be enormous. What would be the best and most efficient approach? I am new to RethinkDB and would appreciate it if you could explain the answer as best you can.
I'm not sure exactly what you want, but from my understanding, this is what you are looking for:
r.db('somedb').table('sometable')('users').filter(function(user) {
return user("u").eq("usertwo")
})("fn")
You seem to have an array of arrays of users. If that was not a typo, the query should probably be:
r.db('somedb').table('sometable')('users').nth(0).filter(function(user) {
return user("u").eq("usertwo")
})("fn")

RethinkDB index for filter + orderby

Let's say a comments table has the following structure:
id | author | timestamp | body
I want to use an index to efficiently execute the following query:
r.table('comments').getAll("me", {index: "author"}).orderBy('timestamp').run(conn, callback)
Is there another efficient method I can use?
It looks like an index is currently not supported for a filtered result of a table. When creating an index for timestamp and adding it as a hint in orderBy('timestamp', {index: timestamp}), I get the following error:
RqlRuntimeError: Indexed order_by can only be performed on a TABLE. in:
This can be accomplished with a compound index on the "author" and "timestamp" fields. You can create such an index like so:
r.table("comments").index_create("author_timestamp", lambda x: [x["author"], x["timestamp"]])
Then you can use it to perform the query like so:
r.table("comments")
.between(["me", r.minval], ["me", r.maxval])
.order_by(index="author_timestamp")
The between works like the get_all did in your original query because it gets only documents that have the author "me" and any timestamp. Then we do an order_by on the same index, which orders by the timestamp (since all of the keys have the same author). The key here is that you can only use one index per table access, so we need to cram all this information into the same index.
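The reasoning above can be modelled in plain Python, without RethinkDB: a compound index behaves like a list of [author, timestamp] keys kept in sorted order, so all keys for one author form a contiguous range that is already ordered by timestamp. This is only a sketch of the idea; the sample comments and the comments_by_author helper are made up for illustration and are not part of any RethinkDB API.

```python
from bisect import bisect_left, bisect_right

# Made-up comments; in RethinkDB these would be documents in the table.
comments = [
    {"id": 1, "author": "me", "timestamp": 30, "body": "third"},
    {"id": 2, "author": "you", "timestamp": 10, "body": "other"},
    {"id": 3, "author": "me", "timestamp": 10, "body": "first"},
    {"id": 4, "author": "me", "timestamp": 20, "body": "second"},
]

# Model of the compound index: (author, timestamp, id) keys in sorted order.
index = sorted((c["author"], c["timestamp"], c["id"]) for c in comments)
by_id = {c["id"]: c for c in comments}


def comments_by_author(author):
    """Range-scan the sorted keys for one author; rows come out in timestamp order."""
    # (author,) sorts before every (author, timestamp, id) key for that author.
    lo = bisect_left(index, (author,))
    # (author, inf) sorts after every key for that author with a finite timestamp.
    hi = bisect_right(index, (author, float("inf")))
    return [by_id[doc_id] for _, _, doc_id in index[lo:hi]]


bodies = [c["body"] for c in comments_by_author("me")]
```

The two bisect calls play the role of the lower and upper bounds passed to between, and no separate sort step is needed because the slice is already in key order.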
It's currently not possible to chain a getAll with an orderBy and use an index for both.
Ordering with an index can be done only on a table right now.
NB: The command to orderBy with an index is orderBy({index: 'timestamp'}) (no need to repeat the key)
The answer by Joe Doliner was selected, but it seems wrong to me.
First, no index was specified in the between command, so between will use the primary index.
Second, between returns a selection:
table.between(lowerKey, upperKey[, {index: 'id', leftBound: 'closed', rightBound: 'open'}]) → selection
and orderBy cannot use an index on a selection; only a table can be ordered by an index:
table.orderBy([key1...], {index: index_name}) → selection<stream>
selection.orderBy(key1, [key2...]) → selection<array>
sequence.orderBy(key1, [key2...]) → array
You want to create what's called a "compound index." After that, you can query it efficiently.
//create compound index
r.table('comments')
.indexCreate(
'author__timestamp', [r.row("author"), r.row("timestamp")]
)
//the query
r.table('comments')
.between(
['me', r.minval],
['me', r.maxval],
{index: 'author__timestamp'}
)
.orderBy({index: r.desc('author__timestamp')}) //or "r.asc"
.skip(0) //pagi
.limit(10) //nation!
I like using two underscores for compound indexes. It's just stylistic. Doesn't matter how you choose to name your compound index.
Reference: How to use getall with orderby in RethinkDB
