RethinkDB: Know which records were updated

Is there any way to update a sequence and know the primary keys of the updated documents?
table.filter({some:"value"}).update({something:"else"})
Then know the primary keys of the records that were updated without needing a second query?

It's currently not possible to return multiple values with {returnVals: true}, see for example https://github.com/rethinkdb/rethinkdb/issues/1382
There is, however, a way to trick the system with forEach:
r.db('test').table('test').filter({some: "value"}).forEach(function(doc) {
  return r.db('test').table('test').get(doc('id')).update({something: "else"}, {returnVals: true}).do(function(result) {
    return {generated_keys: [result("new_val")]}
  })
})("generated_keys")
While it works, it is a hack. Hopefully, once array limits are in place, returnVals will become available for range writes.
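For readers on a later driver, a minimal sketch of the same goal follows. It assumes a RethinkDB version where update accepts {returnChanges: true} on multi-document writes and reports a changes array of {old_val, new_val} pairs. The helper name updatedIds and the sample result are hypothetical, and the snippet only models post-processing of the write result in plain JavaScript.

```javascript
// Hypothetical helper: collect primary keys from a write result.
// Assumed result shape (as reported with {returnChanges: true}):
//   { replaced: n, changes: [{ old_val: {...}, new_val: {...} }, ...] }
function updatedIds(result, primaryKey = 'id') {
  return (result.changes || [])
    .filter(change => change.new_val != null)   // skip deletions, if any
    .map(change => change.new_val[primaryKey]);
}

// Sample result, made up for illustration.
const sampleResult = {
  replaced: 2,
  changes: [
    { old_val: { id: 1, some: 'value' },
      new_val: { id: 1, some: 'value', something: 'else' } },
    { old_val: { id: 7, some: 'value' },
      new_val: { id: 7, some: 'value', something: 'else' } }
  ]
};

console.log(updatedIds(sampleResult));
```

Under that assumption, the query itself would be the original filter and update with {returnChanges: true} passed as the options argument, and no second query would be needed.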

Related

Proper Upsert (Atomic Update Counter Field or Insert Document) with RethinkDB

After looking at some SO questions and issues on the RethinkDB GitHub, I could not reach a clear conclusion on whether an atomic upsert is possible.
Essentially I would like to perform the same operation as ZINCRBY using Redis.
If member does not exist in the sorted set, it is added with increment
as its score (as if its previous score was 0.0). If key does not
exist, a new sorted set with the specified member as its sole member
is created.
The current implementation appears to differ from almost all databases that I have used, with the data being replaced or inserted rather than updated. This is a simple use case, such as updating the last visit, the number of clicks, or a product quantity. I must be missing something very obvious, because I cannot see a simple way to do this.
Yes, it is possible. After get on the key, perform an atomic replace. Something like this might work:
function set_or_increment_score(player, points) {
  return r.table('scores').get(player).replace(
    // Parentheses are required so the arrow function returns an object literal
    row => ({
      id: player,
      score: r.branch(
        row.eq(null),
        points,
        row('score').add(points))
    }));
}
It has the following behaviour:
> set_or_increment_score("alice", 1).run(conn)
{ inserted: 1 }
> set_or_increment_score("alice", 2).run(conn)
{ replaced: 1 }
It works because get returns null when the document doesn't exist, and a replace on a non-existing document turns into an insert. See the documentation for replace.
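The branch logic can be checked without a server. The following is a plain-JavaScript model of the function passed to replace, not the ReQL query itself; the name nextScoreDoc is hypothetical.

```javascript
// Plain-JavaScript model of the function passed to replace().
// `row` is null when get() found no document, otherwise the stored document.
function nextScoreDoc(row, player, points) {
  return {
    id: player,
    score: row === null ? points : row.score + points
  };
}

const first = nextScoreDoc(null, 'alice', 1);   // no document yet: acts as an insert
const second = nextScoreDoc(first, 'alice', 2); // document exists: score is incremented
console.log(first, second);
```

In the real query the whole read-modify-write happens inside a single replace on one document, which is what makes it atomic; this model only shows the value that replace would store.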
I ended up using the following code to work around the missing update-or-insert:
r.db("test").table("t").insert(
  {id:"A", type:"player", species:"warrior", score:0, xp:0, armor:0},
  {conflict: function(id, oldDoc, newDoc) {
    return newDoc.merge(oldDoc).merge(
      {armor: oldDoc("armor").add(1)});
  }}
)
Do you think this is more readable/elegant or do you see any issues with the code compared to your sample?
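To make the effect of the conflict function above concrete, here is a plain-JavaScript model of it. It approximates ReQL merge with object spread, which is an assumption that only holds for flat documents like these; resolveConflict and the sample documents are made up for illustration.

```javascript
// Model of the conflict function: newDoc.merge(oldDoc).merge({armor: old + 1}).
// Object spread stands in for ReQL merge(); valid here because the docs are flat.
function resolveConflict(id, oldDoc, newDoc) {
  return { ...newDoc, ...oldDoc, armor: oldDoc.armor + 1 };
}

// Made-up stored document and the defaults passed to insert().
const stored = { id: 'A', type: 'player', species: 'warrior', score: 12, xp: 40, armor: 3 };
const incoming = { id: 'A', type: 'player', species: 'warrior', score: 0, xp: 0, armor: 0 };

const resolved = resolveConflict('A', stored, incoming);
console.log(resolved);
```

Because oldDoc is merged over newDoc, every stored field wins over the inserted defaults, and only armor changes on a conflict.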

How to get a list of all keys across all documents in a RethinkDB table?

I have a dynamically populated table in which documents can have different keys that are not known in advance:
Document 1
{
'attribute1': 'foo',
'attribute2': 'bar'
}
Document 2
{
'attribute1': 'foo',
'attribute3': 'baz'
}
How can I get a list of all attributes present in all documents?
attribute1
attribute2
attribute3
I've tried grouping by keys() but I get a list of the possible attribute combinations, not the individual keys.
While this isn't fast if you have a lot of documents, it will eventually finish and won't consume a lot of memory:
r.table('table')
  .map(r.row.keys())
  .reduce(function(left, right) {
    return left.setUnion(right)
  })
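The map and setUnion steps above can be mirrored in plain JavaScript to see what the query returns. This is only a model over an in-memory array, and allKeys is a hypothetical name.

```javascript
// Model of map(keys).reduce(setUnion) over an in-memory array of documents.
function allKeys(docs) {
  return docs
    .map(doc => Object.keys(doc))
    // The [] initial value only makes this model safe for empty input;
    // the ReQL query above has no initial value.
    .reduce((left, right) => [...new Set([...left, ...right])], []);
}

const docs = [
  { attribute1: 'foo', attribute2: 'bar' },
  { attribute1: 'foo', attribute3: 'baz' }
];

console.log(allKeys(docs));
```

Only the running set of keys is carried between steps, which is why memory use stays small even though every document is visited.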
It will be slow, but you can write something like table.concatMap(function(row) { return row.keys(); }).distinct().
I'm not sure there's a solution more efficient than O(n), unless you maintain some custom metadata on every data update. Still, I'd go with:
table.reduce(function(left, right) {
  return left.merge(right)
}).keys()
You can simply create a secondary index of type multi:
r.table('foo').indexCreate('all_keys', function(d) {
  return d.without('id').keys()
}, {multi: true})
And to get all the keys, just run:
r.table('foo').distinct({index: 'all_keys'})
Voila ;-)

Rethinkdb - filtering by value in another table

In our RethinkDB database, we have a table for orders, and a separate table that stores all the order items. Each entry in the OrderItems table has the orderId of the corresponding order.
I want to write a query that gets all SHIPPED order items (just the items from the OrderItems table ... I don't want the whole order). But whether the order is "shipped" is stored in the Order table.
So, is it possible to write a query that filters the OrderItems table based on the "shipped" value for the corresponding order in the Orders table?
If you're wondering, we're using the JS version of Rethinkdb.
UPDATE:
OK, I figured it out on my own! Here is my solution. I'm not positive that it is the best way (and certainly isn't super efficient), so if anyone else has ideas I'd still love to hear them.
I did it by running a .merge() to create a new field based on the Order table, then did a filter based on that value.
A semi-generalized query with filter from another table for my problem looks like this:
r.table('orderItems')
  .merge(function(orderItem) {
    return {
      // Pluck just the "shipped" value, since I don't want the entire order
      orderShipped: r.table('orders').get(orderItem('orderId')).pluck('shipped')
    }
  })
  .filter(function(orderItem) {
    // Filter based on that new "shipped" value
    return orderItem('orderShipped')('shipped').gt(0)
  })
This can be done more simply:
r.table('orderItems').filter(function(orderItem) {
  return r.table('orders').get(orderItem('orderId'))('shipped').default(0).gt(0)
})
To avoid a null result, add .default(0) as shown.
It's probably better to create a proper index before querying. Without an index, you cannot find documents in a table with more than 100,000 elements.
Also, filter does not use secondary indexes.
A proper way is to use getAll and concatMap.
First, create index:
r.table("orderItems").indexCreate("orderId")
r.table("orders").indexCreate("shipStatus", r.row("shipped").default(0).gt(0))
With that index, we can find all shipped orders:
r.table("orders").getAll(true, {index: "shipStatus"})
Now, we will use concatMap to turn each shipped order into its order items:
r.table("orders")
  .getAll(true, {index: "shipStatus"})
  .concatMap(function(order) {
    return r.table("orderItems").getAll(order("id"), {index: "orderId"}).coerceTo("array")
  })
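The index-based approach can be pictured in plain JavaScript. In this sketch a Map stands in for the "orderId" secondary index and a filter stands in for the "shipStatus" index lookup; the sample data and variable names are invented for illustration.

```javascript
// Made-up sample data.
const orders = [
  { id: 1, shipped: 1 },
  { id: 2, shipped: 0 },
  { id: 3 }              // no "shipped" field: treated as 0, like .default(0)
];
const orderItems = [
  { id: 'a', orderId: 1 },
  { id: 'b', orderId: 1 },
  { id: 'c', orderId: 2 },
  { id: 'd', orderId: 3 }
];

// Stand-in for the "orderId" secondary index on orderItems.
const itemsByOrderId = new Map();
for (const item of orderItems) {
  if (!itemsByOrderId.has(item.orderId)) itemsByOrderId.set(item.orderId, []);
  itemsByOrderId.get(item.orderId).push(item);
}

// Stand-in for getAll(true, {index: "shipStatus"}) followed by concatMap.
const shippedItems = orders
  .filter(order => (order.shipped ?? 0) > 0)
  .flatMap(order => itemsByOrderId.get(order.id) || []);

console.log(shippedItems.map(item => item.id));
```

The design point is that the work is driven from the shipped orders, with one indexed lookup per order, instead of one order lookup per order item.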

RethinkDB: order / index by version - build

I have records with the following field: { version: "2.4.1-5dev" }. I want to order / index them by version. The documents should be ordered by the version / build combination (ascending by their partially-numeric values). Is it possible, and if so, how can I do that?
edit:
I still can't index/order by version. Again, I want to be able to sort by version, even if there are words like "dev" in them.
In Python, there's pkg_resources.parse_version(), which lets you compare two versions with pkg_resources.parse_version(ver1) > pkg_resources.parse_version(ver2), and it works even for somewhat crazy version naming.
Is there any chance I can use pkg_resources.parse_version() as a cmp function for indexing / ordering, or alternatively, get the same result within a query when trying to order documents by the version field?
You can create an index with a function. In JavaScript it would be something like:
r.table("product").indexCreate("version", function(product) {
  return r.branch(
    product("version").match('dev'),
    null,
    product("version").split('.').concatMap(function(version) {
      return version.split('-')
    }).map(function(num) {
      return num.coerceTo('NUMBER')
    })
  )
})
That works because null is currently not stored in secondary indexes, this behavior may change though -- See https://github.com/rethinkdb/rethinkdb/issues/1032
The third argument to r.branch splits the value on . and -, then coerces each part to a number. For example:
r.expr("2.4.1-5").split('.').concatMap(r.row.split('-')).map(r.row.coerceTo('NUMBER'))
// will return [2,4,1,5]
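A plain-JavaScript sketch of the same key function shows why the numeric array sorts better than the raw string. It assumes that array keys are compared element by element; versionKey and compareKeys are hypothetical names, and versions containing "dev" get a null key, as in the index function above.

```javascript
// Model of the index function: "2.4.1-5" -> [2, 4, 1, 5]; "dev" versions -> null.
function versionKey(version) {
  if (version.includes('dev')) return null;
  return version
    .split('.')
    .flatMap(part => part.split('-'))
    .map(Number);
}

// Element-by-element comparison of two numeric arrays (assumed index ordering).
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

const versions = ['2.10.0-1', '2.4.1-5', '2.4.1-12'];
const sorted = [...versions].sort((x, y) => compareKeys(versionKey(x), versionKey(y)));
console.log(sorted);
```

A plain string sort would place "2.10.0-1" before "2.4.1-5" and "2.4.1-12" before "2.4.1-5"; comparing numeric parts avoids both problems. Ordering the "dev" versions as well would need a different key, which this sketch does not attempt.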

How do I get unique field values using rethinkdb javascript?

I have a field whose value repeats across documents. For example, {country : 'US'} occurs multiple times in the table, and likewise for other countries. I want to return an array containing the distinct values of the 'country' field. I am new to creating databases, so this is likely a trivial question, but I couldn't find anything useful in the RethinkDB API.
Thanks
You can use distinct, but the distinct command was created for short sequences only.
If you have a lot of data, you can use map/reduce
r.table("data").map(function(doc) {
  return r.object(doc("country"), true) // return { <country>: true}
}).reduce(function(left, right) {
  return left.merge(right)
}).keys() // return all the keys of the final document
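The same map/reduce idea can be tried on an in-memory array. This is a plain-JavaScript model rather than ReQL; distinctValues and the sample data are made up for illustration.

```javascript
// Model of map -> {<value>: true}, reduce(merge), keys().
function distinctValues(docs, field) {
  const merged = docs
    .map(doc => ({ [doc[field]]: true }))                   // one single-key object per document
    .reduce((left, right) => ({ ...left, ...right }), {});  // duplicate keys collapse on merge
  return Object.keys(merged);
}

const data = [
  { country: 'US' },
  { country: 'FR' },
  { country: 'US' },
  { country: 'IN' }
];

console.log(distinctValues(data, 'country'));
```

Duplicates disappear because an object can hold each key only once, so merging the single-key objects leaves one entry per distinct value.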
