Document concurrent update - rethinkdb

I have a document like:
{
owner: 'alex',
live: 'some guid'
}
Two or more users can update the live field simultaneously.
How can I make sure that only the first user wins and the other users' updates fail?

You can get the semantics you want if you store a counter such as timesUpdated in the document. Operations on a single document are atomic, so you can check that the field has the value you expect, and throw an error if it doesn't.
It might look something like:
var timesUpdated = 3
r.table('foo').get(rowId).update(function(row) {
return r.branch(row('timesUpdated').eq(timesUpdated),
{
timesUpdated: row('timesUpdated').add(1),
live: 'some special value'
},
r.error('Someone else updated the live field!')
);
}, {returnChanges: true})
So if another query comes in before you for timesUpdated = 3, your query will blow up. When do you get timesUpdated? That depends on how your app is designed, and what you're trying to do.
Another thing to note is that adding {returnChanges: true} is really useful because it allows you to get the new value of timesUpdated atomically. You can also see what exactly changed in the updated document.
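To make the "first writer wins" behaviour concrete, here is a minimal in-memory sketch of the same check. It is plain JavaScript, not RethinkDB driver code; the updateLive helper and the result shape are made up for illustration.

```javascript
// In-memory stand-in for a single document; not RethinkDB driver code.
const doc = { owner: 'alex', live: 'some guid', timesUpdated: 3 };

// Mirrors the branch in the query: apply the write only if the caller's
// expected counter matches the stored one, otherwise report a conflict.
function updateLive(document, expectedTimesUpdated, newLive) {
  if (document.timesUpdated !== expectedTimesUpdated) {
    return { ok: false, error: 'Someone else updated the live field!' };
  }
  document.timesUpdated += 1;
  document.live = newLive;
  return { ok: true, newVal: { ...document } };
}

// Two writers both read timesUpdated = 3 before either one writes.
const first = updateLive(doc, 3, 'guid from user A');
const second = updateLive(doc, 3, 'guid from user B');
```

The first call succeeds and bumps the counter, so the second call, still holding the stale value, is rejected. In the real query the database performs this comparison atomically on the document.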

Related

Firestore transaction produces console error: FAILED_PRECONDITION: the stored version does not match the required base version

I have written a bit of code that allows a user to upvote / downvote recipes in a manner similar to Reddit.
Each individual vote is stored in a Firestore collection named votes, with a structure like this:
{username,recipeId,value} (where value is either -1 or 1)
The recipes are stored in the recipes collection, with a structure somewhat like this:
{title,username,ingredients,instructions,score}
Each time a user votes on a recipe, I need to record their vote in the votes collection, and update the score on the recipe. I want to do this as an atomic operation using a transaction, so there is no chance the two values can ever become out of sync.
Following is the code I have so far. I am using Angular 6, however I couldn't find any Typescript examples showing how to handle multiple gets() in a single transaction, so I ended up adapting some Promise-based JavaScript code that I found.
The code seems to work, but there is something happening that is concerning. When I click the upvote/downvote buttons in rapid succession, some console errors occasionally appear. These read POST https://firestore.googleapis.com/v1beta1/projects/myprojectname/databases/(default)/documents:commit 400 (). When I look at the actual response from the server, I see this:
{
"error": {
"code": 400,
"message": "the stored version (1534122723779132) does not match the required base version (0)",
"status": "FAILED_PRECONDITION"
}
}
Note that the errors do not appear when I click the buttons slowly.
Should I worry about this error, or is it just a normal result of the transaction retrying? As noted in the Firestore documentation, a "function calling a transaction (transaction function) might run more than once if a concurrent edit affects a document that the transaction reads."
Note that I have tried wrapping try/catch blocks around every single operation below, and there are no errors thrown. I removed them before posting for the sake of making the code easier to follow.
Very interested in hearing any suggestions for improving my code, regardless of whether they're related to the HTTP 400 error.
async vote(username, recipeId, direction) {
let value;
if ( direction == 'up' ) {
value = 1;
}
if ( direction == 'down' ) {
value = -1;
}
// assemble vote object to be recorded in votes collection
const voteObj: Vote = { username: username, recipeId: recipeId , value: value };
// get references to both vote and recipe documents
const voteDocRef = this.afs.doc(`votes/${username}_${recipeId}`).ref;
const recipeDocRef = this.afs.doc('recipes/' + recipeId).ref;
await this.afs.firestore.runTransaction( async t => {
const voteDoc = await t.get(voteDocRef);
const recipeDoc = await t.get(recipeDocRef);
const currentRecipeScore = recipeDoc.get('score'); // snapshot reads are synchronous, no await needed
if (!voteDoc.exists) {
// This is a new vote, so add it to the votes collection
// and apply its value to the recipe's score
t.set(voteDocRef, voteObj);
t.update(recipeDocRef, { score: (currentRecipeScore + value) });
} else {
const voteData = voteDoc.data();
if ( voteData.value == value ) {
// existing vote is the same as the button that was pressed, so delete
// the vote document and revert the vote from the recipe's score
t.delete(voteDocRef);
t.update(recipeDocRef, { score: (currentRecipeScore - value) });
} else {
// existing vote is the opposite of the one pressed, so update the
// vote doc, then apply it to the recipe's score by doubling it.
// For example, if the current score is 1 and the user reverses their
// +1 vote by pressing -1, we apply -2 so the score will become -1.
t.set(voteDocRef, voteObj);
t.update(recipeDocRef, { score: (currentRecipeScore + (value*2))});
}
}
return Promise.resolve(true);
});
}
According to Firebase developer Nicolas Garnier, "What you are experiencing here is how Transactions work in Firestore: one of the transactions failed to write because the data has changed in the mean time, in this case Firestore re-runs the transaction again, until it succeeds. In the case of multiple Reviews being written at the same time some of them might need to be ran again after the first transaction because the data has changed. This is expected behavior and these errors should be taken more as warnings."
In other words, this is a normal result of the transaction retrying.
I used RxJS throttleTime to prevent the user from flooding the Firestore server with transactions by clicking the upvote/downvote buttons in rapid succession, and that greatly reduced the occurrences of this 400 error. In my app, there's no legitimate reason someone would need to click upvote/downvote dozens of times per second. It's not a video game.
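For anyone not using RxJS, the throttling idea can be sketched in plain JavaScript. This is a rough equivalent of a leading-edge throttle, not the RxJS implementation; the throttle helper, the injectable clock, and the throttledVote name are assumptions for the sake of the example.

```javascript
// Leading-edge throttle: run fn at most once per intervalMs, drop the rest.
// `now` is injectable so the behaviour can be checked without real timers.
function throttle(fn, intervalMs, now = Date.now) {
  let lastRun = -Infinity;
  return function (...args) {
    const t = now();
    if (t - lastRun < intervalMs) {
      return false; // call dropped, still inside the quiet window
    }
    lastRun = t;
    fn.apply(this, args);
    return true;
  };
}

// Hypothetical usage: at most one vote transaction per second.
let fakeTime = 0;
const sent = [];
const throttledVote = throttle((direction) => sent.push(direction), 1000, () => fakeTime);
throttledVote('up');   // runs
fakeTime = 200;
throttledVote('up');   // dropped
fakeTime = 1200;
throttledVote('down'); // runs
```

In the app, the wrapped function would be the one that starts the Firestore transaction, so rapid clicks never reach the server.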

How to update item conditionally with branch in RethinkDB

I am trying to do a simple upsert to an array field based on a branch condition. However, branch does not accept a ReQL expression as an argument, and I get the error Expected type SELECTION but found DATUM.
This is probably some obvious thing I've missed, however I can't find any working example anywhere.
Sample source:
var userId = 'userId';
var itemId = 'itemId';
r.db('db').table('items').get(itemId).do(function(item) {
return item('elements').default([]).contains(function (element) {
return element('userId').eq(userId);
}).branch(
r.expr("Element already exist"),
//Error: Expected type SELECTION but found DATUM
item.update({
elements: item('elements').default([]).append({
userId: 'userId'
})
})
)
})
The problem here is that item is a datum, not a selection. This happens because you used r.do. The variable doesn't retain information about where the object originally came from.
A solution that might seem to work would be to write a new r.db('db').table('items').get(itemId) expression inside the branch. The problem with that option is that the behavior isn't atomic: two different queries might append the same element to the 'elements' array. Instead you should write your query in the form r.db('db').table('items').get(itemId).update(function(item) { return <something>; }) so that the update gets applied atomically.
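As a rough illustration of what the function passed to update has to decide, here is the same logic in plain JavaScript over an in-memory object. It is only a sketch: elementsPatch is a made-up name, and in ReQL the condition would be expressed with r.branch inside the update function rather than a JavaScript ternary.

```javascript
// Given the current item, return the patch to merge into it: an empty patch
// when the user is already present, otherwise the elements array with the
// user appended. A missing elements field is treated as an empty array.
function elementsPatch(item, userId) {
  const elements = item.elements || [];
  const alreadyThere = elements.some((el) => el.userId === userId);
  return alreadyThere ? {} : { elements: [...elements, { userId }] };
}

const item = { id: 'itemId' };
const firstPatch = elementsPatch(item, 'userId');
const merged = { ...item, ...firstPatch };
const secondPatch = elementsPatch(merged, 'userId');
```

The first call produces a patch that adds the element, and the second call produces an empty patch, which is the "nothing to change" case. Because the real update runs this decision against the stored document in one step, two concurrent queries cannot both append.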

Proper Upsert (Atomic Update Counter Field or Insert Document) with RethinkDB

After looking at some SO questions and issues on the RethinkDB GitHub, I failed to come to a clear conclusion on whether an atomic upsert is possible.
Essentially I would like to perform the same operation as ZINCRBY using Redis.
If member does not exist in the sorted set, it is added with increment
as its score (as if its previous score was 0.0). If key does not
exist, a new sorted set with the specified member as its sole member
is created.
The current implementation appears to differ from almost all databases that I have used, with the data being replaced or inserted rather than updated. This is a simple use case, such as updating the last visit, the number of clicks, or a product quantity. So I must be missing something very obvious, because I cannot see a simple way to do this.
Yes, it is possible. After get on the key, perform an atomic replace. Something like this might work:
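function set_or_increment_score(player, points) {
  return r.table('scores').get(player).replace(
    // The object literal is wrapped in parentheses so the arrow function
    // returns it instead of treating the braces as a function body.
    row => ({
      id: player,
      score: r.branch(
        row.eq(null),
        points,
        row('score').add(points))
    }));
}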
It has the following behaviour:
> set_or_increment_score("alice", 1).run(conn)
{ inserted: 1 }
> set_or_increment_score("alice", 2).run(conn)
{ replaced: 1 }
It works because get returns null when the document doesn't exist, and a replace on a non-existent document turns into an insert. See the documentation for replace.
So I ended up using the following code to work around the missing update behaviour.
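The insert-or-increment behaviour can be mimicked in memory to see why the null check matters. This sketch uses a JavaScript Map as a stand-in for the table; setOrIncrementScore and the returned summary objects are illustrative only and are not the driver's API.

```javascript
// A Map keyed by player id stands in for the 'scores' table.
const scores = new Map();

// Same decision as the replace function: a missing row (null) starts at
// `points`, an existing row has `points` added to its score.
function setOrIncrementScore(table, player, points) {
  const row = table.has(player) ? table.get(player) : null;
  const next = { id: player, score: row === null ? points : row.score + points };
  table.set(player, next);
  return row === null ? { inserted: 1 } : { replaced: 1 };
}

const r1 = setOrIncrementScore(scores, 'alice', 1);
const r2 = setOrIncrementScore(scores, 'alice', 2);
```

After the two calls the stored score for alice is 3. The real query gets its atomicity from running the whole function against a single document inside replace.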
r.db("test").table("t").insert(
{id:"A", type:"player", species:"warrior", score:0, xp:0, armor:0},
{conflict: function(id, oldDoc, newDoc) {
return newDoc.merge(oldDoc).merge(
{armor: oldDoc("armor").add(1)});
}
}
)
Do you think this is more readable/elegant or do you see any issues with the code compared to your sample?

RethinkDB: Cannot call `changes` on an eager stream

I have a table of users who each have an array of friends.
A document in it looks something like this:
{
id: "0ab43d81-b883-424a-be56-32f9ff98f7d2",
username: "testUser1234",
friends: [
"04423c56-1890-4028-b38a-cb9aff7112de" ,
"05e4e613-2131-408c-b0ae-a952f3007405" ,
"0395ee53-8ab0-48cc-aa4e-41aad93b8737"
]
}
I want to watch for changes on a user's friends. A query like this will get me a list of friends:
r.db("Test").table("Users").get("0ab43d81-b883-424a-be56-32f9ff98f7d2")("friends").map(function(id) {
return r.db("Test").table("Users").get(id);
})
But, when I try to throw a .changes() on the end, RethinkDB tells me that it won't work:
RqlRuntimeError: Cannot call `changes` on an eager stream in:
r.db("Test").table("Users").get("0ab43d81-b883-424a-be56-32f9ff98f7d2")("friends").map(function(var_19) { return r.db("Test").table("Users").get(var_19); }).changes()
Is there any way to get this to work? I am afraid that my only alternative is to subscribe to the friends list (in my app) and update the subscription to the actual friends when it changes:
r.db("Test").table("Users").getAll(friendId1, friendId2 , friendId3, friendId4).changes()
Not the end of the world, but I was really excited about being able to do it entirely in the DB.
Also, can anyone explain what an "eager stream" is? I think it has something to do with lazy vs. immediate evaluation, but I have no idea what criteria determine whether a stream is eager or not.
I can get the query working with the following formulation, inspired by this post:
r.db("Test").table('Users').getAll(r.args(
r.db('Test').table('Users').get("0ab43d81-b883-424a-be56-32f9ff98f7d2")('friends')
)).changes()
You can attach the .changes before some of the transformations.
r.db("Test")
.table("Users")
.get("0ab43d81-b883-424a-be56-32f9ff98f7d2")
.changes()
.getField('new_val')('friends')
.map(function(id) {
return r.db("Test").table("Users").get( id );
})
Basically, every time there is a change, the map function is executed. At the moment, that is the only way to do this type of operation with .changes, but that will change in upcoming versions of RethinkDB.

CouchDB Views: remove duplicates *and* order by time

Based on a great answer to my previous question, I've partially solved a problem I'm having with CouchDB.
This resulted in a new view.
Now, the next thing I need to do is remove duplicates from this view while ordering by date.
For example, here is how I might query that view:
GET http://scoates-test.couchone.com/follow/_design/asset/_view/by_userid_following?endkey=[%22c988a29740241c7d20fc7974be05ec54%22]&startkey=[%22c988a29740241c7d20fc7974be05ec54%22,{}]&descending=true&limit=3
Resulting in this:
HTTP 200 http://scoates-test.couchone.com/follow/_design/asset/_view/by_userid_following
http://scoates-test.couchone.com > $_.json.rows
[ { id: 'c988a29740241c7d20fc7974be067295'
, key:
[ 'c988a29740241c7d20fc7974be05ec54'
, '2010-11-26T17:00:00.000Z'
, 'clementine'
]
, value:
{ _id: 'c988a29740241c7d20fc7974be062ee8'
, owner: 'c988a29740241c7d20fc7974be05f67d'
}
}
, { id: 'c988a29740241c7d20fc7974be068278'
, key:
[ 'c988a29740241c7d20fc7974be05ec54'
, '2010-11-26T15:00:00.000Z'
, 'durian'
]
, value:
{ _id: 'c988a29740241c7d20fc7974be065115'
, owner: 'c988a29740241c7d20fc7974be060bb4'
}
}
, { id: 'c988a29740241c7d20fc7974be068026'
, key:
[ 'c988a29740241c7d20fc7974be05ec54'
, '2010-11-26T14:00:00.000Z'
, 'clementine'
]
, value:
{ _id: 'c988a29740241c7d20fc7974be063b6d'
, owner: 'c988a29740241c7d20fc7974be05ff71'
}
}
]
As you can see, "clementine" shows up twice.
If I change the view to emit the fruit/asset name as the second key (instead of the time), I can change the grouping depth to collapse these, but that doesn't solve my order-by-time requirement. Similarly, with the above setup, I can order by time, but I can't collapse duplicate asset names into single rows (to allow e.g. 10 assets per page).
Unfortunately, this is not a simple question to explain. Maybe this chat transcript will help a little.
Please help. I'm afraid that what I need to do is still not possible.
S
You can do this using a list function. Here is an example that generates a really simple list containing all the owner fields without dupes. You can easily modify it to produce JSON or XML or anything you want.
Put it into your assets design doc under lists.nodupes and use it like this:
http://admin:123@127.0.0.1:5984/follow/_design/assets/_list/nodupes/by_userid_following_reduce?group=true
function(head, req) {
start({
"headers": {
"Content-Type": "text/html"
}
});
var row;
var dupes = [];
while(row = getRow()) {
if (dupes.indexOf(row.key[2]) == -1) {
dupes.push(row.key[2]);
send(row.value[0].owner+"<br>");
}
}
}
Ordering by one field and uniquing on another isn't something the basic map reduce can do. All it can do is sort your data, and apply reduce rollups to dynamic key-ranges.
To find the latest entry for each type of fruit, you'd need to query once per fruit.
There are some ways to do this that are kinda sane.
You'll want a view with keys like [fruit_type, date], and then you can query like this:
for fruit in fruits
GET /db/_design/foo/_view/bar?startkey=["apples",{}]&endkey=["apples"]&limit=1&descending=true
This will give you the latest entry for each fruit.
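As a sketch of the multi-query approach, the per-fruit URLs could be built like this. With descending=true the scan has to start from the top of each fruit's key range, which is why the sketch uses [fruit, {}] as the startkey and [fruit] as the endkey. The latestEntryUrl helper and the fruit list are hypothetical; the view path is the placeholder one from the answer.

```javascript
// Build one query URL per fruit against a view keyed by [fruit_type, date].
// In CouchDB collation an object sorts after any string, so [fruit, {}] is
// the high end of that fruit's range and [fruit] is the low end.
function latestEntryUrl(baseViewUrl, fruit) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([fruit, {}]),
    endkey: JSON.stringify([fruit]),
    descending: 'true',
    limit: '1',
  });
  return `${baseViewUrl}?${params.toString()}`;
}

const urls = ['apples', 'durian'].map((fruit) =>
  latestEntryUrl('/db/_design/foo/_view/bar', fruit)
);
```

Each URL returns at most one row, the newest entry for that fruit, so the client issues one small request per fruit.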
The list operation could be used to do this, it would just echo the first row from each fruit's block. This would be efficient enough as long as each fruit has a small number of entries. Once there are many entries per fruit, you'll be discarding more data than you echo, so the multi-query approach actually scales better than the list approach, when you get to a large data set. Luckily they can both work on the same view index, so when you have to switch it won't be a big deal.
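The "echo the first row per name" filter itself is small. Here it is as plain JavaScript over already-sorted rows, as a sketch that runs outside CouchDB; firstPerName is a made-up helper, and the sample rows use a shortened placeholder user id ('u1') with the timestamps from the question.

```javascript
// Rows are assumed to arrive already sorted (for example newest first).
// Keep the first row seen for each name, up to `limit` rows.
function firstPerName(rows, nameOf, limit = Infinity) {
  const seen = new Set();
  const out = [];
  for (const row of rows) {
    const name = nameOf(row);
    if (seen.has(name)) continue; // an earlier row for this name was kept
    seen.add(name);
    out.push(row);
    if (out.length >= limit) break;
  }
  return out;
}

const rows = [
  { key: ['u1', '2010-11-26T17:00:00.000Z', 'clementine'] },
  { key: ['u1', '2010-11-26T15:00:00.000Z', 'durian'] },
  { key: ['u1', '2010-11-26T14:00:00.000Z', 'clementine'] },
];
const unique = firstPerName(rows, (row) => row.key[2], 10);
```

Here the 14:00 clementine row is skipped because the 17:00 one was already kept. The cost is that every skipped row is still read, which is the scaling concern described above.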
