RethinkDB - Is this a valid optimistic locking implementation?

I am trying out a lot of new ideas (DDD, Event Sourcing and CQRS) and evaluating RethinkDB as a potential data store for the Domain Events. In DDD, an aggregate is a set of objects that work together to provide a specific behaviour. Each aggregate is a transactional/consistency boundary. The root of the aggregate is an object that provides an API and hides the internal implementation.
To persist an aggregate, it is usually recommended to use optimistic locking. The idea is to have a version number attribute in the aggregate, and when the time comes to save the aggregate, we check that the version in the database matches the version that was read/updated in the application. This guarantees that nobody changed the aggregate in the meantime and prevents overwriting other users' changes.
Obviously this version checking can't happen only in the application layer (think of a multiple-application-server scenario). The application needs support from the data store for atomic updates that take this version number into consideration.
Here is a simple implementation using the RethinkDB Ruby API.
I created a table called 'applicants' with one record:
{
  "id": "6b3b57a7-3ba8-4322-873e-1d6c8333daae",
  "name": "Homer Simpson",
  "updated_at": Mon Dec 28 2015 12:05:40 GMT+05:30,
  "version": 1
}
Here is the sample test code that I ran twice in parallel
require 'rethinkdb'
include RethinkDB::Shortcuts

conn = r.connect(:host => 'localhost',
                 :port => 28015,
                 :db   => 'test')

def update_applicant(conn, current_version)
  result = r.table('applicants').get('6b3b57a7-3ba8-4322-873e-1d6c8333daae').update { |applicant|
    r.branch(
      applicant['version'].eq(current_version),
      { updated_at: Time.now, version: current_version + 1 },
      {}
    )
  }.run(conn)
  fail 'optimistic locking failure' if result['unchanged'] == 1
rescue => e
  puts "optimistic locking failure: #{current_version}"
  current_version = r.table('applicants').get('6b3b57a7-3ba8-4322-873e-1d6c8333daae').run(conn)['version']
  retry
end

(1..100).each { |version| update_applicant(conn, version) }
conn.close
This seems to work, but I want to make sure there will be no race conditions or other issues with this approach in a production environment. I am assuming that update is an atomic operation and that using a branch inside the update still keeps it atomic.
I am looking for some validation and suggestions from RethinkDB devs/users. Thanks.

update is always an atomic operation unless you pass the non_atomic: true flag (which is sometimes necessary if the update contains a nondeterministic operation), so that code looks safe to me.
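As a minimal illustration of how the outcome of such a conditional update can be checked, here is a sketch using the RethinkDB Python driver rather than the Ruby one; the table name and document id are taken from the question, and the helper name is made up. The counters in the update result tell you whether the compare-and-swap won:

# Minimal sketch (assumes the RethinkDB Python driver and the table/id from the question).
from rethinkdb import RethinkDB

r = RethinkDB()
conn = r.connect(host='localhost', port=28015, db='test')

DOC_ID = '6b3b57a7-3ba8-4322-873e-1d6c8333daae'

def bump_version(conn, current_version):
    # Only bump the version if the stored version still matches what we read.
    result = r.table('applicants').get(DOC_ID).update(
        lambda applicant: r.branch(
            applicant['version'].eq(current_version),
            {'version': current_version + 1},
            {}
        )
    ).run(conn)
    # 'replaced' == 1 means our conditional write won; 'unchanged' == 1 means
    # another writer moved the version first, so the caller should re-read and retry.
    return result['replaced'] == 1

print(bump_version(conn, 1))
conn.close()

A replaced count of 1 means the conditional write succeeded; an unchanged count of 1 means another writer bumped the version first, which is exactly the retry case the Ruby code in the question handles.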

Related

Elasticsearch Delete by Query Version Conflict

I am using Elasticsearch version 5.6.10. I have a query that deletes records for a given agency, so they can later be updated by a nightly script.
The query is written with elasticsearch-dsl and looks like this:
def remove_employees_from_search(jurisdiction_slug, year):
    s = EmployeeDocument.search()
    s = s.filter('term', year=year)
    s = s.query('nested', path='jurisdiction', query=Q("term", **{'jurisdiction.slug': jurisdiction_slug}))
    response = s.delete()
    return response
The problem is that I am getting a ConflictError exception when trying to delete the records via that function. I have read that this occurs because the documents changed between the time the delete-by-query started and the time it executed. But I don't know how this can be, because nothing else is modifying the records during the delete process.
I am going to add s = s.params(conflicts='proceed') in order to silence the exception. But this is a band-aid, as I do not understand why the delete is not processing as expected. Any ideas on how to troubleshoot this? A snapshot of the error is below:
ConflictError: TransportError(409,
u'{
    "took": 10,
    "timed_out": false,
    "total": 55,
    "deleted": 0,
    "batches": 1,
    "version_conflicts": 55,
    "noops": 0,
    "retries": {
        "bulk": 0,
        "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1.0,
    "throttled_until_millis": 0,
    "failures": [
        {
            "index": "employees",
            "type": "employee_document",
            "id": "24681043",
            "cause": {
                "type": "version_conflict_engine_exception",
                "reason": "[employee_document][24681043]: version conflict, current version [5] is different than the one provided [4]",
                "index_uuid": "G1QPF-wcRUOCLhubdSpqYQ",
                "shard": "0",
                "index": "employees"
            },
            "status": 409
        },
        {
            "index": "employees",
            "type": "employee_document",
            "id": "24681063",
            "cause": {
                "type": "version_conflict_engine_exception",
                "reason": "[employee_document][24681063]: version conflict, current version [5] is different than the one provided [4]",
                "index_uuid": "G1QPF-wcRUOCLhubdSpqYQ",
                "shard": "0",
                "index": "employees"
            },
            "status": 409
        }
You could try forcing a refresh first:
client.indices.refresh(index='your-index')
source https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/api-reference.html#_indices_refresh
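Translated into the Python stack from the question, the same idea looks roughly like the sketch below. This is an illustration only: it assumes the EmployeeDocument class and a configured default connection from the question, and it also shows the conflicts='proceed' parameter the asker mentioned.

# Sketch only: assumes the EmployeeDocument document class and a default
# elasticsearch-dsl connection, as in the question.
from elasticsearch_dsl import Q
from elasticsearch_dsl.connections import get_connection

def remove_employees_from_search(jurisdiction_slug, year):
    client = get_connection()
    # Force a refresh so the delete-by-query sees the latest document versions.
    client.indices.refresh(index='employees')

    s = EmployeeDocument.search()
    s = s.filter('term', year=year)
    s = s.query('nested', path='jurisdiction',
                query=Q('term', **{'jurisdiction.slug': jurisdiction_slug}))
    # Optionally skip version conflicts instead of aborting the whole request.
    s = s.params(conflicts='proceed')
    return s.delete()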
First, this is a question that was asked 2 years ago, so take my response with a grain of salt due to the time gap.
I am using the JavaScript API, but I would bet that the flags are similar. When you index or delete, there is a refresh flag which lets you force the index to make the result visible to search.
I am not an Elasticsearch guru, but the engine must perform some systematic maintenance on the indices and shards to move them to a stable state. This is probably done over time, so you would not necessarily get an immediate state update. Furthermore, from personal experience, I have seen cases where a delete does not seem to remove the item from the index. It might mark it as "deleted" and give the document a new version number, but it seems to "stick around" (probably until general maintenance sweeps run).
Here I am showing the js API for delete, but it is the same for index and some of the other calls.
client.delete({
  id: string,
  index: string,
  type: string,
  wait_for_active_shards: string,
  refresh: 'true' | 'false' | 'wait_for',
  routing: string,
  timeout: string,
  if_seq_no: number,
  if_primary_term: number,
  version: number,
  version_type: 'internal' | 'external' | 'external_gte' | 'force'
})
https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/api-reference.html#_delete
refresh
'true' | 'false' | 'wait_for' - If true then refresh the affected shards to make this operation visible to search, if wait_for then wait for a refresh to make this operation visible to search, if false (the default) then do nothing with refreshes.
For additional reference, here is the page on Elasticsearch refresh info and what might be a fairly relevant blurb for you.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
Use the refresh API to explicitly refresh one or more indices. If the request targets a data stream, it refreshes the stream’s backing indices. A refresh makes all operations performed on an index since the last refresh available for search.
By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval using the index.refresh_interval setting.
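If the periodic refresh interval itself is a factor, it can be changed per index. A hedged example using the Python client (the index name is taken from the question; the interval value is only illustrative):

# Illustrative only: tune the automatic refresh interval for one index
# (elasticsearch-py style call; the '1s' value is an arbitrary example).
from elasticsearch import Elasticsearch

client = Elasticsearch(['http://localhost:9200'])

# Refresh the 'employees' index every second regardless of search traffic.
client.indices.put_settings(
    index='employees',
    body={'index': {'refresh_interval': '1s'}},
)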

Managing shared resources with threads in Huey

I have to update many rows (incrementing one value in each row) in a peewee database (SqliteDatabase). Some objects may not exist yet, so I have to create them with default values before working with them. I would use the approach from the peewee docs (Atomic updates), but I couldn't figure out how to mix model.get_or_create() and in [my_array].
So I decided to run the queries in a transaction and commit it once at the end (I hope that is what it does).
The reason I'm writing on Stack Overflow is that I don't know how to use db.atomic() with threading in Huey (I tested with 4 workers), because .atomic() locks the connection (peewee.OperationalError: database is locked). I've tried to use @huey.lock_task, but as far as I can tell it's not a solution to my problem.
Code of my class:
class Article(Model):
    name = CharField()
    mention_number = IntegerField(default=0)

    class Meta:
        database = db
Code of my task:
@huey.task(priority=30)
def update(names):  # "names" is a list of strings
    with db.atomic():
        for name in names:
            article, success = Article.get_or_create(name=name)
            article.mention_number += 1
            article.save()
Well, if you're using a recent version of Sqlite (3.24 or newer) you can use Postgres-style upsert queries. This is well supported by Peewee: http://docs.peewee-orm.com/en/latest/peewee/api.html#Insert.on_conflict
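As an illustrative sketch of that suggestion (assuming name carries a unique index, which the model in the question would need for a conflict target; the helper name is made up), the per-name increment could be written as a single upsert:

# Sketch only: requires SQLite >= 3.24 and a unique index on Article.name.
def bump_mentions(names):
    for name in names:
        (Article
         .insert(name=name, mention_number=1)
         .on_conflict(
             conflict_target=[Article.name],  # the unique column
             update={Article.mention_number: Article.mention_number + 1})
         .execute())

Each such statement is a single atomic read-modify-write inside SQLite, so the separate get_or_create/save steps, and the race between them, go away.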
To answer the other question about shared resources, it's not clear from your example what you would like to happen... Sqlite only allows one write transaction at a time. So if you are running several threads, only one of them may be writing at any given time.
Peewee stores database connections in a thread local, so Peewee databases can be safely used in multithreaded applications.
You didn't mention why huey lock_task wouldn't work.
Another suggestion is to try using WAL-mode with Sqlite, as WAL-mode allows multiple reader transactions to co-exist with a single writer.
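Enabling WAL mode is a one-line change at connection time; a minimal sketch (the database filename is illustrative):

# Sketch: open the SQLite database in WAL mode so readers do not block the single writer.
from peewee import SqliteDatabase

db = SqliteDatabase(
    'app.db',                   # illustrative filename
    pragmas={
        'journal_mode': 'wal',  # readers can coexist with one writer
        'busy_timeout': 5000,   # wait up to 5s instead of failing with "database is locked"
    },
)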

ActiveRecord Postgres database not locking - getting race conditions

I'm struggling with locking a PostgreSQL table I'm working on. Ideally I want to lock the entire table, but individual rows will do as long as they actually work.
I have several concurrent ruby scripts that all query a central jobs database on AWS (via a DatabaseAccessor class), find a job that hasn't yet been started, change the status to started and carry it out. The problem is, since these are all running at once, they'll typically all find the same unstarted job at once, and begin carrying it out, wasting time and muddying the results.
I've tried a bunch of things (.lock, .transaction, the fatalistic gem), but they don't seem to be working, at least not in pry.
My code is as follows:
class DatabaseAccessor
  require 'pg'
  require 'pry'
  require 'active_record'

  class Jobs < ActiveRecord::Base
    enum status: [ :unstarted, :started, :slow, :completed ]
  end

  def initialize(db_credentials)
    ActiveRecord::Base.establish_connection(
      adapter:  db_credentials[:adapter],
      database: db_credentials[:database],
      username: db_credentials[:username],
      password: db_credentials[:password],
      host:     db_credentials[:host]
    )
  end

  def find_unstarted_job
    job = Jobs.where(status: 0).limit(1)
    job.started!
    job
  end
end
Does anyone have any suggestions?
EDIT: It seems that LOCK TABLE jobs IN ACCESS EXCLUSIVE MODE; is the way to do this - however, I'm struggling with then returning the results of this after updating. RETURNING * will return the results after an update, but not inside a transaction.
SOLVED!
So the key here is locking in Postgres. There are a few different table-level locks, detailed here.
There are three factors here in making a decision:
Reads aren't thread safe. Two threads reading the same record will result in that job being run multiple times at once.
Records are only updated once (to be marked as completed) and created, other than the initial read and update to being started. Scripts that create new records will not read the table.
Reading varies in frequency. Waiting for an unlock is non-critical.
Given these factors, a read lock that still allowed writes would be acceptable; however, there isn't one, so ACCESS EXCLUSIVE is our best option.
Given this, how do we deal with locking? A hunt through the ActiveRecord documentation gives no mention of it.
Thankfully, other methods to deal with PostgreSQL exist, namely the ruby-pg gem. A bit of a play with SQL later, and a test of locking, and I get the following method:
def converter
  result_hash = {}
  conn = PG::Connection.open(:dbname => 'my_db')
  conn.exec("BEGIN WORK;
             LOCK TABLE jobs IN ACCESS EXCLUSIVE MODE;")
  conn.exec("UPDATE jobs SET status = 1 WHERE id =
               (SELECT id FROM jobs WHERE status = 0 ORDER BY ID LIMIT 1)
             RETURNING *;") do |result|
    result.each { |row| result_hash = row }
  end
  conn.exec("COMMIT WORK;")
  result_hash.transform_keys!(&:to_sym)
end
This will result in:
An output of an empty hash if there are no jobs with a status of 0
An output of a symbolized hash if one is found and updated
Sleeping if the database is currently locked, before returning the above once unlocked.
The table will remain locked until the COMMIT WORK statement.
As an aside, I wish there was a cleaner way to convert the result to a hash. If anyone has any suggestions, please let me know in the comments! :)

Aborting queries on neo4jrb

I am running something along the lines of the following:
results = queries.map do |query|
  begin
    Neo4j::Session.query(query)
  rescue Faraday::TimeoutError
    nil
  end
end
After a few iterations I get an unrescued Faraday::TimeoutError: too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) and Neo4j needs switching off and on again.
I believe this is because the queries themselves aren't aborted - i.e. the connection times out but Neo4j carries on trying to run my query. I actually want to time them out, so simply increasing the timeout window won't help me.
I've had a scout around and it looks like I can find my queries and abort them via the Neo4j API, which will be my next move.
Am I right in my diagnosis? If so, is there a recommended way of managing queries (and aborting them) from neo4jrb?
Rebecca is right about managing queries manually. Though if you want Neo4j to automatically stop queries within a certain time period, you can set this in your neo4j conf:
dbms.transaction.timeout=60s
You can find more info in the docs for that setting.
The Ruby gem is using Faraday to connect to Neo4j via HTTP and Faraday has a built-in timeout which is separate from the one in Neo4j. I would suggest setting the Neo4j timeout as a bit longer (5-10 seconds perhaps) than the one in Ruby (here are the docs for configuring the Faraday timeout). If they both have the same timeout, Neo4j might raise a timeout before Ruby, making for a less clear error.
Query management can be done through Cypher. You must be an admin user.
To list all queries, you can use CALL dbms.listQueries;.
To kill a query, you can use CALL dbms.killQuery('ID-OF-QUERY-TO-KILL');, where the ID is obtained from the list of queries.
The previous statements must be executed as a raw query; it does not matter whether you are using an OGM, as long as you can input queries manually. If there is no way to manually input queries, and there is no way of doing this in your framework, then you will have to access the database using some other method in order to execute the queries.
So thanks to Brian and Rebecca for useful tips about query management within Neo4j. Both of these point the way to viable solutions to my problem, and Brian's explicitly lays out steps for achieving one via Neo4jrb so I've marked it correct.
As both answers assume, the diagnosis I made IS correct - i.e. if you run a query from Neo4jrb and the HTTP connection times out, Neo4j will carry on executing the query and Neo4jrb will not issue any instruction for it to stop.
Neo4jrb does not provide a wrapper for any query management functionality, so simply setting a transaction timeout seems most sensible and probably what I'll adopt. Actually intercepting and killing queries is also possible, but this means running your query on one thread so that you can look up its queryId in another. This is the somewhat hacky solution I'm working with atm:
class QueryRunner
  DEFAULT_TIMEOUT = 70

  def self.query(query, timeout_limit = DEFAULT_TIMEOUT)
    new(query, timeout_limit).run
  end

  def initialize(query, timeout_limit)
    @query = query
    @timeout_limit = timeout_limit
  end

  def run
    start_time = Time.now.to_i
    Thread.new { @result = Neo4j::Session.query(@query) }
    sleep 0.5
    return @result if @result

    id = if query_ref = Neo4j::Session.query("CALL dbms.listQueries;").to_a.find { |x| x.query == @query }
           query_ref.queryId
         end

    while @result.nil?
      if (Time.now.to_i - start_time) > @timeout_limit
        puts "killing query #{id} due to timeout"
        Neo4j::Session.query("CALL dbms.killQuery('#{id}');")
        @result = []
      else
        sleep 1
      end
    end

    @result
  end
end

Can Elasticsearch guarantee correctness when updating one doc at high frequency?

I have been working on a project which involves massive updates to Elasticsearch, and I found that when updates are applied to a single doc at high frequency, consistency cannot be guaranteed.
For each update, this is what we do (Scala code). Notice that we have to explicitly remove the original fields and replace them with new ones, because a merge is not what we want (_update is in fact a merge in Elasticsearch).
def replaceFields(alarmId: String, newFields: Map[String, Any]): Future[BulkResponse] = {
  def removeField(fieldName: String): UpdateDefinition = {
    log.info("script: " + s"""ctx._source.remove("${fieldName}")""")
    update id alarmId in IndexType script s"""ctx._source.remove("${fieldName}")"""
  }

  client.execute {
    bulk(
      {newFields.toList.map(ele => removeField(ele._1)) :+
        {update id alarmId in IndexType doc (newFields)}} : _*
    )
  }
}
It cannot. You can increase the write quorum level to all (see Understanding the write_consistency and quorum rule of Elasticsearch for some discussion around this; also see the docs https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-index_.html#index-consistency) and that would get you closer. But Elasticsearch does not have any linearizability guarantees (e.g. https://aphyr.com/posts/317-jepsen-elasticsearch for examples and https://aphyr.com/posts/313-strong-consistency-models for definitions), and it is not difficult to cook up scenarios in which ES will not be consistent.
That being said, it tends to be consistent most of the time. But in a high-update environment, you're going to be putting a lot of GC pressure on your JVM to clean out the old docs. I assume you know how updates work under the hood in ES, but in case you do not, it's also worth paying attention to https://www.elastic.co/guide/en/elasticsearch/reference/current/_updating_documents.html
