According to ES documentation document indexing/deletion happens as follows:
Request received at one of the nodes.
Request forwarded to the document's primary shard.
The operation performed on the primary shard and parallel requests sent to replica nodes.
Primary shard node waits for a response from replica nodes and then send the response to the node where the request was originally received.
Send the response back to the client.
Now in my case, I am sending a create document request to ES at time t and then sending a request to delete the same document (using delete_by_query) at approximately t+800 milliseconds. These requests are sent via a messaging system (internal implementation of kafka) which ensures that the delete request will be sent to ES only after receiving 200 OK response for the indexing operation from ES.
According to ES documentation, delete_by_query throws a 409 version conflict only when the documents present in the delete query have been updated during the time delete_by_query was still executing.
In my case, it is always guaranteed that the delete_by_query request will be sent to ES only when a 200 OK response has been received for all the documents that have to be deleted. Hence there is no possibility of an update/create of a document that has to be deleted during delete_by_query operation.
Please let me know if I am missing something or this is an issue with ES.
Possible reason could be due to the fact that when a document is created, it is not "committed" to the index immediately.
Elasticsearch indices operate on a refresh_interval, which defaults to 1 second.
This documentation around refresh cycles is old, but I cannot for the life of me find anything as descriptive in the more modern ES versions.
A few things you can try:
Send _refresh with your request
Add ?refresh=wait_for or ?refresh=true param
Note that refreshing the index on every indexing request is terrible for performance, which begs the question as to why you are trying to delete a document immediately after indexing it.
add
deleteByQueryRequest.setAbortOnVersionConflict(false);
Related
We are using elastic search 6.0 and using bulk indexing to index many documents in a single request using “index” action. In a single request we can have a scenario where there are multiple “index” requests on same document. Will ES fail the bulk request in such case OR it will process all of them in order?
Edit1: I use a script for indexing in bulk request where we are handling out of order updates. So as long as all “index” requests are getting processed, we don’t have any issue.
ES will not fail, but it is not necessarily clear which indexing operation will "win". It might be the last one but since all operations in the bulk batch might be spread over several ingest nodes, and not all of those nodes process the indexing operations at the same rate, it might not be clear which operation will be processed first and which will be processed last.
The only guarantee that you have is that in the response, you'll get the state of each operation in the same order as specified in the request batch.
If your index has only one primary shard, then the order in which you submit the operations will be the same order as the one those operations are processed, hence the last one wins, but if you have more than one primary shard on more than one node, then you can't really know.
A better question would be why do you submit several indexing operations per document knowing in advance that only one will win?
We are using elastic search version 5.4.1 in our production environments. The cluster setup is 3 data, 3 query, 3 master nodes. Of late we are observing a lot of slow queries in a particular data node and the [index][shard] present in that are just replicas.
I don't find many deleted docs or memory issues that could directly cause the slowness.
Any pointers on how to go about the investigation here would be helpful.
Thanks!
Many things are happening during one ES query. First, check the took field returned by ElasticSearch.
took – time in milliseconds for Elasticsearch to execute the search
However, the took field is the time that it
took ES to process the query on its side. It doesn't include
serializing the request into JSON on the client
sending the request over the network
deserializing the request from JSON on the server
serializing the response into JSON on the server
sending the response over the network
deserializing the response from JSON on the client
As such, I think you should try to identify the exact step that is slow.
Reference: Query timing: ‘took’ value and what I’m measuring
I was going through elastic search and wanted to get consistent response from ES clusters.
I read Elasticsearch read and write consistency
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-index_.html
and some other posts and can conclude that ES returns success to write operation after completing writes to all shards (Primary + replica), irrespective of consistency param.
Let me know if my understanding is wrong.
I am wondering if anyone knows, how does elastic search add a node/shard back into a cluster which was down transiently. Will it start serving read requests immediately after it is available or does it ensures it has up to date data before serving read requests?
I looked for the answer to above question, but could not find any.
Thanks
Gopal
If node is removed from the cluster and it joins again, Elasticsearch checks if the data is up to date. If it is not, then it will not be made available for search, until it is brought up to date again (which could mean the whole shard gets copied again).
the consistency parameter is just an additional pre-index check if the number of expected shards are available in the cluster (if the index is configured to have 4 replicas, then the primary shard plus two replicas need to be available, if set to quorum). However this parameter does never change the behaviour that a write needs to be written to all available shards, before returning to the client.
If I have requests 1,2,3 in the bulk API of elasticsearch, am I guaranteed that it is executed sequentially, i.e 1 first then 2 and then 3?
This article says that
Each subrequest is executed independently, so the failure of one subrequest won’t affect the success of the others.
This implies that you should not count on the order of the requests, because some of them might not finish successfully at all.
However, the response contains the status for each subrequest in the same order as they were submitted.
Also note that the index is refreshed only 1/sec (by default), so i would expect that individual subrequests would not see the changes of other operations from the same batch.
After reading the source code, we've found that, for operations upon the same doc id, the order can be assured. Because Elasticsearch server will first sort the bulk request and group them by Shard. Then distributed requests will be sending to those shards. Once a shard receives a Shard bulk request, it will execute the requests one by one.
I have a question with Elasticsearch as below:
In mode replication is async, while a document is being indexed, the document will already be present on the primary shard but not yet copied to the replica shards. At this time, a GET request for this document is forwarded to a replica shard.
How does Elasticsearch handle this in the cases the document is not yet indexed on the replica shards or the document is not yet updated on the replica shards ?
A fail request will be returned in the case indexing a new document or a old document returned in the case updating a document? Or requesting node will re-forward to the primary shard to get data?
First of all, I wouldn't recommend using the async mode. It doesn't really provides any benefits that you couldn't achieve in a safer way but creates issues when in comes to reliability. Because of this, this feature was deprecated in v1.5 and completely removed in v2.0 of elasticsearch.
Saying that, if you still want to use async and care about getting the latest results, you have to use it with primary preference.
In case of update operation, you don't have to do anything. Update is always performed on the primary shard first and then the result of the operation is replicated to all replicas.