Can't delete item in Elasticsearch with _delete_by_query - elasticsearch

I would like to delete some items in an Elasticsearch database according to a simple condition, and I'm trying to do it via the Postman app. I have a POST request to the URL localhost:9200/newlocalsearch/_delete_by_query with this JSON query:
{
  "query": {
    "bool": {
      "must_not": [
        { "exists": { "field": "ico" } }
      ]
    }
  }
}
But when I send the request, it returns this error response:
{
"took": 51,
"timed_out": false,
"total": 1,
"deleted": 0,
"batches": 1,
"version_conflicts": 1,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": [
{
"index": "newlocalsearch",
"type": "doc",
"id": "0",
"cause": {
"type": "version_conflict_engine_exception",
"reason": "[doc][0]: version conflict, current version [-1] is different than the one provided [1]",
"index_uuid": "jZbdUfqwSAqtFELXB2Z2AQ",
"shard": "0",
"index": "newlocalsearch"
},
"status": 409
}
]
}
I don't understand what's happening. Is there anybody out there :) who knows what this means? Thanks a lot.

It could be that you need to refresh your index first:
Send a POST request to localhost:9200/newlocalsearch/_refresh
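In console form, the whole sequence would look something like this (a sketch based on the question's index and query; conflicts=proceed additionally tells _delete_by_query to count version conflicts instead of aborting on them):

```
POST /newlocalsearch/_refresh

POST /newlocalsearch/_delete_by_query?conflicts=proceed
{
  "query": {
    "bool": {
      "must_not": [
        { "exists": { "field": "ico" } }
      ]
    }
  }
}
```

The refresh makes recently indexed changes visible to search, so the snapshot taken by _delete_by_query matches the current document versions.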

Related

Lucene vs Elasticsearch query syntax

I can see that Elasticsearch supports both the Lucene query syntax and its own query DSL.
You can use either and get the same kind of results.
For example (this could perhaps be written differently, but it shows what I mean), both of these queries produce the same result, one using Lucene syntax and the other the Elastic query DSL:
GET /index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field101:Denmark"
          }
        }
      ]
    }
  }
}
GET /index/_search
{
  "query": {
    "match": {
      "field101": {
        "query": "Denmark"
      }
    }
  }
}
I was wondering are there any kind of implications when choosing one approach over the other (like performance or some kinds of optimizations)? Or is Elastic query syntax just translated to Lucene query somewhere since Elastic runs Lucene as its underlying search engine ?
I was wondering are there any kind of implications when choosing one approach over the other (like performance or some kinds of optimizations)?
The Elasticsearch DSL is converted into a Lucene query under the hood; you can set "profile": true in the query to see how that works and exactly how much time the conversion takes.
I would say there are no important performance implications, and you should always use the DSL, because in many cases Elasticsearch will do optimizations for you. Also, query_string expects well-formed Lucene queries, so you can run into syntax errors (try using "Denmark AND" as the query_string).
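For example, the following query_string request fails with a parse error because "Denmark AND" is not a valid Lucene expression, while the match query simply analyzes the same input as plain text (a sketch; the index and field names are taken from the example above):

```
GET /index/_search
{
  "query": {
    "query_string": { "query": "Denmark AND" }
  }
}

GET /index/_search
{
  "query": {
    "match": { "field101": "Denmark AND" }
  }
}
```

The first request returns a 400 with a parse_exception; the second just searches for the tokens produced by the analyzer.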
Or is Elastic query syntax just translated to Lucene query somewhere since Elastic runs Lucene as its underlying search engine ?
Yes. You can try it yourself:
GET test_lucene/_search
{
  "profile": true,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field101:Denmark"
          }
        }
      ]
    }
  }
}
will produce:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"profile": {
"shards": [
{
"id": "[KGaFbXIKTVOjPDR0GrI4Dw][test_lucene][0]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "field101:denmark",
"time_in_nanos": 3143,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 0,
"advance_count": 0,
"score": 0,
"build_scorer_count": 0,
"create_weight": 3143,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 0
}
}
],
"rewrite_time": 2531,
"collector": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time_in_nanos": 1115
}
]
}
],
"aggregations": []
}
]
}
}
And
GET /test_lucene/_search
{
  "profile": true,
  "query": {
    "match": {
      "field101": {
        "query": "Denmark"
      }
    }
  }
}
will produce the same (the timings differ slightly):
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"profile": {
"shards": [
{
"id": "[KGaFbXIKTVOjPDR0GrI4Dw][test_lucene][0]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "field101:denmark",
"time_in_nanos": 3775,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 0,
"advance_count": 0,
"score": 0,
"build_scorer_count": 0,
"create_weight": 3775,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 0
}
}
],
"rewrite_time": 3483,
"collector": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time_in_nanos": 1780
}
]
}
],
"aggregations": []
}
]
}
}
As you can see, the times are in nanoseconds, not even milliseconds, which shows that the conversion is fast.
You can read more about it here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html

Elastic App Search results in different total_results for different current_page

I'm doing lazy loading with Elastic App Search.
My initial request looks like
{
  "query": "",
  "page": {
    "current": 1,
    "size": 10
  },
  "sort": {
    "editedat": "desc"
  }
}
This correctly results in the following response:
{
"meta": {
"alerts": [],
"warnings": [],
"precision": 3,
"page": {
"current": 1,
"total_pages": 5,
"total_results": 41,
"size": 10
},
"engine": {
"name": "myengine",
"type": "default"
},
"request_id": "71805727-9c0a-496b-95a9-bb317345807c"
},
"results": [
{
// the 10 results
...
}
]
When my app now requests the second page, the results look different:
request
{
  "query": "",
  "page": {
    "current": 2,
    "size": 10
  },
  "sort": {
    "editedat": "desc"
  }
}
response
{
"meta": {
"alerts": [],
"warnings": [],
"precision": 3,
"page": {
"current": 2,
"total_pages": 2,
"total_results": 18,
"size": 10
},
"engine": {
"name": "myengine",
"type": "default"
},
"request_id": "5d402099-e25d-41c9-af80-b961b78c5a94"
},
"results": [
{
// the 8 results
...
}
]
Now it suddenly reports two pages in total and 18 results, but my first request reported five pages with 41 items in total, which would be the correct amount.
Am I missing something quite simple? Is this a bug? Do I have to take another approach?
Thanks for your help and experience.

Elasticsearch 3 of 280 shards failed error - has anyone seen anything like this before and knows how to fix it?

Every time I load the dashboard based on this index, this error keeps popping up
My visualisations still look fine and the data still appears; I have just never come across this error before. Any ideas on how I can fix this issue?
Here is the response from the error popup:
{
"took": 1137,
"timed_out": false,
"terminated_early": false,
"_shards": {
"total": 280,
"successful": 277,
"skipped": 0,
"failed": 3,
"failures": [
{
"shard": 1,
"index": "nbs_comprehend-2021-w41",
"node": "oGEHA-aRSnmwuEmqSZc6Kw",
"reason": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:121)",
"org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:115)",
"doc['user.followers_count'].value > 9999 ? 1 : 0",
" ^---- HERE"
],
"script": "doc['user.followers_count'].value > 9999 ? 1 : 0",
"lang": "painless",
"position": {
"offset": 27,
"start": 0,
"end": 48
},
"caused_by": {
"type": "illegal_state_exception",
"reason": "A document doesn't have a value for a field! Use doc[<field>].size()==0 to check if a document is missing a field!"
}
}
}
]
},
"hits": {
"total": 696059,
"max_score": null,
"hits": []
},
"aggregations": {
"termsAgg": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 0,
"doc_count": 604397
},
{
"key": 1,
"doc_count": 91662
}
]
}
}
}
You've got a document that doesn't have a value for the user.followers_count field. Your scripted field is not checking for that at all, so things error when the script can't return any value; that's what "A document doesn't have a value for a field" is saying in the error.
Try this, which will return a zero if the field is empty, basically creating a default value:
if (doc['user.followers_count'].size() == 0) { return 0 } else { return doc['user.followers_count'].value > 9999 ? 1 : 0 }
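You can also test the corrected script directly against the index with script_fields, outside the dashboard (a sketch; the field name big_account is made up for illustration):

```
GET /nbs_comprehend-2021-w41/_search
{
  "size": 3,
  "script_fields": {
    "big_account": {
      "script": {
        "lang": "painless",
        "source": "if (doc['user.followers_count'].size() == 0) { return 0 } return doc['user.followers_count'].value > 9999 ? 1 : 0"
      }
    }
  }
}
```

If this runs over a few documents without a script_exception, the scripted field in the dashboard should stop failing as well.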

Kibana/Elasticsearch 6.8 - delete_by_query returns reason "blocked by: [FORBIDDEN/8/index write (api)];"

I'm using the Dev Tools in Kibana 6.8 to delete docs by query but I received a 403 with the type "cluster_block_exception", reason "blocked by: [FORBIDDEN/8/index write (api)];".
I used the following command:
curl -XPOST "http://localhost:9200/my_index/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  }
}'
Here's a sample response:
{
"took": 26,
"timed_out": false,
"total": 2,
"deleted": 0,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": [
{
"index": "my_index",
"type": "doc",
"id": "TnOKCHMBlyetxY-P6HZ_",
"cause": {
"type": "cluster_block_exception",
"reason": "blocked by: [FORBIDDEN/8/index write (api)];"
},
"status": 403
},
{
"index": "my_index",
"type": "doc",
"id": "T3OKCHMBlyetxY-P6XYF",
"cause": {
"type": "cluster_block_exception",
"reason": "blocked by: [FORBIDDEN/8/index write (api)];"
},
"status": 403
}
]
}
Any help on how to set the proper permissions would be greatly appreciated. Thanks.
You could first try to change the state of the index with this request:
PUT /my_index/_settings
{
  "index": {
    "blocks": {
      "write": "false"
    }
  }
}
Then, as a second step, you should identify what caused this index state (an index lifecycle policy? the filesystem reaching the "low watermark" of 85% disk usage?).
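To see which blocks are currently set before removing them, you can inspect the index settings first (a sketch; setting a block to null resets it to its default instead of pinning it to false):

```
GET /my_index/_settings

PUT /my_index/_settings
{
  "index.blocks.write": null
}
```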

ElasticSearch errors in deleting records by query

I am trying to delete a large number of documents in ES via _delete_by_query, but I am seeing the following errors.
Query
POST indexName/typeName/_delete_by_query
{
  "size": 100000,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "CREATED_TIME": {
              "gte": 0,
              "lte": 1507316563000
            }
          }
        }
      ]
    }
  }
}
Result
{
"took": 50489,
"timed_out": false,
"total": 100000,
"deleted": 0,
"batches": 1,
"version_conflicts": 1000,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": [
{
"index": "indexName",
"type": "typeName",
"id": "HVBLdzwnImXdVbq",
"cause": {
"type": "version_conflict_engine_exception",
"reason": "[typeName][HVBLdzwnImXdVbq]: version conflict, current version [2] is different than the one provided [1]",
"index_uuid": "YPJcVQZqQKqnuhbC9R7qHA",
"shard": "1",
"index": "indexName"
},
"status": 409
},....
Please read this article.
You have two ways of handling this issue: set a URL parameter to ignore version conflicts, or set it in the request body:
If you’d like to count version conflicts rather than cause them to abort then set conflicts=proceed on the url or "conflicts": "proceed" in the request body.
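Applied to the query above, the two variants look like this (sketches of the same request; index and type names as in the question):

```
POST indexName/typeName/_delete_by_query?conflicts=proceed
{
  "query": {
    "range": {
      "CREATED_TIME": { "gte": 0, "lte": 1507316563000 }
    }
  }
}

POST indexName/typeName/_delete_by_query
{
  "conflicts": "proceed",
  "query": {
    "range": {
      "CREATED_TIME": { "gte": 0, "lte": 1507316563000 }
    }
  }
}
```

With conflicts=proceed the response still reports version_conflicts, but there are no 409 failures and the remaining matching documents are deleted.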
