Elasticsearch - How to delete a list of documents?

I have an array of _id.
On this page I found out how to retrieve a list of documents from it :
GET ads/_mget
{
"ids": [ "586213440e7d2c7f10fe2574",
"586213440e7d2c7f10fe2575",
"586213450e7d2c7f10fe2576",
"586213450e7d2c7f10fe2577" ]
}
This works and returns a list of 4 full documents, as expected.
(sidenote)
I find it weird to have to write "ids" in the query, when it actually acts on the "_id" field.
(end sidenote)
Now I can't figure out how to DELETE these documents from the same _id list.
I tried DELETE ads/_mget but I get an error : No handler found for uri [/ads/_mget] and method [DELETE]
I tried _mdelete instead of _mget but it doesn't seem to exist.
I also tried
DELETE ads
{
"ids": [ "586213440e7d2c7f10fe2574",
"586213440e7d2c7f10fe2575",
"586213450e7d2c7f10fe2576",
"586213450e7d2c7f10fe2577" ]
}
...but this... just deletes EVERYTHING and I have to reindex the database.

You can use the Delete By Query API and supply a payload like:
POST ads/_delete_by_query
{
"query" : {
"terms" : {
"_id" :
[ "586213440e7d2c7f10fe2574",
"586213440e7d2c7f10fe2575",
"586213450e7d2c7f10fe2576",
"586213450e7d2c7f10fe2577" ]
}
}
}
For more information about the terms query, see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
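When the _id list comes from application code rather than being typed by hand, the request body above can be generated. The following is a minimal Python sketch, assuming only the standard library; the helper name build_delete_by_ids_body is made up for illustration, and sending the body to POST ads/_delete_by_query is left to whatever HTTP client you already use.

```python
import json


def build_delete_by_ids_body(ids):
    """Build the JSON body for POST <index>/_delete_by_query from a list of _id values.

    Hypothetical helper: it only assembles the terms query shown in the answer.
    """
    # Drop duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(ids))
    if not unique_ids:
        # An empty list would make the request pointless, so fail early.
        raise ValueError("at least one id is required")
    return {"query": {"terms": {"_id": unique_ids}}}


if __name__ == "__main__":
    ids = [
        "586213440e7d2c7f10fe2574",
        "586213440e7d2c7f10fe2575",
        "586213450e7d2c7f10fe2576",
        "586213450e7d2c7f10fe2577",
    ]
    # Print the body so it can be pasted into the Kibana console.
    print(json.dumps(build_delete_by_ids_body(ids), indent=2))
```

Refusing an empty list is a design choice of this sketch, not something Elasticsearch requires.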

Related

Attempting to delete all the data for an Index in Elasticsearch

I am trying to delete all the documents, i.e. data from an index. I am using v6.6 along with the dev tools in Kibana.
In the past, I have done this operation successfully but now it is saying 'not found'
{
"_index" : "new-index",
"_type" : "doc",
"_id" : "_query",
"_version" : 1,
"result" : "not_found",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 313,
"_primary_term" : 7
}
Here is my kibana statement
DELETE /new-index/doc/_query
{
"query": {
"match_all": {}
}
}
Also, here is the GET operation I used to verify that the index exists and has data:
GET new-index/doc/_search
I verified the type is doc but I can post the whole mapping, if needed.
An easier way is to navigate in Kibana to Management -> Elasticsearch index mapping, select the indexes you would like to delete via the checkboxes, and click Manage index -> delete index or flush index, depending on your need.
I was able to resolve the issue by using a delete by query:
POST new-index/_delete_by_query
{
"query": {
"match_all": {}
}
}
Deleting documents is a problematic way to clear data.
It is preferable to delete the index:
DELETE [your-index]
from the Kibana console, and then recreate it from scratch.
An even better approach is to define a template for the index, so that the index is created automatically when the first document is indexed.
The only options currently are to either delete the index itself (faster) or use delete-by-query (slower):
https://www.elastic.co/guide/en/elasticsearch/reference/7.4/docs-delete-by-query.html
POST new-index/_delete_by_query?conflicts=proceed
{
"query": {
"match_all": {}
}
}
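The Delete API only removes a single document (https://www.elastic.co/guide/en/elasticsearch/reference/7.4/docs-delete.html), so DELETE /new-index/doc/_query is treated as a request to delete the document whose _id is _query, which is why the response above reports not_found.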
My guess is that someone changed a field's name and now the DB (NoSQL) and Elasticsearch string name for that field doesn't match. So Elasticsearch tried to delete that field, but the field was "not found".
It's not an error I would lose sleep over.

Delete by Query with Sort in Elasticsearch

I want to delete the most current item in my Elasticsearch index sorted by myDateField which is a date type. Is that possible? I want something like this query but this would delete all matching items even though I have the size at 1.
{
"query" : {
"match_all" : {
}
},
"size" : "1",
"sort" : [
{
"myDateField" : {
"order" : "desc"
}
}
]
}
Delete by query does not appear to support sorting.
If you try it, you'll get the error: request does not support [sort]. I couldn't find any documentation stating that the "sort" parameter is unsupported in delete by query, though.
I have one idea for doing it, though I don't know whether it's the best way:
Step 1: Run a normal search with your conditions and sorting, and collect the ids of the hits.
Step 2: Build a bulk request that deletes, by id, the documents retrieved in Step 1.
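The two steps can be sketched in Python as follows. This is only an outline under a few assumptions: the helper names are made up for illustration, the search response is assumed to have the usual hits.hits[]._id shape, and the bulk action lines are typeless (Elasticsearch 7 style); on older versions a "_type" entry would likely be needed as well. The HTTP calls themselves (POST <index>/_search, then POST _bulk) are left to your client of choice.

```python
import json


def build_latest_search_body(date_field, size=1):
    """Step 1: search body that returns the newest `size` documents, ids only."""
    return {
        "query": {"match_all": {}},
        "size": size,
        "sort": [{date_field: {"order": "desc"}}],
        "_source": False,  # only the _id of each hit is needed
    }


def extract_ids(search_response):
    """Collect the _id of every hit from a parsed search response."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


def build_bulk_delete_payload(index, ids):
    """Step 2: newline-delimited body for POST _bulk, one delete action per id."""
    lines = [json.dumps({"delete": {"_index": index, "_id": doc_id}}) for doc_id in ids]
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(json.dumps(build_latest_search_body("myDateField")))
    # Hypothetical response, trimmed to the fields this sketch reads.
    fake_response = {"hits": {"hits": [{"_id": "abc123"}]}}
    print(build_bulk_delete_payload("my-index", extract_ids(fake_response)), end="")
```

Note that the search and the delete are two separate requests, so a newer document indexed in between would not be the one removed.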

How to run a query with the properties from a stored document?

Let's say we have a index with docs which contain the following fields: uid and hobbies. How can I run a query to find similarities between 1 and the other users, without having to retrieve the user first and then run a new query with his hobbies?
You could use the more_like_this query and ask ES to retrieve documents that are like a given document (e.g. the user with uid=1), without having to retrieve that document first.
In the like array below, you give a reference to the document that serves as the basis for the "more like this" query (you can give more than one document, and also verbatim hobbies strings). ES will retrieve that document, check the hobbies field and perform a "more like this hobbies" query on all other documents.
Set "_id" to the UID of the user you want to compare against:
POST /users/user/_search
{
"query": {
"more_like_this" : {
"fields" : ["hobbies"],
"like" : [
{
"_index" : "users",
"_type" : "user",
"_id" : "1"
}
]
}
}
}

Elasticsearch bool search matching incorrectly

So I have an object with an Id field which is populated by a Guid. I'm doing an elasticsearch query with a "Must" clause to match a specific Id in that field. The issue is that elasticsearch is returning a result which does not match the Guid I'm providing exactly. I have noticed that the Guid I'm providing and one of the results that Elasticsearch is returning share the same digits in one particular part of the Guid.
Here is my query source (I'm using the Elasticsearch head console):
{
query:
{
bool:
{
must: [
{
text:
{
couchbaseDocument.doc.Id: 5cd1cde9-1adc-4886-a463-7c8fa7966f26
}
}]
must_not: [ ]
should: [ ]
}
}
from: 0
size: 10
sort: [ ]
facets: { }
}
And it is returning two results. One with ID of
5cd1cde9-1adc-4886-a463-7c8fa7966f26
and the other with ID of
34de3d35-5a27-4886-95e8-a2d6dcf253c2
As you can see, they both share the same middle term "-4886-". However, I would expect this query to only return a record if the record were an exact match, not a partial match. What am I doing wrong here?
The query is (probably) correct.
What you're almost certainly seeing is the work of the Standard Analyzer, which is used by default at index time. This analyzer tokenizes the input (splits it into terms) on hyphens ('-'), among other characters. That's why a match is found: both Ids contain the term 4886.
To remedy this, set your couchbaseDocument.doc.Id field to not_analyzed.
See: How to not-analyze in ElasticSearch? and the links from there into the official docs.
Mapping would be something like:
{
"yourType" : {
"properties" : {
"couchbaseDocument.doc.Id" : {"type" : "string", "index" : "not_analyzed"}
}
}
}
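To see why the two Ids collide, here is a small Python illustration. It is a deliberately rough stand-in for the Standard Analyzer, not its real implementation: it simply lowercases the input and splits on anything that is not a letter or digit, which is enough to show the effect for these Guids. The function names are made up for this sketch.

```python
import re


def rough_tokens(text):
    """Very rough approximation of analysis: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.split(r"[^0-9A-Za-z]+", text) if token]


def shared_terms(first, second):
    """Terms that both inputs produce, i.e. what an analyzed match can hit on."""
    return sorted(set(rough_tokens(first)) & set(rough_tokens(second)))


if __name__ == "__main__":
    wanted = "5cd1cde9-1adc-4886-a463-7c8fa7966f26"
    other = "34de3d35-5a27-4886-95e8-a2d6dcf253c2"
    print(rough_tokens(wanted))  # ['5cd1cde9', '1adc', '4886', 'a463', '7c8fa7966f26']
    print(shared_terms(wanted, other))  # ['4886']
```

With the field set to not_analyzed, the whole Guid is indexed as one term, so only an exact match would be found.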

Register and call query in ElasticSearch

Is it possible to register queries (as in the percolate process) and call them by name to execute them?
I am building an application that lets the user save a search query associated with a label. I would like to save the query generated by the filter in ES.
If I save the query in an index, I have to call ES first to retrieve the query, extract the field containing the query, and then call ES again to execute it. Can I do it in one call?
The other solution is to register the queries (labels) with _percolator, using an identifier of the user:
/_percolate/transaction/user1_label1
{
"userId": "user1",
"query":{
"term":{"field1":"foo" }
}
}
and, when there is a new document, use the percolator in non-indexing mode (filtered by userId) to retrieve which queries match, then update the document by adding a field "label":["user1_label1", "user1_label2"] and finally index the document. So the labelling is done at indexing time.
What do you think ?
Thanks in advance.
Try Filter Aliases.
curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
"actions" : [
{
"add" : {
"index" : "the_real_index",
"alias" : "user1",
"filter" : { "term" : { "field1" : "foo" } }
}
}
]
}'
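If each saved search becomes its own filtered alias, the _aliases body above can be generated per user and label. The following Python sketch assumes a naming scheme of <userId>_<label> for the alias, borrowed from the question's "user1_label1" example; that scheme and both helper names are assumptions for illustration. Searching through the alias then applies the stored filter in a single call.

```python
import json


def alias_name(user_id, label):
    """Hypothetical naming scheme: one alias per saved search, e.g. user1_label1."""
    return f"{user_id}_{label}"


def build_filter_alias_request(index, alias, filter_query):
    """Body for POST /_aliases that adds a filtered alias on top of `index`."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": filter_query}}
        ]
    }


if __name__ == "__main__":
    body = build_filter_alias_request(
        "the_real_index",
        alias_name("user1", "label1"),
        {"term": {"field1": "foo"}},
    )
    print(json.dumps(body, indent=2))
```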
