Attempting to delete all the data for an Index in Elasticsearch

I am trying to delete all the documents (i.e. the data) from an index. I am using v6.6 with the Dev Tools console in Kibana.
I have done this operation successfully in the past, but now the response says 'not_found':
{
  "_index" : "new-index",
  "_type" : "doc",
  "_id" : "_query",
  "_version" : 1,
  "result" : "not_found",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 313,
  "_primary_term" : 7
}
Here is my Kibana statement:
DELETE /new-index/doc/_query
{
  "query": {
    "match_all": {}
  }
}
Also, here is the GET request I used to verify that the index exists and has data:
GET new-index/doc/_search
I verified that the type is doc, but I can post the whole mapping if needed.

An easier way is to navigate in Kibana to Management -> Elasticsearch -> Index Management, select the indices you would like to delete via the checkboxes, and click Manage index -> Delete index or Flush index, depending on your need.

I was able to resolve the issue by using a delete by query:
POST new-index/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

Deleting documents is a problematic way to clear data.
It is preferable to delete the index:
DELETE [your-index]
from the Kibana console, and then recreate it from scratch.
An even better approach is to define an index template, so that the index is created with the right settings as soon as the first document is indexed.

Currently the only solutions are to either delete the index itself (faster) or use delete-by-query (slower):
https://www.elastic.co/guide/en/elasticsearch/reference/7.4/docs-delete-by-query.html
POST new-index/_delete_by_query?conflicts=proceed
{
  "query": {
    "match_all": {}
  }
}
The Delete API only removes a single document (https://www.elastic.co/guide/en/elasticsearch/reference/7.4/docs-delete.html). That is why the request in the question was treated as a delete of a single document with the ID _query (note the "_id" : "_query" in the response) and came back as not_found.
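If you want to issue the same delete-by-query from a script instead of the Kibana console, here is a minimal sketch using only the Python standard library. It only builds the request and does not send it. The function name build_delete_by_query_request and the base URL http://localhost:9200 are my own assumptions, not something from the question.

```python
import json
import urllib.parse
import urllib.request


def build_delete_by_query_request(base_url, index, conflicts_proceed=True):
    """Build (but do not send) a POST <index>/_delete_by_query request.

    base_url is an assumed cluster address, e.g. http://localhost:9200.
    """
    safe_index = urllib.parse.quote(index, safe="")
    url = f"{base_url.rstrip('/')}/{safe_index}/_delete_by_query"
    if conflicts_proceed:
        # Same query parameter as in the console example above.
        url += "?conflicts=proceed"
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_delete_by_query_request("http://localhost:9200", "new-index")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req)
```

Keeping the building and the sending separate makes it easy to check the URL and body before anything is actually deleted.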

My guess is that someone changed a field's name, so the name of that field in the DB (NoSQL) no longer matches the name in Elasticsearch. Elasticsearch then tried to delete that field, but the field was "not found".
It's not an error I would lose sleep over.

Related

Elasticsearch conflict while putting document to index

I want to create an index and modify its settings with a template, and at the same time create an alias for it:
  "template_1" : {
    "order" : 0,
    "index_patterns" : [
      "test*"
    ],
    "settings" : {
      "index" : {
        "number_of_shards" : "2",
        "number_of_replicas" : "2"
      }
    },
    "mappings" : { },
    "aliases" : {
      "some-alias" : { }
    }
  }
}
When I try to put a document using the alias, it tries to create an index with the alias name. However, I am looking for something that will look up the index that has this alias, and throw an error if no index exists with this alias.
The problem is that you are referencing multiple indices with a single alias, so when you PUT a document, ES does not know which index to store it in.
Quoting the doc:
If no write index is specified and there are multiple indices referenced by an alias, then writes will not be allowed.
One solution, as per the quote above, is to specify a write index (see docs) as the default destination for new documents (it's also possible to specify rollover rules to update it).
The other solution, of course, is to use the actual index name when putting docs.
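To illustrate the write-index option, here is a small sketch that builds a body for POST /_aliases with is_write_index set on one index. The helper name set_write_index_actions and the index names test-000001 and test-000002 are hypothetical, chosen only to match the test* pattern in the template. Treat it as an outline rather than the exact setup from the question.

```python
import json


def set_write_index_actions(alias, write_index, other_indices=()):
    """Build a body for POST /_aliases that marks one index as the write index."""
    actions = [
        {"add": {"index": write_index, "alias": alias, "is_write_index": True}}
    ]
    for name in other_indices:
        # The remaining indices stay searchable through the alias
        # but do not receive writes.
        actions.append(
            {"add": {"index": name, "alias": alias, "is_write_index": False}}
        )
    return {"actions": actions}


# Hypothetical index names matching the "test*" pattern.
body = set_write_index_actions("some-alias", "test-000002", ["test-000001"])
print(json.dumps(body, indent=2))
```

With a body like this, writes through some-alias should go to the single index flagged as the write index, while searches still cover every index behind the alias.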

Delete by Query with Sort in Elasticsearch

I want to delete the most recent item in my Elasticsearch index, sorted by myDateField, which is a date type. Is that possible? I want something like this query, but it would delete all matching items even though I have the size set to 1.
{
  "query" : {
    "match_all" : {
    }
  },
  "size" : "1",
  "sort" : [
    {
      "myDateField" : {
        "order" : "desc"
      }
    }
  ]
}
Delete by query does not appear to support sorting.
If you try it with Delete by query, you'll get the error: request does not support [sort]. However, I couldn't find any documentation saying that the "sort" parameter is not supported in delete by query.
I have one idea for doing it, though I don't know whether it's the best way:
Step 1: Run a normal search with your conditions and sorting, and collect the ids of the hits.
Step 2: Build a bulk request that deletes, by id, all the documents retrieved in Step 1.
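The two steps can be sketched roughly like this in Python. Only the request bodies are built here; sending them to _search and _bulk is left out. The function names and the sample hit (index my-index, id 42) are made up for illustration.

```python
import json


def newest_docs_search_body(date_field, size=1):
    """Step 1: search body that returns the newest `size` documents."""
    return {
        "query": {"match_all": {}},
        "size": size,
        "sort": [{date_field: {"order": "desc"}}],
        "_source": False,  # only the ids are needed, not the documents
    }


def bulk_delete_body(hits):
    """Step 2: newline-delimited _bulk body with one delete action per hit."""
    lines = []
    for hit in hits:
        meta = {"_index": hit["_index"], "_id": hit["_id"]}
        if "_type" in hit:
            # Older versions return (and expect) a mapping type.
            meta["_type"] = hit["_type"]
        lines.append(json.dumps({"delete": meta}))
    if not lines:
        return ""
    # The bulk API requires a trailing newline after the last action.
    return "\n".join(lines) + "\n"


search_body = newest_docs_search_body("myDateField")
# Hypothetical hits, shaped like response["hits"]["hits"] from Step 1.
sample_hits = [{"_index": "my-index", "_id": "42"}]
print(bulk_delete_body(sample_hits), end="")
```

Note that the two steps are not atomic: if a newer document is indexed between the search and the bulk request, the document you delete is no longer the most recent one.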

How to be sure that all documents are indexed in Elasticsearch

I have a question about index aliases and zero downtime.
When we put a document into an index, it takes time until the document is available for search.
How can I check that all documents are available for search before switching from the old index to the new one?
One way to get that information is to get the stats of the index (GET your-index/_stats/docs,indexing) and compare the docs and indexing blocks.
...
"_all" : {
  "primaries" : {
    "docs" : {
      "count" : 1234,    <-- searchable docs
      "deleted" : 0
    },
    "indexing" : {
      "index_total" : 1300,    <-- indexed docs
      "index_time_in_millis" : 13,
      ...
    }
    ...
To make all your docs searchable, you can either wait for the periodic refresh to kick in, or trigger an index refresh explicitly using the refresh API (https://www.elastic.co/guide/en/elasticsearch/reference/6.6/indices-refresh.html).
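As a rough sketch of that comparison, the following Python snippet pulls the two numbers out of a parsed _stats response. The function name indexing_progress is made up, and the sample dict just mirrors the excerpt above. One caveat, as far as I know: index_total counts indexing operations rather than distinct documents, so updates or re-indexed ids can leave a gap even after a refresh.

```python
def indexing_progress(stats):
    """Return (searchable, indexed, gap) from a parsed _stats/docs,indexing response."""
    primaries = stats["_all"]["primaries"]
    searchable = primaries["docs"]["count"]
    indexed = primaries["indexing"]["index_total"]
    return searchable, indexed, indexed - searchable


# Sample values taken from the stats excerpt above.
sample_stats = {
    "_all": {
        "primaries": {
            "docs": {"count": 1234, "deleted": 0},
            "indexing": {"index_total": 1300, "index_time_in_millis": 13},
        }
    }
}

searchable, indexed, gap = indexing_progress(sample_stats)
print(searchable, indexed, gap)  # prints: 1234 1300 66
```

A gap of zero after an explicit refresh is a reasonable signal that it is safe to switch the alias, as long as nothing else is still writing to the new index.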

Elasticsearch - How to delete a list of documents?

I have an array of _id values.
On this page I found out how to retrieve a list of documents from it:
GET ads/_mget
{
  "ids": [ "586213440e7d2c7f10fe2574",
           "586213440e7d2c7f10fe2575",
           "586213450e7d2c7f10fe2576",
           "586213450e7d2c7f10fe2577" ]
}
This works and returns a list of 4 full documents, as expected.
(sidenote)
I find it weird to have to write "ids" in the query, when it actually acts on the "_id" field.
(end sidenote)
Now I can't figure out how to DELETE these documents from the same _id list.
I tried DELETE ads/_mget but I get an error : No handler found for uri [/ads/_mget] and method [DELETE]
I tried _mdelete instead of _mget but it doesn't seem to exist.
I also tried
DELETE ads
{
  "ids": [ "586213440e7d2c7f10fe2574",
           "586213440e7d2c7f10fe2575",
           "586213450e7d2c7f10fe2576",
           "586213450e7d2c7f10fe2577" ]
}
...but this... just deletes EVERYTHING and I have to reindex the database.
You can use Delete By Query and supply a payload like:
POST ads/_delete_by_query
{
  "query" : {
    "terms" : {
      "_id" : [ "586213440e7d2c7f10fe2574",
                "586213440e7d2c7f10fe2575",
                "586213450e7d2c7f10fe2576",
                "586213450e7d2c7f10fe2577" ]
    }
  }
}
For more information about the terms query, see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
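If the id list comes from code, the payload above can be generated rather than typed by hand. Below is a minimal Python sketch; the function name delete_by_ids_body is made up, and it refuses an empty list as a precaution so that a bug upstream cannot produce a meaningless request.

```python
import json


def delete_by_ids_body(ids):
    """Build a _delete_by_query body that matches documents by _id."""
    # Drop duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(ids))
    if not unique_ids:
        raise ValueError("refusing to build a delete query with no ids")
    return {"query": {"terms": {"_id": unique_ids}}}


ids = [
    "586213440e7d2c7f10fe2574",
    "586213440e7d2c7f10fe2575",
    "586213450e7d2c7f10fe2576",
    "586213450e7d2c7f10fe2577",
]
payload = delete_by_ids_body(ids)
print(json.dumps(payload, indent=2))
```

The resulting JSON is what you would send as the body of POST ads/_delete_by_query.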

Elasticsearch querying alias with routing giving partial results

In an effort to create a multi-tenant architecture for my project, I've created an Elasticsearch cluster with an index 'tenant':
"tenant" : {
  "some_type" : {
    "_routing" : {
      "required" : true,
      "path" : "tenantId"
    },
Now, I've also created some aliases:
"tenant" : {
  "aliases" : {
    "tenant_1" : {
      "index_routing" : "1",
      "search_routing" : "1"
    },
    "tenant_2" : {
      "index_routing" : "2",
      "search_routing" : "2"
    },
    "tenant_3" : {
      "index_routing" : "3",
      "search_routing" : "3"
    },
    "tenant_4" : {
      "index_routing" : "4",
      "search_routing" : "4"
    }
I've added some data with tenantId = 2.
After all that, I tried to query 'tenant_2' but I only got partial results, while querying the 'tenant' index directly returns the full results.
Why's that?
I was sure that routing is supposed to query all the shards that documents with tenantId = 2 reside on.
When you have created aliases in Elasticsearch, you have to do all operations through the aliases only, be it indexing, update or search.
If possible, try reindexing the data and check again (I hope it is a test index).
Remove all the indices:
curl -XDELETE 'localhost:9200/' # Warning: don't use this in production!
Use this command only if it is a test index.
Then create the index again and create the aliases again. Do all the indexing, search and delete operations on the alias name. Even the data import should be done via the alias name.
