ElasticSearch remove all documents with over 1000 fields - elasticsearch

I'm getting this error:
This happens while updating a dev Elasticsearch DB from a LIVE one. I believe it is caused by the live DB sending documents with more than 1000 fields while the dev DB's index.mapping.total_fields.limit is set to 1000.
I know I can raise the field limit, but for now I would just like to remove all documents with 1000 or more fields.
I'm guessing I could make a Postman call to the _delete_by_query API with something like:
{
"query": {
"range": {
"fields": {
"gt": 1000
}
}
}
}
Does anyone know of a simple query that can accomplish this?

You can run a query like this against the LIVE cluster:
POST logger/_delete_by_query
{
"query": {
"script": {
"script": {
"source": "params._source.size() > 1000"
}
}
}
}
Provided you don't have nested fields/objects, this will delete all documents having more than 1000 fields.
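To illustrate why the caveat about nested objects matters, here is a minimal Python sketch (not Painless, and not part of the original answer). It contrasts a top-level key count, which is what params._source.size() measures, with a recursive count of leaf fields. The helper names and the sample document are hypothetical.

```python
from typing import Any, Mapping


def count_top_level_fields(source: Mapping[str, Any]) -> int:
    """Mirror of `params._source.size()`: counts top-level keys only."""
    return len(source)


def count_leaf_fields(source: Mapping[str, Any]) -> int:
    """Count leaf values, descending into nested objects (dicts).

    Arrays of objects are not expanded here; this is only an illustration.
    """
    total = 0
    for value in source.values():
        if isinstance(value, Mapping):
            total += count_leaf_fields(value)
        else:
            total += 1
    return total


# Hypothetical document: 2 top-level keys, but 4 leaf fields.
doc = {"user": {"name": "a", "email": "b", "age": 3}, "message": "hi"}
```

A document with only a few top-level keys can therefore still carry many fields inside its objects, and the script query above would not catch it.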

Related

How to compare two date fields in same document in elasticsearch

In my Elasticsearch index, each document has two date fields, createdDate and modifiedDate. I'm trying to add a filter in Kibana to fetch the documents where modifiedDate is greater than createdDate. How can I create this filter in Kibana?
I tried the query below, but instead of greater-than it behaves like gte and returns all records:
GET index/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script" : {
"inline" : "doc['modifiedTime'].value.getMillis() > doc['createdTime'].value.getMillis()",
"lang" : "painless"
}
}
}
}
}
}
There are a few options.
Option A: The easiest and most performant one is to store the difference of the two fields inside a new field of your document, e.g.
{
"createdDate": "2022-01-11T12:34:56Z",
"modifiedDate": "2022-01-11T12:34:56Z",
"diffMillis": 0
}
{
"createdDate": "2022-01-11T12:34:56Z",
"modifiedDate": "2022-01-11T12:35:58Z",
"diffMillis": 62000
}
Then, in Kibana you can query on diffMillis > 0 and figure out all documents that have been modified after their creation.
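One way to populate diffMillis is to compute it in the client before indexing. The following is a minimal Python sketch under that assumption; the function names are hypothetical, and it assumes both dates are ISO 8601 strings as in the examples above.

```python
from datetime import datetime, timedelta


def parse_iso_utc(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def with_diff_millis(doc: dict) -> dict:
    """Return a copy of doc with diffMillis = modifiedDate - createdDate."""
    created = parse_iso_utc(doc["createdDate"])
    modified = parse_iso_utc(doc["modifiedDate"])
    enriched = dict(doc)
    # Integer division by one millisecond avoids float rounding errors.
    enriched["diffMillis"] = (modified - created) // timedelta(milliseconds=1)
    return enriched
```

The same value could also be computed server-side (for example in an ingest pipeline), but that is outside what this answer describes.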
Option B: You can use a script query
GET index/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": """
return doc['createdDate'].value.millis < doc['modifiedDate'].value.millis;
"""
}
}
}
}
}
Note: depending on the amount of data you have, this option can potentially have disastrous performance, because it needs to be evaluated on ALL of your documents.
Option C: If you're using ES 7.11+, you can use runtime fields directly from the Kibana Discover view.
You can use the following script in order to add a new runtime field (e.g. name it diffMillis) to your index pattern:
emit(doc['modifiedDate'].value.millis - doc['createdDate'].value.millis)
And then you can add the following query into your search bar
diffMillis > 0

Elasticsearch doesn't find product when i have mistake in name

I tried to use Elasticsearch to build a web search. I created 3 products in my products index:
Ibuprom Max
Nurofen Max Forte
Gripex Max
When I use
{
"query": {
"match_all": {}
}
}
I receive all records, but when I use the search query
{
"query": {
"match": {
"name": "Max"
}
}
}
I receive all products containing "Max", but when I change Max to Mxa or Mx, I receive nothing. From what I have read, Elasticsearch should by default tolerate a typo and still find the products with Max in the name, unless I'm doing something wrong. What am I missing?
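The thread records no answer here, so the following is offered only as a likely explanation: a match query matches terms exactly unless a fuzziness parameter is set on it. The minimal Python sketch below builds such a query body and mirrors the default "AUTO" thresholds (terms of 0 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two). The helper names are hypothetical.

```python
def auto_fuzziness_edits(term: str) -> int:
    """Edits allowed by "fuzziness": "AUTO" with its default thresholds."""
    length = len(term)
    if length <= 2:
        return 0  # must match exactly
    if length <= 5:
        return 1  # one edit allowed
    return 2  # two edits allowed


def fuzzy_match_query(field: str, text: str, fuzziness="AUTO") -> dict:
    """Build a match query body with fuzzy matching enabled."""
    return {"query": {"match": {field: {"query": text, "fuzziness": fuzziness}}}}
```

Under these thresholds "Mxa" is allowed one edit, and a transposition counts as a single edit by default, so it can reach "max". "Mx" is allowed none under AUTO, so it would presumably need an explicit value such as fuzzy_match_query("name", "Mx", fuzziness=1).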

Elasticsearch delete By Query not completing deletes

I need to delete a large number of documents in a 5.5 Elasticsearch cluster. I know the optimal way to do this is to rebuild the cluster without the intended documents, but that's not possible in our case. I run the following query that deletes documents from a subset of the indexes in the cluster:
POST myindex_1*/doc_type/_delete_by_query
{
"query": {
"bool": {
"filter": [
{
"terms": {
"typeCode": [
"Filtered_Type"
]
}
}
],
"must": [
{
"range": {
"createdDateUTC": {
"lt": "2017-10-28"
}
}
}
]
}
}
}
It starts deleting documents for a couple of hours but then just stops and I have to kick it off again. Any ideas why it stops running the delete query?
Just a note: I'm using Kibana to run the query, and the request times out on the client side even though I can see it continues deleting on the backend.
From the Delete By Query documentation:
By default _delete_by_query uses scroll batches of 1000. You can change the batch size with the scroll_size URL parameter:
POST twitter/_delete_by_query?scroll_size=5000
{
"query": {
"term": {
"user": "kimchy"
}
}
}
You can find more information about batching and batch sizes here:
batches and requests_per_second in ElasticSearch Delete By Query API
And since you'll need to scroll through one or more batches to delete all of the documents found by your query, you can find more information about scrolling here:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-scroll.html
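As a rough illustration of the batching described above, this minimal Python sketch builds the request path with a scroll_size parameter and estimates how many scroll batches a given number of matching documents would take. The function names are hypothetical. The wait_for_completion=false option is a suggestion that goes beyond the answer: it is a documented Delete By Query parameter that returns a task instead of holding the client connection open, which may help with the client-side timeout mentioned in the question.

```python
import math
from urllib.parse import urlencode


def delete_by_query_path(index: str, scroll_size: int = 1000,
                         wait_for_completion: bool = True) -> str:
    """Build the _delete_by_query path with its URL parameters."""
    params = {"scroll_size": scroll_size}
    if not wait_for_completion:
        # Assumption: run as a background task rather than blocking the client.
        params["wait_for_completion"] = "false"
    return f"/{index}/_delete_by_query?{urlencode(params)}"


def batches_needed(matching_docs: int, scroll_size: int = 1000) -> int:
    """Number of scroll batches required to cover all matching documents."""
    return math.ceil(matching_docs / scroll_size)
```

For example, 12001 matching documents with scroll_size=5000 need three batches, the last one nearly empty.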

ElasticSearch how to get docs with 10 or more fields in them?

I want to get all docs that have 10 or more fields in them. I'm guessing something like this:
{
"query": {
"range": {
"fields": {
"gt": 1000
}
}
}
}
What you can do is to run a script query like this
{
"query": {
"script": {
"script": {
"source": "params._source.size() >= 10"
}
}
}
}
However, be advised that depending on the number of documents you have and the hardware that supports your cluster, this can negatively impact the performance of your cluster.
A better idea would be to add another integer field that contains the number of fields that the document contains, so you can simply run a range query on it, like in your question.
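A minimal Python sketch of that idea, assuming the count is added by the client before indexing: the field name field_count and both helper names are hypothetical, and only top-level fields are counted.

```python
def with_field_count(doc: dict, count_field: str = "field_count") -> dict:
    """Return a copy of doc with an extra integer field holding its field count."""
    enriched = dict(doc)
    # Counts the original top-level fields, not the counter field itself.
    enriched[count_field] = len(doc)
    return enriched


def min_field_count_query(minimum: int, count_field: str = "field_count") -> dict:
    """Build a range query for documents with at least `minimum` fields."""
    return {"query": {"range": {count_field: {"gte": minimum}}}}
```

With the counter in place, min_field_count_query(10) is an ordinary range query and avoids running a script against every document.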
As per the documentation of the _source field, you either do it like this or you cannot get results based on the field count:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html

Logstash: query parameter lower than a value through elasticsearch plugin

When executing searches, I know that to query for a parameter lower than a certain value I have to run the following query:
{"query": {
"bool": {
"must": [
{
"range": {
"length": {
"lte": "22"
}
}
}
]
}
}
}
However, I want to do the same thing through the elasticsearch plugin in Logstash.
elasticsearch{
query => "...."
}
But I couldn't find out how to do that, and the documentation doesn't help: https://www.elastic.co/guide/en/logstash/current/plugins-filters-elasticsearch.html
Thank you for your attention and your help.
Joe
Using the query string query syntax, you can do it like this
elasticsearch{
query => "length:{* TO 22]"
}
Also note that at some point, we might be able to use the query DSL if this issue gets some traction.
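To make the bracket notation easier to read, here is a minimal Python sketch that translates range bounds into query string syntax. The helper name is hypothetical. Square brackets mark an inclusive bound, curly brackets an exclusive one, * means unbounded, and the range operator is written in upper case (TO).

```python
def range_query_string(field: str, *, gt=None, gte=None, lt=None, lte=None) -> str:
    """Translate range bounds into query string range syntax."""
    if gt is not None and gte is not None:
        raise ValueError("use either gt or gte, not both")
    if lt is not None and lte is not None:
        raise ValueError("use either lt or lte, not both")

    if gte is not None:
        lower = f"[{gte}"   # inclusive lower bound
    elif gt is not None:
        lower = f"{{{gt}"   # exclusive lower bound
    else:
        lower = "{*"        # unbounded

    if lte is not None:
        upper = f"{lte}]"   # inclusive upper bound
    elif lt is not None:
        upper = f"{lt}}}"   # exclusive upper bound
    else:
        upper = "*}"        # unbounded

    return f"{field}:{lower} TO {upper}"
```

Calling range_query_string("length", lte=22) yields the same string as in the answer, which corresponds to the lte range in the question.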

Resources