Cannot divide multiple context values in Elasticsearch watcher

I have created a watch in Elasticsearch to alert me if HTTP errors exceed 15% of total requests over the last 60 minutes.
I am using chain inputs to generate the dividend and divisor for the ratio calculation.
In my condition I use a script to do the division and check whether the result is greater than my threshold.
However, whenever I use two ctx parameters in the division, the result is always zero.
If I use only one ctx parameter, it works fine.
It seems that we cannot use two ctx parameters in a condition.
Does anyone know how to get around this?
Below is my watch.
Thanks.
{
"trigger" : {
"schedule" : {
"interval" : "5m"
}
},
"input" : {
"chain":{
"inputs": [
{
"first": {
"search" : {
"request" : {
"indices" : [ "logstash-*" ],
"body" : {
"query" : {
"bool":{
"must": [
{
"match" : {"d.uri": "xxxxxxxx"}
},
{
"match" : {"topic": "xxxxxxxx"}
}
],
"filter": {
"range": {
"@timestamp": {
"gte": "now-60m"
}
}
}
}
}
}
},
"extract": ["hits.total"]
}
}
},
{
"second": {
"search" : {
"request" : {
"body" : {
"query" : {
"bool":{
"must": [
{
"match" : {"d.uri": "xxxxxxxx"}
},
{
"match" : {"topic": "xxxxxxxx"}
},
{
"match" : {"d.status": "401"}
}
],
"filter": {
"range": {
"@timestamp": {
"gte": "now-60m"
}
}
}
}
}
}
},
"extract": ["hits.total"]
}
}
}
]
}
},
"condition" : {
"script" : {
"source" : "return (ctx.payload.second.hits.total / ctx.payload.first.hits.total) == 0"
}
}
}

The issue was that I was doing an integer division to get a ratio of the form 0.xx, which truncates to 0.
I reversed the operation and it works fine.
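To illustrate why the division returned zero, here is a minimal Python sketch. It is not Painless: the `//` operator stands in for the integer division applied to two integer hit counts, and the function name, threshold parameter, and sample counts are made up for the example.

```python
def error_ratio_exceeded(errors: int, total: int, threshold_percent: int = 15) -> bool:
    """Return True when errors make up more than threshold_percent of total."""
    if total == 0:
        return False  # no requests in the window, nothing to alert on
    # Cross-multiply instead of dividing, so no fractional value is needed.
    return errors * 100 > threshold_percent * total


errors, total = 20, 100
print(errors // total)                      # prints 0: integer division truncates 0.2
print(error_ratio_exceeded(errors, total))  # prints True: 2000 > 1500
```

In the watch, a comparable condition could multiply before comparing (for example `ctx.payload.second.hits.total * 100 > 15 * ctx.payload.first.hits.total`) or cast one operand to a floating-point type. This is one possible form, not necessarily the exact change the asker made.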

Related

In ElasticSearch break down hits per filter?

Given the following query, how can I get the number of hits independently for each range and term query and what are the performance implications for this? As of yet, I can't find anything in the documentation that indicates how to do this. Where can I find the docs for such a feature?
{
"query": {
"bool" : {
"must" : {
"term" : { "user.id" : "kimchy" }
},
"filter": {
"term" : { "tags" : "production" }
},
"must_not" : {
"range" : {
"age" : { "gte" : 10, "lte" : 20 }
}
},
You can use a filter aggregation to get the document count per query clause. Because you are also providing a query, you need to wrap the filter aggregations in a global aggregation. Without the global aggregation, the counts are computed only over documents matching the top-level query, so you will not get the total document count for each individual clause.
Below is a sample query with the aggregation:
{
"query": {
"bool": {
"must": {
"term": {
"user.id": "kimchy"
}
},
"filter": {
"term": {
"tags": "production"
}
},
"must_not": {
"range": {
"age": {
"gte": 10,
"lte": 20
}
}
}
}
},
"aggs": {
"Total": {
"global": {},
"aggs": {
"user_term": {
"filter": {
"term": {
"user.id": "kimchy"
}
}
},
"tag_term": {
"filter": {
"term": {
"tags": "production"
}
}
},
"age_range_not": {
"filter": {
"bool": {
"must_not": {
"range": {
"age": {
"gte": 10,
"lte": 20
}
}
}
}
}
},
"age_range": {
"filter": {
"range": {
"age": {
"gte": 10,
"lte": 20
}
}
}
}
}
}
}
}
You will get the response below:
"aggregations" : {
"Total" : {
"doc_count" : 3,
"age_range" : {
"doc_count" : 2
},
"age_range_not" : {
"doc_count" : 1
},
"tag_term" : {
"doc_count" : 3
},
"user_term" : {
"doc_count" : 2
}
}
}
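If the clauses change often, the request body can be generated rather than written by hand. The following is a minimal Python sketch under that assumption. The helper name `count_per_clause` is hypothetical, and it only builds the dictionary; sending it to Elasticsearch is left to whatever client you use.

```python
import json


def count_per_clause(query: dict, clauses: dict) -> dict:
    """Build a search body with one filter aggregation per named clause,
    wrapped in a global aggregation so the counts ignore the top-level query."""
    return {
        "query": query,
        "aggs": {
            "Total": {
                "global": {},
                "aggs": {name: {"filter": clause} for name, clause in clauses.items()},
            }
        },
    }


body = count_per_clause(
    query={"bool": {"must": {"term": {"user.id": "kimchy"}}}},
    clauses={
        "user_term": {"term": {"user.id": "kimchy"}},
        "tag_term": {"term": {"tags": "production"}},
    },
)
print(json.dumps(body, indent=2))
```

Each entry in `clauses` becomes one named bucket under `Total`, matching the shape of the response shown above.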

Elasticsearch Multi-Term Auto Completion

I'm trying to implement the Multi-Term Auto Completion that's presented here.
Filtering down to the correct documents works, but when aggregating the completion_terms they are not filtered to those that match the current partial query, but instead include all completion_terms from any matched documents.
Here are the mappings:
{
"mappings": {
"dynamic" : "false",
"properties" : {
"completion_ngrams" : {
"type" : "text",
"analyzer" : "completion_ngram_analyzer",
"search_analyzer" : "completion_ngram_search_analyzer"
},
"completion_terms" : {
"type" : "keyword",
"normalizer" : "completion_normalizer"
}
}
}
}
Here are the settings:
{
"settings" : {
"index" : {
"analysis" : {
"filter" : {
"edge_ngram" : {
"type" : "edge_ngram",
"min_gram" : "1",
"max_gram" : "10"
}
},
"normalizer" : {
"completion_normalizer" : {
"filter" : [
"lowercase",
"german_normalization"
],
"type" : "custom"
}
},
"analyzer" : {
"completion_ngram_search_analyzer" : {
"filter" : [
"lowercase"
],
"tokenizer" : "whitespace"
},
"completion_ngram_analyzer" : {
"filter" : [
"lowercase",
"edge_ngram"
],
"tokenizer" : "whitespace"
}
}
}
}
}
}
}
I'm then indexing data like this:
{
"completion_terms" : ["Hammer", "Fortis", "Tool", "2000"],
"completion_ngrams": "Hammer Fortis Tool 2000"
}
Finally, the autocomplete search looks like this:
{
"query": {
"bool": {
"must": [
{
"term": {
"completion_terms": "fortis"
}
},
{
"term": {
"completion_terms": "hammer"
}
},
{
"match": {
"completion_ngrams": "too"
}
}
]
}
},
"aggs": {
"autocomplete": {
"terms": {
"field": "completion_terms",
"size": 100
}
}
}
}
This correctly returns documents matching the search string "fortis hammer too", but the aggregations include ALL completion terms that are included in any of the matched documents, e.g. for the query above:
"buckets": [
{ "key": "fortis" },
{ "key": "hammer" },
{ "key": "tool" },
{ "key": "2000" }
]
Ideally, I'd expect
"buckets": [
{ "key": "tool" }
]
I could filter out the terms that are already covered by the search query ("fortis" and "hammer" in this case) in the app, but the "2000" doesn't make any sense from a user's perspective, because it doesn't partially match any of the provided search terms.
I understand why this is happening, but I can't think of a solution. Can anyone help?
Try a filters aggregation:
{
"query": {
"bool": {
"must": [
{
"term": {
"completion_terms": "fortis"
}
},
{
"term": {
"completion_terms": "hammer"
}
},
{
"match": {
"completion_ngrams": "too"
}
}
]
}
},
"aggs": {
"findOuthammerAndfortis": {
"filters": {
"filters": {
"fortis": {
"term": {
"completion_terms": "fortis"
}
},
"hammer": {
"term": {
"completion_terms": "hammer"
}
}
}
}
}
}
}
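The filters aggregation above returns one bucket per named filter. If the goal is to keep only the buckets whose key starts with the partial term, a different option, not part of the answer above, is the `include` parameter of the terms aggregation, which accepts a regular expression. Here is a minimal Python sketch that builds such an aggregation. The function name is hypothetical, and it assumes that lowercasing the partial term is enough to match the normalized keyword values.

```python
def autocomplete_agg(partial: str, size: int = 100) -> dict:
    """Build a terms aggregation limited to keys that start with `partial`."""
    # Escape every non-alphanumeric character so the input is matched literally.
    escaped = "".join(c if c.isalnum() else "\\" + c for c in partial.lower())
    return {
        "autocomplete": {
            "terms": {
                "field": "completion_terms",
                "include": escaped + ".*",  # prefix match on the bucket key
                "size": size,
            }
        }
    }


print(autocomplete_agg("Too"))
```

The returned dictionary would replace the `aggs` section of the search request. Terms already typed in full ("fortis" and "hammer") would then drop out unless they happen to share the prefix.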

Getting "Field data loading is forbidden" when trying to aggregate

I'm trying to do a simple unique-count aggregation, but I'm getting this error:
java.lang.IllegalStateException: Field data loading is forbidden on eid
This is my query:
POST /logstash-2016.06.*/Nginx/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"pid": "1"
}
},
{
"term": {
"cvprogress": "0"
}
},
{
"range" : {
"ServerTime" : {
"gte" : "2016-06-28T00:00:00"
}
}
}
]
}
},
"aggs": {
"distinct_colors" : {
"cardinality" : {
"field" : "eid"
}
}
}
}
After going through the entire thread at https://github.com/elastic/elasticsearch/issues/15267, what worked was adding .raw to the field name, like this:
"aggs": {
"distinct_colors" : {
"cardinality" : {
"field" : "eid.raw"
}
}
}

Watcher alert if no records matching filter in x minutes

I need an Elasticsearch watcher to alert if no record matching a pattern has been inserted into the index within a time frame. It needs to do this while grouping on another pair of fields.
That is, the records will follow the pattern:
Date Timestamp Level Message Client Site
It needs to check that Message matches "is running" for each Client's site(s) (e.g. Google Maps and Bing Maps have the same site, Maps). I think the best(?) way to do this right now is to run a watcher per client site.
So far I have this, assuming the task should write "is running" into the log every 20 minutes:
{
"trigger" : {
"schedule" : {
"interval" : "25m"
}
},
"input" : {
"search" : {
"request" : {
"search_type" : "count",
"indices" : "<logstash-{now/d}>",
"body" : {
"filtered" : {
"query" : {
"match_phrase" : { "Message" : "Is running" }
},
"filter" : {
"match" : { "Client" : "Example" } ,
"match" : { "Site" : "SomeSite" }
}
}
}
}
}
},
"condition" : {
"script" : "return ctx.payload.hits.total < 1"
},
"actions" : {
},
"email_administrator" : {
"email" : {
"to" : "me@host.tld",
"subject" : "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"body" : "Too many error in the system, see attached data",
"attach_data" : true,
"priority" : "high"
}
}
}
}
For anyone looking for how to do this in the future: a few things need nesting in query as part of filter, and match becomes term. Fun!
{
"trigger": {
"schedule": {
"interval": "25m"
}
},
"input": {
"search": {
"request": {
"search_type": "count",
"indices": "<logstash-{now/d}>",
"body": {
"query": {
"filtered": {
"query": {
"match_phrase": {
"Message": "Its running"
}
},
"filter": {
"query": {
"term": {
"Client": "Example"
}
},
"query": {
"term": {
"Site": "SomeSite"
}
},
"query": {
"range": {
"event_timestamp": {
"gte": "now-25m",
"lte": "now"
}
}
}
}
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"lte": 1
}
}
},
"actions": {
"email_administrator": {
"email": {
"to": "me@host.tld",
"subject": "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"body": "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"attach_data": true,
"priority": "high"
}
}
}
}
You have to change your condition; it supports JSON format:
"condition" : {
"script" : "return ctx.payload.hits.total : 1"
}
Please refer to the link below:
https://www.elastic.co/guide/en/watcher/current/condition.html

Replacing OR/AND/NOT filters with bool filter creates a hard-to-understand query with too many levels?

I have the following filter in a filtered query. As seen, it has many OR/AND/NOT filters at different levels. I was advised to replace them with bool filters for performance reasons, and I am going to do that.
"filter" : {
"or" : [
{
"and" : [
{ "range" : { "start" : { "lte": 201407292300 } } },
{ "range" : { "end" : { "gte": 201407292300 } } },
{ "term" : { "condtion1" : false } },
{
"or" : [
{
"and" : [
{ "term" : { "condtion2" : false } },
{
"or": [
{
"and" : [
{ "missing" : { "field" : "condtion6" } },
{ "missing" : { "field" : "condtion7" } }
]
},
{ "term" : { "condtion6" : "nop" } },
{ "term" : { "condtion7" : "rst" } }
]
}
]
},
{
"and" : [
{ "term" : { "condtion2" : true } },
{
"or": [
{
"and" : [
{ "missing" : { "field" : "condtion3" } },
{ "missing" : { "field" : "condtion4" } },
{ "missing" : { "field" : "condtion5" } },
{ "missing" : { "field" : "condtion6" } },
{ "missing" : { "field" : "condtion7" } }
]
},
{ "term" : { "condtion3" : "abc" } },
{ "term" : { "condtion4" : "def" } },
{ "term" : { "condtion5" : "ghj" } },
{ "term" : { "condtion6" : "nop" } },
{ "term" : { "condtion7" : "rst" } }
]
}
]
}
]
}
]
},
{
"and" : [
{
"term": { "condtion8" : "TIME_POINT_1" }
},
{ "range" : { "start" : { "lte": 201407302300 } } },
{
"or": [
{ "term" : { "condtion9" : "GROUP_B" } },
{
"and" : [
{ "term" : { "condtion9" : "GROUP_A" } },
{ "ids" : { "values": [100, 10] } }
]
}
]
}
]
},
{
"and" : [
{
"term": { "condtion8" : "TIME_POINT_2" }
},
{ "ids" : { "values": [100, 10] } }
]
},
{
"and" : [
{
"term": { "condtion8" : "TIME_POINT_3" }
},
{
"or": [
{ "term" : { "condtion1" : true } },
{ "range" : { "end" : { "lt": 201407302300 } } }
]
},
{
"or": [
{ "term" : { "condtion9" : "GROUP_B" } },
{
"and" : [
{ "term" : { "condtion9" : "GROUP_A" } },
{ "ids" : { "values": [100, 10] } }
]
}
]
}
]
}
]
}
However, I feel replacing these OR/AND/NOT filters would create a query that has too many levels and is hard to understand. For example, replacing
"or": [
....
]
I have to have:
"bool": {
"should": [
]
}
Am I right that replacing OR/AND/NOT with bool filter in my case is at the expense of sacrificing understandability?
A related question
If I have to replace OR/AND/NOT filters for performance, should I replace ALL of these OR/AND/NOT filters, or just some of them such as the one at the top for example?
Thanks and regards.
You should replace all of them except geo/script/range filters. That said, understanding the likely impact of each filter also helps. For example, if one filter is going to eliminate, say, 90% of the results, you may want to put it first in an and filter. Since and/or filters are executed sequentially, the remaining filters will have fewer documents to process. With bool filters, all the filters are combined in a single bitset operation. You might have already read about it.
I don't think you will sacrifice understandability by replacing OR/AND/NOT with bool filters. As in the example you gave, converting a single or filter to a should clause looks like it adds structure to the query, but across the overall combination the structure would be almost the same.
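To see how mechanical the rewrite is, here is a minimal Python sketch that walks a filter tree and maps and to must, or to should, and not to must_not. The function name `to_bool` is hypothetical, it assumes the array form of and/or used in the question, and it converts every and/or/not node it finds. It does not special-case combinations built from geo/script/range filters, which the advice above suggests keeping as they are.

```python
def to_bool(node):
    """Recursively rewrite and/or/not filters as bool filters."""
    if isinstance(node, list):
        return [to_bool(child) for child in node]
    if not isinstance(node, dict):
        return node
    if "and" in node:
        return {"bool": {"must": to_bool(node["and"])}}
    if "or" in node:
        return {"bool": {"should": to_bool(node["or"])}}
    if "not" in node:
        return {"bool": {"must_not": to_bool(node["not"])}}
    return node  # leaf filter such as term, range, missing, ids


old = {
    "or": [
        {"and": [
            {"term": {"condtion8": "TIME_POINT_2"}},
            {"ids": {"values": [100, 10]}},
        ]},
        {"term": {"condtion1": True}},
    ]
}
new = to_bool(old)
print(new)
```

Each and/or level becomes a bool object with one clause inside it, so the tree gets deeper but keeps the same shape, which is the point made above about readability.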
