Elasticsearch Watcher error while trying to send email attachment, dashboard.pdf

I have created a Watcher alert from the advanced option which sends dashboard.pdf as an email attachment when the trigger condition is met. Now, when the criteria match (the threshold is exceeded), it throws the error below in the watcher output.
"root_cause": [
{
"type": "connect_timeout_exception",
"reason": "Connect to mydomainname.com:443 [mydomainname.com/XX.X.XXX.XXX] failed: Connect timed out"
}
],
"type": "connect_timeout_exception",
"reason": "Connect to mydomainname.com:443 [mydomainname.com/XX.X.XXX.XXX] failed: Connect timed out",
"caused_by": {
"type": "socket_timeout_exception",
"reason": "Connect timed out"
Below is what appears in the Elasticsearch log.
[2022-03-29T11:39:54,682][ERROR][o.e.x.w.a.e.ExecutableEmailAction] [node-1] failed to execute action [test_watcher_1_last10mins_gte5_tran_dt_accord_sof/email_admin]
org.apache.http.conn.ConnectTimeoutException: Connect to mydomainname.com:443 [mydomainname.com/XX.X.XXX.XXX] failed: Connect timed out
....
....
Caused by: java.net.SocketTimeoutException: Connect timed out
Below is the watcher script.
{
"trigger": {
"schedule": {
"interval": "6m"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"my_index"
],
"rest_total_hits_as_int": true,
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"match_phrase": {
"partner.keyword": "RXGTY"
}
},
{
"match_phrase": {
"partner.keyword": "VGHUT"
}
}
]
}
},
{
"match": {
"state.keyword": {"query": "Fail"}
}
},
{
"match": {
"ops.keyword": {"query": "api_name"}
}
}
],
"filter": {
"range": {
"datetime": {
"gte": "{{ctx.trigger.scheduled_time}}||-5m",
"lte": "{{ctx.trigger.scheduled_time}}",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
}
}
}
},
"condition": {
"script": {
"source": "if (ctx.payload.hits.total >= params.threshold) { return true; } return false;",
"lang": "painless",
"params": {
"threshold": 1
}
}
},
"actions": {
"email_admin": {
"email": {
"profile": "standard",
"attachments": {
"dashboard.pdf": {
"reporting": {
"url": "https://mydomainname.com/api/reporting/generate/printablePdf?jobParams= ..removing the rest portion of the url for security reason",
"auth": {"type":"basic","username":"elastic","password":"pass"}
}
},
"data.yml": {
"data": {
"format": "yaml"
}
}
},
"from": "from_email#xyz.com",
"to": [
"to_email_name <to_email#abc.com>"
],
"subject": "Elastic Watcher : Alert 1",
"body": {
"text": "Too many error in the system, see attached data."
}
}
}
},
"transform": {
"script": {
"source": "HashMap result = new HashMap(); result.result = ctx.payload.hits.total; return result;",
"lang": "painless",
"params": {
"threshold": 1
}
}
}
}
Our Elastic Stack version is 7.11.1, the license is activated, and basic stack security is enabled.
Note that when I tried the same watch on my local Kibana (7.10.1), where a trial license is activated, this alerting action works perfectly. Also note that security is not enabled on my local stack.
Please help!!
Regards,
Souvik
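The error means the Elasticsearch node executing the watch cannot open a connection to mydomainname.com:443 within the Watcher HTTP client's connect timeout, so the first thing to check is whether each Elasticsearch node can actually reach the Kibana reporting endpoint (for example, curl -v https://mydomainname.com run from the node itself). If the endpoint is reachable but slow, the Watcher HTTP client timeouts can be raised in elasticsearch.yml; the snippet below is only a sketch with illustrative values, and the proxy lines are placeholders for environments where nodes can only reach the host through a proxy:

# elasticsearch.yml on the nodes that execute watches
# (defaults are roughly 10s for connect and 15s for read)
xpack.http.default_connection_timeout: 30s
xpack.http.default_read_timeout: 60s
# Only if outbound traffic must go through a proxy (placeholder host/port):
# xpack.http.proxy.host: proxy.example.com
# xpack.http.proxy.port: 3128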

Related

ElasticSearch Watcher simulate fires the action, otherwise it's stuck

I have a Slack action configured. All aspects appear to be set up correctly. If I go to my watch's simulate section and choose execute (not ignoring the conditions), it executes fine and the message appears correctly templated in Slack. If I save the config and let the watcher run, it doesn't send. If I use the email action, it sends the email. If I use both, it sends neither.
{
"trigger": {
"schedule": {
"interval": "1m"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"elastic"
],
"rest_total_hits_as_int": true,
"body": {
"query": {
"bool": {
"must": {
"match": {
"level": "ERROR"
}
},
"filter": {
"range": {
"#timestamp": {
"gte": "now-1500m"
}
}
}
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gte": 1
}
}
},
"actions": {
"notify-slack": {
"throttle_period_in_millis": 5000,
"slack": {
"account": "monitoring",
"proxy": {
"host": "proxy.example.com"
"port": 3128
},
"message": {
"from": "watcher",
"to": [
"#elk-cluster-alerts"
],
"text": "Elk Error Alerts",
"icon": ":chuck:",
"attachments": [
{
"color": "danger",
"title": "Elk Error Alerts",
"text": "Roundhouse kick!"
}
]
}
}
}
}
}
UPDATE:
Not a fix, but the configuration works when I use a webhook instead of the Slack config.
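For reference, a webhook action that posts the same alert to a Slack incoming webhook could look like the sketch below (the path is a placeholder for the real webhook URL, and the body is a minimal Slack payload rather than the richer attachment used above):

"actions": {
  "notify-slack": {
    "throttle_period_in_millis": 5000,
    "webhook": {
      "scheme": "https",
      "host": "hooks.slack.com",
      "port": 443,
      "method": "post",
      "path": "/services/XXX/YYY/ZZZ",
      "headers": {
        "Content-Type": "application/json"
      },
      "body": "{ \"text\": \"Elk Error Alerts: {{ctx.payload.hits.total}} errors in the last 1500m\" }"
    }
  }
}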

Getting "expected [END_OBJECT] but found [FIELD_NAME]" in Kibana

I am working on Kibana 6.x and using SentiNL to generate email alerts. Below is my query to send a mail if my application generates the log "CREDENTIALS ARE NOT DEFINED FOR PULL EVENT SOURCES", with a threshold of 1. When I play my watcher I get the error below.
Error: Watchers: play watcher : execute watcher : execute advanced watcher : get elasticsearch payload : search : [parsing_exception] [match] malformed query, expected [END_OBJECT] but found [FIELD_NAME], with { line=1 & col=80 }
Query:
"input": {
"search": {
"request": {
"index": [
"filebeat-2019.03.21"
],
"body": {
"query": {
"match": {
"msg": "CREDENTIALS ARE NOT DEFINED FOR PULL EVENT SOURCES"
},
"minimum_number_should_match": 1,
"bool": {
"filter": {
"range": {
"#timestamp": {
"gte": "now-15m/m",
"lte": "now/m",
"format": "epoch_millis"
}
}
}
}
},
"size": 0,
"aggs": {
"dateAgg": {
"date_histogram": {
"field": "#timestamp",
"time_zone": "Europe/Amsterdam",
"interval": "1m",
"min_doc_count": 1
}
}
}
}
}
}
}
Also, I have used "minimum_number_should_match" to track the threshold value. Is that correct?
Found the solution (note that I have not added a threshold value here; see the sketch after the solution for where a threshold belongs):
{
"actions": {
"email_html_alarm_2daee075-0f24-408e-a362-59172b5e3a1d": {
"name": "email html alarm",
"throttle_period": "1m",
"email_html": {
"stateless": false,
"subject": "Error v1.9 conditon",
"priority": "high",
"html": "<p>{{payload.hits.hits}} test hits Hi {{watcher.username}}</p>\n<p>There are {{payload.hits.total}} results found by the watcher <i>{{watcher.title}}</i>.</p>\n\n<div style=\"color:grey;\">\n <hr />\n <p>This watcher sends alerts based on the following criteria:</p>\n <ul><li>{{watcher.wizard.chart_query_params.queryType}} of {{watcher.wizard.chart_query_params.over.type}} over the last {{watcher.wizard.chart_query_params.last.n}} {{watcher.wizard.chart_query_params.last.unit}} {{watcher.wizard.chart_query_params.threshold.direction}} {{watcher.wizard.chart_query_params.threshold.n}} in index {{watcher.wizard.chart_query_params.index}}</li></ul>\n</div>",
"to": "abc#qwe.com",
"from": "abc#qwe.com"
}
}
},
"input": {
"search": {
"request": {
"index": [
"file-2019.04.03"
],
"body": {
"query": {
"bool": {
"must": {
"query_string": {
"query": "CREDENTIALS ARE NOT FOUND",
"analyze_wildcard": true,
"default_field": "*"
}
},
"filter": [{
"range": {
"#timestamp": {
"gte": "now-1d",
"lte": "now/m",
"format": "epoch_millis"
}
}
}]
}
}
}
}
}
},
"condition": {
"script": {
"script": "payload.hits.total > 0"
}
},
"trigger": {
"schedule": {
"later": "every 2 minutes"
}
},
"disable": true,
"report": false,
"title": "watcher_title",
"save_payload": false,
"spy": false,
"impersonate": false
}
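On the minimum_number_should_match question: it only controls how many should clauses in a bool query have to match, not how many documents count as an alert, so it is not the right place for a threshold. In SentiNL the threshold lives in the condition script. A minimal sketch, assuming a threshold of 1 document:

"condition": {
  "script": {
    "script": "payload.hits.total > 1"
  }
}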

Set up watcher for alerting high CPU usage by some process

I'm trying to create a Watcher alert that will be triggered when some process on a node has used over 95% of CPU (0.95 normalized) for the last hour.
Here is an example of my config:
{
"trigger": {
"schedule": {
"interval": "10m"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"metricbeat*"
],
"types": [],
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"system.process.cpu.total.norm.pct": {
"gte": 0.95
}
}
},
{
"range": {
"system.process.cpu.start_time": {
"gte": "now-1h"
}
}
},
{
"match": {
"environment": "test"
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"send-to-slack": {
"throttle_period_in_millis": 1800000,
"webhook": {
"scheme": "https",
"host": "hooks.slack.com",
"port": 443,
"method": "post",
"path": "{{ctx.metadata.onovozhylov-test}}",
"params": {},
"headers": {
"Content-Type": "application/json"
},
"body": "{ \"text\": \" ==========\nTest parameters:\n\tthrottle_period_in_millis: 60000\n\tInterval: 1m\n\tcpu.total.norm.pct: 0.5\n\tcpu.start_time: now-1m\n\nThe watcher:*{{ctx.watch_id}}* in env:*{{ctx.metadata.env}}* found that the process *{{ctx.system.process.name}}* has been utilizing CPU over 95% for the past 1 hr on node:\n{{#ctx.payload.nodes}}\t{{.}}\n\n{{/ctx.payload.nodes}}\n\nThe runbook entry is here: *{{ctx.metadata.runbook}}* \"}"
}
}
},
"metadata": {
"onovozhylov-test": "/services/T0U0CFMT4/BBK1A2AAH/MlHAF2QuPjGZV95dvO11111111",
"env": "{{ grains.get('environment') }}",
"runbook": "http://mytest.com"
}
}
This Watcher doesn't work when I set the metric system.process.cpu.start_time. Perhaps this metric is not the correct one... Unfortunately, I don't have relevant experience with Watcher to solve this issue on my own.
Another issue is that I don't know how to add system.process.name to the message body.
Thanks in advance for any help!
Use the timestamp field instead of system.process.cpu.start_time to check for all metricbeat-* documents in the last 10 minutes:
"range": {
"timestamp": {
"gte": "now-10m",
"lte": "now"
}
}
To include system.process.name in your message body, look at {{ctx.payload}} and use the appropriate notation to refer to the process name. For example, in one of our watcher configs we use {{_source.appname}} to refer to the application name.
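As a concrete sketch of that last point for this watch: if the search body returns hits (i.e. "size" is set to 1 or more instead of 0), the top hit can be referenced with Mustache array notation in the webhook body. The exact field paths depend on the Metricbeat version and mapping, so host.name below is an assumption:

"body": "{ \"text\": \"High CPU: process {{ctx.payload.hits.hits.0._source.system.process.name}} on {{ctx.payload.hits.hits.0._source.host.name}}\" }"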

How to filter on a date range for Sentinl?

So we've started to implement Sentinl to send alerts. I have managed to get a count of errors sent if it exceeds a specified threshold.
What I'm really struggling with is filtering for the last day!
Could someone please point me in the right direction!
Herewith the script:
{
"actions": {
"Email Action": {
"throttle_period": "0h0m0s",
"email": {
"to": "juan#company.co.za",
"from": "elk#company.co.za",
"subject": "ELK - ERRORS caused by CreditDecisionServiceAPI.",
"body": "{{payload.hits.total}} ERRORS caused by CreditDecisionServiceAPI. Threshold is 100."
}
},
"Slack Action": {
"throttle_period": "0h0m0s",
"slack": {
"channel": "#alerts",
"message": "{{payload.hits.total}} ERRORS caused by CreditDecisionServiceAPI. Threshold is 100.",
"stateless": false
}
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"index": [
"*"
],
"types": [],
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"appName": "CreditDecisionServiceAPI"
}
},
{
"match": {
"level": "ERROR"
}
},
{
"range": {
"timestamp": {
"from": "now-1d"
}
}
}
]
}
}
}
}
}
},
"condition": {
"script": {
"script": "payload.hits.total > 100"
}
},
"transform": {},
"trigger": {
"schedule": {
"later": "every 15 minutes"
}
},
"disable": true,
"report": false,
"title": "watcher_CreditDecisionServiceAPI_Errors"
}
So to be clear, this is the part that's being ignored by the query:
{
"range": {
"timestamp": {
"from": "now-1d"
}
}
}
You need to change it and wrap the range clause in a filter JSON tag, like this:
"filter": [
{
"range": {
"timestamp": {
"gte": "now-1d"
}
}
}
]
So we've FINALLY solved the problem!
Elasticsearch has changed its DSL multiple times, so please note that you need to check which version you're using for the correct solution. We're on version 6.2.3.
The query below finally worked:
"query": {
"bool": {
"must": [
{
"match": {
"appName": "CreditDecisionServiceAPI"
}
},
{
"match": {
"level": "ERROR"
}
},
{
"range": {
"#timestamp": {
"gte": "now-1d"
}
}
}
]
}
}
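A quick way to confirm the date filter actually matches documents before wiring it into the watcher is to run the same range clause directly in Kibana Dev Tools (a sketch against the same wildcard index used above):

GET */_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "@timestamp": {
            "gte": "now-1d"
          }
        }
      }
    }
  }
}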

Elasticsearch (version 2.3) Function Score Query with filtered type query

I am very new to Elasticsearch. We are migrating from Solr to Elasticsearch, and as part of the migration I am converting existing Solr queries to Elasticsearch DSL queries.
Here is the DSL query I have partially completed using the function score feature.
{
"query": {
"function_score": {
"query": {
"filtered": {
"match": {
"name": "barack obama"
},
"filter": {
"range": {
"relevance": {
"gte": 6
}
},
"bool": {
"must_not": [
{
"terms": {
"classIds": [
199,
220
],
"execution": "and"
}
}
],
"must": [
{
"term": {
"classIds": 10597
}
}
]
}
}
}
},
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"lang": "groovy",
"file": "calculate-score",
"params": {
"relevance_boost": 1,
"class_penalize": 0.25
}
}
}
}
]
}
}
}
This query returns an error when I run it against the Elasticsearch cluster. Please help me figure out the issue.
Here calculate-score is a Groovy script and it's working fine; I tested it with a simple query.
Here is the error response:
{
"error": {
"root_cause": [
{
"type": "query_parsing_exception",
"reason": "[filtered] query does not support [match]",
"index": "nodes_5e27a7d3-b370-40bd-9e71-cf04a36297c0",
"line": 6,
"col": 11
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "nodes_5e27a7d3-b370-40bd-9e71-cf04a36297c0",
"node": "NOAwAtVwQS25egu7AIaHEg",
"reason": {
"type": "query_parsing_exception",
"reason": "[filtered] query does not support [match]",
"index": "nodes_5e27a7d3-b370-40bd-9e71-cf04a36297c0",
"line": 6,
"col": 11
}
}
]
},
"status": 400
}
Here is the Solr query I am trying to convert to Elasticsearch:
SOLR QUERY (UNIQUE_NODE_CORE): q={!boost b="product(pow(field(relevance),1.0000),if(exists(query({!v='all_class_ids:226'})),0.25,1),if(exists(query({!v='all_class_ids:14106'})),0.25,1),if(exists(query({!v='all_class_ids:656'})),0.25,1))"}
raw_name:"barack obama"
&rows=1
&start=0
&sort=score desc,relevance desc
-&fq=class_id:"10597"
-fq=relevance:[6 TO *]
-&fq=-all_class_ids:"14127"
-&fq=-all_class_ids:"14106"
-&fq=-all_class_ids:"226"
&fl=ontology_id,url_friendly_name,name,score,raw_notable_for,property_207578
I just need help running a filtered query with function score.
Great job, you're almost there. You're just missing a query section inside your filtered query to wrap the match query. Also, the range filter can be moved into the bool/must. Quite a mouthful, I know.
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"match": {
"name": "barack obama"
}
},
"filter": {
"bool": {
"must_not": [
{
"terms": {
"classIds": [
199,
220
],
"execution": "and"
}
}
],
"must": [
{
"range": {
"relevance": {
"gte": 6
}
}
},
{
"term": {
"classIds": 10597
}
}
]
}
}
}
},
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"lang": "groovy",
"file": "calculate-score",
"params": {
"relevance_boost": 1,
"class_penalize": 0.25
}
}
}
}
]
}
}
}
Note that since ES 2.0 the filtered query is deprecated and you can rewrite it with a bool/must/filter query like this:
{
"query": {
"function_score": {
"query": {
"bool": {
"must": {
"match": {
"name": "barack obama"
}
},
"filter": [
{
"range": {
"relevance": {
"gte": 6
}
}
},
{
"term": {
"classIds": 10597
}
}
],
"must_not": [
{
"terms": {
"classIds": [
199,
220
],
"execution": "and"
}
}
]
}
},
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"lang": "groovy",
"file": "calculate-score",
"params": {
"relevance_boost": 1,
"class_penalize": 0.25
}
}
}
}
]
}
}
}
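As a usage note, either variant can be sanity-checked with curl before embedding it in the application. This is only a sketch of a trimmed-down bool variant; the host and index name are placeholders:

curl -XPOST 'http://localhost:9200/my_index/_search?pretty' -d '{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": { "match": { "name": "barack obama" } },
          "filter": { "range": { "relevance": { "gte": 6 } } }
        }
      },
      "boost_mode": "replace",
      "functions": [
        {
          "script_score": {
            "script": {
              "lang": "groovy",
              "file": "calculate-score",
              "params": { "relevance_boost": 1, "class_penalize": 0.25 }
            }
          }
        }
      ]
    }
  }
}'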
