Forwarding Prometheus alerts by target group (YAML)

I want to group targets into different groups and then send alerts to different email addresses by target group, not by alert type, name, or label. The scrape_configs section of prometheus.yml is:
scrape_configs:
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['target-1', 'target-2']
        labels:
          machine_group: node
      - targets: ['target-3', 'target-4']
        labels:
          machine_group: oracle
  - job_name: 'oracle_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['target-3', 'target-4']
        labels:
          machine_group: oracle
How can I send any alert (oracle/node exporter metrics) from target-3 and target-4 to 'email-1@smth.com', and alerts (node exporter metrics) from target-1 and target-2 to 'email-2@smth.com'? My example alertmanager.yml file is:
routes:
  - receiver: 'email-1@smth.com'
    match:
      machine_group: 'oracle'
  - receiver: 'email-2@smth.com'
    match:
      machine_group: 'node'
It doesn't seem to be working. Another thing I need to know: is it valid to put a target into multiple groups, like:
labels:
  machine_group: gitlab, jenkins
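Two things are worth checking here. First, in Alertmanager, `routes:` must be nested under the top-level `route:` block, which also needs a default receiver, and every receiver name referenced by a route must be declared under `receivers:`. Second, a label holds exactly one value: `machine_group: gitlab, jenkins` sets the single literal string "gitlab, jenkins". To route one label value set to several groups, use `match_re` with a regex alternative. A minimal sketch (receiver names and addresses below are placeholders, and "genkins" is assumed to mean Jenkins):

```yaml
route:
  receiver: 'default-email'       # fallback receiver; name is a placeholder
  routes:
    - receiver: 'oracle-email'
      match:
        machine_group: 'oracle'
    - receiver: 'node-email'
      match:
        machine_group: 'node'
    - receiver: 'ci-email'
      match_re:
        machine_group: 'gitlab|jenkins'   # regex alternative to "multiple groups"
receivers:
  - name: 'default-email'
    email_configs:
      - to: 'ops@smth.com'
  - name: 'oracle-email'
    email_configs:
      - to: 'email-1@smth.com'
  - name: 'node-email'
    email_configs:
      - to: 'email-2@smth.com'
  - name: 'ci-email'
    email_configs:
      - to: 'ci@smth.com'
```

With this shape, an alert carrying `machine_group: oracle` (from target-3/target-4) goes to the first child route; anything unmatched falls back to the default receiver.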


Drop log lines to Loki using multiple conditions with Promtail

I want to drop lines in Promtail using an AND condition from two different JSON fields.
I have JSON log lines like this.
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET /path HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 2"}
My local Promtail config looks like this.
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - drop:
                source: "http_user_agent"
                expression: "user agent 1"
            # I want this to be AND
            - drop:
                source: "request"
                expression: "GET / HTTP/1.1"
                drop_counter_reason: my_job_healthchecks
    static_configs:
      - labels:
          job: my-job
Using a Promtail config like this drops lines using OR from my two JSON fields.
How can I adjust my config so that I only drop lines where http_user_agent = user agent 1 AND request = GET / HTTP/1.1?
If you provide multiple options they will be treated like an AND clause, where each option has to be true to drop the log.
If you wish to drop with an OR clause, then specify multiple drop stages.
https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/#drop-stage
Drop logs by time OR length
Would drop all logs older than 24h OR longer than 8kb:
- json:
    expressions:
      time:
      msg:
- timestamp:
    source: time
    format: RFC3339
- drop:
    older_than: 24h
- drop:
    longer_than: 8kb
Drop logs by regex AND length
Would drop all logs that contain the word debug AND are longer than 1kb:
- drop:
    expression: ".*debug.*"
    longer_than: 1kb
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - labels:
                http_user_agent:
                request:
            #### method 1
            - match:
                selector: '{http_user_agent="user agent 1"}'
                stages:
                  - drop:
                      source: "request"
                      expression: "GET / HTTP/1.1"
                      drop_counter_reason: my_job_healthchecks
            ## drops only when both conditions match
            #### method 2
            - match:
                selector: '{http_user_agent="user agent 1",request="GET / HTTP/1.1"}'
                action: drop
            #### method 3, in case a regex pattern is needed
            - match:
                selector: '{http_user_agent="user agent 1"} |~ "(?i).*GET / HTTP/1.1.*"'
                action: drop
    static_configs:
      - labels:
          job: my-job
A match stage can contain nested match stages.

How to sms with prometheus/alertmanager

I have two problems that I can't solve because I don't know whether I'm missing something or not.
Here is my Alertmanager configuration. I would like to receive alerts via SMS or via Pushover, but it does not work.
global:
  resolve_timeout: 5m
route:
  group_by: ['critical']
  group_wait: 30s
  group_interval: 180s
  repeat_interval: 300s
  receiver: myIT
receivers:
  - name: 'myIT'
    email_configs:
      - to: me@myfirm
        from: me@myfirm
        smarthost: ssl0.ovh.net:587
        auth_username: 'me@myfirm'
        auth_identity: 'me@myfirm'
        auth_password: 'ZZZZZZZZZZZZZZZZZ'
  - name: Teams
    webhook_configs:
      - url: 'https://teams.microsoft.com/l/channel/19%3xxxxxxxxyyyyuxxxab%40thread.tacv2/Alertes?groupId=xxxxxxxxyyyyuxxx0&tenantId=3caa0abd-0122-496f-a6cf-73cb6d3aaadd'
        send_resolved: true
  - name: Sms
    webhook_configs:
      - url: 'https://www.ovh.com/cgi-bin/sms/http2sms.cgi?&account=sms-XXXXXXX-1&login=XXXXX&password=XXXXXXX&from=XXXXXX&to=0123456789&message=Alert '
        send_resolved: true
  - name: pushover
    pushover_configs:
      - user_key: xxxxxxxxyyyyuxxx
        token: xxxxxxxxyyyyuxxx
For the Pushover part, it works via my Grafana (and even then not all the time). The http2sms URL works every time via a browser.
But neither works under Alertmanager. I would also like to differentiate the alerts: simple warnings in Teams or by email, for example, and critical alerts by SMS.
Did I forget to install something?
Does anyone have a configuration that fits this need? Thank you.
Well, I found it.
route:
  group_by: ['critical']
  group_wait: 30s
  group_interval: 180s
  repeat_interval: 300s
  receiver: myIT
receivers:
  - name: 'myIT'
    email_configs:
      - to: me@myfirm
        from: me@myfirm
        smarthost: ssl0.ovh.net:587
        auth_username: 'me@myfirm'
        auth_identity: 'me@myfirm'
        auth_password: 'ZZZZZZZZZZZZZZZZZ'
    webhook_configs:
      - url: 'https://teams.microsoft.com/l/channel/19%3xxxxxxxxyyyyuxxxab%40thread.tacv2/Alertes?groupId=xxxxxxxxyyyyuxxx0&tenantId=3caa0abd-0122-496f-a6cf-73cb6d3aaadd'
        send_resolved: true
    pushover_configs:
      - user_key: xxxxxxxxyyyyuxxx
        token: xxxxxxxxyyyyuxxx
It works fine like that.
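The second goal from the question, warnings to Teams or email and critical alerts by SMS, can be handled with child routes matching on a severity label. A minimal sketch, assuming the alert rules attach `severity: warning` / `severity: critical` labels and that the Teams and Sms receivers from the question are still defined:

```yaml
route:
  receiver: myIT            # default: email
  routes:
    - match:
        severity: warning
      receiver: Teams
    - match:
        severity: critical
      receiver: Sms
```

Alerts that match neither child route fall through to the default receiver, so email still covers everything else.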

Add endpoint as the receiver in the prometheus alert configuration

I am trying to have my Spring Boot application's endpoints called when an alert rule defined in Prometheus fires. That is, I want to add my application endpoint as a receiver so it gets alerts from the Prometheus Alertmanager. Can anyone suggest how to configure an endpoint as the receiver for this receiver label, instead of a push notifier?
- receiver: 'frontend-pager'
  group_by: [product, environment]
  matchers:
    - team="frontend"
I think a 'webhook receiver' can help you. More information is in the docs: https://prometheus.io/docs/alerting/latest/configuration/#webhook_config
This is an example of a webhook alert built on blackbox_exporter metric scraping.
Prometheus rule setting
You need to create rule(s) to trigger the alert; a rule named 'http_health_alert' is defined here as an example.
groups:
  - name: http
    rules:
      - alert: http_health_alert
        expr: probe_success == 0
        for: 3m
        labels:
          type: http_health
        annotations:
          description: Health check for {{$labels.instance}} is down
Alertmanager setting
'match' is set to the alert name http_health_alert; matching alerts will be sent to 'http://example.com/alert/receiver' via an HTTP POST (an endpoint I assume you will prepare in advance).
The alert is posted in JSON format to the configured endpoint 'http://example.com/alert/receiver'. You can also implement different handling or different receiving methods in the endpoint/program for different label contents.
global:
route:
  group_by: [alertname, env]
  group_wait: 30s
  group_interval: 3m
  repeat_interval: 1h
  routes:
    - match:
        alertname: http_health_alert
      group_by: [alertname, env]
      group_wait: 30s
      group_interval: 3m
      repeat_interval: 1h
      receiver: webhook_receiver
receivers:
  - name: webhook_receiver
    webhook_configs:
      - send_resolved: true
        url: http://example.com/alert/receiver
  - name: other_receiver
    email_configs:
      - send_resolved: true
        to: xx
        from: xxx
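For reference when writing the endpoint side: the webhook receiver POSTs a JSON body whose top-level fields include `version`, `status`, `receiver`, and an `alerts` array, where each alert carries `status`, `labels`, and `annotations`. A minimal Python sketch of parsing such a payload (the example payload below is hand-written in the documented v4 shape, not captured output):

```python
import json

def summarize_alerts(payload: dict) -> list[str]:
    """Return one human-readable line per alert in an
    Alertmanager webhook payload (v4 format)."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        desc = alert.get("annotations", {}).get("description", "")
        lines.append(f'{alert.get("status")}: {labels.get("alertname")} {desc}')
    return lines

# Hand-written example payload in the documented webhook format.
example = json.loads("""
{
  "version": "4",
  "status": "firing",
  "receiver": "webhook_receiver",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "http_health_alert", "type": "http_health"},
      "annotations": {"description": "Health check for 10.0.0.1:80 is down"}
    }
  ]
}
""")

print(summarize_alerts(example))
```

In a Spring Boot application the same structure would typically be bound to a DTO in a `@PostMapping` controller; the field names above are the ones to map.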

How can I config prometheus alert with line-notify

I've been trying to find a way to send alert notifications from my Prometheus server to LINE Notify. I checked the alert rules' status on Prometheus, which is OK, and the alert rules detect events normally. This is my config.yml for Alertmanager:
global:
  resolve_timeout: 5m
route:
  receiver: "line-noti"
  # group_by: ['test-node-linux', 'test-node-windows', 'test-container-exporter', 'test-jmx-exporter']
  group_interval: 10s
  repeat_interval: 1m
receivers:
  - name: 'line-noti'
    webhook_configs:
      - url: 'https://notify-api.line.me/api/notify'
        send_resolved: true
        http_config:
          bearer_token: [my_token]
but it doesn't send any messages to LINE Notify.
How can I solve this?
The problem is in the receiver's name: you have double quotation marks ". The name of the receiver should use either single quotes ' or no quotes at all.
The url can also be written without quotes.
Try this:
global:
  resolve_timeout: 5m
route:
  receiver: line-noti
  # group_by: ['test-node-linux', 'test-node-windows', 'test-container-exporter', 'test-jmx-exporter']
  group_interval: 10s
  repeat_interval: 1m
receivers:
  - name: line-noti
    webhook_configs:
      - url: https://notify-api.line.me/api/notify
        send_resolved: true
        http_config:
          bearer_token: [my_token]

Prometheus federation slaves are set up as follows and the status is up, but no data can be queried. Where is my error?

I want to build a horizontally scaled setup: one global Prometheus that federates from two child Prometheus nodes. For example, nodes A and B monitor node_exporter and mysql_exporter data; the global Prometheus runs on host C, and the child nodes run on hosts A and B respectively.
The master node is configured as follows:
prometheus.yml (host_C):
global:
rule_files:
  # - node_rules/zep.test.rules
scrape_configs:
  - job_name: slaves
    honor_labels: true
    scrape_interval: 1s
    metrics_path: /federate
    params:
      match[]:
        - '{__name__=~"^job:.*"}'
        - '{__job__=~"^job:.*"}'
    static_configs:
      - targets:
          - hostA_ip:9090
          - hostB_ip:9090
The child nodes are configured as follows:
slaves1.yml (host_A):
global:
  external_labels:
    slave: 0
rule_files:
scrape_configs:
  - job_name: myjob_1
    scrape_interval: 1s
    file_sd_configs:
      - files: ['./mytest.json']
    relabel_configs:
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: ^0$
        action: keep
slaves2.yml (host_B):
global:
  external_labels:
    slave: 1
rule_files:
scrape_configs:
  - job_name: myjob_2
    scrape_interval: 1s
    file_sd_configs:
      - files: ['./mytest.json']
    relabel_configs:
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_hash
        action: hashmod
      - source_labels: [__tmp_hash]
        regex: ^1$
        action: keep
mytest.json:
[{
  "targets": [
    "hostA_ip:9100",
    "hostA_ip:9104"
  ],
  "labels": {
    "services": "dba_test"
  }
}]
Run them:
./prometheus --web.listen-address="hostA_ip:9090" --storage.tsdb.path="global_data/" --config.file="prometheus.yml" --web.enable-admin-api
./prometheus --web.listen-address="hostB_ip:9090" --storage.tsdb.path="data1/" --config.file="slave1.yml" --web.enable-admin-api
./prometheus --web.listen-address="hostC_ip:9090" --storage.tsdb.path="data2/" --config.file="slave2.yml" --web.enable-admin-api
The reason for this problem is that the wildcard selectors do not match anything. The official documentation shows `__job__`, but in practice the plain `job` label is what's used. Also check the Targets status page under port 9090 for the actual scrape configuration; not everything works exactly as in the official examples.
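Concretely, `{__name__=~"^job:.*"}` only matches series produced by recording rules named `job:*`, and the child configs above define no recording rules, so the /federate endpoint returns nothing. A sketch of a selector that matches the children's raw series instead, using the job names from the configs above:

```yaml
params:
  match[]:
    - '{job=~"myjob_.*"}'   # selects all series from the children's scrape jobs
```

With `honor_labels: true` already set on the master, the federated series keep the `job` and `instance` labels assigned by the children.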
