Drop log lines to Loki using multiple conditions with Promtail - grafana-loki

I want to drop lines in Promtail using an AND condition from two different JSON fields.
I have JSON log lines like this.
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET /path HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 2"}
My local Promtail config looks like this.
clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - drop:
                source: "http_user_agent"
                expression: "user agent 1"
            # I want this to be AND
            - drop:
                source: "request"
                expression: "GET / HTTP/1.1"
                drop_counter_reason: my_job_healthchecks
    static_configs:
      - labels:
          job: my-job
Using a Promtail config like this drops lines using OR from my two JSON fields.
How can I adjust my config so that I only drop lines where http_user_agent = user agent 1 AND request = GET / HTTP/1.1?

If you provide multiple options they will be treated like an AND clause, where each option has to be true to drop the log.
If you wish to drop with an OR clause, then specify multiple drop stages.
https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/#drop-stage
Drop logs by time OR length
This would drop all logs older than 24h OR longer than 8kb:
- json:
    expressions:
      time:
      msg:
- timestamp:
    source: time
    format: RFC3339
- drop:
    older_than: 24h
- drop:
    longer_than: 8kb
Drop logs by regex AND length
This would drop all logs that contain the word debug AND are longer than 1kb:
- drop:
    expression: ".*debug.*"
    longer_than: 1kb

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - labels:
                http_user_agent:
                request:
            #### method 1
            - match:
                selector: '{http_user_agent="user agent 1"}'
                stages:
                  - drop:
                      source: "request"
                      expression: "GET / HTTP/1.1"
                      drop_counter_reason: my_job_healthchecks
            ## drops only when both conditions match
            #### method 2
            - match:
                selector: '{http_user_agent="user agent 1",request="GET / HTTP/1.1"}'
                action: drop
            #### method 3, in case a regex pattern is needed
            - match:
                selector: '{http_user_agent="user agent 1"} |~ "(?i).*GET / HTTP/1.1.*"'
                action: drop
    static_configs:
      - labels:
          job: my-job
Note that a match stage can contain another match stage.

Related

Elasticsearch/Kibana shows the wrong timestamp

I transfer log files with Filebeat to Elasticsearch, and the data is analyzed with Kibana.
Now to my problem: Kibana does not show the timestamp from the logfile; it shows the time of transmission in #timestamp instead.
I want Kibana to show the timestamp from the logfile, but that timestamp gets overwritten.
Where is my mistake? Does anyone have a solution?
Here is an example from my logfile and my Filebeat config.
{"#timestamp":"2022-06-23T10:40:25.852+02:00","#version":1,"message":"Could not refresh JMS Connection]","logger_name":"org.springframework.jms.listener.DefaultMessageListenerContainer","level":"ERROR","level_value":40000}
## Filebeat configuration
## https://github.com/elastic/beats/blob/master/deploy/docker/filebeat.docker.yml
#
filebeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: false

filebeat.autodiscover:
  providers:
    # The Docker autodiscover provider automatically retrieves logs from Docker
    # containers as they start and stop.
    - type: docker
      hints.enabled: true

filebeat.inputs:
  - type: filestream
    id: pls-logs
    paths:
      - /usr/share/filebeat/logs/*.log
    parsers:
      - ndjson:

processors:
  - add_cloud_metadata: ~

output.elasticsearch:
  hosts: ['http://elasticsearch:9200']
  username: elastic
  password:

## HTTP endpoint for health checking
## https://www.elastic.co/guide/en/beats/filebeat/current/http-endpoint.html
#
http.enabled: true
http.host: 0.0.0.0
Thanks for any support!
Based on the question, one potential option would be to use Filebeat processors. You could write the initial #timestamp value to another field, like event.ingested, using the script below:
# Script to move the timestamp to the event.ingested field
- script:
    lang: javascript
    id: init_format
    source: >
      function process(event) {
        var fieldTest = event.Get("#timestamp");
        event.Put("event.ingested", fieldTest);
      }
And then, as the last processor, you could set #timestamp again using the timestamp processor:
# Set the timestamp field to the date/time when the event originated,
# which would be the event.created field
- timestamp:
    field: event.created
    layouts:
      - '2006-01-02T15:04:05Z'
      - '2006-01-02T15:04:05.999Z'
      - '2006-01-02T15:04:05.999-07:00'
    test:
      - '2019-06-22T16:33:51Z'
      - '2019-11-18T04:59:51.123Z'
      - '2020-08-03T07:10:20.123456+02:00'

Forwarding Prometheus alert by target group

I want to group some targets into different groups and then send alerts to a different email per target group, not per alert type, name, or label. The scrape_configs section of prometheus.yml is:
scrape_configs:
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['target-1', 'target-2']
        labels:
          machine_group: node
      - targets: ['target-3', 'target-4']
        labels:
          machine_group: oracle
  - job_name: 'oracle_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['target-3', 'target-4']
        labels:
          machine_group: oracle
How can I send any alert (oracle/node exporter metric) from target-3 and target-4 to 'email-1#smth.com', and alerts (node exporter metric) from target-1 and target-2 to 'email-2#smth.com'? My example alertmanager.yml file is:
routes:
  - receiver: 'email-1#smth.com'
    match:
      machine_group: 'oracle'
  - receiver: 'email-2#smth.com'
    match:
      machine_group: 'node'
It doesn't seem to work. Another thing I need to know: is it valid to put a target in multiple groups, like:
labels:
  machine_group: gitlab, genkins
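One likely problem with the config above: in Alertmanager, `routes` must be nested under a top-level `route` block, and each `receiver` must name an entry defined in the `receivers` section rather than being an email address itself. A minimal sketch of a complete alertmanager.yml for this routing (the receiver names and the fallback address are illustrative assumptions, not from the question):

```yaml
route:
  receiver: default          # fallback when no child route matches
  routes:
    - receiver: oracle-team
      match:
        machine_group: oracle
    - receiver: node-team
      match:
        machine_group: node

receivers:
  - name: default
    email_configs:
      - to: 'default@smth.com'
  - name: oracle-team
    email_configs:
      - to: 'email-1@smth.com'
  - name: node-team
    email_configs:
      - to: 'email-2@smth.com'
```

As for the second question: a Prometheus label value is a single string, so `machine_group: gitlab, genkins` would be matched as the literal string "gitlab, genkins". To place a target in two groups you need two separate labels.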

How to parse nested json in Promtail

I have the following log line, which contains nested JSON:
{"level":30,"time":1627625600625,"pid":15676,"hostname":"admin-hp-elitebook-840-g2","reqId":"req-2","req":{"method":"POST","url":"/v1/login","hostname":"127.0.0.1:3000","remoteAddress":"127.0.0.1","remotePort":55884},"msg":"incoming request"}
From that, I would like to create labels for method, URL, and host. I tried the json expression below in Promtail; I was able to extract req, but I don't know how to parse the nested object.
scrape_configs:
  - job_name: plainlog
    pipeline_stages:
      - json:
          expressions:
            req: req
      - labels:
          req:
      - output:
          source: req
    static_configs:
      - targets:
          - localhost
        labels:
          job: plainlog
          __path__: /home/nidhin/Desktop/plainlog/*log
You need to add another json stage. For every level of nested JSON, you add one more json stage that parses data from the extracted data of the level above. E.g.:
- json:
    expressions:
      req:
- json:
    expressions:
      method:
    source: req
More info here: https://grafana.com/docs/loki/latest/clients/promtail/stages/json/#using-extracted-data
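Applied to the original config, a sketch of a full pipeline that extracts all three fields and promotes them to labels (the field names method, url, and hostname come from the sample log line):

```yaml
pipeline_stages:
  - json:
      expressions:
        req:
  - json:
      expressions:
        method:
        url:
        hostname:
      source: req       # parse from the extracted req object
  - labels:
      method:
      url:
      hostname:
```

Keep in mind that promoting url to a Loki label can produce very high label cardinality; it may be safer to keep it as extracted data only.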

Is it possible to take a snapshot and restore with elasticsearch-curator without losing the updates in the destination index?

I am able to run Curator to take a snapshot of the source index and restore that snapshot into the destination index.
But all the updates that I made to the destination index are lost after the next snapshot-and-restore run.
Is it possible to specify not to overwrite the updates in the destination index?
source index: test_index
destination index: dest_test_index
snapshot-action.yml file
actions:
  1:
    action: snapshot
    description: Snapshot selected indices to 'repository' with the snapshot name or name pattern in 'name'. Use all other options as assigned
    options:
      repository: esbackup
      name:
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: pattern
        kind: regex
        value: '^(test_index)$'
        exclude:
restore-action.yml file
actions:
  1:
    action: create_index
    description: "Create the temporary index with dest_index_v2 name"
    options:
      name: dest_index_v2
  2:
    action: close
    description: >-
      Close index dest_index_v2.
    options:
      ignore_empty_list: True
      skip_flush: False
      delete_aliases: False
      ignore_sync_failures: True
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
  3:
    action: restore
    description: >-
      Restore test_index from the most recent snapshot into temp index dest_index_v2.
    options:
      repository: esbackup
      # If name is blank, the most recent snapshot by age will be selected
      name:
      # If indices is blank, all indices in the snapshot will be restored
      indices: ['test_index']
      rename_pattern: test_index
      rename_replacement: dest_index_v2
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: none
  4:
    action: open
    description: >-
      Open index pattern dest_index_v2.
    options:
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
        exclude:
  5:
    description: "Reindex dest_index_v2 into dest_test_index"
    action: reindex
    options:
      wait_interval: 9
      max_wait: -1
      request_body:
        source:
          index: dest_index_v2
        dest:
          index: dest_test_index
    filters:
      - filtertype: none
  6:
    action: delete_indices
    description: >-
      Delete index dest_index_v2. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
If you are looking for an Elasticsearch setting that merges the updated destination index with the source index you are restoring from the snapshot, then the short answer is NO.
You can write custom code that performs the following operations to make sure the destination index updates are not lost:
Restore the source index (test_index) to a temporary index in the cluster; let's call this index temp_index.
Retrieve documents from temp_index and insert them into the destination index (dest_test_index) with op_type=create.
Operation type create makes sure the index operation fails if a document with that id already exists in the index.
You can refer to the documentation here.
Hope this solves your purpose.
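The copy step above can be sketched with the official Python elasticsearch client (assumptions: a temp index named temp_index and the dest_test_index name from the question). The helper that turns scanned documents into bulk actions is a pure function, so the op_type=create behavior is visible in isolation:

```python
def to_create_actions(docs, dest_index):
    """Turn scanned documents into bulk 'create' actions.

    _op_type 'create' makes each bulk sub-operation fail with a 409
    conflict if a document with the same _id already exists in
    dest_index, so documents already updated in the destination
    are left untouched.
    """
    return [
        {
            "_op_type": "create",
            "_index": dest_index,
            "_id": doc["_id"],
            "_source": doc["_source"],
        }
        for doc in docs
    ]


def copy_new_docs(es, src_index="temp_index", dest_index="dest_test_index"):
    # elasticsearch is imported lazily so the pure helper above can be
    # tested without the client library installed
    from elasticsearch import helpers

    docs = helpers.scan(es, index=src_index)  # stream the restored temp index
    # raise_on_error=False: 409 conflicts (doc already exists) are expected
    return helpers.bulk(es, to_create_actions(docs, dest_index),
                        raise_on_error=False)
```

Only documents whose ids are absent from dest_test_index are created; everything else is reported back as a conflict error rather than overwriting the update.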

Sending messages to multiple elastic search indices

We are running an ELK stack to aggregate all our logs and we have multiple systems. Currently, we have Filebeat configured to log to specific indices based on the system (SystemA, SystemB, SystemC).
I would like, additionally, to send all logs with level ERROR to another index where I collect errors across all systems, but I can't figure out how to get Filebeat to send one message to multiple indices.
According to the documentation, the first condition that matches defines the index to be used, which sounds to me as if it's not possible to send a message that matches multiple patterns to multiple indices.
What I want to do:
output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS}'
  username: '${ELASTICSEARCH_USERNAME}'
  password: '${ELASTICSEARCH_PASSWORD}'
  index: "filebeat-external-%{+yyyy.MM.dd}"
  indices:
    - index: "filebeat-error-logs-%{+yyyy.MM.dd}"
      when:
        or:
          - equals:
              level: "ERROR"
          - equals:
              level: "error"
    - index: "filebeat-service-a-%{+yyyy.MM.dd}"
      when:
        regexp:
          container.name: "^service-a-"
    - index: "filebeat-service-b-%{+yyyy.MM.dd}"
      when:
        regexp:
          container.name: "^service-b-"
The only way I currently see is to have multiple indices per system and aggregate them in Kibana:
output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS}'
  username: '${ELASTICSEARCH_USERNAME}'
  password: '${ELASTICSEARCH_PASSWORD}'
  index: "filebeat-external-%{+yyyy.MM.dd}"
  indices:
    - index: "error-log-service-a-%{+yyyy.MM.dd}"
      when:
        and:
          - equals:
              level: "ERROR"
          - regexp:
              container.name: "^service-a-"
    - index: "service-log-service-a-%{+yyyy.MM.dd}"
      when:
        and:
          - not:
              equals:
                level: "ERROR"
          - regexp:
              container.name: "^service-a-"
But this would double our number of indices and is code duplication. Am I missing something here, is there an easier way to have a general error-index but still have errors go to the service-specific indices as well?
