promtail: transform the whole log line based on regex - grafana-loki

I'm having some challenges with coercing my log lines into a certain format.
I'm running one promtail instance on several log files, of which some are logfmt and others are free-form.
My objective is to transform the free-form ones into the same logfmt as the others, independent of any other labeling. That means the actual payload (log line) pushed to my qryn instance should end up in the same format, and I wouldn't even be able to "see" the original, free-form log line downstream. This would let me use a simple | logfmt in Grafana, regardless of the log source.
I've tried several approaches, but I can't get the log line replaced: while I can extract to labels in every way conceivable, I can't replace the actual log line itself.
A (slightly redacted) promtail-config.yml:
server:
  disable: true

positions:
  filename: ${RUNDIR}/.logs/positions.yaml

clients:
  - url: http://mylocalqryn:33100/loki/api/v1/push
    batchwait: 5s
    timeout: 30s

scrape_configs:
  - job_name: consolidated-logs
    # https://grafana.com/docs/loki/latest/clients/promtail/pipelines/
    # https://grafana.com/docs/loki/latest/clients/promtail/stages/template/
    pipeline_stages:
      - match:
          selector: '{ Program="freeformlog" }'
          stages:
            - regex:
                expression: '^(?P<time>^[0-9-:TZ.+]*)\s+(?P<level>[A-z]*)\s+(?P<Function>[0-9A-z:.]*)\s+(?P<msg>.*$)'
            - timestamp:
                format: RFC3339
                source: time
            - template:
                source: level
                template: '{{ ToLower .Value }}'
            - labels:
                level:
                msg:
                Function:
            - replace:
                expression: '.*'
                replace: 'time="{{ .timestamp }}" level="{{ .level }}" msg="{{ .msg }}" Host="{{ .Host }}" Program="{{ .Program }}" Function="{{ .Function }}"'
    static_configs:
      - targets:
          - localhost
        labels:
          Host: ${HOST:-"_host-unknown_"}
          Program: logfmtcompat
          __path__: ${RUNDIR}/.logs/logfmtcompat.log
      - targets:
          - localhost
        labels:
          Host: ${HOST:-"_host-unknown_"}
          Program: freeformlog
          __path__: ${RUNDIR}/.logs/freeformlog.log
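
A note for anyone landing here: in my reading of the docs, the replace stage only rewrites the substrings its expression matches within the existing line, while promtail's output stage sets the entire log line from a key in the extracted map. A minimal sketch of that approach, untested against this exact config (formatted is an arbitrary key name I made up, and only the regex-extracted fields are referenced, since I don't believe static labels land in the extracted map):

pipeline_stages:
  - match:
      selector: '{ Program="freeformlog" }'
      stages:
        - regex:
            expression: '^(?P<time>^[0-9-:TZ.+]*)\s+(?P<level>[A-z]*)\s+(?P<Function>[0-9A-z:.]*)\s+(?P<msg>.*$)'
        # template creates the new key 'formatted' in the extracted map
        - template:
            source: formatted
            template: 'time="{{ .time }}" level="{{ .level }}" Function="{{ .Function }}" msg="{{ .msg }}"'
        # output replaces the log line with the extracted key's value
        - output:
            source: formatted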

Related

Multiple expression RegEx in Ansible

Note: I have next to zero experience with Ansible.
I need to conditionally modify the configuration of a Kubernetes cluster control-plane service. To do this, I need to find a specific piece of information in the file and, if its value matches a specific pattern, change it to something else.
To illustrate, consider the following YAML file:
apiVersion: v1
kind: Pod
metadata:
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=127.0.0.1
...
In this scenario, the line I'm interested in is the one containing --bind-address. If that field's value is "127.0.0.1", it needs to be changed to "0.0.0.0". If it's already "0.0.0.0", nothing needs to be done. (I could also approach it from the other direction: if it's not "0.0.0.0", it needs to change to that.)
The initial thought that comes to mind is: just search for "--bind-address=127.0.0.1" and replace it with "--bind-address=0.0.0.0". Simple enough, eh? No, not that simple. What if, for some reason, there is another piece of configuration in this file that also matches that pattern? Which one is the right one?
The only way I can think of to ensure I find the right text to change is a multiple-expression RegEx match. Something along the lines of:

find spec:
  if found, find containers: "within" or "under" spec:
    if found, find - command: "within" or "under" containers: (note: there can be more than one "command")
      if found, find - kube-controller-manager "within" or "under" - command:
        if found, find - --bind-address "within" or "under" - kube-controller-manager
          if found, get the value after the =
            if 127.0.0.1, change it to 0.0.0.0; otherwise do nothing

How could I write an Ansible playbook to perform these steps, in sequence, and only if each step returns true?
Read the data from the file into a dictionary
- include_vars:
    file: conf.yml
    name: conf
gives
conf:
  apiVersion: v1
  kind: Pod
  metadata:
    labels:
      component: kube-controller-manager
      tier: control-plane
    name: kube-controller-manager
    namespace: kube-system
  spec:
    containers:
      - command:
          - kube-controller-manager
          - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
          - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
          - --bind-address=127.0.0.1
Update the containers
- set_fact:
    containers: []
- set_fact:
    containers: "{{ containers +
                    (update_candidate is all)|ternary([_item], [item]) }}"
  loop: "{{ conf.spec.containers|d([]) }}"
  vars:
    update_candidate:
      - item is contains 'command'
      - item.command is contains 'kube-controller-manager'
      - item.command|select('match', '--bind-address')|length > 0
    update: "{{ item.command|map('regex_replace',
                                 '--bind-address=127.0.0.1',
                                 '--bind-address=0.0.0.0') }}"
    _item: "{{ item|combine({'command': update}) }}"
gives
containers:
  - command:
      - kube-controller-manager
      - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
      - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
      - --bind-address=0.0.0.0
Update conf
conf_update: "{{ conf|combine({'spec': spec}) }}"
spec: "{{ conf.spec|combine({'containers': containers}) }}"

gives
spec:
  containers:
    - command:
        - kube-controller-manager
        - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --bind-address=0.0.0.0

conf_update:
  apiVersion: v1
  kind: Pod
  metadata:
    labels:
      component: kube-controller-manager
      tier: control-plane
    name: kube-controller-manager
    namespace: kube-system
  spec:
    containers:
      - command:
          - kube-controller-manager
          - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
          - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
          - --bind-address=0.0.0.0
Write the update to the file
- copy:
    dest: /tmp/conf.yml
    content: |
      {{ conf_update|to_nice_yaml(indent=2) }}
gives
shell> cat /tmp/conf.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
    - command:
        - kube-controller-manager
        - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --bind-address=0.0.0.0
Example of a complete playbook for testing
- hosts: localhost
  vars:
    conf_update: "{{ conf|combine({'spec': spec}) }}"
    spec: "{{ conf.spec|combine({'containers': containers}) }}"
  tasks:
    - include_vars:
        file: conf.yml
        name: conf
    - debug:
        var: conf
    - set_fact:
        containers: []
    - set_fact:
        containers: "{{ containers +
                        (update_candidate is all)|ternary([_item], [item]) }}"
      loop: "{{ conf.spec.containers|d([]) }}"
      vars:
        update_candidate:
          - item is contains 'command'
          - item.command is contains 'kube-controller-manager'
          - item.command|select('match', '--bind-address')|length > 0
        update: "{{ item.command|map('regex_replace',
                                     '--bind-address=127.0.0.1',
                                     '--bind-address=0.0.0.0') }}"
        _item: "{{ item|combine({'command': update}) }}"
    - debug:
        var: containers
    - debug:
        var: spec
    - debug:
        var: conf_update
    - copy:
        dest: /tmp/conf.yml
        content: |
          {{ conf_update|to_nice_yaml(indent=2) }}
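
For reference, assuming the playbook above is saved as playbook.yml in the same directory as conf.yml (both names are illustrative), running it and checking the result would look like:

shell> ansible-playbook playbook.yml
shell> cat /tmp/conf.yml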

how to filter rows in promtail yaml config

I'm a bit new to Grafana so this might be an easy one! I have a simple config-promtail.yaml file loading logs into Loki and everything is working, but I'd like to restrict the log rows passed to Loki to only those lines that include the word "error". Here is what I have:
server:
  http_listen_port: <port #>
  grpc_listen_port: <port #>

positions:
  filename: /tmp/positions.yaml

clients:
  - url: 'http://10.128.15.231:3100/loki/api/v1/push'

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: log_export
          __path__: /path/to/log/file.log
          host: host-name
    pipeline_stages:
      - match:
          selector: '{host="host-name"} |= "error"'
          action: keep
It works fine until I add the |= "error" part.
I've also tried something like this:
pipeline_stages:
  - match:
      selector: '{host="host-name"}'
      stages:
        - regex:
            expression: '.*error.*'
which also throws config errors. It seems like this should be relatively simple, but the documentation is really not clear... Thanks in advance for any assistance!
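
If the goal is to keep only lines containing "error", one documented pattern is to invert the filter and drop everything else with a match stage (a sketch, assuming a promtail version whose match selector accepts LogQL line-filter expressions):

pipeline_stages:
  - match:
      # drop every line that does NOT contain "error"
      selector: '{host="host-name"} != "error"'
      action: drop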

ansible is throwing a Syntax Error while loading YAML: did not find expected key

My YAML looks like:
---
# YAML documents begin with the document separator ---
# The minus in YAML this indicates a list item. The playbook contains a list
# of plays, with each play being a dictionary
-
  # Hosts: where our play will run and options it will run with
  hosts: localhost
  gather_facts: false

  # Vars: variables that will apply to the play, on all target systems
  vars:
    DDVE_public_IP : 34.107.103.175
    destination_port: 3009
    Instance_id : 8529834022607504819
    S3_bucket_name : bucket_for_ddve_6

  # Tasks: the list of tasks that will be executed within the playbook
  tasks:
    - name: login access token
      uri:
        url: https://{{ DDVE_public_IP }}:{{ destination_port }}/{{ resource_path }}
        method: POST
        headers:
          Content-Type: application/json
        body_format: json
        body:
          username: sysadmin
          password: {{ Instance_id }}
        return_content: yes
      ignore_errors: yes
      register: rest_post
      vars:
        resource_path: rest/v1.0/auth
    - name: DEBUG / GOT INFO
      debug:
        msg: "{{ rest_post.json }}"
      when: rest_post.status == 201

  # Handlers: the list of handlers that are executed as a notify key from a task
  # Roles: list of roles to be imported into the play

# Three dots indicate the end of a YAML document
...
shell> ansible-playbook ddve6-post-deploy-object-store.yml
[WARNING]: No inventory was parsed, only implicit localhost is available.
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'.
ERROR! We were unable to read either as JSON nor YAML, these are the errors we got from each:
JSON: Expecting value: line 1 column 1 (char 0)

Syntax Error while loading YAML.
  did not find expected key

The error appears to be in '/Users/juergen/Documents/DPSCodeAcademy/Ansible/#dev/ddve/ddve6-post-deploy-object-store.yml': line 30, column 9, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

          password: {{ Instance_id }}
        return_content: yes
        ^ here
I have no idea where that error is coming from, as I can't find the corresponding problem here.
You just have some indentation errors in that YAML document. Pick an indentation level (e.g., 2 spaces for every level) and stick with it consistently. Many editors have plugins that will syntax check your YAML documents while you write them.
The following validates correctly:
---
- hosts: localhost
  gather_facts: false

  vars:
    DDVE_public_IP: 34.107.103.175
    destination_port: 3009
    Instance_id: 8529834022607504819
    S3_bucket_name: bucket_for_ddve_6

  tasks:
    - name: login access token
      uri:
        url: https://{{ DDVE_public_IP }}:{{ destination_port }}/{{ resource_path }}
        method: POST
        headers:
          Content-Type: application/json
        body_format: json
        body:
          username: sysadmin
          password: "{{ Instance_id }}"
        return_content: true
      ignore_errors: true
      register: rest_post
      vars:
        resource_path: rest/v1.0/auth

    - name: DEBUG / GOT INFO
      debug:
        msg: "{{ rest_post.json }}"
      when: rest_post.status == 201
Note that it is highly uncommon to terminate your YAML documents with the ... marker.
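
One more detail worth spelling out, since the reported error pointed at the password line: a bare value that begins with { is parsed by YAML as the start of a flow mapping, so a Jinja2 expression at the start of a value must be quoted to reach Ansible as a string:

body:
  username: sysadmin
  password: "{{ Instance_id }}"   # quoted: YAML sees a plain string, Jinja2 expands it
  # password: {{ Instance_id }}   # unquoted: YAML parses '{' as a mapping and errors out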

how to exclude logs/events in journalbeat

We are using journalbeat to push logs from a Kubernetes cluster to Elasticsearch. It is working fine and pushing the logs; however, it is also pushing events like "200 OK" and "INFO", which we do not want. The journalbeat.yaml is as follows:
journalbeat.yaml
journalbeat.yml: |
  name: "${NODENAME}"
  journalbeat.inputs:
    - paths: []
      seek: cursor
      cursor_seek_fallback: tail
  processors:
    - add_kubernetes_metadata:
        host: "${NODENAME}"
        in_cluster: true
        default_indexers.enabled: false
        default_matchers.enabled: false
        indexers:
          - container:
        matchers:
          - fields:
              lookup_fields: ["container.id"]
    - decode_json_fields:
        fields: ["message"]
        process_array: false
        max_depth: 1
        target: ""
        overwrite_keys: true
    - drop_event.when:
        or:
          - regexp.kubernetes.pod.name: "filebeat-.*"
          - regexp.kubernetes.pod.name: "journalbeat-.*"
          - regexp.kubernetes.pod.name: "nginx-ingress-controller-.*"
          - regexp.kubernetes.pod.name: "prometheus-operator-.*"
  setup.template.enabled: false
  setup.template.name: "journal-${ENVIRONMENT}-%{[agent.version]}"
  setup.template.pattern: "journal-${ENVIRONMENT}-%{[agent.version]}-*"
  setup.template.settings:
    index.number_of_shards: 10
    index.refresh_interval: 10s
  output.elasticsearch:
    hosts: '${ELASTICSEARCH_HOSTS:elasticsearch:9200}'
    username: '${ELASTICSEARCH_USERNAME}'
    password: '${ELASTICSEARCH_PASSWORD}'
    index: "journal-${ENVIRONMENT}-system-%{[agent.version]}-%{+YYYY.MM.dd}"
    indices:
      - index: "journal-${ENVIRONMENT}-k8s-%{[agent.version]}-%{+YYYY.MM.dd}"
        when.has_fields:
          - 'kubernetes.namespace'
How can I exclude events like "INFO" and "200 OK"?
As far as I'm aware, there is no way to exclude logs in Journalbeat. It works the other way around, meaning you tell it what input to look for.
You should read about Configuration input:
By default, Journalbeat reads log events from the default systemd journals. To specify other journal files, set the paths option in the journalbeat.inputs section of the journalbeat.yml file. Each path can be a directory path (to collect events from all journals in a directory), or a file path.
journalbeat.inputs:
  - paths:
      - "/dev/log"
      - "/var/log/messages/my-journal-file.journal"
Within the configuration file, you can also specify options that control how Journalbeat reads the journal files and which fields are sent to the configured output. See Configuration options for a list of available options.
Get familiar with the Configuration options and use the translated fields to target the exact input you want.
{beatname_lc}.inputs:
  - id: consul.service
    paths: []
    include_matches:
      - _SYSTEMD_UNIT=consul.service
  - id: vault.service
    paths: []
    include_matches:
      - _SYSTEMD_UNIT=vault.service
You should use it to target the inputs you want pushed to Elasticsearch.
As an alternative to Journalbeat you could use Filebeat and the exclude might look like this:
type: log
paths:
  {{ range $i, $path := .paths }}
  - {{$path}}
  {{ end }}
exclude_files: [".gz$"]
exclude_lines: ['.*INFO.*']
Hope this helps you a bit.
To apply a filter, use:

logging.level: warning

Use this instruction to drop events from journalbeat.service:

processors:
  - drop_event:
      when:
        equals:
          systemd.unit: "journalbeat.service"
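
Combining the two answers above: the drop_event processor also accepts a regexp condition, so something along these lines might drop the unwanted entries directly (a sketch only; the message field name and the patterns are assumptions to adapt to your actual documents):

processors:
  - drop_event:
      when:
        regexp:
          # drop any event whose message contains INFO or "200 OK"
          message: "INFO|200 OK"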

Ansible yaml anchors and jinja2 templating

How do I overwrite the disk attribute while also not hardcoding the number of disks?
This is what I want it (the tasks/main.yml of that role) to do, but it fails with a syntax error and also requires hardcoding the number of disks:
---
- name: anchors
  when: false
  debug:
    new_disk:
      - &new_disk
        size_gb: 80
        type: thin
        datastore: '{{ item.datastore }}'

- name: Deploy usage001 vms
  loop: '{{ vms.usage001 }}
  vmware_guest:
    disk:
      - <<: *new_disk
      - <<: *new_disk
        '{{ item.disk[0] }}'
      - <<: *new_disk
        '{{ item.disk[1] }}
Where item looks like:

vms:
  usage001:
    disk:
      - size_gb: 1000
      - size_gb: 600
  usage002:
    (...)
The documentation for <<, the Merge Key Language-Independent Type, states:

  The "<<" merge key is used to indicate that all the keys of one or more specified maps should be inserted into the current map.

But you specify the anchor new_disk on a sequence instead of on a map.
You probably want to do:

new_disk:
  - &new_disk
    size_gb: 80
    type: thin
    datastore: '{{ item.datastore }}'
You seem to want to select size_gb: 1000 from your item, but as the quotes sit outside of your Jinja2 expression, the substitution, if it works at all, will result in:

- <<: *new_disk
  'size_gb: 1000'

and for the merge to work it has to be:

- <<: *new_disk
  size_gb: 1000

so make sure you get rid of those quotes.
The selection using item.disk[0], given your item, seems strange as well; I would have expected something like item.vms.usage001.disk[0], but that might be my lack of Jinja2-specific knowledge.
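
Putting those corrections together, the disk list might end up looking like this (a sketch only, not verified against vmware_guest; it still hardcodes the number of overrides, since YAML merge keys offer no loop construct, and it leaves aside the loop-variable question raised above):

- name: Deploy usage001 vms
  vmware_guest:
    disk:
      - <<: *new_disk                          # first disk: defaults only
      - <<: *new_disk                          # second disk: override the size
        size_gb: '{{ item.disk[0].size_gb }}'
      - <<: *new_disk                          # third disk: override the size
        size_gb: '{{ item.disk[1].size_gb }}'
  loop: '{{ vms.usage001 }}'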
