How to parse nested json in Promtail - grafana-loki

I have the following log line, which contains nested JSON:
{"level":30,"time":1627625600625,"pid":15676,"hostname":"admin-hp-elitebook-840-g2","reqId":"req-2","req":{"method":"POST","url":"/v1/login","hostname":"127.0.0.1:3000","remoteAddress":"127.0.0.1","remotePort":55884},"msg":"incoming request"}
From that, I would like to create labels for the method, URL, and host. I have tried a JSON expression like the one below in Promtail. I was able to extract req, but I don't know how to parse the nested object.
scrape_configs:
- job_name: plainlog
  pipeline_stages:
  - json:
      expressions:
        req: req
  - labels:
      req:
  - output:
      source: req
  static_configs:
  - targets:
    - localhost
    labels:
      job: plainlog
      __path__: /home/nidhin/Desktop/plainlog/*log

You need to add another json stage. For every level of nested JSON, you add one more stage that parses data from the extracted data of the level above. E.g.:
- json:
    expressions:
      req:
- json:
    expressions:
      method:
    source: req
More info here: https://grafana.com/docs/loki/latest/clients/promtail/stages/json/#using-extracted-data
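Applied to the original question, the whole pipeline could then look something like this (a sketch, untested against your setup; the label names method, url, and hostname are assumptions based on the fields you asked about):

```yaml
scrape_configs:
- job_name: plainlog
  pipeline_stages:
  # First stage extracts the nested object as a string into "req"
  - json:
      expressions:
        req: req
  # Second stage parses the extracted "req" string itself
  - json:
      expressions:
        method: method
        url: url
        hostname: hostname
      source: req
  - labels:
      method:
      url:
      hostname:
  static_configs:
  - targets:
    - localhost
    labels:
      job: plainlog
      __path__: /home/nidhin/Desktop/plainlog/*log
```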

Related

Drop log lines to Loki using multiple conditions with Promtail

I want to drop lines in Promtail using an AND condition on two different JSON fields.
I have JSON log lines like this.
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET /path HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 2"}
My local Promtail config looks like this.
clients:
- url: http://localhost:3100/loki/api/v1/push
scrape_configs:
- job_name: testing-my-job-drop
  pipeline_stages:
  - match:
      selector: '{job="my-job"}'
      stages:
      - json:
          expressions:
            http_user_agent:
            request:
      - drop:
          source: "http_user_agent"
          expression: "user agent 1"
      # I want this to be AND
      - drop:
          source: "request"
          expression: "GET / HTTP/1.1"
          drop_counter_reason: my_job_healthchecks
  static_configs:
  - labels:
      job: my-job
Using a Promtail config like this drops lines using OR from my two JSON fields.
How can I adjust my config so that I only drop lines where http_user_agent = user agent 1 AND request = GET / HTTP/1.1?
If you provide multiple options they will be treated like an AND clause, where each option has to be true to drop the log.
If you wish to drop with an OR clause, then specify multiple drop stages.
https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/#drop-stage
Drop logs by time OR length
This would drop all logs older than 24h OR longer than 8 KB:
- json:
    expressions:
      time:
      msg:
- timestamp:
    source: time
    format: RFC3339
- drop:
    older_than: 24h
- drop:
    longer_than: 8kb
Drop logs by regex AND length
This would drop all logs that contain the word "debug" AND are longer than 1 KB:
- drop:
    expression: ".*debug.*"
    longer_than: 1kb
clients:
- url: http://localhost:3100/loki/api/v1/push
scrape_configs:
- job_name: testing-my-job-drop
  pipeline_stages:
  - match:
      selector: '{job="my-job"}'
      stages:
      - json:
          expressions:
            http_user_agent:
            request:
      - labels:
          http_user_agent:
          request:
      #### method 1
      - match:
          selector: '{http_user_agent="user agent 1"}'
          stages:
          - drop:
              source: "request"
              expression: "GET / HTTP/1.1"
              drop_counter_reason: my_job_healthchecks
          ## drops only when both conditions match
      #### method 2
      - match:
          selector: '{http_user_agent="user agent 1",request="GET / HTTP/1.1"}'
          action: drop
      #### method 3, in case you need a regex pattern
      - match:
          selector: '{http_user_agent="user agent 1"} |~ "(?i).*GET / HTTP/1.1.*"'
          action: drop
  static_configs:
  - labels:
      job: my-job
A match stage can contain nested match stages.

replace yaml array with json format using bash, yq, jq

For readability purposes, when displaying information in YAML format I'd like to be able to replace a YAML array with its JSON equivalent.
The catch is that I may have several instances to replace in the YAML file/output, at different paths.
Example:
objects:
- object:
    name: objA
    inputs:
    - dims:
      - 1
      - 3
- object:
    name: objB
    outputs:
    - dims:
      - 5
but I'd like the output to be in JSON format for the dims arrays, like:
objects:
- object:
    name: objA
    inputs:
    - dims: [1,3]
- object:
    name: objB
    outputs:
    - dims: [5]
Converting the value from YAML to JSON is easy, and modifying the value of a YAML node is easy, but I don't see how I can get the value of a "dims" node, convert it to a JSON string, and put it back in the node (without searching explicitly for all instances).
In general, I'm looking for a way to replace the value of a node with the result of processing that value (another example: replacing an id with the name of the corresponding object, retrieved through a REST API request).
objects:
- object:
    name: objA
    dependency: 3fc4bd5b-a6ee-4469-946d-5f780476784e
would be displayed as
objects:
- object:
    name: objA
    dependency: name-of-dependency
where the id is replaced by the friendly name of the dependency
thanks
With mikefarah's yq
yq e '.objects[].object["inputs","outputs"][].dims? |= "["+join(",")+"]"' data.yml
You could use tojson and the update operator |=. This will encode your arrays as JSON, which is a string and therefore itself enclosed in quotes:
yq -y '(.. | .dims? | arrays) |= tojson'
objects:
- object:
name: objA
inputs:
- dims: '[1,3]'
- object:
name: objB
outputs:
- dims: '[5]'
Run with Python yq
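If you'd rather do the transform in Python directly, the same idea can be sketched with PyYAML and json.dumps (an illustration, not equivalent in every edge case to the yq/jq one-liners; encode_dims is a hypothetical helper):

```python
import json
import yaml

doc = yaml.safe_load("""
objects:
- object:
    name: objA
    inputs:
    - dims:
      - 1
      - 3
- object:
    name: objB
    outputs:
    - dims:
      - 5
""")

def encode_dims(node):
    """Recursively replace every list found under a 'dims' key with its JSON text."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "dims" and isinstance(value, list):
                node[key] = json.dumps(value, separators=(",", ":"))
            else:
                encode_dims(value)
    elif isinstance(node, list):
        for item in node:
            encode_dims(item)

encode_dims(doc)
# The JSON strings come out quoted, just like with the jq tojson approach.
print(yaml.safe_dump(doc, default_flow_style=False, sort_keys=False))
```

This prints dims: '[1,3]' and dims: '[5]', matching the quoted-string output shown above.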

How to parse PodSpec.spec.imagePullSecrets from a yaml file?

I want to parse the following structure using go:
---
prjA:
  user1:
    metadata:
      namespace: prj-ns
    spec:
      containers:
      - image: some-contaner:latest
        name: containerssh-client-image
        resources:
          limits:
            ephemeral-storage: 4Gi
          requests:
            ephemeral-storage: 2Gi
        securityContext:
          runAsGroup: 1000
          runAsNonRoot: true
          runAsUser: 1000
      imagePullSecrets:
      - docker-registry-secret
I'm using sigs.k8s.io/yaml to unmarshal YAML:
var userConfig map[string]map[string]kubernetes.PodConfig
err = yaml.UnmarshalStrict(yamlFile, &userConfig)
where kubernetes is imported from github.com/containerssh/kubernetes. Everything works fine - except imagePullSecrets, which gives the following error:
ERROR unmarshal user config file; error [error unmarshaling JSON: while decoding JSON: json: cannot unmarshal string into Go struct field PodSpec.spec.imagePullSecrets of type v1.LocalObjectReference]
What is the correct way to specify / parse an imagePullSecrets in go?
This is a problem with the input - and maybe not a very clear error message.
The imagePullSecrets must be specified using the key name like:
imagePullSecrets:
- name: docker-registry-secret
I'll leave the question up, as it might help other people who run into the same problem.

Get all children key values in a YAML with PyYAML

Say I have a YAML like:
Resources:
  AlarmTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
      - !If
        - ShouldAlarm
        Protocol: email
How do I get each key and value of all the children while walking over each resource, so I can tell whether one of the values contains a certain string? I'm using PyYAML, but I'm also open to using some other library.
You can use the low-level event API if you only want to inspect scalar values:
import yaml

input = """
Resources:
  AlarmTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
      - !If
        - ShouldAlarm
        - Protocol: email
"""

for e in yaml.parse(input):
    if isinstance(e, yaml.ScalarEvent):
        print(e.value)
(I fixed your YAML because it had a syntax error.) This yields:
Resources
AlarmTopic
Type
AWS::SNS::Topic
Properties
Subscription
ShouldAlarm
Protocol
email
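Since the original goal was to find out whether one of the values contains a certain string, the same event stream can simply be filtered (a sketch; needle is a placeholder for the substring you are searching for):

```python
import yaml

doc = """
Resources:
  AlarmTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
      - !If
        - ShouldAlarm
        - Protocol: email
"""

# Collect every scalar value that contains the substring of interest.
needle = "SNS"  # placeholder: the string to search for
matches = [e.value for e in yaml.parse(doc)
           if isinstance(e, yaml.ScalarEvent) and needle in e.value]
print(matches)  # ['AWS::SNS::Topic']
```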

Pass a CloudFormation YAML list via a JSON string parameter

I am attempting to import an existing load balancer into a CloudFormation stack. The listeners must be specified as a YAML list, but there is no CloudFormation parameter type for list (array) or object, so the parameter for the YAML list must be a string. This is causing the following CloudFormation error
Value of property Listeners must be of type List
The value of the string parameter for the listeners is set using the CLI -
aws elb describe-load-balancers --load-balancer-names $ELB_DNS_NAME --query 'LoadBalancerDescriptions[0].ListenerDescriptions[].Listener' | jq --compact-output '.' | sed -e 's/"/\\"/g'
Notice that the resultant JSON from the above command is escaped. I suspect that this is the root cause of the issue.
[
  ...
  {
    "ParameterKey": "ElbListeners",
    "ParameterValue": "[{\"Protocol\":\"TCP\",\"LoadBalancerPort\":443,\"InstanceProtocol\":\"TCP\",\"InstancePort\":31672},{\"Protocol\":\"TCP\",\"LoadBalancerPort\":80,\"InstanceProtocol\":\"TCP\",\"InstancePort\":30545}]"
  },
  ...
]
CloudFormation doesn't seem to offer any way of un-escaping the string parameter, so the following template fails.
AWSTemplateFormatVersion: 2010-09-09
Resources:
  ...
  IngressLoadBalancer:
    Type: AWS::ElasticLoadBalancing::LoadBalancer
    DeletionPolicy: Delete
    Properties:
      Listeners: !Ref ElbListeners
      LoadBalancerName: !Ref ElbName
Parameters:
  ...
  ElbListeners:
    Type: String
    Description: Listeners for the load balancer
    Default: ""
  ElbName:
    Type: String
    Description: Name of the load balancer
    Default: ""
Replacing the quotes in the resultant JSON with ${quote} in the parameters file, and then substituting the quotes back in with !Sub, also fails: the first argument to !Sub can't be a !Ref to a parameter.
I don't know how many listeners there will be, so it's not feasible to hardcode a list of listeners in the template and pass in multiple parameters for the ports/protocols.
How can I pass a YAML list as a JSON string parameter?
You can take the content of the ElbListeners parameter and simply insert it into the template, removing it from your Parameters. The resulting template would look like:
AWSTemplateFormatVersion: 2010-09-09
Resources:
  ...
  IngressLoadBalancer:
    Type: AWS::ElasticLoadBalancing::LoadBalancer
    DeletionPolicy: Delete
    Properties:
      Listeners:
      - Protocol: TCP
        LoadBalancerPort: 443
        InstanceProtocol: TCP
        InstancePort: 31672
      - Protocol: TCP
        LoadBalancerPort: 80
        InstanceProtocol: TCP
        InstancePort: 30545
      LoadBalancerName: !Ref ElbName
Parameters:
  ...
  ElbName:
    Type: String
    Description: Name of the load balancer
    Default: ""
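If the set of listeners isn't known in advance, one option is to render that part of the template from the CLI output before deploying, since YAML is a superset of JSON. A sketch (the listener data is hardcoded here where the output of the aws elb describe-load-balancers ... | jq command above would normally be read in):

```python
import json
import yaml

# In practice this JSON would come from:
#   aws elb describe-load-balancers ... --query '...Listener' | jq --compact-output '.'
listeners_json = """
[{"Protocol":"TCP","LoadBalancerPort":443,"InstanceProtocol":"TCP","InstancePort":31672},
 {"Protocol":"TCP","LoadBalancerPort":80,"InstanceProtocol":"TCP","InstancePort":30545}]
"""

listeners = json.loads(listeners_json)

# Dump the parsed list as a YAML block that can be spliced into the
# template under the load balancer's Properties.
snippet = yaml.safe_dump({"Listeners": listeners},
                         default_flow_style=False, sort_keys=False)
print(snippet)
```

This sidesteps the string-parameter limitation entirely by treating the template as a generated artifact rather than passing the list through a Parameter.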
