Unable to access data inside alert section of elastalert - elasticsearch

I have been trying to set up elastalert monitoring on my ELK stack. To begin with, I want a simple rule that generates a notification if any disk on the file system reaches 80% usage. The rule seems to work correctly, but in the alert section I am not able to pass the data to my Python script. The uncommented command in the alert section gives the following error:
ERROR:root:Error while running alert command: Error formatting command: 'system.filesystem.mount_point' error.
Here is my rule file. Please excuse the formatting of the yaml.
name: Metricbeat high FS percentage
type: metric_aggregation
es_host: localhost
es_port: 9200
index: metricbeat-*
minutes: 1
metric_agg_key: system.filesystem.used.pct
metric_agg_type: max
query_key: beat.name.keyword
doc_type: metricsets
minutes: 1
minutes: 2
sync_bucket_interval: true
#allow_buffer_time_overlap: true
#use_run_every_query_size: true
max_threshold: 0.8
- query:
query: "system.filesystem.device_name: dev"
analyze_wildcard: true
- term:
metricset.name: filesystem
# (Required)
# The alert is use when a match is found
- debug
- command
command: ["/home/ubuntu/sendToSlack.py","beat-name","%(beat.name.keyword)s","used_pc","%(system.filesystem.used.pct_max)s","mount_point","%(system.filesystem.mount_point)s"]
# command: ["/home/ubuntu/sendToSlack.py","--beat-name","{match[beat.name.keyword]}","--mount_point","{match[system.filesystem.mount_point]}"]
# command: ["/home/ubuntu/sendToSlack.py","--beat-name","{match[beat][name]}","--mount_point","{match[system][filesystem][mount_point]}"]
#pipe_match_json: true
#- command:
# command: ["/home/ubuntu/sendToSlack.py","%(system.filesystem.used.bytes)s"]
Some observations:
On testing the rule file using the command python -m elastalert.test_rule rules/high_fs.yaml I get the output
Successfully loaded Metricbeat high FS percentage
Got 149161 hits from the last 1 day
Available terms in first hit:
I should be able to access any of the fields mentioned above. When I run this rule using python -m elastalert.elastalert --verbose --rule rules/high_fs.yaml a list is printed on the screen
@timestamp: 2017-10-18T17:15:00Z
beat.name.keyword: my_server_name
num_hits: 98
num_matches: 5
system.filesystem.used.pct_max: 0.823400020599
I am able to access all the key-value pairs in this list. Anything that's outside the list fails with the formatting error. I have been stuck on this for a long time. Any help is appreciated.
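A minimal sketch of why the error message is just the quoted field name, assuming the alerter applies Python's old-style `%(key)s` formatting to each command argument against the match dictionary (I have not checked this against the elastalert source). The `match` dict and `format_command` helper are hypothetical stand-ins; the keys are copied from the list above.

```python
# Hypothetical stand-in for the match of a metric_aggregation rule:
# only aggregated keys like the ones printed by --verbose are present.
match = {
    "beat.name.keyword": "my_server_name",
    "num_hits": 98,
    "num_matches": 5,
    "system.filesystem.used.pct_max": 0.823400020599,
}


def format_command(command, match):
    """Apply old-style %(key)s formatting to every argument."""
    return [arg % match for arg in command]


# Keys that exist in the match format without trouble.
ok = format_command(
    ["sendToSlack.py", "beat-name", "%(beat.name.keyword)s",
     "used_pc", "%(system.filesystem.used.pct_max)s"],
    match,
)

# A key that is absent from the match raises KeyError, and the text of a
# KeyError is simply the quoted key name.
try:
    format_command(["%(system.filesystem.mount_point)s"], match)
    error = None
except KeyError as exc:
    error = str(exc)
```

If that assumption holds, the command fails because `system.filesystem.mount_point` is not part of the aggregated match at all, not because the syntax is wrong.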

UPDATE: A reply to the same problem on elastalert's GitHub repo says that certain query types do not contain the full field data.
I am not sure this is the correct way to achieve what I was looking for, but I was able to get the desired output by using the rule type any and writing my own filters. Here is how one of my rule files looks currently.
name: High CPU percentage
type: any
es_host: localhost
es_port: 9200
index: consumer-*
- beat.name
- range:
from: 0.95
to: 10.0
minutes: 60
- command:
command: ["/home/ubuntu/slackScripts/sendCPUDetails.py","{match[beat][name]}","{match[system][cpu][total_norm_pct]}"]
new_style_string_format: true
Hope it helps someone.
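For anyone wondering how the `{match[beat][name]}` placeholders resolve, here is a small sketch using plain `str.format`, which is what I assume `new_style_string_format: true` switches to. The document below is a made-up example with the nesting Elasticsearch would return for these fields.

```python
# Hypothetical full document, nested the way the fields appear in _source.
match = {
    "beat": {"name": "my_server_name"},
    "system": {"cpu": {"total_norm_pct": 0.97}},
}

command = [
    "/home/ubuntu/slackScripts/sendCPUDetails.py",
    "{match[beat][name]}",
    "{match[system][cpu][total_norm_pct]}",
]

# str.format resolves each [key] lookup one nesting level at a time.
formatted = [arg.format(match=match) for arg in command]
```

With the `any` rule type the match is the whole document, so nested lookups like these have something to resolve against.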


cloud-init: delay disk_setup and fs_setup

I have a cloud-init file that sets up all requirements for our AWS instances, and part of those requirements is formatting and mounting an EBS volume. The issue is that on some instances the volume is attached after the instance is up, so when cloud-init executes, the volume /dev/xvdf does not yet exist and the setup fails.
I have something like:
resize_rootfs: false
table_type: 'gpt'
layout: true
overwrite: false
- label: DATA
filesystem: 'ext4'
device: '/dev/xvdf'
partition: 'auto'
- [xvdf, /data, auto, "defaults,discard", "0", "0"]
I would like to have something like a sleep 60 before the disk configuration block.
If the whole cloud-init execution can be delayed, that would also work for me.
Also, I'm using terraform to create the infrastructure.
I guess cloud-init does have an option for running ad hoc commands. Have a look at this link.
I am not sure what your code looks like, but I just tried passing the snippet below as user_data in AWS and could see that the init script sleeps for 1000 seconds (I added a couple of echo statements to check later). I guess you can add a little more logic to verify the presence of the volume as well.
- [ sh, -c, "echo before sleep:`date` >> /tmp/user_data.log" ]
- [ sh, -c, "sleep 1000" ]
- [ sh, -c, "echo after sleep:`date` >> /tmp/user_data.log" ]
<Rest of the script>
I was able to resolve the issue with two changes:
Changed the mount options, adding the nofail option.
Added a line to the runcmd block, deleting the semaphore file for disk_setup.
So my new cloud-init file now looks like this:
resize_rootfs: false
table_type: 'gpt'
layout: true
overwrite: false
- label: DATA
filesystem: 'ext4'
device: '/dev/xvdf'
partition: 'auto'
- [xvdf, /data, auto, "defaults,discard,nofail", "0", "0"]
- [rm, -f, /var/lib/cloud/instances/*/sem/config_disk_setup]
mode: reboot
timeout: 30
It will reboot, then it will execute the disk_setup module once more. By this time, the volume will be attached so the operation won't fail.
I guess this is kind of a hacky way to solve this, so if someone has a better answer (like how to delay the whole cloud-init execution) please share it.
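One direction that avoids a fixed sleep is to poll until the device node shows up. This is only a sketch: `wait_for_path` is a hypothetical helper, and how and when cloud-init would invoke it is left open here since I have not tested that part.

```python
import os
import time


def wait_for_path(path, timeout=60.0, interval=1.0):
    """Poll until `path` exists; return True if it appeared, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example call for the device from the question (commented out so the
# sketch does not block when imported or run on a machine without it):
# wait_for_path("/dev/xvdf", timeout=300)
```

Compared with `sleep 1000`, this returns as soon as the volume is attached and gives up after a bounded time, so the caller can decide what to do on a timeout.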

Transform String into JSON so that it's searchable in Kibana/Elasticsearch

I have Elasticsearch, Filebeat and Kibana running on a Windows machine. Filebeat has a proper log file configured and is watching the path. When I look at the data in Kibana it looks fine.
My issue is that the message field is a String.
Example of one log line:
12:58:09.9608 Trace {"message":"No more Excel rows found","level":"Trace","logType":"User","timeStamp":"2020-08-14T12:58:09.9608349+02:00","fingerprint":"226fdd2-e56a-4af4-a7ff-724a1a0fea24","windowsIdentity":"mine","machineName":"NAME-PC","processName":"name","processVersion":"","jobId":"957ef018-0a14-49d2-8c95-2754479bb8dd","robotName":"NAME-PC","machineId":6,"organizationUnitId":1,"fileName":"GetTransactionData"}
What I would like now is to have that String converted to JSON, so that it is possible to search in Kibana for, for example, the level field.
I already had a look at Filebeat. There I tried to enable the Logstash output, but then the data no longer arrives in Elasticsearch, and the log file is not generated in the Logstash folder either.
Then I downloaded LogStash via install guide, but unfortunately I got this message:
Logstash logs to C:/Users/mine/Desktop/logstash-7.8.1/logs which
is now configured via log4j2.properties ERROR: Pipelines YAML file is
empty. Location:
C:/Users/mine/Desktop/logstash-7.8.1/config/pipelines.yml usage:
bin/logstash -f CONFIG_PATH [-t] [-r] [] [-w COUNT] [-l LOG]
bin/logstash --modules MODULE_NAME [-M
[-w COUNT] [-l LOG] bin/logstash -e CONFIG_STR [-t] [--log.level
fatal|error|warn|info|debug|trace] [-w COUNT] [-l LOG] bin/logstash
-i SHELL [--log.level fatal|error|warn|info|debug|trace] bin/logstash -V [--log.level fatal|error|warn|info|debug|trace]
bin/logstash --help
[2020-08-14T15:07:51,696][ERROR][org.logstash.Logstash ]
java.lang.IllegalStateException: Logstash stopped processing because
of an error: (SystemExit) exit
I tried to use Filebeat only. Here I set:
- add_host_metadata: ~
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~
- dissect:
tokenizer: '"%{event_time} %{loglevel} %{json_message}"'
field: "message"
target_prefix: "dissect"
- decode_json_fields:
fields: ["json_message"]
but that gave me:
The tip about removing the double quotes from the tokenizer helped. Then I got:
I simply refreshed the index and the message was gone. Nice.
But the question now is how to filter for something in the new field.
The message says that your pipelines config is empty. It seems you have not configured any pipeline yet. Logstash can do the trick (JSON filter plugin), but Filebeat is sufficient here. If you don't want to introduce another service, this is the better option.
It has the decode_json_fields processor, which decodes specific fields containing JSON strings in your event into structured data. Here is the documentation.
For the future case where your whole event is JSON, it is possible to parse it in Filebeat by configuring json.message_key and the related json.* options.
EDIT - Added a Filebeat snippet as a processors example that dissects the log line into three fields (event_time, loglevel, json_message). Afterwards the newly extracted field json_message, whose value is a JSON object encoded as a string, is decoded into a JSON structure:
- type: log
- path to your logfile
- dissect:
tokenizer: '%{event_time} %{loglevel} %{json_message}'
field: "message"
target_prefix: "dissect"
- decode_json_fields:
fields: ["dissect.json_message"]
target: ""
- drop_fields:
fields: ["dissect.json_message"]
If you want to practice with the Filebeat processors, try to set the correct event timestamp, taken from the encoded JSON and written into @timestamp using the timestamp processor.
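To make it easier to see what the two processors do to the example line, here is a rough Python emulation of the dissect and decode_json_fields steps. It is not how Filebeat implements them, and the log line is a shortened copy of the one in the question.

```python
import json

# Shortened version of the example log line from the question.
line = ('12:58:09.9608 Trace {"message":"No more Excel rows found",'
        '"level":"Trace","logType":"User","machineId":6}')

# dissect step, tokenizer '%{event_time} %{loglevel} %{json_message}':
# split on the first two spaces only, the remainder stays in json_message.
event_time, loglevel, json_message = line.split(" ", 2)

# decode_json_fields step: turn the JSON string into a structure.
decoded = json.loads(json_message)

# target: "" is emulated by placing the decoded keys at the event root.
event = {"dissect": {"event_time": event_time, "loglevel": loglevel}}
event.update(decoded)
```

Assuming the decoded keys end up at the root of the event like this, `level` becomes a field of its own, and a Kibana query such as `level:Trace` should then match on it.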

Kibana Filebeat Index Pattern is not working

I am trying to use Filebeat data for a Kibana dashboard, but it doesn't work. How can I fix this error? I have added the error output and my filebeat.yml content below. I can't see an error in filebeat.yml, and I have done the necessary configuration, but Filebeat won't run. The filebeat-* pattern does not work when creating an index pattern in Kibana.
filebeat version 1.3.1 (amd64)
dev@dev-Machine:~$ service filebeat status
filebeat.service - filebeat
Loaded: loaded (/lib/systemd/system/filebeat.service; enabled; vendor preset: enable
Active: failed (Result: start-limit-hit) since Fri 2018-11-23 02:34:06 +03; 7h ago
Docs: https://www.elastic.co/guide/en/beats/filebeat/current/index.html
Process: 822 ExecStart=/usr/bin/filebeat -c /etc/filebeat/filebeat.yml (code=exited,
Main PID: 822 (code=exited, status=1/FAILURE)
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Unit entered failed state.
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Failed with result 'exit-cod
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Service hold-off time over,
Nov 23 02:34:06 dev-Machine systemd[1]: Stopped filebeat.
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Start request repeated too q
Nov 23 02:34:06 dev-Machine systemd[1]: Failed to start filebeat.
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Unit entered failed state.
Nov 23 02:34:06 dev-Machine systemd[1]: filebeat.service: Failed with result 'start-li
lines 1-15/15 (END)
################### Filebeat Configuration Example #########################
############################# Filebeat ######################################
# List of prospectors to fetch data.
# Each - is a prospector. Below are the prospector specific configurations
# Paths that should be crawled and fetched. Glob based paths.
# To fetch all ".log" files from a specific level of subdirectories
# /var/log/*/*.log can be used.
# For each file found under this path, a harvester is started.
# Make sure not file is defined twice as this can lead to unexpected behaviour.
- /var/log/*.log
#- c:\programdata\elasticsearch\logs\*
# Type of the files. Based on this the way the file is read is decided.
# The different types cannot be mixed in one prospector
# Possible options are:
# * log: Reads every line of the log file (default)
# * stdin: Reads the standard in
input_type: log
# exclude_lines. By default, no lines are dropped.
# exclude_lines: ["^DBG"]
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list. The include_lines is called before
# exclude_lines. By default, all the lines are exported.
# include_lines: ["^ERR", "^WARN"]
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
# exclude_files: [".gz$"]
# Optional additional fields. These field can be freely picked
# to add additional information to the crawled log files for filtering
# level: debug
# review: 1
# fields.
#fields_under_root: false
# Time strings like 2h (2 hours), 5m (5 minutes) can be used.
#ignore_older: 0
# Close older closes the file handler for which were not modified
# for longer then close_older
# Time strings like 2h (2 hours), 5m (5 minutes) can be used.
#close_older: 1h
# Type to be published in the 'type' field. For Elasticsearch output,
# the type defines the document type these entries should be stored
# in. Default: log
#document_type: log
# to 0s, it is done as often as possible. Default: 10s
#scan_frequency: 10s
# Defines the buffer size every harvester uses when fetching the file
#harvester_buffer_size: 16384
# Maximum number of bytes a single log event can have
# All bytes after max_bytes are discarded and not sent. The default is 10MB.
# This is especially useful for multiline log messages which can get large.
#max_bytes: 10485760
# Mutiline can be used for log messages spanning multiple lines. This is common
# for Java Stack Traces or C-Line Continuation
# The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
#pattern: ^\[
# Defines if the pattern set under pattern should be negated or not. Default is false.
#negate: false
# Default is 500
#max_lines: 500
# Default is 5s.
#timeout: 5s
#tail_files: false
# Every time a new line appears, backoff is reset to the initial value.
#backoff: 1s
# file after having backed off multiple times, it takes a maximum of 10s to read the new line
#max_backoff: 10s
# The backoff factor defines how fast the algorithm backs off. The bigger the backoff factor,
# the faster the max_backoff value is reached. If this value is set to 1, no backoff will happen.
# The backoff value will be multiplied each time with the backoff_factor until max_backoff is reached
#backoff_factor: 2
# This option closes a file, as soon as the file name changes.
# This config option is recommended on windows only. Filebeat keeps the files it's reading open. This can cause
# issues when the file is removed, as the file will not be fully removed until also Filebeat closes
# the reading. Filebeat closes the file handler after ignore_older. During this time no new file with the
# same name can be created. Turning this feature on the other hand can lead to loss of data
# on rotate files. It can happen that after file rotation the beginning of the new
# file is skipped, as the reading starts at the end. We recommend to leave this option on false
# but lower the ignore_older value to release files faster.
#force_close_files: false
# Additional prospector
# Configuration to use stdin input
#input_type: stdin
# General filebeat configuration options
# Event count spool threshold - forces network flush if exceeded
#spool_size: 2048
# Enable async publisher pipeline in filebeat (Experimental!)
#publish_async: false
# Defines how often the spooler is flushed. After idle_timeout the spooler is
# Flush even though spool_size is not reached.
#idle_timeout: 5s
# Name of the registry file. Per default it is put in the current working
# directory. In case the working directory is changed after when running
# filebeat again, indexing starts from the beginning again.
registry_file: /var/lib/filebeat/registry
# Full Path to directory with additional prospector configuration files. Each file must end with .yml
# These config files must have the full filebeat config part inside, but only
# the prospector part is processed. All global options like spool_size are ignored.
# The config_dir MUST point to a different directory then where the main filebeat config file is in.
############################# Libbeat Config ##################################
# Base config file used by all other beats for using libbeat features
############################# Output ##########################################
# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.
### Elasticsearch as output
# elasticsearch:
# Array of hosts to connect to.
# Scheme and port can be left out and will be set to the default (http and 9200)
# In case you specify and additional path, the scheme is required: http://localhost:9200/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
# hosts: ["localhost:9200"]
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "test"
#password: "test"
# Number of workers per Elasticsearch host.
#worker: 1
# Optional index name. The default is "filebeat" and generates
# [filebeat-]YYYY.MM.DD keys.
#index: "filebeat"
# A template is used to set the mapping in Elasticsearch
# By default template loading is disabled and no template is loaded.
# These settings can be adjusted to load your own template or overwrite existing ones
# Template name. By default the template name is filebeat.
#name: "filebeat"
# Path to template file
#path: "filebeat.template.json"
# Overwrite existing template
#overwrite: false
# Optional HTTP Path
#path: "/elasticsearch"
# Proxy server url
#proxy_url: http://proxy:3128
# The number of times a particular Elasticsearch index operation is attempted. If
# the indexing operation doesn't succeed after this many retries, the events are
# dropped. The default is 3.
#max_retries: 3
# The maximum number of events to bulk in a single Elasticsearch bulk API index request.
# The default is 50.
#bulk_max_size: 50
# Configure http request timeout before failing an request to Elasticsearch.
#timeout: 90
# The number of seconds to wait for new events between two bulk API index requests.
# If `bulk_max_size` is reached before this interval expires, addition bulk index
# requests are made.
#flush_interval: 1
# Boolean that sets if the topology is kept in Elasticsearch. The default is
# false. This option makes sense only for Packetbeat.
#save_topology: false
# The time to live in seconds for the topology information that is stored in
# Elasticsearch. The default is 15 seconds.
#topology_expire: 15
# tls configuration. By default is off.
# List of root certificates for HTTPS server verifications
#certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for TLS client authentication
#certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#certificate_key: "/etc/pki/client/cert.key"
# Controls whether the client verifies server certificates and host name.
# If insecure is set to true, all server host names and certificates will be
# accepted. In this mode TLS based connections are susceptible to
# man-in-the-middle attacks. Use only for testing.
#insecure: true
# Configure cipher suites to be used for TLS connections
#cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#curve_types: []
# Configure minimum TLS version allowed for connection to logstash
#min_version: 1.0
# Configure maximum TLS version allowed for connection to logstash
#max_version: 1.2
### Logstash as output
# The Logstash hosts
hosts: ["localhost:5044"]
# Number of workers per Logstash host.
#worker: 1
# The maximum number of events to bulk into a single batch window. The
# default is 2048.
#bulk_max_size: 2048
# Set gzip compression level.
#compression_level: 3
# Optional load balance the events between the Logstash hosts
#loadbalance: true
# Optional index name. The default index name depends on the each beat.
# For Packetbeat, the default is set to packetbeat, for Topbeat
# top topbeat and for Filebeat to filebeat.
#index: filebeat
# Optional TLS. By default is off.
# List of root certificates for HTTPS server verifications
#certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for TLS client authentication
#certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#certificate_key: "/etc/pki/client/cert.key"
# Controls whether the client verifies server certificates and host name.
# If insecure is set to true, all server host names and certificates will be
# accepted. In this mode TLS based connections are susceptible to
# man-in-the-middle attacks. Use only for testing.
#insecure: true
# Configure cipher suites to be used for TLS connections
#cipher_suites: []
# Configure curve types for ECDHE based cipher suites
#curve_types: []
### File as output
# Path to the directory where to save the generated files. The option is mandatory.
#path: "/tmp/filebeat"
# Name of the generated files. The default is `filebeat` and it generates files: `filebeat`, `filebeat.1`, `filebeat.2`, etc.
#filename: filebeat
# Maximum size in kilobytes of each file. When this size is reached, the files are
# rotated. The default value is 10 MB.
#rotate_every_kb: 10000
# Maximum number of files under path. When this number of files is reached, the
# oldest file is deleted and the rest are shifted from last to first. The default
# is 7 files.
#number_of_files: 7
### Console output
# console:
# Pretty print json event
#pretty: false
############################# Shipper #########################################
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
# If this options is not defined, the hostname is used.
# The tags of the shipper are included in their own field with each
# transaction published. Tags make it easy to group servers by different
# logical properties.
#tags: ["service-X", "web-tier"]
# Uncomment the following if you want to ignore transactions created
# by the server on which the shipper is installed. This option is useful
# to remove duplicates if shippers are installed on multiple servers.
#ignore_outgoing: true
# How often (in seconds) shippers are publishing their IPs to the topology map.
# The default is 10 seconds.
#refresh_topology_freq: 10
# Expiration time (in seconds) of the IPs published by a shipper to the topology map.
# All the IPs will be deleted afterwards. Note, that the value must be higher than
# refresh_topology_freq. The default is 15 seconds.
#topology_expire: 15
# Internal queue size for single events in processing pipeline
#queue_size: 1000
# Configure local GeoIP database support.
# If no paths are not configured geoip is disabled.
# - "/usr/share/GeoIP/GeoLiteCity.dat"
# - "/usr/local/var/GeoIP/GeoLiteCity.dat"
############################# Logging #########################################
# There are three options for the log ouput: syslog, file, stderr.
# Under Windos systems, the log files are per default sent to the file output,
# under all other system per default to syslog.
# Send all logging output to syslog. On Windows default is false, otherwise
# default is true.
#to_syslog: true
# Write all logging output to files. Beats automatically rotate files if rotateeverybytes
# limit is reached.
#to_files: false
# To enable logging to files, to_files option has to be set to true
# The directory where the log files will written to.
#path: /var/log/mybeat
# The name of the files where the logs are written to.
#name: mybeat
# Configure log file size limit. If limit is reached, log file will be
# automatically rotated
rotateeverybytes: 10485760 # = 10MB
# Number of rotated log files to keep. Oldest files will be deleted first.
#keepfiles: 7
# Enable debug output for selected components. To enable all selectors use ["*"]
# Other available selectors are beat, publish, service
# Multiple selectors can be chained.
#selectors: [ ]
# Sets log level. The default log level is error.
# Available log levels are: critical, error, warning, info, debug
#level: error
First of all, I guess you're using filebeat 1.x (which is a very old version of filebeat).
After stripping the comments from your configuration file, it appears to be wrongly formatted.
Your current configuration:
- /var/log/*.log
input_type: log
registry_file: /var/lib/filebeat/registry
hosts: ["localhost:5044"]
I can see that you have wrong indentation and a missing prospector start dash "-".
I tested this configuration with filebeat-1.3.1-x86_64 and it works.
Can you please try to update your configuration file to:
input_type: log
- /var/log/*.log
registry_file: /var/lib/filebeat/registry
- "localhost:5044"
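If you want to double-check the nesting after editing, a small structural check like the one below can help. It is a sketch that assumes the Filebeat 1.x layout (a `prospectors` list under `filebeat`, and `hosts` under `output.logstash`) and works on the dictionary a YAML parser would return; `dig` and `check_filebeat_config` are hypothetical helper names, not part of Filebeat.

```python
def dig(cfg, *keys):
    """Follow nested dict keys, returning None as soon as one is missing."""
    for key in keys:
        if not isinstance(cfg, dict):
            return None
        cfg = cfg.get(key)
    return cfg


def check_filebeat_config(cfg):
    """Return a list of structural problems in a parsed Filebeat 1.x config."""
    problems = []
    prospectors = dig(cfg, "filebeat", "prospectors")
    if not isinstance(prospectors, list):
        problems.append("filebeat.prospectors must be a list (one '-' per prospector)")
    else:
        for i, prospector in enumerate(prospectors):
            if not isinstance(prospector, dict) or not isinstance(prospector.get("paths"), list):
                problems.append("prospector %d needs a 'paths' list" % i)
    if not isinstance(dig(cfg, "output", "logstash", "hosts"), list):
        problems.append("output.logstash.hosts must be a list")
    return problems


# A config with the prospector dash in place parses to a list of dicts.
good = {
    "filebeat": {
        "prospectors": [{"input_type": "log", "paths": ["/var/log/*.log"]}],
        "registry_file": "/var/lib/filebeat/registry",
    },
    "output": {"logstash": {"hosts": ["localhost:5044"]}},
}

# Without the dash, the prospector settings parse as a plain mapping.
bad = {
    "filebeat": {"prospectors": {"input_type": "log", "paths": ["/var/log/*.log"]}},
    "output": {"logstash": {"hosts": ["localhost:5044"]}},
}

good_problems = check_filebeat_config(good)
bad_problems = check_filebeat_config(bad)
```

The `bad` case mirrors the missing-dash mistake: the settings are all there, but they sit one level too high for Filebeat to treat them as a prospector.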

sematext logagent debugging patterns

I have installed sematext logagent https://sematext.github.io/logagent-js/installation/
I configured it to output to Elasticsearch and all is good except for one thing, which I have spent all day trying to do.
There is zero information on how to debug parsers. I start logagent with "logagent --config logagent.yml -v -j"; the yml file is below.
printStats: 30
# don't write parsed logs to stdout
suppress: false
# Enable/disable GeoIP lookups
# Startup of logagent might be slower, when downloading the GeoIP database
geoipEnabled: false
# Directory to store Logagent status nad temporary files
diskBufferDir: ./tmp
- '/var/log/messages'
- '/var/log/test'
sourceName: !!js/regexp /test/
- type: mysyslog
regex: !!js/regexp /([a-z]){2}(.*)/
fields: [message,severity]
dateFormat: MMM DD HH:mm:ss
module: elasticsearch
url: http://host:9200
index: mysyslog
stdout: yaml # use 'pretty' for pretty json and 'ldjson' for line delimited json (default)
I would expect (based on the scarce documentation) that this would split each line of the test file into two. For example, for 'ggff', 'gg' would be the message and 'ff' would be the severity. But all I can see in Kibana is that 'ggff' is the message and severity is defaulted (?) to info. The problem is, I don't know where the problem is. Does it skip my pattern? Does the match in my pattern fail? Any help would be VERY appreciated.
Setting 'debug: true' in patterns.yml prints detailed info about matched patterns.
Watch Logagent issue #69 (https://github.com/sematext/logagent-js/issues/69) for additional improvements.
The docs moved to http://sematext.com/docs/logagent/ . I recommend www.regex101.com to test regular expressions (please use JavaScript regex syntax).
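As a quick sanity check on the regex itself, here is what the pattern from the question captures for 'ggff'. The sketch uses Python's `re` rather than JavaScript; for this particular pattern the two should behave the same, since a repeated capture group keeps only its last repetition in both.

```python
import re

line = "ggff"

# Pattern from the question: the group ([a-z]) is repeated twice, so it
# keeps only its last repetition.
original = re.match(r"([a-z]){2}(.*)", line).groups()

# Moving the quantifier inside the group captures both characters.
adjusted = re.match(r"([a-z]{2})(.*)", line).groups()
```

So even when the pattern is applied, the first field would be 'g' rather than 'gg' unless the quantifier moves inside the group. This does not explain why the whole line ended up in message, which suggests the pattern was not applied at all, but it is worth ruling out.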
Examples of Syslog messages in /var/log are in the default pattern library:

SaltStack: edit yaml file on minion host based on salt pillar data

Say the minion host has a default yaml configuration named myconf.yaml. What I want to do is edit parts of those yaml entries using values from a pillar. I can't even begin to think how to do this in Salt. The only thing I can think of is to run a custom Python script on the host via cmd.run and feed it input via arguments, but this seems overcomplicated.
I want to avoid file.managed. I cannot use a template, since the .yaml file is big and can be changed by external means. I just want to edit a few parameters in it. I suppose a Python script could do it, but I thought Salt could do it without writing software.
I have found salt.states.file.serialize with the merge_if_exists option. I will try this and report back.
You want file.serialize with the merge_if_exists option.
# states/my_app.sls
- name: /etc/my_app.yaml
- dataset_pillar: my_app:mergeconf
- formatter: yaml
- merge_if_exists: true
# pillar/my_app.sls
opt3: 100
opt4: 200
On the target, /etc/my_app.yaml might start out looking like this (before the state is applied):
# /etc/my_app.yaml
user: a
pass: b
opt1: 1
opt2: 2
opt3: 3
opt4: 4
And would look like this after the state is applied:
user: a
pass: b
opt1: 1
opt2: 2
opt3: 100
opt4: 200
As far as I can tell this uses the same algorithm as pillar merges, so e.g. you can merge or partially overwrite dictionaries, but not lists; lists can only be replaced whole.
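To illustrate the merge behaviour described above (dictionaries merged key by key, lists replaced whole), here is a small recursive merge in Python. It is only an illustration of those semantics with a hypothetical `deep_merge` helper, not Salt's actual implementation.

```python
def deep_merge(base, override):
    """Merge `override` into a copy of `base`; nested dicts merge, the rest is replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the override replace the old value whole.
            merged[key] = value
    return merged


# Values from the example above: the existing file and the pillar data.
existing = {"user": "a", "pass": "b", "opt1": 1, "opt2": 2, "opt3": 3, "opt4": 4}
pillar = {"opt3": 100, "opt4": 200}
result = deep_merge(existing, pillar)

# Made-up nested data showing that lists are not combined.
nested = deep_merge(
    {"db": {"host": "x", "ports": [1, 2]}},
    {"db": {"ports": [3]}},
)
```

Here `result` keeps user, pass, opt1 and opt2 and picks up the new opt3 and opt4, while in `nested` the `ports` list is replaced rather than extended.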
This can be done for both json and yaml with file.serialize. Input can be inline on the state or come from a pillar. A short excerpt follows:
- serialize
# - dataset:
# concurrent_reads: 8
- dataset_pillar: cassandra_yaml
- name: /etc/cassandra/conf/cassandra.yaml
- formatter: yaml
- merge_if_exists: True
- require:
- pkg: cassandra-pkgs
concurrent_reads: "8"
