Sending messages to multiple Elasticsearch indices - elasticsearch

We are running an ELK stack to aggregate all our logs and we have multiple systems. Currently, we have Filebeat configured to log to specific indices based on the system (SystemA, SystemB, SystemC).
Additionally, I would like to send all logs with level ERROR to another index where I can collect errors across all systems, but I can't figure out how to get Filebeat to send one message to multiple indices.
According to the documentation, the first condition that matches defines the index to be used, which sounds as if it's not possible to send a message that matches multiple patterns to multiple indices.
What I want to do:
output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS}'
  username: '${ELASTICSEARCH_USERNAME}'
  password: '${ELASTICSEARCH_PASSWORD}'
  index: "filebeat-external-%{+yyyy.MM.dd}"
  indices:
    - index: "filebeat-error-logs-%{+yyyy.MM.dd}"
      when:
        or:
          - equals:
              level: "ERROR"
          - equals:
              level: "error"
    - index: "filebeat-service-a-%{+yyyy.MM.dd}"
      when:
        regexp:
          container.name: "^service-a-"
    - index: "filebeat-service-b-%{+yyyy.MM.dd}"
      when:
        regexp:
          container.name: "^service-b-"
The only way I currently see is to have multiple indices per system and aggregate them in Kibana:
output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS}'
  username: '${ELASTICSEARCH_USERNAME}'
  password: '${ELASTICSEARCH_PASSWORD}'
  index: "filebeat-external-%{+yyyy.MM.dd}"
  indices:
    - index: "error-log-service-a-%{+yyyy.MM.dd}"
      when:
        and:
          - equals:
              level: "ERROR"
          - regexp:
              container.name: "^service-a-"
    - index: "service-log-service-a-%{+yyyy.MM.dd}"
      when:
        and:
          - not:
              - equals:
                  level: "ERROR"
          - regexp:
              container.name: "^service-a-"
But this would double our number of indices and duplicate a lot of configuration. Am I missing something here, or is there an easier way to have a general error index while still having errors go to the service-specific indices as well?

Related

Elasticsearch/Kibana shows the wrong timestamp

I ship logfiles to Elasticsearch with Filebeat and analyze the data in Kibana.
Now to my problem: Kibana does not show the timestamp from the logfile. It shows the time of transmission in @timestamp instead. I want Kibana to show the timestamp from the logfile, but that timestamp gets overwritten. Where is my mistake? Does anyone have a solution?
Here is an example line from my logfile and my Filebeat config.
{"@timestamp":"2022-06-23T10:40:25.852+02:00","@version":1,"message":"Could not refresh JMS Connection]","logger_name":"org.springframework.jms.listener.DefaultMessageListenerContainer","level":"ERROR","level_value":40000}
## Filebeat configuration
## https://github.com/elastic/beats/blob/master/deploy/docker/filebeat.docker.yml
#
filebeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: false

filebeat.autodiscover:
  providers:
    # The Docker autodiscover provider automatically retrieves logs from Docker
    # containers as they start and stop.
    - type: docker
      hints.enabled: true

filebeat.inputs:
  - type: filestream
    id: pls-logs
    paths:
      - /usr/share/filebeat/logs/*.log
    parsers:
      - ndjson:

processors:
  - add_cloud_metadata: ~

output.elasticsearch:
  hosts: ['http://elasticsearch:9200']
  username: elastic
  password:

## HTTP endpoint for health checking
## https://www.elastic.co/guide/en/beats/filebeat/current/http-endpoint.html
#
http.enabled: true
http.host: 0.0.0.0
Thanks for any support!
Based on the question, one potential option would be to use Filebeat processors. You could write that initial @timestamp value to another field, like event.ingested, using the script below:
# Script to move the timestamp to the event.ingested field
- script:
    lang: javascript
    id: init_format
    source: >
      function process(event) {
          var fieldTest = event.Get("@timestamp");
          event.Put("event.ingested", fieldTest);
      }
And then the last processor you write could move that event.ingested value back into @timestamp using the following processor:
# set @timestamp to the date/time when the event originated, which we preserved above in event.ingested
- timestamp:
    field: event.ingested
    layouts:
      - '2006-01-02T15:04:05Z'
      - '2006-01-02T15:04:05.999Z'
      - '2006-01-02T15:04:05.999-07:00'
    test:
      - '2019-06-22T16:33:51Z'
      - '2019-11-18T04:59:51.123Z'
      - '2020-08-03T07:10:20.123456+02:00'

Drop log lines to Loki using multiple conditions with Promtail

I want to drop lines in Promtail using an AND condition from two different JSON fields.
I have JSON log lines like this.
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET /path HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 2"}
My local Promtail config looks like this.
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - drop:
                source: "http_user_agent"
                expression: "user agent 1"
            # I want this to be AND
            - drop:
                source: "request"
                expression: "GET / HTTP/1.1"
                drop_counter_reason: my_job_healthchecks
    static_configs:
      - labels:
          job: my-job
Using a Promtail config like this drops lines using OR from my two JSON fields.
How can I adjust my config so that I only drop lines where http_user_agent = user agent 1 AND request = GET / HTTP/1.1?
If you provide multiple options they will be treated like an AND clause, where each option has to be true to drop the log.
If you wish to drop with an OR clause, then specify multiple drop stages.
https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/#drop-stage
Drop logs by time OR length
Would drop all logs older than 24h OR longer than 8kb bytes
- json:
    expressions:
      time:
      msg:
- timestamp:
    source: time
    format: RFC3339
- drop:
    older_than: 24h
- drop:
    longer_than: 8kb
Drop logs by regex AND length
Would drop all logs that contain the word debug AND are longer than 1kb bytes
- drop:
    expression: ".*debug.*"
    longer_than: 1kb
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - labels:
                http_user_agent:
                request:
            #### method 1
            - match:
                selector: '{http_user_agent="user agent 1"}'
                stages:
                  - drop:
                      source: "request"
                      expression: "GET / HTTP/1.1"
                      drop_counter_reason: my_job_healthchecks
                      ## drops only when both conditions match
            #### method 2
            - match:
                selector: '{http_user_agent="user agent 1",request="GET / HTTP/1.1"}'
                action: drop
            #### method 3, in case you need a regex pattern
            - match:
                selector: '{http_user_agent="user agent 1"} |~ "(?i).*GET / HTTP/1.1.*"'
                action: drop
    static_configs:
      - labels:
          job: my-job
Note that a match stage can contain further match stages.

Is it possible to take a snapshot and restore with elasticsearch-curator without losing the updates in the destination index?

I am able to run Curator to take a snapshot of the source index and restore that snapshot into the destination index.
But all the updates that I make on the destination index are lost after the next snapshot and restore action.
Is it possible to specify that the updates in the destination index should not be overwritten?
source index: test_index
destination index: dest_test_index
snapshot-action.yml file
actions:
  1:
    action: snapshot
    description: Snapshot selected indices to 'repository' with the snapshot name or name pattern in 'name'. Use all other options as assigned
    options:
      repository: esbackup
      name:
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: pattern
        kind: regex
        value: '^(test_index)$'
        exclude:
restore-action.yml file
actions:
  1:
    action: create_index
    description: "Create the temporary index with dest_index_v2 name"
    options:
      name: dest_index_v2
  2:
    action: close
    description: >-
      Close index dest_index_v2.
    options:
      ignore_empty_list: True
      skip_flush: False
      delete_aliases: False
      ignore_sync_failures: True
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
  3:
    action: restore
    description: >-
      Restore test_index from the most recent snapshot in temp index dest_index_v2.
    options:
      repository: esbackup
      # If name is blank, the most recent snapshot by age will be selected
      name:
      # If indices is blank, all indices in the snapshot will be restored
      indices: ['test_index']
      rename_pattern: test_index
      rename_replacement: dest_index_v2
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: none
  4:
    action: open
    description: >-
      Open index pattern dest_index_v2.
    options:
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
        exclude:
  5:
    description: "Reindex dest_index_v2 into dest_test_index"
    action: reindex
    options:
      wait_interval: 9
      max_wait: -1
      request_body:
        source:
          index: dest_index_v2
        dest:
          index: dest_test_index
    filters:
      - filtertype: none
  6:
    action: delete_indices
    description: >-
      Delete index dest_index_v2. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
If you are looking for an Elasticsearch setting that will merge the updated destination index with the source index you are restoring from the snapshot, the short answer is NO.
You can write custom code that performs the following steps to make sure the destination index updates are not lost:
1. Restore the source index (test_index) to a temporary index in the cluster; let's call it temp_index.
2. Retrieve documents from temp_index and insert them into the destination index (dest_test_index) with op_type=create.
The create operation type makes sure that the index operation fails if a document with that ID already exists in the index.
You can refer to the documentation here.
Hope this solves your purpose.
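For illustration, here is a minimal sketch of how step 2 could be expressed with the same reindex action already used in step 5 of the restore file, rather than with hand-written retrieval code. This assumes your Curator version passes the reindex request body's conflicts and dest.op_type options through to Elasticsearch (the Reindex API itself supports both):
actions:
  1:
    description: "Copy temp_index into dest_test_index without overwriting existing documents"
    action: reindex
    options:
      wait_interval: 9
      max_wait: -1
      request_body:
        # continue past version conflicts instead of aborting the whole reindex
        conflicts: proceed
        source:
          index: temp_index
        dest:
          index: dest_test_index
          # only create documents that are missing; never overwrite updated ones
          op_type: create
    filters:
      - filtertype: none
With op_type: create plus conflicts: proceed, documents that already exist in dest_test_index are left untouched and only new documents from the snapshot are copied over.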

Ansible use of wildcard in with_items and loop

Team,
Using a single fully defined string in with_items, my task runs fine. However, at scale I would like to loop, with the string inside with_items changing on each iteration. Any hints?
- name: "Fetch all CPU nodes from clusters using K8s beta.kubernetes.io/instance-type"
k8s_facts:
kind: Node
label_selectors:
- "beta.kubernetes.io/instance-type=e1.xlarge"
verify_ssl: no
register: cpu_class_list
failed_when: cpu_class_list == ''
output:
ok: [localhost] => {
    "nodes_class_label": [
        {
            "instanceType": "e1.xlarge",
            "nodeType": "cpu",
            "node_name": "hostA"
        },
        {
            "instanceType": "e1.xlarge",
            "nodeType": "cpu",
            "node_name": "hostB"
        }
    ]
}
I would like to pull all the nodes matching any instance type with a wildcard.
label_selectors:
  - "beta.kubernetes.io/instance-type=e1.xlarge"
  - "beta.kubernetes.io/instance-type=f1.xlarge"
  - "beta.kubernetes.io/instance-type=g1.xlarge"
expected output:
list all e1 label nodes output
list all f1 label nodes output
list all g1 label nodes output
my attempted solution:
- name: "Fetch all CPU nodes from clusters using K8s beta.kubernetes.io/instance-type"
k8s_facts:
kind: Node
label_selectors:
- "beta.kubernetes.io/instance-type=*.xlarge"
verify_ssl: no
register: cpu_class_list
failed_when: cpu_class_list == ''
Unfortunately this is not possible. It is a limitation based on the implementation of the Kubernetes API, unrelated to Ansible or the k8s_facts module.
There are a couple of workarounds, but ultimately I think using a set-based selector is your best option. This would look similar to:
- k8s_facts:
    kind: Node
    label_selectors:
      - "beta.kubernetes.io/instance-type in (e1.xlarge, f1.xlarge, g1.xlarge)"
You should also be able to pull the instance types out into a variable for code readability and maintenance purposes.
The other option is to just loop over the k8s_facts task for each instance type, which I'm guessing you have already considered: "beta.kubernetes.io/instance-type={{ item }}".
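For illustration, a minimal sketch of that loop-based option, assuming a list variable named instance_types (that variable name is mine, not from the question):
- name: "Fetch nodes for each instance type"
  k8s_facts:
    kind: Node
    label_selectors:
      - "beta.kubernetes.io/instance-type={{ item }}"
    verify_ssl: no
  loop: "{{ instance_types }}"   # e.g. ['e1.xlarge', 'f1.xlarge', 'g1.xlarge']
  register: cpu_class_list
Because the task is registered inside a loop, each per-type result ends up under cpu_class_list.results.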
Finally, one of your examples in the question will not work:
label_selectors:
  - "beta.kubernetes.io/instance-type=e1.xlarge"
  - "beta.kubernetes.io/instance-type=f1.xlarge"
  - "beta.kubernetes.io/instance-type=g1.xlarge"
This is looking for nodes that meet all of those criteria (i.e. e1.xlarge && f1.xlarge && g1.xlarge), which will always be none.

How can I store logs in two indices from two JSON-formatted log files using Filebeat with Elasticsearch output?

Below is my Filebeat configuration file, located at /etc/filebeat/filebeat.yml. It throws the following error:
Failed to publish events: temporary bulk send failure
filebeat.prospectors:
  - paths:
      - /var/log/nginx/virus123.log
    input_type: log
    fields:
      type:virus123
    json.keys_under_root: true
  - paths:
      - /var/log/nginx/virus1234.log
    input_type: log
    fields:
      type:virus1234
    json.keys_under_root: true
setup.template.name: "filebeat-%{[beat.version]}"
setup.template.pattern: "filebeat-%{[beat.version]}-*"
setup.template.overwrite: true
processors:
  - drop_fields:
      fields: ["beat","source"]
output.elasticsearch:
  index: index: "filebeat-%{[beat.version]}-%{[fields.type]:other}-%{+yyyy.MM.dd}"
  hosts: ["http://127.0.0.1:9200"]
I think I found your problem, although I'm not sure it is the only one:
index: index: "filebeat-%{[beat.version]}-%{[fields.type]:other}-%{+yyyy.MM.dd}"
should be:
index: "filebeat-%{[beat.version]}-%{[fields.type]:other}-%{+yyyy.MM.dd}"
I have seen a similar problem where a wrong index setting caused the same error you are showing.
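For reference, a minimal sketch of the corrected output section, plus one extra observation of my own that the answer does not cover: if the fields entries really are written without a space (type:virus123), YAML parses them as plain strings rather than as a type key, so %{[fields.type]} would never resolve; writing them as a key/value pair avoids that:
filebeat.prospectors:
  - paths:
      - /var/log/nginx/virus123.log
    input_type: log
    fields:
      # note the space after the colon so YAML parses this as a key/value pair
      type: virus123
    json.keys_under_root: true
output.elasticsearch:
  hosts: ["http://127.0.0.1:9200"]
  # single index key, as the answer suggests
  index: "filebeat-%{[beat.version]}-%{[fields.type]:other}-%{+yyyy.MM.dd}"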
