I am using a Metricbeat (7.3) Docker container alongside several other Docker containers, and sending the results to an Elasticsearch (7.3) instance. This works, and the first time everything spins up I get an index in Elasticsearch called metricbeat-7.3.1-2019.09.06-000001.
The initial problem is that I have a Grafana dashboard set up to look for an index with today's date, so it seems to ignore one created several days ago altogether. I could try to figure out what's wrong with those Grafana queries, but more generically I need those index names to roll at some point - the index that's there is already over 1.3GB, and at some point that will just be too big for the system.
My initial metricbeat.yml config:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
Searching around a bit, it seems like the index field on the elasticsearch output should configure the index name, so I tried the following:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
  index: "metricbeat-%{[beat.version]}-instance1-%{+yyyy.MM.dd}"
That throws an error about needing setup.template settings, so I settled on this:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
  index: "metricbeat-%{[beat.version]}-instance1-%{+yyyy.MM.dd}"

setup.template:
  overwrite: true
  name: "metricbeat"
  pattern: "metricbeat-*"
I don't really know what the setup.template section does, so most of that is a guess from Google searches.
I'm not really sure if the issue is on the metricbeat side, or on the elasticsearch side, or somewhere in-between. But bottom line - how do I get them to roll the index to a new one when the day changes?
These are the settings/steps that worked for me:
metricbeat.yml file:
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["<es-ip>:9200"]
  index: metricbeat-%{[beat.version]}
  index_pattern: -%{+yyyy.MM.dd}
  ilm.enabled: true
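For reference, on Beats 7.x the documented place for these rollover settings is the setup.ilm.* section rather than the output; a minimal sketch, with the alias and pattern values as examples to adjust rather than required names:

setup.ilm.enabled: true
setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}"
setup.ilm.pattern: "{now/d}-000001"
# Optionally point at a custom policy instead of the default one
#setup.ilm.policy_name: "metricbeat-rollover"

The actual rollover conditions (max age, max size) live in the ILM policy itself, which you can edit from Kibana or the ILM API.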
Then, over in Kibana (i.e. port 5601):
go to "Stack Monitoring" and select the "metricbeat-*" index.
Start with a setting along these lines (the original answer showed this step as a screenshot); what follows after that is self-explanatory.
Related
I've created a docker-compose file with configurations that deploy Elasticsearch, Kibana, and Elastic Agent, all version 8.7.0,
where in the Kibana configuration file I define the policies I need under xpack.fleet.agentPolicies. With a single command my whole environment comes up and all components connect successfully.
The only issue is one manual step: I have to go to Kibana -> Observability -> APM -> Add Elastic APM and then fill in the server configuration.
I want to automate this and manage it from the API/CMD/configuration file; I don't want to do it from the UI.
What is the way to do this? In which component? What path should the configuration be at?
I tried to look for APIs or commands to do that, but with no luck. I'm expecting help with automating the remaining step.
Update 1:
I've tried to add it as below, but I still can't see the integration added.
package_policies:
  - name: fleet_server-apm
    id: default-fleet-server
    package:
      name: fleet_server
    inputs:
      - type: apm
        enabled: true
        vars:
          - name: host
            value: "0.0.0.0:8200"
          - name: url
            value: "http://0.0.0.0:8200"
          - name: enable_rum
            value: true
            frozen: true
TL;DR:
Yes, I believe there is a way to do it.
But I am pretty sure this is poorly documented.
You can find some ideas in the apm-server repository.
Solution
In the kibana.yml file you can add some information related to Fleet.
The section below is taken from the repository mentioned above and helped me set up APM automatically.
But if you have some specific settings you would like to enable, I am unsure where you provide them.
xpack.fleet.packages:
  - name: fleet_server
    version: latest

xpack.fleet.agentPolicies:
  - name: Fleet Server (APM)
    id: fleet-server-apm
    is_default_fleet_server: true
    is_managed: false
    namespace: default
    package_policies:
      - name: fleet_server-apm
        id: default-fleet-server
        package:
          name: fleet_server
It is true that the Kibana Fleet API is very poorly documented at the moment. I think your problem is that you are trying to add the variables to the fleet_server package instead of the apm package. Your YAML should look like this:
package_policies:
  - name: fleet_server-apm
    id: default-fleet-server
    package:
      name: fleet_server
  - name: apm-1
    package:
      name: apm
    inputs:
      - type: apm
        keep_enabled: true
        vars:
          - name: host
            value: 0.0.0.0:8200
            frozen: true
          - name: url
            value: "http://0.0.0.0:8200"
            frozen: true
          - name: enable_rum
            value: true
            frozen: true
Source
I am deploying a VM in Azure using Ansible and using the public IP it creates in subsequent tasks. But the time taken to create the public IP is too long, so when the subsequent task executes, it fails. The time to create the IP also varies; it isn't fixed. I want to introduce some logic so the next task only runs once the IP has been created.
- name: Deploy Master Node
  azure_rm_virtualmachine:
    resource_group: myResourceGroup
    name: testvm10
    admin_username: chouseknecht
    admin_password: <your password here>
    image:
      offer: CentOS-CI
      publisher: OpenLogic
      sku: '7-CI'
      version: latest
Can someone assist me here? It would be greatly appreciated.
I think the wait_for module is a bad choice here: while it can test for port availability, it will often give you false positives because the port is open before the service is actually ready to accept connections.
Fortunately, the wait_for_connection module was designed for exactly the situation you are describing: it will wait until Ansible is able to successfully connect to your target.
This generally requires that you register your Azure VM with your Ansible inventory (e.g. using the add_host module). I don't use Azure, but if I were doing this with OpenStack I might write something like this:
- hosts: localhost
  gather_facts: false
  tasks:
    # This is the task that creates the vm, much like your existing task
    - os_server:
        name: larstest
        cloud: kaizen-lars
        image: 669142a3-fbda-4a83-bce8-e09371660e2c
        key_name: default
        flavor: m1.small
        security_groups: allow_ssh
        nics:
          - net-name: test_net0
        auto_ip: true
      register: myserver

    # Now we take the public ip from the previous task and use it
    # to create a new inventory entry for a host named "myserver".
    - add_host:
        name: myserver
        ansible_host: "{{ myserver.openstack.accessIPv4 }}"
        ansible_user: centos

# Now we wait for the host to finish booting. We need gather_facts: false here
# because otherwise Ansible will attempt to run the `setup` module on the target,
# which will fail if the host isn't ready yet.
- hosts: myserver
  gather_facts: false
  tasks:
    - wait_for_connection:
        delay: 10

# We could add additional tasks to the previous play, but we can also start a
# new play with implicit fact gathering.
- hosts: myserver
  tasks:
    - ...other tasks here...
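For the Azure task from the question, the shape of the plays would be the same. I can't verify the exact return structure of azure_rm_virtualmachine here, so the {{ master_public_ip }} expression below is a placeholder to replace with the public IP pulled from the registered result (a quick debug: var=master_vm task will show you the right key):

- hosts: localhost
  gather_facts: false
  tasks:
    - name: Deploy Master Node
      azure_rm_virtualmachine:
        resource_group: myResourceGroup
        name: testvm10
        admin_username: chouseknecht
        admin_password: <your password here>
        image:
          offer: CentOS-CI
          publisher: OpenLogic
          sku: '7-CI'
          version: latest
      register: master_vm

    - add_host:
        name: master
        # Placeholder: pull the real public IP out of the registered master_vm result.
        ansible_host: "{{ master_public_ip }}"
        ansible_user: chouseknecht

- hosts: master
  gather_facts: false
  tasks:
    - wait_for_connection:
        delay: 10
        timeout: 300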
I'm working with Elasticsearch version 7.2.0 and shipping logs using Filebeat. I can use a custom pipeline, but I'm unable to set a custom index name. Kindly help.
Below is my filebeat output configuration:
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  pipeline: reindex_timestamp
  index: "logstash-%{+yyyy.MM.dd}"

setup.template.name: "logstash"
setup.template.pattern: "logstash-*"
setup.template.enabled: true
setup.template.overwrite: true
Here I'm not sure, though, how I have to create the custom template whose name I specified.
Update: I found the solution to my requirement. The configuration below worked for me, since my requirement is to write web logs (logs whose name field contains web) to a separate index and the rest of the application logs to another index (previously everything went to the default filebeat-* index):
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  #index: "filebeat-7.2.0-logstash-%{+yyyy.MM.dd}" #Its not taking custom index
  pipeline: reindex_timestamp_logstash
  indices:
    - index: "node-%{+yyyy.MM.dd}"
      when.contains:
        name: web
  pipelines:
    - pipeline: reindex_timestamp_node
      when.contains:
        name: web

setup.template.name: "filebeat-7.2.0"
setup.template.pattern: "filebeat-7.2.0-*"
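One thing worth checking on Filebeat 7.x if the custom names are not picked up at all: when index lifecycle management is enabled (which Filebeat switches on automatically against a 7.x cluster), the index/indices settings in the output are ignored in favour of the ILM write alias, so you may also need to disable ILM, roughly:

setup.ilm.enabled: false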
I’m trying to collect logs from Kubernetes nodes using Filebeat and ONLY ship them to ELK IF the logs originate from a specific Kubernetes Namespace.
So far I’ve discovered that you can define Processors which I think accomplish this. However, no matter what I do I can not get the shipped logs to be constrained. Does this look right?
Hm, does this look correct then?
filebeat.config:
  inputs:
    path: ${path.config}/inputs.d/*.yml
    reload.enabled: true
    reload.period: 10s
    when.contains:
      kubernetes.namespace: "NAMESPACE"
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: false

processors:
  - add_kubernetes_metadata:
      namespace: "NAMESPACE"

xpack.monitoring.enabled: true

output.elasticsearch:
  hosts: ['elasticsearch:9200']
Despite this configuration I still get logs from all of the namespaces.
Filebeat is running as a DaemonSet on Kubernetes. Here is an example of an expanded log entry: https://i.imgur.com/xfTwbhl.png
You have a number of options to do it:
Filter data in Filebeat with a drop_event processor:
processors:
  - drop_event:
      when:
        contains:
          source: "field"
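For the namespace case in the question, the same processor can be inverted so that anything outside the namespace is dropped; a sketch, assuming add_kubernetes_metadata has already populated kubernetes.namespace and with NAMESPACE as a placeholder:

processors:
  - drop_event:
      when:
        not:
          equals:
            kubernetes.namespace: "NAMESPACE"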
Use an ingest pipeline in Elasticsearch:
output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: my_pipeline_id
Then drop events inside the pipeline with a drop processor:
{
  "drop": {
    "if": "ctx['field'] == null"
  }
}
Use the drop filter in Logstash:
filter {
  if ![field] {
    drop { }
  }
}
In the end, I resolved this by moving the drop processor from the main configuration file into the input configuration file.
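For reference, a rough sketch of what that can look like in one of the inputs.d files, so the filter runs per input rather than globally; the input type, paths, and matcher here are typical values for a Filebeat DaemonSet on Kubernetes and are assumptions rather than the original configuration:

- type: container
  paths:
    - /var/log/containers/*.log
  processors:
    - add_kubernetes_metadata:
        matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
    - drop_event:
        when:
          not:
            equals:
              kubernetes.namespace: "NAMESPACE"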
I'm used to seeing Ansible examples like this:
- file: path=/tmp/file state=touch
but someone at work told me that I should be consistent using only YAML syntax like this:
- file:
    path: /tmp/file
    state: touch
or,
- file: {path: /tmp/file, state: touch}
Which one satisfies Ansible best practices?
Taken from https://www.ansible.com/blog/ansible-best-practices-essentials
At its core, the Ansible playbook runner is a YAML parser with added logic such as commandline key=value pairs shorthand. While convenient when cranking out a quick playbook or a docs example, that style of formatting reduces readability. We recommend you refrain from using that shorthand (even with YAML folded style) as a best practice.
Here is an example of some tasks using the key=value shorthand:
- name: install telegraf
  yum: name=telegraf-{{ telegraf_version }} state=present update_cache=yes disable_gpg_check=yes enablerepo=telegraf
  notify: restart telegraf
- name: configure telegraf
  template: src=telegraf.conf.j2 dest=/etc/telegraf/telegraf.conf
  notify: restart telegraf

- name: start telegraf
  service: name=telegraf state=started enabled=yes
Now here are the same tasks using native YAML syntax:
- name: install telegraf
  yum:
    name: telegraf-{{ telegraf_version }}
    state: present
    update_cache: yes
    disable_gpg_check: yes
    enablerepo: telegraf
  notify: restart telegraf

- name: configure telegraf
  template:
    src: telegraf.conf.j2
    dest: /etc/telegraf/telegraf.conf
  notify: restart telegraf

- name: start telegraf
  service:
    name: telegraf
    state: started
    enabled: yes
Native YAML has more lines; however, those lines are shorter, reducing horizontal scrolling and line wrapping. It lets the eyes scan straight down the play. The task parameters are stacked and easily distinguished from the next. Native YAML syntax also has the benefit of improved syntax highlighting in virtually any modern text editor out there. Being native YAML, editors such as vim and Atom will highlight YAML keys (module names, directives, parameter names) from their values further aiding the readability of your content. Many of our own docs use this shorthand for legacy reasons though we’re progressively changing that. (Documentation pull requests accepted.)