Repeated log collection problem while using Filebeat with Elasticsearch

We recently started using Filebeat to collect our system logs into Elasticsearch via:
${local_log_file} -> filebeat -> kafka -> logstash -> elasticsearch -> kibana
While testing, we found that Filebeat repeatedly collects logs: whenever the file changes, it re-reads the file from the beginning.
Here is my Filebeat configuration:
filebeat.prospectors:
- input_type: log
  paths:
    - /home/XXX/exp/*.log
  scan_frequency: 1s
  #tail_files: true

#================================ Outputs =====================================

#----------------------------- Logstash output --------------------------------
# output.logstash:
#   hosts: ["localhost:5044"]

#----------------------------- Kafka output -----------------------------------
output.kafka:
  enabled: true
  hosts: ["10.10.1.103:9092"]
  topic: egou

#----------------------------- console output ---------------------------------
output.console:
  enabled: true
  pretty: true
Notice:
- we construct the log files manually, and we made sure there is a blank line at the end of each file
- to see the events on the console, we enabled output.console
- once content is appended to the end of a log file, Filebeat collects from the beginning of the file, but we only want it to fetch the appended changes
The Filebeat version is 5.6.X.
Any useful hints would be appreciated.

I think this is because the editor you are using creates a new file on save, with new metadata. Filebeat identifies the state of a file by its metadata (inode and device), not by its content.
Try appending instead:
echo "something" >> /path/to/file.log
ref: https://discuss.elastic.co/t/filebeat-repeatedly-sending-old-entries-in-log-file/55796
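As a quick way to see this effect, here is a sketch (file paths are illustrative) showing that a save-via-rename, which many editors do, gives the path a new inode, while appending in place preserves it:

```shell
# Simulate an editor "save": write a temp file, then rename it over the
# original. The path now points at a different inode.
printf 'line 1\n' > /tmp/demo.log
before=$(stat -c %i /tmp/demo.log)
printf 'line 1\nline 2\n' > /tmp/demo.log.tmp
mv /tmp/demo.log.tmp /tmp/demo.log
after=$(stat -c %i /tmp/demo.log)
[ "$before" != "$after" ] && echo "save-via-rename: new inode"

# Appending in place keeps the inode, so only the new bytes get read.
printf 'line 3\n' >> /tmp/demo.log
unchanged=$(stat -c %i /tmp/demo.log)
[ "$after" = "$unchanged" ] && echo "append: same inode"
```

Since Filebeat keys its registry on the inode, the renamed file looks brand new and is harvested from offset 0, while the appended file resumes from the recorded offset.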

Related

Filebeat Kubernetes cannot output to ElasticSearch

Filebeat on Kubernetes cannot output to Elasticsearch, although Elasticsearch itself is OK.
Filebeat runs as a DaemonSet, and the relevant environment variables have been added.
filebeat.yml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        enabled: false
        type: container
        paths:
          - /var/log/containers/*-${data.container.id}.log
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
Kubernetes
Use an nginx app to test:
image=nginx:latest
The Deployment annotation has been added:
co.elastic.logs/enabled: "true"
pod.yaml (in node1)
But nothing reaches Elasticsearch; no logs or indexes for the related input are seen.
filebeat pod (node1) logs
I expect Filebeat to collect the logs of the specified container (Pod) into Elasticsearch.
@baymax, first off, you don't need to explicitly define the annotation anywhere:
co.elastic.logs/enabled: "true"
since Filebeat, by default, reads all the container log files on the node.
Secondly, you are disabling hints.default_config, which ensures Filebeat will only read the log files of pods annotated as above; however, you haven't provided any template config to be used for reading such log files.
For more info, read: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html
Thirdly, in your Filebeat logs, do you see any harvesters being started, handles created, and events published? Posting a snapshot of logs doesn't give a clear picture. Maybe try starting Filebeat in debug mode for a few minutes and paste the logs here with proper formatting.
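One thing worth double-checking: hints are read from pod annotations, so in a Deployment the annotation must sit on the pod template, not on the Deployment's own metadata. A minimal sketch (names and labels are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx                  # hypothetical name
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        # Must be here (pod template), not under the Deployment's metadata,
        # or the pods never carry the hint.
        co.elastic.logs/enabled: "true"
    spec:
      containers:
        - name: nginx
          image: nginx:latest
```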

Filebeat & test inputs

I'm working on a Filebeat solution and I'm having a problem setting up my configuration. Let me explain my setup:
I have an app that produces a CSV file containing data that I want to ingest into Elasticsearch using Filebeat.
I'm using Filebeat 5.6.4 running on a Windows machine.
Provided below is my filebeat.yml configuration:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\App\fitbit-daily-activites-heart-rate-*.log
output.elasticsearch:
  hosts: ["http://esldemo.com:9200"]
  index: "fitbit-daily-activites-heartrate-%{+yyyy.MM.dd}"
setup.template:
  name: "fitbit-daily-activites-heartrate"
  pattern: "fitbit-daily-activites-heartrate-*"
  fields: "fitbit-heartrate-fields.yml"
  overwrite: false
  settings:
    index.number_of_shards: 1
    index.number_of_replicas: 0
And my data looks like this:
0,2018-12-13 00:00:02.000,66.0,$
1,2018-12-13 00:00:07.000,66.0,$
2,2018-12-13 00:00:12.000,67.0,$
3,2018-12-13 00:00:17.000,67.0,$
4,2018-12-13 00:00:27.000,67.0,$
5,2018-12-13 00:00:37.000,66.0,$
6,2018-12-13 00:00:52.000,66.0,$
I'm trying to figure out why my configuration is not picking up my data and shipping it to Elasticsearch. Please help.
There are some differences in the way you configure Filebeat in versions 5.6.X and in the 6.X branch.
For 5.6.X you need to configure your input like this:
filebeat.prospectors:
- input_type: log
  paths:
    - 'C:/App/fitbit-daily-activites-heart-rate-*.log'
You also need to put your path between single quotes and use forward slashes.
Filebeat 5.6.X configuration

Filebeat is processing all the logs instead of the specified application logs

I have an app server where I have configured Filebeat (through Chef) to extract the logs and publish them to Logstash (a separate ELK server), and subsequently to ES and Kibana.
I have configured Filebeat to process logs only from /opt/app_logs/*.log, but it seems to be reading logs from other locations too: in the /etc/filebeat configuration directory I have filebeat.full.yml and other automatically generated yml files, and they seem to contain all those other file locations. Due to the huge volume of logs, the Logstash service runs out of memory within minutes (as seen in logstash.log). How can I stop the other yml files from being auto-generated?
I tried to remove this file and also tried to comment out all the /var/log paths from the prospectors, but then Filebeat itself does not start.
filebeat.yml file:
filebeat:
  prospectors: []
  registry_file: "/var/lib/filebeat/registry"
  config_dir: "/etc/filebeat"
output:
  logstash:
    hosts:
      - elk_host:5044
    index: logstash-filebeat
shipper:
  name: serverA
  tags:
    - A
logging:
  to_files: 'true'
  files:
    path: "/var/log/filebeat"
    name: filebeat_log
    rotateeverybytes: '10485760'
  level: info
prospectors:
  - paths:
      - "/opt/app_logs/*.log"
    encoding: plain
    input_type: log
    ignore_older: 24h
The main problem with your configuration is that for Filebeat 1.2.3 you have the prospectors list defined twice, and the second one is not in the correct location.
The second problem is that you have defined config_dir as /etc/filebeat. config_dir is used to specify an additional directory to search for config files. It should never be set to /etc/filebeat because this is where the main config file is located. See https://stackoverflow.com/a/39987501/503798 for usage information.
A third problem is that you have used string types for to_files and rotateeverybytes. They should be boolean and integer types respectively.
Here's how the config should look for Filebeat 1.x.
filebeat:
  registry_file: "/var/lib/filebeat/registry"
  config_dir: "/etc/filebeat/conf.d"
  prospectors:
    - paths:
        - "/opt/app_logs/*.log"
      encoding: plain
      input_type: log
      ignore_older: 24h
output:
  logstash:
    hosts:
      - elk_host:5044
    index: logstash-filebeat
shipper:
  name: serverA
  tags:
    - A
logging:
  to_files: true
  files:
    path: "/var/log/filebeat"
    name: filebeat_log
    rotateeverybytes: 10485760
  level: info
I highly recommend that you upgrade to Filebeat 5.x because it has better configuration validation using filebeat -configtest.
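For illustration, a prospector file dropped into a separate conf.d directory would then be loaded via config_dir without clashing with the main config. The file name and log path here are hypothetical, and the exact shape of these files should be checked against the config_dir docs for your Filebeat version:

```yaml
# /etc/filebeat/conf.d/app.yml -- picked up via config_dir; only the
# prospectors section of such files is processed.
filebeat:
  prospectors:
    - paths:
        - /opt/other_logs/*.log
      input_type: log
```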

Filebeat doesn't forward Docker Compose logs, why?

I am following this tutorial to set up an ELK stack (VPS B) that will receive Docker/Docker Compose container logs (VPS A) using Filebeat as the forwarder; my diagram is shown below.
So far, I have managed to get all the interfaces working with green ticks. However, there are still some remaining issues that I am not able to understand, so I would appreciate it if someone could help me out a bit.
My main issue is that I don't get any Docker/docker-compose logs from VPSA into the Filebeat server of VPSB; nevertheless, I do get other logs from VPSA, such as rsyslog and authentication logs, on the Filebeat server of VPSB. I have configured my docker-compose file to forward the logs using rsyslog as the logging driver, and Filebeat then forwards that syslog to VPSB. At this point, I do see logs from the Docker daemon itself, such as virtual interfaces going up/down and processes starting, but not the "debug" logs of the containers themselves.
The configuration of the Filebeat client on VPSA looks like this:
root@VPSA:/etc/filebeat# cat filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        - /var/log/auth.log
        - /var/log/syslog
        # - /var/log/*.log
      input_type: log
      document_type: syslog
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: ["ipVPSB:5044"]
    bulk_max_size: 2048
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
shipper:
logging:
  files:
    rotateeverybytes: 10485760 # = 10MB
  level: debug
One of the docker-compose logging drivers looks like this:
redis:
  logging:
    driver: syslog
    options:
      syslog-facility: user
Finally, I would like to ask whether it is possible to forward the logs natively from docker-compose to the Filebeat client on VPSA (the red arrow in the diagram), so that it can forward them to VPSB.
Thank you very much,
Regards!
The issue seemed to be on the Filebeat side of VPSA: since it collects data from syslog, syslog has to be running before Filebeat starts!
Updating rc.d made it work
sudo update-rc.d filebeat defaults 95 10
My filebeat.yml, if someone needs it:
root@VPSA:/etc/filebeat# cat filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        # - /var/log/auth.log
        - /var/log/syslog
        # - /var/log/*.log
      input_type: log
      ignore_older: 24h
      scan_frequency: 10s
      document_type: syslog
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: ["ipVPSB:5044"]
    bulk_max_size: 2048
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
shipper:
logging:
  level: debug
  to_files: true
  to_syslog: false
  files:
    path: /var/log/mybeat
    name: mybeat.log
    keepfiles: 7
    rotateeverybytes: 10485760 # = 10MB
Regards

Filebeat doesn't forward data to logstash

I have a setup using Elasticsearch, Kibana, and Logstash on one VM and Filebeat on the slave machine. I managed to send syslog messages and logs from the auth.log file following the tutorial here. In the Filebeat log I saw that those messages are published, but when I try to send a JSON file I don't see any publish event (I just see "Flushing spooler because of timeout. Events flushed: 0").
My filebeat.yml file is:
filebeat:
  prospectors:
    -
      paths:
        # - /var/log/auth.log
        # - /var/log/syslog
        # - /var/log/*.log
        - /home/slave/data_2/*
      input_type: log
      document_type: log
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: ["192.168.132.207:5044"]
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
shipper:
logging:
  level: debug
  to_files: true
  to_syslog: false
  files:
    path: /var/log/mybeat
    name: mybeat.log
    keepfiles: 7
    rotateeverybytes: 10485760 # = 10MB
PLEASE NOTE that tabs are not allowed in your filebeat.yml! I used Notepad++ and View > Show Symbol > Show White Space and TAB. Sure enough, there was a TAB character on a blank line and Filebeat wouldn't start. Use filebeat -c filebeat.yml -configtest and it will give you more information.
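A quick way to hunt down stray tabs from the command line, without opening an editor (the file path here is illustrative):

```shell
# Write a deliberately broken YAML file containing a TAB, then locate it.
printf 'filebeat:\n\tprospectors:\n' > /tmp/bad.yml

# grep with a literal tab as the pattern prints the offending line number.
grep -n "$(printf '\t')" /tmp/bad.yml && echo "tab found"
```

Any line printed by grep contains a tab and will make the YAML parser reject the file, since YAML only permits spaces for indentation.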
Go into your Logstash input config for Filebeat and comment out the tls section!
Also check your log file permissions. If everything is owned by root with no group or world read access, Filebeat won't be able to read the files.
Set your files' group to adm:
sudo chgrp adm /var/log/*.log
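As a sketch of why the group matters (file path illustrative): with mode 600 only the owner can read the file, so a non-root Filebeat gets a permission error; widening group access, paired with chgrp adm on a real system, restores read access:

```shell
# A log readable only by its owner -- a non-root reader is locked out.
touch /tmp/app.log
chmod 600 /tmp/app.log
stat -c '%a' /tmp/app.log    # 600

# Grant the file's group read access (combine with chgrp adm in practice).
chmod 640 /tmp/app.log
stat -c '%a' /tmp/app.log    # 640
```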
