I am using Filebeat to send data to Elasticsearch with the following configuration:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/kibana_access.log
  document_type: nginx
- input_type: log
  paths:
    - /var/log/redis/redis-server.log
  document_type: redis
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  index: '%{[type]}-log'
versions.2x.enabled: false
The configuration is correct and it is writing to Elasticsearch perfectly. The issue is that it also sends the old lines to Elasticsearch again, which it should not do.
No new logs are being written, but in Kibana I can see the same log count as before whenever Filebeat sends the data again.
I checked the registry file, /var/lib/filebeat/registry, and it had entries for files which I had used earlier but am not using now.
{"source":"/var/log/filebeat/filebeat","offset":2514,"FileStateOS":{"inode":4591858,"device":2058},"timestamp":"2017-04-21T17:33:11.913352399+05:30","ttl":-2},{"source":"/var/log/postgresql/postgresql-2017-04-21_120121.log","offset":4485506,"FileStateOS":{"inode":3932558,"device":2058},"timestamp":"2017-04-21T18:11:56.65579033+05:30","ttl":-2}
This is the registry file.
I have set up a cron job which restarts Filebeat every minute so that it sends data to Elasticsearch. I am using Ubuntu 16.04 and installed Filebeat as a deb package.
This is the registry file path in filebeat.full.yml --> ${path.data}/registry.
Please explain this behaviour and how to solve it.
I just deleted this folder,
rm -rf /var/lib/filebeat/
and the issue was solved.
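For reference, a less drastic alternative (a sketch only, using Filebeat 5.x prospector options; the values below are illustrative and not from the original config) is to let Filebeat drop stale registry entries itself instead of deleting the whole data directory:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/kibana_access.log
  document_type: nginx
  # stop tracking files that have been removed from disk
  clean_removed: true
  # skip files not modified for 24h and forget their registry state after 48h
  ignore_older: 24h
  clean_inactive: 48h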
I have all three ELK components configured, up and running on my local Windows machine.
I have a log file available on a remote CentOS 7 machine which I want to ship from there to my local Windows machine with the help of Filebeat. How can I achieve that?
I have installed Filebeat on CentOS with the help of rpm.
In the configuration file I have made the following changes:
commented out output.elasticsearch
and uncommented output.logstash (which IP of my Windows machine shall I give here? How do I get that IP?)
and
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - path to my log file
The flow of the data is:
Agent => Logstash => Elasticsearch
The agent can be any member of the Beats family; you are using Filebeat, which is fine.
You have to configure all of these pieces:
on Filebeat you configure filebeat.yml
on Logstash you configure logstash.conf
on Elasticsearch you configure elasticsearch.yml
I personally will start with logstash.conf
input
{
  beats
  {
    port => 5044
  }
}
filter
{
  # I assume you just want the log to run into elasticsearch
}
output
{
  elasticsearch
  {
    hosts => "(YOUR_ELASTICSEARCH_IP):9200"
    index => "insertindexname"
  }
  stdout
  {
    codec => rubydebug
  }
}
This is a very minimal working configuration. It means Logstash will listen for input from Filebeat on port 5044. The filter section is needed when you want to parse the data. The output section is where you want to store the data; we are using the elasticsearch output plugin since you want to store it in Elasticsearch. The stdout output is super helpful for debugging your configuration file; add it and you will regret nothing, as it prints every event that is sent to Elasticsearch.
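If it helps, you can sanity-check this pipeline before wiring Filebeat in; the flags below are standard Logstash options (the paths assume you run them from the Logstash install directory):
# check the configuration file for syntax errors and exit
bin/logstash -f logstash.conf --config.test_and_exit
# run the pipeline and reload the config automatically when it changes
bin/logstash -f logstash.conf --config.reload.automatic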
filebeat.yml
filebeat.inputs:
- type: log
  paths:
    - /directory/of/your/file/file.log
output.logstash:
  hosts: ["YOUR_LOGSTASH_IP:5044"]
This is a very minimal working filebeat.yml; paths is where you tell Filebeat which files to harvest.
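The answer also lists elasticsearch.yml among the files to configure; for a local test the defaults are usually enough, but a minimal sketch (the cluster and node names below are placeholders, not taken from the question) would be:
# elasticsearch.yml
cluster.name: my-elk-cluster
node.name: node-1
# the default network.host (localhost) is fine here, since Logstash runs on the same machine
http.port: 9200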
When you are done configuring the files, start Elasticsearch, then Logstash, then Filebeat, as sketched below.
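In your case Elasticsearch and Logstash live on the Windows machine (start them with bin\elasticsearch.bat and bin\logstash.bat, or as Windows services), and only Filebeat runs on the CentOS box; for an rpm install that is roughly:
# on the CentOS machine, assuming the rpm package was installed
sudo systemctl enable filebeat
sudo systemctl start filebeat
sudo systemctl status filebeat   # confirm it is running and check for errors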
Let me know if you run into any difficulties.
I am trying to deploy the ELK stack on the OpenShift platform (OKD v3.11) and am using Filebeat to automatically detect the logs.
The Kibana dashboard is up and the Elasticsearch and Logstash APIs are working fine, but Filebeat is not sending data to Logstash: I do not see any data arriving on the Logstash listener on port 5044.
I found on the Elastic forums that the following iptables command should resolve my issue, but no luck:
iptables -A OUTPUT -t mangle -p tcp --dport 5044 -j MARK --set-mark 10
Still nothing is arriving on the Logstash listener. Please help me if I am missing anything, and let me know if you need any more information.
NOTE:
The filebeat.yml, logstash.yml & logstash.conf files work perfectly when deployed on plain Kubernetes.
The steps I followed to debug this issue were:
Check if Kibana is coming up,
Check if the Elasticsearch APIs are working,
Check if Logstash is reachable from Filebeat (see the commands sketched after this list).
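A rough sketch of how those checks can be run from the Filebeat host or pod (hostnames, ports and the config path are placeholders; filebeat test output is available in Filebeat 6.x and later):
# is Kibana up?
curl -s http://<kibana-host>:5601/api/status
# are the Elasticsearch APIs responding?
curl -s http://<elasticsearch-host>:9200/_cluster/health?pretty
# can Filebeat reach its configured output (Logstash on 5044)?
filebeat test output -c /etc/filebeat/filebeat.yml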
Everything was working fine in my case. I raised the log level in filebeat.yml and found a "Permission Denied" error when Filebeat tried to access the Docker container logs under the "/var/lib/docker/containers//" folder.
I fixed the issue by setting SELinux to permissive mode by running the following command:
sudo setenforce Permissive
After this ELK started to sync the logs.
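Note that setenforce only changes the mode until the next reboot; to make it persistent you would set the mode in /etc/selinux/config (a better long-term fix is an SELinux policy that allows the access, but that is beyond this answer):
# /etc/selinux/config -- survives reboots
SELINUX=permissive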
I have been assigned the task of creating a central logging server. In my case there are many web app servers spread across different locations. My task is to get the logs from these different servers and manage them on a central server running Elasticsearch and Kibana.
Questions
Is it possible to get logs from servers that have different public IPs? If so, how?
How much resource (CPU, memory, storage) is required on the central server?
Things seen
I have seen example setups where all the logs and the applications are on the same machine.
I am looking for a way to send logs over a public IP to Elasticsearch.
I would like to differ from Ishara's answer. You can ship logs directly from Filebeat to Elasticsearch without using Logstash if your logs are of generic types (system logs, nginx logs, apache logs). Using this approach you don't need to incur the extra cost and maintenance of Logstash, as Filebeat provides built-in parsing through its modules.
If you have a Debian-based OS on your server, I have prepared a shell script to install and configure Filebeat. You need to change the Elasticsearch server URL and modify the second-to-last line based on the modules that you want to configure.
Regarding your first question: yes, you can run a Filebeat agent on each server and send data to a centralized Elasticsearch.
For your second question, it depends on the amount of logs the Elasticsearch server is going to process and store. It also depends on where Kibana is hosted.
sudo wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update && sudo apt-get install -y filebeat
sudo systemctl enable filebeat
sudo bash -c "cat >/etc/filebeat/filebeat.yml" <<'FBEOL'
filebeat.inputs:
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.name: "filebeat-system"
setup.template.pattern: "filebeat-system-*"
setup.template.settings:
  index.number_of_shards: 1
setup.ilm.enabled: false
setup.kibana:
output.elasticsearch:
  hosts: ["10.32.66.55:9200", "10.32.67.152:9200", "10.32.66.243:9200"]
  indices:
    - index: "filebeat-system-%{+yyyy.MM.dd}"
      when.equals:
        event.module: system
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
logging.level: warning
FBEOL
sudo filebeat modules enable system
sudo systemctl restart filebeat
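To verify that data is actually arriving, you can, for example, list the matching indices on one of the Elasticsearch nodes; this is a standard cat API call, not part of the original script:
curl 'http://10.32.66.55:9200/_cat/indices/filebeat-system-*?v'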
Yes, it is possible to get logs from servers that have different public IPs. You need to set up an agent like Filebeat (provided by Elastic) on each server which produces logs.
You need to set up a Filebeat instance on each machine.
It will watch the log files on each machine and forward them to the Logstash instance you specify in the filebeat.yml configuration file, like below:
#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /path_to_your_log_1/ELK/your_log1.log
    - /path_to_your_log_2/ELK/your_log2.log
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["private_ip_of_logstash_server:5044"]
The Logstash server listens on port 5044 and streams all logs according to its configuration files:
input {
  beats { port => 5044 }
}
filter {
  # your log filtering logic is here
}
output {
  elasticsearch {
    hosts => [ "elasticsearch_server_private_ip:9200" ]
    index => "your_index_name"
  }
}
In Logstash you can filter and split your logs into fields and send them to Elasticsearch, for example as sketched below.
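As an illustration, a sketch of a filter that parses nginx/apache-style access logs (the COMBINEDAPACHELOG grok pattern is a standard pattern shipped with Logstash, chosen here as an example rather than taken from the question):
filter {
  grok {
    # split each access-log line into fields such as clientip, verb, request, response
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    # use the timestamp from the log line instead of the ingestion time
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}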
Resources depend on how much data you produce, your data retention plan, TPS and your custom requirements. If you can provide some more details, I would be able to give a rough idea of the resource requirements.
We are setting up Elasticsearch, Kibana, Logstash and Filebeat on a server to analyse log files from many applications. Due to reasons* each application's log file ends up in a separate directory on the ELK server. We have about 20 log files.
As I understand it, we can run a Logstash pipeline config file for each application log file. That would be one Logstash instance running with 20 pipelines in parallel, and each pipeline would need its own Beats port. Please confirm that this is correct?
Can we have one filebeat instance running or do we need one for each
pipeline/logfile?
Is this architecture OK, or do you see any major downsides?
Thank you!
*There are different vendors responsible for different applications, and they run across many different operating systems; many of them will not or cannot install anything like Filebeat.
We do not recommend reading log files from network volumes. Whenever
possible, install Filebeat on the host machine and send the log files
directly from there. Reading files from network volumes (especially on
Windows) can have unexpected side effects. For example, changed file
identifiers may result in Filebeat reading a log file from scratch
again.
Reference
We always recommend installing Filebeat on the remote servers. Using
shared folders is not supported. The typical setup is that you have a
Logstash + Elasticsearch + Kibana in a central place (one or multiple
servers) and Filebeat installed on the remote machines from where you
are collecting data.
Reference
With one Filebeat instance running, you can apply different configuration settings to different files by defining multiple input sections, as in the example below (check the Filebeat documentation for more):
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - 'C:\App01_Logs\log.txt'
  tags: ["App01"]
  fields:
    app_name: App01
- type: log
  enabled: true
  paths:
    - 'C:\App02_Logs\log.txt'
  tags: ["App02"]
  fields:
    app_name: App02
- type: log
  enabled: true
  paths:
    - 'C:\App03_Logs\log.txt'
  tags: ["App03"]
  fields:
    app_name: App03
And you can have one Logstash pipeline with an if statement in the filter:
filter {
  if [fields][app_name] == "App01" {
    grok { }
  } else if [fields][app_name] == "App02" {
    grok { }
  } else {
    grok { }
  }
}
The condition can also be if "App02" in [tags] or if [source] == "C:\App01_Logs\log.txt", since we send those from Filebeat (see the sketch below).
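A minimal sketch of the tags-based variant (the grok bodies are left empty as placeholders, just as in the example above):
filter {
  if "App01" in [tags] {
    grok { }
  } else if "App02" in [tags] {
    grok { }
  }
}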
I am not getting how to run Filebeat so that it sends output to Elasticsearch.
This is from the filebeat.yml file:
- input_type: log
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/nginx/access.log
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  index: 'filebeat_nginx'
Elasticsearch is up and running.
Now, how do I run Filebeat so that it sends the log info to Elasticsearch?
If I go to the bin directory of Filebeat and run this command,
luvpreet#DHARI-Inspiron-3542:/usr/share/filebeat/bin$ sudo ./filebeat -configtest -e
then it shows ,
filebeat2017/04/19 06:54:22.450440 beat.go:339: CRIT Exiting: error loading config file: stat filebeat.yml: no such file or directory
Exiting: error loading config file: stat filebeat.yml: no such file or directory
The filebeat.yml file is in the /etc/filebeat folder. How do I run it?
Please clarify the process of running this with Elasticsearch.
A typical filebeat command looks like this:
/usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml \
-path.home /usr/share/filebeat -path.config /etc/filebeat \
-path.data /var/lib/filebeat -path.logs /var/log/filebeat
-c indicates your config file, as noted in the comments above. path.home is where the Filebeat binaries and scripts live. path.config contains the config files. path.data is where state (such as the registry) is kept. path.logs is where the Filebeat process writes its own logs.
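Applied to your case (Filebeat 5.x installed from the deb package), the configuration test you tried would be run with the config file passed explicitly, which avoids the "no such file or directory" error:
sudo /usr/share/filebeat/bin/filebeat -configtest -e -c /etc/filebeat/filebeat.yml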
If you have made the necessary arrangements in the /etc/filebeat/filebeat.yml file, you can use the command "service filebeat start". After the service has started you can check it with "service filebeat status"; if there is an error, you will see it there.
1. If you have installed the rpm package, you will have the /etc/filebeat/filebeat.yml file. Edit the file to send the output to Elasticsearch and start Filebeat using the command "/etc/init.d/filebeat start".
2. If you have downloaded the binary and installed it, you can use the command "Downloads/filebeat-5.4.0-darwin-x86_64/filebeat -e -c location_to_your_filebeat.yml".