How to configure FileBeat and Logstash to add XML Files in Elasticsearch? - elasticsearch

I'm a beginner here. My own problem is to configure FileBeat and Logstash to add XML Files in Elasticsearch on CentOS 7.
I have already install the last version of filebeat, logstash, elasticsearch and Kibana, with the plug-in "elasticsearch-head" in standalone to see inside elasticsearch. And to test my installation, i have successfully add simple log file from CentOS system (/var/log/messages), and see it inside elasticsearch-head plug-in (6 index and 26 shards):
This is a viex of my elasticsearch-head plug-in
And now, next step is to add log from XML file. After reading the documentation, i have configure filebeat and logstash. All services are running, and i try the command "touch /mes/AddOf.xml" to try to active an filebeat event, and forward log to logstash (AddOf.xml is my log file).
My XML data structure is like this for one log event :
<log4j:event logger="ServiceLogger" timestamp="1494973209812" level="INFO" thread="QueueWorker_1_38a0fec5-7c7f-46f5-a87a-9134fff1b493">
<log4j:message>Traitement du fichier \\ifs-app-01\Interfaces_MES\AddOf\ITF_MES_01_01_d2bef200-3a85-11e7-1ab5-9a50967946c3.xml</log4j:message>
<log4j:properties>
<log4j:data name="log4net:HostName" value="MES-01" />
<log4j:data name="log4jmachinename" value="MES-01" />
<log4j:data name="log4net:Identity" value="" />
<log4j:data name="log4net:UserName" value="SOFRADIR\svc_mes_sf" />
<log4j:data name="LogName" value="UpdateOperationOf" />
<log4j:data name="log4japp" value="MES_SynchroService.exe" />
</log4j:properties>
<log4j:locationInfo class="MES_SynchroService.Core.FileManager" method="TraiteFichier" file="C:\src\MES_PROD\MES_SynchroService\Core\FileManager.cs" line="47" />
</log4j:event>
My filebeat configuration like this (/etc/filebeat/filebeat.yml):
filebeat.prospectors:
# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.
- input_type: log
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /mes/*.xml
document_type: message
### Multiline options
# Mutiline can be used for log messages spanning multiple lines. This is common
# for Java Stack Traces or C-Line Continuation
# The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
multiline.pattern: ^<log4j:event
# Defines if the pattern set under pattern should be negated or not. Default is false.
multiline.negate: true
# Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
# that was (not) matched before or after or as long as a pattern is not matched based on negate.
# Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
multiline.match: after
#================================ Outputs =====================================
# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
My input logstash configuration (/etc/logstash/conf.d/01-beats-input.conf) :
input {
beats {
port => 5044
ssl => true
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
}
}
My filter logstash configuration (/etc/logstash/conf.d/01-beats-filter.conf) :
filter
{
xml
{
source => "message"
xpath =>
[
"/log4j:event/log4j:message/text()", "messageMES"
]
store_xml => true
target => "doc"
}
}
My output logstash configuration (/etc/logstash/conf.d/01-beats-output.conf) :
output {
elasticsearch {
hosts => ["localhost:9200"]
sniffing => true
manage_template => false
index => "mes_log"
document_type => "%{[#metadata][type]}"
}
}
But when i try the command "touch /mes/AddOf.xml", or add manually an event log in AddOf.xml, i don't see a new index with events log from XML file in elasticsearch.
I have see documentation for XML plug-in for logstash (here), but i don't now if i need to install something ? Or maybe I'm not doing the right thing for filebeat to send the logs to logstash ?
I'm very involved and motivated to learn about ELK stack. Thank you in advance for your expertise and help. I would be grateful ! :)

In your filebeat config, the regex for multiline.pattern probably should be in single-quotes:
multiline.pattern: '^<log4j:event'

Related

Shipping logs to my local windows ELK/Logstash file from remote centos7 using filebeats

I have ELK all this three components configured on my local windows machine up and running.
I have a logfile on a remote centos7 machine available which I want to ship from there to my local windows with the help of Filebeat. How I can achieve that?
I have installed filebeat on centos with the help of rpm.
in configuration file I have made following changes
commented output.elasticsearch
and uncommented output.logstash (which Ip of my windows machine shall I give overe here? How to get that ip)
AND
**filebeat.inputs:
type: log
enabled: true
paths:
path to my log file**
The flow of the data is:
Agent => logstash > elasticsearch
Agent could be beat family, and you are using filebeat that is okay.
You have to configure all these stuff.
on filebeat you have to configure filebeat.yml
on logstash you have to configure logstash.conf
on elasticsearch you have to configure elasticsearch.yml
I personally will start with logstash.conf
input
{
beats
{
port =>5044
}
}
filter
{
#I assume you just want the log to run into elasticsearch
}
output
{
elasticsearch
{
hosts => "(YOUR_ELASTICSEARCH_IP):9200"
index=>"insertindexname"
}
stdout
{
codec => rubydebug
}
}
this is a very minimal working configuration. this means, Logstash will listen for input from filebeat in port 5044. Filter is needed when you want parse the data. Output is where you want to store the data. We are using elasticsearch output plugin since you want to store it to elasticsearch. stdout is super helpful for debugging your configuration file. add this you will regret nothing. This will print all the messages that sent to elasticsearch
filebeat.yml
filebeat.inputs:
- type: log
paths:
- /directory/of/your/file/file.log
output.logstash:
hosts: ["YOUR_LOGSTASH_IP:5044"]
this is a very minimal working filebeat.yml. paths is where you want logstash to harvest the file.
When you done configuring the file, start elasticsearch then logstash then filebeat.
Let me know any difficulties

Logstash error : Failed to publish events caused by: write tcp YY.YY.YY.YY:40912->XX.XX.XX.XX:5044: write: connection reset by peer

I am using filebeat to push my logs to elasticsearch using logstash and the set up was working fine for me before. I am getting Failed to publish events error now.
filebeat | 2020-06-20T06:26:03.832969730Z 2020-06-20T06:26:03.832Z INFO log/harvester.go:254 Harvester started for file: /logs/app-service.log
filebeat | 2020-06-20T06:26:04.837664519Z 2020-06-20T06:26:04.837Z ERROR logstash/async.go:256 Failed to publish events caused by: write tcp YY.YY.YY.YY:40912->XX.XX.XX.XX:5044: write: connection reset by peer
filebeat | 2020-06-20T06:26:05.970506599Z 2020-06-20T06:26:05.970Z ERROR pipeline/output.go:121 Failed to publish events: write tcp YY.YY.YY.YY:40912->XX.XX.XX.XX:5044: write: connection reset by peer
filebeat | 2020-06-20T06:26:05.970749223Z 2020-06-20T06:26:05.970Z INFO pipeline/output.go:95 Connecting to backoff(async(tcp://xx.com:5044))
filebeat | 2020-06-20T06:26:05.972790871Z 2020-06-20T06:26:05.972Z INFO pipeline/output.go:105 Connection to backoff(async(tcp://xx.com:5044)) established
Logstash pipeline
02-beats-input.conf
input {
beats {
port => 5044
}
}
10-syslog-filter.conf
filter {
json {
source => "message"
}
}
30-elasticsearch-output.conf
output {
elasticsearch {
hosts => ["localhost:9200"]
manage_template => false
index => "index-%{+YYYY.MM.dd}"
}
}
Filebeat configuration
Sharing my filebeat config at /usr/share/filebeat/filebeat.yml
filebeat.inputs:
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /logs/*
#============================= Filebeat modules ===============================
filebeat.config.modules:
# Glob pattern for configuration loading
path: ${path.config}/modules.d/*.yml
# Set to true to enable config reloading
reload.enabled: false
# Period on which files under path should be checked for changes
#reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
index.number_of_shards: 3
#index.codec: best_compression
#_source.enabled: false
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
#host: "localhost:5601"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["xx.com:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
- add_host_metadata: ~
- add_cloud_metadata: ~
When I do telnet xx.xx 5044, this is the what I see in terminal
Trying X.X.X.X...
Connected to xx.xx.
Escape character is '^]'
I had the same problem. Here some steps, which could help you to find the core of your problem.
Firstly I tested such way: filebeat (localhost) -> logstash (localhost) -> elastic -> kibana. Each service is on the same machine.
My /etc/logstash/conf.d/config.conf:
input {
beats {
port => 5044
ssl => false
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
}
}
Here, I specially disabled ssl (in my case it was a main reason of the issue, even when certificates were correct, magic).
After that don't forget to restart logstash and test with sudo filebeat -e command.
If everything is ok, you wouldn't see 'connection reset by peer' error
I had the same problem. Starting filebeat as a sudo user worked for me.
sudo ./filebeat -e
I have made some changes to input plugin config, as specifying ssl => false but did not worked without starting filebeat as a sudo privileged user or as root.
In order to start filebeat as a sudo user, filebeat.yml file must be owned by root. Change whole filebeat folder permission to a sudo privileged user by using sudo chown -R sime_sudo_user:some_group filebeat-7.15.0-linux-x86_64/ and then chown root filebeat.yml will change the permission of file.

Unable to connect Filebeat to logstash for logging using ELK

Hi I've been working on a automated logging using elastic stack. I have filebeat that is reading logs from the path and output is set to logstash over the port 5044. The logstash config has an input listening to 5044 and output pushing to localhost:9200. The issue is I can't get it to work, I have no idea what's happening. Below are the files:
My filebeat.yml path: etc/filebeat/filebeat.yml
#=========================== Filebeat prospectors =============================
filebeat.prospectors:
# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.
- input_type: log
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /mnt/vol1/autosuggest/logs/*.log
#- c:\programdata\elasticsearch\logs\*
<other commented stuff>
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["10.10.XX.XX:5044"]
# Optional SSL. By default is off.
<other commented stuff>
My logstash.yml path: etc/logstash/logstash.yml
<other commented stuff>
path.data: /var/lib/logstash
<other commented stuff>
path.config: /etc/logstash/conf.d
<other commented stuff>
# ------------ Metrics Settings --------------
#
# Bind address for the metrics REST endpoint
#
http.host: "10.10.XX.XX"
#
# Bind port for the metrics REST endpoint, this option also accept a range
# (9600-9700) and logstash will pick up the first available ports.
#
# http.port: 9600-9700
<other commented stuff>
path.logs: /var/log/logstash
<other commented stuff>
My logpipeline30aug.config file path: /usr/share/logstash
input {
beats {
port => 5044
}
}
filter {
grok {
match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:var0}%{SPACE}%{NOTSPACE}%{SPACE}(?<searchinfo>[^#]*)#(?<username>[^#]*)#(?<searchQuery>[^#]*)#(?<latitude>[^#]*)#(?<longitude>[^#]*)#(?<client_ip>[^#]*)#(?<responseTime>[^#]*)" }
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "logstash30aug2017"
document_type => "log"
}
}
Please Note: Elasticsearch, logstash, filebeat are all installed on the same machine with ip: 10.10.XX.XX and I've checked the firewall, it's not the issue for sure.
I checked that logstash, filebeat services are all running. Filebeat is able to push the data to elasticsearch when configured so and logstash is able to push the data to elasticsearch when configured so.
Maybe it's how I am executing the process is the issue..
I do a bin/logstash -f logpipeline30aug.config in /usr/share/logstash to start it and then I do a /etc/init.d/filebeat start from the root directory.
Please Note: Formatting may be effected due to stackoverflow formatting issue
Can someone please help? I've been trying everything since 3 days now, I've gone through the documentations as well
Your filebeat.yml looks invalid.
The output section lacks an indentation:
output.logstash:
hosts: ["10.10.XX.XX:5044"]
In general, check the correctness of the config files to ensure they're ok.
For instance, for filebeat, you can run:
filebeat -c /etc/filebeat/filebeat.yml -configtest
If you have any errors it explains what is that error so you can fix it.
You can use a similar approach for other ELK services as well

Metricbeat to Logstash on a remote server not working

I am new to elastic stack . I am trying to send metrics from one pc(ubuntu) to another remote server using Metricbeat and logstash. Below are my configuration files,
metricbeat.yml
metricbeat.modules:
#------------------------------- System Module -------------------------------
- module: system
metricsets:
# CPU stats
- cpu
# System Load stats
- load
# Per CPU core stats
#- core
# IO stats
#- diskio
# Per filesystem stats
- filesystem
# File system summary stats
- fsstat
# Memory stats
- memory
# Network stats
- network
# Per process stats
- process
# Sockets (linux only)
#- socket
enabled: true
period: 10s
processes: ['.*']
cpu_ticks : false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#================================ Outputs =====================================
# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.
#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
# Array of hosts to connect to.
#hosts: ["localhost:9200"]
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
#----------------------------- Logstash output --------------------------------
#output.logstash:
# The Logstash hosts
hosts: ["127.0.0.1:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: critical, error, warning, info, debug
logging.level: debug
logstash.conf
input {
beats {
port => 5044
ssl => true
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
sniffing => true
manage_template => false
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[#metadata][type]}"
}
}
I am running metricbeat service but can’t get the index in Kibana (localhost:5601). What is the problem, I can't figure it out ?
Thank you.
It looks like you are trying to send data to your logstash using the elasticsearch output of metricbeat.
In fact, in your metricbeat.yml you have this line uncommented:
output.elasticsearch:
and this one commented:
#output.logstash:
If you toggle the comments in both lines it could work.

error INFO No non-zero metrics in the last 30s message in filebeat

I 'm newbie in ELK and and I'm getting issues while running logstash. I ran logstash as define in structure step by step as I do for file beat but
But when run filebeat and logstash, Its show logstash successfully runs at port 9600. In filebeat it gives like this
INFO No non-zero metrics in the last 30s
Logstash is not getting input from file beat. Please help.
My problem is as the same as this article and did what it said but noting change .
the filebeat.yml is :
filebeat.prospectors:
- input_type: log
paths:
- /usr/share/tomcat/log_app/news/*.log
output.logstash:
hosts: ["10.0.20.163:5000"]
and I ran this command sudo ./filebeat -e -c filebeat.yml -d "publish"
the logstash config file is :
input {
beats {
port => "5000"
}
}
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}"}
}
geoip {
source => "clientip"
}
}
output {
elasticsearch {
hosts => [ "localhost:9200" ]
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[#metadata][type]}"
}
}
then ran the commands
1)bin/logstash -f first-pipeline.conf --config.test_and_exit - this gave Ok
2)bin/logstash -f first-pipeline.conf --config.reload.automatic -This started the logstash on port 9600
I couldn't proceeds after this since filebeat gives the INFO
INFO No non-zero metrics in the last 30s
and I use
elastic search : 5.5.1
kibana : 5.5.1
logstash : 5.5.1
file beat : 5.5.1
If you want to resend your data, you can try to delete filebeat's registry file, and when you restart filebeat, it will send the data again.
File location depends on your platform. See https://www.elastic.co/guide/en/beats/filebeat/5.3/migration-registry-file.html
Registry file location can also be defined in your filebeat.yml:
filebeat.registry_file: registry
https://www.elastic.co/guide/en/beats/filebeat/current/configuration-global-options.html
Everytime you stop the filebeat. It will start reading the data from the tail of file. And because the sample file which you are using are not getting frequent data. It's not able to fetch and send it to elastic search.
Edit your log file. Add few more redundant data and then try it. It should work.
This error which you have mentioned is because FIlebeat is not able to get any updated data in that file.

Resources