How to import Varnishlog file into Elasticsearch using Filebeat? - elasticsearch

I would like to import a "varnishcsa.log" file into Elasticsearch so that I can eventually explore and visualize the data with Kibana.
I am beginning to use the Elastic Stack: Elasticsearch > Filebeat > Kibana.
There is no built-in "Varnish" Filebeat module.
I did some tests (Filebeat, Logstash) following some instructions, but without success.
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-varnishlog.html
https://www.elastic.co/guide/en/logstash/current/filebeat-modules.html
I am looking for a way to import a Varnish log file, like the following (working) example for a "mysql-slow-queries.log".
# run Elasticsearch and Kibana
./elasticsearch/bin/elasticsearch
./kibana/bin/kibana
# Import MySQL Slow Queries logs using Filebeat
./filebeat/filebeat -e --modules=mysql --setup -M \
"mysql.slowlog.var.paths=[/path/to/mysql-slow-queries.log]"
Can you help me?
I am a beginner, so do not hesitate to give examples... :).
Thank you.

You don't need a default module to send your logs with Filebeat. See an example config here and use it to configure filebeat to send your varnish logs to Elasticsearch.
- type: log
  paths:
    - /var/log/varnish/varnishncsa.log
  fields:
    log_type: varnish
  fields_under_root: true
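With an input like this, each log line reaches Elasticsearch as one plain message string. To give an idea of which fields such a line carries, here is a minimal Python sketch that splits one line. It assumes the file was written by varnishncsa in its default NCSA combined format; the sample line and the name parse_ncsa_line are made up for illustration. In the Elastic Stack itself you would more likely do this parsing in an ingest pipeline or a Logstash grok filter.

```python
import re

# Assumed layout: varnishncsa's default NCSA combined format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
NCSA_RE = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_ncsa_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = NCSA_RE.match(line)
    return m.groupdict() if m else None


# Hypothetical sample line, not taken from a real log
sample = ('192.0.2.10 - - [24/Feb/2021:04:41:56 +0000] '
          '"GET http://example.com/index.html HTTP/1.1" 200 1234 '
          '"-" "curl/7.68.0"')
fields = parse_ncsa_line(sample)
print(fields["status"], fields["request"])
```

If your varnishncsa was started with a custom -F format, the regular expression would need to be adapted accordingly.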

Related

Elastic ELK stack 8.5 integration with Spring Boot Application using Filebeat

Setting up a pipeline of Elasticsearch, Kibana, and Logstash locally and using Filebeat to push logs from a Spring Boot application to the pipeline. You will find the official documentation well-defined, but I created this question to answer a few points that were not clear. I answered for a single Spring Boot app scenario; thanks to the people who are adding their scenarios as well.
I spent a few days configuring the ELK stack with my Spring Boot application. I won't describe the step-by-step integration here; for that, you can refer to the official documentation. This is more focused on what I didn't find in the documentation steps.
Env: This is focused on setting up version 8.5.3 on macOS.
For Elasticsearch and Kibana I didn't have any trouble following the official documentation word by word.
Elasticsearch: https://www.elastic.co/downloads/elasticsearch
Kibana:https://www.elastic.co/downloads/kibana
Tip: to check that Elasticsearch is running:
curl --cacert config/certs/http_ca.crt -u elastic https://localhost:9200
Enter password when prompted
Enter host password for user 'elastic':
In my project, I needed to extract only a specific log line and process it. You can use the official download links below to download and extract Logstash and Filebeat. Then you can apply the configs mentioned below before you run them.
Logstash: https://www.elastic.co/downloads/logstash
Filebeat: https://www.elastic.co/downloads/beats/filebeat
Filebeat :
First, you need to make permission changes to your filebeat.yml file. Navigate to your filebeat extracted folder and you can use the following config if needed.
filebeat.inputs:
- type: filestream
  id: filebeat-id-name
  enabled: true
  # Path to your log file
  paths:
    - /Users/leons/IdeaProjects/SpringELKDemo/myapplogs.log
  # I wanted to only process the logs from MainController
  include_lines: ['MainController']
output.logstash:
  hosts: ["localhost:5044"]
Then you need to change the ownership of this file to root using the command below (macOS). Afterwards you can edit the file using sudo nano.
sudo chown root filebeat.yml
Logstash:
A sample logstash.conf file is available in the config folder inside the Logstash directory. You can refer to that, and also take a look at mine.
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
  }
}
filter {
  # Extracting a portion of the log message
  dissect {
    mapping => {
      "message" => "%{}: %{data_message}"
    }
  }
  # Converting the message JSON into fields
  json {
    source => "data_message"
  }
  # Mapping the message JSON timestamp to the entry timestamp
  date {
    target => "@timestamp"
    match => ["timestamp","yyyy-MM-dd HH:mm:ss.SSS"]
  }
  # Removing unneeded fields
  mutate {
    remove_field => ["[event][original]","message","data_message","timestamp"]
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "myindex"
    user => "elastic"
    password => "*****************"
    ssl_certificate_verification => false
  }
  stdout {
    codec => rubydebug
  }
}
I used the dissect filter to do string manipulation on the log line that Filebeat transferred. Below is my log line; I needed only the actual message, which is a JSON string:
2022-12-15 21:14:56.152 INFO 9278 --- [http-nio-8080-exec-10] c.p.t.springdemo.controller.MainController : {"name":"leons","id":"123123","msg":"hello world"}
For more on dissect, refer to the official docs.
The json filter converts the JSON key/value pairs into fields and values in your Elasticsearch document.
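To make the two steps more concrete, here is a small Python sketch that imitates what the dissect mapping and the json filter do to the sample log line. It is only an approximation of the Logstash behaviour (the function name extract_json_payload is made up), but it shows why everything before the first ": " is dropped and how the rest becomes fields.

```python
import json


def extract_json_payload(line):
    """Imitate dissect "%{}: %{data_message}" followed by the json filter."""
    # Everything up to the first ": " is skipped; the rest is the payload.
    _, data_message = line.split(": ", 1)
    # The json filter then turns the payload into key/value fields.
    return json.loads(data_message)


line = ('2022-12-15 21:14:56.152 INFO 9278 --- [http-nio-8080-exec-10] '
        'c.p.t.springdemo.controller.MainController : '
        '{"name":"leons","id":"123123","msg":"hello world"}')
doc = extract_json_payload(line)
print(doc)
```

Note that this only works because the prefix of the line contains no earlier ": " (the colons in the time are not followed by a space).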
Now you should be ready to run Logstash and Filebeat using the commands from the official documentation. Just for reference, use the commands below:
Logstash :
bin/logstash -f logstash.conf
Filebeat :
sudo ./filebeat -e -c filebeat.yml

How can I store the logs that are generated using log4j into Elasticsearch using filebeat?

I have a log file containing logs (sent from log4j). I would like to store these logs in Elasticsearch. The log file is dynamic, meaning that log4j is constantly appending new logs to it. I don't want to store system logs (which is what most tutorials cover). How can I configure the filebeat.yml file? Even some resources will be helpful. Much appreciated.
PS: I'm using Ubuntu 20.04
and this is the path of my file
/home/user/Log/Logging.log
The log in my file looks something like this
2022-01-22 21:04:40 INFO CalcServlet:135 - sort
You can use the dissect processor:
processors:
  - dissect:
      tokenizer: "%{date} %{time} %{level} %{component}:%{line|integer} - %{message}"
      field: "message"
      target_prefix: "dissect"
You can find a detailed example here.
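If you want to check what the tokenizer would extract before touching Filebeat, here is a rough Python sketch that mimics it with a regular expression on the sample line from the question. It is not how Filebeat implements dissect, and the name dissect_line is made up, but the resulting structure should be similar.

```python
import re

# Rough regex equivalent of the dissect tokenizer from the answer:
# date, time, level, class name, line number, then " - " and the message.
LOG_RE = re.compile(
    r'(?P<date>\S+) (?P<time>\S+) (?P<level>\S+) '
    r'(?P<component>[^:]+):(?P<line>\d+) - (?P<message>.*)'
)


def dissect_line(raw):
    """Return the extracted fields nested under "dissect", or None on no match."""
    m = LOG_RE.match(raw)
    if m is None:
        return None
    fields = m.groupdict()
    fields["line"] = int(fields["line"])  # mirrors the "|integer" conversion
    # target_prefix: "dissect" puts the result under a "dissect" key
    return {"dissect": fields}


event = dissect_line("2022-01-22 21:04:40 INFO CalcServlet:135 - sort")
print(event)
```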

proper set up of parsing custom logs with logstash to kibana, i see no errors and no data

I'm playing a bit with Kibana to see how it works.
I was able to add nginx log data directly from the same server without Logstash and it works properly. But using Logstash to read log files from a different server doesn't show any data. No error, but no data.
I have custom logs from PM2, which runs some PHP script for me, and the format of the messages is:
Timestamp [LogLevel]: msg
example:
2021-02-21 21:34:17 [DEBUG]: file size matches written file size 1194179
So my grok filter is:
"%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}"
I checked with a grok validator and the syntax matches the file format.
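For readers who want to sanity-check the same line format outside of Logstash, here is a minimal Python sketch with a simplified stand-in for the grok pattern. The date and level sub-patterns are assumptions and are much narrower than grok's DATESTAMP and LOGLEVEL, so this is not an exact translation; parse_pm2_line is a made-up name.

```python
import re

# Simplified stand-in for the grok pattern:
# a "YYYY-MM-DD HH:MM:SS" timestamp, a level in square brackets, then the message.
PM2_RE = re.compile(
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<loglevel>[A-Za-z]+)\]: (?P<msg>.*)'
)


def parse_pm2_line(line):
    """Return timestamp, loglevel and msg as a dict, or None on no match."""
    m = PM2_RE.match(line)
    return m.groupdict() if m else None


parsed = parse_pm2_line(
    "2021-02-21 21:34:17 [DEBUG]: file size matches written file size 1194179"
)
print(parsed)
```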
I've got files with the suffix out that are debug level, and files with the suffix error for error level.
So to configure Logstash on the Kibana server, I added the file /etc/logstash/conf.d/pipeline.conf with the following:
input {
  beats {
    port => 5544
  }
}
filter {
  grok {
    match => {"message"=>"%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}"}
  }
  mutate {
    rename => ["host", "server"]
    convert => {"server" => "string"}
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    user => "<USER>"
    password => "<PASSWORD>"
  }
}
I needed to rename the host variable to server, or I would get errors like Can't get text on a START_OBJECT and failed to parse field [host] of type [text].
On the 2nd server, where the PM2 logs reside, I configured Filebeat with the following:
- type: filestream
  enabled: true
  paths:
    - /home/ubuntu/.pm2/*-error-*log
  fields:
    level: error
- type: filestream
  enabled: true
  paths:
    - /home/ubuntu/.pm2/logs/*-out-*log
  fields:
    level: debug
I tried to use log instead of filestream and the results are the same.
But it makes sense to use filestream since the logs are updated constantly, right?
So I have Logstash running on one server and Filebeat on the other, with the firewall ports opened. I can see they're connecting, but I don't see any new data in the Kibana Logs dashboard relevant to the files I fetch with Logstash.
filebeat log always shows this line Feb 24 04:41:56 vcx-prod-backup-01 filebeat[3797286]: 2021-02-24T04:41:56.991Z INFO [file_watcher] filestream/fswatch.go:131 Start next scan and something about analytics metrics so it looks fine, and still no data.
I tried to provide as much information as I can here. I'm new to Kibana and have no idea why data is not shown in Kibana if there are no errors.
I thought maybe I didn't escape the square brackets properly in the grok filter, so I tried using "%{DATESTAMP:timestamp} \\[%{LOGLEVEL:loglevel}\\]: %{GREEDYDATA:msg}", which replaces \[ with \\[, but the results are the same.
Any information regarding this issue would be greatly appreciated.
Update:
Using stack version 7.11.1.
I changed back to log instead of filestream based on @leandrojmp's recommendations.
I checked for harvester.go-related lines in the Filebeat log and found these:
Feb 24 14:16:36 SERVER filebeat[4128025]: 2021-02-24T14:16:36.566Z INFO log/harvester.go:302 Harvester started for file: /home/ubuntu/.pm2/logs/cdr-ssh-out-1.log
Feb 24 14:16:36 SERVER filebeat[4128025]: 2021-02-24T14:16:36.567Z INFO log/harvester.go:302 Harvester started for file: /home/ubuntu/.pm2/logs/cdr-ftp-out-0.log
I also noticed that when I configured the output to stdout, I do see the events that are coming from the other server. So Logstash does receive them properly, but for some reason I don't see them in Kibana.
If you output to both stdout and elasticsearch but do not see the logs in Kibana, you will need to create an index pattern in Kibana so it can show your data.
After creating an index pattern for your data (in your case it could be something like logstash-*), you will need to configure the Logs app inside Kibana to look for this index; by default the Logs app looks for filebeat-* indices.
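To illustrate why a filebeat-* pattern hides data that Logstash wrote, here is a tiny Python sketch that uses shell-style wildcard matching as a stand-in for index pattern matching. The real matching is done by Elasticsearch and Kibana, not by fnmatch, so treat this as an approximation; the index name is the one that shows up later in this thread, and matches_any is a made-up helper.

```python
from fnmatch import fnmatchcase


def matches_any(index_name, patterns):
    """True if the index name matches at least one comma-separated wildcard pattern."""
    return any(fnmatchcase(index_name, p.strip()) for p in patterns.split(","))


index = "logstash-2021.02.24-000001"
# The default Logs app pattern does not cover an index written by Logstash
print(matches_any(index, "filebeat-*"))
# After appending a logstash pattern, the index is picked up
print(matches_any(index, "filebeat-*,logstash*"))
```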
OK, so @leandrojmp helped me a lot in understanding what's going on with Kibana. Thank you! All the credit goes to you! I just wanted to write a long answer that may help other people overcome the initial setup.
Let's start fresh.
I wanted one Kibana node that monitors custom logs on a different server.
I have the latest Ubuntu LTS installed on both, added the deb repositories, installed Kibana, Elasticsearch and Logstash on the first, and Filebeat on the 2nd.
The basic setup is without much security and SSL, which is not what I'm looking at here since I'm new to this topic. Everything is mostly set up.
In kibana.yml I changed the host to 0.0.0.0 instead of localhost so I can connect from outside, and in Logstash I added the following conf file:
input {
  beats {
    port => 5544
  }
}
filter {
  grok {
    match => {"message"=>"%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}"}
  }
  mutate {
    rename => ["host", "server"]
    convert => {"server" => "string"}
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
}
I didn't complicate things and didn't need to set up additional authentication.
My filebeat.yml configuration:
- type: log
  enabled: true
  paths:
    - /home/ubuntu/.pm2/*-error-*log
  fields:
    level: error
- type: log
  enabled: true
  paths:
    - /home/ubuntu/.pm2/logs/*-out-*log
  fields:
    level: debug
I started everything. There were no errors in any logs but still no data in Kibana. Since I had no clue how Elasticsearch stores its data, I needed to find out how to connect to Elasticsearch and see if the data was there. So I executed curl -X GET http://localhost:9200/_cat/indices?v and noticed a logstash index, then executed curl -X GET http://localhost:9200/logstash-2021.02.24-000001/_search and saw that the log data is present in the database.
So it must mean that the problem is with Kibana. Using the Kibana web interface, under settings, I noticed a configuration option called "Index pattern for matching indices that contain log data", and the input there did not match the logstash index name. So I appended ,logstash* to it and voila, it works :)
Thanks.

How to specify pipeline for Filebeat Nginx module?

I have a web server (Ubuntu) with Nginx + PHP.
It has Filebeat, which sends Nginx logs directly to an Elasticsearch ingest node (no Logstash or anything else).
When I first installed it, I made some customizations to the pipeline that Filebeat created.
Everything worked great for a month or so.
But I noticed that every Filebeat upgrade results in the creation of a new pipeline. Currently I have these:
filebeat-7.3.1-nginx-error-pipeline: {},
filebeat-7.4.1-nginx-error-pipeline: {},
filebeat-7.2.0-nginx-access-default: {},
filebeat-7.3.2-nginx-error-pipeline: {},
filebeat-7.4.1-nginx-access-default: {},
filebeat-7.3.1-nginx-access-default: {},
filebeat-7.3.2-nginx-access-default: {},
filebeat-7.2.0-nginx-error-pipeline: {}
I can create a new pipeline, but how do I tell (how do I configure) Filebeat to use a specific pipeline?
Here is what I tried, and it doesn't work:
- module: nginx
  # Access logs
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/*/*access.log"]
    # Convert the timestamp to UTC
    var.convert_timezone: true
    # The Ingest Node pipeline ID associated with this input. If this is set, it
    # overwrites the pipeline option from the Elasticsearch output.
    output.elasticsearch.pipeline: 'filebeat-nginx-access-default'
    pipeline: 'filebeat-nginx-access-default'
It is still using the filebeat-7.4.1-nginx-error-pipeline pipeline.
Here are the Filebeat instructions on how to configure it (but I can't make it work):
https://github.com/elastic/beats/blob/7.4/filebeat/filebeat.reference.yml#L1129-L1130
Question:
how can I configure Filebeat module to use specific pipeline?
Update (Nov 2019): I submitted related bug: https://github.com/elastic/beats/issues/14348
In the Beats source code, I found that the pipeline ID is determined by the following params:
beats version
module name
module's fileset name
pipeline filename
The source code snippet is as follows:
// formatPipelineID generates the ID to be used for the pipeline ID in Elasticsearch
func formatPipelineID(module, fileset, path, beatVersion string) string {
	return fmt.Sprintf("filebeat-%s-%s-%s-%s", beatVersion, module, fileset, removeExt(filepath.Base(path)))
}
So you cannot assign the pipeline ID yourself; that would need official support from Elastic.
For now, the pipeline ID changes along with these four params. You MUST change the pipeline ID in Elasticsearch when you upgrade Beats.
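To see how the version ends up in the ID, here is a small Python port of the Go function above. The pipeline file paths passed in are assumptions for illustration (chosen so the results line up with the pipeline names listed in the question), and remove_ext is a simplified equivalent of the Go helper.

```python
import posixpath


def remove_ext(path):
    """Drop the file extension, e.g. "default.json" becomes "default"."""
    return posixpath.splitext(path)[0]


def format_pipeline_id(module, fileset, path, beat_version):
    """Python port of the Go formatPipelineID shown above."""
    name = remove_ext(posixpath.basename(path))
    return "filebeat-%s-%s-%s-%s" % (beat_version, module, fileset, name)


# The pipeline file paths below are assumed for illustration
for version in ("7.3.2", "7.4.1"):
    print(format_pipeline_id("nginx", "access", "ingest/default.json", version))
    print(format_pipeline_id("nginx", "error", "ingest/pipeline.json", version))
```

Since the Beats version is the first variable part of the ID, every upgrade produces a new set of pipeline names, which matches the list in the question.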
Refer to /{filebeat-HOME}/module/nginx/access/manifest.yml;
maybe you should set ingest_pipeline in /{filebeat-HOME}/modules.d/nginx.yml.
The value seems to be a local file.
The pipeline can be configured either in your input or output configuration, not in the modules one.
So in your configuration you have different sections, the one you show in your question is for configuring the nginx module. You need to open filebeat.yml and look for the output section where you have configured elasticsearch and put the pipeline configuration there:
#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["elk.slavikf.com:9200"]
  pipeline: filebeat-nginx-access-default
If you need to be able to use different pipelines depending on the nature of data you can definitely do so using pipeline mappings:
output.elasticsearch:
  hosts: ["elk.slavikf.com:9200"]
  pipelines:
    - pipeline: "nginx_pipeline"
      when.contains:
        type: "nginx"
    - pipeline: "apache_pipeline"
      when.contains:
        type: "apache"
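As far as I understand it, the rules in pipelines are checked in order and the first matching one wins. Here is a minimal Python sketch of that selection logic. It is a simplification, not Filebeat's actual implementation: contains is modelled as a substring or list-membership check, and the names select_pipeline and fallback_pipeline are hypothetical.

```python
def contains(event, conditions):
    """True if every listed field contains the wanted value (substring or list member)."""
    for field, wanted in conditions.items():
        value = event.get(field)
        if not isinstance(value, (str, list)) or wanted not in value:
            return False
    return True


def select_pipeline(event, rules, default=None):
    """Return the pipeline of the first rule whose condition matches the event."""
    for rule in rules:
        if contains(event, rule["when.contains"]):
            return rule["pipeline"]
    return default


RULES = [
    {"pipeline": "nginx_pipeline", "when.contains": {"type": "nginx"}},
    {"pipeline": "apache_pipeline", "when.contains": {"type": "apache"}},
]
print(select_pipeline({"type": "nginx"}, RULES))
print(select_pipeline({"type": "mysql"}, RULES, default="fallback_pipeline"))
```

Note that the events must actually carry a type field with those values for the rules to match; otherwise none of them applies.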

Generating filebeat custom fields

I have an elasticsearch cluster (ELK) and some nodes sending logs to the logstash using filebeat. All the servers in my environment are CentOS 6.5.
The filebeat.yml file in each server is enforced by a Puppet module (both my production and test servers got the same configuration).
I want to have a field in each document which tells if it came from a production/test server.
I wanted to generate a dynamic custom field in every document which indicates the environment (production/test) using filebeat.yml file.
To work this out, I thought of running a command which returns the environment (it is possible to know the environment through facter) and adding it under an "environment" custom field in the filebeat.yml file, but I couldn't find any way of doing so.
Is it possible to run a command through filebeat.yml?
Is there any other way to achieve my goal?
In your filebeat.yml:
filebeat:
  prospectors:
    -
      paths:
        - /path/to/my/folder
      input_type: log
      # Optional additional fields. These fields can be freely picked
      # to add additional information to the crawled log files
      fields:
        mycustomvar: production
In Filebeat 7.2.0 I use the following syntax:
processors:
  - add_fields:
      target: ''
      fields:
        mycustomfieldname: customfieldvalue
note: target = '' means that mycustomfieldname is a top-level field
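To show the difference the target setting makes, here is a small Python sketch that imitates where the field ends up in the event. It assumes that add_fields nests the values under a "fields" key when no target is given, which is my understanding of the default; the function below is an illustration, not Filebeat code.

```python
import copy


def add_fields(event, fields, target="fields"):
    """Return a copy of the event with fields added under target ('' means top level)."""
    result = copy.deepcopy(event)
    destination = result if target == "" else result.setdefault(target, {})
    destination.update(fields)
    return result


event = {"message": "some log line"}
top_level = add_fields(event, {"mycustomfieldname": "customfieldvalue"}, target="")
nested = add_fields(event, {"mycustomfieldname": "customfieldvalue"})
print(top_level)
print(nested)
```

With target set to '' the field sits next to message; without it, the field would be found under fields.mycustomfieldname.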
official 7.2 docs
Yes, you can add fields to the document through Filebeat.
The official doc shows you how.
