How to gather logs to Elasticsearch

I have the logs of web apps on different servers (many machines). How can I gather these logs into a system where I have Elasticsearch and Kibana installed? When I searched, I only found tutorials that show a setup where the logs, Logstash, Beats, Elasticsearch and Kibana all sit on the same machine.

Since you have many machines producing logs, you need to set up the ELK stack with Filebeat, Logstash, Elasticsearch and Kibana.
You need to set up a Filebeat instance on each machine.
It will watch the log files on each machine and forward them to the Logstash instance you point it at in the filebeat.yml configuration file, like below:
#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /path_to_your_log_1/ELK/your_log1.log
    - /path_to_your_log_2/ELK/your_log2.log

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["private_ip_of_logstash_server:5044"]
The Logstash server listens on port 5044 and streams all the logs through its pipeline configuration files:
input {
  beats {
    port => 5044
  }
}
filter {
  # your log filtering logic goes here
}
output {
  elasticsearch {
    hosts => [ "elasticsearch_server_private_ip:9200" ]
    index => "your_index_name"
  }
}
In Logstash you can filter your logs, split them into fields, and send them to Elasticsearch.
Elasticsearch saves all the data we send through Logstash in indexes.
All the data in the Elasticsearch database can be read through Kibana, and with Kibana we can build dashboards with many types of charts based on our data.
The basic architecture for ELK with Filebeat is: Filebeat (on each server) -> Logstash -> Elasticsearch -> Kibana.

You need to install Filebeat first; it collects the logs from all the web servers.
After that, pass the logs from Filebeat -> Logstash.
In Logstash you can format the logs and drop unwanted ones based on Grok patterns (a minimal filter sketch follows this list).
Forward the logs from Logstash -> Elasticsearch for storing and indexing.
Connect Kibana to Elasticsearch, add the index pattern, and view the logs in a table based on the selected index.
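A rough sketch of such a filter (the grok pattern and field names are hypothetical and depend on your actual log format):

filter {
  grok {
    # Assumes lines like: 2023-01-01 12:00:00 ERROR Something went wrong
    match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{LOGLEVEL:level} %{GREEDYDATA:log_message}" }
  }
}

Matching events are split into log_timestamp, level and log_message fields, which you can then query and chart in Kibana.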

As mentioned in other answers, you will need to install Filebeat on all of your instances to watch your log files and ship the logs.
Your Filebeat configuration will depend on your log format (for example log4j) and on where you want to ship the logs (for example: Kafka, Logstash, Elasticsearch); a Kafka variant is sketched after the example below.
Config example:
filebeat.inputs:
- type: log
  paths:
    - /var/log/system.log
  multiline.pattern: '^\REGEX_TO_MATCH_YOUR_LOG_FORMAT'
  multiline.negate: true
  multiline.match: after

output.elasticsearch:
  hosts: ["https://localhost:9200"]
  username: "filebeat_internal"
  password: "YOUR_PASSWORD"
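If you ship to Kafka instead, the output section would look roughly like this (a sketch only; the broker addresses and topic name are hypothetical):

output.kafka:
  # Brokers of your Kafka cluster
  hosts: ["kafka-1.internal:9092", "kafka-2.internal:9092"]
  # Topic that Logstash (or any other consumer) reads from
  topic: "filebeat-logs"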
Also, Logstash is not mandatory if you don't want to use it: logs can be sent directly to Elasticsearch, but then you will need to set up an ingest pipeline in your Elasticsearch cluster to process the incoming logs; more on ingest pipelines here.
One more useful link: Working With Ingest Pipelines In ElasticSearch And Filebeat
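A minimal sketch of that approach (the pipeline name, grok pattern and field names are hypothetical):

PUT _ingest/pipeline/app-logs-pipeline
{
  "description": "Parse application log lines",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:log_timestamp} %{LOGLEVEL:level} %{GREEDYDATA:log_message}"]
      }
    }
  ]
}

and then point Filebeat at it in filebeat.yml:

output.elasticsearch:
  hosts: ["https://localhost:9200"]
  pipeline: "app-logs-pipeline"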

In order to gather all your web application logs you need to set up the ELK stack. Right now you have Elasticsearch set up, which is just the database where all the log data is saved. To view those logs you need Kibana, which is the UI, and then you need Logstash and Filebeat to read your application's logs and transfer them to Logstash, or directly to Elasticsearch.
If you want a proper centralized logging system then I recommend using Logstash together with Filebeat. Since you have several servers, install Filebeat on each of them, and on your main server, where you have Kibana and Elasticsearch, install Logstash and point all the Filebeats to that server.
Filebeat is a lightweight data shipper that we install as an agent on servers to send specific kinds of operational data to Logstash; Logstash then filters the log data and sends it on to Elasticsearch.
Check How To Setup ELK and follow the instructions on that website. Also look at FileBeat + ELK Setup.

You can use Splunk and the Splunk forwarder to gather all the logs together.
Use the Splunk forwarder on your web servers to forward all the logs to your centralized server running Splunk.

If you don't want to add another tool to your Elasticsearch and Kibana stack, you can send logs directly to Elasticsearch, but you should construct your pipeline carefully to keep the system stable.
To gather the logs you can use Python or another language; for Python I would use this library:
https://elasticsearch-py.readthedocs.io/en/master/
There is also a Medium tutorial for Python:
https://medium.com/naukri-engineering/elasticsearch-tutorial-for-beginners-using-python-b9cb48edcedc
If you prefer other languages to push your logs to Elasticsearch, you can of course use them too. I just suggest Python because I am more familiar with it, and you can also use it to build a fast prototype before turning it into a live product.
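A rough sketch of that idea with elasticsearch-py (assuming the 8.x client; the log path, index name and field names are made up for illustration):

from datetime import datetime, timezone
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust host and auth for your cluster

with open("/var/log/myapp/app.log") as f:    # hypothetical application log
    for line in f:
        es.index(
            index="app-logs",
            document={
                "message": line.rstrip("\n"),
                "ingested_at": datetime.now(timezone.utc).isoformat(),
            },
        )

For any real volume you would batch the writes with elasticsearch.helpers.bulk instead of indexing line by line.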

Related

How to configure Filebeat as an agent in a Kubernetes cluster

I am trying to add ELK to my project which is running on Kubernetes. I want to pass logs via Filebeat -> Logstash and then to Elasticsearch. I prepared my filebeat.yml file, and in my company Filebeat is configured as an agent in the cluster, which I don't really understand. I want to know how to configure Filebeat in this case: do I just add the file to the project so it is taken into consideration once the pod starts, or how does it work?
You can configure Filebeat in a couple of ways.
1 - You can run it as a DaemonSet, meaning each node of your Kubernetes cluster will run one Filebeat pod. Usually, in this architecture, you'll need only one filebeat.yaml configuration file, where you set the inputs, filters, outputs (output to Logstash, Elasticsearch, etc.), and so on. In this case, Filebeat will need root access inside your cluster.
2 - Use Filebeat as a sidecar of your application's Kubernetes resource. You can configure an emptyDir volume in the Deployment/StatefulSet, share it with the Filebeat sidecar, and set Filebeat to monitor this directory (a minimal sketch follows).
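A minimal sketch of the sidecar option (image tags, names and paths are hypothetical, and the filebeat.yml itself would normally come from a ConfigMap, omitted here):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      volumes:
        - name: app-logs
          emptyDir: {}                      # shared between the app and Filebeat
      containers:
        - name: my-app
          image: my-app:latest              # hypothetical application image
          volumeMounts:
            - name: app-logs
              mountPath: /var/log/my-app    # the app writes its log files here
        - name: filebeat                    # sidecar tailing the shared directory
          image: docker.elastic.co/beats/filebeat:8.13.0
          volumeMounts:
            - name: app-logs
              mountPath: /var/log/my-app
              readOnly: true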

Run ELK with filebeat

I try to start ELK with docker-compose in WSL2, but I can't find any indexes in Kibana.
Test code.
I try to load logs from /var/log/*.log using Filebeat.
When I open Kibana at http://localhost:5601/ it offers to add new data.
I expected to see data in Kibana under indexes which should have been created by Filebeat.

Is it possible to configure multiple output for a filebeat?

In one of our applications we parse the application logs using Logstash and index them into Elasticsearch. Our simple architecture is: log files ---> Filebeat ---> Logstash ---> Elasticsearch.
Since we enabled multiple log files (apache logs, passenger logs, application logs, etc.), Logstash is not able to parse the volume of data, and hence logs are missing in Elasticsearch. Is there any way to handle a huge volume of data in Logstash, or can we have multiple Logstash servers receiving logs from Filebeat based on the log type? For example: application logs go to logstash-1 and apache logs to logstash-2.
Thanks in advance.
It is not currently possible to define the same output type multiple times in Filebeat.
But there are a few options to achieve what you want:
You can use the loadbalance option in Filebeat to distribute your events across multiple Logstash hosts (see the sketch after this list): https://www.elastic.co/guide/en/beats/filebeat/current/logstash-output.html#loadbalance. By default Beats pick a random host and stick to it.
Use a queue, like Kafka, and make Logstash use the Kafka input; this allows you to add more Logstash instances as you need them.
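A minimal sketch of the loadbalance option in filebeat.yml (the host names are hypothetical):

output.logstash:
  # Events are balanced across both Logstash instances instead of
  # sticking to a single randomly chosen host.
  hosts: ["logstash-1.internal:5044", "logstash-2.internal:5044"]
  loadbalance: true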

How to watch the logstash log?

For my enterprise application's distributed and structured logging, I use Logstash for log aggregation and Elasticsearch as log storage. I have clear control over pushing logs from my application to Logstash; on the other hand, I have very little control over the path from Logstash to Elasticsearch.
Assume my Elasticsearch goes down for some reason. The Logstash log (/var/log/logstash/logstash.log) records the reason clearly, like the following one:
Attempted to send a bulk request to Elasticsearch configured at '["http://localhost:9200/"]', but Elasticsearch appears to be unreachable or down! {:client_config=>{:hosts=>["http://localhost:9200/"], :ssl=>nil, :transport_options=>{:socket_timeout=>0, :request_timeout=>0, :proxy=>nil, :ssl=>{}}, :transport_class=>Elasticsearch::Transport::Transport::HTTP::Manticore, :logger=>nil, :tracer=>nil, :reload_connections=>false, :retry_on_failure=>false, :reload_on_failure=>false, :randomize_hosts=>false}, :error_message=>"Connection refused", :class=>"Manticore::SocketException", :level=>:error}
How will I get notified about error-level logs from Logstash?
This should be doable with the following 3 steps:
1) It depends on how you want to get notified. If an email is sufficient you could use the Logstash email output plugin.
But there are many more output plugins available.
2) To restrict the notification to certain events you can do something like this in your Logstash config (the example is taken from the Elastic support site):
output {
  if [level] == "ERROR" {
    ...
  }
}
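As a rough sketch of step 1 combined with this conditional (the email output plugin must be installed; the addresses and SMTP relay below are hypothetical):

output {
  if [level] == "ERROR" {
    email {
      to      => "ops@example.com"        # hypothetical recipient
      from    => "logstash@example.com"   # hypothetical sender
      subject => "Logstash error"
      body    => "Error event: %{message}"
      address => "smtp.example.com"       # hypothetical SMTP relay
      port    => 25
    }
  }
}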
The if clause is not limited to the level field of your JSON; you can of course apply it to any of your JSON fields, which makes it more powerful.
3) To make this work (and not run into a logging cycle) you either need to:
start a second Logstash instance on your system (just observing the Logstash ERROR log), which should be okay from what is written here,
or build a more complicated configuration using just one Logstash instance. This configuration has to forward log statements from YOUR application to Elasticsearch, while log statements from the Logstash ERROR log are forwarded to e.g. the Logstash email output plugin.
Side note: you may want to have a look at Filebeat, which works very well with Logstash (it is from Elastic as well) and is even more lightweight than Logstash. It allows things like include_lines: ["^ERR", "^WARN"] in your configuration.
To receive input from Filebeat you will have to adapt the config to send data to Logstash, and in Logstash you will have to activate and use the Beats input plugin described here; a small sketch of both pieces follows.
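A minimal filebeat.yml fragment tying those last two points together (the watched path is hypothetical):

filebeat.inputs:
- type: log
  paths:
    - /var/log/logstash/logstash.log   # watch the Logstash log itself
  include_lines: ["^ERR", "^WARN"]     # only ship error/warning lines

output.logstash:
  hosts: ["localhost:5044"]            # the Beats input of your (second) Logstash instance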

Filebeat > is it possible to send data to Elasticsearch by means of Filebeat without Logstash

I am a newbie to ELK. I first installed Elasticsearch and Filebeat without Logstash, and I would like to send data from Filebeat to Elasticsearch. After I installed Filebeat and configured the log files and the Elasticsearch host, I started Filebeat, but then nothing happened, even though there are lots of rows in the log files that Filebeat prospects.
So is it possible to forward log data directly to the Elasticsearch host without Logstash at all?
It looks like your ES 2.3.1 is only configured to be reachable from localhost (the default since ES 2.0).
You need to modify your elasticsearch.yml file with this and restart ES:
network.host: 168.17.0.100
Then your filebeat output configuration needs to look like this:
output:
  elasticsearch:
    hosts: ["168.17.0.100:9200"]
Then you can check in your ES filebeat-* indices that you're getting the new log data (i.e. the hits.total count should increase over time):
curl -XGET 168.17.0.100:9200/filebeat-*/_search
