The instructions on how to configure Logstash with Elastic Cloud using the cloud.id are not complete. Specifically, the instructions do not say what to put into the output section of the *.conf file if you put the cloud.id and cloud.auth into logstash.yml. Using the cloud.id is supposed to negate the need to put in the URL of the ES instance.
If you put nothing in the output section, Logstash throws a config error. If you put something there (illogical, since nothing should be needed), it tries to connect to ES on localhost:
output {
  elasticsearch {
  }
}
Here is the error; Logstash is not using the cloud.id:
[WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://127.0.0.1:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/]
You should repeat the login information in the output section, despite it seeming illogical, as stated here. The output of your pipeline should look something like this:
output {
  elasticsearch {
    hosts => ["https://xxxxxxxxxxxx.eu-central-1.aws.cloud.es.io:9243"]
    user => "myUsername"
    password => "myPassword"
  }
  stdout { codec => rubydebug }
}
cloud.id and cloud.auth are only meant for configuring:
internal modules (such as netflow)
the Logstash monitoring
the centralized pipeline management
The elasticsearch output is another beast that requires its own configuration and does not (yet) benefit from the main configuration.
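For reference, here is a minimal sketch of where those main settings live in logstash.yml (both values below are placeholders); the pipeline's elasticsearch output still needs its own hosts, user and password as shown above:

# logstash.yml -- these settings feed modules, monitoring and centralized
# pipeline management only; the elasticsearch output plugin ignores them.
cloud.id: "<deployment-name>:<base64-encoded-cloud-data>"   # placeholder
cloud.auth: "myUsername:myPassword"                         # placeholder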
Related
Setting up a pipeline of Elasticsearch, Kibana, and Logstash locally and using Filebeat to push logs from a Spring Boot application to the pipeline. You will find the official documentation well-defined, but I created this question to answer a few points that were not clear. I answered for a single Spring Boot app scenario; thanks to the people who are adding their scenarios as well.
I spent a few days configuring the ELK stack with my Spring Boot application. I won't specify the step-by-step integration here; for that, you can refer to the official documentation. This is more focused on what I didn't find in the documentation steps.
Env: This is focused on setting up version 8.5.3 on macOS.
For Elasticsearch and Kibana I didn't have any trouble following the official document word for word.
Elasticsearch: https://www.elastic.co/downloads/elasticsearch
Kibana: https://www.elastic.co/downloads/kibana
Tip: To check that Elasticsearch is running:
curl --cacert config/certs/http_ca.crt -u elastic https://localhost:9200
Enter password when prompted
Enter host password for user 'elastic':
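If Elasticsearch is up, the response is a small JSON document with cluster information, roughly like this (abridged; values are illustrative):

{
  "name" : "my-node",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "8.5.3"
  },
  "tagline" : "You Know, for Search"
}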
In my project, I needed to extract only a specific log line and process it. You can use the official document links below to download and extract Logstash and Filebeat. Then you can use the configs mentioned below before you run them.
Logstash: https://www.elastic.co/downloads/logstash
Filebeat: https://www.elastic.co/downloads/beats/filebeat
Filebeat:
First, you need to make permission changes to your filebeat.yml file. Navigate to your extracted Filebeat folder; you can use the following config if needed.
filebeat.inputs:
  - type: filestream
    id: filebeat-id-name
    enabled: true
    # Path to your log file
    paths:
      - /Users/leons/IdeaProjects/SpringELKDemo/myapplogs.log
    # I wanted to only process the logs from MainController
    include_lines: ['MainController']

output.logstash:
  hosts: ["localhost:5044"]
Then you need to alter the ownership of this file using the below command (mac). Later you can edit the file using sudo nano.
sudo chown root filebeat.yml
Logstash:
Initially, a sample logstash.conf file is available in the config folder inside Logstash. You can refer to that; also take a look at mine.
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
  }
}

filter {
  # Extracting a portion of the log message
  dissect {
    mapping => {
      "message" => "%{}: %{data_message}"
    }
  }
  # Converting the message JSON into fields
  json {
    source => "data_message"
  }
  # Mapping the message JSON timestamp to the event timestamp
  date {
    target => "@timestamp"
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
  }
  # Removing unneeded fields
  mutate {
    remove_field => ["[event][original]", "message", "data_message", "timestamp"]
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "myindex"
    user => "elastic"
    password => "*****************"
    ssl_certificate_verification => false
  }
  stdout {
    codec => rubydebug
  }
}
I used the dissect filter to do string manipulation on the log line that Filebeat transferred. Below is my log line; I needed only the exact message, which is a JSON string:
2022-12-15 21:14:56.152 INFO 9278 --- [http-nio-8080-exec-10] c.p.t.springdemo.controller.MainController : {"name":"leons","id":"123123","msg":"hello world"}
For more on dissect, refer to the official docs.
The json filter is used to convert the JSON key/value pairs into fields and values in your Elasticsearch document.
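To make this concrete: for the sample log line above, the dissect pattern splits on the first ": " and keeps everything after it in data_message, and the json filter then expands that JSON into top-level fields. Roughly (a sketch, not actual Logstash output):

After dissect:
data_message => {"name":"leons","id":"123123","msg":"hello world"}
After json (and the mutate remove_field):
name => "leons"
id => "123123"
msg => "hello world"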
Now you should be ready to run Logstash and Filebeat using the official document commands. Just for reference, use the below:
Logstash:
bin/logstash -f logstash.conf
Filebeat:
sudo ./filebeat -e -c filebeat.yml
I'm playing a bit with Kibana to see how it works.
I was able to add nginx log data directly from the same server without Logstash and it works properly, but using Logstash to read log files from a different server doesn't show data. No error, but no data.
I have custom logs from PM2, which runs some PHP scripts for me, and the format of the messages is:
Timestamp [LogLevel]: msg
example:
2021-02-21 21:34:17 [DEBUG]: file size matches written file size 1194179
So my grok filter is:
"%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}"
I checked with a grok validator and the syntax matches the file format.
I've got files with the suffix out that are debug level, and files with the suffix error for error level.
So to configure Logstash on the Kibana server, I added the file /etc/logstash/conf.d/pipeline.conf with the following:
input {
  beats {
    port => 5544
  }
}

filter {
  grok {
    match => { "message" => "%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}" }
  }
  mutate {
    rename => ["host", "server"]
    convert => { "server" => "string" }
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    user => "<USER>"
    password => "<PASSWORD>"
  }
}
I needed to rename the host variable to server or I would get errors like Can't get text on a START_OBJECT and failed to parse field [host] of type [text].
On the second server, where the PM2 logs reside, I configured Filebeat with the following:
- type: filestream
  enabled: true
  paths:
    - /home/ubuntu/.pm2/*-error-*log
  fields:
    level: error

- type: filestream
  enabled: true
  paths:
    - /home/ubuntu/.pm2/logs/*-out-*log
  fields:
    level: debug
I tried to use log instead of filestream; the results are the same.
But it makes sense to use filestream, since the logs are updated constantly, right?
So I have Logstash running on one server and Filebeat on the other, and I opened the firewall ports. I can see they're connecting, but I don't see any new data in the Kibana logs dashboard relevant to the files I fetch with Logstash.
The Filebeat log always shows this line: Feb 24 04:41:56 vcx-prod-backup-01 filebeat[3797286]: 2021-02-24T04:41:56.991Z INFO [file_watcher] filestream/fswatch.go:131 Start next scan, and something about analytics metrics, so it looks fine, and still no data.
I tried to provide here as much information as I can. I'm new to Kibana, and I have no idea why data is not shown in Kibana if there are no errors.
I thought maybe I didn't escape the square brackets properly in the grok filter, so I tried using "%{DATESTAMP:timestamp} \\[%{LOGLEVEL:loglevel}\\]: %{GREEDYDATA:msg}", which replaces \[ with \\[, but the results are the same.
Any information regarding this issue would be greatly appreciated.
Update:
Using stack version 7.11.1.
I changed back to log instead of filestream based on @leandrojmp's recommendations.
I checked for harvester.go-related lines in the Filebeat log and found these:
Feb 24 14:16:36 SERVER filebeat[4128025]: 2021-02-24T14:16:36.566Z INFO log/harvester.go:302 Harvester started for file: /home/ubuntu/.pm2/logs/cdr-ssh-out-1.log
Feb 24 14:16:36 SERVER filebeat[4128025]: 2021-02-24T14:16:36.567Z INFO log/harvester.go:302 Harvester started for file: /home/ubuntu/.pm2/logs/cdr-ftp-out-0.log
I also noticed that when I configured the output to stdout, I do see the events coming from the other server, so Logstash does receive them properly, but for some reason I don't see them in Kibana.
If your output uses both the stdout and elasticsearch outputs but you do not see the logs in Kibana, you will need to create an index pattern in Kibana so it can show your data.
After creating an index pattern for your data (in your case the index pattern could be something like logstash-*), you will need to configure the Logs app inside Kibana to look for this index; by default the Logs app looks for the filebeat-* index.
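For example, you can first confirm which index Logstash actually wrote to:

curl -X GET "http://localhost:9200/_cat/indices?v"

and then set the Logs app index setting in Kibana to include it, for example (an illustrative value; the default part may differ per version):

filebeat-*,logstash-*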
OK... so @leandrojmp helped me a lot in understanding what's going on with Kibana. Thank you! All the credit goes to you! I just wanted to write a long answer that may help other people overcome the initial setup.
Let's start fresh.
I wanted one Kibana node that monitors custom logs on a different server.
I have the latest Ubuntu LTS installed on both, added the deb repositories, and installed Kibana, Elasticsearch, and Logstash on the first and Filebeat on the second.
The basic setup is without much security or SSL, which is not what I'm focusing on here since I'm new to this topic; everything is mostly set up.
In kibana.yml I changed the host to 0.0.0.0 instead of localhost so I can connect from outside, and in Logstash I added the following conf file:
input {
  beats {
    port => 5544
  }
}

filter {
  grok {
    match => { "message" => "%{DATESTAMP:timestamp} \[%{LOGLEVEL:loglevel}\]: %{GREEDYDATA:msg}" }
  }
  mutate {
    rename => ["host", "server"]
    convert => { "server" => "string" }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
}
I didn't complicate things and didn't need to set up additional authentication.
My filebeat.yml configuration:
- type: log
  enabled: true
  paths:
    - /home/ubuntu/.pm2/*-error-*log
  fields:
    level: error

- type: log
  enabled: true
  paths:
    - /home/ubuntu/.pm2/logs/*-out-*log
  fields:
    level: debug
I started everything: no errors in any logs, but still no data in Kibana. Since I had no clue how Elasticsearch stores its data, I needed to find out how to connect to Elasticsearch and see whether the data is there, so I executed curl -X GET http://localhost:9200/_cat/indices?v and noticed a logstash index, then executed curl -X GET http://localhost:9200/logstash-2021.02.24-000001/_search and saw that the log data is present in the database.
So it must mean that it's something with Kibana. Using the Kibana web interface, under Settings I noticed a configuration called "Index pattern for matching indices that contain log data"; the input there did not match the logstash index name, so I appended ,logstash* to it and voila! It works :)
Thanks.
I have configured Logstash to listen to the logs at the default Airflow logs path. I want to create the index in Elasticsearch as {dag_id}-{task_id}-{execution_date}-{try_number}. All of these are parameters from Airflow. These are the modified values in airflow.cfg:
[core]
remote_logging = True
[elasticsearch]
host = 127.0.0.1:9200
log_id_template = {{dag_id}}-{{task_id}}-{{execution_date}}-{{try_number}}
end_of_log_mark = end_of_log
write_stdout = True
json_format = True
json_fields = asctime, filename, lineno, levelname, message
These task instance details need to be passed from Airflow to Logstash:
dag_id,
task_id,
execution_date,
try_number
This is my Logstash config file:
input {
  file {
    path => "/home/kmeenaravich/airflow/logs/Helloworld/*/*/*.log"
    start_position => beginning
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "logginapp-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
I have two questions. How do I pass the parameters from Airflow to Logstash?
I have configured Logstash to listen to the logs path. Since remote_logging is True in airflow.cfg, logs are not written to the base log folder. If that is False, or if I connect to Amazon S3, logs are written to the base_log_folder path too. But for me to configure Logstash, logs need to be written to a local folder. I use Airflow version 1.10.9. What can I do to stream my logs to an Elasticsearch index?
To answer your first question (I assume you mean passing the logs directly to Elasticsearch): you cannot. The Airflow "Elasticsearch Logging" is not really logging to Elasticsearch but more a configuration to enable the logs to get shipped to Elasticsearch. The naming of the attributes is (in my opinion) a little bit confusing, as it suggests that you can write directly to Elasticsearch.
You can configure Airflow to read logs from Elasticsearch. See Airflow Elasticsearch documentation for more information:
Airflow can be configured to read task logs from Elasticsearch and optionally write logs to stdout in standard or json format. These logs can later be collected and forwarded to the Elasticsearch cluster using tools like fluentd, logstash or others.
As you have enabled write_stdout = True, the output is written to stdout. If you want the output to be written to files, you have to set write_stdout = False or leave it empty. Your Logstash configuration should then find the files, which answers your second question.
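In other words, the relevant change in your airflow.cfg would look roughly like this (a sketch; the other settings from the question stay as they are):

[elasticsearch]
log_id_template = {{dag_id}}-{{task_id}}-{{execution_date}}-{{try_number}}
end_of_log_mark = end_of_log
write_stdout = False
json_format = True
json_fields = asctime, filename, lineno, levelname, message

With write_stdout = False the task logs end up in files again, which the file input in your Logstash config can then pick up.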
Cheers
Michael
I am running one instance of Elasticsearch and one of Logstash in parallel on the same computer.
When trying to load a file into Elasticsearch using Logstash running the config file below, I get the following output messages from Elasticsearch and no file is loaded.
(When the input is configured to be stdin, everything seems to work just fine.)
Any ideas?
"
[2014-06-17 22:42:24,748][INFO ][cluster.service ] [Masked Marvel] removed {[logstash-Eitan-PC-5928-2010][Ql5fyvEGQyO96R9NIeP32g][Eitan-PC][inet[Eitan-PC/10.0.0.5:9301]]{client=true, data=false},}, reason: zen-disco-node_failed([logstash-Eitan-PC-5928-2010][Ql5fyvEGQyO96R9NIeP32g][Eitan-PC][inet[Eitan-PC/10.0.0.5:9301]]{client=true, data=false}), reason transport disconnected (with verified connect)
[2014-06-17 22:43:00,686][INFO ][cluster.service ] [Masked Marvel] added {[logstash-Eitan-PC-5292-4014][m0Tg-fcmTHW9aP6zHeUqTA][Eitan-PC][inet[/10.0.0.5:9301]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-Eitan-PC-5292-4014][m0Tg-fcmTHW9aP6zHeUqTA][Eitan-PC][inet[/10.0.0.5:9301]]{client=true, data=false}])
"
config file:
input {
  file {
    path => "c:\testLog.txt"
  }
}

output {
  elasticsearch {
    host => localhost
    index => amat1
  }
}
When you use "elasticsearch" as your output http://logstash.net/docs/1.4.1/outputs/elasticsearch as opposed to "elasticsearch_http" http://logstash.net/docs/1.4.1/outputs/elasticsearch_http you are going to want to set "protocol".
The reason is that it can have 3 different values, "node", "http" or "transport" with different behavior for each and the default selection is not well documented.
From the look of your log files it appears it's trying to use "node" protocol as I see connection attempts on port 9301 which indicates (along with other log entries) that logstash is trying to join the cluster as a node. This can fail for any number of reasons including mismatch on the cluster name.
I'd suggest setting protocol to "http" - that change has fixed similar issues before.
See also:
http://logstash.net/docs/1.4.1/outputs/elasticsearch#cluster
http://logstash.net/docs/1.4.1/outputs/elasticsearch#protocol
EDIT:
A few other issues I see in your config (a consolidated config sketch follows this list):
Your host and index should be strings, which in a Logstash config file should be wrapped with double quotes: "localhost" and "amat1". No quotes may work, but they recommend you use quotes.
http://logstash.net/docs/1.4.1/configuration#string
If you don't use "http" as the protocol or don't use "elasticsearch_http" as the output, you should set cluster equal to your ES cluster name (as it will be trying to become a node of the cluster).
You should set start_position under file in input to "beginning". Otherwise it will default to reading from the end of the file and you won't see any data. This is a particular problem with Windows right now, as the other way of tracking position within a file, sincedb, is broken on Windows:
https://logstash.jira.com/browse/LOGSTASH-1587
http://logstash.net/docs/1.4.1/inputs/file#start_position
You should change the path to your log file to this: "C:/testLog.txt". Logstash prefers forward slashes and uppercase drive letters under Windows.
https://logstash.jira.com/browse/LOGSTASH-430
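Putting those points together, a revised config might look roughly like this (a sketch based on the suggestions above; adjust the index name and path to your environment):

input {
  file {
    path => "C:/testLog.txt"
    start_position => "beginning"
  }
}

output {
  elasticsearch {
    protocol => "http"
    host => "localhost"
    index => "amat1"
  }
}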
I am struggling to configure and use Logstash with Elasticsearch. I downloaded logstash-1.2.0-flatjar.jar and created sample.conf with the content:
input { stdin { type => "stdin-type" } }

output {
  stdout { }
  elasticsearch { embedded => true }
}
and tried to run java -jar logstash-1.2.0-flatjar.jar agent -f sample.conf, which produces:
{:fix_jar_path=>["jar:file:/C:/Users/Rajesh/Desktop/Toshiba/logstach-jar/logstash-1.2.0-flatjar.jar!/locales/en.yml"]}
log4j, [2014-04-02T22:39:28.121] WARN: org.elasticsearch.discovery.zen.ping.unicast: [Chimera] failed to send ping to [[#zen_unicast_1#][inet[localhost/127.0.0.1:9300]]]
Could anyone please help? Do I need to install plugins? Please provide the link.
Thanks in advance.
Instead of using the embedded Elasticsearch in Logstash, you can try to download Elasticsearch and start it as a separate instance. Please refer to this page on how to set up Elasticsearch.
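For example, with a standalone Elasticsearch running locally on its default ports, the sample.conf could be pointed at it instead of the embedded instance. A minimal sketch (option names as I recall them for the Logstash 1.2.x elasticsearch output; the values are assumptions, so double-check against the docs for your version):

input { stdin { type => "stdin-type" } }

output {
  stdout { }
  # point at the separately started Elasticsearch instead of the embedded one
  elasticsearch {
    embedded => false
    host => "localhost"
    cluster => "elasticsearch"   # must match the cluster.name of your ES instance
  }
}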