Logstash multiple logs - elasticsearch

I am following an online tutorial and have been provided with a cars.csv file and the following Logstash config file. My logstash is running perfectly well and is indexing the CSV as we speak.
The question is, I have another log file (entirely different data) which I need to parse and index into a different index.
1. How do I add this configuration without restarting Logstash?
2. If the above isn't possible and I edit the config file and then restart Logstash, it won't reindex the entire cars file, will it?
3. If I do 2, how do I format the config for multiple styles of log file?
E.g. my new log file looks like this:
01-01-2017 ORDER FAILED: £12.11 Somewhere : Fraud
Existing Config File:
input {
file {
path => "/opt/cars.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
csv {
separator => ","
columns =>
[
"maker",
"model",
"mileage",
"manufacture_year",
"engine_displacement",
"engine_power",
"body_type",
"color_slug",
"stk_year",
"transmission",
"door_count",
"seat_count",
"fuel_type",
"date_last_seen",
"date_created",
"price_eur"
]
}
mutate {
convert => ["mileage", "integer"]
}
mutate {
convert => ["price_eur", "float"]
}
mutate {
convert => ["engine_power", "integer"]
}
mutate {
convert => ["door_count", "integer"]
}
mutate {
convert => ["seat_count", "integer"]
}
}
output {
elasticsearch {
hosts => "localhost"
index => "cars"
document_type => "sold_cars"
}
stdout {}
}
Config file for orders.log
input {
file {
path => "/opt/logs/orders.log"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
grok {
match => { "message" => "(?<date>[0-9-]+) (?<order_status>ORDER [a-zA-Z]+): (?<order_amount>£[0-9.]+) (?<order_location>[a-zA-Z]+)( : (?<order_failure_reason>[A-Za-z ]+))?"}
}
mutate {
convert => ["order_amount", "float"]
}
}
output {
elasticsearch {
hosts => "localhost"
index => "sales"
document_type => "order"
}
stdout {}
}
Disclaimer: I'm a complete newbie. Second day using ELK.

For point 1, you can either set the following in your logstash.yml file:
config.reload.automatic: true
Or pass the flag on the command line when starting Logstash with your conf file:
bin/logstash -f conf-file-name.conf --config.reload.automatic
With either setting in place, start Logstash and any later change you make to the conf file will be picked up without a restart.

2. If above isn't possible and I edit the config file then restart logstash - it won't reindex the entire cars file will it?
If you use sincedb_path => "/dev/null", Logstash won't remember where it stopped reading a file and will reindex it from the beginning at each restart. You'll have to remove this line if you want Logstash to remember its position.
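If you would rather keep an explicit setting than rely on the default sincedb location, a minimal sketch could look like the one below. The sincedb file path is a hypothetical example, not something from the question; any location writable by the Logstash user should work.

```
input {
  file {
    path => "/opt/cars.csv"
    # Only applies to files Logstash has not seen before.
    start_position => "beginning"
    # Hypothetical location: the read offset is persisted here, so a restart
    # resumes where it left off instead of re-reading the whole file.
    sincedb_path => "/var/lib/logstash/sincedb_cars"
  }
}
```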
3. How do I format the config for multiple styles of log file.
To support multiple styles of log file, you can put tags on the file inputs (see https://www.elastic.co/guide/en/logstash/5.5/plugins-inputs-file.html#plugins-inputs-file-tags) and then use conditionals (see https://www.elastic.co/guide/en/logstash/5.5/event-dependent-configuration.html#conditionals) in your config file.
Like this:
file {
path => "/opt/cars.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
tags => [ "csv" ]
}
file {
path => "/opt/logs/orders.log"
start_position => "beginning"
sincedb_path => "/dev/null"
tags => [ "log" ]
}
if "csv" in [tags] {
...
} else if "log" in [tags] {
...
}
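Putting the pieces together, a combined config might look roughly like the sketch below. It reuses the paths, columns, index names, and grok pattern from the question. Two things are my own adjustments and should be treated as assumptions: the £ sign is moved outside the order_amount capture so that the float conversion has a plain number to work with, and document_type is left out because newer Elasticsearch versions deprecate it (add it back if your version needs it).

```
input {
  file {
    path => "/opt/cars.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => [ "csv" ]
  }
  file {
    path => "/opt/logs/orders.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => [ "log" ]
  }
}

filter {
  if "csv" in [tags] {
    csv {
      separator => ","
      columns => [
        "maker", "model", "mileage", "manufacture_year",
        "engine_displacement", "engine_power", "body_type", "color_slug",
        "stk_year", "transmission", "door_count", "seat_count",
        "fuel_type", "date_last_seen", "date_created", "price_eur"
      ]
    }
    # One mutate with a hash converts several fields at once.
    mutate {
      convert => {
        "mileage"      => "integer"
        "price_eur"    => "float"
        "engine_power" => "integer"
        "door_count"   => "integer"
        "seat_count"   => "integer"
      }
    }
  } else if "log" in [tags] {
    grok {
      # Assumption: the currency sign sits outside the capture group.
      match => { "message" => "(?<date>[0-9-]+) (?<order_status>ORDER [a-zA-Z]+): £(?<order_amount>[0-9.]+) (?<order_location>[a-zA-Z]+)( : (?<order_failure_reason>[A-Za-z ]+))?" }
    }
    mutate {
      convert => { "order_amount" => "float" }
    }
  }
}

output {
  # Route each kind of event to its own index.
  if "csv" in [tags] {
    elasticsearch {
      hosts => "localhost"
      index => "cars"
    }
  } else if "log" in [tags] {
    elasticsearch {
      hosts => "localhost"
      index => "sales"
    }
  }
  stdout {}
}
```

The same tag drives both the filter and the output conditionals, so each file keeps its own parsing and lands in its own index.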

Related

Logstash not importing data

I am working on an ELK stack setup and want to import data from a CSV file on my PC into Elasticsearch via Logstash. Elasticsearch and Kibana are working properly.
Here is my logstash.conf file:
input {
file {
path => "C:/Users/aron/Desktop/es/archive/weapons.csv"
start_position => "beginning"
sincedb_path => "NUL"
}
}
filter {
csv {
separator => ","
columns => ["name", "type", "country"]
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200/"]
index => "weapons"
document_type => "ww2_weapon"
}
stdout {}
}
And a sample row data from my .csv file looks like this:
Name | Type | Country
10.5 cm Kanone 17 | Field Gun | Germany
German characters are all showing up.
I am running logstash via: logstash.bat -f path/to/logstash.conf
It starts working but then freezes and becomes unresponsive along the way.
In Kibana, it created the index and imported 2 documents, but the data is all messed up. What am I doing wrong?
If your task is only to import that CSV, you are better off using the file upload in Kibana.
Should be available under the following link (for Kibana > v8):
<your Kibana domain>/app/home#/tutorial_directory/fileDataViz
Logstash is used if you want to do this job on a regular basis with new files coming in over time.
You can try the config below. It runs fine on my machine.
input {
file {
path => "path/filename.csv"
start_position => "beginning"
sincedb_path => "NUL" # Windows null device; use "/dev/null" on Linux/macOS
}
}
filter {
csv {
separator => ","
columns => ["field1","field2",...]
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
hosts => "https://localhost:9200"
user => "username" # if any
password => "password" # if any
index => "indexname"
document_type => "doc_type"
}
}

Use Logstash to get nested Airflow Logs, and send to Elasticsearch

I am new to Logstash and ELK as a whole. I am trying to send my Airflow logs to Logstash. I am confused about how to set up my configuration file, especially because I have several (nested) log files.
My airflow is deployed on an AWS EC2 instance and my logs directory is something like this: /home/ubuntu/run/logs/scheduler/
The scheduler directory has a couple of dated folders. Using one of the folders as an example:
/home/ubuntu/run/logs/scheduler/2022-08-31.
The dated folder has files such as
testing.py.log hello_world.py.log dag_file.py.log
Now, while configuring my /etc/logstash/conf.d/ file (based on the log path I shared above), how can I define my path to pick up all the logs?
This is what my /etc/logstash/conf.d/apache-01.conf currently looks like, but I know the path isn't accurate:
input {
file {
path => "~/home/ubuntu/run/log/scheduler/"
start_position => "beginning"
codec -> "line"
}
}
filter {
grok {
match => { "path" => "" }
}
mutate {
add_field => {
"log_id" => "%{[dag_id]}-%{[task_id]}-%{[execution_date]}-%{[try_number]}"
}
}
}
output{
elasticsearch {
hosts => ["localhost:9200"]
}
}
The path parameter needs an absolute path.
To process all py.log files you can use this input
input {
file {
path => "/home/ubuntu/run/logs/scheduler/*/*py.log"
start_position => "beginning"
codec => "line"
}
}
To process only the files hello_world.py.log and dag_file.py.log you can use an array for your path
input {
file {
path => ["/home/ubuntu/run/logs/scheduler/*/hello_world.py.log", "/home/ubuntu/run/logs/scheduler/*/dag_file.py.log"]
start_position => "beginning"
codec => "line"
}
}
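The question's grok filter matches an empty pattern against "path", so it extracts nothing. As a rough sketch, the dated folder and the file name could be pulled out of the path as shown below. The field names log_date and dag_file are hypothetical, and the source field is an assumption as well: older Logstash versions put the file path in "path", while versions with ECS compatibility enabled use "[log][file][path]" instead.

```
filter {
  grok {
    # Hypothetical field names: log_date is the dated folder,
    # dag_file is the file name without the trailing ".log".
    match => { "path" => "/scheduler/(?<log_date>\d{4}-\d{2}-\d{2})/(?<dag_file>[^/]+)\.log$" }
  }
}
```

For /home/ubuntu/run/logs/scheduler/2022-08-31/hello_world.py.log this should yield log_date = 2022-08-31 and dag_file = hello_world.py.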

Generating Logs from multiple directories in Logstash. Logs not appearing in ElasticSearch

I am trying to collect logs from multiple directories through Logstash and send them to Elasticsearch.
This is my configuration:
input{
file {
path => ["/XXX/XXX/results/**/log_file.txt"]
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
grok {
pattern_definitions => { "THREAD_NAME" => "%{WORD}?%{NUMBER}?" }
match => { "message" => "%{SPACE}?%{TIMESTAMP_ISO8601:asctime}?%{SPACE}?\|%{SPACE}?%{THREAD_NAME:thread_name}"}
}
}
output{
elasticsearch {
hosts => ["x.x.x.x:9200"]
}
stdout { codec => rubydebug }
}
The file path is a relative path.
The logs are placed inside different directories placed inside the results directory:
results/dir/log_file.txt.
I have tried this configuration with stdin and logs appeared inside Kibana, but Logstash doesn't pick up the logs in the directories. Please advise.

Elasticsearch not receiving input from logstash

I'm running logstash where the output is set to elasticsearch on my localhost. However, when I open up elasticsearch, it appears that it did not receive any data from logstash. Logstash parses the csv file correctly, as I can see by the output in the terminal.
I've tried modifying the conf file, but the problem remains. The conf file is below
input {
file {
path => "/Users/kevinliu/Desktop/logstash_tutorial/gg.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
csv {
separator => ","
columns => ["name","price","unit","url"]
}
}
output {
elasticsearch {
hosts => "localhost"
index => "gg-prices"
}
stdout {}
}
When I access localhost:9200/ I just see the default "You Know, for Search" message from Elasticsearch.

Data type conversion using logstash grok

Basic is a float field. The mentioned index is not present in elasticsearch. When running the config file with logstash -f, I am getting no exception. Yet, the data reflected and entered in elasticsearch shows the mapping of Basic as string. How do I rectify this? And how do I do this for multiple fields?
input {
file {
path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
type => "promosms_dec15"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
grok{
match => [
"Basic", " %{NUMBER:Basic:float}"
]
}
csv {
columns => ["Generation_Date","Basic"]
separator => ","
}
ruby {
code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
}
}
output {
elasticsearch {
action => "index"
host => "localhost"
index => "promosms-%{+dd.MM.YYYY}"
workers => 1
}
}
You have two problems. First, your grok filter is listed prior to the csv filter and because filters are applied in order there won't be a "Basic" field to convert when the grok filter is applied.
Secondly, unless you explicitly allow it, grok won't overwrite existing fields. In other words,
grok{
match => [
"Basic", " %{NUMBER:Basic:float}"
]
}
will always be a no-op. Either specify overwrite => ["Basic"] or, preferably, use mutate's type conversion feature:
mutate {
convert => ["Basic", "float"]
}
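For the "multiple fields" part of the question, a minimal sketch could look like this: the csv filter runs first so the fields exist, then a single mutate converts several of them through a hash. "Other_Field" is a hypothetical column added only to show the shape; it is not in the original CSV.

```
filter {
  # Parse the CSV first so that "Basic" exists as a field.
  csv {
    columns => ["Generation_Date", "Basic"]
    separator => ","
  }
  # Then convert as many fields as needed in one mutate block.
  mutate {
    convert => {
      "Basic"       => "float"
      "Other_Field" => "integer" # hypothetical extra column
    }
  }
}
```

If an index with "Basic" mapped as a string already exists, the conversion only affects the mapping of indices created afterwards, so you may need to reindex or start with a fresh index.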

Resources