Logstash filename as ElasticSearch index - elasticsearch

I am using a folder as input:
input {
  file {
    path => "C:/ProjectName/Uploads/*"
    start_position => "beginning"
    sincedb_path => "dev/null"
  }
}
and as output:
output {
  elasticsearch {
    hosts => "localhost"
    index => "manual_index_name" # want filename here
    document_type => "_doc"
  }
}
I want the index in elasticsearch to be the name of the file being indexed.
I've tried variations of this answer with no success as I am not clear on what it is doing: https://stackoverflow.com/a/40156466/6483906

You'll need a grok filter to extract the last portion of the path into its own field:
filter {
  grok {
    match => ["path", "Uploads/%{GREEDYDATA:index_name}"]
  }
}
and then reference that field in your index name: index => "%{index_name}"
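Putting that together with the original input and output, a minimal sketch of the whole pipeline might look like the following. The sincedb_path value of "NUL" and the lowercase step are assumptions on my part: "NUL" is the usual Windows stand-in for /dev/null, and Elasticsearch index names must be lowercase, which raw file names may not be.
input {
  file {
    path => "C:/ProjectName/Uploads/*"
    start_position => "beginning"
    sincedb_path => "NUL"   # Windows equivalent of /dev/null (assumption)
  }
}
filter {
  # Capture everything after "Uploads/" into index_name
  grok {
    match => ["path", "Uploads/%{GREEDYDATA:index_name}"]
  }
  # Elasticsearch index names must be lowercase
  mutate {
    lowercase => ["index_name"]
  }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "%{index_name}"
    document_type => "_doc"
  }
}
Note that index_name will still contain the file extension (report.csv would become an index named report.csv); a further mutate/gsub step could strip it if that is not what you want.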

Related

logstash cmd not running

Elasticsearch and Kibana are both running, but when I use the following command to ingest a CSV file into Elasticsearch, it stops automatically and takes a while to respond.
bin\logstash -f logstash.config
Here is my logstash.config:
input {
  file {
    path => "C:\Users\Sireesha Chapa\Desktop\logstashData.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["id","group","sex","disease","age"]
  }
  mutate { convert => ["id","integer"] }
  mutate { convert => ["age","integer"] }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "health"
    document_type => "patient_record"
  }
  stdout {}
}
Change the name of your logstash config to logstash.conf.

Logstash Not Reading "File" Input

I am trying to use a file as an input to Logstash. Here is my logstash.conf:
input {
  file {
    path => "/home/dxp/elb.log"
    type => "elb"
    start_position => "beginning"
    sincedb_path => "/home/dxp/log.db"
  }
}
filter {
  if [type] == "elb" {
    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} %{IP:backend_ip}:%{NUMBER:backend_port:int} %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} %{NUMBER:elb_status_code:int} %{NUMBER:backend_status_code:int} %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} %{QS:request}" ]
    }
  }
}
output {
  elasticsearch {
    hosts => "10.99.0.180:9200"
    manage_template => false
    index => "elblog-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
My logs show this:
[2017-10-27T13:11:31,164][DEBUG][logstash.inputs.file ]_globbed_files: /home/dxp/elb.log: glob is []
I guess my file has not been read by Logstash, so a new index is not formed in Elasticsearch.
Please help me with what I am missing here.

elasticsearch - import csv using logstash, date is not parsed as datetime type

I am trying to import a CSV into Elasticsearch using Logstash.
I have tried two ways:
Using the csv filter
Using a grok filter
1) For the csv filter, below is my Logstash config:
input {
  file {
    path => "path_to_my_csv.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["col1","col2_datetime"]
  }
  mutate { convert => [ "col1", "float" ] }
  date {
    locale => "en"
    match => ["col2_datetime", "ISO8601"] # tried this one also - match => ["col2_datetime", "yyyy-MM-dd HH:mm:ss"]
    timezone => "Asia/Kolkata"
    target => "@timestamp" # tried this one also - target => "col2_datetime"
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_collection"
  }
  stdout {}
}
2) Using a grok filter:
For the grok filter, below is my Logstash config:
input {
  file {
    path => "path_to_my_csv.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => { "message" => "(?<col1>(?:%{BASE10NUM})),(%{TIMESTAMP_ISO8601:col2_datetime})" }
    remove_field => [ "message" ]
  }
  date {
    match => ["col2_datetime", "yyyy-MM-dd HH:mm:ss"]
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_collection_grok"
  }
  stdout {}
}
PROBLEM:
When I run either config, I am able to import the data into Elasticsearch, but my date field is not parsed as a datetime type; it is saved as a string instead, and because of that I am not able to run date filters on it.
Can someone help me figure out why this is happening?
My Elasticsearch version is 5.4.1.
Thanks in advance.
There are 2 changes I made to your config file:
1) remove the underscore from the column name col2_datetime
2) add a target to the date filter
Here is what my config file looks like...
vi logstash.conf
input {
  file {
    path => "/config-dir/path_to_my_csv.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["col1","col2"]
  }
  mutate { convert => [ "col1", "float" ] }
  date {
    locale => "en"
    match => ["col2", "yyyy-MM-dd HH:mm:ss"]
    target => "col2"
  }
}
output {
  elasticsearch {
    hosts => "http://172.17.0.1:9200"
    index => "my_collection"
  }
  stdout {}
}
Here is the data file:
vi path_to_my_csv.csv
1234365,2016-12-02 19:00:52
1234368,2016-12-02 15:02:02
1234369,2016-12-02 15:02:07
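The same fix carries over to the grok-based pipeline from the question: give the date filter an explicit target so the parsed value is written back into the field rather than into @timestamp. Here is a sketch of just that filter block, reusing the shortened field name from the answer above (that rename is my assumption, carried over from the answer, not something the original poster used):
filter {
  grok {
    match => { "message" => "(?<col1>(?:%{BASE10NUM})),(%{TIMESTAMP_ISO8601:col2})" }
    remove_field => [ "message" ]
  }
  date {
    locale => "en"
    match => ["col2", "yyyy-MM-dd HH:mm:ss"]
    # Without an explicit target, the parsed date goes to @timestamp and col2 stays a string
    target => "col2"
  }
}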

How to add numeric IDs to elasticsearch documents when reading from CSV file using Logstash?

After importing my elasticsearch documents from a CSV file using Logstash, my documents have their ID value set to long alphanumeric strings. How can I have each document ID set to a numeric value instead?
Here is basically what my logstash config looks like:
input {
  file {
    path => "/path/to/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["title","director","year","country"]
    separator => ","
  }
  mutate {
    convert => {
      "year" => "integer"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "movie"
    document_type => "movie"
  }
  stdout {}
}
The first and easiest option is to add a new column ID in your CSV and use that field as the document id.
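A minimal sketch of that first option, assuming the CSV gains a leading numeric column (the column name id below is illustrative, not something from the original question):
filter {
  csv {
    # The CSV now starts with the numeric ID column
    columns => ["id","title","director","year","country"]
    separator => ","
  }
  mutate {
    convert => {
      "id"   => "integer"
      "year" => "integer"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "movie"
    document_type => "movie"
    # Use the value from the CSV as the Elasticsearch document ID
    document_id => "%{id}"
  }
}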
Another option is to use a ruby filter that adds a sequential ID to your events. The downside of this solution is that if your CSV changes and you re-run your pipeline, each document might not get the same ID. Another downside is that you need to run your pipeline with only one worker (i.e. with -w 1), because the sequence counter cannot be shared between pipeline workers.
filter {
  csv {
    columns => ["title","director","year","country"]
    separator => ","
  }
  mutate {
    convert => {
      "year" => "integer"
    }
  }
  # create ID
  ruby {
    init => "@id_seq = 0"
    code => "
      event.set('id', @id_seq)
      @id_seq += 1
    "
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "movie"
    document_type => "movie"
    document_id => "%{id}"
  }
  stdout {}
}

How to type data input in logstash

I'm trying to load a CSV file into Elasticsearch through Logstash.
Here is my configuration file:
input {
  file {
    codec => plain {
      charset => "ISO-8859-1"
    }
    path => ["PATH/*.csv"]
    sincedb_path => "PATH/.sincedb_path"
    start_position => "beginning"
  }
}
filter {
  if [message] =~ /^"ID","DATE"/ {
    drop { }
  }
  date {
    match => [ "DATE", "yyyy-MM-dd HH:mm:ss" ]
    target => "DATE"
  }
  csv {
    columns => ["ID","DATE",...]
    separator => ","
    source => message
    remove_field => ["message","host","path","@version","@timestamp"]
  }
}
output {
  elasticsearch {
    embedded => false
    host => "localhost"
    cluster => "elasticsearch"
    node_name => "localhost"
    index => "index"
    index_type => "type"
  }
}
Now, the mapping produced in Elasticsearch types the DATE field as a string. I would like it to be typed as a date.
In the filter section, I tried to convert the DATE field to a date, but it doesn't work.
How can I fix that?
Regards,
Alexandre
You have your filter chain set up in the wrong order: the date {} block needs to come after the csv {} block, because the DATE field only exists once the csv filter has parsed it out of the message.
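A sketch of the reordered filter block, leaving the rest of the original config untouched (the column list is abbreviated with ... exactly as in the question):
filter {
  # Drop the CSV header row
  if [message] =~ /^"ID","DATE"/ {
    drop { }
  }
  # Parse the row first so the DATE field exists...
  csv {
    columns => ["ID","DATE",...]
    separator => ","
    source => message
    remove_field => ["message","host","path","@version","@timestamp"]
  }
  # ...then convert it to a real date so Elasticsearch maps it as one
  date {
    match => [ "DATE", "yyyy-MM-dd HH:mm:ss" ]
    target => "DATE"
  }
}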
