csv add field date parse error in logstash - logstash-configuration

I'm attempting to take three columns and combine them into two new fields
Example:
Job_Date 6/5/2019
Job_Start_Time 0:00
Job_End_Time 0:00
Into New Fields:
timestamp_start 6/5/2019, 0:00
timestamp_end 6/5/2019, 0:00
The new fields are getting created, but I'm getting the parse error below.
{
"#timestamp" => 2019-06-22T21:08:20.370Z,
"Warning" => 60,
"path" => "/Users/*******/Desktop/Logstash-Files/ax_batch_performance_test_new.csv",
"message" => "job",6/4/2019,13:45,13:45,6,120,60,15\r",
"tags" => [
[0] "_dateparsefailure"
],
"host" => "host",
"Job_Duration" => 6,
"timestamp_end" => "6/4/2019 13:45",
"Job_Start_Time" => "13:45",
"Critical" => 120,
"#version" => "1",
"Job_End_Time" => "13:45",
"Job_Date" => "6/4/2019",
"timestamp_start" => "6/4/2019 13:45",
"Target" => 15,
"Job_Name" => "job name"
I'm running Logstash version 7.1.1. I have tried running the mutate command both inside and outside of the date plugin. If it matters, I'm still learning.
I have successfully parsed a date format exactly like this before, but not by creating a new field and combining the date and time.
filter{
csv {
separator => ","
columns => ["Job_Name", "Job_Date", "Job_Start_Time", "Job_End_Time", "Job_Duration", "Critical", "Warning", "Target"]
}
mutate {convert => ["Job_Duration", "integer"]}
mutate {convert => ["Critical", "integer"]}
mutate {convert => ["Warning", "integer"]}
mutate {convert => ["Target", "integer"]}
mutate { add_field => {"timestamp_start" => "%{Job_Date} %{Job_Start_Time}"}}
mutate { add_field => {"timestamp_end" => "%{Job_Date} %{Job_End_Time}"}}
date {
match => ["timestamp_start", "M/d/yyyy, HH:MM"]
timezone => "UTC"
}
date {
match => ["timestamp_end", "M/d/yyyy, HH:MM"]
timezone => "UTC"
}
}
I'm expecting the date and time to be parsed and placed into @timestamp as a date.
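A minimal sketch of what a working date block could look like, assuming the combined fields keep the "6/4/2019 13:45" layout shown in the output above: the match pattern must not contain a literal comma (the field has none), and minutes are lowercase mm in Joda-style patterns (uppercase MM means month).
date {
  # no comma in the pattern; mm = minutes, MM would mean month
  match => ["timestamp_start", "M/d/yyyy HH:mm"]
  timezone => "UTC"
}
date {
  # without a target this would also write to @timestamp and overwrite
  # the value set by the first filter, so store it in its own field
  match => ["timestamp_end", "M/d/yyyy HH:mm"]
  target => "timestamp_end"
  timezone => "UTC"
}
If the hour can be a single digit (e.g. "0:00"), "M/d/yyyy H:mm" may be the safer pattern.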

Related

How can I fully parse json into ElasticSearch?

I'm parsing a MongoDB input with Logstash; the config file is as follows:
input {
mongodb {
uri => "<mongouri>"
placeholder_db_dir => "<path>"
collection => "modules"
batch_size => 5000
}
}
filter {
mutate {
rename => { "_id" => "mongo_id" }
remove_field => ["host", "@version"]
}
json {
source => "message"
target => "log"
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
hosts => ["localhost:9200"]
action => "index"
index => "mongo_log_modules"
}
}
This outputs 2 of the 3 documents from the collection into Elasticsearch:
{
"mongo_title" => "user",
"log_entry" => "{\"_id\"=>BSON::ObjectId('60db49309fbbf53f5dd96619'), \"title\"=>\"user\", \"modules\"=>[{\"module\"=>\"user-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"user-assessment\", \"description\"=>\"User assessment\"}, {\"module\"=>\"user-projects\", \"description\"=>\"User projects\"}]}",
"mongo_id" => "60db49309fbbf53f5dd96619",
"logdate" => "2021-06-29T16:24:16+00:00",
"application" => "mongo-modules",
"#timestamp" => 2021-10-02T05:08:38.091Z
}
{
"mongo_title" => "candidate",
"log_entry" => "{\"_id\"=>BSON::ObjectId('60db49519fbbf53f5dd96644'), \"title\"=>\"candidate\", \"modules\"=>[{\"module\"=>\"candidate-dashboard\", \"description\"=>\"User Dashborad\"}, {\"module\"=>\"candidate-assessment\", \"description\"=>\"User assessment\"}]}",
"mongo_id" => "60db49519fbbf53f5dd96644",
"logdate" => "2021-06-29T16:24:49+00:00",
"application" => "mongo-modules",
"#timestamp" => 2021-10-02T05:08:38.155Z
}
It seems like the stdout output ends up with un-parsable code in "log_entry".
After adding the "rename" fields, "modules" won't be added as a field.
I've tried the grok and mutate filters, but after the _id, %{DATA}, %{QUOTEDSTRING} and %{WORD} aren't working for me.
I've also tried updating a nested mapping on the index, which didn't seem to work either.
Is there anything else I can try to get the FULLY nested data into Elasticsearch?
The solution is to filter with mutate:
mutate { gsub => [ "log_entry", "=>", ": " ] }
mutate { gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]}
json { source => "log_entry" remove_field => [ "log_entry" ] }
Outputs to stdout
"_id" => "60db49309fbbf53f5dd96619",
"title" => "user",
"modules" => [
[0] {
"module" => "user-dashboard",
"description" => "User Dashborad"
},
[1] {
"module" => "user-assessment",
"description" => "User assessment"
},
[2] {
"module" => "user-projects",
"description" => "User projects"
}
],
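For reference, a sketch of those three filters inside the filter block, with comments on what each step does (same field names as above):
filter {
  # turn the Ruby hash-rocket syntax into JSON's "key": value syntax
  mutate { gsub => [ "log_entry", "=>", ": " ] }
  # replace BSON::ObjectId('...') with a plain quoted id string
  mutate { gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ] }
  # parse the cleaned-up string into real nested fields and drop the original
  json {
    source => "log_entry"
    remove_field => [ "log_entry" ]
  }
}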

multiline field in csv (logstash)

I am trying to make a multiline field in a CSV file work in Logstash, but the multiline handling for the field is not working.
My logstash.conf content is:
input {
file {
type => "normal"
path => "/etc/logstash/*.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
codec => multiline {
pattern => "."
negate => true
what => "previous"
}
}
}
filter {
if [type] == "normal" {
csv {
separator => ","
columns => ["make", "model", "doors"]
}
mutate {convert => ["doors","integer"] }
}
}
output {
if [type] == "normal" {
elasticsearch {
hosts => "<put_local_ip>"
user => "<put_user>"
password => "<put_password>"
index => "cars"
document_type => "sold_cars"
}
stdout {}
}
}
The .csv with multiple lines (in quotes) for the field make is:
make,model,doors
mazda,mazda6,4
"mitsubishi
4000k", galant,2
honda,civic,4
After I run "logstash -f /etc/logstash/logstash.conf",
I get a parse failure in the logs:
{
"tags" => [
[0] "_csvparsefailure"
],
"#timestamp" => 2020-07-13T19:13:11.339Z,
"type" => "normal",
"host" => "<host_ip_greyedout>",
"message" => "\"mitsubishi",
"#version" => "1",
"path" => "/etc/logstash/cars4.csv"
}
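One possible fix (a sketch, not tested against this exact file): make the multiline codec join a line that opens a quote but never closes it with the line that follows, instead of the pattern/negate/previous combination above. The pattern below is an assumption based on the sample data, where the quoted value starts at the beginning of the line:
codec => multiline {
  # a line that starts a quoted field and has no closing quote
  # belongs with the next line
  pattern => '^"[^"]*$'
  what => "next"
  # flush the last buffered record instead of holding it indefinitely
  auto_flush_interval => 1
}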

Not able to use geo_ip in logstash 2.4

I'm trying to use geoip from an apache access log with logstash 2.4, elasticsearch 2.4, kibana 4.6.
My logstash config is...
input {
file {
path => "/var/log/httpd/access_log"
type => "apache"
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
geoip {
source => "clientip"
target => "geoip"
database =>"/home/elk/logstash-2.4.0/GeoLiteCity.dat"
#add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
mutate {
convert => [ "[geoip][coordinates]", "float" ]
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
hosts => ["192.168.56.200:9200"]
sniffing => true
manage_template => false
index => "apache-geoip-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
And when some apache access log lines are parsed, the output is...
{
"message" => "xxx.xxx.xxx.xxx [24/Oct/2016:14:46:30 +0900] HTTP/1.1 8197 /images/egovframework/com/cmm/er_logo.jpg 200",
"#version" => "1",
"#timestamp" => "2016-10-24T05:46:34.505Z",
"path" => "/NCIALOG/JBOSS/SMBA/default-host/access_log.2016-10-24",
"host" => "smba",
"type" => "jboss_access_log",
"clientip" => "xxx.xxxx.xxx.xxx",
"geoip" => {
"ip" => "xxx.xxx.xxx.xxx",
"country_code2" => "KR",
"country_code3" => "KOR",
"country_name" => "Korea, Republic of",
"continent_code" => "AS",
"region_name" => "11",
"city_name" => "Seoul",
"latitude" => xx.5985,
"longitude" => xxx.97829999999999,
"timezone" => "Asia/Seoul",
"real_region_name" => "Seoul-t'ukpyolsi",
"location" => [
[0] xxx.97829999999999,
[1] xx.5985
],
"coordinates" => [
[0] xxx.97829999999999,
[1] xx.5985
]
}
}
I am not able to see a geo_point field.
Please help me.
Thanks.
I added the error I get from the tile map.
It says "logstash-* index pattern does not contain any of the following field types: geo_point".
Hmm... the geoip fields are already in your response!
In the "geoip" field you can find all the needed information (ip, continent, country name, ...). The added coordinates field is present too.
So, what's the problem?
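The missing piece is most likely the mapping rather than the data: with manage_template => false and a custom apache-geoip-* index name, nothing ever maps geoip.location as geo_point, which is exactly what the tile map error is about. A possible fix (a sketch for Elasticsearch 2.4; the template name is an assumption based on the index name in the config) is to install an index template before indexing, and to point the Kibana index pattern at apache-geoip-* rather than logstash-*:
curl -XPUT 'http://192.168.56.200:9200/_template/apache-geoip' -d '
{
  "template": "apache-geoip-*",
  "mappings": {
    "_default_": {
      "properties": {
        "geoip": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  }
}'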

Can't parse date in logstash

I need to parse my Date field and it gives me an error.
input {
file {
path => "/home/osboxes/ELK/logstash/data/data.csv"
start_position => "beginning"
}
}
filter {
csv {
separator => ","
columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
}
mutate {convert => ["High", "float"]}
mutate {convert => ["Open", "float"]}
mutate {convert => ["Low", "float"]}
mutate {convert => ["Close", "float"]}
mutate {convert => ["Volume", "float"]}
}
output {
elasticsearch {
action => "index"
hosts => "localhost:9200"
index => "stock"
workers => 1
}
stdout {}
}
The data.csv file I'm reading looks like this:
Date,Open,High,Low,Close,Volume,Adj Close
2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32
2015-04-01,124.82,125.12,123.10,124.25,40359200,124.25
What am I missing? Thanks in advance.
My logstash terminal only says this:
$ bin/logstash -f /home/osboxes/ELK/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
Add a date statement to the filter:
date {
match => [ "Date", "YYYY-MM-dd" ]
}
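In context, the filter section would become the following (the date filter writes to @timestamp by default since no target is set; yyyy-MM-dd and YYYY-MM-dd behave the same for these dates):
filter {
  csv {
    separator => ","
    columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
  }
  mutate {convert => ["High", "float"]}
  mutate {convert => ["Open", "float"]}
  mutate {convert => ["Low", "float"]}
  mutate {convert => ["Close", "float"]}
  mutate {convert => ["Volume", "float"]}
  date {
    # "2015-04-02" -> parsed and stored in @timestamp
    match => [ "Date", "yyyy-MM-dd" ]
  }
}
Note that the header row itself will still fail the date match and get tagged _dateparsefailure; that single event can be dropped or ignored.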

Reading positional file with logstash, converting two string fields to date, applying math operation to another

So, I have a positional file that looks like this
0100003074400003074400000000103000000000066167424000000000131527492000000000131527463C19860000000000000320160302201603300010019700XXXXXXXX XX XXXXXX 000000000133719971
02000008013000008013000000001010000000001327506142016033000000000000046053100000000013268252820160516000000000020091000000000066558874002002
And I want logstash to ship only the lines starting with '01' to elasticsearch. I've managed to do this by doing the following
filter {
# get only lines that start with 01
if ([message] !~ "^01") {
drop{}
}
grok {
match => { "message" => "^01(?<n_estab>.{9})(?<n_filial>.{9})(?<depart>.{9})(?<prod>.{2})(?<id_apres>.{18})(?<id_mov>.{18})(?<id_mov_orig>.{18})(?<orig_int>.{1})(?<cod_oper>.{1})(?<cod_moeda>.{3})(?<valor>.{14})(?<date_trans>.{8})(?<date_agenda>.{8})(?<num_parcela>.{3})(?<qtd_parcelas>.{3})(?<cod_rub>.{4})(?<desc_rub>.{30})(?<id_pgto>.{18})" }
}
mutate {
strip => [
"n_estab",
"n_filial",
"depart",
"prod",
"id_apres",
"id_mov",
"id_mov_orig",
"orig_int",
"cod_oper",
"cod_moeda",
"valor",
"date_trans",
"date_agenda",
"num_parcela",
"qtd_parcelas",
"cod_rub",
"desc_rub",
"id_pgto"
]
convert => {
"n_estab" => "integer"
"n_filial" => "integer"
"depart" => "integer"
"prod" => "integer"
"id_apres" => "integer"
"id_mov" => "integer"
"id_mov_orig" => "integer"
"orig_int" => "string"
"cod_oper" => "integer"
"cod_moeda" => "integer"
"valor" => "float"
"date_trans" => "string"
"date_agenda" => "string"
"num_parcela" => "integer"
"qtd_parcelas" => "integer"
"cod_rub" => "integer"
"desc_rub" => "string"
"id_pgto" => "integer"
}
}
}
Now, I want to divide valor by 100 and convert fields date_trans and date_agenda from string to date format, so I can index by any of those fields on elasticsearch and kibana.
I've tried adding the following lines to filter
ruby {
code => "event['valor'] = event['valor'] / 100
event['date_trans'] = Date.strptime(event['date_trans'], '%Y%m%d')
event['date_agenda'] = Date.strptime(event['date_agenda'], '%Y%m%d')"
}
After I've added those lines to my conf file, logstash starts, but doesn't parse any of my files... It simply hangs! Since I can add gibberish to the ruby code block and it won't alert me of anything, I figure it must be something with the ruby code, right...?
UPDATE
After executing
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/subq_detliq.conf -v --debug --verbose
It started inserting into elasticsearch... Does Logstash keep track somewhere of which files it has read and never come back to them?
Also, this is what it's inserting into ES...
{
"message" => "0100001504000001504000000000101000000000063916400000000000124569419000000000124569414C09860000000000011620151127201601260020029700XXXXXXXX XX XXXXXX 000000000128479123 ",
"#version" => "1",
"#timestamp" => "2016-04-28T18:11:58.681Z",
"host" => "cherno-alpha",
"path" => "/tmp/HSTRD0003/SUBQ_DETLIQ_HSTR_20160124_000100.REM",
"n_estab" => 15040,
"n_filial" => 15040,
"depart" => 1,
"prod" => 1,
"id_apres" => 63916400,
"id_mov" => 124569419,
"id_mov_orig" => 124569414,
"orig_int" => "C",
"cod_oper" => 0,
"cod_moeda" => 986,
"valor" => 1.16,
"date_trans" => #<Date: 2015-11-27 ((2457354j,0s,0n),+0s,2299161j)>,
"date_agenda" => #<Date: 2016-01-26 ((2457414j,0s,0n),+0s,2299161j)>,
"num_parcela" => 2,
"qtd_parcelas" => 2,
"cod_rub" => 9700,
"desc_rub" => "XXXXXXXX XX XXXXXX",
"id_pgto" => 128479123
Somehow Ruby converted the date fields, but not really? ES still thinks it's just a regular string and won't let create an index on them.
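One way to end up with real date-typed fields (a sketch, not from the original thread): keep the ruby block only for the division and let the date filter parse the yyyyMMdd strings, using target so each value stays in its own field with a proper timestamp type:
ruby {
  # only the monetary amount needs ruby; divide by 100.0 to keep it a float
  code => "event['valor'] = event['valor'] / 100.0"
}
date {
  match => [ "date_trans", "yyyyMMdd" ]
  target => "date_trans"
}
date {
  match => [ "date_agenda", "yyyyMMdd" ]
  target => "date_agenda"
}
As for the files not being re-read: the file input records how far it has read each file in a sincedb file, so already-consumed files are skipped on restart unless sincedb_path is reset (for example, pointed at /dev/null while testing).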
