I'm having a problem with my data when pushing it to ELK using Logstash.
Here is my config file:
input {
  file {
    path => ["C:/Users/HoangHiep/Desktop/test17.txt"]
    type => "_doc"
    start_position => "beginning"
  }
}
filter {
  dissect {
    mapping => {
      "message" => "%{word}"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test01"
  }
  stdout { codec => rubydebug }
}
My data is:
"day la text"
This is the output:
{
"host" => "DESKTOP-T41GENH",
"path" => "C:/Users/HoangHiep/Desktop/test17.txt",
"#timestamp" => 2020-01-15T10:04:52.746Z,
"#version" => "1",
"type" => "_doc",
"message" => "\"day la text\"\r",
"word" => "\"day la text\"\r"
}
Is there any way to handle the quote character (")? I want "word" to be just "day la text\r", without the surrounding \" characters.
Thanks all.
I can explain more about this if this change works for you. The reason I say that is that I have a newer Mac, so I don't see the trailing \r in my message.
The input is just like you have it: "day la text"
filter {
  mutate {
    gsub => [
      "message", "(\")", ""
    ]
  }
}
The response is:
{
"#timestamp" => 2020-01-15T15:01:58.828Z,
"#version" => "1",
"headers" => {
"http_version" => "HTTP/1.1",
"request_method" => "POST",
"http_accept" => "*/*",
"accept_encoding" => "gzip, deflate",
"postman_token" => "5ae8b2a0-2e94-433c-9ecc-e415731365b6",
"cache_control" => "no-cache",
"content_type" => "text/plain",
"connection" => "keep-alive",
"http_user_agent" => "PostmanRuntime/7.21.0",
"http_host" => "localhost:8080",
"content_length" => "13",
"request_path" => "/"
},
"host" => "0:0:0:0:0:0:0:1",
"message" => "day la text" <===== see the extra inbuilt `\"` gone.
}
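For the original file + dissect pipeline from the question, the same gsub can be wired in before the dissect, so that the cleaned message is what gets copied into word (with the mutate after dissect, word would keep the quotes). This is a sketch assembled from the config in the question, with an optional extra rule for the trailing \r seen on Windows; I have not run this exact file:
filter {
  mutate {
    # strip the literal double quotes (and, optionally, the trailing carriage return)
    # before dissect copies message into word
    gsub => [
      "message", "(\")", "",
      "message", "\r$", ""
    ]
  }
  dissect {
    mapping => {
      "message" => "%{word}"
    }
  }
}
With this ordering, word comes out as day la text without the surrounding quotes.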
I have been migrating some indexes from self-hosted Elasticsearch to Amazon Elasticsearch Service using Logstash. While migrating the documents, we need to change the field names in the index based on some logic.
Our Logstash config file:
input {
  elasticsearch {
    hosts => ["https://staing-example.com:443"]
    user => "userName"
    password => "password"
    index => "testingindex"
    size => 100
    scroll => "1m"
  }
}
filter {
}
output {
  amazon_es {
    hosts => ["https://example.us-east-1.es.amazonaws.com:443"]
    region => "us-east-1"
    aws_access_key_id => "access_key_id"
    aws_secret_access_key => "access_key_id"
    index => "testingindex"
  }
  stdout {
    codec => rubydebug
  }
}
Here is one of the documents from the testingindex in our self-hosted Elasticsearch:
{
  "uniqueIdentifier" => "e32d331b-ce5f-45c8-beca-b729707fca48",
  "createdDate" => 1527592562743,
  "interactionInfo" => [
    {
      "value" => "Hello this is testing",
      "title" => "msg",
      "interactionInfoId" => "8c091cb9-e51b-42f2-acad-79ad1fe685d8"
    },
    {
      "value" => """"{"edited":false,"imgSrc":"asdfadf/soruce","cont":"Collaborated in <b class=\"mention\" gid=\"4UIZjuFzMXiu2Ege6cF3R4q8dwaKb9pE\">#2222222</b> ","chatMessageObjStr":"Btester has quoted your feed","userLogin":"test.comal#google.co","userId":"tester123"}"""",
      "title" => "msgMeta",
      "interactionInfoId" => "f6c7203b-2bde-4cc9-a85e-08567f082af3"
    }
  ],
  "componentId" => "compId",
  "status" => [
    "delivered"
  ],
  "accountId" => "test123",
  "applicationId" => "appId"
}
This is what we are expecting when the documents get migrated to Amazon Elasticsearch Service:
{
  "uniqueIdentifier" => "e32d331b-ce5f-45c8-beca-b729707fca48",
  "createdDate" => 1527592562743,
  "interactionInfo" => [
    {
      "value" => "Hello this is testing",
      "title" => "msg",
      "interactionInfoId" => "8c091cb9-e51b-42f2-acad-79ad1fe685d8"
    },
    {
      "value-keyword" => """"{"edited":false,"imgSrc":"asdfadf/soruce","cont":"Collaborated in <b class=\"mention\" gid=\"4UIZjuFzMXiu2Ege6cF3R4q8dwaKb9pE\">#2222222</b> ","chatMessageObjStr":"Btester has quoted your feed","userLogin":"test.comal#google.co","userId":"tester123"}"""",
      "title" => "msgMeta",
      "interactionInfoId" => "f6c7203b-2bde-4cc9-a85e-08567f082af3"
    }
  ],
  "componentId" => "compId",
  "status" => [
    "delivered"
  ],
  "accountId" => "test123",
  "applicationId" => "appId"
}
What we need is to change the "value" field to "value-keyword" wherever we find JSON-formatted content. Is there any other filter in Logstash to achieve this?
As documented on the Logstash website:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-rename
You can use the mutate filter with its rename option.
For example:
filter {
  mutate {
    rename => { "old-field" => "new-field" }
  }
}
For nested fields, you could just pass the path of the field:
filter {
  mutate {
    rename => { "[interactionInfo][value]" => "[interactionInfo][value-keyword]" }
  }
}
Note that interactionInfo is an array of objects, so the mutate rename above cannot reach the value key inside each element. A ruby filter that walks the array can do it instead; try adding this to your filter:
filter {
  ruby {
    code => "event.get('interactionInfo').each { |item| if item['value'].to_s.match(/{.+}/) then item['value-keyword'] = item.delete('value') end }"
  }
}
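If some documents do not contain interactionInfo at all, the ruby filter above would raise a NoMethodError and the event would be tagged _rubyexception. Wrapping it in a conditional avoids that; this guard is my own sketch, not part of the original answer:
filter {
  # only run the ruby code when the interactionInfo array is present
  if [interactionInfo] {
    ruby {
      code => "event.get('interactionInfo').each { |item| if item['value'].to_s.match(/{.+}/) then item['value-keyword'] = item.delete('value') end }"
    }
  }
}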
I want to know about the use of the recursive option in the kv filter. I am using a CSV file, which I uploaded to ES using Logstash. After reading the guide from this link https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html#plugins-filters-kv-recursive
I came to understand that it duplicates the key/value pairs and stores them in a separate key. But I can't find additional info or examples about the option. I added a recursive line to my Logstash config file, and saw no changes.
Does it duplicate the fields with their values (key-value pairs), or what exactly does this option do?
Here's my sample CSV data after it passes through Logstash:
{
"host" => "smackcoders",
"Driveline" => "Four-wheel drive",
"Make" => "Jeep",
"Width" => "79",
"Torque" => "260",
"Year" => "2012",
"Horsepower" => "285",
"City_mpg" => "17",
"Height" => "34",
"Classification" => "Manual,Transmission",
"Model_Year" => "2012 Jeep Wrangler",
"Number_of_Forward_Gears" => "6",
"Length" => "41",
"Highway_mpg" => "21",
"#version" => "1",
"message" => "17,\"Manual,Transmission\",Four-wheel drive,Jeep 3.6L 6 Cylinder 280 hp 260 lb-ft,Gasoline,34,21,285,False,2012 Jeep Wrangler Arctic,41,Jeep,2012 Jeep Wrangler,6,260,6 Speed Manual,79,2012",
"Fuel_Type" => "Gasoline",
"Engine_Type" => "Jeep 3.6L 6 Cylinder 280 hp 260 lb-ft",
"path" => "/home/paulsteven/log_cars/cars.csv",
"Hybrid" => "False",
"ID" => "2012 Jeep Wrangler Arctic",
"#timestamp" => 2019-04-20T07:58:26.552Z,
"Transmission" => "6 Speed Manual"
}
Here's the config file:
input {
  file {
    path => "/home/paulsteven/log_cars/cars.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["City_mpg","Classification","Driveline","Engine_Type","Fuel_Type","Height","Highway_mpg","Horsepower","Hybrid","ID","Length","Make","Model_Year","Number_of_Forward_Gears","Torque","Transmission","Width","Year"]
  }
  kv {
    recursive => "true"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "kvfilter1"
    document_type => "details"
  }
  stdout {}
}
I found some examples for recursive in the kv filter:
input { generator { count => 1 message => 'foo=1,bar="foor=10,barr=11"' } }
filter {
  kv { field_split => "," value_split => "=" recursive => false }
}
will produce
"foo" => "1",
"bar" => "foor=10,barr=11",
whereas
input { generator { count => 1 message => 'foo=1,bar="foor=10,barr=11"' } }
filter {
  kv { field_split => "," value_split => "=" recursive => true }
}
will produce
"foo" => "1",
"bar" => {
"foor" => "10",
"barr" => "11"
},
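Worth noting for the CSV pipeline above: kv (recursive or not) only does something when the field it reads contains key=value text, which the car records do not, so seeing no change there is expected. A sketch of how it could be used on such data, with a hypothetical specs field that is not part of the original file:
filter {
  # "specs" is a hypothetical field for illustration,
  # e.g. specs = 'engine=3.6L,gears="type=manual,count=6"'
  kv {
    source => "specs"
    field_split => ","
    value_split => "="
    recursive => true
  }
}
With recursive => true the quoted gears value is itself split into [gears][type] and [gears][count] instead of being kept as a single string, matching the generator examples above.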
I want to include the @metadata field contents in my elasticsearch output.
This is the output when I am using stdout in my output section:
{
"#timestamp" => 2018-03-08T08:17:42.059Z,
"thread_name" => "SimpleAsyncTaskExecutor-2",
"#metadata" => {
"dead_letter_queue" => {
"entry_time" => 2018-03-08T08:17:50.082Z,
"reason" => "Could not index event to Elasticsearch. status: 400, action: ["index", {:_id=>nil, :_index=>"applog-2018.03.08", :_type=>"doc", :_routing=>nil}, #LogStash::Event:0x3ab79ab5], response: {"index"=>{"_index"=>"applog-2018.03.08", "_type"=>"doc", "_id"=>"POuwBGIB0PJDPQOoDy1Q", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [message]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:223"}}}}",
"plugin_type" => "elasticsearch",
"plugin_id" => "7ee60ceccc2ef7c933cf5aa718d42f24a65b489e12a1e1c7b67ce82e04ef0d37"
}
},
"#version" => "1",
"beat" => {
"name" => "filebeat-kwjn6",
"version" => "6.0.0"
},
"dateOffset" => 408697,
"source" => "/var/log/applogs/spring-cloud-dataflow/Log.log",
"logger_name" => "decurtis.dxp.deamon.JobConfiguration",
"message" => {
"timeStamp" => "2018-01-30",
"severity" => "ERROR",
"hostname" => "",
"commonUtility" => {},
"offset" => "Etc/UTC",
"messageCode" => "L_9001",
"correlationId" => "ea5b13c3-d395-4fa5-8124-19902e400316",
"componentName" => "dxp-deamon-refdata-country",
"componentVersion" => "1",
"message" => "Unhandled exceptions",
},
"tags" => [
[0] "webapp-log",
[1] "beats_input_codec_plain_applied",
[2] "_jsonparsefailure"
]
}
I want my @metadata field in the elasticsearch output.
Below is my conf file:
input {
  dead_letter_queue {
    path => "/usr/share/logstash/data/dead_letter_queue"
    commit_offsets => true
    pipeline_id => "main"
  }
}
filter {
  json {
    source => "message"
  }
  mutate {
    rename => { "[@metadata][dead_letter_queue][reason]" => "reason" }
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch"
    manage_template => false
    index => "deadletterlog-%{+YYYY.MM.dd}"
  }
}
Now in my output there is a field called "reason", but without any content. Is there something I am missing?
This can help:
mutate {
  add_field => {
    "reason" => "%{[@metadata][dead_letter_queue][reason]}"
    "plugin_id" => "%{[@metadata][dead_letter_queue][plugin_id]}"
    "plugin_type" => "%{[@metadata][dead_letter_queue][plugin_type]}"
  }
}
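Putting that together with the pipeline from the question, the filter section could look like this (a sketch based only on the config shown above; the json parse of message is kept as-is):
filter {
  json {
    source => "message"
  }
  mutate {
    # copy the DLQ metadata into regular fields so they survive into the elasticsearch output
    add_field => {
      "reason" => "%{[@metadata][dead_letter_queue][reason]}"
      "plugin_id" => "%{[@metadata][dead_letter_queue][plugin_id]}"
      "plugin_type" => "%{[@metadata][dead_letter_queue][plugin_type]}"
    }
  }
}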
I'm writing a Logstash 2.4.0 configuration to go through HTTP logs.
We'd like the port that is passed on the HEADER line to be included in the LINE events below it.
There is no specific end event defined, although I have tried adding one as well.
The input log file I'm currently using is:
HEADER 9200
LINE 1 2016-10-05 08:39:00 Some log data
LINE 2 2016-10-05 08:40:00 Some other log data
FOOTER
HEADER 9300
LINE 4 2016-11-05 08:39:00 Some log data in another log
LINE 5 2016-11-05 08:40:00 Some other log data in another log
FOOTER
I would like to have output like this (the Server_port fields are currently missing from the actual output):
{"message" => "HEADER 9200",
"#version" => "1",
"#timestamp" => "2016-11-15T11:17:18.425Z",
"path" => "test.log",
"host" => "hostname",
"type" => "event",
"env" => "test",
"port" => 9200,
"tags" => [[0] "Header"] }
{"message" => "LINE 1 2016-10-05 08:39:00 Some log data",
"#version" => "1",
"#timestamp" => "2016-11-15T11:17:20.186Z",
"path" => "test.log",
"host" => "hostname",
"type" => "event",
"env" => "test",
"logMessage" => "1 2016-10-05 08:39:00 Some log data",
"Server_port" => 9200,
"tags" => [[0] "Line"]}
{"message" => "LINE 2 2016-10-05 08:40:00 Some other log data",
"#version" => "1",<
"#timestamp" => "2016-11-15T11:17:20.192Z",
"path" => "test.log",
"host" => "hostname",
"type" => "event",
"env" => "test",
"logMessage" => "2 2016-10-05 08:40:00 Some other log data",
"Server_port" => 9200,
"tags" => [[0] "Line"]}
{"message" => "FOOTER",
"#version" => "1",
"#timestamp" => "2016-11-15T11:17:20.195Z",
"path" => "test.log",
"host" => "hostname",
"type" => "event",
"env" => "test",
"tags" => [[0] "Footer"]}
After trying out different things, the configuration I'm currently using is as follows, with a hardcoded taskid='abcd' for testing:
input {
  file {
    path => "test.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
    type => "event"
    add_field => { "env" => "test" }
  }
}
filter {
  grok {
    break_on_match => false
    tag_on_failure => []
    match => { "message" => ["^HEADER%{SPACE}%{INT:port:int}"] }
    add_tag => ["Header"]
  }
  grok {
    break_on_match => false
    tag_on_failure => []
    match => { "message" => "^LINE%{SPACE}%{GREEDYDATA:logMessage}" }
    add_tag => ["Line"]
  }
  grok {
    break_on_match => false
    tag_on_failure => []
    match => { "message" => "^FOOTER" }
    add_tag => ["Footer"]
  }
  if "Header" in [tags] {
    aggregate {
      task_id => "abcd"
      code => "map['server_port'] ||= 0; map['server_port']=event['port']"
      push_map_as_event_on_timeout => true
      push_previous_map_as_event => true
      map_action => "create"
    }
  } else if "Line" in [tags] {
    aggregate {
      task_id => "abcd"
      code => "event.set('server_port',map['server_port'])"
      map_action => "update"
    }
  } else if "Footer" in [tags] {
    aggregate {
      task_id => "abcd"
      code => "event.set('server_port',map['server_port'])"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }
}
output {
stdout { codec => rubydebug }
}
While this config runs without errors, it's not creating the server_port fields.
Where am I going wrong?
After fiddling around some more I have a working test case.
I've changed the configuration as follows:
grok {
  break_on_match => false
  tag_on_failure => []
  match => {
    "message" => ["^HEADER%{SPACE}%{INT:taskid:int}%{SPACE}%{INT:port:int}"]
  }
  add_tag => ["Header"]
}
and
if "Header" in [tags]{
aggregate{
task_id => "%{taskid}"
code => "map['port']=event.get('port')"
map_action => "create"
}
}
elseif "Line" in [tags]{
aggregate{
task_id =>"%{taskid}"
code => "event.set('port',map['port'])"
map_action => "update"
}
}
else if "Footer" in [tags]{
aggregate{
task_id => "%{taskid}"
code => "event.set('port',map['port'])"
map_action => "update"
end_of_task => true
timeout => 120
}
}
And added a task id field to the logs:
HEADER 123 9200
LINE 123 2016-10-05 08:39:00 Some log data
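For the Line and Footer events to end up in the same aggregate map, their grok patterns presumably also need to capture the task id from the new log format; that part is not shown above, so the following is my assumption rather than the tested config:
grok {
  break_on_match => false
  tag_on_failure => []
  match => {
    # the log lines now look like "LINE 123 2016-10-05 08:39:00 Some log data"
    "message" => ["^LINE%{SPACE}%{INT:taskid:int}%{SPACE}%{GREEDYDATA:logMessage}"]
  }
  add_tag => ["Line"]
}
The FOOTER pattern would need the same treatment if footer lines also carry the task id.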
I have the following logstash configuration:
input {
  file {
    path => ["C:/Users/MISHAL/Desktop/ELK_Files/rm/evsb.json"]
    type => "json"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "message"
  }
  mutate {
    convert => [ "increasedFare", "float" ]
    convert => [ "enq", "float" ]
    convert => [ "bkd", "float" ]
  }
  date {
    match => [ "date", "YYYY-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "localhost"
    index => "zsx"
  }
}
And this is the JSON data, jt.json:
[{"id":1,"date":"2015-11-11 23:00:00","enq":"105","bkd":"9","increasedFare":"0"}, {"id":2,"date":"2015-11-15 23:00:00","eng":"55","bkd":"2","increasedFare":"0"}, {"id":3,"date":"2015-11-20 23:00:00","enq":"105","bkd":"9","increasedFare":"0"}, {"id":4,"date":"2015-11-25 23:00:00","eng":"55","bkd":"2","increasedFare":"0"}]
I tried running this in Logstash, but I am not able to parse the date or get the date into the timestamp.
The following is the warning message I'm getting:
Failed parsing date from field {:field=>"[date]", :value=>"%{[date]}", :exception=>"Invalid format: \"%{[date]}\"", :config_parsers=>"YYYY-MM-dd HH:mm:ss", :config_locale=>"default=en_IN", :level=>:warn}
The following is the stdout output:
Logstash startup completed
{
"message" => "{\"id\":2,\"date\":\"2015-09-15 23:00:00\",\"enq\":\"34\",\"bkd\":\"2\",\"increasedFare\":\"0\"}\r",
"#version" => "1",
"#timestamp" => "2015-09-15T17:30:00.000Z",
"host" => "TCHWNG",
"path" => "C:/Users/MISHAL/Desktop/ELK_Files/jsonTest/jt.json",
"type" => "json",
"id" => 2,
"date" => "2015-09-15 23:00:00",
"enq" => 34.0,
"bkd" => 2.0,
"increasedFare" => 0.0
}
{
"message" => "{\"id\":3,\"date\":\"2015-09-20 23:00:00\",\"enq\":\"22\",\"bkd\":\"9\",\"increasedFare\":\"0\"}\r",
"#version" => "1",
"#timestamp" => "2015-09-20T17:30:00.000Z",
"host" => "TCHWNG",
"path" => "C:/Users/MISHAL/Desktop/ELK_Files/jsonTest/jt.json",
"type" => "json",
"id" => 3,
"date" => "2015-09-20 23:00:00",
"enq" => 22.0,
"bkd" => 9.0,
"increasedFare" => 0.0
}
{
"message" => "{\"id\":4,\"date\":\"2015-09-25 23:00:00\",\"enq\":\"66\",\"bkd\":\"2\",\"increasedFare\":\"0\"}\r",
"#version" => "1",
"#timestamp" => "2015-09-25T17:30:00.000Z",
"host" => "TCHWNG",
"path" => "C:/Users/MISHAL/Desktop/ELK_Files/jsonTest/jt.json",
"type" => "json",
"id" => 4,
"date" => "2015-09-25 23:00:00",
"enq" => 66.0,
"bkd" => 2.0,
"increasedFare" => 0.0
}
I've been trying to solve this for two days and have tried various things, but I am not able to solve it. Please tell me what I'm doing wrong here.
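The thread ends here without a resolution. One way to narrow the problem down (my suggestion, not something from the original post) is to test the json and date filters in isolation with a generator input, so the file input and Windows line endings are taken out of the picture; the if [date] guard simply skips the date filter on events where the json parse did not produce a date field:
input {
  generator {
    count => 1
    message => '{"id":1,"date":"2015-11-11 23:00:00","enq":"105","bkd":"9","increasedFare":"0"}'
  }
}
filter {
  json {
    source => "message"
  }
  # only run the date filter when a date field actually exists on the event
  if [date] {
    date {
      match => [ "date", "YYYY-MM-dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
}
output {
  stdout { codec => rubydebug }
}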