Timezone causing different results when running a search query against an index in Elasticsearch

I'm trying to retrieve the results of a search query (i.e. the documents within a given date range) from a particular index, so that I can get the results on a daily basis.
This is the query: http://localhost:9200/dialog_test/_search?q=timestamp:[2016-08-03T00:00:00.128%20TO%202016-08-03T23:59:59.128]
In the above, timestamp is a field which I added in my logstash.conf in order to capture the actual log time. When I ran this query, I surprisingly got a number of hits (total hits: 24), which should have been 0 since I don't have any log records dated 2016-08-03. It actually returns the count for the next day (2016-08-04), which has 24 records in the log file. I'm sure something has gone wrong with the timezone.
My timezone is GMT+5:30.
Here is the filter part of my Logstash config:
filter {
  grok {
    patterns_dir => ["D:/ELK Stack/logstash/logstash-2.3.4/bin/patterns"]
    match => { "message" => "^%{LOGTIMESTAMP:logtimestamp}%{GREEDYDATA}" }
  }
  mutate {
    add_field => { "timestamp" => "%{logtimestamp}" }
    remove_field => ["logtimestamp"]
  }
  date {
    match => [ "timestamp" , "ISO8601" , "yyyyMMdd HH:mm:ss.SSS" ]
    target => "timestamp"
    locale => "en"
  }
}
EDIT:
This is a snapshot of the first 24 records, dated 2016-08-04, from the log file: (screenshot not included)
And this is a snapshot of the JSON response I got when I searched for the date 2016-08-03: (screenshot not included)
Where am I going wrong? Any help would be appreciated.

In your date filter you need to add a timezone:
date {
  match => [ "timestamp" , "ISO8601" , "yyyyMMdd HH:mm:ss.SSS" ]
  target => "timestamp"
  locale => "en"
  timezone => "Asia/Calcutta"   # add this
}
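To see why the day can shift, it may help to look at the arithmetic outside Logstash. Elasticsearch stores dates in UTC, and a range query without an explicit time zone treats its bounds as UTC. The Python sketch below is only an illustration: the sample time is hypothetical (the question does not show the raw log lines), and a fixed +05:30 offset stands in for Asia/Calcutta.

```python
from datetime import datetime, timedelta, timezone

# Fixed +05:30 offset standing in for Asia/Calcutta (assumption: no DST to handle).
IST = timezone(timedelta(hours=5, minutes=30))


def to_utc(wall_clock: str, tz: timezone) -> datetime:
    """Interpret a zone-less 'yyyyMMdd HH:mm:ss.SSS' string in tz and convert it to UTC."""
    naive = datetime.strptime(wall_clock, "%Y%m%d %H:%M:%S.%f")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


# Hypothetical log time: early morning of 2016-08-04 in India.
sample = "20160804 02:15:00.128"

as_ist = to_utc(sample, IST)           # string read as GMT+5:30 wall-clock time
as_utc = to_utc(sample, timezone.utc)  # same string read as if it were already UTC

print(as_ist.isoformat())
print(as_utc.isoformat())
```

Read as GMT+5:30, the early-morning record lands on the previous calendar day in UTC, which is the kind of shift that makes a query for 2016-08-03 return records logged on 2016-08-04.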

Related

Logstash to Opensearch , _dateparsefailure tag

I have some problems sending logs from Logstash to OpenSearch.
filter {
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patterns"]
    match => [ "message","%{DATE_FORM:logdate}%{LOGTYPE:logtype}:%{SPACE}%{GREEDYDATA:msgbody}" ]
  }
  date {
    match => ["logdate", "yyyy.MM.dd-HH.mm.ss:SSS"]
    timezone => "UTC"
    target => "timestamp"
  }
  mutate {
    remove_field => ["message"]
    add_field => {
      "file" => "%{[@metadata][s3][key]}"
    }
  }
}
This is the conf file I'm using for Logstash.
In the OpenSearch console:
@timestamp : Dec 15, 2022 @ 18:10:56.975
logdate [2022.12.10-11.57.36:345]
tags _dateparsefailure
The @timestamp and logdate values are different, and a _dateparsefailure tag is added.
The raw logs start with a timestamp in this format:
[2022.12.10-11.57.36:345]
Right now,
logdate : the raw log's timestamp
@timestamp : the time the log was sent to OpenSearch
I want logdate and @timestamp to match.
How can I modify the filter.date.match part so that logdate and @timestamp end up the same?
If you have multiple time fields, you can use more than one date filter, like this:
filter {
  date {
    match => ["logdate", "yyyy.MM.dd-HH.mm.ss:SSS"]
    timezone => "UTC"
    target => "logdate"
  }
  date {
    match => ["@timestamp", "yyyy.MM.dd-HH.mm.ss:SSS"]
    timezone => "UTC"
    target => "@timestamp"
  }
}
If your time field has multiple formats, you can do this:
date {
  match => [ "logdate", "yyyy.MM.dd-HH.mm.ss:SSS", "third_format", "ISO8601" ]
  target => "@timestamp"
}
Reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html#plugins-filters-date-match
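One thing worth checking: the console output in the question shows logdate as [2022.12.10-11.57.36:345], brackets included. If the grok capture really contains those brackets (an assumption, since the custom DATE_FORM pattern is not shown), a format string without them cannot match the value. The Python sketch below illustrates the same effect with strptime; it is an analogy, not Logstash's own parser.

```python
from datetime import datetime

# Python analogue of the Logstash format yyyy.MM.dd-HH.mm.ss:SSS
FORMAT = "%Y.%m.%d-%H.%M.%S:%f"


def try_parse(value: str):
    """Return the parsed datetime, or None when the value does not fit FORMAT."""
    try:
        return datetime.strptime(value, FORMAT)
    except ValueError:
        return None


raw = "[2022.12.10-11.57.36:345]"

with_brackets = try_parse(raw)                 # brackets are not part of FORMAT
without_brackets = try_parse(raw.strip("[]"))  # same value with brackets removed

print(with_brackets)
print(without_brackets)
```

If that is what is happening, either the brackets need to be kept out of the captured field or the date format has to account for them.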

Logstash - Setting a timestamp from a JSON parsed object

I am having an issue with setting a timestamp from a JSON parse.
I have this string:
[{"orderNumber":"423523-4325-3212-4235-463a72e76fe8","externalOrderNumber":"reactivate_22d6ff0d8f55eb821be14df9d35505a6","operation":{"name":"CAPTURE","amount":134,"status":"SUCCESS","createdAt":"2015-05-11T09:14:30.969Z","updatedAt":{}}}]
I parse it as JSON using this Logstash filter:
filter {
  grok {
    match => { "message" => "\[%{GREEDYDATA:firstjson}\]%{SPACE} \[%{GREEDYDATA:secondjson}\}]}]"}
  }
  json {
    source => "firstjson"
  }
  date {
    match => [ "operation.createdAt", "ISO8601"]
  }
  mutate {
    remove_field => [ "firstjson", "secondjson" ]
  }
}
This creates a document in Elasticsearch. I have a field named operation.createdAt which is properly recognised as a date field. But for some reason, this block:
date {
  match => [ "operation.createdAt", "ISO8601"]
}
is not setting the @timestamp field. The current @timestamp is set at the moment of document insertion. What am I doing wrong?
Thanks to the nice people at the ES Logstash community, I have found the answer.
Instead of:
date {
  match => [ "operation.createdAt", "ISO8601"]
}
I use this:
date {
  match => [ "[operation][createdAt]", "ISO8601"]
}
and that properly extracts and parses the JSON time object.
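The difference between the two spellings can be pictured with a small Python sketch. It is only a model of the idea (a dotted name is looked up as one literal top-level key, while bracket notation walks into nested objects); get_field is a hypothetical helper, not Logstash's implementation.

```python
import re


def get_field(event: dict, reference: str):
    """Resolve a '[a][b]' style reference against a nested dict.

    A reference without brackets is treated as a single top-level key.
    """
    keys = re.findall(r"\[([^\]]+)\]", reference) or [reference]
    current = event
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Shape of the event after the json filter has parsed the first JSON object.
event = {"operation": {"name": "CAPTURE", "createdAt": "2015-05-11T09:14:30.969Z"}}

dotted = get_field(event, "operation.createdAt")        # no such top-level key
bracketed = get_field(event, "[operation][createdAt]")  # walks into the nested object

print(dotted)
print(bracketed)
```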

logstash-input-mongodb: controlling the output?

I'm trying to setup the logstash-input-mongodb plugin to read audits from my database, but all the parsing strategies seem to have issues and I don't see how to customize anything.
The "flatten" parse_method works quite nicely, but it ignores mongodb object IDs and does not output them anywhere except in the log_entry field.
The "simple" parse_method includes object IDs but outputs dates in a way that I cannot figure out how to parse with the date filter (e.g., "2017-02-12 16:30:00 UTC"). Then, in the absence of a proper timestamp, the plugin seems to generate timestamps on its own which have no relation to the current time (e.g., in 2022).
The "dig" method I haven't quite figured out yet.
So my questions:
Is there a way to parse data from the log_entry (see example below) field that the plugin outputs? I've tried the json filter but it is not json because it's been ruby-formatted.
Or, is there any way to get the "flatten" method to include object IDs?
Or, is there any way to get the "simple" method to properly format mongodb ISODate fields?
Is there any way to prevent the plugin from reading data from the beginning of time (I only want to push the last day or so into logstash)?
Can be reproduced with any configuration, here's my basic one:
input {
  mongodb {
    uri => 'mongodb://localhost:27017/test'
    placeholder_db_dir => '/elk/logstash-mongodb/'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'auditcommunications'
    batch_size => 1000
    parse_method => "flatten"
  }
}
filter {
  date {
    match => [ "timestamp", "ISO8601" ]
  }
}
output {
  stdout { codec => rubydebug }
}
Example data including log_entry:
{
  "audit-id" => "58a2edc916e057270065fa74",
  "created" => "2017-02-14T11:45:13Z",
  "type" => "mongodb-audit",
  "audit-type" => "PaymentAudit",
  "mongo_id" => "58a2edc916e057270065fa74",
  "expiresAt" => "2017-05-15T11:45:13Z",
  "lastUpdated" => "2017-02-14T11:45:13Z",
  "@timestamp" => 2017-02-14T11:45:13.000Z,
  "log_entry" => "{\"_id\"=>BSON::ObjectId('58a2edc916e057270065fa74'), \"order\"=>BSON::ObjectId('a8a2f205790858970046aa59'), \"_type\"=>\"PaymentAudit\", \"lastUpdated\"=>2017-02-14 11:45:13 UTC, \"created\"=>2017-02-14 11:45:13 UTC, \"payment\"=>BSON::ObjectId('58a2edc02eafcd560101ee5f'), \"organization\"=>BSON::ObjectId('56edde0ba33e1c03ff54a5ec'), \"status\"=>\"succeeded\", \"context\"=>{\"type\"=>\"order\", \"id\"=>BSON::ObjectId('58a2e205790852270046ab59')}, \"expiresAt\"=>2017-05-15 11:45:13 UTC, \"__v\"=>0}",
  "logdate" => "2017-02-14T11:45:13+00:00",
  "__v" => 0,
  "@version" => "1",
  "context_type" => "order",
  "status" => "succeeded",
  "timestamp" => "2017-02-14T11:45:13Z"
}
How can I extract the organization from the log_entry field above?
I've tried the following:
filter {
ruby {
code => "event.set('organization', eval(event.get('[log_entry]')))"
}
}
but this throws a rubyexception: ERROR logstash.filters.ruby - Ruby exception occurred: (eval):1: syntax error, unexpected tINTEGER
If you use the simple parse_method, you can parse the timestamp with the pattern yyyy-MM-dd HH:mm:ss ZZZ, which you can add to your date filter:
filter {
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss ZZZ" ]
  }
}
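As a quick sanity check of what that pattern is expected to do, here is a minimal Python sketch that parses the example value from the question ("2017-02-12 16:30:00 UTC") into a timezone-aware timestamp. The function name is hypothetical, and the sketch assumes the trailing zone is always the literal UTC, as in the sample.

```python
from datetime import datetime, timezone


def parse_simple_timestamp(value: str) -> datetime:
    """Parse 'yyyy-MM-dd HH:mm:ss UTC' into an aware datetime (UTC suffix assumed)."""
    suffix = " UTC"
    if not value.endswith(suffix):
        raise ValueError(f"expected a value ending in {suffix!r}: {value!r}")
    naive = datetime.strptime(value[: -len(suffix)], "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)


parsed = parse_simple_timestamp("2017-02-12 16:30:00 UTC")
print(parsed.isoformat())
```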
Regarding the last point, I suggest checking the since_* settings which allow you to keep a cursor of what's been already processed and only start from that cursor on the next logstash restart.
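As for pulling the organization out of log_entry: the field holds Ruby's printed form of a hash, including bare values such as 2017-02-14 11:45:13 UTC, which is likely why eval reports a syntax error. One possible approach is to match the single value you need with a regular expression instead of evaluating the whole string. The Python sketch below shows the idea on a shortened copy of the sample; the helper name is hypothetical, and a similar expression could probably be used in a grok or ruby filter.

```python
import re
from typing import Optional

# Matches "organization"=>BSON::ObjectId('<24 hex chars>') and captures the id.
ORGANIZATION = re.compile(r'"organization"=>BSON::ObjectId\(\'([0-9a-f]{24})\'\)')


def extract_organization(log_entry: str) -> Optional[str]:
    """Return the organization ObjectId string, or None if it is not present."""
    match = ORGANIZATION.search(log_entry)
    return match.group(1) if match else None


# Shortened copy of the log_entry value from the question.
log_entry = (
    '{"_id"=>BSON::ObjectId(\'58a2edc916e057270065fa74\'), '
    '"created"=>2017-02-14 11:45:13 UTC, '
    '"organization"=>BSON::ObjectId(\'56edde0ba33e1c03ff54a5ec\'), '
    '"status"=>"succeeded"}'
)

organization = extract_organization(log_entry)
print(organization)
```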

Extract Parameter (sub-string) from URL GROK Pattern

I have ELK running for log analysis. I have everything working. There are just a few tweaks I would like to make. To all the ES/ELK Gods in stackoverflow, I'd appreciate any help on this. I'd gladly buy you a cup of coffee! :D
Example:
URL: /origina-www.domain.com/this/is/a/path?page=2
First I would like to get the entire path as seen above.
Second, I would like to get just the path before the parameter: /origina-www.domain.com/this/is/a/path
Third, I would like to get just the parameter: ?page=2
Fourth, I would like the timestamp in the log file to be the main timestamp in Kibana. Currently, the timestamp Kibana shows is the date and time the event was processed by ES.
This is what a sample entry looks like:
2016-10-19 23:57:32 192.168.0.1 GET /origin-www.example.com/url 200 1144 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-" "-"
Here's my config:
if [type] == "syslog" {
  grok {
    match => ["message", "%{IP:client}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}\s+%{USER-AGENT}\s+%{QS:referrer}\s+%{QS:agent}%{GREEDYDATA}"]
  }
  date {
    match => [ "timestamp", "MMM dd, yyyy HH:mm:ss a" ]
    locale => "en"
  }
}
ES Version: 5.0.1
Logstash Version: 5.0
Kibana: 5.0
UPDATE: I was actually able to solve it by using:
grok {
  match => ["message", "%{IP:client}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}\s+%{USER-AGENT}\s+%{QS:referrer}\s+%{QS:agent}%{GREEDYDATA}"]
}
grok {
  match => [ "request", "%{GREEDYDATA:uri_path}\?%{GREEDYDATA:uri_query}" ]
}
kv {
  source => "uri_query"
  field_split => "&"
  target => "query"
}
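For reference, the split that the second grok and the kv filter perform can be reproduced outside Logstash. The Python sketch below uses the example URL from the question and the standard urllib.parse module; the output field names mirror the config above, and split_request is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def split_request(request: str) -> dict:
    """Split a request into path, raw query string and parsed query parameters."""
    parts = urlsplit(request)
    return {
        "uri_path": parts.path,
        "uri_query": parts.query,
        # Keep only the first value of each parameter to keep the sketch small.
        "query": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


fields = split_request("/origina-www.domain.com/this/is/a/path?page=2")
print(fields)
```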
In order to use the actual timestamp of your log entry rather than the indexed time, you can use the date and mutate plugins to override the existing timestamp value. Your Logstash filter could look something like this:
# filtering your log file
grok {
  # You could have a pattern file with an expression such as
  # LOGTIMESTAMP %{YEAR}%{MONTHNUM}%{MONTHDAY} %{TIME}
  # if you have to change the timestamp format.
  patterns_dir => ["/pathto/patterns"]
  match => { "message" => "^%{LOGTIMESTAMP:logtimestamp}%{GREEDYDATA}" }
}
# overriding the existing timestamp with the new field logtimestamp
mutate {
  add_field => { "timestamp" => "%{logtimestamp}" }
  remove_field => ["logtimestamp"]
}
# inserting the timestamp as UTC
date {
  match => [ "timestamp" , "ISO8601" , "yyyyMMdd HH:mm:ss.SSS" ]
  target => "timestamp"
  locale => "en"
  timezone => "UTC"
}
You could also refer to the linked question for more. Hope it helps.

Logstash converting date to valid joda time (@timestamp)

Hope someone can help me out!
I have a question about Logstash. I grok the following date successfully: 26/Jun/2013:14:00:26 +0200
Next, I want this date to be used as the @timestamp of the event. As you know, Logstash automatically adds a timestamp.
Replacing the timestamp that Logstash adds can be done with the date filter. I have added the following date filter: match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z"]
But, for some reason, that doesn't work. When I test it, I see that Logstash just adds its own timestamp.
Code:
grok {
  type => "log-date"
  pattern => "%{HTTPDATE:date}"
}
date {
  type => "log-date"
  match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z"]
}
I need to do this, so I can add events to elasticsearch.
Thanks in advance!
I used the following approach:
# strip the timestamp and force event timestamp to be the same.
# the original string is saved in field %{log_timestamp}.
# the original logstash input timestamp is saved in field %{event_timestamp}.
grok {
  patterns_dir => "./patterns"
  match => [ "message", "%{IRODS_TIMESTAMP:log_timestamp}" ]
  add_tag => "got_syslog_timestamp"
  add_field => [ "event_timestamp", "%{@timestamp}" ]
}
date {
  match => [ "log_timestamp", "MMM dd HH:mm:ss" ]
}
mutate {
  replace => [ "@timestamp", "%{log_timestamp}" ]
}
My problem now is that, even though @timestamp is replaced, I would like to convert it to an ISO8601-compatible format first so that other programs don't have problems interpreting it, like the timestamp present in "event_timestamp":
"@timestamp" => "Mar 5 14:38:40",
"@version" => "1",
"type" => "irods.relog",
"host" => "ids-dev",
"path" => "/root/logstash/reLog.2013.03.01",
"pid" => "5229",
"level" => "NOTICE",
"log_timestamp" => "Mar 5 14:38:40",
"event_timestamp" => "2013-09-17 12:20:28 UTC",
"tags" => [
[0] "got_syslog_timestamp"
]
You could convert it easily since you have the year information. In my case I would have to parse it out of the "path" (filename) attribute. Still, there does not seem to be a convert_to_iso8601 => @timestamp directive.
Hope this helps with your issue anyway! :)
The above answer is just a workaround. Try adding locale => "en" to your code.
If it is not set, weekday and month names are parsed using the default platform locale (Spanish, French or whatever), and that's why it didn't work (since your log is in English).
date {
  type => "log-date"
  match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z"]
  locale => "en"
}
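To illustrate the conversion the question is after, the Python sketch below parses the example date 26/Jun/2013:14:00:26 +0200 and prints it as ISO 8601 in UTC. It is an analogy rather than Logstash code, and the function name is hypothetical. The %b directive for the month name is locale-dependent in Python as well, which mirrors the locale point above; the sketch assumes the process runs with the default C (English) locale.

```python
from datetime import datetime, timezone


def httpdate_to_utc_iso(value: str) -> str:
    """Parse a date such as 26/Jun/2013:14:00:26 +0200 and return ISO 8601 in UTC."""
    # %b parses the abbreviated month name using the current locale (English assumed).
    parsed = datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")
    return parsed.astimezone(timezone.utc).isoformat()


iso = httpdate_to_utc_iso("26/Jun/2013:14:00:26 +0200")
print(iso)
```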
