Elasticsearch / Logstash: define time or date when importing old log files

I have some old log files (one file per day).
log-2017.09.01.json
log-2017.09.02.json
etc
There is no date information in the json file.
By default, the timestamp of the index is the date of the creation of the index.
I am trying to create an index for each of these log files, and I want the timestamp of each index to match the date given by the file name.
i.e., I want an index "log-2017.09.01" whose timestamp would be 2017.09.01 and another index "log-2017.09.02" whose timestamp would be 2017.09.02.
Does anyone know a simple way to do this?

There isn't a simple way here, but it can be done. It takes a few steps.
The first step is to get the date out of the file path.
filter {
  grok {
    match => { "path" => "log-%{DATA:date_partial}\.json$" }
  }
}
The second step is to pull your timestamp data out of the log-lines. I'm assuming you know how to do that.
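For example, if each log line happened to start with a bare time of day (an assumption about your format, not something given in the question), a grok pattern along these lines would populate the hour and minute fields used in the next step:
filter {
  grok {
    # assumed layout: lines begin with HH:mm:ss; adjust to your actual log format
    match => { "message" => "^%{HOUR:date_hour}:%{MINUTE:date_minute}:%{SECOND:date_second}" }
  }
}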
The third step is to assemble a date field out of parts.
filter {
  mutate {
    add_field => { "full_timestamp" => "%{date_partial} %{date_hour}:%{date_minute}" }
  }
}
The last step is to use the date{} filter on that constructed field.
filter {
  date {
    match => [ "full_timestamp", "yyyy.MM.dd HH:mm" ]
  }
}
This should give you an idea as to the technique needed.
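Since the goal is one index per day named after the file, the extracted date_partial can also be reused in the elasticsearch output. A minimal sketch (the host value is only a placeholder):
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    # date_partial was captured from the file name in the first grok step
    index => "log-%{date_partial}"
  }
}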

Related

Add extra value to field before sending to elasticsearch

I'm using Logstash, Filebeat and grok to send data from logs to my Elasticsearch instance. This is the grok configuration in the pipeline:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:messageDate} %{GREEDYDATA:messagge}"
    }
  }
}
This works fine; the issue is that messageDate is in the format Jan 15 11:18:25, which has no year.
Now, I actually know the year these files were created in, and I was wondering whether it is possible to add that value to the field during processing, i.e. somehow turn Jan 15 11:18:25 into 2016 Jan 15 11:18:25 before sending it to Elasticsearch (obviously without editing the files, which I could do easily, but that would be a temporary workaround rather than a real solution).
I have tried googling whether it is possible, but no luck.
Valepu,
The most flexible way to modify the data in a field is the ruby filter:
filter {
  ruby {
    code => "# your code here"
  }
}
For more information, such as how to get and set field values, see:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html
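As an illustration (not part of the original answer), prepending a known year to the field with the Logstash 5+ event API could look roughly like this:
filter {
  ruby {
    # "2016" is the year the asker already knows; messageDate comes from the grok above
    code => "event.set('messageDate', '2016 ' + event.get('messageDate'))"
  }
}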
If you have a separate field for date as a string, you can use logstash date plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
If you don't have it as a separate field (as in this case), use this site to construct your own grok pattern:
http://grokconstructor.appspot.com/do/match
I made this to preprocess the values:
%{YEAR:yearVal} %{MONTH:monthVal} %{NUMBER:dayVal} %{TIME:timeVal} %{GREEDYDATA:message}
Not the most elegant, I guess, but you get the values in separate fields. From these you can build your own date field and parse it with the date filter so you end up with a comparable value, or you can use the fields by themselves. I'm sure there is a better solution, for example you could make your own grok pattern and use that, but I'll leave some exploration for you too. :)
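A minimal sketch of that idea (the fullDate field name is just an example introduced here):
filter {
  mutate {
    # assemble one string out of the pieces captured by the grok pattern above
    add_field => { "fullDate" => "%{yearVal} %{monthVal} %{dayVal} %{timeVal}" }
  }
  date {
    match => [ "fullDate", "YYYY MMM dd HH:mm:ss" ]
  }
}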
By reading the grok documentation thoroughly, I found what Google couldn't find for me and what I apparently missed the first time I read that page:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-add_field
Using the add_field and remove_field options, I managed to add the year to my date, then I used the date plugin to make it the event timestamp. My filter configuration now looks like this:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:tMessageDate} %{GREEDYDATA:messagge}"
    }
    add_field => { "messageDate" => "2016 %{tMessageDate}" }
    remove_field => ["tMessageDate"]
  }
  date {
    match => [ "messageDate", "YYYY MMM dd HH:mm:ss" ]
  }
}
And it worked fine

Elastic Stack: I need to set the Time Filter field name with another field

I need to read messages (whose content is logs) from RabbitMQ with Logstash and then send them to Elasticsearch to build monitoring visualizations in Kibana. So I wrote the input for reading from RabbitMQ in Logstash like this:
input {
  rabbitmq {
    queue => "testLogstash"
    host => "localhost"
  }
}
and I wrote the output configuration for storing in Elasticsearch like this:
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "d13-%{+YYYY.MM.dd}"
  }
}
Both of them are placed in myConf.conf
The content of each message is JSON containing fields like this (the mDate value shows its format):
{
  "mDate": "MMMM dd YYYY, HH:mm:ss.SSS",
  "name": "test name"
}
But there are two problems. First, when creating a new index pattern, there is no date field to choose as the Time Filter field name. Second, if I use the default @timestamp, the mDate field is not available for building those graph types. I think the reason is the data type of the field: it should be a date, but it is treated as a string.
I tried to convert the value of the field to a date with mutate in the Logstash config like this:
filter {
  mutate {
    convert => { "mdate" => "date" }
  }
}
Now, two questions arise:
1. Is this the problem? If yes, what is the right solution to fix it?
2. My main need is to use the time when messages are put into the queue, not when Logstash picks them up. What is the best solution?
If you don't specify a value for @timestamp, you should get the current system time when Elasticsearch indexes the document. With that, you should be able to see items in Kibana.
If I understand you correctly, you'd rather use your mDate field for @timestamp. For this, use the date{} filter in Logstash.
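A minimal sketch of that, assuming mDate really arrives as a string in the format shown above (e.g. "January 02 2019, 13:45:30.123"):
filter {
  date {
    # parses mDate and writes the result into @timestamp by default
    match => [ "mDate", "MMMM dd YYYY, HH:mm:ss.SSS" ]
  }
}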

Logstash Dynamic Index From Document Field Fails

I still can't figure out how to tell Logstash to send documents to a dynamic index based on a document field. Furthermore, this field must be transformed in order to get the "real" index name at the very end.
Given that there is a field "time" (a UNIX timestamp), this field already gets transformed with a date filter into a DateTime object for Elasticsearch.
Additionally, it should serve as the index name (YYYYMM). The index should NOT be derived from @timestamp, which is not touched.
Example:
{...,"time":1453412341,...}
should go to the index 201601.
I use the following Config:
filter {
  date {
    match => [ "time", "UNIX" ]
    target => "time"
    timezone => "Europe/Berlin"
  }
}
output {
  elasticsearch {
    index => "%{time}%{+YYYYMM}"
    document_type => "..."
    document_id => "%{ID}"
    hosts => "..."
  }
}
Sadly, it's not working. Any idea how to achieve that?
Thanks a lot!
The "%{+YYYYMM}" says to use the date values from #timestamp. If you want an index named after the YYYYMM in %{time}, you need to make a string out of that date field and then reference that string in the output stanza. There might be a mutate{} that would do it, or drop into ruby{}.
In most installations, you want to set #timestamp to the event's value. The default of logstash's own time is not very useful (imagine if your events were delayed by an hour during processing). If you did that, then %{+YYYYMM}" would work just fine.
This is caused because the index name is created based on UTC time by default.
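To make the first suggestion concrete, here is a rough sketch (not from the original answer) that builds a string field for the index name with the Logstash 5+ ruby filter; index_month is a field name introduced only for this example:
filter {
  ruby {
    # "time" holds the LogStash::Timestamp produced by the date filter above;
    # format it as YYYYMM (UTC) and store it in a plain string field
    code => "event.set('index_month', event.get('time').time.strftime('%Y%m'))"
  }
}
output {
  elasticsearch {
    hosts => "..."
    index => "%{index_month}"
  }
}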

How to set time in log as main #timestamp in elasticsearch

I'm using Logstash to index some old log files into my Elasticsearch DB.
I need Kibana/Elasticsearch to set the timestamp from within the log file as the main @timestamp.
I'm using a grok filter in the following way:
%{TIMESTAMP_ISO8601:@timestamp}
yet Elasticsearch sets the time of indexing as the main @timestamp and not the timestamp written in the log line.
Any idea what I am doing wrong here?
Thanks
Use the date filter to set the @timestamp field. Extract the timestamp, in whatever format it's in, into a separate (temporary) field, e.g. timestamp, and feed it to the date filter. In your case you'll most likely be able to use the special ISO8601 timestamp format token.
filter {
  date {
    match => ["timestamp", "ISO8601"]
    remove_field => ["timestamp"]
  }
}
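Putting both halves together, a minimal sketch, assuming the ISO8601 timestamp sits at the start of each line (timestamp is the temporary field name suggested above):
filter {
  grok {
    # capture the leading ISO8601 timestamp into a temporary field
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp}" }
  }
  date {
    match => ["timestamp", "ISO8601"]
    remove_field => ["timestamp"]
  }
}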

Convert a string field to date

So, I have two fields in my log, timeLogged and timeQueued, and both have this date format: 2014-06-14 19:41:21+0000
My question is: how do I convert a string date value to a Logstash date, like the one in @timestamp?
For the sole purpose of converting to @timestamp there is a dedicated date filter:
date {
  match => ["timeLogged", "YYYY-MM-dd HH:mm:ssZ"]
}
Now in your case there are basically two fields that might be used, so you will have to dig a little: either use a grok filter to copy the values into a generic "log_date" field, or try to see whether the date filter can take several arguments, like one of these possibilities:
date {
  match => ["timeLogged", "YYYY-MM-dd HH:mm:ssZ",
            "timeQueued", "YYYY-MM-dd HH:mm:ssZ"]
}
OR
date {
  match => ["timeLogged", "YYYY-MM-dd HH:mm:ssZ"]
  match => ["timeQueued", "YYYY-MM-dd HH:mm:ssZ"]
}
It is up to you to experiment; I never tried it myself. ;)
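For what it's worth, another option (not from the original answer) is two separate date filters, one per field; the first one also sets @timestamp:
date {
  # sets @timestamp from timeLogged
  match => ["timeLogged", "YYYY-MM-dd HH:mm:ssZ"]
}
date {
  # overwrites the timeQueued string with a parsed timestamp
  match => ["timeQueued", "YYYY-MM-dd HH:mm:ssZ"]
  target => "timeQueued"
}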
this should suffice:
date {
  match => [ "timeLogged", "ISO8601", "YYYY-MM-dd HH:mm:ss" ]
  target => "timeLogged"
  locale => "en"
}
You can try this filter:
filter {
  ruby {
    code => "
      event['timeLogged'] = Time.parse(event['timeLogged'])
      event['timeQueued'] = Time.parse(event['timeQueued'])
    "
  }
}
Use the powerful ruby library to do what you need!
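Note that the hash-style event access above only works on older Logstash releases; on Logstash 5+ the ruby filter uses the event get/set API instead. A rough equivalent sketch:
filter {
  ruby {
    init => "require 'time'"
    code => "
      # parse the string fields and store them as Logstash timestamps
      event.set('timeLogged', LogStash::Timestamp.new(Time.parse(event.get('timeLogged'))))
      event.set('timeQueued', LogStash::Timestamp.new(Time.parse(event.get('timeQueued'))))
    "
  }
}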
