Have @timestamp in document as epoch-millis when using logstash - ruby

In a PoC in our project, we are trying out Logstash instead of our own Java-based indexing module to push data to Elasticsearch. The incoming JSON data doesn't have an @timestamp field, so Logstash adds that field in ISO8601 format. But we already have a specific mapping for that ES index, and it requires us to push @timestamp in epoch-millis format.
I've tried playing with ruby filters to convert @timestamp to epoch-millis, but no luck so far. Is there any way we can ingest records into ES through Logstash with @timestamp in epoch-millis format?
I'm using Logstash 6.5.4 and ES 6.2.2.
Update: After trying out the suggestion in the answer, my conf file looks like this:
input { stdin { } }
filter {
  ruby {
    code => "
      epoch_ts = event.timestamp.time.localtime.strftime('%s').to_i
      event.set( 'epoch', epoch_ts )
    "
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "myindex"
    script_type => "inline"
    script => 'ctx._source.@timestamp = params.event.get("epoch")'
  }
  stdout { codec => rubydebug }
}
But it still doesn't work. The @timestamp value doesn't change at all. Also, I now need to remove that extra epoch field.

This ruby code should work for you:
ruby {
  code => "
    epoch_ts = event.timestamp.time.localtime.strftime('%s').to_f
    event.set( '@timestamp', epoch_ts )
  "
}

After quite a while of searching around the web, I finally gave up on this approach. Instead, I forced ES to return @timestamp in epoch_millis using the docvalue_fields approach.
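For reference, a minimal sketch of that workaround, assuming the index is named myindex: in the ES 6.x search API, docvalue_fields returns date fields as epoch millis by default (newer versions also accept an explicit per-field format, so check the exact behaviour on your cluster):

GET myindex/_search
{
  "query": { "match_all": {} },
  "docvalue_fields": [ "@timestamp" ]
}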

Related

Filter for my Custom Logs in Logstash

I am new to the ELK stack. I want to use it to push my logs to Elasticsearch so that I can use Kibana on them. Below is the format of my custom log:
Date Time INFO - searchinfo#username#searchQuery#latitude#longitude#client_ip#responseTime
Below is an example log line that follows this format.
2017-07-04 11:16:10 INFO - searchinfo#null#gate#0.0#0.0#180.179.209.54#598
I am using Filebeat to push my .log files to Logstash, and Logstash pushes that data into Elasticsearch.
I need help writing a Logstash filter config that splits on the # delimiter and puts the data into the respective fields in the Elasticsearch index.
How can I do this?
Try the grok plugin to parse your logs into structured data:
filter {
  grok {
    match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:var0}%{SPACE}%{NOTSPACE}%{SPACE}(?<searchinfo>[^#]*)#(?<username>[^#]*)#(?<searchQuery>[^#]*)#(?<latitude>[^#]*)#(?<longitude>[^#]*)#(?<client_ip>[^#]*)#(?<responseTime>[^#]*)" }
  }
}
You can debug the grok pattern online with a grok debugger.
You need to use a grok filter to parse your log.
You can try this:
filter {
  grok {
    match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:var0}%{SPACE}%{NOTSPACE}%{SPACE}(?<var1>[^#]*)#(?<var2>[^#]*)#(?<var3>[^#]*)#(?<var4>[^#]*)#(?<var5>[^#]*)#(?<var6>[^#]*)#(?<var7>[^#]*)" }
  }
}
This will parse your log and add fields named var0, var1, etc. to the parsed document. You can rename these variables as you prefer, for example with the mutate filter sketched below.
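For instance, a hedged sketch of renaming those captures and reusing the log's own timestamp; the target field names are assumptions taken from the question, and the date pattern is assumed from the sample line above:

filter {
  # runs after the grok filter above has populated timestamp and var0..var7
  mutate {
    rename => {
      "var2" => "username"
      "var3" => "searchQuery"
      "var4" => "latitude"
      "var5" => "longitude"
      "var6" => "client_ip"
      "var7" => "responseTime"
    }
  }
  date {
    # set @timestamp from the parsed log time (pattern assumed from the sample)
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
  }
}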

Import text file in Elasticsearch

I'd like to import a text file in Elasticsearch.
The text file contains a single (hash) value per line.
After spending several hours struggling, I didn't get it done.
Help is greatly appreciated.
Elasticsearch 5.1.2 with Logstash installed.
Sample data:
2d75cc1bf8e57872781f9cd04a529256
00f538c3d410822e241486ca061a57ee
3f066dd1f1da052248aed5abc4a0c6a1
781770fda3bd3236d0ab8274577dddde
86b6c59aa48a69e16d3313d982791398
I need just one index, 'hashes', with type 'md5'.
You can use duckimport; it's similar to Logstash but easier to use. I'm the developer of that tool.
Well, if you have Logstash, import it with Logstash.
Example config:
input {
  file {
    path => "/path/myfile"
    start_position => "beginning"
    type => "md5"
  }
}
output {
  elasticsearch {
    index => "hashes"
  }
}
This assumes you run Logstash on the same instance as Elasticsearch.
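With the file input, each line ends up in the message field, alongside metadata such as host and path. If you want the document to carry a clearer field name, a small filter sketch (the hash field name and the list of fields to drop are assumptions):

filter {
  mutate {
    # each line is one hash value; give it a clearer field name
    rename => { "message" => "hash" }
    # drop fields added by the file input that aren't needed in the index
    remove_field => [ "path", "host", "@version" ]
  }
}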

Can't access Elasticsearch index name metadata in Logstash filter

I want to add the Elasticsearch index name as a field in the event when processing in Logstash. This is supposed to be pretty straightforward, but the index name does not get printed out. Here is the complete Logstash config.
input {
  elasticsearch {
    hosts => "elasticsearch.example.com"
    index => "*-logs"
  }
}
filter {
  mutate {
    add_field => {
      "log_source" => "%{[@metadata][_index]}"
    }
  }
}
output {
  elasticsearch {
    index => "logstash-%{+YYYY.MM}"
  }
}
This results in log_source being set to the literal string %{[@metadata][_index]} and not the actual name of the index. I have tried this with _id and without the underscores, but it always just outputs the reference and not the value.
Using just %{[@metadata]} crashes Logstash with an error about accessing the list incorrectly, so [@metadata] is being set, but it seems like _index and the other values are missing.
Does anyone have a another way of assigning the index name to the event?
I am using 5.0.1 of both Logstash and Elasticsearch.
You're almost there, you're simply missing the docinfo setting, which is false by default:
input {
  elasticsearch {
    hosts => "elasticsearch.example.com"
    index => "*-logs"
    docinfo => true
  }
}
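With docinfo enabled, the input copies _index, _type and _id into [@metadata] by default. If you only need the index name, you can narrow that down; a sketch using the plugin's docinfo_fields option (worth double-checking against your plugin version):

input {
  elasticsearch {
    hosts => "elasticsearch.example.com"
    index => "*-logs"
    docinfo => true
    # only copy the index name into [@metadata]
    docinfo_fields => [ "_index" ]
  }
}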

Logstash -> Elasticsearch : update document #timestamp if newer, discard if older

Using the elasticsearch output in Logstash, how can I update only the @timestamp of a log message if it is newer?
I don't want to reindex the whole document, nor have the same log message indexed twice.
Also, if the @timestamp is older, it must not update/replace the current version.
Currently, I'm doing this:
filter {
  if ("cloned" in [tags]) {
    fingerprint {
      add_tag => [ "lastlogin" ]
      key => "lastlogin"
      method => "SHA1"
    }
  }
}
output {
  if ("cloned" in [tags]) {
    elasticsearch {
      action => "update"
      doc_as_upsert => true
      document_id => "%{fingerprint}"
      index => "lastlogin-%{+YYYY.MM}"
      sniffing => true
      template_overwrite => true
    }
  }
}
It is similar to How to deduplicate documents while indexing into elasticsearch from logstash, but I do not want to always update the message field; only if the @timestamp field is more recent.
You can't decide at the Logstash level whether a document needs to be updated or left alone; this needs to be decided at the Elasticsearch level, which means you need to experiment and test with the _update API.
I suggest looking at https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html#upserts. If the document exists, the script is executed (where you can check the @timestamp if you want); otherwise the content of upsert is indexed as a new document.
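For what it's worth, here is a rough, untested sketch of how that could look from the Logstash side, using the elasticsearch output's scripting options. It assumes the event is exposed to the script as params.event (the default script_var_name) and that both @timestamp values are ISO8601 strings in the same timezone, so a plain string comparison is enough:

output {
  if ("cloned" in [tags]) {
    elasticsearch {
      action => "update"
      document_id => "%{fingerprint}"
      index => "lastlogin-%{+YYYY.MM}"
      scripted_upsert => true
      script_lang => "painless"
      script_type => "inline"
      # overwrite the stored document only when the incoming @timestamp is newer;
      # otherwise tell Elasticsearch to skip the update (ctx.op = 'none')
      script => "
        if (ctx._source['@timestamp'] == null || params.event['@timestamp'].compareTo(ctx._source['@timestamp']) > 0) {
          ctx._source.putAll(params.event);
        } else {
          ctx.op = 'none';
        }
      "
    }
  }
}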

use field in index name for elasticsearch plugin logstash

I am trying to have the Elasticsearch index based on a field, so I can get an index for each source (allowing for secure access per index).
I tried something along the lines of
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => [SERVER]"-%{+YYYY.MM.dd}"
  }
}
as well as
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "[SERVER]-%{+YYYY.MM.dd}"
  }
}
Neither works: the first errors out, and the second tries to create an index with [SERVER] in the name and then errors because of the uppercase characters. This might not be supported, as I can't find it anywhere in the docs, but I was wondering if anyone has gotten something like this working for their own ELK stack?
The right syntax for this is "%{SERVER}-%{+YYYY.MM.dd}".
According to the documentation:
[The index to write] can be dynamic using the %{foo} syntax.
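Putting it together, a sketch that also works around the uppercase problem mentioned in the question; it assumes the field is literally named SERVER, and lowercases it first because Elasticsearch index names must be lowercase:

filter {
  mutate {
    # index names must be lowercase
    lowercase => [ "SERVER" ]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "%{SERVER}-%{+YYYY.MM.dd}"
  }
}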
