How to add an auto-remove field in a Logstash filter - Elasticsearch

I am trying to add a _ttl field in Logstash so that Elasticsearch removes the document after a while, 120 seconds in this case, but that's just for testing.
filter {
  if "drop" in [message] {
    drop { }
  }
  add_field => { "_ttl" => "120s" }
}
But now nothing is logged in Elasticsearch.
I have two questions:
Where is it logged what is going wrong? Maybe the syntax of the filter is wrong.
How do I add a TTL field to Elasticsearch for auto removal?

When you add the field via a mutate filter in logstash.conf, it works:
filter {
  mutate {
    add_field => { "_ttl" => "120s" }
  }
}
POST myindex/_search
{
  "query": {
    "match_all": {}
  }
}
Results:
"hits": [
{
"_index": "myindex",
...................
"_ttl": "120s",
For the other question, I can't really help there. I'm running Logstash as a container, so the logs are read with:
docker logs d492eb3c3d0d
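As for the filter syntax: add_field is an option of a filter plugin such as mutate, not a filter of its own, so placing it directly inside the filter block should cause Logstash to reject the configuration at startup, and that error is what shows up in the Logstash log (or in the docker logs output above). A rough sketch of the intended filter, keeping the drop conditional and wrapping add_field in mutate:
filter {
  # drop events whose message contains "drop"
  if "drop" in [message] {
    drop { }
  }
  # add_field has to live inside a filter plugin such as mutate
  mutate {
    add_field => { "_ttl" => "120s" }
  }
}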

Related

ElasticSearch: populating ip_range type field via logstash

I'm experimenting with the ip_range field type in Elasticsearch 6.8 (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/range.html) and am struggling to find a way to load IP data into the field properly via Logstash.
I was able to load some sample data via Kibana Dev Tools, but cannot figure out a way to do the same via logstash.
Index definition
PUT test_ip_range
{
  "mappings": {
    "_doc": {
      "properties": {
        "ip_from_to_range": {
          "type": "ip_range"
        },
        "ip_from": {
          "type": "ip"
        },
        "ip_to": {
          "type": "ip"
        }
      }
    }
  }
}
Add sample doc:
PUT test_ip_range/_doc/3
{
  "ip_from_to_range": {
    "gte": "<dotted_ip_from>",
    "lte": "<dotted_ip_to>"
  }
}
Logstash config (reading from DB)
input {
  jdbc {
    ...
    statement => "SELECT ip_from, ip_to, <???> AS ip_from_to_range FROM sample_ip_data"
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    "hosts" => "<host>"
    "index" => "test_ip_range"
    "document_type" => "_doc"
  }
}
Question:
How do I get the ip_from and ip_to DB fields into their respective gte and lte parts of ip_from_to_range via the Logstash config?
I know I can also insert the ip range in CIDR notation, but would like to be able to have both options - loading in CIDR notation and loading as a range.
After some trial and error, I finally figured out the Logstash config.
I had posted about a similar issue here, which finally got me on the right track with the syntax for this use case as well.
input { ... }
filter {
  mutate {
    add_field => {
      "[ip_from_to_range]" =>
        '{
          "gte": "%{ip_from}",
          "lte": "%{ip_to}"
        }'
    }
  }
  json {
    source => "ip_from_to_range"
    target => "ip_from_to_range"
  }
}
output { ... }
Filter parts explained
mutate add_field: creates a new field [ip_from_to_range] whose value is a JSON string ( '{...}' ). It is important to reference the field as [field_name], otherwise the next step of parsing the string into a JSON object doesn't work.
json: parses the string representation into a JSON object
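On the second part of the question (keeping the CIDR option open as well): the ip_range type also accepts CIDR strings directly, so a CIDR value coming from the DB could be copied straight into the range field without the json step. A rough sketch, assuming a hypothetical cidr column in the SELECT:
filter {
  # "cidr" is a placeholder column name holding a value like "10.0.0.0/24";
  # ip_range fields accept CIDR notation as a plain string
  if [cidr] {
    mutate {
      add_field => { "ip_from_to_range" => "%{cidr}" }
    }
  }
}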

Elasticsearch monitoring search queries

For more than a week I have been struggling to log information about the queries I run into an Elasticsearch index, so I can compare performance between different types of queries. I have configured this config file in the Logstash home directory:
input {
  beats {
    port => 5044
  }
}
filter {
  if "search" in [request] {
    grok {
      match => { "request" => ".*\n\{(?<query_body>.*)" }
    }
    grok {
      match => { "path" => "\/(?<index>.*)\/_search" }
    }
    if [index] {
    } else {
      mutate {
        add_field => { "index" => "All" }
      }
    }
    mutate {
      update => { "query_body" => "{%{query_body}" }
    }
  }
}
output {
  if "search" in [request] and "ignore_unmapped" not in [query_body] {
    elasticsearch {
      hosts => "http://localhost:9200"
    }
  }
}
I also installed and configured packetbeat.yml, with the Logstash hosts set to: http://localhost:9200
The tutorial that I have followed mentions that after starting Packetbeat it will listen for packets on 9200, send them to Logstash, and from there to the monitoring Elasticsearch cluster, where they will be indexed in indices like logstash-2016.05.24. But these indices do not exist.
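One thing worth double-checking (a guess from the config above, not a confirmed fix): the beats input listens on port 5044, so packetbeat.yml's Logstash output would normally point at localhost:5044 rather than the Elasticsearch port 9200. Also, the logstash-2016.05.24 style indices come from the default index pattern of the elasticsearch output; it can be set explicitly to make that visible, e.g.:
output {
  if "search" in [request] and "ignore_unmapped" not in [query_body] {
    elasticsearch {
      hosts => "http://localhost:9200"
      # this is the plugin's default daily pattern, spelled out for clarity
      index => "logstash-%{+YYYY.MM.dd}"
    }
  }
}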

Logstash - elasticsearch getting only new data

I want to run a Logstash process that grabs real-time data with a certain value in a field and outputs it to the screen. So far I've come up with this configuration:
input {
  elasticsearch {
    hosts => "localhost"
    user => "logstash"
    password => "logstash"
    size => 100
    query => '{ "query" : { "bool" : { "must" : { "bool" : { "should" : [ {"match": {"field": "value2"}}, {"match": {"field": "value1"}} ] } } } } }'
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
What I've learned from running this config is that:
Logstash outputs the data in batches, whose size is determined by the size parameter.
There's a delay of a few seconds between each batch.
Logstash grabs the existing data first.
My question: is there any configuration that can tune the process so that Logstash will only listen for new data and output it as soon as it comes into Elastic? Any help would be appreciated.
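Not an authoritative answer, but a common workaround is to poll on a schedule and restrict the query to documents newer than the polling interval, so each run only picks up fresh data. A rough sketch, assuming the documents carry an @timestamp field and that the installed elasticsearch input supports the schedule option:
input {
  elasticsearch {
    hosts => "localhost"
    user => "logstash"
    password => "logstash"
    size => 100
    # run the query once a minute
    schedule => "* * * * *"
    # only fetch documents indexed in roughly the last minute
    query => '{ "query": { "bool": { "must": [ { "range": { "@timestamp": { "gte": "now-1m" } } }, { "bool": { "should": [ { "match": { "field": "value1" } }, { "match": { "field": "value2" } } ] } } ] } } }'
  }
}
This still isn't true streaming; for push-style delivery the data would have to reach Logstash before it lands in Elasticsearch (for example via a beats or queue input).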

How to leverage Logstash to index data without generating extra fields from Logstash

I am testing Elasticsearch to handle around 1 billion small docs (only 8 fields). When I use Logstash to index data, it adds other fields like "message", "@version", and "@timestamp" that are not useful in my case and seem to consume a lot of document size. Is there a way to only index the fields defined in the configuration?
Yes, simply add the following mutate filter in your Logstash configuration:
filter {
  mutate {
    remove_field => [ "@version", "@timestamp", "message" ]
  }
}
Yes, you can add and remove fields. To remove fields, use the following snippet in your conf file:
filter {
  mutate {
    remove_field => [ "@timestamp", "message", "@version" ]
  }
}
To add a new field, use the following snippet:
filter {
  mutate {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
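If the goal is strictly "index only the fields defined in the configuration", the prune filter (logstash-filter-prune) can whitelist the wanted fields instead of enumerating everything to drop. A rough sketch with placeholder field names:
filter {
  prune {
    # keep only the listed fields and drop everything else (names are regexes)
    whitelist_names => [ "^field1$", "^field2$", "^field3$" ]
  }
}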

ELK for Windows logs processing

I've made a working ELK stack on Debian Wheezy and have set up NXLog to gather Windows logs. I see the logs in Kibana - everything is working fine, but I get too much data and want to filter it by removing some fields that I don't need.
I've made a filter section, but it's not working at all. What can be the reason?
Here is the config with the filter:
input {
  tcp {
    type => "eventlog"
    port => 3515
    format => "json"
  }
}
filter {
  type => "eventlog"
  mutate {
    remove => { "Hostname", "Keywords", "SeverityValue", "Severity", "SourceName", "ProviderGuid" }
    remove => { "Version", "Task", "OpcodeValue", "RecordNumber", "ProcessID", "ThreadID", "Channel" }
    remove => { "Category", "Opcode", "SubjectUserSid", "SubjectUserName", "SubjectDomainName" }
    remove => { "SubjectLogonId", "ObjectType", "IpPort", "AccessMask", "AccessList", "AccessReason" }
    remove => { "EventReceivedTime", "SourceModuleName", "SourceModuleType", "@version", "type" }
    remove => { "_index", "_type", "_id", "_score", "_source", "KeyLength", "TargetUserSid" }
    remove => { "TargetDomainName", "TargetLogonId", "LogonType", "LogonProcessName", "AuthenticationPackageName" }
    remove => { "LogonGuid", "TransmittedServices", "LmPackageName", "ProcessName", "ImpersonationLevel" }
  }
}
output {
  elasticsearch {
    cluster => "wisp"
    node_name => "io"
  }
}
I think you are trying to remove fields that do not exist in some logs.
Do all your logs contain all the fields you're trying to remove?
If not, you have to identify your logs before removing fields.
Your filter config will look like this:
filter {
  if [type] == "eventlog" {
    if [somefield] == "somevalue" {
      mutate {
        remove_field => [ "specificfieldtoremove1", "specificfieldtoremove2" ]
      }
    }
  }
}
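To see exactly which fields a given event carries before deciding what to remove, a temporary debug output can help; a minimal sketch using the rubydebug codec:
output {
  # print every event with all of its fields to the console
  stdout { codec => rubydebug }
}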
