ElasticSearch: populating ip_range type field via logstash - elasticsearch

I'm experimenting with the ip_range field type in ElasticSearch 6.8 (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/range.html) and struggle to find a way to load ip data into the field properly via logstash
I was able to load some sample data via Kibana Dev Tools, but cannot figure out a way to do the same via logstash.
Index definition
PUT test_ip_range
{
"mapping": {
"_doc": {
"properties": {
"ip_from_to_range": {
"type": "ip_range"
},
"ip_from": {
"type": "ip"
},
"ip_to": {
"type": "ip"
}
}
}
}
}
Add sample doc:
PUT test_ip_range/_doc/3
{
"ip_from_to_range" :
{
"gte" : "<dotted_ip_from>",
"lte": "<dotted_ip_to>"
}
}
Logstash config (reading from DB)
input {
jdbc {
...
statement => "SELECT ip_from, ip_to, <???> AS ip_from_to_range FROM sample_ip_data"
}
}
output {
stdout { codec => json_lines }
elasticsearch {
"hosts" => "<host>"
"index" => "test_ip_range"
"document_type" => "_doc"
}
}
Question:
How do I get ip_from and ip_to DB fields into their respective gte and lte parts of the ip_from_to_range via logstash config??
I know I can also insert the ip range in CIDR notation, but would like to be able to have both options - loading in CIDR notation and loading as a range.

After some trial and error, finally figured out the logstash config.
I had posted about a similar issue here, which finally got me on the right track with the syntax for this use case as well.
input { ... }
filter {
mutate {
add_field => {
"[ip_from_to_range]" =>
'{
"gte": "%{ip_from}",
"lte": "%{ip_to}"
}'
}
}
json {
source => "ip_from_to_range"
target => "ip_from_to_range"
}
}
output { ... }
Filter parts explained
mutate add_field: create a new field [ip_from_to_range] with its value being a json string ( '{...}' ). It is important to have the field as [field_name], otherwise the next step to parse the string into json object doesn't work
json: parse the string representation into a json object

Related

how to add auto remove field in logstash filter

I am trying to add a _ttl field in logstash so that elasticsearch removes the document after a while, 120 seconds in this case but that's for testing.
filter {
if "drop" in [message] {
drop { }
}
add_field => { "_ttl" => "120s" }
}
but now nothing is logged in elasticsearch.
I have 2 questions.
Where is logged what is going wrong, maybe the syntax of the filter is wrong?
How do I add a ttl field to elasticsearch for auto removal?
When you add a filter to logstash.conf with a mutator it works:
filter {
mutate {
add_field => { "_ttl" => "120s" }
}
}
POST myindex/_search
{
"query": {
"match_all": {}
}
}
Results:
"hits": [
{
"_index": "myindex",
...................
"_ttl": "120s",
For the other question, cant really help there. Im running logstash as container so logging is read with:
docker logs d492eb3c3d0d

How can I configure a custom field to be aggregatable in Kibana?

I am new to running the ELK stack. I have Logstash configured to feed my webapp log into Elasticsearch. I am trying to set up a visualization in Kibana that will show the count of unique users, given by the user_email field, which is parsed out of certain log lines.
I am fairly sure that I want to use the Unique Count aggregation, but I can't seem to get Kibana to include user_email in the list of fields which I can aggregate.
Here is my Logstash configuration:
filter {
if [type] == "wl-proxy-log" {
grok {
match => {
"message" => [
"(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+%{DATA:log_message}\s+license=%{WORD:license}\&user=(?<user_email>%{USERNAME}\#%{URIHOST})\&files=%{WORD:files}",
]
}
break_on_match => true
}
date {
match => [ "syslog_datetime", "MMM dd HH:mm:ss yyyy", "MMM d HH:mm:ss yyyy" ]
target => "#timestamp"
locale => "en_US"
timezone => "America/Los_Angeles"
}
kv {
source => "uri_params"
field_split => "&?"
}
}
}
output {
elasticsearch {
ssl => false
index => "wl-proxy"
manage_template => false
}
}
Here is the relevant mapping in Elasticsearch:
{
"wl-proxy" : {
"mappings" : {
"wl-proxy-log" : {
"user_email" : {
"full_name" : "user_email",
"mapping" : {
"user_email" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
Can anyone tell me what I am missing?
BTW, I am running CentOS with the following versions:
Elasticsearch Version: 6.0.0, Build: 8f0685b/2017-11-10T18:41:22.859Z, JVM: 1.8.0_151
Logstash v.6.0.0
Kibana v.6.0.0
Thanks!
I figured it out. The configuration was correct, AFAICT. The issue was that I simply hadn't refreshed the list of fields in the index in the Kibana UI.
Management -> Index Patterns -> Refresh Field List (the refresh icon)
After doing that, the field began appearing in the list of aggregatable terms, and I was able to create the necessary visualizations.

reindex while converting a string value of a specific field (present in old index) into a number field value (in the new index)

Could I ask, how could I reindex while converting a 'string' field e.g. "field2": "123.2" (in old index documents) into a float/double number e.g. "field2": 123.2 (intended to be in the new index) ? This post is the closest I could get, but I do not know which function to use for the cast/conversion of a string to a number. I am using ElasticSearch version 2.3.3. Thank you very much for any advice !!!
You could use Logstash to reindex your data and convert the field. Something like the following:
input {
elasticsearch {
hosts => "es.server.url"
index => "old_index"
query => "*"
size => 500
scroll => "5m"
docinfo => true
}
}
filter {
mutate {
convert => { "fieldname" => "long" }
}
}
output {
elasticsearch {
host => "es.server.url"
index => "new_index"
index_type => "%{[#metadata][_type]}"
document_id => "%{[#metadata][_id]}"
}
}
Use Elasticsearch templates to specify the mapping for the new index and specify the field as a double type.
The easiest way to build a template is to use the existing mapping.
GET oldindex/_mapping
POST _template/templatename
{
"template" : "newindex", // this can be a wildcard pattern to match indexes
"mappings": { // this is copied from the response of the previous call
"mytype": {
"properties": {
"field2": {
"type": "double" // change the type
}
}
}
}
}
POST newindex
GET newindex/_mapping
Then use the elasticsearch _reindex API to move the data from the old index to the new index and parse the field as a double using an inline scripting (you may need to enable inline scripting)
POST _reindex
{
"source": {
"index": "oldindex"
},
"dest": {
"index": "newindex"
},
"script": {
"inline": "ctx._source.field2 = ctx._source.field2.toDouble()"
}
}
Edit: Updated to use _reindex endpoint

Converting nginx access log bytes to number in Kibana4

I would like to create a visualization of the sum of bytes sent using the data from my nginx access logs. When trying to create a "Metric" visualization, I can't use the bytes field as a sum because it is a string type.
And I'm not able to change it under settings.
How do I go about changing this field type to a number/bytes type?
Here is my logstash config for nginx access logs
filter {
if [type] == "nginx-access" {
grok {
match => { "message" => "%{NGINXACCESS}" }
}
geoip {
source => "clientip"
}
useragent {
source => "agent"
target => "useragent"
}
}
}
Since each logstash index is being created as an index, I'm guess I need to change it here.
I tried adding
mutate {
convert => { "bytes" => "integer" }
}
But it doesn't seem to make a difference.
Field types are configured using mappings, which is configured at the index level and can hardly change. With Logstash, as a new index is created everyday, so if you wan't to change these mappings either wait for the next day or delete the current index if you can.
By default these mappings are generated automatically by Elasticsearch depending on the syntax of the indexed JSON document and the applied Index Templates:
# Type String
{"bytes":"123"}
# Type Integer
{"bytes":123}
In the end there are 2 solutions:
Tune Logstash, to make it generate an integer and let Elasticsearch guess the field type → Use the mutate/convert filter
Tune Elasticsearch, to force the field bytes for the document type nginx-access to be of type integer → Use Index Template:
Index Template API:
PUT _template/logstash-nginx-access
{
"order": 1,
"template": "logstash-*",
"mappings": {
"nginx-access": {
"properties": {
"bytes": {
"type": "integer"
}
}
}
}
}

How to stop logstash from creating a default mapping in ElasticSearch

I am using logstash to feed logs into ElasticSearch.
I am configuring logstash output as:
input {
file {
path => "/tmp/foo.log"
codec =>
plain {
format => "%{message}"
}
}
}
output {
elasticsearch {
#host => localhost
codec => json {}
manage_template => false
index => "4glogs"
}
}
I notice that as soon as I start logstash it creates a mapping ( logs ) in ES as below.
{
"4glogs": {
"mappings": {
"logs": {
"properties": {
"#timestamp": {
"type": "date",
"format": "dateOptionalTime"
},
"#version": {
"type": "string"
},
"message": {
"type": "string"
}
}
}
}
}
}
How can I prevent logstash from creating this mapping ?
UPDATE:
I have now resolved this error too. "object mapping for [logs] tried to parse as object, but got EOF, has a concrete value been provided to it?"
As John Petrone has stated below, once you define a mapping, you have to ensure that your documents conform to the mapping. In my case, I had defined a mapping of "type: nested" but the output from logstash was a string.
So I removed all codecs ( whether json or plain ) from my logstash config and that allowed the json document to pass through without changes.
Here is my new logstash config ( with some additional filters for multiline logs ).
input {
kafka {
zk_connect => "localhost:2181"
group_id => "logstash_group"
topic_id => "platform-logger"
reset_beginning => false
consumer_threads => 1
queue_size => 2000
consumer_id => "logstash-1"
fetch_message_max_bytes => 1048576
}
file {
path => "/tmp/foo.log"
}
}
filter {
multiline {
pattern => "^\s"
what => "previous"
}
multiline {
pattern => "[0-9]+$"
what => "previous"
}
multiline {
pattern => "^$"
what => "previous"
}
mutate{
remove_field => ["kafka"]
remove_field => ["#version"]
remove_field => ["#timestamp"]
remove_tag => ["multiline"]
}
}
output {
elasticsearch {
manage_template => false
index => "4glogs"
}
}
You will need a mapping to store data in Elasticsearch and to search on it - that's how ES knows how to index and search those content types. You can either let logstash create it dynamically or you can prevent it from doing so and instead create it manually.
Keep in mind you cannot change existing mappings (although you can add to them). So first off you will need to delete the existing index. You would then modify your settings to prevent dynamic mapping creation. At the same time you will want to create your own mapping.
For example, this will create the mappings for the logstash data but also restrict any dynamic mapping creation via "strict":
$ curl -XPUT 'http://localhost:9200/4glogs/logs/_mapping' -d '
{
"logs" : {
"dynamic": "strict",
"properties" : {
"#timestamp": {
"type": "date",
"format": "dateOptionalTime"
},
"#version": {
"type": "string"
},
"message": {
"type": "string"
}
}
}
}
'
Keep in mind that the index name "4glogs" and the type "logs" need to match what is coming from logstash.
For my production systems I generally prefer to turn off dynamic mapping as it avoids accidental mapping creation.
The following links should be useful if you want to make adjustments to your dynamic mappings:
https://www.elastic.co/guide/en/elasticsearch/guide/current/dynamic-mapping.html
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/dynamic-mapping.html
logs in this case is the index_type. If you don't want to create it as logs, specify some other index_type on your elasticsearch element. Every record in elasticsearch is required to have an index and a type. Logstash defaults to logs if you haven't specified it.
There's always an implicit mapping created when you insert records into Elasticsearch, so you can't prevent it from being created. You can create the mapping yourself before you insert anything (via say a template mapping).
The setting manage_template of false just prevents it from creating the template mapping for the index you've specified. You can delete the existing template if it's already been created by using something like curl -XDELETE http://localhost:9200/_template/logstash?pretty
Index templates can help you. Please see this jira for more details. You can create index templates with wildcard support to match an index name and put your default mappings.

Resources