How can i add extra fields in ELK Kibana - elasticsearch

I am using ELK with kibana.
I am also using filebeat for sending data to Logstash.
The i have created look like this
{
"mappings": {
"_default_": {
"properties": {
"msg":{"type":"string", "index":"not_analyzed"}
}
},
"log": {
"properties": {
"#timestamp":{"type":"date","format":"strict_date_optional_time||epoch_millis"},
"#version":{"type":"string"},
"beat": {
"properties": {
"hostname":{"type":"string"},
"name":{"type":"string"},
}
},
"count":{"type":"long"},
"host":{"type":"string"},
"input_type":{"type":"string"},
"message":{"type":"string"},
"msg":{"type":"string","index":"not_analyzed"},
"offset":{"type":"long"},
"source":{"type":"string"},
"type":{"type":"string"}
}
}
}
}';
I want to know that just like beat has 2 fields like hostname and name. Is it possible to have add more fields like environment: dev which i can see in kibana so that i can filter messages based on that

Yes, you can specify additional fields in your filebeat.yml configuration. Those new fields will be created. You have two options, you can either specify fields and/or fields_under_root.
If you use the former (see below), a new fields subgroup with your custom fields will appear in your document and you will be able to filter messages with fields.environment: dev in Kibana.
fields:
environment: dev
If you use the latter (see below), your custom fields will appear at the top-level in your document and you will be able to filter messages with environment: dev in Kibana.
fields_under_root: true

Related

Elastic Beats - Changing the Field Type of Default Fields in Beats Documents?

I'm still fairly new to the Elastic Stack and I'm still not seeing the entire picture from what I'm reading on this topic.
Let's say I'm using the latest versions of Filebeat or Metricbeat for example, and pushing that data to Logstash output, (which is then configured to push to ES). I want an "out of the box" field from one of these beats to have its field type changed (example: change beat.hostname from it's current default "text" type to "keyword"), what is the best place/practice for configuring this? This kind of change is something I would want consistent across multiple hosts running the same Beat.
I wouldn't change any existing fields since Kibana is building a lot of visualizations, dashboards, SIEM,... on the exptected fields + data types.
Instead extend (add, don't change) the default mapping if needed. On top of the default index template, you can add your own and they will be merged. Adding more fields will require some more disk space (and probably memory when loading), but it should be manageable and avoids a lot of drawbacks of other approaches.
Agreed with #xeraa. It is not advised to change the default template since that field might be used in any default visualizations.
Create a new template, you can have multiple templates for the same index pattern. All the mappings will be merged.The order of the merging can be controlled using the order parameter, with lower order being applied first, and higher orders overriding them.
For your case, probably create a multi-field for any field that needs to be changed. Eg: As shown here create a new keyword multifield, then you can refer the new field as
fieldname.raw
.
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
The other answers are correct but I did the below in Dev console to update the message field from text to text & keyword
PUT /index_name/_mapping
{
"properties": {
"message": {
"type": "match_only_text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 10000
}
}
}
}
}

Elasticsearch - Enable fulltext search of field

I have run into a brick wall considering searching in my logged events. I am using an elasticsearch solution, filebeat to load messages from logs to elasticsearch, and Kibana front end.
I currently log the messages into a field message and exception stacktrace (if present) into error.message. So the logged event's snippet may look like:
{
"message": "Thrown exception: CustomException (Exception for testing purposes)"
"error" : {
"message" : "com.press.controller.CustomException: Exception for testing purposes\n at
com.press.controller....<you get the idea at this point>"
}
}
Of course there are other fields like timestamp, but those are not important. What is important is this:
When I search message : customException, I can find the events I logged. When I search error.message : customException, I do not get the events. I need to be able to fulltext search all fields.
Is there a way how to tell elasticsearch to enable the fulltext search in the fields?
And why has the "message" field enabled it by default? None of my colleagues are aware that any indexing command was run on the field in the console after deployment and our privileges do not allow me or other team members to run indexing or analysis commands on any field. So it has to be in the config somewhere.
So far I was unable to find the solution. Please push me in the right direction.
Edit:
The config of fields is as follows:
We use a modified ECS, and both messages are declared as
level: core
type: text
in file fields.yml.
in filebeat, the config snippet is as such:
filebeat.inputs:
- type: log
enabled: true
paths: .....
...
...
processors:
- rename:
fields:
- from: "msg"
to: "message"
- from: "filepath"
to: "log.file.name"
- from: "ex"
to: "error.message"
ignore_missing: true
fail_on_error: true
logging.level: debug
logging.to_files: true
For security requirements, I cannot disclose full files. Also, I need to write all the snippets by hand, so misspells are probably my fault.
Thanks
Problem is with the analyzer associated with your field, by default for text fields in ES, standard analyzer is used which doesn't create separate tokens if text contains . for ex: foo.bar would result in just 1 token as foo.bar while if you want both foo and bar should match in foo.bar then you need to genrate 2 tokens as foo and bar.
What you need is a custom analyzer which creates token as above as your error.message text contains . which I explained in my example:
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"char_filter": ["replace_dots"]
}
},
"char_filter": {
"replace_dots": {
"type": "mapping",
"mappings": [
". => \\u0020"
]
}
}
}
}
}
POST /my_index/_analyze
{
"analyzer": "my_analyzer",
"text": "foo.bar"
}
The above example creates 2 tokens as foo and bar and same should happen with you when you create and test it with these API.
Let me know if you face any issue with it.
Elastic Search indexes all fields by default, here you did not define the mapping hence all fields should be indexed by default.
Also for your case I doubt if the data is properly going in elastic search as the log doesn't seem to be proper json.
Do you see proper logs in Kibana if yes please send a sample log/screenshot

How to extract and visualize values from a log entry in OpenShift EFK stack

I have an OKD cluster setup with EFK stack for logging, as described here. I have never worked with one of the components before.
One deployment logs requests that contain a specific value that I'm interested in. I would like to extract just this value and visualize it with an area map in Kibana that shows the amount of requests and where they come from.
The content of the message field basically looks like this:
[fooServiceClient#doStuff] {"somekey":"somevalue", "multivalue-key": {"plz":"12345", "foo": "bar"}, "someotherkey":"someothervalue"}
This plz is a German zip code, which I would like to visualize as described.
My problem here is that I have no idea how to extract this value.
A nice first success would be if I could find it with a regexp, but Kibana doesn't seem to work the way I think it does. Following its docs, I expect this /\"plz\":\"[0-9]{5}\"/ to deliver me the result, but I get 0 hits (time interval is set correctly). Even if this regexp matches, I would only find the log entry where this is contained and not just the specifc value. How do I go on here?
I guess I also need an external geocoding service, but at which point would I include it? Or does Kibana itself know how to map zip codes to geometries?
A beginner-friendly step-by-step guide would be perfect, but I could settle for some hints that guide me there.
It would be possible to parse the message field as the document gets indexed into ES, using an ingest pipeline with grok processor.
First, create the ingest pipeline like this:
PUT _ingest/pipeline/parse-plz
{
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{POSINT:plz}"
]
}
}
]
}
Then, when you index your data, you simply reference that pipeline:
PUT plz/_doc/1?pipeline=parse-plz
{
"message": """[fooServiceClient#doStuff] {"somekey":"somevalue", "multivalue-key": {"plz":"12345", "foo": "bar"}, "someotherkey":"someothervalue"}"""
}
And you will end up with a document like the one below, which now has a field called plz with the 12345 value in it:
{
"message": """[fooServiceClient#doStuff] {"somekey":"somevalue", "multivalue-key": {"plz":"12345", "foo": "bar"}, "someotherkey":"someothervalue"}""",
"plz": "12345"
}
When indexing your document from Fluentd, you can specify a pipeline to be used in the configuration. If you can't or don't want to modify your Fluentd configuration, you can also define a default pipeline for your index that will kick in every time a new document is indexed. Simply run this on your index and you won't need to specify ?pipeline=parse-plz when indexing documents:
PUT index/_settings
{
"index.default_pipeline": "parse-plz"
}
If you have several indexes, a better approach might be to define an index template instead, so that whenever a new index called project.foo-something is created, the settings are going to be applied:
PUT _template/project-indexes
{
"index_patterns": ["project.foo*"],
"settings": {
"index.default_pipeline": "parse-plz"
}
}
Now, in order to map that PLZ on a map, you'll first need to find a data set that provides you with geolocations for each PLZ.
You can then add a second processor in your pipeline in order to do the PLZ/ZIP to lat,lon mapping:
PUT _ingest/pipeline/parse-plz
{
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{POSINT:plz}"
]
}
},
{
"script": {
"lang": "painless",
"source": "ctx.location = params[ctx.plz];",
"params": {
"12345": {"lat": 42.36, "lon": 7.33}
}
}
}
]
}
Ultimately, your document will look like this and you'll be able to leverage the location field in a Kibana visualization:
{
"message": """[fooServiceClient#doStuff] {"somekey":"somevalue", "multivalue-key": {"plz":"12345", "foo": "bar"}, "someotherkey":"someothervalue"}""",
"plz": "12345",
"location": {
"lat": 42.36,
"lon": 7.33
}
}
So to sum it all up, it all boils down to only two things:
Create an ingest pipeline to parse documents as they get indexed
Create an index template for all project* indexes whose settings include the pipeline created in step 1

Specifying Field Types Indexing from Logstash to Elasticsearch

I have successfully ingested data using the XML filter plugin from Logstash to Elasticsearch, however all the field types are of the type "text."
Is there a way to manually or automatically specify the correct type?
I found the following technique good for my use:
Logstash would filter the data and change a field from the default - text to whatever form you want. The documentation would be found here. The example given in the documentation is:
filter {
mutate {
convert => { "fieldname" => "integer" }
}
}
This you add in the /etc/logstash/conf.d/02-... file in the body part. I believe the downside of this practice is that from my understanding it is less recommended to alter data entering the ES.
After you do this you will probably get the this problem. If you have this problem and your DB is a test DB that you can erase all old data just DELETE the index until now that there would not be a conflict (for example you have a field that was until now text and now it is received as date there would be a conflict between old and new data). If you can't just erase the old data then read into the answer in the link I linked.
What you want to do is specify a mapping template.
PUT _template/template_1
{
"index_patterns": ["te*", "bar*"],
"settings": {
"number_of_shards": 1
},
"mappings": {
"type1": {
"_source": {
"enabled": false
},
"properties": {
"host_name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z YYYY"
}
}
}
}
}
Change the settings to match your needs such as listing the properties to map what you want them to map to.
Setting index_patterns is especially important because it tells elastic how to apply this template. You can set an array of index patterns and can use * as appropriate for wildcards. i.e logstash's default is to rotate by date. They will look like logstash-2018.04.23 so your pattern could be logstash-* and any that match the pattern will receive the template.
If you want to match based on some pattern, then you can use dynamic templates.
Edit: Adding a little update here, if you want logstash to apply the template for you, here is a link to the settings you'll want to be aware of.

Change geoip.location mapping from number to geo_point

I am using a logstash filter to convert my filebeat IIS logs into location:
filter {
geoip {
source => "clienthost"
}
}
But the data type in elasticSearch is:
geoip.location.lon = NUMBER
geoip.location.lat = NUMBER
But in order to map points, I need to have
geoip.location = GEO_POINT
Is there a way to change the mapping?
I tried posting a changed mapping
sudo curl -XPUT "http://localhost:9200/_template/filebeat" -d#/etc/filebeat/geoip-mapping-new.json
with a new definition but it's not making a difference:
{
"mappings": {
"geoip": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
},
"template": "filebeat-*"
}
Edit: I've tried this with both ES/Kiabana/Logstash 5.6.3 and 5.5.0
This is not a solution but I deleted all the data and reinstalled ES, Kiabana, Logstash and Filebeat 5.5
And now ES recognizes location as a geopoint - I guess previously even though I had changed the data mapping, there was still data that was mapped incorrectly and Kibana was assuming the incorrect data type - probably a reindex of the complete data would have fixed the problem

Resources