long and float fields showing up as text fields in Kibana - elasticsearch

Running Kibana version 5.5.2.
In my current setup, Logstash takes the logs from Docker containers and runs grok filters before sending them to Elasticsearch. The specific values that I need to show up as long or float come from the timings of AWS calls to ECS and EC2, and a grok filter currently pulls them out. Here is the custom pattern that pulls out the ECS timings:
ECS_DESCRIBE_CONTAINER_INSTANCES (AWS)(%{SPACE})(ecs)(%{SPACE})(%{POSINT})(%{SPACE})(?<ECS_DURATION>(%{NUMBER}))(s)(%{SPACE})(?<ECS_RETRIES>(%{NONNEGINT}))(%{SPACE})(retries)
So I need ECS_DURATION to be a float and ECS_RETRIES to be a long. In the Docker log handler I have the following:
if [ECS_DURATION] {
mutate {
convert => ["ECS_DURATION", "float"]
}
}
if [ECS_RETRIES] {
mutate {
convert => ["ECS_RETRIES", "integer"]
}
}
When I look at the field in Kibana, it still shows as a text field, but when I make the following request to elasticsearch for the mappings, it shows those fields as long and float.
GET /logstash-2020.12.18/_mapping
{
"logstash-2020.12.18": {
"mappings": {
"log": {
"_all": {
"enabled": true,
"norms": false
},
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"match_mapping_type": "string",
"mapping": {
"norms": false,
"type": "text"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"norms": false,
"type": "text"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date",
"include_in_all": false
},
"@version": {
"type": "keyword",
"include_in_all": false
},
"EC2_DURATION": {
"type": "float"
},
"EC2_RETRIES": {
"type": "long"
},
"ECS_DURATION": {
"type": "float"
},
"ECS_RETRIES": {
"type": "long"
},
I even created a custom mapping template in elasticsearch with the following call
PUT /_template/aws_durations?pretty
{
"template": "logstash*",
"mappings": {
"type1": {
"_source": {
"enabled": true
},
"properties": {
"ECS_DURATION": {
"type": "half_float"
},
"ECS_RETRIES": {
"type": "byte"
},
"EC2_DURATION": {
"type": "half_float"
},
"EC2_RETRIES": {
"type": "byte"
}
}
}
}
}

Have you checked that it's actually going into the if [ECS_DURATION] and if [ECS_RETRIES] conditions? (I wasn't able to comment.)

Related

Why after set mapping, index return nothing?

I am using Elasticsearch 7.12.0, Logstash 7.12.0, and Kibana 7.12.0 on Windows 10 x64. Logstash config file logistics.conf:
input {
jdbc {
jdbc_driver_library => "D:\\tools\\postgresql-42.2.16.jar"
jdbc_driver_class => "org.postgresql.Driver"
jdbc_connection_string => "jdbc:postgresql://localhost:5433/ld"
jdbc_user => "xxxx"
jdbc_password => "sEcrET"
schedule => "*/5 * * * *"
statement => "select * from inventory_item_report();"
}
}
filter {
uuid {
target => "uuid"
}
}
output {
elasticsearch {
hosts => "http://localhost:9200"
index => "localdist"
document_id => "%{uuid}"
doc_as_upsert => "true"
}
}
Run logstash
logstash -f logistics.conf
If I don't set the mapping explicitly, the query
GET /localdist/_search
{
"query": {
"match_all": {}
}
}
returns many results.
My mappings
POST localdist/_mapping
{
}
DELETE /localdist
PUT /localdist
{
}
POST /localdist
{
}
PUT localdist/_mapping
{
"properties": {
"unt_cost": {
"type": "double"
},
"ii_typ": {
"type": "keyword"
},
"qty_uom_id": {
"type": "keyword"
},
"prod_id": {
"type": "keyword"
},
"root_cat_id": {
"type": "keyword"
},
"uom": {
"type": "keyword"
},
"product_name": {
"type": "text"
},
"ii_id": {
"type": "keyword"
},
"wght_uom_id": {
"type": "keyword"
},
"iid_seq_id": {
"type": "long"
},
"avai_diff": {
"type": "double"
},
"invt_change_typ": {
"type": "keyword"
},
"ccy": {
"type": "keyword"
},
"exp_date": {
"type": "date"
},
"req_amt": {
"type": "text"
},
"pur_cost": {
"type": "double"
},
"tot_pri": {
"type": "long"
},
"own_pid": {
"type": "keyword"
},
"doc_type": {
"type": "keyword"
},
"ii_date": {
"type": "date"
},
"fac_id": {
"type": "keyword"
},
"shipment_type_id": {
"type": "keyword"
},
"lot_id": {
"type": "keyword"
},
"phy_invt_id": {
"type": "keyword"
},
"facility_name": {
"type": "text"
},
"amt_ohand_diff": {
"type": "double"
},
"reason_id": {
"type": "keyword"
},
"cat_id": {
"type": "keyword"
},
"qty_ohand_diff": {
"type": "double"
},
"@timestamp": {
"type": "date"
}
}
}
After that, running the query
GET /localdist/_search
{
"query": {
"match_all": {}
}
}
returns nothing.
How can I fix this and make explicit mappings work correctly?
If I understand you correctly, you are indexing via Logstash. Elasticsearch then creates the index if it is missing, indexes the documents, and guesses the mapping from the first documents it sees.
TL;DR: You are deleting the index that contains your data yourself.
With
DELETE /localdist
you are deleting the whole index including all data. After that, by issuing
PUT /localdist
{
}
you are re-creating the previously deleted index which is empty again. And at the end, you are setting the index mapping with
PUT localdist/_mapping
{
"properties": {
"unt_cost": {
"type": "double"
},
"ii_typ": {
"type": "keyword"
},
...
Now that you have an empty index with a mapping set, start the Logstash pipeline again. If your documents match the index mapping, they should start to appear very quickly.
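As a minimal sketch of the same idea in a single step (assuming Elasticsearch 7.x as in the question, Logstash stopped, and no existing localdist index), the mapping can be supplied in the body of the index-creation request itself, sent as PUT /localdist. Only three of the fields from the question are shown here; the rest would be listed the same way.

```json
{
  "mappings": {
    "properties": {
      "unt_cost": { "type": "double" },
      "ii_typ": { "type": "keyword" },
      "exp_date": { "type": "date" }
    }
  }
}
```

GET /localdist/_mapping can then be used to confirm the types before the pipeline is restarted.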

Elasticsearch query for multiple terms

I am trying to create a search query that allows searching by name and type.
I have indexed the values, and a record in Elasticsearch looks like this:
{
_index: "assets",
_type: "asset",
_id: "eAOEN28BcFmQazI-nngR",
_score: 1,
_source: {
name: "test.png",
mediaType: "IMAGE",
meta: {
content-type: "image/png",
width: 3348,
height: 1890,
},
createdAt: "2019-12-24T10:47:15.727Z",
updatedAt: "2019-12-24T10:47:15.727Z",
}
}
So how would I create, for example, a query that finds all assets that have the name "test" and are images?
I tried a multi_match query, but that did not return the correct results:
{
"query": {
"multi_match" : {
"query": "*test* IMAGE",
"type": "cross_fields",
"fields": [ "name", "mediaType" ],
"operator": "and"
}
}
}
The query above returns 0 results, and if I change the operator to "or" it returns all the assets of type IMAGE.
Any suggestions would be greatly appreciated. TIA!
EDIT: Added Mapping
Below is the mapping:
{
"assets": {
"aliases": {},
"mappings": {
"properties": {
"__v": {
"type": "long"
},
"createdAt": {
"type": "date"
},
"deleted": {
"type": "date"
},
"mediaType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"meta": {
"properties": {
"content-type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"width": {
"type": "long"
},
"height": {
"type": "long"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"originalName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"updatedAt": {
"type": "date"
}
}
},
"settings": {
"index": {
"creation_date": "1575884312237",
"number_of_shards": "1",
"number_of_replicas": "1",
"uuid": "nSiAoIIwQJqXQRTyqw9CSA",
"version": {
"created": "7030099"
},
"provided_name": "assets"
}
}
}
}
You are unnecessarily using a wildcard expression for this simple query.
First, change the analyzer on the name field.
You need a custom analyzer that replaces . with a space, because the default standard analyzer doesn't do that. Then a search for test finds test.png, since both test and png will be in the inverted index. The main benefit of doing this is that you avoid regex queries, which are very costly.
Below is the updated mapping with a custom analyzer that does this. Update your mapping and re-index all the documents.
{
"aliases": {},
"mappings": {
"properties": {
"__v": {
"type": "long"
},
"createdAt": {
"type": "date"
},
"deleted": {
"type": "date"
},
"mediaType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"meta": {
"properties": {
"content-type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"width": {
"type": "long"
},
"height": {
"type": "long"
}
}
},
"name": {
"type": "text",
"analyzer" : "my_analyzer"
},
"originalName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"updatedAt": {
"type": "date"
}
}
},
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"char_filter": [
"replace_dots"
]
}
},
"char_filter": {
"replace_dots": {
"type": "mapping",
"mappings": [
". => \\u0020"
]
}
}
},
"index": {
"number_of_shards": "1",
"number_of_replicas": "1"
}
}
}
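To check that the analyzer splits file names as intended, one option is the _analyze API. The body below is a sketch meant to be sent as GET /assets/_analyze, assuming the index is named assets as in the question and has been recreated with the settings above.

```json
{
  "analyzer": "my_analyzer",
  "text": "test.png"
}
```

If the char filter is in effect, the response should list test and png as separate tokens rather than a single test.png token.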
Second, change your query to a bool query as below:
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "test"
}
},
{
"match": {
"mediaType.keyword": "IMAGE"
}
}
]
}
}
}
This uses must with two match queries, which means it returns documents only when every clause of the must query matches.
I already tested my solution by creating the index, inserting a few sample docs, and querying them. Let me know if you need any help.
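A possible variant, not part of the answer above: because mediaType.keyword is an exact-value keyword field, the media type condition can also be written as a term query in filter context, which does not contribute to the relevance score. The body below would be sent as GET /assets/_search.

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "test" } }
      ],
      "filter": [
        { "term": { "mediaType.keyword": "IMAGE" } }
      ]
    }
  }
}
```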
Did you try best_fields?
{
"query": {
"multi_match" : {
"query": "Will Smith",
"type": "best_fields",
"fields": [ "name", "mediaType" ],
"operator": "and"
}
}
}

ElasticSearch overriding mapping from text to object

I am trying to override a mapping for a field.
There is a default index template (which I can't change) and I am overriding it with a custom one.
The default index template maps the "message" field as text, but I need it to be treated as an object with its fields indexable/searchable.
This is the default index template, with order 10.
{
"mappings": {
"_default_": {
"dynamic_templates": [
{
"message_field": {
"mapping": {
"index": true,
"norms": false,
"type": "text"
},
"match": "message",
"match_mapping_type": "string"
}
},
...
],
"properties": {
"message": {
"doc_values": false,
"index": true,
"norms": false,
"type": "text"
},
...
}
}
},
"order": 10,
"template": "project.*"
}
And here's my override:
{
"template" : "project.*",
"order" : 100,
"dynamic_templates": [
{
"message_field": {
"mapping": {
"type": "object"
},
"match": "message"
}
}
],
"mappings": {
"message": {
"enabled": true,
"properties": {
"tag": {"type": "string", "index": "not_analyzed"},
"requestId": {"type": "integer"},
...
}
}
}
}
This works nicely, but I end up having to define every field (tag, requestId, ...) in the "message" object.
Is there a way to make all the fields in the "message" object indexable/searchable?
Here's a sample document:
{
"level": "30",
...
"kubernetes": {
"container_name": "data-sync-server",
"namespace_name": "alitest03",
...
},
"message": {
"tag": "AUDIT",
"requestId": 1234,
...
},
}
...
}
Tried lots of things, but I can't make it work.
I am using ElasticSearch version 2.4.4.
You can use the path_match property in your dynamic mapping, something like:
{
"template": "project.*",
"order": 100,
"mappings": {
"<your document type here>": {
"dynamic_templates": [
{
"message_field": {
"mapping": {
"type": "object"
},
"match": "message"
}
},
{
"message_properties": {
"path_match": "message.*",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
]
}
}
}
But you may have to distinguish between string and numeric fields with match_mapping_type.
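A sketch of that distinction for Elasticsearch 2.4, meant as the body of a PUT to the _template endpoint under a template name of your choice. The document type _default_ and the names message_strings and message_longs are assumptions for illustration, and only strings and whole numbers are handled here.

```json
{
  "template": "project.*",
  "order": 100,
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "message_strings": {
            "path_match": "message.*",
            "match_mapping_type": "string",
            "mapping": { "type": "string", "index": "not_analyzed" }
          }
        },
        {
          "message_longs": {
            "path_match": "message.*",
            "match_mapping_type": "long",
            "mapping": { "type": "long" }
          }
        }
      ]
    }
  }
}
```

Dynamic templates are tried in order and the first match wins, so more specific rules would go before more general ones.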

Some fields missing from the kibana “terms” options

I have a "message" field which is not appearing in kibana when I select aggregation by "Terms" and open the "Field" drop-down.
Sadly I'm a pre-beginner at ES but I'm guessing it might have something to do with the mappings. Here is the beginning of the result of the query "GET logstash-2017.02.17" (does it have something to do with the "message_field" section?):
{
"logstash-2017.02.17": {
"aliases": {},
"mappings": {
"_default_": {
"_all": {
"enabled": true,
"norms": false
},
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"match_mapping_type": "string",
"mapping": {
"norms": false,
"type": "text"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"fields": {
"keyword": {
"type": "keyword"
}
},
"norms": false,
"type": "text"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date",
"include_in_all": false
},
As you may have guessed the data is all coming in via logstash. In most cases the message field seems to be created by default when you use the logstash grok filter. However I have other logstash pipelines where I specifically assign data to the message field.
Answer on the elasticsearch forum.

ElasticSearch Logstash template

I would like to index the SMTP receive log of my Exchange Server with ElasticSearch. I created a Logstash config file and it works very well, but all of my fields are strings rather than, for example, ip for the source and target server. So I tried to change the default mapping in the Logstash template:
I ran the command curl -XGET http://localhost:9200/_template/logstash?pretty > C:\temp\logstashTemplate.txt
I edited the text file and added my 'SourceIP' field:
{
"template": "logstash-*",
"settings": {
"index": {
"refresh_interval": "5s"
}
},
"mappings": {
"_default_": {
"dynamic_templates": [{
"message_field": {
"mapping": {
"fielddata": {
"format": "disabled"
},
"index": "analyzed",
"omit_norms": true,
"type": "string"
},
"match_mapping_type": "string",
"match": "message"
}
}, {
"string_fields": {
"mapping": {
"fielddata": {
"format": "disabled"
},
"index": "analyzed",
"omit_norms": true,
"type": "string",
"fields": {
"raw": {
"ignore_above": 256,
"index": "not_analyzed",
"type": "string"
}
}
},
"match_mapping_type": "string",
"match": "*"
}
}],
"_all": {
"omit_norms": true,
"enabled": true
},
"properties": {
"@timestamp": {
"type": "date"
},
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type": "ip"
},
"latitude": {
"type": "float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "float"
}
}
},
"@version": {
"index": "not_analyzed",
"type": "string"
},
"SourceIP": {
"type": "ip"
}
}
}
},
"aliases": {}
}
I uploaded the edited template with the command curl -XPUT http://localhost:9200/_template/logstash -d@C:\temp\logstash.template
I restarted the ElasticSearch server and deleted/re-created the index.
The 'SourceIP' field did not change to type ip. What am I doing wrong? Can you please give me a hint? Thanks!
