I have a problem with a query that returns no results. When I execute the following query, either with match or term:
{
"size": 1,
"query": {
"bool": {
"must": [
{ "term": { "ALERT_TYPE.raw": "ERROR" }}
],
"filter": [
{ "range": {
"@timestamp": {
"gte": "2018-02-01T00:00:01.000Z",
"lte": "2018-02-28T23:55:55.000Z"
}
}}
]
}
}
}
I always get the following response:
{
"took": 92,
"timed_out": false,
"_shards": {
"total": 215,
"successful": 215,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
But I'm sure the element is present, because when I do a match_all query, the first hit is the following:
{
"took": 269,
"timed_out": false,
"_shards": {
"total": 210,
"successful": 210,
"failed": 0
},
"hits": {
"total": 68292,
"max_score": 1,
"hits": [
{
"_index": "logstash-2018.02.22",
"_type": "alert",
"_id": "AWEdVphtJjppDZ0FiAz-",
"_score": 1,
"_source": {
"@version": "1",
"@timestamp": "2018-02-22T10:07:41.549Z",
"path": "/something",
"host": "host.host",
"type": "alert",
"SERVER_TYPE": "STANDALONE",
"LOG_FILE": "log.log",
"DATE": "2018-02-22 11:02:02,367",
"ALERT_TYPE": "ERROR",
"MESSAGE": "There is an error"
}
}
]
}
}
Here I can see the field has the value that I am expecting. And from the mapping I know the field is analyzed by the default analyzer and the raw field is not analyzed (thanks to the answer of Glenn Van Schil). The mapping is generated dynamically by Logstash, but it looks like this for the type I'm looking into:
"alert": {
"_all": {
"enabled": true,
"omit_norms": true
},
"dynamic_templates": [
{
"message_field": {
"mapping": {
"index": "analyzed",
"omit_norms": true,
"fielddata": { "format": "disabled" },
"type": "string"
},
"match": "message",
"match_mapping_type": "string"
}
},
{
"string_fields": {
"mapping": {
"index": "analyzed",
"omit_norms": true,
"fielddata": { "format": "disabled" },
"type": "string",
"fields": {
"raw": {
"index": "not_analyzed",
"ignore_above": 256,
"type": "string"
}
}
},
"match": "*",
"match_mapping_type": "string"
}
}
],
"properties": {
"@timestamp": { "type": "date", "format": "strict_date_optional_time||epoch_millis" },
"@version": { "type": "string", "index": "not_analyzed" },
"ALERT_TYPE": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"DATE": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"LOG_FILE": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"MESSAGE": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"SERVER_TYPE": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"geoip": {
"dynamic": "true",
"properties": {
"ip": { "type": "ip" },
"latitude": { "type": "float" },
"location": { "type": "geo_point" },
"longitude": { "type": "float" }
}
},
"host": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"path": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
},
"type": {
"type": "string",
"norms": { "enabled": false },
"fielddata": { "format": "disabled" },
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}
Does anyone have a clue why this query keeps returning nothing? Maybe there is something in the mapping that I am missing which explains why the match or term query keeps failing? I'm running out of ideas about what is happening, and I'm quite new to Elasticsearch and Logstash.
Versions of tools and environment:
OS: RHEL Server 6.5 (Santiago)
Java: 1.7.0_91
Elasticsearch: 2.4.6
Lucene: 5.5.4
Logstash: 2.4.1
This is not really an answer, but it was too complicated to write as a comment.
from the mapping i know the field is not analysed.
You are searching on ALERT_TYPE, but this field is in fact analyzed with the default analyzer, since you did not specify any analyzer directly under ALERT_TYPE's mapping.
However, your ALERT_TYPE has an inner field named raw that is not analyzed. If you want to search documents using the raw field, you'll need to change the query from
"must": [
{ "term": { "ALERT_TYPE": "ERROR" }}
]
to
"must": [
{ "term": { "ALERT_TYPE.raw": "ERROR" }}
]
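Alternatively, since the analyzed ALERT_TYPE field stores its tokens lowercased, a match query (which analyzes the search string the same way at search time) should also find the document without touching the raw subfield. A minimal sketch, untested against the index above:

```json
{
  "query": {
    "match": { "ALERT_TYPE": "ERROR" }
  }
}
```

Here "ERROR" is analyzed to error at search time, matching the token that was indexed.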
Related
I made a template based on https://github.com/vanthome/winston-elasticsearch/blob/master/index-template-mapping.json:
{
"index_patterns": ["applogs-*"],
"settings": {
"number_of_shards": 1
},
"mappings": {
"_source": { "enabled": true },
"properties": {
"@timestamp": { "type": "date" },
"@version": { "type": "keyword" },
"message": { "type": "text", "index": true },
"severity": { "type": "keyword", "index": true },
"geohash":{ "type": "geo-point", "index": true},
"location":{ "type": "geo-point", "index": true},
}
}
}
but I get an error:
[mapper_parsing_exception] Root mapping definition has unsupported parameters: [severity : {index=true, type=keyword}] [@timestamp : {type=date}] [@version : {type=keyword}] [message : {index=true, type=text}] [fields : {dynamic=true, properties={}}]
Probably some obsolete version? What should I update?
Based on the docs:
PUT _template/template_1
{
"index_patterns": [
"applogs-*"
],
"settings": {
"number_of_shards": 1
},
"mappings": {
"_source": {
"enabled": true
},
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
},
"message": {
"type": "text",
"index": true
},
"severity": {
"type": "keyword",
"index": true
},
"geohash": {
"type": "geo_point",
"index": true
},
"location": {
"type": "geo_point",
"index": true
}
}
}
}
Your JSON was invalid (one comma too many), and also geo-point --> geo_point.
I have the following mapping in Elasticsearch 5.5:
"append": {
"type": "short"
},
"comment": {
"type": "text",
"index": "analyzed"
},
"create_time": {
"type": "date"
},
"desc_score": {
"type": "integer"
},
"goods_id": {
"type": "long"
},
"goods_name": {
"type": "keyword",
"index": "not_analyzed"
},
"handler": {
"type": "keyword",
"index": "not_analyzed"
},
"is_first_review": {
"type": "short"
},
"logistic_score": {
"type": "integer"
},
"mall_id": {
"type": "long"
},
"order_id": {
"type": "long"
},
"order_sn": {
"type": "keyword",
"index": "not_analyzed"
},
"parent_id": {
"type": "keyword",
"index": "not_analyzed"
},
"review_id": {
"type": "long"
},
"score": {
"type": "integer"
},
"service_score": {
"type": "integer"
},
"shipping_id": {
"type": "long"
},
"shipping_name": {
"type": "keyword",
"index": "not_analyzed"
},
"status": {
"type": "short"
},
"tracking_number": {
"type": "keyword",
"index": "not_analyzed"
},
"update_time": {
"type": "date"
},
"user_id": {
"type": "long"
}
When I profile this in Kibana with the following statement, I get the result shown in the picture.
{
"from": 0,
"size": 10,
"query": {
"bool": {
"filter": [
{
"terms": {
"goods_id": [
"262628158"
],
"boost": 1.0
}
},
{
"terms": {
"status": [
"2",
"4"
],
"boost": 1.0
}
},
{
"range": {
"create_time": {
"from": "1514027649",
"to": "1514632449",
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
}
],
"disable_coord": false,
"adjust_pure_negative": true,
"boost": 1.0
}
},
"sort": [
{
"create_time": {
"order": "desc"
}
}
]
}
I am confused about why the status filter costs so much time. status is a field with only five distinct values, which is perhaps the cause of the problem, but I do not know how to optimize my search clause. I have searched a lot with Google but got no answer yet. Can anyone help me?
Finally I found the answer: it is because Elasticsearch 5.5 uses a k-d tree (BKD tree) to store index data for numeric data types (short, long, etc.) in order to optimize range searches, as described in the following articles:
Better Query Planning For Range Queries
Tune For Search Speed
Elasticsearch Query Execution Order
The k-d tree is not suited to exact term lookups, so I changed the field type to keyword.
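Concretely, the fix is to map the field as keyword instead of a numeric type, so the terms filter does exact lookups in the inverted index rather than scanning the points structure. A sketch of the relevant mapping fragment (a new index and a reindex are required, since field types cannot be changed in place):

```json
{
  "properties": {
    "status": { "type": "keyword" }
  }
}
```

The existing terms filter on "status": ["2", "4"] then works unchanged as a set of exact term lookups.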
When I query my index with query_string, I am getting results, but when I query using a term query, I don't get any results:
{
"query": {
"bool": {
"must": [],
"must_not": [],
"should": [
{
"query_string": {
"default_field": "Printer.Name",
"query": "HL-2230"
}
}
]
}
},
"from": 0,
"size": 10,
"sort": [],
"aggs": {}
}
I know that term is not analyzed and query_string is analyzed, but Name is already stored as "HL-2230", so why doesn't it match with the term query? I also tried searching with "hl-2230" and still didn't get any results.
EDIT: the mapping looks like the following. Printer is the child of Product. Not sure if this makes a difference.
{
"state": "open",
"settings": {
"index": {
"creation_date": "1453816191454",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1070199"
},
"uuid": "TfMJ4M0wQDedYSQuBz5BjQ"
}
},
"mappings": {
"Product": {
"properties": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"ProductName": {
"type": "nested",
"properties": {
"Name": {
"store": true,
"type": "string"
}
}
},
"ProductCode": {
"type": "string"
},
"Number": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"id": {
"index": "no",
"store": true,
"type": "integer"
},
"ShortDescription": {
"store": true,
"type": "string"
},
"Printer": {
"_routing": {
"required": true
},
"_parent": {
"type": "Product"
},
"properties": {
"properties": {
"RelativeUrl": {
"index": "no",
"store": true,
"type": "string"
}
}
},
"PrinterId": {
"index": "no",
"store": true,
"type": "integer"
},
"Name": {
"store": true,
"type": "string"
}
}
},
"aliases": []
}
}
As per the mapping you provided above,
"Name": {
"store": true,
"type": "string"
}
Name is analyzed, so HL-2230 will be split into two tokens, hl and 2230 (the standard analyzer also lowercases them). That's why the term query is not working while query_string is: a term query searches for the exact term HL-2230, which is not in the index.
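If exact matching is required, one common option is to add a not_analyzed subfield and point the term query at it. A sketch, mirroring the raw multi-fields that appear elsewhere in this thread; existing documents must be reindexed before the subfield is populated:

```json
{
  "Name": {
    "type": "string",
    "store": true,
    "fields": {
      "raw": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
```

The term query would then target Printer.Name.raw with the exact value HL-2230.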
If I have a JSON document indexed into Elasticsearch, like the following:
"_source": {
"pid_no": 19321,
"aggregator_id": null,
"inet_family": "ipv4-unicast",
"origin_code": "igp",
"extended_community": null,
"atomic_aggregate": null,
"adv_type": "announce",
"local_preference": 250,
"med_metric": 0,
"time_stamp": 1447534931,
"net_mask": "23",
"prefix4_": {
"last": 222,
"first": 111
},
"counter_no": 69668,
"confederation_path": "",
"as_set": null,
I have successfully filtered on all of the keys of the doc, except the nested ones.
The query looks like:
GET /SNIP!/SNIP!/_search?routing=SNIP!
{
"query": {
"bool": {
"must": {
"query": {
"match_all": {}
}
},
"filter": {
"bool": {
"filter": [
{
"range": {
"local_preference": {
"gt": 150,
"lte": 250
}
}
},
>>> if I remove the filter below, the query matches the document
>>> when I apply the filter, I get 0 hits
{
"and": [
{
"range": {
"prefix4_.first": {
"lte": 200
}
}
},
{
"range": {
"prefix4_.last": {
"gte": 200
}
}
}
]
}
]
}
}
}
}
}
It goes without saying that the mapping uses integers for the corresponding fields (prefix4_.first, prefix4_.last).
Could you please advise on why the filtering does not work?
EDIT: the mapping looks like this:
{
"mappings": {
"_default_": {
"_all": { "enabled": False },
"dynamic": True,
"_routing": { "required": True },
"properties": {
"pid_no": { "type": "string", "index": "not_analyzed", "store": "no" },
"counter_no": { "type": "long", "store": "no" },
"time_stamp": { "type": "date", "format": "epoch_second", "store": "no" },
"host_name": { "type": "string", "index": "not_analyzed", "store": "no" },
"local_ip": { "type": "ip", "store": "no" },
"peer_ip": { "type": "ip", "store": "no" },
"local_asn": { "type": "string", "index": "not_analyzed", "store": "no" },
"peer_asn": { "type": "string", "index": "not_analyzed", "store": "no" },
"inet_family": { "type": "string", "index": "not_analyzed", "store": "no" },
"next_hop": { "type": "ip", "store": "no" },
"net_block": { "type": "string", "index": "analyzed", "store": "no" },
"as_path": { "type": "string", "index": "analyzed", "store": "no" },
"cluster_list": { "type": "string", "index": "not_analyzed", "store": "no" },
"confederation_path": { "type": "string", "index": "not_analyzed", "store": "no" },
"local_preference": { "type": "integer", "store": "no" },
"originator_ip": { "type": "ip", "store": "no" },
"origin_code": { "type": "string", "index": "not_analyzed", "store": "no" },
"community_note": { "type": "string", "index": "analyzed", "store": "no" },
"med_metric": { "type": "long", "store": "no" },
"atomic_aggregate": { "type": "boolean", "store": "no" },
"aggregator_id": { "type": "string", "index": "analyzed", "store": "no" },
"as_set": { "type": "string", "index": "analyzed", "store": "no" },
"extended_community": { "type": "string", "index": "not_analyzed", "store": "no" },
"adv_type": { "type": "string", "index": "not_analyzed", "store": "no" },
"prefix_": { "type": "string", "index": "not_analyzed", "store": "no" },
"net_mask": { "type": "integer", "store": "no" },
"prefix4_": {
"type": "nested",
"properties": {
"first": { "type": "integer", "store": "no" },
"last": { "type": "integer", "store": "no" }
}
},
"prefix6_": {
"type": "nested",
"properties": {
"lofirst": { "type": "long", "store": "no" },
"lolast": { "type": "long", "store": "no" },
"hifirst": { "type": "long", "store": "no" },
"hilast": { "type": "long", "store": "no" }
}
}
}
}
},
"settings" : {
"number_of_shards": 1,
"number_of_replicas": 0,
"index": {
"store.throttle.type": "none",
"memory.index_buffer_size": "20%",
"refresh_interval": "1m",
"merge.async": True,
"merge.scheduler.type": "concurrent",
"merge.policy.type": "log_byte_size",
"merge.policy.merge_factor": 15,
"cache.query.enable": True,
"cache.filter.type": "node",
"fielddata.cache.type": "node",
"cache.field.type": "soft"
}
}
}
Elasticsearch provides multiple ways of mapping nested documents. You are using nested which indexes nested documents as separate documents behind the scenes and as such querying them requires the use of a nested query.
The simplest way of indexing nested JSON like you've shown is using the object type mapping. This would allow you to query the field the way you were expecting, however Elasticsearch flattens the hierarchy which may not be acceptable for you.
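For illustration, if the first/last pairs do not need to be queried as a correlated unit, prefix4_ could be mapped as a plain object and the original flat range filters would work as written. A sketch of that alternative mapping fragment:

```json
{
  "prefix4_": {
    "type": "object",
    "properties": {
      "first": { "type": "integer" },
      "last": { "type": "integer" }
    }
  }
}
```

With nested, by contrast, each prefix4_ value is indexed behind the scenes as a separate hidden document, so a nested query or filter is mandatory.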
Use nested filters to filter your documents on nested fields:
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/query-dsl-nested-filter.html
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"peer_ip": "pqr",
"_cache": true
}
},
{
"nested": {
"filter": {
"bool": {
"must": [
{
"terms": {
"first": [
"xyz"
],
"_cache": true
}
}
]
}
},
"path": "prefix4_",
"inner_hits": {}
}
},
{
"terms": {
"pid_no": [
"yyu"
],
"_cache": true
}
}
]
}
}
}
}
}
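The example above uses placeholder fields and values; adapted to the ranges from the question, the nested part would look roughly like this (untested; note that fields inside a nested filter are addressed by their full path under the nested object):

```json
{
  "nested": {
    "path": "prefix4_",
    "filter": {
      "bool": {
        "must": [
          { "range": { "prefix4_.first": { "lte": 200 } } },
          { "range": { "prefix4_.last": { "gte": 200 } } }
        ]
      }
    }
  }
}
```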
Continuing a previous question.
What am I trying to do?
Thanks to @AndreiStefan I'm trying to put a murmur3 hash off the heap, using doc_values:
"dynamic_templates": [
{
"murmur3_hashed": {
"mapping": {
"index": "not_analyzed",
"norms": {
"enabled": false
},
"fielddata": {
"format": "doc_values"
},
"doc_values": true,
"type": "string",
"fields": {
"hash": {
"index": "no",
"doc_values": true,
"type": "murmur3"
}
}
},
"match_mapping_type": "string",
"match": "my_prop"
}
}
]
I used stream2es for reindexing.
What is the result?
After reindexing, the resulting property is:
"my_prop": {
"index": "not_analyzed",
"fielddata": {
"format": "doc_values"
},
"doc_values": true,
"type": "string",
"fields": {
"hash": {
"null_value": -1,
"precision_step": 2147483647,
"type": "murmur3"
}
}
},
What is the problem?
Why are "index": "no" and "doc_values": true missing from the resulting property?
This is the list of commands that I tested in ES 1.6.0:
PUT /test
{
"mappings": {
"_default_": {
"dynamic_templates": [
{
"murmur3_hashed": {
"mapping": {
"index": "not_analyzed",
"norms": {
"enabled": false
},
"fielddata": {
"format": "doc_values"
},
"doc_values": true,
"type": "string",
"fields": {
"hash": {
"index": "no",
"doc_values": true,
"type": "murmur3"
}
}
},
"match_mapping_type": "string",
"match": "my_prop"
}
}
]
}
}
}
POST /test/test_type/1
{
"my_prop": "xxx"
}
GET /test/test_type/_mapping
And I got this as output:
{
"test": {
"mappings": {
"test_type": {
"dynamic_templates": [
{
"murmur3_hashed": {
"mapping": {
"fielddata": {
"format": "doc_values"
},
"norms": {
"enabled": false
},
"index": "not_analyzed",
"type": "string",
"fields": {
"hash": {
"index": "no",
"type": "murmur3",
"doc_values": true
}
},
"doc_values": true
},
"match": "my_prop",
"match_mapping_type": "string"
}
}
],
"properties": {
"my_prop": {
"type": "string",
"index": "not_analyzed",
"doc_values": true,
"fielddata": {
"format": "doc_values"
},
"fields": {
"hash": {
"type": "murmur3",
"index": "no",
"doc_values": true,
"precision_step": 2147483647,
"null_value": -1
}
}
}
}
}
}
}
}
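As a side note, the usual point of a murmur3 subfield like this is to make cardinality aggregations cheaper, since the aggregation can then run over the precomputed hashes instead of hashing each string value at query time. A sketch, assuming the mapping above:

```json
{
  "size": 0,
  "aggs": {
    "distinct_my_prop": {
      "cardinality": { "field": "my_prop.hash" }
    }
  }
}
```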