We are using Logstash, Elasticsearch and Kibana to handle and search our logs.
Often when we search, Kibana returns results that do not contain the searched-for item.
For example, we search for an exact phrase. Here is the query:
curl -XGET 'http://logs.magick.nu/kibana2/logstash-2014.10.17,logstash-2014.10.16/_search?pretty' -d '{
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "COND_30892c7a490e154e01490e2dcf7a0008(2)"
}
}
]
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"#timestamp": {
"from": 1413471279957,
"to": 1413557679958
}
}
}
]
}
}
}
},
"highlight": {
"fields": {},
"fragment_size": 2147483647,
"pre_tags": [
"#start-highlight#"
],
"post_tags": [
"#end-highlight#"
]
},
"size": 500,
"sort": [
{
"#timestamp": {
"order": "desc",
"ignore_unmapped": true
}
},
{
"#timestamp": {
"order": "desc",
"ignore_unmapped": true
}
}
]
}'
And Kibana would return results such as:
{
"_index": "logstash-2014.10.17",
"_type": "app SwitchYard",
"_id": "unti1lWJRTelQd4N5_LVjA",
"_score": null,
"_source": {
"message": "2014/10/17 13:50:43,739 [com.domain.Connector.service.ent.BasicJMSTickListener] (NJ4X-63) Sending market info for product symbol to JMS topic. Broker Server: broker.Demo. Account Number: 1235. StrategyId: 4028e49447ac4296147af921d5f00b. OrderCount: 2",
"#version": "1",
"#timestamp": "2014-10-17T14:24:32.193Z",
"type": "app SwitchYard",
"tags": [
"node"
],
"domain": "trading1-magickdev.amakitu.com",
"env": "DEV",
"host": "nodelarge.amakitu.com",
"path": "/var/lib/openshift/541723389821cc77c2000167/jbosseap/logs/server.log"
},
"sort": [
1413555872193,
1413555872193
]
}
This happens a lot!
Any ideas what is wrong?
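One possible cause, offered as an assumption rather than a confirmed diagnosis: in query_string syntax, parentheses are grouping operators, so the unquoted input above may be parsed as two separate terms (the COND_... identifier and 2) combined with the default OR operator. The sample hit does contain a bare 2 ("OrderCount: 2"), which would be consistent with that. Below is a minimal Python sketch of sending the text as a quoted phrase instead; quote_phrase and build_query_string_clause are hypothetical helper names, not part of any Elasticsearch client.

```python
import json


def quote_phrase(text):
    # Inside a quoted phrase only the backslash and the double quote
    # need escaping; parentheses are then treated as literal text.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_query_string_clause(text):
    # Hypothetical helper: returns only the query_string clause,
    # to be placed where the original query has its query_string.
    return {"query_string": {"query": quote_phrase(text)}}


clause = build_query_string_clause("COND_30892c7a490e154e01490e2dcf7a0008(2)")
print(json.dumps(clause, indent=2))
```

Whether the phrase then matches the intended log lines still depends on how the message field is analyzed, so this is worth verifying against the actual mapping.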
Related
I am trying to get query results along with highlights, but highlighting does not work with a script_score query in Elasticsearch 7.13. I am trying the following query:
{
"_source": {"excludes": ["tag","document_vector"]},
"query": {
"script_score": {
"query": {
"match_all": {}
},
"script": {
"source": "cosineSimilarity(params.query_vector, 'document_vector') + 1.0",
"params": {
"query_vector": query_vector
}
}
}
},
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"highlight": {
"order": "score",
"number_of_fragments": 3,
"fragment_size": 150,
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields": {
"content": {}
}
}
}
I have an index with several types. The data in each record includes fields like "Customer ID" and "Device Name", "url" etc.
Elasticsearch is v5.6.8.
What I'd like to end up with is one document per "Customer ID" and "Device Name" and the value of the _type for the document. The single document per grouping should have a list of the 'url' values joined into one field called 'urls'.
I tried the following but it doesn't do what I thought it would do and I'm not sure what else to try:
GET _search
{
"query": {
"bool": {
"must": [
{
"term": {
"_index": "safebrowsing"
}
},
{
"range": {
"eventtime": {
"gte": "now-5d/d"
}
}
}
],
"must_not": [
{
"term": {
"reported_to_client": true
}
}
]
}
},
"size": 0,
"aggs": {
"Customer ID": {
"terms": {
"field": "Customer ID.keyword"
},
"aggs": {
"Device Name": {
"terms": {
"field": "Device Name.keyword"
},
"aggs": {
"documenttype": {
"terms": {
"field": "_type"
},
"aggs": {
"urls": {
"terms": {
"script": "_doc['url'].values"
}
}
}
}
}
}
}
}
}
}
This is the error I get:
{
"error": {
"root_cause": [
{
"type": "circuit_breaking_exception",
"reason": "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_per_minute] setting",
"bytes_wanted": 0,
"bytes_limit": 0
},
{
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"_doc['url'].values",
"^---- HERE"
],
"script": "_doc['url'].values",
"lang": "painless"
}
],
...etc
I figured this out. What is needed is a top_hits aggregation, which returns the actual hits (as many as indicated by "size") within each bucket of the enclosing aggregation.
GET /_search
{
"query": {
"bool": {
"must": [
{"term": {"_index": "safebrowsing"}},
{"range": {"eventtime": {"gte": "now-2d/d"}}}
],
"must_not": [
{"term": {"reported_to_client": true}}
]
}
},
"aggs": {
"Customer ID": {
"terms": {
"field": "Customer ID.keyword"
},
"aggs": {
"Device Name": {
"terms": {
"field": "Device Name.keyword"
},
"aggs": {
"thetype": {
"terms": {
"field": "_type"
},
"aggs": {
"thedocs": {
"top_hits": {
"sort": [{"eventtime": {"order": "desc"}}],
"_source": {
"includes": [ "ip", "type", "eventtime", "url" ]
},
"size": 2
}
}
}
}
}
}
}
}
},
"size": 0
}
Each hit within the aggregation I've called thedocs looks like this:
{
"_index": "safebrowsing",
"_type": "SOCIAL_ENGINEERING",
"_id": "7ffe641xxxyyydc3536189ce33d5dfb9",
"_score": null,
"_source": {
"ip": "xxx.xxx.7.88",
"eventtime": "2018-05-08T23:34:03-07:00",
"type": "SOCIAL_ENGINEERING",
"url": "http://xyz-domainname.tld/bankofwhatever/"
},
"sort": [
1525847643000
]
}
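To get from these buckets to one record per "Customer ID", "Device Name" and _type with a joined list of urls, the response can be flattened on the client side. The following is a minimal Python sketch, assuming the response has the usual nested "buckets" and "hits" layout and uses the aggregation names from the query above; collect_urls is a hypothetical helper and the sample data is made up for illustration.

```python
from collections import defaultdict


def collect_urls(aggregations):
    # Map (customer, device, document type) -> sorted list of distinct urls.
    grouped = defaultdict(set)
    for customer in aggregations["Customer ID"]["buckets"]:
        for device in customer["Device Name"]["buckets"]:
            for doc_type in device["thetype"]["buckets"]:
                key = (customer["key"], device["key"], doc_type["key"])
                for hit in doc_type["thedocs"]["hits"]["hits"]:
                    grouped[key].add(hit["_source"]["url"])
    return {key: sorted(urls) for key, urls in grouped.items()}


# Made-up fragment shaped like the "aggregations" part of a response.
sample = {
    "Customer ID": {"buckets": [{
        "key": "C-1",
        "Device Name": {"buckets": [{
            "key": "laptop-1",
            "thetype": {"buckets": [{
                "key": "SOCIAL_ENGINEERING",
                "thedocs": {"hits": {"hits": [
                    {"_source": {"url": "http://example.com/b"}},
                    {"_source": {"url": "http://example.com/a"}},
                ]}},
            }]},
        }]},
    }]},
}

urls_by_group = collect_urls(sample)
print(urls_by_group)
```

Note that top_hits only returns as many hits as its "size" allows, so the list of urls is complete only if "size" is at least the number of documents in each bucket.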
I am trying to write a GROUP BY style query in Elasticsearch 5.2.
I want to query the data and limit the results to documents that have a particular tag. In the case below, I want to select items which contain the word "FIY" in the title or content fields, and then narrow that down to only those documents which have the tags "FIY" and "Competition".
The query part is fine, but I am struggling to limit it to the given tag.
This is what I have so far, but I am getting the error:
"reason": "[bool] query does not support [terms]",
GET advice-articles/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "FIY",
"fields": ["title", "content"]
}
}
], "filter": {
"bool": {
"terms": {
"tags.tagName": [
"competition"
]
}
}
}
}
}
}
An example document in the index is:
"_index": "advice-articles",
"_type": "article",
"_id": "1460",
"_score": 4.3167734,
"_source": {
"id": "1460",
"title": "Your top FIY tips",
"content": "Fix It Yourself in April 2012.",
"tags": [
{
"tagName": "Fix it yourself"
},
{
"tagName": "customer tips"
},
{
"tagName": "competition"
}
]
The mappings I have are as follows:
{
"advice-articles": {
"mappings": {
"article": {
"properties": {
"content": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"tags": {
"type": "nested",
"properties": {
"tagName": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
}
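A bool query is built from one or more boolean clauses, each with a typed occurrence. The occurrence types are:
must, must_not, filter, should
A terms query is not one of these, so it cannot sit directly inside bool; it has to go inside one of the clauses. Because tags is mapped as nested, it also has to be wrapped in a nested query: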
GET _search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "FIY",
"fields": [
"title",
"content"
]
}
},
{
"nested": {
"path": "tags",
"query": {
"terms": {
"tags.tagName": [
"competition"
]
}
}
}
}
]
}
}
}
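Since terms has to sit inside one of the bool clauses, and the question's mapping declares tags as nested, the tag restriction can also be placed in filter context so that it does not contribute to the score. Here is a minimal Python sketch that only builds the request body; build_tag_query is a hypothetical helper, and the field names are taken from the question.

```python
import json

ALLOWED_BOOL_CLAUSES = {"must", "must_not", "filter", "should"}


def build_tag_query(text, tags):
    # The terms query is wrapped in a nested query because "tags"
    # is mapped as a nested type in the question's index.
    nested_tags = {
        "nested": {
            "path": "tags",
            "query": {"terms": {"tags.tagName": tags}},
        }
    }
    return {
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "content"]}}
                ],
                # filter context: restricts the hits without scoring them
                "filter": [nested_tags],
            }
        }
    }


body = build_tag_query("FIY", ["competition"])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of GET advice-articles/_search.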
Here is how you can use a must clause for your query requirements.
Inside the filter you don't need to put bool.
POST newindex/test/1460333
{
"title": "Your top FIY tips",
"content": "Fix It Yourself in April 2012.",
"tags": [
{
"tagName": "Fix it yourself"
},
{
"tagName": "customer tips"
},
{
"tagName": "shoud not return"
}
]
}
POST newindex/test/1460
{
"title": "Your top FIY tips",
"content": "Fix It Yourself in April 2012.",
"tags": [
{
"tagName": "Fix it yourself"
},
{
"tagName": "customer tips"
},
{
"tagName": "competition"
}
]
}
Query:
GET newindex/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "FIY",
"fields": [
"title",
"content"
]
}
}
],
"filter": {
"terms": {
"tags.tagName": [
"competition"
]
}
}
}
}
}
Result:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "newindex",
"_type": "test",
"_id": "1460",
"_score": 0.2876821,
"_source": {
"title": "Your top FIY tips",
"content": "Fix It Yourself in April 2012.",
"tags": [
{
"tagName": "Fix it yourself"
},
{
"tagName": "customer tips"
},
{
"tagName": "competition"
}
]
}
}
]
}
}
I'm new to Elasticsearch 5.1 (and new to Elasticsearch in general), and I have a list which I send to Elasticsearch using msearch.
The following does not return any hits, even though the documents in my index look like this:
{
"_index": "all_items",
"_type": "product",
"_id": "1000002007900",
"_version": 2,
"found": true,
"_source": {
"doc": {
"title": "title here",
"brand": null,
"updatedOn": "2016-12-22T14:00:26.016290",
"price": 49,
"viewed7": 0,
"idInShop": "11",
"active": true,
"model": null,
"_id": 1000002007900,
"purchased7": 0
},
"doc_as_upsert": true
}
}
And here is the body sent to msearch:
[
{
"index": "all_items",
"type": "product"
},
{
"sort": [
{
"_score": "desc"
}
],
"query": {
"function_score": {
"query": {
"bool": {
"filter": [
{
"term": {
"active": true
}
}
],
"should": [],
"must_not": [],
"must": []
}
},
"functions": [
{
"script_score": {
"script": {
"lang": "painless",
"inline": "_score * params.constant * (doc['discountPrice'] > 0 ? doc['price'] / doc['discountPrice'] : 0)",
"params": {
"constant": 1.2
}
}
}
}
],
"score_mode": "multiply"
}
},
"from": 0,
"size": 3
}
]
If I only send {"query":{"match_all":{}}} I get hits.
You can use a match query to get the result you want.
[
{
"index": "all_items",
"type": "product"
},
{
"sort": [
{
"_score": "desc"
}
],
"query": {
"function_score": {
"query": {
"match": {
"active": true
}
},
"functions": [
{
"script_score": {
"script": {
"lang": "painless",
"inline": "_score * params.constant * (doc['discountPrice'] > 0 ? doc['price'] / doc['discountPrice'] : 0)",
"params": {
"constant": 1.2
}
}
}
}
],
"score_mode": "multiply"
}
},
"from": 0,
"size": 3
}
]
You can read more about the match query and term-level queries (which you used) at this link.
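Separately from the match suggestion, one thing may be worth checking, as an observation from the sample document rather than a confirmed diagnosis: its _source keeps all fields under a doc object (next to doc_as_upsert), so the field path would be doc.active rather than active, and discountPrice does not appear in the sample at all. Below is a small Python sketch that lists the field paths of a _source; field_paths is a hypothetical helper and the sample is abbreviated from the question.

```python
def field_paths(obj, prefix=""):
    # Walk a nested dict and return dotted paths to every leaf value.
    paths = []
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            paths.extend(field_paths(value, path + "."))
        else:
            paths.append(path)
    return paths


# Abbreviated copy of the _source shown in the question.
source = {
    "doc": {"title": "title here", "price": 49, "active": True},
    "doc_as_upsert": True,
}

paths = field_paths(source)
print(paths)
print("active" in paths, "doc.active" in paths)
```

If the second line prints that only doc.active is present, a filter on active would have nothing to match, whichever query type is used.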
I have this document in Elasticsearch (1.6)
{
"_index": "onkopedia",
"_type": "document_",
"_id": "0afa26afc2d1440a8ed03dac0e8511fc",
"_version": 1,
"_score": null,
"_source": {
"description": "",
"contributors": [ ],
"metaTypeName": "Connector",
"sortableTitle": "mammakarzinom der frau",
"subject": [ ],
"authorizedUsers": [
"Anonymous"
],
"language": "",
"title": "Mammakarzinom der Frau",
"url": "http://dev1.veit-schiele.de:9080/onkopedia/de/onkopedia/guidelines/mammakarzinom-der-frau",
"author": "ajung",
"modified": "2015-05-11T05:21:14",
"metaType": "xmldirector.plonecore.connector",
"content": " Mammakarzinom der Frau Stand: Januar 2013 Autoren der aktuellen .....",
"authorName": "ajung",
"created": "2015-05-11T05:21:14",
"review_state": "published"
},
"sort": [
null
]
}
containing a key
'authorizedUsers': ['Anonymous']
The following query is supposed to return the document above; however, it does not:
{
"sort": [
"_score"
],
"from": 0,
"fields": [
"url",
"title",
"description",
"metaType",
"metaTypeName",
"author",
"authorName",
"contributors",
"modified",
"subject",
"review_state",
"language",
"content"
],
"query": {
"filtered": {
"filter": {
"and": [
{
"terms": {
"execution": "or",
"metaType": [
"Document",
"FormFolder",
"Collection",
"Discussion Item",
"News Item",
"xmldirector.plonecore.connector",
"CaptchaField"
]
}
},
{
"terms": {
"execution": "or",
"authorizedUsers": [
"Manager",
"Authenticated",
"Anonymous",
"user:ajung"
]
}
}
]
},
"query": {
"query_string": {
"query": "mammakarzinom",
"default_operator": "AND",
"fields": [
"title^3",
"contributors^2",
"subject^2",
"description",
"content"
]
}
}
}
},
"highlight": {
"fields": {
"content": {
"fragment_size": 250,
"number_of_fragments": 3
},
"description": {
"fragment_size": 250,
"number_of_fragments": 2
},
"title": {
"number_of_fragments": 0
}
}
},
"size": 15
}
Why? 'Anonymous' is one of the values listed for 'authorizedUsers' in the query, so I would expect the first query to find the document.
The same query without the filter on 'authorizedUsers' does return the document:
{
"sort": [
"_score"
],
"from": 0,
"fields": [
"url",
"title",
"description",
"metaType",
"metaTypeName",
"author",
"authorName",
"contributors",
"modified",
"subject",
"review_state",
"language",
"content"
],
"query": {
"filtered": {
"filter": {
"and": [
{
"terms": {
"execution": "or",
"metaType": [
"Document",
"FormFolder",
"Collection",
"Discussion Item",
"News Item",
"xmldirector.plonecore.connector",
"CaptchaField"
]
}
}
]
},
"query": {
"query_string": {
"query": "mammakarzinom",
"default_operator": "AND",
"fields": [
"title^3",
"contributors^2",
"subject^2",
"description",
"content"
]
}
}
}
},
"highlight": {
"fields": {
"content": {
"fragment_size": 250,
"number_of_fragments": 3
},
"description": {
"fragment_size": 250,
"number_of_fragments": 2
},
"title": {
"number_of_fragments": 0
}
}
},
"size": 15
}
Your analyzer for the authorizedUsers field is probably lowercasing the value, so the term actually stored in your index is anonymous (lowercase a).
Try this filter:
{
"terms": {
"execution": "or",
"authorizedUsers": [
"manager",
"authenticated",
"anonymous",
"user:ajung"
]
}
}
That is, search the index with the values that are actually stored there.
One more thing: the terms filter does not analyze its input text. If you search for Anonymous, that exact term is what gets looked up in the index. Since the index contains anonymous, it will not match.
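The mismatch can be illustrated with a toy model in Python. This is not Elasticsearch code: toy_analyze is only a rough stand-in for a lowercasing analyzer (lowercase, then split on non-word characters), and the function names are hypothetical.

```python
import re


def toy_analyze(value):
    # Rough stand-in for a lowercasing analyzer, applied at index time.
    return re.findall(r"\w+", value.lower())


def build_index(values):
    # The index stores the analyzed terms, not the original strings.
    terms = set()
    for value in values:
        terms.update(toy_analyze(value))
    return terms


def terms_filter_matches(indexed_terms, wanted):
    # A terms filter does no analysis: each wanted value is looked up as-is.
    return any(term in indexed_terms for term in wanted)


indexed = build_index(["Anonymous"])  # stored as {"anonymous"}

print(terms_filter_matches(indexed, ["Manager", "Authenticated", "Anonymous", "user:ajung"]))  # False
print(terms_filter_matches(indexed, ["manager", "authenticated", "anonymous", "user:ajung"]))  # True
```

Under the same toy model a value such as user:ajung would be stored as two terms, user and ajung, so it would not match as a single term either; mapping the field as not_analyzed would be one way to keep values intact, if exact matching is what is wanted.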