How to search for a nested key in kibana - elasticsearch

I have documents in Kibana that look like this:
{
"_index": "echo.caspian-test.2020-06-11.idx.2",
"_type": "status",
"_id": "01754abe95fd084495da20646194fdf7",
"_score": 1,
"_source": {
"applicationVersion": "9f80e49dea1c647fa1baf2e70665aba3a74158eb",
"echoClientVersion": "1.5.1",
"echoMetadata": {
"transportType": "echo"
},
"dataCenter": "hdc-digital-non-prod",
"echoLoggerVersion": "EchoLogbackAppender-1.5.1",
"host": "e22ab1e4-9256-438b-5855-ad04",
"type": "INFO",
"message": "AddUpdate process method ends",
"messageDetail": {
"logger": "com.kroger.cxp.app.transformer.processor.AddUpdateTransformerImpl",
"thread": "DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession[:/1f6e1b6c][connectionId=414D5143514D2E4150504C2E54455354967C7F5F0407B82E]]"
},
"routingKey": "caspian-test",
"timestamp": "1603276805250"
},
"fields": {
"timestamp": [
"2020-10-21T10:40:05.250Z"
]
}
}
I need to find all the docs that have a particular connectionId, which is present in:
"messageDetail": {
"logger": "com.kroger.cxp.app.transformer.processor.AddUpdateTransformerImpl",
"thread": "DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession[:/1f6e1b6c][connectionId=414D5143514D2E4150504C2E54455354967C7F5F0407B82E]]"
}
How can I do that? I have tried searching for messageDetail.thread=%$CONNECTION_ID% but it didn't work.

To make this work, use a nested query with the path set to messageDetail; the messageDetail field must also be mapped with the nested datatype. Something like the following:
{
"query": {
"nested": {
"path": "messageDetail",
"query": {
"bool": {
"must": [
{
"match": {
"messageDetail.thread": "CONNECTION_ID"
}
}
]
}
}
}
}
}
Adding a working sample with mapping, search query, and result
Index mapping
{
"mappings": {
"properties": {
"messageDetail": {
"type" : "nested"
}
}
}
}
Index sample doc
{
"applicationVersion": "9f80e49dea1c647fa1baf2e70665aba3a74158eb",
"echoClientVersion": "1.5.1",
"echoMetadata": {
"transportType": "echo"
},
"dataCenter": "hdc-digital-non-prod",
"echoLoggerVersion": "EchoLogbackAppender-1.5.1",
"host": "e22ab1e4-9256-438b-5855-ad04",
"type": "INFO",
"message": "AddUpdate process method ends",
"messageDetail": {
"logger": "com.kroger.cxp.app.transformer.processor.AddUpdateTransformerImpl",
"thread": "DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession[:/1f6e1b6c][connectionId=414D5143514D2E4150504C2E54455354967C7F5F0407B82E]]"
},
"routingKey": "caspian-test",
"timestamp": "1603276805250"
}
And search query
{
"query": {
"nested": {
"path": "messageDetail",
"query": {
"bool": {
"must": [
{
"match": {
"messageDetail.thread": "DispatchThread"
}
}
]
}
}
}
}
}
And search result
"hits": [
{
"_index": "nestedmsg",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"applicationVersion": "9f80e49dea1c647fa1baf2e70665aba3a74158eb",
"echoClientVersion": "1.5.1",
"echoMetadata": {
"transportType": "echo"
},
"dataCenter": "hdc-digital-non-prod",
"echoLoggerVersion": "EchoLogbackAppender-1.5.1",
"host": "e22ab1e4-9256-438b-5855-ad04",
"type": "INFO",
"message": "AddUpdate process method ends",
"messageDetail": {
"logger": "com.kroger.cxp.app.transformer.processor.AddUpdateTransformerImpl",
"thread": "DispatchThread: [com.ibm.mq.jmqi.remote.impl.RemoteSession[:/1f6e1b6c][connectionId=414D5143514D2E4150504C2E54455354967C7F5F0407B82E]]"
},
"routingKey": "caspian-test",
"timestamp": "1603276805250"
}
}
]
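For scripted searches, the request body can be built from the connection ID. The sketch below is a minimal Python illustration using only the standard library; the helper name build_thread_query and its nested flag are made up for this example and are not part of any Elasticsearch client. It produces the nested form shown above when messageDetail is mapped as nested, and a plain match on messageDetail.thread otherwise, since a nested query only applies to fields mapped with the nested datatype. Whether the ID matches as its own token depends on the analyzer applied to the thread field.

```python
import json


def build_thread_query(connection_id, nested=True):
    """Build a search body matching messageDetail.thread against connection_id.

    Hypothetical helper for illustration only.
    """
    match = {"match": {"messageDetail.thread": connection_id}}
    if not nested:
        # Plain object mapping: query the dotted field path directly.
        return {"query": match}
    # Nested mapping: wrap the match in a nested query on the parent path.
    return {
        "query": {
            "nested": {
                "path": "messageDetail",
                "query": {"bool": {"must": [match]}},
            }
        }
    }


body = build_thread_query("414D5143514D2E4150504C2E54455354967C7F5F0407B82E")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana Dev Tools console as the body of a _search request.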

Related

Query for nested fields returns results as if there was no nested mapping

I am having difficulty understanding why a query across nested fields returns unexpected results.
I have the following template for my index:
PUT /_template/nested_test
{
"index_patterns": [ "nested-*" ],
"settings": { "index.mapping.coerce": false },
"mappings": {
"dynamic": "strict",
"properties": {
"vNested": {
"type": "nested",
"properties": {
"v1": { "type": "keyword" },
"v2": {
"properties": {
"v21": {
"type": "long"
}
}
}
}
}
}
}
}
I will post two documents to an index that matches the template.
POST /nested-example/_doc
{
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 1
}
},
{
"v1": "User3",
"v2": {
"v21": 3
}
}
]
}
POST /nested-example/_doc
{
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 3
}
},
{
"v1": "User2",
"v2": {
"v21": 2
}
}
]
}
Now I will create a query whose goal is to return only those documents that contain a nested object with v1 = User1 and a corresponding v21 value of 3. As far as I understand, the nested mapping should ensure that I get only the second document as the query result.
The following query:
GET /nested-example/_search
{
"query" : {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "vNested",
"query": {
"match": {
"vNested.v1": "User1"
}
}
}
},
{
"nested": {
"path": "vNested",
"query": {
"match": {
"vNested.v2.v21": "3"
}
}
}
}
]
}
}
}
}
}
returns both documents, not just the single document that I expected.
I understand that the query is not the most elegant. This is due to business logic and front-end framework logic that build the query from user input; any suggestions on how to remove the redundancies are welcome as well.
However, I struggle to understand why this query returns both documents, including the one where the vNested object with v1=User1 has v21=1. Shouldn't the nested mapping of the vNested field prevent exactly that?
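You need to put both conditions in a bool/must query inside a single nested query, because both must match on the same nested object. With two separate nested queries, each condition can be satisfied by a different object in the array. Modify your query as follows: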
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "vNested",
"query": {
"bool": {
"must": [
{
"match": {
"vNested.v1": "User1"
}
},
{
"match": {
"vNested.v2.v21": "3"
}
}
]
}
},
"inner_hits":{}
}
}
]
}
}
}
}
}
The search result is:
"hits": [
{
"_index": "nested-example",
"_type": "_doc",
"_id": "AAu0IXkBKyWl6Va6kmTU",
"_score": 0.0,
"_source": {
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 3
}
},
{
"v1": "User2",
"v2": {
"v21": 2
}
}
]
},
"inner_hits": {
"vNested": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.6931472,
"hits": [
{
"_index": "nested-example",
"_type": "_doc",
"_id": "AAu0IXkBKyWl6Va6kmTU",
"_nested": {
"field": "vNested",
"offset": 0
},
"_score": 1.6931472,
"_source": {
"v1": "User1",
"v2": {
"v21": 3
}
}
}
]
}
}
}
}
]
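The difference between the two query shapes can be illustrated without a cluster. The following is a small Python emulation, not Elasticsearch code: the function names are invented for this sketch, and it only assumes that a nested clause matches a document when at least one nested object satisfies that clause on its own.

```python
# The two documents from the question, reduced to their vNested arrays.
DOCS = {
    "doc1": [{"v1": "User1", "v2": {"v21": 1}}, {"v1": "User3", "v2": {"v21": 3}}],
    "doc2": [{"v1": "User1", "v2": {"v21": 3}}, {"v1": "User2", "v2": {"v21": 2}}],
}


def is_user1(obj):
    return obj["v1"] == "User1"


def has_v21_of_3(obj):
    return obj["v2"]["v21"] == 3


def nested(objects, predicate):
    """Emulate one nested clause: true if ANY single nested object satisfies it."""
    return any(predicate(obj) for obj in objects)


def two_nested_clauses(objects):
    # Each clause may be satisfied by a different nested object.
    return nested(objects, is_user1) and nested(objects, has_v21_of_3)


def one_nested_clause(objects):
    # Both conditions must hold on the same nested object.
    return nested(objects, lambda obj: is_user1(obj) and has_v21_of_3(obj))


separate = [name for name, objs in DOCS.items() if two_nested_clauses(objs)]
combined = [name for name, objs in DOCS.items() if one_nested_clause(objs)]
print(separate)  # ['doc1', 'doc2']
print(combined)  # ['doc2']
```

In the first document, User1 and the value 3 live in different array elements, so only the two-clause form matches it.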

elasticsearch - fuzziness with bool_prefix type

I have the following query:
{
size: 6,
query: {
multi_match: {
query,
type: 'bool_prefix',
fields: ['recommendation', 'recommendation._2gram', 'recommendation._3gram'],
},
},
highlight: {
fields: {
recommendation: {},
},
},
}
I want to add fuzziness: 1 to this query, but it has issues with type: 'bool_prefix'. I need type: 'bool_prefix' to remain because it's integral to how the query works, but I'd also like to add some fuzziness. Any ideas?
As mentioned in the official ES documentation of bool_prefix:
The fuzziness, prefix_length, max_expansions, fuzzy_rewrite, and
fuzzy_transpositions parameters are supported for the terms that are
used to construct term queries, but do not have an effect on the
prefix query constructed from the final term.
Adding a working example with index mapping, data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"recommendation": {
"type": "search_as_you_type",
"max_shingle_size": 3
}
}
}
}
Index Data:
{
"recommendation":"good things"
}
{
"recommendation":"good"
}
Search Query:
You can add the fuzziness parameter with bool_prefix, as shown below:
{
"size": 6,
"query": {
"multi_match": {
"query": "goof q",
"type": "bool_prefix",
"fields": [
"recommendation",
"recommendation._2gram",
"recommendation._3gram"
],
"fuzziness": 1
}
},
"highlight": {
"fields": {
"recommendation": {}
}
}
}
Search Result:
"hits": [
{
"_index": "65817192",
"_type": "_doc",
"_id": "2",
"_score": 1.1203322,
"_source": {
"recommendation": "good things"
},
"highlight": {
"recommendation": [
"<em>good</em> things"
]
}
},
{
"_index": "65817192",
"_type": "_doc",
"_id": "1",
"_score": 0.1583319,
"_source": {
"recommendation": "good"
},
"highlight": {
"recommendation": [
"<em>good</em>"
]
}
}
]
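To see why "goof q" still finds the "good" documents, it can help to spell out the clauses a bool_prefix match is documented to build: every term except the last becomes a term query (fuzzy when fuzziness is set), and the last term becomes a prefix query. The Python sketch below approximates that expansion for a single field. The function name expand_bool_prefix is hypothetical, and whitespace splitting plus lowercasing is an assumed stand-in for the field's real analyzer.

```python
def expand_bool_prefix(text, field, fuzziness=None):
    """Approximate the bool query that a bool_prefix match builds for one field.

    Assumption: whitespace splitting plus lowercasing stands in for the
    field's real analyzer.
    """
    terms = text.lower().split()
    if not terms:
        return {"bool": {"should": []}}
    *leading, last = terms
    clauses = []
    for term in leading:
        if fuzziness is None:
            clauses.append({"term": {field: term}})
        else:
            # Fuzziness only affects the terms before the final one.
            clauses.append({"fuzzy": {field: {"value": term, "fuzziness": fuzziness}}})
    # The final term always becomes a prefix query, with no fuzziness applied.
    clauses.append({"prefix": {field: last}})
    return {"bool": {"should": clauses}}


expanded = expand_bool_prefix("goof q", "recommendation", fuzziness=1)
for clause in expanded["bool"]["should"]:
    print(clause)
```

Here "goof" is within one edit of "good", so the fuzzy clause matches both documents, while the prefix clause for "q" matches neither.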
I ended up combining an additional fuzzy query with the multi_match query inside a bool query. In your case it would look like this:
{
"size": 6,
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "goof q",
"type": "bool_prefix",
"fields": [
"recommendation",
"recommendation._2gram",
"recommendation._3gram"
]
}
},
{
"fuzzy": {
"nameSearch": {
"value": "goof q",
"fuzziness": "AUTO"
}
}
}
]
}
},
"highlight": {
"fields": {
"recommendation": {}
}
}
}

search array of strings by partially match in elasticsearch

I have a field like this:
names: ["Red:123", "Blue:45", "Green:56"]
Its mapping is:
"names": {
"type": "keyword"
},
How can I search like this
{
"query": {
"match": {
"names": "red"
}
}
}
to get all the documents where red appears in an element of the names array?
Right now it works only with
{
"query": {
"match": {
"names": "red:123"
}
}
}
You can either add multi-fields or just change the type to text to achieve the required result.
Index Mapping using multi fields
{
"mappings": {
"properties": {
"names": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings":{
"properties":{
"names":{
"type":"text"
}
}
}
}
Index Data:
{
"names": [
"Red:123",
"Blue:45",
"Green:56"
]
}
Search Query:
{
"query": {
"match": {
"names": "red"
}
}
}
Search Result:
"hits": [
{
"_index": "64665127",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"names": [
"Red:123",
"Blue:45",
"Green:56"
]
}
}
]
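The reason the text mapping matches "red" while the keyword mapping does not comes down to which terms end up in the index. The Python sketch below emulates this under a stated simplification: lowercasing and splitting on non-alphanumeric characters is only a rough stand-in for the standard analyzer, which actually follows Unicode text segmentation rules. All function names are invented for the illustration.

```python
import re

NAMES = ["Red:123", "Blue:45", "Green:56"]


def keyword_terms(values):
    """A keyword field indexes each array element as one unmodified term."""
    return set(values)


def text_terms(values):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    terms = set()
    for value in values:
        terms.update(t for t in re.split(r"[^0-9a-z]+", value.lower()) if t)
    return terms


def match_keyword(values, query):
    # No analysis on either side: the query must equal a whole element.
    return query in keyword_terms(values)


def match_text(values, query):
    # The query is analyzed the same way; any shared term counts as a match.
    return bool(text_terms([query]) & text_terms(values))


print(match_keyword(NAMES, "red"))      # False
print(match_keyword(NAMES, "Red:123"))  # True
print(match_text(NAMES, "red"))         # True
```

With the multi-field mapping, names behaves like the text case and names.raw like the keyword case.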

How to configure elasticsearch regexp query

I am trying to configure an Elasticsearch request. I use the query DSL and want to find documents that contain the word "swagger" in the "message" field.
Here is one of the correct results I want to get:
{
"_index": "apiconnect508",
"_type": "audit",
"_id": "AWF1us1T4ztincEzswAr",
"_score": 1,
"_source": {
"consumerOrgId": null,
"headers": {
"http_accept": "application/json",
"content_type": "application/json",
"request_path": "/apim-5a7c34e0e4b02e66c60edbb2-2018.02/auditevent",
"http_version": "HTTP/1.1",
"http_connection": "keep-alive",
"request_method": "POST",
"http_host": "localhost:9700",
"request_uri": "/apim-5a7c34e0e4b02e66c60edbb2-2018.02/auditevent",
"content_length": "533",
"http_user_agent": "Wink Client v1.1.1"
},
"nlsMessage": {
"resource": "messages",
"replacements": [
"test",
"1.0.0",
"ext_mafashagov@rencredit.ru"
],
"key": "swagger.import.notification"
},
"notificationType": "EVENT",
"eventType": "AUDIT",
"source": null,
"envId": null,
"message": "API test version 1.0.0 was created from a Swagger document by ext_mafashagov@rencredit.ru.",
"userId": "ext_mafashagov@rencredit.ru",
"orgId": "5a7c34e0e4b02e66c60edbb2",
"assetType": "api",
"tags": [
"_geoip_lookup_failure"
],
"gateway_geoip": {},
"datetime": "2018-02-08T14:04:32.731Z",
"@timestamp": "2018-02-08T14:04:32.747Z",
"assetId": "5a7c58f0e4b02e66c60edc53",
"@version": "1",
"host": "127.0.0.1",
"id": "5a7c58f0e4b02e66c60edc55",
"client_geoip": {}
}
}
I try to find this JSON with:
POST myAddress/_search
The following query works without the "regexp" clause. How should I configure the regexp part of my query?
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"range": {
"@timestamp" : {"gte" : "now-100d"}
}
},
{
"term": {
"_type": "audit"
}
},
{
"regexp" : {
"message": "*wagger*"
}
}
]
}
}
}
},
"sort": {
"TraceDateTime": {
"order": "desc",
"ignore_unmapped": "true"
}
}
}
If the message field is analyzed, this simple match query should work:
"match":{
"message":"*swagger*"
}
However, if it is not analyzed, either of the following two queries should work for you. Both are case sensitive, so consider lowercasing the field if you wish to keep it not analyzed. Note that a regexp pattern must match the entire term, hence the leading and trailing .* in the second query.
"wildcard":{
"message":"*swagger*"
}
or
"regexp":{
"message":".*swagger.*"
}
Be careful, as wildcard and regexp queries degrade performance, especially when the pattern starts with a wildcard.
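The effect of analysis and of whole-term matching can be tried out locally. The Python sketch below is only an approximation: re.fullmatch stands in for a regexp query (which has to match the entire term), fnmatchcase stands in for a wildcard query, and Python's regex syntax is not identical to Lucene's. The message is a shortened version of the one in the question, and the helper names are made up for this example.

```python
import re
from fnmatch import fnmatchcase

# Shortened version of the message field from the question.
MESSAGE = "API test version 1.0.0 was created from a Swagger document"


def regexp_matches(pattern, term):
    """Stand-in for a regexp query: the pattern must match the entire term."""
    return re.fullmatch(pattern, term) is not None


def wildcard_matches(pattern, term):
    """Stand-in for a wildcard query using shell-style * and ? matching."""
    return fnmatchcase(term, pattern)


# Not-analyzed field: the whole message is a single, case-preserved term.
print(regexp_matches("swagger", MESSAGE))      # False: pattern is not the whole term
print(regexp_matches(".*swagger.*", MESSAGE))  # False: the term has a capital S
print(regexp_matches(".*Swagger.*", MESSAGE))  # True
print(wildcard_matches("*Swagger*", MESSAGE))  # True

# Analyzed field (approximated): each lowercased word is its own term.
terms = MESSAGE.lower().split()
print(any(regexp_matches("swagger", t) for t in terms))  # True
```

This is why a bare "swagger" pattern can work on an analyzed field but not on a not-analyzed one.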

Empty inner_hits in compound Elasticsearch filter

I'm seeing what appears to be aberrant behavior in inner_hits results within nested boolean queries.
Test data (abbreviated for brevity):
# MAPPING
PUT unit_testing
{
"mappings": {
"document": {
"properties": {
"display_name": {"type": "text"},
"metadata": {
"properties": {
"NAME": {"type": "text"}
}
}
}
},
"paragraph": {
"_parent": {"type": "document"},
"_routing": {"required": true},
"properties": {
"checksum": {"type": "text"},
"sentences": {
"type": "nested",
"properties": {
"text": {"type": "text"}
}
}
}
}
}
}
# DOCUMENT X 2 (d0, d1)
PUT unit_testing/document/doc_id_d0
{
"display_name": "Test Document d0",
"paragraphs": [
"para_id_d0p0",
"para_id_d0p1"
],
"metadata": {"NAME": "Test Document d0 Metadata"}
}
# PARAGRAPH X 2 (d0p0, d1p0)
PUT unit_testing/paragraph/para_id_d0p0?parent=doc_id_d0
{
"checksum": "para_checksum_d0p0",
"sentences": [
{"text": "Test sentence d0p0s0"},
{"text": "Test sentence d0p0s1 ODD"},
{"text": "Test sentence d0p0s2 EVEN"},
{"text": "Test sentence d0p0s3 ODD"},
{"text": "Test sentence d0p0s4 EVEN"}
]
}
This initial query behaves as I would expect (I'm aware that the metadata filter isn't actually necessary in this example case):
GET unit_testing/paragraph/_search
{
"_source": "false",
"query": {
"bool": {
"must": [
{
"has_parent": {
"query": {
"match_phrase": {
"metadata.NAME": "Test Document d0 Metadata"
}
},
"type": "document"
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d0p0s0"
}
}
}
}
]
}
}
}
It yields an inner_hits object containing the one sentence that matched the predicate (some fields removed for clarity):
{
"hits": {
"hits": [
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"hits": [
{
"_source": {
"text": "Test sentence d0p0s0"
}
}
]
}
}
}
}
]
}
}
The following query is an attempt to embed the query above within a parent "should" clause, to create a logical OR between the initial query, and an additional query that matches a single sentence:
GET unit_testing/paragraph/_search
{
"_source": "false",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"has_parent": {
"query": {
"match_phrase": {
"metadata.NAME": "Test Document d0 Metadata"
}
},
"type": "document"
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d0p0s0"
}
}
}
}
]
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d1p0s0"
}
}
}
}
]
}
}
}
While the "d1" query outputs the result one would expect, with an inner_hits object containing the matching sentence, the original "d0" query now yields an empty inner_hits object:
{
"hits": {
"hits": [
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"total": 0,
"hits": []
}
}
}
},
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"hits": [
{
"_source": {
"text": "Test sentence d1p0s0"
}
}
]
}
}
}
}
]
}
}
Although I'm using the elasticsearch_dsl Python library to build and combine these queries, and I'm something of a novice with respect to the Query DSL, the query format looks solid to me.
What am I missing?
I think what is missing is the name parameter for inner_hits: you have two inner_hits clauses in two different queries that end up with the same name, which defaults to the nested path. Try giving each inner_hits its own name parameter (0).
0 - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#_options
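As a rough sketch of that suggestion, the combined query can be assembled in plain Python with a distinct inner_hits name per nested clause. The names d0_sentences and d1_sentences and the helper nested_sentence_query are arbitrary choices for this illustration; the rest mirrors the query from the question.

```python
import json


def nested_sentence_query(text, hits_name):
    """Nested match on sentences.text with an explicitly named inner_hits block."""
    return {
        "nested": {
            "path": "sentences",
            "query": {"match": {"sentences.text": text}},
            "inner_hits": {"name": hits_name},
        }
    }


parent_filter = {
    "has_parent": {
        "type": "document",
        "query": {"match_phrase": {"metadata.NAME": "Test Document d0 Metadata"}},
    }
}

body = {
    # Boolean false disables returning _source for the top-level hits.
    "_source": False,
    "query": {
        "bool": {
            "should": [
                {"bool": {"must": [parent_filter,
                                   nested_sentence_query("d0p0s0", "d0_sentences")]}},
                nested_sentence_query("d1p0s0", "d1_sentences"),
            ]
        }
    },
}

print(json.dumps(body, indent=2))
```

Each hit's inner_hits object is then keyed by these names instead of the shared default.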
