Use a compound "and" in ES - elasticsearch

Upon trying to emulate the following expression in ES
expr1 && expr2 && expr3
And I came up with this
curl -X GET "http://localhost:9200/policy_router-2019.12.30/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must" : [
{
"range" : { "#timestamp": { "gte": "now-15s", "lte": "now"} }
},
{
"query_string": { "query": "rewriteGateway", "default_field" : "message" }
},
{
"query_string": {"query": "policy-router-summer-snow-5555" ,"default_field" : "host" }
}
]
}
}
}'
But, it seems I'm not able to equate what I want, correctly with the above query. i.e every time I run the above query I see the documents which has host value different what I want it to be i.e policy-router-summer-snow-5555 here
I also tried nesting must inside the outer must but that resulted in a syntax error.
I'm failing to understand why does the last query_string expression for the host does not match.
Following is my ES version
{
"name" : "an2FbQZ",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "gIk5QPI3Rb6NPckeCRaUqQ",
"version" : {
"number" : "5.5.2",
"build_hash" : "b2f0c09",
"build_date" : "2017-08-14T12:33:14.154Z",
"build_snapshot" : false,
"lucene_version" : "6.6.0"
},
"tagline" : "You Know, for Search"
}
Following is how my document looks like (this is one of the documents returned for the above query, it can be seen clearly that the host does not match)
{
"_index" : "policy_router-2019.12.31",
"_type" : "policy_router",
"_id" : "AW9aG7_1tIiuv3oe07ZO",
"_score" : 4.6003995,
"_source" : {
"severity" : "INFO",
"input" : "udp",
"#timestamp" : "2019-12-31T03:59:25.107Z",
"#version" : "1",
"host" : "policy-router-proud-cherry-2098",
"message" : "2019-12-31 03:59:25.107111 I [PolicyRouter::Push] PolicyRouter -- PR -> LA ... rewrite gateway ... ",
"type" : "policy_router"
}
}```
Any guidance here from ES expert.

Try this query along with your field's name and value.
{
"query": {
"bool": {
"must": [
{
"range": {
"FIELD": {
"gte": 10,
"lte": 20
}
}
},
{
"match_phrase": {
"FIELD": "PHRASE"
}
},
{
"match_phrase": {
"FIELD": "PHRASE"
}
}
]
}
}
}

Related

ELK bool query with match and prefix

I'm new in ELK. I have a problem with the followed search query:
curl --insecure -H "Authorization: ApiKey $ESAPIKEY" -X GET "https://localhost:9200/commsrch/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"should" : [
{"match" : {"cn" : "franc"}},
{"prefix" : {"srt" : "99889300200"}}
]
}
}
}
'
I need to find all documents that satisfies the condition: OR field "cn" contains "franc" OR field "srt" starts with "99889300200".
Index mapping:
{
"commsrch" : {
"mappings" : {
"properties" : {
"addr" : {
"type" : "text",
"index" : false
},
"cn" : {
"type" : "text",
"analyzer" : "compname"
},
"srn" : {
"type" : "text",
"analyzer" : "srnsrt"
},
"srt" : {
"type" : "text",
"analyzer" : "srnsrt"
}
}
}
}
}
Index settings:
{
"commsrch" : {
"settings" : {
"index" : {
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"number_of_shards" : "1",
"provided_name" : "commsrch",
"creation_date" : "1675079141160",
"analysis" : {
"filter" : {
"ngram_filter" : {
"type" : "ngram",
"min_gram" : "3",
"max_gram" : "4"
}
},
"analyzer" : {
"compname" : {
"filter" : [
"lowercase",
"stop",
"ngram_filter"
],
"type" : "custom",
"tokenizer" : "whitespace"
},
"srnsrt" : {
"type" : "custom",
"tokenizer" : "standard"
}
}
},
"number_of_replicas" : "1",
"uuid" : "C15EXHnaTIq88JSYNt7GvA",
"version" : {
"created" : "8060099"
}
}
}
}
}
Query works properly with just only one condition. If query has only "match" condition, results has properly documents count. If query has only "prefix" condition, results has properly documents count.
In case of two conditions "match" and "prefix", i see in result documents that corresponds only "prefix" condition.
In ELK docs can't find any limitation about mixing "prefix" and "match", but as i see some problem exists. Please help to find where is the problem.
In continue of experince I have one more problem.
Example:
Source data:
1st document cn field: "put stone is done"
2nd document cn field:: "job one or two"
Mapping and index settings the same as described in my first post
Request:
{
"query": {
"bool": {
"should" : [
{"match" : {"cn" : "one"}},
{"prefix" : {"cn" : "one"}}
]
}
}
}
'
As I understand, the high scores got first document, because it has more repeats of "one". But I need high scores for documents, that has at least one word in field "cn" started from string "one". I have experiments with query:
{
"query": {
"bool": {
"should": [
{"match": {"cn": "one"}},
{
"constant_score": {
"filter": {
"prefix": {
"cn": "one"
}
},
"boost": 100
}
}
]
}
}
}
But it doesn't work properly. What's wrong with my query?

Elasticsearch Nested query not working as expected

I am bit new to elastic search. I am trying a nested query to get the result soem thing like below sql in query DSL..means I wanna restrict the search to driver last name as well as the vehicle make as well..like below use case.
select driver.last_name,driver.vehicle.make,driver.vehicle.model from drivers
where driver.last_name='Hudson' and driver.vehicle.make"="Miller-Mete;
But this doesn't work in elastic search sql as well as Query DSL...
--> can we do the query like this in ES...like let me clarify..
if department has List[employees] in Elasticsearch denoarmalized data..
and i want to restrict the query to department_name and emp_position..
--> is this use case even possible in elastic search?
select department_name,emp_name,emp_salary,emp_position
where emp_position="Intern" and department.name="devlopment"
--> Below are mappings and search Query DSL...
PUT /drivers
{
"mappings": {
"properties": {
"driver": {
"type": "nested",
"properties": {
"last_name": {
"type": "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"vehicle": {
"type": "nested",
"properties": {
"make": {
"type": "text"
},
"model": {
"type": "text"
}
}
}
}
}
}
}
}
GET /drivers/_mapping
O/P:
{
"drivers" : {
"mappings" : {
"properties" : {
"driver" : {
"type" : "nested",
"properties" : {
"last_name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"vehicle" : {
"type" : "nested",
"properties" : {
"make" : {
"type" : "text"
},
"model" : {
"type" : "text"
}
}
}
}
}
}
}
}
}
--> inserting documents..
PUT /drivers/_doc/1
{
"driver" : {
"last_name" : "McQueen",
"vehicle" : [
{
"make" : "Powell Motors",
"model" : "Canyonero"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
PUT /drivers/_doc/2
{
"driver" : {
"last_name" : "Hudson",
"vehicle" : [
{
"make" : "Mifune",
"model" : "Mach Five"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
}
--> Below is the search query dsl..this gives 0 results. Even i replace
"term": {
"driver.last_name.keyword": "McQueen"
}
with "match" or "filter" still gives 0 results...
GET /drivers/_search
{
"query": {
"nested": {
"path": "driver",
"query": {
"nested": {
"path": "driver.vehicle",
"query": {
"bool": {
"must": [
{ "match": { "driver.vehicle.make": "Powell Motors" } },
{ "match": { "driver.vehicle.model": "Canyonero" } },
{
"term": {
"driver.last_name.keyword": "McQueen"
}
}
]
}
}
}
}
}
}
}
==> below Query DSL gives 2 results...
GET /drivers/_search
{
"query": {
"nested": {
"path": "driver",
"query": {
"nested": {
"path": "driver.vehicle",
"query": {
"bool": {
"must": [
{ "match": { "driver.vehicle.make": "Miller-Meteor" } }
]
}
}
}
}
}
}
}
O/P:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.3097506,
"hits" : [
{
"_index" : "drivers",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.3097506,
"_source" : {
"driver" : {
"last_name" : "McQueen",
"vehicle" : [
{
"make" : "Powell Motors",
"model" : "Canyonero"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
}
},
{
"_index" : "drivers",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.3097506,
"_source" : {
"driver" : {
"last_name" : "Hudson",
"vehicle" : [
{
"make" : "Mifune",
"model" : "Mach Five"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
}
}
]
}
}
==> this gives "parsing_exception",
"reason" : "[bool] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
==> even replacing 1st query bool to "match" also gives this error...as below
GET /drivers/_search
{
"query": {
"nested": {
"path": "driver",
"query": {
"bool": {
"must": [
{"match": {
"driver.last_name.keyword": "Hudson"
}}
]
},
"nested": {
"path": "driver.vehicle",
"query": {
"bool": {
"must": [
{
"match": {
"driver.vehicle.make": "Miller-Meteor"
}
}
]
}
}
}
}
}
}

Combining nested query get illegal_state_exception failed to find nested object under path

I'm creating a query on Elasticsearch, for find documents through all indices.
I need to combine should, must and nested query on Elasticsearch, i get the right result but i get an error inside the result.
This is the query I'm using
GET _all/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{ "term": { "trimmed_final_url": "https://www.repubblica.it/t.../" } }
],
"must": [
{
"nested": {
"path": "entities",
"query": {
"bool": {
"must": [
{ "term": { "entities.id": "138511" } }
]
}
}
}
},
{
"term": {
"language": { "value": "it" }
}
}
]
}
}
And this is the result
{
"_shards" : {
"total" : 38,
"successful" : 14,
"skipped" : 0,
"failed" : 24,
"failures" : [
{
"shard" : 0,
"index" : ".kibana_1",
"node" : "7twsq85TSK60LkY0UiuWzA",
"reason" : {
"type" : "query_shard_exception",
"reason" : """
failed to create query: {
...
"index_uuid" : "HoHi97QFSaSCp09iSKY1DQ",
"index" : ".reporting-2019.06.02",
"caused_by" : {
"type" : "illegal_state_exception",
"reason" : "[nested] failed to find nested object under path [entities]"
}
}
},
...
"hits" : {
"total" : {
"value" : 50,
"relation" : "eq"
},
"max_score" : 16.90015,
"hits" : [
{
"_index" : "i_201906_v1",
"_type" : "_doc",
"_id" : "MugcbmsBAzi8a0oJt96Q",
"_score" : 16.90015,
"_source" : {
"language" : "it",
"entities" : [
{
"id" : 101580,
},
{
"id" : 156822,
},
...
I didn't write some fields because the code is too long
I am new to StackOverFlow (made this account to answer this question :D) so if this answer is out of line bear with me. I have been dabbling in nested fields in Elasticsearch recently so I have some ideas as to how this error could be appearing.
Have you defined a mapping for your document type? I don't believe Elasticsearch will recognize the field as nested if you do not tell it to do so in the mapping:
PUT INDEX_NAME
{
"mappings": {
"DOC_TYPE": {
"properties": {
"entities": {"type": "nested"}
}
}
}
}
You may have to specify this mapping for each index and document type. Not sure if there is a way to do that all with one request.
I also noticed you have a "should" clause with minimum matches set to 1. I believe this is exactly the same as a "must" clause so I am not sure what purpose this achieves (correct me if I'm wrong). If your mapping is specified, the query should look something like this:
GET /_all/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "entities",
"query": {
"term": {
"entities.id": {
"value": "138511"
}
}
}
}
},
{
"term": {
"language": {
"value": "it"
}
}
},
{
"term": {
"trimmed_final_url": {
"value": "https://www.repubblica.it/t.../"
}
}
}
]
}
}
}

ElasticSearch Query aggregations per #timestamp hour

im making a Query on elasticSearch, of metricbeat, to rate the most used process per hourly, in these moment i'm aggregating per process start time, and process name, i need to "divide" these groups using field "#timestamp" hourly
that's my actual query
GET metricbeat*/_search?
{"query": {
"bool": {
"must": [
{ "wildcard" : { "beat.hostname" : "ibmcx*" }},
{ "range": {
"#timestamp": {
"gte": "2019-03-22T00:00:00",
"lte": "2019-03-23T00:00:00"}}},
{"terms" : { "beat.hostname" : ["ibmcxapp101", "ibmcxapp102", "ibmcxapp103",
"ibmcxapp104", "ibmcxapp105", "ibmcxapp106", "ibmcxapp107",
"ibmcxapp108", "ibmcxapp109", "ibmcxapp110", "ibmcxapp111",
"ibmcxapp112", "ibmcxapp113", "ibmcxapp114", "ibmcxapp115",
"ibmcxapp116", "ibmcxapp117", "ibmcxapp118", "ibmcxapp119",
"ibmcxapp120", "ibmcxapp121", "ibmcxapp122", "ibmcxxaa100",
"ibmcxxaa101", "ibmcxxaa102", "ibmcxxaa103", "ibmcxxaa104",
"ibmcxxaa105", "ibmcxxaa106", "ibmcxxaa107", "ibmcxxaa108",
"ibmcxxaa109", "ibmcxxaa110", "ibmcxxaa111", "ibmcxxaa112",
"ibmcxxaa201", "ibmcxxaa202", "ibmcxxaa203", "ibmcxxaa204"
] }},
{"exists": {"field": "system.process.cmdline"}}
],
"must_not": [
{"term" : { "system.process.username" : "NT AUTHORITY\\SYSTEM" }},
{"term" : { "system.process.username" : "NT AUTHORITY\\NETWORK SERVICE" }},
{"term" : { "system.process.username" : "NT AUTHORITY\\LOCAL SERVICE" }},
{"term" : { "system.process.username" : "NT AUTHORITY\\Servicio de red"}},
{"term" : { "system.process.username" : "" }}
]
}
},
"size": 0,
"aggs": {
"group_by_start_time": {
"terms": {
"field": "system.process.cpu.start_time"
},
"aggs": {
"group_by_name": {
"terms": {
"field": "system.process.name.keyword"
}
}
}
}
},
"size": 0,
"sort" : [
{ "system.process.cpu.start_time" : {"order" : "asc"}},
{ "#timestamp" : {"order" : "asc"}},
{ "system.process.pid" : {"order" : "desc"}}
]}
It's a bit hard to follow and reproduce — a minimal example (I think the entire query is not really needed) and sample docs would go a long way.
If you want to have an hourly aggregation, the first thing you'll need to do is that aggregation and then run the others inside.
The minimal example for an hourly aggregation would be:
POST /metricbeat*/_search?size=0
{
"aggs" : {
"metrics_per_hour" : {
"date_histogram" : {
"field" : "#timestamp",
"interval" : "hour"
}
}
}
}
Folding in the other aggregation would look like this:
POST /metricbeat*/_search?size=0
{
"aggs" : {
"metrics_per_hour" : {
"date_histogram" : {
"field" : "#timestamp",
"interval" : "hour"
},
"aggs" : {
...
}
}
}
}
PS: If you are using a daily index pattern, you could just use the right day instead of the wildcard one and then skip this part of the query:
"range": {
"#timestamp": {
"gte": "2019-03-22T00:00:00",
"lte": "2019-03-23T00:00:00"
}
}

How to see which of the queries in boolean is matched?

I have given multiple queries using the bool query. Now it can happen that some of them might have matches and some queries might not have matches in the database. How can I know which of the queries had a match?
For example, here I have a bool query with two should conditions against the field landMark.
{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": "wendys"
}
},
{
"match": {
"landMark": "starbucks"
}
}
]
}
}
}
How can I know which one of them matched in the above query if only one of them matches the documents?
You can use named queries for this purpose. Try this
{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": {
"query": "wendys",
"_name": "wendy match"
}
}
},
{
"match": {
"landMark": {
"query": "starbucks",
"_name": "starbucks match"
}
}
}
]
}
}
}
you can use any _name . In response you will get something like this
"matched_queries": ["wendy match"]
so you will be able to tell which query matched that specific document.
Named query is certainly the way to go.
LINK - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-named-queries-and-filters.html
Idea of named query is simple , you tag a name to each of your query and in the result , it shows which all tags matched per document.
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "wendys near starbucks" }'
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "wendys" }'
curl -XPOST 'http://localhost:9200/data/data' -d ' { "landMark" : "starbucks" }'
Hence create you query in this fashion -
curl -XPOST 'http://localhost:9200/data/_search?pretty' -d '{
"query": {
"bool": {
"should": [
{
"match": {
"landMark": {
"query": "wendys",
"_name": "wendy_is_a_match"
}
}
},
{
"match": {
"landMark": {
"query": "starbucks",
"_name": "starbuck_is_a_match"
}
}
}
]
}
}
}'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.581694,
"hits" : [ {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNNCY3OZJfBZCJ_tO",
"_score" : 0.581694,
"_source": { "landMark" : "wendys near starbucks" },
"matched_queries" : [ "starbuck_is_a_match", "wendy_is_a_match" ] ---> "Matched tags
}, {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNS0z3OZJfBZCJ_tQ",
"_score" : 0.1519148,
"_source": { "landMark" : "starbucks" },
"matched_queries" : [ "starbuck_is_a_match" ]
}, {
"_index" : "data",
"_type" : "data",
"_id" : "AVMCNRsF3OZJfBZCJ_tP",
"_score" : 0.04500804,
"_source": { "landMark" : "wendys" },
"matched_queries" : [ "wendy_is_a_match" ]
} ]
}
}

Resources