ElasticSearch: How to apply regular expression on indices - elasticsearch

I am trying to restrict the return of a search query to only those indices that start with abc-* pattern.
I tried the following regex but it didn't work.
{
"sort": [
{
"timestamp": {
"order": "desc",
"unmapped_type": "boolean"
}
}
],
"query": {
"filtered": {
"query": {
"query_string": {
"regexp": {
"index": "abc-*"
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "now-24h"
}
}
}
]
}
}
}
}
}
Is it possible to use the indices query and apply regex on it?
even the following doesn't filter appropriately:
{
"sort": [
{
"timestamp": {
"order": "desc",
"unmapped_type": "boolean"
}
}
],
"query": {
"filtered": {
"query": {
"indices" : {
"query" : { "regexp" : { "index" : "abc-.*" } }
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "now-24h"
}
}
}
]
}
}
}
}
}

There's a much easier solution simply by means of specifying your index pattern in the URL directly:
POST /abc-*/_search
{
"sort": [
{
"timestamp": {
"order": "desc",
"unmapped_type": "boolean"
}
}
],
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "now-24h"
}
}
}
]
}
}
}
}
}

Not sure but faced same problem in different case . I think problem with - in "abc-*" .
just replace - with space , it will work
"index": "abc *"

The index pattern in the URL only supports native expressions, not regex expressions. It does solve the problem though.

Related

How to combine Boolean AND with Boolean OR in Elasticsearch query?

Query: Get employee name "Mahesh" whose id is "200" and joining datetime is in a given date range and his epf status must be either 'NOK' or 'WRN'. (Possible values of epf_status are {OK,NOK,WRN,CANCELLED}.
I have written the following query, that matches epf_status also with OK, CANCELLED, but it must only match when epf_status is either 'NOK' or 'WRN'. What else do I need to change to make it work, as required?
GET myindex01/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"empname": { "query": "Mahesh", "operator": "AND" }
}
},
{
"match": {
"empid": { "query": "200", "operator": "AND" }
}
},
{
"range": {
"joining_datetime": {
"gte": "2020-01-01T00:00:00",
"lte": "2022-06-24T23:59:59"
}
}
}
],
"should": [
{ "match": { "epf_status": "NOK" } },
{ "match": { "epf_status": "WRN" } }
]
}
}
}
SAMPLE DATA:
{"Mahesh","200","2022-04-01","OK"}
{"Mahesh","200","2022-04-01","NOK"}
{"Mahesh","200","2022-04-01","WRN"}
{"Mahesh","200","2022-04-01","CANCELLED"}
REQUIRED OUTPUT:
{"Mahesh","200","2022-04-01","NOK"}
{"Mahesh","200","2022-04-01","WRN"}
Tldr;
You could be using the terms query for that I believe.
Returns documents that contain one or more exact terms in a provided field.
To solve
GET myindex01/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"empname": { "query": "Mahesh", "operator": "AND" }
}
},
{
"match": {
"empid": { "query": "200", "operator": "AND" }
}
},
{
"range": {
"joining_datetime": {
"gte": "2020-01-01T00:00:00",
"lte": "2022-06-24T23:59:59"
}
}
}
],
"should": [
{ "terms": { "epf_status": ["NOK", "WRN"] } }
]
}
}
}

Combine multiple individual queries into one to get aggregated result in Elasticsearch

I have built two queries in ElasticSearch to get the counts for each error message. for example, the first query is to get how many error messages related to "was not found" error
GET /logstash*/_search
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"match": {
"kubernetes.pod_name": "api"
}
},
{
"match": {
"log": "error"
}
},
{
"match": {
"log": {
"query": "was not found",
"operator": "and"
}
}
},
{
"range": {"#timestamp": {
"time_zone": "CET",
"gt": "now-7d",
"lte": "now"}}
}
]
}
}
}
},
"aggs" : {
"type_count" : {
"value_count" : {
"script" : {
"source" : "doc['log.keyword'].value"
}
}
}
}
}
The second query is to get the count of error messages related to "Duplicate Entry" error
GET /logstash*/_search
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"match": {
"kubernetes.pod_name": "api"
}
},
{
"match": {
"log": "error"
}
},
{
"match": {
"log": {
"query": "Duplicate entry",
"operator": "and"
}
}
},
{
"range": {"#timestamp": {
"time_zone": "CET",
"gt": "now-7d",
"lte": "now"}}
}
]
}
}
}
},
"aggs" : {
"type_count" : {
"value_count" : {
"script" : {
"source" : "doc['log.keyword'].value"
}
}
}
}
}
My boss really wants me to combine these individual query into a one big query, then get the list of counts for each error messages in one output. Since we have a lot of error messages, which means we have to write each query for each error message, then we have to run each query to get the counts. Is there a way I can click one run to get the list of counts?
I have been trying use query string query and looking for solutions on either Stack Overflow and Documentation. However, there is no luck
You can use filter aggregation along with the value_count aggregation to combine these two queries. In both the queries, out of the 4 queries inside must clause only one differs. You can take this out and combine them with the two filter aggregations as below:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"match": {
"kubernetes.pod_name": "api"
}
},
{
"match": {
"log": "error"
}
},
{
"range": {
"#timestamp": {
"time_zone": "CET",
"gt": "now-7d",
"lte": "now"
}
}
}
]
}
}
}
},
"aggs": {
"not_found_count": {
"filter": {
"match": {
"log": {
"query": "was not found",
"operator": "and"
}
}
},
"aggs": {
"count": {
"value_count": {
"script": {
"source": "doc['log.keyword'].value"
}
}
}
}
},
"duplicate_entry_count": {
"filter": {
"match": {
"log": {
"query": "Duplicate entry",
"operator": "and"
}
}
},
"aggs": {
"count": {
"value_count": {
"script": {
"source": "doc['log.keyword'].value"
}
}
}
}
}
}
}

Filtered bool vs Bool query : elasticsearch

I have two queries in ES. Both have different turnaround time on the same set of documents. Both are doing the same thing conceptually. I have few doubts
1- What is the difference between these two?
2- Which one is better to use?
3- If both are same why they are performing differently?
1. Filtered bool
{
"from": 0,
"size": 5,
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"called_party_address_number": "1987112602"
}
},
{
"term": {
"original_sender_address_number": "6870340319"
}
},
{
"range": {
"x_event_timestamp": {
"gte": "2016-07-01T00:00:00.000Z",
"lte": "2016-07-30T00:00:00.000Z"
}
}
}
]
}
}
}
},
"sort": [
{
"x_event_timestamp": {
"order": "desc",
"ignore_unmapped": true
}
}
]
}
2. Simple Bool
{
"query": {
"bool": {
"must": [
{
"term": {
"called_party_address_number": "1277478699"
}
},
{
"term": {
"original_sender_address_number": "8020564722"
}
},
{
"term": {
"cause_code": "573"
}
},
{
"range": {
"x_event_timestamp": {
"gt": "2016-07-13T13:51:03.749Z",
"lt": "2016-07-16T13:51:03.749Z"
}
}
}
]
}
},
"from": 0,
"size": 10,
"sort": [
{
"x_event_timestamp": {
"order": "desc",
"ignore_unmapped": true
}
}
]
}
Mapping:
{
"ccp": {
"mappings": {
"type1": {
"properties": {
"original_sender_address_number": {
"type": "string"
},
"called_party_address_number": {
"type": "string"
},
"cause_code": {
"type": "string"
},
"x_event_timestamp": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
.
.
.
}
}
}
}
}
Update 1:
I tried bool/must query and bool/filter query on same set of data,but I found the strange behaviour
1-
bool/must query is able to search the desired document
{
"query": {
"bool": {
"must": [
{
"term": {
"called_party_address_number": "8701662243"
}
},
{
"term": {
"cause_code": "401"
}
}
]
}
}
}
2-
While bool/filter is not able to search the document. If I remove the second field condition it searches the same record with field2's value as 401.
{
"query": {
"bool": {
"filter": [
{
"term": {
"called_party_address_number": "8701662243"
}
},
{
"term": {
"cause_code": "401"
}
}
]
}
}
}
Update2:
Found a solution of suppressing scoring phase with bool/must query by wrapping it within "constant_score".
{
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"called_party_address_number": "1235235757"
}
},
{
"term": {
"cause_code": "304"
}
}
]
}
}
}
}
}
Record we are trying to match have "called_party_address_number": "1235235757" and "cause_code": "304".
The first one uses the old 1.x query/filter syntax (i.e. filtered queries have been deprecated in favor of bool/filter).
The second one uses the new 2.x syntax but not in a filter context (i.e. you're using bool/must instead of bool/filter). The query with 2.x syntax which is equivalent to your first query (i.e. which runs in a filter context without score calculation = faster) would be this one:
{
"query": {
"bool": {
"filter": [
{
"term": {
"called_party_address_number": "1277478699"
}
},
{
"term": {
"original_sender_address_number": "8020564722"
}
},
{
"term": {
"cause_code": "573"
}
},
{
"range": {
"x_event_timestamp": {
"gt": "2016-07-13T13:51:03.749Z",
"lt": "2016-07-16T13:51:03.749Z"
}
}
}
]
}
},
"from": 0,
"size": 10,
"sort": [
{
"x_event_timestamp": {
"order": "desc",
"ignore_unmapped": true
}
}
]
}

ElasticSeach query not return results

I got 2 querys, the different between them is just 1 filter term.
The first query:
GET _search
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*"
}
},
"filter": {
"and": {
"filters": [
{
"term": {
"type": "log"
}
},
{
"term": {
"context.blueprint_id": "adv1"
}
},
{
"term": {
"context.deployment_id": "deploy1"
}
}
]
}
}
}
}
}
return this result:
{
"_source": {
"level": "info",
"timestamp": "2014-03-24 10:12:41.925680",
"message_code": null,
"context": {
"blueprint_id": "Adv1",
"execution_id": "27efcba7-3a60-4270-bbe2-9f17d602dcef",
"deployment_id": "deploy1"
},
"type": "log",
"#version": "1",
"#timestamp": "2014-03-24T10:12:41.927Z"
}
}
The second query is:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*"
}
},
"filter": {
"and": {
"filters": [
{
"term": {
"type": "log"
}
},
{
"term": {
"context.blueprint_id": "adv1"
}
},
{
"term": {
"context.deployment_id": "deploy1"
}
},
{
"term": {
"context.execution_id": "27efcba7-3a60-4270-bbe2-9f17d602dcef"
}
}
]
}
}
}
}
}
return empty results.
the different between them is in the second query, i just add this term:
{
"term": {
"context.execution_id": "27efcba7-3a60-4270-bbe2-9f17d602dcef"
}
}
and in the result we can see that there is result match to that query, but it still not work.
what i'm doing wrong here?
thanks.
By default, ElasticSearch will treat string fields as text and will analyze them (i.e. tokenize, stem etc. before indexing). This means you might not be able to find them when searching for their exact content.
You should make sure the mapping for the execution_id field is not analyzed. Start with GET /_mappings and work from there. :)

ElasticSearch ignoring sort when filtered

ElasticSearch Version: 0.90.1, JVM: 1.6.0_51(20.51-b01-457)
I'm trying to do two things with my ElasticSearch query: 1) filter the results based on a boolean (searchable) and "open_date < tomorrow" and 2) two sort by the field "open_date" DESC
This produces the following query:
{
"query": {
"bool": {
"should": [
{
"prefix": {
"name": "foobar"
}
},
{
"query_string": {
"query": "foobar"
}
},
{
"match": {
"name": {
"query": "foobar"
}
}
}
],
"minimum_number_should_match": 1
},
"filtered": {
"filter": {
"and": [
{
"term": {
"searchable": true
}
},
{
"range": {
"open_date": {
"lt": "2013-07-16"
}
}
}
]
}
}
},
"sort": [
{
"open_date": "desc"
}
]
}
However, the results that come back are not being sorted by "open_date". If I remove the filter:
{
"query": {
"bool": {
"should": [
{
"prefix": {
"name": "foobar"
}
},
{
"query_string": {
"query": "foobar"
}
},
{
"match": {
"name": {
"query": "foobar"
}
}
}
],
"minimum_number_should_match": 1
}
},
"sort": [
{
"open_date": "desc"
}
]
}
... the results come back as expected.
Any ideas?
I'm not sure about the Tire code, but the JSON does not correctly construct a filtered query. My guess is that this overflows and causes the sort element to also not be correctly parsed.
A filtered query should be constructed like this (see http://www.elasticsearch.org/guide/reference/query-dsl/filtered-query/ ):
{
"query": {
"filtered": { // Note: this contains both query and filter
"query": {
"bool": {
"should": [
{
"prefix": {
"name": "foobar"
}
},
{
"query_string": {
"query": "foobar"
}
},
{
"match": {
"name": {
"query": "foobar"
}
}
}
],
"minimum_number_should_match": 1
}
},
"filter": {
"and": [
{
"term": {
"searchable": true
}
},
{
"range": {
"open_date": {
"lt": "2013-07-16"
}
}
}
]
}
}
},
"sort": [
{
"open_date": "desc"
}
]
}
Cheers,
Boaz

Resources