Elasticsearch wildcard query on numeric fields without using mapping - elasticsearch

I'm looking for a creative solution because I can't change the mapping, as the solution is already in production.
I have this query:
{
"size": 4,
"query": {
"bool": {
"filter": [
{
"range": {
"time": {
"from": 1597249812405,
"to": null,
}
}
},
{
"query_string": {
"query": "*181*",
"fields": [
"deId^1.0",
"deTag^1.0",
],
"type": "best_fields",
"default_operator": "or",
"max_determinized_states": 10000,
"enable_position_increments": true,
"fuzziness": "AUTO",
"fuzzy_prefix_length": 0,
"fuzzy_max_expansions": 50,
"phrase_slop": 0,
"escape": false,
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"time": {
"order": "asc"
}
}
]
}
"deId" field is an integer in elasticsearch and the query returns nothing (though should),
Is there a solution to search for wildcards in numeric fields without using the multi field option which requires mapping?

Once you index an integer, Elasticsearch does not treat its individual digits as position-sensitive tokens. In other words, it's not directly possible to run wildcards on numeric datatypes.
There are some sub-optimal ways of solving this (think scripting & String.substring) but the easiest is to convert those integers to strings.
Let's look at an example deId of 123181994:
POST prod/_doc
{
"deId_str": "123181994"
}
then
GET prod/_search
{
"query": {
"bool": {
"filter": [
{
"query_string": {
"query": "*181*",
"fields": [
"deId_str"
]
}
}
]
}
}
}
works like a charm.
Since your index/mapping is already in production, look into _update_by_query and stringify all the necessary numbers in a single call. After that, if you don't want to (and/or cannot) pass the strings at index time, use ingest pipelines to do the conversion for you.
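For example, here is a sketch of that one-off backfill with _update_by_query; the target field name deId_str is illustrative, and the exists query limits the update to documents that actually carry a deId:
POST prod/_update_by_query
{
"script": {
"lang": "painless",
"source": "ctx._source.deId_str = ctx._source.deId.toString()"
},
"query": {
"exists": {
"field": "deId"
}
}
}
And a matching ingest pipeline sketch using the convert processor, so future documents get the string copy automatically (the pipeline name is illustrative):
PUT _ingest/pipeline/stringify-deid
{
"processors": [
{
"convert": {
"field": "deId",
"target_field": "deId_str",
"type": "string",
"ignore_missing": true
}
}
]
}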

Related

AWS Elastic Search(v7.10.2) cannot use `collapse` in conjunction with `search_after`

I'm using AWS Elastic Search version 7.10.2 and getting the error "cannot use `collapse` in conjunction with `search_after`" on the following query. Please let me know what could be a possible solution to fix it. I cannot upgrade the ES version, as from the AWS console it looks like this is the maximum supported. Please let me know if there is an alternative query that achieves the same result.
I wanted to use search_after for pagination support, and collapse to pick the latest version of each document.
StreamingSegmentId: the unique id of each document.
StreamingSegmentStartTime: my query can return many unique documents, and I want to sort the entire result set in ascending order of StreamingSegmentStartTime.
FanoutPublishTimestamp: the timestamp at which the doc was published. Since I want the latest version of each document, I sort all the docs belonging to the same StreamingSegmentId in descending order and use collapse to pick the top one.
GET /sessions/_search
{
"size": 2,
"query": {
"bool": {
"must": [
{
"terms": {
"benefitId": [
"PRIME"
],
"boost": 1
}
},
{
"range": {
"streamingSegmentStartTime": {
"from": 1647821557000,
"to": 1647825157000,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"_source": {
"includes": [
"deviceTypeId",
"timeline"
],
"excludes": []
},
"sort": [
{
"streamingSegmentStartTime": {
"order": "asc"
},
"fanoutPublishTimestamp" : {
"order": "desc"
}
}
],
"search_after": [
6749348300022,
6749348300048
],
"collapse": {
"field": "streamingSegmentId"
}
}

ElasticSearch : how to sort ES document based on document version

Following is my ES query. I want to sort my documents in descending order based on "_version", but I'm not sure how to do it:
{
"query":
{
"bool":
{
"must":
[
{
"terms":
{
"streamingSegmentId":
[
"00003319-b7fa-3409-806a-fa3bb5d2be26"
],
"boost": 1
}
},
{
"range":
{
"streamingSegmentStartTime":
{
"from": 1644480000000,
"to": 1647476658447,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"version": true,
"_source":
{
"includes":
[
"errorCount",
"benefitId",
"streamingSegmentStopTime",
"fanoutPublishTimestamp",
"sessionUpdateTime",
"contentSegmentUpdateTime"
],
"excludes":
[]
},
"sort":
[
{
"streamingSegmentStartTime":
{
"order": "asc"
}
},
{
"_version":
{
"order": "desc"
}
}
]
}
I was not able to sort based on _version, hence I added a timestamp field called fanoutPublishTimestamp to sort my documents in descending order of time. Following is my updated query, in which I use collapse to fetch only the latest-timestamp document. The problem I recently started facing with this query is that collapse cannot be used with search_after, and I'm using search_after to add pagination support.
I'm using AWS Elastic search, which runs ES version 7.10, and only ES 8.1 supports collapse together with search_after. Please let me know if anybody has a better solution to deal with this issue.
GET /sessions/_search
{
"size": 2,
"query": {
"bool": {
"must": [
{
"terms": {
"benefitId": [
"PRIME"
],
"boost": 1
}
},
{
"range": {
"streamingSegmentStartTime": {
"from": 1647821557000,
"to": 1647825157000,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"_source": {
"includes": [
"deviceTypeId",
"timeline"
],
"excludes": []
},
"sort": [
{
"streamingSegmentStartTime": {
"order": "asc"
}
},
{
"fanoutPublishTimestamp": {
"order": "desc"
}
}
],
"search_after": [
"1647821557001",
"1647829603837"
],
"collapse": {
"field": "streamingSegmentId"
}
}
As you didn't provide your sample documents or explain what "not working" means, I am assuming it's because of the two sort params you are using; if you use just the one on _version, it works (tested locally on my sample documents).
Most likely your other sort criterion, streamingSegmentStartTime, is causing some documents with a higher _version to come later in the response; try removing it and see if that gives you the expected result.
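For instance, a sketch of the sort section with the streamingSegmentStartTime criterion removed, keeping only the _version sort:
"sort": [
{
"_version": {
"order": "desc"
}
}
]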

How to search a substring in a json string attribute in Kibana (Elastic Search)?

I have an attribute stored in Elasticsearch. The attribute is roughly of this form:
{
"a":{
"key1":"value1",
"key2":"value2"
}
}
Now, I want to search for all instances which have value1 defined. How can I achieve this with a Kibana query?
Below is the query:
GET ${index}/_search
{
"from": 0,
"size": 200,
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"match_phrase": {
"a.key1": {
"query": "value1",
"slop": 0,
"zero_terms_query": "NONE",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}
If you want to query all the instances, you also need to know the document count. If the count is bigger than 10,000, you need to use scroll.
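A sketch of that scroll flow, reusing the match_phrase condition from the query above (the 1m keep-alive is an arbitrary choice):
GET ${index}/_search?scroll=1m
{
"size": 200,
"query": {
"match_phrase": {
"a.key1": "value1"
}
}
}
Then page through the remaining hits with the _scroll_id returned by each response:
POST _search/scroll
{
"scroll": "1m",
"scroll_id": "<scroll id from the previous response>"
}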

Elastic search query is not executed

Hi, I am using the Elasticsearch engine to search for some items placed in buildings. When running this query, the items returned are not sorted, even if I change the sort direction. My first impression is that the sort block is not even executed. Is there something wrong with the query?
{
"from": 0,
"size": 20,
"query": {
"bool": {
"filter": [
{
"terms": {
"buildingsUuid": [
"9caff147-d019-416a-a167-f02bab7334fd"
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"itemId": {
"order": "desc"
}
}
]
}

Exclude empty array fields - but include documents missing the field - in elasticsearch

I'm trying to run a query against elasticsearch that will find documents where one of the following conditions applies:
The document is missing the given field (tags) OR
The document has the value foo as an element of the tags array
The problem is that my current query will return documents that have a tags field whose value is an empty array. Presumably this is because elasticsearch treats an empty array the same as not having the field at all. Here's the full query I'm running that returns the bad results:
{
"from": 0,
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "_rankings.public"
}
},
{
"or": [
{
"missing": {
"existence": true,
"field": "tags",
"null_value": false
}
},
{
"terms": {
"execution": "or",
"tags": [
"foo"
]
}
}
]
}
]
}
},
"query": {
"match_all": {}
}
}
},
"size": 10000,
"sort": [
{
"_rankings.public": {
"ignore_unmapped": true,
"order": "asc"
}
}
]
}
I don't think you can achieve this so easily "out-of-the-box", for the reason you already mentioned: there's no difference between an empty array and a field (corresponding to that array) with no values in it.
Your only option might be to use a "null_value" for that "tags" field and, if you have any control over the data that goes into your documents, to index an empty "[]" array as '["_your_null_value_of_choice_"]'. Then, in your query, change "null_value": false to true.
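A sketch of such a mapping on a recent Elasticsearch, assuming a keyword tags field and "_null_" as the sentinel value (both the index name and sentinel are illustrative; on the older ES version this question targets, the mapping syntax differs slightly):
PUT my_index
{
"mappings": {
"properties": {
"tags": {
"type": "keyword",
"null_value": "_null_"
}
}
}
}
Note that null_value only kicks in for explicit null values, which is why an empty array still has to be rewritten client-side (e.g. to ["_null_"] or [null]) before indexing, as described above.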