Elastic Search Query Where value exists between 2 field values - elasticsearch

Im trying to figure out how to construct a query against elastic search where the query value exists between the range of 2 field values.
Lets say I have a template
{
"template": "addresses",
"mappings": {
"addresses": {
"properties": {
"street_number_1": { "type": "integer" },
"street_number_2": { "type": "integer" },
... //other unimportant fields
}
}
}
Based on the above definition, If I have an address of 100-120 High Street, where street_number_1 is 100 and street_number_2 is 120, if I were to perform a Search for 112 High Street, this record should be returned as it is between 100 and 120. What kind of elastic search function/query would allow me to do this?

You have two options. With your current mapping, you can use two range queries like this:
{
"query": {
"bool": {
"filter": [
{
"range": {
"street_number_1": {
"lte": 112
}
}
},
{
"range": {
"street_number_2": {
"gte": 112
}
}
}
]
}
}
}
The second option involves changing your mapping to use an integer range for the street number. Define your street number mapping like this:
PUT addresses
{
"mappings": {
"_doc": {
"properties": {
"street_number": {
"type": "integer_range"
}
}
}
}
}
Then index your address document like this:
PUT addresses/_doc/1
{
"street_number" : {
"gte" : 100,
"lte" : 120
}
}
And finally query it like this:
POST addresses/_search
{
"query" : {
"term" : {
"street_number" : {
"value": 112
}
}
}
}

Related

Elasticsearch - Is it possible to create histograms without having the field indexed

I come across the following phrase
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html
For instance if you have a numeric field called foo that you need to run histograms on but that you never need to filter on, you can safely disable indexing on this field in your mappings:
PUT index
{
"mappings": {
"properties": {
"foo": {
"type": "integer",
"index": false
}
}
}
}
Does it mean aggregations like histograms can be created though the field is NOT indexed ?
Yes, that's correct and that's easy to test:
Create the index:
PUT index
{
"mappings": {
"properties": {
"foo": {
"type": "integer",
"index": false
}
}
}
}
Index a sample document:
PUT index/_doc/1
{
"foo": 23
}
Run an histogram aggregation:
POST index/_search
{
"aggs": {
"histo": {
"histogram": {
"field": "foo",
"interval": 10
}
}
}
}
Results:
"aggregations" : {
"histo" : {
"buckets" : [
{
"key" : 20.0,
"doc_count" : 1
}
]
}
}

ES query to match all elements in array

So I got this document with a
nested array that I want to filter with this query.
I want ES to return all documents where all items have changes = 0 and that only.
If document has even a single item in the list with a change = 1, that's discarded.
Is there any way I can achieve this starting from the query I have already wrote? Or should I use a script instead?
DOCUMENTS:
{
"id": "abc",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 1
}
]
}
},
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
QUERY:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"id": {
"value": "abc"
}
}
},
{
"nested": {
"path": "trips",
"query": {
"range": {
"trips.changes": {
"gt": -1,
"lt": 1
}
}
}
}
}
]
}
}
}
EXPECTED RESULT:
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
Elasticsearch version: 7.6.2
Already read this answers but they didn't help me:
https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873
ElasticSearch: How to query exact nested array
First off, if you filter by id: abc, you obviously won't be able to get id: def back.
Second, due to the nature of nested fields which are treated as separate subdocuments, you cannot query for all trips that have the changes equal to 0 -- the connection between the individual trips is lost and they "don't know about each other".
What you can do is return only the trips that matched your nested query using inner_hits:
GET trips_solutions/_search
{
"_source": "false",
"query": {
"bool": {
"must": [
{
"nested": {
"inner_hits": {},
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
}
]
}
}
}
The easiest solution then is to dynamically save this nested info on a parent object like discussed here and using range/term query on the resulting array.
EDIT:
Here's how you do it using copy_to onto the doc's top level:
PUT trips_solutions
{
"mappings": {
"properties": {
"trips_changes": {
"type": "integer"
},
"trips": {
"type": "nested",
"properties": {
"changes": {
"type": "integer",
"copy_to": "trips_changes"
}
}
}
}
}
}
trips_changes will be an array of numbers -- I presume they're integers but more types are available.
Then syncing a few docs:
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}
And finally querying:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
},
{
"script": {
"script": {
"source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
}
}
}
]
}
}
}
Note that we first filter normally using the nested term query to narrow down our search context (scripts are slow so this is useful). We then check if there are any non-zero changes in the accumulated top-level changes and reject those that apply.

Querying Nested JSON based on 1 term value

I have indexed JSON like below format
JSON:
{"work":[{"organization":"abc", end:"present"},{"organization":"edf", end:"old"}]}
{"work":[{"organization":"edf", end:"present"},{"organization":"abc", end:"old"}]}
I want to query records where organization is "abc" and end is "present"
but below query is not working
work.0.organization: "abc" AND work.0.end:"present"
No records are matched
if I give query like below
work.organization: "abc" AND work.end:"present"
Both the records are matched. Whereas only the first record is what I want
The matched record should be only the below
{"work":[{"organization":"abc", end:"present"},{"organization":"edf", end:"old"}]}
You have to use nested_types. First map work as nested type in elastic using following mappings
PUT index_name_3
{
"mappings": {
"document_type" : {
"properties": {
"work" : {
"type": "nested",
"properties": {
"organization" : {
"type" : "text"
},
"end" : {
"type" : "text"
}
}
}
}
}
}
}
Use the following query to do nested filter match and innerhits
{
"query": {
"nested": {
"path": "work",
"inner_hits": {},
"query": {
"bool": {
"must": [{
"term": {
"work.organization": {
"value": "abc"
}
}
},
{
"term": {
"work.end": {
"value": "present"
}
}
}
]
}
}
}
}
}

Elasticsearch Query Filter for Word Count

I am currently looking for a way to return documents with a maximum of n words in a certain field.
The query could look like this for a resultset that contains documents with less than three words in the "name" field but there is nothing like word_count as far as I know.
Does anyone know how to handle this, maybe even in a different way?
GET myindex/myobject/_search
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"word_count": {
"name": {
"lte": 3
}
}
}
]
}
},
"query": {
"match_all" : { }
}
}
}
}
You can use the token_count data type in order to index the number of tokens in a given field and then search on that field.
# 1. create the index/mapping with a token_count field
PUT myindex
{
"mappings": {
"myobject": {
"properties": {
"name": {
"type": "string",
"fields": {
"word_count": {
"type": "token_count",
"analyzer": "standard"
}
}
}
}
}
}
}
# 2. index some documents
PUT index/myobject/1
{
"name": "The quick brown fox"
}
PUT index/myobject/2
{
"name": "brown fox"
}
# 3. the following query will only return document 2
POST myindex/_search
{
"query": {
"range": {
"name.word_count": { 
"lt": 3
}
}
}
}

Elasticsearch sorting by matching array item

I have a following structure in indexed documents:
document1: "customLists":[{"id":8,"position":8},{"id":26,"position":2}]
document2: "customLists":[{"id":26,"position":1}]
document3: "customLists":[{"id":8,"position":1},{"id":26,"position":3}]
I am able to search matching documents that belong to a given list with match query "customLists.id = 26". But I need to sort the documents based on the position value within that list and ignore positions of the other lists.
So the expected results would be in order of document2, document1, document3
Is the data structure suitable for this kind of sorting and how to handle this?
One way to achieve this would be to set mapping type of customLists as nested and then use sorting by nested fields
Example :
1) Create Index & Mapping
put test
put test/test/_mapping
{
"properties": {
"customLists": {
"type": "nested",
"properties": {
"id": {
"type": "integer"
},
"position": {
"type": "integer"
}
}
}
}
}
2) Index Documents :
put test/test/1
{
"customLists":[{"id":8,"position":8},{"id":26,"position":2}]
}
put test/test/2
{
"customLists":[{"id":26,"position":1}]
}
put test/test/3
{
"customLists":[{"id":8,"position":1},{"id":26,"position":3}]
}
3) Query to sort by positon for given id
post test/_search
{
"filter": {
"nested": {
"path": "customLists",
"query": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
},
"sort": [
{
"customLists.position": {
"order": "asc",
"mode": "min",
"nested_filter": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
}
]
}

Resources