Elasticsearch compare field data - elasticsearch

I have two fields in my Elasticsearch mapping. I want to compare the values of these two fields and list the documents for which the result is 1. How can I do this?
{
"query": {
"bool": {
"must": {
"range": {
"price": {
"gt": 100
}
}
},
"filter": {
"script": {
"script": "doc['departureDate'].value-doc['returnDate'].value==1"
}
}
}
}
}
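If the two fields are mapped as date fields, one possible approach is to compare their epoch-millisecond values inside the script. The sketch below only builds the request body in Python. It assumes that "1" means a difference of one day and that the script is written in Painless; the names build_query, MS_PER_DAY and diff_ms are made up for illustration.

```python
import json

MS_PER_DAY = 24 * 60 * 60 * 1000  # assumption: a difference of "1" means one day

# Painless source: convert both date doc values to epoch milliseconds and compare.
SCRIPT_SOURCE = (
    "doc['departureDate'].value.toInstant().toEpochMilli()"
    " - doc['returnDate'].value.toInstant().toEpochMilli()"
    " == params.diff_ms"
)


def build_query(min_price, diff_days):
    """Build a bool query: price range in must, date comparison as a script filter."""
    return {
        "query": {
            "bool": {
                "must": {"range": {"price": {"gt": min_price}}},
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            "source": SCRIPT_SOURCE,
                            # hypothetical parameter name, passed to the script as params.diff_ms
                            "params": {"diff_ms": diff_days * MS_PER_DAY},
                        }
                    }
                },
            }
        }
    }


query = build_query(min_price=100, diff_days=1)
print(json.dumps(query, indent=2))
```

If the fields are plain numeric fields instead, the original script expression can stay as it is.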

Related

Compound query with Elasticsearch

I'm trying to perform a search with the intended criteria being (activationDate in range 1598889600 to 1602051579) or someFlag=true.
Below is the query I tried, but it does not yield any records with someFlag=true (even with a big size, e.g. 5000). My Elasticsearch does have a lot of records with someFlag=true.
There are about 3000 total documents and this query returns around 280 documents.
{
"query": {
"bool": {
"must": [
{
"range": {
"activationDate": {
"gte": 1598889600
}
}
},
{
"range": {
"activationDate": {
"lte": 1602051579
}
}
}
],
"should": {
"match": {
"someFlag": true
}
}
}
},
"from": 1,
"size": 1000
}
Am I missing something?
This should work:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"activationDate": {
"gte": 1598889600,
"lte": 1602051579
}
}
},
{
"term": {
"someFlag": true
}
}
]
}
}
]
}
}
}
In theory this should do the same:
{
"query": {
"bool": {
"should": [
{
"range": {
"activationDate": {
"gte": 1598889600,
"lte": 1602051579
}
}
},
{
"term": {
"someFlag": true
}
}
]
}
}
}
However, the first query I've given wraps the bool clause in a filter context, so no scoring is needed and the query becomes cacheable.
Your bool query may not have worked because you were using a match query rather than term. match is normally used for full-text search only.
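Replace the must with a should and set minimum_should_match=1, since this is an OR query and it is enough for a record to meet just one of the criteria. When a bool query contains must clauses, its should clauses are optional and only affect scoring, which is why records with only someFlag=true were not returned. Next, reduce the two range criteria to a single one that combines gte and lte.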
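To see why the original query misses the flag-only records, here is a small Python model of how a bool query selects documents. It is a simplified sketch that ignores scoring; matches_bool, in_range, flagged and the sample documents are invented for illustration.

```python
def matches_bool(doc, must=(), should=(), minimum_should_match=None):
    """Simplified model of bool query matching (no scoring)."""
    # Default: should clauses are optional when must clauses exist,
    # otherwise at least one should clause has to match.
    if minimum_should_match is None:
        minimum_should_match = 0 if must else 1
    if not all(clause(doc) for clause in must):
        return False
    if not should:
        return True
    hits = sum(1 for clause in should if clause(doc))
    return hits >= minimum_should_match


def in_range(doc):
    return 1598889600 <= doc["activationDate"] <= 1602051579


def flagged(doc):
    return doc["someFlag"] is True


docs = [
    {"id": 1, "activationDate": 1600000000, "someFlag": False},  # in range only
    {"id": 2, "activationDate": 1500000000, "someFlag": True},   # flag only
    {"id": 3, "activationDate": 1500000000, "someFlag": False},  # neither
]

# Shape of the query from the question: range in must, flag in should.
original = [d["id"] for d in docs if matches_bool(d, must=[in_range], should=[flagged])]

# OR query: both conditions in should, at least one has to match.
fixed = [
    d["id"]
    for d in docs
    if matches_bool(d, should=[in_range, flagged], minimum_should_match=1)
]

print(original)  # [1]
print(fixed)     # [1, 2]
```

Document 2 is dropped by the original shape because it fails the must clause, regardless of the should clause.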

Difference between the result of two date fields then getting average

I want to get the average of the difference between two date fields in Elasticsearch. I have written a query that returns the last 1000 results, but I am not sure how to compute the difference for each result and then an overall average.
Elastic query below:
POST my_index/_search
{
"size":1000,
"_source": ["date.time.received","date.time.sent"],
"query": {
"bool": {
"must": [
{
"range": {
"date.time.received": {
"gte": "2019-06-19"
}
}
},
{
"range": {
"date.time.sent": {
"gte": "2019-06-19"
}
}
}
]
}
}
}
I would use an avg aggregation with a script. The script takes the difference between the two dates in milliseconds and converts it to days:
POST testindex5/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"date.time.received": {
"gte": "2019-06-19"
}
}
},
{
"range": {
"date.time.sent": {
"gte": "2019-06-19"
}
}
}
]
}
},
"aggs": {
"avg_resp": {
"avg": {
"script": "(doc['date.time.received'].value.toInstant().toEpochMilli() - doc['date.time.sent'].value.toInstant().toEpochMilli()) / 86400000.0"
}
}
}
}
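The arithmetic can be checked locally before running the aggregation. The Python sketch below uses made-up timestamps and helper names (to_epoch_ms, pairs); it shows that dividing whole numbers with integer division drops fractional days, whereas dividing by a floating-point constant keeps them.

```python
from datetime import datetime, timezone

MS_PER_DAY = 86_400_000


def to_epoch_ms(iso):
    """Parse an ISO timestamp (assumed to be UTC) into epoch milliseconds."""
    dt = datetime.fromisoformat(iso).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


# (sent, received) pairs: differences of 1.5 days and 0.5 days
pairs = [
    ("2019-06-19T00:00:00", "2019-06-20T12:00:00"),
    ("2019-06-20T00:00:00", "2019-06-20T12:00:00"),
]

diffs_ms = [to_epoch_ms(received) - to_epoch_ms(sent) for sent, received in pairs]

# Floating-point division keeps fractional days.
avg_days = sum(d / MS_PER_DAY for d in diffs_ms) / len(diffs_ms)

# Integer division truncates each difference to whole days first.
avg_days_truncated = sum(d // 1000 // 86400 for d in diffs_ms) / len(diffs_ms)

print(avg_days)            # 1.0
print(avg_days_truncated)  # 0.5
```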

Query elasticsearch where a key's value is at least some number

I am processing files to recognize whether they contain labels and with what confidence each label was recognized.
I created a nested mapping called tags which contains label (text) and confidence (float between 0 and 100).
Here is an example of how I think the query would work (I know it's invalid). It should be something like "Find documents that have the tags labelled A and B. A must have a confidence of at least 37 and B must have a confidence of at least 80".
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "A"
},
"range": {
"tags.confidence": {
"gte": 37
}
}
},
{
"match": {
"tags.label": "B"
},
"range": {
"tags.confidence": {
"gte": 80
}
}
}
]
}
}
}
}
}
Any ideas? I am pretty sure I need to approach it differently (different mapping). I am not sure how to accomplish this in ElasticSearch. Is this possible?
Let's say your parent document would contain two nested documents, something like below:
{
"tags":[
{
"label":"A",
"confidence":40
},
{
"label":"B",
"confidence":85
}
]
}
If that is the case, below is how your query would be:
Nested Query:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "A"
}
},
{
"range": {
"tags.confidence": {
"gte": 37
}
}
}
]
}
}
}
},
{
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "B"
}
},
{
"range": {
"tags.confidence": {
"gte": 80
}
}
}
]
}
}
}
}
]
}
}
}
Note that each nested document is indexed as a separate document, which is why you need two separate nested queries. With your original query, all four conditions would have to be satisfied by a single nested document of the parent document.
Hope this helps!
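The difference between the two approaches can be modelled in a few lines of Python. This is only a sketch of the matching logic, not of how Elasticsearch executes nested queries; nested_match and tag_at_least are hypothetical helper names.

```python
def nested_match(doc, path, predicate):
    """True if at least one nested object under `path` satisfies the predicate."""
    return any(predicate(obj) for obj in doc.get(path, []))


def tag_at_least(label, minimum):
    """Predicate: the nested object has this label and at least this confidence."""
    return lambda tag: tag["label"] == label and tag["confidence"] >= minimum


doc = {
    "tags": [
        {"label": "A", "confidence": 40},
        {"label": "B", "confidence": 85},
    ]
}

# Two nested queries: each one may be satisfied by a different nested object.
two_nested = (
    nested_match(doc, "tags", tag_at_least("A", 37))
    and nested_match(doc, "tags", tag_at_least("B", 80))
)

# One nested query: a single nested object has to satisfy all four conditions.
single_nested = nested_match(
    doc,
    "tags",
    lambda tag: tag_at_least("A", 37)(tag) and tag_at_least("B", 80)(tag),
)

print(two_nested)     # True
print(single_nested)  # False
```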

Putting two queries together

How can I combine these two queries? Query one brings back all the data from today, and query two brings back data for all users whose name has test in it.
I want to bring back today's data only for the users whose name has test in it.
Could someone show me how this is done please?
Query one:
{
"_source":["VT"],
"query": {
"range": {
"VT": {
"gte": "now/d",
"lt": "now/d+13h"
}
}}
}
Query two:
from elasticsearch import Elasticsearch

es = Elasticsearch(["9200"])
res = es.search(
    index="search",
    body={
        "_source": ["DTDT", "TRDT"],
        "query": {
            "bool": {
                "should": [
                    {"wildcard": {"N": "TEST*"}}
                ]
            }
        }
    },
    size=10,
)
for doc in res['hits']['hits']:
    print(doc)
You can use a bool query with two must clauses, like this:
{
"_source": ["DTDT", "TRDT", "VT"],
"query": {
"bool": {
"must": [
{
"range": {
"VT": {
"gte": "now/d",
"lt": "now/d+13h"
}
}
},
{
"wildcard": {
"N": "TEST*"
}
}
]
}
}
}
Check out the docs for the bool query.
This will help you:
POST _search
{
"query": {
"bool": {
"must": [
{
"range": {
"VT": {
"gte": "now/d",
"lt": "now/d+13h"
}
}
},
{
"match": {
"N": {
"query": "TEST",
"operator": "and"
}
}
}]
}
}
}
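Since query two is already sent from Python, the combined body can also be assembled there. The helper below is a minimal sketch; combine_with_must and the variable names are invented, and the resulting body would be passed to es.search in the same way as in query two.

```python
import json


def combine_with_must(*clauses, source=None):
    """Wrap the given query clauses in a bool query where all of them must match."""
    body = {"query": {"bool": {"must": list(clauses)}}}
    if source is not None:
        body["_source"] = list(source)
    return body


# Clause from query one: VT between the start of today and 13:00 today.
today_before_13h = {"range": {"VT": {"gte": "now/d", "lt": "now/d+13h"}}}

# Clause from query two: N starts with TEST.
name_starts_with_test = {"wildcard": {"N": "TEST*"}}

body = combine_with_must(
    today_before_13h,
    name_starts_with_test,
    source=["DTDT", "TRDT", "VT"],
)
print(json.dumps(body, indent=2))
```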

Filter with match_all VS query

I have 2 types of queries. They are both logically identical however I'm not sure if there is any performance difference between the two.
I will be glad if someone can enlighten me.
Using a term query with a range filter:
{
"query": {
"filtered": {
"query": {
"term": {
"user_id": "1234567"
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"ephoc_date": {
"lt": 1437033590,
"gte": 1437026390
}
}
}
]
}
}
}
}
}
Using match_all with term and range filters:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"user_id": "1234567"
}
},
{
"range": {
"ephoc_date": {
"lt": 1437033590,
"gte": 1437026390
}
}
}
]
}
}
}
}
}
Looking at your query, it seems you don't care how documents are scored based on the user_id field being "1234567". In other words, if more than one document has user_id set to "1234567", you don't care about the order of those documents in the result. If that is the case, the 2nd option is better for performance, because the 1st query carries some computation cost for scoring while the 2nd query does no scoring. By the way, your 2nd query can also be simplified to the following:
{
"filter": {
"bool": {
"must": [
{
"term": {
"user_id": "1234567"
}
},
{
"range": {
"ephoc_date": {
"lt": 1437033590,
"gte": 1437026390
}
}
}
]
}
}
}
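Note that the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the usual equivalent is a bool query with filter clauses, which are not scored either. The Python sketch below converts the old shape into the new one; filtered_to_bool is a hypothetical helper, not part of any client library.

```python
import json


def filtered_to_bool(body):
    """Rewrite a legacy `filtered` query body as a bool query with a filter clause."""
    filtered = body["query"]["filtered"]
    bool_query = {"filter": [filtered["filter"]]}
    query = filtered.get("query")
    # match_all adds nothing in a bool query, so only keep scoring queries.
    if query and "match_all" not in query:
        bool_query["must"] = [query]
    return {"query": {"bool": bool_query}}


legacy = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"user_id": "1234567"}},
                        {"range": {"ephoc_date": {"lt": 1437033590, "gte": 1437026390}}},
                    ]
                }
            },
        }
    }
}

converted = filtered_to_bool(legacy)
print(json.dumps(converted, indent=2))
```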
