If Else Elasticsearch - elasticsearch

I have two sets of documents, which are joined by fragmentId. I have written a query that pulls both documents, but I am wondering whether there is a simpler way to write it.
First set: there can be only one document with type = fragment and fragmentId = 1.
{
"fragmentId": "1",
"type" : "fragment"
}
Second set: there can be multiple such documents, distinguished by their start and end values. In the query, I will pass a value, and only the document within that range should be returned.
Doc-1
{
"fragmentId" : "1",
"type": "cf",
"start": 1,
"end": 5
}
Doc-2
{
"fragmentId" : "1",
"type": "cf",
"start": 6,
"end": 10
}
In the result, I want the document from the first set, plus only the document from the second set that has specific start and end values.
Here is the query that works for me:
GET test/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "fragment"
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "cf"
}
},
{
"range" :{
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
}
Is there a way to rewrite this query in a simpler form, so that the first document is always picked along with the range-matching document from the second set? Basically, a join operation on fragmentId.

Are you looking for something like this?
GET test/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"type": "fragment"
}
},
{
"bool": {
"must": [
{
"term": {
"type": "cf"
}
},
{
"range": {
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
]
}
}
}
This query translates to:
(fragmentId = 1 AND (type = fragment OR (type = cf AND start is within 1 and 5)))
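To check that factoring out fragmentId does not change which documents match, the two boolean expressions can be compared on the sample documents. This is a pure-Python model of the logic only, not Elasticsearch client code; the fourth document and the helper names are assumptions added for illustration.

```python
# Pure-Python model of the two bool queries; no Elasticsearch involved.
docs = [
    {"fragmentId": "1", "type": "fragment"},
    {"fragmentId": "1", "type": "cf", "start": 1, "end": 5},
    {"fragmentId": "1", "type": "cf", "start": 6, "end": 10},
    {"fragmentId": "2", "type": "fragment"},  # hypothetical extra document
]


def start_in_range(doc, lo, hi):
    # Mirrors the range clause on "start"; documents without "start" never match.
    return "start" in doc and lo <= doc["start"] <= hi


def original(doc):
    # (fragmentId = 1 AND type = fragment)
    # OR (fragmentId = 1 AND type = cf AND start within 1 and 5)
    return (doc["fragmentId"] == "1" and doc["type"] == "fragment") or (
        doc["fragmentId"] == "1"
        and doc["type"] == "cf"
        and start_in_range(doc, 1, 5)
    )


def rewritten(doc):
    # fragmentId = 1 AND (type = fragment OR (type = cf AND start within 1 and 5))
    return doc["fragmentId"] == "1" and (
        doc["type"] == "fragment"
        or (doc["type"] == "cf" and start_in_range(doc, 1, 5))
    )


# Both forms must agree on every document.
assert [original(d) for d in docs] == [rewritten(d) for d in docs]
matches = [d for d in docs if rewritten(d)]
```

Here `matches` holds the fragment document and the cf document with start = 1, which is the pair the question asks for.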

Related

Simplest way to query an Elasticsearch index with chained conditions

I have an index of products, and I want to find all the products that fulfill conditions such as:
((type = "orange" and price > 10) or (type = "apple" and price > 8)) and on_sale = True.
What about:
(type = "orange" or type = "apple") and (price <= 25 or on_sale = True)?
You need to combine bool clauses using "must" and "should".
The query for the first statement is below:
GET _search
{
"query": {
"bool": {
"must": [
{
"term": {
"on_sale": {
"value": "True"
}
}
},
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"type": {
"value": "orange"
}
}
},
{
"range": {
"price": {
"gt": 10
}
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"type": {
"value": "apple"
}
}
},
{
"range": {
"price": {
"gt": 8
}
}
}
]
}
}
]
}
}
]
}
}
}
It is just a matter of wrapping "must" and "should" clauses inside one another as required. It takes a little practice to figure out how to chain them, but any boolean combination can be expressed with this kind of syntax.
For the second query:
{
"query": {
"bool": {
"must": [
{
"terms": {
"type": [
"orange",
"apple"
]
}
},
{
"bool": {
"should": [
{
"term": {
"on_sale": {
"value": "True"
}
}
},
{
"range": {
"price": {
"lte": 25
}
}
}
]
}
}
]
}
}
}
When you need "and", use "must"; when you need "or", use "should".
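The must/should nesting can also be generated rather than written by hand. Below is a minimal sketch that builds the first statement as a plain Python dict; the helper names `all_of`, `any_of`, `term` and `price_above` are hypothetical, and the range uses `gt` to follow the stated condition (price > 10, price > 8). It only constructs the request body and does not talk to a cluster.

```python
import json


def all_of(*clauses):
    """AND: every clause has to match (bool/must)."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """OR: at least one clause has to match (bool/should)."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


def term(field, value):
    return {"term": {field: {"value": value}}}


def price_above(value):
    return {"range": {"price": {"gt": value}}}


# ((type = orange and price > 10) or (type = apple and price > 8)) and on_sale = True
first_query = {
    "query": all_of(
        term("on_sale", "True"),
        any_of(
            all_of(term("type", "orange"), price_above(10)),
            all_of(term("type", "apple"), price_above(8)),
        ),
    )
}

print(json.dumps(first_query, indent=2))
```

Setting `minimum_should_match` explicitly in `any_of` keeps the OR strict even if the same bool later gains a must or filter clause, where should clauses would otherwise become optional.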
HTH.

Elasticsearch Add additional condition if type is different

GET test/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "fragment"
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "cf"
}
},
{
"range" :{
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
}
I am looking for two documents: one with fragment id = 1 and type = fragment, and another with fragment id = 1, type = "cf" and start between 1 and 5.
The above query does the job, but I need to write the type and fragment id conditions twice. Is there a way to add the range condition only when the type is cf, basically combining both bools into one?
This is the query you're looking for:
{
"query": {
"bool": {
"filter": [
{
"term": {
"fragmentId": "1"
}
}
],
"minimum_should_match": 1,
"should": [
{
"term": {
"type": "fragment"
}
},
{
"bool": {
"filter": [
{
"term": {
"type": "cf"
}
},
{
"range": {
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
}
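Since fragmentId now appears only once, the request body is easy to parameterize. The following is a sketch under assumptions: the function name `fragment_query` and its parameters are hypothetical, and it only builds the body as a dict. Clauses placed in `filter` have to match but do not contribute to the score, which suits exact-match conditions like these.

```python
import json


def fragment_query(fragment_id, start_gte, start_lte):
    """Build the combined bool query for one fragment (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # Shared condition, written once for both document types.
                "filter": [{"term": {"fragmentId": fragment_id}}],
                "minimum_should_match": 1,
                "should": [
                    {"term": {"type": "fragment"}},
                    {
                        "bool": {
                            "filter": [
                                {"term": {"type": "cf"}},
                                {
                                    "range": {
                                        "start": {
                                            "gte": start_gte,
                                            "lte": start_lte,
                                        }
                                    }
                                },
                            ]
                        }
                    },
                ],
            }
        }
    }


body = fragment_query("1", 1, 5)
print(json.dumps(body, indent=2))
```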

Filter query by length of nested objects. ie. min_child

I'm trying to filter my query by the number of nested objects found. The Elasticsearch documentation mentions that using a script is expensive, so I set out to do it with a score instead, though I can't seem to get the script to work either.
Here's my mappings:
"mappings": {
"properties": {
"dates" : {
"type" : "nested",
"properties" : {
"rooms" : {
"type" : "integer"
},
"timestamp" : {
"type" : "long"
}
}
},
"doc_id" : {
"type" : "text"
},
"distance" : {
"type" : "integer"
}
...
}
}
Here's some example data:
PUT /test/_doc/1
{
"doc_id": "1",
"distance": 1,
"dates": [
{
"rooms": 1,
"timestamp": 1
},
{
"rooms": 1,
"timestamp": 2
},
...
]
}
I'm filtering by the parent's distance field, among others, and filtering the nested dates by their timestamps and rooms. I need to restrict the results to documents with an exact number of matching nested dates.
I tried to borrow from here.
This is my search query:
GET /test/_search
{
"query" : {
"function_score": {
"min_score": 20,
"boost": 1,
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return - 1; } return _score;"
}
}
}
],
"query": {
"bool" : {
"filter": [
{ "range": { "distance": { "lt": 5 }}},
{
"nested": {
"score_mode": "sum",
"boost": 10,
"path": "dates",
"query": {
"bool": {
"filter": [
{ "range": { "dates.rooms": { "gte": 1 } } },
{ "range": { "dates.timestamp": { "lte": 2 }}},
{ "range": { "dates.timestamp": { "gte": 1 }}}
]
}
}
}
}
]
}
}
}
}
}
This returns all the results that match, yet they all have a score of 0.0 and aren't getting filtered by the number of nested objects found.
If this is the right solution, how can I get this working? If not, how can I get a script to do it within this search?
Thanks!
Before getting started, keep in mind that the scoring function changed between Elasticsearch 6 and 7. You can find the updated code samples in this gist.
Your question didn't outline the specifics of your search. Reading the code, it seems you want to retrieve all documents where the distance is less than five and the number of matching rooms is precisely 2. If this is correct, the code you submitted does not achieve this.
The reason: your function score wraps both your primary condition and your condition on the number of matching rooms (mixing both is quite tricky, though not impossible). To make things simpler, isolate them so that the function score applies only to the number of rooms.
Assuming you are using Elasticsearch 7+, this might work:
{
"_source": {
"includes": ["*"],
"excludes": ["dates"]
},
"query": {
"bool": {
"must": [
{"range": {"distance": {"lt": 5}}},
{
"function_score": {
"min_score": 20,
"boost": 1,
"score_mode": "multiply",
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return 0; } return _score;"
}
}
}
],
"query": {
"nested": {
"path": "dates",
"boost": 10,
"score_mode": "sum",
"query": {
"constant_score": {
"boost": 1,
"filter": {
"bool": {
"should": [
{
"bool": {
"must": [
{"term": {"dates.timestamp": 1}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 4}}
]
}
},
{
"bool": {
"must": [
{"term": {"dates.timestamp": 2}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 3}}
]
}
}
]
}
}
}
}
}
}
}
}
]
}
}
}
I managed to get it all working with scoring, since filter clauses do not contribute to the score. Using GET /test/_explain/[id] helped me understand exactly what was happening.
GET /test/_search
{
// Don't return the nested fields, they are returned in the inner_hits
"_source": {
"includes": [ "*" ],
"excludes": [ "dates" ]
},
"query": {
"function_score": {
// Score is calculated with 1 point for each matched inner property and outer property.
// 7 is the exact score to allow
"min_score": 7,
"boost": 1,
"score_mode": "sum",
"boost_mode": "multiply",
"functions": [
{
"script_score": {
"script": {
// Ignore any results that don't match exactly
"source": "if (_score == 7) { return 1; } return 0;",
"lang": "painless"
}
}
}
],
"query": {
"bool" : {
"must" : [
{ "range" : { "distance" : { "lt": 10 }}},
{
"nested": {
"inner_hits" : {},
"path": "dates",
"score_mode": "sum",
"query": {
"bool": {
// Match each required nested object individually, then verify with the score if we got 1 match for each should
"should": [
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 1 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 4 }}
]
}
},
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 2 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 3 }}
]
}
}
]
}
}
}
}
]
}
}
}
}
}
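The intent of the query above (keep a parent only when each required nested object is found) can be written out as a pure-Python model. This sketches the selection rule only, not how Elasticsearch computes scores: the optional other_prop clauses are left out, documents "2" and "3" are hypothetical, and the helper names are assumptions.

```python
def date_matches(date, timestamp, max_rooms):
    # One required nested object: exact timestamp and rooms below the limit.
    return date["timestamp"] == timestamp and date["rooms"] < max_rooms


def keep(doc, required_timestamps, max_distance=10, max_rooms=5):
    # Outer condition on the parent document.
    if doc["distance"] >= max_distance:
        return False
    # Every required timestamp must be matched by at least one nested date.
    return all(
        any(date_matches(d, ts, max_rooms) for d in doc["dates"])
        for ts in required_timestamps
    )


docs = [
    {"doc_id": "1", "distance": 1,
     "dates": [{"rooms": 1, "timestamp": 1}, {"rooms": 1, "timestamp": 2}]},
    # Hypothetical: only one of the two required dates is present.
    {"doc_id": "2", "distance": 1,
     "dates": [{"rooms": 1, "timestamp": 1}]},
    # Hypothetical: both dates present, but the parent is too far away.
    {"doc_id": "3", "distance": 12,
     "dates": [{"rooms": 1, "timestamp": 1}, {"rooms": 1, "timestamp": 2}]},
]

kept = [d["doc_id"] for d in docs if keep(d, [1, 2])]
```

In the query itself the same rule is enforced indirectly: each matched clause adds to the score, and the script keeps only hits whose score equals the value expected when every required clause matched.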

Using multiple Should queries

I want to get docs that are similar to multiple "groups", but separately. Each group has its own rules (terms).
When I use more than one should clause inside a "bool", I get items that match a mix of both groups' terms.
I want to use a single query, not msearch for example.
Can someone please help me with that?
{
"explain": true,
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"p_id": "123"
}
},
{
"term": {
"p_id": "124"
}
}
]
}
},
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"cat": "1"
}
},
{
"term": {
"cat": "2"
}
},
{
"term": {
"keys": "a"
}
},
{
"term": {
"keys": "b"
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"cat": "6"
}
},
{
"term": {
"cat": "7"
}
},
{
"term": {
"keys": "r"
}
},
{
"term": {
"keys": "u"
}
}
]
}
}
]
}
}
}
},
"from": 0,
"size": 3
}
You can try using a terms aggregation on multiple fields with scripting, with a top hits aggregation as a sub-aggregation. Be warned that this will be pretty slow. Add this after the query/filter and adjust the size parameter as needed:
"aggs": {
"Cat_and_Keys": {
"terms": {
"script": "doc['cat'].values + doc['keys'].values"
},
"aggs":{ "separate_docs": {"top_hits":{"size":1 }} }
}
}

Elasticsearch DSL syntax equivalent for SQL statement

I'm trying to replicate the query logic below in an Elasticsearch query, but something's not right.
Basically, the query below returns one doc. I'd like either the first condition to apply ("name": "iphone") or the more complex second one: (username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238). The nested bool must inside the should is meant to take care of the more complex condition. I should still see 1 doc if I change the outer match from "name": "iphone" to "name": "wrong value", but I get nothing when I do that. I'm not sure what is wrong.
The SQL query is below.
SELECT * from data_points
WHERE name = 'iphone'
OR
(username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238)
{
"size": 30,
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": "1",
"should": [
{
"bool": {
"must": [
{
"match": {
"username": "gogadget"
}
},
{
"terms": {
"status_type": [
"3",
"4"
]
}
},
{
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}
]
}
}
],
"must": [],
"must_not": []
}
},
{
"match": {
"name": "iphone"
}
}
]
}
}
}
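A should clause on its own already expresses OR: a document is returned if it matches any of the clauses.
You don't need to wrap your OR query in must; doing so requires the name match and the should block to both match.
The query should look like this: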
{
"query": {
"bool": {
"should": [{
"bool": {
"must": [{
"match": {
"username": "gogadget"
}
}, {
"terms": {
"status_type": [
"3",
"4"
]
}
}, {
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}]
}
}, {
"match": {
"name": "iphone"
}
}]
}
}
}
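The difference between the two queries can be seen with a small pure-Python model. This is a sketch of the boolean logic only: the sample document is hypothetical, and the analyzed `match` queries are approximated by plain equality.

```python
# Hypothetical document that satisfies the complex condition but not the name match.
doc = {
    "name": "wrong value",
    "username": "gogadget",
    "status_type": "3",
    "created_time": 1405000000,
}


def complex_condition(d):
    # username AND status_type in (3, 4) AND created_time within the range
    return (
        d["username"] == "gogadget"
        and d["status_type"] in ("3", "4")
        and 20140712 <= d["created_time"] <= 1405134711
    )


def name_is_iphone(d):
    return d["name"] == "iphone"


def question_query(d):
    # Outer must with two entries: the should block AND the name match.
    return complex_condition(d) and name_is_iphone(d)


def answer_query(d):
    # Top-level should with two entries: the complex condition OR the name match.
    return complex_condition(d) or name_is_iphone(d)


in_question = question_query(doc)
in_answer = answer_query(doc)
```

With the name set to "wrong value", the question's query rejects the document while the should-only version still returns it, which matches the behaviour described in the question.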
