Elasticsearch DSL syntax equivalent of an SQL statement - elasticsearch

I'm trying to replicate the query logic below in an Elasticsearch query, but something's not right.
Basically, the query below returns one doc. I'd like either the first condition to apply ("name": "iphone") or the more complex second one: (username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238). Note that the nested bool must inside the should takes care of the more complex condition. I should still see one doc if I change the outer match from "name": "iphone" to "name": "wrong value", but I get nothing when I do that. I'm not sure where this goes wrong.
The SQL query is below.
SELECT * from data_points
WHERE name = 'iphone'
OR
(username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238)
{
"size": 30,
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": "1",
"should": [
{
"bool": {
"must": [
{
"match": {
"username": "gogadget"
}
},
{
"terms": {
"status_type": [
"3",
"4"
]
}
},
{
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}
]
}
}
],
"must": [],
"must_not": []
}
},
{
"match": {
"name": "iphone"
}
}
]
}
}
}

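A should clause matches a document when any of its sub-clauses matches.
You don't need must to combine the branches of your OR query. In your query the "name" match sits in the outer must, so it is always required; move it into the should, next to the nested bool.
The query should look like this: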
{
"query": {
"bool": {
"should": [{
"bool": {
"must": [{
"match": {
"username": "gogadget"
}
}, {
"terms": {
"status_type": [
"3",
"4"
]
}
}, {
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}]
}
}, {
"match": {
"name": "iphone"
}
}]
}
}
}
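To see why the original shape returns nothing, here is a minimal Python sketch of a toy evaluator for a small subset of the bool query. It is not how Elasticsearch executes queries: `matches` is a hypothetical helper, "match" is treated as plain equality, and the sample document is made up for illustration.

```python
# Toy evaluator for a small subset of the bool query. Illustration only:
# "match" is treated as plain equality, which is an assumption.

def matches(query, doc):
    kind, body = next(iter(query.items()))
    if kind == "bool":
        must = body.get("must", [])
        should = body.get("should", [])
        if not all(matches(q, doc) for q in must):
            return False
        if not should:
            return True
        # Without must clauses, at least one should clause has to match.
        needed = int(body.get("minimum_should_match", 0 if must else 1))
        return sum(matches(q, doc) for q in should) >= needed
    field, spec = next(iter(body.items()))
    value = doc.get(field)
    if kind == "match":
        return value == spec
    if kind == "terms":
        return value in spec
    if kind == "range":
        return value is not None and spec["gte"] <= value <= spec["lte"]
    raise ValueError(f"unsupported query type: {kind}")


complex_branch = {"bool": {"must": [
    {"match": {"username": "gogadget"}},
    {"terms": {"status_type": ["3", "4"]}},
    {"range": {"created_time": {"gte": 20140712, "lte": 1405134711}}},
]}}
name_match = {"match": {"name": "iphone"}}

# Shape from the question: the name match is a sibling of the OR inside must.
original_shape = {"bool": {"must": [
    {"bool": {"minimum_should_match": "1", "should": [complex_branch],
              "must": [], "must_not": []}},
    name_match,
]}}
# Shape from the answer: both branches sit in one should.
fixed_shape = {"bool": {"should": [complex_branch, name_match]}}

# Hypothetical document that satisfies only the complex branch.
doc = {"name": "wrong value", "username": "gogadget",
       "status_type": "3", "created_time": 1405000000}

print(matches(original_shape, doc))  # False
print(matches(fixed_shape, doc))     # True
```

The document passes the complex branch in both shapes; only the original shape also insists on the name match, which is what makes it fail.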

Related

Simplest way to query an Elasticsearch index with chained conditions

I have an index of products in which I want to find all the products that fulfill conditions such as:
((type = "orange" and price > 10) or (type = "apple" and price > 8)) and on_sale=True.
What about
(type = "orange" or type = "apple") and (price <= 25 or on_sale=True)?
You need to combine bool clauses with "must" and "should".
The query for the first statement is below.
GET _search
{
"query": {
"bool": {
"must": [
{
"term": {
"on_sale": {
"value": "True"
}
}
},
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"type": {
"value": "orange"
}
}
},
{
"range": {
"price": {
"gt": 10
}
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"type": {
"value": "apple"
}
}
},
{
"range": {
"price": {
"gt": 8
}
}
}
]
}
}
]
}
}
]
}
}
}
It is just a matter of wrapping "must" and "should" clauses inside one another as required. It takes a little practice to work out how to chain them, but any boolean combination can be expressed with this syntax.
For the second query:
{
"query": {
"bool": {
"must": [
{
"terms": {
"type": [
"orange",
"apple"
]
}
},
{
"bool": {
"should": [
{
"term": {
"on_sale": {
"value": "True"
}
}
},
{
"range": {
"price": {
"lte": 25
}
}
}
]
}
}
]
}
}
}
When you need "and" use "must"; when you need "or" use "should".
HTH.
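The and/or rule can be sketched as a pair of small Python helpers that build the nested bool bodies. The helper names (`all_of`, `any_of`, `term`, `price`) are hypothetical, the comparisons follow the conditions as stated in the question (price > 10, price > 8, price <= 25), and `on_sale` is assumed to be mapped as a boolean field.

```python
import json


def all_of(*clauses):
    """AND: every clause has to match."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """OR: at least one clause has to match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


def term(field, value):
    return {"term": {field: {"value": value}}}


def price(op, amount):
    # op is one of the range operators: "gt", "gte", "lt", "lte"
    return {"range": {"price": {op: amount}}}


# ((type = orange and price > 10) or (type = apple and price > 8)) and on_sale
first = all_of(
    term("on_sale", True),
    any_of(
        all_of(term("type", "orange"), price("gt", 10)),
        all_of(term("type", "apple"), price("gt", 8)),
    ),
)

# (type = orange or type = apple) and (price <= 25 or on_sale)
second = all_of(
    {"terms": {"type": ["orange", "apple"]}},
    any_of(price("lte", 25), term("on_sale", True)),
)

print(json.dumps({"query": first}, indent=2))
print(json.dumps({"query": second}, indent=2))
```

Setting minimum_should_match explicitly is optional here, but it keeps the "or" meaning obvious if a must clause is later added to the same bool.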

If Else Elasticsearch

I have two sets of documents, which are joined by fragmentId. I have written a query that pulls both documents, but I am wondering whether there is another way to write it.
First set: there can be only one document with type = fragment and fragmentId = 1.
{
"fragmentId": "1",
"type" : "fragment"
}
Second set: there can be multiple such documents, distinguished by their start and end values. In the query I pass a value, and only the document whose range contains it should be returned.
Doc-1
{
"fragmentId" : "1",
"type": "cf",
"start": 1,
"end": 5
}
Doc- 2
{
"fragmentId" : "1",
"type": "cf",
"start": 6,
"end": 10
}
In the result, I want the first-set document, and from the second set only the document with specific start and end values.
Here is the query that works for me:
GET test/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "fragment"
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "cf"
}
},
{
"range" :{
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
}
Is there a way to rewrite this query in a simpler form, so that the first document is always picked together with the range-matching document from the second set, basically a join operation on fragmentId?
Are you looking for something like this?
GET test/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"type": "fragment"
}
},
{
"bool": {
"must": [
{
"term": {
"type": "cf"
}
},
{
"range": {
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
]
}
}
}
This query translates to :
(fragmentId = 1 AND (type = fragment OR (type = cf AND start is within 1 and 5)))
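The rewrite relies on factoring the shared fragmentId condition out of both branches. As a quick sanity check, this Python sketch compares the two boolean shapes over every truth assignment; `a`, `b` and `c` are stand-in names for "fragmentId = 1", "type = fragment" and "type = cf with start in range".

```python
from itertools import product


def original(a, b, c):
    # Shape of the question's query: (a AND b) OR (a AND c)
    return (a and b) or (a and c)


def factored(a, b, c):
    # Shape of the answer's query: a AND (b OR c)
    return a and (b or c)


# Compare both shapes over all 8 combinations of truth values.
equivalent = all(
    original(*values) == factored(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)  # True
```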

Search for documents matching all terms in a nested array Elasticsearch

I am learning to use Elasticsearch as a basic recommender engine.
My Elasticsearch documents contain nested entities, as follows:
PUT recs/user/1
{
"name" : "Brad Pitt",
"movies_liked": [
{
"name": "Forrest Gump",
"score": 1
},
{
"name": "Terminator",
"score": 4
},
{
"name": "Rambo",
"score": 4
},
{
"name": "Rocky",
"score": 4
},
{
"name": "Good Will Hunting",
"score": 2
}
]
}
PUT recs/user/2
{
"name" : "Tom Cruise",
"movies_liked": [
{
"name": "Forrest Gump",
"score": 2
},
{
"name": "Terminator",
"score": 1
},
{
"name": "Rocky IV",
"score": 1
},
{
"name": "Rocky",
"score": 1
},
{
"name": "Rocky II",
"score": 1
},
{
"name": "Predator",
"score": 4
}
]
}
I would like to search for users who like all of "Forrest Gump", "Terminator" and "Rambo".
I have used a nested query, which currently looks like this:
POST recs/user/_search
{
"query": {
"nested": {
"path": "movies_liked",
"query": {
"terms": {
"movies_liked.name": ["Forrest Gump","Terminator","Rambo"]
}
}
}
}
}
However, when I execute this search, I expect to see only the first record, which has all the required terms, but the results contain both records. In the second record the user clearly does not have "Rambo" in his liked list. I understand that this query is doing an "OR" operation with the given terms. How do I tweak this query to do an "AND" operation so that only the records having all the terms get matched?
How do I tweak this query to do an "AND" operation so that only the records having all the terms get matched?
By using a bool query:
POST recs/user/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "movies_liked",
"query": {
"bool": {
"must": [
{
"terms": {
"movies_liked.name": [
"Forrest Gump"
]
}
}
]
}
}
}
},
{
"nested": {
"path": "movies_liked",
"query": {
"bool": {
"must": [
{
"terms": {
"movies_liked.name": [
"Terminator"
]
}
}
]
}
}
}
},
{
"nested": {
"path": "movies_liked",
"query": {
"bool": {
"must": [
{
"terms": {
"movies_liked.name": [
"Rambo"
]
}
}
]
}
}
}
}
]
}
}
}
Note that bool wraps several nested queries, not the other way around. This matters because the scope of a nested query is a single nested document, which is basically a hidden separate object, so no single nested document can match all three names at once.
Hope that helps!
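Since the pattern is one nested clause per required name, the request body can be generated from a list. A minimal Python sketch, assuming `movies_liked.name` is indexed as an exact value so that a term query on the full title matches; `users_who_like_all` is a hypothetical helper name.

```python
import json


def users_who_like_all(movie_names, path="movies_liked"):
    """Build a bool/must with one nested clause per required movie name."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": path,
                            # Each nested clause checks a single name, because
                            # one nested document holds only one movie.
                            "query": {"term": {f"{path}.name": name}},
                        }
                    }
                    for name in movie_names
                ]
            }
        }
    }


body = users_who_like_all(["Forrest Gump", "Terminator", "Rambo"])
print(json.dumps(body, indent=2))
```

A single-value term query is used here in place of the single-element terms lists in the answer; the two are meant to select the same documents.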

Elasticsearch must_not filter does not work with a large number of values

I have the following query, which includes some filters:
{
"from": 0,
"query": {
"function_score": {
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"idpais": [
115
]
}
},
{
"term": {
"tipo": [
1
]
}
}
],
"must_not": [
{
"term": {
"idregistro": [
5912471,
3433876,
9814443,
11703069,
6333176,
8288242,
9924922,
6677850,
11852501,
12530205,
4703469,
12776479,
12287659,
11823679,
12456304,
12777457,
10977614,
...
]
}
}
]
}
},
"query": {
"bool": {
"should": [
{
"match_phrase": {
"area": "Coordinator"
}
},
{
"match_phrase": {
"company": {
"boost": 5,
"query": "IBM"
}
}
},
{
"match_phrase": {
"topic": "IT and internet stuff"
}
},
{
"match_phrase": {
"institution": {
"boost": 5,
"query": "University of my city"
}
}
}
]
}
}
}
},
"script_score": {
"params": {
"idpais": 115,
"idprovincia": 0,
"relationships": []
},
"script_id": "ScoreUsuarios"
}
}
},
"size": 24,
"sort": [
{
"_script": {
"order": "desc",
"script_id": "SortUsuarios",
"type": "number"
}
}
]
}
The must_not filter has a large number of values to exclude (around 200), but it looks like Elasticsearch ignores those values and includes them in the result set. If I set only a few values (10 to 20), then Elasticsearch applies the must_not filter.
Is there a restriction on the number of values in a filter? Is there a way to exclude a large number of results from the query?
The terms query, not the term query, is the one that accepts a list of values. Use it as shown below in your filters.
{
"query": {
"terms": {
"field_name": [
"VALUE1",
"VALUE2"
]
}
}
}
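Applied to the filter from the question, the change might look like the Python sketch below: single values go through term, and the exclusion list goes through terms. `build_filter` is a hypothetical helper, and only the first few ids from the question are reused as sample input.

```python
import json


def build_filter(idpais, tipo, excluded_ids):
    """Bool filter body: term for single values, terms for the id list."""
    return {
        "bool": {
            "must": [
                {"term": {"idpais": idpais}},
                {"term": {"tipo": tipo}},
            ],
            "must_not": [
                # terms accepts a list; a document is excluded if its
                # idregistro equals any of the listed values.
                {"terms": {"idregistro": list(excluded_ids)}},
            ],
        }
    }


filter_body = build_filter(115, 1, [5912471, 3433876, 9814443])
print(json.dumps(filter_body, indent=2))
```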

ElasticSearch How to AND a nested query

I am trying to figure out how to AND my Elastic Search query. I've tried a few different variations but I am always hitting a parser error.
What I have is a structure like this:
{
"title": "my title",
"details": [
{ "name": "one", "value": 100 },
{ "name": "two", "value": 21 }
]
}
I have defined details as a nested type in my mappings. What I'm trying to achieve is a query where it matches a part of the title and it matches various details by the detail's name and value.
I have the following query which gets me nearly there but I haven't been able to figure out how to AND the details. As an example I'd like to find anything that has:
detail of one with value less than or equal to 100
AND detail of two with value less than or equal to 25
The following query only allows me to search by one detail name/value:
"query" : {
"bool": {
"must": [
{ "match": {"title": {"query": titleQuery, "operator": "and" } } },
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{ "match": {"details.name" : "one"} },
{ "range": {"details.value" : { "lte": 100 } } }
]
}
}
} // nested
}
] // must
}
}
As a second question, would it be better to query the title and then move the nested part of the query into a filter?
You were so close! Just add another "nested" clause in your outer "must":
POST /test_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "title",
"operator": "and"
}
}
},
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{"match": {"details.name": "one" } },
{ "range": { "details.value": { "lte": 100 } } }
]
}
}
}
},
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{"match": {"details.name": "two" } },
{ "range": { "details.value": { "lte": 25 } } }
]
}
}
}
}
]
}
}
}
Here is some code I used to test it:
http://sense.qbox.io/gist/1fc30d49a810d22e85fa68d781114c2865a7c92e
EDIT: Oh, the answer to your second question is "yes", though if you're using 2.0 things have changed a little.
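On the 2.0 remark: from Elasticsearch 2.0 the filtered query is deprecated in favor of the filter clause of the bool query, and clauses placed there do not contribute to the score. A minimal Python sketch of that shape, where `detail_at_most` and `build_query` are hypothetical helper names and the limits are the ones from the question:

```python
import json


def detail_at_most(name, max_value):
    """Nested clause: a detail with this name and value <= max_value."""
    return {
        "nested": {
            "path": "details",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"details.name": name}},
                        {"range": {"details.value": {"lte": max_value}}},
                    ]
                }
            },
        }
    }


def build_query(title, limits):
    """Score on the title match; apply the detail conditions as filters."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"title": {"query": title, "operator": "and"}}}
                ],
                # One nested clause per detail, all of which must hold.
                "filter": [detail_at_most(n, v) for n, v in limits.items()],
            }
        }
    }


query = build_query("my title", {"one": 100, "two": 25})
print(json.dumps(query, indent=2))
```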

Resources