Elasticsearch AND Parens - elasticsearch

I'm attempting to do the following with the query DSL, but I'll express it as SQL:
(matrices.matrix = 'Matrix1' AND matrices.count = 1) AND (matrices.matrix = 'Matrix2' AND matrices.count >= 0)
So I need to get docs that contain both of these nested docs with these values.
This is the nested document; it sits at the _source level:
"matrices": [
{
"terms": [],
"count": 0,
"matrix": "none"
},
{
"terms": [
"greater"
],
"count": 1,
"matrix": "Matrix1"
}
]
And here is the mapping for the sub-doc:
"matrices": {
"type": "nested",
"include_in_parent": true,
"properties": {
"count": {
"type": "long"
},
"matrix": {
"type": "string"
},
"terms": {
"type": "string"
}
}
}
So I need a query that returns docs matching both (matrix = 'none' && count = 0) && (matrix = 'Matrix' && count = 1).
Thanks,

So basically you want to retrieve documents that MUST contain two nested documents with the following criteria:
one nested document with matrices.count=0 AND matrices.matrix=none
another nested document with matrices.count=1 AND matrices.matrix=Matrix
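With the mapping you have, you can achieve that with the following query. It uses bool/must to combine two nested queries, each of which matches the criteria for one of the nested documents that must be present. The term values are lowercase because a string field with no explicit analyzer is lowercased by the standard analyzer at index time, while term queries are not analyzed.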
curl -XPOST localhost:9200/_search -d '{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "matrices",
"query": {
"bool": {
"must": [
{
"term": {
"matrices.count": 0
}
},
{
"term": {
"matrices.matrix": "none"
}
}
]
}
}
}
},
{
"nested": {
"path": "matrices",
"query": {
"bool": {
"must": [
{
"term": {
"matrices.count": 1
}
},
{
"term": {
"matrices.matrix": "matrix"
}
}
]
}
}
}
}
]
}
}
}
}
}'
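The same shape can be generated for any number of per-document criteria. Below is a minimal Python sketch, assuming a hypothetical helper named nested_all_of (it is not part of any Elasticsearch client). It wraps the clauses in bool/filter rather than filtered, because the filtered query was removed in Elasticsearch 5.0. It only builds the request body and does not contact a cluster.

```python
import json


def nested_all_of(path, criteria):
    """Build a query that requires one nested doc per entry in `criteria`.

    `criteria` is a list of dicts mapping field names (relative to `path`)
    to exact values. Hypothetical helper, not an Elasticsearch API.
    """
    nested_clauses = []
    for fields in criteria:
        # All terms of one entry must hold on the SAME nested document.
        terms = [
            {"term": {f"{path}.{name}": value}}
            for name, value in fields.items()
        ]
        nested_clauses.append(
            {"nested": {"path": path, "query": {"bool": {"must": terms}}}}
        )
    # Every nested clause must match, each possibly on a different nested doc.
    return {"query": {"bool": {"filter": nested_clauses}}}


body = nested_all_of(
    "matrices",
    [
        {"count": 0, "matrix": "none"},
        {"count": 1, "matrix": "matrix"},
    ],
)
print(json.dumps(body, indent=2))
```

Keeping each pair of terms inside its own nested clause is what preserves the parentheses of the SQL expression: the conditions within one pair are checked against a single nested document, while the pairs themselves may be satisfied by different nested documents.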

Related

Get the count of all the documents including innerHits in elasticsearch

I have an index defined in Elasticsearch with a three-level hierarchy of join relations:
aggParent
aggChildL1
aggChildL0
Below is the mapping for that index.
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 0
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"deviceName": {
"type": "keyword"
},
"agg_relation_type": {
"type": "join",
"relations": {
"aggParent": "aggChildL1",
"aggChildL1": "aggChildL0"
}
}
}
}
}
I have written a query that returns the parent documents in the hits and the corresponding children in inner_hits.
The query is:
{
"size": 1,
"query": {
"bool": {
"should": [
{
"has_child": {
"type": "aggChildL1",
"query": {
"bool": {
"should": [
{
"has_child": {
"type": "aggChildL0",
"query": {
"match": {
"id": "nc1olt5onu1unia"
}
},
"inner_hits": {
}
}
},
{
"bool": {
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
},
{
"match": {
"agg_relation_type": "aggChildL1"
}
}
]
}
}
]
}
},
"inner_hits": {
"size": 64,
"sort": [
{
"deviceType": {
"order": "desc"
}
}
]
}
}
},
{
"bool": {
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
},
{
"match": {
"agg_relation_type": "aggParent"
}
}
]
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "agg_relation_type"
}
},
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
}
]
}
}
]
}
}
}
This query returns a count at the top level, but it only covers the aggParent documents.
I need counts at the inner hits levels as well:
the count of all matching documents at the aggChildL0 level, the count of all documents loaded at the aggChildL1 level by the has_child query, and the count of documents that match the filter at the aggChildL1 level.
Similarly, the count of all documents loaded at the aggParent level by the topmost has_child query, and the count of documents that match the filter at the aggParent level.
Basically, I need the total count of all documents that the query can return.
Is there any way of getting this total count in Elasticsearch?
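One client-side possibility is to add up the totals that the response already reports at each level. The Python sketch below is only an illustration under several assumptions: the response uses the 7.x total format ({"value": n, "relation": ...}), nested inner_hits appear under each inner hit, and the sample response is hand-written rather than real output. It only covers the parents and children actually returned in the page, so it is not a global count, and a total whose relation is "gte" is a lower bound.

```python
def total_value(total):
    # 7.x reports {"value": n, "relation": ...}; older versions report an int.
    return total["value"] if isinstance(total, dict) else total


def count_hits(hits_section):
    """Sum the reported total of this level and of every inner_hits
    section found below the hits that were actually returned."""
    count = total_value(hits_section["total"])
    for hit in hits_section.get("hits", []):
        for inner in hit.get("inner_hits", {}).values():
            count += count_hits(inner["hits"])
    return count


# Hand-written response fragment for illustration only.
response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_id": "p1",
                "inner_hits": {
                    "aggChildL1": {
                        "hits": {
                            "total": {"value": 2, "relation": "eq"},
                            "hits": [
                                {
                                    "_id": "c1",
                                    "inner_hits": {
                                        "aggChildL0": {
                                            "hits": {
                                                "total": {"value": 3, "relation": "eq"},
                                                "hits": [],
                                            }
                                        }
                                    },
                                },
                                {"_id": "c2"},
                            ],
                        }
                    }
                },
            }
        ],
    }
}

# 1 aggParent + 2 aggChildL1 + 3 aggChildL0 = 6
print(count_hits(response["hits"]))
```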

If Else Elasticsearch

I have two sets of documents, which are joined by fragmentId. I have written a query that pulls both documents, but I am wondering whether there is another way to write it.
First set: there can be only one document with type = fragment and fragmentId = 1.
{
"fragmentId": "1",
"type" : "fragment"
}
Second set: there can be multiple such documents, distinguished by their start and end values. In the query, I will pass a value, and only the document within that range should be returned.
Doc-1
{
"fragmentId" : "1",
"type": "cf",
"start": 1,
"end": 5
}
Doc- 2
{
"fragmentId" : "1",
"type": "cf",
"start": 6,
"end": 10
}
In the result, I want the first-set document, plus only the second-set document that has the specific start and end values.
Here is the query, which works for me:
GET test/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "fragment"
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"term": {
"type": "cf"
}
},
{
"range" :{
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
}
Is there a way to rewrite this query in a simpler form, so that the first document is always picked together with the range-matching document from the second set? Basically, a join operation on fragmentId.
Are you looking for something like this?
GET test/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"fragmentId": "1"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"type": "fragment"
}
},
{
"bool": {
"must": [
{
"term": {
"type": "cf"
}
},
{
"range": {
"start": {
"gte": 1,
"lte": 5
}
}
}
]
}
}
]
}
}
]
}
}
}
This query translates to:
(fragmentId = 1 AND (type = fragment OR (type = cf AND start is between 1 and 5)))
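To check that this boolean expression selects the intended documents, the logic can be mirrored in plain Python and run against the sample documents from the question. This is only a sketch of the boolean structure; the function name matches and its parameters are made up for the illustration, and no Elasticsearch call is involved.

```python
def matches(doc, fragment_id, range_lo, range_hi):
    """Plain-Python mirror of the bool query's logic (illustration only)."""
    # Outer must: the fragmentId term applies to every document.
    if doc.get("fragmentId") != fragment_id:
        return False
    # First should branch: the single "fragment" document.
    if doc.get("type") == "fragment":
        return True
    # Second should branch: a "cf" document whose start lies in the range.
    start = doc.get("start")
    return (
        doc.get("type") == "cf"
        and start is not None
        and range_lo <= start <= range_hi
    )


docs = [
    {"fragmentId": "1", "type": "fragment"},
    {"fragmentId": "1", "type": "cf", "start": 1, "end": 5},
    {"fragmentId": "1", "type": "cf", "start": 6, "end": 10},
]

selected = [d for d in docs if matches(d, "1", 1, 5)]
print(selected)
```

With the range 1 to 5, the fragment document and Doc-1 are selected, while Doc-2 (start = 6) is left out.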

Score keyword terms query on nested fields in Elasticsearch 6.3

I have a set of keywords (skills in my example) and I would like to retrieve documents that match most of them. The documents should be sorted by how many of the keywords they match. The field I am searching (skills) is of nested type. The index has the following mapping:
{
"mappings": {
"profiles": {
"properties": {
"id": {
"type": "keyword"
},
"skills": {
"type": "nested",
"properties": {
"level": {
"type": "float"
},
"name": {
"type": "keyword"
}
}
}
}
}
}
}
I tried a terms query on the keyword field:
{
"query": {
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"python",
"java"
]
}
}
}
}
}
and a bool query:
{
"query": {
"nested": {
"path": "skills",
"query": {
"bool": {
"should": [
{
"terms": {
"skills.name": [
"java"
]
}
},
{
"terms": {
"skills.name": [
"r"
]
}
}
]
}
}
}
}
}
For both queries, the maximum score of the returned documents is 1. Thus both return documents that have ANY of the skills, but do not sort them so that those with both skills are on top. The issue seems to be that skills is a nested field.
The second query works if each element of should is its own nested query:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"java"
]
}
}
}
},
{
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"r"
]
}
}
}
}
]
}
}
}
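A likely reason, assuming default settings: a terms query gives every matching nested document the same constant score, and the nested query's default score_mode of avg averages those scores, so the parent scores 1 however many skills matched. With one nested clause per skill, the outer bool adds up the scores of the should clauses that match. The Python sketch below builds that request body for an arbitrary list of skills; the helper name skills_query is hypothetical.

```python
import json


def skills_query(skills, path="skills", field="skills.name"):
    """Build one nested clause per skill so that each matched skill
    contributes its own score to the outer bool query.

    Hypothetical helper; it only assembles the request body.
    """
    clauses = [
        {"nested": {"path": path, "query": {"terms": {field: [skill]}}}}
        for skill in skills
    ]
    return {"query": {"bool": {"should": clauses}}}


body = skills_query(["java", "r"])
print(json.dumps(body, indent=2))
```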

Elasticsearch search in array

I'm trying to find a way to search within the same array.
Example Dataset
{
"_id": "23424232",
"vehicule": [
{ "tags": ["kawasaki", "suzuki", "ducati"] },
{ "tags": ["opel", "mercedes", "ford"] }
]
}
If I search for someone with "kawasaki" and "opel" in the same tags array, I expect 0 hits, but Elasticsearch finds the customer.
Query
"query": {
"bool": {
"must": [
{ "term": { "vehicule.tags" : "kawasaki"}},
{ "term": { "vehicule.tags" : "opel"}}
]
}
}
Mapping
"vehicule": {
"include_in_parent": true,
"type": "nested",
"properties": {
"tags":{
"type":"string",
"analyzer":"code_tokenizer"
},
I think it's because Elasticsearch flattens the tags into a single array, as shown below, and I would like to avoid that. How can I do that?
"tags":['kawasaki','suzuki','ducati','opel','mercedes','ford']
I found a solution that works for me:
{
"query": {
"nested": {
"path": "vehicule",
"query": {
"bool": {
"must": [
{
"term": {
"vehicule.tags": "suzuki"
}
},
{
"term": {
"vehicule.tags": "opel"
}
}
]
}
}
}
}
}
For that query, Elasticsearch finds 0 customers :)
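The difference between matching against the flattened tags and matching per nested object can be modelled in plain Python. This is a simplified sketch of the two behaviours, not Elasticsearch code; the function names are made up, and the custom code_tokenizer analyzer from the mapping is ignored.

```python
def flat_match(doc, wanted):
    # With include_in_parent, the tags of every nested object are also
    # indexed on the root document, so a root-level query sees one merged
    # bag of tags.
    merged = {tag for vehicle in doc["vehicule"] for tag in vehicle["tags"]}
    return all(tag in merged for tag in wanted)


def nested_match(doc, wanted):
    # A nested query evaluates each nested object on its own, so all
    # wanted tags must occur in the same object.
    return any(
        all(tag in vehicle["tags"] for tag in wanted)
        for vehicle in doc["vehicule"]
    )


customer = {
    "_id": "23424232",
    "vehicule": [
        {"tags": ["kawasaki", "suzuki", "ducati"]},
        {"tags": ["opel", "mercedes", "ford"]},
    ],
}

print(flat_match(customer, ["kawasaki", "opel"]))       # tags from different objects
print(nested_match(customer, ["kawasaki", "opel"]))     # same pair, per object
print(nested_match(customer, ["kawasaki", "suzuki"]))   # both tags in one object
```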

Elastic search DSL Syntax equivalence for SQL statement

I'm trying to replicate the below query logic in an elastic search query but something's not right.
Basically, the query below returns one doc. I'd like either the first condition to apply, "name": "iphone", OR the more complex second one: (username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238). The bool must nested inside the should is meant to handle the more complex condition. I should still see 1 doc if I change the outer match from "name": "iphone" to "name": "wrong value", but I get nothing when I do that. I'm not sure what is wrong.
The SQL query is:
SELECT * from data_points
WHERE name = 'iphone'
OR
(username = 'gogadget' AND status_type = '1' AND created_time between 4532564 AND 64323238)
And the Elasticsearch query:
{
"size": 30,
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": "1",
"should": [
{
"bool": {
"must": [
{
"match": {
"username": "gogadget"
}
},
{
"terms": {
"status_type": [
"3",
"4"
]
}
},
{
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}
]
}
}
],
"must": [],
"must_not": []
}
},
{
"match": {
"name": "iphone"
}
}
]
}
}
}
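The outer must requires every clause in it to match, so the name match is mandatory rather than an alternative. That is why changing it to a wrong value returns nothing.
You don't need must to combine the two sides of an OR. Put both of them in a single should, which matches when either clause matches.
The query should look like: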
{
"query": {
"bool": {
"should": [{
"bool": {
"must": [{
"match": {
"username": "gogadget"
}
}, {
"terms": {
"status_type": [
"3",
"4"
]
}
}, {
"range": {
"created_time": {
"gte": 20140712,
"lte": 1405134711
}
}
}]
}
}, {
"match": {
"name": "iphone"
}
}]
}
}
}
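The mapping from SQL operators to bool clauses can be made explicit with two small builders. This is a minimal Python sketch; all_of and any_of are hypothetical helper names, and the explicit minimum_should_match of 1 simply spells out that at least one alternative has to match.

```python
import json


def all_of(*clauses):
    # SQL AND: every clause is required.
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    # SQL OR: at least one clause has to match.
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


body = {
    "query": any_of(
        all_of(
            {"match": {"username": "gogadget"}},
            {"terms": {"status_type": ["3", "4"]}},
            {"range": {"created_time": {"gte": 20140712, "lte": 1405134711}}},
        ),
        {"match": {"name": "iphone"}},
    )
}
print(json.dumps(body, indent=2))
```

Nesting all_of inside any_of mirrors the parentheses of the WHERE clause, which keeps the translation easy to verify by eye.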
