ElasticSearch simple query - elasticsearch

I have structure like this in my ElasticSearch
{
_index: 'index',
_type: 'product',
_id: '896',
_score: 0,
_source: {
entity_id: '896',
category: [
{
category_id: 2,
is_virtual: 'false'
},
{
category_id: 82,
is_virtual: 'false'
}
]
}
}
I want return all "producs" that have "82" category_id.
{
"query": {
"bool": {
"filter": {
"terms": {
"category.category_id": [
82
]
}
}
}
}
}
This query gives me 0 hits.
What is right way to do this?

Adding working example, you need to define the category as nested field and modify your search query by including the nested path
Index Mapping
{
"mappings": {
"properties": {
"entity_id": {
"type": "text"
},
"category": {
"type": "nested"
}
}
}
}
Index your document
{
"entity_id": "896",
"category": [
{
"category_id": 2,
"is_virtual": false
},
{
"category_id": 82,
"is_virtual": false
}
]
}
Proper search query, note we are using nested query which doesn't support normal filter(so your query gives error)
{
"query": {
"nested": {
"path": "category",
"query": {
"bool": {
"must": [
{
"match": {
"category.category_id": 82
}
}
]
}
}
}
}
}
Search result retuns indexed doc
"hits": [
{
"_index": "complexnested",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"entity_id": "896",
"category": [
{
"category_id": 2,
"is_virtual": false
},
{
"category_id": 82,
"is_virtual": false
}
]
}
}
]

If your query gives you no results, I suspect that category is of type nested in your index mapping. If that's the case, that's good and you can modify your query like this to use the nested query:
{
"query": {
"bool": {
"filter": {
"nested": {
"path": "category",
"query": {
"terms": {
"category.category_id": [
82
]
}
}
}
}
}
}
}

Related

Query for nested fields returns results as if there was no nested mapping

I am having difficulties understanding, why a query across nested fields is returning unexpected results.
I have the following template for my index
PUT /_template/nested_test
{
"index_patterns": [ "nested-*" ],
"settings": { "index.mapping.coerce": false },
"mappings": {
"dynamic": "strict",
"properties" {
"vNested": {
"type": "nested",
"properties": {
"v1": { "type": "keyword" },
"v2": {
"properties": {
"v21": {
"type": long"
}
}
}
}
}
}
}
}
I will post two documents to an index that matches the template.
POST /nested-example/_doc
{
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 1
}
},
{
"v1": "User3",
"v2": {
"v21": 3
}
}
]
}
POST /nested-example/_doc
{
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 3
}
},
{
"v1": "User2",
"v2": {
"v21": 2
}
}
]
}
Now I will create a query with the goal of only getting the results of those documents, where there exists User1 with a corresponding v21 value of 3. As far as I understand, my nested mapping should ensure that I will only get the second document as query result.
The following query:
GET /nested-example/_search
{
"query" : {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested: {
"path": "vNested",
"query": {
"match": {
"vNested.v1": "User1"
}
}
}
},
{
"nested: {
"path": "vNested",
"query": {
"match": {
"vNested.v2.v21": "3"
}
}
}
}
]
}
}
}
}
}
returns both documents, not only the single document that I expected
I understand that the query string is not the most elegant - this is due to some business logic + front-end framework logic in place for creating the query strings based on user input and any suggestions on how to remove redundancies there are welcome as well.
However I struggle to understand why does this query return both documents including the one where the vNested object with v1=User1, and v21=1. Shouldn't the nested mapping of the vNested field prevent just that issue?
You need to use bool/must query inside the nested query since you are querying on a single object and not on multiple objects. Modify your query as
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "vNested",
"query": {
"bool": {
"must": [
{
"match": {
"vNested.v1": "User1"
}
},
{
"match": {
"vNested.v2.v21": "3"
}
}
]
}
},
"inner_hits":{}
}
}
]
}
}
}
}
}
Search Result is
"hits": [
{
"_index": "nested-example",
"_type": "_doc",
"_id": "AAu0IXkBKyWl6Va6kmTU",
"_score": 0.0,
"_source": {
"vNested": [
{
"v1": "User1",
"v2": {
"v21": 3
}
},
{
"v1": "User2",
"v2": {
"v21": 2
}
}
]
},
"inner_hits": {
"vNested": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.6931472,
"hits": [
{
"_index": "nested-example",
"_type": "_doc",
"_id": "AAu0IXkBKyWl6Va6kmTU",
"_nested": {
"field": "vNested",
"offset": 0
},
"_score": 1.6931472,
"_source": {
"v1": "User1",
"v2": {
"v21": 3
}
}
}
]
}
}
}
}
]

Elasticsearch filter by nested fields

I have a problem with creating a query to Elasticsearch with many conditions. My model looks like:
data class Product(
#Id
val id: String? = null,
val category: String,
val imagesUrls: List<String>,
#Field(type = FieldType.Double)
val price: Double?,
#Field(type = FieldType.Nested)
val parameters: List<Parameter>?
)
data class Parameter(
val key: String,
val values: List<String>
)
I would like to query products by:
category (for example cars)
price (between 20k $ and 50k $)
and parameters -> For example products with many parameters, like key capacity values 4L, 5L and second parameter gear transmission values manual
My current query looks like this:
GET data/_search
{
"size": 10,
"query": {
"bool": {
"must": [
{
"term": {
"category.keyword": {
"value": "cars"
}
}
},
{
"nested": {
"path": "parameters",
"query": {
"bool": {
"must": [
{"term": {
"parameters.key.keyword": {
"value": "Capacity"
}
}},
{
"term": {
"parameters.key": {
"value": "4L, 5L"
}
}
}
]
}
}
}
}
]
}
}
Could you tell me how to filter the product when parameter key is equal to Capacity and check that the values list contains one of the values?
How to combine many this kind operations in one query?
Example data:
{
"category":"cars",
"name":"Ferrari",
"price":50000,
"parameters":[
{
"key":"capacity",
"values":"4L"
},
{
"key":"gear transmission",
"values":"automcatic"
}
]
}
The search query shown below queries the data based on:
category (for example cars)
And parameters -> For example products with many parameters, like key capacity values 4L, 5L and second parameter gear transmission
values manual
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"parameters": {
"type": "nested"
}
}
}
}
Index Data:
{
"category":"cars",
"name":"Ferrari",
"price":50000,
"parameters":[
{
"key":"gear transmission",
"values":["4L","5L"]
},
{
"key":"capacity",
"values":"automcatic"
}
]
}
{
"category":"cars",
"name":"Ferrari",
"price":50000,
"parameters":[
{
"key":"capacity",
"values":["4L","5L"]
},
{
"key":"gear transmission",
"values":"automcatic"
}
]
}
{
"category":"cars",
"name":"Ferrari",
"price":50000,
"parameters":[
{
"key":"capacity",
"values":"4L"
},
{
"key":"gear transmission",
"values":"automcatic"
}
]
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"term": {
"category.keyword": {
"value": "cars"
}
}
},
{
"nested": {
"path": "parameters",
"query": {
"bool": {
"must": [
{
"match": {
"parameters.key": "capacity"
}
},
{
"terms": {
"parameters.values": [
"4l",
"5l"
]
}
}
]
}
}
}
},
{
"nested": {
"path": "parameters",
"query": {
"bool": {
"must": [
{
"match": {
"parameters.key": "gear transmission"
}
},
{
"match": {
"parameters.values": "automcatic"
}
}
]
}
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "bstof",
"_type": "_doc",
"_id": "1",
"_score": 3.9281754,
"_source": {
"category": "cars",
"name": "Ferrari",
"price": 50000,
"parameters": [
{
"key": "capacity",
"values": "4L"
},
{
"key": "gear transmission",
"values": "automcatic"
}
]
}
},
{
"_index": "bstof",
"_type": "_doc",
"_id": "2",
"_score": 3.9281754,
"_source": {
"category": "cars",
"name": "Ferrari",
"price": 50000,
"parameters": [
{
"key": "capacity",
"values": [
"4L",
"5L"
]
},
{
"key": "gear transmission",
"values": "automcatic"
}
]
}
}
]
When you need to match any one from a list then you can use terms query instead of term. Update the part in query from:
{
"term": {
"parameters.key": {
"value": "4L, 5L"
}
}
}
to below:
{
"terms": {
"parameters.values": {
"value": [
"4L",
"5L"
]
}
}
}
Note that if parameters.key is analysed field and there exist a keyword sub-field for the same, then use it instead. e.g parameters.values.keyword
You can read more on terms query here.

How to combine simplequerystring with bool/must

I have this ElasticSearch query for ES version 7:
{
"from": 0,
"simple_query_string": {
"query": "*"
},
"query": {
"bool": {
"must": [
{
"term": {
"organization_id": "fred"
}
},
{
"term": {
"assigned_user_id": "24584080"
}
}
]
}
},
"size": 50,
"sort": {
"updated": "desc"
},
"terminate_after": 50,
}
but ES gives me back this error:
reason: Unknown key for a START_OBJECT in [simple_query_string]
my goal is to be able to use a query-string for multiple fields, and also use term/match with bool/must. Should I abandon the query string and just use bool.must[{match:"my query"}]?
You can use bool to combine multiple queries in this way. The must clause will work as logical AND, and will make sure all the conditions are matched.
You need to include the simple_query_string inside the query section
Adding Working example with sample docs, and search query.
Index Sample Data
{
"organization_id": 1,
"assigned_user_id": 2,
"title": "welcome"
}{
"organization_id": 2,
"assigned_user_id": 21,
"title": "hello"
}{
"organization_id": 3,
"assigned_user_id": 22,
"title": "hello welocome"
}
Search Query :
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"fields" : ["title"],
"query" : "welcome"
}
},
{
"match": {
"organization_id": "1"
}
},
{
"match": {
"assigned_user_id": "2"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 3.0925694,
"_source": {
"organization_id": 1,
"assigned_user_id": 2,
"title": "welcome"
}
}
]

ElasticSearch Aggregation on nested field with bucketing on parent id

Following is my doc structure
'Order': {
u'properties': {
u'order_id': {u'type': u'integer'},
'Product': {
u'properties': {
u'product_id': {u'type': u'integer'},
u'product_category': {'type': 'text'},
},
u'type': u'nested'
}
}
}
Doc1
"Order": {
"order_id": "1",
"Product": [
{
"product_id": "1",
"product_category": "category_1"
},
{
"product_id": "2",
"product_category": "category_2"
},
{
"product_id": "3",
"product_category": "category_2"
},
]
}
Doc2
"Order": {
"order_id": "2",
"Product": [
{
"product_id": "4",
"product_category": "category_1"
},
{
"product_id": "1",
"product_category": "category_1"
},
{
"product_id": "2",
"product_category": "category_2"
},
]
}
I want to get following output
"aggregations": {
"Order": [
{
"order_id": "1"
"category_counts": [
{
"category_1": 1
},
{
"category_2": 2
},
]
},
{
"order_id": "1"
"category_counts": [
{
"category_1": 2
},
{
"category_2": 1
},
]
},
]
}
I tried using nested aggregation
"aggs": {
"Product-nested": {
"nested": {
"path": "Product"
}
"aggs": {
"category_counts": {
"terms": {
"field": "Product.product_category"
}
}
},
}
}
It does not give output for each order but gives combined output for all orders
{
"Product-nested": {
"category_counts": [
"category_1": 3,
"category_2": 3
]
}
}
I have two questions:
How to get the desired output in above scenario?
What if instead of single product_category I have an array of
product_categories then how will we achieve the same in this
scenario?
I am using elasticsearch >= 5.0
I have an idea but i dont think its the best one..
you can make a terms aggregation on the "order_id" field, then a sub nestes aggregation on "Product.product_category".
somthing like this :
{
"aggs": {
"all-order-id": {
"terms": {
"field": "order_id",
"size": 10
},
"aggs": {
"Product-nested": {
"nested": {
"path": "Product"
},
"aggs": {
"all-products-in-order-id": {
"terms": {
"field": "Product.product_category"
}
}
}
}
}
}
}
}
sorry its lock bit messy i'm not so good with this answer editor

Empty inner_hits in compound Elasticsearch filter

I'm seeing what appears to be aberrant behavior in inner_hits results within nested boolean queries.
Test data (abbreviated for brevity):
# MAPPING
PUT unit_testing
{
"mappings": {
"document": {
"properties": {
"display_name": {"type": "text"},
"metadata": {
"properties": {
"NAME": {"type": "text"}
}
}
}
},
"paragraph": {
"_parent": {"type": "document"},
"_routing": {"required": true},
"properties": {
"checksum": {"type": "text"},
"sentences": {
"type": "nested",
"properties": {
"text": {"type": "text"}
}
}
}
}
}
}
# DOCUMENT X 2 (d0, d1)
PUT unit_testing/document/doc_id_d0
{
"display_name": "Test Document d0",
"paragraphs": [
"para_id_d0p0",
"para_id_d0p1"
],
"metadata": {"NAME": "Test Document d0 Metadata"}
}
# PARAGRAPH X 2 (d0p0, d1p0)
PUT unit_testing/paragraph/para_id_d0p0?parent=doc_id_d0
{
"checksum": "para_checksum_d0p0",
"sentences": [
{"text": "Test sentence d0p0s0"},
{"text": "Test sentence d0p0s1 ODD"},
{"text": "Test sentence d0p0s2 EVEN"},
{"text": "Test sentence d0p0s3 ODD"},
{"text": "Test sentence d0p0s4 EVEN"}
]
}
This initial query behaves as I would expect (I'm aware that the metadata filter isn't actually necessary in this example case):
GET unit_testing/paragraph/_search
{
"_source": "false",
"query": {
"bool": {
"must": [
{
"has_parent": {
"query": {
"match_phrase": {
"metadata.NAME": "Test Document d0 Metadata"
}
},
"type": "document"
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d0p0s0"
}
}
}
}
]
}
}
}
It yields an inner_hits object containing the one sentence that matched the predicate (some fields removed for clarity):
{
"hits": {
"hits": [
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"hits": [
{
"_source": {
"text": "Test sentence d0p0s0"
}
}
]
}
}
}
}
]
}
}
The following query is an attempt to embed the query above within a parent "should" clause, to create a logical OR between the initial query, and an additional query that matches a single sentence:
GET unit_testing/paragraph/_search
{
"_source": "false",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"has_parent": {
"query": {
"match_phrase": {
"metadata.NAME": "Test Document d0 Metadata"
}
},
"type": "document"
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d0p0s0"
}
}
}
}
]
}
},
{
"nested": {
"inner_hits": {},
"path": "sentences",
"query": {
"match": {
"sentences.text": "d1p0s0"
}
}
}
}
]
}
}
}
While the "d1" query outputs the result one would expect, with an inner_hits object containing the matching sentence, the original "d0" query now yields an empty inner_hits object:
{
"hits": {
"hits": [
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"total": 0,
"hits": []
}
}
}
},
{
"_source": {},
"inner_hits": {
"sentences": {
"hits": {
"hits": [
{
"_source": {
"text": "Test sentence d1p0s0"
}
}
]
}
}
}
}
]
}
}
Although I'm using the elasticsearch_dsl Python library to build and combine these queries, and I'm something of a novice with respect to the Query DSL, the query format looks solid to me.
What am I missing?
I think what is missing is the name parameter for inner_hits - you have two inner_hits clauses at two different queries that would end up with the same name. Try giving the inner_hits a name parameter (0).
0 - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#_options

Resources