Checking if a field exists for any and/or all nested objects - elasticsearch

I took a look at "ElasticSearch: search inside the array of objects" and, while it helps, I'm actually trying to determine whether at least one nested object has a field and whether all nested objects have the field.
Suppose we have an index of all refrigerators with a sample document like:
{
"_id": "whatever",
"location": "North Building 1",
"floor": 2,
"tag": "refrigerator-1",
"contents" : [
{
"item": "milk-carton",
"expires": "2023-01-01"
},
{
"item": "pyrex-container"
}
]
}
How do I create an Elasticsearch query to:
Find any refrigerator that has at least 1 item that CAN expire ("exists": { "field": "expires" })
Find refrigerators that have no items that expire
Find refrigerators where all items have an expires field

If you want to do this in a single query, use named queries (the _name parameter).
Query
{
"query": {
"bool": {
"should": [
{
"nested": {
"_name": "At least one expires",
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
},
{
"bool": {
"_name": "None expires",
"must_not": [
{
"nested": {
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
}
]
}
},
{
"bool": {
"_name": "All expires",
"must": [
{
"nested": {
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
}
],
"must_not": [
{
"nested": {
"path": "contents",
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "contents.expires"
}
}
]
}
}
}
}
]
}
}
]
}
}
}
Result
"hits" : [
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "Qt2PVoQB_m3FhzcGBasD",
"_score" : 2.0,
"_source" : {
"location" : "North Building 1",
"floor" : 3,
"tag" : "refrigerator-3",
"contents" : [
{
"item" : "milk-carton",
"expires" : "2023-01-01"
},
{
"item" : "pyrex-container",
"expires" : "2023-01-01"
}
]
},
"matched_queries" : [
"At least one expires",
"All expires"
]
},
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "QN2BVoQB_m3FhzcG9qsG",
"_score" : 1.0,
"_source" : {
"location" : "North Building 1",
"floor" : 2,
"tag" : "refrigerator-1",
"contents" : [
{
"item" : "milk-carton",
"expires" : "2023-01-01"
},
{
"item" : "pyrex-container"
}
]
},
"matched_queries" : [
"At least one expires"
]
},
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "Qd2HVoQB_m3FhzcGUauO",
"_score" : 0.0,
"_source" : {
"location" : "North Building 1",
"floor" : 3,
"tag" : "refrigerator-2",
"contents" : [
{
"item" : "milk-carton"
},
{
"item" : "pyrex-container"
}
]
},
"matched_queries" : [
"None expires"
]
}
]
The query is self-explanatory. If you want separate queries for the three conditions, split the query above: each should clause becomes a separate query.
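To make the split concrete, here is a minimal Python sketch that builds the three standalone request bodies from the same clauses. It assumes `contents` is mapped as `nested` (the nested query needs that mapping). The helper names are made up for illustration, and `classify` only mirrors the boolean logic locally, using key presence as a stand-in for `exists`; it does not talk to Elasticsearch.

```python
import json

FIELD = "contents.expires"  # nested field from the question


def any_with_field():
    """Matches a document if at least one nested object has the field."""
    return {"nested": {"path": "contents",
                       "query": {"exists": {"field": FIELD}}}}


def any_without_field():
    """Matches a document if at least one nested object lacks the field."""
    return {"nested": {"path": "contents",
                       "query": {"bool": {"must_not": [
                           {"exists": {"field": FIELD}}]}}}}


def build_queries():
    """One request body per condition (hypothetical helper, not an ES API)."""
    return {
        "at_least_one": {"query": any_with_field()},
        "none": {"query": {"bool": {"must_not": [any_with_field()]}}},
        "all": {"query": {"bool": {"must": [any_with_field()],
                                   "must_not": [any_without_field()]}}},
    }


def classify(contents):
    """Local mirror of the three conditions for a list of nested objects."""
    has = ["expires" in item for item in contents]
    return {
        "at_least_one": any(has),
        "none": not any(has),
        "all": any(has) and all(has),
    }


if __name__ == "__main__":
    for name, body in build_queries().items():
        print(name, json.dumps(body))
    print(classify([{"item": "milk-carton", "expires": "2023-01-01"},
                    {"item": "pyrex-container"}]))
```

The "all" case is written as "at least one has it AND none lacks it", which is the same shape as the third should clause in the combined query.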

Related

Elasticsearch multiple fields query

I need help creating an Elasticsearch search query.
First, the fields being searched are of keyword type.
Data:
"hits" : [
{
"_index" : "search_event",
"_type" : "_doc",
"_score" : 5.179434,
"_source" : {
"search_keyword" : [
{
"search" : "or",
"keyword" : "developer",
"type" : "18"
}
]
}
},
{
"_source" : {
"search_keyword" : [
{
"search" : "or",
"keyword" : "tail"
},
{
"search" : "or",
"keyword" : "cap"
},
{
"search" : "and",
"keyword" : "developer"
}
]
}
}
]
When searching, a document must have keyword=developer and search=or:
{
"query": {
"bool": {
"filter": [
{
"term": {
"search_keyword.keyword": {
"value": "developer"
}
}
},
{
"term": {
"search_keyword.search": {
"value": "or"
}
}
}
]
}
}
}
However, documents with 'keyword=developer and search=and' are also returned.
How do I write the query?
"hits" : [
{
"_index" : "search_event",
"_type" : "_doc",
"_score" : 5.179434,
"_source" : {
"search_keyword" : [
{
"search" : "or",
"keyword" : "developer",
"type" : "18"
},
{
"search" : "or",
"keyword" : "tail"
},
{
"search" : "or",
"keyword" : "cap"
},
{
"search" : "and",
"keyword" : "developer"
}
]
}
}
]
I don't want documents matching 'keyword=developer and search=and',
only documents matching 'keyword=developer and search=or'.
Use the query below. Note that it uses a must clause instead of filter.
{
"query": {
"bool": {
"must": [
{
"term": {
"search_keyword.keyword": {
"value": "developer"
}
}
},
{
"term": {
"search_keyword.search": {
"value": "or"
}
}
}
]
}
}
}
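One caveat worth checking: if `search_keyword` is a plain array of objects (not mapped as `nested`), Elasticsearch flattens it into per-field value lists, so the two term clauses can match values from different objects whether they sit in `must` or `filter`. The Python sketch below mimics that flattening locally and shows a nested query that keeps both conditions on the same object. It assumes you are able to map `search_keyword` as `nested`; the function names are illustrative only.

```python
def flatten(objects):
    """Mimic how a non-nested array of objects is indexed: one value list per field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def matches_flat(objects, keyword, search):
    """Both values exist somewhere in the array, not necessarily in the same object."""
    flat = flatten(objects)
    return keyword in flat.get("keyword", []) and search in flat.get("search", [])


def matches_nested(objects, keyword, search):
    """Both values must appear in the same object, as a nested query requires."""
    return any(o.get("keyword") == keyword and o.get("search") == search
               for o in objects)


doc_a = [{"search": "or", "keyword": "developer", "type": "18"}]
doc_b = [{"search": "or", "keyword": "tail"},
         {"search": "or", "keyword": "cap"},
         {"search": "and", "keyword": "developer"}]

# Request body assuming "search_keyword" is mapped with "type": "nested".
nested_query = {
    "query": {
        "nested": {
            "path": "search_keyword",
            "query": {"bool": {"filter": [
                {"term": {"search_keyword.keyword": "developer"}},
                {"term": {"search_keyword.search": "or"}},
            ]}},
        }
    }
}

if __name__ == "__main__":
    print("flat:", matches_flat(doc_a, "developer", "or"),
          matches_flat(doc_b, "developer", "or"))
    print("nested:", matches_nested(doc_a, "developer", "or"),
          matches_nested(doc_b, "developer", "or"))
```

In the flattened view the second document matches because "developer" and "or" both occur somewhere in the array; in the nested view it does not, since no single object carries both values.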

elasticsearch filter nested object

I have an index with a nested object containing two attributes, scopeId and categoryName. The relevant part of the index mapping is:
"mappedCategories" : {
"type" : "nested",
"properties": {
"scopeId": {"type":"long"},
"categoryName": {"type":"text",
"analyzer" : "productSearchAnalyzer",
"search_analyzer" : "productSearchQueryAnalyzer"}
}
}
A sample document containing the nested mappedCategories object is as follows:
POST productsearchna_2/_doc/1
{
"categoryName" : "Operating Systems",
"contexts" : [
0
],
"countryCode" : "US",
"id" : "10076327-1",
"languageCode" : "EN",
"localeId" : 1,
"mfgpartno" : "test123",
"manufacturerName" : "Hewlett Packard Enterprise",
"productDescription" : "HPE Microsoft Windows 2000 Datacenter Server - Complete Product - Complete Product - 1 Server - Standard",
"productId" : 10076327,
"skus" : [
{"sku": "43233004",
"skuName": "UNSPSC"},
{"sku": "43233049",
"skuName": "SP Richards"},
{"sku": "43234949",
"skuName": "Ingram Micro"}
],
"mappedCategories" : [
{"scopeId": 3228552,
"categoryName": "Laminate Bookcases"},
{"scopeId": 3228553,
"categoryName": "Bookcases"},
{"scopeId": 3228554,
"categoryName": "Laptop"}
]
}
I want to filter on categoryName "lap" within scopeId 3228553, i.e. my query should return 0 hits, since Laptop is mapped to scopeId 3228554. But the following query returns 1 hit, with scopeId 3228554:
POST productsearchna_2/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.categoryName": "lap"
}
},
"inner_hits": {}
}
}
],
"filter": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.scopeId": {
"value": 3228552
}
}
}
}
}
]
}
},
"_source": ["mappedCategories.categoryName", "productId"]
}
Following is part of the result of the query:
"inner_hits" : {
"mappedCategories" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.5586993,
"hits" : [
{
"_index" : "productsearchna_2",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "mappedCategories",
"offset" : 2
},
"_score" : 1.5586993,
"_source" : {
"scopeId" : 3228554,
"categoryName" : "Laptop"
}
}
]
}
}
I want my query to return zero hits. And if I search for "book" with scopeId 3228552, I want it to return 2 hits: one for the Bookcases categoryName and another for Laminate Bookcases. Please help.
This query solves part of the problem, but when searching for "book" with scopeId: 3228552 it will only get 1 result.
GET idx_test/_search?filter_path=hits.hits.inner_hits
{
"query": {
"nested": {
"path": "mappedCategories",
"query": {
"bool": {
"filter": [
{
"term": {
"mappedCategories.scopeId": {
"value": 3228553
}
}
}
],
"must": [
{
"match": {
"mappedCategories.categoryName": "laptop"
}
}
]
}
},
"inner_hits": {}
}
}
}
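To see why "book" with scopeId 3228552 yields a single inner hit, here is a small Python sketch. It builds the same single-nested-clause body for any scopeId/text pair and checks the sample document locally. The prefix matching is only a stand-in assumption for whatever `productSearchAnalyzer` actually does, and the helper names are hypothetical.

```python
def build_scoped_query(scope_id, text):
    """Both conditions live in ONE nested clause, so they apply to the same object."""
    return {
        "query": {
            "nested": {
                "path": "mappedCategories",
                "query": {"bool": {
                    "filter": [{"term": {
                        "mappedCategories.scopeId": {"value": scope_id}}}],
                    "must": [{"match": {
                        "mappedCategories.categoryName": text}}],
                }},
                "inner_hits": {},
            }
        }
    }


# Nested objects from the sample document in the question.
MAPPED = [
    {"scopeId": 3228552, "categoryName": "Laminate Bookcases"},
    {"scopeId": 3228553, "categoryName": "Bookcases"},
    {"scopeId": 3228554, "categoryName": "Laptop"},
]


def local_inner_hits(categories, scope_id, text):
    """Local approximation: same scopeId AND some token starts with the text."""
    needle = text.lower()
    return [c["categoryName"] for c in categories
            if c["scopeId"] == scope_id
            and any(tok.startswith(needle)
                    for tok in c["categoryName"].lower().split())]


if __name__ == "__main__":
    print(local_inner_hits(MAPPED, 3228553, "lap"))
    print(local_inner_hits(MAPPED, 3228552, "book"))
```

With the sample data, "Bookcases" belongs to scopeId 3228553, so only "Laminate Bookcases" can satisfy both conditions for 3228552.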

Documents repeating in the query of elasticsearch

I'm new to Elasticsearch. I need to build the query dynamically, so that for each field name the corresponding file is fetched.
I have the query below; can anyone say whether it's the right approach? Also, with this query the documents just repeat for one particular file name.
Please let me know how to go about it.
GET index_name/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "program"
}
},
{
"match_phrase": {
"field_value": "aaa-123"
}
}
]
}
},
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "species"
}
},
{
"match_phrase": {
"field_value": "mouse"
}
}
]
}
},
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "model name"
}
},
{
"match_phrase": {
"field_value": "b45"
}
}
]
}
}
]
}
},
"aggs": {
"2": {
"terms": {
"field": "myfile_file_name.keyword",
"size": 1000,
"order": {
"_key": "asc"
}
},
"aggs": {
"3": {
"terms": {
"field": "field_name.keyword",
"size": 1000,
"order": {
"_key": "asc"
}
}
}
}
}
}
}
mapping and Output
{
"_index" : "test",
"_type" : "test_data",
"_id" : "123",
"_score" : 1.0,
"_source" : {
"document_id" : 123,
"m_id" : 1,
"source" : "ADDD",
"type" : "M",
"name" : "Animal",
"value" : "None",
"test_type" : "Test123",
"file_name" : "AA.zip",
"description" : "testing",
"program" : ["hello"],
"species" : ["mouse"],
"study" : ["Study1"],
"create_date" : "2020-08-20 11:51:21.152",
"update_date" : "2020-08-20 11:51:21.152",
"source_name" : "Anim",
"auth" : ["na"],
"treatment" : ["TR001", "TR002", "TR004"],
"timepoint" : ["72", "48"],
"findings_reports" : "na",
"model" : ["None"],
"additional" : "{'view': '', 'load': []}",
"data" : "Pre"
}
},
]
}
}
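As for building the body dynamically, a minimal Python sketch could generate one bool/must pair per field name and value. It reproduces the shape of the query above and nothing more; `build_pair_query` and its parameters are hypothetical names, and the field names are taken from the query as posted.

```python
import json


def build_pair_query(pairs, file_field="myfile_file_name.keyword", size=1000):
    """Build the should-of-musts query plus the two-level terms aggregation."""
    should = [
        {"bool": {"must": [
            {"match_phrase": {"field_name": name}},
            {"match_phrase": {"field_value": value}},
        ]}}
        for name, value in pairs
    ]
    order = {"_key": "asc"}
    return {
        "query": {"bool": {"should": should}},
        "aggs": {"2": {
            "terms": {"field": file_field, "size": size, "order": order},
            "aggs": {"3": {
                "terms": {"field": "field_name.keyword",
                          "size": size, "order": order},
            }},
        }},
    }


if __name__ == "__main__":
    body = build_pair_query([("program", "aaa-123"),
                             ("species", "mouse"),
                             ("model name", "b45")])
    print(json.dumps(body, indent=2))
```

Each (name, value) pair becomes its own should clause, so adding a condition is just one more tuple in the list.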

Using named queries (matched_queries) for nested types in Elasticsearch?

Using named queries, I can get a list of the matched_queries for boolean expressions such as:
(query1) AND (query2 OR query3 OR true)
Here is an example of using named queries to match on top-level document fields:
DELETE test
PUT /test
PUT /test/_mapping/_doc
{
"properties": {
"name": {
"type": "text"
},
"type": {
"type": "text"
},
"TAGS": {
"type": "nested"
}
}
}
POST /test/_doc
{
"name" : "doc1",
"type": "msword",
"TAGS" : [
{
"ID" : "tag1",
"TYPE" : "BASIC"
},
{
"ID" : "tag2",
"TYPE" : "BASIC"
},
{
"ID" : "tag3",
"TYPE" : "BASIC"
}
]
}
# (query1) AND (query2 or query3 or true)
GET /test/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": {
"query": "doc1",
"_name": "query1"
}
}
}
],
"should": [
{
"match": {
"type": {
"query": "msword",
"_name": "query2"
}
}
},
{
"exists": {
"field": "type",
"_name": "query3"
}
}
]
}
}
}
The above query correctly returns all three matched_queries in the response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.5753641,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "TKNJ9G4BbvPS27u-ZYux",
"_score" : 1.5753641,
"_source" : {
"name" : "doc1",
"type" : "msword",
"TAGS" : [
{
"ID" : "ds1",
"TYPE" : "BASIC"
},
{
"ID" : "wb1",
"TYPE" : "BASIC"
}
]
},
"matched_queries" : [
"query1",
"query2",
"query3"
]
}
]
}
}
However, I'm trying to run a similar search:
(query1) AND (query2 OR query3 OR true)
only this time on the nested TAGS object rather than top-level document fields.
I've tried the following query, but the problem is I need to supply the inner_hits object for nested objects in order to get the matched_queries in the response, and I can only add it to one of the three queries.
GET /test/_search
{
"query": {
"bool": {
"must": {
"nested": {
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag1",
"_name": "tag1-query"
}
}
},
// "inner_hits" : {}
}
},
"should": [
{
"nested": {
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag2",
"_name": "tag2-query"
}
}
},
// "inner_hits" : {}
}
},
{
"nested": {
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag3",
"_name": "tag3-query"
}
}
},
// "inner_hits" : {}
}
}
]
}
}
}
Elasticsearch will complain if I add more than one 'inner_hits'. I've commented out the places above where I can add it, but each of these will only return the single matched query.
I want my response to this query to return:
"matched_queries" : [
"tag1-query",
"tag2-query",
"tag3-query"
]
Any help is much appreciated, thanks!
A colleague helpfully provided a solution to this: move the _name parameter so that it sits directly under each nested section:
GET /test/_search
{
"query": {
"bool": {
"must": {
"nested": {
"_name": "tag1-query",
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag1"
}
}
}
}
},
"should": [
{
"nested": {
"_name": "tag2-query",
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag2"
}
}
}
}
},
{
"nested": {
"_name": "tag3-query",
"path": "TAGS",
"query": {
"match": {
"TAGS.ID": {
"query": "tag3"
}
}
}
}
}
]
}
}
}
This correctly returns all three tags now in the matched_queries response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 2.9424875,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "TaNy9G4BbvPS27u--oto",
"_score" : 2.9424875,
"_source" : {
"name" : "doc1",
"type" : "msword",
"TAGS" : [
{
"ID" : "ds1",
"TYPE" : "DATASOURCE"
},
{
"ID" : "wb1",
"TYPE" : "WORKBOOK"
},
{
"ID" : "wb2",
"TYPE" : "WORKBOOK"
}
]
},
"matched_queries" : [
"tag1-query",
"tag2-query",
"tag3-query"
]
}
]
}
}
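If the tag list varies, the same pattern can be generated. Here is a minimal Python sketch that puts `_name` at the nested level for every tag; the helper names are made up for illustration, and `must` is emitted as a list rather than a single object, which the bool query also accepts.

```python
import json


def named_nested_match(path, field, value, name):
    """Nested match clause with _name on the nested query itself."""
    return {"nested": {
        "_name": name,
        "path": path,
        "query": {"match": {f"{path}.{field}": {"query": value}}},
    }}


def build_tag_query(required, optional):
    """(required tags) AND (optional tags), each reported via matched_queries."""
    return {"query": {"bool": {
        "must": [named_nested_match("TAGS", "ID", t, f"{t}-query")
                 for t in required],
        "should": [named_nested_match("TAGS", "ID", t, f"{t}-query")
                   for t in optional],
    }}}


if __name__ == "__main__":
    print(json.dumps(build_tag_query(["tag1"], ["tag2", "tag3"]), indent=2))
```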

Elasticsearch unable retrieve child documents

I recently migrated from Elasticsearch 2.4 to 6.2.1 and my previous GET query no longer works. Below is the query I use to retrieve the child document based on its _id and _parent values. Do I have to change the implementation to retrieve the documents from ES?
{
"query": {
"bool": {
"must": [
{
"term": {
"_id": {
"value": "9:v0",
"boost": 1
}
}
},
{
"term": {
"_parent": {
"value": "v0",
"boost": 1
}
}
},
{
"terms": {
"assoc.domainId": [
"XX"
],
"boost": 1
}
},
{
"terms": {
"assoc.nodeId": [
"YY"
],
"boost": 1
}
}
],
"adjust_pure_negative": false,
"boost": 1
}
}
}
parent document in ES:
{
"_index" : "test",
"_type" : "assocjoin",
"_id" : "v0",
"_score" : 1.0,
"_source" : {
"my_join_field" : {
"name" : "version"
},
"versionnumber" : "v0",
"versiondate" : "2018/03/29 13:25:02"
}
}
Child document in ES:
{
"_index" : "test",
"_type" : "versionjoin",
"_id" : "9:v0",
"_score" : 0.18232156,
"_routing" : "v0",
"_source" : {
"id" : 0,
"assocDTO" : {
"id" : 9,
"domainId" : "XX",
"nodeId" : "YY"
},
"biomarkers" : [
{
....
}
],
"contexts" : [
{
....
}
]
},
"my_join_field" : {
"name" : "assocversion",
"parent" : "v0"
}
}
}
]
}
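For what it's worth, from 6.x on the `_parent` metadata field is gone and parent/child relations live in a join field (here `my_join_field`), so a term query on `_parent` has nothing to match. A minimal sketch, assuming the child relation is named `assocversion` as in the document above and that a `parent_id` query suits the use case; the helper name is hypothetical, and the `assoc.*` terms clauses are left out because the field paths in the query and the document differ.

```python
import json


def build_child_lookup(child_id, parent_id, relation="assocversion"):
    """Find a child document by its own _id and the _id of its parent."""
    return {"query": {"bool": {"must": [
        {"term": {"_id": {"value": child_id}}},
        # parent_id takes the child relation name and the parent's _id.
        {"parent_id": {"type": relation, "id": parent_id}},
    ]}}}


if __name__ == "__main__":
    print(json.dumps(build_child_lookup("9:v0", "v0"), indent=2))
```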
