Elasticsearch filter by multiple fields does not return document

I have this document:
{
"_index" : "logs",
"_id" : "e174f29c-9f0b-4aab-a3b3-7ab5dcb8a50a",
"_score" : null,
"_source" : {
"number" : 1,
"request_type" : 1,
"request_entity_type" : 1,
"entity_type" : 1,
"entity_id" : "6c125004-4720-4258-a5d6-3fa1c7468bc8",
"field_name" : "name",
"old_value" : null,
"new_value" : """[{"locale":"ru-RU","text_value":"1234"}]""",
"created_by" : "b6aa1f8f-79b8-45b6-a11c-fe65b8bdfc35",
"created_at" : "2022-06-29T10:47:43.205753"
}
}
And when I try to get this document by entity_type and field_name fields, it works:
GET logs/_search
{
"query": {
"bool": {
"filter": [
{"term" : { "entity_type" : "1" }},
{"term": {"field_name": "name"}}
]
}
},
"sort": [
{
"number": {
"order": "desc"
}
}
]
}
But when I change field_name to entity_id I get zero hits:
GET logs/_search
{
"query": {
"bool": {
"filter": [
{"term" : { "entity_type" : "1" }},
{"term": {"entity_id": "6c125004-4720-4258-a5d6-3fa1c7468bc8"}}
]
}
},
"sort": [
{
"number": {
"order": "desc"
}
}
]
}
Why doesn't it work? What is the difference between field_name and entity_id?

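It looks like your entity_id field was created by Elasticsearch's default dynamic mapping, which indexes it as an analyzed text field and splits the value into separate tokens. A term query looks for the exact, unanalyzed value, so it finds nothing. Querying the entity_id.keyword sub-field instead should work: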
{
"query": {
"bool": {
"filter": [
{"term" : { "entity_type" : "1" }},
{"term": {"entity_id.keyword": "6c125004-4720-4258-a5d6-3fa1c7468bc8"}} // note `entity_id.keyword` as a field name.
]
}
},
"sort": [
{
"number": {
"order": "desc"
}
}
]
}
Note: If you don't define an explicit mapping, Elasticsearch's dynamic mapping indexes every string field as both a text field and a keyword sub-field, because it doesn't know your use case.
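To make the difference concrete, here is a minimal Python sketch that only approximates what happens to a text field at index time. It assumes the analyzer splits on non-alphanumeric characters and lowercases the tokens; the real standard analyzer follows Unicode word-boundary rules, and the function names below are hypothetical, not part of any Elasticsearch client.

```python
import re


def approx_standard_analyze(value: str) -> list[str]:
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", value) if tok]


def term_matches(indexed_tokens: list[str], query_value: str) -> bool:
    """A term query looks up the query value as-is among the indexed tokens."""
    return query_value in indexed_tokens


entity_id = "6c125004-4720-4258-a5d6-3fa1c7468bc8"

text_tokens = approx_standard_analyze(entity_id)  # what the text field would hold
keyword_tokens = [entity_id]                      # what the .keyword sub-field would hold

print(text_tokens)
print(term_matches(text_tokens, entity_id))       # the UUID is not one of the tokens
print(term_matches(keyword_tokens, entity_id))    # the whole value is kept as one token
print(term_matches(approx_standard_analyze("name"), "name"))  # single lowercase word survives
```

Under these assumptions the UUID is stored as five separate tokens, so the full value never matches, while a single lowercase word such as "name" is stored unchanged. That would explain why the field_name filter works and the entity_id filter does not.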

Related

Checking if a field exists for any and/or all nested objects

I took a look at ElasticSearch: search inside the array of objects and while it helps, I'm actually trying to determine if at least one has a field and if all nested objects have the field.
Pretending we have an index of all refrigerators with a superfluous document like:
{
"_id": "whatever",
"location": "North Building 1",
"floor": 2,
"tag": "refrigerator-1",
"contents" : [
{
"item": "milk-carton",
"expires": "2023-01-01"
},
{
"item": "pyrex-container"
}
]
}
How do I create an Elasticsearch query to:
Find any refrigerator that has at least 1 item that CAN expire ("exists": { "field": "expires" })
Find refrigerators that have no items that expire
Find refrigerators where all items have an expires field
If you want to do this in a single query, use named queries (the _name parameter). Each hit then lists the clauses it matched under matched_queries.
Query
{
"query": {
"bool": {
"should": [
{
"nested": {
"_name": "At least one expires",
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
},
{
"bool": {
"_name": "None expires",
"must_not": [
{
"nested": {
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
}
]
}
},
{
"bool": {
"_name": "All expires",
"must": [
{
"nested": {
"path": "contents",
"query": {
"exists": {
"field": "contents.expires"
}
}
}
}
],
"must_not": [
{
"nested": {
"path": "contents",
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "contents.expires"
}
}
]
}
}
}
}
]
}
}
]
}
}
}
Result
"hits" : [
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "Qt2PVoQB_m3FhzcGBasD",
"_score" : 2.0,
"_source" : {
"location" : "North Building 1",
"floor" : 3,
"tag" : "refrigerator-3",
"contents" : [
{
"item" : "milk-carton",
"expires" : "2023-01-01"
},
{
"item" : "pyrex-container",
"expires" : "2023-01-01"
}
]
},
"matched_queries" : [
"At least one expires",
"All expires"
]
},
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "QN2BVoQB_m3FhzcG9qsG",
"_score" : 1.0,
"_source" : {
"location" : "North Building 1",
"floor" : 2,
"tag" : "refrigerator-1",
"contents" : [
{
"item" : "milk-carton",
"expires" : "2023-01-01"
},
{
"item" : "pyrex-container"
}
]
},
"matched_queries" : [
"At least one expires"
]
},
{
"_index" : "index70",
"_type" : "_doc",
"_id" : "Qd2HVoQB_m3FhzcGUauO",
"_score" : 0.0,
"_source" : {
"location" : "North Building 1",
"floor" : 3,
"tag" : "refrigerator-2",
"contents" : [
{
"item" : "milk-carton"
},
{
"item" : "pyrex-container"
}
]
},
"matched_queries" : [
"None expires"
]
}
]
The query is self-explanatory. If you want to use a separate query for each of the three conditions, split the query above: each should clause becomes its own query.
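As a rough illustration of that split, the following Python sketch builds the three standalone query bodies as plain dictionaries. The helper names are hypothetical, and it assumes, as the query above does, that contents is mapped as a nested field. It only constructs request bodies and does not call Elasticsearch.

```python
import json


def nested_exists(path: str, field: str) -> dict:
    """Matches parents with at least one nested object that has the field."""
    return {"nested": {"path": path, "query": {"exists": {"field": f"{path}.{field}"}}}}


def nested_missing(path: str, field: str) -> dict:
    """Matches parents with at least one nested object that lacks the field."""
    missing = {"bool": {"must_not": [{"exists": {"field": f"{path}.{field}"}}]}}
    return {"nested": {"path": path, "query": missing}}


def at_least_one(path: str, field: str) -> dict:
    return {"query": nested_exists(path, field)}


def none_have(path: str, field: str) -> dict:
    return {"query": {"bool": {"must_not": [nested_exists(path, field)]}}}


def all_have(path: str, field: str) -> dict:
    # At least one object has the field, and no object is missing it.
    return {
        "query": {
            "bool": {
                "must": [nested_exists(path, field)],
                "must_not": [nested_missing(path, field)],
            }
        }
    }


queries = {
    "at_least_one": at_least_one("contents", "expires"),
    "none": none_have("contents", "expires"),
    "all": all_have("contents", "expires"),
}
print(json.dumps(queries["all"], indent=2))
```

Each dictionary can be sent as the body of its own _search request. The _name labels are dropped here because a standalone query no longer needs to report which condition matched.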

elasticsearch filter nested object

I have an index with a nested object containing two attributes, scopeId and categoryName. The relevant part of the index mapping is:
"mappedCategories" : {
"type" : "nested",
"properties": {
"scopeId": {"type":"long"},
"categoryName": {"type":"text",
"analyzer" : "productSearchAnalyzer",
"search_analyzer" : "productSearchQueryAnalyzer"}
}
}
A sample document containing the nested mappedCategories object is as follows:
POST productsearchna_2/_doc/1
{
"categoryName" : "Operating Systems",
"contexts" : [
0
],
"countryCode" : "US",
"id" : "10076327-1",
"languageCode" : "EN",
"localeId" : 1,
"mfgpartno" : "test123",
"manufacturerName" : "Hewlett Packard Enterprise",
"productDescription" : "HPE Microsoft Windows 2000 Datacenter Server - Complete Product - Complete Product - 1 Server - Standard",
"productId" : 10076327,
"skus" : [
{"sku": "43233004",
"skuName": "UNSPSC"},
{"sku": "43233049",
"skuName": "SP Richards"},
{"sku": "43234949",
"skuName": "Ingram Micro"}
],
"mappedCategories" : [
{"scopeId": 3228552,
"categoryName": "Laminate Bookcases"},
{"scopeId": 3228553,
"categoryName": "Bookcases"},
{"scopeId": 3228554,
"categoryName": "Laptop"}
]
}
I want to filter categoryName "lap" on scopeId: 3228553 i.e. my query should return 0 hits since Laptop is mapped to scopeId 3228554. But my following query is returning 1 hit with scopeId : 3228554
POST productsearchna_2/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.categoryName": "lap"
}
},
"inner_hits": {}
}
}
],
"filter": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.scopeId": {
"value": 3228552
}
}
}
}
}
]
}
},
"_source": ["mappedCategories.categoryName", "productId"]
}
Following is part of the result of the query:
"inner_hits" : {
"mappedCategories" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.5586993,
"hits" : [
{
"_index" : "productsearchna_2",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "mappedCategories",
"offset" : 2
},
"_score" : 1.5586993,
"_source" : {
"scopeId" : 3228554,
"categoryName" : "Laptop"
}
}
]
}
}
I want my query to return zero hits, and in case I search for "book" with scopeId: 3228552, I want my query to return 2 hits, 1 for Bookcases and another for Laminate Bookcases categoryNames. Please help.
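This query solves part of the problem by putting both conditions inside the same nested clause, so they must hold for the same nested object. However, when searching for "book" with scopeId: 3228552, it will only return 1 result.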
GET idx_test/_search?filter_path=hits.hits.inner_hits
{
"query": {
"nested": {
"path": "mappedCategories",
"query": {
"bool": {
"filter": [
{
"term": {
"mappedCategories.scopeId": {
"value": 3228553
}
}
}
],
"must": [
{
"match": {
"mappedCategories.categoryName": "laptop"
}
}
]
}
},
"inner_hits": {}
}
}
}
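The difference between the two query shapes can be modelled in a few lines of plain Python. This is only a sketch of the matching semantics, not Elasticsearch code: the function names are hypothetical, and name_matches is an assumed stand-in (case-insensitive word-prefix match) for the custom productSearchAnalyzer, whose definition is not shown.

```python
mapped_categories = [
    {"scopeId": 3228552, "categoryName": "Laminate Bookcases"},
    {"scopeId": 3228553, "categoryName": "Bookcases"},
    {"scopeId": 3228554, "categoryName": "Laptop"},
]


def name_matches(category_name: str, text: str) -> bool:
    """Assumed stand-in for the analyzer: case-insensitive prefix match on words."""
    return any(word.startswith(text.lower()) for word in category_name.lower().split())


def separate_nested_clauses(categories: list[dict], text: str, scope_id: int) -> bool:
    """Two independent nested clauses: each may be satisfied by a different object."""
    name_ok = any(name_matches(c["categoryName"], text) for c in categories)
    scope_ok = any(c["scopeId"] == scope_id for c in categories)
    return name_ok and scope_ok


def single_nested_clause(categories: list[dict], text: str, scope_id: int) -> list[dict]:
    """One nested clause: both conditions must hold for the same object (like inner_hits)."""
    return [
        c for c in categories
        if c["scopeId"] == scope_id and name_matches(c["categoryName"], text)
    ]


print(separate_nested_clauses(mapped_categories, "lap", 3228552))
print(single_nested_clause(mapped_categories, "lap", 3228552))
print(single_nested_clause(mapped_categories, "book", 3228552))
```

With the sample document, the two separate clauses are satisfied by different objects (Laptop for the name, Laminate Bookcases for the scope), so the parent matches. The single clause finds no object that satisfies both, and for "book" with scopeId 3228552 it finds only Laminate Bookcases, because Bookcases belongs to a different scopeId.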

Nested Query Elastic Search

Currently I am trying to search/filter a nested Document in Elastic Search Spring Data.
The Current Document Structure is:
{
"id": 1,
"customername": "Cust#123",
"policydetails": {
"address": {
"city": "Irvine",
"state": "CA",
"address2": "23994384, Out OF World",
"post_code": "92617"
},
"policy_data": [
{
"id": 1,
"status": true,
"issue": "Variation Issue"
},
{
"id": 32,
"status": false,
"issue": "NoiseIssue"
}
]
}
}
Now we need to filter policy_data down to the entries that have Noise Issue. If no policy_data entry has Noise Issue, policy_data should be null inside the parent document.
I have tried to use this Query
{
"query": {
"bool": {
"must": [
{
"match": {
"customername": "Cust#345"
}
},
{
"nested": {
"path": "policiesDetails.policy_data",
"query": {
"bool": {
"must": {
"terms": {
"policiesDetails.policy_data.issue": [
"Noise Issue"
]
}
}
}
}
}
}
]
}
}
}
This works fine for filtering the nested documents. But if no nested document matches, the entire parent document is removed from the results.
What I want when the nested filter does not match is:
{
"id": 1,
"customername": "Cust#123",
"policydetails": {
"address": {
"city": "Irvine",
"state": "CA",
"address2": "23994384, Out OF World",
"post_code": "92617"
},
"policy_data": null
}
}
With the nested query inside must, the parent document is not returned when no nested document matches.
You can put the policy_data condition in a should clause instead. If a nested document matches, it is returned under inner_hits; otherwise the parent document is still returned, with empty inner_hits.
{
"query": {
"bool": {
"must": [
{
"match": {
"customername": "Cust#345"
}
}
],
"should": [
{
"nested": {
"path": "policydetails.policy_data",
"inner_hits": {}, --> to return matched policy_data
"query": {
"bool": {
"must": {
"terms": {
"policydetails.policy_data.issue": [
"Noise Issue"
]
}
}
}
}
}
}
]
}
},
"_source": ["id","customername","policydetails.address"] --> selected fields
}
Result:
{
"_index" : "index116",
"_type" : "_doc",
"_id" : "f1SxGHoB5tcHqHDtAkTC",
"_score" : 0.2876821,
"_source" : {
"policydetails" : {
"address" : {
"city" : "Irvine",
"address2" : "23994384, Out OF World",
"post_code" : "92617",
"state" : "CA"
}
},
"id" : 1,
"customername" : "Cust#123"
},
"inner_hits" : {
"policydetails.policy_data" : {
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ] --> nested query result , matched document returned
}
}
}
}
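The effect of moving the nested clause from must to should can be sketched in plain Python. This models only the bool semantics (when a must clause is present, should clauses are optional by default); the search function and its parameters are hypothetical, and the customername match is simplified to a substring check.

```python
docs = [
    {
        "id": 1,
        "customername": "Cust#123",
        "policydetails": {
            "policy_data": [
                {"id": 1, "status": True, "issue": "Variation Issue"},
                {"id": 32, "status": False, "issue": "NoiseIssue"},
            ]
        },
    }
]


def search(docs: list[dict], customer_token: str, issue: str, nested_in: str) -> list[dict]:
    """nested_in is 'must' (required) or 'should' (optional next to a must clause)."""
    results = []
    for doc in docs:
        # Parent-level must clause, simplified to a case-insensitive substring check.
        if customer_token.lower() not in doc["customername"].lower():
            continue
        inner_hits = [p for p in doc["policydetails"]["policy_data"] if p["issue"] == issue]
        # A required nested clause drops the parent when nothing matches.
        if nested_in == "must" and not inner_hits:
            continue
        results.append({"id": doc["id"], "inner_hits": inner_hits})
    return results


print(search(docs, "Cust", "Noise Issue", nested_in="must"))
print(search(docs, "Cust", "Noise Issue", nested_in="should"))
print(search(docs, "Cust", "NoiseIssue", nested_in="should"))
```

In this model the must variant returns nothing for "Noise Issue", because the sample stores the value as "NoiseIssue". The should variant keeps the parent with an empty inner_hits list, which corresponds to the total of 0 in the result above.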

Unable to retrieve nested object within Elastic Search

An ELK noob here, having had the ELK task dropped on me at the last minute.
We are adding extra data named prospects to the vehicle index so that users can search for it. I'm able to add the prospects to the index, but I'm unable to query the nested prospects object within the vehicle index. I'm using Elasticsearch & Kibana v6.8.11 and the elasticsearch-rails gem, and I have checked the docs on nested objects. My search method looks correct according to the docs. I would like an expert to point out what went wrong here; please let me know if you need more info.
Here is what the indexed object is supposed to look like -
{
"_index" : "vehicles",
"_type" : "_doc",
"_id" : "3MZBxxxxxxx",
"_score" : 0.0,
"_source" : {
"vin" : "3MZBxxxxxxx",
"make" : "mazda",
"model" : "mazda3",
"color" : "unknown",
"year" : 2018,
"vehicle" : "2018 mazda mazda3",
"trim" : "grand touring",
"estimated_mileage" : null,
"dealership" : [
209
],
"current_owner_group_id" : null,
"current_owner_customer_id" : null,
"last_service_date" : null,
"last_service_revenue" : null,
"purchase_type" : [ ],
"in_service_date" : null,
"deal_headers" : [ ],
"services" : [ ],
"customers" : [ ],
"salesmen" : null,
"service_appointments" : [ ],
"prospects" : [
{
"first_name" : "Kammy",
"last_name" : "Maytag",
"name" : "Kammy Maytag",
"company_name" : null,
"emails" : [ ],
"phone_numbers" : [ ],
"address" : "31119 field",
"city" : "helen",
"state" : "keller",
"zip" : "81411",
"within_dealership_aoi_region" : true,
"dealership_ids" : [
209
],
"dealership_dppa_protected_ids" : [
209
],
"registration_id" : 12344,
"id" : 1054,
"prospect_source_id" : "12344",
"type" : "Prospect"
}
]
}
}
Here is how I'm trying to get it -
GET /vehicles/_search
{
"query": {
"bool": {
"must": { "match_all": {} },
"filter": [
{ "term": { "dealership": "209" } },
{
"nested": {
"path": "prospects",
"query": {
"bool": {
"must": [
{ "term": { "prospects.first_name": "Kammy" } },
{ "term": { "prospects.dealership": "209" } },
{ "term": { "prospects.type": "Prospect" } }
]
}
}
}
},
{ "bool": { "must_not": { "term": { "purchase_type": "Wholesale" } } } }
]
}
},
"sort": [{ "_doc": { "order": "asc" } }]
}
I see two issues with the nested query:
You're querying prospects.dealership but the example doc only shows prospects.dealership_ids. Change query to target prospects.dealership_ids.
More importantly, you're using a term query on prospects.first_name and prospects.type. I'm assuming your index mapping doesn't define those as keywords, which means the values were most likely lowercased at index time (the default standard analyzer lowercases text fields), whereas term looks for exact matches.
Option 1: Use match instead of term.
Option 2: Change prospects.first_name → prospects.first_name.keyword and do the same for .type.
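Putting the first fix and Option 2 together, the request body could be built along these lines. This is a sketch under assumptions: it presumes prospects is mapped as nested and that dynamic mapping created .keyword sub-fields for first_name and type, neither of which is confirmed by the question. The function name is hypothetical, and the code only builds the body without sending it.

```python
import json


def build_vehicle_query(dealership_id: int, first_name: str) -> dict:
    """Build the search body with the corrected nested clause."""
    nested_prospects = {
        "nested": {
            "path": "prospects",
            "query": {
                "bool": {
                    "must": [
                        # Assumed .keyword sub-fields keep the original casing.
                        {"term": {"prospects.first_name.keyword": first_name}},
                        # The sample document has dealership_ids, not dealership.
                        {"term": {"prospects.dealership_ids": dealership_id}},
                        {"term": {"prospects.type.keyword": "Prospect"}},
                    ]
                }
            },
        }
    }
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": [
                    {"term": {"dealership": dealership_id}},
                    nested_prospects,
                    {"bool": {"must_not": {"term": {"purchase_type": "Wholesale"}}}},
                ],
            }
        },
        "sort": [{"_doc": {"order": "asc"}}],
    }


vehicle_query = build_vehicle_query(209, "Kammy")
print(json.dumps(vehicle_query, indent=2))
```

If the .keyword sub-fields do not exist in your mapping, Option 1 (replacing those two term clauses with match) is the safer choice.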

Documents repeating in the query of elasticsearch

I'm new to Elasticsearch. I need to build the query dynamically, so that for each field name the corresponding file is fetched.
I have the query below. Can anyone say whether it's the right approach? Also, with this query the documents just repeat for one particular file name.
Please let me know how to go about it.
GET index_name/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "program"
}
},
{
"match_phrase": {
"field_value": "aaa-123"
}
}
]
}
},
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "species"
}
},
{
"match_phrase": {
"field_value": "mouse"
}
}
]
}
},
{
"bool": {
"must": [
{
"match_phrase": {
"field_name": "model name"
}
},
{
"match_phrase": {
"field_value": "b45"
}
}
]
}
}
]
}
},
"aggs": {
"2": {
"terms": {
"field": "myfile_file_name.keyword",
"size": 1000,
"order": {
"_key": "asc"
}
},
"aggs": {
"3": {
"terms": {
"field": "field_name.keyword",
"size": 1000,
"order": {
"_key": "asc"
}
}
}
}
}
}
}
Mapping and output:
{
"_index" : "test",
"_type" : "test_data",
"_id" : "123",
"_score" : 1.0,
"_source" : {
"document_id" : 123,
"m_id" : 1,
"source" : "ADDD",
"type" : "M",
"name" : "Animal",
"value" : "None",
"test_type" : "Test123",
"file_name" : "AA.zip",
"description" : "testing",
"program" : ["hello"],
"species" : ["mouse"],
"study" : ["Study1"],
"create_date" : "2020-08-20 11:51:21.152",
"update_date" : "2020-08-20 11:51:21.152",
"source_name" : "Anim",
"auth" : ["na"],
"treatment" : ["TR001", "TR002", "TR004"],
"timepoint" : ["72", "48"],
"findings_reports" : "na",
"model" : ["None"],
"additional" : "{'view': '', 'load': []}",
"data" : "Pre"
}
}
