How to run a multi_match search in Elasticsearch on a root object and a nested array of objects? - elasticsearch

I'm using Elasticsearch v7 and have the mapping shown below.
items is a nested array of objects.
My problem is that when I search the items fields with multi_match, it does not work as I expect: the result is empty. But when I search with a nested query and a bool clause, it finds my document.
I don't fully understand the difference. My understanding is that the bool query does exact matching and is used for filtering and aggregating data, while multi_match is for full-text search and autocomplete. Is that right?
And how do I find documents by searching both root fields and nested fields?
{
"orders" : {
"aliases" : { },
"mappings" : {
"properties" : {
"amazonOrderId" : {
"type" : "keyword"
},
"carrierCode" : {
"type" : "text"
},
"carrierName" : {
"type" : "text"
},
"id" : {
"type" : "keyword"
},
"items" : {
"type" : "nested",
"properties" : {
"amazonItemId" : {
"type" : "keyword"
},
"amazonPrice" : {
"type" : "integer"
},
"amazonQuantity" : {
"type" : "integer"
},
"amazonSku" : {
"type" : "keyword"
},
"graingerItem" : {
"type" : "nested"
},
"graingerOrderId" : {
"type" : "keyword"
},
"graingerPrice" : {
"type" : "integer"
},
"graingerShipDate" : {
"type" : "date"
},
"graingerShipMethod" : {
"type" : "short"
},
"graingerTrackingNumber" : {
"type" : "keyword"
},
"graingerWebNumber" : {
"type" : "keyword"
},
"id" : {
"type" : "keyword"
}
}
}
}
}
}
}
multi_match request
GET orders/_search
{
"query":{
"multi_match" : {
"query": "4.48 - 1 pack - 4.48",
"fields": [
"items.amazonSku",
"carrierCode",
"recipientName"
]
}
}
}
Debugging with the _explain API returns this explanation:
"explanation" : {
"value" : 0.0,
"description" : "Failure to meet condition(s) of required/prohibited clause(s)",
"details" : [
{
"value" : 0.0,
"description" : "no match on required clause (items.amazonSku:4.48 - 1 pack - 4.48)",
"details" : [
{
"value" : 0.0,
"description" : "no matching term",
"details" : [ ]
}
]
},
{
"value" : 0.0,
"description" : "match on required clause, product of:",
"details" : [
{
"value" : 0.0,
"description" : "# clause",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "DocValuesFieldExistsQuery [field=_primary_term]",
"details" : [ ]
}
]
}
]
}
Query search
GET orders/_search
{
"query": {
"nested": {
"path": "items",
"query": {
"bool": {
"must": [
{ "match": { "items.amazonSku": "4.48 - 1 pack - 4.48"}}
]
}
}
}
}
}

Because items is a nested field, you need to wrap the query in a nested query with path set to items so that it searches the nested objects.
Modify your search as follows:
{
"query": {
"nested": {
"path": "items",
"query": {
"multi_match": {
"query": "4.48 - 1 pack - 4.48",
"fields": [
"items.amazonSku"
]
}
}
}
}
}
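To search root fields and nested fields in one request, one common approach is to combine a plain multi_match with a nested multi_match inside a bool should clause. The sketch below builds such a request body in Python. The helper name root_and_nested_query is hypothetical, and the root fields are taken from the mapping in the question. Note that amazonSku is a keyword field, so it only matches when the whole value is equal.

```python
import json


def root_and_nested_query(text, root_fields, nested_path, nested_fields):
    """Build a search body that matches `text` in root fields OR nested fields.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Root-level fields can be searched directly.
                    {"multi_match": {"query": text, "fields": root_fields}},
                    # Nested fields must be wrapped in a nested query.
                    {
                        "nested": {
                            "path": nested_path,
                            "query": {
                                "multi_match": {
                                    "query": text,
                                    "fields": nested_fields,
                                }
                            },
                        }
                    },
                ],
                # At least one of the two clauses has to match.
                "minimum_should_match": 1,
            }
        }
    }


body = root_and_nested_query(
    "4.48 - 1 pack - 4.48",
    ["carrierCode", "carrierName"],
    "items",
    ["items.amazonSku"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET orders/_search.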

Related

How to split object (nested) into multiple columns in Elasticsearch / Kibana data table visualization

I have a nested object indexed in Elasticsearch (7.10) and I need to visualize it in a Kibana table. The problem is that Kibana puts all values from nested fields that share the same name into one column.
Part of the index:
{
"index" : {
"mappings" : {
"properties" : {
"data1" : {
"type" : "keyword"
},
"Details" : {
"type" : "nested",
"properties" : {
"Amount" : {
"type" : "float"
},
"Currency" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"DetailType" : {
"type" : "keyword"
},
"Price" : {
"type" : "float"
},
"Quantity" : {
"type" : "float"
},
"TotalAmount" : {
"type" : "float"
.......
The problem in the table:
How can I get three rows named Details, each with one split term (e.g. DetailType: "start_fee")?
Update:
I could query the nested object in the console:
GET _search
{
"query": {
"nested": {
"path": "Details",
"query": {
"bool": {
"must": [
{ "match": { "Details.DetailType": "energybased_fee" }}
]
}
},
"inner_hits": {
}
}}}
But how can I visualize in the table only the "inner_hits" value?

Elasticsearch `function_score` with `score_mode` confusion when used with nested objects

Background:
I have the following mapping for curriculum_posts documents. Notice the nested skills property.
{
"curriculum_posts" : {
"mappings" : {
"dynamic" : "false",
"properties" : {
"title" : {
"type" : "text",
"analyzer" : "english"
},
"skills" : {
"type" : "nested",
"properties" : {
"slug" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
},
"text" : {
"type" : "text"
}
}
},
"start_skill_level" : {
"type" : "keyword"
},
"start_skill_level_value" : {
"type" : "integer"
}
}
}
}
}
}
}
A sample record looks like this:
{
"_source" : {
"skills" : [
{
"start_skill_level_value" : 1,
"slug" : "infrastructure-as-code-iac"
},
{
"start_skill_level_value" : 1,
"slug" : "devops"
}
],
"title" : "Terraform: Infrastructure as code"
}
}
I wanted to run a query that returns all documents, with each score reflecting the number of skills.slug values that matched. My query looks like this:
{
"query": {
"nested": {
"path": "skills",
"query": {
"function_score": {
"query": { "match_all": {} },
"functions": [
{ "script_score": { "script": "0" } },
{
"filter": {
"term": { "skills.slug.raw": { "value": "devops" } }
},
"weight": 2
},
{
"filter": {
"term": { "skills.slug.raw": { "value": "infrastructure-as-code-iac" } }
},
"weight": 2
}
],
"score_mode": "sum",
"boost_mode": "replace"
}
}
}
}
}
I decided to use function_score with boost_mode: replace so that the query scores are ignored and only the function scores are used. I set score_mode: sum so that the scores from the matching functions are added up.
The problem
For the above query and the example document, I was expecting a score of 4.0 because the document matches skills.slug for both infrastructure-as-code-iac and devops. However, the score in the result is only 2.0.
Question
I suppose I'm not understanding how function_score combines the scores from the functions, or how my functions are affecting the score. Could someone help me understand the scoring here?
Some debugging
I looked at the explanation but I'm unable to decode much information from it. Nevertheless, here is the explanation:
{
"_index" : "curriculum_posts",
"_type" : "_doc",
"_id" : "18",
"matched" : true,
"explanation" : {
"value" : 2.0,
"description" : "Score based on 2 child docs in range from 83 to 93, best match:",
"details" : [
{
"value" : 2.0,
"description" : "sum of:",
"details" : [
{
"value" : 2.0,
"description" : "min of:",
"details" : [
{
"value" : 2.0,
"description" : "function score, score mode [sum]",
"details" : [
{
"value" : 0.0,
"description" : "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='0', options={}, params={}}\"",
"details" : [
{
"value" : 1.0,
"description" : "_score: ",
"details" : [
{
"value" : 1.0,
"description" : "*:*",
"details" : [ ]
}
]
}
]
},
{
"value" : 2.0,
"description" : "function score, product of:",
"details" : [
{
"value" : 1.0,
"description" : "match filter: skills.slug.raw:infrastructure-as-code-iac",
"details" : [ ]
},
{
"value" : 2.0,
"description" : "product of:",
"details" : [
{
"value" : 1.0,
"description" : "constant score 1.0 - no function provided",
"details" : [ ]
},
{
"value" : 2.0,
"description" : "weight",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 3.4028235E38,
"description" : "maxBoost",
"details" : [ ]
}
]
},
{
"value" : 0.0,
"description" : "match on required clause, product of:",
"details" : [
{
"value" : 0.0,
"description" : "# clause",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "_type:__skills",
"details" : [ ]
}
]
}
]
}
]
}
}
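One possible reading of the explanation above, offered as an assumption and not confirmed in this thread: the function_score runs once per nested skills object, and each object matches only one of the two filters, so each child scores 0 + 2 = 2. The nested query then combines the child scores with its own score_mode, which defaults to avg, giving (2 + 2) / 2 = 2. The sketch below only reproduces that arithmetic in plain Python; child_score and parent_score are hypothetical names, not Elasticsearch APIs.

```python
def child_score(slug, weights):
    """Score of one nested object under function_score with score_mode=sum.

    The script_score function contributes 0; every filter whose slug matches
    contributes its weight. boost_mode=replace discards the query score.
    """
    return 0.0 + sum(w for s, w in weights.items() if s == slug)


def parent_score(child_scores, score_mode="avg"):
    """Combine child scores the way the nested query's score_mode would."""
    if score_mode == "sum":
        return sum(child_scores)
    if score_mode == "max":
        return max(child_scores)
    if score_mode == "min":
        return min(child_scores)
    return sum(child_scores) / len(child_scores)  # "avg", the default


weights = {"devops": 2.0, "infrastructure-as-code-iac": 2.0}
slugs = ["infrastructure-as-code-iac", "devops"]
scores = [child_score(s, weights) for s in slugs]

print(scores)
print(parent_score(scores, "avg"))
print(parent_score(scores, "sum"))
```

Under this reading, adding "score_mode": "sum" to the nested query itself (next to "path") would add the per-child scores together instead of averaging them.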

Elasticsearch multi_match + nested search

I am trying to execute a multi_match + nested search in Elasticsearch 6.4. I have the following mappings:
"name" : {
"type" : "text"
},
"status" : {
"type" : "short"
},
"user" : {
"type" : "nested",
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"type" : "nested",
"properties" : {
"location" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
And this is the html_strip analyzer:
"html_strip" : {
"filter" : [
"lowercase",
"stop",
"snowball"
],
"char_filter" : [
"html_strip"
],
"type" : "custom",
"tokenizer" : "standard"
}
And my current query is this one:
"query": {
"bool": {
"must": {
"multi_match": {
"query": 'Paris',
"fields": ['name', 'user.profile.location.name']
},
},
"filter": {
"term": {
"status": 1
}
}
}
}
Searching for "Paris" in user.profile.location.name this way does not work. I tried to adapt my code following this answer https://stackoverflow.com/a/48836012/12007123 but without success.
What I am trying to achieve is to search for a value in multiple fields, which may or may not be nested.
I also checked this discussion https://discuss.elastic.co/t/multi-match-query-string-with-nested-and-non-nested-fields/118652/5 but nothing I tried worked.
If I search only in name, the search works fine.
Any tips on how to achieve this the right way would be much appreciated.
EDIT:
While I didn't get an answer to my initial question, I followed Nikolay's (@nikolay-vasiliev) comment and changed the mappings to object instead of nested.
At least now I am able to search in user.profile.location.name. This is what the new mapping for user looks like:
"user" : {
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"properties" : {
"location" : {
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
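For the original nested mapping (before the edit), a sketch of one way to search name and the deeply nested location name together is shown below. It wraps the inner match in one nested query per nesting level, following the multi-level pattern, and keeps the status filter. The helper names wrap_nested and name_or_location_query are hypothetical, and the query has not been run against the asker's index.

```python
def wrap_nested(paths, inner_query):
    """Wrap `inner_query` in one nested query per path, outermost path first."""
    query = inner_query
    for path in reversed(paths):
        query = {"nested": {"path": path, "query": query}}
    return query


def name_or_location_query(text, status):
    """Match `text` in the root name field OR in the nested location name."""
    location_match = wrap_nested(
        ["user", "user.profile", "user.profile.location"],
        {"match": {"user.profile.location.name": text}},
    )
    return {
        "query": {
            "bool": {
                "must": {
                    "bool": {
                        "should": [{"match": {"name": text}}, location_match],
                        "minimum_should_match": 1,
                    }
                },
                # Exact, non-scoring filter on the root status field.
                "filter": {"term": {"status": status}},
            }
        }
    }


body = name_or_location_query("Paris", 1)
```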

Must match query is not working in elastic search

I am trying to find all videos with the name "The Shining"
that also have parent_id = 189 and parent_type = "folder".
My query seems to connect all of the match statements with "OR" instead of "AND".
What am I doing wrong?
{
"fields": ["name","parent_id","parent_type"],
"query": {
"and": {
"must":[
{
"match": {
"name": "The Shining"
}
},
{
"match": {
"parent_id": 189
}
},
{
"match": {
"parent_type": "folder"
}
}
]
}
}
}
Mapping:
{"video" : {
"mappings" : {
"video" : {
"properties" : {
"homepage_tags" : {
"type" : "nested",
"properties" : {
"id" : {
"type" : "integer"
},
"metaType" : {
"type" : "string"
},
"tag_category_name" : {
"type" : "string"
},
"tag_category_order" : {
"type" : "integer"
},
"tag_name" : {
"type" : "string"
}
}
},
"id" : {
"type" : "integer"
},
"name" : {
"type" : "string"
},
"parent_id" : {
"type" : "integer"
},
"parent_type" : {
"type" : "string"
},
"provider" : {
"type" : "string"
},
"publish" : {
"type" : "string"
},
"query" : {
"properties" : {
"bool" : {
"properties" : {
"must" : {
"properties" : {
"match" : {
"properties" : {
"name" : {
"type" : "string"
},
"parent_id" : {
"type" : "long"
}
}
}
}
}
}
}
}
},
"source_id" : {
"type" : "string"
},
"subtitles" : {
"type" : "nested",
"include_in_root" : true,
"properties" : {
"content" : {
"type" : "string",
"store" : true,
"analyzer" : "no_stopwords"
},
"end_time" : {
"type" : "float"
},
"id" : {
"type" : "integer"
},
"parent_type" : {
"type" : "string"
},
"start_time" : {
"type" : "float"
},
"uid" : {
"type" : "integer"
},
"video_id" : {
"type" : "string"
},
"video_parent_id" : {
"type" : "integer"
},
"video_parent_type" : {
"type" : "string"
}
}
},
"tags" : {
"type" : "nested",
"properties" : {
"content" : {
"type" : "string"
},
"end_time" : {
"type" : "string"
},
"id" : {
"type" : "integer"
},
"metaType" : {
"type" : "string"
},
"parent_type" : {
"type" : "string"
},
"start_time" : {
"type" : "string"
}
}
},
"uid" : {
"type" : "integer"
},
"vid_url" : {
"type" : "string"
}
}
}
}}}
I was able to solve the problem by reading Volodymryr's answer.
Additional matches were being found because they matched "the". I tried adding the operator argument, but unfortunately that did not work. What I did instead was use "match_phrase" and switch my two other match fields to "term":
[
'query' => [
'bool' => [
'must' => [
[
'match_phrase' => [
'name' => $searchTerm
]
],
[
'term' => [
'parent_id' => intVal($parent_id)
]
],
[
'term' => [
'parent_type' => strtolower($parent_type)
]
]
]
]
]
];
You get matches on "the" or "shining" because the match query combines the analyzed terms with OR by default, and the score depends on how many terms match. One of the easiest fixes is to add the operator and:
{
"match": {
"name": {
"query": "The Shining",
"operator": "and"
}
}
}
But it's not what you need since this will also match names "shining The" or "The sun is shining".
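The difference between the default OR behaviour, operator and, and match_phrase can be illustrated with a small simulation. This is only a sketch: it assumes a simplified analyzer that lowercases and splits on whitespace and keeps stop words such as "the", and the function names are hypothetical, not Elasticsearch APIs.

```python
def tokens(text):
    """Simplified stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_or(query, value):
    """Default match behaviour: any query term may appear in the field."""
    field = tokens(value)
    return any(t in field for t in tokens(query))


def match_and(query, value):
    """match with operator and: every query term must appear, in any order."""
    field = tokens(value)
    return all(t in field for t in tokens(query))


def match_phrase(query, value):
    """match_phrase: the query terms must appear adjacent and in order."""
    q, field = tokens(query), tokens(value)
    return any(field[i:i + len(q)] == q for i in range(len(field) - len(q) + 1))


for name in ["The Shining", "shining The", "The sun is shining", "The Godfather"]:
    print(name, match_or("The Shining", name),
          match_and("The Shining", name),
          match_phrase("The Shining", name))
```

In this simulation only "The Shining" satisfies all three checks, which is why match_phrase was the variant that solved the asker's problem.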
Another option, if you need exact matches on name, is to make the name field non-analyzed. In ES 5 you can set the field type to keyword.
In addition, I would recommend using a bool query with term queries, since they do exact matching.
{
"fields": ["name","parent_id","parent_type"],
"query": {
"bool": {
"must":[{
"term": {
"name": "The Shining"
}
},
{
"term": {
"parent_id": 189
}
},
{
"term": {
"parent_type": "folder"
}
}
]
}
}
}

Elasticsearch - Conditional nested fetching

I have index mapping:
{
"dev.directory.3" : {
"mappings" : {
"profile" : {
"properties" : {
"email" : {
"type" : "string",
"index" : "not_analyzed"
},
"events" : {
"type" : "nested",
"properties" : {
"id" : {
"type" : "integer"
},
"name" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
}
}
}
}
}
with data:
"hits" : [ {
"_index" : "dev.directory.3",
"_type" : "profile",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"email" : "test@dummy.com",
"events" : [
{
"id" : 111,
"name" : "ABC"
},
{
"id" : 222,
"name" : "DEF"
}
]
}
}]
I'd like to return only the matched nested elements instead of the whole events array. Is this possible in ES?
Example query:
{
"nested" : {
"path" : "events",
"query" : {
"bool" : {
"filter" : [
{ "match" : { "events.id" : 222 } }
]
}
}
}
}
E.g. if I query for events.id=222, only a single element should be returned in the result list.
What strategy would be best to achieve this?
You can use inner_hits to only get the nested records which matched the query.
{
"query": {
"nested": {
"path": "events",
"query": {
"bool": {
"filter": [
{
"match": {
"events.id": 222
}
}
]
}
},
"inner_hits": {}
}
},
"_source": false
}
I am also excluding the source so that only the nested hits are returned.
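On the client side, the matched nested objects can then be read from the inner_hits section of each hit. The sketch below assumes the usual response layout, where inner hits are keyed by the nested path ("events" here) unless a name is given. The sample response is hand-written for illustration, not real output, and matched_nested is a hypothetical helper.

```python
def matched_nested(response, path):
    """Collect (parent _id, nested _source) pairs from inner_hits for `path`."""
    rows = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(path, {})
        for nested_hit in inner.get("hits", {}).get("hits", []):
            rows.append((hit["_id"], nested_hit["_source"]))
    return rows


# Hand-written sample shaped like a search response with inner_hits (assumption).
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "dev.directory.3",
                "_id": "1",
                "inner_hits": {
                    "events": {
                        "hits": {
                            "hits": [
                                {"_source": {"id": 222, "name": "DEF"}}
                            ]
                        }
                    }
                },
            }
        ]
    }
}

rows = matched_nested(sample_response, "events")
print(rows)
```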

Resources