Using Elasticsearch, I'm trying to search nested data and return the documents with the most "hits."
Relevant Example Data
POST myrecipes/recipe
{
"title": "Potato Salad",
"ingredients": [
{"name":"potato"},
{"name":"mayonaise"},
{"name":"mustard"},
{"name":"bacon"},
{"name":"onion"},
{"name":"parsley"}
]
}
POST myrecipes/recipe
{
"title": "Seared Scallops",
"ingredients": [
{"name":"scallops"},
{"name":"butter"},
{"name":"bacon"}
]
}
POST myrecipes/recipe
{
"title": "Tuna melt",
"ingredients": [
{"name":"tuna"},
{"name":"onion"},
{"name":"butter"},
{"name":"bacon"},
{"name":"mayonaise"},
{"name":"lettuce"},
{"name":"tomato"},
{"name":"bread"}
]
}
My Index
PUT myrecipes
{
"mappings": {
"recipe": {
"properties": {
"ingredients": {
"type": "nested",
"include_in_parent": true,
"properties": {
"name": {
"type": "string",
"fields": {
"untouched": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
}
My Search:
POST myrecipes/recipe/_search
{
"query": {
"nested": {
"path": "ingredients",
"query": {
"bool": {
"should": [
{
"match": {
"ingredients.name": {
"query": "bacon"
}
}
},
{
"match": {
"ingredients.name": {
"query": "butter"
}
}
}
]
}
}
}
}
}
The query returns
"Tuna melt", "Potato Salad", "Seared Scallops"
I'm hoping for:
"Seared Scallops", "Tuna melt", "Potato Salad"
Since there are two "hits" on "Seared scallops" and "Tuna Melt" (bacon and butter), and only one "hit" on "Potato Salad" (bacon).
Related
Hi i want to search array element from index using elastic search query
{
"name": "Karan",
"address": [
{
"city": "newyork",
"zip": 12345
},
{
"city": "mumbai",
"zip": 23456
}]
}}
when i am trying to search using match query it does not work
{
"query": {
"bool": {
"must": [
{
"match": {
"address.city": "newyork"
}
}
]
}
}
}
when i access simple feild like "name": "Karan" it works, there is only issue for array element.
Because nested objects are indexed as separate hidden documents, we can’t query them directly. Instead, we have to use the nested query to access them:
GET /my_index/blogpost/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "eggs"
}
},
{
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{
"match": {
"comments.name": "john"
}
},
{
"match": {
"comments.age": 28
}
}
]
}
}
}
}
]
}}}
See the docs
The way i followed..
Mapping :
{
"mappings": {
"job": {
"properties": {
"name": {
"type": "text"
},
"skills": {
"type": "nested",
"properties": {
"value": {
"type": "text"
}
}
}
}
}
}
Records
[{"_index":"jobs","_type":"job","_id":"2","_score":1.0,"_source":{"name":"sr soft eng","skills":[{"value": "java"}, {"value": "oracle"}]}},{"_index":"jobs","_type":"job","_id":"1","_score":1.0,"_source":{"name":"sr soft eng","skills":[{"value": "java"}, {"value": "oracle"}, {"value": "javascript"}]}},
search Query
{
"query": {
"nested": {
"path": "skills",
"query": {
"bool": {
"must": [
{ "match": {"skills.value": "java"}}
]
}
}
}
}
}
can somebody help me to understand what Elastic means by nested. In documentations https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_nested_sorting_examples is an example which does not show how the document object looks like. It look like I should imagine the mapping from the search query. Query looks like:
POST /_search
{
"query": {
"nested": {
"path": "parent",
"query": {
"bool": {
"must": {"range": {"parent.age": {"gte": 21}}},
"filter": {
"nested": {
"path": "parent.child",
"query": {"match": {"parent.child.name": "matt"}}
}
}
}
}
}
},
"sort" : [
{
"parent.child.age" : {
"mode" : "min",
"order" : "asc",
"nested": {
"path": "parent",
"filter": {
"range": {"parent.age": {"gte": 21}}
},
"nested": {
"path": "parent.child",
"filter": {
"match": {"parent.child.name": "matt"}
}
}
}
}
}
]
}
Can somebody write a document structure on which this query will work?
Something like this.
{
"parent": {
"name": "Elasti Sorch",
"age": 23,
"child": [
{
"name": "Kibana Lion",
"age": 12
},
{
"name": "Matt",
"age": 15
}
]
}
}
In Elastic nested means it's an array of objects. To store an array of objects into a field in elastic search you have to map the field to a nested while creating the index.
PUT parent
{
"mappings": {
"doc":{
"properties": {
"name":{
"type": "text"
},
"age":{
"type": "integer"
},
"child":{
"type": "nested",
"properties": {
"name":{
"type":"text"
},
"age":{
"type":"integer"
}
}
}
}
}
}
}
and a sample nested document cab be inserted like this
POST parent/doc
{
"name":"abc",
"age":50,
"child":[
{
"name":"son1",
"age":25
},
{
"name":"adughter1",
"age":20
}
]
}
Im trying to search two or more values on array and get only those ones that match with all words (AND CLAUSE)
Some example:
{ "name" : "Chevrolet",
"value" : [ "gasolina", "alcool", "diesel"]
}
{ "name" : "Fiat",
"value" : [ "eletrica", "alcool"]
}
{ "name" : "Honda",
"value" : [ "diesel", "gasolina"]
}
My mapping
{
"mappings": {
"cars": {
"properties": {
"name": {
"type": "string"
},
"GasType": {
"type": "nested",
"properties": {
"value": {
"type": "string"
}
}
}
}
}
}
}
Query:
{
"query": {
"nested": {
"path": "GasType",
"query": {
"bool": {
"must": [
{ "match": {"GasType.value": "gasolina"}},
{ "match": {"GasType.value": "diesel"}}
]
}
}
}
}
}
My return is always empty and if i change de query i have got all those that contains "Gasolina" or "diesel"
I need those that has "Gasolina" AND "diesel"
Your test data doesn't match the mapping of the index. In your test data I don't see the nested field name GasType. In any case, the following works for me just fine:
DELETE test
PUT test
{
"mappings": {
"cars": {
"properties": {
"name": {
"type": "string"
},
"GasType": {
"type": "nested",
"properties": {
"value": {
"type": "string"
}
}
}
}
}
}
}
POST test/cars/_bulk
{"index":{}}
{"name":"Chevrolet","GasType":{"value":["gasolina","alcool","diesel"]}}
{"index":{}}
{"name":"Fiat","GasType":{"value":["eletrica","alcool"]}}
{"index":{}}
{"name":"Honda","GasType":{"value":["diesel","gasolina"]}}
{"index":{}}
{"name":"Honda","GasType":{"value":["diesel"]}}
GET test/_search
{
"query": {
"nested": {
"path": "GasType",
"query": {
"bool": {
"must": [
{
"match": {
"GasType.value": "gasolina"
}
},
{
"match": {
"GasType.value": "diesel"
}
}
]
}
}
}
}
}
I have two documents in my index (same type) :
{
"first_name":"John",
"last_name":"Doe",
"age":"24",
"phone_numbers":[
{
"contract_number":"123456789",
"phone_number":"987654321",
"creation_date": ...
},
{
"contract_number":"123456789",
"phone_number":"012012012",
"creation_date": ...
}
]
}
{
"first_name":"Roger",
"last_name":"Waters",
"age":"36",
"phone_numbers":[
{
"contract_number":"546987224",
"phone_number":"987654321",
"creation_date": ...,
"expired":true
},
{
"contract_number":"87878787",
"phone_number":"55555555",
"creation_date": ...
}
]
}
Clients would like to perform a full text search. Okay no problem here
My problem :
In this full text search, sometimes user will search by phone_number. In this case there is a parameter like expired=true.
Example :
First client search request : "987654321" with expired absent or set to false
--> Result : Only first document
Second client search request : "987654321" with expired set to true
--> Result : The two documents
How can I achieve that ?
Here is my mapping :
{
"user": {
"_all": {
"auto_boost": true,
"omit_norms": true
},
"properties": {
"phone_numbers": {
"type": "nested",
"properties": {
"phone_number": {
"type": "string"
},
"creation_date": {
"type": "string",
"index": "no"
},
"contract_number": {
"type": "string"
},
"expired": {
"type": "boolean"
}
}
},
"first_name":{
"type": "string"
},
"last_name":{
"type": "string"
},
"age":{
"type": "string"
}
}
}
}
Thanks !
MC
EDIT :
I tried this query :
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "987654321",
"analyze_wildcard": "true"
}
},
"filter": {
"nested": {
"path": "phone_numbers",
"filter": {
"bool": {
"should":[
{
"bool": {
"must": [
{
"term": {
"phone_number": "987654321"
}
},
{
"missing": {
"field": "expired"
}
}
]
}
},
{
"bool": {
"must_not": [
{
"term": {
"phone_number": "987654321"
}
}
]
}
}
]
}
}
}
}
}
}}
But I get the two documents instead of get only the first one
You're very close. Try using a combination of must and should, where the must clause ensures the phone_number matches the search value, and the should clause ensures that either the expired field is missing or set to false. For example:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "987654321",
"analyze_wildcard": "true"
}
},
"filter": {
"nested": {
"path": "phone_numbers",
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"phone_number": "987654321"
}
}
],
"should": [
{
"missing": {
"field": "expired"
}
},
{
"term": {
"expired": false
}
}
]
}
}
}
}
}
}
}
}
}
I ran this query using your mapping and sample documents and it returned the one document for John Doe, as expected.
I am running into a query problem with ElasticSearch.
We have objects that looks like this:
{
"id":"1234",
"tags":[
{ "tagName": "T1", "tagValue":"V1"},
{ "tagName": "T2", "tagValue":"V2"},
{ "tagName": "T3", "tagValue":"V3"}
]
}
{
"id":"5678",
"tags":[
{ "tagName": "T1", "tagValue":"X1"},
{ "tagName": "T2", "tagValue":"X2"}
]
}
And I would like to get a list of tagValues for tagName=T1, which is "V1" and "X1".
I tried
{
"filter": {
"bool": {
"must": [
{
"term":{
"tags.tagName": "T1"
}
}
]
}
},
"facets": {
"TagValues":{
"filter": {
"term": {
"tags.tagName": "T1"
}
},
"terms": {
"field": "tags.tagValue",
"size": 30
}
}
}
}
It seems like it's returning all tagValues from all tags "T1", "T2", and "T3".
Can someone please help me with this query? How can I get faceted list for objects that's in an array?
Any help would be appreciated.
Thank you,
The main idea is to use the nested type for your tags field. Here is the mapping you should use:
curl -XPUT localhost:9200/mytags -d '{
"mappings": {
"mytag": {
"properties": {
"id": {
"type": "string"
},
"tags": {
"type": "nested",
"properties": {
"tagName": {
"type": "string",
"index": "not_analyzed"
},
"tagValue": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}'
Then you can reindex your data and run a query like the one below, which will first filter only the document containing a tagName whose value is T1 and then using aggregations (don't use facets anymore as they are deprecated), you can again select only those tags whose tagName is T1 and then retrieve the associated tagValue fields. This will get you the expected V1 and X1 values.
curl -XPOST localhost:9200/mytags/mytag/_search -d '{
"size": 0,
"query": {
"filtered": {
"filter": {
"nested": {
"path": "tags",
"query": {
"term": {
"tags.tagName": "T1"
}
}
}
}
}
},
"aggs": {
"tags": {
"nested": {
"path": "tags"
},
"aggs": {
"values": {
"filter": {
"term": {
"tags.tagName": "T1"
}
},
"aggs": {
"values": {
"terms": {
"field": "tags.tagValue"
}
}
}
}
}
}
}
}'