How to improve inner_hits in Elasticsearch

I have two ES types in my_index:
user
user_property
user is defined as the parent and user_property as the child.
user_property has the following mapping:
PUT /my_index/_mapping/user_property
{
"user_property": {
"properties": {
"name": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
}
}
I want to get all users that have certain properties (say property1 and property2), along with the values of those properties. To do this I wrote the following query with inner_hits, but the response time grows very large once inner_hits is included.
GET /my_index/user/_search
{
"query": {
"bool": {
"must": [
{
"has_child": {
"type": "user_property",
"query": {
"bool": {
"must": [
{
"term": {
"name": "property1"
}
}
]
}
},
"inner_hits": {
"name": "inner_hits_1"
}
}
},
{
"has_child": {
"type": "user_property",
"query": {
"bool": {
"must": [
{
"term": {
"name": "property2"
}
}
]
}
},
"inner_hits": {
"name": "inner_hits_2"
}
}
}
]
}
}
}
Is there any way to reduce this time?

Related

Get the count of all the documents including innerHits in elasticsearch

I have an index defined in Elasticsearch with a three-level join hierarchy:
aggParent
aggChildL1
aggChildL0
Below is the mapping for that index.
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 0
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"deviceName": {
"type": "keyword"
},
"agg_relation_type": {
"type": "join",
"relations": {
"aggParent": "aggChildL1",
"aggChildL1": "aggChildL0"
}
}
}
}
}
I have written a query that returns parent documents in hits and the corresponding children in inner_hits.
The query is as follows:
{
"size": 1,
"query": {
"bool": {
"should": [
{
"has_child": {
"type": "aggChildL1",
"query": {
"bool": {
"should": [
{
"has_child": {
"type": "aggChildL0",
"query": {
"match": {
"id": "nc1olt5onu1unia"
}
},
"inner_hits": {
}
}
},
{
"bool": {
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
},
{
"match": {
"agg_relation_type": "aggChildL1"
}
}
]
}
}
]
}
},
"inner_hits": {
"size": 64,
"sort": [
{
"deviceType": {
"order": "desc"
}
}
]
}
}
},
{
"bool": {
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
},
{
"match": {
"agg_relation_type": "aggParent"
}
}
]
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "agg_relation_type"
}
},
"must": [
{
"match": {
"id": "nc1olt5onu1unia"
}
}
]
}
}
]
}
}
}
This query returns a total at the top level that counts only the matching aggParent documents.
I need counts at the inner hits level as well.
That is, the count of all matching documents at the aggChildL0 level, the count of all documents that get loaded at the aggChildL1 level through the has_child query, and the count of documents that match the filter at the aggChildL1 level.
Similarly, the count of all documents that get loaded at the aggParent level through the topmost has_child query, and the count of documents that match the filter at the aggParent level.
In short, I need the total count of all documents that the query can return.
Is there any way to get this total count in Elasticsearch?

Elasticsearch for index array element

Hi, I want to search for an array element in an index using an Elasticsearch query.
{
"name": "Karan",
"address": [
{
"city": "newyork",
"zip": 12345
},
{
"city": "mumbai",
"zip": 23456
}]
}
When I try to search using a match query, it does not work:
{
"query": {
"bool": {
"must": [
{
"match": {
"address.city": "newyork"
}
}
]
}
}
}
When I search on a simple field like "name": "Karan" it works; the issue occurs only with the array element.
Because nested objects are indexed as separate hidden documents, we can’t query them directly. Instead, we have to use the nested query to access them:
GET /my_index/blogpost/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "eggs"
}
},
{
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{
"match": {
"comments.name": "john"
}
},
{
"match": {
"comments.age": 28
}
}
]
}
}
}
}
]
}}}
See the docs
The approach I followed:
Mapping:
{
"mappings": {
"job": {
"properties": {
"name": {
"type": "text"
},
"skills": {
"type": "nested",
"properties": {
"value": {
"type": "text"
}
}
}
}
}
}
}
Records
[
{"_index":"jobs","_type":"job","_id":"2","_score":1.0,"_source":{"name":"sr soft eng","skills":[{"value": "java"}, {"value": "oracle"}]}},
{"_index":"jobs","_type":"job","_id":"1","_score":1.0,"_source":{"name":"sr soft eng","skills":[{"value": "java"}, {"value": "oracle"}, {"value": "javascript"}]}}
]
Search query
{
"query": {
"nested": {
"path": "skills",
"query": {
"bool": {
"must": [
{ "match": {"skills.value": "java"}}
]
}
}
}
}
}
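To tie this back to the address example in the question, here is a minimal Python sketch that builds the same kind of nested query body. It assumes the address field is mapped as type nested (the question does not show its mapping), and nested_match is a hypothetical helper name, not part of any client library.

```python
import json


def nested_match(path, field, value):
    """Build a nested query that matches `value` on the field `path.field`.

    Hypothetical helper: it only assembles the request body as a dict.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        # Fields inside a nested query use the full path.
                        "must": [{"match": {f"{path}.{field}": value}}]
                    }
                },
            }
        }
    }


# Assumes `address` is mapped as a nested field.
body = nested_match("address", "city", "newyork")
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the search request body with whichever client you use.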

Score keyword terms query on nested fields in Elasticsearch 6.3

I have a set of keywords (skills in my example) and I would like to retrieve the documents that match most of them. The documents should be sorted by how many of the keywords they match. The field I am searching (skills) is of nested type. The index has the following mapping:
{
"mappings": {
"profiles": {
"properties": {
"id": {
"type": "keyword"
},
"skills": {
"type": "nested",
"properties": {
"level": {
"type": "float"
},
"name": {
"type": "keyword"
}
}
}
}
}
}
}
I tried both a terms query on the keyword field like:
{
"query": {
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"python",
"java"
]
}
}
}
}
}
And a boolean query
{
"query": {
"nested": {
"path": "skills",
"query": {
"bool": {
"should": [
{
"terms": {
"skills.name": [
"java"
]
}
},
{
"terms": {
"skills.name": [
"r"
]
}
}
]
}
}
}
}
}
For both queries the maximum score of the returned documents is 1. Thus both return documents that have ANY of the skills, but do not sort them so that those with both skills are on top. The issue seems to be that skills is a nested field.
The second query works if each element of should is a nested query.
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"java"
]
}
}
}
},
{
"nested": {
"path": "skills",
"query": {
"terms": {
"skills.name": [
"r"
]
}
}
}
}
]
}
}
}
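With a longer keyword list, the one-nested-clause-per-keyword shape above can be generated rather than written by hand. This is a rough Python sketch, assuming the skills.name field from the question's mapping; skills_should_query is a hypothetical helper name. The idea it relies on is that the outer bool query adds up the scores of the should clauses that match, so a profile matching more keywords ends up with a higher score.

```python
def skills_should_query(skills, path="skills", field="name"):
    """Build a bool/should query with one nested clause per keyword.

    Hypothetical helper: each keyword gets its own nested query so that
    every matched keyword contributes separately to the document score.
    """
    clauses = [
        {
            "nested": {
                "path": path,
                "query": {"terms": {f"{path}.{field}": [skill]}},
            }
        }
        for skill in skills
    ]
    return {"query": {"bool": {"should": clauses}}}


query = skills_should_query(["java", "r"])
```

For the two keywords used here, the generated body has the same structure as the query shown above.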

Elasticsearch query on data with multi level child

Given this sample data:
"users": {
"user1": {
"first": "john",
"last": "bellamy"
},
"user2": {
.....
.....
}
}
How can I set up Elasticsearch to query/search on the child fields first and last? Other tutorials only show a single level of children, not two or more levels.
I tried looking for a solution, and I guess it has something to do with the mapping options?
I only started with Elasticsearch a few days ago and have already managed to set it up and add data.
This works for me
{
"query": {
"bool": {
"must": [{
"term": {
"users.user2.firstname": {
"value": "sumit"
}
}
}]
}
}
}
Nested users approach
Mapping:
{
"mappings": {
"test_type": {
"properties": {
"users": {
"type": "nested",
"properties": {
"firstname": {
"type": "text"
},
"lastname": {
"type": "text"
}
}
}
}
}
}
}
query
{
"query": {
"bool": {
"must": [{
"nested": {
"inner_hits": {},
"path": "users",
"query": {
"bool": {
"must": [{
"term": {
"users.firstname": {
"value": "ajay"
}
}
}]
}
}
}
}]
}
}
}
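Because the query above requests inner_hits, each top-level hit should come back with the matching nested users attached. The sketch below shows one way to pull them out of a response in Python. It assumes the inner hits are keyed by the nested path ("users"), which is the default when no name is given, and sample_response is a hand-written illustration of the response shape, not real output.

```python
def nested_inner_hits(response, name):
    """Collect (parent _id, nested _source) pairs from a search response.

    Hypothetical helper: `response` is the parsed JSON body of a search
    that requested inner_hits under the key `name`.
    """
    matches = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(name, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            matches.append((hit["_id"], inner_hit["_source"]))
    return matches


# Hand-written sample for illustration only (not real output).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "inner_hits": {
                    "users": {
                        "hits": {"hits": [{"_source": {"firstname": "ajay"}}]}
                    }
                },
            }
        ]
    }
}

print(nested_inner_hits(sample_response, "users"))
```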

Elasticsearch : search document with conditional filter

I have two documents in my index (same type) :
{
"first_name":"John",
"last_name":"Doe",
"age":"24",
"phone_numbers":[
{
"contract_number":"123456789",
"phone_number":"987654321",
"creation_date": ...
},
{
"contract_number":"123456789",
"phone_number":"012012012",
"creation_date": ...
}
]
}
{
"first_name":"Roger",
"last_name":"Waters",
"age":"36",
"phone_numbers":[
{
"contract_number":"546987224",
"phone_number":"987654321",
"creation_date": ...,
"expired":true
},
{
"contract_number":"87878787",
"phone_number":"55555555",
"creation_date": ...
}
]
}
Clients would like to perform a full-text search. No problem there.
My problem:
In this full-text search, users will sometimes search by phone_number. In that case there is an additional parameter such as expired=true.
Example :
First client search request : "987654321" with expired absent or set to false
--> Result : Only first document
Second client search request : "987654321" with expired set to true
--> Result : The two documents
How can I achieve that ?
Here is my mapping :
{
"user": {
"_all": {
"auto_boost": true,
"omit_norms": true
},
"properties": {
"phone_numbers": {
"type": "nested",
"properties": {
"phone_number": {
"type": "string"
},
"creation_date": {
"type": "string",
"index": "no"
},
"contract_number": {
"type": "string"
},
"expired": {
"type": "boolean"
}
}
},
"first_name":{
"type": "string"
},
"last_name":{
"type": "string"
},
"age":{
"type": "string"
}
}
}
}
Thanks !
MC
EDIT :
I tried this query :
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "987654321",
"analyze_wildcard": "true"
}
},
"filter": {
"nested": {
"path": "phone_numbers",
"filter": {
"bool": {
"should":[
{
"bool": {
"must": [
{
"term": {
"phone_number": "987654321"
}
},
{
"missing": {
"field": "expired"
}
}
]
}
},
{
"bool": {
"must_not": [
{
"term": {
"phone_number": "987654321"
}
}
]
}
}
]
}
}
}
}
}
}}
But I get both documents instead of only the first one.
You're very close. Try using a combination of must and should, where the must clause ensures the phone_number matches the search value, and the should clause ensures that either the expired field is missing or set to false. For example:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "987654321",
"analyze_wildcard": "true"
}
},
"filter": {
"nested": {
"path": "phone_numbers",
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"phone_number": "987654321"
}
}
],
"should": [
{
"missing": {
"field": "expired"
}
},
{
"term": {
"expired": false
}
}
]
}
}
}
}
}
}
}
}
}
I ran this query using your mapping and sample documents and it returned the one document for John Doe, as expected.
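Note that the filtered query and the missing filter used above were removed in Elasticsearch 5.0. For newer versions, the same nested condition can be sketched with a bool query, using must_not plus exists in place of missing. The Python below is only an illustration under some assumptions: field names are written with their full nested path, phone_number_filter is a hypothetical helper, and the include_expired flag stands in for the client's expired parameter. It builds only the nested clause; how the full-text part is combined with it depends on your mapping.

```python
def phone_number_filter(number, include_expired=False):
    """Build a nested clause that matches a phone number.

    Hypothetical helper. When include_expired is False, the entry must
    either have no `expired` field or have it set to false.
    """
    nested_bool = {
        "must": [{"term": {"phone_numbers.phone_number": number}}]
    }
    if not include_expired:
        nested_bool["should"] = [
            # Replacement for the removed `missing` filter.
            {"bool": {"must_not": {"exists": {"field": "phone_numbers.expired"}}}},
            {"term": {"phone_numbers.expired": False}},
        ]
        # Require at least one should clause even though `must` is present.
        nested_bool["minimum_should_match"] = 1
    return {"nested": {"path": "phone_numbers", "query": {"bool": nested_bool}}}


# Wrap the clause in a filter context; add the full-text query as needed.
search_body = {"query": {"bool": {"filter": [phone_number_filter("987654321")]}}}
```

Calling phone_number_filter("987654321", include_expired=True) drops the should clauses, which corresponds to the second client request in the question.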
