elastic search join query - elasticsearch

I am having a hard time doing join queries in Elasticsearch. My use case: given a particular field and its value in a child document, retrieve the parent document.
I have established a parent-child relationship between two document types following the documentation. When I query, I don't get an error, but I don't get any results either. This is the query:
GET my_index/_search
{
"query": {
"has_parent" : {
"parent_type" : "<parent_type>",
"query" : {
"match" : {
"name": "child_name"
}
}
}
}
}
I have loaded the parent documents and child documents into the same index, "my_index". I created the mapping before loading the documents. The mapping is as follows:
PUT my_index
{
"mappings": {
"doc": {
"properties": {
"parent_join": {
"type": "join",
"relations": {
"<parent_type>": "<child_type>"
}
}
}
}
}
}
When loading the documents, I set the routing value to the parent ID.

You said you want to retrieve the parent document based on a field in the child, but your query does the opposite: has_parent returns child documents whose parent matches. You want has_child:
{
"query": {
"has_child" : {
"type" : "<child_type>",
"query" : {
"term" : {
"name" : "child_name"
}
}
}
}
}
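As a rough sketch, the same request body can be built in Python as a plain dict before sending it with whatever client you use. The helper name has_child_query is made up for illustration and is not part of any Elasticsearch client; the type and field placeholders are carried over from the question.

```python
import json


def has_child_query(child_type, field, value):
    """Build a has_child body: returns parents that have a matching child.

    Hypothetical helper for illustration only.
    """
    return {
        "query": {
            "has_child": {
                "type": child_type,
                # The inner query runs against the child documents.
                "query": {"term": {field: value}},
            }
        }
    }


body = has_child_query("<child_type>", "name", "child_name")
print(json.dumps(body, indent=2))
```

The printed JSON is the body to send to GET my_index/_search.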


Elastic Search Multiple Filter values for the same field

Say I have to filter car makers in an Elasticsearch index (ES 7.15), where the field car_maker is mapped as keyword and takes one of a limited set of car maker names:
{
"mappings": {
"properties": {
"car_maker": {
"type": "keyword"
}
}
}
}
GET /cars/_search
{
"query": {
"bool": {
"filter": [{
"term": {
"car_maker": "Honda"
}
}]
}
}
}
This works fine, also when combined with a match query. The filter does not take part in score calculation, as desired.
Now I would like to filter on more car makers in that query (let's say, like a should query):
{
"query": {
"bool": {
"filter" : [
{"term" : { "car_maker" : "Honda"}},
{"term" : { "car_maker" : "Ferrari"}}
]
}
}
}
This does not work: I get no error from the ES query engine, but no results either. Of course it is always possible to apply several filters to different fields, like car_maker and car_color, but how do I do the opposite: apply several values (Honda, Ferrari, etc.) to the same filter field car_maker, as in the example above, without affecting the score calculation?
Clauses in a bool filter array must all match, so a document would need car_maker to be both Honda and Ferrari. A terms query matches any of the listed values, so you might want to try the following filter query:
{
"query" : {
"bool" : {
"filter" : {
"terms" : {
"car_maker" : ["Honda", "Ferrari"]
}
}
}
}
}
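To make the difference concrete, here is a small Python sketch. It builds the terms filter body and also mimics, on an in-memory list, how several term clauses (all must match) differ from one terms clause (any value may match). The helper names and the sample documents are invented for illustration, and the matching functions are only a toy model of a single-valued keyword field, not how Elasticsearch executes the query.

```python
def terms_filter_query(field, values):
    """Build a bool/filter body with a single terms clause (hypothetical helper)."""
    return {"query": {"bool": {"filter": {"terms": {field: list(values)}}}}}


def matches_all_term_clauses(doc, field, values):
    # Toy model of several term clauses in bool.filter: every clause must match.
    return all(doc.get(field) == v for v in values)


def matches_terms_clause(doc, field, values):
    # Toy model of one terms clause: any listed value may match.
    return doc.get(field) in values


# Sample documents invented for the example.
docs = [{"car_maker": "Honda"}, {"car_maker": "Ferrari"}, {"car_maker": "Fiat"}]
wanted = ["Honda", "Ferrari"]

and_hits = [d for d in docs if matches_all_term_clauses(d, "car_maker", wanted)]
or_hits = [d for d in docs if matches_terms_clause(d, "car_maker", wanted)]
body = terms_filter_query("car_maker", wanted)
```

Here and_hits stays empty while or_hits contains the Honda and Ferrari documents, which is the behaviour seen in the question and in the answer respectively.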

ElasticSearch filtering for a tag in array

I've got a bunch of events that are tagged for their audience:
{ id = 123, audiences = ["Public", "Lecture"], ... }
I'm trying to do an Elasticsearch query with filtering, so that the search only returns events that have an exact entry of "Public" in the audiences array (and doesn't return events that are "Not Public").
How do I do that?
This is what I have so far, but it's returning zero results, even though I definitely have "Public" events:
curl -XGET 'http://localhost:9200/events/event/_search' -d '
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"audiences": "Public"
}
},
"query" : {
"match" : {
"title" : "[searchterm]"
}
}
}
}
}'
You could use this mapping for your content type:
{
"your_index": {
"mappings": {
"your_type": {
"properties": {
"audiences": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
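not_analyzed: Index this field, so it is searchable, but index the value exactly as specified. Do not analyze it.
With this mapping, the term filter must use the exact stored value ("Public"). If the field is left analyzed with the default analyzer instead, use the lowercase term value ("public") in the search query.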
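The reason the original filter returns nothing can be illustrated with a toy model in Python. This is only a sketch: analyze_standard is a crude stand-in for the default analyzer (split on whitespace, lowercase), and the function names are invented for the example. A term filter is not analyzed, so it only matches if the exact term is in the index.

```python
def analyze_standard(value):
    # Crude stand-in for the default analyzer: split on whitespace and lowercase.
    return [token.lower() for token in value.split()]


def index_terms(values, analyzed):
    """Return the set of terms that would be indexed for a field's values."""
    terms = set()
    for value in values:
        if analyzed:
            terms.update(analyze_standard(value))
        else:
            # not_analyzed: the whole value is stored as one exact term.
            terms.add(value)
    return terms


def term_filter_matches(indexed_terms, term):
    # A term filter looks up the term as given, without analysis.
    return term in indexed_terms


public_event = ["Public", "Lecture"]
not_public_event = ["Not Public"]

analyzed_public = index_terms(public_event, analyzed=True)
analyzed_not_public = index_terms(not_public_event, analyzed=True)
exact_public = index_terms(public_event, analyzed=False)
exact_not_public = index_terms(not_public_event, analyzed=False)

print(term_filter_matches(analyzed_public, "Public"))      # analyzed field, capitalized term
print(term_filter_matches(analyzed_not_public, "public"))  # analyzed field also hits "Not Public"
print(term_filter_matches(exact_public, "Public"))         # not_analyzed field, exact value
print(term_filter_matches(exact_not_public, "Public"))     # not_analyzed keeps "Not Public" out
```

In this model the analyzed field never contains the term "Public", and the lowercase term "public" also matches the "Not Public" event, which is why the exact, not_analyzed mapping fits the requirement better.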

Elasticsearch - Query by parent id

I have three document types with the following mappings with Parent/Child relationships. I have omitted other properties as I thought they are not relevant to the question.
"mappings": {
"Parent": {
},
"Child": {
"_parent": {
"type": "Parent"
},
"_routing": {
"required": true
}
},
"GrandChild": {
"_parent": {
"type": "Child"
},
"_routing": {
"required": true
}
}
}
I'm using the Java API to insert documents into the index. Before indexing a document, existing documents with the same IDs need to be removed to avoid duplicates. The Parent and Child IDs are stored externally. First, all "GrandChild" documents are deleted using a term query on their parent ID (which is the ID of a "Child" document in this case). There are no errors, but the "GrandChild" docs are not getting deleted.
By running the following term query in the Chrome plugin Sense, I found out that the problem is in the term query: it doesn't return hits. Parent, Child and GrandChild have the same routing value, which is set to the ID of the Parent. This is the query I tried:
POST /myindex/GrandChild/_search?routing=DFC0E8CD59EBC00EC2DDC9A0FF5D1F2DB272B2449680824CCE60B6864568D498
{
"query" : {
"term" : { "_parent" : "//id/of/doc/of/type/Child" }
}
}
When I try to search for a "Child" document using its parent id, it works. I get a "Child" using the following query.
POST /myindex/Child/_search?routing=DFC0E8CD59EBC00EC2DDC9A0FF5D1F2DB272B2449680824CCE60B6864568D498
{
"query" : {
"term" : { "_parent" : "DFC0E8CD59EBC00EC2DDC9A0FF5D1F2DB272B2449680824CCE60B6864568D498" }
}
}
What could I be doing wrong in the "GrandChild" search?
Any help is appreciated. Thanks.
I got an answer to this question from the Elasticsearch mailing list:
https://groups.google.com/forum/#!topic/elasticsearch/d_aZejNMBD8
There is an open issue for this here:
https://github.com/elasticsearch/elasticsearch/issues/5399
Edit:
As shown in the GitHub issue, the way to perform your query is to prefix the ID with the parent type:
POST /myindex/GrandChild/_search?routing=DFC0E8CD59EBC00EC2DDC9A0FF5D1F2DB272B2449680824CCE60B6864568D498
{
"query" : {
"term" : { "_parent" : "Child#//id/of/doc/of/type/Child" }
}
}
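If you build these queries in code, a tiny helper keeps the prefixing in one place. The following Python sketch assumes the "Type#id" form shown above; parent_uid and parent_term_query are hypothetical names, not functions of any Elasticsearch client.

```python
def parent_uid(parent_type, parent_id):
    """Join the parent type and ID in the 'Type#id' form used in the query above."""
    return f"{parent_type}#{parent_id}"


def parent_term_query(parent_type, parent_id):
    """Build a term query on _parent with the type-prefixed ID (hypothetical helper)."""
    return {"query": {"term": {"_parent": parent_uid(parent_type, parent_id)}}}


# The ID placeholder is carried over from the question.
body = parent_term_query("Child", "//id/of/doc/of/type/Child")
print(body)
```

The routing parameter still has to be passed on the request itself, as in the POST line above.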

Elasticsearch grouping facet by owner, mine vs others

I am using Elasticsearch to index documents that have an owner which is stored in a userId property of the source object. I can easily do a facet on the userId and get facets for each owner that there is, but I'd like to have the facets for owner show up like so:
Documents owned by me (X)
Documents owned by others (Y)
I could handle this on the client side and take all of the facets returned by elasticsearch and go through them and figure out those owned by the current user and not and display it appropriately, but I was hoping there was a way to tell elasticsearch to handle this in the query itself.
You can use filtered facets to do this:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"match_all": {}
},
"facets": {
"my_docs": {
"filter": {
"term": { "user_id": "my_user_id" }
}
},
"others_docs": {
"filter": {
"not": {
"term": { "user_id": "my_user_id" }
}
}
}
}
}'
One of the nice things about this is that the two term filters are identical and so are only executed once. The not filter just inverts the results of the cached term filter.
You're right, Elasticsearch has a way to do that. Take a look at scripting term facets, especially the second example ("using the boolean feature"). You should be able to do something like:
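Since the user ID changes per request, you would normally generate this body in code. A minimal Python sketch, assuming the same facet names and the user_id field used in the curl example above; owner_facets_request is an invented helper name.

```python
import json


def owner_facets_request(user_id, field="user_id"):
    """Build the 'mine vs. others' filter facets body for one user.

    Hypothetical helper; the facet names mirror the curl example.
    """
    def mine():
        # Built fresh each time so the two facets do not share one dict object.
        return {"term": {field: user_id}}

    return {
        "query": {"match_all": {}},
        "facets": {
            "my_docs": {"filter": mine()},
            "others_docs": {"filter": {"not": mine()}},
        },
    }


request_body = owner_facets_request("my_user_id")
print(json.dumps(request_body, indent=2))
```

Both facets carry the same term filter, with the second one wrapped in not, matching the structure described above.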
{
"query" : {
"match_all" : { }
},
"facets" : {
"userId" : {
"terms" : {
"field" : "userId",
"size" : 10,
"script" : "term == '<your user id>' ? true : false"
}
}
}
}

Trouble with has_parent query containing scripted function_score

I have two document types, in a parent-child relationship:
"myParent" : {
"properties" : {
"weight" : {
"type" : "double"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
}
}
The weight field is to be used for custom scoring/sorting. This query directly against the parent documents works as intended:
{
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
}
}
However, when trying to do similar scoring for the child documents with a has_parent query, I get an error:
{
"query" : {
"has_parent" : {
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"parent_type" : "myParent",
"score_type" : "score"
}
}
}
The error is:
QueryPhaseExecutionException[[myIndex][3]: query[filtered(ParentQuery[myParent](filtered(function score (ConstantScore(:),function=script[_score * doc['weight'].value], params [null]))->cache(_type:myParent)))->cache(_type:myChild)],from[0],size[10]: Query Failed [failed to execute context rewrite]]; nested: ElasticSearchIllegalArgumentException[No field found for [weight] in mapping with types [myChild]];
It seems like instead of applying the scoring function to the parent and then passing its result to the child, ES is trying to apply the scoring function itself to the child, causing the error.
If I don't use score for score_type, the error doesn't occur, although the result scores are then all 1.0, as documented.
What am I missing here? How can I query these child documents with custom scoring based on a parent field?
I would say this is a bug: it is using the myChild mapping as the default context, even though you are inside a has_parent query. But I'm not sure how easy the bug would be to fix properly.
However, you can work around it by including the type name in the full field name:
curl -XGET "http://localhost:9200/t/myChild/_search" -d'
{
"query": {
"has_parent": {
"query": {
"function_score": {
"script_score": {
"script": "_score * doc[\"myParent.weight\"].value"
}
}
},
"parent_type": "myParent",
"score_type": "score"
}
}
}'
I've opened an issue to see if we can get this fixed: #4914
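For completeness, here is a sketch of building that workaround body in Python, with the type-qualified field name assembled from its parts. The helper name has_parent_score_query is made up for this example; the script text and the score_type setting are the ones from the curl request above.

```python
def has_parent_score_query(parent_type, weight_field):
    """Build a has_parent body whose script uses the type-qualified field name.

    Hypothetical helper illustrating the workaround above.
    """
    qualified = f"{parent_type}.{weight_field}"  # e.g. myParent.weight
    script = f'_score * doc["{qualified}"].value'
    return {
        "query": {
            "has_parent": {
                "query": {
                    "function_score": {"script_score": {"script": script}}
                },
                "parent_type": parent_type,
                "score_type": "score",
            }
        }
    }


body = has_parent_score_query("myParent", "weight")
script = body["query"]["has_parent"]["query"]["function_score"]["script_score"]["script"]
print(script)
```

Serializing the dict with json.dumps takes care of escaping the double quotes inside the script string.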
I think the problem is that you are trying to score child documents based on a field in the parent document and that the function score should really be the other way round.
To solve the problem my idea would be to store the parent/child relation and the score with the child documents. Then you would filter for child documents and score them according to the weight in the child document.
An example:
"myParent" : {
"properties" : {
"name" : {
"type" : "string"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
},
"properties": {
"weight" : {
"type" : "double"
}
}
}
Now you could use a has_parent filter to select all child documents that have a certain parent and then score them using the function score:
{
"query": {
"filtered": {
"query": {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"filter": {
"has_parent": {
"parent_type": "myParent",
"query": {
"term": {
"name": "something"
}
}
}
}
}
}
}
So if parent documents were blog posts and child documents were comments, then you could filter all posts and score the comments based on weight. I doubt that scoring children based on parents is possible, though I might be wrong :)
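A sketch of how that filtered query could be parameterized in Python, assuming the weight lives on the child as in the mapping above. The function name and its parameters are invented for illustration.

```python
def children_scored_by_own_weight(parent_type, parent_field, parent_value,
                                  weight_field="weight"):
    """Build a filtered query: has_parent as filter, function_score on the child.

    Hypothetical helper mirroring the query above.
    """
    return {
        "query": {
            "filtered": {
                "query": {
                    "function_score": {
                        "script_score": {
                            # The script reads the weight from the child document itself.
                            "script": f"_score * doc['{weight_field}'].value"
                        }
                    }
                },
                "filter": {
                    "has_parent": {
                        "parent_type": parent_type,
                        "query": {"term": {parent_field: parent_value}},
                    }
                },
            }
        }
    }


body = children_scored_by_own_weight("myParent", "name", "something")
```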
Disclaimer: 1st post to stack overflow...
