Trouble with has_parent query containing scripted function_score - elasticsearch

I have two document types, in a parent-child relationship:
"myParent" : {
"properties" : {
"weight" : {
"type" : "double"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
}
}
The weight field is to be used for custom scoring/sorting. This query directly against the parent documents works as intended:
{
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
}
}
However, when trying to do similar scoring for the child documents with a has_parent query, I get an error:
{
"query" : {
"has_parent" : {
"query" : {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"parent_type" : "myParent",
"score_type" : "score"
}
}
}
The error is:
QueryPhaseExecutionException[[myIndex][3]: query[filtered(ParentQuery[myParent](filtered(function score (ConstantScore(:),function=script[_score * doc['weight'].value], params [null]))->cache(_type:myParent)))->cache(_type:myChild)],from[0],size[10]: Query Failed [failed to execute context rewrite]]; nested: ElasticSearchIllegalArgumentException[No field found for [weight] in mapping with types [myChild]];
It seems like instead of applying the scoring function to the parent and then passing its result to the child, ES is trying to apply the scoring function itself to the child, causing the error.
If I don't use score for score_type, the error doesn't occur, although the results scores are then all 1.0, as documented.
What am I missing here? How can I query these child documents with custom scoring based on a parent field?

This I would say is a bug: it is using the myChild mapping as the default context, even though you are inside a has_parent query. But I'm not sure how easy the bug would be to fix properly.
However, you can work around it by including the type name in the full field name:
curl -XGET "http://localhost:9200/t/myChild/_search" -d'
{
"query": {
"has_parent": {
"query": {
"function_score": {
"script_score": {
"script": "_score * doc[\"myParent.weight\"].value"
}
}
},
"parent_type": "myParent",
"score_type": "score"
}
}
}'
I've opened an issue to see if we can get this fixed: #4914.
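The workaround above can be sketched as a small request-body builder. This is only an illustration: the helper name has_parent_script_score is hypothetical, and the sketch just assembles the JSON body without sending it to a cluster.

```python
import json


def has_parent_script_score(parent_type: str, field: str) -> dict:
    """Build a has_parent query whose script reads a type-qualified parent field."""
    # Prefix the field with the parent type so the lookup is not resolved
    # against the child mapping (the workaround described above).
    qualified = f"{parent_type}.{field}"
    script = f"_score * doc['{qualified}'].value"
    return {
        "query": {
            "has_parent": {
                "query": {
                    "function_score": {"script_score": {"script": script}}
                },
                "parent_type": parent_type,
                "score_type": "score",
            }
        }
    }


body = has_parent_script_score("myParent", "weight")
print(json.dumps(body, indent=2))
```

The printed body has the same shape as the curl payload above, with the script reading doc['myParent.weight'].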

I think the problem is that you are trying to score child documents based on a field in the parent document and that the function score should really be the other way round.
To solve the problem my idea would be to store the parent/child relation and the score with the child documents. Then you would filter for child documents and score them according to the weight in the child document.
An example:
"myParent" : {
"properties" : {
"name" : {
"type" : "string"
}
}
}
"myChild" : {
"_parent" : {
"type" : "myParent"
},
"_routing" : {
"required" : true
},
"properties": {
"weight" : {
"type" : "double"
}
}
}
Now you could use a has_parent filter to select all child documents that have a certain parent and then score them using the function score:
{
"query": {
"filtered": {
"query": {
"function_score" : {
"script_score" : {
"script" : "_score * doc['weight'].value"
}
}
},
"filter": {
"has_parent": {
"parent_type": "myParent",
"query": {
"term": {
"name": "something"
}
}
}
}
}
}
}
So if the parent documents were blog posts and the children were comments, you could filter by post and score the comments based on their weight. I doubt that scoring children based on parents is possible, though I might be wrong :)
Disclaimer: 1st post to stack overflow...

Related

Elasticsearch query all documents where keyword value is greater than X [7.2]

I am trying to find all documents that have a name that is over 32 characters in length.
This is the mapping of the document.
export const boards = {
handle: {
type: "text"
},
name: {
type: "keyword"
},
};
I tried to use Painless to check the length of the field, but the following query returned no results even though matching documents exist.
Query
GET /_search
{
"query": {
"bool" : {
"filter" : {
"script" : {
"script" : {
"source": "doc['name'].size() > 32",
"lang": "painless"
}
}
}
}
}
}
I am thinking that it is perhaps related to the keyword type being used.
I figured out from this forum post that size() is not the correct method for a keyword field: it returns the number of values stored in the field, not the length of the string. Instead you need to use
.value.length(), making my final query look like the following:
{
"query": {
"bool" : {
"filter" : {
"script" : {
"script" : {
"source": "doc['name'].value.length() > 32",
"lang": "painless"
}
}
}
}
}
}
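The difference between the two calls can be shown with a toy model in which a field's doc values are a plain list of strings. The function names below are hypothetical analogues, not Painless APIs.

```python
from typing import List


def doc_size(values: List[str]) -> int:
    # Analogue of doc['name'].size(): how many values the field holds.
    return len(values)


def first_value_length(values: List[str]) -> int:
    # Analogue of doc['name'].value.length(): characters in the first value.
    return len(values[0])


# A single-valued keyword field holding a 40-character name.
name_values = ["x" * 40]

print(doc_size(name_values) > 32)            # False: the field has one value
print(first_value_length(name_values) > 32)  # True: that value is 40 chars long
```

In this model, a single-valued field always has size 1, which would explain why the first query matched nothing.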

elastic search join query

I am having a hard time doing join queries in Elasticsearch. My use case: given a particular field and value in a child document, retrieve the parent document.
I have established a parent-child relationship between the two document types following the documentation. When I query, I don't get an error, but I don't get any results either. This is the query:
GET my_index/_search
{
"query": {
"has_parent" : {
"parent_type" : "<parent_type>",
"query" : {
"match" : {
"name": "child_name"
}
}
}
}
}
I have loaded parent documents and child documents in the same index "my_index". I have established the mapping before loading documents. My mapping is as follows
PUT my_index
{
"mappings": {
"doc": {
"properties": {
"parent_join": {
"type": "join",
"relations": {
"<parent_type>": "<child_type>"
}
}
}
}
}
}
I have set the routing value to the parent id while loading the documents.
You've said you want to retrieve the parent document based on a child's field, but your query does the opposite. You want has_child:
{
"query": {
"has_child" : {
"type" : "<child_type>",
"query" : {
"term" : {
"name" : "child_name"
}
}
}
}
}
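The direction of the two queries can be illustrated with a toy in-memory model. This is not how Elasticsearch executes joins; the documents and the has_child / has_parent helper functions below are made up for the example.

```python
docs = [
    {"id": "p1", "type": "parent", "name": "parent_one"},
    {"id": "p2", "type": "parent", "name": "parent_two"},
    {"id": "c1", "type": "child", "parent": "p1", "name": "child_name"},
    {"id": "c2", "type": "child", "parent": "p2", "name": "other_child"},
]


def has_child(docs, name):
    """Return ids of parents with at least one child whose name matches."""
    parent_ids = {
        d["parent"] for d in docs if d["type"] == "child" and d["name"] == name
    }
    return sorted(
        d["id"] for d in docs if d["type"] == "parent" and d["id"] in parent_ids
    )


def has_parent(docs, name):
    """Return ids of children whose parent's name matches."""
    parent_ids = {
        d["id"] for d in docs if d["type"] == "parent" and d["name"] == name
    }
    return sorted(
        d["id"] for d in docs if d["type"] == "child" and d["parent"] in parent_ids
    )


print(has_child(docs, "child_name"))   # ['p1']
print(has_parent(docs, "child_name"))  # []
```

In this model the has_parent form applies the name condition to parents, so searching for a child's name there finds nothing, which matches the empty result described in the question.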

sorting/scoring parent documents based on child field values

Types:
1) Parent type: "product"
2) Child type: "ratings"
Description of problem: I have an ES query (query-1) that works fine and fetches results from the parent type (product). I have now added a new type (ratings) as a child of "product", and I need to extend the existing query so that the results are sorted by the field values of the matching child (ratings) documents.
query-1:
{
"bool" : {
"must" : [ {
"has_parent" : {
"query" : {
"bool" : {
"must" : {
"term" : {
"searchable" : "true"
}
}
}
},
"parent_type" : "supplierparent",
"inner_hits" : { }
}
}, {
"bool" : {
"must" : {
"query" : {
"simple_query_string" : {
"query" : "rice"
}
}
}
}
}, {
"nested" : {
"query" : {
"term" : {
"productStatusList.status" : "Approved"
}
},
"path" : "productStatusList"
}
} ]
}
}
I think there should be better ways to do this, as there are for sorting on attributes of nested documents. But with a parent-child relation, sorting can only be done on an attribute of the current type.
Since parents and children are stored as completely separate documents, it is not possible to use a child property to sort the parents. There can also be multiple children for the same parent doc, which would make the sort order ambiguous.
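The ambiguity mentioned above can be made concrete with a toy example: when a product has several ratings, the ranking depends on how those ratings are reduced to one value. The product names and numbers below are invented for illustration.

```python
from statistics import mean

# Hypothetical child ratings grouped by parent product id.
ratings_by_product = {
    "rice_a": [5.0, 1.0],
    "rice_b": [3.0, 3.5],
}


def rank(reducer):
    """Order products by a single value reduced from their child ratings."""
    return sorted(
        ratings_by_product,
        key=lambda product: reducer(ratings_by_product[product]),
        reverse=True,
    )


print(rank(max))   # ['rice_a', 'rice_b']
print(rank(min))   # ['rice_b', 'rice_a']
print(rank(mean))  # ['rice_b', 'rice_a']
```

Any approach that orders parents by child values has to pick one such reducer first. The has_child query exposes a comparable choice through its score mode option, though whether that fits this particular query is not tested here.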

Query documents with access control filter

Each document in my Elasticsearch index has two access control lists containing user ids. One is an allow list, the other is a deny list. I am trying to add a filter to a given query that considers these ACLs. I thought I could use a bool query with a must clause for the given query, a filter clause for the allow list, and a must_not clause for the deny list. What I have so far (example for user 1):
{
"bool" : {
"must" : {
[given query]
},
"filter" : [ {
"match" : {
"acl.allow" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}],
"must_not" : [ {
"match" : {
"acl.deny" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}]
}
}
Unfortunately, this query does not return the desired result. It returns objects that have not listed user 1 in their allow list (a behavior I don't understand). Also, it (obviously) ignores objects with empty access control lists (which should be visible to anyone). Any suggestions to fix that?
I figured it out. First of all, using match isn't really a good solution for that kind of query, because the query string goes through the field's analyzer. Using term, though, left me puzzled as to why I did not get any results. A term query looks up the exact value in the index, so for a value like /user/1 it only returns results if the corresponding field is set to not_analyzed. Thus I changed my mapping:
"acl": {
"properties": {
"allow": {
"type": "string",
"index": "not_analyzed"
},
"deny": {
"type": "string",
"index": "not_analyzed"
}
}
}
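Why match over-matched and term under-matched can be sketched with a rough stand-in for analysis. The function rough_standard_analyze is a hypothetical approximation (lowercase, split on non-alphanumerics), not the real standard analyzer, and the matching helpers are simplified models of match and term.

```python
import re


def rough_standard_analyze(text):
    # Rough approximation only: lowercase, then split on anything that is
    # not a letter or digit, dropping empty tokens.
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def match_any(indexed_value, query):
    # Model of a match query on an analyzed field: the query text is analyzed
    # too, and any shared token counts as a hit (OR semantics).
    indexed = set(rough_standard_analyze(indexed_value))
    return any(tok in indexed for tok in rough_standard_analyze(query))


def term_on_analyzed(indexed_value, term):
    # Model of a term query on an analyzed field: the raw term is compared
    # against the analyzed tokens without being analyzed itself.
    return term in rough_standard_analyze(indexed_value)


def term_on_not_analyzed(indexed_value, term):
    # Model of a term query on a not_analyzed field: exact comparison.
    return term == indexed_value


print(rough_standard_analyze("/user/1"))         # ['user', '1']
print(match_any("/user/2", "/user/1"))           # True
print(term_on_analyzed("/user/1", "/user/1"))    # False
print(term_on_not_analyzed("/user/1", "/user/1"))  # True
```

Under these assumptions, match hits other users' entries through the shared token "user", and term finds nothing until the field stores the value unanalyzed.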
My second problem—treating objects with empty ACLs as visible to anyone—was solved using exists nested in must_not nested in bool. This is recommended as substitute for the deprecated missing query. My final query looks like this and passed all ACL related tests I could think of.
{
"bool" : {
"must" : {
[given query]
},
"filter" : {
"bool" : {
"should" : [ {
"terms" : {
"acl.allow" : [ "/user/1" ]
}
}, {
"bool" : {
"must_not" : {
"exists" : {
"field" : "acl.allow"
}
}
}
} ]
}
},
"must_not" : {
"terms" : {
"acl.deny" : [ "/user/1" ]
}
}
}
}
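The access rule that the final query encodes can be written as a plain predicate, which is handy for checking expectations against test documents. This is a sketch of the intended logic only; is_visible is a hypothetical helper, and it assumes an empty or absent allow list means "visible to anyone", as stated above.

```python
def is_visible(doc, user):
    """Return True if `user` may see `doc` under the allow/deny rules."""
    acl = doc.get("acl", {})
    allow = acl.get("allow") or []
    deny = acl.get("deny") or []
    # Filter clause: user is on the allow list, OR the allow list is empty/missing.
    allowed = (user in allow) or not allow
    # must_not clause: user must not be on the deny list.
    return allowed and user not in deny


user = "/user/1"
print(is_visible({"acl": {"allow": ["/user/1"]}}, user))  # True
print(is_visible({"acl": {"allow": ["/user/2"]}}, user))  # False
print(is_visible({}, user))                               # True
print(is_visible({"acl": {"allow": ["/user/1"], "deny": ["/user/1"]}}, user))  # False
```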

Elastic(search): How to structure nested queries correctly?

I'm currently quite confused about the structuring of queries in Elasticsearch. Let me explain what I mean with the following template, which works fine for me:
{
"template" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"must" : [
{ "match" : {
"user" : "{{param_user}}"
} },
{ "match" : {
"session" : "{{param_session}}"
} },
{ "range" : {
"date" : {
"gte" : "{{param_from}}",
"lte" : "{{param_to}}"
}
} }
]
}
}
}
}
}
}
Ok, so I want to get the entries of a specific session of a user in a certain time period. Now if you take a look at this link http://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html you can find the following query:
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
In this example, the "filter" keyword comes right after "filtered". However, if I replace my second "query" with a "filter" as in the example, my template won't work anymore. This is really counterintuitive and I spent a lot of time figuring it out. A̶l̶s̶o̶ ̶I̶ ̶d̶o̶n̶'̶t̶ ̶u̶n̶d̶e̶r̶s̶t̶a̶n̶d̶ ̶w̶h̶y̶ ̶w̶e̶ ̶n̶e̶e̶d̶ ̶t̶o̶ ̶p̶u̶t̶ ̶e̶v̶e̶r̶y̶ ̶f̶i̶l̶t̶e̶r̶ ̶i̶n̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶ ̶̶{̶ ̶}̶̶ ̶e̶v̶e̶n̶ ̶t̶h̶o̶u̶g̶h̶ ̶t̶h̶e̶y̶ ̶a̶r̶e̶ ̶a̶l̶r̶e̶a̶d̶y̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶d̶ ̶b̶y̶ ̶t̶h̶e̶ ̶a̶r̶r̶a̶y̶ ̶s̶y̶n̶t̶a̶x̶.̶
Another issue: I assumed that to match several fields I could just write something like:
{
"query" : {
"match" : {
"user" : "{{param_user}}",
"session" : "{{param_session}}"
}
}
}
but it seems that I have to use a bool query, which I didn't know about, so I searched for 'elastic multi match' but got something completely different.
My question: where can I find how to structure a query properly (something like a PEG)? The documentation only gives basic examples but doesn't state what we can actually do and how.
Best regards,
Jan
Edit: Ok I just found by accident that I cannot exchange "query" with "filter" as "match" is a query and not a filter. But then again what about "range"? It seems to be a query as well as a filter... Is there a summary of keywords specifying in which context they can be used?
Is there a summary of keywords specifying in which context they can be used?
I wouldn't think of them as keywords. It's just that there are both queries and filters with the same names (but not all of them).
Here is everything you need. For example there are both range query and filter. All you need is to understand the difference between filters and queries.
For example, if you want to move the range section from the query to the filter, you can do it as shown in the code below (not tested). Since your code already contains a filtered query, you can just create a filter section right after the query section.
{
"template": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"user": "{{param_user}}"
}
},
{
"match": {
"session": "{{param_session}}"
}
}
]
}
},
"filter": {
"range": {
"date": {
"gte": "{{param_from}}",
"lte": "{{param_to}}"
}
}
}
}
}
}
}
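Just remember that filters such as term compare against the exact indexed values, so on string fields they behave predictably only if the field is not analyzed.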
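The split between scoring clauses and filter clauses in the template above can be sketched as a small builder. The helper name filtered_query is hypothetical; the sketch only assembles the body and keeps the template placeholders as literal strings.

```python
import json


def filtered_query(scoring_clauses, filter_clause):
    """Wrap scoring clauses and a non-scoring filter in a `filtered` query."""
    return {
        "query": {
            "filtered": {
                # Clauses here contribute to the relevance score.
                "query": {"bool": {"must": scoring_clauses}},
                # The filter only includes or excludes documents.
                "filter": filter_clause,
            }
        }
    }


body = filtered_query(
    scoring_clauses=[
        {"match": {"user": "{{param_user}}"}},
        {"match": {"session": "{{param_session}}"}},
    ],
    filter_clause={
        "range": {"date": {"gte": "{{param_from}}", "lte": "{{param_to}}"}}
    },
)
print(json.dumps(body, indent=2))
```

Keeping the two kinds of clauses as separate arguments makes it harder to put a query-only clause such as match into the filter slot by accident.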
