How can I restrict the results from elasticsearch to only include parent-type documents? - elasticsearch

I'm kinda stuck on this issue and struggling to find a solution:
I do have two types of entries in my shared Elasticsearch index, which are joined by a parent:child relationship.
I'd like to only receive documents of the "parent" type, but also include all "parent" documents which do not actually have child documents available.
Is there any way to implement this?
Best wishes,
Stefan

The query below is what you are looking for. It uses the Bool query.
Let's say you have the mapping below for a parent-child relationship, where question is the parent and answer is its child.
Mapping
PUT <your_index_name>
{
"mappings": {
"_doc": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": "answer"
}
}
}
}
}
}
Query
POST <your_index_name>/_search
{
"query":{
"bool":{
"must":[
{ "term":{ "my_join_field":"question" }},
{
"bool":{
"must_not":[
{ "has_child":{
"type":"answer",
"query":{ "match_all":{} }
}
}
]
}
}
]
}
}
}
The above query returns all the parent documents (question) that don't have any children (answer).
Note that if you change must_not to must in the above query, it returns all the parent documents (question) that have children (answer) ;)
Now, if you want all the parent documents regardless of whether they have children, your query would simply be in the below format:
Query for all parent documents
POST <your_index_name>/_search
{
"query":{
"bool":{
"must":[
{
"term":{
"my_join_field":"question"
}
}
]
}
}
}
Or it can be as simple as the following:
POST <your_index_name>/_search
{
"query": {
"term": {
"my_join_field": "question"
}
}
}
Basically I've implemented Term Queries.
Let me know if it helps!
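For readers calling Elasticsearch from code, the variants above can be sketched as plain Python dictionaries. This is a minimal sketch and not part of the original answer: the helper names (parents_only, parents_without_children, parents_with_children) are made up for illustration, and the field and relation names are taken from the sample mapping. The must_not clause is placed directly in the outer bool instead of a nested one, which should match the same documents.

```python
import json

JOIN_FIELD = "my_join_field"  # join field name from the sample mapping
PARENT = "question"           # parent relation name from the sample mapping
CHILD = "answer"              # child relation name from the sample mapping


def _has_child():
    # Matches parents that have at least one child of type CHILD.
    return {"has_child": {"type": CHILD, "query": {"match_all": {}}}}


def parents_only():
    """All parent documents, whether or not they have children."""
    return {"query": {"term": {JOIN_FIELD: PARENT}}}


def parents_without_children():
    """Parent documents that have no child documents."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {JOIN_FIELD: PARENT}}],
                "must_not": [_has_child()],
            }
        }
    }


def parents_with_children():
    """Parent documents that have at least one child document."""
    return {
        "query": {
            "bool": {"must": [{"term": {JOIN_FIELD: PARENT}}, _has_child()]}
        }
    }


if __name__ == "__main__":
    # Print the request body that would be sent to <your_index_name>/_search.
    print(json.dumps(parents_without_children(), indent=2))
```

Each dict is meant to be sent as the body of POST <your_index_name>/_search with whichever client you already use.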

Related

what is purpose in must nested in filter elasticsearch?

What's the difference between the following ES filter queries?
1. filter context for multi query conditions:
{
"query": {
"bool": {
"filter": [
{ "term": { "status": "published" }},
{ "range": { "publish_date": { "gte": "2015-01-01" }}}
]
}
}
}
2. must in filter context:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{ "term": { "status": "published" }},
{ "range": { "publish_date": { "gte": "2015-01-01" }}}
]
}
}
]
}
}
}
The first query is used when you just want to combine filters on different fields with AND. When you list several clauses under filter like this, they are combined with AND by default.
The second query, in your scenario, does exactly the same as the first (no difference, just two ways of doing the same thing). The reason the nested form is also possible is that it covers more complex filters that mix several AND and OR combinations.
Note that in Elasticsearch AND is expressed with must clauses, while OR is expressed with should clauses.
Let's say I want all documents having
sales from department 101 or
sales from department 101B along with price > 150.
You would probably end up writing the query in the below way:
POST sometestindex/_search
{
"query":{
"bool":{
"filter":[
{
"bool":{
"should":[
{
"term":{
"dept.keyword":"101"
}
},
{
"bool":{
"must":[
{
"term":{
"dept.keyword":"101B"
}
},
{
"range":{
"price":{
"gt":150
}
}
}
]
}
}
],
"minimum_should_match": 1
}
}
]
}
}
}
In short, for your scenario, the first query is just a shorthand for the second one. However, if you have more complex filter logic, you need to use a Bool query inside your filter, as in your second query and in the example above.
Hope that clarifies!
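To make the AND/OR nesting concrete, here is a small sketch that mirrors the department example in plain Python rather than in Elasticsearch. The helper names (must, should, term, greater_than) and the sample documents are made up for illustration, and the price check follows the stated condition price > 150.

```python
def must(*clauses):
    """AND: every clause has to match, like the clauses of bool.must."""
    return lambda doc: all(clause(doc) for clause in clauses)


def should(*clauses):
    """OR: at least one clause has to match, like bool.should
    with minimum_should_match set to 1."""
    return lambda doc: any(clause(doc) for clause in clauses)


def term(field, value):
    """Exact match on a field value."""
    return lambda doc: doc.get(field) == value


def greater_than(field, bound):
    """Strictly greater than, like a range clause with gt."""
    return lambda doc: doc.get(field) is not None and doc[field] > bound


# dept == "101" OR (dept == "101B" AND price > 150)
dept_filter = should(
    term("dept", "101"),
    must(term("dept", "101B"), greater_than("price", 150)),
)

# Hypothetical sample documents.
docs = [
    {"dept": "101", "price": 10},
    {"dept": "101B", "price": 200},
    {"dept": "101B", "price": 100},
    {"dept": "102", "price": 500},
]

matching = [doc for doc in docs if dept_filter(doc)]
print(matching)
```

Only the first two sample documents pass: the first through the department 101 branch, the second through the nested AND branch.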

Filtering and matching with an elasticsearch query

I am having trouble applying a secondary filter to my elasticsearch query below. Only the first filter is matching. I want both filters to apply to the query.
"query": {
"bool": {
"must": [
{
"bool": {
"filter": {
"range": {
"#timestamp": {
"gte": "2019-03-12",
"lte": "2019-03-13"
}
}
}
}
},
{
"bool": {
"filter": {
"bool": {
"must": {
"match": {
"msg_text": "foo AND bar"
}
}
}
}
}
}
]
}
}
I've included two solutions: the first one uses the Match query, while the second one uses Query String.
I'm also assuming the msg_text field is of type text.
The difference is that query_string uses a parser, which parses the text you pass based on operators like AND and OR.
The match query, on the other hand, analyses the text and constructs a bool query from the resulting terms. You don't write operators in the text; if you do, they are treated as ordinary words rather than as operators.
You can read more about both in the Elasticsearch documentation for the Match and Query String queries.
1. Using Match Query
POST <your_index_name>/_search
{
"query":{
"bool":{
"filter":{
"bool":{
"must":[
{
"range":{
"#timestamp":{
"gte":"2019-03-12",
"lte":"2019-03-13"
}
}
},
{
"match":{
"msg_text":"foo bar"
}
}
]
}
}
}
}
}
2. Using Query String (more fields can be added to the fields array, separated by commas)
POST <your_index_name>/_search
{
"query":{
"bool":{
"filter":{
"bool":{
"must":[
{
"range":{
"#timestamp":{
"gte":"2019-03-12",
"lte":"2019-03-13"
}
}
},
{
"query_string":{
"fields": ["msg_text"],
"query":"foo AND bar"
}
}
]
}
}
}
}
}
Technically nothing is wrong with your solution, in the sense that it would work, but I hope my answer is clear, simplifies the query, and helps you understand what you are trying to do.
Let me know if it helps!
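As a rough sketch of the same two approaches built from code, the Python helpers below assemble both request bodies. The helper names are hypothetical, and the timestamp field name is copied from the question as written. One addition that is not in the answer above: a plain match on "foo bar" combines the terms with OR by default, so the sketch sets the match query's operator parameter to "and" to require both terms, which seems closer to the intent of "foo AND bar".

```python
TIMESTAMP_FIELD = "#timestamp"  # field name copied from the question as written


def date_range(field, gte, lte):
    """Range clause with inclusive bounds."""
    return {"range": {field: {"gte": gte, "lte": lte}}}


def match_all_terms(field, text):
    # "operator": "and" requires every analysed term to be present.
    # Without it, the match query combines the terms with OR.
    return {"match": {field: {"query": text, "operator": "and"}}}


def query_string(fields, expression):
    # query_string parses operators such as AND / OR inside the expression.
    return {"query_string": {"fields": list(fields), "query": expression}}


def filtered_search(*clauses):
    """All clauses listed under bool.filter have to match (AND)."""
    return {"query": {"bool": {"filter": list(clauses)}}}


time_window = date_range(TIMESTAMP_FIELD, "2019-03-12", "2019-03-13")

with_match = filtered_search(time_window, match_all_terms("msg_text", "foo bar"))
with_query_string = filtered_search(
    time_window, query_string(["msg_text"], "foo AND bar")
)

print(with_match)
print(with_query_string)
```

Putting both clauses directly in the filter array avoids the extra nested bool, since clauses in that array are already combined with AND.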

Elasticsearch inner-hits on child type not in the query

Struggling with inner-hits on elasticsearch. Would appreciate any help.
I have two child types: childA and childB.
I am querying parents of childA like this
"query":{
"bool": {
"should": {
"has_child": {
"type": "ChildA",
"query": {
"match": {
"name": {
"query": "a"
}
}
}
}
}
}
}
My problem is how to include in the results all child docs of type childB as well without affecting results from the above query.
I was thinking to use inner-hits on a has_child query(type childB) for that but my query doesn't depend on childB type.
Anyone has an idea?
Thanks in advance
I found a way to include childB type docs.
I combine the following query with the above query (has_child on the ChildA type) in a filter query to get the ChildB docs as well. I am not really sure if it's a good way, though (thinking about performance).
{
"query":{
"bool":{
"should":[
{
"bool":{
"must_not":[
{
"has_child":{
"type":"ChildB",
"query":{
"match_all":{}
},
"inner_hits":{}
}
}
]
}
},
{
"has_child":{
"type":"ChildB",
"query":{
"match_all":{}
},
"inner_hits":{}
}
}
]
}
}
}

multiple search conditions in one query in es and distinguish the items according to the conditions

For one case I need to put multiple search conditions in one query to reduce the number of queries we need.
However, I need to distinguish the returning items based on the conditions.
Currently I achieved this goal by using function score query, specifically: each condition is assigned with a score, and I can differentiate the results based on those scores.
However, the performance is not that good. Plus now we need to get the doc count of each condition.
So is there any way to do it? I'm thinking using aggregation, but not sure if I can do it.
Thanks!
update:
curl -X GET 'localhost:9200/locations/_search?fields=_id&from=0&size=1000&pretty' -d '{
"query":{
"bool":{
"should":[
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"new york"}},{"term":{"state":"ny"}}]
}
}
}
},
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"los angeles"}},{"term":{"state":"ca"}}]
}
}
}
}
]
}
}}'
Well, to answer the first part of your question, named queries are the best option.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"field1": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"field2": {
"query": "hosted Elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
}
}
This returns an additional field called matched_queries for each hit, listing the names of the queries that matched that document.
You can find more info in the named queries documentation.
But this information can't be used for aggregation.
So you need to handle the second part of your question separately.
A filter aggregation for each query type would be the ideal solution here.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"text": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"source": {
"query": "elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
},
"aggs": {
"firstQuery": {
"filter": {
"term": {
"text": "qbox"
}
}
},
"secondQuery": {
"filter": {
"term": {
"source": "elasticsearch"
}
}
}
}
}
You can find more in the filter aggregation documentation.
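On the client side, the matched_queries field can be used to split the returned hits by condition. The sketch below assumes the usual search response shape (hits.hits with _id and matched_queries on each hit); the function name and the sample response are made up for illustration.

```python
from collections import defaultdict


def group_ids_by_matched_query(response):
    """Map each query name to the ids of the hits that matched it."""
    groups = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # A hit can match several named queries, so it may land in several groups.
        for name in hit.get("matched_queries", []):
            groups[name].append(hit["_id"])
    return dict(groups)


# Hypothetical response, trimmed to the fields used here.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["firstQuery"]},
            {"_id": "2", "matched_queries": ["firstQuery", "secondQuery"]},
            {"_id": "3", "matched_queries": ["secondQuery"]},
        ]
    }
}

groups = group_ids_by_matched_query(sample_response)
counts = {name: len(ids) for name, ids in groups.items()}
print(groups)
print(counts)
```

Counts computed this way only cover the hits returned in the current page, which is why the filter aggregations are the better fit when you need the doc count for every matching document.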

Mixing bool and multi match/function score query

I'm currently doing a query that's a mix of multi match and function score. The important bit of the JSON looks like this:
"function_score":{
"query":{
"query_string":{
"query":"some query",
"fields":["id","name","strippedDescription","colourSearch","sizeSearch"]
}
}
}
However, I also want to include results that don't necessarily match the query but have a particular numeric value that's greater than 0. I think a bool query would do this, but I don't know how to use a bool query with a function score query.
I understand that a multi match query is just shorthand for a bool query, and I could expand out the multi match query into its bool counter-part, however, I then don't know how I would do function score within that.
Any ideas? I'm on version 1.1.0 by the way.
Figured it out! I was missing the fact that you can nest multi field queries within bool queries! My final solution looks like this:
{
"query":{
"function_score":{
"query":{
"bool":{
"should": [
{
"range": {
"allBoost": {
"gt": 0
}
}
},{
"multi_match":{
"query":"some search query",
"fields":[
"id",
"name",
"description",
"category"
]
}
}
]
}
},
"functions":[
{
"filter":{
"range": {
"allBoost": {
"gt": 0
}
}
},
"script_score":{
"script":"doc['allBoost'].value"
}
},
{
"filter":{
"range": {
"allBoost": {
"lte": 0
}
}
},
"script_score":{
"script":"_score"
}
}
],
"boost_mode": "replace"
}
}
}
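As I read it, the two functions have mutually exclusive filters and boost_mode is replace, so the final score is allBoost when it is greater than 0 and the original query score otherwise. The sketch below mirrors that reading in plain Python. It is an illustration under the assumption that every document has a numeric allBoost; the function name and sample hits are hypothetical, and it does not cover documents where the field is missing.

```python
def final_score(all_boost, query_score):
    """Score a document the way the two function_score branches appear to.

    all_boost > 0  -> the score is replaced by allBoost
    all_boost <= 0 -> the score stays at the query's own _score
    """
    if all_boost > 0:
        return float(all_boost)
    return float(query_score)


# Hypothetical hits as (id, allBoost, query _score).
hits = [
    ("a", 0, 1.7),
    ("b", 5, 0.2),
    ("c", 2, 3.0),
]

ranked = sorted(hits, key=lambda h: final_score(h[1], h[2]), reverse=True)
ranked_ids = [h[0] for h in ranked]
print(ranked_ids)
```

Note that a boosted document's text relevance is discarded entirely under this scheme, so a document with allBoost 5 outranks one with allBoost 2 even if the latter matches the search text much better.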
