Elasticsearch - Aggregations on part of bool query

Say I have this bool query:
"bool" : {
"should" : [
{ "term" : { "FirstName" : "Sandra" } },
{ "term" : { "LastName" : "Jones" } }
],
"minimum_should_match" : 1
}
meaning I want to match all the people with first name Sandra OR last name Jones.
Now, is there any way to perform an aggregation on only the documents that matched the first term?
For example, I want to get all of the unique values of "Prizes" that anybody named Sandra has. Normally I'd just do:
"query": {
"match": {
"FirstName": "Sandra"
}
},
"aggs": {
"Prizes": {
"terms": {
"field": "Prizes"
}
}
}
Is there any way to combine the two so I only have to perform a single query which returns all of the people with first name Sandra or last name Jones, AND an aggregation only on the people with first name Sandra?
Thanks a lot!

Use post_filter.
In the query below, the bool should clause is moved into post_filter so that it does not affect the aggregation scope.
Aggregations are scoped by the main query, but they are unaffected by post_filter.
{
"from": 0,
"size": 20,
"aggs": {
"filtered_lastname": {
"filter": {
"query": {
"match": {
"FirstName": "sandra"
}
}
},
"aggs": {
"prizes": {
"terms": {
"field": "Prizes",
"size": 10
}
}
}
}
},
"post_filter": {
"bool": {
"should": [{
"term": {
"FirstName": "Sandra"
}
}, {
"term": {
"LastName": "Jones"
}
}],
"minimum_should_match": 1
}
}
}
Running a filter aggregation before aggregating on Prizes restricts the aggregation to people named Sandra, which covers your use case.
Hope this helps.
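A minimal sketch of the same request body built in Python, assuming Elasticsearch 2.x or later, where the filter aggregation takes a query directly instead of wrapping it in "query". The function name build_request and the aggregation name first_name_only are made up for illustration; the field names come from the question, and whether term or match finds "Sandra" depends on how the fields are mapped.

```python
import json


def build_request(first_name, last_name, size=20):
    """Hits match either name; the Prizes aggregation covers first_name only."""
    return {
        "from": 0,
        "size": size,
        "aggs": {
            # Hypothetical aggregation name: narrows the aggregation scope
            # to documents with the given first name.
            "first_name_only": {
                "filter": {"match": {"FirstName": first_name}},
                "aggs": {
                    "prizes": {"terms": {"field": "Prizes", "size": 10}}
                },
            }
        },
        # post_filter narrows the returned hits only; it is applied after
        # the aggregations have been computed.
        "post_filter": {
            "bool": {
                "should": [
                    {"term": {"FirstName": first_name}},
                    {"term": {"LastName": last_name}},
                ],
                "minimum_should_match": 1,
            }
        },
    }


body = build_request("Sandra", "Jones")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Because there is no top-level "query", the filter aggregation sees every document in the index before narrowing to the first name.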

Related

Elastic search bool query

My objective is to find the 10 most recent documents whose message ID is MSG-1013 and whose Severity field is Info. Both conditions must be satisfied and the text must match exactly. I have tried the search query below, but it does not give me the expected results. What am I doing wrong here?
{
"size": 10,
"query": {
"bool": {
"must": [
{
"match": { "messageId": "MSG-1013" }
},
{
"match": { "Severity": "Info" }
}
]
}
}
}
If I have understood you correctly, you want to find the 10 most recent documents with exact values for the fields "messageId" and "Severity". I assume you don't need a score, because your ordering seems to be the document timestamp or some other date field. For this purpose, you could use the bool filter in combination with a sort clause.
{
"query": {
"bool": {
"filter": [
{ "term": { "messageId": "MSG-1013" } },
{ "term": { "Severity": "Info" } }
]
}
},
"sort" : [
{ "documentTimestamp" : {"order" : "desc"}}
],
"size": 10
}
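The filter-plus-sort pattern above can be sketched as a small Python helper. This is only an illustration: the function name recent_exact_matches is hypothetical, "documentTimestamp" is the field name assumed in the answer, and term queries only match exactly when the fields are indexed without analysis (for example as keyword or not_analyzed fields).

```python
def recent_exact_matches(exact_fields, timestamp_field, size=10):
    """Build a body that filters on exact values and sorts newest first."""
    # One term clause per field; filter context skips scoring entirely.
    filters = [{"term": {field: value}} for field, value in exact_fields.items()]
    return {
        "query": {"bool": {"filter": filters}},
        "sort": [{timestamp_field: {"order": "desc"}}],
        "size": size,
    }


body = recent_exact_matches(
    {"messageId": "MSG-1013", "Severity": "Info"},
    "documentTimestamp",  # assumed date field, as in the answer above
)
```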

How can we use exists query in tandem with the search query?

I have a scenario in Elasticsearch where my indexed docs look like this:
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123", "note": "imp"}
The requirement is to run a term match query and assign a relevance score to the results, which is straightforward. The additional aspect is that any doc in the search results that has a note field should be given higher relevance. How can we achieve this with a DSL query? Using exists we can check which docs contain a note, but how do we integrate that with a match query? I have tried many ways but none worked.
With ES 5, you could boost your exists query to give a higher score to documents with a note field. For example:
{
"query": {
"bool": {
"must": {
"match": {
"name": {
"query": "your term"
}
}
},
"should": {
"exists": {
"field": "note",
"boost": 4
}
}
}
}
}
With ES 2, you could try a boosted filtered subset:
{
"query": {
"function_score": {
"query": {
"match": { "name": "your term" }
},
"functions": [
{
"filter": { "exists" : { "field" : "note" }},
"weight": 4
}
],
"score_mode": "sum"
}
}
}
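I believe you are looking for the boosting query feature. Note that the documentation describes it as a way to demote documents matching the negative clause, and recent versions document negative_boost as a value between 0 and 1.0, so the value 4 below may not behave as intended.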
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl-boosting-query.html
{
"query": {
"boosting": {
"positive": {
<put your original query here>
},
"negative": {
"exists": {
"field": "note"
}
},
"negative_boost": 4
}
}
}

How to get distinct values after a query in Elasticsearch

I use Elasticsearch like this:
{
"query": {
"match_phrase": {
"title": "my title"
}
},
"aggs": {
"unique_title": {
"cardinality": {
"field": "title"
}
}
}
}
I just want the equivalent of this SQL:
select distinct title from table where title like '%my title%'
The result gives me multiple identical results; "cardinality" doesn't work with "query".
If you don't understand me, please forgive my poor English ^_^
Cardinality aggregation calculates the count of distinct values for a field.
Hence the equivalent SQL for the Elasticsearch query you wrote would look like:
select count(distinct title) from table where title like '%my title%'
What you need to use is the Terms aggregation for getting the distinct titles.
{
"query": {
"match_phrase": {
"title": "my title"
}
},
"aggs": {
"unique_title": {
"terms": {
"field": "title"
}
}
}
}
Then look in the "aggregations" section of the search response; the distinct values are in the "buckets" array.
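As a rough illustration of reading the buckets, here is a small Python helper. The function name distinct_values and the sample response are made up for this sketch; only the "aggregations" / "buckets" / "key" layout reflects the terms aggregation response. Keep in mind that on an analyzed text field the bucket keys are individual tokens rather than whole titles, so this assumes "title" is indexed without analysis.

```python
def distinct_values(response, agg_name):
    """Return the bucket keys of a terms aggregation from a search response."""
    buckets = (
        response.get("aggregations", {})
        .get(agg_name, {})
        .get("buckets", [])
    )
    return [bucket["key"] for bucket in buckets]


# Hand-written sample in the shape of a search response, not real output.
sample_response = {
    "aggregations": {
        "unique_title": {
            "buckets": [
                {"key": "my title", "doc_count": 3},
                {"key": "my title two", "doc_count": 1},
            ]
        }
    }
}

titles = distinct_values(sample_response, "unique_title")
```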
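You can also use the query below to get the expected result (it relies on the filtered query, which was deprecated in Elasticsearch 2.0 and removed in 5.0):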
GET my_index/my_type/_search
{
"from": 0,
"size": 200,
"query": {
"filtered": {
"filter": {
"bool": {
"must": {
"query": {
"wildcard": {
"title": "*my title*"
}
}
}
}
}
}
},
"_source": {
"includes": [
"title"
],
"excludes": []
}
}

Elastic search 2.1 : Intersection of aggregations

I have some sample data in elastic search, which looks like below
Data1: {
"name": "rahul",
"socialnetwork": "facebook",
"day": 1
}
Data2: {
"name": "rahul",
"searchengine": "google",
"day": 1
}
Data3: {
"name": "vivek",
"socialnetwork": "facebook",
"day": 1
}
Data4: {
"name": "devendra",
"searchengine": "google",
"day": 2
}
Data5: {
"name": "rahul",
"socialnetwork": "facebook",
"day": 2
}
I need to get aggregations on "name" field, where socialnetwork = "facebook" and searchengine = "google".
As far as I know, we can use two aggregations and get an intersection of aggregations.
1st aggregation :
{
"query": {
"match": {
"searchengine": "google"
}
},
"aggs": {
"searcheng": {
"terms": {
"field": "name"
}
}
}
}
2nd aggregation :
{
"query": {
"match": {
"socialnetwork": "facebook"
}
},
"aggs": {
"socialnet": {
"terms": {
"field": "name"
}
}
}
}
And get the common aggregations (i.e. intersection) from both the aggregations.
But I am not able to get intersection using elastic search.
I have tried many things: sub-aggregations don't help in this case, significant terms aggregation results are not good enough, and I also tried filters and pipeline aggregations, but couldn't find a solution.
The sample data above is just a simplified version of a large data set; there are more than two filters, around 20.
No, you don't need the intersection of two aggregations.
This can be achieved with a bool query. For your desired output you can use a should clause.
{
"query": {
"bool": {
"should": [
{
"match": {
"searchengine": "google"
}
},
{
"match": {
"socialnetwork": "facebook"
}
}
],
"minimum_should_match": 1
}
},
"aggs": {
"searcheng": {
"terms": {
"field": "name",
"min_doc_count" :2
}
}
}
}
Hope it helps.
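One caveat with min_doc_count: it counts documents, so a name with two facebook documents and no google document would also pass. A possible alternative, sketched here under assumptions rather than taken from the answer, is one filter sub-aggregation per condition plus a bucket_selector pipeline aggregation (available from Elasticsearch 2.0) that keeps only names where every condition has at least one document. The function and aggregation names are hypothetical. The "params." prefix in the script is the Painless style of Elasticsearch 5 and later; on 2.x the variables are referenced without it and inline scripting has to be enabled, so pass var_prefix="" there.

```python
def names_matching_all(conditions, name_field="name", size=10, var_prefix="params."):
    """Build a body that keeps only name buckets matching every condition."""
    sub_aggs = {}
    paths = {}
    for i, (field, value) in enumerate(conditions.items()):
        agg_name = "cond_%d" % i
        # One filter sub-aggregation per condition, counted per name bucket.
        sub_aggs[agg_name] = {"filter": {"match": {field: value}}}
        paths["c%d" % i] = agg_name + ">_count"
    # Keep a bucket only if every condition matched at least one document.
    script = " && ".join("%s%s > 0" % (var_prefix, var) for var in paths)
    sub_aggs["all_present"] = {
        "bucket_selector": {"buckets_path": paths, "script": script}
    }
    any_condition = [{"match": {f: v}} for f, v in conditions.items()]
    return {
        "size": 0,
        "query": {"bool": {"should": any_condition, "minimum_should_match": 1}},
        "aggs": {
            "by_name": {
                "terms": {"field": name_field, "size": size},
                "aggs": sub_aggs,
            }
        },
    }


body = names_matching_all({"searchengine": "google", "socialnetwork": "facebook"})
```

With around 20 filters, the same function produces 20 filter sub-aggregations and one combined script, so the request stays a single search.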

Elasticsearch multi term filter

I'm quite new to Elasticsearch, so here's my question.
I wanna do a search query with elasticsearch and wanna filter with multiple terms.
If I want to search for a user 'tom', then I would like to have all the matches where the user 'isActive = 1', 'isPrivate = 0' and 'isOwner = 1'.
Here's my search query
"query":{
"filtered": {
"query": {
"query_string": {
"query":"*tom*",
"default_operator": "OR",
"fields": ["username"]
}
},
"filter": {
"term": {
"isActive": "1",
"isPrivate": "0",
"isOwner": "1"
}
}
}
}
When I use 2 terms, it works like a charm, but when I use 3 terms it doesn't.
Thanks for the help!!
You should use bool filter to AND all your terms:
"query":{
"filtered": {
"query": {
"query_string": {
"query":"*tom*",
"default_operator": "OR",
"fields": ["username"]
}
},
"filter": {
"bool" : {
"must" : [
{"term" : { "isActive" : "1" } },
{"term" : { "isPrivate" : "0" } },
{"term" : { "isOwner" : "1" } }
]
}
}
}
}
For version 2.x+ you can use a bool query instead of the filtered query with a simple replacement: https://www.elastic.co/guide/en/elasticsearch/reference/7.4/query-dsl-filtered-query.html
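The replacement is mechanical: the "query" part of a filtered query becomes the "must" clause of a bool query, and the "filter" part becomes its "filter" clause. A minimal Python sketch of that rewrite follows; the helper name filtered_to_bool is made up, and it only handles a top-level filtered query.

```python
def filtered_to_bool(query):
    """Rewrite a top-level filtered query as the equivalent bool query."""
    if "filtered" not in query:
        return query  # nothing to rewrite
    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]  # scored part
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]  # non-scored part
    return {"bool": bool_query}


old_query = {
    "filtered": {
        "query": {
            "query_string": {
                "query": "*tom*",
                "default_operator": "OR",
                "fields": ["username"],
            }
        },
        "filter": {
            "bool": {
                "must": [
                    {"term": {"isActive": "1"}},
                    {"term": {"isPrivate": "0"}},
                    {"term": {"isOwner": "1"}},
                ]
            }
        },
    }
}

new_query = filtered_to_bool(old_query)
```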
As one of the comments says, the syntax has changed in recent ES versions. If you are using Elasticsearch 6.+, and you want to use a wildcard and a sequence of terms in your query (such as in the question), you can use something like this:
GET your_index/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"your_field_name_1": {
"value": "tom*"
}
}
},
{
"term": {
"your_field_name_2": {
"value": "US"
}
}
},
{
"term": {
"your_field_name_3": {
"value": "Michigan"
}
}
},
{
"term": {
"your_field_name_4": {
"value": "0"
}
}
}
]
}
}
}
Also, from the documentation about wildcard queries:
Note that this query can be slow, as it needs to iterate over many
terms. In order to prevent extremely slow wildcard queries, a wildcard
term should not start with one of the wildcards * or ?.
I hope this helps.
