Elasticsearch: Query to filter out specific documents based on field value and return count - elasticsearch

I'm trying to compose a query in Elasticsearch that filters out documents with a specific field value and also returns the number of documents that have been filtered out, as an aggregation.
What I have so far is below. However, with my solution the documents seem to be filtered out first and the count is performed afterwards, so the count is always 0.
{
"query":{
"bool":{
"must_not":[
{
"terms":{
"gender":[
"male"
]
}
}
]
}
},
"aggs":{
"removed_docs_count":{
"filter":{
"term":{
"gender":"male"
}
}
}
}
}

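Aggregations run on the documents that the query matches, so once the query excludes the "male" documents there is nothing left for the filter aggregation to count. You don't need the query block; the aggregation alone returns the expected count.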
{
"aggs":{
"removed_docs_count":{
"filter":{
"term":{
"gender":"male"
}
}
}
}
}
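If the hits should still exclude the "male" documents while the count is reported in the same response, one option is post_filter, which Elasticsearch applies to the hits after the aggregations have been computed. The sketch below only builds the request body as a Python dict and does not call Elasticsearch; the helper name build_body is made up for illustration, and the field and value come from the question.

```python
import json


def build_body(field, value):
    """Build a search body that hides `value` docs from the hits but still counts them."""
    term = {"term": {field: value}}
    return {
        # No query block: the aggregation sees every document in the index.
        "aggs": {"removed_docs_count": {"filter": term}},
        # post_filter is applied after aggregations, so it only affects the hits.
        "post_filter": {"bool": {"must_not": [term]}},
    }


body = build_body("gender", "male")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. The doc_count of removed_docs_count then reports how many documents the post_filter dropped from the hits.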

Related

Elasticsearch sort exact matches and fuzzy matches in different sets

This is my first ever question here so I apologize if I make any mistakes.
I'm trying to run a fuzzy search (a match query with the fuzziness parameter) on my index that returns the results in alphabetical order. But I need the exact matches to come first (alphabetically ordered among themselves) and the fuzzy matches after them.
I tried the following to give exact matches higher scores, but the results are simply sorted by their scores:
{
"query":{
"bool":{
"must":[
{
"match":{
"myPropertyName":{
"query":"myWord",
"fuzziness":"AUTO"
}
}
}
],
"should":[
{
"match":{
"myPropertyName":{
"query":"myWord",
"boost":20
}
}
}
]
}
},
"sort":[
"_score",
{
"myProperty.keyword":{
"order":"asc"
}
}
],
"track_scores":true
}
Then I tried several methods to give all exact matches one identical score and all fuzzy matches another. I can do this for the fuzzy matches by using filter or constant_score, but I couldn't figure out a way to assign a custom score to the results of the should clause in my search.
How can I achieve this?
I've managed to achieve this by using a function_score query with "boost_mode": "replace" and a custom value for the weight parameter, for example "weight": "10".
{
"query":{
"function_score":{
"query":{
"bool":{
"filter":[
{
"match":{
"myPropertyName":{
"query":"myWord",
"fuzziness":"AUTO"
}
}
}
]
}
},
"boost_mode":"replace",
"functions":[
{
"filter":{
"match":{
"myPropertyName":{
"query":"myWord"
}
}
},
"weight":"10"
}
]
}
},
"sort":[
"_score",
{
"myProperty.keyword":{
"order":"asc"
}
}
],
"track_scores":true
}
Because the fuzzy match query runs in filter context, it contributes a score of 0 to every document it matches. Among these documents, the ones that also match the function's filter (the exact match) are returned with a score of 10, because of "boost_mode": "replace" and "weight": "10".
When it comes to sorting, Elasticsearch first sorts the results by their scores, since _score comes first in the sort array. Documents with the same score are then sorted alphabetically among themselves.
This worked perfectly for me.
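The two-level ordering described above can be illustrated in plain Python. This is only a local sketch of the sort semantics (score descending, then keyword ascending); the sample names and scores are made up, and nothing here talks to Elasticsearch.

```python
# Made-up hits as (name, score) pairs; a score of 10 marks an exact match.
hits = [("myWorld", 0.0), ("myWord b", 10.0), ("myWard", 0.0), ("myWord a", 10.0)]


def sort_hits(hits):
    """Order by score descending, then by name ascending, like the sort array."""
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


ordered = [name for name, _ in sort_hits(hits)]
print(ordered)
```

Since only two distinct scores exist, the score acts as a group key (exact first, fuzzy second) and the keyword field decides the order inside each group.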

Elastic Search query: filter aggregation by number range

I have a query like
{
"query":{
"bool":{
"must":[
{
"range":{
"created_date":{
"gte":1801301,
"lte":1807061
}
}
}
]
}
},
"aggs":{
"rating":{
"filters":{
"filters":{
"neutral":{
"match":{
"rating":0
}
},
"positive":{
"match":{
"rating":1
}
},
"negative":{
"match":{
"rating":2
}
}
}
}
}
},
"size":0
}
The query filters documents by created_date. The date range covers two periods, the current one and the previous one (for example, this month and the previous month). This is needed for other calculations (the original query is much bigger).
This query works, but it calculates the rating over both the current and the previous period. I need to calculate the rating over a shorter date range: created_date: 1804181-1807061.
Is there a way to do this?
You can use a range filter such as
{
"range": {
"created_date": {
"gte":"now-10d/d",
"lte":"now/d"
}
}
}
I think this will help. Let me know if you have any questions.
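The range clause above does not say where it should go. One plausible placement, sketched here as an assumption rather than as the answerer's stated intent, is a filter aggregation that wraps the existing filters aggregation, so that the outer query keeps the wide range while the rating buckets only see the shorter one. The aggregation name rating_recent and the helper rating_in_range are made up; the numeric bounds are the ones from the question.

```python
import json


def rating_in_range(gte, lte):
    """Wrap the rating `filters` aggregation in a `filter` aggregation on created_date."""
    return {
        "rating_recent": {
            # Only documents inside this narrower range reach the sub-aggregation.
            "filter": {"range": {"created_date": {"gte": gte, "lte": lte}}},
            "aggs": {
                "rating": {
                    "filters": {
                        "filters": {
                            "neutral": {"match": {"rating": 0}},
                            "positive": {"match": {"rating": 1}},
                            "negative": {"match": {"rating": 2}},
                        }
                    }
                }
            },
        }
    }


body = {
    # The query still covers both periods for the other calculations.
    "query": {"bool": {"must": [{"range": {"created_date": {"gte": 1801301, "lte": 1807061}}}]}},
    "aggs": rating_in_range(1804181, 1807061),
    "size": 0,
}
print(json.dumps(body, indent=2))
```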

Get result from aggs in script ElasticSearch/Painless

I'm new to the Elasticsearch world. I've been trying to write a simple request, and I need to get the aggs result in my script to evaluate a simple condition. Is it possible to do it this way?
The condition below is only an example.
GET _search
{
"aggs" : {
"sum_field" : { "sum" : { "field" : "someField" } }
},
"script_fields": {
"script_name": {
"script": {
"lang": "painless",
"source": """
// get the aggs result (sum_field) here
if(sum_field > 5){
return sum_field
}
"""
}
}
}
}
The requirement is to execute a sum aggregation over multiple indexes that have the same field name.
With multiple indexes, you have to check whether that field exists in each index and whether it has the same datatype.
Indexes
I've created three indexes, each having a single field called num.
index_1
- num: long
index_2
- num: long
index_3
- num: text (fielddata: true)
Notice that because the field in index_3 is of type text, I've set its property fielddata: true. If you don't set it, the query below returns the aggregation result along with an error saying that the value of a text field cannot be retrieved, because it is an analyzed string and doc can only be used for fields that are not analyzed.
Sample Query:
POST /_search
{
"size":0,
"query":{
"bool":{
"filter":[
{
"exists":{
"field":"num"
}
}
]
}
},
"aggs":{
"myaggs":{
"sum":{
"script":{
"source":"if(doc['num'].value instanceof long) return doc['num'].value;"
}
}
}
}
}
Query if you cannot set fielddata: true
In that case, you need to explicitly list the indexes on which you want to aggregate.
POST /_search
{
"size":0,
"query":{
"bool":{
"filter":[
{
"exists":{
"field":"num"
}
},
{
"terms":{
"_index":[
"index_1",
"index_2"
]
}
}
]
}
},
"aggs":{
"myaggs":{
"sum":{
"script":{
"source":"if(doc['num'].value instanceof long) return doc['num'].value;"
}
}
}
}
}
Hope this helps!
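The two request bodies above differ only in the extra terms filter on _index. A small Python sketch can build either variant; the helper name build_sum_body is hypothetical, the script source is copied unchanged from the answer, and the code only assembles a dict without contacting a cluster.

```python
import json

# Painless source copied verbatim from the queries above.
SCRIPT = "if(doc['num'].value instanceof long) return doc['num'].value;"


def build_sum_body(indexes=None):
    """Build the sum request; restrict it to `indexes` when a list is given."""
    filters = [{"exists": {"field": "num"}}]
    if indexes:
        # Limit the aggregation to indexes where num is known to be a long.
        filters.append({"terms": {"_index": list(indexes)}})
    return {
        "size": 0,
        "query": {"bool": {"filter": filters}},
        "aggs": {"myaggs": {"sum": {"script": {"source": SCRIPT}}}},
    }


print(json.dumps(build_sum_body(["index_1", "index_2"]), indent=2))
```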

Elasticsearch Prefix Exact Match

I have text fields like the ones below:
elastic|b|c
elastic,search|b|c
elastic,search,prefix|b|c
I want to query these strings by prefix. The query is:
{
"aggs":{
"field":{
"filter":{
"match":{
"field":{
"type":"prefix",
"query":"elastic|"
}
}
},
"aggs":{
"field":{
"terms":{
"field":"textField",
"size":255
}
}
}
}
}
}
and this query returns all of the texts in the example above.
Do I need an extra analyzer or token filter on the texts?
How can I do an exact-match prefix search in Elasticsearch?
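You can achieve that with a wildcard query in Elasticsearch. Include the | delimiter in the pattern so that only values starting with elastic| match, and run the query against a field that is not analyzed (for example, a keyword field), since the pattern is compared with the indexed terms.
{
"query": {
"wildcard": {
"textField": {
"value": "elastic|*"
}
}
}
}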
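Whether the delimiter is part of the pattern matters. The sketch below uses Python's fnmatch as a local stand-in for wildcard matching against whole, un-analyzed values (an assumption made for illustration; it does not call Elasticsearch) and compares the patterns elastic* and elastic|* on the three sample strings from the question.

```python
from fnmatch import fnmatchcase

# The three sample values from the question.
values = ["elastic|b|c", "elastic,search|b|c", "elastic,search,prefix|b|c"]


def matching(pattern):
    """Return the sample values that the wildcard pattern matches in full."""
    return [value for value in values if fnmatchcase(value, pattern)]


print(matching("elastic*"))   # pattern without the delimiter
print(matching("elastic|*"))  # pattern with the delimiter
```

Without the delimiter the pattern matches all three values; with it, only the value whose first segment is exactly elastic remains.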

elastic search get bucket count

I have the following query:
GET images/_search
{
"query":{
"bool":{
"must":[
{
"term":{
"appID.raw":"myApp"
}
}
]
}
},
"size":0,
"aggs":{
"perDeviceAggregation":{
"terms":{
"field":"deviceID",
"min_doc_count":50000
}
}
}
}
This query returns a "buckets" array, but I would like to return only the length of the array, without the array itself.
Explanation: the purpose of this query is to count how many devices that belong to the app "myApp" have over 50,000 images. I don't need the query to return these devices, just to know how many there are.
The terms aggregation returns buckets -- 1 bucket for each unique term of the field -- where each bucket contains the count of documents that contain the term.
It sounds like you want to know the number of unique terms instead of the document count per term. This concept is called cardinality.
There is a different aggregation to determine cardinality. Your query would look like this:
GET images/_search
{
"query":{
"bool":{
"must":[
{
"term":{
"appID.raw":"myApp"
}
}
]
}
},
"size":0,
"aggs":{
"deviceIdCardinality":{
"cardinality":{
"field":"deviceID"
}
}
}
}
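NOTE: cardinality counts are approximate. You can configure the accuracy with the aggregation's precision_threshold parameter; see the documentation for specifics. Also note that cardinality counts every distinct deviceID matched by the query, so the min_doc_count threshold of the original terms aggregation is not applied.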
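If the count has to respect min_doc_count, one simple option is to keep the original terms aggregation and count its buckets on the client side. The sketch below assumes a response shaped like a terms aggregation result; the sample response, device keys and counts are made up. The terms aggregation returns only its top buckets (10 by default), so its size would have to be raised for the count to be complete.

```python
def count_buckets(response, agg_name):
    """Return how many buckets the named terms aggregation produced."""
    return len(response["aggregations"][agg_name]["buckets"])


# Made-up response fragment in the shape of a terms aggregation result.
sample_response = {
    "aggregations": {
        "perDeviceAggregation": {
            "buckets": [
                {"key": "device-a", "doc_count": 73211},
                {"key": "device-b", "doc_count": 50410},
            ]
        }
    }
}

device_count = count_buckets(sample_response, "perDeviceAggregation")
print(device_count)
```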
