Make Elasticsearch return the number of all documents on query

When I do a query Elasticsearch returns how many hits I get. Can I also get it to reply how many documents it has in total?
Here I've added the imaginary field sum_documents to the result. Does such a thing exist, or do I have to make an extra query to fetch the sum?
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"sum_documents": 500,
"max_score" : null,
"hits" : [ ]
}
}

You can add a global aggregation to your query. It ignores the query itself and returns, as its doc_count, the total document count in your search context (index/alias + type(s)):
{
"query": {
"query_string": {
"query": "viking",
"default_operator": "AND"
}
},
"aggs": {
"harvester-test": {
"global": {}
}
}
}
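To make it concrete where the total ends up, here is a minimal Python sketch that reads it out of a search response. The response dict is a hand-written, hypothetical example shaped like the output in the question (it is not real cluster output), and the helper name total_documents is made up for illustration; only the aggregation name harvester-test comes from the answer above.

```python
def total_documents(response, agg_name="harvester-test"):
    """Return the doc_count reported by a global aggregation."""
    return response["aggregations"][agg_name]["doc_count"]


# Hypothetical response: no hits for the query, 500 documents overall.
sample_response = {
    "hits": {"total": 0, "max_score": None, "hits": []},
    "aggregations": {"harvester-test": {"doc_count": 500}},
}

matching = sample_response["hits"]["total"]
total = total_documents(sample_response)
print(f"{matching} of {total} documents matched")
```

The point of the sketch is only that the total sits under aggregations, next to hits, so no second request is needed.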

Related

Elastic query with an aggregation on a nested field and a sub aggregation on field (from root) returns empty array of buckets

I wrote a query with an aggregation on a nested field and a sub-aggregation on a field that is not nested but lives in the root document. I expected to get a sum for each ownerId, but instead I got an empty bucket array.
The following query returns an empty array of buckets - though there are results and a positive sum.
GET my-index/_search
{
"size": 0,
"aggs": {
"agg_owner": {
"nested": {
"path": "owner_fields"
},
"aggs": {
"raw_names": {
"terms": {
"field": "owner_fields.id.keyword",
"size": 10
},
"aggs": {
"total_amount": {
"reverse_nested": {},
"aggs": {
"total_inner_amount": {
"terms": {
"field": "amount",
"size": 10
}
}
}
}
}
}
}
}
}
}
returns:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 45430,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"agg_owner" : {
"doc_count" : 15494,
"raw_names" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
}
}
I expected to see a sum for each ownerId, but that didn't happen.
I had to remove the keyword suffix from the field name in the terms aggregation:
"owner_fields.id.keyword" => "owner_fields.id"

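As a rough illustration of the fix for the nested aggregation question above, the Python sketch below rebuilds that request body with the field name as a parameter. The function name owner_amount_aggs is hypothetical. That owner_fields.id has no .keyword sub-field in this index (for example because it is already mapped as keyword) is an assumption; the source only says that dropping the suffix made the buckets appear.

```python
import json


def owner_amount_aggs(owner_id_field="owner_fields.id"):
    """Build the nested -> terms -> reverse_nested request from the question."""
    return {
        "size": 0,
        "aggs": {
            "agg_owner": {
                "nested": {"path": "owner_fields"},
                "aggs": {
                    "raw_names": {
                        # Field without the ".keyword" suffix (assumed mapping).
                        "terms": {"field": owner_id_field, "size": 10},
                        "aggs": {
                            "total_amount": {
                                # Step back to the root document to reach "amount".
                                "reverse_nested": {},
                                "aggs": {
                                    "total_inner_amount": {
                                        "terms": {"field": "amount", "size": 10}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


print(json.dumps(owner_amount_aggs(), indent=2))
```

The inner aggregation is kept as terms on amount, exactly as in the question. If a single total per owner is what is wanted, a sum aggregation on amount would be the usual choice there.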
Bucket Script Aggregation - Elastic Search

I'm trying to build a query in Elasticsearch to get the difference between two values:
Here's the code I'm using:
GET /monitora/_search
{
"size":0,
"aggs": {
"CALC_DIFF": {
"filters": {
"filters": {
"FTS_callback": {"term":{ "msgType": "panorama_fts"}},
"FTS_position": {"term":{ "msgType": "panorama_position"}}
}
},
"aggs": {
"subtract": {
"bucket_script": {
"buckets_path": {
"PCountCall": "_count",
"PcountPos":"_count"
},
"script": "params.PCountCall - params.PcountPos"
}
}
}
}
}
}
And this is what I get back when I run it:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"CALC_DIFF" : {
"buckets" : {
"FTS_callback" : {
"doc_count" : 73530,
"subtract" : {
"value" : 0.0
}
},
"FTS_position" : {
"doc_count" : 156418,
"subtract" : {
"value" : 0.0
}
}
}
}
}
}
However, instead of a subtraction inside each bucket (which will always be zero), I was looking for the difference between the counts of the two buckets, which in this example would be 73530 - 156418.
After that, I would like to display the result as a "metric" visualization element in Kibana. Is it possible?
Could anyone give me a hand to get it right?
Thanks in advance!
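One simple way to get the number the question asks for is to compute it on the client from the two bucket counts. The Python sketch below does that against a trimmed copy of the response shown above. The function name count_difference is hypothetical, and this is a client-side workaround under that assumption, not a server-side bucket_script and not a Kibana visualization.

```python
def count_difference(response, agg_name="CALC_DIFF",
                     minuend="FTS_callback", subtrahend="FTS_position"):
    """Subtract one filters-bucket doc_count from another."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return buckets[minuend]["doc_count"] - buckets[subtrahend]["doc_count"]


# Trimmed copy of the response from the question.
sample_response = {
    "aggregations": {
        "CALC_DIFF": {
            "buckets": {
                "FTS_callback": {"doc_count": 73530, "subtract": {"value": 0.0}},
                "FTS_position": {"doc_count": 156418, "subtract": {"value": 0.0}},
            }
        }
    }
}

print(count_difference(sample_response))  # 73530 - 156418
```

Inside the filters aggregation, the bucket_script runs once per bucket and both _count paths point at that same bucket, which is why every subtract value comes back as 0.0.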

What is the difference in these elasticsearch queries?

I have the following elasticsearch query that returns plenty of results.
{
"query": {
"multi_match": {
"query": "swartz",
"fields": ["notes"]
}
},
"size": 20,
"from": 0,
"sort": {
"last_modified_date": {
"order": "desc"
}
}
}
I'm trying to redo it as a bool query so I can add should and must_not, but am getting no results and I'm not sure why.
{
"query": {
"bool": {
"must": [
{"term": { "notes": "swartz" }}
]
}
},
"size": 20,
"from": 0,
"sort": {
"last_modified_date": {
"order": "desc"
}
}
}
Instead of results, what I do get is this.
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 6,
"successful" : 5,
"skipped" : 0,
"failed" : 1,
"failures" : [
{
"shard" : 0,
"index" : ".kibana_1",
"node" : "E2fjoon_Smm5m7LFcQp9XQ",
"reason" : {
"type" : "query_shard_exception",
"reason" : "No mapping found for [last_modified_date] in order to sort on",
"index_uuid" : "0pZdhm_nRXWiWGcqFgvvHQ",
"index" : ".kibana_1"
}
}
]
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
First, I'm not sure why I get results and it orders properly with the first query, and secondly, even if I take the sort out of the second query I still get no results.
In the first query you use a match-type query, which analyzes the search text and looks for the term "swartz" anywhere in the content of "notes".
In SQL terms it is roughly:
where notes ilike '%swartz%'
In the second query you use a term query, which does not analyze the search text and looks for an exact match against the indexed terms of the field.
In SQL terms, roughly:
where notes = 'swartz'
That probably explains the behavior you are seeing.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
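Following that explanation, a bool query that keeps the analyzed matching of the first query could look like the Python sketch below. It assumes "notes" is an analyzed text field, which the question does not state. The function name notes_bool_query and the must_not value "draft" are made up for illustration.

```python
import json


def notes_bool_query(text, should=None, must_not=None):
    """Wrap a match on "notes" in a bool query with optional extra clauses."""
    bool_clause = {"must": [{"match": {"notes": text}}]}
    if should:
        bool_clause["should"] = list(should)
    if must_not:
        bool_clause["must_not"] = list(must_not)
    return {
        "query": {"bool": bool_clause},
        "size": 20,
        "from": 0,
        "sort": {"last_modified_date": {"order": "desc"}},
    }


# Hypothetical exclusion clause, only to show where must_not goes.
body = notes_bool_query("swartz", must_not=[{"match": {"notes": "draft"}}])
print(json.dumps(body, indent=2))
```

Separately, the shard failure in the response above is reported for the .kibana_1 index, which has no last_modified_date mapping. That suggests the search ran against more than the intended index, which is a different issue from the match versus term difference.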

filtering on 2 values of same field

I have a status field, which can have one of the following values,
I can filter for data which have status completed. I can also see data which has ongoing.
But I want to display the data which have status completed and ongoing at the same time.
But I don't know how to add filters for 2 values on a single field.
How can I achieve what I want ?
EDIT - Thanks for the answers, but that is not what I wanted.
As shown here, I have filtered for status:completed; I want to filter for two values in exactly the same way.
I know I can edit this filter and use your queries, but I need a simpler way to do this (the query approach is complex), because I have to show it to my marketing team, who have no experience with queries. I need to convince them.
If I understand your question correctly, you want to perform an aggregation on 2 values of a field.
This should be possible with a query similar to this one with a terms query:
{
"size" : 0,
"query" : {
"bool" : {
"must" : [ {
"terms" : {
"status" : [ "completed", "unpaid" ]
}
} ]
}
},
"aggs" : {
"freqs" : {
"terms" : {
"field" : "status"
}
}
}
}
This will give a result like this one:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"freqs" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "unpaid",
"doc_count" : 4
}, {
"key" : "completed",
"doc_count" : 1
} ]
}
}
}
Here is my toy mapping definition:
{
"bookings" : {
"properties" : {
"status" : {
"type" : "keyword"
}
}
}
}
You need a filter in aggregation.
{
"size": 0,
"aggs": {
"agg_name": {
"filter": {
"bool": {
"should": [
{
"terms": {
"status": [
"completed",
"ongoing"
]
}
}
]
}
}
}
}
}
Use the above query to get results like this:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 8,
"max_score": 0,
"hits": []
},
"aggregations": {
"agg_name": {
"doc_count": 6
}
}
}
The result you want is the doc_count.
For reference, in an Elasticsearch bool query the should clauses behave like OR conditions:
{
"query":{
"bool":{
"should":[
{"term":{"status":"completed"}},
{"term":{"status":"ongoing"}}
]
}
},
"aggs" : {
"booking_status" : {
"terms" : {
"field" : "status"
}
}
}
}
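The OR behaviour of should can be sketched in Python as follows. Each should clause has to be an actual query such as term, and a single terms query over the same values expresses the same filter more compactly. Both function names are hypothetical, and the sketch assumes status is a keyword field as in the toy mapping above.

```python
def either_status_query(statuses, field="status"):
    """bool/should with one term clause per value (acts like OR)."""
    return {
        "query": {"bool": {"should": [{"term": {field: s}} for s in statuses]}},
        "aggs": {"booking_status": {"terms": {"field": field}}},
    }


def terms_status_query(statuses, field="status"):
    """Same filter written as a single terms query."""
    return {
        "query": {"terms": {field: list(statuses)}},
        "aggs": {"booking_status": {"terms": {"field": field}}},
    }


wanted = ["completed", "ongoing"]
print(either_status_query(wanted))
print(terms_status_query(wanted))
```

With no must or filter clause present, at least one should clause has to match, so a document passes when its status is either of the listed values.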

Aggregate filtered result using Elastic Search API

I would like to aggregate and count the number of documents that match my filtering rules.
I looked at the API from their website: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html
and came out with this:
{ "size": 0,
"aggregations": {
"messages": {
"filters":{
"filters": {
"knowledge service": { "match": {"syslog_msg": "my-domain.com"}}
}
}
}
}
}
"syslog_msg" can contain information such as "my-domain.com some other value".
The response I got:
{
"_scroll_id" : "some scroll id",
"took" : 89,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1000000,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"messages" : {
"buckets" : {
"knowledge service" : {"doc_count" : 12000}
}
}
}
}
It seems to work fine, but when I ran a query to look at the 12000 records, some of them did not contain an exact match for the string I searched for (in this case my-domain.com).
For example, some docs have the string "my" in syslog_msg instead of "my-domain.com".
How do I change the query so that it filters the exact match for the string that I am looking for?
The solution is to replace match with match_phrase, which only matches documents that contain the search terms together as a phrase, in the same order.
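Applied to the request from the question, the change could look like the Python sketch below. The function name phrase_count_aggs is hypothetical; the field, bucket name and search string are the ones from the question.

```python
import json


def phrase_count_aggs(phrase, field="syslog_msg", bucket="knowledge service"):
    """filters aggregation whose single bucket uses match_phrase instead of match."""
    return {
        "size": 0,
        "aggregations": {
            "messages": {
                "filters": {
                    "filters": {bucket: {"match_phrase": {field: phrase}}}
                }
            }
        },
    }


print(json.dumps(phrase_count_aggs("my-domain.com"), indent=2))
```

How the phrase is split into tokens still depends on the analyzer of syslog_msg, so this narrows the match to the whole phrase rather than guaranteeing a byte-for-byte comparison.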
You can also add sub-aggregations to your filter.
As the Elasticsearch documentation shows (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html):
{
"aggs" : {
"red_products" : {
"filter" : { "term": { "color": "red" } },
"aggs" : {
"avg_price" : { "avg" : { "field" : "price" } }
}
}
}
}
