ElasticSearch: Aggregations on one field not working

I have a few documents in one index in Elasticsearch. When I aggregate on one of the fields, I do not get any results. The field's mapping is:
{
  "type": "string",
  "index": "not_analyzed"
}
I have another field that is indexed in the same manner, and aggregations on that field work. What are the possible causes, and how do I narrow down the issue?
Edit: The Elasticsearch version is 1.6.0 and I am running the following query for the aggregation:
{
  "aggregations": {
    "aggr_name": {
      "terms": {
        "field": "storeId",
        "size": 100
      }
    }
  }
}
where "storeId" is the field I am aggregating on. The same aggregation works on another field with the same mapping.
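One way to narrow it down is to ask Elasticsearch for the live mapping of the field, since the mapping actually stored in the index may differ from the one you intended to apply (for example, if it changed after documents were indexed). A sketch, with a placeholder index name:

```json
GET /my_index/_mapping/field/storeId

GET /my_index/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "exists": { "field": "storeId" }
      }
    }
  }
}
```

If the second request reports zero hits, the documents do not actually contain a storeId value (e.g. the field is spelled differently in _source), which would explain an empty terms aggregation.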

Related

ElasticSearch REST API Aggregating Text Field

So I'm brand new to ElasticSearch/Kibana, trying to write a simple curl command that hits Elastic's REST API and returns the number of logs containing a given string of text. But I'm getting the following error:
"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [timestamp] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
My code is as follows:
{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "gte": "2021-06-15",
        "lte": "2021-06-23"
      }
    }
  },
  "aggs": {
    "hit_count_per_day": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day"
      }
    }
  }
}
Where should I be adding this "fielddata=true" value mentioned in the error? Can anyone point me towards a reference doc for ElasticSearch API syntax?
Based on the error you are getting, the timestamp field is mapped as text, and you cannot perform aggregations on text fields without enabling fielddata, which is memory-hungry and rarely advisable. Since you are using a date_histogram aggregation, the timestamp field should instead be mapped as type date.
Modify your index mapping as shown below:
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date"
      }
    }
  }
}
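Note that an existing field's type cannot be changed in place. A sketch of the usual fix, assuming hypothetical index names, is to create a new index with the corrected mapping and reindex into it:

```json
PUT /logs_v2
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" }
    }
  }
}

POST /_reindex
{
  "source": { "index": "logs" },
  "dest": { "index": "logs_v2" }
}
```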

Installed mapper-size plugin and added it to index mapping but no _size field is showing up in indexes

We are trying to figure out which documents in our Elasticsearch (version 7.0.1) index are consuming the most disk space. We found the mapper-size plugin provided by Elastic. We installed the plugin on all Elasticsearch data/master nodes, and restarted the ES service on each one. We also added the _size field to the index pattern mapping. However, the _size field is not showing up. This index is fed by several Filebeat services running on our application servers, and the index rolls over each night.
We tried creating a brand new index that matches the index pattern. The _size field was present in the mapping:
"application_log_test": {
  "mappings": {
    "_size": {
      "enabled": true
    }
  }
}
After adding a few test documents, however, the _size field did not show up in the queried documents. We verified that all Elasticsearch nodes came up with the plugin loaded:
[2019-09-16T15:10:45,103][INFO ][o.e.p.PluginsService ] [node-name-1] loaded plugin [mapper-size]
We expected every document added to the index to get a computed _size metadata field, but no such field appears in our output.
The _size field is not added to your source document. You can query it, aggregate on it, and sort on it, but to actually see its value you need to request it through script fields. Try the query below and you'll see:
GET application_log_test/_search
{
  "query": {
    "range": {
      "_size": {
        "gt": 10
      }
    }
  },
  "aggs": {
    "sizes": {
      "terms": {
        "field": "_size",
        "size": 10
      }
    }
  },
  "sort": [
    {
      "_size": {
        "order": "desc"
      }
    }
  ],
  "script_fields": {
    "size": {
      "script": "doc['_size']"
    }
  }
}
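If scripting is disabled on the cluster, the mapper-size documentation also shows retrieving the value through docvalue_fields, which should work similarly here:

```json
GET application_log_test/_search
{
  "query": { "match_all": {} },
  "docvalue_fields": ["_size"]
}
```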

Elasticsearch - Given a query_string with a wildcard, can I aggregate on the matched term?

I'm about to describe the use case for a terms aggregation and the reason why mappings should be properly configured but given the state of our cluster, neither of these are options.
I'm doing full-text searching on a terabyte of raw log data and trying to do some counts on the specific terms being matched.
Given a query string like 192.168.0.* I'm finding documents that reference terms like 192.168.0.12 somewhere in the body as expected. The specific field is not consistent.
What I'd like to do is an aggregation on the term that was found. If ES returns 100 documents in which 192.168.0.12 was found, there should be a counter that reflects this (192.168.0.12: 100). Similarly, if 50 documents were found for 192.168.0.254 I'd expect to see 192.168.0.254: 50.
Given the scale and timing this has to be done in Elasticsearch, not sideloaded and iterated application-side. Is this doable?
For this, you will need to define your mapping something like this:
"IP_ADDRESS": {
  "type": "keyword",
  "fields": {
    "raw": {
      "type": "text"
    }
  }
}
So the searching will be on IP_ADDRESS.raw and the terms aggregation on IP_ADDRESS:
{
  "query": {
    "query_string": {
      "default_field": "IP_ADDRESS.raw",
      "query": "192.168.0.*"
    }
  },
  "aggs": {
    "count_term": {
      "terms": {
        "field": "IP_ADDRESS",
        "size": 1000
      }
    }
  }
}
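One caveat: this counts documents per IP value, and the buckets are not limited to terms matching the wildcard. If the field can contain other values too, the terms aggregation accepts an include regex to restrict the buckets (the pattern below mirrors the example query):

```json
{
  "aggs": {
    "count_term": {
      "terms": {
        "field": "IP_ADDRESS",
        "include": "192\\.168\\.0\\..*",
        "size": 1000
      }
    }
  }
}
```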

Elasticsearch not searching a field in term query

I'm having a problem searching records by one field. The field exists with a value in the document in Elasticsearch, but when I use it in a term filter, the record is not fetched. Other fields work fine.
JSON Request:
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "and": [
          {
            "term": {
              "sellerId": "6dd7035e-1d6f-4ddb-82f4-521902bfc29e"
            }
          }
        ]
      }
    }
  }
}
It does not return any error, it just doesn't fetch the related document. I tried searching with other fields and they worked fine.
Is there anything I'm missing here ?
Elasticsearch version: 2.2.2
You need to reindex your data and change the mapping of that field to:
"sellerId": {
  "type": "string",
  "index": "not_analyzed"
}
That way the UUID won't be analyzed and split into tokens and you'll be able to search it using a term query.
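You can see the tokenization that defeats the term filter with the _analyze API; the standard analyzer splits the UUID on its hyphens:

```json
GET /_analyze
{
  "analyzer": "standard",
  "text": "6dd7035e-1d6f-4ddb-82f4-521902bfc29e"
}
```

The response lists the individual tokens (6dd7035e, 1d6f, and so on); a term filter for the full UUID matches none of them, which is why the document is never returned.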

ElasticSearch terms aggregation by entire field

How can I write an ElasticSearch term aggregation query that takes into account the entire field value, rather than individual tokens? For example, I would like to aggregate by city name, but the following returns new, york, san and francisco as individual buckets, not new york and san francisco as the buckets as expected.
curl -XPOST "http://localhost:9200/cities/_search" -d'
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city",
        "min_doc_count": 10
      }
    }
  }
}'
You should fix this in your mapping by adding a not_analyzed sub-field. A multi-field keeps the analyzed version for full-text search alongside the raw version for aggregations:
"city": {
  "type": "string",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}
Now create your aggregation on city.raw.
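With that mapping in place, the same terms aggregation from the question, pointed at the raw sub-field, returns whole city names as buckets:

```json
POST /cities/_search
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city.raw",
        "min_doc_count": 10
      }
    }
  }
}
```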
Update at 2018-02-11
Newer Elasticsearch versions (5.0+) map strings as text with a keyword sub-field by default, so you can now aggregate on the .keyword suffix of the grouped-by field directly:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
The Elasticsearch docs suggest fixing this in the mapping (as in the accepted answer): either make the field not_analyzed, or add a raw not_analyzed sub-field and use it in aggregations.
There is no way around it. Aggregations operate on the inverted index, and if the field is analyzed, the inverted index contains only the tokens, not the original values of the field.