Is it possible to get the total size in bytes of all documents in an Elasticsearch type? - elasticsearch

I can't submit this question with just a title. Is it possible to get the total size in bytes of all documents in an Elasticsearch type?
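One possible approach, sketched under the assumption that the mapper-size plugin is installed and _size is enabled in the index mapping (my_index and my_type are placeholder names): the plugin records the byte size of each document's _source in a _size field, which can then be summed with an aggregation restricted to the type.

GET my_index/_search
{
  "size": 0,
  "query": { "term": { "_type": "my_type" } },
  "aggs": {
    "total_source_bytes": { "sum": { "field": "_size" } }
  }
}

Note that this measures raw _source bytes, not the compressed on-disk footprint; for the latter, the _stats/store endpoint discussed in the related questions below is closer.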

Related

Elasticsearch / Lucene: Determine total bytes used by field in index

I have a Lucene index (generated by ES 6.X.X) and I want to understand how much storage space (bytes) each of my fields is consuming.
Does anyone know how I can figure out how many bytes it takes to store all of the values for a given field across all documents? I have the index opened in Luke, but I'm new to that tool and it isn't clear to me how to answer this question with it. Is there a better tool for this?
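One option, assuming the data can be reindexed or opened on a recent cluster: Elasticsearch 7.15+ ships an analyze disk usage API that reports a per-field breakdown (stored fields, doc values, inverted index, points, and so on). A minimal sketch, with my_index as a placeholder:

POST my_index/_disk_usage?run_expensive_tasks=true

The run_expensive_tasks flag is required because the analysis has to scan the whole shard.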

Elasticsearch: Explaining the discrepancy between the sum of all document "_size" values and the "store.size_in_bytes" from the stats API endpoint?

I'm noticing that if I sum up the _size property of all my Elasticsearch documents in an index, I get a value of about 180 GB, but if I go to the _stats API endpoint for the same index I get a size_in_bytes value for all primaries of 100 GB.
From my understanding the _size property should be the size of the _source field, and the index currently stores the _source field, so shouldn't it be at least as large as the sum of the _size values?
The _size field stores the actual size of the source document. When actually storing the source in stored_fields, Elasticsearch compresses it (LZ4 by default, if I remember correctly), so I would expect it to take less space on disk than its raw size. And if the source doesn't contain any binary data, the compression ratio is going to be significantly higher too.
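To reproduce the comparison, assuming the mapper-size plugin is enabled on the index (my_index is a placeholder), one can sum _size across all documents and set it against the store size from the stats API:

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "raw_source_bytes": { "sum": { "field": "_size" } }
  }
}

GET my_index/_stats/store

The first number is the uncompressed _source total; the second reflects what actually sits on disk after compression, plus the index structures.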

Elasticsearch inconsistent file size and index size query

The goal in my application is to measure the total amount of space that Elasticsearch is using on the operating system.
Showing these statistics is a feature that lets users of our application estimate how much disk space they need to run Elasticsearch.
The following Elasticsearch call is used to get this data.
GET _stats/store
This returns all indices with their corresponding size_in_bytes property.
The problem is that this number differs from the size on disk. The on-disk size is roughly 10 times larger than what this statistics query returns.
This makes the result of the query not very useful for what we want to show to the customer.
Is there a more accurate way to get the size that elasticsearch indices are using on the file system?
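Two calls that come closer to actual filesystem usage (a sketch; both endpoints exist, though the exact columns vary between releases): _cat/allocation shows how much disk the shards on each node occupy alongside the node's overall disk usage, and the nodes filesystem stats show what Elasticsearch sees on its data paths.

GET _cat/allocation?v&bytes=b
GET _nodes/stats/fs

Differences against _stats/store can come from replicas, translog files, and data left over from deleted or relocated shards.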

Elasticsearch and Lucene document limit

The document count in our Elasticsearch installation from the stats API shows about 700 million, while the actual document count from the count API is about 27 million. We understand that this difference comes from nested documents, which the stats API counts as well.
In the Lucene documentation, we read that there is a hard limit of 2 billion documents per shard. Should I worry that Elasticsearch is about to hit the document limit? Or should I monitor the data from the count API?
Yes, there is a limit of about 2 billion documents per shard, which is a hard Lucene limit.
There is a maximum number of documents you can have in a single Lucene index. As of LUCENE-5843, the limit is 2,147,483,519 (= Integer.MAX_VALUE - 128) documents.
You should consider scaling horizontally.
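Since the limit applies per shard and counts Lucene-level documents (including nested ones), it is the per-shard figure from the stats side that is worth watching, not the count API. A sketch, with my_index as a placeholder:

GET _cat/shards?v&h=index,shard,prirep,docs
GET my_index/_stats/docs?level=shards

If any single shard's docs value approaches the Lucene limit, that is the signal to add shards or indices, i.e. to scale horizontally.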

Maximum Size of string metric in sonarqube

Is there any limit on size of metric whose data type is string in sonarqube?
4000 characters is the maximum size of data measures.
