Looking for a way to push named (rndc) stats from DNS servers and output them to Elastic for better visibility

What is the best way to grab the named stats, such as Resolver Statistics (IPv4 queries sent, IPv4 responses received, etc.) and Cache Statistics (cache hits, cache misses, etc.), and parse them so we can visualize them on a dashboard with Elastic? The named_stats.txt file on the DNS master and slaves holds a ton of stats that can be helpful in troubleshooting DNS issues for us. We currently leverage Logstash for most of our queries today.
++ Resolver Statistics ++
[View: default]
195374 IPv4 queries sent
195307 IPv4 responses received
13600 NXDOMAIN received
5188 SERVFAIL received
2683 query retries
67 query timeouts
4066 queries with RTT < 10ms
184536 queries with RTT 10-100ms
6670 queries with RTT 100-500ms
20 queries with RTT 500-800ms
15 queries with RTT 800-1600ms
523 bucket size
2572 spilled due to server quota
[View: _bind]
++ Cache Statistics ++
[View: default]
712227 cache hits
195628 cache misses
380050 cache hits (from query)
325574 cache misses (from query)
0 cache records deleted due to memory exhaustion
17819 cache records deleted due to TTL expiration
630 cache database nodes
519 cache database hash buckets
2418422 cache tree memory total
247299 cache tree memory in use
323080 cache tree highest memory in use
393216 cache heap memory total
132096 cache heap memory in use
132096 cache heap highest memory in use
[View: _bind (Cache: _bind)]
I've tried using different grok patterns with Logstash but nothing is working as expected so far.
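Since every stat line in that file is just a counter followed by a label, one grok pattern can split the two and the header lines can be dropped. A minimal Logstash pipeline sketch, assuming the stats file lives at /var/named/data/named_stats.txt and a local Elasticsearch (the path, hosts, and index name are placeholders to adjust):

input {
  file {
    # Assumed location of the file written by `rndc stats` -- adjust to yours
    path => "/var/named/data/named_stats.txt"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # re-read the whole file each run; testing only
  }
}

filter {
  # Stat lines look like "195374 IPv4 queries sent": a counter, then a label
  grok {
    match => { "message" => "^\s*%{NUMBER:stat_value:int} %{GREEDYDATA:stat_name}$" }
    tag_on_failure => ["_not_a_stat_line"]
  }
  # Headers such as "++ Cache Statistics ++" and "[View: default]" won't match;
  # drop them here (or grok them into a "section" field instead of dropping)
  if "_not_a_stat_line" in [tags] {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "named-stats-%{+YYYY.MM.dd}"
  }
}

Note that rndc stats appends a fresh dump (bracketed by "+++ Statistics Dump +++" lines carrying an epoch timestamp) on each invocation, so to chart values over time you would also want to carry that timestamp and the current section name onto each event, e.g. with a ruby or aggregate filter.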

Related

What does size=0 mean in Elasticsearch shard request cache?

By default, the requests cache will only cache the results of search requests where size=0, so it will not cache hits, but it will cache hits.total, aggregations, and suggestions.
I do not understand the part that states "size=0".
What is the meaning of size in this context?
Does it mean that the request cache will
cache only empty results?
cache page 1 only (the default 10 results, I think)?
No. The size param is useful when you want to fetch a number of results other than 10: the default size is 10, so if you have a search query that needs, say, 1000 results, you set size to 1000; without it you get only the top 10 results, sorted by score in descending order.
What size=0 means for the shard request cache is that it will not cache the exact results (i.e. the matching documents with their scores) but only the metadata, such as the total number of results (hits.total), aggregations, and suggestions.
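To make the size=0 case concrete, here is a sketch of a request that is eligible for the shard request cache: an aggregation-only search with size set to 0 (the index name my-index and the status.keyword field are made up for illustration):

# my-index and status.keyword are hypothetical names
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "by_status": {
      "terms": { "field": "status.keyword" }
    }
  }
}

Because size is 0 there is no hits array to cache, only hits.total and the aggregation buckets, which is exactly what the quoted documentation describes.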

Root cause of "data too large" in Elasticsearch

I am struggling to find the exact cause of the problem:
"type": "circuit_breaking_exception",
"reason": "[fielddata] Data too large, data for [_id] would be [1704048152/1.5gb], which is larger than the limit of [1704040857/1.5gb]",
"bytes_wanted": 1704048152,
"bytes_limit": 1704040857,
"durability": "PERMANENT"
It happened on my AWS Elasticsearch server. I thought memory might be the issue, so on my local laptop I set -Xms to 32 MB and -Xmx to 64 MB and tried inserting data into my index; after around 100,000 records I got this error:
circuit_breaking_exception
"reason": "[parent] Data too large
I was not able to get the exact same error that I got on AWS Elasticsearch.
To reproduce the problem I inserted more than 3,500,000 records, but I am still not getting that exception locally.
I am new to Elasticsearch and want to know what changes I need to make to avoid this problem on AWS Elasticsearch.
The configuration for AWS Elasticsearch is:
Elasticsearch version: 7.4
Instance type (data): c5.xlarge.elasticsearch
EBS volume size: 60 GiB
Max clause count: 1024
Field data cache allocation: unbounded (default)
Let me know if more details are required.
The fielddata circuit breaker is taken into account when estimating the amount of memory a field will need in order to be loaded into the JVM heap. It prevents the fielddata from loading by raising an exception if the operation would exceed the heap limit. By default it is configured to 40% of the maximum JVM heap, and it can be tuned (see https://www.elastic.co/guide/en/elasticsearch/reference/current/circuit-breaker.html#fielddata-circuit-breaker). Related settings you should be aware of: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-fielddata.html
It seems the node is overloaded. Increase the JVM heap, or, if that is not feasible, add more nodes to distribute the shards over more instances.
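For reference, the fielddata breaker limit is a dynamic cluster setting, so changing it does not need a restart; a sketch, where 30% is just an arbitrary example value:

# 30% is an example value; the default is 40% of the maximum JVM heap
PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "30%"
  }
}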

LRU Caching policy of a Node Query cache in Elasticsearch

I have an Elasticsearch cluster set up with the node query cache enabled. I have set the size of the cache to 2 GB, but I am not completely sure how the LRU caching policy works in this case.
I run a query context against the Elasticsearch index and expect the result to be cached, so that when there is a request for the same query context again there should be an increase in hit_count, but this is not the behavior I see in ES.
These are the stats of my query_cache
memory_size_in_bytes: 7176480,
total_count: 36605,
hit_count: 15657,
miss_count: 20948,
cache_size: 130,
cache_count: 130,
evictions: 0
Even though memory_size_in_bytes has not reached its max, the result of the query context is not completely cached, and when the same query context is fired against the Elasticsearch index I see the miss count increase rather than the hit count.
Can anyone please explain how node query caching works in ES?
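For anyone wanting to inspect the same counters, they are exposed per node through the node stats API; a read-only check that assumes nothing beyond a reachable cluster:

GET _nodes/stats/indices/query_cache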

How can I route ElasticSearch requests to a few shards

My ES cluster has 12 servers, but when I created my index I indicated only 3 shards. Should I use the routing parameter on every write and read to shorten the latency?
If you want to control shard allocation, there are a few options.
One option is to set node.rack: rack1 in the config yml file.
Then, when you create/update the index:
PUT test/_settings
{
  "index.routing.allocation.include.rack": "rack1"
}
In addition, it depends on the size of your index. For instance, in my app I am using different types of indexes: some of them have 1 shard (they are settings indexes), others have 3 shards and 1 replica, and I don't care about allocation because it's super fast. So if you care about latency, maybe it's better to think about upgrading the network.
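Since the question is about the routing parameter itself (which is different from shard allocation filtering), here is a minimal sketch of it for recent Elasticsearch versions; the index name test is reused from above and the routing key user42 is made up:

# All documents indexed with the same routing value land on the same shard
# (user42 is a hypothetical routing key)
PUT test/_doc/1?routing=user42
{
  "user": "user42",
  "msg": "hello"
}

# Searching with that routing value queries only that one shard
GET test/_search?routing=user42
{
  "query": { "match": { "user": "user42" } }
}

With only 3 shards spread over 12 servers, routing narrows each request to a single shard, at the cost of hot spots if one routing key dominates the traffic.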

Clear cache Elasticsearch not clearing much data

When running curl -XPOST node1:9200/_cache/clear
I expected a large amount of cache to disappear in each node.
But when looking at bigdesk, cache size doesn't seem to drop a lot.
Shouldn't it drop a lot?
The main problem is that each node generally has a very high cache size, which is the reason for exceptions like the one below:
ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [ts] would be larger than limit of [24108466176/22.4gb]]
Another thing: I don't see any cache evictions in general (not just after clear_cache), and there are no cache eviction properties set in the settings (perhaps some should be?). The only property set is fielddata.breaker.limit: 75%.
What can be done to handle the cache better in Elasticsearch?
ElasticSearch version: 1.3.5
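One knob worth knowing about here, offered as a guess at the missing piece rather than a confirmed fix: in the 1.x line the fielddata cache is unbounded by default, so nothing is ever evicted and _cache/clear only helps until the fields are loaded again. Bounding it in elasticsearch.yml turns on LRU eviction (40% is an example value, not a recommendation):

# elasticsearch.yml
# Unbounded by default in 1.x; setting a size enables LRU eviction
indices.fielddata.cache.size: 40%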
