Caches node of Cassandra jconsole is not expandable - caching

I have a single node of Cassandra 1.1.2 installed on Linux, and I want to determine how much space each CF occupies in the cache and what percentage of each CF is cached (for both the row cache and the key cache).
When I connect to this node via jconsole and expand the org.apache.cassandra.db node, the 'Caches' node is not expandable, although according to
http://www.datastax.com/docs/1.1/operations/monitoring#monitoring-and-adjusting-cache-performance
it should be.
In addition, the nodetool output also does not contain the properties Key cache capacity, Key cache size and Key cache hit rate:
Column Family: io2
SSTable count: 4
Space used (live): 566387478
Space used (total): 566387478
Number of Keys (estimate): 3858816
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 7238040
Compacted row minimum size: 125
Compacted row maximum size: 149
Compacted row mean size: 149
Any idea?
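For reference, the statistics above come from a command along these lines (the host and JMX port are assumptions; adjust them to your setup):

nodetool -h localhost -p 7199 cfstats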

The Caches node in jconsole is not expandable in Cassandra 1.1.2 because the individual per-CF caches were combined into a single global cache in 1.1, so these per-CF fields are no longer exposed in 1.1.2:
http://www.datastax.com/dev/blog/caching-in-cassandra-1-1
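The global cache statistics are still visible outside of the per-CF view; a hedged way to check them (host and JMX port are assumptions, and the exact output varies between 1.1.x builds) is:

nodetool -h localhost -p 7199 info

which should report aggregate Key Cache and Row Cache figures (size, capacity, hits, requests, recent hit rate) rather than per-CF numbers.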

Related

Elasticsearch design - room for improvement

We are starting to design a cluster and have come up with the following configuration. Please suggest whether there is any scope for improvement, or for saving some budget if it is over-provisioned.
100 fields, about 1 MB per field (including the inverted index), so about 125 MB per document (to be on the safe side).
Total: 4M documents corresponds to 500 GB, which at 25 GB per shard gives 20 shards, i.e. 40 shards in total including one replica (r = 1). We had seen somewhere that a shard size of about 25 GB works well in most scenarios.
It also seems that a heap of at most 32 GB per shard (so the JVM can use compressed pointers) works well, which translates to 64 GB per shard (the other 50% for the FS cache). So, considering 256 GB of RAM, this translates to 2 shards per machine (128 GB), which gives 20 data nodes per cluster (2 shards per data node), 3 master nodes (HA) and 1 coordinating node.
Please add your recommendations
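A quick back-of-the-envelope sketch of the shard and node arithmetic described above, using the question's own figures (they are assumptions, not recommendations):

# shard/node count arithmetic from the figures above
total_gb=500          # assumed total primary data (4M documents)
shard_gb=25           # target shard size
replicas=1            # r = 1
shards_per_node=2     # one 64 GB slice (32 GB heap + 32 GB FS cache) per shard

primary_shards=$(( total_gb / shard_gb ))             # 20
total_shards=$(( primary_shards * (1 + replicas) ))   # 40
data_nodes=$(( total_shards / shards_per_node ))      # 20
echo "$primary_shards primaries, $total_shards total shards, $data_nodes data nodes"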

Meaning of input records in Spark

I have a question. I use Spark Streaming, and in the Spark UI I can see the following:
Each micro-batch receives 160,000 records; I can see this from the Spark UI and from the Kafka offsets I'm reading (160K).
For the first stage, which reads from Kafka, I see:
Total Time Across All Tasks: 39 min
Locality Level Summary: Process local: 54
**Input Size / Records: 755.2 MB / 48114**
Output: 124.8 KB / 5179
Why isn't the input 160K records? What exactly does Input Size / Records mean?

Elasticsearch Memory Usage increases over time and is at 100%

I have noticed that indexing performance in Elasticsearch has degraded over time, and that memory usage has slowly increased until it reached 100%; at that point I cannot index any more data. I have the default shard settings: 5 primaries and 1 replica. My indices are time-based, with a new index created every hour to store coral service logs from various teams. An index comes to about 3 GB with 5 shards, about 6 GB with the replica, and about 1.7 GB with a single shard and 0 replicas.
I am using EC2 i2.2xlarge hosts, which offer 1.6 TB of storage, 61 GB of RAM and 8 cores.
I have set the heap size to 30GB.
The node statistics are here:
https://jpst.it/1eznd
Could you please help me fix this? My whole cluster came down, to the point that I had to delete all the indices.
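A hedged first diagnostic step (the host and port are assumptions) is to check where the heap is actually going, for example the JVM and fielddata statistics:

curl -s 'localhost:9200/_nodes/stats/jvm,indices?pretty'
curl -s 'localhost:9200/_cat/fielddata?v'

If fielddata or segment memory dominates, the mappings and the number of open hourly indices are the first things to look at.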

How can I save disk space in Elasticsearch?

I have a three-node cluster, each node with 114 GB of disk capacity. I am pushing syslog events from around 190 devices to this cluster, at a rate of about 100k events per 15 minutes. The index has 5 shards and uses the best_compression codec. In spite of this the disk fills up quickly, so I was forced to remove the replica for this index. The index size is 170 GB and each shard is about 34.1 GB. Now, if I get additional disk space and reindex this data into a new index with 3 shards and a replica, will that save disk space?
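A hedged sketch of that reindex, assuming a version that has the _reindex API (the index names syslog-v1 and syslog-v2 are made up for illustration; on older releases you would copy the data with an external tool instead):

curl -XPUT 'localhost:9200/syslog-v2' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "codec": "best_compression"
    }
  }
}'

curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d '{
  "source": { "index": "syslog-v1" },
  "dest":   { "index": "syslog-v2" }
}'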

Elasticsearch 1.5.2 deployment issue

I have ES 1.5.2 cluster with the following specs:
3 nodes with RAM: 32GB, CPU cores: 8 each
282 total indices
2,564 total shards
799,505,935 total docs
767.84GB total data
ES_HEAP_SIZE=16g
The problem is that when I use Kibana to run queries (very simple ones), a single query works fine, but if I keep querying, Elasticsearch becomes very slow and eventually gets stuck because the JVM heap usage (as reported by Marvel) reaches 87-95%. The same happens when I try to load some Kibana dashboards, and the only way out of this situation is to restart the service on all the nodes.
(This also happens on ES 2.2.0, 1 node, with Kibana 4.)
What is wrong, what am I missing?
Am I supposed to query less?
EDIT:
I should mention that I have a lot of empty indices (0 documents), but their shards are still counted. This is because I set a TTL of 4w on the documents, and the empty indices are later deleted with Curator.
Also, we have not disabled doc_values in either the 1.5.2 or the 2.2.0 cluster.
The exact specs are as follows (1.5.2):
3 nodes with RAM: 32GB, CPU cores: 8 each
282 total indices = 227 empty + 31 marvel + 1 kibana + 23 data
2,564 total shards = (1135 empty + 31 marvel + 1 kibana + 115 data)* 1 replica
799,505,935 total docs
767.84GB total data
ES_HEAP_SIZE=16g
curl _cat/fielddata?v result:
1.5.2:
total os.cpu.usage primaries.indexing.index_total total.fielddata.memory_size_in_bytes jvm.mem.heap_used_percent jvm.gc.collectors.young.collection_time_in_millis primaries.docs.count device.imei fs.total.available_in_bytes os.load_average.1m index.raw #timestamp node.ip_port.raw fs.total.disk_io_op node.name jvm.mem.heap_used_in_bytes jvm.gc.collectors.old.collection_time_in_millis total.merges.total_size_in_bytes jvm.gc.collectors.young.collection_count jvm.gc.collectors.old.collection_count total.search.query_total
2.1gb 1.2mb 3.5mb 3.4mb 1.1mb 0b 3.5mb 2.1gb 1.9mb 1.8mb 3.6mb 3.6mb 1.7mb 1.9mb 1.7mb 1.6mb 1.5mb 3.5mb 1.5mb 1.5mb 3.2mb
1.9gb 1.2mb 3.4mb 3.3mb 1.1mb 1.5mb 3.5mb 1.9gb 1.9mb 1.8mb 3.5mb 3.6mb 1.7mb 1.9mb 1.7mb 1.5mb 1.5mb 3.4mb 0b 1.5mb 3.2mb
2gb 0b 0b 0b 0b 0b 0b 2gb 0b 0b 0b 0b 0b 0b 0b 0b 0b 0b 0b 0b 0b
2.2.0:
total index_stats.index node.id node_stats.node_id buildNum endTime location.timestamp userActivity.time startTime time shard.state shard.node indoorOutdoor.time shard.index dataThroughput.downloadSpeed
176.2mb 0b 0b 0b 232b 213.5kb 518.8kb 479.7kb 45.5mb 80.1mb 1.4kb 920b 348.7kb 2.5kb 49.1mb
Delete the empty indices.
For the 1.5 cluster, the major consumer of your heap is fielddata: around 9.5 GB on each node, plus 1.2 GB for the filter cache and around 1.7 GB for segment metadata.
Even if you have that snippet in your template making strings not_analyzed, in 1.5 this does not automatically mean ES will use doc_values; you need to enable them explicitly.
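For example, in a 1.x index template this could look something like the following (the template name and index pattern are made up for illustration):

curl -XPUT 'localhost:9200/_template/strings_with_doc_values' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [ {
        "strings_not_analyzed": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "string",
            "index": "not_analyzed",
            "doc_values": true
          }
        }
      } ]
    }
  }
}'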
If you enable doc_values now in the 1.5.x cluster, the change will only take effect for new indices. For the old indices you need to reindex the data, or, if you have time-based indices (created daily, weekly, etc.), simply wait for the new indices to be created and the old ones to be deleted.
Until doc_values become predominant in the indices of the 1.5 cluster, what @Val suggested in the comments is the only option: limit the fielddata cache size, add more nodes to the cluster (and implicitly more memory), or increase the RAM on your nodes. Or clear the fielddata cache manually ;-) from time to time.
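A hedged sketch of those two stop-gaps (the 40% value and the host are assumptions, and the yml setting typically needs a node restart to take effect):

# elasticsearch.yml - cap fielddata so it cannot eat the whole heap
indices.fielddata.cache.size: 40%

# clear the fielddata cache manually when it fills up
curl -XPOST 'localhost:9200/_cache/clear?fielddata=true'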
Not entirely related to the memory issue, but try to avoid using TTL. If you don't need some data anymore, simply delete the index; don't rely on TTL, which is much more costly than deleting the index. Using TTL can cause issues at search time and affect the overall performance of the cluster, because it deletes documents from indices, which means a lot of updates and a lot of merging on those indices. Since you probably have time-based indices (meaning yesterday's data doesn't really change), TTL brings unnecessary operations to data that should otherwise be static (and which could otherwise be optimized).
If your heap fills up rapidly while querying, you are doing something heavy in your queries, such as aggregations. As Val and Andrei suggested, the problem might be your fielddata growing unbounded. I'd suggest checking your mappings and using doc_values and not_analyzed wherever applicable to cut down the query cost.
