Cassandra read latency high even with row caching, why? - performance

I am testing cassandra performance with a simple model.
CREATE TABLE "NoCache" (
  key ascii,
  column1 ascii,
  value ascii,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.010000 AND
  caching='ALL' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
I am fetching 100 columns of a row key using pycassa's get/xget functions, but I am getting a read latency of about 15 ms on the server:
columns = COL_FAM.get(row_key, column_count=100)
nodetool cfstats
Column Family: NoCache
SSTable count: 1
Space used (live): 103756053
Space used (total): 103756053
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 20
Read Latency: 15.717 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 976
Compacted row minimum size: 4769
Compacted row maximum size: 557074610
Compacted row mean size: 87979499
Latency this high is surprising, given that nodetool info shows the reads hitting the row cache directly:
Row Cache : size 4834713 (bytes), capacity 67108864 (bytes), 35 hits, 38 requests, 1.000 recent hit rate, 0 save period in seconds
Can anyone tell me why Cassandra takes so much time reading from the row cache?

Enable tracing and see what it's doing. http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
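Before digging into traces, it can also help to confirm how much of the 15 ms is server time versus client-side work. A minimal client-side timing sketch (COL_FAM and row_key are the names from the question; the stub dictionary stands in for a live cluster so the snippet runs on its own):

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_ms), so the client-side time
    can be compared with the server-side latency that cfstats reports."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# With pycassa this would be:
#   columns, ms = timed_call(COL_FAM.get, row_key, column_count=100)
# Here a plain dict stands in for the column family so the sketch runs
# without a cluster:
fake_row = {"col%05d" % i: "value" for i in range(100)}
columns, ms = timed_call(lambda: dict(list(fake_row.items())[:100]))
print(len(columns), "columns in %.3f ms" % ms)
```

If the client-side number is much larger than the server's reported latency, the time is going into transport or deserialization rather than the read path the row cache sits on.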

Related

EHCache has too many misses on the heap tier

I am using standalone Ehcache 3.10 with a heap tier and a disk tier.
The heap is configured to hold 100 entries with no expiration.
In practice the algorithm inserts only 30 entries into the cache, but it performs many "updates" and "reads" of these 30 entries.
Before I insert a new entry into the cache, I check whether the entry already exists.
Therefore, when I look at the Ehcache statistics, I expect to see only 30 misses on the heap tier and 30 misses on the disk tier.
Instead I get 9806 misses on the heap tier and 9806 hits on the disk tier (meaning that 9806 times Ehcache did not find the entry on the heap but found it on the disk).
These numbers make no sense to me: the heap tier can hold 100 entries, so why are there so many misses?
Here is my configuration:
statisticsService = new DefaultStatisticsService();

// create temp dir
Path cacheDirectory = getCacheDirectory();

ResourcePoolsBuilder resourcePoolsBuilder =
    ResourcePoolsBuilder.newResourcePoolsBuilder()
        .heap(100, EntryUnit.ENTRIES)
        .disk(Long.MAX_VALUE, MemoryUnit.B, true);

CacheConfigurationBuilder cacheConfigurationBuilder =
    CacheConfigurationBuilder.newCacheConfigurationBuilder(
            String.class,      // the cache key type
            ARFileImpl.class,  // the cache value type
            resourcePoolsBuilder)
        .withExpiry(ExpiryPolicyBuilder.noExpiration()) // no expiration
        .withResilienceStrategy(new ThrowingResilienceStrategy<>())
        .withSizeOfMaxObjectGraph(100000);

// Create the cache manager
cacheManager =
    CacheManagerBuilder.newCacheManagerBuilder()
        // Set the persistent directory
        .with(CacheManagerBuilder.persistence(cacheDirectory.toFile()))
        .withCache(ARFILE_CACHE_NAME, cacheConfigurationBuilder)
        .using(statisticsService)
        .build(true);
Here is the result of the statistics:
Cache stats:
CacheExpirations: 0
CacheEvictions: 0
CacheGets: 6669684
CacheHits: 6669684
CacheMisses: 0
CacheHitPercentage: 100.0
CacheMissPercentage: 0.0
CachePuts: 10525
CacheRemovals: 0
Heap stats:
AllocatedByteSize: -1
Mappings: 30
Evictions: 0
Expirations: 0
Hits: 6659878
Misses: 9806
OccupiedByteSize: -1
Puts: 0
Removals: 0
Disk stats:
AllocatedByteSize: 22429696
Mappings: 30
Evictions: 0
Expirations: 0
Hits: 9806
Misses: 0
OccupiedByteSize: 9961952
Puts: 10525
Removals: 0
The reason for asking this question is that these redundant disk reads cause noticeable performance degradation.
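One access pattern consistent with the numbers above (heap Puts: 0, disk Puts: 10525, heap misses roughly tracking the number of updates) is that writes go only to the authoritative disk tier and each update invalidates the heap copy, so the next read faults it back in. A toy two-tier model of that behavior (a sketch of one plausible explanation, not Ehcache internals):

```python
class TwoTierCache:
    """Toy model of a caching tier (heap) over an authoritative tier
    (disk). Writes go to the authoritative tier; the caching tier is
    filled on read faults and invalidated on update, which is one way a
    heap tier can accumulate misses even though every entry fits in it."""

    def __init__(self):
        self.heap, self.disk = {}, {}
        self.heap_hits = self.heap_misses = self.disk_hits = 0

    def put(self, k, v):
        self.disk[k] = v        # authoritative tier takes the write
        self.heap.pop(k, None)  # stale heap copy is invalidated

    def get(self, k):
        if k in self.heap:
            self.heap_hits += 1
            return self.heap[k]
        self.heap_misses += 1
        v = self.disk[k]
        self.disk_hits += 1
        self.heap[k] = v        # fault the entry up into the heap tier
        return v

c = TwoTierCache()
for i in range(30):
    c.put(i, 0)
for i in range(30):  # first reads fault the entries into the heap
    c.get(i)
c.put(0, 1)          # an update invalidates the heap copy...
c.get(0)             # ...so this read misses the heap and hits the disk
print(c.heap_misses, c.disk_hits)  # 31 31
```

Under this model every update manufactures one heap miss and one disk hit, which would explain misses in the thousands with only 30 distinct entries and zero evictions.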

Elasticsearch Slow as size parameter increases (not dependent on data size)

I have the following peculiar timings for a simple search request with "query": {"match_all": {}}, "sort": ["_doc"]:
size = 0:
  _source = ["X"]: 0.03 sec and 262 bytes
  _source = ["X", "Y", "Z"]: 0.02 sec and 262 bytes
size = 10:
  _source = ["X"]: 0.90 sec and 2944 bytes
  _source = ["X", "Y", "Z"]: 0.88 sec and 949 Kb
size = 20:
  _source = ["X"]: 1.20 sec and 5613 bytes
  _source = ["X", "Y", "Z"]: 1.14 sec and 1006 Kb
size = 60:
  _source = ["X"]: 2.30 sec and 9693 bytes
  _source = ["X", "Y", "Z"]: 2.08 sec and 1120 Kb
The fields "X", "Y" and "Z" are of type "text" and hold either years or one or two words.
It seems to me that the slowness is driven not by the amount of data returned but only by the increase in the size parameter.
Retrieving 1120 Kb of data should not take over 2 seconds; that is way too much.
My architecture is one cluster with one node and 5 indices. Each index has 1 shard allocated, except for one index which has 2 shards.
RAM capacity is 58 GB and the heap size is 8 GB.
The Elasticsearch version is 7.6.
I already have:
request cache set to True
cache size equal to 20% of heap
preload of ["nvd", "dvd", "dvm", "doc"]
no explicit refresh
refresh_time of 2s on the index on which the queries are done
2 shards on the index on which the queries are done
swap disabled
GET _cat/indices shows the heap used at 63% and the RAM at 37% (it should be the other way around)
GET index/_stats shows the fetch phase taking ~76x longer than the query phase (~2.3 s vs 0.03 s)!
Where could the problem be?
Thanks in advance.
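For reference, the request bodies behind the timings above can be rebuilt like this (a sketch; search_body is an illustrative helper, and the endpoint and index name in the comment are assumptions):

```python
def search_body(size, source_fields):
    """Build the benchmark request body from the question: match_all,
    sorted by _doc, with varying `size` and `_source` filtering."""
    return {
        "query": {"match_all": {}},
        "sort": ["_doc"],
        "size": size,
        "_source": source_fields,
    }

# Timing one combination against a node could look like (assumed URL):
#   r = requests.post("http://localhost:9200/myindex/_search",
#                     json=search_body(10, ["X"]))
#   r.json()["took"]  # server-side ms, to compare with wall-clock time
body = search_body(10, ["X", "Y", "Z"])
print(body["size"], body["_source"])
```

Comparing the response's `took` value with the wall-clock time of the HTTP call separates server-side query/fetch work from transfer and client overhead.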

Spark SQL performance too slow when the number of columns is 100,000

I'm testing Spark performance with a very wide table.
What I did is very simple:
Prepare a csv file which has many columns and only 2 data records.
E.g., the csv file looks like the following:
col000001,col000002,,,,,,,col100000
dtA000001,dtA000002,,,,,,,,dtA100000
dtB000001,dtB000002,,,,,,,,dtB100000
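A file shaped like this sample can be generated with a short script (a sketch; write_wide_csv and the small column count here are illustrative):

```python
import csv

def write_wide_csv(path, n_cols, prefixes=("dtA", "dtB")):
    """Write a header row col000001..colNNNNNN plus one data row per
    prefix, mirroring the sample file above."""
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["col%06d" % i for i in range(1, n_cols + 1)])
        for p in prefixes:
            w.writerow(["%s%06d" % (p, i) for i in range(1, n_cols + 1)])

# Small stand-in for 100000c.csv; the full 100,000-column version is
# generated the same way and comes out at a few MB.
write_wide_csv("100c.csv", 100)
```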
dfdata100000 = sqlContext.read.csv('../datasets/100000c.csv', header='true')
dfdata100000.registerTempTable("tbl100000")
result = sqlContext.sql("select col000001,col100000 from tbl100000")
Then I get 1 row with show(1):
%%time
result.show(1)
The file sizes are as follows (very small).
The file name shows the number of columns:
$ du -m *c.csv
3 100000c.csv
1 10000c.csv
1 1000c.csv
1 100c.csv
1 20479c.csv
2 40000c.csv
2 60000c.csv
3 80000c.csv
The results are as follows. As you can see, the execution time increases exponentially.
Example result:
+---------+---------+
|col000001|col100000|
+---------+---------+
|dtA000001|dtA100000|
+---------+---------+
only showing top 1 row
CPU times: user 218 ms, sys: 509 ms, total: 727 ms
Wall time: 53min 22s
Question 1: Is this an acceptable result? Why does the execution time increase exponentially?
Question 2: Is there any way to make this faster?

Large amount of memory allocated by net/protocol.rb:153

While running memory_profiler, I noticed a large amount of memory being allocated by Ruby's net/protocol.rb component. I call it when performing an HTTP request to a server to download a file. The file is 43.67MB large and net/protocol.rb alone allocates 262,011,476 bytes just to download it.
Looking at the "allocated memory by location" section in the profiler report below, I can see net/protocol.rb:172 and http/response.rb:334 allocating 50-60 MB of memory each, which is about the size of the file, so that looks reasonable. However, the topmost entry (net/protocol.rb:153) worries me: that's 200 MB of memory, at least 4x the size of the file.
I have two questions:
Why does net/protocol need to allocate several times the size of the file in order to download it?
Is there anything I can do to reduce the amount of memory used by net/protocol?
memory_profiler output:
Total allocated: 314461424 bytes (82260 objects)
Total retained: 0 bytes (0 objects)
allocated memory by gem
-----------------------------------
314461304 ruby-2.1.2/lib
120 client/lib
allocated memory by file
-----------------------------------
262011476 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb
52435727 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb
7971 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb
2178 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb
1663 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb
1260 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb
949 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb
120 /Users/andre.debrito/git/techserv-cache/client/lib/connections/cache_server_connection.rb
80 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/http.rb
allocated memory by location
-----------------------------------
200483909 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
60548199 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:172
52428839 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:334
978800 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:155
2537 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:61
2365 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:172
2190 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:54
1280 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:56
960 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:62
836 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:165
792 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:13
738 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:125
698 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:263
576 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214
489 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:40
480 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:127
360 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:40
328 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:610
320 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:71
320 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:30
320 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:59
308 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:322
256 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879
240 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1615
239 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:211
232 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:38
224 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:181
200 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:17
192 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:42
179 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:877
169 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1459
160 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1029
160 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:434
160 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:435
160 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:445
160 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1617
149 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1445
147 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1529
129 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:98
128 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1475
120 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:444
120 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:446
120 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:447
120 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:29
120 /Users/andre.debrito/git/techserv-cache/client/lib/connections/cache_server_connection.rb:45
96 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:899
80 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:39
80 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:45
80 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:46
80 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:145
allocated memory by class
-----------------------------------
309678360 String
3445304 Thread::Backtrace
981096 Array
352376 IO::EAGAINWaitReadable
1960 MatchData
1024 Hash
328 Net::HTTP
256 TCPSocket
256 URI::HTTP
128 Time
120 Net::HTTP::Get
120 Net::HTTPOK
96 Net::BufferedIO
allocated objects by gem
-----------------------------------
82259 ruby-2.1.2/lib
1 client/lib
allocated objects by file
-----------------------------------
81908 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb
129 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb
127 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb
28 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb
23 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb
23 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb
19 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/http.rb
1 /Users/andre.debrito/git/techserv-cache/client/lib/connections/cache_server_connection.rb
allocated objects by location
-----------------------------------
36373 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
24470 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:155
21057 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:172
48 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:61
38 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:54
32 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:56
31 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:172
24 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:62
12 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:127
9 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:40
8 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:71
8 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:165
8 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:30
8 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:59
6 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214
6 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1615
5 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:263
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1029
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:322
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:17
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:434
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:435
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:445
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:42
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:125
4 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1617
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:1529
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:444
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:446
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:447
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:40
3 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1445
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:877
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:39
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:45
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/generic_request.rb:46
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:13
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:145
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/header.rb:31
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:111
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:144
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:98
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:179
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:181
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:213
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1640
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:1642
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:343
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:530
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:557
allocated objects by class
-----------------------------------
47935 String
24519 Array
4894 IO::EAGAINWaitReadable
4894 Thread::Backtrace
7 MatchData
3 Hash
2 URI::HTTP
1 Net::BufferedIO
1 Net::HTTP
1 Net::HTTP::Get
1 Net::HTTPOK
1 TCPSocket
1 Time
retained memory by gem
-----------------------------------
NO DATA
retained memory by file
-----------------------------------
NO DATA
retained memory by location
-----------------------------------
NO DATA
retained memory by class
-----------------------------------
NO DATA
retained objects by gem
-----------------------------------
NO DATA
retained objects by file
-----------------------------------
NO DATA
retained objects by location
-----------------------------------
NO DATA
retained objects by class
-----------------------------------
NO DATA
Allocated String Report
-----------------------------------
11926 ""
7019 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:172
4894 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
10 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/http/response.rb:54
2 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:179
1 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:67
4894 "Resource temporarily unavailable - read would block"
4894 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
4894 "UTF-8"
4894 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
4894 "read would block"
4894 /Users/andre.debrito/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/net/protocol.rb:153
...
Relevant code:
report = MemoryProfiler.report do
  begin
    response = nil
    Net::HTTP.new(uri.host, uri.port).start { |http|
      request = Net::HTTP::Get.new uri.request_uri
      response = http.request request
    }
  rescue Net::ReadTimeout => e
    raise RequestTimeoutError.new(e.message)
  rescue Exception => e
    raise ServerConnectionError.new(e.message)
  end
end
report.pretty_print
Network traffic data from Charles proxy:
Request Header: 168 bytes
Response Header: 288 bytes
Request: -
Response: 43.67 MB (45792735 bytes)
Total: 43.67 MB (45793191 bytes)
Almost all of the strings allocated at net/protocol.rb:153 are short-lived and are reclaimed by the next GC run. These allocations are thus pretty harmless and will not result in a significantly larger process size.
You get a lot of exceptions (which are used for control flow here when reading from the socket) plus the actual read data, which is appended to the buffer. All of these operations create temporary (internally used) objects.
As such, you are probably measuring the wrong thing. What would probably make more sense is to:
measure the maximum RSS of the process (i.e. the "used" memory);
and measure the amount of memory still allocated after the read.
You will notice that (depending on the memory pressure on your machine) the RSS does not grow significantly above the amount of data actually read, and that the memory still referenced after the read is about the same size as the read data, with almost no internal intermediate objects still referenced.
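The allocated-vs-retained distinction drawn here can be seen in miniature with Python's tracemalloc (an analogue only, not the Ruby code path): many temporary buffer strings are allocated during the "download", but once they are dropped the retained memory is roughly the payload size.

```python
import tracemalloc

tracemalloc.start()
buf = []
for _ in range(1000):
    chunk = "x" * 16384  # temporary read buffer, like protocol.rb's
    buf.append(chunk)
data = "".join(buf)      # the payload that is actually kept
del buf, chunk           # drop the temporaries, as GC eventually would
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(peak > len(data))         # total allocations exceeded the payload
print(current < 2 * len(data))  # but retained memory is ~payload-sized
```

Here `peak` plays the role of memory_profiler's "allocated" numbers and `current` the role of "retained": the first can be several times the payload without the process ever holding that much live data at once.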

Caching not Working in Cassandra

I don't seem to have any caching enabled when checking in OpsCenter or cfstats. I'm running Cassandra 1.1.7 with Solandra on Debian. I have set the required global options in cassandra.yaml:
key_cache_size_in_mb: 800
key_cache_save_period: 14400
row_cache_size_in_mb: 800
row_cache_save_period: 15400
row_cache_provider: SerializingCacheProvider
Column Families were created as follows:
create column family example
with column_type = 'Standard'
and comparator = 'BytesType'
and default_validation_class = 'BytesType'
and key_validation_class = 'BytesType'
and read_repair_chance = 1.0
and dclocal_read_repair_chance = 0.0
and gc_grace = 864000
and min_compaction_threshold = 4
and max_compaction_threshold = 32
and replicate_on_write = true
and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
and caching = 'ALL';
OpsCenter shows no data on the caching graphs, and cfstats doesn't show any cache-related fields:
Column Family: charsets
SSTable count: 1
Space used (live): 5558
Space used (total): 5558
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 61381
Read Latency: 0.123 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 16
Compacted row minimum size: 1917
Compacted row maximum size: 2299
Compacted row mean size: 2299
Any help or suggestions are appreciated.
Sam
The caching stats have been moved from cfstats to info in Cassandra 1.1. If you run nodetool info you should see something like:
Key Cache : size 5552 (bytes), capacity 838860800 (bytes), 38 hits, 47 requests, 0.809 recent hit rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 838860800 (bytes), 0 hits, 0 requests, NaN recent hit rate, 15400 save period in seconds
This is because the caches are now global rather than per-column-family. It seems OpsCenter needs updating for this change; maybe there is a later version available that handles it.
