Cassandra long connection times (compared to Redis) - performance

I'm surprised by Cassandra's long connection times (compared to Redis) when connecting from the Python client (cassandra-driver) to a single-node Cassandra cluster running on the same host. How can they be improved?
More info
Even though the Cassandra server runs on the same host and I connect directly (through a k8s/OCP service port), the connection times out unless I set a relatively long, triple-digit (milliseconds) connect_timeout threshold (relative to our standards, built on three years of experience with Redis in several clusters).
In the case of Redis the connection takes just a fraction of a millisecond, and a total (connection plus read) timeout of 10 ms has always been sufficient in these conditions, while for Cassandra it would have to be set orders of magnitude larger!
2022-09-18 11:08:16.899097 - connecting to Redis database...
Redis<ConnectionPool<Connection<host=<redacted>.svc.cluster.local,port=<redacted>,db=0>>>
2022-09-18 11:08:16.899423 - connected to Redis database in 0.32 ms
versus:
2022-09-18 12:29:13.229688 - connecting to Cassandra database...
<cassandra.cluster.Session object at 0x7fd0065e80d0>
2022-09-18 12:29:13.409084 - connected to Cassandra database in 190.856 ms
What's even more surprising is that the read timeout for Cassandra can be much shorter than the connection timeout: only 16 ms for SELECT queries (with a cold start, i.e. the first query for a given key) vs. 128+ ms for connections alone (every time; apparently there's no connection caching on the client side).
Almost reproducible example (just fill in your cluster and db user details):
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# connection timeout
conn_timeout_ms = 64    # timeout
# conn_timeout_ms = 128 # OK (best time)

cas_auth_provider = PlainTextAuthProvider(username=cas_user,
                                          password=cas_pass)
cas_cluster = Cluster(contact_points=[cas_uri],
                      port=cas_port,
                      auth_provider=cas_auth_provider,
                      connect_timeout=conn_timeout_ms / 1000,  # ms -> sec
                      )
cas_session = cas_cluster.connect()
I've also tried setting the low-level TCP_NODELAY socket option using the sockopts arg exposed by Cluster (to disable Nagle's aggregating algorithm and send the data as soon as it's available), but it did not help:
sockopts = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]
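As far as I understand, much of that connection time is the driver's startup work (control connection, cluster metadata/schema discovery, authentication, pool setup) rather than the TCP handshake itself, so the obvious workaround is to create the Cluster/Session once and reuse it for every request instead of reconnecting each time. A rough sketch of that pattern (get_value and ks.table are placeholder names, not part of the original code):

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

auth = PlainTextAuthProvider(username=cas_user, password=cas_pass)
cluster = Cluster(contact_points=[cas_uri], port=cas_port,
                  auth_provider=auth,
                  connect_timeout=0.2)   # generous, but paid only once at startup
session = cluster.connect()             # expensive: do this once, not per request
session.default_timeout = 0.05          # default per-request read timeout (seconds)

def get_value(key):
    # a per-call timeout override (in seconds) is also possible
    return session.execute("SELECT value FROM ks.table WHERE key = %s",
                           (key,), timeout=0.016).one()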

Related

Clustered NiFi running erratically, the issue appears often

java.net.SocketTimeoutException: Read timed out
3 nodes, NiFi 1.10.0, ZooKeeper 3.6.5
I reset the relevant settings as follows to make NiFi respond within the given time, but this doesn't work:
nifi.cluster.node.connection.timeout=120 sec
nifi.cluster.node.read.timeout=120 sec
nifi.zookeeper.connect.timeout=30 secs
nifi.zookeeper.session.timeout=30 secs
nifi.cluster.load.balance.comms.timeout=30 sec
UPDATED:
When I enter the NiFi UI, NiFi can't run. There is only this one app on the VM.
All 3 nodes have the same spec and configuration:
java.arg.2=-Xms4g
java.arg.3=-Xmx4g
NIFI-APP.LOG
2020-06-03 08:54:27,845 WARN [Curator-ConnectionStateManager-0] o.a.c.f.state.ConnectionStateManager Session timeout has elapsed while SUSPENDED. Injecting a session expiration. Elapsed ms: 32546. Adjusted session timeout ms: 30000
ZK-LOG
2020-06-02 18:12:45,232 [myid:1] - WARN [NIOWorkerThread-5:NIOServerCnxn#366] - Unable to read additional data from client sessionid 0x1014019b26f0005, likely client has closed socket
@Kong, a few things to consider:
Set your min/max at 4g and 8g, and work up to 60-70% of the node's total RAM. Restart NiFi often to reclaim RAM lost to memory leaks. Restart everything before adjusting for performance.
Check your max thread tuning. This needs to be based on the number of cores you have. Default settings will not make full use of the node cores.
Adjust Garbage Collection - there are quite a few posts around with info on what to do here.
Inspect flow for bad concurrency or schedule settings. A processor on the wrong schedule (0 sec for example), or with large concurrency can create instability.
I suspect you have a small combination of settings to adjust and you should see the stability you are looking for.
I suppose the JVM memory was set too high, so the VM didn't have enough headroom left for ZooKeeper to work.
After tuning the JVM down to 2G, NiFi works successfully.

Amazon MQ (ActiveMQ) bad performance on large messages

We are migrating from IBM MQ to Amazon MQ, or at least we would like to. The problem is that Amazon MQ performs badly compared to IBM MQ when a JMS producer puts a large message on a queue.
All messages are persistent; the IBM MQ system is highly available, and Amazon MQ is multi-AZ.
If we put XML files of these sizes to IBM MQ (a 2 CPU, 8 GB RAM HA instance) we get this performance:
256 KB = 15ms
4,6 MB = 125ms
9,3 MB = 141ms
18,7 MB = 218ms
37,4 MB = 628ms
74,8 MB = 1463ms
If we put the same files on Amazon MQ (mq.m5.2xlarge = 8 CPU and 32 GB RAM) or ActiveMQ we have this performance:
256 KB = 967ms
4,6 MB = 1024ms
9,3 MB = 1828ms
18,7 MB = 3550ms
37,4 MB = 8900ms
74,8 MB = 14405ms
What we also see is that IBM MQ has equal response times for sending a message to a queue and getting a message from a queue, while Amazon MQ is really fast at getting a message (e.g. it takes just 1 ms) but very slow at sending.
On Amazon MQ we use the OpenWire protocol. We use this config in Terraform style:
resource "aws_mq_broker" "default" {
broker_name = "bernardamazonmqtest"
deployment_mode = "ACTIVE_STANDBY_MULTI_AZ"
engine_type = "ActiveMQ
engine_version = "5.15.10"
host_instance_type = "mq.m5.2xlarge"
auto_minor_version_upgrade = "false"
apply_immediately = "false"
publicly_accessible = "false"
security_groups = [aws_security_group.pittensbSG-allow-mq-external.id]
subnet_ids = [aws_subnet.pittensbSN-public-1.id, aws_subnet.pittensbSN-public-3.id]
logs {
general = "true"
audit = "true"
}
We use Java 8 with JMS ActiveMQ library via POM (Maven):
<dependency>
    <groupId>org.apache.activemq</groupId>
    <artifactId>activemq-client</artifactId>
    <version>5.15.8</version>
</dependency>
<dependency>
    <groupId>org.apache.activemq</groupId>
    <artifactId>activemq-pool</artifactId>
    <version>5.15.8</version>
</dependency>
In JMS we have this Java code:
private ActiveMQConnectionFactory mqConnectionFactory;
private PooledConnectionFactory mqPooledConnectionFactory;
private Connection connection;
private Session session;
private MessageProducer producer;
private TextMessage textMessage;
private Queue queue;
this.mqConnectionFactory = new ActiveMQConnectionFactory();
this.mqPooledConnectionFactory = new PooledConnectionFactory();
this.mqPooledConnectionFactory.setConnectionFactory(this.mqConnectionFactory);
this.mqConnectionFactory.setBrokerURL("ssl://tag-1.mq.eu-west-1.amazonaws.com:61617");
this.mqPooledConnectionFactory.setMaxConnections(10);
this.connection = mqPooledConnectionFactory.createConnection();
this.connection.start();
this.session = this.connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
this.queue = this.session.createQueue("ExampleQueue");
this.producer = this.session.createProducer(this.queue);
long startTimeWrite = 0;
startTimeWrite = System.currentTimeMillis();
producer.send(textMessage); // here we send the files (textMessage holds the XML payload)
logger.debug("EXPORTTIJD_PUT - Put to queue takes: " + (System.currentTimeMillis() - startTimeWrite));
// close session, producer and connection after 10 cycles
We have also run the performance test against a single-instance Amazon MQ broker, but with the same results.
We have also run the performance test with an mq.m5.4xlarge (16 CPU, 96 GB RAM) broker, but still no improvement in the poor performance.
Performance test configuration:
We first push the messages (the XML files listed above) one by one to a queue. We do that 5 times. After 5 times we read those messages (XML files) from the queue. We call this 1 cycle.
We run 10 cycles one after another, so in total we have pushed 300 files to the queue and have read 300 files from the queue.
We run 3 tests in parallel: one from AWS region London, one from AWS region Frankfurt in a different VPC, and one from Frankfurt in the same VPC as the Amazon MQ broker and in the same subnet. All clients run on an m4.xlarge EC2 instance.
If we run a test with only one client, for example only the local one in the same VPC and subnet as the Amazon MQ broker, the performance improves and we get these results:
256 KB = 72ms
4,6 MB = 381ms
9,3 MB = 980ms
18,7 MB = 2117ms
37,4 MB = 3985ms
74,8 MB = 7781ms
The client and server are in the same subnet, so firewalling etc. is not a factor.
Maybe somebody can tell me what is wrong, and why we get such terrible performance with Amazon MQ or ActiveMQ?
extra info:
Response times are measured in the JMS Java app by taking a timestamp just before producer.send('XML') and another just after it; the difference is the recorded time. Times are averages over 300 calls.
IBM MQ server is located in our datacenter, and client app is running at a server in the same datacenter.
extra info test:
The JMS app starts and creates the connection factory, queues and sessions. Then it uploads the files to MQ one by one. This is a cycle; it then runs this cycle 10 times in a for loop without opening or closing sessions, queues or connection factories. Then all 60 messages are read from the queue and written to files on the local drive. Then it closes the connection factory, session and producer/consumer. This is one batch.
Then we run 5 batches, so between batches the connection factory, queue and session are recreated.
In response to Sam:
When I execute the test with the same file sizes you used, Sam, I get roughly the same response times; with persistence mode set to false the values are shown in parentheses:
500 KB = 30ms (6ms)
1 MB = 50ms (13ms)
2 MB = 100ms (24ms)
I removed the connection pooling and I set
concurrentStoreAndDispatchQueues="false"
The system I used: broker mq.m5.2xlarge and client m4.xlarge.
But if I test with bigger files, these are the response times:
256 KB = 72ms
4,6 MB = 381ms
9,3 MB = 980ms
18,7 MB = 2117ms
37,4 MB = 3985ms
74,8 MB = 7781ms
I have a very simple requirement. One system puts messages on a queue and another system gets them from the queue, sometimes at the same time, sometimes not; sometimes there are 20 or 30 messages on the queue before they get unloaded. That's why I need a queue, the messages must be persistent, and it must be a Java JMS implementation.
I think Amazon MQ might be a solution for small files, but for big files it is not. I think we have to use IBM MQ for this case, which has better performance. One important caveat: I tested IBM MQ only on premises in our LAN. We tried to test IBM MQ on Amazon, but we haven't succeeded yet.
I tried to reproduce the scenario you were testing. When I ran a JMS client in the same VPC as the AmazonMQ broker for mq.m5.4xlarge broker with an Active and Standby instance, I see the following roundtrip latencies - measuring the moment from which a producer sends a message to the moment when consumer receives the message.
2MB - 50ms
1MB - 31ms
500KB - 15ms
My code just created a connection and a session. I did not use a PooledConnectionFactory (stating this as a matter of fact, not saying/suspecting that's the cause). Also it is better to strip down the code to bare minimum in order to establish a baseline and remove noise when doing performance testing. That way, when you introduce additional code, you can easily see if the new code introduced a performance issue. I used the default broker configuration.
In ActiveMQ there is a concept of fast producers and fast consumers: if the consumer can process messages at the same rate as the producer, the broker transfers each message from producer to consumer via memory and then writes it to disk. This is the default behavior and is controlled by a broker configuration setting named concurrentStoreAndDispatch, which is true by default.
If the consumer is unable to keep up with the producer, it becomes a "slow" consumer, and with the concurrentStoreAndDispatch flag set to true you take a performance hit.
ActiveMQ provides advisory topics which you can subscribe to in order to detect slow consumers. If you detect that the consumer is in fact slower than the producer, it is better to set the concurrentStoreAndDispatch flag to false to get better performance.
I didn't get any further response.
I think that's because there is no solution for this performance problem. Amazon MQ is a cloud service, and maybe that's the reason why performance is this bad.
IBM MQ is a different architecture, and it is on premises.
I have to investigate the performance of ActiveMQ some more before I can tell exactly what the reason for this problem is.

How to configure Redis connections with Rails 4, Puma and Sidekiq?

I am using Sidekiq (on Heroku with Puma) to send emails asynchronously and would like to use Redis to keep counters and cache models.
RedisCloud's free plan includes 30 connections to Redis. It is not clear to me how to manage:
redis connections used by Sidekiq
redis connections used in models (caching and counters)
Sidekiq Client size is configured like this:
Sidekiq.configure_client do |config|
  config.redis = {url: ENV["REDISCLOUD_URL"], size: 3}
end
If I understood this correctly, Puma forks multiple processes, 2 in my case, which will result in:
2 (Puma Workers) * 3 (size) * 1 (Web Dyno) = 6 connections to redis used to push jobs.
Sidekiq Server
With Sidekiq taking 2 connections (or 5 in version 4), setting a concurrency of 10 would result in a default server size of 12 or 15.
If I wanted to use all the remaining available connections (30 - 6 = 24), I could set :
Sidekiq.configure_client do |config|
  config.redis = { size: 19 }
end
Total Redis connections would be 19 + 5 (Sidekiq 4) = 24, and using the default concurrency of 25 would be OK.
As Mike Perham stated, in general the concurrency must not be more than (server pool size - 2) * 2.
Now, where it starts to get confusing for me is the use of Redis out of Sidekiq.
# initializers/redis.rb
$redis = Redis.new(:url => uri)
Whenever I use Redis in a model or controller I call it like so:
$redis.hincrby("mycounter", "key", 1)
As I understand it, all the puma threads wait on each other on a single Redis connection when $redis.whateverFunction is called.
In this answer, What is the best way to use Redis in a Multi-threaded Rails environment? (Puma / Sidekiq), the recommended approach is using the connection_pool gem, as also described in the Sidekiq wiki: https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
require 'connection_pool'
$redis = ConnectionPool.new(size: 10) { Redis.new }
If I understand it right, in that case $redis.whateverFunction would have its own connection pool of 10, and Sidekiq its own worker connection pool, which would now have to fit within a new total of 20 Redis connections (30 available minus the 10 model connections), so the Sidekiq client and server sizes would need to be changed.
How do you determine the size of the connection pool (here 10) needed for model/controller Redis connections? And since Redis is single-threaded, how does increasing the connection pool actually increase Redis operation performance?
Any thoughts on this would be of great help.
Thx!
Redis is single-threaded, but it is written in pure C, uses an event loop internally and handles connections asynchronously, so for the same number of requests the connection count does not affect it much. It can handle requests faster than your application can generate them (because of network delay, Ruby being slower than compiled and optimized C, etc.), so you do not need to worry about it being single-threaded.
Increasing the number of connections is beneficial for concurrent requests from different threads: there is no need to wait for a response to be delivered over the network before the connection is freed up, and Ruby can do parallel IO.
Also, you can tell the pool is too small when connection checkout times become worse than you expect/tolerate and the corresponding thread/worker sits idle while waiting; so benchmark your code and take a good look at your actual usage and behavior patterns.
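To illustrate the pooling idea concretely, here is the same pattern sketched in Python with redis-py (the Ruby connection_pool gem behaves analogously; the host/port and pool size are just example values, not recommendations):

import redis

# BlockingConnectionPool makes a thread wait (up to `timeout` seconds) for a
# free connection instead of erroring when all 10 are checked out.
pool = redis.BlockingConnectionPool(host="localhost", port=6379, db=0,
                                    max_connections=10, timeout=5)
r = redis.Redis(connection_pool=pool)

# Each command checks a connection out of the pool and returns it when done,
# so concurrent threads don't serialize on a single socket.
r.hincrby("mycounter", "key", 1)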
On the other hand, I'd advise against using the entire connection limit; there are times when you might need those extra connections. For example:
for graceful/"zero downtime" dyno restarts ("preboot") you need twice the connections, since old processes are still running for some time
keep at least one free connection for emergency debugging, as you may want to be able to connect from a console directly and see what data is inside when some unexpected high load comes

MongoDB-Java performance with rebuilt Sync driver vs Async

I have been testing MongoDB 2.6.7 for the last couple of months using YCSB 0.1.4. I have captured good data comparing SSD to HDD and am producing engineering reports.
After my testing was completed, I wanted to explore the allanbank async driver. When I got it up and running (I am not a developer, so it was a challenge for me), I first wanted to try the rebuilt sync driver. I found performance improvements of 30-100%, depending on the workload, and was very happy with it.
Next, I tried the async driver. I was not able to see much difference between it and my results with the native driver.
The command I'm running is:
./bin/ycsb run mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://192.168.0.13:27017/ycsb -p mongodb.writeConcern=strict -threads 96
Over the course of my testing (mostly with the native driver), I have experimented with more and less threads than 96; turned on "noatime"; tried both xfs and ext4; disabled hyperthreading; disabled half my 12 cores; put the journal on a different drive; changed sync from 60 seconds to 1 second; and checked the network bandwidth between the client and server to ensure its not oversubscribed (10GbE).
Any feedback or suggestions welcome.
The async move exceeded my expectations. My experience is with the Python sync driver (pymongo) and the async driver (motor), and the async driver achieved greater than 10x the throughput. Further, motor still uses pymongo under the hood but adds the async capability; that could easily be the case with your allanbank driver as well.
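A minimal sketch of what that looks like with motor (assuming the asyncio flavor of motor and placeholder database/collection names; the point is that many operations can be in flight on a single event-loop thread):

import asyncio
from motor.motor_asyncio import AsyncIOMotorClient

async def main():
    client = AsyncIOMotorClient("mongodb://192.168.0.13:27017/")
    coll = client["ycsb"]["usertable"]
    # issue 100 reads concurrently from one thread; the event loop interleaves them
    results = await asyncio.gather(
        *(coll.find_one({"_id": f"user{i}"}) for i in range(100)))
    print(sum(r is not None for r in results), "documents found")

asyncio.run(main())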
Often the dramatic changes come from threading policies and OS configurations.
Async code needn't and shouldn't use any more threads than there are cores on the VM or machine. For example, if your server code is spawning a new thread per incoming connection, then all bets are off, so start by looking at the way the driver is being utilized. A 4-core machine should use <= 4 incoming threads.
On the OS level, you may have to fine-tune parameters like net.core.somaxconn, net.core.netdev_max_backlog, sys.fs.file_max, and nofile in /etc/security/limits.conf. A good place to start is nginx-related performance guides, including this one; nginx is the server that spearheaded, or at least caught the attention of, many Linux sysadmin enthusiasts. Contrary to popular lore, you should reduce your keepalive timeout rather than lengthen it; the default keep-alive timeout is an absurd number of seconds (4 hours), and you might want to cut the cord at 1 minute. Basically, aim for short, sweet relationships with your client connections.
Bear in mind that Mongo is not async, so you can use a Mongo driver pool. Nevertheless, don't let the driver get stalled on slow queries; cut them off after 5 to 10 seconds using the Java equivalents of the following (pymongo) settings. I'm just cutting and pasting here, with no recommendations.
# Specifies a time limit for a query operation. If the specified time is exceeded, the operation will be aborted and ExecutionTimeout is raised. If max_time_ms is None no limit is applied.
# Raises TypeError if max_time_ms is not an integer or None. Raises InvalidOperation if this Cursor has already been used.
CONN_MAX_TIME_MS = None
# socketTimeoutMS: (integer) How long (in milliseconds) a send or receive on a socket can take before timing out. Defaults to None (no timeout).
CLIENT_SOCKET_TIMEOUT_MS=None
# connectTimeoutMS: (integer) How long (in milliseconds) a connection can take to be opened before timing out. Defaults to 20000.
CLIENT_CONNECT_TIMEOUT_MS=20000
# waitQueueTimeoutMS: (integer) How long (in milliseconds) a thread will wait for a socket from the pool if the pool has no free sockets. Defaults to None (no timeout).
CLIENT_WAIT_QUEUE_TIMEOUT_MS=None
# waitQueueMultiple: (integer) Multiplied by max_pool_size to give the number of threads allowed to wait for a socket at one time. Defaults to None (no waiters).
CLIENT_WAIT_QUEUE_MULTIPLY=None
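As a concrete (non-Java) reference, here is roughly how those knobs map onto a pymongo 3.x-style MongoClient; the values and the ycsb/usertable names are just examples, not recommendations:

from pymongo import MongoClient

client = MongoClient(
    "mongodb://192.168.0.13:27017/",
    connectTimeoutMS=20000,      # how long opening a connection may take
    socketTimeoutMS=10000,       # how long a single send/receive may take
    waitQueueTimeoutMS=5000,     # how long a thread waits for a pooled socket
)
coll = client["ycsb"]["usertable"]

# Per-query server-side time limit: abort the query if it runs longer than 5 s.
docs = list(coll.find({"field0": {"$exists": True}}).max_time_ms(5000))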
Hopefully you will have the same success. I was ready to bail on Python prior to async.

Hector is unable to read Cassandra data when nodes reboot or terminate

We are trying to run a Cassandra cluster on AWS/EC2 within a standard VPC footprint (Cassandra nodes on private subnets). Because this is AWS, there is always a chance that an EC2 instance will terminate or reboot with no warning. I have been simulating this case on a test cluster, and I am seeing things that I thought a cluster was supposed to prevent: if a node reboots, some data goes temporarily missing until the node completes its reboot; if a node terminates, it appears that some data is lost forever.
For my test I just did a bunch of writes (using QUORUM consistency) to some keyspaces and then interrogated the contents of those keyspaces as I brought down nodes (either through reboot or terminate). I'm just using cqlsh SELECT to do the keyspace/column family interrogation of the cluster, at consistency level ONE.
Note that even though I am performing no writes to the cluster while I am doing the SELECTs, rows temporarily disappear during a reboot and can permanently go missing during a termination.
I thought Netflix Priam might be able to help, but sadly it doesn't work in a VPC the last time I checked.
Also, because we are using ephemeral storage instances there is no equivalent of 'shutdown' so I cannot run any scripts during reboot/terminate of an instance to perform a nodetool decommission or nodetool removenode before an instance goes away. Terminate is the equivalent of kicking the plug out of the wall.
Since I am using a replication factor of 3 and QUORUM writes, that should mean all data is written to at least 2 nodes. So, unless I am totally misunderstanding things (which is possible), losing one node should not mean that I lose any data for any period of time when I am reading at consistency level ONE.
Questions
Why wouldn't a 6 node cluster with a replication factor of 3 work?
Do I need to run something like a 12 node cluster with a replication factor of 7? Don't bother telling me that will fix the problem, because it doesn't.
Do I need to use consistency level of ALL on the writes then use ONE or QUORUM on the reads?
Is there something not quite right with virtual nodes? (unlikely)
Are there nodetool commands besides removenode that I need to run when a node terminates to recover missing data? As mentioned earlier, when a reboot occurs, eventually the missing data reappears.
Is there some cassandra savant who can look at my cassandra.yaml file below and send me on the path to salvation?
More Info added 7/19
I don't think QUORUM vs. ONE vs. ALL is the issue. The test I set up performs no writes to the keyspaces after the initial population of the column families, so the data has had plenty of time (hours) to make it to all the nodes as required by the replication factor. Plus, the test dataset is REALLY small (2 column families with about 300-1000 values each). In other words, the data is completely static.
The behavior I am seeing seems to be tied to the fact that the EC2 instance is no longer on the network. The reason I say this is that if I log on to a node and just do a cassandra stop, I see no loss of data; but if I do the reboot or terminate, I start getting the following in a stack trace.
CassandraHostRetryService - Downed Host Retry service started with queue size -1 and retry delay 10s
CassandraHostRetryService - Downed Host retry shutdown complete
CassandraHostRetryService - Downed Host retry shutdown hook called
Caused by: TimedOutException()
Caused by: TimedOutException()
So it seems to be more of a network communication issue, in that the cluster is expecting, for example, 10.0.12.74 to be on the network after it has joined the cluster. If that IP is suddenly unreachable, either due to reboot or termination, the timeouts start happening.
When I do a nodetool status under all three scenarios (cassandra stop, reboot or terminate) the status of the node shows up as DN. Which is what you would expect. Eventually nodetool status will return to UN with cassandra start or reboot, but obviously termination always stays DN.
Details of my Configuration
Here are some details of my configuration (cassandra.yaml is at the bottom of this posting):
Nodes are running in private subnets of a VPC.
Cassandra 1.2.5 with num_tokens: 256 (virtual nodes) and initial_token: (blank). I am really hoping this works, because all of our nodes run in autoscaling groups, so the thought that redistribution could be handled dynamically is appealing.
EC2 m1.large; one seed and one non-seed node in each availability zone (so 6 total nodes in the cluster).
Ephemeral storage, not EBS.
Ec2Snitch with NetworkTopologyStrategy and all keyspaces have replication factor of 3.
Non-seed nodes are auto_bootstrapped, seed nodes are not.
sample cassandra.yaml file
cluster_name: 'TestCluster'
num_tokens: 256
initial_token:
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /opt/company/dbserver/caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "SEED_IP_LIST"
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 8
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: LISTEN_ADDRESS
start_native_transport: false
native_transport_port: 9042
start_rpc: true
rpc_address: 0.0.0.0
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: true
snapshot_before_compaction: false
auto_bootstrap: AUTO_BOOTSTRAP
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: Ec2Snitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
client_encryption_options:
    enabled: false
    keystore: conf/.keystore
    keystore_password: cassandra
internode_compression: all
I think http://www.datastax.com/documentation/cassandra/1.2/cassandra/dml/dml_config_consistency_c.html will clear up a lot of this. In particular, QUORUM writes with ONE reads are not guaranteed to return the most recent data; QUORUM writes with QUORUM reads are, and so are ALL writes with ONE reads, but that last combination is intolerant of failure on write.
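To spell out the arithmetic: strongly consistent reads require (replicas written) + (replicas read) > RF. With RF = 3, a QUORUM write touches 2 replicas and a ONE read touches 1, and 2 + 1 = 3 is not greater than 3, so a read can land on the one replica the write missed until hinted handoff or repair catches it up. QUORUM/QUORUM gives 2 + 2 = 4 > 3. For illustration only (the question uses Hector/Java, but the idea is identical in the Python driver; the keyspace, table and node address below are placeholders):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["10.0.12.74"]).connect("my_keyspace")

# Read and write at QUORUM so the replica sets overlap (2 + 2 > 3 with RF = 3).
write = SimpleStatement("INSERT INTO cf (key, value) VALUES (%s, %s)",
                        consistency_level=ConsistencyLevel.QUORUM)
read = SimpleStatement("SELECT value FROM cf WHERE key = %s",
                       consistency_level=ConsistencyLevel.QUORUM)

session.execute(write, ("k1", "v1"))
row = session.execute(read, ("k1",)).one()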
Edit to go with the new information:
CassandraHostRetryService is part of Hector. I assumed you were testing with cqlsh like a sane person would. Lessons:
Use cqlsh for testing
Use the DataStax Java Driver for building your application, which is faster, easier to use, and has more insight into the cluster state than Hector thanks to the native protocol it's built on.

Resources