Replication queue got stuck on error - ClickHouse

I have a 3-node cluster with a replication factor of 2 and a replicated table, stats.
Recently I noticed a delay on one of the replicas via the /replicas_status endpoint:
db.stats: Absolute delay: 0. Relative delay: 0.
db2.stats: Absolute delay: 912916. Relative delay: 912916.
Here is the data from system.replication_queue:
Row 1:
──────
database: db2
table: stats
replica_name: replica_2
position: 3
node_name: queue-0001743101
type: GET_PART
create_time: 2018-06-19 20:57:42
required_quorum: 0
source_replica: replica_1
new_part_name: 20180619_20180619_823572_823572_0
parts_to_merge: []
is_detach: 0
is_currently_executing: 0
num_tries: 917943
last_exception:
last_attempt_time: 2018-06-29 15:32:50
num_postponed: 118617
postpone_reason:
last_postpone_time: 2018-06-29 15:32:23
Row 2:
──────
database: db2
table: stats
replica_name: replica_2
position: 4
node_name: queue-0001743103
type: MERGE_PARTS
create_time: 2018-06-19 20:57:48
required_quorum: 0
source_replica: replica_1
new_part_name: 20180619_20180619_823568_823573_1
parts_to_merge: ['20180619_20180619_823568_823568_0','20180619_20180619_823569_823569_0','20180619_20180619_823570_823570_0','20180619_20180619_823571_823571_0','20180619_20180619_823572_823572_0','20180619_20180619_823573_823573_0']
is_detach: 0
is_currently_executing: 0
num_tries: 917943
last_exception: Code: 234, e.displayText() = DB::Exception: No active replica has part 20180619_20180619_823568_823573_1 or covering part, e.what() = DB::Exception
last_attempt_time: 2018-06-29 15:32:50
num_postponed: 199384
postpone_reason: Not merging into part 20180619_20180619_823568_823573_1 because part 20180619_20180619_823572_823572_0 is not ready yet (log entry for that part is being processed).
last_postpone_time: 2018-06-29 15:32:35
Any clue how to deal with it?
Should I detach the broken partition on the replica and attach it again?

Stop all inserts to this cluster; the replication queue should then clear itself automatically.
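If the entries are still stuck after inserts are paused, the detach/re-attach route the question mentions is the usual next step. Below is only a rough sketch with clickhouse-client, assuming the partition ID is 20180619 (inferred from the part names above, so verify it in system.parts first) and that the healthy replica still holds the data:

# Watch whether the stuck entries drain from the queue on the lagging replica.
clickhouse-client --query "
    SELECT node_name, type, new_part_name, num_tries, last_exception
    FROM system.replication_queue
    WHERE database = 'db2' AND table = 'stats'"

# If they never drain, detach the affected partition on the lagging replica and
# attach it again; adjust the partition expression to match the table's
# partitioning scheme before running this.
clickhouse-client --query "ALTER TABLE db2.stats DETACH PARTITION '20180619'"
clickhouse-client --query "ALTER TABLE db2.stats ATTACH PARTITION '20180619'"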

Related

Daily index not created

On my single test server with 8 GB of RAM (1955m allocated to the JVM), running Elasticsearch 7.4, I have 12 application indices plus a few system indices (such as .monitoring-es-7-2021.08.02, .monitoring-logstash-7-2021.08.02, .monitoring-kibana-7-2021.08.02) created daily. So on average Elasticsearch creates about 15 indices per day.
Today only two indices were created:
curl --silent -u elastic:xxxxx 'http://127.0.0.1:9200/_cat/indices?v' | grep '2021.08.03'
yellow open metricbeat-7.4.0-2021.08.03 KMJbbJMHQ22EM5Hfw 1 1 110657 0 73.9mb 73.9mb
green open .monitoring-kibana-7-2021.08.03 98iEmlw8GAm2rj-xw 1 0 3 0 1.1mb 1.1mb
I think the reason for this is the following. While looking into the Elasticsearch logs, I found:
[2021-08-03T12:14:15,394][WARN ][o.e.x.m.e.l.LocalExporter] [elasticsearch_1] unexpected error while indexing monitoring document org.elasticsearch.xpack.monitoring.exporter.ExportException: org.elasticsearch.common.ValidationException: Validation Failed: 1: this action would add [1] total shards, but this cluster currently has [1000]/[1000] maximum shards open;
Logstash logs for the application index and the Filebeat index:
[2021-08-03T05:18:05,246][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ping_server-2021.08.03", :_type=>"_doc", :routing=>nil}, #LogStash::Event:0x44b98479], :response=>{"index"=>{"_index"=>"ping_server-2021.08.03", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"}}}}
[2021-08-03T05:17:38,230][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-7.4.0-2021.08.03", :_type=>"_doc", :routing=>nil}, #LogStash::Event:0x1e2c70a8], :response=>{"index"=>{"_index"=>"filebeat-7.4.0-2021.08.03", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"}}}}
Active and unassigned shards add up to 1000:
"active_primary_shards" : 512,
"active_shards" : 512,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 488,
"delayed_unassigned_shards" : 0,
"active_shards_percent_as_number" : 51.2
If I check with the command below, I see that all unassigned shards are replica shards:
curl --silent -XGET -u elastic:xxxx http://localhost:9200/_cat/shards | grep 'UNASSIGNED'
.
.
dev_app_server-2021.07.10 0 r UNASSIGNED
apm-7.4.0-span-000028 0 r UNASSIGNED
ping_server-2021.07.02 0 r UNASSIGNED
api_app_server-2021.07.17 0 r UNASSIGNED
consent_app_server-2021.07.15 0 r UNASSIGNED
Q. For now, can I safely delete the unassigned shards to free up shard capacity, since this is a single-node cluster?
Q. Can I change the settings online from 2 shards per index (1 primary and 1 replica) to a single primary shard, since this is a single server?
Q. If I have to keep one year of indices, is the calculation below correct?
15 indices daily with one primary shard * 365 days = 5475 total shards (say 6000, rounded up)
Q. Can I set 6000 shards as the shard limit for this node so that I never face this shard issue again?
Thanks,
You have a lot of unassigned shards (probably because you have a single node and all indices have replicas=1), so it's easy to get rid of all of them, and of the error at the same time, by running the following command:
PUT _all/_settings
{
"index.number_of_replicas": 0
}
Regarding the number of indices, you probably don't have to create one index per day if those indices stay small (i.e. below 10 GB each). The default limit of 1000 shards is then more than enough, without you having to change anything.
You should simply leverage Index Lifecycle Management to keep your index size at bay and avoid creating too many small indices.
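For reference, the same call can be issued with curl from the shell, reusing the host and credentials from the question; the follow-up health check just confirms the unassigned replica count drops to zero:

# Drop all replica shards; on a single-node cluster they can never be assigned
# anyway, and removing them immediately frees half of the shard budget.
curl --silent -u elastic:xxxxx -X PUT 'http://127.0.0.1:9200/_all/_settings' \
  -H 'Content-Type: application/json' \
  -d '{ "index.number_of_replicas": 0 }'

# Verify the shard totals afterwards.
curl --silent -u elastic:xxxxx 'http://127.0.0.1:9200/_cluster/health?pretty' \
  | grep -E 'active_shards|unassigned_shards'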

Elasticsearch: re-enabling shard allocation ineffective?

I am running a 2-node cluster on version 5.6.12.
I followed this rolling upgrade guide: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/rolling-upgrades.html
After reconnecting the last upgraded node to my cluster, the health status remained yellow due to unassigned shards.
Re-enabling shard allocation seemed to have no effect:
PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.enable": "all"
}
}
My query results when checking cluster health:
GET _cat/health:
1541522454 16:40:54 elastic-upgrade-test yellow 2 2 84 84 0 0 84 0 - 50.0%
GET _cat/shards:
v2_session-prod-2018.11.05 3 p STARTED 6000 1016kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 3 r UNASSIGNED
v2_session-prod-2018.11.05 1 p STARTED 6000 963.3kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 1 r UNASSIGNED
v2_session-prod-2018.11.05 4 p STARTED 6000 1020.4kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 4 r UNASSIGNED
v2_session-prod-2018.11.05 2 p STARTED 6000 951.4kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 2 r UNASSIGNED
v2_session-prod-2018.11.05 0 p STARTED 6000 972.2kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 0 r UNASSIGNED
v2_status-prod-2018.11.05 3 p STARTED 6000 910.2kb xx.xxx.xx.xxx node-25
v2_status-prod-2018.11.05 3 r UNASSIGNED
Is there another way to get shard allocation working again so that I can get my cluster health back to green?
The other node within my cluster had a "high disk watermark [90%] exceeded" warning message so shards were "relocated away from this node".
I updated the config to:
cluster.routing.allocation.disk.watermark.high: 95%
After restarting the node, shards began to allocate again.
This is a quick fix - I will also attempt to increase the disk space on this node to ensure I don't lose reliability.
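For what it's worth, the disk watermarks are dynamic cluster settings, so the same change can also be applied without editing the config file or restarting the node; a sketch, assuming the cluster is reachable on localhost:9200:

# Raise the high disk watermark at runtime (transient settings are lost on a
# full cluster restart). Revert this once disk space has been freed.
curl -X PUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{ "transient": { "cluster.routing.allocation.disk.watermark.high": "95%" } }'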

Tarantool sophia makes slow selects?

Tarantool version used: Tarantool 1.6.8-586-g504e151
It was installed from EPEL.
I use Tarantool with the sophia engine:
log_space = box.schema.space.create('logs', {
    engine = 'sophia',
    if_not_exists = true
})
log_space:create_index('primary', {
    parts = {1, 'STR'}
})
I have 500,000 records and make a select request:
box.space.logs:select({'log_data'})
It takes about 1 minute. Why is it so slow?
unix/:/var/run/tarantool/g_sofia.control> box.stat()
---
- DELETE:
    total: 0
    rps: 0
  SELECT:
    total: 587575
    rps: 25
  INSERT:
    total: 815315
    rps: 34
  EVAL:
    total: 0
    rps: 0
  CALL:
    total: 0
    rps: 0
  REPLACE:
    total: 1
    rps: 0
  UPSERT:
    total: 0
    rps: 0
  AUTH:
    total: 0
    rps: 0
  ERROR:
    total: 23
    rps: 0
  UPDATE:
    total: 359279
    rps: 17
The sophia engine has been deprecated since 1.7.x. Please use the vinyl engine instead.
Please take a look here for more details: https://www.tarantool.io/en/doc/1.10/book/box/engines/vinyl/
After direct on-site help and debugging with agent-0007, we found several issues.
Most of them were related to the slow virtual environment (OpenVZ was used), which showed inadequate pread() stalls and I/O timings.
Additionally, we found two integration issues:
https://github.com/tarantool/tarantool/issues/1411 (SIGSEGV in eio_finish)
https://github.com/tarantool/tarantool/issues/1401 (Bug in upsert applier callback function using sophia)
Thanks.

Cassandra read latency high even with row caching, why?

I am testing Cassandra performance with a simple model:
CREATE TABLE "NoCache" (
key ascii,
column1 ascii,
value ascii,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.010000 AND
caching='ALL' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
I am fetching 100 columns of a row key using pycassa's get/xget functions, but I am getting a read latency of about 15 ms on the server.
columns = COL_FAM.get(row_key, column_count=100)
nodetool cfstats
Column Family: NoCache
SSTable count: 1
Space used (live): 103756053
Space used (total): 103756053
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 20
Read Latency: 15.717 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 976
Compacted row minimum size: 4769
Compacted row maximum size: 557074610
Compacted row mean size: 87979499
A latency like this is astonishing, given that nodetool info shows the reads hit the row cache directly:
Row Cache : size 4834713 (bytes), capacity 67108864 (bytes), 35 hits, 38 requests, 1.000 recent hit rate, 0 save period in seconds
Can anyone tell me why Cassandra is taking so much time when reading from the row cache?
Enable tracing and see what it's doing. http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
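A sketch of what enabling tracing can look like from the shell, assuming Cassandra 1.2 or later, where probabilistic tracing and the system_traces keyspace are available (pycassa goes through Thrift, so sampled tracing is the way to catch those requests):

# Sample every request for a short while (use a lower value on a busy node).
nodetool settraceprobability 1
# ...reproduce the slow reads with pycassa here, then switch sampling back off...
nodetool settraceprobability 0

# The recorded sessions and their per-step timings land in the system_traces
# keyspace; the trace events show where the 15 ms is actually being spent.
echo "SELECT session_id, duration, started_at FROM system_traces.sessions LIMIT 10;" | cqlsh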

Caching not Working in Cassandra

I don't seem to have any caching enabled when checking in OpsCenter or cfstats. I'm running Cassandra 1.1.7 with Solandra on Debian. I have set the required global options in cassandra.yaml:
key_cache_size_in_mb: 800
key_cache_save_period: 14400
row_cache_size_in_mb: 800
row_cache_save_period: 15400
row_cache_provider: SerializingCacheProvider
Column Families were created as follows:
create column family example
with column_type = 'Standard'
and comparator = 'BytesType'
and default_validation_class = 'BytesType'
and key_validation_class = 'BytesType'
and read_repair_chance = 1.0
and dclocal_read_repair_chance = 0.0
and gc_grace = 864000
and min_compaction_threshold = 4
and max_compaction_threshold = 32
and replicate_on_write = true
and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
and caching = 'ALL';
OpsCenter shows no data available on the caching graphs, and cfstats doesn't show any cache-related fields:
Column Family: charsets
SSTable count: 1
Space used (live): 5558
Space used (total): 5558
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 61381
Read Latency: 0.123 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 16
Compacted row minimum size: 1917
Compacted row maximum size: 2299
Compacted row mean size: 2299
Any help or suggestions are appreciated.
Sam
The caching stats have been moved from cfstats to info in Cassandra 1.1. If you run nodetool info you should see something like:
Key Cache : size 5552 (bytes), capacity 838860800 (bytes), 38 hits, 47 requests, 0.809 recent hit rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 838860800 (bytes), 0 hits, 0 requests, NaN recent hit rate, 15400 save period in seconds
This is because the caches are now global rather than per-CF. It seems that OpsCenter needs updating for this change; maybe there is a later version available that handles it.
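As a quick sanity check that the yaml settings above actually took effect, the same nodetool output can be filtered for the cache lines; with key_cache_size_in_mb and row_cache_size_in_mb both set to 800, the capacities should read 838860800 bytes:

# Print only the global key/row cache statistics.
nodetool info | grep -i cache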
