How to speed up a big query in ClickHouse?

Background:
I submitted a local query in ClickHouse (without using cache), and it processed 414.43 million rows, 42.80 GB.
The query lasted 100+ seconds.
My ClickHouse instance is installed on an AWS c5.9xlarge EC2 instance with a 12 TB st1 EBS volume.
During this query, IOPS peaks at 500 and read throughput at about 20 MB/s.
For comparison, an st1 EBS volume's maximum IOPS is 500 and its maximum throughput is 500 MB/s.
Here are my questions:
Does 500 IOPS actually limit my query (file-reading) speed? (Never mind the cache.) Should I change the EBS volume type to gp2 or io1 to increase IOPS?
Is there any setting that can improve throughput at the same IOPS? (As far as I can see, throughput is far from the ceiling.)
I tried increasing 'max_block_size' to read more data at a time, but it doesn't seem to help.
How can I extend the cache time? The big query takes minutes; a cached run takes seconds, but the cache doesn't seem to last very long.
How can I warm up columns to cover all my queries? Please show the SQL.

Does 500 IOPS actually limit my query (file-reading) speed?
Yes. At 500 IOPS with reads averaging roughly 40 KB each (20 MB/s ÷ 500), it is the volume's IOPS ceiling, not its throughput ceiling, that bounds the scan.
Should I change the EBS volume type to gp2 or io1 to increase IOPS?
Yes. Both gp2 and io1 offer far more IOPS than st1's 500.
Is there any setting that can improve throughput at the same IOPS?
tune max_bytes_to_read
reduce the number of columns in the SELECT
reduce the number of parts the query has to read (a way to check this is sketched below)
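A minimal sketch of how these knobs are applied per query, assuming a MergeTree table T with hypothetical columns a, b, c and a date column d; max_threads and max_read_buffer_size are real ClickHouse settings, but the values here are only illustrative and whether a larger read buffer helps depends on the storage:

-- Read only the columns you need, restrict the scan, and raise
-- parallelism / read-buffer size for this query only.
SELECT a, b, c
FROM T
WHERE d >= today() - 7
SETTINGS max_threads = 16, max_read_buffer_size = 4194304;

-- Check how many active parts the query has to touch;
-- many small parts mean many small reads per column.
SELECT count() AS active_parts
FROM system.parts
WHERE table = 'T' AND active;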
How can I extend the cache time?
The cache in question is the OS page cache. Set the MergeTree setting min_merge_bytes_to_use_direct_io = 1 (shown below) so that background merges use direct I/O and stop evicting recently read column data from the page cache.
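One way to apply it, as a sketch, assuming a MergeTree table named T; if your ClickHouse version does not support ALTER TABLE ... MODIFY SETTING, put the setting in the <merge_tree> section of config.xml instead:

-- Force all merges to use direct I/O so background merges
-- stop pushing recently read column data out of the page cache.
ALTER TABLE T MODIFY SETTING min_merge_bytes_to_use_direct_io = 1;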
How can I warm up columns to cover all my queries? Please show the SQL.
SELECT a, b, c, d FROM T FORMAT Null
Reading the columns and discarding the output (FORMAT Null) pulls their data into the OS page cache.
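A variant of the same warm-up trick, as a sketch: if only recent data needs to stay hot, warm just that range on a schedule (the date column d and the 7-day window are assumptions):

-- Read the hot columns for the last week and discard the output;
-- run this periodically (e.g. from cron) to keep the page cache populated.
SELECT a, b, c, d
FROM T
WHERE d >= today() - 7
FORMAT Null;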

Related

optimization on old indexes collecting logs from my apps

I have an Elasticsearch cluster with 3 nodes (each with 6 CPUs, 31 GB heap, 64 GB RAM) collecting 25 GB of logs per day. After 3 months I realized my dashboards have become very slow when checking stats for past weeks. Please advise whether there is an option to improve the index read performance so the dashboard stats are calculated faster.
Thanks!
I would suggest you try increasing the number of shards.
When you have more shards, Elasticsearch splits your data over them, so it can send multiple parallel requests that each search a smaller slice of the data.
For the shard count, you could size it based on your heap memory.
No matter what actual JVM heap size you have, keep the shard count below roughly 20 shards per 1 GB of heap configured on the server; with a 31 GB heap that means no more than about 620 shards per node.
ElasticSearch - Optimal number of Shards per node
https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index
https://opster.com/elasticsearch-glossary/elasticsearch-choose-number-of-shards/
It seems that the amount of data that you accumulated and use for your dashboard is causing performance problems.
A straightforward option is to increase your cluster's resources but then you're bound to hit the same problem again. So you should rather rethink your data retention policy.
Chances are that you are really only interested in the most recent data. Decide what "recent" means in your use case and simply discard anything older than that.
Elasticsearch has tools to automate this, look into Index Lifecycle Management.
What you probably need is to create an index template and apply a lifecycle policy to it. Elasticsearch will then handle automatic rollover of indices, eviction of old data, even migration through data tiers in hot-warm-cold architecture if you really want very long retention periods.
All this will lead to a more predictable performance of your cluster.

How to determine what causes ES's query API instability

Normally, my ES query API takes less than 1 s, but sometimes these queries get slow.
The cluster consists of three 32 GB machines (16 GB allocated to ES). The index consists of 20 primary shards and 1 replica, with a doc count of 303,000,000, 500 GB of primary storage, and 1 TB of total storage.
Here's kibana's monitoring data:
[Kibana monitoring screenshot]
Personally, I think it's the result of GC. I want to add machines, but I need to find a reason to convince my leader.
Yes it could be a GC problem. But can you be more specific? What do you mean by slow?
Anyway, it seems the allocated heap is way too large for your needs. You have a collection when the heap is at 12 GB (75% of 16 GB), and it goes back to 5 GB every time. That generates huge garbage collections.
You should try lowering the heap to around 10 GB and check the impact on performance, GC count, and GC duration.
I also recommend you read this article https://www.elastic.co/blog/a-heap-of-trouble, especially the "Together We Can Prevent Forest Fires" part.

How to increase Greenplum concurrency and queries per sec

We have a fairly big Greenplum v4.3 cluster: 18 hosts, each host running 3 segments, with approximately 40 cores and 60 GB of memory per host.
The table we are querying is 30 columns wide and has 0.1 billion rows. The query we are testing has a 3-10 second response time when there is no concurrency pressure. As we increase the number of queries fired in parallel, the latency increases from an average of 3 secs to 50-ish secs, as expected.
But we've found that regardless of how many queries we fire in parallel, we only get very low QPS (queries per second), almost just 3-5 queries/sec. We've set max_memory=60G, memory_limit=800MB, and ACTIVE_STATEMENTS=100, hoping the CPU and memory would be highly utilized, but they are still poorly used, around 30%-40%.
We have tried hard to saturate the cluster with parallel queries, hoping to get the best out of CPU and memory utilization, but it doesn't work as we expected. Is there anything wrong with the settings, or is there anything else I am not aware of?
There might be multiple reasons for such behavior.
Firstly, every Greenplum query uses no more than one processor core per logical segment. Say you have 3 segments on every node with 40 physical cores: running two parallel queries will utilize at most 2 x 3 = 6 cores on every node, so you will need about 40 / 3 ≈ 13 parallel queries to utilize all of your CPUs. So, for your number of cores per node, it may be better to create more segments (gpexpand can do this). By the way, are the tables used in the queries compressed?
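On the compression point, a minimal sketch of a compressed, column-oriented append-only table in Greenplum 4.3; the table name, columns, distribution key, and compression settings are made up and only illustrate the syntax:

-- Column-oriented, compressed append-only storage reduces the I/O each query has to do.
CREATE TABLE sales_fact (
    sale_id   bigint,
    sale_date date,
    amount    numeric(12,2)
)
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (sale_id);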
Secondly, it may be a bad query. If you provide the plan for the query (EXPLAIN ANALYZE output), it may help to understand. There are some query types in Greenplum that can have the master as a bottleneck.
Finally, it might be bad OS or block-device settings.
I think the documentation page Managing Resources might help you manage your resources.
You can use resource groups to limit/control your resources, especially the CONCURRENCY attribute (the maximum number of concurrent transactions, including active and idle transactions, permitted in the resource group).
Resource queues limit ACTIVE_STATEMENTS (see the sketch below).
Note: ACTIVE_STATEMENTS is the total number of statements currently running. When your queries take ~50 s each and more keep arriving, a limit of 100 may not help; a much smaller value, so that excess queries queue up instead of all running at once, may work better.
Also, you need to configure memory/CPU settings so that your queries can proceed.
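A minimal resource-queue sketch for Greenplum 4.3; the queue name, role name, and limits are made up and only show the mechanism:

-- Let at most 12 statements from this queue run at once and cap their memory;
-- additional statements wait in the queue instead of piling onto the segments.
CREATE RESOURCE QUEUE reporting_queue WITH (ACTIVE_STATEMENTS=12, MEMORY_LIMIT='8000MB', PRIORITY=HIGH);

-- Route a role's queries through the queue.
ALTER ROLE report_user RESOURCE QUEUE reporting_queue;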

Elasticsearch Indexing Speed Degrades over Time

I am doing some performance tuning in Elasticsearch for my project and I need some help improving the indexing speed. I am using ES 5.1.1 with a 2-node setup and 8 shards for the index. Each of the 2 servers has 16 GB RAM and 12 CPUs at 2.2 GHz. I need to index around 25,000,000 documents within 1.5 hours, which currently takes around 4 hours. I have made the following config changes to improve the indexing time.
Setting ‘indices.store.throttle.type’ to ‘none’
Setting ‘refresh_interval’ to ‘-1’
Increasing ‘translog.flush_threshold_size’ to 1GB
Setting ‘number_of_replicas’ to ‘0’
Using 8 shards for the index
Setting VM Options -Xms8g -Xmx8g (Half of the RAM size)
I am using the bulk processor to generate the documents in my java application and I’m using the following configurations to setup the bulk processor.
Bulk Actions Count : 10000
Bulk Size in MB : 100
Concurrent Requests : 100
Flush Interval : 30
Initially I can index around 356,167 documents in the first minute. But over time the rate decreases, and after around 1 hour it is around 121,280 docs per minute.
How can I keep the indexing rate steady over the time? Is there any other ways to improve the performance?
I highly encourage you not to change configuration parameters like the translog flush size or the throttling unless you know what you are doing (and that does not mean reading some blog post on the internet :-)
Try a single shard per server and, especially, reduce the bulk size to something like 10 MB. 100 MB * 100 concurrent requests means you need 10 GB of heap just to hold those requests (without doing anything else). I suspect not all of the documents are getting indexed because of rejected tasks in your thread pools.
Start small and grow from there, instead of starting big without any insight into your indexing.

Elasticsearch indexing performance issues

We have been facing some performance issues with Elasticsearch over the last couple of days. As you can see in the screenshot, the indexing rate has some significant drops after the index reaches a certain size. At normal speed we index around 3000 logs per second. When the index we write to reaches a size of about ~10 GB, the rate drops.
We are using time-based indices, and around 00:00, when a new index is created by Logstash, the rate climbs back to ~3000 logs per second (that's why we think it's somehow related to the size of the index).
Server stats show nothing unusual in the CPU or memory stats (they are the same during drop phases), but one of the servers has a lot of I/O waits. Our Elasticsearch config is quite standard, with some adjustments for indexing performance (taken from the ES guide):
# If your index is on spinning platter drives, decrease this to one
# Reference / index-modules-merge
index.merge.scheduler.max_thread_count: 1
# allows larger segments to flush and decrease merge pressure
index.refresh_interval: 5s
# increase threshold_size from default when you are > ES 1.3.2
index.translog.flush_threshold_size: 1000mb
# JVM settings
bootstrap.mlockall: true (ES_HEAP_SIZE is 50% of RAM)
We use two nodes, both with 8 GB of RAM, 2 CPU cores and 300 GB HDD (dev environment).
I have already seen clusters with much bigger indices than ours. Do you guys have any idea what we could do to fix the issue?
BR
Edit:
Just ran into the performance issues again. top sometimes shows around 60% wa (I/O wait), but iotop only reports about 1000 K/s read and write at most. I have no idea where these waits are coming from.
