I have a table with around 2 billion rows from which I try to query max(id). Id is not the sort key of the table, and the table uses the MergeTree table engine.
No matter what I try, I get memory errors, and it is not limited to this one query. As soon as I try to query any table fully (a whole column) to find data, my 12 GB of RAM is not enough. I know I could just add more, but that is not the point. Is it by design that ClickHouse simply throws an error when it doesn't have enough memory? Is there a setting that tells ClickHouse to spill to disk instead?
SQL Error [241]: ClickHouse exception, code: 241, host: XXXXXX, port: 8123; Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 9.32 GiB (attempt to allocate chunk of 9440624 bytes), maximum: 9.31 GiB (version 21.4.6.55 (official build))
Alexey Milovidov has declined to put minimum RAM requirements into the ClickHouse documentation, but I would say that 32 GB is the minimum for a production ClickHouse server.
At least (a sketch of applying these settings follows this list):
You need to lower the mark cache, because it is 5 GB by default (set mark_cache_size to 500 MB).
You need to lower max_block_size to 16384.
You need to lower max_threads to 2.
You need to set max_bytes_before_external_group_by to 3GB.
You need to set aggregation_memory_efficient_merge_threads to 1.
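For reference, the session-level settings above can be applied per connection or per query; here is a minimal sketch with the values from the list (mark_cache_size is excluded because it is a server-level setting that belongs in config.xml, like the ratio example further down):
SET max_block_size = 16384;
SET max_threads = 2;
SET max_bytes_before_external_group_by = 3000000000; -- ~3 GB; GROUP BY state spills to disk beyond this
SET aggregation_memory_efficient_merge_threads = 1;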
For me, what worked was to change the maximum server memory usage ratio from 0.9 to 1.2 in config.xml:
<max_server_memory_usage_to_ram_ratio>1.2</max_server_memory_usage_to_ram_ratio>
Thanks for the reply, as it ultimately led me to this.
Related
We have a small Greenplum cluster in which some queries abort.
System related information:
Greenplum Version: 6.3
Master Host: 1
Segment Host: 2
RAM per segment host: 32GB
SWAP per segment host: 32GB
Total segments: 8 primary + 0 mirror
Segments per host: 4
vm.overcommit_ratio: 95
gp_vmem_protect_limit: 8072MB
statement_mem: 250MB
The queries are executed by a non-superuser.
Symptom:
The query fails with the following error message:
Canceling query because of high VMEM usage. Used: 7245MB, available 801MB, red zone: 7264MB (runaway_cleaner.c:189)
What we tried:
We calculated the Greenplum parameters using this information: https://gpdb.docs.pivotal.io/6-3/best_practices/sysconfig.html
This helped for some "simple" queries, but for more complicated ones the error happened again.
As a next step, we configured max_statement_mem: 2000MB
This didn't have any effect on the memory consumption on the segment hosts. We track it with the following query:
select segid, sum (vmem_mb) from session_state.session_level_memory_consumption
where query like '%<some snippet of the query>%'
group by segid
order by segid;
The memory consumption increases very quickly and the error happens again.
We tried to restrict the memory consumption by setting the following resource queue for the user:
CREATE RESOURCE QUEUE adhoc with (ACTIVE_STATEMENTS=6, MEMORY_LIMIT=6291);
ALTER ROLE user1 RESOURCE QUEUE adhoc;
The database is set to use the resource queue via the parameter gp_resource_manager: queue
When we execute a statement, we can see in the table gp_toolkit.gp_resqueue_status that rsqmemoryvalue is 1048, but the memory consumption in the session_state.session_level_memory_consumption table shows higher values for the segments, until the error occurs again.
Does anyone have a tip for fixing this problem?
Each query will ask for 250MB of memory (your statement_mem), and you set gp_vmem_protect_limit to 8GB. In this case, you can probably run (8GB - primary process memory) / 250MB ≈ 20-30 queries at the same time. The size of the primary process depends on other settings such as shared_buffers, wal_buffers, ...
statement_mem can be set in a session. This means some users can set statement_mem higher (up to max_statement_mem), and you will then see fewer queries running concurrently.
When the memory allocated to those concurrent queries reaches 90% (or 95%) of gp_vmem_protect_limit, the runaway detector starts cancelling queries to protect the primary process from the OS OOM killer. In your case, the red zone of 7264MB is 90% of your gp_vmem_protect_limit of 8072MB.
To "fix" the problem (it is not a problem actually), you can
1) set lower default statement_mem, so you can have more queries running concurrently but slower.
2) increase RAM on segment hosts, such that you can increase gp_vmem_protect_limit.
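For example (a sketch; the exact value depends on your workload and concurrency target), statement_mem can be lowered for a session before running the heavy query:
SET statement_mem = '125MB';
SHOW statement_mem;  -- verify the active value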
I have a local installation of CockroachDB on my Windows PC, and when I run a particular select query I get the following error message:
7143455 bytes requested, 127403581 currently allocated, 134217728 bytes in budget.
I have read a blog post here but I haven't found a solution. I would appreciate help on how to increase this budget limit.
CockroachDB versions between 2.0 and 2.0.2 have a bug in memory accounting for JSONB columns, leading to this error. The bug will be fixed in version 2.0.3, due in mid-June.
As a workaround, you may be able to rewrite this query to be more efficient (this might reduce the memory usage enough to work even with the bug; even if it doesn't, it will speed up the query once 2.0.3 is available). If I'm reading your query correctly, it is equivalent to
SELECT ID, JsonData, PrimaryIDs, IsActive, IsDeleted FROM "TableName"
WHERE LOWER(JsonData->>'Name') LIKE '%transaction%'
ORDER BY ID OFFSET 0 FETCH NEXT 100 ROWS ONLY
The subquery with ROW_NUMBER() was used with older versions of SQL Server, but since SQL Server 2012, the OFFSET 0 FETCH NEXT N ROWS ONLY version has been available and is more efficient.
The OFFSET 0 FETCH NEXT N ROWS ONLY syntax comes from the SQL standard, so it should work with most databases. CockroachDB also supports the LIMIT keyword, which MySQL and PostgreSQL use for the same purpose.
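For comparison, here is roughly the same query written with LIMIT, which CockroachDB, MySQL and PostgreSQL all accept:
SELECT ID, JsonData, PrimaryIDs, IsActive, IsDeleted FROM "TableName"
WHERE LOWER(JsonData->>'Name') LIKE '%transaction%'
ORDER BY ID
LIMIT 100;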
We are using SQL*Plus to offload data from Oracle from a large table with 500+ columns and around 15 million records per day.
The query fails because Oracle is not able to allocate the required memory for the result set.
Fine-tuning the Oracle DB server to increase memory allocation is ruled out, since it is used across teams and is critical.
This is a simple select with a filter on a column.
What options do I have to make it work?
1) Break the query down into multiple chunks and run it in nightly batch mode. If so, how can a select query be broken down?
2) Are there any optimization techniques I can use with SQL*Plus for a select query on a large table?
3) Is there any Java/OJDBC-based solution that can break a select into chunks and reduce the load on the DB server?
Any pointers are highly appreciated.
Here is the error message thrown:
ORA-04030: out of process memory when trying to allocate 169040 bytes (pga heap,kgh stack)
ORA-04030: out of process memory when trying to allocate 16328 bytes (koh-kghu sessi,pl/sql vc2)
ORA-04030 indicates that the process needs more memory (UGA in the SGA or PGA, depending on the server architecture) to execute the job.
This could be caused by a shortage of RAM (in a dedicated server mode environment), a small PGA size, or an operating system setting that restricts the allocation of enough RAM.
This MOS Note describes how to diagnose and resolve ORA-04030 error.
Diagnosing and Resolving ORA-4030 Errors (Doc ID 233869.1)
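As a quick first check before working through the MOS note (a sketch; it assumes you can query the v$ views), you can look at the PGA configuration and usage from SQL*Plus:
SHOW PARAMETER pga_aggregate_target
SELECT name, value, unit
  FROM v$pgastat
 WHERE name IN ('total PGA allocated', 'maximum PGA allocated');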
Your option 1 seems to be within your control. Breaking down the query will require knowledge of the query/data. Filtering on a column in the data might work, i.e.
query1: select ... where col1 <= <value>
query2: select ... where col1 > <value>
... or ... you might have to build more code around the problem.
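For instance (a sketch with hypothetical table and column names; it assumes the daily loads carry a load date you can filter on), the nightly batch could offload one day at a time:
-- big_table, load_date, col1, col2 are placeholders, not names from the question
SELECT col1, col2 /* ... remaining columns ... */
  FROM big_table
 WHERE load_date >= DATE '2024-01-01'
   AND load_date <  DATE '2024-01-02';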
Thought: does the query involve sorting/grouping? Can you live without it? Those operations take up more memory.
I am calling mysqldump for a database containing InnoDB and MyISAM tables.
The dump still runs very fast when it reaches a fat MyISAM table 11GB in size.
Fast means iotop shows me more than 70MB/s write performance.
I watch the process in mytop, so I know it happens on the big table.
The dump file grows to 8GB and then suddenly the I/O drops to only about 1 MB/s.
The server load is OK, and no other processes are running.
I tried changing my.cnf settings but nothing worked.
Performance depends on a few factors.
I had to create an alternative to mysqldump for a client so that they could load a 42GB dump file (with more than 1 billion rows).
For reference: originally, mysqldump took 3.9 days on a 16-core server with 64GB RAM and a 10-disk SSD array.
Using uniVocity, we loaded the same data in 90 minutes, using a 3-year-old laptop. You can use it with a 30-day evaluation license to load this.
Other than that, here are a few things that may impact performance:
Check if you have this in your dump file to disable constraints:
SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO';
If it doesn't, add them, or alter the CREATE TABLE script to remove all constraints. If you have constraints enabled (primary keys, foreign keys, etc.) while running your dump load, the process will get slower over time, because the database validates these constraints on every insert against a growing number of possibilities (more PKs and FKs).
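If you add those lines yourself, the matching statements that restore the checks at the end of the dump (mirroring the standard mysqldump footer) look like this:
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET SQL_MODE=@OLD_SQL_MODE;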
If you are using InnoDB (not exactly your case, but it may help someone else), add this to your my.cnf file:
innodb_doublewrite = 0
innodb_buffer_pool_size = 8000M
# innodb_log_file_size = 512M - If I enable this one the server won't start. Couldn't identify why.
log-bin = 0
innodb_support_xa = 0
innodb_flush_log_at_trx_commit = 0
I've got an Oracle database that is used as storage for web services. Most of the time the data is read-only and cached in RAM directly by the service. However, during system startup all the data is pulled from Oracle once, and the database tries to be smart and keeps the data in RAM (1GB).
How can I limit/control the amount of RAM available to the Oracle 9 instance?
The short answer is: modify SGA_MAX_SIZE. The long one follows.
If you are referring to the "data", you have to check DB_CACHE_SIZE (the size of the memory buffers) and, related to this, SGA_MAX_SIZE (the maximum memory usage of the SGA instance).
Because SGA_MAX_SIZE covers the SGA memory (buffers, shared pool and redo buffers), if you want to shrink the buffers you also have to decrease SGA_MAX_SIZE.
Take a look at Setting Initialization Parameters that Affect the Size of the SGA, or give more details.
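For illustration (the values are only examples; this assumes the instance uses an spfile, and SGA_MAX_SIZE only takes effect after an instance restart), both parameters can be lowered with:
ALTER SYSTEM SET db_cache_size = 256M SCOPE=BOTH;
ALTER SYSTEM SET sga_max_size = 512M SCOPE=SPFILE;
-- restart the instance for the new sga_max_size to apply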
There are several database parameters that control memory usage in Oracle. Here is a reasonable starting point - it's not a trivial exercise to get it right. In particular, you probably want to look at DB_CACHE_SIZE.