Memory limit exceeded when running very simple query in Clickhouse

I have a very large table (730M rows) that uses the ReplacingMergeTree engine. I've started getting "Memory limit (for query) exceeded" even when running trivial queries.
For example, SELECT * FROM my_table LIMIT 5 gives:
Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: Memory limit (for query) exceeded: would use 24.50 GiB (attempt to allocate chunk of 26009509376 bytes), maximum: 9.31 GiB: While executing MergeTree.
Why is Clickhouse trying to use 24.5G of memory for a simple SELECT query, and how can I fix it?

Because the read is parallel: many threads each read every column in blocks of about 65k rows (the default max_block_size) and allocate several MB per column.
How many columns are in the table?
Try:
set max_block_size=512, max_threads=1, max_rows_to_read=512;
SELECT * FROM my_table LIMIT 5;
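To make the arithmetic behind this explanation concrete, here is a rough back-of-the-envelope sketch in Python. It only models the reasoning above (threads x columns x per-column buffer), not ClickHouse's actual allocator. The thread count, column count, and buffer size are made-up illustrative numbers, and estimate_read_buffers is a hypothetical helper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_read_buffers(threads, columns, bytes_per_column_buffer):
    """Rough upper bound: every reader thread holds one buffer per column."""
    return threads * columns * bytes_per_column_buffer


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


# Hypothetical table: 300 columns, 16 reader threads, ~2 MiB per column buffer.
wide = estimate_read_buffers(threads=16, columns=300, bytes_per_column_buffer=2 * MIB)
# Same hypothetical table read with max_threads=1.
narrow = estimate_read_buffers(threads=1, columns=300, bytes_per_column_buffer=2 * MIB)

print(f"16 threads: {to_gib(wide):.2f} GiB")
print(f"1 thread:   {to_gib(narrow):.2f} GiB")

# The single chunk size taken from the error message in the question.
print(f"failed chunk: {to_gib(26009509376):.2f} GiB")
```

With these made-up numbers, the 16-thread read needs 16 times the buffer space of the single-thread one, which is why lowering max_threads and max_block_size is worth trying first.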

At first glance this looks like a ClickHouse shortcoming. However, there is a way around the limitation: query the system.tables table, which has a partition_key field.
SELECT partition_key FROM system.tables
WHERE database = 'your_db' AND name='your_big_table'
It indicates what partition key the table has, or rather how parts of the table can be addressed. Since the table is large, there are probably partitions and we can use them:
SELECT *
FROM your_db.your_big_table
WHERE toYYYYMM(event_time) = toYYYYMM(now())
LIMIT 5;
or, if the partition key is intDiv(id_value, 10000000):
SELECT *
FROM your_db.your_big_table
WHERE intDiv(id_value, 10000000) = 0
LIMIT 5;
This way you reduce the number of rows to scan and stay under the limit.
I think LIMIT is not pushed down to the nodes of the ClickHouse cluster, which would explain the overhead and the exception you got.
By the way, I ran into the same thing with the max_rows_to_read_leaf limit:
DB::Exception: Limit for rows (controlled by 'max_rows_to_read_leaf' setting) exceeded

Related

Clickhouse Exception: Memory limit (total) exceeded

I'm attempting to connect ClickHouse to PostgreSQL to replicate data, using https://clickhouse.com/docs/en/engines/database-engines/materialized-postgresql/. Any ideas on how to solve the error, or on the best way to replicate PostgreSQL data to ClickHouse?
CREATE DATABASE pg_db
ENGINE = MaterializedPostgreSQL('localhost:5432', 'dbname', 'dbuser', 'dbpass')
SETTINGS materialized_postgresql_schema = 'dbschema'
Running SHOW TABLES FROM pg_db; afterwards doesn't show all tables (large tables that have 800k rows are missing). Attempting to attach one of those large tables with ATTACH TABLE pg_db.lgtable; gives the error below:
Code: 619. DB::Exception: Failed to add table lgtable to replication.
Info: Code: 241. DB::Exception: Memory limit (total) exceeded: would
use 1.75 GiB (attempt to allocate chunk of 4219172 bytes), maximum:
1.75 GiB. (MEMORY_LIMIT_EXCEEDED) (version 22.1.3.7 (official build)). (POSTGRESQL_REPLICATION_INTERNAL_ERROR) (version 22.1.3.7 (official
build))
I've tried increasing the allocated memory and adjusting other settings, but I'm still getting the same problem:
set max_memory_usage = 8000000000;
set max_memory_usage_for_user = 8000000000;
set max_bytes_before_external_group_by = 1000000000;
set max_bytes_before_external_sort = 1000000000;
set max_block_size=512, max_threads=1, max_rows_to_read=512;

Why is Oracle RESULT_CACHE not reducing the number of LAST_CR_BUFFER gets?

I'm doing some tests with the Oracle result_cache and came across something that looks strange (to me, anyway).
I've created the following table:
create table CACHE_TEST(
COL1 int,
COL2 int
);
And inserted some dummy data into it:
insert into CACHE_TEST select * from(
select level as COL1, level*3 as COL2 from DUAL
connect by level <= 100
);
Now I run an Autotrace on the following query:
select * from CACHE_TEST;
As expected, it shows a normal full table scan. I then run an Autotrace on the following query several times, and expect it to be using the result cache:
select /*+ RESULT_CACHE */ * from CACHE_TEST;
The Autotrace shows that it is indeed using the cache, but the number of buffer gets and cost is exactly the same as the first query.
Interestingly, if I do some kind of aggregate, eg:
select /*+ RESULT_CACHE */ AVG(COL1) FROM CACHE_TEST;
It reduces the buffer gets to zero, but the cost is still the same.
Can anyone explain why:
The result cache doesn't seem to reduce the number of buffer gets unless you do an aggregate?
Even when it does use the result cache the cost is still reportedly the same (even though I see a marked performance increase)?

If you're selecting all the data from the table, why would you expect it to require fewer reads to read all that data from the table (which is likely completely in the buffer cache) rather than from the result cache? Either way, you're reading the same number of blocks and getting the same amount of data back.
The result cache is really helpful when your query is doing some sort of expensive calculation. That could be an aggregate-- reading a single cached value is obviously more efficient than reading every row from a table. Or it could be a query that does a lot of work to figure out which subset of the table to read (joining through some other tables, for example).

postgres not using index on SELECT COUNT(*) for a large table

I have four tables; two for current data, two for archive data. One of the archive tables has tens of millions of rows. All tables have a couple narrow indexes and are very similar.
Given the following queries:
SELECT (SELECT COUNT(*) FROM A)
UNION SELECT (SELECT COUNT(*) FROM B)
UNION SELECT (SELECT COUNT(*) FROM C_LargeTable)
UNION SELECT (SELECT COUNT(*) FROM D);
A, B and D perform index scans. C_LargeTable uses a seq scan and the query takes about 20 seconds to execute. Table D has millions of rows as well, but is only about 10% of the size of C_LargeTable.
If I then modify my query to use the following logic, which sufficiently narrows the counts, I still get the same results, but the index is used and the query takes about 5 seconds, or a quarter of the time:
...
SELECT (SELECT COUNT(*) FROM C_LargeTable WHERE idx_col < 'G')
+ (SELECT COUNT(*) FROM C_LargeTable WHERE idx_col BETWEEN 'G' AND 'Q')
+ (SELECT COUNT(*) FROM C_LargeTable WHERE idx_col > 'Q')
...
It does not make sense to me to have the I/O overhead of a full table scan for a count when perfectly good indexes exist and there is a covering primary key which would ensure uniqueness. My understanding of postgres is that, unlike a SQL Server clustered index, a PRIMARY KEY doesn't determine the sort order of the table, but it implicitly creates a btree index to ensure uniqueness, which I assume should require significantly less I/O than a full table scan.
Is this potentially an indication of an optimization that I may need to perform to organize data within C_LargeTable?

There isn't a covering index on the primary key because PostgreSQL doesn't support them (true up to and including 9.4 anyway).
The heap scan is required because of MVCC visibility. The index doesn't contain visibility information. Pg can do an index scan, but it still has to check visibility info from the heap, and with an index scan that'd be random I/O to read the whole table, so a seqscan will be much faster.
Make sure you run 9.2 or newer, and that autovacuum is configured to run frequently on the table. You should then be able to do an index-only scan where the visibility map is used. This only works under limited circumstances as Horse notes; see the wiki page on count and on index-only scans. If you aren't letting autovacuum run regularly enough the visibility map will be outdated and Pg won't be able to do an index-only scan.
In future, make sure you post explain or preferably explain analyze output with any queries.

Oracle 10g small Blob or Clob not being stored inline?

According to the documents I've read, the default storage for a CLOB or BLOB is inline, which means that if it is less than approx 4k in size then it will be held in the table.
But when I test this on a dummy table in Oracle (10.2.0.1.0) the performance and response from Oracle Monitor (by Allround Automations) suggest that it is being held outwith the table.
Here's my test scenario ...
create table clobtest ( x int primary key, y clob, z varchar(100) )
;
insert into clobtest
select object_id, object_name, object_name
from all_objects where rownum < 10001
;
select COLUMN_NAME, IN_ROW
from user_lobs
where table_name = 'CLOBTEST'
;
This shows: Y YES (suggesting that Oracle will store the clob in the row)
select x, y from CLOBTEST where ROWNUM < 1001 -- 8.49 seconds
select x, z from CLOBTEST where ROWNUM < 1001 -- 0.298 seconds
So in this case, the CLOB values will have a maximum length of 30 characters, so should always be inline. If I run Oracle Monitor, it shows a LOB.Length followed by a LOB.Read() for each row returned, again suggesting that the clob values are held outwith the table.
I also tried creating the table like this
create table clobtest
( x int primary key, y clob, z varchar(100) )
LOB (y) STORE AS (ENABLE STORAGE IN ROW)
but got exactly the same results.
Does anyone have any suggestions how I can force (persuade, encourage) Oracle to store the clob value in-line in the table? (I'm hoping to achieve similar response times to reading the varchar2 column z)
UPDATE: If I run this SQL
select COLUMN_NAME, IN_ROW, l.SEGMENT_NAME, SEGMENT_TYPE, BYTES, BLOCKS, EXTENTS
from user_lobs l
JOIN USER_SEGMENTS s
on (l.segment_name = s.segment_name)
where table_name = 'CLOBTEST'
then I get the following results ...
Y YES SYS_LOB0000398621C00002$$ LOBSEGMENT 65536 8 1

The behavior of Oracle LOBs is the following.
A LOB is stored inline when:
(
The size is less than or equal to 3964 bytes
AND
ENABLE STORAGE IN ROW has been defined in the LOB storage clause
) OR (
The value is NULL
)
A LOB is stored out-of-row when:
(
The value is not NULL
) AND (
Its size is greater than 3964 bytes
OR
DISABLE STORAGE IN ROW has been defined in the LOB storage clause
)
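The two rules above can be written as a single predicate. This is a minimal Python sketch of the logic as stated in this answer, not of Oracle internals. The function name is_stored_inline and the convention of passing None for a NULL value are assumptions made for illustration.

```python
INLINE_LIMIT_BYTES = 3964  # threshold quoted in the rules above


def is_stored_inline(size_bytes, storage_in_row_enabled=True):
    """Return True if the rules above place the LOB value inside the table row.

    size_bytes is None for a NULL LOB value (an illustrative convention).
    """
    if size_bytes is None:
        # A NULL value is always inline, whatever the storage clause says.
        return True
    return storage_in_row_enabled and size_bytes <= INLINE_LIMIT_BYTES


# (size, ENABLE STORAGE IN ROW?) combinations covering each branch of the rules.
cases = [(30, True), (3964, True), (3965, True), (30, False), (None, False)]
for size, enabled in cases:
    print(size, enabled, is_stored_inline(size, enabled))
```

Under these rules, the 30-character values from the question are inline as long as ENABLE STORAGE IN ROW (the default) is in effect.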
Now this is not the only issue which may impact performance.
If the LOBs end up stored out of row, the default behavior of Oracle is to avoid caching them (only inline LOBs are cached in the buffer cache with the other fields of the row). To tell Oracle to also cache non-inlined LOBs, the CACHE option should be used when the LOB is defined.
The default behavior is ENABLE STORAGE IN ROW, and NOCACHE, which means small LOBs will be inlined, large LOBs will not (and will not be cached).
Finally, there is also a performance issue at the communication protocol level. Typical Oracle clients will perform 2 additional roundtrips per LOB to fetch them:
- one to retrieve the size of the LOB and allocate memory accordingly
- one to fetch the data itself (provided the LOB is small)
These extra roundtrips are performed even if an array interface is used to retrieve the results. If you retrieve 1000 rows and your array size is large enough, you will pay for 1 roundtrip to retrieve the rows, and 2000 roundtrips to retrieve the content of the LOBs.
Please note that this does not depend on whether the LOB is stored inline or not. They are completely different problems.
To optimize at the protocol level, Oracle has provided a new OCI verb to fetch several LOBs in one roundtrip (OCILobArrayRead). I don't know if something similar exists with JDBC.
Another option is to bind the LOB on the client side as if it were a big RAW/VARCHAR2. This only works if a maximum size of the LOB can be defined (since the maximum size must be provided at bind time). This trick avoids the extra roundtrips: the LOBs are just processed like RAW or VARCHAR2. We use it a lot in our LOB intensive applications.
Once the number of roundtrips has been optimized, the packet size (SDU) can be resized in the net configuration to better fit the situation (i.e. a limited number of large roundtrips). It tends to reduce the "SQL*Net more data to client" and "SQL*Net more data from client" wait events.
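The roundtrip cost described in this answer is simple arithmetic. Here it is as a small Python sketch. The helper estimated_roundtrips and its parameters are hypothetical, and the model deliberately ignores protocol details such as parse or close calls.

```python
import math


def estimated_roundtrips(rows, array_size, lob_columns=1, roundtrips_per_lob=2):
    """Roundtrips under the model above: array fetches for the rows,
    plus a fixed number of extra roundtrips for every LOB value."""
    row_fetches = math.ceil(rows / array_size)
    lob_fetches = rows * lob_columns * roundtrips_per_lob
    return row_fetches + lob_fetches


# 1000 rows, array large enough to hold them all, one LOB column.
with_lob = estimated_roundtrips(rows=1000, array_size=1000)
# Same fetch with the LOB handled like a plain RAW/VARCHAR2 column.
without_lob = estimated_roundtrips(rows=1000, array_size=1000, lob_columns=0)

print(with_lob, without_lob)
```

For 1000 rows this gives 1 + 2000 = 2001 roundtrips with the LOB locator protocol versus 1 without it, which is the gap that the bind-as-RAW/VARCHAR2 trick removes.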

If you're "hoping to achieve similar response times to reading the varchar2 column z", then you'll be disappointed in most cases.
If you're using a CLOB I suppose you need to store more than 4,000 bytes, right? Then if you need to read more bytes that's going to take longer.
BUT if you have a case where yes, you use a CLOB, but you're interested (in some instances) only in the first 4,000 bytes of the column (or less), then you have a chance of getting similar performance.
It looks like Oracle can optimize the retrieval if you use something like DBMS_LOB.SUBSTR and ENABLE STORAGE IN ROW CACHE clause with your table. Example:
CREATE TABLE clobtest (x INT PRIMARY KEY, y CLOB)
LOB (y) STORE AS (ENABLE STORAGE IN ROW CACHE);
INSERT INTO clobtest VALUES (0, RPAD('a', 4000, 'a'));
UPDATE clobtest SET y = y || y || y;
INSERT INTO clobtest SELECT rownum, y FROM all_objects, clobtest WHERE rownum < 1000;
CREATE TABLE clobtest2 (x INT PRIMARY KEY, z VARCHAR2(4000));
INSERT INTO clobtest2 VALUES (0, RPAD('a', 4000, 'a'));
INSERT INTO clobtest2 SELECT rownum, z FROM all_objects, clobtest2 WHERE rownum < 1000;
COMMIT;
In my tests on 10.2.0.4 and 8K block, these two queries give very similar performance:
SELECT x, DBMS_LOB.SUBSTR(y, 4000) FROM clobtest;
SELECT x, z FROM clobtest2;
Sample from SQL*Plus (I ran the queries multiple times to remove physical IO's):
SQL> SET AUTOTRACE TRACEONLY STATISTICS
SQL> SET TIMING ON
SQL>
SQL> SELECT x, y FROM clobtest;
1000 rows selected.
Elapsed: 00:00:02.96
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
3008 consistent gets
0 physical reads
0 redo size
559241 bytes sent via SQL*Net to client
180350 bytes received via SQL*Net from client
2002 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
SQL> SELECT x, DBMS_LOB.SUBSTR(y, 4000) FROM clobtest;
1000 rows selected.
Elapsed: 00:00:00.32
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
2082 consistent gets
0 physical reads
0 redo size
18993 bytes sent via SQL*Net to client
1076 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
SQL> SELECT x, z FROM clobtest2;
1000 rows selected.
Elapsed: 00:00:00.18
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
1005 consistent gets
0 physical reads
0 redo size
18971 bytes sent via SQL*Net to client
1076 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
As you can see, consistent gets are noticeably higher for the DBMS_LOB.SUBSTR query (2082 vs. 1005), but SQL*Net roundtrips and bytes are nearly identical in the last two queries, and that apparently makes a big difference in execution time!
One warning though: the difference in consistent gets might become a more likely performance issue if you have large result sets, as you won't be able to keep everything in buffer cache and you'll end up with very expensive physical reads...
Good luck!
Cheers

Indeed, it is stored within the row. You are likely dealing with the simple overhead of using a LOB instead of a varchar. Nothing is free. The DB probably doesn't know ahead of time where to find the row, so it probably still "follows a pointer" and does extra work just in case the LOB is big. If you can get by with a varchar, you should. Even old hacks like 2 varchars to deal with 8000 characters might solve your business case with higher performance.
LOBS are slow, difficult to query, etc. On the positive, they can be 4G.
What would be interesting to try is to shove something just over 4000 bytes into that clob, and see what the performance looks like. Maybe it is about the same speed? This would tell you that it's overhead slowing you down.
Warning: at some point, network traffic to your PC slows you down on these kinds of tests.
Minimize this by wrapping the query in a count, which isolates the work to the server:
select count(*) from (select x,y from clobtest where rownum<1001)
You can achieve similar effects with "set autot trace", but there will be tracing overhead too.

There are two indirections when it comes to CLOBs and BLOBs:
The LOB value might be stored in a different database segment than the rest of the row.
When you query the row, only the non-LOB fields are contained in the result set, and accessing the LOB fields requires one or more additional round trips between the client and the server (per row!).
I don't quite know how you measure the execution time and I've never used Oracle Monitor, but you might primarily be affected by the second indirection. Depending on the client software you use, it is possible to reduce the round trips. E.g. when you use ODP.NET, the parameter is called InitialLobFetchSize.
Update:
One way to tell which of the two indirections is relevant is to run your LOB query with 1000 rows twice. If the time drops significantly from the first to the second run, it's indirection 1. On the second run, the caching pays off and access to the separate database segment isn't very relevant anymore. If the time stays about the same, it's the second indirection, namely the round trips between the client and the server, which cannot improve between two runs.
The time of more than 8 seconds for 1000 rows in a very simple query indicates it's indirection 2, because 8 seconds for 1000 rows can't really be explained by disk access unless your data is very scattered and your disk system is under heavy load.

This is the key information (how to read LOB without extra roundtrips), which is not available in Oracle's documentation I think:
Another option is to bind the LOB on client side as if it was a big
RAW/VARCHAR2. This only works if a maximum size of the LOB can be
defined (since the maximum size must be provided at bind time). This
trick avoids the extra roundtrips: the LOBs are just processed like RAW
or VARCHAR2. We use it a lot in our LOB intensive applications.
I had a problem loading a simple table (a few GB) with one blob column (14KB => thousands of rows) and investigated it for a long time. I tried a lot of LOB storage tunings (DB_BLOCK_SIZE for a new tablespace, the CHUNK setting in the LOB storage specification), sqlnet.ora settings, and client prefetching attributes, but this (treating the BLOB as LONG RAW with OCCI ResultSet->setBufferData on the client side) was the most important thing: it persuades Oracle to send the blob column immediately, without first sending a LOB locator and then loading each LOB separately based on that locator.
Now I can get even ~ 500Mb/s throughput (with columns < 3964B).
Our 14KB blob will be separated into multiple columns - so it'll be stored in row to get almost sequential reads from HDD. With one 14KB blob (one column) I get ~150Mbit/s because of non-sequential reads (iostat: low amount of merged read requests).
NOTE: don't forget to also set the LOB prefetch size/length:
err = OCIAttrSet(session, (ub4) OCI_HTYPE_SESSION, (void *) &default_lobprefetch_size, 0, (ub4) OCI_ATTR_DEFAULT_LOBPREFETCH_SIZE, errhp);
But I don't know how to achieve the same fetching throughput with the ODBC connector. I tried, without any success.

where rownum=1 query taking time in Oracle

I am trying to execute a query like
select * from tableName where rownum=1
This query is basically to fetch the column names of the table. There are more than a million records in the table. When I add the above condition, it takes a long time to fetch the first row. Is there an alternative way to get the first row?

This question has already been answered, I will just provide an explanation as to why sometimes a filter ROWNUM=1 or ROWNUM <= 1 may result in a long response time.
When encountering a ROWNUM filter (on a single table), the optimizer will produce a FULL SCAN with COUNT STOPKEY. This means that Oracle will start to read rows until it encounters the first N rows (here N=1). A full scan reads blocks from the first extent to the high water mark. Oracle has no way to determine beforehand which blocks contain rows and which don't, so all blocks are read until N rows are found. If the first blocks are empty, this can result in many reads.
Consider the following:
SQL> /* rows will take a lot of space because of the CHAR column */
SQL> create table example (id number, fill char(2000));
Table created
SQL> insert into example
2 select rownum, 'x' from all_objects where rownum <= 100000;
100000 rows inserted
SQL> commit;
Commit complete
SQL> delete from example where id <= 99000;
99000 rows deleted
SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from example where rownum = 1;
Elapsed: 00:00:05.01
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=1 Bytes=2015)
1 0 COUNT (STOPKEY)
2 1 TABLE ACCESS (FULL) OF 'EXAMPLE' (TABLE) (Cost=7 Card=1588 [..])
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
33211 consistent gets
25901 physical reads
0 redo size
2237 bytes sent via SQL*Net to client
278 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
As you can see the number of consistent gets is extremely high (for a single row). This situation could be encountered in some cases where for example, you insert rows with the /*+APPEND*/ hint (thus above high water mark), and you also delete the oldest rows periodically, resulting in a lot of empty space at the beginning of the segment.
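The effect shown in the trace above can be reproduced with a toy model of a full scan with a stop key. This Python sketch is not how Oracle is implemented. The function name, the 3-rows-per-block figure, and the block counts are illustrative assumptions.

```python
def blocks_read_until_stopkey(rows_per_block, n):
    """Scan blocks in segment order and stop once n rows have been found.

    rows_per_block lists the live rows in each block below the high water
    mark; emptied blocks hold 0 rows but still have to be read.
    """
    found = 0
    for blocks_read, live_rows in enumerate(rows_per_block, start=1):
        found += live_rows
        if found >= n:
            return blocks_read
    # Fewer than n rows exist, so the whole segment was read.
    return len(rows_per_block)


# Hypothetical segment: 1000 blocks with 3 rows each.
fresh = [3] * 1000
# Same segment after deleting every row in the first 990 blocks.
after_delete = [0] * 990 + [3] * 10

print(blocks_read_until_stopkey(fresh, 1))
print(blocks_read_until_stopkey(after_delete, 1))
```

In the freshly loaded segment the scan stops after a single block, while after the delete it has to read all 990 emptied blocks before it reaches the first surviving row.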

Try this:
select * from tableName where rownum<=1
There are some weird ROWNUM bugs, sometimes changing the query very slightly will fix it. I've seen this happen before, but I can't reproduce it.
Here are some discussions of similar issues: http://jonathanlewis.wordpress.com/2008/03/09/cursor_sharing/ and http://forums.oracle.com/forums/thread.jspa?threadID=946740&tstart=1

Surely Oracle has meta-data tables that you can use to get column names, like the sysibm.syscolumns table in DB2?
And, after a quick web search, that appears to be the case: see ALL_TAB_COLUMNS.
I'd use those rather than go to the actual table, something like (untested):
SELECT COLUMN_NAME
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'MYTABLE'
ORDER BY COLUMN_NAME;
If you are hell-bent on finding out why your query is slow, you should revert to the standard method: asking your DBMS to explain the execution plan of the query for you. For Oracle, see section 9 of this document.
There's a conversation over at Ask Tom - Oracle that seems to suggest the row numbers are created after the select phase, which may mean the query is retrieving all rows anyway. The explain will probably help establish that. If it contains FULL without COUNT STOPKEY, then that may explain the performance.
Beyond that, my knowledge of Oracle specifics diminishes and you will have to analyse the explain further.

Your query is doing a full table scan and then returning the first row.
Try
SELECT * FROM tableName WHERE primary_key = primary_key_value;
The first row, particularly as it pertains to ROWNUM, is arbitrarily decided by Oracle. It may not be the same from query to query, unless you provide an ORDER BY clause.
So, picking a primary key value to filter by is as good a method as any to get a single row.

I think you're slightly missing the concept of ROWNUM - according to Oracle docs: "ROWNUM is a pseudo-column that returns a row's position in a result set. ROWNUM is evaluated AFTER records are selected from the database and BEFORE the execution of ORDER BY clause."
So it returns ANY row that it considers #1 in the result set, which in your case will contain 1M rows.
You may want to check out a ROWID pseudo-column: http://psoug.org/reference/pseudocols.html

I've recently had the same problem you're describing: I want one row from the very large table as a quick, dirty, simple introspection, and "where rownum=1" alone behaves very poorly. Below is a remedy which worked for me.
Select the max() of the first term of some index, and then use it to choose some small fraction of all rows with "rownum=1". Suppose my table has some index on numerical "group-id", and compare this:
select * from my_table where rownum = 1;
-- Elapsed: 00:00:23.69
with this:
select * from my_table where rownum = 1
and group_id = (select max(group_id) from my_table);
-- Elapsed: 00:00:00.01
