In CockroachDB, how do I figure out how much memory a query is using?

I'm trying to figure out how to attribute memory use in CockroachDB to a particular query. How do I do that? I've tried using EXPLAIN, but that doesn't seem to show memory usage.

You can use EXPLAIN ANALYZE to answer this question. Here's a transcript from CockroachDB 20.2.2 that demonstrates the method:
> ./cockroach-v20.2.2.darwin-10.9-amd64/cockroach demo
# Enter \? for a brief introduction.
#
root@127.0.0.1:64830/demo> create table a (a int primary key, b int);
CREATE TABLE
Time: 3ms total (execution 3ms / network 0ms)
root@127.0.0.1:64830/demo> insert into a select g, 1 from generate_series(1, 10000) g(g);
INSERT 10000
Time: 81ms total (execution 81ms / network 0ms)
root@127.0.0.1:64830/demo> select count(*) from a group by b;
count
---------
10000
(1 row)
Time: 7ms total (execution 7ms / network 0ms)
root@127.0.0.1:64830/demo> explain analyze select count(*) from a group by b;
automatic | url
------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
true | https://cockroachdb.github.io/distsqlplan/decode.html#eJyUkV-Pk0AUxd_9FDf3qTWjO6Cb6DyVXashIlSg0WpIM2VuCAkwODPEbRq-uwHinzVx4z6ec--5c36ZC9pvDQrcft5FQRhDEAfR4csWVm_CLM8-RmvIttH2NodSD51bPV3D2zT5ABLepcl-BzcHOCHDTiuKZUsWxVf0sGDYG12StdpM1mVeCNUdCs6w7vrBTXbBsNSGUFzQ1a4hFJjLU0MpSUXmiiNDRU7WzXxWbnpTt9KckWHWy84KeIYMk8EJ2PjI0OjvFgxJJcDjnE9x62TTgKtbEnD93L9-1VpkeDo7-rn5gvvwvr7BYmSoB_e7mXWyIhTeyP6_fVBVhirptLny7pffTDqID8c4yY_xPopWG2-NDG-TfZwf0-RTtlo_BublL5hW3kFLrTZnGCwpAa_5g0D-Y4BSsr3uLN2D-ddlPhYMSVW0fLnVgylpZ3Q5P7PIZM7NhiLrlqm3iLBbRlPBP8Peg2H_r3AxPvkRAAD__3pr4n4=
(1 row)
Time: 13ms total (execution 13ms / network 0ms)
The linked page renders a diagram of the plan that includes the memory usage per operator. Notice that the aggregator (the GROUP BY operator) shows 70 KB, the quantity we were looking for.
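Note that on more recent CockroachDB releases (v21.1 and later, if memory serves), EXPLAIN ANALYZE prints the plan as plain text and reports memory statistics directly in that output, so the decode URL step is no longer needed:
EXPLAIN ANALYZE SELECT count(*) FROM a GROUP BY b;
-- Look for a line such as "maximum memory usage" plus the per-operator
-- statistics in the text output; exact field names vary by version.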

Related

PostgreSQL 9.6 Parallel execution from foreign table

I upgraded my PostgreSQL from 9.5 to 9.6 in order to use parallel execution to improve my performance. However, I didn't succeed in using it. In my main database, almost all my selects look like:
select * from foreign_table
The foreign table is located on an Oracle database.
Some of the tables are big (10 GB+ and 1,000,000+ records), so parallel query should help me a lot in this kind of select.
The parameters that I configured:
min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 2
max_worker_processes = 8
When I try explain analyze select * from a big table whose size is 1.5 GB with 5,000,000 records, I see only a Foreign Scan:
Foreign Scan on customer_prod (cost=10000.00..20000.00 rows=1000 width=2438)
(actual time=6.337..231408.085 rows=5770616 loops=1)
Oracle query: ......
Planning time: 2.827 ms
Execution time: 232198.137 ms
I also tried select * from foreign_table where 1=1, but still the same result.
On the other hand, the following worked:
postgres=# CREATE TABLE people_mariel_test (id int PRIMARY KEY NOT NULL, age int NOT NULL);
CREATE TABLE
postgres=# INSERT INTO people_mariel_test SELECT id, (random()*100)::integer AS age FROM generate_series(1,10000000) AS id;
INSERT 0 10000000
postgres=# explain analyze select * from people_mariel_test where age=6;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..123777.76 rows=50000 width=8) (actual time=0.239..771.801 rows=99409 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Parallel Seq Scan on people_mariel_test (cost=0.00..117777.76 rows=29412 width=8) (actual time=0.045..748.213 rows=49704 loops=2)
Filter: (age = 6)
Rows Removed by Filter: 4950296
Planning time: 0.261 ms
Execution time: 785.924 ms
(8 rows)
Any idea how I can continue?
From the documentation:
A ForeignScan node can, optionally, support parallel execution. A
parallel ForeignScan will be executed in multiple processes and should
return each row only once across all cooperating processes. To do
this, processes can coordinate through fixed size chunks of dynamic
shared memory. This shared memory is not guaranteed to be mapped at
the same address in every process, so pointers may not be used. The
following callbacks are all optional in general, but required if
parallel execution is to be supported.
I have searched the source code of oracle_fdw by Laurenz Albe, and it does not implement IsForeignScanParallelSafe; its scans therefore cannot participate in parallel execution.
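A quick way to confirm this from SQL (a sketch, assuming PostgreSQL 9.6): force_parallel_mode puts a Gather node on top of any plan the planner considers parallel-safe, so if no Gather appears even with it enabled, the foreign scan is being treated as parallel-unsafe:
SET force_parallel_mode = on;  -- testing aid only, not for production
EXPLAIN (VERBOSE) SELECT * FROM foreign_table;
-- No Gather node in the output means the plan is considered parallel-unsafe.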

spool space issue due to inequality condition

I am facing a spool space issue for one of my queries. Below is the query:
SEL * from (
SEL A.ICCUSNO,
B.ACACCNO,
D.PARTY_ID,
E.PARTY_IDENTIFICATION_NUM,
E.PARTY_IDENTIFICATION_TYPE_CD,
A.ICIDTY,
A.ICIDNO AS ICIDNO,
A.ICEXPD,
ROW_NUMBER() OVER(PARTITION BY ICCUSNO ORDER BY ICEXPD DESC ) AS ICEFFD2
FROM GE_SANDBX.GE_CMCUST A
INNER JOIN GE_SANDBX.GE_CMACCT B
ON A.ICCUSNO=B.ACCUSNO
INNER JOIN GE_VEW.ACCT C
ON B.ACACCNO=C.ACCT_NUM
AND C.DATA_SOURCE_TYPE_CD='ILL'
INNER JOIN GE_VEW.PARTY_ACCT_HIST D
ON C.ACCT_ID=D.ACCT_ID
LEFT OUTER JOIN GE_VEW.GE_PI E
ON D.PARTY_ID=E.PARTY_ID
AND A.ICIDTY=E.PARTY_IDENTIFICATION_TYPE_CD
AND E.DSTC NOT IN( 'SCRM', 'BCRM')
--WHERE B.ACACCNO='0657007129'
--WHERE A.ICIDNO<>E.PARTY_IDENTIFICATION_NUM
QUALIFY ICEFFD2=1) T
where t.PARTY_IDENTIFICATION_NUM<>t.ICIDNO;
I am trying to pick one record based on the expiry date ICEXPD. My inner query gives me one record per customer number (ICCUSNO), as below:
ICCUSNO ACACCNO PARTY_ID Party_Identification_Num Party_Identification_Type_Cd ICIDNO ICEXPD ICEFFD2
100000013 500010207 5,862,640 1-0121-2073-7 S 1-0212-2073-4 9/20/2007 1
But I have to update the table only when the PARTY_IDENTIFICATION_NUM doesn't match the ICIDNO.
Below is the explain plan:
1) First, we lock GE_SANDBX.A for access, we lock
GE_SANDBX.B for access, we lock DP_TAB.PARTY_ACCT_HIST for
access, we lock DP_TAB.GE_PI for access, and we
lock DP_TAB.ACCT for access.
2) Next, we do an all-AMPs RETRIEVE step from DP_TAB.ACCT by way of
an all-rows scan with a condition of (
"DP_TAB.ACCT.DATA_SOURCE_TYPE_CD = 'ILL '") into Spool 3
(all_amps), which is built locally on the AMPs. The size of Spool
3 is estimated with no confidence to be 9,834,342 rows (
344,201,970 bytes). The estimated time for this step is 2.18
seconds.
3) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of a
RowHash match scan, which is joined to DP_TAB.PARTY_ACCT_HIST by
way of a RowHash match scan with no residual conditions. Spool 3
and DP_TAB.PARTY_ACCT_HIST are joined using a merge join, with a
join condition of ("Acct_Id = DP_TAB.PARTY_ACCT_HIST.ACCT_ID").
The result goes into Spool 4 (all_amps), which is redistributed by
the hash code of (DP_TAB.ACCT.Acct_Num) to all AMPs. Then we do a
SORT to order Spool 4 by row hash. The size of Spool 4 is
estimated with no confidence to be 13,915,265 rows (487,034,275
bytes). The estimated time for this step is 0.98 seconds.
4) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from GE_SANDBX.B by way of a
RowHash match scan with no residual conditions, which is
joined to Spool 4 (Last Use) by way of a RowHash match scan.
GE_SANDBX.B and Spool 4 are joined using a merge join,
with a join condition of ("GE_SANDBX.B.ACACCNO =
Acct_Num"). The result goes into Spool 5 (all_amps) fanned
out into 18 hash join partitions, which is redistributed by
the hash code of (GE_SANDBX.B.ACCUSNO) to all AMPs. The
size of Spool 5 is estimated with no confidence to be
13,915,265 rows (2,657,815,615 bytes). The estimated time
for this step is 1.33 seconds.
2) We do an all-AMPs RETRIEVE step from GE_SANDBX.A by way
of an all-rows scan with no residual conditions into Spool 6
(all_amps) fanned out into 18 hash join partitions, which is
redistributed by the hash code of (GE_SANDBX.A.ICCUSNO)
to all AMPs. The size of Spool 6 is estimated with high
confidence to be 12,169,929 rows (5,427,788,334 bytes). The
estimated time for this step is 52.24 seconds.
3) We do an all-AMPs RETRIEVE step from
DP_TAB.GE_PI by way of an all-rows scan with a
condition of (
"(DP_TAB.GE_PI.DSTC <> 'BSCRM')
AND (DP_TAB.GE_PI.DATA_SOURCE_TYPE_CD <>
'SCRM')") into Spool 7 (all_amps), which is built locally on
the AMPs. The size of Spool 7 is estimated with low
confidence to be 161,829 rows (19,419,480 bytes). The
estimated time for this step is 1.97 seconds.
5) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an
all-rows scan, which is joined to Spool 6 (Last Use) by way of an
all-rows scan. Spool 5 and Spool 6 are joined using a hash join
of 18 partitions, with a join condition of ("ICCUSNO = ACCUSNO").
The result goes into Spool 8 (all_amps), which is redistributed by
the hash code of (DP_TAB.PARTY_ACCT_HIST.PARTY_ID,
TRANSLATE((GE_SANDBX.A.ICIDTY )USING
LATIN_TO_UNICODE)(VARCHAR(255), CHARACTER SET UNICODE, NOT
CASESPECIFIC)) to all AMPs. The size of Spool 8 is estimated with
no confidence to be 15,972,616 rows (8,593,267,408 bytes). The
estimated time for this step is 4.37 seconds.
6) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of an
all-rows scan, which is joined to Spool 8 (Last Use) by way of an
all-rows scan. Spool 7 and Spool 8 are right outer joined using a
single partition hash join, with condition(s) used for
non-matching on right table ("NOT (ICIDTY IS NULL)"), with a join
condition of ("(PARTY_ID = Party_Id) AND ((TRANSLATE((ICIDTY
)USING LATIN_TO_UNICODE))= Party_Identification_Type_Cd)"). The
result goes into Spool 2 (all_amps), which is built locally on the
AMPs. The size of Spool 2 is estimated with no confidence to be
16,053,773 rows (10,306,522,266 bytes). The estimated time for
this step is 2.11 seconds.
7) We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by
way of an all-rows scan into Spool 13 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 11 (all_amps), which is built locally on the AMPs. The
size is estimated with no confidence to be 16,053,773 rows (
18,558,161,588 bytes).
8) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan with a condition of ("(Party_Identification_Num
<> ICIDNO) AND (Field_10 = 1)") into Spool 16 (group_amps), which
is built locally on the AMPs. The size of Spool 16 is estimated
with no confidence to be 10,488,598 rows (10,404,689,216 bytes).
The estimated time for this step is 2.20 seconds.
9) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 16 are sent back to the user as the result
of statement 1.
All the required stats are collected.
Thanks for your help.

Show Oracle CPU usage for sessions as percentage

The following script returns the CPU usage for active sessions. The result shows the CPU usage in seconds.
What I need is the same report with cpu usage in percentage. What is the best way to do this?
--
-- Show CPU Usage for Active Sessions
--
SET PAUSE ON
SET PAUSE 'Press Return to Continue'
SET PAGESIZE 60
SET LINESIZE 300
COLUMN username FORMAT A30
COLUMN sid FORMAT 999,999,999
COLUMN serial# FORMAT 999,999,999
COLUMN "cpu usage (seconds)" FORMAT 999,999,999.0000
SELECT
s.username,
t.sid,
s.serial#,
SUM(VALUE/100) as "cpu usage (seconds)"
FROM
v$session s,
v$sesstat t,
v$statname n
WHERE
t.STATISTIC# = n.STATISTIC#
AND
NAME like '%CPU used by this session%'
AND
t.SID = s.SID
AND
s.status='ACTIVE'
AND
s.username is not null
GROUP BY username,t.sid,s.serial#
/
Long story short: you won't be able to do it with a single query; you will need to write a PL/SQL block that gathers data over an interval in order to obtain useful information.
Oracle keeps "accumulated time" statistics, which means the engine maintains a continuously increasing total. You will therefore have to define a start time and an end time for the analysis.
You can query 'DB CPU' from V$SYS_TIME_MODEL:
select value into t_db_cpu_i
from sys.V_$SYS_TIME_MODEL
where stat_name = 'DB CPU' ; /* start time */
...
select value into t_db_cpu_f
from sys.V_$SYS_TIME_MODEL
where stat_name = 'DB CPU' ; /* end time */
CPU statistics are affected by whether the engine has 1 CPU or 8 CPUs, so you will have to determine how many CPUs it is using.
You can query 'cpu_count' from V$PARAMETER to obtain this value:
select value into t_cpus
from sys.v_$parameter
where name='cpu_count' ;
Then it's quite simple: the maximum total CPU time is elapsed seconds multiplied by the number of CPUs. Over a 60-second window, 1 CPU gives a maximum total time of 60, 2 CPUs give 120, 3 CPUs give 180, and so on.
So you take start time and end time of the analyzed period using sysdate:
t_start := sysdate ;
t_end := sysdate ;
And now you compute the following. Note that 'DB CPU' in V$SYS_TIME_MODEL is reported in microseconds, so it is the consumed CPU time (used_cpu) that must be divided by 1,000,000:
seconds_elapsed := (t_end - t_start)*24*60*60 ;
total_time := seconds_elapsed * t_cpus ;
used_cpu := t_db_cpu_f - t_db_cpu_i ;
secs_cpu := used_cpu/1000000 ;
avgcpu := (secs_cpu/total_time)*100 ;
And that's it, "avgcpu" is the value you are looking for.
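Putting the fragments together, here is a minimal end-to-end sketch (my assembly, assuming a 60-second sampling window and EXECUTE privilege on DBMS_LOCK for the sleep):
DECLARE
  t_cpus          NUMBER;
  t_db_cpu_i      NUMBER;
  t_db_cpu_f      NUMBER;
  t_start         DATE;
  t_end           DATE;
  seconds_elapsed NUMBER;
  total_time      NUMBER;
  used_cpu        NUMBER;
  secs_cpu        NUMBER;
  avgcpu          NUMBER;
BEGIN
  -- How many CPUs the instance can use
  SELECT value INTO t_cpus FROM sys.v_$parameter WHERE name = 'cpu_count';

  -- Start of the sampling window
  t_start := sysdate;
  SELECT value INTO t_db_cpu_i FROM sys.v_$sys_time_model WHERE stat_name = 'DB CPU';

  DBMS_LOCK.SLEEP(60);  -- sampling window in seconds; adjust as needed

  -- End of the sampling window
  t_end := sysdate;
  SELECT value INTO t_db_cpu_f FROM sys.v_$sys_time_model WHERE stat_name = 'DB CPU';

  seconds_elapsed := (t_end - t_start) * 24 * 60 * 60;
  total_time      := seconds_elapsed * t_cpus;   -- CPU-seconds available
  used_cpu        := t_db_cpu_f - t_db_cpu_i;    -- microseconds of DB CPU consumed
  secs_cpu        := used_cpu / 1000000;         -- CPU-seconds consumed
  avgcpu          := (secs_cpu / total_time) * 100;

  DBMS_OUTPUT.PUT_LINE('CPU usage: ' || ROUND(avgcpu, 2) || '%');
END;
/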

Cassandra Timing out because of TTL expiration

I'm using DataStax Community v2.1.2-1 (AMI v2.5) with the preinstalled default settings, plus the read timeout increased to 10 seconds. Here is the issue:
create table simplenotification_ttl (
user_id varchar,
real_time timestamp,
insert_time timeuuid,
read boolean,
msg varchar, PRIMARY KEY (user_id, real_time, insert_time));
Insert Query:
insert into simplenotification_ttl (user_id, real_time, insert_time, read)
values ('test_3',14401440123, now(),false) using TTL 800;
For the same 'test_3' I inserted 33,000 tuples. [This problem does not happen with 24,000 tuples.]
Gradually I see:
cqlsh:notificationstore> select count(*) from simplenotification_ttl where user_id = 'test_3';
count
-------
15681
(1 rows)
cqlsh:notificationstore> select count(*) from simplenotification_ttl where user_id = 'test_3';
count
-------
12737
(1 rows)
cqlsh:notificationstore> select count(*) from simplenotification_ttl where user_id = 'test_3';
**errors={}, last_host=127.0.0.1**
I have experimented with this many times, even on different tables. Once this happens, even if I insert with the same user_id and do a retrieval with LIMIT 1, it times out.
I need TTL to work properly, i.e., give a count of 0 after the specified time. How do I solve this issue?
Thanks
[The rest of my node setup: two m3.large nodes using EC2Snitch]
You're running into a problem where the number of tombstones (deleted values) is passing a threshold, and then timing out.
You can see this if you turn on tracing and then try your select statement, for example:
cqlsh> tracing on;
cqlsh> select count(*) from test.simple;
activity | timestamp | source | source_elapsed
---------------------------------------------------------------------------------+--------------+--------------+----------------
...snip...
Scanned over 100000 tombstones; query aborted (see tombstone_failure_threshold) | 23:36:59,324 | 172.31.0.85 | 123932
Scanned 1 rows and matched 1 | 23:36:59,325 | 172.31.0.85 | 124575
Timed out; received 0 of 1 responses for range 2 of 4 | 23:37:09,200 | 172.31.13.33 | 10002216
You're kind of running into an anti-pattern for Cassandra where data is stored for just a short time before being deleted. There are a few options for handling this better, including revisiting your data model if needed. Here are some resources:
The cassandra.yaml configuration file - See section on tombstone settings
Cassandra anti-patterns: Queues and queue-like datasets
About deletes
For your sample problem, I tried lowering the gc_grace_seconds setting to 300 (5 minutes). That causes tombstones to be cleaned up more frequently than the default 10 days, but that may or may not be appropriate for your application. Read up on the implications of deletes and adjust as needed.
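For reference, a minimal sketch of that change (keyspace and table names taken from your cqlsh prompt; gc_grace_seconds is a per-table property):
ALTER TABLE notificationstore.simplenotification_ttl
WITH gc_grace_seconds = 300;  -- default is 864000 (10 days)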

Oracle trouble in getting data from a partitioned table

On a new job I have to figure out how some database reporting scripts are working.
There is one table that is giving me some trouble. I see in existing scripts that it is a partitioned table.
My problem is that whatever query I run on this table returns me "no rows selected".
Here are some details about my investigation in this table:
Table size estimate
SQL> select sum(bytes)/1024/1024 Megabytes from dba_segments where segment_name = 'PPREC';
MEGABYTES
----------
45.625
Partitions
There are a total of 730 partitions on date range.
SQL> select min(PARTITION_NAME),max(PARTITION_NAME) from dba_segments where segment_name = 'PPREC';
MIN(PARTITION_NAME) MAX(PARTITION_NAME)
------------------------------ ------------------------------
PART20110201 PART20130130
There are several tablespaces and partitions are allocated in them
SQL> select tablespace_name, count(partition_name) from dba_segments where segment_name = 'PPREC' group by tablespace_name;
TABLESPACE_NAME COUNT(PARTITION_NAME)
------------------------------ ---------------------
REC_DATA_01 281
REC_DATA_02 48
REC_DATA_03 70
REC_DATA_04 26
REC_DATA_05 44
REC_DATA_06 51
REC_DATA_07 13
REC_DATA_08 48
REC_DATA_09 32
REC_DATA_10 52
REC_DATA_11 35
REC_DATA_12 30
Additional query:
SQL> select * from dba_segments where segment_name='PPREC' and partition_name='PART20120912';
OWNER SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME HEADER_FILE HEADER_BLOCK BYTES BLOCKS EXTENTS
----- ------------ -------------- --------------- --------------- ----------- ------------ ----- ------ -------
HIST PPREC PART20120912 TABLE PARTITION REC_DATA_01 13 475315 65536 8 1
INITIAL_EXTENT NEXT_EXTENT MIN_EXTENTS MAX_EXTENTS PCT_INCREASE FREELISTS FREELIST_GROUPS RELATIVE_FNO BUFFER_POOL
-------------- ----------- ----------- ----------- ------------ --------- --------------- ------------ -----------
65536 1 2147483645 13 DEFAULT
Tabespace usage
Here is a space summary (composite of dba_tablespaces, dba_data_files, dba_segments, dba_free_space)
TABLESPACE_NAME TOTAL_MEGABYTES USED_MEGABYTES FREE_MEGABYTES
------------------------------ --------------- -------------- --------------
REC_01_INDX 30,700 250 30,449
REC_02_INDX 7,745 7 7,737
REC_03_INDX 22,692 15 22,677
REC_04_INDX 15,768 10 15,758
REC_05_INDX 25,884 16 25,868
REC_06_INDX 27,992 16 27,975
REC_07_INDX 17,600 10 17,590
REC_08_INDX 18,864 11 18,853
REC_09_INDX 19,700 12 19,687
REC_10_INDX 28,716 16 28,699
REC_DATA_01 102,718 561 102,156
REC_DATA_02 24,544 3,140 21,403
REC_DATA_03 72,710 4 72,704
REC_DATA_04 29,191 2 29,188
REC_DATA_05 42,696 3 42,692
REC_DATA_06 52,780 323 52,456
REC_DATA_07 16,536 1 16,534
REC_DATA_08 49,247 3 49,243
REC_DATA_09 30,848 2 30,845
REC_DATA_10 49,620 3 49,616
REC_DATA_11 40,616 2 40,613
REC_DATA_12 184,922 123,435 61,486
The tablespace usage seems to confirm that this table is not empty, in fact its last tablespace (REC_DATA_12) seems pretty busy.
Existing scripts
What I find puzzling is that there are some PL/SQL stored procedures that seem to work on that table and get data out of it.
An example of such a stored procedure is as follows:
procedure FIRST_REC as
vpartition varchar2(12);
begin
select 'PART'||To_char(sysdate,'YYYYMMDD') INTO vpartition FROM DUAL;
execute immediate
'MERGE INTO FIRST_REC_temp a
USING (SELECT bno, min(trdate) mintr,max(trdate) maxtr
FROM PPREC PARTITION ('||vpartition||') WHERE route_id IS NOT NULL AND trunc(trdate) <= trunc(sysdate-1)
GROUP BY bno) b
ON (a.bno=b.bno)
when matched then
update set a.last_tr = b.maxtr
when not matched then
insert (a.bno,a.last_tr,a.first_tr)
values (b.bno,b.maxtr,b.mintr)';
commit;
end FIRST_REC;
However if I try using the same syntax manually on the table, here is what I get:
SQL> select count(*) from PPREC PARTITION (PART20120912);
COUNT(*)
----------
0
I have tried a few random partitions and I always get the same 0 count.
Summary
- I see a table that seems to contain data (space used, tablespaces, data files)
- The table is partitioned (one partition per day over a period of 730 days ending end of January 2013)
- Scripts are extracting data from that table somehow
Question
- My queries using PARTITION are all returning me "no rows selected". What am I doing wrong? How could I find out how to extract data from this table?
I suppose it's possible that some other process might be deleting the data, but without visiting your site there's no way for anyone here to tell if that might be so.
I don't see in your post that you mentioned the name of the partitioning DATE column, but based on the SQL you posted I'll assume it's TRDATE - if this is not correct, change TRDATE in the statement below to be the partitioning column.
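Rather than guessing, you can also confirm the partitioning column and check whether the optimizer statistics show any rows per partition. A quick sketch against the data dictionary (assuming you can query the DBA views, as your dba_segments queries suggest):
-- Which column(s) the table is partitioned on
SELECT column_name, column_position
FROM dba_part_key_columns
WHERE name = 'PPREC';

-- Row counts per partition as of the last statistics gathering
SELECT partition_name, num_rows, last_analyzed
FROM dba_tab_partitions
WHERE table_name = 'PPREC'
ORDER BY partition_position;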
That said, give this a try:
SELECT COUNT(*)
FROM PPREC
WHERE TRDATE >= TO_DATE('01-SEP-2012 00:00:00', 'DD-MON-YYYY HH24:MI:SS');
This assumes you should have data in this table from September. If you find data, great. If you don't - well, Back In The Day (when men were men, women were women, and computers were water-cooled :-) we had a little saying about memory on IBM mainframes:
1. If you can see it, and it's there, it's Real.
2. If you can't see it, but it's there, it's Protected.
3. If you can see it, but it's not there, it's Virtual.
4. If you can't see it, and it's not there, it's GONE!
:-)
Use of the PARTITION clause should be reserved for situations where you are experiencing a performance problem and the usual fixes (adding indexes, deleting unnecessary data, human sacrifice, etc.) haven't worked. (Note: guessing about what is or is not going to be a performance problem is not allowed. Until you've got a performance problem, you don't have a performance problem. Over the years I've found that software spends a lot of execution time in the darndest places :-). Basically, write your queries normally and trust the database to get it right. In the general case, always write the simplest code, and do the simplest thing, that could possibly work; 99+ percent of the time it will be fine. That allows you to spend your optimization time on the less-than-one-percent of cases where simple isn't good enough, and most of the software you write or design will be simple and easy to understand.
Share and enjoy.
