MonetDB: JDBC-based queries often return -1 as affected rows even though the query was successful

e.g., the statements below:
create table t1 ( age int, name varchar(10) )
insert into t1 values(1, 'name'),(2, 'surname')
copy select * from t1 into 't1.dat' DELIMITERS '|','
','"' null as '';
The COPY SELECT command returns -1 as the affected row count, although it should return 2. I am not sure why; at many other times I have seen the same query return the affected row count correctly.
If I run the same query in the DBeaver tool I am using, I see this:
Updated Rows: -1
Query: copy select * from t1 into 't1.dat' DELIMITERS '|','
','"' null as ''
Finish time: Sat Apr 30 16:53:28 IST 2022

I think you have to use an absolute path to the export file, e.g.
copy select * from t1 into '/home/user/t1.dat' DELIMITERS '|','
','"' null as '';
When you see a message like "2 affected rows", it might actually be the result summary of the successfully completed INSERT INTO statement. That doesn't mean the COPY INTO <file> statement was successful.
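For example, a minimal sketch (the /home/user path is only an illustration; use any directory the server process can write to):
create table t1 ( age int, name varchar(10) );
insert into t1 values (1, 'name'), (2, 'surname');
-- reports: 2 affected rows
copy select * from t1 into '/home/user/t1.dat' delimiters '|','\n','"' null as '';
-- should likewise report 2 affected rows once the file can actually be written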

MIN function behavior changed on Oracle databases after SAS Upgrade to 9.4M7

I have a program that has been working for years. Today, we upgraded from SAS 9.4M3 to 9.4M7.
proc setinit
Current version: 9.04.01M7P080520
Since then, I am not able to get the same results as before the upgrade.
Please note that I am querying on Oracle databases directly.
Trying to replicate the issue with a minimal, reproducible SAS table example, I found that the issue disappears when querying a SAS table instead of the Oracle databases.
Let's say I have the following dataset:
data have;
infile datalines delimiter="|";
input name :$8. id :$1. value :$8.;
datalines;
Joe|A|TLO
Joe|B|IKSK
Joe|C|Yes
;
Using the temporary table:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have
group by name;
quit;
Results:
name A
Joe TLO
However, when running the very same query directly on the Oracle database, I get a missing value instead:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have_oracle
group by name;
quit;
name A
Joe
As per the documentation, the MIN() function behaves properly when used on the SAS table:
The MIN function returns a missing value (.) only if all arguments are missing.
I believe this happens when Oracle doesn't understand the function that SAS is passing it - the MIN functions in SAS and Oracle are very different, and the Oracle equivalent of SAS's MIN would be LEAST().
So my guess is that the upgrade messed up how SAS translates its MIN function to Oracle, but it remains a guess. Has anyone run into this type of behavior?
EDIT: @Richard's comment
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
proc sql;
create table want as
select t1.name,
min(case when id = 'A' then value else "" end) as A length 8
from oracle_db.names t1 inner join oracle_db.ids t2 on (t1.tid = t2.tid)
group by t1.name;
ORACLE_26: Prepared: on connection 0
SELECT * FROM NAMES
ORACLE_27: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='NAMES' AND
UIC.TABLE_NAME='NAMES' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_28: Executed: on connection 1
SELECT statement ORACLE_27
ORACLE_29: Prepared: on connection 0
SELECT * FROM IDS
ORACLE_30: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='IDS' AND
UIC.TABLE_NAME='IDS' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_31: Executed: on connection 1
SELECT statement ORACLE_30
ORACLE_32: Prepared: on connection 0
select t1."NAME", MIN(case when t2."ID" = 'A' then t1."VALUE" else ' ' end) as A from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID" group by t1."NAME"
ORACLE_33: Executed: on connection 0
SELECT statement ORACLE_32
ACCESS ENGINE: SQL statement was passed to the DBMS for fetching data.
NOTE: Table WORK.SELECTED_ATTR created, with 1 row and 2 columns.
! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.34 seconds
cpu time 0.09 seconds
Use the SASTRACE= system option to log SQL statements sent to the DBMS.
options SASTRACE=',,,d';
will provide the most detailed logging.
From the prepared statement you can see why you are getting a blank from the Oracle query.
select
t1."NAME"
, MIN ( case
when t2."ID" = 'A' then t1."VALUE"
else ' '
end
) as A
from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID"
group by
t1."NAME"
The SQL MIN() aggregate function excludes null values from consideration.
In SAS SQL, a blank value is also interpreted as null, so in SAS your query returns the min non-null value, TLO.
In the Oracle-transformed query, the SAS blank '' is transformed to ' ' (a single blank character), which is not null; thus ' ' < 'TLO' and you get the blank result.
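You can see the difference with a quick sketch directly in Oracle:
-- ' ' is a one-character string, not null, so MIN() considers it
select min(v) from (select 'TLO' v from dual union all select ' ' v from dual);
-- returns ' '
-- '' is null in Oracle, so MIN() skips it
select min(v) from (select 'TLO' v from dual union all select '' v from dual);
-- returns TLO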
The actual MIN you want to force in Oracle is min(case when id = "A" then value else null end), which @Tom has shown is possible by omitting the ELSE clause.
The only way to see the actual difference is to run the query with trace in the prior SAS version, or if lucky, see the explanation in the (ignored by many) "What's New" documents.
Why are you using ' ' or '' as the ELSE value? Perhaps Oracle is treating a string with blanks in it differently than a null string.
Why not use null in the ELSE clause?
or just leave off the ELSE clause and let it default to null?
libname mylib oracle .... ;
proc sql;
create table want as
select name
, min(case when id = "A" then value else null end) as A length 8
from mylib.have_oracle
group by name
;
quit;
Also try running the Oracle code yourself, instead of using implicit pass-through.
proc sql;
connect to oracle ..... ;
create table want as
select * from connection to oracle
(
select name,
min(case when id = 'A' then value else null end) as A
from have_oracle
group by name
)
;
quit;
When I try to reproduce this in Oracle I get the result you are looking for, so I suspect it has something to do with SAS (which I'm not familiar with).
with t as (
select 'Joe' name, 'A' id, 'TLO' value from dual union all
select 'Joe' name, 'B' id, 'IKSK' value from dual union all
select 'Joe' name, 'C' id, 'Yes' value from dual
)
select name
, min(case when id = 'A' then value else '' end) as a
from t
group by name;
NAME A
---- ----
Joe TLO
Unrelated: if you are only interested in id = 'A', then a better query would be:
select name
, min(value) as a
from t
where id = 'A'
group by name;

Oracle PL/SQL Update statement looping forever - 504 Gateway Time-out

I'm trying to update a table based on another one's information:
Source_Table (Table 1) columns:
TABLE_ROW_ID (Based on trigger-sequence when insert)
REP_ID
SOFT_ASSIGNMENT
Description (Table 2) columns:
REP_ID
NEW_SOFT_ASSIGNMENT
This is my loop statement:
SELECT count(table_row_id) INTO V_ROWS_APPROVED FROM Source_Table;
FOR i IN 1..V_ROWS_APPROVED LOOP
SELECT REQUESTED_SOFT_MAPPING INTO V_SOFT FROM Source_Table WHERE ROW_ID = i;
SELECT REP_ID INTO V_REP_ID FROM Source_Table WHERE ROW_ID = i;
UPDATE Description_Table D
SET D.NEW_SOFT_ASSIGNMENT = V_SOFT
WHERE D.REP_ID = V_REP_ID;
END LOOP;
END;
The end result of this loop is a beautiful "504 Gateway Time-out".
I know the issue is in the UPDATE query, but there's no other way (that I can think of) of doing it.
Can someone give me a hand, please?
Thanks
Unless your row_id values are contiguous - i.e. count(row_id) == max(row_id) - then this will get a no-data-found. Sequences aren't gapless, so this seems fairly likely. We have no way of telling if that is happening and somehow that is leaving your connection hanging until it times out, or if it's just taking a long time because you're doing a lot of individual queries and updates over a large data set. (And you may be squashing any errors that do occur, though you haven't shown that.)
You don't need to query and update in a loop though, or even use PL/SQL; you can apply all the values in the source table to the description table with a single update or merge:
merge into description_table d
using source_table s
on (s.rep_id = d.rep_id)
when matched then
update set d.new_soft_assignment = s.requested_soft_mapping;
db<>fiddle with some dummy data, including a non-contiguous row_id to show that error occurring.
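If you prefer a plain UPDATE over MERGE, a correlated update is a sketch of the same idea (assuming rep_id is unique in the source table, so the subquery returns a single row):
update description_table d
set d.new_soft_assignment = (
select s.requested_soft_mapping
from source_table s
where s.rep_id = d.rep_id
)
where exists (
select 1 from source_table s
where s.rep_id = d.rep_id
);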

Oracle XQuery delete, insert, update

Everything below I am able to do in separate operations, but as I am relatively new to XQuery, I am struggling to work out how to perform multiple operations in one go, which would be neat.
I am trying to update an XML column with some XML data from another column (this XML came from a spreadsheet with two columns: promotion numbers and the department numbers within each promotion), which I load into a table and then run the statement below against.
INSERT INTO proms
select promid, '<Promotion><MultibuyGroup><MMGroupID>'||depts||'</MMGroupID>
</MultibuyGroup></Promotion>' DEPTS
from (
SELECT promid, listagg (id,'</MMGroupID><MMGroupID>') within GROUP
(ORDER BY id) as depts FROM mmgroups
GROUP BY promid
);
This creates a table with a column for PROMID and an XML column like the one below (just one MMGroupID in the example for ease of reading):
<Promotion><MultibuyGroup><MMGroupID>1</MMGroupID></MultibuyGroup></Promotion>
When I run the below, I can successfully update any XML in the PROMOTIONS where the value of ID column matches the value of PROMID in the table I have created above.
merge into PROMOTIONS tgt
using (
select PROMID
, xmlquery('/Promotion/MultibuyGroup/MMGroupID'
passing xmlparse(document DEPTS)
returning content
) as new_mmg
from PROMS WHERE PROMID = 'EMP35Level1'
) src
on (tgt.ID = src.PROMID)
when matched then update
set tgt.xml =
xmlserialize(document
xmlquery(
'copy $d := .
modify
insert node $new_mmg as last into $d/Promotion/MultibuyGroup
return $d'
passing xmlparse(document tgt.xml)
, src.new_mmg as "new_mmg"
returning content
)
no indent
) ;
However, what I would like my query to do is delete any existing MMGroupID nodes from the target XML (if they exist), then replace them with all of the nodes from the source XML.
Also within the target XML is a LastUpdated node, which I would like to update with SYSDATE at the time of the update.
It would also be nice to update two separate columns at the same time: LAST_UPDATED DATE, and ROW_UPDATED NUMBER(20,0), which holds the epoch time of the update.
<Promotion>
<LastUpdated>2018-08-23T14:56:35+01:00</LastUpdated>
<MajorVersion>1</MajorVersion>
<MinorVersion>52</MinorVersion>
<PromotionID>EMP35Level1</PromotionID>
<Description enabled="1">Staff Discount 15%</Description>
<MultibuyGroup>
<AlertThresholdValue>0.0</AlertThresholdValue>
<AlertThresholdValue currency="EUR">0.0</AlertThresholdValue>
<UseFixedValueInBestDeal>0</UseFixedValueInBestDeal>
<UpperThresholdValue>0.0</UpperThresholdValue>
<UpperThresholdValue currency="EUR">0.0</UpperThresholdValue>
<GroupDescription>Employee Discount 15%</GroupDescription>
<Rolling>0</Rolling>
<DisableOnItemDiscount>1</DisableOnItemDiscount>
<UniqueItems>0</UniqueItems>
<AllItems>0</AllItems>
<RoundingRule>3</RoundingRule>
<UseLowestNetValue>0</UseLowestNetValue>
<TriggerOnLostSales>0</TriggerOnLostSales>
<MMGroupID>2</MMGroupID>
<MMGroupID>8</MMGroupID>
<MMGroupID>994</MMGroupID>
</MultibuyGroup>
<Timetable>
<XMLSchemaVersion>1</XMLSchemaVersion>
<CriterionID/>
<StartDate>1970-01-01T00:00:00+00:00</StartDate>
<FinishDate>2069-12-31T00:00:00+00:00</FinishDate>
</Timetable>
<AllowedForEmployeeSale>1</AllowedForEmployeeSale>
<Notes enabled="1"/>
<AlertMessage enabled="1"/>
</Promotion>
Since posting, I have edited the query to:
merge INTO PROMOTIONS3 tgt
using (
SELECT PROMID
,xmlquery('/Promotion/MultibuyGroup/MMGroupID'
passing xmlparse(document DEPTS)
returning content
) as new_mmg
FROM PROMS WHERE PROMID = 'EMP35Level1'
) src
ON (tgt.ID = src.PROMID)
when matched then update
SET tgt.xml =
xmlserialize(document
xmlquery(
'copy $d := .
modify(
delete nodes $d/Promotion/MultibuyGroup/MMGroupID,
insert node $new_mmg as last into $d/Promotion/MultibuyGroup,
replace value of node $d/Promotion/LastUpdated with current-dateTime())
return $d'
passing xmlparse(document tgt.xml)
,src.new_mmg as "new_mmg"
returning content
)
no indent
)
,last_updated = (SELECT SYSDATE FROM dual)
,row_updated = (SELECT ( SYSDATE - To_date('01-01-1970 00:00:00','DD-MM-YYYY HH24:MI:SS') ) * 24 * 60 * 60 * 1000 FROM dual) ;
So it is almost correct, except that I need
<LastUpdated>2018-08-23T14:56:35+01:00</LastUpdated>
not
<LastUpdated>2018-11-09T11:53:10.591000+00:00</LastUpdated>
so I need to figure that out.
Cheers.
For syntax you should google for XQuery Update Facility.
An example
xmlquery(
'copy $d := .
modify(
delete nodes $d/Promotion/MultibuyGroup/MMGroupID,
replace value of node $d/Promotion/LastUpdated with current-date(),
insert node <node1>x</node1> as last into $d/Promotion/MultibuyGroup,
insert node <node2>x</node2> as last into $d/Promotion/MultibuyGroup)
return $d'
)
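On the fractional-seconds problem from the question: one workaround (a sketch, not verified on every Oracle release) is to format the timestamp in SQL with TO_CHAR and bind it in through the PASSING clause, instead of calling current-dateTime() inside the XQuery:
set tgt.xml =
xmlserialize(document
xmlquery(
'copy $d := .
modify replace value of node $d/Promotion/LastUpdated with $ts
return $d'
passing xmlparse(document tgt.xml)
, to_char(systimestamp, 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM') as "ts"
returning content
)
no indent
)
That yields a value like 2018-08-23T14:56:35+01:00, without fractional seconds.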

pl-sql include column names in query

A weird request, maybe, but here goes. My boss wants me to create an admin version of a page we have that displays data from an Oracle query in a table.
The admin page, instead of displaying the data (the query returns 1 row), needs to return the table name and column name.
Ex: Instead of:
Name Initial
==================
Bob A
I want:
Name Initial
============================
Users.FirstName Users.MiddleInitial
I realize I can do this in code but would rather just modify the query to return the data I want so I can leave the report generation code mostly alone.
I don't want to do it in a stored procedure.
So when I spit out the data in the report using something like:
blah blah = MyDataRow("FirstName")
I can leave that as is but instead of it displaying "BOB" it would display "Users.FirstName"
And I want to do the query using select * if possible, instead of listing all the columns.
So for each of the columns I am querying in the *, I want to get (instead of the column value) the tablename.columnname or tablename|columnname.
Hope you are following - I am confusing myself...
pseudo:
select tablename + '.' + Columnname as WhateverTheColumnNameIs
from Table1
left join Table2 on whatever...
Join Table_Names on blah blah
Whew - after writing all this I think I will just do it on the code side.
But if you are up for it, maybe it's a fun challenge.
Oracle does not provide a native way (there is no pseudocolumn) to get the column name of a table as a result of a query against that table. But you might consider these two approaches:
Extract the column names from an XMLTYPE, formed by passing a cursor expression (your query) to the xmltable() function:
-- your table
with t1(first_name, middle_name) as(
select 1,2 from dual
), -- your query
t2 as(
select * -- col1 as "t1.col1"
--, col2 as "t1.col2"
--, col3 as "t1.col3"
from t1
)
select *
from ( select q.object_value.getrootelement() as col_name
, rownum as rn
from xmltable('//*'
passing xmltype(cursor(select * from t2 where rownum = 1))
) q
where q.object_value.getrootelement() not in ('ROWSET', 'ROW')
)
pivot(
max(col_name) for rn in (1 as "name", 2 as "initial")
)
Result:
name initial
--------------- ---------------
FIRST_NAME MIDDLE_NAME
Note: in order for column names to be prefixed with the table name, you need to list them explicitly in the select list of the query and supply an alias manually.
PL/SQL approach: starting from Oracle 11g, you can use the dbms_sql package, and specifically its describe_columns procedure, to get the names of the columns in a cursor (your SELECT).
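A minimal sketch of that approach (the query text and table name are only illustrations):
declare
c integer := dbms_sql.open_cursor;
col_cnt integer;
cols dbms_sql.desc_tab;
begin
dbms_sql.parse(c, 'select * from users', dbms_sql.native);
dbms_sql.describe_columns(c, col_cnt, cols);
for i in 1 .. col_cnt loop
dbms_output.put_line(cols(i).col_name); -- column name as seen by the cursor
end loop;
dbms_sql.close_cursor(c);
end;
/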
This might be what you are looking for: try selecting from the system views USER_TAB_COLS or ALL_TAB_COLS.
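For example (USERS is a stand-in for your table name):
select table_name || '.' || column_name as qualified_name
from all_tab_cols
where table_name = 'USERS'
order by column_id;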

optimizing a dup delete statement Oracle

I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in the where clause.
What is a duplicate?
If 2 or more records have the same values in the columns id, cid, type, trefid, ordrefid, amount and paydt, then they are duplicates.
The DELETEs delete about 1 million records.
Can they be re-written in any way to make them quicker?
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
HASH JOIN 1296491
Access Predicates
AND
A.ID=ITEM_1
A.CID=ITEM_2
ITEM_3=NVL(TYPE,'-99999')
ITEM_4=NVL(TREFID,'-99999')
ITEM_5=NVL(ORDREFID,'-99999')
ITEM_6=NVL(AMOUNT,(-99999))
ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Filter Predicates
LOADDT<MAX(LOADDT)
TABLE ACCESS TABLE1 FULL 267904
VIEW VW_SQ_1 690385
SORT GROUP BY 690385
TABLE ACCESS TABLE1 FULL 267904
How large is the table? If the count of deleted rows is up to about 12% of the table, then you may think about an index.
Could you somehow partition your table - like week by week - and then scan only the current week?
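For illustration, a sketch of week-by-week interval partitioning (11g and later; the names and cutoff date are assumptions):
create table table1_part
partition by range (loaddt)
interval (numtodsinterval(7, 'DAY'))
( partition p0 values less than (date '2014-01-01') )
as select * from table1;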
Maybe this could be more efficient. When you're using an aggregate function, Oracle must walk through all relevant rows (in your case a full scan), but when you use EXISTS it stops as soon as the first occurrence is found. (And of course the query would be much faster if there were one function-based index - because of the NVL - covering all the columns in the where clause.)
DELETE FROM TABLE1 A
WHERE exists (
SELECT 1
FROM TABLE1 B
WHERE
a.loaddt < b.loaddt and
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
Although some may disagree, I am a proponent of running large, long-running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-). Also, I'm not sure why you need to join table1 to itself to identify duplicates (and I'd be curious whether you ever run into snapshot-too-old issues with your current approach). You also shouldn't need multiple delete statements; all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id,cid,type,trefid,ordrefid,amount,paydt, count(1) as cnt
from table1
group by id,cid,type,trefid,ordrefid,amount,paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
-- make sure my_dups_mv is refreshed first
cursor dup_cur is
select * from my_dups_mv;
type duprec_t is record(row_id rowid);
duprec duprec_t;
type duptab_t is table of duprec_t index by pls_integer;
duptab duptab_t;
l_ctr pls_integer := 0;
l_dupcnt pls_integer := 0;
begin
for rec in dup_cur
loop
l_ctr := l_ctr + 1;
-- assuming needed indexes exist
select rowid
bulk collect into duptab
from table1
where id = rec.id
and cid = rec.cid
and type = rec.type
and trefid = rec.trefid
and ordrefid = rec.ordrefid
and amount = rec.amount
and paydt = rec.paydt
-- order by whatever makes sense to make the "keeper" float to top
order by loaddt desc
;
for i in 2 .. duptab.count
loop
l_dupcnt := l_dupcnt + 1;
delete from table1 where rowid = duptab(i).row_id;
end loop;
if (mod(l_ctr, 10000) = 0) then
-- log to log table here (calling autonomous procedure you'll need to implement)
insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
commit;
end if;
end loop;
commit;
end;
Check your log table for progress status.
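The insert_logtable call above is the autonomous procedure you would need to implement; a sketch of one possible version (the log table layout is an assumption):
create table log_table (
log_time timestamp default systimestamp,
source varchar2(100),
message varchar2(4000)
);

create or replace procedure insert_logtable(p_source in varchar2, p_msg in varchar2) is
pragma autonomous_transaction;
begin
insert into log_table (source, message) values (p_source, p_msg);
commit; -- autonomous, so it does not touch the main delete transaction
end;
/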
1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
Assuming you have Enterprise Edition, a sane server configuration, and you are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, which only uses a GROUP BY, looks nicer and may run faster. But it will probably also run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to the temporary tablespace. And it may also reduce read IO if you use some partitioning strategy like the one jakub.petr suggested.
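Applied to the first delete, one parseable placement of that filter is inside the correlated subquery (a sketch; the 14-day window assumes duplicates only ever arrive against recent loads):
delete from table1 a where loaddt < (
select max(loaddt) from table1 b
where a.id = b.id
and a.cid = b.cid
-- ... same NVL() comparisons as in the original ...
and b.loaddt >= sysdate - 14
);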
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)
