I am getting a unique constraint violation error on both a direct-update and a standard ADSO while inserting 180k records. I am sure that all the records are unique for my composite primary key (a combination of 12 fields).
I am inserting the data using an AMDP in platform edition.
I validated the uniqueness of the records using this:
Select a, b, c, d from "Calculation View" group by a, b, c, d having count(*)>1
This query returns no rows. My ADSO does not currently have any data.
Also validated using:
Select Count(*) from (Select a, b, c, d from "Calculation View")
Select Count(*) from (Select distinct a, b, c, d from "Calculation View")
The count is the same in both queries.
Here is the error:
Error when executing the database procedure
"ZFXX_VOLUME_REPORTING=>METH_INSERT_BMS_PIVOT". SQL error: "301". SQL
message: "unique constraint violated:
"SAPABAP1"."ZFXX_VOLUME_REPORTING=>METH_INSERT_BMS_PIVOT#stb2#20170616162711"":
line 10 col 3 (at pos 253):
"SAPABAP1"."ZFXX_VOLUME_REPORTING=>METH_INSERT_BMS_PIVOT": line 27 col
1 (at pos 903): [301] (range 3) unique constraint violated exception:
unique constraint violated: TrexUpdate failed on table
'SAPABAP1:/BIC/AG9SC26ADU2' with error: unique constraint violation in
self check for table SAPABAP1:/BIC/AG9SC26ADU2en,
constraint='$trexexternalkey$',
div='10,1030201703;6,201703;12,FR0010451260;2,20;4,FR04;6,DE1410;7,Managed;1,0;3,DIS;1,D;1,0;12,Alternatives',
pos=195705, indexname=/BIC/AG9SC26ADU2~0, rc=55".
Without deeper on-system analysis it's close to impossible to see what's happening here. It's likely a bug, but to verify that, SAP support will have to review the system, the problematic ADSO, and the data source. I highly recommend opening a support incident.
We have almost 1B records in a ReplicatedMergeTree table.
The primary key is (a, b, c).
Our app keeps writing into this table with every user action (we accumulate almost a million records per hour).
We append (store) the latest timestamp (updated_at) for a given unique combination of (a, b).
The key requirement is to provide a roll-up against the latest timestamp for a given combination of (a, b, c).
Currently, we are processing the queries as
select a,b,c, sum(x), sum(y)...etc
from table_1
where (a,b,updated_at) in (select a,b,max(updated_at) from table_1 group by a,b)
and c in (...)
group by a,b,c
Clarification on the sub-query:
(select a,b,max(updated_at) from table_1 group by a,b)
^ This part is for illustration only; our app writes the latest updated_at for every (a, b), meaning the clause shown above is more like
(select a,b,updated_at from tab_1_summary)
[where tab_1_summary has the latest record for a given (a, b)]
Note: We have to keep the grouping criteria as-is.
The table is structured with PARTITION BY (c) and ORDER BY (a, b, updated_at).
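For reference, that layout corresponds to something like the following DDL (a sketch; the column types, replication paths, and the summary-table engine are assumptions, not from the original post):

CREATE TABLE table_1
(
    a String,
    b String,
    c String,
    updated_at DateTime,
    x Float64,
    y Float64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/table_1', '{replica}')
PARTITION BY c
ORDER BY (a, b, updated_at);

-- tab_1_summary keeps exactly one latest row per (a, b);
-- ReplicatedReplacingMergeTree is one plausible engine for that.
CREATE TABLE tab_1_summary
(
    a String,
    b String,
    updated_at DateTime
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/tab_1_summary', '{replica}', updated_at)
ORDER BY (a, b);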
The question is: is there a way to write a better query, one that returns results faster? We are required to shave a few seconds off the overall processing.
FYI: We toyed with a materialized view on ReplicatedReplacingMergeTree, but given the size of this table and the constant inserts, the FINAL clause doesn't perform well compared to the query above.
Thanks in advance!
Just as a test, try using a join instead of tuple IN (tuples):
select t.a, t.b, t.c, sum(x), sum(y)...etc
from table_1 AS t inner join tab_1_summary using (a, b, updated_at)
where c in (...)
group by t.a, t.b, t.c
Consider using AggregatingMergeTree to pre-calculate result metrics:
CREATE MATERIALIZED VIEW table_1_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(updated_at)
ORDER BY (updated_at, a, b, c)
AS SELECT
updated_at,
a,b,c,
sum(x) AS x, /* see the SimpleAggregateFunction data type: https://clickhouse.tech/docs/en/sql-reference/data-types/simpleaggregatefunction/ */
sum(y) AS y,
/* For non-simple functions, use the AggregateFunction data type: https://clickhouse.tech/docs/en/sql-reference/data-types/aggregatefunction/ */
-- etc.
FROM table_1
GROUP BY updated_at, a, b, c;
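One caveat: for plain sum(x) / sum(y) to survive background merges in an AggregatingMergeTree, the stored columns should use the SimpleAggregateFunction type mentioned in the comments above. With an explicit target table that could look like this (a sketch; names and types are assumptions):

CREATE TABLE table_1_agg
(
    updated_at DateTime,
    a String,
    b String,
    c String,
    x SimpleAggregateFunction(sum, Float64),
    y SimpleAggregateFunction(sum, Float64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(updated_at)
ORDER BY (updated_at, a, b, c);

CREATE MATERIALIZED VIEW table_1_mv TO table_1_agg
AS SELECT updated_at, a, b, c, sum(x) AS x, sum(y) AS y
FROM table_1
GROUP BY updated_at, a, b, c;

Selecting from table_1_mv then reads table_1_agg, and the outer sum() re-aggregates any rows the background merge has not combined yet.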
And query it this way to get the result:
select a,b,c, sum(x), sum(y)...etc
from table_1_mv
where (updated_at,a,b) in (select updated_at,a,b from tab_1_summary)
and c in (...)
group by a,b,c
I am sure this is the most common problem with Cassandra.
Nevertheless:
I have this example table:
CREATE TABLE test.test1 (
a text,
b text,
c timestamp,
id uuid,
d timestamp,
e decimal,
PRIMARY KEY ((a), c, b, id)) WITH CLUSTERING ORDER BY (c ASC, b ASC, id ASC);
My query:
select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age from test.test1 where a = 'x' and c > -2208981600000 ;
This works fine, but I can't get the data sorted by column b, which I need. I need all the entries in column b and their corresponding ages.
eg:
select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age from test.test1 where a = 'x' and c > -2208981600000 order by b;
gives the error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Order by currently only supports the ordering of columns following their declared order in the PRIMARY KEY"
I have tried different orders in the clustering columns and different options in the partition key, but I get caught by some logic every time and just can't seem to outwit Cassandra to get what I want. If I get the sort order I want, I lose the ability to filter on column c.
Is there some logic I am not applying here, or alternatively, what must I give up to get a list of entries in column b with the corresponding age?
Short answer: it's impossible to sort data on an arbitrary column using CQL, even if it's part of the primary key. Cassandra sorts data first by the first clustering column, then within it by the second, and so on (see this answer).
So the only workaround right now is to fetch all data & sort on the client side.
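For illustration, an ORDER BY that follows the declared clustering order is accepted, so the closest you can get server-side with the table above is ordering by c:

select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age
from test.test1
where a = 'x' and c > -2208981600000
order by c desc;

Sorting by b then has to happen in the application after the rows are fetched.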
Let's say I have two tables:
Table A
PK SIZE
89733 5
83644 3
87351 8
84423 11
Table B
ID Table_A_PK
1 89733,83644,86455
2 87351,89542
3 84132
4 84566,84646
Note: Column Table_A_PK is of a collection type, which is why it holds multiple values.
I want to select the value of the SIZE column of Table A where the PK value exists in Table B's Table_A_PK column.
I tried this, but it's not working and throws an error:
Select {a.SIZE}
from {A as a}
where {a.PK} in ({{ SELECT {b.Table_A_PK} FROM {B as b} }})
Actual Result: ORA-01722: invalid number
Expected Result
SIZE
5
3
8
First, collection types are deprecated. If you are using them by choice, prefer relations instead; they are much easier to work with.
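For example, if Table_A_PK were replaced by a many-to-many relation (call it BToARelation; the name is hypothetical), the lookup would become a plain join through the relation's source and target attributes:

SELECT {a.SIZE}
FROM
{
    B AS b
    JOIN BToARelation AS rel ON {rel.source} = {b.PK}
    JOIN A AS a ON {rel.target} = {a.PK}
}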
That said, I once implemented this kind of lookup with the LIKE operator:
... WHERE Table_A_PK LIKE '%MYPK%'
However, this is NOT best practice: a substring match can produce false positives (e.g. '%123%' would also match 5123).
You might be able to use the CONCAT function to concatenate the % signs with the PK from the original table for a join. However, I have not tried this:
SELECT {a.SIZE}
FROM {A AS a JOIN B AS b
ON {b.TABLE_A_PK} LIKE Concat('%', {a.pk}, '%') }
I would suggest using a Relation instead of a CollectionType. If you are in a situation where you can't modify the item type, then you can search using the LIKE operator:
SELECT {a.SIZE}
FROM
{
B AS b JOIN A AS a
ON {b.TABLE_A_PK} LIKE CONCAT( '%', CONCAT( {a.PK} , '%' ) )
}
DatabaseA - TableA - FieldA VARCHAR2
DatabaseB - TableB - FieldB NUMBER [dblink created]
SELECT *
FROM TableB@dblink b
INNER JOIN TableA a
ON b.FieldB = a.FieldA
There are 2 complications:
1. FieldA is VARCHAR2 but FieldB is NUMBER.
2. FieldA contains '-' values and FieldB contains 0 values.
More info about the fields
FieldA: VARCHAR2(15), NOT NULL
Sample values
-
123
No non-numeric values, except for -
FieldB: NUMBER(5,0)
Sample values
0
123
No non-numeric values
What I'm trying to do is to ignore the rows if FieldA='-' OR FieldB=0, otherwise compare FieldA to FieldB.
SELECT *
FROM TableB@dblink b
JOIN TableA a
ON to_char(b.FieldB) = a.FieldA
I get the following error:
SQL Error: 17410, SQLState: 08000
No more data to read from socket.
NULLs will never match with equals, so your join already takes care of that.
You would get an implicit type conversion of (probably) the NUMBER to VARCHAR, so that should also be taken care of.
Having said that, I am a big proponent of not relying on implicit datatype conversions. So I would write my query as
SELECT *
FROM TableB@dblink b
JOIN TableA a
ON to_char(b.FieldB) = a.FieldA
If that is not giving the results you want, perhaps posting examples of the data in each table and the results you desire would be helpful.
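As for the stated requirement of skipping rows where FieldA = '-' or FieldB = 0, that filter is missing from both queries above; a sketch:

SELECT *
FROM TableB@dblink b
JOIN TableA a
  ON to_char(b.FieldB) = a.FieldA
WHERE a.FieldA <> '-'
  AND b.FieldB <> 0;

Also note that "No more data to read from socket" usually points to the connection or the server process dying rather than to the SQL text itself, so it may be worth checking the database alert log as well.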
Full disclosure: this is part of a homework question, but I have tried 6 different versions and I am stuck.
I am trying to find one manager every time the query runs, i.e. I put the department id in and one name pops out. Currently, I get all the names, multiple times. I have tried nesting with an '=', not nesting, union, intersection, etc. I can get the manager id with a basic query; I just can't get the name. The current version looks like this:
select e.ename
from emp e
where d.managerid in (select unique d.managerid
from works w, dept d, emp e1
where d.did=1 and e1.eid=w.eid and d.did=w.did );
I realize it's probably a really basic mistake that I am not seeing. Any ideas?
It's not clear what you mean by getting one manager each time: should it be a different manager each time, or the same one?
Let's walk through your query:
You select all employees from the emp table whose manager id is in the subquery's result set.
The subquery returns all manager ids for dept 1; the rest of the tables and conditions do not influence the result set.
I think did is the primary key of the dept table. If so, your query may be rewritten as:
select e.ename
from emp e
where d.managerid in (select unique d.managerid
from dept d
where d.did=1);
But this query returns all employees, not the manager of dept 1, because the outer condition does not actually restrict e.
If you need the manager, you should select the employee who is the manager. If eid is the primary key of emp and managerid holds an employee id, you need something like:
select e.ename
from emp e
where e.eid in (select unique d.managerid
from dept d
where d.did=1);
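Equivalently, assuming did and eid are the keys described above, the same lookup can be written as a join (a sketch):

select e.ename
from emp e
join dept d on d.managerid = e.eid
where d.did = 1;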