Currently, I have an endpoint that allows the update of multiple records. Before saving the changes to the database I need to validate the data that is being sent from the front end.
I am having an issue with validations that require checks against all the other records in the database table (e.g. date interval overlaps/gaps, unique pair checks).
In these cases, when I try to do the validations, I have two sets of data:
The data sent from the front end, stored in memory/variables.
The data in the database.
For the validations to run correctly I need a way to merge the data in memory (the updated records) with the data in the database (original records + other data that is not currently being updated).
Is there a good way of doing this that does not require loading everything into memory and merging both datasets there?
Another idea I am thinking of is to open a database transaction, write the new data to the database, and then use a dirty read when executing the gap/overlap check queries. I don't know if this is a good approach though.
Extra notes:
I am using Oracle as a database and Dapper to communicate with it.
The tables that need validation usually hold millions of records.
The same issue applies to the create endpoint.
Another example:
I am trying to create entities. The create endpoint is called and it has this data in the body (date format dd/mm/yyyy):
StartDate     EndDate
01/01/2022    05/01/2022
10/01/2022    11/01/2022
12/01/2022    15/01/2022
In the database I have these records saved:
Id   StartDate     EndDate
1    06/01/2022    09/01/2022
2    16/01/2022    20/01/2022
I need to check if there are any gaps between the dates. If there are, I need to send a warning to the user (the data in the database can be invalid - the application has old data and I can't do anything about that at the moment).
The way I check for this right now is by using the SQL below:
WITH CTE_INNERDATA AS (
    SELECT s.STARTDATE, s.ENDDATE
    FROM mytable s
    WHERE FK = :somefkvalue
    UNION ALL
    SELECT :StartDt AS STARTDATE, :EndDt AS ENDDATE FROM DUAL -- this row contains the data from one of the rows that came from the front end
),
CTE_DATA AS (
    SELECT ctid.STARTDATE, ctid.ENDDATE,
           LAG(ctid.ENDDATE, 1) OVER (ORDER BY ctid.STARTDATE) AS PREV_ENDDATE
    FROM CTE_INNERDATA ctid
)
SELECT COUNT(1)
FROM CTE_DATA ctd
WHERE PREV_ENDDATE IS NOT NULL
  AND PREV_ENDDATE < STARTDATE
Using this SQL query, when validating the third row (12/01/2022 - 15/01/2022) a gap is reported between 09/01/2022 and 12/01/2022, even though the second row sent from the front end (10/01/2022 - 11/01/2022) falls in that range.
This issue would be fixed if, instead of unioning a single row, I could union all the rows sent from the front end, but I can't figure out a way to do something like that (a sketch of what I mean is below).
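For illustration, what I have in mind is something like the sketch below, where every front-end row gets its own UNION ALL branch (the numbered bind names :StartDt1/:EndDt1 etc. are made up for this sketch; building the statement dynamically per request is exactly the part I would like to avoid):

WITH CTE_INNERDATA AS (
    SELECT s.STARTDATE, s.ENDDATE
    FROM mytable s
    WHERE FK = :somefkvalue
    UNION ALL
    SELECT :StartDt1, :EndDt1 FROM DUAL -- first front-end row
    UNION ALL
    SELECT :StartDt2, :EndDt2 FROM DUAL -- second front-end row
    UNION ALL
    SELECT :StartDt3, :EndDt3 FROM DUAL -- third front-end row
),
CTE_DATA AS (
    SELECT STARTDATE, ENDDATE,
           LAG(ENDDATE, 1) OVER (ORDER BY STARTDATE) AS PREV_ENDDATE
    FROM CTE_INNERDATA
)
SELECT COUNT(1)
FROM CTE_DATA
WHERE PREV_ENDDATE IS NOT NULL
  AND PREV_ENDDATE < STARTDATE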
Update:
I iterate through the records the front end sent and call this method to check for gaps.
private async Task ValidateIntervalGaps(int someFkValue, DateTime startDate, DateTime endDate)
{
    var connection = _connectionProvider.GetOpenedConnection();
    var gapsCount = await connection.QueryFirstAsync<int>(@"<<Query from above>>",
        new { StartDt = startDate, EndDt = endDate, somefkvalue = someFkValue });
    if (gapsCount > 0)
    {
        // Add warning message here
    }
}
I solved this in SQL Server with a trigger. Now I face it on Oracle.
I have a big set of data that periodically increases with new items.
Each item has these fundamental columns:
ID - string identifier (not null)
DATETIME (not null)
DATETIME_EMIS - emission datetime, optional (possibly null, always null for type 1), equal to the DATETIME of the corresponding emission item
TYPE (0 or 1)
VALUE (only if type 1)
It is basically a logbook.
For example: an item with ID='FIREALARM' and datetime='2023-02-12 12:02' has a closing item like this:
ID='FIREALARM' at datetime='2023-02-12 15:11', emission datetime='2023-02-12 12:02' (equal to the emission item's datetime).
What I need is to obtain a final item in the destination table like this:
ID='FIREALARM', DATETIME_BEGIN='2023-02-12 12:02', DATETIME_END='2023-02-12 15:11'
Not all items have a closing datetime (the ones with type=1 instead of 0); in this case the next item should be used to close the previous one (with the problem of finding it). For example:
Item1: ID='DEVICESTATUS', datetime='2023-02-12 22:11', Value='Broken';
Item2: ID='DEVICESTATUS', datetime='2023-02-12 22:14', Value='Running'
should result in:
ID='DEVICESTATUS', DATETIME_BEGIN='2023-02-12 22:11', DATETIME_END='2023-02-12 22:14', Value='Broken'
The final data should be extracted by a SELECT query as fast as possible.
The processing should be independent of the order of insertion.
In SQL Server, I created a trigger with several operations involving a temporary table and some queries on the inserted set and the entire destination table - a complex procedure that is not worth showing to understand the problem.
Now I have discovered that Oracle has some limitations and it is not easy to port the trigger to it. For example, it is not easy to use a temporary table in the same way, and trigger operations run for each row.
I am asking what a good strategy in Oracle could be to process the data into its final form, considering that the set increases continuously and the opening and closing items must be reduced to a single item. I am not asking for a solution to the problem; I am trying to understand which Oracle features could be useful to achieve a complex transformation like this. Thanks.
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row pattern matching:
SELECT *
FROM   destination
MATCH_RECOGNIZE(
  PARTITION BY id
  ORDER BY datetime
  MEASURES
    FIRST(datetime) AS datetime_begin,
    LAST(datetime)  AS datetime_end,
    FIRST(value)    AS value
  PATTERN ( ^ any_row+ $ )
  DEFINE
    any_row AS 1 = 1
)
Which, for the sample data:
CREATE TABLE destination (id, datetime, value) AS
SELECT 'DEVICESTATUS', DATE '2023-02-12' + INTERVAL '22:11' HOUR TO MINUTE, 'Broken' FROM DUAL UNION ALL
SELECT 'DEVICESTATUS', DATE '2023-02-12' + INTERVAL '22:14' HOUR TO MINUTE, 'Running' FROM DUAL;
Outputs:
ID            DATETIME_BEGIN       DATETIME_END         VALUE
DEVICESTATUS  2023-02-12 22:11:00  2023-02-12 22:14:00  Broken
fiddle
I'm attempting to extract all the Merged records out of CRM and I see in the "Audit History" page of a Contact record the following:
If I click on it, I get the following:
And if I profile the code to see what SQL runs, it executes the following:
exec sp_executesql N'select
"audit0".AuditId as "auditid"
, "audit0".AttributeMask as "attributemask"
, "audit0".ChangeData as "changedata"
, "audit0".CreatedOn as "createdon"
, "audit0".Action as "action"
, "audit0".Operation as "operation"
, "audit0".CallingUserId as "callinguserid"
, "audit0".UserId as "userid"
, "audit0".ObjectId as "objectid"
, "audit0".ObjectTypeCode as "objecttypecode"
, "audit0".CallingUserIdName as "callinguseridname"
, "audit0".UserIdName as "useridname"
, "audit0".ObjectIdName as "objectidname"
from
Audit as "audit0"
where
("audit0".AuditId = #AuditId0)',N'#AuditId0 uniqueidentifier',
#AuditId0='7FE1B120-87EC-E811-8BE0-005056B12EA2'
with the results shown as:
Does anyone know how I can get the record showing that it was actually merged (the "Yes" value)? I can't make sense of how the results are displayed as Old Value vs New Value.
When you click the Save button on the record in CRM there's one audit row created in SQL, which describes all the changes made by this save. Only the old values are stored in SQL.
So how do you get the new values?
You can sort the audit query by createdon ascending; for every row, the new value is the next row's old value. To get the final new value you should join the audit table with the contact table and get the Merged field value from the contact table.
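For illustration, a rough sketch of that pairing (ChangeData is treated here as the stored old value, and the filter column is an assumption; this is not a definitive CRM query):

SELECT a.AuditId,
       a.CreatedOn,
       a.ChangeData AS OldValue,                                   -- old value stored on this row
       LEAD(a.ChangeData) OVER (ORDER BY a.CreatedOn) AS NewValue  -- next row's old value = this row's new value
FROM Audit AS a
WHERE a.ObjectId = @ObjectId  -- the merged Contact record
ORDER BY a.CreatedOn;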
In Oracle (PROD), we will be creating views on table(s) and the users will be querying the views to fetch data for each reporting period (a single month, e.g. between '01-DEC-2015' and '31-DEC-2015'). We created a view as:
CREATE OR REPLACE VIEW VW_TABLE1 AS
SELECT ACCNT_NBR, BIZ_DATE, MAX(COL1) COL1, MAX(COL2) COL2
FROM TABLE1_D
WHERE BIZ_DATE IN (SELECT BIZ_DATE FROM TABLE2_M GROUP BY BIZ_DATE)
GROUP BY ACCNT_NBR, BIZ_DATE;
The issue here is that TABLE1_D (a daily table, with data from Dec 2015 to Feb 2016) has records for multiple dates per month; for Dec 2015, say, it has records for 01-DEC-2015, 02-DEC-2015, ..., 29-DEC-2015, 30-DEC-2015 (not necessarily continuous, but loaded on business dates), with each day holding close to 2,500,000 records.
TABLE2_M is a monthly table and has a single date per month (e.g. for Dec 2015, say 30-DEC-2015), with around 4000 records for each date.
When we query the view as
SELECT * FROM VW_TABLE1 WHERE BIZ_DATE BETWEEN '01-DEC-2015' AND '31-DEC-2015'
it returns the aggregated data in TABLE1_D for 30-DEC-2015 as expected. I thought the grouping on BIZ_DATE in TABLE1_D was unnecessary, as only one BIZ_DATE would be output by the inner query.
I checked by removing BIZ_DATE from the final GROUP BY, assuming that there would be data for a single day from the inner query.
Hence I took 2 rows for the dates 30-DEC-2015 and 30-JAN-2016 from both tables, created them in SIT for testing, and created the view as:
CREATE VIEW VW_TABLE1 AS
SELECT ACCNT_NBR, MAX(BIZ_DATE) BIZ_DATE, MAX(COL1) COL1, MAX(COL2) COL2
FROM TABLE1_D
WHERE BIZ_DATE IN (SELECT BIZ_DATE FROM TABLE2_M GROUP BY BIZ_DATE)
GROUP BY ACCNT_NBR;
The SELECT with BETWEEN for each month (or = the exact month date) gives correct data in SIT; i.e., when BETWEEN is used for a single month, it produces that month's data.
SELECT * FROM VW_TABLE1 WHERE BIZ_DATE BETWEEN '01-DEC-2015' AND '31-DEC-2015';
SELECT * FROM VW_TABLE1 WHERE BIZ_DATE = '30-DEC-2015';
With this, I modified the view DDL in PROD (to be the same as SIT). But surprisingly the same select (the 2nd one with = '30-DEC-2015'; the 1st one was taking too long due to the volume of data, hence aborted) returned no data. I suspect the inner query is returning all dates from 30-DEC-2015 to 30-JAN-2016, and thereby MAX(BIZ_DATE) is being derived from 30-JAN-2016 (TABLE2_M doesn't have Feb 2016 data).
I verified whether there were any Oracle version differences between SIT and PROD and found them to be the same from v$version (11.2.0.4.0). Can you please explain this behaviour, where the same query on the same view DDL in different environments returns different results with the same data?
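For what it's worth, a small sketch of the behaviour I suspect (the account number FOO and the two dates are made-up sample values): with GROUP BY ACCNT_NBR only, an account that has rows on both dates surfaces a single view row carrying the later date, so the = '30-DEC-2015' filter matches nothing.

SELECT ACCNT_NBR, MAX(BIZ_DATE) AS BIZ_DATE, MAX(COL1) COL1, MAX(COL2) COL2
FROM TABLE1_D
WHERE BIZ_DATE IN (SELECT BIZ_DATE FROM TABLE2_M GROUP BY BIZ_DATE)  -- returns both 30-DEC-2015 and 30-JAN-2016
GROUP BY ACCNT_NBR;
-- ACCNT_NBR  BIZ_DATE     ...
-- FOO        30-JAN-2016  ...
-- so WHERE BIZ_DATE = '30-DEC-2015' on the view returns no rows for FOO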
I am writing an SQL query that should first search for the first value, and only if this value is missing should it search for the second value.
I have two tables. One of these tables contains the modification date (this is not always filled and can be null) and a creation date which is always filled.
Now what I want is that the query first looks in the table with the modification date and only if it is null looks at the table with the creation date.
Example of the query:
Select *
from all_articles
where to_char(modification_date, 'YYYYMMDD') = to_char(sysdate, 'YYYYMMDD')-1
-- if this is an empty record then
to_char(creation_date, 'YYYYMMDD') = to_char(sysdate, 'YYYYMMDD')-1
Can anyone help me with this query?
Almost all major RDBMSs have built-in functions to handle such a situation.
Oracle has the NVL function, which works as follows:
NVL(Modified_dt, Create_dt);
The above returns the Modified_dt column data by default. However, if that is null, it returns Create_dt.
See here for details:
http://www.w3schools.com/sql/sql_isnull.asp
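Applied to the query from the question, a minimal sketch (assuming both date columns live on all_articles and that "yesterday" is what the date arithmetic was aiming at) could be:

SELECT *
FROM all_articles
WHERE TRUNC(NVL(modification_date, creation_date)) = TRUNC(SYSDATE) - 1;  -- fall back to creation_date when modification_date is null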
I have some trouble writing SQL queries. Inside a package function, I am trying to reuse the result of a query in two other queries. Here's how it goes:
My schema stores Requests. Each Request concerns multiple destinations. Also, each Request is detailed in another table (Request_Detail). In addition, Requests are identified by their Ids.
So, I am using mainly 3 tables: one for Requests, another for the destinations and the last one for the details. Each of these tables is indexed by the Request_Id column.
The query I want to optimize is when a user wants to find all requests, plus their destinations and commands that have been sent between two dates.
I want to query the Request_Table first in order to get all Request_ids. Then, use this Request_Ids list to query the Command table and the Destination one.
I couldn't find how to do that... I can't use ref cursors as they can't be fetched twice... I just need some array-like or column-like variable to store the Request_Ids, then use this variable twice or more...
Here are the original queries I would like to optimize:
FUNCTION EXTRACT_REQUEST_WITH_DATE (ze_from_date DATE, ze_to_date DATE,
                                    x_request_list OUT cursor_type,
                                    x_destination_list OUT cursor_type,
                                    x_command_list OUT cursor_type) RETURN VARCHAR2 AS
  my_function_id  VARCHAR2(80) := PACKAGE_ID || '.EXTRACT_REQUEST_WITH_DATE';
  my_return_code  VARCHAR2(2);
BEGIN
  OPEN x_request_list FOR
    SELECT NAME, DESTINATION_TYPE,
           SUCCESS_CNT, STATUS, STATUS_DESCRIPTION,
           REQUEST_ID, PARENT_REQUEST_ID, DEDUPLICATION_ID, SUBMIT_DATE, LAST_UPDATE_DATE
    FROM APP_DB.REQUEST_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID;

  OPEN x_destination_list FOR
    SELECT REQUEST_ID, DESTINATION_ID
    FROM APP_DB.DESTINATION_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID;

  OPEN x_command_list FOR
    SELECT SEQUENCE_NUMBER, NAME, PARAMS, DESTINATION_ID,
           SEND_DATE, LAST_UPDATE_DATE, PROCESS_CNT, STATUS, STATUS_DESCRIPTION,
           VALIDITY_PERIOD, TO_ABORT_FLAG
    FROM APP_DB.REQUEST_DETAILS_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID, DESTINATION_ID, SEQUENCE_NUMBER;

  RETURN RETURN_OK;
END EXTRACT_REQUEST_WITH_DATE;
As you see, we use the same predicate (that is the SUBMIT_DATE conditions) for all 3 queries. I think there's maybe some way to optimize it by getting REQUEST_IDs then using them in the remaining queries.
Thanks for hearing me out!
Based on the queries you posted I'd just add a SUBMIT_DATE index to REQUEST_TABLE, DESTINATION_TABLE and REQUEST_DETAILS_TABLE and leave your SQL as is. All three queries will be optimized and will run just as fast as matching against a table of REQUEST_ID values.
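For illustration, that is just one single-column index per table (the index names below are made up):

CREATE INDEX REQUEST_SUBMIT_DATE_IX     ON APP_DB.REQUEST_TABLE (SUBMIT_DATE);
CREATE INDEX DESTINATION_SUBMIT_DATE_IX ON APP_DB.DESTINATION_TABLE (SUBMIT_DATE);
CREATE INDEX REQ_DETAILS_SUBMIT_DATE_IX ON APP_DB.REQUEST_DETAILS_TABLE (SUBMIT_DATE);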
So...
I found this method that seems to be efficient enough.
First, define global types to use as arrays. Here's the code.
Object (record) type:
create or replace
TYPE "GENERIC_ID" IS OBJECT(ID VARCHAR2(64));
Variable-size array of GENERIC_ID:
create or replace
TYPE "GENERIC_ID_ARRAY" IS TABLE OF "GENERIC_ID";
Then, populating is done via extend() in a FOR LOOP. The resulting array can be used as a table in SQL queries, using:
TABLE(CAST(my_array_of_ids AS GENERIC_ID_ARRAY))
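For illustration, a minimal sketch of the whole pattern (table and column names follow the question; the date bounds are made-up example values, not the production code):

DECLARE
  my_array_of_ids GENERIC_ID_ARRAY := GENERIC_ID_ARRAY();
  ze_from_date    DATE := DATE '2015-12-01';  -- example bounds
  ze_to_date      DATE := DATE '2016-01-01';
BEGIN
  -- Populate the collection with extend() inside a FOR LOOP.
  FOR rec IN (SELECT REQUEST_ID
              FROM APP_DB.REQUEST_TABLE
              WHERE SUBMIT_DATE >= ze_from_date
                AND SUBMIT_DATE < ze_to_date)
  LOOP
    my_array_of_ids.extend();
    my_array_of_ids(my_array_of_ids.COUNT) := GENERIC_ID(rec.REQUEST_ID);
  END LOOP;

  -- Reuse the collection as a table in the other queries.
  FOR dest IN (SELECT d.REQUEST_ID, d.DESTINATION_ID
               FROM APP_DB.DESTINATION_TABLE d
               WHERE d.REQUEST_ID IN
                     (SELECT ID FROM TABLE(CAST(my_array_of_ids AS GENERIC_ID_ARRAY))))
  LOOP
    NULL;  -- process each destination row here
  END LOOP;
END;
/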
Thanks,