I have an app that requires that only a single instance of a task runs at a time. To check whether an instance of the task is already running, I check the task's status. If the task has one or more of a set of in-progress statuses, the app knows the task is already running and skips it for now. These tasks can be called from multiple places, so I could see a hundred or so calls for the task within a minute.
I have the following query on Oracle 11g.
The SQL:
INSERT INTO Log2 (LogID, Action, EventDateTime)
SELECT 102211, 2, SYSDATE
FROM dual
WHERE NOT EXISTS
  (SELECT LogID
   FROM Log2 T3
   WHERE T3.Param2 = 102
     AND T3.Action = 34
     AND T3.AuditLogID NOT IN
       (SELECT T2.LogID
        FROM Log2 T1, Log2 T2
        WHERE T1.Action IN (1, 2, 3)
          AND T2.Action = 6
          AND T1.Param3 = T2.Param3
          AND T1.Param2 = 102));
At the moment the above query will sometimes allow 2 records to be inserted at the same time (EventDateTime tells me that). How can I ensure that this does not happen? Do I need to add some locking hints? I thought the whole query was atomic. The JDBC connection is on auto-commit.
Several parts of the app update this table, but I only want this locking behaviour for the task described above. The other parts of the app that add records to the Log2 table only ever insert one record at a time, so this single-instance behaviour is not required for them.
Thanks
Related
In the adbtask_settings Oracle table, what is the meaning of the INTERVAL and MAX_RUN_TIME columns? This table is used for AutoTask job execution, so I want to know the significance of these columns.
INTERVAL=180 -- means that every 180 seconds AutoTask will try to schedule your task.
MAX_RUN_TIME=900 -- means that if the client task has already run for 900 seconds but has not yet finished, a timeout error is signalled to interrupt it; in other words, it is the maximum runtime allowed.
If you want to test these settings, use the procedures below to change the values, then check the job execution time with the SELECT query that follows:
execute DBMS_ADBTASK_ADMIN.MODIFY_ADBTASK_SETTINGS ('Auto Compress','INTERVAL',180);
execute DBMS_ADBTASK_ADMIN.MODIFY_ADBTASK_SETTINGS ('Auto Compress','MAX RUN TIME',900);
select run_duration from DBA_SCHEDULER_JOB_RUN_DETAILS where job_name='ORA$_ATSK_AUTOCOMP' order by actual_start_date desc;
I have a views table joined with a temp table, with the parameters below intentionally enabled:
hive.auto.convert.join=true;
hive.execution.engine=tez;
The code snippet is:
CREATE TABLE STG_CONVERSION AS
SELECT CONV.CONVERSION_ID,
CONV.USER_ID,
TP.TIME,
CONV.TIME AS ACTIVITY_TIME,
TP.MULTI_DIM_ID,
CONV.CONV_TYPE_ID,
TP.SV1
FROM VIEWS TP
JOIN SCU_TMP CONV ON TP.USER_ID = CONV.USER_ID
WHERE TP.TIME <= CONV.TIME;
In the normal scenario, both tables can have any number of records.
In the SCU_TMP table, however, only 10-50 records are expected per user ID.
But in some cases a couple of user IDs arrive with around 10k-20k records in the SCU temp table, which creates a cross-product effect.
In such cases the job runs forever, with just one mapper left to complete.
Is there any way to optimise this and run this gracefully?
I was able to solve it with the query below.
set hive.exec.reducers.bytes.per.reducer=10000;
CREATE TABLE STG_CONVERSION AS
SELECT CONV.CONVERSION_ID,
CONV.USER_ID,
TP.TIME,
CONV.TIME AS ACTIVITY_TIME,
TP.MULTI_DIM_ID,
CONV.CONV_TYPE_ID,
TP.SV1
FROM (SELECT USER_ID, TIME, MULTI_DIM_ID, SV1 FROM VIEWS SORT BY TIME) TP
JOIN SCU_TMP CONV ON TP.USER_ID = CONV.USER_ID
WHERE TP.TIME <= CONV.TIME;
The problem arises because, when a single user ID dominates the table, the join for that user is processed through a single mapper, which gets stuck.
Two modifications were made:
1) Replaced the table name with a subquery, which added a sorting step before the join.
2) Reduced the hive.exec.reducers.bytes.per.reducer parameter to 10 KB.
The SORT BY TIME in step (1) added a shuffle phase, which evenly distributed the data that had previously been skewed by the user ID.
Reducing the bytes-per-reducer parameter spread the data across all available reducers.
With these two enhancements, a 10-12 hour run was reduced to 45 minutes.
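As an aside, Hive also has a built-in skew-join optimisation that may be worth testing as an alternative; the key-count threshold below is the default and purely illustrative, and note that it applies to common (reduce-side) joins rather than map joins:
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=100000;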
I need to update a table once per period; however, periods are either 4 or 5 weeks long.
To schedule a SQL query to run this process, I have been attempting to use a CREATE TABLE statement as the result of a successful WHEN clause.
Is this possible? I am not having any luck, and I cannot find anything that states one way or the other.
e.g.
select
  case when 1=1 then
    (create table db.case_test_1 as
     select 'Y' as test)
  end as test
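For what it's worth, a CASE expression can only return a value - it cannot execute DDL - so the statement above will not parse. A common workaround, sketched here assuming an Oracle-style environment like the other threads here (the week-number test is purely illustrative), is to issue the DDL conditionally via dynamic SQL:
BEGIN
  -- hypothetical period test: run only when the ISO week number is a multiple of 4
  IF MOD(TO_NUMBER(TO_CHAR(SYSDATE, 'IW')), 4) = 0 THEN
    EXECUTE IMMEDIATE 'CREATE TABLE db.case_test_1 AS SELECT ''Y'' AS test FROM dual';
  END IF;
END;
/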
Hi, I have a partitioned table, and when I try to update a few selected partitions in a loop, passing the partition name dynamically, it is not working.
FOR i IN 1 .. partition_tbl.COUNT LOOP
  UPDATE cdr_data PARTITION (partition_tbl(i)) cdt
  SET A = 'B'
  WHERE cdt.ab = 'c';
END LOOP;
The partition_tbl collection holds all the partitions on which I want to perform this update.
Please suggest how to proceed here.
Thanks in advance
What is the problem that you are trying to solve? It doesn't make sense to run separate UPDATE statements against each partition in a loop. If you really want to update every row in the table where ab = 'c', just issue a single UPDATE statement:
UPDATE cdr_data cdt
SET a = 'B'
WHERE ab = 'c'
potentially with a PARALLEL hint that would allow Oracle to update multiple partitions in parallel.
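For example (a sketch - the degree of 4 is arbitrary, and parallel DML must be enabled for the session first):
ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(cdt, 4) */ cdr_data cdt
SET a = 'B'
WHERE ab = 'c';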
If you really, really want to update each partition independently, it would make much more sense to do so based on the partition keys. For example, if your table has daily partitions based on a date
FOR i IN 1 .. <<number of daily partitions>>
LOOP
UPDATE cdr_data cdt
SET a = 'B'
WHERE ab = 'c'
AND partition_key = <<minimum date>> + i;
END LOOP;
Using the partition( <<partition name>> ) syntax is an absolute last resort. If you're really determined to go down that path, you'd need to use dynamic SQL, constructing the SQL statement in the loop and using EXECUTE IMMEDIATE or dbms_sql to execute it.
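A minimal sketch of that last resort, assuming partition_tbl is a PL/SQL collection of partition names as in the question (the sample names here are hypothetical):
DECLARE
  TYPE t_part_names IS TABLE OF VARCHAR2(30);
  -- hypothetical names; in the question these come from partition_tbl
  partition_tbl t_part_names := t_part_names('P201401', 'P201402');
BEGIN
  FOR i IN 1 .. partition_tbl.COUNT LOOP
    -- DBMS_ASSERT.SIMPLE_SQL_NAME rejects anything that is not a valid identifier
    EXECUTE IMMEDIATE
      'UPDATE cdr_data PARTITION ('
      || DBMS_ASSERT.SIMPLE_SQL_NAME(partition_tbl(i))
      || ') cdt SET cdt.a = ''B'' WHERE cdt.ab = ''c''';
  END LOOP;
END;
/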
Preferably, let Oracle take care of partitions - pretend in your statement that they do not exist:
UPDATE cdr_data cdt SET A='B' WHERE cdt.ab='c'
From the WHERE conditions and your partition definitions, it will choose the right partition(s) to apply the command to.
There may be rare occasions when you need partition-bounded DML, but the example shown is certainly not one of them. In such a situation you cannot supply the partition name dynamically, just as you cannot normally supply a table name dynamically, e.g. you can't
select * from _variable_containing_table_name
If you really insist on a partition-bounded command, then it would be
select * from table_name partition (partition_Name)
e.g.
select * from bills partition (p201403)
To use a dynamic partition name, the whole statement would have to be executed dynamically via EXECUTE IMMEDIATE or dbms_sql.
But once again: do not choose the partition, Oracle will.
We have a process that aggregates some data and inserts the results into another table that we use for efficient querying. The problem we're facing is that we now have multiple aggregators running at roughly the same time.
We use the original record's id as the primary key in this new table - a unique constraint. However, if two aggregation processes are running at the same time, one of them will fail with a unique constraint violation.
Is there a way to specify some kind of locking mechanism that will make the second writer wait until the first is finished? Alternatively, is there a way to tell Oracle to ignore that specific row and continue with the rest?
Unfortunately it's not practical to reduce the aggregation to a single process, as the procedures that follow rely on an up-to-date version of the data being available, and those procedures do need to scale out.
Edit:
The following is my [redacted] query:
INSERT INTO agg_table
SELECT h.id, h.col, h.col2
FROM history h
JOIN call c
  ON c.callid = h.callid
WHERE h.id > (SELECT COALESCE(MAX(id), 0) FROM agg_table)
It is possible to run an INSERT statement with an error logging clause. The example from the Oracle docs is as follows:
INSERT INTO dw_empl
SELECT employee_id, first_name, last_name, hire_date, salary, department_id
FROM employees
WHERE hire_date > sysdate - 7
LOG ERRORS INTO err_empl ('daily_load') REJECT LIMIT 25
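Note that the err_empl table has to exist before the insert runs; it is normally created with the DBMS_ERRLOG package, using the table names from the example above:
EXECUTE DBMS_ERRLOG.CREATE_ERROR_LOG('DW_EMPL', 'ERR_EMPL');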
Alternatively, you could try using a MERGE statement. You would merge into the summary table with a select from the detail table: if a match is not found you INSERT, and if it is found you UPDATE. I believe this solution will handle your concurrency issues, but you would need to test it.
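A hedged sketch of that MERGE against the [redacted] query above - the column names are taken from that query, and only the insert branch is needed since the aggregation never updates existing rows:
MERGE INTO agg_table a
USING (SELECT h.id, h.col, h.col2
       FROM history h
       JOIN call c ON c.callid = h.callid) src
ON (a.id = src.id)
WHEN NOT MATCHED THEN
  INSERT (id, col, col2)
  VALUES (src.id, src.col, src.col2);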
Have a look at the FOR UPDATE clause. If you correctly write the SELECT statement with a FOR UPDATE clause, within a transaction, before your update/insert statements, you will be able to "lock" the required records.
Serialising the inserts is probably the best way, as there's no method that will get you around the problem of the multiple inserts being unable to see what each other is doing.
DBMS_LOCK is probably the appropriate serialisation mechanism.
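A minimal sketch of that approach - the lock name is hypothetical and error handling is omitted:
DECLARE
  v_handle VARCHAR2(128);
  v_result INTEGER;
BEGIN
  -- map an arbitrary lock name to a handle (the name is made up here)
  DBMS_LOCK.ALLOCATE_UNIQUE('AGG_TABLE_LOAD', v_handle);
  -- block for up to 60 seconds waiting for the exclusive lock
  v_result := DBMS_LOCK.REQUEST(v_handle, DBMS_LOCK.X_MODE, timeout => 60);
  IF v_result = 0 THEN
    -- run the INSERT ... SELECT from the question here, then commit
    v_result := DBMS_LOCK.RELEASE(v_handle);
  END IF;
END;
/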