Oracle concurrency problem with a competing race condition

I am facing a concurrency problem with an Oracle DB. Say I have three objects of type A that need to be processed, and only after the final object A has been processed can I move on to processing objects of type B. Additionally, the objects of type A are processed in parallel by multiple deployed instances.
Example:
Desired Behavior:
ObjectA-1 - Update status -> IsLastObjectA -> false
ObjectA-2 - Update status -> IsLastObjectA -> false
ObjectA-3 - Update status -> IsLastObjectA -> true -> Begin processing Objects of type B
Current behavior (failing):
ObjectA-1 - Update status -> IsLastObjectA -> false
ObjectA-2 - Update status (happens in parallel with ObjectA-3) -> IsLastObjectA (at this point all Object As are in complete status) -> true -> Begin processing Objects of type B (This should only occur once)
ObjectA-3 - Update status (happens in parallel with ObjectA-2) -> IsLastObjectA (at this point all Object As are in complete status)-> true -> Begin processing Objects of type B (This should only occur once)
Ideally I want the transactions to happen in a serialized way (similar to the SERIALIZABLE isolation level). But not only does this hurt performance, I also don't have permission to increase the INITRANS parameter to the recommended value of 3. Also, SELECT FOR UPDATE and locking techniques of that nature can't be used, because we only update the status once and there are no similar objects being processed: they are all processed based on a unique primary key. Therefore, one object A never tries to update another. There is only reading the status of all the other object As after the current object's own status has been updated.
I have tried the different propagation types that Oracle allows, as well as a locking technique, and nothing has worked. SERIALIZABLE seems to be the best option, but I don't have the permissions to implement it.
The code snippet below is a mocked version of the actual code. The endpoint in the controller gets called from a microservice that listens to a messaging queue. The service consumes messages off the queue (this service is not shown).
@Data
public class ObjectA {
    private int status;
    private Long id;
}

@Service
// Let's assume there is a listener before this call that picks up a message off a queue,
// maps the JSON to an ObjectA,
// and then calls this method
public boolean processObjectA(final ObjectA objectA) {
    final boolean isLastUpdate;
    isLastUpdate = service.updateObjectAndIsLastObjectToProcess(objectA); // for simplicity, let's assume this calls the method in the controller
    if (isLastUpdate) {
        // Call DB, gather all info related to ObjectBs, and begin to process
    }
    return isLastUpdate;
}
public class Controller {

    @Autowired
    private ObjectService objectService;

    @PutMapping("/updatestatus/islastobject")
    public boolean isLastObjectToUpdate(
            @RequestParam(name = "id") final Long id,
            @RequestParam(name = "status") final int statusCode) {
        final boolean updateStatus;
        final boolean hasLastObjectBeenProcessed;
        try {
            // Update object to complete status
            updateStatus = objectService.updateObject(id, statusCode);
            if (updateStatus) {
                // Verify that all ObjectAs are in complete status
                hasLastObjectBeenProcessed = objectService.hasLastObjectBeenProcessed(id);
                return hasLastObjectBeenProcessed;
            } else {
                throw new RuntimeException();
            }
        } catch (RuntimeException e) {
            return false;
        }
    }
}
ORACLE queries used:
// Update ObjectA to complete status
updateStatus query = update Object_A o set o.status = 9 where o.id = id
// Verifies that all ObjectAs are in complete (9) status
hasLastObjectBeenProcessed query = SELECT o.id FROM Object_A o WHERE o.status = 9

Assuming two possible statuses on each row ("Active": this row needs to be processed; "Completed": this row is done), how about a model like the one below for your "worker" threads (pseudo-code):
work_to_be_done =
    select count(*)
    from   table
    where  status = 'Active'
    and    objtype = 'A'
    and    rownum = 1;

if work_to_be_done = 0 then
    <move on to objtype = 'B'>
else
    open cursor for
        select *
        from   table
        where  status = 'Active'
        and    objtype = 'A'
        for update skip locked;

    for each row in cursor
        process row;
        update status to 'Completed';
end;
First we see if there is any work to be done for "A". If this returns zero, then everyone has completed and committed their work, and we're good to move on to "B".
If it returns non-zero, then there is active work to do, but some of it might already be being worked on by other threads. So we do a SKIP LOCKED query to find rows that we can work on. It might return zero rows, but that's OK; we just loop around (maybe sleep a little) and start from the top. Eventually we will either find work to do or find that all the work has been done, and we can move on.
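For concreteness, here is a minimal Java/JDBC sketch of that worker loop under stated assumptions: the table name object_table, its id/status/objtype columns, and the processRow callback are illustrative stand-ins, not names from the question.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ObjectAWorker {

    private final DataSource dataSource;

    public ObjectAWorker(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Loops until every type-A row is 'Completed' and committed, then returns. */
    public void drainTypeAWork() throws SQLException, InterruptedException {
        while (true) {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false); // SELECT ... FOR UPDATE needs a live transaction

                // Step 1: is there any 'Active' type-A work left at all?
                try (PreparedStatement check = con.prepareStatement(
                        "select count(*) from object_table "
                      + "where status = 'Active' and objtype = 'A' and rownum = 1");
                     ResultSet rs = check.executeQuery()) {
                    rs.next();
                    if (rs.getInt(1) == 0) {
                        con.commit();
                        return; // everyone is done and committed: safe to move on to B
                    }
                }

                // Step 2: claim rows nobody else holds; SKIP LOCKED silently
                // ignores rows locked by other instances.
                boolean claimedAny = false;
                try (PreparedStatement claim = con.prepareStatement(
                        "select id from object_table "
                      + "where status = 'Active' and objtype = 'A' "
                      + "for update skip locked");
                     ResultSet rs = claim.executeQuery()) {
                    while (rs.next()) {
                        long id = rs.getLong("id");
                        processRow(id); // hypothetical per-row business logic
                        try (PreparedStatement done = con.prepareStatement(
                                "update object_table set status = 'Completed' where id = ?")) {
                            done.setLong(1, id);
                            done.executeUpdate();
                        }
                        claimedAny = true;
                    }
                }
                con.commit(); // releases the row locks

                if (!claimedAny) {
                    Thread.sleep(200); // others hold the remaining rows; back off, then re-check
                }
            }
        }
    }

    private void processRow(long id) {
        // placeholder for the real per-row processing
    }
}

The commit after the updates releases the row locks, so the count in step 1 only reaches zero once every worker has committed its 'Completed' updates.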

Related

Is there any better way to check if the same data is present in a table in .NET Core 3.1?

I'm pulling data from a third-party API. The API runs multiple times a day, so if the same data is present in the table it should ignore that record; if there are any changes it should update that record, or insert a new record if anything new shows up in the JSON received.
I'm using the below code for inserting any new data.
var input = JsonConvert.DeserializeObject<List<DeserializeLookup>>(resultJson).ToList();
var entryset = input.Select(y => new Lookup
{
    lookupType = "JOBCODE",
    code = y.Code,
    description = y.Description,
    isNew = true,
    lastUpdatedDate = DateTime.UtcNow
}).ToList();
await _context.Lookup.AddRangeAsync(entryset);
await _context.SaveChangesAsync();
But after the first run, when the API runs again, it inserts the same data into the table again. As a result, duplicate entries get into the table. To handle this, I used a foreach loop as below before inserting data into the table.
foreach (var item in input)
{
    if (!_context.Lookup.Any(r => r.code == item.Code))
    {
        // above insert code
    }
}
But this doesn't work as expected. Also, the API takes a lot of time to run with the foreach loop. Is there a solution to this in .NET Core 3.1?
// Load the existing codes once, so the loop doesn't query the database per item
var existingCodes = _context.Lookup.Select(r => r.code).ToHashSet();

var newList = new List<Lookup>();
foreach (var item in input)
{
    if (!existingCodes.Contains(item.Code))
    {
        newList.Add(new Lookup
        {
            lookupType = "JOBCODE",
            code = item.Code,
            description = item.Description,
            isNew = true,
            lastUpdatedDate = DateTime.UtcNow
        });
    }
}
await _context.Lookup.AddRangeAsync(newList);
await _context.SaveChangesAsync();
It will be better if you try it this way: collect the new rows first and insert them in one go, instead of querying and saving inside the loop.
I'm on my phone, so forgive me for not being able to format the code in my response. The solution to your problem is something I actually just encountered myself while syncing data from an Azure Function and a third-party app into a SQL database.
Depending on your table schema, you need one column with a unique identifier. Make this column a primary key (the first step to preventing duplicates). Here's a resource for that: https://www.w3schools.com/sql/sql_primarykey.ASP
The next step to take care of is your stored procedure. You'll need to perform what's commonly referred to as an UPSERT. To do this you merge a table with the incoming data, matched on a specified column (whichever is your primary key).
That would look something like this:
MERGE Table_1 AS T1
USING Incoming_Data AS source
ON T1.column1 = source.column1
-- you can use an AND / OR operator in here for matching on additional values or combinations
WHEN MATCHED THEN
    UPDATE SET T1.column2 = source.column2
    -- etc. for more columns
WHEN NOT MATCHED THEN
    INSERT (column1, column2, column3)
    VALUES (source.column1, source.column2, source.column3);
First of all, you should decouple the format in which you receive your data from your actual data handling. In your case: get rid of the JSON before you actually interpret the data.
Alas, I haven't got a clue what your data represents, so let's assume your data is a sequence of customer Orders. When you get new data, you want to add all new Orders, and you want to update changed Orders.
So somewhere you have a method with input your json data, and as output a sequence of Orders:
IEnumerable<Order> InterpretJsonData(string jsonData)
{
...
}
You know JSON better than I do; besides, this conversion is a bit beside your question.
You wrote:
So, if the same data is present in the table it should ignore that record, else if there are any changes it should update that record or insert a new record
You need an Equality Comparer
To detect whether there are added or changed customer Orders, you need something to detect whether Order A equals Order B. There must be at least one unique field by which you can identify an Order, even if all other values of the Order are changed.
This unique value is usually called the primary key, or the Id. I assume your Orders have an Id.
So if your new Order data contains an Id that was not available before, then you are certain that the Order was Added.
If your new Order data has an Id that was already in previously processed Orders, then you have to check the other values to detect whether it was changed.
For this you need equality comparers: one that says that two Orders are equal if they have the same Id, and one that checks all values for equality.
A standard pattern is to derive your comparer from class EqualityComparer<Order>
class OrderComparer : EqualityComparer<Order>
{
    public static IEqualityComparer<Order> ByValue = new OrderComparer();

    ... // TODO implement
}
First I'll show you how to use this to detect additions and changes; then I'll show you how to implement it.
Somewhere you have access to the already processed Orders:
IEnumerable<Order> GetProcessedOrders() {...}
var jsondata = FetchNewJsonOrderData();
// convert the jsonData into a sequence of Orders
IEnumerable<Order> orders = this.InterpretJsonData(jsondata);
To detect which Orders are added or changed, you could make a Dictionary of the already processed Orders and check the incoming Orders one by one to see whether they have changed:
IEqualityComparer<Order> comparer = OrderComparer.ByValue;
Dictionary<int, Order> processedOrders = this.GetProcessedOrders()
    .ToDictionary(order => order.Id);

foreach (Order order in orders)
{
    if (processedOrders.TryGetValue(order.Id, out Order originalOrder))
    {
        // order already existed. Is it changed?
        if (!comparer.Equals(order, originalOrder))
        {
            // unequal!
            this.ProcessChangedOrder(order);
            // remember the changed values of this Order
            processedOrders[order.Id] = order;
        }
        // else: no changes, nothing to do
    }
    else
    {
        // Added!
        this.ProcessAddedOrder(order);
        processedOrders.Add(order.Id, order);
    }
}
Immediately after processing the changed / added Order, I remember the new value, because the same Order might be changed again.
If you want this in a LINQ fashion, you have to GroupJoin the Orders with the processed Orders, to get "Orders with their zero or more previously processed Orders" (there will probably be zero or one previously processed Order).
var ordersWithPreviouslyProcessedOrder = orders.GroupJoin(this.GetProcessedOrders(),
    order => order.Id,                   // from every Order take the Id
    processedOrder => processedOrder.Id, // from every previously processed Order take the Id
    // parameter resultSelector: from every Order, with its zero or more previously
    // processed Orders, make one new:
    (order, previouslyProcessedOrders) => new
    {
        Order = order,
        ProcessedOrder = previouslyProcessedOrders.FirstOrDefault(),
    })
    .ToList();
I use GroupJoin instead of Join, because this way I also get the "Orders that have no previously processed Orders" (= new Orders). If you used a simple Join, you would not get them.
I do a ToList, so that in the next statements the GroupJoin is not done twice:
var addedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => orderCombi.ProcessedOrder == null);
var changedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => orderCombi.ProcessedOrder != null
                      && !comparer.Equals(orderCombi.Order, orderCombi.ProcessedOrder));
Implementation of "Compare by Value"
// equal if all values equal
public override bool Equals(Order x, Order y)
{
    if (x == null) return y == null; // true if both null, false if x null but y not null
    if (y == null) return false;     // because x is not null
    if (Object.ReferenceEquals(x, y)) return true;
    if (x.GetType() != y.GetType()) return false;

    // compare all properties one by one:
    return x.Id == y.Id
        && x.Date == y.Date
        && ...;
}
For GetHashCode there is one rule: if X equals Y, then they must have the same hash code. If they are not equal, there is no rule, but lookups are more efficient if unequal items have different hash codes. Make a tradeoff between calculation speed and hash code uniqueness.
In this case: if two Orders are equal, then I am certain that they have the same Id. For speed I don't check the other properties.
public override int GetHashCode(Order x)
{
    if (x == null)
        return 0x34339d98; // just a hash code for all null Orders
    else
        return x.Id.GetHashCode();
}

KStream-KTable join writing to the KTable: how to sync the join with the KTable write?

I'm having an issue with how the following topology behaves:
String topic = config.topic();
KTable<UUID, MyData> myTable = topology.builder().table(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);

// Receive a stream of various events
topology.eventsStream()
    // Only process events that implement MyEvent
    .filter((k, v) -> v instanceof MyEvent)
    // Cast to ease the code
    .mapValues(v -> (MyEvent) v)
    // rekey by data id
    .selectKey((k, v) -> v.data.id)
    .peek((k, v) -> L.info("Event:" + v.action))
    // join the event with the corresponding entry in the KTable and apply the state mutation
    .leftJoin(myTable, eventHandler::handleEvent, UUIDSerdes.get(), EventSerdes.get())
    .peek((k, v) -> L.info("Updated:" + v.id + "-" + v.id2))
    // write the updated state to the KTable
    .to(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);
My issue happens when I receive different events at the same time. My state mutation is done by the leftJoin and then written by the to method, so the following can occur if event 1 and event 2 are received at the same time with the same key:
event1 joins with state A => state A mutated to state X
event2 joins with state A => state A mutated to state Y
state X written to the KTable topic
state Y written to the KTable topic
Because of that, state Y doesn't have the changes from event1, so I lost data.
Here's what I see in the logs (the Processing:... part is logged from inside the value joiner):
Event:Event1
Event:Event2
Processing:Event1, State:none
Updated:1-null
Processing:Event2, State:none
java.lang.IllegalStateException: Event2 event received but we don't have data for id 1
Event1 can be considered the creation event: it creates the entry in the KTable, so it doesn't matter if the state is empty. Event2, though, needs to apply its changes to an existing state, but it doesn't find any, because the first state mutation still hasn't been written to the KTable (it still hasn't been processed by the to method).
Is there any way to make sure that my leftJoin and my writes into the KTable are done atomically?
Thanks
Update & current solution
Thanks to the response from @Matthias I was able to find a solution using a Transformer.
Here's what the code looks like.
This is the transformer:
public class KStreamStateLeftJoin<K, V1, V2> implements Transformer<K, V1, KeyValue<K, V2>> {

    private final String stateName;
    private final ValueJoiner<V1, V2, V2> joiner;
    private final boolean updateState;
    private KeyValueStore<K, V2> state;

    public KStreamStateLeftJoin(String stateName, ValueJoiner<V1, V2, V2> joiner, boolean updateState) {
        this.stateName = stateName;
        this.joiner = joiner;
        this.updateState = updateState;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.state = (KeyValueStore<K, V2>) context.getStateStore(stateName);
    }

    @Override
    public KeyValue<K, V2> transform(K key, V1 value) {
        V2 stateValue = this.state.get(key);               // get current state
        V2 updatedValue = joiner.apply(value, stateValue); // apply join
        if (updateState) {
            this.state.put(key, updatedValue);             // write new state
        }
        return new KeyValue<>(key, updatedValue);
    }

    @Override
    public KeyValue<K, V2> punctuate(long timestamp) {
        return null;
    }

    @Override
    public void close() {}
}
And here's the adapted topology:
String topic = config.topic();
String store = topic + "-store";
KTable<UUID, MyData> myTable = topology.builder().table(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic, store);

// Receive a stream of various events
topology.eventsStream()
    // Only process events that implement MyEvent
    .filter((k, v) -> v instanceof MyEvent)
    // Cast to ease the code
    .mapValues(v -> (MyEvent) v)
    // rekey by data id
    .selectKey((k, v) -> v.data.id)
    // join the event with the corresponding entry in the KTable and apply the state mutation
    .transform(() -> new KStreamStateLeftJoin<UUID, MyEvent, MyData>(store, eventHandler::handleEvent, true), store)
    // write the updated state to the KTable
    .to(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);
As we're using the KTable's KV StateStore and applying changes directly to it through the put method, events should always pick up the updated state.
One thing I'm still wondering: what if I have a continuous high throughput of events?
Could there still be a race condition between the puts we do on the KTable's KV store and the writes that are done to the KTable's topic?
A KTable is sharded into multiple physical stores, and each store is only updated by a single thread. Thus, the scenario you describe cannot happen. If you have 2 records with the same timestamp that both update the same shard, they will be processed one after the other (in offset order). Thus, the second update will see the state after the first update.
So maybe you just didn't describe your scenario correctly?
Update
You cannot mutate the state when doing a join. Thus, the expectation that
event1 joins with state A => state A mutated to state X
is wrong. Independent of any processing order, when event1 joins with state A, it accesses state A in read-only mode, and state A will not be modified.
Thus, when event2 joins, it will see the same state as event1. For a stream-table join, the table state is only updated when new data is read from the table input topic.
If you want to have a shared state that is updated from both inputs, you would need to build a custom solution using transform():
builder.addStore(..., "store-name");
builder.stream("table-topic").transform(..., "store-name"); // will not emit anything downstream
KStream result = builder.stream("stream-topic").transform(..., "store-name");
This will create one store that is shared by both processors and both can read/write as they wish. Thus, for the table-input you can just update the state without sending anything downstream, while for the stream-input you can do the join, update the state, and send a result downstream.
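To make that sketch a bit more concrete, here is a hedged illustration of the table-topic side: a Transformer that only absorbs records into the shared store and forwards nothing downstream. It reuses the old-style Transformer API from the question's code; the class name StateWriter and its wiring are assumptions, not code from the answer.

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

// Table-side transformer: writes each record from the table topic into the
// shared store and emits nothing downstream (returns null).
public class StateWriter<K, V> implements Transformer<K, V, KeyValue<K, V>> {

    private final String stateName;
    private KeyValueStore<K, V> state;

    public StateWriter(String stateName) {
        this.stateName = stateName;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.state = (KeyValueStore<K, V>) context.getStateStore(stateName);
    }

    @Override
    public KeyValue<K, V> transform(K key, V value) {
        this.state.put(key, value); // update the shared state
        return null;                // forward nothing downstream
    }

    @Override
    public KeyValue<K, V> punctuate(long timestamp) {
        return null; // no scheduled work
    }

    @Override
    public void close() {}
}

Wired up as in the sketch, the table-input side becomes: builder.stream("table-topic").transform(() -> new StateWriter<>("store-name"), "store-name");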
Update 2
With regard to the solution, there will be no race condition between the updates the Transformer applies to the state and records the Transformer processes after the state update. This part will be executed in a single thread, and records will be processed in offset-order from the input topic. Thus, it's ensured that a state update will be available to later records.

Can somebody tell me what error I have in this BPM?

This code auto-generates a new part number. It is a post-processing BPM for the GetNewPart BO.
int iPartnum = 0;
string cPartid = string.Empty;
Erp.Tables.Company Company;
foreach (var ttpart_xRow in ttPart)
{
    var ttpartRow = ttpart_xRow;
    Company = (from Company_Row in Db.Company
               where Company_Row.Company == Session.CompanyID
               select Company_Row).FirstOrDefault();
    iPartnum = (decimal)Company["AutoGenerate_c"] + 1;
    cPartid = System.Convert.ToString(iPartnum);
    ttpartRow.PartNum = cPartid;
    Services.Lib.UpdateTableBuffer._UpdateTableBuffer(Company, "AutoGenerate_c", iPartnum);
}
Is it just not working, or is there an error message?
Services.Lib.UpdateTableBuffer._UpdateTableBuffer(Company,"AutoGenerate_c", iPartnum);
I have personally never used or even seen this Lib item, so I can't vouch for it. I would update the object manually inside of a transaction scope, because I doubt GetNewPart ever touches that database and therefore probably doesn't create a transaction.
using (System.Transactions.TransactionScope txScope = IceDataContext.CreateDefaultTransactionScope()) // start the transaction
{
    // Your logic goes here
    Db.Validate();
    txScope.Complete(); // commit the transaction
}
As a side note, I try to keep these sorts of things off of the Company record, because nearly every process in the system touches it and I don't want a process to lock it up or cause weird race conditions. I generally like to reserve a record that will only get touched for this specific purpose, so I have a UDCodeType/UDCode for this sort of thing.

Spring JdbcTemplate: is update atomic?

Is the following JdbcTemplate update thread-safe? What it does, basically, is:
balance -= amount;
Here is the code:
String sql = "update player.playerbalance b set b.balance = (b.balance - ?) where b.id = ? and b.balance >= ?";
jdbcTemplate = new JdbcTemplate(dataSource);
int i = jdbcTemplate.update(
    sql,
    new Object[] { wager, playerBalance.getId(), wager });
What happens if two updates of this kind happen at the same time?
Thanks,
It has nothing to do with thread-safety; the call itself is thread-safe.
The DBMS is smart enough to make sure that one update finishes before another update of the same record comes in (unless you have set a very low isolation level). Therefore, if two threads (or processes, etc.) invoke that same method at the same time with the same balance id, the record will be deducted twice, one update after the other.
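To see why, note that the read-modify-write happens inside the single UPDATE statement, and the "b.balance >= ?" predicate is re-evaluated under the row lock. A small hedged sketch of how a caller can use the return value (the class and method names here are illustrative, not from the question):

import java.math.BigDecimal;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

public class BalanceService {

    private final JdbcTemplate jdbcTemplate;

    public BalanceService(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    /**
     * Atomically deducts the wager. The read-modify-write happens inside the
     * single UPDATE, and "b.balance >= ?" guards against overdraft, so two
     * concurrent calls cannot both succeed unless the balance covers both.
     */
    public boolean tryDeduct(long playerBalanceId, BigDecimal wager) {
        int rowsUpdated = jdbcTemplate.update(
            "update player.playerbalance b set b.balance = (b.balance - ?) "
          + "where b.id = ? and b.balance >= ?",
            wager, playerBalanceId, wager);
        return rowsUpdated == 1; // 0 means insufficient balance (or unknown id)
    }
}

If two such calls race, the database serializes them on the row lock: the second UPDATE re-reads the balance the first one wrote, so it simply matches zero rows when the funds run out.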

Best practice: checking for duplicate rows before inserting a list of items

I have an array of objects that I want to insert into the database.
My method call looks like this.
public void Add(CardElement[] cardElements)
{
    foreach (var cardElement in cardElements)
    {
        Data.Entry(cardElement).State = System.Data.EntityState.Added;
    }
    Data.SaveChanges();
}
The database table resembles this
MS SQL = Table mytable Columns a,b,c,d,e,f
Unique Constraint a,b,c
The data I want to insert resembles this.
var obj = new[] {
    new MyObject() { a = 1, b = 1, c = 1 },
    new MyObject() { a = 1, b = 1, c = 2 },
    new MyObject() { a = 1, b = 1, c = 3 }
};
So, I want to check the database for these three rows before I add them to the database.
I could do something like the following, but I assume it would cause some extra trips to the database.
private bool CheckExists(CardElement[] cardElements)
{
    foreach (var cardElement in cardElements)
    {
        var exists = (from ce in Data.CardElements
                      where ce.CardId == cardElement.CardId
                      where ce.Area == cardElement.Area
                      where ce.ElementName == cardElement.ElementName
                      select ce).Any();
        if (exists) return true;
    }
    return false;
}
So, how could I handle this more gracefully?
Is it even worth trying to accomplish this using LINQ?
Should I write some stored procedures for performance?
I agree that you should let the DB make the decision.
Please have a look at using UPSERT, as described in this post.
Why not just attempt the insert and let the database tell you if any unique constraint violations have occurred (using try/catch)?
The problem is that even if you query the data first, somebody else can insert the record between your query and your saving of changes. You will still have to handle the exception for violating the unique constraint despite your additional queries, and yes, every check makes an additional trip to the database.
If your main concern is performance, use a stored procedure, where you can additionally use a table hint to lock the table against inserts during the initial check for existence.
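The "just attempt the insert" suggestion is stack-agnostic; as a hedged illustration, here is that pattern in plain JDBC (Java rather than the question's C#; the table and column names are taken from the schema sketch above, and the DAO class is an assumption):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class MyTableDao {

    /**
     * Attempts the insert and lets the unique constraint on (a, b, c) decide.
     * Returns true if the row was inserted, false if it already existed.
     */
    public boolean tryInsert(Connection con, int a, int b, int c) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "insert into mytable (a, b, c) values (?, ?, ?)")) {
            ps.setInt(1, a);
            ps.setInt(2, b);
            ps.setInt(3, c);
            ps.executeUpdate();
            return true;
        } catch (SQLException e) {
            // SQLState class 23 = integrity constraint violation (duplicate row)
            if (e.getSQLState() != null && e.getSQLState().startsWith("23")) {
                return false; // row already existed: ignore and move on
            }
            throw e; // anything else is a real error
        }
    }
}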
