Prevent Room Observable Query from Triggering Multiple Times When Multiple Tables Change - android-room

I use observable queries in Android Room to trigger updates that eventually change the UI when the underlying data changes.
Sometimes these queries involve multiple tables, and sometimes a user performs an action that inserts new values into these tables. The inserts usually happen in quick succession (i.e., all within about half a second).
Currently, an observer subscribed to updates from Room will be triggered X times, where X is the number of tables in the query that were updated.
So if values are inserted into six different tables in the query all within one second of each other, any observer that is observing the query will fire six times.
Is there a way to tell Room: "Only trigger observers at most once per second"? Or, better yet, "When you see an update, wait half a second to collect any other updates, and then notify observers of everything at once"?
Here is the Room query; as you can see, multiple tables are involved.
@Query("SELECT DISTINCT order_material.* FROM order_material" +
        " JOIN orders ON order_material.orderId = orders.id" +
        " JOIN assignment ON order_material.id = assignment.orderMaterialID" +
        " JOIN box_inside ON assignment.boxInsideID = box_inside.id" +
        " WHERE orders.isActive = 1" +
        " AND (box_inside.dateCFOrigPacked <= :dateLimitYema OR box_inside.stemLength IS NOT NULL)" +
        " AND assignment.id IN (SELECT id FROM assignment WHERE active = 1" +
        " AND quantity > 0 AND assignment.id NOT IN" +
        " (SELECT assignmentID FROM" +
        " (SELECT assignmentID, assignment.quantity AS sum_assign, SUM(preparation.quantityPrepared) AS sum_prep" +
        " FROM preparation" +
        " JOIN assignment ON preparation.assignmentID = assignment.id" +
        " GROUP BY preparation.assignmentID" +
        " HAVING sum_assign <= sum_prep)))" +
        " ORDER BY order_material.deliveryDate ASC")
Observable<OrderMaterial[]> getOrderMaterialsReadyToPrepareAsync(Date dateLimitYema);
Here is a repository method observing the query. It gets triggered X times, where X is the number of tables that update within a very short timespan.
public Observable<List<OrderMaterial>> findOrderMaterialsReadyToPrepareAsync() {
    int daysInOneWeek = 7;
    Calendar calendar = Calendar.getInstance();
    calendar.add(Calendar.DAY_OF_YEAR, -1 * daysInOneWeek);
    Date limitOneWeekYemas = calendar.getTime();
    return Observable.create(subscribe ->
            assignmentDao.getOrderMaterialsReadyToPrepareAsync(limitOneWeekYemas)
                    .observeOn(AndroidSchedulers.mainThread()).subscribeOn(Schedulers.io())
                    .subscribe(new Observer<OrderMaterial[]>() {
                        @Override
                        public void onSubscribe(Disposable d) {
                            Timber.v("findOrderMaterialsReadyToPrepareAsync on subscribe");
                            subscribe.setDisposable(d);
                            preparationDisposable.add(d);
                        }

                        @Override
                        public void onNext(OrderMaterial[] materials) {
                            // Timber.v("findOrderMaterialsReadyToPrepareAsync onNext with %s", Arrays.toString(materials));
                            subscribe.onNext(Arrays.asList(materials));
                        }

                        @Override
                        public void onError(Throwable e) {
                            Timber.e(e, "findOrderMaterialsReadyToPrepareAsync onError");
                            subscribe.onError(e);
                        }

                        @Override
                        public void onComplete() {
                            Timber.e("findOrderMaterialsReadyToPrepareAsync onComplete");
                        }
                    })
    );
}

Is there a way to tell Room: "Only trigger observers at most once per second"? Or, better yet, "When you see an update, wait half a second to collect any other updates, and then notify observers of everything at once"?
As far as I know, there is no such way to communicate with Room. Under the hood, Room's observable queries (with LiveData, RxJava, or Flow) rely on the database's InvalidationTracker (it behaves like a simple any-change trigger):
Adds the given observer to the observers list and it will be notified if any table it observes changes.
More than that, even if you observe (with one of the Observables) a query with a condition like select * from users where id = :id, you'll be notified on every change to the table, even if no rows with this specific id changed.
You can try adding @Transaction to your query (to get just one trigger), but I suspect it will not help.
So I don't think you should expect too much intelligence from Room in this respect. RxJava, if you use it, is your first candidate for that kind of smart "conversation" (the debounce operator may be what you want).
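For example, a minimal sketch of that approach applied to the repository method above (the 500 ms window is an assumption to tune, not a recommendation from Room):
// Hedged sketch: debounce waits until the source has been quiet for the
// given window, then emits only the latest value, so a burst of table
// invalidations produces a single downstream emission.
assignmentDao.getOrderMaterialsReadyToPrepareAsync(limitOneWeekYemas)
        .debounce(500, TimeUnit.MILLISECONDS)
        .subscribeOn(Schedulers.io())
        .observeOn(AndroidSchedulers.mainThread())
        .map(Arrays::asList)
        .subscribe(/* same Observer as in the repository method above */);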

Related

Oracle Concurrency Problem with competing race condition

I am facing a concurrency problem with Oracle DB. Say I have 3 Objects of type A that need to be processed, and only after the final Object A is processed can I move on to processing Objects of type B. Additionally, the Objects of type A are processed in parallel by multiple deployed instances.
Example:
Desired Behavior:
ObjectA-1 - Update status -> IsLastObjectA -> false
ObjectA-2 - Update status -> IsLastObjectA -> false
ObjectA-3 - Update status -> IsLastObjectA -> true -> Begin processing Objects of type B
Current (failing) behavior:
ObjectA-1 - Update status -> IsLastObjectA -> false
ObjectA-2 - Update status (happens in parallel with ObjectA-3) -> IsLastObjectA (at this point all Object As are in complete status) -> true -> Begin processing Objects of type B (This should only occur once)
ObjectA-3 - Update status (happens in parallel with ObjectA-2) -> IsLastObjectA (at this point all Object As are in complete status)-> true -> Begin processing Objects of type B (This should only occur once)
Ideally I want the transactions to happen in a serialized way (similar to the Serializable isolation level). But not only does this hurt performance, I also don't have permission to increase the INITRANS parameter to the recommended 3. Also, SELECT FOR UPDATE and locking techniques of that nature can't be used, because we only update the status once and no two Object As ever touch the same row: each is processed by its unique primary key, so one Object A never updates another. The only interaction is reading the statuses of all other Object As after the current one's status has been updated.
I have tried the different propagation types that Oracle allows as well as a locking technique, and nothing has worked. Serializable seems to be the best option, but I don't have the permissions to implement it.
The code snippet below is a mocked version of the actual code. The endpoint in the controller gets called from a microservice that listens to a messaging queue and consumes messages off it (that service is not shown).
@Data
public class ObjectA {
    private int status;
    private Long id;
}
@Service
// Let's assume there is a listener before this call that picks a message off a queue,
// maps the JSON to an ObjectA,
// and then calls this method
public boolean processObjectA(final ObjectA objectA) {
    final boolean isLastUpdate;
    isLastUpdate = service.updateObjectAndIsLastObjectToProcess(objectA); // for simplicity, let's assume this calls the method in the controller
    if (isLastUpdate) {
        // Call DB, gather all info related to ObjectBs, and begin to process
    }
    return isLastUpdate;
}
public class Controller {
    @Autowired
    private ObjectService objectService;

    @PutMapping("/updatestatus/islastobject")
    public boolean isLastObjectToUpdate(
            @RequestParam(name = "id") final Long id,
            @RequestParam(name = "status") final int statusCode) {
        final boolean updateStatus;
        final boolean hasLastObjectBeenProcessed;
        try {
            // Update object to complete status
            updateStatus = objectService.updateObject(id, statusCode);
            if (updateStatus) {
                // Verify whether all ObjectAs are in complete status
                hasLastObjectBeenProcessed = objectService.hasLastObjectBeenProcessed(id);
                return hasLastObjectBeenProcessed;
            } else {
                throw new RuntimeException();
            }
        } catch (RuntimeException e) {
            return false;
        }
    }
}
Oracle queries used:
// Update ObjectA to complete status
updateStatus query = UPDATE Object_A o SET o.status = 9 WHERE o.id = id
// Verifies whether all ObjectAs are in complete (9) status
hasLastObjectBeenProcessed query = SELECT o.id FROM Object_A o WHERE o.status = 9
Assuming two possible statuses on each row ("Active": this row needs to be processed; "Completed": this row is done), how about a model like the one below for your "worker" threads (pseudo-code):
work_to_be_done =
    select count(*)
    from table
    where status = 'Active'
    and objtype = 'A'
    and rownum = 1;

if work_to_be_done = 0
    <move onto objtype = B>
else
    open cursor for
        select *
        from table
        where status = 'Active'
        and objtype = 'A'
        for update skip locked;
    for each row in cursor
        process row
        update status to 'Completed';
end;
First we see if there is any work to be done for "A". If this returns zero, then everyone has completed and committed their work and we're good to move on to "B".
If it returns non-zero, then you have active work to do, but some of it might already be being worked on by other threads. So we use SKIP LOCKED to find rows that we can work on. That might yield zero rows, but that's OK; we just loop around (maybe sleep a little) and then start from the top. Eventually either we will find work to do, or we will find that all the work has been done and we can move on.
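A hedged JDBC sketch of that loop, in case it helps; the table name, column names, and processRow are placeholders taken from the pseudo-code above, not a known schema:
try (Connection conn = dataSource.getConnection()) {
    conn.setAutoCommit(false); // row locks must be held until we commit
    try (PreparedStatement select = conn.prepareStatement(
                 "SELECT id FROM obj_table WHERE status = 'Active' AND objtype = 'A'" +
                 " FOR UPDATE SKIP LOCKED");
         ResultSet rs = select.executeQuery();
         PreparedStatement complete = conn.prepareStatement(
                 "UPDATE obj_table SET status = 'Completed' WHERE id = ?")) {
        while (rs.next()) {
            long id = rs.getLong("id");
            processRow(id); // hypothetical per-row processing
            complete.setLong(1, id);
            complete.executeUpdate();
        }
    }
    conn.commit(); // releases the row locks and publishes the status changes
}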

Is there any better way to check if the same data is present in a table in .Net core 3.1?

I'm pulling data from a third-party API. The API runs multiple times a day. If the same data is already present in the table, the record should be ignored; if there are changes, the record should be updated; and anything new in the received JSON should be inserted as a new record.
I'm using the code below for inserting any new data.
var input = JsonConvert.DeserializeObject<List<DeserializeLookup>>(resultJson).ToList();
var entryset = input.Select(y => new Lookup
{
    lookupType = "JOBCODE",
    code = y.Code,
    description = y.Description,
    isNew = true,
    lastUpdatedDate = DateTime.UtcNow
}).ToList();
await _context.Lookup.AddRangeAsync(entryset);
await _context.SaveChangesAsync();
But after the first run, when the API runs again, it inserts the same data into the table again, so duplicate entries end up in the table. To handle this, I used a foreach loop, as below, before inserting data into the table.
foreach (var item in input)
{
    if (!_context.Lookup.Any(r => r.code == item.Code))
    {
        // above insert code
    }
}
But this doesn't work as expected. Also, the API takes a lot of time to run when I use the foreach loop. Is there a solution to this in .NET Core 3.1?
var newList = new List<Lookup>();
foreach (var item in input)
{
    if (!_context.Lookup.Any(r => r.code == item.Code))
    {
        newList.Add(new Lookup
        {
            lookupType = "JOBCODE",
            code = item.Code,
            description = item.Description,
            isNew = true,
            lastUpdatedDate = DateTime.UtcNow
        });
    }
}
await _context.Lookup.AddRangeAsync(newList);
await _context.SaveChangesAsync();
It will be better if you try it this way: collect the new items first, then insert them all in a single batch.
I'm on my phone, so forgive me for not being able to format the code in my response. The solution to your problem is something I actually just encountered myself while syncing data from an Azure Function and a third-party app into a SQL database.
Depending on your table schema, you need one column with a unique identifier. Make this column a primary key (the first step to preventing duplicates). Here's a resource for that: https://www.w3schools.com/sql/sql_primarykey.ASP
The next step is your stored procedure. You'll need to perform what's commonly referred to as an UPSERT: MERGE the incoming data into your table on a specified column (whichever is your primary key).
That would look something like this:
MERGE Table_1 AS T1
USING Incoming_Data AS source
ON T1.column1 = source.column1
-- you can use an AND / OR operator here to match on additional values or combinations
WHEN MATCHED THEN
    UPDATE SET T1.column2 = source.column2
    -- etc. for more columns
WHEN NOT MATCHED THEN
    INSERT (column1, column2, column3) VALUES (source.column1, source.column2, source.column3);
First of all, you should decouple the format in which you receive your data from your actual data handling. In your case: get rid of the JSON before you actually interpret the data.
Alas, I haven't got a clue what your data represents, so let's assume it is a sequence of customer Orders. When you get new data, you want to add all new Orders and update all changed Orders.
So somewhere you have a method that takes your JSON data as input and produces a sequence of Orders as output:
IEnumerable<Order> InterpretJsonData(string jsonData)
{
...
}
You know JSON better than I do; besides, this conversion is a bit beside your question.
You wrote:
So, if the same data is present in the table it should ignore that record, else if there are any changes it should update that record or insert a new record
You need an Equality Comparer
To detect whether there are added or changed customer Orders, you need something to detect whether Order A equals Order B. There must be at least one unique field by which you can identify an Order, even if all the other values of the Order have changed.
This unique value is usually called the primary key, or the Id. I assume your Orders have an Id.
So if your new Order data contains an Id that was not available before, then you are certain that the Order was Added.
If your new Order data has an Id that was already in previously processed Orders, then you have to check the other values to detect whether it was changed.
For this you need equality comparers: one that says two Orders are equal if they have the same Id, and one that checks all values for equality.
A standard pattern is to derive your comparer from the class EqualityComparer<Order>:
class OrderComparer : EqualityComparer<Order>
{
    public static IEqualityComparer<Order> ByValue = new OrderComparer();
    ... // TODO implement
}
First I'll show you how to use this to detect additions and changes; then I'll show you how to implement it.
Somewhere you have access to the already processed Orders:
IEnumerable<Order> GetProcessedOrders() {...}
var jsondata = FetchNewJsonOrderData();
// convert the jsonData into a sequence of Orders
IEnumerable<Order> orders = this.InterpretJsonData(jsondata);
To detect which Orders are added or changed, you could make a Dictionary of the already processed Orders and check the incoming Orders one by one to see whether they have changed:
IEqualityComparer<Order> comparer = OrderComparer.ByValue;
Dictionary<int, Order> processedOrders = this.GetProcessedOrders()
    .ToDictionary(order => order.Id);

foreach (Order order in orders)
{
    if (processedOrders.TryGetValue(order.Id, out Order originalOrder))
    {
        // order already existed. Is it changed?
        if (!comparer.Equals(order, originalOrder))
        {
            // unequal!
            this.ProcessChangedOrder(order);
            // remember the changed values of this Order
            processedOrders[order.Id] = order;
        }
        // else: no changes, nothing to do
    }
    else
    {
        // Added!
        this.ProcessAddedOrder(order);
        processedOrders.Add(order.Id, order);
    }
}
Immediately after processing a changed or added Order, I remember the new value, because the same Order might be changed again later in the sequence.
If you want this in a LINQ fashion, you have to GroupJoin the Orders with the processed Orders to get "Orders with their zero or more previously processed Orders" (there will probably be zero or one previously processed Orders).
var ordersWithPreviouslyProcessedOrder = orders.GroupJoin(this.GetProcessedOrders(),
    order => order.Id,                   // from every Order take the Id
    processedOrder => processedOrder.Id, // from every previously processed Order take the Id
    // parameter resultSelector: from every Order, with its zero or more previously
    // processed Orders, make one new element:
    (order, previouslyProcessedOrders) => new
    {
        Order = order,
        ProcessedOrder = previouslyProcessedOrders.FirstOrDefault(),
    })
    .ToList();
I use GroupJoin instead of Join because this way I also get the "Orders that have no previously processed Orders" (= new Orders). With a simple Join, you would not get those.
I do a ToList so that the group join is not executed twice by the next two statements:
var addedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => orderCombi.ProcessedOrder == null);
var changedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => !comparer.Equals(orderCombi.Order, orderCombi.ProcessedOrder));
Implementation of "Compare by Value"
// equal if all values are equal
public override bool Equals(Order x, Order y)
{
    if (x == null) return y == null;  // true if both null, false if x null but y not null
    if (y == null) return false;      // because x is not null
    if (Object.ReferenceEquals(x, y)) return true;
    if (x.GetType() != y.GetType()) return false;

    // compare all properties one by one:
    return x.Id == y.Id
        && x.Date == y.Date
        && ...;
}
For GetHashCode there is one rule: if X equals Y, then they must have the same hash code. If they are not equal, there is no rule, but lookups are more efficient if unequal objects have different hash codes. Make a trade-off between calculation speed and hash-code uniqueness.
In this case: if two Orders are equal, then I am certain they have the same Id. For speed, I don't check the other properties.
public override int GetHashCode(Order x)
{
    if (x == null)
        return 34339898;           // just a fixed hash code for all null Orders
    else
        return x.Id.GetHashCode();
}

KStream-KTable join writing to the KTable: How to sync the join with the ktable write?

I'm having some issues with how the following topology behaves:
String topic = config.topic();
KTable<UUID, MyData> myTable = topology.builder().table(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);
// Receive a stream of various events
topology.eventsStream()
    // Only process events that implement MyEvent
    .filter((k, v) -> v instanceof MyEvent)
    // Cast to simplify the code
    .mapValues(v -> (MyEvent) v)
    // Rekey by data id
    .selectKey((k, v) -> v.data.id)
    .peek((k, v) -> L.info("Event:" + v.action))
    // Join the event with the corresponding entry in the KTable and apply the state mutation
    .leftJoin(myTable, eventHandler::handleEvent, UUIDSerdes.get(), EventSerdes.get())
    .peek((k, v) -> L.info("Updated:" + v.id + "-" + v.id2))
    // Write the updated state to the KTable topic.
    .to(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);
My issue happens when I receive different events at the same time. My state mutation is done by the leftJoin and then written by the to method, so the following can occur if event 1 and event 2 are received at the same time with the same key:
event1 joins with state A => state A mutated to state X
event2 joins with state A => state A mutated to state Y
state X written to the KTable topic
state Y written to the KTable topic
Because of that, state Y doesn't include the changes from event 1, so I lost data.
Here's what I see in the logs (the Processing:... part is logged from inside the value joiner):
Event:Event1
Event:Event2
Processing:Event1, State:none
Updated:1-null
Processing:Event2, State:none
java.lang.IllegalStateException: Event2 event received but we don't have data for id 1
Event 1 can be considered the creation event: it creates the entry in the KTable, so it doesn't matter that the state is empty. Event 2, though, needs to apply its changes to an existing state, but it doesn't find any, because the first state mutation still hasn't been written to the KTable (it still hasn't been processed by the to method).
Is there any way to make sure that my leftJoin and my writes into the KTable are done atomically?
Thanks
Update & current solution
Thanks to the response from @Matthias I was able to find a solution using a Transformer.
Here's what the code looks like:
This is the transformer:
public class KStreamStateLeftJoin<K, V1, V2> implements Transformer<K, V1, KeyValue<K, V2>> {
    private final String stateName;
    private final ValueJoiner<V1, V2, V2> joiner;
    private final boolean updateState;
    private KeyValueStore<K, V2> state;

    public KStreamStateLeftJoin(String stateName, ValueJoiner<V1, V2, V2> joiner, boolean updateState) {
        this.stateName = stateName;
        this.joiner = joiner;
        this.updateState = updateState;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.state = (KeyValueStore<K, V2>) context.getStateStore(stateName);
    }

    @Override
    public KeyValue<K, V2> transform(K key, V1 value) {
        V2 stateValue = this.state.get(key);               // get current state
        V2 updatedValue = joiner.apply(value, stateValue); // apply join
        if (updateState) {
            this.state.put(key, updatedValue);             // write new state
        }
        return new KeyValue<>(key, updatedValue);
    }

    @Override
    public KeyValue<K, V2> punctuate(long timestamp) {
        return null;
    }

    @Override
    public void close() {}
}
And here's the adapted topology:
String topic = config.topic();
String store = topic + "-store";
KTable<UUID, MyData> myTable = topology.builder().table(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic, store);
// Receive a stream of various events
topology.eventsStream()
    // Only process events that implement MyEvent
    .filter((k, v) -> v instanceof MyEvent)
    // Cast to simplify the code
    .mapValues(v -> (MyEvent) v)
    // Rekey by data id
    .selectKey((k, v) -> v.data.id)
    // Join the event with the corresponding entry in the KTable and apply the state mutation
    .transform(() -> new KStreamStateLeftJoin<UUID, MyEvent, MyData>(store, eventHandler::handleEvent, true), store)
    // Write the updated state to the KTable topic.
    .to(UUIDSerdes.get(), GsonSerdes.get(MyData.class), topic);
As we're using the KTable's key-value StateStore and applying changes directly to it through the put method, later events should always pick up the updated state.
One thing I'm still wondering: what if I have a continuous, high throughput of events?
Could there still be a race condition between the puts we do on the KTable's KV store and the writes that are done to the KTable's topic?
A KTable is sharded into multiple physical stores, and each store is only updated by a single thread. Thus, the scenario you describe cannot happen. If you have two records with the same timestamp that both update the same shard, they will be processed one after the other (in offset order). Thus, the second update will see the state after the first update.
So maybe you just didn't describe your scenario correctly?
Update
You cannot mutate the state when doing a join. Thus, the expectation that
event1 joins with state A => state A mutated to state X
is wrong. Independent of any processing order, when event 1 joins with state A, it accesses state A in read-only mode, and state A is not modified.
Thus, when event 2 joins, it sees the same state as event 1. For a stream-table join, the table state is only updated when new data is read from the table input topic.
If you want to have a shared state that is updated from both inputs, you would need to build a custom solution using transform():
builder.addStore(..., "store-name");
builder.stream("table-topic").transform(..., "store-name"); // will not emit anything downstream
KStream result = builder.stream("stream-topic").transform(..., "store-name");
This will create one store that is shared by both processors, and both can read and write to it as they wish. Thus, for the table input you can just update the state without sending anything downstream, while for the stream input you can do the join, update the state, and send a result downstream.
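For reference, a minimal sketch of that wiring against the newer StreamsBuilder API (serdes and names are borrowed from the question; TableUpdater and StreamJoiner are hypothetical Transformer implementations you would write):
// One shared store, attached to both transformers by name.
StoreBuilder<KeyValueStore<UUID, MyData>> storeBuilder = Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("store-name"),
        UUIDSerdes.get(), GsonSerdes.get(MyData.class));
builder.addStateStore(storeBuilder);
// Table input: update the store, emit nothing downstream.
builder.<UUID, MyData>stream("table-topic").transform(TableUpdater::new, "store-name");
// Stream input: join against the store, update it, and emit a result.
KStream<UUID, MyData> result =
        builder.<UUID, MyEvent>stream("stream-topic").transform(StreamJoiner::new, "store-name");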
Update 2
With regard to the solution, there will be no race condition between the updates the Transformer applies to the state and records the Transformer processes after the state update. This part will be executed in a single thread, and records will be processed in offset-order from the input topic. Thus, it's ensured that a state update will be available to later records.

Spring JdbcTemplate: is update atomic?

Is the following jdbcTemplate update thread-safe? What it does, basically, is:
balance -= amount;
Here is the code:
String sql = "update player.playerbalance b set b.balance = (b.balance - ?) where b.id = ? and b.balance >= ?";
jdbcTemplate = new JdbcTemplate(dataSource);
int i = jdbcTemplate.update(
sql,
new Object[] {wager, playerBalance.getId(), wager});
What happens if two updates of this kind happen at the same time?
Thanks,
It has nothing to do with thread-safety; the call itself is thread-safe.
The DBMS is smart enough to make sure that one update finishes before another update of the same record comes in (unless you have set a very low isolation level). Therefore, if two threads (or processes, etc.) invoke that same method twice (using the same balance ID), the record will be deducted twice.
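Note that the statement already carries its own guard (b.balance >= ?), so the balance check and the decrement happen atomically within the single UPDATE. What the caller can do is inspect the update count that JdbcTemplate returns. A minimal sketch:
int rowsUpdated = jdbcTemplate.update(sql, wager, playerBalance.getId(), wager);
if (rowsUpdated == 0) {
    // No row matched: either the id was wrong or the balance was insufficient,
    // possibly because a concurrent deduction consumed it first.
    throw new IllegalStateException("Insufficient balance for wager");
}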

Homework: Algorithm to determine the two most popular items in a shopping cart

I'm trying to determine the two most popular items in a user's shopping cart.
Each time the user adds an item to or removes an item from the cart, a function updatePopularity(InventoryItem item) is called, which is passed a single argument referencing the object that was updated. Here is a snippet of the code:
private void updatePopularity(InventoryItem item)
{
    InventoryItem tempItem;
    if (mostPopular == null)
    {
        if (item.count > 0)
        {
            mostPopular = item;
            mostPopularLabel.setText(MOST_POPULAR + " " + item.name);
        }
    }
    else if (nextPopular == null)
    {
        if (mostPopular.name != item.name && item.count > 0)
        {
            nextPopular = item;
            nextPopularLabel.setText(NEXT_POPULAR + " " + item.name);
        }
    }
    else if (mostPopular.count < item.count)
    {
        tempItem = mostPopular;
        mostPopular = item;
        mostPopularLabel.setText(MOST_POPULAR + " " + item.name);
        nextPopular = tempItem;
        nextPopularLabel.setText(NEXT_POPULAR + " " + nextPopular.name);
    }
    else if (nextPopular.count < item.count)
    {
        nextPopular = item;
        nextPopularLabel.setText(NEXT_POPULAR + " " + nextPopular.name);
    }
    else if (mostPopular.count == 0)
    {
    }
}
I'm getting all fumbled up in the logic, though, because there are so many possible scenarios that can play out.
The end result should be:
At any given moment the two items with the highest quantities in the shopping cart should appear as the most popular.
If the second most popular ends up with a greater quantity, the two items should swap positions for most and next-most popular.
As items are removed from the cart, popularity should also be updated.
Finally, if there is only one popular item because of cart updates, I should reflect that.
Can anyone help me with how to go about logically mapping out the possibilities?
I'm currently taking a Discrete Mathematics class, but I'm not far enough along to attempt solving this on paper yet. I'm not asking for the code to be written out for me, just some guidance on how one works through the logic without a mess of if statements that gets super confusing.
The implementation I ended up using is not efficient, but it works...
Each time an item is added to or removed from the cart, the inventory is scanned to find the first two occurrences of the most and second-most popular items. Then those items are compared to the current holders of most and second-most popular. The tricky part was the logic: if an item took first place, the current first-place item needed to be pushed down to second; and if an item that was currently second got bumped to first, then the first item that had been tied for second needed to become second. I tried to map out the logic on paper, but I just don't have the skills or knowledge to do that yet; I end up confusing myself. :)
