Apache Ignite: Cache item lock usage - thread-safety

I need to modify cache items from different threads in the same JVM, so I need to be sure that all items are modified in order and safely. I thought it would be OK for each thread to create or acquire a lock on the cache key and release it after the work is finished, like this:
if (this.igniteCache.lock(k).tryLock()) {
    try {
        if (this.igniteCache.containsKey(k)) {
            List value = this.igniteCache.get(k);
            value.addAll(v);
            this.igniteCache.put(k, value);
        }
    } finally {
        this.igniteCache.lock(k).unlock();
    }
}
So my question is: is it wise to create that many lock items? Is there any significant cost on the memory or network side?
Or can you point me to another way of doing this?
Thx

This is a typical use case for a PESSIMISTIC/REPEATABLE_READ transaction. You can refer to CacheTransactionExample [1] included in Ignite (see the deposit() method).
Also see [2] for more information about transactional support in Ignite.
[1] https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/datagrid/CacheTransactionExample.java
[2] https://apacheignite.readme.io/docs/transactions
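As a rough in-JVM analogy of the read-modify-write that such a transaction protects (this is not Ignite code), the sketch below performs the same update under a per-key lock using only java.util.concurrent. The class and method names are hypothetical. It blocks with lock() rather than tryLock(): with tryLock(), as in the question, the update is silently skipped whenever another thread holds the lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a cache with per-key locks; not the Ignite API.
class PerKeyLockDemo {
    private final Map<String, List<Integer>> cache = new ConcurrentHashMap<>();
    private final Map<String, Lock> locks = new ConcurrentHashMap<>();

    PerKeyLockDemo(String key) {
        cache.put(key, new ArrayList<>());
    }

    // Appends values to the list stored under key, if the key exists.
    void append(String key, List<Integer> values) {
        Lock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock(); // block until the key is free instead of skipping the update
        try {
            List<Integer> current = cache.get(key);
            if (current != null) {
                List<Integer> updated = new ArrayList<>(current);
                updated.addAll(values);
                cache.put(key, updated);
            }
        } finally {
            lock.unlock(); // release the same Lock instance that was acquired
        }
    }

    int size(String key) {
        List<Integer> value = cache.get(key);
        return value == null ? 0 : value.size();
    }

    // Runs `tasks` concurrent single-element appends and returns the final list size.
    static int runConcurrentAppends(int tasks) {
        PerKeyLockDemo demo = new PerKeyLockDemo("k");
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < tasks; i++) {
            final int n = i;
            pool.submit(() -> demo.append("k", Arrays.asList(n)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("appends did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.size("k");
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentAppends(1000));
    }
}
```

Because every append waits for the lock, no update is lost. In a distributed cache the lock and the read and write are network operations, which is why the transactional API referenced above is the more natural fit there.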

Related

Apache Ignite CollisionSpi configuration

I have a requirement like "Only allow cache updates on same cache to run in sequence". Our client node is written in .net.
Every cache has affinity key and we use computeJob.AffinityCallAsync("cacheName", "affinityKey", job) to submit the compute job for execution.
Now, if I use CollisionSpi, can I achieve "Sync jobs running on same node for same cache"? What configuration do I need to use?
Do I need to write the same configuration for all the nodes (server and client)? I saw that CollisionSpi has no implementation for .NET, so what can I do for the .NET client node?
Wrap your job logic in a lock to make it run in sequence:
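public class MyJob : IComputeFunc<string>
{
    // Static, so the lock is shared by all job instances running on this node.
    private static readonly object SyncRoot = new object();

    public string Invoke()
    {
        lock (SyncRoot)
        {
            // Update cache
            return null; // placeholder: return the job result here
        }
    }
}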
Notes:
ICache.Invoke may be a better fit for your use case
The requirement for sequential update sounds weird and may cause suboptimal performance: Ignite caches are safe to update concurrently. Please make sure this requirement makes sense.
UPDATE
Adding a lock will ensure that only one update happens at a time on a given node. Other nodes may still perform updates in parallel, and the order of updates is not guaranteed either.

Biztalk Debatched Message Value Caching

I get a file with 4000 entries and debatch it, so I don't lose the whole message if one entry has corrupt data.
The BizTalk map accesses a SQL Server. Before I debatched the message, I simply cached the SQL data in the map, but now I have 4000 independent maps.
Without caching the process takes about 30 times longer.
Is there a way to cache the data from the SQL Server somewhere outside the map without losing much performance?
Accessing a database in a Map is not a recommended pattern.
Since what you describe sounds like you're retrieving static reference data, another option is to move the process to an Orchestration where the reference data is retrieved one time into a Message.
Then, you can use a dual input Map supplying the reference data and the business message.
In this pattern, you can either debatch in the Orchestration or use a Sequential Convoy.
I would always avoid accessing SQL Server in a map - it gets very easy to inadvertently make many more calls than you intend (whether because of a mistake in the map design or because of unexpected volume or usage of the map on a particular port or set of ports). In fact, I would generally avoid making any kind of call in a map that has to access another system or service, but if you must, then caching can help.
You can cache using, for example, MemoryCache. The pattern I use with that generally involves a custom C# library where you first check the cache for your value, and if there's a miss you check SQL (either for the particular entry or for the entire cache), e.g.:
private readonly object _syncRoot = new object();
...
public string CheckCache(string key)
{
    string check = MemoryCache.Default.Get(key) as string;
    if (check == null)
    {
        lock (_syncRoot)
        {
            // make sure someone else didn't get here before we acquired the lock, avoid duplicate work
            check = MemoryCache.Default.Get(key) as string;
            if (check != null) return check;
            string sql = @"SELECT ...";
            using (SqlConnection conn = new SqlConnection(connStr))
            {
                conn.Open();
                using (SqlCommand cmd = conn.CreateCommand())
                {
                    cmd.CommandText = sql;
                    cmd.Parameters.AddWithValue(...);
                    // ExecuteScalar or ExecuteReader as appropriate, read values out, store in cache
                    // use MemoryCache.Default.Add with sensible expiration to cache your data
                    // assign the value found for this key to check so it is returned below
                }
            }
        }
    }
    return check;
}
A few things to keep in mind:
This will work on a per-AppDomain basis, and pipelines and orchestrations run in separate AppDomains. If you are executing this map in both places, you'll end up with caches in both places. The complexity added in trying to share this across AppDomains is probably not worth it, but if you really need that you should isolate your caching into something like a WCF NetTcp service.
This will use more memory - you shouldn't just throw everything and anything into a cache in BizTalk, and if you're going to cache stuff make sure you have lots of available memory on the machine and that BizTalk is configured to be able to use it.
The MemoryCache can store whatever you want - I'm using strings here, but it could be other primitive types or objects as well.
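The check, lock, re-check pattern above is not specific to MemoryCache. Here is a minimal Java sketch of the same idea, assuming an in-memory map and a hypothetical loader function that stands in for the SQL lookup; expiration is left out to keep it short, and the loader must not return null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache; `loader` stands in for the SQL query.
class ReadThroughCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Object syncRoot = new Object();
    private final Function<String, String> loader;
    final AtomicInteger loads = new AtomicInteger(); // counts how often the loader ran

    ReadThroughCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        String value = cache.get(key); // fast path: no lock on a hit
        if (value != null) {
            return value;
        }
        synchronized (syncRoot) {
            // Re-check: another thread may have filled the entry while we waited.
            value = cache.get(key);
            if (value == null) {
                loads.incrementAndGet();
                value = loader.apply(key); // the expensive lookup runs once per key
                cache.put(key, value);
            }
            return value;
        }
    }

    // Starts `threads` concurrent readers of one key and returns the loader call count.
    static int demoLoadCount(int threads) {
        ReadThroughCache c = new ReadThroughCache(k -> "value-for-" + k);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all readers at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                c.get("customer-42");
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.loads.get();
    }

    public static void main(String[] args) {
        System.out.println(demoLoadCount(16));
    }
}
```

A single lock object serializes all misses, which matches the C# example; a per-key lock would allow misses on different keys to load in parallel at the cost of more bookkeeping.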

Does Await have overhead in test case

I have very simple test cases (ScalaTest, but that doesn't matter) for two implementations of accessing some resources; each method returns either a Try or a case class instance.
Test cases:
"ResourceLoader" must "successfully initialize resource" in {
  // async code test
  noException should be thrownBy Await.result(ResourceLoader.initializeRemoteResourceAsync(credentials, networkConfig), Duration.Inf)
}
"ResourceLoader" must "successfully sync initialize remote resources" in {
  noException should be thrownBy ResourceLoader.initializeRemoteResource(credentials, networkConfig)
}
These tests exercise two different methods, both of which access remote resources.
Sync version
def initializeRemoteResource(credentials: Credentials, absolutePathToNetworkConfig: String): Resource = {
  //some code accessing remote server
}
Async version
def initializeRemoteResourceAsync(credentials: Credentials, absolutePathToNetworkConfig: String): Future[Try[Resource]] = {
  Future {
    //the same code as in sync version
  }
}
In the IDEA test tab I see that the Future-based version is twice as slow as the sync version. My question is: is there overhead in calling Await.result explicitly? If not, why does it slow down the execution? I appreciate any help, thanks.
Note: I know this is not the best way to measure the performance of a production system, but it at least shows how much time was spent on each test case.
Yes, Await.result adds a small overhead, but in practice it probably doesn't amount to much. Future {} requires an ExecutionContext in implicit scope, so you can't use it without importing the default execution context (scala.concurrent.ExecutionContext.global, which is backed by a thread pool) or some other context. With the default execution context, for example, the work runs on a second thread while the test thread blocks waiting for it, which involves some overhead for thread scheduling and context switching. It shouldn't be much, though. If "twice as slow" means 40 ms instead of 20 ms, it's probably not worth worrying about.
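To make the "two threads instead of one" point concrete, here is a small Java analogue (not the Scala code from the question): CompletableFuture.supplyAsync with an explicit executor plays the role of Future {} with an ExecutionContext, and join() plays the role of Await.result. The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingOnFuture {
    // Hypothetical stand-in for the code that accesses the remote server;
    // it just reports which thread executed it.
    static Thread loadResource() {
        return Thread.currentThread();
    }

    // Returns true if the async version ran on a different thread than the caller.
    static boolean asyncUsesAnotherThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Thread caller = Thread.currentThread();
            Thread syncThread = loadResource(); // sync version: runs on the caller
            Thread asyncThread = CompletableFuture
                    .supplyAsync(BlockingOnFuture::loadResource, pool)
                    .join(); // the caller blocks here until the pool thread finishes
            return syncThread == caller && asyncThread != caller;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(asyncUsesAnotherThread());
    }
}
```

The async path pays for handing the task to another thread and waking the caller afterwards. That cost is fixed and small, so it only dominates when the work itself is very short.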

Waiting for Realm writes to be completed

We are using Realm in a Xamarin app and have some issues refreshing the local database based on a remote source. Data is fetched from a remote endpoint and stored locally using Realm for easier/faster access.
Program flow is as follows:
1. Fetch data from remote source (if possible).
2. Loop through the entities returned by the remote source while keeping track of the IDs we've seen so far. New or updated entities are written to Realm.
3. Loop through the set of locally stored entities, removing entities we haven't seen in step 2 with Realm.Remove(entity); (in a transaction)
4. Return Realm.All<Entity>();
Unfortunately, the entities are returned by step 4 before all "remove" operations have been written. As a result, it takes a couple of refreshes before the local database is completely in sync.
The remove operation is done as follows:
foreach (Entity entity in realm.All<Entity>())
{
    if (seenIds.Contains(entity.Id))
    {
        continue;
    }
    realm.Write(() => {
        realm.Remove(entity);
    });
}
Is there a way to have Realm wait till the transaction is completed, before returning the Realm.All<Entity>();?
I am pretty sure this is not specifically a Realm issue: the same pattern would cause problems with many enumerable, mutable containers. You are removing items from a collection while iterating over it, so the enumeration moves on too far and skips items.
There is no buffering of Realm transactions, so I can guarantee the problem is not about having Realm wait until the transaction is completed; it is your list logic.
There are two basic ways to do this differently:
Use ToList to get a list of all objects from All. This is expensive if there are many objects, because it instantiates all of them.
Instead of removing objects inside the loop, add them to a list of items to be removed, then iterate over that list.
Note that using a transaction per remove, as you are doing with Write here, is relatively slow. You can do many operations in one transaction.
We are also working on other improvements to the Realm API that might give a more efficient way of handling this. It would be very helpful to know the relative data sizes - the number of removals vs records in the loop. We love getting sample data and schemas (can send privately to help@realm.io).
An example of option 2:
var toDelete = new List<Entity>();
foreach (Entity entity in realm.All<Entity>())
{
    if (!seenIds.Contains(entity.Id))
        toDelete.Add(entity);
}
// One transaction for all removals instead of one per entity.
realm.Write(() => {
    foreach (Entity entity in toDelete)
        realm.Remove(entity);
});
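The skipping effect is easy to reproduce outside Realm. The Java sketch below is an analogy using a plain ArrayList, not Realm code, and its names are hypothetical: removing by index inside the loop shifts the remaining elements left, so the element after each removal is never examined, while collecting first and removing afterwards visits everything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveWhileIterating {
    // Buggy: after remove(i) the next element slides into slot i,
    // and i++ then steps over it.
    static List<Integer> removeInLoop(List<Integer> ids, Set<Integer> seen) {
        List<Integer> live = new ArrayList<>(ids);
        for (int i = 0; i < live.size(); i++) {
            if (!seen.contains(live.get(i))) {
                live.remove(i); // remove(int index)
            }
        }
        return live;
    }

    // Option 2: collect the items to delete first, then remove them in one pass.
    static List<Integer> collectThenRemove(List<Integer> ids, Set<Integer> seen) {
        List<Integer> live = new ArrayList<>(ids);
        List<Integer> toDelete = new ArrayList<>();
        for (Integer id : live) {
            if (!seen.contains(id)) {
                toDelete.add(id);
            }
        }
        live.removeAll(toDelete);
        return live;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        Set<Integer> seen = new HashSet<>(Arrays.asList(1));
        System.out.println(removeInLoop(ids, seen));      // stale entries survive
        System.out.println(collectThenRemove(ids, seen)); // only seen entries remain
    }
}
```

The buggy version leaves stale entries behind on each run, which matches the symptom of needing a couple of refreshes before the local data is in sync.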

COMMIT WRITE BATCH NOWAIT in Hibernate

Is it possible to execute COMMIT WRITE BATCH NOWAIT in Hibernate?
I didn't search extensively but I couldn't find any evidence that you can access this functionality at the JDBC driver level.
And this leaves you with the option to specify the COMMIT_WRITE parameter at the instance or session level, if this makes sense for you.
Just in case, let me quote this blog post (I'm pasting the content for reference because the original site is either unavailable or dead and I had to use Google Cache):
Using "Commit Write Batch Nowait" from within JDBC
Anyone who has used the new
asynchronous commit feature of Oracle
10.2 will be aware that it's very useful for transaction processing
systems that would traditionally be
bound by log_file_sync wait events.
COMMIT WRITE BATCH NOWAIT is faster
because it doesn't wait for a message
assuring it that the transaction is
safely in the redo log - instead it
assumes it will make it. This nearly
eliminates log_file_sync events. It
also arguably undermines the whole
purpose of commit, but there are many
situations where the loss of a
particular transaction (say to delete
a completed session) is perfectly
survivable and far more preferable
than being unable to serve incoming
requests because all your connections
are busy with log_file_sync wait
events.
The problem for anyone using Oracle's JDBC
driver is that neither the 10.2 nor the
11.1 drivers have any extensions which allow you to access this functionality
easily - while Oracle have lots of
vendor specific extensions for all
sorts of things support for async
commit is missing.
This means you can:
Turn on async commit at the instance level by messing with the
COMMIT_WRITE init.ora parameter.
There's a really good chance this will
get you fired, as throughout the
entire system COMMIT will be
asynchronous. While we think this is
insane for production systems there
are times where setting it on a
development box makes sense, as if you
are 80% log file sync bound setting
COMMIT_WRITE to COMMIT WRITE BATCH
NOWAIT will allow you to see what
problems you face if you can somehow
fix your current ones.
Change COMMIT_WRITE at the session level. This isn't as dangerous as
doing it system wide but it's hard to
see it being viable for a real world
system with transactions people care
about.
Prepare and use a PL/SQL block that goes "BEGIN COMMIT WRITE BATCH NOWAIT;
END". This is safer than the first
two ideas but still involves a network
round trip.
Wrap your statement in an anonymous block with an asynchronous commit.
This is the best approach we've seen.
Your code will look something like
this:
BEGIN
--
insert into generic_table
(a_col, another_col, yet_another_col)
values
(?,?,?);
--
COMMIT WRITE BATCH NOWAIT;
--
END;
I was looking for a way to do this but couldn't get it working in a test. The reason for my hold up was that I was expecting the wrong results from my test. I was testing by manually acquiring a shared table lock to simulate adding an index - but in this case, the insert query acquires the lock, not the commit. So it doesn't actually solve the problem I was looking to solve. I got round my problem by moving these insertions into a background queue, so that they don't hold up the main web request.
Anyway I think you can still do asynchronous commits in Hibernate. Basically you can use the Session.doWork() method to get access to the native Connection object (or in older versions of Hibernate, the Session.connection() method). I also moved the commit SQL into a strategy interface, so that we can run our HSQLDB-based tests which wouldn't understand the Oracle specific SQL.
In fact, it may be fine to use Session.createSQLQuery and give that the SQL, avoiding having to directly use Connection. Try it and see how it works.
private NativeStrategy nativeStrategy = new OracleStrategy();

interface NativeStrategy {
    String commit();
}

public static final class OracleStrategy implements NativeStrategy {
    public String commit() {
        return "COMMIT WRITE BATCH NOWAIT";
    }
}

public void saveAsynchronously(MyItem item) {
    session.save(item);
    session.flush();
    // Try to issue an asynchronous commit where supported.
    session.doWork(new Work() {
        public void execute(Connection connection) throws SQLException {
            Statement commit = connection.createStatement();
            try {
                commit.execute( nativeStrategy.commit() );
            } finally {
                commit.close();
            }
        }
    });
}
