How does JanusGraph.open() work and how to scale? - janusgraph

I am evaluating different graph databases and libraries, and JanusGraph seems to provide most of what I need. I do have a couple of questions:
I would like to connect to it via Gremlin Server with the Cluster option, however I don't seem to see any Java examples for handling transaction rollbacks etc. at all.
And if I were to use the JanusGraphFactory.open("...") option, how exactly does this work? Would it mean the entire graph is loaded into memory in the JVM?
If the entire graph is loaded into memory, how would one scale up, and how would different JVMs keep up to date with each other?
Thanks & regards
Tin

I would like to connect to it via Gremlin Server with the Cluster option, however I don't seem to see any Java examples for handling transaction rollbacks etc. at all.
Connecting to Gremlin Server involves sessionless communication, meaning each request equals one transaction. You can connect with a session but it is not typically encouraged for most use cases.
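For what it's worth, a minimal sketch of that sessionless connection from Java with the TinkerPop driver (the host, port and "g" traversal source name are assumptions about your server configuration):

import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;

public class RemoteExample {
    public static void main(String[] args) throws Exception {
        // Build a connection to Gremlin Server; host/port are assumptions.
        Cluster cluster = Cluster.build("localhost").port(8182).create();

        // Sessionless remote traversal source: each traversal executes as its
        // own transaction on the server, so there is nothing for the client
        // to commit or roll back explicitly.
        GraphTraversalSource g = AnonymousTraversalSource.traversal()
                .withRemote(DriverRemoteConnection.using(cluster, "g"));

        Long count = g.V().count().next();
        System.out.println("vertex count: " + count);

        // Closing the cluster releases the underlying connections.
        cluster.close();
    }
}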
And if I were to use the JanusGraphFactory.open("...") option, how exactly does this work? Would it mean the entire graph is loaded into memory in the JVM?
It just creates a reference to the data and provides a Graph instance from which you can create a GraphTraversalSource to interact with for spawning traversals. It doesn't load any of that data into memory just by virtue of calling it.
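A rough sketch of the embedded route, showing where commit and rollback fit (the properties file path and property values are assumptions):

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;

public class EmbeddedExample {
    public static void main(String[] args) {
        // Opens a handle to the storage backend configured in the properties
        // file; the graph data stays in the backend, not in the JVM heap.
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-cql.properties");
        GraphTraversalSource g = graph.traversal();

        try {
            g.addV("person").property("name", "tin").iterate();
            graph.tx().commit();    // make the change visible to other instances
        } catch (Exception e) {
            graph.tx().rollback();  // discard the work done in this transaction
        } finally {
            graph.close();
        }
    }
}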

Related

Connecting to Cassandra on startup, and monitoring session health

Two related questions
1) Currently, the session to C* is established in a lazy fashion, i.e. only the first time any table is accessed.
Instead, we would like to establish a session as soon as the application is started (in case there is a connectivity problem, etc.). What would be the best way to do that? Should I just get a session object in my startup code?
connector.provider.session
2) How would I then monitor the health of the connection? I could call
connector.provider.session.isClosed()
but I'm not sure it will do the job.
I wouldn't manually rely on that mechanism per se, as you may want to get more metrics out of the cluster, for which purpose you have native JMX support; through the JMX protocol you can look at metrics in more detail.
Now obviously you have OpsCenter, which natively leverages this feature, but alternatively you can use a combination of a JMX listener with something like Grafana (just a thought) or whatever supports native compatibility.
In terms of low level methods, yes, you are on the money:
connector.provider.session.isClosed()
But you also have heartbeats that you can log and look at and so on. There's more detail here.
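If you do want to force the session up front rather than lazily, a minimal sketch with the underlying DataStax Java driver (3.x API assumed; the contact point and probe query are placeholders):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class StartupConnection {
    public static void main(String[] args) {
        // Eagerly build the cluster and connect at application startup so
        // connectivity problems surface immediately rather than on first use.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")   // assumption: local node
                .build();
        Session session = cluster.connect();

        // A cheap liveness probe; isClosed() only tells you the session object
        // was closed, not that the nodes behind it are healthy.
        boolean healthy = !session.isClosed()
                && !session.execute("SELECT release_version FROM system.local").isExhausted();
        System.out.println("Cassandra reachable: " + healthy);

        // JMX metrics (as mentioned above) give a richer picture than isClosed().
        cluster.close();
    }
}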

Are service fabric services entirely single-threaded?

I'm trying to get to grips with service fabric and I'm struggling a little bit. Some questions:
1) Are all service fabric service instances single-threaded? I created a stateless web api, one instance, with a method that did a Task.Delay, then returned a string. Two requests to this service were served one after the other, not concurrently. Am I right in thinking, then, that the number of concurrent requests that can be served is purely a function of the service instance count in the application manifest? Edit: Thinking about this, it is probably to do with the setup of the OWIN Web API. Could it be that it is blocking by session? I assumed there is no session by default?
2) I have long-running operations that I need to perform in service fabric (they can take several hours). Is there a recommended pattern I can use for this in service fabric? These are currently handled using a storage queue that triggers a webjob. Maybe something with Reliable Queues and a RunAsync loop?
It seems you handled the first part, so I will comment on the second part: long-running operations.
Long-running operations and workflows were being handled long before Service Fabric came about. For this reason, we can build on the shoulders of giants by looking at the design patterns that software experts have been using for decades, for example the famous and all-inclusive Process Manager. Mind you, this pattern is sometimes overkill; if it is in your case, just check out the rest of the related patterns in the Enterprise Integration Patterns book (by Gregor Hohpe).
As for the use of reliable collections, those are implementation details when choosing a data structure supporting the chosen design pattern.
I hope that helps
With regards to your second point - It really depends on the nature of your long running task.
Is your long running task the kind of workload that runs on an isolated thread that depends on local OS/VM level resources and eventually comes back with a result (A)? or is it the kind of long running task that goes through stages and builds up a model of the result through a series of persisted state changes (B)?
From what I understand of Service Fabric, it isn't really designed for running long running workloads (A), but more for writing horizontally-scalable, highly-available systems.
If you were absolutely keen on using service fabric (and your kind of workload tends to be more like B than A) I would definitely find a way to break down those long running tasks so that they can be processed in parallel across the cluster. But even then, there are probably more appropriate technologies designed for this, such as Azure Batch.
P.S. If you are going to put a long running process in the RunAsync method, you should design the workload so it is interruptible and its state can be persisted in a way that allows it to be resumed from another node in the cluster.
In a stateful service, only the primary replica has write access to state and thus is generally when the service is performing actual work. The RunAsync method in a stateful service is executed only when the stateful service replica is primary. The RunAsync method is cancelled when a primary replica's role changes away from primary, as well as during the close and abort events.
P.P.S. Long-running operations are the devil when trying to write scalable systems. Try and tackle that now and save yourself the future pain if possible.
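Service Fabric services are normally written in C#, but the interruptible, checkpointed shape described above is language-agnostic; here is a rough Java sketch of the idea (the ProgressStore abstraction and step names are hypothetical):

import java.util.concurrent.atomic.AtomicBoolean;

public class CheckpointedJob {
    // Hypothetical durable store for progress (reliable dictionary/queue,
    // database row, blob, etc.).
    interface ProgressStore {
        int loadLastCompletedStep();
        void saveCompletedStep(int step);
    }

    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    // Called when the work must stop, e.g. the primary replica is demoted.
    public void cancel() { cancelled.set(true); }

    public void run(ProgressStore store, int totalSteps) {
        // Resume from wherever a previous node (or a previous run) stopped.
        int step = store.loadLastCompletedStep() + 1;
        for (; step <= totalSteps && !cancelled.get(); step++) {
            doStep(step);                  // one small, repeatable unit of work
            store.saveCompletedStep(step); // checkpoint after every unit
        }
    }

    private void doStep(int step) {
        // ... the actual work for this step ...
    }
}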
To the first point - this is purely a client issue. Chrome saw my requests as identical and so delayed the 2nd request until the 1st got a response. Varying the parameters of the requests allowed them to be served concurrently.

What is the recommended way of creating a distributed Lock with Redis on Azure?

I'm looking to create a distributed Lock within Redis on Azure for our multi-instance Worker Role. I need a way of creating "critical sections" for which only a single thread can have access at a time across multiple-instances of the Worker Role.
I am using the StackExchange.Redis client to do this and, helpfully, it has an implementation of transactional LockTake/LockRelease already, and this answer on SO gives me a good idea of the pattern to use and details about how to create a lock.
Reading further around the subject, I also read this Redis article regarding distlock which describes the weaknesses of failover-based Redis nodes when trying to implement a distributed lock mechanism.
The Azure Redis cache implements master/slave failover (apart from the Basic tier) so does this mean that I will need to implement the redlock pattern in order to guarantee that only one thing will ever have the lock?
Additionally, I am wondering:
Why do Azure Redis example connection strings not seem to list the master and slave in them? Have Azure implemented the master/slave failover in a different way?
Why has one .NET implementation of redlock chosen not to support using master/slaves in its usage? (See Usage section, first para) Is this just by choice or is it because master/slave is not a valid usage of redlock (that would not seem to be the case in the redis article)
I'm the author of the RedLock.net library that you linked in your question. The reason the documentation specifies connecting to independent redis instances is based on the reasoning in the Redis Distlock documentation. By forcing writes only to master nodes, we hopefully avoid the situation where a user might misconfigure Redlock to connect to multiple replicated hosts.
According to Azure Redis Cache 103 - Failover and Monitoring there is a load balancer in front of an Azure Redis Cache (at the standard tier and above) that ensures that you are always connected to the master.
Connecting to multiple redis instances (either replicated or not) should give a fairly good guarantee that no two processes end up running at the same time (more so than a single replicated instance).
In order for another process to 'steal' the lock before the first had finished, more than half of the independent redis instances would need to lose their lock keys (e.g. by restarting without persistence), and then process two would have to gain the lock before process one's extend timer re-acquired it.
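For reference, the per-instance primitive that both a single-node lock and Redlock build on is a SET with NX and PX plus a unique owner token, released with a compare-and-delete script; a sketch using the Jedis client (an assumption, since the question uses StackExchange.Redis; key name and TTL are placeholders):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

import java.util.Collections;
import java.util.UUID;

public class RedisLockSketch {
    private static final String RELEASE_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else return 0 end";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {   // assumption: local redis
            String key = "locks:critical-section";
            String token = UUID.randomUUID().toString();     // identifies the owner

            // Acquire: only succeeds if the key does not already exist, and the
            // key expires automatically so a crashed owner cannot hold it forever.
            String ok = jedis.set(key, token, SetParams.setParams().nx().px(30_000));
            if ("OK".equals(ok)) {
                try {
                    // ... critical section ...
                } finally {
                    // Release: delete only if we still own the lock.
                    jedis.eval(RELEASE_SCRIPT,
                            Collections.singletonList(key),
                            Collections.singletonList(token));
                }
            }
        }
    }
}

Redlock then runs this same acquire against several independent masters and only treats the lock as held when a majority succeed well within the key's TTL.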

Best way to initialize initial connection with a server for REST calls?

I've been building some apps that connect to a SQL backend. I use ajax calls to hit WebMethods, a WebAPI, etc.
I notice that the first call to the SQL backend retrieves the data fairly slowly. I can only assume that this is because it must negotiate credentials before retrieving the data. It probably caches this somewhere, and thus any calls made afterwards come back very fast.
I'm wondering if there's an ideal, or optimal way, to initialize this connection.
My thought was to make a simple GET call right when the page loads (grabbing something very small, like a single entry). I probably wouldn't be using the returned data in any useful way, other than to ensure that any calls afterwards come back faster.
Is this an okay way to approach fixing the initial delay? I'd love to hear how others handle this.
Cheers!
There are a number of reasons that your first call could be slower than subsequent ones:
Depending on your server platform, code may be compiled when first executed
You may not have an active DB connection in your connection pool
The database may not have cached indices or data on the first call
Some VM platforms may take a while to allocate sufficient resources to your server if it has been idle for a while.
One way I deal with those types of issues on the server side is to add startup code to my web service that fetches data likely to be used by many callers when the service first initializes (e.g. lookup tables, user credential tables, etc).
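A rough sketch of that kind of server-side warm-up in a Java servlet container (the listener name, DataSource JNDI name and probe query are hypothetical):

import javax.annotation.Resource;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.Statement;

@WebListener
public class WarmUpListener implements ServletContextListener {

    @Resource(name = "jdbc/appDb")   // hypothetical DataSource
    private DataSource dataSource;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // Open a connection and run a cheap query so the pool is populated
        // and credentials are negotiated before the first real request.
        try (Connection conn = dataSource.getConnection();
             Statement st = conn.createStatement()) {
            st.execute("SELECT 1");
            // Optionally pre-load lookup tables into an application-scoped cache here.
        } catch (Exception e) {
            sce.getServletContext().log("Warm-up failed", e);
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) { }
}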
If you only control the client, consider that you may well wish to monitor server health (I use the open source monitoring platform Zabbix. There are also many commercial web-based monitoring solutions). Exercising the server outside of end-user code is probably better than making an extra GET call from a page that an end user has loaded.

How does session replication across containers work?

I would be interested in some timing details. For example, I place a container in the session which can hold different data, and I change the content of that container frequently. How can I make sure that the container's session value gets replicated across nodes for every change?
You don't need to make sure; that's the application server's job.
The J2EE specification doesn't deal with session-information synchronization amongst distributed components.
Theoretically, all you have to do is write thread-safe code. In your example, simply make sure that access to the container is synchronized. If your application server is bug-free, then you can safely assume that the session information is properly replicated across all nodes in a seamless manner; if your application server has bugs around session synchronization... well... then nothing is really safe anymore, now is it.
Application servers use different strategies to synchronize session information between nodes. Session content can be considered dirty, and in need of synchronization, when you:
put data in the session
get data from the session
Getting data from the session falls into two categories:
get a structured (mutable) object
get a scalar or immutable object
So if session data gets modified indirectly, by modifying a structured object obtained from the session, simply re-reading it from the session can ensure that the object's content gets replicated.
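The portable idiom, whichever dirty-detection strategy the container uses, is to put the mutated object back into the session so the change is marked dirty; a small sketch (the servlet and attribute names are hypothetical):

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
import java.util.HashMap;
import java.util.Map;

public class CartServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
        HttpSession session = req.getSession();

        @SuppressWarnings("unchecked")
        Map<String, Integer> cart =
                (Map<String, Integer>) session.getAttribute("cart");
        if (cart == null) {
            cart = new HashMap<>();
        }

        // Mutating the object alone may not be noticed by every container...
        cart.merge(req.getParameter("item"), 1, Integer::sum);

        // ...so put it back: setAttribute marks the session dirty and
        // triggers replication to the other nodes.
        session.setAttribute("cart", cart);
    }
}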
