Unreachable nodes in Akka Cluster Sharding

I'm researching Akka Cluster Sharding. After reading the Akka documentation and several discussions online, I understood that when a cluster has an unreachable node, only communication with shards on that node is interrupted: messages sent to shards on the unreachable node are buffered (so their senders eventually receive a timeout), while messages sent to shards on other available nodes are delivered and answered normally.
But when I set up a cluster with 4 nodes, stopped the service of one node, and confirmed via Akka HTTP Management that it was marked unreachable, I then tested sending messages from an actor on one available node to an actor on another available node, and still received a Timeout response.
So, can anyone confirm whether, when the cluster has an unreachable node, the entire sharding system stops working normally, or only the shards on the unreachable node are affected?
Thanks

From documentation:
Shard Coordinator State
The state of shard locations in the ShardCoordinator is persistent (durable) with Distributed Data or Persistence to survive failures. When a crashed or unreachable coordinator node has been removed (via down) from the cluster a new ShardCoordinator singleton actor will take over and the state is recovered. During such a failure period shards with known location are still available, while messages for new (unknown) shards are buffered until the new ShardCoordinator becomes available.
Note the sentence "During such a failure period shards with known location are still available": only shards whose location is already known remain available, while messages for new (unknown) shards are buffered until a new ShardCoordinator takes over. So while the node hosting the Shard Coordinator is unreachable (and not yet removed via down), the cluster as a whole cannot work fully normally.
For reference: Shard Coordinator State in doc.akka.io
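In practice this means the unreachable node must actually be removed (downed) before a new coordinator singleton can take over. As an illustrative sketch (assuming Akka 2.6 or later, which ships a built-in Split Brain Resolver; not part of the question above), automatic downing can be enabled in application.conf like this:

akka.cluster {
  # let the built-in Split Brain Resolver decide when to down unreachable nodes
  downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
  split-brain-resolver.active-strategy = keep-majority
}

Until such a downing decision is made (automatically, or manually via Akka Management), an unreachable node remains a cluster member, and if it hosted the coordinator, allocation of new shards stays blocked.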

Related

How to use RabbitMQ quorum queue for data replication

In the RabbitMQ documentation, it is mentioned that:
All data/state required for the operation of a RabbitMQ broker is replicated across all nodes. An exception to this are message queues, which by default reside on one node, though they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, use a queue type that supports replication. This topic is covered in the Quorum Queues guide.
If we are using a Spring Boot AMQP classic queue and we need to start using a RabbitMQ cluster where data is replicated across nodes for the lowest risk of data loss, what changes need to be made to the code to start using a quorum queue?
When defining the queue, the type defaults to classic; to choose the quorum type instead, add the queue type as an argument:
@Bean
public Queue eventsQueue() {
    // durable = true, exclusive = false, autoDelete = false,
    // plus the x-queue-type argument that makes this a quorum queue
    Map<String, Object> args = new HashMap<>();
    args.put("x-queue-type", "quorum");
    return new Queue(queueName, true, false, false, args);
}
In addition to the above, make sure you point your Spring Boot RabbitMQ configuration at the cluster, not at a single node. This can be done by replacing the spring.rabbitmq.host entry in application.properties with spring.rabbitmq.addresses=[comma-separated host:port list].
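For illustration, assuming three hypothetical brokers named rabbit1, rabbit2, and rabbit3 on the default AMQP port, the application.properties entry would look something like:

spring.rabbitmq.addresses=rabbit1:5672,rabbit2:5672,rabbit3:5672

With multiple addresses listed, the Spring AMQP connection factory can fail over to another listed node if the one it is currently connected to goes down.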
A mirrored Classic Queue has a master running on one node in the cluster, while its mirrors run on other nodes. Quorum Queues work much the same way: the leader, by default, runs on the node the client application that created the queue was connected to, and followers are created on the rest of the nodes in the cluster.
In the past, queue replication was specified using policies in conjunction with Classic Queues. Quorum queues are created differently, but they should be compatible with all client applications that allow you to provide arguments when declaring a queue. The x-queue-type argument needs to be provided with the value quorum when creating the queue.

Corda State Events : Do events have an order?

A network consists of 3 nodes, where one node is read-only and participates in every transaction. A request can start from either of the other nodes, which creates a request state. It is then received and processed by the other node, which creates a new response state. Both nodes only issue new states and do not consume them. Both of these state events are received by the read-only node. Would the state events received by the read-only Corda node have an order, or would they be processed in any order?
For example, can we say that the request originator's state event would be received/processed first and then the other node's? Or could it happen under high load that the other node's event is received/processed by the read-only node first, and the originator's event arrives after it?
My experience with Corda is very minimal, and I need to understand how events are received by the parties when one party acts as read-only and all remaining parties only issue new states.
In general, the order in which messages are received is not guaranteed. A node will process messages in the order it receives them, but there is no guarantee that they arrive in the order they were sent.
If Node A is receiving messages from Node B and Node C, and Node B produces a message before Node C does, there is no guarantee that the message from Node B is processed first. Whichever message reaches Node A first gets processed first, and the delay can have multiple causes, such as network latency.

NiFi DistributedMapCacheServer and flowfiles on different nodes

Hi, I am using the NiFi DistributedMapCacheServer to keep track of processed files in my flow. The issue is that we are working in a cluster, and to leverage it we use load balancing on the queues, so FlowFiles are not all on the same node. When they arrive at Put/GetDistributedMapCache, which uses a DistributedMapCacheClient configured with the fixed hostname of one of the nodes, the flow only works when the arriving FlowFile is on the same node as the one specified in the DistributedMapCacheClient; for the others we are getting:
FetchDistributedMapCache[id=d4713096-5ae5-1cb4-b777-202948e39e50] Unable to communicate with cache when processing StandardFlowFileRecord[uuid=5b1e8092-5bc5-4213-97a3-fa023691973f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1587393798960-14, container=default, section=14], offset=983015, length=5996],offset=0,name=bf15d684-4100-4aa5-9fb5-fa0ddb21b140,size=5996] due to No route to host: java.net.NoRouteToHostException: No route to host
Is there any way to set up the DMC server/client to work in such a case, or can I somehow route all FlowFiles to an explicitly given node?
This means the hostname/IP address that you specified in the DistributedMapCacheClient for the location of the server is unreachable from the other nodes in your cluster. Your nodes must already be able to communicate with each other, since you have a working cluster, so you just need to set this property to a hostname or address that every node can resolve and reach.
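For illustration (the hostname below is hypothetical), the DistributedMapCacheClientService controller service would be configured along these lines, with the same values working from every node:

Server Hostname: nifi-node1.example.com
Server Port: 4557

Here 4557 is the default port of the DistributedMapCacheServer, and nifi-node1.example.com must resolve and be reachable on all cluster nodes.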

ElasticSearch Clusters Setting

Does anyone know how to tell Elasticsearch to stop node-to-node communication and then restart it? In my system I would like to tell it to stop until a certain condition is met, and then restart the communication (synchronize data).
By node-to-node communication, do you mean data synchronization and shard relocations?
If yes, you can do it by setting cluster.routing.allocation.enable to none using the cluster settings API.
If you don't mean data synchronization, you can achieve this by blocking port 9300 (or whichever port ES is using for internal communication).
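For illustration, disabling allocation with the cluster settings API looks like the following (mirroring the settings example further below); setting the value back to "all" re-enables allocation:

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.enable" : "none"
  }
}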
Please note that any node leaving the cluster will cause Elasticsearch to rebalance the shards and replicas. The overall cluster load increases when a node is lost, since the cluster needs to fulfill the shard and replica settings by copying existing data to the remaining nodes. Therefore, if the operation happens often, considerable extra space will be consumed by the additional shard and replica copies.
If you fully understand the impact, you can try shard allocation filtering. For example, to exclude the host with IP 10.0.0.1 from the cluster:
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}
Besides _ip, you can use _name (node name) or _host (host name) to exclude the node as well.
You can find the full documentation here: https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-filtering.html

How to setup Elasticsearch client nodes?

I have a couple of Elasticsearch questions regarding client nodes:
1. Can I say that any node, as long as it has the HTTP port open, can be treated as a "client" node, because we can search/index through it?
2. We actually treat a node as a client node when master=false and data=false. If I set up 10 client nodes, do I need to do the routing on my client side? I mean, if I specify clientOne:9200 in my code as the ES entry point, would clientOne forward HTTP requests to the other client nodes? Otherwise clientOne would be under very high pressure. I.e., do client nodes communicate with each other?
3. When I set up client nodes in the ES cluster, should I close the other nodes' HTTP ports, since we should only query the client nodes?
4. Do you think it's necessary to set up both a data node and a client node on the same machine, or should I just let the data node act as a client node as well, since it's on the same machine anyway?
5. If the ES cluster will be heavily/frequently indexed but searched less, then I don't need to set up client nodes, because client nodes are good for gathering data, right?
6. For general search/index purposes, should I use the HTTP port or the TCP port? What's the difference from the client's perspective?
1. Yes, you can send queries via HTTP to any node that has port 9200 open.
2. With node.data: false and node.master: false, you get a "client" node. These are useful for offloading indexing and search traffic from your data nodes. If you have 10 of them, you would want to put a load balancer in front of them.
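As a configuration sketch (assuming an Elasticsearch version that uses these legacy setting names, as this answer does), a client node's elasticsearch.yml would contain:

node.master: false
node.data: false

A request sent to such a node is coordinated by it: the node scatters the request to the relevant data nodes and gathers the results before replying.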
3. Closing the data nodes' HTTP port (http.enabled: false) would keep them from serving client requests (probably good), though it would also prevent you from curl'ing them directly for stats, etc.
4. Client nodes are useful (see #2), so I wouldn't route traffic directly to your data nodes. Whether you run both a client node and a data node on the same piece of hardware depends on the configuration of that machine (do you have sufficient RAM, etc.).
5. Client nodes are also useful for indexing, because they know which data node should receive the data for storage. If you sent an indexing request to a random data node instead, the odds would be high that it would have to redirect the request to another node. That's a waste of time and resources if you can create client nodes instead.
6. Having your clients join the cluster might give them access to more information about the cluster, but using HTTP gives them a more generic "black box" interface. With HTTP, you also don't have to keep your clients at the same version as your ES nodes.
Hope that helps.
