Consul: query all service nodes in one request

https://www.consul.io/docs/agent/http/catalog.html
/v1/catalog/services : lists the registered services
/v1/catalog/service/<service> : lists the nodes for a given service
I have a lot of services, so I have to query Consul for the nodes of each one, which means /v1/catalog/service/<service> is called many times. I need an HTTP API that returns the nodes for all services in a single request, something like:
/v1/catalog/servicesNodes : nodes for each service
{
  "service1": [{"Node": "2e6c1dbe173f", "Address": "172.17.42.1",
                "ServiceID": "aa:80", "ServiceName": "aaww", ...}, {}],
  "service2": [{"Node": "2e6c1dbe173ee", "Address": "172.17.42.1",
                "ServiceID": "aaqq:80", "ServiceName": "aaqqww", ...}, {}]
}
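The catalog API documented above does not appear to offer such a combined endpoint, so one workaround is to assemble the map client-side: list the service names from /v1/catalog/services, then call /v1/catalog/service/<name> for each. A minimal Java sketch of that, assuming a local agent at 127.0.0.1:8500 and Jackson on the classpath:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Iterator;

public class ConsulServiceNodes {

    private static final String CONSUL = "http://127.0.0.1:8500"; // assumed local agent address

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        ObjectMapper mapper = new ObjectMapper();

        // 1. List all service names: GET /v1/catalog/services
        JsonNode services = mapper.readTree(get(http, CONSUL + "/v1/catalog/services"));

        // 2. Fetch the nodes for each service: GET /v1/catalog/service/<name>
        Iterator<String> names = services.fieldNames();
        while (names.hasNext()) {
            String name = names.next();
            JsonNode nodes = mapper.readTree(get(http, CONSUL + "/v1/catalog/service/" + name));
            System.out.println(name + " -> " + nodes);
        }
    }

    private static String get(HttpClient http, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}

This is still one HTTP round trip per service, but it keeps the fan-out in one place and produces the per-service node lists in the shape shown above.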

Related

consul - Duplicate call generated during timeout

I have two microservices, MS1 and MS2, running on two nodes, say n1 and n2. When MS1 calls MS2, Consul discovers the MS2 instance on n1. However, MS2 on n1 takes longer than the configured read timeout. As soon as the timeout is reached, I see a call arriving at MS2 on n2.
Is it expected behavior for Consul to redirect a call to a different node when the first node takes too long?
This happens only when the endpoint is a GET request; it does not happen for POST requests.
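Consul's catalog/DNS discovery itself does not proxy or retry application calls, so this behaviour usually comes from the calling side: an HTTP client or client-side load balancer retrying idempotent requests (GET) against another discovered instance after a read timeout, while non-idempotent requests (POST) are not retried. A rough, generic illustration of that pattern, with hypothetical hosts n1/n2 and a made-up wrapper class (not any specific library's implementation):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;
import java.util.List;

public class IdempotentRetryClient {

    private final HttpClient http = HttpClient.newHttpClient();

    // Hypothetical: MS2 instances as discovered from Consul.
    private final List<String> instances = List.of("http://n1:8080", "http://n2:8080");

    public String get(String path) throws Exception {
        Exception last = null;
        // GET is idempotent, so a timed-out call may be retried on the next instance.
        for (String base : instances) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(base + path))
                    .timeout(Duration.ofSeconds(2)) // read timeout
                    .GET()
                    .build();
            try {
                return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
            } catch (HttpTimeoutException e) {
                last = e; // timeout: try the next instance
            }
        }
        // A POST would typically not be retried this way because it is not idempotent,
        // which matches the behaviour described above.
        throw last != null ? last : new IllegalStateException("no instances discovered");
    }
}

Whether this applies here depends on the HTTP client or load balancer sitting between MS1 and MS2; its retry settings are the place to look.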

Infinispan clustered REPL_ASYNC cache: command indefinitely bounced between two nodes

I'm running a Spring Boot application using Infinispan 10.1.8 in a two-node cluster. The two nodes communicate via JGroups TCP. I configured several REPL_ASYNC caches.
The problem:
One of these caches, at some point, causes the two nodes to exchange the same message over and over, causing high CPU and memory usage. The only way to stop this is to stop one of the two nodes.
More details and the configuration follow.
org.infinispan.configuration.cache.Configuration replAsyncNoExpirationConfiguration = new ConfigurationBuilder()
.clustering()
.cacheMode(CacheMode.REPL_ASYNC)
.transaction()
.lockingMode(LockingMode.OPTIMISTIC)
.transactionMode(TransactionMode.NON_TRANSACTIONAL)
.statistics().enabled(cacheInfo.isStatsEnabled())
.locking()
.concurrencyLevel(32)
.lockAcquisitionTimeout(15, TimeUnit.SECONDS)
.isolationLevel(IsolationLevel.READ_COMMITTED)
.expiration()
.lifespan(-1) //entries do not expire
.maxIdle(-1) // even when they are idle for some time
.wakeUpInterval(-1) // disable the periodic eviction process
.build();
One of these caches (named formConfig) causes abnormal communication between the two nodes. This is what happens:
with JMeter I generate traffic load targeting only node 1
for some time node 2 receives cache entries from node 1 via SingleRpcCommand; no anomalies, even the formConfig cache behaves properly
after some time a new cache entry is sent to the formConfig cache
At this point the same message seems to keep bouncing between the two nodes:
node 1 sends the entry: mn-node1.company.acme-develop sending command to all: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
node 2 receives the entry: mn-node2.company.acme-develop received command from mn-node1.company.acme-develop: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
node 2 sends the entry back to node 1: mn-node2.company.acme-develop sending command to all: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
node 1 receives the entry: mn-node1.company.acme-develop received command from mn-node2.company.acme-develop: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850],
node 1 sends the entry to node 2, and so on and on...
Some other things:
the system is not under load; JMeter is running only a few users in parallel
even after stopping JMeter, this loop doesn't stop
formConfig is the only cache that behaves this way; all the other REPL_ASYNC caches work properly. With only the formConfig cache deactivated, the system works correctly
I cannot reproduce the problem with two nodes running on my machine
Here's a more complete log file including logs from both nodes.
Other info:
OpenJDK 11 HotSpot
Spring Boot 2.2.7
Infinispan Spring Boot starter 2.2.4
using JBossUserMarshaller
I suspect either something related to the transactional configuration, or something related to serialization/deserialization of the cached object.
The only scenario where this can happen is when the SimpleKey ends up with a different hashCode() on the two nodes.
Are there any exceptions in the log? Are you able to check whether the hashCode() is the same after serialization and deserialization of the key?
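One way to check the second point is to round-trip the key and compare hashCode() before and after. The sketch below uses plain Java serialization only as a stand-in for the JBossUserMarshaller that the application actually configures, and the key variable is a placeholder for the real SimpleKey instance seen in the log:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class KeyHashCodeCheck {

    public static void main(String[] args) throws Exception {
        // Placeholder: build the real SimpleKey from the same components as in the log,
        // e.g. the [form_config, MECHANICAL, DESIGN, et, 7850] key.
        Serializable key = "replace-with-real-key";

        int before = key.hashCode();
        Object roundTripped = roundTrip(key);
        int after = roundTripped.hashCode();

        System.out.println("hashCode before: " + before);
        System.out.println("hashCode after : " + after);
        System.out.println("equal objects  : " + key.equals(roundTripped));
    }

    // Serialize and deserialize the key; the cluster uses JBossUserMarshaller,
    // so this plain-Java round trip is only an approximation of what happens on the wire.
    private static Object roundTrip(Serializable value) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}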

Invoke a command on a node in Karaf Cellar based on NodeID

At the moment, I have a cellar setup with just two nodes (meant for testing); as seen in the dump below:
  | Id                | Alias          | Host Name    | Port
--+-------------------+----------------+--------------+-----
x | 192.168.99.1:5702 | localhost:8182 | 192.168.99.1 | 5702
  | 192.168.99.1:5701 | localhost:8181 | 192.168.99.1 | 5701
Edit 1 -- Additional Information about the setup (begin):
I have multiple Cellar nodes. I am trying to make one node a master, which is supposed to expose a management web panel through which I would like to fetch stats from all the other nodes. For this purpose, I have exposed my custom MBean implementations containing my business logic. I understand that these MBeans can be invoked using Jolokia, and I am already doing that. So all these nodes will have Jolokia installed, while the master node will have Hawtio installed (such that I can connect to the slave nodes via the Jolokia API through the Hawtio panel).
Right now, I am manually assigning the alias for every node (which refers to the web endpoint it exposes via the pax.web configuration). This is just a workaround to simplify my testing.
Desired Process:
I have access to the ClusterManager service via the service registry, so I am able to invoke clusterManager.listNodes() and loop through the result in my MBean. While looping through it, all I get is basic node info. If possible, I would like to read the etc/org.ops4j.pax.web.cfg file on every node and get the port number (or the value of the property org.osgi.service.http.port).
While retrieving the list of nodes, I would like to get a response as:
{
  "Node 1": {
    "hostname": "192.168.0.100",
    "port": 5701,
    "webPort": "8181",
    "alias": "Data-Node-A",
    "id": "192.168.0.100:5701"
  },
  "Node 2": {
    "hostname": "192.168.0.100",
    "port": 5702,
    "webPort": "8182",
    "alias": "Data-Node-B",
    "id": "192.168.0.100:5702"
  }
}
Edit 1 (end):
I am trying to find a way to execute specific commands on a particular node. For example, I want to execute a command on Node *:5702 from *:5701 such that *:5702 returns the properties and values of a local configuration file.
My current method is not optimal, as I am setting the alias (the web endpoint for Jolokia) of each node manually, and based on that I am retrieving my desired info via my custom MBean. I guess this is not best practice.
So far, I have:
Set<Node> nodes = clusterManager.listNodes();
Thus, if I loop through this set of nodes, I would like to retrieve the config settings from the local configuration file of every node, based on the node ID.
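The local half of that (reading org.osgi.service.http.port on whichever node the code runs on) could look roughly like the sketch below, using the standard OSGi ConfigurationAdmin service; distributing that call to the other nodes is exactly the part that would still need a Cellar command/handler pair, along the lines of the ping-pong sample mentioned below. The class name and injection style are illustrative only:

import java.io.IOException;
import java.util.Dictionary;

import org.osgi.service.cm.Configuration;
import org.osgi.service.cm.ConfigurationAdmin;

public class LocalWebPortReader {

    private final ConfigurationAdmin configurationAdmin; // injected via Blueprint/DS/service registry

    public LocalWebPortReader(ConfigurationAdmin configurationAdmin) {
        this.configurationAdmin = configurationAdmin;
    }

    // Reads org.osgi.service.http.port from the local org.ops4j.pax.web configuration.
    public String localWebPort() throws IOException {
        Configuration config = configurationAdmin.getConfiguration("org.ops4j.pax.web", null);
        Dictionary<String, Object> props = config.getProperties();
        if (props == null) {
            return null; // configuration not yet created on this node
        }
        Object port = props.get("org.osgi.service.http.port");
        return port != null ? String.valueOf(port) : null;
    }
}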
Do I need to implement something specific to DOSGi here?
Or would it be something similar to the sample code of ping-pong (https://github.com/apache/karaf-cellar/tree/master/utils/src/main/java/org/apache/karaf/cellar/utils/ping) from apache-cellar project?
Any input on this would be very helpful.
P.S. I tried posting this on the Karaf mailing list, but my posts are getting bounced.
Regards,
Cooshal.

High Availability in SymmetricDS

To all those SymmetricDS nerds over there, this one's for you all.
Right, so we have a main DB, DB-01. We have 3 instances of our application running, namely R1, R2, R3. Each instance has its own in-memory DB, namely D1, D2, D3, which the application accesses respectively. We are using SymmetricDS to do a one-way sync from DB-01 to D1, D2, D3. So there is a server node, corporate C0, pointing to DB-01, and 3 client nodes, stores S1, S2, S3, pointing to D1, D2, D3 respectively.
All is working fine.
But now we would like to introduce high availability and thereby failover into this topology, i.e., at any time there will be 2 server nodes running, say Master and Slave, both accessing the same DB-01. If the Master server goes down, clients should automatically connect to the Slave node and continue operation.
What configuration changes might be required to accomplish this? Are there any examples or documentation that I can follow to understand this concept?
We do this via clustering, with 2 SymmetricDS services running on 2 app servers pointing to the High Availability (HA) connections. Then all you need is HA connections to fail over as normal, and SymmetricDS clustering does the rest.
Link for the user manual on clustering.
https://www.symmetricds.org/doc/3.13/html/user-guide.html#_clustering
EDIT: let me get some configs for you on here. Service 1:
engine.name=<SDS_SERVICE_1>
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://<HA_connection1>:1433/<DB>;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=***********
db.password=***********
registration.url=http://<IP>:7004/sync/<SDS_MAIN>
sync.url=http://<IP>:7004/sync/<SDS_SERVICE_1>
group.id=<GID>
external.id=100
auto.registration=true
initial.load.create.first=true
sync.table.prefix=sym
start.initial.load.extract.job=false
cluster.lock.enabled=true
cluster.server.id=11
cluster.lock.timeout.ms=600000
cluster.lock.refresh.ms=60000
compression.level=-1
compression.strategy=0
Service 2:
engine.name=<SDS_SERVICE_2>
db.driver=net.sourceforge.jtds.jdbc.Driver
db.url=jdbc:jtds:sqlserver://<HA_connection2>:1433/<DB>;useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
db.user=***********
db.password=***********
registration.url=http://<IP>:7004/sync/<SDS_MAIN>
sync.url=http://<IP>:7004/sync/<SDS_SERVICE_2>
group.id=<GID>
external.id=100
auto.registration=true
initial.load.create.first=true
sync.table.prefix=sym
start.initial.load.extract.job=false
cluster.lock.enabled=true
cluster.server.id=12
cluster.lock.timeout.ms=600000
cluster.lock.refresh.ms=60000
compression.level=-1
compression.strategy=0

Connecting to elasticsearch cluster in NEST

Let's assume I have several Elasticsearch machines in a cluster: 192.168.1.1, 192.168.1.2 and 192.168.1.3.
Any of the machines can go down. It doesn't look like NEST supports providing a list of IPs to try to connect to.
So how do I make sure I connect to one of the available machines from NEST? Just try to open a connection to one and, if TryConnect doesn't work, try another?
You can run a local ES instance on your application server (e.g. your web server) and configure it to work as a load balancer:
Set node.client: true (or node.master: false and node.data: false) in this local ES config to make it a load balancer. This means the node will neither become master nor hold data
Configure it to join the cluster (your 3 nodes don't need to know about this ES node)
Configure NEST to use the local ES node as your search server
This ES node then becomes part of your cluster and will distribute your requests to the suitable nodes
If you don't want a "load balancer", then you have to check manually on the client side to determine which node is alive.
Since you have a small set of nodes, you can use a StaticConnectionPool:
var uri1 = new Uri("http://192.168.1.1:9200"); // full URIs including scheme and port (9200 is the Elasticsearch default)
var uri2 = new Uri("http://192.168.1.2:9200");
var uri3 = new Uri("http://192.168.1.3:9200");
var uris = new List<Uri> { uri1, uri2, uri3 };
var connectionPool = new StaticConnectionPool(uris);
var connectionSettings = new ConnectionSettings(connectionPool); // <-- needs to be reused
var client = new ElasticClient(connectionSettings);
An important point to keep in mind is to reuse the same ConnectionSettings instance when creating a new ElasticClient, since the client's caches are per ConnectionSettings. See this GitHub post:
...In any case its important to share the same ConnectionSettings
instance across any elastic client you instantiate. ElasticClient can
be a singleton or not as long as each instance shares the same
ConnectionSettings instance.
All of our caches are per ConnectionSettings, this includes
serialization caches.
Also a single ConnectionSettings holds a single IConnectionPool and
IConnection something you definitely want to reuse across requests.
I would set up one of the nodes as a load balancer, meaning that the URL you are calling should always be up.
Though if you increase the number of replicas, you can call any of the nodes by URL and still access the same data. Elasticsearch does not care which one you access while in a cluster, so you could build your own range of IPs in your application.
