How do I use Consul to make sure only one service is performing a task?
I've followed the examples at http://www.consul.io/ but I am not 100% sure which way to go. Should I use KV? Should I use services? Or should I register a service as a health check and have the cluster call it at a given interval?
For example, imagine there are several data centers. Within every data center there are many services running. Every one of these services can send emails. These services have to check whether there are any emails to be sent and, if there are, send them. However, I don't want the same email to be sent more than once.
How would I make sure that all emails are sent and none is sent more than once?
I could do this using other technologies, but I am trying to implement this using Consul.
This is exactly the use case for Consul Distributed Locks.
For example, let's say you have three servers in different AWS availability zones for failover. Each one is launched with:
consul lock -verbose lock-name ./run_server.sh
The Consul agent will only run the ./run_server.sh command on whichever server acquires the lock first. If ./run_server.sh fails on the server holding the lock, the Consul agent releases the lock, and whichever other node acquires it first then executes ./run_server.sh. This way you get failover with only one server running at a time. If you registered your Consul health checks properly, you'll be able to see that the server on the first node failed; you can then repair it and restart the consul lock ... command on that node, which will block until it can acquire the lock.
Currently, distributed locking can only happen within a single Consul datacenter. But since it is up to you to decide which Consul servers make up a datacenter, you should be able to solve your issue. If you want locking across federated Consul datacenters you'll have to wait for it, since it's a roadmap item.
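For the email case in the question, the same idea can be used from code instead of through the consul lock command. The following is only a minimal sketch, assuming Consul's HTTP API for sessions (PUT /v1/session/create) and KV lock acquisition (PUT /v1/kv/<key>?acquire=<session>). The `put` argument is a hypothetical transport function that would normally send the request to a local agent; here an in-memory `FakeConsul` stands in for it so the example runs without a cluster. The key name `locks/email-sender` is made up for illustration.

```python
import json

LOCK_KEY = "locks/email-sender"  # hypothetical key name


def create_session(put, name):
    """Create a Consul session and return its ID."""
    body = put("/v1/session/create", json.dumps({"Name": name}))
    return json.loads(body)["ID"]


def try_acquire(put, key, session_id, owner):
    """Try to take the lock; Consul answers with the text 'true' or 'false'."""
    return put(f"/v1/kv/{key}?acquire={session_id}", owner).strip() == "true"


def release(put, key, session_id):
    """Give the lock back so another node can take over."""
    return put(f"/v1/kv/{key}?release={session_id}", "").strip() == "true"


def send_pending_emails(put, node, send):
    """Run send() only if this node wins the lock; return whether it ran."""
    session = create_session(put, node)
    if not try_acquire(put, LOCK_KEY, session, node):
        return False
    try:
        send()
    finally:
        release(put, LOCK_KEY, session)
    return True


class FakeConsul:
    """In-memory stand-in for a Consul agent, for demonstration only."""

    def __init__(self):
        self.holder = None
        self.sessions = 0

    def __call__(self, path, body):
        if path == "/v1/session/create":
            self.sessions += 1
            return json.dumps({"ID": f"session-{self.sessions}"})
        session = path.split("=", 1)[1]
        if "?acquire=" in path:
            if self.holder in (None, session):
                self.holder = session
                return "true"
            return "false"
        if "?release=" in path:
            if self.holder == session:
                self.holder = None
                return "true"
            return "false"
        raise ValueError(path)
```

Every service instance would call `send_pending_emails` on its polling interval; only the instance that holds the lock actually sends. A real deployment would also want a session TTL or health check so the lock is freed if the holder dies.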
First Point:
The question is how to use Consul to solve a specific problem. However, Consul cannot solve that specific problem because of intrinsic limitations in the nature of a gossip protocol.
When one datacenter cannot talk to another you cannot safely determine if the problem is the network or the affected datacenter.
The usual solution is to define what happens when one DC cannot talk to another one. For example, if we have 3 datacenters (DC1, DC2, and DC3), we can decide that whenever a DC cannot talk to the other 2 DCs, it stops updating the database.
If DC1 cannot talk to DC2 and DC3 then DC1 will stop updating the database, and the system will assume DC2 and DC3 are still online.
Let's imagine that DC2 and DC3 are still online and they can talk to each other, then we have quorum to continue running the system.
When DC1 comes online again it will play catch up with the database.
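The rule described above is a plain majority check. As a small sketch (the function name and parameters are made up for illustration), a datacenter may keep writing only if it, together with the peers it can still reach, forms a strict majority of all datacenters:

```python
def has_quorum(reachable_peers, total_datacenters):
    """True when this DC plus the peers it can reach form a strict majority."""
    return (reachable_peers + 1) * 2 > total_datacenters


# DC1 is cut off from DC2 and DC3: it must stop updating the database.
dc1_may_write = has_quorum(reachable_peers=0, total_datacenters=3)

# DC2 can still reach DC3: two out of three is a majority, so it continues.
dc2_may_write = has_quorum(reachable_peers=1, total_datacenters=3)
```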
Where can Consul help here? It can communicate between DCs and check if they are online... but so can ICMP.
Take a look at the comments. Did this answer your question? Not really. But I don't think the question has an answer.
Second point: The question is "How to use Consul in leader election?" It would have been better to ask "How does Consul elect a new leader?" or "Given the documentation on Consul.io, can you give me an example of how to determine the leader using Consul?"
If that is what you really want, then the question was already answered: How does a Consul agent know it is the leader of a cluster?
Related
I am trying to understand how best to implement the MDP example in C# for use in a Windows service in a multiple-client, single-server environment.
I have read the docs but I am still unclear on the following:
Should all Worker instances be created on startup and left to run?
Should the Workers all be different types of services or just different instances of the same service?
Can I have one Windows service which contains the Broker and Workers, or is it best to split them out into their own services?
The example code I am using is the MajorDomo Pattern taken from here https://github.com/NetMQ/Samples
Yes, all workers in an MDP environment should be created independently of the requests, since the broker should not know how to create them.
Each worker handles a given "service" (contract). Obviously each contract should have at least one worker.
If you need parallelized handling of requests, and a given worker can only handle one at a time, having extra workers for that service could make sense. Generally, however, you would do this when multiple machines are involved (horizontal scaling).
You can have the broker and workers in the same process. HOWEVER, if you want to update only a worker, taking down the broker at the same time can be annoying for the clients. I would recommend letting the broker be its own process, with the workers in one or more other processes.
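To make the broker/worker relationship concrete, here is a toy in-memory model, written in Python rather than C# and not using the NetMQ API at all. The `Broker` class and its methods are hypothetical. It only shows the points made above: workers register themselves under a service name, the broker never creates them, and several workers for one service share the load.

```python
from collections import defaultdict, deque


class Broker:
    """Toy Majordomo-style broker: routes requests by service name."""

    def __init__(self):
        self._idle = defaultdict(deque)  # service name -> idle workers

    def register(self, service, worker):
        """Called by a worker when it starts; the broker only records it."""
        self._idle[service].append(worker)

    def dispatch(self, service, request):
        """Hand the request to the next idle worker for that service."""
        workers = self._idle[service]
        if not workers:
            raise LookupError(f"no worker registered for {service!r}")
        worker = workers.popleft()
        try:
            return worker(request)
        finally:
            workers.append(worker)  # the worker is idle again


broker = Broker()
broker.register("echo", lambda request: "worker-1: " + request)
broker.register("echo", lambda request: "worker-2: " + request)
replies = [broker.dispatch("echo", text) for text in ("a", "b", "c")]
```

In a real deployment the workers would be separate processes talking to the broker over sockets, which is what makes it possible to restart a worker without taking the broker down.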
I'm reading around Redis at the moment and trying to get a good understanding of what a 'node' is in terms of how Redis works. Am I right to think of it in the same way as an endpoint?
In Redis' context, a node is a server running one or more redis-server processes.
An endpoint is a network address through which you can access one or more such processes, depending on how Redis is clustered.
When using the open source Redis cluster, an endpoint is any of the processes - meaning a node's address and the port that the process listens to. Redis client libraries use the protocol to interrogate the clustered redis-server process about other members of the cluster (again, processes listening on ports on nodes), so they can establish connections to other endpoints accordingly.
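As an illustration of how a client can learn about the other endpoints, the reply to the CLUSTER NODES command lists one member per line, with the node ID first, then the address (ip:port, optionally followed by @ and the cluster bus port), then a comma-separated flags field. The sketch below parses such a reply into endpoints. The sample text uses made-up, shortened node IDs, and `parse_cluster_nodes` is a hypothetical helper, not part of any client library.

```python
def parse_cluster_nodes(text):
    """Turn CLUSTER NODES output into a list of endpoint dictionaries."""
    endpoints = []
    for line in text.strip().splitlines():
        fields = line.split()
        node_id, address, flags = fields[0], fields[1], fields[2].split(",")
        host_port = address.split("@", 1)[0]  # drop the cluster bus port
        host, port = host_port.rsplit(":", 1)
        if "master" in flags:
            role = "master"
        elif "slave" in flags:
            role = "replica"
        else:
            role = "unknown"
        endpoints.append(
            {"id": node_id, "host": host, "port": int(port), "role": role}
        )
    return endpoints


# Made-up sample reply with shortened node IDs.
SAMPLE = """
aaa111 127.0.0.1:30001@31001 myself,master - 0 0 1 connected 0-5460
bbb222 127.0.0.1:30002@31002 master - 0 1426238316232 2 connected 5461-10922
ccc333 127.0.0.1:30004@31004 slave aaa111 0 1426238317239 4 connected
"""

endpoints = parse_cluster_nodes(SAMPLE)
```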
Disclaimer: it appears that you're asking about AWS ElastiCache, which may or may not be using the OSS implementation in whole or partially. I do not claim to have any knowledge on that subject.
A node is a unit of temporary memory (RAM) that is reachable over the network. It is the smallest unit in which frequently accessed data is stored, following a lazy-loading or write-through strategy. A collection of such nodes, with a predefined Redis process running on each node, is called a cluster.
More on node :
https://redis.io/commands/cluster-nodes/
Recently I started learning Redis and have been able to do everything I needed for learning purposes on 32-bit Windows. I am a .NET developer and set up caching with Redis using the ServiceStack client in a Web API project. I have been able to successfully run a Redis cluster of 4 masters and 4 slaves, and was wondering how I can make that work in conjunction with the ServiceStack client.
My main concern is this: if the master that my client is connected to goes down, how can the client automatically connect to the slave that takes over, given that the port of that slave is going to be different? Failover is working at the Redis level, but how does the client handle it?
I recreated this scenario using the Redis command line interface, but when I took the master down, the interface just stopped responding, as if everything was going into a black hole. So, in my experience, the CLI does not automatically handle failover as a client.
I have started studying StackExchange's client to Redis, but still have the same question.
I am using Redis distribution given by Microsoft for learning purposes available at Github (Sorry, cannot provide link as I am new here and do not have sufficient reputation points).
Redis Sentinels are additional Redis processes which monitor the health of your Redis master and slaves and take care of performing automatic failover when they detect that your master instance is down. The Redis Config project provides a quick way to set up a popular Redis Sentinel configuration.
The ServiceStack.Redis Client supports Redis Sentinel and implements the Recommended client Strategy which is what enables it to automatically recover after a failover by asking one of the Sentinels for the next available address to connect to, resuming operations with one of the available instances.
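The core of that client strategy can be sketched in a few lines. This is not ServiceStack.Redis code, only an illustration in Python of the general idea: walk the list of known Sentinels and use the first usable answer to the SENTINEL get-master-addr-by-name command. The `ask` argument is a hypothetical function standing in for sending that command to one Sentinel and returning its reply (a host/port pair, or None).

```python
def discover_master(sentinels, master_name, ask):
    """Return (host, port) of the current master as reported by a Sentinel."""
    for sentinel in sentinels:
        try:
            reply = ask(sentinel, master_name)
        except OSError:
            continue  # this Sentinel is unreachable, try the next one
        if reply:
            host, port = reply
            return host, int(port)
    raise RuntimeError("no sentinel returned a master address")


def fake_ask(sentinel, master_name):
    """Stand-in for a network call: the first Sentinel is down."""
    if sentinel == "10.0.0.1:26379":
        raise ConnectionRefusedError("sentinel down")
    return ["10.0.0.7", "6380"]


master = discover_master(
    ["10.0.0.1:26379", "10.0.0.2:26379"], "mymaster", fake_ask
)
```

After a failover the client repeats the same lookup, which is how it ends up connected to the promoted instance even though its address and port differ from the old master's.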
You can learn more about Redis Sentinel in the official Documentation.
I've been doing some research on enhancing the in-house Discovery Service on my project. We have a number of nodes in a highly available cluster that are responsible for the discovery service. In order to get access to some service, each client app sends a multicast message to all these nodes in the cluster. All nodes respond to the client, and the very first response determines which node is used for further work. This is an overhead, and I'm thinking of using some kind of leader election algorithm where only a single leader responds to clients. Is it reasonable to use such an algorithm for this task?
I think what you are trying to do is load balance across multiple machines, where any machine can handle the requests. Leader election seems like overhead here. A load balancer can probably solve the issue.
I am considering whether we can replace our message-queue middleware with ØMQ.
I have two set of servers.
The servers in the first set don't talk to other servers in the same set; they only append requests to a specific message queue.
The servers in the second set don't talk to other servers in the same set either; they only receive requests from a specific message queue and handle them.
It looks like a producer-consumer model.
And I think it can be replaced by the ØMQ's freelance pattern http://zguide.zeromq.org/page:all#Brokerless-Reliability-Freelance-Pattern.
But the questions are:
How to support dynamic discovery for both server & clients?
There are probably a hundred ways you could implement that, and the right one depends greatly on your situation. If all the servers will always be on the same LAN, you could bootstrap using the broadcast address on the local network and ask all responders who they are. Quick and dirty.
I would personally implement a bootstrap service that everyone knows about. They all can ask this always-available service for who is 'online' for the type of server they're after.
Another option is to use pub-sub. This would require a central publisher. Newly connecting nodes would notify the publisher, which would notify all other nodes of the new join, possibly including the new node's ID, ip:port (if desired), etc. All nodes will still be able to communicate if the publisher crashes, since it's only used for global notifications, and a backup publisher could be used to make the system failsafe. Each node can also send heartbeats to the publisher, with the publisher notifying all other nodes when a node leaves or crashes.
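The bookkeeping such a publisher (or the bootstrap service mentioned earlier) needs is small. Below is a minimal sketch in plain Python with no ØMQ sockets involved: the `Registry` class, its method names, and the `events` list (standing in for the messages that would be published to all nodes) are hypothetical. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class Registry:
    """Tracks which nodes are online based on their heartbeats."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._last_seen = {}  # node id -> (address, time of last heartbeat)
        self.events = []      # stand-in for notifications sent to all nodes

    def heartbeat(self, node_id, address, now):
        """Record a heartbeat; the first one from a node counts as a join."""
        if node_id not in self._last_seen:
            self.events.append(("join", node_id, address))
        self._last_seen[node_id] = (address, now)

    def expire(self, now):
        """Drop nodes whose last heartbeat is older than the timeout."""
        for node_id, (address, seen) in list(self._last_seen.items()):
            if now - seen > self.timeout:
                del self._last_seen[node_id]
                self.events.append(("leave", node_id, address))

    def online(self):
        """Return the current node id -> address map."""
        return {node_id: addr for node_id, (addr, _) in self._last_seen.items()}


registry = Registry(timeout=5)
registry.heartbeat("worker-1", "10.0.0.5:5555", now=0)
registry.heartbeat("worker-2", "10.0.0.6:5555", now=1)
registry.heartbeat("worker-2", "10.0.0.6:5555", now=6)
registry.expire(now=7)  # worker-1 has been silent for 7 seconds
```

Because the registry only carries membership notifications, nodes that already know each other's addresses can keep talking directly while it is down.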