I'm trying to design a system using ZeroMQ where I have a bunch of consumers that each want to use a service, and I need a way for the consumers to start the desired service if it's not started yet. I can visualize having a broker do this (useful to do so, since both consumers and service are dynamic, and their endpoints are TCP ports that can't be known before runtime), but then I have to figure out how to start the broker if it hasn't started yet.
How do you reliably start a single broker using ZeroMQ? The call to bind() seems to accept more than one socket.
Related
I have successfully able to connect worker and broker on tcp protocol and then client to broker on tcp.
Now i am evaluating that is it possible that worker and broker can connect on ipc/inproc protocol while client will connect to broker on tcp.
My workers and broker will be on same machine and might even reside in same process. My client can connect to my broker from different machine that's why it needed to be on tcp
Can Broker be binded in dual way ?
Yes, there's no problem doing what you suggest. Each ZMQ socket operates completely distinct from the other sockets in your code. It often makes sense to mix connection protocols to optimize communication the way you're looking to do.
One assumption I'm making here is that your broker has 2 sets of sockets: client facing sockets that you can connect via TCP and worker facing sockets that you can connect via some other protocol. If both client and worker are connecting to the same socket on the broker, then they must connect via the same protocol.
The only thing to consider is whether your workers will always reside in the same process as your broker, or if it might grow to a point where it makes sense to separate them. But, if you define your socket connections in some sort of configurable way, rather than baking it into the code, even that could be a relatively easy fix if you decide to change things down the line.
To get Kafka running, you need to set some properties in config/server.properties file. There are two settings I don't understand.
Can somebody explain the difference between listeners and advertised.listeners property?
The documentation says:
listeners: The address the socket server listens on.
and
advertised.listeners:
Hostname and port the broker will advertise to producers and consumers.
When do I have to use which setting?
listeners is what the broker will use to create server sockets.
advertised.listeners is what clients will use to connect to the brokers.
The two settings can be different if you have a "complex" network setup (with things like public and private subnets and routing in between).
Since I cannot comment yet I will post this as an "answer", adding on to M.Situations answer.
Within the same document he links there is this blurb about which listener is used by a KAFKA client (https://cwiki.apache.org/confluence/display/KAFKA/KIP-103%3A+Separation+of+Internal+and+External+traffic):
As stated previously, clients never see listener names and will make metadata requests exactly as before. The difference is that the list of endpoints they get back is restricted to the listener name of the endpoint where they made the request.
This is important as depending on what URL you use in your bootstrap.servers config that will be the URL* that the client will get back if it is mapped in advertised.listeners (do not know what the behavior is if the listener does not exist).
Also note this:
The exception is ZooKeeper-based consumers. These consumers retrieve the broker registration information directly from ZooKeeper and will choose the first listener with PLAINTEXT as the security protocol (the only security protocol they support).
As an example broker config (for all brokers in cluster):
advertised.listeners=EXTERNAL://XXXXX.compute-1.amazonaws.com:9990,INTERNAL://ip-XXXXX.ec2.internal:9993
inter.broker.listener.name=INTERNAL
listener.security.protocol.map=EXTERNAL:SSL,INTERNAL:PLAINTEXT
If the client uses XXXXX.compute-1.amazonaws.com:9990 to connect, the metadata fetch will go to that broker. However, the returning URL to use with the Group Coordinator or Leader could be 123.compute-1.amazonaws.com:9990* (a different machine!). This means that the match is done on the listener name as advertised by KIP-103 irrespective of the actual URL (node).
Since the protocol map for EXTERNAL is SSL this would force you to use an SSL keystore to connect.
If on the other hand you are within AWS lets say, you can then issue ip-XXXXX.ec2.internal:9993 and the corresponding connection would be plaintext as per the protocol map.
This is especially needed in IaaS where in my case brokers and consumers live on AWS, whereas my producer lives on a client site, thus needing different security protocols and listeners.
EDIT:
Also adding Inbound Rules is much easier now that you have different ports for different clients (brokers, producers, consumers).
EDIT2:
This article is a great in depth guide if the above is still not clear: https://rmoff.net/2018/08/02/kafka-listeners-explained/
There's so much confusion or little information in answers provided here for the question. So posting my elaborate answer for clarity.
listeners - Used by the embedded jetty web server in kafka to bind to. This jetty web server is used to provide REST API that provides the control plane for Kafka Connect workers. The hostname in this setting can be left empty if you want kafka to bind to localhost (it does by calling InetAddress.getLocalHost().getCanonicalHostName() java api)
advertised.listeners: This address is published to zookeeper by every kafka broker. If this setting is not set, then value of listeners will be used here and published to zookeeper. That's the only purpose of this setting for notifying others. Kafka Clients use the 'advertised.listeners' setting published to zookeeper (as /brokers/ids/<id>/ # endpoints) to talk to Kafka broker.
Now the question is why to have two setting? Why not a single setting? Let's say your kafka broker is sitting behind a proxy. And all the kafka clients have to talk to the proxy to reach the broker. In this case, we want kafka's embedded jetty server to bind to localhost and local port, but we can't publish this to zookeeper as clients can't use it. So kafka admin can set the setting advertised.listeners to the proxy host and port.
Also, in some of our production hosts, InetAddress.getLocalHost().getCanonicalHostName() returns empty and so listeners setting's hostname was empty which was fine for jetty to bind. But advertised.listeners was published to zookeeper as NULL:9092 since it took the same value as listeners by default. Now all the brokers tried to publish in this way to zookeeper and so brokers got the error java.lang.IllegalArgumentException: requirement failed: Configured end points null:14092 as advertised.listeners as NULL:9092 is already registered by broker 101. The fix was to change the advertised.listeners setting to have hostname in it.
Listeners are all the addresses the Kafka broker listens on (it can be more than 1 address) whereas advertised listeners are the addresses other agents (producers, consumers, or brokers) need to connect to if they want to talk to the current broker.
The 2 lists should be the same if all are running on the same machine (can connect using localhost:9092 or 127.0.0.1:9092) but if consumers, producers, or other brokers do not stay on the same machine or Docker instance, they must use different addresses (that's why we have advertised listeners). Two examples:
Saying we use Docker to run 2 Kafka instances named kafka and kafka2. kafka2 for sure cannot connect to kafka using localhost:29092. It must use kafka:9092 instead. So for kafka, listener = localhost:29092, advertised listener = kafka:9092
Producer from host machine cannot connect to kafka using kafka:9092. It must use localhost:29092 instead.
Let use the following docker-compose config to understand more about the startup process of a Kafka broker:
# config/docker-compose.yml
kafka:
image: docker.io/bitnami/kafka:3
ports:
- "29092:29092"
- "9092:9092"
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
- KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CLIENT:PLAINTEXT,EXTERNAL:PLAINTEXT
- KAFKA_CFG_LISTENERS=CLIENT://:9092,EXTERNAL://:29092
- KAFKA_CFG_ADVERTISED_LISTENERS=CLIENT://kafka:9092,EXTERNAL://localhost:29092
- KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT
depends_on:
- zookeeper
With this config, Docker will start 1 Kafka broker instance which listens on 2 ports:
9092 with name CLIENT
29092 with name EXTERNAL
The broker then connects to Zookeeper at zookeeper:2181 and registers its 2 addresses: kafka:9092 and localhost:29092. Also, with KAFKA_CFG_INTER_BROKER_LISTENER_NAME=CLIENT, it wants Zookeeper to tell other brokers to connect to kafka:9092 if want to talk to it.
But why need 2 ports? Read more here
References:
My notes while learning Kafka
Kafka Listeners – Explained
From this link: https://cwiki.apache.org/confluence/display/KAFKA/KIP-103%3A+Separation+of+Internal+and+External+traffic
During the 0.9.0.0 release cycle, support for multiple listeners per
broker was introduced. Each listener is associated with a security
protocol, ip/host and port. When combined with the advertised
listeners mechanism, there is a fair amount of flexibility with one
limitation: at most one listener per security protocol in each of the
two configs (listeners and advertised.listeners).
In some environments, one may want to differentiate between external
clients, internal clients and replication traffic independently of the
security protocol for cost, performance and security reasons. A few
examples that illustrate this:
Replication traffic is assigned to a separate network interface so that it does not interfere with client traffic.
External traffic goes through a proxy/load-balancer (security, flexibility) while internal traffic hits the brokers directly
(performance, cost).
Different security settings for external versus internal traffic even though the security protocol is the same (e.g. different set of
enabled SASL mechanisms, authentication servers, different keystores,
etc.)
As such, we propose that Kafka brokers should be able to define
multiple listeners for the same security protocol for binding (i.e.
listeners) and sharing (i.e. advertised.listeners) so that internal,
external and replication traffic can be separated if required.
So,
listeners - Comma-separated list of URIs we will listen on and their protocols.
Specify hostname as 0.0.0.0 to bind to all interfaces.
Leave hostname empty to bind to default interface.
Examples of legal listener lists:
PLAINTEXT://myhost:9092,TRACE://:9091
PLAINTEXT://0.0.0.0:9092, TRACE://localhost:9093
advertised.listeners - Listeners to publish to ZooKeeper for clients to use, if different than the listeners above.
In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, the value for listeners will be used.
Can there more than one listener to a queue manager ? I have used one listener/queue manager combination so far and wonder if this possible. This is because we have 2 applications connecting to same queue manager and seems to have problem with that.
There are a couple meanings for the term listener in an MQ context. Let's see if we can clear up some confusion over the terminology and then answer the question as it relates to each.
As defined in the spec, a JMS listener is an object that implements a callback mechanism. It listens on destinations for messages and calls onMessage when they arrive. The destinations may be queues or topics hosted by any JMS-compliant transport provider.
In IBM MQ terms, a listener is a process (runmqlsr) that handles inbound connection requests on a server. Although these can handle a variety of protocols, in practice they are almost exclusively TCP listeners that bind a port (1414 by default) and negotiate connection requests on sockets.
TCP Ports
Tim's answer applies to the second of these contexts. MQ can listen for sockets on multiple ports and indeed it is quite common to do so. Each listener listens on one and only one port. It may listen on that port across all network interfaces or can be bound to a specific network interface. No two listeners can bind to the same combination of interface and port though.
In a B2B context the best practice is to run a dedicated listener for each external business partner to isolate each of their connections across dedicated access paths. Internally I usually recommend separate ports for QMgr-to-QMgr, app-to-QMgr and interactive user connections.
In that sense it is possible to run multiple listeners on a given QMgr. Each of those listeners can accept many connections. Their job is to negotiate the connection then hand the socket off to a Message Channel Agent which talks to the QMgr on behalf of the remotely connected client or QMgr.
JMS Listeners
Based on comments, Ulab refers to JMS listeners. These objects establish a connection to a queue manager and then wait in GET mode for new messages arriving on a destination. On arrival of a message, they call the onMessage method which is an asynchronous callback routine.
As to the question "can there more than one (JMS) listener to a queue manager?" the answer is definitely yes. A multi-threaded application can have multiple listeners connected, multiple application instances can connect at the same time, and many thousands of application connections can be handled by a single queue manager with sufficient memory, disk and CPU available.
Of course, each of these applications is ultimately connected to one or more queues so then the question becomes one of whether they can connect to the same queue.
Many listeners can listen on the same queue so long as they do not get exclusive access to it. Each will receive a portion of the messages arriving.
Listeners on QMgr-managed subscriptions are exclusively attached to a dynamic queue but multiple instances on the same topic will all receive the same messages.
If the queue is clustered and there is more than one instance of it multiple listeners will be required to get all the messages since they will normally be distributed by MQ workload distribution across those instances.
Yes, you can create as many listeners as you wish:
http://www.ibm.com/support/knowledgecenter/SSFKSJ_9.0.0/com.ibm.mq.explorer.doc/e_listener.htm
However, there is no reason why two applications can't connect to the queue manager via the same listener (on the same port). What problem have you run into?
I have two applications deployed on Cloudfoundry: a service application that computes stuff (aka computeService) and a client application that renders html for us mortals to hit buttons on (aka clientService). I would like a controller in the clientService to send commands to the computeService (when mortals hit buttons). The broker and the computeService run on the same machine.
I know I cannot make remote AMQP connections into a service on cloudfoundry.com, but I assume I can make connections between applications. However, every sensible address combination for broker and clientService gives me the same error:
javax.jms.JMSException: Could not connect to broker URL: tcp://127.0.0.1:61616. Reason: java.net.ConnectException: Connection refused
Whatever address I try, I cannot post to the queue. The code works flawlessly on my local machine.
My question: can I use RabbitMQ to pass messages between the two applications on Cloudfoundry? And if so, which addresses should I use?
Thanx!
One way to try this out is to create two replicas of the rabbit message example at Spring Samples
...a message sender and a message receiver. When deployed, they should share the same rabbit service.
I pushed the rabbit message which worked for me to: rabbitmessage-sndrcv
Can I set up one MDB to listen to more than one listener port? Each listener port will be connected to one particular queue.
If not, why is the restriction that one MDB can listen to only one port?
No. An MDB may only be associated with one listener port (or one activation specification).
As a possible workaround to this limitation, you can configure your MDB multiple times so that each one can be bound to a different queue (listener port / activation spec).
MDBs are deployed to an application server. The application server is typically only listening on one port. You can build a simple java application that creates different connections to different servers; in a configurable way though. Just not as MDBs.
MDB's are a layer(probably several) of abstraction(s) above the concept of a port. Most messaging implementations will broker traffic across a single port,but likely a combination of data/control ports.
Think of the broker like a mail depot the letters come in and the broker puts them in the right mailbox while providing a number of other services(peer failover/communication, persistence, guaranteed delivery, message acknowledgment, etc.).
MDBs are the agents that subscribe to those abstract mailboxes. They really have no understanding of the underlying architecture. As far as they are concerned things are all happening locally in memory. Their only job is to adhere to the EJB standard and the container (typically through the application of more low level standards like JCA layered again over raw sockets) takes care of making sure the messages arrive at their destinations.
Maybe it would be helpful if you further elaborated on why you are concerned about how your MDBs relate to ports