I am developing a Java multicast application using JZMQ (the PGM protocol).
Is it possible to send and receive data through the same socket?
If ZMQ.PUB is used, only send() works and recv() is not working.
If ZMQ.SUB is used, send() doesn't work.
Is there any alternative way for using both send() and recv() using the same Socket?
ZMQ.Context context = ZMQ.context(1);
ZMQ.Socket socket = context.socket(ZMQ.PUB);
socket.send(msg);
socket.recv();
Radio broadcast will never deliver your voice into the Main Station
Yes, both parts of the ZeroMQ PUB/SUB Scalable Formal Communication Pattern archetype are uni-directional by definition: one side can just .send(), the other(s) may just listen ( and, if configured well, they will ).
How to do what you have asked for? ( ... and forget about having this over pgm:// )
Yes, there are ways to use other ZeroMQ archetypes for this - i.e. a single socket over PAIR/PAIR endpoints ( capable of both .send() and .recv() methods ), or a pair of (A)->--PUSH/PULL->-(B) + (A)-<-PULL/PUSH-<-(B) links, so as to construct a bi-directional signalling / messaging channel out of just uni-directional archetypes ( a minimal PAIR/PAIR sketch follows the PUB/SUB listings below ).
You also need to select an appropriate transport-class to be used in .bind() + .connect() between the configured ZeroMQ endpoints.
// -------------------------------------------------------- HOST-(A)
ZMQ.Context aCONTEXT = ZMQ.context( 1 );
ZMQ.Socket aPubSOCKET = aCONTEXT.socket( ZMQ.PUB );
aPubSOCKET.setLinger( 0 );                    // ZMQ_LINGER = 0: do not block on .close()
// ----------------------
aPubSOCKET.bind( "tcp://*:8001" );
// ----------------------
// set msg = ...;
// ----------------------
aPubSOCKET.send( msg, ZMQ.NOBLOCK );          // non-blocking send
// ...
// ----------------------
aPubSOCKET.close();
aCONTEXT.term();
// ----------------------
The SUB-side has one more duty ...
// -------------------------------------------------------- HOST-(B)
ZMQ.Context aCONTEXT = ZMQ.context( 1 );
ZMQ.Socket aSubSOCKET = aCONTEXT.socket( ZMQ.SUB );
aSubSOCKET.setLinger( 0 );                    // ZMQ_LINGER = 0: do not block on .close()
aSubSOCKET.subscribe( "".getBytes() );        // subscribe to everything
// ----------------------
aSubSOCKET.connect( "tcp://<host_A_IP_address>:8001" );
// ----------------------
// def a msg;
// ----------------------
msg = aSubSOCKET.recv( ZMQ.NOBLOCK );         // non-blocking receive, returns null if nothing has arrived
// ...
// ----------------------
aSubSOCKET.close();
aCONTEXT.term();
// ----------------------
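For completeness, a minimal bi-directional PAIR/PAIR sketch, shown against the plain libzmq C API for brevity ( JZMQ exposes the same ZMQ.PAIR socket type ); the tcp://*:8002 endpoint is just an example value:
// -------------------------------------------------------- HOST-(A), PAIR/PAIR sketch
void *aCtx  = zmq_ctx_new();
void *aPair = zmq_socket( aCtx, ZMQ_PAIR );
zmq_bind( aPair, "tcp://*:8002" );            // HOST-(B) would zmq_connect() its own ZMQ_PAIR socket here
zmq_send( aPair, "ping", 4, 0 );              // the very same socket can both send ...
char aBuf[16];
zmq_recv( aPair, aBuf, sizeof( aBuf ), 0 );   // ... and receive
zmq_close( aPair );
zmq_ctx_term( aCtx );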
Related
In order to design our API/messages, I've made some preliminary tests with our data:
Protobuf V3 Message:
message TcpGraphes {
    uint32          flowId          = 1;
    repeated uint64 curTcpWinSizeUl = 2;   // max 3600 elements
    repeated uint64 curTcpWinSizeDl = 3;   // max 3600 elements
    repeated uint64 retransUl       = 4;   // max 3600 elements
    repeated uint64 retransDl       = 5;   // max 3600 elements
    repeated uint32 rtt             = 6;   // max 3600 elements
}
The message is built as a multipart message in order to add the filter functionality for the client.
Tested with 10 python clients: 5 running on the same PC (localhost), 5 running on an external PC.
Protocol used was TCP. About 200 messages were sent every second.
Results:
Local clients are working: they get every message
Remote clients are missing some messages (throughput seems to be limited by the server to 1 Mbit/s per client)
Server code (C++):
// zeroMQ init
zmq_ctx = zmq_ctx_new();
zmq_pub_sock = zmq_socket(zmq_ctx, ZMQ_PUB);
zmq_bind(zmq_pub_sock, "tcp://*:5559");
Every second, about 200 messages are sent in a loop:
std::string serStrg;
tcpG.SerializeToString(&serStrg);
// first part identifier: [flowId]tcpAnalysis.TcpGraphes
std::stringstream id;
id << It->second->first << tcpG.GetTypeName();
zmq_send(zmq_pub_sock, id.str().c_str(), id.str().length(), ZMQ_SNDMORE);
zmq_send(zmq_pub_sock, serStrg.c_str(), serStrg.length(), 0);
Client code (python):
ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.SUBSCRIBE, '')
sub.connect('tcp://x.x.x.x:5559')
print ("Waiting for data...")
while True:
message = sub.recv() # first part (filter part, eg:"134tcpAnalysis.TcpGraphes")
print ("Got some data:",message)
message = sub.recv() # second part (protobuf bin)
We have looked at the PCAP and the server doesn't use the full bandwidth available. I can add some new subscribers and remove some existing ones; every remote subscriber gets "only" 1 Mbit/s.
I've tested an Iperf3 TCP connection between the two PCs and I reach 60Mbit/s.
The PC that runs the Python clients is at about 30% CPU load.
I've minimized the console where the clients are running in order to avoid the printout, but it has no effect.
Is this normal behavior for the TCP transport layer (PUB/SUB pattern)? Does it mean I should use the EPGM protocol?
Config:
Windows XP for the server
Windows 7 for the Python remote clients
ZeroMQ version 4.0.4 used
A performance-motivated interest?
Ok, let's first use the resources a bit more adequately:
// //////////////////////////////////////////////////////
// zeroMQ init
// //////////////////////////////////////////////////////
zmq_ctx = zmq_ctx_new();
int aRetCODE = zmq_ctx_set( zmq_ctx, ZMQ_IO_THREADS, 10 );
assert( 0 == aRetCODE );
zmq_pub_sock = zmq_socket( zmq_ctx, ZMQ_PUB );
uint64_t aAffinityMASK = 1023;                      // 0b1111111111 ~ use I/O-threads 0..9
aRetCODE = zmq_setsockopt( zmq_pub_sock, ZMQ_AFFINITY, &aAffinityMASK, sizeof( aAffinityMASK ) );
assert( 0 == aRetCODE );
// ^^^^
// ||||
// (:::::::::::)-------++++
// >>> print ( "[{0: >16b}]".format( 2**10 - 1 ) ).replace( " ", "." )
// [......1111111111]
// ||||||||||
// |||||||||+---- IO-thread 0
// ||||||||+----- IO-thread 1
// |......+------ IO-thread 2
// :: : :
// |+------------ IO-thread 8
// +------------- IO-thread 9
//
// API-defined AFFINITY-mapping
Non-Windows platforms with a more recent API can also touch scheduler details and tweak O/S-side priorities even better.
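A minimal sketch of what that can look like, assuming a recent libzmq ( one where the ZMQ_THREAD_SCHED_POLICY and ZMQ_THREAD_PRIORITY context options exist ) on a Linux host; the priority value is just an example:
#include <sched.h>
// must be set on the context before the first socket is created
zmq_ctx_set( zmq_ctx, ZMQ_THREAD_SCHED_POLICY, SCHED_FIFO );   // real-time scheduling class for the I/O-threads
zmq_ctx_set( zmq_ctx, ZMQ_THREAD_PRIORITY,     90 );           // priority inside that class ( example value )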
Networking ?
Ok, let's also use the network resources a bit more adequately:
int aToS  = <_a_HIGH_PRIORITY_ToS#_>;               // keep the intended high-priority ToS / DSCP value here
aRetCODE  = zmq_setsockopt( zmq_pub_sock, ZMQ_TOS, &aToS, sizeof( aToS ) );
Converting the whole infrastructure into epgm:// ?
Well, if one wishes to experiment and has the warranted resources for doing that end-to-end (E2E).
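If one does go that way, a minimal epgm:// sketch could look like this; the interface name, multicast group and rate are example placeholders, and ZMQ_RATE has to be set before the endpoint gets attached:
int aRate_kbps = 80000;                                      // [kb/s] multicast rate limit ( example value )
zmq_setsockopt( zmq_pub_sock, ZMQ_RATE, &aRate_kbps, sizeof( aRate_kbps ) );
zmq_bind( zmq_pub_sock, "epgm://eth0;239.192.1.1:5559" );    // <interface>;<multicast-group>:<port>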
I am seeing a strange behavior using ZMQ_PUB.
I have a producer which .connect()-s to different processes that .bind() on ZMQ_SUB sockets.
The subscribers all .bind(), the publisher .connect()-s.
When a producer starts, it creates a ZMQ_PUB socket and .connect()-s it to different processes. It then immediately starts sending messages at a regular period.
As expected, if there are no connected subscribers, it drops all messages, until a subscriber starts.
The flow then works normally: when a subscriber starts, it receives the messages from that moment on.
Now, the problem is:
I disconnect the subscriber ( stopping the process ).
There are no active subscribers at this point, as I stopped the only one. The producer continues sending messages, which should be dropped, as there are no connected subscribers anymore…
I restart the original subscriber, it binds, the publisher reconnects... and the subscriber receives all messages produced in the meantime !!
So what I see is that the producer enqueued all messages while the subscriber was down. As soon as the socket reconnected, because the subscriber process restarted, it sent all queued messages.
As I understood from here, a publisher should drop all sent messages when there are no connected subscribers:
ZeroMQ examples
"A publisher has no connected subscribers, then it will simply drop all messages."
Why is this happening?
By the way, I am using C++ over linux for these tests.
I tried setting a different identity on the subscriber when it binds, but it didn't work. The publisher still enqueues messages, and delivers them all when the subscriber restarts.
Thanks in advance,
Luis
UPDATE:
IMPORTANT UPDATE!!! Before posting this question I had tried different solutions. One was to set ZMQ_LINGER to 0, which didn't work. I added ZMQ_IMMEDIATE, and it worked, but I just found out that ZMQ_IMMEDIATE alone does not work; it also requires ZMQ_LINGER. - Luis Rojas
UPDATE:
As per request, I am adding some simple test cases to show my point.
One is a simple subscriber, which runs on command line and receives the uri where to bind, for instance :
$ ./sub tcp://127.0.0.1:50001
The other is a publisher, which receives a list of uris to connect to, for instance :
./pub tcp://127.0.0.1:50001 tcp://127.0.0.1:50002
The subscriber receives up to 5 messages, then closes the socket and exits. We can see on Wireshark the exchange of FIN/ACK, both ways, and how the socket moves to the TIME_WAIT state. Then the publisher starts sending SYN, trying to reconnect (which proves that the ZMQ_PUB side knows the connection was closed).
I am explicitly not unsubscribing the socket, just closing it. In my opinion, if the socket is closed, the publisher should automatically end any subscription for that connection.
So what I see is: I start the subscriber (one or more), I start the publisher, which starts sending messages. The subscriber receives 5 messages and ends. In the meantime the publisher continues sending messages, WITH NO CONNECTED SUBSCRIBER. I restart the subscriber and it immediately receives several messages, because they were queued at the publisher's side. I think those queued messages break the Publish/Subscribe model, where messages should be delivered only to connected subscribers. If a subscriber closes the connection, messages to that subscriber should be dropped. Even more, when the subscriber restarts, it may decide to subscribe to other messages, but it will still receive those subscribed to by a "previous incarnation" that was bound to the same port.
My proposal is that ZMQ_PUB (in connect mode), when detecting a socket disconnection, should clear all subscriptions on that socket, until it reconnects and the NEW subscriber decides to resubscribe.
I apologize for language mistakes, but English is not my native language.
Pub's code:
#include <stdio.h>
#include <stdlib.h>
#include <libgen.h>
#include <unistd.h>
#include <string>
#include <cstring>         // strlen()
#include <zeromq/zmq.hpp>
int main( int argc, char *argv[] )
{
if ( argc < 2 )
{
fprintf( stderr, "Usage : %s <remoteUri1> [remoteUri2...]\n",
basename( argv[0] ) );
exit ( EXIT_FAILURE );
}
std::string pLocalUri( argv[1] );
zmq::context_t localContext( 1 );
zmq::socket_t *pSocket = new zmq::socket_t( localContext, ZMQ_PUB );
if ( NULL == pSocket )
{
fprintf( stderr, "Couldn't create socket. Aborting...\n" );
exit ( EXIT_FAILURE );
}
int i;
try
{
for ( i = 1; i < argc; i++ )
{
printf( "Connecting to [%s]\n", argv[i] );
{
pSocket->connect( argv[i] );
}
}
}
catch( ... )
{
fprintf( stderr, "Couldn't connect socket to %s. Aborting...\n", argv[i] );
exit ( EXIT_FAILURE );
}
printf( "Publisher Up and running... sending messages\n" );
fflush(NULL);
int msgCounter = 0;
do
{
try
{
char msgBuffer[1024];
sprintf( msgBuffer, "Message #%d", msgCounter++ );
zmq::message_t outTask( msgBuffer, strlen( msgBuffer ) + 1 );
printf("Sending message [%s]\n", msgBuffer );
pSocket->send ( outTask );
sleep( 1 );
}
catch( ... )
{
fprintf( stderr, "Some unknown error ocurred. Aborting...\n" );
exit ( EXIT_FAILURE );
}
}
while ( true );
exit ( EXIT_SUCCESS );
}
Sub's code:
#include <stdio.h>
#include <stdlib.h>
#include <libgen.h>
#include <unistd.h>
#include <string>
#include <zeromq/zmq.hpp>
int main( int argc, char *argv[] )
{
if ( argc != 2 )
{
fprintf( stderr, "Usage : %s <localUri>\n", basename( argv[0] ) );
exit ( EXIT_FAILURE );
}
std::string pLocalUri( argv[1] );
zmq::context_t localContext( 1 );
zmq::socket_t *pSocket = new zmq::socket_t( localContext, ZMQ_SUB );
if ( NULL == pSocket )
{
fprintf( stderr, "Couldn't create socket. Aborting...\n" );
exit ( EXIT_FAILURE );
}
try
{
pSocket->setsockopt( ZMQ_SUBSCRIBE, "", 0 );
pSocket->bind( pLocalUri.c_str() );
}
catch( ... )
{
fprintf( stderr, "Couldn't bind socket. Aborting...\n" );
exit ( EXIT_FAILURE );
}
int msgCounter = 0;
printf( "Subscriber Up and running... waiting for messages\n" );
fflush( NULL );
do
{
try
{
zmq::message_t inTask;
pSocket->recv ( &inTask );
printf( "Message received : [%s]\n", inTask.data() );
fflush( NULL );
msgCounter++;
}
catch( ... )
{
fprintf( stderr, "Some unknown error ocurred. Aborting...\n" );
exit ( EXIT_FAILURE );
}
}
while ( msgCounter < 5 );
// pSocket->setsockopt( ZMQ_UNSUBSCRIBE, "", 0 ); NOT UNSUBSCRIBING
pSocket->close();
exit ( EXIT_SUCCESS );
}
Q: Why is this happening?
Because the SUB is actually still connected ( not "disconnected" enough ).
Yes, it might be surprising, but killing the SUB-process, be it on the .bind()- or the .connect()-attached side of the socket's transport-media, does not mean that the Finite-State-Machine of the I/O-pump has "moved" into a disconnected-state.
Given that, the PUB-side has no other option but to consider the SUB-side still alive and connected ( even though the process was silently killed beyond the line-of-sight of the PUB-side ), and for such a "distributed"-state there is a ZeroMQ protocol-defined behaviour ( a PUB-side duty ) to collect all the interim messages for a ( yes, invisibly dead ) SUB-scriber, which the PUB-side still considers fair to be alive ( but which might just be having some temporarily intermittent issues somewhere low, on the transport I/O-levels, or some kind of remote CPU-resource starvation or concurrency-introduced, transiently intermittent { local | remote } blocking states et al ).
So it buffers...
In case your assassination of the SUB-side agent had been a bit more graceful ( using the zeroised ZMQ_LINGER + an adequate .close() on the socket-resource instance ), the PUB-side would recognise the "distributed"-system's system-wide Finite-State-Automaton shift into an indeed "DISCONNECT"-ed state, and a due change-of-behaviour would happen on the PUB-side of the "distributed-FSA": no messages would be stored for this visibly, indeed "DISCONNECT"-ed SUB -- exactly what the documentation states ( a minimal sketch of this graceful close follows below ).
The "distributed-FSA" has but quite weak means to recognise state-change events "beyond its horizon of localhost controls". KILL-ing a remote process, which implements some remarkable part of the "distributed-FSA", is a devastating event, not a method to keep the system working. A good option against such external risks is the graceful shutdown just described.
Sounds complex?
Oh, yes, it is complex, indeed. That's exactly why ZeroMQ has solved this for us, so that we are free to enjoy designing our application architectures on top of these ( already solved ) low-level complexities.
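A minimal sketch of that graceful close, using the cppzmq API from the question's Sub code ( pSocket being the ZMQ_SUB socket instance there ):
int linger = 0;
pSocket->setsockopt( ZMQ_LINGER, &linger, sizeof( linger ) );   // zeroised LINGER: nothing lingers on close
pSocket->close();                                               // a close the PUB-side can recognise as a DISCONNECT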
Distributed-system FSA ( a system-wide FSA of layered composition of sub-FSA-s )
To just imagine what is silently going on under the hood, imagine just having a simple, tandem pair of FSA-FSA - exactly what the pair of .Context() instances tries to handle for us in the simplest-ever 1:1 PUB/SUB scenario, where the use-case KILL-s all the sub-FSA-s on the SUB-side without giving them a chance to acknowledge the intention to the PUB-side. Even the TCP-protocol ( living on both the PUB-side and the SUB-side ) has several state-transitions from the [ESTABLISHED] state to the [CLOSED] state.
A quick X-ray view of a distributed-system's FSA-of-FSA-s ( figures omitted here ): the TCP-protocol FSA ( depicted for clarity ), the PUB-side .socket() instance's behaviour FSA and the SUB-side behaviour FSA ( courtesy nanomsg ).
Bind and Connect, although generally interchangeable, have a specific meaning here.
Option 1:
Change your code to this way and there's no problem:
Publisher should bind to an address
Subscriber should connect to that address
Because if you bind a subscriber and then interrupt it, there is no way the publisher knows that the subscriber is unbound, so it queues the messages for the bound port, and when you restart again on the same port, the queued messages will be drained.
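A minimal sketch of Option 1 against the question's code ( same cppzmq API; pPubSocket / pSubSocket are hypothetical stand-ins for the two pSocket instances, and the endpoint is an example ):
// publisher side: bind to a stable address
pPubSocket->bind( "tcp://*:50001" );
// subscriber side: connect ( and come and go ) instead of binding
pSubSocket->connect( "tcp://127.0.0.1:50001" );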
Option 2:
But if you want to do it your way, you need to do the following things (a minimal sketch follows this list):
Register an interrupt handler (SIGINT) in the subscriber code
On the interrupt of the subscriber do the following:
unsubscribe the topic
close the sub socket
exit the subscriber process cleanly with preferably 0 return code
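A minimal sketch of such a handler, reusing the cppzmq style of the Sub's code above ( the gStopRequested flag and the shutdownSubscriber() helper are hypothetical names ):
#include <csignal>
#include <cstdlib>
#include <zeromq/zmq.hpp>

static volatile std::sig_atomic_t gStopRequested = 0;
static void onSigInt( int ) { gStopRequested = 1; }     // registered via std::signal( SIGINT, onSigInt ) in main()

static void shutdownSubscriber( zmq::socket_t *pSocket )
{
    int linger = 0;
    pSocket->setsockopt( ZMQ_LINGER, &linger, sizeof( linger ) );
    pSocket->setsockopt( ZMQ_UNSUBSCRIBE, "", 0 );      // 1) unsubscribe the topic
    pSocket->close();                                   // 2) close the sub socket
    exit( EXIT_SUCCESS );                               // 3) exit cleanly with a 0 return code
}
// inside the recv loop: if ( gStopRequested ) shutdownSubscriber( pSocket );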
UPDATE:
Regarding the point of identity: do not assume that setting an identity will uniquely identify a connection. If it is left to ZeroMQ, it will assign the identities of the incoming connections using unique arbitrary numbers.
Identities are not used to respond back to the clients in general. They are used for responding back to the clients in case ROUTER sockets are used.
Because ROUTER sockets are asynchronous, whereas REQ/REP are synchronous. In the asynchronous case we need to know to whom we respond. It can be a network address, a random number, a UUID, etc.
UPDATE:
I don't consider this an issue with ZeroMQ, because throughout the guide PUB/SUB is explained in the way that the publisher is generally static (a server, bound to a port) and subscribers come and go along the way (clients which connect to that port).
There is another option which would exactly fit your requirement:
ZMQ_IMMEDIATE or ZMQ_DELAY_ATTACH_ON_CONNECT
Setting the above socket option on the publisher will not let the messages enqueue when there are no active connections to it.
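A minimal sketch of setting it on the question's publisher socket ( cppzmq API; it should be set before .connect() ):
int immediate = 1;
pSocket->setsockopt( ZMQ_IMMEDIATE, &immediate, sizeof( immediate ) );   // queue only towards completed connections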
I cannot set the max rate via ZMQ_RATE (which defaults to a very low 100 kbit/s) on a ZeroMQ multicast socket - the call to zmq_setsockopt() fails (using the C language).
I need the rate to be much higher, as my application involves streaming video.
Can anyone shed any light on this? Here is the stripped-down code to replicate the problem:
void* _context;
void* _responder;
_context = zmq_ctx_new ();
_responder = zmq_socket ( _context, ZMQ_SUB );
int64_t val = 100000;
int rc;
rc = zmq_setsockopt( _responder, ZMQ_RATE, &val, sizeof(int64_t) );
int ze2 = zmq_errno ();
int major, minor, patch;
zmq_version( &major, &minor, &patch );
printf( "DIAG[zmq_setsockopt() API:%d.%d.%d] RC: (%d) ~ Errno: (%d) ~ Error:(%s)\n",
major,
minor,
patch,
rc,
ze2,
zmq_strerror( ze2 )
);
The output of the above is:
DIAG[zmq_setsockopt() API:4.0.4] RC: (-1) ~ Errno: (22) ~ Error: (Invalid argument)
If I change the socket type to ZMQ_PUB, I also get the error. I have tested many rates from 1 to 100000, in various orders of magnitude; all fail the same way.
The version is 4.0.4, running on Windows 7.
The API for zmq_setsockopt() defines only these possible error-states:
EINVAL
The requested option option_name is unknown, or the requested option_len or option_value is invalid.
ETERM
The ØMQ context associated with the specified socket was terminated.
ENOTSOCK
The provided socket was invalid.
EINTR
The operation was interrupted by delivery of a signal.
My suspect is the EINVAL, and a minimum step to move the code closer towards an MCVE is to do this and post the terminal outputs ( note that since ZeroMQ 3.x the ZMQ_RATE option value is an int, not the int64_t used in 2.x, so passing a value of sizeof( int64_t ) is itself enough to trigger EINVAL ):
void* _context;
void* _responder;
assert ( ZMQ_RATE == 8 );            // mod.000: validate zmq.h compliance
assert ( ZMQ_SUB == 2 );             // mod.000: validate zmq.h compliance
_context = zmq_ctx_new ();
assert ( _context );                 // mod.000: validate <context> instance
_responder = zmq_socket ( _context, ZMQ_SUB );
assert ( _responder );               // mod.000: validate <socket> instance
int val = 123;                       // mod.000: enforce (int)
int rc = zmq_setsockopt( _responder, ZMQ_RATE, &val, sizeof(val) );
int ze2 = zmq_errno ();
int major, minor, patch;
zmq_version( &major, &minor, &patch );
printf( "DIAG[zmq_setsockopt() API:%d.%d.%d] RC: (%d) ~ Errno: (%d) ~ Error: (%s)\n",
major,
minor,
patch,
rc,
ze2,
zmq_strerror( ze2 )
);
UPDATE 000.INF:
From ZeroMQ v3.x, filtering happens at the publisher side when using a connected protocol ( tcp:// or ipc:// ). Using the epgm:// protocol, filtering happens at the subscriber side. In ZeroMQ v2.x, all filtering happened at the subscriber side.
UPDATE 000.w7.CHECK:
As per http://technet.microsoft.com/en-us/library/cc957547.aspx you may want to check the registry for the presence/state of the {key: value} HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\IGMPLevel <-- it should not be missing or 0.
That is kind of a morbid part of an "answer", but w7 could have multicast forbidden ( which should not show itself via zmq_errno(), but it seems better to be sure of solid ground rather than building on moving sands, doesn't it? ).
UPDATE 001.w7.Final_step:
In case the indicated EINVAL error also gets returned on a Linux-based system, re-tested for the same scenario, file a ZeroMQ bug report.
Otherwise, the trouble is resolved as being w7-related; consult your localhost administrator for fixing the issue.
The pgm transport implementation requires access to raw IP sockets. Additional privileges may be required on some operating systems for this operation. Applications not requiring direct interoperability with other PGM implementations are encouraged to use the epgm transport instead which does not require any special privileges.
My compound module has multiple layers, as shown in the attached figure.
Here Layer2 has a cPacketQueue buffer and I want the Layer1 module to directly insert packets into this cPacketQueue of Layer2. The Layer1 and Layer2 gates are connected unidirectionally, as shown in the figure.
Layer1Gate --> Layer2Gate
UPDATED:
Layer1 creates packets with different priorities (0-7) and injects them into 8 different cPacketQueues in Layer2, named priorityBuffers[i] (i being the index).
Layer2 then sends self-messages at intervals of 10 ns to poll all these buffers in each iteration and send the packets.
This is all I am doing now. It works fine, but I know that 10 ns polling is definitely not an efficient way to do this and achieve QoS, so I am requesting a better alternative.
I suggest adding a ControlInfo object with a priority to every packet from Layer1, sending the packet using the send() command, then checking the ControlInfo of the received packet in Layer2, and inserting the packet into the specific queue.
Firstly, one should define a class for ControlInfo, for example in common.h:
// common.h
class PriorityControlInfo : public cObject {
  public:
    int priority;
};
Then in C++ code of Layer1 simple module:
#include "common.h"
// ...
// in the method where packet is created
cPacket * packet = new cPacket();
PriorityControlInfo * info = new PriorityControlInfo();
info->priority = 2; // 2 is desired queue number
packet->setControlInfo(info);
send (packet, "out");
And finally in Layer2:
#include "common.h"
// ...
void Layer2::handleMessage(cMessage *msg) {
    cPacket *packet = dynamic_cast<cPacket *>(msg);
    if (packet) {
        cObject *ci = packet->removeControlInfo();
        if (ci) {
            PriorityControlInfo *info = check_and_cast<PriorityControlInfo *>(ci);
            int queue = info->priority;
            delete info;   // the control info is owned by us after removeControlInfo()
            EV << "Received packet to " << queue << " queue.\n";
            priorityBuffers[queue].insert(packet);
            EV << priorityBuffers[queue].info() << endl;
        }
    }
}
Regarding the use of self-messages: I do not clearly understand what your intention is.
Should Layer2 send a packet immediately after receiving it? If yes, why do you use a buffer? In that situation, instead of inserting a packet into a buffer, Layer2 should just send it to Layer3.
Should Layer2 do something else after receiving a packet and inserting it into a buffer? If yes, just call this action (function) in the above handleMessage().
In both of the above variants there is no need to use self-messages.
I am now trying to use the Windows P2P native functions in my application to connect instances of it over the internet. For the testing, I've set up one application that uses PeerGraphCreate to establish a P2P graph and then registers a peer name using PeerPnrpRegister. It then registers for messages using PeerGraphRegisterEvent and enters a loop while the application is listening for events in a thread. This side seems to work fine.
In the second application I open the graph using PeerGraphOpen, which succeeds. I then resolve the peer name from the first app using PeerPnrpResolve. It returns two IPv6 addresses. However, when I feed any of those to the PeerGraphConnect function, it returns an HRESULT reading "Requested address is not valid in its context". I have no idea what's wrong; would anyone be so nice as to provide a clue?
Here is the code of the second application for reference:
HGRAPH hGraph;
HRESULT hr = PeerGraphOpen( L"TestP2PGraph", L"DebugPeer", L"TestPeerDB", NULL, 0, NULL, &hGraph );
if( hr == S_OK || hr == PEER_S_GRAPH_DATA_CREATED )
{
// Connect to PNRP
if( SUCCEEDED( PeerPnrpStartup( PNRP_VERSION ) ) )
{
ULONG numEndpoints = 1;
PEER_PNRP_ENDPOINT_INFO* endpointInfo;
hr = PeerPnrpResolve( L"0.TestBackgroundPeer", L"Global_", &numEndpoints, &endpointInfo );
if( SUCCEEDED( hr ) )
{
PEER_ADDRESS addr;
addr.dwSize = sizeof( PEER_ADDRESS );
addr.sin6 = *((SOCKADDR_IN6*)endpointInfo->ppAddresses[1]);
ULONGLONG connection;
hr = PeerGraphConnect( hGraph, NULL, &addr, &connection );
^^ this reads "Requested address is not valid in its context"
I would be grateful for any help.