I have been working with two C programs for the past few months, one is a ZeroMQ publisher and the other a ZeroMQ subscriber.
They exchange simple string messages between Virtual Machines and everything works fine.
Now, in one of the VMs I've been working on (VM A) I configured an openvswitch and in another VM a Ryu controller. The diagram is the following:
I "bounded" the bridge interface of OVS to the eth3 interface of VM A. Everything works well and flow-entries are added by the Ryu controller or manually added by me.
Now, I want to add the ZeroMQ publisher-subscriber programs I had already used countless times. Here, the controller is the subscriber and OVS the publisher.
However, the messages never arrive at the controller... If I run the ZeroMQ publisher from another machine on net A that does NOT have OVS installed and configured, the messages arrive at the Ryu controller successfully.
When I run the publisher and subscriber, this is the output of netstat -at on both machines. VM A is "OpenWrt" and the Ryu controller is "control" (consider only the last line in #control VM):
Is there something I'm missing? Is it really impossible to send TCP messages from the OVS to the controller? Should I create some kind of tunnel from VM A to the controller where messages would flow through? Or is it just an issue with ZeroMQ that does not work with OpenFlow-enabled architectures?
If any of you ever worked with a message queueing technology in Openflow environments, please let me know.
I appreciate any kind of help, I've been stuck for weeks.
Note: I can ping VM A from controller and vice-versa.
Q : Is there something I'm missing?
Given that no MCVE code was provided, a pair of LoS-visibility (line-of-sight) test results is missing, from:
<ZMQ_SUB_HOST>:~$ traceroute --sport=<ZMQ_TRANSPORT_PORT#> <ZMQ_PUB_HOST>
and
<ZMQ_PUB_HOST>:~$ traceroute --sport=<ZMQ_TRANSPORT_PORT#> <ZMQ_SUB_HOST>
Note: I can ping VM A from controller and vice-versa.
This remark sounds promising, yet both the LoS-visibility test results and the MCVE-code details are important and still missing.
Q : Should I create some kind of tunnel from VM A to the controller where messages would flow through?
Sure, you definitely can. A tunnel will isolate the L3-level issues, and your tunneling path will provide a way to bypass all of them, at the cost of a slight latency trade-off.
Q : Or is it just an issue with ZeroMQ that does not work with Openflow enabled architectures?
There is no specific reason to blame the ZeroMQ infrastructure for ceasing to work, other than some L3/SDN-related issue that remains unspecified so far. Given that a fair tcp://-transport-class path exists and is used in a properly configured ZeroMQ transport infrastructure, consisting of an ad-hoc, dynamic setup of M .bind( <class>://<addr>:<port> )-s to N .connect( <class>://<addr>:<port> )-s relations, reporting no error state(s) on the respective operation(s), ZeroMQ shall and will work as it always does.
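For reference, a minimal sketch of such an M-bind : N-connect PUB/SUB pair over the tcp:// transport-class, shown with pyzmq for brevity (the asker's programs are in C); the addresses and port number below are placeholders, not values taken from the question:

import zmq

ctx = zmq.Context()

# publisher side (runs on the OVS VM): bind on an interface the controller can reach
pub = ctx.socket(zmq.PUB)
pub.bind("tcp://0.0.0.0:5556")              # placeholder port

# subscriber side (runs on the Ryu-controller VM): connect to the publisher
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://192.0.2.10:5556")        # placeholder publisher address
sub.setsockopt_string(zmq.SUBSCRIBE, "")    # subscribe to every topic

If this pair works when the publisher runs on a non-OVS machine but not when it runs on the OVS VM, the problem is in the L3/SDN path, not in ZeroMQ itself.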
Related
I am testing .NET version of ZeroMQ to understand how to handle network failures. I put the server (pub socket) to one external machine and debugging the client (sub socket). If I stop my local Wi-Fi connection for seconds, then ZeroMQ automatically recovers and I even get remaining values. However, if I disable Wi-Fi for longer time like a minute, then it just gets stuck on a frame waiting. How can I configure this period when ZeroMQ is still able to recover? And how can I reconnect manually after, say, several minutes? How can I understand that the socket is locked and I need to kill/open again?
Q :" How can I configure this ... ?"
A :Use the .NET versions of zmq_setsockopt() detailed parameter settings - family of link-management parameters alike ZMQ_RECONNECT_IVL, ZMQ_RCVTIMEO and the likes.
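For illustration only (the answer targets the .NET binding, but the option names carry across bindings), this is how those link-management settings look with pyzmq; the endpoint and the values chosen are placeholders:

import zmq

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.RECONNECT_IVL, 500)        # first reconnect attempt after 500 ms
sub.setsockopt(zmq.RECONNECT_IVL_MAX, 10000)  # back off up to 10 s between attempts
sub.setsockopt(zmq.RCVTIMEO, 2000)            # recv() gives up after 2 s instead of blocking forever
sub.setsockopt_string(zmq.SUBSCRIBE, "")
sub.connect("tcp://example-host:5556")        # placeholder endpoint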
All other questions depend on your code.
If you use the blocking forms of the .recv() methods, you can easily throw yourself into unsalvageable deadlocks; it is best never to block your own code (why would one ever deliberately give up one's own code domain-of-control?).
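A small sketch of keeping control in your own code instead of sitting in a blocking .recv() - again with pyzmq and a placeholder endpoint:

import zmq

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt_string(zmq.SUBSCRIBE, "")
sub.connect("tcp://example-host:5556")

poller = zmq.Poller()
poller.register(sub, zmq.POLLIN)

while True:
    events = dict(poller.poll(timeout=1000))   # wait at most 1 second
    if sub in events:
        print(sub.recv_string())               # a message is guaranteed to be ready here
    else:
        # no message within the timeout: the application stays responsive and can
        # decide for itself whether to retry, reconnect or give up
        pass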
If in a need to indeed understand low-level internal link-management details, do not hesitate to use zmq_socket_monitor() instrumentation ( if not available in .NET binding, still may use another language to see details the monitor-instance reports about link-state and related events ).
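A sketch of the zmq_socket_monitor() instrumentation as exposed by pyzmq (whether the .NET binding exposes it is a separate question); the endpoint is a placeholder:

import zmq
from zmq.utils.monitor import recv_monitor_message

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt_string(zmq.SUBSCRIBE, "")
monitor = sub.get_monitor_socket()        # inproc PAIR socket carrying link-state events
sub.connect("tcp://example-host:5556")

while monitor.poll(5000):                 # stop after 5 s with no event
    evt = recv_monitor_message(monitor)
    print(evt['event'], evt['endpoint'])  # e.g. EVENT_CONNECTED, EVENT_DISCONNECTED, EVENT_CONNECT_RETRIED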
I was able to find an answer on their GitHub https://github.com/zeromq/netmq/issues/845. Seems that the behavior is by design as I got the same with native zmq lib via .NET binding.
I am working on Franca IDL and trying to implement the SOME/IP two device communication. I am referring the below links:
https://at.projects.genivi.org/wiki/pages/viewpage.action?pageId=5472320
https://github.com/GENIVI/vsomeip/wiki/vsomeip-in-10-minutes#request
Current Setup:
Ubuntu 18.04 (two machines - Server & Client)
Two Machines connected over ethernet
But I am actually confused between SOME/IP and vsomeip. Anyhow, I went with link [1] and was able to achieve communication between processes running on a single local machine, but I failed at two-device communication.
Later I followed the same steps in link [2], but even here I was only able to achieve communication between processes running on a single local machine. I again failed at two-device communication: the server was running on one device and the client on another, but no communication was achieved.
I came across the "VSOMEIP - Communication between 2 devices (TCP/UDP) Not working" post here, but couldn't figure out how to proceed further.
My actual aim is to achieve two-device communication using Franca IDL and SOME/IP, i.e. link [1], but I am not finding a single source that I can at least look into.
Any suggestions will help me a lot. Thanks in advance.
After working on it for a few hours, the quick update is: as suggested in the "VSOMEIP - Communication between 2 devices (TCP/UDP) Not working" post, I used the shell script, and now the client and server detect each other over ethernet. But the function call is not happening: the client is not sending the request to the server. To be more clear, in the given example (https://at.projects.genivi.org/wiki/pages/viewpage.action?pageId=5472320) on_availability is working but on_message is not. We are struggling a lot; any suggestion will help us.
I created a custom topology in Mininet and added flow rules to the switches. I can ping the hosts but cannot see the topology on DLUX. I tried other topologies such as single and linear, and these work fine. I do not understand what the problem is with the custom topology. It would help if someone could shed some light.
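For reference, a custom Mininet topology attached to an external controller typically looks roughly like the sketch below; the host/switch names, the controller IP and the port are hypothetical, not the asker's actual script:

from mininet.topo import Topo
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.cli import CLI

class TwoSwitchTopo(Topo):
    # two switches in a line, one host attached to each
    def build(self):
        h1 = self.addHost('h1')
        h2 = self.addHost('h2')
        s1 = self.addSwitch('s1')
        s2 = self.addSwitch('s2')
        self.addLink(h1, s1)
        self.addLink(s1, s2)
        self.addLink(s2, h2)

if __name__ == '__main__':
    net = Mininet(topo=TwoSwitchTopo(), switch=OVSSwitch,
                  controller=lambda name: RemoteController(name, ip='192.168.56.101', port=6633))
    net.start()
    CLI(net)
    net.stop()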
Try restarting ODL, like this person is doing. I would suspect that you are hitting some bug in the l2switch project. But you can debug further by inspecting the flows on each switch in your custom topology. Each switch should have a flow with dl_type=0x88cc that punts to the CONTROLLER. Those are the LLDP packets, which is how ODL learns the links, which in turn is how DLUX paints them in your GUI. If the flows aren't there, then you would want to figure out why: maybe the switches are ignoring the flow programming (check the switch logs), or maybe the flows are not even being sent (you could check the ODL logs, or even do a tcpdump to see if OpenFlow rules are being sent to the switch). If the flows are being programmed, and the LLDP packets are being punted to ODL, then the problem could be internal to ODL and DLUX.
To be fair, DLUX is a stale project that is slated for removal. There may be bugs you are hitting.
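A small helper for the check described above - it shells out to ovs-ofctl and looks for an LLDP (0x88cc) flow with a CONTROLLER action; the bridge names are assumptions, and the -O flag must match the protocol version the bridge is configured for:

import subprocess

def has_lldp_punt_flow(bridge):
    # dump the flow table and look for a match on ethertype 0x88cc that punts to the controller
    out = subprocess.run(["ovs-ofctl", "-O", "OpenFlow13", "dump-flows", bridge],
                         capture_output=True, text=True, check=True).stdout
    return any("0x88cc" in line and "CONTROLLER" in line for line in out.splitlines())

for bridge in ("s1", "s2", "s3"):   # switch names from the custom topology (assumed)
    print(bridge, "LLDP punt flow present:", has_lldp_punt_flow(bridge))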
It's strange that I can ping all of a sudden now without making any changes. I have faced this problem earlier too, where the controller doesn't work for a week or so and then starts running suddenly.
The problem is not from ODL but from the OVS switch; you need to run this command for your switch so it speaks OpenFlow 1.3 to the controller:
sudo ovs-vsctl set bridge s1 protocols=OpenFlow13
http://kspviswa.github.io/Installing-ODL-BE.html
To be specific, I am using Asterisk with a Heartbeat active/passive cluster. There are 2 nodes in the cluster; let's call them Asterisk1 and Asterisk2. Everything is well configured in my cluster. When one of the nodes loses its internet connection, the asterisk service fails, or Asterisk1 is turned off, the asterisk service and the failover IP migrate to the surviving node (Asterisk2).
The problem is that if we were actually processing a call when Asterisk1 went down, asterisk drops the call and I cannot redial until the asterisk service is up on Asterisk2 (5 seconds, not a bad time).
But my question is: is there a way to make asterisk behave like Skype when it loses connection during a call? I mean, not dropping the call but trying to reconnect it, and reconnecting it when the asterisk service is up on Asterisk2?
There are some commercial systems that support such behaviour.
If you want to do it on a non-commercial system, there are 2 ways:
1) Force a call back to all phones with the auto-answer flag. Requirement: guru-level knowledge of Asterisk.
2) Use Xen and a memory mapping/mirroring system to maintain, on the other node, a VPS with the same memory state (the same running Asterisk). Requirement: guru-level knowledge of Xen. See for example this: http://adrianotto.com/2009/11/remus-project-full-memory-mirroring/
Sorry, both methods require a guru knowledge level.
Note: if you do SIP via an OpenVPN tunnel, you will very likely not lose calls inside the tunnel if the internet goes down for up to 20 seconds. That is not exactly what you asked, but it can work.
Since there is no accepted answer after almost 2 years I'll provide one: NO. Here's why.
If you fail over from Asterisk server 1 to Asterisk server 2, then Asterisk server 2 has no idea what calls (i.e. endpoint to endpoint) were in progress (even if you share a database of called numbers, use Asterisk Realtime, etc.). If Asterisk tried to bring up both legs of the call to the same numbers, these might not be the same endpoints of the call.
Another server cannot resume the SIP TCP session of the other server, since that session was closed with the failed server.
The MAC addresses and source/destination ports may be identical, and your firewall will not know you are trying to continue the same session.
etc.....
If your goal is high availability of phone services, take a look at the VoIP Info web site. All the rest (network redundancy, disk redundancy, shared block storage devices, router failover protocol, etc.) is a distraction; focus instead on early DETECTION of failures across all trunks/routes/devices involved with providing phone service, and then on providing the highest degree of recovery without sharing ANY DEVICES. (Too many HA solutions share a disk, channel bank, etc. that creates a single point of failure.)
Your solution would require a shared database that is updated in realtime on both servers. The database would be managed by an event logger that would keep track of all calls in progress, flagged as LINEUP perhaps. In the event a failure was detected, all calls that were on the failed server would be flagged as DROPPEDCALL. When your fail-over server spins up and takes over (using heartbeat monitoring or some such), the first thing it would do is generate a set of call files from all database records flagged as DROPPEDCALL. These calls can then be conferenced together.
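A rough sketch of that recovery step, assuming a simple SQLite table of calls in progress; the table name, columns and dialplan context are hypothetical, while the call-file fields themselves are standard Asterisk:

import os, shutil, sqlite3, tempfile

SPOOL_DIR = "/var/spool/asterisk/outgoing"   # Asterisk picks up call files from here

def requeue_dropped_calls(db_path="calls.db"):
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT id, caller_channel, callee_exten FROM calls WHERE state = 'DROPPEDCALL'"
    ).fetchall()
    for call_id, channel, exten in rows:
        body = ("Channel: %s\n"               # e.g. SIP/1001
                "Context: recovered-calls\n"  # hypothetical dialplan context
                "Extension: %s\n"
                "Priority: 1\n"
                "MaxRetries: 2\n"
                "RetryTime: 10\n" % (channel, exten))
        # write to a temp file first, then move it in, so Asterisk never sees a partial file
        fd, tmp = tempfile.mkstemp(prefix="recover-%s-" % call_id, suffix=".call")
        with os.fdopen(fd, "w") as f:
            f.write(body)
        shutil.move(tmp, os.path.join(SPOOL_DIR, os.path.basename(tmp)))
        con.execute("UPDATE calls SET state = 'REQUEUED' WHERE id = ?", (call_id,))
    con.commit()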
The hardest part about it is the event monitor, ensuring that you don't miss any RING or HANGUP events, potentially leaving a "ghost" call in the system to be erroneously dialed in a recovery operation.
You likely should also have a mechanism to build your Asterisk config on a "management" machine that then pushes changes out to your farm of call-manager AST boxen. That way any node is replaceable with any other.
What you should likely have is 2 DB servers using replication techniques and Linux High-Availability (LHA) (1). Alternately, DNS round-robin or load balancing with a "public" IP would do well, too. These machines will likely be under a light enough load to host your configuration manager as well, with the benefit of getting LHA for "free".
Then, at least N+1 AST Boxen for call handling. N is the number of calls you plan on handling per second divided by 300. The "+1" is your fail-over node. Using node-polling, you can then set up a mechanism where the fail-over node adopts the identity of the failed machine by pulling the correct configuration from the config manager.
If hardware is cheap/free, then 1:1 LHA node redundancy is always an option. However, generally speaking, your failure rate for PC hardware and Asterisk software is fairly low; 3 or 4 "9s" out of the can. So, really, you're trying to cover that last bit of distance to the "5th 9".
I hope that gives you some ideas about which way to go. Let me know if you have any questions, and please take the time to "accept" which ever answer does what you need.
(1) http://www.linuxjournal.com/content/ahead-pack-pacemaker-high-availability-stack
I'm looking for a mechanism to use to create a simple many-to-many messaging system to allow Windows applications to communicate on a single machine but across sessions and desktops.
I have the following hard requirements:
Must work across all Windows sessions on a single machine.
Must work on Windows XP and later.
No global configuration required.
No central coordinator/broker/server.
Must not require elevated privileges from the applications.
I do not require guaranteed delivery of messages.
I have looked at many, many options. This is my last-ditch request for ideas.
The following have been rejected for violating one or more of the above requirements:
ZeroMQ: In order to do many-to-many messaging a central broker is required.
Named pipes: Requires a central server to receive messages and forward them on.
Multicast sockets: Requires a properly configured network card with a valid IP address, i.e. a global configuration.
Shared Memory Queue: To create shared memory in the global namespace requires elevated privileges.
Multicast sockets so nearly work. What else can anyone suggest? I'd consider anything from pre-packaged libraries to bare-metal Windows API functionality.
(Edit 27 September) A bit more context:
By 'central coordinator/broker/server', I mean a separate process that must be running at the time that an application tries to send a message. The problem I see with this is that it is impossible to guarantee that this process really will be running when it is needed. Typically a Windows service would be used, but there is no way to guarantee that a particular service will always be started before any user has logged in, or to guarantee that it has not been stopped for some reason. Running it on demand introduces a delay when the first message is sent, while the service starts, and raises issues with privileges.
Multicast sockets nearly worked because they completely avoid the need for a central coordinator process and do not require elevated privileges from the applications sending or receiving multicast packets. But you have to have a configured IP address - you can't do multicast on the loopback interface (even though multicast with TTL=0 on a configured NIC behaves as one would expect of loopback multicast) - and that is the deal-breaker.
Maybe I am completely misunderstanding the problem, especially the "no central broker", but have you considered something based on tuple spaces?
--
After the comments exchange, please consider the following as my "definitive" answer, then:
Use a file-based solution, and host the directory tree on a Ramdisk to ensure good performance.
I'd also suggest having a look at the following StackOverflow discussion (even if it's Java-based) for possible pointers on how to manage locking and transactions on the filesystem.
This one (.NET based) may be of help, too.
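To make the suggestion concrete, a minimal sketch of the file-drop pattern: every participant writes one file per message into a shared directory (the Ramdisk) and polls for files written by others. The directory path and naming scheme are assumptions:

import glob, os, time, uuid

BUS_DIR = r"C:\msgbus"            # hypothetical shared directory, ideally on a RAM disk
os.makedirs(BUS_DIR, exist_ok=True)
MY_ID = uuid.uuid4().hex          # lets a process skip its own messages

def send(text):
    # write atomically: create under a temporary name, then rename into place
    name = "%d-%s.msg" % (time.time_ns(), MY_ID)
    tmp = os.path.join(BUS_DIR, name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(text)
    os.replace(tmp, os.path.join(BUS_DIR, name))

def poll(seen):
    # return messages from other processes that have not been read yet
    out = []
    for path in sorted(glob.glob(os.path.join(BUS_DIR, "*.msg"))):
        if path in seen or MY_ID in os.path.basename(path):
            continue
        seen.add(path)
        with open(path, encoding="utf-8") as f:
            out.append(f.read())
    return out

A real implementation would also need to clean up old message files and to handle the locking questions raised in the linked discussions.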
How about UDP broadcasting?
Couldn't you use a localhost socket?
/Tony
In the end I decided that one of the hard requirements had to go, as the problem could not be solved in any reasonable way as originally stated.
My final solution is a Windows service running a named pipe server. Any application or service can connect to an instance of the pipe and send messages. Any message received by the server is echoed to all pipe instances.
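A sketch of that design, using pywin32 for brevity; the pipe name, buffer sizes and message framing are assumptions rather than the actual implementation, and a real cross-session broker would also need an explicit security descriptor on the pipe:

import threading
import win32pipe, win32file, pywintypes

PIPE_NAME = r"\\.\pipe\ExampleBus"     # hypothetical pipe name
clients = []                           # connected pipe instances
clients_lock = threading.Lock()

def serve_one_instance():
    # create one pipe instance and wait for a client to connect to it
    handle = win32pipe.CreateNamedPipe(
        PIPE_NAME,
        win32pipe.PIPE_ACCESS_DUPLEX,
        win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_READMODE_MESSAGE | win32pipe.PIPE_WAIT,
        win32pipe.PIPE_UNLIMITED_INSTANCES,
        65536, 65536, 0, None)
    win32pipe.ConnectNamedPipe(handle, None)

    # immediately offer a fresh instance so further clients can connect
    threading.Thread(target=serve_one_instance).start()

    with clients_lock:
        clients.append(handle)
    try:
        while True:
            # read one message from this client and echo it to every connected instance
            _, data = win32file.ReadFile(handle, 65536)
            with clients_lock:
                for h in clients:
                    win32file.WriteFile(h, data)
    except pywintypes.error:
        # client went away: forget its handle
        with clients_lock:
            clients.remove(handle)
        win32file.CloseHandle(handle)

serve_one_instance()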
I really liked p.marino's answer, but in the end it looked like a lot of complexity for what is really a very basic piece of functionality.
The other possibility that appealed to me, though again it fell on the complexity hurdle, was to write a kernel driver to manage the multicasting. There would have been several mechanisms possible in this case, but the overhead of writing a bug-free kernel driver was just too high.