I am using ZeroMQ to communicate between multiple services. I ran into an issue where I was sending responses but they were never reaching the caller. I did a lot of debugging and couldn't figure out what was happening. I eventually reduced the message size (I was returning the results of a query) and the messages started coming in. Then I increased my JVM's memory and the original messages started coming through as well.
This leads me to believe that the messages were too big to fit into memory and that ZeroMQ just dropped them. My question is: how can I properly debug this? Does ZeroMQ output any logs or memory dumps?
I am using the Java version of ZeroMQ.
Q: "...how can I properly debug this?"
Well, if one is aware of the native ZeroMQ API settings, at least the buffering "mechanics" and the pair of { SNDHWM | RCVHWM } high-water-mark limits, one can do some trial-and-error testing to fine-tune these parameters. When a high-water mark is reached, ZeroMQ either blocks the sender or silently drops messages, depending on the socket type, which matches the symptoms you describe.
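A minimal sketch of such tuning in the Java bindings (JeroMQ-style setter names; the endpoint and all the values here are made-up starting points for a trial-and-error sweep, not recommendations):

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class HwmTuning {
        public static void main(String[] args) {
            try (ZContext context = new ZContext()) {
                // High-water marks: how many messages may queue per peer
                // before ZeroMQ blocks the sender or silently drops,
                // depending on the socket type.
                ZMQ.Socket sender = context.createSocket(SocketType.PUSH);
                sender.setSndHWM(10_000);
                // O/S-level kernel buffer size in bytes (the ZMQ_SNDBUF knob).
                sender.setSendBufferSize(4 * 1024 * 1024);

                ZMQ.Socket receiver = context.createSocket(SocketType.PULL);
                receiver.setRcvHWM(10_000);
                receiver.setReceiveBufferSize(4 * 1024 * 1024);
                // Hard cap on a single inbound message (the ZMQ_MAXMSGSIZE
                // knob); peers sending anything bigger are disconnected.
                receiver.setMaxMsgSize(64L * 1024 * 1024);

                receiver.bind("tcp://*:5555");
                sender.connect("tcp://localhost:5555");
            }
        }
    }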
Q: "Does ZeroMQ output any logs or memory dumps?"
Well, no, native ZeroMQ deliberately never attempted to do so. The key priority of the ZeroMQ concept is almost linearly scalable performance, and the Zen-of-Zero philosophy reflecting that excludes any operation which does not help achieve it within a minimalistic low-latency envelope.
Yet newer versions of the native API provide a tool called a socket monitor. That may help you write your own internal socket-event analyser, if you are in such a need.
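In the Java bindings that looks roughly like this (a sketch; the inproc endpoint name is made up):

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class SocketEventWatcher {
        public static void main(String[] args) {
            try (ZContext context = new ZContext()) {
                ZMQ.Socket socket = context.createSocket(SocketType.REP);

                // Publish this socket's lifecycle events on an inproc endpoint.
                socket.monitor("inproc://socket-events", ZMQ.EVENT_ALL);

                // A PAIR socket connected to that endpoint receives the events.
                ZMQ.Socket monitor = context.createSocket(SocketType.PAIR);
                monitor.connect("inproc://socket-events");

                socket.bind("tcp://*:5555");

                while (!Thread.currentThread().isInterrupted()) {
                    ZMQ.Event event = ZMQ.Event.recv(monitor);
                    // Your own "log": connects, disconnects, accepts, closes...
                    System.out.println("event " + event.getEvent()
                            + " on " + event.getAddress());
                }
            }
        }
    }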
My 11+ years with ZeroMQ have never left me in an unsolvable corner. Your best bet is to get the needed insight into the Context()-instance and Socket()-instance parameters, which configure the L3, protocol-dependent and O/S-related buffering attributes (some of which may not be present in the Java bindings, yet the native API shows at its best all the possible and tweakable parameters of the ZeroMQ data-pumping engines).
I have a ROS node that gets image frames from a camera sensor and publishes image messages to a topic of type sensor_msgs::msg::Image. I run a ros2 executable which deploys the node. I notice that the rate at which the camera sensor provides the frames is 30 fps, but the frame rate reported by "ros2 topic hz" is comparatively quite low, around 10 Hz. I verified this using the output of "ros2 topic echo", wherein only around 10 messages were published with the same "sec" (second) value.
So it seems that a large overhead is involved in the topic publishing mechanism.
Most likely, entire image frames are being copied, which is causing the low fps. I would like to confirm whether this is indeed the case, that is, does ROS 2 copy the entire message while publishing to a topic? And if yes, what are the workarounds? It seems that using intra-process communication (using components) might be one. But note that I am only deploying one node and publishing messages to a topic from it, that is to say, there is no second node consuming those messages yet.
Regards
I can think of a couple of reasons why ros2 topic hz is reporting a lower frequency than expected.
There are known performance issues with Python publishers and large data (like images). Improvements have been made, but the issues still exist in older versions of ROS 2 (Galactic or earlier) (related GitHub issue). I don't know if these issues affect Python subscriptions, but I imagine there is some overhead in converting from C to Python (which is what ros2 topic hz is doing). You could try subscribing with a C++ node and see if that makes any difference.
The underlying ROS middleware (RMW) may also be a source of increased latency. There have been various issues documented in trying to send large data. You can check out this documentation on tuning the middleware for your use case: https://docs.ros.org/en/rolling/How-To-Guides/DDS-tuning.html
To take advantage of intra-process communication, I recommend writing your publisher and subscriber nodes in C++ as components, which have the flexibility of being run in their own process or loaded into a common process (letting them use pointers to pass around data). You can also configure the RMW to use shared memory (agnostic to how you're using ROS), but I won't get into that here since it depends on which RMW you are using.
You can try using the usb_cam package to get the camera feed.
This package is written in C++ and publishes with the sensor-data QoS profile, so you should get the best speed.
Installation:
sudo apt-get install ros-<ros2-distro>-usb-cam
ros2 run usb_cam usb_cam_node_exe
ros2 run image_transport republish compressed raw --ros-args --remap in/compressed:=image_raw/compressed --remap out:=image_raw/uncompressed
You can then echo the topic image_raw/uncompressed.
Link: https://index.ros.org/r/usb_cam/
In the RSocket documentation it says that using PayloadDecoder.ZERO_COPY will reduce latency and increase performance, but:
You must free the Payload when you are done with them or you will get a memory leak.
I'm using Spring, and I can see lots of examples where ZERO_COPY is used (e.g. here), but no examples of special precautions to free the payload.
Can someone say what, if anything, needs to be done in this regard when using this feature?
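To make the question concrete: with the raw rsocket-java API (not Spring's abstractions), my understanding is that manual handling would look roughly like this sketch, where the responder copies the data out and then releases the payload (the server setup, port, and handler logic here are made up):

    import io.rsocket.SocketAcceptor;
    import io.rsocket.core.RSocketServer;
    import io.rsocket.frame.decoder.PayloadDecoder;
    import io.rsocket.transport.netty.server.TcpServerTransport;
    import io.rsocket.util.DefaultPayload;
    import reactor.core.publisher.Mono;

    public class ZeroCopyResponder {
        public static void main(String[] args) {
            RSocketServer.create(SocketAcceptor.forRequestResponse(payload -> {
                try {
                    // Copy the data out while the pooled buffer is still valid.
                    String request = payload.getDataUtf8();
                    return Mono.just(DefaultPayload.create("echo: " + request));
                } finally {
                    // With ZERO_COPY the payload wraps a pooled Netty buffer;
                    // release it explicitly once done reading, or it leaks.
                    payload.release();
                }
            }))
            .payloadDecoder(PayloadDecoder.ZERO_COPY)
            .bind(TcpServerTransport.create("localhost", 7000))
            .block()
            .onClose()
            .block();
        }
    }

What I can't tell is whether Spring's own handlers already do the equivalent release for me.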
I am currently in the process of writing an Elasticsearch NiFi processor. Individual inserts / writes to ES are not optimal; batching documents is preferred instead. What would be considered the optimal approach within a NiFi processor to track (batch) documents (FlowFiles) and, once at a certain amount, send them in as a batch? The part I am most concerned about is when ES is unavailable (down, network partition, etc.) and the batch fails. The primary point of the question: given that NiFi has content storage for queuing / back-pressure, etc., is there a preferred method for using that to ensure no FlowFiles get lost if a destination is down? Maybe there is another processor I should look at for an example?
I have looked at the Mongo processor, Merge, etc. to try to get an idea of the preferred approach for batching inside of a processor, but can't seem to find anything specific. Any suggestions would be appreciated.
Good chance I am overlooking some basic functionality baked into NiFi. I am still fairly new to the platform.
Thanks!
Great question and a pretty common pattern. This is why we have the concept of a ProcessSession. It allows you to send zero or more things to an external endpoint and only commit once you know the recipient has ack'd them. In this sense it offers at-least-once semantics. If the protocol you're using supports two-phase-commit style semantics, you can get pretty close to the ever-elusive exactly-once semantics. Many of the details of what you're asking about will depend on the destination system's API and behavior.
There are some examples in the Apache codebase which show ways to do this. One way, if the destination system's API allows it, is to produce a merged collection of events prior to pushing them. I think PutMongo and PutSolr operate this way (though the experts on those would need to weigh in). An example that might be more like what you're looking for can be found in PutSQL, which operates on batches of FlowFiles sent in a single transaction (on the destination DB).
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutSQL.java
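For the shape of that pattern, here is a sketch of a batching onTrigger in the PutSQL style, inside your AbstractProcessor subclass (imports and the class declaration omitted); session.get(int), read, transfer, and rollback(boolean) are real ProcessSession calls, while sendBulk and REL_SUCCESS stand in for your Elasticsearch bulk call and the processor's success relationship:

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session)
            throws ProcessException {
        // Pull up to one batch of FlowFiles off the incoming queue.
        final List<FlowFile> batch = session.get(100);
        if (batch.isEmpty()) {
            return;
        }
        try {
            final List<String> documents = new ArrayList<>();
            for (final FlowFile flowFile : batch) {
                // Read each FlowFile's content into the bulk request body.
                try (final InputStream in = session.read(flowFile)) {
                    documents.add(new String(in.readAllBytes(), StandardCharsets.UTF_8));
                }
            }
            sendBulk(documents);                  // hypothetical ES _bulk request
            session.transfer(batch, REL_SUCCESS); // ack'd by ES: safe to commit
        } catch (final Exception e) {
            // ES down, network partition, etc.: return the FlowFiles to the
            // incoming queue (penalized), so nothing is lost; the content
            // repository keeps the data and back-pressure builds upstream.
            session.rollback(true);
        }
    }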
Will keep an eye here, but you can also get the attention of the larger NiFi group at users@nifi.apache.org.
Thanks
Joe
I am trying to gauge the performance of RabbitMQ when my message size increases to a few MB. However, even when I send a 32KB message, I get a "Resource temporarily unavailable" message from the server. There's no error in the log files, and there are no memory-limit errors... How do I go about debugging this issue?
If it's of any help, I'm running this on an EC2 t1.micro instance, so 592MB of RAM.
According to the bug you linked, someone recently (looks like after you left the link to the bug) commented that they can reliably reproduce the bug when the message size is >= 15821 bytes.
I would recommend that you see whether that also holds true for you, i.e. whether you can also reproduce it at that threshold, and then evaluate whether staying under that amount (thus avoiding the bug documented in the issue above) is a sufficient size for your needs. If not, you may want to try pika (https://github.com/pika/pika) and see if it works better with larger messages (one of the other comments on that bug suggests that pika did work for them with larger message sizes).
Another option that may work, depending on your exact use case, would be to include in the RabbitMQ message payload a key of sorts that allows you to fetch the large blob of data from wherever it's stored (Postgres, MongoDB, etc.) when you consume the message, and thereby avoid the bug. Perhaps not ideal if you really want to encapsulate everything inside the payload, but it may be a feasible workaround.
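Sketched with the RabbitMQ Java client (rather than rabbitpy, just to show the shape of it), and with an in-memory map standing in for the external store, that "claim check" idea looks roughly like this:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    public class ClaimCheckProducer {
        // Stand-in for a real external store (Postgres, MongoDB, S3, ...).
        static final Map<String, byte[]> blobStore = new ConcurrentHashMap<>();

        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection connection = factory.newConnection();
                 Channel channel = connection.createChannel()) {
                channel.queueDeclare("work", true, false, false, null);

                byte[] largeBlob = new byte[32 * 1024 * 1024]; // the multi-MB payload

                // Store the blob out-of-band and publish only a small key.
                String key = UUID.randomUUID().toString();
                blobStore.put(key, largeBlob);
                channel.basicPublish("", "work", null,
                        key.getBytes(StandardCharsets.UTF_8));
                // The consumer reads the key and fetches the blob from the store.
            }
        }
    }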
In terms of debugging, since it appears that this is a bug with rabbitpy itself, I think you would need to debug the actual rabbitpy library if you wanted to proceed on that front. Doable, but perhaps not feasible due to time, etc.
On Mac OS X, I have a process which produces JSON objects, and another intermittent process which should consume them. The producer and consumer processes are independent of each other. Objects will be produced no more often than every 5 seconds, and will typically be several hundred bytes, but may sometimes range up into the megabytes. The objects should be communicated first-in-first-out. The consumer may or may not be running when the producer is producing, and may or may not read objects immediately.
My boneheaded solution is:
Create a directory.
Producer writes each JSON object to a text file, names it with a serial number.
When Consumer launches, it reads and then deletes files in serial-number order, and while it is running, uses FSEvents to watch this directory for new files arriving.
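For what it's worth, the mechanics of that scheme are language-agnostic; here is a minimal consumer sketch (shown in Java with WatchService standing in for FSEvents, and a made-up spool directory; zero-pad the serial numbers, e.g. 000042.json, so lexicographic order is numeric order):

    import java.io.IOException;
    import java.nio.file.FileSystems;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardWatchEventKinds;
    import java.nio.file.WatchKey;
    import java.nio.file.WatchService;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class QueueConsumer {
        public static void main(String[] args) throws IOException, InterruptedException {
            Path dir = Paths.get("/tmp/json-queue"); // hypothetical spool directory

            WatchService watcher = FileSystems.getDefault().newWatchService();
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

            // First drain whatever the producer wrote while we weren't running.
            drain(dir);

            // Then consume new arrivals as they appear.
            while (true) {
                WatchKey key = watcher.take();
                key.pollEvents(); // clear the events; we re-scan the directory
                drain(dir);
                key.reset();
            }
        }

        // Read and then delete files in serial-number order.
        static void drain(Path dir) throws IOException {
            List<Path> files;
            try (Stream<Path> s = Files.list(dir)) {
                files = s.sorted().collect(Collectors.toList());
            }
            for (Path p : files) {
                String json = Files.readString(p);       // one JSON object per file
                System.out.println("consumed: " + json); // hand off to real handler
                Files.delete(p);
            }
        }
    }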
Is there any easier or better way to do this?
The modern way to do this, as of Lion, is to use XPC. Unfortunately, there's no good documentation for it; there's a broad overview in the Daemons and Services guide and a primitive HeaderDoc-generated reference, but the best way to get introduced to it is to watch the session about it from last year's WWDC.
With XPC, you won't have to worry about keeping serial numbers serial, contending for a spinning disk, or whether there's enough disk space. Indeed, you don't even have to generate and parse JSON data at all, since XPC's communication mechanism is built around JSON-esque/plist-esque container and value objects.
Assuming you want the consumer to see the old files, this is the way it's been done since the beginning of time, loathsome though it may be.
There are lots of highish-tech things that look cleaner, but honestly they just tend to add complexity and/or deployment infrastructure that adds hassle. What you suggest works, it works well, and it's easy to write and maintain. You might need some kind of sentinel files to track what you are doing for crash recovery, but that's probably about it.
Hell, most people would just poll with a sleep 5. At least you are all up in the FSEvents.
Now if it were acceptable to lose the events generated when the listener wasn't around, and perf were paramount, it could get more interesting. :)