Using RSocket `PayloadDecoder.ZERO_COPY` correctly - spring-boot

In the RSocket documentation it says that using PayloadDecoder.ZERO_COPY will reduce latency and increase performance but:
You must free the Payload when you are done with them or you will get a memory leak.
I'm using Spring and I can see lots of examples where ZERO_COPY is used - (e.g. here) - but no examples of special precautions to free the payload.
Can someone say what, if anything, needs to be done in this regard when using this feature.

Related

ZeroMQ is dropping messages

I am using ZeroMQ to communicate between multiple services. I ran into an issue where I was sending responses but they were never getting to the caller. I did a lot of debugging and couldn't figure out what was happening. I eventually reduced the message size (I was returning the results of a query) and the messages started coming in. Then I increased the memory size of my JVM and the original messages started coming back.
This leads me to believe that the messages were too big to fit into memory and ZeroMQ just dropped them. My question is, how can I properly debug this? Does ZeroMQ output any logs or memory dumps?
I am using the Java version of ZeroMQ.
Q : "...how can I properly debug this?"
Well, if one were aware of the native ZeroMQ API settings, at least of both the buffering "mechanics" and the pair of { SNDHWM | RCVHWM }-hard-cut-off limits, one might do some trial-error testing for fine-tuning these parameters.
Q : "Does ZeroMQ output any logs or memory dumps?"
Well, no, the native ZeroMQ knowingly did not ever attempted to do so. The key priority of the ZeroMQ concept is the almost linearly scalable performance and the Zen-of-Zero reflecting that excluded any single operation, that did not support achieving this at minimalistic low-latency envelopes.
Yet, newer versions of the native API provide a tool called socket-monitor. That may help you write your own internal socket-events' analyser, if in such a need.
My 11+ years with ZeroMQ have never got me into an unsolvable corner. Best get the needed insights into the Context()-instance and Socket()-instance parameters, that will better configure the L3, protocol-dependent and O/S-related buffering attributes ( some of which need not be present in the java-bindings, yet the native API shows at its best all the possible and tweakable parameters of the ZeroMQ data-pumping engines ).

Use Cases of NIFI

I have a question about Nifi and its capabilities as well as the appropriate use case for it.
I've read that Nifi is really aiming to create a space which allows for flow-based processing. After playing around with Nifi a bit, what I've also come to realize is it's capability to model/shape the data in a way that is useful for me. Is it fair to say that Nifi can also be used for data modeling?
Thanks!
Data modeling is a bit of an overloaded term, but in the context of your desire to model/shape the data in a way that is useful for you, it sounds like it could be a viable approach. The rest of this is under that assumption.
While NiFi employs dataflow through principles and design closely related to flow based programming (FBP) as a means, the function is a matter of getting data from point A to B (and possibly back again). Of course, systems aren't inherently talking in the same protocols, formats, or schemas, so there needs to be something to shape the data into what the consumer is anticipating from what the producer is supplying. This gets into common enterprise integration patterns (EIP) [1] such as mediation and routing. In a broader sense though, it is simply getting the data to those that need it (systems, users, etc) when and how they need it.
Joe Witt, one of the creators of NiFi, gave a great talk that may be in line with this idea of data shaping in the context of Data Science at a Meetup. The slides of which are available [2].
If you have any additional questions, I would point you to check out the community mailing lists [3] and ask any additional questions so you can dig in more and get a broader perspective.
[1] https://en.wikipedia.org/wiki/Enterprise_Integration_Patterns
[2] http://files.meetup.com/6195792/ApacheNiFi-MD_DataScience_MeetupApr2016.pdf
[3] http://nifi.apache.org/mailing_lists.html
Data modeling might well mean many things to many folks so I'll be careful to use that term here. What I do think in what you're asking is very clear is that Apache NiFi is a great system to use to help mold the data into the right format and schema and content you need for your follow-on analytics and processing. NiFi has an extensible model so you can add processors that can do this or you can use the existing processors in many cases and you can even use the ExecuteScript processors as well so you can write scripts on the fly to manipulate the data.

Infinispan+kyro/Google Protocol Buffers to achieve more space and time efficient serialization?

If I understand correctly, Infinispan/JBoss Cache uses Java's own serialization mechanism, which can be slow and takes relatively more storage space. I have been looking for alternatives which can achieve the following:
Automatic cached management, in other words objects that are used more frequently are automatically loaded into memory
More efficient serialization (perhaps object --> compact binary stores). The primary goal is less disk/memory space consumption without sacrificing too much performance
Is there a framework or library that achieves both?
JBoss Cache did use Java Serialization but Infinispan does not. Instead it uses JBoss Marshalling to provide tiny payloads and catching of streams. If you enable storeAsBinary in Infinispan, it will store Java objects in their marshalled form.
Re 1. Not in either products yet.
Re 2. Supported in Infinispan using storeAsBinary. More info in https://docs.jboss.org/author/display/ISPN/Marshalling
Btw, if this does not convince you, you can always let Protobufs generate the byte[] that you need and you can stick it inside Infinispan.

Recommended size of data returned by EJB API call

does anyone know if there is any limitation for the size of data that can be obtained in the output of EJB API call?
Let's say output to API should be an array of some complex objects. How long array can go?
We are planning to use pagination for retrieving data by portions and want to determine the ideal size of portion/bulk
does anyone know if there is any limitation for the size of data that can be obtained in the output of EJB API call?
To my knowledge, there is no hard limit on the size of the object returned in an RMI call. In practice, you might be limited by memory resources... and "time" (e.g. a transaction timeout) but this shouldn't happen if you're not doing insane things.
Let's say output to API should be an array of some complex objects. How long array can go?
Even if the answer to the previous question had been "X", I hope you realize it would still have been impossible to answer this one.
We are planning to use pagination for retrieving data by portions and want to determine the ideal size of portion/bulk
IMO, this is more a usability issue than a technical issue so I suggest to discuss it with your usability expert.
Talk to the customer/product owner/project placeholder and ask for their expectations. Write test automating the specification. Tweak the code until the tests are met.

How are Process Explorer's memory metrics: WS Private, WS Shareable, WS Shared columns calculated?

I am continuing my saga to understand memory consumption by VB6 application.
The option that seems to work best so far is to monitor various memory metrics at key points at run-time and understand where big memory hogs are.
The measure driver to study this, is to understand how the application scalability in multi-user environment in Terminal Server (Citrix) is impacted due to changes in memory consumption (in simple terms more memory you use, less users you can fit on the server).
I can get most memory metrics for the process using GetProcessMemoryInfo.
Process Explorer reports additional metrics WS Private, WS Shareable, WS Shared - which seem very interesting for my investigation.
So question is, is there standard/hidden API to get these metric for a process? I would like to query these metrics programatically, so that I can capture them at key spots during application run and understand memory usage better.
See the QueryWorkingSet API. This looks rather nasty to use though, as it returns info on a per-page basis and would therefore leave it up to you to aggregate the totals. If there's a better method, please leave a comment and I'll delete this answer.
Also, if you have specific places in mind where you want to monitor changes in the working set, you might want to check out the InitializeProcessForWsWatch and GetWsChanges APIs -- these might make it easier to see how many pages have been faulted in rather than having to walk the entire page set before and after.

Resources