I would like to use Boost's shared memory services to do the following. I have begun studying the documentation, but as an aid to that I was hoping someone could provide an example that is close to what I want to do.
Here it is:
Process A will write messages to a buffer area. It will also maintain a map, mapping message ID to information regarding the start location and size of the message in the buffer. Some locking mechanism, possibly a scoped lock, will control access to the map and buffer area.
Process B will read these messages based upon message ID.
Thanks in advance for any constructive suggestions.
Have you looked at the Interprocess - message queue documentation?
It doesn't do exactly what you're asking for, as far as giving each message an ID and such, but you don't go into detail as to why that is necessary. Since there are only two processes, would it work to simply copy the data over to process B?
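If you do need lookup by ID, here is a minimal sketch of the layout described in the question, not a definitive implementation. It uses only standard C++ so it runs in a single process. `MessageRegion`, `write_message` and `read_message` are hypothetical names, and the `std::mutex` is a stand-in: in a real setup I would expect the struct to be constructed inside a `boost::interprocess::managed_shared_memory` segment, with the lock replaced by `boost::interprocess::interprocess_mutex` and `boost::interprocess::scoped_lock`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <string>

// Hypothetical fixed-layout region. It stores offsets rather than raw
// pointers, so the same bytes stay meaningful at any mapping address.
struct MessageRegion {
    static constexpr std::size_t kMaxMessages = 64;
    static constexpr std::size_t kBufferBytes = 4096;

    struct Entry {
        std::uint32_t id;      // message ID
        std::uint32_t offset;  // start of the message inside `buffer`
        std::uint32_t size;    // message length in bytes
    };

    std::mutex mutex;          // stand-in for an interprocess mutex (assumption)
    std::size_t count = 0;     // number of index entries in use
    std::size_t used = 0;      // bytes of `buffer` in use
    std::array<Entry, kMaxMessages> index{};
    std::array<char, kBufferBytes> buffer{};
};

// Process A: append the payload and record where it went.
bool write_message(MessageRegion& r, std::uint32_t id, const std::string& payload) {
    std::lock_guard<std::mutex> lock(r.mutex);
    if (r.count == MessageRegion::kMaxMessages ||
        payload.size() > MessageRegion::kBufferBytes - r.used) {
        return false;  // index or buffer is full
    }
    std::memcpy(r.buffer.data() + r.used, payload.data(), payload.size());
    r.index[r.count++] = {id, static_cast<std::uint32_t>(r.used),
                          static_cast<std::uint32_t>(payload.size())};
    r.used += payload.size();
    return true;
}

// Process B: look up the ID in the index and copy the message out.
bool read_message(MessageRegion& r, std::uint32_t id, std::string& out) {
    std::lock_guard<std::mutex> lock(r.mutex);
    for (std::size_t i = 0; i < r.count; ++i) {
        if (r.index[i].id == id) {
            out.assign(r.buffer.data() + r.index[i].offset, r.index[i].size);
            return true;
        }
    }
    return false;  // unknown ID
}
```

The linear scan keeps the sketch short. A real index could be a map type allocated inside the shared segment, but the locking and the offset bookkeeping would look the same.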
I am working on a Windows system. I need to create shared memory for inter-process communication to share objects (containing pointers as members), or some equivalent way to quickly transfer objects from a generator process to a receiver process. The objects are also very large. How do I do that? The problem is that even if I share the objects, I need a way for the other process to access the locations pointed to by the pointers in the objects, and sharing each of those locations for each object is not feasible.
It's difficult to say without more details, but I would consider a memory mapped file. How you create the file depends on whether you need to communicate between sessions or not. You would also need a notification mechanism when new data was posted. You could do that with a registered message, but again that's only possible if your processes are in the same session/desktop.
I can't really be more specific without knowing the details of the requirement.
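On the pointer problem specifically, one common approach is to store offsets from the start of the shared region instead of raw pointers, so the links stay valid whatever address each process maps the region at. Below is a minimal sketch under that assumption. `Node`, `push_node` and `sum_list` are hypothetical names, and the `std::vector` only stands in for a mapped view (on Windows that would typically come from `CreateFileMapping` and `MapViewOfFile`).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical node type: the link is a byte offset from the start of the
// region, not a raw pointer, so it survives mapping at a different address.
struct Node {
    std::int32_t value;
    std::uint32_t next;  // offset of the next node, or kNull at the end
};

constexpr std::uint32_t kNull = 0xFFFFFFFFu;

// Generator side: append a node to the region and return its offset.
std::uint32_t push_node(std::vector<unsigned char>& region,
                        std::int32_t value, std::uint32_t next) {
    Node n{value, next};
    std::uint32_t off = static_cast<std::uint32_t>(region.size());
    region.resize(region.size() + sizeof(Node));
    std::memcpy(region.data() + off, &n, sizeof(Node));
    return off;
}

// Receiver side: walk the list, resolving every offset against its own base.
std::int64_t sum_list(const unsigned char* base, std::uint32_t head) {
    std::int64_t total = 0;
    for (std::uint32_t off = head; off != kNull;) {
        Node n;
        std::memcpy(&n, base + off, sizeof(Node));
        total += n.value;
        off = n.next;
    }
    return total;
}
```

Copying the region to a second buffer and walking it from the new base gives the same result, which is the property you need across two processes. Boost.Interprocess packages the same idea as `offset_ptr`, if pulling in Boost is an option.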
I am currently writing an Elasticsearch NiFi processor. Individual inserts/writes to ES are not optimal; batching documents is preferred. What would be considered the optimal approach within a NiFi processor to track documents (FlowFiles) and, once a certain number has accumulated, send them as a batch? The part I am most concerned about is when ES is unavailable (down, network partition, etc.), which prevents the batch from succeeding. The primary point of the question is: given that NiFi has content storage for queuing, back-pressure, etc., is there a preferred method for using that to ensure no FlowFiles get lost if a destination is down? Maybe there is another processor I should look at for an example?
I have looked at the Mongo processor, Merge, etc. to try and get an idea of the preferred approach for batching inside of a processor, but can't seem to find anything specific. Any suggestions would be appreciated.
Good chance I am overlooking some basic functionality baked into Nifi. I am still fairly new to the platform.
Thanks!
Great question and a pretty common pattern. This is why we have the concept of a ProcessSession. It allows you to send zero or more things to an external endpoint and only commit once you know it has been ack'd by the recipient. In this sense it offers at-least-once semantics. If the protocol you're using supports two-phase commit style semantics you can get pretty close to the ever elusive exactly-once semantic. Many of the details of what you're asking about here will depend on the destination system's API and behavior.
There are some examples in the apache codebase which reveal ways to do this. One way is if you can produce a merged collection of events prior to pushing to the destination system. Depends on its API. I think PutMongo and PutSolr operate this way (though the experts on that would need to weigh in). An example that might be more like what you're looking for can be found in PutSQL which operates on batches of flowfiles to send in a single transaction (on the destination DB).
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutSQL.java
Will keep an eye on this here, but you can get the attention of a larger NiFi group at users@nifi.apache.org
Thanks
Joe
I'm using the joliver/EventStore library and trying to find a way to open a stream without reading any events from it.
The reason is that I just want to write some events to a specific stream in that store without loading all 10k messages from it.
The way you're expected to use the store is that you always do a GetById first. Even if you new up an Aggregate and Save it, you'll see in the CommonDomain EventStoreRepository that it will first correlate it with the existing data.
The key reason why a read is needed first is that the infrastructure needs to work out how many events have gone before to compute the new commit sequence number.
As for the 10k-event threshold that makes you want to optimize the read away: if you're really going to have that many events, you'll already be in snapshotting territory, as you'll need an appropriately efficient way of doing things other than blind writes too.
Even if you're not intending to lean on snapshotting, half the benefit of using EventStore is that the facility is built in for when you need it.
Is there a way to hook disk writes by a specific application and get the data being written, aside from reading the data after the write or reading process memory and searching for data? It's important for me to get the data before it can be tampered with on the disk. Thanks in advance!
Too little reputation to comment, sorry.
I would have said (to echo Raymond) mini filters would fit your requirements nicely.
Microsoft docs
FltGetRequestorProcessId should allow you to filter by process.
You will still see every request come through; just match the pid you are interested in. If it is not your process, return FLT_PREOP_SUCCESS_NO_CALLBACK and you will not have to deal with that request again.
I was reading code from one of the projects on GitHub and came across something called a Vectored Referencing buffer implementation. Has anyone come across this? What are its practical applications? I did a quick Google search and wasn't able to find any simple sample implementation.
Some insight would be helpful.
http://www.ibm.com/developerworks/library/j-zerocopy/
http://www.linuxjournal.com/article/6345
http://www.seccuris.com/documents/whitepapers/20070517-devsummit-zerocopybpf.pdf
https://github.com/joyent/node/pull/304
I think some more insight on your specific project/usage/etc would allow for a more specific answer.
However, the term generally describes an interface, function, or routine written or changed so that it does not allocate another instance of its input in order to perform its operations.
EDIT: Ok, after reading the new title, I think you are simply talking about pushing buffers into a vector of buffers. This keeps your code clean, you can pass any buffer you need with minimal overhead to any function call, and allows for a better cleanup time if your code isn't managed.
EDIT 2: Do you mean this http://cpansearch.perl.org/src/TYPESTER/Data-MessagePack-Stream-0.07/msgpack-0.5.7/src/msgpack/vrefbuffer.h
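If that is the one, my understanding (an assumption from skimming the header, not a description of its actual code) is that it copies small writes into memory it owns and only keeps a reference to large writes, so the whole sequence can later go to a gather-style write such as `writev` without copying the big payloads. A minimal sketch of that idea in standard C++ follows. `VRefBuffer`, `Slice` and the threshold parameter are hypothetical names and are not the msgpack API.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical sketch of a "vectored referencing" buffer: small writes are
// copied into storage the buffer owns, large writes are only referenced.
class VRefBuffer {
public:
    struct Slice {
        const char* data;
        std::size_t size;
    };

    explicit VRefBuffer(std::size_t ref_threshold)
        : ref_threshold_(ref_threshold) {}

    // Copying would leave slices pointing into the other object's storage.
    VRefBuffer(const VRefBuffer&) = delete;
    VRefBuffer& operator=(const VRefBuffer&) = delete;

    // For writes at or above the threshold the caller must keep `data`
    // alive until the slices have been consumed.
    void write(const char* data, std::size_t size) {
        if (size >= ref_threshold_) {
            slices_.push_back(Slice{data, size});  // reference, no copy
        } else {
            owned_.emplace_back(data, size);       // copy the small write
            slices_.push_back(Slice{owned_.back().data(), owned_.back().size()});
        }
    }

    // The slice list is what a gather-style write would consume.
    const std::vector<Slice>& slices() const { return slices_; }

    // Convenience for the sketch only: concatenate every slice.
    std::string gather() const {
        std::string out;
        for (const Slice& s : slices_) out.append(s.data, s.size);
        return out;
    }

private:
    std::size_t ref_threshold_;
    std::deque<std::string> owned_;  // push_back keeps existing elements in place
    std::vector<Slice> slices_;
};
```

The practical win is for serializers: headers and small fields get packed together, while large blobs the caller already holds are never duplicated.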