How to transfer NiFi flowfiles from one queue to another? - apache-nifi

I have "unmatched" flowfiles in a queue. Is there any way to transfer these flowfiles into another queue?
EDIT: Resolved with @Andy's suggested solution.

There isn't a way to directly transfer between queues because it would take away the meaning of how those flow files got in the queue. They have to pass through the previous processor which is making the decision about which queue to place them in. You can create loops using a processor that does nothing like UpdateAttribute, and then connect that back to the original processor.

Bryan's answer is comprehensive and explains the ideal process for on-going success. If this is a one-time task (I have this queue that contains data I was using during testing; now I want it to go to this other processor), you can simply select the queue containing the data and drag the blue endpoint to the other component.

Related

IBM MQ copy every 5th message to another queue

I have a queue with a huge message throughput. I would like to create new queue for lower environments. This new queue shouldn't be a 1-to-1 copy since it is going to cost too much. I would like to copy every nth (e.g. 5th) message to the copied queue. Can this be done?
There is a new feature called "streaming queues", introduced with MQ 9.2.3 / 9.3.0. It lets every message that is put to a specific queue be duplicated to another queue (the stream queue). To configure it, you set two new attributes on your original target queue: STREAMQ( ) to specify the stream queue and STRMQOS( ) to choose the quality of service (refer to the documentation).
To meet your requirement ("every nth message"), however, the application that processes the stream queue would have to act on only every nth message and discard the rest, if you really want to process only a subset of them.
I know this is not a perfect answer to your question, as this solution comes with redundant queuing of messages you don’t want, but I am not aware of any other out-of-the-box solution.
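
A minimal sketch of the "keep every nth message, drop the rest" filtering described above. It is plain Python over any iterable and does not call an MQ client; the name every_nth and the sample messages are made up for illustration, and reading from the real stream queue would have to be wired in separately.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def every_nth(messages: Iterable[T], n: int) -> Iterator[T]:
    """Yield the n-th, 2n-th, 3n-th ... message and silently drop the others."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for count, message in enumerate(messages, start=1):
        if count % n == 0:
            yield message


# Stand-in for messages consumed from the stream queue (hypothetical data).
sample = [f"msg-{i}" for i in range(1, 13)]
kept = list(every_nth(sample, 5))
print(kept)
```

Every message still travels through the stream queue; the filter only decides which ones the lower environment actually processes.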

MassTransit MessageData Management

I have started to make greater use of the message data feature of MassTransit and am getting to the point of needing to manage the message data in the store, i.e. remove old data.
The obvious choice is to have some outside process tidy up data, but clearly a scheduled (or not) clean up could remove data still in use or referenced by error or dead letter queues.
Ideally I would like to limit stored message data retention to messages only in error or dead letter queues, and automatically remove data for messages that have been successfully processed.
What would be the best approach to achieve this with MassTransit? Perhaps with a MiddleWare approach or similar, and if that is the case what is the correct approach?
Manual cleanup is recommended, using whatever makes sense for the repository in use. Because messages may still be in queues, or in error/dead-letter queues as you pointed out, it is really up to the development/operations team to know when the right time is to remove older message data.
I'd suggest monitoring and managing the error/dead-letter queues more aggressively, keeping them empty. And then, just figure a good timeframe to delete old message data - one week, ten days, whatever - and deal with it that way.
I have had a backlog item to come up with a way to automatically manage message data, but since message data can be forwarded (using the same stored data) either via publish or send, there is no good way to track references.
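
As a rough illustration of the "pick a timeframe and delete older data" approach, here is a sketch that assumes the message data lives as plain files in a directory. It does not use any MassTransit API; remove_expired, the seven-day retention and the file names are assumptions for the example, and a different repository (blob storage, a database) would need its own equivalent.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List

SECONDS_PER_DAY = 86400


def remove_expired(store_dir: Path, max_age_days: float) -> List[Path]:
    """Delete files whose modification time is older than the retention window."""
    cutoff = time.time() - max_age_days * SECONDS_PER_DAY
    removed = []
    # Materialise the listing first so deletion does not disturb the traversal.
    for path in list(store_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


# Demo against a throwaway directory with one stale and one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    old_file = store / "old.bin"
    new_file = store / "new.bin"
    old_file.write_bytes(b"stale payload")
    new_file.write_bytes(b"fresh payload")
    ten_days_ago = time.time() - 10 * SECONDS_PER_DAY
    os.utime(old_file, (ten_days_ago, ten_days_ago))

    removed_names = [p.name for p in remove_expired(store, max_age_days=7)]
    remaining_names = sorted(p.name for p in store.iterdir())

print(removed_names, remaining_names)
```

This only works safely if the error/dead-letter queues are kept empty within the same window, since nothing here checks whether a message still references the data.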

TCP replication of topics

According to the documentation here: https://github.com/OpenHFT/Chronicle-Engine one is able to do pub/sub using maps. This allows one to create a construct similar to topics that are available in middleware such as Tibco, 29W, Kafka and use that as a way of sending events across processes. Is this a recommended usage of chronicle map? What kind of latency can I expect if both publisher and subscriber stay in the same machine?
My second question is, how can this be extended to send messages across machines? How does this work with enterprise TCP replication?
My requirement is to create thousands of topics and use them to communicate across processes running in different machines (in a LAN). Each of these topics would be written by a single source and read by multiple readers running in same or different machines. If the source of a particular topic dies, that source's replica would start writing to the topic and listeners will continue to receive messages. These messages need not be stored for replay.
Is this a recommended usage of chronicle map?
Yes, you can use engine to support event notification across a machine. However, if you want lowest latencies you might need to send a notification via Queue and keep the latest value in a map.
What kind of latency can I expect if both publisher and subscriber stay in the same machine?
It depends on your use case, especially the size of the data (and, in the map's case, the number of entries as well). The latency for Map in Engine is around 30 - 100 us, whereas the latency for Queue is around 2 - 5 us.
My second question is, how can this be extended to send messages across machines?
For this you need our licensed product but the code is the same.
Each of these topics would be written by a single source and read by multiple readers running in same or different machines. If the source of a particular topic dies, that source's replica would start writing to the topic and listeners will continue to receive messages.
Most likely, the simplest solution is to have a Map where each topic is a different key. This will send the latest value for that topic to the consumers.
If you need to record every event, a Queue is likely to be a better choice. If you don't need to retain the data for long, you can use a very short file rotation.
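
To make the difference between the two options concrete, here is a conceptual sketch in plain Python. It does not use the Chronicle Map, Queue or Engine APIs; the dict stands in for a map keyed by topic and the list for a queue, and the topic name and prices are invented.

```python
from collections import defaultdict

latest = {}                  # map-style: topic -> most recent value only
history = defaultdict(list)  # queue-style: topic -> every event in order


def publish(topic, value):
    """Record a value both ways so the two views can be compared."""
    latest[topic] = value         # overwrites the previous value
    history[topic].append(value)  # keeps the previous values


for price in (100.0, 100.5, 101.0):
    publish("prices/ABC", price)

# A reader that only needs the current state sees one value per topic.
print(latest["prices/ABC"])
# A reader that must see every event gets the full sequence.
print(history["prices/ABC"])
```

With thousands of topics, the map-style view stays at one entry per topic, while the queue-style view grows with every event until it is rotated away.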

read messages from JMS MQ or In-Memory Message store by count

I want to read messages from JMS MQ or In-memory message store based on count.
For example, I want to start reading the messages when the message count reaches 10; until then I want the message processor to be idle.
I want this to be done using WSO2 ESB.
Can someone please help me?
Thanks.
I'm not familiar with WSO2, but from an MQ perspective, the way to do this would be to trigger the application to run once there are 10 messages on the queue. There are trigger settings for this, specifically TRIGTYPE(DEPTH).
To expand on Morag's answer, I doubt that WSO2 has built-in triggers that would monitor the queue for depth before reading messages. I suspect it just listens on a queue and processes messages as they arrive. I also doubt that you can use MQ's triggering mechanism to directly execute the flow conveniently based on depth. So although triggering is a great answer, you need a bit of glue code to make that work.
Conveniently, there's a tutorial that provides almost all the information necessary to do this. Please see Mission:Messaging: Easing administration and debugging with circular queues for details. That article has the scripts necessary to make the Q program work with MQ triggering. You just need to make a couple changes:
Instead of sending a command to Q to delete messages, send a command to move them.
Ditch the math that calculates how many messages to delete and either move them in batches of 10, or else move all messages until the queue drains. In the latter case, make sure to tell Q to wait for any stragglers.
Here's what it looks like when completed: The incoming messages land on some queue other than the WSO2 input queue. That queue is triggered based on depth so that the Q program (SupportPac MA01) copies the messages to the real WSO2 input queue. After the messages are copied, the glue code resets the trigger. This continues until there are fewer than 10 messages on the queue, at which time the cycle idles.
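
The cycle above can be simulated in a few lines. This is only an in-memory model using Python deques, not MQ triggering, the Q program or WSO2; the names staging, target and drain_in_batches are made up, and the batch-of-10 variant is the one shown.

```python
from collections import deque

TRIGGER_DEPTH = 10  # depth at which the staging queue "fires"


def drain_in_batches(staging: deque, target: deque,
                     batch_size: int = TRIGGER_DEPTH) -> int:
    """Move full batches to the target while the trigger depth is reached.

    Returns the number of messages moved. Anything short of a full batch
    stays on the staging queue until more messages arrive.
    """
    moved = 0
    while len(staging) >= batch_size:
        for _ in range(batch_size):
            target.append(staging.popleft())  # preserves arrival order
        moved += batch_size
    return moved


staging = deque(f"m{i}" for i in range(23))  # 23 hypothetical messages
target = deque()
moved = drain_in_batches(staging, target)
print(moved, len(staging), len(target))
```

With 23 messages waiting, two batches are moved and the remaining three sit idle until the depth reaches 10 again.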
I solved it by pushing the messages to a database and then fetching them according to the required count. Take a look at my answer.

How can I monitor/manage queue in ZeroMQ?

First of all, I'm new to ZeroMQ and message queue systems, so what I'm trying to do may be solved through a different approach. I'm designing a messaging system that does the following:
Multiple clients connect to a broker and send the id of an item that needs to be processed. The client disconnects immediately and does not wait for a response.
The broker sends items to workers, one item per worker, to perform some processing. Each worker returns a signal that the processing was completed.
I have a rudimentary system setup which is processing requests/replies correctly, but I'd also like to be able to do the following:
Query the broker to see how many processes are actually running on the workers and how many are simply waiting to be run.
Have the broker ensure that only one process per id is running - if a duplicate id arrives and that item is not currently being processed by a worker, do not add it to the queue.
I'm using a poll setup with broker/dealer sockets. The code I'm using is very similar to this example from Ian Barber.
My first inclination (although I'm not sure how to implement it in zmq) is to have the broker keep track of the ids that have been received, and those that are actively being processed by workers. It seems that the broker forwards requests to workers immediately, regardless of whether or not they are available to actually run the processing. The workers then queue up the ids and process them in order. This isn't ideal since I'm looking to be able to monitor and control what is going on in the system centrally to achieve reliability.
Anyways, any hints, tips or examples of this type of setup would be greatly appreciated.
ZeroMQ is, in my opinion, best used in broker-less designs, for which the library is designed. If you want to monitor the number of items in a queue, or throughput, or whatever, you're going to have to build that into the application/device/producer yourself. Since you're new to messaging, that could get out of hand real quick. Given this, I'd suggest looking into RabbitMQ (or a similar broker), which would provide these services for you out of the box. If you do adopt RabbitMQ (or rather, AMQP), I'd suggest using a fanout exchange for the scenario you describe above.
The Python library for ZeroMQ seems to come with a pattern for dealing with this: http://zeromq.github.com/pyzmq/devices.html#monitoredqueue
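
If you do build the tracking into the broker yourself, the bookkeeping the question describes (ids waiting, ids running, no duplicates) could look roughly like the sketch below. It contains no ZeroMQ calls; JobTracker and its methods are hypothetical names, and it assumes a duplicate id should be ignored while the same id is either waiting or running.

```python
from collections import deque


class JobTracker:
    """Central record of which item ids are waiting and which are running."""

    def __init__(self):
        self._waiting = deque()    # ids in arrival order
        self._waiting_ids = set()  # same ids, for fast duplicate checks
        self._running = {}         # item id -> worker id

    def submit(self, item_id) -> bool:
        """Queue an id unless it is already waiting or running."""
        if item_id in self._waiting_ids or item_id in self._running:
            return False
        self._waiting.append(item_id)
        self._waiting_ids.add(item_id)
        return True

    def next_for(self, worker_id):
        """Hand the oldest waiting id to a worker that asked for work."""
        if not self._waiting:
            return None
        item_id = self._waiting.popleft()
        self._waiting_ids.discard(item_id)
        self._running[item_id] = worker_id
        return item_id

    def complete(self, item_id) -> None:
        """Forget an id once its worker signals completion."""
        self._running.pop(item_id, None)

    def stats(self) -> dict:
        return {"waiting": len(self._waiting), "running": len(self._running)}


tracker = JobTracker()
results = [tracker.submit("a"), tracker.submit("b"), tracker.submit("a")]
assigned = tracker.next_for("worker-1")
snapshot = tracker.stats()
print(results, assigned, snapshot)
```

The important design point is that workers ask for the next item when they are free, so the broker hands out one id at a time instead of forwarding everything immediately.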
