Subscription to UnicastProcessor never triggers - spring

I wish to batch and process items as they come along, so I created a UnicastProcessor and subscribed to it like this:
UnicastProcessor<String> processor = UnicastProcessor.create();
processor
    .bufferTimeout(10, Duration.ofMillis(500))
    .subscribe(new Subscriber<List<String>>() {
        @Override
        public void onSubscribe(Subscription subscription) {
            System.out.println("OnSubscribe");
        }

        @Override
        public void onNext(List<String> strings) {
            System.out.println("OnNext");
        }

        @Override
        public void onError(Throwable throwable) {
            System.out.println("OnError");
        }

        @Override
        public void onComplete() {
            System.out.println("OnComplete");
        }
    });
And then, for testing purposes, I created a new thread and started adding items in a loop:
new Thread(() -> {
    int limit = 100;
    int i = 0;
    while (i < limit) {
        ++i;
        processor.sink().next("Hello " + i);
    }
    System.out.println("Published all");
}).start();
After running this (and letting the main thread sleep for 5 seconds) I can see that all items have been published, but the subscriber does not trigger on any of the events, so I can't process any of the published items.
What am I doing wrong here?

Reactive Streams specification is the answer!
The total number of onNext's signalled by a Publisher to a Subscriber
MUST be less than or equal to the total number of elements requested
by that Subscriber's Subscription at all times. [Rule 1.1]
In your example, you simply provide a Subscriber that never signals any demand. The Reactive Streams specification says directly that nothing will happen (there will be no onNext invocation) if you have not called the Subscription#request method:
A Subscriber MUST signal demand via Subscription.request(long n) to
receive onNext signals. [Rule 2.1]
Thus, to fix your problem, one possible solution is to change the code in the following way:
UnicastProcessor<String> processor = UnicastProcessor.create();
processor
    .bufferTimeout(10, Duration.ofMillis(500))
    .subscribe(new Subscriber<List<String>>() {
        @Override
        public void onSubscribe(Subscription subscription) {
            System.out.println("OnSubscribe");
            subscription.request(Long.MAX_VALUE);
        }

        @Override
        public void onNext(List<String> strings) {
            System.out.println("OnNext");
        }

        @Override
        public void onError(Throwable throwable) {
            System.out.println("OnError");
        }

        @Override
        public void onComplete() {
            System.out.println("OnComplete");
        }
    });
Note that in this example a demand of Long.MAX_VALUE means unbounded demand, so all messages will be pushed directly to the given Subscriber [Rule 3.17].
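If you prefer bounded demand instead of Long.MAX_VALUE, a common Reactive Streams pattern is to request one buffer at a time and request the next only after processing; a minimal sketch of that pattern (the println bodies stand in for real processing):

processor
    .bufferTimeout(10, Duration.ofMillis(500))
    .subscribe(new Subscriber<List<String>>() {
        private Subscription subscription;

        @Override
        public void onSubscribe(Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // initial demand: one buffer
        }

        @Override
        public void onNext(List<String> strings) {
            System.out.println("OnNext: " + strings.size() + " items");
            subscription.request(1); // ask for the next buffer only when ready
        }

        @Override
        public void onError(Throwable throwable) { throwable.printStackTrace(); }

        @Override
        public void onComplete() { System.out.println("OnComplete"); }
    });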
Use UnicastProcessor correctly
On the one hand, your example will work correctly with the mentioned fixes. On the other hand, each invocation of FluxProcessor#sink() (yes, sink is FluxProcessor's method) leads to a redundant call of UnicastProcessor's onSubscribe method, which under the hood causes a few atomic reads and writes. These can be avoided by creating the FluxSink once and safely reusing it as many times as needed. For example:
UnicastProcessor<String> processor = UnicastProcessor.create();
FluxSink<String> sink = processor.serialize().sink();
...
new Thread(() -> {
    int limit = 100;
    int i = 0;
    while (i < limit) {
        ++i;
        sink.next("Hello " + i);
    }
    System.out.println("Published all");
}).start();
Note that in this example I called the additional method serialize, which provides a thread-safe sink and ensures that calling FluxSink#next concurrently will not violate the Reactive Streams spec.
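For instance, two producer threads can then share that sink and publish concurrently without violating the spec; a small sketch reusing the sink from above (thread names and counts are arbitrary):

Runnable producer = () -> {
    for (int i = 0; i < 50; i++) {
        // serialize() guarantees these concurrent next() calls are delivered one at a time
        sink.next(Thread.currentThread().getName() + " -> " + i);
    }
};
new Thread(producer, "producer-1").start();
new Thread(producer, "producer-2").start();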

Related

How to process multiple AMQP messages in parallel with the same @Incoming method

Is it possible to process multiple AMQP messages in parallel with the same method annotated with @Incoming("queue") with Quarkus and SmallRye Reactive Messaging?
To be more precise, I have the following class:
@ApplicationScoped
public class Receiver {

    @Incoming("test-queue")
    public void process(String input) {
        System.out.println("start processing:" + input);
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("end processing:" + input);
    }
}
With the configuration in the application.properties:
amqp-host: localhost
amqp-port: 5672
amqp-username: quarkus
amqp-password: quarkus
mp.messaging.incoming.test-queue.connector: smallrye-amqp
mp.messaging.incoming.test-queue.address: test-queue
Now I'd like to define by configuration how many messages can be processed in parallel. For example, on a 4-core CPU it should run 4 in parallel.
Currently I can just add 4 copies of the method with different names to allow this parallelism, but that is not configurable.
I'm not sure, but I don't think Reactive Messaging supports what you're asking for.
You can, however, do what you want another way. I think it's also a better overall pattern for using messaging.
http://smallrye.io/smallrye-reactive-messaging/smallrye-reactive-messaging/2.5/amqp/amqp.html#amqp-inbound
Find the example with the CompletionStage and the explicit ack(). That variant is asynchronous, so if you combine it with Java's existing concurrency facilities, you'll get efficient parallel processing.
I would send the incoming work to an executor, and then have the executing task ack() when it completes.
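A minimal sketch of that idea, assuming SmallRye's Message-based signature and a hypothetical fixed pool of 4 workers (deriving the pool size from configuration is left out):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.reactive.messaging.Incoming;
import org.eclipse.microprofile.reactive.messaging.Message;

@ApplicationScoped
public class Receiver {

    // hypothetical fixed pool; in real code derive the size from configuration
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    @Incoming("test-queue")
    public CompletionStage<Void> process(Message<String> msg) {
        return CompletableFuture
                .runAsync(() -> handle(msg.getPayload()), pool) // do the work off the event loop
                .thenCompose(x -> msg.ack());                   // ack only after the work completes
    }

    private void handle(String input) {
        System.out.println("processing: " + input);
    }
}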
I just came across the same scenario, and here is how the spec intends for you to handle concurrency:
From the Eclipse MicroProfile Reactive Messaging spec
Basically, instead of having a class with a method like this:
#Incoming("test-queue")
public void process(String input) {}
You have 2 classes like this:
@ApplicationScoped
public class MessageSubscriberProducer {

    @Incoming("test-queue")
    public Subscriber<String> createSubscriber() {
        return new SubscriberImpl();
    }
}
public class SubscriberImpl implements Subscriber<String> {

    private Subscription subscription;

    @Override
    public void onSubscribe(Subscription subscription) {
        this.subscription = subscription;
        this.subscription.request(4); // this tells how many messages to grab right away
    }

    @Override
    public void onNext(String val) {
        // do processing
        this.subscription.request(1); // grab 1 more
    }

    @Override
    public void onError(Throwable t) { t.printStackTrace(); }

    @Override
    public void onComplete() { }
}
This has the additional advantage of moving your processing code from the Vert.x event-loop thread to a worker thread pool.

SpringAMQP - Retry/Resend messages dlx

I'm trying to implement a retry mechanism using a DLX.
Basically, I want to retry a message 3 times and then stop, keeping the message parked on the DLX queue.
What I did:
Created WorkQueue bound to WorkExchange
Created RetryQueue bound to RetryExchange
WorkQueue -> set x-dead-letter-exchange to RetryExchange
RetryQueue -> set x-dead-letter-exchange to WorkExchange AND x-message-ttl to 300000 ms (5 minutes)
So now, when I send any message to WorkQueue and it fails, the message goes to RetryQueue for 5 minutes and then back to WorkQueue. But it can keep failing, and I would like to stop it after 3 attempts.
Is it possible? Can I configure the RetryQueue to retry 3 times and then stop?
Thanks.
There is no way to do this in the broker alone.
You can add code to your listener: examine the x-death header to determine how many times the message has been retried, and discard/log it (and/or send it to a third queue) when you want to give up.
EDIT
@SpringBootApplication
public class So59741067Application {

    public static void main(String[] args) {
        SpringApplication.run(So59741067Application.class, args);
    }

    @Bean
    public Queue main() {
        return QueueBuilder.durable("mainQueue")
                .deadLetterExchange("")
                .deadLetterRoutingKey("dlQueue")
                .build();
    }

    @Bean
    public Queue dlq() {
        return QueueBuilder.durable("dlQueue")
                .deadLetterExchange("")
                .deadLetterRoutingKey("mainQueue")
                .ttl(5_000)
                .build();
    }

    @RabbitListener(queues = "mainQueue")
    public void listen(String in,
            @Header(name = "x-death", required = false) List<Map<String, ?>> xDeath) {
        System.out.println(in + xDeath);
        if (xDeath != null && (long) xDeath.get(0).get("count") > 2L) {
            System.out.println("Given up on this one");
        }
        else {
            throw new AmqpRejectAndDontRequeueException("test");
        }
    }
}

How to exit clean from WebAPI background service

The code below is a Web API that prints on behalf of a SPA. For brevity I've omitted using statements and the actual printing logic; that stuff all works fine. The point of interest is the refactoring of the printing logic onto a background thread, with the Web API method enqueuing a job. I did this because print jobs sent in quick succession were interfering with each other, with only the last job printing.
It solves the problem of serialising print jobs but raises the question of how to detect shutdown and signal the loop to terminate.
namespace WebPrint.Controllers
{
    public class LabelController : ApiController
    {
        static readonly ConcurrentQueue<PrintJob> queue = new ConcurrentQueue<PrintJob>();
        static bool running = true;

        static LabelController()
        {
            ThreadPool.QueueUserWorkItem((state) => {
                while (running)
                {
                    Thread.Sleep(30);
                    if (queue.TryDequeue(out PrintJob job))
                    {
                        Print(job); // static printing helper; implementation omitted
                    }
                }
            });
        }

        public void Post([FromBody] PrintJob job)
        {
            queue.Enqueue(job);
        }
    }

    public class PrintJob
    {
        public string url { get; set; }
        public string html { get; set; }
        public string printer { get; set; }
    }
}
Given the way I acquire a thread to service the print queue, it is almost certainly marked as a background thread and should terminate when the app pool tries to exit, but I am not certain of this, and so I ask you, dear readers, for your collective notion of best practice in such a scenario.
Well, I did ask for best practice.
Nevertheless, I don't have long-running background tasks, I have short-running tasks. They arrive asynchronously on different threads, but must be executed serially and on a single thread because the WinForms printing methods are designed for STA threading.
Matt Lethargic's point about possible job loss is certainly a consideration, but for this case it doesn't matter. Jobs are never queued for more than a few seconds and loss would merely prompt operator retry.
For that matter, using a message queue doesn't solve the problem of "what if someone shuts it down while it's being used"; it merely moves it to another piece of software. A lot of message queues aren't persistent, and you wouldn't believe the number of times I've seen someone use MSMQ to solve this problem and then fail to configure it for persistence.
This has been very interesting.
http://thecodelesscode.com/case/156
I would look at your architecture at a higher level; doing "long running tasks" such as printing should probably live outside of your Web API process entirely.
If this were me, I would:
Create a Windows service (or what have you) that has all the printing logic in it; the job of the controller is then just to talk to the service, either by HTTP or by some kind of queue (MSMQ, RabbitMQ, Service Bus, etc.).
If via HTTP, then the service should internally queue up the print jobs and return 200/201 to the controller as soon as possible (before printing happens) so that the controller can return to the client efficiently and release its resources.
If via a queuing technology, then the controller should place a message on the queue and again return 200/201 as quickly as possible; the service can then read the messages at its own rate and print one at a time.
Doing it this way removes overhead from your API and also the possibility of losing print jobs in the case of a failure in the Web API (if the API crashes, any background threads may/will be affected). Also, what if you do a deployment at the moment someone is printing? There's a high chance the print job will fail.
My 2 cents worth
I believe that the desired behavior is not something that should be done within a Controller.
public interface IPrintAgent {
    void Enqueue(PrintJob job);
    void Cancel();
}
The above abstraction can be implemented and injected into the controller using the framework's IDependencyResolver:
public class LabelController : ApiController {
    private IPrintAgent agent;

    public LabelController(IPrintAgent agent) {
        this.agent = agent;
    }

    [HttpPost]
    public IHttpActionResult Post([FromBody] PrintJob job) {
        if (ModelState.IsValid) {
            agent.Enqueue(job);
            return Ok();
        }
        return BadRequest(ModelState);
    }
}
The sole job of the controller in the above scenario is to queue the job.
Now with that aspect out of the way I will focus on the main part of the question.
As already mentioned by others, there are many ways to achieve the desired behavior.
A simple in-memory implementation can look like this:
public class DefaultPrintAgent : IPrintAgent {
    static readonly ConcurrentQueue<PrintJob> queue = new ConcurrentQueue<PrintJob>();
    static object syncLock = new Object();
    static bool idle = true;
    static CancellationTokenSource cts = new CancellationTokenSource();

    static DefaultPrintAgent() {
        checkQueue += OnCheckQueue;
    }

    private static event EventHandler checkQueue = delegate { };

    private static async void OnCheckQueue(object sender, EventArgs args) {
        cts = new CancellationTokenSource();
        PrintJob job = null;
        while (!queue.IsEmpty && queue.TryDequeue(out job)) {
            await Print(job);
            if (cts.IsCancellationRequested) {
                break;
            }
        }
        idle = true;
    }

    public void Enqueue(PrintJob job) {
        queue.Enqueue(job);
        if (idle) {
            lock (syncLock) {
                if (idle) {
                    idle = false;
                    checkQueue(this, EventArgs.Empty);
                }
            }
        }
    }

    public void Cancel() {
        if (!cts.IsCancellationRequested)
            cts.Cancel();
    }

    static Task Print(PrintJob job) {
        //...print job
        return Task.CompletedTask; // placeholder for the actual printing logic
    }
}
which takes advantage of async event handlers to process the queue in sequence as jobs are added.
Cancel is provided so that the process can be short-circuited as needed, for example in the Application_End event, as suggested by another user:
var agent = new DefaultPrintAgent();
agent.Cancel();
or manually by exposing an endpoint if so desired.

run PublishSubject on different thread rxJava

I am running RxJava and creating a subject so that I can use the onNext() method to produce data. I am using Spring.
This is my setup:
@Component
public class SubjectObserver {

    private SerializedSubject<SomeObj, SomeObj> safeSource;

    public SubjectObserver() {
        safeSource = PublishSubject.<SomeObj>create().toSerialized();
        safeSource.subscribeOn(<my taskthreadExecutor>);
        safeSource.observeOn(<my taskthreadExecutor>);
        safeSource.subscribe(new Subscriber<AsyncRemoteRequest>() {
            @Override
            public void onNext(AsyncRemoteRequest asyncRemoteRequest) {
                LOGGER.debug("{} invoked.", Thread.currentThread().getName());
                doSomething();
            }
        });
    }

    public void publish(SomeObj myObj) {
        safeSource.onNext(myObj);
    }
}
The way new data is generated on the RxJava stream is by @Autowired-ing the private SubjectObserver subjectObserver and then calling subjectObserver.publish(newDataObjGenerated).
No matter what I specify for subscribeOn() & observeOn():
Schedulers.io()
Schedulers.computation()
my threads
Schedulers.newThread
the onNext() and the actual work inside it are done on the same thread that calls onNext() on the subject to generate/produce data.
Is this correct? If so, what am I missing? I was expecting doSomething() to be done on a different thread.
Update
In my calling class, if I change the way I invoke the publish method, then of course a new thread is allocated for the subscriber to run on:
taskExecutor.execute(() -> subjectObserver.publish(newlyGeneratedObj));
Thanks,
Each operator on an Observable/Subject returns a new instance with the extra behavior; however, your code applies subscribeOn and observeOn, then throws away whatever they produced and subscribes to the raw Subject. You should chain the method calls and then subscribe:
safeSource = PublishSubject.<AsyncRemoteRequest>create().toSerialized();
safeSource
    .subscribeOn(<my taskthreadExecutor>)
    .observeOn(<my taskthreadExecutor>)
    .subscribe(new Subscriber<AsyncRemoteRequest>() {
        @Override
        public void onNext(AsyncRemoteRequest asyncRemoteRequest) {
            LOGGER.debug("{} invoked.", Thread.currentThread().getName());
            doSomething();
        }
    });
Note that subscribeOn has no practical effect on a PublishSubject because there is no subscription side-effect happening in its subscribe() method.
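To make the <my taskthreadExecutor> placeholder concrete, here is a minimal, self-contained sketch (assuming RxJava 1.x) that wraps an executor in a Scheduler; the onNext work then runs on that executor's thread rather than the caller's:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import rx.Scheduler;
import rx.schedulers.Schedulers;
import rx.subjects.PublishSubject;
import rx.subjects.SerializedSubject;

public class ObserveOnDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Scheduler scheduler = Schedulers.from(executor);

        SerializedSubject<String, String> subject =
                PublishSubject.<String>create().toSerialized();

        subject
            .observeOn(scheduler) // deliveries hop onto the executor's thread
            .subscribe(s -> System.out.println(Thread.currentThread().getName() + ": " + s));

        subject.onNext("hello"); // the calling thread only hands the value off
        Thread.sleep(100);
        executor.shutdown();
    }
}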

Using tick tuples with trident in storm

I am able to use the standard spout/bolt combination to do streaming aggregation, and it works very well in the happy case when using tick tuples to persist data at some interval to make use of batching. Right now I am doing some failure management myself (tracking tuples that were not saved, etc.), i.e. not out of the box from Storm.
But I have read that Trident gives you a higher abstraction and better failure management.
What I don't understand is whether there is tick tuple support in Trident. Basically, I would like to batch in memory for the current minute or so and persist any aggregated data for the previous minutes using Trident.
Any pointers here or design suggestions would be helpful.
Thanks
Actually, micro-batching is a built-in feature of Trident. You don't need any tick tuples for that. When you have something like this in your code:
topology
    .newStream("myStream", spout)
    .partitionPersist(
        ElasticSearchEventState.getFactoryFor(connectionProvider),
        new Fields("field1", "field2"),
        new ElasticSearchEventUpdater()
    )
(I'm using my custom ElasticSearch state/updater here; you might use something else.)
Then under the hood Trident groups your stream into batches and performs the partitionPersist operation on those batches rather than on individual tuples.
If you still need tick tuples for any reason, just create your own tick spout; something like this works for me:
public class TickSpout implements IBatchSpout {

    public static final String TIMESTAMP_FIELD = "timestamp";

    private final long delay;

    public TickSpout(long delay) {
        this.delay = delay;
    }

    @Override
    public void open(Map conf, TopologyContext context) {
    }

    @Override
    public void emitBatch(long batchId, TridentCollector collector) {
        Utils.sleep(delay);
        collector.emit(new Values(System.currentTimeMillis()));
    }

    @Override
    public void ack(long batchId) {
    }

    @Override
    public void close() {
    }

    @Override
    public Map getComponentConfiguration() {
        return null;
    }

    @Override
    public Fields getOutputFields() {
        return new Fields(TIMESTAMP_FIELD);
    }
}
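Wiring the tick spout into a topology might look like the following sketch; the stream name and the FlushAggregates filter are hypothetical stand-ins for your own flush logic:

TridentTopology topology = new TridentTopology();
topology
    .newStream("ticks", new TickSpout(60_000)) // one tick per minute
    .each(new Fields(TickSpout.TIMESTAMP_FIELD),
          new FlushAggregates()); // hypothetical Filter that persists the previous minute's aggregates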
