Is it possible to allow emitting values from a Flux conditionally based on a global boolean variable?
I'm working with Flux delayUntil(...) but not able to fully grasp the functionality or my assumptions are wrong.
I have a global AtomicBoolean that represents the availability of a downstream connection and only want the upstream Flux to emit if the downstream is ready to process.
To represent the scenario, I created a (non-working) test sample:
//Randomly generates a boolean value every 5 seconds
private Flux<Boolean> signalGenerator() {
return Flux.range(1, Integer.MAX_VALUE)
.delayElements(Duration.ofMillis(5000))
.map(integer -> new Random().nextBoolean());
}
and
Flux.range(1, Integer.MAX_VALUE)
.delayElements(Duration.ofMillis(1000))
.delayUntil(evt -> signalGenerator()) // ?? Only proceed when signalGenerator returns true
.subscribe(System.out::println);
I have another scenario where a downstream process can accept only x messages a second. In the current non-reactive implementation we have a Semaphore of x permits and the thread is blocked if no more permits are available, with Semaphore permits resetting every second.
In both scenarios I want upstream Flux to emit only when there is a demand from the downstream process, and I do not want to Buffer.
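You might consider using Mono.fromRunnable() as the input to delayUntil(), like below.
Helper class:
public class FluxCondition {
    CountDownLatch latch = new CountDownLatch(10); // it depends, might be managed somehow
    Runnable r = () -> {
        try {
            latch.await(); // blocks until the latch reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    };
    // Returns a Mono that completes once the latch opens; subscribed on boundedElastic because await() blocks
    public Mono<Void> lock() { return Mono.<Void>fromRunnable(r).subscribeOn(Schedulers.boundedElastic()); }
    public void release() { latch.countDown(); }
}
Usage: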
FluxCondition delayCondition = new FluxCondition();
Flux.range(1, 10).delayUntil(o -> delayCondition.lock()).subscribe();
.....
delayCondition.release(); // shall call this for each element
I guess there might be a better solution by using sink.emitNext but this might also require a condition variable for controlling Flux flow.
According to my understanding, in reactive programming your data should be considered at every operator step, so it might be better to design your consumer as a reactive processor. In my case I had no choice and followed the approach described above.
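A non-blocking alternative for the "emit only when the downstream is ready" requirement is to let the subscriber control demand, so the source is asked for one element at a time and only while the flag is set. The sketch below is an illustration under assumptions, not Reactor code: it uses only the JDK's java.util.concurrent.Flow interfaces, and GatedSubscriber, requestIfOpen and the small range publisher are hypothetical names made up for the example. It is single-threaded and leaves out error handling. In Reactor the same idea would normally be written by extending BaseSubscriber and calling request(1) from its hooks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicBoolean;

public class GatedDemandSketch {

    /** Hypothetical subscriber that requests one element at a time, and only while the gate is open. */
    static final class GatedSubscriber<T> implements Flow.Subscriber<T> {
        private final AtomicBoolean gate;     // stands in for the global "downstream ready" flag
        final List<T> received = new ArrayList<>();
        private Flow.Subscription subscription;
        private boolean awaitingItem;         // true while a requested element has not arrived yet

        GatedSubscriber(AtomicBoolean gate) {
            this.gate = gate;
        }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            requestIfOpen();
        }

        @Override
        public void onNext(T item) {
            awaitingItem = false;
            received.add(item);
            requestIfOpen();                  // ask for the next element only if the gate is still open
        }

        @Override
        public void onError(Throwable t) {
            t.printStackTrace();
        }

        @Override
        public void onComplete() {
        }

        /** Call this whenever the gate flips to open; it does nothing while the gate is closed. */
        void requestIfOpen() {
            if (subscription != null && !awaitingItem && gate.get()) {
                awaitingItem = true;
                subscription.request(1);
            }
        }
    }

    /** Tiny synchronous range publisher, present only to make the sketch runnable. */
    static Flow.Publisher<Integer> range(int start, int count) {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private int next = start;
            private boolean done;

            @Override
            public void request(long n) {
                for (long i = 0; i < n && !done; i++) {
                    if (next >= start + count) {
                        done = true;
                        subscriber.onComplete();
                    } else {
                        subscriber.onNext(next++);
                    }
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }

    public static void main(String[] args) {
        AtomicBoolean ready = new AtomicBoolean(false);
        GatedSubscriber<Integer> sub = new GatedSubscriber<>(ready);
        range(1, 5).subscribe(sub);
        System.out.println(sub.received);     // prints [] because the gate is closed
        ready.set(true);
        sub.requestIfOpen();                  // whoever opens the gate also nudges the subscriber
        System.out.println(sub.received);     // prints [1, 2, 3, 4, 5]
    }
}
```

Because nothing is requested while the gate is closed, the source is never asked to produce elements the consumer cannot take, so no buffer is needed. Whoever flips the flag to true also has to call requestIfOpen(), otherwise the stream stays parked.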
I'm working on converting a blocking sequential orchestration framework to reactive. Right now, these tasks are dynamic and are fed into the engine by a JSON input. The engine pulls classes and executes the run() method and saves the state with the responses from each task.
How do I achieve the same chaining in Reactor? If this were a static DAG, I would have chained it with flatMap or then operators, but since it is dynamic, how do I proceed with executing a reactive task and collecting the output from each task?
Examples:
Non reactive interface:
public interface OrchestrationTask {
OrchestrationContext run(IngestionContext ctx);
}
Core Engine
public Status executeDAG(String id) {
IngestionContext ctx = ContextBuilder.getCtx(id);
List<OrchestrationTask> tasks = app.getEligibleTasks(id);
for(OrchestrationTask task : tasks) {
// Eligible tasks are executed sequentially and results are collected.
OrchestrationContext stepContext = task.run(ctx);
if(!evaluateResult(stepContext)) break;
}
return Status.SUCCESS;
}
Following the above example, if I convert tasks to return Mono<?> then, how do I wait or chain other tasks to operate on the result on previous tasks?
Any help is appreciated. Thanks.
Update:
Reactive task example:
public class SampleTask implements OrchestrationTask {
@Override
public Mono<OrchestrationContext> run(OrchestrationContext context) {
// I'm simulating a delay here. Treat this as a long-running task (e.g. a web call); the next task needs the response from the call below.
return Mono.just(context).delayElement(Duration.ofSeconds(2));
}
}
So I will have a series of tasks that accomplish various things, but each task depends on the response of the previous one, which is stored in the OrchestrationContext. Whenever an error occurs, the orchestration context flag will be set to false and the flux should stop.
Sure, we can:
Create the flux from the task list (if it's appropriate to generate the task list reactively then you can replace that arraylist with the flux directly, if not then keep as-is);
flatMap() each task to your task.run() method (which, as per the question, now returns a Mono);
Ensure we only consume elements while evaluateResult() is true;
...then finally just return the SUCCESS status as before.
So putting all that together, just replace your loop & return statement with:
Flux.fromIterable(tasks)
.flatMap(task -> task.run(ctx))
.takeWhile(stepContext -> evaluateResult(stepContext))
.then(Mono.just(Status.SUCCESS));
(Since we've made it reactive, your method will obviously need to return a Mono<Status> rather than just Status too.)
Update as per the comment - if you just want this to execute "one at a time" rather than with multiple concurrently, you can use concatMap() instead of flatMap().
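Note that the snippet above hands the same ctx to every task. If each task should instead receive the context produced by the previous one, the chain can be built in a loop. The sketch below illustrates that idea with the JDK's CompletableFuture standing in for Mono, so that it runs without any extra library; Ctx, plus, runAll and demo are hypothetical names invented for the example. In Reactor, the same loop could be written with Mono.flatMap in place of thenCompose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class SequentialChainSketch {

    /** Hypothetical stand-in for OrchestrationContext: a success flag plus the steps run so far. */
    static final class Ctx {
        final boolean ok;
        final List<String> steps;

        Ctx(boolean ok, List<String> steps) {
            this.ok = ok;
            this.steps = List.copyOf(steps);
        }

        /** Returns a new context with one more recorded step and an updated flag. */
        Ctx plus(String step, boolean stillOk) {
            List<String> copy = new ArrayList<>(steps);
            copy.add(step);
            return new Ctx(stillOk, copy);
        }
    }

    /** Runs the tasks one after another; each task receives the previous task's context. */
    static CompletableFuture<Ctx> runAll(Ctx initial,
                                         List<Function<Ctx, CompletableFuture<Ctx>>> tasks) {
        CompletableFuture<Ctx> chain = CompletableFuture.completedFuture(initial);
        for (Function<Ctx, CompletableFuture<Ctx>> task : tasks) {
            chain = chain.thenCompose(ctx -> {
                if (!ctx.ok) {
                    // A previous task failed: pass the context through without running this task.
                    return CompletableFuture.completedFuture(ctx);
                }
                return task.apply(ctx);
            });
        }
        return chain;
    }

    /** Three made-up tasks; the second one sets the flag to false. */
    static Ctx demo() {
        List<Function<Ctx, CompletableFuture<Ctx>>> tasks = List.of(
            ctx -> CompletableFuture.supplyAsync(() -> ctx.plus("fetch", true)),
            ctx -> CompletableFuture.supplyAsync(() -> ctx.plus("validate", false)),
            ctx -> CompletableFuture.supplyAsync(() -> ctx.plus("store", true)));
        return runAll(new Ctx(true, List.of()), tasks).join();
    }

    public static void main(String[] args) {
        Ctx result = demo();
        System.out.println(result.steps + " ok=" + result.ok); // prints [fetch, validate] ok=false
    }
}
```

The third task never runs because the second one cleared the flag, which mirrors the break in the original for loop.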
I'm developing an application which uses reactor libraries to connect with Google pubsub. So I have a Flux of messages. I want it to always consume from the queue, no matter what happens: this means handling all errors in order not to terminate the flux. I was thinking about the (very unlikely) event the connection to pubsub may be lost or whatever may cause the just created Flux to signal an error. I came up with this solution:
private final PubSubReactiveFactory pubSubReactiveFactory;
private final String requestSubscription;
private final Long requestPollTime;
private final Flux<AcknowledgeablePubsubMessage> requestFlux;
@Autowired
public FluxContainer(/* Field args...*/) {
// init stuff...
this.requestFlux = initRequestFlux();
}
private Flux<AcknowledgeablePubsubMessage> initRequestFlux() {
return pubSubReactiveFactory.poll(requestSubscription, requestPollTime)
.doOnError(e -> log.error("FATAL ERROR: could not retrieve message from queue. Resetting flux", e))
.onErrorResume(e -> initRequestFlux());
}
@EventListener(ApplicationReadyEvent.class)
public void configureFluxAndSubscribe() {
log.info("Setting up requestFlux...");
this.requestFlux
.doOnNext(AcknowledgeablePubsubMessage::ack)
// ...many more concatenated calls handling flux
}
Does it make sense? I'm concerned about memory allocation (I'm relying on the GC to clean stuff up). Any comment is welcome.
What I think you're looking for is basically a Flux that restarts itself whenever it terminates for any reason other than the subscription being disposed. In my case I have a source that generates an infinite stream of events from the Docker daemon, which can disconnect "successfully".
Let sourceFlux be the flux providing your data and would be something you'd want to restart on error or complete, but stop on subscription disposal.
create a recovery function
Function<Throwable, Publisher<Integer>> recoverFromThrow =
throwable -> sourceFlux;
create a new flux that would recover from throw
var recoveringFromThrowFlux =
sourceFlux.onErrorResume(recoverFromThrow);
create a Flux generator that generates the flux that would recover from a throw. (Note the generic coercion is needed)
var foreverFlux =
Flux.<Flux<Integer>>generate((sink) -> sink.next(recoveringFromThrowFlux))
.flatMap(flux -> flux);
foreverFlux is the flux that does self recovery.
I have created a Mono with .fromCallable() in Java with Spring and Reactor. I thought it would run the lambda I provided asynchronously and use Mono.empty() as the return value, so that the execution of the entire stream would start off from a different thread.
I have 2 questions:
What is the execution order and number of threads if I call .subscribeOn() into the chain of operations?
Is the approach in my code below a good way to check whether the response has the correct state?
private final Scheduler myScheduler = Schedulers
.newParallel("reactive-pricefetcher", 10, true);
...
...
...
final Mono<Mono<Object>> callableMono = Mono
.fromCallable(() -> {
myHandler.updateCacheResponse(mutableObjList,
dealsRequest.getDealParameters()
);
return Mono.empty();
})
.subscribeOn(myScheduler);
callableMono.subscribe();
boolean stillInProgress = mutableObjList.stream()
.anyMatch(obj -> obj.getStatus() != DONE);
return DealsResponse.builder()
.complete(!stillInProgress)
.itemDeals(mutableObjList)
.build();
PS: I already know that using .subscribeOn() will move the entire stream chain into a different thread when subscribe() invoked.
I'm using a Flux in Spring to send parallel requests to a service. This is a very simplified version of it:
Flux.fromIterable(customers)
.flatMap { customer ->
client.call(customer)
} ...
I was wondering how I could cancel this flux, as in, grab a reference to the flux somehow and tell it to shut down.
As you probably know, with reactive objects, all operators are lazy. This means execution of the pipeline is delayed until the moment you subscribe to the reactive stream.
So, in your example, there is nothing to cancel yet because nothing is happening at that point.
But supposing your example was extended to:
Disposable disp = Flux.fromIterable(customers)
.flatMap { customer ->
client.call(customer)
}
.subscribe();
Then, as you can see, your subscription returns a Disposable object that you can use to cancel the entire thing if you want, e.g.
disp.dispose()
Documentation of dispose says:
Cancel or dispose the underlying task or resource.
There’s another section of the documentation that says the following:
These variants [of operators] return a reference to the subscription
that you can use to cancel the subscription when no more data is
needed. Upon cancellation, the source should stop producing values and
clean up any resources it created. This cancel and clean-up behavior
is represented in Reactor by the general-purpose Disposable interface.
Therefore canceling the execution of stream is not free from complications on the reactive object side, because you want to make sure to leave the world in a consistent state if you cancel the stream in the middle of its processing. For example, if you were in the process of building something, you may want to discard resources, destroy any partial aggregation results, close files, channels, release memory or any other resources you have, potentially undoing changes or compensating for them.
You may want to read the documentation on cleanup about this, such that you also consider what you can do on the reactive object side.
Flux<String> bridge = Flux.create(sink -> {
    sink.onRequest(n -> channel.poll(n))
        .onCancel(() -> channel.cancel())
        .onDispose(() -> channel.close());
});
The answer from @Edwin is precise. As long as you don't call subscribe, there is nothing to cancel, because no code will be executed.
Just wanted to add an example to make it clear.
public static void main(String[] args) throws InterruptedException {
List<String> lists = Lists.newArrayList("abc", "def", "ghi");
Disposable disposable = Flux.fromIterable(lists)
.delayElements(Duration.ofSeconds(3))
.map(String::toLowerCase)
.subscribe(System.out::println);
Thread.sleep(5000); // Sleep so that some elements of the flux get printed
disposable.dispose();
Thread.sleep(10000); // Sleep again to show that nothing more is printed after the flux is cancelled
}
But I would say a much cleaner way (functional way) is to make use of functions like takeUntil or take. For instance I can stop the stream in the above example like this as well.
List<String> lists = Lists.newArrayList("abc", "def", "End", "ghi");
Flux.fromIterable(lists).takeUntil(s -> s.equalsIgnoreCase("End"))
.delayElements(Duration.ofSeconds(3))
.map(String::toLowerCase)
.subscribe(System.out::println);
or
List<String> lists = Lists.newArrayList("abc", "def", "ghi");
Flux.fromIterable(lists).take(2)
.delayElements(Duration.ofSeconds(2))
.map(String::toLowerCase)
.subscribe(System.out::println);
Subscribing to my flux again and then calling dispose() on the result did it for me:
// Setup flux and populate
Flux<String> myFlux = controller.get(json);
// Subscribe
FlowSubscriber<String> sub = new FlowSubscriber<String>();
myFlux.subscribe(sub);
// Work on elements in the subscription
String myString = sub.consumedElements.get(0);
... do work ...
// Cancel
myFlux.subscribe().dispose();
I'm writing a basic application to test the Interactive Queries feature of Kafka Streams. Here is the code:
public static void main(String[] args) {
StreamsBuilder builder = new StreamsBuilder();
KeyValueBytesStoreSupplier waypointsStoreSupplier = Stores.persistentKeyValueStore("test-store");
StoreBuilder waypointsStoreBuilder = Stores.keyValueStoreBuilder(waypointsStoreSupplier, Serdes.ByteArray(), Serdes.Integer());
final KStream<byte[], byte[]> waypointsStream = builder.stream("sample1");
final KStream<byte[], TruckDriverWaypoint> waypointsDeserialized = waypointsStream
.mapValues(CustomSerdes::deserializeTruckDriverWaypoint)
.filter((k,v) -> v.isPresent())
.mapValues(Optional::get);
waypointsDeserialized.groupByKey().aggregate(
() -> 1,
(aggKey, newWaypoint, aggValue) -> {
aggValue = aggValue + 1;
return aggValue;
}, Materialized.<byte[], Integer, KeyValueStore<Bytes, byte[]>>as("test-store").withKeySerde(Serdes.ByteArray()).withValueSerde(Serdes.Integer())
);
final KafkaStreams streams = new KafkaStreams(builder.build(), new StreamsConfig(createStreamsProperties()));
streams.cleanUp();
streams.start();
ReadOnlyKeyValueStore<byte[], Integer> keyValueStore = streams.store("test-store", QueryableStoreTypes.keyValueStore());
KeyValueIterator<byte[], Integer> range = keyValueStore.all();
while (range.hasNext()) {
KeyValue<byte[], Integer> next = range.next();
System.out.println(next.value);
}
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
}
protected static Properties createStreamsProperties() {
final Properties streamsConfiguration = new Properties();
streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "random167");
streamsConfiguration.put(StreamsConfig.CLIENT_ID_CONFIG, "client-id");
streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
streamsConfiguration.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, Serdes.String().getClass().getName());
streamsConfiguration.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, Serdes.Integer().getClass().getName());
//streamsConfiguration.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10000);
return streamsConfiguration;
}
So my problem is, every time I run this I get this same error:
Exception in thread "main" org.apache.kafka.streams.errors.InvalidStateStoreException: the state store, test-store, may have migrated to another instance.
I'm running only 1 instance of the application, and the topic I'm consuming from has only 1 partition.
Any idea what I'm doing wrong ?
Looks like you have a race condition. The Kafka Streams javadoc for KafkaStreams::start() says:
Start the KafkaStreams instance by starting all its threads. This function is expected to be called only once during the life cycle of the client.
Because threads are started in the background, this method does not block.
https://kafka.apache.org/10/javadoc/index.html?org/apache/kafka/streams/KafkaStreams.html
You're calling streams.store() immediately after streams.start(), but I'd wager that you're in a state where it hasn't initialized fully yet.
Since this code appears to be just for testing, add a Thread.sleep(5000) or something similar after start() and give it a go. (This is not a solution for production.) Depending on your input rate into the topic, that will probably also give the store some time to start filling up with events, so that your KeyValueIterator actually has something to process and print.
Probably not applicable to OP but might help others:
When trying to retrieve a KTable's store, make sure the KTable's topic exists first or you'll get this exception.
In my case, I had not called the StoreBuilder before consuming the store.
Typically this happens for two reasons:
The local KafkaStreams instance is not yet ready (i.e., not yet in runtime state RUNNING, see Run-time Status Information) and thus its local state stores cannot be queried yet.
The local KafkaStreams instance is ready (e.g. in runtime state RUNNING), but the particular state store was just migrated to another instance behind the scenes. This may notably happen during the startup phase of a distributed application or when you are adding/removing application instances.
https://docs.confluent.io/platform/current/streams/faq.html#handling-invalidstatestoreexception-the-state-store-may-have-migrated-to-another-instance
The simplest approach is to guard against InvalidStateStoreException when calling KafkaStreams#store():
// Example: Wait until the store of type T is queryable. When it is, return a reference to the store.
public static <T> T waitUntilStoreIsQueryable(final String storeName,
final QueryableStoreType<T> queryableStoreType,
final KafkaStreams streams) throws InterruptedException {
while (true) {
try {
return streams.store(storeName, queryableStoreType);
} catch (InvalidStateStoreException ignored) {
// store not yet ready for querying
Thread.sleep(100);
}
}
}
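The loop above retries forever, which can hang the caller if the store never becomes available. A bounded variant is sketched below under stated assumptions: it is written generically so that it runs without Kafka on the classpath, with a Supplier standing in for the call to streams.store(...) and an exception class parameter standing in for InvalidStateStoreException. waitUntilReady and its parameters are hypothetical names, not part of the Kafka Streams API.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class BoundedRetrySketch {

    /**
     * Calls lookup until it stops throwing the given "not ready" exception type,
     * or fails with a TimeoutException once the deadline has passed.
     */
    static <T> T waitUntilReady(Supplier<T> lookup,
                                Class<? extends RuntimeException> notReady,
                                Duration timeout,
                                Duration pause) throws InterruptedException, TimeoutException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return lookup.get();
            } catch (RuntimeException e) {
                if (!notReady.isInstance(e)) {
                    throw e; // unrelated failure: do not swallow it
                }
                if (System.nanoTime() - deadline >= 0) {
                    throw new TimeoutException("not ready after " + timeout);
                }
                Thread.sleep(pause.toMillis()); // back off before the next attempt
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Hypothetical lookup that fails twice before succeeding.
        String store = waitUntilReady(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("still starting");
            }
            return "store-handle";
        }, IllegalStateException.class, Duration.ofSeconds(2), Duration.ofMillis(10));
        System.out.println(store + " after " + calls[0] + " attempts"); // prints store-handle after 3 attempts
    }
}
```

With Kafka Streams, the lookup would be a lambda wrapping streams.store(...) and the exception class would be InvalidStateStoreException.class; the timeout and pause values here are arbitrary.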