retryWhen with timer appears to subvert merge behavior - java-8

I'm using RxJava in an asynchronous messaging environment (vert.x), and so have a flow that looks like this:
Observable.defer( () -> getEndpoint() )
.mergeWith( getCancellationMessage() )
.flatMap( endpoint -> useEndpoint( endpoint ) )
.retryWhen( obs -> obs.flatMap( error -> {
if ( wasCancelled( error ) ) {
return Observable.error( error );
}
return Observable.timer(/* args */);
}) )
.subscribe( result -> useResult( result ),
error -> handleError( error )
);
The implementation of getCancellationMessage() returns an observable stream that emits an error whenever a cancellation message has been received from an independent message source. This stream never emits anything other than Observable.error(), and it only emits an error when it receives a cancellation message.
If I understand how merge works, the entire chain should be terminated via onError when getCancellationMessage() emits an error.
However, I am finding that if the retryWhen operator is waiting for the timer to emit when a cancellation message is received, the error is ignored and the retryWhen loop continues as if the cancellation was never received.
I can fix the behavior by merging Observable.timer() with the same getCancellationMessage() function, but I'm not understanding why I have to do that in the first place.
Is this merge/retryWhen interaction expected?
Edits:
Below is an example of the kind of thing that the getCancellationMessage() function is doing:
Observable<T> getCancellationMessage() {
if ( this.messageStream == null ) {
this.messageStream = this.messageConsumer.toObservable()
.flatMap( message -> {
this.messageConsumer.unregister();
if ( isCancelMessage(message) ) {
return Observable.error( new CancelError() );
}
else {
return Observable.error( new FatalError() );
}
});
}
return this.messageStream;
}
Note that I don't own the implementation of this.messageConsumer - this comes from the third party library I'm using (vert.x), so I don't control the implementation of that Observable.
As I understand it, the messageConsumer.toObservable() method returns the result of Observable.create() provided with an instance of this class, which will call the subscriber's onNext method whenever a new message has arrived.
The call to messageConsumer.unregister() prevents any further messages from being received.

However, I am finding that if the retryWhen operator is waiting for the timer to emit when a cancellation message is received, the error is ignored and the retryWhen loop continues as if the cancellation was never received.
The retryWhen operator turns an upstream Throwable into a value and routes it through the sequence you provide. Whatever that sequence emits in response tells retryWhen whether to retry the upstream or end the stream. Thus
Observable.error(new IOException())
.retryWhen((Observable<Throwable> error) -> error)
.subscribe();
Will retry indefinitely because the inner error is considered a value now, not an exception.
retryWhen doesn't know by itself which error values it should treat as non-retryable; deciding that is the job of your inner flow:
Observable.defer( () -> getEndpoint() )
.mergeWith( getCancellationMessage() )
.flatMap( endpoint -> useEndpoint( endpoint ) )
.retryWhen( obs -> obs
.takeWhile( error -> !( error instanceof CancellationException ) ) // <-------
.flatMap( error -> Observable.timer(/* args */) )
)
.subscribe( result -> useResult( result ),
error -> handleError( error )
);
Here, the error is passed on only if it is not a CancellationException (substitute your own error type). When a CancellationException arrives, takeWhile completes the inner sequence, which in turn completes the whole stream.
If you want the sequence to end with an error instead, we need to change the flatMap logic instead:
.retryWhen(obs -> obs
.flatMap( error -> {
if (error instanceof CancellationException) {
return Observable.error(error);
}
return Observable.timer(/* args */);
})
)
Note that returning Observable.empty() from flatMap doesn't end the sequence: it only indicates that one of the merged sources is empty, while other inner sources may still be active. With retryWhen in particular, an empty() will hang the sequence indefinitely because no signal arrives to indicate either retry or end-of-sequence.
Edit:
Based on your wording, I assume getCancellationMessage() is a hot observable. Hot observables have to be observed in order to receive their events or errors. While retryWhen is in its retry grace period waiting on timer(), nothing is subscribed to the topmost mergeWith, and therefore nothing is subscribed to getCancellationMessage(), so the cancellation can't stop the timer at that point.
You have to keep a subscription to it while the timer executes to stop it right away:
Observable<Object> cancel = getCancellationMessage();
Observable.defer( () -> getEndpoint() )
.mergeWith( cancel )
.flatMap( endpoint -> useEndpoint( endpoint ) )
.retryWhen( obs -> obs
.flatMap( error -> {
if (error instanceof CancellationException) {
return Observable.error(error);
}
return Observable.timer(/* args */).takeUntil( cancel );
})
)
.subscribe( result -> useResult( result ),
error -> handleError( error )
);
In this case, if cancel fires while the timer is executing, the retryWhen will stop the timer and terminate with the cancel error immediately.
Using takeUntil is one option; as you found out, merging the timer with cancel again via mergeWith( cancel ) works as well.

Related

"go func" recursion vs. for loop performance/patterns

I'm writing a socket handler, and I thought of two ways to write individual synchronous event handlers (events of same type must be received in order):
For loop
loop:
for {
    var packet EventType
    select {
    case packet = <-eventChannel:
    case <-stop:
        break loop // A bare break would only exit the select, not the for loop.
    }
    // Logic
}
go func Recursion
func GetEventType() {
var packet EventType
select {
case packet = <-eventChannel:
case <- stop:
return
}
// Logic
go GetEventType()
}
I know that looping is almost always more efficient than recursing, but I couldn't find much on the performance of go func relative to alternatives. Here's my initial take on each method:
For loop:
Doesn't start new thread each call
Doesn't use call stack
Good pattern
go func Recursion:
Clean
Doesn't require anonymous function to use defer
Isolated access (data-hiding)
Are there any other reasons to use one over the other? Is method #2 an anti-pattern? Could method #2 cause a major slow-down (call stack?) under high throughput?
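For comparison, here is a minimal sketch of the loop variant that still gets per-event defer behavior by calling a function for each event. The Event type, the handleEvents name, and the handle callback are hypothetical stand-ins for this illustration, not part of the original code.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for the question's EventType.
type Event int

// handleEvents processes events in order on a single goroutine until stop
// is closed or events is closed. It returns how many events it handled.
func handleEvents(events <-chan Event, stop <-chan struct{}, handle func(Event)) int {
	handled := 0
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return handled
			}
			// Calling a function per event gives each iteration its own
			// scope, so a defer inside handle runs once per event.
			handle(ev)
			handled++
		case <-stop:
			// return leaves the loop; a bare break would only leave the select.
			return handled
		}
	}
}

func main() {
	events := make(chan Event)
	stop := make(chan struct{})
	done := make(chan int)

	go func() {
		done <- handleEvents(events, stop, func(ev Event) {
			defer fmt.Println("finished event", ev)
			fmt.Println("handling event", ev)
		})
	}()

	for i := 1; i <= 3; i++ {
		events <- Event(i)
	}
	close(stop)
	fmt.Println("handled:", <-done)
}
```

Because only one goroutine ever reads from the channel, events are handled strictly in arrival order, and no new goroutine is created per event.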

How to do idiomatic synchronization with time.After?

I'm writing an application that queues incoming requests. If a request has been on the queue for more than a certain amount of time, I'd like to throw a timeout. I'm doing that with time.After:
timeoutCh := time.After(5 * time.Second)
select {
case <-timeoutCh:
//throw timeout 504
case <-processing:
//process request
}
The processing channel (along with the request) is put on the queue, and when a request is taken off to be processed, I send a signal to the channel to hit the case statement:
processing <- true
The problem with this is that if timeoutCh has already been selected, the processing channel will block, so I need some way to check whether the request has timed out.
I considered using a shared atomic boolean, but if I do something like this:
case <-timeoutCh:
requestTimedOut = true
and then check the boolean before sending to the processing channel, there's still a race condition, because the timeoutCh case may have been selected, but the bool not yet set to true!
Is there an idiomatic way of dealing with this sort of synchronization problem in Go?
Use a mutex to coordinate processing of the data and the timeout.
Define a type to hold the mutex, input, result, a channel to signal completion of the work and a flag indicating that the work, if any, is complete.
type work struct {
sync.Mutex
input InputType
result ResultType
signal chan struct {}
done bool
}
The request handler creates and enqueues a work item and waits for a timeout or a signal from the queue processor. Either way, the request handler checks to see if the queue processor did the work and responds as appropriate.
func handler(resp http.ResponseWriter, req *http.Request) {
    w := &work{
        input:  computeInputFromRequest(req),
        signal: make(chan struct{}),
    }
    enqueue(w)
    // Wait for timeout or for queue processor to signal that the work is complete.
    select {
    case <-time.After(5 * time.Second):
    case <-w.signal:
    }
    w.Lock()
    done := w.done // Record state of the work item.
    w.done = true  // Mark the work item as complete.
    w.Unlock()
    if !done {
        http.Error(resp, "Timeout", http.StatusGatewayTimeout)
    } else {
        respondWithResult(resp, w.result)
    }
}
The queue processor will look something like this:
for {
w := dequeue()
w.Lock()
if !w.done {
w.done = true
w.result = computeResultFromInput(w.input)
close(w.signal)
}
w.Unlock()
}
To ensure that the request handler waits on the result, the queue processor holds the lock while processing the work item.

Convert infinite stream of finite streams to an infinite stream - Reactive X

How can this be achieved in ReactiveX (ideally with examples in RxJava or RxJS)?
a |-a-------------------a-----------a-----------a----
s1 |-x-x-x-x-x-x -| (subscribe)
s2 |-x-x-x-x-x-| (subscribe)
s3 |-x-x-x-x-x-| (subscribe)
...
sn
S |-x-x-x-x-x-x-x-------x-x-x-x-x-x-x-------------x-x-x-x-x-x- (subscribe)
a is an infinite stream of events, each of which triggers a finite stream sn of events. Every sn should become part of the infinite stream S, and I also need to be able to subscribe to each sn individually (in order to do summation operations) while keeping S infinite.
EDIT: To be more concrete I provide the implementation of what I am looking for in Kotlin.
Every 10 seconds an event is emitted, which maps to a shared finite stream of 4 events. The metastream is flatMap-ed into a normal infinite stream. I make use of doAfterNext to additionally subscribe to each finite stream and print out the results.
/** Creates a finite stream with events
* $ch-1 - $ch-4
*/
fun createFinite(ch: Char): Observable<String> =
Observable.interval(1, TimeUnit.SECONDS)
.take(4)
.map({ "$ch-$it" }).share()
fun main(args: Array<String>) {
var ch = 'A'
Observable.interval(10, TimeUnit.SECONDS).startWith(0)
.map { createFinite(ch++) }
.doAfterNext {
it
.count()
.subscribe({ c -> println("I am done. Total event count is $c") })
}
.flatMap { it }
.subscribe { println("Just received [$it] from the infinite stream ") }
// Let main thread wait forever
CountDownLatch(1).await()
}
However I am not sure if this is the 'pure RX' way.
You don't make clear how you want to do the counting. If you are doing a total count, then there is no need to do the interior subscription:
val counter = AtomicLong()
Observable.interval(10, TimeUnit.SECONDS).startWith(0)
    .map { createFinite(ch++) }
    .flatMap { it }
    .doOnNext { counter.incrementAndGet() }
    .subscribe { println("Just received [$it] from the infinite stream ") }
On the other hand, if you need to provide a count for each intermediate observable, then you can move the counting inside the flatMap() and print out the count and reset it on completion:
val counter = AtomicLong()
Observable.interval(10, TimeUnit.SECONDS).startWith(0)
    .map { createFinite(ch++) }
    .flatMap {
        it
            .doOnNext { counter.incrementAndGet() }
            .doOnCompleted {
                val ctr = counter.getAndSet(0)
                println("I am done. Total event count is $ctr")
            }
    }
    .subscribe { println("Just received [$it] from the infinite stream ") }
This isn't very functional, but this kind of reporting tends to break normal streams.

Writing Sleep function based on time.After

EDIT: My question is different from How to write my own Sleep function using just time.After? It has a different variant of the code that's not working for a separate reason and I needed explanation as to why.
I'm trying to solve the homework problem here: https://www.golang-book.com/books/intro/10 (Write your own Sleep function using time.After).
Here's my attempt so far based on the examples discussed in that chapter:
package main
import (
"fmt"
"time"
)
func myOwnSleep(duration int) {
for {
select {
case <-time.After(time.Second * time.Duration(duration)):
fmt.Println("slept!")
default:
fmt.Println("Waiting")
}
}
}
func main() {
go myOwnSleep(3)
var input string
fmt.Scanln(&input)
}
http://play.golang.org/p/fb3i9KY3DD
My thought process is that the infinite for loop will keep executing the select statement's default case until the channel returned by time.After delivers a value. The problem with the current code is that the latter never happens, while the default case is executed endlessly.
What am I doing wrong?
In each iteration of your for loop the select statement is executed which involves evaluating the channel operands.
In each iteration time.After() will be called and a new channel will be created!
And if duration is more than 0, this channel is not ready to receive from, so the default case will be executed. This channel will not be tested/checked again, the next iteration creates a new channel which will again not be ready to receive from, so the default case is chosen again - as always.
The solution is really simple though as can be seen in this answer:
func Sleep(sec int) {
<-time.After(time.Second * time.Duration(sec))
}
Fixing your variant:
If you want to make your variant work, you have to create one channel only (using time.After()), store the returned channel value, and always check this same channel. And once the channel "kicks in" (a value is received from it), you must return from your function: no further values will ever be received from it, so otherwise your loop would never end!
func myOwnSleep(duration int) {
ch := time.After(time.Second * time.Duration(duration))
for {
select {
case <-ch:
fmt.Println("slept!")
return // MUST RETURN, else endless loop!
default:
fmt.Println("Waiting")
}
}
}
Note, though, that until a value is received from the channel, this function does not "rest": it executes code relentlessly, loading one CPU core. This might even cause trouble if only 1 CPU core is available (see runtime.GOMAXPROCS()): other goroutines (including the one that would send the value on the channel) might get blocked and never run. A short sleep (e.g. time.Sleep(time.Millisecond)) in the default case would release the CPU core from doing endless work and allow other goroutines to run.

Go, How do I pull X messages from a channel at a time

I have a channel with incoming messages and a goroutine that waits on it.
I process these messages and send them to a different server.
I would like to either process 100 messages at a time if they are ready,
or, after say 5 seconds, process whatever is in there and go back to waiting.
How do I do that in Go?
The routine you use to read from the message channel should define a cache in which incoming messages are stored. These cached messages are then sent to the remote server in bulk either when the cache reaches 100 messages, or 5 seconds have passed. You use a timer channel and Go's select statement to determine which one occurs first.
The following example can be run on the Go playground:
package main
import (
"fmt"
"math/rand"
"time"
)
type Message int
const (
CacheLimit = 100
CacheTimeout = 5 * time.Second
)
func main() {
input := make(chan Message, CacheLimit)
go poll(input)
generate(input)
}
// poll checks for incoming messages and caches them internally
// until either a maximum amount is reached, or a timeout occurs.
func poll(input <-chan Message) {
cache := make([]Message, 0, CacheLimit)
tick := time.NewTicker(CacheTimeout)
for {
select {
// Check if a new message is available.
// If so, store it and check if the cache
// has exceeded its size limit.
case m := <-input:
cache = append(cache, m)
if len(cache) < CacheLimit {
break
}
// Reset the timeout ticker.
// Otherwise we will get too many sends.
tick.Stop()
// Send the cached messages and reset the cache.
send(cache)
cache = cache[:0]
// Recreate the ticker, so the timeout trigger
// remains consistent.
tick = time.NewTicker(CacheTimeout)
// If the timeout is reached, send the
// current message cache, regardless of
// its size.
case <-tick.C:
send(cache)
cache = cache[:0]
}
}
}
// send sends cached messages to a remote server.
func send(cache []Message) {
if len(cache) == 0 {
return // Nothing to do here.
}
fmt.Printf("%d message(s) pending\n", len(cache))
}
// generate creates some random messages and pushes them into the given channel.
//
// Not part of the solution. This just simulates whatever you use to create
// the messages by creating a new message at random time intervals.
func generate(input chan<- Message) {
for {
select {
case <-time.After(time.Duration(rand.Intn(100)) * time.Millisecond):
input <- Message(rand.Int())
}
}
}
