Is Observable doOnNext thread-safe when merging Observables into one Observable? - thread-safety

I have a list of Observables, each subscribed on a pooled-thread Scheduler via subscribeOn. I merge the list into one Observable with
Observable.merge(observables). Is the doOnNext callback of the merged Observable thread-safe? The code sample is below.
ThreadFactory threadFactory = new ThreadFactoryBuilder()
        .setDaemon(true)
        .setNameFormat("pooled-%s")
        .build();
ExecutorService executorService = Executors.newCachedThreadPool(threadFactory);
List<Observable<String>> observableList = Stream.of("1", "2", "3", "4")
        .map(o -> {
            Observable<String> ob = Observable.create(
                    emitter -> {
                        Thread.sleep(new Random().nextInt(50));
                        System.out.println("emitter:" + o + " Thread:" + Thread.currentThread().getName());
                        emitter.onNext(o);
                        emitter.onComplete();
                    });
            return ob.subscribeOn(Schedulers.from(executorService));
        })
        .collect(Collectors.toList());
List<String> result = new ArrayList<>(); // is a thread-safe list needed here?
Observable.merge(observableList)
        .doOnNext(e -> {
            Thread.sleep(new Random().nextInt(50));
            System.out.println("doOnNext:" + e + " Thread:" + Thread.currentThread().getName());
            result.add(e);
        })
        .blockingLast();
Case 1:
When I run it normally, the output suggests that doOnNext is called by a single thread.
emitter:2 Thread:pooled-1
emitter:3 Thread:pooled-2
emitter:4 Thread:pooled-3
emitter:1 Thread:pooled-0
doOnNext:2 Thread:pooled-1
doOnNext:1 Thread:pooled-1
doOnNext:3 Thread:pooled-1
doOnNext:4 Thread:pooled-1
Case 2:
When I step through it in a debugger, the output changes and doOnNext appears to be called by multiple threads.
emitter:1 Thread:pooled-0
doOnNext:1 Thread:pooled-0
emitter:4 Thread:pooled-3
emitter:3 Thread:pooled-2
emitter:2 Thread:pooled-1
doOnNext:4 Thread:pooled-3
doOnNext:2 Thread:pooled-3
doOnNext:3 Thread:pooled-3
I am confused.
I don't see any code that switches threads between the emitter and doOnNext, so why does case 1 occur? Am I missing some code?
Is doOnNext not thread-safe in this case, so that the result list should be a thread-safe list?
Observable.toList uses an ArrayList internally. Does it have a concurrency problem in this case?

Flows are sequential and guaranteed to be non-overlapping in operators, especially in combining/coordinating ones. However, this doesn't rule out doOnNext "jumping" between threads from one item to the next. If you want doOnNext to always run on the same thread, use observeOn before it.
"I don't see any code that switches threads between the emitter and doOnNext, so why does case 1 occur?"
Any number of threads can drive merge and push items through it. Whichever thread is currently emitting may also deliver items that other threads handed over in the meantime, which is why one pooled thread can end up running doOnNext for all items.
"Is doOnNext not thread-safe in this case, so that the result list should be a thread-safe list?"
doOnNext is thread-safe regarding its input parameter and execution. If your callback is stateful or captures global state, multiple realizations of the flow (that is, multiple subscriptions) can conflict. Those have to be synchronized by external means.
"Observable.toList uses an ArrayList internally. Does it have a concurrency problem in this case?"
toList is guaranteed to be driven in a non-overlapping fashion and its internal ArrayList is per subscriber, so there is no concurrency issue there.

Related

RxJS ShareReplay with retries every n-th second and no refCount

I'm trying to cache HTTP calls in the service so that all subsequent calls return the same response. This is fairly easy with shareReplay:
data = this.http.get(url).pipe(
  shareReplay(1)
);
But it doesn't work in case of backend / network errors. ShareReplay spams the backend with requests in case of any error when this Observable is bound to the view through async pipe.
I tried with retryWhen etc but the solution I got is untestable:
data = this.http.get(url).pipe(
  retryWhen(errors => errors.pipe(delay(10000))),
  shareReplay(1)
);
fakeAsync tests fail with a "1 timer(s) still in the queue" error because the delay timer has no end condition. I also don't want an endless timer hanging around in the background: it should stop with the last subscription.
The behavior I would like:
Multicast - make only one subscription to source even with many subscribers.
Do not count refs for successful queries - reuse same result when subscriber count goes to 0 and back to 1.
In case of error - retry every 10 seconds but only if there are any subscribers.
My 2 cents:
This code is for RxJS > 6.4 (here v6.6).
To use a shared observable, you need to return the same observable instance to all subscribers (otherwise each caller creates a new observable and there is nothing to share).
Multicasting can be done with shareReplay, and you can replay the last emitted value (even after the last subscriber has unsubscribed) using the {refCount: false} option.
As long as there is no subscription, the observable does nothing. Nothing is fetched from the server before the first subscriber.
Beware:
If refCount is false, the source will not be
unsubscribed meaning that the inner ReplaySubject will still be
subscribed to the source (and potentially run for ever).
Also:
A successfully completed source will stay cached in the shareReplayed
observable forever, but an errored source can be retried.
The problem with shareReplay is that you have to choose between:
Always getting the last value even if the ref count went back to 0, at the cost of possibly never-ending retries in case of error (remember that shareReplay with refCount set to false never unsubscribes from its source)
Or using refCount: true, which means a new "first subscriber" arriving after the count dropped to 0 gets no benefit from the cache. In return, the retry also stops when no subscriber is left.
Here is a dummy example:
class MyServiceClass {
  private data;

  // assuming you are injecting the http service
  constructor(private http: HttpService) {
    this.data = this.buildData("http://some/data");
  }

  // use this accessor to get the unique (shared) instance of the data observable
  public getData() {
    return this.data;
  }

  private buildData(url: string) {
    return this.http.get(url).pipe(
      retryWhen(errors => errors.pipe(delay(10000))),
      shareReplay({refCount: false})
    );
  }
}
Now, in my opinion, to fix the flow you should prevent the retry from running forever, for instance by adding a maximum number of retries.

RxJs operator that behaves like withLatestFrom but waits for value of second stream

Hi, I'm looking for an RxJS operator that behaves similarly to withLatestFrom, except that it waits for the second stream to emit a value instead of skipping the emission. To be clear: I only want emissions when the first stream emits a new value.
So instead of:
---A-----B----C-----D-|
------1----------2----|
withLatestFrom
---------B1---C1----D2|
I want this behavior:
---A-----B----C-----D-|
------1----------2----|
?????????????
------A1-B1---C1----D2|
Is there an operator for this?
Smola came up with a nice and clean solution in the comments that simply uses the distinctUntilKeyChanged operator:
combineLatest(first$, second$)
.pipe(distinctUntilKeyChanged(0))
In the RxViz diagram, this produces the desired result.
I don't think there's an operator that does exactly this, but you can achieve those results by combining a higher-order mapping operator and a Subject:
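Since the diagram is not reproduced here, the following is a minimal plain-JavaScript simulation (not RxJS itself) of what combineLatest followed by distinctUntilKeyChanged(0) does with the marble diagram from the question. The helper name simulateCombineLatestDistinct and the event format are assumptions made for this sketch only.

```javascript
// Simulates combineLatest(first$, second$).pipe(distinctUntilKeyChanged(0))
// over a time-ordered list of events. Hypothetical helper, not an RxJS API.
function simulateCombineLatestDistinct(events) {
  const NONE = Symbol("none");
  let first = NONE;
  let second = NONE;
  let lastEmittedFirst = NONE;
  const output = [];
  for (const { from, value } of events) {
    if (from === "first") first = value;
    else second = value;
    // combineLatest stays silent until both inputs have emitted at least once.
    if (first === NONE || second === NONE) continue;
    // distinctUntilKeyChanged(0) drops pairs whose first element is unchanged.
    if (first === lastEmittedFirst) continue;
    lastEmittedFirst = first;
    output.push(`${first}${second}`);
  }
  return output;
}

// The marble diagram from the question, flattened into one timeline.
const marbleTimeline = [
  { from: "first", value: "A" },
  { from: "second", value: 1 },
  { from: "first", value: "B" },
  { from: "first", value: "C" },
  { from: "second", value: 2 },
  { from: "first", value: "D" },
];
const combined = simulateCombineLatestDistinct(marbleTimeline);
console.log(combined.join(" ")); // A1 B1 C1 D2
```

Two caveats follow from this model: if first$ emits several values before second$ has emitted at all, only the latest of them is paired, and two consecutive equal values from first$ are treated as unchanged and produce a single emission.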
second$ = second$.pipe(shareReplay({ bufferSize: 1, refCount: false }));
first$.pipe(
  concatMap(
    firstVal => second$.pipe(
      map(secondVal => `${firstVal}${secondVal}`),
      take(1),
    )
  )
)
shareReplay places a ReplaySubject in front of the data producer. This means that it will replay the latest N (bufferSize) values to every new subscriber. refCount: false makes sure that the ReplaySubject in use is not destroyed when there are no more active subscribers.
I decided to use concatMap as I think it's safer for the ReplaySubject to have only one active subscriber.
Considering this scheme:
---A-----B----C-----D-| first$
------1----------2----| second$
When A comes in, the ReplaySubject (from shareReplay) receives a new subscriber, which waits until second$ emits. When that happens, you get A1 and the inner observable completes (meaning that its subscriber is removed from the ReplaySubject's subscribers list). 1 is cached by the ReplaySubject.
Then B comes in; the newly created inner subscriber subscribes to second$ and receives 1 immediately, resulting in B1. Same with C.
Now comes an important part: the ReplaySubject should have no active subscribers when it receives a new value from its source, which is why I opted for take(1). When 2 arrives, the ReplaySubject has no active subscribers, so nothing is emitted and 2 is only cached.
Then D arrives and receives the latest stored value, 2, resulting in D2.
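The walkthrough above can be replayed with a small plain-JavaScript simulation (again not RxJS itself). The helper name simulateConcatMapWait and the event format are assumptions for this sketch, and it assumes the replay buffer is already listening to second$ when second$ first emits.

```javascript
// Simulates first$.pipe(concatMap(v => second$.pipe(map(...), take(1))))
// where second$ replays its latest value. Hypothetical helper, not an RxJS API.
function simulateConcatMapWait(events) {
  const NONE = Symbol("none");
  let latestSecond = NONE; // the value cached by the replay buffer
  const waiting = [];      // values of first$ still waiting for second$
  const output = [];
  for (const { from, value } of events) {
    if (from === "first") {
      if (latestSecond === NONE) {
        waiting.push(value); // nothing cached yet: wait, in arrival order
      } else {
        output.push(`${value}${latestSecond}`); // cached value is replayed at once
      }
    } else {
      latestSecond = value;
      // The first value from second$ releases everything that was waiting.
      while (waiting.length > 0) output.push(`${waiting.shift()}${value}`);
    }
  }
  return output;
}

const questionTimeline = [
  { from: "first", value: "A" },
  { from: "second", value: 1 },
  { from: "first", value: "B" },
  { from: "first", value: "C" },
  { from: "second", value: 2 },
  { from: "first", value: "D" },
];
const paired = simulateConcatMapWait(questionTimeline);
console.log(paired.join(" ")); // A1 B1 C1 D2
```

In this model, values of first$ that arrive before second$ has emitted are queued rather than dropped, so a timeline of A, B, 1 yields A1 and then B1.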
This is how I did this in Kotlin using RxJava:
Observable.merge(
    Observable.combineLatest(streamOne(), streamTwo(), ::Pair).take(1),
    streamOne().skip(1).withLatestFrom(streamTwo(), ::Pair)
).subscribe {
    // Do your thing
}
Yeah, that's a join. From your example it isn't clear whether you're aiming for a left/right join or an inner join.
Oh, sorry. It is clear indeed. You need the semantics of an inner join (it emits iff data is present in both joinees and otherwise waits).

Wrap blocking code into a Mono flatMap, is this still a non-blocking operation?

If I wrap blocking code in a flatMap, is this still a non-blocking operation?
Example:
public Mono<String> foo() {
    return Mono.empty().flatMap(obj -> {
        try {
            Object temp = f.get(); // is the thread blocked at this point or not?
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e); // a lambda passed to flatMap cannot throw a checked exception
        }
        return Mono.just("test");
    });
}
So I think that when I wrap blocking code in reactive code, the operation is still non-blocking. If I am wrong, please explain it to me.
"If I wrap blocking code in a flatMap, is this still a non-blocking operation?"
flatMap doesn't create new threads for you. For example:
Mono.just("abc").flatMap(val -> Mono.just("cba")).subscribe();
All the code above is executed by the thread that called subscribe. So if the mapper function contains a long blocking operation, the thread that called subscribe is blocked as well.
To transform this into an asynchronous operation you can use subscribeOn(Schedulers.elastic()) (Schedulers.boundedElastic() in Reactor 3.4 and later, where elastic() is deprecated):
Mono.just("abc").flatMap(val -> Mono.just("cba")).subscribeOn(Schedulers.elastic());
Mono and Flux don't create threads themselves, but some operators take a Scheduler as an extra argument (such as interval), or alter the threading model altogether (such as subscribeOn).
One extra thing: in your example the mapper function is never going to be called, since you're applying flatMap to an empty Mono, which completes immediately without emitting any value.

Shared observable and startWith operator

I have a question regarding multicasted observables and an unexpected (for me) behaviour I noticed.
const a = Observable.fromEvent(someDom, 'click')
  .map(e => 1)
  .startWith(-1)
  .share();

const b = a.pairwise();

a.subscribe(a => {
  console.log(`Sub 1: ${a}`);
});
a.subscribe(a => {
  console.log(`Sub 2: ${a}`);
});
b.subscribe(([prevA, curA]) => {
  console.log(`Pairwise Sub: (${prevA}, ${curA})`);
});
So, there is a shared observable a, which emits 1 on every click event. -1 is emitted due to the startWith operator.
The observable b just creates a new observable by pairing up latest two values from a.
My expectation was:
[-1, 1] // first click
[ 1, 1] // all other clicks
What I observed was:
[1, 1] // from second click on, and all other clicks
What I noticed is that the value -1 is emitted immediately and consumed by Sub 1 before Sub 2 has even subscribed to the observable, and since a is multicasted, Sub 2 is too late for the party.
Now, I know that I could multicast via a BehaviorSubject and not use the startWith operator, but I want to understand what happens in this scenario when I use startWith and multicast via share.
As far as I understand, whenever I use .share() and .startWith(x), only one subscriber will be notified about the startWith value, since all other subscribers subscribe after the value has been emitted.
So is this a reason to multicast via some special subject (Behavior/Replay...), or am I missing something about this startWith/share scenario?
Thanks!
This is actually correct behavior.
The .startWith() emits its value to every new subscriber, not only the first one. The reason why b.subscribe(([prevA, curA]) ...) never receives it is that you're multicasting with .share() (aka .publish().refCount()).
This means that the first a.subscribe(...) makes .refCount() subscribe to its source, and it stays subscribed (note that Observable.fromEvent(someDom, 'click') never completes).
Then, when you finally call b.subscribe(...), it subscribes only to the Subject inside .share() and never goes through .startWith(-1), because the chain is multicasted and .share() is already subscribed to it.
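To make the subscription order concrete, here is a hand-rolled plain-JavaScript model of the situation. It is not RxJS: makeSource, shareOnce and pairwiseOf are hypothetical stand-ins for a source with startWith(-1), for share(), and for pairwise(), simplified to synchronous callbacks with no unsubscription.

```javascript
// Stand-in for fromEvent(...).map(() => 1).startWith(initial): every
// subscription first receives `initial`, then 1 for each later click.
function makeSource(initial) {
  const listeners = [];
  return {
    subscribe(observer) {
      observer(initial); // the startWith value, delivered on subscribe
      listeners.push(observer);
    },
    click() {
      listeners.forEach((listener) => listener(1));
    },
  };
}

// Stand-in for share(): subscribes to the source only once, when the
// first subscriber arrives, and fans values out to all subscribers.
function shareOnce(source) {
  const subscribers = [];
  let connected = false;
  return {
    subscribe(observer) {
      subscribers.push(observer);
      if (!connected) {
        connected = true;
        source.subscribe((value) => subscribers.forEach((s) => s(value)));
      }
    },
  };
}

// Stand-in for pairwise(): emits [previous, current] from the second value on.
function pairwiseOf(shared) {
  return {
    subscribe(observer) {
      let hasPrev = false;
      let prev;
      shared.subscribe((value) => {
        if (hasPrev) observer([prev, value]);
        prev = value;
        hasPrev = true;
      });
    },
  };
}

const clickSource = makeSource(-1);
const sharedA = shareOnce(clickSource);
const shareLog = [];
sharedA.subscribe((v) => shareLog.push(`Sub 1: ${v}`));
sharedA.subscribe((v) => shareLog.push(`Sub 2: ${v}`));
pairwiseOf(sharedA).subscribe(([p, c]) => shareLog.push(`Pairwise Sub: (${p}, ${c})`));
clickSource.click();
clickSource.click();
console.log(shareLog.join("\n"));
```

In this model only Sub 1 sees -1, because the initial value is delivered during the single subscription to the source, before Sub 2 and the pairwise subscriber have registered. The pairwise subscriber therefore emits (1, 1) only after the second click.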

Lossless rate-limiting in RxJS with queue clearing

In RxJS 5, I'm trying to implement a Throttler class.
import Rx from 'rxjs/Rx';

export default class Throttler {
  constructor(interval) {
    this.timeouts = [];
    this.incomingActions = new Rx.Subject();
    this.incomingActions
      // RxJS 5 has Observable.of; Observable.just is the RxJS 4 name
      .concatMap(action => Rx.Observable.of(action).delay(interval / 2))
      .subscribe(action => action());
  }

  clear() {
    // How do I do this?
  }

  do(action) {
    this.incomingActions.next(action);
  }
}
The following invariants must hold:
every action passed to do gets added to an action queue
the action queue gets processed in order and at a fixed interval as determined by the constructor parameter
the action queue can be cleared using clear().
My current implementation, as seen above, handles the fixed interval, but I don't know how to clear the queue. It also has the problem that all actions are delayed by interval / 2 ms even when the queue is empty.
P.S. The way I describe the invariants maps very easily to an implementation with setInterval and an array as a queue, but I'm wondering how I would do this with Rx.
This doesn't seem like a good place for the default Subject class. Extending it with your own subclass would be better, for the reasons you listed.
However, in your case I'd try to tag each action that comes into the .do(action) method with an index and add a .filter() operator before subscribe(), so that particular actions can be canceled by checking an array of indices marked as canceled. Since you're using concatMap(), you know that actions will always be called in the order they were added. The clear() method you want would then just mark all pending actions as canceled in that array.
You can also add a .do() operator after concatMap() and keep track of how many actions are scheduled at the moment with an accumulator. Adding an action would do scheduledAction++, while passing through the .do() right before .subscribe() would do scheduledAction--. Then you can use this variable to decide whether or not to chain a new action with .delay(interval / 2).
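The bookkeeping described above (an index per action, cancellation markers, and a check for whether anything is pending) can be sketched without Rx operators as follows. This is a plain-JavaScript illustration under assumptions, not the answer's Rx implementation: QueueThrottler, cancelledBelow and the injectable schedule parameter are names invented for this sketch, and the timer is injected only so the example runs deterministically.

```javascript
class QueueThrottler {
  // `schedule` defaults to a setTimeout wrapper; a fake can be injected for tests.
  constructor(interval, schedule = (fn, ms) => setTimeout(fn, ms)) {
    this.interval = interval;
    this.schedule = schedule;
    this.queue = [];         // pending { index, action } entries, in arrival order
    this.nextIndex = 0;      // index given to the next incoming action
    this.cancelledBelow = 0; // every index below this value counts as cancelled
    this.waiting = false;    // true while the interval after an action is running
  }

  do(action) {
    this.queue.push({ index: this.nextIndex++, action });
    if (!this.waiting) this.runNext(); // idle: run immediately, no delay
  }

  clear() {
    // Marks everything queued so far as cancelled.
    this.cancelledBelow = this.nextIndex;
  }

  runNext() {
    // Skip cancelled entries (the role of filter() in the Rx version).
    while (this.queue.length > 0 && this.queue[0].index < this.cancelledBelow) {
      this.queue.shift();
    }
    if (this.queue.length === 0) {
      this.waiting = false;
      return;
    }
    const { action } = this.queue.shift();
    this.waiting = true;
    action();
    this.schedule(() => this.runNext(), this.interval);
  }
}

// Usage with a fake timer: tick() fires the oldest scheduled callback.
const pendingTimers = [];
const fakeSchedule = (fn) => pendingTimers.push(fn);
const tick = () => pendingTimers.shift()();

const throttler = new QueueThrottler(100, fakeSchedule);
const ran = [];
throttler.do(() => ran.push("a")); // runs at once, the throttler was idle
throttler.do(() => ran.push("b")); // queued
throttler.do(() => ran.push("c")); // queued
throttler.clear();                 // cancels b and c
throttler.do(() => ran.push("d")); // queued after the clear
tick();                            // interval elapsed: d runs
tick();                            // interval elapsed: queue empty, back to idle
throttler.do(() => ran.push("e")); // runs at once again
console.log(ran.join(",")); // a,d,e
```

With the default scheduler, new QueueThrottler(500) would space queued actions 500 ms apart while still running the first action immediately when nothing is pending.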
