rxjs 5 publishReplay refCount - rxjs

I can't figure out how publishReplay().refCount() works.
For example (https://jsfiddle.net/7o3a45L1/):
var source = Rx.Observable.create(observer => {
  console.log("call");
  // expensive http request
  observer.next(5);
}).publishReplay().refCount();
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
console.log("");
subscription2 = source.subscribe({next: (v) => console.log('observerB: ' + v)});
subscription2.unsubscribe();
console.log("");
subscription3 = source.subscribe({next: (v) => console.log('observerC: ' + v)});
subscription3.unsubscribe();
console.log("");
subscription4 = source.subscribe({next: (v) => console.log('observerD: ' + v)});
subscription4.unsubscribe();
gives the following result:
call observerA: 5
observerB: 5 call observerB: 5
observerC: 5 observerC: 5 call observerC: 5
observerD: 5 observerD: 5 observerD: 5 call observerD: 5
1) Why are observerB, C and D called multiple times?
2) Why is "call" printed on every line, and not at the beginning of the line?
Also, if I call publishReplay(1).refCount(), it calls observerB, C and D two times each.
What I expect is that every new observer receives the value 5 exactly once and that "call" is printed only once.

publishReplay(x).refCount() combined does the following:
It creates a ReplaySubject which replays up to x emissions. If x is not defined, it replays the complete stream.
It makes this ReplaySubject multicast-compatible using the refCount() operator. This results in concurrent subscriptions receiving the same emissions.
Your example contains a few issues that cloud how it all works together. See the following revised snippet:
var state = 5;
var realSource = Rx.Observable.create(observer => {
  console.log("creating expensive HTTP-based emission");
  observer.next(state++);
  // observer.complete();
  return () => {
    console.log('unsubscribing from source');
  };
});
var source = Rx.Observable.of('')
  .do(() => console.log('stream subscribed'))
  .ignoreElements()
  .concat(realSource)
  .do(null, null, () => console.log('stream completed'))
  .publishReplay()
  .refCount();
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
subscription2 = source.subscribe(v => console.log('observerB: ' + v));
subscription2.unsubscribe();
subscription3 = source.subscribe(v => console.log('observerC: ' + v));
subscription3.unsubscribe();
subscription4 = source.subscribe(v => console.log('observerD: ' + v));
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/5.1.0/Rx.js"></script>
When running this snippet, we can see clearly that it is not emitting duplicate values for observer D; it is in fact creating a new emission for every subscription. How come?
Every subscription is unsubscribed before the next subscription takes place. This makes the refCount drop back to zero each time, so no multicasting is being done.
The issue is that the realSource stream does not complete. Because we are not multicasting, the next subscriber triggers a fresh subscription to realSource through the ReplaySubject, and the new emission arrives after the replay of all previously emitted values.
So to keep your stream from invoking the expensive HTTP request multiple times, you have to complete the stream, so that publishReplay knows it does not need to re-subscribe.

Generally: refCount means that the stream is hot/shared as long as there is at least one subscriber; it is reset (becomes cold again) when there are no subscribers.
This means that if you want to be absolutely sure that nothing is executed more than once, you should not use refCount() but simply connect() the stream to make it hot.
As an additional note: If you add an observer.complete() after the observer.next(5); you will also get the result you expected.
Sidenote: Do you really need to create your own custom Observable here? In 95% of the cases the existing operators are sufficient for the given use case.

This happens because you're using publishReplay(). It internally creates an instance of ReplaySubject that stores all values that go through it.
Since you're using Observable.create where you emit a single value, every time you call source.subscribe(...) you append one value to the buffer in ReplaySubject.
You're not getting "call" printed at the beginning of each line because the ReplaySubject first emits its buffer when you subscribe, and only then subscribes itself to its source.
For implementation details see:
https://github.com/ReactiveX/rxjs/blob/master/src/operator/multicast.ts#L63
https://github.com/ReactiveX/rxjs/blob/master/src/ReplaySubject.ts#L54
The same applies when using publishReplay(1). First it emits the buffered item from ReplaySubject and then yet another item from observer.next(5);
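The buffer-then-subscribe order described above can be reproduced with a small library-free model. This is only a sketch, assuming a replay buffer that survives reconnects and a reference count equal to the number of current observers. `replayRefCount` and `run` are hypothetical names, and the code is a toy stand-in, not RxJS's implementation.

```javascript
// Toy model, NOT RxJS: one replay buffer shared across reconnects.
function replayRefCount(producer) {
  const buffer = [];            // every value seen so far, replayed to newcomers
  const observers = new Set();  // current subscribers (its size is the ref count)
  let completed = false;

  return function subscribe(next) {
    buffer.forEach((v) => next(v));   // 1) replay the buffer first
    if (completed) return () => {};   // a completed source is never re-run
    observers.add(next);
    if (observers.size === 1) {       // 2) ref count 0 -> 1: (re)run the producer
      producer({
        next: (v) => { buffer.push(v); observers.forEach((o) => o(v)); },
        complete: () => { completed = true; observers.clear(); },
      });
    }
    return () => observers.delete(next); // unsubscribe
  };
}

// Subscribe and immediately unsubscribe two observers, as in the question.
function run(completeAfterNext) {
  const log = [];
  const source = replayRefCount((observer) => {
    log.push("call");
    observer.next(5);
    if (completeAfterNext) observer.complete();
  });
  source((v) => log.push("observerA: " + v))();
  source((v) => log.push("observerB: " + v))();
  return log;
}

console.log(run(false).join(", ")); // call, observerA: 5, observerB: 5, call, observerB: 5
console.log(run(true).join(", "));  // call, observerA: 5, observerB: 5
```

Without completion, observerB first receives the replayed 5 and then a second 5 from the re-run producer. With completion, the producer runs once and observerB only sees the replay.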

Related

How to get similar behavior to bufferCount whilst emitting if there are less items than the buffer count

I'm trying to achieve something very similar to a buffer count. As values come through the pipe, bufferCount of course buffers them and sends them down in batches. I'd like something similar to this that will emit all remaining items if there are currently fewer than the buffer size in the stream.
It's a little confusing to word, so I'll provide an example with what I'm trying to achieve.
I have something adding items individually to a subject. Sometimes it'll add 1 item a minute, sometimes it'll add 1000 items in 1 second. I wish to run a long-running process (~2 seconds) on batches of these items so as not to overload the server.
So for example, consider the timeline where P is processing
---A-----------B----------C---D--EFGHI------------------
|_( P(A) ) |_(P(B)) |_( P(C) ) |_(P([D, E, F, G, H, I]))
This way I can process the events in small or large batches depending on how many events are coming through, while ensuring the batches remain smaller than X.
I basically need to map all the individual emits into emits that contain chunks of 5 or fewer. As I pipe the events into a concatMap, events will start to stack up. I want to pick these stacked up events off in batches. How can I achieve this?
Here's a stackblitz with what I've got so far: https://stackblitz.com/edit/rxjs-iqwcbh?file=index.ts
Note how items 4 and 5 aren't processed until more come in and fill the buffer. Ideally, after 1,2,3 are processed, it'll pick 4,5 off the queue. Then, when 6,7,8 come in, it'll process those.
EDIT: today I learned that bufferTime has a maxBufferSize parameter, that will emit when the buffer reaches that size. Therefore, the original answer below isn't necessary, we can simply do this:
const stream$ = subject$.pipe(
  bufferTime(2000, null, 3), // <-- buffer emits every 2000ms OR when 3 items are collected
  filter(arr => !!arr.length)
);
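To make the "size X or time Y" rule concrete without any library, here is a minimal sketch. `createBatcher` and its parameters are hypothetical names. The timer here starts when the first item of a batch arrives, which is an assumption of this sketch and not a statement about how bufferTime schedules its windows. The scheduler is injectable only so the behavior can be exercised without waiting.

```javascript
// Library-free sketch: flush when maxSize items are collected OR delayMs after
// the first item of the current batch, whichever comes first.
function createBatcher(maxSize, delayMs, onBatch, schedule = setTimeout, cancel = clearTimeout) {
  let items = [];
  let timer = null;

  function flush() {
    if (timer !== null) {
      cancel(timer);
      timer = null;
    }
    if (items.length === 0) return; // never emit an empty batch
    const batch = items;
    items = [];
    onBatch(batch);
  }

  function push(item) {
    items.push(item);
    if (items.length >= maxSize) {
      flush();                          // size limit reached
    } else if (timer === null) {
      timer = schedule(flush, delayMs); // first item of a new batch
    }
  }

  return { push, flush };
}

// Usage: 5 items arrive at once, batches of at most 3, 20 ms timeout.
const batcher = createBatcher(3, 20, (batch) => console.log(batch.join(",")));
[1, 2, 3, 4, 5].forEach((n) => batcher.push(n));
// prints "1,2,3" immediately, then "4,5" once the 20 ms timer fires
```

Each batch passed to onBatch could then be handed to the slow processing step one at a time, which is the role concatMap plays in the RxJS version.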
ORIGINAL:
It sounds like you want a combination of bufferCount and bufferTime. In other words: "release the buffer when it reaches size X or after Y time has passed".
We can use the race operator, along with those other two to create an observable that emits when the buffer reaches the desired size OR after the duration has passed. We'll also need a little help from take and repeat:
const chunk$ = subject$.pipe(bufferCount(3));
const partial$ = subject$.pipe(
  bufferTime(2000),
  filter(arr => !!arr.length) // don't emit empty array
);
const stream$ = race([chunk$, partial$]).pipe(
  take(1),
  repeat()
);
Here we define stream$ to be the first to emit between chunk$ and partial$. However, race will only use the first source that emits, so we use take(1) and repeat to sort of "reset the race".
Then you can do your work with concatMap like this:
stream$.pipe(
concatMap(chunk => this.doWorkWithChunk(chunk))
);
Here's a working StackBlitz demo.
You may want to roll it into a custom operator, so you can simply do something like this:
const stream$ = subject$.pipe(
bufferCountTime(5, 2000)
);
The definition of bufferCountTime() could look like this:
function bufferCountTime<T>(count: number, time: number) {
  return (source$: Observable<T>) => {
    const chunk$ = source$.pipe(bufferCount(count));
    const partial$ = source$.pipe(
      bufferTime(time),
      filter((arr: T[]) => !!arr.length)
    );
    return race([chunk$, partial$]).pipe(
      take(1),
      repeat()
    );
  };
}
Another StackBlitz sample.
Since I noticed the use of forkJoin in your sample code, I can see you are sending a request to the server for each emission (I was originally under the impression that you were making only 1 call per batch with combined data).
In the case of sending one request per item the solution is much simpler!
There is no need to batch the emissions, you can simply use mergeMap and specify its concurrency parameter. This will limit the number of currently executing requests:
const stream$ = subject$.pipe(
mergeMap(val => doWork(val), 3), // 3 max concurrent requests
);
When the subject emits rapidly, work starts only for the first 3 items. Emissions after that are queued up and processed as the prior in-flight items complete.
Here's a StackBlitz example of this behavior.
TLDR;
A StackBlitz app with the solution can be found here.
Explanation
Here would be an approach:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
const timeout$ = subject.pipe(
  filter((_, idx) => idx === 0),
  switchMapTo(timer(0))
);
subject
  .pipe(
    buffer(
      merge(count$, timeout$).pipe(
        take(1),
        repeat()
      )
    ),
    concatMap(buffer => forkJoin(buffer.map(doWork)))
  )
  .subscribe(/* console.warn */);
/* Output:
Processing 1
Processing 2
Processing 3
Processed 1
Processed 2
Processed 3
Processing 4
Processing 5
Processed 4
Processed 5
Processing 6 <- after the `setTimeout`'s timer expires
Processing 7
Processing 8
Processed 6
Processed 7
Processed 8
*/
The idea was to still use the bufferCount's behavior when items come in synchronously, but, at the same time, detect when fewer items than the chosen bufferLen are in the buffer. I thought that this detection could be done using a timer(0), because it internally schedules a macrotask, so it is ensured that items emitted synchronously will be considered first.
However, there is no operator that exactly combines the logic delineated above. But it's important to keep in mind that we certainly want a behavior similar to the one the buffer operator provides. As in, we will for sure have something like subject.pipe(buffer(...)).
Let's see how we can achieve something similar to what bufferCount does, but without using bufferCount:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
Given the above snippet, buffer(count$) and bufferCount(3) should give the same behavior.
Let's move now onto the detection part:
const timeout$ = subject.pipe(
  filter((_, idx) => idx === 0),
  switchMapTo(timer(0))
);
What it essentially does is to start a timer after the subject has emitted its first item. This will make more sense when we have more context:
subject
  .pipe(
    buffer(
      merge(count$, timeout$).pipe(
        take(1),
        repeat()
      )
    ),
    concatMap(buffer => forkJoin(buffer.map(doWork)))
  )
  .subscribe(/* console.warn */);
By using merge(count$, timeout$), this is what we'd be saying: when the subject emits, start adding items to the buffer and, at the same time, start the timer. The timer is started too because it is used to determine if fewer items will be in the buffer.
Let's walk through the example provided in the StackBlitz app:
from([1, 2, 3, 4, 5])
  .pipe(tap(i => subject.next(i)))
  .subscribe();
// Then mimic some more items coming through a while later
setTimeout(() => {
  subject.next(6);
  subject.next(7);
  subject.next(8);
}, 10000);
When 1 is emitted, it will be added to the buffer and the timer will start. Then 2 and 3 arrive immediately, so the accumulated values will be emitted.
Because we're also using take(1) and repeat(), the process will restart. Now, when 4 is emitted, it will be added to the buffer and the timer will start again. 5 arrives immediately, but the number of the collected items until now is less than the given buffer length, meaning that until the 3rd value arrives, the timer will have time to finish. When the timer finishes, the [4,5] chunk will be emitted. What happens with [6, 7, 8] is the same as what happened with [1, 2, 3].

RxSwift subscribe to latest element in one sequence similar to combineLatest

Suppose I have some Observable which may have some arbitrarily long sequence of events at the time I subscribe to it but which may also continue to emit events after I subscribe.
I am interested only in those events from the time at which I subscribe and later. How do I just get the latest events?
In this example I use a ReplaySubject as an artificial source to illustrate the question. In practice this would be some arbitrary Observable.
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = observable.subscribe(onNext: {
print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
Produces the output:
1
2
3
4
5
6
7
What I really want is only events from the time of subscription onwards. i.e. 4 5 6 7
I can use combineLatest with some other dummy Observable:
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = Observable.combineLatest(observable, Observable<Int>.just(42)) { value, _ in value }
.subscribe(onNext: {
print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
which produces the desired output 4 5 6 7
How can I produce a similar result without artificially introducing another arbitrary Observable?
I have tried a number of things including combineLatest with an array consisting of just one observable, but that emits the complete sequence, not just the latest. I know I could use PublishSubject but I am just using ReplaySubject here as an illustration.
By default, an observable will call its generator for every subscriber and emit all of the values produced by that generator. So for example:
let obs = Observable<Int>.create { observer in
    for each in [1, 2, 3, 5, 7, 11] {
        observer.onNext(each)
    }
    observer.onCompleted()
    return Disposables.create()
}
(Note that the above is the implementation of Observable.from(_:))
Every time something subscribes to obs the closure is called and all 6 next events will be received. This is what's known as a "cold" observable, and again it's the default behavior. Assume an Observable is cold unless you know otherwise.
There is also the concept of a "hot" observable. A hot observable doesn't call its generator function when something subscribes to it.
Based on your question, and your subsequent comment, it sounds like you want to know how to make a cold observable hot... The fundamental way is by calling .multicast on it (or one of the operators that use its implementation, like publish(), replay(_:) or replayAll()). There is also a special-purpose operator called .share() that will "heat up" an observable and keep it hot until all subscribers unsubscribe from it (then it will be cold again). And of course, Subjects are considered hot because they don't have a generator function to call.
Note, however, that many observables have synchronous behavior: they emit all their values as soon as something subscribes, and thus will have already completed before any other observer (on that thread) has a chance to subscribe.
Some more examples... .interval(_:scheduler:) is a cold observable with async behavior. Let's say you have the following:
let i = Observable<Int>.interval(.seconds(3), scheduler: MainScheduler.instance)
i.subscribe(onNext: { print($0, "from first") })
DispatchQueue.main.asyncAfter(deadline: .now() + 5) {
    i.subscribe(onNext: { print($0, "from second") })
}
What you will find is that each observer will get its own independent stream of values (both will start with 0) because the generator inside interval is called for both observers. So you will see output like:
0 from first
1 from first
0 from second
2 from first
1 from second
3 from first
2 from second
If you multicast the interval you will see different behavior:
let i = Observable<Int>.interval(.seconds(3), scheduler: MainScheduler.instance)
    .publish()
i.subscribe(onNext: { print($0, "from first") })
i.connect()
DispatchQueue.main.asyncAfter(deadline: .now() + 5) {
    i.subscribe(onNext: { print($0, "from second") })
}
The above will produce:
0 from first
1 from first
1 from second
2 from first
2 from second
3 from first
3 from second
(Note that "second" started with 1 instead of 0.) The share operator will work the same way in this case except you don't have to call connect() because it does it automatically.
Lastly, watch out. If you publish a synchronous observable, you might not get what you expect:
let i = Observable.from([1, 2, 3, 5])
.publish()
i.subscribe(onNext: { print($0, "from first") })
i.connect()
i.subscribe(onNext: { print($0, "from second") })
produces:
1 from first
2 from first
3 from first
5 from first
This is because all 5 events (the four next events and the completed event) are emitted as soon as connect() is called, before the second observer gets a chance to subscribe.
An article that might help you is Hot and Cold Observables but it's pretty advanced...
Why not simply use a PublishSubject like this? Isn't this the desired output? A PublishSubject only emits elements sent after an observer has subscribed; that is its whole purpose.
let observable = PublishSubject<Int>()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = observable.subscribe(onNext: {
    print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
If you don't want to use a subject, you can share the observable and add a second subscriber like this:
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
let shared = observable.share()
// this will print full sequence
shared.subscribe(onNext: {
print("full sequence: \($0)")
}).disposed(by: disposeBag)
// this will only print new events
shared.subscribe(onNext: {
print("shared sequence: \($0)")
}).disposed(by: disposeBag)
// new events
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
Observables are lazy, push-driven sequences. Without the first subscription, the stream won't even start. Once it has started, sharing it lets a later subscriber receive only the new events.

RxJs hot range observable

I'm trying to create a hot range observable. This means that when an observer subscribes to the observable after a certain timeout, it should not receive the values that have already been published. I have created the following program:
import Rx from "rxjs/Rx";
var x = Rx.Observable.range(1,10).share()
x.subscribe(x => {
  print('1: ' + x);
});
setTimeout(() => {
  x.subscribe(x => {
    print('2: ' + x);
  });
}, 1000);
function print(x) {
  const element = document.createElement('div');
  element.innerText = x;
  document.body.appendChild(element)
}
I expect this program to print 1 to 10, and then the second observer to print nothing, since the values 1 to 10 are produced within the first second. The expected output is shown below.
1: 1
1: 2
..
1: 10
However, I see that the second observer also prints all the values, even though I have put the share() operator behind it. The output is shown below.
1: 1
..
1: 10
2: 1
..
2: 10
Can somebody explain this to me?
share returns an observable that's reference counted for subscriptions. When the reference count goes from zero to one, the shared observable subscribes to the source - in your case, to the range observable. And when the reference count drops back to zero, it unsubscribes from the source.
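The key point in your snippet is that range emits its values synchronously and then completes. The completion unsubscribes the first subscriber from the shared observable, so the reference count drops back to zero and the shared observable unsubscribes from its source. When the second subscriber arrives a second later, the count goes from zero to one again and the shared observable re-subscribes to range, which emits all ten values anew.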
If you replace share with publish you should see the behaviour you expected:
var x = Rx.Observable.range(1,10).publish();
x.subscribe(x => print('1: ' + x));
x.connect();
publish returns a ConnectableObservable which is not reference counted and provides a connect method that can be called to explicitly connect - i.e. subscribe - to the source.
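The difference can be seen in a small library-free model. This is a sketch under the assumptions stated in the answer above (share resets when the source completes, publish only subscribes to the source on connect()). `range`, `share`, `publish` and `collect` below are toy stand-ins written for illustration, not the RxJS operators.

```javascript
// Toy cold source: emits `count` values synchronously, then completes.
function range(start, count) {
  return (observer) => {
    for (let i = 0; i < count; i++) observer.next(start + i);
    observer.complete();
  };
}

// Toy share(): ref-counted; resets when the source completes.
function share(source) {
  let observers = [];
  let connected = false;
  return (observer) => {
    observers.push(observer);
    if (connected) return;
    connected = true;
    source({
      next: (v) => observers.forEach((o) => o.next(v)),
      complete: () => {
        const done = observers;
        observers = [];
        connected = false; // the next subscriber re-subscribes to the source
        done.forEach((o) => o.complete());
      },
    });
  };
}

// Toy publish(): subscribes to the source only when connect() is called.
function publish(source) {
  const observers = [];
  let completed = false;
  return {
    subscribe: (observer) => (completed ? observer.complete() : observers.push(observer)),
    connect: () =>
      source({
        next: (v) => observers.forEach((o) => o.next(v)),
        complete: () => { completed = true; observers.forEach((o) => o.complete()); },
      }),
  };
}

// Helper observer that records what it receives.
const collect = (into) => ({ next: (v) => into.push(v), complete: () => {} });

const sharedFirst = [], sharedSecond = [];
const shared = share(range(1, 3));
shared(collect(sharedFirst));
shared(collect(sharedSecond)); // late subscriber: the source runs again

const publishedFirst = [], publishedSecond = [];
const published = publish(range(1, 3));
published.subscribe(collect(publishedFirst));
published.connect();
published.subscribe(collect(publishedSecond)); // late subscriber: nothing left

console.log(sharedFirst.join(), sharedSecond.join());                   // 1,2,3 1,2,3
console.log(publishedFirst.join(), "[" + publishedSecond.join() + "]"); // 1,2,3 []
```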

What is the best way to implement a poller with timeout as a reactive stream.

What is the best way to model a poller with a timeout, where a certain condition causes an early-exit as 'reactive streams'?
e.g.
If I had an observable which produced a decreasing sequence of positive integers every second
9,8,7,6,5,4,3,2,1,0
What is the best way to write a consumer which takes the latest single event after 5 seconds OR the '0' event if it produced earlier than the timeout.
This is my code as it stands at the moment: (Example in Java)
int initialValue = 10;
AtomicInteger counter = new AtomicInteger(initialValue);
Integer val = Observable.interval(1, TimeUnit.SECONDS)
    .map(tick -> counter.decrementAndGet())
    .takeUntil(it -> it == 0)
    .takeUntil(Observable.timer(5, TimeUnit.SECONDS))
    .lastElement()
    .blockingGet();
System.out.println(val);
If initialValue = 10, I expect 6 to print. If initialValue = 2, I expect 0 to print before the 5-second timeout expires.
I'm interested if there is a better way to do this.
I don't think there is really a much better way than what you did. You have to have the following:
An interval to emit on (interval)
An aggregator to decrement and store the last value (scan)
A termination condition on the value (takeWhile)
A termination condition on time (takeUntil(timer(...)))
Get the last value on completion (last)
Each one is represented by an operator. You can't do much to get around that. I used a few different operators (scan for aggregation and takeWhile for termination on value) but it is the same number of operators.
const { interval, timer } = rxjs;
const { scan, takeWhile, takeUntil, last, tap } = rxjs.operators;
function poll(start) {
  console.log('start', start);
  interval(1000).pipe(
    scan((x) => x - 1, start),
    takeWhile((x) => x >= 0),
    takeUntil(timer(5000)),
    tap((x) => { console.log('tap', x); }),
    last()
  ).subscribe(
    (x) => { console.log('next', x); },
    (e) => { console.log('error', e); },
    () => { console.log('complete'); }
  );
}
poll(10);
setTimeout(() => { poll(2); }, 6000);
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/6.1.0/rxjs.umd.min.js"></script>
I'm not clear on how you expect it to function at the boundaries. In your example you always decrement before emitting, so if your initial value is 10 then you emit 9, 8, 7, 6 (4 values). If you wanted to start with 10, you could do scan(..., start + 1), but that would end at 7 because the timer in the takeUntil(...) aligns with the source interval, so that 6 would be excluded. If you want to emit 5 values, you could do takeUntil(timer(5001)). Also, if you don't want to wait a second to emit the first value, you could put startWith(start) right after the scan(...). Or you could do timer(0, 1000) with scan(..., start + 1) instead of the source interval.
Also note that the termination on value (takeWhile) will not terminate until the invalid value (-1) is produced. So it will continue for a second after receiving the termination value (0). It seems that most of the termination operators work that way: if they terminate on some value, they won't let the others through.
You could do a take(5) instead of takeUntil(timer(5000)), because you know it fires on a matching interval, if that works for your scenario. That would also get around the issue of excluding the last value because of the timers lining up.

Can an RxJS 5 Observable source be stopped by another down the chain?

RxJS 5 Angular 2 RC4 app written in Typescript 1.9:
I have two observables in a chain. I would like, if a condition is met in the 2nd, for the first to be completed immediately. My efforts seem unnecessarily complex. In the example below, I try to stop the first observable after it has emitted 3 values:
this.source = Observable.interval(1000)
  .do(() => this.print("*******EMITTING from Source*******"))
  .switchMap(count => {
    if (count < 3) { // just pass along the value
      return Observable.create(observer => {
        observer.next(count);
        observer.complete();
      });
    } else { // abort by issuing a non-productive observable
      return Observable.create(observer =>
        observer.complete()
      );
    }
  });
this.source.subscribe(count => this.print('Output is ' + count));
Here is the output:
*******EMITTING from Source*******
Output is 0
*******EMITTING from Source*******
Output is 1
*******EMITTING from Source*******
Output is 2
*******EMITTING from Source*******
*******EMITTING from Source*******
*******EMITTING from Source*******
So, functionally I get the result I want because the wider script stops getting notifications after three outputs. However, I'm sure there is a better way. My problems are:
The upstream observable continues emitting forever. How can I stop it?
I'm creating a new observable down the chain on every emission. Shouldn't I be able to just pass along the first 3 values but abort or complete the chain on the 4th?
You can use the take operator to do it. take takes the first N events and completes the stream.
this.source = Observable.interval(1000)
  .do(() => this.print("*******EMITTING from Source*******"))
  .take(3);
this.source.subscribe(count => this.print('Output is ' + count));
Your example's stream doesn't complete because switchMap's outer stream doesn't complete when inner streams complete. switchMap() is equal to map().switch(). In your example, the map part emits observables like:
next(0), complete()
next(1), complete()
next(2), complete()
complete()
complete()
complete()
complete()
...(continues infinitely)...
And the switch part switches those observables and keeps waiting for upcoming observables.
EDIT
Your example also could be written as:
source = Observable.interval(1000)
  .do(() => this.print("*******EMITTING from Source*******"))
  .takeWhile(count => count < 3);
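Why this stops the upstream source can be illustrated with a library-free sketch. `createTicker` is a hypothetical manual source standing in for Observable.interval, and `takeWhile` here is a toy function written for illustration, not the RxJS operator. The assumption being modeled is that the operator unsubscribes from its source as soon as the predicate first fails.

```javascript
// Hypothetical manual source: call tick() to emit the next counter value.
function createTicker() {
  let count = 0;
  let observer = null;
  return {
    subscribe(obs) {
      observer = obs;
      return () => { observer = null; }; // teardown: stop emitting
    },
    tick() {
      if (observer) observer.next(count++);
    },
    get active() {
      return observer !== null;
    },
  };
}

// Toy takeWhile: forwards values while the predicate holds, then
// unsubscribes from the source and completes the downstream observer.
function takeWhile(source, predicate) {
  return {
    subscribe(obs) {
      let unsubscribe = () => {};
      unsubscribe = source.subscribe({
        next(value) {
          if (predicate(value)) {
            obs.next(value);
          } else {
            unsubscribe(); // stop the upstream source
            obs.complete();
          }
        },
      });
      return unsubscribe;
    },
  };
}

const ticker = createTicker();
const seen = [];
let completed = false;
takeWhile(ticker, (n) => n < 3).subscribe({
  next: (n) => seen.push(n),
  complete: () => { completed = true; },
});
for (let i = 0; i < 6; i++) ticker.tick(); // ticks after the 4th do nothing

console.log(seen.join(), completed, ticker.active); // 0,1,2 true false
```

In contrast to the switchMap version in the question, nothing is left running upstream once the condition fails.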
EDIT 2
Regarding your comment, if you want to terminate the stream if the inner stream emits true:
source = Observable.interval(1000)
  .do(() => this.print("*******EMITTING from Source*******"))
  .switchMap(count => createSomeObservable(count))
  .takeWhile(x => x !== true);

Resources