I have 2 observables A and B which can emit at any time. But only when A emits a new value and then B emits a new value too, I collect these 2 values. If B just emits new values without A emitting new values first, I don't collect any values.
I know concatMap might be useful but it needs the previous observable to complete whereas in my case neither ever completes until everything is destroyed.
This can be modelled as projecting each element of A into the first arriving element of B, discarding any previous subscription to B when a new A arrives, e.g.:
A.pipe(
switchMap(x => B.pipe(
first(),
map(y => ({ a: x, b: y }))
)
)
I'm trying to achieve something very similar to a buffer count. As values come through the pipe, bufferCount of course buffers them and sends them down in batches. I'd like something similar to this that will emit all remaining items if there are currently fewer than the buffer size in the stream.
It's a little confusing to word, so I'll provide an example with what I'm trying to achieve.
I have something adding items individually to a subject. Sometimes it'll add 1 item a minute, sometimes it'll add 1000 items in 1 second. I wish to do a long running process (2 seconds~) on batches of these items as to not overload the server.
So for example, consider the timeline where P is processing
---A-----------B----------C---D--EFGHI------------------
|_( P(A) ) |_(P(B)) |_( P(C) ) |_(P([D, E, F, G, H, I]))
This way I can process the events in small or large batches depending on how many events are coming through, but i ensure the batches remain smaller than X.
I basically need to map all the individual emits into emits that contain chunks of 5 or fewer. As I pipe the events into a concatMap, events will start to stack up. I want to pick these stacked up events off in batches. How can I achieve this?
Here's a stackblitz with what I've got so far: https://stackblitz.com/edit/rxjs-iqwcbh?file=index.ts
Note how item 4 and 5 don't process until more come in and fill in the buffer. Ideally after 1,2,3 are processed, it'll pick off 4,5 the queue. Then when 6,7,8 come in, it'll process those.
EDIT: today I learned that bufferTime has a maxBufferSize parameter, that will emit when the buffer reaches that size. Therefore, the original answer below isn't necessary, we can simply do this:
const stream$ = subject$.pipe(
bufferTime(2000, null, 3), // <-- buffer emits # 2000ms OR when 3 items collected
filter(arr => !!arr.length)
);
StackBlitz
ORIGINAL:
It sounds like you want a combination of bufferCount and bufferTime. In other words: "release the buffer when it reaches size X or after Y time has passed".
We can use the race operator, along with those other two to create an observable that emits when the buffer reaches the desired size OR after the duration has passed. We'll also need a little help from take and repeat:
const chunk$ = subject$.pipe(bufferCount(3));
const partial$ = subject$.pipe(
bufferTime(2000),
filter(arr => !!arr.length) // don't emit empty array
);
const stream$ = race([chunk$, partial$]).pipe(
take(1),
repeat()
);
Here we define stream$ to be the first to emit between chunk$ and partial$. However, race will only use the first source that emits, so we use take(1) and repeat to sort of "reset the race".
Then you can do your work with concatMap like this:
stream$.pipe(
concatMap(chunk => this.doWorkWithChunk(chunk))
);
Here's a working StackBlitz demo.
You may want to roll it into a custom operator, so you can simply do something like this:
const stream$ = subject$.pipe(
bufferCountTime(5, 2000)
);
The definition of bufferCountTime() could look like this:
function bufferCountTime<T>(count: number, time: number) {
return (source$: Observable<T>) => {
const chunk$ = source$.pipe(bufferCount(count));
const partial$ = source$.pipe(
bufferTime(time),
filter((arr: T[]) => !!arr.length)
);
return race([chunk$, partial$]).pipe(
take(1),
repeat()
);
}
}
Another StackBlitz sample.
Since I noticed the use of forkJoin in your sample code, I can see you are sending a request to the server for each emission (I was originally under the impression that you were making only 1 call per batch with combined data).
In the case of sending one request per item the solution is much simpler!
There is no need to batch the emissions, you can simply use mergeMap and specify its concurrency parameter. This will limit the number of currently executing requests:
const stream$ = subject$.pipe(
mergeMap(val => doWork(val), 3), // 3 max concurrent requests
);
Here is a visual of what the output would look like when the subject rapidly emits:
Notice the work only starts for the first 3 items initially. Emissions after that are queued up and processed as the prior in flight items complete.
Here's a StackBlitz example of this behavior.
TLDR;
A StackBlitz app with the solution can be found here.
Explanation
Here would be an approach:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
const timeout$ = subject.pipe(
filter((_, idx) => idx === 0),
switchMapTo(timer(0))
);
subject
.pipe(
buffer(
merge(count$, timeout$).pipe(
take(1),
repeat()
)
),
concatMap(buffer => forkJoin(buffer.map(doWork)))
)
.subscribe(/* console.warn */);
/* Output:
Processing 1
Processing 2
Processing 3
Processed 1
Processed 2
Processed 3
Processing 4
Processing 5
Processed 4
Processed 5
Processing 6 <- after the `setTimeout`'s timer expires
Processing 7
Processing 8
Processed 6
Processed 7
Processed 8
*/
The idea was to still use the bufferCount's behavior when items come in synchronously, but, at the same time, detect when fewer items than the chosen bufferLen are in the buffer. I thought that this detection could be done using a timer(0), because it internally schedules a macrotask, so it is ensured that items emitted synchronously will be considered first.
However, there is no operator that exactly combines the logic delineated above. But it's important to keep in mind that we certainly want a behavior similar to the one the buffer operator provides. As in, we will for sure have something like subject.pipe(buffer(...)).
Let's see how we can achieve something similar to what bufferTime does, but without using bufferTime:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
Given the above snippet, using buffer(count$) and bufferTime(3), we should get the same behavior.
Let's move now onto the detection part:
const timeout$ = subject.pipe(
filter((_, idx) => idx === 0),
switchMapTo(timer(0))
);
What it essentially does is to start a timer after the subject has emitted its first item. This will make more sense when we have more context:
subject
.pipe(
buffer(
merge(count$, timeout$).pipe(
take(1),
repeat()
)
),
concatMap(buffer => forkJoin(buffer.map(doWork)))
)
.subscribe(/* console.warn */);
By using merge(count$, timeout$), this is what we'd be saying: when the subject emits, start adding items to the buffer and, at the same time, start the timer. The timer is started too because it is used to determine if fewer items will be in the buffer.
Let's walk through the example provided in the StackBlitz app:
from([1, 2, 3, 4, 5])
.pipe(tap(i => subject.next(i)))
.subscribe();
// Then mimic some more items coming through a while later
setTimeout(() => {
subject.next(6);
subject.next(7);
subject.next(8);
}, 10000);
When 1 is emitted, it will be added to the buffer and the timer will start. Then 2 and 3 arrive immediately, so the accumulated values will be emitted.
Because we're also using take(1) and repeat(), the process will restart. Now, when 4 is emitted, it will be added to the buffer and the timer will start again. 5 arrives immediately, but the number of the collected items until now is less than the given buffer length, meaning that until the 3rd value arrives, the timer will have time to finish. When the timer finishes, the [4,5] chunk will be emitted. What happens with [6, 7, 8] is the same as what happened with [1, 2, 3].
I have a hard time understanding something related to using a concat function inside a mergeMap operator.
The documentation for concat says that
You can pass either an array of Observables, or put them directly as arguments.
When I put the Observables directly as arguments, like in the following example, I correctly get 20 and 24 in the console.
of(4)
.pipe(
mergeMap(number => concat(of(5 * number), of(6 * number)))
)
.subscribe(value => console.log(value));
But when I put them as an array, then in the console I get the Observables and not their values:
of(4)
.pipe(
mergeMap(number => concat([of(5 * number), of(6 * number)]))
)
.subscribe(value => console.log(value));
Here's a live version in Stackblitz.
Any idea why is that? Shouldn't both examples work identically?
Those two scenarios are different and they should not work identically. concat takes Observables as arguments and it will sequentially subscribe to those streams and only subscribe to the next Observable when the previous one completed. Every operator or creation method returns an Observable. This means that in the first example, when you are using concat, it will return an Observable that emits 20 and then 24. Because you are dealing with a nested Observable you have to use mergeMap which will subscribe to the resulting Observable returned by concat.
Now in the second example, if you pass in an array, concat will convert this (using from() internally) to an Observable that emits 2 values, and those values are Observables again. So you have 3 levels of nesting here. The first is the most outer Observable, the source, which of(4), the second level is the one you map to inside your mergeMap and the third in the second example are the Observables inside your array. The thing is you only flatten the levels up to level 2 but not the 3rd level. Again, in your second example the Observable returned by mergeMap emits two Observables but those are just the proxies and not the values emitted by these Observables. If you want to subscribe to those as well, you could chain on another mergeMap like so
concatArray() {
of(4)
.pipe(
mergeMap(number => concat([of(5 * number), of(6 * number)])),
mergeMap(x => x)
)
.subscribe(value => console.log(value));
}
Another way is to spread the array so that concat does not receive an object that is ArrayLike but rather Observables directly:
concatArray() {
of(4)
.pipe(
mergeMap(number => concat(...[of(5 * number), of(6 * number)]))
)
.subscribe(value => console.log(value));
}
Both will print out:
20
24
I hope this makes this a little bit more clear and describes the differences between the first and the second example.
I'm trying to create a hot range observable. This means that when I have an observer observering the observable after a certain timeout, it should not receive the values that have already been published. I have created the following program:
import Rx from "rxjs/Rx";
var x = Rx.Observable.range(1,10).share()
x.subscribe(x => {
print('1: ' + x);
});
setTimeout(() => {
x.subscribe(x => {
print('2: ' + x);
});
}, 1000);
function print(x) {
const element = document.createElement('div');
element.innerText = x;
document.body.appendChild(element)
}
I expect this program to print 1 to 10, and then the second observable to print nothing, since the values 1 to 10 are produced within the first second. The expected output is shown below.
1: 1
1: 2
..
1:10
However, I see that it also prints all the values. Eventhough I have put the share() operator behind it. The output is shown below.
1: 1
..
1: 10
2: 1
..
2: 10
Can somebody explain this to me?
share returns an observable that's reference counted for subscriptions. When the reference count goes from zero to one, the shared observable subscribes to the source - in your case, to the range observable. And when the reference count drops back to zero, it unsubscribes from the source.
The key point in your snippet is that range emits it's values synchronously and then completes. And the completion effects an unsubscription from the shared observable and that sees the reference count drop back to zero - which sees the shared observable unsubscribe from its source.
If you replace share with publish you should see the behaviour you expected:
var x = Rx.Observable.range(1,10).publish();
x.subscribe(x => print('1: ' + x));
x.connect();
publish returns a ConnectableObservable which is not reference counted and provides a connect method that can be called to explicitly connect - i.e. subscribe - to the source.
I can't figure out how publishReplay().refCount() works.
For example (https://jsfiddle.net/7o3a45L1/):
var source = Rx.Observable.create(observer => {
console.log("call");
// expensive http request
observer.next(5);
}).publishReplay().refCount();
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
console.log("");
subscription2 = source.subscribe({next: (v) => console.log('observerB: ' + v)});
subscription2.unsubscribe();
console.log("");
subscription3 = source.subscribe({next: (v) => console.log('observerC: ' + v)});
subscription3.unsubscribe();
console.log("");
subscription4 = source.subscribe({next: (v) => console.log('observerD: ' + v)});
subscription4.unsubscribe();
gives the following result:
call observerA: 5
observerB: 5 call observerB: 5
observerC: 5 observerC: 5 call observerC: 5
observerD: 5 observerD: 5 observerD: 5 call observerD: 5
1) Why are observerB, C and D called multiple times?
2) Why "call" is printed on each line and not in the beginning of the line?
Also, if i call publishReplay(1).refCount(), it calls observerB, C and D 2 times each.
What i expect is that every new observer receives the value 5 exactly once and "call" is printed only once.
publishReplay(x).refCount() combined does the following:
It create a ReplaySubject which replay up to x emissions. If x is not defined then it replays the complete stream.
It makes this ReplaySubject multicast compatible using a refCount() operator. This results in concurrent subscriptions receiving the same emissions.
Your example contains a few issues clouding how it all works together. See the following revised snippet:
var state = 5
var realSource = Rx.Observable.create(observer => {
console.log("creating expensive HTTP-based emission");
observer.next(state++);
// observer.complete();
return () => {
console.log('unsubscribing from source')
}
});
var source = Rx.Observable.of('')
.do(() => console.log('stream subscribed'))
.ignoreElements()
.concat(realSource)
.do(null, null, () => console.log('stream completed'))
.publishReplay()
.refCount()
;
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
subscription2 = source.subscribe(v => console.log('observerB: ' + v));
subscription2.unsubscribe();
subscription3 = source.subscribe(v => console.log('observerC: ' + v));
subscription3.unsubscribe();
subscription4 = source.subscribe(v => console.log('observerD: ' + v));
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/5.1.0/Rx.js"></script>
When running this snippet we can see clearly that it is not emitting duplicate values for Observer D, it is in fact creating new emissions for every subscription. How come?
Every subscription is unsubscribed before the next subscription takes place. This effectively makes the refCount decrease back to zero, no multicasting is being done.
The issue resides in the fact that the realSource stream does not complete. Because we are not multicasting the next subscriber gets a fresh instance of realSource through the ReplaySubject and the new emissions are prepended with the previous already emitted emissions.
So to fix your stream from invoking the expensive HTTP request multiple times you have to complete the stream so the publishReplay knows it does not need to re-subscribe.
Generally: The refCount means, that the stream is hot/shared as long as there is at least 1 subscriber - however, it is being reset/cold when there are no subscribers.
This means if you want to be absolutely sure that nothing is executed more than once, you should not use refCount() but simply connect the stream to set it hot.
As an additional note: If you add an observer.complete() after the observer.next(5); you will also get the result you expected.
Sidenote: Do you really need to create your own custom Obervable here? In 95% of the cases the existing operators are sufficient for the given usecase.
This happens because you're using publishReplay(). It internally creates an instance of ReplaySubject that stores all values that go through.
Since you're using Observable.create where you emit a single value then every time you call source.subscribe(...) you append one value to the buffer in ReplaySubject.
You're not getting call printed at the beginning of each line because it's the ReplaySubject who emits its buffer first when you subscribe and then it subscribes itself to its source:
For implementation details see:
https://github.com/ReactiveX/rxjs/blob/master/src/operator/multicast.ts#L63
https://github.com/ReactiveX/rxjs/blob/master/src/ReplaySubject.ts#L54
The same applies when using publishReplay(1). First it emits the buffered item from ReplaySubject and then yet another item from observer.next(5);