Related
I'm trying to achieve something very similar to a buffer count. As values come through the pipe, bufferCount of course buffers them and sends them down in batches. I'd like something similar to this that will emit all remaining items if there are currently fewer than the buffer size in the stream.
It's a little confusing to word, so I'll provide an example with what I'm trying to achieve.
I have something adding items individually to a subject. Sometimes it'll add 1 item a minute, sometimes it'll add 1000 items in 1 second. I wish to do a long running process (2 seconds~) on batches of these items as to not overload the server.
So for example, consider the timeline where P is processing
---A-----------B----------C---D--EFGHI------------------
|_( P(A) ) |_(P(B)) |_( P(C) ) |_(P([D, E, F, G, H, I]))
This way I can process the events in small or large batches depending on how many events are coming through, but i ensure the batches remain smaller than X.
I basically need to map all the individual emits into emits that contain chunks of 5 or fewer. As I pipe the events into a concatMap, events will start to stack up. I want to pick these stacked up events off in batches. How can I achieve this?
Here's a stackblitz with what I've got so far: https://stackblitz.com/edit/rxjs-iqwcbh?file=index.ts
Note how item 4 and 5 don't process until more come in and fill in the buffer. Ideally after 1,2,3 are processed, it'll pick off 4,5 the queue. Then when 6,7,8 come in, it'll process those.
EDIT: today I learned that bufferTime has a maxBufferSize parameter, that will emit when the buffer reaches that size. Therefore, the original answer below isn't necessary, we can simply do this:
const stream$ = subject$.pipe(
bufferTime(2000, null, 3), // <-- buffer emits # 2000ms OR when 3 items collected
filter(arr => !!arr.length)
);
StackBlitz
ORIGINAL:
It sounds like you want a combination of bufferCount and bufferTime. In other words: "release the buffer when it reaches size X or after Y time has passed".
We can use the race operator, along with those other two to create an observable that emits when the buffer reaches the desired size OR after the duration has passed. We'll also need a little help from take and repeat:
const chunk$ = subject$.pipe(bufferCount(3));
const partial$ = subject$.pipe(
bufferTime(2000),
filter(arr => !!arr.length) // don't emit empty array
);
const stream$ = race([chunk$, partial$]).pipe(
take(1),
repeat()
);
Here we define stream$ to be the first to emit between chunk$ and partial$. However, race will only use the first source that emits, so we use take(1) and repeat to sort of "reset the race".
Then you can do your work with concatMap like this:
stream$.pipe(
concatMap(chunk => this.doWorkWithChunk(chunk))
);
Here's a working StackBlitz demo.
You may want to roll it into a custom operator, so you can simply do something like this:
const stream$ = subject$.pipe(
bufferCountTime(5, 2000)
);
The definition of bufferCountTime() could look like this:
function bufferCountTime<T>(count: number, time: number) {
return (source$: Observable<T>) => {
const chunk$ = source$.pipe(bufferCount(count));
const partial$ = source$.pipe(
bufferTime(time),
filter((arr: T[]) => !!arr.length)
);
return race([chunk$, partial$]).pipe(
take(1),
repeat()
);
}
}
Another StackBlitz sample.
Since I noticed the use of forkJoin in your sample code, I can see you are sending a request to the server for each emission (I was originally under the impression that you were making only 1 call per batch with combined data).
In the case of sending one request per item the solution is much simpler!
There is no need to batch the emissions, you can simply use mergeMap and specify its concurrency parameter. This will limit the number of currently executing requests:
const stream$ = subject$.pipe(
mergeMap(val => doWork(val), 3), // 3 max concurrent requests
);
Here is a visual of what the output would look like when the subject rapidly emits:
Notice the work only starts for the first 3 items initially. Emissions after that are queued up and processed as the prior in flight items complete.
Here's a StackBlitz example of this behavior.
TLDR;
A StackBlitz app with the solution can be found here.
Explanation
Here would be an approach:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
const timeout$ = subject.pipe(
filter((_, idx) => idx === 0),
switchMapTo(timer(0))
);
subject
.pipe(
buffer(
merge(count$, timeout$).pipe(
take(1),
repeat()
)
),
concatMap(buffer => forkJoin(buffer.map(doWork)))
)
.subscribe(/* console.warn */);
/* Output:
Processing 1
Processing 2
Processing 3
Processed 1
Processed 2
Processed 3
Processing 4
Processing 5
Processed 4
Processed 5
Processing 6 <- after the `setTimeout`'s timer expires
Processing 7
Processing 8
Processed 6
Processed 7
Processed 8
*/
The idea was to still use the bufferCount's behavior when items come in synchronously, but, at the same time, detect when fewer items than the chosen bufferLen are in the buffer. I thought that this detection could be done using a timer(0), because it internally schedules a macrotask, so it is ensured that items emitted synchronously will be considered first.
However, there is no operator that exactly combines the logic delineated above. But it's important to keep in mind that we certainly want a behavior similar to the one the buffer operator provides. As in, we will for sure have something like subject.pipe(buffer(...)).
Let's see how we can achieve something similar to what bufferTime does, but without using bufferTime:
const bufferLen = 3;
const count$ = subject.pipe(filter((_, idx) => (idx + 1) % bufferLen === 0));
Given the above snippet, using buffer(count$) and bufferTime(3), we should get the same behavior.
Let's move now onto the detection part:
const timeout$ = subject.pipe(
filter((_, idx) => idx === 0),
switchMapTo(timer(0))
);
What it essentially does is to start a timer after the subject has emitted its first item. This will make more sense when we have more context:
subject
.pipe(
buffer(
merge(count$, timeout$).pipe(
take(1),
repeat()
)
),
concatMap(buffer => forkJoin(buffer.map(doWork)))
)
.subscribe(/* console.warn */);
By using merge(count$, timeout$), this is what we'd be saying: when the subject emits, start adding items to the buffer and, at the same time, start the timer. The timer is started too because it is used to determine if fewer items will be in the buffer.
Let's walk through the example provided in the StackBlitz app:
from([1, 2, 3, 4, 5])
.pipe(tap(i => subject.next(i)))
.subscribe();
// Then mimic some more items coming through a while later
setTimeout(() => {
subject.next(6);
subject.next(7);
subject.next(8);
}, 10000);
When 1 is emitted, it will be added to the buffer and the timer will start. Then 2 and 3 arrive immediately, so the accumulated values will be emitted.
Because we're also using take(1) and repeat(), the process will restart. Now, when 4 is emitted, it will be added to the buffer and the timer will start again. 5 arrives immediately, but the number of the collected items until now is less than the given buffer length, meaning that until the 3rd value arrives, the timer will have time to finish. When the timer finishes, the [4,5] chunk will be emitted. What happens with [6, 7, 8] is the same as what happened with [1, 2, 3].
Suppose I have some Observable which may have some arbitrarily long sequence of events at the time I subscribe to it but which may also continue to emit events after I subscribe.
I am interested only in those events from the time at which I subscribe and later. How do I just get the latest events?
In this example I use a ReplaySubject as an artificial source to illustrate the question. In practice this would be some arbitrary Observable.
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = observable.subscribe(onNext: {
print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
Produces the output:
1
2
3
4
5
6
7
What I really want is only events from the time of subscription onwards. i.e. 4 5 6 7
I can use combineLatest with some other dummy Observable:
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = Observable.combineLatest(observable, Observable<Int>.just(42)) { value, _ in value }
.subscribe(onNext: {
print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
which produces the desired output 4 5 6 7
How can I produce a similar result without artificially introducing another arbitrary Observable?
I have tried a number of things including combineLatest with an array consisting of just one observable, but that emits the complete sequence, not just the latest. I know I could use PublishSubject but I am just using ReplaySubject here as an illustration.
By default, an observable will call its generator for every subscriber and emit all of the values produced by that generator. So for example:
let obs = Observable.create { observer in
for each in [1, 2, 3, 5, 7, 11] {
observer.onNext(each)
}
observer.onCompleted()
}
(Note that the above is the implementation of Observable.from(_:))
Every time something subscribes to obs the closure is called and all 6 next events will be received. This is what's known as a "cold" observable, and again it's the default behavior. Assume an Observable is cold unless you know otherwise.
There is also the concept of a "hot" observable. A hot observable doesn't call its generator function when something subscribes to it.
Based on your question, and your subsequent comment, it sounds like you want to know how to make a cold observable hot... The fundamental way is by calling .multicast on it (or one of the operators that use its implementation like publish(), replay(_:) or replayAll().) There is also a special purpose operator called .share() that will "heat up" an observable and keep it hot until all subscribers unsubscribe to it (then it will be cold again.) And of course, Subjects are considered hot because they don't have a generator function to call.
Note however, that many observables have synchronous behavior, this means that they will emit all their values as soon as something subscribes and thus will have already completed before any other observer (on that thread) has a chance to subscribe.
Some more examples... .interval(_:scheduler:) is a cold observable with async behavior. Let's say you have the following:
let i = Observable<Int>.interval(.seconds(3), scheduler: MainScheduler.instance)
i.subscribe(onNext: { print($0, "from first") })
DispatchQueue.main.asyncAfter(deadline: .now() + 5) {
i.subscribe(onNext: { print($0, "from second") })
}
What you will find is that each observer will get it's own independent stream of values (both will start with 0) because the generator inside interval is called for both observers. So you will see output like:
0 from first
1 from first
0 from second
2 from first
1 from second
3 from first
2 from second
If you multicast the interval you will see different behavior:
let i = Observable<Int>.interval(.seconds(3), scheduler: MainScheduler.instance)
.publish()
i.subscribe(onNext: { print($0, "from first") })
i.connect()
DispatchQueue.main.asyncAfter(deadline: .now() + 5) {
i.subscribe(onNext: { print($0, "from second") })
}
The above will produce:
0 from first
1 from first
1 from second
2 from first
2 from second
3 from first
3 from second
(Note that "second" started with 1 instead of 0.) The share operator will work the same way in this case except you don't have to call connect() because it does it automatically.
Lastly, watch out. If you publish a synchronous observable, you might not get what you expect:
let i = Observable.from([1, 2, 3, 5])
.publish()
i.subscribe(onNext: { print($0, "from first") })
i.connect()
i.subscribe(onNext: { print($0, "from second") })
produces:
1 from first
2 from first
3 from first
5 from first
Because all 5 events (the four next events and the completed event) emit as soon as connect() is called before the second observer gets a chance to subscribe.
An article that might help you is Hot and Cold Observables but it's pretty advanced...
Why not simply use a publish subject like this? Isn't this the desired output? Publish Subjects only emits the elements after it's subscribed. And that's the whole purpose of it.
let observable = PublishSubject<Int>()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
_ = observable.subscribe(onNext: {
print($0)
})
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
}
If you don't want to use a subject you can share the observable and add a 2nd subscriber like this,
let observable = ReplaySubject<Int>.createUnbounded()
observable.onNext(1)
observable.onNext(2)
observable.onNext(3)
observable.onNext(4)
let shared = observable.share()
// this will print full sequence
shared.subscribe(onNext: {
print("full sequence: \($0)")
}).disposed(by: disposeBag)
// this will only print new events
shared.subscribe(onNext: {
print("shared sequence: \($0)")
}).disposed(by: disposeBag)
// new events
observable.onNext(5)
observable.onNext(6)
observable.onNext(7)
Observables are lazy, pull driven sequences. Without your first subscription stream won't even start. Once started, by sharing it, you can subscribe only to the new events.
Is the only difference between Observable.of and Observable.from the arguments format? Like the Function.prototype.call and Function.prototype.apply?
Observable.of(1,2,3).subscribe(() => {})
Observable.from([1,2,3]).subscribe(() => {})
It is important to note the difference between of and from when passing an array-like structure (including strings):
Observable.of([1, 2, 3]).subscribe(x => console.log(x));
would print the whole array at once.
On the other hand,
Observable.from([1, 2, 3]).subscribe(x => console.log(x));
prints the elements 1 by 1.
For strings the behaviour is the same, but at character level.
Not quite. When passing an array to Observable.from, the only difference between it and Observable.of is the way the arguments are passed.
However, Observable.from will accept an argument that is
a subscribable object, a Promise, an Observable-like, an Array, an iterable or an array-like object to be converted
There is no similar behaviour for Observable.of - which always accepts only values and performs no conversion.
One line Difference :
let fruits = ['orange','apple','banana']
from : Emit the items one by one of array. For example
from(fruits).subscribe(console.log) // 'orange','apple','banana'
of : Emit the whole array at once. For example
of(fruits).subscribe(console.log) // ['orange','apple','banana']
NOTE: of operator can behave as from operator with spread operator
of(...fruits).subscribe(console.log) // 'orange','apple','banana'
Another interesting fact is Observable.of([]) will be an empty array when you subscribe to it.
Where as when you subscribe to Observable.from([]) you wont get any value.
This is important when you do a consecutive operation with switchmap.
Ex:
In the below example, I am saving a job and then sites, and then comments as a stream.
.do((data) => {
this.jobService.save$.next(this.job.id);
})
.switchMap(() => this.jobService.addSites(this.job.id, this.sites)
.flatMap((data) => {
if (data.length > 0) {
// get observables for saving
return Observable.forkJoin(jobSiteObservables);
} else {
**return Observable.of([]);**
}
})).do((result) => {
// ..
})
.switchMap(() => this.saveComments())
....
if there's no site to save, ie; data.length = 0 in addSite section, the above code is returning Observable.of([]) and then goes to save comments. But if you replace it with Observable.from([]), the succeeding methods will not get called.
rxfiddle
of will emit all values at once
from will emit all values one by one
of with spread operator = from operator
from: Create observable from array, promise or iterable. Takes only one value. For arrays, iterables and strings, all contained values will be emitted as a sequence
const values = [1, 2, 3];
from(values); // 1 ... 2 ... 3
of: Create observable with variable amounts of values, emit values in sequence, but arrays as single value
const values = [1, 2, 3];
of(values, 'hi', 4, 5); // [1, 2, 3] ... 'hi' ... 4 ... 5
from returns notification in chunks i.e. one by one.
for eg: from("abcde") will return a => b => c => d => e
of returns complete notification.
for eg: of("abcde") will return abcde.
https://stackblitz.com/edit/typescript-sckwsw?file=index.ts&devtoolsheight=100
The from operator takes source of events. from(source)
let array = [1,2,3,4,5]
from(array); //where array is source of events, array[of events]
let promise = new Promise(function(resolve, reject) {
// executor (the producing code, "singer")
});
from(promise); //where promise is source of event, promise(of event)
let observable = Observable.create(function(observer) {
observer.next(1);
observer.next(2);
observer.next(3);
observer.next(4);
observer.next(5);
observer.complete();
});
from(observable); // where obsservable is source of events.
The of operator takes intividual events. of(event1, event2, event3)
of(1,2,3,4,5); // where 1,2,3,4,5 are individual events
I found it easier to remember the difference when the analogy with .call / .apply methods came into my mind.
You can think of it this way:
normally, all arguments, that are passed separately (separated by comma), are also emitted separately, in the order they were passed. of() just emits all arguments one by one as they are (like .call method passes arguments to the function it was called on)
from() is like .apply in a sense that it can take an array of values as an argument, and convert array elements into separate arguments, separated by comma.
So, if you have an array and want each element to be emitted separately, you can use from() or get the same behavior by using of() with spread operator, like of(...arr).
It's bit more complicated then that (from can also take observables) but with this analogy it will probably be easier to remember the main difference.
Yes it is true that of will result in an output in single go and from will happen one at a time. But there is more difference related to number of arguments and type of arguments.
You can pass any number of arguments to the Of. Each argument emitted separately and one after the other. It sends the Complete signal in the end.
However you can send only one argument to the from operator and that one argument should be a type of
an Array,
anything that behaves like an array
Promise
any iterable object
collections
any observable like object
For example you can send a raw object like
myObj={name:'Jack',marks:100}
to of operator to convert to Observable.
obj$:Observable<any> = of(myObj);
but you can not send this raw object myObj to from operator simply because it is not iterable or array like collection.
for more detail : visit here
from operator may accept one of
promises
iterable
arrays
observable
from emits each individual item from the observable , can also do conversions.
of operator takes in the raw value and emits the value from the observable.
import {from, Observable, of} from 'rxjs';
const ofObs = of([1,2,3]);
const fromObs = from([2,3,4]);
const basicObs = Observable.create(observer=>{
observer.next(100);
observer.next(200);
observer.next(300);
})
const promise = new Promise((resolve,reject)=>{
resolve(100);
})
const array = [1,2,3];
const iterbale = "Dhana";
// const myObs = from(ofObs);//possible and can emit individual item value everytime 1, then ,2 , then 3
// const myObs = from(fromObs);//possbile and can emit individual item value everytime 1, then ,2 , then 3
// const myObs = from(basicObs);//possbile and can emit individual item value everytime 100, then ,200 , then 300
const myObs = from(promise);//possible can emit value 100
// const myObs = array(promise);//possible and can emit individual item value everytime 1, then ,2 , then 3
// const myObs = iterable(promise);//possible and can emit individual item value everytime D then h then a then n then a
myObs.subscribe(d=>console.log(d))
import {from, of} from 'rxjs';
const basicOf1 = of([1,2,3,4,5,6]) // emits entire array of events
const basicfrom1 = from([1,2,3,4,5,6]) //emits each event at a time
const basicOf2 = of(1,2,3,4,5,6) // emits each event at a time
// const basicfrom2 = from(1,2,3,4,5,6) //throws error
//Uncaught TypeError: number is not observable
const basicOf3 = of(...[1,2,3,4,5,6]) // emits each event at a time
const basicfrom3 = from(...[1,2,3,4,5,6]) //throws error
//Uncaught TypeError: number is not observable
basicOf3.subscribe(d=>console.log(d))
Here is the link to codepen
I can't figure out how publishReplay().refCount() works.
For example (https://jsfiddle.net/7o3a45L1/):
var source = Rx.Observable.create(observer => {
console.log("call");
// expensive http request
observer.next(5);
}).publishReplay().refCount();
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
console.log("");
subscription2 = source.subscribe({next: (v) => console.log('observerB: ' + v)});
subscription2.unsubscribe();
console.log("");
subscription3 = source.subscribe({next: (v) => console.log('observerC: ' + v)});
subscription3.unsubscribe();
console.log("");
subscription4 = source.subscribe({next: (v) => console.log('observerD: ' + v)});
subscription4.unsubscribe();
gives the following result:
call observerA: 5
observerB: 5 call observerB: 5
observerC: 5 observerC: 5 call observerC: 5
observerD: 5 observerD: 5 observerD: 5 call observerD: 5
1) Why are observerB, C and D called multiple times?
2) Why "call" is printed on each line and not in the beginning of the line?
Also, if i call publishReplay(1).refCount(), it calls observerB, C and D 2 times each.
What i expect is that every new observer receives the value 5 exactly once and "call" is printed only once.
publishReplay(x).refCount() combined does the following:
It create a ReplaySubject which replay up to x emissions. If x is not defined then it replays the complete stream.
It makes this ReplaySubject multicast compatible using a refCount() operator. This results in concurrent subscriptions receiving the same emissions.
Your example contains a few issues clouding how it all works together. See the following revised snippet:
var state = 5
var realSource = Rx.Observable.create(observer => {
console.log("creating expensive HTTP-based emission");
observer.next(state++);
// observer.complete();
return () => {
console.log('unsubscribing from source')
}
});
var source = Rx.Observable.of('')
.do(() => console.log('stream subscribed'))
.ignoreElements()
.concat(realSource)
.do(null, null, () => console.log('stream completed'))
.publishReplay()
.refCount()
;
subscription1 = source.subscribe({next: (v) => console.log('observerA: ' + v)});
subscription1.unsubscribe();
subscription2 = source.subscribe(v => console.log('observerB: ' + v));
subscription2.unsubscribe();
subscription3 = source.subscribe(v => console.log('observerC: ' + v));
subscription3.unsubscribe();
subscription4 = source.subscribe(v => console.log('observerD: ' + v));
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/5.1.0/Rx.js"></script>
When running this snippet we can see clearly that it is not emitting duplicate values for Observer D, it is in fact creating new emissions for every subscription. How come?
Every subscription is unsubscribed before the next subscription takes place. This effectively makes the refCount decrease back to zero, no multicasting is being done.
The issue resides in the fact that the realSource stream does not complete. Because we are not multicasting the next subscriber gets a fresh instance of realSource through the ReplaySubject and the new emissions are prepended with the previous already emitted emissions.
So to fix your stream from invoking the expensive HTTP request multiple times you have to complete the stream so the publishReplay knows it does not need to re-subscribe.
Generally: The refCount means, that the stream is hot/shared as long as there is at least 1 subscriber - however, it is being reset/cold when there are no subscribers.
This means if you want to be absolutely sure that nothing is executed more than once, you should not use refCount() but simply connect the stream to set it hot.
As an additional note: If you add an observer.complete() after the observer.next(5); you will also get the result you expected.
Sidenote: Do you really need to create your own custom Obervable here? In 95% of the cases the existing operators are sufficient for the given usecase.
This happens because you're using publishReplay(). It internally creates an instance of ReplaySubject that stores all values that go through.
Since you're using Observable.create where you emit a single value then every time you call source.subscribe(...) you append one value to the buffer in ReplaySubject.
You're not getting call printed at the beginning of each line because it's the ReplaySubject who emits its buffer first when you subscribe and then it subscribes itself to its source:
For implementation details see:
https://github.com/ReactiveX/rxjs/blob/master/src/operator/multicast.ts#L63
https://github.com/ReactiveX/rxjs/blob/master/src/ReplaySubject.ts#L54
The same applies when using publishReplay(1). First it emits the buffered item from ReplaySubject and then yet another item from observer.next(5);
At the moment zip will only produce a value whenever all of the zipped observable produces a value. E.g. from the docs:
Merges the specified observable sequences or Promises into one
observable sequence by using the selector function whenever all of the
observable sequences have produced an element
I'm looking for an observable which can sort of zip an observable but will produce an array of sequence of the zipped observable wherein it doesn't matter if all produces a value..
e.g. lets say i have tick$, observ1, observ2.. tick$ always produce value every x secs.. while observ1 and observ2 only produces from time to time..
I'm expecting my stream to look like
[tick, undefined, observ2Res],
[tick, undefined, undefined],
[tick, observ1Res, observ2Res]
...
...
its not combine latest, given that combine latest takes the latest value of a given observable.
I believe buffer (or maybe sample) might get you on the right track. The buffer method accepts an Observable that's used to define our buffer boundaries. The resulting stream emits any items that were emitted in that window (example stolen from RXJS docs for buffer):
var source = Rx.Observable.timer(0, 50)
.buffer(function () { return Rx.Observable.timer(125); })
.take(3);
var subscription = source.subscribe(x => console.log('Next: ', x));
// => Next: 0,1,2
// => Next: 3,4,5
// => Next: 6,7
So we now have a way to get all of a stream's emitted events in a certain time window. In your case, we can use tick$ to describe our sampling period and observ1 and observ2 are our underlying streams that we want to buffer:
const buffered1 = observ1.buffer(tick$);
const buffered2 = observ2.buffer(tick$);
Each of these streams will emit once every tick$ period, and will emit a list of all emitted items from the underlying stream (during that period). The buffered stream will emit data like this:
|--[]--[]--[1, 2, 3]--[]-->
To get the output you desire, we can choose to only look at the latest emitted item of each buffered result, and if there's no emitted data, we can pass null:
const buffered1 = observ1.buffer($tick).map(latest);
const buffered2 = observ2.buffer($tick).map(latest);
function latest(x) {
return x.length === 0 ? null : x[x.length - 1];
}
The previous sample stream I illustrated will now look like this:
|--null--null--3--null-->
And finally, we can zip these two streams to get "latest" emitted data during our tick$ interval:
const sampled$ = buffered1.zip(buffered2);
This sampled$ stream will emit the latest data from our observ1 and observ2 streams over the tick$ window. Here's a sample result:
|--[null, null]--[null, 1]--[1, 2]-->