I have some code that reads rows from a database, iterates over them, and applies some logic to each row. For each qualifying row it creates an observable that writes back to the database and pushes it onto an array, so that subscribing to the array via forkJoin writes all the necessary data.
This works perfectly fine until the number of observables in the array gets large. The number of rows can be anywhere from 0 to 6000, so the array can grow to that size. When it does, the observables no longer write to the database and instead return the default value from defaultIfEmpty. I'm stumped as to why it works with a smaller number of observables but suddenly comes back empty with a larger one...
It might be clearer with a code example:
function writeToDB() {
  // rows taken from the database, n = 0..6000
  const data = []
  // array of observables
  const observables = []
  for (const row of data) {
    if (row.age > 20) {
      // websocket between service and database, returns an observable
      const observable = websocket.put(row).pipe(
        o$.catchError((err) => {
          return r$.of(err)
        }),
        o$.defaultIfEmpty({
          success: true,
          status: 200
        })
      );
      observables.push(observable);
    }
  }
  return forkJoin(observables);
}
This example works perfectly fine when subscribed to, except with a large data set where the observables array is about 5000 in length. At that point it starts to return the defaultIfEmpty values { success: true, status: 200 } and I cannot work out why... Any help or advice would be greatly appreciated.
It's not clear from what you've shown here. Still, if this works with a smaller number of calls, then there's a good chance that websocket exhibits some strange behavior at those numbers.
Something worth trying might be to limit the concurrency of your websocket calls.
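import { from, of } from 'rxjs';
import { filter, map, catchError, last, mergeAll, toArray } from 'rxjs/operators';
function writeToDB(data) {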
// data contains rows taken from the database, n = 0..6000
return from(data).pipe(
filter(row => row.age > 20),
map(row => websocket.put(row).pipe(
catchError(err => of(err)),
// last makes sure that mergeAll behaves like forkJoin
last(undefined, {
success: true,
status: 200
})
)),
// mergeAll lets you choose how many can run concurrently
// for example, at most 50 websocket calls are made at
// once here
mergeAll(50),
toArray()
);
}
I prefer map, mergeAll over mergeMap in this case (as I think you're less likely to miss the concurrent aspect of this), but you can use either.
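import { from, of } from 'rxjs';
import { filter, mergeMap, catchError, last, toArray } from 'rxjs/operators';
function writeToDB(data) {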
// data contains rows taken from the database, n = 0..6000
return from(data).pipe(
filter(row => row.age > 20),
mergeMap(row => websocket.put(row).pipe(
catchError(err => of(err)),
// last makes sure that mergeMap behaves like forkJoin
last(undefined, {
success: true,
status: 200
})
), 50), // <- sneaky! ;)
toArray()
);
}
I have a string array of the enums I want to receive from the server:
enumList = ['somethin','branch','country', 'serviceType', 'thatandthis'];
I then have a generic http-service method that takes a string from enumList as a parameter and returns an HttpClient observable for that enum service:
this.webApi.getEnumByName('somethin','en').subscribe((res)=>{/*do something*/})
this.webApi.getEnumByName('branch','en').subscribe((res)=>{/*do something*/})...
I'm then combining the two in a loop:
for (const item of this.enumList) {
this.webApi.getEnumByName(item).subscribe((res: any) => {
this.enums[item] = res;
});
}
But this is not good...
I want a single subscription that completes only once, when all the requests have resolved, while keeping a reference to the associated item string.
Using an array of observables returned from this.webApi.getEnumByName(item), concat or forkJoin won't work, because they don't keep a reference to the string/key/token associated with each response, i.e. the string in enumList.
The end result of these combined observables should be:
{
  'somethin': {response...},
  'branch': {response...},
  'country': {response...},
  'serviceType': {response...},
  'thatandthis': {response...}
}
I've been breaking my head on this and would appreciate an RxJS solution.
If I understand your problem correctly, you may want to consider something like this.
First of all, you build an Observable with a function like this:
function obsFromItem(item) {
return this.webApi.getEnumByName(item).pipe(
tap(res => this.enums[item] = res),
)
}
The above logic says that as soon as getEnumByName notifies its result, the result is stored in this.enums under the right item.
Now that you have such a function, you can create an array of Observables to pass into forkJoin like this:
arrayOfObs = enumList.map(item => obsFromItem(item))
forkJoin(arrayOfObs).subscribe()
When forkJoin(arrayOfObs) notifies, it means that all the Observables built via obsFromItem have emitted and completed, and therefore this.enums should be correctly filled.
forkJoin gives you parallel execution. If you substitute forkJoin with concat you get sequential execution.
In this article you may find some typical patterns of use of Observables with http calls.
You can combine several observables together like that:
forkJoin(enumList.reduce<any>((result, key) => {
result[key] = this.webApi.getEnumByName(key,'en');
return result;
}, {})).subscribe(allTogether => {
// allTogether.somethin;
// allTogether.branch;
// ...
});
You can create a function that takes this.enumList and still keeps the same reference:
function getResponse(enumNames) {
  return forkJoin(....).subscribe(....)
}
or forkJoin this.enumList together with the list of http calls:
forkJoin([of(this.enumList), forkJoin([httpcall1, httpcall2])])
  .subscribe(([enumNames, responsesArray]) => ....)
There is a continuous stream of event objects which doesn't complete. Each event has bands. By subscribing to events you get an event with several properties, among these a property "bands" which stores an array of bandIds. With these ids you can get each band. (The stream of bands is continuous as well.)
Problem: In the end you'd not only like to have bands, but a complete event object with bandIds and the complete band objects.
// This is what I could come up with myself, but it seems pretty ugly.
getEvents().pipe(
switchMap(event => {
const band$Array = event.bands.map(bandId => getBand(bandId));
return combineLatest(of(event), ...band$Array);
}),
map(combined => {
const newEvent = combined[0];
combined.forEach((x, i) => {
if (i === 0) return;
newEvent.bands = {...newEvent.bands, ...x};
})
})
)
Question: Please help me find a cleaner way to do this (and I'm not even sure if my attempt produces the intended result).
ACCEPTED ANSWER
getEvents().pipe(
switchMap(event => {
const band$Array = event.bands.map(bandId => getBand(bandId));
return combineLatest(band$Array).pipe(
map(bandArray => ({bandArray, event}))
);
})
)
ORIGINAL ANSWER
You may want to try something along these lines
getEvents().pipe(
switchMap(event => {
const band$Array = event.bands.map(bandId => getBand(bandId));
return forkJoin(band$Array).pipe(
map(bandArray => ({bandArray, event}))
);
})
)
The Observable returned by this transformation emits an object with two properties: bandArray, holding the array of bands retrieved with the getBand service, and event, the object emitted by the Observable returned by getEvents.
Consider also that you are using switchMap, which means that as soon as the Observable returned by getEvents emits again, you switch to the latest emission and unsubscribe from anything that may still be in flight. In other words, you can lose some events if the time required to execute the forkJoin is longer than the time between two emissions of getEvents.
If you do not want to lose anything, then you had better use mergeMap rather than switchMap.
UPDATED ANSWER - The Band Observable does not complete
In this case I understand that getBand(bandId) returns an Observable which emits first when the back end is queried the first time and then when the band data in the back end changes.
If this is true, then you can consider something like this
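getEvents().pipe(
  switchMap(event => {
    return from(event.bands).pipe(
      // mergeMap keeps every band subscription alive; an inner switchMap here
      // would keep only the last band of the array
      mergeMap(bandId => getBand(bandId).pipe(
        map(bandData => ({event, bandData}))
      ))
    );
  })
)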
This transformation produces an Observable which emits either any time a new event occurs or any time the data of a band changes.
I'm making use of the withLatestFrom operator in RxJS in the normal way:
var combined = source1.withLatestFrom(source2, source3);
...to actively collect the most recent emission from source2 and source3 and to emit all three values only when source1 emits.
But I cannot guarantee that source2 or source3 will have produced values before source1 produces a value. Instead I need to wait until all three sources produce at least one value each before letting withLatestFrom do its thing.
The contract needs to be: if source1 emits then combined will always eventually emit when the other sources finally produce. If source1 emits multiple times while waiting for the other sources we can use the latest value and discard the previous values. Edit: as a marble diagram:
--1------------2---- (source)
----a-----b--------- (other1)
------x-----y------- (other2)
------1ax------2by-- (combined)

--1------------2---- (source)
------a---b--------- (other1)
--x---------y------- (other2)
------1ax------2by-- (combined)

------1--------2---- (source)
----a-----b--------- (other1)
--x---------y------- (other2)
------1ax------2by-- (combined)
I can make a custom operator for this, but I want to make sure I'm not missing an obvious way to do this using the vanilla operators. It feels almost like I want combineLatest for the initial emit and then to switch to withLatestFrom from then on but I haven't been able to figure out how to do that.
Edit: Full code example from final solution:
var Dispatcher = new Rx.Subject();
var source1 = Dispatcher.filter(x => x === 'foo');
var source2 = Dispatcher.filter(x => x === 'bar');
var source3 = Dispatcher.filter(x => x === 'baz');
var combined = source1.publish(function(s1) {
return source2.publish(function(s2) {
return source3.publish(function(s3) {
var cL = s1.combineLatest(s2, s3).take(1).do(() => console.log('cL'));
var wLF = s1.skip(1).withLatestFrom(s2, s3).do(() => console.log('wLF'));
return Rx.Observable.merge(cL, wLF);
});
});
});
var sub1 = combined.subscribe(x => console.log('x', x));
// These can arrive in any order
// and we can get multiple values from any one.
Dispatcher.onNext('foo');
Dispatcher.onNext('bar');
Dispatcher.onNext('foo');
Dispatcher.onNext('baz');
// combineLatest triggers once we have all values.
// cL
// x ["foo", "bar", "baz"]
// withLatestFrom takes over from there.
Dispatcher.onNext('foo');
Dispatcher.onNext('bar');
Dispatcher.onNext('foo');
// wLF
// x ["foo", "bar", "baz"]
// wLF
// x ["foo", "bar", "baz"]
I think the answer is more or less as you described, let the first value be a combineLatest, then switch to withLatestFrom. My JS is hazy, but I think it would look something like this:
var selector = function(x,y,z) {};
var combined = Rx.Observable.concat(
source1.combineLatest(source2, source3, selector).take(1),
source1.withLatestFrom(source2, source3, selector)
);
You should probably use publish to avoid multiple subscriptions, so that would look like this:
var combined = source1.publish(function(s1)
{
return source2.publish(function(s2)
{
return source3.publish(function(s3)
{
return Rx.Observable.concat(
s1.combineLatest(s2, s3, selector).take(1),
s1.withLatestFrom(s2, s3, selector)
);
});
});
});
or using arrow functions...
var combined = source1.publish(s1 => source2.publish(s2 => source3.publish(s3 =>
Rx.Observable.concat(
s1.combineLatest(s2, s3, selector).take(1),
s1.withLatestFrom(s2, s3, selector)
)
)));
EDIT:
I see the problem with concat, the withLatestFrom isn't getting the values. I think the following would work:
var combined = source1.publish(s1 => source2.publish(s2 => source3.publish(s3 =>
Rx.Observable.merge(
s1.combineLatest(s2, s3, selector).take(1),
s1.skip(1).withLatestFrom(s2, s3, selector)
)
)));
...so take one value using combineLatest, then get the rest using withLatestFrom.
I wasn't quite satisfied with the accepted answer, so I ended up finding another solution. Many ways to skin a cat!
My use case involves just two streams: a "requests" stream and a "tokens" stream. I want requests to fire as soon as they are received, using whatever the latest token is. If there is no token yet, then it should wait until the first token appears, and then fire off all the pending requests.
Essentially I split the request stream into two parts: before and after the first token arrives. I buffer the first part, and then re-release everything in one go once I know that the token stream is non-empty.
const first = token$.first()
Rx.Observable.merge(
request$.buffer(first).mergeAll(),
request$.skipUntil(first)
)
.withLatestFrom(token$)
See it live here: https://rxviz.com/v/VOK2GEoX
For RxJs 7:
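import { merge } from 'rxjs';
import { buffer, first, mergeAll, skipUntil, withLatestFrom } from 'rxjs/operators';

// named firstToken$ so it does not shadow the imported first operator
const firstToken$ = token$.pipe(first());
merge(
  request$.pipe(
    buffer(firstToken$),
    mergeAll()
  ),
  request$.pipe(
    skipUntil(firstToken$)
  )
).pipe(
  withLatestFrom(token$)
)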
I had similar requirements but for just 2 observables.
I ended up using switchMap+first:
observable1
.switchMap(() => observable2.first(), (a, b) => [a, b])
.subscribe(([a, b]) => {...});
So it:
- waits until both observables emit some value
- pulls the value from the second observable only if the first one has changed (unlike combineLatest)
- doesn't stay subscribed to the second observable (because of .first())
In my case, the second observable is a ReplaySubject. I'm not sure if it will work with other observable types.
I think that:
- flatMap would probably work too
- it might be possible to extend this approach to handle more than 2 observables
I was surprised that withLatestFrom will not wait on second observable.
In my mind, the most elegant way to achieve the different behavior of an existing RxJS operator is to wrap it into a custom operator. So that from the outside it looks just like any regular operator and doesn't require you to restructure your code each time you need this behavior.
Here is how you can create your own operator which behaves just like withLatestFrom, except that at the very beginning it will emit as soon as the first value of the target observable is emitted (unlike standard withLatestFrom, which will ignore the first emission of the source if the target hasn't yet emitted once). Let's call it delayedWithLatestFrom.
Note that it's written in TypeScript, but you can easily transform it to plain JS. Also, it's a simple version that supports only one target observable and no selector function - you can extend it as needed from here.
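import { Observable, OperatorFunction, pipe } from 'rxjs';
import { combineLatestWith, first, map, skip, startWith, withLatestFrom } from 'rxjs/operators';

export function delayedWithLatestFrom<T, N>(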
target$: Observable<N>
): OperatorFunction<T, [T, N]> {
// special value to avoid accidental match with values that could originate from target$
const uniqueSymbol = Symbol('withLatestFromIgnore');
return pipe(
// emit as soon as the target observable emits its first value
combineLatestWith<T, [N]>(target$.pipe(first())),
// skip the first emission because it's handled above, and then continue like a normal `withLatestFrom` operator
withLatestFrom(target$.pipe(skip(1), startWith(uniqueSymbol))),
map(([[rest, combineLatestValue], withLatestValue]) => {
// take combineLatestValue for the first time, and then always take withLatestValue
const appendedValue =
withLatestValue === uniqueSymbol ? combineLatestValue : withLatestValue;
return [rest, appendedValue] as [T, N];
})
);
}
// SAMPLE USAGE
source$.pipe(
delayedWithLatestFrom(target$)
).subscribe(console.log);
So if you compare it with the original marble diagram for withLatestFrom, it differs in only one respect: while withLatestFrom ignores the first emissions and produces b1 as the first value, the delayedWithLatestFrom operator emits one more value, a1, at the beginning, as soon as the second observable emits 1.
a) Standard withLatestFrom:
b) Custom delayedWithLatestFrom:
Use combineLatest and filter to drop tuples until the first full set is found, then set a flag to stop filtering. The flag can live within the scope of a wrapping defer to do things properly (support resubscription). Here it is in Java (but the same operators exist in RxJS):
Observable.defer(() -> {
    // a lambda can only capture effectively final variables, hence AtomicBoolean
    // (java.util.concurrent.atomic) instead of a plain boolean
    AtomicBoolean emittedOne = new AtomicBoolean(false);
    return Observable.combineLatest(s1, s2, s3, selector)
        .filter(x -> {
            if (emittedOne.get())
                return true;
            if (hasAll(x)) {
                emittedOne.set(true);
                return true;
            }
            return false;
        });
})
I wanted a version where tokens are fetched regularly - and where I want to retry the main data post on (network) failure. I found shareReplay to be the key. The first mergeWith creates a "muted" stream, which causes the first token to be fetched immediately, not when the first action arrives. In the unlikely event that the first token will still not be available in time, the logic also has a startWith with an invalid value. This causes the retry logic to pause and try again. (Some/map is just a Maybe-monad):
Some(fetchToken$.pipe(shareReplay({refCount: false, bufferSize: 1})))
.map(fetchToken$ =>
actions$.pipe(
// This line is just for starting the loadToken loop immediately, not waiting until first write arrives.
mergeWith(fetchToken$.pipe(map(() => true), catchError(() => of(false)), tap(x => loggers.info(`New token received, success: ${x}`)), mergeMap(() => of()))),
concatMap(action =>
of(action).pipe(
withLatestFrom(fetchToken$.pipe(startWith(""))),
mergeMap(([x, token]) => (!token ? throwError(() => "Token not ready") : of([x, token] as const))),
mergeMap(([{sessionId, visitId, events, eventIds}, token]) => writer(sessionId, visitId, events, token).pipe(map(() => <ISessionEventIdPair>{sessionId, eventIds}))),
retryWhen(errors =>
errors.pipe(
tap(err => loggers.warn(`Error writing data to WG; ${err?.message || err}`)),
mergeMap((_error: any, attemptIdx) => (attemptIdx >= retryPolicy.retryCount ? throwError(() => Error("It's enough now, already")) : of(attemptIdx))), // error?.response?.status (int, response code) error.code === "ENOTFOUND" / isAxiosError: true / response === undefined
delayWhen(attempt => timer(attempt < 2 ? retryPolicy.shortRetry : retryPolicy.longRetry, scheduler))
)
)
)
),
)
)
Thanks to everyone on this question-page for good inputs.
Based on the answer from @cjol
Here's a RxJs 7 implementation of a waitFor operator that will buffer the source stream until all input observables have emitted values, then emit all buffered events on the source stream. Any subsequent events on the source stream are emitted immediately.
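import { Observable, ObservableInputTuple, OperatorFunction, combineLatest, merge } from 'rxjs';
import { buffer, first, map, mergeAll, skipUntil, takeUntil, withLatestFrom } from 'rxjs/operators';

// Overload signature, copied from the definition of the withLatestFrom() operator.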
export function waitFor<T, O extends unknown[]>(
inputs: [...ObservableInputTuple<O>]
): OperatorFunction<T, [T, ...O]>;
/**
* Buffers the source until every observable in "from" has emitted a value. Then
* emits all buffered source values with the latest values of the "from" array.
* Any later source events are emitted immediately.
* @param from Array of observables to wait for.
* @returns Observable that emits an array that concatenates the source value and the latest values of the observables waited for.
*/
export function waitFor(
from: Observable<unknown>[]
): (source$: Observable<unknown>) => Observable<unknown> {
const combined$ = combineLatest(from);
// This serves as a conditional that switches between the stream that waits
// for the other observables and the stream that emits the source right away
// because the other observables have emitted.
const firstCombined$ = combined$.pipe(first());
return function (source$: Observable<unknown>): Observable<unknown> {
return merge(
// This stream will buffer the source until the other observables have all emitted.
source$.pipe(
takeUntil(firstCombined$), // without this it continues to buffer new values forever
buffer(firstCombined$),
mergeAll()
),
// This stream emits the source straight away and will take over when the other
// observables have emitted.
source$.pipe(skipUntil(firstCombined$))
).pipe(
withLatestFrom(combined$),
// Flatten it to behave like withLatestFrom() operator.
map(([source, combined]) => [source, ...combined])
);
};
}
None of the above solutions is really to the point, so I made my own. Hope it helps someone out.
import {
combineLatest,
take,
map,
ObservableInputTuple,
OperatorFunction,
pipe,
switchMap
} from 'rxjs';
/**
* ### Description
* Works similarly to {@link withLatestFrom}, with the main difference that it awaits the observables.
* When all observables can emit at least one value, then takes the latest state of all observables and proceeds execution of the pipe.
* Will execute this pipe only once and will only retrigger pipe execution if source observable emits a new value.
*
* ### Example
* ```ts
* import { BehaviorSubject } from 'rxjs';
* import { awaitLatestFrom } from './await-latest-from.ts';
*
* const myNumber$ = new BehaviorSubject<number>(1);
* const myString$ = new BehaviorSubject<string>("Some text.");
* const myBoolean$ = new BehaviorSubject<boolean>(true);
*
* myNumber$.pipe(
* awaitLatestFrom([myString$, myBoolean$])
* ).subscribe(([myNumber, myString, myBoolean]) => {});
* ```
* ### Additional
* @param observables - the observables of which the latest value will be taken when all of them have a value.
* @returns a tuple which contains the source value as well as the values of the observables which are passed as input.
*/
export function awaitLatestFrom<T, O extends unknown[]>(
observables: [...ObservableInputTuple<O>]
): OperatorFunction<T, [T, ...O]> {
return pipe(
switchMap((sourceValue) =>
combineLatest(observables).pipe(
take(1),
map((values) => [sourceValue, ...values] as unknown as [T, ...O])
)
)
);
}
Actually withLatestFrom already:
- waits for every source
- emits only when source1 emits
- remembers only the last source1-message while the other sources are yet to start
// when source 1 emits the others have emitted already
var source1 = Rx.Observable.interval(500).take(7)
var source2 = Rx.Observable.interval(100, 300).take(10)
var source3 = Rx.Observable.interval(200).take(10)
var selector = (a,b,c) => [a,b,c]
source1
.withLatestFrom(source2, source3, selector)
.subscribe()
vs
// source1 emits first, withLatestFrom discards 1 value from source1
var source1 = Rx.Observable.interval(500).take(7)
var source2 = Rx.Observable.interval(1000, 300).take(10)
var source3 = Rx.Observable.interval(2000).take(10)
var selector = (a,b,c) => [a,b,c]
source1
.withLatestFrom(source2, source3, selector)
.subscribe()