I am trying to achieve the following with RxJS: given an array of job ids, poll an endpoint for every id in the array; the endpoint returns the status of the job, which can be either "RUNNING" or "FINISHED". The code should poll the jobs one after the other and keep polling for as long as a job is in the "RUNNING" status. As soon as a job reaches the "FINISHED" status, it should be passed downstream and excluded from further polling.
Below is a minimal toy case that demonstrates the problem.
const {
from,
of,
interval,
mergeMap,
filter,
take,
tap,
delay
} = rxjs;
const { range } = _;
const doRequest = (input) => {
const status = Math.random() < 0.15 ? 'FINISHED' : 'RUNNING';
return of({ status, value: input })
.pipe(delay(500));
};
const number$ = from(range(1, 10));
const poll = (number) => interval(5000).pipe(
mergeMap(() => {
return doRequest(number)
}),
tap(console.log),
filter(( {status} ) => status === 'FINISHED'),
take(1)
);
const printout$ = number$.pipe(
mergeMap((number) => {
return poll(number)
})
);
printout$.subscribe(console.log);
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/7.5.5/rxjs.umd.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.21/lodash.min.js"></script>
It does most of what I described; but it polls all endpoints simultaneously rather than one after another. Here, roughly, is the pattern I would like to achieve:
starting with ids: [1, 2, 3]
polling: await request 1 then await request 2 then await request 3
then wait for n seconds; then repeat
after job 2 is finished, send request 1, then send request 3, then wait, then repeat
after job 3 is finished, send request 1, then wait, repeat
after job 1 is finished, complete the stream
I feel that to send the requests in sequence they should be concatMapped, but in the snippet above that's not possible, because the interval would prevent each polling stream from terminating.
Could you please advise how to modify my code to achieve what I am describing?
If I understand the problem right, I would proceed like this.
First of all, I would create a poll function that returns an Observable which emits once per polling round: an array of all the numbers for which the call to doRequest returned 'RUNNING'. Such a function would look something like this:
const poll = (numbers: number[]) => {
return from(numbers).pipe(
concatMap((n) =>
doRequest(n).pipe(
filter((resp) => resp.status === 'RUNNING'),
map((resp) => resp.value)
)
),
toArray()
);
};
Then you need to call the poll function recursively until the array emitted by the Observable it returns is empty.
Recursion in RxJS is typically achieved with the expand operator, which is what we use here:
poll(numbers)
.pipe(
expand((numbers) =>
numbers.length === 0
? EMPTY
: timer(2000).pipe(concatMap(() => poll(numbers)))
)
)
.subscribe(console.log);
A complete example can be seen in this stackblitz.
UPDATE
If the objective is to emit the ids of the jobs that have finished, the structure of the solution remains the same (a poll function and recursion via expand), but the details are different.
The poll function makes sure we emit all the responses of a polling round and it looks like this:
const poll = (
numbers: number[]
) => {
console.log(`Polling ${numbers}`);
return from(numbers).pipe(
concatMap((n) => doRequest(n)),
toArray()
);
};
The recursion logic makes sure that all jobs still in the "RUNNING" status are polled again; we then keep only the jobs that are FINISHED and pass them downstream. In other words, the logic looks like this:
poll(start)
.pipe(
expand((responses) => {
const numbers = responses.filter(r => r.status === 'RUNNING').map(r => r.value)
return numbers.length === 0
? EMPTY
: timer(2000).pipe(concatMap(() => poll(numbers)));
}),
map(responses => responses.filter(r => r.status === 'FINISHED')),
filter(finished => finished.length > 0)
)
.subscribe({
next: responses => console.log(`Job finished ${responses.map(r => r.value)}`),
complete: () => {console.log('All processed')}
});
A working example can be seen in this stackblitz.
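To make the bookkeeping of the rounds easier to follow, here is a minimal sketch in plain JavaScript, without RxJS and without any real timing. It only traces which ids get polled in each round and which ones are reported as finished. The names splitRound, simulateRounds and finishRound are made up for this illustration, and statusOf is a synchronous stand-in for doRequest.

```javascript
// Split the responses of one polling round into finished and still-running ids.
function splitRound(responses) {
  const finished = responses.filter((r) => r.status === 'FINISHED').map((r) => r.value);
  const running = responses.filter((r) => r.status === 'RUNNING').map((r) => r.value);
  return { finished, running };
}

// Hypothetical synchronous simulation of the polling rounds.
// statusOf(id, round) stands in for doRequest; maxRounds guards against jobs that never finish.
function simulateRounds(ids, statusOf, maxRounds = 100) {
  const rounds = [];
  let remaining = ids;
  for (let round = 0; remaining.length > 0 && round < maxRounds; round++) {
    // Requests are issued in order, one id after the other.
    const responses = remaining.map((id) => ({ value: id, status: statusOf(id, round) }));
    const { finished, running } = splitRound(responses);
    rounds.push({ polled: remaining, finished });
    // Only the ids that are still running take part in the next round.
    remaining = running;
  }
  return rounds;
}

// Assumed scenario from the question: job 2 finishes first, then job 3, then job 1.
const finishRound = { 1: 3, 2: 1, 3: 2 };
const rounds = simulateRounds([1, 2, 3], (id, round) =>
  round >= finishRound[id] ? 'FINISHED' : 'RUNNING'
);
rounds.forEach((r, i) => console.log(`round ${i}: polled ${r.polled}, finished ${r.finished}`));
```

In the expand version above, the "remaining" list corresponds to the numbers array computed inside expand, and the "finished" list to what survives the final map and filter.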
Updated: Original answer was not on the right track.
What we want to achieve is that on each go around of the interval we poll all the outstanding jobs in order. We yield up any completed jobs to the output observable and we also omit those completed jobs from subsequent polls.
We can do that by using a Subject instead of a static observable of the job IDs. We start our poll interval and we use withLatestFrom to include the latest list of job IDs. We can then add a tap into the output observable when we get a finished job and update the Subject to omit that job.
To end the poller interval we can create an observable that fires when the array of outstanding jobs is empty and use takeUntil with that.
const number$ = new Subject();
const noMoreNumber$ = number$.pipe(skipWhile((numbers) => numbers.length > 0));
const printout$ = interval(5000).pipe(
withLatestFrom(number$),
switchMap(([_, numbers]) => {
return numbers.map((number) => defer(() => doRequest(number)));
}),
concatAll(),
//tap(console.log),
filter(({ status }) => status === 'FINISHED'),
withLatestFrom(number$),
tap(([{ value }, numbers]) =>
number$.next(numbers.filter((num) => num != value))
),
map(([item]) => item),
takeUntil(noMoreNumber$)
);
printout$.subscribe({
next: console.log,
error: console.error,
complete: () => console.log('COMPLETE'),
});
number$.next([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
The other tweak I would make is to use switchMap instead of mergeMap inside the poller itself. If you use that in combination with fromFetch for performing your HTTP calls then, if there is some long-running HTTP call which gets stuck, on the next poll the previous call will be cancelled before it makes the next HTTP call because switchMap disposes of the previous observable before subscribing to the new one.
Here's a working example:
https://stackblitz.com/edit/js-gxrrb3?devToolsHeight=33&file=index.js
Generates console output looking like this...
Try this:
import { delay, EMPTY, from, of, range } from 'rxjs';
import { concatMap, filter, mergeMap, tap, toArray } from 'rxjs/operators';
const number$ = from(range(1, 3));
const doRequest = (input) => {
const status = Math.random() < 0.15 ? 'FINISHED' : 'RUNNING';
return of({ status, value: input }).pipe(delay(1000));
};
const poll = (jobs: object[]) => {
return from(jobs).pipe(
filter((job) => job['status'] !== 'FINISHED'),
concatMap((job) => doRequest(job['value'])),
tap((job) => {
console.log('polling with................', job);
}),
toArray(),
tap((result) => {
console.log('current jobs................', JSON.stringify(result));
}),
mergeMap((result) =>
result.length > 0 ? poll(result) : of('All job completed!')
)
);
};
const initiateJob = number$.pipe(
mergeMap((id) => doRequest(id)),
toArray(),
tap((jobs) => {
console.log('initialJobs: ', JSON.stringify(jobs));
}),
concatMap(poll)
);
initiateJob.subscribe({
next: console.log,
error: console.log,
complete: () => console.log('COMPLETED'),
});
I'm trying to build a reusable piece of code for multi-file uploads.
I do not want to care about the HTTP layer implementation; I want to focus purely on the stream logic.
I've built the following function to mock the HTTP layer:
let fakeUploadCounter = 0;
const fakeUpload = () => {
const _fakeUploadCounter = ++fakeUploadCounter;
return from(
Array.from({ length: 100 })
.fill(null)
.map((_, i) => i)
).pipe(
mergeMap(x =>
of(x).pipe(
delay(x * 100),
switchMap(x =>
_fakeUploadCounter % 3 === 0 && x === 25
? throwError("Error happened!")
: of(x)
)
)
)
);
};
This function simulates the progress of an upload; every third upload fails at 25% progress.
With this out of the way, let's focus on the important bit: The main stream.
Here's what I want to achieve:
Only use streams, no imperative programming, no tap to push a temporary result in a subject. I could build this. But I'm looking for an elegant solution
While some files are being uploaded, I want to be able to add more files to the upload queue
As a browser can deal with only 6 HTTP calls at the same time, I do not want to take too much of that amount and we should be able to upload only 3 files at the same time. As soon as one finishes or is stopped or throws, then another file should start
When a file upload throws, we should keep that file in the list of files and still display its progress. It won't increase anymore, but at least the user gets to see where it failed. In that case, the row should show some text indicating that there was an error, along with a retry button to give the upload another go and a discard button to remove it completely
Here's a visual explanation:
So far, here's the code I've got:
export class AppComponent {
public file$$: Subject<File> = new Subject();
public retryFile$$: Subject<File> = new Subject();
public stopFile$$: Subject<File> = new Subject();
public files$ = this.file$$.pipe(
mergeMap(file =>
this.retryFile$$.pipe(
filter(retryFile => retryFile === file),
startWith(null),
map(() =>
fakeUpload().pipe(
map(progress => ({ progress })),
takeUntil(
this.stopFile$$.pipe(filter(stopFile => stopFile === file))
),
catchError(() => of({ error: true })),
scan(
(acc, curr: { progress: number } | { error: true }) => ({
...acc,
...curr
}),
{
file,
progress: 0,
error: false
}
)
)
)
)
),
mergeAll(3), // 3 upload in parallel maximum
scan(
(acc, curr) => ({
...acc,
// todo we can't use the File reference directly here
// but we shouldn't use the file name either
// instead we should generate a unique ID for each upload
[curr.file.name]: curr
}),
{}
),
map(fileEntities => Object.values(fileEntities))
);
public addFile() {
this.file$$.next(new File([], `test-file-${filesCount}`));
filesCount++;
}
}
Here's the code in stackblitz that you can fork: https://stackblitz.com/edit/rxjs-upload-multiple-files-v2?file=src/app/app.component.ts
I'm pretty close! If you open the live demo in stackblitz on the right and click on the "Add file" button, you'll see that you can add many files and they'll all get uploaded. The 3rd one will fail gracefully.
Here is what does not work the way I'd like:
If you click quickly more than 3 times on the "add file" button, only 3 files will appear in the queue. I'd like to have all of them but only 3 should be uploading at the same time. Yet, all the files to be uploaded should be displayed in the view, just waiting to start
The stop button should remove any upload. Whether it's uploading or failed
Thanks for any help
Number 1:
If you click quickly more than 3 times on the "add file" button, only 3 files will appear in the queue. I'd like to have all of them but only 3 should be uploading at the same time. Yet, all the files to be uploaded should be displayed in the view, just waiting to start
First of all, this is a cool problem because as far as I could see, you can't simply compose the existing operators (Without getting stupid with partition). You need a custom operator that splits your stream. If you don't want to subscribe to your source twice, you should share before splitting.
There's quite a lot of work left to implement your solution the way you'd like. BUT, in terms of getting your stream to show all files regardless of whether they're currently loading, there's really just one piece missing.
You want to split your stream. One stream should emit default
{
file,
progress: 0,
error: false
}
files right away and the second stream should emit updates to those files. The second stream will have mergeAll(3), but the first doesn't need this limitation as it's not making a network request. You merge these two streams and either update or add new entries into your output as you see fit.
Here's an example of that at work. I made a dummy example to abstract away the implementation details a bit. I start out with an array of objects with this shape,
{
id: number,
message: "HeyThere" + id,
response: none
}
I make a fake httpRequest call that enriches an object to
{
id: number,
message: "HeyThere" + id,
response: "Hello"
}
The stream emits each time a new object is added or when an object is enriched. But the enriching stream is limited to max 3 httpRequest calls at once.
const httpRequest= () => {
return timer(4000).pipe(
map(_ => "Hello")
);
}
const arrayO = [];
arrayO.length = 10;
from(arrayO).pipe(
map((val, index) => ({
id: index,
message: "HeyThere" + index,
response: "None"
})),
share(),
s => merge(s, s.pipe(
map(ob => httpRequest().pipe(
map(val => ({...ob, response: val}))
)),
mergeAll(3)
)),
scan((acc, val: any) => {
acc.set(val.id, val);
return acc;
}, new Map<number, any>()),
debounceTime(250),
map(mapO => Array.from(mapO.values()))
).subscribe(console.log);
I added a debounce as I find it makes the output much easier to follow. Since I added all 10 un-enriched objects synchronously, it just spams 10 arrays to the output if I don't debounce. Also, since every fake HttpRequest takes exactly 4 seconds, I get three arrays spammed at the output every 4 seconds. Debounce stops the UI from stuttering or the console from getting spammed.
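The split described above (every file visible right away, at most three actually loading) can also be looked at as plain queue bookkeeping. The following is only a sketch in plain JavaScript, not RxJS; createUploadQueue and its method names are invented for this illustration, and "release" stands for any of finished, stopped or failed.

```javascript
// Hypothetical helper: tracks which ids are actively uploading and which are waiting.
function createUploadQueue(limit) {
  const active = [];
  const pending = [];

  // Move waiting ids into the active set while there is room.
  const promote = () => {
    while (active.length < limit && pending.length > 0) {
      active.push(pending.shift());
    }
  };

  const removeFrom = (list, id) => {
    const index = list.indexOf(id);
    if (index !== -1) list.splice(index, 1);
  };

  return {
    // A new file is visible immediately, even if it has to wait.
    add(id) {
      pending.push(id);
      promote();
    },
    // Called when an upload finishes, fails or is stopped (started or not).
    release(id) {
      removeFrom(active, id);
      removeFrom(pending, id);
      promote();
    },
    snapshot() {
      return { active: [...active], pending: [...pending] };
    },
  };
}

const queue = createUploadQueue(3);
['a', 'b', 'c', 'd', 'e'].forEach((id) => queue.add(id));
const afterAdding = queue.snapshot();   // a, b, c active; d, e waiting
queue.release('d');                     // stopping a file that has not started yet
queue.release('a');                     // a finishes, so e can start
const afterReleases = queue.snapshot();
console.log(afterAdding, afterReleases);
```

In the stream version, the "active" part is what mergeAll(3) enforces, while the "pending" part is what the un-throttled branch of the merge makes visible.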
Number 2
The stop button should remove any upload. Whether it's uploading or failed
This is a can of worms because every canonical solution says you should make a state management system. That would be the easiest way to interact with files that are in Queue, Loading, Failed, and Loaded all in one uniform way.
It's pretty easy to implement a lightweight Redux-style state management system using RxJS (Just use scan to manage state and JSON objects representing events to transform state). The toughest part is managing your current httpRequests. You'd probably create a custom mergeAll() operator that takes in events, removes queued requests, and even cancels mid-flight requests if necessary.
Using a stopFile$$ works to cancel mid-flight requests, but it'll fall apart if people want to stop a file upload that hasn't started yet (as per your first requirement, you want those visible too). It's sort of brittle regardless, because emitting on a subject never comes with the assurance that anybody is listening. That's another reason why Redux-style state management is the way to go.
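To give an idea of what the Redux-style part could look like, here is a rough sketch of a reducer in plain JavaScript. The event shapes ('ADDED', 'PROGRESS', 'FAILED', 'REMOVED') and the name reduceUploads are assumptions made for this example, not something taken from the question's code.

```javascript
// Assumed event shape: { type: 'ADDED' | 'PROGRESS' | 'FAILED' | 'REMOVED', id, progress? }
// State is a dictionary of uploads keyed by id.
function reduceUploads(state, event) {
  switch (event.type) {
    case 'ADDED':
      return { ...state, [event.id]: { id: event.id, progress: 0, status: 'QUEUED' } };
    case 'PROGRESS':
      // Ignore progress for uploads that were already removed.
      if (!state[event.id]) return state;
      return {
        ...state,
        [event.id]: { ...state[event.id], progress: event.progress, status: 'LOADING' },
      };
    case 'FAILED':
      if (!state[event.id]) return state;
      // Keep the last known progress so the user sees where it failed.
      return { ...state, [event.id]: { ...state[event.id], status: 'FAILED' } };
    case 'REMOVED': {
      const { [event.id]: _removed, ...rest } = state;
      return rest;
    }
    default:
      return state;
  }
}

const events = [
  { type: 'ADDED', id: 'a' },
  { type: 'ADDED', id: 'b' },
  { type: 'PROGRESS', id: 'a', progress: 25 },
  { type: 'FAILED', id: 'a' },
  { type: 'REMOVED', id: 'b' },
  { type: 'PROGRESS', id: 'b', progress: 10 }, // arrives late, must be ignored
];
const finalState = events.reduce(reduceUploads, {});
console.log(Object.values(finalState));
```

With RxJS the same function would be plugged into the stream of events as scan(reduceUploads, {}).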
This is a very interesting problem, here is my approach to it:
uploadFile$ = this.uploadFile.pipe(
multicast(new Subject<CustomFile>(), subject =>
merge(
subject.pipe(
mergeMap(
// `file.id` might be created with uuid() or something like that
(file, idx) =>
of({ status: FILE_STATUS.PENDING, ...file }).pipe(
observeOn(asyncScheduler),
takeUntil(subject)
)
)
),
subject.pipe(
mergeMap(
(file, idx) =>
fakeUpload(file).pipe(
map(progress => ({
...file,
progress,
status: FILE_STATUS.LOADING
})),
startWith({
name: file.name,
status: FILE_STATUS.LOADING,
id: file.id,
progress: 0
}),
catchError(() => of({ ...file, status: FILE_STATUS.FAILED })),
scan(
(acc, curr) => ({
...acc,
...curr
}),
{} as CustomFile
),
takeUntil(
this.stopFile.pipe(
tap(console.warn),
filter(f => f.id === file.id)
)
)
),
3
)
)
)
)
);
files$: Observable<CustomFile[]> = merge(
this.uploadFile$,
this.stopFile
).pipe(
tap(v =>
v.status === FILE_STATUS.REMOVED ? console.warn(v) : console.log(v)
),
scan((filesAcc, crtFile) => {
// if the file is being removed, we need to remove it from the list
if (crtFile.status === FILE_STATUS.REMOVED) {
const { [crtFile.id]: _, ...rest } = filesAcc;
return rest;
}
// simply return an updated copy of the object when the file has the status either
// * `pending`(the buffer's length is > 3)
// * `loading`(the file is being uploaded)
// * `failed`(an error occurred during the file upload, but we keep it in the list)
// * `retrying`(the `Retry` button has been pressed)
return {
...filesAcc,
[crtFile.id]: crtFile
};
}, Object.create(null)),
// Might want to replace this by making the `scan`'s seed return an object that implements a custom iterator
map(obj => Object.values(obj))
);
StackBlitz demo.
I think the biggest problem here was how to determine when the mergeMap's buffer is full, so that a pending item should be shown to the user. As you can see, I've solved this using the multicast's second parameter:
multicast(new Subject(), subject => ...)
Without its second argument, multicast(new Subject()) followed by refCount() is the same as share(). But when you provide the second argument (a.k.a. the selector), you can achieve some sort of local multicasting:
if (isFunction(selector)) {
return operate((source, subscriber) => {
// the first argument
const subject = subjectFactory();
/* .... */
selector(subject).subscribe(subscriber).add(source.subscribe(subject));
});
}
selector(subject).subscribe(subscriber) will subscribe to the observable(which can also be a Subject) returned from the selector. Then, with .add(source.subscribe(subject)), the source is subscribed to. In the selector, we've used merge(subject.pipe(...), subject.pipe(...)), each of which will gain access to what's being pushed into the stream. Because of add(source.subscribe(subject)), the source's value will be passed to the Subject instance, which has its subscribers.
So, the way I solved the aforementioned problem was to create a race between observables. The first contender is
// #1
subject.pipe(
mergeMap(
// `file.id` might be created with uuid() or something like that
(file, idx) =>
of({ status: FILE_STATUS.PENDING, ...file }).pipe(
observeOn(asyncScheduler),
takeUntil(subject)
)
)
),
and the second one is
// #2
subject.pipe(
mergeMap(
(file, idx) => fakeUpload(file).pipe(
/* ... */
// emits synchronously - as soon as the inner subscriber is created
startWith(...)
)
)
)
So, as soon as the Subject(the subject variable in this case) receives the value from the source, it will send it to all of its subscribers - the 2 contenders. It all happens synchronously, which also means that the order matters. #1 will be the first subscriber to receive the value, and #2 will be second. The way the winner is selected is to see which one of the 2 subscribers emits first.
Notice that the first will pass along the value asynchronously(with the help of observeOn(asyncScheduler)) and the second one synchronously. The first one will emit first if the buffer is full, otherwise the second will emit.
I've ended up with the following:
export interface FileUpload {
file: File;
progress: number;
error: boolean;
toRemove: boolean;
}
export const uploadManager = () => {
const file$$: Subject<File> = new Subject();
const retryFile$$: Subject<File> = new Subject();
const stopFile$$: Subject<File> = new Subject();
const fileStartOrRetry$: Observable<File> = file$$.pipe(
mergeMap(file =>
retryFile$$.pipe(
filter(retryFile => retryFile === file),
startWith(file)
)
),
share()
);
const addFileToQueueAfterStartOrRetry$: Observable<
FileUpload
> = fileStartOrRetry$.pipe(
map(file => ({
file,
progress: 0,
error: false,
toRemove: false
}))
);
const markFileToBeRemovedAfterStop$: Observable<FileUpload> = stopFile$$.pipe(
map(file => ({
file,
progress: 0,
error: false,
toRemove: true
}))
);
const updateFileProgress$: Observable<FileUpload> = fileStartOrRetry$.pipe(
map(file =>
uploadMock().pipe(
map(progress => ({ progress })),
takeUntil(
stopFile$$.pipe(filter(stopFile => stopFile.name === file.name))
),
catchError(() => of({ error: true })),
scan(
(acc, curr: { progress: number } | { error: true }) => ({
...acc,
...curr
}),
{
file,
progress: 0,
error: false,
toRemove: false
}
)
)
),
// 3 upload in parallel maximum
mergeAll(3)
);
const files$: Observable<FileUpload[]> = merge(
addFileToQueueAfterStartOrRetry$,
updateFileProgress$,
markFileToBeRemovedAfterStop$
).pipe(
scan<FileUpload, { [key: string]: FileUpload }>((acc, curr) => {
if (curr.toRemove) {
const copy = { ...acc };
delete copy[curr.file.name];
return copy;
}
return {
...acc,
// todo we can't use the File reference directly here
// but we shouldn't use the file name either
// instead we should generate a unique ID for each upload
[curr.file.name]: curr
};
}, {}),
map(fileEntities => Object.values(fileEntities))
);
return {
files$,
file$$,
retryFile$$,
stopFile$$
};
};
It covers all the cases as demonstrated here: https://rxjs-upload-multiple-file-v3.stackblitz.io
The code is here: https://stackblitz.com/edit/rxjs-upload-multiple-file-v3?file=src/app/upload-manager.ts
It's based on Mrk Sef's suggestion. It clicked after he mentioned "You want to split your stream".
In general, we need BehaviorSubject functionality. But only on the first subscription should we send a subscribe request to the server over REST, and on the last unsubscribe we should send an unsubscribe request. All late observers that subscribe should get the latest JSON received by the first one. Can I do this using RxJS operators, and how? Or should I use a custom Observable?
Currently, the custom code for this looks like this:
public observable: Observable<TPattern> = new Observable((observer: Observer<TPattern>) => {
this._observers.push(observer);
if (this._observers.length === 1) {
this._subscription = this.httpRequestStream$
.pipe(
map((jsonObj: any) => {
this._pattern = jsonObj.Data;
return this._pattern;
})
)
.subscribe(
(data) => this._observers.forEach((obs) => obs.next(data)),
(error) => this._observers.forEach((obs) => obs.error(error)),
() => this._observers.forEach((obs) => obs.complete())
);
}
if (this._pattern !== null) {
observer.next(this._pattern); // send last updated array
}
return () => {
const index: number = this._observers.findIndex((element) => element === observer);
this._observers.splice(index, 1);
if (this._observers.length === 0) {
this._subscription.unsubscribe();
this._pattern = null; // clear pattern when unsubscribed
}
};
});
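Sounds like you need shareReplay(1); it will share the latest response with all subscribers. Note that shareReplay(1) by default keeps the source subscription alive after the last subscriber has gone; to unsubscribe from the source when the subscriber count drops to zero, pass shareReplay({ bufferSize: 1, refCount: true }) instead (available since RxJS 6.4).
const stream$ = httpRequestStream$.pipe(
shareReplay(1),
);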
stream$.subscribe(); // sends the request and gets its result
stream$.subscribe(); // doesn't send it but gets cached result
stream$.subscribe(); // doesn't send it but gets cached result
stream$.subscribe(); // doesn't send it but gets cached result
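To spell out the behaviour the question asks for (connect on the first subscriber, disconnect on the last one, replay the latest value to late subscribers), here is a small hand-rolled sketch in plain JavaScript. It is not how RxJS implements its operators; refCountedLatest and the connect callback are hypothetical names used only to illustrate the reference counting.

```javascript
// connect(emit) starts the source and returns a function that stops it.
function refCountedLatest(connect) {
  const observers = new Set();
  let latest;
  let hasLatest = false;
  let disconnect = null;

  return {
    subscribe(observer) {
      observers.add(observer);
      // Late subscribers immediately get the last value seen so far.
      if (hasLatest) observer(latest);
      // Only the first subscriber triggers the connection to the source.
      if (observers.size === 1) {
        disconnect = connect((value) => {
          latest = value;
          hasLatest = true;
          observers.forEach((o) => o(value));
        });
      }
      return () => {
        observers.delete(observer);
        // The last unsubscribe tears the source down and clears the cache.
        if (observers.size === 0 && disconnect) {
          disconnect();
          disconnect = null;
          hasLatest = false;
          latest = undefined;
        }
      };
    },
  };
}

// Fake source so the sketch can run without a server.
let connects = 0;
let disconnects = 0;
let push = null;
const shared = refCountedLatest((emit) => {
  connects++;
  push = emit;
  return () => { disconnects++; push = null; };
});

const seenByA = [];
const seenByB = [];
const unsubscribeA = shared.subscribe((v) => seenByA.push(v));
push('first');
const unsubscribeB = shared.subscribe((v) => seenByB.push(v)); // replays 'first'
push('second');
unsubscribeA();
unsubscribeB(); // last one out disconnects the source
console.log(seenByA, seenByB, connects, disconnects);
```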
I want to create a function that will make AJAX requests to backend. And if this function is called many times at the same time, then it should not make many identical requests to the server. It must make only 1 request.
For example:
doAJAX('http://example-1.com/').subscribe(res => console.log(res)); // must send a request
doAJAX('http://example-1.com/').subscribe(res => console.log(res)); // must NOT send a request
doAJAX('http://example-2.com/').subscribe(res => console.log(res)); // must send a request, because the URL is different
window.setTimeout(() => {
doAJAX('http://example-2.com/').subscribe(res => console.log(res)); // must send a request because too much time has passed since the last request
}, 3000)
All function calls should return a result, as if the request was actually made.
I think for this purpose I can use RxJS library.
I have done this:
const request$ = new Subject<string>();
const response$ = request$.pipe(
groupBy((url: string) => url),
flatMap(group => group.pipe(auditTime(500))), // make a request no more than once every 500 msec
map((url: string) => [
url,
from(fetch(url))
]),
share()
);
const doAJAX = (url: string): Observable<any> => {
return new Observable(observe => {
response$
.pipe(
filter(result => result[0] === url),
first(),
flatMap(result => result[1])
)
.subscribe(
(response: any) => {
observe.next(response);
observe.complete();
},
err => {
observe.error(err);
}
);
request$.next(url);
});
}
I create a request$ subject and a response$ observable. The doAJAX function subscribes to response$ and sends the URL string to the request$ subject. There are also groupBy and auditTime operators in the response$ stream, and a filter operator in the doAJAX function.
This code works, but I think it is overly complicated. Is there a way to make this task easier? Maybe with an RxJS scheduler, or without the RxJS library at all?
As the whole point of this is to memoize Http results and delay repeated calls, you might consider your own memoization. Example:
const memoise = (func) => {
let cache: { [key:string]: Observable<any> } = {};
return (...args): Observable<any> => {
const cacheKey = JSON.stringify(args)
cache[cacheKey] = cache[cacheKey] || func(...args).pipe(share());
return cache[cacheKey].pipe(
tap(() => timer(1000).subscribe(() => delete cache[cacheKey]))
);
}
}
Here is a Stackblitz DEMO
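The caching idea itself does not depend on RxJS. Below is a minimal sketch of the same memoisation in plain JavaScript, with two differences that are my own assumptions: the entry expires a fixed time after the call (not after the first emission), and the clock is passed in so the behaviour can be checked without waiting. memoizeWithTtl is a made-up name.

```javascript
// Caches the result of fn per argument list for ttlMs milliseconds.
// `now` is injectable so the sketch can be exercised with a fake clock.
function memoizeWithTtl(fn, ttlMs, now = Date.now) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args);
    const hit = cache.get(key);
    if (hit && now() - hit.createdAt < ttlMs) {
      return hit.value; // still fresh: reuse the earlier result
    }
    const value = fn(...args);
    cache.set(key, { value, createdAt: now() });
    return value;
  };
}

// Fake clock and fake request function, mirroring the calls from the question.
let fakeNow = 0;
let calls = 0;
const doAJAX = memoizeWithTtl((url) => ({ url, call: ++calls }), 1000, () => fakeNow);

const first = doAJAX('http://example-1.com/');  // sends a request
const second = doAJAX('http://example-1.com/'); // reuses the first result
const third = doAJAX('http://example-2.com/');  // different URL, new request
fakeNow = 3000;
const fourth = doAJAX('http://example-2.com/'); // entry expired, new request
console.log(calls);
```

If fn returns a Promise or a shared Observable, callers that arrive while the request is still in flight get the same pending result, which is the point of the approach.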
I'm working with RxJs and I have to make a polling mechanism to retrieve updates from a server.
I need to make a request every second, parse the updates, emit them and remember the last id, because I need it to request the next batch of updates, like getUpdates(lastId + 1).
The first part is easy so I just use interval with mergeMap
let lastId = 0
const updates = Rx.Observable.interval(1000)
.map(() => lastId)
.mergeMap((offset) => getUpdates(offset + 1))
I'm collecting identifiers like this:
updates.pluck('update_id').scan(Math.max, 0).subscribe(val => lastId = val)
But this solution isn't purely reactive, and I'm looking for a way to avoid the "global" variable.
How can I improve the code while still being able to return an observable containing just the updates to the caller?
UPD.
The server response for getUpdates(id) looks like this:
[
{ update_id: 1, payload: { ... } },
{ update_id: 3, payload: { ... } },
{ update_id: 2, payload: { ... } }
]
It may contain any number of updates (including none), in any order.
Something like this? Note that this is an infinite stream since there is no condition to abort; you didn't give one.
// Just returns the ID as the update_id.
const fakeResponse = id => {
return [{ update_id: id }];
};
// Fakes the actual HTTP call with a network delay.
const getUpdates = id => Rx.Observable.of(null).delay(250).map(() => fakeResponse(id));
// Start with update_id = 0, then recursively call with the last
// returned ID incremented by 1.
// The actual emissions on this stream will be the full server responses.
const updates$ = getUpdates(0)
.expand(response => Rx.Observable.of(null)
.delay(1000)
.switchMap(() => {
const highestId = Math.max(...response.map(update => update.update_id));
return getUpdates(highestId + 1);
})
)
updates$.take(5).subscribe(console.log);
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/5.5.6/Rx.js"></script>
To define the termination of the stream, you probably want to hook into the switchMap at the end; use whatever property of response to conditionally return Observable.empty() instead of calling getUpdates again.
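One detail to watch out for with real responses: the question says a response may be empty, and Math.max() called with no arguments returns -Infinity, so the highest id should be folded together with the previous one. Here is a small sketch of that fold in plain JavaScript; highestUpdateId and fakeGetUpdates are hypothetical helpers, and the loop only stands in for the recursion that expand performs.

```javascript
// Returns the highest update_id seen so far; an empty response leaves it unchanged.
function highestUpdateId(lastId, updates) {
  return updates.reduce((max, update) => Math.max(max, update.update_id), lastId);
}

// Fake server: returns at most two updates with id >= offset, deliberately out of order.
const serverLog = [1, 2, 3, 4, 5];
const fakeGetUpdates = (offset) =>
  serverLog
    .filter((id) => id >= offset)
    .slice(0, 2)
    .reverse()
    .map((id) => ({ update_id: id }));

// Four polling steps; the state carried from step to step is just lastId.
let lastId = 0;
const requestedOffsets = [];
for (let step = 0; step < 4; step++) {
  const offset = lastId + 1;
  requestedOffsets.push(offset);
  lastId = highestUpdateId(lastId, fakeGetUpdates(offset));
}
console.log(requestedOffsets, lastId);
```

In the expand version, the value flowing through the recursion would have to carry lastId along with the response for this to work, for example as an object holding both.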