Observable make request multiple times and collect response together - rxjs

I have 19 users in my database.
My API returns at most 5 results per call.
To get all of them, I need to make 4 requests, each returning up to 5 users. The _start query parameter controls the position from which the next users are returned.
I'm trying to do this in RxJS together with redux-observable.
I have an idea, but my approach may be too imperative, which goes against the RxJS way of doing things.
// get users from API and `pipe` them helps me to see actual data and to count length of array
function getUsers(position = 0) {
return ajax.getJSON(`${API}/users?_start=${position}&_limit=5`).
pipe(map(({data}) => ({responseLength: data.length, data})))
}
// When I get a response whose array length is 5, I know I need to fetch again.
// The problem is here: if I recurse, I only get the last result, not both of them.
// If I put the previous result into an array and then push the recursion result into it as well, it becomes too complicated
// to manipulate this data later in userFetchEpic.
function count(data) {
return data.pipe(
map(item => {
if (item.responseLength === 5) {
count(getUsers(5));
}
return {type: "TEST" , item}
})
)
}
function userFetchEpic(action$) {
return action$
.pipe(
ofType(USER_FETCH),
mergeMap(() => {
return count(getUsers()).pipe(
map(i => i)
)
})
);
}
My code is here just to show my way of thinking.
The main problem is the recursion: how do I keep all the values together? If I save them in an array,
I then need to loop through an array of observables, and that sounds complicated in my head. :)
There is probably a much easier and better solution to this problem.

19 Users with 4 concurrent calls
I've re-arranged your get-users function to generate 4 Ajax calls, run them all concurrently, then flatten the result into one array. This should get you all 19 users in a single array.
function getUsers() {
return forkJoin(
[0,5,10,15].map(position =>
ajax.getJSON(`${API}/users?_start=${position}&_limit=5`)
)
).pipe(
map(resArray => resArray.flatMap(res => res.data))
)
}
function userFetchEpic(action$) {
return action$.pipe(
ofType(USER_FETCH),
mergeMap(_ => getUsers())
);
}
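The hard-coded [0,5,10,15] list works because the total of 19 users is known. As a minimal sketch, assuming the total and page size are available as plain numbers, the offsets could be derived instead. The helper name pageStarts is hypothetical and not part of RxJS or of the code above.

```javascript
// Hypothetical helper: list the `_start` offsets needed to page through
// `total` items, `pageSize` items at a time.
function pageStarts(total, pageSize) {
  const pageCount = Math.ceil(total / pageSize);
  return Array.from({ length: pageCount }, (_, page) => page * pageSize);
}

console.log(pageStarts(19, 5)); // [ 0, 5, 10, 15 ]
```

The result could then be mapped to ajax.getJSON calls and passed to forkJoin exactly as the literal array is.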
Generalize: Recursively get users
This, again, will return all 19 users, but this time you don't need to know that you have 19 users ahead of time. On the other hand, this makes all its calls in sequence, so I would expect it to be slower.
You'll notice this one is done recursively. You are creating a call stack this way, but so long as you don't have millions of users, it shouldn't be a problem.
function getUsers(position = 0) {
return ajax.getJSON(`${API}/users?_start=${position}&_limit=5`).pipe(
switchMap(({data}) =>
data.length < 5 ?
of(data) :
getUsers(position + 5).pipe(
map(recursiveData => data.concat(recursiveData))
)
)
);
}
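To make the stopping rule of the recursion easier to follow, here is a synchronous model of it that uses a plain in-memory array instead of Ajax calls and Observables. It is only a sketch of the control flow; fetchPage, collectUsers, and the fake user list are assumptions made up for illustration.

```javascript
// In-memory stand-in for the API: 19 users, at most `limit` per call.
const users = Array.from({ length: 19 }, (_, i) => ({ id: i + 1 }));
let callCount = 0;

function fetchPage(start, limit) {
  callCount += 1;
  return users.slice(start, start + limit);
}

// Same shape as the recursive getUsers: a short page ends the recursion,
// a full page means "ask again from the next offset and append the rest".
function collectUsers(position = 0, limit = 5) {
  const data = fetchPage(position, limit);
  return data.length < limit
    ? data
    : data.concat(collectUsers(position + limit, limit));
}

const all = collectUsers();
console.log(all.length, callCount); // 19 4
```

Note that if the total were an exact multiple of the page size, this rule would make one extra call that returns an empty page before stopping.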

Related

RxJS: execute concatMap in parallel

Is it possible to execute a higher-order observable in parallel, but still preserve the order when merging the results?
I have something looking like this:
invoker$: Observable<void>;
fetch: (index: number) => Observable<T[]>;
invoker$
.pipe(
concatMap((_, index) => fetch(index)),
scan((acc: T[], values) => [...acc, ...values], [])
)
.subscribe(/* Do something with the array */);
The idea is to have an observable that invokes a callback (e.g. a backend call that takes a considerable amount of time), generating a new observable that emits a single value (an array of some generic type). The returned values should be concatenated into another array while preserving their original fetch order.
I would, however, like the requests to be fired in parallel. So if the invoker$ is called rapidly, the requests are made in parallel and the results are merged as they complete.
My understanding is that the concatMap will wait for one observable to complete, before starting the next one. mergeMap will do it parallel, but won't do anything to preserve the order.
You can do it using mergeMap.
First, you need to pass the index together with the async response down the stream.
Then you can sort based on the index from the previous step.
Then you have two choices:
If the stream needs to end once all the requests have been made, and the responses are handled only once, all together, you can use reduce: https://rxmarbles.com/#reduce
If the stream needs to continue for another batch of requests, use scan followed by filter to wait until you reach the needed event count: https://rxmarbles.com/#scan and https://rxmarbles.com/#filter
I am going to give you some pseudo-code for both examples:
In the reduce case, the stream ends once all requests are sent:
invoker$
.pipe(
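// fetch returns an Observable (not a Promise), so tag each response with its index via map
mergeMap((_, index) => fetch(index).pipe(map(value => ({value, index})))),
reduce((acc: {value: T[], index: number}[], singleValue) => [...acc, singleValue], []),
// restore the request order, then merge the per-request arrays into one
map(array => array.sort((a, b) => a.index - b.index).flatMap(valueWithIndex => valueWithIndex.value))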
)
.subscribe(/* Do something with the array */);
In the multiple-use case, I am assuming the size of the batch to be constant:
invoker$
.pipe(
mergeMap((_, index) => fetch(index).pipe(map(value => ({value, index})))),
scan((acc: {value: T[], index: number}[], singleValue) => {
let resp = [...acc, singleValue];
// The scan can accumulate more than the batch size,
// so we need to limit it and restart for the new batch
if(resp.length > BATCH_SIZE) {
resp = [singleValue];
}
return resp;
}, []),
filter(array => array.length == BATCH_SIZE),
map(array =>
array
.sort(/*Sort on index here*/)
.map(valueWithIndex => valueWithIndex.value))
)
.subscribe(/* Do something with the array */);
2.1. In case the batch size is dynamic:
invoker$
.pipe(
mergeMap((_, index) => fetch(index).pipe(map(value => ({value, index})))),
withLatestFrom(batchSizeStream),
scan((acc: [T[], number], [singleValue, batchSize]) => {
let resp = [[...acc[0], singleValue], batchSize];
// The scan can accumulate more than the batch size,
// so we need to limit it and restart for the new batch
// NOTE: the batch size is dynamic and we do not want to drop data
// once the buffer size changes, so we need to drop the buffer
// only if the batch size did not change
if(resp[0].length > batchSize && acc[1] == batchSize) {
resp = [[singleValue], batchSize];
}
return resp;
}, [[],0]),
filter(arrayWithBatchSize =>
arrayWithBatchSize[0].length >= arrayWithBatchSize[1]),
map(arrayWithBatchSize =>
arrayWithBatchSize[0]
.sort(/*Sort on index here*/)
.map(valueWithIndex => valueWithIndex.value))
)
.subscribe(/* Do something with the array */);
EDIT: optimized sorting, added dynamic batch size case
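The sort-on-index step in these snippets can be isolated as a plain function, independent of RxJS. The sketch below assumes each response has been tagged as an object with index and value fields, as in the mergeMap step; the sample data and the name inRequestOrder are made up for illustration.

```javascript
// Hypothetical data: three responses that completed out of order, each
// tagged with the index of the request that produced it.
const arrived = [
  { index: 2, value: ['e', 'f'] },
  { index: 0, value: ['a', 'b'] },
  { index: 1, value: ['c', 'd'] },
];

// Restore request order, then merge the per-request arrays into one.
function inRequestOrder(tagged) {
  return [...tagged] // copy, so the input is not mutated
    .sort((a, b) => a.index - b.index)
    .flatMap(({ value }) => value);
}

console.log(inRequestOrder(arrived)); // [ 'a', 'b', 'c', 'd', 'e', 'f' ]
```

Using flatMap here yields a single flat array; a plain map would instead keep one sub-array per request.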
I believe that the operator you are looking for is forkJoin.
This operator will take as input a list of observables, fire them in parallel and will return a list of the last emitted value of each observable once they all complete.
forkJoin({
invoker: invoker$,
fetch: fetch$,
})
.subscribe(({invoker, fetch}) => {
console.log(invoker, fetch);
});
This behavior seems to be provided by the concatMapEager operator from the cartant/rxjs-etc library, written by Nicholas Jamieson (cartant), who's a developer on the core RxJS team.

Observable defaulting when not empty

I have some code that is reading from a database, iterating each row of data and performing some logic on it, then creating an observable that then writes to the database, adding it to an array (creating an array of observables), so that when the array of observables is subscribed to via forkJoin all the necessary data is written to the database.
This seems to work perfectly fine until the number of observables in the array gets quite large. The amount of rows can be anywhere from 0-6000, so the size of the array can grow up to this. When it does get to this size the observable no longer writes to the database but instead returns the default value from defaultIfEmpty. I'm stumped as to why it works normally with smaller amounts of observables, but suddenly becomes empty on larger amounts...
It might be a little clearer with a code example:
function writeToDB() {
// rows taken from the database, n = 0..6000
data = []
// array of observables
observables = []
for (const row of data) {
if (row.age > 20) {
// websocket between service and database, returns an observable
const observable = websocket.put(row).pipe(
o$.catchError((err) => {
return r$.of(err)
}),
o$.defaultIfEmpty({
success: true,
status: 200
})
);
observables.push(observable);
}
}
return forkJoin([...observables]);
}
Using this example works perfectly fine when subscribed to, except with a large data set where the array observables is about 5000 in length. At that point it starts to return the defaultIfEmpty values { success: true, status: 200 } and I cannot work out why... Any help or advice would be greatly appreciated.
It's not clear from what you've shown here. Still, if this works with a smaller number of calls, then there's a good chance that websocket exhibits some strange behavior at those numbers.
Something worth trying might be to limit the concurrency on your websocket calls.
function writeToDB(data) {
// data contains rows taken from the database, n = 0..6000
return from(data).pipe(
filter(row => row.age > 20),
map(row => websocket.put(row).pipe(
catchError(err => of(err)),
// last makes sure that mergeAll behaves like forkJoin
last(undefined, {
success: true,
status: 200
})
)),
// mergeAll lets you choose how many can run concurrently
// for example, at most 50 websocket calls are made at
// once here
mergeAll(50),
toArray()
);
}
I prefer map, mergeAll over mergeMap in this case (as I think you're less likely to miss the concurrent aspect of this), but you can use either.
function writeToDB(data) {
// data contains rows taken from the database, n = 0..6000
return from(data).pipe(
filter(row => row.age > 20),
mergeMap(row => websocket.put(row).pipe(
catchError(err => of(err)),
// last makes sure that mergeMap behaves like forkJoin
last(undefined, {
success: true,
status: 200
})
), 50), // <- sneaky! ;)
toArray()
);
}

Assert that a dynamic table is correctly ordered by date

Given a dynamically-loading table with a variable number of rows, how does one assert that the rows are correctly ordered by date?
This problem has two main challenges: (1) how does one compare dates within table rows using cypress; and (2) how does one handle dynamic loading in such a complex scenario?
So far, I have successfully managed to solve the first problem; however, I am unable to solve the second problem. My test works most of the time, but it will sometimes fail because the page hasn't finished loading before the assertions are hit. For example, the dates are out of order when the page is first loaded:
2023-12-23
2024-01-24
2022-02-25
2027-03-26
And then they get ordered following an XHR request:
2022-02-25
2023-12-23
2024-01-24
2027-03-26
Now, before you say anything: yes, I am already waiting for the XHR request to finish before I make any assertions. The problem is that there remains a small delay between when the request finishes, and when the actual DOM gets updated.
Normally this problem is solved automatically by Cypress. In Cypress, every call to .should() will automatically retry until the expected condition is found, or a timeout is reached. This fixes any problems related to dynamic loading.
However, .should() isn't the only way to assert something in Cypress. Alternatively, you can make direct assertions using Chai expressions, which is what Cypress uses under the hood when you call .should(). This is often required when making complex assertions such as the kind that I am making in this scenario.
Let's take a look at what I have so far:
cy.get('tbody tr').each(($row, $index, $rows) => { // foreach row in the table
if ($index > 0) { // (skipping the first row)
cy.wrap($row).within(() => { // within the current row...
cy.get('td').eq(7).then(($current_td) => { // ... get the eighth td (where the date is)
cy.wrap($rows[$index - 1]).within(() => { // within the previous row...
cy.get('td').eq(7).then(($previous_td) => { // ... get the eighth td
expect(dayjs($current_td.text().toString()).unix()) // assert that the date of the current td...
.gt(dayjs($previous_td.text().toString()).unix()) // ... is greater than the previous one.
})
})
})
})
}
})
Now, one option you have in Cypress is to replace .then() with .should(). Doing this allows the user to continue to benefit from the polling nature of .should() while also using multiple Chai expressions directly. Unfortunately, I can't seem to get this to work. Here are some of the attempts that I made:
cy.get('tbody tr').each(($row, $index, $rows) => {
if ($index > 0) {
cy.wrap($row).within(() => {
cy.get('td').eq(7).then(($current_td) => {
cy.wrap($rows[$index - 1]).within(() => {
cy.get('td').eq(7).should(($previous_td) => { // replacing with .should() here doesn't help, because it only attempts to retry on $previous_td, but we actually need to retry $current_td as well
expect(dayjs($current_td.text().toString()).unix())
.gt(dayjs($previous_td.text().toString()).unix())
})
})
})
})
}
})
cy.get('tbody tr').each(($row, $index, $rows) => {
if ($index > 0) {
cy.wrap($row).within(() => {
cy.get('td').eq(7).should(($current_td) => { // causes an infinite loop!
cy.wrap($rows[$index - 1]).within(() => {
cy.get('td').eq(7).then(($previous_td) => {
expect(dayjs($current_td.text().toString()).unix())
.gt(dayjs($previous_td.text().toString()).unix())
})
})
})
})
}
})
The only other solution I can think of is to hardcode my own polling. This is the sort of thing that I do all the time when writing tests in Selenium. However, my experience with Cypress leads me to believe that I shouldn't need to do this, ever. It's just a matter of wrangling Cypress to do what I expect it to do.
That said, I'm coming up empty handed. So, what do?
UPDATE
After learning from gleb's answer, I finally landed on this simple solution:
const dayjs = require('dayjs')
chai.use(require('chai-sorted'));
cy.get('tbody tr td:nth-of-type(8)').should($tds => {
const timestamps = Cypress._.map($tds, ($td) => dayjs($td.innerText).unix())
expect(timestamps).to.be.sorted()
})
I now feel that a core part of my problem was not understanding jQuery well enough to write a single selection statement. Furthermore, I wasn't familiar with lodash map or chai-sorted.
You need to use a single cy.get(...).should(...) callback where the callback grabs all date strings, converts them into timestamps, then checks if the timestamps are sorted. Cypress then retries the cy.get command until the table is sorted and the should callback passes. Here is some sample code; see the full dynamic example at https://glebbahmutov.com/cypress-examples/recipes/sorted-list.html
// assuming a two-column table sorted by its second column
// ('td + td' matches every cell that follows another cell in the same row)
cy.get('table tbody td + td').should($cells => {
const timestamps = Cypress._.map($cells, ($cell) => $cell.innerText)
.map((str) => new Date(str))
.map((d) => d.getTime())
// check if the timestamps are sorted
const sorted = Cypress._.sortBy(timestamps)
expect(timestamps, 'sorted timestamps').to.deep.equal(sorted)
})
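The check performed inside the should callback is an ordinary array test, so it can be tried outside Cypress. As a small sketch, assuming ISO-style date strings like the ones in the question (the helper name isSortedByDate is hypothetical):

```javascript
// Convert date strings to timestamps and check they never decrease.
function isSortedByDate(dateStrings) {
  const timestamps = dateStrings.map((str) => new Date(str).getTime());
  return timestamps.every((t, i) => i === 0 || timestamps[i - 1] <= t);
}

// The two table states described in the question:
const beforeXhr = ['2023-12-23', '2024-01-24', '2022-02-25', '2027-03-26'];
const afterXhr = ['2022-02-25', '2023-12-23', '2024-01-24', '2027-03-26'];

console.log(isSortedByDate(beforeXhr)); // false
console.log(isSortedByDate(afterXhr));  // true
```

This version accepts equal neighbouring dates; use a strict comparison if duplicates should count as a failure. Because the whole check runs on one snapshot of the cells, a retry re-reads every row at once rather than mixing stale and fresh values.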

Using RxJS to remove nested callbacks when things must be done in sequence

I need to do one HTTP request after another, but the second one can't start until after the first one has finished because the second needs as a parameter a value returned from the first.
Here it is with a nested callback which is great because it works and it's fairly clear from reading the code what is happening.
this.isLoading = true;
this.firstService.get(this.id)
.subscribe((response: FirstReturnType) => {
this.firstThing = response;
this.secondService.get(this.firstThing.secondId)
.subscribe(
(response: SecondReturnType) => {
this.secondThing = response;
this.isLoading = false;
}
);
});
The claim I see people making is that nested callbacks are bad and that one should use RxJS to make it better.
However, nobody making these claims has been able to produce a working example. Can you?
Your Code Re-written
Here is some code that has a 1-1 correspondence with your code, but it is flattened:
this.isLoading = true;
this.firstService.get(this.id).pipe(
mergeMap((response: FirstReturnType) => {
this.firstThing = response;
return this.secondService.get(response.secondId);
})
).subscribe((response: SecondReturnType) => {
this.secondThing = response;
this.isLoading = false;
});
What this gets right: you're using a higher-order observable operator to map a value emitted by one observable into a new observable that you subscribe to. In this case, mergeMap is subscribing for you and getting rid of your nesting.
For Your Consideration
Consider this. The following is about as clean as six service calls in a row (each giving some value to the next one) can look if you're not using a higher-order operator:
this.firstService.getThing("First").subscribe(result1 => {
this.secondService.getThing(result1.value).subscribe(result2 => {
this.thirdService.getThing(result2.value).subscribe(result3 => {
this.fourthService.getThing(result3.value).subscribe(result4 => {
this.fifthService.getThing(result4.value).subscribe(result5 => {
this.sixthService.getThing(result5.value).subscribe(result6 => {
console.log("Result Six is: " + result6.value);
});
});
});
});
});
});
Here's the exact same thing with mergeMap:
this.firstService.getThing("First").pipe(
mergeMap(result1 => this.secondService.getThing(result1.value)),
mergeMap(result2 => this.thirdService.getThing(result2.value)),
mergeMap(result3 => this.fourthService.getThing(result3.value)),
mergeMap(result4 => this.fifthService.getThing(result4.value)),
mergeMap(result5 => this.sixthService.getThing(result5.value)),
).subscribe(result6 => {
console.log("Result Six is: " + result6.value);
});
If that's not enough to convince you, you can lean a little bit more into some functional programming to make this even cleaner (without repetitively naming each result):
const passValueToService = service => result => service.getThing(result.value);
passValueToService(this.firstService)("First").pipe(
mergeMap(passValueToService(this.secondService)),
mergeMap(passValueToService(this.thirdService)),
mergeMap(passValueToService(this.fourthService)),
mergeMap(passValueToService(this.fifthService)),
mergeMap(passValueToService(this.sixthService)),
).subscribe(finalResult => {
console.log("Result Six is: " + finalResult.value);
});
Or why not lean EVEN harder and keep our list of services in an array?
const [firstS, ...restS] = [this.firstService, this.secondService, this.thirdService, this.fourthService, this.fifthService, this.sixthService];
const passValueToService = service => result => service.getThing(result.value);
passValueToService(firstS)({value: "First"}).pipe(
...restS.map(service => mergeMap(passValueToService(service)))
).subscribe(finalResult => {
console.log("Result Six is: " + finalResult.value);
});
None of these simplifications are very easily done while nesting subscribe calls. But with the help of some functional currying (and the handy RxJS pipe to compose with), you can begin to see that your options expand dramatically.
Understanding concatMap, mergeMap, & switchMap
The Setup
We'll have 3 helper functions as described here:
/****
* Operator: intervalArray
* -----------------------
* Takes arrays emitted by the source and spaces out their
* values by the given interval time in milliseconds
****/
function intervalArray<T>(intervalTime = 1000): OperatorFunction<T[], T> {
return s => s.pipe(
concatMap((v: T[]) => concat(
...v.map((value: T) => EMPTY.pipe(
delay(intervalTime),
startWith(value)
))
))
);
}
/****
* Emit 1, 2, 3, then complete: each 0.5 seconds apart
****/
function n123Stream(): Observable<number> {
return of([1,2,3]).pipe(
intervalArray(500)
);
}
/****
* maps:
* 1 => 10, 11, 12, then complete: each 1 second apart
* 2 => 20, 21, 22, then complete: each 1 second apart
* 3 => 30, 31, 32, then complete: each 1 second apart
****/
function numberToStream(num): Observable<number>{
return of([num*10, num*10+1, num*10+2]).pipe(
intervalArray(1000)
);
}
The above mapping function (numberToStream), takes care of the map part of concatMap, mergeMap, and switchMap
Subscribing to each operator
The following three snippets of code will all have different outputs:
n123Stream().pipe(
concatMap(numberToStream)
).subscribe(console.log);
n123Stream().pipe(
mergeMap(numberToStream)
).subscribe(console.log);
n123Stream().pipe(
switchMap(numberToStream)
).subscribe(console.log);
If you want to run these back-to-back:
concat(
...[concatMap, mergeMap, switchMap].map(
op => n123Stream().pipe(
op(numberToStream),
startWith(`${op.name}: `)
)
)
).subscribe(console.log);
concatMap:
concatMap will not subscribe to the second inner observable until the first one is complete. That means that the number 12 will be emitted before the second observable (starting with the number 20) is subscribed to.
The output:
10 11 12 20 21 22 30 31 32
All the 10s are before the 20s and all the 20s are before the 30s
mergeMap:
mergeMap will subscribe to the second observable the moment the second value arrives and then to the third observable the moment the third value arrives. It doesn't care about the order of output or anything like that.
The output
10 20 11 30 21 12 31 22 32
The 10s are earlier because they started earlier and the 30s are later because they start later, but there's some interleaving in the middle.
switchMap
switchMap will subscribe to the first observable the moment the first value arrives. It will unsubscribe from the first observable and subscribe to the second observable the moment the second value arrives (and so on).
The output
10 20 30 31 32
Only the final observable ran to completion in this case. The first two only had time to emit their first value before being unsubscribed. Just like concatMap, there is no interleaving and only one inner observable is running at a time, but some emissions are effectively dropped.
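The three outputs above can be reproduced from the timings alone. The following is a timeline model in plain JavaScript, not RxJS: it assumes the outer values 1, 2, 3 arrive 500 ms apart and each inner stream emits its three values 1000 ms apart, as in the setup, and that emissions at the same instant come out in the order their inner streams were started. The function names are made up for illustration.

```javascript
// Timeline model (no RxJS): outer values 1, 2, 3 arrive 500 ms apart and
// each maps to three inner values emitted 1000 ms apart.
const outer = [1, 2, 3].map((value, i) => ({ value, time: i * 500 }));
const INNER_GAP = 1000;
const innerValues = (n) => [n * 10, n * 10 + 1, n * 10 + 2];

// Sort emissions by time; Array.prototype.sort is stable, so ties keep
// the order in which the inner streams were started.
const byTime = (events) =>
  [...events].sort((a, b) => a.time - b.time).map((e) => e.value);

function simulateConcat() {
  // An inner stream starts only after the previous one has completed,
  // so the output is simply each inner stream in turn.
  return outer.flatMap(({ value }) => innerValues(value));
}

function simulateMerge() {
  // Every inner stream starts the moment its outer value arrives.
  return byTime(
    outer.flatMap(({ value, time }) =>
      innerValues(value).map((v, k) => ({ value: v, time: time + k * INNER_GAP }))
    )
  );
}

function simulateSwitch() {
  // An inner stream is cut off as soon as the next outer value arrives.
  return byTime(
    outer.flatMap(({ value, time }, i) => {
      const cutoff = i + 1 < outer.length ? outer[i + 1].time : Infinity;
      return innerValues(value)
        .map((v, k) => ({ value: v, time: time + k * INNER_GAP }))
        .filter((e) => e.time < cutoff);
    })
  );
}

console.log(simulateConcat().join(' ')); // 10 11 12 20 21 22 30 31 32
console.log(simulateMerge().join(' '));  // 10 20 11 30 21 12 31 22 32
console.log(simulateSwitch().join(' ')); // 10 20 30 31 32
```

Changing the 500 and 1000 ms gaps in the model is a quick way to see how the interleaving for mergeMap, and the set of dropped values for switchMap, depend on timing.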
You can use switchMap.
this.firstService.get(this.id)
.pipe(
tap((response: FirstReturnType) => this.firstThing = response),
switchMap((response: FirstReturnType) => this.secondService.get(response.secondId)),
tap((response: SecondReturnType) => {
this.secondThing = response;
this.isLoading = false;
})
).subscribe();

Resolve array of observables and append in final array

I have an endpoint url like http://site/api/myquery?start=&limit= which returns an array of strings.
If I call this endpoint in this way, the server hangs since the array of strings length is huge.
I need to generate an array of observables with incremental "start" and "limit" parameters, resolve all of them either in sequence or in parallel, and then get a final observable which at the end yields the full array of strings, obtained by merging all the subarrays of strings returned by the inner observables.
How should I do that?
i.e. the array of observables would be something like
[
httpClient.get('http://site/api/myquery?start=0&limit=1000'),
httpClient.get('http://site/api/myquery?start=1000&limit=1000'),
httpClient.get('http://site/api/myquery?start=2000&limit=1000'),
....
]
If you know the length before making all these queries, you can create as many http-get Observables as you need and then forkJoin them using a projection function.
forkJoin will let you make parallel queries and then merge the results of those queries. Here's an example:
import { forkJoin } from 'rxjs';
// given we know the length:
const LENGTH = 500;
// we can pick arbitrary page size
const PAGE_SIZE = 50;
// calculate requests count
const requestsCount = Math.ceil(LENGTH / PAGE_SIZE);
// generate calculated number of requests
const requests = (new Array(requestsCount))
.fill(void 0)
.map((_,i) => {
const start = i * PAGE_SIZE;
return http.get(`http://site/api/myquery?start=${start}&limit=${PAGE_SIZE}`);
});
forkJoin(
requests,
// projecting fn
// merge all arrays into one
// suboptimal merging, just for example
(...results) => results.reduce(((acc, curr)=> [...acc, ...curr]) , [])
).subscribe(array => {
console.log(array);
})
Check this forkJoin example for reference.
Hope this helps
In the case that you do not know the total number of items, you can do this using expand.
The following article gives a good introduction to expand and an explanation of how to use it for pagination.
https://ncjamieson.com/understanding-expand/
Something along the lines of the code below would work in your case, making the requests for each page in series.
const limit = 1000;
let currentStart = 0;
let getUrl = (start, limit) => `http://site/api/myquery?start=${start}&limit=${limit}`;
httpClient.get(getUrl(currentStart, limit)).pipe(
expand(itemsArray => {
if (itemsArray.length) {
currentStart += limit;
return httpClient.get(getUrl(currentStart, limit));
}
return empty();
}),
reduce((acc, value) => [...acc, ...value]),
).subscribe(itemsArray => {
console.log(itemsArray);
})
This will log out the final array of items once the entire series of requests has been resolved.

Resources