Elasticsearch BulkAll with NEST: Maximum capacity exceeded - elasticsearch

I'm using the following code to bulk index documents. It works for everything except my Product model, and it only fails when I try to index more than a few documents: with 1 document it works fine, with 10 it fails. My Product model is not very complex, but it does have some nested documents with infinite self-referencing loops; I added ReferenceLoopHandling.Ignore to handle them.
public bool BulkIndex<T>(IEnumerable<T> items) where T : class
{
var waitHandle = new CountdownEvent(1);
var bulkAll = _client.BulkAll(items, b => b
.BackOffRetries(2)
.BackOffTime(TimeSpan.FromSeconds(5))
.RefreshOnCompleted(true)
.MaxDegreeOfParallelism(4)
.Size(100)
.Index(typeof(T).Name.ToLower())
);
bulkAll.Subscribe(new BulkAllObserver(
onNext: (b) => { Console.Write("."); },
onError: (e) => { throw e; },
onCompleted: () => waitHandle.Signal()
));
waitHandle.Wait();
return true;
}
new JsonNetSerializer(builtInSerializer, connectionSettings, () => new JsonSerializerSettings
{
ReferenceLoopHandling = ReferenceLoopHandling.Ignore
}))

Related

Why using a createSelector function in another file causes re-render vs creating "inline", both with useMemo

I have this app that I'm working on that is using RTK and in the documentation for selecting values from results, in queries using RTK Query, they have an example with a createSelector and React.useMemo. Here's that code and the page
import { createSelector } from '@reduxjs/toolkit'
import { selectUserById } from '../users/usersSlice'
import { useGetPostsQuery } from '../api/apiSlice'
export const UserPage = ({ match }) => {
const { userId } = match.params
const user = useSelector(state => selectUserById(state, userId))
const selectPostsForUser = useMemo(() => {
const emptyArray = []
// Return a unique selector instance for this page so that
// the filtered results are correctly memoized
return createSelector(
res => res.data,
(res, userId) => userId,
(data, userId) => data?.filter(post => post.user === userId) ?? emptyArray
)
}, [])
// Use the same posts query, but extract only part of its data
const { postsForUser } = useGetPostsQuery(undefined, {
selectFromResult: result => ({
// We can optionally include the other metadata fields from the result here
...result,
// Include a field called `postsForUser` in the hook result object,
// which will be a filtered list of posts
postsForUser: selectPostsForUser(result, userId)
})
})
// omit rendering logic
}
So I did the same in my app, but I thought that if it's using the createSelector then it can be in a separate slice file. So I have this code in a slice file:
export const selectFoo = createSelector(
[
(result: { data?: TypeOne[] }) => result.data,
(result: { data?: TypeOne[] }, status: TypeTwo) => status,
],
(data: TypeOne[] | undefined, status) => {
return data?.filter((d) => d.status === status) ?? [];
}
);
Then I created a hook that uses this selector so that I can just pass in a status value and get the filtered results. This is in another file as well.
function useGetFooByStatus(status: TypeTwo) {
const selectFooMemoized = useMemo(() => {
return selectFoo;
}, []);
const { foos, isFetching, isSuccess, isError } =
useGetFoosQuery(
"key",
{
selectFromResult: (result) => ({
isError: result.isError,
isFetching: result.isFetching,
isSuccess: result.isSuccess,
isLoading: result.isLoading,
error: result.error,
foos: selectFooMemoized(result, status),
}),
}
);
return { foos, isFetching, isSuccess, isError };
}
Then lastly I'm using this hook in several places in the app.
The problem is that a re-render triggered in another part of the app causes the query hook to run again (I think), and the selector function then runs again instead of returning the memoized value, even though nothing has changed. I haven't figured out what causes the re-render in the other part of the app, but the following change stops it.
If I replace the imported selector in useGetFooByStatus with an identical one created inside the hook, the value is memoized correctly.
(Just to remove any doubt, the hook would look like this)
function useGetFooByStatus(status: TypeTwo) {
const selectFooMemoized = useMemo(() => {
return createSelector(
[
(result: { data?: TypeOne[] }) => result.data,
(result: { data?: TypeOne[] }, status: TypeTwo) =>
status,
],
(data: TypeOne[] | undefined, status) =>
data?.filter((d) => d.status === status) ?? []
);
}, []);
const { foos, isFetching, isSuccess, isError } =
useGetFoosQuery(
"key",
{
selectFromResult: (result) => ({
isError: result.isError,
isFetching: result.isFetching,
isSuccess: result.isSuccess,
isLoading: result.isLoading,
error: result.error,
foos: selectFooMemoized(result, status),
}),
}
);
return { foos, isFetching, isSuccess, isError };
}
Sorry for the long question, just want to try and explain everything :)
Solution 1 has one selector shared by your whole app. That selector has a cache size of 1, so if you always call it with the same arguments it will not recalculate; but if you call it with 1, then 2, then 1, then 2, it will recalculate every time and always return a new object as the result.
Solution 2 creates one such selector per component instance.
Now imagine two different components calling these selectors - with two different queries with two different results.
Solution 1 will flip-flop and always create a new result - solution 2 will stay stable "per-component" and not cause rerenders.
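To make the flip-flop concrete, here is a minimal sketch in plain JavaScript. It does not use Reselect; memoizeLast is a hypothetical stand-in that, like the selector described above, remembers only its most recent arguments and result. The sample data and the filterByStatus helper are also made up for illustration.

```javascript
// Hypothetical stand-in for a selector with a cache size of 1:
// it remembers only the most recent arguments and result.
function memoizeLast(fn) {
  let lastArgs = null;
  let lastResult;
  return (...args) => {
    const same =
      lastArgs !== null &&
      lastArgs.length === args.length &&
      args.every((arg, i) => arg === lastArgs[i]);
    if (!same) {
      lastArgs = args;
      lastResult = fn(...args);
    }
    return lastResult;
  };
}

// Made-up sample data and filter function.
const data = [
  { id: 1, status: "open" },
  { id: 2, status: "done" },
];
const filterByStatus = (items, status) =>
  items.filter((d) => d.status === status);

// One shared instance called alternately with two statuses (like solution 1).
const shared = memoizeLast(filterByStatus);
const sharedOpen1 = shared(data, "open");
shared(data, "done"); // overwrites the cached "open" entry
const sharedOpen2 = shared(data, "open");
console.log(sharedOpen1 === sharedOpen2); // false: a new array every time

// One instance per caller (like solution 2).
const forOpen = memoizeLast(filterByStatus);
const forDone = memoizeLast(filterByStatus);
const ownOpen1 = forOpen(data, "open");
forDone(data, "done");
const ownOpen2 = forOpen(data, "open");
console.log(ownOpen1 === ownOpen2); // true: the cached array is reused
```

Because the shared instance hands back a new array reference on every alternating call, anything that compares results by reference sees a change and re-renders.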
Does the following work:
const EMPTY = [];
const createSelectFoo = (status: TypeTwo) => createSelector(
[
(result: { data?: TypeOne[] }) => result.data,
],
(data: TypeOne[] | undefined) => {
return data?.filter((d) => d.status === status) ?? EMPTY;
}
);
function useGetFooByStatus(status: TypeTwo) {
//only create selector if status changes, this will
// memoize the result when multiple components
// call this hook with different status in one render
// cycle
const selectFooMemoized = useMemo(() => {
return createSelectFoo(status);
}, [status]);
const { foos, isFetching, isSuccess, isError } =
useGetFoosQuery(
"key",
{
selectFromResult: (result) => ({
isError: result.isError,
isFetching: result.isFetching,
isSuccess: result.isSuccess,
isLoading: result.isLoading,
error: result.error,
foos: selectFooMemoized(result),
}),
}
);
return { foos, isFetching, isSuccess, isError };
}
You may want to make your component a pure component with React.memo, some more information with examples of selectors can be found here

How to get all documents by index in Elasticsearch using NEST?

I want to GET all my documents by Index. I have tried the following:
var response = client.Search(s => s.Index("test").MatchAll());
The response reports a successful operation, but it contains no hits, despite the fact that there are many documents under that index.
To get all documents within an index, you'll want to use the Scroll API. Note that depending on how many documents we're talking about, it's likely that you'll receive them in batches through multiple HTTP requests/responses.
NEST has a helper that makes this easier, ScrollAll():
Time processTimePerScroll = "20s";
int numberOfSlices = Environment.ProcessorCount;
var scrollAllObservable = client.ScrollAll<Person>(processTimePerScroll, numberOfSlices, sc => sc
.MaxDegreeOfParallelism(numberOfSlices)
.Search(s => s
.Query(q => q
.MatchAll()
)
)
);
var waitHandle = new ManualResetEvent(false);
Exception exception = null;
var scrollAllObserver = new ScrollAllObserver<Person>(
onNext: response =>
{
// do something with the documents
var documents = response.SearchResponse.Documents;
},
onError: e =>
{
exception = e;
waitHandle.Set();
},
onCompleted: () => waitHandle.Set()
);
scrollAllObservable.Subscribe(scrollAllObserver);
waitHandle.WaitOne();
if (exception != null)
{
throw exception;
}

RxJS/ReactiveX Proper modules communication

I'm pretty new to Reactive Programming but already in love. However, it is still hard to switch my brain to it. I'm trying to follow all the recommendations, such as "Avoid using subjects", "Avoid impure functions" and of course "Avoid imperative code."
What I'm finding hard to achieve is simple cross modules communications where one module can register "action"/observable and the other could subscribe and react to it. A simple message bus will probably work but this will enforce the usage of Subjects and imperative code style which I'm trying to avoid.
So here is a simple starting point I'm playing with:
// some sandbox
class Api {
constructor() {
this.actions = {};
}
registerAction(actionName, action) {
// I guess this part will have to be changed
this.actions[actionName] = action.publishReplay(10).refCount();
//this.actions[actionName].connect();
}
getAction(actionName) {
return this.actions[actionName];
}
}
const api = new Api();
// -------------------------------------------------------------------
// module 1
let myAction = Rx.Observable.create((obs) => {
console.log("EXECUTING");
obs.next("42 " + Date.now());
obs.complete();
});
api.registerAction("myAction", myAction);
let myTrigger = Rx.Observable.interval(1000).take(2);
let executedAction = myTrigger
.flatMap(x => api.getAction("myAction"))
.subscribe(
(x) => { console.log(`executed action: ${x}`); },
(e) => {},
() => { console.log("completed");});
// -------------------------------------------------------------------
// module 2
api.getAction("myAction")
.subscribe(
(x) => { console.log(`SECOND executed action: ${x}`); },
(e) => {},
() => { console.log("SECOND completed");});
Currently, the moment the second module subscribes, it "triggers" the "myAction" Observable, which in a real-life scenario could be an ajax call. Is there any way to make all subscribers wait until "myAction" is called properly from module 1? Again, it's easy to do with subjects, but I'm trying to follow recommended practices.
If I understand you correctly, when you call api.getAction you want subsequent values in that observable to wait until the call to getAction completes before they are handled.
You can achieve this quite easily with concatMap. concatMap takes a function that returns an observable (in your case, the call to getAction) and waits until that observable completes before it starts handling the next value.
So if you change your code like this, it should work (if I understood correctly).
let executedAction = myTrigger
.concatMap(x => api.getAction("myAction"))
.subscribe(
(x) => { console.log(`executed action: ${x}`); },
(e) => {},
() => { console.log("completed");});
If myTrigger has a new value, it will not be handled until the observable returned from api.getAction completes.
So here is a much simpler solution than the one I had in mind, using just two observables. A similar effect could be achieved with schedulers and subscribeOn.
// some sandbox
class Action {
constructor(name, observable) {
this.name = name;
this.observable = observable;
this.replay = new Rx.ReplaySubject(10);
}
}
function actionFactory(action, param) {
return Rx.Observable.create(obs => {
action.observable
.subscribe(x => {
obs.next(x);
action.replay.next(x);
}, (e) => {}, () => obs.complete());
});
}
class Api {
constructor() {
this.actions = {};
}
registerAction(actionName, action) {
let generatedAction = new Action(actionName, action);
this.actions[actionName] = generatedAction;
return actionFactory.bind(null, generatedAction);
}
getAction(actionName) {
return this.actions[actionName].replay;
}
}
const api = new Api();
// -------------------------------------------------------------------
// module 1
let myAction = Rx.Observable.create((obs) => {
obs.next("42 " + Date.now());
obs.complete();
});
let myRegisteredAction$ = api.registerAction("myAction", myAction);
let myTrigger = Rx.Observable.interval(1000).take(1).delay(1000);
let executedAction = myTrigger
.map(x => { return { someValue: x} })
.concatMap(x => myRegisteredAction$(x))
.subscribe(
(x) => { console.log(`MAIN: ${x}`); },
(e) => { console.log("error", e)},
() => { console.log("MAIN: completed");});
// -------------------------------------------------------------------
// module 2
var sub = api.getAction("myAction")
.subscribe(
(x) => { console.log(`SECOND: ${x}`); },
(e) => {console.log("error : " + e)},
() => { console.log("SECOND: completed");});

How to call multiple mutations at the same time?

I have an array of ids, and I created a mutation that allow me to delete an item using only 1 id. Is there any way to call this mutation multiple times using Relay.Store.commitUpdate or this.props.relay.commitUpdate ?
I think you can wrap each Relay.Store.commitUpdate in Promise:
commitMutationPromise = (Mutation, data) =>
new Promise((resolve, reject) => {
Relay.Store.commitUpdate(new Mutation(data), {
onSuccess: (transaction) => {
resolve(transaction);
},
onFailure: (transaction) => {
reject(transaction);
},
});
});
Then commit your mutations as an array of promises and collect the results with Promise.all (but keep in mind its fail-fast behaviour).
It could be something like this:
handleDelete = (deleteIds) => {
const deletePromisesArray = [];
deleteIds.forEach(id => {
deletePromisesArray.push(
this.commitMutationPromise(DeleteMutation, { id })
);
});
Promise.all(deletePromisesArray).then(values => {
this.onSuccessDelete(values);
}, error => {
this.onFailDelete(error);
});
}
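On the fail-fast caveat: the sketch below shows the difference between Promise.all and the standard Promise.allSettled. It does not use Relay; fakeDelete is a hypothetical stand-in for commitMutationPromise that rejects for one id to simulate a failed delete.

```javascript
// Hypothetical stand-in for commitMutationPromise: resolves for most ids
// and rejects for id 2 to simulate one failed delete.
const fakeDelete = (id) =>
  id === 2
    ? Promise.reject(new Error(`delete ${id} failed`))
    : Promise.resolve({ deletedId: id });

const ids = [1, 2, 3];

// Promise.all rejects as soon as any promise rejects, so the results
// of the deletes that succeeded are not delivered to the caller.
const allOutcome = Promise.all(ids.map(fakeDelete)).then(
  (values) => ({ ok: true, values }),
  (error) => ({ ok: false, message: error.message })
);

// Promise.allSettled waits for every promise and reports each outcome.
const settledOutcome = Promise.allSettled(ids.map(fakeDelete)).then(
  (results) => ({
    deleted: results
      .filter((r) => r.status === "fulfilled")
      .map((r) => r.value.deletedId),
    failed: results
      .filter((r) => r.status === "rejected")
      .map((r) => r.reason.message),
  })
);

allOutcome.then((outcome) => console.log(outcome));
settledOutcome.then((outcome) => console.log(outcome));
```

If you need to know which deletes succeeded even when some fail, the allSettled variant may be the better fit. Either way the mutations themselves are all started; only the reporting differs.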

ReactiveUI Testing

I am attempting to see if the results of a view model are performing the correct actions.
My observables are setup as follows:
public FilterBoxViewModel()
{
var asyncFilterResults = this.filterItemsCommand.RegisterAsyncTask(x =>
this.PerformFilter(x as string));
this.filteredItems = new ObservableAsPropertyHelper<IEnumerable<IFilterBoxItem>>(
asyncFilterResults, _ => this.RaisePropertyChanged("FilteredItems"));
this.WhenAnyValue(x => x.SearchTerm)
.Throttle(TimeSpan.FromMilliseconds(50))
.Skip(1)
.Subscribe(this.filterItemsCommand.Execute);
}
Then further down I have
private async Task<IEnumerable<IFilterBoxItem>> PerformFilter(string searchTerm)
{
if (string.IsNullOrWhiteSpace(searchTerm))
{
return Enumerable.Empty<IFilterBoxItem>();
}
// Perform getting the items on the main thread and async await the results.
// This is provide a immutable version of the results so we don't cause
// threading issues.
var items = await Observable.Start(
() => this.FilterBoxManager.RootElements.GetAllItemsEnumerable()
.ToList().Select(x => new { Name = x.Name, Item = x }),
RxApp.MainThreadScheduler);
return
items.Where(x =>
x.Name.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0)
.Select(x => x.Item);
}
In my test, I am running the test scheduler and advancing it; however, PerformFilter runs at different times than I expect.
For example, my test is:
(new TestScheduler()).With(scheduler =>
{
var viewModel = new FilterBoxViewModel();
var testManager = new TestManager { RootElements = this.sampleItems };
viewModel.FilterBoxManager = testManager;
viewModel.SearchTerm = "folder";
scheduler.AdvanceBy(TimeSpan.FromMilliseconds(51).Ticks);
Assert.AreEqual(viewModel.FilteredItems.Select(x => x.Name), folderSearchResults);
viewModel.SearchTerm = "apple";
Assert.AreEqual(viewModel.FilteredItems.Select(x => x.Name), appleSearchResults);
});
How do I make the tester more predictable?
I am running ReactiveUI 5.5.1 in a XAML application.
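Your Throttle doesn't set a scheduler, so it runs on the default scheduler rather than the TestScheduler; this is a classic TestScheduler mistake. Passing one explicitly, for example Throttle(TimeSpan.FromMilliseconds(50), RxApp.MainThreadScheduler), should put it under the test scheduler's control.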
