Update Bulk Data in Mongodb - performance

I have 50k users in my collection and I want to add a new field, say rowNumber, to every record. I tried the cursor-based solution below, but it takes too much time. How can I improve performance, or what would be a better approach?
let user, cursor, rowNumber = 1;
cursor = User.aggregate().cursor({ batchSize: 10 }).exec();
while ((user = await cursor.next())) {
  let savedUser = await User.updateOne(
    { _id: user._id },
    { $set: { rowNumber: rowNumber++ } }
  );
}

MongoDB has a bulkWrite functionality just for that purpose.
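// Load the users once, then send all updates in a single bulkWrite call
const users = await User.find().exec();
const writeOperations = users.map((user, i) => {
  return {
    updateOne: {
      filter: { _id: user._id },
      // Use $set explicitly so only rowNumber changes
      update: { $set: { rowNumber: i + 1 } }
    }
  };
});
await User.bulkWrite(writeOperations);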
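For a larger collection it may help to avoid loading full documents and to send the operations in batches. The following is a minimal sketch under that assumption: `buildRowNumberOps` and `chunk` are hypothetical helper names, and the batch size of 1000 is arbitrary. Only the pure helpers run on their own; the commented usage assumes the Mongoose `User` model from the question.

```javascript
// Build one updateOne operation per id; rowNumber starts at `startAt`.
function buildRowNumberOps(ids, startAt = 1) {
  return ids.map((id, i) => ({
    updateOne: {
      filter: { _id: id },
      update: { $set: { rowNumber: startAt + i } },
    },
  }));
}

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Usage sketch (not executed here; assumes the Mongoose model `User`):
// const ids = (await User.find({}, { _id: 1 }).lean()).map((u) => u._id);
// for (const batch of chunk(buildRowNumberOps(ids), 1000)) {
//   await User.bulkWrite(batch, { ordered: false });
// }
```

Fetching only `_id` keeps memory use low, and unordered batches let the server continue past an individual failed write.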


Strapi custom service overwrite find method

I'm using strapi v4 and I want to populate all nested fields by default when I retrieve a list of my objects (contact-infos). Therefore I have overwritten the contact-info service with following code:
export default factories.createCoreService('api::contact-info.contact-info', ({ strapi }): {} => ({
async find(...args) {
let { results, pagination } = await super.find(...args)
results = await strapi.entityService.findMany('api::contact-info.contact-info', {
fields: ['locale'],
populate: {
sections: {
populate: { link: true }
}
}
})
return { results, pagination }
},
}));
That works well, but I guess I am querying all entries from the database twice, which I want to avoid. When I try to return the result from the entityService directly, I get the following response:
{
  "data": null,
  "error": {
    "status": 404,
    "name": "NotFoundError",
    "message": "Not Found",
    "details": {}
  }
}
Also, I have no idea how I would retrieve the pagination information if I don't call super.find(). Is there any way to find all contents with the option to populate nested objects?
The recommended way of doing this would be a middleware (write it once and it applies to all controllers). There is supposed to be a video, Best Practice Session 003, that describes exactly this scenario (I'm not sure whether it is Discord-only; at the time of writing it had not been published yet).
As for the rest of your question:
async find(...args) {
  const { results, pagination } = await super.find({ ...args, populate: { section: ['link'] } });
  return { results, pagination };
}
should be sufficient to do that in one query.
Custom pagination example:
async findOne(ctx) {
const { user, auth } = ctx.state;
const { id } = ctx.params;
const limit = ctx.query?.limit ?? 20;
const offset = ctx.query?.offset ?? 0;
const logs = await strapi.db.query("api::tasks-log.tasks-log").findMany({
where: { task: id },
limit,
offset,
orderBy: { updatedAt: "DESC" },
});
const total = await strapi.db
.query("api::tasks-log.tasks-log")
.count({ where: { task: id } });
return { data: logs, meta: { total, offset, limit } };
}
One small addition to the accepted answer: it didn't work completely for me, since args is an array with an object inside, so I had to do it like this:
async find(...args) {
  const argsObj = args[0];
  const { results, pagination } = await super.find({ ...argsObj, populate: { section: ['link'] } });
  return { results, pagination };
}
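The argument handling can be isolated into a small helper so that parameters supplied by the caller are preserved. This is a minimal sketch: `withPopulate` is a hypothetical name, and it assumes, as the answer above observed, that the first element of `args` is the params object.

```javascript
// Merge a default populate into the first argument of find(...args).
// An explicit populate from the caller wins over the default.
function withPopulate(args, defaultPopulate) {
  const params = args[0] ?? {};
  return { ...params, populate: params.populate ?? defaultPopulate };
}

// Usage sketch inside the service (not executed here):
// async find(...args) {
//   return super.find(withPopulate(args, { sections: { populate: { link: true } } }));
// }
```

Letting a caller-supplied `populate` take precedence is a design choice; swapping the order would force the default on every request instead.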

fetch list of data got incomplete result, how to remove the limitation

I am trying to get all the data that belongs to the logged-in user, but the result is incomplete: it is supposed to return 22 items, but the code only returns 19.
export const getHistoryList = async (filterPayload: Array<IFilterPayload>, dispatch: Dispatch<any>) => {
dispatch(setProductStart())
let payload: IHistoryList[] = [];
let filter: ModelImportHistoryFilterInput = {};
let orFilter: ModelImportHistoryFilterInput[] = [];
filterPayload.forEach((item) => {
if (item.type === "merchantID") {
filter = { ...filter, merchantImportHistoriesId: { eq: item.value } };
}
if (item.type === "action") {
orFilter.push({ action: { eq: ImportHistoryAction[item.value as keyof typeof ImportHistoryAction] } });
}
});
orFilter = orFilter.filter(item => item.action?.eq !== undefined);
if (orFilter.length > 0) {
filter = { ...filter, or: orFilter };
}
const result = await (API.graphql({
query: queriesCustom.listImportHistoriesCustom,
variables: { filter: filter }
}) as Promise<{ data?: any }>)
.then((response) => {
const data = response.data.listImportHistories.items;
return data
})
...
}
If I remove the filter from variables, I get only 100 items out of a total of 118.
I already tried a temporary solution of setting limit in variables to 2000, which worked, but I don't think this is the correct approach:
const result = await (API.graphql({
query: queriesCustom.listImportHistoriesCustom,
variables: { limit: 2000, filter: filter }
}) as Promise<{ data?: any }>)
.then((response) => {
const data = response.data.listImportHistories.items;
return data
})
How can I overcome this problem? Please help.
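A likely cause of this symptom with AppSync list queries backed by DynamoDB is that `limit` (100 by default in Amplify-generated resolvers) is applied before `filter`, so a single page can hold fewer matches than actually exist. Instead of raising `limit`, one option is to follow `nextToken` until it is null. The sketch below assumes that `listImportHistoriesCustom` selects `nextToken` alongside `items` and accepts a `nextToken` variable, which the question does not show; `fetchAllPages` and `fetchPage` are hypothetical names.

```javascript
// Collect items across pages by following nextToken until it is null/undefined.
// fetchPage(nextToken) must resolve to { items, nextToken }.
async function fetchAllPages(fetchPage) {
  const all = [];
  let nextToken = null;
  do {
    const page = await fetchPage(nextToken);
    all.push(...page.items);
    nextToken = page.nextToken ?? null;
  } while (nextToken !== null);
  return all;
}

// Usage sketch (not executed here; API, queriesCustom and filter come from the question):
// const items = await fetchAllPages(async (nextToken) => {
//   const response = await API.graphql({
//     query: queriesCustom.listImportHistoriesCustom,
//     variables: { filter, nextToken },
//   });
//   return response.data.listImportHistories;
// });
```

Note that a page may legitimately come back empty while still carrying a `nextToken`, which is why the loop checks the token rather than the item count.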

How to get random records from Strapi content API

I have records in Strapi and I am using the Strapi content API. In my front-end, I need to display only 2 records, chosen randomly. For limiting, I have used the limit query from the content API, but which keyword do I need to use for random fetching? The official documentation doesn't provide any details regarding this - https://strapi.io/documentation/v3.x/content-api/parameters.html#available-operators
There's no official Strapi API parameter for random. You have to implement your own. Below is what I've done previously, using Strapi v3:
1 - Make a service function
File: api/mymodel/services/mymodel.js
This will contain our actual random query (SQL), and wrapping it in a service is handy because it can be used in many places (cron jobs, inside other models, etc).
module.exports = {
serviceGetRandom() {
return new Promise( (resolve, reject) => {
// There's a few ways to query data.
// This example uses Knex.
const knex = strapi.connections.default
let query = knex('mydatatable')
// Add more .select()'s if you want other fields
query.select('id')
// These rules enable us to get one random post
// (RAND() is MySQL syntax; PostgreSQL and SQLite use RANDOM() instead)
query.orderByRaw('RAND()')
query.limit(1)
// Initiate the query and do stuff
query
.then(record => {
console.log("getRandom() record: %O", record[0])
resolve(record[0])
})
.catch(error => {
reject(error)
})
})
}
}
2 - Use the service somewhere, like a controller:
File: api/mymodel/controllers/mymodel.js
module.exports = {
//(untested)
getRandom: async (ctx) => {
await strapi.services.mymodel.serviceGetRandom()
.then(output => {
console.log("getRandom output is %O", output.id)
ctx.send({
randomPost: output
}, 200)
})
.catch( () => {
ctx.send({
message: 'Oops! Some error message'
}, 204) // Place a proper error code here
})
}
}
3 - Create a route that points to this controller
File: api/mymodel/config/routes.json
...
{
"method": "GET",
"path": "/mymodelrandom",
"handler": "mymodel.getRandom",
"config": {
"policies": []
}
},
...
4 - In your front-end, access the route
(However you access your API)
e.g. an ajax call to /mymodelrandom (the path defined in the route above, plus any prefix your setup adds)
There is no API parameter for getting a random result.
So handling it in the front-end is the recommended solution for your question.
You need to create a random request range and then get some random items from this range.
function getRandomInt(max) {
return Math.floor(Math.random() * Math.floor(max));
}
const firstID = getRandomInt(restaurants.length);
const secondID = getRandomInt(3);
const query = qs.stringify({
id_in:[firstID,secondID ]
});
// request query should be something like GET /restaurants?id_in=3&id_in=6
One way you can do this reliably is in two steps:
1 - Get the total number of records
2 - Fetch the number of records using the _start and _limit parameters
// Untested code but you get the idea
// Returns a random integer between min (inclusive) and max (exclusive)
function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min)) + min;
}
const { data: totalNumberPosts } = await axios.get('/posts/count');
// Fetch 20 posts
const _limit = 20;
// We need to be sure that we are not fetching less than 20 posts
// e.g. we only have 40 posts. We generate a random number that is 30.
// then we would start on 30 and would only fetch 10 posts (because we only have 40)
// _start must be a whole number; the upper bound is clamped so it never drops below 1
const _start = getRandomInt(0, Math.max(totalNumberPosts - _limit + 1, 1));
const { data: randomPosts } = await axios.get('/posts', { params: { _limit, _start } })
The problem with this approach is that it requires two network requests but for my needs, this is not a problem.
This seems to work for me with the Strapi v4 REST API
Controller, Get 6 random entries
"use strict";
/**
* artwork controller
*/
const { createCoreController } = require("@strapi/strapi").factories;
module.exports = createCoreController("api::artwork.artwork", ({ strapi }) => {
const numberOfEntries = 6;
return {
async random(ctx) {
const entries = await strapi.entityService.findMany(
"api::artwork.artwork",
{
populate: ["image", "pageHeading", "seo", "socialMedia", "artist"],
}
);
const randomEntries = [...entries].sort(() => 0.5 - Math.random());
ctx.body = randomEntries.slice(0, numberOfEntries);
},
};
});
Route
random.js
"use strict";
module.exports = {
routes: [
{
method: "GET",
path: "/artwork/random",
handler: "artwork.random",
config: {
auth: false,
},
},
],
};
API
http://localhost:1337/api/artwork/random
To match default data structure of Strapi
"use strict";
/**
* artwork controller
*/
const { createCoreController } = require("@strapi/strapi").factories;
module.exports = createCoreController("api::artwork.artwork", ({ strapi }) => {
const numberOfEntries = 6;
return {
async random(ctx) {
const entries = await strapi.entityService.findMany(
"api::artwork.artwork",
{
populate: ["image", "pageHeading", "seo", "socialMedia", "artist"],
}
);
const randomEntries = [...entries]
.sort(() => 0.5 - Math.random())
.slice(0, numberOfEntries);
const structureRandomEntries = {
data: randomEntries.map((entry) => {
return {
id: entry.id,
attributes: entry,
};
}),
};
ctx.body = structureRandomEntries;
},
};
});
There is also a random sort plugin.
https://www.npmjs.com/package/strapi-plugin-random-sort
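The `sort(() => 0.5 - Math.random())` idiom in the controllers above is short, but a random comparator does not produce a uniform shuffle, and sorting is more work than needed to pick a few entries. A partial Fisher-Yates shuffle is one alternative. This is a minimal sketch: `sampleRandom` is a hypothetical helper name, and the `random` parameter exists only to make the function testable.

```javascript
// Return `count` distinct items chosen uniformly at random, without mutating `items`.
function sampleRandom(items, count, random = Math.random) {
  const pool = [...items];
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    // Pick an index from the not-yet-chosen tail [i, pool.length) and swap it into place.
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Usage sketch in the controller: ctx.body = sampleRandom(entries, numberOfEntries);
```

This still loads every entry from the database first, so it addresses the bias of the shuffle, not the cost of the query.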
This seems to work for me with Strapi v4.3.8 and GraphQL
src/index.js
"use strict";
module.exports = {
register({ strapi }) {
const extensionService = strapi.service("plugin::graphql.extension");
const extension = ({ strapi }) => ({
typeDefs: `
type Query {
randomTestimonial: Testimonial
}
`,
resolvers: {
Query: {
randomTestimonial: async (parent, args) => {
const entries = await strapi.entityService.findMany(
"api::testimonial.testimonial"
);
const sanitizedRandomEntry =
entries[Math.floor(Math.random() * entries.length)];
return sanitizedRandomEntry;
},
},
},
resolversConfig: {
"Query.randomTestimonial": {
auth: false,
},
},
});
extensionService.use(extension);
},
bootstrap({ strapi }) {},
};
graphql query:
query GetRandomTestimonial {
randomTestimonial {
__typename
name
position
location
description
}
}
generate random testimonial on route change/refresh
https://jungspooner.com/biography

How do I add recursive logic in resolvers using GraphQL mutations?

Is it possible to add logic in resolvers using GraphQL mutations?
I am trying to create a four-digit string as an alias for a post if the user does not provide it. Then, I would like to check the database to see if the four-digit string exists. If the string exists, I would like to create another four-digit string recursively.
At the moment, I'm exploring adding logic to mutations within resolvers, but I'm not sure if this is doable. I'm using these documents for my foundation: graphql.org sequelize.org
This is my current code block:
Working as of 12/4/2020
const MakeSlug = require("./services/MakeSlug");
const resolvers = {
Query: {
async allLinks(root, args, { models }) {
return models.Link.findAll();
},
async link(root, { id }, { models }) {
return models.Link.findByPk(id);
}
},
Mutation: {
async createLink(root, { slug, description, link }, { models }) {
if (slug !== undefined) {
const foundSlug = await models.Link.findOne({
where: { slug: slug }
});
// Sequelize's findOne resolves to null (not undefined) when no row matches
if (foundSlug === null) {
return await models.Link.create({
slug,
description,
link,
shortLink: `https://shink.com/${slug}`
});
} else {
throw new Error(slug + " exists. Try a new short description.");
}
}
if (slug === undefined) {
const MAX_ATTEMPTS = 10;
let attempts = 0;
while (attempts < MAX_ATTEMPTS) {
attempts++;
let madeSlug = MakeSlug(4);
const foundSlug = await models.Link.findOne({
where: { slug: madeSlug }
});
// Create the link only when the generated slug is not taken (findOne resolved to null)
if (foundSlug === null) {
return await models.Link.create({
slug: madeSlug,
description,
link,
shortLink: `https://shink.com/${madeSlug}`
});
}
}
throw new Error("Unable to generate unique alias.");
}
}
}
};
module.exports = resolvers;
This is my full codebase.
Thank you!
A while loop solved the challenge. Thanks xadm.
const MakeSlug = require("./services/MakeSlug");
const resolvers = {
Query: {
async allLinks(root, args, { models }) {
return models.Link.findAll();
},
async link(root, { id }, { models }) {
return models.Link.findByPk(id);
}
},
Mutation: {
async createLink(root, { slug, description, link }, { models }) {
if (slug !== undefined) {
const foundSlug = await models.Link.findOne({
where: { slug: slug }
});
// Sequelize's findOne resolves to null (not undefined) when no row matches
if (foundSlug === null) {
return await models.Link.create({
slug,
description,
link,
shortLink: `https://shink.com/${slug}`
});
} else {
throw new Error(slug + " exists. Try a new short description.");
}
}
if (slug === undefined) {
const MAX_ATTEMPTS = 10;
let attempts = 0;
while (attempts < MAX_ATTEMPTS) {
attempts++;
let madeSlug = MakeSlug(4);
const foundSlug = await models.Link.findOne({
where: { slug: madeSlug }
});
// Create the link only when the generated slug is not taken (findOne resolved to null)
if (foundSlug === null) {
return await models.Link.create({
slug: madeSlug,
description,
link,
shortLink: `https://shink.com/${madeSlug}`
});
}
}
throw new Error("Unable to generate unique alias.");
}
}
}
};
module.exports = resolvers;
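The retry logic can also be separated from the resolver so that it can be tested without a database. The sketch below is written under assumptions: `makeSlug` stands in for the `MakeSlug` service, whose implementation is not shown, and `isTaken` stands in for the `findOne` lookup. Both names, the alphabet, and `createUniqueSlug` are hypothetical.

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Stand-in for the MakeSlug service: a random string of `length` characters.
function makeSlug(length, random = Math.random) {
  let slug = "";
  for (let i = 0; i < length; i++) {
    slug += ALPHABET[Math.floor(random() * ALPHABET.length)];
  }
  return slug;
}

// Try up to maxAttempts candidates and return the first one that is not taken.
// isTaken(slug) must resolve to true when the slug already exists.
async function createUniqueSlug(isTaken, { length = 4, maxAttempts = 10, generate = makeSlug } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const candidate = generate(length);
    if (!(await isTaken(candidate))) {
      return candidate;
    }
  }
  throw new Error("Unable to generate unique alias.");
}

// Usage sketch in the resolver (not executed here):
// const slug = await createUniqueSlug(
//   async (s) => (await models.Link.findOne({ where: { slug: s } })) !== null
// );
```

A unique constraint on the `slug` column would still be needed to guard against two requests picking the same value at the same time.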

How to stream many documents from Elasticsearch index (Scroll, Sliced Scroll)

I am looking for a way to stream all (~ 10^6+) documents via .NET Nest Client.
I want to boost performance by using parallel async requests. (e.g ActionBlock, Task.WhenAll())
The old-fashioned way, without boosting:
var objects = new List<object>();
var searchResponse = await elasticClient.SearchAsync<object>(
new SearchRequest<object>("myIndex")
{
Size = 7000,
Query = new BoolQuery
{
//...
},
// why here and in scroll itself?
Scroll = "2s",
Sort = new List<ISort>
{
//..
}
});
while (searchResponse.Documents.Any())
{
objects.AddRange(searchResponse.Documents);
searchResponse = await elasticClient.ScrollAsync<object>("2s", searchResponse.ScrollId).ConfigureAwait(false);
}
return objects;
Then an attempt using a parallel sliced scroll:
var result = new ConcurrentBag<object>();
var tasks = Enumerable.Range(0, 4).Select(
id => new SearchRequest<object>("myIndex")
{
// has to be lower than 1024?
Size = 1000,
Query = new BoolQuery
{
//...
},
// why here and in scroll itself?
Scroll = "2s",
Sort = new List<ISort>
{
//..
}
}).Select(
async searchRequest =>
{
var searchResponse = await elasticClient.SearchAsync<object>(searchRequest).ConfigureAwait(false);
while (searchResponse.Documents.Any())
{
searchResponse.Documents.Each(result.Add);
searchResponse = await elasticClient.ScrollAsync<object>("2s", searchResponse.ScrollId).ConfigureAwait(false);
}
// good idea right?
//await elasticClient.ClearScrollAsync(x => x.ScrollId(searchResponse.ScrollId)).ConfigureAwait(false);
});
await Task.WhenAll(tasks).PreserveAllExceptions().ConfigureAwait(false);
return result.ToList();
But this only gives me a fraction of the documents that are actually available.
Moreover, sliced scroll seems to be limited to 1024 documents per slice.
I was not able to increase this value to 7000:
{
"myIndex_template": {
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "0",
"max_slices_per_scroll": "10000"
}
}
}
}
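In the Elasticsearch scroll API, each parallel request of a sliced scroll carries its own `slice` object with an `id` and the shared `max`; the parallel attempt above does not appear to set one. As far as I know, `index.max_slices_per_scroll` caps the number of slices rather than the page size. The sketch below shows the request bodies at the REST level rather than in NEST; `buildSlicedScrollBodies` is a hypothetical helper, and the `match_all` query is a placeholder for the elided query.

```javascript
// Build one search body per slice; all slices share the same `max`.
function buildSlicedScrollBodies(maxSlices, size, query = { match_all: {} }) {
  if (!Number.isInteger(maxSlices) || maxSlices < 2) {
    throw new RangeError("maxSlices must be an integer >= 2");
  }
  return Array.from({ length: maxSlices }, (_, id) => ({
    slice: { id, max: maxSlices },
    size,
    query,
  }));
}

// Each body would be sent as the initial search of its own scroll, e.g.
// POST /myIndex/_search?scroll=1m, and then continued with the returned scroll id.
console.log(JSON.stringify(buildSlicedScrollBodies(2, 1000)[1]));
```

Each slice then returns a disjoint part of the result set, so the parallel tasks together cover every document once instead of each task scrolling the whole index.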
