GraphQL: How can I solve the N + N problem?

After having implemented dataloader in the respective resolvers to solve the N+1 problem, I also need to be able to solve the N+N problem.
I need a decently efficient data loading mechanism to get a relation like this:
{
persons (active: true) {
id,
given_name,
projects (active: true) {
id,
title,
}
}
}
I've created a naive implementation for this, returning
{
persons: [
{
id: 1,
given_name: 'Mike',
projects: [
{
id: 1,
title: 'API'
},
{
id: 2,
title: 'Frontend'
}
]
},
{
id: 2,
given_name: 'Eddie',
projects: [
{
id: 2,
title: 'Frontend'
},
{
id: 3,
title: 'Testing'
}
]
}
]
}
In SQL the underlying structure would be represented by a many-to-many relationship.
Is there a tool similar to dataloader for solving this, or can this maybe even be solved with dataloader itself?

The expectation with GraphQL is that the trip to the database is generally the fastest thing you can do, so you just add a resolver to Person.projects that makes a call to the database. You can still use dataLoaders for that.
const resolvers = {
Query: {
persons(parent, args, context) {
// 1st call to database
return someUsersService.list()
},
},
Person: {
projects(parent, args, context) {
// this should be a dataLoader behind the scenes.
// Makes second call to database
return projectsService.loadByUserId(parent.id)
}
}
}
Just remember that your dataLoader's batch function now has to return an array of objects in each slot instead of a single object.
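As a rough sketch of what such a batch function could look like: it receives all requested person ids at once and returns one array of projects per id, in the same order as the ids. The `projectRows` data, the `person_id` column and both function names are made up for illustration; a real version would presumably run a single query against the join table instead of reading an in-memory array.

```javascript
// Hypothetical rows, shaped like the result of one query over a
// persons/projects join table.
const projectRows = [
  { person_id: 1, id: 1, title: 'API' },
  { person_id: 1, id: 2, title: 'Frontend' },
  { person_id: 2, id: 2, title: 'Frontend' },
  { person_id: 2, id: 3, title: 'Testing' },
];

// Groups the rows by person id and returns one array per requested id,
// in the same order as `personIds`. Ids without projects get an empty array.
function groupProjectsByPersonId(personIds, rows) {
  const byPerson = new Map(personIds.map((id) => [id, []]));
  for (const { person_id, ...project } of rows) {
    if (byPerson.has(person_id)) byPerson.get(person_id).push(project);
  }
  return personIds.map((id) => byPerson.get(id));
}

// Batch function in the shape DataLoader expects: an array of keys in,
// a Promise of an equally long array out.
async function batchProjectsByPersonId(personIds) {
  // Assumption: a real implementation would query the database here.
  return groupProjectsByPersonId(personIds, projectRows);
}
```

With the dataloader package this could be wired up as `new DataLoader(batchProjectsByPersonId)`, and the `Person.projects` resolver would then call `load(parent.id)` on that loader.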

Related

GraphQL Apollo pagination and type policies

I am really struggling with this concept. I hope someone can help me understand it better.
The documentation uses a simple example and it's not 100% clear to me how it works.
I have tried using keyArgs, but they didn't work, so I resorted to using the args parameter in the read and merge functions. First, let me explain my scenario.
I have a couple of search endpoints that use the same parameters:
{
search:
{
searchTerm: "*",
includePartialMatch: true,
page: 1,
itemsToShow: 2,
filters: {},
facets: [],
orderBy: {}
}
}
So I have set up my type policies like this:
const cache = new InMemoryCache({
typePolicies: {
Query: {
fields: {
searchCategories: typePolicy,
searchBrands: typePolicy,
searchPages: typePolicy,
searchProducts: typePolicy,
},
},
},
});
And I was using a generic typePolicy for them all.
At first, I tried this:
const typePolicy = {
keyArgs: [
"search",
[
"identifier",
"searchTerm",
"includePartialMatches",
"filters",
"orderBy",
"facets",
],
],
// Concatenate the incoming list items with
// the existing list items.
merge(existing: any, incoming: any) {
console.log("existing", existing);
console.log("incoming", incoming);
if (!existing?.items) console.log("--------------");
if (!existing?.items) return { ...incoming }; // First request
const items = existing.items.concat(incoming.items);
const item = { ...existing, ...incoming };
item.items = items;
console.log("merged", item);
console.log("--------------");
return item;
},
};
But this does not do what I want.
What I would like is for Apollo to work as it does normally, but when the "page" changes for any field, to append the results instead of caching a new request.
Does anyone know what I am doing wrong, or can anyone provide a better example than what is in the documentation?

How to paste array of objects into GraphQL Apollo cache?

I have an array of countries received from Apollo backend without an ID field.
export const QUERY_GET_DELIVERY_COUNTRIES = gql`
query getDeliveryCountries {
deliveryCountries {
order
name
daysToDelivery
zoneId
iso
customsInfo
}
}
`
Schema of these objects:
{
customsInfo: null
daysToDelivery: 6
iso: "UA"
name: "Ukraine"
order: 70
zoneId: 8
__typename: "DeliveryCountry"
}
In nested components I read these objects from client.readQuery.
What I want is to store this data in localStorage, read it on startup and write it to the Apollo Client cache.
What I've already tried to do:
useEffect(() => {
const deliveryCountries = JSON.parse(localStorage.getItem('deliveryCountries') || '[]')
if(!deliveryCountries || !deliveryCountries.length) {
getCountriesLazy()
} else {
deliveryCountries.map((c: DeliveryCountry) => {
client.writeQuery({
query: QUERY_GET_DELIVERY_COUNTRIES,
data: {
deliveryCountries: {
__typename: "DeliveryCountry",
order: c.order,
name: c.name,
daysToDelivery: c.daysToDelivery,
zoneId: c.zoneId,
iso: c.iso,
customsInfo: c.customsInfo
}
}
})
})
}
}, [])
But after executing the code above I have only one object in the countries cache. How can I write all the objects without having an explicit ID? Or maybe I'm doing something wrong?
Lol. I just had to put the whole array into the necessary field without iterating. writeQuery replaces all the data for the query; it does not append anything "to the end".
client.writeQuery({
query: QUERY_GET_DELIVERY_COUNTRIES,
data: {
deliveryCountries: deliveryCountries
}
})
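To make the difference explicit, here is a small sketch of a helper that builds the whole `data` payload once, so that `writeQuery` is called a single time with the full array. The helper name `toDeliveryCountriesData` is hypothetical; the field list is the one from the query above.

```javascript
// Builds the `data` object for a single writeQuery call from the array
// restored from localStorage. Every item gets its __typename set, in case
// it was not stored along with the other fields.
function toDeliveryCountriesData(countries) {
  return {
    deliveryCountries: countries.map((c) => ({
      __typename: 'DeliveryCountry',
      order: c.order,
      name: c.name,
      daysToDelivery: c.daysToDelivery,
      zoneId: c.zoneId,
      iso: c.iso,
      customsInfo: c.customsInfo ?? null,
    })),
  };
}

// Possible usage, assuming `client` and QUERY_GET_DELIVERY_COUNTRIES
// from the question:
// client.writeQuery({
//   query: QUERY_GET_DELIVERY_COUNTRIES,
//   data: toDeliveryCountriesData(deliveryCountries),
// });
```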

Dynamically create pages with Gatsby based on many Contentful references

I am currently using Gatsby's collection routes API to create pages for a simple blog with data coming from Contentful.
For example, creating a page for each blogpost category:
-- src/pages/categories/{contentfulBlogPost.category}.js
export const query = graphql`
query categoriesQuery($category: String = "") {
allContentfulBlogPost(filter: { category: { eq: $category } }) {
edges {
node {
title
category
description {
description
}
...
}
}
}
}
...
[React component mapping all blogposts from each category in a list]
...
This is working fine.
But now I would like to have multiple categories per blogpost, so I switched to Contentful's "references, many" content type, which allows a field to have multiple entries.
Now the result of my GraphQL query on the field category2 is an array of different categories for each blogpost:
Query:
query categoriesQuery {
allContentfulBlogPost {
edges {
node {
category2 {
id
name
slug
}
}
}
}
}
Output:
{
"data": {
"allContentfulBlogPost": {
"edges": [
{
"node": {
"category2": [
{
"id": "75b89e48-a8c9-54fd-9742-cdf70c416b0e",
"name": "Test",
"slug": "test"
},
{
"id": "568r9e48-t1i8-sx4t8-9742-cdf70c4ed789vtu",
"name": "Test2",
"slug": "test-2"
}
]
}
},
{
"node": {
"category2": [
{
"id": "75b89e48-a8c9-54fd-9742-cdf70c416b0e",
"name": "Test",
"slug": "test"
}
]
}
},
...
Now that categories are inside an array, I don't know how to:
write a query variable to filter categories names ;
use the slug field as a route to dynamically create the page.
For blogposts authors I was doing :
query authorsQuery($author__slug: String = "") {
allContentfulBlogPost(filter: { author: { slug: { eq: $author__slug } } }) {
edges {
node {
id
author {
slug
name
}
...
}
...
}
And creating pages with src/pages/authors/{contentfulBlogPost.author__slug}.js
I guess I'll have to use the createPages API instead.
You can achieve the result using the File System Route API; something like this may work:
src/pages/category/{contentfulBlogPost.category2__name}.js
However, this approach has a caveat: because posts can contain multiple and repeated categories, you may end up creating duplicated pages with the same URL (slug).
I think it's cleaner to use the createPages API as you said, keeping in mind that you will need to deduplicate the categories, because the same category can be referenced by many posts.
const path = require("path")

exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const result = await graphql(`
    query {
      allContentfulBlogPost {
        edges {
          node {
            category2 {
              id
              name
              slug
            }
          }
        }
      }
    }
  `)
  let categories = { slugs: [], names: [] };
  result.data.allContentfulBlogPost.edges.forEach(({ node }) => {
    // category2 is an array of categories on every post
    const postCategories = node.category2 || [];
    postCategories.forEach(({ name, slug }) => {
      // make some checks if needed here
      // skip categories already collected from another post
      if (categories.slugs.includes(slug)) return;
      categories.slugs.push(slug);
      categories.names.push(name);
    });
  });
  categories.slugs.forEach((category, index) => {
    let name = categories.names[index];
    createPage({
      path: `category/${category}`,
      component: path.resolve(`./src/templates/your-category-template.js`),
      context: {
        name
      }
    });
  });
}
The code's quite self-explanatory. Basically you are defining an object (categories) that contains two arrays, slugs and names:
let categories = { slugs: [], names: [] };
After that, you loop through the result of the query (result) and, because category2 is an array on every post, through each post's categories, pushing the field values (name, slug, and others if needed) to those arrays. Make the needed checks there if you want (to avoid pushing empty values, or values that match some regular expression, etc.) and skip slugs that are already in the list to remove the duplicates.
Then, you only need to loop through the slugs to create pages using createPage API and pass the needed data via context:
context: {
name
}
This uses the object property shorthand, so it is the same as writing:
context: {
name: name
}
So, in your template, you will get the name in the pageContext prop. Replace it with the slug if needed, depending on your use case; the approach is exactly the same.
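The deduplication step can also be pulled out into a small helper that is easy to test on its own. This is only a sketch: the name `collectCategories` is hypothetical, and it assumes the `edges` array has the shape of the query output shown in the question.

```javascript
// Collects every distinct category from the query result, keyed by slug.
// `edges` has the shape of result.data.allContentfulBlogPost.edges.
function collectCategories(edges) {
  const bySlug = new Map();
  for (const { node } of edges) {
    // A post without categories may come back with null instead of [].
    for (const { name, slug } of node.category2 || []) {
      if (slug && !bySlug.has(slug)) bySlug.set(slug, name);
    }
  }
  return [...bySlug].map(([slug, name]) => ({ slug, name }));
}

const sampleEdges = [
  { node: { category2: [{ name: 'Test', slug: 'test' }, { name: 'Test2', slug: 'test-2' }] } },
  { node: { category2: [{ name: 'Test', slug: 'test' }] } },
  { node: { category2: null } },
];

console.log(collectCategories(sampleEdges));
// [ { slug: 'test', name: 'Test' }, { slug: 'test-2', name: 'Test2' } ]
```

Inside createPages, the returned list could then be looped over once, calling createPage for each `{ slug, name }` pair.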

Error while trying to run a GraphQL query recursively, along with queried results

This is closely related to my last question here. In short, I have 2 schemas, dbPosts and dbAuthors. They look somewhat like this (I've omitted some fields here for the sake of brevity):
dbPosts
id: mongoose.Schema.Types.ObjectId,
title: { type: String },
content: { type: String },
excerpt: { type: String },
slug: { type: String },
author: {
id: { type: String },
fname: { type: String },
lname: { type: String },
}
dbAuthors
id: mongoose.Schema.Types.ObjectId,
fname: { type: String },
lname: { type: String },
posts: [
{
id: { type: String },
title: { type: String }
}
]
I'm resolving my post queries like this:
const mongoose = require('mongoose');
const graphqlFields = require('graphql-fields');
const fawn = require('fawn');
const dbPost = require('../../../models/dbPost');
const dbUser = require('../../../models/dbUser');
fawn.init(mongoose);
module.exports = {
// Queries
Query: {
posts: (root, args, context) => {
return dbPost.find({});
},
post: (root, args, context) => {
return dbPost.findById(args.id);
},
},
Post: {
author: (parent, args, context, ast) => {
// Retrieve fields being queried
const queriedFields = Object.keys(graphqlFields(ast));
console.log('-------------------------------------------------------------');
console.log('from Post:author resolver');
console.log('queriedFields', queriedFields);
// Retrieve fields returned by parent, if any
const fieldsInParent = Object.keys(parent.author);
console.log('fieldsInParent', fieldsInParent);
// Check if queried fields already exist in parent
const available = queriedFields.every((field) => fieldsInParent.includes(field));
console.log('available', available);
if(parent.author && available) {
return parent.author;
} else {
return dbUser.findOne({'posts.id': parent.id});
}
},
},
};
And I'm resolving all author queries like this:
const mongoose = require('mongoose');
const graphqlFields = require('graphql-fields');
const dbUser = require('../../../models/dbUser');
const dbPost = require('../../../models/dbPost');
module.exports = {
// Queries
Query: {
authors: (parent, root, args, context) => {
return dbUser.find({});
},
author: (root, args, context) => {
return dbUser.findById(args.id);
},
},
Author: {
posts: (parent, args, context, ast) => {
// Retrieve fields being queried
const queriedFields = Object.keys(graphqlFields(ast));
console.log('-------------------------------------------------------------');
console.log('from Author:posts resolver');
console.log('queriedFields', queriedFields);
// Retrieve fields returned by parent, if any
const fieldsInParent = Object.keys(parent.posts[0]._doc);
console.log('fieldsInParent', fieldsInParent);
// Check if queried fields already exist in parent
const available = queriedFields.every((field) => fieldsInParent.includes(field));
console.log('available', available);
if(parent.posts && available) {
// If parent data is available and includes queried fields, no need to query db
return parent.posts;
} else {
// Otherwise, query db and retrieve data
return dbPost.find({'author.id': parent.id, 'published': true});
}
},
},
};
Again, I've left out bits not relevant to this question, such as mutations, in the interest of brevity. My objective is to make all queries work recursively while also optimizing database lookups. But somehow I'm unable to accomplish this. Here's one query I'm running, for instance:
{
posts{
id
title
author{
first_name
last_name
id
posts{
id
title
}
}
}
}
And it returns this:
{
"errors": [
{
"message": "Cannot return null for non-nullable field Post.author.",
"locations": [
{
"line": 5,
"column": 5
}
],
"path": [
"posts",
1,
"author"
]
}
],
"data": {
"posts": [
{
"id": "5ba1f3e7cc546723422e62a4",
"title": "A Title!",
"author": {
"first_name": "Bill",
"last_name": "Erby",
"id": "5ba130271c9d440000ac8fc4",
"posts": [
{
"id": "5ba1f3e7cc546723422e62a4",
"title": "A Title!"
}
]
}
},
null
]
}
}
If you notice, this query does return all values requested, but also adds an error message against the post.author query! What could be causing this?
I haven't included the entire codebase so as not to make things confusing, but should you wish to take a look, it's up on Github and a GraphiQL interface is up at https://graph.schandillia.com should you wish to see the results for yourself.
Thank you so much for your time, if you've come this far. Would really appreciate any pointer in the right direction!
P.S.: If you notice, I'm logging the values of 3 variables in each resolver for debugging purposes:
queriedFields: An array of all fields being queried
fieldsInParent: An array of all fields being returned in the resolver's parent property
available: A boolean showing if all queriedFields members exist in fieldsInParent
And when I run a simple query like this:
{
posts{
id
author{
id
posts{
id
}
}
}
}
This is what gets logged:
-------------------------------------------------------------
from Post:author resolver
queriedFields [ 'id', 'posts' ]
fieldsInParent [ '$init', 'id', 'first_name', 'last_name' ]
available false
-------------------------------------------------------------
from Post:author resolver
queriedFields [ 'id', 'posts' ]
fieldsInParent [ '$init', 'id', 'first_name', 'last_name' ]
available false
-------------------------------------------------------------
from Author:posts resolver
queriedFields [ 'id' ]
fieldsInParent [ 'id', 'title' ]
available true
Shouldn't the post:author resolver execute only once? Also, it's funny how in the first 2 logs, fieldsInParent is missing the posts field even when the schema for author includes such a field.
Your query result does not in fact include all the requested data. The posts query resolves to an array that includes one Post object and a null. The null is there because GraphQL tried to fully resolve the other Post object and could not -- it encountered a validation error, namely that the post's author resolved to null.
You can change your schema to make the author field nullable, which would get rid of the error but would still leave you with a post whose author is null. Presumably, if a post exists, it should have an author (although with MongoDB I guess it's very possible you just have some bad data). If you look inside your resolver, there are two return statements, and one of them (probably the db call) is returning null for that second post.
As an aside, as a client, you probably don't want to deal with nulls inside the array and want an empty array instead of a null for the whole field. When using lists (arrays), you may want to make them both non-nullable and make each item in that list non-nullable as well. You do so like this:
posts: [Post!]!
You still need to ensure your resolver logic prevents those nulls from happening, but adding the validation can help you catch that sort of behavior more easily.
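One way to make such a null easier to track down is a small guard around the lookup result. This is a sketch only; `requireAuthor` is a hypothetical helper, not part of the question's code or of any library.

```javascript
// Hypothetical guard: returns the author, or throws an error that names
// the offending post instead of letting a bare null reach GraphQL.
function requireAuthor(post, author) {
  if (author == null) {
    throw new Error(`Post ${post.id} has no matching author`);
  }
  return author;
}

// Possible use inside the Post.author resolver from the question:
// return dbUser.findOne({ 'posts.id': parent.id })
//   .then((author) => requireAuthor(parent, author));
```

GraphQL would still report an error for that post, but the message now points at the record with bad data rather than at the schema.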

GraphQL: Filtering, sorting and paging on nested entities from separate data sources?

I'm attempting to use graphql to tie together a number of rest endpoints, and I'm stuck on how to filter, sort and page the resulting data. Specifically, I need to filter and/or sort by nested values.
I cannot do the filtering on the rest endpoints in all cases because they are separate microservices with separate databases. (i.e. I could filter on title in the rest endpoint for articles, but not on author.name). Likewise with sorting. And without filtering and sorting, pagination cannot be done on the rest endpoints either.
To illustrate the problem, and as an attempt at a solution, I've come up with the following using formatResponse in apollo-server, but am wondering if there is a better way.
I've boiled down the solution to the most minimal set of files that I could think of:
data.js represents what would be returned by 2 fictional rest endpoints:
export const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
export const Articles = [
{ id: 1, title: 'Aardvarks', author: 1 },
{ id: 2, title: 'Emus', author: 2 },
{ id: 3, title: 'Tapir', author: 1 },
]
The schema is defined as:
import _ from 'lodash';
import {
GraphQLSchema,
GraphQLObjectType,
GraphQLList,
GraphQLString,
GraphQLInt,
} from 'graphql';
import {
Articles,
Authors,
} from './data';
const AuthorType = new GraphQLObjectType({
name: 'Author',
fields: {
id: {
type: GraphQLInt,
},
name: {
type: GraphQLString,
}
}
});
const ArticleType = new GraphQLObjectType({
name: 'Article',
fields: {
id: {
type: GraphQLInt,
},
title: {
type: GraphQLString,
},
author: {
type: AuthorType,
resolve(article) {
return _.find(Authors, { id: article.author })
},
}
}
});
const RootType = new GraphQLObjectType({
name: 'Root',
fields: {
articles: {
type: new GraphQLList(ArticleType),
resolve() {
return Articles;
},
}
}
});
export default new GraphQLSchema({
query: RootType,
});
And the main index.js is:
import express from 'express';
import { apolloExpress, graphiqlExpress } from 'apollo-server';
var bodyParser = require('body-parser');
import _ from 'lodash';
import rql from 'rql/query';
import rqlJS from 'rql/js-array';
import schema from './schema';
const PORT = 8888;
var app = express();
function formatResponse(response, { variables }) {
let data = response.data.articles;
// Filter
if ({}.hasOwnProperty.call(variables, 'q')) {
// As an example, use a resource query lib like https://github.com/persvr/rql to do easy filtering
// in production this would have to be tightened up alot
data = rqlJS.query(rql.Query(variables.q), {}, data);
}
// Sort
if ({}.hasOwnProperty.call(variables, 'sort')) {
const sortKey = _.trimStart(variables.sort, '-');
data = _.sortBy(data, (element) => _.at(element, sortKey));
if (variables.sort.charAt(0) === '-') _.reverse(data);
}
// Pagination
if ({}.hasOwnProperty.call(variables, 'offset') && variables.offset > 0) {
data = _.slice(data, variables.offset);
}
if ({}.hasOwnProperty.call(variables, 'limit') && variables.limit > 0) {
data = _.slice(data, 0, variables.limit);
}
return _.assign({}, response, { data: { articles: data }});
}
app.use('/graphql', bodyParser.json(), apolloExpress((req) => {
return {
schema,
formatResponse,
};
}));
app.use('/graphiql', graphiqlExpress({
endpointURL: '/graphql',
}));
app.listen(
PORT,
() => console.log(`GraphQL Server running at http://localhost:${PORT}`)
);
For ease of reference, these files are available at this gist.
With this setup, I can send this query:
{
articles {
id
title
author {
id
name
}
}
}
Along with these variables (It seems like this is not the intended use for the variables, but it was the only way I could get the post processing parameters into the formatResponse function.):
{ "q": "author/name=Sam", "sort": "-id", "offset": 1, "limit": 1 }
and get this response, filtered to where Sam is the author, sorted by id descending, and getting the second page where the page size is 1.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
}
]
}
}
Or these variables:
{ "sort": "-author.name", "offset": 1 }
For this response, sorted by author name descending and getting all articles except the first.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
},
{
"id": 2,
"title": "Emus",
"author": {
"id": 2,
"name": "Pat"
}
}
]
}
}
So, as you can see, I am using the formatResponse function for post-processing to do the filtering/paging/sorting.
So, my questions are:
Is this a valid use case?
Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
Is this a valid use case? Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
The major part of the original question lies in the collections being segregated into different databases on separate microservices. In effect, it is necessary to join the collections and then filter on some key, but that is not directly possible, since the original collection has no field to filter, sort or paginate on.
The straightforward solution is to perform full or filtered queries against the original collections and then join and filter the resulting dataset on the application server, e.g. with lodash, as in your solution. This is feasible for small collections, but in the general case it causes large data transfers and inefficient sorting, since there is no index structure (a real red-black tree or skip list) behind it, so with quadratic complexity it's not very good.
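For illustration, the in-memory variant could look roughly like the sketch below, using the sample data from the question. The function `listArticles` and all of its option names are hypothetical. The authors are indexed by id once, so the join itself is a lookup rather than a nested scan.

```javascript
// Data as returned by the two fictional REST endpoints in the question.
const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
const Articles = [
  { id: 1, title: 'Aardvarks', author: 1 },
  { id: 2, title: 'Emus', author: 2 },
  { id: 3, title: 'Tapir', author: 1 },
];

// Joins, filters, sorts and pages in memory. `sortBy` receives a joined
// article and returns the value to sort on.
function listArticles({ authorName, sortBy, descending = false, offset = 0, limit } = {}) {
  // Index authors by id once so the join is a lookup per article.
  const authorsById = new Map(Authors.map((a) => [a.id, a]));
  let rows = Articles.map((a) => ({ ...a, author: authorsById.get(a.author) }));
  if (authorName !== undefined) {
    rows = rows.filter((a) => a.author && a.author.name === authorName);
  }
  if (sortBy) {
    rows.sort((x, y) => {
      const kx = sortBy(x);
      const ky = sortBy(y);
      const order = kx < ky ? -1 : kx > ky ? 1 : 0;
      return descending ? -order : order;
    });
  }
  const end = limit === undefined ? undefined : offset + limit;
  return rows.slice(offset, end);
}

const page = listArticles({
  authorName: 'Sam',
  sortBy: (a) => a.id,
  descending: true,
  offset: 1,
  limit: 1,
});
console.log(page.map((a) => a.title)); // [ 'Aardvarks' ]
```

This still pulls both collections in full, so it only sidesteps the cost of the join, not the cost of the data transfer.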
Depending on the resources available on the application server, special cache and index tables can be built there. If the collection structure is fixed, the relations between collection entries and their fields can be reflected in a dedicated search table and updated on demand. It is like creating a search index, not in the database but on the application server. Of course it will consume resources, but it will be faster than direct lodash-like sorting.
The task can also be approached from the other side, if you have access to the structure of the original databases. The key is denormalization. In contrast to the classical relational approach, collections can hold duplicate information to avoid later join operations. E.g., the Articles collection can carry the information from the Authors collection that is necessary for filtering, sorting and pagination.
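A minimal sketch of that idea, assuming a sync step that copies the author name onto each article record; `denormalizeArticles` and the `authorName` field are made-up names for illustration.

```javascript
// Hypothetical sync step: copies the author name onto each article record
// so that later filtering and sorting need no join.
function denormalizeArticles(articles, authors) {
  const authorsById = new Map(authors.map((a) => [a.id, a]));
  return articles.map((article) => {
    const author = authorsById.get(article.author);
    return { ...article, authorName: author ? author.name : null };
  });
}

const denormalized = denormalizeArticles(
  [
    { id: 1, title: 'Aardvarks', author: 1 },
    { id: 2, title: 'Emus', author: 2 },
  ],
  [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }]
);

// Filtering on the copied field is now a plain single-collection operation.
console.log(denormalized.filter((a) => a.authorName === 'Sam').map((a) => a.title));
// [ 'Aardvarks' ]
```

The trade-off is that the copied field has to be updated whenever the author record changes.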
