How to model recursive data structures in GraphQL

I have a tree data structure that I would like to return via a GraphQL API.
The structure is not particularly large (small enough not to be a problem to return it in one call).
The maximum depth of the structure is not set.
I have modeled the structure as something like:
type Tag {
  id: String!
  children: [Tag]
}
The problem appears when one wants to get the tags to an arbitrary depth.
To get all the children to (for example) level 3 one would write a query like:
{
  tags {
    id
    children {
      id
      children {
        id
      }
    }
  }
}
Is there a way to write a query to return all the tags to an arbitrary depth?
If not, what is the recommended way to model a structure like the one above in a GraphQL API?

Some time ago I came up with another solution, which is the same approach that @WuDo suggested.
The idea is to flatten the tree at the data level, using IDs to reference the nodes (linking each child to its parent) and marking the roots of the tree, then to rebuild the tree recursively on the client side.
This way you don't have to worry about limiting the depth of your query, as in @samcorcos's answer.
type Query {
  tags: [Tag]
}
type Tag {
  id: ID!
  children: [ID]
  root: Boolean
}
{
  "tags": [
    {"id": "1", "children": ["2"], "root": true},
    {"id": "2", "children": [], "root": false}
  ]
}
client tree buildup:
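import find from 'lodash/find';
import isArray from 'lodash/isArray';
// `tags` is the flat array returned by the query above.
// Shallow-copy the tags and keep only the ones marked as roots.
const rootTags = [...tags.map(obj => ({...obj})).filter(tag => tag.root === true)];
// Resolve a child ID to its tag, then resolve that tag's own children recursively.
const mapChildren = childId => {
  const tag = find(tags, tag => tag.id === childId) || null;
  // Guard against IDs that do not match any tag.
  if (tag !== null && isArray(tag.children) && tag.children.length > 0) {
    tag.children = tag.children.map(mapChildren).filter(tag => tag !== null);
  }
  return tag;
};
const tagTree = rootTags.map(tag => {
  tag.children = tag.children.map(mapChildren).filter(tag => tag !== null);
  return tag;
});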
// Update 2022-08-16 Fixed typo
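For longer lists, a variation on the same idea indexes the tags by ID first, so each lookup does not have to scan the whole array. This is only a sketch: buildTagTree is a hypothetical helper name, and it assumes the flat { id, children, root } shape from the response above.

```javascript
// Hypothetical helper: rebuilds the tree from the flat tag list.
// Assumes each tag looks like { id, children: [childId, ...], root }.
function buildTagTree(tags) {
  // Index copies by ID so the input array and its child lists are not mutated.
  const byId = new Map(
    tags.map(tag => [tag.id, { ...tag, children: [...(tag.children || [])] }])
  );

  for (const tag of byId.values()) {
    // Replace child IDs with the referenced tag objects, dropping unknown IDs.
    tag.children = tag.children
      .map(childId => byId.get(childId))
      .filter(child => child !== undefined);
  }

  // Only the roots are returned; every other tag hangs somewhere below them.
  return [...byId.values()].filter(tag => tag.root === true);
}

const flat = [
  { id: "1", children: ["2"], root: true },
  { id: "2", children: [], root: false },
];
const tree = buildTagTree(flat);
```

Here tree holds a single root whose children array contains the tag object with id "2" instead of the bare ID.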

Another option, if you're willing to give up the type safety and subfield querying that GraphQL provides, along with the ability to cache and reference objects by their IDs, is to encode the data as JSON. The graphql-type-json package provides resolvers to make this easy. These are also included, with permission, in graphql-scalars, which contains a lot of other handy scalars.
I'm doing this for the hierarchical data that defines the controls for a dynamic form. In this case, there aren't any IDs to lose, so it's an easy win.


Apollo InMemoryCache syntax

I've inherited a project that sets up an InMemoryCache with the following keyFields syntax. None of the examples I can find show this particular signature. The examples I see list multiple field names placed directly in the keyFields attribute. Is this looking for any nested "myField" attributes? How is this expected to appear in the GraphQL data? (Apollo Client 3.2)
const cache = new InMemoryCache({
  typePolicies: {
    Query: {
      /// query info
    },
    UserData: {
      fields: {
        fieldA: {
          merge(existing = [], incoming = []) {
            return incoming;
          },
        },
        fieldB: {
          merge(existing = [], incoming = []) {
            return incoming;
          },
        },
      },
      keyFields: [["myField"]], // <-- What is this looking for?
    },
  },
});
This leads to an invariant violation error:
Uncaught Invariant Violation: Missing field 'myField' while extracting keyFields from {"id":"462a349...... (does not contain myField)
Your code seems fine when it comes to the fields map. keyFields, on the other hand, is a slightly different matter; you could skip setting it entirely.
The purpose of keyFields is to uniquely identify a record, so that the cache knows which entry to update. It is like a primary key in a relational database, which consists of one or more columns that together make a record unique.
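As a rough sketch of a more conventional policy: the object below is only the typePolicies part, which would be passed as new InMemoryCache({ typePolicies }). Using "id" as the key is an assumption based on the object shown in the error message, and replaceWithIncoming is a hypothetical helper name. As far as I know, a nested array in keyFields names subfields of the field that precedes it (for example ["author", ["name"]]), so it is normally written after a parent field.

```javascript
// Sketch of the `typePolicies` object only (no Apollo import needed here).
// Assumption: UserData objects carry a unique `id`, as the error message suggests.
const replaceWithIncoming = (existing = [], incoming = []) => incoming;

const typePolicies = {
  UserData: {
    // Flat list of field names that together identify one UserData object.
    keyFields: ["id"],
    fields: {
      // Same behaviour as the original merge functions: keep the incoming list.
      fieldA: { merge: replaceWithIncoming },
      fieldB: { merge: replaceWithIncoming },
    },
  },
};
```

If the default id / _id handling is enough for UserData, the keyFields entry can be left out altogether.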
I believe this is well covered in Apollo's documentation.

Can Apollo read partial fragments from cache?

I have a simple mutation editPerson. It changes the name and/or description of a person specified by an id.
I use this little snippet to call the mutator from React components:
function useEditPerson(variables) {
  const gqlClient = useGQLClient();
  const personFragment = gql`fragment useEditPerson__person on Person {
    id
    name
    description
  }`;
  return useMutation(gql`
    mutation editPerson($id: ID!, $description: String, $name: String) {
      editPerson(id: $id, description: $description, name: $name) {
        ...useEditPerson__person
      }
    }
    ${personFragment}
  `, {
    variables,
    optimisticResponse: vars => {
      // The cache ID below assumes Apollo's default `Typename:id` format.
      const person = gqlClient.readFragment({ id: `Person:${vars.id}`, fragment: personFragment });
      return {
        editPerson: {
          __typename: "Person",
          description: "",
          name: "",
          ...person,
          ...vars,
        },
      };
    },
  });
}
This works well enough unless either the name or description for the indicated person hasn't yet been queried and does not exist in the cache; in this case person is null. This is expected from readFragment - any incomplete fragment does this.
The thing is I really need that data to avoid invariant errors - if they're not in the cache I'm totally okay using empty strings as default values, those values aren't displayed anywhere in the UI anyway.
Is there any way to read partial fragments from the cache? Is there a better way to get that data for the optimistic response?
I guess you use the snippet in a form that already has all the data you need. If so, you can pass that data to your useEditPerson hook as arguments and use it in the optimistic response; then you won't need gqlClient at all.
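A minimal sketch of that idea, with the response-building part pulled out into a plain function. buildOptimisticEditPerson and its known parameter are hypothetical names; known stands for whatever the calling form already has for the person.

```javascript
// Hypothetical helper: builds the optimistic response from the mutation
// variables plus whatever the calling form already knows about the person.
function buildOptimisticEditPerson(vars, known = {}) {
  return {
    editPerson: {
      __typename: "Person",
      id: vars.id,
      // Prefer the edited value, then the known value, then an empty string.
      name: vars.name ?? known.name ?? "",
      description: vars.description ?? known.description ?? "",
    },
  };
}

const optimistic = buildOptimisticEditPerson(
  { id: "42", name: "Ada" },
  { name: "Old name" } // description was never queried
);
```

Inside the hook this would be used as optimisticResponse: vars => buildOptimisticEditPerson(vars, known), with no cache read involved.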

Is there a way to get a structure of a Strapi CMS Content Type?

A content-type "Product" having the following fields:
string title
int qty
string description
double price
Is there an API endpoint to retrieve the structure or schema of the "Product" content-type as opposed to getting the values?
For example, a request to the endpoint localhost:1337/products could return a response like:
[
  {
    field: "title",
    type: "string",
    other: "col-xs-12, col-5"
  },
  {
    field: "qty",
    type: "int"
  },
  {
    field: "description",
    type: "string"
  },
  {
    field: "price",
    type: "double"
  }
]
where the structure of the schema or the table is sent instead of the actual values?
If not in Strapi CMS, is this possible on other headless CMS such as Hasura and Sanity?
You need to use Models. From the documentation:
Models are a representation of the database's structure. They are split into two separate files. A JavaScript file that contains the model options (e.g: lifecycle hooks), and a JSON file that represents the data structure stored in the database.
This is exactly what you are after.
The way I GET this info is by adding a custom endpoint (see my other answers for how to do this).
For handlers you can do something like:
async getProductModel(ctx) {
  return strapi.models['product'].allAttributes;
}
I needed the solution for all Content Types so I made a plugin with /modelStructure/* endpoints where you can supply the model name and then pass to a handler:
//more generic wrapper
async getModel(ctx) {
  const { model } = ctx.params;
  let data = strapi.models[model].allAttributes;
  return data;
}
async getProductModel(ctx) {
  ctx.params['model'] = "product"
  return this.getModel(ctx)
}
//define all endpoints you need, like maybe a Page content type
async getPageModel(ctx) {
  ctx.params['model'] = "page"
  return this.getModel(ctx)
}
//finally I ended up writing a `allModels` handler
async getAllModels(ctx) {
  Object.keys(strapi.models).forEach(key => {
    //iterate through all models
    //possibly filter some models
    //iterate through all fields
    Object.keys(strapi.models[key].allAttributes).forEach(fieldKey => {
      //build the response - iterate through models and all their fields
    });
  });
  //return your desired custom response
}
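The getAllModels outline could be filled in roughly like this. It is a sketch under assumptions: describeModels is a hypothetical helper, and the models argument is assumed to have the same shape as strapi.models, with each attribute carrying a type. The sample data is a stand-in, not real Strapi output.

```javascript
// Hypothetical helper for a getAllModels handler. `models` is assumed to look
// like strapi.models: { modelName: { allAttributes: { fieldName: { type } } } }.
function describeModels(models, skip = []) {
  const result = {};
  Object.keys(models).forEach(key => {
    if (skip.includes(key)) return; // possibly filter some models
    const attributes = models[key].allAttributes || {};
    // One { field, type } entry per attribute of the model.
    result[key] = Object.keys(attributes).map(fieldKey => ({
      field: fieldKey,
      type: attributes[fieldKey].type,
    }));
  });
  return result;
}

// Stand-in data for illustration only; a real handler would pass strapi.models.
const fakeModels = {
  product: { allAttributes: { title: { type: "string" }, qty: { type: "integer" } } },
  page: { allAttributes: { body: { type: "text" } } },
};
const structure = describeModels(fakeModels, ["page"]);
```

In the handler itself this would be something like return describeModels(strapi.models).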
Comments & questions welcome
This answer pointed me in the right direction, but strapi.models was undefined for me on Strapi 4.4.3.
What worked for me was a controller like so:
async getFields(ctx) {
  const model = strapi.db.config.models.find( model => model.collectionName === 'clients' );
  return model.attributes;
}
Where clients is replaced by the plural name of your content-type.

Filter by Distinct Values in Gatsby

I am trying to display a list of unique subject categories on a Gatsby site, which I will use to create unique pages. These will serve as taxonomy terms, of sorts. A limited version of my source json file looks like:
[
  {
    "BookID": "4176",
    "Title": "Book Title 1",
    "Subject": {
      "subjectID": "HR",
      "name": "Civil War & Reconstruction"
    }
  },
  {
    "BookID": "3619",
    "Title": "Book Title 2",
    "Subject": {
      "subjectID": "AR",
      "name": "Fine Art & Photography"
    }
  },
  {
    "BookID": "3619",
    "Title": "Book Title 3",
    "Subject": {
      "subjectID": "AR",
      "name": "Fine Art & Photography"
    }
  }
]
In my gatsby-node.js file, I can create pages using a list of distinct values of IDs to serve as the slugs to create my subject categories. As below:
{
  allSubjects: allBooksJson {
    distinct(field: Subject___subjectID)
  }
}
However, I also need the name associated with these. I have not yet seen a way to use this as a filter, in order to deduplicate the results of a query.
So what I would ultimately like to do is return all the unique subject objects, so I can use the subjectID as a slug and the full name where needed on the individual pages.
Still learning Gatsby, so this may be the wrong approach, and any advice would be appreciated.
The idea behind creating dynamic pages is to fetch all the needed values in your gatsby-node.js with a GraphQL query, create a page for each one, and use the context to send a unique identifier to the template, which then queries again for the specific data of each entry (books in your case). So:
const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)
exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const result = await graphql(`
    query {
      allBookJson {
        edges {
          node {
            Subject {
              subjectID
            }
          }
        }
      }
    }
  `)
  result.data.allBookJson.edges.forEach(({ node }) => {
    createPage({
      path: `books/${node.Subject.subjectID}`, // change it as you wish
      component: path.resolve(`./src/templates/book.js`), // change it as you wish
      context: {
        subjectID: node.Subject.subjectID,
      },
    })
  })
}
Note: adapt the snippet (query, loop, and variables) to your needs. You don't need to filter anything at this point, since you are only fetching the subjectID of all books.
If the values are likely to be repeated, use a Set to remove the duplicates; then you can loop through the unique values to create pages dynamically:
let unique = [...new Set(result.data.allBookJson.edges.map(({ node }) => node.Subject.subjectID))];
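A Set only removes duplicate primitive values, so keeping the subject name next to its ID needs a keyed structure instead. The following is a sketch, assuming the books have the shape shown in the question; uniqueSubjects is a hypothetical helper name.

```javascript
// Hypothetical helper: collapses books into one { subjectID, name } per subject.
function uniqueSubjects(books) {
  const bySubjectID = new Map();
  for (const book of books) {
    const { subjectID, name } = book.Subject;
    // Keep the first occurrence of each subjectID.
    if (!bySubjectID.has(subjectID)) bySubjectID.set(subjectID, { subjectID, name });
  }
  return [...bySubjectID.values()];
}

const books = [
  { BookID: "4176", Title: "Book Title 1", Subject: { subjectID: "HR", name: "Civil War & Reconstruction" } },
  { BookID: "3619", Title: "Book Title 2", Subject: { subjectID: "AR", name: "Fine Art & Photography" } },
  { BookID: "3619", Title: "Book Title 3", Subject: { subjectID: "AR", name: "Fine Art & Photography" } },
];
const subjects = uniqueSubjects(books);
```

Each entry in subjects can then feed one createPage call, with subjectID as the slug and name passed along in the context.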
You are sending the subjectID to your templates/book.js file via context, so it will be available there as pageContext and as a variable in the page query.
Anytime you want just to get a list of all books, you can create a page query or a static query and loop through them at any time.
import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"
export default function Book({ data }) {
  const books = data.allBookJson.nodes
  return (
    <Layout>
      {books.map(book => {
        return <div key={book.Title}>{book.Title}</div>
      })}
    </Layout>
  )
}
export const query = graphql`
  query($subjectID: String) {
    allBookJson(filter: { Subject: { subjectID: { eq: $subjectID } } }) {
      nodes {
        Title
      }
    }
  }
`
Note: again, test your query at localhost:8000/___graphql and adapt it to your needs. If you get duplicate results, use a Set.
It's difficult to guess your data structure without seeing all of it; the idea is to build a query that filters on the context value subjectID. Use the GraphQL playground to work out what the query and the filters should look like.

GraphQL: Filtering, sorting and paging on nested entities from separate data sources?

I'm attempting to use graphql to tie together a number of rest endpoints, and I'm stuck on how to filter, sort and page the resulting data. Specifically, I need to filter and/or sort by nested values.
I cannot do the filtering on the rest endpoints in all cases because they are separate microservices with separate databases (i.e. I could filter on title in the rest endpoint for articles, but not on author.name). Likewise with sorting. And without filtering and sorting, pagination cannot be done on the rest endpoints either.
To illustrate the problem, and as an attempt at a solution, I've come up with the following using formatResponse in apollo-server, but am wondering if there is a better way.
I've boiled down the solution to the most minimal set of files that I could think of:
data.js represents what would be returned by 2 fictional rest endpoints:
export const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
export const Articles = [
  { id: 1, title: 'Aardvarks', author: 1 },
  { id: 2, title: 'Emus', author: 2 },
  { id: 3, title: 'Tapir', author: 1 },
];
the schema is defined as:
import _ from 'lodash';
import {
  GraphQLSchema,
  GraphQLObjectType,
  GraphQLList,
  GraphQLString,
  GraphQLInt,
} from 'graphql';
import {
  Authors,
  Articles,
} from './data';
const AuthorType = new GraphQLObjectType({
  name: 'Author',
  fields: {
    id: {
      type: GraphQLInt,
    },
    name: {
      type: GraphQLString,
    },
  },
});
const ArticleType = new GraphQLObjectType({
  name: 'Article',
  fields: {
    id: {
      type: GraphQLInt,
    },
    title: {
      type: GraphQLString,
    },
    author: {
      type: AuthorType,
      // Join: look up the author record by the ID stored on the article.
      resolve(article) {
        return _.find(Authors, { id: article.author })
      },
    },
  },
});
const RootType = new GraphQLObjectType({
  name: 'Root',
  fields: {
    articles: {
      type: new GraphQLList(ArticleType),
      resolve() {
        return Articles;
      },
    },
  },
});
export default new GraphQLSchema({
  query: RootType,
});
And the main index.js is:
import express from 'express';
import { apolloExpress, graphiqlExpress } from 'apollo-server';
var bodyParser = require('body-parser');
import _ from 'lodash';
import rql from 'rql/query';
import rqlJS from 'rql/js-array';
import schema from './schema';
const PORT = 8888;
var app = express();
function formatResponse(response, { variables }) {
  let data = response.data.articles;
  // Filter
  if (_.has(variables, 'q')) {
    // As an example, use a resource query lib like rql to do easy filtering
    // in production this would have to be tightened up a lot
    data = rqlJS.query(rql.Query(variables.q), {}, data);
  }
  // Sort
  if (_.has(variables, 'sort')) {
    const sortKey = _.trimStart(variables.sort, '-');
    data = _.sortBy(data, (element) => _.get(element, sortKey));
    if (variables.sort.charAt(0) === '-') _.reverse(data);
  }
  // Pagination
  if (_.has(variables, 'offset') && variables.offset > 0) {
    data = _.slice(data, variables.offset);
  }
  if (_.has(variables, 'limit') && variables.limit > 0) {
    data = _.slice(data, 0, variables.limit);
  }
  return _.assign({}, response, { data: { articles: data }});
}
app.use('/graphql', bodyParser.json(), apolloExpress((req) => {
  return {
    schema,
    formatResponse,
  };
}));
app.use('/graphiql', graphiqlExpress({
  endpointURL: '/graphql',
}));
app.listen(
  PORT,
  () => console.log(`GraphQL Server running at http://localhost:${PORT}`)
);
For ease of reference, these files are available at this gist.
With this setup, I can send this query:
{
  articles {
    id
    title
    author {
      id
      name
    }
  }
}
Along with these variables (It seems like this is not the intended use for the variables, but it was the only way I could get the post processing parameters into the formatResponse function.):
{ "q": "author/name=Sam", "sort": "-id", "offset": 1, "limit": 1 }
and get this response, filtered to where Sam is the author, sorted by id descending, and getting the second page where the page size is 1.
{
  "data": {
    "articles": [
      {
        "id": 1,
        "title": "Aardvarks",
        "author": {
          "id": 1,
          "name": "Sam"
        }
      }
    ]
  }
}
Or these variables:
{ "sort": "-author.name", "offset": 1 }
For this response, sorted by author name descending and getting all articles except the first.
{
  "data": {
    "articles": [
      {
        "id": 1,
        "title": "Aardvarks",
        "author": {
          "id": 1,
          "name": "Sam"
        }
      },
      {
        "id": 2,
        "title": "Emus",
        "author": {
          "id": 2,
          "name": "Pat"
        }
      }
    ]
  }
}
So, as you can see, I am using the formatResponse function for post-processing to do the filtering, paging and sorting.
So, my questions are:
Is this a valid use case?
Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
Is this a valid use case? Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
The major part of the original question lies in the segregation of collections into different databases on separate microservices. In effect, it is necessary to join the collections and then filter on some key, but that is not directly possible, since the original collection has no field to filter, sort or paginate on.
The straightforward solution is to run full or filtered queries against the original collections and then join and filter the resulting dataset on the application server, e.g. with lodash, as in your solution. This is feasible for small collections, but in the general case it causes large data transfers and inefficient sorting and joining, since there is no index structure (a real RB-tree or skip list) to rely on; with quadratic complexity in the naive case, that is not very good.
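A minimal in-memory version of that join, filter, sort and page sequence, using the sample data from the question. queryArticles and its option names are hypothetical, and it is only a sketch of the approach, not a replacement for the formatResponse code.

```javascript
const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
const Articles = [
  { id: 1, title: 'Aardvarks', author: 1 },
  { id: 2, title: 'Emus', author: 2 },
  { id: 3, title: 'Tapir', author: 1 },
];

// Hypothetical helper: joins, filters, sorts and pages entirely in memory.
function queryArticles({ authorName, sortBy = 'id', descending = false, offset = 0, limit = Infinity } = {}) {
  // Index authors by ID once so the join is a single pass over the articles.
  const authorsById = new Map(Authors.map(author => [author.id, author]));
  let rows = Articles.map(article => ({ ...article, author: authorsById.get(article.author) }));

  if (authorName !== undefined) {
    rows = rows.filter(row => row.author && row.author.name === authorName);
  }
  // Sort by a top-level field, ascending, then reverse if requested.
  rows.sort((a, b) => (a[sortBy] < b[sortBy] ? -1 : a[sortBy] > b[sortBy] ? 1 : 0));
  if (descending) rows.reverse();
  return rows.slice(offset, offset + limit);
}

const page = queryArticles({ authorName: 'Sam', sortBy: 'id', descending: true, offset: 1, limit: 1 });
```

With these arguments page contains only the Aardvarks article, matching the first example response in the question.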
Depending on the resources available on the application server, special cache and index tables can be built there. If the collection structure is fixed, some relations between collection entries and their fields can be reflected in a special search table and updated on demand. It is like creating a find-and-search index, not in the database but on the application server. Of course, it will consume resources, but it will be faster than direct lodash-like sorting.
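Such a search table could be as simple as a map from author name to article IDs, built once and refreshed when the source collections change. This is a sketch with a hypothetical buildAuthorNameIndex helper and the sample data from the question.

```javascript
// Hypothetical in-memory index: author name -> IDs of that author's articles.
function buildAuthorNameIndex(authors, articles) {
  const nameById = new Map(authors.map(author => [author.id, author.name]));
  const index = new Map();
  for (const article of articles) {
    const name = nameById.get(article.author);
    if (!index.has(name)) index.set(name, []);
    index.get(name).push(article.id);
  }
  return index;
}

const index = buildAuthorNameIndex(
  [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }],
  [
    { id: 1, title: 'Aardvarks', author: 1 },
    { id: 2, title: 'Emus', author: 2 },
    { id: 3, title: 'Tapir', author: 1 },
  ]
);
// Filtering by author name is now a single lookup instead of a scan.
const samArticleIds = index.get('Sam');
```

The cost is that the index has to be rebuilt or patched whenever an article or author changes.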
The task can also be approached from the other side, if you have access to the structure of the original databases. The key is denormalization. In contrast to the classical relational approach, collections can hold duplicate information to avoid a later join operation. E.g., the Articles collection can carry the information from the Authors collection that is necessary to filter, sort and paginate in later operations.
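As an illustration of the denormalized shape, assuming an authorName field is copied onto each article at write time (both the field name and the denormalize helper are hypothetical):

```javascript
// Hypothetical write-time step: copy the author's name onto each article.
function denormalize(articles, authors) {
  const nameById = new Map(authors.map(author => [author.id, author.name]));
  return articles.map(article => ({ ...article, authorName: nameById.get(article.author) }));
}

const stored = denormalize(
  [
    { id: 1, title: 'Aardvarks', author: 1 },
    { id: 2, title: 'Emus', author: 2 },
    { id: 3, title: 'Tapir', author: 1 },
  ],
  [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }]
);

// Read path: filter and sort on the articles alone, with no join.
const samNewestFirst = stored
  .filter(article => article.authorName === 'Sam')
  .sort((a, b) => b.id - a.id);
```

The trade-off is that renaming an author now means updating every article that carries the copied name.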
