What is the recommended schema for paginated GraphQL results

Let's say I have a list of users to return. Which of the following schema strategies would be best?
The users query returns only user data, and a separate query is used for pagination details. The downside is that we need to pass the same filters to both the users and usersCount queries.
query {
users(skip: 0, limit: 100, filters: someFilter) {
name
},
usersCount(filters: someFilter)
}
Which returns the following:
{
results: {
users: [
{ name: "Foo" },
{ name: "Bar" },
],
usersCount: 1000,
}
}
In this strategy the pagination details are part of the users query, so we don't need to pass the filters twice. However, I feel this query is not as nice to read.
query {
users(skip: 0, limit: 100, filters: someFilter) {
items {
name
},
count
}
}
Which returns the following result
{
results: {
users: {
items: [
{ name: "Foo" },
{ name: "Bar" },
],
count: 1000,
}
}
}
I am curious to know which strategy is recommended when designing paginated results.

I would recommend following the official recommendation and switching to cursor-based pagination.
This type of pagination uses a record, or a pointer to a record, in the dataset to paginate results: the cursor refers to a record in the database, and each page is requested relative to that cursor.
You can follow the example in the specification linked below.

GraphQL Cursor Connections Specification
Also check out how GitHub does it here: https://docs.github.com/en/graphql/reference/interfaces#node
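As a rough illustration of what a connection-style field might return, here is a minimal sketch in plain JavaScript. The USERS array, the `user:<id>` cursor encoding, and the totalCount field are assumptions made for this example (totalCount is a common extension, not something the specification requires); a real resolver would query a database instead.

```javascript
// Hypothetical in-memory data set standing in for a database table.
const USERS = [
  { id: 1, name: "Foo" },
  { id: 2, name: "Bar" },
  { id: 3, name: "Baz" },
];

// Opaque cursors: here simply the base64-encoded record id (an assumption).
const encodeCursor = (id) => Buffer.from(`user:${id}`).toString("base64");
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, "base64").toString("utf8").split(":")[1]);

// Builds a connection-shaped result for a field like users(first, after).
function usersConnection({ first, after }) {
  const afterId = after ? decodeCursor(after) : 0;
  // Assumes ids are positive and USERS is sorted by id ascending.
  const remaining = USERS.filter((user) => user.id > afterId);
  const page = remaining.slice(0, first);
  const edges = page.map((node) => ({ node, cursor: encodeCursor(node.id) }));
  return {
    totalCount: USERS.length, // extension field, see the note above
    edges,
    pageInfo: {
      hasNextPage: remaining.length > page.length,
      hasPreviousPage: USERS.some((user) => user.id <= afterId),
      startCursor: edges.length ? edges[0].cursor : null,
      endCursor: edges.length ? edges[edges.length - 1].cursor : null,
    },
  };
}

const firstPage = usersConnection({ first: 2 });
const secondPage = usersConnection({ first: 2, after: firstPage.pageInfo.endCursor });
console.log(firstPage.edges.map((edge) => edge.node.name));  // [ 'Foo', 'Bar' ]
console.log(secondPage.edges.map((edge) => edge.node.name)); // [ 'Baz' ]
```

Note how the count lives next to the page of results, so filters only need to be passed once, much like the second strategy in the question.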

Related

Apollo mixes two different arrays of the same query seemingly at random

With a schema like
schema {
query: QueryRoot
}
scalar MyBigUint
type Order {
id: Int!
data: OrderCommons!
kind: OrderType!
}
type OrderBook {
bids(limit: Int): [Order!]!
asks(limit: Int): [Order!]!
}
type OrderCommons {
quantity: Int!
price: MyBigUint! # it doesn't matter whether this is MyBigUint or a plain Int; the issue occurs either way
}
enum OrderType {
BUY
SELL
}
type QueryRoot {
orderbook: OrderBook!
}
And a query query { orderbook { bids { data { price } }, asks { data { price } } } }
In a graphql playground of my graphql API (and on the network level of my Apollo app too) I receive a result like
{
"data": {
"orderbook": {
"bids": [
{
"data": {
"price": "127"
}
},
{
"data": {
"price": "74"
}
},
...
],
"asks": [
{
"data": {
"price": "181"
}
},
{
"data": {
"price": "187"
}
},
...
]
}
}
}
where, for the purpose of this question, the bids are ordered in descending order by price like ["127", "74", "73", "72"], etc, and asks are ordered in ascending order, accordingly.
However, in Apollo, after a query is done, I notice that one of the arrays gets seemingly random data.
For the purpose of the question, useQuery react hook is used, but the same happens when I query imperatively from a freshly initialized ApolloClient.
const { data, subscribeToMore, ...rest } = useQuery<OrderbookResponse>(GET_ORDERBOOK_QUERY);
console.log(data?.orderbook?.bids?.map(r => r.data.price));
console.log(data?.orderbook?.asks?.map(r => r.data.price));
Here, corrupted data of Bids gets printed i.e. ['304', '306', '298', '309', '277', '153', '117', '108', '87', '76'] (notice the order being wrong, at the least), whereas Asks data looks just fine. Inspecting the network, I find that Bids are not only properly ordered there, but also have different (correct, from DB) values!
Therefore, it seems something's getting corrupted on the way while Apollo delivers the data.
What could be the issue here I wonder, and where to start debugging such kind of an issue? There seem to be no warnings from Apollo either, it seems to just silently corrupt the data.
I'm clearly doing something wrong, but what?
The issue seems to stem from how Apollo caches data.
My bids and asks can have the same numeric IDs while sharing the same Order GraphQL type. Apollo reasonably assumes that a bid and an ask with the same ID are the same object, so their cached data overwrites each other.
An easy fix is to show Apollo that there's a complex key to the Order type on cache initialization:
cache: new InMemoryCache({
typePolicies: {
Order: {
keyFields: ['id', 'kind'],
}
}
})
This way it understands that an ask and a bid of type Order with the same ID are in fact different pieces of data.
Note that the kind field must then also be selected in the queries, since Apollo needs every key field to be present in the response.
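To see why the composite key helps, here is a deliberately simplified model of a normalized cache in plain JavaScript. It is not Apollo's actual implementation: the cacheId and writeAll helpers are hypothetical, and the price is flattened onto the order for brevity. It only shows how two objects that map to the same cache ID end up merged.

```javascript
// Simplified model of a normalized cache: NOT Apollo's real implementation.
// cacheId builds an identifier from the type name and the chosen key fields.
function cacheId(obj, keyFields = ["id"]) {
  const key = keyFields.map((field) => obj[field]).join("|");
  return `${obj.__typename}:${key}`;
}

// Writes every object into a Map keyed by its cache id.
function writeAll(objects, keyFields) {
  const store = new Map();
  for (const obj of objects) {
    const id = cacheId(obj, keyFields);
    // Same id => fields are merged into one entry and later writes win.
    store.set(id, { ...store.get(id), ...obj });
  }
  return store;
}

const bid = { __typename: "Order", id: 1, kind: "BUY", price: "127" };
const ask = { __typename: "Order", id: 1, kind: "SELL", price: "181" };

const byIdOnly = writeAll([bid, ask]);
const byIdAndKind = writeAll([bid, ask], ["id", "kind"]);

console.log(byIdOnly.size);                 // 1 (the ask overwrote the bid)
console.log(byIdOnly.get("Order:1").price); // 181
console.log(byIdAndKind.size);              // 2 (both orders survive)
```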

Gatsby's mapping between markdown files

I'm creating a multi-author site (using gatsby-plugin-mdx) and have the following file structure:
/posts
- /post-1/index.mdx
- /post-2/index.mdx
- ...
/members
- /member-a/index.mdx
- /member-b/index.mdx
- ...
In the frontmatter of the post page I have an array of authors like
authors: [Member A, Member B]
and I have the name of the author in the frontmatter of the author's markdown file.
I'd like to set the schema up so that when I query the post, I also get the details of the authors as well (name, email, etc.).
From reading this page it seems like I need to create a custom resolver, but all the examples I see have all the authors in one JSON file (so you have two collections, MarkdownRemark and AuthorJson), while in my case I think all my posts and members are in the MarkdownRemark collection.
Thanks so much!
I ended up doing something like this. Surely there's a cleaner way, but it works for me. It goes through all the Mdx nodes and adds a field called authors, which can be queried, to all Mdx types.
One problem with this is that members also get an authors field, which is not ideal. A better approach would be to define new types and change Mdx in the last resolver to your new post data type, though I'm not sure how to get that to work. In the end, I could query something like:
query MyQuery {
posts {
frontmatter {
title
subtitle
}
authors {
frontmatter {
name
email
}
}
}
}
exports.createResolvers = ({ createResolvers }) => {
const resolvers = {
Mdx: {
authors: {
type: ["Mdx"],
resolve(source, args, context, info) {
return context.nodeModel.runQuery({
query: {
filter: {
fields: {
collection: { eq: "members" }
},
frontmatter: {
memberid: { in: source.frontmatter.authors },
},
},
},
type: "Mdx",
firstOnly: false,
})
}
}
},
}
createResolvers(resolvers)
}

What do I do when my query string in Elasticsearch is too long and the search doesn't work?

I'm querying Elasticsearch in my GraphQL resolver as shown below. The peopleArray under the second multi_match is dynamic and is filled with numbers (e.g. [101, 102, 103...]); it lets different users get their messages. If the user's peopleArray has 165 items, it works fine. However, I have test users who have 1873 items in the peopleArray, and for them the query doesn't work. I may also need to support users with 60,000 items. Basically, I'm creating a messaging system for employees in a company, and the numbers are all the people in their message-sending network. If this is not a practical way of doing this, any suggestions would be appreciated. But if it is possible, how do I get my results? (I have not tried this in the Kibana UI yet.)
const esClient = new elasticsearch.Client({
host: config.elasticsearchRoot
});
const result = await esClient.search({
index: `${config.index.messageStream}`,
body: {
query: {
bool: {
must: [
{
multi_match: {
query: tenantId,
fields: [
"tenantId",
"discretionary.clientKey.systemTenantId",
"clientKey.systemTenantId",
"author.tenantId"
],
operator: "OR"
}
},
{
multi_match: {
query: peopleArray,
fields: ["author.peopleId", "details.recipient.peopleId"],
operator: "OR"
}
}
]
}
},
from,
size,
docvalue_fields: [
{
field: "sortDate",
format: "use_field_mapping"
}
],
sort: {
sortDate: {
missing: "_last",
unmapped_type: "long",
order: "desc"
}
}
}
});

GraphQL query based on a specific value of a field

I want to be able to retrieve the latest release from GitHub for a specific repo using their GraphQL API. To do that, I need to get the latest release where isDraft and isPrerelease are false. I have managed to get the first part, but can't figure out how to do the "where" part of the query.
Here is the basic query I have gotten (https://developer.github.com/v4/explorer/):
{
repository(owner: "paolosalvatori", name: "ServiceBusExplorer") {
releases(first: 1, orderBy: {field: CREATED_AT, direction: DESC}) {
nodes {
name
tagName
resourcePath
isDraft
isPrerelease
}
}
}
}
Which returns:
{
"data": {
"repository": {
"releases": {
"nodes": [
{
"name": "3.0.4",
"tagName": "3.0.4",
"resourcePath": "/paolosalvatori/ServiceBusExplorer/releases/tag/3.0.4",
"isDraft": false,
"isPrerelease": false
}
]
}
}
}
}
I can't seem to find a way to do this. Part of the reason is that I am new to GraphQL (this is my first time trying to write a query) and I am not sure how to frame my question.
Can one only "query" based on those types that support arguments (like repository and releases below)? Seems like there should be a way to specify a filter on the field values.
Repository: https://developer.github.com/v4/object/repository/
Releases: https://developer.github.com/v4/object/releaseconnection/
Node: https://developer.github.com/v4/object/release/
Can one only "query" based on those types that support arguments
Yes: GraphQL doesn't define a generic query language in the same way, say, SQL does. You can't sort or filter a field result in ways that aren't provided by the server and the application schema.
I want to be able to retrieve the latest [non-draft, non-prerelease] release from GitHub for a specific repo using their GraphQL API.
As you've already found, the releases field on the Repository type doesn't have an option to sort or filter on these fields. Instead, you can iterate through the releases one at a time with multiple GraphQL calls. These would individually look like
query NextRelease($owner: String!, $name: String!, $after: String) {
repository(owner: $owner, name: $name) {
releases(first: 1,
orderBy: {field: CREATED_AT, direction: DESC},
after: $after) {
pageInfo { endCursor hasNextPage }
nodes { ... ReleaseData } # from the question
}
}
}
Run this in the same way you're running it now (I've split out the information identifying the repository into separate GraphQL variables). You can leave off the after variable for the first call. If (as in your example) it returns "isDraft": false, "isPrerelease": false, you're set. If not, you need to try again: take the value of endCursor from the response, and run the same query, passing that cursor value as the after variable value. If hasNextPage is false, there are no more releases to check.
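The retry loop described above could look roughly like the following sketch. It assumes the query selects pageInfo { endCursor hasNextPage } (the cursor fields GitHub's PageInfo type exposes). fetchReleasePage is a hypothetical function you would write yourself: it runs the query with the given after cursor and resolves to the repository.releases object. The fake fetcher and its releases are made up purely so the example runs without network access.

```javascript
// fetchReleasePage(after) is hypothetical: it should run the NextRelease query
// and resolve to `repository.releases` ({ nodes, pageInfo }).
async function findLatestStableRelease(fetchReleasePage, maxPages = 50) {
  let after = null; // no cursor on the first call
  for (let i = 0; i < maxPages; i++) {
    const { nodes, pageInfo } = await fetchReleasePage(after);
    const release = nodes[0];
    if (!release) return null; // no releases at all
    if (!release.isDraft && !release.isPrerelease) return release;
    if (!pageInfo.hasNextPage) return null; // ran out of releases
    after = pageInfo.endCursor; // continue after the release just inspected
  }
  return null; // gave up after maxPages requests
}

// Fake data and fetcher for demonstration only; a real one would call the API.
const fakeReleases = [
  { tagName: "4.0.0-beta", isDraft: false, isPrerelease: true },
  { tagName: "3.1.0", isDraft: true, isPrerelease: false },
  { tagName: "3.0.4", isDraft: false, isPrerelease: false },
];

async function fakeFetch(after) {
  // In this fake, the cursor is just the index of the previous release.
  const index = after === null ? 0 : Number(after) + 1;
  return {
    nodes: fakeReleases.slice(index, index + 1),
    pageInfo: {
      endCursor: String(index),
      hasNextPage: index < fakeReleases.length - 1,
    },
  };
}

findLatestStableRelease(fakeFetch).then((release) => {
  console.log(release.tagName); // 3.0.4
});
```

The maxPages limit is a design choice for the sketch: it caps the number of API calls in case a repository has a long run of drafts or prereleases.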
{
repository(owner: "paolosalvatori", name: "ServiceBusExplorer") {
releases(first: 1, orderBy: {field: CREATED_AT, direction: DESC}) {
nodes(isDraft :false , isPrerelease :false ) {
name
tagName
resourcePath
isDraft
isPrerelease
}
}
}
}
Alternatively, please have a look at GraphQL directives, as it is sometimes necessary to skip or include fields based on a condition:
@skip or @include.
The @skip directive, when used on fields or fragments, allows us to exclude them when its if argument is true.
The @include directive allows us to include them only when its if argument is true. In both cases the condition is a Boolean argument (usually a query variable), not the value a field returns.
GraphQL Directives

GraphQL: Filtering, sorting and paging on nested entities from separate data sources?

I'm attempting to use graphql to tie together a number of rest endpoints, and I'm stuck on how to filter, sort and page the resulting data. Specifically, I need to filter and/or sort by nested values.
I cannot do the filtering on the REST endpoints in all cases because they are separate microservices with separate databases (e.g. I could filter on title in the REST endpoint for articles, but not on author.name). Likewise with sorting. And without filtering and sorting, pagination cannot be done on the REST endpoints either.
To illustrate the problem, and as an attempt at a solution, I've come up with the following using formatResponse in apollo-server, but am wondering if there is a better way.
I've boiled down the solution to the most minimal set of files that I could think of:
data.js represents what would be returned by 2 fictional rest endpoints:
export const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
export const Articles = [
{ id: 1, title: 'Aardvarks', author: 1 },
{ id: 2, title: 'Emus', author: 2 },
{ id: 3, title: 'Tapir', author: 1 },
]
the schema is defined as:
import _ from 'lodash';
import {
GraphQLSchema,
GraphQLObjectType,
GraphQLList,
GraphQLString,
GraphQLInt,
} from 'graphql';
import {
Articles,
Authors,
} from './data';
const AuthorType = new GraphQLObjectType({
name: 'Author',
fields: {
id: {
type: GraphQLInt,
},
name: {
type: GraphQLString,
}
}
});
const ArticleType = new GraphQLObjectType({
name: 'Article',
fields: {
id: {
type: GraphQLInt,
},
title: {
type: GraphQLString,
},
author: {
type: AuthorType,
resolve(article) {
return _.find(Authors, { id: article.author })
},
}
}
});
const RootType = new GraphQLObjectType({
name: 'Root',
fields: {
articles: {
type: new GraphQLList(ArticleType),
resolve() {
return Articles;
},
}
}
});
export default new GraphQLSchema({
query: RootType,
});
And the main index.js is:
import express from 'express';
import { apolloExpress, graphiqlExpress } from 'apollo-server';
import bodyParser from 'body-parser';
import _ from 'lodash';
import rql from 'rql/query';
import rqlJS from 'rql/js-array';
import schema from './schema';
const PORT = 8888;
var app = express();
function formatResponse(response, { variables }) {
let data = response.data.articles;
// Filter
if ({}.hasOwnProperty.call(variables, 'q')) {
// As an example, use a resource query lib like https://github.com/persvr/rql to do easy filtering
// in production this would have to be tightened up a lot
data = rqlJS.query(rql.Query(variables.q), {}, data);
}
// Sort
if ({}.hasOwnProperty.call(variables, 'sort')) {
const sortKey = _.trimStart(variables.sort, '-');
data = _.sortBy(data, (element) => _.at(element, sortKey));
if (variables.sort.charAt(0) === '-') _.reverse(data);
}
// Pagination
if ({}.hasOwnProperty.call(variables, 'offset') && variables.offset > 0) {
data = _.slice(data, variables.offset);
}
if ({}.hasOwnProperty.call(variables, 'limit') && variables.limit > 0) {
data = _.slice(data, 0, variables.limit);
}
return _.assign({}, response, { data: { articles: data }});
}
app.use('/graphql', bodyParser.json(), apolloExpress((req) => {
return {
schema,
formatResponse,
};
}));
app.use('/graphiql', graphiqlExpress({
endpointURL: '/graphql',
}));
app.listen(
PORT,
() => console.log(`GraphQL Server running at http://localhost:${PORT}`)
);
For ease of reference, these files are available at this gist.
With this setup, I can send this query:
{
articles {
id
title
author {
id
name
}
}
}
Along with these variables (It seems like this is not the intended use for the variables, but it was the only way I could get the post processing parameters into the formatResponse function.):
{ "q": "author/name=Sam", "sort": "-id", "offset": 1, "limit": 1 }
and get this response, filtered to where Sam is the author, sorted by id descending, and getting the second page where the page size is 1.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
}
]
}
}
Or these variables:
{ "sort": "-author.name", "offset": 1 }
For this response, sorted by author name descending and getting all articles except the first.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
},
{
"id": 2,
"title": "Emus",
"author": {
"id": 2,
"name": "Pat"
}
}
]
}
}
So, as you can see, I am using the formatResponse function for post-processing to do the filtering, paging, and sorting.
So, my questions are:
Is this a valid use case?
Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
Is this a valid use case? Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
The main part of the original question concerns collections that are split across different databases in separate microservices. In effect, the collections have to be joined and then filtered on some key, but that cannot be done directly, because the original collection has no field on which to filter, sort, or paginate.
The straightforward solution is to run full or filtered queries against the original collections, and then join and filter the resulting dataset on the application server, e.g. with lodash, as in your solution. This is feasible for small collections, but in the general case it causes large data transfers and inefficient sorting, since there is no index structure such as a red-black tree or skip list; with quadratic complexity, that is not very good.
Depending on the resources available on the application server, special cache and index tables can be built there. If the collection structure is fixed, the relations between collection entries and their fields can be reflected in a special search table that is updated on demand. This is like creating a find-and-search index, not in the database but on the application server. Of course, it will consume resources, but it will be faster than direct lodash-like sorting.
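A minimal sketch of such an application-side index, reusing the Authors and Articles sample data from the question, might look like this. The authorsById and articlesByAuthorName maps and the articlesByAuthor helper are hypothetical names for this example, and the sketch rebuilds nothing automatically: keeping the index in sync with the microservices is left out.

```javascript
// Sample data from the question, standing in for two REST responses.
const Authors = [{ id: 1, name: "Sam" }, { id: 2, name: "Pat" }];
const Articles = [
  { id: 1, title: "Aardvarks", author: 1 },
  { id: 2, title: "Emus", author: 2 },
  { id: 3, title: "Tapir", author: 1 },
];

// Lookup index for the join: author id -> author record.
const authorsById = new Map(Authors.map((author) => [author.id, author]));

// Search table: author name -> joined articles, in the order of Articles
// (which is ascending by id in this sample).
const articlesByAuthorName = new Map();
for (const article of Articles) {
  const author = authorsById.get(article.author);
  const rows = articlesByAuthorName.get(author.name) ?? [];
  rows.push({ ...article, author });
  articlesByAuthorName.set(author.name, rows);
}

// Filtering becomes a single Map lookup, and paging is a slice of its result.
function articlesByAuthor(name, { offset = 0, limit = 10 } = {}) {
  const rows = articlesByAuthorName.get(name) ?? [];
  return rows.slice(offset, offset + limit);
}

console.log(articlesByAuthor("Sam", { offset: 1, limit: 1 }).map((a) => a.title));
// [ 'Tapir' ]
```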
The task can also be solved from the other side, if you have access to the structure of the original databases. The key is denormalization. In contrast to the classical relational approach, collections can hold duplicate information to avoid later join operations. For example, the Articles collection can carry the information from the Authors collection that is necessary for filtering, sorting, and pagination in later operations.
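As a small illustration of the denormalized variant, the sketch below assumes each article record carries a copy of the author's name in a hypothetical authorName field. How that copy is kept in sync when an author is renamed is outside the scope of the example.

```javascript
// Denormalized article records: authorName duplicates the author's name
// (an assumed field, kept in sync by whatever process writes articles).
const DenormalizedArticles = [
  { id: 1, title: "Aardvarks", author: 1, authorName: "Sam" },
  { id: 2, title: "Emus", author: 2, authorName: "Pat" },
  { id: 3, title: "Tapir", author: 1, authorName: "Sam" },
];

// With the copy in place, one collection can filter, sort, and page on its own.
function queryArticles({ authorName, sortDesc = false, offset = 0, limit = 10 } = {}) {
  const rows = DenormalizedArticles
    .filter((a) => authorName === undefined || a.authorName === authorName)
    .sort((a, b) => (sortDesc ? b.id - a.id : a.id - b.id)); // sorts the filtered copy
  return rows.slice(offset, offset + limit);
}

// Same request as the first example in the question:
// author Sam, id descending, second page with a page size of 1.
const page = queryArticles({ authorName: "Sam", sortDesc: true, offset: 1, limit: 1 });
console.log(page.map((a) => a.title)); // [ 'Aardvarks' ]
```

In a real system this filter would be pushed down to the articles service or its database, which can then use its own indexes.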

Resources