Filter by Distinct Values in Gatsby - graphql

I am trying to display a list of unique subject categories on a Gatsby site, which I will use to create unique pages. These will serve as taxonomy terms, of sorts. A limited version of my source json file looks like:
[
{
"BookID": "4176",
"Title": "Book Title 1",
"Subject": {
"subjectID": "HR",
"name": "Civil War & Reconstruction"
}
},
{
"BookID": "3619",
"Title": "Book Title 2",
"Subject": {
"subjectID": "AR",
"name": "Fine Art & Photography"
}
},
{
"BookID": "3619",
"Title": "Book Title 3",
"Subject": {
"subjectID": "AR",
"name": "Fine Art & Photography"
}
}
]
In my gatsby-node.js file, I can create pages using a list of distinct values of IDs to serve as the slugs to create my subject categories. As below:
allSubjects: allBooksJson {
distinct(field: Subject___subjectID)
}
However, I also need the name associated with these. I have not yet seen a way to use this as a filter, in order to deduplicate the results of a query.
So what I would ultimately like to do is return all the unique subject objects, so I can use the subjectID as a slug and the full name where needed on the individual pages.
Still learning Gatsby, so this may be the wrong approach, and any advice would be appreciated.

The idea behind creating dynamic pages is to fetch all the needed values in your gatsby-node.js using a GraphQL query, create a page for each of them, and use the context to send a unique identifier to the template, which then queries again to get the specific data for each entry (books in your case). So:
const path = require(`path`)
exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const result = await graphql(`
    query {
      allBookJson {
        edges {
          node {
            Subject {
              subjectID
            }
          }
        }
      }
    }
  `)
  result.data.allBookJson.edges.forEach(({ node }) => {
    createPage({
      path: `books/${node.Subject.subjectID}`, // change it as you wish
      component: path.resolve(`./src/templates/book.js`), // change it as you wish
      context: {
        // pass the field that was actually queried (node.fields.slug is not part of this query)
        subjectID: node.Subject.subjectID,
      },
    })
  })
}
Note: adapt the snippet (query, loop, and variables) to your needs. You don't need to filter anything at this point, since you are only fetching the subjectID of all books.
If the values are likely to be repeated, use a Set to remove the duplicates; you can then loop through the unique values to create pages dynamically:
let unique = [...new Set(result.data.allBookJson.edges.map(({ node }) => node.Subject.subjectID))];
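Because the question also needs the name that belongs to each ID, a Set of IDs alone is not enough. A minimal sketch, assuming the query also selects name inside Subject and that the edges have the shape shown above: a Map keyed by subjectID keeps one full subject object per ID. The helper name uniqueSubjects is hypothetical, and the sample edges mirror the JSON from the question.

```javascript
// Hypothetical helper: keep one { subjectID, name } object per subjectID.
function uniqueSubjects(edges) {
  const bySubjectID = new Map();
  for (const { node } of edges) {
    const { subjectID, name } = node.Subject;
    // The first occurrence wins; later duplicates are skipped.
    if (!bySubjectID.has(subjectID)) {
      bySubjectID.set(subjectID, { subjectID, name });
    }
  }
  return [...bySubjectID.values()];
}

// Sample edges mirroring the JSON from the question.
const edges = [
  { node: { Subject: { subjectID: "HR", name: "Civil War & Reconstruction" } } },
  { node: { Subject: { subjectID: "AR", name: "Fine Art & Photography" } } },
  { node: { Subject: { subjectID: "AR", name: "Fine Art & Photography" } } },
];

const subjects = uniqueSubjects(edges);
console.log(subjects);
```

Each entry can then feed one createPage call, with subjectID as the slug and name passed along in the context.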
You are sending the subjectID to your templates/book.js file via context, so it will be available to be used as a pageContext.
Anytime you want just to get a list of all books, you can create a page query or a static query and loop through them at any time.
import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"
export default function Book({ data }) {
  // allBookJson is a connection, so the books live in its edges
  const books = data.allBookJson.edges
  return (
    <Layout>
      <div>
        {books.map(({ node }) => {
          return <div key={node.Title}>{node.Title}</div>
        })}
      </div>
    </Layout>
  )
}
export const query = graphql`
  query($subjectID: String) {
    allBookJson(filter: { Subject: { subjectID: { eq: $subjectID } } }) {
      edges {
        node {
          Title
        }
      }
    }
  }
`
Note: again, test your query at localhost:8000/___graphql and adapt it to your needs. If you get duplicate results, use a Set.
It's difficult to guess your data structure without seeing it in full; the idea is to create a unique query based on the context value subjectID and filter the values. Use the GraphQL playground to work out what the query and the filters should look like.
Further details: https://www.gatsbyjs.com/docs/tutorial/part-seven/

Related

Dynamically create pages with Gatsby based on many Contentful references

I am currently using Gatsby's collection routes API to create pages for a simple blog with data coming from Contentful.
For example, creating a page for each blogpost category :
-- src/pages/categories/{contentfulBlogPost.category}.js
export const query = graphql`
query categoriesQuery($category: String = "") {
allContentfulBlogPost(filter: { category: { eq: $category } }) {
edges {
node {
title
category
description {
description
}
...
}
}
}
}
...
[React component mapping all blogposts from each category in a list]
...
This is working fine.
But now I would like to have multiple categories per blogpost, so I switched to Contentful's "References, many" field type, which allows multiple entries for a field:
Now the result of my GraphQL query on field category2 is an array of different categories for each blogpost:
Query :
query categoriesQuery {
allContentfulBlogPost {
edges {
node {
category2 {
id
name
slug
}
}
}
}
}
Output :
{
"data": {
"allContentfulBlogPost": {
"edges": [
{
"node": {
"category2": [
{
"id": "75b89e48-a8c9-54fd-9742-cdf70c416b0e",
"name": "Test",
"slug": "test"
},
{
"id": "568r9e48-t1i8-sx4t8-9742-cdf70c4ed789vtu",
"name": "Test2",
"slug": "test-2"
}
]
}
},
{
"node": {
"category2": [
{
"id": "75b89e48-a8c9-54fd-9742-cdf70c416b0e",
"name": "Test",
"slug": "test"
}
]
}
},
...
Now that categories are inside an array, I don't know how to :
write a query variable to filter categories names ;
use the slug field as a route to dynamically create the page.
For blogposts authors I was doing :
query authorsQuery($author__slug: String = "") {
allContentfulBlogPost(filter: { author: { slug: { eq: $author__slug } } }) {
edges {
node {
id
author {
slug
name
}
...
}
...
}
And creating pages with src/pages/authors/{contentfulBlogPost.author__slug}.js
I guess I'll have to use the createPages API instead.
You can achieve the result using the File System Route API; something like this may work:
src/pages/category/{contentfulBlogPost.category2__name}.js
However, this approach comes with some caveats, since you may potentially create duplicated pages with the same URL (slug) because the posts can contain multiple and repeated categories.
I think it's cleaner to use the createPages API as you said, keeping in mind that you will need to process the categories to avoid duplicates because they are in a one-to-many relationship.
const path = require(`path`)
exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const result = await graphql(`
    query {
      allContentfulBlogPost {
        edges {
          node {
            category2 {
              id
              name
              slug
            }
          }
        }
      }
    }
  `)
  let categories = { slugs: [], names: [] };
  result.data.allContentfulBlogPost.edges.forEach(({ node }) => {
    // category2 is an array, so loop through every category of the post
    const postCategories = node.category2 || [];
    postCategories.forEach(({ name, slug }) => {
      // make some checks if needed here; skip empty or already collected slugs
      if (!slug || categories.slugs.includes(slug)) return;
      categories.slugs.push(slug);
      categories.names.push(name);
    });
  });
  categories.slugs.forEach((category, index) => {
    let name = categories.names[index];
    createPage({
      path: `category/${category}`,
      component: path.resolve(`./src/templates/your-category-template.js`),
      context: {
        name
      }
    });
  });
}
The code's quite self-explanatory. Basically you are defining an empty object (categories) that contains two arrays, slugs and names:
let categories= { slugs: [], names: [] };
After that, you only need to loop through the result of the query (result) and push the field values (name, slug, and others if needed) into those arrays, making any checks you need (to avoid pushing empty values, values that match some regular expression, etc.) and removing the duplicates, for example with a Set.
Then, you only need to loop through the slugs to create pages using createPage API and pass the needed data via context:
context: {
name
}
With ES6 shorthand property names, this is the same as writing:
context: {
name: name
}
So, in your template, you will get the name in the pageContext prop. Replace it with the slug if needed, depending on your use case; the approach is exactly the same.
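The deduplication step can be tried on its own, outside Gatsby, against the sample output from the question. This is a minimal sketch rather than the answer's exact code: the helper name collectCategories is hypothetical, and it uses a Map keyed by slug instead of two parallel arrays so that each slug stays paired with its name.

```javascript
// Hypothetical helper: collect one { slug, name } entry per category slug.
function collectCategories(edges) {
  const bySlug = new Map();
  for (const { node } of edges) {
    // category2 is an array on each post (and may be missing entirely).
    for (const { slug, name } of node.category2 || []) {
      if (slug && !bySlug.has(slug)) {
        bySlug.set(slug, name);
      }
    }
  }
  return [...bySlug].map(([slug, name]) => ({ slug, name }));
}

// Sample edges trimmed from the query output in the question.
const edges = [
  { node: { category2: [{ name: "Test", slug: "test" }, { name: "Test2", slug: "test-2" }] } },
  { node: { category2: [{ name: "Test", slug: "test" }] } },
];

const categories = collectCategories(edges);
console.log(categories);
```

Each resulting entry maps to one createPage call, so a category shared by several posts still produces a single page.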

Gatsby's mapping between markdown files

I'm creating a multi-author site (using gatsby-plugin-mdx) and have the following file structure:
/posts
- /post-1/index.mdx
- /post-2/index.mdx
- ...
/members
- /member-a/index.mdx
- /member-b/index.mdx
- ...
In the frontmatter of the post page I have an array of authors like
authors: [Member A, Member B]
and I have the name of the author in the frontmatter of the author's markdown file.
I'd like to set the schema up so that when I query the post, I also get the details of the authors as well (name, email, etc.).
From reading this page it seems like I need to create a custom resolver... but all the examples I see have all the authors in one json file (so you have two collections, MarkdownRemark and AuthorJson), while I think in my case all my posts and members are in the MarkdownRemark collection.
Thanks so much!
I ended up doing something like this. Surely there's a cleaner way, but it works for me. It goes through all the Mdx nodes and adds a field called authors, which can be queried, to all Mdx types.
One problem with this is that there's also an authors field under members, which is not ideal. A better approach would be to define new types and change Mdx in the last resolver to your new post data type. I'm not sure how to get that to work, though. In the end, I could query something like:
query MyQuery {
posts {
frontmatter {
title
subtitle
}
authors {
frontmatter {
name
email
}
}
}
}
exports.createResolvers = ({ createResolvers }) => {
const resolvers = {
Mdx: {
authors: {
type: ["Mdx"],
resolve(source, args, context, info) {
return context.nodeModel.runQuery({
query: {
filter: {
fields: {
collection: { eq: "members" }
},
frontmatter: {
memberid: { in: source.frontmatter.authors },
},
},
},
type: "Mdx",
firstOnly: false,
})
}
}
},
}
createResolvers(resolvers)
}
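What the resolver's filter selects can be illustrated with plain arrays. This is only a sketch of the matching logic, not Gatsby code: the sample nodes and the helper resolveAuthors are hypothetical, and it assumes, as the resolver above does, that the post's authors array holds memberid values rather than display names.

```javascript
// Hypothetical stand-ins for Mdx nodes; real nodes come from Gatsby's node store.
const mdxNodes = [
  { fields: { collection: "members" }, frontmatter: { memberid: "member-a", name: "Member A" } },
  { fields: { collection: "members" }, frontmatter: { memberid: "member-b", name: "Member B" } },
  { fields: { collection: "members" }, frontmatter: { memberid: "member-c", name: "Member C" } },
  { fields: { collection: "posts" }, frontmatter: { title: "Post 1", authors: ["member-a", "member-c"] } },
];

// Mirrors the filter: collection eq "members" AND memberid in source.frontmatter.authors.
function resolveAuthors(source, nodes) {
  const wanted = new Set(source.frontmatter.authors || []);
  return nodes.filter(
    (node) => node.fields.collection === "members" && wanted.has(node.frontmatter.memberid)
  );
}

const post = mdxNodes[3];
const authorNames = resolveAuthors(post, mdxNodes).map((node) => node.frontmatter.name);
console.log(authorNames);
```

If the post frontmatter lists names instead of ids, the comparison field in the filter would have to be the name field instead.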

Strapi GraphQL search by multiple attributes

I've got a very simple Nuxt app with Strapi GraphQL backend that I'm trying to use and learn more about GraphQL in the process.
One of my last features is to implement a search feature where a user enters a search query, and Strapi/GraphQL performs that search based on attributes such as image name and tag names that are associated with that image. I've been reading the Strapi documentation and there's a segment about performing a search.
So in my schema.graphql, I've added this line:
type Query {
...other generated queries
searchImages(searchQuery: String): [Image]
}
Then in the /api/image/config/schema.graphql.js file, I've added this:
module.exports = {
query: `
searchImages(searchQuery: String): [Image]
`,
resolver: {
Query: {
searchImages: {
resolverOf: 'Image.find',
async resolver(_, { searchQuery }) {
if (searchQuery) {
const params = {
name_contains: searchQuery,
// tags_contains: searchQuery,
// location_contains: searchQuery,
}
const searchResults = await strapi.services.image.search(params);
console.log('searchResults: ', searchResults);
return searchResults;
}
}
}
},
},
};
At this point I'm just trying to return results in the GraphQL playground, however when I run something simple in the Playground like:
query($searchQuery: String!) {
searchImages(searchQuery:$searchQuery) {
id
name
}
}
I get the error: "TypeError: Cannot read property 'split' of undefined".
Any ideas what might be going on here?
UPDATE:
For now, I'm using deep filtering instead of the search like so:
query($searchQuery: String) {
images(
where: {
tags: { title_contains: $searchQuery }
name_contains: $searchQuery
}
) {
id
name
slug
src {
url
formats
}
}
}
This is not ideal because it's not an OR/WHERE operator, meaning it's not searching by tag title or image name. It seems to only hit the first where. Ideally I would like to use Strapi's search service.
I actually ran into this problem not too long ago and took a different approach.
The where condition can be combined using either _and or _or, as seen below.
_or
articles(where: {
_or: [
{ content_contains: $dataContains },
{ description_contains: $dataContains }
]})
_and
(where: {
_and: [
{slug_contains: $categoriesContains}
]})
Additionally, these operators can be combined given that where in this instance is an object.
For your solution, I would presume you want an _or condition in your where filter, like below:
images(where: {
_or: [
{ title_contains: $searchQuery },
{ name_contains: $searchQuery }
]})
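The difference between the two forms can be sketched in memory. This is not Strapi code: it assumes that several keys in one where object are combined with AND, that matching is case-insensitive, and the sample images and helper names are hypothetical.

```javascript
// Hypothetical sample data standing in for the images collection.
const images = [
  { name: "sunset beach", tags: [{ title: "nature" }] },
  { name: "city skyline", tags: [{ title: "sunset" }] },
  { name: "forest path", tags: [{ title: "nature" }] },
];

const contains = (text, q) => text.toLowerCase().includes(q.toLowerCase());
const nameMatches = (image, q) => contains(image.name, q);
const tagMatches = (image, q) => image.tags.some((tag) => contains(tag.title, q));

const searchQuery = "sunset";

// Both conditions in one where object: every condition must hold (AND).
const andResults = images.filter(
  (image) => nameMatches(image, searchQuery) && tagMatches(image, searchQuery)
);

// Conditions listed under _or: any single condition is enough (OR).
const orResults = images.filter(
  (image) => nameMatches(image, searchQuery) || tagMatches(image, searchQuery)
);

console.log(andResults.map((image) => image.name));
console.log(orResults.map((image) => image.name));
```

With this data the AND form matches nothing, because no image has "sunset" in both its name and a tag, while the OR form returns the first two images.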
Lastly, you can perform a query that filters by a predicate by creating an event schema and adding the @search directive as seen here

Query an array with unstructured objects on GraphQL

I'm trying to use GraphQL to query an unstructured array with objects in Gridsome. It is currently looking very messy and it feels like there should be a better way to do this.
The data that gets loaded into GraphQL from the CMS looks like this:
{
title: "Homepage",
top_image: "imgurl.jpg",
page_builder: [
{
type: "slider",
field: "data example",
different_field: "data example"
},
{
type: "call_to_action",
field_for_cta: "data example",
different_cta_field: "data example"
}
]
}
As you can see, the objects in page_builder will have different fields depending on how the client is building this section.
When I try to query this in GraphQL, it becomes very messy:
<page-query>
query {
data: pages(path: "/pages") {
title,
top_image,
page_builder {
type,
field,
different_field,
type,
field_for_cta,
different_cta_field
#this list will have way more fields depending on all the page builder elements
}
}
}
</page-query>
Is there a way to organize these fields by type and only return the fields of this specific type?
Assuming Gridsome supports fragments, you can do something like this:
<page-query>
query ($includeA: Boolean!, $includeB: Boolean!, $includeC: Boolean!) {
data: pages(path: "/pages") {
title,
top_image,
page_builder {
...A @include(if: $includeA)
...B @include(if: $includeB)
...C @include(if: $includeC)
}
}
}
# Note: Replace PageBuilderType with appropriate type
fragment A on PageBuilderType {
# your fields here
}
fragment B on PageBuilderType {
# your fields here
}
fragment C on PageBuilderType {
# your fields here
}
</page-query>
You can then define the variables when calling createPage as shown here:
module.exports = function (api) {
  api.createPages(({ createPage }) => {
    createPage({
      path: '/my-page',
      component: './src/templates/MyPage.vue',
      queryVariables: {
        includeA: someCondition,
        includeB: someCondition,
        includeC: someCondition,
      },
    })
  })
}
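One way to fill in someCondition is to derive each flag from the block types present on the page. A minimal sketch, assuming fragment A holds the slider fields and fragment B the call_to_action fields; the helper name buildQueryVariables is hypothetical, and the sample page is trimmed from the data in the question.

```javascript
// Hypothetical helper: derive the include flags from the block types on a page.
function buildQueryVariables(pageBuilder) {
  const types = new Set(pageBuilder.map((block) => block.type));
  return {
    includeA: types.has("slider"), // assumption: fragment A = slider fields
    includeB: types.has("call_to_action"), // assumption: fragment B = call-to-action fields
  };
}

// Sample page with a single slider block.
const page = {
  title: "Homepage",
  page_builder: [
    { type: "slider", field: "data example", different_field: "data example" },
  ],
};

const queryVariables = buildQueryVariables(page.page_builder);
console.log(queryVariables);
```

The resulting object can be passed as queryVariables to createPage, so only the fragments for block types the page actually uses are included.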

GraphQL: Filtering, sorting and paging on nested entities from separate data sources?

I'm attempting to use graphql to tie together a number of rest endpoints, and I'm stuck on how to filter, sort and page the resulting data. Specifically, I need to filter and/or sort by nested values.
I cannot do the filtering on the rest endpoints in all cases because they are separate microservices with separate databases. (i.e. I could filter on title in the rest endpoint for articles, but not on author.name). Likewise with sorting. And without filtering and sorting, pagination cannot be done on the rest endpoints either.
To illustrate the problem, and as an attempt at a solution, I've come up with the following using formatResponse in apollo-server, but am wondering if there is a better way.
I've boiled down the solution to the most minimal set of files that I could think of:
data.js represents what would be returned by 2 fictional rest endpoints:
export const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
export const Articles = [
{ id: 1, title: 'Aardvarks', author: 1 },
{ id: 2, title: 'Emus', author: 2 },
{ id: 3, title: 'Tapir', author: 1 },
]
The schema is defined as:
import _ from 'lodash';
import {
GraphQLSchema,
GraphQLObjectType,
GraphQLList,
GraphQLString,
GraphQLInt,
} from 'graphql';
import {
Articles,
Authors,
} from './data';
const AuthorType = new GraphQLObjectType({
name: 'Author',
fields: {
id: {
type: GraphQLInt,
},
name: {
type: GraphQLString,
}
}
});
const ArticleType = new GraphQLObjectType({
name: 'Article',
fields: {
id: {
type: GraphQLInt,
},
title: {
type: GraphQLString,
},
author: {
type: AuthorType,
resolve(article) {
return _.find(Authors, { id: article.author })
},
}
}
});
const RootType = new GraphQLObjectType({
name: 'Root',
fields: {
articles: {
type: new GraphQLList(ArticleType),
resolve() {
return Articles;
},
}
}
});
export default new GraphQLSchema({
query: RootType,
});
And the main index.js is:
import express from 'express';
import { apolloExpress, graphiqlExpress } from 'apollo-server';
var bodyParser = require('body-parser');
import _ from 'lodash';
import rql from 'rql/query';
import rqlJS from 'rql/js-array';
import schema from './schema';
const PORT = 8888;
var app = express();
function formatResponse(response, { variables }) {
let data = response.data.articles;
// Filter
if ({}.hasOwnProperty.call(variables, 'q')) {
// As an example, use a resource query lib like https://github.com/persvr/rql to do easy filtering
// in production this would have to be tightened up a lot
data = rqlJS.query(rql.Query(variables.q), {}, data);
}
// Sort
if ({}.hasOwnProperty.call(variables, 'sort')) {
const sortKey = _.trimStart(variables.sort, '-');
data = _.sortBy(data, (element) => _.at(element, sortKey));
if (variables.sort.charAt(0) === '-') _.reverse(data);
}
// Pagination
if ({}.hasOwnProperty.call(variables, 'offset') && variables.offset > 0) {
data = _.slice(data, variables.offset);
}
if ({}.hasOwnProperty.call(variables, 'limit') && variables.limit > 0) {
data = _.slice(data, 0, variables.limit);
}
return _.assign({}, response, { data: { articles: data }});
}
app.use('/graphql', bodyParser.json(), apolloExpress((req) => {
return {
schema,
formatResponse,
};
}));
app.use('/graphiql', graphiqlExpress({
endpointURL: '/graphql',
}));
app.listen(
PORT,
() => console.log(`GraphQL Server running at http://localhost:${PORT}`)
);
For ease of reference, these files are available at this gist.
With this setup, I can send this query:
{
articles {
id
title
author {
id
name
}
}
}
Along with these variables (It seems like this is not the intended use for the variables, but it was the only way I could get the post processing parameters into the formatResponse function.):
{ "q": "author/name=Sam", "sort": "-id", "offset": 1, "limit": 1 }
and get this response, filtered to where Sam is the author, sorted by id descending, and getting the second page where the page size is 1:
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
}
]
}
}
Or these variables:
{ "sort": "-author.name", "offset": 1 }
For this response, sorted by author name descending and getting all articles except the first.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
},
{
"id": 2,
"title": "Emus",
"author": {
"id": 2,
"name": "Pat"
}
}
]
}
}
So, as you can see, I am using the formatResponse function for post-processing to do the filtering/paging/sorting.
So, my questions are:
Is this a valid use case?
Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
Is this a valid use case? Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
The major part of the original question lies in the collections being segregated into different databases on separate microservices. In fact, it is necessary to join the collections and then filter on some key, but this is not directly possible because the original collection has no field to filter, sort, or paginate on.
The straightforward solution is to perform full or filtered queries against the original collections, and then join and filter the resulting dataset on the application server, e.g. with lodash, as in your solution. This is feasible for small collections, but in the general case it causes large data transfers and inefficient sorting, since there is no index structure (a real RB-tree or skip list), and with quadratic complexity it does not scale well.
Depending on the resources available on the application server, special cache and index tables can be built there. If the collection structure is fixed, some relations between collection entries and their fields can be reflected in a special search table and updated on demand. It is like creating a find-and-search index, not in the database but on the application server. Of course, it will consume resources, but it will be faster than direct lodash-like sorting.
The task can also be solved from the other side, if there is access to the structure of the original databases. The key is denormalization. In contrast to the classical relational approach, collections can hold duplicate information to avoid further join operations. E.g., the Articles collection can hold the information from the Authors collection that is necessary to perform filtering, sorting, and pagination in further operations.
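The index and denormalization ideas can be sketched with the sample data from the question. This is a minimal in-memory illustration, not a production design: the authorName field and the queryArticles helper are hypothetical, and a real setup would have to keep the copied data in sync with the Authors service.

```javascript
// Sample data from the question's data.js.
const Authors = [{ id: 1, name: "Sam" }, { id: 2, name: "Pat" }];
const Articles = [
  { id: 1, title: "Aardvarks", author: 1 },
  { id: 2, title: "Emus", author: 2 },
  { id: 3, title: "Tapir", author: 1 },
];

// Build the author index once instead of scanning Authors for every article.
const authorsById = new Map(Authors.map((author) => [author.id, author]));

// Denormalize: copy the author name onto each article (hypothetical authorName field).
const denormalized = Articles.map((article) => ({
  ...article,
  authorName: authorsById.get(article.author).name,
}));

// Hypothetical helper: filter, sort and page over the flat rows.
function queryArticles({ authorName, sortKey, descending = false, offset = 0, limit = Infinity }) {
  let rows = denormalized;
  if (authorName !== undefined) {
    rows = rows.filter((row) => row.authorName === authorName);
  }
  if (sortKey) {
    // Copy before sorting so the shared denormalized array is never mutated.
    rows = [...rows].sort((a, b) => (a[sortKey] < b[sortKey] ? -1 : a[sortKey] > b[sortKey] ? 1 : 0));
    if (descending) rows.reverse();
  }
  return rows.slice(offset, offset + limit);
}

// Same request as in the question: author Sam, sorted by id descending, second page of size 1.
const page = queryArticles({ authorName: "Sam", sortKey: "id", descending: true, offset: 1, limit: 1 });
console.log(page);
```

Because the author name now lives on each row, filtering and sorting by it no longer needs a join at query time.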

Resources