Given the following graph
type Movie {
name: String!
actors: [Actor!]!
}
type Actor {
name: String!
awards: [Award!]!
}
type Award {
name: String!
date: String!
}
type Query {
movies: [Movie!]!
}
I'd like to be able to run the following three types of queries as efficiently as possible:
Query 1:
query {
movies {
actors {
awards {
name
}
}
}
}
Query 2:
query {
movies {
name
}
}
Query 3:
query {
movies {
name
actors {
awards {
date
}
}
}
}
Please note, these are not the only queries I will be running, but I'd like my code to be able to pick the optimal path "automatically".
The rest of my business logic is using JPA. The data comes from three respective tables, that can have up to 40 columns each.
I am not looking for code examples, but rather for a high-level structure describing different elements of architecture with respective responsibilities.
Without more context and details of your DB schema, I can only give general advice on what you need to be aware of.
You will most likely run into the N+1 loading problem when executing a query that contains several levels of related objects that are stored in different DB tables.
Generally there are two ways to solve it:
Use DataLoader. The idea is to defer the loading of each object until multiple objects can be loaded together in a batch by a single SQL query. It also caches loaded objects to further improve performance within the same query request.
Use the "look ahead pattern" (refer to this for an example). The idea is that when you resolve the parent object, you look ahead and analyse whether the GraphQL query being executed also requests related children. If it does, you can use a SQL JOIN to fetch the parent objects together with their children, so that when the children are resolved later they are already loaded and do not need to be fetched again.
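To make the look-ahead idea concrete, here is a minimal sketch in JavaScript. It assumes the graphql-js shape of the resolver's info argument (fieldNodes, selectionSet.selections, name.value); graphql-java offers a comparable selection-set lookup on its DataFetchingEnvironment. The names selectsPath and planMovieFetch are made up for illustration, fragments are ignored, and the info objects are hand-built stand-ins rather than real parser output.

```javascript
// Returns true if the selection under `node` contains the nested field path,
// e.g. ["actors", "awards"]. Fragments are ignored in this sketch.
function selectsPath(node, path) {
  if (path.length === 0) return true;
  const selections = node.selectionSet ? node.selectionSet.selections : [];
  return selections.some(
    (sel) =>
      sel.kind === "Field" &&
      sel.name.value === path[0] &&
      selectsPath(sel, path.slice(1))
  );
}

// Hypothetical planner: decide which tables the `movies` resolver should JOIN.
function planMovieFetch(info) {
  const root = info.fieldNodes[0];
  return {
    joinActors: selectsPath(root, ["actors"]),
    joinAwards: selectsPath(root, ["actors", "awards"]),
  };
}

// Hand-built stand-ins for the `info` of Query 2 and Query 3.
const field = (name, children) => ({
  kind: "Field",
  name: { value: name },
  selectionSet: children ? { selections: children } : undefined,
});
const query2Info = { fieldNodes: [field("movies", [field("name")])] };
const query3Info = {
  fieldNodes: [
    field("movies", [
      field("name"),
      field("actors", [field("awards", [field("date")])]),
    ]),
  ],
};

console.log(planMovieFetch(query2Info)); // no joins needed
console.log(planMovieFetch(query3Info)); // join actors and awards
```

The resolver for movies would then pick a plain select for Query 2 and a joined fetch for Query 3, which is the "automatic" path selection asked about.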
Also, if a collection in your domain can in theory grow without bound, you should consider implementing pagination for the query to restrict the maximum number of objects it can return.
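For the batching approach mentioned above, the following is a rough sketch of the idea only. It is a hand-rolled loader, not the real dataloader package, and createBatchLoader and fetchAwardsByActorIds are hypothetical names. Keys requested in the same tick are collected and passed to one batch function, which in a real setup would run a single SQL query with an IN clause.

```javascript
// Minimal batching loader: collects keys requested in the same tick and
// resolves them with one call to batchFn. Results are cached per loader.
function createBatchLoader(batchFn) {
  let queue = [];
  const cache = new Map();
  function flush() {
    const batch = queue;
    queue = [];
    batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (error) => batch.forEach((item) => item.reject(error))
    );
  }
  return {
    load(key) {
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        // Schedule one flush for the first key of each batch.
        if (queue.length === 1) queueMicrotask(flush);
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// Hypothetical data access: one call returns awards for many actors.
const batchCalls = [];
async function fetchAwardsByActorIds(actorIds) {
  batchCalls.push(actorIds);
  const rows = { 1: ["Oscar"], 2: ["BAFTA", "Emmy"] };
  return actorIds.map((id) => rows[id] || []);
}

const awardsLoader = createBatchLoader(fetchAwardsByActorIds);

// Three loads, as three Actor.awards resolvers might issue them.
const demo = Promise.all([
  awardsLoader.load(1),
  awardsLoader.load(2),
  awardsLoader.load(1), // served from the cache, not re-queued
]);
demo.then((result) => console.log(result, "batch calls:", batchCalls.length));
```

A loader like this is normally created once per request, so the cache does not leak data between users.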
Let's say my graphql server wants to fetch the following data as JSON where person3 and person5 are some id's:
"persons": {
"person3": {
"id": "person3",
"name": "Mike"
},
"person5": {
"id": "person5",
"name": "Lisa"
}
}
Question: How to create the schema type definition with apollo?
The keys person3 and person5 here are dynamically generated depending on my query (i.e. the area used in the query). So at another time I might get person1, person2, person3 returned.
As you can see, persons is not iterable, so the following GraphQL type definition, which I wrote with Apollo, won't work:
type Person {
id: String
name: String
}
type Query {
persons(area: String): [Person]
}
The keys in the persons object may always be different.
One solution of course would be to transform the incoming JSON data to use an array for persons, but is there no way to work with the data as such?
GraphQL relies on both the server and the client knowing ahead of time what fields are available for each type. In some cases, the client can discover those fields (via introspection), but for the server, they always need to be known ahead of time. So dynamically generating those fields based on the returned data is not really possible.
You could utilize a custom JSON scalar (graphql-type-json module) and return that for your query:
type Query {
persons(area: String): JSON
}
By utilizing JSON, you bypass the requirement for the returned data to fit any specific structure, so you can send back whatever you want as long as it's properly formatted JSON.
Of course, there are significant disadvantages in doing this. For example, you lose the safety net provided by the type(s) you would have previously used (literally any structure could be returned, and if you're returning the wrong one, you won't find out about it until the client tries to use it and fails). You also lose the ability to use resolvers for any fields within the returned data.
But... your funeral :)
As an aside, I would consider flattening out the data into an array (like you suggested in your question) before sending it back to the client. If you're writing the client code, and working with a dynamically-sized list of customers, chances are an array will be much easier to work with rather than an object keyed by id. If you're using React, for example, and displaying a component for each customer, you'll end up converting that object to an array to map it anyway. In designing your API, I would make client usability a higher consideration than avoiding additional processing of your data.
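As a rough illustration of the flattening suggested above, the conversion can be a one-liner in the resolver. The upstream object and the resolvePersons name are assumptions for this sketch; the payload shape is the one from the question.

```javascript
// Upstream payload keyed by dynamic ids, as in the question.
const upstream = {
  persons: {
    person3: { id: "person3", name: "Mike" },
    person5: { id: "person5", name: "Lisa" },
  },
};

// Hypothetical resolver body for `persons(area: String): [Person]`:
// drop the dynamic keys and return the values as a list.
function resolvePersons(payload) {
  return Object.values(payload.persons);
}

const personList = resolvePersons(upstream);
console.log(personList.map((p) => p.name));
```

Each person still carries its id, so nothing is lost by dropping the keys.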
You can write your own GraphQLScalarType and precisely describe your object and your dynamic keys, what you allow and what you do not allow or transform.
See https://graphql.org/graphql-js/type/#graphqlscalartype
You can have a look at taion/graphql-type-json where he creates a Scalar that allows and transforms any kind of content:
https://github.com/taion/graphql-type-json/blob/master/src/index.js
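A minimal sketch of what such a custom scalar could check, assuming the keys always look like person followed by digits and each value has an id equal to its key (both are assumptions, adjust to your data). The object below only mirrors the config shape that graphql-js's GraphQLScalarType constructor takes (name, description, serialize, parseValue); it does not import the library, so you would pass it to new GraphQLScalarType(...) yourself.

```javascript
// Validates an object whose keys look like "person<N>" and whose values are
// { id, name } records with id equal to the key. Throws on anything else.
function checkPersonMap(value) {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    throw new TypeError("PersonMap must be an object");
  }
  for (const [key, person] of Object.entries(value)) {
    if (!/^person\d+$/.test(key)) {
      throw new TypeError(`Unexpected key: ${key}`);
    }
    if (!person || person.id !== key || typeof person.name !== "string") {
      throw new TypeError(`Invalid person under key: ${key}`);
    }
  }
  return value;
}

// Hypothetical scalar config; "PersonMap" is a made-up name.
const personMapScalarConfig = {
  name: "PersonMap",
  description: "Object of persons keyed by dynamic person id",
  serialize: checkPersonMap, // server result -> response
  parseValue: checkPersonMap, // variable input -> server
};

console.log(
  personMapScalarConfig.serialize({ person3: { id: "person3", name: "Mike" } })
);
```

Compared with the generic JSON scalar, a wrong structure now fails on the server rather than in the client.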
I had a similar problem with dynamic keys in a schema, and ended up going with a solution like this:
query lookupPersons {
persons {
personKeys
person3: personValue(key: "person3") {
id
name
}
}
}
returns:
{
"data": {
"persons": {
"personKeys": ["person1", "person2", "person3"],
"person3": {
"id": "person3",
"name": "Mike"
}
}
}
}
Shifting the complexity to the query simplifies the response shape.
The advantage compared to the JSON approach is that the client doesn't need to do any deserialisation.
Additional info for Venryx: a possible schema to fit my query looks like this:
type Person {
id: String
name: String
}
type PersonsResult {
personKeys: [String]
personValue(key: String): Person
}
type Query {
persons(area: String): PersonsResult
}
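Resolvers for that schema could look roughly like the sketch below. It uses the resolver-map layout that Apollo Server and graphql-tools accept (type name, then field name, each a function of parent and args). The personsByArea object is a made-up stand-in for wherever the keyed data really comes from.

```javascript
// Hypothetical data source: area -> persons keyed by dynamic id.
const personsByArea = {
  north: {
    person1: { id: "person1", name: "Ann" },
    person3: { id: "person3", name: "Mike" },
  },
};

const personResolvers = {
  Query: {
    // Returns the raw keyed object; the PersonsResult resolvers interpret it.
    persons: (_parent, args) => personsByArea[args.area] || {},
  },
  PersonsResult: {
    // The dynamic keys become plain data instead of schema fields.
    personKeys: (parent) => Object.keys(parent),
    // Looks up one person by key; null when the key is unknown.
    personValue: (parent, args) => parent[args.key] || null,
  },
};

// Calling the resolvers by hand, the way the executor would chain them.
const northPersons = personResolvers.Query.persons(null, { area: "north" });
console.log(personResolvers.PersonsResult.personKeys(northPersons));
console.log(
  personResolvers.PersonsResult.personValue(northPersons, { key: "person3" })
);
```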
As an aside, if your data set for persons gets large enough, you will probably want pagination on personKeys as well, at which point you should look into https://relay.dev/graphql/connections.htm
With GraphQL schemas, when should I provide a type relation's field as a root-level field for its associated type?
Example
In many examples, I almost always see schemas that require the client to create queries that explicitly traverse the graph to get a nested field.
For a Rock Band Table-like component in the front end (or client), the GraphQL service that provides that component's data may have a schema that looks like this:
type Artist {
name: String!
instrument: String!
}
type RockBand {
leadSinger: Artist,
drummer: Artist,
leadGuitar: Artist,
}
type Query {
rockBand: RockBand
}
If the table component specified a column called, "Lead Singer Name", given the current schema, a possible query to fetch table data would look like this:
{
rockBand {
leadSinger {
name
}
}
}
For the same Rock Band Table, with the same column and needs, why not design a schema like this:
type RockBand {
leadSinger: Artist,
leadSingerName: String,
drummer: Artist,
leadGuitar: Artist,
}
That way a possible query can be like this?
{
rockBand {
leadSingerName
}
}
Does the choice to include the "lead singer's name", and similar relation fields, depend entirely on the client's needs? Is modifying the schema to serve data for this use case making the schema too specific? Are there benefits to flattening the fields other than making things easier for the client? Are there benefits to forcing traversal through the relation to get at a specific field?
How would you use graphQL to query by a "relational" entity value?
For instance, let's say we have a bunch of person objects. Each "person" then has a relation to an interest/hobby, which in turn has a property called "name".
Now let's say that we want to query for the name of each person with a specific interest. How would such a query be "conducted" using GraphQL?
Using OData it would be something like Persons?$select=name&$expand=Interests($filter=name eq 'Surfing'). What would be the equivalent for GraphQL?
There is no one equivalent. With the exception of introspection, the GraphQL specification does not dictate what types a schema should have, what fields it should expose or what arguments those fields should take. In other words, there is no one way to query relationships or do things like filtering, sorting or pagination. If you use Relay, it has its own spec with a bit more guidance around things like pagination and connections between different nodes, but even Relay is agnostic to filtering. It's up to the individual service to decide how to implement these features.
As an example, if we set up a Graphcool or Prisma server, our query might look something like this:
query {
persons(where: {
interest: {
name: "Surfing"
}
}) {
name
}
}
A query to a Hasura server might look like this:
query {
persons(where: {
interest: {
name: {
_eq: "Surfing"
}
}
}) {
name
}
}
But there's nothing stopping you from creating a schema that would support a query like:
query {
persons(interest: "Surfing") {
name
}
}
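For that last, hand-rolled variant, the filtering lives entirely in your own resolver. A minimal sketch, assuming an in-memory people array and a schema field persons(interest: String): [Person] (both hypothetical; a real service would usually push the filter down into its database query):

```javascript
// Hypothetical in-memory data; each person has a list of interests.
const people = [
  { name: "Ann", interests: [{ name: "Surfing" }, { name: "Chess" }] },
  { name: "Bob", interests: [{ name: "Chess" }] },
  { name: "Cid", interests: [{ name: "Surfing" }] },
];

// Resolver for `persons(interest: String): [Person]`.
// Without the argument, every person is returned.
function resolvePersonsByInterest(_parent, args) {
  if (!args.interest) return people;
  return people.filter((person) =>
    person.interests.some((interest) => interest.name === args.interest)
  );
}

const surfers = resolvePersonsByInterest(null, { interest: "Surfing" });
console.log(surfers.map((p) => p.name));
```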
GraphQL is great and I've started using it in my app. I have a page that displays summary information, and I need GraphQL to return aggregate counts. Can this be done?
You would define a new GraphQL type that is an object that contains a list and a number. The number would be defined by a resolver function.
On your GraphQL server you can define the resolver function and as part of that, you would have to write the code that performs whatever calculations and queries are necessary to get the aggregate counts.
This is similar to how you would write an object serializer for a REST API or a custom REST API endpoint that runs whatever database queries are needed to calculate the aggregate counts.
GraphQL's strength is that it gives the frontend more power in determining what data specifically is returned. Some of what you write in GraphQL will be the same as what you would write for a REST API.
There's no automatic aggregate function in GraphQL itself.
You can add a field called summary, and in the resolve function calculate the totals.
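A small sketch of such a summary field, with the totals computed in resolver functions. The Item and Summary types, the items array, and summaryResolvers are invented for illustration; the layout follows the usual resolver map of type name, then field name.

```javascript
// SDL this sketch assumes (hypothetical names):
//   type Item { id: ID!  price: Float! }
//   type Summary { items: [Item!]!  count: Int!  totalPrice: Float! }
//   type Query { summary: Summary! }
const items = [
  { id: "a", price: 10 },
  { id: "b", price: 2.5 },
];

const summaryResolvers = {
  Query: {
    // Returns only the list; the numbers are derived below on demand.
    summary: () => ({ items }),
  },
  Summary: {
    count: (parent) => parent.items.length,
    totalPrice: (parent) =>
      parent.items.reduce((sum, item) => sum + item.price, 0),
  },
};

const summary = summaryResolvers.Query.summary();
console.log(
  summaryResolvers.Summary.count(summary),
  summaryResolvers.Summary.totalPrice(summary)
);
```

Because count and totalPrice are separate fields, they are only computed when a query actually asks for them.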
You should define a type for the aggregated data in GraphQL and a function that implements it. For example, if you want to run the following query:
SELECT age, sum(score) from student group by age;
You should define the data type that you want to return:
type StudentScoreByAge{
age: Int
sumOfScore: Float
}
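and a GraphQL query field:
getStudentScoreByAge: [StudentScoreByAge]
with a resolver function:
// Resolver for getStudentScoreByAge; `client` is an already connected database client.
async function getStudentScoreByAge() {
// The alias is quoted because some databases (PostgreSQL, for one) fold
// unquoted identifiers to lower case, which would not match sumOfScore.
const res = await client.query(
'SELECT age, sum(score) AS "sumOfScore" FROM student GROUP BY age'
);
return res.rows;
}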
... need graphql to return aggregate counts? Can this be done?
Yes, it can be done.
Does GraphQL do it automatically for you? No, because it does not know or care where your data comes from.
How? GraphQL does not dictate how you get / mutate the data that the user has queried. It's up to your implementation to get the requested aggregated data. You could get aggregated data directly from your MongoDB and serve it back, or you get all the data you need from your data source and do the aggregation yourself.
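If you go the "do the aggregation yourself" route, it can be as simple as the sketch below, which groups already loaded rows by age and sums the scores, much like a GROUP BY would. The studentRows data and the sumScoreByAge name are made up for illustration.

```javascript
// Hypothetical rows already loaded from the data source.
const studentRows = [
  { name: "Ann", age: 18, score: 80 },
  { name: "Bob", age: 18, score: 70.5 },
  { name: "Cid", age: 19, score: 90 },
];

// Groups rows by age and sums the scores, one result object per age.
function sumScoreByAge(rows) {
  const totals = new Map();
  for (const row of rows) {
    totals.set(row.age, (totals.get(row.age) || 0) + row.score);
  }
  return Array.from(totals, ([age, sumOfScore]) => ({ age, sumOfScore }));
}

console.log(sumScoreByAge(studentRows));
```

For large tables, letting the database aggregate is usually cheaper than loading every row into the server first.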
If you are using Hasura, the explorer shows an "aggregate" field for each table, so your query would look similar to the following:
query queryTable {
table_name {
field1
field2
}
table_name_aggregate {
aggregate { count }
}
}
In your results, you will see the total row count for the query:
"table_name_aggregate": {
"aggregate": {
"count": 9973
}
}
This depends on whether you build the aggregator into your schema and are able to resolve the field.
Can you share what kind of GraphQL server you're running? Different languages have different implementations, as do different services (like Hasura, 8base, and Prisma).
Also, when you say "counts", I'm imagining a count of objects in a relation. Such as:
query {
user(id: "1") {
name
summaries {
count
}
}
}
// returns
{
"data": {
"user": {
"name": "Steve",
"summaries": {
"count": 10
}
}
}
}
8base provides the count aggregate by default on relational queries.