To limit the size of my REST API responses, I want to implement Google's performance tip: using the fields query string parameter to return partial resources.
If the full response to GET https://myapi.com/v1/users is:
[
{
"id": 12,
"first_name": "Angie",
"last_name": "Smith",
"address": {
"street": "1122 Something St.",
"city": "A city"
..and so on...
}
},
... and so on
]
I should be able to filter it with GET https://myapi.com/v1/users?fields=first_name:
[
{
"id": 12,
"first_name": "Angie"
},
... and so on
]
The concept is pretty easy to understand, but I can't find an easy way to implement it!
My API resources are all designed the same way:
use query string parameters for filtering, sorting and paging.
call a service with those parameters to run a SQL query (only the WHERE condition, the ORDER BY clause and the LIMIT are dynamic)
use a converter to format the data back to JSON
But with this new fields parameter, what do I need to do? Where do I filter the data?
Do I filter only the JSON output? Then I will still make (in this example) an unwanted JOIN on the address table and fetch unwanted columns from the users table.
Do I build a dynamic SQL query that fetches exactly the requested fields and adds the JOIN only when the end user needs it? Then the converter has to be smart enough to convert only the fields present in the SQL result.
In my opinion, this second solution will produce extremely dynamic, extremely complex code that is difficult to maintain.
So, how do you implement a REST API with such a partial resource feature? What are your best practices in that case?
(I'm a PHP developer, but I don't think that's relevant to this question.)
If your backend is doing
GET https://myapi.com/v1/users
which results in SQL:
select * from users
which you then turn into JSON, can you not just do:
GET https://myapi.com/v1/users?fields=first_name,surname,email
get all the required fields (rough idea of a PHP implementation):
$fields = explode(",", $_GET["fields"]);
$columns = [];
foreach ($fields as $field) {
    // check the field against the known column names first...
    if (checkField($field)) {
        $columns[] = $field;
    }
}
// implode() takes care of the commas
$sql = "select " . implode(",", $columns) . " from users";
to build SQL like:
select first_name,surname,email from users
and turn that limited dataset into JSON?
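To address the worry about dynamic JOINs, one option is to drive both the column list and the joins from a single whitelist. The following is only a minimal sketch in JavaScript rather than PHP: the names FIELD_MAP and buildUserQuery, the city field and the address.user_id column are all assumptions made for illustration, not part of the API described above.

```javascript
// Hypothetical whitelist: API field name -> SQL column and the join it needs (if any).
const FIELD_MAP = {
  id: { column: "users.id" },
  first_name: { column: "users.first_name" },
  last_name: { column: "users.last_name" },
  // Assumed schema: address rows reference users through address.user_id.
  city: { column: "address.city", join: "LEFT JOIN address ON address.user_id = users.id" },
};

function buildUserQuery(fieldsParam) {
  const requested = (fieldsParam ?? "")
    .split(",")
    .map((f) => f.trim())
    .filter(Boolean);
  // Unknown names are dropped, so user input never reaches the SQL text.
  const known = requested.filter((f) =>
    Object.prototype.hasOwnProperty.call(FIELD_MAP, f)
  );
  // No valid field requested: return everything. Otherwise always include id.
  const fields =
    known.length > 0 ? [...new Set(["id", ...known])] : Object.keys(FIELD_MAP);
  const columns = fields.map((f) => `${FIELD_MAP[f].column} AS ${f}`);
  // Add each join once, and only when a requested field needs it.
  const joins = [...new Set(fields.map((f) => FIELD_MAP[f].join).filter(Boolean))];
  return [`SELECT ${columns.join(", ")}`, "FROM users", ...joins].join(" ");
}
```

Because every result column is aliased to its API field name, the converter can simply serialise whatever keys each row contains instead of knowing which fields were requested.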
Why are some items in DynamoDB not being fetched by GraphQL?
When searching via the DynamoDB console, I can easily see and query the item, but when I query through GraphQL, some items do not show up. This isn't a connection problem, because I can query other items; there's just one specific item that is not being returned.
For example, if I query all Posts, it returns the posts in an array, but that item is not among them. However, when I query the Post directly by its ID, it works.
Example query that is not working:
listPosts(filter: {groupID: {eq: "25"}}) {
items {
id
content
}
}
but when I do this, it is working well:
getPost(id: "c59ce7e9") {
id
content
}
I had this same issue and can share what I found and what worked for me.
The default resolver for the list operation has a limit:20 built in.
{
"version": "2017-02-28",
"operation": "Scan",
"filter": #if($context.args.filter) $util.transform.toDynamoDBFilterExpression($ctx.args.filter) #else null #end,
"limit": $util.defaultIfNull($ctx.args.limit, 20),
"nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null)),
}
I imagine you could change this, or you could add a limit argument to your query like this:
listPosts(filter: {groupID: {eq: "25"}}, limit:100) {
items {
id
content
}
}
The limit should be higher than the number of records in the table, because a Scan applies the limit to the items it examines before the filter is applied.
This is a problem because the resolver uses the Scan operation, which inspects every record for a match, and that hurts performance. You could follow nextToken to page through the results, or better, craft a query for this. You will need to look into pagination, relations and connections.
https://docs.aws.amazon.com/appsync/latest/devguide/designing-your-schema.html#advanced-relations-and-pagination
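To see why an item can go missing with limit 20, and how following nextToken recovers it, here is a small in-memory simulation. It is only a sketch: scanPage and listAll are hypothetical stand-ins, not AppSync or DynamoDB calls, and the token is a plain array index.

```javascript
// Hypothetical stand-in for one Scan page: `limit` caps the items *examined*,
// and the filter is applied afterwards, which mirrors how a Scan behaves.
function scanPage(table, predicate, limit, nextToken) {
  const start = nextToken ?? 0;
  const examined = table.slice(start, start + limit);
  const end = start + examined.length;
  return {
    items: examined.filter(predicate),
    nextToken: end < table.length ? end : null,
  };
}

// Keep requesting pages until nextToken is null, accumulating the matches.
function listAll(table, predicate, limit) {
  const items = [];
  let nextToken = null;
  do {
    const page = scanPage(table, predicate, limit, nextToken);
    items.push(...page.items);
    nextToken = page.nextToken;
  } while (nextToken !== null);
  return items;
}

// 50 posts, only the one at position 42 belongs to group "25".
const posts = Array.from({ length: 50 }, (_, i) => ({
  id: String(i),
  groupID: i === 42 ? "25" : "1",
}));
const inGroup25 = (post) => post.groupID === "25";

const firstPage = scanPage(posts, inGroup25, 20, null); // no matches, but a nextToken
const everything = listAll(posts, inGroup25, 20); // finds the post on the third page
```

The first page comes back empty even though a matching post exists, which is the behaviour described in the question.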
I'm trying to build a GraphQL query within Strapi (probably not relevant) and I'm not sure how to achieve what I want. So far I've got the following, which is close, but name_in isn't quite what I want.
query {
events(where: { audiences: { name_in: ["Year 1", "Biology"] } }) {
name
audiences {
name
}
}
}
What I'm trying to achieve is events where every audience name is contained in the provided list, which I ultimately wish to parameterise. Here _in is obviously matching any. So I only want records whose audiences are ["Year 1"], ["Biology"] or ["Year 1", "Biology"]. Anything else, such as ["Year 2", "Biology"], should not be returned, as its set of audiences is not completely covered by the list.
Is this possible with vanilla GraphQL, or do I need to start writing custom resolvers?
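For reference, the difference between the two semantics can be written down in plain JavaScript. This is only a sketch of what a custom resolver would have to compute, not Strapi API code; the sample events and the function names are made up, and excluding events with no audiences at all is an assumption.

```javascript
// Made-up sample data in the shape returned by the query above.
const events = [
  { name: "Lab tour", audiences: [{ name: "Year 1" }, { name: "Biology" }] },
  { name: "Revision", audiences: [{ name: "Year 2" }, { name: "Biology" }] },
  { name: "Welcome", audiences: [{ name: "Year 1" }] },
];

// What name_in does: at least one audience is in the list.
function matchAny(allEvents, names) {
  const allowed = new Set(names);
  return allEvents.filter((e) => e.audiences.some((a) => allowed.has(a.name)));
}

// What the question asks for: every audience is in the list.
// Assumption: events without any audience are excluded.
function matchSubset(allEvents, names) {
  const allowed = new Set(names);
  return allEvents.filter(
    (e) => e.audiences.length > 0 && e.audiences.every((a) => allowed.has(a.name))
  );
}

const anyNames = matchAny(events, ["Year 1", "Biology"]).map((e) => e.name);
const subsetNames = matchSubset(events, ["Year 1", "Biology"]).map((e) => e.name);
```

Here matchAny keeps all three events, while matchSubset drops "Revision" because "Year 2" is not in the list.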
Let's say my graphql server wants to fetch the following data as JSON where person3 and person5 are some id's:
"persons": {
"person3": {
"id": "person3",
"name": "Mike"
},
"person5": {
"id": "person5",
"name": "Lisa"
}
}
Question: How to create the schema type definition with apollo?
The keys person3 and person5 here are dynamically generated depending on my query (i.e. the area used in the query). So at another time I might get person1, person2, person3 returned.
As you can see, persons is not an iterable, so the following GraphQL type definition I wrote with Apollo won't work:
type Person {
id: String
name: String
}
type Query {
persons(area: String): [Person]
}
The keys in the persons object may always be different.
One solution of course would be to transform the incoming JSON data to use an array for persons, but is there no way to work with the data as such?
GraphQL relies on both the server and the client knowing ahead of time what fields are available for each type. In some cases, the client can discover those fields (via introspection), but for the server, they always need to be known ahead of time. So dynamically generating those fields based on the returned data is not really possible.
You could utilize a custom JSON scalar (graphql-type-json module) and return that for your query:
type Query {
persons(area: String): JSON
}
By utilizing JSON, you bypass the requirement for the returned data to fit any specific structure, so you can send back whatever you want as long as it's properly formatted JSON.
Of course, there are significant disadvantages in doing this. For example, you lose the safety net provided by the type(s) you would have previously used (literally any structure could be returned, and if you're returning the wrong one, you won't find out about it until the client tries to use it and fails). You also lose the ability to use resolvers for any fields within the returned data.
But... your funeral :)
As an aside, I would consider flattening out the data into an array (like you suggested in your question) before sending it back to the client. If you're writing the client code and working with a dynamically-sized list of persons, chances are an array will be much easier to work with than an object keyed by id. If you're using React, for example, and displaying a component for each person, you'll end up converting that object to an array to map it anyway. In designing your API, I would make client usability a higher consideration than avoiding additional processing of your data.
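Flattening is a one-liner on the server, and a client that really wants the keyed form can rebuild it just as easily. A minimal sketch, where toPersonList and toPersonMap are names chosen here for illustration:

```javascript
// The keyed object from the question.
const persons = {
  person3: { id: "person3", name: "Mike" },
  person5: { id: "person5", name: "Lisa" },
};

// Server side: drop the dynamic keys so the result fits [Person].
function toPersonList(personsById) {
  return Object.values(personsById);
}

// Client side (optional): rebuild the lookup object from the id field.
function toPersonMap(personList) {
  return Object.fromEntries(personList.map((p) => [p.id, p]));
}

const personList = toPersonList(persons);
const roundTrip = toPersonMap(personList);
```

This works because each person already carries its own id, so the keys hold no extra information.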
You can write your own GraphQLScalarType and precisely describe your object and your dynamic keys, what you allow and what you do not allow or transform.
See https://graphql.org/graphql-js/type/#graphqlscalartype
You can have a look at taion/graphql-type-json, which defines a scalar that accepts and transforms any kind of content:
https://github.com/taion/graphql-type-json/blob/master/src/index.js
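As a rough illustration of "describing what you allow", the validation logic of such a scalar could look like the function below. This is a sketch only: the key pattern and the function name serializePersons are assumptions, and wiring it into a GraphQLScalarType (for example as its serialize function) is not shown here.

```javascript
// Assumed rule for the dynamic keys: "person" followed by digits.
const KEY_PATTERN = /^person\d+$/;

// Accepts an object keyed by person id and returns a cleaned copy,
// throwing on anything that does not match the expected shape.
function serializePersons(value) {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    throw new TypeError("Persons must be an object keyed by person id");
  }
  const out = {};
  for (const [key, person] of Object.entries(value)) {
    if (!KEY_PATTERN.test(key)) {
      throw new TypeError(`Unexpected key: ${key}`);
    }
    if (typeof person?.id !== "string" || typeof person?.name !== "string") {
      throw new TypeError(`Invalid person under key: ${key}`);
    }
    // Copy only the known fields, so extra properties are stripped.
    out[key] = { id: person.id, name: person.name };
  }
  return out;
}
```

Compared with a fully open JSON scalar, this at least fails on the server when the data has the wrong shape, although clients still cannot select individual fields.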
I had a similar problem with dynamic keys in a schema, and ended up going with a solution like this:
query lookupPersons {
persons {
personKeys
person3: personValue(key: "person3") {
id
name
}
}
}
returns:
{
  "data": {
    "persons": {
      "personKeys": ["person1", "person2", "person3"],
      "person3": {
        "id": "person3",
        "name": "Mike"
      }
    }
  }
}
Shifting the complexity to the query simplifies the response shape.
The advantage over the JSON approach is that the client does not need to deserialise anything.
A possible schema to fit this query looks like this:
type Person {
id: String
name: String
}
type PersonsResult {
personKeys: [String]
personValue(key: String): Person
}
type Query {
persons(area: String): PersonsResult
}
As an aside, if your data set for persons gets large enough, you're going to probably want pagination on personKeys as well, at which point, you should look into https://relay.dev/graphql/connections.htm
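Resolvers for the PersonsResult schema could look roughly like the following. This is a sketch under assumptions: PERSONS_BY_AREA and its sample data are invented, and the resolver map uses the usual (parent, args) resolver signature without being attached to a running server.

```javascript
// Invented in-memory data source: area -> object keyed by person id.
const PERSONS_BY_AREA = {
  north: {
    person3: { id: "person3", name: "Mike" },
    person5: { id: "person5", name: "Lisa" },
  },
};

const personResolvers = {
  Query: {
    // The keyed object becomes the parent value of PersonsResult.
    persons: (_parent, { area }) => PERSONS_BY_AREA[area] ?? {},
  },
  PersonsResult: {
    personKeys: (personsById) => Object.keys(personsById),
    // Own-property check so keys like "toString" do not resolve to anything.
    personValue: (personsById, { key }) =>
      Object.prototype.hasOwnProperty.call(personsById, key)
        ? personsById[key]
        : null,
  },
};

// Calling the resolvers by hand, the way a server would chain them.
const parent = personResolvers.Query.persons(null, { area: "north" });
const keys = personResolvers.PersonsResult.personKeys(parent);
const mike = personResolvers.PersonsResult.personValue(parent, { key: "person3" });
```

The client first asks for personKeys and then aliases one personValue(key: ...) field per key it wants.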
GraphQL is great and I've started using it in my app. I have a page that displays summary information and I need GraphQL to return aggregate counts. Can this be done?
You would define a new GraphQL type that is an object that contains a list and a number. The number would be defined by a resolver function.
On your GraphQL server you can define the resolver function and as part of that, you would have to write the code that performs whatever calculations and queries are necessary to get the aggregate counts.
This is similar to how you would write an object serializer for a REST API or a custom REST API endpoint that runs whatever database queries are needed to calculate the aggregate counts.
GraphQL's strength is that it gives the frontend more power in determining what data specifically is returned. Some of what you write in GraphQL will be the same as what you would write for a REST API.
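A type that "contains a list and a number" could be resolved roughly as follows. This is a sketch with invented names (PostList, POSTS, postResolvers), and an in-memory array stands in for whatever database query you would really run.

```javascript
// Schema this sketch assumes (hypothetical names):
//   type PostList { items: [Post], count: Int }
//   type Query { posts(groupID: String): PostList }

// Invented sample data standing in for a database table.
const POSTS = [
  { id: "a", groupID: "25", content: "first" },
  { id: "b", groupID: "25", content: "second" },
  { id: "c", groupID: "7", content: "other" },
];

const postResolvers = {
  Query: {
    // The matching rows become the parent value of PostList.
    posts: (_parent, { groupID }) => POSTS.filter((p) => p.groupID === groupID),
  },
  PostList: {
    items: (matches) => matches,
    // The number is computed by its own resolver function.
    count: (matches) => matches.length,
  },
};

const matches = postResolvers.Query.posts(null, { groupID: "25" });
const count = postResolvers.PostList.count(matches);
```

With a real database you would more likely run a separate COUNT query inside the count resolver, so a client that asks only for count never loads the rows.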
There's no automatic aggregate function in GraphQL itself.
You can add a field called summary, and in the resolve function calculate the totals.
You should define a GraphQL type for the aggregated data and a resolver function that computes it. For example, if you want to run the following query:
SELECT age, sum(score) from student group by age;
You should define the data type that you want to return:
type StudentScoreByAge{
age: Int
sumOfScore: Float
}
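and a GraphQL field with its resolver function:
getStudentScoreByAge: [StudentScoreByAge]
async function getStudentScoreByAge() {
  // The alias is quoted so that databases which fold unquoted identifiers
  // to lower case (PostgreSQL does) keep the camelCase field name.
  const res = await client.query(
    'SELECT age, sum(score) AS "sumOfScore" FROM student GROUP BY age'
  );
  return res.rows;
}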
... need graphql to return aggregate counts? Can this be done?
Yes, it can be done.
Does GraphQL do it automatically for you? No, because it does not know or care where your data comes from.
How? GraphQL does not dictate how you get / mutate the data that the user has queried. It's up to your implementation to get the requested aggregated data. You could get aggregated data directly from your MongoDB and serve it back, or you get all the data you need from your data source and do the aggregation yourself.
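If you take the second route and aggregate in your own code, the resolver body can be as small as the sketch below. The function name sumScoreByAge and the sample rows are invented; the output shape follows the StudentScoreByAge type shown in the earlier answer.

```javascript
// Groups rows by age and sums their scores, like
// SELECT age, sum(score) FROM student GROUP BY age.
function sumScoreByAge(students) {
  const totals = new Map();
  for (const { age, score } of students) {
    totals.set(age, (totals.get(age) ?? 0) + score);
  }
  // One { age, sumOfScore } object per group, in first-seen order.
  return [...totals].map(([age, sumOfScore]) => ({ age, sumOfScore }));
}

// Invented rows, as they might come back from any data source.
const students = [
  { age: 10, score: 1.5 },
  { age: 11, score: 2 },
  { age: 10, score: 3 },
];
const scoreByAge = sumScoreByAge(students);
```

Letting the database aggregate is usually cheaper for large tables, since only the grouped rows cross the wire.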
If you are using Hasura, the explorer shows an "aggregate" field for each table (table_name_aggregate), so your query would look similar to the following:
query queryTable {
table_name {
field1
field2
}
table_name_aggregate {
aggregate { count }
}
}
In your results, you will see the total row count for the query:
"table_name_aggregate": {
  "aggregate": {
    "count": 9973
  }
}
This depends on whether you build the aggregate into your schema and are able to resolve the field.
Can you share what kind of GraphQL server you're running? Different languages have different implementations, as do different services (like Hasura, 8base, and Prisma).
Also, when you say "counts", I'm imagining a count of objects in a relation, such as:
query {
user(id: "1") {
name
summaries {
count
}
}
}
// returns
{
"data": {
"user": {
"name": "Steve",
"summaries": {
"count": 10
}
}
}
}
8base provides the count aggregate by default on relational queries.