How to extend a response object in GraphQL with linked contents? - graphql

I've been working with RESTful APIs. To serve a page, I needed to make lots of calls, so I started looking into GraphQL. I've read the GraphQL documentation but couldn't see exactly what I need.
Let me try to explain what I need to do:
I have Topic and Document models. We can assume I have 100 topics and 1000+ documents.
I need to list all topics together with the documents linked to each topic. Currently we make one call to get all topics, then a call per topic to get its documents.
In GraphQL, is there a smarter/better way to do that?
Topic {
  name: String,
  id: ref
}
Document {
  name: String,
  topic: Ref(Topic.id)
}
Expected response:
Response: [
  {
    name: "topic1",
    documents: [{name: "document1"}, {name: "document2"}]
  },
  {
    name: "topic2",
    documents: [{name: "documentX"}, {name: "documentY"}]
  },
  ...
]
And if I extend my requirements a bit:
There might be another type referenced by Topic.
Category {
  name: String,
  id: ref
}
Topic {
  ...
  category: Ref(Category.id)
}
So there might be multiple topics and multiple documents. Is there a way to get categories with the linked documents for each category in a single response? :)
Category1: [Doc1, Doc2, Doc3, Doc4, Doc5]

Category1
  Topic1: Doc1, Doc2
  Topic2: Doc3, Doc4, Doc5
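To make the shape concrete, here's a rough sketch of what the schema and a single query might look like (type and field names are only assumptions based on the models above):

type Category {
  id: ID!
  name: String
  topics: [Topic]
}
type Topic {
  id: ID!
  name: String
  documents: [Document]
}
type Document {
  name: String
}
type Query {
  topics: [Topic]
  categories: [Category]
}

query {
  categories {
    name
    topics {
      name
      documents {
        name
      }
    }
  }
}

Each nested field (Topic.documents, Category.topics) would be backed by its own resolver on the server, so the client sends one request instead of one call per topic.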

Spring JPA and GraphQL

Given the following graph
type Movie {
  name: String!
  actors: [Actor!]!
}
type Actor {
  name: String!
  awards: [Award!]!
}
type Award {
  name: String!
  date: String!
}
type Query {
  movies: [Movie!]!
}
I'd like to be able to run the following three types of queries as efficiently as possible:
Query 1:
query {
  movies {
    actors {
      awards {
        name
      }
    }
  }
}
Query 2:
query {
  movies {
    name
  }
}
Query 3:
query {
  movies {
    name
    actors {
      awards {
        date
      }
    }
  }
}
Please note, these are not the only queries I will be running, but I'd like my code to be able to pick the optimal path "automatically".
The rest of my business logic uses JPA. The data comes from three respective tables, each of which can have up to 40 columns.
I am not looking for code examples, but rather for a high-level structure describing the different elements of the architecture and their respective responsibilities.
Without further context and details of your DB schema, all I can do is give you general advice that you need to be aware of.
Most probably you will encounter the N+1 loading performance issue when executing a query that contains several levels of related objects and these objects are stored in different DB tables.
Generally there are two ways to solve it (rough sketches of both follow below):
Use DataLoader. Its idea is to defer the actual loading of each object to a moment when multiple objects can be batch-loaded together with a single SQL query. It also provides caching to further improve loading performance within the same query request.
Use the "look-ahead pattern" (refer to this for an example). Its idea is that when you resolve the parent object, you look ahead into the GraphQL query to analyse whether it requires the related children or not. If it does, you can use a JOIN SQL query to fetch the parent objects together with their children, so that when you resolve the children later they are already fetched and you do not need to fetch them again.
Also, if the number of objects in your domain can in theory be unbounded, you should consider implementing pagination behaviour for the query in order to restrict the maximum number of objects it can return.
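A minimal sketch of both ideas, assuming a Node/graphql-js setup purely for illustration (the same patterns exist for graphql-java via java-dataloader and the DataFetchingEnvironment selection set); helpers like actorsByMovieIds, loadMovies and loadMoviesWithActors are hypothetical:

const DataLoader = require('dataloader');

// 1) DataLoader: batch all per-movie actor lookups into one query.
//    (In a real app you would create the loader once per request.)
const actorLoader = new DataLoader(async (movieIds) => {
  const rows = await actorsByMovieIds(movieIds); // e.g. SELECT ... WHERE movie_id IN (...)
  // DataLoader expects one result entry per input key, in the same order.
  return movieIds.map((id) => rows.filter((r) => r.movieId === id));
});

const resolvers = {
  Movie: {
    // N movies now trigger 1 batched actors query instead of N separate ones.
    actors: (movie) => actorLoader.load(movie.id),
  },
  Query: {
    // 2) Look-ahead: inspect the requested selection set and only JOIN in
    //    actors when the incoming query actually asks for them.
    movies: (parent, args, context, info) => {
      const wantsActors = info.fieldNodes[0].selectionSet.selections.some(
        (sel) => sel.kind === 'Field' && sel.name.value === 'actors'
      );
      return wantsActors ? loadMoviesWithActors() : loadMovies();
    },
  },
};

The two approaches are alternatives: DataLoader keeps one query per level but batches it, while look-ahead collapses parent and children into a single JOIN up front.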

Query does not return some items from DynamoDB via GraphQL

May I please know why some items in DynamoDB are not being fetched by GraphQL?
When searching via the DynamoDB console interface, I can easily see and query the item, but when going through GraphQL some items are not showing up. Mind you, this isn't a connection problem, because I can query other items; it's just that a specific item is not being returned.
For example, if I query all Posts, it returns all posts in an array but that item is not among them. However, when I query the Post directly by its ID, it works fine.
Example code that is not working:
listPosts(filter: {groupID: {eq: "25"}}) {
  items {
    id
    content
  }
}
but when I do this, it is working well:
getPost(id: "c59ce7e9") {
id
content
}
I had this same issue and can share what I found and what worked for me.
The default resolver for the list operation has a limit: 20 built in:
{
  "version": "2017-02-28",
  "operation": "Scan",
  "filter": #if($context.args.filter) $util.transform.toDynamoDBFilterExpression($ctx.args.filter) #else null #end,
  "limit": $util.defaultIfNull($ctx.args.limit, 20),
  "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null)),
}
I imagine you could change this, or you could add a limit argument to your query like this:
listPosts(filter: {groupID: {eq: "25"}}, limit: 100) {
  items {
    id
    content
  }
}
The limit should be higher than the number of records.
You can see how this could become an issue, because it uses the Scan operation, which inspects every record for a match; that hurts performance. You could add pagination, or better, craft a proper query for this. You will need to look into pagination, relations and connections.
https://docs.aws.amazon.com/appsync/latest/devguide/designing-your-schema.html#advanced-relations-and-pagination
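For example, a paginated version of the list query (assuming the Amplify/AppSync-generated schema, which exposes nextToken on list results) would look roughly like this:

query {
  listPosts(filter: {groupID: {eq: "25"}}, limit: 100) {
    items {
      id
      content
    }
    # pass this value back as the nextToken argument to fetch the next page
    nextToken
  }
}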

Elasticsearch: document relationship

I'm building an Elasticsearch autocomplete-as-you-type feature.
I'm using features like ngrams and other things to build the analyzer I need.
Currently I'm breaking my head over indexing the following data.
Let's say I have a Payments type;
each document in this type looks like this:
{
  ..elastic metadata..
  paymentId: 123453425342,
  providerAccount: {
    id: 123456,
    firstName: Alex,
    lastName: Web
  },
  consumerAccount: {
    id: 7575757,
    firstName: John,
    lastName: Doe
  },
  amount: 556,
  date: 342523454235345 (some unix timestamp)
}
So basically this document represents not only the payment itself but also the payment's relationships: the two entities related to the payment.
A payment always has its provider and consumer.
I need this data in the payment document because I want to show it in the UI.
By indexing it like this, handling updates of a Consumer or Provider might be a big pain, because each time one of them changes its properties I have to update all the payments that reference that entity.
Another possible solution is to store only the ids of the consumers/providers and run one query on payments and then two queries on the entities to retrieve the needed fields, but I'm not sure about this because I'm making AJAX requests each time a character is entered, so the performance question comes up.
I have also looked into the parent/child relationship solution, which basically fits my case, but I wasn't able to figure out whether I can also retrieve the parent's (consumer/provider) fields while querying the child (payment).
What would you suggest?
Thanks!
Yes, you can retrieve the parent while querying the child using has_child.
Considering payment as the child and consumer as the parent, you can search all the consumers with:
GET /index_name/consumer/_search
{
  "query": {
    "has_child": {
      "type": "payment",
      "query": {
        // any query on the payment type
      },
      "inner_hits": {}
    }
  }
}
This will fetch all the consumers based on the query on the child, i.e. payment in your case.
inner_hits is what you are looking for: it will retrieve the matching children as well. It was introduced in Elasticsearch 1.5.0, so your version should be 1.5.0 or greater.
You can refer to https://www.elastic.co/blog/elasticsearch-1-5-0-released.
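Conversely, if you would rather query the payments (children) directly and pull the consumer (parent) fields back in the same response, a has_parent query with inner_hits should also work; a minimal sketch, assuming the same parent/child mapping:

GET /index_name/payment/_search
{
  "query": {
    "has_parent": {
      "parent_type": "consumer",
      "query": {
        "match_all": {}
      },
      "inner_hits": {}
    }
  }
}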
Your problem is not an issue. I suppose you want to freeze the data after the payment, right? So you don't need to update the account data in existing payment documents.
Further: parent/child is easy for updating, but less efficient for querying. For autocomplete, stick with your current mapping!

Can graphql return aggregate counts?

GraphQL is great and I've started using it in my app. I have a page that displays summary information and I need GraphQL to return aggregate counts. Can this be done?
You would define a new GraphQL type that is an object that contains a list and a number. The number would be defined by a resolver function.
On your GraphQL server you can define the resolver function and as part of that, you would have to write the code that performs whatever calculations and queries are necessary to get the aggregate counts.
This is similar to how you would write an object serializer for a REST API or a custom REST API endpoint that runs whatever database queries are needed to calculate the aggregate counts.
GraphQL's strength is that it gives the frontend more power in determining what data specifically is returned. Some of what you write in GraphQL will be the same as what you would write for a REST API.
There's no automatic aggregate function in GraphQL itself.
You can add a field called summary and calculate the totals in its resolve function.
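As a rough sketch of that summary-field idea, assuming a Node resolver map (the field names and the db helper below are made up):

// Schema excerpt:
// type Summary { totalCount: Int! }
// type Query { items: [Item!]!  summary: Summary! }

const resolvers = {
  Query: {
    items: () => db.query('SELECT * FROM items'),
    // The resolver runs whatever aggregate query it needs and returns an
    // object matching the Summary type.
    summary: async () => {
      const [{ count }] = await db.query('SELECT COUNT(*) AS count FROM items');
      return { totalCount: Number(count) };
    },
  },
};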
You should define a type for the aggregated data in GraphQL and a function that implements it. For example, if you want to run the following query:
SELECT age, sum(score) from student group by age;
You should define the data type that you want to return:
type StudentScoreByAge {
  age: Int
  sumOfScore: Float
}
and a GraphQL field with its resolver:
getStudentScoreByAge: [StudentScoreByAge]
async function getStudentScoreByAge() {
  // client here is e.g. a node-postgres client/pool
  const res = await client.query(
    "SELECT age, sum(score) AS sumOfScore FROM student GROUP BY age"
  );
  return res.rows;
}
... need graphql to return aggregate counts? Can this be done?
Yes, it can be done.
Does GraphQL do it automatically for you? No, because it does not know or care about where you get your data from.
How? GraphQL does not dictate how you fetch or mutate the data the user has queried. It's up to your implementation to get the requested aggregated data. You could get the aggregated data directly from your MongoDB and serve it back, or you could fetch all the data you need from your data source and do the aggregation yourself.
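For instance, with MongoDB the resolver could run the aggregation itself and simply return the result; a small sketch assuming the Node MongoDB driver (the collection, field and context names are made up):

const resolvers = {
  Query: {
    // Returns e.g. [{ _id: "homepage", count: 123 }, ...]
    eventCounts: (parent, args, { db }) =>
      db.collection('events')
        .aggregate([{ $group: { _id: '$event', count: { $sum: 1 } } }])
        .toArray(),
  },
};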
If you are using Hasura, in the explorer you can see an "aggregate" table name, so your query would look something like the following:
query queryTable {
  table_name {
    field1
    field2
  }
  table_name_aggregate {
    aggregate { count }
  }
}
In your results, you will see the total row count for the query
"table_name_aggregate": {
"aggregate": {
"count": 9973
}
This depends on whether you build the aggregator into your schema and are able to resolve the field.
Can you share what kind of GraphQL server you're running? Different languages have different implementations, as do different services (like Hasura, 8base, and Prisma).
Also, when you say "counts", I imagine a count of objects in a relation, such as:
query {
  user(id: "1") {
    name
    summaries {
      count
    }
  }
}
// returns
{
  "data": {
    "user": {
      "name": "Steve",
      "summaries": {
        "count": 10
      }
    }
  }
}
8base provides the count aggregate by default on relational queries.

Which of the following data structures is scalable in elasticsearch?

I want to build an event analytics system, where I can record and query events that a user has done, for example on a website.
My naive idea of the data model was simply a collection of event documents, each event including the userid, event type, and so on. So I thought something like this:
{ userid: Joe, event: homepage }
{ userid: Mike, event: homepage }
{ userid: Joe, event: productsPage }
{ userId: Joe, event: accountSettings }
{ userId: Joe, event: checkout }
etc
But now I'm struggling to figure out how to do some of the queries I'm most likely to want to run.
For example, I want to say "Give me a list of all users who have visited the homepage AND the products page AND the checkout page".
It seems to me I would need to do this in my application code rather than in Elasticsearch, with something like:
Step 1: select all users who have done 'homepage'
Step 2: select all users who have done 'products page'
Step 3: select all users who have done 'checkout page'
Step 4: build a list of only those users who appear in all 3 lists.
If I have a user base of 20 million users, I risk pulling huge lists of data into my application.
An alternative would be to have one document per user, so that Joe looks like:
{ userid: Joe, event: [ homepage, productsPage, accountSettings, checkout ] }
and so on.
But then that would involve updating this document every time the user did something. Since Elasticsearch writes a new record rather than updating in place, that would involve a horrendous amount of rewriting, given that each user might generate, say, 5000 events a year spread across different days. Not to mention rewriting the index.
Is there an idiomatic way I'm missing of organising a database by user that can handle regular updates to each user, and building indexes that allow fast querying of that data by multiple criteria, e.g. users who have done eventA AND eventB AND eventC?
Many thanks for all your help!
You can use Kibana to visualize data stored in Elasticsearch.
You can keep using this type of event document:
{ userid: Joe, event: homepage }
{ userid: Mike, event: homepage }
{ userid: Joe, event: productsPage }
{ userId: Joe, event: accountSettings }
{ userId: Joe, event: checkout }
etc
After storing your data in Elasticsearch, you can use Kibana and create visualizations that specify AND filters over the event field.
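If you also want the "homepage AND productsPage AND checkout" question answered directly by Elasticsearch rather than visually in Kibana, one possible sketch with the same flat per-event documents (assuming userid and event are exact/keyword fields) is a terms aggregation per user combined with a bucket_selector:

{
  "size": 0,
  "query": {
    "terms": { "event": ["homepage", "productsPage", "checkout"] }
  },
  "aggs": {
    "users": {
      "terms": { "field": "userid", "size": 10000 },
      "aggs": {
        "distinct_events": { "cardinality": { "field": "event" } },
        "did_all_three": {
          "bucket_selector": {
            "buckets_path": { "n": "distinct_events" },
            "script": "params.n == 3"
          }
        }
      }
    }
  }
}

Only the user buckets whose distinct event count is 3 survive the bucket_selector, i.e. users who did all three events.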
