Fetching the data optimally in GraphQL

How can I write the resolvers such that I can generate a database sub-query in each resolver, effectively combine all of them, and fetch the data at once?
For the following schema:
type Node {
  index: Int!
  color: String!
  neighbors(first: Int = null): [Node!]!
}
type Query {
  nodes(color: String!): [Node!]!
}
schema {
  query: Query
}
To perform the following query:
{
  nodes(color: "red") {
    index
    neighbors(first: 5) {
      index
    }
  }
}
Data store:
In my data store, nodes and neighbors are stored in separate tables. I want to write a resolver so that we can fetch the required data optimally.
If there are any similar examples, please share the details. (It would be helpful to get an answer in reference to graphql-java)

DataFetchingEnvironment provides access to sub-selections via DataFetchingEnvironment#getSelectionSet. This means, in your case, you'd be able to know from the nodes resolver that neighbors will also be required, so you could JOIN appropriately and prepare the result.
One limitation of the current implementation of getSelectionSet is that it doesn't provide info on conditional selections. So if you're dealing with interfaces and unions, you'll have to collect the sub-selection manually, starting from DataFetchingEnvironment#getField. This will very likely be improved in future releases of graphql-java.
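To illustrate the approach from the first paragraph, here is a minimal sketch of a nodes DataFetcher that inspects the selection set. It assumes a Node POJO mapped from the schema above; fetchNodes and fetchNodesWithNeighbors are hypothetical data-access helpers, not part of graphql-java:

import graphql.schema.DataFetcher;
import graphql.schema.DataFetchingEnvironment;
import graphql.schema.DataFetchingFieldSelectionSet;
import java.util.List;

DataFetcher<List<Node>> nodesFetcher = (DataFetchingEnvironment env) -> {
    String color = env.getArgument("color");
    DataFetchingFieldSelectionSet selection = env.getSelectionSet();
    if (selection.contains("neighbors")) {
        // The query also selects neighbors, so JOIN the neighbors table up front.
        return fetchNodesWithNeighbors(color); // hypothetical helper
    }
    return fetchNodes(color); // hypothetical helper, nodes table only
};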

The recommended and most common way is to use a data loader.
A data loader collects the info about which fields to load from which table and which WHERE filters to use.
I haven't worked with GraphQL in Java, so I can only give you directions on how you could implement this yourself; a sketch follows these steps.
Create an instance of your data loader and pass it to your resolvers as the context argument.
Your resolvers should pass the table name, a list of field names and a list of WHERE conditions to the data loader, and return a promise.
Once all the resolvers have executed, your data loader should combine those lists so that you end up with only one query per table.
You should remove duplicate field names and combine the WHERE conditions using the OR keyword.
After the queries have executed, you can return all of this data to your resolvers and let them filter it (since we combined the conditions using the OR keyword).
As an advanced feature, your data loader could apply the WHERE conditions before returning the data to the resolvers, so that they don't have to filter it themselves.
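A minimal sketch of that idea in Java, using CompletableFuture as the promise. Rows are plain Maps here, and runQuery stands in for whatever database access you use; the string-built WHERE clause is illustrative only (use parameterized queries in real code):

import java.util.*;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class BatchLoader {
    // What a resolver asks for: fields and a WHERE condition, plus the promise it awaits.
    record Request(Set<String> fields, String where,
                   CompletableFuture<List<Map<String, Object>>> promise) {}

    private final Map<String, List<Request>> pendingByTable = new HashMap<>();

    // Called by resolvers: register what they need, get a promise back.
    CompletableFuture<List<Map<String, Object>>> load(String table, Set<String> fields, String where) {
        var promise = new CompletableFuture<List<Map<String, Object>>>();
        pendingByTable.computeIfAbsent(table, t -> new ArrayList<>())
                      .add(new Request(fields, where, promise));
        return promise;
    }

    // Called once all resolvers have registered: one query per table.
    void dispatch() {
        pendingByTable.forEach((table, requests) -> {
            Set<String> fields = requests.stream()           // de-duplicate field names
                    .flatMap(r -> r.fields().stream())
                    .collect(Collectors.toSet());
            String where = requests.stream()                 // combine conditions with OR
                    .map(r -> "(" + r.where() + ")")
                    .collect(Collectors.joining(" OR "));
            List<Map<String, Object>> rows = runQuery(table, fields, where);
            // Hand the combined result to every resolver; each filters out its own rows.
            requests.forEach(r -> r.promise().complete(rows));
        });
        pendingByTable.clear();
    }

    private List<Map<String, Object>> runQuery(String table, Set<String> fields, String where) {
        // SELECT <fields> FROM <table> WHERE <where> -- stubbed out here
        return List.of();
    }
}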

Related

Group queries in GraphQL (not "group by")

In my app there are many entities which get exposed via GraphQL. All of those entities get resolvers, and those have many methods (I think they are called "fields" in GraphQL). Since there is only one Query type allowed, I get an "endless" list of fields belonging to many different contexts, e.g.:
query {
  newsRss(...)
  newsCurrent(...)
  userById(...)
  weatherCurrent(...)
  weatherForecast(...)
  # ... many more
}
As you can see, there are already 3 different contexts here: news, users and weather. Now I can go on and prefix all fields ([contextName]FieldName), as I did in the example, but the list gets longer and longer.
Is there a way to "group" some of them together, if they relate to the same context? Like so, in case of the weather context:
query {
  weather {
    current(...)
    forecast(...)
  }
}
Thanks in advance!
If you want to group them together, you need a type which contains all fields of the same context. Take weather as an example: you need a type which contains a currentWeather and a forecastWeather field. Does this concept make sense for your application, such that you can name it easily and users will not find it strange? If yes, you can change the schema to achieve your purpose.
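For example, a minimal sketch of such a schema (all type and field names are illustrative):

type CurrentWeather {
  temperature: Float
}
type WeatherForecast {
  days: [CurrentWeather!]!
}
type Weather {
  current: CurrentWeather
  forecast: WeatherForecast
}
type Query {
  weather: Weather
}

In most implementations the resolver for Query.weather can simply return an empty object, so that the resolvers for current and forecast take over from there.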
On the other hand, if all fields of the same context actually return the same type and just filter different things, you can consider defining arguments on the root query field to specify the condition you want to filter by, something like:
query {
  weather(type: CURRENT) { ... }
}
and
query {
  weather(type: FORECAST) { ... }
}
to query the current weather and the forecast weather respectively.
So it is a question about how you design the schema.
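A sketch of a schema for this second approach (again, names are illustrative):

enum WeatherType {
  CURRENT
  FORECAST
}
type WeatherReport {
  temperature: Float
}
type Query {
  weather(type: WeatherType!): WeatherReport
}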

How can I pass arguments to child fields in Apollo?

I'm trying to build a graphql interface that deals with data from different regions and for each region there's a different DB.
What I'm trying to accomplish is:
const typeDefs = gql`
  type Player {
    account_id: Int
    nickname: String
    clan_id: Int
    clan_info: Clan
  }
  type Clan {
    name: String
  }
`;
So right now I can request player(region, id), and this will pull up the player details; no issues there.
But the issue is that the clan_info field also requires the region from the parent, so the resolver would look like clan_info({clan_id}, region).
Is there any way to pass the region down from the parent to the child field? I know I can add it to the details of the player, but I would rather not, since there would be millions of records and every field counts.

graphql filter based on internal fragments (gatsbyJS)

Why is this not possible? It seems I have no access to any property accessed through an inline fragment such as ... on File.
codebox from gatsby-docs
{
  books: allMarkdownRemark(filter: {parent: {sourceInstanceName: {eq: "whatever"}}}) {
    totalCount
    edges {
      node {
        parent {
          ... on File {
            sourceInstanceName
          }
        }
      }
    }
  }
}
Error: Field is not defined by type NodeFilterInput
It's the resolver author's responsibility.
You can compare it to ordinary function arguments and return values; in GraphQL, both are strictly defined/typed.
In this case, for the query allMarkdownRemark you have
allMarkdownRemark(
  filter: MarkdownRemarkFilterInput
  limit: Int
  skip: Int
  sort: MarkdownRemarkSortInput
): MarkdownRemarkConnection!
... so the only possible arguments are filter, limit, skip and sort. The filter argument has a defined shape, too: it has to be of type MarkdownRemarkFilterInput. You can only use properties defined in that type for the filter argument.
This is by design; it is how the designer created the resolver, and it reflects their intentions about which arguments are handled and how.
It's like pagination: you don't use any of the result fields as arguments, since skip and limit work at the record level. Those arguments are not related to fields at all; they are used for some logic in the resolver. The filter argument is used for logic, too, but it's the developer's decision which filtering use cases to choose and cover.
It's impossible to cover all imaginable filters on all processed data layers and properties. For parent you can only use the children, id, internal and parent properties and their sub-properties (you can explore them in the playground).
Of course, it's not enough to extend the type definition to make it work with another argument; there also has to be code that handles it.
If you need other filtering logic, you can write your own resolver (or modify a forked Gatsby project) for your file types or another source.
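As a sketch of a common workaround (not from the original answer, and assuming the standard gatsby-source-filesystem plus gatsby-transformer-remark setup): sourceInstanceName is directly filterable when you start from allFile instead, e.g.:

{
  books: allFile(filter: {sourceInstanceName: {eq: "whatever"}}) {
    totalCount
    edges {
      node {
        sourceInstanceName
        childMarkdownRemark {
          # MarkdownRemark fields go here, e.g. frontmatter { title }
          id
        }
      }
    }
  }
}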

Apollo query does not return cached data available using readFragment

I have 2 queries: getGroups(): [Group] and getGroup($id: ID!): Group. One component first loads all groups using getGroups(), and later on a different component needs to access a specific Group's data by ID.
I'd expect that Apollo's normalization would already have the Group data in the cache and would use it when the getGroup($id: ID!) query is executed, but that's not the case.
When I set the cache-only fetchPolicy, nothing is returned. I can access the data using readFragment, but that's not as flexible as just using a query.
Is there an easy way to make Apollo return the cached data from a different query as I would expect?
It's pretty common to have a query field that returns a list of nodes and another that takes an id argument and returns a single node. However, deciding what specific node or nodes are returned by a field is ultimately part of your server's domain logic.
As a silly example, imagine you had a field like getFavoriteGroup(id: ID!): you may have the group with that id in your cache, but that doesn't necessarily mean it should be returned by the field (it may not be favorited). Any number of factors (other arguments, execution context, etc.) might affect which node(s) are returned by a field. As a client, it's not Apollo's place to make assumptions about your domain logic.
However, you can effectively duplicate that logic by implementing query redirects.
import { InMemoryCache } from 'apollo-cache-inmemory';
import { toIdValue } from 'apollo-utilities';

const cache = new InMemoryCache({
  cacheRedirects: {
    Query: {
      // Resolve getGroup(id) from the normalized Group already in the cache
      getGroup: (_, args) =>
        toIdValue(cache.config.dataIdFromObject({ __typename: 'Group', id: args.id })),
    },
  },
});

How to create an output schema which has nested bags in Pig

I am trying out Pig UDFs and have been reading about them. While the online content was helpful, I am still not sure whether I understand how to create a complex output schema with nested bags.
Please help. The requirement is as follows: say, for example, that I am analyzing e-commerce orders data. An order can have multiple products ordered in it.
I have the product-level data grouped at an order level; this is the input to my UDF. So the grouped data at an order level, containing information about the products in each order, is my input.
InputSchema:
(grouped_at_order, {
  (input_column_values_at_product1_level),
  (input_column_values_at_product2_level)
})
I will be computing metrics both at an order level and at a product level in the UDF. For example, sum(products) is an order-level metric, and the color of each product is a product-level metric. So, for each row grouped at an order level and sent to the UDF, I want to compute the order-level and product-level metrics.
Expected OutputSchema:
{
  { (orders, (computed_values_at_order_level)) },
  { (productlevel,
      {
        (computed_values_at_product1_level),
        (computed_values_at_product2_level),
        (computed_values_at_product3_level)
      }
    )
  }
}
The objective, then, is to persist the data at order level and product level in two separate output tables from Pig.
Is there a better way of doing the same?
As @maxymoo said, before returning nested data from a UDF, I would first check whether I really need it.
Anyway, if you do, the solution is not complicated but painful. You just create a schema, add the fields, then create a schema for the tuple, add the fields or the sub-bags into it, and so on:
@Override
public Schema outputSchema(Schema input) {
    // Schema for the order-level fields
    Schema statsOrderLevel = new Schema();
    statsOrderLevel.add(new FieldSchema("value", DataType.CHARARRAY));
    // Wrap those fields in a tuple schema
    Schema statsOrderLevelTuple = new Schema();
    statsOrderLevelTuple.add(new FieldSchema(null, statsOrderLevel, DataType.TUPLE));
    // Wrap the tuple schema in a bag
    Schema statsOrderLevelBag = new Schema();
    statsOrderLevelBag.add(new FieldSchema("stats", statsOrderLevelTuple, DataType.BAG));
    [...]
}
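For completeness, a sketch of how a matching exec(...) might build the nested structure with Pig's TupleFactory and BagFactory (the field value is illustrative):

import java.io.IOException;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

@Override
public Tuple exec(Tuple input) throws IOException {
    TupleFactory tf = TupleFactory.getInstance();
    BagFactory bf = BagFactory.getInstance();

    // One order-level tuple, matching the CHARARRAY "value" field above
    Tuple orderStats = tf.newTuple(1);
    orderStats.set(0, "some-order-level-value");

    // Put it into the "stats" bag declared in outputSchema
    DataBag statsBag = bf.newDefaultBag();
    statsBag.add(orderStats);

    Tuple output = tf.newTuple(1);
    output.set(0, statsBag);
    return output;
}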