GraphQL federation vs schema stitching: when to pick one over the other

I'm new to both concepts so excuse me if it's opinion-based. Currently, I'm looking at Apollo Federation and schema stitching provided by the graphql-tools package, though I guess it applies to similar packages. Could something like a table be created describing certain requirements/conditions to prefer one over the other?

Apollo's GraphQL Federation and "schema stitching" both accomplish a similar goal: unify multiple GraphQL APIs under a single GraphQL API.
Based on my understanding, the main differences between them are:
In Apollo's Federation, subgraph services own the logic for linking shared types together; in schema stitching, this logic is handled by the gateway.
Apollo's Federation distributes the ownership of subgraphs to the individual service teams; stitching assumes centralized responsibility of the full schema.
Apollo's Federation is tightly coupled to the Apollo ecosystem; GraphQL Tools' schema stitching is a standalone open-source library that is not tied to one vendor's platform.
For more details, I'd recommend reading https://product.voxmedia.com/2020/11/2/21494865/to-federate-or-stitch-a-graphql-gateway-revisited.
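To make the first difference concrete, here is a minimal sketch in plain TypeScript with no GraphQL library involved. Every name in it (`usersService`, `reviewsService`, `federatedReview`, `stitchedReview`) is hypothetical; it is neither the Apollo nor the GraphQL Tools API, and it only illustrates where the linking logic lives in each approach.

```typescript
// Hypothetical in-memory data; not a real Apollo or graphql-tools API.
type User = { id: string; name: string };
type Review = { id: string; body: string; authorId: string };

const users: User[] = [{ id: "u1", name: "Ada" }];
const reviews: Review[] = [{ id: "r1", body: "Great", authorId: "u1" }];

// Federation style: the users service itself knows how to turn a
// reference such as { __typename: "User", id } into a full User.
const usersService = {
  resolveUserReference(ref: { __typename: "User"; id: string }): User | undefined {
    return users.find((u) => u.id === ref.id);
  },
};

// The reviews service only returns a reference to the shared type.
const reviewsService = {
  getReview(id: string) {
    const r = reviews.find((x) => x.id === id);
    return r && { id: r.id, body: r.body, author: { __typename: "User" as const, id: r.authorId } };
  },
};

// A generic gateway only forwards the reference to the owning service.
function federatedReview(id: string) {
  const review = reviewsService.getReview(id);
  if (!review) return undefined;
  return { ...review, author: usersService.resolveUserReference(review.author) };
}

// Stitching style: services expose plain fields (authorId) and plain
// lookups, and the gateway owns the logic that links them.
const plainReviewsService = { getReview: (id: string) => reviews.find((r) => r.id === id) };
const plainUsersService = { getUser: (id: string) => users.find((u) => u.id === id) };

function stitchedReview(id: string) {
  const review = plainReviewsService.getReview(id);
  if (!review) return undefined;
  // The linking logic lives here, in the gateway.
  return { id: review.id, body: review.body, author: plainUsersService.getUser(review.authorId) };
}
```

Both functions return the same shape for the client. What differs is which team has to change code when the link between `Review` and `User` changes.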

Related

Best way to consume and combine other public Apollo GraphQL APIs: federation vs schema stitching

I'm developing a backend-for-frontend (BFF) solution for a web client with Apollo GraphQL.
Use-case background: our organization has a general-use GraphQL API that another team owns, and my team is creating another GraphQL server to consume it. This would allow us to offload heavy computations from the client. We are also hoping to have a combined schema so that the general-use API can be accessed, when needed, from the same endpoint as our BFF.
My questions are:
Apollo Federation is recommended for combining schemas; however, the documentation strongly recommends that federated servers be kept private behind a firewall due to the power of the _entities field. Why is that, and would it be a concern if the data is not sensitive user data? We'd prefer to keep all servers public.
Apollo schema stitching may actually fit our use case better, since its documentation does not say that any API must be private. It may also make DataSource logic more streamlined for the computations we need to make. However, most documentation I see is about migrating FROM schema stitching. Is schema stitching going to be deprecated in the near future?
Is there another option I have missed that would fit the bill better?

Hiding your federated services is not a must; it's more of a recommendation.
If you have a single combined schema, why would you expose the individual, non-combined schemas?
Also, if you allow clients to interact with individual services, logging and rate limiting become more difficult to manage.
When you combine multiple GraphQL schemas, they probably also refer to each other, or service A adds fields to a type owned by service B, and so on. Queries for these fields will be partly broken if clients go to the individual services directly (they will still work, but the cross-service fields will be missing).
And yes, schema stitching is being deprecated. Apollo recommends that developers use Apollo Federation instead.
One thing to note, though: Apollo Federation does not support subscriptions yet, so if subscriptions are a must right now, you should stick with schema stitching until subscription support is released.

To answer your questions:
Why is that and would it be a concern if the data is not sensitive user data?
The _entities query is very powerful, as you said. For example, if you're taking access control shortcuts it can be dangerous. Assume you have private users. If you have auth checks on the Query.user(id: ID), in a non-federated query, you don't have to necessarily put extra auth checks on User.homeAddress, since you can't get to the nested field without having access to the top-level field. Now let's pretend your address service extends the user object. Now I can make this call to the address service endpoint (not through the gateway):
query ($_representations: [_Any!]!) {
  _entities(representations: $_representations) {
    ... on User {
      homeAddress {
        street1
        street2
        city
        state
        postalCode
      }
    }
  }
}
{
  "_representations": [{
    "__typename": "User",
    "id": "user-id-whatever"
  }]
}
And if you did "lazy access control", you're now not protected, because the request never goes through the users service for permission first. As an extension of this, you can see that it depends on whether you're dealing with "non-sensitive" data or "unprotected" data.
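As a rough illustration of the "lazy access control" problem, the sketch below contrasts a field resolver that trusts a check done elsewhere with one that repeats the check where the data lives. It is plain TypeScript, not Apollo's API; `Viewer`, `canViewUser`, `lazyHomeAddress`, and `guardedHomeAddress` are hypothetical names, and the permission rule is an assumption made up for the example.

```typescript
// Hypothetical types and helpers, for illustration only.
type Viewer = { id: string; isAdmin: boolean } | null;
type Address = { street1: string; city: string };

const addresses: Record<string, Address> = {
  "user-1": { street1: "1 Main St", city: "Springfield" },
};

// Assumed permission rule: viewers may see their own data, admins may see all.
function canViewUser(viewer: Viewer, userId: string): boolean {
  return viewer !== null && (viewer.isAdmin || viewer.id === userId);
}

// "Lazy" resolver: assumes Query.user already checked access.
// It is reachable directly through _entities, so it leaks data.
function lazyHomeAddress(user: { id: string }, _viewer: Viewer): Address | undefined {
  return addresses[user.id];
}

// Guarded resolver: repeats the check at the field that holds the data,
// so it applies no matter how the User representation was reached.
function guardedHomeAddress(user: { id: string }, viewer: Viewer): Address | undefined {
  if (!canViewUser(viewer, user.id)) {
    throw new Error("Not authorized to view this user's address");
  }
  return addresses[user.id];
}
```

With the guarded version, an anonymous call that supplies `{ id: "user-1" }` as a representation is rejected, whether it arrives through the gateway or directly at the address service.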
On deprecation: schema stitching was deprecated for a while. It appears that they've un-deprecated it, since I can no longer see deprecation warnings where I used to see them, but as far as I know, it's "still going away".
On other options: not really, but I would also recommend Federation. Honestly, I've run both in production for a long time, I've been a big fan of Federation, and I would recommend it for most cases. The only real difference between federation and stitching is that with stitching you have to expose "extra fields" from your backing services (e.g. to add a user to an object, the service would have to expose userId, which you could then use to stitch a User onto it). Federation gets around that by exposing _entities, but it's basically the same thing, and Federation does it for you, so you don't have to build the delegation.

GraphQL, Apollo, Relay … how do they fit together?

When trying to get started with GraphQL, you meet a lot of new terms: Some are related to the concept of GraphQL itself (mutations, subscriptions, …), but there is also an entire ecosystem around it, which – unfortunately – is not always separated clearly from GraphQL itself. However, I find it quite hard to tell where one thing ends, and where the other one starts, and what the differences are, and what is needed when.
So, to name a few of these terms:
GraphQL
Apollo
Apollo Client
Relay
Can you explain maybe in a few sentences what these things (except GraphQL) are, what they are good for, and how they relate to each other (or don't)? And, which important tools / concepts are missing here?

GraphQL The language
GraphQL is a query language. It has a specification that defines the language, schemas and also the execution of GraphQL queries. Learning these things is a great place to start and completely programming language agnostic.
GraphQL implementations
Then there are different GraphQL implementations in different languages that allow you to create a schema and describe how the query resolves to values. Usually these implementations validate the query against the schema that you have defined and take over the execution. Pretty much all of the JavaScript ecosystem uses GraphQL.js but there are many more implementations in other languages.
GraphQL Servers
GraphQL is also transport layer agnostic. That means the GraphQL implementations usually don't come with an HTTP server. But since we often use HTTP to make GraphQL queries, there are libraries that build on these implementations and provide an easy way to create an HTTP server on top (e.g. by providing a middleware for an HTTP framework or by shipping a whole server). I think in JavaScript pretty much everyone uses Apollo Server because it brings some more features and it integrates smoothly with the Apollo ecosystem and the services offered by Apollo the company.
Apollo Server has also very much popularized the SDL (Schema Definition Language) approach to defining a GraphQL API. With the SDL approach, the GraphQL schema is not created using code but by writing the schema in a special language (also part of the GraphQL spec) and defining the individual resolver functions separately from that definition. This gets you started quickly, but my feeling is that it is not very popular for large APIs. You can, however, simply pass your GraphQL.js schema to Apollo Server.
GraphQL Clients
When we have a server running, we can make queries to the server using a simple HTTP client like the fetch function of the browser. That works pretty well, but the power of GraphQL really shines when we use a client that supports caching and automatically fetches queries when they are needed. This way we can reach the promised land of declarative data fetching / data dependencies. Facebook has published their own client library that is designed for the unique requirements of a large web enterprise. The library is called Relay, and the newer version (which breaks with the older one) is usually referred to as Relay Modern. But Relay is relatively complicated and needs a specific build chain, so GraphQL became really interesting when Apollo released a lightweight client alternative known as Apollo Client. Apollo Client has developed a lot over the years and now supports a lot of configuration. Also, react-apollo allows you to use Apollo Client with React, whereas Relay is specifically built for React. With Apollo gaining quite some weight over the years, the folks at Formidable Labs have created urql.
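The caching idea that dedicated clients build on can be sketched in a few lines. This is not how Apollo Client, Relay, or urql are implemented (they offer normalized or document caches, cache invalidation, and much more); `createCachedClient` and `Fetcher` are hypothetical names, and the transport is injected so the sketch runs without a network. A real fetcher would typically POST `{ query, variables }` as JSON to the GraphQL endpoint.

```typescript
// Hypothetical minimal client; real GraphQL clients do far more than this.
type Variables = Record<string, unknown>;
type Fetcher = (query: string, variables: Variables) => Promise<unknown>;

function createCachedClient(fetcher: Fetcher) {
  // Store promises so that concurrent identical requests share one fetch.
  const store = new Map<string, Promise<unknown>>();
  return {
    query(query: string, variables: Variables = {}): Promise<unknown> {
      // Cache key: the query text plus its serialized variables.
      // (JSON.stringify is key-order sensitive; good enough for a sketch.)
      const key = query + "\n" + JSON.stringify(variables);
      let entry = store.get(key);
      if (!entry) {
        entry = fetcher(query, variables);
        store.set(key, entry);
      }
      return entry;
    },
  };
}
```

Asking for the same query with the same variables twice reaches the fetcher once; changing the variables produces a new cache entry.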
Conclusion
You can use all of these technologies together. Many people simply choose to use the Apollo ecosystem as a whole, which is probably a solid choice. If you have used Redux before, you will probably feel at home using Apollo Client or urql. If you are building a large app with a performance focus, you should consider Relay and understand what constraints this puts on the way you build the GraphQL schema.

What is the best and proper way to use Hasura as a Data Access Layer

I want to use Hasura only as a Data Access Layer behind a NestJS GraphQL server and keep all the benefits of Hasura, especially the real-time feature with subscriptions.
The idea is to build a more customised API and handle all the business logic before interacting with Hasura.
By doing this, do I have to handle security access myself in the NestJS layer, since I have to connect to the Hasura server with x-hasura-admin-secret from the NestJS server?
Do you think it is a good approach to use Hasura this way?
Are there any other alternatives that use Hasura as a data access layer in a scalable architecture?
Thanks

Of course you are responsible for the security of the NestJS layer, but I guess you mean authentication. If that's true, then it all depends on what authentication you are planning to use.
If you wish to implement authentication using either JWT or webhooks (see this article for explanations), I would suggest letting Hasura take care of it; your additional layer will then just forward the requests to Hasura without needing to provide the X-Hasura-Admin-Secret.
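A minimal sketch of that forwarding idea, assuming Hasura is configured for JWT authentication: the intermediate layer passes the caller's Authorization header through so that Hasura can apply its own permission rules, and it never attaches the admin secret. `buildHasuraHeaders` and `HeaderMap` are hypothetical names, not part of NestJS or Hasura.

```typescript
// Hypothetical helper; assumes Hasura is set up to verify JWTs itself.
type HeaderMap = Record<string, string>;

function buildHasuraHeaders(incoming: HeaderMap): HeaderMap {
  // Header names are case-insensitive, so normalize before looking up.
  const entry = Object.entries(incoming).find(
    ([name]) => name.toLowerCase() === "authorization"
  );
  if (!entry) {
    throw new Error("Missing Authorization header");
  }
  // Forward only what Hasura needs. The admin secret is deliberately
  // not added: it would bypass Hasura's per-role permission checks.
  return {
    "Content-Type": "application/json",
    Authorization: entry[1],
  };
}
```

The resulting headers would go on the HTTP request that the NestJS layer sends to Hasura's GraphQL endpoint once its own business logic has run.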
From what I can tell, people are adding a "Hasura layer" on top of their already existing, or public, GraphQL APIs, using Hasura's Remote Schemas feature to accomplish schema stitching.
Can we know more about why you wish to add the layer? I am pretty sure Hasura offers ways of defining custom queries and aggregate queries, and in my experience it provides ample queries/mutations. Plus it offers other features that you will probably find interesting (subscriptions, events...)

Why use Prisma in a backend environment?

After learning about GraphQL and using it in a few projects, I finally wanted to give Prisma a go. It promises to eliminate the need for a database and it generates a GraphQL client and a working database from the GraphQL Schema. So far so good.
But my question is: A GraphQL client to me really only seems useful for a client (prevent overfetching, speed up pages, React integrations, ...). Prisma however does not eliminate the need for business logic, and so one would end up using the generated client library in Node.js, just to reexport a lot of the functionality in yet another GraphQL server to the actual client.
Why should I prefer Prisma over a custom database solution? Is there a thought behind having to re-expose a lot of endpoints to the actual client?

I work at Prisma and would love to clarify this!
Here's a quick note upfront: Prisma is not a "GraphQL-as-a-Service" tool (in the way that Graphcool, AppSync or Hasura are). The Prisma client is not a "GraphQL client", it's a database client (similar to an ORM). So, the reason for not using the Prisma client on the frontend is the same as for why you wouldn't use an ORM or connect to the DB directly from the frontend.
It promises to eliminate the need for a database and it generates a GraphQL client and a working database from the GraphQL Schema. So far so good.
I'm really curious to hear where exactly you got this perception from! We're well aware that we need to improve our communication about the value that Prisma provides and how it works. What you've formulated there is an extremely common misconception about Prisma that we want to prevent in the future. We're actually planning to publish a blog post about this exact topic next week, hopefully that will clarify a lot.
To pick up the concrete points:
Prisma doesn't eliminate the need for a database. Similar to an ORM, the Prisma client is used to simplify database access. It also makes database migrations easier with a declarative data modelling and migrations approach (we're actually currently working on large improvements to our migration system, you can find the RFC for it here).
Another major benefit of Prisma is the upcoming Prisma Admin, a data management tool. The first preview for that will be available next week.

I had similar questions when I started learning GraphQL. This is what I learned and realised after using it.
Prisma acts as a proxy for your database, providing you with a ready-to-use GraphQL API that allows you to filter and sort data, along with some custom types like DateTime which are not part of GraphQL and which you would otherwise have to implement yourself. It's not a GraphQL server, just a layer between your database and backend server, like an ORM.
It covers almost all the possible use cases that you might have for a data model, with all the CRUD operations pre-defined in a schema along with subscriptions, so you don't have to do all that yourself and can focus more on the business logic.
It also removes the need to write different queries for different databases such as SQL or MongoDB, acting as a layer that transforms its query language into actual database queries.
You can use the API (GraphQL) server to expose only the desired schema to the client rather than everything. Since GraphQL queries can get highly nested, implementing that yourself may be difficult and tricky and may also lead to performance issues, which is not the case with Prisma as it handles everything itself.
You can check out this article for more info.

What is the underlying backend of graphql?

I sort of understand how the GraphQL engine works, with its querying ability etc.
How does GraphQL actually connect to backend datastores like PostgreSQL?
Is this a Node.js application, or could a more scalable backend written in Java/Go be used by GraphQL?
Sorry I'm not sure I understand the various components required to use graphql and hoping someone could explain that to me.

If I understand it correctly, you define the back-end for your GraphQL implementation yourself. See here for supported languages: http://graphql.org/code/
One of the primary things GraphQL does is allow the client, rather than the API, to define what gets returned. You build out your API with the GraphQL libraries and define schemas to map your data logically.
Here's a good article that shows how this can be done with React for the client and Node for the back-end: https://www.sitepoint.com/graphql-overview/
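To sketch how the pieces connect: a GraphQL server is ordinary backend code in which resolver functions fetch data from whatever store you choose, and the executor returns only the fields the client asked for. The example below is a simplified stand-in written without any GraphQL library (a real implementation such as GraphQL.js also parses and validates the query); `Db`, `resolvers`, and `executeUserQuery` are hypothetical names, and the in-memory `db` stands in for something like a PostgreSQL client.

```typescript
// Hypothetical datastore interface; in practice this could wrap a PostgreSQL client.
interface Db {
  findUser(id: string): { id: string; name: string; email: string } | undefined;
}

// In-memory stand-in so the sketch runs without a database.
const db: Db = {
  findUser: (id) =>
    id === "1" ? { id: "1", name: "Ada", email: "ada@example.com" } : undefined,
};

// Resolvers are plain functions that fetch the data for a field.
const resolvers = {
  user: (args: { id: string }, ctx: { db: Db }) => ctx.db.findUser(args.id),
};

// Tiny stand-in for an executor: call the resolver, then keep only
// the fields named in the client's selection.
function executeUserQuery(id: string, selection: string[], ctx: { db: Db }) {
  const user = resolvers.user({ id }, ctx);
  if (!user) return null;
  const source = user as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  for (const field of selection) {
    if (field in source) result[field] = source[field];
  }
  return result;
}
```

Calling `executeUserQuery("1", ["name"], { db })` yields only the `name` field even though the store returned more. The same resolver shape exists in the GraphQL libraries for other languages, so the backend does not have to be Node.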
