I wanted to test the response times of a GraphQL endpoint and a RESTful endpoint, as I have never used GraphQL before and I am about to use it in my next Laravel project.
I am using the Lighthouse PHP package to serve a GraphQL endpoint from my Laravel app, and I have also created a RESTful endpoint.
Both endpoints (GraphQL and RESTful) are intended to fetch all users (250 users) from my local database.
Based on my tests of both endpoints in Postman, the RESTful endpoint responds faster than the GraphQL endpoint.
Why does the GraphQL endpoint's response take more time than the RESTful one when both endpoints return the same data?
GraphQL endpoint result for GET request (response time: 88ms)
GraphQL endpoint result for POST request (response time: 88ms)
RESTful endpoint result (response time: 44ms)
There's no such thing as a free lunch.
GraphQL offers a lot of useful features, but those same features invariably incur some overhead. While a REST endpoint can effectively pull data from some source and regurgitate it back to the client, even for a relatively small dataset, GraphQL will have to do some additional processing to resolve and validate each individual field in the response. Not to mention the processing required to parse and validate the request itself. And this overhead only gets bigger with the size of the data returned.
If you were to introduce additional features to your REST endpoint (request and response validation, support for partial responses, ability to alias individual response fields, etc.) that mirrored GraphQL, you would see the performance gap between the two shrink. Even then, though, it's still somewhat of an apples and oranges comparison, since a GraphQL service will go through certain motions simply because that's what the spec says to do.
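To make that overhead concrete, here is a minimal, illustrative sketch (not Lighthouse's or any real server's internals): a REST handler can serialize the whole result set in one pass, while a GraphQL-style executor invokes a resolver for every field of every object, so the work grows with both the number of rows and the number of requested fields.

```javascript
// 250 fake users, mirroring the dataset in the question.
const users = Array.from({ length: 250 }, (_, i) => ({
  id: i + 1,
  name: `User ${i + 1}`,
  email: `user${i + 1}@example.com`,
}));

// REST-style: one serialization pass over the data.
function restHandler(data) {
  return JSON.stringify(data);
}

// GraphQL-style: resolve (and count) each requested field individually.
let resolverCalls = 0;
function executeSelection(objects, fields) {
  return objects.map((obj) =>
    Object.fromEntries(
      fields.map((f) => {
        resolverCalls++; // one resolver invocation per field per object
        return [f, obj[f]];
      })
    )
  );
}

restHandler(users); // a single pass, no per-field work
executeSelection(users, ["id", "name", "email"]);
console.log(resolverCalls); // 750 — 250 users × 3 fields
```

A real GraphQL server also parses and validates the query document before any of this, which is exactly the extra work a bare REST route skips.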
TL;DR: Your REST example is simpler and does less work.
Lighthouse creates an AST by parsing the GraphQL request and your schema. It then processes all the directives and so on to figure out what you are trying to do. It also has to validate your query against the schema to see whether it can actually be run.
Depending on how you have defined things in your application, there are a lot of steps the request passes through. This can be reduced in several ways: the parsing of your GraphQL schema can be cached, you could cache the result, or you could use deferred fields (which probably won't speed up this example). You can read more about this in the performance section of the docs.
You are not specifying how your REST endpoint is set up, or whether you are using some kind of REST standard that also has to parse the data. The more features you add, the more code there is to run through, hence a slower response.
As of Lighthouse v4, we have made significant performance increases by lazy-loading the minimally required fields and types from the schema. That turns out to bring about a 3x to 10x performance increase, depending on the size of your schema.
You probably still won't beat a single REST endpoint for such a simple query. Lighthouse will begin to shine on more heavily nested queries that join across multiple relationships.
Try enabling OPcache on the server. This decreased my GraphQL response time from 200 ms to 20 ms.
The Apollo Server documentation states that batching and caching should not be used together with REST API data sources:
Most REST APIs don't support batching. When they do, using a batched endpoint can jeopardize caching. When you fetch data in a batch request, the response you receive is for the exact combination of resources you're requesting. Unless you request that same combination again, future requests for the same resource won't be served from cache.
We recommend that you restrict batching to requests that can't be cached. In these cases, you can take advantage of DataLoader as a private implementation detail inside your RESTDataSource [...]
Source: https://www.apollographql.com/docs/apollo-server/data/data-sources/#using-with-dataloader
I'm not sure why they say: "Unless you request that same combination again, future requests for the same resource won't be served from cache.".
Why shouldn't future requests be loaded from the cache again? After all, there are two caching layers here. The DataLoader batches requests and memoizes, via a per-request cache, which objects have been requested, returning the same object from its cache if it is requested multiple times within a single request.
And there is a second-level cache that caches individual objects across multiple requests (or at least it could be implemented so that it caches the individual objects, not the whole result set).
Wouldn't that ensure that future requests are served from the second-layer cache even if the whole request changes but includes some of the objects that were requested in a previous request?
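The two layers the question describes can be sketched in a few lines. All names here are illustrative, and a plain Map stands in for a real shared cache such as Redis:

```javascript
// Layer 2: a cross-request, per-object cache (stand-in for e.g. Redis).
const objectCache = new Map();

function fetchFromApi(id) {
  return { id, source: "api" }; // stand-in for a real REST call
}

// Layer 1: a per-request memo, created fresh for every incoming request.
function makeRequestLoader() {
  const perRequest = new Map();
  return function load(id) {
    if (perRequest.has(id)) return perRequest.get(id); // same request, same object
    let obj = objectCache.get(id); // layer 2: objects cached across requests
    if (!obj) {
      obj = fetchFromApi(id);
      objectCache.set(id, obj);
    }
    perRequest.set(id, obj);
    return obj;
  };
}

// Request 1 asks for objects {1, 2}; request 2 asks for {2, 3}.
const load1 = makeRequestLoader();
load1(1); load1(2);
const load2 = makeRequestLoader();
load2(2); load2(3);
// Object 2 is served from the cross-request cache even though the
// combination of requested objects changed between the two requests.
```

This is exactly the scenario the question raises: per-object caching survives a change in the requested combination, whereas caching whole batch responses by URL does not.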
Many REST APIs implement some sort of request caching for GET requests based on URLs. When you request an entity from a REST endpoint a second time, the result can be returned faster.
For example, let's imagine a fictional API, "Weekend City Trip".
Your GraphQL API fetches the three largest cities around you and then checks the weather in these cities on the weekend. In this fictional example you receive two requests. The first request is from someone in Germany. You find the three largest cities around them: Cologne, Hamburg and Amsterdam. You can now call the weather API either in a batch or one by one.
/api/weather/Cologne
/api/weather/Hamburg
/api/weather/Amsterdam
or
/api/weather/Cologne,Hamburg,Amsterdam
The next person is in Belgium and we find Cologne, Amsterdam and Brussels.
/api/weather/Cologne
/api/weather/Amsterdam
/api/weather/Brussels
or
/api/weather/Cologne,Amsterdam,Brussels
Now as you can see, without batching we have requested some URLs twice. The API provider can use a CDN to return these results quickly and avoid straining their application infrastructure. And since you are probably not the only one using the API, all of these URLs might already be cached in the first place, meaning you will receive responses much faster. The number of possible batch endpoints, by contrast, grows massively with the number of cities offered: if the API provides only 1000 cities, there are 166,167,000 possible combinations when batching three cities. The chance that someone else has already requested your exact combination of three cities is therefore rather low.
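The combination count above checks out: an unordered batch of three distinct cities out of 1000 is C(1000, 3). A quick sketch to verify:

```javascript
// Binomial coefficient C(n, k): number of unordered k-element subsets.
function choose(n, k) {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i; // exact at each step for these sizes
  }
  return result;
}

console.log(choose(1000, 3)); // 166167000
```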
Conclusion
The caching is really just on the API provider side but could greatly benefit your response times as a consumer. Often, GraphQL is used as an API gateway to your own REST services. If you don't cache your services, it can be worth it to use batching in that case.
I read somewhere that queries are only for GET requests and can't handle a request body. But when I tried handling a mutation in a query, it just worked! If that's so, what's the use of mutations?
P.S. Many websites say mutations can be used to perform CRUD operations. But I don't have any data store as such; all my GET/POST/PUT requests fetch data and are REST APIs. How should I utilize the power of mutations then?
GraphQL is a separate protocol. It does not depend on HTTP operations such as POST, PUT, or DELETE. So, in GraphQL, POST, PUT, or DELETE does not make sense. Instead, GraphQL has its own set of operations. Namely, Query, Mutation, and Subscription.
The Query operation is used to retrieve data from a GraphQL server, and the Mutation is used to mutate data. Subscription is used to continuously retrieve the data.
How a GraphQL endpoint handles these operations solely depends on the implementers of those endpoints. It can be either updating a database or changing files in the server. As far as GraphQL is concerned, a Mutation operation is intended to have side effects, but not necessarily. A Query operation on the other hand should not do anything like that. It should only read the data. But if the developers want to violate these rules, they can.
However, most of the GraphQL implementations use HTTP as the underlying network protocol since the GraphQL spec does not specify a network protocol. So, internally, GraphQL servers will handle requests using the HTTP GET and POST methods. But they don't have any difference from GraphQL's point of view.
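The convention described in graphql.org's "Serving over HTTP" guide can be sketched as a small guard (the function name here is made up for illustration): queries may arrive via GET or POST, while mutations should only be accepted via POST.

```javascript
// Sketch of the usual HTTP transport convention for GraphQL servers:
// POST may carry any operation; GET is restricted to queries, since
// GET requests are expected to be safe and cacheable.
function isOperationAllowed(httpMethod, operationType) {
  if (httpMethod === "POST") return true;
  if (httpMethod === "GET") return operationType === "query";
  return false;
}

console.log(isOperationAllowed("GET", "query"));     // true
console.log(isOperationAllowed("GET", "mutation"));  // false
console.log(isOperationAllowed("POST", "mutation")); // true
```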
GraphQL supports serving queries over both GET and POST, but mutations are only supported over POST.
For reference, see the following:
https://graphql.org/learn/serving-over-http/
I've been working on a GraphQL server for a while now and although I understand most of the aspects, I cannot seem to get a grasp on caching.
When it comes to caching, I see both DataLoader mentioned as well as Redis but it's not clear to me when I should use what and how I should use them.
I take it that DataLoader is used more on a field level to counter the n+1 problem? And I guess Redis is on a higher level then?
If anyone could shed some light on this, I would be most grateful.
Thank you.
DataLoader is primarily a means of batching requests to some data source. However, it does optionally utilize caching on a per request basis. This means, while executing the same GraphQL query, you only ever fetch a particular entity once. For example, we can call load(1) and load(2) concurrently and these will be batched into a single request to get two entities matching those ids. If another field calls load(1) later on while executing the same request, then that call will simply return the entity with ID 1 we fetched previously without making another request to our data source.
DataLoader's cache is specific to an individual request. Even if two requests are processed at the same time, they will not share a cache. DataLoader's cache does not have an expiration -- and it has no need to since the cache will be deleted once the request completes.
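The batching-plus-per-request-caching behaviour can be sketched in a few lines. This is a simplified stand-in for what the DataLoader library does, not its real implementation:

```javascript
// Simplified DataLoader-style loader: calls to load() within the same
// tick are collected and dispatched as ONE batch, and each key is
// fetched at most once per loader instance (per-request cache).
function makeLoader(batchFn) {
  const cache = new Map(); // key -> Promise (per-request cache)
  let queue = [];          // keys waiting for the next batch
  function dispatch() {
    const batch = queue;
    queue = [];
    batchFn(batch.map((e) => e.key)).then((values) =>
      batch.forEach((e, i) => e.resolve(values[i]))
    );
  }
  return function load(key) {
    if (cache.has(key)) return cache.get(key); // repeat load() hits the cache
    const p = new Promise((resolve) => {
      queue.push({ key, resolve });
      if (queue.length === 1) queueMicrotask(dispatch); // schedule one batch
    });
    cache.set(key, p);
    return p;
  };
}

// Usage: two distinct keys loaded concurrently become a single batched
// call, and the duplicate load(1) never reaches the data source.
let batches = [];
const load = makeLoader(async (keys) => {
  batches.push(keys);
  return keys.map((k) => ({ id: k }));
});
Promise.all([load(1), load(2), load(1)]).then(([a, b, c]) => {
  console.log(batches[0].join(",")); // "1,2" — one batch, key 1 deduplicated
  console.log(a === c);              // true — same cached entity
});
```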
Redis is a key-value store that's used for caching, queues, PubSub and more. We can use it to provide response caching, which would let us effectively bypass the resolver for one or more fields and use the cached value instead (until it expires or is invalidated). We can use it as a cache layer between GraphQL and the database, API or other data source -- for example, this is what RESTDataSource does. We can use it as part of a PubSub implementation when implementing subscriptions.
DataLoader is a small library used to tackle a particular problem, namely generating too many requests to a data source. The alternative to using DataLoader is to fetch everything you need (based on the requested fields) at the root level and then letting the default resolver logic handle the rest. Redis is a key-value store that has a number of uses. Whether you need one or the other, or both, depends on your particular business case.
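The response-caching idea can be sketched like this, with a plain Map standing in for Redis (a real setup would use a Redis client, a TTL via e.g. SETEX, and the query plus variables as the cache key):

```javascript
// Response cache keyed by the incoming query; entries expire after a TTL.
const responseCache = new Map(); // cacheKey -> { value, expiresAt }
const TTL_MS = 60_000;

let resolverRuns = 0;
async function resolveExpensiveField() {
  resolverRuns++;
  return { users: 250 }; // stand-in for hitting the database
}

async function executeWithCache(cacheKey) {
  const hit = responseCache.get(cacheKey);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // bypass resolvers
  const value = await resolveExpensiveField();
  responseCache.set(cacheKey, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}

// Usage: the second executeWithCache("{ users { id } }") within the TTL
// returns the cached response without running the resolver again.
```

Note how this differs from DataLoader: the cache here outlives a single request and needs explicit expiration or invalidation, which is exactly why a store like Redis is used for it.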
I am a new GraphQL user. I am planning to use GraphQL as a middleware layer where different applications will hit the API and get the data they require. The main problem is training different groups in how to post data and query the data they require. Is it a good idea to build a middleware that accepts JSON over a REST API and converts it to a GraphQL request? I am thinking of 2 options:
1. Build a REST middle layer which accepts JSON and converts it to a GraphQL request.
2. Ask users to get comfortable with GraphQL.
Mixing REST and GraphQL is never a good idea for a new project, because you will waste your resources doing the same thing in two different ways and you will have to maintain a larger codebase. Providing REST and GraphQL at the same time may seem like a convenience for your customers, but in the long run it is not. A smaller, well-structured and well-documented API is always preferable.
If you are going to mix and match different resources or call outside services, GraphQL offers the better solution. GraphQL provides strong typing, single round trips, query batching, introspection, better dev tools, and a versionless API.
We are considering using GraphQL on top of a REST service (using the FHIR standard for medical records).
I understand that the pattern with GraphQL is to aggregate the results of multiple, independent resolvers into the final result. But a FHIR-compliant REST server offers batch endpoints that already aggregate data. Sometimes we'll need à la carte data (a patient's age or address only, for example). But quite often, we'll need most or all of the data available about a particular patient.
So although we can get that kind of plenary data from a single REST call that knits together multiple associations, it seems we will need to fetch it piecewise to do things the GraphQL way.
An optimization could be to eager-load and memoize all the associated data anytime any resolver asks for any data. In some cases this would be appropriate, while in other cases it would be serious overkill. But discerning when it would be overkill seems impossible given that resolvers should be independent. Also, it seems bloody-minded to undo and then redo something that the REST service is already perfectly capable of doing efficiently.
So:
1. Is GraphQL the wrong tool when it sits on top of a REST API that can efficiently aggregate data?
2. If GraphQL is the right tool in this situation, is eager-loading and memoization of associated data appropriate?
3. If eager-loading and memoization is not the right solution, is there an alternative way to take advantage of the REST service's ability to aggregate data?
My question is different from this question and this question because neither touches on how to take advantage of another service's ability to aggregate data.
An alternative approach would be to parse the request inside the resolver for a particular query. The fourth parameter passed to a resolver is an object containing extensive information about the request, including the selection set. You could then await a batched request to your API endpoint based on the requested fields, return the result of the REST call, and let your lower-level resolvers handle parsing it into the shape in which the data was requested.
Parsing the info object can be a PITA, although there are libraries out there for that, at least in the Node ecosystem.
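As a hedged sketch of that idea: the shape below imitates graphql-js's GraphQLResolveInfo (fieldNodes → selectionSet → selections), but this `info` object is hand-built for illustration, not one produced by a real GraphQL execution, and it ignores aliases, fragments, and nested selections.

```javascript
// Pull the requested top-level field names out of an info-like object.
function requestedFields(info) {
  const selections = info.fieldNodes[0].selectionSet.selections;
  return selections.map((sel) => sel.name.value);
}

// Stand-in for the info a resolver would receive for a patient query
// asking for name and address only:
const info = {
  fieldNodes: [
    {
      selectionSet: {
        selections: [
          { name: { value: "name" } },
          { name: { value: "address" } },
        ],
      },
    },
  ],
};

console.log(requestedFields(info)); // [ 'name', 'address' ]
```

The resolver could then request exactly these elements from the aggregating REST endpoint (with FHIR, e.g. via the `_elements` parameter) and hand the combined result down to the field resolvers.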