Specify cache policy for parts of a GraphQL query

In Apollo's GraphQL client, fetch policies specify whether a query should obtain data from the server or use the local cache (if any data is available).
In addition, cache normalization lets the cache cut down on the amount of data that needs to be fetched from the server. For example, if I am requesting object A and object B, but earlier I had requested A and C, then my current query will get A from the cache and B from the server.
However, these specify cache policies for the entire query. I want to know whether there is a way to specify TTLs on individual fields.
From a developer standpoint, I want to be able to specify in my query that some of the information I am requesting should come from the cache, but not the rest. For example, take the query below:
query PersonInfo($id: String) {
  person(id: $id) {
    birthCertificate # once cached, cached forever: always read this from the cache if available
    age              # should have a TTL of a day before the cached value is invalidated and we go to the network
    legalName        # always go to the network for this information
  }
}
In other words, for a fixed id value (and assuming this is the only query that touches the person object or its fields):
the first time I make this query, I get all three fields from the server.
now if I make this query again within a few seconds, I should only get the third field (legalName) from the server, and the first two from the cache.
now, if I instead wait more than a day and then make this query again, I get birthCertificate from the cache, and age + legalName from the server.
Currently, to do this the way I would want to, I end up writing three different queries, one for each TTL (sketched below). Is there a better way?
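For concreteness, the three-query workaround might look roughly like this (a sketch only; useQuery and the fetch policies are from Apollo's React client, the query and field names are taken from the example above, and the one-day TTL on age still has to be enforced by hand):

import { gql, useQuery } from "@apollo/client";

const BIRTH_CERTIFICATE = gql`
  query BirthCertificate($id: String) {
    person(id: $id) { id birthCertificate }
  }
`;
const AGE = gql`
  query Age($id: String) {
    person(id: $id) { id age }
  }
`;
const LEGAL_NAME = gql`
  query LegalName($id: String) {
    person(id: $id) { id legalName }
  }
`;

function usePersonInfo(id: string) {
  // Cached forever: served from the cache whenever it is present.
  const birth = useQuery(BIRTH_CERTIFICATE, { variables: { id }, fetchPolicy: "cache-first" });
  // Intended one-day TTL: cache-first, with invalidation handled separately.
  const age = useQuery(AGE, { variables: { id }, fetchPolicy: "cache-first" });
  // Always fresh: skip the cache entirely.
  const legal = useQuery(LEGAL_NAME, { variables: { id }, fetchPolicy: "network-only" });
  return { birth, age, legal };
}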
Update: there is some progress on cache timing in the iOS client (https://github.com/apollographql/apollo-ios/issues/142), but nothing specifically addressing this?

It would be a nice feature, but as far as I know (for the JS/React client at least, probably the same for iOS):
there is no query normalization, only cache normalization;
if any requested field is missing from the cache, the entire query is fetched from the network;
no timestamps are stored in cache (normalized) entries, per query or per type.
For now, the only(?) solution is to store timestamps for each/all/some queries or responses (e.g. in local state, or in onCompleted) and use them to invalidate/evict entries before fetching. This could probably be automated, e.g. by starting timers within a field policy function (sketched below).
You can fetch the person data once at the start of the session, just after login. Any later, more granular person(id: $id) { birthCertificate } query (e.g. in a React subcomponent) can have its "own" 'cache-only' policy. If you always need a fresh legalName, fetch it (separately or not) with a 'network-only' policy.
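To illustrate the timestamp idea, here is a minimal sketch of a field policy that enforces a one-day TTL on Person.age (assumptions: Apollo Client 3 for JS; the in-memory ageWrittenAt map is illustrative, and the Person/age/id names come from the question). Returning undefined from read marks the field as a cache miss, which sends the whole query back to the network under the default cache-first policy:

import { InMemoryCache } from "@apollo/client";

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Remember when each person's age was last written to the cache.
const ageWrittenAt = new Map<string, number>();

const cache = new InMemoryCache({
  typePolicies: {
    Person: {
      fields: {
        age: {
          merge(_existing, incoming, { readField }) {
            // Fresh data from the server: record the write time.
            ageWrittenAt.set(String(readField("id")), Date.now());
            return incoming;
          },
          read(existing, { readField }) {
            const writtenAt = ageWrittenAt.get(String(readField("id")));
            if (writtenAt !== undefined && Date.now() - writtenAt > ONE_DAY_MS) {
              // Stale: report a cache miss so the query goes to the network.
              return undefined;
            }
            return existing;
          },
        },
      },
    },
  },
});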

Related

Apollo Client v3: Delete cache entries after a given time period

I am wondering if there is a way to expire cached items after a certain time period, e.g., 24 hours.
I know that Apollo Client v3 provides methods such as cache.evict and cache.gc, which are a good start and which I am already using; however, I want a way to delete cache items after a given time period.
What I am doing at the minute is adding a TimeToLive field to every object in my Apollo schema; when the backend returns an object, the field is populated with the current time + 24 hours (i.e. the time 24 hours from now). Then when I query the data in the front end, I check to see whether the TimeToLive field of the returned data is in the future; if not, that means the data was definitely retrieved from the cache, in which case I call the refetch function, which forces the query to fetch the data from the server. However, this doesn't seem like the best way to do things, mainly because I have to iterate over every result in the returned data and check whether any of the returned objects are expired; and if so, everything is refetched.
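That approach might look roughly like this (a sketch; assumptions: React with @apollo/client hooks, a hypothetical GET_RECIPES query from the app's domain, and the TTL field spelled timeToLive):

import { useEffect } from "react";
import { gql, useQuery } from "@apollo/client";

const GET_RECIPES = gql`
  query GetRecipes {
    recipes { id title timeToLive }
  }
`;

function useFreshRecipes() {
  const { data, refetch } = useQuery(GET_RECIPES);
  useEffect(() => {
    const recipes: { timeToLive: string }[] = data?.recipes ?? [];
    // A timeToLive in the past means the object must have come from the
    // cache, so force the whole query back to the server.
    if (recipes.some((r) => new Date(r.timeToLive) <= new Date())) {
      refetch();
    }
  }, [data, refetch]);
  return data?.recipes ?? [];
}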
Another solution I thought of was to use something like React Native Queue and have a background task that periodically checks the cache and deletes items that have expired. But again, I am not totally sold on this solution.
For a little bit of context here: I am building a cooking/recipes app, and recipes/posts are cached on the device. My concern is that a user could delete a post, but everyone else who has that post cached would still be able to see it; by expiring the cached item, at least they would only be able to see it for a number of hours before it is removed. However, there might be a better way to do this altogether, i.e. have the server contact clients holding the cached item (though I couldn't think of any low-lift solutions at the time of writing this).
apollo-invalidation-policies replaces the Apollo Client InMemoryCache with InvalidationPolicyCache, and within the typePolicies you can specify a timeToLive field. If an object is accessed beyond its TTL, it is evicted and no data is returned.
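For illustration, a configuration sketch (based on my reading of the library's README; the package has also been published as @nerdwallet/apollo-cache-policies, and exact option names may vary by version; Post and the 24-hour TTL come from the question):

import { ApolloClient } from "@apollo/client";
import { InvalidationPolicyCache } from "@nerdwallet/apollo-cache-policies";

const cache = new InvalidationPolicyCache({
  invalidationPolicies: {
    types: {
      // Evict any cached Post read more than 24 hours after it was written.
      Post: { timeToLive: 24 * 60 * 60 * 1000 }, // in milliseconds
    },
  },
});

const client = new ApolloClient({
  uri: "https://example.com/graphql", // hypothetical endpoint
  cache,
});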

Caching the results of a query and making smaller queries against it

I'm working with a database where I'll have to run a query for a certain ID as requests come in. My issue is that the DBAs have stipulated that I should simply take a batch copy of the entire table for a given day and cache that.
This means I have to do a periodic select *, keep the results in memory, and, any time a request comes in for an individual userId, serve it from the cached copy. If the cache has expired, I need to run the large query again.
This all sounds achievable in theory, but I don't know what API I should be using.
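The question doesn't name a stack, but the pattern itself is small; a minimal sketch in TypeScript, where fetchAllUsers is a hypothetical stand-in for the daily select *:

interface User {
  userId: string;
  // ...other columns from the table
}

const TTL_MS = 24 * 60 * 60 * 1000; // refresh the batch copy once a day

let cacheByUserId = new Map<string, User>();
let loadedAt = 0; // epoch ms of the last full load

// Placeholder for the real "select * from users" batch query.
declare function fetchAllUsers(): Promise<User[]>;

async function getUser(userId: string): Promise<User | undefined> {
  if (Date.now() - loadedAt > TTL_MS) {
    // Cache expired: re-run the large query and rebuild the in-memory index.
    const rows = await fetchAllUsers();
    cacheByUserId = new Map(rows.map((u) => [u.userId, u]));
    loadedAt = Date.now();
  }
  return cacheByUserId.get(userId);
}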

Looking for help understanding Apollo Client local state and cache

I'm working with Apollo Client local state and cache, and although I've gone through the docs (https://www.apollographql.com/docs/react/essentials/local-state), a couple of tutorials (for example, https://www.robinwieruch.de/react-apollo-link-state-tutorial/), and looked at some examples, I'm a bit befuddled. In addition to any insight you might be able to provide on the specific questions below, any links to good additional docs/resources that put things in context would be much appreciated.
In particular, I understand how to store local client side data and retrieve it, but I'm not seeing how things integrate with data retrieved from and sent back to the server.
Taking the simple 'todo app' as a starting point, I have a couple of questions.
1) If you download a set of data (in this case 'todos') from the server using a query, what is the relationship between the cached data and the server-side data? That is, I grab the data with a query, and it's stored in the cache automatically. Now if I want to grab that data locally and, say, modify it (in this case, add a todo or modify one), how do I do that? I know how to do it for data I've created, but not for data I've downloaded, such as, in this case, my set of todos. For instance, some tutorials reference the __typename; in the case of data downloaded from the server, what would this __typename be? And if I used readQuery to grab the data downloaded from the server and stored in the cache, what query would I use? The same one I used to download the data originally?
2) Once I've modified this local data (for instance, in the case of todos, setting one todo as 'completed'), and written it back to the cache with writeData, how does it get sent back to the server, so that the local copy and the remote copy are in sync? With a mutation? So I'm responsible for storing a copy to the local cache and sending it to the server in two separate operations?
3) As I understand it, unless you specify otherwise, if you make a query from Apollo Client, it will first check to see if the data you requested is in the cache; otherwise it will call the server. Why, then, do you need to add the @client directive in the example code to grab the todos? Because these were not downloaded from the server with a prior query, but are instead only local data?
const GET_TODOS = gql`
  {
    todos @client {
      id
      completed
      text
    }
    visibilityFilter @client
  }
`;
If they were in fact downloaded with an earlier query, can't you just use the same query that you used originally to get the data from the server, without @client, and if the data is in the cache, you'll get the cached data?
4) Lastly, I've read that Apollo Client will update things 'automagically' -- that is, if you send modified data to the server (say, in our case, a modified todo), Apollo Client will make sure that that piece of data is modified in the cache, referencing it by ID. Are there any rules as to when it does and when it doesn't? If Apollo Client is keeping things in sync with the server using IDs, when do we need to handle it manually, as above, and when not?
Thanks for any insights, and if you have links to other docs than those above, or a good tutorial, I'd be grateful.
1) The __typename is Apollo's built-in, auto-magic way to track and cache results from queries. By default you can look up items in your cache using the __typename and id of your items. You usually don't need to worry about __typename until you need to tweak the cache manually. For the most part, just re-run your server queries to pull from the cache after the original request: server responses are cached by default, so the next time you run a query it will pull from the cache.
2) It depends on your situation, but most of the time, if you set your IDs properly, Apollo Client will automatically sync up changes from a mutation. All you should need to do is return the id property and any changed fields in your mutation query, and Apollo will update the cache auto-magically. So, in the case you are describing, where you mark a todo as completed, you should just send the mutation to the server, then in the mutation response request the completed field and the id; the client will update automatically (see the sketch after these answers).
3) You can use the original query. Apollo Client essentially caches things using a query + variables -> results map. As long as you submit the same query with the same variables, it will pull from the cache (unless you explicitly tell it not to).
4) See my answer to #2 above; Apollo Client will handle it for you as long as you include the id and any modified data in your mutation. It won't handle it for you if you add new data, such as adding a todo to a list. Same for removing data.
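For example, a mutation along these lines (a sketch; toggleTodo is a hypothetical server-side mutation) returns the id plus the changed field, which is all Apollo Client needs to merge the result into the cached Todo automatically:

import { gql } from "@apollo/client";

const TOGGLE_TODO = gql`
  mutation ToggleTodo($id: ID!) {
    toggleTodo(id: $id) {
      id        # lets Apollo locate the normalized cache entry
      completed # the changed field gets merged in automatically
    }
  }
`;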

What is the most efficient way to filter a search?

I am working with node.js and mongodb.
I am going to have a database set up and use socket.io for real-time updates, which will either re-query the db or push the new update to the client.
I am trying to figure out what is the best way to filter the database?
Some more information in regards to what is being queried and what the real time updates are:
A document in the database will include information such as an address, city, time, number of packages, name, price.
Filters include city/price/name/time (meaning only to see addresses within the same city, or within the same time period)
Real-time info: includes adding a new document to the database which will essentially update the admin on the website with a notification of a new address added.
Method 1: Query the db with the filters being searched?
Method 2: Query the db for all searches and then filter it on the client side (Javascript)?
Method 3: Query the db for all searches then store it in localStorage then query localStorage for what the filters are?
I'm trying to figure out the fastest way for the user to filter.
Also, if that differs from the most cost-effective way, I'd like to know the most cost-effective approach as well (which I am assuming means fewer db queries)...
It's hard to say because we don't see the exact conditions of the filter, but in general:
Mongo can use only one index per query condition. Thus whatever fields are covered by that index can be filtered efficiently; otherwise Mongo might do a full collection scan, which is slow. If you are using an index, then you are probably doing the most efficient query. (Mongo can still use another index for sorting, though.)
Sometimes you will be forced to do processing on the client side because Mongo can't do what you want, or doing it would take too many queries.
The least efficient option is to store results somewhere else, simply because IO is slow. That would only benefit you if you use the stored results as a cache and do not recalculate them.
Also consider the overhead and latency of networking: if you have to send lots of data back to the client, it will be slower. In general, Mongo will do a better job filtering than you would on the client.
Based on your description, if you can filter addresses within a time period, then an index on those fields could cut out most documents. You most likely need a compound index over multiple fields; a sketch follows.
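As an illustration of Method 1 with such an index (a sketch using the official mongodb Node.js driver; the addresses collection, connection string, and filter values are hypothetical):

import { MongoClient } from "mongodb";

async function main() {
  const client = await MongoClient.connect("mongodb://localhost:27017");
  const addresses = client.db("app").collection("addresses");

  // Compound index covering the common filters: equality on city first,
  // then a range on time.
  await addresses.createIndex({ city: 1, time: 1 });

  // Filtering happens server-side, so only matching documents cross the wire.
  const results = await addresses
    .find({ city: "Toronto", time: { $gte: new Date("2024-01-01") } })
    .toArray();

  console.log(`${results.length} matching addresses`);
  await client.close();
}

main().catch(console.error);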

Optimizing large selects in Hibernate/JPA with a 2nd level cache

I have a user object represented in JPA which has specific sub-types. E.g., think of User, with a subclass Admin and another subclass PowerUser.
Let's say I have 100k users. I have successfully implemented the second level cache using Ehcache in order to increase performance and have validated that it's working.
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html#performance-cache
I know it works (i.e., the object is loaded from the cache rather than via an SQL query) when you call the load method. I've verified this via logging at the Hibernate level and by confirming that it's quicker.
However, I actually want to select a subset of all the users; for example, let's say I want to count how many Power Users there are.
Furthermore, my users have an associated ZipCode object, and the ZipCode objects are also second-level cached. What I'd like to do is ask queries like: how many Power Users do I have in New York state?
My question is: how do I write a query for this that will hit the second-level cache and not the database? Note that my second-level cache is configured read/write, so as new users are added to the system they should automatically be added to the cache. Note also that I have briefly investigated the query cache, but I'm not sure it's applicable, since that is for queries that are run multiple times. My problem is more a case of: the data should be in the second-level cache anyway, so what do I have to do so that the database doesn't get hit when I run my query?
cheers,
Brian
(...) the data should be in the second level cache anyway so what do I have to do so that the database doesn't get hit when I write my query.
If the entities returned by your query are cached, have a look at Query#iterate(). This triggers an initial query to retrieve the list of IDs, followed by a lookup for each ID... and those lookups would hit the L2 cache.
