Querying cache by fields other than ID? - apollo-client

I'm integrating GraphQL into my application and trying to figure out if this scenario is possible.
I have a schema for a Record type and a query that returns a list of Records from my service. Schema looks something like:
type Query {
records(someQueryParam: String!): [Record]!
}
type Record {
id: String!
otherId: String!
<other fields here>
}
There are some places in my application where I need to access a Record using the otherId value (because that's all I have access to). Currently, I do that with a mapping of otherId to id values that's populated after all the Records are downloaded. I use the map to go from otherId to id, and then use the id value to index into the collection of Record objects, to avoid iterating through the whole thing. (This collection used to be populated using a separate REST call, before I started using Apollo GQL.)
I'd like to remove my dependency on this mapping if possible. Since the Records are all in the Apollo cache once they've been loaded, I'd like to just query the cache for the Record in question using the otherId value. My service doesn't currently have that kind of lookup, so I don't have an existing query that I can cache in parallel. (i.e. there's no getIdFromOtherId).
tl;dr: Can I query my Apollo cache using something other than the id of an object?

You can't query the cache by otherId for the same reason you don't want to have to search through the record set to find the matching item -- the id is part of the item's key, and without the key Apollo can't directly access the item. Apollo's default cache is a key-value store, not a database that you can query however you like.
You will probably need to add a query to your data source that maps otherId to id. Searching through the entire record set for your item would be horribly inefficient at scale.
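To make the key-value point concrete, here is a minimal TypeScript sketch in which a plain Map stands in for the normalized cache. It is not Apollo's API; the `Record:<id>` key format and all helper names are assumptions for illustration. Lookup by id is a single key access, while lookup by otherId needs either a full scan or a secondary index kept alongside the store, which is essentially the mapping described in the question.

```typescript
// A stand-in for a normalized key-value cache; this is NOT Apollo's API.
interface RecordEntry {
  id: string;
  otherId: string;
}

const store = new Map<string, RecordEntry>();
// Hypothetical secondary index: otherId -> id.
const idByOtherId = new Map<string, string>();

// Assumed key format for illustration: "Record:<id>".
function cacheKey(id: string): string {
  return `Record:${id}`;
}

// Store the record under its key and keep the secondary index in sync.
function writeRecord(record: RecordEntry): void {
  store.set(cacheKey(record.id), record);
  idByOtherId.set(record.otherId, record.id);
}

// Direct access: the id is part of the key.
function readById(id: string): RecordEntry | undefined {
  return store.get(cacheKey(id));
}

// Without an index, entries have to be inspected one by one.
function scanByOtherId(otherId: string): RecordEntry | undefined {
  return Array.from(store.values()).find((entry) => entry.otherId === otherId);
}

// With the index, two key lookups replace the scan.
function readByOtherId(otherId: string): RecordEntry | undefined {
  const id = idByOtherId.get(otherId);
  return id === undefined ? undefined : readById(id);
}

writeRecord({ id: "1", otherId: "a" });
writeRecord({ id: "2", otherId: "b" });
```

The index has to be updated on every write, which is the cost of avoiding the scan.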

Related

DynamoDB Appsync Query on multiple attributes

My app uses AppSync resolvers to fetch data from DDB and return it to our front-end. One table we have is for Notifications. A Notification can be either pending or default (non-pending). The table itself has a primary key of notification_id and we have a GSI called userIndex to grab the notifications for a user, with a sort key of timestamp.
In the app, I show all notifications in a list, pending first and then default. Given that a user may have many notifications, I'd like to implement pagination to fetch a batch at a time. The only way I've been able to do this is to
Change the query to include an isPending parameter, which I use as a filter expression for the query to only return notifications that are isPending or isNotPending.
Store two "nextTokens", one for each isPending and isNotPending, along with corresponding lists.
Make separate queries for pending/non-pending, and use the filter to return to the appropriate list.
This is obviously inefficient, since I am re-reading data from DynamoDB. My question is, given my DynamoDB table/requirements, is there a way I can paginate so that I can get all the pending notifications first (sorted by timestamp) and then all the default notifications next (sorted by timestamp) by using one query and one nextToken?
I've seen the use of @model and @key, but I haven't been able to make it work in my app.
Thanks!
No, not really. There is a hard limit on how much data a single DynamoDB query returns, and that cannot be bypassed. The only way to make use of nextToken is to issue another query.
However, it is also worth noting that the FilterExpression is applied after the items have already been read by the query. It does not reduce the number of items the query reads, only which of them are returned. So the nextToken is still going to be (relatively) the same for each query. You can instead run the query unfiltered, split the results yourself after each call and before the next pagination query, and save yourself the duplicate calls.
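The client-side approach suggested above might look like the following TypeScript sketch. The `Notification` shape, the `Page` type, and the `fetchPage` callback are hypothetical stand-ins for whatever the AppSync query actually returns; only the idea of following a single nextToken and partitioning each page locally comes from the answer.

```typescript
// Hypothetical shapes; the real ones depend on your schema.
interface Notification {
  notification_id: string;
  timestamp: number;
  isPending: boolean;
}

interface Page {
  items: Notification[];
  nextToken: string | null;
}

// Stand-in for the unfiltered AppSync query call.
type FetchPage = (nextToken: string | null) => Promise<Page>;

interface NotificationLists {
  pending: Notification[];
  other: Notification[]; // the "default" (non-pending) notifications
  nextToken: string | null;
}

// Fetch one page and append each item to the list it belongs to.
async function loadNextPage(
  fetchPage: FetchPage,
  state: NotificationLists
): Promise<NotificationLists> {
  const page = await fetchPage(state.nextToken);
  const pending = [...state.pending];
  const other = [...state.other];
  for (const item of page.items) {
    (item.isPending ? pending : other).push(item);
  }
  return { pending, other, nextToken: page.nextToken };
}
```

This reads each item only once, but it does not change the order in which DynamoDB returns them: the pending list is only known to be complete once nextToken comes back as null.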

Is it possible to query an AppSync GraphQL type by timestamps?

I want to query all instances of a model by the most recently created.
Reading the official docs, they suggest a way of querying by the default timestamps (updatedAt/createdAt) but only when also querying by another key. So I know I could query a hypothetical User model by name and createdAt, but I can't query all instances of User by createdAt.
Is there an established way of doing this?
I have tried adding a @key directive to sort by updatedAt, but that results in an error because updatedAt is automatically added and not described in my schema. If I then add the timestamps to my schema, this creates problems in client mutations, because the timestamps are then expected to be supplied by me, which I obviously don't do because they're added automatically by DynamoDB.
Thanks
You could try using a Global Secondary Index on the field you want to query. In your AppSync resolver, you need to specify the index you want to use for the query.
Another way would be to run a scan operation against your DB (you don't need to specify a key in this case), although that would be far less efficient than a GSI.

Update Apollo cache after object creation

What are all the different ways of updating the Apollo InMemoryCache after a mutation? From the docs, I can see:
Id-based updates which Apollo performs automatically
Happens for single updates to existing objects only.
Requires an id field which uniquely identifies each object, or the cache must be configured with a dataIdFromObject function which provides a unique identifier.
"Manual" cache updates via update functions
Required for object creation, deletion, or updates of multiple objects.
Involves calling cache.writeQuery with details including which query should be affected and how the cache should be changed.
Passing the refetchQueries option to the useMutation hook
The calling code says which queries should be re-fetched from the API, Apollo does the fetching, and the results replace whatever is in the cache for the given queries.
Are there other ways that I've missed, or have I misunderstood anything about the above methods?
I am confused because I've been reading the code of a project which uses Apollo for all kinds of mutations, including creations and deletions, but I don't see any calls to cache.writeQuery, nor any usage of refetchQueries. How does the cache get updated after creations and deletions without either of those?
In my own limited experience with Apollo, the cache is not automatically updated after an object creation or deletion, not even if I define dataIdFromObject. I have to update the cache myself by writing update functions.
So I'm wondering if there is some secret config I've missed to make Apollo handle it for me.
The only way to create or delete a node and have Apollo automatically update the cache to reflect the change is to have the mutation return the parent of whatever field contains the updated list. For example, let's say we have a schema like this:
type Query {
me: User
}
type User {
id: ID!
posts: [Post!]!
}
type Post {
id: ID!
body: String!
}
By convention, if we had a mutation to add a new post, the mutation field would return the created post.
type Mutation {
writePost(body: String!): Post!
}
However, we could have it return the logged in User instead (the same thing the me field returns):
type Mutation {
writePost(body: String!): User!
}
By doing so, we enable the client to make a query like:
mutation WritePost($body: String!){
writePost(body: $body) {
id
posts {
id
body
}
}
}
Here Apollo will not only create or update the cache for all the returned posts, but it will also update the returned User object, including the list of posts.
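Why returning the parent works can be seen from a simplified model of a normalized cache. The TypeScript sketch below is not Apollo's implementation; the `Type:id` key format and the `writeUser` and `readUser` helpers are assumptions for illustration. Because the mutation result carries the User's id together with its whole posts list, writing it replaces the stored list of post references, so anything reading that User sees the new post.

```typescript
interface Post {
  id: string;
  body: string;
}

interface User {
  id: string;
  posts: Post[];
}

// Normalized entry: a user holds references to posts rather than copies.
interface UserEntry {
  id: string;
  posts: string[];
}

const postStore = new Map<string, Post>();
const userStore = new Map<string, UserEntry>();

// Write a User result (from a query or a mutation) into the store.
function writeUser(user: User): void {
  for (const post of user.posts) {
    postStore.set(`Post:${post.id}`, post);
  }
  userStore.set(`User:${user.id}`, {
    id: user.id,
    posts: user.posts.map((p) => `Post:${p.id}`),
  });
}

// Rebuild the User by following the stored references.
function readUser(id: string): User | undefined {
  const entry = userStore.get(`User:${id}`);
  if (!entry) return undefined;
  const resolved = entry.posts
    .map((ref) => postStore.get(ref))
    .filter((p): p is Post => p !== undefined);
  return { id: entry.id, posts: resolved };
}

// Result of the initial `me` query.
writeUser({ id: "u1", posts: [{ id: "p1", body: "first" }] });
// Result of writePost when it returns the User: the list is replaced.
writeUser({
  id: "u1",
  posts: [
    { id: "p1", body: "first" },
    { id: "p2", body: "second" },
  ],
});
```

Had writePost returned only the new Post, the `Post:p2` entry would be stored but the `User:u1` entry would still reference only `Post:p1`, which is the situation where an update function is needed.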
So why is this not commonly done? Why does Apollo's documentation suggest using writeQuery when adding or deleting nodes?
The above will work fine when your schema is simple and you're working with a relatively small amount of data. However, returning the entire parent node, including all its relations, can be noticeably slower and more resource-intensive once you're dealing with more data. Additionally, in many apps a single mutation could impact multiple queries inside the cache. The same node could be returned by any number of fields in the schema, and even the same field could be part of a number of different queries that utilize different filters, sort parameters, etc.
These factors make it unlikely that you'll want to implement this pattern in production but there certainly are use cases where it may be a valid option.

Getting started with Bleve using BoltDB

I am trying to wrap my head around Bleve and I understand everything that is going on in the tutorials, videos and documentation. I however get very confused when I am using it on BoltDB and don't know how to start.
Say I have an existing BoltDB database called data.db populated with values of struct type Person
type Person struct {
ID int `json:"id"`
Name string `json:"name"`
Age int `json:"age"`
Sex string `json:"sex"`
}
How do I index this data so that I can do a search? How do I handle the indexing of data that will be stored in the database in the future?
Any help will be highly appreciated.
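Bleve uses BoltDB as one of several backend stores, and its index is separate from where you store your application data. To index your data in Bleve, simply add it to your index (the document ID must be a string, so the int ID is converted with strconv.Itoa):
index.Index(strconv.Itoa(person.ID), person)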
That index exists separately from your application data (whether it's in Bolt, Postgres, etc).
To retrieve your data, you'll need to construct a search request using bleve.NewSearchRequest(), then call Index.Search(). This will return a SearchResult which includes a Hits field where you can retrieve the ID for your object. You can use this to look up the object in your application data store.
Disclaimer: I am the author of BoltDB.
How you index your data depends on how you want to query for it.
If you want to query by any arbitrary fields, like {Age:15, Name:"Bob"} then BoltDB isn't an awesome fit for your problem.
BoltDB is simply a key value store with fast access to sequential keys and efficient prefix seeking. It's not really a replacement for general use databases.
You likely want something more like a document store (e.g. MongoDB) or an RDBMS (e.g. PostgreSQL).
If you just wanted something that uses simple files and is embedded, you could also use SQLite with the Go module
If you want to search by only a single field, like ID or Name, then use that as the key.
If lookup speed doesn't matter at all, I guess you can use Bolt to just iterate over the entire db, parse the json and check the fields. But that's probably the worst approach you could take.

Necessary to validate _id in Meteor collections?

I'm using Collection.allow(options.insert) to validate the documents that users insert into the collection. What I wonder is which validation tests I need to use on the _id property of the inserted doc (I use random strings as id, not the Mongo-style objectId).
Do I need to check that _id is a string looking like an id, or does the database refuse the document if the _id property is invalid? Should I also make sure that no other document in the database has that id?
Strictly speaking you do not need to do any validation tests of the _id. The database will refuse the insert if the _id you give is not unique, but I think that is the only rule. A duplicate _id can also be picked up afterwards, when the insert fails with an error.
Other checks are optional and are just to allow users to access, insert, or remove the documents you want them to.
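As an illustration of such optional checks, here is a minimal TypeScript sketch of a function with the same (userId, doc) shape as an allow-rule insert callback. The `owner` field and the 17-character alphanumeric id pattern are assumptions made for the example, not something the question or Meteor requires; uniqueness is deliberately left to the database.

```typescript
// Hypothetical document shape for illustration.
interface InsertedDoc {
  _id?: unknown;
  owner?: unknown;
}

// Assumption: ids are 17-character alphanumeric strings.
const ID_PATTERN = /^[A-Za-z0-9]{17}$/;

// Returns true if the insert should be allowed.
function canInsert(userId: string | null, doc: InsertedDoc): boolean {
  // Must be logged in.
  if (userId === null) return false;
  // May only insert documents owned by the inserting user.
  if (doc.owner !== userId) return false;
  // Optional shape check on a client-supplied _id; no uniqueness check here.
  if (doc._id !== undefined) {
    if (typeof doc._id !== "string" || !ID_PATTERN.test(doc._id)) return false;
  }
  return true;
}
```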
