I am a little confused about the best way to update two tables with one GraphQL mutation. I am using AWS AppSync.
I have an application where a User needs to be able to register for an Event. Since I am using DynamoDB as the database, I had thought about a denormalized data structure for the User and Event tables. I am thinking of storing an array of brief Event details, such as eventID and title, in the User table, and an array of entrants in the Events table, holding only brief user info, such as userID and name. Firstly, is this a good approach, or should I have a third join table to hold these 'relationships'?
If it's OK, I need to update both tables during the signUp mutation, but I am struggling to get my head around how to update two tables with one mutation and, in turn, one request mapping template.
Am I right in thinking I need to use a Pipeline resolver? Or is there another way to do this?
There are multiple options for this:
AppSync supports BatchWrite operations to update multiple DynamoDB tables at the same time.
AppSync supports DynamoDB transactions to update multiple DynamoDB tables transactionally at the same time.
Pipeline resolvers.
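To make the transactional option more concrete, here is a minimal sketch of what the two-table write could look like as a DynamoDB TransactWriteItems request. It is written in Python and only builds the request parameters; in AppSync the same idea would be expressed in the resolver's request mapping template instead. The table names User and Events and the keys userID and eventID come from the question, while the list attribute names events and entrants and the function name are assumptions made for illustration.

```python
def build_signup_transaction(user_id, user_name, event_id, event_title):
    """Build TransactWriteItems parameters that update both tables together.

    Assumed (hypothetical) layout: User items keep a list attribute `events`,
    Events items keep a list attribute `entrants`.
    """
    event_entry = {"M": {"eventID": {"S": event_id}, "title": {"S": event_title}}}
    entrant_entry = {"M": {"userID": {"S": user_id}, "name": {"S": user_name}}}

    def append_to_list(table, key_name, key_value, attribute, entry):
        # Append `entry` to a list attribute, creating the list if it is missing.
        return {
            "Update": {
                "TableName": table,
                "Key": {key_name: {"S": key_value}},
                "UpdateExpression": "SET #list = list_append(if_not_exists(#list, :empty), :entry)",
                "ExpressionAttributeNames": {"#list": attribute},
                "ExpressionAttributeValues": {
                    ":empty": {"L": []},
                    ":entry": {"L": [entry]},
                },
            }
        }

    return {
        "TransactItems": [
            append_to_list("User", "userID", user_id, "events", event_entry),
            append_to_list("Events", "eventID", event_id, "entrants", entrant_entry),
        ]
    }


params = build_signup_transaction("u1", "Ann", "e1", "Spring Gala")
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("dynamodb").transact_write_items(**params)
```

Because both updates are part of one transaction, either both lists are updated or neither is, which avoids a User that lists an Event whose entrants do not include that User.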
Related
I am new to MongoDB and I am writing a DB call that involves multiple collections.
I am trying to understand how to look up the data when I have this type of relationship.
E.g. - User has subscription -> Subscription has plan details -> Subscription plan details.
Here is reference:
I want to write a lookup operation in Java.
I need a pointer on how to do that.
How can I combine multiple different lookup operations?
Or is there a better way to get the data from the collections?
Thanks,
Atul
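One possible way to chain the lookups is a single aggregation pipeline with one $lookup stage per hop. The sketch below builds such a pipeline as plain Python data so the stage order is easy to see. The collection names subscriptions and plans and the field names subscriptionId and planId are assumptions, since the actual schema is not shown in the question. The same stages can be built with the Java driver, for example with Aggregates.lookup.

```python
def user_subscription_pipeline(user_id):
    """Build an aggregation pipeline: user -> subscription -> plan.

    Collection and field names are hypothetical placeholders.
    """
    return [
        # Start from the single user we are interested in.
        {"$match": {"_id": user_id}},
        # First hop: attach the user's subscription document.
        {"$lookup": {
            "from": "subscriptions",
            "localField": "subscriptionId",
            "foreignField": "_id",
            "as": "subscription",
        }},
        # $lookup yields an array; flatten it to one embedded document.
        {"$unwind": "$subscription"},
        # Second hop: attach the plan referenced by the subscription.
        {"$lookup": {
            "from": "plans",
            "localField": "subscription.planId",
            "foreignField": "_id",
            "as": "plan",
        }},
        {"$unwind": "$plan"},
    ]


pipeline = user_subscription_pipeline("user-1")
# The pipeline would be run against the users collection,
# for example users.aggregate(pipeline) with a MongoDB driver.
```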
I noticed that in the new Amplify GraphQL Transformer v2, AppSync Conflict Resolution is enabled for all tables by default (https://docs.aws.amazon.com/appsync/latest/devguide/conflict-detection-and-sync.html). I wonder whether disabling conflict resolution for my API will cause any harm.
I'm building a Yelp-like rating app, and if two clients try to mutate the same object, I think it's fine to just let them mutate concurrently and have the later request override the earlier one. So I don't really understand what this conflict resolution is useful for.
I find it really inconvenient that I need to pass in a _version field when mutating an object, and that a delete does not remove the item immediately; instead, the _deleted field is set to true and the item is scheduled for deletion after the TTL expires.
Thanks very much!
Pro tip: to disable conflict resolution in Amplify, run amplify update api, and you will be prompted with a choice to disable conflict resolution.
Versioned Data Sources
AWS AppSync currently supports versioning on DynamoDB data sources. Conflict Detection, Conflict Resolution, and Sync operations require a Versioned data source. When you enable versioning on a data source, AWS AppSync will automatically:
Enhance items with object versioning metadata.
Record changes made to items with AWS AppSync mutations to a Delta table.
Maintain deleted items in the Base table with a “tombstone” for a configurable amount of time.
Versioned Data Source Configuration
When you enable versioning on a DynamoDB data source, you specify the following fields:
BaseTableTTL
The number of minutes to retain deleted items in the Base table with a “tombstone” - a metadata field indicating that the item has been deleted. You can set this value to 0 if you want items to be removed immediately when they are deleted. This field is required.
DeltaSyncTableName
The name of the table where changes made to items with AWS AppSync mutations are stored. This field is required.
DeltaSyncTableTTL
The number of minutes to retain items in the Delta table. This field is required.
Delta Sync Table
AWS AppSync currently supports Delta Sync Logging for mutations using PutItem, UpdateItem, and DeleteItem DynamoDB operations.
When an AWS AppSync mutation changes an item in a versioned data source, a record of that change will be stored in a Delta table that is optimized for incremental updates. You can choose to use different Delta tables (e.g. one per type, one per domain area) for other versioned data sources or a single Delta table for your API. AWS AppSync recommends against using a single Delta table for multiple APIs to avoid the collision of primary keys.
The schema required for this table is as follows:
ds_pk
A string value that is used as the partition key. It is constructed by concatenating the Base data source name and the ISO8601 format of the date the change occurred. (e.g. Comments:2019-01-01)
ds_sk
A string value that is used as the sort key. It is constructed by concatenating the ISO8601 format of the time the change occurred, the primary key of the item, and the version of the item. The combination of these fields guarantees uniqueness for every entry in the Delta table (e.g. for a time of 09:30:00 and an ID of 1a and version of 2, this would be 09:30:00:1a:2)
_ttl
A numeric value that stores the timestamp, in epoch seconds, when an item should be removed from the Delta table. This value is determined by adding the DeltaSyncTableTTL value configured on the data source to the moment when the change occurred. This field should be configured as the DynamoDB TTL Attribute.
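The key layout described above can be illustrated with a short sketch. AWS AppSync writes these attributes itself, so this is not code you would need to run; it only mirrors the description and the examples given for ds_pk, ds_sk, and _ttl. The function name and parameters are assumptions, and the exact time precision AppSync stores may differ from the simple HH:MM:SS form used in the example.

```python
from datetime import datetime, timezone


def delta_record_keys(base_name, item_id, version, changed_at, delta_ttl_minutes):
    """Compute illustrative Delta table attributes for one change record."""
    # Partition key: Base data source name plus the date of the change.
    ds_pk = f"{base_name}:{changed_at.strftime('%Y-%m-%d')}"
    # Sort key: time of the change, item primary key, and item version.
    ds_sk = f"{changed_at.strftime('%H:%M:%S')}:{item_id}:{version}"
    # Expiry: moment of the change plus DeltaSyncTableTTL, in epoch seconds.
    ttl = int(changed_at.timestamp()) + delta_ttl_minutes * 60
    return {"ds_pk": ds_pk, "ds_sk": ds_sk, "_ttl": ttl}


changed_at = datetime(2019, 1, 1, 9, 30, 0, tzinfo=timezone.utc)
record = delta_record_keys("Comments", "1a", 2, changed_at, delta_ttl_minutes=30)
# record["ds_pk"] is "Comments:2019-01-01" and record["ds_sk"] is "09:30:00:1a:2"
```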
The IAM role configured for use with the Base table must also contain permission to operate on the Delta table. In this example, the permissions policy for a Base table called Comments and a Delta table called ChangeLog is displayed:
My table columns look like name, email, phone, and pin.
I'm using Hasura for collecting user details.
Problem:
I want to hash the pin field using some hashing algorithm. So I decided to have a separate AWS Lambda function that converts a plain pin to a hashed one and writes it back to the same column.
I have set a trigger (when pin gets updated, it triggers the webhook). I successfully wrote the hashed pin to my database. But the problem is that after the Lambda updates the field, Hasura triggers the webhook again. This keeps going until I shut down my Hasura instance.
The Hasura documentation says the following:
In case of UPDATE, the events are delivered only if new data is
distinct from old data. The composite type comparison is used to
compare the old and new rows. If rows contain columns, which cannot be
compared using <> operator, then internal binary representation of
rows by Postgres is compared.
However, after the Lambda update, the data is the same as the old data, so why does it keep getting called?
I think you should use an action for this instead of a trigger. That way, the database only ever stores the hashed pin.
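A rough sketch of what the handler behind such an action might do is shown below: it hashes the pin before anything is written, so no update trigger is needed and the loop cannot occur. The action name, the assumption that the action's arguments are name, email, phone, and pin, and the choice of PBKDF2 from Python's standard library are all illustrative; the original post does not specify a hashing algorithm.

```python
import hashlib
import hmac
import os


def hash_pin(pin, salt=None, iterations=200_000):
    """Return a salted PBKDF2-SHA256 hash encoded as a single string."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_pin(pin, stored):
    """Check a plain pin against a string produced by hash_pin."""
    _, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", pin.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def handle_create_user_action(payload):
    """Turn an action request body into the row to insert.

    Assumes a hypothetical action whose arguments arrive under "input".
    """
    user = payload["input"]
    return {
        "name": user["name"],
        "email": user["email"],
        "phone": user["phone"],
        "pin": hash_pin(user["pin"]),  # the plain pin never reaches the table
    }


row = handle_create_user_action({
    "action": {"name": "createUser"},
    "input": {"name": "Ann", "email": "ann@example.com", "phone": "555-0100", "pin": "1234"},
})
```

If the trigger-based design has to stay, the handler would need some way to recognise an already hashed value (for example the pbkdf2_sha256$ prefix used here) and skip it, because every re-hash produces a new value and therefore a new update event.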
Can we write custom SQL with graphene-sqlalchemy to retrieve data? My output is not present in a database table directly but is built using CTEs.
Background: I'm trying to build a GraphQL backend in Python. The purpose is for this GraphQL backend layer to act as an API layer, so that if I have to switch between different data sources, all I need to do is change the connection string and everything else remains the same.
Summary:
My graphene models will have to be built off of database views and not database tables. I will only be querying the data and not performing any mutations.
I want my data resolvers to run dynamic queries (based on the inputs passed) on tables and then aggregate the data and return the results. Because these aggregations happen on the fly based on the inputs, I cannot pre-aggregate the data and store it in tables. So I want to execute this dynamic SQL against the tables.
Table Tasks has 3 columns: id, name, user_id
Table Issues has 4 columns: id, task_id, issue_status, user_id
So I will build views A and B (let's say) on these tables based on the inputs I get and then do aggregations on these views. So an ability to write custom SQL for my resolvers will help a lot. Is that possible in graphene-sqlalchemy?
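A resolver is an ordinary Python function, so in principle it can run any parameterised SQL and return the rows. The sketch below shows the kind of CTE-based aggregation described above, using the standard library's sqlite3 module so that it is self-contained; with SQLAlchemy the same statement could be passed to a session's execute method wrapped in text(). The lowercase table names, the function names, and the sample data are assumptions for illustration, and the graphene wiring itself is not shown.

```python
import sqlite3

# Two CTEs play the role of "views A and B": tasks for one user,
# and per-task issue counts for one status.
ISSUE_COUNTS_SQL = """
WITH user_tasks AS (
    SELECT id, name FROM tasks WHERE user_id = :user_id
),
task_issues AS (
    SELECT task_id, COUNT(*) AS issue_count
    FROM issues
    WHERE issue_status = :status
    GROUP BY task_id
)
SELECT t.id, t.name, COALESCE(i.issue_count, 0) AS issue_count
FROM user_tasks AS t
LEFT JOIN task_issues AS i ON i.task_id = t.id
ORDER BY t.id
"""


def resolve_issue_counts(conn, user_id, status):
    """Run the aggregation with bound parameters and return plain dicts."""
    rows = conn.execute(ISSUE_COUNTS_SQL, {"user_id": user_id, "status": status})
    return [{"task_id": r[0], "name": r[1], "issue_count": r[2]} for r in rows]


def make_demo_db():
    """Create an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER, name TEXT, user_id INTEGER)")
    conn.execute(
        "CREATE TABLE issues (id INTEGER, task_id INTEGER, issue_status TEXT, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO tasks VALUES (?, ?, ?)",
                     [(1, "design", 10), (2, "build", 10), (3, "ship", 20)])
    conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?)",
                     [(1, 1, "open", 10), (2, 1, "open", 10), (3, 1, "closed", 10)])
    return conn


results = resolve_issue_counts(make_demo_db(), user_id=10, status="open")
```

Binding the inputs as parameters rather than formatting them into the SQL string keeps the dynamic query safe from injection.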
I'm currently thinking about my DynamoDB database design. The goal is to get the best possible speed with a really large amount of data. To me, DynamoDB seems like a good option. Furthermore, I need to connect the table to Elasticsearch because I need geo-point queries (for example, show me all posts that are within a specific area).
Would the following approach make sense to you with regard to DynamoDB best practices? It is possible that a sort key such as 'posts' becomes a hot spot, but if I query only through Elasticsearch, there should be no problem, right? What would be the best solution?
My tables look like this:
So my thoughts are:
To query all users just select every row with the sort_key 'user'
To get a post with the creator, query the post_id and the sort_key 'post'
In a relational database, the two tables would look like this:
You can do something like using overloaded attributes (using the same columns for different things).
Then query with user_id and post_id=0 for user info, and with post_id=others for posts.
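As a sketch of the overloaded-attribute idea, the two access patterns could be expressed as DynamoDB Query parameters like this. It assumes user_id is a string partition key, post_id is a numeric sort key, the user's own record is stored under post_id 0, and real posts use post_id values greater than 0. The table name and function names are hypothetical.

```python
def user_info_query(table, user_id):
    """Query parameters for the single item that holds the user's own data."""
    return {
        "TableName": table,
        "KeyConditionExpression": "user_id = :u AND post_id = :zero",
        "ExpressionAttributeValues": {":u": {"S": user_id}, ":zero": {"N": "0"}},
    }


def user_posts_query(table, user_id):
    """Query parameters for every post of a user (all items with post_id > 0)."""
    return {
        "TableName": table,
        "KeyConditionExpression": "user_id = :u AND post_id > :zero",
        "ExpressionAttributeValues": {":u": {"S": user_id}, ":zero": {"N": "0"}},
    }


info_params = user_info_query("app_data", "user-1")
posts_params = user_posts_query("app_data", "user-1")
# With boto3, either dict could be passed as
#   boto3.client("dynamodb").query(**info_params)
```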
I don't think DynamoDB is a good option for that.
The limitations of DynamoDB are listed below.
A single Query or Scan call returns at most 1 MB of data
Wildcard search will give you poor performance
Aggregation will be a nightmare, since you are planning for a huge amount of data
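The 1 MB limit applies per call, so larger result sets are read page by page: each response may carry a LastEvaluatedKey, which is passed back as ExclusiveStartKey on the next call. A minimal sketch of that loop follows. The helper name query_all and the fake page function are illustrative; query_page stands for any callable with the shape of the DynamoDB Query API.

```python
def query_all(query_page, **params):
    """Collect items from every page of a paginated DynamoDB-style query."""
    items = []
    start_key = None
    while True:
        page_params = dict(params)
        if start_key is not None:
            # Resume where the previous page stopped.
            page_params["ExclusiveStartKey"] = start_key
        page = query_page(**page_params)
        items.extend(page.get("Items", []))
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            # No key returned means the last page has been read.
            return items


def fake_query_page(**params):
    """Stand-in for a real query call: serves three items over two pages."""
    if "ExclusiveStartKey" not in params:
        return {"Items": [{"id": 1}, {"id": 2}], "LastEvaluatedKey": {"id": 2}}
    return {"Items": [{"id": 3}]}


all_items = query_all(fake_query_page, TableName="posts")
```

Each page is still a separate round trip, so reading a very large result set this way stays slow and costly, which is the point the answer above is making.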