We are planning a multi-tenant application that uses the AWS Timestream database. Unfortunately, Timestream does not support resource-based policies. To get tenant isolation, we need to proxy each query through a Lambda function where we can control it (see below), and we put that Lambda behind an AppSync API. Ultimately, we want to run queries from a web frontend on behalf of a user who is associated with a certain tenant.
User -> Webapp -> AppSync -> Lambda -> Timestream
Every query will need to carry a tenant condition such as:
SELECT * FROM <database>.<table> WHERE tenantId = <tenantId>
Ideally, we could model the query in the webapp and send it to the backend. But since we need to protect against SQL injection attacks, I wonder whether there is any mechanism (a global SQL scope, proper validation, etc.) that lets us call the database in a secure, isolated manner.
Otherwise we would have to model each query on the backend, or pass a few parameters from the client into a fixed query on the backend. That is doable, but not as flexible as I would like.
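For illustration, the "fixed query with parameters" fallback could look roughly like the sketch below. It is only a sketch under assumptions: the function name, the column allowlist and the tenant ID format are all hypothetical, and the tenant ID is assumed to come from the authenticated identity in the Lambda rather than from client input. The client only picks columns and a limit; the tenant filter is always added on the backend.

```python
import re

# Hypothetical rules: adjust to your own naming scheme.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
TENANT_ID = re.compile(r"[A-Za-z0-9-]{1,64}")
ALLOWED_COLUMNS = frozenset({"time", "deviceId", "temperature"})  # hypothetical columns


def build_tenant_query(database, table, tenant_id, columns, limit=100):
    """Return a SELECT statement that is always scoped to a single tenant."""
    for name in (database, table):
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    # Reject anything that could break out of the quoted literal.
    if not TENANT_ID.fullmatch(tenant_id):
        raise ValueError("invalid tenant id")
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if not columns or unknown:
        raise ValueError(f"columns not allowed: {unknown}")
    if type(limit) is not int or not 1 <= limit <= 1000:
        raise ValueError("limit must be an integer between 1 and 1000")
    column_list = ", ".join(f'"{c}"' for c in columns)
    return (
        f'SELECT {column_list} FROM "{database}"."{table}" '
        f"WHERE tenantId = '{tenant_id}' LIMIT {limit}"
    )


print(build_tenant_query("metrics", "readings", "tenant-42", ["time", "temperature"], limit=10))
```

Because every value is either matched against a strict pattern or taken from an allowlist, nothing the client sends is copied into the statement unchecked. The price is exactly the loss of flexibility described above.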
@pfried What about giving each tenant its own Timestream table? Different tenants can then be assigned different IAM execution roles, so that each can only access its own table. Once you have this layer of protection, you can model your query in the client webapp without worrying about cross-tenant data access.
Notes:
The maximum number of tables for Timestream is 50,000 per AWS account
Alternatively, each tenant can have its own database, but the maximum number of databases for Timestream is 500 per AWS account.
See service quotas at: https://docs.aws.amazon.com/general/latest/gr/timestream.html
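As a rough sketch of the table-per-tenant idea, the snippet below builds an IAM policy document that limits a role to querying a single table. The function name and the example region, account ID, database and table names are made up for illustration. The set of actions is an assumption to check against the Timestream IAM documentation for your use case.

```python
import json


def tenant_table_policy(region, account_id, database, table):
    """Build an IAM policy that allows querying exactly one Timestream table."""
    table_arn = (
        f"arn:aws:timestream:{region}:{account_id}:"
        f"database/{database}/table/{table}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Queries are only allowed against this tenant's table.
                "Sid": "QueryOwnTable",
                "Effect": "Allow",
                "Action": ["timestream:Select"],
                "Resource": table_arn,
            },
            {
                # Endpoint discovery is not tied to a table, hence "*".
                "Sid": "DiscoverEndpoints",
                "Effect": "Allow",
                "Action": ["timestream:DescribeEndpoints"],
                "Resource": "*",
            },
        ],
    }


# Hypothetical tenant: one role per tenant would get one such policy.
policy = tenant_table_policy("eu-west-1", "123456789012", "metrics", "tenant_42")
print(json.dumps(policy, indent=2))
```

The Lambda would then assume the role that belongs to the caller's tenant before running the query, so a query that names another tenant's table is rejected by IAM rather than by application code.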
Related
We are a team of tens of data analysts. Our main data back-end is an Oracle database. We use personal schemas to do work where we don't need to collaborate with others and we would like to create schemas dedicated to projects where people need to collaborate.
The problem is that in Oracle, one schema is equivalent to one DB user. If we create a schema dedicated to a project, for the purpose of creating DB objects in the context of that project, there will be a single set of credentials (username + password) that needs to be shared by all team members. This has two inconveniences:
if people mistype the credentials, they can block the account for everyone;
it is no longer possible to monitor who did what for security/audit reasons, since everyone uses the same schema.
An alternative would be for only one person to use the schema user to create objects and grant other people privileges on those objects, but that can quickly become cumbersome.
Another alternative is to interact with the DB through R or Python but that means the credentials will be stored in some text file, which is bad for security.
As we see it, the ideal situation is if multiple personal DB users can create objects in the same schema, and if those objects are automatically available for that set of DB users. Is this totally impossible in Oracle? Is this impossible in any major DB? Is this requirement somehow flawed and as such, there is a good reason for why it is not available?
We could compare this collaboration in a DB schema to what commonly happens with people collaborating in a folder, using R, Python or other programming language for data analytics.
Thank you for your advice!
Maybe I'm missing something, but could you not just create one schema to be shared by all users and grant the required privileges to each individual user?
Each user authenticates with their own account and uses their own schema by default. To work in the shared schema, they just run the ALTER SESSION SET CURRENT_SCHEMA command.
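To make this concrete, here is a small sketch that generates the statements involved. The helper names and the schema, table and user names are hypothetical, and the name pattern is a simplification of Oracle's identifier rules. Note that ALTER SESSION SET CURRENT_SCHEMA only changes how unqualified object names are resolved; it does not grant anything, so the object privileges still have to be granted separately.

```python
import re

# Simplified pattern for unquoted Oracle names (assumption, not the full rule set).
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_$#]{0,29}")


def _check(name):
    if not _NAME.fullmatch(name):
        raise ValueError(f"invalid name: {name!r}")
    return name


def grant_statements(schema, tables, users,
                     privileges=("SELECT", "INSERT", "UPDATE", "DELETE")):
    """Return one GRANT statement per (table, user) pair in the shared schema."""
    _check(schema)
    privilege_list = ", ".join(privileges)
    return [
        f"GRANT {privilege_list} ON {schema}.{_check(table)} TO {_check(user)}"
        for table in tables
        for user in users
    ]


def use_schema_statement(schema):
    """Return the statement a user runs to resolve names in the shared schema."""
    return f"ALTER SESSION SET CURRENT_SCHEMA = {_check(schema)}"


# Hypothetical project schema and analysts.
for statement in grant_statements("PROJECT_X", ["SALES"], ["ALICE", "BOB"]):
    print(statement)
print(use_schema_statement("PROJECT_X"))
```

Each analyst still logs in with personal credentials, so auditing keeps working and a mistyped password only locks that one account.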
Using add-graphql-datasource, AppSync is supposed to generate a GraphQL endpoint based on the existing structure of an Aurora MySQL Serverless database. The database already has data in it.
See the Relational Databases section at https://aws-amplify.github.io/docs/cli-toolchain/graphql
However, the generated endpoint only has flat representations of the data, with none of the relations that exist in the database.
How can you use add-graphql-datasource to generate the relations as well?
As of writing, the add-graphql-datasource command does not support auto-generating logic for relations. You can use the add-graphql-datasource command to get started and then write your own resolver logic to implement the relations.
I'm building a multi-tenant SaaS application using Laravel 5.7 and Vue.js. Whenever a new client registers, the system creates a new database for them, and all table migrations and seeding are done via events.
But when the super admin manages the application, how do I load each client's data into the super admin panel? Or, say the super admin wants to make an announcement to all of their clients: how do I handle this in Laravel so that the announcement data gets synced to every database?
Maybe create a separate DB for the super admin. That DB would contain a clients table and whatever other data is needed to read from and write to the client DBs (client DB name, user, password for establishing a connection to that DB, etc.).
Alternatively, you can create a special table 'clients_announcments' in the super admin DB (or maybe in a new DB, common_clients_db) and use it for announcements, with the clients reading from it. Which option is better depends on how many clients you have and how efficient it needs to be.
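The first option (a registry of client databases plus a fan-out write) can be sketched as follows. This is not Laravel code: it uses Python with in-memory SQLite databases as stand-ins for the per-client databases, and a plain dict as a stand-in for the clients table in the super admin DB. All names are hypothetical.

```python
import sqlite3


def open_tenant_db():
    """Create a stand-in client database with an announcements table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE announcements (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    return conn


def broadcast(tenants, title, body):
    """Insert the announcement into every client DB; return the clients that failed."""
    failed = []
    for name, conn in tenants.items():
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO announcements (title, body) VALUES (?, ?)",
                    (title, body),
                )
        except sqlite3.Error:
            failed.append(name)  # retry these later instead of aborting the loop
    return failed


# Stand-in for the clients table: client name -> connection.
tenants = {"acme": open_tenant_db(), "globex": open_tenant_db()}
failed = broadcast(tenants, "Maintenance", "Planned downtime on Sunday.")
print("failed clients:", failed)
```

With many clients, the loop becomes slow and partial failures need retries, which is where the second option (one shared announcements table that every client reads) tends to be simpler.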
If you are building such a "big" SaaS system with many DBs, I also encourage you to strictly separate the backend from the frontend. That means the Laravel backend only provides a RESTful API (no HTML/CSS/JS code, only pure PHP), and the frontend is a separate Vue/Angular/React project that consumes that API. Keywords: "micro-service architecture", "RESTful API".
Current Setup:
SQL Server OLTP database
AWS Redshift OLAP database, updated from the OLTP database via SSIS every 20 minutes
Our customers only have access to the OLAP database
Requirement:
One customer requires some additional tables to be created and populated on a schedule. This can be done by aggregating the data already in AWS Redshift.
Challenge:
This is only for one customer, so I cannot use the core process for populating AWS. The process must be independent and is to be handed over to the customer, who does not use SSIS and doesn't wish to start. I was considering Data Pipeline, but it is not yet available in the market in which the customer resides.
Question:
What is my alternative? I am aware of numerous partners who offer ETL-like solutions, but this seems over the top. Ultimately, all I want to do is execute a series of SQL statements on a schedule with some form of error handling and alerting. Both the customer and management prefer not to use a bespoke app for this, hence the intended use of Data Pipeline.
To export data from AWS Redshift to another data source using Data Pipeline, you can follow a template similar to https://github.com/awslabs/data-pipeline-samples/tree/master/samples/RedshiftToRDS, which transfers data from Redshift to RDS. Instead of using RDSDatabase as the sink, you could add a JdbcDatabase (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-jdbcdatabase.html). The template https://github.com/awslabs/data-pipeline-samples/blob/master/samples/oracle-backup/definition.json provides more details on how to use JdbcDatabase.
There are many such templates at https://github.com/awslabs/data-pipeline-samples/tree/master/samples to use as a reference.
I do exactly the same thing as you, but I use the Lambda service to perform my ETL. One drawback of Lambda is that a function can run for a maximum of 5 minutes only (initially 1 minute).
So for ETL jobs that take longer than 5 minutes, I am planning to set up a PHP server in AWS from which I can run my SQL queries, scheduled at any time with the help of cron.
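Whichever scheduler ends up triggering the job (Lambda, cron or something else), the core of "a series of SQL statements with error handling and an alert" can be sketched as below. This is a minimal sketch under assumptions: run_batch is a hypothetical helper, conn is any DB-API 2.0 connection with implicit transactions (SQLite is used here only so the example runs on its own), and alert is whatever notification callback you plug in.

```python
import sqlite3


def run_batch(conn, statements, alert):
    """Run statements in order as one transaction; roll back and alert on failure."""
    position = 0
    cursor = conn.cursor()
    try:
        for position, sql in enumerate(statements, start=1):
            cursor.execute(sql)
        conn.commit()
        return True
    except Exception as exc:
        conn.rollback()  # leave the target tables as they were before the run
        alert(f"statement {position} failed: {exc}")
        return False


# Stand-in data: raw rows plus an aggregate table for the customer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.execute("CREATE TABLE daily_totals (customer TEXT, total REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("a", 1.0), ("a", 2.0)])
conn.commit()

alerts = []  # a real job might send an email or post to a queue instead
ok = run_batch(
    conn,
    [
        "DELETE FROM daily_totals",
        "INSERT INTO daily_totals SELECT customer, SUM(amount) FROM sales GROUP BY customer",
    ],
    alerts.append,
)
print(ok, alerts)
```

Running the whole batch in one transaction means a failure halfway through does not leave the customer's tables half-populated.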
I am not a DBA, and I am having a hard time conveying the need for an ACL that allows LDAP authentication from my Oracle APEX instance out to my domain controller at mycompany.net, port 3268.
Do I need to create the ACL and assign it to the APEX_0400 user, or to the parsing schema of the application I will be using LDAP authentication for? Is it the parsing schema that makes the LDAP request on behalf of the application, or the central APEX_0400 schema?
Or is the ACL something that is created at the instance level? That is, it may need an owner/principal defined to own the ACL, but the ACL applies to the Oracle instance as a whole and I don't need to grant it to individual Oracle schemas?
Any advice appreciated.
Depends.
If you use the built-in LDAP authentication scheme, the APEX user makes the calls. That means you'll need to grant privileges to the correct APEX_###### user; refer to the documentation for your version to find out which user this is (4.2 = APEX_040200, 5 = APEX_050000).
Also read: Enabling Network Services in Oracle Database 11g or Later (Apex 5 Docs)
If you want to make your own calls from the database, for example to run some additional queries over LDAP, you'll need to grant the privilege to that user as well (usually the parsing schema).
Either way: the network ACL was introduced to increase security. If you want to set the gates wide open and allow all network traffic, that's entirely your choice. I've worked at a firm with many schemas, and correctly assigning the ACL privileges never bothered me. It's usually a one-time operation, and all changes are tracked and kept in a repo.
If you ever get a security audit, would it fly? Where in the network is the database? What sort of apps are hosted, are they public? Don't remove security in favour of ease-of-use.