AWS proxy: RDS connection throttling while using Sequelize in Lambda - aws-lambda

I am using several AWS services together for one piece of functionality.
Summary: I have a Lambda that accesses a Postgres database in RDS. Instead of connecting directly to the database, the Lambda goes through an RDS Proxy endpoint, as architecturally advised. Generating the IAM token works fine, and the token is used as the password when creating the Sequelize connection.
Problem: Initially I was not using RDS Proxy. In that scenario I reused connections via the Lambda execution context and never closed them in the Lambda, and it worked fine (the Lambda connected directly to the database). After switching to the proxy, without closing connections, there is a big spike in the number of connections the proxy makes to the database, and it tests the limits under load: at 10 req/sec I'm seeing 90 connections.
If I close the connections in the Lambda, the count drops substantially, to under 20.
But I have nested database queries during a single Lambda execution, and it would be difficult to rewrite those.
Below is the code that creates the Sequelize connection:
const { Sequelize } = require('sequelize');

// latest IAM token, valid for 15 minutes, used as the password
let proxyToken = '***latest iam token with 15min validity***';

let additionalConnectionDetails = {
    host: process.env.PROXY_ENDPOINT,
    schema: 'schemaname',
    searchPath: 'searchpath',
    dialect: 'postgres',
    dialectOptions: {
        prependSearchPath: true,
        ssl: {
            require: true,
            rejectUnauthorized: false
        }
    },
    // pool: {
    //     max: 2,
    //     min: 1,
    //     acquire: 3000,
    //     idle: 0,
    //     evict: 120000
    // },
    // maxConcurrentQueries: 100
};

// sequelize_connection is presumably declared at module scope so warm invocations can reuse it
sequelize_connection = new Sequelize(dbCreds.app, dbCreds.userName, proxyToken, additionalConnectionDetails);
console.log('sequelize', sequelize_connection);
return sequelize_connection;
I tried using the connection pool options, but they didn't make much of a difference in Lambda.
How can I reduce the number of connections established without closing them in the Lambda? Any suggestions are appreciated. Thanks in advance.

Related

EdgeDB timeout with Go client

I'm having what seems to be a timeout issue with the Go client. Essentially I have a service running in AWS with two endpoints: one for a heartbeat (which does not touch EdgeDB) and one that lists all objects of a particular type in the DB ("select Foo { ... };").
When the service starts, both endpoints work as expected, but over the course of the day the list endpoint starts to hang forever (I'm assuming because the client is timing out while connecting to the DB?). The heartbeat route continues to function normally, so AWS doesn't mark the service as unhealthy. Is there a connection option I'm missing?
Here’s how I create the client:
log.Println("Connecting to EdgeDB...")
ctx := context.Background()
opts := edgedb.Options{
    Database:    os.Getenv("DB_DBNAME"),
    User:        os.Getenv("DB_USER"),
    Password:    edgedb.NewOptionalStr(os.Getenv("DB_PASSWORD")),
    Host:        os.Getenv("DB_HOST"),
    Port:        5656,
    Concurrency: uint(4),
    TLSOptions: edgedb.TLSOptions{
        SecurityMode: "insecure",
    },
}
db, dbErr := edgedb.CreateClient(ctx, opts)
return projectDb{DB: db}, db, dbErr
This timeout behavior eventually happens with a local EdgeDB instance as well, with no errors logged.

Load/stress test of a SPA with Hasura Cloud GraphQL as a backend and subscriptions

I'm trying to do a performance test on a SPA with a React frontend, deployed with Netlify. As a backend we're using Hasura Cloud GraphQL (standard tier, https://hasura.io/), where everything from the client goes directly through Hasura to the DB. The DB is Postgres hosted on Heroku (Standard 0 tier).
We're hoping to handle around 800 simultaneous users.
The problem is that I'm lost about how to do this, or whether I'm doing it correctly, given that most of our operations are subscriptions/mutations that I had to transform into queries. I tried running these tests with k6 and JMeter, but I'm not sure I'm doing them properly.
k6 test
At first, I did a quick search and collected around 10 subscriptions that are commonly used. Then I tried to create a performance test with k6 (https://k6.io/docs/using-k6/http-requests/), but I wasn't able to create a working subscription test, so I just transformed each subscription into a query and performed an http.post with this setup:
export const options = {
    stages: [
        { duration: '30s', target: 75 },
        { duration: '120s', target: 75 },
        { duration: '60s', target: 50 },
        { duration: '30s', target: 30 },
        { duration: '10s', target: 0 }
    ]
};

export default function () {
    var res = http.post(prod,
        JSON.stringify({
            query: listaQueries.GetDesafiosCursosByKey(
                keys.desafioCursoKey
            )}), params);
    sleep(1);
}
I did this for every query and ran each test individually. Unfortunately, the numbers I got were bad, and somehow our test environment was getting better times than production. (The only difference, AFAIK, is that we're using Hasura Cloud for production.)
I tried to implement WebSockets, but I couldn't get them working and configured for a stress/load test.
(screenshot: k6 results)
JMeter test
After that, I tried something similar with JMeter, but again I couldn't figure out how to set up a subscription test (after a while, I read in a blog that JMeter doesn't support them: https://qainsights.com/deep-dive-into-graphql-in-jmeter/), so I simply transformed all subscriptions into queries and tried to do the same, but the numbers I was getting were different and much higher than k6's.
(screenshots: JMeter query configs 1 and 2, and the thread group config)
Questions
I'm not sure if I'm doing this correctly, or whether transforming every subscription into a query and performing an HTTP request is a correct approach. (At least I know that those queries return the data correctly.)
Should I just increase the number of VUs/threads until I get a constant timeout, to simulate a stress test? Some tests were causing a GraphQL error on the website, and others were producing
"WARN[0059] Request Failed error="Post \"https://xxxxxxx-xxxxx.herokuapp.com/v1/graphql\": EOF""
in the k6 console.
Or should I give up on k6/JMeter and search for another tool to perform these tests?
Thank you in advance, and sorry for my English and explanation; I'm a complete newbie at this.
I'm not sure if I'm doing this correctly, or whether transforming every subscription into a query and performing an HTTP request is a correct approach. (At least I know that those queries return the data correctly.)
Ideally you would be using WebSocket as that is what actual clients will most likely be using.
For code samples, check out the answer here.
Here's a more complete example utilizing a main.js entry script with modularized subscription code in subscriptions/bikes-brands.js. It also uses the Httpx library to set a global request header:
// main.js
import { Httpx } from 'https://jslib.k6.io/httpx/0.0.5/index.js';
import { getBikeBrandsByIdSub } from './subscriptions/bikes-brands.js';

const session = new Httpx({
    baseURL: `http://54.227.75.222:8080`
});

const pauseMin = 2;
const pauseMax = 6;

export const options = {};

export default function () {
    session.addHeader('Content-Type', 'application/json');
    getBikeBrandsByIdSub(1);
}
// subscriptions/bikes-brands.js
import ws from 'k6/ws';

// WS endpoint moved into this module, since k6 modules don't share top-level variables
const wsUri = 'wss://54.227.75.222:8080/v1/graphql';

/* using string concatenation */
export function getBikeBrandsByIdSub(id) {
    const query = `
    subscription getBikeBrandsByIdSub {
        bikes_brands(where: {id: {_eq: ${id}}}) {
            id
            brand
            notes
            updated_at
            created_at
        }
    }
    `;

    const subscribePayload = {
        id: "1",
        payload: {
            extensions: {},
            operationName: "query",
            query: query,
            variables: {},
        },
        type: "start",
    };

    const initPayload = {
        payload: {
            headers: {
                "content-type": "application/json",
            },
            lazy: true,
        },
        type: "connection_init",
    };
    console.debug(JSON.stringify(subscribePayload));

    // start a WS connection
    const res = ws.connect(wsUri, initPayload, function (socket) {
        socket.on('open', function () {
            console.debug('WS connection established!');
            // send the connection_init:
            socket.send(JSON.stringify(initPayload));
            // send the subscription:
            socket.send(JSON.stringify(subscribePayload));
        });
        socket.on('message', function (message) {
            let messageObj;
            try {
                messageObj = JSON.parse(message);
            } catch (err) {
                console.warn('Unable to parse WS message as JSON: ' + message);
                return; // skip messages we cannot parse
            }
            if (messageObj.type === 'data') {
                console.log(`${messageObj.type} message received by VU ${__VU}: ${Object.keys(messageObj.payload.data)[0]}`);
            }
            console.log(`WS message received by VU ${__VU}:\n` + message);
        });
    });
}
Should I just increase the number of VUs/threads until I get a constant timeout, to simulate a stress test?
Timeouts and errors that only happen under load are signals that you may be hitting a bottleneck somewhere. Do you only see the EOFs under load? These are basically the server sending back incomplete responses or closing connections early, which shouldn't happen under normal circumstances.
My expectation is that your test should replicate real user activity as closely as possible. I doubt that real users will send requests to GraphQL directly; a well-behaved load test must replicate real-life application usage as closely as possible.
So I believe you should move to the HTTP protocol level and mimic the network footprint of a real browser instead of trying to come up with individual GraphQL queries.
With regard to the JMeter and k6 differences: it might be that k6 produces higher throughput on the same hardware when firing requests at maximum speed, as evidenced by the benchmark in the Open Source Load Testing Tools 2021 article. However, given that you're trying to simulate real users using real browsers, and real users don't hammer the application non-stop but need some time to "think" between operations, you should get the same number of requests from both load testing tools. If JMeter doesn't give you the load you want, make sure to follow JMeter Best Practices and/or consider running it in distributed mode.
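To illustrate the "think time" point, a minimal k6 sketch that paces each virtual user with a randomized pause might look like this (the endpoint, query, VU count, and 2-6 second pause range below are placeholders and assumptions, not your actual values):

import http from 'k6/http';
import { sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

export const options = { vus: 50, duration: '2m' };

export default function () {
    // placeholder endpoint and trivial query: substitute your real ones
    http.post(
        'https://example.com/v1/graphql',
        JSON.stringify({ query: '{ __typename }' }),
        { headers: { 'Content-Type': 'application/json' } }
    );
    // randomized think time so VUs pace themselves like real users
    sleep(randomIntBetween(2, 6));
}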

Sequelize with AWS RDS Proxy

I am trying to use AWS RDS Proxy in my Lambda to proxy our database (Aurora MySQL). I wasn't able to find any specific instructions for Sequelize, but it seemed like all I needed for RDS Proxy to work was to create a signer, use it to get my token, and then pass the token as my password to the Sequelize constructor:
const signer = new RDS.Signer({
    region: process.env.REGION,
    hostname: process.env.DB_PROXY_ENDPOINT,
    port: 3306,
    username: process.env.DB_PROXY_USERNAME,
});

const token = signer.getAuthToken({
    username: process.env.DB_PROXY_USERNAME,
});

const connection = new Sequelize(process.env.DB_DATABASE, process.env.DB_PROXY_USERNAME, token, {
    dialect: 'mysql',
    host: process.env.DB_HOSTNAME,
    port: process.env.DB_PORT,
    pool: {
        acquire: 15000,
        idle: 9000,
        max: 10
    },
});
The RDS Proxy is attached to my Lambda and I'm able to log the token, but as soon as I make a request against the database, my connection times out. Does anyone know if there is something I could be missing in this setup?
Here's how I connected from AWS Lambda to RDS Proxy using MySQL (in TypeScript):
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { Signer } from "@aws-sdk/rds-signer";
import { Sequelize } from "sequelize";

// other code

const signer = new Signer({
    hostname: host,
    port: port,
    region: region,
    username: username,
});

const sequelize = new Sequelize({
    username,
    host,
    port,
    dialect: "mysql",
    dialectOptions: {
        ssl: "Amazon RDS",
        authPlugins: {
            // generate a fresh IAM auth token for every new connection
            mysql_clear_password: () => () => signer.getAuthToken(),
        },
    },
});

// some more code
// some more code
Your connection timing out may be due to an authentication error, perhaps in the way you're passing in the token. I would double-check that your RDS Proxy IAM role has secretsmanager:GetSecretValue permission on the Secrets Manager resource holding the DB user credentials, as well as kms:Decrypt on the key used to encrypt the secret, and that your Lambda (or whatever context your code runs in) has the rds-db:connect permission.
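For reference, a minimal sketch of an rds-db:connect policy statement for the Lambda's role might look like this (the region, account ID, proxy resource ID, and DB user name below are placeholders):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "rds-db:connect",
            "Resource": "arn:aws:rds-db:us-east-1:123456789012:dbuser:prx-EXAMPLE1234/db_proxy_user"
        }
    ]
}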
NOTE:
This doesn't include the connection pooling options; I'm still trying to figure out how to optimize those. Check out the "Using sequelize in AWS Lambda" docs for a place to start.
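For what it's worth, the pool settings suggested in that guide look roughly like the sketch below; the numbers come from the guide's example and are starting points rather than tuned values, and the instance is kept at module scope so warm invocations reuse it:

const { Sequelize } = require("sequelize");

let sequelize = null; // module scope: reused across warm Lambda invocations

async function loadSequelize() {
    const instance = new Sequelize({
        // ...database, username, token, host, dialect as in the snippets above...
        pool: {
            max: 2,        // keep the per-container pool small; RDS Proxy does the real pooling
            min: 0,
            idle: 0,       // release connections as soon as they go idle
            acquire: 3000, // fail fast if a connection can't be acquired
            evict: 60000,  // assumption: roughly the Lambda timeout, in ms
        },
    });
    await instance.authenticate(); // surfaces auth/network errors immediately
    return instance;
}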

Suddenly, Heroku credentials to a PostgreSQL server give a FATAL password authentication error

Without changing anything in my settings, I can no longer connect to my PostgreSQL database hosted on Heroku. I can't access it from my application, and I'm given this error:
OperationalError: (psycopg2.OperationalError) FATAL: password authentication failed for user "<heroku user>" FATAL: no pg_hba.conf entry for host "<address>", user "<user>", database "<database>", SSL off
It says SSL off, but SSL is enabled, as I have confirmed in pgAdmin. When attempting to access the database through pgAdmin 4, I get the same problem: a fatal password authentication error for the user.
I have checked the credentials for the database on Heroku, but nothing has changed. Am I doing something wrong? Do I have to change something in pg_hba.conf?
Edit: I can see in the notifications on Heroku that the database was updated right around the time it stopped working for me. I am not sure whether I triggered the update, however.
Here's the notification center: (screenshot of the Heroku notification center)
In general, it isn't a good idea to hard-code credentials when connecting to Heroku Postgres:
Do not copy and paste database credentials to a separate environment or into your application’s code. The database URL is managed by Heroku and will change under some circumstances such as:
User-initiated database credential rotations using heroku pg:credentials:rotate.
Catastrophic hardware failures that require Heroku Postgres staff to recover your database on new hardware.
Security issues or threats that require Heroku Postgres staff to rotate database credentials.
Automated failover events on HA-enabled plans.
It is best practice to always fetch the database URL config var from the corresponding Heroku app when your application starts. For example, you may follow 12Factor application configuration principles by using the Heroku CLI and invoke your process like so:
DATABASE_URL=$(heroku config:get DATABASE_URL -a your-app) your_process
This way, you ensure your process or application always has correct database credentials.
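In a Node app, for example, a minimal sketch of that principle (assuming the pg package; Heroku Postgres requires SSL, hence the ssl option) could be:

const { Pool } = require('pg');

// read the URL from the environment at startup; never hard-code credentials
const pool = new Pool({
    connectionString: process.env.DATABASE_URL,
    ssl: { rejectUnauthorized: false }, // Heroku Postgres connections require SSL
});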
Based on the messages in your screenshot, I suspect you were affected by the second bullet. Whatever the cause, one of those messages explicitly says
Once it has completed, your database URL will have changed
I had the same issue. Thanks to @Chris, I solved it this way.
This file is in config/database.js (Strapi 3.1.3):
var parseDbUrl = require("parse-database-url");

if (process.env.NODE_ENV === 'production') {
    module.exports = ({ env }) => {
        var dbConfig = parseDbUrl(env('DATABASE_URL', ''));
        return {
            defaultConnection: 'default',
            connections: {
                default: {
                    connector: 'bookshelf',
                    settings: {
                        client: dbConfig.driver,
                        host: dbConfig.host,
                        port: dbConfig.port,
                        database: dbConfig.database,
                        username: dbConfig.user,
                        password: dbConfig.password,
                    },
                    options: {
                        ssl: false,
                    },
                },
            },
        };
    };
} else {
    // to use the default local provider you can return an empty configuration
    module.exports = ({ env }) => ({
        defaultConnection: 'default',
        connections: {
            default: {
                connector: 'bookshelf',
                settings: {
                    client: 'sqlite',
                    filename: env('DATABASE_FILENAME', '.tmp/data.db'),
                },
                options: {
                    useNullAsDefault: true,
                },
            },
        },
    });
}

Auto reload of gateway for schema changes in a federated service (Apollo GraphQL)

In Apollo Federation, I am facing this problem: the gateway needs to be restarted every time we make a change in the schema of any federated service in the service list.
I understand that every time the gateway starts, it collects all the schemas and composes the overall data graph. But is there a way this can be handled automatically, without restarting the gateway? Restarting takes down all the other, unaffected federated GraphQL services as well.
apollo-graphql, @apollo/gateway
There is an experimental poll interval you can use:
const gateway = new ApolloGateway({
    serviceList: [
        { name: "products", url: "http://localhost:4002" },
        { name: "inventory", url: "http://localhost:4001" },
        { name: "accounts", url: "http://localhost:4000" }
    ],
    debug: true,
    experimental_pollInterval: 3000
});
The code above will poll every 3 seconds.
I don't know of ways to automatically reload the gateway other than polling.
I made a reusable Docker image, and I will keep updating it if new ways to reload the service emerge. For now you can use the POLL_INTERVAL env var to periodically check for changes.
Here is an example using docker-compose:
version: '3'
services:
  a:
    build: ./a # one service implementing federation
  b:
    build: ./b
  gateway:
    image: xmorse/apollo-federation-gateway
    ports:
      - 8000:80
    environment:
      CACHE_MAX_AGE: '5' # seconds
      POLL_INTERVAL: '30' # seconds
      URL_0: "http://a"
      URL_1: "http://b"
You can use Express to refresh your gateway's schema. ApolloGateway has a load() function that goes out and fetches the schemas from the implementing services. This HTTP call could be made part of a deployment process if something automatic is needed. I wouldn't go with polling or anything too automatic: once the implementing services are deployed, the schema is not going to change until they're updated and deployed again.
import { ApolloGateway } from '@apollo/gateway';
import { ApolloServer } from 'apollo-server-express';
import express from 'express';

const gateway = new ApolloGateway({ ...config });
const server = new ApolloServer({ gateway, subscriptions: false });
const app = express();

app.post('/refreshGateway', (request, response) => {
    gateway.load();
    response.sendStatus(200);
});

server.applyMiddleware({ app, path: '/' });
app.listen();
Update: The load() function now checks for phase === 'initialized' before reloading the schema. A workaround might be to use gateway.loadDynamic(false), or possibly to set gateway.state.phase = 'initialized'. I'd recommend loadDynamic(), because changing state might cause issues down the road. I have not tested either of these solutions, since I'm not working with Apollo Federation at the time of this update.
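An untested sketch of the endpoint above adapted to that workaround (loadDynamic(false) is taken from the update; the async/error handling is an assumption, not verified behavior):

app.post('/refreshGateway', async (request, response) => {
    try {
        await gateway.loadDynamic(false); // re-fetch and recompose the schema
        response.sendStatus(200);
    } catch (err) {
        console.error('Gateway refresh failed:', err);
        response.sendStatus(500);
    }
});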
