How to work with a Postgres database using coroutines - python-asyncio

We're building a backend service on top of an asyncio framework (aiohttp in our case).
We're using aiopg to make queries to the database.
Since this is a single-threaded server, what is the correct pattern for making database queries?
1) The server should keep just one connection, and each coroutine should simply acquire a new cursor on that connection object?
or
2) The server should open a new connection for each coroutine?
Question about scenario (1): does it guarantee concurrency? Can the database run multiple async queries (await cursor.execute()) on one connection? I am pretty sure that in non-async mode, cursors on the same connection are simply serialized. Is this different when using async? If so, what is the maximum number of cursors that can be acquired simultaneously?
Question about scenario (2): is this a valid scenario for a single-threaded asyncio server at all?
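To make the two scenarios concrete, here is a minimal simulation that uses only the standard asyncio module. It does not call aiopg or talk to a real database: FakeConnection, FakePool and the sleep-based "query" are hypothetical stand-ins, and the lock encodes the assumption (the one the question suspects) that a single connection runs one query at a time. Under that assumption, many coroutines sharing one connection are serialized, while a small pool lets several queries be in flight at once.

```python
import asyncio


class FakeConnection:
    """Stand-in for a database connection that runs one query at a time."""

    def __init__(self):
        self._lock = asyncio.Lock()  # models per-connection serialization

    async def execute(self, query, delay=0.05):
        async with self._lock:
            await asyncio.sleep(delay)  # pretend the database is working
            return f"result of {query}"


class FakePool:
    """Hands out a fixed number of connections; extra callers wait."""

    def __init__(self, size):
        self._free = asyncio.Queue()
        for _ in range(size):
            self._free.put_nowait(FakeConnection())

    async def run(self, query):
        conn = await self._free.get()  # waits if every connection is busy
        try:
            return await conn.execute(query)
        finally:
            self._free.put_nowait(conn)  # always hand the connection back


async def timed(run_query, n=10):
    """Run n queries concurrently and return the elapsed wall time."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.gather(*(run_query(f"SELECT {i}") for i in range(n)))
    return loop.time() - start


async def compare():
    shared = FakeConnection()  # scenario 1: one connection for everyone
    pool = FakePool(size=5)    # pooled variant of scenario 2
    one_conn = await timed(shared.execute)
    pooled = await timed(pool.run)
    return one_conn, pooled


if __name__ == "__main__":
    one_conn, pooled = asyncio.run(compare())
    print(f"single shared connection: {one_conn:.2f}s")
    print(f"pool of 5 connections:    {pooled:.2f}s")
```

In this sketch the pool is a bounded middle ground between the two scenarios: coroutines neither share one connection nor open a fresh one each time. Whether a real driver behaves like the lock in FakeConnection should be checked against that driver's documentation.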

Related

Azure Functions - Java CosmosClientBuilder slow on initial connection

We're using Azure Cloud Functions with the Java SDK and connect to Cosmos DB using the following Java API:
CosmosClient client = new CosmosClientBuilder()
    .endpoint("https://my-cosmos-project-xyz.documents.azure.com:443/")
    .key(key)
    .consistencyLevel(ConsistencyLevel.SESSION)
    .buildClient();
This buildClient() starts a connection to CosmosDB, which takes 2 to 3 seconds.
The subsequent database queries using that client are fast.
Only this first setup of the connection is pretty slow.
We keep the CosmosClient as a static variable, so we can reuse it between multiple http requests that go to our function.
But once the function goes cold (Azure shuts it down after a few minutes of inactivity), the static variable is lost, and the client has to reconnect when the function starts up again.
Is there a way to make this initial connection to Cosmos DB faster?
Or do you think we need to increase the time a function stays online, if we need faster response times?
This is expected behavior, see https://youtu.be/McZIQhZpvew?t=850.
The first request a client does needs to go through a warm-up step. This warm-up consists of fetching the account information, container information, routing and partitioning information in order to know where to route the requests (as you experienced, further requests do not get this extra latency). Hence the importance of maintaining a singleton instance.
In some Functions plans (such as Consumption), instances are de-provisioned when there is no activity. Any existing instance of the client is then destroyed, so when a new instance is provisioned, its first request pays this warm-up cost.
There is currently no workaround that I'm aware of in the Java SDK, but this should not affect your P99 latency, since only the first request on a cold client is affected.
I hope this and the video help explain the reason.
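The singleton advice can be illustrated with a small sketch, written here in Python rather than Java and without any Cosmos SDK calls. SlowClient, get_client and handle_request are hypothetical names, and the sleep only simulates the warm-up step described above. The point is that the expensive construction happens once per process, and every later request reuses the same instance.

```python
import threading
import time


class SlowClient:
    """Hypothetical stand-in for an SDK client with an expensive warm-up."""

    builds = 0  # counts how many times the warm-up cost was paid

    def __init__(self, warmup_seconds=0.2):
        time.sleep(warmup_seconds)  # simulates fetching account/routing metadata
        SlowClient.builds += 1

    def query(self, text):
        return f"rows for {text!r}"


_client = None
_client_lock = threading.Lock()


def get_client():
    """Return the process-wide client, building it on first use only."""
    global _client
    if _client is None:
        with _client_lock:
            if _client is None:  # re-check: another thread may have built it
                _client = SlowClient()
    return _client


def handle_request(text):
    """Hypothetical request handler; every call shares the same client."""
    return get_client().query(text)


if __name__ == "__main__":
    for i in range(3):
        start = time.perf_counter()
        handle_request(f"request {i}")
        print(f"request {i}: {time.perf_counter() - start:.3f}s")
    print("client builds:", SlowClient.builds)
```

Only the first call pays the simulated warm-up. A cold start corresponds to a new process, where the module-level variable is empty again and the cost is paid once more.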

Single vs multiple database connection performance in a single workflow

I need some software architecture insight on this. Which of the following is more efficient in terms of resources (CPU, memory, database)?
Having a single database connection in one flow? (Close connection only after everything is done, including business logic)
Having multiple database connections in one flow? (Open then close the database connection immediately after the query is executed)
By business logic, I mean the step where data returned from the query is sanitized or manipulated according to business rules.
Attached is a diagram for visual representation.
UPDATE:
Programming language: PHP (Laravel for web app, Lumen for API)
Database: MySQL
Host: AWS
Opening a new connection between your runtime and your database requires the OS to create a new socket (if the runtime and the database are on the same system, this is probably a Unix domain socket; otherwise it is a TCP socket).
This socket creation has overhead of its own.
So I don't suggest opening and closing a connection after each database operation.
But there are specific conditions under which you may want to do that.
For example, if your database allows only a limited number of concurrent connections and you have thousands of long-running processes using those connections, the second approach may be the better fit.

Single connection with Oracle

In my project, developers use a single instance of Connection instead of a connection pool on an Oracle 12c database.
Using a pool is a common practice and Oracle itself documents it: http://docs.oracle.com/database/121/JJUCP/get_started.htm#JJUCP8120.
But JDBC 4.2 specification says:
13.1.1 Creating Statements
Each Connection object can create multiple Statement objects that may be used concurrently by the program.
Why use a pool of connections instead of a single connection if it's possible to use statements to manage concurrency?
The Oracle Database Dev Team strongly discourages using a single Connection in multiple threads. That almost always causes problems. As a general rule we will not consider any problem report that does this.
A Connection can have multiple Statements and/or ResultSets open at one time but only one can execute at a time. Connections are strictly single threaded and blocking. We try to prevent multiple threads from accessing a Connection simultaneously but there are a few odd cases where it is possible. These are all but guaranteed to cause problems. (It is not practical to fix or prevent these cases mostly for performance reasons. Just don't share a single Connection across multiple threads.)
If a client connects to the database via a dedicated server connection, then that database session will only serve that client. If the client connects to the database via a shared server connection, then a given database session may serve multiple clients over its lifetime.
This is documented here.
Also, at any one point in time, a session can only execute one thing at a time. If that wasn't the case, then running things in parallel wouldn't spawn multiple other sessions!
A single connection cannot execute several statements concurrently.
Yes, one connection can execute more than one statement. It is up to the programmer to choose between connection pooling and multiple statements when executing across more than one thread. Most databases on the market can handle multiple statements in one connection.
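The pooling approach recommended in the answers above can be sketched as follows. This is a Python illustration using the standard sqlite3 and queue modules, not JDBC or Oracle's UCP. ConnectionPool and double_all are hypothetical names, and the in-memory SQLite databases are only placeholders for real connections. The property being shown is that each thread borrows a connection, is its only user while it holds it, and then returns it.

```python
import queue
import sqlite3
import threading


class ConnectionPool:
    """Minimal blocking pool: each borrowed connection has exactly one user."""

    def __init__(self, size):
        self._free = queue.Queue()
        for _ in range(size):
            # check_same_thread=False lets a connection created here be used
            # by whichever single thread borrows it later.
            self._free.put(sqlite3.connect(":memory:", check_same_thread=False))

    def run(self, sql, params=()):
        conn = self._free.get()  # blocks until a connection is free
        try:
            return conn.execute(sql, params).fetchall()
        finally:
            self._free.put(conn)  # return it even if the query failed

    def close(self):
        while not self._free.empty():
            self._free.get_nowait().close()


def double_all(values, pool_size=3):
    """Double each value in its own thread, borrowing a pooled connection."""
    pool = ConnectionPool(pool_size)
    results = {}

    def worker(v):
        results[v] = pool.run("SELECT ? * 2", (v,))[0][0]

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    pool.close()
    return results


if __name__ == "__main__":
    print(double_all(range(8)))
```

With eight threads and three connections, at most three queries run at once and the remaining threads wait in the queue, so no connection is ever touched by two threads at the same time.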

Ruby Implicit DB Connection like ActiveRecord

I'm wondering if there is a way to connect to and disconnect from a DB implicitly using Ruby, like ActiveRecord does. I'm working on connecting to Cassandra and the team doesn't want to write the connection lines manually. They just want to give the table name (a.k.a. Column Family) to the class I want to build and move on from there.
I also want to ensure that I don't connect and disconnect every time when performing parallel insertions into a table (but disconnect after all the insertions are finished), since the reconnection process is very expensive and slows down the task.
I couldn't find any solution, and the version that my ex-coworker wrote connects and disconnects after each task, such as a select statement or a parallel insert. Any suggestions? A code example for any kind of DB would be much appreciated. Thanks.
NOTE:
To clarify, I am using a Cassandra driver. I'm not making an ActiveRecord-like ORM for Cassandra, but want to expose APIs where the devs don't have to remember to write the connecting and disconnecting lines.
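Since the question accepts an example for any kind of DB, here is one possible shape for such an API, sketched in Python with the standard sqlite3 module instead of Ruby and a Cassandra driver. The Table class, its methods and the id/payload columns are hypothetical. The idea is that callers only name a table: the connection is opened once when the block starts and closed once when it ends, no matter how many inserts happen in between.

```python
import sqlite3


class Table:
    """Hypothetical wrapper: callers name a table and never touch connections."""

    def __init__(self, name, database=":memory:"):
        # The table name is interpolated into SQL, so it must come from
        # trusted code, never from user input.
        self.name = name
        self._database = database
        self._conn = None

    def __enter__(self):
        # Connect once when the block starts.
        self._conn = sqlite3.connect(self._database)
        self._conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{self.name}" (id TEXT, payload TEXT)'
        )
        return self

    def __exit__(self, exc_type, exc, tb):
        # Disconnect once when the block ends; commit only if nothing failed.
        if exc_type is None:
            self._conn.commit()
        self._conn.close()
        self._conn = None
        return False  # do not swallow exceptions

    def insert_many(self, rows):
        self._conn.executemany(
            f'INSERT INTO "{self.name}" (id, payload) VALUES (?, ?)', rows
        )

    def select_all(self):
        return self._conn.execute(
            f'SELECT id, payload FROM "{self.name}" ORDER BY id'
        ).fetchall()


if __name__ == "__main__":
    with Table("events") as events:
        events.insert_many([("a", "1"), ("b", "2")])
        events.insert_many([("c", "3")])  # same connection, no reconnect
        print(events.select_all())
```

In Ruby the same shape is usually expressed with a method that takes a block and closes the connection in an ensure clause. The details of keeping a Cassandra session open across parallel inserts depend on the driver and are not covered by this sketch.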

Use an open connection for all Linq DataContexts

I'm developing a large-scale website, and when I was checking Linq queries with SQL Profiler I found that there are about 30 login/logout actions. I want to reduce these actions by using one open connection for all DataContexts, but I don't know how to do it. Do you have any suggestions?
That is a pretty bad idea. You are creating a large, scalable website, so let Linq-to-sql handle connections itself.
Linq-to-sql internally opens and releases connections in a way that is efficient for its usage. It uses the default ADO.NET connection pooling, so connections are correctly reused rather than opened for every single context.
Using a single connection for all contexts is exactly what would make your application non-scalable and broken. A single connection allows only a single transaction, so as soon as two requests want to make concurrent changes, each in its own transaction, your application will crash.
Do not share contexts and do not share connections. Let ADO.NET handle connection pooling and create a new context for every single request, or you can expect serious issues.

Resources