Multi-Region Aurora with write forwarding from a Spring Boot application

I have created a multi-region (two regions) Aurora cluster based on the MySQL engine. It has a primary cluster with 1 writer and 1 reader instance, and a secondary cluster with only reader instances.
As per the Aurora documentation here, running the following command on a reader instance in the secondary region forwards any write calls to the primary cluster's writer instance:
SET aurora_replica_read_consistency = 'session';
This works fine when I run it via the mysql client, and I can then use the secondary reader instance for write operations too.
Now, I have created an application with a separate instance in each of these two regions. The primary application instance is connected to the primary Aurora cluster, which has a writer and a reader, so I can do both read and write operations there.
For the secondary application instance, which is connected to the secondary Aurora cluster with only a reader instance, only read operations are working.
As a solution I created writeForward.sql in the Spring Boot application to set aurora_replica_read_consistency during application initialisation on the secondary cluster only. For this, I added the following property to the Parameter Store in the secondary region only:
spring.datasource.data=classpath:writeForward.sql
But this is somehow not working, and the secondary application is still not able to do any write operations.
I am looking for some help on how to handle this.

After reading through the Aurora documentation again, I realised that write forwarding from the secondary region only works when the aurora_replica_read_consistency parameter is set for each session:
Always set the aurora_replica_read_consistency parameter for any session for which you want to forward writes. If you don't, Aurora doesn't enable write forwarding for that session.
To make this possible, each DB connection made by the application needs to execute this command:
SET aurora_replica_read_consistency = 'session';
For a Spring Boot application using the Hikari connection pool, I used the following property, which automatically executes the above SQL command for each connection that is maintained with the DB:
spring.datasource.hikari.connection-init-sql=SET aurora_replica_read_consistency = 'session'
Details about the Hikari connection pool, including the connectionInitSql property, can be found here.
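If you configure Hikari programmatically instead of through properties, the same init SQL can be set on the HikariConfig. A minimal sketch, assuming a Spring @Configuration class guarded by an illustrative "secondary" profile; the JDBC URL and credentials below are placeholders:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

@Configuration
public class SecondaryRegionDataSourceConfig {

    @Bean
    @Profile("secondary")  // only the secondary-region instance needs write forwarding
    public DataSource secondaryDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://secondary-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com:3306/mydb"); // illustrative endpoint
        config.setUsername("app_user");      // illustrative credentials
        config.setPassword("app_password");
        // Enable Aurora write forwarding for every pooled connection.
        config.setConnectionInitSql("SET aurora_replica_read_consistency = 'session'");
        return new HikariDataSource(config);
    }
}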

Related

Does the Cassandra session get re-created when the WAS disconnects from the Cassandra cluster? (e.g. network issue)

I have tested with CircleCI and Docker (a Cassandra image),
and when I test, logs like the following appear:
"Tried to execute unknown prepared query. You may have used a PreparedStatement that was created with another Cluster instance."
But only one Cassandra cluster exists, so I can't understand what causes this error.
Could it happen because of a Cassandra connection issue?
Tests have sometimes failed because the WAS can't connect to the Cassandra cluster
(I think CircleCI causes this issue),
so my guess is:
the WAS can't connect to the Cassandra cluster during testing,
the session is re-created,
and the error logs about the PreparedStatement appear.
Is this possible?
If not, how does this error happen when just one Cassandra cluster is operating?
The "Cluster instance" being referred to in this message is the Cluster object in your app code:
Tried to execute unknown prepared query. You may have used a PreparedStatement \
that was created with another Cluster instance.
That error implies that you have multiple Cluster objects in your app. You should have only one instance that is shared throughout your app code; don't create multiple Cluster objects. Cheers!
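A minimal sketch of that single shared Cluster pattern with the DataStax 3.x Java driver (the contact point and keyspace name are illustrative):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public final class CassandraConnector {
    // One Cluster and one Session, shared by the whole application.
    private static final Cluster CLUSTER = Cluster.builder()
            .addContactPoint("127.0.0.1")        // illustrative contact point
            .build();
    private static final Session SESSION = CLUSTER.connect("my_keyspace"); // illustrative keyspace

    private CassandraConnector() { }

    public static Session session() {
        return SESSION;
    }

    // Prepare statements against the shared Session so they always belong to the same Cluster.
    public static PreparedStatement prepare(String cql) {
        return SESSION.prepare(cql);
    }
}

Statements prepared through this shared Session always belong to the same Cluster, which avoids the "unknown prepared query" error.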

How to use the DB2 read on standby feature

IBM DB2 has a feature for HADR databases called read on standby. This allows the standby database to accept connections for read-only queries (with certain restrictions on data types and isolation levels).
I am trying to configure this as a datasource in an application which runs on the WebSphere Liberty profile.
Previously, this application was using Automatic Client Re-route (which ensures that all connections are directed to the current primary).
However, I would like to configure it in such a way that SELECTs / read-only flows run on the standby database, and others run on the primary. This should also work when a takeover has been performed on the database (that is, the standby becoming primary and vice versa). The purpose of doing this is to spread the connections created across all available databases.
What is the correct way to do this?
Things I have attempted (assume my servers are dbserver1 and dbserver2):
Create 2 datasources, one with the DB URL of dbserver1 and the other with dbserver2.
This works until a takeover is performed and the roles of the servers are switched.
Create 2 datasources, one with the DB URL of dbserver1 (with the Automatic Client Re-route parameters) and the other with dbserver2 only.
With this configuration, the application works fine, but if dbserver2 becomes the primary then all queries are executed on it.
Set up HAProxy and use it to identify which is the primary and which is the standby. Create 2 datasources pointing to HAProxy.
When a takeover is carried out on the database, connection exceptions start to occur (not just at the time of takeover, but for some time following it).
The appropriate way is described in a whitepaper, "Enabling continuous access to read on standby databases using Virtual IP addresses", linked from the Db2 documentation for read on standby.
Virtual IP addresses are assigned to both roles, primary and standby, and are cataloged as database aliases. WebSphere or other clients would connect to either the primary or the standby datasource. When there is a takeover or failover, the virtual IP addresses are reassigned to the specific server, so the client continues to be routed to the desired server, e.g. the standby.
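On the application side you would still define two Liberty datasources, one per VIP alias, and route read-only flows to the standby one. A minimal sketch, with illustrative JNDI names:

import java.sql.Connection;
import java.sql.SQLException;
import javax.annotation.Resource;
import javax.sql.DataSource;

public class HadrRoutingDao {
    // Both datasources are defined in server.xml and point at the virtual IP
    // aliases, so they keep following the primary/standby roles after a takeover.
    @Resource(lookup = "jdbc/primaryVip")     // illustrative JNDI name
    private DataSource primary;

    @Resource(lookup = "jdbc/standbyVip")     // illustrative JNDI name
    private DataSource readOnlyStandby;

    // Read-only flows go to the standby; everything else goes to the primary.
    public Connection connectionFor(boolean readOnly) throws SQLException {
        return readOnly ? readOnlyStandby.getConnection() : primary.getConnection();
    }
}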

Avoid starting HiveThriftServer2 with created context programmatically

We are trying to use the ThriftServer to query data from Spark temp tables, in Spark 2.0.0.
First, we have created a SparkSession with Hive support enabled.
Currently, we start the ThriftServer with the sqlContext like this:
HiveThriftServer2.startWithContext(spark.sqlContext());
We have a Spark stream with a registered temp table "spark_temp_table":
StreamingQuery streamingQuery = streamedData.writeStream()
        .format("memory")
        .queryName("spark_temp_table")
        .start();
With beeline we are able to see the temp tables (running SHOW TABLES).
When we want to run a second job (with a second SparkSession) with this approach, we have to start a second ThriftServer on a different port.
I have two questions here:
Is there any way to have one ThriftServer on one port with access to all temp tables in different SparkSessions?
HiveThriftServer2.startWithContext(spark.sqlContext()); is annotated with @DeveloperApi. Is there any way to start the Thrift server with a context without doing it programmatically in the code?
I saw there is a configuration --conf spark.sql.hive.thriftServer.singleSession=true passed to the ThriftServer on startup (sbin/start-thriftserver.sh), but I don't understand how to define this for a job. I tried to set this configuration property in the SparkSession builder, but beeline didn't display the temp tables.
Is there any way to have one ThriftServer on one port with access to all temp tables in different SparkSessions?
No. The ThriftServer uses a specific session, and temporary tables can be accessed only within that session. This is why:
beeline didn't display temp tables.
when you start an independent server with sbin/start-thriftserver.sh.
spark.sql.hive.thriftServer.singleSession doesn't mean you get a single session for multiple servers. It uses the same session for all connections to a single Thrift server. A possible use case:
you start the Thrift server.
client1 connects to this server and creates temp table foo.
client2 connects to this server and reads foo.
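A minimal sketch of that single-server pattern, starting the Thrift server from a session created with the singleSession flag (the app name and the registered view are illustrative):

import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2;

public class SharedSessionThriftServer {
    public static void main(String[] args) throws InterruptedException {
        SparkSession spark = SparkSession.builder()
                .appName("shared-session-thrift")                            // illustrative name
                .config("spark.sql.hive.thriftServer.singleSession", "true")
                .enableHiveSupport()
                .getOrCreate();

        // Any temp view registered in this session is visible to JDBC clients of the
        // server below; with singleSession enabled, a temp table created by client1
        // (e.g. foo) is readable by client2 as well.
        spark.range(10).createOrReplaceTempView("spark_temp_table");

        // Expose this single session over JDBC on the Thrift server port.
        HiveThriftServer2.startWithContext(spark.sqlContext());

        // Keep the driver alive so the Thrift server keeps serving.
        Thread.currentThread().join();
    }
}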

Load balancing settings via Spring AWS libraries for multiple RDS read-only replicas

If there are multiple read replicas, where can load balancing related settings be specified when using the Spring AWS libraries?
Read replicas have their own endpoint address, similar to the original RDS instance. Your application will need to take care of using all the replicas and switching between them. You'd need to introduce this algorithm into your application so it automatically detects which RDS instance it should connect to in turn. The following link can help:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.Replication.html#Overview.ReadReplica
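The answer above leaves the replica-selection algorithm to the application; a minimal round-robin sketch over replica endpoints follows (the endpoint URLs, credentials, and the Hikari pool are illustrative choices; Spring's AbstractRoutingDataSource is another common way to wire this into a Spring app):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import com.zaxxer.hikari.HikariDataSource;

public class ReadReplicaRouter {
    private final List<DataSource> replicas;
    private final AtomicInteger next = new AtomicInteger();

    // Each JDBC URL points at one read replica endpoint (illustrative configuration).
    public ReadReplicaRouter(List<String> replicaJdbcUrls, String user, String password) {
        this.replicas = replicaJdbcUrls.stream().map(url -> {
            HikariDataSource ds = new HikariDataSource();
            ds.setJdbcUrl(url);
            ds.setUsername(user);
            ds.setPassword(password);
            return (DataSource) ds;
        }).collect(Collectors.toList());
    }

    // Simple round-robin: each call hands back the next replica's pool.
    public DataSource nextReplica() {
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }
}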

Solutions for a secure distributed cache

Problem: I want to cache user information such that all my applications can read the data quickly, but I want only one specific application to be able to write to this cache.
I am on AWS, so one solution that occurred to me was a version of memcached with two ports: one port that accepts read commands only and one that accepts reads and writes. I could then use security groups to control access.
Since I'm on AWS, if there are solutions that use out-of-the box memcached or redis, that'd be great.
I suggest you use ElastiCache with one open port at 11211 (Memcached), then create an EC2 instance and set your security group so only this server can access your ElastiCache cluster. Use this server to filter your applications, so only one specific application can write to it. You control access with security groups, scripts or iptables. If you are not using a VPC, then you can use a cache security group.
I believe you can accomplish this using Redis (instead of Memcached), which is also available via ElastiCache. Once the instance has been created, you will want to create a replication group and associate it with the cache cluster you already launched.
You can then add instances to the replication group. Instances within the replication group are simply replicated from the master cache cluster (a single Redis instance) and so are (by default) read-only.
So, in this setup, you have a master node (single endpoint) that you can write to and as many read nodes (multiple endpoints) as you would like.
You can take security a step further and assign different routing rules to the replication group (via the VPC) so the applications reading data do not have access to the master node (the only one that can write data).
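A rough sketch of how the writer and the read-only applications could then talk to the replication group with the Jedis client (the endpoint hostnames and keys are illustrative):

import redis.clients.jedis.Jedis;

public class CacheEndpoints {
    public static void main(String[] args) {
        // Primary (writer) endpoint: only the writer application's security group allows access here.
        try (Jedis writer = new Jedis("my-cache.xxxxxx.ng.0001.use1.cache.amazonaws.com", 6379)) {
            writer.hset("user:42", "name", "Alice");
        }

        // Reader endpoint of the replication group: the read-only applications connect here.
        try (Jedis reader = new Jedis("my-cache-ro.xxxxxx.ng.0001.use1.cache.amazonaws.com", 6379)) {
            System.out.println(reader.hget("user:42", "name"));
        }
    }
}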
