H2 Database Cluster Recovery

I have a Spring MVC application that runs on Apache Tomcat and uses an H2 database.
The infrastructure consists of two application servers (let's call them A and B), each running its own Tomcat server. I also have H2 database clustering in place.
On one system (A) I ran the following command
java org.h2.tools.Server -tcp -tcpPort 9101 -tcpAllowOthers -baseDir server1
On the other (B) I ran
java org.h2.tools.Server -tcp -tcpPort 9101 -tcpAllowOthers -baseDir server2
I then created the cluster from machine A
java org.h2.tools.CreateCluster
-urlSource jdbc:h2:tcp://IpAddrOfA:9101/~/test
-urlTarget jdbc:h2:tcp://IpAddrOfB:9101/~/test
-user sa
-serverList IpAddrOfA:9101,IpAddrOfB:9101
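Once the cluster is created, the application connects with a JDBC URL that lists both servers (the same list passed to -serverList). A minimal sketch, with placeholder IPs and assuming the H2 jar is on the classpath:
import java.sql.Connection;
import java.sql.DriverManager;

public class ClusterConnect {
    public static void main(String[] args) throws Exception {
        // Both cluster members go into the URL; H2 writes to both and
        // keeps working if one of them goes down.
        String url = "jdbc:h2:tcp://IpAddrOfA:9101,IpAddrOfB:9101/~/test";
        // User/password as configured; an empty password is the H2 default.
        try (Connection conn = DriverManager.getConnection(url, "sa", "")) {
            System.out.println("Connected to cluster: " + conn.getMetaData().getURL());
        }
    }
}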
When one of the servers goes down, the documentation says you have to delete the database that failed, restart that server, and rerun CreateCluster.
I have the following questions:
1. If both servers are down, how can I ascertain which database to delete so that I can restart that server and rerun CreateCluster?
2. CreateCluster takes a urlSource and a urlTarget. Do I need to give them the same values as were previously given, or can I interchange them without any side effect?
3. Do I need to run the CreateCluster command from both machines? If so, do I need to interchange the urlSource and urlTarget?
4. Is there a way to know whether both, one or none of the servers are running? I want both IP addresses to be returned if both are up, one IP address if only one is up, and none if all are down.

If both servers are down, how can I ascertain which database to delete
The idea of the cluster is that a second database adds redundancy to the system. Let's assume a server fails once every 100 days (hard disk failure, power failure or so); that is roughly 99% availability if each failure costs about a day of downtime. This might not be good enough for you, which is why you may want to use a cluster with two servers. That way, even if each server fails once every 100 days, the chance of both failing at the same time is very low. Ideally, the risks of failure are completely independent, which would mean the risk of both failing at the exact same time is 1 in 10,000 (100 times 100), giving you 99.99% availability. So the risk that both servers are down at the same time is exactly what the cluster feature is meant to prevent.
CreateCluster contains a urlSource and a urlTarget. Do I need to give them the same values as were previously given?
It depends on which one you want to use as the source and which one as the target. The database at the source URL is copied to the target, so the source should be the database whose data you want to keep.
Do I need to run the CreateCluster command from both machines?
No.
Is there a way to know whether both, one or none of the servers are running?
You could try to open a TCP/IP connection to them to check whether the listener is running. What I usually do is run telnet <server> <port> on the command line.
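The same check can be scripted. A small sketch in Java that reports which of the two listeners (placeholder host names) accept a TCP connection:
import java.net.InetSocketAddress;
import java.net.Socket;

public class CheckServers {
    public static void main(String[] args) {
        String[] servers = {"IpAddrOfA", "IpAddrOfB"};   // placeholders for your two hosts
        for (String host : servers) {
            try (Socket socket = new Socket()) {
                // 2-second timeout; success only means the TCP listener is up
                socket.connect(new InetSocketAddress(host, 9101), 2000);
                System.out.println(host + " is up");
            } catch (Exception e) {
                System.out.println(host + " is down: " + e.getMessage());
            }
        }
    }
}
If you are already connected, the H2 clustering documentation also suggests checking which nodes are configured with SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE NAME='CLUSTER'.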

Related

Wildfly 11 - High Availability - Single deploy on slave

I have two servers in HA mode. I'd like to know whether it is possible to deploy an application on the slave server only. If yes, how do I configure that in JGroups? I need to run a specific program that accesses the master database, but I would rather not run it on the master server, to avoid overhead there.
JGroups itself does not know much about WildFly or its deployments; it only creates a communication channel between nodes. I don't know where you got the notion of master/slave, but JGroups always has a single* node marked as the coordinator. You can check the membership through Channel.getView().
However, you still need to deploy the app on both nodes and just make it inactive if this is not its target node.
*) If there's no split-brain partition or similar rare/temporary issue
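A rough sketch of that membership check (the cluster name is an assumption, and in WildFly you would normally use the channel the server already manages rather than create a new one):
import org.jgroups.JChannel;
import org.jgroups.View;

public class WhoIsCoordinator {
    public static void main(String[] args) throws Exception {
        JChannel ch = new JChannel();            // default protocol stack
        ch.connect("my-cluster");                // hypothetical cluster name
        View view = ch.getView();
        // The first member of the view acts as the coordinator.
        System.out.println("members:     " + view.getMembers());
        System.out.println("coordinator: " + view.getMembers().get(0));
        System.out.println("this node:   " + ch.getAddress());
        ch.close();
    }
}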

How to get the dmgr host and port number dynamically using Jython and JACL in IBM WebSphere Application Server on Linux?

I need to get the Dmgr host and port dynamically in order to sync the node.
Would AdminControl.getHost() and AdminControl.getPort() do that?
I am not sure whether they work. Thanks in advance.
Would something like this work instead at the end of your administrative script?
AdminConfig.save()
if (NDInstall == "ND"):
    nodeSync = AdminControl.completeObjectName("type=NodeSync,node=" + nodeLongName + ",*")
    AdminControl.invoke(nodeSync, "sync")
A save and sync by itself doesn't require nodes or application servers to be down. Depending on the nature of the change you may need to recycle application servers to bring the change into effect. One feature that's in ND to help with high availability is the ability to ripple start servers in a cluster. This way one or more application servers stay up to service requests while a change is 'rippled' into effect.
A cluster is also an administrative unit that can be stopped and started. You can arrange your clusters however you want across your nodes.

Splitting a Redis RDB file

Currently I'm using Redis on an EC2 machine with 60 GB of RAM and no slaves, but as my data grows I will need more memory.
I was thinking of migrating to two 60 GB machines and splitting the existing data between them.
Is there any tool for splitting the RDB file? I haven't found anything specifically designed for this.
If you want to split your data, you will need a way to shard your keys so that some keys are written to/read from server A and the others from server B.
There is no way to split an RDB file, but there is something you can do to achieve what you want.
First, start a Redis instance on your second server and configure it as a slave of your current server, but set the parameter slave-read-only to false. This will cause the slave to synchronize and read all of your Redis data from the master. So far you only have a slave with all the data, but now we get to the interesting bit.
Then you need to decide on a sharding strategy. Some Redis clients do this for you; for example, the official Ruby client knows how to handle it if you configure it. You will need to configure your client so keys are sharded to A and B (or, alternatively, use twemproxy so the clients won't know about the different servers and twemproxy will take care of it).
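As a minimal illustration of client-side shard selection (host names and the key scheme are made up; a real client such as Jedis or the Ruby client would sit behind this choice):
import java.util.List;

public class ShardPicker {
    // The two Redis servers after the split (placeholders).
    private static final List<String> SHARDS = List.of("serverA:6379", "serverB:6379");

    static String shardFor(String key) {
        // Hash only the logical "owner" part of the key (everything before the
        // last ':') so related keys such as user:42:profile and user:42:session
        // end up on the same server.
        int cut = key.lastIndexOf(':');
        String hashed = cut > 0 ? key.substring(0, cut) : key;
        int bucket = Math.floorMod(hashed.hashCode(), SHARDS.size());
        return SHARDS.get(bucket);
    }

    public static void main(String[] args) {
        System.out.println("user:42:profile -> " + shardFor("user:42:profile"));
        System.out.println("user:42:session -> " + shardFor("user:42:session"));
    }
}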
Once you have the clients configured, deploy the new clients to production and immediately reconfigure the slave so it is no longer a slave. You can do this directly on the slave server with SLAVEOF NO ONE (don't forget to persist the configuration using CONFIG REWRITE), or you can change the slave's config file and restart it, whichever is more convenient for you. Since the slave is configured with slave-read-only false, it will accept writes even while in slave mode. This means that if you make the change directly from redis-cli you can go from slave to a sharded stand-alone Redis without restarting, which I think is quite cool.
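If you drive that promotion from Java rather than redis-cli, a hedged sketch with the Jedis client (host name is a placeholder) might look like:
import redis.clients.jedis.Jedis;

public class PromoteSlave {
    public static void main(String[] args) {
        try (Jedis node = new Jedis("serverB", 6379)) {
            node.slaveofNoOne();                      // stop replicating; keep the data already synced
            node.configSet("slave-read-only", "no");  // already false here, but make it explicit
            // Persist the change (CONFIG REWRITE, or edit redis.conf) so a
            // restart does not turn the node back into a slave.
        }
    }
}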
Be aware that once you shard, you will have to be careful with MULTI commands and Lua scripts. If you are using twemproxy you won't be able to use those commands, but if you are sharding on the client side you will still be able to use MULTI or Lua. Just be careful to use a sharding mechanism in which all the related keys stay on the same server.
Step 1: install https://github.com/leonchen83/redis-rdb-cli/
Step 2: create a config file that sets the splitting condition.
Content of nodes.conf:
34b6e1dfb871ad30398ef5edd6b9a954617e6ec1 127.0.0.1:10003#20003 master - 0 1531044047088 3 connected 8193-16383
89d020a7e727e81f003836207902ae26fe05fd51 127.0.0.1:10001#20001 myself,master - 0 1531044047000 1 connected 0-8192
vars currentEpoch 6 lastVoteEpoch 0
Step 3: run rdt -s your-dump.rdb -c nodes.conf -o /path/to
After step 3, two RDB files are generated in the /path/to directory: 34b6e1dfb871ad30398ef5edd6b9a954617e6ec1.rdb and 89d020a7e727e81f003836207902ae26fe05fd51.rdb.

Celery, Resque, or custom solution for processing jobs on machines in my cloud?

My company has thousands of server instances running application code - some instances run databases, others are serving web apps, still others run APIs or Hadoop jobs. All servers run Linux.
In this cloud, developers typically want to do one of two things to an instance:
Upgrade the version of the application running on that instance. Typically this involves a) tagging the code in the relevant subversion repository, b) building an RPM from that tag, and c) installing that RPM on the relevant application server. Note that this operation would touch four instances: the SVN server, the build host (where the build occurs), the YUM host (where the RPM is stored), and the instance running the application.
Today, a rollout of a new application version might be to 500 instances.
Run an arbitrary script on the instance. The script can be written in any language provided the interpreter exists on that instance. E.g. The UI developer wants to run his "check_memory.php" script which does x, y, z on the 10 UI instances and then restarts the webserver if some conditions are met.
What tools should I look at to help build this system? I've seen Celery, Resque, and delayed_job, but they seem to be built for churning through a lot of tasks. This system is under much less load: maybe on a big day a thousand upgrade jobs might run, plus a couple hundred executions of arbitrary scripts. Also, they don't support tasks written in arbitrary languages.
How should the central "job processor" communicate with the instances? SSH, message queues (which one), something else?
Thank you for your help.
NOTE: this cloud is proprietary, so EC2 tools are not an option.
I can think of two approaches:
Set up password-less SSH on the servers, keep a file that contains the list of all machines in the cluster, and run your scripts directly over SSH, for example: ssh user@foo.com "ls -la". This is the same approach used by Hadoop's cluster startup and shutdown scripts. If you want to assign tasks dynamically, you can pick nodes at random (see the sketch after these two approaches).
Use something like Torque or Sun Grid Engine to manage your cluster.
The package installation can be wrapped inside a script, so you just need to solve the second problem, and use that solution to solve the first one :)
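As a rough sketch of the first approach (the host-list file name, user, and command are assumptions), something like this could fan a command out over SSH:
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class SshFanOut {
    public static void main(String[] args) throws Exception {
        // One host name per line; cluster-hosts.txt is a hypothetical file.
        List<String> hosts = Files.readAllLines(Paths.get("cluster-hosts.txt"));
        for (String host : hosts) {
            Process p = new ProcessBuilder("ssh", "user@" + host, "ls -la")
                    .inheritIO()       // show remote output on this console
                    .start();
            p.waitFor();               // sequential; use threads for parallel runs
        }
    }
}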

EC2 database server failover strategy

I am planning to deploy my web app to EC2. I have several webserver instances. I have 1 primary database instance. I have 1 failover database instance. I need a strategy to redirect the webservers to the failover database instance IP when the primary database instance fails.
I was hoping I could use an Elastic IP in my connection strings. But, the webservers are not able to access/ping the Elastic IP. I have several brute force ideas to solve the problem. However, I am trying to find the most elegant solution possible.
I am using all .Net and SQL Server. My connection strings are encrypted.
Does anybody have a strategy for failing over a database instance in EC2 using some form of automation or DNS configuration?
Please let me know.
http://alestic.com/2009/06/ec2-elastic-ip-internal tells you how to use the Elastic IP's public DNS name.
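In short, from inside EC2 that public DNS name resolves to the instance's internal address, so the connection strings can point at the DNS name instead of a raw IP. A tiny check you could run from a webserver instance (the hostname below is a placeholder):
import java.net.InetAddress;

public class ResolveElasticIp {
    public static void main(String[] args) throws Exception {
        // From inside EC2 this should print the private address of the DB instance.
        InetAddress addr = InetAddress.getByName("ec2-203-0-113-10.compute-1.amazonaws.com");
        System.out.println(addr.getHostAddress());
    }
}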
Haven't used EC2 but surely you need to either:
(a) put your front end into some custom maintenance mode that you define while you switch the IP over, and have the front end perform the steps required to manage potential data-integrity and data-loss issues (from the previous server going down and the new server coming up) when it enters and leaves that maintenance mode;
OR, for a zero down-time system:
(b) design the system at the object/relational and transaction levels from the ground up to support zero-downtime failover. It's not something you can bolt on quickly to just any application.
(c) use some database support for automatic failover. I don't know whether SQL Server failover support suitable for your application exists or is appropriate here; I suggest adding a "sql-server" tag to the question to reach the right audience.
If Elastic IPs don't work (which sounds odd, to say the least; shouldn't you talk to EC2 about that?), you may have to be able to tell your front end which new database IP to use at the same time as telling it to go from maintenance mode back to normal mode.
If you're willing to shell out a bit of extra money, take a look at Rightscale's tools; they've built custom server images and supporting tools that handle database failover (among many other things). This link explains how to do it with MySQL, so will hopefully show you some principles even though it doesn't use SQL Server.
I always thought there was this possibility in the connection string.
This is taken (but not yet tested) from How to add Failover Partner to a connection string in VB.NET:
If you connect with ADO.NET or the SQL Native Client to a database that is being mirrored, your application can take advantage of the driver's ability to automatically redirect connections when a database mirroring failover occurs. You must specify the initial principal server and database in the connection string, as well as the failover partner server.
Data Source=myServerAddress;Failover Partner=myMirrorServerAddress;
Initial Catalog=myDataBase;Integrated Security=True;
There are of course many other ways to write the connection string using database mirroring; this is just one example pointing out the failover functionality. You can combine this with the other connection string options available.
To broaden gareth's answer, cloud management software usually solves this type of problem. RightScale is one such tool, but you can also try enStratus or Scalr (disclaimer: I work at Scalr). These tools provide failover solutions like:
Backups: you can schedule automated snapshots of the EBS volume containing the data
Fault-tolerant database: in the event of a failure, a slave is promoted to master, and the mounted storage is switched over if the failed master and the new master are in the same AZ, or a snapshot of the volume is taken otherwise
If you want to build your own solution, you could replicate the process detailed below that we use at Scalr:
1. Is there a slave in the same AZ? If so, promote it, switch the EBS volumes (which are limited to a single AZ), switch any Elastic IP you might have, and reconfigure replication of the remaining slaves.
2. If not, is there a slave fully replicated in another AZ? If so, promote it, then do the above.
3. If there are no slaves in the same AZ and no slave fully replicated in another AZ, create a snapshot of the master's volume, use this snapshot to create a new volume in an AZ where a slave is running, and then do the above.
