Master Slave Feature in Apache Geode - high-availability

Apache Geode supports clustering across sites using multi-site WAN replication, but are there any other configuration changes that would let us use an HA or master-slave setup?

You can configure WAN replication to only happen in one direction - from a primary (or master) to a secondary site. You would do this by only creating a gateway-sender in your primary site.
However, it's usually best to configure Geode for active-active WAN replication, where both sites replicate to each other. That way, if you fail over to your secondary site, it can queue up events to replicate back to your primary site when it comes back online.
A common pattern is to configure Geode for active-active WAN between two sites, but then designate a primary site for your entire application stack (web servers, Geode, other services, etc.).
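As a rough sketch of the one-directional (primary-to-secondary) setup described above, using Geode's Java API; the sender ID, distributed-system IDs, locator addresses, and region name below are placeholders, not values from the original question:

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.wan.GatewaySender;
import org.apache.geode.cache.wan.GatewaySenderFactory;

public class PrimarySite {
    public static void main(String[] args) {
        // Primary site: distributed-system-id 1, replicating to the secondary site (id 2).
        Cache cache = new CacheFactory()
                .set("distributed-system-id", "1")
                .set("locators", "primary-locator[10334]")          // placeholder locator
                .set("remote-locators", "secondary-locator[10334]") // placeholder remote locator
                .create();

        // Only the primary site defines a gateway-sender, so events flow one way.
        GatewaySenderFactory senderFactory = cache.createGatewaySenderFactory();
        senderFactory.setParallel(true);
        GatewaySender toSecondary = senderFactory.create("to-secondary", 2);

        // Attach the sender to the region whose events should be replicated.
        Region<String, String> region = cache.<String, String>createRegionFactory(RegionShortcut.PARTITION)
                .addGatewaySenderId(toSecondary.getId())
                .create("example-region");

        // The secondary site would create only a gateway-receiver, e.g.:
        //   cache.createGatewayReceiverFactory().setStartPort(5000).setEndPort(5500).create();
        region.put("key", "value"); // this update is queued and shipped to the secondary site
    }
}
```

For the active-active pattern, both sites would define both a gateway-sender and a gateway-receiver.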

Related

How to configure an Infinispan cluster with nodes on different machines?

I am pretty new to this stuff. My requirement is to use an Infinispan 8.1 clustered cache in domain mode. I followed the basic pre-configured files and copied the server project to two different machines on the same network. I started them, and they discovered each other. Each machine creates 2 nodes, for a total of 4 nodes. When I used a client program (a HotRod client) to access them, I got the cache from the nodes.
Now that I have moved one machine to another network, it has stopped working. What could be the workaround?
Also, I have one more doubt: in the HotRod client API, I can add clusters, servers, etc. Shouldn't that be done on the server side, with the client accessing it through a single logical name or IP address/hostname?
Also, how can I start the Infinispan server as a service? I do not want to wrap it inside any web application.
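For reference, a minimal sketch of the HotRod client configuration mentioned above; the hostnames, ports, cluster name, and cache name are placeholders, and the client-side cluster definition assumes a client version that supports it (roughly 8.1+):

```java
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class HotRodClientExample {
    public static void main(String[] args) {
        ConfigurationBuilder builder = new ConfigurationBuilder();
        // Default cluster: the servers the client contacts initially (placeholder addresses).
        builder.addServer().host("10.0.0.1").port(11222)
               .addServer().host("10.0.0.2").port(11222);
        // Optional named backup cluster the client can fail over to (placeholder address).
        builder.addCluster("backup").addClusterNode("192.168.1.5", 11222);

        RemoteCacheManager manager = new RemoteCacheManager(builder.build());
        RemoteCache<String, String> cache = manager.getCache("myCache"); // placeholder cache name
        cache.put("key", "value");
        System.out.println(cache.get("key"));
        manager.stop();
    }
}
```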

How can a Phoenix application tailored only to use channels scale on multiple machines? Using HAProxy? How to broadcast messages to all nodes?

I use the Node application purely for socket.io channels with Redis PubSub, and at the moment I have it spread across 3 machines, backed by nginx load balancing on one of the machines.
I want to replace this Node application with a Phoenix application, and I'm still new to the Erlang/Elixir world, so I haven't figured out how a single Phoenix application can span more than one machine. Googling every scaling and load-balancing term I could think of yielded nothing.
The 1.0 release notes mention this regarding channels:
Even on a cluster of machines, your messages are broadcasted across the nodes automatically
1) So I basically deploy my application to N servers, starting the Cowboy server on each one of them, similarly to how I do with Node, and then tie them together with nginx/HAProxy?
2) If that is the case, how are channel messages broadcast across all nodes, as mentioned in the release notes?
EDIT 3: Taking Theston's answer, which clarifies that there is no such thing as a Phoenix application but rather an Elixir/Erlang application, I updated my search terms and found some interesting results regarding scaling and load balancing.
A free extensive book: Stuff Goes Bad: Erlang in Anger
Erlang pooling libraries recommendations
EDIT 2: Found this from Elixir's creator:
Elixir provides conveniences for process grouping and global processes (shared between nodes) but you can still use external libraries like Consul or Zookeeper for service discovery or rely on HAProxy for load balancing for the HTTP based frontends.
EDITED: Connecting Elixir nodes on the same LAN is the first result that mentions inter-Elixir communication, but it isn't related to Phoenix itself, and it isn't clear how it relates to load balancing and to Phoenix nodes communicating with one another.
Phoenix isn't the application; when you generate a Phoenix project you create an Elixir application, with Phoenix as just a dependency (effectively a collection of things that make building the web part of your application easier).
Therefore any Node distribution you need to do can still happen within your Elixir application.
You could just use Phoenix for the web routing and then pass the data on to your underlying Elixir app to handle the distribution across nodes.
It's worth reading http://www.phoenixframework.org/v1.0.0/docs/channels (if you haven't already) where it explains how Phoenix channels are able to use PubSub to distribute (which can be configured to use different adapters).
Also, are you spinning up Cowboy on your deployment servers by running mix phoenix.server?
If so, then I'd recommend looking at EXRM https://github.com/bitwalker/exrm
This will bundle your Elixir application into a self-contained file that you can simply deploy to your production servers (with Capistrano if you like) and then start your application.
It also means you don't need any Erlang/Elixir dependencies installed on the production machines either.
In short, Phoenix is not like Rails: Phoenix is not the application and not the stack. It's just a dependency that provides useful functionality to your Elixir application.
Unless I am misunderstanding your use case, you can still use the exact scaling technique your Node version of the application uses. Simply deploy the Phoenix application to more than one machine and use an nginx load balancer configured to forward requests to one of the application machines.
The built-in node communication features of Erlang are used for applications that scale in a different way than a web app does, for instance distributed databases or queues.
Look at Phoenix.PubSub
It's where Phoenix internally has the Channel communication bits.
It currently has two adapters:
Phoenix.PubSub.PG2 - uses Distributed Elixir, directly exchanging notifications between servers. (This requires that you deploy your application as an Elixir/Erlang distributed cluster.)
Phoenix.PubSub.Redis - uses Redis to exchange data between servers. (This should be similar to solutions found in socket.io and others)

Websphere application server 8.5.5 clustering same application

I have the same application running on two WAS clusters. Each cluster has 3 application servers based in different datacenters. In front of each cluster are 3 IHS servers.
Can I specify a primary cluster and a failover cluster within plugin-cfg.xml? Currently I have both clusters defined within the plugin, but every request hits only one cluster; the second cluster is completely ignored.
Thanks!
As noted already, the WAS HTTP server plugin doesn't provide the function you're seeking, as documented in the WAS Knowledge Center: http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rwsv_plugincfg.html?lang=en
This assumes that by "failover cluster" what is actually meant is "BackupServers" in plugin-cfg.xml.
The ODR alternative mentioned previously likely isn't an option either, because the ODR isn't supported for use in the DMZ (it hasn't been security-hardened for DMZ deployment): http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/twve_odoecreateodr.html?lang=en
From an effective HA/DR perspective, what you're seeking to accomplish should be handled at the network layer, using the global load balancer (global site selector, global traffic manager, etc.) that routes traffic into the data centers; this is usually accomplished by setting a "site cookie" via the load balancer.
This is by design. IHS, at least at the 8.5.5 level, does not allow for what you are trying to do. You will have to implement that level of high availability at a higher layer in your topology.
There are a few options.
If the environment is relatively static, you could post-process the plugin-cfg.xml files and combine them into a single ServerCluster, with the "dc2" servers listed as <BackupServer> entries in the cluster. The "dc1" servers are probably already listed as <PrimaryServer> entries.
BackupServers are only used when no PrimaryServers are reachable.
Another option is to use the Java On-Demand Router, which has first-class awareness of applications running in two cells. Rules can be written that dictate the behavior of applications residing in two clusters (load balancing, failover, etc.). I believe these are called "ODR routing rules".

How is Apache Zookeeper used in sharding?

We are thinking of centralizing configuration information, and it looks like ZooKeeper is a good choice. We are also interested in sharding and have a scheme. On the PoweredBy page [1] we saw that Rackspace and Yahoo are using ZooKeeper for sharding. We would appreciate pointers and details.
[1] https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy
Solr is going to use ZooKeeper for sharding. The ZooKeeper Integration design doc might be interesting for you.
I can think of two things that they could be referencing.
They could be referencing the built-in ensemble features. Using those, you can set up a group-management protocol for your service. As you add more servers to the ensemble, you effectively shard your pool out to greater numbers. The data is synced between the member servers of the ensemble. This is especially useful for applications that shard out the same data set to multiple read pools, such as index servers, search servers, read caches, etc.
They could be using ZooKeeper for configuration management. Now assume that your application has thousands of clients that all need to update their config files at the same time. Say the application currently accesses a data storage layer of 50 servers, but that pool needs to be sharded out to 200. You could set up a slaving relationship to handle that 1-to-4 split. ZooKeeper could then be used to update the config data and, in essence, change every client's config file within a second of every other.
You should also take a look at how HBase uses Zookeeper; specifically to maintain information about regions. This would be analogous to using ZK to maintain DB sharding info.
For managing the lookup table.
Since this lookup table has to be strongly consistent, this is where ZooKeeper comes into the picture.
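To make the configuration-management and lookup-table ideas above concrete, here is a minimal sketch using the plain ZooKeeper Java client: every client watches a single znode holding the shard map, so one write to that znode reaches all watching clients almost immediately. The znode path, connection string, and class name are placeholders of my own, not anything from the answers above.

```java
import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ShardMapWatcher implements Watcher {
    private static final String SHARD_MAP_PATH = "/config/shard-map"; // placeholder znode

    private final ZooKeeper zk;
    private volatile String shardMap;

    public ShardMapWatcher(String connectString) throws Exception {
        // 10-second session timeout; this object also receives connection events.
        // (A production client would wait for the SyncConnected event before reading.)
        this.zk = new ZooKeeper(connectString, 10_000, this);
        refresh();
    }

    private void refresh() throws Exception {
        Stat stat = new Stat();
        // Passing 'this' as the watcher re-registers the watch on every read,
        // so the next update to the znode triggers process() again.
        byte[] data = zk.getData(SHARD_MAP_PATH, this, stat);
        shardMap = new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                refresh(); // every watching client picks up the new shard map almost at once
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public String currentShardMap() {
        return shardMap;
    }
}
```

A single zk.setData(SHARD_MAP_PATH, newMapBytes, -1) from an administrative process is then enough to push the new sharding layout to every client.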

How does appfabric caching failover work?

We are looking to use Windows AppFabric caching in a high availability scenario.
We would like to place the caching on the web servers, but the web servers do not have access to a database.
Therefore, the only option is the XML configuration file. This is on a share on the lead web server.
What happens if this server goes down? Is high availability only available when you have a clustered SQL Server or access to a SAN?
You can designate multiple lead servers; however, for HA, a clustered SQL Server configuration is recommended by MS.
The XML configuration storage location can be a single point of failure, so MS recommends using failover clustering, which gives you a highly available folder for the configuration share.
See here for more details.
