How to achieve high availability? - Windows

I'm about to build a new system and I want maximum availability! I'll have to use Windows!
I will have clients talking to my system using web services. I'll also get data from surrounding systems; this data is delivered using messaging, MQ Series and MSMQ.
The system will produce some data that is sent back to the surrounding systems using queues.
After new data has arrived in the system, different processes will use it to do different tasks, like printing, writing to databases etc.
To achieve high availability I'm planning to have two instances of the system running in parallel on two different machines. The clients will try to use the first server that responds correctly.
I think an ideal solution would be that the incoming data from either of the two servers is placed in a COMMON queue (on a third machine?). Data in the queue can be picked up by processes on both servers (think producer-consumer pattern).
I think that maybe NServiceBus will suit my needs. I have a few questions regarding the above.
Can a queue be shared between two servers? I don't want data to be stuck on a server if it goes down. In that case I want the other server to keep processing.
Can two (or more) "consumers"/processes on different machines pick data from a common queue?
Any advice is welcome!

The purpose of the NServiceBus distributor is not to address availability issues but scaling issues; distributors help scale out systems at a low cost.
Looking at the description, your system consists of web service endpoints, multiple databases and queuing infrastructure. If you want to achieve complete high availability you will have to make sure there are no single points of failure. In order to do that you will need:
A load-balanced web farm for the web service endpoints (2 or more servers)
An application cluster for the queues and the applications that rely on those queues.
A highly available database server, again clustered.
On top of everything, a good SAN.
But if you are only referring to being available to consumers, you just have to make sure the target queues and web service endpoints are available, and that the overall architecture promotes deferred execution.
Two or more applications can read an MSMQ queue remotely, but that's something you don't want to do since it's based on DTC, and that's a real performance killer.
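What the question describes, several processes on different machines pulling work from one shared queue, is the classic competing-consumers pattern. Here is a minimal sketch of that pattern using JMS and ActiveMQ purely for illustration (MSMQ/NServiceBus expose different APIs, and the broker address and queue name below are made up):

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// One of N identical consumer processes; the broker delivers each message on the
// shared queue to exactly one of them, so a failed node simply stops competing.
public class CompetingConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://queue-host:61616"); // hypothetical broker address
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("incoming.data");               // the common queue

        MessageConsumer consumer = session.createConsumer(queue);
        consumer.setMessageListener(message -> {
            // process the message: print, write to a database, etc.
            System.out.println("Processing " + message);
        });

        Thread.currentThread().join();                                    // keep the consumer running
    }
}
```

Run the same program on both servers; if one server goes down, the other keeps draining the queue.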
Some references
http://blogs.msdn.com/b/clustering/archive/2012/05/01/10299698.aspx
http://msdn.microsoft.com/en-us/library/ms190202.aspx

In short, you will want to use the distributor: http://support.nservicebus.com/customer/portal/articles/859556-load-balancing-with-the-distributor
The key thing here is that the distributor node is a single point of failure, so you want to run it on a cluster.

Related

Using Hazelcast to synchronize instances in a cluster

In my workplace, we have a Java system (Tomcat, Spring, Hibernate, SOAP + REST web services). Some of the web services require the server to save state. For example, while a long-running service is executing, the client cannot call the same service again until it has finished.
Currently we don't support clustering; to avoid running the above-mentioned long service concurrently, we use locks or synchronized blocks. To support clustering we are considering Hazelcast (sharing the locks across the instances). Will it work?
Is this the right solution?
Yes, this is perhaps the most typical use case; you can use distributed locks or other distributed data structures to share state between different servers (or JVMs).
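As an illustration, a minimal sketch of guarding the long-running service with a Hazelcast distributed lock; the lock name is arbitrary and the API shown is the Hazelcast 3.x style (newer releases expose locks through the CP subsystem instead):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ILock;

// Each Tomcat instance joins the same Hazelcast cluster and takes the same named
// lock before starting the long-running service, so only one node runs it at a time.
public class LongServiceGuard {
    private final HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();

    public void runLongService() {
        ILock lock = hazelcast.getLock("long-service-lock"); // hypothetical lock name
        if (!lock.tryLock()) {
            throw new IllegalStateException("Service is already running on another node");
        }
        try {
            // ... do the long-running work here ...
        } finally {
            lock.unlock();
        }
    }
}
```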

How to decide which endpoint type to use, Spring Integration

I've tried Spring Integration with TCP and it works great. I am trying to decide which type of endpoint I should use for a simple scenario:
two Java processes running on the same machine that need to talk to each other.
There are so many options, for example: AMQP, JMS, MQTT, TCP, RMI.
I am sure TCP works too, async and reliable, but it needs the network; it would be better to have an option that does not need the network and works cross-platform when I port these processes to a different OS, for example from Linux to Windows.
To simplify my question: which of these work without the network (NIC and IPs)?
If I want to run these two processes on different machines and connect them through the network, which one is the best and why?
Does the RMI option still support async and reliable connections?
Another factor is whether or not the two endpoints are in the same Java VM.
What is "best" is often greatly influenced by what you're familiar with and have easily available. Also, what do you need for replies? How about guaranteed delivery?
With Camel it's pretty easy to pick one and go; if you need to change later it's not very hard.
seda and vm are two components to look into - they're easy to use, no setup needed - but if your application doesn't fit within their restrictions you'll need something else.
I gravitate toward AMQP, so I tend to go with that; AMQP or JMS across nodes.
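As a rough sketch of the in-JVM option mentioned above, here is a Camel route wired through the seda component; the timer endpoint and queue name are only illustrative:

```java
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class InJvmRouteExample {
    public static void main(String[] args) throws Exception {
        DefaultCamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // Producer side: hand work off to an in-memory queue (no network involved).
                from("timer:tick?period=1000")
                    .setBody().constant("ping")
                    .to("seda:work");

                // Consumer side: processes messages asynchronously within the same JVM.
                from("seda:work")
                    .log("received ${body}");
            }
        });
        context.start();
        Thread.sleep(5000);
        context.stop();
    }
}
```

The vm component works the same way but also lets two Camel contexts in the same JVM (for example two web apps in one container) exchange messages.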

Docker for Elasticsearch multi-tenancy SaaS or single instance and proxy?

I am trying to build a prototype of Elasticsearch as a Service. I have thought of 2 different approaches and I'd like to get opinions on one or the other implementation:
A single installation of Elasticsearch, and a proxy layer on top to add user validation (HTTP basic authentication plus a user account to validate usage).
This approach would be relatively straightforward; the main challenge would be configuring the cluster properly to handle the load, as well as the permissions so there are no data leaks and users don't have access to the cluster management APIs.
Use Docker as a container and have one instance of Elasticsearch for each user. In this case I would provide the isolation by using the Linux container (Docker). I'd still need to manage authentication.
It would probably be good to implement both, play around and see how things behave. Any opinions on the pros and cons of each approach?
Thanks!
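For what it's worth, the first approach boils down to a thin authenticating proxy in front of a single Elasticsearch node. A minimal sketch using the JDK's built-in HttpServer, with a hypothetical credential check and GET-only forwarding, just to make the idea concrete:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.Base64;

// Approach 1 in miniature: check HTTP basic auth, then forward the request to a
// single Elasticsearch node. Credential storage, index-level permissions and
// blocking the cluster-management APIs are deliberately stubbed out.
public class EsAuthProxy {
    private static final String ES_BASE = "http://localhost:9200"; // hypothetical backend address

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            String auth = exchange.getRequestHeaders().getFirst("Authorization");
            if (auth == null || !isValidUser(auth)) {
                exchange.sendResponseHeaders(401, -1);       // reject unauthenticated callers
                exchange.close();
                return;
            }
            // Forward the request path and query string to Elasticsearch (GET only in this sketch).
            URL target = new URL(ES_BASE + exchange.getRequestURI());
            HttpURLConnection conn = (HttpURLConnection) target.openConnection();
            byte[] body;
            try (InputStream in = conn.getInputStream()) {
                body = in.readAllBytes();
            }
            exchange.sendResponseHeaders(conn.getResponseCode(), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    private static boolean isValidUser(String authorizationHeader) {
        // Placeholder check: decode "Basic base64(user:password)" and compare with the account store.
        String decoded = new String(Base64.getDecoder()
                .decode(authorizationHeader.replaceFirst("Basic ", "")));
        return "demo:secret".equals(decoded);                // hypothetical credentials
    }
}
```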
Disclaimer: I am the founder of the Elasticsearch service provider Facetflow, which currently offers shared clusters.
I think that both approaches have merit, but maybe suited for different types of customers.
Looking at other SaaS providers, like MongoDB provider MongoLab, they essentially ended up offering both setups (although not using Docker).
So, pros and cons as I see them:
Shared Cluster
Most Elasticsearch as a Service providers operate this way.
Pros:
Far more affordable for the majority of users just looking for good search and analytics.
Simpler maintenance, fewer clusters for you to monitor.
Potentially fewer versions of Elasticsearch to integrate with. If you need to communicate with other systems (which you do) and write your own plugins (we did, for authentication, silos, entitlements, stats etc.), fewer versions will be far easier to maintain.
Cons:
Noisy neighbours have to be monitored and you have to scale and relocate indices to handle this.
Users have to choose from a limited list of versions of Elasticsearch, usually a single version.
Users don't get full cluster admin control.
Private Clusters using Docker
One provider that works this way is Found.
Pros:
Users could potentially deploy a variety of Elasticsearch versions.
Users can have complete cluster admin access
Noisy neighbours don't affect their cluster, meaning less manual intervention from you.
Cons:
Complex monitoring and support. If people can do whatever they want (e.g. shut down the cluster over the API), you have to be clear where your responsibility as a provider ends, and what wakes you up at night.
Complex integration with multiple versions, see shared cluster pros.
More expensive since you have to allocate resources that might not always be used.

Using JMS (ActiveMQ) and Camel for updating multiple GUIs/Visualization

I need to create a drawing application in Java where a user draws lines and colors among other things in the master application, and a number of client viewers update their views accordingly. Each client viewer may visualize the received data differently (for example, given a line drawn in the master application, viewer 1 may do the same while viewer 2 may apply some filtering to show it differently).
Is the Java Message Service (JMS) approach a good choice for such an application? The master app would send out the changes in messages and the clients would asynchronously update their views. Later I might need to distinguish the types of data sent out, so I was considering Camel for setting up different topics for different clients. If these are not a good choice, what technology out there is suitable?
Are they still a good choice if, later, the design changes so that every client also sends out updates and needs to be updated accordingly?
Will this approach scale if the update rate/amount in the master application is huge? (For example, 60 frames/sec of updates on 1024x768 pixels or more - I could probably calculate the frame difference and send out only the changes.)
Thank you for your time and sorry for the many questions. Please let me know if any of my assumptions are wrong, too.
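To make the proposal concrete, here is a minimal sketch of the publishing side using a JMS topic on ActiveMQ; the broker URL, topic name and message format are placeholders, and each viewer would create its own subscriber on the same topic:

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch of the setup described above: the master publishes drawing updates on a
// topic and every viewer receives its own copy asynchronously.
public class DrawingUpdatePublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616"); // hypothetical broker
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        Topic topic = session.createTopic("drawing.updates");
        MessageProducer producer = session.createProducer(topic);

        // Each drawing change is sent as one message; viewers filter or transform it locally.
        TextMessage update = session.createTextMessage(
                "{\"shape\":\"line\",\"from\":[0,0],\"to\":[10,10]}");
        producer.send(update);

        connection.close();
    }
}
```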
While it is do-able, I doubt JMS and Camel would be the best fit for this type of distributed application.
JMS/Camel are good for enterprise integration scenarios where you want some level of decoupling between services and consumers. The decoupling usually comes with maintenance and performance overheads.
For the type of app you describe, it sounds like decoupling is not important, so you might be better off looking at a distributed application environment where the clients and servers are implemented in the same language and you have RPC calls between them.
Depending on your language choice, distributed technologies you could consider are: distributed Java, distributed Ruby, Celluloid, HTML5 + WebSockets (e.g. check out meteor.com).

Should cluster support be at the application or framework level?

Let's say you're starting a new web project that requires the website to run on an MVC framework on Mono. A couple of major requirements are that it has to scale easily, be stable and work with multiple servers that may or may not be in the same place or even on the same local network.
The first thing I thought of was a sort of cluster communication between servers. Each server would act as a node and be its own standalone application and would query other nodes in a known list for session information and things like that.
But one major design question I have is: should this functionality be built into the supporting framework, or should the application handle the synchronization of the data?
Or am I just way off and this would never work?
Normally clustering belongs to some kind of middleware layer, i.e. at your framework level. However, it can also be implemented at the application level.
It depends on your exact use case - whether you want load balancing, scalability, etc.

Resources