I need to create a drawing application in Java where a user draws lines and colors, among other things, in the master application, and a number of client viewers update their views accordingly. Each client viewer may visualize the received data differently (for example, given a line drawn in the master application, viewer 1 may render it as-is while viewer 2 may apply some filtering to show it differently).
Is a Java Message Service (JMS) approach a good choice for such an application? The master app would send out the changes in messages and the clients would asynchronously update their views. Later I might need to distinguish the types of data sent out, so I was considering Camel for setting up different topics for different clients. If these are not a good choice, what technology out there is suitable?
Are they still a good choice if the design later changes so that every client also sends out updates and needs to receive them as well?
Will this approach scale if the update rate/volume in the master application is huge (for example, 60 frames/sec of updates on 1024x768 pixels or more)? I could probably calculate the frame difference and send out only the changes.
Thank you for your time and sorry for the many questions. Please let me know if any of my assumptions are wrong, too.
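To make the plan a bit more concrete, here is roughly what I had in mind for publishing a change from the master. This is only a sketch; the broker URL, topic name, and the text payload for a drawn line are placeholders, and I picked ActiveMQ just as an example JMS provider.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

public class DrawingPublisher {

    // Placeholder broker URL and topic name.
    private static final String BROKER_URL = "tcp://localhost:61616";
    private static final String TOPIC_NAME = "drawing.updates";

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory(BROKER_URL);
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic(TOPIC_NAME);
        MessageProducer producer = session.createProducer(topic);

        // A drawn line serialized as a simple text payload; a real app would
        // more likely send ObjectMessages or a compact binary delta encoding.
        TextMessage message = session.createTextMessage("LINE 10,10 200,150 #FF0000");
        producer.send(message);

        session.close();
        connection.close();
    }
}
```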
While it is doable, I doubt JMS and Camel would be the best fit for this type of distributed application.
JMS/Camel are good for enterprise integration scenarios where you want some level of decoupling between services and consumers. The decoupling usually comes with maintenance and performance overheads.
For the type of app you describe, it sounds like decoupling is not important, so you might be better off looking at a distributed application environment where the clients and servers are implemented in the same language and you have RPC calls between them.
Depending on your language choice, distributed technologies you could consider are: distributed Java, distributed Ruby, Celluloid, or HTML5 + WebSockets (e.g. check out meteor.com).
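For example, if you went the WebSocket route, a server-side endpoint that broadcasts each change to all connected viewers could look roughly like the sketch below (JSR 356 API; the endpoint path and the string payload are purely illustrative):

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import javax.websocket.OnClose;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

// Each viewer connects to ws://host/drawing; the master pushes a change
// by calling broadcast() with a serialized drawing delta.
@ServerEndpoint("/drawing")
public class DrawingEndpoint {

    private static final Set<Session> viewers = ConcurrentHashMap.newKeySet();

    @OnOpen
    public void onOpen(Session session) {
        viewers.add(session);
    }

    @OnClose
    public void onClose(Session session) {
        viewers.remove(session);
    }

    public static void broadcast(String delta) {
        for (Session viewer : viewers) {
            try {
                viewer.getBasicRemote().sendText(delta);
            } catch (IOException e) {
                viewers.remove(viewer); // drop viewers we can no longer reach
            }
        }
    }
}
```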
I am thinking about the best way to structure microservices. In the past, the team I worked with used Axon Framework and PostgreSQL, and each microservice had its own event store in the PostgreSQL database; we then built the communication between them using REST.
I am thinking that it would be smarter to have all microservices talk to the same event store, as we would be able to share events faster instead of rewriting the communication lines using REST.
The questions that follow from this backstory are:
What is the best practice for having an event store?
Would each service have its own, or would they share the same event store?
Where would I find information to inspire me and gather more answers? Searching the internet for best practices on how to structure the event store seems like searching for a needle in a haystack.
Bear in mind, the question is in no way aimed at Axon Framework specifically, but at the general idea of building scalable, well-structured code, since each application would work with its own event store for the write and read models.
Thank you for reading and I wish you all the best
-- Me
I'd add a slightly different notion to Tore's response, although the main line is identical to what I'm sharing here. So I don't aim to overrule Tore, just to provide additional insight.
If the (micro)services belong to the same Bounded Context, then they're allowed to "learn about each other's language."
This language thus includes the events these applications publish and store.
Whenever there's communication required between different Bounded Contexts, you'd separate the stores, as one context shouldn't be bothered by the specifics of another context.
Hence it is beneficial to deduce what services belong to which Bounded Context since that would dictate the required separation.
Axon aims to support this by allowing multiple contexts with the Axon Server, as you can read here.
It simply allows the registration of applications to specific contexts, within which it will completely separate all message streams (so commands, events, and queries) and the Event Store.
You can also set this up from scratch yourself, of course. Tore's recommendation of Kafka is what's used quite broadly for Event Streaming needs between applications. Honestly, any broadcast type of infrastructure suits event distribution, as that's how events are typically propagated.
You want to have one event store per service, just as you would want to have one relational database per service in a non-event-sourced system.
Sharing a database/event store between services creates coupling, and we have all learned the hard way that this is considered an anti-pattern today.
If you want to use an event log to share events across services, then Kafka is a popular choice.
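As a rough sketch of that approach, each service would publish its domain events to a shared Kafka topic (the topic name and JSON payload below are illustrative; in practice you would use a schema registry or at least versioned serializers):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderEventPublisher {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by aggregate id keeps all events of one aggregate in
            // a single partition, so consumers see them in order.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "order-events", "order-42", "{\"type\":\"OrderPlaced\",\"orderId\":\"42\"}");
            producer.send(record);
        }
    }
}
```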
It is important to remember that you only do event sourcing within a service's bounded context.
For example: you have an IT estate with a mix of batch and real-time data sources from multiple systems, e.g. ERP, project management, asset management, website, monitoring, etc.
The aim is to integrate the data sources into a (provider-agnostic) cloud environment.
There is a need for reporting and analytics on combinations of all data sources.
Inevitably, some source systems are not capable of streaming, hence batch loading is required.
There are potential use cases for performing functionality/changes/updates based on the ingested data.
Given a steer for creating a future-proofed platform, architecturally, how would you look to design it?
It's a very open-ended question, but there are some good principles you can adopt to help steer you in the right direction:
Avoid point-to-point integration, and get everything going through a few common points - ideally one. Using an API gateway can be a good place to start; the big players (Azure, AWS, GCP) all have their own options, plus there are lots of decent independent ones like Tyk or Kong.
Batches and event-streams are totally different, but even then you can still potentially route them all through the gateway so that you get the centralised observability (reporting, analytics, alerting, etc).
Use standards-based API specifications where possible. A good REST-based API, built on a proper resource model, is a non-trivial undertaking, and I'm not sure it fits with what you are doing if you are dealing with lots of disparate legacy integration. If you are going to adopt REST, use OpenAPI to specify the APIs. Using this standard not only makes it easier for consumers, but also gives you better tooling, as many design, build and test tools support OpenAPI. There's also AsyncAPI for event/async APIs.
Do some architecture. Moving sh*t to cloud doesn't remove the sh*t - it just moves it to the cloud. Don't recreate old problems in a new place.
Work out the logical components in your new solution: what does each of them do (what's its reason to exist)? Don't forget ancillary components like API catalogues, etc.
Think about layering the integration (usually depending on how they will be consumed and what role they need to play, e.g. system interface, orchestration, experience APIs, etc).
Want to handle data in a consistent way regardless of source (your 'agnostic' comment)? You'll need to think through how data is ingested and processed. This might lead you into more data / ETL centric considerations rather than integration ones.
Co-design. Is the integration mainly data coming in or going out? Is the integration with 3rd parties or strictly internal?
If you are designing for external / 3rd party consumers then a co-design process is advised, since you're essentially designing the API for them.
If the APIs are for internal use, consider designing them for external use so that when/if you decide to do that later it's not so hard.
Take a step back:
Continually ask yourselves "what problem are we trying to solve?". Usually, a technology initiative is successful if there's a well-understood reason for doing it, which has solid buy-in from the business (non-IT).
Who wants the reporting, and why - what problem are they trying to solve?
As you mentioned, it's an IT estate, i.e. an enterprise-level solution with a mix of batch and real-time sources, so first you have to identify the end goal of this migration. You can think about refactoring applications. If you are trying to make it event-driven, then assess the refactoring effort and cost. Separation of responsibility is the key factor for refactoring and migration.
If you are thinking about future-proofing your solution, then consider the cloud for storing and processing your data. It will not necessarily be cheap, but a mix of cloud and on-prem could be a way forward. There are services from cloud providers to move your data at minimal cost, and cloud-native solutions exist for performing analysis on your data. A database migration service in AWS or Azure can move the data and then capture ongoing changes, so you can keep using your on-prem databases and apps while performing reporting analysis in the cloud. That will ease the load on your transactional DB. Most data syncs from on-prem to cloud are near real time.
I'm about to build a new system and I want maximum availability! I'll have to use Windows!
I will have clients talking to my system using web services. I'll also get data from surrounding systems. This data is delivered using messaging, MQSeries and MSMQ.
The system will produce some data that is sent back to the surrounding systems using queues.
After new data has come into the system, different processes will use this data to do different tasks, like printing, writing to databases, etc.
To achieve high availability I'm planning to have two versions of the system running in parallel on two different machines. The clients will try to use the first server that responds correctly.
I think an ideal solution would be for the incoming data from either of the two servers to be placed in a COMMON queue (on a third machine?). Data in the queue could then be picked up by processes on both servers (think producer-consumer pattern).
I think that maybe NServiceBus will suit my needs. I have a few questions regarding the above.
Can a queue be shared between two servers? I don't want data to be stuck on a server if it goes down; in that case I want the other server to keep processing.
Can two (or more) "consumers"/processes on different machines pick data from a common queue?
Any advice is welcome!
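For illustration only, the competing-consumers idea described above looks like the sketch below when written against a JMS broker. Java/ActiveMQ is used here just because it is compact; an MSMQ/NServiceBus setup would look different, and the broker URL and queue name are placeholders.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.activemq.ActiveMQConnectionFactory;

// Run one instance of this consumer on each server. The broker hands every
// message to exactly one of them, so a failed server simply stops competing
// and the other keeps processing.
public class WorkConsumer {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://queue-host:61616");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("incoming.work");
        MessageConsumer consumer = session.createConsumer(queue);

        while (true) {
            TextMessage message = (TextMessage) consumer.receive();
            System.out.println("Processing: " + message.getText());
        }
    }
}
```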
The purpose of the NSB distributor is not to address availability issues but to address scaling issues; distributors help scale out systems at a low cost.
Looking at the description, your system consists of web service endpoints, multiple databases, and queuing infrastructure. If you want to achieve complete high availability you will have to make sure there are no single points of failure. In order to do that you will need:
A load balanced web farm for web service endpoints (2 or more servers)
Application cluster for queues and applications that relies on those queues.
Highly available database server, again clustered.
On top of everything a good SAN.
But if you are referring to being available to consumers, you just have to make sure target queues and web service endpoints are available, and that the overall architecture promotes deferred execution.
Two or more applications can read an MSMQ queue remotely, but that's something you don't want to do since it's based on DTC, and that's a real performance killer.
Some references
http://blogs.msdn.com/b/clustering/archive/2012/05/01/10299698.aspx
http://msdn.microsoft.com/en-us/library/ms190202.aspx
In short, you will want to use the distributor: http://support.nservicebus.com/customer/portal/articles/859556-load-balancing-with-the-distributor
The key thing here is that the distributor node is a single point of failure so you want to run it on a cluster.
I am writing a Java EE application which is supposed to consume SAP BAPIs/RFCs using JCo and expose them as web services to other downstream systems. The application needs to scale to huge volumes, on the order of tens of thousands of requests and thousands of simultaneous users.
I would like to have suggestions on how to design this application so that it can meet the required volume.
It's good that you are thinking of scalability right from the design phase. Martin Abbott and Michael Fisher (of PayPal/eBay fame) lay out a framework called the AKF Scale Cube for scaling web apps. The main principle is to scale your app along three axes.
X-axis: cloning of services/data so that work can be easily distributed across instances. For a web app, this implies the ability to add more web servers (clustering).
Y-axis: separation of work responsibility, action or data. So for example in your case, you could have different API calls on different servers.
Z-Axis: separation of work by customer or requester. In your case you could say, requesters from region 1 will access Server 1, requesters from region 2 will access Server 2, etc.
Design your system so that you can follow all 3 above if you need to. But when you initially deploy, you may not need to use all three methods.
You can check out the book "The Art of Scalability" by the above authors: http://amzn.to/oSQGHb
A final answer is not possible, but based on the information you provided this does not seem to be a problem as long as your application is stateless so that it only forwards requests to SAP and returns the responses. In this case it does not maintain any state at all. If it comes to e.g. asynchronous message handling, temporary database storage or session state management it becomes more complex. If this is true and there is no need to maintain state you can easily scale-out your application to dozens of application servers without changing your application architecture.
In my experience this is not necessarily the case when it comes to SAP integration; think of a shopping cart you want to fill based on products available in SAP. You may want to maintain this cart in your application and only submit the final cart to SAP; otherwise you end up building an e-commerce application inside your backend.
Most important is that you reduce CPU utilization in your application to avoid a 'too-large' cluster, and that you reduce all kinds of I/O wherever possible, e.g. by keeping SOAP messages small to reduce network I/O.
Furthermore, I recommend designing a proper abstraction layer on top of JCo, including the JCO.PoolManager for connection pooling. You may also need a well-thought-out authorization concept if you work with a connection pool managed by only one technical user.
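To make the abstraction-layer suggestion a bit more concrete, a thin facade can hide the JCo plumbing behind a plain Java class, so the rest of the application never touches JCo types. The destination name, BAPI name, and parameter names below are made up for illustration, and the calls follow the JCo 3.x API style, so verify them against your JCo version:

```java
import com.sap.conn.jco.JCoDestination;
import com.sap.conn.jco.JCoDestinationManager;
import com.sap.conn.jco.JCoException;
import com.sap.conn.jco.JCoFunction;

// Thin facade so the rest of the application never deals with JCo directly.
public class SapGateway {

    private final String destinationName;

    public SapGateway(String destinationName) {
        this.destinationName = destinationName;
    }

    // Example call: fetch a customer's name via a BAPI (parameter names are illustrative).
    public String getCustomerName(String customerId) throws JCoException {
        // JCo pools connections per destination configuration, so this lookup is cheap.
        JCoDestination destination = JCoDestinationManager.getDestination(destinationName);
        JCoFunction function = destination.getRepository().getFunction("BAPI_CUSTOMER_GETDETAIL");
        if (function == null) {
            throw new IllegalStateException("BAPI not found in the SAP repository");
        }
        function.getImportParameterList().setValue("CUSTOMERNO", customerId);
        function.execute(destination);
        return function.getExportParameterList().getStructure("CUSTOMERADDRESS").getString("NAME");
    }
}
```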
Just some (not well structured) thoughts...
I was browsing for an open source messaging software and after some good bit of research I came across these three products. I've taken these out for a preliminary test drive, having had them handle messages for queues and topics, and from what I've read all three of these products are good picks for an Open Source messaging solution for most companies. What I was wondering was what are the advantages that these products may have over one another? What I'm particularly interested in is messaging throughput, including persistent messaging throughput, security, scalability, reliability, support, routing capabilities, administrative options such as metrics and monitoring, and generally just how well each program runs in a large business environment.
Check out http://queues.io/
From their site:
The goal is to create a quality list of queues with a collection of articles, blog posts, slides, and videos about them. After reading the linked articles, you should have a good idea about: the pros and cons of each queue, a basic understanding of how the queue works, and what each queue is trying to achieve. Basically, you should have all the information you need to decide which queue will best fit your needs.
'messaging' covers a lot of options - and there must be at least a dozen different types of technologies that could be the right answer. having built many production messaging environments using a variety of technologies/approaches, a better understanding of your requirements would help.
are you needing subject-based subscriptions? do you need multicast delivery? do you need dynamic subscribers/listeners? would your listeners be requerying for best sources even after finding an acceptable publisher/feed?
do you need guaranteed delivery? delivery confirmation? is your publisher storing any undelivered messages, or do you need the messaging system to do that for you automagically? how often does your feed data go stale - e.g. email-ish alerts can be store-and-forward, but real-time pricing data is only valid for a short interval (and then probably needs to go away rather than cause confusion)
how volatile is your network topology? are your subscribers (or publishers) expecting to live at a fixed address? or are they mobile devices? could they appear to you over more complex internetwork topologies requiring registration and possibly imposing routing restrictions? if so any idea the frequency of these topology changes?
do you only need a java interface? are any of your subscribers to be integrated into windows components (like feeds into excel)?
if you're only interested in experience comparing the similar products you named then perhaps you have already thought through these topics.
as to products, in my experience Tibco is still the leader in throughput and scalability, especially in a real-time environment. ibm MQ would be next, especially in a store-and-forward architecture. with both of those products you get a level of support on which you can justify betting a fundamental part of your business systems. there's a reason both of those have been around for a couple of decades.
another often overlooked option is Tuxedo - it provides not only messaging but a proven transactional capability that remains unparalleled. Oracle continue to be committed to this product and, again, the level of support available is second to none.
i love open sourced solutions and am always glad to find production quality software for free - but if you are creating a fundamental part of your business infrastructure then an active community still might not indicate whether a particular voluntary project is the best bet.
my 2c worth. hope it helps.
First, I am no expert in this, but maybe I can give you some thought hints.
ActiveMQ and Qpid are both under the Apache umbrella and are message queues. But Qpid is an implementation of the AMQP specification.
AMQP is a protocol specification at the wire level, so messages can be exchanged with other AMQP message queues (e.g. RabbitMQ).
ActiveMQ and HornetQ are queues that you can use with a JMS API. The Java Message Service is a specification on an API level.
But you have the option to access Qpid via a JMS API, too.
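Because JMS is an API-level specification, consumer code like the sketch below stays the same whether it runs against ActiveMQ, HornetQ, or Qpid's JMS client; only the provider-specific JNDI configuration (the lookup names here are placeholders) decides which broker is used:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;

// Only the JNDI entries are provider-specific; the code is plain javax.jms.
public class PortableConsumer {

    public static void main(String[] args) throws Exception {
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("ConnectionFactory");
        Queue queue = (Queue) jndi.lookup("queue/orders");

        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);

        consumer.setMessageListener(new MessageListener() {
            @Override
            public void onMessage(Message message) {
                System.out.println("Received: " + message);
            }
        });
        connection.start();
    }
}
```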
I think performance is a secondary thought. To have an active community is more important.
http://x-aeon.com/wp/2013/04/10/a-quick-message-queue-benchmark-activemq-rabbitmq-hornetq-qpid-apollo/
The benchmark includes some performance numbers to help you decide, with both persistent and transient results.