OSGi in Distributed Infrastructures

We're working on an OSGi-based infrastructure for processing stream-based data flows. Specific processing tasks are executed by individual OSGi components. We now need the possibility to distribute those components over different machines, which means, we need some kind of communication mechanism between OSGi components/containers.
During my research I came across different potential solutions: R-OSGi, Apache CXF for Distributed OSGi, Eclipse Communication Framework.
ECF seems particularly interesting as it supports different transport formats and provides features such as service discovery.
My central questions:
Are there any detailed tutorials/walk-throughs for setting up an ECF infrastructure within Felix? (From my research, I found that Felix support was added recently.)
Are there any solutions besides the three listed above which I might have missed?
Is there a reason for taking Apache CXF instead of ECF?

The first question -- whether there is a detailed walk-through for setting up ECF with Felix -- I don't know the answer to, though a search for those terms in combination might turn something up.
The problem is ECF uses the Equinox infrastructure, and has at times inadvertently relied on packages that are non-public through transitive dependencies (particularly the Runtime API which uses Equinox for non-public debugging). This, in turn, means that ECF relies on a whole host of other components to be available and it's this set which typically isn't well defined on a Felix runtime.
You have missed out Paremus' Service Fabric, which is a commercial OSGi cloud solution. I'm not sure if you were specifically focussing on open-source or not; but if you are including commercial licenses then they have a very robust architecture for remote services.
Finally, the Apache CXF over ECF question -- if you're using Felix, I'd argue that going with Apache CXF is probably easier than going with ECF. This is mainly due to the dependency set and getting it working, combined with the fact that ECF may not be tested on Felix and so may assume particular aspects of the Equinox runtime (which includes, for example, the runtime's parent classloader delegation to pick up things on the boot classpath). This isn't really the fault of ECF per se, but rather an artefact of how the Eclipse ecosystem works.
If you want to communicate with non-OSGi runtimes, Apache CXF has an advantage in that it can generate WSDL for interaction with other languages. I believe you can do the same thing in ECF with a bit more work. The CXF solution is likely to be more verbose than a corresponding ECF one (WSDL always is), but if you're not handling high volumes of requests this isn't likely to make a significant difference.
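To make the comparison more concrete, here is a minimal sketch of how a service can be marked for export under the OSGi Remote Services specification, which both CXF DOSGi and ECF implement. Only the service.exported.* property names are spec-defined; the StreamProcessor interface is invented for illustration, and the CXF-specific config type and address property are assumptions based on the CXF DOSGi documentation.

```java
// Minimal sketch of exporting a service under the OSGi Remote Services spec.
// The service.exported.* property names are standard; the StreamProcessor
// interface is invented, and the CXF-specific config type and address
// property are assumptions based on the CXF DOSGi documentation.
import java.util.Dictionary;
import java.util.Hashtable;

import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class ProcessorActivator implements BundleActivator {

    /** Hypothetical service interface for one processing step. */
    public interface StreamProcessor {
        String process(String payload);
    }

    @Override
    public void start(BundleContext context) {
        StreamProcessor impl = payload -> payload.toUpperCase(); // trivial stand-in

        Dictionary<String, Object> props = new Hashtable<>();
        // Standard Remote Services properties: export all interfaces of the service.
        props.put("service.exported.interfaces", "*");
        // Config type and address are distribution-provider specific (here: CXF DOSGi, SOAP).
        props.put("service.exported.configs", "org.apache.cxf.ws");
        props.put("org.apache.cxf.ws.address", "http://0.0.0.0:9090/processor");

        context.registerService(StreamProcessor.class, impl, props);
    }

    @Override
    public void stop(BundleContext context) {
        // Services registered by this bundle are unregistered automatically on stop.
    }
}
```

The consuming framework (or a plain web-service client) then only depends on the service interface, not on the implementation bundle.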

Related

Migrating from Spring monolith application to OSGi

We have been building two suites of applications for the last 10 years using Spring for our dependency injection. We also use spring-batch and spring-amqp. We are now looking to move to OSGi so that our monolithic applications can be separated into bundles and we can be more agile. The two suites are web applications and are deployed as two separate WAR files. We are looking to use Apache Karaf as our OSGi runtime.
Spring-DM is dead and it appears that we are going to have to convert EVERYTHING to use Blueprint for our dependency injection.
My question is how do we do this incrementally? It will be close to impossible to convert all of this over at once. It seems like one bundle should still be able to use Spring DI and have its own application context, as long as we take the responsibility to expose any services that we want to the service registry in the bundle activator, but I'm not sure if there is some kind of magic that we would lose, like transaction management.
Any guidance on this would be really appreciated.
You might want to consider making the problem appear even larger and switch to DS instead of Blueprint ... To truly take advantage of the OSGi model, DS is far superior to Blueprint in all respects. In reality, after the first hurdle, you'll make much more progress and your gains will be higher. Though Blueprint made Spring available on OSGi, it never 'got' OSGi.
For strategy, keep your Spring app alive as a single bundle and move things out gradually. I.e. the elephant approach.
The biggest gain that OSGi provides can be summarized as follows:
Make sure modules have service APIs that ONLY handle collaboration. I.e. each service API should be a story/scenario of how the actors work together, not how they come into existence and are configured.
Let Configuration Admin do the configuration work. I.e. never expose configuration APIs. In OSGi, you register instances, not things that still need to be configured.
Make sure you really understand the OSGi model with services. You might want to take a look at OSGi enRoute, which leverages OSGi to the hilt; a minimal DS component is sketched below.
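As a rough illustration of the DS programming model (not code from the question), a component looks like this with the standard org.osgi.service.component.annotations; the Greeter and MessageSource interfaces are invented for the example:

```java
// Rough illustration of a Declarative Services component using the standard
// org.osgi.service.component.annotations. Greeter and MessageSource are
// invented interfaces; the point is that the component only expresses the
// collaboration, while configuration and lifecycle stay with the runtime.
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

interface Greeter {
    String greet(String name);
}

interface MessageSource {
    String template();
}

@Component(service = Greeter.class)   // published in the OSGi service registry
public class GreeterComponent implements Greeter {

    @Reference                        // injected by the SCR runtime once a provider is available
    private MessageSource messages;

    @Override
    public String greet(String name) {
        return String.format(messages.template(), name);
    }
}
```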
I propose you take a look at the blueprint-maven-plugin. It allows you to use a subset of CDI and JEE annotations to define injections as well as transactions and persistence. The plugin creates blueprint XML at build time, which can then be executed by Karaf. The big advantage is that these annotations are also supported by Spring, so you can transition and in parallel release to production using Spring.
I have a complete example here: Annotation-based blueprint and JPA.
Using this plugin I migrated a medium-sized project while it was developed and released in parallel. If you need further advice while using the plugin I can surely help.
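For illustration, a class annotated for this approach might look roughly like the sketch below. The class, persistence-unit name and the Task entity in the JPQL are placeholders, and the annotations shown (@Singleton, @PersistenceContext, @Transactional) are the kind of CDI/JEE subset that both the plugin and Spring understand.

```java
// Hypothetical class annotated with the CDI/JEE subset that the
// blueprint-maven-plugin turns into blueprint XML at build time and that
// Spring can also run unchanged. The persistence-unit name and the Task
// entity referenced in the JPQL are placeholders.
import javax.inject.Singleton;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.transaction.Transactional;

@Singleton                                   // becomes a <bean> in the generated blueprint XML
public class TaskService {

    @PersistenceContext(unitName = "tasks")  // JPA wiring (unit name is a placeholder)
    private EntityManager em;

    @Transactional                           // container-managed transaction in both worlds
    public void complete(long taskId) {
        em.createQuery("update Task t set t.done = true where t.id = :id")
          .setParameter("id", taskId)
          .executeUpdate();
    }
}
```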

Looking up OSGi services from outside the OSGi container

I have a set of bundles deployed in Karaf and exposing a number of OSGi services which I would like to be able to look up and call remotely, from an application running on a (possibly) different machine and in a non-OSGi container. My initial thought was to use a JNDI lookup to get the services I want; however, I understand from an earlier Stack Overflow post that this might not be supported (I say might since I haven't been able to find any information on whether anything has changed in the Aries JNDI implementation in the past year). In that case I guess my other options would be to use CXF to expose a JAX-WS or JAX-RS API for my services.
Is my understanding of the situation correct? Is JNDI lookup really not an option in my case? Are there any other alternatives I have not thought of?
A simple JNDI lookup will not work. OSGi services are not suitable for remoting per se, so even if you could get hold of the JNDI object in some way you could not call it.
Possible solutions are manual CXF proxies and endpoints, as you already found, and Distributed OSGi. See CXF DOSGi and Eclipse ECF. Both can offer transparent service calls from one OSGi framework to another. DOSGi is ideal if you also use OSGi on the client side.
At least in the case of CXF DOSGi it is also possible to use DOSGi on the server side and a normal CXF client on the client side. So you can keep the effort on the server side minimal.
See also this tutorial for CXF DOSGi.
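As a rough sketch of the "normal CXF client on the client side" approach (assuming the service was exported as a SOAP endpoint; the interface and address are invented for the example):

```java
// Sketch of the plain (non-OSGi) client side: build a CXF proxy for the SOAP
// endpoint that CXF DOSGi published on the server. The interface and address
// are invented; the client only needs the shared service interface and the
// CXF runtime on its classpath.
import javax.jws.WebService;

import org.apache.cxf.jaxws.JaxWsProxyFactoryBean;

public class RemoteClient {

    /** Same (hypothetical) interface the OSGi bundle exports as a remote service. */
    @WebService
    public interface StreamProcessor {
        String process(String payload);
    }

    public static void main(String[] args) {
        JaxWsProxyFactoryBean factory = new JaxWsProxyFactoryBean();
        factory.setServiceClass(StreamProcessor.class);
        factory.setAddress("http://karaf-host:9090/processor"); // hypothetical endpoint
        StreamProcessor proxy = factory.create(StreamProcessor.class);

        System.out.println(proxy.process("hello"));
    }
}
```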
You need a Remote Services implementation. See https://en.wikipedia.org/w/index.php?title=OSGi_Specification_Implementations for a list.

JBoss Deployment Info

More of a standard practice questions:
Is there any difference in deploying an app as an EAR vs a WAR? How do you decide? (I know a WAR is just a web application, which may or may not have Java EE features like messaging.)
Let's say I have a Spring MVC application stack with Hibernate (MySQL DB); should this be deployed as a WAR or an EAR?
When do we need to worry about JBoss deployment descriptors if I am not using EJBs (just Spring MVC)? Let's assume I have JMS as well. Do we need to configure/update/create any other JBoss-related config files?
When we package our application as an EAR/WAR, it includes EVERYTHING that we need for our app. Is there a scenario where we need to keep some config/XML files outside of this archive in a specified JBoss folder?
Is it common practice to deploy directly from Eclipse, or is it better to use Ant, etc.? Advantages/disadvantages?
Obviously, I am a newbie :-). Trying to understand this.
1.
This is not always an easy decision, but for beginners and for small projects I would say it's nearly always a WAR. The reason for using an EAR is mainly to isolate a business layer from a UI/Web layer. See this question for more details: How can one isolate logical layers of an Java EE application
2.
I might be mistaken but I think that Spring people typically prefer WARs.
3.
JBoss (vendor) specific deployment descriptors are mostly needed to configure so-called "administered objects" and security. Sometimes they can be used for extra features that are not covered by the Java EE specification (e.g. setting the web root for a WAR). Administered objects are typically data sources (connection to a database) and JMS destinations (queues and topics).
In the traditional Java EE approach these have to be created as far away from the code as possible, which typically means a system admin would create them inside the target AS using some kind of GUI or admin console. In this setup, you as developer would throw a WAR with "unresolved dependencies" over the wall, and a system admin (or "deployer") would then spend days figuring out what those unresolved dependencies should be.
If the communication is relatively good between developers and deployers, the WAR or EAR might be thrown over the wall together with a readme file that at least gives some insight into which resources are needed. Depending on the organization the development team might not get any access or feedback about how those "unresolved dependencies" have been resolved. E.g. a data source with a max of 5 connections may have been created, but this may be insufficient if some code does, say, 10 parallel queries. Without the development team knowing the exact data source configuration, some classes of runtime problems and performance issues may be relatively hard to solve.
To mitigate these problems, some vendors, for some artifacts, let the developer create those "unresolved dependencies" instead using proprietary deployment descriptors which are then embedded in the WAR or EAR. For simple local JMS destinations this is in most cases the end of it, but for data sources there is a little bit more to it. Namely, there has to be a mechanism to switch between data sources for different stages such as Dev, Beta, QA, Production etc. Additionally, it's rarely a good idea to have production passwords in the source code.
If you have a simple app that you want to try out locally, stages and production passwords are not a concern. If you deploy for a (large) company it is.
In Java EE 6 you can define a data source using a standard descriptor (web.xml, ejb-jar.xml or application.xml), and in Java EE 7 you can do the same for JMS destinations. There is no standard way to configure those based on stage, but there is a glimmer of hope that Java EE 8 will address this (see e.g. JAVAEE_SPEC-19). Vendors are not universally happy with those standardized methods, and their main documentation will almost always extensively tell you how to do those things using their proprietary tools and descriptors, and, if you're lucky, tell you in a small note that there's a standardized way (and then sometimes downplay that or scare you by saying it's not recommended for use in production).
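For example, the annotation counterpart of that standard descriptor entry is @DataSourceDefinition; the values below are placeholders, and note there is no standard per-stage switch, which is exactly the limitation discussed above.

```java
// Java EE 6 standard data-source definition: the annotation counterpart of the
// <data-source> element in web.xml / ejb-jar.xml / application.xml. All values
// are placeholders.
import javax.annotation.sql.DataSourceDefinition;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;

@DataSourceDefinition(
    name = "java:app/jdbc/AppDS",
    className = "com.mysql.jdbc.jdbc2.optional.MysqlDataSource", // driver class is a placeholder
    serverName = "localhost",
    portNumber = 3306,
    databaseName = "appdb",
    user = "dev",
    password = "dev-only-password"   // never a production secret
)
@WebServlet("/health")               // any managed component can carry the definition
public class DataSourceHolderServlet extends HttpServlet {
    // Other components can now obtain the pool under java:app/jdbc/AppDS,
    // e.g. via @Resource(lookup = "java:app/jdbc/AppDS") or a JNDI lookup.
}
```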
4.
See answer to 3 mostly. One option to solve the problem of how to switch between stages and keep production passwords out of the WAR/EAR, is to have the full definition of said data source inside the AS (inside JBoss in your case). Every AS installation is tied to a specific server in this setup. If data sources need to be updated, removed or new ones added, you have to communicate with your operations team (if any). As said, depending on your organization this can be anything between trivial and practically impossible.
5.
When developing you most often use your IDE to do a deployment. For production you would never do that. For production you may build with Ant (or Maven) and deploy via something like Jenkins, or e.g. Chef.
Check here: .war vs .ear file
If you read the preceding response, you'd guess that "WAR" it is.
Deployment descriptors are needed to manage the modules of JBoss; if you don't have any conflicts and don't need any tweaking, you won't need any deployment descriptor.
You may need to play with some JBoss files if you want to add modules to JBoss, configure data sources, etc. Read the JBoss documentation for more info.
You can deploy from Eclipse during your development phase, but since your other environments (qualification, production, test, etc.) should be separated from your development one and won't have Eclipse installed on them, you should get used to managing your server from the command line and dropping your WARs in the right directories.
It's a short answer, but I hope it will help.
Read JBoss documentation for more info.

Integration Testing Distributed Java EE Applications

We have a setup of 3 different Java EE servers, all communicating with both JGroups and RMI. We are heavily unit testing our code and the whole team is totally in favor of TDD, but we are facing problems when it comes to integration testing our servers.
Our custom fail-over/reconnect/termination-detection "algorithms" in particular need some automated testing, because we often see them break, and we currently always fix them by trial-and-error testing.
We are using the following libraries/frameworks: Tomcat, Maven, Spring 3, RMI, JGroups
Any ideas, suggestions, links and resources are welcome!
Interesting that nobody has answered this question since 2011. Maybe there wasn't anything to recommend?
If you are looking into integration testing only, it's much easier. You can write your usual JUnit/TestNG tests and use Arquillian to take care of the container (lifecycle, deployments, configuration, etc.). You can run all the components (tests, containers, deployments) on a single node, bind to different IPs or ports, and let JGroups do all the cluster communication as usual.
http://arquillian.org/
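A minimal sketch of what such an Arquillian test can look like (FailoverManager here is an invented stand-in for the custom fail-over/reconnect logic from the question; a CDI-enabled container and the matching Arquillian container adapter are assumed):

```java
// Minimal shape of an Arquillian integration test: Arquillian starts (or connects
// to) the container, deploys the ShrinkWrap archive, and runs the test inside it.
// FailoverManager is an invented stand-in for the custom fail-over/reconnect logic.
import javax.inject.Inject;

import org.jboss.arquillian.container.test.api.Deployment;
import org.jboss.arquillian.junit.Arquillian;
import org.jboss.shrinkwrap.api.ShrinkWrap;
import org.jboss.shrinkwrap.api.asset.EmptyAsset;
import org.jboss.shrinkwrap.api.spec.JavaArchive;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(Arquillian.class)
public class FailoverIT {

    @Deployment
    public static JavaArchive createDeployment() {
        return ShrinkWrap.create(JavaArchive.class)
                .addClass(FailoverManager.class)                         // code under test (hypothetical)
                .addAsManifestResource(EmptyAsset.INSTANCE, "beans.xml"); // enable CDI injection
    }

    @Inject
    private FailoverManager failoverManager;

    @Test
    public void reconnectsAfterNodeFailure() {
        failoverManager.simulateNodeFailure("node-2");
        Assert.assertTrue(failoverManager.isReconnected("node-2"));
    }
}
```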
Moreover, there is now a whole book available about integration testing called 'Continuous Enterprise Development in Java'.
http://www.amazon.com/Continuous-Enterprise-Development-Andrew-Rubinger/dp/1449328296
The situation is IMO much worse when it comes to system testing. I am going to just say one name here: SmartFrog, which is a 'powerful and flexible Java-based software framework for configuring, deploying and managing distributed software systems'. The learning curve is terrible, though.
http://www.smartfrog.org/

How does OSGi manage interaction of components running in separate JVMs?

I have been trying to understand a bit more about the wider picture of OSGi without reading thru the entire specification. As with so many things, the introduction to what OSGi actually is was probably written by someone who had been working on it for a decade and perhaps wasn't best placed to put themselves in the mindset of someone who knows nothing about it :-)
Looking at Felix's example DictionaryService, I don't really understand what is going on. Is OSGi a distinct instance of a JVM into which you load bundles which can then find each other?
Obviously it is not just this because other answers on StackOverflow are explicit that OSGi can solve the dependency problem of a distributed system containing modules deployed within distinct JVMs (plus the FAQ keeps talking about networks).
In this latter case, how does a component running in one JVM interact with another component in a separate JVM? Can the two components "use" each other as if they were running within the same JVM (i.e. via local method calls), and how does OSGi manage the marshalling of data across a network (do you have to use Serializable for example)?
Or does the component author have to use some other distinct mechanism (either provided by OSGi or written themselves) for communication between remote components?
Any help much appreciated!
Yes, OSGi only deals with bundles and services running on the same VM. However, one should note that it is a distinct feature of OSGi that it facilitates running multiple applications (in a controlled way and sharing common modules) on the same JVM at all.
When it comes to accessing services outside the client's JVM, there is currently no standardized solution. Paremus Infiniflow and the derived open-source project Newton use an SCA approach. The upcoming 4.2 release of the OSGi specs will address one side of the problem, namely how to use generic distribution software in such a way that it can bring remote services into the client's JVM.
As somebody mentioned R-OSGi: this approach also deals with the other side of the problem, namely how to manage dependencies between distributed OSGi frameworks, since R-OSGi is not generic distribution software but explicitly deals with the lifecycle issues and dependency management of OSGi bundles.
As far as I know, OSGi does not solve this problem out of the box. There are OSGi-bundles, for example Remote OSGi, which allow the programmer to distribute services across a network.
Not yet, I think it's being worked on for the next release.
But some companies have already implemented distributed OSGi. One I'm aware of is Paremus' Infiniflow (http://www.paremus.com/products/products.html). At LinkedIn they are also working on this. More info here: Building LinkedIn next gen architecture with OSGi, and here: Matt Raible: building LinkedIn next gen architecture.
Here's a summary of the changes for OSGi 4.2: Some thoughts on the OSGi R4.2 draft. There's a section on RFC 119 dealing with distributed OSGi.
AFAIK, bundles are running in the same JVM, but are not loaded using the same class loader (that's why you can use two different versions of the same bundle at the same time).
To interact with components in another JVM, you must use a network protocol such as RMI.
The OSGi alliance is working on a standard for distributed OSGi:
http://www.osgi.org/download/osgi-4.2-early-draft2.pdf
There even is an early Apache implementation of this new standard:
http://cxf.apache.org/distributed-osgi.html
@Patriarch24:
The accepted answer to this question would seem to indicate otherwise (unless I'm misreading it). Also, taken from the FAQ:
The OSGi Service Platform provides the functions to change the composition dynamically on the device of a variety of networks, without requiring a restart
(Emphasis my own). Although in the same FAQ it describes OSGi as in-VM.
Why am I so confused about this? Why is such a basic question about a decade-old technology not clear?
The original problem OSGi addressed was more related to distribution of code (and then configuration of bundles) than to distribution of execution.
People looking at distributed components tend to look towards SCA instead.
The "introduction" link is not really an intro, it is a FAQ entry. For more information, see http://www.osgi.org/About/WhatIsOSGi Not hard to find I would think.
Anyway, OSGi is an in-VM SOA. That is, the OSGi Framework is about what happens inside the VM; it provides a framework for structuring your application inside the VM so you can build it to a large extent from components. So the core has nothing to do with distribution; it is completely oblivious of who implements the services, it just provides a mechanism for modules to meet each other in a loosely coupled way.
That said, the µService model reifies the joints between the modules, and it turns out that you can build support on top of the framework that provides distribution to the other components. In the last releases we specified some mechanisms that make this standardized in the core and provide a special service, Remote Service Admin, that can manage a distributed topology.
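To illustrate the in-VM part: a consumer bundle never instantiates a provider, it simply asks the service registry. With a Remote Service Admin implementation installed, the same lookup can transparently return a proxy for a service exported from another framework; the Greeter interface below is invented for the example.

```java
// In-VM consumption of a service: the consumer bundle asks the service registry
// instead of instantiating a provider. With a Remote Service Admin implementation
// installed, the same lookup can return a proxy for a service exported from another
// framework without the consumer changing. Greeter is an invented interface.
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.util.tracker.ServiceTracker;

public class ConsumerActivator implements BundleActivator {

    /** Hypothetical service interface shared (via an exported package) by provider and consumer. */
    public interface Greeter {
        String greet(String name);
    }

    private ServiceTracker<Greeter, Greeter> tracker;

    @Override
    public void start(BundleContext context) {
        tracker = new ServiceTracker<>(context, Greeter.class, null);
        tracker.open();

        Greeter greeter = tracker.getService();   // null if no provider is active yet
        if (greeter != null) {
            System.out.println(greeter.greet("OSGi"));
        }
    }

    @Override
    public void stop(BundleContext context) {
        tracker.close();
    }
}
```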
If you are looking for a distributed, OSGi-centric cloud runtime, then the Paremus Service Fabric (https://docs.paremus.com/display/SF16/Introduction) provides these capabilities.
One or more systems, each consisting of a number of OSGi assemblies (Blueprint or Declarative Services), can be dynamically deployed and maintained across a population of OSGi runtime frameworks (Knopflerfish, Felix or Equinox).
A lightweight RSA remoting framework is provided, which provides service discovery by default using DDS (a seriously good middleware messaging technology), though ZooKeeper and other approaches can be used. Currently supported remoting protocols include RMI and Avro.
