How to run MRUnit? - hadoop

I've written an MRUnit test for my mapper. However, I don't know how to run it in Eclipse, as it reads some data from the DistributedCache. When I run it as a normal class in Eclipse it gives me a bunch of errors. These are the error messages I get:
java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
at org.apache.hadoop.mrunit.TestDriver.<clinit>(TestDriver.java:38)
at MapperCombinerReducerTester.setUp(MapperCombinerReducerTester.java:16)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at ....

Your error relates to a dependency library (commons-logging) not being on the classpath. Are you using Maven (combined with m2e) to manage your project dependencies, or are you using a straight Java Project in Eclipse?
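If it is Maven, a sketch of the missing dependency entry might look like this (the version is an assumption; match whatever your Hadoop distribution ships):
<dependency>
  <groupId>commons-logging</groupId>
  <artifactId>commons-logging</artifactId>
  <version>1.1.3</version>
</dependency>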
Testing mappers / reducers which depend on the distributed cache is also tricky with MRUnit, as 0.9.0 doesn't have support for emulating the distributed cache (it's coming in 1.0.0 if you look at the JIRA tickets). One way I've done this before is to assume the use of symlinking for the local cached files, and in the setup of my unit test copy the file to the local directory (messy, but it works).
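A minimal sketch of that workaround, assuming JUnit and a mapper that resolves its cached file by a symlink-style local name (the class layout and file name here are hypothetical):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import org.junit.Before;

public class MapperCombinerReducerTester {
    // Local name the mapper expects for its symlinked cache file (hypothetical)
    private static final String CACHED_FILE = "lookup.txt";

    @Before
    public void setUp() throws IOException {
        // Copy the test fixture into the working directory, emulating the
        // symlinked DistributedCache file the mapper would see on a real node.
        Files.copy(Paths.get("src/test/resources", CACHED_FILE),
                Paths.get(CACHED_FILE), StandardCopyOption.REPLACE_EXISTING);
    }
}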

Related

optaplanner with aws lambda

I am using OptaPlanner to solve a scheduling problem. I want to invoke the scheduling code from AWS Lambda (I know that Lambda's max execution time is 5 minutes, and that's okay for this application).
To achieve this I have built a Maven project with two modules:
module-1: scheduling optimization code
module-2: aws lambda handler (calls the scheduling code from module-1)
When I run my tests in IntelliJ IDEA for module-1 (the one with the OptaPlanner code), they run fine.
When I invoke the lambda function, I get the following exception:
java.lang.ExceptionInInitializerError
at org.kie.api.internal.utils.ServiceRegistry.getInstance(ServiceRegistry.java:27)
...
Caused by: java.lang.RuntimeException: Child services [org.kie.api.internal.assembler.KieAssemblers] have no parent
at org.kie.api.internal.utils.ServiceDiscoveryImpl.buildMap(ServiceDiscoveryImpl.java:191)
at org.kie.api.internal.utils.ServiceDiscoveryImpl.getServices(ServiceDiscoveryImpl.java:97)
...
I have included the following dependency in my pom file:
<dependency>
  <groupId>org.optaplanner</groupId>
  <artifactId>optaplanner-core</artifactId>
  <version>7.7.0.Final</version>
</dependency>
I have also checked that the jar file has drools-core, kie-api, kie-internal, and drools-compiler. Does anyone know what might be the issue?
Sounds like a bug in Drools when running in a restricted environment such as AWS Lambda. Please create a JIRA and link it here.
I was getting the same error attempting to run a fat jar containing an example OptaPlanner project. A little debugging revealed that the services map was empty when ServiceDiscoveryImpl::buildMap was invoked; my build kept only the first META-INF/kie.conf it found, and as a result the services declared in the other kie.conf files were missing. Naturally your tests work properly because the class path contains all of the dependencies (that is, several distinct META-INF/kie.conf files), rather than the assembly you were attempting to execute on the lambda.
Concatenating those files instead (using an appropriate merge strategy in the assembly) fixes the problem, and appears appropriate given how those files are loaded by ServiceDiscoveryImpl. The updated JAR runs properly as an AWS lambda.
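The answer above used the assembly plugin's merge strategy; an equivalent sketch with the maven-shade-plugin's AppendingTransformer (an assumption, if you build the fat jar with shade instead) looks like:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Concatenate every META-INF/kie.conf instead of keeping only the first -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>META-INF/kie.conf</resource>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>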
Note: I was using the default scoreDrl from the v7.12.0.Final Cloud Balancing example.

java8_OSGI: NoClassDefFoundError: javafx/collections/MapChangeListener

I just started developing on OSGi, on the Eclipse Kura project, and I tried to implement a HashMap listener:
import java.util.HashMap;
import java.util.Map;
import javafx.collections.FXCollections;
import javafx.collections.MapChangeListener;
import javafx.collections.ObservableMap;

// Use Java Collections to create the Map.
Map<String,String> map = new HashMap<String,String>();
// Now add observability by wrapping it with ObservableMap.
ObservableMap<String,String> observableMap = FXCollections.observableMap(map);
observableMap.addListener(new MapChangeListener<String,String>() {
    @Override
    public void onChanged(MapChangeListener.Change<? extends String,? extends String> change) {
        System.out.println("Detected a change! ");
        logerKuraPI("Detected a change! ");
    }
});
// Changes to the observableMap WILL be reported.
observableMap.put("key 1","value 1");
System.out.println("Size: "+observableMap.size());
logerKuraPI("Size: "+observableMap.size());
// Changes to the underlying map will NOT be reported.
map.put("key 2","value 2");
System.out.println("Size: "+observableMap.size());
logerKuraPI("Size: "+observableMap.size());
When I run this code in a simple main in IntelliJ IDEA it works fine; however, when I run it in the Eclipse OSGi project (Eclipse Kura), I get this error:
java.lang.NoClassDefFoundError: javafx/collections/MapChangeListener
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.newInstance(Class.java:412)
at org.eclipse.equinox.internal.ds.model.ServiceComponent.createInstance(ServiceComponent.java:493)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.createInstance(ServiceComponentProp.java:270)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.build(ServiceComponentProp.java:331)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponent(InstanceProcess.java:620)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponents(InstanceProcess.java:197)
at org.eclipse.equinox.internal.ds.Resolver.getEligible(Resolver.java:343)
at org.eclipse.equinox.internal.ds.SCRManager.serviceChanged(SCRManager.java:222)
at org.eclipse.osgi.internal.serviceregistry.FilteredServiceListener.serviceChanged(FilteredServiceListener.java:109)
at org.eclipse.osgi.internal.framework.BundleContextImpl.dispatchEvent(BundleContextImpl.java:915)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEventPrivileged(ServiceRegistry.java:862)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEvent(ServiceRegistry.java:801)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.register(ServiceRegistrationImpl.java:127)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.registerService(ServiceRegistry.java:225)
at org.eclipse.osgi.internal.framework.BundleContextImpl.registerService(BundleContextImpl.java:464)
at org.eclipse.equinox.internal.ds.InstanceProcess.registerService(InstanceProcess.java:536)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponents(InstanceProcess.java:213)
at org.eclipse.equinox.internal.ds.Resolver.buildNewlySatisfied(Resolver.java:473)
at org.eclipse.equinox.internal.ds.Resolver.enableComponents(Resolver.java:217)
at org.eclipse.equinox.internal.ds.SCRManager.performWork(SCRManager.java:816)
at org.eclipse.equinox.internal.ds.SCRManager$QueuedJob.dispatch(SCRManager.java:783)
at org.eclipse.equinox.internal.ds.WorkThread.run(WorkThread.java:89)
at org.eclipse.equinox.internal.util.impl.tpt.threadpool.Executor.run(Executor.java:70)
Caused by: java.lang.ClassNotFoundException: javafx.collections.MapChangeListener cannot be found by fileloger_1.0.0.qualifier
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:461)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:372)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:364)
at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:161)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Note that the project compiles with no errors in Eclipse and the packages are recognized; the errors only appear when I run it.
I am using Java 8.
This part of the stack trace:
Caused by: java.lang.ClassNotFoundException: javafx.collections.MapChangeListener cannot be found by fileloger_1.0.0.qualifier
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:461)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:372)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:364)
at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:161)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
seems to indicate that Equinox is trying to find the class inside your bundle, rather than by delegation. This means that your "bundle" is probably missing some dependency metadata, specifically a package import.
OSGi bundles share API packages through a system of exported and imported packages. An import is wired to an export by the OSGi framework as part of the resolution process. This is what it means for your OSGi bundle to be in the RESOLVED state rather than the INSTALLED state.
All packages that you use in a bundle (except for ones starting java.*) must be imported by that bundle. In this case the import you would need is for javafx.collections. You will find examples that show you how to write an import statement, but you should definitely not do this by hand. There are a number of tools out there that will automatically generate a bundle manifest for your OSGi bundle, including the correct package import statements.
If you are using Maven then I would recommend the bnd-maven-plugin, or if you are using Gradle then you can use the relevant bnd plugin for bnd workspaces or standalone projects.
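For Maven, a minimal sketch of that setup might look like this (the plugin version is an assumption; bnd-maven-plugin generates the manifest, and the jar plugin must be told to use it):
<plugin>
  <groupId>biz.aQute.bnd</groupId>
  <artifactId>bnd-maven-plugin</artifactId>
  <version>3.5.0</version>
  <executions>
    <execution>
      <goals>
        <goal>bnd-process</goal>
      </goals>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <!-- Use the MANIFEST.MF generated by bnd, including its calculated Import-Package -->
      <manifestFile>${project.build.outputDirectory}/META-INF/MANIFEST.MF</manifestFile>
    </archive>
  </configuration>
</plugin>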
The resulting manifest should end up with an entry like:
Import-Package: javafx.collections
In addition to doing this you will need to make sure that the javafx.collections package is exported by something in the framework. Normally this would involve adding a bundle which provides the relevant API package, however I suspect (I'm not a JavaFX user) that JavaFX has to be installed outside the OSGi framework. If this is the case then you will need to add the javafx.* API packages as exports from the system bundle (the bundle in the OSGi runtime representing the OSGi framework). This can be achieved using the org.osgi.framework.system.packages.extra launch property to list the packages (a list of package names separated by , characters).
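For example, when launching Equinox you might add something like this to the config.ini used to start the framework (a sketch; the exact package list is an assumption and depends on which javafx.* packages your bundles import):
org.osgi.framework.system.packages.extra=javafx.collections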
Update
Your response to this post indicates that you're using Eclipse PDE. PDE isn't as flexible as bnd, and won't do this analysis for you automatically in the build. As a result you can end up with bad metadata if you forget to run through these steps, but it does still offer the ability to automatically determine your bundle's package dependencies in the IDE. The documentation for this is available from Eclipse, but for reference:
1. Go to the Automated Management of Dependencies section of your plug-in's manifest editor.
2. Make sure that your compile dependencies are listed in the plug-in development classpath list.
3. Make sure that you select the Import-Package radio button. Require-Bundle promotes tight coupling and high fan-out and should be avoided.
4. Any time you make a change to the code, click the add dependencies hyperlink. This will recalculate your package imports for you.
For the future, you may wish to consider using the Bndtools plugin for Eclipse to develop OSGi bundles rather than Eclipse PDE. Bndtools usually offers much more sophisticated (and more up-to-date) support for the OSGi specifications than PDE as it builds on top of bnd, and bnd is the reference implementation for several parts of the OSGi specification.

AWS Flow Jar creation with Maven + Java 1.8

Has anyone been able to compile an application with Java 1.8 + AWS Flow + Maven?
I have an established Java application created with Java 1.8; it uses the AWS libraries and the AWS Flow Framework. I'm now looking to automate the build of the product, and I opted to use Maven. Until this point the project was exported manually from Eclipse.
I have reached a point where I can build a jar which contains our generated workflow classes (external clients + factories) along with what I understand to be the aspect classes (xxxxx$1.class, xxxxx$2.class).
The end goal is to get the weaving to happen at compile time.
However, when running the Maven-built jar the workflows do not work as expected. The application completely ignores the @Asynchronous annotation, resulting in a "not ready" state; as a result it cancels scheduling the activity we wish to execute.
I have created a simple application with a single workflow and activity to demonstrate the issue. This version works when exported via Eclipse, but produces the error shown below when built via the POM.
Start with message: With Comp
Created Workers
Added implentations
Nov 28, 2016 12:14:11 PM com.amazonaws.services.simpleworkflow.flow.worker.GenericWorker start
INFO: start: GenericWorkflowWorker[super=GenericWorkflowWorker[service=com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient@163e4e87, domain=Experimental, taskListToPoll=TEST, identity=3174@ip-10-0-1-141, backoffInitialInterval=100, backoffMaximumInterval=60000, backoffCoefficient=2.0], workflowDefinitionFactoryFactory=com.amazonaws.services.simpleworkflow.flow.pojo.POJOWorkflowDefinitionFactoryFactory@56de5251]
Nov 28, 2016 12:14:12 PM com.amazonaws.services.simpleworkflow.flow.worker.GenericWorker start
INFO: start: GenericActivityWorker [super=GenericActivityWorker[service=com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient@4c60d6e9, domain=Experimental, taskListToPoll=TEST, identity=3174@ip-10-0-1-141, backoffInitialInterval=100, backoffMaximumInterval=60000, backoffCoefficient=2.0], taskExecutorThreadPoolSize=100]
Start workers
Now Sleep
Sleep Done
Make Call
DECIDER 1
DECIDER 2
DECIDER DOING CATCH
java.lang.IllegalStateException: not ready
at com.amazonaws.services.simpleworkflow.flow.core.Settable.get(Settable.java:91)
at com.amazonaws.services.simpleworkflow.flow.core.Functor.get(Functor.java:35)
at root.DeciderWFMethods.printMessage(DeciderWFMethods.java:79)
at root.DeciderWFMethods.access$100(DeciderWFMethods.java:6)
at root.DeciderWFMethods$1.doTry(DeciderWFMethods.java:54)
at --- continuation ---.(:0)
at root.DeciderWFMethods.workflowExecute(DeciderWFMethods.java:42)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.services.simpleworkflow.flow.pojo.POJOWorkflowDefinition.invokeMethod(POJOWorkflowDefinition.java:150)
at com.amazonaws.services.simpleworkflow.flow.pojo.POJOWorkflowDefinition.access$1(POJOWorkflowDefinition.java:148)
at com.amazonaws.services.simpleworkflow.flow.pojo.POJOWorkflowDefinition$1.doTry(POJOWorkflowDefinition.java:76)
at --- continuation ---.(:0)
at com.amazonaws.services.simpleworkflow.flow.pojo.POJOWorkflowDefinition.execute(POJOWorkflowDefinition.java:66)
at com.amazonaws.services.simpleworkflow.flow.worker.AsyncDecider$WorkflowExecuteAsyncScope.doAsync(AsyncDecider.java:70)
DECIDER DOING FINALLY
Having compared the contents of the generated jars from both the Eclipse and Maven builds, there is nothing obviously different to me.
I have searched the net for something useful but have only really found examples for Java 1.6 / 1.7, nothing for 1.8.
It's at this point that I should mention I'm new to Maven, but I believe it's more likely to be an AspectJ configuration / AWS build tools issue than a Maven problem.
Build & Run
The sample application is run on an EC2 instance, using EC2 IAM roles, against a workflow domain called 'Experimental'.
It accepts a string which the activity upper-cases; the decider should then print the message from the activity.
To build:
mvn clean
mvn package
Then run the compiled jar:
java -jar Test.jar "a test message"
GitHub Link
https://github.com/jwhitefield-hark/aws-flow-maven
Any advice would be greatly appreciated.
We have managed to resolve this issue with help from the kind folks at the AWS forums.
Our problem was twofold.
We had compiler arguments set to -proc:none, which prevented the build from completing.
Also, within our aspectj-maven-plugin we had bound the execution to the process-sources phase, which appears to have been the crux of our problem: it prevented a good build from being created, and it also hid the errors that our compiler arguments were generating.
As a side note, within the aspectj-maven-plugin we had set the targets to 1.6; this is not required. We tried it because it appeared that Eclipse may have been using those settings. Either way, these properties seem to have no effect.
We also changed the AspectJ library from aws-java-sdk-swf-libraries to aws-swf-build-tools to keep it up to date.
https://forums.aws.amazon.com/thread.jspa?threadID=243838&tstart=0
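For reference, a hedged sketch of what the corrected aspectj-maven-plugin configuration might look like (plugin version and exact coordinates are assumptions; the essential points from the answer are dropping the -proc:none compiler argument, not binding the execution to process-sources, and weaving against aws-swf-build-tools):
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>aspectj-maven-plugin</artifactId>
  <version>1.10</version>
  <configuration>
    <complianceLevel>1.8</complianceLevel>
    <source>1.8</source>
    <target>1.8</target>
    <aspectLibraries>
      <!-- Compile-time weaving against the SWF flow aspects -->
      <aspectLibrary>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-swf-build-tools</artifactId>
      </aspectLibrary>
    </aspectLibraries>
  </configuration>
  <executions>
    <execution>
      <!-- Default phase binding; do not bind to process-sources -->
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
  </executions>
</plugin>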

java.lang.NoClassDefFoundError: org/aspectj/weaver/reflect/ReflectionWorld$ReflectionWorldException

I am using AspectJ in my project and added the dependencies in my POM file. When I run my application on WebSphere Application Server Liberty Profile, aspectj.jar is not getting added/created in the library folder. I am very new to using Spring and have never used servers to run an application. When I try to run the application on the server I get the following exception:
Caused by: java.lang.ClassNotFoundException: org.aspectj.weaver.reflect.ReflectionWorld$ReflectionWorldException
Can anyone please tell me what's going wrong with the application?
Thanks!
You'll need to make sure that aspectj is available to your application at runtime. The two basic approaches for this are to package it up with your application zip, or to make it available as a shared library in the server. The first approach has the advantage that you don't need to do any extra config, and no matter where you run your application, the dependency will be there. However, it has the disadvantage of bloating your application. If you end up running multiple applications on the server, it could also cause the apps to use more memory than they would if they were just using a shared copy.
For the first approach, if your dependency has the default scope in the pom, maven should automatically copy it to WEB-INF/lib (assuming your application is a war).
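A sketch of the corresponding pom entry (the version is an assumption; the org.aspectj:aspectjweaver artifact contains the org.aspectj.weaver packages named in the stack trace):
<dependency>
  <groupId>org.aspectj</groupId>
  <artifactId>aspectjweaver</artifactId>
  <version>1.8.9</version>
</dependency>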
For the second approach, you can configure it in Liberty as a global library (available to all applications) by copying it to a wlp/usr/shared/config/lib/global folder in your Liberty install.

Hadoop - submit a job with lots of dependencies (jar files)

I want to write some sort of "bootstrap" class which will watch an MQ for incoming messages and submit map/reduce jobs to Hadoop. These jobs use some external libraries heavily. For the moment I have an implementation of these jobs, packaged as a ZIP file with bin, lib and log folders (I'm using the maven-assembly-plugin to tie things together).
Now I want to provide small wrappers for Mapper and Reducer, which will use parts of the existing application.
As far as I've learned, when a job is submitted, Hadoop tries to find the JAR file which has the mapper/reducer classes, and copies this jar over the network to the data nodes which will process the data. But it's not clear how I tell Hadoop to copy all the dependencies.
I could use the maven-shade-plugin to create an uber-jar with the job and its dependencies, and another jar for the bootstrap (this jar would be executed with the hadoop shell script).
Please advise.
One way could be to put the required jars in the distributed cache. Another alternative would be to install all the required jars on the Hadoop nodes and tell the TaskTrackers about their location. I would suggest you go through this post once; it talks about the same issue.
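A minimal sketch of the distributed cache approach, using the classic API (the HDFS path is hypothetical, and the jar must already have been uploaded to HDFS):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
// Add a jar that already lives in HDFS to the classpath of every task JVM.
DistributedCache.addFileToClassPath(new Path("/libs/my-dependency.jar"), conf);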
Use Maven to manage the dependencies and ensure the correct versions are used during builds and deployment. Popular IDEs have Maven support, so you don't have to worry about building class paths for edit and build. Finally, you can instruct Maven to build a single jar (a "jar-with-dependencies") containing your app and all dependencies, making deployment very easy.
As for dependencies, like hadoop, which are guaranteed to be in the runtime class path, you can define them with a scope of "provided" so they're not included in the uber jar.
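A sketch of the "provided" scope idea (artifact and version are assumptions; use whatever matches your cluster):
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.2.1</version>
  <!-- provided: already on the runtime classpath, so excluded from the uber jar -->
  <scope>provided</scope>
</dependency>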
Use the -libjars option of the hadoop launcher script to specify dependencies for the job running on the remote JVMs; use the $HADOOP_CLASSPATH variable to set dependencies for the JobClient running on the local JVM.
Detailed discussion is here: http://grepalex.com/2013/02/25/hadoop-libjars/
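A hedged usage sketch (paths and class names are hypothetical; note that -libjars is only honored when the job's main class parses its arguments with GenericOptionsParser, e.g. via ToolRunner):
export HADOOP_CLASSPATH=/local/path/dep1.jar:/local/path/dep2.jar
hadoop jar myjob.jar com.example.MyJob -libjars /local/path/dep1.jar,/local/path/dep2.jar /input /output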
