cleartk dependency not found when calling StanfordCoreNLPAnnotator from UIMA RUTA - stanford-nlp

I am trying to call ClearTK's StanfordCoreNLPAnnotator from within UIMA RUTA, but cannot get it to work. I am using Eclipse with a Maven-enabled RUTA project in which I also have Java code for auxiliary tasks. I have imported cleartk-stanford-corenlp 0.8 using Maven.
I tried using this line in my script:
ENGINE utils.MyStanfordEngine;
... where utils/MyStanfordEngine.xml is an XML descriptor file created using this java code:
MyStanfordAnnotator.getDescription().toXML(new FileOutputStream("descriptor/utils/MyStanfordEngine.xml"));
No errors appear, but upon execution I get:
Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class ... failed.
(Descriptor: file:.../descriptor/mainScriptEngine.xml)
...
Caused by: org.apache.uima.resource.ResourceInitializationException: Annotator class
"org.cleartk.stanford.StanfordCoreNLPAnnotator" was not found.
(Descriptor: file:.../descriptor/utils/MyStanfordEngine.xml)
...
I think I understand that the RUTA project does not find it in the Maven dependencies, but I need to stick to Maven as my dependency tool because of collaboration purposes.
Can someone help?
UPDATE:
When I encountered the problem, I was using RUTA 2.1.0. I have updated to 2.2.0rc1 since then, but the problem persisted.
With Peter's suggestion below (Thanks!), in the Java build path, I referenced a blank Maven-enabled Java project that does nothing but imports cleartk-stanford-corenlp 0.8. I can now run the following RUTA code:
TYPESYSTEM utils.CleartkRutaTypeSystem;
ENGINE utils.MyStanfordEngine;
Document{-> CALL(MyStanfordEngine)};
... successfully does what looks like all intended annotations for all documents in the input folder, but eventually crashes with this Exception:
[Stanford Tools Logging output ...]
22.02.2014 12:44:22 org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(406)
SCHWERWIEGEND: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:477)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:374)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:168)
at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:129)
Caused by: java.lang.NullPointerException
at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:483)
at org.apache.uima.cas.impl.CASImpl.createAnnotation(CASImpl.java:3837)
at org.apache.uima.ruta.action.CallAction.callEngine(CallAction.java:192)
at org.apache.uima.ruta.action.CallAction.execute(CallAction.java:62)
at org.apache.uima.ruta.rule.AbstractRuleElement.apply(AbstractRuleElement.java:130)
at org.apache.uima.ruta.rule.RuleElementCaretaker.applyRuleElements(RuleElementCaretaker.java:111)
at org.apache.uima.ruta.rule.ComposedRuleElement.applyRuleElements(ComposedRuleElement.java:547)
at org.apache.uima.ruta.rule.AbstractRuleElement.doneMatching(AbstractRuleElement.java:84)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallback(ComposedRuleElement.java:468)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallbackContinue(ComposedRuleElement.java:377)
at org.apache.uima.ruta.rule.RutaRuleElement.startMatch(RutaRuleElement.java:100)
at org.apache.uima.ruta.rule.ComposedRuleElement.startMatch(ComposedRuleElement.java:73)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:47)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:40)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:29)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:475)
... 6 more
Exception in thread "main" org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:477)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:374)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.ruta.ide.launching.RutaLauncher.processFile(RutaLauncher.java:168)
at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:129)
Caused by: java.lang.NullPointerException
at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:483)
at org.apache.uima.cas.impl.CASImpl.createAnnotation(CASImpl.java:3837)
at org.apache.uima.ruta.action.CallAction.callEngine(CallAction.java:192)
at org.apache.uima.ruta.action.CallAction.execute(CallAction.java:62)
at org.apache.uima.ruta.rule.AbstractRuleElement.apply(AbstractRuleElement.java:130)
at org.apache.uima.ruta.rule.RuleElementCaretaker.applyRuleElements(RuleElementCaretaker.java:111)
at org.apache.uima.ruta.rule.ComposedRuleElement.applyRuleElements(ComposedRuleElement.java:547)
at org.apache.uima.ruta.rule.AbstractRuleElement.doneMatching(AbstractRuleElement.java:84)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallback(ComposedRuleElement.java:468)
at org.apache.uima.ruta.rule.ComposedRuleElement.fallbackContinue(ComposedRuleElement.java:377)
at org.apache.uima.ruta.rule.RutaRuleElement.startMatch(RutaRuleElement.java:100)
at org.apache.uima.ruta.rule.ComposedRuleElement.startMatch(ComposedRuleElement.java:73)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:47)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:40)
at org.apache.uima.ruta.rule.RutaRule.apply(RutaRule.java:29)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:475)
... 6 more
Sorry for the whole stack trace, but I thought if a RUTA developer is reading this they may want the whole thing.
Is there a way to solve this? What am I doing wrong?

There are several limitations to consider:
UIMA Ruta 2.1.0 does not support mixin projects: Maven dependencies need to be specified in a separate project, and the Ruta project then has to depend on that additional Java project.
UIMA Ruta Workbench 2.1.0 has problems validating imported type systems that in turn import other type systems by name. In that case, import by location should be used instead.
UIMA CAS Editor 2.5.0 has problems resolving type system imports via the datapath, which causes problems visualizing the created annotations if the type system descriptor needs additional information such as the datapath. In that case, the type system descriptor generated for a script should include (not only import) all types of the imported type systems. This can be configured in the preferences (I have not used that option for a while). This problem can also be avoided by using import by location.
UIMA Ruta 2.2.0 supports mixin projects. Here, only the problem with the CAS Editor remains.
The described project can be created as follows (with UIMA Ruta 2.2.0):
Create a new UIMA Ruta Project
Make it a Maven project: context menu -> Configure -> Convert to Maven Project
Add a dependency to cleartk-stanford-corenlp in the pom
<dependency>
  <groupId>org.cleartk</groupId>
  <artifactId>cleartk-stanford-corenlp</artifactId>
  <version>0.8.0</version>
</dependency>
Provide the type systems in the descriptor folder or in a dependent project, e.g., copy the org folder of cleartk-type-system-1.2.0 to the descriptor folder. Mind that the CAS Editor will have problems resolving the imports if the descriptors are not adapted.
Create a simple script that imports the type system, imports the analysis engine, and executes the analysis engine. Here, the uimaFIT component is imported directly instead of a descriptor. The EXEC action needs to be extended with the types of interest if later rules should be able to operate on the result of the imported analysis engine.
TYPESYSTEM org.cleartk.TypeSystem;
UIMAFIT org.cleartk.stanford.StanfordCoreNLPAnnotator;
Document{->EXEC(StanfordCoreNLPAnnotator)};
If there is a text file in the input folder, running this script should annotate it.
This example directly uses the StanfordCoreNLPAnnotator instead of an additional analysis engine, but switching to another implementation or analysis engine should be straightforward.
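Before changing the project setup, it can help to confirm whether the annotator class is visible at all on the classpath that the launcher actually uses. The following is only a minimal diagnostic sketch using plain JDK calls; the class name ClasspathCheck and its method isLoadable are hypothetical helpers, not part of UIMA, Ruta, or ClearTK.

```java
/** Hypothetical helper: reports whether a class is visible on the current classpath. */
final class ClasspathCheck {

    /** Returns true if the named class can be resolved by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or one of the classes it links against is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the error message of the question
        String name = args.length > 0 ? args[0] : "org.cleartk.stanford.StanfordCoreNLPAnnotator";
        System.out.println(name + (isLoadable(name)
                ? " is visible on this classpath"
                : " is NOT visible on this classpath"));
    }
}
```

Run with the same classpath as the failing launch configuration, this shows whether the Maven dependency actually reaches the runtime classpath, which is the situation the "was not found" message points to.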

Related

Use Groovy app and test code in combination with jlink solution for bundling JavaFX

This follows on from this excellent solution to the question of how to get Gradle to bundle up JavaFX with your distributions.
NB specs: Linux Mint 18.3, Java 11, JavaFX 13.
That stuff, involving jlink and a module-info.java, is beyond my pay grade (although I'm trying to read up on these things).
I want to move to using Groovy in my app and test code (i.e. Spock) rather than Java. The trouble is, the minute I include the "normal" dependency in my build.gradle i.e.
implementation 'org.codehaus.groovy:groovy-all:2.5.9'
and try to build, I get multiple errors:
mike@M17A ~/IdeaProjects/TestProj $ ./gradlew build
> Configure project :
Found module name 'javafx.jlink.example.main'
> Task :compileTestJava FAILED
error: the unnamed module reads package org.codehaus.groovy.tools.shell.util from both org.codehaus.groovy.groovysh and org.codehaus.groovy
[...]
error: the unnamed module reads package groovy.xml from both org.codehaus.groovy and org.codehaus.groovy.xml
[...]
error: module org.codehaus.groovy.ant reads package groovy.lang from both org.codehaus.groovy and org.codehaus.groovy.test
error: module org.codehaus.groovy.ant reads package groovy.util from both org.codehaus.groovy.xml and org.codehaus.groovy.ant
100 errors
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':compileTestJava'.
Yes, 100 errors... probably more! By commenting out various things I think I've come to the conclusion that some Groovy dependency is being injected by the jlink stuff. Fine, I can live with that (although it'd be nice to know what version of Groovy it is).
The trouble is, even if I omit the Groovy dependency line, the same errors occur when I try to introduce the Spock dependency:
testImplementation 'org.spockframework:spock-core:1.2-groovy-2.5'
Has anyone got any idea what's going on here and what to do about it?
I searched for an answer but didn't find a good solution.
According to this, it seems that Groovy is currently not really compatible with Java modules. This is because some packages are contained in multiple jars of the library, which the module system does not allow. You will have to wait for Groovy 4 for a compatible version.
I discovered that the JavaFX plugin uses this plugin internally. That plugin seems to treat all dependencies as modules (this is not the default Gradle behaviour).
To make your application work, it seems that you have to:
force Gradle to put Groovy on the classpath instead of the module path (it will then not be treated as a module, but this seems impossible if you use the javafx plugin)
use the "patch-module" system: it allows Gradle to merge the library jars into a single module, which avoids the problem of packages that are spread across different jars
I searched the Groovy jars with IDEA (Project structure/Libraries), and I tried to use the syntax offered by the plugin to use "patch-module":
patchModules.config = [
"org.codehaus.groovy=groovy-ant-3.0.1.jar",
"org.codehaus.groovy=groovy-cli-picocli-3.0.1.jar",
"org.codehaus.groovy=groovy-console-3.0.1.jar",
"org.codehaus.groovy=groovy-datetime-3.0.1.jar",
"org.codehaus.groovy=groovy-docgenerator-3.0.1.jar",
"org.codehaus.groovy=groovy-groovydoc-3.0.1.jar",
"org.codehaus.groovy=groovy-groovysh-3.0.1.jar",
"org.codehaus.groovy=groovy-jmx-3.0.1.jar",
"org.codehaus.groovy=groovy-json-3.0.1.jar",
"org.codehaus.groovy=groovy-jsr-3.0.1.jar",
"org.codehaus.groovy=groovy-macro-3.0.1.jar",
"org.codehaus.groovy=groovy-nio-3.0.1.jar",
"org.codehaus.groovy=groovy-servlet-3.0.1.jar",
"org.codehaus.groovy=groovy-sql-3.0.1.jar",
"org.codehaus.groovy=groovy-swing-3.0.1.jar",
"org.codehaus.groovy=groovy-templates-3.0.1.jar",
"org.codehaus.groovy=groovy-test-junit-3.0.1.jar",
"org.codehaus.groovy=groovy-test-3.0.1.jar",
"org.codehaus.groovy=groovy-testng-3.0.1.jar",
"org.codehaus.groovy=groovy-xml-3.0.1.jar"
]
It only works with a single line "org.codehaus.groovy=X.jar", but a bug prevents it from working with all of the library jars (see this issue on GitHub).
So you have several choices:
Use Java instead of Groovy
Wait for a new Groovy release, or for new releases of the plugins (modules-plugin, and a version of javafx-plugin that uses it internally)
Use the old JavaFX configuration: dependencies are not modules by default, and you have to specify manually in build.gradle that the JavaFX dependencies should be treated as modules (check my "obsolete" answer to this question)
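To see which packages are actually spread over several jars (the situation behind the "reads package ... from both" errors), the jars can be scanned directly. This is a rough sketch using only the JDK; SplitPackageFinder and splitPackages are hypothetical names, and the jar paths are whatever you pass on the command line, for example the Groovy jars from your Gradle cache.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: finds packages whose classes live in more than one jar. */
final class SplitPackageFinder {

    /** Maps each package found in two or more jars to the names of those jars. */
    static Map<String, Set<String>> splitPackages(Iterable<File> jars) throws IOException {
        Map<String, Set<String>> byPackage = new TreeMap<>();
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                Enumeration<JarEntry> entries = jf.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    int slash = name.lastIndexOf('/');
                    // Only class files inside a named package are relevant
                    if (!name.endsWith(".class") || slash < 0 || name.startsWith("META-INF/")) {
                        continue;
                    }
                    String pkg = name.substring(0, slash).replace('/', '.');
                    byPackage.computeIfAbsent(pkg, k -> new TreeSet<>()).add(jar.getName());
                }
            }
        }
        // Keep only packages that occur in at least two jars
        byPackage.values().removeIf(owners -> owners.size() < 2);
        return byPackage;
    }

    public static void main(String[] args) throws IOException {
        List<File> jars = new ArrayList<>();
        for (String path : args) {
            jars.add(new File(path));
        }
        splitPackages(jars).forEach((pkg, owners) -> System.out.println(pkg + " -> " + owners));
    }
}
```

Every package it prints is one that the module system will reject as long as the owning jars are treated as separate modules.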

Unable to load Multimaps when added dependency with Apache Hive

I have added a Guava dependency to use Multimaps, and I have also added a Hive dependency to my project.
I am getting the following error while compiling the application.
An attempt was made to call the method com.google.common.collect.Multimaps.asMap(Lcom/google/common/collect/ListMultimap;)Ljava/util/Map; but it does not exist. Its class, com.google.common.collect.Multimaps, is available from the following locations:
jar:file:/Users/sreenivas/.m2/repository/org/apache/hive/hive-exec/1.2.1/hive-exec-1.2.1.jar!/com/google/common/collect/Multimaps.class
jar:file:/Users/sreenivas/.m2/repository/com/google/guava/guava/25.1-jre/guava-25.1-jre.jar!/com/google/common/collect/Multimaps.class
It was loaded from the following location:
file:/Users/sreenivas/.m2/repository/org/apache/hive/hive-exec/1.2.1/hive-exec-1.2.1.jar
Action:
Correct the classpath of your application so that it contains a single, compatible version of com.google.common.collect.Multimaps.
Can anyone suggest how to make the application use the latest version of the dependency?
It is caused by the hive-exec package, which includes its own copy of /com/google/common/collect/Multimaps.class.
If you have to include both jars (hive-exec-1.2.1.jar and guava-25.1-jre.jar), you had better fix hive-exec's source code and repackage it.
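To check for this kind of duplication in your own project, you can ask the class loader for every location that provides a given class file. This is a small sketch with plain JDK calls; DuplicateClassFinder and locations are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: lists all classpath locations that contain a given class file. */
final class DuplicateClassFinder {

    /** Returns every URL on the classpath that provides the class file of className. */
    static List<URL> locations(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Default to the class named in the error message of the question
        String name = args.length > 0 ? args[0] : "com.google.common.collect.Multimaps";
        List<URL> found = locations(name);
        System.out.println(found.size() + " location(s) for " + name);
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

More than one printed location means that several jars ship the class. Which copy is loaded then typically depends on classpath order, which matches the report in the error message above.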

How to run external ruta scripts from a maven project without placing the script or its typesystem in the classpath?

Until now, I had been running Ruta scripts from a Maven project by creating an AnalysisEngine and a CAS and then processing the engine. To do this, I had placed all the scripts and descriptor files (Engine & TypeSystem) in the src/main/resources folder of the Maven project.
Now I want to place the scripts and TypeSystem files in an external path and pass the path dynamically to my Java code that runs the scripts. Is this possible? If so, how?
I simply placed the files (script & descriptor) in an external path and passed the new path to instantiate the AnalysisEngine as below:
final AnalysisEngine engine = AnalysisEngineFactory.createEngine("home/admin/Desktop/TEST_ScriptFolder/com/textjuicer/ruta/date/Dazzle_ChapRef_UpdatedEngine");
Error
org.apache.uima.util.InvalidXMLException: An import could not be resolved. No file with name "home/admin/Desktop/TEST_ScriptFolder/com/textjuicer/ruta/date/Dazzle_ChapRef_UpdatedEngine.xml" was found in the class path or data path. (Descriptor: )
at org.apache.uima.resource.metadata.impl.Import_impl.findAbsoluteUrl(Import_impl.java:117)
at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription(AnalysisEngineFactory.java:869)
at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine(AnalysisEngineFactory.java:107)
at com.textjuicer.ruta.date.ArtifactAnnotator.getAllAnnotations(ArtifactAnnotator.java:93)
at ApplyingStyle.XmiTransformer.parseXMI(XmiTransformer.java:33)
at ApplyingStyle.ApplyStyle.applyStyleOnDocx(ApplyStyle.java:76)
There are two layers:
The RutaEngine needs to find the scripts/resources/descriptors
UIMA needs to be able to resolve imports of descriptors
The resource lookup in Ruta has two stages: it first searches for the resources in the absolute paths specified in the configuration parameters, and if a resource is not found there, it searches for it in the classpath. So you need to set the configuration parameters: scripts are located via scriptPaths, descriptors via descriptorPaths, and wordlists via resourcePaths. See the documentation for further information.
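The lookup order described above can be illustrated with a small stand-alone sketch. This is not Ruta's actual implementation, only an approximation of the "configured paths first, classpath second" idea using JDK calls; TwoStageLookup and resolve are hypothetical names.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of a two-stage lookup: configured directories, then classpath. */
final class TwoStageLookup {

    /**
     * Looks for relativeName under each configured directory first and falls back
     * to the classpath. Returns null if neither stage finds the resource.
     */
    static URL resolve(String relativeName, List<String> configuredPaths) throws MalformedURLException {
        // Stage 1: directories given explicitly (comparable to scriptPaths or descriptorPaths)
        for (String dir : configuredPaths) {
            File candidate = new File(dir, relativeName);
            if (candidate.isFile()) {
                return candidate.toURI().toURL();
            }
        }
        // Stage 2: classpath lookup
        return TwoStageLookup.class.getClassLoader().getResource(relativeName);
    }

    public static void main(String[] args) throws MalformedURLException {
        if (args.length < 2) {
            System.out.println("usage: TwoStageLookup <relative/name> <dir> [<dir> ...]");
            return;
        }
        List<String> dirs = Arrays.asList(args).subList(1, args.length);
        System.out.println(resolve(args[0], dirs));
    }
}
```

The point of the sketch is that an external folder is only consulted if it is listed explicitly. A file that is neither in a configured path nor on the classpath is not found.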
The problems with the imports in descriptors can be solved either by setting the datapath in the UIMA ResourceManager or by changing the import to "location" instead of "name". The datapath can be used as a replacement for the classpath. The Ruta descriptors use import by location if this is specified in the ruta-maven-plugin.
DISCLAIMER: I am a developer of UIMA Ruta

Exception in compiling groovy & java using maven

My groovy file contains:
@Grapes([
@Grab('org.codehaus.groovy.modules.http-builder:http-builder:0.7'),
@Grab('org.apache.httpcomponents:httpmime:4.5.1')
])
.......code
I am trying to compile groovy and java code. But I am getting below error:
java.lang.RuntimeException: Transform groovy.grape.GrabAnnotationTransformation@69bda33a cannot be run
This works for me, note that I did change HttpBuilder to v.0.7.1:
@Grapes([
@Grab(group='org.codehaus.groovy.modules.http-builder', module='http-builder', version='0.7.1'),
@Grab(group='org.apache.httpcomponents', module='httpmime', version='4.5.1')
])
Likely way too late for you to care, but I saw the same error just now.
I suspect the problem is that the @Grab annotation can't take effect because Maven is controlling the dependencies, or perhaps because Maven is trying to compile both Groovy and Java code, and the class loader created by the @Grab annotation can't influence the Java code.
Upshot is, I suspect you (and I) need to move the dependency out of the Groovy class in question, and put it into the pom.xml file Maven is using.

I've already added tools.jar in classpath, why still java.lang.NoClassDefFoundError: com.sun.jdi.Bootstrap thrown?

I'm using the HotSwap function of javassist, which requires tools.jar on the classpath, so I added -cp tools.jar when starting my OSGi application. But when I call new HotSwap() in the code of one of the bundles,
java.lang.NoClassDefFoundError: com.sun.jdi.Bootstrap
was thrown. com.sun.jdi.Bootstrap is in tools.jar, which I have already added to the classpath. I also verified that this worked, because otherwise the following code would not work:
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
Can the class loader of HotSwapper not load the class com.sun.jdi.Bootstrap? Then why does it work properly in my Eclipse environment? (I added tools.jar to the libraries of the build path.)
Any clue as to why the NoClassDefFoundError occurs is appreciated.
You have to make sure the system bundle exports this package. For example in Felix the file jre.properties defines what packages are exported by the system bundle. Add the package com.sun.jdi there and it should work.
In Eclipse this is done in config.ini. You can use org.osgi.framework.system.packages.extra= to define additional packages to export. I would rather not use bootdelegation=* as it might expose unwanted packages too. See:
http://www.eclipse.org/forums/index.php/m/734358/
http://wiki.eclipse.org/Equinox_Boot_Delegation
In Equinox, you can set Boot Delegation to * to gain access to all classes on the boot classpath; see this wiki for details. In 3.2, it was osgi.compatibility.bootdelegation=true in config.ini.
