Error while trying to do POS tagging: Error while loading a tagger model (probably missing model file) - stanford-nlp

I am trying to use StanfordNLP for Croatian from the Windows command prompt. I have downloaded the model package for this language (hr_set_models), which contains .pt files.
I have created the .properties file, but I get the following message:
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
The tokenizer model causes no problem, and the file hr_set_tagger.pt is in the folder.
The model folder also contains a file named hr_set.pretrain.pt; I do not know whether I should reference it in the .properties file.
Thanks in advance!
Below is the .properties file I have created.
annotators = tokenize, ssplit, pos, lemma, depparse
# tokenize
tokenize.model = hr_set_models/hr_set_tokenizer.pt
# pos
pos.model = hr_set_models/hr_set_tagger.pt
# lemma
lemma.model = hr_set_models/hr_set_lemmatizer.pt
#depparse
depparse.model = hr_set_models/hr_set_parser.pt

You need to use the full Python pipeline. The .pt files are models for the Python StanfordNLP package; there are no Java models for Croatian, so you shouldn't be using Stanford CoreNLP or its server.
There is more documentation here: https://stanfordnlp.github.io/stanfordnlp/pipeline.html

Try using these dependencies:
<dependencies>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.6.0</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.6.0</version>
<classifier>models</classifier>
</dependency>
</dependencies>

Related

Why does running documents4j tests with Maven produce an error?

I am using documents4j to convert docx to PDF. When I run it in IDEA, everything is fine, but when I run the tests with Maven or Jenkins, I get an error:
java.lang.IllegalStateException: Shutdown in progress
at java.base/java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82) ~[na:na]
at java.base/java.lang.Runtime.removeShutdownHook(Runtime.java:242) ~[na:na]
at com.documents4j.job.ConverterAdapter.deregisterShutdownHook(ConverterAdapter.java:121) ~[documents4j-util-conversion-1.1.5.jar:na]
at com.documents4j.job.ConverterAdapter.cleanUp(ConverterAdapter.java:107) ~[documents4j-util-conversion-1.1.5.jar:na]
at com.documents4j.job.ConverterAdapter.shutDown(ConverterAdapter.java:98) ~[documents4j-util-conversion-1.1.5.jar:na]
at com.documents4j.job.LocalConverter.shutDown(LocalConverter.java:109) ~[documents4j-local-1.1.5.jar:na]
at com.documents4j.job.ConverterAdapter$ConverterShutdownHook.run(ConverterAdapter.java:134) ~[documents4j-util-conversion-1.1.5.jar:na]
My pom.xml looks like this:
<dependency>
<groupId>com.documents4j</groupId>
<artifactId>documents4j-local</artifactId>
<version>1.1.5</version>
</dependency>
<dependency>
<groupId>com.documents4j</groupId>
<artifactId>documents4j-transformer-msoffice-word</artifactId>
<version>1.1.5</version>
</dependency>
I assume that Maven runs the tests in parallel (possibly even alongside an already running converter), which does not work. MS Word needs to run as a singleton. I do not recommend firing up the converter within a test.
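To illustrate the singleton idea, here is a minimal sketch in plain Java. `SharedInstance` is a hypothetical helper, not part of documents4j, and the String factory only stands in for whatever builds the real converter. It assumes all tests run in the same JVM and fetch the converter through this one holder.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SharedInstanceDemo {
    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Stand-in for an expensive resource such as a converter; real code would
        // build the actual converter in this factory instead of a String.
        SharedInstance<String> shared =
                new SharedInstance<>(() -> "converter-" + created.incrementAndGet());

        String first = shared.get();
        String second = shared.get();
        System.out.println(first == second); // true: the same object is returned
        System.out.println(created.get());   // 1: the factory ran only once
    }
}

/** Hypothetical helper: creates one instance lazily and reuses it afterwards. */
final class SharedInstance<T> {
    private final Supplier<T> factory;
    private T instance; // guarded by "this"

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the single instance, creating it on first use. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}
```

Because `get()` is synchronized, parallel callers in one JVM still end up with a single instance. It does not help if the build forks several JVMs.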
I had the same issue and solved it.
The problem occurs because you create a new Docx file and convert it to PDF at the same time (in the same action).
It works well if the Docx file already exists before you convert it.

How to use Evaluators in Java to score on a PMML using org.apache.spark?

I've implemented the code for scoring on a provided PMML file and a csv data file (Linear Regression) using Spark and Java. For this I've used jpmml-evaluator-spark and spark-mllib_2.11 maven artifacts, and it works fine.
Now I'm looking to replace the jpmml-evaluator-spark library, which is AGPL-licensed, with something similar that may be bundled with org.apache.spark (or any other fully open-source option).
I don't see Evaluators for scoring on a PMML available in org.apache.spark group of dependencies. Please confirm if this is correct and suggest some alternative.
https://github.com/jpmml/jpmml-evaluator-spark
This is the PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/) and is AGPL.
Also refer to: http://spark.apache.org/docs/latest/ml-guide.html
These suggest that what ships with Apache Spark covers algorithms, model creation, and training, but scoring against a PMML model is not available there and is provided only by jpmml-evaluator-spark.
import org.apache.spark.ml.Transformer;
import org.apache.spark.sql.Dataset;
import org.jpmml.evaluator.Evaluator;
import org.jpmml.evaluator.EvaluatorBuilder;
import org.jpmml.evaluator.LoadingModelEvaluatorBuilder;
import org.jpmml.evaluator.spark.TransformerBuilder;
import org.jpmml.evaluator.visitors.DefaultVisitorBattery;
...
...
...
EvaluatorBuilder evaluatorBuilder = new LoadingModelEvaluatorBuilder().setLocatable(false)
.setVisitors(new DefaultVisitorBattery()).load(pmmlInputStream);
Evaluator evaluator = evaluatorBuilder.build();
evaluator.verify();
TransformerBuilder pmmlTransformerBuilder = new TransformerBuilder(evaluator).withLabelCol("Predicted_SpeciesCategory").exploded(true);
Transformer pmmlTransformer = pmmlTransformerBuilder.build();
Dataset<?> resultDataset = pmmlTransformer.transform(csvDataset);
...
...
Maven dependencies:
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>jpmml-evaluator-spark</artifactId>
<version>1.2.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.4.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>jpmml-sparkml</artifactId>
<version>1.5.4</version>
</dependency>
This code still has a dependency on the org.jpmml library, which I wish to remove. I am looking for an alternative using the org.apache.spark library to achieve similar results.
You could use PMML4S-Spark to evaluate a PMML model in Spark, for example:
import org.pmml4s.spark.ScoreModel
val model = ScoreModel.fromInputStream(pmmlInputStream)
val resultDataset = model.transform(csvDataset)
If you want to use PMML4S-Spark from Java, it is just as easy and very similar to the Scala version, for example:
import org.pmml4s.spark.ScoreModel;
import org.apache.spark.sql.Dataset;
ScoreModel model = ScoreModel.fromInputStream(pmmlInputStream);
Dataset<?> resultDataset = model.transform(csvDataset);
BTW, PMML4S-Spark is licensed under the Apache License 2.0.
My answer might be totally irrelevant to your question, but since I faced an issue and kept coming back to this question, I don't want others to face it. I eventually reached the solution by going through many Stack Overflow posts...
The problem
I was using Spark in Java with this old Maven dependency:
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>pmml-evaluator-metro</artifactId>
<version>1.6.3</version>
</dependency>
I thought this was right and would work, but TransformerBuilder could not be resolved from it.
The dependency below should solve the problem if it is related to TransformerBuilder:
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>jpmml-evaluator-spark</artifactId>
<version>1.3.0</version>
</dependency>
That was it. You're welcome in advance 😉

How to add a feature in OpenDaylight Nitrogen?

I am trying to add some features to my OpenDaylight project (e.g. l2switch, dlux, rest, ...).
In the Carbon release I added such features by editing features.xml and pom.xml. I am now using the Nitrogen release, and after adding these dependencies to my features pom.xml, the features still do not show up when I log in to Karaf (using feature:install/list).
<dependency>
<groupId>org.opendaylight.netconf</groupId>
<artifactId>features-restconf</artifactId>
<classifier>features</classifier>
<version>${restconf.version}</version>
<type>xml</type>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.opendaylight.dluxapps</groupId>
<artifactId>features-dluxapps</artifactId>
<classifier>features</classifier>
<version>${dluxapps.version}</version>
<type>xml</type>
<scope>runtime</scope>
</dependency>
Am I missing something else? When I try to add repositories, as I previously did in the Carbon release, features.xml is automatically regenerated and all my edits are removed.
I am using the Nitrogen release, selected by passing -DarchetypeVersion=1.4.0 when generating my Maven artifact.
See the upstream configuration management tooling for running-code examples being used constantly in downstreams like OPNFV.
# Configuration of Karaf features to install
file { 'org.apache.karaf.features.cfg':
ensure => file,
path => '/opt/opendaylight/etc/org.apache.karaf.features.cfg',
# Set user:group owners
owner => 'odl',
group => 'odl',
}
$features_csv = join($opendaylight::features, ',')
file_line { 'featuresBoot':
path => '/opt/opendaylight/etc/org.apache.karaf.features.cfg',
line => "featuresBoot=${features_csv}",
match => '^featuresBoot=.*$',
}
puppet-opendaylight, manifests/config.pp, stable/nitrogen
So basically you shouldn't be editing the XML directly; you should edit the configuration that generates the XML. I'm surprised that worked in Carbon.
I recommend directly using upstream configuration management tooling, like puppet-opendaylight or ansible-opendaylight, vs trying to figure out the configuration knobs yourself, duplicating effort. If you're doing a more complex deployment, look at the OPNFV installer scenarios (that build on these ODL tools) vs trying to solve that very hard problem yourself.

Flink JDBCInputFormat cannot find method 'setRowTypeInfo'

I want to use flink-jdbc to read data from MySQL.
I have seen this example on the Apache Flink website:
// Read data from a relational database using the JDBC input format
DataSet<Tuple2<String, Integer>> dbData =
env.createInput(
JDBCInputFormat.buildJDBCInputFormat()
.setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
.setDBUrl("jdbc:derby:memory:persons")
.setQuery("select name, age from persons")
.setRowTypeInfo(new RowTypeInfo(BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.INT_TYPE_INFO))
.finish()
);
But when I try to write a demo, I can't find the method 'setRowTypeInfo'.
My code looks like this:
import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.api.java.ExecutionEnvironment
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat
import org.apache.flink.api.java.typeutils.RowTypeInfo
import org.apache.flink.api.scala._
/**
 * Created by lulijun on 17/7/7.
 */
object FlinkJDBC {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.createLocalEnvironment()
    val dbData = env.createInput(
      JDBCInputFormat.buildJDBCInputFormat
        .setDrivername("com.mysql.jdbc.Driver")
        .setDBUrl("XXX")
        .setUsername("xxx")
        .setPassword("XXX")
        .setQuery("select name, age from persons")
        // RowTypeInfo describes the column types of the result rows
        .setRowTypeInfo(new RowTypeInfo(BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.INT_TYPE_INFO))
        .finish)
    // print() triggers execution itself, so no separate env.execute() call is needed
    dbData.print()
  }
}
The "setRowTypeInfo" method is always red, and the IDEA prompts
"cannot resolve symbol setRowTypeInfo"
The flink-jdbc jar version I used is 1.0.0.
<dependencies>
<!-- Use this dependency if you are using the DataSet API -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-jdbc</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.36</version>
</dependency>
</dependencies>
I have searched a lot, and most people use the method exactly as in the official documentation, but no one mentions this problem.
I suspect I am using the wrong version of flink-jdbc, but I cannot find any information about the right way to use it.
If you know what the problem is, please let me know. Thank you.
I changed the flink-jdbc version from 1.0.0 to 1.3.0, and the problem was solved.
When I searched for flink-jdbc on the Maven repository website,
https://mvnrepository.com/search?q=flink-jdbc, I could not find the right information in the first few pages, which made me think the flink-jdbc version did not need to match the other Flink jars.
In fact, flink-jdbc 1.1.3 uses the RowTypeInfo class from the api.table package, whereas flink-jdbc 1.3.0 uses the RowTypeInfo class from the api.java package, so flink-jdbc is closely tied to the other Flink modules.
We must make sure the versions match.
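As a small illustration of that check, here is a sketch in plain Java that lists the artifacts whose version differs from the one you expect. `FlinkVersionCheck` and `mismatched` are hypothetical names made up for this example; the artifact names and versions are the ones from the pom.xml in the question. It is not a Flink or Maven API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class FlinkVersionCheck {
    /** Returns the artifacts whose version differs from the expected one. */
    static Map<String, String> mismatched(Map<String, String> artifacts, String expected) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : artifacts.entrySet()) {
            if (!expected.equals(entry.getValue())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Artifact -> version, copied from the question's pom.xml.
        Map<String, String> deps = new LinkedHashMap<>();
        deps.put("flink-scala_2.10", "1.3.0");
        deps.put("flink-clients_2.10", "1.3.0");
        deps.put("flink-jdbc", "1.0.0");

        System.out.println(mismatched(deps, "1.3.0")); // prints {flink-jdbc=1.0.0}
    }
}
```

In a real build it is usually simpler to define the Flink version once as a Maven property and reference it from every Flink dependency, so the versions cannot drift apart.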

eye.candy.sixties not found?

When I try to run my report, I'm getting this exception:
Chart theme 'eye.candy.sixties' not found.
net.sf.jasperreports.engine.JRRuntimeException: Chart theme 'eye.candy.sixties' not found.
Sure enough, I couldn't find the theme defined anywhere in jasper-4.0.2.jar. Which library do I need to get the default iReport chart themes?
I had this problem with charts using the 'aegean' theme in a web application.
I copied the jasperreports-chart-themes-4.x.x.jar, e.g.
jasperreports-server-cp-4.0.0/ireport/ireport/modules/ext/jasperreports-chart-themes-4.0.0.jar
into my WEB-INF/lib and the charts worked.
<dependency>
<groupId>net.sf.jasperreports</groupId>
<artifactId>jasperreports-chart-themes</artifactId>
<version>${jasperReport.version}</version>
</dependency>
<dependency>
<groupId>net.sf.jasperreports</groupId>
<artifactId>jasperreports-fonts</artifactId>
<version>${jasperReport.version}</version>
</dependency>
You would have to build a project and a jar with the themes manually. There doesn't seem to be an easy library you could just include.
