How does one use the natural logic component of Stanford CoreNLP?
I am using CoreNLP 3.9.1 and I fed natlog as an annotator in command line, but I don't seem to see any natlog result in the output, i.e. OperatorAnnotation and PolarityAnnotation, according to this link. Does that have anything to do with the outputFormat? I've tried xml and json, but neither has any output on natural logic. The other stuff (tokenization, dep parse) is in there though.
Here is my command:
./corenlp.sh -annotators tokenize,ssplit,pos,lemma,depparse,natlog -file natlog.test -outputFormat xml
Thanks in advance.
I don't think any of the output options show the natlog stuff. This is more designed if you have a Java system and are working with the Annotations themselves in Java code. You should be able to see them by looking at the CoreLabel for each token.
This code snippet works for me:
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
// this is the polarity annotation!
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations.PolarityDirectionAnnotation;
// not the one below!
// import edu.stanford.nlp.ling.CoreAnnotations.PolarityAnnotation;
import edu.stanford.nlp.util.PropertiesUtils;
import java.io.*;
import java.util.*;
public class test {
public static void main(String[] args) throws FileNotFoundException, UnsupportedEncodingException {
// code from: https://stanfordnlp.github.io/CoreNLP/api.html#generating-annotations
StanfordCoreNLP pipeline = new StanfordCoreNLP(
PropertiesUtils.asProperties(
// **add natlog here**
"annotators", "tokenize,ssplit,pos,lemma,parse,depparse,natlog",
"ssplit.eolonly", "true",
"tokenize.language", "en"));
// read some text in the text variable
String text = "Every dog sees some cat";
Annotation document = new Annotation(text);
// run all Annotators on this text
pipeline.annotate(document);
// these are all the sentences in this document
// a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
for(CoreMap sentence: sentences) {
// traversing the words in the current sentence
// a CoreLabel is a CoreMap with additional token-specific methods
for (CoreLabel token: sentence.get(TokensAnnotation.class)) {
// this is the text of the token
String word = token.get(TextAnnotation.class);
// this is the POS tag of the token
String pos = token.get(PartOfSpeechAnnotation.class);
// this is the NER label of the token
String ne = token.get(NamedEntityTagAnnotation.class);
// this is the polarity label of the token
String pol = token.get(PolarityDirectionAnnotation.class);
System.out.print(word + " [" + pol + "] ");
}
System.out.println();
}
}
}
The output will be:
Every [up] dog [down] sees [up] some [up] cat [up]
Related
The code below works well for full nouns but I have text where a single noun can have hyphens or slashes in them. What should I do to accommodate that?
c/h, for instance, is a valid abbreviation as in "c/h was born in Paris".
But I get
(c,NN)
(/,HYPH)
(h,NN)
When I would like (NNP) or (NP) at least.
Here's the code I've run:
import edu.stanford.nlp.ling._
import edu.stanford.nlp.pipeline._
import scala.collection.JavaConverters._
import java.util._
val text = "Marie was born in Paris";
val props = new Properties();
// set the list of annotators to run
props.setProperty("annotators", "tokenize,pos");
// build pipeline
val pipeline = new StanfordCoreNLP(props);
// create a document object
val document = pipeline.processToCoreDocument(text);
// display tokens
document.tokens().asScala.foreach(tok => {
println(tok.word(), tok.tag())
})
I am trying to read the cells in an xls file.This is the code I have.Please let me know where I am going wrong.I don't see any error in the logviewer but it isn't printing anything.
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
public class ApacheCommonsCSV {
public void readCSV() throws IOException {
String CSV_File_Path = "C:\\source\\Test.csv";
// read the file
Reader reader = Files.newBufferedReader(Paths.get(CSV_File_Path));
// parse the file into csv values
CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT);
for (CSVRecord csvRecord : csvParser) {
// Accessing Values by Column Index
String name = csvRecord.get(0);
String product = csvRecord.get(1);
// print the value to console
log.info("Record No - " + csvRecord.getRecordNumber());
log.info("---------------");
log.info("Name : " + name);
log.info("Product : " + product);
log.info("---------------");
}
}
}
You're declaring readCSV() function but not calling it anywhere, that's why your code doesn't even get executed.
You need to add this readCSV() function call and if you're lucky enough it will start working and if not - your will see an error in jmeter.log file.
For example like this:
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
public class ApacheCommonsCSV {
public void readCSV() throws IOException {
String CSV_File_Path = "C:\\source\\Test.csv";
// read the file
Reader reader = Files.newBufferedReader(Paths.get(CSV_File_Path));
// parse the file into csv values
CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT);
for (CSVRecord csvRecord : csvParser) {
// Accessing Values by Column Index
String name = csvRecord.get(0);
String product = csvRecord.get(1);
// print the value to console
log.info("Record No - " + csvRecord.getRecordNumber());
log.info("---------------");
log.info("Name : " + name);
log.info("Product : " + product);
log.info("---------------");
}
}
readCSV(); // here is the entry point
}
Just make sure to have commons-csv.jar in JMeter Classpath
Last 2 cents:
Any reason for not using CSV Data Set Config?
Since JMeter 3.1 you should be using JSR223 Test Elements and Groovy language for scripting so consider migrating to Groovy right away. Check out Apache Groovy - Why and How You Should Use It for reasons, tips and tricks.
I have collection of Noun Phrases say around 10,000 words. I want to check every new input text data for these NP collection and extract those sentences that contains any of these NP. I don't want to run loops for every word because it makes my code dead slow. I am using Java and Stanford CoreNLP.
A quick and easy way to do this is to use RegexNER to identify all examples of anything in your dictionary, and then check for non "O" NER tags in the sentence.
package edu.stanford.nlp.examples;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.util.*;
import java.util.*;
import java.util.stream.Collectors;
public class FindSentencesWithPhrase {
public static boolean checkForNamedEntity(CoreMap sentence) {
for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
if (token.ner() != null && !token.ner().equals("O")) {
return true;
}
}
return false;
}
public static void main(String[] args) {
Properties props = new Properties();
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,regexner");
props.setProperty("regexner.mapping", "phrases.rules");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
String exampleText = "This sentence contains the phrase \"ice cream\"." +
"This sentence is not of interest. This sentences contains pizza.";
Annotation ann = new Annotation(exampleText);
pipeline.annotate(ann);
for (CoreMap sentence : ann.get(CoreAnnotations.SentencesAnnotation.class)) {
if (checkForNamedEntity(sentence)) {
System.out.println("---");
System.out.println(sentence.get(CoreAnnotations.TokensAnnotation.class).
stream().map(token -> token.word()).collect(Collectors.joining(" ")));
}
}
}
}
The file "phrases.rules" should look like this:
ice cream PHRASE_OF_INTEREST MISC 1
pizza PHRASE_OF_INTEREST MISC 1
I'm trying to implement a filter search using spring data solr. I've following filters types and all have a set of filters.
A
aa in (1,2,3)
ab between (2016-08-02 TO 2016-08-10)
B
ba in (2,3,4)
bb between (550 TO 1000)
The Solr query which I want to achieve using Spring data solr is:
q=*:*&fq=(type:A AND aa:(1,2,3) AND ab:[2016-08-02 TO 2016-08-10]) OR (type:B AND ba:(2,3,4) AND bb:[550 TO 1000])
I'm not sure how to group a number of clauses of a type of filter and then have an OR operator.
Thanks in advance.
The trick is to flag the second Criteria via setPartIsOr(true) with an OR-ish nature. This method returns void, so it cannot be chained.
First aCriteria and bCriteria are defined as described. Then bCriteria is flagged as OR-ish. Then both are added to a SimpleFilterQuery. That in turn can be added to the actual Query. That is left that out in the sample.
The DefaultQueryParser in the end is used only to generate a String that can be used in the assertion to check that the query is generated as desired.
import org.junit.jupiter.api.Test;
import org.springframework.data.solr.core.DefaultQueryParser;
import org.springframework.data.solr.core.query.Criteria;
import org.springframework.data.solr.core.query.FilterQuery;
import org.springframework.data.solr.core.query.SimpleFilterQuery;
import static org.junit.jupiter.api.Assertions.assertEquals;
public class CriteriaTest {
#Test
public void generateQuery() {
Criteria aCriteria =
new Criteria("type").is("A")
.connect().and("aa").in(1,2,3)
.connect().and("ab").between("2016-08-02", "2016-08-10");
Criteria bCriteria =
new Criteria("type").is("B")
.connect().and("ba").in(2,3,4)
.connect().and("bb").between("550", "1000");
bCriteria.setPartIsOr(true); // that is the magic
FilterQuery filterQuery = new SimpleFilterQuery();
filterQuery.addCriteria(aCriteria);
filterQuery.addCriteria(bCriteria);
// verify the generated query string
DefaultQueryParser dqp = new DefaultQueryParser(null);
String actualQuery = dqp.getQueryString(filterQuery, null);
String expectedQuery =
"(type:A AND aa:(1 2 3) AND ab:[2016\\-08\\-02 TO 2016\\-08\\-10]) OR "
+ "((type:B AND ba:(2 3 4) AND bb:[550 TO 1000]))";
System.out.println(actualQuery);
assertEquals(expectedQuery, actualQuery);
}
}
I would like to have YAML files with an include, similar to this question, but with Snakeyaml:
How can I include an YAML file inside another?
For example:
%YAML 1.2
---
!include "load.yml"
!include "load2.yml"
I am having a lot of trouble with it. I have the Constructor defined, and I can make it import one document, but not two. The error I get is:
Exception in thread "main" expected '<document start>', but found Tag
in 'reader', line 5, column 1:
!include "load2.yml"
^
With one include, Snakeyaml is happy that it finds an EOF and processes the import. With two, it's not happy (above).
My java source is:
package yaml;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.AbstractConstruct;
import org.yaml.snakeyaml.constructor.Constructor;
import org.yaml.snakeyaml.nodes.Node;
import org.yaml.snakeyaml.nodes.ScalarNode;
import org.yaml.snakeyaml.nodes.Tag;
public class Main {
final static Constructor constructor = new MyConstructor();
private static class ImportConstruct extends AbstractConstruct {
#Override
public Object construct(Node node) {
if (!(node instanceof ScalarNode)) {
throw new IllegalArgumentException("Non-scalar !import: " + node.toString());
}
final ScalarNode scalarNode = (ScalarNode)node;
final String value = scalarNode.getValue();
File file = new File("src/imports/" + value);
if (!file.exists()) {
return null;
}
try {
final InputStream input = new FileInputStream(new File("src/imports/" + value));
final Yaml yaml = new Yaml(constructor);
return yaml.loadAll(input);
} catch (FileNotFoundException ex) {
ex.printStackTrace();
}
return null;
}
}
private static class MyConstructor extends Constructor {
public MyConstructor() {
yamlConstructors.put(new Tag("!include"), new ImportConstruct());
}
}
public static void main(String[] args) {
try {
final InputStream input = new FileInputStream(new File("src/imports/example.yml"));
final Yaml yaml = new Yaml(constructor);
Object object = yaml.load(input);
System.out.println("Loaded");
} catch (FileNotFoundException ex) {
ex.printStackTrace();
}
finally {
}
}
}
Question is, has anybody done a similar thing with Snakeyaml? Any thoughts as to what I might be doing wrong?
I see two issues:
final InputStream input = new FileInputStream(new File("src/imports/" + value));
final Yaml yaml = new Yaml(constructor);
return yaml.loadAll(input);
You should be using yaml.load(input), not yaml.loadAll(input). The loadAll() method returns multiple objects, but the construct() method expects to return a single object.
The other issue is that you may have some inconsistent expectations with the way that the YAML processing pipeline works:
If you think that your !include works like in C where the preprocessor sticks in the contents of the included file, the way to implement it would be to handle it in the Presentation stage (parsing) or Serialization stage (composing). But you have implemented it in the Representation stage (constructing), so !include returns an object, and the structure of your YAML file must be consistent with this.
Let's say that you have the following files:
test1a.yaml
activity: "herding cats"
test1b.yaml
33
test1.yaml
favorites: !include test1a.yaml
age: !include test1b.yaml
This would work ok, and would be equivalent to
favorites:
activity: "herding cats"
age: 33
But the following file would not work:
!include test1a.yaml
!include test1b.yaml
because there is nothing to say how to combine the two values in a larger hierarchy. You'd need to do this, if you want an array:
- !include test1a.yaml
- !include test1b.yaml
or, again, handle this custom logic in an earlier stage such as parsing or composing.
Alternatively, you need to tell the YAML library that you are starting a 2nd document (which is what the error is complaining about: expected '<document start>') since YAML supports multiple "documents" (top-level values) in a single .yaml file.