I would like to extract relations from Chinese text with Stanford CoreNLP. Is there an example of training a Chinese relation model, or is there an already trained Chinese relation model available? Very grateful!
Here is an example command for getting the TAC-KBP relations from Stanford CoreNLP:
java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-chinese.properties -annotators tokenize,ssplit,pos,lemma,ner,parse,mention,coref,entitymentions,kbp -file example.txt -outputFormat text
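If you want the triples programmatically rather than as text output, the same pipeline can be driven from the Java API (with the Chinese models jar on the classpath). This is only a minimal sketch: the KBPTriplesAnnotation and RelationTriple class names are taken from recent CoreNLP releases and the Chinese sentence is just a placeholder.

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
import java.util.List;
import java.util.Properties;

public class ChineseKbpDemo {
  public static void main(String[] args) throws Exception {
    // Load the Chinese defaults, then request the same annotators as the command above.
    Properties props = new Properties();
    props.load(IOUtils.readerFromString("StanfordCoreNLP-chinese.properties"));
    props.setProperty("annotators",
        "tokenize,ssplit,pos,lemma,ner,parse,mention,coref,entitymentions,kbp");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation doc = new Annotation("把你的中文文本放在这里。");  // placeholder: your Chinese text
    pipeline.annotate(doc);

    // Each sentence carries its KBP relation triples.
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      List<RelationTriple> triples = sentence.get(CoreAnnotations.KBPTriplesAnnotation.class);
      if (triples == null) continue;
      for (RelationTriple t : triples) {
        System.out.println(t.subjectGloss() + "\t" + t.relationGloss() + "\t" + t.objectGloss());
      }
    }
  }
}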
I found that I can do 7-class NER (with DATE, MONEY, ...) on Chinese sentences with CoreNLP.
But I can only do 4-class NER on Chinese sentences with the "Stanford Named Entity Recognizer".
That is the case on the official demo website.
So how can I do 7-class NER on Chinese sentences with the "Stanford Named Entity Recognizer"?
If you run this command:
java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-chinese.properties -file example.txt -outputFormat text
it will run 7-class NER tagging on Chinese text.
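If you would rather read the labels in Java than from the text output, here is a minimal sketch along the same lines (the Chinese models jar must be on the classpath; the sentence is just a placeholder):

import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
import java.util.Properties;

public class ChineseNerDemo {
  public static void main(String[] args) throws Exception {
    // The default annotator list in the Chinese properties file already includes ner,
    // so loading it mirrors the command above.
    Properties props = new Properties();
    props.load(IOUtils.readerFromString("StanfordCoreNLP-chinese.properties"));
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation doc = new Annotation("你的中文句子放在这里。");  // placeholder: your Chinese sentence
    pipeline.annotate(doc);

    // Print each token with its NER tag (DATE, MONEY, etc. included).
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
        System.out.println(token.word() + "\t" + token.ner());
      }
    }
  }
}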
We are trying to extract EURO values from a document. Stanford recognizes the money amounts as expected; however, during extraction it converts € to $.
Here is a sample command to run Stanford CoreNLP and turn off the currency normalization:
java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit -file sample-sentence.txt -outputFormat text -tokenize.options "normalizeCurrency=false"
If you are using CoreNLP as a dedicated server, you can include the tokenize.options parameter in the URL when sending the request.
E.g.:
http://corenlp.run?properties={"timeout":"36000","annotators":"tokenize,ssplit,parse,lemma,ner,regexner","tokenize.options":"normalizeCurrency=false,invertible=true"}
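If you are calling CoreNLP from the Java API rather than the command line or the server, the same switch goes into the Properties object. A minimal sketch (the sample sentence is just a placeholder):

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.util.Properties;

public class KeepEuroSymbolDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit");
    // Same switch as -tokenize.options on the command line:
    // keep currency symbols as they appear instead of normalizing them.
    props.setProperty("tokenize.options", "normalizeCurrency=false,invertible=true");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation doc = new Annotation("The price is € 500.");  // placeholder text
    pipeline.annotate(doc);

    // The € token should come through unchanged.
    for (CoreLabel token : doc.get(CoreAnnotations.TokensAnnotation.class)) {
      System.out.println(token.word());
    }
  }
}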
Command used: java -mx1g -cp stanford-openie.jar:stanford-openie-models.jar edu.stanford.nlp.naturalli.OpenIE /path/to/file1 /path/to/file2
I am trying to use the code given here (http://nlp.stanford.edu/software/openie.html#Questions) for slot filling.
The above command is used to generate relation triples for the sentences I type on the command line. I want to analyse everything done in the paper by Gabor Angeli: Leveraging Linguistic Structure For Open Domain Information Extraction.
Kindly help me with this.
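For reference, the programmatic equivalent of that command, based on the OpenIE demo on the page linked above, looks roughly like the sketch below; treat the annotator list and class names as assumptions taken from recent CoreNLP releases rather than a definitive recipe.

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
import java.util.Collection;
import java.util.Properties;

public class OpenIeDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,natlog,openie");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation doc = new Annotation("Obama was born in Hawaii.");  // placeholder sentence
    pipeline.annotate(doc);

    // Print the Open IE triples for each sentence with their confidences.
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      Collection<RelationTriple> triples =
          sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
      for (RelationTriple triple : triples) {
        System.out.println(triple.confidence + "\t"
            + triple.subjectLemmaGloss() + "\t"
            + triple.relationLemmaGloss() + "\t"
            + triple.objectLemmaGloss());
      }
    }
  }
}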
Here is an example of the text output:
Good/NNP afternoon/NNP Rajat/PERSON Raina/PERSON,/O how/WRB are/VBP you/PRP today/NN ?/O
There are many ways to do this.
This command will take in a file of input text and create JSON output with all of the tokens, each with its POS tag and NER tag:
java -Xmx6g -cp "*:." edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner -file input.txt -outputFormat json
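If you prefer to pull the same token, POS, and NER information out in Java rather than parsing the JSON, a minimal sketch with the standard English defaults (the sentence is the one from the output example above):

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.util.Properties;

public class PosNerTagsDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation doc = new Annotation("Good afternoon Rajat Raina, how are you today?");
    pipeline.annotate(doc);

    // Print each token with its POS tag and NER tag.
    for (CoreLabel token : doc.get(CoreAnnotations.TokensAnnotation.class)) {
      System.out.println(token.word() + "\t" + token.tag() + "\t" + token.ner());
    }
  }
}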
I am trying to process many snippets of text using the Stanford parser. I am outputting to XML using this command:
java -cp stanford-corenlp-3.3.1.jar:stanford-corenlp-3.3.1-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-VV.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,parse -file test
All I need is the sentence parse of each of the snippets. The problem is that the snippets can have more than one sentence, and the output XML gives all the sentences together, so I can't tell which sentences belong to which snippet. I could add a separator word between different sentences, but I think there must be a built-in capability to show the separation.
There is a parameter -fileList that takes a string of comma-separated files as its input. Each input file gets its own output file, so if you put each snippet in its own file the sentence-to-snippet mapping is preserved.
Example:
java -cp stanford-corenlp-3.3.1.jar:stanford-corenlp-3.3.1-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-VV.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,parse -fileList=file1.txt,file2.txt,file3.txt
Have a look at the SentimentPipeline.java (edu.stanford.nlp.sentiment) for further details.
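If you would rather keep everything in one process, another option (not from the original answer, just a sketch) is to annotate each snippet as its own Annotation from the Java API, so snippet boundaries are never lost; the snippets array below is a stand-in for however you actually load them.

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;
import java.util.Properties;

public class SnippetParseDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,parse");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Stand-in for your snippets, however you load them.
    String[] snippets = {
        "First snippet. It has two sentences.",
        "Second snippet with just one."
    };

    for (int i = 0; i < snippets.length; i++) {
      Annotation doc = new Annotation(snippets[i]);
      pipeline.annotate(doc);
      for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
        Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        // Every parse is printed under its snippet index, so you know which snippet it came from.
        System.out.println("snippet " + i + ": " + tree);
      }
    }
  }
}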